Where should you store machine data: external or internal?
For many machine builders and production companies, this question comes up naturally. Machines are connected, data is available through PLCs and gateways, and the first dashboards are in place. The next step is deciding where to store that data structurally.
The choice between external and internal hosting is not a minor detail. It directly affects scalability, cost, data ownership, integration possibilities, and how your service model evolves.
In practice, this decision is often made implicitly, by staying within a platform by default. But it is an architectural choice that should be made deliberately. Hosting is not just a tech decision. It is a service-model decision.
What do we mean by external and internal hosting?
In an industrial context, this typically means:
External hosting
Data is stored outside the organization, usually in a platform or managed cloud environment. Examples include:
- Industrial IoT and connectivity platforms used as the primary storage and access layer
- Managed cloud environments operated by an external partner
- Vendor portals that store and expose data via APIs
Internal hosting
Data is stored within the OEM’s or end customer’s own cloud or IT environment. Examples include:
- A cloud tenant of your own on Azure, AWS, or Google Cloud
- A dedicated data lake or data warehouse
- A Microsoft Fabric workspace or equivalent unified analytics environment
- An on-premises warehouse
The data flow itself is often similar:
PLC → gateway (for example IXON) → cloud or API → storage → data model → BI layer
The real question is where the central storage and transformation layer sits.
External hosting: fast to start, with dependencies
External hosting is often the default starting point, especially when speed matters.
Advantages
- Fast time-to-market
Data is already available in the platform and accessible through APIs or standard connectors. Dashboards and basic reporting can be built quickly, without additional infrastructure work.
- Low operational overhead
No need to manage infrastructure. Security, uptime, and updates are handled by the platform provider.
- Standard connectivity
Platforms of this kind are built for industrial environments. Protocols, remote access, and device management are already in place.
- Good fit for initial use cases
Monitoring dashboards, alarm overviews, and basic service insights can be delivered quickly.
Disadvantages
- Vendor dependency
You are limited by the platform’s APIs, data structures, and constraints.
- Data ownership in practice
Data may formally belong to you or your customer, but it is operationally tied to the platform.
- Cost at scale
What starts as cost-effective can become expensive with more machines, higher data frequency, or longer retention.
- Constraints on historical analysis
Not all platforms are designed for long-term storage or complex analytics.
External hosting works well until requirements go beyond visibility, for example when customers want integrated reporting or structured service insights.
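A rough back-of-the-envelope estimate makes the cost-at-scale point concrete. The sketch below is a minimal Python illustration. The function names, the 16 bytes per stored sample, and both example fleets are assumptions chosen for the example, not figures from any specific platform.

```python
def stored_samples(machines: int, tags_per_machine: int,
                   interval_seconds: float, retention_days: int) -> int:
    """Number of samples held in storage over the full retention window."""
    samples_per_tag_per_day = 86_400 / interval_seconds
    return int(machines * tags_per_machine * samples_per_tag_per_day * retention_days)


def stored_gigabytes(samples: int, bytes_per_sample: int = 16) -> float:
    """Raw volume in GB (10**9 bytes); bytes_per_sample is an assumed average."""
    return samples * bytes_per_sample / 1e9


# Hypothetical pilot: 10 machines, 50 tags, one sample per minute, 30 days kept.
pilot = stored_samples(10, 50, 60, 30)
# Hypothetical rollout: 1,000 machines, 50 tags, one sample per second, 1 year kept.
rollout = stored_samples(1_000, 50, 1, 365)

print(f"pilot:   {pilot:,} samples, {stored_gigabytes(pilot):.2f} GB")
print(f"rollout: {rollout:,} samples, {stored_gigabytes(rollout):,.0f} GB")
```

Under these assumed numbers the stored volume grows by a factor of tens of thousands between pilot and rollout, which is why pricing tiers or retention limits that are invisible in a pilot can dominate later.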
Internal hosting: control and flexibility, with responsibility
Internal hosting becomes relevant when machine data plays a structural role in operations, service, or commercial offerings.
Advantages
- Full control over data
You define retention, structure, and access. Data is stored within your own environment.
- Strong integration capabilities
Direct integration with ERP, MES, and CRM systems is much easier, for example linking machine alarms to service tickets or customer contracts.
- Flexible data model
You can design a model that fits your machines and processes, instead of adapting to a platform structure.
- Scalability through standardization
A data model can be built once per machine type and reused across the entire installed base.
- Support for more advanced use cases
Examples include SLA reporting, fleet-wide analytics, and energy and consumption reporting across customers.
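The idea of one data model per machine type, reused across the installed base, can be sketched as follows. This is a minimal Python illustration. The class names, the machine type "filler-v1", the PLC tag names, and the units are all hypothetical and not taken from any specific platform or controller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str           # standardized signal name used in reporting
    unit: str           # unit after scaling
    scale: float = 1.0  # factor applied to the raw PLC value


@dataclass(frozen=True)
class MachineTypeModel:
    machine_type: str
    tag_map: dict[str, Signal]  # raw PLC tag -> standardized signal

    def normalize(self, machine_id: str, raw: dict[str, float]) -> list[dict]:
        """Convert one raw reading into standardized rows; unknown tags are skipped."""
        rows = []
        for tag, value in raw.items():
            signal = self.tag_map.get(tag)
            if signal is None:
                continue
            rows.append({
                "machine_id": machine_id,
                "machine_type": self.machine_type,
                "signal": signal.name,
                "unit": signal.unit,
                "value": value * signal.scale,
            })
        return rows


# Defined once for the (hypothetical) machine type ...
FILLER_V1 = MachineTypeModel(
    machine_type="filler-v1",
    tag_map={
        "DB10.Temp": Signal("product_temperature", "degC", 0.1),
        "DB10.Cnt": Signal("units_produced", "count"),
    },
)

# ... and applied to any machine of that type, whichever customer it belongs to.
rows = FILLER_V1.normalize(
    "machine-0042", {"DB10.Temp": 215, "DB10.Cnt": 1200, "DB99.X": 1}
)
for row in rows:
    print(row)
```

Because the mapping lives in one place, adding a machine of an existing type needs no new modeling work. Only a new machine type requires a new `MachineTypeModel`.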
Disadvantages
- Higher initial investment
Infrastructure, data pipelines, and models need to be designed and built.
- Operational responsibility
Security, monitoring, and performance are your responsibility or that of your IT partner.
- Technical complexity
Internal hosting requires expertise in data engineering, cloud platforms, and BI tooling.
- Longer time to first result
Compared to platform-based setups, initial delivery takes more time.

Hybrid models: often the most practical approach
In many cases, the best solution is a combination of the two, and the strongest OEMs tend to work this way.
A common setup:
- Connectivity and data acquisition external, via a platform.
- Storage and transformation internal, for example in Azure, AWS or Google Cloud.
The data flow becomes:
PLC → gateway → platform cloud → API → own cloud → data model → BI tools
This approach combines:
- The speed of a platform for connectivity and data acquisition
- The higher ceiling of your own stack for storage, modeling, and integration
For OEMs, this is often the most scalable architecture. You leverage existing platforms where they add value, and build your own layer where flexibility is required.
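A minimal Python sketch of the hybrid flow: readings are pulled from the platform and written to storage you control. The `fetch_readings` callable stands in for whatever API the connectivity platform offers and is purely hypothetical. SQLite stands in for your own cloud storage so that the example runs locally, and the sample readings are made up.

```python
import sqlite3
from typing import Callable, Iterable, Optional

Reading = tuple[str, str, str, float]  # (machine_id, ts_iso, signal, value)


def init_store(conn: sqlite3.Connection) -> None:
    """Create the readings table; the primary key makes repeated syncs idempotent."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS readings (
               machine_id TEXT NOT NULL,
               ts         TEXT NOT NULL,
               signal     TEXT NOT NULL,
               value      REAL NOT NULL,
               PRIMARY KEY (machine_id, ts, signal)
           )"""
    )


def sync(conn: sqlite3.Connection,
         fetch_readings: Callable[[Optional[str]], Iterable[Reading]]) -> int:
    """Copy readings at or after the latest stored timestamp; return rows added."""
    (last_ts,) = conn.execute("SELECT MAX(ts) FROM readings").fetchone()
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO readings VALUES (?, ?, ?, ?)",
        fetch_readings(last_ts),
    )
    conn.commit()
    return conn.total_changes - before


def fake_platform_api(since: Optional[str]) -> list[Reading]:
    """Hypothetical stand-in for a platform API call; returns made-up readings."""
    data = [
        ("machine-0042", "2024-01-01T00:00:00Z", "product_temperature", 21.5),
        ("machine-0042", "2024-01-01T00:01:00Z", "product_temperature", 21.7),
    ]
    return [r for r in data if since is None or r[1] >= since]


conn = sqlite3.connect(":memory:")
init_store(conn)
first = sync(conn, fake_platform_api)   # first run copies everything
second = sync(conn, fake_platform_api)  # second run only sees already-stored rows
print(first, second)
```

The platform remains the acquisition layer, while retention, structure, and downstream modeling are decided in your own store. In a real setup the stub would be replaced by the platform's actual API client and SQLite by your warehouse or lakehouse.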
Decision framework: how to choose
The right choice depends on your context. These questions help define the direction:
- Who owns the data?
OEM, end customer, or both? Who needs long-term access and control?
- How many machines and customers are involved?
Managing 10 machines is very different from managing 1,000 across multiple regions.
- What level of reporting is required?
Basic monitoring, or structured reporting with benchmarking and customer-facing dashboards?
- What integrations are needed?
Do you need to combine machine data with ERP, service systems, or CRM?
- What is your service model?
Reactive support, or proactive service with reporting toward customers?
- What IT capacity is available?
Do you have the capability to manage data infrastructure, or do you want to minimize that?
- What is your scale ambition?
One-off solutions per customer, or a standardized model across the installed base?
These questions usually make the direction clear quickly. Choose the ceiling, not the tool.
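As a rough illustration, most of these questions can be turned into a simple checklist. The Python sketch below is only a conversation starter. The field names, the 100-machine threshold, and the mapping from answers to a recommendation are assumptions made for the example, not a validated scoring model. Data ownership is left out because it rarely reduces to a yes or no.

```python
from dataclasses import dataclass


@dataclass
class Context:
    machines: int
    needs_customer_reporting: bool
    needs_erp_or_crm_integration: bool
    proactive_service_model: bool
    has_data_engineering_capacity: bool
    standardized_across_installed_base: bool


def suggest_hosting(ctx: Context) -> str:
    """Return 'external', 'hybrid', or 'internal' as a starting point for discussion."""
    pull_toward_own_stack = sum([
        ctx.machines >= 100,  # assumed threshold, adjust to your situation
        ctx.needs_customer_reporting,
        ctx.needs_erp_or_crm_integration,
        ctx.proactive_service_model,
        ctx.standardized_across_installed_base,
    ])
    if pull_toward_own_stack <= 1:
        return "external"
    # Without in-house capacity, keep acquisition on the platform.
    if not ctx.has_data_engineering_capacity or pull_toward_own_stack <= 3:
        return "hybrid"
    return "internal"


# Hypothetical pilot: a handful of machines, monitoring only.
pilot = Context(10, False, False, False, False, False)
# Hypothetical scaled OEM: large fleet, integrated and proactive service.
scaled = Context(1_000, True, True, True, True, True)

print(suggest_hosting(pilot), suggest_hosting(scaled))
```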
Closing
There is no universal right answer between external and internal hosting. What works for a first pilot often becomes a limitation when data becomes part of your service offering or commercial model.
StriData helps machine builders and production companies make this choice based on their specific situation. Not based on a preferred stack, but on what works technically and commercially.
We build both platform-based solutions and architectures where data is managed in your own environment. In many cases, the right answer is a combination. If you want to review your current setup or define a future architecture, a quick scan is usually enough to clarify the next step.
