Cloud-driven process historians: delivering richer insight for strategic decisions
It’s time to look at enterprise process data in new ways and with new technology.
A dramatic shift is occurring in the process industries as industrial companies gain an appreciation for the potential impact that cloud and big data could have on their operations. The cloud is the enabler for the Industrial Internet of Things (IIoT), which has the power to transform every aspect of a business, not least the strategically important process historian function.
In fact, the cloud needs a process historian for big data to be fully exploited. This is because the effective deployment of IIoT requires three things:
- The ability to securely gather and access large amounts of data.
- Analytics technology to make sense of the data.
- Domain expertise to determine how to act upon the data.
So, as industrial companies begin to transform themselves digitally, process historians must similarly evolve to unlock the potential of IIoT.
This article examines how best to migrate process historians to the cloud and the benefits these next-generation systems will bring to both users and the business.
Cloud’s potential to transform process historians
Process historians made their debut in the 1980s and are today part of every process facility’s system architecture. High-level historians were later introduced for the enterprise, while process historians continued to handle day-to-day management and plant floor improvements. The two operate in unison to make enterprise-wide information available to corporate stakeholders.
The cloud represents the next step in the evolution of process historians. To understand the cloud’s impact on historians, however, it’s essential to understand the drivers behind cloud adoption.
To support the growth of smart devices, additional data storage is required. The cloud is the natural solution. Cloud technologies by their nature can scale, and they can exploit large, complex datasets for analytics. Data analytics initiatives are high on corporate priority lists: a recent survey of 200 manufacturing executives revealed that over two-thirds of companies are now investing in this area, and have plans to increase that investment.
Even as industrial companies cut expenditure, their investment in cloud is rising, and it’s easy to see why. Cloud technologies deliver unprecedented flexibility and scalability, and can dramatically reduce costs. With the cloud, systems can be set up faster — by scaling up as opposed to starting from scratch.
One of the benefits of the cloud is that it shifts a company’s IT responsibilities to a cloud provider, reducing infrastructure costs. Additionally, a cloud provider can leverage economies of scale to offer a compelling value proposition.
The real value of the cloud, though, is in its ability to exploit historical data. Today, companies are demanding more and more data be collected and used throughout the business, but the scalability limits of current process historians cannot keep pace with demand. With cloud technology, a site that previously collected process data at minute intervals using a conventional historian can now collect tags at vastly higher speeds and frequencies. Cloud technologies can scale in terms of throughput and storage far better than current historian architectures bound to a server.
They also use clustering and load balancing, and feature storage that supports virtually unlimited scaling.
The cloud’s scalability is a natural fit for multisite organisations. Today, different sites within an enterprise tend to have their own historians, making it difficult to perform cross-site analysis and troubleshooting. With the cloud, it’s easy to integrate data across different sites and make it available across the enterprise.
Multitenancy is a key element in lowering the cost of these systems and enabling vendors to realise economies of scale. Individual systems can be expensive and require significant upfront investment. However, if shared by small and medium-sized customers, they can be more economical.
The cloud is also ideal for effective data analytics. Technologies such as Hadoop, R and Python can analyse process data and other data types to reveal insights not possible using existing historian tools.
Current industry approaches to placing a historian in the cloud
To pull process data into cloud-based solutions, most companies are taking one of two approaches: virtualising their process historian in the cloud or using a data lake. Let’s look at these approaches in more detail.
Virtualising a process historian in the cloud
One way to reduce hardware infrastructure is to virtualise servers in the cloud. Most process historians support virtualisation. Manufacturers are already virtualising server components in the cloud, including their process historians, and this approach is being used by some process historian vendors as a first step toward cloud applications. This allows vendors to offer preconfigured cloud technology to customers.
Virtual images scale more easily than physical computers and can share server resources effectively. Ultimately, though, scalability is limited by the historian’s traditional architecture, so the primary reason to choose this approach is to reduce hardware infrastructure and cost of ownership.
Using a data lake
With a data lake, data is pulled into a central, less structured database in the cloud, where specialised tools can be used to manipulate it and identify correlations that cannot be found with traditional tools. The goal is generally to load the data lake with enough process data to support analysis. However, process data generally lacks structure, making it difficult to combine and compare with other data. To resolve this issue, it is useful to arrange data according to an asset model, giving context to process values and allowing easy comparison between similar assets such as compressors or heat exchangers. The data can also be related to other sources such as maintenance records, which may identify failures or other periods of interest to correlate.
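To make the asset model idea concrete, here is a minimal Python sketch. It assumes a simple lookup table that maps raw tag names to a site, asset type, asset and measurement. The tag names, asset identifiers and the `average_by_asset` helper are all hypothetical and are not taken from any particular historian product.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TagContext:
    site: str
    asset_type: str   # e.g. "compressor" or "heat_exchanger"
    asset_id: str
    measurement: str  # e.g. "discharge_temp"


# Hypothetical asset model: raw tag name -> context.
# Note that the two plants name equivalent tags quite differently.
ASSET_MODEL = {
    "PLT1.TI1001.PV": TagContext("plant_1", "compressor", "K-101", "discharge_temp"),
    "PLT2.TT-88.PV": TagContext("plant_2", "compressor", "K-201", "discharge_temp"),
    "PLT1.TI2040.PV": TagContext("plant_1", "heat_exchanger", "E-301", "outlet_temp"),
}


def average_by_asset(samples, asset_type, measurement):
    """Average one measurement per asset across sites.

    samples is an iterable of (tag, value) pairs.
    Returns {(site, asset_id): mean value}.
    """
    grouped = defaultdict(list)
    for tag, value in samples:
        ctx = ASSET_MODEL.get(tag)
        if ctx is None:
            continue  # tags with no asset context cannot be compared
        if ctx.asset_type == asset_type and ctx.measurement == measurement:
            grouped[(ctx.site, ctx.asset_id)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Illustrative values only.
samples = [
    ("PLT1.TI1001.PV", 80.0),
    ("PLT1.TI1001.PV", 84.0),
    ("PLT2.TT-88.PV", 91.0),
    ("PLT1.TI2040.PV", 45.0),
    ("UNKNOWN.TAG", 1.0),
]
result = average_by_asset(samples, "compressor", "discharge_temp")
```

With context attached, the compressors at both plants can be compared directly even though their tag names share no convention, while the tag with no context is simply left out of the comparison.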
The process of uploading raw data, organising it and relating it to other data is called ‘data wrangling’, and it can consume 80% of a project before any meaningful analysis can be performed.
Another characteristic of big data tools is that they don’t differentiate time series from other forms of data. This isn’t a major issue for offline analysis, but these tools can struggle to deal with interactive time series queries that are common to the process industries.
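The difference can be illustrated with a small Python sketch of a store that treats time as a first-class index. It is an assumption-laden toy rather than a description of how any real historian works: the `TagSeries` class is hypothetical, it holds a single tag in memory, and it assumes samples arrive in time order. Because the timestamps stay sorted, a time-window query becomes two binary searches instead of a scan over every record.

```python
from bisect import bisect_left, bisect_right


class TagSeries:
    """In-memory samples for one tag, kept in timestamp order."""

    def __init__(self):
        self._times = []
        self._values = []

    def append(self, ts, value):
        # Assumption: samples arrive in time order, as from a live feed.
        if self._times and ts < self._times[-1]:
            raise ValueError("samples must arrive in time order")
        self._times.append(ts)
        self._values.append(value)

    def query(self, start, end):
        """Return (timestamp, value) pairs with start <= timestamp <= end."""
        lo = bisect_left(self._times, start)
        hi = bisect_right(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))


# Illustrative values only; timestamps are seconds.
series = TagSeries()
for ts, value in [(0, 0.5), (10, 1.0), (20, 2.0), (30, 3.5)]:
    series.append(ts, value)

window = series.query(10, 20)
```

A general-purpose store with no time index would have to examine every record to answer the same question, which is acceptable for an overnight batch job but not for an engineer panning and zooming through a trend.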
Another approach involves data infrastructures based on data lake technologies that are combined with context and analysis tools. In this scenario, customers use vendors to push data to the cloud, typically by generating large offline files. These systems cannot be considered historians, however, as they cannot present data in real time and are therefore more appropriate for batch analysis.
Four imperatives for an effective, cloud-based historian
As the IIoT advances, the lines between process and enterprise historians will eventually blur, if not altogether disappear. Cloud deployment is the primary reason for that change.
Any effective, cloud-driven historian must support four features:
- Traditional time series data, and alarm and event data: Traditional tools can be used to visualise and analyse data. Most analysis and root cause detection on process data is performed more efficiently by visualising data over time to track anomalies and related process variables.
- Data lake for big data-type analysis: This is a key driver for medium-to-large organisations considering cloud technologies. Plant and site data can be pulled into this environment and analysed with tools that detect hard-to-find correlations.
- Enterprise asset context data: When working with massive datasets, it is difficult if not impossible to perform analysis without asset context. Tag names are usually only known to local process engineers and operators. Once data is pulled into the cloud and made available to the enterprise, context is needed for users to make sense of it and perform relevant correlations.
- Broader data types: Relevant data of every kind is stored in the data lake, so tools running on top of the system do not need to connect to anything else. In addition to time series data, the data lake stores information such as alarms and events, production data, transactional data, application data, geolocation data, complex data, and Internet data such as weather or real-time pricing.
Combining a historian and a data lake in the cloud
One way to achieve these goals is to combine a massively scalable historian using native cloud technology with a data lake to facilitate advanced analysis using big data technologies. In this way, it is possible to deliver the real-time process data analysis expected from a traditional enterprise historian plus the batch analysis capability from the data lake, ERP and production data. As a result, enterprise data can be analysed with tools and functions that are already in use at sites and plants, but on a larger scale. Insights found at one plant can be leveraged across all plants.
With a data lake based on a common big data analysis stack such as Hadoop, a company’s data scientists can use their preferred tools, and can use the data store to collate other types of data and analyse it against process data across the enterprise.
Once enterprise data is amassed in a single data store, it is important to have a well-defined model for the data, to provide context for anyone trying to make sense of it. While data at the plant level is well known to local control engineers and operators, once it is aggregated at the site or enterprise level it is important to make it easily available to anyone looking to analyse and visualise the information.
Such a service therefore needs to offer smart cloud connector software interfaces that enable users to connect to various data sources, particularly existing plant historians, and configure the data for transfer to the cloud quickly and easily.
Configuring data for transfer to the cloud should be easily achieved by adding tag filters to identify tags and the method of transfer — for example, as raw data or as an aggregation. By taking this approach, a project involving the transfer of data to a data lake can be undertaken in just a couple of hours.
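As a rough illustration of what such a configuration might look like, the Python sketch below pairs wildcard tag filters with a transfer method. The rule format, the tag names and the `prepare_for_transfer` function are hypothetical and do not reflect any vendor's connector. The aggregation shown is a simple interval average.

```python
from fnmatch import fnmatchcase
from statistics import mean

# Hypothetical transfer rules; the first matching pattern wins.
TRANSFER_RULES = [
    {"pattern": "PLT1.FI*", "method": "raw"},
    {"pattern": "PLT1.TI*", "method": "aggregate", "interval_s": 60},
]


def rule_for(tag):
    """Return the first rule whose wildcard pattern matches the tag, or None."""
    for rule in TRANSFER_RULES:
        if fnmatchcase(tag, rule["pattern"]):
            return rule
    return None


def prepare_for_transfer(tag, samples):
    """Apply the tag's transfer rule to (epoch_seconds, value) samples."""
    rule = rule_for(tag)
    if rule is None:
        return []  # tags that match no filter are not transferred
    if rule["method"] == "raw":
        return list(samples)
    # Aggregate: average the values that fall within each interval.
    interval = rule["interval_s"]
    buckets = {}
    for ts, value in samples:
        bucket_start = ts - ts % interval
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Illustrative values only.
aggregated = prepare_for_transfer(
    "PLT1.TI1001.PV", [(0, 10.0), (30, 20.0), (60, 30.0)]
)
raw = prepare_for_transfer("PLT1.FI2001.PV", [(0, 1.0), (1, 2.0)])
```

Here the temperature tag is reduced to one averaged value per minute, the flow tag is sent unchanged, and any tag that matches no filter stays on site.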
Next-generation, cloud-driven historians must be more than traditional historians virtualised or developed for the cloud, and more than a lake of unstructured data. The cloud historian of the future must be a combination of both, and much more. It must be the data platform for all cloud applications, as well as for on-site applications that connect to the cloud.
Designed in this way, cloud-driven historians will yield the rich data required for better strategic decisions that will impact profitability and business success.