Big data analytics and the IIoT
The so-called new paradigm in manufacturing — the Industrial Internet of Things — has a problem: how to best use the data.
Today’s highly volatile market environment and high cost of maintaining ageing infrastructure, as well as the demands of global competition, are challenging companies to sustain their profitability by finding new sources of revenue and by lowering their operating costs. Organisations today need to be more flexible, adaptable and transparent in their practices. Manufacturers therefore need a single source of the truth to help them make the right decisions for improved performance while mitigating risk from unexpected incidents. That is, they need to know what they can do to improve yields, reduce scrap, rework and recalls, and make supply chains more efficient.
While manufacturing systems today generate enormous amounts of data, most business analytics systems do not support any connection to that data, leaving the staff to mentally connect the business and operational worlds in order to achieve a meaningful analysis.
Today there are significant megatrends at play, greatly driven by the accelerating pace of adoption of digital technologies. These major market trends fall into three different areas that are relevant to digital transformation. While these trends represent a challenge to current business practices, they also represent opportunities to leverage new technologies to maintain competitiveness.
Big data and the IoT
With the cost of computing, bandwidth and sensors decreasing by orders of magnitude in recent years, there has been an explosion of embedded devices that can communicate with one another and produce large volumes of data. Big data can be defined as “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.” [1]
The physical world is being digitised, with ‘smart’ objects linked through wireless networks that carry information forming the much-discussed Internet of Things (IoT). The real-time dynamic analysis of data that is implied by the IoT is a challenge to businesses that rely on static and rigid information architectures.
Digital innovations are disrupting industries everywhere. Disruption leads to market incumbents being displaced by nimble companies that have commercialised cheaper, more convenient and widely available digital technologies, creating new markets. Existing manufacturers don’t want to be left behind.
Changes to the nature of work
Demographic changes in the workforce are also having an impact: the composition of today’s workforce has never been so diverse. Often millennials are working side-by-side with more experienced workers. As digital technologies further penetrate the workplace, creating new ways of organising work and dispersing knowledge, organisations will need to find new ways to manage themselves, their knowledge and their people.
The lower cost of storage, sensing and communications technologies is making ever more data available, but this presents a number of challenges.
Unstructured and isolated data
It is not unusual for organisations to struggle with the data deluge driven by modern automation technologies. Big data that is neither structured nor contextualised is difficult to store and analyse cost-effectively using traditional computing approaches.
Data islands are formed as a result of operational and project decisions not made in the context of a larger data strategy, leading to a limited view of the data and reducing collaboration. Data gets siloed, whether it is enterprise data, manufacturing data inside an organisation or data across different organisations in a supply chain. When the data is scattered throughout the plant and the company, integrating and analysing it manually becomes resource-intensive and tedious. By the time data is analysed, its value may have been lost.
In industrial organisations, there has traditionally been a technological and operational split between information technology (IT) and operations technology (OT). With smarter machines, big data and initiatives like Industry 4.0, a convergence of IT and OT is beginning. IT and OT, having developed separately with independent systems architectures and purposes, need to come together and find common ground.
There is therefore a need to link analytical systems to operational systems, but most business analytics technologies do not currently support any connection back to the originating systems of the data.
To achieve production targets, assets must be monitored in real time to ensure that all of them, whether in a single plant or across all plants, are performing at an optimal level. Manufacturers need increased visibility and better insights that can be acted upon. This enables them to detect anomalies and address developing issues before they cause failures, reducing unplanned downtime.
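As a rough illustration of what detecting anomalies in real time can mean in practice, the sketch below flags readings that deviate strongly from a rolling baseline. It is a minimal example under stated assumptions, not a method described in this article: the rolling z-score technique, the function name `detect_anomalies`, the window size, the threshold and the vibration values are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings that deviate strongly from the recent past."""
    recent = deque(maxlen=window)  # rolling baseline of recent, normal-looking values
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag a reading more than `threshold` standard deviations from the rolling mean.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


# Hypothetical vibration readings (mm/s) from one asset; the spike is invented for illustration.
vibration = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 2.0, 9.5, 2.1, 2.0]
anomalies = list(detect_anomalies(vibration))
```

In a real deployment the baseline would more likely come from a model trained on historical asset data, but the flow is the same: compare each new reading with what is expected and raise a flag early.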
An ageing workforce
The retirement of experienced workers is already creating a skills gap, so it is essential that the knowledge and experience accumulated by more senior workers is captured and made accessible to the new workforce. Inability to institutionalise this knowledge can be detrimental to the organisation. Preparing for this impending change by using digital technologies can ease the transition.
IT/OT integration, as well as smarter assets that network and store information in the cloud, will result in greater cybersecurity risks. Cyber attacks pose a range of threats, making individuals and institutions vulnerable to financial and physical harm. As companies invest in digital technologies, cybersecurity capability must be an important factor in design and purchasing decisions.
New data processing techniques to the rescue
Of course the largest problem to be solved is dealing with the increased availability of data. Overall data generation is expected to grow by 40% per year, totalling 35 zettabytes by 2020 [2], with an estimated 25–50 billion connected things generating trillions of gigabytes of data [3]. For the manufacturing domain, this data will allow enterprises to monitor and control processes at a much higher level of sophistication. The ad hoc availability of such a large amount of data opens up new opportunities for novel types of analysis and visual representation, but the issue is how to take advantage of such data.
Batch-generated static reports will be a thing of the past as it becomes possible to view, chart, drill into and explore data flexibly in close to real time, and automated analytics algorithms can now be applied to provide decisions. And it is not only manufacturing-related data that is relevant for analysis: data from other companies or ecosystems (such as those in a supply chain) also has to be considered.
All this of course requires an infrastructure that is capable of supporting very large data sets and the ability to apply machine learning algorithms to the data. The trick is to gather and store only the information required — the right data — as opposed to all data generated from a device, equipment or an operation. Patterns in the data can then be used to derive insights about existing and future operations. The resulting models can be incorporated into operational flows so that as device data is received, the models generate projections, forecasts and recommendations for improving the current operational situation.
Given the amount of information captured and stored, the performance available from such analytics systems is important. The challenge here is to know which subset of the right data needs to be accessed to facilitate business process improvement and optimisation. Currently, IIoT data can be analysed deeply and broadly, but not quickly at the same time.
In-memory database processing
The latest developments in big data performance involve in-memory database computing, which is intended to remove the performance constraints of disk-based data storage and retrieval systems. The development of in-memory computing is being spurred on by the decreasing cost and increasing speed of dynamic random-access memory (DRAM).
Traditional disk-based database systems (relational database management systems — RDBMS) are transactional systems based around multidimensional linked data structures such as tables, in which transactions are performed against those data structures stored on disk. With an in-memory database, all information is initially loaded into memory, and newer techniques — such as column-centric databases, which store similar information together — allow data to be stored more efficiently and with greater compression. These differences allow larger amounts of data to be stored in the same physical space, reducing the amount of memory needed to perform a query and therefore increasing processing speed even further.
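To make the column-centric idea concrete, the following sketch pivots a handful of row-oriented records into columns and compresses the repetitive ones. Run-length encoding is used here only as one common example of columnar compression; the article does not name a specific scheme. The records, the column names and the helper `run_length_encode` are hypothetical.

```python
from itertools import groupby

# Hypothetical sensor records in row-oriented form: one tuple per reading.
rows = [
    ("press-01", "OK", 71),
    ("press-01", "OK", 72),
    ("press-01", "OK", 72),
    ("press-02", "OK", 65),
    ("press-02", "FAULT", 90),
    ("press-02", "FAULT", 91),
]

# Column-oriented form: one list per attribute, so similar values sit together.
machine, status, temperature = (list(col) for col in zip(*rows))


def run_length_encode(column):
    """Collapse runs of repeated values into [value, count] pairs."""
    return [[value, len(list(run))] for value, run in groupby(column)]


encoded_machine = run_length_encode(machine)
encoded_status = run_length_encode(status)

# A query on one attribute touches only that column, not every full row.
fault_count = sum(count for value, count in encoded_status if value == "FAULT")
```

Because each column holds values of one kind, long runs of identical entries are common, which is why columnar layouts tend to compress well and why a query on one attribute needs to read less memory.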
New types of data structures also make it possible to execute analysis on as much IoT data as is relevant to the question, without boundaries or restrictions and without limitations as to data volume or data types. It can also take into account the relevance of the data to be analysed since, for example, recent IIoT data can be more valuable than old data.
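One simple way to express the idea that recent data can be worth more than old data is to weight each reading by its age. The sketch below uses an exponential half-life weighting, which is an assumption chosen for illustration rather than something the article prescribes. The function names, the half-life and the readings are hypothetical.

```python
def recency_weight(age_seconds, half_life_seconds):
    """Weight that halves each time a reading ages by one half-life."""
    return 0.5 ** (age_seconds / half_life_seconds)


def weighted_mean(readings, now, half_life_seconds):
    """Average (timestamp, value) readings, favouring the most recent ones."""
    weights = [recency_weight(now - ts, half_life_seconds) for ts, _ in readings]
    total = sum(w * value for w, (_, value) in zip(weights, readings))
    return total / sum(weights)


# Hypothetical (timestamp, value) readings; timestamps are in seconds.
readings = [(0, 20.0), (60, 10.0)]
estimate = weighted_mean(readings, now=60, half_life_seconds=60)
```

With these values the older reading counts half as much as the newest one, so the estimate sits closer to the latest value than a plain average would.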
New analytic capabilities
Typically, an individual IIoT data point represents an event taking place in a manufacturing or operational environment. Events may be unrelated to each other or may be correlated. Multiple events may need to be related and correlated in order to determine causal relationships.
In recent times, data analytics capabilities have been developed to more efficiently process such information: event stream processing (ESP) and complex event processing (CEP).
ESP is designed to make it possible to stream, process, filter and group all of the IIoT data and events collected. ESP business rules determine which events are important, which data should be filtered out and which should be kept, and which event correlations or patterns should trigger a broader business event, alert or decision. ESP can utilise IIoT integration to stream the data from the edge to the ESP engine for processing in near real time.
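The filter, group and rule steps described above can be sketched as a small generator pipeline. This is an illustrative toy, not the API of any particular ESP product: the event fields, the function `process_stream`, the temperature limit and the window length are all assumptions.

```python
from collections import defaultdict, deque


def process_stream(events, limit=80.0, window=3):
    """Filter, group and evaluate a stream of events; yield alerts as they arise."""
    recent = defaultdict(lambda: deque(maxlen=window))  # last readings per device
    for event in events:
        # Filter: keep only temperature events, discard everything else.
        if event["type"] != "temperature":
            continue
        # Group: maintain a short sliding window per device.
        readings = recent[event["device"]]
        readings.append(event["value"])
        # Rule: alert when the window is full and every reading exceeds the limit.
        if len(readings) == window and all(v > limit for v in readings):
            yield {"device": event["device"], "alert": "sustained high temperature", "ts": event["ts"]}
            readings.clear()  # avoid repeating the alert for the same run


# Hypothetical events streamed from the edge, ordered by timestamp.
stream = [
    {"ts": 1, "device": "oven-A", "type": "temperature", "value": 78.0},
    {"ts": 2, "device": "oven-A", "type": "heartbeat", "value": 1.0},
    {"ts": 3, "device": "oven-A", "type": "temperature", "value": 82.5},
    {"ts": 4, "device": "oven-B", "type": "temperature", "value": 60.0},
    {"ts": 5, "device": "oven-A", "type": "temperature", "value": 84.0},
    {"ts": 6, "device": "oven-A", "type": "temperature", "value": 86.1},
]
alerts = list(process_stream(stream))
```

Because the function is a generator, each event is handled as it arrives and only a small window per device is held in memory, which is the property that lets this style of processing keep up with a live stream.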
CEP is a more sophisticated capability, which searches for complex patterns in an ordered sequence of events. It is ESP and CEP running on big data enabled by in-memory data processing that are making possible the analytics necessary to take advantage of the IIoT.
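A minimal sketch of pattern matching over an ordered sequence of events is shown below. It looks for three event kinds occurring in order on the same device within a time limit, ignoring unrelated events in between. Real CEP engines offer far richer pattern languages; the function `match_sequence`, the event kinds, the devices and the 60-second limit are all hypothetical.

```python
def match_sequence(events, pattern, max_span):
    """Yield (device, start_ts, end_ts) when `pattern` occurs in order on one device within `max_span` seconds."""
    progress = {}  # device -> (index of next expected step, timestamp of first step)
    for ts, device, kind in events:  # events must be ordered by timestamp
        step, start = progress.get(device, (0, None))
        # Abandon a partial match that has run out of time.
        if step > 0 and ts - start > max_span:
            step, start = 0, None
        if kind == pattern[step]:
            if step == 0:
                start = ts
            step += 1
            if step == len(pattern):
                yield device, start, ts
                step, start = 0, None
        progress[device] = (step, start)


PATTERN = ("pressure_drop", "temperature_rise", "vibration_spike")

# Hypothetical (timestamp in seconds, device, event kind) tuples.
events = [
    (0, "pump-1", "pressure_drop"),
    (5, "pump-2", "pressure_drop"),
    (12, "pump-1", "temperature_rise"),
    (20, "pump-1", "door_open"),
    (31, "pump-1", "vibration_spike"),
    (90, "pump-2", "temperature_rise"),
    (95, "pump-2", "vibration_spike"),
]
matches = list(match_sequence(events, PATTERN, max_span=60))
```

Here pump-1 completes the sequence within the limit, while pump-2 does not because too much time passes after its first event. The per-device state is what distinguishes this from simple filtering: the meaning of an event depends on what came before it.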
Laying the groundwork
For companies to take advantage of the new paradigms that the IIoT and big data can bring to their operations, there are a few challenges that need to be overcome.
- Get the data: Organisations need to collect and come to grips with the various data that is available inside the company’s operations and outside in the supply chain. Some of the data will come from existing processes, some will be new data that can be collected by implementing new sensing technologies, and some will come from other companies or third-party information services.
- Understand the people: The company may need to employ new staff who know how to make use of new data, and will also need to address the traditional separation of IT and OT staff. Their experience and expertise need to be brought together in a cooperative way to break down barriers and exploit their strengths.
- Revisit the operational architecture: The traditional ‘air gap’ separation between the operational networks and IT networks will need to end, if it has not already. Linking operational data sources with analytics systems (most likely public or private cloud-based systems) will require best practice knowledge in secure and safe data communications, and may require significant change to data network architectures to support it. The choice of private or public cloud processing will also have a significant bearing on the cost and security of the architecture.
References
[1] Gartner Inc 2016, Big data, Gartner IT Glossary, <http://www.gartner.com/it-glossary/big-data>.
[2] McKinsey Global Institute 2011, Big Data: The Next Frontier for Innovation, Competition, and Productivity.
[3] Schmitt K 2014, SAP HANA drives Internet of Things Scenarios in Real-time, SAP Community Network, <http://scn.sap.com/community/internet-of-things/blog/2014/05>.