Abstract
Effective and efficient modern manufacturing operations require the acceptance and incorporation of the fourth industrial revolution, also known as Industry 4.0. Traditional shop floors are evolving their production into smart factories. To continue this trend, a specific architecture for the cyber-physical system is required, as well as a systematic approach to automate the application of algorithms and transform the acquired data into useful information. This work makes use of an approach that distinguishes three layers that are part of the existing Industry 4.0 paradigm: edge, fog, and cloud. Each of the layers performs computational operations, transforming the data produced in the smart factory into useful information. Trained or untrained methods for data analytics can be incorporated into the architecture. A case study is presented in which a real-time statistical process control algorithm based on control charts was implemented. The algorithm automatically detects changes in the material being processed in a computerized numerical control (CNC) machine. The algorithm implemented in the proposed architecture yielded short response times. The performance was effective since the algorithm automatically adapted to the machining of aluminum and then detected when the material was switched to steel. The data were backed up in a database that allows traceability down to the line of G-code that performed the machining.
1 Introduction
In the current manufacturing landscape, machines can produce and share data about their operations. Sensor and data transmission technology is ubiquitous and low cost, with ever-improving sample quality and rates, resulting in low-cost and high-quality manufacturing data. However, most of the generated data are not fully leveraged due to two major issues. First, the data do not follow a standard structure, even though numerous standards and protocols exist, such as MTConnect for machine tools [1,2] and open platform communications unified architecture (OPC UA) [3]. Nevertheless, the industry still operates with old CNC machines, legacy equipment, and other sensors that do not adhere to any standard data structure. Second, once obtained, the data need to be transformed into useful information, a process that is not automated and often lacks a clear framework.
The literature on the subject can be divided into two categories. In the first category, several case studies address the incorporation of artificial intelligence into the data generated during machining processes [4]. Techniques such as neural networks (NNs), support vector machines [5], and random forests have been successfully implemented to predict tool wear using data from milling operations [6]. The cutting parameters for high-speed turning have been predicted efficiently using machine learning methods [5]. Works in this category present implementations of machine learning techniques for specific problems related to machining operations; however, they do not present a clear, unified model for systematically moving from data collection to information generation. In the second category, cyber-physical systems (CPSs) are explained through models [7]. The 5C model comprises the levels of connection, data conversion, cyber, cognition, and configuration [8,9]. The work of Monostori et al. [10] presents a comprehensive CPS framework with case studies related to the use of OPC UA and other smart connection capabilities. The works in this category, while providing the architecture, communication protocols, and other structure toward a model for CPS, do not provide the link between these protocols and an automated data-driven information generation system.

Table 1 showcases works related to manufacturing and fog computing (FC), one of the most recent elements receiving attention as key to transforming data into information. The table summarizes the roles of the edge and cloud layers in the respective frameworks, whether automatic analytics is included, and the general purpose of each application. Regarding the CPS architecture, there are strong arguments for the advantages of edge and fog computing over cloud computing [14]. Others propose additional layers beyond edge, fog, and cloud, as in the fluid manufacturing architecture [16], which includes mist and dew. Most works clearly distinguish the elements of the edge layer, such as sensors and machine tools (those that produce data), and also recognize the functions the cloud usually performs: storage (databases) and big data analytics. Case studies have been presented in which the edge, fog, and cloud layers are not completely distinguishable from each other [19]. The works that best define the CPS architecture [12,15] in terms of edge, fog, and cloud layers do not explicitly describe how to implement automated analytics for the generated data.
This work proposes an approach with three main objectives. The first objective is to allocate the computational resources for a data-driven generation of information across the edge, fog, and cloud layers. This is an original implementation that, while similar to models already available in the literature [12,15], adds a message queue telemetry transport (MQTT) broker and a database in the fog layer that buffers data in case of interruptions in communication with the cloud. The second objective is to present a model for a CPS incorporating the communication standards available in the industry and an automated unsupervised classifier implementation based on the Nelson rules [20] for control charts (CCs), a convenient and widely used industrial method. The third objective is to demonstrate the model in a case study in which the material being processed by a CNC machine is recognized automatically. The general aim of this work is to build the CPS architecture with the edge, fog, and cloud layers, implement efficient automated analytics in this architecture, and put all of this into practice with data produced by machine tools. The main contribution of this work is the presentation of a framework enabling manufacturing operations to utilize current data streams and effectively integrate new sources of digital data to generate information.
The organization of the manuscript is as follows: Sec. 2 describes the structural CPS model developed in this work, Sec. 3 presents the real-time statistical process control (RT-SPC), and the case study is described in Sec. 4. Results are presented in Sec. 5 and the discussion of the results can be found in Sec. 6. Section 7 contains the conclusion of this research and Sec. 8 provides the identified future work.
2 Model for Cyber-Physical Systems
CPSs are the intersection and interaction between the digital and physical environments, associating them with analytics to improve the efficiency of industrial systems [10,21,22]. A CPS architecture contains a variety of layers. The selection and configuration of these layers have an impact on the performance of the system, for example, in terms of latency and processing rates [23]. One of the layers of a CPS applied to manufacturing is the edge. In edge computing (EC), data processing occurs as close as possible to the data source (e.g., the sensor). The accepted convention is that EC is an umbrella term comprising FC, mobile edge computing (MEC), and cloudlet computing [24–26]. Nevertheless, considering the contrasts between EC (especially MEC) and FC [27], it is useful to distinguish a separate fog layer in the CPS. The fog layer (where the clouds touch the ground/edge) takes data from the edge layer and centralizes them within the local area network, adding processing and storage when necessary. It also determines which data and information are shared with or transferred to the cloud. The cloud represents another layer in the CPS.
In this work, a CPS for manufacturing systems was designed with the three layers presented in the block diagram in Fig. 1. The edge layer in Fig. 1(a) contains the machine tools, embedded systems (e.g., microcontroller units, PLCs, microprocessor units), and other data sources from automated systems. Modern machine tools are typically equipped to transmit the data produced in the machining process using protocols such as MTConnect. MTConnect relies on a structure composed of an adapter and an agent. The latter communicates the data related to the machining process and the machine status to the outside via an XML document structured to be read by an interpreter. Once interpreted, the data can be sent through a gateway with a communication protocol. In this case, the application-layer protocol MQTT is well suited to transmit the data because of its lightweight, low-latency characteristics. The components of the edge layer act as clients in the MQTT communication. Besides MTConnect and MQTT, OPC UA is another relevant protocol for machine-to-machine communication [28] in the edge layer; however, it was not implemented in this particular CPS.
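To make the edge-layer data path concrete, the following minimal Python sketch polls an MTConnect agent's current endpoint, extracts Program, Block, and Load items from the XML response, and republishes them over MQTT. It is an illustration only, not necessarily the gateway configuration used in this work; the agent address, broker address, and topic naming are assumptions.

```python
# Minimal edge-gateway sketch (illustrative): poll an MTConnect agent,
# extract a few data items, and republish them over MQTT.
import time
import requests
import xml.etree.ElementTree as ET
import paho.mqtt.client as mqtt

AGENT_URL = "http://192.168.0.10:5000/current"   # hypothetical MTConnect agent address
BROKER = "192.168.0.20"                           # hypothetical local (fog) MQTT broker

client = mqtt.Client()
client.connect(BROKER, 1883)
client.loop_start()

while True:
    xml_text = requests.get(AGENT_URL, timeout=5).text
    root = ET.fromstring(xml_text)
    # MTConnect documents are namespaced; match on the local part of the tag.
    for elem in root.iter():
        if elem.tag.endswith("}Program") or elem.tag.endswith("}Block") or elem.tag.endswith("}Load"):
            name = elem.get("name") or elem.tag.split("}")[-1]
            client.publish(f"cnc/{name}", elem.text or "")
    time.sleep(1.0)  # simple polling; a streaming (sample) request could be used instead
```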
The fog layer in Fig. 1(b) contains a local MQTT broker that handles communication between the components of the edge layer and the components of the cloud layer. It also contains a database that buffers data while offline (interrupted communication with the cloud) and transmits them once online (communication with the cloud reestablished). The fog layer can additionally perform statistical analysis and apply machine learning models to the data.
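A minimal sketch of this buffering behavior is shown below, assuming a SQLite store and the paho-mqtt client; broker addresses, the topic namespace, and the table layout are assumptions made for illustration, not the exact fog implementation of this work.

```python
# Fog-layer buffering sketch (illustrative): messages from the local broker are
# logged to SQLite and forwarded to the cloud broker; while the cloud link is
# down, rows stay flagged as unsent and are flushed when connectivity returns.
import sqlite3
import paho.mqtt.client as mqtt

db = sqlite3.connect("fog_buffer.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS buffer (topic TEXT, payload TEXT, sent INTEGER DEFAULT 0)")

cloud = mqtt.Client()

def flush_unsent():
    if not cloud.is_connected():
        return  # offline: keep buffering locally
    rows = db.execute("SELECT rowid, topic, payload FROM buffer WHERE sent = 0").fetchall()
    for rowid, topic, payload in rows:
        cloud.publish(topic, payload)
        db.execute("UPDATE buffer SET sent = 1 WHERE rowid = ?", (rowid,))
    db.commit()

def on_local_message(client, userdata, msg):
    db.execute("INSERT INTO buffer (topic, payload) VALUES (?, ?)", (msg.topic, msg.payload.decode()))
    db.commit()
    flush_unsent()

def on_local_connect(client, userdata, flags, rc):
    client.subscribe("cnc/#")  # hypothetical topic namespace used by the edge gateways

local = mqtt.Client()
local.on_connect = on_local_connect
local.on_message = on_local_message
local.connect("localhost", 1883)                 # local fog broker
cloud.connect_async("cloud.example.com", 1883)   # hypothetical cloud broker
cloud.loop_start()
local.loop_forever()
```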
The cloud layer shown in Fig. 1(c) maintains a database log of the raw data coming from the fog layer, processes the data, trains and applies the algorithms (maintaining a database of the generated results), and produces analytics based on user-defined statistics. The results update the fog layer and web applications (client-server applications that run in a browser) used to communicate the information to users and other systems. While some of these functions can be performed either in the cloud or locally in the other layers, there are strong reasons why the cloud can be of benefit. Because the cloud can be accessed remotely, it can offer services [16] to several users at the same time, for example, access to dashboards, monitoring, or collaboration from different geographic locations. Depending on the cloud platform, those services can be “pay as you go” [12], including storage and computing capabilities, without the need to invest in local resources. Among the limitations of the cloud layer are service interruptions due to intermittent internet access, high latencies, increased network traffic load, and the risk of cyber-attacks or concerns about data privacy [14].
These three layers constitute the general architecture of the CPS. The components of each layer can be modified to suit a particular purpose. A demonstration of this architecture working with an unsupervised self-learning algorithm is presented in the ensuing sections. The algorithm is primarily based on the first Nelson rule (1NR) and CCs, which have applications on manufacturing shop floors [29]. The implementation of this heuristic constitutes a self-adapting RT-SPC. The implementation presented in this work was oriented to indicate when different materials (e.g., steel, aluminum) were processed in a CNC lathe.
3 Real-Time Statistical Process Control
Manufacturing systems are becoming increasingly complex. Modern systems produce a substantial amount of raw data that typically are not used effectively to optimize production. To satisfy the need to produce higher quality products with fewer resources and less time, researchers have begun to apply machine learning algorithms in the manufacturing domain. Machine learning is usually divided into four main categories: unsupervised learning, semi-supervised learning, supervised learning, and reinforcement learning. This study focuses on the implementation and automation of RT-SPC on the proposed architecture. The implementation of RT-SPC is similar to unsupervised machine learning methods since the data are not labeled. In contrast to NNs and other complex machine learning methods that require a large amount of labeled data and substantial computational power, RT-SPC requires fewer data points and minimal processing power.
Traditional SPC charts have historically been essential tools for manufacturing process control. Significant areas of opportunity lie in the manual setup and time-consuming procedure of implementing CCs, especially considering the automated nature of modern manufacturing processes. RT-SPC uses the same methodology underlying conventional CCs, but with streaming analysis of the data in real time. Assuming that the captured data can be represented by a normal distribution with mean μ and standard deviation σ, the probability that a data point falls within three standard deviations (3σ) of the mean μ is 99.73%. Based on this property, the lower and upper control limits are placed at 3σ from the mean, and the 1NR flags any point falling outside these limits.
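Stated compactly (restating the rule above rather than reproducing the paper's numbered equations):

```latex
\mathrm{UCL} = \mu + 3\sigma, \qquad \mathrm{LCL} = \mu - 3\sigma,
\qquad \text{1NR violation:}\ \ |x_i - \mu| > 3\sigma
```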
Considering Eqs. (2), (4), and (5), along with the 1NR requirements, only four parameters need to be kept in memory to analyze the data incrementally with the proposed RT-SPC method: the count, the mean, the standard deviation, and a buffer that holds the last 15 data points (n = 15). These parameters are updated whenever a new data point is received, and the control limits are updated accordingly. When a new data point arrives, the oldest data point is discarded, and a new mean and standard deviation are calculated using Eqs. (2), (4), and (5) with the updated set of 15 data points.
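The following Python sketch illustrates the kind of bookkeeping described above: a fixed 15-point buffer whose running sums are updated as each point arrives, with the 3σ limits recomputed from the current window and the 1NR checked against them. It is a simplified stand-in for Eqs. (2), (4), and (5), which are not reproduced here; the running-sum update form and the population standard deviation are assumptions made for illustration.

```python
# Sliding-window RT-SPC sketch (illustrative stand-in for Eqs. (2), (4), (5)).
# Cost per point is constant regardless of how many points have been processed.
from collections import deque
from math import sqrt

class RollingSPC:
    def __init__(self, window=15):
        self.window = window
        self.buf = deque()
        self.count = 0    # total points seen
        self.s1 = 0.0     # running sum of the window
        self.s2 = 0.0     # running sum of squares of the window

    def update(self, x):
        """Check x against the current 3-sigma limits (1NR), then slide the window."""
        violation = False
        n = len(self.buf)
        if n == self.window:
            mean = self.s1 / n
            sigma = sqrt(max(self.s2 / n - mean * mean, 0.0))  # population form, for brevity
            violation = abs(x - mean) > 3 * sigma              # 1NR check
        self.buf.append(x)
        self.s1 += x
        self.s2 += x * x
        if len(self.buf) > self.window:
            old = self.buf.popleft()
            self.s1 -= old
            self.s2 -= old * old
        self.count += 1
        return violation

# Usage sketch: flags = [RollingSPC-instance.update(v) for each incoming spindle load average]
```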
The RT-SPC based on the 1NR was developed in the cloud layer of the architecture, as shown in Fig. 2, where the algorithm analyzes the incoming messages received in real time from the MQTT message broker. The same development can be implemented in the fog or edge layer depending on the needs of the application. The code was developed in node-red, a flow-based programming environment for Node.js. In this case, the cloud implementation had the advantage that programming and testing could be done remotely by several users in collaboration, and no local computing resources were needed since the heavy lifting was performed on remote servers. In other words, this part of the process worked as a service.
As discussed above, rather than storing all the data points in a database and recalculating their mean and standard deviation, the incremental calculations require only a few parameters to be stored, making the algorithm very fast with little storage required. As the number of data points increases, calculating the mean and standard deviation of all data points for every incoming data point becomes computationally expensive. Through incremental calculations (i.e., stream analytics), the computational requirements remain constant, and the performance of the system does not degrade over time, enhancing scalability. This set of nodes was integrated into subsequent elements of the architecture.
The Nelson rules and anomaly detection in this study were applied to the spindle load data captured from a CNC lathe. The goal was to develop an unsupervised algorithm that detects a change in material type during the machining of a part on a CNC lathe. In other words, the objective is to identify repeated G-Code files and ensure that the same material was used for similar G-Code blocks. The algorithm is therefore expected to generate an alert if a part with different hardness and characteristics is machined and does not match the materials previously cut with the same G-code. An entity relation diagram is shown in Fig. 3 for the database created to store and correlate the G-Code filename, G-Code lines, and spindle loads captured from the MTConnect data stream generated by the CNC controller. This database was implemented in the fog layer.
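As one possible concrete rendering of the entity relations in Fig. 3 (the figure itself is not reproduced here), the following sketch creates a SQLite schema linking G-Code programs, G-Code lines, and spindle load samples; the table and column names are assumptions, not the exact schema used in this work.

```python
# Hypothetical SQLite rendering of the Fig. 3 entity relations (names are assumptions).
import sqlite3

db = sqlite3.connect("rtspc.db")
db.executescript("""
CREATE TABLE IF NOT EXISTS gcode_program (
    program_id   INTEGER PRIMARY KEY,
    filename     TEXT UNIQUE
);
CREATE TABLE IF NOT EXISTS gcode_line (
    line_id      INTEGER PRIMARY KEY,
    program_id   INTEGER REFERENCES gcode_program(program_id),
    line_text    TEXT
);
CREATE TABLE IF NOT EXISTS spindle_load (
    sample_id    INTEGER PRIMARY KEY,
    line_id      INTEGER REFERENCES gcode_line(line_id),
    load_percent REAL,
    captured_at  TEXT
);
""")
db.commit()
```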
The MTConnect data for the case study were captured and stored in comma separated values (CSV) format so that a simulation environment could be used to ease the debugging and verification of the algorithm, as shown in Fig. 4. This setup publishes to the MQTT message broker the same messages that were captured during the actual operation of the CNC machine.
In contrast with traditional internet communication protocols such as HTTP, which force the user to receive all messages and filter them on the client side, this architecture advocates the use of MQTT because it provides the flexibility of subscribing only to the topics of interest, reducing unnecessary traffic between the gateways. In this architecture, only the topics “Lpprogram,” “Lp1block,” and “LS1load” are subscribed to, representing the G-Code filename, G-Code line, and spindle load, respectively. Since these messages arrive with different timing, each one is stored in a global variable of the node-red flow when received, as shown in Fig. 5.
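The selective subscription can be sketched as follows; node-red's global context is approximated here with a plain dictionary, and the broker address is hypothetical.

```python
# Sketch of the selective topic subscription described above (illustrative).
import paho.mqtt.client as mqtt

TOPICS = ["Lpprogram", "Lp1block", "LS1load"]  # program name, G-Code line, spindle load
state = {}                                      # latest value per topic (stands in for node-red global context)

def on_connect(client, userdata, flags, rc):
    for topic in TOPICS:
        client.subscribe(topic)

def on_message(client, userdata, msg):
    state[msg.topic] = msg.payload.decode()
    # downstream logic (per-line averaging, RT-SPC) reads from `state`

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("fog-broker.local", 1883)        # hypothetical fog broker address
client.loop_forever()
```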
The function for the spindle load has the additional task of incrementally summing the spindle loads while recording their count. When all the spindle load messages related to a G-Code line have been received, this node calculates the mean of the captured spindle loads and passes it, along with the G-Code filename and G-Code line, to the next stage. This process is shown in the flow chart in Fig. 6.
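A simplified stand-in for this per-line averaging logic is sketched below; it assumes each load arrives tagged with the active program and G-Code line, which mirrors the flow of Fig. 6 but is not the node-red implementation itself.

```python
# Per-line averaging sketch (illustrative): loads are summed while the same
# G-Code line is active; when the line changes, the mean for the finished line
# is emitted downstream.
class LineAverager:
    def __init__(self):
        self.current_line = None
        self.total = 0.0
        self.count = 0

    def add_load(self, program, line, load):
        finished = None
        if line != self.current_line and self.count > 0:
            finished = (program, self.current_line, self.total / self.count)
            self.total, self.count = 0.0, 0
        self.current_line = line
        self.total += load
        self.count += 1
        return finished  # None until a line completes; the last line would need a final flush
```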
The “Incrementally Calculate” flow shown in Fig. 6 consists of multiple stages. The first stage reads the current values of the mean, standard deviation, and count of the spindle load, as well as the buffer holding the last 15 spindle load readings needed to compute the Nelson rules, and updates them with the incoming spindle load average. The critical challenge in this process is that reading from and writing to the database is time-consuming and asynchronous. If the spindle load averages were simply streamed into these nodes, several messages would read from and update the database at the same time instead of updating the values one by one, causing redundant reads and writes against a growing database.
To solve this problem, a buffer is created to hold all the messages. One message is injected into the flow, and when it reaches the ending node, the buffer is commanded to release the next data point into the flow. The messages are therefore kept in the buffer until they are allowed through the computational flows, ensuring they are analyzed serially, in sequence. The entire process, with the integration of the Nelson rules nodes, is shown in Fig. 7.
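Outside node-red, this release-on-completion behavior can be approximated with a thread-safe queue and a single worker, as in the following sketch; the actual flow uses node-red nodes rather than this code.

```python
# Serialization-buffer sketch (illustrative): incoming averages wait in a queue,
# and the next one is taken only after the previous one has finished its
# asynchronous database round trip, so points are analyzed strictly in order.
import queue
import threading

pending = queue.Queue()

def worker(process_point):
    while True:
        point = pending.get()   # blocks until a message is available
        process_point(point)    # read/update statistics in the database, run Nelson rules
        pending.task_done()     # completion signal; only then is the next point taken

def start(process_point):
    threading.Thread(target=worker, args=(process_point,), daemon=True).start()

# Producers simply call pending.put(point) as averages arrive from the MQTT flow.
```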
4 Case Study: Machining Aluminum and Steel
The purpose of the case study is to evaluate the RT-SPC algorithm integrated into the architecture in detecting anomalies. More specifically, the case study determines whether the algorithm can detect a change in the material type by watching the spindle loads generated during the mass production of a part using a CNC lathe. An OKUMA-Genos L250 was used as the CNC machine. The data from the machine were accessed using a SmartBox to add security to the IoT configuration. The data were available in the MTConnect standard. A picture of this machine is shown in Fig. 8. A facing operation was performed with this lathe on different materials. Three samples of aluminum and one of steel were used in this experiment. The diameters of the materials, as well as the machining parameters for the facing operation, are shown in Table 2. All the data collection and generation for this case study were performed in the edge layer (Fig. 1(a)).
The data were generated in the edge layer and then passed to the fog layer (Fig. 1(b)), where they were transmitted to an MQTT broker and made accessible to the cloud. In the fog layer, the data were also stored in a database for offline backup, as displayed in Fig. 3. Finally, the data were accessed by the cloud layer (Fig. 1(c)), where the analytics and further data storage were produced.
5 Results
The developed Nelson rules flow was first tested with a repeated set of the array {1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5}, injected at different time intervals. The result of the RT-SPC for the 1NR is shown in Fig. 9. Figure 10 shows the accumulation of the errors detected for Nelson rules 3 (more than six points in a row continually increasing or decreasing) and 7 (15 points in a row within the first standard deviation).
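For reference, the two rules reported in Fig. 10 can be expressed as simple window checks, as in the sketch below, which follows the rule descriptions given above rather than the exact node implementation. Fed with the repeated test array, rule 3 triggers on the long increasing runs within each injected copy of the array.

```python
# Window checks for Nelson rules 3 and 7 as described above (illustrative).
def rule3(points):
    """True if more than six consecutive points are strictly increasing or decreasing."""
    if len(points) < 7:
        return False
    last7 = points[-7:]
    increasing = all(b > a for a, b in zip(last7, last7[1:]))
    decreasing = all(b < a for a, b in zip(last7, last7[1:]))
    return increasing or decreasing

def rule7(points, mean, sigma):
    """True if the last 15 points all lie within one standard deviation of the mean."""
    if len(points) < 15:
        return False
    return all(abs(x - mean) <= sigma for x in points[-15:])
```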
Note that there is no need to collect sample data, compute the control limits, or implement any steps manually. Once the second data set is received, the algorithm automatically computes the control limits and updates them as new data points are received. Also note that as soon as any anomaly is detected, the algorithm responds with the detected Nelson rule error.
The spindle loads for each G-Code line captured from MTConnect for a facing operation on an aluminum sample (AL-S) are shown in Fig. 11.
The spindle loads captured while running AL-S1, AL-S2, AL-S3, AL-S4, AL-S5, and steel sample 1 (ST-S1), in that order, are shown in Fig. 12. Since multiple aluminum parts were machined before the steel part, the algorithm is expected to detect this change in material.
The mean of the spindle loads per G-Code block was calculated. The result of this calculation for one of the AL-Ss is shown in Fig. 13. As presented in Fig. 11, the G-Code line “G1 X0 F.03” represents the facing operation and is the most important G-Code considered in this study. Figure 14 presents the spindle loads for this G-Code block, the mean, and the control limits. The last data point in Fig. 14(a) represents the spindle loads related to the stainless-steel sample and, as can be seen, is statistically abnormal. Figure 14(b) provides a zoomed view of the zone of part (a) where the lower and upper control limits have not yet been calculated for the available data.
Figure 15 presents the actual data points calculated and fed to the architecture. This figure shows the computed statistical errors (abnormalities) in the average spindle loads per G-Code block for all six samples, calculated using the above-mentioned flows in the architecture. All the values stored in the database to perform RT-SPC for this case study are shown in Fig. 16. Note that these values are updated as more data points are published, in contrast with the traditional methods where all the data points are stored and the process becomes more complex over time, as presented in Fig. 17.
6 Discussion
During the implementation of the CPS model shown in Fig. 1, it became clear that some functions could be performed in either the cloud or the fog layer. For example, in this case, the RT-SPC algorithm was implemented in the cloud layer using node-red, as explained in Sec. 3; however, a similar implementation could be executed locally in the fog layer. Other functions can only be applied in the layer for which they were designed, for example, creating a buffer in the fog layer to retain the data even when the internet connection is lost.
The implementation of the MQTT protocol (which started in the fog layer) for transporting the data provided the necessary performance and flexibility to be compatible with the rest of the architecture and to implement the process in the cloud as shown in Fig. 7.
From Figs. 14 and 15, the RT-SPC algorithm calculates the control limits immediately (from the 1NR); however, it reports early errors until it becomes stable. Once stable, the algorithm can identify (by a violation of the 1NR) the switch in working materials. The challenge is then to define when stability has been reached. One proposal is to track the rate at which abnormalities are detected; when this rate approaches zero, stability has been reached. Defects in the machining process or changes in the material being machined can be detected only after stability has been reached. After the switch of materials (from aluminum to steel), the algorithm adapts to the new normal, as shown in Fig. 15. Besides the points detected as abnormalities while the control limits are not yet stable, another limitation of this work is that it is focused only on the spindle loads of the CNC machine and only two materials. The effect of having different materials with similar properties that affect their machining was not assessed in this work.
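One way the proposed stability criterion could be operationalized is sketched below; the window size and threshold are arbitrary assumptions for illustration, not values used in this work.

```python
# Possible realization of the stability criterion discussed above (illustrative):
# stability is declared once the fraction of flagged points in a recent window
# falls below a small threshold (window size and threshold are assumptions).
from collections import deque

class StabilityMonitor:
    def __init__(self, window=30, threshold=0.05):
        self.flags = deque(maxlen=window)
        self.threshold = threshold

    def update(self, violated: bool) -> bool:
        """Record whether the latest point violated a rule; return True once stable."""
        self.flags.append(1 if violated else 0)
        full = len(self.flags) == self.flags.maxlen
        return full and (sum(self.flags) / len(self.flags)) < self.threshold
```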
Figure 17 displays the advantage of the RT-SPC: the traditional method of recomputing over all the data every time a new point is sampled increases the computational cost as new data arrive, whereas the RT-SPC requires a constant, low computation time (3 ms) to calculate the control limits, independent of the data sample size. This independence from the data record length also yields clear and low memory requirements (in this case, 15 points).
While this work focused on only one machine tool, having multiple machine tools enabled in this CPS model with automated analytics can bring great benefits toward Industry 4.0. First, the traceability of manufactured components can be greatly simplified since the framework allows the data to be retrieved down to the line of G-Code of a program run on a particular machine; with this, defects or redesigns can be addressed efficiently. Second, since the process is automated, productivity can be monitored through key performance indicators such as runtime, throughput, changeover time, cycle time, and downtime. Third, quality and failure prevention can be monitored effortlessly in real time since the RT-SPC will detect abnormalities arising from human error or machine malfunction. And last, having the fog and cloud layers balances remote services accessible by multiple users in different geographic zones (cloud) with offline backup and near-real-time response (fog).
This work presented results for machine tools; however, the framework should work for other data streams, for example, data coming from sensors whether or not they are embedded in machines. Once that limitation is removed, the CPS presented here can be employed in manufacturing facilities of a different nature, which can be the topic of future research.
7 Conclusions
This work contributes to the implementation of Industry 4.0 by including not only the architecture of a CPS that allocates the computational resources but also a systematic approach to the automatic application of algorithms that transform data into insightful information. The proposed architecture consists of three layers: edge, fog, and cloud. The edge layer collects the data from different sources using MTConnect and other protocols adapted to each case. The data are transmitted to the fog layer using the MQTT protocol. In the fog layer, the data are prepared to be sent to the cloud layer; in case of lost connectivity, the data can be buffered and sent once the connection is reestablished. The fog layer can also apply algorithms. The cloud layer stores the data and applies analytics to turn data into information. The concept was deployed in an industrial environment in which a case study was performed. The objective of the case study was to identify variations in the material being worked by a CNC machine tool. An RT-SPC based on the Nelson rules was implemented in the proposed CPS architecture. The results show an algorithm that can identify different materials (in this case, aluminum and steel). The implemented algorithm does not require training and provides a short and stable response time compared with the construction of CCs from buffered data.
8 Future Work
Future work is needed on the following topics: cyber-security, optimization, energy consumption, and digital twin integration. This work did not consider cyber-attacks and online security. This aspect is one of the bottlenecks for Industry 4.0 technologies and an important concern for manufacturing companies. Further work is needed to assess the risk level of this CPS proposal and to draft strategies to reduce the risk. The CPS proposed here was developed organically based on the needs and resources available in the lab; however, a systematic optimization could produce a lean model for the CPS architecture that determines, according to the application, the precise amount of resources to allocate in each layer. The aspect of energy consumption also needs to be addressed, since important environmental issues will be impacted as more companies adopt solutions that combine edge, fog, and cloud layers; the problem can be approached from the point of view of energy efficiency, not only of the CPS layers but also of the energy sources used. Finally, the CPS proposed here and the analytics produced can be inserted into a digital twin for products, which can help increase traceability and quality and allow different scenarios to be tested. How to connect the CPS explored in this work with such a digital twin is an aspect that requires further research.
Acknowledgment
This work was partially supported by National Science Foundation (Grant Nos. 1631803 and 1646013). This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan.
The authors would like to acknowledge the financial and technical support of Writing Lab, TecLabs, Tecnológico de Monterrey, CONACYT, and the research group Automotive Consortium for CPSs in the production of this work.
Author Contributions
The following are the contributions of the authors: conceptualization, M.P., T.K., and P.D.U.C.; methodology, M.P.; software, M.P.; validation, M.P.; formal analysis, P.D.U.C.; investigation, M.P.; resources, M.P.; data curation, M.P.; writing—original draft preparation, M.P., P.D.U.C.; writing—review and editing, M.P. and P.D.U.C.; visualization, M.P.; supervision, T.K. and C.S.; project administration, T.K. and C.S.; funding acquisition, T.K.
Conflict of Interest
There are no conflicts of interest.