SMILE: Smart Monitoring IoT Learning Ecosystem

Several solutions exist today for monitoring industrial systems and intervening when anomalies or failures occur. With a classic approach, covering all the requirements of an industrial setting means implementing different solutions on different monitoring platforms to obtain end-to-end coverage. The classic cause-effect association process in industrial monitoring also requires a thorough understanding of the monitored ecosystem and of the main characteristics that trigger the detected anomalies. In these cases, complex decision-making systems are put in place and often provide poor results. This paper introduces a new approach based on an innovative industrial monitoring platform named SMILE. It offers an automatic service for monitoring the overall performance of a modern industry and lets users create their own machine/deep learning models by setting goals through a web dashboard, from which the collected data and the produced results can also be viewed. Thanks to an unsupervised approach, the SMILE platform can learn which linear and non-linear correlations represent the overall state of the system in order to predict and, therefore, report abnormal behavior.

Keywords: unsupervised machine learning; industry 4.0; smart monitoring; internet of things; maintenance perspective.


I. INTRODUCTION
To date, industrial smart monitoring solutions are based either on threshold systems applied only to the monitored device or on supervised machine learning techniques [1], [2]. In the case of single-board systems, each device can be associated with an alerting system that fires when the individual component does not behave as expected; the system is thus informed only reactively, which results in a halt to the production process. Approaches that rely on supervised machine learning handle failures and malfunctions proactively, but require considerable application time and skills, resulting in increased costs; the feature extraction phase turns out to be the most time-consuming step, making this solution applicable only in a few cases.
This study proposes a Smart Monitoring IoT Learning Ecosystem (SMILE) for the intelligent monitoring and control of the ecosystem of processes that characterize the corporate "life" of a factory. It uses automatic prescriptive techniques combined with supervised and unsupervised machine and deep learning techniques, artificial intelligence and innovative IoT sensors equipped with advanced communication algorithms. In addition, SMILE aims to automate information extraction, lowering the cost of implementing this process and offering greater accuracy in anomaly detection thanks to the use of deep neural networks. An innovative dashboard will visualize the characteristics learned by the neural networks about the operation of a given process or system, as well as linear and non-linear correlations, thus highlighting the parameters that affect each individual device in the network.
The following types of analysis can be carried out on the platform:
• descriptive, with the aim of representing the state of the data and the correlations between them;
• diagnostic, with the aim of detecting the causes of abnormal behavior;
• predictive, to predict when anomalous behavior will occur;
• prescriptive, to know when to intervene to avoid abnormal behavior and downtime.
The SMILE platform will focus on two application scenarios of interest:
• Prescriptive maintenance for quasi-unmanned environments;
• Architecture for remote control as a service.
The heart of the SMILE project includes:
• development of an integrated advanced IoT sensor monitoring and control system;
• development of process-independent predictive and prescriptive maintenance systems based on unsupervised and supervised machine learning algorithms, with the aim of making maintenance and production phases more efficient;
• acceleration of the learning phases for anomaly detection and prediction models through the analysis of inter-factory data, that is, homogeneous data measured in different factories;
• an alert communication system (SMS, email, Telegram, etc.);
• development of a dashboard through which the user can manage the entire platform;
• design and development of innovative IoT sensors (reactive power, monitoring of network parameters, rain estimation, etc.) to improve the quality of the monitored data;
• monitoring of the reliability of a given sensor (lifetime prediction) and of the validity of the measured data (dirty sensor, etc.) by applying machine learning techniques.
All this will allow moving towards a model of process-independent monitoring and prescriptive analysis as a service.

A. Platform Architecture Overview
The proposed architecture aims to create an intelligent industrial monitoring platform that is as customizable and general-purpose as possible, while still allowing it to be verticalized according to specific needs. The platform monitors complex systems and runs triggers when previously defined conditions occur.
Specifically, Zabbix, an open-source enterprise product for monitoring IT infrastructure, services, applications and cloud resources, was identified as the underlying tool. Zabbix is highly customizable and suits companies with very diverse practical applications [3]. It works well in distributed systems, cooperating with other Zabbix systems through a configuration that conveys data in server-to-server or proxy-to-server mode. Thanks to these features, SMILE can interface with systems that already use Zabbix or other monitoring platforms and provide an intelligent monitoring service, aimed at any type of industry, that adds value to the standard features of Zabbix. In addition, this readiness for distributed environments will allow the SMILE platform to aggregate and process homogeneous data from different installations, accelerating the analysis and prediction processes through what are described as inter-factory and intra-factory analysis.
The diagram in Fig. 1 shows a possible configuration for the case in which the required verticalization does not involve a previously installed monitoring tool: specific applications, called agents, are installed and provide data to the SMILE platform for processing. Fig. 2, on the other hand, shows how the platform will be configured when a monitoring tool already exists and sends SMILE all the data needed for the analysis. In this case, a special agent called a "platform agent" will be developed to convey the data from one platform to the other. Thanks to these two configurations, each industry will be able to interface with the SMILE platform and take advantage of the offered services, while the platform itself arranges all the necessary components. Once the infrastructure is configured, a period of system observation begins in which, via the web interface, the user can define the business objectives based on the produced data: once the configured data is being received, the user can define which constraints the system must comply with and which events must be recorded during observation. The SMILE platform will then be able to find correlations between the data and to classify, as well as predict, events in the system.
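The observation phase described above can be illustrated with a minimal sketch. It assumes that a user-defined constraint is a simple lower/upper bound on one metric and that agents deliver `(timestamp, metric, value)` samples; the `Constraint` class, the `record_events` helper and the metric names are hypothetical and are not part of SMILE or Zabbix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Constraint:
    """A user-defined bound on one monitored metric (hypothetical structure)."""
    metric: str
    lower: Optional[float] = None
    upper: Optional[float] = None

    def violated_by(self, value: float) -> bool:
        if self.lower is not None and value < self.lower:
            return True
        if self.upper is not None and value > self.upper:
            return True
        return False


def record_events(samples, constraints):
    """Return (timestamp, metric, value) for every sample that breaks a constraint.

    `samples` is an iterable of (timestamp, metric, value) tuples as they
    might arrive from an agent during the observation period.
    """
    by_metric = {}
    for c in constraints:
        by_metric.setdefault(c.metric, []).append(c)
    events = []
    for timestamp, metric, value in samples:
        if any(c.violated_by(value) for c in by_metric.get(metric, [])):
            events.append((timestamp, metric, value))
    return events


# Illustrative metric names and bounds, not taken from the paper.
constraints = [Constraint("motor_temp_c", upper=80.0),
               Constraint("line_pressure_bar", lower=2.0, upper=6.0)]
samples = [(0, "motor_temp_c", 71.5),
           (1, "motor_temp_c", 84.2),
           (1, "line_pressure_bar", 1.7),
           (2, "line_pressure_bar", 4.0)]
events = record_events(samples, constraints)
```

The recorded events could then serve as the labelled occurrences against which correlations are searched; how SMILE actually stores them is not specified in the text.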

Possible uses involve:
• detection and prediction of anomalies and/or failures of industrial plants;
• detection and prediction of abnormal environmental conditions such as fire and/or increased pollution;
• detection and prediction of anomalies and/or failures in data centers/cloud environments.
The results of the analyses will be shown in the dashboard and made available to the monitoring system, from which one can, for example, configure triggers to interact with the system for specific notifications and/or actions. This is what the SMILE platform refers to as a "smart agent": an intelligent application agent capable of evaluating, predicting and acting appropriately on the specified events.

B. Process-independent Smart Monitoring Algorithms
This section proposes a method for creating an optimized, composite neural network for detecting and predicting anomalies (defined on the basis of industrial objectives) in the generic context of industrial machinery. In this way, prescriptive maintenance can be carried out, reducing, for example, the costs of periodic machinery maintenance, which have a significant economic impact in the industrial context. There are several studies on anomaly prediction in smart industries, based on supervised and unsupervised machine learning techniques [4]-[6]. Today, industries usually use supervised machine learning techniques for intelligent industrial monitoring. Supervised learning methods such as Naïve Bayes [7] and the Support Vector Machine [8] can be used to implement data classification and regression, but only after the feature creation phase. Deep learning techniques address this limitation by solving the problem of feature engineering and extraction: deep neural networks have multi-layered hidden structures that represent the data in a more abstract way, making it possible to find linear and non-linear correlations between information of different types, generated by multiple sources belonging to the monitored industrial system [9], [10].
The idea is to find a set of deep neural networks that automatically perform feature selection and extraction. The fundamental difference between the networking context (paragraph C) and the more generic one (smart monitoring of machinery via IoT sensors) lies in the heterogeneity of the data provided by the IoT sensor ecosystem installed on the machines: because the machines differ in nature, different physical quantities are monitored, and these require a pre-processing step (homogenization) before being provided as input to the deep networks for the extraction of salient features. Once the features have been extracted through the unsupervised methods implemented by the deep networks, the classification is defined in relation to the operating objective (set by the industry itself), in order to detect errors and/or anomalies whenever the industrial ecosystem does not behave as the objective requires. To date, various deep learning architectures have been developed, and research topics relevant to the industrial field are growing rapidly. Several typical deep learning architectures are discussed below.
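The homogenization step is not detailed in the text. A minimal sketch, assuming that it amounts to z-score scaling of each sensor channel so that heterogeneous physical quantities share a unitless scale, could look as follows; the function names and the sample channels are hypothetical.

```python
from statistics import mean, pstdev


def homogenize(readings):
    """Z-score each sensor channel so all channels share a unitless scale.

    `readings` maps a sensor name to its list of raw values. Channels with
    zero spread are mapped to all zeros to avoid division by zero.
    """
    scaled = {}
    for sensor, values in readings.items():
        mu = mean(values)
        sigma = pstdev(values)
        if sigma == 0:
            scaled[sensor] = [0.0 for _ in values]
        else:
            scaled[sensor] = [(v - mu) / sigma for v in values]
    return scaled


def to_feature_rows(scaled):
    """Transpose per-sensor series into one feature vector per time step."""
    sensors = sorted(scaled)
    return [list(row) for row in zip(*(scaled[s] for s in sensors))]


# Illustrative channels with different units and magnitudes.
raw = {
    "temperature_c": [20.0, 22.0, 24.0],
    "vibration_mm_s": [0.10, 0.30, 0.20],
    "rpm": [1500.0, 1500.0, 1500.0],
}
rows = to_feature_rows(homogenize(raw))
```

Each element of `rows` is one time step with the channels in alphabetical order, which is the kind of uniform input a feature-extraction network would expect. Other scalings (min-max, robust scaling) would fit the same interface.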

1) Convolutional Neural Networks (CNN):
A CNN is a multilayer feed-forward artificial neural network initially proposed for two-dimensional image processing. Its use for one-dimensional sequential data, including natural language processing and speech recognition, has also been studied recently. In a CNN, feature learning is achieved by alternating and stacking convolutional layers and pooling operations. After the multi-layer features have been learned, fully connected layers convert the two-dimensional feature map into a one-dimensional vector, which is fed into a softmax function to build the model. Studies have been conducted on the applicability of CNNs [11] to this problem: the received signal is transformed with the Fast Fourier Transform and the wavelet transform, and the processed signal is passed as a 2D input to a CNN and a bi-directional LSTM. The resulting prediction takes into account the long-term energy correlations of the machines and can therefore anticipate a malfunction in relation to the predefined goal.
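The sequence of convolution, pooling, fully connected layer and softmax can be made concrete with a small forward pass. This is only an illustrative sketch on a one-dimensional signal: the kernel and the dense-layer weights are hand-set rather than learned, and no FFT/wavelet front end or LSTM is included.

```python
import math


def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as used in CNNs)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(xs):
    return [max(0.0, x) for x in xs]


def max_pool(xs, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]


def dense(xs, weights, biases):
    """Fully connected layer: one weight row per output class."""
    return [sum(w * x for w, x in zip(row, xs)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
edge_kernel = [-1.0, 1.0]            # hand-set, responds to rising edges
features = max_pool(relu(conv1d(signal, edge_kernel)), size=2)
weights = [[1.0, 0.0, 1.0, 0.0],     # illustrative, untrained weights
           [0.0, 1.0, 0.0, 1.0]]
probs = softmax(dense(features, weights, biases=[0.0, 0.0]))
```

In a trained network the kernel and weights would be learned from data, and several convolution/pooling stages would usually be stacked before the dense layer.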

2) Restricted Boltzmann Machine (RBM):
An RBM is an energy-based model in which the visible layer of the neural network receives the input data, while the hidden layer extracts features, yielding a different representation of the visible layer, that is, of the input. The RBM extracts features automatically from the training dataset, which helps to avoid local minima.
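For reference, the standard energy function of a binary RBM and the conditional activation of its hidden units can be written in a few lines. The sketch below uses hand-set weights purely for illustration and does not include training (e.g. contrastive divergence).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def energy(v, h, W, a, b):
    """RBM energy E(v, h) = -a.v - b.h - v^T W h for binary units."""
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(bj * hj for bj, hj in zip(b, h))
    interaction = sum(v[i] * W[i][j] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    return -visible_term - hidden_term - interaction


def hidden_probabilities(v, W, b):
    """p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i W_ij): the extracted features."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


# Hand-set parameters: 3 visible units, 2 hidden units (illustrative only).
W = [[2.0, -1.0],
     [2.0, -1.0],
     [-1.0, 2.0]]
a = [0.0, 0.0, 0.0]   # visible biases
b = [-1.0, -1.0]      # hidden biases
v = [1, 1, 0]         # input pattern on the visible layer
features = hidden_probabilities(v, W, b)
```

With these weights the first hidden unit responds strongly to the pattern `[1, 1, 0]`, and the configuration in which it is active has the lower energy, which is the sense in which the hidden layer gives a different representation of the input.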

3) Auto-Encoder (AE):
An Auto-Encoder (AE) is an unsupervised learning algorithm that extracts features from input data without the need for label information. It consists of two main parts, an encoder and a decoder. The encoder maps the input to a hidden layer and can thereby compress the data, especially when the input is high-dimensional. The decoder reconstructs an approximation of the input from the hidden representation. The goal of the approach proposed here is to find the optimal combination of deep neural network and machine learning method, suited to the "as-a-service" formula and to the continuous changes that take place in complex systems such as those in question.
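A common way to use an auto-encoder for anomaly detection is to flag inputs that it reconstructs poorly. The sketch below assumes this usage, which the text does not spell out, and uses a tiny linear auto-encoder with hand-set, tied weights instead of a trained deep network; the threshold is likewise an arbitrary illustrative value.

```python
import math

# Hand-set, tied weights for a 2 -> 1 -> 2 linear autoencoder (illustrative
# only; a real model would learn them from data).
W = [1 / math.sqrt(2), 1 / math.sqrt(2)]


def encode(x):
    """Compress a 2-D input to a single latent value."""
    return W[0] * x[0] + W[1] * x[1]


def decode(z):
    """Reconstruct an approximation of the 2-D input from the latent value."""
    return [W[0] * z, W[1] * z]


def reconstruction_error(x):
    """Euclidean distance between the input and its reconstruction."""
    x_hat = decode(encode(x))
    return math.sqrt(sum((xi - ri) ** 2 for xi, ri in zip(x, x_hat)))


def is_anomalous(x, threshold=0.5):
    """Flag inputs the autoencoder cannot reconstruct well."""
    return reconstruction_error(x) > threshold


normal = [1.0, 1.1]     # the two channels move together, as in normal operation
abnormal = [1.0, -1.0]  # the channels disagree
```

Here `is_anomalous(normal)` is false and `is_anomalous(abnormal)` is true, because the single latent value can only represent inputs in which both channels agree.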

C. Smart Monitoring Network Algorithms for Predictive and Prescriptive Maintenance
The goal is to build an AI-based networking framework that optimizes the anomaly detection and prediction process with respect to a predefined goal, for example minimizing business downtime or maximizing routing performance based on the load in the system. The networking framework will need to be optimized to meet the key performance indicators (KPIs) specified by the industry, applying unsupervised machine learning to extract significant features, combined with machine learning techniques to classify and/or predict the predetermined goals (e.g. optimization of run-time routing).
Deep neural networks, which have more layers than traditional neural networks, allow for a more precise training phase. Various deep neural networks, in their different forms and with the appropriate changes (to perform feature selection and extraction), will be combined with various known classifiers in order to choose the pairing of deep network and machine learning technique that achieves optimal accuracy. The proposed framework will use the software-defined networking (SDN), network function virtualization (NFV) [13] and multi-access edge computing (MEC) [14] paradigms and consists of three levels:

1) Physical Deployment Layer:
A communication network with SDN switches, divided into Ingress Switches (IS) and Core Switches (CS). The communication infrastructure thus separates the data plane, implemented by the SDN switches, from the control plane, implemented by the SDN Controller. In accordance with the SDN paradigm, and using the OpenFlow protocol between the SDN Controller and the SDN switches, the SDN Controller can decide programmatically, at run time, how to route each flow, so that flows with special reliability and delay requirements are assigned appropriate routes.

2) Network-Level Support Engine:
It is responsible for configuring the network path of each flow so as to guarantee the KPIs specified for it. It is implemented with Network-Level Agents (NL-A) deployed as Virtual Network Functions (VNFs) running on servers directly connected to the Ingress Switches. Their goal is to analyze the traffic entering the switches with a Deep Packet Inspection (DPI) function and to predict the current state and future behavior of the device that generated it. This prediction is made with a combined system based on supervised and unsupervised machine learning, able to analyze observable sequences and infer from them a model that accounts for any correlations between the flows generated by machines that interact with each other.

3) Application-Level Support Engine:
Its task is to intervene promptly in the control of the machinery if the commands coming from the remote-control system do not arrive on time, that is, within the delay bound predetermined for each flow. This is achieved through Application-Level Agents installed as Virtual Network Functions (VNFs) on network access nodes, close to the industrial machines to be controlled.
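Two decisions made by the three levels above lend themselves to a short sketch: choosing a route that satisfies a flow's delay and reliability KPIs, and detecting that a remote command is overdue so that a local agent has to take over. The code assumes that candidate routes and their KPI estimates are already known; the `Route` class, the function names and all numeric values are hypothetical, and no OpenFlow or controller API is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A candidate path through the SDN switches (hypothetical structure)."""
    hops: tuple
    delay_ms: float
    reliability: float  # probability of successful delivery, 0..1


def select_route(routes, max_delay_ms, min_reliability):
    """Return the lowest-delay route meeting both KPIs, or None if none does."""
    feasible = [r for r in routes
                if r.delay_ms <= max_delay_ms and r.reliability >= min_reliability]
    return min(feasible, key=lambda r: r.delay_ms) if feasible else None


def needs_local_control(last_command_ms, now_ms, deadline_ms):
    """True when the remote command is overdue and the local agent must act."""
    return (now_ms - last_command_ms) > deadline_ms


# Illustrative candidate routes between two ingress switches.
routes = [
    Route(("IS1", "CS1", "IS2"), delay_ms=4.0, reliability=0.990),
    Route(("IS1", "CS2", "IS2"), delay_ms=6.0, reliability=0.9999),
    Route(("IS1", "CS1", "CS2", "IS2"), delay_ms=9.0, reliability=0.9995),
]
chosen = select_route(routes, max_delay_ms=8.0, min_reliability=0.999)
```

In this example the fastest route is rejected because it misses the reliability KPI, so the second route is chosen. In the framework described above, the KPI estimates would come from the prediction models of the Network-Level Agents rather than from fixed values.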

D. "Smart Agents"
The "smart agent" is a software module that acts as a connection interface between the monitoring platform and deep learning and machine learning algorithms that will provide predictions about anomalies and possible triggers.
The "smart agent" should also be able to assess the learning status of the algorithms and provide detailed information, shown on the dashboard, about possible shortcomings (e.g. lack of data). Once the accuracy of the machine learning algorithms has been evaluated, the "smart agent" can make decisions and take actions such as alerting or firing previously defined triggers. In addition, at regular intervals the newly received data will be aggregated and added to the dataset to refine the created models (by fine tuning and continuous improvement). In this way the "smart agent" will respond to possible structural changes in the system transparently, with minimal or no intervention by the user. The processed data will be sent to SMILE, which will store it and display it on the dashboard.
Innovative sensors need specific "smart agents", whose purpose is to collect data from the innovative sensors (paragraph E) and determine their reliability in terms of data quality and residual lifetime. This module should be able to take the data flow from the aggregation platform provided by the open-source software Zabbix and determine, in real time, the degree of reliability of the measurements from one or more sensors. This is necessary to validate the learning process, as poor data quality produces multiple false positives or negatives, so that the platform's goal is not achieved. For this reason, it is essential to ensure the accuracy and quality of the data received from the sensor itself. This assessment will also allow the SMILE platform to estimate the sensor's remaining lifespan and take preventive/prescriptive action. Finally, the data quality assessment agent will have to initiate an alerting procedure that notifies the user, through the systems described above, of a malfunction caused by a faulty sensor and therefore of the need for maintenance.
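The behavior of a "smart agent" (reporting its learning status, acting on predictions, and refining its model with new data) can be sketched as a small class. This is only an illustration under strong simplifications: the model is a running mean and variance standing in for the deep learning models, and the class name, status strings and thresholds are all hypothetical.

```python
class SmartAgent:
    """Toy agent: buffers data, reports learning status, raises alerts.

    The model here is a running mean/variance, a stand-in for the deep
    learning models; names and thresholds are hypothetical.
    """

    def __init__(self, min_samples=5, z_threshold=3.0):
        self.min_samples = min_samples
        self.z_threshold = z_threshold
        self.dataset = []

    def learning_status(self):
        """Report shortcomings such as lack of data for the dashboard."""
        if len(self.dataset) < self.min_samples:
            return "lack of data"
        return "ready"

    def _z_score(self, value):
        n = len(self.dataset)
        mu = sum(self.dataset) / n
        var = sum((x - mu) ** 2 for x in self.dataset) / n
        if var == 0:
            return 0.0 if value == mu else float("inf")
        return abs(value - mu) / var ** 0.5

    def observe(self, value):
        """Score a new value, then add normal values to the dataset."""
        if self.learning_status() != "ready":
            self.dataset.append(value)
            return "learning"
        if self._z_score(value) > self.z_threshold:
            return "alert"          # anomalies are not used for refinement
        self.dataset.append(value)  # continuous refinement with new data
        return "ok"


agent = SmartAgent()
results = [agent.observe(v) for v in [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 25.0]]
```

The agent stays in a "learning" state until enough samples are available, then accepts the sixth value and raises an alert on the last one. Excluding flagged values from the dataset is a design choice of this sketch, made so that anomalies do not gradually shift the notion of normal behavior.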

E. Innovative Sensors
SMILE involves the integration of innovative sensors to increase the heterogeneity of the monitored data, identify correlations that may otherwise be difficult to obtain and provide an innovative tool to assess the reliability of the sensors, in terms of both measurement quality and remaining lifetime. The innovative sensors will be geared towards monitoring different aspects of interest in the smart factory. The data sampled from these sensors, added to those already in place, will provide a more comprehensive picture of the state of the factory, thus contributing to a more accurate detection of anomalies and their underlying causes. The areas in which innovative sensors will be introduced are:
• environmental monitoring (air quality, rain, outdoor and indoor temperature sensors) [20], [21];
• data network monitoring [22].

F. Reliability and "Smart Agent"
IoT sensors are the real heart of a monitoring system: it is impossible, in fact, to unleash the true strength of predictive maintenance without being able to rely on qualified sources of data. For these reasons, the objective is to establish, with a certain degree of accuracy, the quality of the measured data (e.g. a dirty sensor may detect incorrect values) and the reliability of the sensors, i.e. to obtain an estimate of the remaining lifetime. This, of course, will improve the predictive model and consequently the quality of notifications that allow operators to differentiate anomalies due to a sensor malfunction from those due to a faulty device integrated into the factory's mission-critical processes.
The large amount of data collected by sensors is used by decision-making systems that, using intelligent paradigms for the processing of measurement signals or machine learning (ML) techniques, provide feedback on the status of systems and personnel. Malfunctions of these devices can seriously compromise the operation of entire industrial structures or, in the worst cases, even put workers' lives at risk. The discipline of predicting and detecting device malfunction is called reliability. Reliability refers to the ability of a sensor or system to meet its "nominal" operating specifications over time. SMILE considers different reliability analysis techniques based on:
• Principal Component Analysis (PCA) for process monitoring and validation of the received sensor data: after the process has been modelled with PCA, the model residuals are used to detect the presence of faulty sensors;
• linear time-series models, which can be built without prior knowledge of the types of anomalies that the sensor data might contain;
• clustering algorithms for error detection;
• decision trees.
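As an illustration of the second technique in the list, the following sketch fits a first-order autoregressive model by least squares and flags samples whose one-step-ahead prediction error exceeds a tolerance. The choice of an AR(1) model, the tolerance and the sample values are assumptions made for the example; the text does not specify which linear model SMILE uses.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = slope * x[t-1] + intercept."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(series, slope, intercept):
    """One-step-ahead prediction errors of the fitted model."""
    return [y - (slope * x + intercept)
            for x, y in zip(series[:-1], series[1:])]


def flag_faults(series, slope, intercept, tolerance):
    """Indices (into `series`) whose prediction error exceeds `tolerance`."""
    return [i + 1 for i, r in enumerate(residuals(series, slope, intercept))
            if abs(r) > tolerance]


training = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5, 23.0]   # slow, regular drift
slope, intercept = fit_ar1(training)
live = [23.0, 23.5, 24.0, 30.0, 25.0]                    # 30.0 is a glitch
faults = flag_faults(live, slope, intercept, tolerance=1.0)
```

The model is fitted only on normal data and needs no description of what a fault looks like. Note that the sample following the glitch is flagged as well, because its prediction is computed from the glitch value; a practical detector would have to account for this.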

G. Dashboard and Alerting System
The Smart Dashboard will integrate and display the components of the SMILE platform. In particular, the required features are:
• visualization of sensor data and system state;
• goal-setting;
• definition of actions (triggers);
• predictions generated by "smart agents".
For the first three features, the SMILE platform will need to interface with Zabbix and/or an equivalent platform through APIs that offer the services needed to develop applications that use the platform as a data aggregator and trigger engine. Fig. 3 shows the interaction between the various modules that make up SMILE and the external platforms (Zabbix). "Smart agents" will receive the flow of data from Zabbix, process it continuously according to the specific task they are used for (e.g. sensor reliability or process anomaly prediction), and communicate the prediction to SMILE, which in turn can use the alerting module to send notifications to operators via SMS, email or social media. If a solution to the problem exists, the SMILE platform can define actions to minimize and resolve the detected anomaly.
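The Zabbix API is exposed as JSON-RPC 2.0, so a "smart agent" that pulls recent values could start from a request body such as the one built below. This is a minimal sketch that only constructs the payload for the `history.get` method and sends nothing over the network; the helper names are hypothetical, the item ID is a placeholder, and authentication is intentionally omitted because the way the token is passed depends on the Zabbix version.

```python
import json
from itertools import count

_request_ids = count(1)


def zabbix_request(method, params):
    """Build a JSON-RPC 2.0 request body for the Zabbix API.

    Authentication is deliberately left out: how the token is passed depends
    on the Zabbix version in use.
    """
    return {"jsonrpc": "2.0", "method": method, "params": params,
            "id": next(_request_ids)}


def latest_values_request(item_id, limit=10):
    """Request the most recent numeric (float) history values of one item."""
    return zabbix_request("history.get", {
        "output": "extend",
        "history": 0,            # 0 = numeric float values
        "itemids": [item_id],
        "sortfield": "clock",
        "sortorder": "DESC",
        "limit": limit,
    })


# "23296" is a placeholder item ID, not one taken from the paper.
body = json.dumps(latest_values_request("23296", limit=5))
```

The resulting `body` would be sent as an HTTP POST with a JSON content type to the API endpoint of the Zabbix front end; the response handling and the alerting step are left out of this sketch.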
Finally, "IoT alerts", which allow the SMILE system to manage potential anomalies of the sensors that carry out the measurements, can be communicated either by acting on the sensors directly, through specific methods of the Zabbix API, or indirectly, by posting alerts via social media and configuring the IoT sensors to receive them.

The SMILE platform focuses on two application scenarios:
• Prescriptive Maintenance for Quasi-unmanned Environments;
• Architecture for Remote Control as a Service.

A. Prescriptive Maintenance for Quasi-unmanned Environments
The planned network infrastructure consists of a network of sensors that periodically collect information and make it available, through a gateway, to the SMILE platform, which processes it and applies intelligence through prescriptive maintenance algorithms. The architecture is shown in Fig. 4. The sensor nodes use LoRaWAN [23] technology, well known for its long transmission range, which also offers excellent indoor propagation thanks to the use of the 868 MHz band. It will therefore be possible to communicate over long distances and with long device lifetimes. If necessary, such networks could also become mesh networks with 6LoW devices to ensure a wider pool of infrastructure-compatible devices.
The collected information is then sent to one or more LoRaWAN gateways, which forward it to the cloud. On the cloud side, the incoming features are handled by a prescriptive maintenance algorithm. As outlined in Fig. 5, the algorithm consists of two cooperating machine learning sub-modules. The first performs unsupervised learning on the incoming features, periodically checking the available data for outliers and, if any are found, determining whether they indicate possible system anomalies. Because this first algorithm is unsupervised, the model remains abstract and can be applied in different industrial contexts with different parameters to monitor.
The output of the first algorithm is then provided as input to the second, which carries out specific actions in response to an anomaly reported during the previous phase. The learning technique adopted in this second module is supervised learning, for example recurrent neural networks (RNNs), which can take into account what has happened previously. The cloud side, that is, the SMILE platform, consists of "smart agents" that contain the smart monitoring techniques developed and verticalized in the networking context.
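The cooperation between the two sub-modules can be sketched as a two-stage pipeline. To keep the example short, simpler techniques are swapped in for both stages: the unsupervised stage uses a median-absolute-deviation outlier test and the supervised stage uses a nearest-centroid classifier instead of an RNN. The data, the action labels and the cutoff are illustrative assumptions.

```python
from statistics import median


def mad_outliers(values, cutoff=3.5):
    """Stage 1 (unsupervised): indices whose modified z-score exceeds cutoff."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > cutoff]


def train_centroids(labelled):
    """Stage 2 training (supervised): mean feature vector per action label."""
    sums, counts = {}, {}
    for features, action in labelled:
        acc = sums.setdefault(action, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += f
        counts[action] = counts.get(action, 0) + 1
    return {a: [s / counts[a] for s in acc] for a, acc in sums.items()}


def choose_action(features, centroids):
    """Pick the action whose centroid is closest to the anomaly's features."""
    def dist2(c):
        return sum((f - ci) ** 2 for f, ci in zip(features, c))
    return min(centroids, key=lambda a: dist2(centroids[a]))


temps = [61.0, 60.5, 61.2, 60.8, 95.0, 61.1]
vibration = [0.2, 0.2, 0.3, 0.2, 0.2, 0.2]
# Past anomalies labelled with the action that resolved them (made-up data).
history = [([90.0, 0.2], "reduce_load"), ([98.0, 0.3], "reduce_load"),
           ([62.0, 2.5], "inspect_bearing"), ([60.0, 3.0], "inspect_bearing")]
centroids = train_centroids(history)
actions = {i: choose_action([temps[i], vibration[i]], centroids)
           for i in mad_outliers(temps)}
```

Only the sample flagged by the first stage reaches the second stage, which maps it to the action associated with the most similar past anomaly. A recurrent model, as suggested in the text, would additionally take the preceding samples into account.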

B. Remote Control Architecture as a Service
Another application is a remote-control center with a modular architecture, as shown in Fig. 6. The modules involved are the following:
• Identification and management of access: management and control of access to the system, differentiated by the scope of the services offered and by operational capabilities;
• Dashboard and data processing: management of the flow of data from the underlying layers and display of the output through a front end that contains, in addition to a visualization of the data obtained from the sensor network, the statistics of the key values characterizing the reference system;
• Event management and statistics: generation of statistics from sensor data; this category includes, for example, the evaluation of the analyzed data for the purpose of applying the various machine learning algorithms for prescriptive maintenance. The data and events are provided to this architectural block by the "smart agent" algorithms, which supply structured predictive data and, when the "smart agents" act, prescriptive data;
• Device management: management of the sensors, including any changes to their behavior needed to achieve the desired trend of the system;
• Info Broker: collects data from the sensors and provides it to the front end;
• Message dispatcher/MQTT: communication between the various IoT devices.
Through the cooperation of these modules, a remote-control station is proposed that replaces or complements on-site interventions, such as the maintenance of machinery in an industrial context. This allows resources to be optimized and the productivity of the relevant sector to be increased. The dashboard will be displayed in a suitable environment, equipped with an audio/video station for remote control, that will handle the entire process in the manner described in the previous paragraphs.
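The interplay between the Message dispatcher/MQTT and Info Broker modules can be illustrated with an in-memory publish/subscribe sketch. It implements the MQTT topic wildcards (`+` for one level, a trailing `#` for any remainder) in simplified form and involves no real broker or network; the class names and topic layout are hypothetical.

```python
def topic_matches(pattern, topic):
    """MQTT-style matching: '+' is one level, a trailing '#' is any remainder."""
    p_levels, t_levels = pattern.split("/"), topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            return i == len(p_levels) - 1
        if i >= len(t_levels):
            return False
        if p != "+" and p != t_levels[i]:
            return False
    return len(p_levels) == len(t_levels)


class MessageDispatcher:
    """In-memory stand-in for an MQTT broker (no network involved)."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, pattern, callback):
        self._subscriptions.append((pattern, callback))

    def publish(self, topic, payload):
        for pattern, callback in self._subscriptions:
            if topic_matches(pattern, topic):
                callback(topic, payload)


class InfoBroker:
    """Collects the latest value per sensor topic for the front end."""

    def __init__(self, dispatcher):
        self.latest = {}
        dispatcher.subscribe("factory/+/sensors/#", self._store)

    def _store(self, topic, payload):
        self.latest[topic] = payload


dispatcher = MessageDispatcher()
broker = InfoBroker(dispatcher)
dispatcher.publish("factory/line1/sensors/temperature", 71.5)
dispatcher.publish("factory/line1/actuators/valve", "open")
```

After these two messages, `broker.latest` holds only the temperature reading, since the actuator topic does not match the Info Broker's subscription. Device management could subscribe to the actuator topics in the same way.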

IV. CONCLUSION
All SMILE functional features are linked with radical changes in production processes, as they increase the quality and performance of the process (e.g. Data Center, Smart Road, etc.). The central and innovative concept of SMILE is that of a "horizontal" intelligent monitoring platform, offered according to an as-a-service logic, that can be verticalized for any production process with reduced logic and development times, while sparing the operator the difficulty of directly managing the configuration of complex modules. This represents a significant innovation in process control.
The most innovative part is the automatic and, in part, unsupervised monitoring of the processes and sensors that characterize a smart factory, with the strategic advantage of keeping the company's performance at the highest level by preventing anomalies, avoiding disruptions and improving the performance indices of the product/service production process. This will be accomplished through the study and implementation of advanced machine learning techniques. Finally, a further degree of innovation stems from the introduction of innovative sensors, such as sensors that predict electrical malfunctions of devices, or of systems in general, based on the analysis of reactive power.