Sander van de Velde is an experienced Microsoft Azure IoT Solution architect specializing in designing and developing Internet of Things (IoT) platforms. He is a Microsoft Certified Azure IoT Developer. Sander has been awarded the Microsoft MVP Azure award for his community work for Azure IoT six times since 2017. He works closely together with Microsoft on new features and innovations. Sander is also interested in using new techniques but with interoperability and maintenance in mind.
Sander shares his take on Azure IoT, “The IoT platform, in general, is made up of three things. First, we have devices somewhere in the world that generate data; that data must come into the cloud. Cloud is the second component, and each device needs a secure connection to drop off the telemetry and get some commands back if needed. It is a two-way communication. Lastly, based on the data, we want to have insights, so some reports or dashboards will be built on top of it to bring value.”
Microsoft’s IoT offering is a broad one. The most important part is the cloud gateway, the IoT Hub: the endpoint where any device can drop off its telemetry. Each device has a so-called registration inside the IoT Hub, and with that registration come security keys. Each device has a secure connection, and we can see whether a device is connected or not. There is also an offline scenario in which devices are not always connected. If there is some information for such a device to pick up, the user can put it in the IoT Hub as desired properties. When the device comes online, it can pick up those desired properties and act on them. Once they are accepted, the device can write them back as reported properties. The IoT Hub is the most critical resource; once the data comes in there, we can do a lot with it. Microsoft also has the Azure Sphere offering, a concept of a highly secure device. There are many gates to check whether a device is compromised; if it is, it will reset itself and go back to the latest installed version of the software. This action cannot be done on the device itself; it must be done from the cloud. If several devices have the same problem, then the whole family of devices is seen as compromised. If you want a highly secure device, Azure Sphere may be a solution.
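The desired and reported properties described above can be pictured with a small sketch. This is a plain in-memory model, not the Azure IoT SDK: the `DeviceTwin` and `Device` classes and the `telemetry_interval_s` setting are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceTwin:
    """Hypothetical stand-in for the per-device record kept in the cloud."""
    desired: dict = field(default_factory=dict)   # written by the cloud side
    reported: dict = field(default_factory=dict)  # written by the device


class Device:
    def __init__(self, twin):
        self.twin = twin
        self.settings = {}

    def sync(self):
        """Run when the device comes online: apply and acknowledge settings."""
        for key, value in self.twin.desired.items():
            if self.twin.reported.get(key) != value:
                self.settings[key] = value        # act on the desired value
                self.twin.reported[key] = value   # report back once accepted


twin = DeviceTwin()
device = Device(twin)

# Cloud side: the device is offline, so the change simply waits in the twin.
twin.desired["telemetry_interval_s"] = 30

# Device side: it reconnects later and picks up what is waiting.
device.sync()
print(twin.reported)
```

The point of the split is that the cloud never has to reach the device at a particular moment; comparing the two dictionaries shows which settings a device has actually accepted.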
Next, consider laptops, industrial devices, or even a VM that lives on the edge, meaning the edge of the local network. Such an edge device has a secure connection to the cloud and the IoT Hub. This is a perfect fit if you have devices in your factory, your company, or other places that are not allowed to connect directly to the cloud. Many devices do not know the concept of the internet, and they don’t know the concept of security. They may expose data that is sensitive: there is intellectual property (IP) in there, or it is GDPR sensitive. In that case you can use the edge compute as a man in the middle, but in a positive way, which connects to the devices on the local network. It does something with the data coming in and drops it off in the cloud only if needed, based on the logic you run on the edge.
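As a rough sketch of the kind of logic that could run on the edge, the function below strips sensitive fields and forwards a reading only when it has changed enough to matter. The field names, the `prepare_for_cloud` helper, and the `min_change` threshold are all assumptions made for illustration; they do not come from any Azure API.

```python
SENSITIVE_FIELDS = {"operator_name", "recipe"}  # assumed GDPR / IP examples


def prepare_for_cloud(reading, last_sent_value, min_change=0.5):
    """Return the message to upload, or None if nothing needs to be sent."""
    value = reading["temperature"]
    if last_sent_value is not None and abs(value - last_sent_value) < min_change:
        return None  # change too small: keep it on the local network
    # Forward everything except the fields that must not leave the site.
    return {k: v for k, v in reading.items() if k not in SENSITIVE_FIELDS}


reading = {
    "machine": "press-1",
    "temperature": 71.2,
    "operator_name": "J. Doe",
    "recipe": "A-17",
}

first_upload = prepare_for_cloud(reading, last_sent_value=None)
second_upload = prepare_for_cloud(reading, last_sent_value=71.0)
print(first_upload, second_upload)
```

Here the first reading is uploaded without the operator name and recipe, while the second is held back because it barely differs from the value already sent.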
In the cloud, it is common for other services to work together with the IoT Hub. If you want to react directly to data with alerts, that’s called the hot path. It is done with Azure Stream Analytics, Logic Apps, or other solutions, such as something with WebSockets; within a second or even milliseconds, in near real time, you can react to incoming data. On the other hand, there is also data coming in whose value is not yet known, but which you think is essential to keep. What users typically do is drop it off in some cold storage. That’s called the cold path. That could be just a Data Lake or blob storage, or nowadays, what we see a lot is Azure Data Explorer, which is a perfect way to store a vast amount of data in a cost-effective way. The bulk of the data is stored there; you can either leave it there or investigate it to determine if there’s a trend or an anomaly. If data scientists find something of value, such as a specific KPI, an OEE, or some trends or anomalies, they can put it in a dashboard. To make it simpler and more accessible, users want a graph or chart that becomes visible by clicking on a page. That’s called the warm path. The raw data is aggregated, filtered, and enriched daily. Then it’s made available in some dashboarding. Because something has been done with the data, it must be readily accessible in some dashboarding solution. This is usually a more expensive solution. The value of that data is temporary; you rarely need to look at charts about production from two or three years ago. After a certain number of months, you probably are not interested anymore, so it’s time to wipe it. The original data is still there in cold storage, so it acts as a backup. But the warm path is mostly time limited. Those are the three pillars of handling data: the hot, the warm, and the cold path.
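The three paths can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Azure Stream Analytics or Azure Data Explorer work internally; the stores, the `ALERT_LIMIT` threshold, and the `WARM_RETENTION` window are hypothetical values picked for the example.

```python
from collections import defaultdict
from datetime import date, timedelta

cold_store = []                        # raw messages, kept as-is
warm_store = defaultdict(list)         # day -> values, feeds the dashboard
ALERT_LIMIT = 90.0                     # assumed alert threshold
WARM_RETENTION = timedelta(days=180)   # assumed retention window


def ingest(message):
    """Handle one telemetry message; return an alert string or None."""
    cold_store.append(message)                            # cold path
    warm_store[message["day"]].append(message["value"])   # warm path input
    if message["value"] > ALERT_LIMIT:                    # hot path
        return f"ALERT {message['device']}: {message['value']}"
    return None


def daily_averages(today):
    """Aggregate the warm store and wipe days outside the retention window."""
    for day in [d for d in warm_store if today - d > WARM_RETENTION]:
        del warm_store[day]            # the raw copy remains in cold_store
    return {day: sum(v) / len(v) for day, v in warm_store.items()}


old_day, new_day = date(2023, 1, 1), date(2024, 1, 1)
ingest({"device": "pump-1", "day": old_day, "value": 50.0})
alert = ingest({"device": "pump-1", "day": new_day, "value": 95.0})
ingest({"device": "pump-1", "day": new_day, "value": 85.0})
averages = daily_averages(today=new_day)
print(alert, averages, len(cold_store))
```

Wiping the old day from the warm store loses nothing, because every raw message is still present in the cold store and the aggregate could be rebuilt from it.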