A framework for architecting IoT solutions
How to approach, design and implement a production grade IoT solution using public cloud platforms
One of the challenges for developers building IoT solutions is determining how to collect and securely ingest data from millions of devices into a data center for processing. This is the first installment in a series of guides in which I lay out my recommendations for how to approach, design and implement a production grade IoT solution using public cloud platform-as-a-service (PaaS) offerings. I focus on AWS for this discussion, but the approach can be applied to other cloud vendors such as Microsoft's Azure.
Do I really need to use Cloud PaaS?
In short, no. You can manage your own infrastructure, either on-prem or using compute instances and/or containers on any of the public clouds. You can set up an MQTT broker such as HiveMQ, which also offers a managed cloud solution. I have some experience migrating a customer from HiveMQ to AWS IoT and learned that HiveMQ has many more features than the AWS broker, such as QoS 2 and shared subscriptions, and that it conforms to the MQTT specification. It's a very powerful product and I was impressed with the overall solution and documentation. I particularly liked that you can extend the broker functionality through a plugins feature and customize the solution to your business needs. In contrast, if you need a new feature when using a cloud service, you are at the mercy of the vendor to build it, or you have to create workarounds. Unless you are an enterprise customer spending $100M+, it's unlikely that the vendor will build it for you. If I were building my own platform, required full control, or wanted to avoid "lock in" or the random deprecation of IoT products (GCP), then I would use a solution such as HiveMQ.
In contrast, the key benefit of a cloud service such as AWS/Azure IoT is that their product teams have created tooling and APIs which, in theory, simplify building the overall solution, allowing you to focus on building upstream applications that generate value for the business rather than managing complex infrastructure. There are additional services for Device Management, Analytics, Embedded devices and visual dev tools. The tradeoffs are cost, lack of control and vendor lock-in, since applications become more deeply integrated over time and become very difficult and expensive to migrate without disruption to your customers. It's important to note that cloud vendors generate very little revenue from their IoT products directly. They rely instead on large volumes of ingested data, which require storage, computation and upstream application services to analyze and present insights. Cloud computing is not always cheaper, and nothing is free.
Overall, there is no one-size-fits-all answer as to the best solution. Startups can benefit from using cloud platform services since they can leverage tooling to move fast and deliver an MVP/V1 of their product to market. This is especially true if the team lacks technical knowledge or skills. It's likely that the connectivity and messaging part of the solution is lower value than the analytics and end user application. For enterprise customers who have unique feature requirements and for whom technology is a core competency, I would lean toward a self-managed solution such as HiveMQ, at least for part of the solution, to retain the flexibility to move providers and build custom features as needed. There are also hybrid solutions, for example using HiveMQ for messaging and Azure/AWS for data processing.
Why do I need a design framework?
In the early days of IoT, cloud providers saw an opportunity to become the “data lake” of every solution and invested heavily in building self-service (PaaS) offerings and tooling which made it easy to connect devices and start ingesting data. However, these tools don't provide guardrails on how best to design a solution, which makes it easy to architect yourself into a corner that can be difficult and expensive to get out of. I’ve worked with several customers who needed to rearchitect their ingestion pipeline to meet scaling requirements, which was costly and disruptive.
To help prevent this, I find it helpful to work backwards from the desired user experience, use case and business outcomes, which then determine the architecture. I also find that many customers aren’t sure how they may use their device data in the future. This is especially common with startups still proving out their business model. In this case, I recommend building a flexible, cost-optimized architecture which supports ingesting data in a format that can easily be processed to support future applications.
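One way to keep ingested data easy to reprocess later is to wrap every device message in a small, versioned envelope. The following is a minimal sketch of that idea, not a prescribed format: the class name `TelemetryEnvelope` and all of its field names are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Any, Dict


@dataclass
class TelemetryEnvelope:
    """Hypothetical message wrapper; every field name here is an assumption."""
    schema_version: int   # lets downstream consumers handle old and new formats
    device_id: str        # which device produced the reading
    timestamp_ms: int     # time of measurement, in epoch milliseconds
    message_type: str     # e.g. "telemetry" or "alert"
    payload: Dict[str, Any] = field(default_factory=dict)  # the readings themselves

    def to_json(self) -> str:
        # Compact separators keep the message small on constrained links.
        return json.dumps(asdict(self), separators=(",", ":"), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "TelemetryEnvelope":
        # Rebuild the envelope from its JSON form.
        return cls(**json.loads(raw))


if __name__ == "__main__":
    message = TelemetryEnvelope(
        schema_version=1,
        device_id="sensor-001",
        timestamp_ms=1_700_000_000_000,
        message_type="telemetry",
        payload={"temperature_c": 21.5},
    )
    print(message.to_json())
```

Carrying a schema version and a device identifier in every message means stored data can still be interpreted if the payload format changes, even when the eventual use of the data is not yet known.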
I find it helpful to work through the following questions:
1. Define your business case
- What are the key objectives of the solution?
- What is the desired user experience?
- What are the key outcomes that determine project success?
- What use case(s) do you want to enable? e.g. As a user I want to view the real time status of devices in a dashboard and receive alerts when devices detect an anomaly
2. Understand edge devices
- What are the processing and storage capabilities of devices?
- How will data be collected and transferred to the cloud?
- Does all data need to go to the cloud or can it be stored, processed and analyzed at the Edge?
- Can data be compressed/batched?
- What is the messaging protocol from device to cloud? e.g. AMQP, MQTT, CoAP etc.
3. Connectivity & Network
- How will devices connect to the cloud? e.g. Wi-Fi, NB-IoT, LoRa, 5G
- What is the reliability and bandwidth of the communication medium?
- How many minutes per day will a device be connected?
- What is the type and size of the message payload? Is it binary data? Can it fit into 128 KB messages or is it a combination e.g. an image and associated metadata?
- What is the desired network topology? e.g. Device→Cloud, Device→Gateway, Hub and Spoke
4. Cloud & Data Architecture
- Does data need to be processed in real time to support upstream applications e.g. dashboards, predictive maintenance solution?
- Does data need to be routed to additional services on the cloud side?
- Is two-way communication required e.g. to support Command and Control scenarios?
- What are the device security requirements?
- How will devices authenticate and authorize with the cloud?
- If using certificates, how will they get onto the device?
- What are the cost constraints?
- How much data will you need to ingest? Think about frequency, number of devices
- How many devices do you anticipate in 6, 12, 24 months?
5. Provisioning & Device Management
- How will credentials get transferred onto devices?
- How will devices be activated/provisioned? e.g. at time of manufacture, on first boot?
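The questions above about data volume and fleet growth can be turned into a rough back-of-the-envelope estimate. The sketch below is illustrative only: the fleet sizes, message frequency and payload size are made-up assumptions, the names `FleetProfile`, `monthly_messages` and `monthly_gib` are hypothetical, and the 128 KB limit is simply the figure used in the payload question above.

```python
from dataclasses import dataclass

# Payload ceiling taken from the 128 KB figure in the questions above.
MAX_PAYLOAD_BYTES = 128 * 1024


@dataclass
class FleetProfile:
    """Hypothetical description of a device fleet; all values are assumptions."""
    device_count: int
    messages_per_device_per_day: int
    avg_payload_bytes: int


def monthly_messages(profile: FleetProfile, days: int = 30) -> int:
    # devices x messages per device per day x days in the month
    return profile.device_count * profile.messages_per_device_per_day * days


def monthly_gib(profile: FleetProfile, days: int = 30) -> float:
    # Total payload bytes per month, expressed in GiB (1024**3 bytes).
    return monthly_messages(profile, days) * profile.avg_payload_bytes / 1024 ** 3


def fits_single_message(payload_bytes: int, limit: int = MAX_PAYLOAD_BYTES) -> bool:
    # True if the payload can be sent as one message under the assumed limit.
    return payload_bytes <= limit


if __name__ == "__main__":
    # Assumed growth plan: device counts at 6, 12 and 24 months.
    projections = {6: 10_000, 12: 50_000, 24: 250_000}
    for month, devices in projections.items():
        fleet = FleetProfile(
            device_count=devices,
            messages_per_device_per_day=1_440,  # one message per minute
            avg_payload_bytes=512,
        )
        print(
            f"month {month}: {monthly_messages(fleet):,} messages, "
            f"{monthly_gib(fleet):.1f} GiB per month"
        )
    # A 2 MiB image would not fit in a single message under the assumed limit.
    print(fits_single_message(2 * 1024 * 1024))
```

Running the numbers for each growth milestone, even roughly, shows early whether batching, compression or edge processing is needed to stay within cost constraints.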
Conclusion & Next Steps
I’ve worked with customers in AgTech, Manufacturing, Automotive, Pharmaceutical and Healthcare and observed that they all have different needs and unique challenges, which is why there is no ‘one size fits all’ for IoT implementations. It's rare to deploy a 100% greenfield solution, and it's likely you will need to work with hardware and network constraints, including legacy firmware that is difficult to modify. I find it's essential to involve domain experts during the planning process, and I see my job as an Architect as being to listen, ask the right questions and guide the technical design and build.
Once you have an initial design, it's time to create the initial architecture. As you can imagine, there are many pros and cons associated with every approach, so expect to make tradeoffs. There is no perfect design, despite what cloud vendors may tell you!