9 reasons why we think Edge development is so hard.
By Elina Norling

Even though we’re a company specialized in Edge AI and most of our team spends their days building and deploying models to all kinds of devices, there’s no getting around the fact that edge developers still run into recurring challenges on an almost daily basis. That’s why we decided to do a small in-house investigation to pin down what we agree are the biggest challenges in and around Edge AI development today. In this blog post, we present the results.
1. Real-time demands with limited compute
One of the biggest challenges with the edge is that systems often need to meet real-time latency requirements. An autonomous vehicle, for example, cannot afford to send data back and forth to the cloud – decisions must be made immediately. At the same time, the computers on edge devices are tiny compared to data centers, which means the models must be extremely efficient.
Running an inference can require billions of operations, which both takes time and consumes energy. This becomes a major challenge in embedded systems. Devices often lack compute resources, operate under strict real-time requirements, and frequently run on battery. The combination makes the demands on optimization tougher than in almost any other environment. Even if you clear the hurdle of getting a model to run at all, you might still hit resource bottlenecks that make the model so slow it’s useless.
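To make the real-time constraint concrete, here is a minimal sketch of how you might check a model against a latency budget before even thinking about a specific device. The model choice (MobileNetV2), the input shape, and the 33 ms budget (roughly one frame at 30 FPS) are our own illustrative assumptions, and a measurement on a development machine is at best a rough proxy for what an embedded chip will do.

```python
# Rough latency check against a real-time budget (illustrative assumptions:
# MobileNetV2, 224x224 input, 33 ms budget, i.e. roughly one frame at 30 FPS).
import time

import torch
from torchvision import models

model = models.mobilenet_v2(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    # Warm up so one-time costs don't skew the measurement.
    for _ in range(5):
        model(dummy_input)

    runs = 50
    start = time.perf_counter()
    for _ in range(runs):
        model(dummy_input)
    latency_ms = (time.perf_counter() - start) / runs * 1000

budget_ms = 33.0
verdict = "within" if latency_ms <= budget_ms else "over"
print(f"average latency: {latency_ms:.1f} ms ({verdict} the {budget_ms:.0f} ms budget)")
```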
2. No universal target device
Hardware architectures also vary greatly between devices, and there is no one-size-fits-all pairing of model and device. More specialized hardware may outclass a more general device for one type of model, yet utterly fail to compile and run another. ‘Faster’ is therefore not always faster. Furthermore, it is rarely trivial to determine why a model fails to run on a given device. This is where edge development differs significantly from cloud development. In the cloud, you can install virtually anything you want on your server. On a chip, the constraints are not only much stricter – it is often challenging to even get a model to run at all, because the hardware is so specific.
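As a small illustration of how device-specific this gets, the sketch below tries to load the same ONNX model with different ONNX Runtime execution providers. The file name model.onnx is a placeholder, and which providers are actually available depends entirely on how ONNX Runtime was built and on the hardware you happen to be standing on.

```python
# Try the same model on several execution providers; some may simply not
# be available (or may fail) on a given device. "model.onnx" is a placeholder.
import onnxruntime as ort

candidates = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

for provider in candidates:
    try:
        session = ort.InferenceSession("model.onnx", providers=[provider])
        print(f"{provider}: loaded, running on {session.get_providers()}")
    except Exception as exc:
        print(f"{provider}: failed ({exc})")
```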
3. Device capabilities define which techniques work
Since hardware architecture varies from device to device, the techniques to improve model performance are also device-dependent. For example, some devices support unstructured sparsity, leading to significant speedups without any loss in accuracy, while others gain no performance benefit at all from sparse calculations and become less accurate. Different quantization capabilities across devices also compound the problem of optimizing a model for a given hardware target.
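As a concrete illustration, the PyTorch sketch below applies unstructured magnitude pruning and dynamic int8 quantization to a toy model. Whether either technique actually translates into a speedup, and how much accuracy it costs, depends entirely on what the target hardware and its runtime support; the layer sizes and the 50% sparsity level are arbitrary choices for the example.

```python
# Two common, device-dependent optimizations in PyTorch (toy example):
# unstructured magnitude pruning and dynamic int8 quantization.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Zero out the 50% smallest-magnitude weights in each Linear layer.
# Only hardware/runtimes that can exploit sparsity will run this faster.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the zeros into the weights

# Dynamic int8 quantization of the Linear layers; the speedup (and the
# accuracy impact) depends on which quantized kernels the device supports.
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
print(quantized_model)
```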
4. Investing blindly in hardware
Another frustrating consequence of not knowing which model-hardware combination will perform best – or at all – is that you sometimes have to invest in hardware before knowing whether your model will actually run well on it. In theory, it’s possible to guess based on spec sheets or vendor claims – but in practice, the only reliable way to find out is to test. And testing is hard when you don’t have access to the device. This means many developers end up investing time – and sometimes money – in hardware that turns out to be a poor fit. Whatever the reason the model doesn’t run, you often don’t find out until late in the process. By then, you’ve already invested.
5. Full-stack knowledge is a prerequisite
To build an AI application with excellent performance on an edge device, you not only have to consider the model’s architecture but also, as a consequence of the challenges mentioned above, the resources available on the hardware, its specific architecture, and which compression and deployment techniques it allows. This requires far more low-level knowledge than running in the cloud. An edge developer needs to understand the entire stack: both software and hardware, and the various layers in between that translate a trained model into the actual code that runs on the target device.
6. A multidisciplinary process by default
This requirement for a holistic understanding also makes edge deployment inherently a multidisciplinary process. Successfully deploying models to edge hardware end-to-end may require contributions from software engineers, hardware specialists, researchers, data engineers, and network experts. When teams with such diverse expertise need to collaborate, the process becomes not only technically challenging but also organizationally complex. A common issue that arises when this alignment is missing is that developers do not always take the hardware into account when selecting a model. As a result, a model that looked promising during training may prove unsuitable for the target hardware, and once this becomes apparent, it is often difficult for an edge developer to remedy the situation.
7. Hardware support lags behind evolving AI
The challenges in edge deployment are further compounded by the fact that we’re dealing with a very large and rapidly evolving problem space. AI development is advancing at incredible speed – new models are constantly being released, while hardware support is always playing catch-up. Meanwhile, new hardware platforms are being introduced, but there is little consistency across them in terms of support, compatibility, or efficiency. Porting a model that runs successfully on one hardware platform to another often feels like starting from zero all over again.
8. Poor documentation
Additionally, we want to highlight that the field is evolving rapidly while still being in an early stage. Documentation in edge development is therefore often poor. For example, it is not always clear which tool version the documentation refers to, or which flags can actually be combined. Error messages are often vague, making it difficult to understand exactly what went wrong when a model fails to compile.
9. Fragmented tooling
Another consequence of the fact that edge AI is still not fully established compared to cloud AI is that tooling has long been fragmented, with no clear standards (as we have already touched upon). One major source of fragmentation is that there are multiple hardware vendors, and they all have their own inference tools and ecosystems. Furthermore, going from PyTorch to a working model on a phone or a development board often involves four to five different tools: PyTorch → ONNX → quantization tool → vendor-specific SDK → deployment code. These rarely work seamlessly together. You are expected to stitch everything together yourself, troubleshoot broken conversions, and still deliver something with low latency and high accuracy. It is not only time-consuming, it is also fragile. Every update of a tool risks breaking something further downstream.
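To make that chain a bit more tangible, here is a minimal sketch of its first two hops under assumptions of ours: a MobileNetV2 exported to ONNX and then quantized with ONNX Runtime’s post-training dynamic quantization. The vendor-specific SDK and the deployment code that follow look different for every platform, which is exactly where the stitching usually starts to hurt.

```python
# First hops of the chain: PyTorch -> ONNX -> quantized ONNX.
# (MobileNetV2 and the file names are illustrative choices.)
import torch
from torchvision import models
from onnxruntime.quantization import QuantType, quantize_dynamic

model = models.mobilenet_v2(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# PyTorch -> ONNX
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=17)

# ONNX -> quantized ONNX (weights stored as int8)
quantize_dynamic("model.onnx", "model_int8.onnx", weight_type=QuantType.QInt8)

# Next hops: feed "model_int8.onnx" to the vendor-specific SDK for the
# target device, then wrap the compiled artifact in deployment code; each
# hop comes with its own tool, version constraints, and failure modes.
```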
Trying to make edge AI a little less painful
Since we on the Embedl Hub team run into these challenges ourselves, we’ve tried to build a platform that we believe can solve quite a few of them. We believe – even if we might be biased – that our current BETA offers a solid solution for at least the issues related to fragmented tooling, poor documentation, and finding the most performant model-hardware combination. We offer a consistent workflow that combines model optimization, quantization, compilation, and benchmarking in one unified Python library connected to a web UI where everything is saved and can be compared in one place. We’ve also built a remote device farm where you can test models and techniques on real hardware – directly, and without needing access to the devices yourself – making it easier to quickly find the right hardware-model combination without having to take chances on which hardware to invest in. Our benchmark suite also includes ready-made benchmarks to browse and explore.
Even if it’s hard to cover everything in one single tool, our hope is that the Hub will make Edge AI development smoother and, even as the field keeps evolving fast, help create better conditions to keep up.
Is there anything else you’re missing from our platform that would make your Edge development easier? We’re super curious! Please get in touch with us here.