Machine learning at the edge
I’ve recently been experimenting with machine learning for an upcoming project video. The advantage of machine learning is that it can find patterns in data that would be difficult to code for using conventional methods. For example, it can be used for image classification (is it a dragon?), detecting movement or processing sound.
People have been running machine learning in the cloud for some time, and all the major cloud providers have tools and platforms to help. Running machine learning on smaller computers at the source of the data, e.g. a camera, means that no network bandwidth is needed to get that data to the cloud. This can save on costs, reduce power consumption and open up possibilities for new uses.
Machine learning has two key steps. The first is training the model. This typically requires a massive amount of data to produce reliable results, and it is often still run on high-powered computers. The second step is called “prediction”: using the trained model to predict outcomes for new inputs. This is the step I’ve been putting onto a microcontroller.
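To make the split between the two steps concrete, here is a toy sketch in plain Python. It is not a neural network and not the code I ran on the board: it fits a straight line to a handful of made-up points (training) and then uses the fitted numbers on a new input (prediction). The function names and the data are assumptions for illustration only.

```python
def train(xs, ys):
    """Training: fit y = w*x + b to example data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b  # the "model" is just these two numbers


def predict(model, x):
    """Prediction: apply the already-trained model to a new input."""
    w, b = model
    return w * x + b


# Made-up training data that follows y = 2x + 1.
model = train([0, 1, 2, 3], [1, 3, 5, 7])
prediction = predict(model, 4)
print(model, prediction)
```

Only `predict` and the two numbers in `model` would need to live on the small device; `train` and the training data can stay on the bigger computer.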
There are a few constraints for putting machine learning prediction onto a microcontroller. The first is processing power. The core (main processing unit) of the microcontroller needs to do lots of calculations to complete a prediction, so it needs to be fast and powerful. The code I was using suggested a 32-bit core, which I realised the MKRZero board had. However, there did seem to be some compatibility issues, so I’d suggest using one of the boards on the recommended compatibility list: https://www.tensorflow.org/lite/microcontrollers
The other key constraint is memory. The models generated by the training can be quite large in their own right. The first few I generated were thousands of bytes in size, which meant that they would not fit into the memory space of the microcontroller, which was just 32K. Coincidentally, that is about the same as the memory of my old ZX Spectrum, which I learnt to program on.
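A rough way to see why the memory limit bites is to count the weights. The sketch below assumes a simple fully connected network stored as 32-bit floats (4 bytes per weight); the layer sizes are hypothetical and are not the models I actually built. It also ignores file overhead and the working memory needed during prediction, so real requirements will be higher.

```python
BYTES_PER_FLOAT32 = 4          # assumption: weights stored as 32-bit floats
MEMORY_LIMIT = 32 * 1024       # the 32K limit mentioned above


def dense_parameter_count(layer_sizes):
    """Count weights and biases for a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    """
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs + outputs  # weight matrix plus one bias per output
    return total


def weight_bytes(layer_sizes):
    """Estimate the bytes needed just to store the parameters."""
    return dense_parameter_count(layer_sizes) * BYTES_PER_FLOAT32


small = weight_bytes([64, 16, 4])      # hypothetical small model
large = weight_bytes([1024, 32, 4])    # hypothetical model with a larger input

for name, size in (("small", small), ("large", large)):
    verdict = "fits" if size <= MEMORY_LIMIT else "does not fit"
    print(f"{name}: {size} bytes, {verdict} in {MEMORY_LIMIT} bytes")
```

Most of the size comes from the first layer, so shrinking the input (for example, a smaller image or a shorter sound clip) is usually the quickest way to bring the number down.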
The main tool for processing the machine learning model is TensorFlow. This is a machine learning platform which allows you to use the Python programming language to build, train and analyse models. A tensor is a data structure that represents information as a multi-dimensional array of numbers. For example, an image or sound wave could be represented as a tensor. Having just this one key data type means that standard tools and processes can be used for building all kinds of models. Because the models are coded in Python, we have access to a whole range of data processing libraries such as NumPy and Pandas. My experiments also used Keras, a high-level API built on top of TensorFlow that simplifies the creation of the neural networks I am using for the machine learning models.
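As a small illustration of the tensor idea, the sketch below uses plain nested Python lists rather than TensorFlow itself. The `shape` helper and the sample values are made up for this example; in real code a library such as NumPy or TensorFlow would hold the data and report its shape.

```python
def shape(tensor):
    """Return the dimensions of a nested list.

    Assumes the lists are non-empty and rectangular (every row the same length).
    """
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


# A short sound wave: one dimension, one number per sample.
sound = [0.0, 0.5, 1.0, 0.5, 0.0]

# A tiny greyscale image: two dimensions, rows of pixel brightness values.
image = [
    [0, 128, 255],
    [255, 128, 0],
]

print(shape(sound))
print(shape(image))
```

A colour image would simply add a third dimension for the colour channels, which is why the same tools can handle such different kinds of data.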
The TensorFlow libraries are big and complex so there is a special “TensorFlow Lite” version that can be run on microcontrollers.
So the process is:
- Preprocess the training data
- Create the model
- Train the model
- Test the model
- Convert the model for TensorFlow Lite
- Compile and load the model
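For the last step in the list above, microcontrollers generally have no file system, so the converted model is usually compiled into the program as an array of bytes (the TensorFlow Lite for Microcontrollers documentation does this with `xxd -i`). The sketch below shows the same idea in plain Python. The function name, the array name `g_model` and the sample bytes are assumptions for illustration; in practice the bytes would be read from the converted model file.

```python
def to_c_array(model_bytes, name="g_model", per_line=12):
    """Render raw bytes as C source, similar in spirit to `xxd -i`."""
    lines = []
    for start in range(0, len(model_bytes), per_line):
        chunk = model_bytes[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(model_bytes)};\n"
    )


# Placeholder bytes standing in for a real converted model.
fake_model = bytes([0x1C, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4C, 0x33])
c_source = to_c_array(fake_model)
print(c_source)
```

The length reported in the generated source is also a handy early check against the board's memory limit before trying to compile.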
To help build and train the models, I used a tool called “Jupyter Notebooks”. This is a platform that allows you to run Python code alongside text explanations. It is a great way to lead someone through some processing steps, and I’ve used it to experiment with different models and datasets.
Although it is possible to build models for these microcontrollers, I did find myself fighting the system, both in getting the models under the memory limit and in debugging them when they went wrong. The training of the model is key to the process, and although I did manage to get a model that seemed to produce different outputs for my inputs, I do question its suitability.
If I were to do this process again, I think I’d go for a small embedded computer such as a Pi Zero. That of course brings other disadvantages, such as a longer boot time and more power consumption, but I think it would be an easier starting point for machine learning.
Alternatively, just wait a few years and the hardware and tools will have caught up?