@ -19,7 +19,6 @@ For systems that are located in areas where reachability is limited the cost inc
If a system is already able to communicate over a network interface, this interface can be leveraged to apply updates to the system; this is typically referred to as an \textit{Over the Air (OTA)} update.
By reusing the existing communication channels, the dedicated update interface can be omitted, which leads to smaller packaging and reduces production cost.
It also decreases the maintenance cost drastically, because updates can be triggered remotely.
\textit{OTA} updates enable administrators to apply automation methods to the update process, allowing new releases and fixes to be rolled out in a controlled fashion.
As an example, updates can be applied to test devices first, followed by security-critical deployments, while subordinate ones can be delayed until the device is not in use.
Further, a feedback channel which provides information about the update status of a device allows administrators to apply monitoring techniques to ensure that all updates are installed and that devices are in the desired state.
For the implementation of an OTA update mechanism, the following requirements were defined.
\subsubsection{}\label{req1}
The systems should be able to perform updates on the release of new software without manual interaction.
If a new firmware version is published, it should be prepared automatically for installation on the target devices.
All these devices should then fetch and install the new software version and start using it subsequently, if no errors have occurred during the update.
The systems must be able to perform updates on the release of new software without manual interaction.
If a new firmware version is published for a type of devices, the target devices must fetch and install the new software version automatically, and start using it subsequently, if no errors have occurred during the update.
\subsubsection{}\label{req2}
To ensure minimal maintenance effort, the update process should be as robust against errors as possible.
Even if the installation of an update fails in the middle of reprogramming the device, the system should remain fully functional, both immediately and after a reboot.
\subsubsection{}\label{req3}
Firmware downloads should be performed over the same WiFi connection as used during normal operation.
Firmware downloads must be possible over the same WiFi connection as used during normal operation.
Fetching the firmware should be done side-by-side with operational traffic.
\subsubsection{}\label{req4}
The update process can happen over any untrusted wireless network or Internet connection and therefore must not be vulnerable to attackers.
The update process must be possible over any untrusted wireless network or Internet connection.
To prevent possible attackers from injecting malicious software into the embedded devices, a cryptographic signature mechanism must be implemented.
New firmware only gets accepted by the device, iff the cryptographic signature of the downloaded firmware image can be verified.
New firmware is only accepted by the device if the cryptographic signature of the downloaded firmware image can be verified.
\subsubsection{}\label{req5}
To reduce network load and aim for the maximum possible uptime of the device, the update process should be done only if a new firmware version is available.
To reduce network load and aim for the maximum possible uptime of the device, the update process should only be done if a new firmware version is available.
In contrast, on the release of new firmware, the roll-out to all devices should be performed as fast as possible.
While checking for available updates and downloading such an update, the device should continue to work as usual.
\subsubsection{}\label{req6}
For easy maintenance and monitoring, each device should provide detailed information about the currently installed firmware version and other details relevant for the update process.
For easy maintenance and monitoring, each device must provide information about the currently installed firmware version and other details relevant for the update process.
@ -67,6 +67,7 @@ The first one is a helper \textit{Makefile} built to accept a parameter for devi
In addition, the primary \textit{Makefile} scans a project subdirectory and treats each directory inside it as a container for device-specific code.
For each of these directories, the helper \textit{Makefile} is called and the subdirectory's name is used as the value of the \texttt{DEVICE} parameter.
By splitting the build and recompiling the framework each time before intermixing it with the device-specific code, the device type identifier can be used inside the shared framework code.
While building a device's firmware, the meta-information file used during updates is also created and stored beside the firmware image.
For development, each device can be built separately by using the device type identifier as the \textit{Makefile} target.
In addition, the suffix \texttt{/flash} can be used to flash a specific firmware to the device.
While building a device's firmware, the meta-information file used during updates is also created and stored beside the firmware image.
The meta-information file has a simple line-oriented ASCII format, which is easy to generate and efficient to parse within the limited constraints of the embedded device.
The build process will create the two firmware images, one for each ROM slot, and the meta-information file.
To create the meta-information file, the current version identifier is written to the \texttt{.version} file.
After the build, the signatures for both firmware images are created and attached to the file.
The update mechanism is split into four main phases: checking for updates, reprogramming the device, calculating and verifying the cryptographic signature of the updated firmware, and reconfiguring the boot process to use the new firmware.
The update mechanism is split into four main phases: checking for updates, reprogramming the device, calculating and verifying the cryptographic signature of the updated firmware, and, assuming that the update was successful, reconfiguring the boot process to use the new firmware.
\subsubsection{Checking for updates}
In order to inform the IoT devices of the availability of a new firmware version, the update server provides a file for each device type containing meta-information about the latest available firmware version.
The meta-information file has a simple line-oriented ASCII format, which is easy to generate and efficient to parse within the limited constraints of the embedded device.
It consists of the version identifier and the cryptographic signatures of both of the firmware binaries.
The version identifier can be an arbitrary string as the content is not interpreted semantically but only compared to the version identifier used during build time.
The other two lines in the meta-information file provide the hexadecimal representation of the cryptographic signatures, one line for each firmware binary file.
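The three-line format described above can be parsed with very little code. The following is a minimal sketch in C++17, not the firmware's actual parser: the names \texttt{UpdateMeta}, \texttt{parseMeta} and \texttt{updateRequired} are hypothetical, and the order of the two signature lines (first slot, then second slot) is an assumption.

```cpp
#include <cctype>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical container for the three lines of the meta-information file.
struct UpdateMeta {
    std::string version;        // arbitrary string, only compared for equality
    std::string signatureRom0;  // hex signature, assumed to belong to slot 0
    std::string signatureRom1;  // hex signature, assumed to belong to slot 1
};

// Reads one non-empty line and drops a trailing carriage return, if any.
inline bool readLine(std::istream& in, std::string& out) {
    if (!std::getline(in, out)) return false;
    if (!out.empty() && out.back() == '\r') out.pop_back();
    return !out.empty();
}

// Returns true if the string is non-empty and consists of hex digits only.
inline bool isHex(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char c : s) {
        if (!std::isxdigit(c)) return false;
    }
    return true;
}

// Parses the file content; yields no value if a line is missing or a
// signature line is not hexadecimal.
inline std::optional<UpdateMeta> parseMeta(const std::string& text) {
    std::istringstream in(text);
    UpdateMeta meta;
    if (!readLine(in, meta.version) || !readLine(in, meta.signatureRom0) ||
        !readLine(in, meta.signatureRom1)) {
        return std::nullopt;
    }
    if (!isHex(meta.signatureRom0) || !isHex(meta.signatureRom1)) {
        return std::nullopt;
    }
    return meta;
}

// The version identifier is not interpreted; any difference triggers an update.
inline bool updateRequired(const std::string& running, const UpdateMeta& latest) {
    return running != latest.version;
}
```

Because the version string is only tested for inequality, a device would also install an older release if the server publishes one; whether that is desired is a deployment decision.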
These meta-information files are provided by the update server using \textit{HTTP 1.1} under the following path pattern: \texttt{\$\{DEVICE\}.version} (whereas \texttt{\$\{DEVICE\}} is the device type name).
These meta-information files are provided by the update server using \textit{HTTP 1.1}~\cite{HTTP_1.1} under the following path pattern: \texttt{\$\{DEVICE\}.version} (where \texttt{\$\{DEVICE\}} is the device type name).
Each device queries the update server regularly (initially at the end of the boot process and periodically once an hour) for the currently available firmware version.
Each device queries the update server regularly (initially when the boot process is finished and periodically once an hour) for the currently available firmware version.
It uses the \texttt{UPDATER\_URL} option to identify the update server.
After the meta-information file has been downloaded successfully, the version identifier is extracted and compared to the version identifier of the running firmware.
If the version identifiers differ, the update process is initiated.
In cases where the download fails, the update server or network connection is not available, or any other error occurs, another attempt will be made at the next regular interval.
In cases where the download fails, the update server or network connection is not available, or any other error occurs, another attempt will be made automatically at the next regular interval.
In addition to the periodic check, each device subscribes on startup to a special \textit{MQTT} topic shared by all devices: \texttt{\$\{MQTT\_REALM\}/update}.
Every time a message is received on this topic, a fetch attempt for the meta-information file is triggered and the process restarts.
This allows faster roll-outs of updates and finer control for manual maintenance.
\subsubsection{Reprogramming the device}
As the \textit{ESP-01s} is only equipped with 1 MB of flash, the whole memory is mapped to a contiguous address space.
As the \textit{ESP-01s} is only equipped with 1 MB of flash, the whole memory is mapped to a contiguous address space (refer to Section \ref{flashlayout}).
Therefore, the second ROM slot cannot be re-mapped to have the same start address as the first ROM slot.
Because the firmware is executed without any dynamic linking mechanism and the chip does not support position-independent code, the addresses used in the ROM slots depend on the offset at which the firmware is stored.
This gives rise to the need to build two firmware images, one for each target location.
@ -38,11 +40,18 @@ irom0_0_seg :
len = ( 1M / 2 - 0x2010 )// Half ROM size excl. bootloader
len = ( 1M / 2 - 0x2010 )// Half ROM size excl. bootloader
\end{lstlisting}
\end{lstlisting}
For installing a firmware update, the new firmware image file is downloaded using an HTTP GET request.
\subsubsection{Verifying the cryptographic signature}
While the image is being downloaded, each chunk received in the download stream is used to update the \textit{SHA256} hash before it is written to the flash.
When the write has been finished, the next chunk is received and the process continues until all chunks have been received.
After the download of a new ROM has been finished successfully, the calculated hash is checked against the cryptographically signed hash provided in the meta-information file. The required public key is always baked into the running firmware.
Only if the firmware is considered valid is the bootloader configuration altered to boot into the new ROM slot and the device rebooted.
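The control flow of this phase can be sketched as follows. This is an illustration under stated assumptions, not the firmware's implementation: \texttt{ToyDigest} is a non-cryptographic FNV-1a placeholder standing in for the \textit{SHA256} context, \texttt{FakeFlash} stands in for the inactive ROM slot, and the signature check is reduced to comparing the digest with an expected value. All names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Placeholder digest (64-bit FNV-1a). NOT cryptographically secure; it only
// stands in for the SHA-256 context used on the device.
struct ToyDigest {
    std::uint64_t state = 14695981039346656037ULL;
    void update(const std::uint8_t* data, std::size_t len) {
        for (std::size_t i = 0; i < len; ++i) {
            state ^= data[i];
            state *= 1099511628211ULL;
        }
    }
};

// Hypothetical stand-in for the flash region of the inactive ROM slot.
struct FakeFlash {
    std::vector<std::uint8_t> bytes;
    void write(const std::uint8_t* data, std::size_t len) {
        bytes.insert(bytes.end(), data, data + len);
    }
};

// Feeds every received chunk to the digest first and writes it to flash
// afterwards. Returns true only if the final digest equals the expected
// value; only then may the caller switch the boot slot and reboot.
inline bool downloadAndVerify(const std::vector<std::string>& chunks,
                              std::uint64_t expected, FakeFlash& flash) {
    ToyDigest digest;
    for (const std::string& chunk : chunks) {
        const auto* p = reinterpret_cast<const std::uint8_t*>(chunk.data());
        digest.update(p, chunk.size());
        flash.write(p, chunk.size());
    }
    return digest.state == expected;
}
```

Since the image is written to the slot that is not currently booted, a failed check leaves the running firmware untouched; the rejected data is simply overwritten by the next attempt.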
\textit{rBoot}\cite{rBoot} has been chosen as it is integrated within the \textit{Sming} framework and allows booting from multiple ROM slots.
\subsubsection{Reconfiguring the boot process}
For the bootloader, \textit{rBoot}\cite{rBoot} has been chosen as it is integrated within the \textit{Sming} framework and allows booting from multiple ROM slots.
For configuration, an \textit{rBoot} specific structure is placed in the flash at a well-known location directly after the space reserved for the bootloader code.
This structure contains, among other things, the target offsets for all known ROM slots and the number of the ROM slot to boot from on next reboot.
The full memory layout of this approach is shown in Figure~\ref{fig:memory_layout}.
This structure contains, among other things, the target offsets for all known ROMs and the number of the ROM to boot from on next reboot.
@ -51,10 +60,10 @@ The full memory layout of this approach is shown in Figure~\ref{fig:memory_layou
\end{figure}
\end{figure}
To calculate the origin of application data for each ROM slot, the available memory of 1 MB is split in half and an offset of the size of the bootloader code and its configuration (0x2000 bytes) is added.
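The arithmetic can be checked at compile time. The sketch below uses hypothetical constant names and computes flash offsets only (not memory-mapped addresses), assuming the slot origin is exactly the start of each half plus the 0x2000 bytes stated above.

```cpp
#include <cstdint>

constexpr std::uint32_t kFlashSize = 1024 * 1024;  // 1 MB of flash
constexpr std::uint32_t kSlotCount = 2;            // two ROM slots
constexpr std::uint32_t kReserved = 0x2000;        // bootloader code + configuration
constexpr std::uint32_t kHalfSize = kFlashSize / kSlotCount;

// Start of the half of the flash that contains the given slot.
constexpr std::uint32_t halfStart(std::uint32_t slot) {
    return slot * kHalfSize;
}

// Flash offset at which the application data of the given slot begins.
constexpr std::uint32_t slotOrigin(std::uint32_t slot) {
    return halfStart(slot) + kReserved;
}

static_assert(kReserved == 8192, "0x2000 bytes are 8192 bytes");
static_assert(slotOrigin(0) == 0x002000, "first slot follows the bootloader");
static_assert(slotOrigin(1) == 0x082000, "second slot is shifted by the same amount");
```

In the second half, the range from \texttt{0x80000} up to \texttt{0x82000} is not occupied by a bootloader, which is where the 8192-byte gap comes from.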
For alignment and easy debugging, the second block is also shifted by the same amount ob bytes as the first block.
The unused gap of 8192 bytes is used by some applications to store data which can persist over application updates.
For alignment and easy debugging, the second block is shifted by the same number of bytes as the first block.
The resulting gap of 8192 bytes is available to applications for storing data that persists across application updates.
\begin{lstlisting}[caption={The flash layout used for two ROM slots.},
\begin{lstlisting}[caption={The flash layout used for two ROMs.},
The firmware files provided on the update server are the exact same ones as used to initially flash the chip for the corresponding version.
Using the same files for flashing and updating allows better debugging by eliminating errors related to the update process itself and makes development and initial installation very easy.
Using the same files for flashing and updating allows better debugging by eliminating errors related to the update process itself and eases development and initial installation.
Listing~\ref{lst:choosing_rom} shows the algorithm used to determine the download address and reconfigure the bootloader.
The update server provides these files in the exact same way as it provides the meta-information files, but the path pattern differs: the suffixes \texttt{.rom0} and \texttt{.rom1} are used to provide the firmware image files for the first and second slot respectively.
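As a rough illustration of the path patterns, the helper functions below build the request URLs. They are hypothetical and rest on two assumptions: that \texttt{UPDATER\_URL} is a base URL with all files located directly beneath it, and that the firmware images follow the pattern \texttt{\$\{DEVICE\}.rom0} and \texttt{\$\{DEVICE\}.rom1}.

```cpp
#include <string>

// Joins a base URL and a file name with exactly one slash between them.
inline std::string joinUrl(const std::string& base, const std::string& file) {
    if (!base.empty() && base.back() == '/') return base + file;
    return base + "/" + file;
}

// URL of the meta-information file for a device type.
inline std::string metaUrl(const std::string& base, const std::string& device) {
    return joinUrl(base, device + ".version");
}

// URL of the firmware image built for the given ROM slot (0 or 1).
inline std::string romUrl(const std::string& base, const std::string& device,
                          unsigned slot) {
    return joinUrl(base, device + ".rom" + std::to_string(slot));
}

// With two slots, the download target is the slot that is not running.
inline unsigned inactiveSlot(unsigned currentSlot) {
    return currentSlot == 0 ? 1u : 0u;
}
```

A device of the (made-up) type \texttt{lamp} running from the first slot would thus request \texttt{lamp.version} and, if the versions differ, \texttt{lamp.rom1}.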
The project has been successfully deployed in the hackerspace and is now an essential part of home-automation development and deployment.
In this article, we have presented a concept for building and publishing cryptographically secure \textit{Over The Air} updates for embedded devices based on ESP8266 microcontrollers.
A proof of concept implementation has been developed, which is now an essential part of the home-automation development and deployment in the hackerspace \textit{Magrathea Laboratories}.
For most members, the update infrastructure has been the decisive factor in choosing the framework.
Enabling the developers to do updates in combination with the shared configuration and behavior provided by the framework resulted in a massive speedup when it comes to project deployment.
Before that, the cost for an change after deployment was estimated so high, that most projects tend to delay deployment until all required and wanted features are implemented.
Before that, the cost of applying changes after deployment was estimated to be so high that most projects tended to delay deployment until all required and desired features were implemented.
Now, as the devices are deployed as soon as the hardware is considered stable, these devices start to provide functionality early and therefore the developers can get better feedback on the provided functionality.
Most of the devices running the update-enabled firmware have undergone multiple major updates without any problems.