

								\section{Concept for implementing \textit{OTA} updates}\label{concept}

								To implement \textit{OTA} updates under the given requirements, we first define a topology that integrates our build infrastructure, firmware repository, and controller with the IoT WiFi network to which the devices are connected.

								For our reference implementation, we deliberately chose lightweight and widely used software projects so that the individual components can be exchanged easily.

								The base topology and the specific components used are shown in Figure~\ref{fig:topology}.


								\begin{figure}[htbp]

								\centering\fbox{\includegraphics[width=.98\linewidth]{topology.pdf}}

								\caption{The base network topology.}

								\label{fig:topology}

								\end{figure}


								The source code of the \textit{ESPer} project is published in a \textit{Git} \cite{git} source code repository.

								From there, the continuous integration (CI) system is responsible for automatically building and publishing the firmware image files, as soon as updated source code is available.

								It is also in charge of assembling and publishing meta-information (version number and cryptographic signature) required for the update process.

								The CI system is described in detail in the following section.

								Updates to the devices' firmware are either triggered actively (manually or by the CI system) or on a regular schedule by the devices themselves.

								This process is described in Section~\ref{flashlayout}.


								For monitoring and maintenance purposes, each device publishes a set of information to a well-known \textit{MQTT} topic after connecting to the network.

								Besides data such as the device type, chip ID, and flash ID, the published data includes the bootloader, SDK, and firmware versions as well as relevant details from the bootloader configuration, such as the currently booted ROM slot and the default ROM slot to boot from.

								This allows administrators to find devices with outdated bootloaders and helps to find missing or failed updates.
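								As an illustration, the status report and the search for outdated devices could look roughly like the following Python sketch. The payload format is not specified here, so all field names, the example version strings, and the helpers \texttt{build\_status} and \texttt{find\_outdated} are assumptions made for this example only.

```python
import json


def build_status(device_type, chip_id, flash_id, bootloader, sdk,
                 firmware, booted_slot, default_slot):
    """Serialize one device status report (hypothetical field names)."""
    return json.dumps({
        "device_type": device_type,
        "chip_id": chip_id,
        "flash_id": flash_id,
        "bootloader": bootloader,
        "sdk": sdk,
        "firmware": firmware,
        "booted_slot": booted_slot,
        "default_slot": default_slot,
    }, sort_keys=True)


def find_outdated(reports, expected_firmware):
    """Return the chip IDs of devices not running the expected firmware."""
    outdated = []
    for raw in reports:
        status = json.loads(raw)
        if status["firmware"] != expected_firmware:
            outdated.append(status["chip_id"])
    return outdated
```

								In such a setup, the reports would be collected from the well-known \textit{MQTT} topic and compared against the version most recently published by the CI system.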


								\subsection{Common framework and build infrastructure}

								The framework includes a build system that allows basic parameters to be configured for all devices, including, but not limited to, the WiFi access parameters, the \textit{MQTT} connection settings, and the updater URLs.

								The \texttt{UPDATE\_URL} option must be set for a device for updates to work.

								If the option is omitted, the update management code is excluded from the build.

								By sharing the same code, all devices behave consistently when reporting the device status or interacting with the home-automation controller.

								This eases configuration and allows information about all devices to be collected at a central location.


								As development on the devices usually happens in cycles, some of the projects would miss updates of the framework and would therefore not benefit from newly added features or fixed problems.

								Regularly updating the framework version and rebuilding the firmware would often provide these benefits with little effort, but requires manual intervention.

								Further, problems could arise if the application programming interface (API) of the framework changes.

								In this situation, the device firmware must be adapted to the changed API, which can be an unpleasant and complex task that delays firmware updates.

								To prevent these problems, the firmware of all devices in the hackerspace is integrated together with the framework into a larger project.

								By doing so, any device-specific code is always linked to the latest version of the framework.

								The corresponding device type is provided as a string through a global constant at compile time and must never be changed during operation.

								Device-specific code is organized in a sub-folder for each device type.

								To build the software, a \textit{Makefile} \cite{make} is used, which provides a simple way to achieve reproducible builds.

								Whenever a new build is started, the build system scans for all device-specific folders and calls the build process for each of them.

								After the build of the firmware has finished, the build system also creates a file for each device type, containing the build version and cryptographic signatures of the corresponding firmware images.
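								The scan for device folders and the creation of the per-device meta-information file can be sketched as follows. This is a minimal Python sketch, not the project's \textit{Makefile}: the file name \texttt{<type>.version}, the line-based file layout, and the caller-supplied \texttt{sign} function (standing in for the actual signing step with the private key) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def device_types(devices_dir):
    """Treat every sub-folder of devices_dir as one device type."""
    return sorted(p.name for p in Path(devices_dir).iterdir() if p.is_dir())


def write_version_file(out_dir, device_type, version, images, sign):
    """Write a meta-information file for one device type.

    The first line holds the build version, followed by one hex-encoded
    signature per firmware image. `sign` is a hypothetical callable that
    maps a SHA-256 digest to signature bytes.
    """
    lines = [version]
    for image in images:
        digest = hashlib.sha256(Path(image).read_bytes()).digest()
        lines.append(sign(digest).hex())
    target = Path(out_dir) / f"{device_type}.version"
    target.write_text("\n".join(lines) + "\n")
    return target
```

								Passing the signing step in as a function keeps the private key handling outside of this sketch; in the setup described here, that step belongs to the CI system.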

								To avoid interference between different build environments and to roll out new versions as quickly as possible, the code has been integrated into a continuous integration (CI) system. The CI system is also responsible for publishing the resulting firmware images to the firmware server that is queried during updates, and for notifying the devices to check for an update.


								\subsection{Device setup and flash layout}\label{flashlayout}

								Microcontroller boards based on the \textit{ESP8266} MCU mostly follow the same layout: the MCU is attached to a flash chip that contains the bootloader, firmware, and other application data.

								The memory mapping mechanism of the MCU allows only a single page of 1 MB of flash to be mapped at the same time \cite{ESP8266_Memory_Map} and the selected range must be aligned to 1 MB blocks.


								As the image to be downloaded and flashed may exceed the available heap memory, the received data must be written to flash directly.

								However, executing code from the memory-mapped flash while overwriting the same area with the downloaded update leads to unexpected behavior, as the executed code changes immediately to the updated one.

								To avoid this, the flash is split in half to hold two firmware ROM slots with different versions: one that is being executed and one that receives the download.

								This standby ROM slot also acts as a safety mechanism if the download fails or is interrupted, as the previous version stays intact and can still be used (refer to requirement \ref{req2}).

								In case of an error, the old firmware is kept unchanged and remains in use until a newer firmware has been downloaded successfully.

								In addition to the two firmware ROM slots, the flash provides room for the bootloader and its configuration.
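								The mapping constraint and the choice of the standby slot can be illustrated with a short Python sketch. The helper names and the offsets used in the usage note are hypothetical and only serve to show the arithmetic; the actual offsets depend on the flash size and the bootloader configuration.

```python
MB = 1024 * 1024  # size of the mappable flash window


def fits_mapping_window(offset, size):
    """Check that [offset, offset + size) lies inside one 1 MB-aligned block."""
    if size <= 0:
        return False
    return offset // MB == (offset + size - 1) // MB


def standby_slot(booted_slot):
    """With two ROM slots (0 and 1), the standby slot is the other one."""
    return 1 - booted_slot
```

								For example, a hypothetical image of size \texttt{0x0FE000} placed at offset \texttt{0x002000} ends at \texttt{0x0FFFFF} and therefore fits into the first 1 MB block, whereas an image of size \texttt{0x020000} placed at \texttt{0x0F0000} would cross a block boundary and could not be mapped as a whole.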


								\subsection{Cryptographically securing the firmware update}

								To ensure only valid firmware is running on the devices, a cryptographic signature of the firmware images is calculated and checked as part of the update process.

								To calculate and verify the signature of a firmware image, the \textit{SHA-256} hashing algorithm \cite{RFC6234} and an elliptic-curve signature scheme based on \textit{Curve25519} \cite{bernstein2006curve25519} are used, both of which are considered modern and secure methods for software signing (see \cite{barker2016nist, bsi}).


								The cryptographic signature for each of the two firmware images is created by the continuous integration system during build time and is provided as meta-information along with the firmware images.

								Therefore, the CI system must be equipped with the private key used to create the signatures.

								In contrast, the microcontroller only needs to know the corresponding public key to verify the cryptographic signature.


								For the same reason as stated in Section~\ref{flashlayout}, the signature of the new firmware image cannot be verified before the image is written to flash.

								Therefore, the \textit{SHA-256} checksum required for the signature check is calculated while the update is downloaded and written to flash.

								After the download has succeeded, the checksum is verified against the signature, and the bootloader is reconfigured if and only if the signature is valid.

								Otherwise, the bootloader will not be reconfigured and the system will not start the invalid firmware.
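								The update flow described above can be summarized in a host-side Python sketch. It simulates the flash as a \texttt{bytearray} and the bootloader configuration as a dictionary; the function \texttt{run\_update}, its parameters, and the injected \texttt{verify} callable (standing in for the signature check with the public key) are assumptions made for this illustration and do not reflect the firmware's actual interfaces.

```python
import hashlib


def run_update(chunks, flash, slot_offsets, config, signature, verify):
    """Stream an image into the standby slot and switch slots only if valid.

    chunks       -- iterable of byte strings as they arrive from the download
    flash        -- bytearray simulating the flash chip
    slot_offsets -- start offsets of ROM slot 0 and ROM slot 1
    config       -- dict with key "default_slot" (simulated bootloader config)
    verify       -- hypothetical callable(digest, signature) -> bool
    """
    standby = 1 - config["default_slot"]
    pos = slot_offsets[standby]
    sha = hashlib.sha256()
    for chunk in chunks:
        # Each chunk is written directly; the image is never held in RAM whole.
        flash[pos:pos + len(chunk)] = chunk
        sha.update(chunk)
        pos += len(chunk)
    if verify(sha.digest(), signature):
        config["default_slot"] = standby  # boot the new image next time
        return True
    return False  # configuration untouched, old firmware stays active
```

								Because the hash is updated chunk by chunk, the memory needed for verification is independent of the image size, and a failed check leaves the currently booted slot and the bootloader configuration unchanged.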