A detailed presentation of this research was delivered at RSA US 2016, and is available at https://www.rsaconference.com/writable/presentations/file_upload/tech-t09-smart-megalopolises.-how-safe-and-reliable-is-your-data.pdf
In the past two years traffic sensors have mushroomed in Russian cities. Drivers using speed camera detectors were the first to spot the white boxes stuck to posts along the roadside. Their devices, designed to warn drivers about traffic enforcement cameras, react to the signals emanating from the new sensors in the same way they do to the radar guns used by traffic police. Helping enforce the speed limit is merely a positive side effect, however; the city authorities installed the sensors for a completely different reason. The devices count the number of cars of varying size in each lane, determine their average speed, and send the data to a unified traffic control center.
A traffic sensor in Moscow
As a result, the city authorities receive information about traffic intensity, which allows them to, for example, adjust traffic light phasing or plan further road infrastructure. The weekly reports issued by Moscow’s traffic authorities present information about the slowest and the fastest highways, based on data coming from both the Center for Road Traffic Management and from Yandex. While the latter’s data comes from apps running on users’ smartphones, the former would not be able to collect its data without the road infrastructure that will be discussed here.
Each week the Moscow city authorities publish data about the city’s fastest and slowest highways
These sensors are the lowest tier of ‘smart city’ infrastructure – they collect raw data about traffic and pass it on; without that data, no analysis can be done and systems cannot be configured properly. Therefore, the information coming from the sensors has to be accurate. But is that actually the case? Can an outsider manipulate the operation of the sensors and the information that they collect? We will try to answer these questions and identify any improvements that can be made to the urban IT infrastructure.
How to search for devices and information about them
Any research begins by collecting all the available data, and research into embedded systems, to which traffic sensors belong, is no exception. Even in a narrow market segment, you cannot know all sensor types and models by sight unless you are dealing with them professionally every day. The odds are that you won’t be able to immediately identity the manufacturer of a device by just looking at it. This makes all the logos and manufacturer labels on devices all the more valuable.
If you do succeed in identifying the model of a road sensor just by looking at it, you can find various documentation on the vendor’s site (or that of their integrator), and, if you are lucky enough, you will also find the software used for working with the devices. You will almost certainly find a marketing leaflet about your device; there is also a good chance you will find a larger sales-oriented document. It is also not uncommon to come across documentation, but finding a full-fledged technological description with the device’s command system is a rare piece of luck.
It is practical to automate the process of working with the sensors, so you don’t have to sit under each and every device with a laptop. Nowadays, such automation is quite normal – wireless connections are no longer a rarity for ‘smart city’ components. However, to ensure such automation, you first need to know which communication protocol each sensor uses, and how to separate the devices you need from all the other devices.
For this purpose, you can use any identifiers, including peculiarities in how the devices communicate their data. For example, most MAC addresses are reserved to specific manufacturers (however, anonymous MAC addresses also exist). As well as numeric IDs, devices also typically have alphabetic names that may also follow some type of standard, i.e. the device model plus an incremental index.
All of this makes it possible to write a scanner to search for devices that are of interest to us. One of the sensor models installed in Moscow uses Bluetooth for data communication. These devices have both MAC addresses and ‘friendly names’ that are quite distinctive, so we can add only these traffic sensors to the list and filter out all nearby smartphones and TV sets. A discussion on Bluetooth security goes beyond the scope of this topic, so we will not talk here of compromising Bluetooth devices. We informed the Moscow city authorities about the configuration drawbacks in November 2015.
Traffic sensor records saved to a database
I used Python, PostgreSQL and a bit of C. In real time, as I drive by each traffic sensor, the scanner identifies each device’s MAC address, friendly name and coordinates. The fields with the vendor name and the physical address are filled out later on a separate pass through the database based on the data already collected. Determining the device’s physical address from coordinates is, to a certain degree, a time-consuming procedure, so it shouldn’t be done simultaneously while searching for devices. Establishing a Bluetooth connection is not fast either, so if you want to find traffic sensors, you’ll have to drive slowly.
What can be done with the firmware
The openness shown by the manufacturers to installation engineers, their readiness to give them access to tools and documents, automatically means they are open to researchers. (I respect this sort of approach; in my view, this sort of openness combined with a ‘bug bounty’ approach yields better results than secrecy.) After selecting any of the identified sensors, you can install the device configuration software supplied by the vendor on your laptop, drive to the location (the physical address saved in the database), and connect to the device.
As we do in any research of embedded system security, we first of all check if it’s possible to reinstall the firmware on the device.
The configuration software allows the firmware on the traffic sensor to be changed
Yes, we can install new firmware on the device via this wireless connection designated for servicing purposes. It is just as easy to find the manufacturer’s firmware and as it is its software. The firmware looks reminiscent of Intel iHex or Motorola SREC, but this is the manufacturer’s proprietary product. If we remove the overhead information (‘:’ – the write instruction, serial numbers, memory addresses and checksums) from the data blocks for the digital signal processor (DSP) and the main processing unit (MPU), we obtain the clean code. However, we don’t know the architecture of the controllers in the device, so we cannot simply open the file with a disassembler.
The traffic sensor firmware
Oddly enough, LinkedIn helps out here – it’s not just a handy resource for careerists and HR departments. Sometimes the device architecture is not a secret, and engineers who used to work for the manufacturer may be willing to talk about it. Now, as well as the file, we also have an understanding of the architecture which the firmware was compiled for. However, our good fortune only lasts until we launch IDA.
You can find the controller types even if they aren’t specified in the documentation
Even when we know the architecture, the firmware remains a meaningless jumble of bytes. However, we could find out from the same engineer how exactly the firmware is encrypted, as well as the encryption algorithms and key tables. I didn’t have the device to hand, so at that stage I decided this black box mode of firmware modification was showing little promise, and set it aside. We have to admit that in this specific case the microelectronics engineers know how to protect firmware. However, this did not mean the end of the road for our sensor research.
Only trucks travel at night
Firmware modification is “good” in that new functional capabilities can be added. However, there is sufficient functionality in the manufacturer’s standard software. For example, the device has about 8 MB of memory that is used to keep a copy of traffic data until the memory is full. This memory can be accessed. The firmware allows you to change the way that passing vehicles are classified according to their length, or change the number of lanes. Would you like a copy of the information collected on traffic? No problem. Would you like to classify all vehicles as trucks driving in the right lane? You can do that too. Of course, this will affect the accuracy of the statistics that are collected, with all ensuing consequences.
Sample of the data traffic sensors collect and pass along. The same data is stored on the device
If someone wants to get a copy of Moscow traffic statistics or manipulate the data, they will have to walk or drive around all the traffic sensors; however, they won’t have to launch software for each of the thousands of sensors and modify the settings manually. In this specific case, there is a description of the proprietary system of commands for the devices. This is not something that you come across a lot when researching embedded systems. In any case, after establishing a connection to the traffic sensor using the manufacturer’s software, the commands are no longer a secret – they are visible using a sniffer. Even then, there’s a description in English that saves us the trouble of analyzing the machine-language communication protocol.
Documented device commands make traffic analysis unnecessary
Bluetooth services are not actually implemented on the traffic sensor; the wireless protocol in this case is only a data communication environment. The data is communicated via a regular serial port. Software handling of such ports is no different from reading and writing to files, the code for sending commands is trivial. For these purposes, it’s not even necessary to implement the usual multithreaded port handling – it’s sufficient to send bytes and receive the response in a single thread.
To sum up, a car driving slowly around the city, a laptop with a powerful Bluetooth transmitter and scanner software is capable of recording the locations of traffic sensors, collecting traffic information from them and, if desired, changing their configurations. I wouldn’t say that traffic stats are a major secret, but tampering with sensor configurations could affect their validity. And that data could be used as a basis for controlling ‘smart’ traffic lights and other traffic equipment.
The sensor sent a response, the command has been accepted. Because we know the command system, it is easy to ‘translate’ the response
What can be done?
It turns out that the answers to both questions we raised at the beginning are negative: traffic data is not protected, and it can be manipulated. Why is that? Well, there was no authorization except that required for Bluetooth, and that was not configured properly. The manufacturer of the road sensors we examined is very generous when it comes to service engineers, with a lot of information about the devices publicly available on the manufacturer’s official website and elsewhere. Personally, I agree with the manufacturer and respect them for this, as I don’t think the “security through obscurity” approach makes much sense these days; anyone determined enough will find out the command system and gain access to the engineering software. In my view, it makes more sense to combine openness, big bounty programs and a fast response to any identified vulnerabilities, if for the only reason that the number of researchers will always be bigger than the number of employees in any information security department.
At the installation stage, it makes sense to avoid using any standard identifiers. Obviously, manufacturers need to advertise their products, and the servicing teams may need to collect additional information from the adhesive labels on a device, but besides the convenience there are also issues of information security to consider. Last but not least, it’s not worth relying solely on the standard identification implemented in well-known protocols. Any additional proprietary protection that is properly implemented will be relevant and make penetration much more difficult.
A detailed presentation of this research was delivered at RSA US 2016, and is available at RSA conference website.