With recent advances in IoT technologies, many real-time services are expected to emerge to utilize the vast amounts of data flowing into the cloud from various devices in factories, homes, and social infrastructure. As we progress toward autonomous driving with connected cars, researchers are considering analyzing the vast amounts of information, such as speed and location, generated by vehicles, which can then be presented to drivers in the form of warnings, for example.
Stream processing technology is effective at processing such large volumes of data at high speed. However, processing must be temporarily stopped whenever processing content is changed or added to reflect new or improved services, which can delay the provision of those services.
Fujitsu has now developed a new stream processing architecture that automatically switches to a new data processing program when a parallelized data processing task is complete. This architecture separates data reception processing from the data processing currently running, so that both continue without interruption (patent pending). In a simulation in which one million vehicles each sent a few dozen bytes of data per second, Fujitsu confirmed that this architecture can continue processing streaming data while processing programs are added or changed, with an average increase in wait times of five milliseconds or less.
Fujitsu Laboratories aims to commercialize this technology during fiscal year 2018 on the Mobility IoT platform, offered by Fujitsu Limited, and expand it to other areas of the industry.
The details of this technology were presented at DEIM2018 (the Forum on Data Engineering and Information Management), a conference held in Awara, Fukui Prefecture, Japan, from March 4.
Development Background:
With the recent development of IoT technologies, data has begun to be collected from all kinds of objects and stored in data centers. It is expected that analyzing and using this data will lead to the creation of new services. In the case of connected cars, for example, it is believed that collecting, analyzing, and using real-time car data will make it possible to alleviate congestion, assist drivers, and improve the safety of autonomous driving (Figure 1).
To process the large volumes of data sent from moving vehicles, the most efficient method is to build a system that uses stream processing to process data in parallel, on a per-vehicle basis. To add or change the processing program in line with service additions and improvements, the current method involves preparing two systems of the same scale in advance, using one for operations, making changes to the other, and then quickly swapping them. However, this method requires both systems to be temporarily shut down while data held in the memory of the system in use, such as a vehicle's speed or position, is copied to the system being updated. This makes it difficult to provide services that require truly continuous operation, such as the real-time transmission of warnings to connected vehicles.
In addition, because new processing programs were obtained from databases known as repositories, numerous queries from a large number of processing units caused congestion, which slowed down overall processing.
Details of the Newly Developed Technology:
Fujitsu Laboratories has developed Dracena, an architecture that can modify a system's processing programs while it is running, without interrupting operations.
When data processing content is changed or added, this architecture distributes the new data processing program as a message, in the same way that data is distributed, to each individual processing unit, called an object (for example, the processing unit for each car). This eliminates the slowdown in overall processing caused by the concentration of queries on the repository. Furthermore, because message reception processing is separated from data processing within each object, the system can add the new data processing program without stopping either message reception or existing data processing, and then have all objects switch to the new data processing program simultaneously. This has allowed Fujitsu Laboratories to create a stream processing architecture in which data processing programs can be added or modified without stopping the system, so that parallelized processing continues without holding back the flow of large volumes of data for copying (Figure 2).
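The mechanism described above can be illustrated with a minimal Python sketch. It is not Dracena's implementation: the class names (DataMessage, ProgramMessage, SwitchMessage, VehicleObject), the staging dictionary, and the use of a broadcast switch marker to make objects change over at the same point in the stream are all assumptions made for illustration. The sketch shows only the two ideas stated in the text: a new program arrives as a message through the same path as data, and reception (enqueueing) is kept separate from processing so neither has to stop.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Callable, Deque, Dict, List, Tuple

# Hypothetical handler signature: (object state, message payload) -> result
Handler = Callable[[Dict[str, Any], Dict[str, Any]], Any]


@dataclass
class DataMessage:
    payload: Dict[str, Any]


@dataclass
class ProgramMessage:
    name: str
    handler: Handler


class SwitchMessage:
    """Assumed marker telling an object to activate its staged programs."""


@dataclass
class VehicleObject:
    vehicle_id: str
    state: Dict[str, Any] = field(default_factory=dict)
    mailbox: Deque[Any] = field(default_factory=deque)
    active: Dict[str, Handler] = field(default_factory=dict)
    staged: Dict[str, Handler] = field(default_factory=dict)
    results: List[Tuple[str, Any]] = field(default_factory=list)

    def receive(self, message: Any) -> None:
        # Reception only enqueues, so it never waits on data processing.
        self.mailbox.append(message)

    def process_one(self) -> bool:
        # Handle one queued message; return False when the mailbox is empty.
        if not self.mailbox:
            return False
        message = self.mailbox.popleft()
        if isinstance(message, ProgramMessage):
            # Stage the new program; programs already active keep running.
            self.staged[message.name] = message.handler
        elif isinstance(message, SwitchMessage):
            # Activate everything staged so far.
            self.active.update(self.staged)
            self.staged.clear()
        else:
            for name, handler in self.active.items():
                self.results.append((name, handler(self.state, message.payload)))
        return True


def report_speed(state, payload):
    return payload["speed"]


def report_position(state, payload):
    return payload["position"]


car = VehicleObject("car-1", active={"speed": report_speed})
car.receive(DataMessage({"speed": 40, "position": 1}))
car.receive(ProgramMessage("position", report_position))  # new program as a message
car.receive(DataMessage({"speed": 42, "position": 2}))    # still the old program set
car.receive(SwitchMessage())
car.receive(DataMessage({"speed": 44, "position": 3}))    # both programs now run
while car.process_one():
    pass
print(car.results)
```

In this sketch the object's state dictionary is never copied or paused during the change, which mirrors the claim that data does not need to be held back for copying. A real system would run reception and processing concurrently across many objects; the single-threaded loop here is only for readability.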
Effects:
In a simulated evaluation, Fujitsu Laboratories tested a use case in which each of one million vehicles transmits a few dozen bytes of data once per second. While the system was already providing a service to detect excessive driving times, a harsh braking detection service was added. The results confirmed that the architecture continued to provide services throughout, with an average latency increase of five milliseconds or less. This architecture will enable the rapid delivery of real-time services that require uninterrupted operation and that can address societal issues, including driver assistance for connected cars, energy-saving appliances, home health and safety monitoring, and travel guidance for tourists using smartphones.
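To make the evaluation scenario more concrete, the following Python sketch shows what the two services might look like as per-vehicle functions, with the harsh braking service added partway through a stream of samples. The thresholds, the VehicleState fields, and the function names are hypothetical; the source does not describe how either service was implemented in the simulation.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative assumptions, not values from the evaluation.
MAX_CONTINUOUS_DRIVING_S = 2 * 60 * 60  # assumed limit: two hours without a stop
HARSH_BRAKING_MPS2 = 4.0                # assumed deceleration threshold in m/s^2


@dataclass
class VehicleState:
    driving_since: Optional[float] = None  # time the current driving stretch began (s)
    last_speed: Optional[float] = None     # previous speed sample (m/s)
    last_time: Optional[float] = None      # time of the previous sample (s)


def excessive_driving(state, t, speed):
    """Existing service: True once the vehicle has driven continuously too long."""
    if speed <= 0.0:
        state.driving_since = None
        return False
    if state.driving_since is None:
        state.driving_since = t
    return t - state.driving_since >= MAX_CONTINUOUS_DRIVING_S


def harsh_braking(state, t, speed):
    """Added service: True when speed drops faster than the assumed threshold."""
    harsh = False
    if state.last_speed is not None and t > state.last_time:
        deceleration = (state.last_speed - speed) / (t - state.last_time)
        harsh = deceleration >= HARSH_BRAKING_MPS2
    state.last_speed, state.last_time = speed, t
    return harsh


def demo():
    services = {"excessive_driving": excessive_driving}
    state = VehicleState()
    alerts = []
    # (time in seconds, speed in m/s), one sample per second as in the use case
    samples = [(0.0, 20.0), (1.0, 20.0), (2.0, 19.0), (3.0, 12.0), (4.0, 11.0)]
    for t, speed in samples:
        if t == 2.0:
            # The new service is registered between samples; nothing is stopped.
            services["harsh_braking"] = harsh_braking
        for name, service in list(services.items()):
            if service(state, t, speed):
                alerts.append((t, name))
    return alerts


print(demo())
```

The existing service keeps its accumulated state (driving_since) across the moment the new service is added, which is the property the evaluation is concerned with. The sketch says nothing about latency; the five-millisecond figure comes from Fujitsu Laboratories' own simulation.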
Furthermore, this architecture allows users to adopt a step-by-step approach in which they first construct a base system for simple analysis and use, and then gradually add new services. In the automotive sector, for example, it would be possible to start with a system that issues drunk driving warnings based on steering wheel data and then add new services layer by layer, such as combining this with map data to detect crosswinds at tunnel exits, or combining it with image data to detect illegally parked cars. This is expected to improve the efficiency of service development.
Future Plans:
Fujitsu Laboratories aims to commercialize this technology during fiscal year 2018 as a component of the Mobility IoT platform offered by Fujitsu Limited. Furthermore, Fujitsu aims to expand this technology beyond the mobility sector into commercial areas requiring real-time services based on data generated continuously at a high frequency, such as providing directions to people during events or disasters.
