Data revolution in real time travel information

Air Quality & Weather Systems / February 3, 2012

Damian Black, CEO and founder of SQLstream Inc, writes about relational stream processing for real-time intelligent transport systems

Almost unnoticed, there is a revolution going on in Internet data which is different from anything seen before. It is taking place in sensor data, which research organisation Gartner predicts in 2012 will exceed 20 per cent of all non-video Internet traffic.

Sensors feed most Intelligent Transport Systems (ITS) and come in various forms, ranging from fixed roadside sensors to vehicle-based sensors and personal devices.

The dramatic drop in the price of GPS and sensor hardware in general, coupled with the widespread availability of 3G wireless cellular data networks, is paving the way for a tidal wave of ITS sensor data and enabling a slew of new real-time ITS applications and services.

However, this sensor data 'tsunami' is a major headache for the conventional database world, where data must first be stored, cleaned and aggregated before being queried. The problem is that the volume of data is growing too fast, the value of the individual data is too low and the useful lifetime of the raw data is so short that it has become both cost-prohibitive and technologically very challenging to process it using conventional database and data warehousing approaches.

New-generation streaming apps

For almost a year now, SQLstream has been working with a major national government transportation agency, one with a long history of successful innovation in the ITS field and a multi-billion dollar annual budget, to create a new generation of streaming ITS applications. This agency has invested almost a decade in pushing the envelope of conventional database technology to try to process sensor data, but kept running into technical difficulties when trying to provide true real-time processing.

SQL is considered to be the only viable approach when it comes to cost-effectively expressing and maintaining this agency's data processing and querying logic. SQL has stood the test of time and there is a wealth of readily available expertise. It has proven expressive power, offering high-level simplicity along with declarative elegance. Complex queries require only a few lines of SQL, leaving the database to optimise the query and generate rapid results efficiently.

The problem in this case was how to make SQL database technology (which was designed for periodically querying large amounts of relatively unchanging historical data) work when the data were constantly changing and the results had to be continually updated. A further complication was the desire to avoid repetition of results already delivered; for the most part, because of the speed at which data loses value, only the most recent data was of interest. Getting the database turtle to jump through the flaming hoop of continuous, real-time processing represented something of a challenge.

Fortunately there is a solution - a whole new paradigm of data processing ideal for handling such scenarios: relational stream computing.

Relational stream computing is extremely simple in concept but really profound in its impact: run relational queries continuously against your live streaming data without first storing the data.

SQL is the lingua franca for querying databases - standardised and implemented by hundreds of database vendors. Relational stream computing makes it possible to continuously stream out the new results that correspond to newly arriving sensor and service data, and to do it all in real time with near-zero latency. SQLstream is the first company to bring relational stream computing platforms to the market based on standards-compliant SQL and is now marketing a growing range of related ITS analytics applications.
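The idea of querying data as it flows past, rather than after it has been stored, can be illustrated with a minimal Python sketch. This is not SQLstream's implementation or API; the function name `continuous_select`, the record fields and the 20 km/h threshold are all assumptions made for illustration. It mimics a filter-and-project query over a feed, emitting each new result as the corresponding record arrives.

```python
def continuous_select(stream, predicate, project):
    """Apply a filter and a projection to each row as it arrives.

    Nothing is stored: every row is examined once and, if it matches,
    a result is emitted immediately.
    """
    for row in stream:
        if predicate(row):
            yield project(row)


def sensor_feed():
    # Hypothetical stand-in for a live feed; a real source would never end.
    yield {"segment": "A1", "speed_kmh": 92.0}
    yield {"segment": "A1", "speed_kmh": 18.0}
    yield {"segment": "B2", "speed_kmh": 12.5}


# Roughly: SELECT segment, speed_kmh FROM feed WHERE speed_kmh < 20
slow_traffic = continuous_select(
    sensor_feed(),
    predicate=lambda row: row["speed_kmh"] < 20.0,
    project=lambda row: (row["segment"], row["speed_kmh"]),
)

for result in slow_traffic:
    print(result)
```

Because the generator is lazy, each result is produced only when its input record arrives, and results already delivered are never repeated.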

Rolling updates

Simple queries can be used to compute the average speed of vehicles on any given road segment against rolling time windows that are updated with the arrival of each new datum. More sophisticated queries can combine multiple data streams 'joining' streams on common values or time ranges, perhaps then enhancing streams with data joined from existing historical databases. The result is streams of real-time analytics. Such queries can be used to continually track what is happening on transportation networks and generate alerts when predefined conditions arise, and to provide a feedback loop for traffic management equipment such as speed limit signs or other flow control devices.
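A rolling-window average of the kind described above might be sketched as follows. This is an illustrative Python sketch rather than streaming SQL; the class name `RollingSegmentSpeed`, the 60-second window and the segment identifiers are assumptions, and readings are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque


class RollingSegmentSpeed:
    """Average speed per road segment over a sliding time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        # segment id -> deque of (timestamp, speed) still inside the window
        self._readings = defaultdict(deque)

    def update(self, segment_id, timestamp, speed_kmh):
        """Add one reading and return the refreshed average for its segment."""
        readings = self._readings[segment_id]
        readings.append((timestamp, speed_kmh))
        # Evict readings that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while readings[0][0] <= cutoff:
            readings.popleft()
        return sum(speed for _, speed in readings) / len(readings)


window = RollingSegmentSpeed(window_seconds=60)
print(window.update("A1", 0, 100.0))   # only one reading in the window
print(window.update("A1", 30, 80.0))   # two readings in the window
print(window.update("A1", 70, 60.0))   # the reading from t=0 has expired
```

Only the readings inside the current window are held in memory, so the state stays small however long the feed runs.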

The ITS architects of the transportation agency cited above discovered SQLstream's technology by surveying technology tutorials posted by Google on YouTube. One of these discussed the latest emerging Internet data processing technologies and highlighted Mozilla's Firefox downloads website, which shows second-by-second analytics of downloads of its famous browsers as they happen all around the world. Take a look at http://downloadstats.mozilla.com, where Mozilla kindly promotes SQLstream as the technology powering that site. It was one of the most talked-about sites on the Internet upon release last summer.

The agency's architects decided to approach SQLstream to determine whether it would be able to help realise a vision of truly real-time ITS applications and services. At the time, they were about to pull the trigger on a multi-million dollar roadside sensor deployment but quickly saw that SQLstream could both save them millions and provide a superior solution by instead processing GPS data in real time from sensors installed in vehicles. They understood that 'lighting up' the road system and generating useful ITS analytics would require sensor data from just a small fraction of the overall road traffic. The intention is not to target or track specific vehicles. Instead, the data are anonymised and aggregated in order to protect the privacy of the individual while generating valuable results.

Product spin-offs

SQLstream is now marketing the resulting ITS solutions worldwide. The first solution is called 'ITS Insight'.

This provides a wide range of continuous analytics that give real-time insights into transportation systems. It processes GPS sensor data from vehicles on the road system (including freeways/motorways, major and minor roads), data that is continually transmitted over the wireless Internet. The time interval between transmissions from a given sensor can vary from sub-second to every few minutes. Each sensor transmits its geospatial and temporal location along with instantaneous heading plus speed and a unique identifier that includes vehicle type. Going forward, we expect a wide range of other data to be transmitted and processed including, for example, the health of the vehicle's engine, its instantaneous fuel consumption, acceleration/retardation, tyre pressure, current environmental conditions, degree of road adhesion and so on. The data can then feed a wide range of real-time applications and services.
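The fields each sensor is described as transmitting can be pictured as a simple record type. The sketch below is an assumption throughout: the article does not specify a wire format, so the comma-separated layout, the `type-number` identifier convention and the names `GpsReading` and `parse_reading` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpsReading:
    sensor_id: str       # unique identifier, assumed to embed the vehicle type
    vehicle_type: str
    timestamp: float     # seconds since the Unix epoch
    latitude: float
    longitude: float
    heading_deg: float   # 0 <= heading < 360
    speed_kmh: float


def parse_reading(line):
    """Parse one hypothetical CSV line: id,timestamp,lat,lon,heading,speed."""
    sensor_id, ts, lat, lon, heading, speed = line.strip().split(",")
    # Assumed convention: the identifier looks like "truck-0042".
    vehicle_type = sensor_id.split("-", 1)[0]
    return GpsReading(
        sensor_id=sensor_id,
        vehicle_type=vehicle_type,
        timestamp=float(ts),
        latitude=float(lat),
        longitude=float(lon),
        heading_deg=float(heading) % 360.0,
        speed_kmh=float(speed),
    )


reading = parse_reading("truck-0042,1328227200,51.5,-0.12,270,88.5")
print(reading.vehicle_type, reading.speed_kmh)
```

Additional measurements such as fuel consumption or tyre pressure could be added as further optional fields on the same record.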

ITS Insight analyses the traffic continuously for each road segment across the entire road network. It delivers real-time visibility into the speed of traffic for each road segment and for each vehicle type with comparisons against historical norms and, of course, the posted speed limits. It can generate real-time recommendations for posted speed limits in order to maximise traffic throughput and minimise congestion. Even small savings in per-vehicle travel time and greenhouse gas emissions quickly translate into billions of dollars of savings per year when aggregated over millions of vehicles on the public road system, and help fight climate change.
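As a rough illustration of comparing a segment's current speed with its historical norm and posted limit, consider the Python sketch below. The function name `classify_segment`, the three labels and the 0.6 congestion ratio are assumptions chosen for the example, not parameters of ITS Insight.

```python
def classify_segment(current_kmh, historical_kmh, limit_kmh, slow_ratio=0.6):
    """Label a road segment from its current average speed.

    slow_ratio is an assumed threshold: traffic moving slower than this
    fraction of the historical norm is treated as congested.
    """
    if current_kmh > limit_kmh:
        return "speeding"
    if current_kmh < slow_ratio * historical_kmh:
        return "congested"
    return "normal"


print(classify_segment(current_kmh=40.0, historical_kmh=100.0, limit_kmh=110.0))
```

Applied to every update of a segment's rolling average, a rule like this would yield a stream of status changes rather than a periodic report.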

ITS Insight can also monitor traffic for potential accidents, traffic signal failure, overloading of infrastructure, and tailbacks at intersections or points of motorway ingress/egress. It continually updates a large historical data warehouse - not of the raw traffic events but rather of the 'cooked' traffic information. It lowers the data volumes and increases the accuracy of stored information, making any downstream historical trend analysis and data mining inherently more valuable. ITS Insight can provide real-time point-to-point travel time based on instantaneous analysis of current real-time driving conditions by splitting up each point-to-point itinerary requested into a number of connected road segments. It can then analyse the conditions of the road both currently and at the anticipated travel time by predicting the road conditions for the time at which the vehicle would hit each segment. It is also able to factor in current and anticipated weather conditions.
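The segment-by-segment travel time estimate described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: `estimate_travel_time` and the `predicted_speed` callback are hypothetical names, and the predictor here is a toy function standing in for whatever model forecasts conditions on each segment.

```python
def estimate_travel_time(segments, depart_time, predicted_speed):
    """Estimate journey time in seconds over connected road segments.

    segments: list of (segment_id, length_km) in travel order.
    predicted_speed: callable (segment_id, time) -> expected speed in km/h
    at the moment the vehicle is expected to reach that segment.
    """
    clock = depart_time
    for segment_id, length_km in segments:
        speed = predicted_speed(segment_id, clock)
        if speed <= 0:
            raise ValueError(f"no usable speed prediction for {segment_id}")
        # Advance the clock by the time needed to cross this segment.
        clock += length_km * 3600.0 / speed
    return clock - depart_time


def toy_predictor(segment_id, time):
    # Hypothetical forecast: traffic slows down ten minutes from now.
    return 60.0 if time < 600 else 30.0


route = [("a", 10.0), ("b", 5.0)]
print(estimate_travel_time(route, depart_time=0, predicted_speed=toy_predictor))
```

The key point is that each segment is evaluated at the time the vehicle is expected to arrive there, not at the time of the request.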

SQLstream is also working on solutions covering the tracking of heavy vehicles over the road system in order to avoid overloading of infrastructure and to help monitor the expected wear and tear arising from the actual tonnage of vehicles passing over the infrastructure in any given time period. The solution compares actual loads with those for which the infrastructure was originally designed and maintains a data warehouse ('wearhouse'?), which it continually updates so that road planners can continually revise and optimise plans to update, strengthen or repair roadways, surfaces and infrastructure. It uses Google Earth to provide an intuitive user interface that is easily accessible over the Internet.
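Comparing accumulated tonnage with design load might look something like the following Python sketch. The class name `LoadMonitor`, the structure identifiers and the tonnage figures are invented for illustration; the article does not describe how the actual solution models load.

```python
from collections import defaultdict


class LoadMonitor:
    """Track tonnage per structure against its design load for one period."""

    def __init__(self, design_tonnes):
        # structure id -> tonnage the structure was designed to carry
        # in the period being monitored (assumed input)
        self.design_tonnes = dict(design_tonnes)
        self.actual_tonnes = defaultdict(float)

    def record_crossing(self, structure_id, vehicle_tonnes):
        """Add one vehicle crossing and return the updated utilisation."""
        self.actual_tonnes[structure_id] += vehicle_tonnes
        return self.utilisation(structure_id)

    def utilisation(self, structure_id):
        """Actual tonnage as a fraction of design tonnage."""
        return self.actual_tonnes[structure_id] / self.design_tonnes[structure_id]

    def overloaded(self):
        """Structures whose actual tonnage exceeds the design figure."""
        return [s for s in self.design_tonnes if self.utilisation(s) > 1.0]


monitor = LoadMonitor({"bridge-7": 100.0, "bridge-9": 500.0})
for tonnes in (40.0, 40.0, 30.0):
    monitor.record_crossing("bridge-7", tonnes)
print(monitor.overloaded())
```

Because the totals are updated with each crossing, planners would see utilisation change as it happens rather than after a periodic batch run.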

Future releases will include applications for shipping, transportation and logistics companies which will allow them to deliver goods, services and passengers in a better and more timely fashion over roadways, railways and waterways. One example is ensuring container trucks deliver on time and to schedule for the onward shipping leg. Trucks crossing borders have the potential to wreak havoc with any state's road system and it is critical for such deliveries to arrive in a predetermined time window in order to minimise congestion, wasted fuel and related exhaust gas emissions; this also helps operators to avoid incurring fines.

Potential

Relational stream computing based on the familiar, high-level SQL language has the power to change both the economics and space of possibilities for ITS applications.

One can now create 'transparent box' ITS applications where the customer, or their partners and integrators, can maintain and enhance such applications using readily available SQL consultants and in-house expertise. Not only can such changes now be made at modest cost, but because the SQL language is so concise and powerful, such changes can be made in short order. Existing features can be tweaked in minutes rather than in weeks and new features can be added in days rather than months. We have found that a few statements of streaming SQL logic can replace the need for thousands of lines of conventional program code. In comparison, persuading an existing shrink-wrapped application vendor into delivering new custom features is a process that is both expensive and glacially slow to deliver.

In summary, stream computing, in conjunction with today's 3G wireless Internet, inexpensive GPS sensors and powerful data rendering technologies such as Google Earth, is dramatically increasing the scope and power of ITS applications and is enabling a brave new (continuously updated) world of real-time data and services.
