RATP and its open data sets

I have written about the data that RATP offers to enthusiasts. Some data sets seem easier to use as part of a web or mobile application:

  • GTFS
  • Real-time GTFS
  • GPS positions
  • Official colors of bus lines

Some data sets seem harder to use:

  • number of passengers entering a station
  • air quality in an underground station

Let’s see how the open data could be used.


Data set: GTFS

No big surprise here. The vast majority of the mobile apps that help people travel within a city rely heavily on GTFS information. However, the RATP GTFS data sets do not fully follow the standard. I understand that manpower is limited and that the open data is a gift to the community. Still, there are several things that could change:

  • The exception calendar, which is far too big. Most bus lines strive to respect their timetables, and the drivers are great at doing it, which I appreciate a lot. My remark concerns the format of the data, not the information itself; these are two different things.
  • The timetables for bus lines that stay in service past midnight. The GTFS standard uses 24-hour notation and expresses times after midnight that belong to the same service day by adding 24 to the hour. For example, 00:30 becomes 24:30. The data is still workable because of its redundancy: the stop_sequence column of the stop_times file helps a lot here.

These are not big changes, and for the moment they are not a problem for me. In fact, I am working on a filter that could be applied to the GTFS data set in order to make it compatible with other tools.
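To give an idea of what such a filter might look like for the past-midnight problem, here is a minimal sketch in Python. It is not the filter I am working on, only an illustration under some assumptions: the rows of one trip have already been read from stop_times (for example with csv.DictReader), times are written as HH:MM:SS, and the trip starts before midnight. The function names (to_seconds, to_gtfs_time, fix_trip) are made up for this example. The idea is to walk the stops in stop_sequence order and to add 24 hours whenever the clock appears to go backwards.

```python
def to_seconds(hhmmss):
    """Convert an HH:MM:SS string into seconds since the start of the service day."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 3600 + m * 60 + s


def to_gtfs_time(seconds):
    """Convert seconds back to HH:MM:SS, allowing hours of 24 and above."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def fix_trip(rows):
    """Return copies of the stop_times rows of ONE trip with past-midnight
    times rewritten in the 24+ notation.

    Assumption: the trip starts before midnight. A trip that runs entirely
    after midnight cannot be detected this way.
    """
    fixed = []
    offset = 0        # seconds added once the trip has crossed midnight
    previous = None   # last time seen, in seconds
    for row in sorted(rows, key=lambda r: int(r["stop_sequence"])):
        new_row = dict(row)
        for field in ("arrival_time", "departure_time"):
            if not row.get(field):
                continue  # empty times are allowed at some stops; leave them alone
            t = to_seconds(row[field]) + offset
            if previous is not None and t < previous:
                # The clock went backwards: the trip crossed midnight.
                offset += 24 * 3600
                t += 24 * 3600
            new_row[field] = to_gtfs_time(t)
            previous = t
        fixed.append(new_row)
    return fixed
```

For a trip that stops at 23:50:00, 00:05:00 and 00:30:00, the last two stops come out as 24:05:00 and 24:30:00. Grouping the rows of the whole file by trip_id before calling fix_trip is left out of the sketch.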


Data set: real-time GTFS API

The API uses SOAP. SOAP is a very mature technology that has proven very useful in the past. However, when it comes to data traffic, the envelope (which is redundant information) takes up much more space than the useful part of the data. I suppose that some filtering is necessary. In the vast majority of cases, the volume of relevant data is small enough to be sent via Twitter.
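The kind of filtering I have in mind can be sketched in a few lines of Python. Everything about the sample message is an assumption: the operation and element names (getNextDeparturesResponse, departure, line, direction, wait) and the urn:example:hypothetical namespace are invented for illustration and are not taken from the RATP API, and I assume a SOAP 1.1 envelope. The sketch keeps only the text found inside the SOAP body and compares its size with the size of the whole message.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (assumed; the real service may use another version).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Hypothetical response, invented for this example.
SAMPLE = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <getNextDeparturesResponse xmlns="urn:example:hypothetical">
      <departure><line>38</line><direction>Porte d'Orleans</direction><wait>4 mn</wait></departure>
    </getNextDeparturesResponse>
  </soapenv:Body>
</soapenv:Envelope>"""


def useful_text(soap_xml):
    """Return the non-empty text values found inside the SOAP body."""
    root = ET.fromstring(soap_xml)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("no SOAP body found")
    return [text.strip() for text in body.itertext() if text.strip()]


def payload_ratio(soap_xml):
    """Share of the message size (in UTF-8 bytes) taken by the useful text."""
    payload = " ".join(useful_text(soap_xml))
    return len(payload.encode("utf-8")) / len(soap_xml.encode("utf-8"))


if __name__ == "__main__":
    print(" ".join(useful_text(SAMPLE)))
    print(f"{payload_ratio(SAMPLE):.0%} of the message is useful text")
```

On this made-up message the useful part is a couple of dozen characters, which is a small fraction of what travels over the network.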


Data set: GPS positions

There are several data sets:

  • train and bus stops
  • public toilet cabins

The train and bus stops are a subset of the GTFS data set. It is nice to have them separately, as most people don’t need to download a big zip file. The toilet cabins, on the other hand, are a very useful data set. I like the additional information such as the open/closed status of the cabins.


Data set: official colors

This data set contains the official logos for every bus and train line. I would have preferred to know the exact RGB color of the lines. Fortunately, the image files contain this piece of information. With the exception of some special lines, a local image processing filter is enough to extract the exact color value. I will talk about it in a later post.
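Until then, here is one simple way such a filter could work, as a sketch and not necessarily the method I will describe later. It assumes the logo has already been decoded into a flat list of (R, G, B) tuples, for instance with an imaging library, and that the line color is the most frequent pixel value once white and black (background and lettering) are ignored. The function name dominant_color and the default list of ignored colors are choices made for this example.

```python
from collections import Counter


def dominant_color(pixels, ignore=((255, 255, 255), (0, 0, 0))):
    """Return the most frequent color of an image as a #RRGGBB string.

    pixels: iterable of (R, G, B) tuples with values from 0 to 255.
    ignore: colors to skip, by default pure white and pure black
            (assumed to be background and lettering).
    """
    counts = Counter(p for p in pixels if p not in ignore)
    if not counts:
        raise ValueError("no pixel left after filtering")
    (r, g, b), _ = counts.most_common(1)[0]
    return f"#{r:02X}{g:02X}{b:02X}"


if __name__ == "__main__":
    # Tiny made-up "logo": white background, a flat color, a few blurred edge pixels.
    logo = [(255, 255, 255)] * 10 + [(255, 190, 0)] * 6 + [(254, 189, 1)] * 2
    print(dominant_color(logo))
```

Counting exact values works well for flat logos, because the blurred pixels along the edges are all slightly different and none of them wins the count. Logos with gradients or several colors would need something more elaborate.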








