How Google and Portland’s TriMet Set the Standard for Open Transit Data

With national data transparency efforts like President Obama’s Open Data Initiative and municipal projects like New York City’s BigApps or San Francisco’s DataSF, government agencies across the country have been opening their raw data sets, some more reluctantly than others. With the debut of City-Go-Round and the media coverage of transit data transparency, many transit operators have taken steps to release their schedule and route information to third-party developers, who in turn use the data to build an array of applications that improve the rider experience.

If those agencies haven’t already formatted their data in the Google Transit Feed Specification (GTFS), the industry standard, they are likely rushing to do so now. How Google’s specification became the common language for transit data is an interesting story and, as with many tales of transit innovation, it begins in Portland, Oregon.

After traveling internationally in the summer of 2005, Bibiana McHugh, an IT Manager at Portland’s TriMet transit agency, was frustrated that she couldn’t access transit information on a mapping program like Mapquest and certainly couldn’t plan a trip by transit with the same ease as a driving trip. When she returned stateside, she sent inquiries to Mapquest, Yahoo!, and Google, asking each if they had plans to incorporate transit data into their mapping services and if TriMet could partner in the endeavor.

Of the three, only Google replied. As it happened, software engineer Chris Harrelson had been using his 20 percent time to interface transit data with Google Maps, a project that became the Google Transit Trip Planner. TriMet worked with Google to prepare TriMet’s data set in a format that would work for Google Maps, a difficult task, according to McHugh.

"Transit data is extremely complex," she said. "There is a temporal element and spatial element and it takes a relational database in order to manage all of that information."
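
To make McHugh’s point concrete: the published specification defines a feed as a set of comma-separated text files that reference one another by ID, much like tables in a relational database. The following is only a minimal sketch, not TriMet’s or Google’s code. It assumes two tiny, made-up samples shaped like the specification’s stops.txt (the spatial element) and stop_times.txt (the temporal element); the stop names, coordinates, times, and the helper functions read_rows and trip_itinerary are all hypothetical.

```python
import csv
import io

# Made-up sample rows in the shape of two GTFS files (all values are hypothetical).
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Example Square,45.5190,-122.6790
S2,Sample Street,45.5230,-122.6810
"""

STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:00:00,S1,1
T1,08:07:00,08:07:00,S2,2
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def trip_itinerary(trip_id, stops_text=STOPS_TXT, stop_times_text=STOP_TIMES_TXT):
    """Join stop_times (when) to stops (where) on stop_id for one trip."""
    stops = {row["stop_id"]: row for row in read_rows(stops_text)}
    visits = [r for r in read_rows(stop_times_text) if r["trip_id"] == trip_id]
    visits.sort(key=lambda r: int(r["stop_sequence"]))
    itinerary = []
    for visit in visits:
        stop = stops[visit["stop_id"]]
        itinerary.append((
            visit["departure_time"],
            stop["stop_name"],
            float(stop["stop_lat"]),
            float(stop["stop_lon"]),
        ))
    return itinerary


if __name__ == "__main__":
    for when, name, lat, lon in trip_itinerary("T1"):
        print(when, name, lat, lon)
```

Even this toy version needs a join across two tables to answer "where is this trip, and when?" A real feed adds routes, trips, and service calendars on top of that.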

She added, "A lot of agencies have this fear that it will be misrepresented or won’t be used accurately."

Because TriMet was proactive with its data, the resulting GTFS very closely resembled the operator’s data feed. Google Transit Trip Planner launched on December 7, 2005, and for most of the first year, TriMet was the only operator available on Google Maps. In September 2006, five more cities got on board: Eugene, OR; Honolulu, HI; Pittsburgh, PA; Seattle, WA; and Tampa, FL.


Currently, Google Maps has agreements with over 100 transit operators in the U.S. and over 400 around the world.

In addition to fears by some operators about misrepresentation of the data, many operators were simply reluctant to open data for fear of bad publicity, according to Joe Hughes, a Google Transit software engineer.

"Transit agencies are used to being beat up in the press. Public transit has been the underdog since the 1950s and I think it’s made the agencies pretty conservative," he said.

Hughes, who began his transit mapping career in Pittsburgh in 2002, several years before joining Google, said that prior to GTFS, many software engineers had to "data scrape" operator websites or submit Freedom of Information Act requests to obtain data. Often that data was mailed on a CD and could be out of date by the time it was turned over.

With the exception of TriMet and the other early adopters, "It’s been a slow and painful process to open this stuff up," said Hughes. "At first there was no infrastructure available to do this."

McHugh echoed the sentiment, suggesting that many agencies had outdated assumptions about data and were reluctant to provide it for free. "For some agencies, they are used to making money off it. When they asked why we aren’t charging for our data, the answer is that the taxpayers have already paid for it and the benefits are so big for openness."

In Portland, said McHugh, their "lawyers are pretty versed with open source. Having open data aligns with our agency’s philosophies. We didn’t even have to think about it."

In the past few years the process has sped up tremendously, according to Hughes. "If you told me even a few years ago that every significant transit agency in the country would open its data, it would have been pretty hard to believe. The U.S. is now ahead of much of the world in releasing data."

Despite this optimism, there are still obstacles to full data transparency, particularly for those software developers not named Google. A number of transit operators, particularly those in the New York metropolitan area, like the New York Metropolitan Transportation Authority (NYMTA) and New Jersey Transit, have licensed their data to Google, but to no one else. Though Google won’t pay for data, its cachet and the ubiquity of its mapping service on personal computers and mobile devices have led agencies to provide GTFS only to Google.

"What Google has is a clearly useful product, PR value, and name recognition," said Hughes of the situation, arguing that the agencies’ release of information to Google is itself a step toward openness. "At least they’re sharing it with one developer, but that’s not the end state. Ideally that data is available to any developer to use."

Matt Lerner, Chief Technology Officer for City-Go-Round, said that sharing with Google alone does not make the data open. Pointing to the GTFS Data Exchange and City-Go-Round’s top-ten list of transit operators that don’t open their data, he said millions of people in metropolitan regions don’t have access to open data, though they easily could.

"All the operators have to do is provide a URL where someone can download the feed. They already have the data, all they have to do is let the data be downloaded. It’s not open until they give that URL."

Lerner also lauded Google for its impact: "The agencies wouldn’t have ever put their data into a standard format if it weren’t for Google. It took a really big company to get the agencies to have a standard format at all."

Now that GTFS is the baseline, Google is considering dropping its name from the title and changing it to the "General Transit Feed Specification." Hughes proposed the change on several transit developer listservs and said the renaming could come as early as next week.

"The name Google Transit Feed Specification is a non-name," said Hughes. "We didn’t want to be presumptuous by saying this would become the standard when we started. Having Google in there doesn’t really reflect all the different apps that are being used with the format." 

Hughes reiterated that the goal from the beginning was to make transit directions and maps as commonplace and simple as driving directions.

"I hope we’re helping to bring transit back on equal footing with driving."

  • Maps and transit – heaven!

  • This is all very nice but in case y’all aren’t paying attention transit is in big trouble!
    Forget the fluff and concentrate on providing bus and rail service!
    It aint no good having all this junk while service just keeps getting gutted!

  • In reply to the second comment: that is exactly what the operating people always say. But the reality is the absence of information leaves the public wondering if the agency is capable of running the place properly. When the information is there, and the public starts to participate in on-time performance and understanding why trains and buses don’t run properly, then they will also be telling their political leaders to better fund transit and what they should focus on. When they have more faith in transit, elected officials will be getting more calls to fund it. But without faith (which is what we have now) people logically say to themselves – money is better spent elsewhere. We need the info to stay on top of the operation.

  • Matthew

    It’s very important that people be able to quickly find out how to get from one place to another using transit. If that’s one of the options right next to driving directions, or if users have the convenience of easily finding out how long a particular journey will take and when the next departure is, transit will be much better able to compete with cars. I think it’s more than fluff.

  • #2:
    Google Transit actually provides a huge benefit to struggling transit operators. People who never would have considered a bus trip are much more likely to do so when such a trip pops up next to their Google driving directions. Some agencies that have released Google Transit data (a task that is usually completed at minimal cost) have seen substantial gains in off-peak ridership. Regular commuters (who mostly travel during peak times) know their schedules, but people who have never ridden transit before are likely to consider it for daily errands if they are given step-by-step instructions while doing so, and that is exactly the effect operators have seen. Consequently, fare revenues rise without a significant increase in operating costs.

  • A Commuter

    Open data and applications may work fine for many transit agencies, but for the New York MTA data secrecy seems to be their mantra.

    Many people have been wanting to have transit applications for New York but cannot because of the lack of data.

    Numerous requests for timing and operations data are denied, probably because they think it would expose the inaccuracy of their published On Time Performance data.

  • Really great story of people like Bibiana McHugh having a vision and making it real. Thanks for sharing this. Standards drive innovation but when it comes to govt data it’s the exception. We need to change that. There are some promising examples like Open311 API but we need more people solving this problem. To those that say this is fluff, I would argue that liberating taxpayer funded data is the right of the people. Government is just a steward. 

  • AL M

    The only data that Trimet allows the public to see is the data for arrival times for its vehicles.

    Every other piece of data regarding the agency is kept hidden away behind locked walls.
    The only way to access the nuts and bolts of Trimet data is through a lengthy and complex public records request.
    Trimet is one of the most secretive transit districts in America in reality.

  • FYI: The name of the specification did change to General Transit Feed Specification. Also, GTFS Realtime status (which includes vehicle locations/positions) has become far more prevalent. For example, Google announced realtime information in the U.K., the Netherlands, Budapest, Chicago, San Francisco, and Seattle at the beginning of June 2015. Another benefit that was not mentioned in the comments below is how such transit apps make it easier for tourists. Tourists have already learnt how to use the apps in their home cities, so on arrival at a new location there is no longer the need to come to grips with the city’s public transport website/app.