
The promised benefits of the General Transit Feed Specification (GTFS) Schedule and Realtime standards depend on the quality of the underlying data. Despite this reliance, there has been relatively little research on techniques and strategies for assessing GTFS accuracy. The need for such assessment is growing as federal and state governments increasingly require transit agencies to make these data publicly available. This research fills this gap by presenting a suite of methods and metrics to assess the temporal accuracy of GTFS Realtime feeds and the spatial accuracy of GTFS Schedule feeds. The temporal assessment demonstrates an approach to collecting and cleaning TripUpdate messages in order to identify (and derive) a set of values for measuring the accuracy of vehicle arrival predictions. These metrics are designed to give transit agencies insight into the quality of the data they provide to customers, expressed in terms of how inaccuracies affect the customer experience. The spatial assessment demonstrates an approach to matching scheduled information on the location of transit routes and stops with the actual travel patterns observed in realtime VehiclePosition messages. The measured divergence between planned and provided transit service yields a series of location accuracy metrics. All of the proposed metrics can be scaled to examine GTFS accuracy from the stop level to the systemwide level, and all can be generated from publicly available GTFS feeds without any additional data sources. Finally, the metrics can help transit agencies continuously assess, and therefore improve, the quality of the GTFS data they share with the public.
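To make the two kinds of measurement concrete, the following is a minimal Python sketch and not the report's actual method. The abstract does not define its metrics, so everything here is an illustrative assumption: the function names, the sign convention for error, and the use of great-circle distance as a stand-in for spatial divergence. It also assumes that the actual arrival time at a stop has already been derived, since GTFS Realtime does not report it directly.

```python
import math
from statistics import mean


def prediction_errors(predictions, actual_arrival):
    """Compare arrival predictions for one trip/stop pair with the actual arrival.

    predictions: list of (issued_at, predicted_arrival) in POSIX seconds
                 (hypothetical structure, e.g. extracted from TripUpdate messages).
    actual_arrival: POSIX seconds, assumed to have been derived beforehand.
    Returns a list of (horizon_s, error_s). A positive error means the vehicle
    arrived later than predicted (assumed sign convention).
    """
    results = []
    for issued_at, predicted_arrival in predictions:
        if issued_at >= actual_arrival:
            continue  # drop predictions issued at or after the arrival itself
        horizon_s = actual_arrival - issued_at
        error_s = actual_arrival - predicted_arrival
        results.append((horizon_s, error_s))
    return results


def mean_absolute_error(errors):
    """Average size of the prediction error, ignoring its direction."""
    return mean(abs(error_s) for _, error_s in errors)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points,
    e.g. a scheduled stop location and an observed vehicle position."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


# Illustrative values only, not data from the report.
sample_predictions = [(1000, 1300), (1100, 1250), (1400, 1290)]
sample_errors = prediction_errors(sample_predictions, actual_arrival=1280)
sample_mae = mean_absolute_error(sample_errors)
```

Per-stop values like these could then be averaged by route or across the whole system, in line with the stop-to-systemwide scaling described above.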


Transportation Technology, Transit and Passenger Rail


Public transit, Data, Real time information, Problem solving, Statistical analysis.


Categorical Data Analysis | Statistical Methodology | Transportation
