The Chicago Transit Authority (CTA) has a fleet of nearly 1,900 buses and operates more than 120 bus routes over 1,300 route-miles. Each bus in the CTA's fleet is equipped with a GPS device that tracks its location. Buses report their locations approximately every minute to the CTA's servers, where the latest vehicle positions are accessible through the Bus Tracker API. In addition to location information, the API provides estimated arrival times and geospatial information for each bus route.
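As a rough illustration of what a single vehicle-position report looks like once it has been decoded, here is a minimal sketch that turns a Bus Tracker style JSON payload into plain Python records. The field names (`bustime-response`, `vehicle`, `vid`, `rt`, `tmstmp`, `lat`, `lon`) and the timestamp format are assumptions based on my reading of the API's vehicle endpoint and should be checked against the official documentation. The `parse_vehicles` helper and the sample payload are hypothetical and are not the project's actual collection code.

```python
from datetime import datetime


def parse_vehicles(payload):
    """Extract one record per vehicle from a decoded JSON response.

    Assumes the payload nests a list of vehicles under
    payload["bustime-response"]["vehicle"]; returns an empty list otherwise.
    """
    vehicles = payload.get("bustime-response", {}).get("vehicle", [])
    records = []
    for v in vehicles:
        records.append({
            "vid": v["vid"],    # vehicle (bus) identifier
            "rt": v["rt"],      # route designator
            # Assumed timestamp format: "YYYYMMDD HH:MM"
            "time": datetime.strptime(v["tmstmp"], "%Y%m%d %H:%M"),
            "lat": float(v["lat"]),
            "lon": float(v["lon"]),
        })
    return records


# Made-up sample payload for illustration only.
sample = {
    "bustime-response": {
        "vehicle": [
            {"vid": "1001", "rt": "55", "tmstmp": "20190107 07:00",
             "lat": "41.7945", "lon": "-87.6840"},
            {"vid": "1002", "rt": "55", "tmstmp": "20190107 07:01",
             "lat": "41.7946", "lon": "-87.6501"},
        ]
    }
}

positions = parse_vehicles(sample)
```

Polling roughly once a minute and appending these records to storage would produce the kind of position history analyzed on this page.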
This is an ongoing project to analyze the geopositional data generated by the CTA's bus fleet and collected from the Bus Tracker API to identify issues with the existing service. Some of the questions the project hopes to answer are: What are the travel and wait times for each bus route, and how do these times compare to the scheduled service? Which routes experience bus bunching most frequently? When and where along a given route does bus bunching initiate and dissipate? Are there areas of the city with better bus performance than others? The current project page focuses on the data and analysis of a handful of individual bus routes. Data on this page was gathered over 181 days between January 1, 2019 and June 30, 2019, though data for all routes is still actively being collected through the rest of the year. A future project page will provide analysis for each major CTA bus route and examine the quality of bus service at the neighborhood level. Read more about future plans for the project below.
Select a bus route from the dropdown menu below to learn more about it.
The joy-plot style histograms show when buses depart from an origin terminal heading toward a destination terminal. Clusters with sharp peaks indicate that buses leave at similar times each day, whereas more spread-out clusters indicate variability in the departure time.
Bus bunching occurs when two or more buses from the same route, heading toward the same terminus, arrive together or in close succession at a bus stop even though a longer gap between them was scheduled. Bunching has a number of possible causes, including bus operators driving too fast or too slow, the volume of traffic on the road, and the number of passengers boarding or alighting at each stop. Whatever the cause, bunching results in unreliable service for passengers, who expect evenly spaced and predictable buses. Bus bunching can be defined in terms of the headway between buses, where headway is the time between two consecutive buses passing the same stop. For example, if a bus arrives at Western Ave at 7:00 AM and the next bus arrives at the same stop at 7:02 AM, then the headway between the buses is two minutes. If the buses were scheduled to arrive 15 minutes apart, then the two buses could be considered bunched. There is no standard headway threshold that defines bus bunching, so the choice depends on the level of scheduled service on a route. The heat map below displays the proportion of observed bus bunching incidents along the selected route during the 6-month data collection period.
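The headway definition above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the heat map: the function names `headways` and `bunched_fraction` are hypothetical, and the 3-minute threshold is an arbitrary assumption, since no standard threshold exists.

```python
from datetime import datetime, timedelta


def headways(arrivals):
    """Return the time gaps between consecutive arrivals at one stop."""
    ordered = sorted(arrivals)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def bunched_fraction(arrivals, threshold):
    """Fraction of headways at or below the (assumed) bunching threshold."""
    gaps = headways(arrivals)
    if not gaps:
        return 0.0
    return sum(gap <= threshold for gap in gaps) / len(gaps)


# Made-up arrivals at a single stop, echoing the 7:00 AM / 7:02 AM example.
arrivals = [
    datetime(2019, 1, 7, 7, 0),
    datetime(2019, 1, 7, 7, 2),
    datetime(2019, 1, 7, 7, 17),
    datetime(2019, 1, 7, 7, 32),
]

gaps = headways(arrivals)  # gaps of 2, 15, and 15 minutes
fraction = bunched_fraction(arrivals, threshold=timedelta(minutes=3))
```

With a 3-minute threshold, one of the three headways counts as bunched. On a route with very frequent scheduled service, a smaller threshold would likely be more appropriate.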
Notice that most bus bunching incidents are observed during the morning and evening rush periods. This is partly because more buses are dispatched during rush hour, as shown in the previous heat map, so there are more opportunities for bunching to occur. The volume of car traffic on the road during rush hour also likely affects bus headways. Because traffic congestion varies along the route, some buses will travel faster or slower at different points in their trips, causing the headway between buses to become uneven.
Using the dropdown boxes on the left, select an origin bus stop, a destination bus stop, and a day of the week to generate scatter plots detailing trip and wait times for the Garfield bus throughout the day. The top plot shows the travel times for buses going between the origin and destination stops. The bottom plot shows the longest possible wait times between buses leaving the origin stop heading toward the destination stop. It is necessary to provide a destination stop when computing wait times, since not all scheduled bus trips on the same route visit the same sequence of stops. The red curves indicate the median travel and wait times calculated over 15-minute intervals. Read a written summary of the travel and wait times at a particular time of day by selecting values from the hour and minute dropdown menus. The pale blue boxes highlight a 30-minute interval around the time being considered.
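To make the red median curves concrete, here is a small sketch of how medians over 15-minute intervals might be computed. It assumes that the "longest possible wait" is the gap between consecutive departures from the origin stop, which is the wait for a rider who just misses a bus. That interpretation, the function names `longest_waits` and `median_by_interval`, and the sample data are my own illustrative assumptions rather than the code used for the plots.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def longest_waits(departures):
    """Pair each departure with the gap (in minutes) until the next one."""
    ordered = sorted(departures)
    return [
        (earlier, (later - earlier).total_seconds() / 60)
        for earlier, later in zip(ordered, ordered[1:])
    ]


def median_by_interval(observations, minutes=15):
    """Median value per time-of-day bin.

    observations: iterable of (datetime, value in minutes).
    Returns {bin start in minutes after midnight: median value}.
    """
    bins = defaultdict(list)
    for when, value in observations:
        minute_of_day = when.hour * 60 + when.minute
        bins[minute_of_day // minutes * minutes].append(value)
    return {start: median(values) for start, values in sorted(bins.items())}


# Made-up travel times (minutes) for trips on several Mondays.
travel = [
    (datetime(2019, 1, 7, 7, 1), 22.0),
    (datetime(2019, 1, 14, 7, 9), 26.0),
    (datetime(2019, 1, 21, 7, 14), 24.0),
    (datetime(2019, 1, 7, 7, 20), 30.0),
]
travel_medians = median_by_interval(travel)

# Made-up departures from the origin stop on a single day.
departures = [
    datetime(2019, 1, 7, 7, 0),
    datetime(2019, 1, 7, 7, 10),
    datetime(2019, 1, 7, 7, 25),
]
wait_medians = median_by_interval(longest_waits(departures))
```

Bins are keyed by minutes after midnight, so `420` is the 7:00 to 7:15 AM interval. Grouping by time of day rather than by full timestamp lets trips from different days contribute to the same interval.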
I am currently collecting geopositional data from 123 CTA bus routes (nearly every route, with the exception of seasonal and special services). I began data collection right before January 1, 2019 and will continue collecting data for the rest of the year. I intend to update the project page with new visualizations and improved analysis for each bus route throughout the year. Beyond analyzing travel and wait times and bus bunching, I aim to quantify the degree to which bus routes adhere to or deviate from the CTA's official bus schedules. Additionally, I hope to analyze the quality of service by Chicago neighborhood and identify areas of the city where bus service could be improved.
This project was created by Spencer Chan as part of the ChiPy Mentorship program. Special thanks to Matt Hall, my mentor, for his patience and support while I learned Python for the first time, as well as Ray Berg, the mentorship director, for his hard work organizing a fun and successful program. Further details about the project, including the project's back story, my experience in the ChiPy mentorship, and data processing are discussed in a series of blog posts. The SciPy Stack, Jupyter, Bootstrap, and D3.js were used in the creation of this project. Flask and Bokeh were used during production and were still in use at the time the project was presented at the July 2017 ChiPy meeting. Data was accessed from the Chicago Transit Authority using their Bus Tracker API.