Riding the Data Wave: How Time Series Analysis Transforms Public Transport

Dec 02, 2024

When it comes to public transport, understanding the rhythm of commuter movement is crucial for delivering a seamless experience. Whether you’re planning schedules, allocating buses, or predicting peak demand, time series analysis can be your secret weapon. But what is it, and why does it matter for the world of public transport? Let’s dive into this fascinating topic in a conversational and approachable way.

man surfing with big wave — Photo by Marvin Meyer on Unsplash

What is Time Series Analysis, and Why Should Public Transport Care?

Imagine this: You’re managing a city’s bus network. Every day, you’re faced with a seemingly chaotic stream of data—ridership numbers, ticket sales, delays, weather conditions. This isn’t just random chaos. Hidden in this data are patterns that can help you plan better routes, anticipate bottlenecks, and optimise fleet allocation.

Time series analysis is the method we use to uncover these patterns. It’s all about looking at how numbers change over time—like daily passenger counts or seasonal peaks in ticket sales—and using those insights to make informed decisions. It’s the difference between reacting to problems and proactively addressing them.

The Building Blocks of Time Series Data

Before diving into complex algorithms, let’s break time series data into simple, relatable components:

Level: The average passenger count across time. Think of it as your baseline.
Trend: Are more people riding buses year after year? That’s a trend.
Seasonality: Peaks during holidays, school term rushes, or weekend lulls—seasonality is all about recurring patterns.
Residuals: The unpredictable bits, like sudden spikes due to concerts or weather disruptions.

Understanding these components helps us get a clear picture of what’s happening and why.
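To make the four components concrete, here is a minimal sketch of an additive decomposition in plain Python. The ridership numbers, the weekly pattern, and the function name `decompose_additive` are all made up for illustration, and the sketch assumes an odd seasonal period (7 days) so the moving average can be centred without extra weighting.

```python
from statistics import mean

def decompose_additive(series, period):
    """Split a series into level, trend, seasonal and residual parts (additive)."""
    n = len(series)
    level = mean(series)  # the overall baseline
    half = period // 2
    # Trend: centred moving average spanning one full period (odd period assumed).
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = mean(series[t - half:t + half + 1])
    # Seasonality: average detrended value for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [mean(b) for b in buckets]
    centre = mean(raw)  # re-centre so the seasonal effects sum to zero
    seasonal = [raw[t % period] - centre for t in range(n)]
    # Residual: whatever trend and seasonality leave unexplained.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return level, trend, seasonal, residual

# Hypothetical data: four weeks of daily ridership with steady growth
# and a weekly pattern (weekday bump, weekend dip).
pattern = [0, 10, 20, 10, 0, -20, -20]
series = [500 + 2 * t + pattern[t % 7] for t in range(28)]

level, trend, seasonal, residual = decompose_additive(series, 7)
print("level:", level)
print("trend on day 3:", trend[3])
print("weekly seasonal effects:", seasonal[:7])
```

Because this toy series is built from a clean trend plus a clean weekly pattern, the residuals come out as zero. Real ridership data would leave something behind, and that leftover is where the concerts and storms show up.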

Stationary vs Non-Stationary Data: Why It Matters

Now, let’s talk about two types of time series data—stationary and non-stationary—and why this distinction is critical for public transport:

Stationary Data: The data's statistical properties, such as its mean and variance, stay roughly constant over time. Imagine passenger numbers on a bus route that hover around 500 every weekday with slight random variations. These steady patterns are predictable and easy to model.
Non-Stationary Data: The data changes over time. Picture ridership steadily growing year-on-year due to urban expansion or fluctuating wildly during major events. This data needs some tweaking to make it usable for forecasting.

For most public transport systems, raw data is non-stationary because trends and seasonality are a big part of the story. But don’t worry—there are ways to adjust for this.
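One informal way to get a feel for the difference is to compare the first and second halves of a series. The sketch below is only an eyeball heuristic, not a formal test, and both series are invented for illustration.

```python
from statistics import mean, pvariance

def halves_summary(series):
    """Compare mean and variance of the first and second halves of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return {
        "mean_first": mean(first),
        "mean_second": mean(second),
        "var_first": pvariance(first),
        "var_second": pvariance(second),
    }

# Hypothetical weekday counts hovering around 500 (roughly stationary).
steady = [500, 498, 503, 501, 497, 502, 499, 500]
# Hypothetical counts growing by 10 riders per period (non-stationary).
growing = [500 + 10 * t for t in range(8)]

print("steady: ", halves_summary(steady))
print("growing:", halves_summary(growing))
```

For the steady route the two halves look alike. For the growing route the mean shifts noticeably between halves, which is the signature of a trend.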

How We Handle Non-Stationary Data

Non-stationary data needs a bit of TLC before it can be used for analysis. Here’s how we transform it:

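Differencing: Subtracting yesterday’s ridership from today’s, so the series records the day-to-day change rather than the absolute level. This removes a steady trend.
Log Transformations: Compressing large values so that big spikes, like when passenger counts soar during holiday weekends, no longer dominate the series and the variance stays more stable.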
Decomposition: Breaking data into trend, seasonality, and noise to analyse each part separately.

Think of it like preparing a canvas before painting—you’re setting the stage for accurate analysis.
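Differencing and log transforms are simple enough to sketch in a few lines of plain Python. The helper names and the numbers below are illustrative assumptions, not taken from a real network.

```python
import math

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def log_transform(series):
    """Natural log of each value; every value must be strictly positive."""
    return [math.log(x) for x in series]

# Hypothetical ridership growing by 10 passengers a day.
linear_growth = [500, 510, 520, 530, 540]
print(difference(linear_growth))  # [10, 10, 10, 10]: the trend is gone

# Hypothetical ridership doubling each period (multiplicative growth).
doubling = [100, 200, 400, 800]
# After a log transform, the differences become constant (about 0.693 each).
print(difference(log_transform(doubling)))
```

Passing `lag=7` to `difference` on daily data compares each day with the same weekday one week earlier, which is a common way to strip out a weekly pattern.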

time lapse photography of road — Photo by Geoffroy Hauwen on Unsplash

A Practical Example: Weekend Rush on Public Transport

Let’s say you manage a bus network in a bustling city. Over the past few years, you’ve noticed an increase in weekend ridership, but it’s inconsistent. Some weekends are busier than others. Here’s how time series analysis can help:

Step 1: Visualise the Data
Plot weekend ridership over the past two years. You’ll likely see seasonal spikes during holidays.
Step 2: Decompose the Data
Break the data into components to see the trend (overall growth), seasonality (holiday surges), and residuals (unexpected events like festivals).
Step 3: Build a Model
Use techniques like ARIMA or exponential smoothing to predict future demand.
Step 4: Take Action
Based on predictions, add extra buses on weekends when demand is expected to spike. This reduces overcrowding and improves customer satisfaction.
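Steps 3 and 4 can be sketched with simple exponential smoothing, the most basic member of the exponential smoothing family. This is a deliberately small illustration: the weekend counts, the smoothing factor `alpha`, and the bus capacity of 60 are all assumptions, and this basic version ignores trend and seasonality, which ARIMA or Holt-Winters style models would handle.

```python
import math

def simple_exponential_smoothing(series, alpha):
    """Return a one-step-ahead forecast using simple exponential smoothing."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the running estimate.
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical total ridership for the last five weekends.
weekend_riders = [1200, 1250, 1300, 1280, 1350]

forecast = simple_exponential_smoothing(weekend_riders, alpha=0.5)
print(forecast)  # 1310.625

# Step 4: turn the forecast into a fleet decision (capacity is an assumption).
BUS_CAPACITY = 60
buses_needed = math.ceil(forecast / BUS_CAPACITY)
print(buses_needed)  # 22
```

A higher `alpha` reacts faster to recent weekends, while a lower one smooths out one-off surges. In practice you would pick it by checking forecast errors on past data.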

Avoiding Common Pitfalls: Flattening Data and Losing Insights

One common challenge in time series analysis is handling big spikes in data. If you flatten the data too much (e.g., by applying overly aggressive transformations), you might lose critical insights, like why those spikes happen in the first place. For public transport, these spikes often tell a story:

A sudden rise in ridership might indicate a popular event.
A drop could highlight areas where service disruptions occurred.

Instead of flattening the data completely, aim to stabilise it just enough for analysis while retaining the context of those spikes.
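One way to keep the story behind the spikes is to flag them rather than smooth them away. The sketch below uses a modified z-score based on the median absolute deviation (MAD), with the commonly used cutoff of 3.5. The function name and the daily counts are hypothetical.

```python
from statistics import median

def flag_spikes(series, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds the threshold."""
    med = median(series)
    mad = median([abs(x - med) for x in series])
    if mad == 0:
        return []  # no spread to measure against
    return [
        (i, x)
        for i, x in enumerate(series)
        if abs(0.6745 * (x - med) / mad) > threshold
    ]

# Hypothetical daily counts: one surge (an event?) and one drop (a disruption?).
daily = [500, 505, 498, 502, 900, 499, 501, 320, 503]

print(flag_spikes(daily))  # [(4, 900), (7, 320)]
```

The flagged days stay in the dataset. You can then look up what happened on those dates and decide whether to model them separately, instead of letting a transformation quietly erase them.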

Real-World Applications in Public Transport

Time series analysis isn’t just theoretical—it has real, tangible benefits for public transport:

Predicting Peak Hours: Use past data to anticipate when stations will be busiest and deploy additional staff or resources accordingly.
Optimising Schedules: Identify trends in demand and adjust bus or train schedules to match.
Handling Disruptions: Analyse historical data to understand how weather or strikes impact ridership, and create contingency plans.
Enhancing Passenger Experience: By forecasting demand, you can ensure smoother operations, shorter wait times, and happier passengers.

Tools and Techniques for Time Series Analysis

Here are some popular methods to make sense of time series data:

Autocorrelation and Partial Autocorrelation (ACF/PACF): These measure how strongly current values are correlated with past values at each lag. For example, does Tuesday’s ridership tend to track Monday’s, or last Tuesday’s?
Ljung-Box Test: A statistical test that checks whether your model’s errors (residuals) still show autocorrelation. If they resemble white noise, your model has likely captured the predictable patterns.
Augmented Dickey-Fuller (ADF) Test: Tests for a unit root, a common form of non-stationarity. A small p-value suggests the data is stationary; otherwise, transformations such as differencing are likely needed.
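To show what the first two of these compute, here is a minimal sketch of the sample autocorrelation and the Ljung-Box Q statistic in plain Python. The series is made up, and the sketch stops at the statistic itself: turning Q into a p-value needs a chi-squared distribution, and PACF and the ADF test are better left to a statistics library such as statsmodels.

```python
from statistics import mean

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    num = sum(
        (series[t] - m) * (series[t - lag] - m)
        for t in range(lag, len(series))
    )
    return num / denom

def ljung_box_q(values, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag (larger means more autocorrelation)."""
    n = len(values)
    return n * (n + 2) * sum(
        autocorrelation(values, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )

# Hypothetical daily ridership: the same weekly pattern repeated for four weeks.
week = [520, 510, 515, 505, 530, 300, 280]
series = week * 4

print("lag 1:", round(autocorrelation(series, 1), 3))
print("lag 7:", round(autocorrelation(series, 7), 3))  # 0.75: same weekday, one week back
print("Q over 7 lags:", round(ljung_box_q(series, 7), 2))
```

The strong lag-7 autocorrelation is the weekly rhythm showing up in the numbers. When you run the Q statistic on model residuals rather than raw data, a large value is a hint that the model has left some of that rhythm unexplained.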

A Conversation About Insights, Not Just Numbers

Time series analysis is more than just crunching numbers; it’s about understanding the story behind the data. For public transport, this means:

Anticipating commuter needs.
Improving operational efficiency.
Enhancing passenger satisfaction.

When you unlock the secrets of your data, you’re not just running a transport system—you’re transforming the way people move through your city.

Closing Thoughts

Public transport is the backbone of urban life, and time series analysis is the tool that helps keep it running smoothly. By recognising patterns, predicting demand, and planning for the future, you can deliver better service while staying ahead of challenges.

So, next time you marvel at how your local bus system manages to run like clockwork (most of the time), remember: there’s a bit of data magic at play. And now, you know how it works!

Sheldon’s Substack
