Suppressed demand is not the same as absent demand. The gap between predicted and actual ridership is an infrastructure problem — and an investment opportunity.

Findings

01
Accessibility

Transit use increases where access is strong

The most predictive variables are not raw infrastructure counts but measures of how dense and how close the network is. bus_density_2mi accounts for 54.8% of Random Forest feature importance, more than all other variables combined (45.2%). dist_to_rail ranks second at 10.8%. Notably, dist_to_bus falls to just 3.3% once density is in the model, because the density variable already captures most of the bus-proximity signal. A tract with many stops that are far away underperforms a tract with fewer but walkable stops.
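
As a quick arithmetic check on those shares, the sketch below uses only the three percentages quoted above. It assumes the importances are normalized to sum to 100%, which the percentages imply but the text does not state outright; the variable names other than the three model features are illustrative.

```python
# Importance shares quoted in the text, in percent of total
# Random Forest feature importance.
reported = {
    "bus_density_2mi": 54.8,
    "dist_to_rail": 10.8,
    "dist_to_bus": 3.3,
}

top_feature = max(reported, key=reported.get)

# Assumption: importances sum to 100%, so every variable other than
# the top one shares the remainder.
all_others_combined = round(100.0 - reported[top_feature], 1)

# Combined share of the three access variables named in the text.
named_access_total = round(sum(reported.values()), 1)

print(f"{top_feature}: {reported[top_feature]}% vs all others: {all_others_combined}%")
print(f"three named access variables together: {named_access_total}%")
```

The three named access variables sum to 68.9%, so the remaining infrastructure share cited elsewhere on this page would have to come from variables not listed here.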

02
Model Result

Areas with unmet demand are the clearest opportunities for investment

By comparing predicted transit share (from the Random Forest model) to actual observed ridership, the analysis surfaces tracts with large positive gaps (predicted minus observed): places where the model expects high usage but observed usage is low. These are not low-demand areas; they are infrastructure-constrained areas. They cluster in middle-ring suburbs with fragmented bus networks, particularly in Bergen, Morris, Somerset, and Middlesex counties, where demand exists but the network has not kept up. The gap map is the most actionable output of this project.
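
The gap calculation can be sketched in a few lines. This is a minimal illustration, not the project's code: the `Tract` fields, the 3-point cutoff, and the tract values are all made up for the example, and the source does not say what threshold it treats as significant.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    tract_id: str
    predicted_share: float  # model-predicted transit commute share, percent
    actual_share: float     # observed transit commute share, percent


def demand_gap(tract: Tract) -> float:
    """Predicted minus observed share, in percentage points."""
    return tract.predicted_share - tract.actual_share


def flag_gap_tracts(tracts, min_gap_pp):
    """Keep tracts whose gap meets the cutoff, largest gap first."""
    flagged = [t for t in tracts if demand_gap(t) >= min_gap_pp]
    return sorted(flagged, key=demand_gap, reverse=True)


# Hypothetical tracts; the numbers are invented for illustration.
tracts = [
    Tract("A", predicted_share=12.0, actual_share=4.0),
    Tract("B", predicted_share=6.0, actual_share=5.5),
    Tract("C", predicted_share=9.0, actual_share=2.0),
    Tract("D", predicted_share=3.0, actual_share=7.0),
]

ranked = flag_gap_tracts(tracts, min_gap_pp=3.0)
for t in ranked:
    print(t.tract_id, demand_gap(t))
```

Tract D has a negative gap (more ridership than predicted) and tract B a small one, so only A and C are flagged.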

03
Accessibility

There is a clear distance threshold

The relationship between distance and ridership is not linear. Beyond roughly one mile from the nearest bus stop, transit use drops sharply — even in areas with strong underlying demand. Tracts past the 1-mile mark show near-zero ridership regardless of other favorable conditions. Closing access gaps in the 0.5–1.5 mile range produces far greater ridership returns than expanding service in already well-served areas.

Supporting visual — Finding 03

Distance to nearest bus stop (miles) vs transit commute share. Color = bus stop density within 2 miles. Ridership collapses past the 1-mile mark.
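
One simple way to look for a threshold like this is to bin tracts by distance and compare mean transit share per bin. The sketch below assumes half-mile bins and uses invented (distance, share) pairs; it is an illustration of the binning idea, not a reproduction of the chart above.

```python
from collections import defaultdict


def mean_share_by_distance_bin(records, bin_width=0.5):
    """Average transit share per distance bin.

    records: iterable of (distance_miles, transit_share_pct) pairs.
    Returns {bin_lower_edge_miles: mean_share_pct}, ordered by distance.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, share in records:
        lower_edge = int(distance // bin_width) * bin_width
        sums[lower_edge] += share
        counts[lower_edge] += 1
    return {edge: sums[edge] / counts[edge] for edge in sorted(sums)}


# Hypothetical tracts: (miles to nearest bus stop, transit commute share %).
records = [
    (0.1, 14.0), (0.3, 10.0),
    (0.6, 8.0), (0.9, 6.0),
    (1.2, 1.0), (1.4, 0.0),
    (1.8, 0.5),
]

binned = mean_share_by_distance_bin(records)
for lower_edge, mean_share in binned.items():
    print(f"{lower_edge:.1f}-{lower_edge + 0.5:.1f} mi: {mean_share:.1f}%")
```

A sharp fall between adjacent bins, rather than a steady decline, is what a threshold effect would look like in this kind of table.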

04
Model Result

Demographics indicate need — infrastructure determines whether it can be met

Demographic variables together account for ~27% of Random Forest feature importance, meaningful but secondary to infrastructure (~73%). pct_hispanic (7.8%) and pct_foreign_born (6.1%) are the most influential demographic predictors, reflecting both transit dependency and residential proximity to transit corridors. pct_black is not statistically significant in the OLS model (p = 0.45), which is consistent with Black-majority tracts being neither uniformly over- nor under-served relative to their infrastructure profile. Demographic data shows where demand exists; improving access is what changes behavior.

05
Policy Implication

Targeted investment reaches unmet need; uniform expansion yields the larger total gain

The uniform +20% expansion (Scenario 1) produces the largest system-wide average gain: +0.61 percentage points across all 2,165 tracts, with 1,110 tracts showing improvement. Scenario 2 (targeted bus) concentrates investment in 215 high-need tracts with low bus density and high demand scores. Its average gain within those tracts is smaller, +0.15 pp, but the investment lands where unmet need is greatest. Scenario 3 (targeted rail) shows mixed results: areas already well served by transit respond only weakly to rail improvements. When total ridership gain is the goal, uniform expansion delivers more. When equity is the goal, targeted investment reaches the communities with the greatest unmet need.
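
The scenario comparison follows a simple pattern: pick a set of tracts, raise their service level, re-predict, and average the change. The sketch below illustrates that pattern under loud assumptions. `toy_predict` is a stand-in formula, not the fitted Random Forest; the field names, the selection cutoffs, and the tract values are hypothetical; and applying the +20% to bus density is a guess at what "expansion" means here.

```python
def toy_predict(bus_density, demand_score):
    """Stand-in for the fitted model (NOT the project's Random Forest).

    Share rises with density and demand, capped at 30%.
    """
    return min(30.0, 0.5 * bus_density * demand_score)


def mean_gain_pp(tracts, selector, uplift=0.20):
    """Mean predicted gain (percentage points) over the selected tracts."""
    gains = []
    for t in tracts:
        if not selector(t):
            continue
        before = toy_predict(t["bus_density"], t["demand_score"])
        after = toy_predict(t["bus_density"] * (1 + uplift), t["demand_score"])
        gains.append(after - before)
    return sum(gains) / len(gains) if gains else 0.0


# Hypothetical tracts; values are invented for illustration.
tracts = [
    {"id": "A", "bus_density": 10, "demand_score": 0.5},
    {"id": "B", "bus_density": 2, "demand_score": 0.9},
    {"id": "C", "bus_density": 80, "demand_score": 0.9},  # already at the cap
]

# Uniform expansion: every tract gets the uplift.
uniform_gain = mean_gain_pp(tracts, selector=lambda t: True)

# Targeted expansion: only low-density, high-demand tracts (cutoffs assumed).
targeted_gain = mean_gain_pp(
    tracts,
    selector=lambda t: t["bus_density"] < 5 and t["demand_score"] >= 0.8,
)

print(f"uniform:  {uniform_gain:.2f} pp averaged over all tracts")
print(f"targeted: {targeted_gain:.2f} pp averaged over targeted tracts")
```

The two averages are taken over different sets of tracts, which is why the uniform and targeted figures quoted above are not directly comparable as a measure of return per unit of investment.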

Summary

The Central Argument

Where transit is close and dense, people use it. Where it's sparse, they don't — not because they don't want to, but because it isn't there.

Bus stop density within 2 miles (54.8% of RF feature importance) and distance to the nearest stop carry far more predictive weight for ridership than income, race, or age, individually or combined. Demographic maps reveal who needs transit; the gap map reveals where to build. Prioritize the gap zones: the areas where infrastructure, not an absence of demand, is the binding constraint.

Planning Application

Decision Support

This analysis supports corridor screening, investment prioritization, and performance-based planning.

It identifies where access constraints suppress demand and provides a data-driven way to target transit improvements where they will have the greatest impact on ridership. The gap map is the key output: it translates model predictions into a spatial priority queue that can inform service planning, grant applications, and long-range network design.
