How Hopthru Cleans, Validates, and Expands Ridership Data for NTD Reporting


Swiftly’s Hopthru Ridership platform ensures agencies can produce accurate, reliable ridership data for planning, reporting, and National Transit Database (NTD) submissions. This article explains how Swiftly transforms raw Automated Passenger Counter (APC) data into a clean, validated dataset through its cleaning algorithm, sampling and reliability framework, and expansion methodology. Together, these processes ensure agencies can trust their ridership data — even when real-world hardware or data challenges occur.

 

APC Correlation and Cleaning Algorithms

Each morning, Swiftly runs a series of software processes designed to accurately correlate raw APC events with the agency’s published schedule. As part of this process, Swiftly retrieves the most up-to-date schedule data from the public GTFS feed to ensure alignment with current service information.

During processing, Swiftly's cleaning algorithms identify events that occurred in revenue service and assign each event to the corresponding route, trip, and stop. Events determined not to be in revenue service are separated and excluded from the set of valid passenger events.

Unlike other APC solutions that apply a strict threshold for rejecting entire trips based on discrepancies between boardings (“ons”) and alightings (“offs”), Swiftly's approach evaluates data at the individual event level. This enables more precise identification of valid passenger activity. To make this determination, these algorithms analyze multiple metadata attributes for each event, including vehicle ID, vehicle assignment, GPS location, timestamp, and the context of surrounding events, to generate a confidence score for each APC record. Events with low confidence scores are excluded from revenue service calculations.
This event-level methodology allows Swiftly to reduce false negatives (valid passenger events that are wrongly discarded) compared to approaches that reject entire trips based on a single anomaly.
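
The difference between event-level and trip-level filtering can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ApcEvent` fields, the `confidence` values, and both thresholds are hypothetical and do not describe Swiftly's actual scoring model.

```python
from dataclasses import dataclass


@dataclass
class ApcEvent:
    trip_id: str
    stop_id: str
    ons: int
    offs: int
    confidence: float  # hypothetical score in [0, 1]


def filter_events(events, min_confidence=0.5):
    """Event-level approach: drop only the individual low-confidence events."""
    return [e for e in events if e.confidence >= min_confidence]


def reject_trips(events, max_imbalance=0.2):
    """Trip-level approach: drop every event of a trip whose ons and offs
    differ by more than the allowed share of the larger total."""
    totals = {}
    for e in events:
        ons, offs = totals.get(e.trip_id, (0, 0))
        totals[e.trip_id] = (ons + e.ons, offs + e.offs)
    bad = {
        trip
        for trip, (ons, offs) in totals.items()
        if abs(ons - offs) > max_imbalance * max(ons, offs, 1)
    }
    return [e for e in events if e.trip_id not in bad]


events = [
    ApcEvent("T1", "A", 10, 0, 0.95),
    ApcEvent("T1", "B", 2, 5, 0.90),
    ApcEvent("T1", "C", 0, 30, 0.10),  # one suspicious record
    ApcEvent("T1", "D", 0, 7, 0.92),
]

kept_by_event = filter_events(events)  # three events survive
kept_by_trip = reject_trips(events)    # the whole trip is discarded
```

In this example a single suspicious record causes the trip-level rule to discard all four events, while the event-level rule keeps the three plausible ones.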

Once data processing is complete, the cleaned and validated records are stored securely within Swiftly's network. The Hopthru Ridership product then queries this dataset for use in planning, reporting, and analytics.

In cases where a vehicle produces clearly anomalous data for an entire service day, the raw data for that vehicle and date are fully discarded before processing. For example, if a vehicle reports a substantial imbalance between boardings and alightings prior to any data cleansing, that day’s dataset for that vehicle will be excluded from analysis.
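
As a rough sketch of what such a vehicle-day check could look like, the function below flags a vehicle's raw daily totals when boardings and alightings diverge too far. The function name and the 25% threshold are hypothetical; this article does not state what Swiftly considers a substantial imbalance.

```python
def vehicle_day_is_anomalous(total_ons, total_offs, max_relative_imbalance=0.25):
    """Return True when raw daily ons and offs differ by more than the
    allowed share of the larger total (threshold is an assumed value)."""
    larger = max(total_ons, total_offs)
    if larger == 0:
        # No counts at all is a missing-data case, not an imbalance.
        return False
    return abs(total_ons - total_offs) / larger > max_relative_imbalance


balanced = vehicle_day_is_anomalous(400, 390)    # small difference
imbalanced = vehicle_day_is_anomalous(400, 150)  # large difference
```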

Once data is validated and cleansed, Swiftly ensures continued reliability through proactive monitoring and maintenance.

 

Benchmark Study

Before using APC data for NTD reporting, agencies complete a benchmark study to confirm accuracy and alignment. This study establishes the APC system’s accuracy, identifies potential discrepancies, and validates the system for ongoing reporting use. Swiftly partners closely with agency staff during this validation stage to ensure compliance with FTA and NTD standards.

 

APC Maintenance

A comprehensive APC Diagnostic Report (Hopthru Diagnostics) is updated daily and provides feedback on the performance of the APC system. Alerts can be configured to proactively notify maintenance staff of vehicles that are not producing any APC data, vehicles that are producing partial APC data, vehicles producing a substantial delta between boardings and alightings, and vehicles that are producing data but have not been recorded as having been in service. Both Swiftly and Agency staff routinely review the APC Diagnostics and schedule any APC maintenance with Agency staff or the APC vendor as necessary.
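
The four alert conditions described above can be expressed as a simple classification over one vehicle's service day. This is only an illustrative sketch: the `VehicleDay` fields, the alert labels, and the imbalance threshold are hypothetical and are not taken from Hopthru Diagnostics.

```python
from dataclasses import dataclass


@dataclass
class VehicleDay:
    vehicle_id: str
    in_service: bool        # recorded as having been in service that day
    scheduled_trips: int    # trips assigned to the vehicle
    trips_with_events: int  # assigned trips with at least one APC event
    event_count: int        # all APC events the vehicle produced
    ons: int
    offs: int


def diagnose(day, max_relative_imbalance=0.25):
    """Return the list of alert labels that apply to one vehicle-day."""
    alerts = []
    if day.in_service and day.event_count == 0:
        alerts.append("no_data")
    elif day.in_service and day.trips_with_events < day.scheduled_trips:
        alerts.append("partial_data")
    if not day.in_service and day.event_count > 0:
        alerts.append("data_without_service")
    larger = max(day.ons, day.offs)
    if larger > 0 and abs(day.ons - day.offs) / larger > max_relative_imbalance:
        alerts.append("imbalance")
    return alerts


silent = diagnose(VehicleDay("101", True, 12, 0, 0, 0, 0))
partial = diagnose(VehicleDay("102", True, 12, 7, 90, 300, 290))
unassigned = diagnose(VehicleDay("103", False, 0, 0, 40, 55, 20))
healthy = diagnose(VehicleDay("104", True, 12, 12, 200, 400, 395))
```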

With validated data in place, Swiftly applies industry-standard sampling methodologies to ensure that the data used for reporting meets FTA and NTD confidence standards.

 

APC Sampling and Reliability

Swiftly’s platform facilitates reliable data sampling by identifying valid trip records and filtering out incomplete or anomalous data. The system ensures that only verified data are included in NTD reporting calculations. Sampling methodologies always follow established NTD guidance, balancing precision with operational practicality.

If trips are determined to have run but were not detected in the APC data, the cleaning algorithm applies an Alternative Sampling Method that is designed to meet the FTA confidence intervals and precision levels defined in the Reporting Policy Manual. This process also applies to trips served by vehicles whose anomalous raw APC data was discarded.

 

Expansion Algorithm

Swiftly's expansion algorithm is a critical capability for producing complete and accurate ridership data, especially in environments where APC coverage isn’t perfect.

The algorithm fills in missing ridership data by statistically estimating passenger counts for trips that lack functioning APC data. It does this by “expanding” observed trips (those where APCs worked correctly) to represent similar trips that were unobserved. The methodology relies on having enough real data samples for each trip pattern (weekday, Saturday, Sunday, etc.) across a rolling period (typically four weeks).

Even agencies with 100% APC coverage across their fleet experience hardware failures or data gaps. 

Malfunctioning APCs can invalidate an entire day of data for a vehicle, and that missing data needs to be replaced with expanded estimates. Expansion ensures agencies can still report accurate totals to the National Transit Database (NTD) and maintain continuity in analytics even during outages or equipment replacement periods.

 

Expansion Methodology

Swiftly's expansion methodology estimates both UPT (Unlinked Passenger Trips) and PMT (Passenger Miles Traveled) by filling in missing trips with historical averages at the event level.

  1. The system detects all trips that occurred on a given service date.
  2. For each trip that ran, the system checks the Agency’s APC data to see if any event was recorded for that trip on the given service date.
    • Please note that these events do not need to be boardings or alightings. This ensures that Swiftly does not inadvertently expand a trip that ran but served no passengers.
  3. If there are no events recorded for the trip and service date, then the system starts the expansion process.
  4. The system checks to see if the service type is a weekday, Saturday, or Sunday.
  5. The system checks to see if there is recorded APC data for the trip and service day type in both the preceding and following weeks. 
  6. The system calculates stop-level averages for ons/offs/load/PMT using the data from step 5.
  7. For each scheduled stop in the trip, the system creates a stop-level event using the average ons/offs/load/PMT from step 6.

This method allows us to fill in event-level data while maintaining the confidence intervals and precision levels defined by the FTA.
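
The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Swiftly's implementation: the `events` data structure, the `expand_trip` function, the `window_days` parameter, and the `estimated` flag are all hypothetical. The sketch pools whatever comparable observations exist within one week on either side of the service date, and it skips scheduled stops that have no observations; how the production system handles those cases is not specified in this article.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

METRICS = ("ons", "offs", "load", "pmt")


def day_type(d):
    """Step 4: classify a service date as weekday, saturday, or sunday."""
    return {5: "saturday", 6: "sunday"}.get(d.weekday(), "weekday")


def expand_trip(trip_id, service_date, scheduled_stops, events, window_days=7):
    """Return estimated stop-level events for a trip with no recorded events.

    `events` maps (trip_id, date) to a list of dicts holding "stop_id" plus
    the METRICS. Returns None when the trip already has events or when no
    comparable observations exist (both behaviors are assumptions).
    """
    # Steps 2-3: any recorded event, even one with zero ons and offs,
    # means the trip was observed and must not be expanded.
    if events.get((trip_id, service_date)):
        return None

    # Step 5: collect observations of the same trip and day type
    # before and after the service date.
    samples = defaultdict(list)
    for offset in range(-window_days, window_days + 1):
        other = service_date + timedelta(days=offset)
        if offset == 0 or day_type(other) != day_type(service_date):
            continue
        for ev in events.get((trip_id, other), []):
            samples[ev["stop_id"]].append(ev)
    if not samples:
        return None

    # Steps 6-7: average each metric per stop and emit one estimated
    # event for every scheduled stop that has observations.
    expanded = []
    for stop_id in scheduled_stops:
        obs = samples.get(stop_id)
        if not obs:
            continue
        row = {"stop_id": stop_id, "estimated": True}
        for metric in METRICS:
            row[metric] = mean(o[metric] for o in obs)
        expanded.append(row)
    return expanded


events = {
    ("T1", date(2024, 3, 6)): [
        {"stop_id": "A", "ons": 10, "offs": 0, "load": 10, "pmt": 5.0},
        {"stop_id": "B", "ons": 2, "offs": 12, "load": 0, "pmt": 0.0},
    ],
    ("T1", date(2024, 3, 20)): [
        {"stop_id": "A", "ons": 14, "offs": 0, "load": 14, "pmt": 7.0},
        {"stop_id": "B", "ons": 4, "offs": 18, "load": 0, "pmt": 0.0},
    ],
}

# The trip ran on Wednesday 2024-03-13 but produced no APC events.
expanded = expand_trip("T1", date(2024, 3, 13), ["A", "B"], events)
```

Because the check in steps 2 and 3 looks for any event rather than any boarding, a trip that was observed carrying no passengers keeps its recorded zeros instead of being replaced with historical averages.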

 