Is the 370 the worst bus in Sydney?
11 October, 2016
Questions:
» Bus privatisation? Better or worse?
» Is the 370 the worst bus route in Sydney? (or are they all that bad?)
» Old timetables
» Bus occupancy
» Bus/train/light-rail/ferry patronage
» Contract areas and details
» Opal card usage and stats
» Transport Forecasts
» Population Forecasts
» Fare compliance
» Walking data
» Aviation data
» Cycling data
Transport for NSW Open Data
TripView - Grofsoft https://www.grofsoft.com/
GTFS
General Transit Feed Specification
Published by Google
Apache 2.0 License
Separate static (timetable) and realtime data
https://developers.google.com/transit/gtfs/
https://developers.google.com/transit/gtfs-realtime/
This is a zip of CSV files
Bus Routes
Agencies
Trips
Map paths (shapes)
Trip stop times
Stop names and locations
Dates this trip will run
Exceptions to this
Fare information
GTFS [Static Timetable]
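Because the static timetable is a zip of CSV files, it can be read with the Python standard library alone. The following is a minimal sketch, not the code used for this talk: `read_gtfs_table` and `build_sample_feed` are hypothetical helper names, and the tiny in-memory feed stands in for a real download. The file name `routes.txt` and the columns `route_id` and `route_short_name` come from the GTFS specification.

```python
import csv
import io
import zipfile


def read_gtfs_table(zip_bytes: bytes, filename: str) -> list[dict[str, str]]:
    """Return the rows of one CSV file inside a GTFS zip as dictionaries."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(filename) as raw:
            # utf-8-sig drops a byte-order mark if the file starts with one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


def build_sample_feed() -> bytes:
    """Build a tiny in-memory feed so the sketch runs without a download."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "routes.txt",
            "route_id,route_short_name\n2441_370,370\n",
        )
    return buffer.getvalue()


routes = read_gtfs_table(build_sample_feed(), "routes.txt")
print(routes[0]["route_short_name"])
```

The same helper would work for `trips.txt`, `stop_times.txt` and the other files in the feed, since each one is a CSV file with a header row.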
trip {
  trip_id: "631043"
  start_time: "01:55:00"
  start_date: "20180111"
  schedule_relationship: SCHEDULED
  route_id: "2441_370"
}
vehicle {
  id: "42558_346913_3000_14_1"
}
stop_time_update {
  stop_sequence: 47
  arrival {
    delay: 125
    time: 1516714432
  }
  departure {
    delay: 120
    time: 1516714446
  }
  stop_id: "228776"
  schedule_relationship: SCHEDULED
}
stop_time_update {
  stop_sequence: 48
  ...
}
GTFS Realtime
This is a giant protobuf
Fetching and storing realtime data
AWS Lambda (Python)
CloudWatch trigger (every minute)
TfNSW Open Data Hub
S3
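One way to picture the per-minute collection step is a function that fetches one dump and stores it, unparsed, under a timestamped key. This is only a sketch under stated assumptions: `snapshot_key`, `collect_snapshot` and the `realtime/...` key layout are hypothetical, and the HTTP request and S3 upload are passed in as plain callables so that the example runs without AWS or network access.

```python
from datetime import datetime, timezone
from typing import Callable


def snapshot_key(fetched_at: datetime) -> str:
    """Build an object key that sorts chronologically (hypothetical layout)."""
    return fetched_at.astimezone(timezone.utc).strftime(
        "realtime/%Y/%m/%d/%H%M%S.pb"
    )


def collect_snapshot(
    fetch: Callable[[], bytes],
    store: Callable[[str, bytes], None],
    now: datetime,
) -> str:
    """Fetch one realtime dump and store it unparsed under a timestamped key."""
    payload = fetch()
    key = snapshot_key(now)
    store(key, payload)
    return key


# Stand-ins for the HTTP request and the S3 upload.
bucket: dict[str, bytes] = {}
key = collect_snapshot(
    fetch=lambda: b"raw protobuf bytes",
    store=bucket.__setitem__,
    now=datetime(2018, 1, 11, 1, 55, tzinfo=timezone.utc),
)
print(key)
```

Storing the raw bytes and parsing them later keeps the scheduled function small, which matters when it has to finish well inside its one-minute interval.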
Fetching and storing timetable data
AWS Lambda (Python)
CloudWatch trigger (every hour)
TfNSW Open Data Hub
S3
Postgres (RDS)
GTFS Static
4 months
29 separate timetable feeds
786 timetable updates
3.5 GB

4 months
1 realtime feed (combined NSW)
186,628 collections (every 1 min)
557 GB
GTFS Realtime
Trip
Trip stop time
Trip date
Realtime stop entry
Route
Agency
Stop
My data model
GTFS Static Timetable (Zip of CSV files)
Bus Routes
Agencies
Trips
Map paths (shapes)
Trip stop times
Stop names and locations
Dates this trip will run
Exceptions to this
Fare information
Identifiers: Route ID, Trip ID, Stop ID
GTFS Static Timetable
Identifiers: Route ID, Trip ID, Stop ID
Times: 2018-01-10 25:55:00 26:05:00 ...
trip {
  trip_id: "631043"
  start_time: "01:55:00"
  start_date: "20180111"
  schedule_relationship: SCHEDULED
  route_id: "2441_370"
}
vehicle {
  id: "42558_346913_3000_14_1"
}
stop_time_update {
  stop_sequence: 47
  arrival {
    delay: 125
    time: 1516714432
  }
  departure {
    delay: 120
    time: 1516714446
  }
  stop_id: "228776"
  schedule_relationship: SCHEDULED
}
stop_time_update {
  stop_sequence: 48
  ...
}
GTFS Realtime Data
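GTFS timetables may write times past 24:00:00 for trips that run after midnight of a service day, so the timetable entry 2018-01-10 25:55:00 and the realtime entry with start_date 20180111 and start_time 01:55:00 seem to describe the same departure. A sketch of one way to bridge the two is to generate both spellings and look each one up in the timetable. The function name `candidate_service_keys` is hypothetical, and the sketch is an assumption about how the match could be done, not the matching code from this talk.

```python
from datetime import datetime, timedelta


def candidate_service_keys(start_date: str, start_time: str) -> list[tuple[str, str]]:
    """Return (service_date, departure_time) pairs a timetable might use."""
    day = datetime.strptime(start_date, "%Y%m%d").date()
    hours, minutes, seconds = (int(part) for part in start_time.split(":"))
    same_day = (day.isoformat(), f"{hours:02d}:{minutes:02d}:{seconds:02d}")
    # The same instant expressed as a late trip of the previous service day.
    previous_day = (
        (day - timedelta(days=1)).isoformat(),
        f"{hours + 24:02d}:{minutes:02d}:{seconds:02d}",
    )
    return [same_day, previous_day]


print(candidate_service_keys("20180111", "01:55:00"))
# [('2018-01-11', '01:55:00'), ('2018-01-10', '25:55:00')]
```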
Route ID: ""
Trip ID: "631043_2"
Stop ID: ???
Route did not exist: _382
Route did not exist: _382
Route did not exist: _317
Route did not exist: 2436_993
Route did not exist: 2454_TROL
Route did not exist: 2436_994
Route did not exist: 2433_RAIL
Route did not exist: 2452_RAIL
Route did not exist: 2433_4000
Route did not exist: 2433_RAIL
Route did not exist: 2436_994
Route did not exist: 2433_RAIL
Route did not exist: 2436_994
Route did not exist: 2454_JC
Route did not exist: 2436_993
Route did not exist: 2436_993
Route did not exist: 2454_TROL
1. Download 1 realtime dump
2. Parse realtime data protobuf
3. Match each of the ~7000 trips with the timetable
4. Write ~20000 realtime updates to the DB
Processing realtime data
Main Django server
Postgres RDS
EC2 Spot instances (5 shown)
Processing the data
Is the 370 the worst bus in Sydney?
Early: more than 2 min early
On time: 2 min early to 5 min late (inclusive)
Late: more than 5 min late
Very late: more than 20 min late
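The thresholds above translate directly into a small classifier. In this sketch the name `classify_delay` is hypothetical, the delay is in seconds with positive values meaning late (as in the GTFS Realtime `delay` field), and the four categories are treated as mutually exclusive, which is an assumption: the slides do not say whether "very late" trips are also counted as "late".

```python
def classify_delay(delay_seconds: int) -> str:
    """Map a delay in seconds (negative = early) to a punctuality category."""
    if delay_seconds < -120:
        return "early"        # more than 2 min early
    if delay_seconds > 1200:
        return "very late"    # more than 20 min late
    if delay_seconds > 300:
        return "late"         # more than 5 min late
    return "on time"          # 2 min early to 5 min late, inclusive


# The arrival delay of 125 seconds from the realtime sample counts as on time.
print(classify_delay(125))
```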
Results
Recorded bus trips: 3,726,226
On time: 1,180,774 (31.69%)
More than 20min late: 106,535 (2.86%)
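The percentages follow from the raw counts, and the same arithmetic shows how many trips fall into neither group. The counts below are taken from the slide; the helper name `share` is hypothetical.

```python
recorded_trips = 3_726_226
on_time = 1_180_774
more_than_20_min_late = 106_535


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Trips that were neither on time nor more than 20 minutes late.
remaining = recorded_trips - on_time - more_than_20_min_late

print(share(on_time, recorded_trips))                # 31.69
print(share(more_than_20_min_late, recorded_trips))  # 2.86
print(remaining, share(remaining, recorded_trips))   # 2438917 65.45
```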
Rank  # of trips  % on time  Route  Route Name
1     9215        97.03      Stkn   Stockton Ferry
2     1062        96.33      N20    Riverwood to Rockdale
3     8811        92.09      273    Fassifern to Toronto
4     1680        90.36      453    Percival Street, Rockdale to Rockdale Station
5     6086        90.16      954    Hurstville Grove to Hurstville
6     1029        88.53      15     Bay Village to Tuggerah
7     2322        88.20      N10    Sutherland to City Town Hall
8     1190        87.98      N11    Cronulla to City Town Hall
9     4313        87.87      15     Stanwell Park to Helensburgh
10    2005        85.94      280    Cooranbong to Morisset
Best Routes
Rank  # of trips  % on time  Route  Route Name
1     542         2.77       160    Cessnock to Newcastle
2     1056        3.22       622    Dural to Milsons Point via Cherrybrook
3     699         3.29       L70    Terrey Hills to City QVB (Limited Stops)
4     2442        3.64       627    Castle Hill to Chatswood
5     1876        3.68       628    Norwest to Chatswood
6     1360        4.19       740    Macquarie Park to Plumpton via Stanhope Gardens
7     1280        4.38       594H   Hornsby to City QVB
8     880         5.00       803    Liverpool to Miller (Loop Service)
9     4862        5.78       841    Narellan to Leppington via Gregory Hills
10    2771        5.92       896    Campbelltown to Oran Park via Gregory Hills (Loop Service)
Worst Routes (by % on time)
22 14190 8.79 370 Leichhardt Marketplace to Coogee
Rank  # of trips  % on time  % >20min late  Route  Route Name
1     1104        20.29      34.69          7      Wollongong to Bellambi (Loop Service)
2     1638        23.99      30.40          8      Wollongong to Bellambi via Balgownie (Loop Service)
3     1660        23.61      25.42          3      Wollongong to Bellambi via Towradgi (Loop Service)
4     1592        48.43      24.81          10     Wollongong to West Wollongong (Loop Service)
5     1605        24.61      24.74          277    Castle Cove to Chatswood
6     14190       8.79       23.45          370    Leichhardt Marketplace to Coogee
7     1616        20.85      22.77          11     Wollongong to Wollongong University (Loop Service)
8     1230        29.67      22.68          24     Wollongong to Figtree via Mangerton (Loop Service)
9     2499        16.49      22.45          281    Davidson to Chatswood
10    1200        22.50      21.92          571    Turramurra to South Turramurra (Loop Service)
Worst Routes (by % > 20min late)
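A ranking like this one can be produced by counting categories per route and sorting on the share of very late trips. The sketch below is illustrative only: `route_summary` is a hypothetical name, the routes "A" and "B" are made-up sample data rather than real results, and the actual analysis ran against Postgres, not in-memory lists.

```python
from collections import Counter, defaultdict


def route_summary(trips):
    """Summarise (route, category) pairs into rows sorted worst first."""
    counts = defaultdict(Counter)
    for route, category in trips:
        counts[route][category] += 1
    rows = []
    for route, by_category in counts.items():
        total = sum(by_category.values())
        rows.append({
            "route": route,
            "trips": total,
            "pct_on_time": round(100 * by_category["on time"] / total, 2),
            "pct_very_late": round(100 * by_category["very late"] / total, 2),
        })
    # Highest share of very late trips first.
    return sorted(rows, key=lambda row: row["pct_very_late"], reverse=True)


# Made-up sample data for two hypothetical routes.
sample = [
    ("A", "on time"), ("A", "very late"), ("A", "late"), ("A", "very late"),
    ("B", "on time"), ("B", "on time"), ("B", "early"), ("B", "late"),
]
summary = route_summary(sample)
for row in summary:
    print(row)
```

Sorting the same rows by `pct_on_time` in ascending order would give the other "worst routes" ranking, which is why the two lists can disagree about which route is worst.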
Rank  # of trips  % on time  % >20min late  Agency
1     312840      21.05      1.69           Hillsbus
2     11297       21.09      3.49           Rover Coaches
3     155130      23.26      1.17           Transit Systems
4     67066       24.87      6.05           Forest Coach Lines
5     1724152     28.34      3.36           State Transit Sydney
6     344292      30.66      3.24           Transdev NSW
7     119277      33.17      3.14           Newcastle Transport
8     97363       33.34      4.31           Busabout
9     29297       36.47      4.13           Blue Mountains Transit
10    82005       37.99      6.28           Premier Illawarra
Worst Agencies (by on-time %)
Conclusions
» Bus privatisation - could go either way ¯\_(ツ)_/¯
» The 370 is the worst bus route in Sydney. (or maybe it's the 277 - Castle Cove to Chatswood)
Future Work
» Analyse wait time between buses
» Collect bus fullness data
» Publish the data live
Any questions?
Find me at
» katiebell.net
» @notsolonecoder
» github.com/katharosada/bus-shaming
24 - 26 August, Sydney ICC Call for talk proposals open now! 2018.pycon-au.org/speak/