Public transport live tracker

Serverless Event-driven solution

Task: Design a solution for live public transport tracking in the biggest regions in the world.

Bus count by Region




Requirements gathering

  1. Each bus sends its current location to the cloud every 5 seconds. Assuming the reports are randomly distributed in time, the load is about 200,000 messages per second for the USA and 180,000 messages per second for Europe.
  2. Users interact with the database in a few predefined ways (get the current position of a single bus, or of all buses on a single line), which eliminates the need for complex queries.
  3. Traffic is uneven and cyclic over the day, with spikes during rush hours and lows at night.
  4. All data should be effectively logged for further statistical analysis.
  5. Fine-grained access control should be available for field administrators for specific tasks (like device fleet management or log access).
  6. The solution should be cost-effective.
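
The message rates in requirement 1 follow from a simple relation: if every bus reports once per interval and the reports are spread uniformly in time, the steady-state rate is the fleet size divided by the interval. The sketch below inverts that relation to show the fleet sizes the stated rates imply. The function names are illustrative, and the fleet sizes and daily volumes are derived from the stated rates under that assumption, not taken from the bus count table.

```python
REPORT_INTERVAL_S = 5  # each bus reports every 5 seconds (requirement 1)


def messages_per_second(bus_count: int, interval_s: int = REPORT_INTERVAL_S) -> float:
    """Steady-state ingest rate, assuming reports are spread uniformly in time."""
    return bus_count / interval_s


def implied_bus_count(rate_per_s: int, interval_s: int = REPORT_INTERVAL_S) -> int:
    """Fleet size needed to produce the given steady-state rate."""
    return rate_per_s * interval_s


# Rates stated in requirement 1, in messages per second.
stated_rates = {"USA": 200_000, "Europe": 180_000}

# Fleet size implied by each rate, and the daily message volume if the
# whole fleet reported around the clock (an upper bound for log sizing).
implied_fleets = {region: implied_bus_count(rate) for region, rate in stated_rates.items()}
daily_messages = {region: rate * 86_400 for region, rate in stated_rates.items()}

for region in stated_rates:
    print(region, implied_fleets[region], daily_messages[region])
```

Under these assumptions, the USA rate corresponds to about 1,000,000 reporting buses and the Europe rate to about 900,000. The daily volumes are what the logging path in requirement 4 would have to absorb at most.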




High-Level Design

High-Level Architecture




Design Rationale

  1. IoT Core:
    • Each bus sends location data every 5 seconds, generating a large volume of events. IoT Core is designed to process this high-throughput, event-driven data efficiently.
    • It supports secure communication with IoT devices (authenticated with X.509 certificates), ensuring data integrity and privacy.
  2. DynamoDB:
    • DynamoDB is a NoSQL database that scales horizontally, making it ideal for handling the large datasets generated by cities like Shanghai (20,000 buses).
    • The predefined queries (e.g., fetching the location of a single bus or buses on a line) align with DynamoDB's fast key-value lookups.
    • Its on-demand capacity mode handles cyclic and unstable traffic patterns effectively.
  3. API Gateway:
    • Users can query the database to fetch live bus positions, with API Gateway acting as a lightweight, scalable intermediary.
    • It supports automatic scaling to handle traffic spikes during rush hours.
  4. IAM (Identity and Access Management):
    • Delegated administration is possible through IAM, allowing specific tasks (like fleet management) to be securely assigned to administrators.
    • Fine-grained access controls ensure secure handling of sensitive data.
  5. S3 Bucket:
    • Detailed logs can be collected for statistical analysis, such as usage patterns or performance metrics.
    • S3 is highly durable and cost-effective for storing large amounts of data.
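
The two predefined access patterns from the DynamoDB rationale can be sketched as request parameters for the DynamoDB low-level API. This is a minimal sketch under assumptions that the design does not state: the table name `BusLocations`, the index name `LineIndex`, and the attribute names `bus_id`, `line_id`, `lat`, `lon`, and `ts` are all hypothetical. It keeps one item per bus (the latest position, overwritten on each report) and assumes a global secondary index keyed on the line to list the buses on that line.

```python
TABLE_NAME = "BusLocations"  # hypothetical table name
LINE_INDEX = "LineIndex"     # hypothetical global secondary index on line_id


def put_location_request(bus_id: str, line_id: str, lat: float, lon: float, ts: int) -> dict:
    """Parameters for PutItem: one item per bus, overwritten on every report."""
    return {
        "TableName": TABLE_NAME,
        "Item": {
            "bus_id": {"S": bus_id},    # partition key of the table
            "line_id": {"S": line_id},  # partition key of the assumed index
            "lat": {"N": str(lat)},     # DynamoDB numbers are sent as strings
            "lon": {"N": str(lon)},
            "ts": {"N": str(ts)},       # report time, epoch seconds
        },
    }


def get_bus_request(bus_id: str) -> dict:
    """Parameters for GetItem: current position of a single bus."""
    return {"TableName": TABLE_NAME, "Key": {"bus_id": {"S": bus_id}}}


def buses_on_line_request(line_id: str) -> dict:
    """Parameters for Query on the index: all buses currently on one line."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": LINE_INDEX,
        "KeyConditionExpression": "line_id = :line",
        "ExpressionAttributeValues": {":line": {"S": line_id}},
    }
```

Each dictionary can be passed as keyword arguments to the matching boto3 client method, for example `boto3.client("dynamodb").query(**buses_on_line_request("42"))`. Both reads are key lookups, so neither needs a scan or a filter expression.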




Low-Level Design

Low-Level Architecture




Performance and Future Enhancements