Data Ingestion for Porto Pollution Dashboard

2024

Course

Big Data Analytics and Decision Making (ISEP) capstone

  • Python
  • Apache Airflow
  • PostgreSQL
  • MinIO
  • Docker

End-to-end ETL pipeline integrating urban mobility, air quality, and transit data for analytics-ready dashboards.


Collaborators: Daniel Sampaio Osório, Diogo Dias Assunção Serra, Jorge Laginhas, Pedro Rodrigues, Tiago Fernandes


Purpose: Create a reliable ingestion pipeline for multi-source urban data to support pollution analytics and dashboards.

Approach: Extracted data from Porto’s Urban Platform, GTFS feeds, and demographic sources, then used Airflow to orchestrate the ETL into a star-schema warehouse.
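As a rough illustration of the star-schema load step, the sketch below parses a GTFS-style stops file, builds two dimension tables and one fact table, and runs an aggregate of the kind a dashboard might query. It is not the project's code: the table names (dim_stop, dim_date, fact_air_quality), the sample rows, and the link between readings and stops are all hypothetical, and sqlite3 stands in for PostgreSQL so the example runs without a database server. Only the stops.txt column names come from the GTFS specification.

```python
import csv
import io
import sqlite3

# Made-up sample rows; only the column names follow the GTFS stops.txt spec.
GTFS_STOPS = """stop_id,stop_name,stop_lat,stop_lon
S1,Stop A,41.15,-8.61
S2,Stop B,41.16,-8.60
"""

# Made-up air-quality readings keyed to a stop (an assumption for illustration).
READINGS = [
    {"stop_id": "S1", "date": "2024-03-01", "no2": 41.5},
    {"stop_id": "S1", "date": "2024-03-02", "no2": 38.0},
    {"stop_id": "S2", "date": "2024-03-01", "no2": 29.5},
]


def extract_stops(text):
    """Extract: parse GTFS stops.txt content into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def load(conn, stops, readings):
    """Transform and load: populate dimensions first, then the fact table."""
    conn.executescript("""
        CREATE TABLE dim_stop (
            stop_key INTEGER PRIMARY KEY,
            stop_id TEXT UNIQUE, stop_name TEXT, lat REAL, lon REAL);
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            iso_date TEXT UNIQUE);
        CREATE TABLE fact_air_quality (
            stop_key INTEGER REFERENCES dim_stop(stop_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            no2 REAL);
    """)
    conn.executemany(
        "INSERT INTO dim_stop (stop_id, stop_name, lat, lon) VALUES (?, ?, ?, ?)",
        [(s["stop_id"], s["stop_name"], float(s["stop_lat"]), float(s["stop_lon"]))
         for s in stops],
    )
    # OR IGNORE keeps one row per calendar date even when dates repeat.
    conn.executemany(
        "INSERT OR IGNORE INTO dim_date (iso_date) VALUES (?)",
        [(r["date"],) for r in readings],
    )
    # Resolve natural keys to surrogate keys while inserting facts.
    conn.executemany(
        """INSERT INTO fact_air_quality (stop_key, date_key, no2)
           SELECT s.stop_key, d.date_key, ?
           FROM dim_stop s, dim_date d
           WHERE s.stop_id = ? AND d.iso_date = ?""",
        [(r["no2"], r["stop_id"], r["date"]) for r in readings],
    )
    conn.commit()


def mean_no2_by_stop(conn):
    """A dashboard-style query: average NO2 per stop."""
    return conn.execute(
        """SELECT s.stop_name, AVG(f.no2)
           FROM fact_air_quality f JOIN dim_stop s USING (stop_key)
           GROUP BY s.stop_name ORDER BY s.stop_name"""
    ).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, extract_stops(GTFS_STOPS), READINGS)
result = mean_no2_by_stop(conn)
print(result)
```

In an Airflow deployment, functions like extract_stops and load would typically each be wrapped in a task so that failures can be retried per step; that wiring is omitted here to keep the sketch dependency-free.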

Constraints: The stack had to be portable and reproducible across environments, so the pipeline was containerized.

Tech stack: Python · Apache Airflow · PostgreSQL · MinIO · Docker