
AWS Data Engineering

Cloud Flight Fare Pipeline

End-to-end flight fare pipeline with a fast local demo path and a documented production-style AWS architecture, orchestrated with Airflow and modeled with dbt.

Honest scope: local demo execution is proven in the repo today, while the AWS path is documented as a production-style architecture with runbooks, warehouse SQL, and validation assets.
Python · Airflow · dbt · Docker · Postgres · AWS S3 · Redshift · GitHub Actions

Local demo

Docker + Postgres

AWS path

S3 + Redshift runbook

Orchestration

Airflow DAG + logs

Modeling

dbt marts + tests

Reviewer proof

Screenshots, SQL, CI

Fast local validation path

The fastest review path is local: start Postgres, load the sample fares, build the marts with dbt, then run the proof queries that show downstream readiness.

Step 01

Start local services

docker compose up -d
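The repo's compose file isn't reproduced on this page; a minimal sketch of what the demo stack's Postgres service could look like (image tag, credentials, and volume name here are assumptions, not the repo's actual values):

```yaml
# docker-compose.yml -- illustrative sketch; check the repo for the real service definition
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: flight_fares       # assumed credentials
      POSTGRES_PASSWORD: flight_fares
      POSTGRES_DB: flight_fares
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data # persist data across restarts
volumes:
  pgdata:
```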

Step 02

Load sample fares

python scripts/load_sample_to_postgres.py
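The loader's internals live in the repo; as a rough sketch of the pattern, assuming a psycopg2 connection and illustrative table and column names (none of these identifiers are confirmed by the repo):

```python
import csv
import io


def to_raw_rows(records):
    """Normalize raw fare snapshots into warehouse-ready tuples.

    Assumed input columns: route, fare_usd, quote_date, depart_date.
    """
    rows = []
    for r in records:
        rows.append((
            r["route"].strip().upper(),      # canonicalize route codes
            round(float(r["fare_usd"]), 2),  # coerce fares to 2-decimal floats
            r["quote_date"],
            r["depart_date"],
        ))
    return rows


def load(conn, records, table="raw.fares"):
    """COPY normalized rows into Postgres via an in-memory CSV buffer."""
    buf = io.StringIO()
    csv.writer(buf).writerows(to_raw_rows(records))
    buf.seek(0)
    with conn.cursor() as cur:
        cur.copy_expert(
            f"COPY {table} (route, fare_usd, quote_date, depart_date) FROM STDIN WITH CSV",
            buf,
        )
    conn.commit()
```

Separating the pure normalization step from the COPY call keeps the transform unit-testable without a running database.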

Step 03

Build dbt models

dbt build --project-dir dbt/flight_fares --profiles-dir dbt
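dbt expects a matching profile under --profiles-dir; a hedged sketch of what dbt/profiles.yml could look like for the local Postgres target (the profile name, credentials, and schema here are assumptions):

```yaml
# dbt/profiles.yml -- illustrative; values must match the compose stack's actual credentials
flight_fares:
  target: local
  outputs:
    local:
      type: postgres
      host: localhost
      port: 5432
      user: flight_fares
      password: flight_fares
      dbname: flight_fares
      schema: marts
      threads: 4
```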

Step 04

Run proof queries

python scripts/run_analysis_queries.py

Architecture

Two clear paths, one analytics-ready outcome

The repo is strongest when reviewed as a proven local demo path plus a documented production-style AWS target. Both routes converge on dbt-modeled marts and SQL outputs that are ready for analytics review.

Proven local execution path

  • Boot the local Docker + Postgres demo stack
  • Load sample fare snapshots into warehouse-ready raw tables
  • Run dbt staging, marts, tests, and docs against Postgres
  • Execute proof queries that validate analytics-ready outputs

Documented AWS target path

  • Ingest daily fare snapshots to S3 bronze storage
  • Load warehouse tables into Redshift with runbook-backed SQL helpers
  • Orchestrate the warehouse path with Airflow and dbt
  • Use validation SQL to confirm mart row counts and readiness
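The bronze-to-Redshift load in a runbook of this shape typically comes down to a COPY statement; an illustrative sketch, where the bucket path, partition layout, and IAM role ARN are placeholders rather than the repo's real values:

```sql
-- Illustrative bronze-to-Redshift load; bucket, partition, and role are placeholders
COPY raw.fares
FROM 's3://example-bucket/bronze/fares/dt=2024-01-01/'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
FORMAT AS PARQUET;
```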

Shared analytics handoff

  • Mart outputs include marts.fact_fares, marts.dim_route, and marts.dim_date
  • Docs explain how analysts should query route, pricing, and timing patterns
  • Example SQL covers route trends, monthly movement, and lead-time analysis
  • CI validates linting, tests, demo loading, and dbt build steps
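Those CI steps can be sketched as a single GitHub Actions workflow; the job layout and tool choices below (e.g. ruff, pytest) are assumptions, not the repo's actual config:

```yaml
# .github/workflows/ci.yml -- illustrative sketch of the validation steps
name: ci
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: ruff check .                                # linting
      - run: pytest                                      # tests
      - run: docker compose up -d                        # demo stack
      - run: python scripts/load_sample_to_postgres.py   # demo loading
      - run: dbt build --project-dir dbt/flight_fares --profiles-dir dbt
```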

Architecture Diagram

Repo architecture overview showing ingestion, bronze/raw landing, silver/cleaned processing, dbt modeling, validation, and analytics outputs.


Analytics-ready outputs

Downstream analysis is visible, not implied

This case study does not stop at ingestion and modeling. The repo documents the marts, the analyst query patterns, and the downstream output handoff so reviewers can see what gets delivered after the pipeline runs.

Route fare trends

Route-level SQL shows how marts.fact_fares and marts.dim_route support pricing trends by route.
Route trend SQL
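A hedged sketch of what such a route-trend query can look like, assuming illustrative column names on marts.fact_fares and marts.dim_route (the repo's real schema may differ):

```sql
-- Average fare by route and quote month; column names are assumptions
SELECT
    r.route_code,
    date_trunc('month', f.quote_date) AS quote_month,
    avg(f.fare_usd)                   AS avg_fare,
    count(*)                          AS n_quotes
FROM marts.fact_fares f
JOIN marts.dim_route r ON r.route_key = f.route_key
GROUP BY 1, 2
ORDER BY 1, 2;
```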

Monthly fare movement

README rollups make the time-series handoff clear without implying a live reporting layer.
Example queries
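A month-over-month rollup of this kind can be written with a window function; the column names below are assumptions, not the repo's confirmed schema:

```sql
-- Month-over-month fare movement; column names are assumptions
WITH monthly AS (
    SELECT date_trunc('month', quote_date) AS quote_month,
           avg(fare_usd)                   AS avg_fare
    FROM marts.fact_fares
    GROUP BY 1
)
SELECT quote_month,
       avg_fare,
       avg_fare - lag(avg_fare) OVER (ORDER BY quote_month) AS mom_change
FROM monthly
ORDER BY quote_month;
```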

Lead-time analysis

Lead-time queries show how the marts support booking-window and pricing review.
Lead-time SQL
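A booking-window query can bucket fares by the days between quote and departure; a sketch with assumed column names (in Postgres, subtracting two dates yields whole days):

```sql
-- Average fare by booking lead time; column names are assumptions
SELECT
    (depart_date - quote_date) AS lead_time_days,
    avg(fare_usd)              AS avg_fare
FROM marts.fact_fares
GROUP BY 1
ORDER BY 1;
```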

Dashboard preview artifact

The repo includes a static downstream artifact for reviewer handoff. It is shown here as documentation evidence, not as a claim of a live hosted BI application.

Execution Proof

Real proof assets shown directly on the page

Reviewer evidence is surfaced here instead of being hidden behind link lists. The screenshots below come from the project repo and cover storage proof, orchestration proof, execution logs, and downstream artifact handoff.

Reviewer path

Start with the local demo path, then review the AWS/Redshift runbook, followed by the proof assets and docs for the documented target. That keeps the proven local scope and the documented cloud target easy to separate.