Data Engineering Portfolio

Data Engineer building reliable pipelines and analytics-ready datasets

Designing scalable data platforms with Python, SQL, Airflow, dbt, Azure, AWS, Databricks, and modern cloud warehouses.

Featured Project

AZURE DATA ENGINEERING

NYC 311 Service Requests Lakehouse

Azure-first medallion lakehouse for NYC 311 operational analytics, transforming raw API data into analytics-ready bronze, silver, and gold datasets.

  • Azure Data Factory -> ADLS Gen2 -> Databricks pipeline with verified raw landing and medallion (bronze, silver, gold) processing
  • Reusable data quality checks, dimensional models, and reporting marts
  • Architecture notes, runbooks, SQL assets, notebook exports, and evidence of cloud execution
Azure Data Factory, ADLS Gen2, Databricks, PySpark, Delta Lake, Python, SQL, Power BI, GitHub Actions
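
For readers new to the medallion pattern, here is a minimal sketch of the bronze, silver, gold flow in plain Python. It is an illustration only, not the project's PySpark code: the function names (to_silver, to_gold), the field names (unique_key, complaint_type, borough), and the sample rows are assumptions chosen for the example.

```python
from collections import Counter


def to_silver(bronze_rows):
    """Clean raw records: drop rows without a key, deduplicate, normalize text."""
    seen = set()
    silver = []
    for row in bronze_rows:
        key = row.get("unique_key")
        if not key or key in seen:
            continue  # skip rows with no key and repeated keys
        seen.add(key)
        silver.append({
            "unique_key": key,
            "complaint_type": (row.get("complaint_type") or "Unknown").strip().title(),
            "borough": (row.get("borough") or "Unspecified").strip().upper(),
        })
    return silver


def to_gold(silver_rows):
    """Aggregate cleaned records into a small reporting table."""
    counts = Counter((r["borough"], r["complaint_type"]) for r in silver_rows)
    return [
        {"borough": b, "complaint_type": c, "request_count": n}
        for (b, c), n in sorted(counts.items())
    ]


# Hypothetical raw (bronze) rows, including a duplicate and a row with no key.
bronze = [
    {"unique_key": "1", "complaint_type": "noise ", "borough": "brooklyn"},
    {"unique_key": "1", "complaint_type": "noise ", "borough": "brooklyn"},
    {"unique_key": "2", "complaint_type": "Noise", "borough": "BROOKLYN"},
    {"unique_key": None, "complaint_type": "Heat", "borough": "Bronx"},
    {"unique_key": "3", "complaint_type": "heat", "borough": None},
]
gold = to_gold(to_silver(bronze))
```

The silver step owns cleaning and deduplication, so the gold step can stay a simple aggregation that a reporting tool reads directly.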

Data Engineering

AWS DATA ENGINEERING

Cloud Flight Fare Pipeline

End-to-end flight fare pipeline with a fast local demo stack (Docker + Postgres) and a production-style AWS path (S3 + Redshift), orchestrated with Airflow and modeled with dbt.

Python, Airflow, dbt, Docker, AWS S3, Redshift, GitHub Actions

PYTHON DATA INGESTION

Travelpayouts Flight Collector

Python API ingestion project that collects live Travelpayouts flight fare data and publishes dated CSV snapshots for analytics.

Python, API Ingestion, CSV, Scheduling, pytest, GitHub Actions
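
As a rough illustration of what a dated CSV snapshot can look like, the sketch below writes one file per run, named by date. The function name write_snapshot, the file naming scheme, and the column names are hypothetical; the API call that would produce the fare records is left out.

```python
import csv
from datetime import date
from pathlib import Path


def write_snapshot(fares, out_dir, snapshot_date=None):
    """Write fare records to <out_dir>/fares_<YYYY-MM-DD>.csv and return the path."""
    snapshot_date = snapshot_date or date.today()
    out_path = Path(out_dir) / f"fares_{snapshot_date.isoformat()}.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)

    # Assumed column set; keys outside this list are ignored on write.
    fieldnames = ["origin", "destination", "depart_date", "price"]
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(fares)
    return out_path
```

Putting the date in the file name means a rerun on the same day overwrites that day's file and leaves earlier snapshots untouched.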

Supporting Work

AI REPORTING SAAS

Sumryze - AI-Powered SEO Reporting Dashboard

SaaS-style dashboard for automated SEO reporting, AI-generated summaries, analytics visualizations, and client-ready insights.

Next.js, TypeScript, Tailwind, OpenAI, REST APIs, Vercel

DATA ANALYTICS

Floral Daily SKU Analysis

Sales and inventory analysis project focused on daily SKU movement, reporting, and business decision support.

SQL, Analytics, Reporting

Skills & Tools

Orchestration & Workflow

Apache Airflow, Azure Data Factory, GitHub Actions, Docker

Storage, Lakehouse & Warehousing

ADLS Gen2, Delta Lake, Amazon Redshift, PostgreSQL, S3

Transformation & Modeling

dbt, SQL, Dimensional Modeling, Star Schema, PySpark

Data Processing & Platforms

Python, Pandas, Databricks, Batch Pipelines

Data Quality & CI

dbt Tests, GitHub Actions, pytest, Great Expectations

Analytics Enablement

Power BI, KPI Design, BI Handoff, Documentation

About

I am a Data Engineer focused on building reliable pipelines and analytics-ready datasets for business decision-making.

My work starts with ingestion and data quality checks, then moves through transformation layers, dimensional modeling, and reusable data marts.

I prioritize clear SQL, maintainable Python, automation, and CI practices that keep pipelines stable as data volume and complexity grow.

With a background in analytics and web development, I bridge technical data engineering work with practical reporting, dashboard, and business needs.

Contact

Interested in collaborating on data engineering work or portfolio projects? Reach out and I will follow up.