Data Engineer with 6+ years of backend experience, specializing in SQL, ETL workflows, and data validation. Strong in Python and relational databases, with experience building data pipelines, cleaning datasets, and ensuring data quality for analytics.
Built ETL pipelines for collecting, transforming, and validating data (see the pipeline sketch after this list).
Wrote complex SQL queries (joins, aggregations, window functions) for analysis and reporting; a window-function example follows this list.
Performed data cleaning and quality checks to ensure accuracy and consistency.
Designed optimized MySQL schemas for large datasets (schema sketch below).
Investigated and resolved data-related issues.
Worked in Linux environments using command-line tools for data processing.
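A minimal sketch of the ETL-with-validation pattern described in the pipeline and data-quality bullets above, assuming a hypothetical orders dataset; the file paths, column names, and validation rules are illustrative, not taken from any production system.

```python
"""Minimal ETL sketch: extract CSV, validate, transform, load to Parquet.

All paths, column names, and rules here are illustrative assumptions.
"""
import pandas as pd

REQUIRED_COLUMNS = {"order_id", "customer_id", "amount", "created_at"}

def extract(path: str) -> pd.DataFrame:
    # Extract: read the raw CSV, parsing timestamps up front.
    return pd.read_csv(path, parse_dates=["created_at"])

def validate(df: pd.DataFrame) -> pd.DataFrame:
    # Data-quality checks: schema, nulls, duplicates, simple range rules.
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if df["order_id"].isna().any():
        raise ValueError("null order_id values found")
    # Drop duplicates and invalid rows rather than failing the whole batch.
    df = df.drop_duplicates(subset="order_id")
    return df[df["amount"] >= 0]

def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Transform: derive a reporting column from the raw timestamp.
    return df.assign(order_date=df["created_at"].dt.date)

def load(df: pd.DataFrame, path: str) -> None:
    # Load: write a columnar file for downstream analytics (needs pyarrow).
    df.to_parquet(path, index=False)

if __name__ == "__main__":
    load(transform(validate(extract("orders.csv"))), "orders.parquet")
```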
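An example of the window-function style of query mentioned above, run through DuckDB's in-process engine so it is self-contained; the orders table and its values are hypothetical.

```python
"""Window-function example over a hypothetical orders table, via DuckDB."""
import duckdb

con = duckdb.connect()  # in-memory database, so the example is self-contained
con.execute("""
    CREATE TABLE orders AS
    SELECT * FROM (VALUES
        (1, 'alice', DATE '2024-01-05', 120.0),
        (2, 'alice', DATE '2024-02-03',  80.0),
        (3, 'bob',   DATE '2024-01-20', 200.0),
        (4, 'bob',   DATE '2024-03-11',  50.0)
    ) AS t(order_id, customer, order_date, amount)
""")

# Per-customer running total and order rank: PARTITION BY + ORDER BY windows.
print(con.execute("""
    SELECT
        customer,
        order_date,
        amount,
        SUM(amount)  OVER (PARTITION BY customer ORDER BY order_date) AS running_total,
        ROW_NUMBER() OVER (PARTITION BY customer ORDER BY order_date) AS order_rank
    FROM orders
    ORDER BY customer, order_date
""").fetchdf())  # fetchdf returns a pandas DataFrame
```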
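A sketch of the schema-design choices the MySQL bullet implies, for a hypothetical append-only events table: InnoDB, a composite index matching a per-user time-range query pattern, and RANGE partitioning by date so old data can be dropped cheaply. All names and partition boundaries are assumptions.

```python
"""Hypothetical MySQL DDL for a large, append-only events table."""
EVENTS_DDL = """
CREATE TABLE events (
    event_id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id    BIGINT UNSIGNED NOT NULL,
    event_type VARCHAR(64)     NOT NULL,
    payload    JSON            NULL,
    created_at DATETIME        NOT NULL,
    -- The partition column must appear in every unique key, so it joins the PK.
    PRIMARY KEY (event_id, created_at),
    -- Composite index matching the dominant access path: per-user time ranges.
    KEY idx_user_time (user_id, created_at)
) ENGINE=InnoDB
-- Monthly partitions keep indexes small and let old data be dropped cheaply.
PARTITION BY RANGE COLUMNS (created_at) (
    PARTITION p2024_01 VALUES LESS THAN ('2024-02-01'),
    PARTITION p2024_02 VALUES LESS THAN ('2024-03-01'),
    PARTITION pmax     VALUES LESS THAN (MAXVALUE)
);
"""

if __name__ == "__main__":
    print(EVENTS_DDL)  # apply with any MySQL client, e.g. the mysql CLI
```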
Projects
Run/Walk Data Pipeline — Built a batch pipeline (CSV → Parquet → DuckDB) with SQL transformations and materialized tables. Orchestrated the pipeline with Apache Airflow and exposed data via a Streamlit dashboard and a Flask API for reporting and validation. (GitHub)
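A condensed sketch of how this project's orchestration could look with Airflow's TaskFlow API (Airflow 2.x); the paths, table names, and daily schedule are assumptions, not the project's actual code.

```python
"""Sketch of a CSV -> Parquet -> DuckDB batch pipeline as an Airflow DAG.

Paths, table names, and the schedule are illustrative assumptions.
"""
import duckdb
import pandas as pd
import pendulum
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def run_walk_pipeline():
    @task
    def csv_to_parquet() -> str:
        # Stage the raw activity export as columnar Parquet (needs pyarrow).
        df = pd.read_csv("/data/raw/activities.csv", parse_dates=["started_at"])
        out = "/data/staged/activities.parquet"
        df.to_parquet(out, index=False)
        return out

    @task
    def load_duckdb(parquet_path: str) -> None:
        # Materialize an analytical table; DuckDB reads Parquet natively.
        con = duckdb.connect("/data/warehouse/runwalk.duckdb")
        con.execute(
            f"CREATE OR REPLACE TABLE activities AS "
            f"SELECT * FROM read_parquet('{parquet_path}')"
        )
        con.close()

    load_duckdb(csv_to_parquet())

run_walk_pipeline()
```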
Analytics Platform — Built an end-to-end data platform with an ETL pipeline, DuckDB storage, a FastAPI service, and a Streamlit dashboard. Ingested and normalized JSONL telemetry into analytical tables for querying and reporting. (GitHub)
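A compact sketch of the ingest-and-serve shape this project describes: JSONL telemetry normalized into a DuckDB table, then exposed through one FastAPI endpoint. The file, table, and field names are assumed for illustration.

```python
"""Sketch: normalize JSONL telemetry into DuckDB and serve it via FastAPI.

File, table, and field names are illustrative assumptions.
"""
import duckdb
from fastapi import FastAPI

DB_PATH = "analytics.duckdb"
app = FastAPI()

def ingest(jsonl_path: str) -> None:
    # read_json_auto handles newline-delimited JSON and infers the schema;
    # the CAST/COALESCE calls normalize types for the analytical table.
    con = duckdb.connect(DB_PATH)
    con.execute(f"""
        CREATE OR REPLACE TABLE telemetry AS
        SELECT
            CAST(device_id AS VARCHAR)          AS device_id,
            CAST(ts AS TIMESTAMP)               AS ts,
            COALESCE(CAST(value AS DOUBLE), 0)  AS value
        FROM read_json_auto('{jsonl_path}')
    """)
    con.close()

@app.get("/devices/{device_id}/daily")
def daily_stats(device_id: str):
    # Aggregate per-day metrics for one device.
    con = duckdb.connect(DB_PATH, read_only=True)
    rows = con.execute(
        """
        SELECT CAST(ts AS DATE) AS day, COUNT(*) AS events, AVG(value) AS avg_value
        FROM telemetry
        WHERE device_id = ?
        GROUP BY day
        ORDER BY day
        """,
        [device_id],
    ).fetchall()
    con.close()
    return [{"day": str(d), "events": n, "avg_value": v} for d, n, v in rows]
```

Ingestion would run once or on a schedule via `ingest('telemetry.jsonl')`; the API is then served with `uvicorn module:app`.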