This repository showcases two end-to-end projects focused on data engineering and data analytics using SQL.
The work demonstrates my ability to:

- Design and build data pipelines
- Model data using star schema architecture
- Write efficient analytical SQL queries
- Transform raw data into actionable insights

📂 1_EDA
- Queried a data warehouse (star schema) to answer business questions
- Built analytical queries to identify:
  - Most in-demand skills
  - Highest-paying skills
  - Optimal skills (balancing demand & salary)
- Used multi-table joins across fact and dimension tables
- Created derived metrics using aggregation and mathematical functions
- SQL (joins, aggregations, filtering, grouping)
- DuckDB (analytical query engine)
- Star schema (fact + dimension + bridge tables)
- Functions like COUNT(), MEDIAN(), LN(), ROUND()
- How to translate business questions into SQL queries
- Writing efficient analytical queries on structured data
- Understanding trade-offs between demand and compensation
- Working with real-world messy datasets
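
To illustrate the kind of "optimal skills" query described above, here is a minimal, self-contained sketch in DuckDB SQL. The table and column names (`job_fact`, `skill_dim`, `job_skill_bridge`), the sample rows, and the scoring formula (natural log of demand multiplied by median salary) are assumptions made for this example and may differ from the actual project files.

```sql
-- Hypothetical miniature star schema: one fact, one dimension, one bridge table.
CREATE TABLE skill_dim (skill_id INTEGER, skill_name VARCHAR);
CREATE TABLE job_fact (job_id INTEGER, job_title VARCHAR, salary_year_avg DOUBLE);
CREATE TABLE job_skill_bridge (job_id INTEGER, skill_id INTEGER);

-- Made-up sample rows, only so the query below has something to run on.
INSERT INTO skill_dim VALUES (1, 'sql'), (2, 'python'), (3, 'airflow');
INSERT INTO job_fact VALUES
    (101, 'Data Engineer', 120000),
    (102, 'Data Engineer', 140000),
    (103, 'Data Engineer', NULL),
    (104, 'Data Engineer', 160000);
INSERT INTO job_skill_bridge VALUES
    (101, 1), (101, 2),
    (102, 1),
    (103, 1),
    (104, 1), (104, 2), (104, 3);

-- Demand, median salary, and an assumed "optimal" score per skill.
-- LN() dampens raw demand so very common skills do not dominate the ranking.
SELECT
    sd.skill_name,
    COUNT(*) AS demand_count,
    ROUND(MEDIAN(jf.salary_year_avg), 0) AS median_salary,
    ROUND(LN(COUNT(*)) * MEDIAN(jf.salary_year_avg) / 1000, 2) AS optimal_score
FROM job_fact AS jf
JOIN job_skill_bridge AS jsb ON jf.job_id = jsb.job_id
JOIN skill_dim AS sd ON jsb.skill_id = sd.skill_id
WHERE jf.salary_year_avg IS NOT NULL  -- ignore postings without salary data
GROUP BY sd.skill_name
ORDER BY optimal_score DESC;
```

With this formula a skill that appears only once scores zero, because LN(1) = 0. Whether that is desirable depends on the business question, so the weighting is best treated as a tunable choice.
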

📂 2_Mart_Build_DW
- Built an end-to-end ETL pipeline
- Designed fact, dimension, and bridge tables
- Created multiple data marts
- Implemented incremental updates using MERGE
- DuckDB
- SQL (DDL + DML)
- ETL pipeline design
- Star schema & dimensional modeling
- Incremental processing
- Designing production-style data pipelines
- Importance of data modeling
- Writing modular SQL
- Handling incremental updates
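
The incremental update step mentioned above can be sketched as follows. This is a minimal example using standard SQL `MERGE` syntax and assumes a DuckDB version that supports `MERGE INTO`. The table names (`skill_dim`, `staging_skills`) and rows are hypothetical and only show the pattern: update rows that already exist, insert the ones that do not.

```sql
-- Hypothetical target dimension table with existing rows.
CREATE TABLE skill_dim (skill_id INTEGER, skill_name VARCHAR);
INSERT INTO skill_dim VALUES (1, 'sql'), (2, 'pythn');  -- row 2 holds a typo to be corrected

-- Hypothetical staging table holding the latest batch of data.
CREATE TABLE staging_skills (skill_id INTEGER, skill_name VARCHAR);
INSERT INTO staging_skills VALUES (2, 'python'), (3, 'airflow');

-- Upsert: matched rows are updated in place, unmatched rows are inserted.
MERGE INTO skill_dim
USING staging_skills
ON (skill_dim.skill_id = staging_skills.skill_id)
WHEN MATCHED THEN
    UPDATE SET skill_name = staging_skills.skill_name
WHEN NOT MATCHED THEN
    INSERT (skill_id, skill_name)
    VALUES (staging_skills.skill_id, staging_skills.skill_name);

-- Inspect the result: row 1 untouched, row 2 updated, row 3 newly inserted.
SELECT * FROM skill_dim ORDER BY skill_id;
```

Compared with dropping and rebuilding the table on every run, this pattern only touches rows present in the new batch, which is the main appeal of incremental processing.
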
- 🐤 DuckDB
- 🧮 SQL
- ☁️ Google Cloud Storage
- 🛠️ VS Code
- 📦 Git & GitHub
.
├── 1_EDA/
├── 2_Mart_Build_DW/
├── Resources/Images
└── README.md
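- Built a full data workflow, from ETL pipeline and data marts to analytical queries
- Demonstrated both analytics (querying a star schema to answer business questions) and engineering (building the warehouse and marts)
- Applied real-world practices such as dimensional modeling, modular SQL, and incremental updates with MERGE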
- Add dashboards
- Automate pipeline
- Optimize queries
Feel free to connect!