Data-Engineering-Projects

View the Project on GitHub liev2525/Data-Engineering-Projects

⚙️ Data Engineering Portfolio

Welcome to my Data Engineering project repository.

I am a Data Engineer with experience designing and validating end-to-end data pipelines within enterprise banking and regulatory environments.

💼 Background:

Data Engineer II working across DEV/SIT/PAT environments
Strong focus on data lineage, S2T mapping, and reconciliation
Experience supporting AML/KYC and regulatory reporting systems

👉 Explore my full background, work history, and certifications

This repository highlights projects focused on:

ETL/ELT pipeline development
Medallion architecture (Bronze, Silver, Gold)
Data quality and validation frameworks
Cloud-based data platforms (Databricks, Microsoft Fabric, Azure)
Pipeline automation and CI/CD integration

🛠️ Tools & Technologies: Databricks | Snowflake | dbt | PySpark | Spark SQL | Delta Lake | Azure | Docker | GitLab CI/CD

📁 Each project includes:

Architecture overview
Data ingestion (Bronze layer)
Transformation logic (Silver layer)
Business-ready outputs (Gold layer)
Data validation and quality checks
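
As a rough illustration of the layered flow listed above, here is a minimal sketch in plain Python (deliberately not PySpark, so it runs without a cluster or extra dependencies). The sample records, column names, and function names are hypothetical and are not taken from any project in this repository.

```python
from collections import defaultdict

# Hypothetical raw records standing in for a Bronze-layer ingest (all strings, as landed).
BRONZE = [
    {"order_id": "1", "material": "steel", "units": "10", "unit_price": "2.50"},
    {"order_id": "2", "material": "Steel ", "units": "4", "unit_price": "2.50"},
    {"order_id": "2", "material": "Steel ", "units": "4", "unit_price": "2.50"},  # duplicate key
    {"order_id": "3", "material": "aluminum", "units": "", "unit_price": "3.00"},  # missing units
]


def to_silver(bronze_rows):
    """Deduplicate on order_id, trim/lowercase text, and cast types.

    Returns (silver_rows, rejected_rows); rows that fail casting are rejected.
    """
    silver, rejected, seen = [], [], set()
    for row in bronze_rows:
        if row["order_id"] in seen:
            continue  # keep only the first row per order_id
        seen.add(row["order_id"])
        try:
            silver.append({
                "order_id": int(row["order_id"]),
                "material": row["material"].strip().lower(),
                "units": int(row["units"]),
                "unit_price": float(row["unit_price"]),
            })
        except ValueError:
            rejected.append(row)  # quarantine instead of silently dropping
    return silver, rejected


def to_gold(silver_rows):
    """Aggregate Silver rows into revenue per material (a business-ready output)."""
    revenue = defaultdict(float)
    for row in silver_rows:
        revenue[row["material"]] += row["units"] * row["unit_price"]
    return dict(revenue)


def reconcile(bronze_rows, silver_rows, rejected_rows):
    """Validation check: every distinct Bronze key is either in Silver or rejected."""
    distinct_keys = len({r["order_id"] for r in bronze_rows})
    return distinct_keys == len(silver_rows) + len(rejected_rows)


silver, rejected = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
print("reconciled:", reconcile(BRONZE, silver, rejected))
```

Returning rejected rows alongside the cleaned ones is one possible design choice: it lets the reconciliation check account for every source key, which is the kind of count-based control often used when validating a pipeline between layers.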

🚀 Goal: Build scalable, reliable, and governance-aligned data systems.

PROJECT 1: Retail Sales and Inventory Analysis

Brief overview: Conducted an end-to-end analysis of historical performance and future sales potential for shipping containers produced from various materials.
Technology used: ✅ Data Engineering ✅ Lakehouse Architecture ✅ PySpark ✅ Power BI ✅ Semantic Modeling ✅ Docker ✅ Ubuntu/Linux ✅ GitLab CI/CD ✅ DataOps Concepts

Architecture

[Image: Retail analytics end-to-end architecture]

Dashboard Preview

[Image: 10-year forecast dashboard]

📄 View Full PDF Documentation: Supply Chain & Sales Analytics Case Study.pdf