Data Engineering Projects

Cloud Resume Challenge

AWS (S3, CloudFront, Lambda, DynamoDB, API Gateway), IaC (AWS SAM), CI/CD (GitHub Actions), Python, JavaScript

  • Engineered a full-stack, serverless web application on AWS to host a personal resume, featuring a dynamic visitor counter.
  • Developed a Python Lambda function with a DynamoDB NoSQL backend to process API requests and to increment and store the visitor count.
  • Automated the entire infrastructure deployment using Infrastructure as Code (IaC) with the AWS Serverless Application Model (SAM).
  • Built and configured separate CI/CD pipelines using GitHub Actions to deploy the frontend and backend stacks automatically.

Tech Graveyard: Trend Monitoring ELT Pipeline

Python, GitHub Actions, dbt, DuckDB, Streamlit

  • Engineered a serverless, event-driven ELT pipeline using GitHub Actions that automatically tracks programming-language trends with data from the GitHub API.
  • Developed modular, tested data models using dbt to transform raw JSON data into analytics-ready tables in a DuckDB data warehouse.

Work Experience

Data Annotator

Aug 2024 – May 2025

CNTXT AI | Karachi, SD

  • Processed and structured complex multilingual datasets to train industry-specific AI models.
  • Developed Python scripts to automate data cleaning and preprocessing tasks, improving data quality and team efficiency.