
RAJAT SINGH

Data Engineer
Building Scalable Data Infrastructure
1M+ Daily Records
99.9% Uptime
30% Cost Reduction
40% Faster Queries

PROFESSIONAL EXPERIENCE

Data Engineer
Chegg India
Jul 2024 – Present
  • Engineered and maintained robust data pipelines using Apache Airflow, Databricks, PySpark, and AWS Redshift, processing 1M+ records daily for large-scale reporting and analytics workloads with 99.9% uptime.
  • Automated API ingestion workflows from Google Ad Manager, AdSense, and AdMob, integrated with NetSuite and Braintree payment systems, to support ad revenue tracking and financial reporting; reduced manual effort by 80%, improved data freshness from 24h to near real time, and cut operational costs by 30%.
  • Led migration of 20+ legacy ETLFM pipelines to Airflow, implementing modern lakehouse architecture using AWS S3, Delta Lake, and Delta Live Tables, reducing query latency by 40% and improving scalability.
  • Established a comprehensive data quality framework with anomaly detection and validation checks in partnership with analytics teams, achieving 99.5% data accuracy and reducing data incidents by 60%.
Data Engineer Intern
Chegg India
Jan 2024 – Jul 2024
  • Onboarded 15+ diverse data sources to Databricks lakehouse and architected Airflow DAGs for scheduled ingestion with SLA monitoring and alerting, reducing onboarding time by 50%.
  • Conducted rigorous QA and data validation for RIO event tracking, resolving 30+ data quality issues and supporting cross-functional teams in issue triage, improving overall pipeline reliability by 35%.
  • Collaborated on instrumentation of IRD documents and monitored New Relic events for newly launched features, ensuring data integrity.

TECHNICAL PROJECTS

🎵
Real-Time Song Recommender
AI-powered emotion detection system using Python, OpenCV, and CNN models to analyze facial expressions in real time with 85% accuracy. Integrated the Spotify API to dynamically curate personalized playlists based on detected moods.
Python, OpenCV, CNN, Spotify API
📞
Phonebook Directory System
High-performance C++ CLI application with full CRUD operations using doubly linked lists for O(1) insertion/deletion and bidirectional traversal. Implemented binary search, multiple sorting algorithms, and comprehensive input validation.
C++, Data Structures, Algorithms, CLI

TECHNICAL SKILLS

☁️
Cloud & Infrastructure
AWS (S3, Redshift, Lambda, EC2), GCP
Data Processing
Apache Airflow, Databricks, PySpark, Delta Lake, SQL
🐍
Programming
Python, SQL, C++, C
🔧
Tools & Concepts
Docker, Git, PostgreSQL, ETL/ELT, Data Lakehouse, CI/CD

EDUCATION

🎓
Delhi Technological University
Bachelor of Technology in Computer Science and Engineering
Aug 2020 – Jun 2024
CGPA: 8.56/10.0

CERTIFICATIONS

🏆
Gremlin Certified Chaos Engineering Practitioner
Gremlin
📜
C for Everyone: Programming Fundamentals
Coursera
📚
Text Mining and Analytics
Coursera

