Data Engineer | ML & GenAI Enthusiast | Building Smart Systems that Think, Learn & Scale | 📍 Boston, MA
I'm a Data Engineer and AI Engineer in the making, passionate about crafting intelligent data pipelines and LLM-powered systems that make information flow seamlessly.
Currently pursuing my Master’s in Data Analytics Engineering @ Northeastern University (GPA: 3.84/4.0), where I explore the sweet spot between data infrastructure, machine learning, and generative AI.
- 🎓 Master's in Data Analytics Engineering @ Northeastern University (2024-2026)
- 💼 Former ML Engineer (AI/ML Team Lead) @ Kubby
- 🏆 Teaching Assistant @ Machine Learning Operations (MLOps) course
- 🏆 Research Assistant @ Movement Neuroscience Lab
⚙️ I love solving data bottlenecks, automating the boring stuff, and pushing AI to be faster, smarter, and more explainable.
I’m currently seeking Spring 2026 internships and full-time opportunities starting January 2026 in:
Data Engineering | MLOps | AI Systems | Cloud Infrastructure
- Built multi-label classifiers for sEMG gesture recognition — improved accuracy by 25%.
- Designed synthetic data generation pipelines, cutting calibration time by 50%.
- Processed real-time sensor data with LabGraph, achieving 10× faster training.
- Mentored 100+ students in Vertex AI, Airflow & Docker for ML deployment.
- Designed hands-on labs and led evaluations for 25+ student teams.
- Contributed to an open-source MLOps repository for scalable learning.
- Engineered the Kubby IQ Chatbot using RAG + GPT-4 + AWS OpenSearch, achieving 95% accuracy.
- Migrated recommendation systems to hybrid collaborative filtering, enhancing personalization.
- Built NLP-based tagging using TF-IDF & NER, boosting search accuracy by 30%.
- Implemented RAG + embeddings for healthcare data, achieving 97% retrieval accuracy.
- Leveraged GPT-4, Mistral, Llama & Gemini Pro to create context-aware healthcare assistants.
- Led KPI analysis for Atavus Football, mining data from 150+ games to improve defense strategy.
- Optimized ETL pipelines in Tableau and AWS S3, cutting data latency by 30%.
- Partnered with analytics teams to create data-driven insights that improved business outcomes.
- Delivered Agile-based cloud solutions that scaled across teams and departments.
Northeastern University, Boston, MA
M.S. in Data Analytics Engineering — GPA: 3.84/4.0 (Expected Dec 2025)
📘 Key Coursework: MLOps · NLP · Data Mining · Generative AI · Cloud Data Management
Anna University, Chennai, India
B.E. in Computer Science — GPA: 3.3/4.0 (June 2023)
📘 Key Coursework: Machine Learning · Data Structures · Cloud Computing · Software Development
AI counselor using transformers & GPT for mental health support
- Achieved 81% accuracy, reduced inference cost by 95%, and deployed on AWS EC2 with CI/CD.
Real-time climate insight platform with automated model retraining
- Reduced latency by 30%, maintained 90% accuracy, and automated deployment via GitHub Actions.
Predicting market sentiment using BERT/GPT/Llama-2
- Improved stock prediction accuracy by 15% and achieved 86.57% model accuracy.
Understanding ad impact by combining text & visual signals
- Boosted prediction reliability by 15% with optimized training.
Visualized $750M in loan data from 5.7M+ applications to uncover lending trends.
🧩 More projects available on my GitHub
- Mentored incoming international graduate students in their first semesters.
- Advocated for 100,000+ international graduate students across 100+ U.S. institutions.
- Represented 18,000+ graduate students across 9 colleges.
- Created Power BI dashboards improving transparency by 25%.
- Led community events — Fall Brunch, Karaoke Night, Sustainability Drive — enhancing engagement.
📘 Secure Smart Cabin Using Optimized Arduino GSM Interface
Springer — Advances in Computing & Information (ERCICA 2023)
🔗 DOI: 10.1007/978-981-99-7622-5
🌀 Distributed AI & Edge ML
🧩 RAG + Knowledge Graphs for Context-Aware LLMs
☁️ MLOps Automation in GCP & AWS
🧠 Multimodal Model Optimization
💬 Open to collaborations, hackathons, and deep tech discussions!
📧 Email: guhan.p@northeastern.edu
💼 LinkedIn: linkedin.com/in/your-link
💻 GitHub: github.com/your-link
🌐 Portfolio: your-portfolio-link.com
“Building systems that scale, models that learn, and data that tells stories.”