Hello! I'm Anuj, a passionate Data Scientist currently pursuing my Master's in Computer Science at the University of Texas at Arlington, graduating in May 2027. I enjoy creating things that live on the internet, whether that's building machine learning models, developing data-driven solutions, or solving complex analytical problems.
With 10+ completed projects under my belt, I've worked with a range of technologies and frameworks. My journey in data science has been driven by a passion for creating efficient, scalable, and user-friendly applications.
Designed machine-learning-powered automation systems that integrate OCR and LLMs for structured data extraction.
Rebuilt a legacy application, reducing system errors by 20% and increasing data reliability.
Coordinated with multiple companies to conduct campus recruitment drives and managed the entire placement process.
Developed a churn prediction model using supervised ML algorithms (Logistic Regression, Random Forest, XGBoost). Performed feature engineering, exploratory analysis, and hyperparameter tuning. Provided actionable insights for retention strategies based on model outputs.
Created a Streamlit application to extract structured fields (date, amount, vendor) from scanned bills. Improved OCR accuracy by 30% using preprocessing and the Google Vision API. Used LLM-based entity extraction (Hugging Face), achieving 85% classification accuracy.
Built a sentiment classification model (~80% accuracy) using NLP preprocessing and ML algorithms. Performed text cleaning, tokenization, stop-word removal, and vectorization. Developed a web interface for real-time analysis via Flask API endpoints.
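The preprocessing steps named above (cleaning, tokenization, stop-word removal, vectorization) can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the project's actual code: the stop-word list, the function names, and the bag-of-words counting scheme are all hypothetical choices made for the example.

```python
from __future__ import annotations

import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real pipelines use a larger one).
STOP_WORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "to", "of"}


def clean(text: str) -> str:
    """Lowercase the text and replace every non-letter, non-space character with a space."""
    return re.sub(r"[^a-z\s]", " ", text.lower())


def tokenize(text: str) -> list[str]:
    """Split cleaned text on whitespace."""
    return text.split()


def remove_stop_words(tokens: list[str]) -> list[str]:
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text: str) -> list[str]:
    """Run cleaning, tokenization, and stop-word removal in order."""
    return remove_stop_words(tokenize(clean(text)))


def build_vocabulary(docs: list[list[str]]) -> dict[str, int]:
    """Map each distinct token to a column index, in alphabetical order."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Turn a token list into a bag-of-words count vector."""
    counts = Counter(tokens)
    vector = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:  # tokens outside the vocabulary are ignored
            vector[vocab[tok]] = n
    return vector


docs = [preprocess("This movie was GREAT, great!"), preprocess("It was a bad movie.")]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]
```

The resulting count vectors are the kind of input a classical classifier could be trained on; a TF-IDF weighting would be a common alternative to raw counts.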
Built a hybrid search system combining semantic embeddings with contextual LLM responses. Implemented a ChromaDB vector store for dense embedding search. Designed an evaluation workflow for retrieval quality and output coherence.
Built a multi-agent orchestration framework enabling autonomous agent-to-agent communication using a standardized A2A protocol. Designed a centralized orchestrator for task routing, context sharing, and structured message passing between specialized agents.
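As a rough illustration of the centralized-orchestrator idea (task routing, shared context, structured messages), here is a minimal sketch. It does not implement the A2A protocol itself and is not the project's code: the `Message` fields, the `Orchestrator` class, and the two toy agents are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Message:
    """A structured message passed between agents (hypothetical shape)."""
    sender: str
    recipient: str
    task: str
    payload: Dict[str, Any] = field(default_factory=dict)


# An agent is modeled as a function of (message, shared context) -> result dict.
Handler = Callable[[Message, Dict[str, Any]], Dict[str, Any]]


class Orchestrator:
    """Routes messages to registered agents and keeps a shared context."""

    def __init__(self) -> None:
        self.agents: Dict[str, Handler] = {}
        self.context: Dict[str, Any] = {}
        self.log: List[Message] = []

    def register(self, name: str, handler: Handler) -> None:
        self.agents[name] = handler

    def send(self, message: Message) -> Dict[str, Any]:
        if message.recipient not in self.agents:
            raise KeyError(f"no agent registered as {message.recipient!r}")
        self.log.append(message)  # keep a trace of routed messages
        result = self.agents[message.recipient](message, self.context)
        self.context.update(result)  # share the result with later agents
        return result


def extractor(msg: Message, ctx: Dict[str, Any]) -> Dict[str, Any]:
    """Toy agent: splits the payload text into tokens."""
    return {"tokens": msg.payload["text"].split()}


def counter(msg: Message, ctx: Dict[str, Any]) -> Dict[str, Any]:
    """Toy agent: counts tokens that an earlier agent left in the shared context."""
    return {"count": len(ctx.get("tokens", []))}


orchestrator = Orchestrator()
orchestrator.register("extractor", extractor)
orchestrator.register("counter", counter)

first = orchestrator.send(Message("user", "extractor", "tokenize", {"text": "a b c"}))
second = orchestrator.send(Message("extractor", "counter", "count"))
```

Routing every message through one object keeps the agents decoupled: each agent only needs to know the message shape and the shared context, not the other agents.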
Published in Metszet Journal. An in-depth analysis of AI technologies impacting the healthcare industry.
An exploration of DeepSeek OCR's revolutionary approach to optical character recognition and image compression technology.
An in-depth analysis of Google's latest AI breakthrough and its impact on the future of artificial intelligence innovation.
Exploring the revolutionary concept of AI-to-AI communication and how autonomous agents are reshaping the future of technology.
A comprehensive comparison of Retrieval-Augmented Generation and Context-Augmented Generation in modern AI systems.
An exploration of the four fundamental types of AI agents and their increasing importance in the evolving landscape of artificial intelligence.
I am currently looking for new opportunities to contribute to software development projects. Whether you have a question or just want to say hi, I'll try my best to get back to you!
Say Hello