Artisan of GenAI & Machine Learning · Custodian of MLOps

Aditya Raut

A scholar of intelligent systems
forging models from raw data, tempering them through quantization, and measuring their worth in the crucible of production.

❦

Open Source Contributor

GenAI Intern

Experience

AllCognix AI

Generative AI Engineer Intern

Nov 2025 — Present

Orchestrated the migration of the Verba knowledge base to the modern Haystack 2.x engine and GPT-4o, tempering latency by 40% and token expenditure by 60% through context windowing.
Transformed the ingestion pipeline via concurrent ThreadPoolExecutor thread-vessels, accelerating batch processing from 70s to a swift 27s.
Bypassed costly vision token overhead by preprocessing PDF and image structures with Tesseract OCR, enabling efficient text extraction.
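
The concurrent ingestion pattern mentioned above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: `ingest_document` and `ingest_batch` are hypothetical names, and the per-document work is a placeholder for whatever I/O-bound steps (reading, OCR, embedding calls) the real pipeline performs.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest_document(path: str) -> dict:
    """Hypothetical stand-in for one I/O-bound ingestion step."""
    # A real pipeline would read the file, run OCR, and write to the store here.
    return {"path": path, "status": "ingested"}


def ingest_batch(paths: list[str], max_workers: int = 8) -> list[dict]:
    """Ingest documents concurrently; results keep the order of `paths`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map runs the calls concurrently but yields results in input order.
        return list(pool.map(ingest_document, paths))


if __name__ == "__main__":
    print(ingest_batch(["report.pdf", "scan.png"]))
```

Threads suit this kind of workload because the time is typically spent waiting on disk or network rather than on CPU; the `max_workers` value here is an arbitrary assumption and would need tuning against the real workload.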

deepset-ai/Haystack & run-llama/LlamaIndex

Open Source Contributor

June 2026 — Present

Averted fatal crashes in the DocumentLanguageClassifier when parsing blob-only manuscripts with null contents (content=None), ensuring a graceful fallback in lieu of uncaught TypeErrors. (PR #11419)
Rectified silent metadata corruption within the RecursiveDocumentSplitter, where starting split indices strayed during word or token subdivision; replaced flawed unit-count arithmetic with true overlap character measurements. (PR #11711, PR #11768)
Fixed nested generator serialization in the FallbackChatGenerator, preserving fallback chains via standard component serialization. (PR #11847)
Fixed ineffective stopword removal in the SemanticDoubleMergingSplitterNodeParser via word-level tokenization. (PR #22167)

AI4Chat

Full Stack Developer Intern

Apr 2025 — Oct 2025

Forged a multimodal AI system integrating real-time web search, LLM-driven reasoning, and dynamic response synthesis across text, image, and document modalities.
Designed backend workflows for intelligent query routing, enabling context-aware decisions between search, generation, and analytical pipelines.

Project

The Scholar's Path

As an initiate of the Generative AI arts and an open-source contributor to deepset Haystack and LlamaIndex, I dedicate my craft to building robust RAG pipelines, LLM-driven architectures, and production-grade ML systems. My hands-on labor traverses the entire machine learning stack, from training ConvNeXt-Tiny models with OpenCV and ONNX quantization to serving LightGBM forecasting engines via FastAPI and Docker vessels.

Through meticulous calibration, I have tempered retrieval latency by 40%, reduced token costs by 60%, and compressed ingestion times from 70s to 27s in live environments. Beyond the cloud, I study the stability of edge AI architectures, publishing preprints about framework-dependent quantization on Zenodo and deploying real-time classifiers on the Jetson Nano.

The Quest

Seeking ML Engineer & GenAI Apprenticeships Across Realms

From the bazaars of India to distant shores, I seek opportunities where artificial intelligence meets human ingenuity.

Scientific Contributions

Ongoing Research · Agricultural Edge AI

Framework-Dependent Quantization Stability in Agricultural Edge AI

Preprint Link

A systematic study of model stability under resource-constrained deployment. This work audits dataset integrity and challenges the architectural "fragility" of lightweight backbones by demonstrating the critical role of calibration strategies in quantized inference.

I. The 11.6% Leakage Audit

Used pHash and MD5 verification to identify cross-split contamination in a 14,154-image wheat dataset, establishing a new "Clean" baseline for agricultural computer vision.
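
The exact-duplicate half of such an audit can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the code used in the study: the function names and the in-memory `splits` layout are hypothetical, and only byte-identical files are caught here. Near-duplicates (resized or re-encoded images) would need a perceptual hash such as pHash, which is not shown.

```python
import hashlib
from collections import defaultdict


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of raw file bytes."""
    return hashlib.md5(data).hexdigest()


def find_cross_split_duplicates(
    splits: dict[str, dict[str, bytes]],
) -> dict[str, list[tuple[str, str]]]:
    """Map each digest that appears in more than one split to its (split, file) hits."""
    seen: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for split_name, files in splits.items():
        for file_name, data in files.items():
            seen[md5_of(data)].append((split_name, file_name))
    # Keep only digests whose occurrences span at least two different splits.
    return {
        digest: hits
        for digest, hits in seen.items()
        if len({split for split, _ in hits}) > 1
    }


if __name__ == "__main__":
    # Toy data: "a.jpg" in train and "c.jpg" in test share identical bytes.
    toy = {
        "train": {"a.jpg": b"same-bytes", "b.jpg": b"unique"},
        "test": {"c.jpg": b"same-bytes"},
    }
    print(find_cross_split_duplicates(toy))
```

In practice the byte strings would come from reading image files on disk; they are kept in memory here so the sketch stays self-contained.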

II. Quantization Backend Sensitivity

Demonstrated how Entropy-Calibrated Static Quantization (TensorRT) restores MobileNetV3 accuracy from 31% back to 82.5%, bypassing CPU-based dynamic limitations.

III. Edge Deployment Stability

Engineered HardSwish and LayerNorm deployment patches for stable FP16 and INT8 inference on Jetson Nano, achieving a fluid 54.5 FPS real-time control rate in autonomous environments.

Metric Innovation: Deployment Efficiency Score (DES)

"Accuracy × ln(FPS - 1)" — A multi-objective success criterion designed to prioritize fluid control rates in autonomous field robotics.

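The DES formula above translates directly into a small function. This is a sketch that takes the formula exactly as written; whether accuracy is meant as a fraction (0 to 1) or a percentage is not stated, so the fraction convention used here is an assumption, as is the function name.

```python
import math


def deployment_efficiency_score(accuracy: float, fps: float) -> float:
    """Compute accuracy * ln(fps - 1), following the formula as stated.

    Assumes `accuracy` is a fraction in [0, 1]; this convention is an assumption.
    """
    if fps <= 1:
        # ln(fps - 1) needs a strictly positive argument.
        raise ValueError("DES is undefined for fps <= 1")
    return accuracy * math.log(fps - 1)


if __name__ == "__main__":
    # Illustrative inputs only, not results from the study.
    print(deployment_efficiency_score(0.9, 30.0))
```

Because of the logarithm, the score is zero at exactly 2 FPS and negative between 1 and 2 FPS, and each additional frame per second adds less to the score as the frame rate climbs.
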
Skills

GenAI / RAG

RAG & LLMs
Embedding Models
Semantic Search
Vector Databases
LangChain
OpenAI / Anthropic APIs
Haystack

ML Systems / Backend

Python
FastAPI, Flask
Git, Linux
PostgreSQL / SQL

MLOps & Deployment

Docker, ONNX
Git LFS, Render
DockerHub

ML / DL

PyTorch, CNNs
Transfer Learning
ONNX Runtime
Scikit-learn, XGBoost
LightGBM, Hugging Face
Fine-tuning

Achievements

2nd Place — IIC Udaan 2.0

April 2025

Technical Lead — Google Gemini Student Club

2025 – Present

The Scholar's Cabinet

Mechanical Artistry

Enchanted by the symphony of steel and speed in MV Agusta and Ducati craftsmanship. The De Tomaso P72 stands as modern sculpture in motion.

Celestial Navigation

Drawn to the ballet of combat systems and the dominion of the skies. The SR-71 Blackbird remains a testament to human audacity.

Cosmic Inquiry

Gazing into the firmament with my Phoenix 60700, seeking the infinite. The rings of Saturn whisper secrets of cosmic architecture.

Temple of Form

Honoring the classical ideal through the discipline of bodybuilding. Where physical excellence mirrors the pursuit of mental clarity.