I build production AI systems that reason, plan, and execute autonomously — from multi-agent orchestration to enterprise RAG pipelines, LoRA fine-tuning at scale, and multi-adapter inference serving.

🔭 Currently Building:
Production multi-agent orchestration with Google ADK — coordinator, planner, coder & reviewer agents
Knowledge Graph + RAG with Neo4j for intelligent document QA
Hybrid search RAG with guardrails, reranking & Langfuse observability
Natural language → SQL with self-correction & multi-dialect support
Build language models from zero — tokenizer, transformer, training, alignment
Production LoRA/QLoRA fine-tuning — YAML recipes, MLflow, vLLM serving
One base model, many adapters per request — OpenAI-compatible inference gateway
Adapter lifecycle — train, evaluate, merge (TIES/DARE), version & publish
Specialize LLMs for medical, legal, finance & code — domain benchmarks, curriculum training, safety guardrails
ML-powered insurance fraud detection — 10 expert rules, PyCaret AutoML, explainable decisions
Designed the first Firestore session service — transactional state, subcollection events, batch deletes. Its design patterns were adopted in the official implementation by a Google engineer, who credited the work.
Contributed FirestoreSessionService with 19 unit tests, in-memory mocks, and production-grade design — race-safe transactions, N+1 query elimination, async batch deletes.
Fixed bugs in the LangGraph adapter — trailing-slash route handling, fork config passthrough, and message ID validation for regenerate streams.
Added a built-in history processor that repairs orphaned tool calls/results to prevent provider 400 errors, and fixed LLM-as-judge reason-field pollution from reasoning models.
Added type and integer range validation to GGUFWriter.add_key_value — catches type mismatches and integer overflow/underflow before they silently corrupt model metadata.
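
As a rough illustration of the kind of check described in the GGUFWriter item above, here is a minimal sketch of type and range validation for fixed-width integers. The names `INT_RANGES` and `check_int_range` are hypothetical and are not part of the gguf package; the actual change to `add_key_value` may be structured quite differently.

```python
# Hypothetical helper (not the gguf library's API): rejects values that
# would not fit the declared fixed-width integer type.
INT_RANGES = {
    "uint8": (0, 2**8 - 1),
    "int8": (-(2**7), 2**7 - 1),
    "uint32": (0, 2**32 - 1),
    "int32": (-(2**31), 2**31 - 1),
    "uint64": (0, 2**64 - 1),
    "int64": (-(2**63), 2**63 - 1),
}


def check_int_range(value, type_name):
    """Return value unchanged if it is an int within range for type_name."""
    if type_name not in INT_RANGES:
        raise KeyError(f"unknown integer type: {type_name}")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(
            f"expected int for {type_name}, got {type(value).__name__}"
        )
    low, high = INT_RANGES[type_name]
    if not low <= value <= high:
        raise ValueError(
            f"{value} is out of range for {type_name} [{low}, {high}]"
        )
    return value
```

Under these assumptions, `check_int_range(256, "uint8")` raises `ValueError` and `check_int_range("7", "int32")` raises `TypeError`, so a bad value fails loudly at write time rather than ending up in the file.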
📋 Full Tech Breakdown
LLM Providers: OpenAI • Anthropic • Google Gemini • Llama • Mistral
Agent Frameworks: Google ADK • A2A Protocol • MCP Tools • LangGraph • CrewAI
RAG Stack: LlamaIndex • LangChain • Neo4j • OpenSearch • Pinecone • Weaviate
Observability: Langfuse • MLflow • Weights & Biases • OpenTelemetry
Inference: vLLM • Multi-LoRA Serving • TensorRT-LLM • ONNX Runtime
Fine-tuning: LoRA • QLoRA • DoRA • Unsloth • Axolotl • DeepSpeed • RLHF/DPO
Model Building: PyTorch Transformers • BPE Tokenizers • GGUF/ONNX Export
Frontend: React • Vite • Next.js • TypeScript • TailwindCSS
Backend: FastAPI • Python • Node.js • GraphQL
Cloud: AWS (Bedrock, SageMaker) • GCP (Vertex AI) • Azure
Infrastructure: Docker • Kubernetes • Terraform • GitHub Actions



