AYUSH.GUPTA

Principal GenAI Engineer & Architect

EMAIL GITHUB LINKEDIN

OVERVIEW

Principal Generative AI Engineer and Architect specializing in the design and deployment of production-grade GenAI systems, advanced RAG pipelines, and agentic workflows. Expert at bridging the gap between raw LLM capabilities and high-scale, low-latency production environments. Instrumental in building the Vooz ecosystem, which serves over 500k monthly active users. Committed to a "Systems First" philosophy: building reliable, observable, and cost-optimized AI solutions.

SKILLS

GENAI & AGENTS

LangGraph, AutoGen, CrewAI, LlamaIndex, OpenAI, Function Calling, Prompt Engineering

VECTOR DBS & RETRIEVAL

Qdrant, Pinecone, Milvus, Chroma, LanceDB, Advanced RAG, GraphRAG, Hybrid Search

AI INFRASTRUCTURE

Triton Inference Server, ONNX Runtime, vLLM, Ollama, Dynamic Batching, LoRA, QLoRA

DEVOPS & MLOPS

Azure AKS, ArgoCD, Azure Front Door, OpenTelemetry, Ragas, TruLens, DeepEval, Guardrails

FRONTEND & APPS

Next.js 16, TypeScript, React Native, WebRTC, Framer Motion

BACKEND & DATA

Node.js, Python, SignalR, Memgraph, PostgreSQL, Redis

EXPERIENCE

VOOZ INC

Principal GenAI Architect (Founding Engineer)

Sep 2024 — Present
  • Developed a dual-track moderation pipeline integrating backend LLMs with local edge-inference for real-time safety responses.
  • Spearheaded a Triton Inference Server cluster with dynamic batching, achieving 1,280+ RPS with full OpenTelemetry tracing.
  • Engineered browser-local inference using ONNX Runtime (WebGPU) for real-time content processing.
  • Built a multi-tenant Next.js environment using Parallel and Intercepting Routes, scaling to 500k+ MAU.
  • Orchestrated multi-cloud deployments delivered through ArgoCD and secured at the edge by Azure Front Door.

AIMICA LTD

Software Developer (Lead Role - AI & RAG)

Jan 2023 — Sep 2024
  • Designed and implemented a scalable RAG-based knowledge system integrating open-source LLMs.
  • Led a team of 3 developers building AI-powered mobile apps using React Native (Expo).
  • Developed backend services (Node.js/Python) to support AI orchestration and caching.

INFOEDGE

Mobile Developer

Dec 2021 — Dec 2022
  • Optimized critical React Native flows for high-traffic apps, resolving memory leaks.

PROJECTS

Multi-Agent Workflow Orchestrator

Autonomous agent system using LangGraph and AutoGen for complex task planning and tool execution.

Medical Paper NLP Categoriser

Deep learning model using TensorFlow on PubMed datasets. Achieved 90%+ accuracy in classifying medical abstracts.

EDUCATION

B.Tech in Computer Science

Dr. A.P.J. Abdul Kalam Technical University

2018 — 2022