AI Engineer Portfolio — Prince Singh | LLM, RAG, Full Stack, Cloud, DevOps

AI/ML Engineer | LLM Engineer | RAG Developer | Full-Stack Engineer | Founding Engineer | Cloud & DevOps Engineer

Prince Singh is an AI Engineer, Full-Stack Developer, and Founding Engineer with expertise in modern AI systems, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), LangChain, vector databases, multi-agent systems, cloud computing, DevOps, scalable architectures, and end-to-end product engineering. His portfolio represents real-world engineering experience across AI, ML, full-stack development, and high-performance web applications.

Large Language Model (LLM) Engineering

Prince builds advanced LLM workflows including custom prompts, embeddings, hybrid search, token optimization, context building, and production-grade inference systems. He works with OpenAI, GPT models, LangChain, and vector stores to create intelligent and scalable AI applications.

RAG Pipeline Engineering

Expertise includes document chunking, embeddings generation, semantic search, ChromaDB, Pinecone, context ranking, vector search optimization, and end-to-end RAG pipelines used in production environments with low latency and high accuracy.
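The core retrieval step behind such a pipeline can be sketched in a few lines of TypeScript. This is an illustrative sketch under stated assumptions, not production code: the `retrieve` helper, its top-k and 0.25-threshold defaults, and the toy two-dimensional embeddings are invented for the example; real embeddings would come from a model such as text-embedding-ada-002, with a vector store like ChromaDB or Pinecone doing the indexing.

```typescript
// Minimal semantic-search core: rank document chunks by cosine similarity
// against a query embedding, keep only chunks above a quality threshold.

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Chunk { id: string; embedding: number[] }

// Return the top-k chunks above the similarity threshold, mirroring the
// quality-gating step of a RAG retriever.
function retrieve(query: number[], chunks: Chunk[], k = 3, threshold = 0.25): Chunk[] {
  return chunks
    .map(c => ({ chunk: c, score: cosineSimilarity(query, c.embedding) }))
    .filter(r => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(r => r.chunk);
}
```

The threshold drops off-topic chunks before they ever reach the prompt, which is the cheapest place to fight hallucination.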

Agentic AI & Multi-Agent Systems

Designs autonomous agents capable of tool calling, reasoning, planning, workflow execution, code generation, debugging, research automation, and contextual problem solving powered by multi-step LLM reasoning and memory components.

AI Product Engineering

Prince has built AI-powered platforms like RoadmapAI, AskAI, CodeLLM, contextual AI code editors, peer-to-peer AI tools, and intelligent coding assistants that combine full-stack engineering with advanced LLM capabilities.

Full-Stack Engineering (React, Next.js, Node.js, TypeScript)

Skilled in Next.js, React.js, Node.js, Express.js, TypeScript, MongoDB, PostgreSQL, Redis, REST APIs, GraphQL, WebSockets, authentication systems, SSR/ISR, and building responsive and scalable frontend and backend applications.

Cloud Engineering & DevOps

Hands-on experience with AWS, Docker, Kubernetes, CI/CD pipelines, GitHub Actions, EC2, S3, load balancing, scaling APIs, containerization, microservices, observability, and high-performance deployments optimized for millions of requests.

System Design & Architecture

Expertise in designing scalable distributed systems, real-time architectures, caching layers, pub/sub messaging, event-driven systems, serverless functions, and fault-tolerant engineering used in modern SaaS products.

Competitive Programming & DSA

Solved 5000+ DSA problems across LeetCode, GFG, CodeStudio, InterviewBit, and HackerEarth. Strong foundation in algorithms, data structures, problem solving, and coding interviews. Ranked in top competitive programming brackets with global achievements.

Founding Engineer Experience

Experienced as a Founding Engineer owning end-to-end product development, architecture, feature planning, user-facing engineering, backend optimization, LLM integrations, cloud deployments, reliability engineering, and building products at startup speed.

Remote AI Engineer | Global Collaboration

Proven track record working with international teams, remote-first startups, and cross-timezone engineering environments. Experienced in delivering scalable and clean engineering solutions in distributed teams.

Software Engineer Portfolio

This portfolio reflects expertise in frontend engineering, backend API development, AI systems, cloud pipelines, microservices, scalable infrastructures, and high-quality modern applications designed with user-first engineering.

Developer Portfolio

Explore work spanning AI engineering, machine learning projects, full-stack applications, intelligent tools, design systems, SaaS products, open source contributions, and real-world production code used by thousands of users.

Hello 👋

Prince Singh | Founding Engineer

Looking for a Switch
Interview Ready

Founding Engineer & AI Architect @ProPeers | Ex-SDE @CloudConduction | 2 YOE | Mentor @ProPeers & @Topmate.io | Building Agentic AI & LLM Systems | MERN + DevOps + Scalable Infra | System Design & Vector Search | LeetCode Knight 👑 | GFG Inst. Rank 1 🥇 | InterviewBit Global 13 🥇 | CodeStudio Specialist 🌞

Cracked 4 national and international remote jobs as a fresher (4x Remote SDE)

About

I'm a Founding Engineer & AI Architect with 2 years of hands-on experience building large-scale AI systems, distributed backend infrastructures, and production-grade full-stack platforms. At ProPeers, I own and engineer the core systems that power 80%+ of total platform traffic, including Roadmaps, RoadmapAI, AskAI, CodeLLM, and the Contextual AI Code Editor.
I design high-scale backend architectures, real-time data pipelines, aggregation engines for 100K+ users, Redis-backed caching layers, search-validation systems, role-based access flows, rate-limiting frameworks, and CI/CD deployment automation that cut release time by 34% and improved reliability across 150+ microservices. Through SSR, dynamic imports and hybrid rendering patterns, I’ve reduced key user journey response times from 1.1s → 200ms, delivering a noticeably smoother product experience.
As an AI Architect, I build Agentic AI pipelines, RAG retrieval systems, MCP protocol layers, and multi-model inference workflows using Azure OpenAI, Azure Databricks, GPT models, and Llama 3.x OSS models. My work spans token optimization, context-window compression, semantic chunking, and adaptive prompt engineering to deliver intelligent experiences at <1s latency under real production traffic.
I’ve engineered RoadmapAI with a self-learning RAG pipeline (text-embedding-ada-002, ChromaDB, semantic filters, vector enrichment), achieving ~99% roadmap accuracy and lifting roadmap ratings from the early 12% baseline. I built CodeLLM, a production AI judge featuring multi-language detection, dual-layer JSON parsing, COMPILATION/RUNTIME/VALIDATION error classification, and deterministic verdict synthesis for educational code evaluation.
I developed AskAI with MCP-layered prompts, resource-type detection (roadmap/article/practice), O1/O3 model routing, token metering, and auto-structured responses, improving resolution speed and engagement. I also built the AI Code Editor with ~40ms inference, inline reasoning, multi-language execution, and deep integration with RoadmapAI and CodeLLM, significantly boosting editor retention.
Beyond AI flows, I’ve implemented token-based tiered access systems (one-time/monthly/yearly) on top of these capabilities, and engineered self-optimizing RAG pipelines and distributed multi-model inference workflows that balance accuracy, cost and latency under real-world traffic.
On the product and platform side, I’ve delivered Individual Roadmap Communities, scalable live-stream pipelines, error-resilient API layers, multi-step onboarding flows, connected roadmap progress engines, and search validation systems ensuring hallucination-free retrieval across Roadmaps and RoadmapAI.
At the infrastructure layer, I’ve reduced downtime by 90% (4 hours → 45 mins/month), stabilized Azure VM workloads, eliminated Bastion and high-cost D8 VM footprints, fixed bandwidth cost spikes, and built high-availability fallback layers with cache-first routing and distributed failover.
Day-to-day, I work across MERN + TypeScript, Node.js microservices, Docker/Kubernetes, Azure Cloud, Databricks, CI/CD automation, Prometheus/Grafana observability, and async caching pipelines powering 100K+ monthly active operations.
Outside core engineering, I’m a Problem-Solving & DSA Enthusiast with 5000+ problems solved, a 1400+ day coding streak, and top 0.1% global rankings across platforms. As a mentor to 40,000+ learners, I help engineers master DSA, System Design, Development, DevOps, and Remote Job Preparation, guiding them from theory to real-world success.
I love building scalable systems, intelligent architectures, and next-generation AI-first engineering experiences that blend reliability, performance, and deep technical innovation.

Experience


ProPeers

Founding Engineer

July 2025 – Present · Delhi, India · Remote

  • Architected the full AI ecosystem powering RoadmapAI, CodeLLM, AskAI, and the AI Code Editor, building Agentic AI pipelines, RAG systems, MCP server architecture, and LLM orchestration that now drive 80%+ of total platform traffic.
  • Engineered RoadmapAI end-to-end with a self-learning RAG pipeline (text-embedding-ada-002, ChromaDB, semantic filtering, adaptive difficulty) and MCP-layered prompts, achieving sub-second inference and large-scale personalization.
  • Delivered ~99% personalized roadmap accuracy using Agentic flows, structured prompt masks, multi-model routing, and RAG optimization, directly improving RoadmapAI user ratings from the early 12% baseline.
  • Built CodeLLM, an AI judge with multi-language detection, dual-layer JSON parsing, context-aware error classification (COMPILATION/RUNTIME/VALIDATION), semantic retrieval and deterministic verdict synthesis.
  • Developed AskAI, an agentic programming assistant using MCP-based prompt pipelines, resource-aware context analysis, dynamic O3Mini/O1 routing, token metering, and automated formatting, boosting engagement 3× and answer-resolution speed 2×.
  • Shipped the AI Code Editor with real-time AI review (<40ms), inline reasoning, multi-language execution, and deep RoadmapAI/CodeLLM integration, raising editor retention by 40%.
  • Scaled Roadmap features to 120K+ organic users and improved MAU by 46% through rapid iteration, tight user-feedback loops and stable AI feature launches.
  • Delivered Individual Roadmap Communities enabling peer-matching, shared progress tracking and roadmap-level micro-communities.
  • Optimized CI/CD and deployment systems, cutting deployment time by 34%, automating multi-service rollouts, and enabling safer high-frequency releases.
  • Reduced platform downtime by 90% (4 hrs to 45 mins/month) via infra hardening, progressive fallbacks, cache-first routing, real-time health checks and load-aware autoscaling.
  • Implemented complete analytics & aggregation pipelines for 100K+ users with Redis caching, chunked batch aggregation, API acceleration and advanced rate-limit enforcement.
  • Developed full search-validation engines (Roadmaps + RoadmapAI), ensuring context-safe retrieval, hallucination-resistance and consistent multi-node semantic validation.
  • Performed Azure cost and infra optimization: right-sized VMs, eliminated Bastion, stabilized Redis/Entra costs, contained Cognitive Services spikes, and resolved large bandwidth egress surges.

SDE - 1

July 2024 – July 2025 · Delhi, India · Remote

  • Built and scaled the flagship "Roadmaps" feature, delivering 100+ curated learning paths across DSA, Development, and System Design used by 100K+ users. Improved personalization and relevance, while reducing API response time from 2.1s to < 300ms, resulting in a 7x faster experience and 40% higher user engagement.
  • Optimized complex APIs to reduce processing time and improved the tab-switching experience for smoother navigation.
  • Developed and integrated the "AskAI + Discussion Forum", an intelligent peer-programming assistant where users interact with AI to solve DSA/Dev doubts and collaborate with others, enabling on-demand doubt resolution and community learning.
  • Engineered a Session Recording Bot using Python, Selenium, and headless Azure VMs with deep-link automation, automating session joining and recording, eliminating manual effort and improving reliability.
  • Optimized 150+ APIs by implementing advanced caching layers, async processing, and API pipelines, reducing backend latency by up to 70% and improving system throughput.
  • Reduced Core Web Vitals (TBT, LCP, FCP) from 4.4s to 990ms through advanced frontend optimizations (SSR, dynamic imports, lazy-loading APIs), significantly boosting UX for 15K+ monthly active users.
  • Led the end-to-end performance overhaul of the platform, focusing on smoother tab-switching experiences, minimal downtime, and blazing-fast navigation across the app.
  • Migrated MongoDB from Atlas to self-hosted replica sets, wrote automated backup & recovery scripts, set up VMs, and integrated cron-based backups to Azure Blob, ensuring data durability and cost-efficiency.
  • Set up real-time monitoring and alerting with Prometheus and Grafana, ensuring system health, proactive issue resolution, and enhanced DevOps visibility.
  • Deployed scalable CI/CD pipelines using Azure, GitLab, and Vercel, ensuring zero-downtime deployments and faster iteration cycles across teams.
  • Handled end-to-end production deployment and scaling for a system serving 15K+ users, maintaining high availability, fault tolerance, and robust performance at scale.

Cloud Conduction

Junior Software Engineer

Jan 2024 – June 2024 · USA · Remote

  • Built an AI-powered chat application from the ground up using React and .NET, improving frontend efficiency by 60% and backend performance by 30%, delivering a highly responsive user experience.
  • Integrated and optimized AI model responses, reducing latency from 1.86s to 1.2s (35% faster) through strategic API design, caching, and performance tuning.
  • Designed scalable cloud architecture on Microsoft Azure for AI workloads, improving system throughput by 10% while significantly reducing infrastructure costs via autoscaling and resource optimization.
  • Developed modern, responsive UI components in React that improved user engagement metrics by 25%, including better retention and interaction rates.
  • Implemented secure, scalable API gateways in .NET Core, capable of handling 500+ concurrent requests with 99.9% uptime, supporting production-level reliability.
  • Led the implementation of new features using the MERN stack, cutting down development time by 40%, and accelerating product iteration cycles.
  • Established CI/CD pipelines (Azure DevOps & GitHub Actions), reducing deployment failures by 75% and enabling faster, automated releases.
  • Conducted in-depth code reviews and optimization, reducing technical debt by 30%, standardizing best practices across teams, and improving maintainability.
  • Owned and managed the complete project lifecycle, from initial system design and dev planning to production deployment, server setup, and post-launch support.

INDIVIDUAL CONTRIBUTOR

I’ve engineered the core of our AI ecosystem (RoadmapAI, CodeLLM, AskAI, and the AI Code Editor), building Agentic AI pipelines, RAG-driven personalization, MCP-layered orchestration, and multi-model LLM systems that deliver real-time learning guidance, deterministic code evaluation, and context-aware programming assistance at scale. My work spans LLM system design, tokenization and reasoning flows, Azure OpenAI-backed inference, Azure Databricks-aligned data pipelines, vectorized context retrieval, and high-availability AI microservices powering the majority of our platform’s intelligence layer. I’ve also strengthened the platform’s foundation by optimizing 150+ critical APIs for latency, reliability, throughput, and large-scale fault tolerance.
  • Architected an end-to-end RAG-powered AI learning platform serving 100K+ users with sub-second inference latency, leveraging Azure OpenAI embeddings (text-embedding-ada-002), ChromaDB vector indexing, and semantic retrieval with dynamic topic-aware filtering gated at a 0.25 similarity threshold
  • Engineered a self-evolving knowledge graph where every AI-generated artifact (roadmaps, articles, practice questions) is automatically embedded, vectorized, and reintegrated into ChromaDB, creating a continuously learning retrieval layer that improves semantic accuracy with each user interaction
  • Built an intelligent RAG pipeline with multi-stage context optimization combining semantic vector similarity search, domain-specific keyword enforcement, exclusion-based noise filtering, and quality-threshold gating (0.25 cutoff) to deliver hallucination-resistant contextual augmentation
  • Designed a production-grade MCP-compliant prompt orchestration system with structured message arrays (system/user roles), dynamic context injection based on user proficiency levels (1-5 scale), adaptive difficulty mapping (Beginner/Intermediate/Advanced), and goal-oriented content generation across 3 formats
  • Implemented a real-time intent classification engine with confidence-weighted pattern matching across 4 transformation operators (NEW_SUBROADMAP, ADD_TOPICS, PROJECT_CREATION, REGENERATE_PIPELINE) using 20+ keyword signatures per intent and hierarchical fallback resolution for ambiguous requests
  • Developed a conflict-safe progress-preserving merge algorithm that maintains atomic user state (isDone flags, bookmarks, annotations, code links) during AI-driven content expansions through differential patching, duplicate detection, and rollback-capable database transactions
  • Created a multi-layer security validation framework with lexical abuse detection (violent/illegal/inappropriate patterns), technical relevance scoring across 15+ engineering domains, injection-attack guards, and AI-powered verification with a 0.6 confidence threshold for edge cases
  • Architected a scalable token-governance system with tiered allocation models (8 free tokens + purchased pools), operation-based cost accounting (Creation: 2 tokens, Customization: 4 tokens), atomic transaction handling via MongoDB optimistic locking, and graceful quota degradation
  • Optimized database performance through strategic indexing with compound indices on (userId, sessionId, isDeleted), aggregation pipeline optimization for history queries, session-based data isolation, soft-delete mechanisms, and pagination limiting to 50 records per fetch
  • Implemented a multi-model AI orchestration layer supporting dynamic routing between o3-mini (8K context window) for complex generation and gpt-3.5-turbo (4K context) for standard operations, with consistent MCP interface abstraction and model-specific parameter tuning
  • Built a resilient fallback architecture ensuring 100% availability with RAG-miss graceful degradation, sparse-query fallback prompts, cache-bypass recovery paths, multi-tier error handling, structured security-event logging, and health-check monitoring across all AI subsystems
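As a rough illustration of the intent-classification idea above, a confidence-weighted keyword matcher with a fallback default might look like the sketch below. The four operator names come from the description; the keyword signatures, the scoring rule, and the `classifyIntent` helper are illustrative stand-ins (the production system uses 20+ signatures per intent and hierarchical fallback resolution).

```typescript
// Confidence-weighted intent classification over four transformation operators.
type Intent = "NEW_SUBROADMAP" | "ADD_TOPICS" | "PROJECT_CREATION" | "REGENERATE_PIPELINE";

// Illustrative keyword signatures; the real system has far richer lists.
const SIGNATURES: Record<Intent, string[]> = {
  NEW_SUBROADMAP: ["branch", "subroadmap", "deep dive"],
  ADD_TOPICS: ["add", "include", "more topics"],
  PROJECT_CREATION: ["project", "build", "hands-on"],
  REGENERATE_PIPELINE: ["regenerate", "redo", "start over"],
};

function classifyIntent(request: string, fallback: Intent = "ADD_TOPICS"): Intent {
  const text = request.toLowerCase();
  let best: Intent | null = null;
  let bestScore = 0;
  for (const intent of Object.keys(SIGNATURES) as Intent[]) {
    // Confidence = fraction of the intent's signatures present in the request.
    const hits = SIGNATURES[intent].filter(kw => text.includes(kw)).length;
    const score = hits / SIGNATURES[intent].length;
    if (score > bestScore) { bestScore = score; best = intent; }
  }
  // Zero-signal (ambiguous) requests resolve to a safe default intent.
  return bestScore > 0 ? (best as Intent) : fallback;
}
```

Scoring every intent and keeping the maximum, rather than returning on first match, is what lets ambiguous requests fall through to a deliberate default.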

System Architecture & Details


  • Architected an end-to-end AI-powered code evaluation system replacing traditional compilers with RAG-enhanced logical judgment, leveraging semantic retrieval, model-context engineering, and multi-model orchestration to achieve 99% evaluation accuracy across Python, Java, C++, and JavaScript.
  • Built a multi-stage language detection engine using regex patterns, anti-pattern suppression, syntax heuristics, and confidence-based classification to prevent cross-language submissions and ensure evaluation integrity for every code block.
  • Implemented a production-grade MCP-compliant prompt pipeline generating strictly structured system/user message arrays, including judge instructions, evaluation rules, test-case schemas, complexity requirements, and JSON-first verdict formatting.
  • Designed a dual-layer response parsing system with JSON block extraction, Markdown fallback resolution, regex-based error isolation, and verdict normalization to guarantee consistent outputs even with noisy AI responses.
  • Engineered a multi-model AI orchestration layer dynamically routing requests between o3-mini (accuracy), o1 (reasoning), and gpt-3.5-turbo (performance) with token-window optimization and context-aware selection.
  • Integrated a RAG pipeline with ChromaDB using text-embedding-ada-002 to retrieve reference solutions, constraints, edge cases, and complexity hints, enabling AI to perform context-enriched evaluation rather than plain code matching.
  • Created a modular progress-tracking engine mapping submissions to TodoItems, Topics, and Subroadmaps, automatically updating isDone status and learning milestones through real-time backend sync and user completion logic.
  • Developed a robust validation and error-classification layer with strict checks for payload integrity, language mismatches, test-case correctness, sanitized code inspection, and COMPILATION_ERROR / RUNTIME_ERROR / VALIDATION_ERROR generation.
  • Implemented a structured verdict generator delivering human-like educational feedback including passed/failed test-case breakdowns, root-cause explanations, error localization, corrected code suggestions, and time/space complexity analysis.
  • Optimized backend infrastructure using MongoDB submission architecture with collections for Submission, TodoItem, Topic, UserTodoItemMapping, ensuring analytics-ready storage, high-throughput writes, and environment-aware routing for dev/prod deployments.
  • Achieved scalable, real-time evaluation flows combining JWT-secured endpoints, load-balanced AI calls, semantic retrieval augmentation, multi-model fail-safes, and a high-availability fallback pipeline for uninterrupted code judging.
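The dual-layer response parsing described for CodeLLM can be sketched as follows. The `parseVerdict` helper and its field names (`verdict`, `errorType`) are assumptions made for illustration; only the extract-then-fallback shape (fenced JSON first, raw brace-delimited span second, verdict normalization last) is taken from the description.

```typescript
// Dual-layer parsing of model verdicts from possibly noisy AI output.
interface Verdict { verdict: string; errorType?: string }

const FENCE = "`".repeat(3); // triple backtick, built up to keep this sample readable
const FENCED_JSON = new RegExp(FENCE + "json\\s*([\\s\\S]*?)" + FENCE, "i");

function parseVerdict(raw: string): Verdict | null {
  const candidates: string[] = [];
  const fenced = raw.match(FENCED_JSON);
  if (fenced) candidates.push(fenced[1]);     // layer 1: the instructed fenced format
  const braces = raw.match(/\{[\s\S]*\}/);
  if (braces) candidates.push(braces[0]);     // layer 2: Markdown fallback on raw braces
  for (const candidate of candidates) {
    try {
      const obj = JSON.parse(candidate);
      if (typeof obj.verdict === "string") {
        // Verdict normalization: uppercase so downstream checks are exact.
        return { verdict: obj.verdict.toUpperCase(), errorType: obj.errorType };
      }
    } catch {
      // Malformed candidate: fall through to the next layer.
    }
  }
  return null;
}
```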

System Architecture & Details


  • Architected and developed a production-grade AI programming assistant handling 100+ RPS with 99.9% uptime across learning platform resources.
  • Engineered sophisticated multi-model AI orchestration routing questions between O3Mini, O1, GPT-3.5 Turbo, and Llama 3.3 based on question complexity and resource type.
  • Built comprehensive token management system with dual-token architecture (9 free + purchased), atomic MongoDB operations, and fair usage enforcement preventing system abuse.
  • Implemented MCP (Model Context Protocol) prompt engineering with three specialized generators, eliminating RAG infrastructure while maintaining response quality.
  • Designed intelligent model selection algorithm routing Practice Questions to O1, complex DSA to O1, articles to GPT-3.5, and general questions to O3Mini for optimal performance.
  • Developed advanced response processing pipeline with autoWrapCode (10+ language detection), formatAIResponse (markdown fixing), and removeConversationalEndings (AI fluff removal).
  • Created scalable session management with three MongoDB schemas (generic, roadmap-specific, content creation), soft deletion, voting system, and optimized query patterns.
  • Built complete API security layer with JWT authentication, rate limiting, input sanitization, HTTPS enforcement, and comprehensive error handling across 6+ endpoints.
  • Implemented production monitoring system with response time tracking, token usage analytics, structured logging, and health checks for continuous optimization.
  • Achieved 3x user engagement and 2x resolution speed through intelligent model selection, clean response formatting, and context-aware interactions.
  • Engineered no-RAG architecture using sophisticated prompt engineering instead of vector databases, reducing infrastructure costs by 60%.
  • Added content caching optimization with RoadmapAskAIContentCreation schema and duplicate request prevention for article improvements.
  • Implemented question classification system using GPT-3.5 Turbo to categorize questions into 7 types (DSA, System Design, Development, etc.) for better routing.
  • Designed circuit breaker pattern and fallback chains (O1 → O3Mini → GPT-3.5 → Llama) for API failure resilience and graceful degradation.
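The fallback chain (O1 → O3Mini → GPT-3.5 → Llama) can be sketched as a try-in-order loop. `askWithFallback` and the `ModelCall` signature are hypothetical stand-ins for the real Azure OpenAI and Llama clients; only the model ordering and graceful-degradation intent come from the bullet above.

```typescript
// Try models in priority order, degrading gracefully on failure.
type ModelCall = (model: string, prompt: string) => Promise<string>;

const FALLBACK_CHAIN = ["o1", "o3-mini", "gpt-3.5-turbo", "llama-3.3"];

async function askWithFallback(
  prompt: string,
  call: ModelCall
): Promise<{ model: string; answer: string }> {
  let lastError: unknown;
  for (const model of FALLBACK_CHAIN) {
    try {
      return { model, answer: await call(model, prompt) };
    } catch (err) {
      lastError = err; // record and fall through to the next model
    }
  }
  throw new Error(`all models failed: ${String(lastError)}`);
}
```

A production version would add per-model circuit-breaker state so a model that keeps failing is skipped without paying its timeout on every request.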

System Architecture & Details


  • Engineered an AI-integrated code editor using Monaco, seamlessly tied into CodeLLM and AskAI pipelines.
  • Supported live verdicts, multi-language (C++, Java, Python) switching, and dynamic prompts based on user activity.
  • Embedded AI-based feedback inline within the editor via backend event sync and code stream capture.
  • Delivered interactive IDE-like experience with <40ms event lag, boosting engagement and retention by 40%.
  • Tight integration with RoadmapAI and CodeLLM for contextual assistance
  • Real-time code validation and suggestions during typing
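One common way to keep typing-time feedback cheap is to debounce the validation call so the AI review fires only once typing pauses. The sketch below shows the generic pattern, not the editor's actual event wiring; `validateCode` and the 40ms window are illustrative.

```typescript
// Debounce: collapse a burst of calls into one call after the burst ends.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer); // restart the wait on every keystroke
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Usage: wire a debounced validator to editor change events.
// validateCode is a stand-in for the real CodeLLM-backed review call.
const validateCode = (source: string) => console.log(`validating ${source.length} chars`);
const onType = debounce(validateCode, 40);
```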

System Architecture & Details


  • Refactored and optimized over 150 core APIs (Editor, Roadmap, AskAI, Profile) for high-throughput performance.
  • Reduced average response latency from 2.2s → 300ms through async queues, parallel batches, and Redis caching.
  • Introduced pagination layers, ElasticSearch indexing, and horizontal load balancing to maintain SLA under scale.
  • Achieved 70% backend performance boost and improved Core Web Vitals (TTFB, LCP, FCP) across all pages.
  • Load tested to 10K RPM; 99.95% uptime sustained with zero cold starts using warmed cloud functions.
  • Implemented advanced caching strategies and async processing
  • Enhanced frontend performance through SSR, dynamic imports, and lazy-loading

System Architecture & Details


Technical Skills


AI/ML

LLMs · RAG · AIOps · MCP · LangChain · OpenAI API · PyTorch · Vector Databases · Prompt Engineering · scikit-learn

Frontend Development

Next.js · React · TailwindCSS · Redux · React Query · CSS · HTML · SSR · CSR · Hybrid Rendering · Bootstrap

Backend Development

Node.js · FastAPI · Express · Django · .NET

Cloud & DevOps

Docker · Kubernetes · AWS · Azure · Terraform · CI/CD · GitHub Actions · GitLab CI · Jenkins · Grafana · Prometheus

Databases

MongoDB · PostgreSQL · MySQL · Redis · Firebase · ChromaDB · Vector Databases · Vector Search

Programming Languages

Python · TypeScript · JavaScript · SQL · Java · C++ · Bash

Tools

Git · GitHub · GitHub Copilot · VS Code · PyCharm · Linux · IntelliJ IDEA · Postman · Figma · Selenium · Scrapy

Education

Sage University Indore

B.Tech in Computer Science

2020 – 2024 · MP, India

CGPA: 8.5/10

GitHub (Contributions Overview)



© 2026 Prince Singh. All rights reserved.

Updated December 2025
👀 6,800+ Visitors
Profile

Prince Singh

Founding Engineer & AI Architect @ProPeers | Portfolio aka WebResume


2 Years Experience
600K+ Users Impact
AI Products Builder

Hello! 👋

Prince Singh | AI Architect

Your AI assistant. Ask anything about my work & expertise.

My Expertise:

AI Architecture · Agentic AI · RAG & MCP · Prompt Engineering · LLM Models · Fine-Tuning
