Cogniflix Artificial Intelligence Pvt Ltd (Mindflix AI) is building the next generation of AI-powered personalization and engagement systems across video, voice, and conversational interfaces. Our platform combines Generative AI, Computer Vision, deep learning, and real-time intelligence to enable scalable, human-like content and interactions.
ROLE OVERVIEW
We are hiring an LLM / Agentic AI Engineer (2 to 5 years of experience) to design and build production-grade AI systems using LLMs, retrieval, and orchestration patterns. This role is hands-on and engineering-heavy, focused on building reliable services, scalable pipelines, and measurable quality improvements.
KEY RESPONSIBILITIES
· Build and ship LLM-powered backend services and APIs for conversational / agentic workflows.
· Implement retrieval + context orchestration patterns (RAG, memory, tool/function calling).
· Design stateful systems (conversation/session handling, workflow orchestration, fallbacks).
· Implement safety + guardrails, input validation, and refusal patterns where needed.
· Create evaluation frameworks (test sets, regression tests, quality scoring) and monitoring.
· Optimize for latency, cost, reliability, and production observability.
REQUIRED SKILLS & BACKGROUND
· 2 to 5 years of experience in software engineering with hands-on work in LLMs / Conversational AI / RAG / Agents (production or near-production preferred)
· Strong Python (clean code, performance-minded, API development)
· Solid understanding of backend engineering: REST APIs, auth, DB basics, caching, queue/worker patterns
· Experience with LLM integration patterns: prompt + context composition, tool/function calling, retrieval-augmented generation (RAG), session memory patterns (short-term/long-term)
· Familiarity with evaluation and debugging LLM systems (hallucinations, grounding failures, prompt regressions)
GOOD TO HAVE SKILLS
· System design for scalable AI services (multi-tenant, stateless services, orchestration layers)
· Frameworks: LangChain / LangGraph / LlamaIndex (or equivalent)
· Vector search & embeddings: Pinecone / Weaviate / Qdrant / FAISS, rerankers
· Observability: logs/metrics/traces, Sentry/Datadog/OpenTelemetry, structured logging
· Infra familiarity: Docker, cloud basics (AWS/GCP/Azure), CI/CD
· Streaming / real-time: WebSockets, event-driven architecture
· Model serving familiarity (optional): vLLM, Triton, etc.
· Familiarity with privacy/security basics for user data and logs
WHAT WE ARE LOOKING FOR
· Strong engineering fundamentals, pragmatic problem-solving
· Comfort working in ambiguity and shipping iteratively
· Ownership mindset: build -> test -> monitor -> improve
Job Type: Full-time
Pay: ₹800,000.00 - ₹1,500,000.00 per year
Work Location: In person