AI Platform Engineer – Observability (Remote)

Company: Fluid AI

About Us:
Fluid AI is a specialized, cutting-edge Gen AI company with an enterprise-level GPT product that caters to corporates. We are one of the pioneering companies in the field of Artificial Intelligence. Our GPT product allows businesses to increase their productivity by getting assistance in all key functions, including sales and marketing, operations, customer support, etc.

Our founders, Abhinav & Raghav Aggarwal, have been on the cover of Forbes magazine as well as on the Forbes 30 Under 30 list and the Fortune 40 Under 40 list. Abhinav & Raghav have appeared on numerous TV channels, including NDTV, CNBC TV18, ET NOW, etc. They are the authors of the world's first-ever book written by an AI algorithm. They have worked with Warren Buffett to create a digital human avatar for the Forbes centennial issue.

Location: Remote (India)
Experience: 1–5 Years

About the Role
Fluid AI is looking for an AI Platform Engineer – Observability to build the monitoring, debugging, tracing, and performance capabilities of our enterprise Agentic AI Platform.

You'll work across Python services, JavaScript applications, Kubernetes infrastructure, and AI workflows to provide deep visibility into how our platform operates in production. Your work will help engineers and customers understand system behavior, identify bottlenecks, troubleshoot issues, and improve the reliability of autonomous AI systems.

This is a platform engineering role focused on building developer tools, observability infrastructure, and performance capabilities that power enterprise-scale AI deployments.

Responsibilities
- Design and build observability capabilities for the Fluid AI Agentic Platform.
- Instrument Python backend services, JavaScript applications, and Kubernetes workloads.
- Implement distributed tracing, metrics, and logging across AI agents, APIs, databases, and enterprise integrations.
- Build dashboards and visualization tools to monitor platform health and AI workflow execution.
- Develop debugging, profiling, and diagnostics tools for autonomous AI systems.
- Investigate production issues, identify root causes, and improve system reliability and performance.
- Integrate and extend observability technologies such as OpenTelemetry, Grafana, Prometheus, Loki, Tempo, and Jaeger.
- Collaborate with platform and product engineering teams to improve developer experience and operational excellence.

Required Skills
- 1–5 years of software engineering experience.
- Strong programming skills in Python and/or JavaScript/TypeScript.
- Experience building backend services or distributed applications.
- Understanding of REST APIs, Linux, and networking fundamentals.
- Strong debugging, analytical, and problem-solving skills.
- Interest in performance engineering, monitoring, and distributed systems.

Good to Have
- FastAPI, Node.js, React, or Next.js.
- Docker and Kubernetes.
- Experience with OpenTelemetry or distributed tracing.
- Experience with Grafana, Prometheus, Loki, Tempo, Jaeger, or similar observability platforms.
- Familiarity with cloud platforms such as AWS, Azure, or GCP.
- Exposure to LLMs, AI platforms, or agentic AI systems.

Why Join Fluid AI?
- Build the observability layer for a cutting-edge enterprise Agentic AI Platform.
- Work on modern technologies spanning AI, distributed systems, cloud infrastructure, and Kubernetes.
- Solve challenging engineering problems involving performance, scalability, and reliability.
- Shape the developer experience and operational tooling used to power enterprise AI deployments.
- Join a fast-growing team building AI solutions for leading enterprises across banking, financial services, manufacturing, and the public sector.

Role: Engineering - Other
Industry Type: IT Services & Consulting
Department: Production, Manufacturing & Engineering
Employment Type: Full Time, Permanent
Role Category: Engineering

Education
UG: Any Graduate