About the Role
You will help scale our Voice AI Agent (telephony + STT + LLM + TTS) into an enterprise-grade product. The work includes building real-time speech and LLM pipelines, barge-in (natural interruption handling) and human takeover flows, and data loops for model improvement.
What You’ll Do
Implement and tune real-time STT/ASR and TTS pipelines.
Orchestrate LLM-driven conversations: prompting, function calling, dialog state, and context handling.
Build barge-in logic and reliable human handoff signals.
Create training datasets from call recordings: redaction, labeling, and metadata; enable search and retrieval over them.
Optimize latency, accuracy, turn-taking, and cost.
Add guardrails, fallbacks, and basic A/B experimentation.
Required Skills
Programming: Python or Node.js; modular code, async handling.
APIs & Events: REST, webhooks, retries, idempotency.
Speech/LLM: Experience with STT/TTS SDKs or streaming APIs; basics of prompting and function calling.
Data: Basic SQL; schema design for transcripts; understanding of embeddings and vector search concepts.
Cloud (AWS preferred): S3, Lambda or equivalent services, CloudWatch.
Git & CI: Branching, PRs, basic pipelines.
Nice to Have
Telephony/WebRTC/Twilio/Kixie experience.
Vector DBs (pgvector/FAISS) or Redis for caching/queues.
Observability tools (OpenTelemetry/Grafana).
Security fundamentals (encryption, IAM best practices).
Tech Stack
OpenAI Realtime, Deepgram, Twilio/Kixie, Node.js/Python, Postgres, Redis, Docker, AWS (S3, Lambda, API Gateway, CloudWatch), WebSockets/SSE.
Compensation & Benefits
Competitive salary and performance bonus
Remote/hybrid options within India
Learning budget and mentorship
Education
B.E./B.Tech/M.E./M.Tech/MCA or equivalent practical experience.
Notice Period
Immediate to 30 days preferred.
Job Type: Contractual / Temporary
Contract length: 6 months
Pay: ₹25,000.00 - ₹30,000.00 per month
Work Location: Remote