The Role
We’re hiring a senior full-stack engineer who lives and breathes real-time audio. You’ll build end-to-end features across web and edge backends, own streaming STT/TTS integrations, and push p95 latency down while keeping accuracy up. You’ll collaborate with Product, Design, and ML to ship polished, reliable experiences at scale.
What You’ll Do
Design and ship low-latency, streaming features (WebSocket/SSE/WebRTC) for real-time transcription, summaries, and AI prompts.
Integrate and optimize STT (OpenAI Realtime/Whisper, Deepgram, Google/Azure/AWS) and TTS (ElevenLabs, PlayHT, Amazon Polly, Azure Neural), including streaming output, barge-in, and endpointing.
Build and harden APIs/services (FastAPI/Node) with observability (OpenTelemetry), rate limiting, retries, and back-pressure.
Implement audio pipelines: VAD, diarization, Opus handling, chunking/partial results, and transcript alignment.
Work with LLMs for real-time coaching: function/tool calling, structured outputs, guardrails, and prompt design/evals.
Own performance: measure WER, p50/p95/p99 latency, startup/first-token time, MOS for TTS; profile and improve.
Collaborate on UI in Next.js/React/Tailwind; ship reusable components that feel instant under load.
Ensure privacy & security: PII handling, data retention controls, tenancy, OAuth/SSO, audit trails.
Write meaningful tests (unit/integration/e2e, load tests for streaming) and review code with care.
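To give a flavor of the "own performance" work above: reporting p50/p95/p99 latency starts with something as simple as a percentile helper over raw per-request samples. A minimal sketch (function name and nearest-rank method are our choices, not a prescribed implementation):

```python
def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from raw latency samples using the nearest-rank method."""
    ordered = sorted(samples_ms)

    def pct(p: float) -> float:
        # Nearest-rank: smallest sample at or above the p-th percentile position.
        idx = max(0, round(p / 100 * len(ordered)) - 1)
        return ordered[idx]

    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}

# 100 samples of 1..100 ms:
samples = [float(i) for i in range(1, 101)]
print(latency_percentiles(samples))  # → {'p50': 50.0, 'p95': 95.0, 'p99': 99.0}
```

In production you'd pull these from tracing (OpenTelemetry histograms) rather than computing them by hand, but the definition you're optimizing against stays the same.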
You Might Be a Fit If You Have
5+ years building production SaaS; 3+ years with TypeScript + React/Next.js.
Strong backend skills in Python (FastAPI) and/or Node.js (tRPC/Express).
Hands-on production experience integrating streaming STT and TTS with strict latency budgets.
Web real-time chops: WebRTC, Web Audio API, MediaStream, Opus/PCM, jitter buffers, STUN/TURN/ICE.
Cloud/serverless: Vercel, Cloudflare Workers/Queues, Supabase (Auth/Postgres/pgvector), Redis.
Practical LLM integration: tool calling, evals, prompt safety, and cost/perf trade-offs.
Telemetry mindset: tracing/metrics/alerting (OpenTelemetry, Sentry, Prometheus/Grafana, Datadog).
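If "jitter buffers" in the list above is new to you: the core idea is releasing audio frames in sequence order even when the network delivers them out of order. A toy in-order release buffer (no timeouts, no loss concealment, so far from production-grade; class and method names are illustrative):

```python
import heapq

class JitterBuffer:
    """Toy reorder buffer: releases frames in sequence order,
    holding out-of-order arrivals until the gap is filled."""

    def __init__(self, first_seq: int = 0):
        self._next = first_seq  # next sequence number to release
        self._heap: list[tuple[int, bytes]] = []

    def push(self, seq: int, frame: bytes) -> list[bytes]:
        """Insert a frame; return any frames now releasable in order."""
        if seq >= self._next:  # drop late duplicates
            heapq.heappush(self._heap, (seq, frame))
        out = []
        while self._heap and self._heap[0][0] == self._next:
            out.append(heapq.heappop(self._heap)[1])
            self._next += 1
        return out

buf = JitterBuffer()
print(buf.push(0, b"a"))  # → [b'a']
print(buf.push(2, b"c"))  # → []  (still waiting on seq 1)
print(buf.push(1, b"b"))  # → [b'b', b'c']
```

Real WebRTC stacks add adaptive depth, playout timing, and concealment on top of this ordering core.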
Bonus Points
LiveKit/Jitsi/Twilio, SFU design, or media servers and media tooling (Janus, GStreamer, FFmpeg).
Rust/Go for audio/streaming services; WebAssembly for on-client DSP.
Vector DBs & RAG (pgvector, Pinecone, Weaviate), LangChain/LlamaIndex.
Zoom/Meet/Teams marketplace apps, OAuth scopes, and webhook ingestion at scale.
On-device ML (VAD/denoise) and GPU-accelerated inference.
Startup 0→1 experience; you've shipped something many users rely on.
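On the vector DB / RAG bonus: the retrieval step reduces to ranking stored embeddings by similarity to a query embedding, which pgvector and friends do at scale. A self-contained sketch of the idea (document names and two-dimensional embeddings are made up for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], docs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Rank document ids by cosine similarity to the query embedding."""
    return sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)[:k]

docs = {"opus": [1.0, 0.0], "webrtc": [0.9, 0.1], "billing": [0.0, 1.0]}
print(top_k([1.0, 0.0], docs))  # → ['opus', 'webrtc']
```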
Our Stack (Today)
Front-end: Next.js, React, Tailwind, Zustand/Redux, shadcn/ui
Back-end: Python (FastAPI), Node.js (tRPC), WebSocket/SSE, REST
Infra: Supabase (Auth/Postgres/pgvector), Redis, Vercel, Cloudflare Workers/Queues/KV/R2
AI: OpenAI/Deepgram/Whisper + vendor mix for TTS/STT; prompt engineering & evals
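Feeding audio to the streaming STT vendors above usually means sending fixed-duration PCM frames over a socket. A minimal framing sketch (function name and the 16 kHz mono 16-bit defaults are assumptions; each vendor documents its own expected format):

```python
def chunk_pcm(pcm: bytes, sample_rate: int = 16000, frame_ms: int = 20,
              bytes_per_sample: int = 2) -> list[bytes]:
    """Split raw mono PCM into fixed-duration frames for a streaming STT socket.
    A trailing partial frame is dropped here; a real pipeline would buffer it."""
    frame_bytes = sample_rate * frame_ms // 1000 * bytes_per_sample  # 640 at defaults
    return [pcm[i:i + frame_bytes]
            for i in range(0, len(pcm) - frame_bytes + 1, frame_bytes)]

# 100 ms of silence at the defaults → five 20 ms frames of 640 bytes each
frames = chunk_pcm(bytes(3200))
print(len(frames), len(frames[0]))  # → 5 640
```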
How We Work
Small, senior team with high ownership and fast iteration.
Pragmatic testing and measure-first performance culture.
Remote-first; ideally 4+ hours of overlap with US Pacific most days.
Perks & Benefits
Competitive salary + meaningful equity.
Flexible hours; generous PTO.
Annual AI/cloud credit stipend.
BYO laptop + upgrade support.
Learning budget for conferences/courses.