About Me

Machine Learning Engineer @ Meta · Ex-Amazon, Atlassian, Behavidence

I write about production ML, system design, and building technology that compounds.

Hi, I’m Saurabh Khandelwal — a Machine Learning Engineer focused on building real-world, production-grade ML systems that actually scale, perform, and create impact.

Over the years, I’ve learned that great machine learning is not just about training better models — it’s about designing systems that work reliably in the messy realities of production: evolving data, shifting requirements, performance constraints, and business goals. That mindset has shaped how I approach my work and my career.

Why am I here?

I was drawn to machine learning less by models themselves and more by a broader question: how do complex systems behave under real-world constraints such as noisy data, scale, latency, and human usage?

Early in my career, working on healthcare AI systems, I saw firsthand how quickly theoretical elegance breaks down in production. Building pipelines that could turn raw smartphone signals into reliable, near-real-time insights forced me to think end-to-end: data quality, system reliability, model behavior, and operational trade-offs.

That systems-first mindset carried forward into large-scale ML infrastructure work, where the challenge shifted from “does this model work?” to “can this system run predictably at scale?” — across distributed training, deployment, and long-lived production workflows.

Over time, my focus has moved away from individual models toward designing ML systems as products: pipelines, platforms, and feedback loops that improve with use and enable teams to move faster without sacrificing rigor.

That throughline continues today in my work as a Machine Learning Engineer at Meta — and in how I think about building technology that compounds over time.

Selected Experience

Machine Learning Engineer — Meta

Currently working

Machine Learning Engineer — Amazon

At Amazon, I worked on production ML infrastructure and distributed systems, operating at the intersection of scale, reliability, and efficiency. The work required designing systems that could support complex training and inference workflows while meeting strict operational and business constraints.

This experience reinforced a systems-first approach: optimizing not just for model quality, but for latency, cost, debuggability, and long-term maintainability.

Founding Engineer — Behavidence

At Behavidence, I helped build end-to-end ML systems for digital psychiatry — transforming raw smartphone sensor data into clinically meaningful signals.

I designed and implemented data pipelines and ML workflows that reduced inference latency from batch-scale processing to near real-time, enabling continuous behavioral analysis. This work contributed to peer-reviewed research and supported a platform that went on to raise ~$5M in funding.

What I’m Focused On Now

Right now, my focus is on building and scaling machine learning systems that are designed as products, not experiments.

I’m spending time thinking about:

  • how ML platforms can enable teams rather than slow them down,
  • how system design choices compound over time — technically and organizationally,
  • and how to bridge research-quality ideas with production realities in high-impact domains.

Alongside my role at Meta, I’m increasingly interested in early-stage company building, especially at the intersection of AI, systems, and real-world use cases. I care about problems where technology must operate under real constraints and still deliver meaningful outcomes. This site is one small part of that journey.

If you’re interested in collaborating, learning together, or just exchanging ideas, feel free to reach out on LinkedIn.