Robotics Infrastructure Engineer

Tutor Intelligence
Watertown, MA · Onsite · $120,000 - $175,000 · Apr 10, 2026 · Posted 2 days ago
Description

Systems, Infrastructure & Reliability

About the Role

We build robots that run 24/7 in production environments. We're looking for a hands-on engineer to own the reliability, infrastructure, and developer tooling that keeps our fleet running and our engineering team fast. You'll split your time between robot-side systems work, cloud infrastructure, and building automation that multiplies the team's output.

A significant portion of this role involves working with AI coding agents. You'll direct autonomous agents to diagnose CI failures, triage production issues, run automated security and compliance checks, and execute multi-step engineering tasks. Knowing how to scope work for an agent, review its output critically, and build tooling that agents can use effectively is as important as writing the code yourself.

What You'll Do

Own robot-side software (Python)

Maintain the on-robot codebase that orchestrates arms, cameras, sensors, and I/O. Debug production hardware/software failures and ship fixes fast

Build and maintain infrastructure as code

Manage cloud infrastructure — identity and access management, CI/CD credentials, secrets, container registries, cluster autoscaling — using declarative configuration and reproducible builds

Drive build system and packaging migrations

Own the transition of robot software packaging to reproducible, hermetic build systems. Maintain machine images, dev environments, and deployment pipelines

Build simulation and testing infrastructure

Develop end-to-end simulation systems that validate robot behavior without physical hardware — camera projection, kinematics, placement validation, fleet-wide calibration

Develop and operate AI-powered engineering automation

Build autonomous agents that run nightly CI triage, security audits, infrastructure compliance checks, and code quality sweeps. Design the interfaces and instructions that make agents effective at real engineering work

Improve observability and health monitoring

Instrument robot software with metrics and structured telemetry. Build alerting that catches problems before humans notice them

Work across the stack

Touch frontend, backend, protobuf definitions, deployment tooling, and cloud services as needed. No part of the system is someone else's problem

What We're Looking For

3+ years of Python in a systems context — not web/ML Python, but the kind where you deal with processes, hardware I/O, async, and real-time constraints

Strong Linux systems knowledge

Memory management, device management, systemd, containers, networking, kernel tuning

Infrastructure as code experience

Declarative infrastructure and configuration management tools. You've managed IAM, CI runners, secrets, and machine images programmatically

Experience with real hardware

Robot arms, depth cameras, grippers, force/torque sensors, pneumatics, or similar

CI/CD ownership

You've not just used CI — you've owned it. Runner infrastructure, flaky test triage, build caching, GPU-enabled pipelines

Comfort with AI coding agents

You've used tools like Claude Code, Cursor, Copilot Workspace, or similar to do real engineering work — not just autocomplete, but directing agents through multi-step debugging, refactoring, and infrastructure tasks. You understand their failure modes and know when to trust vs. verify

Strong debugging instincts

You can go from a vague production symptom to root cause across hardware, OS, network, and application layers

Bias toward shipping over perfecting

You fix, monitor, iterate. Your commit history has more "fix" than "feat", and you're proud of that

Nice to Have

NixOS or reproducible build system experience

Experience building or operating autonomous engineering agents/bots

Robotics simulation (kinematics, camera models, physics)

gRPC / Protocol Buffers

Managed network infrastructure, VPNs, overlay networks

Time-series databases and observability stacks

About the Work Style

This is a high-autonomy, high-output role. On a typical day you might direct an AI agent to triage overnight CI failures while you debug a production robot issue, then spend the afternoon migrating a package to a new build system. You'll write a lot of code, but you'll also write a lot of prompts, and the best candidates will see those as the same skill.

Location Context