Staff Data Scientist - Verification & Validation
Description
Zoox is on an ambitious journey to develop a full-stack autonomous vehicle system for cities. We are seeking a Staff Data Scientist to join a verification and validation team that evaluates safety-critical AI systems. You will join a team of software and data engineers who leverage methods including log data analysis, simulation, and closed-course structured testing. You'll work cross-functionally with AI software, System Design and Mission Assurance, Simulation, Sensors, and other teams to develop, execute, and iterate on validation methods and pipelines. These pipelines evaluate safety-critical systems, are highly visible, and are a critical-path element of launching our service. The ideal candidate brings a hybrid of statistical rigor and an engineering mindset to drive clarity from ambiguity, establish new processes, and propel the team forward. This is a deeply technical and hands-on role where you will be expected to be a self-sufficient builder and coder, not just a manager of projects.
In this role, you will
Design Evaluation Frameworks
Architect statistical methodologies for safety-critical AI systems to form objective, rigorous conclusions about their performance and reliability.
Conduct Robust Analysis
Deliver validation evidence to support increasingly complex operations and identify potential edge-case failures.
Inform Strategy
Deliver clear, data-driven insights to development teams to guide system improvement, and to executive leadership to inform milestone-level go/no-go decisions.
Define Metrics
Drive alignment across engineering teams on performance metrics and data extraction strategies.
Lead the Lifecycle
Manage all phases of evaluation including prototyping, requirements capture, design, implementation, and validation.
Scale Pipelines
Partner with engineers to build and maintain scalable data processing and simulation pipelines, applying distributed computing to analyze petabytes of driving data.
Qualifications
- MS or PhD in Statistics, Computer Science, Machine Learning, Applied Mathematics, or a related quantitative field
- Proficiency in Python and SQL, with experience writing production-quality code
- Demonstrated expertise in statistical methodologies, including hypothesis testing, power analysis, spatiotemporal modeling, Bayesian inference, and multivariate analysis
- Experience with large-scale data analysis and statistical modeling
- Proficiency with Git, unit testing, and collaborative development practices
Bonus Qualifications
- Hands-on experience with production machine learning pipelines: dataset creation, training frameworks, metrics pipelines
- Experience with modern data processing technologies such as Apache Spark, Spark SQL, and Databricks
- Experience designing metrics and delivering actionable insights that drive business decisions