Software Engineer in Test III - Data Analytics
Must-Have Requirements
- Bachelor's degree in Computer Science or a related field, or equivalent experience
- Expert-level SQL for analytical validation
- Proficiency in at least one programming language used for data engineering, such as Python
- Experience testing data platforms and tooling
- Experience with cloud data systems such as Snowflake, BigQuery, Redshift, or Microsoft Fabric
- Experience with relational databases such as MS SQL
- Experience with orchestration tools such as Airflow
- Knowledge of columnar file formats such as Parquet, Avro, or ORC
- Experience with the SDLC and agile practices
- Understanding of data contracts and governance
Nice to Have
- Experience with Great Expectations or Monte Carlo
- Experience with the Azure cloud provider
- Experience implementing automated checks and alert logic
- Experience with performance and throughput testing
- Experience with data lineage and transformations
- Experience with schema validation and PII handling
- Experience with ETL/ELT jobs and streaming pipelines
Description
Who We Are
At Emburse, you’ll not just imagine the future – you’ll build it. As a leader in travel and expense solutions, we are creating a future where technology drives business value and inspires extraordinary results. Our AI-powered platform helps organizations modernize financial operations, increase visibility, and optimize spend across the enterprise.
Emburse SDETs contribute to the development of an engaging and interconnected set of system solutions. As an engineer, you will enhance our customers' experiences, solve interesting challenges, and design new solutions. Emburse is known for its innovation and award-winning technologies, and engineering is central to that reputation, so you will have access to some of the best and brightest minds in our industry to grow your experience and career within Emburse.
What you will do
- Data testing & validation: Design, implement, and maintain unit, integration, regression, pipeline, acceptance, and data-quality tests for ETL/ELT jobs, streaming pipelines, and data services (batch and real-time).
- Automation: Build and own test automation frameworks and test harnesses for data pipelines (e.g., Python/pytest, dbt tests, Great Expectations), including synthetic dataset generators, golden datasets, and fixture management.
- Data systems: Apply deep experience testing data platforms and tooling: cloud data systems (Snowflake, BigQuery, Redshift, Microsoft Fabric), relational databases (MS SQL), cloud providers such as Azure, orchestration tools (Airflow), and columnar file formats (Parquet, Avro, ORC).
- SQL & code: Use expert-level SQL for analytical validation, along with at least one programming language used for data engineering, such as Python. Build scripts, tools, and automation to validate schemas, data lineage, transformations, aggregation correctness, null/missing-value handling, and performance characteristics.
- Observability & monitoring: Implement automated checks and alert logic for data freshness, schema drift, volume anomalies, and metric regressions using observability/data-quality tools (e.g., Great Expectations, Monte Carlo, custom monitoring).
- Performance & scale testing: Design and run performance, throughput, and scalability tests for pipelines and data services; profile and tune ETL jobs and queries to identify bottlenecks.
- Data contracts & governance: Work with engineering and product teams to enforce data contracts and contract tests for data APIs, and validate PII handling, access controls, and compliance requirements.
- Debugging & triage: Investigate, reproduce, and document data defects, root causes, and remediation plans; apply production debugging skills to ETL jobs, queries, and logs.
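To make the kind of validation described above concrete, here is a minimal sketch of deterministic data-quality checks over a synthetic dataset. The column names, records, and checks are invented for illustration and are not taken from any Emburse pipeline:

```python
# Hypothetical batch of expense records; names and values are illustrative.
rows = [
    {"expense_id": 1, "amount": 120.50, "currency": "USD"},
    {"expense_id": 2, "amount": 75.00,  "currency": "EUR"},
    {"expense_id": 3, "amount": 310.25, "currency": "USD"},
]

def check_no_missing_values(rows, columns):
    """Null/missing-value handling: every required column must be populated."""
    return all(r.get(c) is not None for r in rows for c in columns)

def check_unique_key(rows, key):
    """Key integrity: the primary key column must contain no duplicates."""
    keys = [r[key] for r in rows]
    return len(keys) == len(set(keys))

def check_aggregation(rows):
    """Aggregation correctness: per-currency subtotals must sum to the grand total."""
    grand_total = sum(r["amount"] for r in rows)
    by_currency = {}
    for r in rows:
        by_currency[r["currency"]] = by_currency.get(r["currency"], 0.0) + r["amount"]
    return abs(sum(by_currency.values()) - grand_total) < 1e-9

assert check_no_missing_values(rows, ["expense_id", "amount", "currency"])
assert check_unique_key(rows, "expense_id")
assert check_aggregation(rows)
```

In a real harness these checks would typically live as pytest test functions and run against fixtures or golden datasets rather than inline literals.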
Process
- Follow the SDLC and agile practices, including writing and reviewing tests as code, peer code reviews, CI/CD pipeline integration of tests, and release gating based on automated data-quality checks.
- Maintain and follow coding and test-case management standards; ensure test suites are deterministic, reproducible, and fast enough to run in CI and on nightly/rolling regression schedules.
- Document testing approaches, known limitations, and runbooks for operational incidents involving data.

Impact
- Own quality for analytics deliverables and data platform features within your area.
- Establish test strategies and define appropriate test coverage for pipelines, models, and metrics.
- Drive improvements in data quality and reliability through automation, test architecture, and proactive detection of data problems.
- Contribute to data documentation (data dictionaries, lineage diagrams, assumptions) and raise the bar for data governance and observability across teams.

Communication
- Collaborate with Data Engineers, Analytics Engineers, Data Scientists, Product Managers, and Platform Engineers to design tests and validate business logic in analytical pipelines.
- Explain technical trade-offs, testing coverage, and risk to non-technical stakeholders, and suggest pragmatic compromises when necessary.
- Produce clear, evidence-based defect reports and remediation plans; participate in post-mortems focused on data incidents.
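Release gating on automated data-quality checks, as described above, often reduces to small deterministic functions that CI can run on every deploy. The sketch below illustrates the idea; the schema contract, column names, and freshness SLA are invented for this example:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical data contract for a table; names and types are illustrative.
EXPECTED_SCHEMA = {"expense_id": "INTEGER", "amount": "REAL", "loaded_at": "TEXT"}

def schema_drift(observed):
    """Return a list of drift issues: columns missing, retyped, or unexpected."""
    issues = []
    for col, typ in EXPECTED_SCHEMA.items():
        if col not in observed:
            issues.append(f"missing column: {col}")
        elif observed[col] != typ:
            issues.append(f"type change: {col} {typ} -> {observed[col]}")
    for col in observed:
        if col not in EXPECTED_SCHEMA:
            issues.append(f"unexpected column: {col}")
    return issues

def is_fresh(last_load, max_age_hours=24):
    """Freshness check: data must have landed within the (invented) SLA window."""
    return datetime.now(timezone.utc) - last_load <= timedelta(hours=max_age_hours)

# A release gate would fail the pipeline if any issue is reported.
assert schema_drift(dict(EXPECTED_SCHEMA)) == []
assert "missing column: loaded_at" in schema_drift({"expense_id": "INTEGER", "amount": "REAL"})
assert is_fresh(datetime.now(timezone.utc))
```

Keeping gates this small and side-effect-free is one way to meet the "deterministic, reproducible, and fast enough to run in CI" bar mentioned above.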
Education & Experience
Education
Required: Bachelor's degree in Computer Science or a related field, or equivalent experience
Experience
- Required: 4+ years of testing or SDET experience with a strong emphasis on data systems and analytics, including demonstrated experience writing tests for data pipelines, validating analytical outputs, or building test frameworks for ETL/ELT processes.
- Required: Proficiency in SQL and in at least one general-purpose programming language used for data processing (Python preferred; Scala/Java acceptable).
- Required: Experience with cloud data warehouses (e.g., Snowflake, BigQuery, Redshift) and familiarity with at least one orchestration or streaming technology (Airflow, Kafka, Spark).
- Preferred: Experience with dbt, data observability platforms (e.g., Monte Carlo), data modeling, and analytical/BI tools (Looker, Tableau, Power BI, etc.).
- Preferred: Experience testing ML models, A/B test validation, or statistical methods.
- Preferred: Experience in the travel industry.
What we are looking for
- Strong SQL skills for complex analytical validation: joins, window functions, aggregation correctness, and performance tuning.
- Proficiency in Python for test automation, data manipulation (pandas/pyarrow), and building test harnesses.
- Demonstrable experience designing automated data-quality checks and defining acceptance criteria for analytics deliverables.
- Experience with CI/CD systems (GitHub Actions, Jenkins, CircleCI) and integrating test runs into pipelines.
- Familiarity with data formats (JSON, Avro, Parquet) and schema evolution strategies.
- Solid understanding of distributed data processing, consistency and eventual-consistency trade-offs, and data lineage.
- Ability to reason statistically about datasets: detecting outliers, sampling strategies, and validating model/metric correctness.
- Excellent debugging skills across code, SQL, job logs, and metadata; ability to produce reproducible test cases and remediation paths.
- Strong collaboration and written communication skills; experience conducting design/code reviews and mentoring peers.
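As a purely hypothetical illustration of the SQL-validation skill described above (the table, columns, and values are invented), a window-function running total can be cross-checked against an independent Python computation in a lightweight in-memory harness:

```python
import sqlite3

# Hypothetical expense table; names and values are illustrative.
data = [(1, 100.0), (2, 50.0), (3, 25.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO expenses VALUES (?, ?)", data)

# Running total computed in SQL via a window function.
sql_rows = conn.execute(
    "SELECT id, SUM(amount) OVER (ORDER BY id) AS running_total "
    "FROM expenses ORDER BY id"
).fetchall()

# Independent Python computation of the same running total.
expected, total = [], 0.0
for row_id, amount in data:
    total += amount
    expected.append((row_id, total))

# The two computations must agree: (1, 100.0), (2, 150.0), (3, 175.0).
assert sql_rows == expected
```

Cross-checking a query against an independently computed result is one common pattern for validating aggregation correctness; in practice the same technique applies against a warehouse such as Snowflake or BigQuery rather than SQLite.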