Senior Data Engineer - Internal Measurement
Domain
Internal measurement and data engineering for an AI-powered media intelligence company
Tech Stack
Python, PySpark, SQL, Airflow, Databricks
Must-Have Requirements
- 4+ years of professional development experience building high-performance, large-scale applications/pipelines
- Strong proficiency in Python/PySpark and SQL
- Hands-on experience with Airflow or a similar orchestration tool
- Strong analytical skills and curiosity for understanding data
- Strong written communication skills
- Solid computer science fundamentals - data structures, algorithms, and software design principles
Nice to Have
- Experience with Databricks or similar data lakehouse platforms
- Familiarity with FinOps concepts or cloud cost management
Description
Samba is an AI-powered media intelligence company on a mission to give marketers the complete picture of their audiences. Our AI indexes media consumption across millions of smart TVs and 2.5 billion web pages, combining that data with third-party signals through the Samba Knowledge Graph, a map of the real interests, behaviors, and purchase intent of 1.5 billion user profiles globally. Brands, agencies, publishers, and platforms use Samba to make smarter decisions across every stage of the marketing funnel.
We are seeking a Data Engineer to join our Internal Data and Performance (IDP) team, part of the Internal Measurement department. The team's mission is to serve as the authoritative source of truth for internal data health and operational metrics - owning and maintaining the systems that track the company's television footprint, partner payments, and data quality, while partnering with other technical teams to drive visibility and data-driven decision-making across the organization.
What You'll Do
- Design, build, and maintain scalable data pipelines that power internal metrics, dashboards, and reporting across the Technology and Product organizations
- Analyze and improve the efficiency, scalability, and stability of data collection, storage, and retrieval processes for our core systems
- Collaborate with data scientists and data analysts to develop new and improved algorithms that best capture the value of our data
- Own and operate Airflow DAGs and other orchestration workflows, ensuring reliable and timely delivery of internal data products
- Participate in code reviews, contribute to documentation, and help raise engineering standards within the team
Who You Are
- 4+ years of professional development experience building high-performance, large-scale applications/pipelines
- Strong proficiency in Python/PySpark and SQL; while we work primarily in Python/PySpark, we recognize that engineers with sound fundamentals can pick up new languages quickly
- Hands-on experience with Airflow or a similar orchestration tool
- Experience with Databricks or similar data lakehouse platforms is desirable
- Strong analytical skills and a genuine curiosity for understanding data - you should be comfortable digging deep into datasets to identify problems and opportunities
- Strong communication skills, especially written, as we are a distributed team with members in both Warsaw and San Francisco
- Solid computer science fundamentals - data structures, algorithms, and software design principles
- Familiarity with FinOps concepts or cloud cost management is a plus, but not required