Jacqueline Bilston

Learning to effectively test data pipelines

A Talk by Jacqueline Bilston (Software Developer, Yelp)

About this Talk

This talk is a story, told through examples in Python and PySpark, about testing ETL pipelines efficiently. I won’t try to convince you that you need unit tests or automated tests; that’s up to you. If you already have unit tests for your ETL pipelines, or want them, it helps to make sure they verify what’s important. I’ll describe how a practical (non-pyramid-shaped) heuristic helps me avoid duplicate test coverage, so that I test only the code needed for the feature I’m building.

About The Speaker

Jacqueline Bilston

Software Developer, Yelp

Categories Covered

Data Engineering

Data Governance