Todd Underwood

Engineering Director

Todd Underwood is a Director at Google and leads Machine Learning for Site Reliability Engineering (ML SRE). He is also Site Lead for Google’s Pittsburgh office. ML SRE teams build and scale internal and external ML services and are critical to almost every Product Area at Google.

When Good Models Go Bad: The damage caused by wayward models and how to prevent it
Wednesday, January 20 | 10:00 AM - 10:30 AM

Delivering a bad model into production serving is deceptively easy to do and can create significant, difficult-to-mitigate damage. Problems with models range from simple issues, such as incompatibility with the serving system, to more subtle quality regressions. Drawing on a hand analysis of approximately 100 incidents tracked over 10 years, we look carefully at cases where these models reached, or almost reached, the serving system. We identify common causes and manifestations of these failures and offer some ideas for measuring the potential damage of various failures. Most importantly, we propose a set of simple (and some more sophisticated) techniques for detecting the problems before they cause damage.
