Google's premier cloud computing AI conference, Google Cloud Next 2023, took place the last week of August at Moscone Center in San Francisco. I attended the event and had the opportunity to spend several days in a variety of keynotes, briefings, and sessions, as well as explore the event's expo floor. Of course, I shared some of my real-time observations via Twitter/X, which you can check out here. Below, I'll share a few of my key takeaways from the event.

This was the first in-person Google Cloud Next event in three years. While the event felt a lot smaller and more compact than the last one I attended, it was still large for a post-pandemic conference, with approximately 15,000 attendees present.

Generative AI in Focus

No surprise here, but generative AI was very much a key theme flowing throughout the event, though there was plenty of content for folks more interested in traditional cloud computing topics.

In addition to enabling new features and capabilities for the company's core AI stack (AI-oriented infrastructure and accelerators, AI/ML/DS platforms, and AI-powered applications), Google is weaving generative AI into non-AI products through Duet AI, which adds AI-based assistant technologies to a wide range of Google Cloud products.

A good indication of the breadth of work they've done to quickly build generative AI into their product base can be seen in the many AI-related announcements they made during the event. Here's a summary of the most interesting AI-focused ones, out of the full list of 161 noted in Alison Wagonfeld's wrap-up post:

- Duet AI in Google Cloud is now in preview with new capabilities, with general availability coming later this year. There were a dozen more announcements covering Duet AI features for specific Google Cloud tools; check out the blog post for a summary.
- Vertex AI Search and Conversation, formerly Enterprise Search on Generative AI App Builder and Conversational AI on Generative AI App Builder, are both now generally available.
- Google Cloud added new models to the Vertex AI Model Garden, including Meta's Llama 2 and Code Llama and the Technology Innovation Institute's Falcon LLM, and pre-announced Anthropic's Claude 2.
- The PaLM 2 foundation model now supports 38 languages and 32,000-token context windows that make it possible to process long documents in prompts.
- The Codey chat and code generation model offers up to a 25% quality improvement in major supported languages for code generation and code chat.
- The Imagen image generation model features improved visual appeal, image editing, captioning, visual question answering, a new tuning feature to align images to guidelines with 10 or fewer samples, and digital watermarking functionality powered by Google DeepMind SynthID.
- Adapter tuning in Vertex AI is generally available for PaLM 2 for text, and Reinforcement Learning from Human Feedback (RLHF) is now in public preview.
- New Vertex AI Extensions let models take actions, retrieve specific information in real time, and act on behalf of users across Google and third-party applications like DataStax, MongoDB, and Redis. New Vertex AI data connectors help ingest data from enterprise and third-party applications like Salesforce, Confluence, and Jira.
- Vertex AI now supports Ray, an open-source unified compute framework for scaling AI and Python workloads.
- Google Cloud announced Colab Enterprise, a managed service in public preview that combines the ease of use of Google's Colab notebooks with enterprise-level security and compliance capabilities.
- Next month Google will make Med-PaLM 2, a medically tuned version of PaLM 2, available as a preview to more customers in the healthcare and life sciences industry.
- New features enhance MLOps for generative AI, including Automatic Metrics in Vertex AI to evaluate models based on a defined task and "ground truth" dataset; Automatic Side by Side in Vertex AI, which uses a large model to evaluate the output of multiple models being tested, helping to augment human evaluation at scale; and a new generation of Vertex AI Feature Store, now built on BigQuery, to help avoid data duplication and preserve data access policies.
- Vertex AI foundation models, including PaLM 2, can now be accessed directly from BigQuery. New model inference in BigQuery lets users run model inferences across formats like TensorFlow, ONNX, and XGBoost, and new capabilities for real-time inference can identify patterns and automatically generate alerts. Vector and semantic search for model tuning are now supported in BigQuery, and you can automatically synchronize vector embeddings in BigQuery with Vertex AI Feature Store for model grounding.
- A3 VMs, based on NVIDIA H100 GPUs and delivered as a GPU supercomputer, will be generally available next month.
- The new Google Cloud TPU v5e, in preview, offers up to 2x higher training performance per dollar and up to 2.5x higher inference performance per dollar for LLMs and generative AI models compared to Cloud TPU v4. New Multislice technology, in preview, lets you scale AI models beyond the boundaries of physical TPU pods, with tens of thousands of Cloud TPU v5e or TPU v4 chips.
- Support for Cloud TPUs in GKE is now available for Cloud TPU v5e and Cloud TPU v4, and support for AI inference on Cloud TPUs is also in preview. GKE now supports Cloud TPU v5e, A3 VMs with NVIDIA H100 GPUs, and Google Cloud Storage FUSE on GKE (GA).

Key Takeaways

My takeaways from Google Cloud Next are very much in the same vein as those from my attendance at Google's Cloud Executive Forum held earlier in the summer. I continue to be impressed with Google Cloud's velocity and focus when it comes to attacking the opportunity presented by generative AI. The company clearly sees gen AI as a way to leap ahead of competitors AWS and Microsoft and is taking an "all in" approach.

The company has also been very quick to rally customers around its new gen AI product offerings. In addition to the product announcements noted above, Google Cloud announced and highlighted new and expanded generative-AI-focused collaborations with a wide variety of customers and partners, including AdoreMe, Anthropic, Bayer Pharmaceuticals, Canoo, Deutsche Bank, Dun & Bradstreet, Fox Sports, GE Appliances, General Motors, Ginkgo Bioworks, Hackensack Meridian Health, HCA Healthcare, Huma, Infinitus, Meditech, MSCI, NVIDIA, Runway, Six Flags, eleven generative AI startups, DocuSign, SAP, and more.

Interesting overview of @FOXSports use of Gen AI. Have 27 PB of video, ingest 10k hrs per month. Have custom models for things like celebrity detection, foul ball prediction, and more. Use the tech to allow analysts to more easily search archives. #GoogleCloudNext pic.twitter.com/ea3tQCVXU0— Sam Charrington (@samcharrington) August 29, 2023

AI-Driven Transformation panel at #googlecloudnext Analyst Summit featuring data leaders from @Snap and @Wayfair.
pic.twitter.com/aANlHv6nHT— Sam Charrington (@samcharrington) August 29, 2023

"For the first time, the business is really engaged in transformation... We will figure out hallucinations, omissions, etc., ... but the level of engagement is game changing." - Gil Perez, Chief Innovation Officer, Deutsche Bank
Additionally, Google Cloud continues to grow their generative AI ecosystem, announcing the availability of Anthropic's Claude 2 and Meta's Llama 2 and Code Llama models in the Vertex AI Model Garden.

TK highlighting breadth of model catalog in Vertex AI, via new and existing model partners. Announcing support for @AnthropicAI Claude2 and @MetaAI Llama2 and CodeLlama models. #googlecloudnext pic.twitter.com/E1gkpT59UA— Sam Charrington (@samcharrington) August 29, 2023

Opportunities

Numerous opportunities remain for Google Cloud, most notably in managing complexity, both in their messaging and communication and in the products themselves.

From a messaging perspective, with so many new ideas to talk about, it is not always clear what is actually a new feature or product capability versus simply a trendy topic that the company wants to be able to talk about. For example, the company mentioned new Grounding features for LLMs numerous times, but I've been unable to find any concrete detail about how new features enable this on the platform. The wrap-up blog post noted previously links to an older blog post on the broader topic of using embeddings to ground LLM output using first-party and third-party products. It's a nice resource, but not really related to any new product features. (I sketch that basic embeddings-grounding pattern at the end of this post.)

And since the conference, I've spent some time exploring various Vertex AI features and APIs, and I generally still find the console and example notebooks a bit confusing to use and the documentation a bit inconsistent. To be fair, these complaints could be leveled at any of Google Cloud's major competitors as well, but coming from an underdog position in the cloud computing race, Google has the most to lose if product complexity makes switching costs too high.

Nonetheless, I'm looking forward to seeing how things evolve for Google Cloud over the next few months. In fact, we won't need to wait a full year for updates, since Google Cloud Next '24 will take place in the spring, April 9-11, in Las Vegas.
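As promised above, here is a minimal sketch of the embeddings-based retrieve-then-ground pattern the linked post describes, written against the Vertex AI Python SDK's text and embedding model interfaces roughly as documented at the time (text-bison and textembedding-gecko). The project ID and the tiny in-memory "document store" are placeholders; a real system would use a vector database or Vertex AI Feature Store.

```python
# Minimal retrieve-then-ground sketch against the Vertex AI SDK (illustrative only).
# Assumes `pip install google-cloud-aiplatform` and an authenticated GCP project.
import numpy as np
import vertexai
from vertexai.language_models import TextEmbeddingModel, TextGenerationModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project ID

embedder = TextEmbeddingModel.from_pretrained("textembedding-gecko")
llm = TextGenerationModel.from_pretrained("text-bison")

# Stand-in "document store"; a real system would use a vector DB or Feature Store.
documents = [
    "Vertex AI Search and Conversation reached general availability at Next '23.",
    "PaLM 2 on Vertex AI supports 32,000-token context windows.",
]

def embed(texts):
    """Return one embedding vector per input text."""
    return np.array([e.values for e in embedder.get_embeddings(texts)])

doc_vectors = embed(documents)

def grounded_answer(question: str) -> str:
    # Retrieve the most similar document by cosine similarity...
    q = embed([question])[0]
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = documents[int(np.argmax(sims))]
    # ...then ask the model to answer using only that retrieved context.
    prompt = (
        "Answer using only the context below. If it is insufficient, say so.\n"
        f"Context: {context}\nQuestion: {question}"
    )
    return llm.predict(prompt, temperature=0.2, max_output_tokens=256).text

print(grounded_answer("How large is the PaLM 2 context window on Vertex AI?"))
```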
I recently had the opportunity to attend the Google Cloud Executive Forum, held at Google's impressive new Bay View campus in Mountain View, California. The Forum was an invitation-only event that brought together CIOs and CTOs of leading companies to discuss Generative AI and showcase Google Cloud's latest advancements in the domain. I shared my real-time reactions to the event content via Twitter, some of which you can find here. (Some weren't hash-tagged, but you can find most by navigating the threads.) In this post I'll add a few key takeaways and observations from the day I spent at the event.

Key Takeaways

Continued product velocity

Google Cloud has executed impressively against the Generative AI opportunity, with a wide variety of product offerings announced at the Google Data Cloud & AI Summit in March and at Google I/O in May. These include new tools like Generative AI Studio and Generative AI App Builder; models like PaLM for Text and Chat, Chirp, Imagen, and Codey; Embeddings APIs for Text and Images; Duet AI for Google Workspace and Google Cloud; new hardware offerings; and more.

The company took the opportunity of the Forum to announce the general availability of Generative AI Studio and Model Garden, both part of the Vertex AI platform, as well as the pre-order availability of Duet AI for Google Workspace. Nenshad Bardoliwalla, product director for Vertex AI, delivered an impressive demo showing one-click fine-tuning and deployment of foundation models on the platform.

Considering that the post-ChatGPT Generative AI wave is only six months old, Google's ability to quickly get Gen AI products out the door and into customer hands has been noteworthy.

Customer and partner traction

Speaking of customers, this was another area where Google Cloud's performance has been impressive. The company announced several new Generative AI customer case studies at the Forum, including Mayo Clinic, GA Telesis, Priceline, and PhotoRoom. Executives from Wendy's, Wayfair, Priceline, and Mayo participated in an engaging customer panel that was part of the opening keynote session. Several other customers were mentioned during various keynotes and sessions, as well as in private meetings I had with Google Cloud execs. See my Twitter thread for highlights and perspectives from the customer panel, which shared interesting insights about how those orgs are thinking about generative AI.

Strong positioning

While Models Aren't Everything™, in a generative AI competitive landscape in which Microsoft's strategy is strongly oriented around a single opaque model (ChatGPT via its OpenAI investment) and AWS' strategy is strongly oriented around models from partners and open source communities, Google Cloud is promoting itself as a one-stop shop, with strong first-party models from Google AI, support for open source models via its Model Garden, and partnerships with external research labs like AI21, Anthropic, and Cohere. The company also demonstrates a strong understanding of enterprise customer requirements around generative AI, with particular emphasis on data and model privacy, security, and governance.
The company's strategy will continue to evolve and unfold in the coming months, and much more will be discussed at Google Cloud Next in August, but I liked what I heard from product leaders at the event about the direction they're heading. One hint: they have some strong ideas about how to address hallucination, which is one of the biggest drawbacks to enterprise use of large language models (LLMs). I don't believe that hallucinations by LLMs can ever be completely eliminated, but in the context of a complete system with access to a comprehensive map of the world's knowledge, there's a good chance that the issue can be sufficiently mitigated to make LLMs useful in a wide variety of customer-facing enterprise use cases.

Complex communication environment and need to educate

In his opening keynote to an audience of executives, TK introduced concepts like reinforcement learning from human feedback, low-rank adaptation, synthetic data generation, and more. While impressive, and to some degree an issue of TK's personal style, it's also a bit indicative of where we are in this market that we're talking to CIOs about LoRA and not ROI. This will certainly evolve as customers get more sophisticated and use cases stabilize, but it's indicative of the complex communication challenges Google faces in evangelizing highly technical products in a brand new space to a rapidly growing audience.

This also highlights the need for strong customer and market education efforts to help bring all the new entrants up to speed. To this end, Google Cloud announced new consulting offerings, learning journeys, and reference architectures at the Forum to help customers get up to speed (adding to the training courses announced at I/O). I also got to chat 1:1 with one of their "black belt ambassadors," part of a team they've put in place to help support the broader engineering, sales, and other internal teams at the company. Overall, I think the company's success will in large part depend on their effectiveness at helping to bring these external and internal communities up to speed on Generative AI.

Broad range of attitudes

A broad range of attitudes about Generative AI was present at the event. On the one hand, there was what I took as a very healthy "moderated enthusiasm" on the part of some. Wayfair CTO Fiona Tan exemplified this perspective both in her comments on the customer panel and in our lunch discussion. She talked about the need to manage "digital legacy" and the importance of platform investments, and was clear in noting that many of the company's early investments in generative AI were experiments (e.g., a Stable Diffusion-based room designer they're working on).

On the other hand, there were comments clearly indicative of "inflated expectations," like those of another panelist who speculated that using code generation would allow enterprises to reduce the time it takes to build applications from six weeks to two days, or those of a fellow analyst who proclaimed that generative AI was the solution to healthcare in America. The quicker we get everyone past this stage the better. For its part, Google Cloud did a good job navigating this communication challenge by staying grounded in what real companies were doing with its products.

I'm grateful to the Google Cloud Analyst Relations team for bringing me out to attend the event. Disclosure: Google is a client.
Today we’re joined by Disha Singla, a senior director of machine learning engineering at Capital One. In our conversation with Disha, we explore her role as the leader of the Data Insights team at Capital One, where they’ve been tasked with creating reusable libraries, components, and workflows to make ML usable broadly across the company, as well as a platform to make it all accessible and to drive meaningful insights. We discuss the construction of her team, as well as the types of interactions and requests they receive from their customers (data scientists), productionized use cases from the platform, and their efforts to transition from batch to real-time deployment. Disha also shares her thoughts on the ROI of machine learning and getting buy-in from executives, how she sees machine learning evolving at the company over the next 10 years, and much more!
Peter Skomoroch is an entrepreneur, investor, and the former Head of Data Products at Workday and LinkedIn. He was Co-Founder and CEO of SkipFlag, a venture backed deep learning startup acquired by Workday in 2018. Peter is a senior executive with extensive experience building and running teams that develop products powered by data and machine learning. He was an early member of the data team at LinkedIn, the world's largest professional network with over 500 million members worldwide. As a Principal Data Scientist at LinkedIn, he led data science teams focused on reputation, search, inferred identity, and building data products. He was also the creator of LinkedIn Skills and Endorsements, one of the fastest growing new product features in LinkedIn's history. Before joining LinkedIn, Peter was Director of Analytics at Juice Analytics and a Senior Research Engineer at AOL Search. In a previous life, he developed price optimization models for Fortune 500 retailers, studied machine learning at MIT, and worked on Biodefense projects for DARPA and The Department of Defense. Peter has a B.S. in Mathematics and Physics from Brandeis University and research experience in Machine Learning and Neuroscience.
Technical executive working at the intersection of business, data, health, and technology. Experience in analytic product development, computational research, and management consulting.

Favorite parts of what I do:
- Solving problems that "move the needle" in healthcare delivery and community health
- Collaborating at all levels, being challenged intellectually and technically
- Incubating new capabilities and then maturing them into practical and scalable solutions
- Working "in the trenches" with researchers and engineers

I am known for…
- Leading by influence to drive outcomes, reaching cross-functionally, internally and externally
- Translating complex business/clinical needs into sustainable end-to-end data science solutions
- Identifying the right data, technology, and scientific methods to address the problem at hand
- Recruiting, retaining, cultivating, and empowering analytics, data science, ML engineering, and AI talent
- Advocating for and practicing AI transparency and ethics
What currently keeps Noah busy is working as an Executive in Residence at the Duke MIDS (Data Science) and Duke AI Product Management programs and as a consultant and author in Cloud Computing, Big Data, DevOps, and MLOps. He is the author of five O'Reilly books and numerous courses on MLOps, DevOps, and Cloud Computing.
Diego Oppenheimer is an entrepreneur and product developer with an extensive background in all things data. Currently, he is an executive vice president at DataRobot, the enterprise MLOps platform, where he helps organizations scale and achieve their full potential through machine learning. Previously, he founded Algorithmia and led it through its acquisition. Diego is active in AI/ML communities and works with leaders to define ML industry standards and best practices. He brings his passion for data from his time at Microsoft, where he shipped some of Microsoft's most used data analysis products, including Excel, Power Pivot, SQL Server, and Power BI. Diego holds a Bachelor's degree in Information Systems and a Master's degree in Business Intelligence and Data Analytics from Carnegie Mellon University.
Today was the sixth day of TWIMLcon and the final day of presentations before we head into a full day of workshops and then a wrap-up unconference. Today we were fortunate to speak to folks from LinkedIn, Intuit, Cloudera, Yelp, Rakuten, Microsoft, Salesforce, and Fiddler. We covered a variety of subjects, including:
- how to build out an ML platform team and gain success over time
- the idea that there is now an MDLC (Model Development Lifecycle) to go along with our SDLC (Software Development Lifecycle)
- how you should support your data team on "Day 1" and also "Day N"
- how you can work better with the business by providing transparency and visibility, education, and shared ROI analysis
- three key persona groups and their different requirements and needs from an ML platform
- how and why testing and experimentation are different but complementary
- challenges to operationalizing ML
- characteristics of a good ML platform
- a rule of thumb for build vs. buy, and a powerful vendor-agnostic end-to-end ML stack architecture
- tips and techniques for taking Responsible AI from a rare conversation to a key step in your overall enterprise software development lifecycle

Ya Xu, Head of Data Science at LinkedIn, kicked things off by sharing her thoughts on the three stages of building a platform: the Build Phase, the Adoption Phase, and the Maturity Phase. She believes that if you build an easily extensible platform that solves your users' needs, they'll stick with your platform and not try to build their own. She shared some striking scale numbers, noting that LinkedIn often has over 500 experiments actively running at any given time. She summarized why we designed and built this entire conference: "I'm a big fan of platforms. We have an experiments platform, our main ML platform, an artifact catalog (DataHub), an anomaly detection platform, and a distributed OLAP system for data storage. Platforms let you move faster."

Ian Sebanja (Product Manager, Intuit Machine Learning Platform) and Srivathsan Canchi (Head of Engineering, ML Platform Team) shared their two-pronged approach to ensuring that costs are known and ROI can be calculated consistently. The first prong was minimizing overload on the data science team by ensuring that things (models, features, resources) are all tracked without the need for their input. They build on that by providing as much automation as possible to help the data science team be as effective as possible. They have also designed "smart defaults" so that users can spin up instances as needed and infrastructure spins down automatically afterwards to avoid unwanted infrastructure spend. The second prong involved being as transparent and educational as possible. They surface all infrastructure cost information to developers at the point of execution so that they can make good business decisions and trade-offs with regard to speed or performance vs. cost. The data team and ops team then also have shared information from which to assess and calculate the ROI of a given project.

Justin Norman, VP Data Science & Analytics at Yelp, joined us next and walked us through much of the Yelp ML platform stack architecture. He made the case that "The goal is to produce an ML system that functions reliably and that can replicate that at scale. We need to run experiments and we also need to test. These are different!"
He then provided us all with a simple rule of thumb, suggesting that if we can fill in the following, we're ready to proceed, and if we can't, we have more work to do before moving to the next step: "If we [build this thing defined by the ML developer] then [this metric defined by the data scientist] will move because of [this change in behavior identified by the product manager]."

Next up, Mohamed Elgendy, formerly of Amazon, Intel, Twilio, and Rakuten, and now CEO and co-founder of a new startup called Kolena.io, shared with us a comprehensive review of the MLOps space. It contained ML operationalization challenges, a discussion of technical debt in AI/ML, a guess at what you have running today in your own shop, characteristics of a real (full) ML platform, some rules of thumb on build vs. buy, recommendations if you decide to build or buy, and finally an invitation to a new ML community he's building at Kolena.io. Phew!

We then moved into a panel discussion with Romer Rosales (Head of Consumer AI, LinkedIn), Sarah Bird (Principal Program Manager, Microsoft), and Kathy Baxter (Principal Architect, Ethical AI Practice, Salesforce). Sarah kicked us off with a comment that echoed Diego Oppenheimer's quote from earlier in the week: "Last year the conversation was 'How do we think about this?' This year, the principles are known and it's more about scaling Ethical AI." Romer made the case that while it takes a village to do responsible AI, there is no option NOT to do this, and that it's key to have executive support for your Responsible AI initiatives. Sarah shared a tactic that their team uses, which is to have a "ship room" where any team can come and work directly with Responsible AI experts to ensure that their product both meets the standards and will ship on time. Cool concept! Kathy shared a tactic used by her team (borrowed from Timnit Gebru and Deb Raji) of providing "model cards" that are like nutrition labels on food, outlining the model lineage, data sources, training approach, and more (a rough sketch of what such a card captures appears at the end of this recap). All three shared examples of how their respective teams are directly impacting product releases. They shared a common belief that making Responsible AI a major element of the full software development lifecycle is a better approach than using it as a last-minute audit and enforcement function on product teams.

Our last session of the day was a workshop with Krishna Gade (CEO/Founder) and Rob Harrell (Product Manager) discussing how Fiddler provides explainable monitoring for AI/ML projects. They shared a few key points. Most models are a black box: in most cases we have no visibility into model performance, no monitoring to catch potential bias or drift, and no explanations of model behavior or predictions. Deployed AI systems are error-prone: error creeps in through data bias, data drift, feature processing, data pipelines, model performance decay, and model bias, all of which impacts KPIs and impacts the business (and possibly customers). We need APM for AI/ML: product managers and developers have Application Performance Monitoring tools; we need the same thing for AI and ML.

We would like to thank Ya, Srivathsan, Ian, Priyank, Justin, Mohamed, Romer, Sarah, Kathy, Krishna, and Rob for sharing all their hard-earned lessons with us today. We are now heading into the final two days of the conference. We have a full day of workshops tomorrow and then an unconference on Friday.
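As referenced in the panel recap above, a model card is easy to picture as a small structured record attached to every deployed model. The sketch below is hypothetical; the field names are illustrative and are not Salesforce's actual model card schema.

```python
# A minimal, hypothetical "model card": a nutrition label for a deployed model.
# Field names are illustrative only, not any vendor's actual schema.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ModelCard:
    name: str
    version: str
    intended_use: str                          # what the model is (and is not) for
    training_data_sources: List[str]           # lineage of the data used to train it
    training_approach: str                     # e.g. "gradient-boosted trees, monthly retrain"
    evaluation_metrics: Dict[str, float]       # overall and per-group metrics
    known_limitations: List[str] = field(default_factory=list)
    ethical_considerations: List[str] = field(default_factory=list)

card = ModelCard(
    name="loan-review-ranker",
    version="2021.02",
    intended_use="Rank applications for human review; not for automated denial.",
    training_data_sources=["applications_2018_2020", "bureau_features_v3"],
    training_approach="gradient-boosted trees, retrained monthly",
    evaluation_metrics={"auc": 0.91, "false_positive_rate_gap_by_group": 0.03},
    known_limitations=["Sparse data for applicants under 21"],
)
print(card)
```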
If you missed registration or were unable to attend, you can still register now (you'll need the Pro Plus or Exec Summit pass) and you'll have full ongoing access to all the incredible sessions from the conference. The only thing you'll miss is the awesome networking and the swag bag!
Sam Charrington: Hey everyone! Last week was the first week of our TWIMLcon: AI Platforms conference, and what a great first week it was! Following three days of informative sessions and workshops, we concluded the week with our inaugural TWIMLcon Executive Summit, a packed day featuring insightful and inspiring sessions with leaders from companies like BP, Walmart, Accenture, Qualcomm, Orangetheory Fitness, Cruise, and many more. If you're not attending the conference and would like a sense of what's been happening, check out twimlcon.com/blog for our daily recaps, and consider joining us for week two!

Before we jump into today's interview, I'd like to say thanks to our friends at Microsoft for their continued support of the podcast and their sponsorship of this series! Microsoft's mission is to empower every single person on the planet to achieve more. We're excited to partner with them on this series of shows, in which we share experiences at the intersection of AI and innovation to inspire customers to reimagine their businesses and the world. Learn more at Microsoft.com/ai and Microsoft.com/innovation

Sam Charrington: [00:01:29] All right, everyone. I am here with Gurdeep Pall. Gurdeep is a corporate vice president with Microsoft. Gurdeep, welcome to the podcast!

Gurdeep Pall: [00:01:38] Thank you, Sam. Really excited to be here.

Sam Charrington: [00:01:40] I'm super excited for our conversation today! As is our typical flow, I'd love to have you start by introducing yourself. You've had quite a career at Microsoft, culminating in your work in AI and autonomous systems. Tell us a little bit about your background and how you came to work in this field.

Gurdeep Pall: [00:02:02] Thanks, Sam. I've had a really nice long run at Microsoft, as you mentioned. In fact, today is my 31st anniversary at Microsoft.

Sam Charrington: [00:02:11] Wow.

Gurdeep Pall: [00:02:12] So, yeah, it's been a long career, but I've really had a great time. In fact, I feel like I've been into the candy store like three times. My career can be divided into three parts. First, I worked on networking and operating systems. That was my first gig at Microsoft, and I was very fortunate to work on a lot of the internet technologies when they were first rolled out in operating systems. I worked on VPNs, I worked on remote access, and then, up through Windows XP, I was the general manager for Windows networking, where we shipped Wi-Fi for the first time in a general purpose operating system. At that time I moved over to work on communications and started Microsoft's communications business. These are products that you may remember from the past, things like Office Communications Server, which became Lync, which became Skype for Business, which is now Teams. So I started that business from scratch and was involved with it all the way until a few days before we announced Teams. Though I'd had a stint in the middle on AI, and I came back to work on AI. So there have been roughly three parts to my career, the latest being AI, and I've had lots of fun in all of them.

Sam Charrington: [00:03:30] That's awesome. I talk to so many people at Microsoft who are working in AI, and a lot of them started their careers working on Bing. You're maybe one of the outliers in that regard.

Gurdeep Pall: [00:03:43] Well, the funny thing is that the first stint I mentioned on AI was actually in the Bing team, and I was running Microsoft speech.
I was running some of the interesting explorations we were doing at Bing, recognizing objects. In fact, some of the image stabilization work that made its way to HoloLens actually came out of that group. So yeah, I worked on maps and lots of interesting stuff.

Sam Charrington: [00:04:08] That's awesome. So tell us a little bit about autonomous systems and some of the work you're doing in that area.

Gurdeep Pall: [00:04:14] Yeah. For the last four years or so, I've been focused on emerging technology and how it can be applied to interesting business problems. In that regard, I've worked on some interesting technology in the language understanding space. I worked on ambient intelligence, where you could actually make sense of a space, sort of make reality computable, if you will. And then, as I was exploring interesting emerging AI that can solve business problems, we started focusing on autonomous systems. That was interesting to us not just as a very interesting area that AI was enabling, but also because Microsoft didn't have a lot of focus in that area before. So when I talked to Satya, and at the time Harry Shum was here, we decided this was an area we were going to go invest in.

Sam Charrington: [00:05:04] Interesting. And one of those investments was the acquisition of a company called Bonsai. This is a company that I know well. I interviewed one of the founders, Mark Hammond, back in 2017. It's hard to believe it was that long ago. The company had a really interesting take on using technologies that are still difficult for folks to put to productive use, namely reinforcement learning. Their take on it was this idea of machine teaching. Maybe you can tell us a little bit about that acquisition, the role that it plays in the way Microsoft thinks about autonomous systems, and elaborate on this idea of machine teaching and some of the things that Bonsai brings to the table.

Gurdeep Pall: [00:05:49] Sure. Absolutely. When we started focusing on autonomous systems, we were trying to get our hands around this thing. People interpret autonomous systems many different ways. Some people think it's only about autonomous driving, so let's build a vertical stack. Some people think about robots, these humanoid robots with arms and joints and so on. And we were thinking, what is our point of view? At the end of the day, we look at our own capabilities. We're a software company; what is a software interpretation of the space? It was with this point of view that we started thinking about it. There was some work going on in Microsoft Research at the time, which I'll talk more about. That's when I first met Mark and team, and we had a really good discussion. As we finished the first meeting, I remember this thing going through my head, that this is such a great approach, and it really fits into how we are starting to think about this space and makes sense to us. And then I also thought, God, this feels like just the wrong thing for a startup to do; building platforms and tools is a tough thing. And Mark is such an incredible guy. I think you've talked to him, so you know that. When we first finished the acquisition, he shared that with me too. He said, every VC I talked to asked, why are you doing this? This is the kind of thing Microsoft should be doing. So it was a marriage sort of made in heaven, as it were, and so we acquired the company.
And it's been really great actually working with Mark, picking up from some incredible thinking that, you know, he and Keen had done with the team that was there, then really expanding on that, helping it realize its potential, and also making it much more of an enterprise-ready offering, because this space is as mission critical and as important as it gets. So that's been a very fun journey for the last two and a half years.

Sam Charrington: [00:07:52] One of the ways I've heard you describe the way you're approaching autonomous systems, or that world broadly, is two words, and I still may butcher one of them, but it's like this marriage of bits and, is it atoms that you say? Or molecules, or something else? But the idea is that, and this was something that was core to the way Bonsai articulated what they called then industrial AI, it's a different problem when you're applying AI solely in a software world, recommendations on a website or looking at customer churn, versus when you're actually trying to move physical goods or devices or systems. Elaborate on what you've seen in terms of the different requirements that come up in that world.

Gurdeep Pall: [00:08:43] Absolutely. This is a very important point. When we started focusing on autonomous systems, I had people asking me about half the time, "oh, you're talking about RPA, right?" No, I'm not talking about RPA. Of course, it doesn't help that some of the RPA companies were calling their tech robots that could take action and so on. So in some ways it was just a way for us to be clear about what we are doing. And we said, no, we're actually focused on atoms, not just bits. Of course, to digitize anything you have to go from atoms to bits and then reason over it, but that became sort of the mainstay for us. The biggest difference, I would say, between those two worlds is that the physical world is governed by things like physics. There's Newtonian physics, and then you get into some of the multi-joint movements and you get into fluids, which is a whole different kind of physics. So you have to really think about modeling the real world and how you can then apply the tech towards that. The second thing I would say is that most of the scenarios in autonomous systems pertain to taking action in the real world, and when you're taking action in the real world, every time you take an action, the real world changes. This is where reinforcement learning becomes a very natural mate as an AI technology for problems in the real world, which is great, because we have no other science which allows us to take a really sort of unbounded state space and actually reason within it. So reinforcement learning becomes this really important piece. Lastly, I would say that every problem we've looked at in the autonomous systems space typically is one where experts exist already. So far we haven't been called to a problem where this is completely new and completely different and "oh, let's solve it for the first time," you know?
So tapping into the human expertise became a very important piece of this equation as well, which sometimes you don't need to worry about, [inaudible] the data, you throw things at it, and then maybe there is judging, certainly, if you want to fine-tune the models and so on. But that was another interesting aspect of this.

Sam Charrington: [00:11:11] So we'll be digging a little bit deeper into some of the technology that makes all this happen, but you started to mention some of the use case scenarios. Can you dig a little bit deeper into some specific scenarios that you've been working on?

Gurdeep Pall: [00:11:27] Absolutely. And that's one of the things which makes this very, very interesting to me, because literally everything you see in the world around you can be a target for some of the technology that we're building. Everything from smart climate controls: HVAC control is a field where, for the last 70 years, there's been very incremental improvement. Things like fuzzy logic and so on have been used, and things had plateaued out in performance. We've seen incredible results using our approach; we were able to bring much better performance, whether energy savings or better climate control. We've seen oil drilling, horizontal drilling, from companies like Shell, where you have these incredibly big machines that look like bazookas, and you're drilling with them. These machines need a pretty high level of precision, so great human experts can do it, but sometimes you need more work than you can get trained experts for, so being able to guide the drill bits through that matters. Cheeto extrusion is a very interesting, complicated process. You know, it's very easy to eat, very hard to make. I always say, I know there are professional chefs out there, but certainly I cannot make the same kind of eggs every morning, because even that simple task of heating the oil and getting it just right and putting the eggs in, you cannot replicate every time. But if you're Pepsi and you're making Cheetos, that has to be consistent every time. When you open a bag of Cheetos, everybody's familiar with the fluffiness and the crispness, so everybody's a judge and you have to win every time. It's a very hard problem, because you have this corn meal mixed with water, and it's impacted by the age of the machine that's extruding, sometimes by humidity, temperature, all these things. So it's a highly dynamical system, and experts today sample and then tweak, and sample and then tweak, and they have really very stressful jobs trying to keep that quality right; otherwise the quality folks will come in and reject the material. So this is a problem we've been able to apply our tools to, basically consistently tweaking the parameters of the process so that you have consistent Cheetos coming out on the other side. Chemical process control and polymer manufacturing: very, very hard problems. Some of these take six months just to design the process for producing polymer of a particular grade, and we've been able to apply this to both the designing and the actual manufacturing process itself. Our favorite thing is flying things. Bell Flight is an incredible company; they have all kinds of commercial as well as military applications for their vertical-liftoff vehicles and so on. They're trying to bring autonomous capability to those things.
So we've been able to apply this towards that as well. As you can see, anything which has control in the real world, where you're sensing, you're picking an action, you're taking that action, and you're sensing again, anywhere this kind of loop exists, this technology can be applied.

Sam Charrington: [00:14:53] It's been interesting over the past few years, reflecting on some of the early conversations I had with Mark and the team at Bonsai. There's kind of this pendulum in the industry where we started out with rules, like physics and how things work, and then early on in applying AI we threw all those rules away and leaned heavily on data and statistics. Over the past few years there have been efforts, both in academia as well as what you're doing, to incorporate the rules and the human expertise back into the equation without tossing everything that we've gained in applying data. One of the interesting challenges when you layer on the physical world here is simulation: how do you let an agent explore and learn without destroying helicopters and lots of Cheetos? Share a little bit about the challenge of simulation and how that's evolved to help make some of these problems more tenable.

Gurdeep Pall: [00:16:01] Yeah. I think that's such an important piece of this equation. Reinforcement learning is great, but reinforcement learning requires many, many, many steps just to get a policy to be robust. You can be 60 million cranks in before you start to see your policy develop to the appropriate level. So the question is, how do you do that in the real world? And this is one of the big insights I think the Bonsai folks came up with, and then there was some work happening at Microsoft Research coming at it from a very different direction, but they sort of merged together. This is AirSim, and I can talk more about that, but the ability to model the appropriate aspects of the real world so that you can actually take action against them, get the right input back, and use that to train the model has been one of the biggest insights here. Because really, what it says is you're taking the physical world and creating a mapping of it in the digital world, which then allows you to train the models quickly. And that's where these simulators come in. Now simulators, depending on what they're trying to simulate, can be very computationally intensive. If you're solving Navier-Stokes equations and things like that, CFD simulations, these are pretty long-running simulations, and some are, of course, faster. Because we are using simulators for training AI, we want to crank them very, very quickly. So sometimes you end up with this problem where the physics, or at least how that physics is approached using these mathematical equations, actually becomes a big piece of the problem. And so this is an area of how you take simulation and mate it with the training of the AI in a way that you can do it fast, you can do it cheap, and you can frankly do it in parallel, because one of the things we have with some of the RL algorithms now is that you can take the last best known policy, explore on thousands of machines at the same time, take the samples, come back and update the policy, and then fan it out again, and you've got learners which are learning very quickly.
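To make the loop Gurdeep describes concrete, here is a minimal, generic sketch of policy training against a simulator: act, observe, and keep a policy change only if it helped. This is purely illustrative Python, not the Bonsai or Azure tool chain, and every name in it is invented for the example.

```python
# Generic sketch of simulator-driven policy training (illustrative only; not Bonsai APIs).
import random

class SimulatorEnv:
    """Toy stand-in for a physics simulator: the state drifts, actions push it back."""
    def reset(self):
        self.state = random.uniform(-1.0, 1.0)
        return self.state

    def step(self, action):
        self.state += 0.1 * action + random.gauss(0, 0.01)
        return self.state, -abs(self.state)          # reward being near the setpoint 0

def train(env, episodes=200, horizon=50):
    """Hill-climb a one-parameter policy entirely inside the simulator."""
    gain, best_reward = 0.0, float("-inf")
    for _ in range(episodes):                         # real systems run millions of cranks,
        candidate = gain + random.gauss(0, 0.1)       # often on many simulators in parallel
        state, total = env.reset(), 0.0
        for _ in range(horizon):
            state, reward = env.step(-candidate * state)   # act, observe, accumulate reward
            total += reward
        if total > best_reward:                       # keep the perturbed policy if it helped
            gain, best_reward = candidate, total
    return gain

print("learned gain:", train(SimulatorEnv()))
```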
Getting all that figured out is actually one of the big things we managed to get done after the acquisition as well. It's all running on Azure and really allows us to do this stuff efficiently.

Sam Charrington: [00:18:33] You mentioned AirSim. What is that, and what's the role that it plays?

Gurdeep Pall: [00:18:36] Yeah, so AirSim was a project in Microsoft Research which started off in a team that was exploring drones and how you bring autonomy to drones. And they had a very similar experience. This was, I think, starting in 2015. They would go out with their drone in the morning, come back with a broken drone in the evening, and have very, very little data. And it's like, how are we ever going to get enough data to actually get this thing to fly, to do even the basic tasks? So that's when they looked at some of the work that was happening in, frankly, the gaming world. They looked at some of the incredible scenes that could be rendered with Unreal and Unity and those kinds of things, which, if you've seen Forza and stuff like that, start to look pretty real. And they said, let's create a simulator for perception-oriented tasks, where you can create a scene and integrate physics into that scene for the different objects that are involved. There could be a flying object, something with wheels that's driving, et cetera. You integrate the physics, and now you've created an environment in which you can train AI. It could be reinforcement learning, where you model the actual sensors inside this virtual environment and use that for reinforcement learning and taking actions. Or you can use the sensors that are modeled inside of AirSim and just generate lots of data on which you can do supervised learning offline. It works for both purposes. So they created this tool for themselves and realized it was so powerful that they put it out as an open source utility. Today it has more than 10,000 stars on GitHub. It is really one of the most popular tools, because others are realizing that this idea of being able to simulate reality is a very powerful approach.

Sam Charrington: [00:20:35] So, can you maybe talk us through, for any of the use cases you described, when you go into an environment with a real customer with real problems, what's the process to actually get something up and running and demonstrate value that they can build on, meaning concrete value as opposed to theoretical POC value? What does it take to really do that?

Gurdeep Pall: [00:21:02] This is something that we've been working on and will continue to work on, because our goal is to get this to a point where people are able to identify that this is a great tool for the problem that they have. It's not some sort of speculative exploration exercise. They know that they'll definitely get results if they adopt this tool chain, and that going from there to actually training the policy, exporting the brain, and starting to use it in the real world is a pretty short period. This has been a journey for us; it started off fairly long. And now we are at a point where we are focusing on these so-called solution accelerators, areas where the problem we are solving is very clear and how to solve it is very clear.
And then some of the things that you need, like simulators: sometimes folks already have simulators, in other cases they need one. The entire thing is stitched together, and all they need to do is come in, create the variations for their problem, create the policy, and then go ahead and use it. But this is what is needed to take a customer from "Hey, I've got a problem, I don't know what this thing does" to "Okay, now I know the kind of problem, but I don't know if it can be solved with this or not." So this is what we've been targeting. And as we've gotten our solution accelerators to be very crisp, so has how we talk to customers, because, as you're alluding to, there's an education thing here and there is a confidence thing here. So we have to address all those pieces, and we're bringing the customers along the journey. The great thing is, with customers like Pepsi, the moment one thing was successful, they looked around the factory and said, I can put this approach on many things, and that's the conversation we're having right now. The same thing with Shell, the same thing at Dell. So, this is the journey.

Sam Charrington: [00:23:01] I appreciate in that the idea that, contrary to what you might think if you read popular reporting about AI, it's not a silver bullet, particularly in this domain; it's not like you've got some tool chain and it applies to every problem that any customer might have. And it sounds like you're being strategic and selective, building expertise and supporting tools around specific areas, so that, to your point, when you are engaging with someone, they can have a high degree of confidence that you've done this before, and you know how it's going to work and what the process is.

Gurdeep Pall: [00:23:37] Exactly. And the other interesting thing that we found, which is I think a little unique compared to some of the other things we've done with AI, is that the experts we end up talking to in the different industries and application areas have never encountered AI before. These are folks who went to engineering discipline schools, real engineers, not fake engineers like software engineers like us. I mean, these are mechanical, chemical, what have you. When they went through college, they did MATLAB and they learned Simulink and so on, and they have relied on a set of tools that have given them employment, given them career success, and stood the test of time. And here these five guys walk in with swagger: hey, we've got AI for you and it's called reinforcement learning, it's really awesome, you've got to try it. I mean, that just doesn't work. You have to really bring them along. And then they have some real concerns that we've had to go and take in, like safety. Even if this thing works, they want to be able to assert that it isn't going to do something crazy. I mean, when you have that horizontal drilling machine from Shell, this thing can drill through anything; it's huge. There was a Wall Street Journal article about this project a few years ago, when we first did the challenge, and for them, they want to make sure that this thing is actually going to be safe and isn't going to create another new problem while it solves one. So it's been a learning thing for us, but there's the need for education, the need for bringing these folks along.
And this is one of the reasons we did this project, Moab, which is this very interesting device. It's like a toy, basically. It has three robotic arms, if you will, with a clear plate on top, and the task is to balance a ping pong ball on that plate. Now, with this problem, of course, you can imagine it: the engineers will go to PID, right? I mean, PID control is something they learned in college. And guess what? So we said, first, let's start with PID. It does a pretty good job. But then we said, okay, well, I'm going to toss the ball onto the plate and see if it catches it. Well, it turns out it doesn't catch it. So then we added more complexity: how about we try to make the ball go around the edge of the plate? As the problem progresses in complexity, you realize that the only way you can solve it is if you have something like our tool chain, which we have with Bonsai: you create a simulator, you have a policy that you're training, and then you're able to get to that level of performance. So we did this solely to bring along engineers who are used to a particular way of working, to get them to start to believe and to start to get excited about this. We created a sort of metaphor through which we could connect with them.

Sam Charrington: [00:26:37] Interesting. It reminds me of this idea of why deep learning is so important, and software 2.0: where it's particularly powerful is in solving problems that we didn't know how to write the rules for, like in computer vision. How do you identify a cat versus a dog? Who knows how to write the rules for that, but the neural network can figure it out. And similarly, there is a range of problems that PID is easily applied to, but there's also a level of complexity that it is difficult to apply it to, and that is where you're finding the value in applying RL.
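For readers who haven't touched PID control since college, the baseline being discussed is a three-term feedback law. Here is a minimal, generic sketch in Python; it is illustrative only, not the Moab or Bonsai code, and the toy plate dynamics are invented for the example.

```python
# Minimal PID controller sketch (generic illustration; not the Moab implementation).
class PID:
    def __init__(self, kp, ki, kd, setpoint=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        # Output = proportional + integral + derivative terms.
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy example: drive a ball's position on a plate toward the center (setpoint 0).
pid = PID(kp=2.0, ki=0.1, kd=0.5)
position, velocity, dt = 0.3, 0.0, 0.02
for _ in range(200):
    tilt = pid.update(position, dt)   # plate tilt command from the controller
    velocity += tilt * dt             # invented dynamics: tilt accelerates the ball
    position += velocity * dt
print("final position:", round(position, 4))
```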
Sam Charrington: [00:28:07] So maybe let's drill into that example more, because I do want to get to a more concrete understanding of what the process looks like. I've got a data center or a physical plant or something, my HVAC costs are through the roof, and someone told me about this AI thing on an airplane, so I called Gurdeep. What's the first thing that I do, and how do I get from there to some cost reduction or greater efficiency or whatever my goal is in applying some of this?

Gurdeep Pall: [00:28:40] So in this particular case, we're focusing one of our solution accelerators just on this use case. And so we are able to say with very high confidence that if you can give us this information, which is typically data you've already collected, because a lot of these are now IoT-style devices, we're able to ingest that data and, in this case (this is another double-click on the simulation piece), actually create a data-driven simulator, and we can then start creating a policy. Now, they do need to specify, and this is where machine teaching comes in, what behavior they desire. That specification is fairly flexible. You could say things like, "I want a particular behavior between these times of the day." Or you could say, "If the outside temperature," which becomes one of the state variables that goes into creating the brain, "is outside of this range, then I want this kind of behavior": in summer I want it to be cooler, and in winter I want it to be warmer. All of those inputs then create a policy which automatically controls the HVAC system, which means turning on the fan, turning on the heat, or turning on the cooling, and doing it dynamically, because once the brain is built, all you have to do is connect the inputs and the actions. The inputs are where we are sampling the state, and the actions are what you are commanding: increase heat, decrease heat, turn on the fan, turn off the fan, et cetera. And by the way, it's not just temperature in this case; it's also the carbon dioxide and nitrogen levels and so on. All of those are being sensed, and the actions are taken based on that. So that is the position we would be in. Again, we're trying to make it as turnkey as possible, but recognize that every building is different; every building has its own climate fingerprint, so there is work required in creating the brains. You could take a brain off the shelf and use it, but I can't say whether that would work better. It might have better energy consumption, but then maybe the people are not as comfortable. So you have to tweak it, and the more efficient we can make this end-to-end process, the sooner folks can realize the value.

Sam Charrington: And a brain in this case is essentially a model, or an agent, or something like that. Is that fair?

Gurdeep Pall: Great question. I have had lots of folks ask me, including Bill Gates, "Why do you call it a brain?" And I think it's a really good question. The way we talk about it is that it's actually a collection of models. Autonomous-system tasks can sometimes be decomposed into different parts. For example, if a robotic hand has to pick up an object and stack it: reach can be one action, pick up another, then move, then stack. These are all distinct actions. Some are pretty easy; you can almost program them. Reaching nowadays is often programmed, depending on the device you have, but some need to be trained. So this whole collection of things has to be orchestrated, and the right piece has to be invoked at the right time. Each one of them is either programmed or it is a model, a deep learning model, a DNN, and putting all of it together becomes the brain. In fact, that's how the human brain works, so the name is actually quite apt: the visual cortex has a particular purpose, then it hands off to another piece which does reasoning, and then when you want to take the action, that invokes a different part of the brain. So that's why we call it a brain.
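To make the machine-teaching step a bit more tangible, here is a rough, hypothetical sketch of the kind of specification being described: the state the brain observes, the actions it can take, and the desired behavior expressed as conditions rather than a hand-written reward formula. The field names, thresholds, and structure are all invented for illustration; this is not Bonsai's Inkling language or its actual schema.

```python
# Hypothetical "teaching" specification for an HVAC brain. Everything here
# (names, ranges, structure) is made up to illustrate the idea.

hvac_spec = {
    "state": ["indoor_temp_c", "outdoor_temp_c", "co2_ppm", "occupancy", "hour_of_day"],
    "actions": ["heat_up", "heat_down", "fan_on", "fan_off", "cool_on", "cool_off"],
    "goals": [
        # Comfort band during occupied hours.
        {"when": "9 <= hour_of_day <= 18", "maintain": "21 <= indoor_temp_c <= 23"},
        # Different targets depending on the outside temperature.
        {"when": "outdoor_temp_c > 30", "maintain": "indoor_temp_c <= 24"},
        {"when": "outdoor_temp_c < 5",  "maintain": "indoor_temp_c >= 20"},
        # Air quality constraint whenever the space is occupied.
        {"when": "occupancy > 0", "maintain": "co2_ppm <= 1000"},
        # Secondary objective, subject to the constraints above.
        {"minimize": "energy_kwh"},
    ],
}

print(f"{len(hvac_spec['goals'])} goals over {len(hvac_spec['state'])} state variables")
```

A spec along these lines is the sort of input from which the tool chain, rather than the domain expert, can derive the underlying reward machinery, a point that comes up again later in the conversation.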
Sam Charrington: [00:32:33] Okay. Going back to the HVAC example, you mentioned a data-driven simulation. So I'm imagining you coming to my company (I guess, since this is my scenario, I've got the data center), and I probably don't have a simulation that exists for my data center and its HVAC. That's immediately a big challenge if I need one to train a brain, but you've got a way to generate that just from the data I've collected.

Gurdeep Pall: [00:33:01] Yes. And this is something we are having to do a lot more of as we go out and talk to customers. Some have a simulator. Interestingly, simulators have long existed for designing, modeling, and testing, but typically there's been a human on one side of the simulator, driving it for whatever purpose they want: if it's a flight simulator, you're flying it. In our case, it's the AI being trained that is sitting on the other end of the simulator. In some cases we were able to take their existing simulators, change the use case, and still make it work, and in some cases that worked great. In other cases it didn't, because their simulator was designed for a really different purpose. Like if you do CFD, the purpose is to model this thing to high precision; this is going to be a plane flying through rain, so it has to be done very precisely. They typically have HPC setups for CFD simulation, but each crank can take so long that there's no way we could crank it fast enough to learn from it. So we said, well, that doesn't work. Or they just don't have a simulator at all, like in your case. So our next step is: can you give us data? For many folks, they have the data. If they have the data, then we look at how we can take that data and turn it into something we can use with our system. That worked for a certain class of problems. And then, as the complexity of the problems started increasing, we realized we needed a new trick up our sleeve. There's a research group that is part of my team, and we started looking at how we can apply deep learning to learn from this data to create simulators. We ran into the first insight, which is that deep learning is designed for one-shot inference, right? You run one crank, you get a prediction, and you're done. Well, it turns out the real world is not like that. The real world is modeled with differential equations: you've got time, and you've got this thing that continues to change its behavior over time, depending on the previous state and the actions being taken. So there's some great work being done right now, and we're publishing it (some of it is already out), on deep simulation networks. Basically it's a neural computational fabric; it's kind of like a recurrence, where with every crank you take the output and feed it back in for the next time cycle. Of course, the sampling of time can actually be variable, so that neural computational fabric has to deal with that, which is a pretty big thing in itself. But it also allows you to have many different components inside the simulation, each of which is learning in a different way.
For example, if you're tossing a ball, the ball has its physics, and then there's the environment, which has its physics, and it turns out the Newtonian physics doesn't change whether you're tossing a ball or a water bottle. So you can have some of these pre-trained components, if you will, and then maybe tweak them, because the object will have different physics. So with this neural computational fabric, which plays out in time, you're now able to have multiple components, and you train the whole thing. This new architecture, we believe, is a pretty transformative thing in simulation, because it now allows us to handle any complex simulation space, one that basically has lots of differential equations running around inside of it, and we can train it reasonably quickly. It's kind of like a graph neural network, because you have time and you have space, the components that actually make up the space. There's message passing happening between every stage, which allows the learning to happen, and backpropagation through each of the components, and eventually you get a trained model that runs like a simulator: you start at some state, you take an action, the state changes, and you're able to crank it forward. So we're really excited about it. We think this will be a big accelerant for the approach we have: again, we get the data, we can go at it. And these can also learn from other simulators. So if you have a simulator that is quite inefficient in terms of computation, this thing can learn from it and then execute very fast, because once it has learned the underlying differential equations, it's just doing inference; it's not doing any kind of big computation once it's trained. So that is an area we're really excited about right now.
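A rough sketch of the recurrent pattern being described, a neural step model that predicts the next state and feeds it back in with a possibly variable time step, might look like the following. This is a generic illustration under those assumptions, not the published deep-simulation-network architecture; all names and dimensions are made up.

```python
# Generic learned-simulator rollout: a network predicts the change in state
# over a time step, and its output becomes the next input.

import torch
import torch.nn as nn

class LearnedSimulatorStep(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + 1, hidden),  # +1 for the time step
            nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action, dt):
        delta = self.net(torch.cat([state, action, dt], dim=-1))
        return state + delta  # predicted next state

def rollout(step_model, state, actions, dts):
    """Unroll the learned simulator, feeding each prediction back in."""
    trajectory = [state]
    for action, dt in zip(actions, dts):
        state = step_model(state, action, dt)
        trajectory.append(state)
    return trajectory

# Made-up dimensions: 8 state variables, 2 control inputs, variable step sizes.
sim = LearnedSimulatorStep(state_dim=8, action_dim=2)
s0 = torch.zeros(8)
actions = [torch.rand(2) for _ in range(10)]
dts = [torch.tensor([0.1])] * 10
states = rollout(sim, s0, actions, dts)
```

Once a step model like this has been trained on logged data (or on the output of a slow, high-fidelity simulator), each crank is a single forward pass, which is what makes it fast enough to train a policy against.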
Sam Charrington: [00:38:09] Awesome. So the first step is to capture some data. The next step is to use that to train a simulator, potentially using this idea of deep simulation networks. Then you mentioned using that to create a brain, and, since you corrected me when I said it's a model, part of that, I'm imagining, is figuring out the right level of abstraction for these different components or pieces. One of the questions I had around that: when we talk about reinforcement learning in an academic sense and how difficult it is to put it to use in real-world situations, a lot of it has to do with carefully crafting this objective function or cost function and all of the issues associated with that. You described what the customer has to do as less about describing an objective function and more about constraining what the solution looks like. Am I reading that correctly? Maybe you can elaborate on that and help us understand.

Gurdeep Pall: [00:39:17] Absolutely. And you've hit the nail on the head: with reinforcement learning, the reward specification, the reward function, becomes the next problem. In fact, we have a very famous researcher at Microsoft Research, John Langford, who will tell you that if you have a problem and you've modeled it as a reinforcement learning problem, it really comes down to this one thing: getting the reward function right. And there are lots of funny stories about bad reward functions and unintended consequences, and we ran into that too. We still allow it in our tool chain, so you can specify the reward function, but in machine teaching we've been exploring what other ways there are for an expert to describe what they want done, and we've come to this concept of a goal. They specify the goal using a particular approach, the semantics of which are contained within the problem and the environment, and we automatically generate the reward function under the covers based on the goal. We've found this to be a much more approachable thing for our customers. In fact, in a lot of our new engagements with customers we end up using goals most of the time. So, like I said, we're on this learning journey ourselves: we're seeing what's working, what's not working, and how to enhance it, and we move on from there.

Sam Charrington: [00:40:45] And so some of these classical challenges with reward functions, like delayed attribution and things like that, that you see in reinforcement learning: does goals as an approach sidestep those in some ways, or are those still issues that you see in the autonomous systems world?

Gurdeep Pall: [00:41:06] Yeah, those are still issues we see. And separately, the algorithms are getting pretty good too; it's an active area of research, and better algorithms keep coming up. We stay on top of that and keep incorporating more and more algorithms into our tool chain, because some algorithms are better suited for certain classes of problems and others for other types, which of course moves the problem to the next layer: which one do you select for which kind of problem? And you obviously don't want folks who've never done programming or AI to be asked, "Do you want SAC, or do you want this other one?" They'd have no idea, right? So we're also trying to put in that intelligence. It's a meta-reasoning thing which says: given this kind of goal, this kind of problem, this sampling rate and state space, let's automatically select the best algorithm, and we will use that for training. Nobody ever has to know what craziness happened under the covers. But staying on top of this has been a really important piece for us. We use this framework, Ray and RLlib, which came out of Berkeley and is open source. We're one of the big users of it and contributors to it now; in fact, the Ray team, which is building that, and my team in Berkeley are literally in the same building, one floor apart, so there's a lot of good intermingling there as well. Because we're using that framework, as people add more and more algorithms we're able to really tap into that. And what we find, of course, is that sometimes people will write an algorithm to publish a paper, but it's not really production grade. So then we come back, do our own implementation of it, and contribute that.
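As a purely illustrative aside, here is one way a "maintain" goal like the HVAC comfort band sketched earlier might be compiled into a per-step reward under the covers. The shaping and constants are invented; this is not how Bonsai actually generates rewards from goals.

```python
# Toy compilation of a "maintain value between low and high" goal into a
# per-step reward. The linear decay outside the band is an arbitrary choice
# made for illustration.

def reward_from_maintain_goal(value, low, high):
    """Reward 1.0 inside the target band, decaying with distance outside it."""
    if low <= value <= high:
        return 1.0
    distance = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - 0.1 * distance)

# The expert states the goal; the tool chain supplies the reward math.
print(reward_from_maintain_goal(22.0, 21.0, 23.0))  # in band -> 1.0
print(reward_from_maintain_goal(26.0, 21.0, 23.0))  # 3 degrees over -> 0.7
```

The point of the goal abstraction is that the domain expert only ever writes the first kind of statement; the reward shaping, and the choice of training algorithm, stays inside the tool chain.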
Sam Charrington: [00:42:54] So, in this journey, we started with data, we built a simulation, and we built a brain out of that simulation. That brain is then able to help me control my data center HVAC. I'm imagining in this scenario that I still care about the safety issue you mentioned. Maybe it's not a drill that's going to destroy my data center, but I wouldn't want the policy you recommend to decrease the life of my coolers or chillers. And then there are also maybe explainability issues that arise. Like, why are you telling me to do this? My HVAC engineer has always set the XYZ at six, and you're saying it should be at eight. Why is that?

Gurdeep Pall: [00:43:40] Yeah, this is such a great topic, and I've talked to my team about it, given my experience at Microsoft. I remember when we were building Windows NT and putting networking into it and so on: we had no idea how stuff was going to be attacked when the internet was starting out. In fact, I was the development manager for the TCP/IP stack for Windows from '95 to 2000. I still managed to keep some of my sanity, but I can tell you there were folks on my team who were pushing twenty updates a week, because we were starting to get attacked at every layer, from the bottom of the network stack moving its way up, all the way up into sockets, the teardrop attacks and all of that. And when they got to the top layer, that's when the really sophisticated attacks started. I don't know if you remember, but after Windows XP shipped, the entire team took one year to harden the system, because it was no longer just my problem as the networking guy; it was everybody's problem. People would do buffer overruns and insert code and all of that, so literally every component had to deal with it. The reason I'm telling this story is that I think safety is a problem like that. When we came into it ("hey, we've got really good control and I can show you better performance"), there was all this hidden stuff that we had to deal with. That's been a big realization for us. It's a multifaceted approach. The first thing is, you talked about the wear and tear on the machine, or breaking it down: in a bunch of our use cases with customers right now, those are factored in, and they're actually factored in at the time of teaching. When you talk about the state space, that's something that has to be specified so the policy takes it into account, so that component gets handled. The hardest safety things are when the brain is operating: are we really at the mercy of a deep learning model that says "take this action," where the consequences of that are actually out of scope for what we're doing? This is going to be ongoing work; it's never done, kind of like cybersecurity right now. We're learning it's never going to be done, but we want to take some pretty concrete steps. One really important piece of work, and there's a new paper published on this, is that you develop a policy, the policy suggests an action, and then you introduce another layer after that to decide whether the action is a safe action or not. Now, what goes into deciding whether it's a safe action can be many things. It can be predicate logic, it can be temporal logic, so you can pretty much assert yes or no because the action is outside some range, or it can actually be trained models themselves; imagine adversarial models that go into that component.
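That post-policy check can be sketched in a few lines. The range check and fallback below are toy stand-ins of my own; in practice the predicate could be predicate or temporal logic, or a separately trained model, as described above.

```python
# Sketch of a post-policy safety layer: the policy proposes an action, and a
# separate check decides whether it is allowed before it reaches the equipment.

def safe_policy_step(policy, state, is_safe, fallback_action):
    proposed = policy(state)
    if is_safe(state, proposed):
        return proposed
    return fallback_action  # e.g., hold the current setting or shut down

# Toy example: never command a setpoint outside a hard-coded safe range.
def toy_policy(state):
    return state["setpoint"] + 5.0

def within_limits(state, action):
    return 10.0 <= action <= 30.0

action = safe_policy_step(toy_policy, {"setpoint": 28.0}, within_limits, fallback_action=28.0)
print(action)  # the proposed 33.0 is rejected; the fallback 28.0 is used instead
```

The key design point is that the check sits outside the learned policy, so it can be asserted and audited independently of whatever the neural network decides.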
So now, when you are specifying things in machine teaching right up front, you can start to insert ways in which safety can be specified, and that actually follows a very different path. Some of it follows the path of the policy building itself, because some things can be caught there, but other things are really brought to bear at operation time. And that is very important, because you've probably heard some of the discussions about how level-five autonomy is going to be rolled out in cities, and they're talking about dedicated bus lanes and things like that. I think it's a wonderful idea, because you're solving the other side of the equation, the part you can control. So imagine (I always talk about this example and my team just looks at me strangely) you have a robot arm, and it is working in a space where humans are also working. It's very common; you see this with machines in factories: they will have a red line, or a dotted red line, around them, and the humans know they're not going to cross it. Now you've created a rule which says, regardless of what action the policy tells you to take, if it is outside of that radius, whatever distance that is, you will not take that action. So you've created an environment in which humans and this robot arm swinging around can actually coexist in the same place. It's a very pragmatic approach, but it has to be part of your solution. Otherwise the engineers are right: these crazy people are showing up with reinforcement learning, and it's going to create all kinds of issues for us, safety issues and so on.

Sam Charrington: [00:48:33] Yeah, I love that analogy. And just taking it one step further: it would be a lot more difficult to build into your motion trajectories, for example, a way for this arm to avoid a human who stepped into the zone, than to build something that determines that a human has stepped into the zone and just shuts everything down. I think what I'm taking away from what you're saying is that safety is a multi-layered problem. It's not all about making the neural net responsible for everything; it's about identifying how you can enforce safety at these different levels, and thinking about it as a system, like an engineering person would.

Gurdeep Pall: [00:49:16] Exactly. I think that has been a big learning for us as well: it's not just "solve the hardest AI problem and suddenly everything works and they will come," right? You have to really think about it that way. And I think the safety layer, which evaluates every action after it's recommended, is where a lot of the new capabilities will come in in the future, the adversarial stuff. You can imagine a completely separate model which basically gives you a one or a zero: if any human has stepped over the red line, it gives you a one and everything shuts off. And you keep improving the perception and things like that. So yes, it is a system thing, as you say; that's the right way to think about it.

Sam Charrington: [00:50:03] Right, right. So maybe to help us wrap up: it's the very beginning of 2021, and autonomous systems is kind of a broad area. Where do you see things going over the next few years? How does this all evolve?

Gurdeep Pall: [00:50:18] Yeah.
You know, we believe we're entering the era of autonomous systems, and it's always hard to predict, right? There's that famous line: prediction is hard, especially about the future. But I remember working on Windows NT, on networking, on the internet. These things just explode, and the right elements have to be there for that explosion to happen. And I think with the breakthroughs in AI, with the focus on solving business problems in a complete way, like we talked about with safety, and with the industry coming along... We've been spending a lot of time on data-driven simulators, but we really want to partner with the simulation industry that's out there. We've got great partners like MathWorks, and we want to bring them along, so that together we can create an end-to-end tool chain in which these autonomous systems can be created without requiring the very high level of expertise that, for example, is going into a lot of autonomous driving today. The teams building those autonomous driving stacks are super experts, and they're building it all in a siloed, very vertical way. We want there to be horizontal components, and then you'll have vendors of autonomous systems where anybody can come in, describe their problem, create the brain, and deploy it. That's going to explode the number of autonomous systems out there. And I think this is great for many different things, including our climate, and including the resilience we've seen the need for during COVID, where logistics and production just have to continue. So I think now is the time, and I think it's going to happen.

Sam Charrington: [00:52:05] Awesome. Well, Gurdeep, thanks so much for taking the time to chat and for sharing a bit about what you're up to.

Gurdeep Pall: [00:52:13] Totally my pleasure. And you have a great podcast, so it's great to be here talking to you about my stuff.

Sam Charrington: [00:52:25] Awesome. Thank you. Thank you. Take care. All right, everyone, that's our show for today. To learn more about today's guest or the topics mentioned in this interview, visit twimlai.com. Of course, if you like what you hear on the podcast, please subscribe, rate, and review the show on your favorite pod catcher. Thanks so much for listening, and catch you next time.
Friday's TWIMLcon Executive Summit closed out a full first week at the conference! Speakers from BP, Walmart, Accenture, Qualcomm, Orangetheory Fitness, and more shared their experiences and insights on key issues faced by AI/ML leaders and teams.

The day began with a keynote interview featuring Franziska Bell, VP of Data and Analytics at BP. Fran had some very strong advice on what it takes to ensure ML project success. Her principles include creating mutual partnership between the business and the data team early on in the process; working hard to ensure that the data team is actually solving the business need; and emphasizing the importance of empathy, understanding, and common goals and language among the cross-disciplinary teams building data products.

The first panel of the day focused on Building the Business Case for ML Platforms and featured Divya Jain (Director of ML Platform, Adobe), Justin Norman (VP Data Science and Analytics at Yelp), and Kirk Borne (Principal Data Scientist and Executive Advisor, Booz Allen Hamilton). We discussed business value, measuring impact and ROI, build vs. buy, centralized vs. embedded teams, and standardization of infrastructure vs. flexibility. One attendee question prompted panelists to explore whether centralization was even a good thing. All panelists had strong opinions on this topic--not always in agreement--but Justin summarized it well with the following: "Businesses have many teams, those teams have requirements and those requirements should drive the platform choices. If it makes sense to centralize something... then do it. But if a team is doing something very unique with a different set of requirements than the other teams, they may need their own vertically integrated stack."

The next session brought together Adrian Cartier (VP of Data Science, Ocelot Consulting), Andy Minteer (Senior Director, Digital Transformation - Head AI Products, Walmart Global Tech), and Jurgen Weichenberger (Data Science Senior Principal & Global AI Lead, Resources, Accenture) to discuss Why ML Projects Fail and How to Ensure Their Success. Right off the bat, Andy challenged the idea of failure and had us rethink what failure even means. He asked: "What if the model is accurate but nobody adopts it? Isn't that also failure?" Jurgen, who has worked with many customers in many industries, cautioned that it's important to back up even further and to assess where the customer is on their maturity curve: some industries are further ahead than others, and that will drive a lot of what success and failure even mean to them. The panel closed with a discussion about the central role of people in the technology decisions leaders make. Jurgen offered: "It is our obligation to bring our customers on the journey with us. We need to be in the mindset that we are enabling people to do their jobs. You need to take the whole company on a journey with you... Bring them along, build trust and confidence, and show them how this can make their lives easier."

The fourth session of the day centered around what is required when Building Teams and Cultures that Support ML Innovation. For this discussion, we invited Ameen Kazerouni (Chief Analytics Officer, Orangetheory Fitness), Pardis Noorzad (Head of Data Science, Carbon Health), and Ziad Asghar (VP of AI at Qualcomm) to share their thoughts. The conversation included topics such as: what are the factors in building high-performance teams; how do we measure team success; and what is the role of culture in building teams.
Sufficient budgets, common language, and shared rituals were all mentioned as key elements enabling effective teams. The impact of the pandemic on teams, namely the accelerated shift to remote work, was discussed as well. Ziad left us with this amusing thought on the topic: "If 2020 had a t-shirt, it would read: 'Hey we can't hear you, you're on mute,'" illustrating how fundamental some of the challenges we face are.

The final session of the day, the Executive Summit Roundtable Discussion, was a particularly animated and rich discussion. Ameen Kazerouni, Hussein Mehanna (VP, Head of ML/AI, Cruise), and Paul van der Boor (Senior Director of Data Science, Prosus Group) each shared their experiences on a topic relevant to leading ML teams, and then Sam facilitated a great discussion with the attendees. A few of the many compelling ideas that came out of this session include: Ameen's suggestion that "The currency of an analytics team is trust, not data." Paul's insights from the experience of one of the teams at Prosus, which has developed dashboards to granularly track the business impact of every ML model they deliver, and the need to understand whether a project's key contribution is operational (improving what you already do) or innovation (doing new things). And Hussein's definition of "AI Native Products" as those that must leverage AI even at the MVP stage, plus his mind-blowing hypothesis (presented first at TWIMLcon!) that in order to create AI-Native Products, organizations need to organize internally like a neural network.

We had a fun and engaging discussion after those three wrapped up, and I think the summary was that we in the ML community have been spoiled with an explosion of new tools and techniques, and that at some point there will likely be a "great reckoning" where the toolchain will converge and MLOps will become more standardized. As one attendee, Gavin Bell, put it: "We used to have serial ports, parallel ports, printer ports, display ports, headphone jacks... and now it's all USB-C. What is this round of technology going to leave behind?"

Big thanks to Adrian, Ameen, Andy, Divya, Franziska, Jurgen, Justin, Kirk, Pardis, Paul, Ziad, Hussein, and all of the Executive Summit attendees for a fun and stimulating day of discussion on these very important topics. And special thanks to Qualcomm, our Executive Summit Platinum Sponsor.

If you missed the session today, it's not too late to register for TWIMLcon! There are still four more days of sessions next week. "Executive" tickets offer on-demand access to all of the Executive Summit sessions you missed, as well as the entirety of TWIMLcon. All tickets offer on-demand access to all regular conference sessions through the end of January, and "Pro Plus" and "Executive" tickets let you watch replay sessions whenever you like. You can check out the TWIMLcon agenda here and the speakers here. See you next week!
Day 3 of TWIMLcon 2021: AI Platforms was all about how to get models from development to production reliably and consistently by using modern MLOps tools and platforms. (The conference started Tuesday and runs through Friday, January 29th. It's not too late to join in! Use discount code GREATCONTENT for 25% off registration.)

David Hershey, Solutions Architect at Tecton, led a three-hour workshop on how to deploy a fraud detection model leveraging their Feature Store product. Why even use a Feature Store? In David's words: "Feature stores enable faster development cycles, lower time-to-production, lower operational costs, and easier adoption of ML across your teams." He started off with raw data, did some feature engineering, created a training dataset, trained the model, and created endpoints for both the model and the features for the model to use during inference. Finally, he demonstrated integrating it all into production. (A toy sketch of this overall flow appears at the end of this post.)

We had some fun with the networking today by playing a variety of high-speed games. We can't tell you which games, in case you try to prepare for the next TWIMLcon!

Finally, Kristopher Overholt, a Solution Engineer from Algorithmia, walked through the high-level issues facing any organization struggling with getting ML into production and finding the ROI of the project. Specifically, he called out some pretty common challenges: "Getting stuck in the lab, disconnected teams, technology mismatch, lack of stakeholder buy-in, hidden technical debt, all leading to an inefficient machine learning lifecycle."

We'd like to thank David and Kristopher for sharing their time and expertise with everybody, both in their workshop presentations and in the Q&A.

The conference continues tomorrow, Day 4 of 8, with the Executive Summit. We will be hosting a line-up of great speakers from BP, Yelp, Booz Allen, Adobe, Accenture, Walmart, Ocelot Consulting, Carbon Health, Orangetheory Fitness, Prosus Group, and Cruise. Topics for the day will include: Building the Business Case for ML Platforms; Why AI Projects Fail and How to Ensure Success; and Building Teams and Culture to Support ML Innovation.

If this sounds interesting, it's not too late to register! There are still five more days of sessions, including the Executive Summit described above. Pro Plus and Executive passes provide ongoing access to the conference recordings so that you can catch up after the event. You can check out the agenda here and the speakers here.
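For readers who missed the workshop, the overall shape of that flow (raw data to engineered features to a trained model to inference) looks roughly like the toy sketch below, written with plain pandas and scikit-learn. The columns and model are invented, and this is not Tecton's API; a real feature store adds the shared feature definitions, point-in-time-correct training data, and low-latency online serving that this toy version leaves out.

```python
# Toy end-to-end sketch: raw transactions -> features -> model -> inference.

import pandas as pd
from sklearn.linear_model import LogisticRegression

raw = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "amount": [20.0, 250.0, 15.0, 9000.0, 42.0],
    "is_fraud": [0, 0, 0, 1, 0],
})

# Feature engineering: per-user aggregates joined back onto each transaction.
features = raw.assign(
    user_avg_amount=raw.groupby("user_id")["amount"].transform("mean"),
    amount_over_avg=lambda df: df["amount"] / df["user_avg_amount"],
)

X = features[["amount", "user_avg_amount", "amount_over_avg"]]
y = features["is_fraud"]
model = LogisticRegression().fit(X, y)

# "Online" inference: in production these feature values would be fetched from
# the feature store's serving layer rather than recomputed ad hoc.
print(model.predict_proba(X.iloc[[1]])[0, 1])
```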
Day 2 (of 8!) of TWIMLcon: AI Platforms 2021 was a day of sharing hard-earned lessons. (The conference started yesterday and runs through January 29, 2021. It's not too late to join in! Use discount code GREATCONTENT for 25% off registration.)

We kicked off the day interviewing Faisal Siddiqi, Director of Engineering for Personalization Infrastructure at Netflix. Faisal has been at Netflix for a little over six years, and he shared a ton of great lessons learned by him and his team while building out their internal ML platforms. This was a very dense discussion, full of hard lessons and good advice. Some key take-aways that stood out:

Get clear on your internal users, your customers, and what they need, and then build systems that empower them to do their work with the tools they want to use.

Be both opinionated AND flexible. Use prescriptive approaches and technologies lower in the stack, where you need to maintain control, and provide more flexibility at higher levels, where people need room to innovate. Overall, the discussion on structure vs. flexibility was worth the price of admission, as it ties into the usability of the platforms we're all building.

Understand that components in your tech stack and MLOps platform are probably going to be mixed and matched between the four possibilities: build it (DIY), borrow it (from elsewhere in the company), use an open-source element, or use a commercial solution. He commented that Netflix and his team used all four options.

There was so much more in this conversation and it was such a great start to the day. I highly recommend going back and catching the replay of this episode.

Next up, we heard from Todd Underwood, an Engineering Director at Google. Todd walked us through how models fail and how to prevent it. He probably made a lot of people feel both better and worse by starting off saying that model quality is a common production problem and that it's both an operational (systems) problem and also a human trust problem. Basically, if models fail (and they will) and you don't know why they fail, people are less likely to trust them. From there, he walked through his lessons learned on how to think about failure as a gift, how to look past the obvious sources of failure to the more esoteric and boring causes, and how to learn from failures as an organization over time. He then walked through classes of failures and their frequent causes and illustrated the principles he had laid out with a particular story. While he had many interesting quotes in the presentation, I'll put one of my favorites here: "Understand YOUR system, and your system's failures. It is worth doing. It pays dividends in better models, more resilience... if you don't monitor model quality yet, start. If you don't write and track post mortems yet, start. When you have an outage, make sure you learn everything you can from it. You've already paid for it, so you should get the value out of it." - Todd Underwood, Engineering Director • Google

Overall, his presentation was a call to action to embrace failure and accept it as a part of building complex systems generally, and AI/ML systems in particular. This talk is worth sharing with your whole team and taking action on.

After that tough-love talk, we got to hear from Ariel Biller, an Evangelist at ClearML, and his customer Dotan Asselman, Co-Founder and CTO of theator. The talk continued the Build vs. Buy debate that Faisal touched on in the morning keynote.
Spoiler alert: both of them agreed on the core insight of this talk: "It's not build vs. buy - it's build AND buy and that golden ratio is use-case specific. 'Buy' also means open-source - remember that it may be 'free' but it has associated support costs." - Ariel Biller, Evangelist • ClearML

What was really great about this talk was that they outlined the end-to-end ML platform system at theator and walked through which components were BUILT and which were BOUGHT. More importantly, they explained the thinking behind those decisions. I won't get into the details here, as you can check out the replay until the end of the conference (or after the conference for Pro Plus Pass or Executive Summit pass holders). I encourage you to check out the full presentation.

As we got into the thick of the day, Chip Huyen, the author of the excellent MLOps Tooling Landscape, let the audience know that ML is going real-time and that they're probably not prepared for it. (What a day of tough love around here!) Chip's core message was that organizations need to move beyond thinking about real-time vs. batch and instead consider "online learning." "Online learning is crucial for systems to adapt to rare events... Because Black Friday happens only once a year, there's no way Amazon or other ecommerce sites can get enough historical data to learn how users are going to behave that day, so their systems need to continually learn on that day to adapt." She then discussed how the two-pipeline architecture that many systems have (a batch-based pipeline for training and a streaming data pipeline for inference) is a common source of production failure, and that teams should be looking at ways to unify those into a common pipeline that does both. Overall, she made a compelling case for rethinking the status quo of system architectures and considering whether online learning should be a goal for your system design. As with the others above, we can't really do it justice here: check out the replay!

To continue the theme of systems and their components, Monte Zweben, CEO of Splice Machine, shared his thoughts on feature stores: what they are, what they do, and how they're traditionally deployed in a three-database architecture alongside scale-out operational databases and scale-out analytical data platforms. From there, he explained how Splice Machine has unified the three functions in one open-source system, to help customers deliver features much faster and simplify the lifecycle of ML models. He made an argument for a database-centric approach to MLOps, and I'd encourage any of you wrestling with the complexities of feature management and delivery to go chat with Monte and his co-presenter Jack Ploshnick here at the conference this week to learn more.

The second-to-last session of the day was a fun panel discussion with a bunch of the Spotify ML team, who shared their thoughts on how to drive platform adoption within their broader company. A key takeaway from this discussion was that "if you build it, they will come" is not enough at a certain level of scale. Spotify created a new "engagement manager" role on its platform team to address this, with a focus on evangelizing the platform to Spotify teams and helping them be successful. There were lots of lessons in this chat for anybody building and evangelizing an internal ML platform.

Finally, we closed out the day with a workshop presented by John Posada, a Partner Solutions Architect at Dataiku.
Echoing what Ariel Biller discussed earlier in the day, John discussed how technical debt builds up, for example as regulatory frameworks change, and how, if you're not agile enough, your AI systems can fall afoul of the regulatory environment, causing customer and business harm. This is the stuff that keeps your risk management team up at night (and probably your CEO as well). He suggested that the key is to use modular platforms that let you evolve the elements in the system without changing the whole system, adding a layer of governance and building in guardrails to ensure fair and responsible use of AI. Dataiku's answer to all of these requirements is their Data Science Software (DSS) platform, and John presented a thorough walkthrough of how it can be used to create end-to-end MLOps pipelines.

Tomorrow, we will be changing things up by shifting the focus to two major workshops, plus a networking session: David Hershey, a Solutions Architect for Tecton AI, will walk through an entire case study in how to deploy a fraud detection model with their Feature Store; there will be more networking (with a twist!); and Kristopher Overholt, a Solution Engineer from Algorithmia, will demonstrate how to move models from training into production. Friday, our Executive Summit sessions will be happening, and then the regular mix of technical sessions will pick up again on Tuesday, January 26th.

If this sounds interesting, it's not too late to register! There are still six more days of sessions, including Friday's Executive Summit. Pro Plus and Executive passes provide ongoing access to the conference recordings so that you can catch up after the event. You can check out the agenda here and the speakers here.

Thanks to all of today's speakers, Faisal Siddiqi, Todd Underwood, Dotan Asselman, Ariel Biller, Chip Huyen, Monte Zweben, Maya Hristakeva, Lex Beattie, Maisha Lopa, Samuel Ngahane, and John Posada, for their time and contributions to a great day of learning.
We had a solid kick-off today at TWIMLcon 2021: AI Platforms. The conference started today and runs through January 29, 2021. It's not too late to join in! Use discount code GREATCONTENT for 25% off registration.

We started off the day talking to Solmaz Shahalizadeh, VP of Commerce Intelligence at Shopify. During her time there, she implemented the company's first ML products, built their financial data warehouse, led multiple cross-functional teams, and played a critical role in their IPO. In our discussion today, she said something that set the tone for the rest of the conference: "If you're serious about your data, you want to invest in your platforms." We could not have said it better ourselves. She also shared lessons learned from building a team of hundreds of data scientists, for example, paying attention to how well each of the team members can articulate the real-world impact or how the model will solve a specific business problem.

Next up, Aman Khan (Product Manager) and Josh Baer (ML Platform Product Lead) talked about how Spotify built its ML platform to provide service to over 300 million customers. They shared a few of the key tenets that now guide their approach to delivering ML infrastructure. Build infrastructure together: having your infrastructure teams and your ML teams collaborate to build a common platform serves the organization best. Be opinionated: having more tools is not better; fewer tools leads to less custom code, less technical debt, and less confusion for the development team. Make difficult trade-offs: they focused hard on building a platform that served their ML engineers first and foremost, with the idea that once they nailed that, they could extend it to other roles in the organization.

We then shifted gears (pun completely intended) and talked with Sudeep Pillai, ML engineering team lead at Toyota Research Institute. Sudeep shared an overview of the MLOps environment developed at TRI and discussed some of the key ways MLOps techniques must be adapted to meet the needs of high-stakes environments like robotics and autonomous vehicles. He noted that early autonomous driving systems were strongly rule-based and rigid, but that there has been a major shift away from rules-based systems; in his words, "ML is eating the Autonomous Driving Stack." He further shared how ML has moved into the perception, prediction, planning, and control aspects of autonomous vehicle design. MLOps is sometimes thought of as "DevOps for Machine Learning," but it was clear from Sudeep's presentation that it needs to be more than that. MLOps at TRI is a complete set of processes specifically adapted not only for ML but also for the AD domain. It feels like the MLOps conversation is truly evolving and maturing when you see conversations like this one.

After speaking with Sudeep, we chatted with Mike Del Balso, CEO and Co-Founder of Tecton. He walked us through the issues of feature development, management, and deployment. It seems hard to believe, but he noted that just getting a few features into production can delay a project by months or even a year because of the hand-off between the data science team and the data engineering teams. He made an observation which is important to highlight here: "Feature stores are some of the highest value data we have in our organizations and we don't manage them as such." Mike went on to share some customer success stories (like Atlassian reducing model deployment times from months to days while increasing accuracy by up to 20%).
Overall, it was a great discussion, and I think we're all going to be hearing a lot more about feature stores in the year ahead.

As we rolled towards the end of Day 1, we had the great opportunity to hear from Dr. Jennifer Prendki, the founder and CEO of Alectio. Before founding Alectio, Jennifer was the VP of Machine Learning at Figure Eight, built the first ML department from scratch at Atlassian, and pioneered MLOps at Walmart Labs. Dr. Prendki and her team are challenging the long-held belief that more data is a prerequisite to increasing the performance of an ML model. In order to break down her thesis that more data is not necessarily better, she unpacked what she refers to as "Data Prep Ops." After unpacking Data Prep Ops in great detail, more than we can cover here, she summarized with a few major points: good data preparation is a prerequisite for doing ML well; there is a Data Prep Ops market that is misunderstood, and we as community members need to make it a first-class citizen in our MLOps practices; data preparation is more than labeling, it is a multi-faceted set of complex operational processes that are effectively their own discipline; and data prep cannot be separated from the machine learning process, as the two are deeply related. It was a great discussion, with lots of food for thought for practitioners out there wrestling with the "more data = better predictions" status quo.

In keeping with the TWIMLconnect theme at this year's event, attendees had an opportunity to participate in a networking activity towards the end of the day. Attendees were randomly grouped into small breakout rooms for four lightning getting-to-know-you rounds. With smiles all the way around and folks complaining that the rounds were too short, it was clear everyone had a great time.

We wrapped up the day with Jeff Fletcher, a Cloud Machine Learning Specialist from Cloudera, who walked everybody through a workshop exploring how ML can be done on the Cloudera Data Platform, including data preparation, pipelines, and production deployment. Jeff was clearly in his element and happy to show off the power of their platform.

Tomorrow, we have a full schedule with: a keynote interview with Faisal Siddiqi, Director of Engineering at Netflix; Todd Underwood, an Engineering Director from Google, discussing what happens when "Good Models Go Bad"; Dotan Asselman, Co-Founder/CTO of theator, and Ariel Biller, Evangelist for ClearML, talking about continuous training; Chip Huyen (who wrote multiple amazing surveys of the MLOps market) talking about the move to real-time ML; Monte Zweben, CEO of Splice Machine, discussing how you can scale models by moving beyond traditional database architectures and combining operational, analytical, and feature store databases onto a common platform; and Jeff Fletcher from Cloudera closing out the day with a continued look into the power of the Cloudera Data Platform.

If this sounds interesting, it's not too late to register! There are still seven more days of sessions, including Friday's Executive Summit. Pro Plus and Executive passes provide ongoing access to the conference recordings so that you can catch up after the event.
We have a TON of practical presentations coming your way from companies like Google, Spotify, and Intuit - and we're so excited about these keynotes we just announced!

Keynote Interview with Solmaz Shahalizadeh of Shopify. Solmaz is the Vice President of Data Science & Engineering at Shopify. During her time at the company, she implemented and scaled the company's first ML products, built their financial data warehouse, led multiple cross-functional teams, and played a critical role in their IPO. Sam had a great podcast interview with Solmaz on using ML to fight fraud, and she'll be back at TWIMLcon to discuss how she's helped the company scale its use of machine learning and how that has helped power the company's growth.

Keynote Interview with Faisal Siddiqi of Netflix. Faisal is the Director of Engineering for Personalization Infrastructure at Netflix, where he runs multiple teams delivering large-scale ML infrastructure to support the company's personalization research and systems. The teams he runs currently support model development, model tools and data, and model serving. At TWIMLcon, Faisal and Sam will discuss lessons learned building all this personalization infrastructure.

Keynote Interview with Franziska Bell of BP. We're excited to have Franziska Bell back at TWIMLcon, this time for a keynote interview at the Executive Summit! Fran is now a Distinguished Advisor for Data Science at energy giant BP. We'll be discussing key themes on the minds of ML leaders, like why ML/AI projects fail, building the business case for ML projects and platform investments, and managing the organizational and cultural changes required for success with ML and AI.

We'll be announcing more agenda details soon, so keep an eye out!
Today we're joined by Richard Socher, Chief Scientist and Executive VP at Salesforce. Richard, who has been at the forefront of Salesforce's AI research since the company acquired his startup MetaMind in 2016, and his team have been publishing a ton of great projects as of late, including CTRL: A Conditional Transformer Language Model for Controllable Generation, and ProGen, an AI protein generator, both of which we cover in depth in this conversation. We explore the balancing act at a large product-focused company like Salesforce between research investments driven by product requirements and those that aren't, the evolution of his language modeling research since the acquisition, and how it ties in with protein generation.
Sam Charrington: Hey, what's up everyone? This is Sam. A quick reminder that we've got a bunch of newly formed or forming study groups, including groups focused on Kaggle competitions and the fast.ai NLP and Deep Learning for Coders part one courses. It's not too late to join us, which you can do by visiting twimlai.com/community. Also, this week I'm at re:Invent and next week I'll be at NeurIPS. If you're at either event, please reach out. I'd love to connect. All right. This week on the podcast, I'm excited to share a series of shows recorded in Orlando during the Microsoft Ignite conference. Before we jump in, I'd like to thank Microsoft for their support of the show and their sponsorship of this series. Thanks to decades of breakthrough research and technology, Microsoft is making AI real for businesses with Azure AI, a set of services that span vision, speech, language processing, custom machine learning, and more. Millions of developers and data scientists around the world are using Azure AI to build innovative applications and machine learning models for their organizations, including 85% of the Fortune 100. Microsoft customers like Spotify, Lexmark, and Airbus choose Azure AI because of its proven enterprise-grade capabilities and innovations, wide range of developer tools and services, and trusted approach. Stay tuned to learn how Microsoft is enabling developers, data scientists, and MLOps and DevOps professionals across all skill levels to increase productivity, operationalize models at scale, and innovate faster and more responsibly with Azure Machine Learning. Learn more at aka.ms/azureml. All right, onto the show.

Sam Charrington: [00:01:52] All right everyone, I am here in Sunny Orlando. Actually, it's not all that sunny today, it's kind of gray and rainy, but it is still Sunny Orlando, right? How could it not be? I'm at Microsoft Ignite, and I've got the wonderful pleasure of being seated with Sarah Bird. Sarah is a principal program manager for the Azure Machine Learning platform. Sarah, welcome to the TWIML AI Podcast.

Sarah Bird: [00:02:15] Thank you, I'm excited to be here.

Sam Charrington: [00:02:17] Absolutely. I am really excited about this conversation we're about to have on responsible AI. But before we do that, I'd love to hear a little bit more about your background. You've got a very enviable position, kind of at the nexus of research and product and tech strategy. How did you create that?

Sarah Bird: [00:02:37] Well, I started my career in research. I did my PhD in machine learning systems at Berkeley, and I loved creating the basic technology, but then I wanted to take it to the next step, and I wanted to have people who really used it. And I found that when you take research into production, there's a lot more innovation that happens. So since graduating, I have styled my career around living at that intersection of research and product, and taking some of the great cutting-edge ideas and figuring out how we can get them into the hands of people as soon as possible. And so my role now is specifically focused on trying to do this for Azure Machine Learning, and responsible AI is one of the great new areas where there's a ton of innovation in research, and people need it right now. And so we're working to try to make that possible.

Sam Charrington: [00:03:33] Oh, that's fantastic. And so between your grad work at Berkeley and Microsoft, what was the path?
Sarah Bird: [00:03:42] So I was in John Langford's group in Microsoft Research and was working on a system for contextual bandits, trying to make it easier for people to use those in practice, because a lot of the time when people were trying to deploy that type of algorithm, the system infrastructure would get in the way. You wouldn't be able to get the features to the point of decision, or the logging would not work and it would break the algorithm. And so we designed a system that made it correct by construction, so it was easy for people to go and plug it in, and this has actually turned into the Personalizer cognitive service now. But through that experience, I learned a lot about actually working with customers and doing this in production, and so I decided that I wanted to have more of that in my career. And so I spent a year as a technical advisor, which is a great role in Microsoft where you work for an executive, advise them, and help work on special projects. It enables you to see both the business and the strategy side of things as well as all the operational things, how you run orgs, and then of course the technical things. And I realized that I think that mix is very interesting. And so after that I joined Facebook, and my role was at the intersection of FAIR, Facebook AI Research, and AML, which was the applied machine learning group, with this role of specifically trying to take research into production and accelerate the rate of innovation. So I started the ONNX project as a part of that, enabling us to solve a tooling gap where it was difficult to get models from one framework to another. And then I also worked on PyTorch, enabling us to make that more production-ready. And since then I've been working in AI ethics.

Sam Charrington: [00:05:34] Yeah. If we weren't going to be focused on AI ethics and responsible AI today, we would be going deep into Personalizer, what was Microsoft Decision Service, and this whole contextual bandits thing. Really interesting topic, not least because we talk a lot about reinforcement learning and whether it's useful, and while it's not this deep reinforcement learning game-playing thing, it's reinforcement learning and people are getting a lot of use out of it in a lot of different contexts.

Sarah Bird: [00:06:05] Yeah. When it works, right? It doesn't work in all cases, but when it works, it works really well. It's the kind of thing where you get the numbers back and you're like, can this be true? And so I think it's a really exciting technology going forward, and there are a lot of cases where people are using it successfully now, but I think there will be a lot more in the future.
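As a brief aside before the conversation turns to responsible AI: the contextual-bandit pattern Sarah describes, with features available at the point of decision and logging that preserves what the learner needs, can be sketched in a few lines. The epsilon-greedy toy below is purely illustrative; it is not the Personalizer service or the system built in Langford's group, and all names are invented.

```python
# Minimal epsilon-greedy contextual bandit loop with decision logging.
import random

ACTIONS = ["layout_a", "layout_b", "layout_c"]
value_estimates = {a: 0.0 for a in ACTIONS}
counts = {a: 0 for a in ACTIONS}
decision_log = []
epsilon = 0.1

def choose(context):
    """Pick an action for this context and record its selection probability."""
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
        prob = epsilon / len(ACTIONS)
    else:
        action = max(ACTIONS, key=value_estimates.get)
        prob = 1 - epsilon + epsilon / len(ACTIONS)
    return action, prob

def update(context, action, prob, reward):
    # Log everything needed for correct training and off-policy evaluation later.
    decision_log.append({"context": context, "action": action, "prob": prob, "reward": reward})
    counts[action] += 1
    value_estimates[action] += (reward - value_estimates[action]) / counts[action]

# One interaction: observe the context, act, then observe a reward (e.g., a click).
ctx = {"device": "mobile", "hour": 14}
action, prob = choose(ctx)
update(ctx, action, prob, reward=1.0)
```

The "correct by construction" point in the conversation is essentially about making the logging and feature plumbing in a loop like this impossible to get wrong, since a missing probability or a feature computed after the decision silently breaks the learner.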
But what I do think we have now is a lot more tools and techniques and best practices to help people start the journey of doing things responsibly. And so I think the reality is there are many things people could be doing right now that they're not. And so I, I feel like there's an urgency date to get some of these tools into people's hands so that we can do that. So I `think we can quickly go a lot farther than we have right now. Sam Charrington: [00:07:41] In my conversations with folks that are working on this and thinking about the role that responsible AI plays and the way they "do AI," do machine learning. A lot of people get stopped at the very beginning like: Who should own this? Where does it live? Is it a research kind of function or is it a product function, or is it more of a compliancy thing for a chief data officer or a chief security officer? [Is it] one of those executive functions and oversight, or compliance is the better word? What do you see folks doing and do you have any thoughts on successful patterns of where it should live? Sarah Bird: [00:08:33] Yeah, I think the models that we've been using and are thinking a lot about... the transition  to security, for example. And I think the reality is it's not one person's job or one function. Everybody now has to think about security, even your basic software developers have to know and think about it when they're designing. However, there are people who are experts in it and handle the really challenging problems. There is of course legal and compliance pieces in there as well. And so I think we're seeing the same thing where we really need every role to come together and do this. And so one of the patterns we are seeing is part of the challenge with responsible AI and technology is that we've designed technology to abstract away things and enable you to just focus on your little problem, and this has led to a ton of innovation. However, the whole idea of responsible AI is actually, you need to pick your head up, you need to have this larger context, you need to think about the application in the real world, you need to think about the implications. And so we have to break a little bit of our patterns of 'my problem is just this little box,' and so we're finding that user research and design, for example, is already trained and equipped to think about the people element in that. And so it's really great to bring them into more conversations as we're developing the technology. So that's one pattern that we're finding adds a lot  of value. Sam Charrington: [00:10:07] In my conversation with with Jordan Edwards, your colleague, many of his answers were all of the above. And it sounds like this one is an "all of the above" response as well. Sarah Bird: [00:10:19] Yeah. I think doing machine learning in practice takes a lot of different roles, as Jordan was talking about, in operationalizing things, and then responsible AI just adds an extra layer of more roles on top of that. Sam Charrington: [00:10:32] Yeah. I guess one of the challenges that kind of naturally evolves when everyone has to be thinking about something is that it's a lot  harder, right? The developer is trained as a developer and now they have to start thinking about this security thing, and it's changing so quickly and the best practices are evolving all the time, and it's hard to stay on top of that. If we're to replicate that same kind of model in responsible AI, what sounds like the right thing to do? 
How do we support the people that are on the ground trying to do this? Sarah Bird: [00:11:07] Yeah. And I think it's definitely a challenge, because the end result can't be that every individual person has to know the state of the art in every area of responsible AI. And so one of the ways that we're trying to do this is, as much as possible, build it into our processes and our tooling. So that you can say, okay, well you should have a fairness metric for your model, and you can talk to experts about what that fairness metric should be, but you should know the requirement that you should have a fairness metric, for example. And so we first are starting with that process layer, and then in Azure Machine Learning, we've built tools that enable you to easily enact that process. And so the foundational piece is the MLOps story that Jordan was talking about, where we actually enable you to have a process that's reproducible, that's repeatable. So you can say, before this model goes into production, I know that it's passed these validation tests and I know that a human looked at it and said it looks good. And if it's out in production and there's an error or there's some sort of issue that arises, you can go back, you can recreate that model, you can debug the error. And so that's the real foundational piece for all of it. And then on top of that, we're trying to give data scientists more tools to analyze the models themselves. And there's no magic button here. It's not just, oh, we can run a test and we can tell you everything you want to know. But there are lots of great algorithms out there and research that help you better understand your model. SHAP or LIME are common interpretability ones, for example. And so we've created a toolkit called Interpret ML; this is an open source toolkit, you can use it anywhere. But it enables you to easily use a variety of these algorithms to explain your model behavior and explore it and see if there are any issues. And so we've also built that into our machine learning process so that if I build a model, I can easily generate explanations for that model. And when I've deployed it in production, I can also deploy an explainer with it, so individual predictions can be explained while it's running, so I can understand if I think it's doing the right thing and if I want to trust it, for example. Sam Charrington: [00:13:35] It strikes me that there's a bit of a catch-22 here, in the sense that the only way we could possibly do this is by putting tools in the hands of the folks doing the work, the data scientists and machine learning engineers that are working on these problems. But the tools by their very nature kind of abstract them away from the problem and allow them, if not encourage them, to think less deeply about what's going on underneath. Right? How do we address that? Do you agree with that, first of all? Sarah Bird: [00:14:09] No, I completely agree with that, and it's a challenge that we have in all of these cases where we want to give the tool to help them and to have more insight, but it's easy for people to just use it as a shortcut. And so in a lot of cases, we're being very thoughtful about the design of the tool and making sure that it is helping you surface insights. But it's not saying this is the answer, because I think when you start doing that, where you have something that flags and says this is a problem, then people really start relying on that. And maybe someday we will have the techniques where we have that level of confidence and we can do it.
But right now we really don't, and so I think a lot of it is making sure that we design the tools in a way that encourages this mindset of exploration and deeper understanding of your models and what's going on. And not just, oh, this is just another compliance test I have to pass; I just run this test and it says green, and I go. Sam Charrington: [00:15:12] You alluded to this earlier in the conversation, but it seems appropriate here as well, and it's maybe a bit of a tangent, but so much of pulling all these pieces together is kind of a user experience and design question. Any thoughts on that? Is that something that you've kind of dug into and studied a lot? Or do other folks worry about that here? Sarah Bird: [00:15:36] It's not in my background, but to me it's an essential part of actually making these technologies usable. And particularly when you take something that's as complex as an algorithm and you're trying to make that abstracted and usable for people, the design is a huge part of the story. And so what we're finding in responsible AI is that we need to think about this even more. And a lot of the guidelines are saying be more thoughtful and include sort of more careful design. For example, people are very tempted to say, well, this is the data I have, so this is the model I can build, and so I'm going to put it in my application that way. And then if it has too much inaccuracy, then you spend a lot of resources to try and make the model more accurate, where you could have just had a more elegant UI design, for example, where you actually get better feedback based on the UI design or the design can tolerate more errors and you don't need that higher model accuracy. So we're really encouraging people to co-design the application and the model, and not just take it for granted that this is what the model does and that's the thing we're gonna focus on. Sam Charrington: [00:16:53] With the Interpret ML tool, what's the user experience like? Sarah Bird: [00:17:01] It depends on what you're trying to do; there are two types of interpretability that people think about. One is what we call Glass-Box models. And the idea there is, I want my model to be inherently interpretable. So I'm gonna pick something like a linear model or decision trees where I can actually inspect the model, and that enables you to build a model that you can actually understand. And so we support a bunch of different Glass-Box explainable models, so you can actually use them to train your own model. And the other part is Black-Box explainers, where I have a model that is a black box and I can't actually inspect it, but I can use these different algorithms to explain the behavior of the model. And so in that case what we've done is made it easy for you to just call an explainer and ask for global explanations, and ask for local explanations, and ask for feature importance. And then all of those are brought together in an interactive dashboard where you can actually explore the explanations and try to understand the model behavior. So a lot of the experience is an SDK, and so it's all easy calls to ask for explanations, but then we expect a lot of people to spend their time in that dashboard exploring and understanding.
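For readers who want to see the shape of what she's describing, here is a minimal sketch of the two styles on synthetic data, using the open-source interpret package for a glass-box model and SHAP for a black-box explanation. The dataset, model choices, and parameters are illustrative only, not the Azure Machine Learning integration she mentions.

```python
# pip install interpret shap scikit-learn
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show
import shap

X, y = make_classification(n_samples=500, n_features=6, random_state=0)

# Glass-box: an inherently interpretable model you can inspect directly.
ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)
show(ebm.explain_global())             # per-feature effects in an interactive dashboard
show(ebm.explain_local(X[:5], y[:5]))  # explanations for individual predictions

# Black-box: explain an opaque model after the fact with SHAP attributions.
opaque = GradientBoostingClassifier().fit(X, y)
explainer = shap.Explainer(opaque.predict, X)  # model-agnostic explainer over a background set
local_explanations = explainer(X[:20])         # local feature attributions for 20 predictions
print(local_explanations.values.shape)         # (20 rows, 6 features)
```

The glass-box path trains a model that is itself inspectable; the black-box path leaves the original model alone and only explains its behavior, which mirrors the two modes she describes.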
Sam Charrington: [00:18:32] I did a really interesting interview with Cynthia Rudin who you may know she's a Duke professor and the interview was focused on her research that essentially says that we should not be using black box models in, I forget the terminology that she used, but something like mission critical scenarios or something along those lines where we're talking about someone's life or Liberty that kind of thing. Does providing interpretability tools that work with black box models, like encourage their use in scenarios that they shouldn't really be used in? And are there ways that you advise folks when and when not they should be using those types of models? Sarah Bird: [00:19:19] So we have people who do publish best practices for interpretability and  it's a very active area of work for the company. And we work with the partnership on AI to try to make industry-wide recommendations for that. I don't think it's completely decided on this idea that models should be interpretable in these settings versus, well, we want other mechanisms to make sure that they're doing the right thing. Interpretability is one way that we could be sure that they're doing the right thing, but we also could have more robust testing regimes. Right? There's a lot of technologies where we don't understand every detail of the technology, but we've been able to build safety critical systems on top of it, for example. And so yeah as a company we do try to provide guidance, but I don't think the industry has really decided the final word on this. And so the mindset of the toolkit is enabling you to use these techniques if it's right for you. But that doesn't specifically say that you should go use a neural net in a particular setting. Sam Charrington: [00:20:27] So in addition to the Interpret ML toolkit you also announced this week here from Ignite, a Fair Learn toolkit. What's that all about? Sarah Bird: [00:20:39] So it's the same spirit as Interpret ML where we want to bring together a collection of fairness techniques that have been published in research and make it easy for people to use them all in one toolkit with the same spirit that you want to be able to analyze your model and understand how it's working so that you could make decisions around fairness. And so there's famously, many different fairness metrics published. I think there was a paper cataloging 21 different fairness metrics. And so we've built many of these common ones into the toolkit and then it makes it easy for you to compare how well your model works for different groups of people in your data set. So for example, I could say does this model have the same accuracy for men and women? Does this model have the same outcomes for men and women? And so we have an interactive dashboard that allows you to explore these differences between groups and your model performance through a variety of these metrics that have been published in research. Then we've also built in several mitigation techniques so that if you want to do mitigation via post-processing and your model, then you can do that. For example, setting thresholds per group. And in a lot of cases it might be that you actually want to go and fix the underlying data or you wanting to make some different decisions. So the mitigation techniques aren't always what you would want to do, but they're available if you want to do that. 
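As a concrete sketch of the per-group comparison she describes, here is roughly what an assessment might look like with the open-source Fairlearn package. The toy data, column names, and model are made up for illustration; this is not the dashboard workflow itself.

```python
# pip install fairlearn scikit-learn pandas numpy
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate

rng = np.random.default_rng(0)
n = 400
sex = pd.Series(rng.choice(["F", "M"], size=n), name="sex")
X = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
# Synthetic labels with a built-in skew so the group comparison has something to show.
y = ((X["x1"] + 0.8 * (sex == "M") + rng.normal(scale=0.5, size=n)) > 0).astype(int)

model = LogisticRegression().fit(X, y)
pred = model.predict(X)

# Compare accuracy and selection rate (share of positive predictions) by group.
mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y,
    y_pred=pred,
    sensitive_features=sex,
)
print(mf.by_group)      # one row of metrics per group
print(mf.difference())  # largest gap between groups for each metric
```

The by-group table answers exactly the questions she poses: does this model have the same accuracy, and produce the same outcomes, for men and women?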
And so the name of the toolkit actually comes from one of these mitigation techniques from Microsoft Research, where the algorithm was originally called Fair Learn. And the idea is that you say, I wanna reduce the difference between two groups on a particular dimension. So you pick the metric and you pick the groups, and the algorithm actually retrains your model by re-weighting data and iteratively retraining to try to reduce that disparity. So we've built that into the toolkit. So now you can actually look at a variety of versions of your model and see if one of them has properties that work better for what you're looking for, to deploy. Sam Charrington: [00:22:59] Again, I'm curious about the user experience in doing this. How much knob turning and tuning does the user need to do when applying that technique you were describing? Or is it more, I'm envisioning something like contextual bandits reinforcement learning, where it's kind of tuning the knobs for you. Sarah Bird: [00:23:18] Yeah, it is doing the knobs and the retraining, but what you have to pick is which metric you're trying to minimize. Do I want to reduce the disparity between the outcomes, or do I want to reduce the disparity in accuracy, or some other metric; there are many different metrics you could pick, but you have to know the metric that's right for your problem. And then you also need to select the groups that you want to do. So it can work in a single dimension, like, as we were saying, making men and women more equal, but then it would be a totally separate thing to do it for age, for example. So you have to pick both the sensitive attribute for which you are trying to reduce disparity and the metric for disparity. Sam Charrington: [00:24:10] Were you saying that you're able to do multiple metrics in parallel, or are you doing them serially? Sarah Bird: [00:24:17] Right now the techniques work for just one metric. So it will produce a series of models, and if you look at the graph, you can actually plot disparity by accuracy and you'll have models that are on that Pareto optimal curve to look at. But then if you said, okay, well now I want to look at that same chart for age, the models might be all over the place in the space of disparity and accuracy. So it's not a perfect technique, but there are some settings where it's quite useful. Sam Charrington: [00:24:48] So going back to this idea of abstraction and tools versus deeply understanding the problem domain and how to think about it in the context of your problem domain. I guess the challenge domain or your problem domain, I don't know what the right terms are. But you mentioned that paper with all of the different disparity metrics and the like. Is that the best way for folks to get up to speed on this, or are there other resources that you've come across that are useful? Sarah Bird: [00:25:23] Yeah, I think for fairness in particular it's better to start with your application domain and understand, for example, if you're working in an employment setting, how do we think about fairness and what are the cases, and so in that case we actually recommend that you talk to domain experts, even your legal department, to understand what fairness means in that setting. And then you can go to the academic literature and start saying, okay, well, which metrics line up with that higher level concept of fairness for my setting.
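Here is a similar sketch of the retrain-and-compare mitigation she describes, assuming Fairlearn's GridSearch reduction with a demographic parity constraint. The data is synthetic and the grid size is arbitrary; it only illustrates the "series of models" idea.

```python
# pip install fairlearn scikit-learn pandas numpy
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from fairlearn.reductions import GridSearch, DemographicParity
from fairlearn.metrics import demographic_parity_difference

rng = np.random.default_rng(1)
n = 400
sex = pd.Series(rng.choice(["F", "M"], size=n), name="sex")
X = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
y = ((X["x1"] + 0.8 * (sex == "M") + rng.normal(scale=0.5, size=n)) > 0).astype(int)

# Pick the disparity constraint (the metric you are trying to minimize) and the
# sensitive attribute, then retrain the model across a grid of sample reweightings.
sweep = GridSearch(LogisticRegression(), constraints=DemographicParity(), grid_size=15)
sweep.fit(X, y, sensitive_features=sex)

# Each retrained model lands somewhere on the disparity versus accuracy trade-off.
for i, clf in enumerate(sweep.predictors_):
    p = clf.predict(X)
    acc = accuracy_score(y, p)
    gap = demographic_parity_difference(y, p, sensitive_features=sex)
    print(f"model {i:2d}  accuracy={acc:.3f}  disparity={gap:.3f}")
```

From there you would pick whichever version of the model sits at the trade-off that works for your setting, which is the family of models on the curve she mentions.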
But if you start with the metrics, I think it can be very overwhelming; there are just many different metrics, and a lot of them are quite different and in other ways very similar to each other. And so I find it much easier to start with the domain expertise and know what you're trying to achieve in fairness, and then start finding the metrics that line up with that. Sam Charrington: [00:26:22] You're also starting to do some work in the differential privacy domain. Tell me a little bit about that. Sarah Bird: [00:26:27] Yeah, we announced a couple of weeks ago that we are building an open source privacy platform with Harvard, and differential privacy is a really fascinating technology. It was first published in Microsoft Research in 2006 and it was a very interesting idea, but it has taken a while for it, as an idea, to mature and develop and actually be able to be used in practice. However, now we're seeing several different companies who are using it in production. But in every case the deployment was a very bespoke deployment with experts involved. And so we're trying to make a platform that makes it much easier for people to use these techniques without having to understand them as much. And so the idea is the open source platform can go on top of a data store and enable you to do queries in a differentially private way, which means that it actually adds noise to the results so that you can't reconstruct the underlying data, and also then potentially use the same techniques to build simple machine learning models. And so we think this is particularly important for some of our really societally valuable datasets. For example, there are data sets where people would like to do medical research, but because we're worried about the privacy of individuals, there are limits to what they can actually do. And if we use a differentially private interface on that, we have a lot more privacy guarantees, and so we can unlock a new type of innovation and research in understanding our data. So I think we're really excited and think this could be the future of privacy in certain applications, but the tooling just isn't there, and so we're working on trying to make it easier for people to do that. We're building it in the open source because it's important that people can actually ... It's very easy to get the implementation of these algorithms wrong, and so we want the community and the privacy experts to be able to inspect and test the implementations and have the confidence that it's there. And also we think this is such an important problem for the community. We would like anybody who wants to, to be joining in and working on this. This is not something that we can solve on our own. Sam Charrington: [00:28:58] Yeah, differential privacy in general and differentially private machine learning are fascinating topics and ones that we've covered fairly extensively on the podcast. We did a series on differential privacy a couple of years ago, maybe, and it's continuing to be an interesting topic. The Census Bureau, I think, is using differential privacy for the first time next year, and it's both providing the anticipated benefits but also raising some interesting concerns about increased opacity, from the researchers' perspective, of the data that they wanna get access to. Are you familiar with that challenge? Sarah Bird: [00:29:41] Yeah, absolutely. So the reality is people always want the most accurate data, right? It doesn't sound great to say, well, we're adding noise and the data is less accurate.
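To make the noise-adding idea concrete, here is a toy Laplace-mechanism sketch for a single counting query. This is only an illustration of the principle, not the open source platform discussed above, and the dataset and epsilon values are invented.

```python
import numpy as np

rng = np.random.default_rng(42)

def dp_count(values, predicate, epsilon):
    """Epsilon-differentially private count for a single query.

    A counting query has sensitivity 1: adding or removing one person changes
    the true count by at most 1, so Laplace noise with scale 1/epsilon suffices.
    """
    true_count = sum(1 for v in values if predicate(v))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical medical records: ages of patients in a sensitive dataset.
ages = [34, 45, 52, 61, 29, 47, 55, 70, 38, 44]
for eps in (0.1, 1.0, 10.0):
    answer = dp_count(ages, lambda a: a >= 50, epsilon=eps)
    print(f"epsilon={eps:>4}: noisy count of patients aged 50+ = {answer:.1f}")
# Smaller epsilon means more noise: stronger privacy, less accurate answers.
```

The trade-off she goes on to describe is visible directly in the output: the smaller the epsilon, the stronger the privacy guarantee and the noisier the answer.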
But, in a lot of cases it is accurate enough for the tasks that you want to accomplish. And I think we have to recognize that, privacy is one of the sort of, fundamental values that we want to uphold, and so in some cases it's worth the cost. For the census in particular, to motivate the decision to start using this for the 2020 census they did a study where they took the reports from the 1940 census and they were able to recreate something like 40% of Americans' data with the result of just the outputs from the census. Sam Charrington: [00:30:33] Meaning personally identify 40% of Americans? Sarah Bird: [00:30:37] Yeah, he talks about this in his ICML keynote from last year. So if you want to learn more you can watch the keynote. But yeah, basically they took all the reports and they used some of these privacy attacks and they could basically recreate a bunch of the underlying data. And this is a real risk, and so we have to recognize that yes, the census results are incredibly important and they help us make many different decisions, but also protecting people's data is important. And so some of it is education and changing our thinking and some of it is making sure that we use the techniques in the right way in that domain where you're not losing what you were trying to achieve in the first place, but you are adding these privacy benefits. Sam Charrington: [00:31:21] There are a couple of different ways that people have been applying differential privacy one is a, a more centralized way where you're applying it to a data store. It sounds a little bit like that's where your focus is. Others like Apple's a noted use case where they're applying differential privacy in a distributed manner at the handset to keep user data on the iPhone, but still provide information centrally for analysis. Am I correct that your focus is on the centralized use case? Or does the toolkit also support the distributed use case? Sarah Bird: [00:32:02] We are focusing on the global model. The local model works really well, and particularly in some of these user telemetry settings, but it limits what you can do. You need much larger volume to actually get the accuracy for a lot of the queries that you need, and there aren't as many queries that you can do. And so the global model, on the other hand, there's a lot more that you can do and still have reasonable privacy guarantees. And so as I was saying, we were motivated by these cases where we have the data sets. Like somebody is trusted to have the data sets but we can't really use them. And so that looks like a global setting. And so to start, we're focused on, on the global piece, but there are many cases where the local is promising and there are cases where we are doing that in our products. And so it's certainly a direction that things could go. Sam Charrington: [00:32:58] And differential privacy from a data perspective doesn't necessarily get you to differentially private machine learning. Are you doing anything in particular on the differentially private ML side of things? Sarah Bird: [00:33:11] The plan is to do that but the project is pretty new so we haven't built it yet. Sam Charrington: [00:33:19] And before we wrap up, you're involved in a bunch of industry and research initiatives in the space that you've mentioned, MLSys, a bunch of other things. Can you talk a little bit about some of the broader things that you're doing? Sarah Bird: [00:33:38] Yeah, so I helped found the, now I think named MLSys systems and machine learning research conference. 
And that was specifically because I've been working at this intersection for a while and there were some dark days where it was very hard to publish work because the machine learning community was like, this is a systems result. And the systems community was like, this doesn't seem like a systems result and so we started the conference about two years ago and apparently many other people were feeling the same pain because even from the first conference, we got excellent work. People's top work, which is always a challenge with research conferences because people don't want to submit their best work to an unnamed conference. Right? But there was such a gap for the community. So it's been really exciting  to see that community form more  and now have a home where they can put their work and connect.  I've also been running the machine learning systems workshops at NeurIPS for several years now. And that's been a really fun place because it really has helped us form the community, particularly before we started the conference. But it's also a place where you can explore new ideas. This last year we're starting to see a lot more innovation at the intersection of programming languages and machine learning. And so in the workshop format we can have several of those talks highlighted, and have a dialogue, and show some of the emerging trends so that's been a really fun thing to be involved in. Sam Charrington: [00:35:13] Awesome. Yeah, was it last year that there was both the SysML workshop and the ML for systems workshop and it got really confusing? Sarah Bird: [00:35:24] Yeah. This year too. We have both. And I think that's a sign that the field is growing that it used to be that it felt like we didn't even have enough people for one room at the Intersection of Machine Learning and Systems. And I think this last year there was maybe four or 500 people in our workshop alone. And so that's great. Now, there's definitely room to have workshops on more focused topics. Right? And so I think machine learning for systems is an area that people are really excited about now that we have more depth in understanding the intersection. For me, it's very funny because that is really kind of the flavor of my thesis which was a  while ago. And so it's a fun to see it now starting to become an area that people are excited about. Sam Charrington: [00:36:16] The other conference that we didn't talk about, ML for Systems is all about using machine learning within computational systems, networking systems as a way to optimize them. So for example, ML to do database query optimization. Also a super interesting topic. Sarah Bird: [00:36:36] Yeah, I know it absolutely is. And I really believe in that, and I think for several years people were just trying to replace kind of all of the systems intelligent with one machine learning algorithm and it was not working very well. And I think what we're seeing now is recognizing that a lot of the algorithms that we used to control systems were designed for that way and  they work, actually, pretty well. But on the other hand, there's something that's dynamic about the world or the workload. And so you do want this prediction capability built in. And so a lot of the work now has a more intelligent way of plugging the algorithms into the system. And so now we're starting to see promising results at this intersection. So my thesis work was a resource allocation that built models in real time in the operating system and allocated resources. 
And it was exactly this piece where there was a modeling and a prediction piece, but the final resource allocation algorithm was not purely machine learning. Sam Charrington: [00:37:43] Awesome. Wonderful conversation, looking forward to catching up with you at NeurIPS, hopefully. Thanks so much for taking the time to chat with us. Sarah Bird: [00:37:52] Yes, thanks for having me. And I look forward to seeing you at NeurIPS. Sam Charrington: [00:37:56] Thank you.
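Before moving on to the next conversation, here is a toy sketch of the pattern Bird closes on: a learned prediction component feeding a conventional, non-ML resource allocation rule. Every name and number here is invented purely for illustration; it is not her thesis system.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tasks, total_budget, min_share = 4, 100.0, 5.0

# Observed per-task resource demand over the last 50 intervals (synthetic).
history = rng.poisson(lam=[5, 20, 10, 40], size=(50, n_tasks)).astype(float)

# The "modeling and prediction piece": an exponentially weighted moving average
# forecast of next-interval demand for each task.
weights = 0.8 ** np.arange(len(history))[::-1]
forecast = (weights[:, None] * history).sum(axis=0) / weights.sum()

# The final allocation rule is not machine learning: a deterministic
# proportional-share policy with a guaranteed minimum per task.
remaining = total_budget - min_share * n_tasks
allocation = min_share + remaining * forecast / forecast.sum()
for task, share in enumerate(allocation):
    print(f"task {task}: forecast={forecast[task]:5.1f}  allocation={share:5.1f}")
```

The learned piece only forecasts; the allocation decision itself stays a simple, auditable rule.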
Sam Charrington: Today we're excited to present the final episode in our AI for the Benefit of Society series, in which we're joined by Mira Lane, Partner Director for Ethics and Society at Microsoft. Mira and I focus our conversation on the role of culture and human-centered design in AI. We discuss how Mira defines human-centered design, its connections to culture and responsible innovation, and how these ideas can be scalably implemented across a large engineering organization. Before diving in, I'd like to thank Microsoft once again for their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges spanning sustainability, accessibility, and humanitarian action. Learn more about their plan at Microsoft.ai. Enjoy. Mira Lane: [00:00:09] Thank you, Sam. Nice to meet you. Sam Charrington: [00:00:11] Great to meet you and I'm excited to dive into this conversation with you. I saw that you are a video artist and technologist by background. How did you come to, you're looking away, is that correct? Mira Lane: [00:00:28] No, that's absolutely true. Sam Charrington: [00:00:30] Okay. So I noted that you're a video artist. How did you come to work at the intersection of ethics and society and AI?
So me taking the role for the design manager was contingent on us creating a spot for ethics at the same time and so backing up a little bit, the video part comes in because I have traditionally been a really analog artist. I've been a printmaker, a painter, and during my sabbatical, I sought some more digitized, looked at digitizing some of the techniques that I was playing with on the analog side. I thought let me go play in the video space for a while. So for three months, like I said, I retooled and I started playing around with different ways of recording, editing, and teaching myself some of these techniques and one of the goals I set out at the time was well, can I get into a festival? Can I get into a music or video festival? So that was one of my goals at the end of the three months. Can I produce something interesting enough to get admitted into a festival? And I won a few, actually. Sam Charrington: [00:03:46] That's fantastic. Mira Lane: [00:03:46] So I was super pleased. I'm like okay, well that means I've got something there I need to continue practicing. But that for me opened up a whole new door and one of the things that I did a few years ago also was to explore art with AI, and could we create a little AI system that could mimic my artwork and become a little co-collaborator with myself? So we can dig into that if you want, but it was a really interesting journey around can AI actually compliment an artist or even replace an artist? So there's interesting learnings that came out of that experience. Sam Charrington: [00:04:25] Okay. Interesting, interesting. We're accumulating a nice list of things to touch on here. Mira Lane: [00:04:30] Yeah, absolutely. Sam Charrington: [00:04:31] Ethics and your views on that was at the top of my list, but before we got started, you mentioned work that you've been doing exploring culture and the intersection between culture and AI and I'm curious what that means for you. It's certainly a topic that I hear brought up quite a bit. Particularly when I'm talking to folks in enterprises that are trying to adopt AI technologies and you hear all the time well one of the biggest things we struggle with is culture. So maybe, I don't know if that's the right place to start, but maybe we'll start there. What does that mean for you when you think about culture in AI? Mira Lane: [00:05:12] Yeah, no, that's a really good question, and I agree that one of the biggest things is culture and the reason why I say that is if you look at every computer scientist that's graduating, none of us have taken an ethics class and you look at the impact of our work, it is touching the fabric of our society. Like it's touching our democracies and our freedoms, our civil liberties, and those are powerful tools that we're building, yet none of us have gone through a formal ethics course and so the discipline is not used to talking about this. It's a few years ago you're like I'm just building a tool. I'm building an app. I'm building a platform that people are using, and we weren't super introspective about that. It wasn't part of the culture, and so when I think about culture in the AI space, because we're building technologies that have scale and power, and are building on top of large amounts of data that empower people to do pretty impressive things, this whole question of culture and asking ourselves, well what could go wrong? How could this be used? Who is going to use it directly or indirectly? 
And those are parts of the culture of technology that I don't think has been formalized. You usually hear designers talking about that kind of thing. It's part of human-centered design. But even in the human-centered design space, it's really about what is my ideal user or my ideal customer and not thinking about how could we exploit this technology in a way that we hadn't really intended? We've talked about that from an engineering context the way we do threat modeling. How could a system be attacked? How do you think about denial of service attacks? Things like that. But we don't talk about it from a how could you use this to harm communities? How could you use this to harm individuals or how could this be inadvertently harmful? So those parts of cultures are things that we're grappling with right now and we're introducing into our engineering context. So my group sits at an engineering level and we're trying to introduce this new framework around responsible innovation and there's five big components to that. One is being able to anticipate, look ahead, anticipate different futures, look around corners and try to see where the technology might go. How someone could take it, insert it into larger systems, how you could do things at scale that are powerful that you may not intend to do. There's a whole component around this responsible innovation that is around reflection and looking at yourselves and saying where do we have biases? Where are we assuming things? What are our motivations? Can we have an honest conversation about our motivations? Why are we doing this and can we ask those questions? How do we create the space for that? We've been talking about diversity and inclusion like how do you bring diverse voices into the space, especially people that would really object to what you're doing and how do you celebrate that versus tolerate that? There's a big component around our principles and values and how do you create with intention and how do you ensure that they align with the principles and they align with their values and they're still trustworthy? So there's a whole framework around how we're thinking about innovation in the space and at the end of the day it comes down to the culture of the organization that you're building because if you can't operate at scale, then you end up only having small pockets of us that are talking about this versus how do we get every engineer to ask what's this going to be used for? And who's going to use it? Or what if this could happen? And we need people to start asking those types of questions and then start talking about how do we architect things in a way that's responsible. But I'd say most engineers probably don't ask those types of questions right now. So we're trying to build that into the culture of how we design and develop new technologies. Sam Charrington: [00:09:14] Mm-hmm (affirmative). One of the things that I often find frustrating about this conversation particularly when talking to technology vendors is this kind of default answer while we just make the guns, we don't shoot them. We just make the technologies. They can be used for good. They can also be used for bad, but we're focused on the good aspects. It sounds like, well, I'm curious, how do you articulate your responsibility with the tools that you're creating? Or Microsoft's responsibility with the tools it's creating. Do you have a- Mira Lane: [00:09:55] Well I have a very similar reaction to you when I hear oh, we're just making tools. I think, well, fine. 
That's one perspective, but the responsible perspective is we're making tools and we understand that they can be used in these ways and we've architected them so that they cannot be misused and we know that there will be people that misuse them. So I think you're hearing a lot of this in the technology space and every year there's more and more of it where people are saying look, we have to be responsible. We have to be accountable. So I think we'll hear fewer and fewer people saying what you're hearing, what I'm hearing as well. But one of the things we have to do is we have to avoid the ideal path and just talking only about the ideal path. Because it's really easy to just say here's the great ways that this technology is going to be used and not even talk about the other side because then, again, we fall into that pattern of well, we only thought about it from this one perspective, and so one of the things that my group is trying to do is to make it okay to talk about here's how it could go wrong so that it becomes part of our daily habit and we do it at various levels. We do it at our all hands, so when people are showing our technology, we have them show the dark side of it at the same time so that we can talk about that in an open space and it becomes okay to talk about it. No one wants to share the bad side of technology. No one wants to do that. But if we make it okay to talk about it, then we can start talking about well, how do we prevent that? So we do that at larger forums and I know this is a podcast, but I wanted to show you something. So I'll talk about it, but we created, it's almost like a game, but it's a way for us to look at different stakeholders and perspectives of what could happen. So how do we create a safe environment where you can look at one of our ethical principles. You can look at a stakeholder that is interacting with the system and then you say well if the stakeholder for example is a woman in a car and your system is a voice recognition system, what would she say if she gave it a one star review? She would probably say I had to yell a lot and it didn't recognize me because we know that most of our systems are not tuned to be diverse, right? So we start creating this environment for us to talk about these types of things so that it becomes okay again. How do we create safe spaces? Then as we develop our scenarios, how do we bring those up and track them and say, well how do we fix it now that we've excavated these issues? Well, let's fix it and let's talk about it. So that's, again, part of culture. How do we make it okay to bring up the bad parts of things, right? So it's not just the ideal path. Sam Charrington: [00:12:46] Mm-hmm (affirmative). Do you run into, or run up against engineers or executives that say, introspection, safe spaces, granola? What about the bottom line? What does this mean for us as a business? How do we think about this from a shareholder perspective? Mira Lane: [00:13:09] It's interesting, I don't actually hear a lot of that pushback because I think internally at Microsoft, there is this recognition of hey, we want to be really thoughtful and intentional and I think the bigger issue that we hear is how do we do it? It's not that we don't want to. It's well, how do we do it and how do we do it at scale? So what are the different things you can put in place to help people bring this into their practice? 
And so there isn't a pushback around well, this is going to affect my bottom line, but there's more of an understanding that yeah, if we build things that are thoughtfully designed and intentional and ethical that it's better for our customers. Our customers want that too, but then again the question is how do we do it and where is it manifest? So there's things that we're doing in that space. When you look at AI, a big part of it is data. So how do you look at the data that's being used to power some of these systems and say is this a diverse data set? Is this well rounded? Do we have gaps here? What's the bias in here? So we start looking at certain components of our systems and helping to architect it in a way that's better. I think all of our customers would want a system that recognized all voices, right? Because again, to them, they wouldn't want a system that just worked for men, it didn't work for women. So again, it's a better product as a result. So if we can couch it in terms of better product, then I think it makes sense versus if it's all about us philosophizing and only doing that, I don't know if that's the best. Only doing that is not productive, right? Sam Charrington: [00:14:59] Do you find that the uncertainty around ethical issues related to AI has been an impediment to customers adopting it? Does that get in the way? Do they need these issues to be figured out before they dive in? Mira Lane: [00:15:22] I don't think it's getting in the way, but I think what I'm hearing from customers is help us think about these issues and a lot of people, a lot of customers don't understand AI deeply, right? It's a complex space and a lot of people are ramping up in it. So the question is more about what should I be aware of? What are the questions that I should be asking and how can we do this together? We know you guys are thinking about this deeply. We're getting just involved in it, a customer might say, and so it's more about how do we educate each other? And for us if we want to understand, how do you want to use this? Because sometimes we don't always know the use case for the customer so we want to deeply understand that to make sure that what we're building actually works for what they are trying to do, and from their perspective they want to understand well how does this technology work and where will it fail and where will it not work for my customers? So the question of ethics is more about we don't understand the space well enough, help us understand it and we are concerned about what it could do and can we work together on that? So it's not preventing them from adopting it, but there's definitely a lot of dialog. It comes up quite a bit around we've heard this. We've heard bias is an issue. Well, what does that mean? Sam Charrington: [00:16:47] Right. Mira Lane: [00:16:47] So I think that's an education opportunity. Sam Charrington: [00:16:49] When you think about ethics from a technology innovation perspective, are there examples of things that you've seen either that Microsoft is doing or out in the broader world that strike you as innovative approaches to this problem? Mira Lane: [00:17:12] Yeah, I'll go back to the data side of things just briefly, but there's this concept called data sheets, which I think is super interesting. You're probably really familiar with that and- Sam Charrington: [00:17:25] I've written about some of the work that Timnit Gebru and some others with Microsoft have done around data sheets for data sets. 
Mira Lane: [00:17:31] Exactly, and the interesting part for us is how do you put it into the platform? How do you bake that in? So one of the pieces of work that we're doing is we're taking this notion of data sheets and we are applying it to how we are collecting data and how we're building out our platform. So I think that that's, I don't know if it's super novel, because to me it's like a nutrition label for your data. You want to understand: how was it collected? What's in it? How can you use it? But I think that that's one where, as people leave the group, you want to make sure that there's some history and understanding of the composition of it. There's some regulation around how we manage it internally and how we manage data in a thoughtful way. I think that's just a really interesting concept that we should be talking about more as an industry, and then can we share data between each other in a way that's responsible as well? Sam Charrington: [00:18:24] Right. I don't know that the data sheet, I think inherent to the idea was that hey, this isn't novel. In fact, look at electrical components and all these other industries that do this. It's just "common sense". But what is a little novel, I think, is actually doing it. So since that paper was published, several companies have published similar takes, model cards, and there have been a handful, and every time I hear about them I ask okay, when is this? When are you going to be publishing these for your services and the data sets that you're publishing? And no one's done it yet. So it's intriguing to hear you say that you're at least starting to think in this way internally. Do you have a sense for what the path is to publishing these kinds of things, whether it's a data sheet or a card or some set of parameters around bias, either in a data set or a model, for a commercial public service? Mira Lane: [00:19:41] Yeah, absolutely. We're actually looking at doing this for facial recognition and we've publicly commented about that; we've said, hey, we're going to be sharing for our services what it's great for, what it's not, and so that stuff is actually actively being worked on right now. You'll probably see more of this in the next few weeks, but there is public comment that's going to come out with more details about it, and I'll say that on the data sheet side, I think a large portion of it is it needs to get implemented in the engineering systems first and you need to find the right place to put it. So that's the stuff that we're working on actively right now. Sam Charrington: [00:20:25] Can you comment more on that? It does, as you say that, it does strike me a little bit as one of these iceberg kind of problems. It looks very manageable kind of above the waterline, but if you think about what goes into the creation of a data set or a model, there's a lot of complexity, and certainly at the scale that Microsoft is working, it needs to be automated. What are some of the challenges that have come into play in trying to implement an idea like that? Mira Lane: [00:21:01] Well, let me think about this for a second so I can frame it the right way. The biggest challenge for us on something like that is really thinking through the data collection effort first and spending a little bit of time there. That's where we're actually spending quite a bit of time as we look at, so let me back up for a second.
I work in an engineering group that touches all the speech, language, and vision technologies, and we do an enormous amount of data collection to power those technologies. One of the things that we're first spending time on is looking at exactly how we're collecting data and going through those methodologies and saying, is this the right way that we should be doing this? Do we want to change it in any way? Do we want to optimize it? Then we want to go and apply that back in. So you're right, this is a big iceberg, because there are so many pieces connected to it, and the specs for data sheets, the ones we've seen, are large. And so what we've done is ask, how do we grab the core pieces of this and implement and create the starting point for it? And then scale over time, adding versioning and being able to add your own custom schema to it, but what is the minimum piece that we can put into this system and then make sure that it's working the way we want it to? So it's about decomposing the problem and saying which ones do we want to prioritize first. For us, we're spending a lot of time just looking at the data collection methodologies first, because there's so much of that going on, and at the same time, what is the minimum part of the data sheet spec that we want to go and put in, and then let's start iterating together on that. Sam Charrington: [00:22:41] It strikes me that these will be most useful when there's kind of broad industry adoption, or at least coalescence around some standard, whether it's a standard minimum that everyone's doing and potentially growing over time. Are you involved in or aware of any efforts to create something like that? Mira Lane: [00:23:02] Well I think that that's one piece where it's important. I would say also in a large corporation, it's important internally as well, because we work with so many different teams and we're interfacing with, we're a platform but we interface with large parts of our organization, and being able to share that information internally, that is a really important piece to the puzzle as well. I think the external part as well, but the internal one is not any less important in my eyes, because that's where we are. We want to make sure that if we have a set of data that this group A is using in one way, and group B wants to use it, we want to make sure they have the rights to use it. They understand what it's composed of, what its orientation is, and so that if they pick it up, they do it with full knowledge of what's in it. So for us internally it's a really big deal. Externally, I've heard pockets of this, but I don't think I can really comment on that yet with full authority. Sam Charrington: [00:24:03] I'm really curious about the intersection between ethics and design, and you mentioned human-centered design earlier. My sense is that that phrase kind of captures a lot of that intersection. Can you elaborate on what that means for you? Mira Lane: [00:24:20] Yeah, yeah. So when you look at traditional design functions, when we talk about human-centered design, there's lots of different human-centered design frameworks. The one I typically pick up is Don Norman's emotional design framework, where he talks about behavioral design, reflective design, and visceral design. And so behavioral is how is something functioning? What is the functionality of it? Reflective is how does it make you feel about yourself? How does it play to your ego and your personality? And visceral is the look and feel of that.
That's a very individual oriented approach to design and when I think about these large systems, you actually need to bring in the ecosystem into that. So how does this object you're creating or this system you're creating, how does it fit into the ecosystem? So one of the things we've been playing around with is we've actually reached into adjacent areas like agriculture and explore how do you do sustainable agriculture? What are some of those principles and methodologies and how do you apply that into our space? So a lot of the conversations we're having is around ecosystems and how do you insert something into the ecosystem and what happens to it? What is the ripple effect of that? And then how do you do that in a way that keeps that whole thing sustainable? It's a good solution versus one that's bad and causes other downstream effects. So I think that those are changes that we have to have in our design methodology. We're looking away from the one artifact and thinking about it from a here's how the one user's going to work with it versus how is the society going to interact with it? How are different communities going to interact with it and what does it do to that community? It's a larger problem and so there's this shift in design thinking that we're trying to do with our designers. So they're not just doing UI. They're not just thinking about this one system. They're thinking about it holistically. And there isn't a framework that we can easily pick up, so we have to kind of construct one as we're going along. Sam Charrington: [00:26:28] Yeah, for a while a couple of years ago maybe I was in search of that framework and I think the motivation was just really early experiences of seeing AI shoved into products in ways that were frustrating or annoying. For example, a Nest thermostat. It's intended to be very simple, but it's making these decisions for you in a way that you can't really control and it's starting me down this path of what does it mean to really, build out a discipline of design that is aware of AI and intelligence? I've joked on the podcast before, I call it intelligent design, but that's an overloaded term. Mira Lane: [00:27:23] Totally is. Sam Charrington: [00:27:24] But is there a term for that now or people thinking about that? How far have we come in building out a discipline or a way of thinking of what it means to build intelligence into products? Mira Lane: [00:27:37] Yeah, we have done a lot of work around education for our designers because we found a big gap between what our engineers were doing and talking about and what our designers had awareness over. So we actually created a deep learning for designers workshop. It was a two day workshop and it was really intensive. So we took neural nets, convolutions, all these concepts and introduced them to designers in a way that designers would understand it. We brought it to here's how you think about it in terms of photoshop. Here's how you think about it in terms of the tools you're using and the words you use there, here’s  how it applies. Here's an exercise where people had to get out of their seats and create this really simple neural net with human beings and then we had coding as well. So they were coding in Python and in notebooks, so they were exposed to it and we exposed them to a lot of the techniques and terminology in a way that was concrete and they were able to then say oh, this is what style transfer looks like. Oh, this is how we constructed a bot. 
So first on the design side, I think having the vocabulary to be able to say, oh, I know what this word means. Not just I know what it means, but I've experienced it, so that I can have a meaningful discussion with my engineer, I think that that was an important piece. And then understanding how AI systems are just different from regular systems. They are more probabilistic in nature. The defaults matter. They can be self learning, and so how do we think about these, and starting to showcase case studies with our designers to understand that these types of systems are quite different from the deterministic type of systems that they may have designed for in the past. Again, I think it comes back to culture because, and we keep doing these workshops. Every quarter we'll do another one because we have so much demand for it, and we found even engineers and PMs will come to our design workshops. But kind of democratizing the terminology a little bit and making it concrete to people is an important part of this. Sam Charrington: [00:29:48] It's interesting to think about what it does to a designer's design process to have more intimate knowledge of these concepts. At the same time, a lot of the questions that come to mind for me are much higher level concepts in the domain of design. For example, we talk about user experience. To what degree should a user experience AI, if that makes any sense? Should we be trying to make AI or this notion of intelligence invisible to users or very visible to users? This has come up recently in, for example, I'm thinking of Google Duplex when they announced that that system was gonna be making phone calls to people, and there was a big kerfuffle about whether that should be disclosed. Mira Lane: [00:30:43] Yeah. Sam Charrington: [00:30:43] I don't know that there's a right answer. In some ways you want some of this stuff to be invisible. In other ways, tying back to the whole ethics conversation, it does make sense that there's some degree of disclosure. Mira Lane: [00:30:57] Yeah, absolutely. Sam Charrington: [00:30:58] I imagine as a designer, this notion of disclosure can be a very nuanced thing. What does that even mean? Mira Lane: [00:31:03] Yeah, yeah. And it's all context dependent and it's all norm dependent as well, because if you were to look into the future and say, are people more comfortable, I mean look at airports for example. People are walking through just using face ID, using the Clear system, and a few years ago, I think if you asked people, would you feel comfortable doing that? Most people would say no, I don't feel comfortable doing that. I don't want that. So I think in this space, because it's really fluid and new norms are being established and things are being tested out, we have to be on top of how people are feeling and thinking about these technologies so that we understand where some disclosure needs to happen and where things don't. In a lot of cases you almost want to assume disclosure for things that are very consequential and high stakes, where there is opportunity for deception. In the Duplex case you have to be thoughtful about that. So this isn't one where you can say, okay, you should always disclose. It just depends on the context. So we have this notion of consequential scenarios, where there's automated decision making or where the stakes are high. Those are ones where we just put a little bit more due diligence over them and start to be more thoughtful about those.
Then we have other types of scenarios which are more systems-oriented and here's some things that are operationally oriented and they end up having different types of scenarios, but we haven't been able to create a here's the exact way you do every single, you approach it in every single way. So it is super context dependent and expectation dependent. Maybe after a while you get used to your Nest thermostat and you're fine with the way it's operating, right? So I don't know. These social norms are interesting because they are, someone will go and establish something or they'll test the waters. Google Glass tested the waters and that was a cultural response, right? People responded and said I don't want to be surveilled. I want to be able to go to a bar and get a drink and not have someone recording me. Sam Charrington: [00:33:21] Right. Mira Lane: [00:33:22] So I think we have to understand where society is relative to what the technologies are that we're inserting into them. So again, it comes back to are we listening to users? Are we just putting tech out there? I think we really have to start listening to users. My group has a fairly large research component to it and we spend a lot of time talking to people. Especially in the places where we're going to be putting some tech and understanding what it's going to do to the dynamic and how they're reacting to it. Sam Charrington: [00:33:52] Mm-hmm (affirmative). Mm-hmm (affirmative). Yeah, it strikes me that maybe it's kind of the engineer background in me that's looking for a framework, a flowchart for how we can approach this problem and I need to embrace more of the design or it's like every product, every situation is different and it's more about a principled approach as opposed to a process. Mira Lane: [00:34:18] Absolutely. It's more about a principled and intentional approach. So what we're just talking about is everything that you're choosing, are you intentional about that choice and are you very thoughtful about things like defaults? Because we know that people don't change them and so how do you think about every single design choice and being principled and then very intentional and evidence-driven. So we pushed this onto our teams and I think some of our teams maybe don't enjoy being with us sometimes as a result but we say look, we're going to give you some recommendations that are going to be principled, intentional, and evidence-driven and we want to hear back from you if you don't agree on your evidence and why you're saying this is a good or bad idea. Sam Charrington: [00:34:59] Mm-hmm (affirmative). Mira Lane: [00:35:00] That's the way you have to operate right now because it is so context driven. Sam Charrington: [00:35:04] I wonder if you can talk through some examples of how human-centered design, AI, all these things come together in the context of kind of concrete problems that you've looked at. Mira Lane: [00:35:13] Yeah, I was thinking about this because a lot of the work that we do is fairly confidential, but there's one that I can touch on, which was shared at build earlier this year and that was a meeting room device and I don't know if you remember this, but there's a meeting room device that we're working on that recognizes who's in the room and does transcription of that meeting, and to me, as someone who is a manager, I love the idea of having a device in the room that captures action items and who was here and what was said. 
So we started looking at this and we said okay, well let's look at different types of meetings and people, and let's look at categories of people that this might affect differently. And so how do you think about introverts in a meeting? How do you think about women and minorities because there are subtle dynamics that are happening in meetings that make some of these relationships, they can reinforce certain types of stereotypes or relationships and so we started interviewing people in the context of this sort of meeting room device and this is research that is pretty well recognized. It's not novel research, but it reinforced the fact that when you start putting in things that will monitor anyone that's in a room, certain categories of people behave differently and you see larger discrepancies and impact with women, minorities, more junior people. So we said wow, this is really interesting because as soon as you put a recording device in a room, it's gonna subtly shift the dynamic where some people might talk less or some people might feel like they're observed or depending on if there's a manager in the room and there's a device in the room, they're going to behave differently and does that result in a good meeting or a bad one? We're not sure. But that will affect the dynamic. And so then we took a lot of this research and we went back to the product team and said well how do we now design this in such a way that we design with privacy first in mind? And make users feel like they're empowered to opt into it and so we've had discussions like that, especially around these types of devices where we've seen big impact to how people behave. But it's not like a hard guideline. There's not really a hard set of rules around what you have to do, but because all meetings are different. You have brainstorming ones that are more about fluid ideas. You don't really care who said what, it's about getting the ideas out. You have ones where you're shipping something important and you wanna know who said what because there are clear action items that go with them and so trying to create a system that works with so many different nuanced conversations and different scenarios is not an easy one. So what we do is we'll run alongside with a product team and while they're engineering, they're developing their work, we will take the research that we've gathered and we'll create alternatives for them at the same time so that we can run alongside of them. We can say hey, here's option A, B, C, D, and E. Let's play with these and maybe we come up with a version that mixes them all together. But it gives them options to think about. Because again, it comes back to oh, I might not have time to think about all of this. So how do we empower people with ideas and concrete things to look at? Sam Charrington: [00:38:35] Yeah, I think that example's a great example of the complexity or maybe complexity's not the right word, but the idea that your initial reaction might be like the exact opposite of what you need to do. Mira Lane: [00:38:51] Yep. Sam Charrington: [00:38:51] As you were saying this, I was just like oh, just hide the thing so no one knows it's there. It doesn't change the dynamic. It's like that's exactly wrong. Mira Lane: [00:38:58] You don't want to do that. Don't hide it. Sam Charrington: [00:38:59] Right, right. Mira Lane: [00:39:01] Yeah. And maybe that's another piece. 
I'm sorry to interrupt, but one of the things I've noticed is that our initial reaction is often wrong, and so how do we hold that while also giving ourselves space to explore other things, keep an open mind, and say okay, I have to adjust and change? Because hiding it would absolutely be an interesting option, but then you have so many issues with that, right? But again, it is about being able to have an open mindset and being able to challenge yourself in this space. Sam Charrington: [00:39:33] If we buy into the idea that folks who are working with AI need to be more thoughtful and more intentional and maybe incorporate more of this design thinking element into their work, do you have a sense for where this does, or should, or needs to live within a customer organization? Mira Lane: [00:40:01] Yeah, and this is a terrible answer, but I think it needs to live everywhere in some ways, because one thing that we're noticing is we have corporate-level things that happen. We have the Aether board. It's an advisory board that looks at AI technologies and advises, and that's at a corporate level, and that's a really interesting way of approaching it, but it can't live alone. So the thing that we have learned is that if we pair it with groups like mine that sit in the engineering context and are able to translate principles, concepts, and guidelines into practice, that sort of partnership has been really powerful, because we can take those principles and say well, here's where it really worked and here's where it kind of didn't work, and we can also find issues and say well, we're grappling with this issue that you guys hadn't thought about. How do you think about this and can we create a broader principle around it? So I think there's this strong cycle of feedback that happens. If you have something at the corporate level, you've established what your values, guidelines, and approaches are. Then in the engineering context, you have a team that can problem-solve and apply them, and you can create a really tight feedback loop between that engineering team and your corporate team so that you're continually reinforcing each other, because the worst thing would be to have a corporate-level thing that's just PR speak. You don't want that. Sam Charrington: [00:41:23] Right. Right. Mira Lane: [00:41:24] The worst thing would also be to have it only at the engineering level, because then you would have a very distributed mechanism of doing something that may not cohesively ladder up to your principles. So I think you kind of need both, working off each other, to really have something effective, and maybe there are other things as well, but so far this has been a really productive and iterative experiment that we're doing. Sam Charrington: [00:41:50] Do any pointers come to mind for folks that want to explore this space more deeply? Do you have a top three favorite resources or initial directions? Mira Lane: [00:42:02] Well, it depends on what you want to explore. I was reading the AI Now report the other day. It's a fairly large report, around 65 pages, on the impact of AI across different systems and industries. So if you're looking to get up to speed on which areas AI is going to impact?
I would start with some of these types of groups, because I've found that they are super thoughtful in how they're going into each space, understanding it, and then bubbling up some of the scenarios. So if you're thinking about AI from a "how is it impacting?" perspective, those types of reports are really interesting. On the engineering side, I actually spend a lot of time in a few Facebook groups. There are some big AI groups on Facebook and they're always sharing here's the latest, here's what's going on, try this technique. That keeps me up to speed on some of what's happening, and I also follow arXiv just to see what research is being published. On the design side I'm sort of mixed. I haven't really found a strong spot yet. I wish I had something in my back pocket that I could just refer to, but the thing on the theory side that has been super interesting is to go back to a few people who have written commentaries around sustainable design. So I refer back to Wendell Berry quite a bit, the agriculturalist and poet, actually, who has really introspected on how agriculture could be reframed. Ursula Franklin, a commentator from Canada, is another. She did a series of radio broadcasts a long time ago around technology and its societal impact, and if you replace a few of those words with some of our new-age words, it would still hold true. So I think there's a lot of theory out there but not a lot of here-are-really-great-examples of what you can do, because we're all still feeling out the space and we haven't found perfect patterns yet that you can democratize and share out broadly. Sam Charrington: [00:44:18] Well, Mira, thanks so much for taking the time to chat with us about this stuff. It's a really interesting space and one that I enjoy coming back to periodically. I personally believe that this intersection of AI and design is one that's just wide open and should and will be further developed, and I'm looking forward to keeping an eye on it. I appreciate you taking the time to chat with me about it. Mira Lane: [00:44:49] Thank you so much, Sam. It was wonderful talking to you. Sam Charrington: [00:44:52] Thank you.
Sam Charrington: Today we're excited to continue the AI for the Benefit of Society series that we've partnered with Microsoft to bring you. In this episode. We're joined by Peter Lee, Corporate Vice President at Microsoft Research responsible for the company's healthcare initiatives. Peter and I met a few months ago at the Microsoft ignite conference where he gave me some really interesting takes on AI development in China. We reference those in the conversation and you can find more on that topic in the show notes. This conversation centers on three impact areas that Peter sees for AI and healthcare, namely diagnostics and therapeutics, tools and the future of precision medicine. We dig into some examples in each area and Peter details the realities of applying machine learning and some of the impediments to rapid scale. Before diving in I'd like to thank Microsoft for their support of the show and their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges spanning sustainability, accessibility and humanitarian action. Learn more about their plan at Microsoft.ai. Enjoy. Sam Charrington: [00:02:18] All right, everyone. I am on the line with Peter Lee. Peter is a corporate vice president at Microsoft responsible for the company's healthcare initiatives. Peter, it is so great to speak with you again. Welcome to This Week in Machine Learning and AI. Peter Lee: [00:00:14] Sam, it's great to be here. Sam Charrington: [00:00:17] Peter, you gave a really interesting presentation to a group that I was at at Ignite about what some of Microsoft was working on, at Microsoft Research as well as a really interesting take on AI development in China. That kind of piqued my interest, and we ended up sitting down to chat about that in a little bit more detail. While I did cover that for my blog and newsletter, and I'll be linking to it in the show notes, we won't be diving into that today. It was a really, really interesting take that I reflect on often, and I think it's an interesting setup for diving into your background, because you do have a very interesting background and an interesting perspective and set of responsibilities at Microsoft. On that note, can you share with our audience a little bit about your background? Peter Lee: [00:01:11] Sure, Sam. I'd love to do that. I agree it is a little bit unusual, although I think the common thread throughout has been about research and trying to bring research into the real world. I'm a computer scientist by training. I was a professor of computer science at Carnegie Mellon for a long time, actually for 24 years, and at the end of my time there was the head of the Computer Science Department. Then I went to Washington, D.C, to serve at an agency called DARPA, which is the Defense Advanced Research Projects Agency. That's kind of the storied research agency that built the Saturn V booster technology, invented the ARPANET, which became the Internet, developed robotics, lots and lots of other things. I learned a lot about bringing research to life there. Then, after a couple of years there, I was recruited to Microsoft and joined Microsoft Research. Started managing the mothership lab in Redmond, in the headquarters in Redmond, and then a little bit later all of the U.S. research labs and then ultimately, all of Microsoft's 13 labs around the world. 
Right about that time, Steve Ballmer announced his retirement. Satya Nadella took over as the CEO. Harry Shum took over all of AI and research at Microsoft and became my boss. They asked me to start a new type of research organization internally. It's called NExT, which stands for New Experiences in Technologies, and we've been trying to grow and incubate new research-powered businesses ever since, and most recently in healthcare. Sam Charrington: [00:03:04] I think when I think about AI and healthcare, there's certainly a ton of ground to cover there, but I think one of the areas that gets a lot of attention of late is all the progress that's being made around applying neural nets, CNNs in particular, to imagery. I'm wondering from your perspective, how do you tend to think about AI applied to the healthcare space and where the big opportunities are? Peter Lee: [00:03:37] Yeah. When I think about AI and healthcare, I'm really optimistic about the future. Not that there aren't huge, difficult problems and sometimes things always seem to go slower than you expect. It's a little bit like watching grass grow. It does grow and things do happen, but sometimes it's hard to see it. But over the last 15 years, the thing that I think is underappreciated is the entire healthcare industry has gone digital. It was only 15 years ago that, for example, in the United States, less that 10% of physicians were recording your health history in a digital electronic health record. Now, we're up over 95%, and that's just an amazing transformation over 15 years. It's not like we don't still have problems, data is siloed, it's not in standard formats. There's all sorts of problems, but the fact that it's gone digital just opens up huge, huge amounts of potential. I kind of look at the potential for AI in three areas. One is the thing that you pointed out, which are AI technologies that actually lead to better diagnostics and therapeutics, things that actually advance medical science and medical technology. A second area for AI is in the area of tools, tools that actually make doctors better at what they do, make them happier while they're doing it, and also improve the experience for you and me as patients or consumers of healthcare. Then the third area is in this wonderful future of precision medicine that's taking new sources of information, digital information, your genome, your proteome, your immunome, data from your fitness wearables and so on and integrating all of that together to give you a complete picture of what's going on with your body. Those are sort of three broad areas, and they're all incredibly exciting right now. Sam Charrington: [00:05:51] When you think about the first two of those categories, better diagnostics and therapeutics and tools, how do you distinguish them? It strikes me that giving doctors a better way to analyze medical imagery, for example, or to use that example again, is a tool that they can use, but when you say tools, what do you specifically mean? Peter Lee: [00:06:14] Yeah. You're absolutely right. There's an overlap. It's not like the boundaries between these things are all that hardened, but if you think about one problem that doctors have today is by some estimates in the United States, doctors are spending 40 to 50% of their workdays entering documentation, entering notes that record what happened in their encounters with patients. That's sometimes called an encounter note. That documentation is actually required now by various rules and regulations. 
It's an incredible source of burden. In fact, I'm guessing you've had this experience, most people have. You go to your doctor, I go to mine, and I like her very much, but while I'm being examined by her, she's not looking at me. She's actually sitting at a PC, typing in the encounter notes. The reason she's doing that is if she doesn't do it while she's examining me, she'll have to do it for a couple of hours maybe in the evening, taking time away from her own family. That burden is credited or blamed for a rise in physician burnout. Well, AI technologies today are rapidly approaching the point where ambient intelligence can just observe and listen to a doctor-patient encounter and automate the vast majority of the burden of that required clinical note-taking. That's an example of the kind of technology that could in a really material way just improve the lives and the workday satisfaction of doctors and nurses. I put that in a different category than technologies that actually give you more precise diagnosis of what's ailing you or ability to target therapies that might actually attack the very specific genetic makeup, let's say, of a cancer that's inhabiting your body right now. Sam Charrington: [00:08:17] Got it. Got it. Maybe let's take each of these categories in turn. I'd love to get a perspective from you on where you see the important developments coming from, from a research perspective, and where you see the opportunities and where you see things heading in each. Peter Lee: [00:08:42] Sure. Well, why don't we start with your example of imaging, because computer vision based on deep neural nets has just been progressing at this stunning rate. It seems like every week you see another company, another startup, or another university research group showing off their latest advances in using deep neural net-based computer vision technologies to do various kinds of medical image diagnosis or segmentation. Here at Microsoft, we've been working pretty hard on those as well. We have this wonderful program based primarily in India that's been trained on the health records and eye images of over 200,000 patients. That idea of taking all that data, you get the signal of which of those patients have, let's say, suffered from, say, diabetic retinopathy or a progression of refractive error leading to blindness. From that signal in the electronic health record, coupled with the images, we are able to train a computer vision-based thing to make a prediction about whether a child whose eye image has been taken is in danger of losing eyesight. That is in deployment right now in India, and, of course, for other parts of the world like the United States and Europe, which are more regulated, these things are in various states of clinical validation so they can be more broadly deployed. Another example is a project that we have called InnerEye that is trying to just reduce the incredible, kind of boring and mundane problem of just pixel-by-pixel outlining the parts of your body that are tumor and should be attacked with the radiation beam as opposed to healthy tissue. That problem with radiation therapy planning has to be done really perfectly, which is why it's this sort of pixel-by-pixel process. But there is maybe five or 15 minutes of real black magic that's drawing on all of the intuition and experience and wisdom of a radiologist and then two to three hours of complete drudgery, and much of that complete drudgery can just be eliminated with modern computer vision technologies. 
These things are really developing so rapidly and coming online. They tend not to replace completely what doctors and radiologists can do, because there is always some judgment and intuition involved in these things, but when done right, they can integrate into the workflow to really enable, to kind of liberate clinicians from a lot of drudgery and to reduce mistakes. I think one other thing that's sometimes not fully appreciated is you also, when you get these tools, you can take these measurements over and over and over again. When they become cheap, you can take them every day, if necessary, which allows you to track progression of a disease or its treatment over time much more precisely. These sorts of applications, I think, in medical imaging, I think are really promising. One thing I ... it's a hobby horse of mine ... before I pause, is in 2015 here in Microsoft Research we invented something called deep residual networks, which are now commonly called ResNets. ResNet has become part of an industry standard and research standard in computer vision using deep neural nets. We ourselves have refrained from using ResNets for doing things like imaging of 3D images for the purposes of radiation therapy planning, and there are various technical reasons for that. Sometimes we have a mixture of being proud seeing the rest of the world use our invention for interesting medical imaging, but we also sometimes get worried that people don't quite understand the failure modes in these things. But, still, the progress has just been spectacular. Sam Charrington: [00:13:14] That's kind of an interesting prompt. Maybe let's take a moment to explore the failure modes, and why don't you ... It sounds like you don't advise folks to apply ResNets to the types of images that we tend to see in medical imaging. What's that about? Peter Lee: [00:13:32] Yeah. It's not advising or warning people against it. If you think about, let's say, take the problem of radiation therapy planning, it's a 3D problem. You have a tumor that is a 3D mass in your body and you're trying to come up with the plan for that radiation beam to attack ideally as much of that tumor while preserving as much healthy tissue as possible. Of course, your picture into that 3D tumor is as a series of two-dimensional slices, at least with current medical imaging. One very basic question is, as you examine slice-by-slice that tumor with respect to the healthy tissue, is each slice being properly and logically registered with the next one? A simple or naïve application of a convolutional neural network, like a ResNet, doesn't automatically do that. The other problem is it's unclear to what extent a bad training sample or set of training samples will do to one of these deep neural nets. In fact, just in the last few weeks and months, there have been more and more interesting academic research studies showing some interesting failure modes from a surprisingly small number of bad training samples. I think that these things are changing all the time. Our algorithms and our algorithmic understanding are improving all the time, but at least within our research groups, we've taken pains to understand that this application of computer vision isn't like others. It's more in the realm of, say, driverless cars where safety is of paramount concern, and we just have to have absolute certainty that we understand the possible failure modes of these things. 
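For readers unfamiliar with the deep residual networks Peter mentions, here is a minimal sketch of the core idea, the residual (skip) connection, written in PyTorch. It is illustrative only, not Microsoft's original implementation, and the channel sizes and input shapes are arbitrary placeholders.

```python
# A minimal sketch of the core ResNet idea (He et al., 2015): each block learns
# a residual F(x) and adds the input back, y = F(x) + x, which makes very deep
# networks much easier to train. Illustrative only.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.relu(self.bn1(self.conv1(x)))
        residual = self.bn2(self.conv2(residual))
        return self.relu(residual + x)   # the skip connection: add the input back

# Example: pass a batch of 2D image slices through one block.
block = ResidualBlock(channels=16)
slices = torch.randn(4, 16, 64, 64)      # batch, channels, height, width
print(block(slices).shape)               # torch.Size([4, 16, 64, 64])
```

Note that a stack of blocks like this, applied slice by slice to 2D images, carries no built-in guarantee that adjacent slices are treated consistently in 3D, which is one of the failure-mode concerns Peter raises for radiation therapy planning.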
Sometimes with just an off-the-shelf application of ResNets or any similar deep neural net algorithm, we and now more and more other researchers at universities are finding that we don't yet fully understand the failure modes. Sam Charrington: [00:16:02] In some ways, there's an opportunity beyond kind of naïve application of an algorithm that performs very well on ImageNet. Today, you can get data sets that include kind of these 2D representations of what are fundamentally 3D applications or 3D images and apply the regular 2D algorithms to them and find interesting things. But you're saying that a) we can do better and b) we may not even be doing the right things in many cases because of these safety issues. I'm wondering, on the first of those two points, the doing better, is there either a standard approach that's better than ResNet for these 3D images that you've developed at Microsoft or have seen otherwise? Or where are we in terms of taking advantage of the 3D nature of medical images and deep learning? Peter Lee: [00:17:06] Yeah. That's a good question. For our InnerEye project, which is really run by a great set of researchers based mostly in our Cambridge, U.K. research lab and led by Antonio Criminisi. He's really one of the preeminent authorities in computer vision. In fact, he led an effort some years ago to work out the 3D computer vision for Kinect, and so he's really specialized in 3D. The InnerEye project, which is really for us an effort to really understand completely the workflow of radiation therapy planning, that system actually doesn't use residual network. What it does is it uses kind of an architecture of layered what are called decision forests. That gives not only some benefits in terms of more compact representations of machine-learned models and, therefore, some performance improvements, but it allows us to kind of capture a kind of logical registration of the images as they go slice-by-slice. In other words, you're inferring not just the segmentation of each 2D image slice, but you're actually trying to infer the voxel, the 3D voxel volume of the tumor that you're trying to attack. Then on top of that, there's a process involved when you're dealing with medical technologies. You don't just put it out there and start applying it on people. You get it peer-reviewed. You get it peer-reviewed, in this case, in computer science journals and in medical journals, and you go through a clinical validation, and if you're in the United States, for example, through an FDA approval process. For us, as we're learning about what does our cloud, what do our AI services, what do our tools have to be in order to support this future of AI-powered healthcare, InnerEye is an example of us going end-to-end to try to build it all out and to understand all those components and to understand what has to be done to really do it right. It's been a great learning experience. We're now in the process not only of working with various companies who might want to integrate this InnerEye technology into their medical devices, but we're starting to now pull apart the kind of bricks and mortar that we used in the technical architecture for InnerEye in order to expose those as APIs for other developers to use. Our intent is not to get into the radiation therapy business. Our intent is not to get into radiology. 
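As a rough illustration of the per-voxel classification idea behind a decision-forest approach, here is a simplified sketch using a plain random forest over hand-crafted neighborhood features. This is not the InnerEye architecture, which layers its forests and reasons about registration across slices; it is just a toy stand-in, assuming NumPy, SciPy, and scikit-learn, showing how a forest can label every voxel of a 3D volume.

```python
# Illustrative sketch only: per-voxel tumor vs. healthy classification with a
# random forest over simple local features. The data, labels, and features
# here are synthetic placeholders, not a real radiotherapy pipeline.
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.ensemble import RandomForestClassifier

def voxel_features(volume: np.ndarray) -> np.ndarray:
    """Stack per-voxel features (intensity, local mean, local std) into (n_voxels, 3)."""
    local_mean = uniform_filter(volume, size=3)
    local_sq_mean = uniform_filter(volume ** 2, size=3)
    local_std = np.sqrt(np.maximum(local_sq_mean - local_mean ** 2, 0))
    return np.stack([volume, local_mean, local_std], axis=-1).reshape(-1, 3)

# Toy data: one labeled 3D volume (in practice, many expert-contoured scans).
rng = np.random.default_rng(0)
volume = rng.normal(size=(32, 32, 32))
labels = (uniform_filter(volume, size=5) > 0.1).astype(int).ravel()  # fake "tumor" mask

forest = RandomForestClassifier(n_estimators=50, max_depth=12, random_state=0)
forest.fit(voxel_features(volume), labels)

# Predict a full 3D mask for a new scan, voxel by voxel.
new_volume = rng.normal(size=(32, 32, 32))
predicted_mask = forest.predict(voxel_features(new_volume)).reshape(new_volume.shape)
print(predicted_mask.shape)  # (32, 32, 32)
```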
But we do want our cloud and our AI services and our algorithms to be a great place for any other company or any other startup or innovator who wants to do that and ideally do it on our cloud, using our tools. Sam Charrington: [00:20:29] An interesting point in there. You mention that the decision forests that you developed to address this problem ... I guess we often think of there being this tradeoff between factors like explainability or safety, as you related that second point, and performance, which we think of as the neural net is delivering kind of the ultimate in performance in many cases. But in this case, this decision forest algorithm is outperforming at least your classic 2D ResNets, and I'm imagining also providing benefits in terms of explainability/safety. Is that correct? Peter Lee: [00:21:21] Well, we feel very strongly that it provides benefits in terms of safety. Explainability is really another very interesting question and problem. There's a potential for greater explainability. One of the lessons that we learned when we were working on AI for sales intelligence ... We had really developed tremendous amount of AI that would ingest large amounts of data from the world as well as from customer relationship management databases, emails and so on for our sales teams and used that through various AI algorithms to do things like synthesize new offers to specific customers or to surface new prospective customers or to suggest new discount pricing for specific customers. One of the things we learned is that no self-respecting sales executive is going to offer a 20% discount to a customer just because his algorithm says so. Typically- Sam Charrington: [00:22:35] Doctors are probably similar? Peter Lee: [00:22:37] That's right. In that situation, we also moved away from, in that specific case, moved away from the pure deep neural net architecture to having a kind of layered architecture of Bayesian graphical models. The reason for that was so that we could synthesize an explanation in plain English of not only offer a 20% discount, but why. As we get into, away from more point solutions that are kind of machine learning or AI-powered to more of that digital assistant that is the companion to a clinician and gives that clinician a second opinion or advice on a first opinion, those sorts of explanations undoubtedly are going to become important, especially at the beginning when we're trying to establish trust in these things. As we've been experimenting even with the kind of ambient intelligence to just listen in on a doctor-patient encounter and try to automate a note, one thing we've found is that doctors will look at the synthesized note and not trust everything in it because they don't quite yet have the understanding of why did the note come out this way. It became important to provide tools so that when you, say, click on a specific entry in the note, that it could be mapped back to a running transcript and to the right spot in the running transcript that was recorded. These sorts of things I think are part of maybe the human-computer interaction or the human-AI interaction that we're having to think about pretty hard as we try to integrate these things into clinical workflow. Sam Charrington: [00:24:30] Before we move on beyond diagnostics and therapeutics, all of the examples that you gave fell into the domain of computer vision. Are there interesting things happening in diagnostics beyond the kind of onslaught of these new computer vision-based approaches? 
Peter Lee: [00:24:51] Yeah. I think actually some of the most interesting things are not in computer vision, and this maybe crosses over into the precision medicine thing. One of the projects I'm so excited about is something that we're doing jointly with a Seattle biotech startup, Adaptive Biotechnologies. The setup is this: If you take a small blood sample from your body, in that sample, in that one-mL sample, you'll end up capturing on the order of one million T cells. The T cells are one of the primary agents in your adaptive immune system. About two and a half years ago, there was a major scientific breakthrough that got published that showed that the receptor ... There's a receptor on the surface of your T cells, and in that receptor, there's a small snippet of DNA. There was strong evidence two and a half years ago that that snippet of DNA completely determines what pathogen or infectious disease agent or cancer that T cell has been programmed to seek out and destroy. That paper was very interesting because it used a simple linear regression in order to identify from a read of that little snippet of DNA on the T cell receptor whether you had CMV, cytomegalovirus, or not. It was really just an impressive paper and just very recent. Well, the thing that was interesting about Adaptive Biotechnologies is Adaptive Biotechnologies was in the business of giving you a printout of that specific snippet of DNA in all the T cell receptors in a blood sample. They had a business model that would help some cancer centers titrate the amount of specific chemotherapy you were getting based on a reading of that DNA. That raised the question, would it be possible to take that printout of those T cell receptor DNA sequences and, in essence, think of that as a language and translate it into the language of antigens? Then, if you can do that, can you take those antigens and do a kind of topic identification problem to figure out what infectious diseases, what cancers, and what autoimmune disorders your body is currently coping with right now? It turned into this very interesting new business opportunity for Adaptive Biotechnologies that if machine learning could be used to solve those two problems, then they would have a technology that would be very similar to a universal diagnostic, a simple blood test powered by machine learning that could do early diagnosis of any infectious disease, any cancer, and any autoimmune disorder. Microsoft found that interesting enough that we actually took an investment position in Adaptive Biotechnologies and agreed to work with them on the machine learning. And Adaptive, for their part, agreed to build a bigger production pipeline in order to generate training data to power that machine learning that we're developing at Microsoft. What has transpired since then has been an amazing amount of progress where we've added tremendous amount of sophistication actually using deep neural nets and started to feed it with billions of points of training data. In fact, this year, the production facility at Adaptive will be able to generate up to a trillion points of training data. We're now targeting five specific diseases, ovarian cancer, pancreatic cancer, type I diabetes, celiac disease, and Lyme disease. That's two cancers, two autoimmune disorders, and one infectious disease with the same machine learning pipeline. 
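To illustrate the repertoire-to-diagnosis idea in the simplest possible terms, here is a toy sketch that treats each person's TCR sequences as text, featurizes them as character n-grams, and fits a linear classifier. The sequences and labels are made up, and this is emphatically not the published CMV study or the Adaptive/Microsoft pipeline; it is only a sketch of the "sequences as a language" framing under those assumptions.

```python
# Deliberately simplified illustration: predict a disease label from a person's
# T cell receptor (TCR) repertoire by treating the sequences as text.
# All sequences and labels below are hypothetical toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each "document" is one person's repertoire: a space-separated list of
# (made-up) CDR3 amino-acid sequences read from their blood sample.
repertoires = [
    "CASSLGTDTQYF CASSIRSSYEQYF CASSPGQGNTEAFF",   # person 1
    "CASSLAPGATNEKLFF CASSQDRDTQYF",               # person 2
    "CASSIRSSYEQYF CASSLGTDTQYF CSARDRTGNGYTF",    # person 3
    "CASSQETQYF CASSLAPGATNEKLFF",                 # person 4
]
cmv_status = [1, 0, 1, 0]   # toy labels: 1 = positive, 0 = negative

# Break each receptor into overlapping 3-mers and fit a linear classifier.
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(3, 3)),
    LogisticRegression(max_iter=1000),
)
model.fit(repertoires, cmv_status)
print(model.predict(["CASSLGTDTQYF CASSPGQGNTEAFF"]))  # classify a new sample
```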
It's still an experiment, but it kind of shows you the potential power of these advances in immunology, in genomics, and AI all being bound together to give the possibility. We know the science now is valid, and if we can now build the technology that ties those things together, we get the potential for a universal diagnostic, but as close a thing that we could imagine getting to the Star Trek tricorder as anything. Sam Charrington: [00:29:31] Mm-hmm (affirmative). That was the thing that popped immediately to mind for me, the tricorder. That example, I think, captures for me really plainly both the promise of applying machine learning and AI to this healthcare domain, but also maybe a little bit of the frustration in thinking through, okay, collecting a trillion samples and you've got this pipeline, why does it take so long? There's certainly regulatory and political types of reasons that maybe we'll get into. I'm wondering if you can elaborate on with that much training data and kind of the science in place and a pipeline in place, what are the realities of applying machine learning in this type of context that impede kind of rapid scale? Why just five diseases and not 25, for example? Peter Lee: [00:30:43] Yeah. That's such a great question. Yeah, human biology is just so complicated. I will say there are three ways, maybe, to take a cut at that. If we took a look at the very basic science, just consider the human genome, something that geneticists at several universities have taught me which was really eye-opening, is if you look at the human genome and then look at all the possible variants, the number of variants in the human genome that would still be considered homo sapiens is just astronomically large. Yet, the total number of people on the planet relative to that number is really tiny, only, what, seven and a half billion people. In fact, if we had somehow DNA samples from every human that has ever existed, I think most estimates say there are fewer than 106 billion people that have ever existed since Adam and Eve. If we are using modern machine learning, which is basically looking at statistical patterns and correlations, we have an immediate problem for a lot of basic problems in genomics, because we basically don't have a source of enough training data. The complexity of human beings, the complexity of cancer, the genetic complexity of disease, is just vastly larger than the number of people that have ever existed. Sam Charrington: [00:32:21] Meaning relative to the possible combinations of genes- Peter Lee: [00:32:28] That's right. Sam Charrington: [00:32:28] ... every human is ... I guess it shouldn't be surprising that every human is unique, but even given ... It's a little counterintuitive. You'd think there's only these four letters that were thrown together to figure all this stuff out. Right? Peter Lee: [00:32:43] Yes. What that means is that, yes, we will and we have been making ... We, meaning the scientific community and the technology community, have been making stunning advances and making really meaningful improvements for neonatal intensive care, for cancer treatments, for immunology, but fundamentally, scientifically, we still need something beyond just machine learning. We really need something that gets into the basic biology. That's kind of one reason why this is hard. Another reason is these are just big problems. 
In the project with Adaptive Biotechnologies, there are between 10 to the 15th and 10 to the 16th different T cell receptors that your body can produce and on the order of maybe 10 to the 7th known antigens. Imagine we're trying to do is trying to fill out a gigantic Excel spreadsheet with 10 to the 16th columns and 10 to the 7th rows. That's just a heck of a big table, and so you end up needing a large amount of training data to discern enough structure, find enough patterns in order to have a shot at filling in at least useful parts of that table. The good news is everybody has T cells, and so we can take blood samples from anybody, from just ordinary, healthy people, and then we can go to research laboratories around the world that have stored libraries of antigens and start correlating those stored libraries of antigens against those what are called naïve blood samples. That's exactly what Adaptive Biotechnologies is doing in order to generate the very large amount of training data. It's a little bit of a good news situation there that we don't need to find thousands or millions of sick people. We can generate the data from just ordinary samples. But it's still a very large amount of data that we need. Then the third kind of way that I think about this is it gets back to the safety issue. We do things a certain way because ultimately, medicine and medical science is based on causal relationships. In other words, we want to know that A causes B, but what we typically get out of machine learning is just A is correlated with B. We get those inferences, and then it takes more work and more testing under controlled circumstances to know that there's a causal relationship. All three of those things kind of create challenges. It does take time, but I think the good thing is as the regulatory organizations like the FDA have gotten smarter and smarter about what is machine learning, what is it good for, what are its limitations, that whole process has gotten, I think, faster and more efficient over time. Then there's a second element, which is, of course, companies are in it to make money. At a minimum, even if they have purely humanitarian intentions, at a minimum they have to be sustained over time. That means that insurance companies and Medicare and Medicaid, they have to be willing to reimburse doctors and nurses when they actually use or prescribe these diagnostics and therapeutics. All of that takes time. Sam Charrington: [00:36:37] At least on the second of your three points, in thinking about scaling, solving problems like this, specifically training data, do you have a rule of thumb, a chart that says, okay, one trillion training samples will get us these five diseases, but we'll need 10 trillion to get to 10 diseases? I realize that that's almost an asinine question and it's much more complex than that, but does it make sense at all to think of it like that? And think of, I guess, the impact of collecting training data and what the trajectory looks like that over time, kind of like the way we thought of as we drive the cost of sequencing down, the downstream effects that that'll have? Peter Lee: [00:37:27] Yeah. Well, when you find the answer to that question, please tell me. In my experience, I've seen this go two ways. One of the wonderful things about modern machine learning algorithms today is that they're far less susceptible to problems of over-fitting. They come very close to this wonderful property that the more data, the more better. 
But it does happen that sometimes you hit a wall, that you start to see a trail-off in improvement. We really don't know. The kind of early results that we've gotten with admittedly simpler diseases like CMV, and then CMV is actually not that interesting from a medical perspective, they give us tremendous hope. Then other internal, more technical validations, give us supreme confidence that the basic science, the biological science is well-understood now. Once you start really attacking much more complex diseases, like any cancer, it's really hard. I would be unwilling personally to make a prediction about what will happen. But there's every reason today for optimism, and I think the only unknown is whether there is a what if we fall off a cliff at some point and stop finding improvements. Or if we're going to just get to a viable FDA-approved diagnostic in the near term that will be constantly improving as more and more people are diagnosed. It could really go in either way. I'm really unable and actually unwilling to make a prediction about which way it will go, but we are feeling pretty confident. Incidentally, I should say last month Adaptive Biotechnologies closed a deal with Genentech for applications of this T cell receptor antigen map in the therapeutic space, in the area of cellular therapies for targeted cancer treatments. That deal has a value of over $2 billion, so there's also some ... When you're dealing with commercial relationships like that, there's a tremendous amount of due diligence. These are big bets and big pharma is accustomed to making large, risky bets like this, but I think it's another sign that at least leading scientists at one of the larger pharmaceutical organizations is also increasingly confident that we can fill out this map. Sam Charrington: [00:40:38] We've talked about diagnostics. We've talked about precision medicine. What do you see happening on the tooling side, both from the doctor's perspective as well as the patient experience perspective? Peter Lee: [00:40:52] Yeah. One thing, it's a simple thing, but it's been surprising how useful it has turned out to be. We've been piloting chatbot technology that we call the Microsoft Health Bot. This has been sort of in a beta program with a few dozen healthcare organizations. What it does is, we've sort of advanced our cognitive services for language processing, for natural language processing, for conversational understanding and the tooling to provide a drag-and-drop interface so that ordinary people can program these chatbots, at least for medical settings, and then we've improved the models, the language models, so they understand medical and healthcare concepts and terms. We've been surprised at the kinds of applications that people use. One example is there are organizations that have made prescription bots. The idea is this. Maybe you get a prescription from your doctor or from the hospital and you go to the pharmacy, you get your prescription filled, and then a day or two later, you get a message from this intelligent chatbot that's asking, "How's it going? Do you have any questions? Or have you had any issues with your medication?" It invites you proactively to get into a conversation that gives the healthcare provider tremendous insight into whether you're adhering to your prescription. That's a huge problem. Something like 35% of people actually don't follow through with their prescription medications. It's just there to answer questions. 
Maybe you have some stomach upsets or some people who are on a lot of medications hate having all those bottles and they put them all, dump all the pills into a baggy and then they can't remember which pills are which. The health bot is able to converse with you and say, "Oh, well, why don't you point your phone camera at a bunch of pills and I'll remind you what they are." It uses modern computer vision ResNets, actually, to remind you what these pills are. The kind of engagement that the healthcare providers get, the improvements in engagement and the satisfaction that people like you and me have is really improved. Or just asking simple benefits questions or medical triage of various sorts, these kinds of ideas have been surprisingly interesting. In fact, so surprising for us that later this week, we'll be making that product generally available for sale. You'll be able to use the Microsoft Health Bot technology without any restriction, except for payment, of course. That is something that has gone extremely well. That technology now is being baked into more and more of, I think, of what people will be seeing. We have a collaboration hub application in Office 365 called Teams, and Teams has been this just wonderful technology for improving collaboration in all sorts of workplace settings. Well, we've made Teams healthcare compliant and able to connect to electronic health record systems, and then by integrating great kind of collaboration intelligence tools, to just parse records or a newer way to go to find certain bits of information or just to be able to ask an intelligent agent that is part of your team, "Did so-and-so check the sutures last night?" and be able to get a smart answer whether people are awake or not. There are all these little ways that I think AI can be used in the workflow of healthcare delivery. One of the things that is, I think, underappreciated about healthcare delivery today, especially in acute care settings, is it's a super collaborative environment. Sometimes there can be as many as 20 people that are working together as a team delivering care to multiple patients at a time. How to keep that team of 20 people all on the same page and all coordinated is getting to be a really difficult problem, typically done with Post It notes and half-erased whiteboards now transitioning to pretty insecure consumer messaging apps. But the idea of having real enterprise-grade collaboration support with AI, I think just can make all of that much better and then provide much more security and privacy for people. A lot of these applications of AI end up being less flashy than doing some automatic radiation therapy planning of a medical image, but they really kind of help people, those people on the front lines of healthcare delivery do their jobs better. Sam Charrington: [00:46:34] I tend to find myself having really kind of mixed feelings about conversational applications, at least from the perspective of talking about them on the podcast. There's no question that conversational experiences and interfaces will be a huge part of the way we interact with computers in the future and that there's tons of work that needs to happen there because of the reasons that you mentioned, like less flashy. I wonder if there's still interesting research. At least my question to you is are there still interesting research challenges there? Or is it all, do we have all the pieces and it's just kind of rolling up the sleeves and building enterprise software, which we know is hard and takes time? 
Peter Lee: [00:47:21] Yeah. It's a good question. It feels like research to me. Sam Charrington: [00:47:27]. (laughter) Elaborate. Peter Lee: [00:47:28] Some of the problems, if anything, feel little difficult, honestly. If we just, say, take the problem of listening to a doctor-patient conversation and from that, understanding what should go into the standard form of a clinical encounter note. Here's a typical thing. There could be an exchange. Let's say, Sam, you're my doctor and I'm your patient, you might be asking me how I'm doing and I might complain about the pain in my left knee hasn't gone away. We can have an exchange about how that goes, and ultimately, what goes into the note by you is a note about my continued lack of weight loss and that my being overweight is contributing to the lack of healing with my knee problem. That may or may not have been a part of our conversation. While it's important that the weight loss element be in that clinical note ... In fact, it might even mean revenue for that doctor because there might be a weight loss program that gets prescribed and so on. That's important and it's important not to miss that. The human exchange here and the things that are implicit in those conversations, let alone the fact that I'll say kneecap and you'll say patella, are things that are as close to general artificial intelligence style problems as anything. Sam Charrington: [00:49:15] Yeah. Peter Lee: [00:49:18] Look, we don't kid ourselves that we're anywhere close to solving those types of problems, but those are the kinds of problems we think about, even when we just look at the kind of day-to-day, minute-by-minute work that people do to deal with their healthcare. Sam Charrington: [00:49:33] Right, right. Peter Lee: [00:49:34] There's another one that's interesting. To really unlock the power of AI, what we would want to do is to just open up huge databases to great researchers and innovators everywhere, but, of course, we need to do that without violating anyone's privacy. There's one problem, something called de-identification. It would be great to be able to take a treasure trove of what's in electronic health records and "de-identify" it. Well, some parts of those electronic health records are easy to do because there might be a field called Social Security Number, another field called Name, another one called Address, and so on, so you can just scrub those out. But large amounts of clinical data involve just unstructured notes, and to really have a deep understanding of what's in those notes and in order to scrub those in a way that won't inadvertently reveal somebody's identity or their medical condition, again, is something that in the ultimate, ends up being a very general AI problem. Sam Charrington: [00:50:41] That's a great reframing of the way to think about this is I guess most chatbots are boring because they're boring. Kind of the entity intent framework that most chatbots are built on is kind of like table stakes relative to what we're really trying to do with conversational experiences. That really requires a level of sophistication and our ability to use and work with and manipulate natural language that is very much at the research frontier now. And that's why most current in-production chatbots are kind of boring. Peter Lee: [00:51:27] Yeah. We've taken a step forward of trying to think of these things almost in terms of being able to play a game of 20 questions. 
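Circling back to the de-identification problem Peter describes, here is a minimal sketch of the contrast he draws: structured identifier fields can simply be dropped, while identifiers buried in free-text notes require language understanding. The scrubbing below is a toy regex and name-list pass over hypothetical field names and patterns; a production system would rely on clinical NER models and careful validation.

```python
# Toy de-identification sketch. Structured fields are easy to drop; the hard
# part is identifiers hidden inside unstructured clinical notes. The "NER"
# step here is just a naive regex/name-list scrub for illustration.
import re

STRUCTURED_IDENTIFIERS = {"name", "ssn", "address", "phone"}  # easy: drop whole fields

def scrub_record(record: dict, known_names: set[str]) -> dict:
    clean = {k: v for k, v in record.items() if k not in STRUCTURED_IDENTIFIERS}
    note = clean.get("clinical_note", "")
    # Hard part: identifiers buried in free text.
    note = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", note)         # SSN-like patterns
    for name in known_names:                                        # naive name scrub
        note = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", note, flags=re.IGNORECASE)
    clean["clinical_note"] = note
    return clean

record = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "clinical_note": "Jane Doe (SSN 123-45-6789) reports left knee pain; patella tender.",
}
print(scrub_record(record, known_names={"Jane Doe"}))
# {'clinical_note': '[NAME] (SSN [SSN]) reports left knee pain; patella tender.'}
```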
One of the most inspiring applications of health bots that we dream about is in matching people to clinical trials. At any point, there are thousands of clinical trials. You can go to a website called clinicaltrials.gov and there's a search bar there, and you can type in something like breast cancer. When you do that, you get this gigantic dump of every registered clinical trial going on that might be pertinent to breast cancer. While that's useful, the problem with that is it's hard to know which ones of those ... If you are, say, someone who's desperate to find a clinical trial to enroll in because you've run out of other viable options for whatever is ailing you, it's just almost impossible to go through all of that technical information and try to understand this. Would it be possible to use an AI to read through all that technical information and then to synthesize what amounts to a game of 20 questions, something that'll converse with you and ask you questions in order to narrow down to just that one or two or three clinical trials that might be a match for you. It's that kind of thing where it's not fully general conversation of the sort that I think you and I were talking about just a minute ago, but is slightly more structured than that in order to help you more intelligently, more efficiently find the right medical or healthcare solution for you. That kind of application is something that we're really putting a lot of kind of heart and mind into, along with many others around the world. It's exciting that we're starting to see these things actually make it into clinical use today. I kind of agree with you. I do roll my eyes sometimes at the overheated hype around intelligent agents and chatbots as well, just like anybody else, but it's really getting somewhere in these more limited domains. Sam Charrington: [00:53:56] I think it also says why the interesting work in domains like this is going to be ... It's not generic. You're solving a specific problem and there's a lot of investment in getting the machine running AI right for this particular problem as opposed to implementing a generic framework. Peter Lee: [00:54:16] That's right. Sam Charrington: [00:54:17] Awesome. Well, Peter, thank you so much for taking the time to chat with me about the stuff you're seeing and working on in the healthcare space. A ton of really interesting examples in there and I'm looking forward to following all this work and digging deeper. Thank you. Peter Lee: [00:54:37] And we didn't even talk about China once. That's great. Sam Charrington: [00:54:41] Well, you mentioned ResNet a few times kind of taunting me to dive into that conversation, but I'll refer folks to the article and we'll put the link in the show notes. Peter Lee: [00:54:52] Sounds great. It was really a pleasure chatting.
Bits and Bytes

This week, news from Google I/O and Microsoft Build dominated the headlines. Here are the highlights:

Oracle rolling out AI applications for manufacturing. The applications leverage machine learning and AI to sift through large amounts of data from production environments to identify and trace issues from production through to customer delivery.

IBM granted patent for AI-powered traffic management. The system would use computer-vision-powered cameras instead of timers to manage the flow of traffic. My friends over at SWIM are also doing interesting work in this area with one of their customers.

Top Baidu AI executive stepping down. The top executive behind Baidu's artificial intelligence programs, Lu Qi, is stepping down. Lu is a former Microsoft executive and AI expert and has been responsible for day-to-day operations at Baidu's AI unit.

Boston Dynamics announces plans to sell SpotMini robot. The announcement came from Boston Dynamics founder Marc Raibert at the TC Sessions: Robotics conference at Berkeley. The robots are currently in pre-production but could be available for sale by the middle of 2019.

Researchers pair AI and drones to help manage agriculture. The University of South Australia system allows farmers to pinpoint areas that need more nutrients and water, potentially improving crop outcomes and reducing resource mismanagement.

Intel launches OpenVINO to accelerate computer vision development. The new toolkit, already in use at customers Agent Vi, Dahua, Dell, Current by GE, GE Healthcare, Hikvision, and Honeywell, includes three major APIs: the Deep Learning Deployment Toolkit, a common deep learning inference toolkit, as well as optimized functions for OpenCV and OpenVX.

Dollars & Sense

Primal, an AI consumer and enterprise company, raises $2.3M
BrainQ Technologies, a developer of AI to treat neuro-disorders, raises $8.8M in funding
Motorleaf, a startup focused on data-driven insights for greenhouse and indoor operators, raises $2.85M
Dialpad acquires TalkIQ to bring voice-driven AI to communications; competitor 8x8 acquires MarianaIQ to strengthen its AI capabilities as well
Oracle buys DataScience.com for its data science platform
Microsoft acquires Semantic Machines to advance conversational AI

Sign up for our Newsletter to receive Bits & Bytes weekly in your inbox.
Finally ready to be seen; enjoy this newsletter fifteen! Bringing AI Products to Market. Updates and more from TWIML & AI

As the New Year approaches, many of us are deep in the process of laying the groundwork for future goals and plans. For some of you, that might mean figuring out how to get your AI product to market. Here at the NIPS (Neural Information Processing Systems) conference, I had the opportunity to organize an impromptu meetup on the topic, and I thought it'd be worth sharing the group's observations here.

Bringing products to market in general, and ML/AI products in particular, is a topic that I'm both very passionate about and have a vested interest in. My work calls on me to advise a broad range of companies on this topic, from startups and VCs to large enterprises. As a result, I'm always interested to hear how other people in the industry are approaching the issue. The NIPS conference app had a bulletin board feature where folks could self-organize into shared interest groups. It was a no-brainer, then, to use this feature to put out a call for folks interested in discussing bringing AI-enabled products to market. We had a nice turnout of eight or so people from varying backgrounds, working on both internal (intra-enterprise) and external (customer-facing) products. Issues raised spanned marketing and sales challenges as well as the technology itself. Here's a breakdown of some of the top concerns, as well as the solutions, the group discussed.

Servers & scaling. A limiting factor on productivity for AI product teams was having access to the right amount of computing power. Even in the cloud, this can be expensive, both for training and inference. Lean startup approaches like focusing on a minimum viable product can help here by ensuring resources are used efficiently. Providing executives with clear-cut examples of how productivity can be increased with the right tools was also seen as a solution to this issue.

IP & contracts. This was a tough one. The IP issues associated with machine learning are still murky. For example, questions of ownership of models trained on a given third-party dataset or toolset remain largely unchallenged. Startups can often get away with flying under the radar here, but for big companies with lots at stake, there are numerous challenges and roadblocks.

Risk, uncertainty & probability. One broad issue discussed was simply a lack of ability, on the part of customers or executives, to think effectively in terms of probabilistic outcomes. Many buyers and business leaders, even those for whom risk is a fundamental element of their business model (e.g. finance or energy exploration), have trouble wrapping their heads around products that work "sometimes" or with a certain probability. Solutions here include strong communication and education programs, recruiting and nurturing executive sponsors, and connecting to existing business initiatives like digitalization.

Communicating the benefits of AI without creeping users out. Privacy is an important issue for machine learning and AI products, but this is about more than just privacy. It's also an issue of the user experience. I've talked about this before, but designing for intelligence is an important emerging area of user experience design that shouldn't be overlooked. The consensus of the group was that messaging around AI products should center on how the AI assists humans, not on how it makes decisions for them or otherwise takes control out of their hands.
Messaging in general for AI products. The key to getting the messaging right for AI products is to identify the core problem that the product is solving. Work on communicating how solving that problem will create value for your customer or business. As the product evolves, keep stakeholders aware of both small wins and transformational opportunities, and of how you and your team are minimizing risk.

In these early days of AI adoption, there's no clear-cut recipe for getting an AI product up and running successfully. Not only is the technology itself still evolving, but communicating the benefits of the product to customers is also challenging. So having these kinds of discussions among industry folks about what's worked for them, and what hasn't, is really invaluable.

Sign up for our Newsletter to receive this weekly in your inbox.
In 1996, Bill Gates popularized the saying "content is king." Twenty years later it's data that's king, and those able to harness it for better insights, predictions and experiences are the new kingmakers. To help give you a view into the next twenty years of data, and how to take advantage of it today, I've partnered with the team at Interop ITX to create the Future of Data Summit. This 2-day event will bring together noted experts and practitioners to discuss the future of enterprise data from a variety of technology perspectives. We'll be exploring the innovation and opportunity being offered in areas such as, of course, ML, AI and cognitive services, but also IoT and edge computing, AR/VR, blockchain, algorithmic IT operations, data security and more. I've hand-picked the speakers to both inspire Summit attendees with a view into what's possible and provide practical insights into how to get there. Here's our agenda for the Summit:

Day 1 - 5/15
Opening Remarks
Enterprise AI & the Future of Data - Sam Charrington, Principal Analyst, CloudPulse Strategies
Living on the Edge: Fog and Edge Computing for a World of Ubiquitous Devices - Janakiram MSV, Principal, Janakiram & Assoc.
Break
Data Gravity and Archimedes' Lever for IoT (10:45 - 11:45) - Dave McCrory, CTO, Basho
The Intersection of Big Data, Cloud, Mobility and IoT: Making the Connections - Bob Friend, Director, National Practice, BlueMetal
Lunch
Cloud, IoT and Big Data Security - Diana Kelley, Global Executive Security Advisor, IBM
How the Future of Hardware Enables the Future of Data - Assaf Araki, Sr. Big Data Architect, Intel
Break
Algorithmic IT Operations (AIOps) - Eric Sammer, CTO & Co-Founder, Rocana
Unlocking the Data Lake with AI - Ashley Fidler, Head of Product, Versive

Day 2 - 5/16
Opening Remarks
Understanding Deep Learning - James McCaffrey, Research Engineer, Microsoft Research
Building AI Products - Josh Bloom, CTO, Chairman & Founder, Wise.io (GE)
Break
AI in Financial Services at Capital One (10:45 - 11:22) - Zachary Hanif, Director of Machine Learning, Capital One
Marketing in the Age of AI (11:22 - 12:00) - Srividya Kannan Ramachandran, Principal, Marketing & Data Science, Level 3 Communications
Lunch
Big Data and the Advent of Data Mixology - Jennifer Prendki, Sr. Data Science Manager, WalmartLabs
AI in the Enterprise Panel Discussion - Sam Charrington (moderator); Zach Hanif, Capital One; Srividya Kannan Ramachandran, Level 3; Jennifer Prendki, Walmart Labs
Break
Virtual and Augmented Reality in the Enterprise - Amy Peck, Founder & AR/VR Consultant, EndeavorVR
IoT and the Bitcoin Blockchain - Andre De Castro, CEO & Co-Founder, Blockchain of Things

Summit Details:
Date: May 15-16, 2017
Venue: MGM Grand, Las Vegas

A brief word about the parent event, Interop ITX. Interop ITX is one of the largest and longest running conferences providing education and networking opportunities to enterprise technology leaders and practitioners. The conference hosts 3,000 to 4,000 attendees. In addition to my Future of Data Summit, which is part of the pre-conference program, the regular conference offers dedicated tracks on Data & Analytics, Cloud, Infrastructure, Security, DevOps, and Leadership & Professional Development. I hope you can join me for the great presentations and discussion we'll be having at the Summit. Registration for the Future of Data Summit is done via the Interop ITX web site, and you'll need a package that includes the summits in order to attend. Please use my promo code CHARRINGTON for a 20% discount and to let the folks at Interop ITX know that you're coming for the Summit.
On the heels of last week's $200 million acquisition by Apple of Turi, Intel announced on Tuesday yet another acquisition in the machine learning and AI space, this time the $400 million acquisition of deep learning cloud startup Nervana Systems. This is another exciting acquisition; let's take a minute to unpack it.

First of all, for those not familiar with the company, Nervana, spelled N-E-R-vana, is a two-year-old company developing software, hardware and cloud services for deep learning. The company was originally founded to build hardware for speeding up deep learning, and it's this focus that made it so attractive to Intel. The company's first hardware product, due next year, is a custom deep learning chip called the Nervana Engine. The ASIC is similar in focus to the Google Tensor Processing Unit, or TPU, which we highlighted in the very first episode of This Week in Machine Learning & AI back in May. The company has also released a software product called Neon, and operates the Nervana Cloud. Neon is an open source deep learning framework like TensorFlow, Caffe or Theano. Relative to those others, which you hear about here on the show pretty much every week, Neon is known for being particularly fast, especially on NVIDIA GPUs. This is due to some clever low-level GPU optimization work the team did. Neon doesn't have quite the popularity of some of these other frameworks, in part because it was initially a proprietary product, only recently open sourced back in May. The company's cloud offering is tuned for running deep learning, and will eventually incorporate the company's own chips.

This is a great deal for the company's founders and investors. With $24.4 million in funding to date, and a price reported to be as high as $408 million, Nervana returned nearly 17x to investors, which is home run territory for most VCs. At the same time, if you'll allow me to Monday-morning-quarterback, I'm a little surprised that they decided to sell so early in the game. The company is extremely well positioned in two hot spaces, deep learning and cloud, and the team has only been at it for a couple of years. Projecting out a couple of years, it's easy to see Nervana with a billion-dollar valuation, assuming they continued to execute. This makes me wonder what the team saw in the market that said now was the time to sell.

Of course, it's certainly the case that Intel brings a lot more to the table here than cash. The company obviously has vast resources and expertise in the chip-making arena, and it could certainly help accelerate Nervana's plans. It's also the case, though, that the company faces stiff and growing competition. Google, for example, offers everything Nervana does. Google's TensorFlow, released about 8 months ago, is by most measures the most popular deep learning framework. (You'll recall we discussed Francois Chollet's analysis of the landscape back on the July 15 show.) Google also sees TensorFlow as becoming an on-ramp to the Google Cloud Platform. And GCP has TPUs, which I just mentioned and which the company announced back in May. So perhaps the Nervana team and investors looked at the long slog ahead and decided to take the money off the table. I do wonder if the lack of upside in terms of options makes hiring top talent more difficult for the company. So that's the Nervana side of things; what about Intel's side? Well, while this is a pretty small acquisition for Intel, I think it's a smart move on their part.
That's because, despite numerous investments in the space, as recently as their investment in Nervana competitor CognitiveScale last week, Intel has been struggling to tell a story around machine learning and deep learning. The problem they're facing is that NVIDIA is eating their lunch when it comes to chips for deep learning applications. In fact, NVIDIA also made news this week when they announced record revenues and a more aggressive sales outlook. The reason for the improved outlook? Quoting CEO Jen-Hsun Huang: "One particular dynamic sticks out, and it's a very significant growth driver of where we have an extraordinary position in and it's deep learning," Huang told analysts in a conference call that lasted almost 80 minutes. "The last five years, we've quietly invested in deep learning because we believe that the future of deep learning is so impactful to the entire software industry, the entire computer industry that we, if you will, pushed it all in."

NVIDIA's lead in deep learning has been a sore spot for Intel of late, to the point that several articles commented on interviews with company data center chief Diane Bryant in which she became ruffled at the mention of Intel's lack of presence in the machine learning market. Now, Intel and Diane are quick to shrug this off, since machine learning is a relatively nascent market. According to the MIT Technology Review, market research firm Tractica pegs the market for AI-related chips at under $1 billion, growing to $2.4 billion in 2024, a small figure compared to Intel's 2015 revenue of $56 billion. But Intel missed the boat on mobile, PC chip sales are declining, and there's weakness in data center and IoT revenue growth as well. So while machine learning and AI are an emerging market just at the beginning of the growth cycle, Intel can't afford to sit this one out. This deal gives them a much-needed story around deep learning and, if the companies are able to execute, a foot in the door of this nascent market.

Moving forward, this poses some of the same challenges I mentioned in the context of Apple/Turi, namely executive focus, but I also think it plays to several of Intel's strengths. In particular, while I've seen the company struggle trying to independently build and sell enterprise software, it does a good job of building and selling through reference architectures. If Nervana ultimately becomes a reference for how to build out a deep learning cloud using new and traditional Intel hardware combined with open source software, this could drive significant future adoption for them and begin to turn the tide. There are also a good number of possible tie-ins to take advantage of here. One is with Intel's open source project, the Trusted Analytics Platform. Also, Intel has significant stakes in big data company Cloudera and cloud builder Mirantis. This is getting a bit ahead of ourselves, sure, but there could be some pretty interesting collaborations between these projects and companies over time.

Subscribe: iTunes / YouTube / Spotify / RSS