Taking The Magic Out Of AI - A Conversation With Tracy Pizzo Frey, Senior Director, Outbound Product Management, Engagements & Responsible AI at Google Cloud
"One of the things I get very concerned about is that, for so long, AI has been such a mystery. And in that blanket of mysteriousness, it's been built up as something magical. So much so that for the first number of years in this role, customers were coming to Cloud excited about AI as a technology, but not yet as a means to solve tactical business problems, as if any use of AI might be a magic wand," says Tracy Pizzo Frey, Senior Director, Outbound Product Management, Engagements & Responsible AI for Cloud AI & Industry Solutions at Google. The reality is, of course, very different: AI technology is not magic at all. You must understand the problems you are trying to solve and then determine whether AI is even suitable for them. And if it is, you must then determine how to implement it in a way that actually succeeds and can help transform a business.
I ask Pizzo Frey to explain how she relates to Jeff Dean's organization, which many have come to know because of what I, and many others, consider the wrongful termination of lead researcher Timnit Gebru. She explains that while Google's AI Principles govern both (and all of Google), structurally the two organizations are separate and independent, and the governance processes and practices that operationalize the AI Principles may differ depending on the needs of each business. Pizzo Frey feels there is an added level of complexity and responsibility within Google Cloud because it builds products with AI that are then sold to customers and partners. This added responsibility has crystallized, through their review processes, the understanding that they cannot think in terms of edge cases but must instead evaluate both opportunities and harms. To understand the harms and their extent, one must understand the whole socio-technical landscape, spending time on the societal context and leaning on the expertise of social scientists and researchers with deep experience in racial equity, human rights, and civil rights. Her team's work is done in deep partnership with the entire AI Principles ecosystem across Google, which includes the foundational research of the Ethical AI Research team. She expressed how critical the work of Ethical AI research has been to her efforts and how valuable these partnerships have been, and continue to be, for her. Interestingly, Pizzo Frey highlights that within the review process, the team has focused on psychological safety: anyone can feel confident that if they speak up about something, no matter how uncomfortable, it will be welcomed as part of the process.
A good way to understand how the different AI teams work is to see how the Cloud Vision API came to be and how, years later, face recognition technology was evaluated in Cloud through its Responsible AI reviews. The Google AI research organization developed the perception models that are part of the Cloud Vision API. In 2016, Cloud leadership decided not to make a general-purpose face recognition API available as the broader Vision API came out of Beta. "The Cloud Vision API was one of the first enterprise AI offerings that we had commercialized as a part of our very nascent organization, six years ago, and face recognition was one of the top requests from customers coming out of the Beta in 2016. Cloud leadership decided at that time it would not make a general-purpose face recognition API available. However, when I joined the Cloud AI organization in 2017, this was still a very active discussion. Also in 2017, what is now the Responsible Innovation Team flagged face recognition as an area of concern in Google's foundational document on algorithmic unfairness. In 2018, face recognition was the second product we brought through our then-fledgling AI Principles product development reviews, in which we leaned on the foundational Gender Shades research by Joy Buolamwini and Timnit Gebru to inform our
evaluation. The conclusion the team came to was that the risk of a general-purpose Face Recognition offering could break our commitments to our AI Principles, certainly not in the state computer vision was at the time," recounts Pizzo Frey. This review preceded the publication of Google's AI Principles but aligned with them. Google's AI Principles were published in June 2018 after a year-long effort by a cross-functional group to define them. Their publication coincided with criticism of Google's collaboration with the US military on analyzing drone footage. Through the AI Principles, Google commits to seven objectives and names four application areas where it will not design or deploy AI, committing to develop AI in a socially beneficial way by working on projects "where the overall likely benefits substantially exceed the foreseeable risks and downsides."
As part of the decision not to make a general-purpose Face Recognition API available, Cloud's Responsible AI team proposed a project to research whether a narrowly scoped Celebrity Recognition offering might meet some of the customer needs while also ensuring alignment with the AI Principles. They spent the following two years researching and developing a very well-scoped Celebrity Recognition API, working side by side with relevant experts across Google's AI Principles ecosystem, the perception team in Google AI, and the nonprofit organization Business for Social Responsibility, which aided Cloud AI's human rights due diligence with a Human Rights Impact Assessment; an internal fairness assessment followed once all this was in place. By the time Google launched, Amazon was already offering celebrity detection as part of its Rekognition computer vision service. Amazon only allows customers to use celebrity detection when a known celebrity is expected to appear in an image or a video, but it doesn't require those customers to pass a review. Google's Celebrity Recognition product is available only to media & entertainment companies or approved partners, and only on professionally produced media content. Customers can't currently add people to the list, even for private use, and celebrities who don't wish to be recognized can opt out.
As simple as the Celebrity Recognition API may seem, the final fairness assessments against test datasets held more learnings around race and gender for the team. First, the team found that people with darker skin tones were being misidentified at higher rates. Because the team was working with a narrow gallery, it had the opportunity to investigate further why the errors were occurring. The team's first step was to go back to the Gender Shades study to make sure they were using the most accurate skin tone labels, updating to the Fitzpatrick scale, which resulted in a 25% improvement in accuracy and closed the gap entirely for women with medium skin tones. However, they still saw a concerning gap for dark skin tones among both women and men. Continued analyses showed that most of the remaining misidentifications involved men with dark skin tones, with a small number of male actors being misidentified 100% of the time. They were able to correct this, allowing them to launch the product. This past June, in an effort independent of the Celebrity Recognition product, Google announced that it has been working on a more inclusive skin tone classifier to further improve its products and decrease bias.
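The kind of disaggregated evaluation described above, in the spirit of Gender Shades, can be sketched in a few lines. The records below are entirely hypothetical and do not reflect Google's actual assessment data; the point is only to show how breaking error rates down by skin tone and gender surfaces gaps that a single aggregate accuracy number hides.

```python
from collections import defaultdict

# Hypothetical evaluation records: (skin_tone, gender, correctly_identified).
records = [
    ("light", "female", True), ("light", "female", True),
    ("light", "male", True), ("light", "male", True),
    ("medium", "female", True), ("medium", "female", False),
    ("medium", "male", True), ("medium", "male", True),
    ("dark", "female", True), ("dark", "female", False),
    ("dark", "male", False), ("dark", "male", False),
]

def misidentification_rates(records):
    """Misidentification rate per (skin tone, gender) subgroup."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for tone, gender, correct in records:
        totals[(tone, gender)] += 1
        if not correct:
            errors[(tone, gender)] += 1
    return {group: errors[group] / totals[group] for group in totals}

overall = sum(not r[2] for r in records) / len(records)
by_group = misidentification_rates(records)
print(f"overall error rate: {overall:.2f}")
for group, rate in sorted(by_group.items()):
    print(group, f"{rate:.2f}")
```

In this toy data the overall error rate looks moderate, but the subgroup view shows the errors concentrated on men with dark skin tones, which is exactly the pattern a per-group breakdown is meant to expose.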
In a separate review in 2019, stemming from an internal escalation, the team assessed harms relating to gender labels in the Cloud Vision API. "In our assessment, we realized that gender cannot be inferred based on physical appearance, either by machine learning or by humans, and we had not previously recognized the harm this was creating. With gender identity, expression and fluidity having gained much more acceptance across our culture, this escalation helped us to see what we had missed," says Pizzo Frey. Because of that, Google Cloud removed gender labels from the API. So, when you now use the Cloud Vision API, you will not get "male" or "female" labels; instead, "person" is the label returned. Google had to work through the change with some customers and help them understand their options. There are, of course, cases in which gender labels are important, such as the work Google did with the Geena Davis Institute to assess the imbalance across genders in the media.
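A migration like the one customers worked through might look something like the sketch below. The label payloads and field names here are illustrative only, not the actual Cloud Vision API response schema: the idea is that downstream code that previously branched on gendered labels can normalize them to the neutral "Person" label and then work identically against old and new responses.

```python
# Hypothetical, simplified label payloads of the kind a vision API might
# return before and after gender labels were removed.
before = [{"description": "Male", "score": 0.91},
          {"description": "Person", "score": 0.88}]
after = [{"description": "Person", "score": 0.93}]

GENDERED = {"Male", "Female"}

def normalize_labels(labels):
    """Map legacy gendered labels to the neutral "Person" label so the
    same downstream code handles both old and new responses."""
    return [{**label, "description": "Person"}
            if label["description"] in GENDERED else dict(label)
            for label in labels]

def count_people(labels):
    """Example downstream use: count person detections without
    referencing gender at all."""
    return sum(1 for label in normalize_labels(labels)
               if label["description"] == "Person")
```

With this shim, `count_people(before)` and `count_people(after)` both count detected people, and no downstream logic depends on a gender inference the API no longer makes.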
Guaranteeing fairness in AI is not easy, but being intentional about the standard a company sets for itself is a good start. Pizzo Frey warns, though, that focusing solely on tooling to guarantee fairness is the wrong approach: "Tooling is what allows us to check and measure and know whether or not we are successful in our goals around fairness, ensuring that we're not propagating unfair bias. Yet, if you don't have a definition of fairness for a particular product, which is different every time, then you measure things that might not help you achieve fairness in practice."
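Pizzo Frey's point that measurement without a definition can mislead is easy to make concrete. In the hypothetical sketch below, the same classifier outcomes satisfy one common fairness definition (demographic parity: both groups receive positive predictions at the same rate) while badly failing another (equal accuracy across groups). The data is invented purely for illustration; which definition matters depends on the product.

```python
# Hypothetical outcomes for two groups: (true label, predicted label).
group_a = [(1, 1), (1, 1), (0, 0), (0, 0)]
group_b = [(1, 0), (0, 1), (1, 0), (0, 1)]

def positive_rate(pairs):
    """Fraction of items the model predicts positive."""
    return sum(pred for _, pred in pairs) / len(pairs)

def accuracy(pairs):
    """Fraction of items the model predicts correctly."""
    return sum(true == pred for true, pred in pairs) / len(pairs)

# Demographic parity: both groups get positive predictions equally often.
parity_gap = abs(positive_rate(group_a) - positive_rate(group_b))  # 0.0
# Equal accuracy: the model is right equally often for both groups.
accuracy_gap = abs(accuracy(group_a) - accuracy(group_b))          # 1.0
```

A tool that only reported the parity gap would declare this model fair, while every prediction for group B is wrong, which is why choosing the definition has to come before running the tooling.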