How Do Governments Track and Understand AI? | Stanford HAI
September 28, 2020

Researchers discuss the obstacles to measuring AI’s impact.

Artificial intelligence is enabling spectacular advances in fields from medicine to robotics, but it also generates worry about job losses, privacy, fairness and human accountability.

Small wonder that governments worldwide are fixated on policies to both stay competitive and head off dangers. In June alone, for example, U.S. lawmakers introduced seven separate AI bills.

But do policymakers and the public have accurate data? How do we even define AI, much less measure “progress” or competitiveness? Do we have any agreed-upon metrics about benefits and risks?

Those questions were the focus of a recent workshop convened by Stanford HAI and Stanford’s AI Index.

The AI Index may be the world’s most comprehensive public source of data on AI activity, investment and impact. Yet the message from this conference was just how hard it still is to know what’s going on.

We sat down with three of the AI Index’s creators – Saurabh Mishra of Stanford, Ray Perrault of SRI International and Jack Clark of OpenAI – to better understand the challenges.

Why is it hard to measure the progress and impact of AI, and why should we worry?

Perrault: Policymakers want to know what’s happening, but they need good information. People who want more government funding may have incentives to present one set of numbers to warn that we’re underinvesting in AI, while others might present a different set of numbers to claim we’re heavily funding AI and having great impact. So this requires careful thought about what we’re measuring.

The problem is that it’s difficult to put a boundary around what we mean by “artificial intelligence.” AI has borrowed ideas from many disciplines over time, including logic, linguistics and psychology. Machine learning draws many of its foundations from statistics and optimization, and today is being applied to a broad range of other fields, from bioinformatics to finance. Many of those advances are coming not from AI researchers but from people in the applying disciplines.

There’s nothing wrong with that – it’s progress. But it does raise challenges about how to measure investment and advances in artificial intelligence.

For example, should we think that an investment in self-driving cars is all about AI? AI is certainly important, but you can’t give it credit for the whole field. When you actually build a self-driving car, AI is a pretty small share of the total cost. Most applications of AI are driven by a mix of technologies.

How good are we at measuring performance?

Perrault: Strictly from a technical standpoint, there are different metrics of AI performance. These include the amount of data it takes to train a system, but also the amount of computation required and how well a model performs with real-world data that’s different from what it was trained on.

Speech recognition is much more practical now because it’s possible to collect vast numbers of speech samples for the systems to train on. The more data and the more computing power you can throw at the job, the better the results will be.

But accurate results are not the only measure of performance. Another might be: Is there a way to get the same job done, but with less data and computing power? Increasingly, authors of papers about new AI models indicate how much computing was necessary to get their results.

Clark: One way of cutting through the hype is by having better and more standardized metrics for what you want to achieve.

Imagine if car companies hadn’t standardized horsepower. One company could claim its engine had 100,000 “foxpower,” while another claimed its engine had 700,000 “flypower.” That’s a little like where AI is today. It can be very challenging to compare the performance of a system from task to task, or to compare different systems on the same task, because different standards are used to evaluate them.

You can have a system that’s useful but will use enough energy to boil the ocean, or you can have a system that’s just kind of useful but runs on a triple-A battery. You need to talk about those systems in the same universe.

Mishra: Another metric of progress is in avoiding bias. We know that facial recognition systems are more accurate with some racial groups than others. The National Institute of Standards and Technology has developed a set of systemic evaluation methods to compare the bias of competing facial recognition systems, and it has published reports showing that every system has problems. But those kinds of in-depth standardized measurements and evaluations are still rare in other domains impacted by AI.

Clark: But if you have a single metric, a single performance score, you’re likely to get something wrong. Let’s say you want to measure the bias of facial recognition systems, but the measure is actually a blend of how a system performs for different social or racial groups. What happens if a system is reasonably good overall but weirdly bad at recognizing one particular group?
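Clark’s point can be made concrete with a toy calculation. The sketch below uses entirely made-up numbers (they are not from the article, NIST, or any real system) to show how a blended accuracy score can look healthy while one group fares much worse:

```python
# Illustrative only: invented per-sample results for three hypothetical groups,
# showing how an aggregate accuracy score can mask poor per-group performance.
from collections import defaultdict

# Each entry is (group label, whether the system's prediction was correct).
results = (
    [("group_a", True)] * 95 + [("group_a", False)] * 5   # 95% on group A
    + [("group_b", True)] * 96 + [("group_b", False)] * 4  # 96% on group B
    + [("group_c", True)] * 12 + [("group_c", False)] * 8  # 60% on group C
)

# Single blended metric: overall accuracy across all samples.
overall = sum(ok for _, ok in results) / len(results)
print(f"overall accuracy: {overall:.1%}")  # ~92% -- looks reasonably good

# Disaggregated metric: accuracy computed separately per group.
per_group = defaultdict(list)
for group, ok in results:
    per_group[group].append(ok)

for group, oks in sorted(per_group.items()):
    print(f"{group}: {sum(oks) / len(oks):.1%}")  # group_c comes out at 60%
```

Because group C contributes few samples, its 60% accuracy barely moves the blended number; only the disaggregated view reveals the problem.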

How good are we at measuring the social and economic impact of AI?

Perrault: It’s a challenge. In spite of all the technological advances in AI, for example, productivity growth has been lagging – even in the West. Part of the answer is that not all the uses of AI generate economic consequences. I can ask a question to my phone and get an answer, but how much economic impact does that have? You can ask many more questions during the day than you could before, but are you more productive than when you had to go look up the information yourself? And how much is the AI contributing? Google says it’s an AI company, but no one really knows how much of its revenue comes from AI.

Mishra: To put this into a global perspective, we need to think about distributional consequences and inequality. We need to study these trends in terms of the impact on developing countries. We don’t have much clarity about which nations, which domains and which organizations are deploying AI. Who has access to which data? Who has access to the computing power? Data about developing countries are especially scarce.

Stanford HAI's mission is to advance AI research, education, policy and practice to improve the human condition. Learn more. 

Contributor: Edmund L. Andrews