
The Shibboleth Rule for Artificial Agents

Bots could one day dispense medical advice, teach our children, or call to collect debt. How can we avoid being deceived by actors with bad intentions? 

Image: A young girl in a protective mask stands on a metro station platform, looking at a smartphone.

Do you know who you're interacting with online? Bots may one day be indistinguishable from humans, which could make us targets for deception. | Eugene Nekrasov

Imagine you are on the phone with an imperious and unpleasant debt collector. She knows everything about your financial history, but has no sympathy for your situation. Despite your increasingly frantic complaints, she simply provides a menu of unattractive repayment options. Would it temper your anger to know conclusively that she was not a hostile human being, but simply a bot tasked with providing you with a set of fixed options? Would you want the power to find out whether she was human?

We are quickly moving into an era in which artificial agents capable of sophisticated communication will be everywhere: collecting debts, dispensing advice, and enticing us to make particular choices. As a result, it will be increasingly difficult to distinguish humans from AIs in conversation or written exchanges. We are concerned that this constitutes a major change in social life, and presents a serious threat to fundamental aspects of our civil society. 

To help preserve trust and promote accountability, we propose the shibboleth rule for artificial agents: All autonomous AIs must identify themselves as such if asked to do so by any agent (human or otherwise).

The Case of Google Duplex

In May 2018, Google made headlines with demos of Google Duplex, a virtual assistant that can conduct realistic conversations over the phone in a handful of everyday scenarios like scheduling appointments. One striking feature of Duplex is its use of filled pauses (for example, “um” and “uh”). Google described this as part of “sounding natural.” Others interpreted it as “human impersonation.” In tests conducted by the New York Times in 2019, the deception seemed to run deeper: Duplex claimed Irish heritage and explicitly denied being a robot. The Verge reported on similar tests and found that people were routinely tricked into thinking Duplex was a human.

Read related: When Artificial Agents Lie, Defame, and Defraud, Who Is to Blame?


When Duplex debuted, it was immediately met with concerns about how it might reshape our society. Three years later, Duplex is available in numerous countries and 49 U.S. states, and it reportedly functions more autonomously than ever before.

And Google isn’t the only company experimenting with ever more realistic bots to interact with customers. In the near future, you will wonder: Am I getting medical advice from a physician or a bot? Is my child’s online classroom staffed with teachers or AIs? Is this my colleague on the call or a simulation? As AIs become more adept, they will become irresistible to numerous organizations as a way to provide consistent, controlled experiences to people at very low cost.

It is now abundantly clear that sustained, coherent conversation from an AI does not imply that it has any deep understanding of the human experience, or even sensible constraints to avoid troubling behavior. This combination of traits could make AI agents extremely problematic social actors. The history of the Turing test shows that humans are not able to reliably distinguish humans from AIs even when specifically tasked with doing that, and even when the AIs are not especially sophisticated. More recent research with a top-performing language model (GPT-3) suggests that people can’t distinguish model-generated text from human-written text without special training. 

When people are not specifically tasked with looking for a bot, detecting one may be even harder. One of the lessons of cognitive science is that even from infancy, humans are expert at attributing agency – the sense that something is able to act intentionally on the world – and do so pervasively, recognizing and naming mountains, trees, and storms (as well as cars and computers) as agents. So perhaps we are especially easy targets for deception by AI agents.

However, humans tend to be more adept at reorienting themselves once they know definitively whether an agent is a human or AI. Our AI shibboleth rule would ensure that they could always obtain this vital information.

Our Modest Proposal

Our proposed shibboleth rule is simple:

Any artificial agent that functions autonomously should be required to produce, on demand, an AI shibboleth: a cryptographic token that unambiguously identifies it as an artificial agent, encodes a product identifier and, where the agent can learn and adapt to its environment, an ownership and training history fingerprint.
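To make the proposal concrete, here is a minimal sketch of what producing and checking such a token could look like, assuming the agent's vendor holds an Ed25519 signing key and using the Python cryptography package. The field names (product_id, owner_id, training_fingerprint) are our own illustrative choices, not part of any existing standard.

import json
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def make_shibboleth(private_key, product_id, owner_id, training_log: bytes) -> dict:
    """Build a signed token identifying the bearer as an artificial agent."""
    payload = {
        "agent_type": "artificial",            # unambiguous self-identification
        "product_id": product_id,              # which product or model family
        "owner_id": owner_id,                  # the currently responsible party
        # Digest of the agent's training/adaptation history, so an agent that has
        # adapted to its environment can still be traced back to its provenance.
        "training_fingerprint": hashlib.sha256(training_log).hexdigest(),
    }
    message = json.dumps(payload, sort_keys=True).encode()
    return {"payload": payload, "signature": private_key.sign(message).hex()}

def verify_shibboleth(public_key, token: dict) -> bool:
    """Check that the token was signed by the claimed vendor's key."""
    message = json.dumps(token["payload"], sort_keys=True).encode()
    try:
        public_key.verify(bytes.fromhex(token["signature"]), message)
        return True
    except InvalidSignature:
        return False

# Example: a vendor signs a token; a skeptical caller verifies it on demand.
vendor_key = Ed25519PrivateKey.generate()
token = make_shibboleth(vendor_key, "hypothetical-assistant-v1", "ExampleCorp", b"training run 2021-07-01")
assert verify_shibboleth(vendor_key.public_key(), token)

In practice, of course, the verification key would have to come from some trusted registry rather than from the agent itself; the sketch only illustrates the shape of the token.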

Read related: A Moderate Proposal for Radically Better AI-powered Web Search


A very similar proposal was made by Toby Walsh (2016), with a rule he called “Turing’s Red Flag.” Rules of this kind are already on their way to becoming a reality. For example, in July 2019, California became the first U.S. state to put into effect legislation making it unlawful to use a bot to intentionally mislead someone online in order to incentivize a purchase or influence a vote. For this legislation to have real, enforceable consequences, there must be a specific, actionable test to prevent deception: the shibboleth.

Further, the shibboleth must encode information about the provenance of the agent and its history of ownership and usage. This information provides a potential solution for concerns about tracking and attributing responsibility to agents, especially those that adapt to their environments and thus begin to behave in ways that are unique and hard to predict. 
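One way to picture such a provenance record, offered purely as an illustration: fold an append-only log of ownership and training events into a single hash-chain fingerprint, so the value carried in the token commits to the agent's entire history and changes if any past event is altered or reordered. The event strings below are hypothetical.

import hashlib

def chain_fingerprint(events: list) -> str:
    """Fold an ordered list of provenance events into a single hex fingerprint."""
    digest = b"\x00" * 32                     # fixed genesis value
    for event in events:
        digest = hashlib.sha256(digest + event.encode()).digest()
    return digest.hex()

history = [
    "manufactured:ExampleCorp:2021-01-15",
    "base-model:large-language-model-v2",
    "fine-tuned:customer-support-corpus:2021-03-02",
    "ownership-transfer:ExampleBank:2021-06-10",
]
print(chain_fingerprint(history))             # tampering with any past event changes the result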

Questions and (Unintended) Consequences

Our primary goal for the shibboleth rule is simply to avoid ambiguous, frustrating, and potentially deceptive situations that make it even harder for people to navigate difficult interactions or that allow unscrupulous actors to engage in troubling practices. Yet we expect such a rule to have wide-ranging consequences. For example, it would likely create societal pressure to keep AIs out of specific roles, even if it is technically lawful for them to be in those roles – users would have the information necessary to complain. Perhaps “fully human” agents could even become a verifiable marker of a deluxe customer service experience. On the other hand, we may discover scenarios in which people increasingly prefer AIs, who can perhaps be tirelessly polite and attentive. We also expect the shibboleth rule to help us grapple with challenging issues of agency and intention for artificial agents, especially those that can adapt to their environments in complex ways.

Beyond these economic effects, though, our proposal opens up further questions about human–AI interactions. Are there cases where it is ethical to avoid revealing that an interacting agent is an AI; for example, when the AI agent is serving as a crisis counselor whose efficacy critically depends on being thought to be human? Conversely, are there situations in which AI agents should preemptively identify themselves as such?

What counts as an agent? What counts as autonomous? The shibboleth rule might force decisions about complex cases. Although Duplex is clear-cut, many hybrid systems will soon present difficult boundary cases, as when customer service agents manage chatbots that use GPT successors to generate seamless prose from terse suggestions. More generally, the boundaries of agency will continue to be an important area for researchers and for the law. Would an artificial biological system count as an artificial agent for our purposes? What about a human with extensive neural implants? Cognitive scientists have debated the boundaries of intelligence for years, and this body of theory may see new life as we decide whether thermostats with voice recognition have to identify themselves as autonomous agents.

Implementations of the shibboleth rule will also have to grapple with other human consequences. Perhaps human callers will pretend to be bots to avoid culpability of some sort. How shall we prevent them, either legally or practically, from fraudulently producing shibboleth tokens? 

Finally, the shibboleth rule may have consequences for some of the already existing applications of bots in non-interactive broadcasting environments, including posting on social media. Bots disproportionately contribute to Twitter conversations on controversial political and public health matters, with humans largely unable to distinguish bot from human accounts. Might the shibboleth rule be a way to curb the viral spread of misinformation?

An Ongoing Conversation

In the example of Google Duplex, we can begin to discern the potential value of our shibboleth rule, and that value is only going to increase as we see more agents like Duplex deployed out in the world. However, the full consequences of such a rule are hard to predict, and implementing it correctly would pose significant challenges as well. Thus, we are offering it in the spirit of trying to stimulate discussion among technologists, lawmakers, business leaders, and private citizens, in hope that such discussion can help us grapple with the societal changes that conversational AIs are rapidly bringing about. What makes little sense to us is to ignore the pace of innovation in the capabilities of artificial agents or the need for at least some clear rules to curb the downside risks of a world filled with these agents. 

Authors: Christopher Potts is professor and chair of the Stanford Humanities & Sciences Department of Linguistics and a professor, by courtesy, of computer science. Mariano-Florentino Cuéllar is the Herman Phleger Visiting Professor of Law at Stanford Law School and serves on the Supreme Court of California. Judith Degen is an assistant professor of linguistics. Michael C. Frank is the David and Lucile Packard Foundation Professor in Human Biology and an associate professor of psychology and, by courtesy, of linguistics. Noah D. Goodman is an associate professor of psychology and of computer science and, by courtesy, of linguistics. Thomas Icard is an assistant professor of philosophy and, by courtesy, of computer science. Dorsa Sadigh is an assistant professor of computer science and of electrical engineering.

The authors are the Principal Investigators on the Stanford HAI Hoffman–Yee project “Toward grounded, adaptive communication agents.” Learn more about our Hoffman–Yee grant winners here.
