Smart Interfaces for Human-Centered AI
Imagine for a moment that you’re in an office, hard at work.
But it’s no ordinary office. By observing cues like your posture, tone of voice, and breathing patterns, it can sense your mood and tailor the lighting and sound accordingly. Through gradual ambient shifts, the space around you can take the edge off when you’re stressed, or boost your creativity when you hit a lull. Imagine further that you’re a designer, using tools with equally perceptive abilities: at each step in the process, they riff on your ideas based on their knowledge of your own creative persona, contrasted with features from the best work of others.
This isn’t a vision of the distant future, but a glimpse of the true potential of AI. So, while we should carefully consider the risk of intelligent technology replacing human capabilities, it’s important to realize that it has enormous potential to augment them as well: it can boost the creativity of our work, help us be more engaged learners, deliver healthcare more effectively, make our societies more sustainable, and better inform our decisions.
As with any tool, however, our relationship with AI has as much to do with its interface as it does with the underlying capabilities it provides. Does it amplify our actions and remain attentive to our goals, even as we revise them, or is it a black box that accomplishes tasks autonomously? If we want to build a future of open possibility and empowerment, it’s vital that our ability to harness AI evolves alongside AI itself.
This isn’t the first time we’ve faced such a crossroads. In the mid-20th century, mainframe computers were getting faster and more powerful, and played a growing role in the business processes of the time. But the user experience was rigid and strictly transactional; users would write and submit code, wait for the results, determine if the program computed what they expected, and then repeat the process. All of that changed in 1968, when Douglas Engelbart of the Stanford Research Institute presented a novel interface based around rudimentary but responsive text and visuals, all controlled with a hand-carved wooden prototype mouse. This eventually inspired the first graphical user interface (GUI) at Xerox PARC, which extended Engelbart’s ideas with icons and more elaborate graphics. By the 1980s, the GUI was reaching a mainstream audience thanks to the Apple Macintosh and Microsoft Windows, and the rest is history.
However, while it’s tempting to remember Engelbart’s work simply as a landmark in the development of the user interface, his ultimate vision was more profound. He imagined a world in which powerful tools were accessible and user-centric, and would help a rapidly changing society adapt to the complexities of the modern world. In other words, Engelbart was less interested in computers themselves than in the people they could empower; his goal was augmenting our intelligence.
Today, AI has the potential to bring that vision closer to reality than ever before, but only if we can bring about a similarly ambitious paradigm shift in how we interact with it. Modern computing systems are often starkly binary, presenting either manual tasks that rely on the user for every decision, or full automation that shuts the user out entirely. A more balanced approach might complement the power of automation with a natural, communicative interface that feels more like a conversation partner.
Consider the abilities of home assistants like Amazon’s Alexa; they may be controlled by voice, but their understanding is limited to isolated utterances that trigger predefined tasks like playing music or ordering products. By contrast, human conversation is ongoing, deeply contextual, and inherently multi-modal: we speak aloud, gesture with our hands and refer to our surroundings, and even tap, click and type in parallel.
For example, Alexa might make it easy to order a new throw rug given clear instructions to do so. But imagine a more spontaneous encounter that begins by casually pointing at a rug across the room and asking, “Alexa, how much would it cost me to buy another rug like that?” Handling such natural human interactions will require advanced algorithms that can make sense of what the user is trying to communicate, maintain an ongoing conversation, and complete the user’s desired tasks as needed.
This will be a complex challenge, relying on expertise from psychology, linguistics and natural language processing, communications, and human-computer interaction, alongside many other fields. And it’s why the multidisciplinary nature of Stanford HAI—a hub for dialogue and collaboration between experts of all kinds—is the ideal environment for pursuing it. A number of research challenges lie before us:
- AI-based systems will need an entirely new vocabulary of metaphors to make understanding and using these new systems easier, just as the desktop, windows, and icons did for the GUI. Although tempting, it’s not enough to simply graft a layer of intelligence onto today’s computing experiences; we need to invest in original, unconstrained thinking to make the most of the opportunity before us, and to ensure its impact on human users is a fundamentally beneficial one.
- Next, we’ll need design patterns to help designers generate interfaces that allow people to successfully interact with computers in multiple modalities (voice, gestures, and touch, for example) at any time, across multiple devices. Over time, as more and more ideas are deployed, evaluated and refined across a range of applications, we’ll converge on design patterns that help ensure our experience with AI is consistent and enriching.
- Finally, knowing how and when to deploy such patterns is equally important, which is why a thoughtfully constructed set of design guidelines will establish boundaries and etiquette. Additionally, we’ll need to develop heuristics that can formally evaluate the effectiveness of an intelligent user interface and, where necessary, highlight areas to improve.
Computing was forever changed by the emergence of the graphical user interface. It made a transformative but esoteric technology accessible for the first time, and gave an entire world of users—from artists to journalists to medical researchers and beyond—a role for themselves to play. Today, we stand at the brink of a similar challenge as AI transitions from an enterprise-level investment to an everyday experience. Considerable challenges lie ahead, but success will mean realizing the potential for AI to improve so much of what we do, not by doing it for us, but by helping us do it better ourselves.