The Stanford Open Virtual Assistant Lab (OVAL)

Date: September 03, 2019
Topics: Design, Human-Computer Interaction; Natural Language Processing

Intro

Over the last two decades, the role of the web browser has evolved in a profound way. What was once a piece of software all its own, not unlike a word processor or photo editor, is now the backdrop to much of our online lives, whether we're banking, collaborating with coworkers, or playing video games. It's the web itself that changed the world—the browser is merely our gateway to it.

Likewise, it's tempting to treat today's virtual assistants like isolated gadgets: novelties that let us order shoes or skip to the next song without getting up from the couch. Like web browsers, however, their true value comes from the world they connect us to: a linguistic web, where thousands of capabilities are accessible through natural language. It's a fundamentally new way to interact with our technology, and we've only just begun to understand the role it can play in our lives.

Unfortunately, the spirit of openness that characterized the early web is absent in today's virtual assistants—a concerning thought given the rapidly expanding reach and power of these devices. That's why we're founding the Stanford Open Virtual Assistant Lab, or OVAL. It's a worldwide open-source initiative intended to confront what we believe are the three major challenges facing the future of this technology: avoiding fragmentation of the linguistic web, democratizing the power of natural language interfaces, and putting privacy back in the hands of consumers.

Challenge #1: Creating a single, shared linguistic web

Imagine if every web browser connected to its own proprietary version of the internet, complete with its own formats and protocols. Consumers would face inconsistent access to online content, and the task of creating a website—let alone maintaining it—would be orders of magnitude more complex.

This is the reality of today's virtual assistants. Platforms like Amazon's Alexa and Google Assistant may be open to third parties, but their proprietary nature means nothing created on one can be accessed by the others. As a result, they connect their users to a linguistic web, not the linguistic web. And the landscape grows more fractured by the day.

At OVAL, we're building an alternative. It's called Thingpedia, and it uses open-world collaboration to collect every task, feature, and data source a virtual assistant could want in a non-proprietary format. Its rapidly growing capabilities already include access to content from outlets like the New York Times and apps like Spotify, interfaces for online accounts like Dropbox and Twitter, and integration with devices ranging from your Fitbit to your Nest thermostat.

Thingpedia means virtual assistants of all kinds can connect their users to the same shared world. It encourages competition by sparing upstart virtual assistant developers the burden of reinventing the wheel (or rather, tens of thousands of wheels) simply to catch up with incumbents. It lets consumers comparison shop without worrying about whether a particular function will be accessible to the assistant that suits them best.

Best of all, because Thingpedia's skill representation includes all the information expected by the Alexa and Google Assistant skill platforms, Thingpedia skills can be automatically added to both without additional work. This dramatically eases development for third parties relying on voice assistants to reach their users in new ways—including thousands of startups and small businesses. Rather than juggle multiple platforms, they can focus their development on Thingpedia while maintaining the largest possible audience.
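To make the idea of a shared, non-proprietary skill description concrete, here is a minimal illustrative sketch in Python. The Skill and Capability classes and their field names are hypothetical, invented for this post; they are not Thingpedia's actual schema or the ThingTalk language. The point is simply that one machine-readable description can carry everything any assistant needs to expose a capability.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: these classes and field names are assumptions
# for this post, not Thingpedia's real schema or the ThingTalk language.

@dataclass
class Capability:
    name: str                                        # e.g. "get_headlines"
    kind: str                                        # "query" (returns data) or "action" (side effects)
    params: dict = field(default_factory=dict)       # parameter name -> type
    utterances: list = field(default_factory=list)   # example phrasings for training

@dataclass
class Skill:
    service: str          # the service or device this skill wraps
    capabilities: list    # the queries and actions it exposes

# A hypothetical entry for a news outlet: any assistant that reads this
# shared description could offer the same function to its users.
nyt = Skill(
    service="com.nytimes",
    capabilities=[
        Capability(
            name="get_headlines",
            kind="query",
            params={"section": "string"},
            utterances=[
                "read me the latest headlines",
                "what is in the {section} section today",
            ],
        )
    ],
)

print(f"{nyt.service} exposes {len(nyt.capabilities)} capability")
```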
Challenge #2: Realizing—and democratizing—the full potential of a virtual assistant

Organizing all these features in one place is a great start, but it's only a first step. What about the underlying technology that allows users to trigger them?

Today's virtual assistants are based on neural networks capable of transcribing the human voice and intelligently interpreting the results. The accuracy of such networks requires a significant amount of training data, typically acquired through manual annotation of real data by a large workforce.

However, while the tech giants have made some truly incredible progress, we believe the linguistic user interfaces, or LUIs, of tomorrow will simply be too complex—and too fundamental to the future of computing—to leave their destiny in private hands. That's why we're building LUInet, an open-source neural network that provides an alternative to the capabilities at the heart of today's commercial assistants. Additionally, we've developed an innovative tool called Genie that helps domain experts create natural language interfaces for their products at a greatly reduced cost and without in-house machine learning expertise. By empowering independent developers and by collecting their contributions from different domains, LUInet is positioned to surpass even the most advanced proprietary model developed by a single company.

LUInet's sophistication, however, is best exemplified by its ability to understand never-before-heard sentences that combine functions from different domains. While most virtual assistants are limited to narrow, transactional commands like "skip to the next song" or "open the garage door", LUInet is built to understand the flexible logic we use in everyday conversation, like "send me a text notification whenever I get an email from work with a PDF attachment over ten megabytes." With a single phrase, entire problems can be solved. This is made possible by having LUInet directly translate natural language into programs.
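As a rough illustration of what "translating natural language into programs" means, the sketch below hand-writes the kind of program a semantic parser could produce for the email example above: a trigger (a new email arrives), a filter (from work, PDF attachment, over ten megabytes), and an action (send a text). The Python names and structure are assumptions made for this illustration, not LUInet's actual output format.

```python
from dataclasses import dataclass

# Hand-written stand-in for the program a semantic parser might emit for:
#   "send me a text notification whenever I get an email from work
#    with a PDF attachment over ten megabytes"
# Class and field names are illustrative assumptions, not OVAL's format.

@dataclass
class Email:
    sender_domain: str
    attachment_type: str
    attachment_mb: float
    subject: str

def is_match(email: Email) -> bool:
    """Filter: from work, PDF attachment, larger than 10 MB."""
    return (email.sender_domain == "work.example.com"
            and email.attachment_type == "pdf"
            and email.attachment_mb > 10)

def send_text(message: str) -> None:
    """Action: stand-in for sending a text notification."""
    print(f"[SMS] {message}")

def on_new_email(email: Email) -> None:
    """Trigger handler: runs whenever a new email arrives."""
    if is_match(email):
        send_text(f"Large PDF from work: {email.subject}")

# Simulate the trigger firing twice; only the second email passes the filter.
on_new_email(Email("news.example.org", "png", 0.2, "Newsletter"))
on_new_email(Email("work.example.com", "pdf", 12.5, "Q3 report"))
```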
Challenge #3: Putting privacy back in the hands of the consumer

Finally, the proprietary nature of today's virtual assistants means their creators have total control over the data passing through them. That includes personal information, preferences, and behavior, as well as hours upon hours of voice recordings.

Of course, there's nothing inherently wrong with this. In fact, to the extent that monetizing such data can lower the price of hardware and make many services completely free, some consumers may be happy to make the trade. But what about those who aren't? And what about information sensitive enough to warrant special legal protections, such as our finances, health, and education? It's not that any one privacy policy is right or wrong; rather, our intuitions are evolving into a spectrum, and today's market caters only to a single extreme.

OVAL is changing that with Almond, a complete virtual assistant with a unique focus on privacy and transparency. Not only can it access every function in Thingpedia and interpret complex commands thanks to LUInet, but it was built from the ground up with privacy-preserving measures that let you explicitly control if, when, and how data is shared. For example, a user can tell her Almond assistant, running on her own device, that "my father can see motion on my security device, but only if I am not home." No third party sees any of the shared data.

Almond provides a model for the flexibility we should expect from tomorrow's assistants. For those willing to share their data with advertisers, cloud-based assistants available at little or no cost make perfect sense. For power users with an eye on privacy, locally run solutions can ensure personal data never leaves the device. Countless variations exist between these extremes. And globally, assistants hosted by different organizations should interoperate so data can be shared without centralization—just as email does today. The prescription, therefore, is simple: more choice. After all, it won't be long before our virtual assistants know us nearly as well as our human colleagues. Shouldn't we have a say in where that knowledge goes?

Conclusion

The breakout success of assistants from companies like Amazon, Google, and Apple is a testament to the power of natural language interfaces. But as big as these brands are, the potential of the linguistic web is even bigger. At OVAL, we envision a future in which this potential is accessible, interoperable, and above all, worthy of our trust.

How to get involved

Visit the open-source Almond project at https://almond.stanford.edu. We welcome all contributions, and all of our software is publicly available at https://github.com/stanford-oval. More information on the lab can be found at https://oval.cs.stanford.edu. And stay tuned for the first Open Virtual Assistant Workshop, to be held on October 30, 2019, as part of the Stanford HAI Conference.

Contributor(s): Monica Lam and Alex Varanese
