The new Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (EO) demonstrates that the Biden Administration is taking seriously its responsibility not only to foster a vibrant AI ecosystem but also to harness and govern AI. Its mandates to catalyze AI innovation, including strengthening AI talent through streamlined immigration and federal hiring and piloting the National AI Research Resource (NAIRR), alongside new safeguards against risk, reflect years of deliberation by commissions, task forces, and advisory committees. The order makes credible America’s refrain that it is committed to trustworthy AI—for the American people and the broader global community—by directing agencies to ensure safety and security, promote rights-respecting development and international collaboration, and protect against discrimination.
And the White House has acknowledged what agencies need to realize the administration’s AI policy agenda—empowered senior officials, staff with AI expertise, incentives to prioritize AI adoption, mechanisms to support and track implementation, specific guidance, and White House-level leadership. There is much to admire in this EO, but also much to do. By our count, over 50 federal entities are named and around 150 distinct requirements (meaning actions, reports, guidance, rules, and policies) must be implemented, many of which face aggressive deadlines within the calendar year.
As scholars from the Stanford Institute for Human-Centered AI (HAI), Stanford’s Regulation, Evaluation, and Governance Lab (RegLab), and the Center for Research on Foundation Models (CRFM), we provide four preliminary perspectives on the EO and what it means for the future of AI.
1. Foundation Models
The EO builds on a recent wave of White House initiatives on foundation models (very large, general-purpose AI models that power a wide range of downstream applications). These include a red-teaming exercise in August and voluntary commitments from tech companies in July and September. Specifically, the EO defines “dual-use foundation models” as AI models “trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters.” The White House’s efforts on foundation models accompany the simultaneous release of the G7 principles and code of conduct on generative AI and foundation models. And they join a broader array of global policymaking efforts on foundation models in the past 10 days, such as the UK AI Summit and the EU AI Act.
There are four important elements to note about the EO’s treatment of foundation models.
Thresholds. First, the EO mandates that companies report red-teaming results for foundation models above a given threshold of compute. The order also implicitly cautions against releasing the weights of models above that threshold (i.e., granting full external access to the model), requiring those with plans to train such models to report the “physical and cybersecurity protections” for the training process and model weights, along with information on who will retain ownership and possession of the weights. Concretely, the default thresholds for compliance cover dual-use foundation models trained (a) using at least 10^26 floating-point operations (a measure of computing cycles), or (b) using primarily biological sequence data and at least 10^23 floating-point operations. (For reference, Anderljung et al. estimate OpenAI’s GPT-4 to use 2.10 x 10^25 floating-point operations, just shy of this threshold.) The Secretary of Commerce is tasked with defining and updating these technical conditions. The EO does not make clear why compute is the default basis for scrutiny, or how these specific values were set, though the specific value of 10^26 appears in the writings of Anderljung et al. on Frontier AI Regulation. These thresholds appear to set the stage for two tiers of foundation models, which may see reuse in other US policy contexts and which resemble recent proposals for the EU AI Act. But it is not clear why heightened scrutiny should apply to foundation models based primarily on the greater investment of computing resources rather than on the greater demonstrated impact, or harm, in society. Models for toxic chemical compound discovery or AlphaFold-style protein structure prediction could potentially be used for far greater harm, yet can be trained with far less compute.
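To make these numbers concrete, training compute for a dense transformer is often approximated as 6 × parameters × training tokens. The sketch below is our illustration only: the 6ND heuristic and the hypothetical model size are assumptions on our part, not language from the EO; only the two threshold values come from the order.

```python
# Illustrative check of a hypothetical training run against the EO's
# default compute thresholds. The 6 * N * D approximation for dense
# transformer training FLOPs is a common heuristic, not part of the EO.

GENERAL_THRESHOLD = 1e26  # FLOPs, EO default for dual-use foundation models
BIO_THRESHOLD = 1e23      # FLOPs, EO default for models trained primarily
                          # on biological sequence data

def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute as 6 * parameters * training tokens."""
    return 6 * params * tokens

def crosses_threshold(flops: float, bio_data: bool = False) -> bool:
    """Would a run of this size meet the EO's default reporting threshold?"""
    return flops >= (BIO_THRESHOLD if bio_data else GENERAL_THRESHOLD)

# Hypothetical run: a 70B-parameter model trained on 2T tokens.
flops = training_flops(70e9, 2e12)          # 8.4e23 FLOPs
print(crosses_threshold(flops))             # below the 1e26 general threshold
print(crosses_threshold(flops, bio_data=True))  # above the 1e23 bio threshold
```

Note how coarse the instrument is: the same hypothetical run falls well under the general threshold yet far over the biological-data threshold, so everything turns on how Commerce characterizes the training data.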
Compute Monitoring. Second, the EO requires infrastructure-as-a-service providers (e.g., cloud providers like Amazon, Google, and Microsoft) to monitor and report “foreign persons” involved in, and potentially other information about, “any training run of an AI model meeting the [threshold] criteria” that may enable malicious cyber-enabled activity. On the one hand, tracking compute usage is more feasible than tracking all possible models, since cloud computing remains central to training foundation models. On the other hand, this provision raises significant questions about the potential for abuse. Can cloud service providers inspect training runs and data to ascertain whether a model enables malicious cyber-enabled activity or involves biological sequence data – i.e., potentially all models and data? How will the government receive such information? What are the implications for the data protection, privacy, and confidentiality obligations of cloud service providers? These concerns feed growing debates surrounding monitoring, such as Apple’s contentious plan to scan user content for child sexual abuse material (CSAM) or the earlier debate over a government backdoor to iPhone encryption. The emerging tension is that the Secretary of Commerce is required to define threshold criteria better tailored to risk, yet more extensive tailoring will raise greater concerns about abuse and surveillance of computing and data. FLOPS are easy to measure; substantive risk is not.
Content Provenance. Third, generative AI has ushered in a host of concerns around the provenance of content (i.e., whether it is human-generated or machine-generated), building on the rise of deepfakes and synthetic media over the past decade. The EO tasks Commerce with producing a report surveying techniques for content provenance, watermarking, and other detection approaches. Such research is vital, given that existing legislative proposals may be premature in mandating watermarking: watermarking methods are still nascent, especially for language models, and lack the required technical and institutional feasibility. We nonetheless believe action will be needed, as recent announcements in the UK and the growing concerns about AI-generated CSAM highlighted in the EO make clear. Otherwise, we risk regulating with standards that are technically infeasible or simply do not exist.
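To give a flavor of why language-model watermarking remains nascent, consider a toy sketch of the “green list” statistical detection idea proposed in recent research. This is our simplified illustration, not a deployed or mandated scheme: the keying function, the 50% green rate, and the z-score test are all assumptions for exposition.

```python
# Toy sketch of green-list watermark detection for text, loosely modeled
# on recent research proposals; real systems differ substantially.
import hashlib
import math

GREEN_FRACTION = 0.5  # expected green rate for unwatermarked text

def is_green(prev_token: str, token: str) -> bool:
    """Pseudorandomly assign ~half of all tokens to a 'green list' keyed
    on the previous token (a stand-in for a secret watermark key)."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 128

def green_z_score(tokens: list[str]) -> float:
    """z-score of the observed green count against the null hypothesis
    that the text is unwatermarked (each token green with GREEN_FRACTION)."""
    n = len(tokens) - 1  # number of (prev, current) token pairs
    greens = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    mean = n * GREEN_FRACTION
    var = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (greens - mean) / math.sqrt(var)

# Unwatermarked text should hover near z = 0; a watermarked generator
# that prefers green tokens pushes z upward, and a detector flags text
# whose z-score is statistically improbable under the null.
```

Even this toy version exposes the hard problems: short texts yield weak statistical evidence, paraphrasing disturbs the token pairs the detector relies on, and the scheme only works if the generator cooperated at sampling time.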
Open FMs. Fourth, policymakers are grappling with the matter of open foundation models, meaning models like Meta’s Llama 2 and Stability’s Stable Diffusion 2, including an inquiry by Senators Richard Blumenthal and Josh Hawley into Meta’s release of LLaMA. The EO draws a binary distinction, emphasizing risk “when the weights for a dual-use foundation model are widely available — such as when they are publicly posted on the Internet.” While a consensus definition for open foundation models does not yet exist, such a binary collapses the more complex gradient of foundation model releases (especially neglecting the release of training data). In fact, as a recent paper by one of us shows, the purported safeguards of more closed models can be stripped away quite easily, illustrating that the binary may not be well tailored to risk. Understanding release decisions, and specifically the benefits and risks of open model release, will be valuable (as some of us have noted before to the NTIA), especially given the significant benefits of improved transparency and distributed power with open approaches. The impact of the EO on the open foundation model ecosystem will only become clear after the mandated public consultation and report to the President.
While the EO reflects a clear step forward by the US government towards the governance of foundation models, it is a bit like the first rung in Jack’s beanstalk. At present, it is hard to discern where the beanstalk leads and what twists and turns lie in store. What is clear is that the EO does not make much headway on improving transparency in the foundation model ecosystem, even on the topic of evaluations in a broader sense than just red-teaming. This deviates from other efforts like the EU AI Act and G7 Code of Conduct that more directly prioritize transparency as the starting point for effective governance of foundation models.
2. Attracting AI Talent Through Immigration
The order also signifies a major push to draw much-needed technical talent to the United States by identifying paths to attract, recruit, and retain foreign AI talent. The order directs the Secretary of State and Secretary of Homeland Security to increase visa opportunities for “experts in AI or other critical and emerging technologies.” The order also directs the Secretary of Labor to publish a request for information to solicit public input to identify possible updates to the list of Schedule A occupations—a designation which can hasten green card approval for workers by allowing employers to bypass the labor certification process, providing a valuable tool to attract foreign talent. (The list of Schedule A occupations has not been meaningfully amended since 1991.)
This focus on immigration is spot-on—attracting AI talent is essential to America retaining its technological leadership. In 2021, nearly 40% of US doctoral recipients in science and engineering fields were temporary visa holders. Yet, as the announcement of the EO conceded, and as multiple task forces have acknowledged, there is still much for Congress to do.
Another subtlety stems from the perpetual definitional problem of AI regulation: what exactly is AI? In the immigration context, that challenge manifests in a unique way. How broadly should the administration conceive of AI experts or experts in “critical and emerging technologies” given the increasing number of interdisciplinary applications and uses of AI? Should they include, for instance, cybersecurity workers, project managers, data scientists (as the EO appears to posit in requirements for the Office of Personnel Management), and workers with sociotechnical expertise, all of whom are vital to the full AI ecosystem? AI is no longer merely a subfield within computer science.
The use of general categorizations (e.g., “critical and emerging technologies”) and instruments (e.g., J visas for work-and-study-based exchange programs and O visa programs for workers of extraordinary ability, both not subject to statutory caps) within the order will shift the battleground toward agency interpretations of the scope of these provisions. If the categorizations in the order are interpreted broadly, it could result in a significant expansion of federal immigration programs. For instance, by one estimate there are some 700,000 cyber jobs to fill in the country. If interpreted narrowly, these immigration provisions could be a drop in the bucket for attracting the necessary AI talent. The interpretation will hinge on the rulemaking process and requests for information, which means public input will be critical.
3. Leadership and Government Talent
The EO takes major steps to structure both the White House and agencies for AI leadership.
First, the EO requires (a) establishing the White House AI Council, chaired by the Deputy Chief of Staff for Policy and composed of cabinet-level officials, (b) the appointment of a Chief AI Officer at agencies, (c) internal AI governance boards within a smaller number of agencies (so-called Chief Financial Officers (CFO) Act agencies), and (d) an interagency council, initially composed of the Chief AI Officers, for coordination across agencies.
Such leadership structures are long needed. To date, AI activities within federal agencies have been fragmented and decentralized across different bureaus and offices, with notable exceptions of agencies that have issued comprehensive AI strategic plans. Within the White House, one particular concern has been that AI and emerging technologies should be elevated beyond the small, albeit important, National AI Initiative Office (NAIIO) in the Office of Science and Technology Policy, created under the National AI Initiative Act of 2020. The National Security Commission on AI (NSCAI) and the National AI Advisory Committee (NAIAC) each recommended considering a technology analogue to the National Security Council, Council of Economic Advisers, or Domestic Policy Council. Similarly, the EO borrows a page from the Evidence Act by designating an official to oversee AI activities within an agency.
Of course, titles alone won’t do much. The EO hence plans an “AI Talent Surge” into government, including through:
- Expanded use of excepted service (positions distinct from conventional civil service positions) and direct hire authority (hiring authority that allows agencies to avoid certain civil service selection procedures)
- Expanded use of the US Digital Service (USDS), Presidential Innovation Fellows (PIF), and the US Digital Corps (USDC), and other technology talent programs
- A task force to improve hiring pathways, and
- Training for the federal workforce.
The backdrop here is that the AI in Government Act had already required the Office of Personnel Management (OPM) to review AI hiring needs and to create a new AI hiring line. OPM had been over two years behind that statutory deadline, so how real will this hiring surge be? As the AI Index shows, not even 1% of AI PhDs pursue careers in the public sector. And as former US Deputy Chief Technology Officer Jen Pahlka notes, the need for staffing the civil service with technologists is profound, and USDS, PIF, and USDC are short-term, stopgap fixes. Federal agencies need technical talent that sticks around for the long term to develop the subject matter expertise to know what problems are worth solving. Government needs technical experts to meet this moment. And that means budgets.
The EO is both broad and precise. It is more wide-ranging than the prior AI EOs (Executive Order 13859, Executive Order 13960). By our count, it places some 150 requirements on more than 50 named federal entities to develop guidance, conduct studies, issue recommendations, implement policies, and, where appropriate, engage in rulemaking. This signals a whole-of-government approach to AI.
At the same time, the EO is also remarkably specific. It sets ambitious deadlines for the vast majority of requirements, with roughly one fifth of deadlines falling within 90 days and over 90% falling within a year. Lurking in the background, of course, is uncertainty about the next presidential administration, as EOs can be revoked with the stroke of a pen. The order defines over 30 terms and requires that OMB provide annual guidance and create mechanisms for tracking implementation and reporting progress to the public. OMB has already issued an admirable memorandum on AI in government, which is open for public comment.
There are four notable elements to the breadth and precision of the EO. First, the focus on implementation is a welcome development, given the prior history of implementation of AI EOs. As noted in a Stanford HAI/RegLab White Paper, federal agencies struggled to implement the prior two EOs, with 47% of agencies failing to file required AI use case inventories (a catalog of how AI systems are used within an agency). The careful definitions – including OMB’s clarification on covered agencies – reflect substantial attention to implementation. Second, this EO neither rescinds the prior EOs nor reissues the administration’s Blueprint for an AI Bill of Rights, as some advocates wanted. Third, while deadlines speak volumes, so too does their absence. Many deadlines appear, for instance, in Section 4, which addresses the national security risks of foundation models, with deadlines for about 91% of the requirements in that section. But far fewer deadlines are spelled out in the sections on supporting workers, advancing equity and civil rights, and protecting privacy, with unspecified deadlines for around 50%, 36%, and 50% of those requirements, respectively. This disparity may illustrate how concerns about national security, and particularly bioweapons risk, have driven much of the recent AI policy debate. Last, while the EO is to be applauded for its specificity, whether agencies can meet the longer-term requirements that were not already underway remains to be seen. In the words of former NAIIO director Lynne Parker, the EO is “paradoxical” in requiring rapid action while acknowledging the dearth of people with sufficient expertise to take such actions.
In Sum: A Massive Step Forward
In sum, the EO is a major step forward to ensure that America remains at the forefront of responsible innovation. It is a moment to celebrate. The White House AI Council, Chief AI Officers, support for research and development, and an “AI talent surge” are what many of us, for quite some time, have been calling for. To be sure, there is much that is harder to do through executive action alone, including broad regulatory measures of the private sector – such as new forms of licensing, disclosures, or registration beyond the national security domain (as the EO invokes the Defense Production Act) – more expansive immigration reform, changes to statutory authority, and, perhaps most important of all, funding. For that, Congress will need to be involved.
Implicit within the EO is also a concession: America needed to revamp its AI strategy. The rise of AI is too important, too widespread, and too pervasive to be handled by anything but top level attention. How this EO is implemented will shape fundamental elements of AI: the openness of the ecosystem, the degree of power concentration, and whether the public sector will be able to rise to this challenge. Top level attention must now be met by careful implementation, expertise, and resources.
Acknowledgments. We thank Arvind Narayanan and Kit Rodolfa for helpful feedback.
Authors: Rishi Bommasani is the society lead at Stanford CRFM and a PhD student in computer science at Stanford University. Christie M. Lawrence is a concurrent JD/MPP student at Stanford Law School and the Harvard Kennedy School. Lindsey A. Gailmard is a postdoctoral scholar at Stanford RegLab. Caroline Meinhardt is the policy research manager at Stanford HAI. Daniel Zhang is the senior manager for policy initiatives at Stanford HAI. Peter Henderson is an incoming assistant professor at Princeton University with appointments in the Department of Computer Science and School of Public and International Affairs. He received a JD from Stanford Law School and will receive a PhD in computer science from Stanford University. Russell Wald is the managing director for policy and society at Stanford HAI. Daniel E. Ho is the William Benjamin Scott and Luna M. Scott Professor of Law, Professor of Political Science, Professor of Computer Science (by courtesy), Senior Fellow at HAI, Senior Fellow at SIEPR, and Director of the RegLab at Stanford University.