New Tool Helps AI and Humans Learn To Code Better

Date: May 01, 2023
Topics: Machine Learning

Stanford researchers developed a new framework called Parsel that solves complex coding tasks the way humans do — breaking down big tasks into smaller ones.

Last December, during a meandering walk near the Mississippi River in New Orleans after the 2022 NeurIPS Conference, Stanford associate professor of computer science and psychology Noah D. Goodman and PhD student Eric Zelikman stumbled upon an idea that could change how large language models (LLMs) solve tasks: guide LLMs to solve problems the way people do, by breaking hard tasks into smaller pieces and solving them one at a time.

“I'm not sure I thought it actually would work,” recalled Goodman. But their idea did work, and far better than they could have hoped. In a new paper, their team, which included PhD students Qian Huang and Gabriel Poesia and was co-led by Stanford Graduate School of Education assistant professor Nick Haber, showed that LLMs implementing Parsel (a natural language framework they proposed that automatically solves many small problems and combines their solutions to solve a large one) performed 75 percent better than baselines on competition-level coding problems.

The result came as a surprise to the team, given that before the walk in New Orleans they had designed Parsel as a tool to help students learn how to code.

Now, a tool for teaching could actually be used to significantly advance the capabilities of LLMs. Before the Parsel framework, complex code written by LLMs was prone to failure because a single mistake would cause the entire program to break. Leveraging Parsel means that LLMs can finally write successful multi-part code based on the same algorithmic reasoning style that human programmers use, and all that’s needed is natural language as input.

Parsing Into Parts

To use Parsel as a tool for education, a student starts by typing, in plain English, what a new program must be able to do to accomplish a task. From those descriptions, Parsel identifies which parts are related and need to run together in sequence, starting with the simplest tasks. Finally, Parsel iterates through different versions of these coded parts, testing each of them until it lands on a version that satisfies everything the student requested.
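
That last step can be pictured as a simple generate-and-test loop. The sketch below is a minimal illustration of the idea, not Parsel's actual implementation; `llm_generate` is a hypothetical stand-in for whatever language model call produces a candidate function body.

```python
# Minimal sketch of a Parsel-style generate-and-test loop (illustrative,
# not Parsel's actual implementation). `llm_generate` is a hypothetical
# stand-in for an LLM call that returns the source of a candidate function.

def synthesize(name, description, tests, n_candidates=8):
    """Sample implementations until one passes every test case."""
    for _ in range(n_candidates):
        source = llm_generate(f"Write a Python function `{name}`: {description}")
        namespace = {}
        try:
            exec(source, namespace)          # load the candidate definition
            func = namespace[name]
            if all(func(*args) == expected for args, expected in tests):
                return source                # first passing candidate wins
        except Exception:
            pass                             # broken candidates are discarded
    raise RuntimeError(f"no candidate for `{name}` passed its tests")

# A leaf task might then be specified as:
#   synthesize("count_vowels",
#              "return the number of vowels in a string",
#              tests=[(("hello",), 2), (("sky",), 0)])
```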

In this way, Parsel does the heavy lifting in generating code with correct syntax and allows students to focus on the bigger picture. “What we struggle to teach kids in introductory computer science is this idea of algorithmic decomposition, and syntax often gets in the way of learning that core skill,” said Goodman.

But the researchers realized that LLMs have the opposite problem. While they can easily generate the syntax of a given programming language, they struggle to use algorithmic reasoning to build complex programs with many parts. That means every line of code they generate is an opportunity to mess up. “Some piece is going to break,” said Haber.
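
Under a toy independence assumption, the compounding is easy to quantify: if each generated line is correct with probability p, a monolithic n-line program comes out correct with probability p^n, while small, individually tested functions can be retried until their tests pass. The numbers below are illustrative assumptions, not figures from the paper.

```python
# Toy model of compounding errors in generated code.
# All numbers here are illustrative assumptions, not results from the paper.
p = 0.99                    # chance any single generated line is correct
n = 200                     # lines in a monolithic program

monolith = p ** n           # one shot at the whole program: ~13.4%
print(f"monolithic success: {monolith:.1%}")

# Decomposed: ten 20-line functions, each retried up to 5 times against its
# tests (this assumes the tests reliably catch broken candidates).
per_try = p ** 20                     # one attempt at a 20-line function
per_func = 1 - (1 - per_try) ** 5     # at least one of 5 attempts passes
decomposed = per_func ** 10           # all ten functions succeed: ~99.8%
print(f"decomposed success: {decomposed:.1%}")
```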

To find out if this kind of reasoning would help the performance of LLMs on competitive coding tasks, the researchers prompted LLMs to first create a higher-level sketch with step-by-step instructions before diving into the problem. Then, the LLMs used the sketch to generate Parsel code — a natural language decomposition of the task into function descriptions and test cases — to run the task.
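
The paper defines Parsel's actual syntax; purely as an illustration of the idea, a decomposition that pairs each function's natural language description with test cases might look like the hypothetical spec below, written here as Python data rather than in Parsel's real format.

```python
# Hypothetical shape of a task decomposition: each function gets a natural
# language description, its dependencies, and input/output test cases.
# This mirrors the idea of Parsel code but is NOT Parsel's actual syntax.
spec = {
    "collatz_length": {
        "description": "return how many steps the Collatz sequence starting "
                       "at n takes to reach 1, using next_collatz at each step",
        "depends_on": ["next_collatz"],
        "tests": [((1,), 0), ((6,), 8)],
    },
    "next_collatz": {
        "description": "return n // 2 if n is even, otherwise 3 * n + 1",
        "depends_on": [],
        "tests": [((4,), 2), ((5,), 16)],
    },
}
# A Parsel-style pipeline would implement leaf functions first (next_collatz),
# then the functions that depend on them, testing each candidate as it goes.
```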

They soon found that their LLMs not only did better than all previous models on a variety of competition-level coding problems from the APPS (Automated Programming Progress Standard) dataset, but could also be used to successfully generate step-by-step movement plans for an embodied robot, or even a mathematical proof.

“This sort of reasoning that we're forcing it to do is something quite domain general … we demonstrated interesting results around coding in particular, but I think there are a lot of directions,” said Haber.

A Wide-Open Future

Nothing quite like the Parsel framework had ever been attempted before, according to the scholars. “Up to the point that Parsel existed, I don't believe anyone thought it was currently possible to generate these kinds of programs from entirely natural language,” said Zelikman.

Moving forward, Goodman, Haber, Zelikman, and their colleagues are excited to continue working on Parsel as a tool for computer science education. “The education side is really exciting,” Zelikman emphasized. “We’re going to do more work developing that and seeing how it can be made more accessible to students.”

They also plan to continue testing Parsel to see how much it can help LLMs solve complex tasks that are more reflective of what programmers do in the real world. Haber noted that while it was exciting that they were able to show such dramatic improvements, they were limited by the datasets available and the difficulty of being the first ones to define a measure of success for such a pioneering new framework. Most prior work focused on coding problems that would normally be solved with a single function, which are not representative of real-world programming.

In the future, the team expects Parsel to evolve and expand beyond education and coding improvements for LLMs. “It certainly leads me to dream pretty wildly with where the next five to 10 years might take this,” said Haber. “You might imagine that these are things that can code with people, that can offload a lot of the dirty work in creating programs, and somehow free up people's ability to be thinking on a very different level when they're creating.”


Contributor: Allison Whitten

Related News

AI Leaders Discuss How To Foster Responsible Innovation At TIME100 Roundtable In Davos
TIME | Media Mention | Jan 21, 2026

HAI Senior Fellow Yejin Choi discussed responsible AI model training at Davos, asking, “What if there could be an alternative form of intelligence that really learns … morals, human values from the get-go, as opposed to just training LLMs on the entirety of the internet, which actually includes the worst part of humanity, and then we then try to patch things up by doing ‘alignment’?”

Stanford’s Yejin Choi & Axios’ Ina Fried
Axios | Media Mention | Jan 19, 2026

Axios chief technology correspondent Ina Fried speaks to HAI Senior Fellow Yejin Choi at Axios House in Davos during the World Economic Forum.

Spatial Intelligence Is AI’s Next Frontier
TIME | Media Mention | Dec 11, 2025

“This is AI’s next frontier, and why 2025 was such a pivotal year,” writes HAI Co-Director Fei-Fei Li.