Stanford Scholars Train Generative AI To Be Better Creative Collaborators

Date

March 10, 2026


The team is building a shared “conceptual grounding” so that artists can steer models with precision.

The conversation around AI and art generally swings between two extremes: A flood of AI slop or the total automation of creative work. The more desirable approach may be an AI that behaves as a useful collaborator.

But thus far, visual artists working with text-to-image tools face frustrating basic hurdles when they try to direct AI. Ask an AI to create an image of a house? Not too difficult. Direct it to make the house red, with four front-facing windows, a chimney, and ivy covering the left side? Good luck.

Stanford computer science, cognitive psychology, and education scholars believe they can help AI better augment human creativity by teaching models and people to communicate ideas with each other. With funding from a Stanford Institute for Human-Centered AI (HAI) Hoffman-Yee Research Grant, the scholars are developing a shared conceptual grounding for humans to collaborate with generative AI on production-quality visual content ranging from illustrations to diagrams to animations.

“While the models seem amazing, they are terrible collaborators,” says Maneesh Agrawala, professor of computer science at Stanford and a co-principal investigator for the project. “Creators have no way of knowing what the AI will produce when given a certain text prompt. If you ask for a suburban single-family home, it generates a modern duplex.”

Authoring original content requires having opinions and constantly making choices, Agrawala explains. Humans and AI need a shared set of concepts so the nuance doesn’t get lost in translation.

Deciphering the Human Creative Process

The Stanford team is approaching this problem from two directions. First, the scholars are running experiments to better understand how people collaborate to create visual content. They have conducted several studies of people performing creative tasks, analyzing chat logs and sketches to see how participants communicate as they work together.

“If we want to build AI systems that understand how humans think during creative projects, we should start by learning as much as we can from the way that people establish common conceptual ground with each other,” says Judith Fan, assistant professor of psychology at Stanford's School of Humanities and Sciences. “Not everyone talks or draws the same way, but they still expect to be understood.”

Building AI Tools that Understand Creators

Second, the team is building open-source AI tools to apply the lessons learned about human creative communication. For example, ControlNet teaches text-to-image diffusion models about spatial composition, using two separate features, blocking and detailing, to mirror how artists begin with a rough sketch and then fill in the details of a drawing. Today’s models struggle to capture the idea of a pose or how objects should be arranged in a scene. With this tool, creators can guide models to a layout that matches their vision.

Another tool called FramePack enables creators to generate 3D videos from a text prompt for multi-scene storytelling. This tool teaches models to prioritize scenes based on their importance to the overall story, similar to the way a human would work on the project.

A third innovation explores the power of neuro-symbolic AI, which combines neural networks with reasoning capabilities to increase transparency and overcome the limitations of “black box” AI. Using these principles, the team has developed a visual scene coding language that works from a natural language text prompt to produce lines of code, which are executed and rendered to create a 3D scene. Human creators can stay in the loop to inspect or edit the code and prompt the AI to update its program at any time.
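To make the idea of an inspectable scene program concrete, here is a minimal Python sketch. It is not the team's scene coding language, whose syntax the article does not describe. The `Scene` and `Part` classes, the `build_house` function, and the text-only `describe` method (standing in for a real 3D renderer) are all hypothetical names chosen for illustration. The sketch encodes the red house from the earlier example as a short program that a person could read and edit.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """One named element of a scene (hypothetical structure)."""
    kind: str
    count: int = 1
    side: str = "front"


@dataclass
class Scene:
    """A scene held as an editable list of parts rather than opaque pixels."""
    name: str
    color: str
    parts: list = field(default_factory=list)

    def add(self, kind, count=1, side="front"):
        # Append a part and return self so calls can be chained.
        self.parts.append(Part(kind, count, side))
        return self

    def describe(self):
        """Stand-in for rendering: list what a renderer would be asked to draw."""
        lines = [f"{self.color} {self.name}"]
        for part in self.parts:
            lines.append(f"{part.count} x {part.kind} on {part.side} side")
        return lines


def build_house():
    # The kind of program a model might emit for the house prompt (assumed).
    house = Scene(name="house", color="red")
    house.add("window", count=4, side="front")
    house.add("chimney", side="top")
    house.add("ivy", side="left")
    return house


if __name__ == "__main__":
    scene = build_house()
    # A creator stays in the loop by editing the program directly.
    scene.parts[0].count = 6
    for line in scene.describe():
        print(line)
```

Because every property is an explicit value in the program, a request such as "four front-facing windows" can be checked or changed by editing one number instead of rewording a prompt and hoping for the best.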

Reimagining Education Content

A shared conceptual grounding between humans and AI promises to yield new applications in diverse fields, including design, simulation, animation, robotics, and education, says Agrawala. The research team is currently working with gaming platform Roblox to enable players to generate unique 3D objects from text prompts while imposing game restrictions (so, for example, players won’t be able to create weapons in a nonviolent game).

More broadly, the scholars hope that one day human creators of all skill levels, from hobbyists and small business owners to visual experts, will have a friction-free way to express their ideas using a combination of natural language, example content, code snippets, and other modalities.

“We’re serious about equipping the broader creative community with the tools they need to communicate with AI effectively,” Fan says.

Want to learn more? Watch this research team discuss the latest findings during the recent Hoffman-Yee Symposium at Stanford HAI.
