Prompt Injection is a type of security attack that uses malicious input to trick a Large Language Model (LLM) into behaving in an unintended way. By crafting a deceptive prompt, an attacker can cause the model to bypass its safety guidelines, reveal sensitive information, or follow harmful instructions it was designed to refuse. This vulnerability exploits how the model processes and prioritizes instructions, essentially hijacking its intended function
Get the latest news, advances in research, policy work, and education program updates from HAI in your inbox weekly.
Sign Up For Latest News
Explore Similar Terms:

Scholars detail the current state of large language models in healthcare and advocate for better evaluation frameworks.
Scholars detail the current state of large language models in healthcare and advocate for better evaluation frameworks.


This brief explores student misuse of AI-powered “nudify” apps to create child sexual abuse material and highlights gaps in school response and policy.
This brief explores student misuse of AI-powered “nudify” apps to create child sexual abuse material and highlights gaps in school response and policy.


This brief reviews the current landscape of LLMs developed for psychotherapy and proposes a framework for evaluating the readiness of these AI tools for clinical deployment.
This brief reviews the current landscape of LLMs developed for psychotherapy and proposes a framework for evaluating the readiness of these AI tools for clinical deployment.


Courts will have to grapple with this new challenge, although scholars believe much of generative AI will be protected by the First Amendment.
Courts will have to grapple with this new challenge, although scholars believe much of generative AI will be protected by the First Amendment.
