Assessing the accuracy of automatic speech recognition for psychotherapy
Accurate transcription of audio recordings from psychotherapy sessions could improve therapy effectiveness, clinician training, and safety monitoring. Although automatic speech recognition (ASR) software is commercially available, its accuracy in mental health settings has not been well characterized. It is also unclear which metrics and thresholds are appropriate for different clinical use cases, which range from population-level description to individual safety monitoring.
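As a rough illustration of one commonly reported accuracy metric, the sketch below computes word error rate (WER) between a reference transcript and an ASR hypothesis using a standard word-level edit distance. The function name and example transcripts are illustrative assumptions, not drawn from the study or its data.

```python
# Minimal sketch: word error rate (WER) via Levenshtein edit distance over words.
# Function name and example strings are illustrative, not from the study.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

if __name__ == "__main__":
    ref = "i have been feeling anxious this week"
    hyp = "i been feeling anxious these weeks"
    print(f"WER: {word_error_rate(ref, hyp):.2f}")  # edits divided by reference length
```

Whether a given WER is acceptable depends on the use case: a threshold tolerable for aggregate population description may be far too loose for monitoring an individual patient's safety-relevant statements.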
Related Publications
AI, Health, and Health Care Today and Tomorrow: The JAMA Summit Report on Artificial Intelligence
Automated real-time assessment of intracranial hemorrhage detection AI using an ensembled monitoring model (EMM)
Artificial intelligence (AI) tools for radiology are commonly unmonitored once deployed. The lack of real-time, case-by-case assessments of AI prediction confidence requires users to independently distinguish between trustworthy and unreliable AI predictions, which increases cognitive burden, reduces productivity, and potentially leads to misdiagnoses. To address these challenges, we introduce the Ensembled Monitoring Model (EMM), a framework inspired by clinical consensus practices using multiple expert reviews. Designed specifically for black-box commercial AI products, EMM operates independently without requiring access to internal AI components or intermediate outputs, while still providing robust confidence measurements. Using intracranial hemorrhage detection as our test case on a large, diverse dataset of 2919 studies, we demonstrate that EMM can successfully categorize confidence in the AI-generated prediction, suggest appropriate actions, and help physicians recognize low-confidence scenarios, ultimately reducing cognitive burden. Importantly, we provide key technical considerations and best practices for successfully translating EMM into clinical settings.
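The abstract describes deriving confidence from consensus among independent monitors of a black-box prediction. The sketch below illustrates that general idea only; the class, function, thresholds, and confidence categories are assumptions for illustration and are not the authors' EMM implementation.

```python
# Illustrative sketch of consensus-based confidence monitoring for a black-box
# classifier. Names, thresholds, and categories are assumptions, not the EMM spec.
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class MonitoredPrediction:
    ai_label: int     # prediction from the black-box commercial AI
    confidence: str   # "high", "moderate", or "low" agreement category


def monitor(ai_label: int,
            case_features: Any,
            ensemble: List[Callable[[Any], int]]) -> MonitoredPrediction:
    """Categorize confidence by how many independent monitors agree with the AI."""
    votes = [model(case_features) for model in ensemble]
    agreement = sum(v == ai_label for v in votes) / len(votes)
    if agreement >= 0.8:
        confidence = "high"      # prediction can likely be trusted
    elif agreement >= 0.5:
        confidence = "moderate"  # flag for closer review
    else:
        confidence = "low"       # recommend an independent physician read
    return MonitoredPrediction(ai_label=ai_label, confidence=confidence)
```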
Developing mental health AI tools that improve care across different groups and contexts
To realize the potential of mental health AI applications to deliver improved care, a multipronged approach is needed, including representative AI datasets, research practices that reflect and anticipate potential sources of bias, stakeholder engagement, and equitable design practices.
Ethical Obligations to Inform Patients About Use of AI Tools
Permeation of artificial intelligence (AI) tools into health care tests traditional understandings of what patients should be told about their care. Despite the general importance of informed consent, decision support tools (eg, automatic electrocardiogram readers, rule-based risk classifiers, and UpToDate summaries) are not usually discussed with patients even though they affect treatment decisions. Should AI tools be treated similarly? The legal doctrine of informed consent requires disclosing information that is material to a reasonable patient’s decision to accept a health care service, and evidence suggests that many patients would think differently about care if they knew it was guided by AI. In recent surveys, 60% of US adults said they would be uncomfortable with their physician relying on AI,1 70% to 80% had low expectations AI would improve important aspects of their care,2 only one-third trusted health care systems to use AI responsibly,3 and 63% said it was very true that they would want to be notified about use of AI in their care.
