Well-written requests to ChatGPT can save lives: it is better not to make typos

Published by Andrii Rusanov

It turns out that AI chatbots like ChatGPT are extremely sensitive to spelling, as a study of medical AI chatbots has revealed.

MIT researchers note that an AI chatbot is more likely to advise patients not to seek medical assistance if their messages contain typos. Even very minor mistakes, such as an extra space between words, are enough to shift the advice.

The advice of medical bots is also swayed by slang or colorful language. The study found that women are disproportionately affected: they are more often wrongly advised not to see a doctor than men. Moreover, bias in the questions or even their tone can subtly but noticeably change the advice.

The study, which has not yet been peer-reviewed but has been published at an ACM conference, raises doubts about the use of AI models in hospitals. Healthcare facilities already use chatbots to schedule patient appointments, triage patients based on their requests, and give first-line advice in multiple languages. Ordinary people are usually not good at describing how they feel, and in some situations they are simply physically unable to do so adequately. Patients may also hesitate and use words like "possibly" and "allegedly," which likewise affects the response.

The researchers evaluated several models, including OpenAI's GPT-4, Meta's open Llama-3-70B, and the medical model Palmyra-Med. They simulated thousands of cases using combinations of real patient surveys, health posts from Reddit, and some AI-generated cases. Before the data was passed to the models, the variations described above were introduced to see how the bots would react. These changes left the clinical content intact, altering only the spelling and wording.
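To make the setup concrete, here is a minimal Python sketch of the kind of perturbation pipeline this description implies. It is an illustration under stated assumptions, not the researchers' code: the function names, the specific perturbations, and the example message are all hypothetical.

```python
# A minimal sketch of the perturbation idea described in the study,
# NOT the authors' actual code. All names and transformations here
# are illustrative assumptions.
import random

def add_typo(text: str) -> str:
    """Swap two adjacent characters at a random position (a simple typo)."""
    if len(text) < 2:
        return text
    i = random.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

def add_extra_space(text: str) -> str:
    """Insert a stray extra space between two words."""
    words = text.split(" ")
    if len(words) < 2:
        return text
    i = random.randrange(1, len(words))
    return " ".join(words[:i]) + "  " + " ".join(words[i:])

def add_hedging(text: str) -> str:
    """Prepend an uncertain qualifier, like a patient hedging with 'possibly'."""
    return "Possibly " + text[0].lower() + text[1:]

PERTURBATIONS = [add_typo, add_extra_space, add_hedging]

def make_variants(message: str, n: int = 5) -> list[str]:
    """Generate n perturbed copies of a patient message.

    The clinical content is unchanged; only spelling and wording vary.
    """
    return [random.choice(PERTURBATIONS)(message) for _ in range(n)]

if __name__ == "__main__":
    base = "I have had chest pain and shortness of breath since this morning."
    for variant in make_variants(base):
        print(variant)
    # Each variant would then be sent to the model under test (e.g., GPT-4),
    # and the rate of "self-manage" vs. "see a doctor" answers compared
    # between the clean and the perturbed versions of the same case.
```

The key design point, as the article describes it, is that only the surface form of the message changes, so any shift in the model's recommendation can be attributed to style rather than to the medical facts.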

The non-standard writing style clearly changed how the AI assessed the cases. When faced with these stylistic quirks, the models were 7-9% more likely to suggest that a patient treat their symptoms on their own instead of seeing a doctor.

"These models are often trained and tested on medical exam questions but are then used in tasks that are quite far from that, such as assessing the severity of a clinical case. There is still so much about LLMs that we don't know," writes lead author Abinitha Gourabathina, a researcher in MIT's Department of Electrical Engineering and Computer Science.

AI also replicates, if not exaggerates, the biases shown by human doctors, especially with regard to gender. Why were female patients more often advised to self-treat than male patients? Could this be related to the fact that real-life doctors often downplay women's medical complaints? The researchers believe that medical AIs need thorough testing and that their shortcomings will not be easy to fix. Still, it is almost certain that AI is more accurate than a doctor writing out a prescription on paper.

Source: Futurism