Relevant quote:
After the pilot period, Garcia and the team issued a survey to the clinicians, asking them to report on their experience. They reported that the AI-generated drafts lightened the cognitive load of responding to patient messages and improved their feelings of work exhaustion despite objective findings that the drafts did not save the clinicians’ time. That’s still a win, Garcia said, as this tool is likely to have even broader applicability and impact as it evolves.
Link to paper in JAMA (currently open access)
The fact that less than 20% of doctors used it doesn’t tell you anything about how that minority used it. The fact that 80% of doctors didn’t use it says a great deal about what the majority of doctors think about how appropriate it is for patient communication.
Reducing the doctor’s cognitive load is great, provided it doesn’t result in auto-generated nonsense that never actually addresses whatever the patient was concerned about. Pretending that this isn’t a possibility with wider usage is just dishonest; there are already plenty of doctors who clearly barely skim emails before responding.
So to be clear, less than 20% used what the AI generated directly. There are no stats on whether the clinicians copy/pasted parts of it, rewrote the same information in different words, or otherwise corrected what was presented. The vast majority of clinicians said it was useful. I’d recommend checking out the open-access article; it goes into a lot of this detail. I think they did a great job of making sure it was a useful product before even piloting it. They also go into a lot of detail on the ethical framework they used to evaluate how useful and ethical it was.
The article does indeed have a lot of relevant information, like nearly half of participants not completing the study, and the score most participants gave it barely squeaking into positive territory: PCPs and APPs, who make up 83 of the 162 participants, rated it a 13 on a scale of -100 to 100.
This is not a mountain of evidence. I’m not going to say it’s specifically been cherry-picked, but it’s a small study with a low completion rate and mediocre support for the product (which is what it is; don’t fool yourself).
I never said it was a mountain of evidence. I simply shared it because I thought it was an interesting study with plenty of useful information.
I’m not trying to berate you for posting it; it’s just important to be highly skeptical of anything AI-related that isn’t being used purely for amusement. Patient care is one of the deepest responsibilities that can be placed on someone, and anything that alters care coming directly from a doctor introduces a failure point.
I am in complete agreement. I am a data scientist in health care, and over my career I’ve worked on very few ML/AI models, none of which were generative AI or LLM based. I’ve worked on so few because nine times out of ten I am arguing against the inclusion of ML/AI, since there are better solutions involving simpler tech. I have serious concerns about ethics when it comes to automating just about anything in patient care, especially when it can affect population health or health equity. However, this was one of the only uses of generative AI in healthcare I’ve seen that showed actual promise for being useful, and I wanted to share it.