'I'm Unable To': How Generative AI Chatbots Respond when Asked for the Latest News

Richard Fletcher, Marina Adami, Rasmus Kleis Nielsen

April 2024

Abstract

We test how well ChatGPT and Bard (now Gemini) provide the latest news to users who ask for the top five headlines from specific outlets. Based on an analysis of 4,500 headline requests (in 900 outputs) in January/February 2024, we find that (i) ChatGPT returned non-news output (an ‘I’m unable to’ message) 52–54% of the time, while Bard did so 95% of the time; (ii) for ChatGPT, just 8–10% of requests returned headlines referring to top stories on the outlet’s homepage; (iii) 30% returned headlines that referred to real, existing stories that were not among the top stories; and (iv) 3% of ChatGPT outputs contained headlines that referred to real stories that could only be found on the website of a different outlet, and 3% were so vague and ambiguous that they could not be matched to existing stories. Both of these could be considered a form of hallucination.

Type

Report

Publication

Reuters Institute for the Study of Journalism

Twitter thread

How do ChatGPT & Bard/Gemini respond when asked about the latest news? @richrdfletcher @Marina__Adami @rasmus_kleis prompted them to provide headlines from the 15 most used news sites in 10 countries

📱 Full results in factsheet: https://t.co/HdP4RRDLJc
🧵 Key findings in thread pic.twitter.com/4OiHJPbUm9
— Reuters Institute (@risj_oxford) April 25, 2024

generative AI
