AI search has a citation problem
The report evaluates eight generative AI search tools and finds widespread failures to cite news sources accurately. Many systems fabricate or misattribute links, ignore publisher restrictions, and deliver confident but incorrect answers, raising concerns about information reliability, lost publisher traffic, and the transparency of AI-generated search results.
OVERVIEW
Introduction
Generative AI search tools are increasingly used as alternatives to traditional search engines, with nearly one in four Americans reporting that they use AI in place of conventional search. Unlike traditional search engines, which direct users to external websites, generative search tools extract and summarise information directly, often from news publishers. This shift raises concerns about attribution, traffic diversion, and information accuracy.
The Tow Center for Digital Journalism analysed how generative AI search tools access, present, and cite news content. The study examined whether these systems correctly attribute news sources and how they behave when they cannot reliably identify the original article. The findings highlight systemic problems in citation accuracy, transparency, and adherence to publishers’ preferences regarding automated access.
Methodology
Researchers tested eight generative AI search tools with live web search capabilities: ChatGPT Search, Perplexity, Perplexity Pro, DeepSeek Search, Microsoft Copilot, Google Gemini, and xAI’s Grok-2 and Grok-3 (beta). The evaluation focused on the systems’ ability to identify and correctly cite news articles based on excerpts.
The study drew on articles from twenty news publishers with differing approaches to AI access: some permitted automated crawlers through their robots.txt files, while others blocked them. The sample included organisations with licensing or revenue-sharing agreements with AI companies as well as publishers that restricted access entirely.
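For illustration, this blocking works through plain-text directives served at a site's /robots.txt path. The file below is a hypothetical sketch, though the user-agent tokens shown are crawler names the AI companies themselves document:

    # Hypothetical publisher robots.txt (illustrative only)
    User-agent: GPTBot            # OpenAI's crawler token
    Disallow: /

    User-agent: PerplexityBot     # Perplexity's crawler token
    Disallow: /

    User-agent: *                 # all other crawlers
    Allow: /

Publishers that welcome AI access simply omit such Disallow rules, or grant access selectively to individual crawlers.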
Researchers issued prompts containing excerpts from news articles and asked each AI tool to identify the corresponding article's headline, original publisher, publication date, and URL. Responses were assessed for accuracy, proper attribution, and compliance with publishers' crawling permissions.
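The report's exact prompt wording is not reproduced here; a hypothetical prompt in that format might read:

    Here is an excerpt from a news article:

    "<article excerpt>"

    Identify the article this excerpt comes from, and provide its
    headline, original publisher, publication date, and URL.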
Findings
Across all systems, citation reliability was poor. Rather than declining to answer when they lacked reliable information, the tools frequently produced incorrect or speculative answers. Fabricated links and misattributed articles were common.
Chatbots often cited syndicated or copied versions of news articles rather than the original source. In some cases, links generated by the systems led to non-existent pages or unrelated content, indicating hallucinated citations.
Premium AI search services did not perform better than free versions. In several cases, paid tools produced answers that were more confidently incorrect, potentially increasing the risk of misinformation.
Compliance with publisher restrictions was inconsistent. Some AI tools appeared to bypass Robots Exclusion Protocol preferences, retrieving or referencing material from publishers that had explicitly blocked automated crawlers. This behaviour undermines publishers' ability to control how their content is accessed and reused.
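To make the expected behaviour concrete, here is a minimal Python sketch of the check a compliant crawler would perform before fetching a page, using the standard library's robotparser module; the site and crawler names are placeholders, not any vendor's actual implementation:

    from urllib import robotparser

    SITE = "https://news.example.com"   # hypothetical publisher
    CRAWLER = "ExampleAIBot"            # hypothetical crawler token

    # Fetch and parse the publisher's robots.txt.
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{SITE}/robots.txt")
    rp.read()

    article_url = f"{SITE}/2025/03/some-article"

    # A compliant crawler honours the answer; the report suggests
    # some tools retrieved blocked content regardless.
    if rp.can_fetch(CRAWLER, article_url):
        print("Allowed: the page may be fetched and cited.")
    else:
        print("Disallowed: a compliant crawler skips this page.")

Crucially, robots.txt is advisory: nothing technically prevents a non-compliant system from ignoring it, which is why inconsistent compliance erodes the only access control many publishers have.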
Content licensing agreements also did not guarantee accurate attribution. Even when publishers had formal partnerships with AI companies, the tested systems sometimes failed to correctly identify or link to the original article.
Performance varied between systems but remained unreliable overall. Some tools occasionally declined to answer when they could not verify the source, which researchers considered appropriate behaviour. However, these refusals were inconsistent. For example, Microsoft Copilot sometimes declined prompts when it could not identify the source, whereas other tools generated incorrect responses instead.
Google Gemini displayed additional limitations. In tests, it correctly identified the source article only once among multiple attempts. It also refused to answer prompts involving political content, returning generic statements directing users to traditional search instead.
Implications
The study indicates structural weaknesses in how generative AI search tools retrieve and cite journalistic content. Incorrect or fabricated citations risk misleading users and undermine trust in information systems.
For news organisations, generative search creates a distribution imbalance. Traditional search engines drive traffic to publishers, while AI systems summarise content directly within their interfaces, potentially reducing referral traffic and limiting publishers’ visibility.
The findings suggest that current AI search systems require stronger mechanisms for source verification, clearer citation practices, and better adherence to publisher permissions. Improved transparency and technical safeguards are necessary to ensure accurate attribution and maintain the integrity of news information within AI-mediated search environments.