AI-generated summaries of scientific studies are not only simpler and more accessible to general readers; they also improve public perceptions of scientists’ trustworthiness and credibility, according to recent research published in PNAS Nexus. By comparing traditional summaries written by researchers with AI-generated versions, the study highlights how AI can enhance the public’s understanding of scientific information while fostering positive attitudes toward science.
Large language models, such as ChatGPT, are advanced artificial intelligence systems trained on vast amounts of text data to understand and generate human-like language. These models rely on deep learning techniques, specifically neural networks, to analyze patterns in text and predict the most likely sequences of words based on input. The primary strength of these models lies in their ability to process and produce natural language, making them useful for tasks like text summarization, translation, and answering questions.
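For readers curious what “predicting the most likely next word” looks like in practice, here is a toy sketch (not part of the study) that asks a small, openly available model for its single most likely continuation of a sentence. It uses GPT-2 via the Hugging Face transformers library purely for illustration; the study itself used ChatGPT-4, which is not openly available.

```python
# A toy illustration of next-word prediction, the core operation behind
# large language models. GPT-2 stands in here because it is small and
# open; this is not the model used in the study.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The findings of this study suggest that"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # a score for every token in the vocabulary

# Pick the single most likely token to follow the prompt.
next_token_id = logits[0, -1].argmax()
print(tokenizer.decode(next_token_id))
```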
“I was interested in this topic because many social scientific studies on AI (at the time) were focused on benchmarking AI performance against human performance for a range of tasks (e.g., how well could AI models respond to surveys or behavioral tasks like humans),” said David M. Markowitz, an associate professor of communication at Michigan State University.
“I wanted to take this a step further to see how AI could enhance and improve consequential aspects of everyday life, not just meet human-like performance. Therefore, I studied how people appraise and make sense of scientific information. The results of this paper largely suggest AI can make scientific information simpler for people, and this can have downstream positive impacts for how humans think about scientists, research, and their understanding of scientific information.”
Markowitz first examined whether lay summaries of scientific articles are simpler than the corresponding scientific abstracts. He focused on articles published in the Proceedings of the National Academy of Sciences (PNAS), a highly regarded journal that provides both technical summaries (abstracts) and lay summaries (significance statements). He compiled a dataset of more than 34,000 articles that included both types of summaries.
The texts were analyzed using an automated linguistic tool called Linguistic Inquiry and Word Count (LIWC). This software evaluates texts on various linguistic dimensions, including the use of common words, writing style, and readability. Common words were identified as those frequently used in everyday language, which tend to make texts more accessible. Writing style was measured by analyzing how formal or analytical the text appeared, while readability scores considered the length of sentences and the complexity of vocabulary.
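LIWC itself is proprietary software, but the rough idea behind these measures can be approximated with open tools. The sketch below is an illustrative stand-in, not the study’s actual pipeline: the everyday-word list is an invented assumption rather than LIWC’s dictionary, and the Flesch Reading Ease score from the textstat package serves as a simple readability proxy.

```python
# An illustrative stand-in for the kinds of measures described above:
# a toy everyday-word ratio and a standard readability score. The word
# list below is an assumption, not LIWC's actual dictionary.
import textstat  # pip install textstat

COMMON_WORDS = {"the", "a", "and", "of", "to", "in", "that", "is",
                "we", "show", "how", "when", "people", "learn"}

def simplicity_profile(text: str) -> dict:
    words = [w.strip(".,;:()").lower() for w in text.split()]
    common_ratio = sum(w in COMMON_WORDS for w in words) / len(words)
    return {
        "common_word_ratio": round(common_ratio, 3),
        # Flesch Reading Ease: higher scores mean easier text; computed
        # from sentence length and syllables per word.
        "readability": textstat.flesch_reading_ease(text),
    }

abstract = "We elucidate the mechanistic underpinnings of synaptic plasticity."
lay = "We show how brain cells change when we learn."
print(simplicity_profile(abstract))
print(simplicity_profile(lay))
```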
The analysis revealed that lay summaries were indeed simpler than scientific abstracts. They contained more common words, shorter sentences, and a less formal writing style. But these differences, while statistically significant, were relatively small. This raised questions about whether lay summaries were simplified enough for non-expert readers to fully understand the scientific content. The findings highlighted a potential opportunity for further simplification, which led to the exploration of AI-generated summaries in the next phase of the study.
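A brief illustration of why that distinction matters: with roughly 34,000 articles, even a tiny average difference can reach statistical significance. The sketch below uses invented numbers, not the study’s data, to show how a standardized effect size such as Cohen’s d separates significance from magnitude.

```python
# Why "statistically significant" can still mean "small": with tens of
# thousands of articles, tiny mean differences reach significance. The
# numbers here are invented for illustration, not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
abstracts = rng.normal(loc=50.0, scale=10.0, size=34000)      # e.g., readability
lay_summaries = rng.normal(loc=51.5, scale=10.0, size=34000)  # slightly simpler

t, p = stats.ttest_ind(lay_summaries, abstracts)

# Cohen's d: the mean difference in units of the pooled standard deviation.
pooled_sd = np.sqrt((abstracts.var(ddof=1) + lay_summaries.var(ddof=1)) / 2)
d = (lay_summaries.mean() - abstracts.mean()) / pooled_sd

print(f"p = {p:.2e}, Cohen's d = {d:.2f}")  # highly significant, yet d is only ~0.15
```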
Markowitz next conducted an experiment to evaluate whether AI-generated summaries could simplify scientific communication more effectively than human-authored lay summaries. He selected 800 scientific abstracts from the PNAS dataset and used a popular generative AI model, ChatGPT-4, to create corresponding summaries. The AI was instructed to write concise, clear significance statements, following the guidelines provided to PNAS authors for writing lay summaries.
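The article does not reproduce Markowitz’s exact prompt, but the general workflow looks something like the sketch below, written with the OpenAI Python SDK. The model identifier and the prompt wording are assumptions based on the description above, not the study’s actual instructions.

```python
# A sketch of the general summarization workflow. The model identifier
# and prompt wording are assumptions, not Markowitz's actual setup.
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

def generate_lay_summary(abstract: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",  # assumed identifier for what the study calls ChatGPT-4
        messages=[
            {"role": "system",
             "content": ("Write a concise, clear significance statement for "
                         "a general audience, following the guidelines PNAS "
                         "gives its authors: brief, plain language, no jargon.")},
            {"role": "user", "content": abstract},
        ],
    )
    return response.choices[0].message.content

print(generate_lay_summary("We elucidate the mechanistic underpinnings..."))
```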
Participants for Study 2 were recruited online, forming a diverse sample of 274 individuals from the United States. Each participant was presented with summaries from both AI and human authors, but the pairings were randomized to avoid bias. After reading the summaries, participants were asked to evaluate the credibility, trustworthiness, and intelligence of the authors. They were also asked to assess the complexity and clarity of the summaries and to indicate whether they believed each summary was written by a human or an AI.
The results showed that AI-generated summaries were perceived as simpler, clearer, and easier to understand compared to human-authored ones. Participants rated the authors of AI-generated summaries as more trustworthy and credible but slightly less intelligent than the authors of human-written summaries. Interestingly, participants were more likely to mistake AI-generated summaries for human work, while associating the more complex human-written summaries with AI.
These findings demonstrated the potential of generative AI to enhance science communication by making it more accessible and fostering positive perceptions of scientists. However, the slight reduction in perceived intelligence highlighted a trade-off between simplicity and expertise.
In his third and final study, Markowitz expanded on Study 2 by examining not only public perceptions of AI-generated summaries but also their impact on comprehension. The experiment included a larger and more diverse set of stimuli, with 250 participants reading summaries from 20 pairs of AI and human-authored texts. Each participant was randomly assigned to five pairs of summaries, with one summary from each pair being AI-generated and the other human-written.
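One way the random-assignment logic described above could be implemented is sketched below. The details, such as per-participant seeding and counterbalancing of presentation order, are assumptions inferred from the article, not the study’s actual materials.

```python
# A sketch of random assignment: each participant sees 5 of 20 summary
# pairs, and within each pair the order of the AI and human versions is
# shuffled. Details are assumptions, not the study's actual code.
import random

N_PAIRS, PAIRS_PER_PARTICIPANT = 20, 5

def assign(participant_seed: int) -> list[dict]:
    rng = random.Random(participant_seed)  # reproducible per participant
    chosen = rng.sample(range(N_PAIRS), PAIRS_PER_PARTICIPANT)
    trials = []
    for pair_id in chosen:
        versions = ["ai", "human"]
        rng.shuffle(versions)  # counterbalance which version appears first
        trials.append({"pair": pair_id, "order": versions})
    return trials

print(assign(participant_seed=42))
```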
After reading each summary, participants were asked to answer a multiple-choice question about the scientific content to test their understanding. They were also asked to summarize the main findings of the research in their own words. The summaries provided by participants were evaluated for accuracy and detail using an independent coding process.
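Independent coding typically means two or more raters scoring the same responses and then checking how well they agree. The sketch below illustrates one common agreement statistic, Cohen’s kappa, using invented ratings; the article does not specify which reliability measure the study used.

```python
# Checking agreement between two independent coders with Cohen's kappa.
# The ratings below are invented for illustration only.
from sklearn.metrics import cohen_kappa_score

# 1 = participant's recap judged accurate, 0 = judged inaccurate
rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]

print(f"Cohen's kappa = {cohen_kappa_score(rater_a, rater_b):.2f}")
```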
Markowitz found that participants demonstrated better comprehension when reading AI-generated summaries. They were more likely to correctly answer multiple-choice questions and provided more detailed and accurate summaries of the scientific content. These results indicated that the linguistic simplicity of AI-generated summaries facilitated deeper understanding and better retention of information.
As in Study 2, participants rated AI-generated summaries as clearer and less complex than human-written ones. However, the perception of authors’ intelligence remained lower for AI-generated texts, and there were no significant differences in credibility and trustworthiness ratings between the two types of summaries.
“I hope that the average person recognizes the positive impact of simplicity in everyday life,” Markowitz told PsyPost. “Simple language feels better to most people than complex language and therefore, I hope this work also suggests that people should demand more from scientists to make their work more approachable, linguistically. Complex ideas do not necessarily need to be communicated in a complex manner.”
While the results are promising, the study has limitations. The data were drawn from a single journal, PNAS, which may not represent all scientific fields or publication practices. Future studies could expand the scope to include a variety of journals and disciplines to confirm the generalizability of these findings.
Future research could explore how AI-generated summaries perform across different scientific domains, the long-term effects on public scientific literacy, and ways to balance simplicity with nuance. It could also examine whether integrating AI tools into the writing process improves the quality of scientific communication from the outset.
The study, “From complexity to clarity: How AI enhances perceptions of scientists and the public’s understanding of science,” was published in PNAS Nexus in September 2024.