People cannot tell AI-generated from human-written poetry and they like AI poetry more

New research has found that people cannot reliably tell whether a poem was written by a human or generated by AI. Despite this, they tended to rate AI-generated poems more favorably on qualities such as rhythm and beauty. The paper was published in Scientific Reports.

Generative AI is a type of artificial intelligence that creates new content—such as text, images, music, and code—by learning patterns from data. Examples include ChatGPT and DeepSeek, which generate human-like text, and DALL·E, which creates images from text descriptions. In music, AI composes original melodies, while in gaming, it helps design environments and dialogues. Businesses use AI chatbots for customer service, and researchers generate synthetic data for simulations. AI-powered tools like GitHub Copilot assist programmers by suggesting code.

In recent years, the quality of AI-generated content has improved dramatically. Studies have shown that people are more likely to judge AI-generated paintings to be human-made than paintings actually created by humans. Similarly, AI-generated humor has been found to be as funny as jokes written by humans. Another study found that people perceive AI-generated faces as real more often than they do actual photos of human faces.

Study authors Brian Porter and Edouard Machery sought to investigate whether people can distinguish AI-generated poems from professionally written human poetry, and what features they use to make this judgment. They also wanted to explore how participants evaluate the qualities of AI-generated poetry and whether knowing that a poem’s author is human or AI influences these evaluations.

The researchers conducted two experiments. The first experiment aimed to determine whether participants could distinguish between AI-generated and human-written poetry. The study included 1,634 United States-based individuals recruited through Prolific. The participants’ median age was 37 years, and 49% of them were women.

The researchers collected 50 poems by 10 English-language poets (five poems per poet) from mypoeticside.com, an online poetry database. They selected poems that were not among each poet’s most popular works, aiming to cover a wide range of genres, styles, and time periods. They also had ChatGPT generate 50 poems, five in the style of each poet.

Each participant was assigned a poet and shown five poems written by that poet and five poems generated by ChatGPT in the poet’s style. Their task was to determine whether each poem was written by a human or AI. Participants also rated their confidence in their answers and had the opportunity to explain their reasoning.

The second experiment examined how participants evaluated AI-generated poetry compared to human-written poetry. This experiment involved 696 United States-based individuals, also recruited through Prolific. Their average age was 40 years, and 47% of them were women.

Participants rated various qualities of each poem, including overall quality, rhythm, imagery, and sound. They also assessed how moving, profound, witty, lyrical, inspiring, beautiful, meaningful, and original each poem was, as well as how well it conveyed a specific theme and emotion.

Results showed that participants were unable to reliably distinguish between human-written and AI-generated poetry. Moreover, they were more likely to misidentify AI-generated poems as human-written than vice versa. The five poems least often identified as human were all written by actual human poets.

Findings from the second experiment indicated that AI-generated poems received higher ratings for qualities such as rhythm and beauty. The researchers suggest that these factors contributed to participants’ mistaken belief that the AI-generated poems were authored by humans.

“Our findings suggest that participants employed shared yet flawed heuristics to differentiate AI from human poetry: the simplicity of AI-generated poems may be easier for non-experts to understand, leading them to prefer AI-generated poetry and misinterpret the complexity of human poems as incoherence generated by AI,” the study authors concluded.

The study sheds light on how people perceive AI-generated poetry. However, it is important to note that the AI-generated poems used in the study were generated specifically to mimic the styles of real human poets, which likely made them closely resemble the human-written poems they were compared with.

The paper, “AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably,” was authored by Brian Porter and Edouard Machery.