AI tool achieves 94% accuracy in distinguishing fake research articles from real ones

Researchers have developed a tool that can distinguish an original research article from one created by AI chatbots such as ChatGPT. In a set of 300 fake and real scientific articles, the AI-based tool, called ‘xFakeSci’, detected up to 94 percent of the fakes.

This was almost double the success rate seen among the most common data mining techniques, said the authors from the State University of New York, US, and Hefei University of Technology, China.

“…introducing xFakeSci, a novel learning algorithm which is able to distinguish articles generated by ChatGPT from publications produced by scientists,” they wrote in the study published in the journal Scientific Reports.

To develop the AI-based algorithm, the researchers created two separate datasets. One contained nearly 4,000 scientific articles extracted from PubMed, an open database hosting biomedical and life sciences research articles maintained by the U.S. National Institutes of Health.
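The article does not say how the PubMed extraction was carried out. Purely as an illustration, abstracts matching a keyword query can be pulled with NCBI's public E-utilities API, as in the Python sketch below; the query string and result count are placeholders, not the authors' actual settings.

```python
# Illustrative sketch only: fetching PubMed abstracts via NCBI E-utilities.
# The query term and retmax value are assumptions, not the study's settings.
import requests

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def fetch_pubmed_abstracts(query: str, max_results: int = 100) -> str:
    """Search PubMed for `query` and return the matching abstracts as plain text."""
    # Step 1: esearch returns the PubMed IDs (PMIDs) matching the query.
    search = requests.get(
        f"{EUTILS}/esearch.fcgi",
        params={"db": "pubmed", "term": query,
                "retmax": max_results, "retmode": "json"},
        timeout=30,
    )
    search.raise_for_status()
    pmids = search.json()["esearchresult"]["idlist"]
    if not pmids:
        return ""

    # Step 2: efetch returns the abstracts for those PMIDs as plain text.
    fetch = requests.get(
        f"{EUTILS}/efetch.fcgi",
        params={"db": "pubmed", "id": ",".join(pmids),
                "rettype": "abstract", "retmode": "text"},
        timeout=30,
    )
    fetch.raise_for_status()
    return fetch.text

if __name__ == "__main__":
    # Example query on one of the study's topics.
    print(fetch_pubmed_abstracts("alzheimer's disease", max_results=5)[:1000])
```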

The other consisted of 300 fake articles, which the researchers created using ChatGPT.
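Beyond stating that the fakes were created "using ChatGPT", the article gives no details of the generation step. One plausible route, sketched below with the OpenAI Python client, would be to prompt a ChatGPT-family model for abstracts on each topic; the model name and prompt are assumptions, not the authors' actual setup.

```python
# Hypothetical sketch of generating a fake abstract with the OpenAI Python
# client (openai >= 1.0). Model name and prompt are illustrative assumptions;
# the study only says the articles were created with ChatGPT.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def generate_fake_abstract(topic: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumption: any ChatGPT-family model
        messages=[{
            "role": "user",
            "content": f"Write a 200-word scientific abstract about {topic}.",
        }],
    )
    return response.choices[0].message.content

print(generate_fake_abstract("Alzheimer's disease"))
```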

“I tried to use exactly the same keywords that I used to extract the literature from the PubMed database, so that we would have a common basis for comparison. My intuition told me that there must be a pattern exhibited in the fake world compared to the real world, but I had no idea what that pattern was,” said study co-author Ahmed Abdeen Hamed, a visiting researcher at the State University of New York.

Of the 300 fake articles, 100 related to each of three diseases: Alzheimer’s disease, cancer, and depression. Each test set of 100 included 50 articles created by the chatbot and 50 authentic abstracts extracted from PubMed. The xFakeSci algorithm was trained on the first dataset of scientific articles, and its performance was then tested on the second.

“The xFakeSci algorithm achieved (accuracy) scores ranging from 80 to 94%, outperforming common data mining algorithms, which achieved (accuracy) values between 38 and 52%,” the authors wrote.

xFakeSci was programmed to analyze two main features in the fake articles, according to the authors.

One was the number of bigrams, pairs of words that often appear together, such as “climate change,” “clinical trials” or “biomedical literature.” The second was how those bigrams linked to other words and concepts in the text, they said.

“The first thing that struck me was that in the fake world there were very few bigrams, but in the real world there were many more. Also, in the fake world, even though there were very few bigrams, they were very connected to everything else,” Hamed said.
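As a rough illustration of those two features, and not the actual xFakeSci scoring function described in the paper, the sketch below counts a document's distinct bigrams and estimates how widely the words in those bigrams connect to neighbouring words in the text.

```python
# Minimal sketch of the two features described above: how many distinct
# bigrams a document contains, and how connected the words in those bigrams
# are to other words. An illustration of the idea only, not xFakeSci itself.
from collections import Counter
import re

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())

def bigram_features(text: str) -> tuple[int, float]:
    words = tokenize(text)
    bigrams = Counter(zip(words, words[1:]))

    # Feature 1: number of distinct bigrams in the document.
    n_bigrams = len(bigrams)

    # Feature 2: a crude "connectivity" proxy. For each word that appears in
    # any bigram, count how many distinct neighbouring words it links to,
    # then average. Few bigrams whose words link to many other words would
    # push this value up, as Hamed describes for the fake articles.
    neighbours: dict[str, set[str]] = {}
    for (a, b) in bigrams:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    connectivity = (
        sum(len(v) for v in neighbours.values()) / len(neighbours)
        if neighbours else 0.0
    )
    return n_bigrams, connectivity

# Example usage on a toy sentence.
print(bigram_features(
    "Climate change drives new clinical trials in the biomedical literature."
))
```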

The authors proposed that the writing style adopted by an AI differs from that of a human researcher because the two do not have the same goals when producing a paper on a given topic.

“As ChatGPT still has limited knowledge, it tries to convince you by using the most meaningful words,” Hamed said.

“It is not the job of a scientist to present convincing arguments. A real research paper honestly reports what happened during an experiment and the method used. ChatGPT focuses on the depth of a single point, while real science focuses on breadth,” Hamed said.

