We compared ChatGPT with tools that detect AI-written text, and the results are disturbing

As the “chatbot wars” rage in Silicon Valley, the proliferation of artificial intelligence (AI) tools specifically designed to generate human-like text has stumped many.

Educators, in particular, are scrambling to adapt to the availability of software that can produce a moderately competent essay on any topic at a moment’s notice. Should we go back to pen-and-paper assessments? Increase exam supervision? Ban the use of AI altogether?

All this and more has been suggested. However, none of these less-than-ideal measures would be needed if educators could reliably distinguish AI-generated text from human-written text.

We looked at several proposed methods and tools to recognize AI-generated text. None of them are foolproof, all are vulnerable to workarounds, and they’re unlikely to ever be as reliable as we’d like them to be.

You might be wondering why the world’s leading AI companies can’t reliably distinguish the products of their own machines from the work of humans. The reason is ridiculously simple: the corporate mission in today’s high-stakes AI arms race is to train natural language processing (NLP) AIs to produce output as similar to human writing as possible. The public demand for an easy way to detect such AIs in the wild may seem paradoxical, as if it misses the very point of these programs.

A mediocre effort

OpenAI – the creators of ChatGPT – introduced a “classifier for indicating AI-written text” at the end of January.

The classifier was trained on both external AIs and the company’s own text generators. In theory, this means it should be able to flag essays created by BLOOM AI or similar, not just those created by ChatGPT.

At best, we give this classifier a grade of C–. OpenAI admits it correctly identifies only 26% of AI-generated text (true positives), while mislabeling human prose as AI-generated (false positives) 9% of the time.

OpenAI has not shared its research on the rate at which AI-generated text is wrongly flagged as human-written (false negatives).
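For readers unfamiliar with these terms, here is a minimal sketch, using invented toy labels rather than OpenAI’s evaluation data, of how such rates are computed. (Note that OpenAI’s classifier can also return an “unclear” verdict, so its published figures aren’t simple complements of each other.)

```python
# Toy sketch of the detection metrics quoted above; the example labels are
# invented, not OpenAI's evaluation data.
def detection_rates(labels, predictions):
    """labels/predictions: 1 = AI-generated, 0 = human-written."""
    tp = sum(l == 1 and p == 1 for l, p in zip(labels, predictions))
    fn = sum(l == 1 and p == 0 for l, p in zip(labels, predictions))
    fp = sum(l == 0 and p == 1 for l, p in zip(labels, predictions))
    tn = sum(l == 0 and p == 0 for l, p in zip(labels, predictions))
    tpr = tp / (tp + fn)  # share of AI text correctly flagged (26% for OpenAI's classifier)
    fpr = fp / (fp + tn)  # share of human text wrongly flagged as AI (9%)
    return tpr, fpr

# Example: four AI-written texts (1) and four human-written texts (0)
print(detection_rates([1, 1, 1, 1, 0, 0, 0, 0],
                      [1, 0, 0, 0, 1, 0, 0, 0]))  # -> (0.25, 0.25)
```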

A promising contender

One promising candidate is a classifier developed by a Princeton University student during his Christmas break.

Edward Tian, a computer science student with a minor in journalism, released the first version of GPTZero in January.

This app identifies AI authorship based on two factors: perplexity and burstiness. Perplexity measures how complex a text is, while burstiness compares the variation between sentences. The lower the values for these two factors, the more likely it is that a text was produced by an AI.
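To make the two signals concrete, here is a toy Python sketch. GPTZero scores tokens with a large language model; the crude unigram model below, built from the input text itself, is only a stand-in to illustrate the definitions, not Tian’s actual method.

```python
import math
from collections import Counter

def unigram_perplexity(words, probs):
    """exp of the average negative log-probability per word."""
    nll = -sum(math.log(probs[w]) for w in words) / len(words)
    return math.exp(nll)

def perplexity_and_burstiness(text):
    sentences = [s.split() for s in text.split(".") if s.split()]
    all_words = [w for s in sentences for w in s]
    probs = {w: c / len(all_words) for w, c in Counter(all_words).items()}
    per_sentence = [unigram_perplexity(s, probs) for s in sentences]
    mean = sum(per_sentence) / len(per_sentence)
    # "Burstiness" here: how much the sentence-level scores vary. Uniform,
    # evenly predictable sentences (typical of AI text) give a low value.
    variation = math.sqrt(sum((p - mean) ** 2 for p in per_sentence) / len(per_sentence))
    return mean, variation

print(perplexity_and_burstiness(
    "Justice is fairness. Justice is fairness. Courts weigh evidence impartially."))
```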

We pitted this humble David against ChatGPT’s Goliath.

First, we asked ChatGPT to create a short essay on justice. Next, we copied the essay – unchanged – into GPTZero. Tian’s tool correctly determined that the text was likely written entirely by an AI, since its average perplexity and burstiness scores were very low.

GPTZero measures the perplexity and variation within a text to determine whether it was likely created by AI. GPTZero

Fooling the classifiers

An easy way to fool AI classifiers is to simply replace a few words with synonyms. Websites offering tools that rewrite AI-generated text for this purpose are already popping up all over the internet.
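A minimal sketch of the trick, assuming a small hand-written synonym table (real rewriting sites presumably rely on a thesaurus or another model):

```python
import random

# Hypothetical synonym table; real tools draw on a thesaurus or another model.
SYNONYMS = {
    "justice": ["fairness", "equity"],
    "important": ["significant", "crucial"],
    "society": ["community", "the public"],
}

def synonym_scramble(text, rate=0.5):
    """Swap each replaceable word for a synonym with probability `rate`.
    Punctuation handling is deliberately crude for this sketch."""
    out = []
    for word in text.split():
        key = word.lower().strip(".,")
        if key in SYNONYMS and random.random() < rate:
            out.append(random.choice(SYNONYMS[key]))
        else:
            out.append(word)
    return " ".join(out)

print(synonym_scramble("Justice is important to society"))
```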

Many of these tools display their own tell-tale signs of AI, such as peppering human prose with “tortured phrases” (for example, using “counterfeit consciousness” instead of “AI”).

To further test GPTZero, we copied ChatGPT’s justice essay into GPT-Minus1 – a website that offers to “scramble” ChatGPT text with synonyms. The image on the left shows the original essay; the image on the right shows GPT-Minus1’s changes. About 14% of the text was altered.
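GPT-Minus1 doesn’t document how it measures this, but a word-level diff ratio, as in the hypothetical sketch below, yields a comparable kind of figure:

```python
import difflib

def percent_changed(original: str, rewritten: str) -> float:
    """Rough share of words altered, via a word-level diff."""
    matcher = difflib.SequenceMatcher(None, original.split(), rewritten.split())
    return round((1 - matcher.ratio()) * 100, 1)

print(percent_changed("Justice requires that people are treated fairly",
                      "Justice demands that individuals are treated fairly"))  # ~28.6
```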

GPT-Minus1 makes small changes to a text to make it look less AI-generated. GPT-Minus1

We then copied the GPT-Minus1 version of the justice essay back into GPTZero. Its verdict?

Your text is most likely human-written, but there are a few sentences with low perplexity.

It highlighted only one sentence it judged to have a high probability of being written by an AI (see image below left), and reported the essay’s overall perplexity and burstiness scores, which were much higher (see image below right).

Running AI-generated text through a scrambling tool makes it appear more “human”. GPTZero

Tools like Tian’s show promise, but they’re imperfect and prone to workarounds. For example, a recent YouTube tutorial explains how to prompt ChatGPT to produce text with high levels of – you guessed it – perplexity and burstiness.

Watermarking

Another proposal is for AI-written text to include a “watermark” that is invisible to human readers but detectable by software.

Natural language models work word by word. They choose which word to generate based on statistical probabilities.

However, they don’t always choose the word with the highest probability. Instead, they randomly choose one from a list of likely words (with higher-probability words more likely to be picked).

This explains why users get a different output every time they generate text using the same prompt.
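A minimal sketch of that word-by-word sampling; the toy vocabulary is invented, with two of the probabilities borrowed from the Playground example described below:

```python
import random

# Toy next-word distribution: "equality" and "moral" use the Playground
# probabilities quoted below; the other entries are invented.
NEXT_WORD_PROBS = {"equality": 36.84, "fairness": 21.0, "law": 12.0, "moral": 2.45}

def pick_next_word(probs):
    """Sample one word, weighted by probability, instead of always taking the top choice."""
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Repeated calls can return different words, which is why the same prompt
# produces different text each run.
print([pick_next_word(NEXT_WORD_PROBS) for _ in range(5)])
```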

One of OpenAI’s natural language model interfaces (Playground) lets users see the probability of selected words. In a screenshot captured on February 1, 2023, the probability of the term “moral” being selected was 2.45%, much lower than “equality” at 36.84%. OpenAI Playground

Simply put, watermarking involves “blacklisting” some of the likely words and allowing the AI to pick only from a “whitelist”. Since human-written text is likely to contain “blacklisted” words, this could allow it to be distinguished from AI-generated text.
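Here is a minimal sketch of the idea; the vocabulary is a toy stand-in, and seeding the whitelist from the previous word follows one published research proposal rather than any deployed system:

```python
import hashlib
import random

# Toy vocabulary; a real model's vocabulary has tens of thousands of tokens.
VOCAB = ["justice", "fairness", "law", "equality", "society", "moral", "rights", "people"]

def whitelist(prev_word, fraction=0.5):
    """Deterministically split the vocabulary, seeded by the previous word."""
    seed = int(hashlib.sha256(prev_word.encode()).hexdigest(), 16)
    shuffled = VOCAB[:]
    random.Random(seed).shuffle(shuffled)
    return set(shuffled[: int(len(VOCAB) * fraction)])

def watermark_score(words):
    """Share of words that fall on their whitelist: an AI constrained to the
    whitelist scores near 1.0, while human text hovers near `fraction` by chance."""
    hits = sum(w in whitelist(prev) for prev, w in zip(words, words[1:]))
    return hits / max(len(words) - 1, 1)

print(watermark_score("justice requires fairness and equality".split()))
```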

However, watermarking also has limitations. The quality of AI-generated text could suffer if its vocabulary is restricted. And each text generator would probably use a different watermarking scheme, so a text would need to be checked against all of them.

Watermarks could also be circumvented by paraphrasing tools that could insert blacklisted words or rephrase essay questions.

An ongoing arms race

AI-generated text detectors are becoming more sophisticated. Anti-plagiarism service Turnitin recently announced an upcoming AI writing detector with a claimed accuracy of 97%.

But text generators are also becoming more sophisticated. Google’s ChatGPT competitor Bard is in early public testing, and OpenAI itself is expected to release a major update, GPT-4, later this year.

It will never be possible to make AI text detectors perfect, as even OpenAI acknowledges, and there will always be new ways to fool them.

As this arms race continues, we may see the rise of “contract paraphrasing”: instead of paying someone to write your assignment, you pay someone to rework your AI-generated assignment to get it past the detectors.

There are no easy answers for educators here. Technical fixes can be part of the solution, but so can new teaching and assessment methods (which may include harnessing the power of AI).

We don’t yet know exactly what that will look like. However, we’ve spent the last year building prototypes of open-source AI tools for education and research to help find a path between old and new – and you can access beta versions at Safe-To-Fail AI.

Source: theconversation.com
