GPTZero is a tool designed to detect writing generated by ChatGPT, the AI writing tool that debuted in November 2022 and sent shockwaves through the education system due to its ability to instantly generate human-seeming text in response to prompts.
GPTZero was created by Edward Tian, a senior at Princeton University who majors in computer science and minors in journalism. GPTZero is available for free to teachers and others, and can detect work generated by ChatGPT more than 98 percent of the time, Tian tells Tech & Learning. The tool is one of several new detection tools that have emerged since the release of ChatGPT.
Tian shares how he created GPTZero, how it works, and how teachers can use it to prevent cheating with ChatGPT in their classes.
What is GPTZero?
Tian was inspired to create GPTZero after ChatGPT was released and he, like many others, saw the potential the technology had to aid student cheating. “I think this technology is the future. AI is here to stay,” he says. “But at the same time, we have to build the safeguards so that these new technologies are adopted responsibly.”
Prior to the release of ChatGPT, Tian’s thesis had focused on detecting AI-generated language, and he worked at Princeton’s Natural Language Processing Lab. When winter break hit, Tian found himself with lots of free time and started coding on his laptop in coffee shops to see if he could build an effective ChatGPT detector. “I was like, why don’t I just build this out and see if the world can use it.”
The world has been very interested in using it. Tian has been featured on NPR and in other national publications. More than 20,000 educators from across the globe, from K-12 to higher ed, have signed up to receive updates about GPTZero.
How Does GPTZero Work?
GPTZero detects AI-generated text by measuring two properties of text called “perplexity” and “burstiness.”
“Perplexity is a measurement of randomness,” Tian says. “It’s a measurement of how random or how familiar a text is to a language model. So if a piece of text is very random, or chaotic, or unfamiliar to a language model, if it’s very perplexing to this language model, then it’s going to have high perplexity, and it’s more likely to be human generated.”
On the other hand, text that is very familiar and has likely been seen by the AI language model before will not be perplexing to it and is more likely to have been AI-generated.
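To make the idea concrete, here is a minimal sketch of a perplexity calculation. It is not GPTZero's implementation, which the article does not describe. It assumes a toy word-frequency ("unigram") model with add-one smoothing standing in for a real language model, and the function names and sample sentences are hypothetical. Perplexity is computed in the standard way, as the exponential of the average negative log-probability the model assigns to each word.

```python
import math
from collections import Counter


def train_unigram(corpus):
    """Count word frequencies in a reference corpus (a toy stand-in for a language model)."""
    tokens = corpus.lower().split()
    return Counter(tokens), len(tokens)


def perplexity(text, counts, total):
    """Return exp of the average negative log-probability per word.

    Lower values mean the text is more familiar to the model.
    """
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one word")
    vocab_size = len(counts) + 1  # one extra slot for words the model has never seen
    log_prob_sum = 0.0
    for tok in tokens:
        # Add-one smoothing so unseen words get a small, non-zero probability.
        p = (counts[tok] + 1) / (total + vocab_size)
        log_prob_sum += math.log(p)
    return math.exp(-log_prob_sum / len(tokens))


# Hypothetical reference text the toy model has "seen".
reference = "the cat sat on the mat the dog sat on the rug"
counts, total = train_unigram(reference)

familiar = perplexity("the cat sat on the mat", counts, total)
unfamiliar = perplexity("quantum marmalade juggles turquoise accordions", counts, total)
print(familiar < unfamiliar)  # the familiar sentence is less perplexing to the model
```

A real detector would swap the toy model for a large neural language model, but the principle is the same: text the model finds predictable scores low, and text it finds surprising scores high.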
“Burstiness” refers to how much sentences vary across a text. Humans tend to vary their sentence length and complexity and write in “bursts,” while AI language models are more consistent. This can be seen if you create a chart looking at sentence variability. “For a human essay, it will vary all over the place. It will go up and down,” Tian says. “There’ll be sudden bursts and spikes, versus for a machine essay, it will be pretty boring. It will have a constant baseline.”
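One rough way to picture burstiness is to measure how much sentence length varies within a text. The sketch below is an illustration only and not GPTZero's actual measure, which the article does not specify. It assumes sentence length in words as a proxy for variability, splits sentences naively on punctuation, and uses hypothetical function names and sample passages.

```python
import re
import statistics


def sentence_lengths(text):
    """Split on ., ! and ? (a naive splitter) and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Return the standard deviation of sentence lengths divided by their mean.

    0.0 means every sentence is the same length; larger values mean more variation.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


# Hypothetical passages: one with uneven sentences, one with a "constant baseline".
varied = (
    "I ran. Then, after a long and frankly exhausting afternoon of errands, "
    "I finally sat down. Quiet."
)
uniform = "The cat sat on the mat. The dog lay on the rug. The bird sat in the tree."

print(sentence_lengths(uniform))                 # every sentence has the same word count
print(burstiness(varied) > burstiness(uniform))  # the varied passage scores higher
```

Plotting the output of `sentence_lengths` for an essay would give the kind of chart Tian describes, with spikes for uneven writing and a flat line for uniform writing.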
How Can Educators Use GPTZero?
The free pilot version of GPTZero is available to all educators on the GPTZero website. “The current model has a false-positive rate of less than 2 percent,” Tian says.
However, he cautions educators not to treat its results as proof positive that a student has used AI to cheat. “I don’t want anybody making definitive decisions. This is something I built out over holiday break,” he says of the tool.
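A quick back-of-the-envelope calculation shows why that caution matters even with a low error rate. The sketch below assumes a false-positive rate of exactly 2 percent, the upper bound Tian cites, and a hypothetical batch of 150 essays that were all written by students. It also assumes each essay is flagged independently. None of these numbers come from GPTZero itself.

```python
def expected_false_flags(num_human_essays, false_positive_rate):
    """Average number of human-written essays wrongly flagged as AI-generated."""
    return num_human_essays * false_positive_rate


def prob_at_least_one_false_flag(num_human_essays, false_positive_rate):
    """Chance that at least one human-written essay is wrongly flagged,
    assuming each essay is judged independently."""
    return 1 - (1 - false_positive_rate) ** num_human_essays


# Hypothetical numbers: 150 honest essays, 2 percent false-positive rate.
essays = 150
rate = 0.02

print(expected_false_flags(essays, rate))          # about 3 essays on average
print(prob_at_least_one_false_flag(essays, rate))  # better than a 90 percent chance
```

Under these assumptions, a teacher checking 150 honest essays should expect a few to be flagged by mistake, which is why a flag is better treated as a reason for a conversation than as a verdict.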
The technology also has limitations. For instance, it’s not designed to detect a mix of AI- and human-generated text. Educators can sign up to be put on an email list for updates about the next version of the technology, which will be able to highlight the portions of a text that seem to have been generated by AI. “That’s helpful because I don’t think anybody’s going to copy the entire essay off ChatGPT, but people might mix portions in,” he says.
Can GPTZero Keep Up With ChatGPT As The Technology Improves?
Even as ChatGPT and other AI language models improve, Tian is confident that GPTZero and other AI-detecting software will keep pace. “Training a detection model is so much easier than training one of these gigantic large language models. It’s millions and millions of dollars to train one of these gigantic large language models,” he says. In other words, ChatGPT could not have been created over winter break in coffee shops with free Wi-Fi, as GPTZero was.
As a journalism minor and lover of human writing, Tian is equally confident that the human touch in writing will remain valuable in the future.
“These language models are just ingesting gigantic portions of the internet and regurgitating patterns, and they’re not coming up with anything really original,” he says. “So being able to write originally will remain an important skill.”