For nearly an entire semester last year, a student enrolled in an online master's-level course in health administration at a university in South Carolina did remarkably well, participating in class discussion boards, contributing to live online seminars, and earning very high marks on written work and quizzes.
But it was not a student at all. It was an AI chatbot – ChatGPT (GPT-4) – surreptitiously enrolled in the course as part of a test by academic researchers. They wanted to see whether a chatbot could do graduate-level coursework, and whether the work of a chatbot would be noticed or caught by anyone.
According to their paper on the experiment, ChatGPT can indeed do the work, and quite well it turns out. And no, no one noticed. The work submitted by the generative AI software was undetected.
That paper is by Kenneth R. Deans Jr., Jami Jones, Jillian B. Harvey, and Daniel Brinton. Deans is affiliated with Health Sciences South Carolina; the others are with the Medical University of South Carolina.
The team stresses that they did not enhance the output of ChatGPT in any way, aside from checking its grammar, checking for plagiarism, and verifying its citations. And, according to the paper, “AI’s final grade is 99.36, which is higher than both the cohort’s mean (97.70) and median (98.53). This places AI’s performance near the top of the class.”
The bot got an A.
Depending on your views on AI and the state of higher education, that could be distressing or not surprising at all, or both. Either way, it’s an outcome that cries out for attention on the part of the university.
One reason is that the work of ChatGPT was entirely missed by the professor and others responsible for the course and program. Though the authors of the paper did not say so directly, the chatbot's coursework likely went unnoticed because the school or professor did not use AI-detection technology to screen for it. The authors wrote, “During the study, no specific AI-detection platforms were disclosed or verified to be in use by the institution.” That leaves detecting AI work to teachers, and as other studies have shown, people are not very good at spotting AI text on their own – not even professors.
In other words, if spotting AI-created text is important – and in a classroom it ought to be – machines are far better and far more accurate at it than people. Not using existing technology is like a doctor making a diagnosis without ordering an X-ray or MRI. You can do it. But you probably should not.
Seeing the work of their pseudo-student go undetected, and acknowledging how unchecked AI use undercuts and devalues the course and the degree, the paper’s authors repeatedly called for “enhanced integrity protocols,” and “improved detection frameworks,” and “proactive measures … such as developing more sophisticated AI-detection algorithms.”
They are right. If a chatbot can get an A, then any student using a chatbot can get the same A without learning anything at all. And if the school does not stop it – and by and large most schools do not – then students will do exactly that. In fact, we know they are.
As big a problem as cheating with AI has become, this new research also raises a bigger, potentially even more serious issue – the quality of the education itself, the rigor of what we seem to be asking of students in this online, graduate-level course.
It’s not that ChatGPT got an A in the course, though that is troubling. It’s that, according to the results in the class, “the cohort’s mean [grade was] (97.70) and median [was] (98.53).”
That means that in this online master's degree course, the average grade was nearly 98%. The median grade – the point at which half the students scored at or above and half scored at or below – was 98.53%. That is to say, half the actual students in this class scored a 98.5% or better. It also means that unless the class was very large, no one failed, or even got a C: a mean of 97.70 leaves an average deficit of just 2.3 points per student, so even a single failing score would have to be offset by a roomful of near-perfect ones.
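The arithmetic behind that claim is easy to check. A minimal sketch, assuming the most forgiving case – one student with a low score and everyone else scoring a perfect 100 (the individual scores are hypothetical; the paper reports only the mean and median):

```python
import math

def min_class_size(low_score: float, target_mean: float, max_score: float = 100.0) -> int:
    """Smallest class size n in which one student scoring `low_score`,
    with everyone else at `max_score`, still yields a mean >= `target_mean`.

    Mean = (low + max*(n-1)) / n >= target  =>  n >= (max - low) / (max - target)
    """
    return math.ceil((max_score - low_score) / (max_score - target_mean))

# One failing grade of 60 against the reported class mean of 97.70:
print(min_class_size(60, 97.70))  # -> 18: one failure needs 17 perfect scores to offset
# An outright zero:
print(min_class_size(0, 97.70))   # -> 44
```

A mean of 97.70 leaves only 2.3 "missing" points per student on average, which is why a class mean this high leaves almost no room for low scores in a course of ordinary size.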
That’s more than troubling, it’s nearly incomprehensible.
There are really only three plausible reasons for such an obviously indefensible grade distribution.
One is that the course is just elementary easy – show-up-and-get-your-A easy. Most people assume that's how online courses are. For a graduate-level course at what we assume is a public university, that's highly embarrassing.
Another possibility is that the course is compromised, and fraud is rampant. In an age when academic cheating is a global, multi-billion-dollar business, answers are easy and everywhere. ChatGPT, in fact, has made them free. In online courses especially, cheating is nearly ubiquitous. If a program or professor is not actively deterring academic misconduct, it will flourish. A complete failure to identify a chatbot posing as a student is a clear sign of inattention to the issue. So too is a massive cluster of nearly perfect scores. In this case, we see both.
On this point, since ChatGPT “earned” a high grade in this class and was not caught, we may at least ask whether other students pulled the same trick – having ChatGPT or another bot write all their answers and take all their tests. We cannot know. But it’s probably very safe to assume that ChatGPT was pretty busy in this course, as it is in thousands of online, low supervision courses just like it.
The other explanation is grade inflation, the general reluctance of educators to give bad grades. That’s cultural. But in education, it’s also a symptom of increasing transactionalism – the idea that students are customers and colleges are selling products.
It’s likely the grade distribution in question is the result of some measure of all three.
Regardless of the reason, however, the outcome is the same – receiving an A in this particular class means nothing. It's one thing to say a chatbot got an A. It's another thing to say that absolutely everyone did. Both are pretty bad. And there is no chance that a university will survive if its courses and degrees are so easy to compromise, fake, or impersonate. Or if they're just this easy.