When the Handling Editor (me) Receives a ChatGPT Submission
This isn't your college classroom anymore
Many decades in academia, always a reviewer, never an editor. This means I peer-review authors' submissions, but I have never been the one at the higher level: in charge of selecting reviewers, adjudicating their responses, and determining accept/revise/reject.
But, I finally accepted an editing position for one of the big pay-to-play publishing platforms. A few days ago I was offered an editing assignment on an article that looked interesting to me and was squarely in my area of expertise.
I watched the videos about my duties; I read the manuscript eagerly because I liked the topic and the premise was reasonable. This first-ever assignment, where it was my job to pass judgment, drew a severe verdict from me: Reject. Probably fraudulent.
A reasonable premise: bringing the intersection of cultural values and motivation to the realm of foreign language learning
The premise and main findings made sense. Language learning motivation ranges from intrinsic to extrinsic. Intrinsic motivation means learning for pleasure or one's own interest, whereas extrinsic motivation means learning to gain a reward, such as a course grade. Intrinsic motivation tends to lead to higher ultimate proficiency.
The author joined this idea with the cultural distinction between collectivism and individualism. The countries of North America and Europe are regarded as individualistic; countries with strong Confucian or traditional values are considered collectivist. People in individualistic countries tend to be driven by intrinsic interest; those in collectivist countries by extrinsic interest, including the approval of others or the desires of parents or authorities.
The author merged these ideas by predicting that when language learners were acquiring English via a standard classroom learning situation, German foreign language learners would mostly report intrinsic motivations for learning. Chinese learners would report more extrinsic interest. India, a country regarded as having mixed individualism and collectivism, would have intermediate values for intrinsic versus extrinsic motivations for foreign language learning.
The author increased the value of the study by defining four types of interest, forming a continuum running from intrinsic to extrinsic. Results turned out exactly as one would expect, and were even a little too pat. The continuum was in perfect lockstep order, with extrinsic being the most frequent motivation and intrinsic the lowest ranked for the Chinese students. The German learners had the opposite rankings. The Indian learners were flat across the types of interest.
My positive attitude begins to wilt — Bizarre errors
I noted the oddness of the bar graph having no error bars, but it seemed that the appropriate statistics were run. However, as I was reading through the discussion section to get a quick overview of the results, I noticed more and more odd things. The more I looked, the more discrepancies popped up. Here are a few of them. I begin with the two most bizarre and, quite frankly, disturbing ones. These rattled me like those reports of published manuscripts opening with a ChatGPT phrase like, "Certainly, here is a possible introduction."
The discussion section had a paragraph beginning with: "Australian L2 learners, with a more individualistic cultural background, scored highest for the intrinsic and identified regulation." But the rest of the ms did not mention Australia or Australian L2 learners. This group is not mentioned in the method section or in the results figure.
I always want to give whoever I am interacting with the benefit of the doubt. Did the authors originally include Australians but this group was deleted? I found my heart sinking. I started feeling: Uh uh. AI confabulation.
Once alerted, I instantly found another error of this magnitude. The abstract identified the article as "focusing on English and French language learners." But the word French was never mentioned in the body of the ms. The participants were university students studying English at a major university who had learned English for at least 1 year. This extra word in the abstract [arguably the most important paragraph in the article] is a stunning and bizarre mistake.
The other things that started striking me were all cases where the method and results were unrealistically perfect.
Examples of implausible competence
Implausible recruitment
"The sample had an equally balanced distribution according to gender and was composed of students from different university studies." Whoa. Have you ever conducted a research study in three countries, at large but unnamed universities? And you were able to obtain exactly equal gender representation while drawing students from diverse university majors?
Later the author wrote: "A further subsample of 15 men (five from each country) was selected for in-depth interviews." Uh... men? You mean, people, participants, respondents?
You readers will have noticed that I've mentioned "the author" often. Yes, there was a single author. Why ... or rather, how? Three different countries are involved. Three different languages. Materials had to be prepared, translated. Participants remunerated or given course credit at least. The researchers at those institutions who collaborated did not ask for co-author credit??
Implausible time frame
"Data were collected over a six-week period in Spring, 2025." What... Oh, laughable!! And the article was submitted June 2, 2025?? Was this spring in Beijing, where temperatures start climbing the first week of February? Even so, impossible. No one analyzes data and writes a major manuscript that quickly (urr... but ChatGPT is fast, they say).
Remember the 15 interviews that were qualitatively analyzed? The described methodology was as perfect as anything spat out by a Google AI prompt.
Virtuoso but time consuming qualitative methodology
"Using different sources of data (questionnaires, tests, and interviews) in the triangulation enhanced the interpretive validity. Reflexive journaling and methodological transparency were also upheld to minimize researcher bias."
Oh, this hurt. How will the readers get to verify the reflexive journaling? It is never mentioned again. These sentences sound like someone pieced together material on how to do qualitative research and just wrote down that these recommended practices were followed.
Virtuoso study design
Let's start bugging our eyes out at this claimed feat of virtuoso study design:
"120 respondents, 40 from each country, recruited through stratified random sampling."
But these are university students. Who has ever heard of recruiting university students studying English who were somehow identified through a complex method of stratified random sampling? How can one even do stratified random sampling in universities, which by their nature cannot cover all strata of a society?
Oh man.... so that's how people try to publish. You learn enough about a field to imagine a plausible research finding, and then tell ChatGPT to draft a manuscript that would demonstrate that result. And this journal charges $2300 to publish. Wow. Modern publish-or-perish pressures are punishing, indeed.
The motives of this author were hard to understand, as hard as the famous ‘Certainly…’ errors. You are ready to pay $2300 to publish your article, yet you don’t read over your text and notice errors that a human reviewer will notice in the first 15 minutes of reading.
I was browsing 'What actions do journals take if a submitted article contains fraud' when I stumbled upon the reality of authorship for sale. I had no idea this was a thing.
https://retractionwatch.com/2023/09/08/frontiers-retracts-nearly-40-papers-linked-to-authorship-for-sale
Oh man. I imagine the tsunami of this kind of slop is only just getting started