Sep 02 2025
Detecting Online Predatory Journals
The World Wide Web has proven to be a transformative communication technology (we are using it right now). At the same time there have been some rather negative unforeseen consequences. Significantly lowering the threshold for establishing a communications outlet has democratized content creation and given users unprecedented access to information from around the world. But it has also lowered the threshold for unscrupulous agents, allowing for a flood of misinformation, disinformation, low-quality information, spam, and all sorts of cons.
One area where this has been perhaps especially destructive is in scientific publishing. Here we see a classic example of the trade-off dilemma between editorial quality and open access. Scientific publishing is one area where it is easy to see the need for quality control. Science is a collective endeavor where all research is building on prior research. Scientists cite each other’s work, include the work of others in systematic reviews, and use the collective research to make many important decisions – about funding, their own research, investment in technology, and regulations.
When this collective body of scientific research becomes contaminated with either fraudulent or low-quality research, it gums up the whole system. It creates massive inefficiency and adversely affects decision-making. You certainly wouldn’t want your doctor to be making treatment recommendations on fraudulent or poor-quality research. This is why there is a system in place to evaluate research quality – from funding organizations to universities, journal editors, peer reviewers, and the scientific community at large. But this process can have its own biases, and might inhibit legitimate but controversial research. A journal editor might deem research to be of low quality partly because its conclusions conflict with their own research or scientific conclusions.
There is no perfect answer. The best we can do is have multiple checks in the system and to make a carefully calibrated trade-off between various priorities. The system we have is flawed, but it basically works. High quality research tends to gravitate toward high quality journals, which have the highest “impact” on the community. Bad research generally doesn’t replicate well and will tend to get picked apart by experts. A new idea that is having trouble breaking through will tend to break through eventually – the virtue of being correct usually wins out in the end.
But eventually working out, mostly, in the end isn't enough. We also want to know how efficient the whole system is. How quickly are fraud and bad research weeded out? We also want to make sure we are moving in the direction of improved research quality, and that the output of scientific research is translating to our society effectively and efficiently. This means we have to track trends – and one of those trends is the rise of so-called "predatory" journals.
Predatory or similar scientific journals result from basically two things – the ease of creating an online journal because of the web, and the open-access journal business model. The traditional journal model is based on subscriptions and advertising, which benefit from high quality, high profile, and high impact research. This has its trade-offs too, but overall it’s not a bad model. The open-access business model is to charge researchers for publishing their research, then make the results open-access to the world. This has the benefit of making scientific research open to all, and not hidden behind a paywall. But it creates the perverse incentive to publish lots of articles, regardless of quality. In many cases, that is what is happening.
Scientists and academics, once they recognized the issue, have dealt with it by vetting new journals for their process and quality. From that vetting they can create essentially a black list of demonstrably low-quality or even predatory journals (those that will publish, and even solicit, any low-quality study to collect the publication fee, and then publish it online with little or no editorial filter). Or they can create a white list of journals that have passed a thorough vetting process and meet minimum quality standards.
The problem is that this is a lot of work. New predatory journals are easy to create. Once they are identified and blacklisted, the company behind the journal can simply create a new journal with a slightly different name and URL. They are essentially outstripping the ability of academics to evaluate them.
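The vetting logic described above can be sketched in a few lines. This is a minimal illustration only – the journal names are entirely hypothetical, and real vetting involves evaluating editorial practices, not just name lookups:

```python
# Hypothetical black list / white list lookup. All journal names below are
# invented for illustration; real lists (e.g. from academic vetting projects)
# would be far larger and based on editorial-practice audits.
vetted_white_list = {"journal of example science", "annals of placeholder studies"}
known_black_list = {"global open megajournal"}

def vet(journal_name: str) -> str:
    """Classify a journal by name against the two lists."""
    name = journal_name.strip().lower()
    if name in known_black_list:
        return "blacklisted"
    if name in vetted_white_list:
        return "whitelisted"
    # The hard case the post describes: new or renamed journals land here,
    # and each one requires fresh human evaluation.
    return "unvetted"

print(vet("Journal of Example Science"))  # whitelisted
print(vet("Global Open MegaJournal"))     # blacklisted
print(vet("Brand-New Review Letters"))    # unvetted
```

Note how this makes the weakness concrete: a blacklisted publisher that relaunches under a slightly different name simply falls back to "unvetted," which is why a white-list default (trust nothing until vetted) is more robust than chasing an ever-growing black list.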
One attempt to rein in the proliferation of such journals uses a new AI program to screen journals for the probability that they are predatory. The researchers behind this effort published the results of their first search. They screened over 15,000 open-access journals. Just that fact alone is a bit alarming – that is a lot of scientific journals. The AI flagged about 1,300 of them as probably predatory, and then human evaluators looked at those 1,300 and confirmed that about 1,000 of them were predatory – so the AI flagged about 300 false positives. Keep in mind, these thousand journals collectively publish hundreds of thousands of articles each year, which generate millions of citations.
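The arithmetic behind those numbers is worth making explicit. A quick sketch, using only the approximate counts reported above (the derived percentages are my own calculation, not figures from the study):

```python
# Approximate counts from the reported screening results.
screened = 15_000   # open-access journals screened by the AI
flagged = 1_300     # journals the AI flagged as probably predatory
confirmed = 1_000   # flagged journals that human evaluators confirmed

false_positives = flagged - confirmed
precision = confirmed / flagged   # fraction of AI flags that were correct
flag_rate = flagged / screened    # fraction of all journals flagged

print(f"false positives: {false_positives}")  # 300
print(f"precision: {precision:.0%}")          # 77%
print(f"flag rate: {flag_rate:.1%}")          # 8.7%
```

So roughly three out of four AI flags held up under human review. What these numbers don't tell us is the false-negative rate – how many predatory journals the AI failed to flag – which is why the tool is a filter to focus human evaluators, not a replacement for them.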
There are lots of systemic issues at work here, and predatory journals are partly a symptom of these problems. But they significantly exacerbate these issues and are making it impossible for legitimate researchers to keep up. Developing new tools for dealing with this flood of low-quality research is essential. This one tool will not be enough, but perhaps it can help.
I think ultimately, a white list of properly vetted science journals, kept to a stringent standard of quality, is the only solution. This won’t stop other journals from popping up, but at least researchers and those in decision-making positions will be able to know if a piece of science they are relying on has been properly vetted. (Again, no guarantee it is correct, but at least it went through some legitimate process.)
Another aspect of this issue is the communication of science to the public. The existence of large numbers of low-quality journals, easily accessible and shareable online, means that anyone can easily find research to support whatever position they want to take. Further, AI is trained on this flood of low-quality research. This makes it almost impossible to have a conversation about any scientific topic – which tends to devolve into dueling citations. Having open access to scientific studies does not make everyone a scientist, but it makes it easy to pretend that you are.