
Statistics of Scientific Fraud

A scientometric study estimating the percentage of fabricated experimental data in the biomedical scientific literature at roughly 5-10%.

An awkward biochemical procedure for presenting data graphically has created a unique trap for unscrupulous researchers. "Absent-minded" fabrication of a certain type of this procedure's output very often results in an "impossible" picture that physically cannot be based on any real data. Such cases seem to be frequent enough to provide statistics for a quantitative measurement of scientific fraud.


Scatchard plot analysis is an extremely popular procedure in many fields of biochemistry, immunology, pharmacology, etc. The procedure takes a saturation isotherm, i.e. the amount of substrate (ligand) Bound to some binder measured as a function of the added (or Total) amount of this substrate (see the left panel of the figure), and redraws it on a very strange coordinate plane with Bound on the X-axis and Bound/Free on the Y-axis, where Free = Total - Bound (see the graph in the center of the figure).
If this is the first time you have heard about Scatchard plots, you should rightly feel that it is rather difficult to understand what experimental data should look like on such a coordinate plane. Indeed, this graphical transformation (which actually does have some meaning) converts a plain and clear presentation of experimental data into something lying absolutely beyond the understanding of most biomedical researchers. One funny consequence is that fabricating a trustworthy Scatchard plot also seems to be too difficult a job for them.
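To make the transformation concrete, here is a minimal Python sketch. The function name scatchard_transform and the sample numbers (a hypothetical one-site binder with Bmax = 10 and Kd = 5 in arbitrary units) are my own assumptions for illustration, not data from any real experiment.

```python
def scatchard_transform(total, bound):
    """Convert saturation-isotherm points (Total, Bound) into Scatchard
    coordinates (X = Bound, Y = Bound/Free), where Free = Total - Bound."""
    points = []
    for t, b in zip(total, bound):
        free = t - b
        if free <= 0:
            raise ValueError("Bound must be smaller than Total")
        points.append((b, b / free))
    return points


# Hypothetical one-site binding data (assumed values, arbitrary units).
# For convenience the isotherm is generated from Free, then Total = Free + Bound.
bmax, kd = 10.0, 5.0
free_values = [1.0, 2.0, 5.0, 10.0, 20.0]
bound_values = [bmax * f / (kd + f) for f in free_values]
total_values = [f + b for f, b in zip(free_values, bound_values)]

for x, y in scatchard_transform(total_values, bound_values):
    print(f"Bound = {x:.3f}   Bound/Free = {y:.3f}")
```

For this idealized one-site model the transformed points fall on a straight line, Bound/Free = (Bmax - Bound)/Kd, which is the "meaning" of the transformation. Real data are noisier and need not follow this model.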

Scientific fraud detection

The picture above illustrates a dreadful pitfall in the process of fabrication.
The left panel represents the typical scenario of two "parallel" binding experiments, i.e. two almost identical experiments giving pairs of experimental points with the same Total concentration and different Bound concentrations. Of course, on the usual coordinate plane this results in pairs of experimental points lying precisely one under another.
On the Scatchard plot, both the X and the Y coordinates are approximately linear functions of the Bound signal, because the Free concentration may usually be considered equivalent to Total. So (perhaps with some effort) you may understand that these pairs should lie approximately on a line drawn through the origin, as shown in the central panel of the figure.
Now let's imagine that you are trying to fabricate a nice-looking Scatchard plot for such an experiment directly, without first fabricating the data on the normal coordinate plot. Obviously, if you are not too cautious, you will almost certainly draw pairs of points lying one under another (right panel of fig. 1).
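The geometry can be checked numerically. The sketch below is only an illustration under my own assumptions (the helper names and the numbers are hypothetical). It relies on the fact that any Scatchard point determines its own Total: since Free = X/Y, Total = X + X/Y. Genuine duplicates must give the same Total, while a vertically stacked pair cannot.

```python
def scatchard_point(total, bound):
    """Scatchard coordinates (X = Bound, Y = Bound/Free) of one measurement."""
    return bound, bound / (total - bound)


def implied_total(x, y):
    """Recover Total from a Scatchard point: Free = X / Y, so Total = X + X / Y."""
    return x + x / y


# Hypothetical duplicate measurements at the same Total = 100 (assumed numbers).
a = scatchard_point(100.0, 4.0)
b = scatchard_point(100.0, 5.0)

# Real duplicates: X and Y grow together, so both points lie close to one
# line through the origin with a slope of roughly 1/Total.
print("slopes from origin:", a[1] / a[0], b[1] / b[0])
print("implied Totals:    ", implied_total(*a), implied_total(*b))

# A "stacked" pair as on the right panel: same X, different Y.
c = (4.0, a[1])
d = (4.0, a[1] * 1.25)
print("stacked pair implies Totals:", implied_total(*c), implied_total(*d))
```

In the stacked pair the two points imply clearly different Total concentrations, so they cannot be duplicates of the same experimental condition.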

In my view, this peculiarity of a Scatchard plot is quite sufficient for an accusation of data fabrication. It is almost impossible to imagine a real dataset that could correspond to such a plot, or at least obtaining one would require enormous and absolutely meaningless effort. I also do not see what sort of "honest" error could lead to this appearance.
Yet a pair of official opinions (just the simplest bureaucratic figures of speech, from the journal Nature and from ORI) do not share my point of view, at least partially.


Of course, this is not the most important type of scientific fraud; it may be called 'absent-minded' data falsification. But it may have interesting applications. After I found a case of this folly in an article in Nature, I clearly understood that it is really a very widespread type of error and that it may provide a unique database for quantitative measurement of scientific fraud.

Indeed, Scatchard plot analysis is an enormously popular procedure in the biomedical sciences; the citation index shows over 1000 citations of Scatchard's original article every year. About 3-4 times more authors use his plot without citing anything. So, mindless browsing of scientific journals and counting of falsified plots may solve two problems:
1. It makes it possible to estimate directly the percentage of fabricated experimental data in the biomedical scientific literature, simply by dividing the number of falsified plots (i.e. those looking like the right panel of fig. 1) by the number of correctly drawn plots of parallel experiments (i.e. those looking like the central panel of fig. 1).
2. Scatchard's article dates from 1949, so it may also be possible to estimate how this percentage has changed over the last 30 years, in order to check the popular opinion that the quality of science has seriously deteriorated today in comparison with its heyday in the sixties.


Apparently, the most convenient way to collect statistics for this study is to browse the Journal of Biological Chemistry: a yearly subscription to this journal usually contains about 200 articles displaying various sorts of Scatchard plots. So a survey of this journal may be quite sufficient for the first goal listed above, though I guess that estimating the historical trend of fraud may require a more thorough study.
So far, I have managed to browse this journal's issues for 1992. Take a look, if interested, at the technical data. The result is that I found 32 correctly drawn "parallel" Scatchard plots and two fabricated plots.
It is not quite correct, but I may also add the above-mentioned case in Nature, estimating that the number of Scatchard plots I have seen accidentally since I started this project is less than that contained in a half-year subscription to J. Biol. Chem.
So, the "initial estimate" based on three cases is that the percentage of fabricated plots is somewhere around 5-10%.
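The arithmetic behind this estimate can be sketched in a few lines of Python. The counts are those given above. The function name is hypothetical, and treating "less than a half-year subscription" as at most 16 additional correct plots is my own reading of the text.

```python
def fabricated_ratio(fabricated, correct):
    """Ratio of fabricated plots to correctly drawn 'parallel' plots."""
    return fabricated / correct


# Counts from the 1992 volume of J. Biol. Chem. reported in the text.
jbc_correct, jbc_fabricated = 32, 2
jbc_ratio = fabricated_ratio(jbc_fabricated, jbc_correct)
print(f"J. Biol. Chem. 1992 alone: {jbc_ratio:.2%}")

# Assumption: the accidental browsing covered at most half a year of
# J. Biol. Chem., i.e. at most 16 additional correct plots.
extra_correct_max = jbc_correct / 2
combined = fabricated_ratio(jbc_fabricated + 1, jbc_correct + extra_correct_max)
print(f"With the Nature case: at least {combined:.2%}")
```

Both figures fall inside the 5-10% range. With only three cases the estimate is necessarily rough.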

This number may be bigger if there is a significant additional number of correctly fabricated Scatchard plots. Yet I don't think such cases exist in reality; in my view, correct falsification requires much more wit from the author than honestly conducting the whole experiment, and therefore I don't see why he would resort to data falsification.


This estimate does not mean that only 5-10% of biomedical researchers fabricate data. Obviously, every research paper usually contains data from several different types of experiments, of which Scatchard plot analysis is just one. So the percentage of articles with a fabricated picture for at least one of them should be several times higher, that is, about 20-30%. Moreover, every researcher participates in writing more than one article during his lifetime.
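The step from 5-10% per plot to 20-30% per article can be illustrated with a small calculation. This is only a sketch under assumptions that are mine, not part of the original estimate: each article contains 3 or 4 types of experiment, every type is fabricated at the same rate, and the cases are independent of each other.

```python
def article_fraction(p_per_experiment, n_experiments):
    """Probability that at least one of n independent experiments in an
    article is fabricated, if each is fabricated with probability p."""
    return 1.0 - (1.0 - p_per_experiment) ** n_experiments


# Assumed inputs: per-experiment rates of 5% and 10%, with 3 or 4 experiment
# types per article (the number of types is an illustrative guess).
for p in (0.05, 0.10):
    for k in (3, 4):
        print(f"p = {p:.0%}, {k} experiment types -> "
              f"{article_fraction(p, k):.1%} of articles")
```

Under these assumptions the result ranges from about 14% to about 34%, which brackets the 20-30% figure.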

Therefore, the conclusion is that:


While dealing with Scatchard plots, I have found at least two other similar indicators of data fabrication. Unfortunately, they cannot serve for numerical estimation; but, being substantially more widespread, they certainly support the impression that my conclusion above is correct.
For a more detailed report, read my article Absurd trivial errors in Scatchard plot analysis. In brief, the first of these indicators is that the scatter of data points around the trend line on most Scatchard plots does not comply with the usual property of physical measurement that small signals are measured with a bigger relative error (CV). Instead, the plots seem to follow the common notion of a "nice looking curve". It is not a conclusive indicator but, as I wrote, it just supports an impression.
The other indicator is again a funny one. There is another type of binding experiment, called an inhibition (or displacement) experiment, which results in, roughly speaking, the same curve turned upside down. The Scatchard transformation cannot be applied to such a curve directly (just as a floppy disk cannot be read if it is inserted upside down). And if an obvious modification is used, it physically cannot result in a "nice looking curve". Nevertheless, very numerous papers present really nice Scatchard plots derived from inhibition curves. I think this is impossible and therefore all these plots were fabricated, but I am not sure. Several times I tried to get explanations from authors about the method of deriving their Scatchard plots but, of course, there were no responses. So, again, this indicator just supports my impression.


In January 2013 I repeated the experiment. The advent of internet technologies has made it much easier to conduct: I just performed a Google Images search for "scatchard plot". The result was quite compatible with the old data: I found another 5 instances of fabricated Scatchard plots (the first article being represented by two almost identical graphs). That is per approximately 50-60 "correct" parallel Scatchard plots. Therefore the estimate of the percentage of fabricated plots is now closer to 10%. Perhaps this is some rise from my estimates made in the mid-nineties. More importantly, this result is now much more statistically valid.
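For readers who want a feeling for the uncertainty of such small counts, here is a rough sketch. It is my own addition rather than part of the original survey. It uses the Wilson score interval for a binomial proportion (a standard textbook formula) and, unlike the ratio used above, takes the fraction of fabricated plots among all counted plots (fabricated plus correct).

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 ~ 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts from the 2013 search: 5 fabricated per roughly 50-60 correct plots.
fabricated = 5
for correct in (50, 60):
    n = fabricated + correct  # assumption: fraction taken over all counted plots
    low, high = wilson_interval(fabricated, n)
    print(f"{fabricated} of {n}: {fabricated / n:.1%} "
          f"(approx. 95% interval {low:.1%} to {high:.1%})")
```

The interval is still wide, but it lies clearly above zero and is compatible with the earlier 5-10% figure.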