Following a post in Nodal Point, I found this paper:
Read before you cite by Simkin et al.
We report a method of estimating what percentage of people who cited a paper had actually read it. The method is based on a stochastic modeling of the citation process that explains empirical studies of misprint distributions in citations (which we show follows a Zipf law). Our estimate is only about 20% of citers read the original.
For example, an interesting statistic revealed in our study is that a lot of misprints are identical. Consider, for example, a 4-digit page number with one digit misprinted. There can be 10^4 such misprints. The probability of repeating someone else’s misprint accidentally is 10^-4. There should be almost no repeat misprints by coincidence. One concludes that repeat misprints are due to copying someone else’s reference, without reading the paper in question.
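To get a feel for how unlikely coincidental repeats are, here is a minimal Python sketch of the arithmetic. It is not the authors' model. It takes the quoted figure of 10^4 possible misprints at face value and assumes, for illustration only, that each misprint is made independently and uniformly at random. The function names are my own.

```python
from math import comb


def expected_coincident_pairs(n_misprints: int, n_possible: int = 10**4) -> float:
    """Expected number of pairs of citations sharing the same misprint.

    Assumption: misprints are independent and uniform over n_possible values
    (10^4 is the figure quoted in the text, not something derived here).
    """
    return comb(n_misprints, 2) / n_possible


def prob_any_repeat(n_misprints: int, n_possible: int = 10**4) -> float:
    """Probability that at least two independent misprints coincide.

    This is the classic birthday-problem calculation.
    """
    p_all_distinct = 1.0
    for k in range(n_misprints):
        # The k-th misprint must avoid the k values already taken.
        p_all_distinct *= (n_possible - k) / n_possible
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (2, 10, 50):
        print(n, expected_coincident_pairs(n), prob_any_repeat(n))
```

With two misprinted citations the chance of an accidental match comes out at 10^-4, the figure in the quote. Under these assumptions, many citations sharing the same misprint are hard to explain by chance, which is the paper's argument for copying.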
Picture from PHD Comics