Assessing the use of #icanhazpdf

When researchers, journalists, or other citizens are denied access to (parts of) the scientific literature, alternative access routes come into existence as an act of civil disobedience, also called guerilla Open Access [1]. Access is typically denied when a person does not have a subscription to the journal in which the article appears. As a consequence, the person is presented with a paywall without fully knowing what the paper is worth, which makes it difficult to decide whether to pay for it.

Guerilla Open Access tries to circumvent these paywalls and give users access to the full article. The legality of this can be debated, but the fact remains that forms of guerilla Open Access have emerged in the recent past, which indicates that people are being denied access to the results of scientific research. Moreover, the outrage in the academic community indicates that the morality of the current system is in doubt.

One of the alternative access routes that has come into existence is to request articles from those who do have access, via Twitter. In 2011, Andrea Kuszewski ‘invented’ a catchphrase for Twitter users to request articles that they could not reach because they were behind paywalls [2]. This catchphrase, #icanhazpdf, allows Twitter users to make their request public, after which a reader who can access the article may send it to the requester. In a sense, #icanhazpdf is a peer-to-peer method of accessing scholarly articles. Because the requests carry a hashtag, users who have access can easily search for them, which allows academic articles to be shared amongst readers.

Results

Gardner and Gardner [3] conducted a study into the #icanhazpdf hashtag from the end of April 2014 to the beginning of August 2014. We conducted a similar study, but our data range from 18 August 2015 to 12 December 2015. We used IFTTT (ifttt.co) to collect tweets using the #icanhazpdf hashtag. Because this captured the tweets in real time, it prevented a systematic bias due to #icanhazpdf tweets being deleted by the original poster after the request is fulfilled.

The total number of tweets collected in this period was 9765. We first excluded all retweets (6119). Subsequently, we manually coded all remaining tweets, because we knew some tweets were about the hashtag rather than actual requests for a paper. When we disagreed, we coded the tweet as not a clear request. This occurred in 294 cases [4]. Our results are thus likely a conservative estimate of the use of the #icanhazpdf hashtag.
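The filtering and coding steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this post: the retweet rule (text starting with "RT @"), the function names, and the example data are all assumptions made for the sketch. It also shows how an agreement statistic such as Cohen's kappa [4] is computed from two coders' labels.

```python
from collections import Counter


def is_retweet(text):
    """Assumption: a retweet is recognisable by a leading 'RT @' in its text."""
    return text.startswith("RT @")


def combine_codes(coder_a, coder_b):
    """Conservative rule: a tweet counts as a request only if both coders say so."""
    return [a and b for a, b in zip(coder_a, coder_b)]


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items."""
    n = len(coder_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up example data, not taken from the study.
tweets = [
    "#icanhazpdf doi:10.0000/example1",
    "RT @someone: #icanhazpdf doi:10.0000/example1",
    "Interesting story about #icanhazpdf",
    "#icanhazpdf anyone have this paper?",
]
originals = [t for t in tweets if not is_retweet(t)]

coder_a = [True, True, False, False, True, False]
coder_b = [True, False, False, False, True, False]
requests = combine_codes(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(len(originals), sum(requests), round(kappa, 3))
```

Treating every disagreement as "not a request" can only lower the request count, which is why the resulting estimate is conservative.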

The total number of actual requests in this period was 2121. Plotting these requests over time (see the figure below) shows a clear upward trend.

fig1

Furthermore, a spike is visible at the end of October. We attribute this to news coverage on the BBC [5; its release is marked with a horizontal line in the figure] and other publicity during Open Access week (19 to 25 October 2015). During Open Access week, an average of 24.86 requests were made per day, whereas before Open Access week the average was only 9.71 requests per day.
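For readers who want to reproduce this kind of comparison, a per-day average over a date window could be computed as in the sketch below. The function name and the example dates are hypothetical; only the window boundaries (18 August 2015 as the start of collection, and 19 to 25 October 2015 as Open Access week) come from the text.

```python
from datetime import date


def mean_requests_per_day(request_dates, start, end):
    """Average number of requests per calendar day in [start, end], inclusive."""
    n_days = (end - start).days + 1
    n_requests = sum(start <= d <= end for d in request_dates)
    return n_requests / n_days


# Made-up request dates, one entry per request (not the study data).
request_dates = (
    [date(2015, 10, 10)] * 2
    + [date(2015, 10, 19)] * 3
    + [date(2015, 10, 20)] * 4
)

before = mean_requests_per_day(request_dates, date(2015, 8, 18), date(2015, 10, 18))
during = mean_requests_per_day(request_dates, date(2015, 10, 19), date(2015, 10, 25))
print(round(before, 3), round(during, 3))
```

Dividing by the number of calendar days in the window, rather than by the number of days that happen to contain a request, keeps days without any requests in the average.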

Considering that we collected 9765 tweets, of which only 2121 are requests, the #icanhazpdf feed seems relatively clogged by retweets and the sharing of news stories. However, when we compare our results to those of Gardner and Gardner [3], we notice that the usage of #icanhazpdf has increased. They report a total of 750 requests, while we found 2121 in roughly the same time span. Together with the figure above, this suggests that the use of #icanhazpdf is on the rise.
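The counts reported in this post can be cross-checked with some simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are ours, and it assumes the 750 reported by Gardner and Gardner [3] refers to requests.

```python
total_tweets = 9765      # all collected tweets
retweets = 6119          # excluded before coding
requests = 2121          # tweets both coders marked as requests
gardner_requests = 750   # assumption: Gardner and Gardner's figure counts requests

non_retweets = total_tweets - retweets
share_of_all = requests / total_tweets            # requests among all tweets
share_of_non_retweets = requests / non_retweets   # requests among coded tweets
growth = requests / gardner_requests              # relative to the 2014 study

print(non_retweets)
print(f"{share_of_all:.1%} of all tweets are requests")
print(f"{share_of_non_retweets:.1%} of non-retweets are requests")
print(f"{growth:.2f} times the 2014 count")
```

Note that the two studies cover similar but not identical time spans, so the last ratio is only a rough comparison.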

However, when we fit a loess curve to our data, we see that the media attention led to a short-lived increase in usage, which seems to fade back to normal usage during December.

fig2
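To give an idea of what a loess curve does, here is a simplified sketch: for every point, it fits a straight line to the nearest neighbours, weighted by a tricube kernel, and takes the fitted value at that point. The post does not state which software or settings produced the figure, so the function below, its `span` parameter, and the toy data are assumptions for illustration only and will not reproduce the figure exactly.

```python
def loess(xs, ys, span=0.75):
    """Simplified loess: local linear fits with tricube weights.

    span is the fraction of points used for each local fit.
    Returns the smoothed value at every x in xs.
    """
    n = len(xs)
    k = max(2, min(n, int(round(span * n))))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        h = sorted(dists)[k - 1]  # bandwidth: distance to the k-th nearest point
        if h == 0:
            weights = [1.0 if d == 0 else 0.0 for d in dists]
        else:
            weights = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]
        # Weighted least squares for y = a + b * (x - x0); the fit at x0 is a.
        sw = sum(weights)
        swx = sum(w * (x - x0) for w, x in zip(weights, xs))
        swxx = sum(w * (x - x0) ** 2 for w, x in zip(weights, xs))
        swy = sum(w * y for w, y in zip(weights, ys))
        swxy = sum(w * (x - x0) * y for w, x, y in zip(weights, xs, ys))
        denom = sw * swxx - swx ** 2
        if denom == 0:
            fitted.append(swy / sw)  # too few distinct points: weighted mean
        else:
            fitted.append((swxx * swy - swx * swxy) / denom)
    return fitted


# Toy daily counts with a temporary spike (made-up, not the study data).
days = list(range(20))
counts = [10, 9, 11, 10, 12, 10, 11, 25, 27, 24, 26, 12, 11, 10, 12, 11, 10, 9, 11, 10]
smooth = loess(days, counts, span=0.4)
print([round(v, 1) for v in smooth])
```

A smaller `span` follows short spikes more closely, while a larger one averages them out, so the apparent size and duration of a publicity effect depends partly on this choice.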

Thus, the use of #icanhazpdf seems to have increased over time, but this might simply be due to increased publicity during Open Access week. Additionally, only 22% of all tweets with #icanhazpdf are actual requests for research papers. Moreover, the number of requests via #icanhazpdf (i.e., 2121) pales in comparison to the number of papers accessed via Sci-hub (i.e., 217,276; [6]).

P.S. We both support sharing our data openly for verification and reuse. However, we chose to share only anonymized data here [7] because we would like to protect the anonymity of users of the hashtag.

[1] https://archive.org/stream/GuerillaOpenAccessManifesto/Goamjuly2008_djvu.txt
[2] https://twitter.com/AndreaKuszewski/status/28257118322688000
[3] http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2015/Gardner.pdf
[4] Cohen’s kappa = 0.8358917
[5] http://www.bbc.co.uk/programmes/p034vd50
[6] https://twitter.com/Sci_Hub/status/699935939502731268
[7] http://github.com/chartgerink/2015icanhazpdf
