It has been an interesting week since I posted my story on how Elsevier has hampered my research (I described my research here). This website went down at some point because the server load became too much while the Techdirt story was ranking high on Hacker News.
tl;dr I will not agree to Elsevier’s TDM policy as it harms individual researchers and the impact of research on society.
Since I posted my story, Elsevier has contacted my library stating I can easily continue my research by using their API under their Text and Data Mining (TDM) policy. This usually requires a separate signed agreement with the university, but they have offered to give me a personal API code without such an agreement. Note that all work I would do with the API would still fall under their TDM policy.
The TDM policy directly threatens my academic freedom and as such I will not agree to it. Elsevier describes its policy here and says:
“When researchers have completed their text-mining project through the API, the output can be used for non-commercial purposes under a CC BY-NC license”
which means Elsevier is imposing a license on my research output. This prevents me from publishing in journals such as PeerJ and the PLOS journals because they publish their research articles with a public license that allows for commercial use (i.e., CC0 or CC BY). I consider it my own choice where and how I want to publish.
The imposed non-commercial license also unnecessarily restricts the impact and re-use of my publicly funded research.
First, the distinction between commercial and non-commercial is highly ambiguous and discourages re-use. For instance, if a blogger wants to upload my paper to her personal website, but the blog includes advertisements, would this be considered commercial or non-commercial? It seems to me that this ambiguity would lead some to choose the safe route and not re-use the work at all.
Second, clearly defined commercial entities now cannot distribute my work, while I, the author of the work, want them to be able to. Re-use possibilities are often unforeseen, and I will not forgo these possibilities by having to assign a non-commercial license. For example, if a financial fraud detection company wants to print my research output and use it in a workshop, they would not be able to with a non-commercial license. Academics will still be able to read it, but the impact of research is larger when both non-commercial and commercial entities can use the knowledge to benefit society.
Elsevier is trying to force their API on me using the argument that scraping the website would overload the server. I have shown that the server load need not be large (what I am doing cost only 35 KB/s, which is less than streaming the typical YouTube video). As a commenter on the original post mentioned, the Wikipedia API only “ask[s] that you be considerate and try not to take a site down.” Elsevier could institute a similar non-restrictive policy on screen-scraping.
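To make the server-load point concrete, here is a minimal Python sketch of how a scraper could hold itself to a bandwidth budget like the 35 KB/s mentioned above. It is an illustration only, not the code used for this research: the names `throttle_delay` and `polite_fetch_all`, the caller-supplied `fetch` callback, and the reading of 35 KB/s as 35,000 bytes per second are all assumptions.

```python
import time


def throttle_delay(bytes_sent, elapsed_s, max_bytes_per_s):
    """Return how many seconds to wait so the average rate stays under the cap."""
    if max_bytes_per_s <= 0:
        raise ValueError("max_bytes_per_s must be positive")
    # Minimum time that should have passed for this many bytes at the capped rate.
    min_elapsed = bytes_sent / max_bytes_per_s
    return max(0.0, min_elapsed - elapsed_s)


def polite_fetch_all(urls, fetch, max_bytes_per_s=35_000,
                     clock=time.monotonic, sleep=time.sleep):
    """Fetch each URL in turn, pausing so average bandwidth stays under the cap.

    `fetch` is a hypothetical caller-supplied function that takes a URL and
    returns the response body as bytes. `clock` and `sleep` are injectable
    so the pacing can be checked without real waiting.
    """
    start = clock()
    total_bytes = 0
    pages = []
    for url in urls:
        body = fetch(url)
        total_bytes += len(body)
        pages.append(body)
        # Wait only as long as needed to fall back under the average-rate cap.
        sleep(throttle_delay(total_bytes, clock() - start, max_bytes_per_s))
    return pages
```

With a cap like this, the load on the server is bounded by the scraper itself, no matter how many pages are requested in total.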
Elsevier’s API is also incomplete: for instance, it does not include images, which are vital to my research. As such, what Elsevier is offering me does not allow me to do what I was doing when scraping the webpage directly. The API is simply insufficient, besides imposing a license that threatens academic freedom.
It seems that Elsevier’s TDM policy does not have the researcher’s interests at heart, and I can imagine this is one of the reasons library associations do not agree with Elsevier’s TDM license, for instance LIBER and the Dutch University and Royal Libraries. For the reasons outlined above, I will not agree to Elsevier’s TDM policy as it harms me as a researcher and the impact of research on society.