Monthly Archives: March 2017

Reproducible manuscripts are the future?

This week, a paper that was almost three years in the making finally got published. I feel confident about the paper and the results in it, not because it took three years to write, but because I used a dynamic document to produce it (e.g., Rmarkdown).

Dynamic document? Yes! I no longer had to manually enter all results from the data into tables or the text — the computer did it for me. All I had to do was point it in the right direction. Figures? The same! It saved me tons of time after I made the initial investment to learn how to use it (something else that saved me time was git version control, but that’s for another time).

Why is this important? We are all human, and we make mistakes. And that’s okay! What matters is how we try to remedy those mistakes if they do occur, but even more important, if we can change the way we work in order to prevent some of them, that can help us tremendously. I think dynamic documents like Rmarkdown help us do so.

Markdown is a simple document language, which you can create in any text editor (notepad as well). All it does is standardize how headers are defined (with asterisks, # Header level 1, ## Header level 2, etc.) and how text style is defined (e.g., *text* is italic text). Subsequently, the text file can be converted to pretty much anything (e.g., html, pdf, and even a Word file for those relentless co-authors who love track changes so much).

Rmarkdown takes markdown, and allows you to put R code in between text chunks (which it actually runs!) or even WITHIN the text. Yes, you read that correctly. As such, you can do analyses, make figures, format results (no more manual p-values! statcheck won’t find any errors if you use Rmarkdown) AUTOMATICALLY.

I will just show one exciting and simple aspect, but more step-by-step guides are available (if you want to follow along, install R and Rstudio).

Usually, we tend to type results in the running text ourselves, like such.

Using Rmarkdown to just write a document

Using Rmarkdown to just write a document

As we see, RMarkdown creates a document from a very simple plain text document (this is just markdown doing what it’s supposed to). However, we have a p-value that is calculated based on that t-value and degrees of freedom. So let’s make it dynamic to ensure we have the rounding correct.

Using RMarkdown to generate a document with dynamic results, to ease result presentation

Using RMarkdown to generate a document with dynamic results, to ease result presentation

As we see, the original contained a mistake (p = .027 now turned it into .028) — but Rmarkdown allowed us to catch that by just putting in the R code that generates that p-value and rounds it (i.e., round(pt(q = 1.95, df = 69, lower.tail = FALSE), 3)). No more mistake, and we can be confident. Disclaimer: of course you can still input wrong code — garbage in garbage out!

But this is just a simple example. You can write entire manuscripts in this type of way. That’s what I did for our Collabra manuscript (see here [1]). You can even use citations and alter the citation style without any problem; my experience is that it’s easier with RMarkdown than with EndNote or Mendeley even. All it takes is some initial time investment to learn how to work with it (Markdown can be learned in five minutes) and change your workflow to accomodate this modern approach to writing manuscripts.

The only downside to working this way is that journals don’t accept a raw RMarkdown file as submission, which is too bad — they could link the results directly to the code that produces a result. Now we still end up with a document (e.g., Word file) that hard-codes all results as traditionally was the case. I hope dynamic documents will become more and more widespread in the future, both in how often they’re used by the authors and how publishers support this type of document to truly innovate how scholarly information is communicated and consumed. Image just getting a highlight when you hover over a result, and seeing the underlying code — it would allow you to more directly evaluate the methods in a paper and empower you as a reader to be critical of what you are presented with.

[1] I preferred LaTeX for that project and used Sweave, which is RMarkdown for LaTeX

UPDATE: This blog post has been cross-posted on both the eLife innovation blog and R-bloggers. For more R news and tutorials, please visit https://www.r-bloggers.com/.’

Open mind, open science [Speech for Tilburg uni board]

This is a speech I gave for the board of Tilburg University on why Open science is important for the future of TIlburg University or any knowledge institute, honestly. Speech was given on March 9, 2017.

We produce loads of knowledge at this university, and yet we throw most of that knowledge away. We are throwing away taxpayer’s money; stifling scientific discovery; hampering the curiosity of our students and society.

Research data; peer reviews; research materials; developed software; research articles — they are literally and figuratively thrown away. But these are all building blocks for new knowledge, and the more building blocks available, the more knowledge we can build. Science is a bit like Legos in that sense: more pieces allow you to build greater structures.

Students can use these building blocks to learn better — learn about the real, messy process of discovery for example. Businesses can create innovative tools for information consumption by taking the hitherto unavailable information and reshaping it into something valuable. Citizens can more readily participate in the scientific process. And last but not least, science can become more accurate and more reliable.

Researchers from different faculties and staff members from different support services see the great impact of open science, and today I call on you to give make it part of the new strategic plan.

Let’s as a university work towards making building blocks readily available instead of throwing or locking them away.

Open Access and Open Data, two of these building blocks, have already been on the board’s agenda. All researchers at Tilburg are mandated to make their publications freely available since 2016 and free to re-use by 2024. For Open Data, the plans are already in motion to make all data open per 2018, across all faculties, as I was happy to read in a recent memorandum. So, why am I here?

Open Access and Open Data are part of the larger idea of Open Science; they are only two of many building blocks. Open science is that all research pieces are available for anyone to use for any purpose.

Advancing society is only possible if we radically include society in what we do. The association of Dutch universities, state secretary Sander Dekker, and 17 other organizations have underscored the importance of this when they signed the National Plan Open Science just a few weeks ago.

So I am happy to see the landscape is shifting from closed to open. However, it is happening slowly and incompletely if we only focus only on data and access. Today, I want to focus on one of the biggest problems facing science, that open science can solve: selective publication.

Researchers produce beautifully written articles that read like the script of a good movie: they set the scene with an introduction and methods, have beautiful results, and provide a happy ending in the discussion that makes us think we actually understand the world. And just like movies, not all are available in the public theatre.

But research isn’t about good and successful stories that sell; it is about accuracy. We need to see not just the successful stories, we need to see all stories. We need to see the entire story, not just the exciting parts. Only then can we efficiently produce new knowledge and and understand society.

Because good movies pretend to find effects even if there is truly nothing to find. Here, researchers investigate the relation between jelly beans and pimples.

So they start researching. Nothing.

xkcd comic "Significant"; https://www.xkcd.com/882/

xkcd comic “Significant”; https://www.xkcd.com/882/

More studies; nothing.

More studies; an effect and a good story!

More studies; nothing.

And what is shared? The good story. While there is utterly nothing to find. And this happens daily.

Researchers fool each other, including themselves, and it has been shown time and time again that we researchers wish to see what we want to see. This human bias greatly harms science.

The results are disconcerting. Psychology; cancer research; life sciences; economics — all fields have issues with providing a valid understanding of the world, to a worrying extent. This is due to researchers fooling themselves and confirming prior beliefs to produce good movies instead of being skeptical and producing accurate, good science.

So I and other members across the faculties and services say: Out with the good movies, in with the good science we can actually build on — OPEN science.

Sharing all research that is properly conducted is feasible and will increase validity of results. Moreover, it will lead to less research waste. We as a university could become the first university to share all our research output. All based on a realistic notion of a researcher: do not evaluate results based on whether they are easy to process, confirm your expectations, and whether they provide a good story — evaluate them on their methods.

But please please members of our university, do not expect this change open science to come easily or by magically installing a few policies!

It requires a cultural shift that branches out into the departments and even the individual researchers’ offices. Policies don’t necessarily result in behavior change.

And as a researcher, I want to empirically demonstrate that policy doesn’t necessarily result in behavioral change.

Here is my Open Access audit for this university. Even though policies have been instated by the university board, progress is absent and we are actually doing worse at making our knowledge available to society than in 2012. This way we will not reach ANY of our Open Access goals we have set out.

Open Access audit Tilburg University; data and code: https://github.com/libscie/access-audit

Open Access audit Tilburg University; data and code: https://github.com/libscie/access-audit

In sum, let us advance science by making it open, which in turn will help us advance society. I will keep fighting for more open science. Anyone present, student or staff, I encourage you to do so as well. I am here to help.

Open science is out of the box, and it won’t go back in. The question is, what are we as a university going to do with that knowledge?