Exploring code notebooks through community-focused collaboration

February 11, 2021 | PLOS | Collaboration, Innovation, Open Code, Open Research, Open Science

Written by Lauren Cadwallader, PLOS’ Open Research Manager

The lack of reproducibility of research findings is a continuing concern in modern science. Code reproducibility is a central part of the problem, and we have been exploring code notebooks as one potential solution. To understand whether notebooks are of value to our community, both readers and authors of computational research, we are asking computational researchers to share their views on the most important aspects of sharing code.

Collaborating on code

We have been exploring opportunities for collaboration with community-focused tools, such as code notebooks, that can improve open, reproducible research practices. PLOS Computational Biology is investigating how we can improve the reproducibility of computational articles in collaboration with NeuroLibre, a Canada-based open science group supported by the Canadian Open Neuroscience Platform.

Code notebooks are documents that contain code, the dependencies needed to run the code, and reader-friendly text elements, such as figure captions and paragraphs explaining the purpose of a code block. These notebooks can be made interactive by letting readers change the parameters behind a figure or edit the code itself. Notebooks improve open science practices by removing barriers faced by researchers who try to access others’ code, such as setting up the correct environment to run it. By presenting the code, the resulting analyses and the accompanying text together in a browser-based notebook, they also make it simpler to interrogate the code underlying the figures in a paper. The Computational Biology community is a logical place to explore how valuable notebooks are, since the community already shows high engagement with making code accessible, although primarily via repositories¹.

NeuroLibre has created prototype versions of code notebooks for two published PLOS Computational Biology articles. They took the code and data that were shared with the articles and turned them into browser-based notebooks that display the figures, with the ability to switch to view the code or to view a Jupyter notebook (hosted on Binder). While a base level of interactivity can be created without input from the papers’ authors, the NeuroLibre team worked with one of the papers’ authors to create additional interactive figures. Thanks to this author input, the extra features let readers explore the underlying data in ways the published figures do not, such as choosing intermediate sample sizes rather than only the three sample sizes detailed in the published paper.

Figure 1. Screenshot of Figure 3 from the Larremore (2019) notebook showing the interactive plot (left) and the ‘toggle code’ view (right). The sample size in the interactive plot can be changed using the dropdown box to the top left of the plot.
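
To make the idea of an interactive figure more concrete, here is a minimal Python sketch of the kind of function a notebook dropdown could call each time a reader picks a new sample size. It is an illustration only: the synthetic data, the function names `make_data` and `summarize`, and the choice of summary statistics are all assumptions made for this example, and none of it is taken from the Larremore (2019) notebook or from NeuroLibre's tooling.

```python
import random
import statistics


def make_data(n=1000, seed=0):
    """Create synthetic stand-in data (hypothetical, not from either paper)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def summarize(data, sample_size, seed=0):
    """Summarize a random subsample of the requested size.

    A fixed seed keeps the result repeatable, so two readers who choose
    the same sample size see the same numbers.
    """
    if not 2 <= sample_size <= len(data):
        raise ValueError("sample_size must be between 2 and len(data)")
    rng = random.Random(seed)
    sample = rng.sample(data, sample_size)
    return {
        "n": sample_size,
        "mean": statistics.fmean(sample),
        "stdev": statistics.stdev(sample),
    }


DATA = make_data()

# A reader is not limited to a few preset sizes; any valid size works.
for n in (10, 100, 250, 1000):
    print(summarize(DATA, n))
```

In a notebook, a function like `summarize` would typically be connected to a widget, so that changing the dropdown reruns it and redraws the plot.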

What are we up to?

Our collaboration with NeuroLibre to date has focused on gathering qualitative user feedback on the prototypes, to understand the value of these notebooks to computational biology researchers. A series of seven initial interviews helped us to better understand how researchers currently manage and share their code, how and why they access other people’s code, and whether we are asking the right questions about notebooks. Among other things, we are interested in hearing which aspects of publishing reproducible research researchers value the most, what challenges they encounter, and what considerations there are for creating an interactive notebook alongside a published paper. Informed by the outcomes of this initial research, we are now looking for more researchers to share their views on these prototypes via an online questionnaire. This feedback will help NeuroLibre to continue improving their tools, and will support PLOS in working towards improving reproducibility in research publishing, in collaboration with the scientific community.
If you work in computational biology-related research, please share your views! Anonymous results of the survey will be made available in the future.

Publications used for prototype notebooks

D. B. Larremore (2019) Bayes-optimal estimation of overlap between populations of fixed size. PLOS Computational Biology.

Article: https://doi.org/10.1371/journal.pcbi.1006898

Interactive notebook: https://notebook-factory.github.io/BayesianRepetoireOverlap/01Introduction/intro.html

A. Tampuu, T. Matiisen, H. F. Ólafsdóttir, C. Barry & R. Vicente (2019) Efficient neural decoding of self-location with a deep recurrent network. PLOS Computational Biology.

Article: https://doi.org/10.1371/journal.pcbi.1006822

Interactive notebook: https://notebook-factory.github.io/NeuralDecoding_book/intro.html

Related editorial in PLOS Computational Biology:

¹Boudreau, M., Poline, J.B., Bellec, P., & Stikov, N. (2021) On the open-source landscape of PLOS Computational Biology. PLOS Computational Biology. DOI: 10.1371/journal.pcbi.1008725