How does one identify a good bioinformatics paper to reproduce?

For those moving on to the Reproducibility Challenge, the best bioinformatics paper to reproduce is one with decent data availability and a result relevant to your own research.

Ideally, every bioinformatics paper would be readily available and easily reproducible. If a paper does not have data available, that's all the more reason to reproduce it. If the authors use software that you are not familiar with, chances are that there is an open-source counterpart, perhaps in a different programming language, that you can use to check their results. Available code makes a paper much easier to reproduce, but many papers do not provide it, especially in fields where the bioinformatics workflow is secondary to fieldwork or other aspects of the research question.

For this tutorial, we have chosen a paper that balances ready availability with the technical challenge of reproducing it. As you work through this module, you'll see which aspects of the process are likely to be most frustrating when you're working on your own bioinformatics papers, and hopefully pick up some pointers for how to structure your own analyses down the road.

Things to look for when deciding which paper to reproduce:

  • Well-defined bioinformatics problem
  • Skill that you'd like to brush up on
  • Visual that you'd like to be able to produce