A (short) introduction to research data management

As a funded university researcher, you have a responsibility to use your funds appropriately, and funding agencies are quite rightly determined to ensure that their investments return maximum benefit and value. The more impact and research mileage your work generates, the better for everyone, including the funding agencies, who will know their money has been well spent.

Data is your most important asset. It validates your research story and your conclusions; it provides a platform of confidence for other researchers who might continue your work; and it is a resource that can be used by researchers in other fields to undertake new work, perhaps completely unrelated to your own research interests. Well-organised data that is accessible to the research community can continue to provide extended benefit and value long after your projects have been completed.

Anyone who has applied for research funding recently will be aware of the funding agencies’ growing interest in this area. Most funders now require a DMP (Data Management Plan) to be submitted along with any application, describing the data that will be created; how it will be collected; how it will be organised and stored; how it will be published and made accessible to the community; and how it might provide benefit to the community. I think it’s true to say that some researchers regard this simply as an obstacle that must be overcome in order to obtain funding, and despite ticking all the appropriate boxes in the application before their project has started, they are perhaps not quite so diligent about fulfilling those promises after the money has been received and spent. It’s not surprising that funding agencies are starting to look more closely at these outcomes, and to audit compliance with submitted DMPs.

So! I urge you all to consider ways in which you might improve your own data management practices – both in your day-to-day research workflows and in your approach to publishing and archiving your work – and I hope the following tips and resources might be helpful:

  1. All research projects should have a DMP. Make sure yours is concise and purposeful – an honest declaration of intent. It doesn’t have to be verbose or laborious, and there are tools available to help (e.g. a DMP template, or DMPonline).
  2. Stay in control while you work, by using well-planned file naming conventions and directory structures. This might seem obvious and trivial, but it’s surprising how quickly an unmanaged approach can descend into almost irretrievable chaos.
  3. Document your work well. Consider using an electronic research notebook product that will help you to manage links between experiments and data, and permit you to export content in a format that can be shared with others.
  4. Publish your data! Give it away! Not just the bare minimum that’s required to get your paper accepted – publish everything that might be useful to others, and make sure it is Findable, Accessible, Interoperable and Reusable (FAIR). There are many data repositories to choose from, including community-specific repositories (e.g. FlyBase, Gene Expression Omnibus); institutional repositories (e.g. Apollo at Cambridge); and general repositories (e.g. Zenodo, GitHub). Then tweet about it! Let the world know your data is there to be used. Expand your contribution and increase your impact.
  5. Always think about how someone else will find and understand your data – on your computer, on your file server, and in the repository. A disciplined approach to tip 2 above, together with README files, will help.
  6. Talk to others who are doing it well. There is a very active global community of data management enthusiasts, and most institutions have formed local working groups to help and encourage their researchers. Cambridge, for example, has its Data Champions programme – you’ll be able to find someone near you who will be delighted to help.
  7. Get some training. The Cambridge University Library has recently developed a great Research Data Management module as part of its larger Research Skills course, and some Guidelines on Working Reproducibly.
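
For anyone comfortable with a little scripting, tips 2 and 5 can be made concrete. The following is only a sketch of one possible approach, not a recommended standard: the folder names in SUBDIRS, the README headings, and the helper names make_filename and create_skeleton are all made up for illustration, so adapt them to whatever convention suits your group.

```python
from datetime import date
from pathlib import Path
import re

# Hypothetical folder layout; change these names to suit your own project.
SUBDIRS = ("data/raw", "data/processed", "docs", "results")


def make_filename(project: str, description: str, when: date,
                  version: int, ext: str) -> str:
    """Build a name such as 2020-03-05_fly-wing_confocal-stack_v02.tif."""
    def slug(text: str) -> str:
        # Lower-case the text and replace runs of anything that is not
        # a letter or digit with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # An ISO date (YYYY-MM-DD) at the front makes files sort chronologically.
    return (f"{when.isoformat()}_{slug(project)}_{slug(description)}"
            f"_v{version:02d}.{ext.lstrip('.')}")


def create_skeleton(root: Path, title: str) -> Path:
    """Create the folders and a starter README without overwriting one."""
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)

    readme = root / "README.txt"
    if not readme.exists():
        readme.write_text(
            f"{title}\n\n"
            "Contact:\n"
            "What the data is:\n"
            "How it was collected:\n"
            "Folder layout:\n"
            + "".join(f"  {sub}/\n" for sub in SUBDIRS),
            encoding="utf-8",
        )
    return readme
```

Calling create_skeleton(Path("my-project"), "My project") once at the start of a project, and using make_filename whenever a new file is saved, is one way of keeping the structure consistent. The README headings are deliberately left blank for you to fill in.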

Every little helps, and the more you can do the better. If you have other suggestions or comments you’d like to share, please feel free to add them as comments below or contact us in the Computing Office.

Thanks for reading!

Al
