
Data Management: Basics of data management & the FAIR principles

What is data management and why is it important?

Merriam-Webster defines data as "factual information (such as measurements or statistics) used as a basis for reasoning, discussion, or calculation."

 

Data management is the process of producing, preserving, and sharing data from a research project.  Data is produced in all phases of the research life cycle:

[Diagram of the research cycle: generate ideas, procure funding, conduct research, analyze results, disseminate findings]

 

Proper management of data is important for several reasons:

  1. Transparency - proper management of data allows researchers to 'show their work' in order to promote open and reproducible science.
  2. Compliance - a growing number of funding organizations and journals require that data be made available, which is possible only when the data has been properly maintained. Many funding agencies also require applicants to submit a data management plan describing how they intend to manage their data throughout the life of the research project.
  3. Personal benefit - properly managed data is easier for researchers and research labs to reuse, especially because research personnel in many labs change frequently.

 

 

Merriam-Webster (n.d.) Definition of data. https://www.merriam-webster.com/dictionary/data?src=search-dict-hed

Schiermeier, Q. (2018) Data management made simple. Nature 555: 403-405.

fairness by Twin rizki from the Noun Project

 

FAIR Principles

Published in 2016, the FAIR Principles are an oft-cited, overarching set of guidelines that promote good stewardship of research data.

The FAIR Principles are:

  • Findable
    • "F1. (Meta)data are assigned a globally unique and persistent identifier"
    • "F2. Data are described with rich metadata (defined by R1 below)"
    • "F3. Metadata clearly and explicitly include the identifier of the data they describe"
    • "F4. (Meta)data are registered or indexed in a searchable resource"
  • Accessible
    • "A1. (Meta)data are retrievable by their identifier using a standardised communications protocol"
      • "A1.1 The protocol is open, free, and universally implementable"
      • "A1.2 The protocol allows for an authentication and authorisation procedure, where necessary"
    • "A2. Metadata are accessible, even when the data are no longer available"
  • Interoperable
    • "I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation"
    • "I2. (Meta)data use vocabularies that follow FAIR principles"
    • "I3. (Meta)data include qualified references to other (meta)data"
  • Reusable
    • "R1. (Meta)data are richly described with a plurality of accurate and relevant attributes"
      • "R1.1. (Meta)data are released with a clear and accessible data usage license"
      • "R1.2. (Meta)data are associated with detailed provenance"
      • "R1.3. (Meta)data meet domain-relevant community standards"
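
As a rough illustration of how a few of these principles might translate into practice, the sketch below checks a dataset's metadata record for some of the elements named above (identifier, description, license, provenance). The field names, the `missing_fair_fields` helper, and the example record (including its DOI) are all hypothetical and chosen only for illustration; real repositories define their own metadata schemas, and passing a check like this does not by itself make a dataset FAIR.

```python
# Hypothetical field names mapped to the FAIR principle each one loosely supports.
# These are illustrative assumptions, not part of the FAIR Principles themselves.
REQUIRED_FIELDS = {
    "identifier": "F1: globally unique and persistent identifier",
    "description": "F2 / R1: rich, accurate descriptive metadata",
    "license": "R1.1: clear and accessible data usage license",
    "provenance": "R1.2: detailed provenance",
}


def missing_fair_fields(record):
    """Return the names of required fields that are absent or empty in a record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Made-up example record; the DOI below does not refer to a real dataset.
example_record = {
    "identifier": "doi:10.1234/example-dataset",
    "description": "Blood pressure measurements from a hypothetical cohort study.",
    "license": "CC-BY-NC-4.0",
    "provenance": "",  # left empty to show how a gap is reported
}

for field in missing_fair_fields(example_record):
    print(f"Missing {field} ({REQUIRED_FIELDS[field]})")
```

A check like this could be run before depositing data in a repository, so that gaps in the metadata are caught while the people who know the answers are still on the project.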

 

GO FAIR (n.d.) FAIR Principles. https://www.go-fair.org/fair-principles/

The Himmelfarb Health Sciences Library

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

The George Washington University