NIH Data Management & Sharing Plan (DMSP) Research Guide: Storage Options

A research guide containing information on requirements and resources related to the 2023 NIH Data Management & Sharing Policy.

Data security

There are many questions to consider when thinking about data security:

  • Who will be responsible for storing and backing up your data?  
    • It's important to have a clear plan for this to help avoid lapses in security
  • How will you manage access to your data?
    • Think about physical access to hardware - where will computers and external storage devices be kept?
    • Does data need to be password-protected?
  • For locally stored data, how will you ensure the security of the hardware you're using?
    • Think about computer firewalls, up-to-date antivirus protection, keeping software updated as needed, etc.
  • How will you ensure the integrity of the data itself?
    • Encryption, watermarking, digital signatures.
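One simple way to check data integrity over time is to record a checksum for each file and compare against it later. The following is a minimal sketch, not a prescribed method: the function names (`sha256_of_file`, `build_manifest`, `find_changed_files`) are hypothetical, and the choice of SHA-256 is an assumption. A checksum detects accidental corruption or unexpected edits; unlike a digital signature, it does not prove who made a change.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file under data_dir (by relative path) to its checksum."""
    root = Path(data_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_changed_files(data_dir, manifest):
    """Return relative paths that are missing or no longer match the manifest."""
    root = Path(data_dir)
    changed = []
    for relative_path, expected in manifest.items():
        path = root / relative_path
        if not path.is_file() or sha256_of_file(path) != expected:
            changed.append(relative_path)
    return changed
```

In practice, you might call `build_manifest` when a dataset is finalized, store the result alongside your backups, and run `find_changed_files` after each restore or transfer.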

Creating a backup plan

  • Having a backup plan will protect your data from damage/loss due to disaster (fire or flood), theft, unauthorized use, and/or hardware/software malfunction.
  • Consider having two backups for your data:
    1. Local - on a device other than your main workstation
    2. Remote - on a device that is geographically remote from your main workstation
  • Additionally, consider having a plan to back up your data regularly. There are several common backup routines to consider:
    1. Full backup - back up every file each time you do a backup. This method is extremely time- and resource-intensive, but it allows the quickest retrieval of data in the event of data loss.
    2. Incremental - back up only the files that have been altered since an earlier backup. There are two common types:
      1. Differential Incremental - back up everything that has been altered since the last full or incremental backup. This method is the least resource-intensive, but it also requires the most work to reconstruct data after a loss, because the last full backup and every incremental backup made since then must be restored in order.
      2. Cumulative Incremental - back up everything that has been altered since the last full backup. This method only requires the latest full backup and the latest incremental backup in order to restore lost data.
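The difference between the three routines comes down to which cutoff time is used when deciding what to copy. The sketch below illustrates that logic only; `FileRecord`, `select_files`, and the numeric timestamps are hypothetical names and values chosen for illustration, and real backup software tracks changes in more robust ways than modification times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    path: str
    modified: float  # last-modified time, e.g. seconds since the epoch


def select_files(files, strategy, last_full, last_backup):
    """Return the paths a backup run would copy under the given strategy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if strategy == "full":
        # A full backup copies everything, changed or not.
        return [f.path for f in files]
    if strategy == "differential":
        # Changed since the last full OR incremental backup.
        cutoff = last_backup
    elif strategy == "cumulative":
        # Changed since the last full backup only.
        cutoff = last_full
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [f.path for f in files if f.modified > cutoff]


# Illustrative timeline: full backup at t=100, an incremental at t=200.
example_files = [
    FileRecord("a.csv", 50),   # unchanged since before the full backup
    FileRecord("b.csv", 150),  # changed after the full backup
    FileRecord("c.csv", 250),  # changed after the latest incremental
]
differential = select_files(example_files, "differential", 100, 200)
cumulative = select_files(example_files, "cumulative", 100, 200)
```

Here `differential` contains only `c.csv`, while `cumulative` contains both `b.csv` and `c.csv`. The cumulative run copies more each time, which is why a restore needs only the latest full backup and the latest cumulative backup.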

 

Lamar Soutter Library (n.d.) Data Storage Backup & Security

 

Storage Options at GW

GW Box: GW Box is the university's enterprise file sharing service for online cloud storage and collaboration. It is free to GW students, faculty, and staff. For more information, see the resources from GW IT and GW Box.

Data Protection at GW

IT-related inquiries from GW researchers can be sent to rtshelp@gwu.edu. Research Technology Services can help address the technical and security aspects of protecting data stored on GW servers and computers. They can also provide guidance on industry best practices, including access controls and encryption at rest and in transit, and can help architect technical solutions for compliance requirements such as CUI or dbGaP.

NIH STRIDES Initiative

Basics of data storage

Many data storage options are likely available to you.

  • Local hardware - your own desktop/laptop computer and external storage devices which link to your computer.  While convenient, these options can be subject to damage, loss, theft, or obsolescence.
  • Network drives - centralized storage often run by a department or university.  This option is in many ways more secure than sole reliance on local hardware, but there are often size restrictions which make it unworkable if you create large amounts of data.
  • Remote storage (the Cloud) - remotely located servers which can be used, often at a cost, to store data.  It's crucial to understand the terms of service before going in this direction.  Additionally, many funders and institutions require that sensitive data be stored only on cloud services whose servers are located in the United States.

 

Lamar Soutter Library (n.d.) Data Storage, Backup, and Security

Storage formats

Carefully consider the format that you will use for long-term storage of your data:

  • Use a format that is unencrypted and uncompressed, because the keys or software needed to make those files readable may be lost in the future
  • Use a format that is open, well-documented, and widely used in the scientific community to ensure long-term accessibility

Preferred file formats include:

  • Text - DOCX, ODT, PDF
  • Databases - XML, SQLITE
  • Tabulated data - CSV
  • Images - PNG, JPEG, TIFF
  • Sound - MP3, WAVE
  • Video - MP4
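Before depositing a dataset, it can help to scan its file names against the list above. The following is a rough sketch under stated assumptions: `PREFERRED_FORMATS`, `preferred_category`, and `files_needing_review` are hypothetical names, and the alternate extensions `.jpg`, `.tif`, and `.wav` are added here as common spellings of formats in the list rather than taken from the guide.

```python
from pathlib import Path

# Extensions taken from the preferred formats list above. The entries
# marked "assumed" are common alternate spellings, not part of the guide.
PREFERRED_FORMATS = {
    "text": {".docx", ".odt", ".pdf"},
    "databases": {".xml", ".sqlite"},
    "tabulated data": {".csv"},
    "images": {".png", ".jpeg", ".jpg", ".tiff", ".tif"},  # .jpg/.tif assumed
    "sound": {".mp3", ".wave", ".wav"},                    # .wav assumed
    "video": {".mp4"},
}


def preferred_category(filename):
    """Return the category for a preferred format, or None if not listed."""
    suffix = Path(filename).suffix.lower()
    for category, extensions in PREFERRED_FORMATS.items():
        if suffix in extensions:
            return category
    return None


def files_needing_review(filenames):
    """Return the file names whose extension is not in the preferred list."""
    return [name for name in filenames if preferred_category(name) is None]
```

A file flagged by `files_needing_review` is not necessarily unsuitable; the check only looks at the extension, so it is a prompt to consider whether a listed format would work for that file.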