Basics of data storage
Several data storage options are likely available to you:
- Local hardware - your own desktop/laptop computer and external storage devices which link to your computer. While convenient, these options can be subject to damage, loss, theft, or obsolescence.
- Network drives - centralized storage often run by a department or university. This option is in many ways more secure than sole reliance on local hardware, but there are often size restrictions that make it unworkable if you create large amounts of data.
- Remote storage (the Cloud) - remotely located servers which can be used, often at a cost, to store data. It's crucial to understand the terms of service before going in this direction. Additionally, many funders and institutions require that sensitive data be stored only on cloud services whose servers are located in the United States.
Lamar Soutter Library (n.d.) Data Storage, Backup, and Security. https://library.umassmed.edu/resources/necdmc/modules
It's important to carefully consider the format that you will use for long-term storage of your data. Here are some principles to consider:
- Use a format that is unencrypted and uncompressed, as the means of making those files readable (keys, passwords, or decompression software) may be lost in the future.
- Use a format that is open, well-documented, and widely used in the scientific community to ensure long-term accessibility.
Preferred file formats include:
- Text - DOCX, ODT, PDF
- Databases - XML, SQLITE
- Tabulated data - CSV
- Images - PNG, JPEG, TIFF
- Sound - MP3, WAVE
- Video - MP4
Durham University - Research and Innovation Services (n.d.) Choosing file formats for long term preservation. https://www.dur.ac.uk/research.innovation/outputs/data.management/organising/formats/
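As an illustration of the tabulated-data recommendation above, the following minimal sketch writes a small table to CSV text using only Python's standard library. The sample rows, column names, and the `rows_to_csv` helper are hypothetical and exist only for this example; real data would normally be written to a file rather than held in memory.

```python
import csv
import io

# Hypothetical measurements; the column names are illustrative only.
rows = [
    {"sample_id": "S1", "temperature_c": 21.5},
    {"sample_id": "S2", "temperature_c": 22.1},
]


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    # An explicit line terminator keeps the output identical across platforms.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows, ["sample_id", "temperature_c"])
print(csv_text)
```

The result is plain text that can be opened in any text editor or spreadsheet program, which is what makes CSV a reasonable choice for long-term storage.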
Creating a backup plan
- Having a backup plan will protect your data from damage/loss due to disaster (fire or flood), theft, unauthorized use, and/or hardware/software malfunction.
- Consider having two backups for your data:
  - Local - on a device other than your main workstation
  - Remote - on a device which is geographically remote from your main workstation
- Additionally, you should consider having a plan to regularly back up your data. There are several common backup routines to consider:
  - Full backup - back up every file each time you do a backup. This method is extremely time and resource intensive, but it allows for the quickest retrieval of data in the event of data loss.
  - Incremental - back up only the files that have been altered since an earlier backup. There are two common types:
    - Differential Incremental - back up everything that has been altered since the last full or incremental backup. This method is the least resource intensive, but it also requires the most work to reconstruct after a data loss, because you need the latest full backup and every incremental backup made since then.
    - Cumulative Incremental - back up everything that has been altered since the last full backup. This method only requires that you have the latest full backup and the latest incremental backup in order to restore lost data.
Lamar Soutter Library (n.d.) Data Storage, Backup, and Security. https://library.umassmed.edu/resources/necdmc/modules
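The difference between the backup routines described above can be sketched in a few lines of Python. This is a simplified illustration, not a backup tool: it assumes each file is described only by a path and a last-modified timestamp, and the `FileRecord` class, the function names, and the example timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    path: str
    modified: float  # last-modified time, e.g. seconds since the epoch


def full_backup(files):
    """Full backup: select every file, regardless of when it changed."""
    return sorted(f.path for f in files)


def differential_incremental(files, last_backup_of_any_kind):
    """Select files altered since the most recent full OR incremental backup."""
    return sorted(f.path for f in files if f.modified > last_backup_of_any_kind)


def cumulative_incremental(files, last_full_backup):
    """Select files altered since the most recent FULL backup."""
    return sorted(f.path for f in files if f.modified > last_full_backup)


# Hypothetical timeline: full backup at t=10, incremental backup at t=20.
files = [
    FileRecord("notes.txt", modified=5.0),
    FileRecord("data.csv", modified=15.0),
    FileRecord("plot.png", modified=25.0),
]
last_full = 10.0
last_incremental = 20.0

print(full_backup(files))
print(differential_incremental(files, last_incremental))
print(cumulative_incremental(files, last_full))
```

In this timeline the differential incremental routine selects only `plot.png`, while the cumulative incremental routine selects both `data.csv` and `plot.png`. That is why the cumulative routine stores more data each time but needs fewer backup sets for a restore.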
There are many questions to consider when thinking about data security:
- Who will be responsible for storing and backing up your data?
  - It's important to have a clear plan for this to help avoid lapses in security.
- How will you manage access to your data?
  - Think about physical access to hardware - where will computers/external storage devices be kept?
  - Does data need to be password-protected?
- For locally stored data, how will you ensure the security of the hardware you're using?
  - Think about computer firewalls, up-to-date antivirus protection, making sure software is being updated as needed, etc.
- How will you ensure the integrity of the data itself?
  - Consider encryption, watermarking, and digital signatures.
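One simple, related way to check data integrity is to record a checksum for each file and compare it later. This technique is not named in the list above; it is offered here as a lightweight complement. The sketch below uses SHA-256 from Python's standard library, and the `sha256_of` helper and the `results.csv` file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical data file created only for this demonstration.
    data_file = Path(tmp) / "results.csv"
    data_file.write_text("sample_id,value\nS1,1.0\n")

    recorded = sha256_of(data_file)  # store this alongside the backup
    unchanged = sha256_of(data_file) == recorded

    # Simulate corruption or an unintended edit.
    data_file.write_text("sample_id,value\nS1,9.9\n")
    changed = sha256_of(data_file) != recorded

print(unchanged, changed)
```

A checksum detects accidental corruption. It does not by itself protect against deliberate tampering by someone who can also change the recorded checksum, which is the gap that digital signatures are meant to close.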