Archival & Disposal

1 Archival

Archiving refers to the secure, long-term storage of data in its final state, upon project completion. Archiving often involves moving data to dedicated storage solutions designed for long-term retention, like archive servers or cloud storage. This is sometimes referred to as “cold storage.”

Archival is important because it:

ensures long-term, secure storage of projects for reproducibility and reuse,
improves organization, accessibility, and usability of both active and completed project files, and
releases computational resources for active projects, reducing energy consumption and storage costs.

Note

Archival takes place when projects are complete. To preserve the state of code and data at major milestones, such as journal article publication, see Publication.

1.1 How to archive data at RFF

Note

At RFF, archived files can still be accessed, read, and copied to active folders.

When RFF data projects are archived, they are migrated to a new storage location but remain accessible to specified team members. The archived folder is accessed in much the same way as the L drive, except that its files are read-only to prevent accidental deletion or modification (they can still be copied or fully restored to the L drive).

Step 1: Finalize data organization

Delete obsolete and intermediate files.
- Ensure that irrelevant or outdated files are removed, so that only files necessary for reproduction or understanding are retained.
- In general, intermediate data generated by code do not need to be archived, since they can be easily re-created from raw data and code. However, use discretion: intermediate data that are likely to be used again and are time-consuming to re-create should be retained.
The files to be retained may vary by project, but in general should include:
- Source (raw) data
  - When possible, source data should be preserved without modification, as external data sources may be modified or become unavailable.
  - However, for certain reliable data sources, citation and documentation may be sufficient (make sure to include the access date and dataset version).
  - If data were accessed via an API, see sharing API data.
- Final analysis data
- Results and visualizations
- Code
- Documentation
  - one project-level README
  - all raw data README files
  - any metadata files

This is generalized guidance. For additional guidance choosing which files to archive, see Decide what data to preserve.
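
As a rough illustration of Step 1, the sketch below compares a project folder against the retention list above and lists intermediate files for manual review. The folder names (data/raw, data/final, data/intermediate, results, code) are hypothetical and not an RFF standard; adjust them to the project's actual layout. The function only reports and never deletes anything.

```python
from pathlib import Path

# Hypothetical folder names (assumptions for illustration); adjust to the project.
EXPECTED_DIRS = ["data/raw", "data/final", "results", "code"]
INTERMEDIATE_DIR = "data/intermediate"


def archive_checklist(project_root):
    """Report how a project folder compares with the Step 1 retention list."""
    root = Path(project_root)
    # Treat any file in the root whose name starts with "README" as the project-level README.
    has_readme = any(
        p.is_file() and p.name.upper().startswith("README") for p in root.iterdir()
    )
    # Expected folders that are not present in the project.
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    # Intermediate files are listed for review only; deciding what to delete stays manual.
    intermediate = root / INTERMEDIATE_DIR
    to_review = []
    if intermediate.is_dir():
        to_review = sorted(
            p.relative_to(root).as_posix()
            for p in intermediate.rglob("*")
            if p.is_file()
        )
    return {
        "has_readme": has_readme,
        "missing_dirs": missing,
        "review_for_deletion": to_review,
    }
```

Calling archive_checklist on the project folder returns a dictionary that can be read through before anything is removed, which leaves room for the discretion described above about time-consuming intermediate data.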

Step 2: Finalize documentation

Ensure the project-level README file, raw data README files, and any metadata files are up to date.

Step 3: Coordinate with IT to ensure long-term folder access

Contact IT at IThelp@rff.org to arrange and configure archival storage of the folder. Include the following in the email:

Project-level README file
List of researchers who should retain folder access
Approximate folder size (e.g., 5 GB)
The nature of any sensitive or proprietary datasets in the folder
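
One way to estimate the approximate folder size for the email is a short script such as the sketch below. It is an illustration, not an RFF tool: the function names are made up for this example, and sizes use binary units (1 KB = 1024 bytes), so the result may differ slightly from what a file manager reports.

```python
from pathlib import Path


def folder_size_bytes(folder):
    """Sum the sizes of all regular files under `folder`, skipping symlinks."""
    return sum(
        p.stat().st_size
        for p in Path(folder).rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def human_readable(num_bytes):
    """Format a byte count with binary units (an assumption: 1 KB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        # Stop at the first unit where the value drops below 1024, or at TB.
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

For example, human_readable(folder_size_bytes("L:/my_project")) would give a rounded figure to include in the request, where "L:/my_project" is a placeholder path.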

2 Disposal

Some data may need to be deleted to protect sensitive information or comply with regulations, data agreements, or funder requirements. This is often referred to as data disposition. If any of these requirements applies to a project, follow these best practices when deleting data:

Verify Requirements: Confirm funder agreements and legal obligations regarding data retention and deletion.
Source Deletion: Confirm with IT that the files, including any backup copies, were fully deleted in accordance with requirements.
Documentation: Record when and how the data were deleted.

These practices should apply to data stored on the RFF network, OneDrive, or Microsoft Teams.
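
The documentation practice above could be supported by a simple deletion log. The sketch below appends one row per completed deletion to a CSV file. The column names and the record_deletion function are assumptions for illustration rather than an RFF requirement, and the function only records a deletion that has already been carried out.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical log columns; adapt them to the funder or agreement requirements.
LOG_FIELDS = ["deleted_at_utc", "path", "method", "requirement", "deleted_by"]


def record_deletion(log_file, path, method, requirement, deleted_by):
    """Append one row describing a completed deletion; this deletes nothing itself."""
    log_path = Path(log_file)
    # Write the header only when the log is new or empty.
    write_header = not log_path.exists() or log_path.stat().st_size == 0
    row = {
        "deleted_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "path": str(path),
        "method": method,
        "requirement": requirement,
        "deleted_by": deleted_by,
    }
    with log_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

Keeping the log in a plain CSV file means it can be stored alongside the project-level README and read without special software.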