Summary
- Store data and code for RFF projects in a project-specific L:/ drive folder.
- Use GitHub to share and version control code.
- If working with external collaborators:
  - Use OneDrive to share select data. Only store data in OneDrive that is necessary for collaboration.
  - Use GitHub to share and review code.
- For small datasets, GitHub may be used for both data and code storage.
Quick reference
| Core tools | Purpose | Key points |
| --- | --- | --- |
| L:/ drive | Data and script storage | Use for all projects; regular backups; secure (RFF-only access); see guidance for setup instructions |
| GitHub | Script version control, script sharing | Use for all projects; supports collaboration, history tracking, and (optionally) project management; can store and track small datasets |
| Additional tools for projects with external collaborators | Purpose | Key points |
| --- | --- | --- |
| OneDrive | Sharing essential data with external partners | Good for smaller files; shared access with external collaborators |
| Other cloud options | Sharing large datasets externally | Azure, Google Bucket, AWS S3, Dropbox; contact ITHelp@rff.org for Azure setup |
| ArcGIS Online | Sharing spatial data | Share and visualize spatial data in a web browser; contact Thompson@rff.org for setup |
Internal RFF Projects
Data: L:/ drive
The L:/ drive is the primary location for storing project data and code. All internal projects should have a designated L:/ drive folder, even when working with external collaborators.
The L:/ drive is optimized for data-intensive workflows. It offers:
- High storage capacity
- Regular backups
- Enhanced security compared to personal drives
Access to the L:/ drive requires an RFF network account.
See instructions for setting up a project L:/ drive folder.
Code: L:/ drive and GitHub
Project code (e.g., .R, .py, .do files) should be stored in the project’s L:/ drive folder alongside data, and should be version controlled within a Git repository synced to GitHub.
GitHub’s distributed version control system allows team members to:
- Work on scripts independently without disrupting others
- Track changes with clear commit messages
- Reconcile and sync updates across folders
For example, you can revise a script locally and commit changes, with documentation, when they’re ready—without interfering with your colleague’s workflow.
To avoid simultaneous edits to scripts in a shared folder, and to support version control, each team member who views, runs, or edits code should have a personal folder containing a clone of the project’s GitHub repository. Specific suggestions for organizing your project folder are in the Organization section.
See Version Control for setup instructions.
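The personal-clone setup described above can be sketched in Python as follows. This is a minimal illustration, not RFF's prescribed layout: the `users/<username>` subfolder, the repository URL, and the function names are all hypothetical, and the actual folder convention is the one given in the Organization section.

```python
import subprocess
from pathlib import Path


def clone_command(repo_url: str, project_root: Path, username: str) -> list[str]:
    """Build the git command that clones repo_url into a per-user folder."""
    # Hypothetical layout: one subfolder per team member under "users".
    personal_folder = project_root / "users" / username
    return ["git", "clone", repo_url, str(personal_folder)]


def clone_for_user(repo_url: str, project_root: Path, username: str) -> None:
    """Run the clone; requires git on PATH and access to the repository."""
    subprocess.run(clone_command(repo_url, project_root, username), check=True)


if __name__ == "__main__":
    # Placeholder values for illustration only; prints the command without running it.
    command = clone_command(
        "https://github.com/example-org/example-project.git",
        Path("projects/example"),
        "jdoe",
    )
    print(" ".join(command))
```

Because each person commits from their own clone, nobody edits a script that a colleague has open, and changes are reconciled through GitHub rather than through the shared folder.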
Solutions for collaborative projects
Sharing code
The GitHub repository method (see Organization section) for storing and sharing scripts is ideal for collaborating with people who do not have access to the RFF network and L:/ drive. Once they are added as collaborators on the GitHub repository, they can clone the repository to a local folder using the URL shown on the repository's GitHub page.
Sharing and accessing data
In addition to storing data and code in the project's L:/ drive folder, it may be necessary to store, access, and share files in other ways when collaborating with team members who lack RFF network access. In that case, there are a few options.
Accessing raw data via Application Programming Interfaces (APIs)
APIs (Application Programming Interfaces) allow programs to access data directly from remote servers. Many public datasets — such as the US Census or USDA NASS — can be queried through APIs, enabling efficient workflows without needing to download and store large raw files. If APIs are not available, similar workflows can often be built by downloading data directly from URLs using tools like curl.
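As a rough illustration of the idea, the sketch below builds a query URL from a dictionary of parameters and fetches the JSON response using only the Python standard library. The endpoint `https://api.example.org/v1/county-data` and its parameters are placeholders invented for this example, not a real service; substitute the base URL and parameters documented by the API you are using.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url: str, params: dict) -> str:
    """Return base_url with params appended as a sorted, URL-encoded query string."""
    # Sorting keeps the URL identical across runs, which makes the query easy to document.
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def fetch_json(url: str, timeout: float = 30.0):
    """Download and parse a JSON response (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical endpoint and parameters, for illustration only.
EXAMPLE_URL = build_query_url(
    "https://api.example.org/v1/county-data",
    {"year": 2020, "state": "CO", "variables": "population,cropland_acres"},
)

if __name__ == "__main__":
    print(EXAMPLE_URL)
    # records = fetch_json(EXAMPLE_URL)  # uncomment once pointed at a real API
```

Keeping the parameters in a dictionary inside a version-controlled script means the exact query is recorded alongside the analysis code.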
Benefits of Using APIs
- Reduced storage: APIs allow you to retrieve only the data you need, rather than storing large or comprehensive datasets locally.
- Up-to-date data: API calls can return the most recent available data each time code is run.
- Reproducibility: API-based workflows can support reproducible research when combined with version control and well-documented queries.
Considerations and Caveats
- Stability and longevity: APIs can change or become deprecated. Keep this in mind when building workflows that rely on long-term access.
- Accessibility: Not all APIs have well-documented libraries in your programming language of choice. Some require constructing URLs manually or navigating limited documentation.
- Reproducibility and archiving API-sourced data: To support reproducibility, consider the following approaches when publishing or archiving projects that rely on API-sourced data. Choose the strategy that best balances reproducibility, storage constraints, and project scope.
  - Download and archive locally: Save a copy of the queried dataset in your project directory at the time of analysis or publication.
  - Deposit in a repository: Archive the data on platforms like Zenodo using tools such as zen4R (R) or zenodo-client (Python).
  - Document query details: If the data are too large to archive, or are static at the source, document the API source, parameters, and access date to support partial reproducibility.
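The "download and archive locally" and "document query details" approaches can be combined. The sketch below is one possible way to do so: it writes the downloaded bytes to the project folder and saves a small JSON record next to them. The function name, the `.provenance.json` suffix, and the record fields are choices made for this example rather than an established convention.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def archive_with_provenance(payload: bytes, dest: Path, source_url: str, params: dict) -> Path:
    """Write payload to dest and a JSON sidecar recording where it came from."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(payload)
    record = {
        "source_url": source_url,
        "parameters": params,
        # Access date in UTC, so the record does not depend on the local time zone.
        "accessed_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        # Checksum lets a later reader confirm the archived file is unchanged.
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size_bytes": len(payload),
    }
    sidecar = dest.with_name(dest.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

For data that is too large to archive, the same record can be written on its own, without the data file, to capture the source, parameters, and access date.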
OneDrive
OneDrive can be used to share data with external collaborators, but the L:/ drive should remain the primary storage location due to better access from RFF lab computers, greater computing capacity, more storage, fewer sync issues, and improved data security.
Storage considerations for OneDrive:
- Project folders start with 5 TB of storage (expandable).
- Files over 10 GB may fail to sync — OneDrive is not ideal for large files.
- Share only essential files (e.g., inputs/outputs — not intermediate files).
- Mirror (replicate) the folder structure between L:/ and OneDrive to ensure clarity and consistency.
- Sensitive data should be handled carefully; set access permissions appropriately.
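Mirroring the folder structure, as recommended in the list above, can be scripted. The following is a minimal sketch that recreates the subfolders of one root under another without copying any files, so that only essential files are then added by hand. The function name and the example paths in the test setup are hypothetical.

```python
from pathlib import Path


def mirror_folder_structure(source_root: Path, target_root: Path) -> list[Path]:
    """Recreate source_root's subfolders (not files) under target_root.

    Returns the list of folders that were newly created.
    """
    created = []
    # Sorting visits parent folders before their children.
    for folder in sorted(p for p in source_root.rglob("*") if p.is_dir()):
        target = target_root / folder.relative_to(source_root)
        if not target.exists():
            target.mkdir(parents=True)
            created.append(target)
    return created
```

Running it a second time creates nothing new, so it is safe to rerun whenever folders are added on the L:/ drive side.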
How to set up collaboration in OneDrive.
GitHub
We recommend using GitHub only to version control scripts, figures and tables intended for publication, and code documentation. Experienced users can also use it to share and version control other small data files (well below 100 MB). This is also discussed in the Organization section.
Consider security, sensitivity, and license restrictions when hosting data on GitHub.
Files larger than 100 MB should be shared using other tools.
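Before committing data files, it can help to check a local clone for anything over the limit. The sketch below is one way to do that, assuming "100 MB" is read as 100 × 1024² bytes; the function name is invented for this example, and the `.git` folder is skipped because it holds Git's internal files rather than project files.

```python
from pathlib import Path

GITHUB_LIMIT_BYTES = 100 * 1024**2  # "100 MB" from the guidance, read here as 100 MiB


def oversized_files(repo_root: Path, limit_bytes: int = GITHUB_LIMIT_BYTES) -> list[tuple[Path, int]]:
    """Return (relative path, size in bytes) for files larger than limit_bytes."""
    results = []
    for path in sorted(repo_root.rglob("*")):
        relative = path.relative_to(repo_root)
        # Skip Git's internal storage and anything that is not a regular file.
        if ".git" in relative.parts or not path.is_file():
            continue
        size = path.stat().st_size
        if size > limit_bytes:
            results.append((relative, size))
    return results
```

Passing a lower `limit_bytes` gives an earlier warning for files that are approaching the limit.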
Other cloud storage options for large data
Alternatives to OneDrive, such as Azure, Google Bucket, AWS S3, or Dropbox, may be well suited to your project, especially for short-term storage and file transfer. Note, however, that these options may incur additional costs depending on data size (even the “free tiers” of these services may incur pay-as-you-go costs). For Azure setups, contact IT at IThelp@rff.org.
ArcGIS Online for sharing and exploring spatial data
ArcGIS Online is a cloud-based browser platform that allows users to upload, host, and share datasets (both geospatial and tabular). In order to access the data and exploratory mapping interface, users need an ArcGIS Online account. Online accounts cost $100 per user and data storage costs vary by data size. Contact RFF’s GIS Coordinator at Thompson@rff.org for more information.
Enabling external collaborator access to the L:/ drive
While not recommended, it is possible to enable access to the L:/ drive for non-RFF staff. See Server Access for Non-Employees. Temporary accounts can be requested by contacting IT at IThelp@rff.org.
L:/ drive storage request
At the start of a project (or upon major changes to specifications, such as timeline or disk space), email IThelp@rff.org with the following information. Example answers are provided.
| Field | Example answer |
| --- | --- |
| Project name | LUFA-AgSubsidies |
| Storage location | L drive |
| Folder name | L:/Project-lufa-agsubsidies |
| Short description | This project analyzes how variations in agricultural subsidy structures across U.S. counties influence land-use change |
| Principal Investigator(s) | Otgonbayar Aquila and Léonce Dominique |
| RFF collaborators | Evelyn Loren |
| External collaborators | Several University of Eastern Colorado collaborators, but they will not need access to the project folder |
| Data types | R, Stata, GIS datasets |
| Size requested | 80 GB |
| Estimated max. size | 150 GB |
| Archival date | December 2028 |
| Data agreement or sensitive data security considerations | Proprietary data will be in raw data folder and will need to be made read-only |