Publication
Publication is making code/data available to the broader community, often through formal dissemination channels such as data repositories, journal articles, or public databases. Publication ensures that data is discoverable and can be accessed by other researchers, stakeholders, or the public. Documentation and metadata are included to facilitate understanding and reuse. Publication may also involve adherence to specific standards and best practices to enhance the visibility and impact of the data.
While RFF does not mandate the publication of code and data, it is highly encouraged.
Increasingly, journals require code and data to be submitted along with the article. In addition, many funders and stakeholders value open source software and data availability. Planning for this in the early stages of a project facilitates reproducibility, access, and the ability of others to use and cite your work.
1 Licensing
A well-chosen license clarifies permissions, prevents misunderstandings, and encourages responsible use. The next section provides guidance on selecting and attaching appropriate licenses to ensure your data and code remain accessible and properly credited.
1.1 Rights and ownership at RFF
All agreements that involve the creation of RFF research or a similar work product must have terms that define intellectual property (IP) rights.
Before proceeding with choosing licenses, confirm that your team owns IP rights to the work produced and has full discretion over licensing the project’s data and code without restrictions from funders, institutional policies, or legal/data agreements.
In some cases the research team may have joint IP ownership with partners or funders. In this case, licensing options must be agreed upon by both parties.
In other cases, the research team may have licensing rights to software/code developed, but not data products. Data license restrictions generally do not restrict code licensing / publication.
Data and code products are licensed separately from RFF publication products. All work on the RFF.org and Resources.org websites (working papers, reports, issue briefs, explainers, Common Resources blog posts, Resources magazine articles, Resources Radio podcast episodes, graphs, charts, photographs, audio, and video) are listed under the Deed - Attribution-NonCommercial-NoDerivatives 4.0 International - Creative Commons license. This Creative Commons license is not suitable for either software or data.
1.2 Choose appropriate licenses
For software and data, there are three license suites common in the academic space: MIT, GNU General Public License (GPL), and Open Data Commons (ODC). MIT and GNU GPL are separate software licenses, while ODC has two commonly-used data licenses. All four are described below.
Software licenses
| Attribute | MIT | GNU General Public Use (GPL) |
|---|---|---|
| Link to license text | MIT | GNU GPL |
| Description | The more permissive and flexible. Users, including commercial entities, can view, use, modify, and distribute the work freely. | Users, including commercial entities, can view, use, modify, and distribute the work freely, but are subject to copyleft restrictions. |
| Attribution required | ✅ Yes | ✅ Yes |
| Enclosure allowed | ✅ Yes | ❌ No |
| Copyleft required (viral) | ❌ No | ✅ Yes |
| Best uses | Libraries and tools where broad reuse is encouraged, including in proprietary or closed-source contexts. | Models, scripts, and workflows where preserving long-term openness and share-alike reuse is a priority. |
Terms:
Enclosure: The act of applying legal or technical restrictions—like proprietary licensing or copyright—to limit access, reuse, or modification of software or data that was previously open.
Attribution: The requirement to credit the original creator or source as specified in the license. It ensures recognition while allowing reuse and modification.
Copyleft: A license condition that requires derivative works to be distributed under the same terms as the original. It ensures continued openness by mandating that source code remains available and free to reuse.
Viral: Describes licenses (like GPL or ODbL) that require any derivatives to adopt the same license. This spreads the original license’s terms to all adaptations, preserving openness but limiting reuse.
Data licenses
| Attribute | ODC-By | ODCbL |
|---|---|---|
| Link to license text | ODC-By | ODCbL |
| Description | A permissive license for databases. Users, including commercial entities, may use, modify, redistribute, and build upon the data. | A copyleft license for databases. Users may use, modify, redistribute, but derivative databases must be shared under the same license. Ensures openness through a share-alike (viral) requirement. |
| Attribution required | ✅ Yes | ✅ Yes |
| Enclosure allowed | ✅ Yes | ❌ No |
| Share-alike required (viral) | ❌ No | ✅ Yes |
| Best uses | When the goal is to encourage broad academic and public reuse of a dataset, including by commercial entities, without requiring openness in derived products | Collaborative academic projects where ensuring that all derivative datasets remain openly accessible under the same terms is critical. |
- Share-alike: A condition applied mainly to data and content (e.g., under CC BY-SA or ODbL), requiring that any adapted works use the same or a compatible open license (analogous to copyleft).
Other licenses
If there are other requirements or the team would like to review a broader range of software licenses and their specifications, visit ChooseALicense.
1.3 Create and customize license files
Once the license has been selected, download or create a .txt license file from the license website. Save the file as
LICENSE.txtin the project folder and/or repository.Review the license terms and modify where necessary (most open-source and open-data licenses are designed to be used as-is, but others may require you to fill in specific details, such as name or organization).
If adding additional terms, include them in a separate README or license appendix to avoid conflicting with the main license.
2 Publishing
For publishing both code and data, ensure that the project-level README is up to date.
2.1 Code
When your project is ready to publish code stored in a GitHub repository (whether alongside a journal article, report, or other research output), it can be tagged to associate it with a publication and linked to the RFF GitHub organization.
Note that some journals require authors to:
- Make the codebase publicly accessible
- Include a link to the GitHub code repository in the manuscript
- Reference the codebase DOI in the manuscript
Recommended Publishing Workflow
If you are not already a member of the RFF GitHub organization, request membership by emailing DGWG.
Attach the appropriate license to the repository, if you have not done so already.
If your repository is already in the RFF GitHub organization:
Make the repository public once you’re ready to publish if it isn’t already.
Tag the publication version (e.g., v1.0-publication-name) as described here. (Note: only admins can do this, request admin privileges if needed.)
If your repository is under a personal GitHub account:
Decide between forking or transferring the repo to the RFF GitHub organization (see explanation of these options above).
Contact the RFF GitHub organization admins at DGWG@rff.org to initiate the fork or transfer and to add a tag (note: tags do not transfer automatically when a repo is forked).
(Optional step) Archive the tagged version on Zenodo and link the DOI in your README. This enables persistent citation, version-specific reference, and supports scholarly credit.
2.2 Data
Data-level documentation (metadata)
It is recommended to attach metadata files to published datasets. Metadata (“data about data”) documents the “who, what, when, where, how, and why” of a data resource. Metadata not only allows users (your future self included) to understand and use datasets, but also facilitates search and retrieval of the data when deposited in a data repository.
Below are the key components of metadata. They can be stored in a simple text or markdown file.
- Title: Descriptive name of the dataset.
- DOI number: Associated with the final publication, dataset, or both
- Abstract: Summary of the dataset’s content
- Keywords: Relevant terms for search and discovery
- Temporal Extent: Time period covered by the data
- Data Format: File format
- Data Source(s): Origin of the data
- Accuracy and Precision: Information about data quality
- Access Constraints: Restrictions on data use
- Attribute / field definitions: Define all abbreviations and coded entries
- Additional geospatial metadata components, if applicable
- Spatial Extent: Geographic coverage (bounding coordinates)
- Projection Information: Coordinate system details
In some contexts, generating machine-readable metadata that adheres to certain disciplinary standards is useful. There are various metadata formats and standards for specific disciplines. Additional guidance and resources for generating machine-readable metadata are here: Metadata and describing data – Cornell Data Services.
Uploading data to Zenodo
Zenodo is an online repository for sharing research data, software, and other scientific outputs. It has a broad disciplinary focus and is safe, citeable (every upload is assigned a DOI), compatible with GitHub, and free for up to 50GB of storage.
Step 1: Prepare the research data
Before uploading, ensure that:
- The data is well-organized (e.g., structured folders, clear file names).
- Metadata files are prepared for each data file or sets of data files.
- A license is attached.
- Any sensitive or restricted data is removed or anonymized (if applicable).
- The project-level README is up to date.
For guidance on choosing which files to publish and how to handle API-accessed data, see Finalize data organization.
Step 2: Create a Zenodo account & access the upload dashboard
- Go to Zenodo and sign in (or create an account). Note that you can create an account using your GitHub profile.
- Click the “New Upload” button on the Zenodo dashboard.
Step 3: Upload the data files, fill in metadata, set access
- Upload data, metadata, and the project-level README file.
- Enter metadata information in applicable fields (contributors, associated journal article or conference presentation, etc.)
- Include a link to the GitHub code repository.
- Choose an Access Level
- Open Access: Publicly available for anyone.
- Embargoed: Set a release date if the data must remain private for a certain period.
- Restricted Access: Requires users to request access.
Step 4: Publish & Get a DOI
- Review all details and make any necessary edits.
- Click “Publish” to finalize the upload.
- Zenodo will generate a DOI — use this when citing the dataset in publications.
Versioning & Updates
If the dataset needs to be updated:
- Use the “New Version” option in Zenodo instead of creating a separate upload.
- Zenodo will link versions together and maintain persistent DOIs.