The U4RIA goals
The U4RIA goals, described in the preprint cited below, provide a set of high-level goals for conducting energy system analyses in countries.
Below I outline some practical implications of the seven goals for which U4RIA is an acronym:
- ubuntu (community)
- retrievability
- repeatability
- reusability
- reconstructability
- interoperability
- auditability
An explanation of U4RIA is provided in the following preprint:
Howells, Mark, Jairo Quiros-Tortos, Robbie Morrison, Holger Rogner, Taco Niet, Luca Petrarulo, Will Usher, William Blyth, Guido Godínez, Luis F Victor, Jam Angulo, Franziska Bock, Eunice Ramos, Francesco Gardumi, Ludwig Hülk, Patrick Van-Hove, Estathios Peteves, Felipe de Leon, Andrea Meza, Thomas Alfstad, Constantinos Taliotis, George Partasides, Nicolina Lindblad, Benjamin Stewart, and Ashish Shrestha. (10 March 2021). Energy system analytics and good governance — U4RIA goals of Energy Modelling for Policy Support — Preprint. doi:10.21203/rs.3.rs-311311/v1.
ubuntu (community)
Involve stakeholders in your project from the beginning
- Design the project to allow transfer of data and code to stakeholders
- Involve stakeholders in development of models, assumptions, scenarios and results
- Build capacity to better understand, use and promote use of tools, methods, data and results
retrievability
Ensure that data, source code and results can be easily found, accessed, downloaded and viewed
- Ensure original data can be accessed upon close of the project
- Archive source code so that it can be accessed upon close of the project
- Copy and archive data that comes from unreliable sources, such as web pages that may change or disappear
- Archive data (e.g. on Zenodo), register a DOI, and use an appropriate data license
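One way to make an archived copy of the data verifiable is to store a checksum manifest alongside it. The following is a minimal sketch, not a prescribed method: the function names and the `MANIFEST.json` file name are hypothetical, and it assumes the archived files sit under a single directory.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Map every file under data_dir (as a relative POSIX path) to its checksum."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def write_manifest(data_dir: Path, manifest_path: Path) -> None:
    """Write the manifest as JSON; keep manifest_path outside data_dir."""
    manifest = build_manifest(data_dir)
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

Calling `write_manifest(Path("data/raw"), Path("MANIFEST.json"))` before uploading the archive lets anyone who downloads it later confirm that the files are unchanged.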
repeatability
Ensure that results can be reproduced from data and assumptions, ideally automatically
- Use a workflow management tool (such as Snakemake) to automate each step of the analysis
- Document which version of each software package is required to run the analysis
- Use an environment manager such as Miniconda, Docker or Singularity so that the software environment used for the analysis can be recreated
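As a lightweight complement to an environment manager, the versions in use can be recorded next to the results. This sketch uses only the Python standard library; the function names are hypothetical and the package names `pandas` and `snakemake` are placeholders for whatever the analysis actually depends on.

```python
import json
import platform
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages: Iterable[str]) -> Dict[str, object]:
    """Collect the interpreter, platform and package versions in one record."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {name: installed_version(name) for name in packages},
    }


if __name__ == "__main__":
    # Placeholder package names; list the real dependencies of the analysis.
    print(json.dumps(environment_report(["pandas", "snakemake"]), indent=2))
```

Saving this report with each set of results documents what was installed, but it does not replace a pinned environment file or container image.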
reusability
Ensure the data, results and source code can be used by others
- Document and publish assumptions together with results
- Publish data behind results with a permissive license allowing reuse (e.g. CC‑BY‑4.0)
- Publish source code under an open-source license (e.g. MIT or Apache‑2.0)
- Publish documentation on how to re-run the analysis
- Use semantic versioning to tag your source code
- Use version control (e.g. Git and GitHub) to track the history of your source code
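To illustrate the semantic versioning item above, here is a simplified sketch of how MAJOR.MINOR.PATCH tags behave. It assumes tags such as `v1.2.3` with an optional leading `v`, and it deliberately ignores the pre-release and build-metadata parts of the full scheme. The names `Version`, `parse` and `bump` are hypothetical.

```python
import re
from typing import NamedTuple

# MAJOR.MINOR.PATCH with an optional leading "v" and no leading zeros.
_TAG = re.compile(r"v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse(tag: str) -> Version:
    """Parse a tag like 'v1.2.3' into a Version that compares numerically."""
    match = _TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, part: str) -> Version:
    """Increment one part and reset the lower-order parts to zero."""
    if part == "major":
        return Version(version.major + 1, 0, 0)
    if part == "minor":
        return Version(version.major, version.minor + 1, 0)
    if part == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown part: {part!r}")
```

Because each part is compared as an integer, `parse("v1.10.0")` sorts after `parse("v1.9.0")`, which a plain string comparison of the tags would get wrong.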
reconstructability
Ensure that an entire analysis can be replicated, through documentation of primary data collection, processing and models
- Document the collection and processing of primary data
- Adopt a modular approach to analysis which allows interchange of data and models
- Use a workflow management system to automate, version and document the generation of results from data and models
interoperability
Enable the transfer of data, assumptions and results to other projects, analyses and models
- Do not use proprietary file formats to store data (such as Excel *.xlsx)
- Where possible, use CSV files, or an open binary format such as Feather or HDF5
- Use Tabular Data Packages which store data in comma-separated values (CSV) format and metadata in a JSON file
- Store data in CSV files in long format (e.g. headers for GDP statistics should be "Country,Parameter,Year,Value" rather than "Country,1990,1995,2000,2005,2010")
- Use existing standards, for example, adopt ISO conventions for country codes, always use SI units
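The long-format and Tabular Data Package items above can be sketched with the Python standard library alone. This is an illustration under stated assumptions: the function names are hypothetical, the wide table is assumed to have a `Country` column followed by one column per year, and the descriptor shows only a small subset of the metadata a `datapackage.json` file can carry.

```python
import csv
import io


def wide_to_long(wide_csv: str, parameter: str) -> str:
    """Convert 'Country,<year>,<year>,...' rows to Country,Parameter,Year,Value."""
    reader = csv.DictReader(io.StringIO(wide_csv))
    year_columns = [name for name in reader.fieldnames if name != "Country"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Country", "Parameter", "Year", "Value"])
    for row in reader:
        for year in year_columns:
            writer.writerow([row["Country"], parameter, year, row[year]])
    return out.getvalue()


def tabular_descriptor(name: str, csv_path: str) -> dict:
    """Build a minimal descriptor to save as datapackage.json next to the CSV."""
    return {
        "profile": "tabular-data-package",
        "name": name,
        "resources": [
            {
                "profile": "tabular-data-resource",
                "name": name,
                "path": csv_path,
                "schema": {
                    "fields": [
                        {"name": "Country", "type": "string"},
                        {"name": "Parameter", "type": "string"},
                        {"name": "Year", "type": "year"},
                        {"name": "Value", "type": "number"},
                    ]
                },
            }
        ],
    }


# Made-up values, keyed by ISO 3166-1 alpha-3 country codes.
wide = "Country,1990,1995\nKEN,1.0,2.0\nGHA,3.0,4.0\n"
print(wide_to_long(wide, "GDP"))
```

In long format, adding another year or another parameter appends rows instead of changing the header, so downstream code that reads the file does not need to change.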
auditability
Work transparently and openly
- Design the project with an audit in mind. How can you change your work patterns to make it easier for external partners to see what you have done?
- Auditability comes largely for free with many of the steps above.
This post incorporates information originally written by the author under a CC-BY-4.0 license and available at https://github.com/ClimateCompatibleGrowth/data_standards