This tool responds to Mission 3.3 of the GC Data Strategy for the Federal Public Service (2023-2026), which calls for responsible, transparent and ethical data stewardship to maintain trust. It is suitable as a first-level screening estimate of the reproducibility of a particular data asset and its associated code.

There are two companion tools which you may also find useful:



Do you work with data? Are you looking to make it future-proof? The FAIRER data principles will help you!

In 2023, the international consortium Common Infrastructure for National Cohorts in Europe, Canada, and Africa (CINECA) stated: “While the FAIR principles have become a guiding technical resource for data sharing, legal and socio-ethical considerations are equally important for a fair data ecosystem. ... FAIR data should be FAIRER, including also ethical and reproducible as key components.”

FAIRER principles refer to the Findability, Accessibility, Interoperability, Reusability, Ethics and Reproducibility of data assets, including related code. Applying these principles to your data assets will help others to find, verify, cite, and reuse your data and code more easily.

This tool helps you assess the reproducibility of a data asset and its associated code, and gives you tips on how to increase their value and impact.

The tool is discipline-agnostic, making it relevant to any scientific field.

The checklist will take 15-30 minutes to complete, after which you will receive a quantitative summary of the level of reproducibility of your data and code, along with tips on how to improve it. No information is saved on our servers; you can save the results of the assessment, including the tips for improvement and your own notes, to your local computer for future reference.

CRediT Author statement



REPRODUCIBILITY:

Reproducible data and code means that the final data and code are computationally reproducible within some tolerance interval or defined limits of precision and accuracy. That is, a third party can verify the data lineage and processing, reanalyze the data, and obtain consistent computational results using the same input raw data, computational steps, methods, software, code, and conditions of analysis, in order to determine whether the same result emerges from the reprocessing and reanalysis. “Same result” can mean different things in different contexts: identical measures in a fully deterministic context, the same numeric results differing only in some irrelevant detail, statistically similar results in a non-deterministic context, or validation of a hypothesis. All data and code are made available for third-party verification of reproducibility. Note that reproducibility is a different concept from replicability: in the latter case, the final published data are linked to sufficiently detailed methods and information for a third party to verify the results based on the independent collection of new raw data, using similar or different methods but leading to comparable results.
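
As a minimal sketch of what third-party verification can look like in practice, reproduced outputs can be compared against the original outputs within a stated tolerance. The file names and tolerance value below are assumptions for illustration, not part of this tool.

    # A minimal sketch of third-party verification; file names and the tolerance
    # value are assumptions for illustration, not outputs of this tool.
    import numpy as np

    # Hypothetical outputs: results.csv from the original analysis,
    # results_rerun.csv from a third party re-running the same code on the same raw data.
    original = np.loadtxt("results.csv", delimiter=",")
    rerun = np.loadtxt("results_rerun.csv", delimiter=",")

    # Deterministic code: expect (near-)identical values.
    # Non-deterministic code: compare within a documented tolerance instead.
    tolerance = 1e-8  # assumed limit of precision; state yours explicitly
    if np.allclose(original, rerun, atol=tolerance):
        print("Results reproduced within the stated tolerance.")
    else:
        print("Results differ beyond the stated tolerance; check data lineage, seeds, and environment.")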

Checklist Questions

My code is:
  Deterministic
  Non-Deterministic
  Executable only on a high-performance computing (HPC) system or a supercomputer
  Quantum code
For all models and algorithms, I provided a link to:
A clear description of the mathematical setting, algorithm, and/or model.
A clear explanation of any assumptions.
An analysis of the complexity (time, space, sample size) of the algorithm.
A conceptual outline and/or pseudocode description (a brief documentation sketch follows this list).
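
Below is a minimal sketch of the kind of in-code documentation these items ask for; the function, its complexity figures, and the docstring layout are illustrative assumptions rather than requirements of this tool.

    # A minimal sketch of in-code documentation covering the items above; the
    # function, its complexity figures, and the docstring layout are illustrative.
    def moving_average(x, window):
        """Compute a simple moving average.

        Mathematical setting: y[i] = mean(x[i : i + window]).
        Assumptions: x is a one-dimensional numeric sequence and window <= len(x).
        Complexity: O(n * window) time, O(n) additional space for the output.
        Conceptual outline: slide a fixed-length window over x and average each slice.
        """
        return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]
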
For any theoretical claims, I provided a link to:
A clear statement of the claim.
A complete proof of the claim.
A clear formal statement of all assumptions.
A clear formal statement of all restrictions.
Proofs of all novel claims.
Proof sketches or intuitions for complex and/or novel results.
Appropriate citations to the theoretical tools used.
An empirical demonstration that all theoretical claims hold.
All experimental code used to eliminate or disprove claims.
For all datasets used, I provided a link to:
A downloadable version of the dataset or simulation environment.
The relevant statistics (e.g., the number of examples).
The details of train / validation / test splits (see the split sketch after this list).
An explanation of any data that were excluded, and all pre-processing steps.
A complete description of the data collection process for any new data collected, including instructions to annotators and methods for quality control.
A motivation statement for why the experiments are conducted on the selected datasets.
A licence that allows free usage of the datasets for research purposes.
All datasets drawn from the existing literature (potentially including authors’ own previously published work) are publicly available.
A detailed explanation, where applicable, as to why datasets used are not publicly available, and why publicly available alternatives were not used.
A complete description of the data collection process (e.g., experimental setup, device(s) used, image acquisition parameters, subjects/objects involved, instructions to annotators, and QA/QC methods).
Ethics approval.
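
Below is a minimal split sketch showing a documented exclusion step and a seeded train / validation / test split; the file name, column name, split proportions, and seed are assumed purely for illustration.

    # A minimal sketch of a documented exclusion step and a seeded split; the file
    # name, column name, proportions, and seed are assumptions for illustration.
    import numpy as np
    import pandas as pd

    df = pd.read_csv("dataset.csv")  # hypothetical downloadable dataset

    # Pre-processing: exclude records and state why (here: missing target values).
    n_excluded = df["target"].isna().sum()
    df = df.dropna(subset=["target"])
    print(f"Excluded {n_excluded} records with missing target values.")

    # Reproducible 70/15/15 train / validation / test split with a fixed, documented seed.
    rng = np.random.default_rng(seed=20240101)
    indices = rng.permutation(len(df))
    n_train = int(0.70 * len(df))
    n_val = int(0.15 * len(df))
    train = df.iloc[indices[:n_train]]
    validation = df.iloc[indices[n_train:n_train + n_val]]
    test = df.iloc[indices[n_train + n_val:]]
    print(f"Splits: {len(train)} train / {len(validation)} validation / {len(test)} test.")
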
For all code used, I provided a link to:
The specification of dependencies.
The training code.
The evaluation code.
The (pre-)trained model(s).
A ReadMe file that includes a table of results accompanied by the precise commands to run to produce those results.
Any code required for pre-processing data.
All source code required for conducting and analyzing the experiment(s).
A licence that allows free usage of the code for research purposes.
A document with comments detailing the implementation of new methods, with references to the paper where each step comes from.
The method used for setting seeds (if an algorithm depends on randomness), described in a way sufficient to allow replication of results (see the seeding sketch after this list).
A description of the computing infrastructure used (hardware and software), including GPU/CPU models; memory; OS; names/versions of software libraries and frameworks.
A formal description of the evaluation metrics used and an explanation of the motivation for choosing these metrics.
A statement of the number of algorithm runs used to compute each reported result.
An analysis of experiments that goes beyond single-dimensional summaries of performance (e.g., average; median) to include measures of variation, confidence, or other distributional information.
A description of the significance of any improvement or decrease in performance, judged using appropriate statistical tests (e.g., Wilcoxon signed-rank).
A list of all final (hyper-)parameters used for each model/algorithm in each of the experiments.
A statement of the number and range of values tried per (hyper-) parameter during development, along with the criterion used for selecting the final parameter setting.
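
Below is a minimal seeding sketch showing one way to document the seeding method and the computing environment; the seed value and library list are assumptions that should be replaced by your actual choices and dependencies.

    # A minimal sketch of documenting the seeding method and the computing
    # environment; the seed value and library list are assumptions.
    import platform
    import random
    import sys

    import numpy as np

    SEED = 42  # documented seed; report it alongside your results

    def set_seeds(seed=SEED):
        """Seed every source of randomness the analysis relies on."""
        random.seed(seed)
        np.random.seed(seed)
        # Seed any other frameworks you use (e.g., a deep learning library) here too.

    def describe_environment():
        """Record software and hardware details for the ReadMe / methods section."""
        return {
            "python": sys.version.split()[0],
            "numpy": np.__version__,
            "platform": platform.platform(),
            "processor": platform.processor(),
        }

    set_seeds()
    print(describe_environment())
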
For all reported experimental results, I provided a link to:
The range of hyper-parameters considered, the method used to select the best hyper-parameter configuration, and the specification of all hyper-parameters used to generate the results.
The exact number of training and evaluation runs.
A clear definition of the specific evaluation metrics and/or statistics used to report results.
A description of results with central tendency (e.g., mean) and variation (e.g., error bars) (see the reporting sketch after this list).
The average runtime for each result, or estimated energy cost.
A description of the memory footprint.
A description of the computing infrastructure used.
Information on sensitivity to parameter changes.
Details on how baseline methods were implemented and tuned.
An analysis of the statistical significance of reported differences in performance between methods.
An analysis of situations in which the method failed.
A document that clearly delineates statements that are opinions, hypotheses, and speculation from objective facts and results.
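
Below is a minimal reporting sketch, on made-up numbers, showing how results can be reported with central tendency, variation, the number of runs, and a statistical test; it is an illustration under these assumptions, not a prescribed analysis.

    # A minimal sketch, on made-up numbers, of reporting more than a single average:
    # mean, standard deviation, number of runs, and a paired Wilcoxon signed-rank test.
    import numpy as np
    from scipy.stats import wilcoxon

    # Hypothetical per-run scores from repeated runs of two methods on the same folds.
    method_a = np.array([0.81, 0.83, 0.80, 0.84, 0.82, 0.85, 0.81, 0.83])
    method_b = np.array([0.78, 0.80, 0.79, 0.81, 0.80, 0.82, 0.79, 0.80])

    print(f"Method A: mean={method_a.mean():.3f}, std={method_a.std(ddof=1):.3f}, runs={len(method_a)}")
    print(f"Method B: mean={method_b.mean():.3f}, std={method_b.std(ddof=1):.3f}, runs={len(method_b)}")

    # Paired non-parametric test of the per-run differences between the two methods.
    statistic, p_value = wilcoxon(method_a, method_b)
    print(f"Wilcoxon signed-rank: statistic={statistic:.1f}, p={p_value:.4f}")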

Your Notes

  • Add any notes you may have here. These notes will be included when you print and save your results to your local computer. No information will be saved to our server. Feel free to capture any thoughts or insights that you'd like to remember or revisit later.