What should I do with my intermediate files?
Reproduce them. This is especially important when those intermediate files cannot be browsed easily, e.g., .mat files. A reproducible workflow starts from primitives as much as possible and demonstrates every step towards generating results. If you would like to save your readers runtime on a future reproduction, you can either:
- save or copy intermediate files to your results folder (perhaps in a subfolder labeled intermediate_files) and provide instructions for how and where to properly place them for future runs;
- upload intermediate files to /data and give users the option of loading them rather than reproducing them if they wish to execute only specific steps rather than the entire pipeline.
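The second option can be sketched in Python as a "load if present, otherwise reproduce" helper. This is only an illustration under assumptions: the function name load_or_reproduce, the JSON file format, and the example paths are hypothetical and not part of Code Ocean itself.

```python
import json
from pathlib import Path


def load_or_reproduce(cached_path, output_path, compute):
    """Return an intermediate result, loading it when a saved copy exists.

    cached_path: where a previously uploaded copy may live (e.g. under /data).
    output_path: where a freshly computed copy is saved (e.g. under the
                 results folder, in an intermediate_files subfolder).
    compute:     zero-argument function that rebuilds the result from
                 primitives; it must return JSON-serializable data.
    All three names are hypothetical and chosen for illustration.
    """
    cached_path = Path(cached_path)
    output_path = Path(output_path)

    # Fast path: reuse the uploaded intermediate file if it is there.
    if cached_path.exists():
        with cached_path.open() as f:
            return json.load(f)

    # Slow path: reproduce the step, then save a copy for future runs.
    result = compute()
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w") as f:
        json.dump(result, f)
    return result
```

A call might look like load_or_reproduce("/data/stats.json", "/results/intermediate_files/stats.json", compute_stats), where compute_stats is your own function for that pipeline step. Keeping the compute function in the capsule means the full pipeline remains reproducible even when readers choose the shortcut.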
Downloading data/dependencies or using an API during runtime?
Just say no. Instead, take advantage of Code Ocean's custom package management system. For anything else, the postInstall script can run any code you can run on Code Ocean, and all results are baked into your Docker image. This both reduces runtime for users and ensures reproducibility (that data/model/API might not be available at the same URL in 10 years; with Code Ocean, everything is archived).
Guaranteeing that random number generation leads to identical results between runs?
Set a random seed. With a fixed seed, every run draws the same sequence of random numbers, so results that rely on random number generation are identical between runs.
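As a minimal Python sketch of what this looks like, the example below seeds a generator from the standard library's random module and checks that two runs agree. The seed value and the function name draw_sample are arbitrary choices for illustration; other languages and libraries keep their own generators, which need to be seeded separately.

```python
import random

SEED = 12345  # arbitrary value; what matters is that it stays fixed


def draw_sample(seed, n=5):
    """Draw n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # a dedicated generator, independent of global state
    return [rng.random() for _ in range(n)]


first_run = draw_sample(SEED)
second_run = draw_sample(SEED)

# The same seed yields the same sequence, so both runs match exactly.
assert first_run == second_run
```

Using a dedicated generator object rather than the module-level functions keeps the sequence from being disturbed by other code that also draws random numbers.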
Will my interactive results work indefinitely?