Virtual Experiment Best Practices
Use this page to learn the best practices for developing virtual experiments.
This page collects best practices for developing virtual experiments, drawn from our experience in this field. Whether to follow them is up to the developer, but the Registry UI tests some of the most important ones and displays a status report for them.
Strong versioning
For an experiment to be repeatable we need to ensure that the underlying pieces that make up the experiment will not change over time. This translates to the following set of requirements:
Container images should not use the latest tag (explicitly or implicitly), as the image behind it is likely to change over time.
Base package sources should be hosted on a platform that supports versioning (e.g., git) and reference an identifier that will not change over time (e.g., a commit ID).
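The two requirements above can be checked mechanically. The following is a minimal sketch, not part of ST4SD: the helper names uses_latest_tag and is_commit_id are hypothetical, and it assumes that a full 40-character SHA-1 hash is what counts as an immutable git identifier.

```python
import re

# Assumption: a full 40-character lowercase SHA-1 hash identifies a commit.
_COMMIT_ID = re.compile(r"[0-9a-f]{40}")


def uses_latest_tag(image: str) -> bool:
    """Return True if an image reference resolves to the `latest` tag."""
    # A digest (name@sha256:...) pins the exact image content.
    if "@" in image:
        return False
    # Only the last path segment can carry a tag; an earlier ':' is a registry port.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        # No tag at all means the container runtime falls back to `latest`.
        return True
    tag = last_segment.rsplit(":", 1)[1]
    return tag == "latest"


def is_commit_id(ref: str) -> bool:
    """Return True if `ref` looks like a full commit hash rather than a branch name."""
    return _COMMIT_ID.fullmatch(ref) is not None
```

A reference such as python or python:latest is flagged, while python:3.11 or a digest-pinned image is accepted. Likewise, a branch name such as main is rejected as a source reference because it can move to a different commit.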
Coding style and guidelines
The following guidelines help your virtual experiment make the most of ST4SD functionality while remaining reusable.
Experiments should always include an interface to simplify information retrieval and external integrations.
User-supplied data should be listed in the inputs section. No default value should be set in data for anything listed in inputs.
Component names should always reflect what the components do, even when modifying a forked experiment.
Platforms should primarily be used to encode non-functional information, such as resource limits. See the platforms documentation for more details.
Unless a component consumes a large number of inputs, the experiment should list them one by one rather than referencing only the directory that contains them.
The experiment should have defaults set for all options that are not inputs. This implies that the default platform should be runnable.
CI/CD and testing
To make sure the experiment is valid and works as expected, we suggest:
Run etest at each commit for all platforms
Developer metadata
Adding metadata to a parameterised package helps other users understand what it does and what license it comes with, and is also useful for filtering in the global virtual experiments registry. It is therefore important to populate all of the following fields.
The parameterised package should provide a description of what it does, in order to help other users.
Parameterised packages should specify who the maintainer is, so that users can get in touch with someone in case of need.
Parameterised packages should specify what license they are provided with, as this can set requirements for re-use and modifications.
Parameterised packages in the registry should have a meaningful tag and not rely on latest.
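The metadata requirements above could be verified before publishing a package. The sketch below is illustrative only: it works on a plain dictionary, the function name metadata_problems is hypothetical, and the field names (description, maintainer, license, tags) are assumptions that may not match the actual package schema.

```python
# Assumption: these keys mirror the metadata fields discussed above.
REQUIRED_FIELDS = ("description", "maintainer", "license")


def metadata_problems(package: dict) -> list[str]:
    """Return a list of human-readable problems found in the package metadata."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat missing, None, and whitespace-only values as unpopulated.
        value = str(package.get(field) or "").strip()
        if not value:
            problems.append(f"missing {field}")
    tags = package.get("tags") or []
    # At least one tag other than `latest` is needed to identify a fixed version.
    if not any(tag != "latest" for tag in tags):
        problems.append("no meaningful tag")
    return problems
```

An empty list means every check passed. A package tagged only with latest, or with no tags at all, is reported as having no meaningful tag.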