Concepts
Use this page to learn about key ST4SD concepts and terminology.
Overview
How researchers and developers interact with ST4SD
Terminology
- Virtual Experiment: A virtual experiment (sometimes shortened to experiment or VE) is an application workflow that measures one or more characteristics of one or more input systems. Typically created by developers, virtual experiments are defined by a configuration file plus any additional data they need to function, e.g. scripts and configuration files.
- Parameterised Virtual Experiment Package: A Parameterised Virtual Experiment Package (also parameterised package or PVEP) is a virtual experiment that has been pre-configured to perform a specific measurement. Researchers select and launch Parameterised Virtual Experiment Packages defined by developers.
- Virtual Experiment Instance: A Virtual Experiment Instance (also instance or experiment instance) is an execution of a particular virtual experiment, usually run via a PVEP.
- Project: A project is a directory structure used by developers to contain the definition of one or more virtual experiments.
- Direct Execution: A Direct Execution refers to a virtual experiment run via the `elaunch.py` tool from the terminal of the machine the user is logged into, e.g. a laptop or an HPC cluster.
- REST API Execution: A REST API Execution refers to a virtual experiment run on a (remote) Kubernetes/OpenShift cluster via the `st4sd-runtime-service` REST API (this is often done via a Jupyter Notebook).
Virtual Experiment Inputs
There are four ways to provide data to a virtual experiment. Only one, `inputs`, is required.
inputs
Inputs are files that the experiment requires to run. Usually they contain the information on what the experiment is measuring.
To find out what inputs an experiment requires, check the experiment's documentation.
REST API Execution
When running via the REST API you provide the input files via the experiment payload. You can provide the content directly in the payload or reference an S3 bucket containing the files.
See specifying experiment inputs and the example notebooks.
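As a minimal sketch, assuming the `ExperimentRestAPI` client from `st4sd-runtime-core` (the endpoint URL, token, experiment identifier, and file content below are placeholders; see the example notebooks for the exact payload schema):

```python
# Sketch: starting a PVEP with an input file supplied in the payload.
# The endpoint URL, token, experiment identifier, and file content are
# placeholders; see the example notebooks for the exact payload schema.
import experiment.service.db

api = experiment.service.db.ExperimentRestAPI(
    "https://st4sd.example.com", cc_auth_token="<auth-token>"
)

payload = {
    "inputs": [
        # Option 1: embed the file content directly in the payload
        {"filename": "input_smiles.csv", "content": "label,smiles\nmol0,CCO\n"},
    ],
    # Option 2: reference files stored in an S3 bucket instead of embedding
    # them, by adding an "s3" section with the bucket details and credentials.
}

rest_uid = api.api_experiment_start("toxicity-prediction", payload=payload)
print("Started instance:", rest_uid)
```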
Direct Execution
When running directly, input files are specified via the `-i` argument to `elaunch.py`.
`inputs` are any files you reference in your virtual experiment configuration using the directory `input/`, e.g. `input/somefile:ref`.
See specifying input files for more information.
You can see these files by running `einputs.py` on your experiment.
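For example, a hedged sketch of a direct run with one input file (the package path and filename are placeholders):

```python
# Sketch: a direct run supplying one input file via -i.
# "my-experiment-package" and "input_smiles.csv" are placeholders.
import subprocess

subprocess.run(
    ["elaunch.py", "-i", "input_smiles.csv", "my-experiment-package"],
    check=True,
)
```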
data
`data` refers to configuration files that the experiment uses during runs. These can optionally be overridden, but defaults always exist.
REST API Execution
You provide content for `data` files via the experiment payload. You can provide the content directly in the payload or reference an S3 bucket containing the files. You can only do this for experiment packages whose parameterisation allows it.
See specifying experiment data and the example notebooks.
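For instance, a sketch of a payload that overrides one data file (filenames and content are placeholders, and the PVEP's parameterisation must permit the override):

```python
# Sketch: overriding a data file via the payload; the "data" entries take
# the same shape as "inputs". Filenames and content are placeholders.
payload = {
    "inputs": [{"filename": "input_smiles.csv", "content": "label,smiles\nmol0,CCO\n"}],
    "data": [
        # Overrides the experiment's default data/model_parameters.json
        {"filename": "model_parameters.json", "content": '{"max_iterations": 100}'},
    ],
}
# Pass this payload to api.api_experiment_start() as in the inputs sketch above.
```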
Direct Execution
When running directly, you specify data files via the `-d` argument to `elaunch.py`.
`data` is any file you reference in your experiment using the directory `data/`, e.g. `data/someconfigurationtemplate.txt:ref`.
See specifying data files for more information.
You can see these files by running `einputs.py` on your experiment.
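A sketch of a direct run that overrides a data file, assuming `-d` replaces the `data/` file with the matching name (paths are placeholders):

```python
# Sketch: a direct run overriding one data file via -d. The file supplied
# here is assumed to replace the experiment's data/ file of the same name.
import subprocess

subprocess.run(
    ["elaunch.py",
     "-i", "input_smiles.csv",
     "-d", "model_parameters.json",
     "my-experiment-package"],
    check=True,
)
```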
variables
`variables` are non-file parameters a virtual experiment defines. These can optionally be overridden, but defaults always exist.
REST API Execution
You provide values for variables as part of the experiment payload. You can only do this for experiment packages whose parameterisation allows it.
See specifying experiment variables and the example notebooks.
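For instance, a sketch of a payload that overrides two variables (names and values are placeholders and must be allowed by the package's parameterisation):

```python
# Sketch: overriding variables via the payload. The variable names and
# values are placeholders specific to your experiment.
payload = {
    "inputs": [{"filename": "input_smiles.csv", "content": "label,smiles\nmol0,CCO\n"}],
    "variables": {"temperature": 298.15, "n_conformers": 10},
}
# Pass this payload to api.api_experiment_start() as in the inputs sketch above.
```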
Direct Execution
When running directly, you provide variables by supplying a correctly formatted YAML file to the `-a` argument of `elaunch.py`.
You define variables in the `variables` section of the experiment configuration.
See specifying variables for more information. Also see variables for the specification syntax.
You can see the defined variables by running `einputs.py` on your experiment.
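A sketch of a direct run with a user-variables file (the `global` layout is an assumption based on the variables specification; names, values, and paths are placeholders):

```python
# Sketch: writing a user-variables YAML file and passing it via -a.
# Variable names/values and the package path are placeholders; see the
# variables specification for the exact file format.
import pathlib
import subprocess

pathlib.Path("my_variables.yaml").write_text(
    "global:\n"
    "  temperature: 298.15\n"
    "  n_conformers: 10\n"
)

subprocess.run(
    ["elaunch.py", "-a", "my_variables.yaml", "my-experiment-package"],
    check=True,
)
```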
dependencies
`dependencies` are external directories that the experiment requires to run (c.f. `input` and `data`, which are files or archives).
REST API Execution
You provide `dependencies` information via the experiment payload. The dependencies are passed by reference, i.e. you give the location of the dependencies.
See providing external directories to experiments for details.
Direct Execution
When running directly, you specify dependencies using the `-s` option to `elaunch.py`.
`dependencies` are directories you reference in your experiment configuration that are listed under the `application-dependencies` key. These directories are populated at runtime based either on (a) the manifest written for the experiment, or (b) a manifest the user supplies at runtime.
See providing external directories to experiments for details.
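A sketch of a direct run that points an application dependency at a local directory (the `name=path` form passed to `-s` is an assumption; run `elaunch.py --help` for the authoritative syntax):

```python
# Sketch: overriding an application dependency source via -s on a direct
# run. The "name=path" form is an assumption; check elaunch.py --help.
import subprocess

subprocess.run(
    ["elaunch.py",
     "-s", "forcefields=/absolute/path/to/forcefields",
     "my-experiment-package"],
    check=True,
)
```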
Virtual Experiment Outputs
Virtual experiments can produce many output files of varying size and importance.
Key-Outputs
Key-outputs are files/directories produced by the virtual experiment that the developer has identified as being of particular interest. Since the filenames themselves may be meaningless, the developer gives them unique identifiers.
REST API Execution
You can query and retrieve the key-outputs of a virtual experiment instance using the ST4SD API. See Retrieving key-outputs for more details.
[Coming Soon]: The key-outputs will be listed in the Registry UI entry for each parameterised virtual experiment package based on the experiment.
You can instruct ST4SD to copy the key-outputs of an instance to an S3 bucket when the instance has finished. See Automatically uploading key-outputs to S3.
Direct Execution
Metadata describing key-outputs is written to the file `outputs.json`, which is in the `output` directory at the top level of your experiment instance directory.
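For example, a sketch that inspects this metadata after a run (the instance directory name is a placeholder, and the JSON layout assumed here, a mapping of key-output name to metadata, may differ):

```python
# Sketch: reading key-output metadata from outputs.json after a direct run.
# The instance directory name is a placeholder; the JSON layout assumed
# here (a mapping of key-output name to metadata) may differ.
import json

with open("my-experiment.instance/output/outputs.json") as f:
    key_outputs = json.load(f)

for name, metadata in key_outputs.items():
    print(name, "->", metadata)
```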
In addition, users may also request key-outputs be copied to an external location when an experiment instance finishes.
You can do this by setting the `--s3StoreToURI` and `--s3AuthWithEnvVars` (or `--s3AuthBearer64=S3AUTHBEARER64`) arguments to `elaunch.py`.
See the documentation for direct runs for more information.
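A sketch of a direct run that uploads key-outputs to S3 when it finishes (the bucket URI is a placeholder and the environment-variable names are assumptions; see the direct-run documentation for the names `--s3AuthWithEnvVars` actually reads):

```python
# Sketch: asking elaunch.py to copy key-outputs to S3 on completion.
# The bucket URI is a placeholder and the environment-variable names below
# are assumptions; see the direct-run documentation.
import os
import subprocess

env = dict(os.environ)
env.update({
    "S3_ACCESS_KEY_ID": "<access-key>",        # assumed variable name
    "S3_SECRET_ACCESS_KEY": "<secret-key>",    # assumed variable name
    "S3_END_POINT": "https://s3.example.com",  # assumed variable name
})

subprocess.run(
    ["elaunch.py",
     "--s3StoreToURI=s3://my-bucket/experiment-outputs/",
     "--s3AuthWithEnvVars",
     "my-experiment-package"],
    check=True,
    env=env,
)
```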
You define key-outputs in the `output` section of the experiment configuration.
ST4SD will record metadata about the key-outputs of your experiment instance, including their location and the timestamp they were last produced.
In addition, users may also request key-outputs be copied to an external location when an experiment instance finishes. Developers do not have to do anything to enable this.
Properties
When an experiment has a virtual experiment interface defined, property tables are also produced.
REST API Execution
You can see the properties provided by a parameterised virtual experiment package by checking the experiment registry. You can access these properties using the ST4SD API.
See using virtual experiment interfaces for more details.
Direct Execution
The property tables (CSV files) defined by the interface are available in the `output` folder at the top level of the experiment instance directory, in the file `properties.csv`.
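For example, a sketch that loads this table with pandas (the instance directory name is a placeholder; check the file's delimiter for your experiment):

```python
# Sketch: loading the property table produced by a direct run.
# The instance directory name is a placeholder.
import pandas as pd

properties = pd.read_csv("my-experiment.instance/output/properties.csv")
print(properties.head())
```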
You define the properties of your virtual experiment by defining a virtual experiment interface.
Other Outputs
All other output files produced during an experiment run can be retrieved.
REST API Execution
You can retrieve any output file of a virtual experiment instance using the ST4SD API.
If you have access to the cluster hosting the ST4SD deployment you are using, you can also browse the outputs via a terminal.
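For example, a sketch of downloading an output file with the `ExperimentRestAPI` client shown earlier (the URL, token, rest-uid, and output name are placeholders; the method name follows the ST4SD example notebooks, and `contents` is assumed to be bytes):

```python
# Sketch: retrieving an output file of an instance via the ST4SD API client.
# URL, token, rest-uid, and output name are placeholders; the method name
# follows the ST4SD example notebooks, and contents is assumed to be bytes.
import experiment.service.db

api = experiment.service.db.ExperimentRestAPI(
    "https://st4sd.example.com", cc_auth_token="<auth-token>"
)

filename, contents = api.api_rest_uid_output("<rest-uid>", "OptimisationResults")
with open(filename, "wb") as f:
    f.write(contents)
```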
Direct Execution
The experiment instance directory contains all the outputs of all steps and can be easily explored via the terminal.
All the outputs of an instance of a virtual experiment are available automatically via the ST4SD API. Developers don’t have to do anything to enable this.