The elaunch.py command line tool
The elaunch.py
command line tool executes and monitors virtual experiments. You can use it to run experiments on your laptop or on a High Performance Computing Cluster. When you submit a virtual experiment via ST4SD API it is executed by elaunch.py
.
- Running experiments with elaunch.py
- Providing the inputs to the experiment
- Checking if your experiment worked
- What is the output of my experiment ?
- Troubleshooting
Install elaunch.py
If you haven’t already, install the st4sd-runtime-core
python package:
pip install "st4sd-runtime-core[develop]>=2.5.0"
Running experiments with elaunch.py
Use elaunch.py to run virtual experiments: elaunch.py <path to experiment>
.
Note, running experiments locally requires using either Linux or MacOS. Windows users should either use a Virtual Machine (e.g. VirtualBox etc) or the Windows Subsystem for Linux (WSL).
Some experiments, like the below example, also require you to install one of docker, podman, or Rancher Desktop too.
For example, you can run the workflow nanopore-geometry-experiment
like so:
: # Get the directory containing the virtual experiment and cd into itgit clone https://github.com/st4sd/nanopore-geometry-experiment.gitcd nanopore-geometry-experiment: # Run elaunch.py specifying certain files in the directory created aboveelaunch.py -i docker-example/cif_files.dat -l 40 --nostamp\--applicationDependencySource="nanopore-database=cif:copy" \nanopore-geometry-experiment.package
The experiment should take about 5 minutes to complete. The -l 40
option keeps the log printouts to the bare minimum so don’t worry if the command is silent for a few minutes. When the experiment completes expect to see something similar to the following text on your terminal:
completed-on=2024-03-15 14:39:29.909105cost=0created-on=2024-03-15 14:38:04.402969current-stage=stage3exit-status=Successexperiment-state=finishedstage-progress=1.0stage-state=finishedstages=['stage0', 'stage1', 'stage2', 'stage3']
Providing the inputs to the experiment
The experiment requires two types of inputs: the cif_files.dat input file, which contains a list of filenames, and the nanopore-database application dependency, a directory containing the files listed in cif_files.dat.
Note that application dependencies are directories located in the root of the virtual experiment instance. These directories are populated by the runtime system using the --applicationDependencySource=$appDepName:/path/to/source
command-line argument in elaunch.py
.
For more details, refer to the application-dependencies documentation.
Checking if your experiment worked
If the experiment works elaunch.py
prints exit-status=Success
before it terminates and then exits with return code 0. You can find more information about the status of your experiment under the file ${package_name}-${timestamp}.instance/output/status.txt
. For example, here’s a status.txt
for a successful run of an experiment:
completed-on=2025-03-21 13:15:19.790371cost=0created-on=2025-03-21 13:14:30.088069current-stage=stage2exit-status=Successexperiment-state=finishedstage-progress=1.0stage-state=finishedstages=['stage0', 'stage1', 'stage2']
If the experiment fails you will see the line exit-status=Failed
in the logs of elaunch.py
and it will exit with a return code other than 0
. If the experiment failed after the instance directory was created you will see this information in the output/status.txt
file too. Common reasons for failures are invalid syntax, missing input files, or requesting a compute resource that is not available. For more information and dealing with these errors see our Troubleshooting section.
What is the output of my experiment ?
Looking inside the instance directory created for the experiment, we see the following:
ls -lth nanopore-geometry-experiment.instancetotal 64drwxrwxr-x@ 10 user wheel 320B 21 Mar 13:15 output/-rw-r--r--@ 1 user wheel 28K 21 Mar 13:15 status.db-rw-rw-r--@ 1 user wheel 536B 21 Mar 13:14 elaunch.yamldrwxr-xr-x@ 5 user wheel 160B 21 Mar 13:14 hooks/drwxrwxr-x@ 5 user wheel 160B 21 Mar 13:14 stages/drwxr-xr-x@ 5 user wheel 160B 21 Mar 13:14 conf/lrwxrwxr-x@ 1 user wheel 31B 21 Mar 13:14 python@ -> /Users/user/venvs/st4sd
In order of importance:
- output: This directory contains metadata about the key outputs of the experiment as well as the measured properties of the input systems the experiment processed. You can find more information about the contents of this directory in our detailed breakdown of the output directory.
- stages: This directory holds all the working directories of all the steps that were executed during this experiment instance
- input: The input files and variable parameterisation files we provided to the
elaunch.py
command earlier - nanopore-database: This is an application-dependency that the experiment for reading information about the nanoporous materials it simulates
- conf: The experiment definition
- hooks: This contains code used to produce the interface properties. In general, it may also contain code that checks whether tasks finished successfully or not, however then
nanopore-geometry-experiment
does not use this feature. - python: A link to the virtual environment that was used during the execution of the experiment
- elaunch.yaml: A YAML file containing metadata about the execution of the experiment
- status.db: A SQLITE database containing metadata about execution of the experiment
Key outputs
All experiments produce files, but not all generated files are equally important. To this end ST4SD has the concept of key-outputs. These are files, and directories, that an experiment produces which the developers of the experiment consider important.
The file $instanceDirectory/output/output.json
contains metadata about the key outputs of the experiment. Here’s how tha file looks like for the experiment we just ran:
{"geometricalProperties": {"creationtime": "1742562196.4740903","description": "","filename": "geometry.tgz","filepath": "stages/stage2/AggregateGeometry/geometry.tgz","final": "yes","production": "yes","type": "",
The experiment generates a single key output, geometricalProperties, which is a tar file named geometry.tgz
. This file is created by the stage2.AggregateGeometry
step.
Measured properties
Key outputs are not always immediately parseable without deep understanding of their format. To address this, ST4SD supports the interface feature. This feature allows workflow developers to extract measured properties and store them in a CSV file, making the data easier to consume.
Experiments that implement an interface
will produce a file under ${instance-directory}/output/properties.csv
. Here is the file that the nanopore-geometry-experiment
produced above:
input-id;d_is;d_fs;d_isfs;asa_m^2/cm^3;asa_m^2/g;nasa_m^2/cm^3;nasa_m^2/g;unitcell_volume;density;av_volume_fraction;av_cm^3/g;nav_volume_fraction;nav_cm^3/g;n_pocketsCoRE2019/GUJVOX_clean;3.98106;2.74095;3.93078;0.0;0.0;760.654;564.703;1164.59;1.347;0.0;0.0;0.016015;0.0118894;1.0
The CSV file will always contain the input-id
column. Its values are the identifiers of the input system ids for which the virtual experiment measured properties. The remaining columns contain the measured properties for the input systems.
Troubleshooting
Unable to create instance directory at specified location Error
If you run the same command a second time elaunch.py will complain that the experiment instance already exists:
completed-on=N/Acost=0created-on=N/Acurrent-stage=Noneerror-description=ExperimentSetupError Error encountered while setting up experiment.\\nPackage validation failed - cannot deploy\\nUnderlying error: Unable to create instance directory at specified location.\\nUnderlying error: [Errno 17] File exists: '/private/tmp/test/nanopore-geometry-experiment/nanopore-geometry-experiment/nanopore-geometry-experiment/nanopore-geometry-experiment.instance'exit-status=Failedexperiment-state=failedstage-progress=0stage-state=Initialising
To fix this problem, either remove the --nostamp
argument or delete the directory and retry elaunch.py
.
Investigating a Failed experiment
If the exit-status
of your experiment instance is Failed
then this means that at least one of your components was unable to terminate successfully. You can find the name of the component that caused the experiment to fail in the status
file and printout.
Here is an example:
completed-on=2024-05-23 09:44:03.757679cost=0created-on=2024-05-23 09:42:19.223491current-stage=stage1error-description=Stage 1 failed. Reason:\\\\n3 jobs failed unexpectedly.\\\\nJob: stage1.PartialSum0. Returncode 1. Reason KnownIssue\\\\nJob: stage1.PartialSum2. Returncode 1. Reason KnownIssue\\\\nJob: stage1.PartialSum1. Returncode 1. Reason KnownIssue\\\\nexit-status=Failedexperiment-state=finishedstage-progress=0.5stage-state=failed
The error reports that multiple components failed: stage1.PartialSum0
, stage1.PartialSum1
, stage1.PartialSum2
.
You may also get a full view of the state of the experiment by using the einspect.py -f all
tool.
========== STAGE 0 ==========Components using engine-type: enginereference, state, backend, isWaitingOnOutput, engineExitReason, lastTaskRunTime, lastTaskRunStatestage0.GenerateInput, finished, local, True, Success, 0:00:00.358677, finished========== STAGE 1 ==========Components using engine-type: engine
After you spot a Failed
component, try looking at the files it produced, including its stdout and stderr (for some backends both streams get fed into stdout). Recall that you can find these files under $INSTANCE_DIR/stages/stage<stage index>/<component name>/
. Look for the out.stdout
and out.stderr
files.
Sometimes, a component fails because one of its predecessors (direct, or indirect) produced unexpected output. To find the predecessors of a component, look at the $INSTANCE_DIR/conf/flowir_instance.yaml
, locate the component you are investigating and then follow its predecessors by looking at the references
of the component. You can then investigate the output files and stdout/stderr of those components to see if you can spot why the downstream component failed.
The experiment outputs
Learn more
Write experiments
Get an introduction to writing virtual experiment with ST4SD Core.
Advanced usage of elaunch.py
This example only scratches the surface of how you can use elaunch.py. See elaunch.py for advanced users for more information about using elaunch.py
including running on High Performance Computing clusters.
Exploring the Registry UI
Learn about all the features of our web interface for browsing and examining virtual experiments packages and runs. You can visit the ST4SD Global Registry for a first look.
No Code, No Fuss creation of Experiments
Use an interactive Build Canvas and a Graph Library to create and modify experiments straight from your browser.