The elaunch.py command line tool

The elaunch.py command line tool executes and monitors virtual experiments. You can use it to run experiments on your laptop or on a High Performance Computing cluster. When you submit a virtual experiment via the ST4SD API, it is executed by elaunch.py.

Install elaunch.py

If you haven’t already, install the st4sd-runtime-core python package:

pip install "st4sd-runtime-core[develop]>=2.5.0"

Running experiments with elaunch.py

Use elaunch.py to run virtual experiments: elaunch.py <path to experiment>.

Note: running experiments locally requires Linux or macOS. Windows users should use either a virtual machine (e.g. VirtualBox) or the Windows Subsystem for Linux (WSL).

Some experiments, like the example below, also require you to install one of Docker, Podman, or Rancher Desktop.

For example, you can run the workflow nanopore-geometry-experiment like so:

: # Get the directory containing the virtual experiment and cd into it
git clone https://github.com/st4sd/nanopore-geometry-experiment.git
cd nanopore-geometry-experiment
: # Run elaunch.py specifying certain files in the directory created above
elaunch.py -i docker-example/cif_files.dat -l 40 --nostamp \
--applicationDependencySource="nanopore-database=cif:copy" \
nanopore-geometry-experiment.package

The experiment should take about 5 minutes to complete. The -l 40 option keeps the log printouts to the bare minimum so don’t worry if the command is silent for a few minutes. When the experiment completes expect to see something similar to the following text on your terminal:

completed-on=2024-03-15 14:39:29.909105
cost=0
created-on=2024-03-15 14:38:04.402969
current-stage=stage3
exit-status=Success
experiment-state=finished
stage-progress=1.0
stage-state=finished
stages=['stage0', 'stage1', 'stage2', 'stage3']

If you run the same command a second time, elaunch.py will complain that the experiment instance already exists. Either remove the --nostamp argument or delete the existing instance directory and retry elaunch.py. See the section "What is the output of my experiment?" for more information.

Providing the inputs to the experiment

The experiment requires two types of inputs: the cif_files.dat input file, which contains a list of filenames, and the nanopore-database application dependency, a directory containing the files listed in cif_files.dat.

Note that application dependencies are directories located in the root of the virtual experiment instance. These directories are populated by the runtime system using the --applicationDependencySource=$appDepName:/path/to/source command-line argument in elaunch.py.

For more details, refer to the application-dependencies documentation.

Checking if your experiment worked

If the experiment works, elaunch.py prints exit-status=Success before it terminates and then exits with return code 0. You can find more information about the status of your experiment in the file ${package_name}-${timestamp}.instance/output/status.txt. For example, here’s the status.txt of a successful run of an experiment:

completed-on=2025-03-21 13:15:19.790371
cost=0
created-on=2025-03-21 13:14:30.088069
current-stage=stage2
exit-status=Success
experiment-state=finished
stage-progress=1.0
stage-state=finished
stages=['stage0', 'stage1', 'stage2']

If the experiment fails, you will see the line exit-status=Failed in the logs of elaunch.py and it will exit with a return code other than 0. If the experiment failed after the instance directory was created, you will see this information in the output/status.txt file too. Common reasons for failures are invalid syntax, missing input files, or requesting a compute resource that is not available. For more information on dealing with these errors, see our Troubleshooting section.
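
The check described above can be scripted. The following is a minimal sketch, assuming status.txt keeps the one key=value pair per line layout shown in the example; status_field and experiment_succeeded are hypothetical helper names, not part of st4sd-runtime-core.

```shell
# Sketch: read fields from an ST4SD status.txt file (one key=value per line).
# status_field and experiment_succeeded are hypothetical helpers.

# Print the value of one key, i.e. everything after the first "=" on its line.
status_field() {
  awk -F'=' -v key="$2" '$1 == key { sub(/^[^=]*=/, ""); print; exit }' "$1"
}

# Return 0 only when the file reports exit-status=Success.
experiment_succeeded() {
  [ "$(status_field "$1" exit-status)" = "Success" ]
}

# Example usage (the path is a placeholder for your own instance directory):
#   if experiment_succeeded my-experiment.instance/output/status.txt; then
#     echo "experiment finished successfully"
#   fi
```

Because elaunch.py also signals success through its return code, a script can test that first and fall back to status.txt only when it needs details such as current-stage.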

What is the output of my experiment?

Looking inside the instance directory created for the experiment, we see the following:

ls -lth nanopore-geometry-experiment.instance
total 64
drwxrwxr-x@ 10 user wheel 320B 21 Mar 13:15 output/
-rw-r--r--@ 1 user wheel 28K 21 Mar 13:15 status.db
-rw-rw-r--@ 1 user wheel 536B 21 Mar 13:14 elaunch.yaml
drwxr-xr-x@ 5 user wheel 160B 21 Mar 13:14 hooks/
drwxrwxr-x@ 5 user wheel 160B 21 Mar 13:14 stages/
drwxr-xr-x@ 5 user wheel 160B 21 Mar 13:14 conf/
lrwxrwxr-x@ 1 user wheel 31B 21 Mar 13:14 python@ -> /Users/user/venvs/st4sd

In order of importance:

  • output: This directory contains metadata about the key outputs of the experiment as well as the measured properties of the input systems the experiment processed. You can find more information about the contents of this directory in our detailed breakdown of the output directory.
  • stages: This directory holds all the working directories of all the steps that were executed during this experiment instance
  • input: The input files and variable parameterisation files we provided to the elaunch.py command earlier
  • nanopore-database: This is an application dependency that the experiment uses to read information about the nanoporous materials it simulates
  • conf: The experiment definition
  • hooks: This contains code used to produce the interface properties. In general, it may also contain code that checks whether tasks finished successfully or not; however, the nanopore-geometry-experiment does not use this feature.
  • python: A link to the virtual environment that was used during the execution of the experiment
  • elaunch.yaml: A YAML file containing metadata about the execution of the experiment
  • status.db: A SQLite database containing metadata about the execution of the experiment

Key outputs

All experiments produce files, but not all generated files are equally important. To this end ST4SD has the concept of key-outputs. These are files, and directories, that an experiment produces which the developers of the experiment consider important.

The file $instanceDirectory/output/output.json contains metadata about the key outputs of the experiment. Here’s what that file looks like for the experiment we just ran:

{
"geometricalProperties": {
"creationtime": "1742562196.4740903",
"description": "",
"filename": "geometry.tgz",
"filepath": "stages/stage2/AggregateGeometry/geometry.tgz",
"final": "yes",
"production": "yes",
"type": "",

The experiment generates a single key output, geometricalProperties, which is a tar file named geometry.tgz. This file is created by the stage2.AggregateGeometry step.
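
To locate key outputs from a script, you can pull the filepath entries out of output.json. This is a rough sketch that assumes the one-field-per-line layout shown in the excerpt; a JSON-aware tool would be more robust for other layouts. key_output_paths is a hypothetical helper name, and the usage comment assumes the paths are relative to the instance directory, as the excerpt suggests.

```shell
# Sketch: print the "filepath" value of every key output listed in output.json.
# Assumes each "filepath" field sits on its own line, as in the excerpt.
# key_output_paths is a hypothetical helper.
key_output_paths() {
  sed -n 's/.*"filepath"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Example usage (INSTANCE_DIR is a placeholder for your instance directory):
#   key_output_paths "$INSTANCE_DIR/output/output.json"
#   tar -tzf "$INSTANCE_DIR/stages/stage2/AggregateGeometry/geometry.tgz"
```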

Measured properties

Key outputs are not always immediately parseable without deep understanding of their format. To address this, ST4SD supports the interface feature. This feature allows workflow developers to extract measured properties and store them in a CSV file, making the data easier to consume.

Experiments that implement an interface will produce a file under ${instance-directory}/output/properties.csv. Here is the file that the nanopore-geometry-experiment produced above:

input-id;d_is;d_fs;d_isfs;asa_m^2/cm^3;asa_m^2/g;nasa_m^2/cm^3;nasa_m^2/g;unitcell_volume;density;av_volume_fraction;av_cm^3/g;nav_volume_fraction;nav_cm^3/g;n_pockets
CoRE2019/GUJVOX_clean;3.98106;2.74095;3.93078;0.0;0.0;760.654;564.703;1164.59;1.347;0.0;0.0;0.016015;0.0118894;1.0

The CSV file will always contain the input-id column. Its values are the identifiers of the input systems for which the virtual experiment measured properties. The remaining columns contain the measured properties for those input systems.
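
Note that the file shown above uses semicolons as the field separator. As an illustration, here is a small sketch that looks up one measured property for one input system by column name; property_for is a hypothetical helper name, and the sketch assumes no field contains a semicolon or quoting.

```shell
# Sketch: print one measured property for one input-id from properties.csv.
# Assumes ";" separates fields and that fields are not quoted.
# property_for is a hypothetical helper.
property_for() {
  awk -F';' -v id="$2" -v col="$3" '
    # Header row: remember where the input-id and requested columns are.
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "input-id") idc = i
        if ($i == col) c = i
      }
      next
    }
    # Data rows: print the requested column for the matching input system.
    idc && c && $idc == id { print $c }
  ' "$1"
}

# Example usage (INSTANCE_DIR is a placeholder for your instance directory):
#   property_for "$INSTANCE_DIR/output/properties.csv" CoRE2019/GUJVOX_clean density
```

Looking columns up by name rather than by position keeps the helper usable for experiments whose interface defines a different set of properties.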

Troubleshooting

Unable to create instance directory at specified location Error

If you run the same command a second time, elaunch.py will complain that the experiment instance already exists:

completed-on=N/A
cost=0
created-on=N/A
current-stage=None
error-description=ExperimentSetupError Error encountered while setting up experiment.\\nPackage validation failed - cannot deploy\\nUnderlying error: Unable to create instance directory at specified location.\\nUnderlying error: [Errno 17] File exists: '/private/tmp/test/nanopore-geometry-experiment/nanopore-geometry-experiment/nanopore-geometry-experiment/nanopore-geometry-experiment.instance'
exit-status=Failed
experiment-state=failed
stage-progress=0
stage-state=Initialising

To fix this problem, either remove the --nostamp argument or delete the directory and retry elaunch.py.

Investigating a Failed experiment

If the exit-status of your experiment instance is Failed then this means that at least one of your components was unable to terminate successfully. You can find the name of the component that caused the experiment to fail in the status file and printout.

Here is an example:

completed-on=2024-05-23 09:44:03.757679
cost=0
created-on=2024-05-23 09:42:19.223491
current-stage=stage1
error-description=Stage 1 failed. Reason:\\\\n3 jobs failed unexpectedly.\\\\nJob: stage1.PartialSum0. Returncode 1. Reason KnownIssue\\\\nJob: stage1.PartialSum2. Returncode 1. Reason KnownIssue\\\\nJob: stage1.PartialSum1. Returncode 1. Reason KnownIssue\\\\n
exit-status=Failed
experiment-state=finished
stage-progress=0.5
stage-state=failed

The error reports that multiple components failed: stage1.PartialSum0, stage1.PartialSum1, stage1.PartialSum2.
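
If the error description is long, you can extract the component references with standard text tools. The sketch below assumes the "Job: <stage>.<component>." wording seen in the example and component names made of letters, digits, underscores, and hyphens; failed_jobs is a hypothetical helper name.

```shell
# Sketch: list the components named in the error-description line of status.txt.
# Assumes entries look like "Job: stage1.PartialSum0." as in the example.
# failed_jobs is a hypothetical helper.
failed_jobs() {
  grep '^error-description=' "$1" \
    | grep -oE 'Job: [A-Za-z0-9_]+\.[A-Za-z0-9_-]+' \
    | sed 's/^Job: //'
}

# Example usage (INSTANCE_DIR is a placeholder for your instance directory):
#   failed_jobs "$INSTANCE_DIR/output/status.txt"
```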

You can also get a full view of the state of the experiment by running einspect.py -f all:

========== STAGE 0 ==========
Components using engine-type: engine
reference, state, backend, isWaitingOnOutput, engineExitReason, lastTaskRunTime, lastTaskRunState
stage0.GenerateInput, finished, local, True, Success, 0:00:00.358677, finished
========== STAGE 1 ==========
Components using engine-type: engine

After you spot a Failed component, try looking at the files it produced, including its stdout and stderr (for some backends both streams get fed into stdout). Recall that you can find these files under $INSTANCE_DIR/stages/stage<stage index>/<component name>/. Look for the out.stdout and out.stderr files.
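
The path convention above can be turned into a small helper. This is a sketch, assuming component references have the form stage<index>.<component name>, as in the examples on this page; component_dir and show_component_logs are hypothetical helper names.

```shell
# Sketch: map a component reference such as stage1.PartialSum0 to its working
# directory and show the tail of its log files.
# component_dir and show_component_logs are hypothetical helpers.

component_dir() {
  local instance_dir="$1" reference="$2"
  local stage="${reference%%.*}"   # text before the first dot, e.g. stage1
  local name="${reference#*.}"     # text after the first dot, e.g. PartialSum0
  printf '%s/stages/%s/%s\n' "$instance_dir" "$stage" "$name"
}

show_component_logs() {
  local dir f
  dir="$(component_dir "$1" "$2")"
  for f in "$dir/out.stdout" "$dir/out.stderr"; do
    # Some backends write both streams to out.stdout, so out.stderr may be absent.
    if [ -f "$f" ]; then
      printf '== %s ==\n' "$f"
      tail -n 20 "$f"
    fi
  done
}

# Example usage (INSTANCE_DIR is a placeholder for your instance directory):
#   show_component_logs "$INSTANCE_DIR" stage1.PartialSum0
```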

Sometimes, a component fails because one of its predecessors (direct, or indirect) produced unexpected output. To find the predecessors of a component, look at the $INSTANCE_DIR/conf/flowir_instance.yaml, locate the component you are investigating and then follow its predecessors by looking at the references of the component. You can then investigate the output files and stdout/stderr of those components to see if you can spot why the downstream component failed.

Learn more

Write experiments

Get an introduction to writing virtual experiments with ST4SD Core.

Advanced usage of elaunch.py

This example only scratches the surface of how you can use elaunch.py. See elaunch.py for advanced users for more information about using elaunch.py including running on High Performance Computing clusters.

Exploring the Registry UI

Learn about all the features of our web interface for browsing and examining virtual experiment packages and runs. You can visit the ST4SD Global Registry for a first look.

No Code, No Fuss creation of Experiments

Use an interactive Build Canvas and a Graph Library to create and modify experiments straight from your browser.