Write experiments
This page assumes you are familiar with running experiments locally using the elaunch.py command line tool. If you need a refresher, take a moment to read our docs before continuing any further.
- Wrapping a python script for native execution
- Packaging your virtual experiment
- Using containers for shareable virtual experiments
- Your first Simulation experiment with GAMESS US
Requirements
An understanding of how to run a virtual experiment locally.
A virtual environment with the st4sd-runtime-core python module:
python -m venv venv
. ./venv/bin/activate
pip install "st4sd-runtime-core[develop]>=2.5.0"
A Container Runtime system: install one of docker, podman, or Rancher Desktop
Wrapping a python script for native execution
For your first virtual experiment, we will start with a Python script and use a virtual environment where st4sd-runtime-core is installed. This is a great way to quickly prototype virtual experiments without worrying about making them shareable with others. For an example of a shareable experiment, check out Using containers for shareable virtual experiments.
Begin by creating the directory /tmp/hello-world and navigating into it. We are going to store all files relevant to the virtual experiment of this example in this directory.
A good way to become familiar with ST4SD is to wrap a simple python script in a virtual experiment and execute it. So, let’s build a hello world experiment using python. Create the file printer.py with the following contents:
import sys

print(" ".join(sys.argv[1:]))
This Python script is straightforward and does not rely on any external Python packages. If it did, you would need to install the required dependencies using pip install within the virtual environment where st4sd-runtime-core is installed. Since this script has no external dependencies, no additional Python modules need to be installed.
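To see what the script prints without launching an experiment, the same logic can be exercised as a function. This is a minimal sketch: render_message is a hypothetical name introduced here, and the lists stand in for sys.argv.

```python
def render_message(argv):
    """Join every argument after the script name with single spaces."""
    return " ".join(argv[1:])


# argv[0] is the script name, exactly as in sys.argv
print(render_message(["printer.py", "Hello", "world"]))
print(render_message(["printer.py", "Hello world"]))
```

Passing the message as one quoted argument or as several separate arguments produces the same line of text.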
To create a ST4SD virtual experiment based on this script, we will define it using a YAML file. The structure of a ST4SD virtual experiment definition is as follows:
entrypoint:
  # Instructions of the entry point to your experiment
components:
  # Templates, each of which executes a single task
workflows:
  # Templates, each of which is a pipeline of tasks which themselves
  # are either instances of Workflow or Component templates
Developers build experiments by connecting together Components and Workflows and identifying the entry point of the virtual experiment.
Components represent individual tasks, while Workflows represent pipelines (i.e. graphs) of Workflows and Components. The example provided demonstrates the execution of a single task. A helpful way to understand virtual experiments is to think of them as programs written in a programming language. In this analogy, Workflows and Components serve as functions, and the entrypoint is similar to declaring a main() function, specifying which function to execute and what arguments to pass to it.
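The programming-language analogy can be written out in plain Python. This is only an illustration of the analogy: the function names are hypothetical and none of them is part of ST4SD.

```python
def printer(message):
    """Plays the role of a Component: one task with one parameter."""
    return f"python printer.py {message}"


def pipeline(message):
    """Plays the role of a Workflow: it calls other templates in order."""
    return [printer(message), printer(message.upper())]


def main():
    """Plays the role of the entrypoint: picks a template and its arguments."""
    return pipeline(message="Hello world")


for step in main():
    print(step)
```

Just as main() only chooses which function to call and with what arguments, the entrypoint only names the template to instantiate and the values of its parameters.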
Let’s put together a simple experiment consisting of a single step that prints a message to its standard output.
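Place the following in the file hello-world.yaml:
entrypoint:
  entry-instance: printer
  execute:
    - target: <entry-instance>
      args:
        message: Hello world
components:
  - signature:
      name: printer
      parameters:
        - name: message
    command:
      executable: python
      arguments: /tmp/hello-world/printer.py "%(message)s"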
In this example, the entrypoint to the experiment is an instance of the printer component which sets its message parameter to the value “Hello world”.
Moving on to the components section, we observe that there is a single template called printer. The printer component template has a single parameter called message, which does not have a default value. Instances of this template run the executable python, passing the absolute path to the printer.py script and the message value as arguments. In the next example we will show you a better way to package experiments, alleviating the need to use absolute paths for your scripts.
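The "%(message)s" placeholder has the same shape as a named placeholder in Python's %-style string formatting. The sketch below assumes the runtime fills it in a similar way (an assumption, not something this page confirms) and shows the command line that would result for the message "Hello world".

```python
import shlex

# The arguments string from the component template
arguments_template = '/tmp/hello-world/printer.py "%(message)s"'
# The parameter values set by the entrypoint
parameters = {"message": "Hello world"}

# Python's %-formatting with a dict fills in the named placeholder
expanded = arguments_template % parameters

# shlex.split keeps the quoted message together as one argument, like a shell would
argv = ["python"] + shlex.split(expanded)
print(argv)
```

The quotes around the placeholder matter: without them the message would be split into two separate arguments.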
Let’s run the experiment using elaunch.py. Run the following command from inside the /tmp/hello-world directory:
elaunch.py --nostamp hello-world.yaml
After a few seconds you should see:
completed-on=2025-03-22 12:39:13.806401
cost=0
created-on=2025-03-22 12:39:07.382235
current-stage=stage0
exit-status=Success
experiment-state=finished
stage-progress=1.0
stage-state=finished
stages=['stage0']
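If you want to check this status programmatically, the key=value lines are easy to parse. The following is a minimal sketch: parse_status is a hypothetical helper, and the sample text is copied from the output above rather than read from any particular file.

```python
def parse_status(text):
    """Turn lines of the form key=value into a dictionary of strings."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        status[key] = value
    return status


sample = """\
exit-status=Success
experiment-state=finished
stage-progress=1.0
"""

status = parse_status(sample)
print(status["exit-status"], status["experiment-state"])
```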
The experiment will create the directory hello-world.instance and store all files it generated in it.
hello-world.instance
├── conf
│   ├── dsl.yaml # your virtual experiment definition
│   ├── flowir_instance.yaml # ignore this file
│   ├── flowir_package.yaml # ignore this file
│   └── manifest.yaml # ignore this file
├── elaunch.yaml
├── input
├── output
If you encountered any issues during the process, please refer to the troubleshooting section of the documentation for guidance on launching experiments locally.
Now that you have run the experiment, take a moment to explore its outputs. You can find the output files in the following directory: hello-world.instance/stages/stage0/entry-instance.
Exercise
In ST4SD you can override the parameters of entry-instance that the entrypoint sets via the dictionary entrypoint.execute[0].args.
For example, place the following into a new file my-variables.yaml:
global:
  message: my custom message
Then remove the hello-world.instance directory and run the experiment again, but this time load the my-variables.yaml file:
rm -rf hello-world.instance
elaunch.py --nostamp -a my-variables.yaml hello-world.yaml
The stdout of the printer component can be found in the following file: hello-world.instance/stages/stage0/entry-instance/out.stdout.
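The effect of the override in this exercise can be pictured as a dictionary merge. This sketch assumes that values under global replace the same-named entries of entrypoint.execute[0].args; effective_args is a hypothetical helper and does not reproduce the runtime's own code.

```python
# Values set by the entrypoint in hello-world.yaml
entrypoint_args = {"message": "Hello world"}

# Contents of my-variables.yaml, written out as a Python dictionary
variables_file = {"global": {"message": "my custom message"}}


def effective_args(defaults, overrides):
    """Return the defaults with any same-named global overrides applied."""
    merged = dict(defaults)
    merged.update(overrides.get("global", {}))
    return merged


print(effective_args(entrypoint_args, variables_file))
```

With the override in place, out.stdout should contain the custom message instead of "Hello world".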
Packaging your virtual experiment
When automating simulation codes with custom bash scripts, you may have experienced difficulties with absolute paths when relocating your codes to different directories or execution environments. ST4SD provides a solution to this issue. It offers two methods for packaging multiple files, with the most convenient approach being the use of the Standard project structure. This structure consists of a virtual experiment definition, defined in a YAML file, and an optional manifest file, which specifies additional directories to be included with the virtual experiment. The manifest file has the following format:
destinationDirectoryName: sourceDirectory
This instructs the runtime system to create a directory called destinationDirectoryName using the files from the path sourceDirectory. If the sourceDirectory is not an absolute path then it is considered relative to the location of the experiment definition YAML file.
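The path rule described above can be sketched with pathlib. resolve_source is a hypothetical helper written for this illustration, and /opt/tools is an example path; the runtime's own implementation may differ.

```python
from pathlib import PurePosixPath


def resolve_source(definition_file, source_directory):
    """Resolve a manifest sourceDirectory against the experiment definition file."""
    source = PurePosixPath(source_directory)
    if source.is_absolute():
        return source  # absolute paths are used as they are
    # relative paths are interpreted next to the definition YAML file
    return PurePosixPath(definition_file).parent / source


print(resolve_source("/tmp/hello-world/hello-world.yaml", "bin"))
print(resolve_source("/tmp/hello-world/hello-world.yaml", "/opt/tools"))
```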
To convert the above example to use the Standard project structure, first create the directory /tmp/hello-world/bin and move the printer.py script into it. Next, create the file manifest.yaml under the directory /tmp/hello-world/ with the following content:
bin: bin
Now, let’s update the virtual experiment definition to use the bin/printer.py file. You just need to change the last line in the components section of your hello-world.yaml file:
...
components:
  - signature:
      name: printer
      parameters:
        - name: message
    command:
      executable: python
      arguments: /tmp/hello-world/printer.py "%(message)s" # HERE
Replace the absolute path /tmp/hello-world/printer.py with bin/printer.py:ref. The :ref suffix indicates that this is a reference to a file, rather than a direct path. At runtime, the system will use the manifest.yaml file to resolve this reference, enabling you to include additional files with your virtual experiment definition in a flexible and portable way.
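One way to picture reference resolution is the sketch below. It assumes that the directories listed in the manifest are copied into the experiment's instance directory (the instance directory listing later on this page shows a bin directory there) and that a :ref token resolves to the copied file. resolve_reference is a hypothetical helper, not the runtime's implementation.

```python
from pathlib import PurePosixPath

REF_SUFFIX = ":ref"


def resolve_reference(token, instance_dir):
    """Turn 'bin/printer.py:ref' into a path under the instance directory."""
    if not token.endswith(REF_SUFFIX):
        return token  # a direct path is used as it is
    relative = token[: -len(REF_SUFFIX)]
    return str(PurePosixPath(instance_dir) / relative)


print(resolve_reference("bin/printer.py:ref", "/tmp/hello-world/hello-world.instance"))
print(resolve_reference("/app/printer.py", "/tmp/hello-world/hello-world.instance"))
```

Because the path is computed when the experiment runs, moving the experiment directory elsewhere does not require editing the YAML file.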
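Your updated hello-world.yaml file should now look like this:
entrypoint:
  entry-instance: printer
  execute:
    - target: <entry-instance>
      args:
        message: Hello world
components:
  - signature:
      name: printer
      parameters:
        - name: message
    command:
      executable: python
      arguments: bin/printer.py:ref "%(message)s"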
You should end up with the following files:
/tmp/hello-world
├── bin
│   └── printer.py
├── hello-world.yaml
└── manifest.yaml
Finally, let’s run this experiment:
elaunch.py --nostamp --manifest manifest.yaml hello-world.yaml
After a few seconds you should see this output on your terminal:
completed-on=2025-03-31 10:19:45.271792
cost=0
created-on=2025-03-31 10:19:39.121168
current-stage=stage0
exit-status=Success
experiment-state=finished
stage-progress=1.0
stage-state=finished
stages=['stage0']
Congratulations! You have successfully packaged your virtual experiment!
Exercise
Modify your printer.py script to import a Python package, such as transformers. To ensure successful execution, make sure to install transformers within the same virtual environment where st4sd-runtime-core is installed.
Next, run it with elaunch.py:
rm -rf hello-world.instance
elaunch.py --nostamp --manifest manifest.yaml hello-world.yaml
Double check that it runs to completion.
Using containers for shareable virtual experiments
To make experiments truly shareable, they must include the following key information:
- All executables that they run, along with their software dependencies
- How to map the executables to specific steps
- How to connect inputs to these steps
In the above example we wrapped a single-step executable into a virtual experiment, covering the second and third requirements for a single-step experiment. In this example, we will utilize a container to share the software dependencies of the printer.py python script, addressing the first requirement.
Create a new directory in /tmp/docker-package and cd into it. We will use it for the files of this virtual experiment.
Containerize your python application
Place the python dependencies of your script in a requirements.txt file. The printer.py python script that we use here does not have any python requirements, but we will install transformers in the container we use to execute the script to demonstrate the method.
The contents of the requirements.txt file are:
transformers==4.50.3
Next, create a file called Dockerfile with the following contents:
FROM python:3.11-slim
RUN apt-get update \
    && apt-get upgrade -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/lists/*
# Make sure that files under /app are part of $PATH
ENV PATH=/app:$PATH
Make sure you have the following files in the directory you are currently in:
/tmp/docker-package
├── Dockerfile
├── printer.py
└── requirements.txt
To build your container, run docker build:
docker build --platform linux/amd64 -f Dockerfile -t my-printer:latest .
Making your container available to others
If you plan to share your experiment with others, you will need to push your containers to a remote container registry, such as Docker Hub. This allows others to easily access and pull your container images, making it simpler to share and reproduce your experiment.
To push your container to a remote registry, you can use the following steps:
- Tag your container image: Use the docker tag command to assign a unique name to your image, including the registry URL and your username.
- Login to the registry: Use the docker login command to authenticate with the registry.
- Push the image: Use the docker push command to upload your image to the registry.
For example:
: # Tag the image
docker tag my-printer:latest <your-username>/my-printer:latest
: # Login to Docker Hub
docker login
: # Push the image
docker push <your-username>/my-printer:latest
Create a virtual experiment that uses the container
Create the file docker-package.yaml in the /tmp/docker-package directory with the following contents:
entrypoint:
  entry-instance: printer
  execute:
    - target: <entry-instance>
      args:
        message: Hello world
components:
  - signature:
The differences between this experiment and the hello-world.yaml experiment above are all about the printer component:
- The bin/printer.py:ref reference is replaced by the direct path /app/printer.py
  - The file is now located inside the container so using a direct path is perfectly fine
  - The runtime system will search for this executable in the $PATH environment variable of the component
- We set command.environment.PATH to include the path to the printer.py script
  - By default, components receive the virtual environment of the runtime process which is not guaranteed to be compatible with the environment variables that enable the execution of commands inside the container
- Configure the docker backend for this component
  - Set resourceManager.config.backend to docker
  - Set resourceManager.docker.image to my-printer:latest
  - Set resourceManager.docker.imagePullPolicy to IfNotPresent
    - This setting instructs the runtime to only attempt to pull the image if it’s not already present on the local machine
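To build intuition for these settings, the sketch below assembles a docker run command line that is roughly comparable to them. This is an assumption made for illustration: the page does not show how the runtime actually invokes the container, docker_command is a hypothetical helper, and the PATH value is an example.

```python
import shlex


def docker_command(image, executable, arguments, environment, pull_if_missing=True):
    """Assemble an argv list for `docker run` from component-like settings."""
    cmd = ["docker", "run", "--rm"]
    if pull_if_missing:
        # comparable in spirit to imagePullPolicy: IfNotPresent
        cmd += ["--pull", "missing"]
    for name, value in environment.items():
        # comparable in spirit to command.environment
        cmd += ["--env", f"{name}={value}"]
    cmd += [image, executable] + list(arguments)
    return cmd


cmd = docker_command(
    image="my-printer:latest",
    executable="/app/printer.py",
    arguments=["Hello world"],
    environment={"PATH": "/app:/usr/local/bin:/usr/bin:/bin"},  # example value
)
print(shlex.join(cmd))
```

The sketch only prints the command. It does not run docker, so it is safe to try on a machine without a container runtime.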
The resulting file tree in /tmp/docker-package should be:
/tmp/docker-package
├── Dockerfile # To build image
├── requirements.txt # To build image
├── printer.py # To build image
└── docker-package.yaml # To execute experiment
Exercise
Run your virtual experiment using elaunch.py:
elaunch.py --nostamp docker-package.yaml
If you encountered any issues during the process, please refer to the troubleshooting section of the documentation for guidance on launching experiments locally.
Now that you have run the experiment, take a moment to explore its outputs. You can find the output files in the following directory: docker-package.instance/stages/stage0/entry-instance.
Your first Simulation experiment with GAMESS US
In this example, we will create a virtual experiment that runs the Parameterized Model 3 (PM3) method in GAMESS US. PM3 is a semi-empirical quantum chemistry method. Scientists use it to calculate molecular properties and energies when computational efficiency is a priority, as an alternative to more accurate but slower high-level quantum methods like Hartree-Fock or Density Functional Theory (DFT).
Start by creating a new directory in /tmp/gamess-us-pm3 containing 2 directories, bin and hooks, like so:
/tmp/gamess-us-pm3
├── bin
└── hooks
Create the file bin/run-gamess.sh using the following:
#!/usr/bin/env sh
molecule=$1
cpus=$2
# The restart hook expects the filename to exist in the working directory
# of GAMESS US
molecule_name=$(basename "${molecule}")
# Quote both paths so that file names containing spaces are copied correctly
cp "${molecule}" "${molecule_name}"
Then download the extract_gmsout.py script and store it in the bin directory.
Next, make the run-gamess.sh script executable by running chmod +x bin/run-gamess.sh from inside the /tmp/gamess-us-pm3 directory.
Download the RestartHook example and save it under hooks/semi_empirical_restart.py. This script checks if the PM3 method in GAMESS US has converged. If not, it triggers a task restart. You can find more information on RestartHooks in our documentation about restarting tasks.
Next, prepare the definition of the experiment by pasting the following into the gamess-us-pm3.yaml file:
entrypoint:
  entry-instance: gamess-us-pm3
  execute:
    - target: <entry-instance>
      args:
        input.molecule.inp: input/molecule.inp
        gamess-number-processors: 1
        gamess-memory: "4096Mi"
        # gamess-gpus is only relevant for execution on Kubernetes
Finally create your manifest.yaml file:
bin: bin
hooks: hooks
You should now have the following file structure:
/tmp/gamess-us-pm3
├── bin
│   ├── extract_gmsout.py
│   └── run-gamess.sh
├── hooks
│   └── semi_empirical_restart.py
├── manifest.yaml
└── gamess-us-pm3.yaml
Exercise
Try starting your experiment now using a container runtime.
Create your GAMESS-US input molecule.inp file in the directory /tmp/gamess-us-pm3. You can use this example input file:
 $CONTRL COORD=UNIQUE SCFTYP=RHF RUNTYP=OPTIMIZE MULT=1
 ISPHER=1 ICHARG=0 MAXIT=100 $END
 $SYSTEM MWORDS=100 TIMLIM=600 $END
 $BASIS GBASIS=PM3 $END
 $GUESS GUESS=HUCKEL $END
 $SCF DIRSCF=.t. FDIFF=.f. DIIS=.t. $END
 $STATPT NSTEP=500 PROJCT=.f. IHREP=20 HSSEND=.t. $END
 $DATA
CH4 C CH4
You should now have this structure:
/tmp/gamess-us-pm3
├── bin
│   ├── extract_gmsout.py
│   └── run-gamess.sh
├── gamess-us-pm3.yaml
├── hooks
│   └── semi_empirical_restart.py
├── manifest.yaml
└── molecule.inp
Launch your experiment providing the input file molecule.inp and the manifest file manifest.yaml.
elaunch.py --nostamp --manifest manifest.yaml -i molecule.inp gamess-us-pm3.yaml
After a couple of minutes you should see:
completed-on=2025-03-31 11:37:49.301663
cost=0
created-on=2025-03-31 11:35:46.527619
current-stage=stage0
exit-status=Success
experiment-state=finished
stage-progress=1.0
stage-state=finished
stages=['stage0']
The gamess-us-pm3.instance directory will have the following structure:
gamess-us-pm3.instance
├── bin
│   ├── extract_gmsout.py
│   └── run-gamess.sh
├── conf
│   ├── dsl.yaml
│   ├── flowir_instance.yaml
│   ├── flowir_package.yaml
│   └── manifest.yaml
Examine the files under stages/stage0/optimise and stages/stage0/parse-gamess.
These were the contents of stages/stage0/parse-gamess/energies.csv for the experiment we ran on our laptop:
label,completed,total-energy,homo,lumo,gap,electric-moments,total-time,total-time-per-core
molecule,OK,-180.53313527498008,-13.641,4.245,17.886,0.000050,0.1,0.10
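A short script can read this file and sanity-check it. The sketch below embeds the two lines shown above as a string instead of opening the file, and checks that the gap column equals lumo minus homo, which holds for these values.

```python
import csv
import io

energies_csv = """\
label,completed,total-energy,homo,lumo,gap,electric-moments,total-time,total-time-per-core
molecule,OK,-180.53313527498008,-13.641,4.245,17.886,0.000050,0.1,0.10
"""

# DictReader uses the header line to name each column
rows = list(csv.DictReader(io.StringIO(energies_csv)))
row = rows[0]

homo = float(row["homo"])
lumo = float(row["lumo"])
gap = float(row["gap"])

# The reported gap should match the difference between the two orbital energies
print(row["label"], row["completed"], round(lumo - homo, 3), gap)
```

To run it against your own results, replace io.StringIO(energies_csv) with an open file handle for stages/stage0/parse-gamess/energies.csv inside your instance directory.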
What’s next?
- Learn more about writing experiments, including more advanced features and best practice here
- Learn how to add key-outputs and interfaces to your experiments
- More information on running experiments directly, i.e. via elaunch.py here
- More information on the DSL of ST4SD, i.e. how to write experiments here
- More information on how to structure and test your experiments here