DSL 2.0 Specification
Use this page to learn about the new Domain Specific Language (DSL 2.0) of ST4SD and how it works.
- Namespace
- Entrypoint
- Workflow
- Component
- Assigning values to parameters
- OutputReference
- Example
- Differences between DSL 2.0 and FlowIR
DSL 2.0 is the new (and beta) way to define the computational graphs of ST4SD workflows.
Namespace
In DSL 2.0, a Computational Graph consists of Components which can be grouped under Workflow containers. It also has an Entrypoint which points to the root node of the graph, which is an instance of a Component or Workflow template.
A Namespace is simply a container for the Component, Workflow, and Entrypoint definitions which represent the Computational Graph of one ST4SD workflow.
Below is an example of a Namespace containing a single component that prints the message Hello world to the terminal.
entrypoint:
  entry-instance: print
  execute:
    - target: "<entry-instance>"
      args:
        message: Hello world
components:
  - signature:
      name: print
Entrypoint
The optional Entrypoint serves a single purpose: it describes how to execute the root Template instance of the Computational Graph.
Its schema is:
# This executes an instance of $template which is called "<entry-instance>"
entry-instance: $template # name of a Component or Workflow template
execute: # an array with exactly 1 entry
  - target: <entry-instance> # which instance of a Template to execute.
                             # In this scope there is only <entry-instance>
    args:
      $paramName: $value # one for each parameter of the template that
                         # the "target" points to
The entry-instance field receives the name of a Template and creates an instance of it called <entry-instance>.
The execute field then describes how to “execute” the <entry-instance>, i.e. how to populate the arguments of the associated Template.
In execute[].args you:
- must provide values for any parameters in the child $template which do not have default values
- may override the value of the parameters in $template which have default values
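The two rules above can be sketched in a few lines of Python. This is only an illustration of the rule, not the ST4SD runtime's implementation: the function name resolve_args is hypothetical, the parameters list is assumed to mirror the "parameters" entries of a Template signature, and the special input.$filename parameters described below are ignored.

```python
def resolve_args(parameters, args):
    """Combine the defaults of a Template's parameters with execute[].args.

    parameters: list of dicts shaped like the entries of signature.parameters,
                e.g. {"name": "message"} or {"name": "count", "default": 1}
    args:       dict mapping parameter names to values
    """
    resolved = {}
    missing = []
    for parameter in parameters:
        name = parameter["name"]
        if name in args:
            # An explicit value always wins, with or without a default
            resolved[name] = args[name]
        elif "default" in parameter:
            # No explicit value: fall back to the default
            resolved[name] = parameter["default"]
        else:
            # No explicit value and no default: the rule is violated
            missing.append(name)
    if missing:
        raise ValueError(f"missing values for parameters: {missing}")
    return resolved
```

For instance, with one parameter that has no default and one that does, passing only the first yields both values, and passing neither raises an error.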
The Template instance that the entrypoint points to can have special parameters which are data references to paths that are external to the workflow.
These parameters must be called input.$filename and they must not have default values in the signature of the Template definition.
The entrypoint may not explicitly override the values of these parameters; the runtime system auto-generates them.
Consider a scenario where the Template that the <entry-instance> step points to has a parameter called input.my-input.db.
The runtime will post-process the entrypoint.execute[0].args dictionary to include the following key-value pair:
input.my-input.db: "input/my-input.db"
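As a rough sketch of this post-processing step, the following Python mimics what the text describes. The function name add_input_args is hypothetical, and the generalisation from the single example above to "input/" followed by the filename is an assumption; the actual runtime may behave differently.

```python
def add_input_args(parameter_names, args):
    """Return a copy of args with an entry for every input.$filename parameter.

    parameter_names: names of the parameters of the Template that the
                     entrypoint points to
    args:            the entrypoint.execute[0].args dictionary
    """
    prefix = "input."
    result = dict(args)  # leave the caller's dictionary untouched
    for name in parameter_names:
        if name.startswith(prefix):
            filename = name[len(prefix):]
            # Assumption: the generated value is always "input/<filename>"
            result[name] = f"input/{filename}"
    return result
```

Calling it with the parameter names input.my-input.db and message, and args that only set message, produces a dictionary that also maps input.my-input.db to input/my-input.db.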
In Assigning values to parameters we describe in more detail how to assign values to parameters of Template instances in general.
Workflow
A Workflow is a Template that describes how to execute a number of Template instances called steps.
It has a signature that consists of a unique name and a parameter list.
Each such step can consume the outputs of a sibling step, or the parameters of the parent Workflow.
The outputs of a workflow are its steps. The schema of Workflow is:
signature:
  name: $Template # the name of this Workflow Template - must be unique
  parameters:
    - name: $paramName
      # optional default value
      default: $value # str, number, or dictionary of {str: str/number}
steps: # which steps to instantiate
  $stepName: $Template # for example child: simulation-code
execute: # how to execute the steps - one for each entry of steps
In Assigning values to parameters we describe how to assign values to parameters of Template instances.
Component
A Component describes how to execute a task.
Just like a Workflow Template, it has a signature that consists of a name and a parameter list.
The outputs of a Component are the paths under its working directory.
The schema of a Component is:
signature:
  name: $Template # the name of this Component Template - must be unique
  parameters:
    - name: $paramName
      # optional default value
      default: $value # str, number, or dictionary of {str: str/number}
# All the FlowIR fields, except for stage, name, references, and override
command:
  executable: str
The above fields are the same as those in the Component section of the Workflow Specification in FlowIR.
For more information, read our documentation on the basic FlowIR component fields.
Assigning values to parameters
Both Component and Workflow templates are instantiated in the same way: by declaring them as a step and adding an entry to an execute block which assigns values to the Template’s parameters.
The value of a parameter can be a number, string, or a key: value dictionary.
The body of a Template can reference its parameters like so: %(parameterName)s.
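The %(parameterName)s syntax looks the same as Python's printf-style formatting with mapping keys, so a small Python sketch can show what substituting a parameter into a string field looks like. This is only an analogy: the function name render_field is hypothetical, and the DSL's own substitution rules (for example, how a literal % is handled) may differ from Python's.

```python
def render_field(field, parameters):
    """Substitute %(name)s references in a string with parameter values."""
    # Python's % operator resolves each %(name)s from the mapping
    return field % parameters


print(render_field("echo %(message)s", {"message": "Hello world"}))
# prints: echo Hello world
```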
When assigning values to the parameters of a template via the execute[].args dictionary you:
- must provide values for any parameters in the child $template which do not have default values
- may override the value of the parameters in $template which have default values
- may use OutputReferences to indicate dependencies to steps (definition follows this bullet list)
- may use %(parentParameter)s to indicate a dependency to the value that the parent parameter has. In turn, that can be a dependency to the output of a Template instance or an input file, or it might just be a literal constant
- may use a $key: $value dictionary to propagate a dictionary-type value. At the moment, Templates can only reference this kind of parameter to set the value of the command.environment field of Components
- may use %(input.$filename)s to propagate an input file reference from a parent to a step
  - Eventually a step must apply a DataReferences :$method to the parameter to indicate that it wishes to consume the input file
Wanna find out more? Check out our example.
OutputReference
The format of an OutputReference is:
<$stepId>/$optionalPath:$optionalMethod
$stepId is a / separated array of stepNames starting from the scope of the current workflow. For example, the OutputReference <one/child>/file.txt:ref resolves to the absolute path of the file file.txt that the component child produces under the sibling step one, which is an instance of a Workflow template. You can find more reference methods in our DataReferences docs.
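To make the format concrete, here is a minimal Python sketch that splits an OutputReference string into its three parts. It is not the parser that ST4SD uses: the names OutputReference and parse_output_reference are hypothetical, and the regular expression assumes that step names contain no angle brackets, that paths contain no colon, and that method names consist of word characters only.

```python
import re
from typing import List, NamedTuple, Optional

# <$stepId>/$optionalPath:$optionalMethod
_PATTERN = re.compile(
    r"^<(?P<step>[^<>]+)>"      # <$stepId>, a "/" separated list of step names
    r"(?:/(?P<path>[^:]*))?"    # optional /$optionalPath
    r"(?::(?P<method>\w+))?$"   # optional :$optionalMethod
)


class OutputReference(NamedTuple):
    step_id: List[str]
    path: Optional[str]
    method: Optional[str]


def parse_output_reference(text):
    """Split an OutputReference string into step names, path, and method."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not an OutputReference: {text!r}")
    return OutputReference(
        step_id=match.group("step").split("/"),
        path=match.group("path") or None,  # treat an empty path as absent
        method=match.group("method"),
    )
```

Applied to <one/child>/file.txt:ref, this returns the step names one and child, the path file.txt, and the method ref. Applied to <consume-input>, it returns a single step name with no path and no method.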
Example
Here is a simple example which uses one Workflow and one Component template to run 2 tasks:
- consume-input: prints the contents of an input file called my-input.db
- consume-sibling: prints the text “my sibling said” followed by the stdout of the sibling step <consume-input>
entrypoint:
  entry-instance: main
  execute:
    - target: <entry-instance>
workflows:
  - signature:
      name: main
      parameters:
        # special variable with auto-populated value
To try it out, store the above DSL in a file called dsl-params.yaml and run
pip install "st4sd-runtime-core[develop]"
which installs the command-line tool elaunch.py, followed by:
echo "hello world" >my-input.db
elaunch.py -i my-input.db --failSafeDelays=no -l40 dsl-params.yaml
Differences between DSL 2.0 and FlowIR
There are some differences between DSL 2.0 and FlowIR.
In the current version (0.2.x) of DSL 2.0:
- we offer support for natural composition of Computational Graphs using Workflow and Component templates
- the signature field replaces the stage, name, references, and override fields of the component specification in FlowIR
- settings and inputs flow through parameters; we do not support global/stage environments or variables
- the fields of components can contain %(parameter)s references as well as component %(variable)s
- dependencies between components are defined by referencing the output of a producer component in one parameter of the consumer component
- DataReferences are reserved for referencing input files only
- the equivalent of a DataReference for Template instances is an OutputReference
- key-outputs and interface
- data files and manifests
DSL 2.0 will eventually contain a superset of the FlowIR features. However, the current beta version of DSL 2.0 does not support:
- FlowIR platforms
- application-dependencies
  - however, you can use a manifest to implicitly define your application-dependencies