The previous section shows how to write basic linear workflows. This chapter describes how to construct workflows containing loops and branches. Looping and conditional control constructs need some way of testing and altering values, so we start by introducing workflow variables.
This document is still a bit tangled at present - you may need to read the whole thing once before it starts to make sense.
A workflow may contain workflow variables - named
variables that contain simple Java objects and structures.
Variables may be accessed and altered from <script>
elements within the workflow document, and may also be referenced from script
expressions in other workflow activities.
The <set> activity defines a new
workflow variable, or updates the value of an existing variable. It has
two attributes: var - the name of the workflow variable
(required); value - the value to set this variable to
(optional).
If no value is provided, the variable is created and set to
null.
The value attribute may be a literal string or
a script
expression.
The <unset> activity deletes a
workflow variable. It has one required attribute - var -
the name of the workflow variable to delete.
If a workflow variable is referenced after it has been deleted, an exception is thrown.
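As a minimal sketch of the lifecycle described above, the fragment below creates a variable, uses it, and then deletes it. The variable name tmp is purely illustrative, and the fragment uses only the set, script and unset elements described in this chapter.

```xml
<!-- create a workflow variable holding a literal string -->
<set var="tmp" value="scratch value" />
<script>
  <body>
    print(tmp); // the variable is readable here
  </body>
</script>
<!-- delete the variable once it is no longer needed -->
<unset var="tmp" />
<!-- any reference to 'tmp' from this point on would throw an exception -->
```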
<scope> is a container for activities
that introduces a new nested scope for workflow variables. Any workflow
variables defined within a nested scope are not visible outside that
scope. Any attempt to reference such a variable from outside its scope throws an exception.
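The fragment below sketches how a nested scope might be used. It assumes that a scope element can hold several activities directly (if the schema requires it, wrap them in a sequence) and that variables defined in the enclosing scope remain visible inside the nested one. The variable names outer and inner are illustrative only.

```xml
<set var="outer" value="defined in the enclosing scope" />
<scope>
  <!-- 'inner' exists only within this scope element -->
  <set var="inner" value="defined in the nested scope" />
  <script>
    <body>
      print(inner);
      print(outer); // assumption: enclosing variables are visible here
    </body>
  </script>
</scope>
<!-- referencing 'inner' here would throw an exception; 'outer' is still available -->
```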
The value of certain attributes and elements may contain references to
workflow variables and other scripting
language expressions. These script expressions are
delimited by ${..}. The examples below show how script expressions are
interpreted:
<!-- in the variable 'a' store the value 'hello world' (java.lang.String) -->
<set var="a" value="hello world" />
<!-- in the variable 'b', store the value '1' (java.lang.String) -->
<set var="b" value="1" />
<!-- in the variable 'c', store the value 1 (java.lang.Integer) -->
<set var="c" value="${1}" />
<!-- in the variable 'd' store the value '1 + 1' (java.lang.String) -->
<set var="d" value="${c} + 1" />
<!-- in the variable 'e' store the value 2 (java.lang.Integer) -->
<set var="e" value="${c + 1}" />
<!-- in the variable 'now' store the current date (java.util.Date) -->
<set var="now" value="${new java.util.Date()}" />
<!-- in the variable 'nowString' store the current date (java.lang.String) - note trailing space -->
<set var="nowString" value="${new java.util.Date()} " />
<!-- in the variable 'f' store the value '1 - hello world and goodbye. ' (java.lang.String) - note trailing space -->
<set var="f" value="${c} - ${a} and goodbye. "/>
<!-- in the variable 'u' store the value 'http://www.astrogrid.org' (java.net.URL) -->
<set var="u" value="${new java.net.URL('http://www.astrogrid.org')}" />
<!-- in the variable 'scheme' store the value 'http' (java.lang.String) -->
<set var="scheme" value="${u.getScheme()}" />
<!-- in the variable 'scheme' store the value 'http' using concise bean-property access -->
<set var="scheme" value="${u.scheme}" />
The <script> element is an activity that executes
some inline scripting code. This is a very versatile activity; the
examples later in this chapter show some potential uses.
The script element contains an optional
description element and a mandatory body element
that contains the script text. When the script is executed, a step-execution-record element is added
to record its execution.
A script may reference workflow variables, reading and storing data in them. The changes to the workflow variables are visible further on in the workflow. A script may also define local variables, functions, etc. However, these are only available to the script itself - they are not visible to subsequent scripts or script expressions. Hence any result that is to be accessed later should be stored in a previously-defined workflow variable.
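The difference between workflow variables and script-local variables can be sketched as follows. The names total and values are illustrative. As in the other examples in this chapter, a name assigned inside a script that was not previously declared with set is treated as local to that script.

```xml
<set var="total" value="${0}" /> <!-- declare the workflow variable first -->
<script>
  <body>
    values = [1, 2, 3];      // local: not visible once this script finishes
    total = values.size();   // stored in the workflow variable 'total'
  </body>
</script>
<script>
  <body>
    print(total);            // the updated workflow variable is visible here
    // 'values' is not available in this script
  </body>
</script>
```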
The scripting language used within script expressions and
<script> elements is Groovy - http://groovy.codehaus.org/. Groovy describes itself
as follows:
Groovy is a new agile dynamic language for the JVM combining lots of great features from languages like Python, Ruby and Smalltalk and making them available to the Java developers using a Java-like syntax.
Groovy is designed to help you get things done on the Java platform in a quicker, more concise and fun way - bringing the power of Python and Ruby inside the Java platform.
Groovy is largely a superset of Java - most Java expressions and statements are valid in Groovy scripts. The Java subset is sufficient for most purposes and should be manageable for anyone who has had experience with Java / C / C++ / JavaScript - the notation is the same.
However, Groovy does provide further language features and syntactic sugar that make it more concise and easier to use - dynamic typing, native syntax for collections, closures and internal iterators, regular expressions, and support for generating and consuming XML.
There's also a handy reference card to print out - http://docs.codehaus.org/download/attachments/2715/groovy-reference-card.pdf
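To give a flavour of the Groovy features listed above, the script below uses dynamic typing, native list syntax and closures. It is a small illustrative sketch rather than part of any real workflow, and the variable names are arbitrary.

```xml
<script>
  <body>
    names = ['alpha', 'beta', 'gamma'];            // native list syntax, no type declarations
    lengths = names.collect{ it.length() };        // closure applied to each element: [5, 4, 5]
    longNames = names.findAll{ it.length() > 4 };  // closure used as a filter: ['alpha', 'gamma']
    print(lengths);
    print(longNames);
  </body>
</script>
```

The same logic written in plain Java style, with explicit loops and type declarations, should also work, since Java-style statements are accepted in Groovy scripts.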
Print out a message (which gets captured into the execution record).
<script>
<body>
print("hello world");
</body>
</script>
Extract a list of URLs from a VOTable returned by a previous step, and store it in a workflow variable for later use. This example uses methods native to Groovy - in the examples chapter we show how to do the same thing more concisely using the STIL library.
<step result-var="results">
<!-- omitted for clarity -->
</step>
<set var="urlList" /> <!-- declare a variable, but don't initialize it-->
<script>
<body>
if (results.size() != 1) {
jes.error("previous step didn't produce expected number of results");
} else {
votable = results.get('votable'); // access result of previous step
parser = new XmlParser(); //create new parser
nodes = parser.parseText(votable); //parse votable into node tree
urlList = nodes.depthFirst().findAll{it.name() == 'STREAM'}.collect{it.value()}.flatten(); // filter node tree on 'STREAM', project value
print(urlList); // show what we've found
}
</body>
</script>
The <if> element allows conditional execution. It
has a required attribute test, which must contain a script
expression that evaluates to a boolean.
The if element may have a then child element, an else
child element, or both. Each contains an activity (or
sequence of activities) that will be executed depending on the value of
the test attribute.
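A conditional might look like the sketch below. The variable name count and the messages are illustrative. The then branch runs when the test evaluates to true, and the else branch runs otherwise.

```xml
<set var="count" value="${1}" /> <!-- an Integer, because the value is a script expression -->
<if test="${count == 1}">
  <then>
    <script>
      <body>
        print("exactly one item");
      </body>
    </script>
  </then>
  <else>
    <script>
      <body>
        print("zero or several items");
      </body>
    </script>
  </else>
</if>
```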
The <while> element expresses a while loop. It has a
required attribute test which must contain a script
expression that evaluates to a boolean.
Its body is an activity (or sequence / flow of activities) that will be
executed repeatedly for as long as the test evaluates to true.
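The sketch below shows a simple counting loop, assuming the counter is held in a workflow variable (here called i, an illustrative name) that the loop body updates from a script. The test uses != rather than a less-than comparison only to avoid having to escape the < character inside the XML document.

```xml
<set var="i" value="${0}" />
<while test="${i != 3}">
  <script>
    <body>
      print(i);
      i = i + 1; // update the workflow variable so that the loop terminates
    </body>
  </script>
</while>
```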
The <for> element expresses a for loop. The
structure of the for loop is similar to the for in Python (or for-each
in JavaScript) - it iterates over a sequence, rather than using an
arithmetic expression as in Java / C / C++.
The for element has two required attributes:
items which must evaluate to a list of items to iterate
over; and var which provides the name of the loop variable
to assign each element of the list to. The body of the for
element is an activity (or sequence / flow of activities) that will be
executed for each item in the list.
Groovy provides native syntactic support for quickly defining numeric sequences - http://groovy.codehaus.org/Collections
Count up to 10.
<for var="x" items="${1..10}"> <!-- start..finish is Groovy syntax for an inclusive numeric range -->
<script>
<body>
print(x);
</body>
</script>
</for>
Call a CEA tool for each item in a list of URLs (such as the one created in the earlier example).
<for var="u" items="${urlList}">
<sequence>
<script>
<body>
jes.info("calling tool for ${u}")
</body>
</script>
<step name="something">
<tool name="aTool" interface="simple">
<input>
<parameter name="input" indirect="true">
<value>${u}</value><!-- u contains the url of the resource which contains this parameter value. -->
</parameter>
</input>
<output>
<parameter name="result" indirect="true">
<value>vospace:/myresults/${u.tokenize('/').pop()}-results.dat</value>
<!-- use the last part of the input url as part of the output filename -->
</parameter>
</output>
</tool>
</step>
</sequence>
</for>
The <parfor> element expresses a parallel
for loop. It has the same structure as the for
loop, but executes its loop body simultaneously for each item in the
items list.
This construct is useful for starting many CEA application executions in
parallel. For example, the previous example could be altered to process
each URL in the list simultaneously by simply replacing the
for element with a parfor element.
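A minimal sketch of a parallel loop is given below. It assumes that the workflow variable urlList has already been populated, as in the earlier script example. The loop body here only logs a message, but it could equally contain a step that calls a CEA tool.

```xml
<parfor var="u" items="${urlList}">
  <script>
    <body>
      jes.info("processing ${u}"); // the body runs for every item at the same time
    </body>
  </script>
</parfor>
```

Because the iterations run simultaneously, the order in which their messages appear is not guaranteed.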
When an error occurs during the execution of an activity, the normal
flow of control is interrupted. The error is recorded, and then
propagates upwards (as with exceptions in other languages). If it
reaches the root workflow element, then execution of the
workflow halts.
The workflow schema defines a try element that can be used
to wrap activities and intercept errors. There is also a
catch element, which can be used to define activities to
execute only when an error occurs.
NB: try and catch are not
implemented at the moment (Iteration 7).