Introduction

The previous section shows how to write basic linear workflows. This chapter describes how to construct workflows containing loops and branches. Looping and conditional control constructs need some way of testing and altering values, so we start by introducing workflow variables.

This document is still a bit tangled at present; you may need to read it through once before it starts to make sense.

Workflow Variables

A workflow may contain workflow variables: named variables that hold simple Java objects and structures. Variables may be accessed and altered from <script> elements within the workflow document, and may also be referenced from script expressions in other workflow activities.

Set

The <set> activity defines a new workflow variable, or updates the value of an existing variable. It has two attributes: var - the name of the workflow variable (required); value - the value to set this variable to (optional).

If no value is provided, the variable is created and set to null.

The value attribute may be a plain literal string or a script expression.

Unset

The <unset> activity deletes a workflow variable. It has one required attribute - var - the name of the workflow variable to delete.

If a workflow variable is referenced after it has been deleted, an exception is thrown.
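
As a minimal sketch of the variable lifecycle, the fragment below creates a variable, updates it, and then deletes it. The variable name tmp and its value are illustrative only.

<!-- create 'tmp' with no value; it is set to null -->
<set var="tmp" />

<!-- update the existing variable -->
<set var="tmp" value="scratch data" />

<!-- delete the variable; any later reference to 'tmp' throws an exception -->
<unset var="tmp" />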

Scope

<scope> is a container for activities that introduces a new nested scope for workflow variables. Any workflow variables defined within a nested scope are not visible outside that scope. Any attempt to reference such a variable from outside the scope throws an exception.
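
A sketch of nested scoping follows. It assumes that a scope can hold several activities directly (wrap them in a sequence if your schema version requires it) and that variables from the enclosing scope remain visible inside the nested one. The variable names are illustrative.

<set var="outer" value="defined outside the scope" />
<scope>
  <set var="inner" value="defined inside the scope" />
  <script>
    <body>
      print(outer); // assumed: variables from the enclosing scope are visible here
      print(inner);
    </body>
  </script>
</scope>
<!-- referencing 'inner' at this point would throw an exception -->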

Script Expressions

The value of certain attributes and elements may contain references to workflow variables and other scripting language expressions. These script expressions are delimited by ${..}. Script expressions are interpreted as follows:

  • If the entire content of an attribute / element is a single script expression (with no further characters or whitespace), then the Java object that results from evaluating the expression is returned.
  • Otherwise, if the attribute / element contains more than one script expression, or an expression plus other characters, each of the expressions is evaluated in turn, the results are converted to Strings, and everything is concatenated together.

Examples

<!-- in the variable 'a' store the value 'hello world' (java.lang.String) -->
<set var="a" value="hello world" />

<!-- in the variable 'b', store the value '1' (java.lang.String) -->
<set var="b" value="1" />

<!-- in the variable 'c', store the value 1 (java.lang.Integer) -->
<set var="c" value="${1}" />

<!-- in the variable 'd' store the value '1 + 1' (java.lang.String) -->
<set var="d" value="${c} + 1" />

<!-- in the variable 'e' store the value 2 (java.lang.Integer) -->
<set var="e" value="${c + 1}" />

<!-- in the variable 'now' store the current date (java.util.Date) -->
<set var="now" value="${new java.util.Date()}" />

<!-- in the variable 'nowString' store the current date (java.lang.String) - note trailing space -->
<set var="nowString" value="${new java.util.Date()} " />

<!-- in the variable 'f' store the value '1 - hello world and goodbye. ' (java.lang.String) - note trailing space -->
<set var="f" value="${c} - ${a} and goodbye. "/>

<!-- in the variable 'u' store the value 'http://www.astrogrid.org' (java.net.URL) -->
<set var="u" value="${new java.net.URL('http://www.astrogrid.org')}" />

<!-- in the variable 'scheme' store the value 'http' (java.lang.String); java.net.URL exposes the scheme as getProtocol() -->
<set var="scheme" value="${u.getProtocol()}" />

<!-- in the variable 'scheme' store the value 'http' using concise bean-property access -->
<set var="scheme" value="${u.protocol}" />

Script

The <script> element is an activity that executes some inline scripting code. This is a very versatile activity. Some potential uses are:

  • post-processing the results received from a CEA tool (e.g. extracting fields from a VOTABLE)
  • manipulating the contents of workflow variables, computing values that can't easily be expressed as a simple script expression
  • dynamically adding parameters to subsequent calls to CEA tools
  • interrogating AstroGrid registries
  • moving, copying and deleting files in VOSpace
  • initiating, tracking and aborting other workflow jobs
  • performing administrative functions, such as creating user accounts

The script element contains an optional description element and a mandatory body element that contains the script text. When the script is executed, a step-execution-record element is added to record the execution.

A script may reference workflow variables, reading and storing data in them. The changes to the workflow variables are visible further on in the workflow. A script may also define local variables, functions, etc. However, these are only available to the script itself - they are not visible to subsequent scripts or script expressions. Hence any result that is to be accessed later should be stored in a previously-defined workflow variable.
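
The distinction can be sketched as follows. The names count and items are illustrative. The sketch assumes, in line with the other examples in this chapter, that a plain assignment to a name not declared with set creates a variable local to the script.

<set var="count" /> <!-- workflow variable, declared before the script runs -->
<script>
  <body>
    items = ['a', 'b', 'c']; // assumed local to this script
    count = items.size();    // stored in the workflow variable
  </body>
</script>
<script>
  <body>
    print(count); // the workflow variable is still available here
    // 'items' is not available: it was local to the previous script
  </body>
</script>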

Scripting Language

The scripting language used within script expressions and <script> elements is Groovy - http://groovy.codehaus.org/. Groovy describes itself as follows:

Groovy is a new agile dynamic language for the JVM combining lots of great features from languages like Python, Ruby and Smalltalk and making them available to the Java developers using a Java-like syntax.

Groovy is designed to help you get things done on the Java platform in a quicker, more concise and fun way - bringing the power of Python and Ruby inside the Java platform.

Groovy is largely a superset of Java: most Java expressions and statements are valid in Groovy scripts. The Java subset is sufficient for most purposes and should be manageable for anyone who has experience with Java / C / C++ / JavaScript, since the notation is the same.

However, Groovy also provides further language features and syntactic sugar that make it more concise and easier to use: dynamic typing, native syntax for collections, closures and internal iterators, regular expressions, and support for generating and consuming XML.
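
To give a flavour of these features, here is a small sketch of a script body that uses native list syntax, a closure and an internal iterator. The file names are made up for illustration.

<script>
  <body>
    names = ['m31.fits', 'm33.fits', 'notes.txt'];   // native list syntax
    fits = names.findAll { it.endsWith('.fits') };   // closure used as a filter
    fits.each { print(it) };                         // internal iterator over the filtered list
  </body>
</script>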

There's also a handy reference card to print out - http://docs.codehaus.org/download/attachments/2715/groovy-reference-card.pdf

Examples

Print out a message (which gets captured into the execution record).

<script>
  <body>
    print("hello world");
  </body>
</script>

Extract a list of URLs from a VOTABLE returned by a previous step, and store it in a workflow variable for later use. This example uses methods native to Groovy; the examples chapter shows how to do the same thing more concisely using the STIL library.

<step result-var="results">
  <!-- omitted for clarity -->
</step>
<set var="urlList" /> <!-- declare a variable, but don't initialize it-->
<script>
  <body>
    if (results.size() != 1) {
      jes.error("previous step didn't produce expected number of results");
    } else {
      votable = results.get('votable'); // access result of previous step
      parser = new XmlParser(); // create new parser
      nodes = parser.parseText(votable); // parse votable into node tree
      // filter node tree on 'STREAM', project value
      urlList = nodes.depthFirst().findAll{it.name() == 'STREAM'}.collect{it.value()}.flatten();
      print(urlList); // show what we've found
    }
  </body>
</script>

Conditional

The <if> element allows conditional execution. It has a required attribute test, which must contain a script expression that evaluates to a boolean.

The if element may have a then child element, an else child element, or both. Each contains an activity (or sequence of activities) that will be executed depending on the value of the test attribute.

Example

<set var="x" value="${1}" />
<if test="${x > 0}">
  <then>
    <sequence>
      <!-- some activities to do here -->
    </sequence>
  </then>
  <else>
    <script>
      <body>
        print('test was false');
      </body>
    </script>
  </else>
</if>

While Loop

The <while> element expresses a while loop. It has a required attribute test which must contain a script expression that evaluates to a boolean.

Its body is an activity (or sequence / flow of activities) that will be executed repeatedly for as long as the test evaluates to true.

Example

Repeatedly execute a step, until it returns at least one result value.

<!-- '<' must be escaped as &lt; inside an XML attribute value -->
<while test="${results == null || results.size() &lt; 1}">
   <step result-var="results">
      <!-- omitted -->
   </step>
</while>

For Loop

The <for> element expresses a for loop. Its structure is similar to for in Python (or for-each in JavaScript): it iterates over a sequence, rather than using an arithmetic expression as in Java / C / C++.

The for element has two required attributes: items, which must evaluate to a list of items to iterate over; and var, which provides the name of the loop variable to assign each element of the list to. The body of the for element is an activity (or sequence / flow of activities) that will be executed for each item in the list.

Groovy provides native syntactic support for quickly defining numeric sequences - http://groovy.codehaus.org/Collections.

Examples

Count up to 10.

<for var="x" items="${1..10}"> <!-- start..finish is Groovy syntax for an inclusive numeric range -->
  <script>
   <body>
      print(x);
   </body>
  </script>
</for>

Call a CEA tool for each item in a list of URLs (as created in the earlier script example).

<for var="u" items="${urlList}">
   <sequence>
      <script>
         <body>
            jes.info("calling tool for ${u}")
         </body>
      </script>
      <step name="something">
         <tool name="aTool" interface="simple">
            <input>
               <parameter name="input" indirect="true">
                  <value>${u}</value><!-- u contains the url of the resource which contains this parameter value. -->
               </parameter>
            </input>
            <output>
               <parameter name="result" indirect="true">
                  <value>vospace:/myresults/${u.tokenize('/').last()}-results.dat</value>
                  <!-- use the last part of the input url as part of the output filename -->
               </parameter>
            </output>
         </tool>
      </step>
   </sequence>
</for>

Parallel For Loop

The <parfor> element expresses a parallel for loop. It has the same structure as the for loop, but executes its loop body simultaneously for each item in the items list.

This construct is useful for starting many CEA application executions in parallel. For example, the previous example could be altered to process each URL in the list simultaneously by simply replacing the for element with a parfor element.
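
A minimal sketch of the parallel form follows, assuming urlList has already been populated as in the earlier script example. The loop body here only logs each item; a real workflow would typically contain a step that calls a CEA tool.

<parfor var="u" items="${urlList}">
  <script>
    <body>
      jes.info("processing ${u}"); // runs once per item, with the items handled in parallel
    </body>
  </script>
</parfor>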

Error Handling

When an error occurs during the execution of an activity, the normal flow of control is interrupted. The error is recorded and then propagates upwards (as with exceptions in other languages). If it reaches the root workflow element, execution of the workflow halts.

The workflow schema defines a try element that can be used to wrap activities and intercept errors. There is also a catch element, which can be used to define activities to execute only when an error occurs.

NB: try and catch are not implemented at the moment (Iteration 7)