Common Execution Architecture Design

Introduction

The Common Execution Architecture (CEA) is an attempt to create a reasonably small set of interfaces and schema to model how a typical astronomical application is executed within the Virtual Observatory (VO). In this context, an application can be any process that consumes or produces data.

The CEA has been designed primarily to work within a web services calling mechanism, although it is possible to have specific language bindings using the same interfaces. For example, Astrogrid has a Java implementation of the interfaces that can be called directly from a Java executable.

Motivation

The primary motivating factors behind creating this architecture are:

  • To create a uniform interface and model for an application and its parameters. This has twin benefits:
    1. It gives VO infrastructure writers a single model of an application that they have to code for.
    2. Application writers know what they have to implement to be compatible with a VO Infrastructure.
  • To provide a higher level description than WSDL can offer.
    • Restrict the almost limitless possibilities to a manageable subset
    • Provide specific semantics for some astronomical quantities
    • Provide extra information not allowed in WSDL - e.g. default values, descriptions for use in a GUI etc.
  • To provide links with the VO Resource schema (See the IVOA WG)

History

The design for this architecture was started as part of the Workflow/Job Execution System within Astrogrid.

The astrogrid implementation is discussed here

Actors

  • Application - This is the process that is to be executed. It is defined as a process that can consume or create data. So this can include unix command line tools, database queries, web services etc.
  • Common Execution Controller - This is the component that implements the CommonExecutionConnector interface and actually controls the execution of the application. There can be various specialisms of this service, such as the CommandLineApplicationController, which can be configured to invoke a general unix command line tool, or a WebServiceApplicationController, which can be configured to act as a proxy that calls a general web service in a uniform manner.
  • Invoking process
  • Monitoring Service - This is a service that the Common Execution Controller can report status to - it can of course be
  • Storage Service - This is the mechanism by which the application can return its results. It can encompass a wide variety of mechanisms, e.g.
    • SOAP messages
    • http get/put
    • SOAP attachments
    • ftp/gridftp
    • MySpace
    • local filestore
    • etc.

Formal Definition

Interactions

The above sequence diagram illustrates how the various components of the CEA system interact when an application is executed. This is initiated by the

Some points of note:

  • The monitoring service could equally be the same as the invoking service. They are shown as conceptually separate because the endpoint of this service is passed in as an argument to the call.
  • The only guaranteed status message that the monitoring service will receive is the one informing it that the application has finished (or failed). The application might be capable of sending intermediate messages whilst it is still executing, but this is not required.
  • The results of the application running are not returned directly to the invoking process. The final destination for the results is implicit in the specification of the output parameters, and it is the responsibility of the CommonExecutionController to ensure that they get there.
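The interaction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Astrogrid implementation: the class names `SketchController`, `RecordingMonitor` and `InMemoryStorage`, the argument list of `execute`, and the status strings are all assumptions made for the example, and the call runs synchronously for simplicity. It shows the three points made above: the monitor endpoint is passed in with the call, the only guaranteed message is the final one, and the results go to the storage destination rather than back to the caller.

```python
from typing import Callable, Dict, List, Tuple


class RecordingMonitor:
    """Hypothetical monitoring service that just remembers what it is told."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, str]] = []

    def report(self, job_id: str, status: str) -> None:
        self.messages.append((job_id, status))


class InMemoryStorage:
    """Hypothetical storage service standing in for MySpace, ftp, etc."""

    def __init__(self) -> None:
        self.items: Dict[str, str] = {}

    def put(self, destination: str, data: str) -> None:
        self.items[destination] = data


class SketchController:
    """Stand-in for a Common Execution Controller wrapping one application."""

    def __init__(self, application: Callable[[Dict[str, str]], str],
                 storage: InMemoryStorage) -> None:
        self._application = application
        self._storage = storage
        self._next_id = 0

    def execute(self, inputs: Dict[str, str], output_destination: str,
                monitor: RecordingMonitor) -> str:
        """Run the application and return only a job identifier."""
        self._next_id += 1
        job_id = f"job-{self._next_id}"
        try:
            result = self._application(inputs)
            # The controller, not the caller, delivers the results.
            self._storage.put(output_destination, result)
            status = "COMPLETED"
        except Exception:
            status = "ERROR"
        # The one status message the monitor is guaranteed to receive.
        monitor.report(job_id, status)
        return job_id


storage = InMemoryStorage()
monitor = RecordingMonitor()
controller = SketchController(lambda p: p["text"].upper(), storage)
job = controller.execute({"text": "m31"}, "store/out.txt", monitor)
```

In this sketch the invoking code learns the outcome only by looking at the monitor and the storage service; a real controller would typically run the application asynchronously and report back later.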

Interfaces

CommonExecutionConnector

This is the main interface that is used to communicate with the application. The main methods are (please note that the WSDL also defines some extra methods for experimental purposes)

The WSDL definition of this interface is stored in cvs at http://www.astrogrid.org/viewcvs/*checkout*/astrogrid/workflow-objects/wsdl/CommonExecutionConnnector.wsdl?rev=HEAD

JobMonitorService

The WSDL definition of this interface is stored in cvs at http://www.astrogrid.org/viewcvs/*checkout*/astrogrid/workflow-objects/wsdl/JobMonitor.wsdl?rev=HEAD

Objects

Application

uml model

As this model depicts, an application in CEA is really quite a simple entity, consisting of one or more interfaces, each of which consists of zero or more input parameters and zero or more output parameters.

The schema representation is shown below, and is essentially a representation of the UML model that has been coded to recognise that the same parameter can occur in several interfaces.
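As a rough illustration of this model, the following Python sketch declares each parameter once at application level and lets interfaces refer to parameters by name, so the same parameter can appear in several interfaces. The class and field names, and the example application `cone_search`, are hypothetical and are not taken from the CEA schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ParameterDefinition:
    """A parameter declared once for the whole application (assumed shape)."""
    name: str
    type: str
    description: str = ""


@dataclass
class Interface:
    """One way of calling the application; parameters are referenced by name."""
    name: str
    inputs: List[str] = field(default_factory=list)   # zero or more inputs
    outputs: List[str] = field(default_factory=list)  # zero or more outputs


@dataclass
class Application:
    name: str
    parameters: Dict[str, ParameterDefinition]
    interfaces: List[Interface]

    def __post_init__(self) -> None:
        # An application must expose at least one interface.
        if not self.interfaces:
            raise ValueError("an application needs at least one interface")
        # Every reference must point at a declared parameter.
        for iface in self.interfaces:
            for ref in iface.inputs + iface.outputs:
                if ref not in self.parameters:
                    raise ValueError(
                        f"interface {iface.name!r} refers to unknown parameter {ref!r}"
                    )


def build_example() -> Application:
    """Build a hypothetical application with two interfaces sharing parameters."""
    declared = [
        ParameterDefinition("ra", "RA", "right ascension"),
        ParameterDefinition("dec", "Dec", "declination"),
        ParameterDefinition("radius", "double", "search radius"),
        ParameterDefinition("catalogue", "string", "catalogue name"),
        ParameterDefinition("result", "VOTable", "matching sources"),
    ]
    params = {p.name: p for p in declared}
    return Application(
        name="cone_search",
        parameters=params,
        interfaces=[
            Interface("simple", ["ra", "dec", "radius"], ["result"]),
            Interface("by_catalogue", ["ra", "dec", "radius", "catalogue"], ["result"]),
        ],
    )
```

Keeping the definitions in one table and the references in the interfaces mirrors the point made above about the schema: a parameter such as `ra` is described once, however many interfaces use it.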

Parameter

The description of the parameters and the parameter values is probably the heart of the CEA. It is the model for the parameters that allows us to add semantic meaning, and that gives flexibility in how the parameters are transported. The implementation is still in its infancy, but it is hoped that the parameter definition will be extended to encompass any data models that the VO produces.

The basic parameter definition from the schema is shown below

The parameterValue model is simple but powerful. The parameterValue element has three attributes:

  • Name
  • Type - this describes the data type of the parameter. It can range from simple atomic types such as integer and string, through specific astronomical types such as right ascension and declination, all the way to complex structures such as VOTables and FITS files.
  • Transport - this describes the transport method used to obtain or return a parameter. For an input parameter the default is for the parameter value to be contained within the calling SOAP message; for an output parameter there is no such default and this attribute must be given a value. The different sorts of transport include
    • SOAP messages
    • http get/put
    • SOAP attachments
    • ftp/gridftp
    • MySpace
    • local filestore

Note that the current implementation has the type and transport attributes combined as one, so that there can be a type of MySpace_VOTableReference, for instance. The full implementation will have them separated as described above for more flexibility.
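The transport rule and the combined type naming can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the transport labels, the `ParameterValue` class, and in particular the idea that a combined name such as MySpace_VOTableReference can be split at the first underscore into a transport part and a type part.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative transport labels; the real schema may name them differently.
SOAP_MESSAGE = "soap-message"
KNOWN_TRANSPORTS = {
    SOAP_MESSAGE, "http", "soap-attachment", "ftp", "myspace", "local-file",
}


@dataclass
class ParameterValue:
    """Assumed in-memory form of a parameterValue element."""
    name: str
    type: str
    value: str
    transport: Optional[str] = None  # None means "not specified"


def resolve_transport(param: ParameterValue, is_output: bool) -> str:
    """Apply the defaulting rule: inputs default to the SOAP message,
    outputs must state their transport explicitly."""
    if param.transport is None:
        if is_output:
            raise ValueError(f"output parameter {param.name!r} needs a transport")
        return SOAP_MESSAGE
    if param.transport not in KNOWN_TRANSPORTS:
        raise ValueError(f"unknown transport {param.transport!r}")
    return param.transport


def split_combined_type(combined: str) -> Tuple[Optional[str], str]:
    """Split an assumed 'Transport_Type' name into (transport, type).

    Names without an underscore are treated as a plain type.
    """
    transport, separator, data_type = combined.partition("_")
    if not separator:
        return None, combined
    return transport.lower(), data_type


ra = ParameterValue("ra", "RA", "10.68")
ra_transport = resolve_transport(ra, is_output=False)
legacy = split_combined_type("MySpace_VOTableReference")
```

Separating the two attributes in this way means a new transport can be supported without defining a new type name for every data type it might carry.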

Schema

The schema associated with the CEA fall into two categories:

  1. The schema used to define the messages within the CommonExecutionConnector interface.
  2. The schema that is used to define the VOResource extension for CEA.

These schema are strongly interrelated, which aids programming with automated object generation tools. The schema associated with CEA are described below with links to their documentation.

Filename (with cvs link) | Description | Documentation
AGApplicationBase.xsd | This schema defines most of the basic CEA objects that are imported into both the WSDL and the Registry Schema | Documentation
VOCEA.xsd | This defines the VOResource extensions of CeaApplication and CeaService that are used in the registry | Documentation
AGParameterDefinition.xsd | Contains the basic parameter definition and parameter value elements used in the other schema | Documentation
Workflow.xsd | This schema actually describes an Astrogrid workflow document in full, but part of this is the tool element that is passed in as a parameter to the execute method in the CommonExecutionConnector interface. This tool element will be factored out into its own CEA specific schema in future | Documentation

Deployment

Typical Scenario

This deployment shows some of the features of using the CEA:

  • On the right hand side of the diagram there are command line applications that are wrapped by specialized CommonExecutionControllers, which allow the workflow engine to communicate with them through the CommonExecutionConnector interface.
  • There is a webservices proxy component that can act as an adapter between a generic web service and the CommonExecutionConnector interface.
  • On the left of the diagram the webservices proxy is co-located with a web service, so that the results returned by the web service can be stored locally, thus minimising network traffic.

Future Directions