IVOA

A Proposal for a Common Execution Architecture Version 0.1

IVOA WG Internal Draft 2004-09-27

Working Group:
http://www.ivoa.net/twiki/bin/view/IVOA/IvoaWG_name
Author(s):
Paul Harrison

Abstract

This note describes a proposal for a Common Execution Architecture (CEA) within the Virtual Observatory. It discusses the general motivation behind the design as well as the detailed schema and WSDL definitions of the architecture. The scope of this document covers areas of interest to the Registry and Grid Working Groups as well as the Applications Special Interest Group.

Status of this document

This is an IVOA Working Draft for review by IVOA members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use IVOA Working Drafts as reference materials or to cite them as other than "work in progress." A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.



Acknowledgements




1. Introduction

The Common Execution Architecture (CEA) is an attempt to create a reasonably small set of interfaces and schema that model how to execute a typical astronomical application within the Virtual Observatory (VO). In this context an application can be any process that consumes or produces data, so in existing terminology it could include

The CEA has been designed primarily to work within a web services calling mechanism, although it is possible to have specific language bindings using the same interfaces. For example, AstroGrid has a Java implementation of the interfaces that can be called directly from a Java executable.


1.1 Motivation

The primary requirements motivating the creation of this architecture are:

1.2 Origins

The design for this architecture has evolved from the requirements for the Workflow/Job Execution System components within AstroGrid. It was desirable for the job execution system to have a single model for an application, so that it could deal with the (already complex) problems of scheduling, looping, conditional execution, etc. without needing specializations for all the different types of service (SIA, Database query, Cone Search, etc.) that it might be required to invoke.

Amongst the VO specifications there was no existing model for applications defined at the level that this design attempts to address. In the VOResource schema an application is defined as a Service with an interface definition. The interface definition either relies on referring to a WSDL definition of the service, or on other schema extending the service definition to provide some specific detail, as in the case of a Simple Image Access service. There is no general definition of an application in the resource.

It is clear that the WSDL model of an interface has had a large influence on the design of the CEA, but it should be remembered that the CEA is intentionally layered on top of WSDL, so that CEA controls the scope and semantics of operations. There is only one WSDL definition for all applications, so as far as web services are concerned the interface is constant. CEA works by transporting meta-information about the application interface within this constant WSDL interface.

2. Formal Definition

2.1 Components


2.2 Interactions

CEA UML Sequence Diagram


The above sequence diagram illustrates how the various components of the CEA system interact when an application is executed. The steps are:

  1. The invoking process calls the init method of the CommonExecutionConnector interface, which is implemented by the component known as the CommonExecutionController. This sets up the execution environment for the application and returns immediately with an executionID, which is the identifier by which the CommonExecutionController keeps track of this particular execution instance. The parameters to this call are
  2. The invoking process then has the opportunity to register two classes of listener:
    1. Status Monitor - this is the endpoint of a service that implements the JobMonitor interface, which the CommonExecutionController can call to inform the monitoring process of the status of the execution instance.
    2. Results Listener - this is the endpoint of a service that implements the ResultsListener port, so that the CommonExecutionController can report the results of the application execution once they are ready.
  3. The execute operation should then be invoked, at which point the CommonExecutionController starts the application.
  4. The application can then optionally return status information to the CommonExecutionController, which passes it on to the Monitor Service.
  5. When the application completes it informs the CommonExecutionController, which then passes the indirect results on to the storage service, returns the direct results to any results listeners, and informs the monitor service that the application has finished.
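The steps above can be sketched with a small in-memory model. This is only an illustration of the call order, not the WSDL interface itself: the argument list of init, the operation names register_monitor and register_results_listener, the status strings and the "adder" application are all assumptions made for the example. Execution is run synchronously and the hand-off of indirect results to a storage service is left out.

```python
import uuid


class RecordingMonitor:
    """Stand-in for a service implementing the JobMonitor interface."""

    def __init__(self):
        self.messages = []

    def monitor_job(self, job_id, status):
        # Mirrors the monitorJob operation: a job identifier plus a status.
        self.messages.append((job_id, status))


class RecordingResultsListener:
    """Stand-in for a service implementing the ResultsListener port."""

    def __init__(self):
        self.results = {}

    def put_results(self, job_id, result_list):
        # Mirrors the putResults operation: a job identifier plus results.
        self.results[job_id] = list(result_list)


class CommonExecutionController:
    """In-memory stand-in for the controller; runs applications inline."""

    def __init__(self, applications):
        # Maps an application name to a plain Python callable (assumption).
        self._applications = applications
        self._executions = {}

    def init(self, application_name, inputs, job_id):
        # Step 1: record the execution and hand back an executionID at once.
        execution_id = str(uuid.uuid4())
        self._executions[execution_id] = {
            "application": self._applications[application_name],
            "inputs": dict(inputs),
            "job_id": job_id,
            "monitors": [],
            "listeners": [],
        }
        return execution_id

    def register_monitor(self, execution_id, monitor):
        # Step 2.1: hypothetical operation name.
        self._executions[execution_id]["monitors"].append(monitor)

    def register_results_listener(self, execution_id, listener):
        # Step 2.2: hypothetical operation name.
        self._executions[execution_id]["listeners"].append(listener)

    def execute(self, execution_id):
        # Steps 3 to 5, collapsed into one synchronous call for brevity.
        run = self._executions[execution_id]
        self._notify(run, "RUNNING")
        results = run["application"](**run["inputs"])
        for listener in run["listeners"]:
            listener.put_results(run["job_id"], results)
        self._notify(run, "COMPLETED")

    @staticmethod
    def _notify(run, status):
        for monitor in run["monitors"]:
            monitor.monitor_job(run["job_id"], status)


def run_example():
    controller = CommonExecutionController(
        {"adder": lambda a, b: [("sum", a + b)]}
    )
    monitor = RecordingMonitor()
    listener = RecordingResultsListener()
    execution_id = controller.init("adder", {"a": 2, "b": 3}, job_id="job-1")
    controller.register_monitor(execution_id, monitor)
    controller.register_results_listener(execution_id, listener)
    controller.execute(execution_id)
    return monitor.messages, listener.results
```

Calling run_example() returns the status messages seen by the monitor and the results delivered to the listener, both keyed by the job identifier supplied at init time rather than by the executionID.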

Some points of note:

2.3 Interfaces

CommonExecutionConnector

This is the main port that is used to communicate with the application. The main operations in this port are:

The WSDL definition of this interface is stored in cvs at http://www.astrogrid.org/viewcvs/*checkout*/astrogrid/workflow-objects/wsdl/CommonExecutionConnnector.wsdl?rev=HEAD

JobMonitor

The WSDL definition of this interface is stored in cvs at http://www.astrogrid.org/viewcvs/*checkout*/astrogrid/workflow-objects/wsdl/JobMonitor.wsdl?rev=HEAD

The only operation in the JobMonitor port is the monitorJob operation, which expects to receive a message with the job-identifier-type (as specified in the original init operation of the CommonExecutionConnector port) and a status message.

ResultsListener

The WSDL definition of this interface is stored in cvs at http://www.astrogrid.org/viewcvs/*checkout*/astrogrid/workflow-objects/wsdl/CeaResultsListener.wsdl?rev=HEAD

The only operation on the ResultsListener port is putResults. It accepts a message that contains a job-identifier-type and a result-list-type, which is just a list of parameterValues.

2.4 Objects

The objects that participate in CEA can be split into two groups

  1. Those used to describe the application in the registry
  2. Those used to describe the application in the WSDL interface

These are described in more detail in the following sections.

2.4.1 Application

UML model

As this model depicts, an application in CEA is really quite a simple entity: it consists of one or more interfaces, each of which consists of zero or more input parameters and zero or more output parameters.

The schema representation is shown below, and is essentially a representation of the UML model that has been coded to recognise that the same parameter can occur in several interfaces.

This diagram also shows a number of specialized elements all within the substitution group which has Parameter as the head. These are implementation details where extra information is needed to specify how to use the parameters - for example in the case of a command line parameter it is necessary to know the command line switch or position that the parameter appears at.
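The shape of this model can be sketched as a few data classes. This is a minimal illustration rather than a rendering of the schema: the class and field names are chosen for the example, and the "source-extractor" application with its parameters is invented. The point it shows is that parameters are defined once on the application and referred to by name from each interface, so the same parameter can occur in several interfaces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ParameterDefinition:
    name: str
    description: str = ""


@dataclass(frozen=True)
class Interface:
    name: str
    inputs: tuple = ()   # names of input parameters, zero or more
    outputs: tuple = ()  # names of output parameters, zero or more


@dataclass
class Application:
    name: str
    parameters: dict = field(default_factory=dict)  # name -> definition
    interfaces: list = field(default_factory=list)

    def validate(self):
        # An application must expose at least one interface.
        if not self.interfaces:
            raise ValueError("an application needs at least one interface")
        # Every parameter reference must resolve to a shared definition.
        for interface in self.interfaces:
            for ref in interface.inputs + interface.outputs:
                if ref not in self.parameters:
                    raise ValueError(
                        f"{interface.name}: unknown parameter {ref!r}"
                    )
        return True


def build_example():
    definitions = [
        ParameterDefinition("image", "input image"),
        ParameterDefinition("threshold", "detection threshold"),
        ParameterDefinition("catalogue", "output source list"),
    ]
    return Application(
        name="source-extractor",
        parameters={d.name: d for d in definitions},
        interfaces=[
            Interface("simple", inputs=("image",), outputs=("catalogue",)),
            Interface(
                "full",
                inputs=("image", "threshold"),
                outputs=("catalogue",),
            ),
        ],
    )
```

Here the "image" and "catalogue" definitions are shared by both interfaces, which is the reuse that the schema coding is intended to allow.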

2.4.2 Parameter

The descriptions of the parameters and the parameter values are probably the heart of the CEA. It is the model for the parameters that allows us to add semantic meaning and gives flexibility in how the parameters are transported. The implementation is still in its infancy, but it is hoped that the parameter definition will be extended to encompass any data models that the VO produces.

The basic parameter definition from the schema is shown below

2.4.3 Tool

The tool represents the full collection of parameters that are passed to a particular interface of an application and the results that are returned.
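As a rough illustration, a tool can be thought of as a named application and interface plus two lists of parameter values, which can be checked against the parameter names that the interface declares. The field names below, the use of a plain string for each value, and the rule that every declared parameter must be supplied are all simplifying assumptions for the example, not statements about the schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ParameterValue:
    name: str   # should match a parameter named by the chosen interface
    value: str  # the value, or a reference to it (assumption)


@dataclass
class Tool:
    application: str
    interface: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def check_tool(tool, interface_inputs, interface_outputs):
    """Return a list of problems; an empty list means the tool matches."""
    problems = []
    for label, values, allowed in (
        ("input", tool.inputs, interface_inputs),
        ("output", tool.outputs, interface_outputs),
    ):
        supplied = [v.name for v in values]
        for name in supplied:
            if name not in allowed:
                problems.append(f"unexpected {label} parameter: {name}")
        for name in allowed:
            if name not in supplied:
                problems.append(f"missing {label} parameter: {name}")
    return problems


example_tool = Tool(
    application="source-extractor",
    interface="simple",
    inputs=[ParameterValue("image", "http://example.org/image.fits")],
    outputs=[ParameterValue("catalogue", "http://example.org/out.vot")],
)
```

With this sketch, check_tool(example_tool, ["image"], ["catalogue"]) reports no problems, while checking the same tool against an interface that also declares a "threshold" input reports that parameter as missing.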

2.4.4 ParameterValue

The parameterValue model is a simple but powerful representation of the parameters that are passed to an application. The parameterValue element has two attributes

2.5 Schema

The schema associated with the CEA fall into two categories:

  1. The schema used to define the messages within the CommonExecutionConnector interface.
  2. The schema that is used to define the VOResource extension for CEA.

These schema are strongly interrelated (they are imported into both the WSDL and the Registry Schema), which aids programming with automated object generation tools because there are many common objects. The schema associated with CEA are described below with links to their documentation.

Filename (with cvs link) | Description | x3sp Documentation
AGApplicationBase.xsd | Defines most of the basic CEA objects that are imported into both the WSDL and the Registry Schema. | Documentation
CEATypes.xsd | Defines the message types that are passed in the queryStatus operations of the CommonExecutionConnector interface and in the monitorJob operation of the JobMonitor interface. | Documentation
VOCEA.xsd | Defines the VOResource extensions of CeaApplication and CeaService that are used in the registry. | Documentation
AGParameterDefinition.xsd | Contains the basic parameter definition and parameter value elements used in the other schema. | Documentation
Workflow.xsd | Describes an AstroGrid workflow document in full; part of this is the tool element that is passed as a parameter to the execute method of the CommonExecutionConnector interface. This tool element will be factored out into its own CEA-specific schema in future. | Documentation

2.5.1 Discussion of the VOResource Extension

It is a valid question to ask whether a specific VOResource extension was needed to accommodate the CEA. The standard Service element expects the interface to the service to be described in WSDL, so given that CEA has constant WSDL definitions for different applications there needs to be a way of expressing the fact that a particular CeaService can run a particular set of applications. The method chosen was to extend Service with an element that is just an aggregation of pointers to the actual application definitions, which are defined in CeaApplication, itself an extension of the standard ResourceType. These relationships are illustrated in the UML below.

For a particular application there should be only one CeaApplication entry in the registry. This entry defines everything that is necessary to run the application except for the endpoint of the service. This implies that finding a particular instance of a particular application requires a two-stage registry query.

  1. Query the registry to find the application of interest - note the parameter data and the IVOA identifier for the application.
  2. Query a second time to find the CeaService(s) that can run the application with that IVOA identifier.

The diagram illustrates the point that one CeaService may run several CeaApplications and that a particular CeaApplication can be run by several CeaServices.
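The two-stage lookup can be sketched against a toy in-memory registry. The record layout, the identifiers, the endpoints and the keyword match on the title are all invented for the illustration; a real registry would be queried through its own interface. What the sketch shows is the many-to-many relationship: a CeaService entry holds pointers to the applications it can run, and the endpoint is only found in the second stage.

```python
REGISTRY = [
    {
        "type": "CeaApplication",
        "identifier": "ivo://example.org/apps/extractor",
        "title": "Source extractor",
    },
    {
        "type": "CeaApplication",
        "identifier": "ivo://example.org/apps/crossmatch",
        "title": "Catalogue cross-match",
    },
    {
        "type": "CeaService",
        "identifier": "ivo://example.org/services/cea-1",
        "endpoint": "http://cea1.example.org/cea",
        "applications": [
            "ivo://example.org/apps/extractor",
            "ivo://example.org/apps/crossmatch",
        ],
    },
    {
        "type": "CeaService",
        "identifier": "ivo://example.org/services/cea-2",
        "endpoint": "http://cea2.example.org/cea",
        "applications": ["ivo://example.org/apps/extractor"],
    },
]


def find_applications(registry, keyword):
    # Stage 1: find the CeaApplication entries of interest.
    keyword = keyword.lower()
    return [
        entry
        for entry in registry
        if entry["type"] == "CeaApplication"
        and keyword in entry["title"].lower()
    ]


def find_services(registry, application_id):
    # Stage 2: find the CeaService entries that point at that application.
    return [
        entry
        for entry in registry
        if entry["type"] == "CeaService"
        and application_id in entry["applications"]
    ]


def locate_endpoints(registry, keyword):
    # Combine both stages: application identifier -> service endpoints.
    endpoints = {}
    for application in find_applications(registry, keyword):
        services = find_services(registry, application["identifier"])
        endpoints[application["identifier"]] = [
            service["endpoint"] for service in services
        ]
    return endpoints
```

In this toy registry the extractor application can be run by both services, while the first service can run both applications.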

3. Deployment

3.1 Typical Scenario

UML Deployment

This deployment shows some of the features of using the CEA


3.2 What it means for an Application to be CEA compliant

3.3 AstroGrid Implementation

The CEA is implemented in the following AstroGrid components:


4. Future Directions

4.1 What needs to be done to make this suitable for adoption by the IVOA

4.2 Extensions

Appendices

Appendix A: WSDL for the Common Execution Connector

Appendix B: WSDL for the Job Monitor Service

Appendix C: WSDL for the Results Listener Service

Appendix D: Example Registry Entries

Note: The AuthorityID in this example is set to the illegal value @REGAUTHORITY@, a token that is replaced by the AstroGrid installation system.


References

These are all in-line links at the moment.