Software Architecture of AstroGrid v1

Guy Rixon, 2005-09-27

1                Introduction

I describe the architecture of v1 of the AstroGrid software-product. The description applies to any deployment of AstroGrid components, but is illustrated by reference to the services and resources actually deployed on AstroGrid’s production grid at the time of writing. The description applies to v1.0 and v1.1 of the software.

This document is an introduction to the architecture for readers who are not AstroGrid engineers and who may not be software developers. Therefore, little detail is given and no reference is made to the source code. The document is deliberately insufficient as a manual for modifying or interfacing to AstroGrid. Any developers writing code to work with AstroGrid should see the software web-site at http://software.astrogrid.org/ .

The AstroGrid system is a grid of web services. The essence of this system can be expressed by listing the services, their WSDL contracts, some description of their behaviours and some notes on their construction. However, this presentation makes it harder to understand the interaction between the services and unreasonably hard to connect the internal behaviour of the system with the visible behaviour of its user interfaces. Therefore, the bulk of this document consists of use cases in which I state the system behaviour matching a particular user experience.

The use-cases here are chosen to cover all the AstroGrid components and to illustrate the key points of the architecture. They are by no means a complete catalogue of AstroGrid’s functions.

2                Key concepts

The Resource Registry is a collection of XML resource-documents describing entities in the system that may be needed in an astronomical experiment with the virtual observatory. Since the registry is searchable by keywords, new resources may be discovered at run time. Applications (e.g. SExtractor, used in the use case below) and services (e.g. the CEA server for running SExtractor at Portsmouth) are possible resources; the application resources tell the UI how to set up calls to the applications and the service resources tell the system where the applications may be executed. The document schemata and service interface for the registry are defined by IVOA.
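
The split between application resources and service resources can be illustrated with a toy, in-memory registry. This is only a sketch: the class names, the fields and the ivo://example.org identifiers are assumptions made for illustration and do not reflect the IVOA schemata or the real registry interface.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    ivoid: str                  # formal identifier (hypothetical values below)
    kind: str                   # "application" or "service"
    description: str
    managed_applications: list = field(default_factory=list)


class Registry:
    """Toy in-memory stand-in for a resource registry."""

    def __init__(self):
        self._resources = {}

    def register(self, resource):
        self._resources[resource.ivoid] = resource

    def search(self, keyword):
        # Case-insensitive keyword match on the description text.
        kw = keyword.lower()
        return [r for r in self._resources.values() if kw in r.description.lower()]

    def services_for(self, application_ivoid):
        # Service resources say where a given application may be executed.
        return [r for r in self._resources.values()
                if r.kind == "service" and application_ivoid in r.managed_applications]


registry = Registry()
registry.register(Resource("ivo://example.org/SExtractor", "application",
                           "SExtractor source extraction"))
registry.register(Resource("ivo://example.org/cea-portsmouth", "service",
                           "CEA server at Portsmouth",
                           ["ivo://example.org/SExtractor"]))
```

Because resources are found by searching rather than by hard-coded addresses, a newly registered service becomes visible to clients without any change to the client.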

MySpace is a virtual tree of data files made available to all the AstroGrid services. Where data must be passed from service to service, it can be passed as a file in MySpace: this makes the transfer asynchronous and less fragile than a synchronous transfer, and avoids the need to involve the services’ client in the transfer. MySpace can also be used to store files between experiments. MySpace is divided into two layers: file stores that keep the actual files and a file manager that maintains the relationships between files in the file-tree. MySpace exists only in AstroGrid but is considered to be a prototype of the successor system, IVOA’s VOSpace.
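
The two layers of MySpace can be sketched as follows. The classes and method names here are hypothetical and much simpler than the real components; the point is only that the file manager holds the tree while the file stores hold the bytes.

```python
class FileStore:
    """Keeps the actual bytes, like a single flat directory."""

    def __init__(self, name):
        self.name = name
        self._files = {}

    def put(self, file_id, data):
        self._files[file_id] = data

    def get(self, file_id):
        return self._files[file_id]


class FileManager:
    """Maintains the virtual tree; maps each path to (store, file id)."""

    def __init__(self):
        self._tree = {}
        self._next_id = 0

    def save(self, path, data, store):
        self._next_id += 1
        file_id = f"f{self._next_id}"
        store.put(file_id, data)
        self._tree[path] = (store, file_id)

    def load(self, path):
        store, file_id = self._tree[path]
        return store.get(file_id)

    def list_dir(self, prefix):
        base = prefix.rstrip("/") + "/"
        return sorted(p for p in self._tree if p.startswith(base))


store_a = FileStore("store-a")
store_b = FileStore("store-b")
manager = FileManager()
manager.save("/results/catalogue.vot", b"<VOTABLE/>", store_a)
manager.save("/results/config.sex", b"DETECT_THRESH 1.5", store_b)
```

In this sketch the two files appear in one directory of the tree although they live in different stores.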

Most AstroGrid components are written in Java as Web Applications. This means that they run in a J2EE web-container provided by the hosting sites and are insulated from the underlying operating system. AstroGrid components have run successfully on Linux, Solaris and MS Windows. AstroGrid has built very little desktop software. Instead, AstroGrid actively supports integration of third-party software on the desktop.

Most of the service interfaces offered by AstroGrid components are web services using the SOAP protocol. Our services all use HTTP as the transport protocol and are all built using version one of the Apache Axis tool-kit.

Some of the operations on AstroGrid services are expected to be long running activities; e.g. queries requiring whole-table traverses of large databases could take hours or days. Therefore, the services run jobs asynchronously from the SOAP operations that start them, following the common patterns in Grid computing. It follows that these services have stateful interactions with their clients.

The Common Execution Architecture (CEA) defines the contracts of AstroGrid services. It defines the description of the available applications, the description of the jobs to be run and the mechanism for asynchronous activities and stateful interactions with those activities. A CEA service follows a standard WSDL contract that makes it easier to call from the desktop and from the workflow engines.

The details of SOAP, WSDL contracts, asynchronous activities and stateful interactions are undesirable when using AstroGrid services in desktop software. AstroGrid hides these details behind the AstroGrid Client Runtime, a class library that presents a simpler facade.

3                The components

3.1          Web applications

The registry implements the IVOA resource-registry specification. It provides three web-service interfaces.

·      Searching for resources matching given criteria.

·      “Harvesting” (i.e. mirroring) resources from other IVOA-registry instances.

·      Supplying resources to other registries (the passive role in the harvesting process).

The web portal is the older of AstroGrid’s two GUIs. It allows a user to drive all parts of the system from a web browser, but requires that all astronomical experiments be set up as workflows; cf. the AstroGrid “workbench”. The portal has a UI but no web-service endpoints.

The community component supports registration of users and single-sign-on authentication. Its web-service interfaces are as follows.

·      Security: single-sign-on authentication, used mainly to control access to MySpace.

·      Policy: a centralized authorization feature, not currently used in the deployed system.

There is also a UI, local to the web-application, for managing the user base. Users are not registered via the web-service interface.

The command-line-application server is a wrapper for astronomical applications. It allows executable applications that have traditionally run on a researcher’s desktop computer to run on demand on remote servers. Each deployment of this component offers a library of applications determined by the person managing the service. The component offers one web-service interface:

·      Common Execution Connector: a WSDL contract for accepting jobs and allowing clients to track and to retrieve the results of those jobs.

The HTTP-application server is a proxy for HTTP applications that do not conform to the CEA. It offers the same Common Execution Connector interface as the command-line-application server.

The Data Set Access (DSA) sky server component is a facade for an archive of tabular data held in an RDBMS. Typically, it is used with astronomical source-catalogues. DSA offers several web-service interfaces.

·      Common Execution Connector.

·      IVOA/NVO cone-search.

·      IVOA SkyNode (obsolete version).

In practice, AstroGrid uses the CEC interface exclusively.

The Data Set Access sun server component is similar to the sky server component (it has the same web-service interfaces) but is specialized for access to catalogues of Solar data.

The File Store component holds the files committed to MySpace. It acts like a single directory in a file system. Its web-service interfaces are as follows.

·      FileStore, allowing a client access to the metadata of the file.

·      HTTP-Put, for streaming data into the file.

·      HTTP-Get, for streaming data out of the file.

The File Manager component organizes the files on the file stores into a virtual directory-tree for each user. It has the single web-service interface FileManager.

The Job Entry System (JES) is the workflow engine. It accepts workflows (written in XML, to an AstroGrid-specific schema) that connect and sequence invocations of the AstroGrid services that offer the CEC interface (currently command-line-application servers, HTTP-application proxies and DSA installations). An AstroGrid “workflow” is really a small programme in a specialized scripting-language. I.e., the “workflow” document defines the data on which it operates instead of allowing a stream of arbitrary data to pass through it.

3.2          Desktop components

The AstroGrid Client Runtime (ACR) is a software library that allows desktop applications to call remote AstroGrid services as if they were local objects. The ACR currently supports Java and Python applications.

The AstroGrid Workbench is AstroGrid’s newer and preferred GUI. It allows access to all features of the system except, currently, the construction of workflows (Workbench can run workflows constructed in the portal); a workflow editor is due in the next release. The Workbench provides the controls for all the features that are not dealt with in third-party applications (such as Starlink Topcat, as in the use case below). The workbench is also the proof-of-concept application for the ACR.


 

4                Use cases

The use cases shown all involve a single human actor: a generic researcher, assumed to be an astronomer. Equivalent use cases for a solar physicist could have been chosen; the same architecture applies. I have not shown use cases to do with maintaining and extending the system as they would not illustrate any extra features of the architecture.

4.1          Invoke SExtractor via AstroGrid Workbench

4.1.1   User experience

·      Researcher selects Run application from Workbench GUI.

·      System displays a drop-down list of available applications. Researcher selects the SExtractor application.

·      System displays a form to collect parameters for the application. Researcher fills in the form.

·      Researcher submits the request. System accepts it and displays the job-monitor GUI. The job is now running asynchronously.

·      The job finishes; System updates the job-monitor GUI accordingly.

·      The results are now filed in MySpace and Researcher can extract them at will (see separate use-case).

4.1.2   Components engaged

The desktop software, all provided to the user via Java Web Start, consists of the Workbench application and the AstroGrid Client Runtime (ACR) library. The runtime library isolates the application from the details of the web-service communication while exposing the semantics of the Common Execution Architecture (CEA) through the ACR interface.

The SExtractor executable is installed at several AstroGrid sites. Each such site also has a CEA server exposing SExtractor as a web service according to the rules of CEA. The web-service interface for this is known as Common Execution Connector (CEC).

4.1.3   System activity

Discover available applications

The Workbench discovers the applications that can be run by querying the resource registry. Each application is described by a resource document: an XML document with root element {http://www.ivoa.net/xml/CEAService/v0.2}CeaApplicationType. The client downloads from the registry copies of all such documents.

Select application and set parameters

The Workbench lets the user choose a single application from a drop-down list. The list contains application names parsed from the resource documents that came from the resource registry.

The resource document for the chosen application defines the input and output parameters of the application sufficiently for the Workbench to generate a GUI for the application.

Discover services for application

The application description does not specify a service to run the application. The Workbench queries the registry for resources of type {http://www.ivoa.net/xml/CEAService/v0.2}CeaServiceType which cite the chosen application in a  {http://www.ivoa.net/xml/CEAService/v0.2}ManagedApplication element. These are defined to be CEA services offering the application via a Common Execution Connector (CEC) web-service contract.
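
The matching step can be sketched with a namespace-aware XML search. The namespace URI and the ManagedApplication element name come from the text above; the surrounding document layout (a Resources wrapper, an identifier attribute) is a simplified assumption and not the real registry schema.

```python
import xml.etree.ElementTree as ET

CEA_NS = "http://www.ivoa.net/xml/CEAService/v0.2"

# Simplified, hypothetical resource documents; the real schema is richer.
SAMPLE = f"""
<Resources xmlns:cea="{CEA_NS}">
  <Resource identifier="ivo://example.org/cea-a">
    <cea:ManagedApplication>ivo://example.org/SExtractor</cea:ManagedApplication>
  </Resource>
  <Resource identifier="ivo://example.org/cea-b">
    <cea:ManagedApplication>ivo://example.org/OtherApp</cea:ManagedApplication>
  </Resource>
  <Resource identifier="ivo://example.org/cea-c">
    <cea:ManagedApplication>ivo://example.org/SExtractor</cea:ManagedApplication>
    <cea:ManagedApplication>ivo://example.org/OtherApp</cea:ManagedApplication>
  </Resource>
</Resources>
""".strip()


def services_offering(xml_text, application_ivoid):
    """Return identifiers of services that cite the given application."""
    root = ET.fromstring(xml_text)
    tag = f"{{{CEA_NS}}}ManagedApplication"   # ElementTree's {namespace}local form
    matches = []
    for resource in root.findall("Resource"):
        cited = [el.text.strip() for el in resource.findall(tag)]
        if application_ivoid in cited:
            matches.append(resource.get("identifier"))
    return matches


offering = services_offering(SAMPLE, "ivo://example.org/SExtractor")
```

In this sample, two of the three hypothetical services cite SExtractor, so the user would be offered a choice between them.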

Invoke application

There are multiple services offering the SExtractor application, so the workbench asks the user to choose one from a drop-down list. The values in the list are taken from the service resource-documents.

The workbench submits the application to the ACR which in turn submits it as a job to the CEA service. The ACR then polls the web service for progress until the application has completed. Each pass through the polling loop updates the job-monitor page of the workbench via a callback.

The CEC interface is designed to work asynchronously from the execution of the application. The init operation registers the job in the web service and the execute operation sets it running (there are other operations that could be called between init and execute but the ACR does not use them). The execute operation completes as the application starts to run and the ACR tracks progress using the getExecutionSummary operation.
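
The client side of this pattern can be sketched as a polling loop. The operation names init, execute and getExecutionSummary come from the text; the FakeCec class, the status strings and the callback signature are assumptions made so that the sketch runs without a real service.

```python
import time


class FakeCec:
    """Hypothetical stand-in for a CEC endpoint: RUNNING twice, then COMPLETED."""

    def __init__(self):
        self._jobs = {}

    def init(self, job_description):
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {"started": False, "polls": 0}
        return job_id

    def execute(self, job_id):
        # Returns at once; the work is assumed to continue inside the service.
        self._jobs[job_id]["started"] = True

    def get_execution_summary(self, job_id):
        job = self._jobs[job_id]
        if not job["started"]:
            return "PENDING"
        job["polls"] += 1
        return "COMPLETED" if job["polls"] >= 3 else "RUNNING"


def run_job(cec, job_description, on_progress, poll_interval=0.0):
    """Register the job, start it, then poll until it completes."""
    job_id = cec.init(job_description)
    cec.execute(job_id)
    while True:
        status = cec.get_execution_summary(job_id)
        on_progress(job_id, status)      # e.g. refresh a job-monitor display
        if status == "COMPLETED":
            return job_id
        time.sleep(poll_interval)


seen = []
run_job(FakeCec(), {"application": "SExtractor"},
        lambda job_id, status: seen.append(status))
```

Every call in the loop is an ordinary request and reply, so the client needs no inbound connections.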

Given better web-service toolkits, the CEC could be built as a proper, asynchronous service using one-way messages and avoiding polling. However, the system has to be implemented using basic libraries (such as Apache Axis), and the ACR may well be on a system that cannot receive asynchronous notification due to firewall or router limitations. Therefore, all the operations are technically synchronous. The only asynchronous invocation is inside the web-service where the application is managed by a thread that out-lives the web-service operation.
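
The service-side arrangement, in which a thread outlives the operation that started it, might look like the following. This is a generic threading sketch with hypothetical names, not the CEA server's code.

```python
import threading


class JobRunner:
    """Starts each job on its own thread and records its status."""

    def __init__(self):
        self._status = {}
        self._threads = {}
        self._lock = threading.Lock()

    def execute(self, job_id, work):
        """Start the job and return immediately, before the work is done."""
        with self._lock:
            self._status[job_id] = "RUNNING"
        thread = threading.Thread(target=self._run, args=(job_id, work))
        self._threads[job_id] = thread
        thread.start()

    def _run(self, job_id, work):
        work()
        with self._lock:
            self._status[job_id] = "COMPLETED"

    def status(self, job_id):
        with self._lock:
            return self._status[job_id]

    def wait(self, job_id):
        # For the demonstration only; a real client would poll status().
        self._threads[job_id].join()


release = threading.Event()          # keeps the fake job "running" until set
runner = JobRunner()
runner.execute("job-1", release.wait)
status_while_running = runner.status("job-1")
release.set()
runner.wait("job-1")
status_after = runner.status("job-1")
```

The execute call has returned while the job is still running; a later status call sees the result.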

The execution of SExtractor inside the web service uses MySpace. SExtractor requires a configuration file as input and produces a table of data as output. In each case, the web service is given the names of the files in MySpace as part of its job description, and the names embed the formal name of the file-manager service organizing those files. To get access to the files, the SExtractor web-service calls the file-manager service and is given in return a URL for streaming access to the data. The file-manager service gets the URLs, which are temporary, from the file-store service holding the actual files.

4.2          Visualize a table in MySpace using Topcat

Topcat is Starlink’s desktop application for visualizing and manipulating astronomical tables. It is designed to work with IVO data standards (FITS and VOTable formats) and with MySpace. Topcat is a prime example of how third-party software can work with the AstroGrid Client Runtime to use AstroGrid services without getting involved in the details. Because we have Topcat and similar applications, it is unnecessary for AstroGrid to develop advanced data-visualization software.

4.2.1   User Experience

·      Researcher selects the “Apps” pane of the AstroGrid workbench. Workbench displays a list of applications available via Java Web Start.

·      Researcher selects Topcat v1.6 (this is the Topcat version that uses ACR; earlier versions used different software to talk to MySpace). System launches Topcat on the desktop using Java Web Start.

·      Researcher selects Load table from the File menu; presses select filestore in the resulting dialogue; and selects MySpace as the location in the file-store browser. System displays no files at this stage as Researcher is not yet logged in to MySpace and System has no way to find Researcher’s personal files. System offers a login button.

·      Researcher logs in to MySpace specifying user-name, password and community (all AstroGrid user-names are qualified by the name of the community that issued them). System now displays Researcher’s files.

·      Researcher selects a file in the MySpace file-tree. Topcat loads it and displays the file’s internal metadata.

·      Researcher manipulates the data using Topcat’s features.

4.2.2   Components engaged

4.2.3   System activity

Launch Topcat

Workbench activates the user’s web browser, causing it to invoke an HTTP URL that leads to the Java-Web-Start (JWS) copy of Topcat v1.6. This returns a JNLP launch descriptor, served with MIME-type application/x-java-jnlp-file, that points to the executable jar. The browser has previously been set up with a Java plug-in including JWS (the standard plug-in from Sun Microsystems is suitable), and this plug-in downloads the jar and launches it as a desktop application.

Log in to MySpace

Topcat obtains a user-name, password and community name from the user and passes these to the ACR. The ACR looks up the community in the resource registry and obtains the address of its SecurityService web-service endpoint.

The ACR contacts the SecurityService in the community and obtains details of the user’s account. From these, the ACR constructs the International Virtual Observatory Identifier (IVOID; the formal name on which the resource registry can best be queried) of the file-manager service holding the root of the user’s MySpace directory-tree.

The ACR looks up the file-manager service in the registry and obtains the address of its web-service endpoint.
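
The chain of look-ups in the log-in can be sketched as below. Everything concrete in this sketch is an assumption for illustration: the identifiers, the endpoint addresses, the layout of the account record and the rule for forming the file-manager IVOID. Local dictionaries stand in for the registry and for the SOAP call to the SecurityService.

```python
# Hypothetical registry contents: formal name (IVOID) -> web-service endpoint.
REGISTRY = {
    "ivo://example.org/community": "http://community.example.org/SecurityService",
    "ivo://example.org/file-manager": "http://files.example.org/FileManager",
}

# Hypothetical account records, standing in for the community's SecurityService.
ACCOUNTS = {
    ("researcher", "example.org"): {
        "password": "secret",
        "file_manager_authority": "example.org",
    },
}


def security_service_lookup(user, password, community):
    account = ACCOUNTS.get((user, community))
    if account is None or account["password"] != password:
        raise PermissionError("login rejected")
    return account


def login(user, password, community):
    # Step 1: find the community's SecurityService endpoint in the registry.
    security_endpoint = REGISTRY[f"ivo://{community}/community"]
    # Step 2: obtain the account details (a local lookup replaces the SOAP call).
    account = security_service_lookup(user, password, community)
    # Step 3: construct the IVOID of the file-manager holding the user's root.
    file_manager_ivoid = f"ivo://{account['file_manager_authority']}/file-manager"
    # Step 4: resolve that IVOID to an endpoint via the registry.
    return {
        "security_endpoint": security_endpoint,
        "file_manager_ivoid": file_manager_ivoid,
        "file_manager_endpoint": REGISTRY[file_manager_ivoid],
    }


session = login("researcher", "secret", "example.org")
```

The registry is consulted twice in this sketch: once to find the community and once to find the file manager.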

Display files

Topcat requests from the ACR the details of files in the root directory of the user’s MySpace; the ACR forwards this request to the file-manager web-service and returns the metadata to Topcat, which displays the metadata.

Some of the entries in the root directory are themselves directories, so Topcat repeats the process recursively descending the MySpace file-tree and displaying the tree structure as it goes. Each node in the displayed tree is made into a control by which the user may operate on that file or directory.
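
The recursive descent can be sketched as follows. The list_directory function is a hypothetical stand-in for the request that the ACR forwards to the file-manager, and the sample tree is invented.

```python
# Invented sample tree: directory path -> list of (name, is_directory).
TREE = {
    "/": [("results", True), ("notes.txt", False)],
    "/results": [("catalogue.vot", False), ("images", True)],
    "/results/images": [("field1.fits", False)],
}


def list_directory(path):
    """Stand-in for the file-manager call returning (name, is_directory) pairs."""
    return TREE.get(path, [])


def walk(path="/", depth=0):
    """Return the tree as indented lines, descending into each directory."""
    lines = []
    for name, is_dir in list_directory(path):
        lines.append("  " * depth + name + ("/" if is_dir else ""))
        if is_dir:
            child = path.rstrip("/") + "/" + name
            lines.extend(walk(child, depth + 1))
    return lines


listing = walk()
```

In this sketch each directory costs one request, so a deep tree implies many round trips to the file-manager.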

Download and visualize table

The user selects a node in MySpace for visualization. Topcat asks the ACR for a URL from which the contents of the file may be read. (Note that prior to this Topcat and the ACR do not have a data-access URL but only an abstract URI naming the MySpace node.)

The ACR passes the request to the file-manager web-service. The file-manager determines from its internal metadata which file-store web-service holds the file and forwards the request to the latter service. The file-store replies with a URL pointing to a file-server; the file-manager relays this to the ACR and the ACR returns it to Topcat. In the present implementation, the URL points to a file held in the file-store web-application, but it could equally well point to an external file-server. The entity receiving the URL is not supposed to make any assumptions about co-location.
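
The relay of the data-access URL can be sketched with three small classes. The names, the node path and the URL layout are hypothetical; the sketch shows only that the client sees an opaque URL and never learns which file-store produced it.

```python
class FileStore:
    """Holds the files and hands out URLs for streaming access."""

    def __init__(self, base_url):
        self.base_url = base_url

    def access_url(self, file_id):
        # The URLs are described as temporary; expiry is not modelled here.
        return f"{self.base_url}/{file_id}"


class FileManager:
    """Knows which file-store holds each node and forwards URL requests to it."""

    def __init__(self):
        self._nodes = {}  # abstract node name -> (file store, file id)

    def register(self, node, store, file_id):
        self._nodes[node] = (store, file_id)

    def access_url(self, node):
        store, file_id = self._nodes[node]
        return store.access_url(file_id)


class ClientRuntime:
    """Stand-in for the ACR: relays the request and returns the URL unchanged."""

    def __init__(self, file_manager):
        self._file_manager = file_manager

    def url_for(self, node):
        return self._file_manager.access_url(node)


store = FileStore("http://store.example.org/files")
manager = FileManager()
manager.register("/results/catalogue.vot", store, "0042")
acr = ClientRuntime(manager)
url = acr.url_for("/results/catalogue.vot")
```

Replacing the store with one that returns URLs on an external file-server would need no change in the client.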

Topcat reads the file from the URL. Assuming that the user has picked a suitable file of tabular data, Topcat can now display and manipulate the file without further use of AstroGrid. Since the data are streamed from MySpace, Topcat can choose whether to read the entire file before processing or whether to process it in chunks.

4.3          Submit an ADQL query via the workbench

4.3.1   User experience

This use-case runs essentially as the use-case invoking SExtractor, above. There are some subtle differences:

·      The application is specific to AstroGrid and is not a wrapping of something that Researcher might run on the desktop. The name of the application, which embeds the name of the data-set queried, should be intelligible to Researcher but is probably not familiar.

·      The application takes as one of its parameters a reference to a file holding the query text. This text is in ADQL/X, the XML form of the Astronomical Data Query Language. The file is typically in MySpace. Architecturally, the ADQL query-file works a little like the configuration file in the SExtractor use-case, above: it’s a file that needs to be prepared and emplaced before the use case can be run. However, where the SExtractor file was potentially reusable for different runs of the application with different data, the query file is specific to the data-set being queried; it would need to be edited to work with different data.

·      In the SExtractor use-case, the application name identifies the algorithm. In the current use-case, the “application” name identifies the data set.

·      Equivalent data-transforming applications are often available at many different services. Data-selection applications are typically available on only one archive service.

4.3.2   Components engaged

The components are almost identical to those in the SExtractor use-case. These are the important differences:

·      A Data-Set Access (DSA) server replaces the CEA server in the SExtractor use-case. The DSA component conforms to the CEA and offers the same interface as the generic CEA server. However, DSA is a different service, one that talks to JDBC-enabled databases instead of command-line applications.

·      Typically, there is only one DSA service offering any given query-application since the application is specific to the data set. A mirror of the application implies an exact mirror of the data set with identical DB schemata. I have shown the DSA installation in Edinburgh, but most AstroGrid sites have some DSA facilities for different data sets.

4.3.3   System activity

The activity is identical to the SExtractor use-case except for the internal use of the archive DB by the DSA service.

Each query to a DSA service results in one SQL query to the DB. ADQL essentially is SQL92, with a few extra functions. Therefore, the SQL submitted to the RDBMS is extremely closely related to the ADQL.

The DSA software expands any astronomical macros into plain SQL and casts the SQL into the dialect understood by the RDBMS. To allow this, the DSA service is coupled to a specific DB at time of installation. DSA is not able to stand as proxy for DBs discovered in some way at run-time.

The DSA software casts the query results into XML according to IVOA’s VOTable schema before storing them in MySpace. Specifically, DSA writes the TABLEDATA form of data encoding in which each datum of each row is written out as an XML element. The output to MySpace is a single VOTable file containing a single RESOURCE element, that RESOURCE containing a single TABLE element.
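
The shape of such an output file can be sketched with the standard XML library. The element names (VOTABLE, RESOURCE, TABLE, FIELD, DATA, TABLEDATA, TR, TD) are those of the VOTable format; the XML namespace, the version attribute and most FIELD metadata are omitted, and the column names and values are invented, so this is an outline and not a schema-valid document.

```python
import xml.etree.ElementTree as ET


def rows_to_votable(columns, rows):
    """columns: list of (name, datatype); rows: list of tuples of values."""
    votable = ET.Element("VOTABLE")
    resource = ET.SubElement(votable, "RESOURCE")    # a single RESOURCE
    table = ET.SubElement(resource, "TABLE")         # holding a single TABLE
    for name, datatype in columns:
        ET.SubElement(table, "FIELD", name=name, datatype=datatype)
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            # TABLEDATA encoding: each datum of each row is its own element.
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


votable_xml = rows_to_votable(
    [("ra", "double"), ("dec", "double")],
    [(10.5, -3.25), (11.0, 4.0)],
)
```

Writing every datum as an element makes the file verbose but easy to read with any XML parser.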

4.4          Execute a SIAP query via the workbench

This use-case demonstrates the use of an HTTP service external to AstroGrid.

SIAP is IVOA’s Simple Image Access Protocol, which queries a catalogue of images of the sky and returns a table of access URLs for images matching the search criteria.

SIAP services are trivially available via the WWW, but do not follow AstroGrid’s Common Execution Architecture, do not support asynchronous execution of queries and do not use MySpace. AstroGrid provides proxy-services, conforming to CEA, for selected SIAP services such as the images of the Sloan Digital Sky Survey.

4.4.1   User experience

For the user, this use case works exactly as does the SExtractor use-case above. Only the application name and parameters change.

A SIAP application is specific to one data set; choosing the application is the way to specify the data to be searched.

4.4.2   Components engaged

The CEA proxy web-application is the third kind of service implementing the CEC web-service interface (the others are the CEA server for command-line applications, used in the SExtractor use-case, and the DSA component).

The HTTP proxy service has no local configuration for particular HTTP services. Instead, it gets its configuration at run time from the registry. Typically, a new CEA proxy is set up to stand as proxy for all HTTP services for which CEA applications have been registered.

4.4.3   System activity

Activity is as for the SExtractor use-case except that instead of running a command-line application internally, the CEA server makes a single HTTP call to a remote SIAP service. This call is managed asynchronously from the CEC web-service. The results of the SIAP query are returned by the SIAP service to the CEC and are then written to MySpace.

This kind of CEC does not read input files from MySpace. The HTTP services for which it stands proxy typically do not require uploaded files as inputs.

4.5          Create a workflow in the portal

This use case demonstrates the interaction between a UI and the resource registry to record an astronomical experiment as a workflow. In the v1.1 system, this can only be done via the AstroGrid web-portal, but we hope soon to add this feature to the AstroGrid Workbench.

The workflow in question executes the SIAP query and SExtractor invocation described in previous use-cases.

4.5.1   User experience

·      Researcher goes to the workflows page of the web portal and chooses New from that page’s File menu. System displays the stub of a graph for a new workflow, with no job-steps set.

·      Researcher selects Insert step/Here from the Edit menu. System adds a node for a job-step to the graph of the workflow.

·      Researcher clicks on the newly-created node. System adds to the page controls for choosing an application to be run in that step.

·      Researcher selects an application. Researcher may choose from a list of recently-used applications or may choose a registry search. In the search, system displays a dialogue for a keyword search; Researcher supplies the search terms; system displays the registry metadata of matching applications and Researcher chooses from this “short-list”. In this case, Researcher chooses ivo://uk.ac.cam.ast/INT-WFS/images/CEA-application for which the description is “This resource defines the CEA application for access to INT-WFS images. The interface follows the SIAP standard.” System adds to the page controls for setting the application’s parameters. Researcher fills in the parameter values.

·      Researcher repeats the “add step/choose application/set parameters” procedure for further job-steps that route the images raised in the SIAP query into the SExtractor application.

·      Researcher chooses Save from the File menu of the workflow-builder page. System displays a dialogue following the normal “save-as” pattern of GUI applications but targeted at the virtual file-system in MySpace. Researcher chooses a location and saves the workflow. System records it as an XML file in MySpace.

·      The workflow is now available for immediate or later execution.

4.5.2   Components engaged


4.5.3   System activity

Create workflow graph

The portal displays a new workflow graph empty of job-steps. No external services are needed for this. The portal records the workflow as an XML document in the session state of the portal web-application; this means that the workflow details are held in memory and not committed to disc.

Choose application

This can be done in two ways. If Researcher chooses the application from the list of recently-used applications, then the portal obtains from the list the IVOID of the chosen application but not that application’s registry metadata. In this case, the portal then queries the registry to obtain the resource document for the given IVOID; this is a specific operation in the WSDL contract of the registry web-service. If, instead, Researcher opts for an explicit registry search, then the portal constructs an ADQL/X query on the registry for all resource documents that have the given search term in any of a number of elements, and then calls the registry web-service’s search operation passing the ADQL/X document as part of the call. In this case, the portal caches the resource documents for the next step.

Choose parameter values

Working from the description of the application interface in the resource document, the portal generates a web form by which the user may set the parameters for the application call and adds that to the current UI page. Researcher fills in the form and the portal adds the parameter values to the XML document representing the workflow under construction.

Save to MySpace

When Researcher logged in to the portal, the file-manager service holding Researcher’s MySpace file-tree was identified and remembered as part of the portal’s session state. In addition, the portal has, in its session state, a cached copy of the metadata from the file-tree. The procedure by which these metadata were loaded into the portal is similar to the procedure for logging on to MySpace via Topcat, discussed in a use case above.

Because the portal has cached knowledge of the file-tree, it can handle the selection of a location in MySpace internally.

This process generates a formal name for the destination under which the workflow is to be saved; the name includes the IVOID of the file-manager service and the path down the file-tree to the file.
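
A formal name of this kind could be composed and taken apart as below. The use of "#" to join the IVOID and the path is an assumption made for this sketch, as are the example identifier and path; the text says only that the name includes both parts.

```python
def myspace_name(file_manager_ivoid, path):
    """Join a file-manager IVOID and a path into one name (assumed '#' separator)."""
    return f"{file_manager_ivoid}#{path.lstrip('/')}"


def split_myspace_name(name):
    """Recover the file-manager IVOID and the absolute path from a name."""
    ivoid, separator, path = name.partition("#")
    if not separator:
        raise ValueError("name has no path component")
    return ivoid, "/" + path


name = myspace_name("ivo://example.org/file-manager", "/workflows/siap-sextractor.xml")
parts = split_myspace_name(name)
```

A service given such a name can find the right file-manager in the registry from the first part and ask it for the file named by the second.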

The actual data-saving operation is identical to that used to save the results in the SExtractor-execution use-case, except that now the client of the file-manager service is the portal itself.


 

4.6          Execute a workflow via the workbench

This demonstrates the execution of the workflow constructed in the previous use-case. It is related to the use-cases in which SExtractor and SIAP were invoked from the workbench.

4.6.1   User experience

·      In the workbench, Researcher selects Workflow viewer, and loads the workflow from MySpace. System displays the workflow graph.

·      Researcher presses the Run button. System executes the workflow, reporting progress in the workbench’s job monitor in the same way as in the SExtractor use-case.

4.6.2   Components engaged

The components are the union of those in the SExtractor and SIAP use-cases with one addition: the Job Entry System. When running workflows the workbench and ACR no longer talk directly to the CEA services. Instead, all communication passes through JES.

 

This rather complex diagram is the union of the deployment diagrams for the SExtractor and SIAP use-cases with JES interposed. Note how the application, via the ACR, now depends only on JES and is no longer directly dependent on particular CEA services for the applications. This allows JES to distribute the work among services offering equivalent applications, either to balance the computational load or to reduce the network load by choosing services co-located with the data. JES turns AstroGrid from a web-like client-server system into a Grid.

4.6.3   System activity

Select SIAP-proxy service

JES reads from the workflow document the name of the SIAP application. JES queries the registry to find services offering this application. There are a handful of such services; JES assigns the job-step randomly to any of these. This achieves simplistic load-balancing but does not co-locate the processing with the data. It optimizes for CPU power at the expense of network bandwidth.
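
Random assignment of this kind amounts to a uniform choice among the discovered services, as in the sketch below. The service names are invented and the seeded generator is there only to make the demonstration repeatable; nothing here is taken from the JES code.

```python
import random
from collections import Counter


def choose_service(services, rng):
    """Pick one of the candidate services uniformly at random."""
    if not services:
        raise LookupError("no service offers this application")
    return rng.choice(services)


services = ["siap-proxy-a", "siap-proxy-b", "siap-proxy-c"]
rng = random.Random(2005)
# Tally many assignments to see how the job-steps spread across services.
tally = Counter(choose_service(services, rng) for _ in range(300))
```

The choice ignores where the data are held, which is why this policy spreads load but does not reduce network traffic.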

Invoke SIAP

This is done as in the use case where SIAP was invoked from the workbench, but with JES as the client of the proxy-service.

Parse SIAP results

The SIAP invocation writes to MySpace a list of images as a VOTable. The table includes URLs from which the images can be downloaded. JES parses the table and extracts the list of image-access URLs. The instructions for doing this are written in the workflow document in a scripting language; this is not an intrinsic function of JES.
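
The extraction step can be sketched as follows. Python is used here only for illustration; it is not the scripting language of the workflow document. The sketch assumes that the access-URL column is marked with the UCD VOX:Image_AccessReference, as in the SIAP standard, and uses a small invented table without XML namespaces.

```python
import xml.etree.ElementTree as ET

ACCESS_UCD = "VOX:Image_AccessReference"

# Invented, namespace-free sample of a SIAP response.
SAMPLE = """<VOTABLE><RESOURCE><TABLE>
<FIELD name="title" datatype="char" arraysize="*" ucd="VOX:Image_Title"/>
<FIELD name="url" datatype="char" arraysize="*" ucd="VOX:Image_AccessReference"/>
<DATA><TABLEDATA>
<TR><TD>Field 1</TD><TD>http://images.example.org/1.fits</TD></TR>
<TR><TD>Field 2</TD><TD>http://images.example.org/2.fits</TD></TR>
</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>"""


def image_urls(votable_text):
    """Return the access URL from every row of the first table."""
    table = ET.fromstring(votable_text).find("RESOURCE/TABLE")
    fields = table.findall("FIELD")
    # Locate the column by its UCD rather than by position or name.
    index = next(i for i, f in enumerate(fields) if f.get("ucd") == ACCESS_UCD)
    return [tr.findall("TD")[index].text for tr in table.iter("TR")]


urls = image_urls(SAMPLE)
```

Finding the column by UCD keeps the script independent of the column order chosen by a particular SIAP service.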

Select SExtractor service

Services to execute the SExtractor application are discovered from the registry, in the same pattern used to discover the SIAP service. JES selects as many of these services as there are images to process; if there are more images than services then multiple job-steps are sent to each service.
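
One way to share the images among the services is a round-robin assignment, sketched below. The round-robin rule is an assumption chosen for simplicity; the text does not say how JES orders the assignments. The service and image names are invented.

```python
def assign_job_steps(image_urls, services):
    """Map each service to the images it should process, in round-robin order."""
    if not services:
        raise ValueError("no services available")
    # Use no more services than there are images.
    used = services[:len(image_urls)]
    assignments = {service: [] for service in used}
    for i, url in enumerate(image_urls):
        assignments[used[i % len(used)]].append(url)
    return assignments


images = [f"image-{n}" for n in range(5)]
many_images = assign_job_steps(images, ["cea-a", "cea-b"])
few_images = assign_job_steps(images[:1], ["cea-a", "cea-b", "cea-c"])
```

With five images and two services, one service receives three job-steps and the other two; with one image, only one service is used.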

Invoke SExtractor

This is done as in the SExtractor use-case, but with JES as the client of the CEA service.

The workflow states that all the SExtractor job-steps are to be run in parallel. JES submits them in parallel, but the job-steps may be executed sequentially where a single CEA service has more than one item to process.