Building Queries
Querying astronomical databases from AstroGrid using ADQL
Tabular data can cover anything from an observatory log or list of spectra, to a source catalogue, to complicated lists of properties or simulated data. The All-VO Scope provides simple position- or time-based searches but many archive databases support more detailed queries, for example Return RA, Dec., U B V magnitudes and errors within a given region where (Bmag - Vmag > 1) and the source is point-like. This has to be expressed in ADQL, a simplified version of SQL (Structured Query Language), with extensions necessary to support astronomy-specific terms. A list of SQL terms valid for ADQL is given in the ADQL reference manual. (Note that ADQL must be understood by all VO-enabled databases and so it is not always possible to make very specialised queries which are only understood by proprietary or custom interfaces.)
The AstroGrid Query Builder shows the columns present in the data and provides a tree interface or a validator to help you to build queries that conform to the ADQL syntax.
Queries from the VO Desktop
Select a database which supports ADQL, such as those found though VOExplorer Queryable database examples (e.g. the Super Cosmos Catalogues or 2MASS).
When you choose a suitable catalogue, e.g. 2MASS, a button appears on the left to Build ADQL. Click on this to bring up the Task Runner configured to query your chosen data set.Task Runner Reference Help provides a summary of the main menus common to all uses of Task Runner and Query Builder Reference Help covers the specific functions of the Query Builder item by item. Here is a 'quick start' guide.
Finding the required catalogue columns
The main panel provides a description of the selected catalogue. If it contains more than one table e.g. twomass_psc, twomass_scn and twomass_xsc, use the Table tab to select the one you want.
A list of the relevant column names will be displayed, along with their UCDs, Units, Types and Descriptions. The blue Magnifying Glass on the right lets you enter a term to Filter Columns, e.g. ''pos'', as shown, will select all the column descriptions with position, POS_EQ etc. in the column name, UCD etc. Clicking on a heading, e.g. Column Name, sorts those descriptions in ascending or descending alphabetical order.
You can construct a query in two ways. The Edit pane (middle left) allows you to enter an ADQL query free hand, or cut and paste from a text editor (or copy one of the examples here). When you have finished, click Validate Edit. Use the right-hand panel to provide the exact table and column names which must be used, and be careful to use only allowed ADQL terms and the correct SQL syntax. The Tree view below the Edit pane shows the query as an interactive hierarchy. This can be edited directly and will only let you insert branches and values consistent with ADQL syntax (see below). Note that if you highlight a lower branch in Tree view you will only see that clause in the Edit pane - highlight the top SELECT statement to see your entire ADQL query in SQL.
Once you have finished building the query, click on Validate Edit (see below) to make sure that there are no errors and allow it to be sent to the database (you must do this even if you are sure you have no mistakes!). Optionally choose a VOSpace for the Result (top centre) and then Execute (top right). If you have not chosen an output file, the result will be returned into memory. Tip: This is only suitable for small files (use a Top restriction as in the example); save large area searches to VOSpace.. You can then view the output in TopCat or any suitable application.
Examples of valid ADQL queries
- If you are using vodesktop-2008.1.beta.jar (no number after beta) or a 2007 release of the AstroGrid Desktop (using the Registry version 0.1), then you need to quote any column called DEC or Dec or dec, as in o."dec"
SELECT TOP 100 * FROM twomass_psc AS o WHERE (o.ra > 56.25) AND (o.ra < 57.25) AND (o.dec > 23.6167) AND (o.dec < 24.6167) AND (o.j_m - o.k_m) < 1.0
This selects the first 100 columns from the 2MASS point source catalogue within a box bounded by blc (56.25, 23.6167) and trc (57.25, 24.6167), where Jmag - Kmag < 1.0
Note that position constraints must be in the same units (here, decimal degrees) and coordinate frame as the columns in question. Tip: TOP 100 or some other small number is useful for testing without getting a very large data set back.
SELECT * FROM eitdata AS T1 WHERE T1.DATE_OBS > '2002-07-28T01:00:00.000' AND T1.DATE_OBS <= '2002-07-28T06:00:00.000' AND T1.WAVELNTH = 195
This select all columns for SOHO/EIT images at the specified wavelength and within the given times (note quotes).
SELECT s.time_start, s.time_end, s.nar FROM goes_xray_flare AS s WHERE s.nar > 9850 AND s.nar < 9880 AND s.xray_class > 'M5'
Selects columns giving start time, end time and Active Region Number (nar) of flares in the GOES x-ray flare list with nar in the specified range and flare class above M5. (Solar Event Catalogue)
- We also provide a library of Example Queries
ADQL (SQL-like) Vocabulary
The BNF (Backus Naur Form) definition of ADQL lists all ADQL elements and their syntax, although please note that some of the more complex ones may not be supported by all databases. Tip: Column names which are reserved words (listed in the BNF definition) ,or which begin with a non-alphabetical character, should be enclosed in inverted commas e.g. SELECT * FROM table AS t WHERE t."date" > 10.0 AND t."2XMM" < 17 .
Building a query in tree view
The Query Builder opens with the elements Select, Items and From already in the tree. You can click on any node to expand or collapse it. The general mouse behaviour in tree view is that you select a position in the tree and right-click to obtain a list of elements that can be inserted at that position (under Insert).
Start by specifying which table should be used in the query: right click on From, choose Insert and then the table name, in the example twomass_psc. You could also type this in by chosing Micro edit
The default is to return all columns (Items *). To specify which columns you want to retrieve, highlight Items, then go to the right-hand panel list of columns. Right-click on the first column which you want, then CNTRL-click (or equivalent for your system) to select further columns; a series of adjacent columns may be selected by SHIFT-click. Alternatively, in the tree, right-click on Items and under Insert choose Columns, the table name and finally the name of the column required (ra, dec, etc.) .
- In this example we use Filter Columns to choose "flux".
- The edit pane above shows the ADQL for the tree branch highlighted.
Now highlight the top of the tree (Select), right click and insert Where. Move to the new Where clause and right-click to chose an operator, for example Comparison followed by > . You select the column to appear to the left of the operator is inserted by highlighting Comparison then left-clicking to select the name of the column in the right hand panel, then right-clicking to choose Insert 1 reference into .... (or use the pull-down menu in the tree, as for Items).
Next specify what goes to the right of the operator (assuming it is a number or string); click on Comparison, right-click and choose to insert a Literal and then Microedit the Literal to type in the required value e.g. 56.25.
Finally, insert a Top statement into Select. This is particularly useful for complex or large queries to test whether they are returning what you expect.
- If you highlight Select and look at the lower window, the ADQL for the whole query will appear
You can chop and change between entering ADQL as SQL in the Edit pane and editing the Tree View, but one thing to be aware of is that the Edit pane is itself context sensitive. When the editor first opens the Tree view is focused upon the top SELECT statement in the query: the SELECT branch is highlighted in the tree. This effectively targets the Edit pane to accept complete queries. If you wish to type in and see the whole query as you work, then that is what you want. However, if you choose a different branch of the Tree View, then the Edit pane will be targetted at that point, and will only accept ADQL suitable for the specifically chosen construct. Effectively, you can target the query from its smallest component to its largest construct. For complex queries this can help in breaking down the query into managable chunks. If you cursor down the tree using the keyboard, you will see the contents of the Edit pane alter as the context changes. At each point the query is editable in the Edit pane. Be aware if at any point you start getting errors on what is sensible ADQL: check to see you have not inadvertently changed the context of the Edit pane!
Validation and History
If you type the query freehand (or cut-and-paste), you should then click on Validate which should construct the corresponding tree view (or give an error message). You can mix and match as long as you highlight the part of the Tree you are editing (highlight the topmost node Select to enter the entire query freehand) and validate after every free text editing before Tree editing.
- If you make a mistake, the Diagnostics box at the bottom offers help (in this example, the offending statement with a missing bracket is shown) and the text entry box will be framed in red.
- The History stack next to the Diagnostics box contains a record of what was in the Edit pane each time you clicked Validate (Bottom is the first entry you made and Top is the last).
Saving, commenting and advanced views for your query
- If you have selected a file destination (locally or in VOSpace) then your results will be stored there, needing only normal prudence in making backups. If you don't select a destination then the output will be cached as long as that Task Runner window is open, but you have to save the output separately, e.g. to a file, or by sending it to TopCat and then saving an edited version.
- You can also save the query itself (make sure that it is valid first) using the File menu at the top right of Task Runner (this will not save cache output). If you want to run or edit the query in a separate session, repeat the initial catalogue selection from VOExplorer and click on Build ADQL to get a new Task Runner, and then use the File menu to load the saved query.
- The Comment tab (next to the Edit tab on the left) allows you to enter comments which can also be saved along with the query.
- The Debug tab is only available if you have selected Debugging mode from System (Advanced) in the Preferences section of the Edit menu (in Task Runner or VOExplorer). It will show you the XML version of your query.
Tip: You are advised to save long or complex queries in a text editor before validation or execution. You can also save from the Task Runner File menu but occasionally certain syntax errors may cause some of your input to be lost.
Database queries from a script
See Python Scripting with AstroGrid for the general principles and for an ADQL example.
Attachments
- qb_20080314_small.png (69.2 kB) - added by gg 6 months ago.
- QB1.png (34.8 kB) - added by gg 6 months ago.
- QB2_small.png (115.9 kB) - added by gg 6 months ago.
- QB_ADQL_small.png (116.4 kB) - added by gg 6 months ago.
- QB_V_small.png (123.2 kB) - added by gg 6 months ago.
