Difference between revisions of "Calibration Database"

From EOVSA Wiki

Revision as of 14:14, 21 November 2016

Description and Use of the EOVSA Calibration Database

Background

We have created a general-purpose table in the SQL-Server database eOVSA06, named abin, which is used to hold binary calibration data in a general format given by an XML format string in the same table. The table is meant to be extendable to any calibration type, although it remains to be seen whether it is general enough to handle all use cases. This document describes the scheme, the format of the abin entries, and the list of currently defined binary types (this will have to be updated on a regular basis as new definitions are added).


Description of the General Scheme

The general idea is to create entries into the abin table that are self-describing and completely general.

The table columns are:

['Bin', 'Timestamp', 'Version', 'Id', 'Description']


The Id number is auto-incremented to be unique to each record, and is never set by the user. Each type definition appears in the table with an n.0 Version number (float), and whenever it is updated, a new n.0 record is written with the current Timestamp. This provides a history, with the corresponding Timestamp giving the start of the time range of applicability. (In hindsight, the name Version is unfortunate, since the purpose of this key is more accurately described as the calibration Type.) To distinguish between this key and the true versions given within the type definition record, the latter is referred to as the “internal version.” The Bin column contains an XML data description that is used to decode the data. The Version (type) number n is unique for each calibration type, so that records with Version = 1.0, for example, will always contain the latest definition for a particular type of data defined as type 1 (the type of calibration data is further described in the Description column). The type definitions, as well as helper routines for creating, reading, and writing records, are found in the Python module cal_header.py.

The XML data itself, found in the Bin column of a Version n.0 record, contains an internal version variable that gives a further record of the version of the XML format. As a concrete example, the latest Version 4.0 (delay centers) definition will contain an XML string that includes its own internal version variable, with a value of, say, 2.1, which distinguishes it from an earlier type 4.0 definition. This internal version number is used by send_xml2sql() to determine whether a definition in cal_header.py has changed and needs to be written to the abin table.

After (never before) the defining n.0 record is written, subsequent records of that type can be written containing the binary calibration data, which will be decoded using the defining XML string. Thus, after writing the latest Version 4.0 format record, subsequent records with Version 4.1 (type 4, with internal version 1.0) can be written that will be decoded using that latest 4.0 XML string. Other versions, e.g. 4.2 (internal version 2.0) etc., could in principle be written, although it is not clear why that would be needed (perhaps an important change to the contents, but without a corresponding change to the format, could be indicated with a new 4.x version number). Thus, the latest delay_centers entry can be read with a query like:

SELECT TOP 1 * FROM abin WHERE Version > 4.0 AND Version < 5.0 ORDER BY Timestamp DESC

while the delay_centers entry for a given Timestamp tstamp can be read with a query like:

SELECT TOP 1 * FROM abin WHERE Version > 4.0 AND Version < 5.0 AND Timestamp <= tstamp ORDER BY Timestamp DESC
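The two queries above differ only in the type number and the optional Timestamp limit, so they can be generated by one small helper. The following is a minimal sketch, not code from cal_header.py: the function name abin_query is hypothetical, and tstamp is assumed to be a number already expressed in the same units as the Timestamp column.

```python
def abin_query(type_num, tstamp=None):
    """Build the query for the latest data record of one calibration type.

    Hypothetical helper for illustration only. Data records of type n have
    Version strictly between n.0 and (n+1).0, as in the queries above.
    """
    lo = float(type_num)
    hi = lo + 1.0
    query = ('SELECT TOP 1 * FROM abin WHERE Version > {:.1f} '
             'AND Version < {:.1f}'.format(lo, hi))
    if tstamp is not None:
        # Assumption: tstamp is numeric and in the Timestamp column's units.
        query += ' AND Timestamp <= {}'.format(tstamp)
    return query + ' ORDER BY Timestamp DESC'


latest_delay_centers = abin_query(4)
delay_centers_at_time = abin_query(4, tstamp=123)
```

In practice the read_cal() routine in cal_header.py should be preferred over hand-built queries; the sketch only makes the Version range convention explicit.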

In testing this, it was discovered that binary records returned by such a query are limited in length to 4096. To get an arbitrarily long record, one must prepend the string “SET TEXTSIZE 2147483647” to the query. Note that such details are already handled by the helper routine read_cal() in cal_header.py.

As new calibration types are created, their definitions will be added to cal_header.py, both by updating the cal_types() routine to add the new type’s Version number and Description, and by adding two writing routines: one called <type>2xml() that returns the XML description of the data (later written into the database by send_xml2sql()), and one called <type>2sql() that converts the calibration data to a binary buffer and writes it into the database, where <type> is a hopefully rational name for the new type. As new formats for an existing type are created, it should be fine to simply update the cal_types() routine to change the description (if needed) and update the format embodied in the <type>2xml() and <type>2sql() routines. It should not be necessary to keep the old format, since the database itself already forms a history. Of course, any previous versions of the cal_header.py file will also be kept in the github versioning system.


Currently-Defined Types

This section will hopefully be updated whenever new types are added, to provide a list of currently-defined calibration data types. However, it is probably wise to consult the cal_header.py file to verify the current definitions. Here is the verbatim return statement from cal_types():

return {1:['Total power calibration (output of SOLPNTCAL)','proto_tpcal2xml',1.0],
        2:['DCM master base attenuation table [units=dB]','dcm_master_table2xml',1.0],
        3:['DCM base attenuation table [units=dB]','dcm_table2xml',1.0],
        4:['Delay centers [units=ns]','dlacen2xml',1.0]}

To add a new type, simply add another entry to this dictionary, with a unique type number and a three-element list whose first element is the Description string, second element is the string name of the routine to call to create the XML definition (which returns a binary buffer ready for writing to the abin table), and third element is the version number. Then add the corresponding <type>2xml() routine defining the format of the binary data, and the <type>2sql() routine that converts the calibration data to a corresponding binary buffer. The cal_header.py module includes a routine send_xml2sql(), which can be called at any time; it checks the latest version of each calibration type in the abin table and updates any that have changed (i.e. have a different version number than the latest one in the table). The return statement of each <type>2sql() routine should call write_cal() to actually write the binary buffer to the database, so that a single call to the routine does everything. It is anticipated that routines that create the calibration data will call the corresponding <type>2sql() routine directly.
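As an illustration of the procedure just described, the sketch below adds a fifth entry to a copy of the dictionary and then lists the types whose definition would need to be rewritten. It is a sketch under stated assumptions, not the actual send_xml2sql() code: the type 5 entry, the routine name example_gain2xml, and the function types_needing_update are all hypothetical, and the stored versions are passed in as a plain dictionary rather than read from the abin table.

```python
def cal_types_sketch():
    """Copy of the cal_types() dictionary plus one hypothetical new type."""
    return {
        1: ['Total power calibration (output of SOLPNTCAL)', 'proto_tpcal2xml', 1.0],
        2: ['DCM master base attenuation table [units=dB]', 'dcm_master_table2xml', 1.0],
        3: ['DCM base attenuation table [units=dB]', 'dcm_table2xml', 1.0],
        4: ['Delay centers [units=ns]', 'dlacen2xml', 1.0],
        # Hypothetical entry, for illustration only.
        5: ['Example gain table (hypothetical)', 'example_gain2xml', 1.0],
    }


def types_needing_update(defined, stored):
    """Return the type numbers whose defined version differs from the stored one.

    'stored' maps type number -> version found in the latest n.0 record;
    a type that has never been written is simply absent from it.
    """
    return sorted(n for n, (description, xml_routine, version) in defined.items()
                  if stored.get(n) != version)


# Assumed database state: type 4 holds an older version, type 5 is not yet written.
stored_versions = {1: 1.0, 2: 1.0, 3: 1.0, 4: 0.9}
pending = types_needing_update(cal_types_sketch(), stored_versions)
```

With the assumed database state, pending holds types 4 and 5, which are the ones for which a new n.0 definition record would be written.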

To change an existing type, change the description in the cal_types() routine, if desired, and change the corresponding <type>2xml() and <type>2sql() routines to create the new definition. It should not be strictly necessary to increment the version number that will be written into the XML description, unless two active versions are needed at the same time. It is up to the programmer to decide whether to increment the minor (fractional) or major (integer) part of the version number, since only its uniqueness is required.


Reading Back Data for a given Calibration Type

If the above scheme is followed, it should be possible to use a single, general routine to find and successfully read the binary calibration data for a given time. The read_cal() routine in the cal_header.py module does this, returning a Python dictionary and the binary buffer. The dictionary contains key, value pairs defining the variable names (keys) and the types and start location (values) in the binary buffer. To use these returned entities, one employs the extract() routine defined in the stateframe.py module, e.g. to read the total power (type 1) calibration factors for antenna 5 on April 3, 2016 as of 20:00 UT:

import cal_header as ch
import stateframe, util
tp, buf = ch.read_cal(1, t=util.Time('2016-04-03 20:00'))
calfac = stateframe.extract(buf, tp['Antenna'][4]['Calfac'])

Here the index for antenna 5 is 4, since it is a zero-based index. Note that to read the values for the current time, the input t can be omitted.
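To make the roles of the dictionary and the binary buffer concrete, here is a self-contained sketch of the idea using Python's struct module. It is not the real stateframe.extract() and does not use the real type 1 layout: the function extract_sketch, the three-antenna layout, and the [format, offset] leaf convention are assumptions chosen only to show how a key path in the dictionary locates values in the buffer.

```python
import struct

# Hypothetical layout: each leaf is [struct format, byte offset in the buffer].
layout = {
    'Timestamp': ['<d', 0],
    'Antenna': [{'Calfac': ['<2f', 8 + 8 * i]} for i in range(3)],
}


def extract_sketch(buf, entry):
    """Decode one variable from buf given its [format, offset] entry."""
    fmt, offset = entry
    values = struct.unpack_from(fmt, buf, offset)
    # Return a scalar for single values, a list otherwise.
    return values[0] if len(values) == 1 else list(values)


# Build a matching buffer: one double, then two floats per antenna.
buf = struct.pack('<d', 3542486400.0)
buf += b''.join(struct.pack('<2f', 1.0 + i, 2.0 + i) for i in range(3))

# Zero-based index 1 selects the second antenna in this toy layout.
calfac = extract_sketch(buf, layout['Antenna'][1]['Calfac'])
timestamp = extract_sketch(buf, layout['Timestamp'])
```

The real dictionary returned by read_cal() is built from the XML definition stored in the database, so the buffer can always be decoded with the definition that was current when it was written.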

Reading and Writing Delay Center Tables

Whenever a new delay center measurement is made, it can be easily written to the database using the following procedure. To create a brand new table, start with a delay_centers.txt file, which is simply a human-readable text file of a specific format patterned after the one originally made by hand. The ability to see a text file of the delay centers is useful to ensure that everything looks sensible. Also, such a table is read from the ACC /parm folder by the dppxmp program. To create a text table from the current database contents, simply run:

import cal_header as ch
ch.dla_censql2table(acc=False)

By default, acc=True, which means to write the table to the ACC. In either case, the equivalent table is also written to /tmp/delay_centers.txt, on whatever machine the command is run from. This table can be changed by hand, if desired, and written back to the database by

ch.dla_centable2sql(filename='/tmp/delay_centers.txt')

but note that these updates will not be reflected in the ACC file until you run

ch.dla_censql2table(acc=True)

where acc=True is the default, and so could be omitted.

In the more usual case, rather than starting with a new table, the delay center values will simply be updated by measuring changes to the delays relative to the current ones. To update the SQL database with new delay centers AND write them to the ACC, let's assume that the delay offsets, in units of delay steps, have been measured for baselines with respect to antenna 1, i.e. 1-2, 1-3, 1-4, ... 1-14, and also on the auto-correlations for each antenna 1 through 14. The former are in a 2 x 14 array dla_update, and the latter are in a 2 x 14 array xy_delay. Then run:

ch.dla_update2sql(dla_update, xy_delay)