For DiFX 1.5.* and higher. (Notes on using the old CIRA-DiFX-1.0 are here if required.)
The basic steps in running the new version of DiFX are:
vex2difx - takes .v2d file as input and produces .input and .calc files (also .flag)
calcif2 - takes .calc file as input and produces model files (for DiFX 1.5 .uvw, .delay, [.rate, .im also produced])
mpifxcorr - writes output files to .difx directory (as specified in input file)
difx2fits - converts output to FITS (and ascii tables including VLBA-style “sniffer” data)
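The four steps above can be sketched as an ordered list of commands. This is only an illustration, not the Espresso implementation: the function name `difx_commands`, the separate `job_base` argument (the base name of the .input/.calc files that vex2difx writes) and the `mpirun -np ... -machinefile ...` form of the mpifxcorr launch are assumptions. Check the "run" file produced by lbafilecheck.py for the options actually used on CUPPA.

```python
def difx_commands(v2d_base, job_base, n_procs, machines_file="machines"):
    """Return the basic DiFX steps as argument lists, in running order.

    v2d_base      : base name of the .v2d file
    job_base      : base name of the generated .input/.calc files (assumed)
    n_procs       : number of MPI processes to start (assumed launch style)
    machines_file : MPI machines file, e.g. as written by lbafilecheck.py
    """
    return [
        # 1. .v2d -> .input, .calc (and .flag)
        ["vex2difx", v2d_base + ".v2d"],
        # 2. .calc -> model files
        ["calcif2", job_base + ".calc"],
        # 3. correlate; output goes to the .difx directory named in the .input
        ["mpirun", "-np", str(n_procs), "-machinefile", machines_file,
         "mpifxcorr", job_base + ".input"],
        # 4. convert the correlator output to FITS
        ["difx2fits", job_base],
    ]


if __name__ == "__main__":
    # Hypothetical experiment and job names, for illustration only.
    for cmd in difx_commands("v252o", "v252o_1", 8):
        print(" ".join(cmd))
```

Each list could be handed to `subprocess.check_call` in turn, so that a failure in one step stops the later ones.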
See also http://cira.ivec.org/dokuwiki/doku.php/difx/difx_run for generic instructions
Espresso: a few local scripts that will make your life easier on CUPPA (see below for usage):
gloPut7T.sh - the best way to transfer data between CUPPA and pbstore
disk_report.py - summary of disk space usage on the main CUPPA data areas
disk_exper.py - will produce a default input file for lbafilecheck.py based on the output of disk_report.py
mk5scans.py - add time ranges to Mark5 file lists for vex2difx
getEOP.pl - gets EOPS from IERS and returns them in v2d format
updatepos.py - update a station position in the vex file with a position from $CALCDB/fullstation.tab
updateclock.py - update clock information in the .v2d file
mjd2vex.py - converts between mjd and vex format dates (both directions).
espresso.py - wrapper for vex2difx, calcif2, errormon2 and mpifxcorr; also moves various files to the CUPPA data areas to facilitate FITS file creation.
A typical Espresso correlation would look like this.
The contact author should have been emailed regarding their desired correlation parameters (ask Hayley).
All data from the experiment need to be transferred from iVEC's PB store to CUPPA for correlation. If any appear to be missing, useful links may be the Observers wiki at ATNF and the data tracking spreadsheet (e.g. have all Ceduna disks shipped from the previous sessions been backed up?).
Note: There is an issue with OpenMPI and the way vex2difx currently orders stations in the input file, when data for multiple datastreams are mounted on the same compute node. The potential problem is that datastream processes will be started on the wrong host node and files can't be opened to read the data. This can usually be avoided, otherwise NFS mounting of disks may be necessary. (Ask Hayley or Cormac for further details.)
Also need to know which ATCA station pad was used as the reference. Usually defaults to W104. This should be recorded in the ATCA experiment log at the beginning of the session and shouldn't change during the session - but check the wiki. (In future, should be extractable with MoniCA?) If not recorded, you'll have to use trial and error to find it (large rate offsets or delay jumps on source change usually indicate a wrong position). Check the ATCA local observing schedule as the station pads for the config are listed in there. If the particular pad is not in the standard $STADB, $CALCDB/atca_stations.tab contains reasonably accurate pad positions.
If Tid was used, which antenna(s)? It's best to re-run SCHED with the correct one. If Tid used more than one antenna, a bit of fiddling may be required (I usually use a second station code 'Td'). v434d from February 2011 has an example of this. (Alternative option, provided only one Tid antenna was used at a time, is to run SCHED for the fastest antenna and break jobs by time, setting the correct site position for each, and set the antenna name in the .v2d file so they get separate entries in the FITS antenna table.)
Log in to cuppa as corr and make a working area for the experiment: /home/corr/LBA/<year>/<session>/<expcode>
Keep notes of setup and any problems encountered, e.g. in a file called <expcode>_notes.txt.
Lost time or other problems should also be recorded here on the wiki. Each experiment should have a local wiki page for notes on correlation and analysis plots, which also links to the observers wiki (generated by lba_feedback.py). The aim is to make this a “one-stop” point for the PI to obtain all relevant information.
Get the SCHED keyin file <expcode>.key (and any necessary associated files) from the ATNF ftp area (linked from the observers wiki). We re-run SCHED locally to produce a vex file with up-to-date station positions. ATCA position then defaults to pad W104.
Generally we take source coordinates from the vex file. Occasionally the PI may supply updated coordinates for correlation - these can be added in the .v2d or the vex file.
Set the vex file name
If desired, default parameters can be overridden, e.g. maximum output file size [2 GB] and maximum length of job [7200s], by specifying maxSize and maxLength. maxGap may also be increased (e.g. it may be desirable to correlate an entire experiment in 1 job).
Correlation parameters are specified in the SETUP section(s) which are referenced by RULE section(s).
ANTENNA blocks use 2 letter station codes. Can specify name - occasionally needed e.g. if Tid used two antennas, then would generally use e.g. DSS34, DSS45. Two-letter station codes are preferred (then no need to specify the name in the ANTENNA block).
Include ATCA X,Y,Z corresponding to reference station, and include position for correct Tid antenna, if needed. If needed, e.g. to correct the ATCA reference station or the Tid antenna position, updatepos.py will replace the position in the vexfile with a position from $STADB. Updating positions in the VEX file is preferred (over including station positions in the .v2d file) as vex2difx will automatically take care of antenna motion corrections (assuming dx, dy, dz are available).
Note: Station positions are taken from the vex file unless they are included in the .v2d file.
We use filelist keyword in the ANTENNA block to point to the list of data files to correlate
EOP section + ANTENNA clock offsets can be inserted in the .v2d file. (These may also be appended to the vex file.) For EOPs, there is a script getEOP.py to get the latest EOPs for 5 days surrounding the specified MJD from http://gemini.gsfc.nasa.gov/solve_save/usno_finals.erp in the format required by vex2difx.
Note: the observation MJD can be found in the $EXPER comments section of the vex file. mjd2vex.py will also convert between vex dates and mjd (both directions).
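For reference, the conversion between MJD and vex-style dates can be sketched in a few lines of Python. This is not the code of mjd2vex.py; the function names are illustrative. It assumes the usual vex epoch format (e.g. 2011y045d12h00m00s) and the standard definition of MJD 0 as 1858 November 17, 00:00 UT.

```python
from datetime import datetime, timedelta

# MJD 0 corresponds to 1858-11-17 00:00:00 UT.
MJD_EPOCH = datetime(1858, 11, 17)

# vex epoch layout: year, day of year, hours, minutes, seconds.
VEX_FORMAT = "%Yy%jd%Hh%Mm%Ss"


def mjd_to_vex(mjd):
    """Convert an MJD (may be fractional) to a vex date string."""
    # Round to the nearest second to avoid floating-point jitter.
    t = MJD_EPOCH + timedelta(seconds=round(mjd * 86400))
    return t.strftime(VEX_FORMAT)


def vex_to_mjd(vex_date):
    """Convert a vex date string to a (fractional) MJD."""
    t = datetime.strptime(vex_date, VEX_FORMAT)
    return (t - MJD_EPOCH).total_seconds() / 86400.0


if __name__ == "__main__":
    print(mjd_to_vex(55606.5))
    print(vex_to_mjd("2011y045d12h00m00s"))
```

For example, MJD 55606 is day 45 of 2011 (14 February), so `mjd_to_vex(55606)` gives `2011y045d00h00m00s`.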
There is a script 'lbafilecheck.py' that will automatically generate the filelists, check that they have valid headers and sort the files in time order. It will also create a prototype machines and thread file for mpi, and a prototype run file to start correlation. It assumes that you want to use all the nodes listed in a file pointed to by the environment variable $CORR_HOSTS for the correlation job.
lbafilecheck.py <expname.datafiles>
The format of the file <expname.datafiles> is described here.
A default input file for lbafilecheck can be produced using disk_report.py (which summarises the contents of CUPPA's data areas), and disk_exper.py (which reads the output of disk_report.py):
disk_report.py > ~/disk.txt
disk_exper.py v252o ~/disk.txt
Mark5B recording
Mark5B recording currently usually requires editing of the $TRACKS section of the vex file, as documented here.
Missing headers
This is automatically fixed for you if you create the filelists using 'lbafilecheck.py' so no need to worry about this unless you are creating the file lists by hand. It used to be quite common for a few LBA files per experiment to have missing or corrupt header information. Old versions of DiFX don't handle this (DiFX-1.5.3 on CUPPA has been patched for this; later versions should also be o.k.); it will just hang (until killed) when it encounters such a file, hogging CPU cycles. Cormac has written a simple script to check headers for a given list of data files, in /home/corr/bin/chk_vlbi.pl. Run with, e.g. chk_vlbi.pl <station>.filelist on the appropriate node where the data live (if it is not in an NFS-mounted area).
Data format - usually LBAVSOP for LBA, but may be LBASTD (for 64 MHz or 4 MHz recording)
If a station has recorded with swapped polarizations, this can be specified in the ANTENNA section of the .v2d file (boolean parameter polSwap).
It should also be MUCH easier than it used to be to set up for spectral line correlation with the .v2d file
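To make the .v2d layout described above more concrete, here is a small Python sketch that writes a skeleton file using only the keywords mentioned on this page (the vex file name, maxLength, maxGap, SETUP, RULE, and ANTENNA blocks with filelist and polSwap). The helper `render_v2d`, the `vex = ...` spelling of the vex file setting, the empty SETUP block and the file names are assumptions for illustration; compare the result with a working .v2d from a previous experiment before using it.

```python
def render_v2d(vex_file, antennas, max_length=None, max_gap=None):
    """Return the text of a skeleton .v2d file.

    antennas maps a two-letter station code to a dict of ANTENNA-block
    parameters, e.g. {"filelist": "at.filelist", "polSwap": "true"}.
    """
    lines = ["vex = %s" % vex_file]
    # Optional overrides of the vex2difx defaults.
    if max_length is not None:
        lines.append("maxLength = %s" % max_length)
    if max_gap is not None:
        lines.append("maxGap = %s" % max_gap)
    # One SETUP holding the correlation parameters, referenced by one RULE.
    lines += ["", "SETUP default", "{",
              "  # correlation parameters go here", "}",
              "", "RULE default", "{", "  setup = default", "}"]
    # One ANTENNA block per station.
    for code in sorted(antennas):
        lines += ["", "ANTENNA %s" % code, "{"]
        for key, value in sorted(antennas[code].items()):
            lines.append("  %s = %s" % (key, value))
        lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical experiment: file names are placeholders.
    print(render_v2d(
        "v252o.vex",
        {"AT": {"filelist": "at.filelist"},
         "CD": {"filelist": "cd.filelist", "polSwap": "true"}},
        max_length=7200,
    ))
```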
The following may still be an issue, if the vex file does not reflect what was recorded!
For old schedules the vex file may need editing to reflect some stations not recording all channels. For experiments at 512 Mbps, typically Ceduna and Hobart record half of the channels. (Not recommended, but it is also possible to edit the correlator input file datastream and baseline tables. See the explanation of the correlator input file format.)
Also noted on the ATNF wiki is what to do in case of band inversion. This is an issue for 64 MHz recording modes. I find it easier to edit the $FREQ block in the vex file! The same fix is needed; just add 64 MHz to the relevant frequency channels and change sideband from U to L. Note that in DiFX-1.5.1, if some stations are inverted but others are not, then you must hack the .vex or .input to have both the U and L modes for running mpifxcorr, but for calcif2 and difx2fits you must again hack the .input (or .vex) so that it only refers to one sideband for all stations.
DiFX-1.5.3 has a local patch that makes the single-sideband hack for calcif2 and difx2fits unnecessary. This patch should appear in later versions also.
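The band-inversion fix amounts to simple arithmetic on each affected channel: an upper-sideband channel whose band edge is at frequency f becomes a lower-sideband channel with its edge at f + 64 MHz, which covers the same 64 MHz of sky. A minimal sketch, with a hypothetical helper name and an example frequency chosen only for illustration:

```python
def invert_channel(freq_mhz, sideband, bandwidth_mhz=64.0):
    """Apply the band-inversion fix to one $FREQ channel.

    Takes the band-edge frequency (MHz) and sideband ("U") of a channel
    as scheduled, and returns the (frequency, sideband) pair to write
    into the vex file instead.
    """
    if sideband != "U":
        raise ValueError("this sketch only handles U -> L inversion")
    # Same sky coverage, now described from the upper band edge downwards.
    return freq_mhz + bandwidth_mhz, "L"


if __name__ == "__main__":
    # Illustrative value only: 8409 MHz USB -> 8473 MHz LSB.
    print(invert_channel(8409.0, "U"))
```

Both descriptions span 8409 to 8473 MHz in this example; only the direction in which frequency runs within the channel changes.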
The simplest way to start a correlator job is to use the Espresso shortcut:
espresso.py <jobid_1> <jobid_2> ... <jobid_n>
It automatically carries out the steps described below, as well as doing a few bookkeeping things like ensuring the data are written to the standard CUPPA data areas, and backing up old jobs before removing them from the output directory.
It requires that the file lists, run and thread files have prototypes produced by lbafilecheck.py (or similar).
vex2difx <baseFilename>.v2d
CALC_SERVER should be set to the IP address of the virtual machine on CUPPA running calcserver (ask Cormac or Hayley in case of problems)
calcif2 <baseFilename>.calc
lbafilecheck.py will create “run”, “machines” and “threads” files but these may not be optimal for your needs, in which case:
Don't know which source is a good fringe-finder?
Check in the latest catalogue compiled by Leonid Petrov
Ceduna clock can be erratic but is monitored. An historical graph should also be available.
The simplest way is to run the LBA pipeline.
If you want to do it by hand:
Start AIPS with aips tv=local:0
Use tasks FITLD (ATLOD for RPFITS), POSSM, FRING, SNPLT. Add delays to ANTENNA sections in .v2d file. Calculate long-term clock rates as required. (If necessary, offsets for particular IFs can be inserted in the DATASTREAM table of the generated correlator input file.)
updateclock.py can be used to update the clocks in the .v2d file.
updateclock.py -s 'AT,PA' -o '2.2,-0.2' -r '0,2' -f 8425 test.v2d
would adjust the AT clock by 2.2 microsec in delay and 0 in rate, and the PA clock by -0.2 microsec in delay and 2 mHz in rate for a frequency of 8425 MHz.
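The reason a frequency is given alongside the rate is that a residual fringe rate (in Hz) only translates into a clock (delay) rate once divided by the observing frequency. The sketch below shows that standard conversion for the example above. It is not taken from updateclock.py: the function name is illustrative, and the sign convention the script applies is not covered here.

```python
def fringe_rate_to_delay_rate(rate_millihz, sky_freq_mhz):
    """Convert a residual fringe rate to a delay rate.

    rate_millihz : residual fringe rate in mHz (as measured by e.g. FRING)
    sky_freq_mhz : observing frequency in MHz
    Returns the equivalent delay rate in microseconds per second.
    """
    rate_hz = rate_millihz * 1e-3
    freq_hz = sky_freq_mhz * 1e6
    # Fringe rate / sky frequency is dimensionless (sec/sec);
    # scale by 1e6 to express it in microsec/sec.
    return rate_hz / freq_hz * 1e6


if __name__ == "__main__":
    # The PA example above: 2 mHz at 8425 MHz.
    print(fringe_rate_to_delay_rate(2.0, 8425.0))
```

For 2 mHz at 8425 MHz this comes to roughly 2.4e-7 microsec/sec.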