vex2difx

vex2difx is a program that takes a vex file (such as one produced by sched, with various tables based on observe-time data appended) and a configuration file (described below) and generates one or more .input files for use with difx. Each .input file is accompanied by a .calc file, which is used by calcif2 to generate the .delay and .uvw files needed at correlation time. Together with calcif2, vex2difx supersedes the functionality of vex2config and vex2model.

The vex2difx philosophy

Users and future developers of vex2difx should be aware of the approach used in designing vex2difx which can be summarized as follows:

  1. The output files should never need to be hand edited
  2. Simple experiments should not require complicated configuration
  3. All features implemented by mpifxcorr should be accessible
  4. All experiments expressible by vex should be supported
  5. The configuration file should be human and machine friendly
  6. Command line arguments should not influence the processing of the vex file

Note that not all of these ideals have been completely reached as of now. It is not the intention of the developer to guess all possible future needs of this program. Most new features will be easy to implement, so send feature requests to the difx-users mailing list.

The vex file

The VLBI scheduling programs sched and sked both produce vex files that are used to control antennas for observations. Certain information that is not available prior to an observation needs to be provided to vex2difx in some way. One way is to append this data to the vex file. The alternative is to provide it in the .v2d file (as shown further down). This information includes:

  1. The Earth orientation parameters ($EOP block in the vex file, or EOP blocks in the .v2d file)
  2. The antenna clock offsets ($CLOCK block in the vex file, or clock values in the ANTENNA blocks of the .v2d file)
  3. The volume serial numbers for the recording media ($TAPELOG_OBS block, or file lists in the ANTENNA blocks of the .v2d file)

Population of these three tables is necessarily a correlator/array specific operation and is the responsibility of the vex2difx user to arrange.

Note that only formal vex files are supported as input to vex2difx. The similar-looking ovex files used at some/all Mark4 correlators are not acceptable. However, with a small amount of work an ovex file can be converted by hand to a valid vex file, and it would not be hard to write a conversion script to do this automatically.

The configuration file

The configuration file consists of a number of global parameters that affect the way that jobs are created and several sections that can customize correlation on a per-source, per mode, or per scan basis. All parameters (those that are global and those that reside inside sections) are specified by a parameter name, the equal sign, and one value, or a comma-separated list of values, that cannot contain whitespace. Whitespace is not required except to keep parameter names, values, and section names separate. All parameter names and values are case sensitive except for source names and antenna names. The # is a comment character; any text after this on a line is ignored.
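As a rough illustration of the syntax just described, the Python sketch below reads global name = value pairs, strips # comments, and splits comma-separated lists. It is not the parser used by vex2difx: the function name parse_globals is hypothetical, and sections enclosed in braces are deliberately not handled.

```python
import re

# Hypothetical helper, not part of vex2difx: matches "name = value" where the
# value contains no whitespace. Whitespace around "=" is optional.
PAIR = re.compile(r"(\w+)\s*=\s*(\S+)")

def parse_globals(text):
    """Return {parameter name: list of values} for global parameters only."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # everything after '#' is a comment
        for name, value in PAIR.findall(line):
            values = [v for v in value.split(",") if v]
            params.setdefault(name, []).extend(values)
    return params

example = """
maxGap = 2000 # seconds
mjdStart = 52342.522 mjdStop=52342.532
antennas=BR,FD,HN,MK
"""
parsed = parse_globals(example)
```

Repeated assignments to the same name accumulate in one list, which mirrors the way multi-valued parameters are described for RULE sections.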

Parameter Types

DiFX1.5 vs DiFX2.0

DiFX2.0 adds support for features like multiple simultaneous phase centres. The corresponding functionality in vex2difx is only available in DiFX2.0, obviously. Any parameters which are only available in DiFX2.0 are italicised like this.

Global Parameters

Global parameters can be specified one or many per line such as:

maxGap = 2000 # seconds

or

mjdStart = 52342.522 mjdStop=52342.532 

Parameter name   | Type   | Units | Default    | Comments
vex              | string |       | REQUIRED   | filename of the vex file to process; this is the only required parameter
mjdStart         | date   |       | obs. start | discard any scans or partial scans before this time
mjdStop          | date   |       | obs. stop  | discard any scans or partial scans after this time
break            | date   |       |            | mjd times of forced manual job breaks
minSubarray      | int    |       | 2          | don't make jobs for subarrays with fewer than this many antennas
maxGap           | float  | sec   | 180        | split an observation into multiple jobs if there are correlation gaps longer than this number
tweakIntTime     | bool   |       | True       | adjust (up to 40%) integration time to ensure integer blocks per send
maxSize          | float  | MB    | 2000       | the maximum output FITS file size, estimated
singleScan       | bool   |       | False      | if True, split each scan into its own job
singleSetup      | bool   |       | True       | if True, allow only one setup per job; True is required for FITS-IDI conversion
maxLength        | float  | sec   | 7200       | don't allow individual jobs longer than this amount of time
minLength        | float  | sec   | 2          | don't allow individual jobs shorter than this amount of time
dataBufferFactor | int    |       | 32         | the mpifxcorr DATABUFFERFACTOR parameter; see mpifxcorr documentation
nDataSegments    | int    |       | 8          | the mpifxcorr NUMDATASEGMENTS parameter
jobSeries        | string |       | job        | the base filename of .input and .calc files to be created
startSeries      | int    |       | 1          | the default starting number for jobs created
sendLength       | float  | sec   | 0          | roughly the amount of data to send at a time from datastream processes to core processes
sendSize         | int    | bytes | 5000000    | roughly the send size from datastream to core
antennas         | string |       | all ants.  | a comma separated list of antennas to include in correlation
baselines        | string |       | all bls.   | a comma separated list of baselines; see below
padScans         | bool   |       | True       | insert non-correlation scans in recording gaps to prevent mpifxcorr from complaining
invalidMask      | int    |       | 0xFFFF     | this bit-field selects which flag conditions are considered when writing flag file: 1=Recording, 2=On source, 4=Job time range, 8=Antenna in job
visBufferLength  | int    |       | 32         | number of visibility buffers to allocate in mpifxcorr
simFXCORR        | bool   |       | False      | simulate the VLBA hardware correlator integration and start times
overSamp         | int    |       |            | force all baseband channels to use the provided overSampling
mode             | string |       | normal     | options: normal and profile; see section below
threadsFile      | string |       |            | overrides the name of the threads file to use

Note that the baselines parameter supports the following syntaxes: A1-A2, A1+A2+A3-A4+A5, A1-*, A1+A2-*, and so on. For each list member, all baselines consistent with an antenna match on both sides will be kept.
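The matching rule for the baselines parameter can be sketched in Python as follows. This is an illustration under stated assumptions rather than the vex2difx implementation: the function names are hypothetical, antenna names are assumed to contain neither "+" nor "-", and a baseline is assumed to match regardless of which antenna is written first.

```python
def parse_side(side):
    """Return None for the wildcard '*', else a set of upper-cased antenna names."""
    side = side.strip()
    if side == "*":
        return None
    return {name.strip().upper() for name in side.split("+")}

def side_matches(side, antenna):
    # None stands for '*', which matches every antenna.
    return side is None or antenna.upper() in side

def baseline_selected(spec, ant1, ant2):
    """True if baseline ant1-ant2 matches any member of the comma separated spec."""
    for member in spec.split(","):
        if "-" not in member:
            raise ValueError("baseline member needs two sides: " + member)
        left, right = (parse_side(s) for s in member.split("-", 1))
        forward = side_matches(left, ant1) and side_matches(right, ant2)
        reverse = side_matches(left, ant2) and side_matches(right, ant1)
        if forward or reverse:  # assumption: order of the two antennas is irrelevant
            return True
    return False
```

With this sketch, a specification such as BR+FD-* keeps every baseline that includes BR or FD, while BR+FD-HN+MK keeps only the four baselines that pair one antenna from each side.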

SOURCE sections

A source section can be used to change the properties of an individual source, such as its position or name. In the future this is where multiple correlation centers for a given source will be specified. A source section is enclosed in a pair of curly braces after the keyword SOURCE followed by the name of a source, e.g.:

SOURCE 3C273
{
  source parameters go here
}

or equivalently

SOURCE 3c273 { source parameters go here }

Parameter name   | Type   | Units | Default | Comments
ra               |        |       |         | J2000 right ascension, e.g., 12h34m12.6s or 12:34:12.6
dec              |        |       |         | J2000 declination, e.g., 34d12'23.1" or 34:12:23.1
name             | string |       |         | new name for source
calCode          | char   |       | ' '     | calibration code, typically A, B, C for calibrators, G for a gated pulsar, or blank for normal target
doPointingCentre | bool   |       | true    | whether the pointing centre should be correlated (only ever turned off for multi-phase centre)
addPhaseCentre   | string |       |         | contains info on a source to add, with ra, dec and optionally name/calcode with no spaces, "/" separation and "@" in place of "=", e.g. "addPhaseCentre = name@1010-1212/RA@10:10:21.1/Dec@-12:12:00.34"

ANTENNA sections

An antenna section allows properties of an individual antenna, such as position, name, or clock/LO offsets, to be adjusted.

Parameter name | Type    | Units | Default   | Comments
name           | string  |       |           | New name to assign to this antenna
polSwap        | bool    |       | False     | Swap the polarizations (i.e. L <-> R) for this antenna
clockOffset    | float   | us    | vex value | Overrides the clock offset value from the vex file; used in conjunction with clockEpoch
clockRate      | float   | us/s  | vex value | Overrides the clock offset rate value from the vex file; used in conjunction with clockEpoch
clockEpoch     | date    |       | vex value | Overrides the epoch of the clock rate value; must be present if clockRate or clockOffset parameter is set
deltaClock     | float   | us    | 0.0       | Adds to the clock offset (either the vex value or the clockOffset above)
deltaClockRate | float   | us/s  | 0.0       | Adds to the clock rate (either the vex value or the clockRate above)
X              | float   | m     | vex value | Change the X coordinate of the antenna location
Y              | float   | m     | vex value | Change the Y coordinate of the antenna location
Z              | float   | m     | vex value | Change the Z coordinate of the antenna location
format         | string  |       |           | Force format to be one of VLBA, MKIV, Mark5B or S2
file           | strings |       | (none)    | A comma separated list of files that will be copied verbatim to the DATA TABLE of the input file
filelist       | string  |       |           | A filename listing files for the DATA TABLE and optionally mjdStart and mjdStop for each
networkPort    | int     |       |           | The eVLBI network port to use. This forces NETWORK media type in .input
windowSize     | int     |       |           | TCP window size for eVLBI. Set to <0 for UDP
UDP_MTU        | int     |       |           | Same as setting windowSize to negative of value
vsn            | string  |       |           | Override the Mark5 Module to be used
addZoomFreq    | string  |       |           | Adds a zoom band with specified freq/bw as shown: freq@1810.0/bw@4.0[/specAvg@4][/noparent@false]
phaseCalInt    | int     | MHz   | 1         | Zero turns off phase cal extraction, positive value is the interval between tones to be extracted

The optional arguments for addZoomFreq control spectral averaging (currently constrained to be same as the parent band) and whether or not the parent band is still correlated - default is that it is *not* correlated. These are more for potential future compatibility.

Please note that vex uses a clock sign convention that is positive for a formatter with its clock running fast (i.e., the second tick happens too early). The clockOffset and clockRate in this ANTENNA section, as well as FITS files, have the opposite sign convention.
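A minimal sketch of the sign flip described above, assuming the vex values have already been converted to microseconds and microseconds per second. The function name is hypothetical and is not part of vex2difx.

```python
def vex_clock_to_v2d(vex_offset_us, vex_rate_us_per_s):
    """Convert a vex-convention clock offset and rate to the opposite sign
    convention used by clockOffset and clockRate in an ANTENNA section."""
    return -vex_offset_us, -vex_rate_us_per_s

# A formatter clock running 1.5 us fast in vex terms (positive in vex)
# becomes a negative clockOffset in the .v2d file.
clock_offset, clock_rate = vex_clock_to_v2d(1.5, 2.0e-7)
```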

SETUP sections

Setup sections are enclosed in braces after the word SETUP and a name given to this setup section. The setup name is referenced by a RULE section (see below). A setup with the special name default will be applied to any scans not otherwise assigned to setups by rule sections. If no setup sections are defined, a setup called default, with all default parameters, will be implicitly created and applied to all scans. The order of setup sections is immaterial.

Parameter name       | Type     | Units | Default    | Comments
tInt                 | float    | sec   | 2          | integration time
nChan                | int      |       | 16         | number of channels per spectral window; currently must be a power of 2
doPolar              | bool     |       | True       | correlate cross hands when possible?
blocksPerSend        | int      |       |            | the mpifxcorr BLOCKSPERSEND parameter; defaults to a value that depends on other parameters
subintNS             | int      | ns    | 160000000  | The mpifxcorr SUBINT NS; should eventually be set to a smarter default
guardNS              | int      | ns    | 2000       | The mpifxcorr GUARD NS; 2000 is almost always fine, should eventually be adjusted automatically
maxNSBetweenUVShifts | int      | ns    | 2000000000 | Used for multiple phase centres, if better time resolution than 1 thread's portion of a subint is required
maxNSBetweenACAvg    | int      | ns    | 2000000000 | Used for STA dumping (transient searches), if better time resolution than 1 thread's portion of a subint is required
specAvg              | int      |       | 8          | The spectral averaging to perform inside the correlator, at the end of a subint
fringeRotOrder       | int      |       | 1          | The fringe rotation order: 0=post-F, 1=linear, 2=quadratic
strideLength         | int      |       | 16         | The number of channels to "stride" for fringe rotation, fractional sample correction etc
xmacLength           | int      |       | 128        | The number of channels to "stride" for cross-multiply accumulations
numBufferedFFTs      | int      |       | 1          | The number of FFTs to do in a row for each datastream, before XMAC'ing them all
specAvg              | int      |       | 1          | how many channels to average together after correlation
startChan            | int      |       | 0          | first (unaveraged) channel to include in output
nOutChan             | int      |       | (nChan-startChan)/specAvg | The number of (averaged) channels to include in output
postFFringe          | bool     |       | False      | do fringe rotation after FFT?
binConfig            | string   |       | none       | if specified, apply this pulsar bin configuration file to this setup
freqId               | int list |       | none       | a comma separated list of integers that are freq table indexes to select which bands to correlate; default is to correlate all
phasedArray          | string   |       |            | If true, tells DiFX to produce a phased array output instead of cross correlations
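
The default listed for nOutChan in the table above is simple arithmetic, sketched here in Python. Integer division is an assumption made for illustration, and the function name is hypothetical.

```python
def default_n_out_chan(n_chan, start_chan=0, spec_avg=1):
    """Default number of averaged output channels: (nChan-startChan)/specAvg."""
    return (n_chan - start_chan) // spec_avg  # assumption: integer division

# 128 channels, none skipped, averaged by 8 -> 16 output channels
n_out = default_n_out_chan(128, start_chan=0, spec_avg=8)
```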

EOP sections

It is possible to specify the Earth Orientation Parameters (EOPs) through the .v2d file. Normally these values will be appended to the vex file, but there may be cases where a completely unmodified vex file is desired (eVLBI maybe?). Like ANTENNA and SOURCE sections, each EOP section has a name. The name must be in a form that can be converted directly to a date (see above for legal date formats). Conventional use suggests that these dates should correspond to 0 hours UT; deviation from this practice is at the user's risk. It is not advised to mix EOP values stored in the vex and .v2d files.

Parameter name | Type  | Units | Default | Comments
tai_utc        | float | sec   |         | TAI minus UTC; the leap-second count
ut1_utc        | float | sec   |         | UT1 minus UTC; Earth rotation phase
xPole          | float | asec  |         | X component of spin axis offset
yPole          | float | asec  |         | Y component of spin axis offset

Example section

EOP 55005 { tai_utc=34 ut1_utc=0.236958 xPole=0.10597 yPole=0.53906 }

RULE sections

A rule section is used to assign a setup to a particular source name, calibration code (currently not supported), scan name, or vex mode. The order of rule sections does matter as the order determines the priority of the rules. The first rule that matches a scan is applied to that scan. The correlator setup used for scans that match a rule is determined by the parameter called setup. A special setup name SKIP causes matching scans not to be correlated. Any parameters not specified are interpreted as fully inclusive. Note that multiple rule sections can reference the same setup section. Multiple values may be applied to any of the parameters except for setup. This is accomplished by comma separation of the values in a single assignment or with repeated assignments. Thus

RULE rule1
{
  source = 3C84,3C273
  setup = BrightSourceSetup
}

is equivalent to

RULE rule2
{
  source = 3C84 3C273
  setup = BrightSourceSetup
}

is equivalent to

RULE rule3
{
  source = 3C84
  source = 3C273
  setup = BrightSourceSetup
}

The names given to rules (e.g., rule1, rule2 and rule3 above) are not used anywhere but are required to be unique.

Parameter name | Type   | Units | Default | Comments
scan           | string |       |         | one or more scan names, as specified in the vex file, to select with this rule
source         | string |       |         | one or more source names, as specified in the vex file, to select with this rule
calCode        | char   |       |         | one or more calibration codes to select with this rule
mode           | string |       |         | one or more modes, as defined in the vex file, to select with this rule
setup          | string |       |         | The name of the SETUP section to use, or SKIP if this rule describes scans not to correlate

Note that source names and calibration codes reassigned by source sections are not used. Only the names and calibration codes in the vex file are compared.
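
The first-match-wins behaviour of rules can be sketched in Python as below. This is an illustration only, not the vex2difx code: the dictionary layout and function names are hypothetical, calCode is left out because the text above notes it is currently not supported, and scans are represented by plain dictionaries.

```python
def rule_matches(rule, scan):
    """A rule matches when every criterion it specifies includes the scan."""
    for key in ("scan", "source", "mode"):
        allowed = rule.get(key)
        if not allowed:
            continue  # unspecified parameters are fully inclusive
        value = scan[key]
        if key == "source":
            # source names are compared without regard to case
            if value.upper() not in {a.upper() for a in allowed}:
                return False
        elif value not in allowed:
            return False
    return True

def setup_for_scan(rules, scan):
    """Return the setup name for a scan, or None if the scan is skipped."""
    for rule in rules:  # order matters: the first matching rule wins
        if rule_matches(rule, scan):
            return None if rule["setup"] == "SKIP" else rule["setup"]
    return "default"  # scans not claimed by any rule use the default setup

rules = [
    {"source": ["J1234+1231", "3C84", "3C273"], "setup": "calibrator"},
    {"setup": "target"},  # no restrictions, so it catches everything else
]
chosen = setup_for_scan(rules, {"scan": "No0001", "source": "3c84", "mode": "m1"})
```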

Modes

Currently vex2difx operates in one of two modes, normal or profile, selected with the global mode parameter.

Command line arguments

vex2difx is executed on the command line with:

vex2difx [options] inputFile

Although no command line options can change the way that vex2difx processes a file, there are some options that the user may find useful:

Reporting problems

If you have a problem with vex2difx, please email the difx users email group. Be sure to include the following in the email:

  1. A description of the problem
  2. The v2d file supplied to vex2difx
  3. The vex file pointed to from the v2d file
  4. The captured output when running vex2difx with extra verbosity (use options -v -v)

Examples

Trivial case

The following example demonstrates the simplest case where all defaults are assumed

vex=trivial.vex

Simple case

The following is a more realistic case for a simple experiment

vex=simple.vex

SETUP default
{
  nChan=64
  tInt =3.0 
}

Source coordinate change

This shows how to change the coordinates of two sources in a file

vex=coords.vex

SOURCE J1232+131 { ra=12h32m15.12s dec=13d07'12.5" }
SOURCE PLANETX   { ra=11h59m59.999s dec=-12d59'59.88" }

SETUP default
{
  nChan=128
}

Two setups

This is a more complicated file showing how to apply different correlator setups to different sources

vex=twosetups.vex
maxGap=1000  # don't split the jobs at every source change,
             # instead, make just 2 interleaved jobs
antennas=BR,FD,HN,MK  # select only these four antennas for now  

SETUP target
{
  nChan=1024
  tInt =1.2
}

SETUP calibrator
{
  nChan=32
  tInt =4
}

RULE calRule
{
  source=J1234+1231,3C84,3C273
  setup =calibrator
}

RULE targetRule
{
  # note: not specifying any restrictions so all sources that don't 
  # match above will match here
  setup = target   
}

The above could have used a default setup rather than a catch-all rule and resulted in the same output.

Specifying media

vex2difx allows .input file generation for two types of media. A single .input file can have different media types for different stations. Ensuring specification of media is important as antennas with no media will be dropped from correlation.

The default media choice is Mark5 modules. The TAPELOG_OBS table in the input vex file should list the time ranges valid for each module. Jobs will be split at Mark5 module boundaries; that is, a single job can only support a single Mark5 unit per station. All stations using Mark5 modules will have DATA SOURCE set to MODULE in .input files.

If file-based correlation is to be performed, the TAPELOG_OBS table is not needed and the burden of specifying media is moved to the .v2d file. The files to correlate are specified separately for each antenna in an ANTENNA block. Note when specifying filenames, it is up to the user to ensure that full and proper paths to each file are provided and that the computer running the datastream for each antenna can "see" that file.

Two keywords are used to specify data files. They are not mutually exclusive but it is not recommended to use both for the same antenna. The first is "file". The value assigned to "file" is one or more (comma separated) files. It is OK to have multiple "file" keywords per antenna; all files supplied will be stored in the same order internally. The second keyword is "filelist" which takes a single argument, which is a file containing the list of files to read. This "filelist" file only needs to be visible to vex2difx. This file contains a list of filenames and optionally start and stop dates (in one of the formats listed above). Comments can be started with a # and are ended by the end-of-line character. Like for the "file" keyword, the filenames listed must be in time order, even if start and stop dates are supplied. An example "filelist" file is below:

# This is a comment.  File list for MK for project BX123
/data/mk/bx123.001.m5a  54322.452112 54322.511304
/data/mk/bx123.002.m5a  54322.512012 54322.514121 # a short scan
/data/mk/bx123.003.m5a  54322.766323 54322.812311 

If times for a file are supplied, the file will be included in the .input file DATA TABLE only if the file time range overlaps with the .input file time range. If not supplied, the file will be included regardless of the .input file time range, which could incur a large performance problem.
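
The overlap test described above can be sketched in Python as follows. This is not the vex2difx implementation: the function names are hypothetical, and dates are assumed to be given as decimal MJD only, although other date formats are accepted in practice.

```python
def parse_filelist(text):
    """Return a list of (filename, start_mjd, stop_mjd); times may be None."""
    entries = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # strip comments, then tokenize
        if not fields:
            continue
        timed = len(fields) >= 3
        start = float(fields[1]) if timed else None
        stop = float(fields[2]) if timed else None
        entries.append((fields[0], start, stop))
    return entries

def files_for_job(entries, job_start, job_stop):
    """Keep untimed files always, timed files only if they overlap the job."""
    keep = []
    for name, start, stop in entries:
        if start is None or (start < job_stop and stop > job_start):
            keep.append(name)
    return keep

filelist = """
# This is a comment.  File list for MK for project BX123
/data/mk/bx123.001.m5a  54322.452112 54322.511304
/data/mk/bx123.002.m5a  54322.512012 54322.514121 # a short scan
/data/mk/bx123.003.m5a  54322.766323 54322.812311
"""
selected = files_for_job(parse_filelist(filelist), 54322.50, 54322.52)
```

For a job spanning MJD 54322.50 to 54322.52, only the first two files overlap the job and would be listed in the DATA TABLE.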

A few sample ANTENNA blocks are shown below:

ANTENNA MK 
{
  filelist=bx123.filelist.mk
}
ANTENNA OV { file=/data/ov/bx123.001.m5a, 
                  /data/ov/bx123.002.m5a,
                  /data/ov/bx123.003.m5a }
ANTENNA PT { file=/data/pt/bx123.003.m5a } # recording started late here
ANTENNA default { networkPort = 320 }  # all antennas without ANTENNA setups will get this

Splitting of jobs

Certain events cause a forced job break that could, in some cases, end up requiring many individual software correlations to complete processing of a project. Effort has been made to minimize the number of these cases. The following situations will cause a job break: change in clock model at a station, change of a Mark5 module, change in number of channels or sub-bands, multiple simultaneous subarrays, and leap seconds. Future versions of vex2difx and DiFX may relax some of these circumstances. Some parameters have defaults that may cause more job splitting than is desired (such as maxLength) and can be tuned.
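
As an illustration of how a tunable parameter affects splitting, the Python sketch below groups scans into jobs using only the maxGap criterion from the global parameters. The forced breaks listed above (clock model changes, module changes and so on) and maxLength are not modelled, and the function name and data layout are hypothetical.

```python
SEC_PER_DAY = 86400.0

def split_on_gaps(scans, max_gap_sec=180.0):
    """Group (start_mjd, stop_mjd) scans into jobs, starting a new job
    whenever the gap since the previous scan exceeds max_gap_sec."""
    jobs = []
    for start, stop in sorted(scans):
        if jobs:
            previous_stop = jobs[-1][-1][1]
            gap_sec = (start - previous_stop) * SEC_PER_DAY
            if gap_sec <= max_gap_sec:
                jobs[-1].append((start, stop))
                continue
        jobs.append([(start, stop)])
    return jobs

scans = [(55000.000, 55000.010), (55000.011, 55000.020), (55000.030, 55000.040)]
jobs_default = split_on_gaps(scans)             # gaps of about 86 s and 864 s
jobs_relaxed = split_on_gaps(scans, 1000.0)     # larger maxGap, fewer jobs
```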

Mark5B issues

The Mark5B format, including its 2048 Mbps extension, is now supported by vex2difx. The "track assignments" for Mark5B format have never been formally documented. vex2difx has adopted the track assignment convention used by Haystack. Formally speaking, Mark5B has no "tracks". Instead it stores up to 32 bitstreams in 32 bit words. The concept of "fanout" is no longer used with Mark5B. Instead, the equivalent operation of spreading one bitstream among 1 or more bits in each 32 bit word is performed automatically. Thus to specify a Mark5B mode, only three numbers are needed: total data bit rate (excluding frame headers), number of channels, and number of bits per sample (1 or 2). The number of bitstreams is the product of channels and bits.

The $TRACKS section of the vex file is used to convey the bitstream assignments. The sign and magnitude bits for each channel are specified individually with fanout_def statements. In unfortunate correspondence with existing practice, 2 is the first numbered bitstream and 33 is the highest. In 2-bit mode, all sign bits must be assigned to even numbered bitstreams and the corresponding magnitude bit must be assigned to the next highest bitstream. To indicate that the data is in Mark5B format, either a statement of the form

track_frame_format = MARK5B;

must be present in the appropriate $TRACKS section or

format = MARK5B

must be present in each appropriate ANTENNA section of the .v2d file. As a concrete example, a complete $TRACKS section may resemble:

$TRACKS;
def Mk34112-XX01_full;
  fanout_def = A : &Ch01 : sign : 1 : 02;
  fanout_def = A : &Ch01 : mag  : 1 : 03;
  fanout_def = A : &Ch02 : sign : 1 : 04;
  fanout_def = A : &Ch02 : mag  : 1 : 05;
  fanout_def = A : &Ch03 : sign : 1 : 06;
  fanout_def = A : &Ch03 : mag  : 1 : 07;
  fanout_def = A : &Ch04 : sign : 1 : 08;
  fanout_def = A : &Ch04 : mag  : 1 : 09;
  fanout_def = A : &Ch05 : sign : 1 : 10;
  fanout_def = A : &Ch05 : mag  : 1 : 11;
  fanout_def = A : &Ch06 : sign : 1 : 12;
  fanout_def = A : &Ch06 : mag  : 1 : 13;
  fanout_def = A : &Ch07 : sign : 1 : 14;
  fanout_def = A : &Ch07 : mag  : 1 : 15;
  fanout_def = A : &Ch08 : sign : 1 : 16;
  fanout_def = A : &Ch08 : mag  : 1 : 17;
  fanout_def = A : &Ch09 : sign : 1 : 18;
  fanout_def = A : &Ch09 : mag  : 1 : 19;
  fanout_def = A : &Ch10 : sign : 1 : 20;
  fanout_def = A : &Ch10 : mag  : 1 : 21;
  fanout_def = A : &Ch11 : sign : 1 : 22;
  fanout_def = A : &Ch11 : mag  : 1 : 23;
  fanout_def = A : &Ch12 : sign : 1 : 24;
  fanout_def = A : &Ch12 : mag  : 1 : 25;
  fanout_def = A : &Ch13 : sign : 1 : 26;
  fanout_def = A : &Ch13 : mag  : 1 : 27;
  fanout_def = A : &Ch14 : sign : 1 : 28;
  fanout_def = A : &Ch14 : mag  : 1 : 29;
  fanout_def = A : &Ch15 : sign : 1 : 30;
  fanout_def = A : &Ch15 : mag  : 1 : 31;
  fanout_def = A : &Ch16 : sign : 1 : 32;
  fanout_def = A : &Ch16 : mag  : 1 : 33;
  track_frame_format = MARK5B;
enddef;
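
The fanout_def lines in a section like the one above follow a regular pattern, so they can be generated. The Python sketch below is an illustration, not a tool shipped with vex2difx: the function name is hypothetical, and the 1-bit layout (sign bits on consecutive bitstreams starting at 2) is an assumption, since only the 2-bit rule is stated above.

```python
def mark5b_fanout_defs(n_chan, bits=2):
    """Return fanout_def lines for n_chan channels of 1 or 2 bits each."""
    if bits not in (1, 2):
        raise ValueError("bits per sample must be 1 or 2")
    if n_chan * bits > 32:
        raise ValueError("Mark5B holds at most 32 bitstreams")
    lines = []
    stream = 2  # bitstreams are numbered 2..33
    for chan in range(1, n_chan + 1):
        for kind in ("sign", "mag")[:bits]:
            # In 2-bit mode sign lands on an even bitstream, mag on the next one.
            lines.append(
                f"  fanout_def = A : &Ch{chan:02d} : {kind:<4} : 1 : {stream:02d};"
            )
            stream += 1
    return lines

tracks = mark5b_fanout_defs(16, bits=2)
```

Called with 16 channels and 2 bits, this reproduces the 32 fanout_def lines of the example section, from bitstream 02 to bitstream 33.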

About the source code

vex2difx is written in C++ and makes heavy use of the standard template library. This makes applying standard algorithms (sorting, traversing, associating) to container members simple and less error-prone. An object-oriented approach is used. The base class for many of the classes is VexInterval, which is simply a pair of modified Julian days specifying a time interval. From this many other classes, such as scan, job, experiment, flag and others, are derived. These classes are defined and implemented in vextables.h and vextables.cpp. This makes simple operations on VexInterval objects (such as sorting and determining overlap) apply automatically to the higher level objects.

The vex parsing library, part of the Field System, was borrowed from Goddard Space Flight Center. This code is duplicated with little change within the vex/ subdirectory of the vex2difx source tree. Source file vexload.cpp contains the code that calls the vex parser routines to populate the VexData structure, which is then used as the model from which to make jobs. vex2difx uses the difxio library for writing DiFX .input and .calc files. Currently the .flag files are written natively within vex2difx; however, this may change.
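
The interval-as-base-class idea described above can be illustrated with a short Python sketch. This is not the C++ code from vextables.h: the class and attribute names are hypothetical and only sorting and overlap are shown.

```python
from dataclasses import dataclass

@dataclass(order=True)
class Interval:
    """A time interval given by two modified Julian days."""
    mjd_start: float
    mjd_stop: float

    def overlaps(self, other):
        # Two intervals overlap if each starts before the other stops.
        return self.mjd_start < other.mjd_stop and other.mjd_start < self.mjd_stop

@dataclass(order=True)
class Scan(Interval):
    """A derived class inherits ordering and overlap tests for free."""
    name: str = ""

# Sorting compares mjd_start first, so scans come out in time order.
scans = sorted([Scan(2.0, 3.0, "b"), Scan(1.0, 2.5, "a")])
first_overlaps_second = scans[0].overlaps(scans[1])
```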

To aid diagnosis of an experiment and forming jobs, vex2difx keeps an internal list of events. An event could be the experiment starting or stopping, recording at a station starting or stopping, a leap second, an antenna joining or leaving a scan, and others. Event types are enumerated in the vextables.h source file.

Splitting of an experiment into one or more jobs is one of the main functions of vex2difx. The first step in this process is to divide the experiment into JobGroups. A JobGroup is a collection of scans that can be combined into one FITS file. Examples of cases where a JobGroup boundary must be made include a change in the number of spectral channels or polarizations. The JobGroup boundaries happen at exact times, dictated entirely by the scheduled scans. The second layer of splitting considers media changes. Often there is a gap between the end of recording on one Mark5 module and the beginning of recording on the next. vex2difx aims to be smart about choosing when to split jobs to minimize the total number of jobs created.

vex2difx TODO list

List of remaining issues

BUGS

Feature Requests