DSP Programming Requirements

28 April 2000.

by Attila Kovács
attila@socrates.caltech.edu

Priorities and Goals

  1. First, we will need to establish stable basic communications between the DSP board and the control computer (the host VME computer as a first step). This will allow starting/stopping integrations, informing the DSP board of the input data format (such as number of channels, sampling frequency etc.), and setting parameters to be used in the DSP control and analysis routines. It is important that the host computer should always be able to talk to the DSP board (at least as far as to interrupt the integration process).

  2. Next is to create a functional system that can be used to obtain and store data for a number of input channels. This requires:

    1. The ability to read and sort the serial input stream into a data format that is subsequently used by the DSP routines.

    2. Timing Routines. (interrupt driven, or preferably multi-threaded) so that DSP board knows when data is arriving, and where each piece of data belongs. A few examples: Time-Stamp, Start/Interrupt control, using sync signals on some input for controlling processes, etc.

    3. Simple Analysis (mean and sigma for each pixel, calculated for N input samples, where N is to be communicated to the DSP board)

    4. Writing data to disk (either directly at the full sampling rate, or writing the results of the analysis at a lower frequency).

    5. (Optional) Sending data to Display computer.

  3. Implementing advanced functionality and Observing Modes.

  4. Improving Real-Time Display functionality. (by doing some higher level analysis on the data throughout the integration).



General Comments

Comments on Programing Style



Data Storage And Global Variables

Storing The Data in Memory

Arrays should be stored in the standard C (row major) format. This is probably required by most of the ready-written code. The bolometer data will be naturally 2D (frames from here on), that is it will contain information from physically separate rows, with a number of pixels (32) in each row. A 3rd dimension arises from recording several frames in time. A boolean or integer array of the same size should contain flags for marking possibly bad pixels or data points. In C boolean is not a real type, therefore it has to be implemented so that no more than a bit per data point is used for that purpose. However, integer flagging is probably preferred anyway. The advantage of an integer array vs. the boolean array for flagging purposes may be that it allows further qualification of the data problem in question.

There should be further two arrays to contain the mean and sigma values for each of the pixels. These will serve as output arrays (i.e., written to disk)

Recommended Data Types

Some data types (in C syntax) for consideration

typedef struct {
     int *size;
     double *entry;
     int *ientry;
     long flags;
     unsigned long linearsize, linearposition;
} array;

The Advantage of using a structure like the above is

      1. it works for arrays of all dimensions (generic enough).

      2. when an array passed to a function one need not pass extra parameters defining the extent of the array.

      3. that one does not have to keep track of the array sizes separately.

'entry' is a pointer to a regular C array in row-major format (Allowing the use of ready-written routines that use that format as input [and output]). 'size[0]' could contain the number of dimensions, and 'size[1]' through 'size[n]' are the sizes in the respective dimensions. 'ientry' can be allocated either for use as an integer array (instead of the double array), or as a qualifier next to the double array, which can be used for the data flagging purposes (as described before), either as boolean or as integer flags (depending on allocation). The flags parameter's bits can be used to describe properties of the array or the quality of the data contained within the array (e.g., one bit can be set to indicate that the double array had been allocated, another for the integer array, another one to indicate if the array contains complex data [common for Fourier Transforms], one to indicate whether the int array is boolean, etc.). 'linearsize' is the array size (as if 1-D), while 'linearposition' is essentially a cursor to the last element accessed (while it may not be immediately obvious, using these last two redundant looking variables speeds up all operations on that concern the entire array, since one may do a simple for loop where 'linearposition' goes from 0 to 'linearsize', and avoid the repeated conversion of coordinate bases to access elements).

PS. I'm currently working on a set of tools that use such arrays, and make a lot of manipulation very simple. It is written in standard C, and thus should work even on the DSP computer. I suggest to use it, unless the DSP codes supplied already contain something that is convenient. Please contact attila@socrates.caltech.edu with respect to this point.

Function Parameters and Global Variables

We should aim to provide a flexible DSP system which can be easily controlled and customized by commands from the host computer. This means that the parameters, with which the DSP functions get called during the integration cycle, need to be adjustable. This requires that these parameters are accessible to both the command loop (which receives and interprets commands from the host computer) and the integration loop in the DSP (so that functions during integration are called with the desired parameters). Possible solutions are either passing these parameters as arguments or defining them as global variables. The inconvenience of passing arguments is that the number of parameters will be very large. However, that may be overcome by putting function parameters into independent structures and combining all within a single superstructure...

E.g., Suppose there are two user-end functions, Despike and Demodulate, each of which has a set of parameters that can be adjusted by the user. Then one can create (and typedef for convenience) structures that contain their respective list of adjustable parameters. This way their function calls will also simplify, [e.g., Despike(array *data, Despike_Parms *parameter) ] . Then a single superstructure (DSP_Parms) can carry a collection of the individual parameter structure. The Command and control routines then need only to pass the superstructure around.

typedef struct {
     int parm1, parm2 ...;
     double parm3, parm4 ...;
     ...
} Despike_Parms;

typedef struct {
     int parm1 ...;
     double parm2 ...;
     ...
} Demodulate_Parms;

...

struct {
   Demodulate_Parms demod;
   Despike_Parms despike;
   ...
} DSP_Parms;  

The above method will keep a large number of parameters well organized and easy to keep track of. This (or some other sort) of organising the variable is a good idea even if global variables are preferred to passing extra arguments to functions.

Note that only functions that are transparent to the user desire such structured parameters for set-up considerations. Any other subroutines that may be called by these functions can be implemented as wished, or as seen convenient. [E.g., if Despike calls on a possible subroutine, say, Flag(), Flag(double *datapoint, int errortype) is just as acceptable as Flag(Flag_Parms *parameter), when Flag() is not a user-end function but only something used internally by other DSP routines. That is, one does not necessarily have to define a Flag_Parms structure for this purpose).

Of course, the above should be viewed only as a suggestion, not as a strict requirement. If any other means of implementing soft-adjustable function parameters is deemed more convenient, they should, by all means, be preferred.

One also needs to be able to switch on/off each of the analysis routines (to be described below). This can be trivially implemented by an assigned int type variable for each of the routines, which can be bundled with the rest of the functions parameters (if desired), and which controls whether the corresponding function is to be executed in the loop.

E.g.,

int isDespike=DEFAULT_DESPIKE;
...

then in the integration loop:

{
     ...
     if(isDespike) Despike(data, DSP_Parms->despike);
     ...
}

Integration Parameters

Yet another important set of parameters to describe the form of incoming data, and the chunks in which it is desired to be processed. Possibly they could/should be bundled also within a single structure (with or without sub-structures). These variables are designed to describe the sampling rate, the AC bias divide number, how large blocks of data are fed to the analysis routines at a time, what kind of integration (whether OTF, chop-slewy or stare-switch, etc.) is used, and other parameters the integration loop itself need to be aware of.



Core Routines

Communication with Control Computer

General Comments

Communication should probably be performed by sending and receiving commands to and from the DSP board. Most commands would be used to set up the DSP functions, and thus can be communicated while the DSP is idle, or is in a 'listening' mode. However, one should have the ability to interrupt the DSP integration routines at any point. On the reverse side the DSP should be able to communicate continually to the host computer (for storage and display functions) while the integration routines are running.

The communication that needs to be implemented between the DSP and the host Solaris computer will most likely involve:

        1. Host to DSP

          1. Sending Commands (for setting up integrations and analysis on DSP).

          2. Sending Start/Interrupt signals (for starting/interrupting integrations).

        2. DSP to Host

          1. Sending Data to Host (Writing to Disk and/or shared memory).

          2. Send status report to host upon demand. (Not an immediate priority).

Setting Integration/Analysis parameters

Possibly a single function, such as DSPCommand(char *command), should be used to communicate set-up commands to the DSP board. These commands would be only sent when the DSP is idle (not integrating), in a 'listening' mode. It would then be within the DSP to interpret the given command and set up variables accordingly, and then send back a signal to the host indicating that the command was successfully performed or if any errors occurred. This would allow easy scripting support for the DSP.

Starting/Interrupting an Integration

While starting the integration could be performed with the same generic method as described above, interruption will require that the host computer can send an interrupt command at any time to the DSP board. That is, while the DSP code will have nothing to do but listening to commands during the set-up stage, it will have to 'listen' for interrupt signals at all times. This should be possible to achieve either by setting up a proper software interrupt routine, or by writing a multi-threaded action listener, or by briefly checking for interrupt command at some point within the integration cycle. The last could be the most convenient as it assures that the integration is stopped at the end of a processed chunk (or before the next chunk would be acquired). If there is a way of accessing either the DSP memory from the host, or the host memory from the DSP, then the DSP could simply check within the integration cycle to see if an interrupt variable had been set or not by the host, and then decide whether to continue integrating or go into 'listening' mode. It should be expected that the example codes, or libraries supplied with the DSP board will already contain an implementation that can be customized for our purposes.

Writing Data to Disk (and/or shared Memory)

If memory may be shared between the DSP and its host Solaris system, it may be convenient to write the output in such a memory space, either for the display function, or even as an intermediate step in writing to disk (if the DSP cannot write to disk directly). It is to be seen what method of writing/displaying the DSP data is the easiest and most convenient to implement.

        1. Directly (write data at sampling rate).

        2. Reduced Data (Mean and Sigma).

        3. Display Data (Generated by Display Reduction routines).

DSP Status Report

In the final system it should be possible to query the DSP for activity (whether listening, or integrating, perhaps even report what function it is executing at the moment of the query) and for certain variable values (such as loop variables, and set-up variables). That function would most certainly require an interrupt or an action-listener be set-up. However, it is not a priority at this point, unless it is useful and thus desired for debugging purposes.



Data Acquisition

Block Data-Flow Diagram




Control Signals

The following signals have to be generated and/or used by the DSP (and appear on one of the digital I/O lines) for instrument control purposes. (More detailed description of the use of these signals will follow in the next paragraph). The frequencies quoted are not final, thus if there is any reason to use other values (e.g., factors of 2) that are reasonably close and more easily implemented, it is OK. In principle none of these signals should be hard parameters. That is one should have the possibility of painlessly altering them if desired.

It is possible that all of these signals will be provided externally (perhaps by a programmable IO card also in the VME chassis), in which case the DSP will only have to read them as inputs, rather than generate them.

Data Arrival/Sorting

The typical data array in the DSP (for further processing) will be 3-D data cubes, and its 2-D amd 1-D components. The lowest ranking dimension is the row (currently 32 pixels), a number (6-12) rows will make up the frame, and a number of frames taken at different times will provide the 3rd dimesion of the time-cube.

Data acquisition is controlled by a clock signal (1kHz planned but may vary), which will be readily available on the DSP (or is generated by the DSP). The clock transition will signal when a frame of data is arriving on the serial stream. A frame will consist of integer numbers (16-bit ADC output) for each of the channels (all rows are read out in series). The DSP will have to be told how many channels of data to expect (currently it may vary between 1 and 256, in the future it may be 256+). Not all of the input channels may contain data to be used, thus channels also need to be flagged. These flags should be set/reset by commands from the host. Channels that are flagged 'unconnected' may just be ignored completely (not written into DSP memory for processing). The data written to memory will be a double which is the incoming integer value multiplied by the fixed conversion constant DAC2VOLT, converting DAC units into physically meaningful Volts.
variables: channels, channelflag, DAC2VOLT (names are just a guide, don't need to stick with them)

When AC bias is used the data naturally groups with the bias cycle. The bias signal is generated from the sampling clock by integer division (assuring synchronicity). Frames should be counted from each transition of the bias signal.
variables: biasdivide, framecounter

The processing will take place on a user-defined number of frames. That is whenever a chunk of data with that number of frames has been collected, the DSP will send that chunk into the processing thread, meanwhile continuing to obtain the next chunk (in a different array) until the integration is finished (that is a user-defined number of chunks have arrived or if an interrupt signal was received). Thus, in principle, it is enough to allocate memory for only two chunks that can be used alternately to store data while the other is being processed. The natural chunksize is the number of frames between bias transitions (biasdivide). However, one may want to use only a fraction of that at a time, or a number of bias_cycles together.
It may also be desired to process frames that occur at high and low bias levels separately. This then would require 4 buffers to be set up (hi and low, both for data collecting, and processing).
variables: chunksize, Nchunks

Time-Stamp

At the beginning of every integration the first frame of data needs to be time-stamped so that later the data may be combined with the antenna computer readings. (Later frames can be time-stamped optionally, since their arrival should be periodical with a given clock-rate.) The most important is that the DSP/Host clock needs to be in sync with the Antenna Computer clock. The combined error (from misalignment of clocks and algorithmic delays) should under no circumstances ever exceed 10msec (so that the pointing error at 10"/sec scanning rate is no more than 0.1")



Data Manipulation

General Comments

Since it is expected that a lot of simple operations, such as allocating and accessing multi-dimensional arrays, selecting data columns, multiplying with constants or functions etc., will be performed often, it is therefore practical to write them in reusable subroutines. I do have a rough prototype of such routines, and so might the supplied DSP library. See what is practical to use.

Flagging

Just a reminder, that we have to have the ability to mark data points as bad. The important thing to remember is that all analysis routines (described below) are designed for only the remaining 'good' points. I.e., the routines have to be able to discard all flagged data points.

Subsets of Data

It should important to handle subsets of the data with ease and with speed. Such subsets could be: the bolometer array data for a selected amount of time, a single pixel data in time (1D), a single row of data at a give instant (1D) or in time (2D), selected pixels in time (2D or 3D) etc. (I do have some untested prototypes for such routines)



Analysis

Calculating Mean and Sigma of for n samples

This is the most important routine, that is fundamental to much of the processing that follows. In the analysis mean and sigma have to be recalculated potentially a large number of times, for which reason it is crucial that it is as fast as possible. The mean and variance of n samples are defined as:

Thus, the most efficient way is to calculate and within a single loop, after which the proper normalization can be performed. That, however leads to round-off errors, for which reason it is recommended that mean and point variance are calculated in separated cycles (1st definition of point variance).

Mean and Sigma could be needed in all directions (samples could be pixel values in time, pixels in a row, or pixels across the entire array). Care should be taken that these values are only calculated across data points that aren't flagged.

Note: The uncertainty of the mean calculated thus is always correct, even if the incoming data does not contain noise on the full range of frequencies, or if that noise is not uniform in frequency space (filtered data).

Demodulation of Data with Wave Form

Optionally the data should be demodulated with a given wave form f(t). This is a simple multiplication of the pixel value at time t by a wave form at a shifted time. In principle one should be able to define separate waveforms for each pixel I.e.,

on the set of pixels (i). are individual pixel phases that need to be adjustable from the host computer. The waveforms need to be provided from the host computer.

It is possibly one could use an alternate form of demodulation, which discards any shifts of the signals by constant values. In that case the appropriate demodulation form is:

Gain Correction

To accommodate for the uneven sensitivity of pixels, a gain correction should be applied to the incoming data, to bring signal levels to par across the array. Once the pixel gains are available, it simply means re-scaling the data for each pixel by a gain multiplication factor. I.e.,

for all pixels and all time samples. The means and variances can be scaled with the same factor, that is there is no need to fully recalculate them.

,

De-spiking

After a mean and sigma is calculated for each pixel in time (in this case the samples are time series data), all points that are more than n (where n is a parameter) sigma's away from the mean are removed (flagged). At the end mean and sigma are recalculated for the selected samples. This may be performed either once, or repeated until no more data points are removed at the desired de-spiking level (more time consuming but ensures self consistent data).

Flag if

De-correlating (Instantaneous, Constant Level)

Decorrelating will probably happen only in time on a given subset of pixels. The method described below decorrelates at the full sampling rate with a single constant. Later more sophisticated decorrelation methods will be decribed that will include plane fitting and/or decorrelating with a frequency cutoff.


After mean and sigma are calculated for all (gain corrected) pixels. The following is calculated for the subset of pixels (m pixels and n time samples)

or, alternatively,

This is just the (weighted) mean of the pixel deviations on the subset of pixels. The weights are optionally needed depending on whether the source of the correlated noise is gain dependent or not (note: the data is already gain corrected at this point). An example of the former is electronic noise (in the amplifier circuit). Sky noise (presumably dominant) would be an example of the latter.
The second definition is identical to the first, only it is optimized for numerical computation, since is calculated only once, hence reducing the number of operations within the summation cycle.
Subsequently is subtracted from all in the subset.
The mean and sigma are then (once again) recalculated for the subset of pixels.

Note: Subtracting from the pixel values will not result in increased uncorrelated noise. To arrive to that conclusion consider:

sincedon't depend on t,

That is (assuming proper normalization),

Which is,

Therefore, no extra noise had been added. Rather the noise is somewhat decreased, as an effect of 'smudging' pixels together. However, this effect is small if m is sufficiently large. (It is possible to get around this 'undesirable' effect with carefully adjusted weights and a slightly more complex subtraction. However that is not deemed necessary at this point.)

Decorrelating (Linear/Planar)

Fourier Transform

Linear Baseline Fitting

Finding and Removing a Single Fourier Component