HPdataFormat4A.html

NOAA/OERD Autonomous Hydrophone Data File Description

Last Update: January 7th, 2016

Data Format Type 4A for the NOAA Autonomous Hydrophone Data Files from 2013 on

Data Format Types 4A and 4B have the same overall layout as Type 3: each data file begins with a 256-byte header, and the remaining bytes are data. Types 4A and 4B differ from Type 3 only in the contents of the last 192 bytes of the header.

The main difference between Data Format Types 4A and 4B lies in the last 8 bytes of the header: Type 4A does not contain the NCHAN (Number of Channels) parameter, whereas Type 4B does. This document describes only Data Format Type 4A. See HPdataFormat4B.html for Data Format Type 4B.

Determine the Data Format Type: 4A or 4B

The PROGNAME parameter in the header determines whether a file is format type 4A or 4B. PROGNAME is the Program Name, at most 12 characters long, and is set by the technician before the hydrophone is deployed. It has the form CFxLogSP3izzz.c, with or without the 2-character suffix ".c". For example, PROGNAME = CFxLogSP3i3_4.c or CFxLogSP3i2_3.

When PROGNAME is equal to one of the names
CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c or CFxLogSP3i3_4.c, with or without the trailing ".c", the data format type is 4B and the header contains the NCHAN parameter.
See HPdataFormat4B.html for reading type 4B data files.

Otherwise, the data format type is 4A and the header has no NCHAN parameter; NCHAN is assumed to be 1.
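The rule above can be sketched in Python as follows. This is a minimal illustration, not part of the original decoding tools: the function name format_type and the constant FORMAT_4B_PROGNAMES are hypothetical, and trimming at the first NUL byte is an assumption about padding. How a name longer than the 12-byte PROGNAME field is stored is not stated here, so the function only compares the string it is given.

```python
# Names (without the ".c" suffix) that identify a type 4B file, per the text above.
FORMAT_4B_PROGNAMES = frozenset({
    "CFxLogSP3i2_4",
    "CFxLogSP3i3_2",
    "CFxLogSP3i3_3",
    "CFxLogSP3i3_4",
})


def format_type(progname):
    """Return "4B" if progname matches a listed name, otherwise "4A"."""
    # Assumption: anything after the first NUL byte is padding.
    name = progname.split("\x00", 1)[0].strip()
    # The names may appear with or without the trailing ".c".
    if name.endswith(".c"):
        name = name[:-2]
    return "4B" if name in FORMAT_4B_PROGNAMES else "4A"


if __name__ == "__main__":
    for example in ("CFxLogSP3i3_4.c", "CFxLogSP3i2_3"):
        print(example, "->", format_type(example))
```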

From here on, only Data Format Type: 4A will be described.

Key Header Information

The contents of the header are shown in the Header Format section, between the Byte Swapping and the Decoding Instructions sections below. To read the data correctly, the following parameters are needed:

PROGNAME = Program Name, e.g. CFxLogSP3i3_1.c or CFxLogSP3i2_3 for the type 4A format.
NCHAN    = Number of Channels = 1 (not stored in the type 4A header; the data file is single channel).
TIME_GMT = Start Date & Time,
SRATEHZ  = Sample Rate in hertz,
SAMPLES  = Sample Size and type.

PROGNAME (byte positions 152 to 163 of the header, counting the first byte as position 0) is described in the Determine the Data Format Type: 4A or 4B section above. Here the data format is assumed to be type 4A, i.e. single-channel data, so NCHAN = 1.

TIME_GMT (byte positions 90 to 135) contains the Year, Julian Day, Hour, Minute and Seconds. An example of TIME_GMT is

      115 213:21:47:57:862
     Year JDy:Hr:Mn:Sd:xxx

where the Year value must be adjusted by adding 1900, e.g. 1900 + 115 = 2015; JDy = Julian Day ( 1 to 366 ), Hr = Hour ( 0 to 23 ), Mn = Minute ( 0 to 59 ), and Sd:xxx = Seconds, computed as Sd + xxx/1000, so that e.g. 57:862 = 57.862 seconds.
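As an illustration of the TIME_GMT layout described above, the following Python sketch converts the example string into a date and time. The function name parse_time_gmt is hypothetical, and the handling of padding (NUL bytes and spaces around the two tokens) is an assumption, since only the example string is given here.

```python
from datetime import datetime, timedelta, timezone


def parse_time_gmt(field):
    """Parse a TIME_GMT string such as '115 213:21:47:57:862' into a UTC datetime."""
    # Assumption: the field may be padded with NUL bytes or spaces.
    tokens = field.replace("\x00", " ").split()
    year_offset = int(tokens[0])                       # years since 1900
    jday, hour, minute, second, millis = (int(p) for p in tokens[1].split(":"))
    start_of_year = datetime(1900 + year_offset, 1, 1, tzinfo=timezone.utc)
    # Julian Day 1 is January 1st, hence the "- 1".
    return start_of_year + timedelta(days=jday - 1, hours=hour, minutes=minute,
                                     seconds=second, milliseconds=millis)


if __name__ == "__main__":
    print(parse_time_gmt("      115 213:21:47:57:862"))
```

For the example above, Julian Day 213 of 2015 is August 1st, so the result is 2015-08-01 at 21:47:57.862 GMT.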

SRATEHZ (byte positions 196 to 199) is a Long integer value that indicates the nominal Sample Rate, e.g. 1000 Hz. Note that SRATEHZ should not be used as the true Sample Rate. The correct Sample Rate must be computed (see Compute the Sample Rate below), and its value should be very close to SRATEHZ.

SAMPLES (byte positions 200 to 201) is a Short integer value that indicates how the data values are stored in the file after the header. SAMPLES can be 0, 2 or 3, where
0 indicates the data after the header are byte values;
2 indicates the data after the header are 12-bit unsigned integers stored in short integers, i.e. 2 bytes per data point, and the 4 high-order bits must be set to zero in order to interpret the value correctly; and
3 indicates the data after the header are 16-bit unsigned integers stored in short integers, i.e. 2 bytes per data point.

For data format types 4A and 4B, SAMPLES will most likely be equal to 3, i.e. 16-bit unsigned integers are used.

Also, see Byte Swapping below.

Compute the Sample Rate:

Because of the hardware setup, the Sample Rate (Data Points per Second) must be determined for each data file as the total number of data points in the file divided by the file's total duration in seconds.

(A) The Total Data Points in a given data file is
(File Size in Bytes - 256 Header bytes)/(Data Size)/NCHAN
where Data Size is 1 if SAMPLES = 0 and 2 if SAMPLES = 2 or 3, and NCHAN ≥ 1.

(B) The Total Time in seconds is the End Time - Start Time. The Start Time is obtained from the given data file, e.g. 000011.DAT. The End Time, however, must be taken from the next data file, e.g. 000012.DAT, whose start time serves as the End Time.

Then (A)/(B) gives the correct Sample Rate for the data file 000011.DAT. If the next data file is not available or does not exist, use SRATEHZ or the Sample Rate computed from the last two data files.
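The computation of (A)/(B) can be sketched in Python as below. The function names are hypothetical; the start times are assumed to have already been converted from the TIME_GMT fields of the two consecutive files into datetime values.

```python
from datetime import datetime, timedelta

HEADER_BYTES = 256  # size of the header at the start of every data file


def total_data_points(file_size_bytes, samples, nchan=1):
    """(A): number of data points per channel stored after the header."""
    if samples == 0:
        data_size = 1          # 8-bit byte values
    elif samples in (2, 3):
        data_size = 2          # 12-bit or 16-bit values in 2-byte short integers
    else:
        raise ValueError("SAMPLES must be 0, 2 or 3")
    return (file_size_bytes - HEADER_BYTES) // data_size // nchan


def compute_sample_rate(file_size_bytes, samples, start_time, next_start_time, nchan=1):
    """(A)/(B): data points divided by the time until the next file starts."""
    total_seconds = (next_start_time - start_time).total_seconds()  # (B)
    if total_seconds <= 0:
        raise ValueError("the next file must start after this file")
    return total_data_points(file_size_bytes, samples, nchan) / total_seconds


if __name__ == "__main__":
    start = datetime(2015, 8, 1, 21, 47, 57, 862000)
    # A made-up file holding 60000 16-bit points that lasts 60 seconds.
    print(compute_sample_rate(256 + 2 * 60000, 3, start, start + timedelta(seconds=60)))
```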

Byte (8-bit) Values: (most likely not used by data format types 4A and 4B)

When SAMPLES = 0, the data are 8-bit byte values. Read them as unsigned 8-bit numbers B, then convert them into signed integers: I = B - 127.

Byte Swapping:

When SAMPLES = 2 (12-bit) or 3 (16-bit), the data are stored as 2-byte short integers in the Big Endian format, i.e. the most significant byte comes first. Byte Swapping of the short integers is needed if a computer with the Little Endian format is used. The unsigned values (UI2) then need to be converted to signed integers: I2 = UI2 - N, where N = 2048 for 12-bit data or N = 32768 for 16-bit data.
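The conversions for all three SAMPLES values can be sketched in Python as follows. The function name decode_samples is hypothetical. The ">" in the struct format string requests Big Endian decoding explicitly, so the same code works on both Little Endian and Big Endian computers without a separate byte-swapping step.

```python
import struct


def decode_samples(raw, samples):
    """Convert raw data bytes (everything after the header) to signed integers."""
    if samples == 0:
        # 8-bit data: unsigned byte B becomes I = B - 127.
        return [b - 127 for b in raw]
    if samples in (2, 3):
        count = len(raw) // 2
        # ">" = Big Endian, "H" = unsigned 2-byte short integer.
        values = struct.unpack(">%dH" % count, raw[:count * 2])
        if samples == 2:
            # 12-bit data: clear the 4 high-order bits, then subtract 2048.
            return [(v & 0x0FFF) - 2048 for v in values]
        # 16-bit data: subtract 32768.
        return [v - 32768 for v in values]
    raise ValueError("SAMPLES must be 0, 2 or 3")


if __name__ == "__main__":
    print(decode_samples(bytes([0x80, 0x00, 0x00, 0x00, 0xFF, 0xFF]), 3))
```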

Header Format for Autonomous Hydrophone Data Files Type 4A:

The header format is shown in the C programming language. Note that the first and second 32 bytes are the same as in Data Format Type 3; the remaining 192 header bytes are different in the Type 4A data format.

typedef struct {
//
// ----- First 32 bytes of BIR header is working variables ------
//
  char   BIRHdrID[4];  // "BIR\0"
  ushort BIRVersion;   // version.release * 10
  ushort BIRUserHeaderSize; // size in bytes for user header
  ushort BIRUnused;    // currently unused (zero)
  ulong  RTCsecs;      // current RTC seconds
  ushort RTCticks;     // current RTC ticks
//
  ulong BIRCapacityBytes;  // total capacity of current drive
  ulong BIRStartFreeBytes; // free space on current drive at start
  ulong BIRReceivedBytes;  // total number of bytes received (likely to wrap)
  ulong BIRWrittenBytes;   // total number of bytes written (likely to wrap)
//
// ----- Second 32 bytes of BIR header is copy of VEE settings ------
//
  long CFPPBSZ;     // size of CompactFlash buffer, typ. 40MB
  long RAMPPBSZ;    // size of data RAM PP buffer, typ. 16 to 64KB
  long RAMHDBFSZ;   // size of CF to HD copy buffer, typ. 16 to 64KB
  long MINFREESZ;   // minimum free space until switch to next drive
//
  char HDDOSDRV[4]; // DOS drive assigned to hard disk ("D:")
  short NODRVTEST;  // testing without accessing drive
  short UARTMONIT;  // sending diagnostics to RS-232 port
  short FLOGFLAG;   // flag to log for major events (startup, spinups)
  short BIADEVICE;  // device type attached to BigIDEA(s)
  short CURBIA;     // index to current BigIDEA/drive
  short CURPRTN;    // index to current DOS partition
}

// DO NOT CHANGE THESE -- THEY MUST REMAIN CONSTANT AND TOTAL 256
#define    ID_LENMAX       4   //System ID length
#define   LAT_LENMAX      10   //Latitude length
#define  LONG_LENMAX      12   //Longitude max
#define   GMT_LENMAX      46   //GMT char length
#define  LOGF_NAME_MAX    14   //JHG-2003-07-28 max filename length for log file
#define     EXPID_LENMAX  16   //EXP ID, usually year
#define  PROGNAME_LENMAX  12   //Added by hm NOAA 09/30/2002

//
// The remaining 192 bytes of header space contain No NCHAN parameter, which means
// the PROGNAME will Not be equal to any of the following names:
// CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c, CFxLogSP3i3_4.c with or w/o the ".c"
//
typedef struct //user header space 192 bytes 
    {
    char     PLTFRMID[ID_LENMAX];        //System ID (e.g., G001) 4
    char     LATITUDE[LAT_LENMAX];       //Latitude in degrees  10 N45:02.356=N45deg 02.356min
    char     LONGITUDE[LONG_LENMAX];     //Longitude in degrees 12 W128:34.872=W128 deg 34.872min
    char     TIME_GMT[GMT_LENMAX];       //Start date and time in GMT 46
    char     EXPID[EXPID_LENMAX];        //Exp ID   16
    char     PROGNAME[PROGNAME_LENMAX];  //Program name  12
    ushort   ACQVersion;                 // version.release * 10
    ushort   WARMUP;                     //Pre-amp warm up in sec prior to the A/D logging
    char     PROJID[ID_LENMAX];          //Project ID 4
    char     LOGFILE[LOGF_NAME_MAX];     //Filename for event logging 14 chars
    short    STARTUPS;                   //number of program resets we've seen
    short    MAXSTRTS;                   //Maximum allowable program resets up to 255
    long     MAXNUMFIL;                  //Maximum number of files
    short    GAIN;                       //additional pre-amp gain 0 to 3 with 6dB inc
    long     SRATEHZ;                    //sample rate in hertz. NEW===SRATEHZ is a long value.===NEW
    short    SAMPLES;                    //sample size and type 0=8 bit, 2=12bit(2 byte), 3=16 bit
    short    PWFILT;                     //Pre-whitening filter setting
    short    LOPASS;                     //Low pass filter cut off
    ushort   SLEEP;                      //sleep in hours before program launched
    ulong    ACTIVESEC;                  //Active logging period in second
    ulong    DUTYCYCLE;                  //Duty cycle in seconds
    short    HYDROSENS;                  //Hydrophone sensitivity
    char     PRAMPNAME[10];              //PreAmp revision up to 8-char long
    ulong    WAKEUP;                     //Time to start logging in sec since 01/01/1970 11/02/99 NOAA hm
    char     DAQNAME[10];                //DAQ board name
    char     HYDROSRN[6];                //Hydrophone serial number
    ushort   FILECOUNT;                  //File count
    short    TESTSEC;                    //Test sec prior to the real logging.  Logs all 8 channel.
    short    STANDBY;                    //Standby in second before launching program
    char     dummy[2];
    }    ACQData;

Note that the last 4 parameters, FILECOUNT, TESTSEC, STANDBY and dummy[2], in the typedef struct { ... } ACQData above are unique to data format type 4A. They are not in the header for data format type 4B, except for TESTSEC, which is at a different byte position there.
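As a cross-check of the byte positions implied by the structures above, the following Python sketch extracts the key fields from a 256-byte header. The function names are hypothetical. The offsets follow from summing the field sizes (2-byte short, 4-byte long). Integers are read as Big Endian, and character fields are assumed to be padded with NUL bytes or spaces.

```python
import struct

HEADER_BYTES = 256


def _text(header, start, length):
    """Decode a fixed-width char field (assumed NUL- or space-padded)."""
    raw = bytes(header[start:start + length]).split(b"\x00", 1)[0]
    return raw.decode("ascii", errors="replace").strip()


def read_key_fields(header):
    """Return the header fields needed to compute the sample rate and read the data."""
    if len(header) < HEADER_BYTES:
        raise ValueError("header must be at least 256 bytes")
    (srate_hz,) = struct.unpack_from(">l", header, 196)  # long, Big Endian
    (samples,) = struct.unpack_from(">h", header, 200)   # short, Big Endian
    return {
        "LATITUDE": _text(header, 68, 10),
        "LONGITUDE": _text(header, 78, 12),
        "TIME_GMT": _text(header, 90, 46),
        "PROGNAME": _text(header, 152, 12),
        "SRATEHZ": srate_hz,
        "SAMPLES": samples,
    }
```

A typical call would be read_key_fields(f.read(256)) on a file opened in binary mode.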

Instructions for Decoding Autonomous Hydrophone Data Files

The following instructions show one of many ways to decode a data file. The objective is to give users examples for reading the header and data. Note that the following code reads only the essential information from the header, which is needed to compute the sample rate and read the data correctly.

Decoding Autonomous Hydrophone Data (Format 4A) Block using IDL Software Package

The following IDL commands will decode an autonomous data file 1 block at a time. Note that this section assumes readers know IDL and how to refer to the IDL manual for details.

Start IDL ( on a UNIX system, type IDL; the IDL> prompt should then show up )

IDL> OPENR, 10, 'Type4AFormat.Data'    ; Type4AFormat.Data is the file name.
IDL> BLK = ASSOC( 10, BYTARR( 256 ) )  ; 256 is the block size in bytes
                                       ; and it is also the Header size.
;    BLK will allow users to get 256 bytes of data per read.

IDL> HDR = BLK[0]  ; Read in the Header: HDR which will be 1-D byte array
                   ; with 256 elements from 0 to 255.

;    Show the LATITUDE, LONGITUDE & TIME_GMT.
IDL> PRINT, 'Latitude : ', STRING( HDR[068:077] )
IDL> PRINT, 'Longitude: ', STRING( HDR[078:089] )
IDL> PRINT, 'TIME_GMT : ', STRING( HDR[090:135] )

IDL> SRATEHZ   = LONG( HDR, 196, 1 )  ; Sample Rate in Hz. Long Integer.
IDL> SAMPLES   =  FIX( HDR, 200, 1 )  ; Sample Size: 3 means 16-Bit Short Integer.

;    If your computer is using the Little Endian Format, you must do the Byte Swapping.
;    Run the following statement to find out.

IDL> LITTLE_ENDIAN = (BYTE(1,0,2))[0] EQ 1B  ; If LITTLE_ENDIAN is 1, 
;                 then your computer is using the Little Endian Format.

IDL> BYTEORDER, SRATEHZ, /LSWAP  ; Skip this if your computer is Big Endian.
IDL> BYTEORDER, SAMPLES, /SSWAP  ; Skip this if your computer is Big Endian.

IDL> PROGNAME = STRING( HDR[152:163] )  ; Program Name='CFxLogSP3i2_3', e.g.

IDL> PRINT, 'PROGNAME: ', PROGNAME  ; Check to make sure that it is the data format: 4A.
;    i.e. the PROGNAME is Not equal to any of the following names:
;    CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c, CFxLogSP3i3_4.c with or w/o the ".c"
;    If PROGNAME = 1 of the names above, Get the NCHAN parameter as NCHAN = HDR[248]

IDL> NCHAN = 1  ; PROGNAME is not = 1 of the names above.

IDL> PRINT, 'SRATEHZ SampleRate: ',  SRATEHZ
IDL> PRINT, 'SAMPLES Size, Type: ',  SAMPLES
IDL> PRINT, 'PROGNAME          : ',  PROGNAME
IDL> PRINT, 'NCHAN Channels    : ',  NCHAN

;    Decode data blocks.  Assuming SAMPLES is 0, i.e. 1-byte data.
IDL> B   = BLK[1]   ; Read in the 2nd block, the 2nd set of 256 bytes.
IDL> RCD = B - 127  ; Convert the data into the range -127 to 128.
;    where RCD will be an integer array of 256 elements.
IDL> DATA = TEMPORARY( RCD )  ; Optional: Assign RCD into DATA as a 1-D array.

;    Decode the next data block.
IDL> B   = BLK[2]  &  RCD = B - 127  ; Process 2 IDL statements together.

;    Note that the order of which block to read can be random, i.e.
;    B = BLK[10] & ...  and later B = BLK[5] & ... are OK.

     ::: etc :::

;    Decode data blocks with SAMPLES = 3 (16-bit integers) or 2 (12-bit integers).
IDL> B  = BLK[1]            ; Read in the 2nd set of 256 bytes from the data file.
IDL> I2 = UINT( B, 0, 128 ) ; Convert 256 bytes into 2-byte unsigned integers
                            ; with 128 elements.
IDL> BYTEORDER, I2, /SSWAP  ; Required if a Little Endian computer is used.
;    Convert I2 to signed values. LONG() keeps the subtraction from wrapping
;    around in unsigned arithmetic.
IDL> I2 = ( LONG( I2 ) AND 4095 ) - 2048  ; if SAMPLES = 2 (4 high-order bits cleared).
IDL> I2 = LONG( I2 ) - 32768              ; if SAMPLES = 3.
IDL> DATA = TEMPORARY( I2 ) ; Optional: Assign I2 into DATA as a 1-D array.

     ::: etc :::

IDL> CLOSE, 10  ; Close the data file when you are done.
IDL> EXIT
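The block-at-a-time access provided by ASSOC above can be sketched in Python as follows: block k starts at byte 256 * k, and block 0 is the header. The function name read_block is hypothetical.

```python
import io

BLOCK_BYTES = 256  # block size in bytes; also the header size


def read_block(stream, index):
    """Return block `index` (0 = header) from a binary stream opened for reading.

    The last block of a file may be shorter than 256 bytes.
    """
    stream.seek(index * BLOCK_BYTES)
    return stream.read(BLOCK_BYTES)


if __name__ == "__main__":
    # An in-memory stand-in for a data file: an empty header plus one data block.
    demo = io.BytesIO(bytes(256) + bytes(range(256)))
    print(len(read_block(demo, 0)), read_block(demo, 1)[:4])
```

With a real file, the stream would come from open('Type4AFormat.Data', 'rb'), and blocks can be read in any order.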

Decoding Autonomous Data Block using MATLAB Software Package

The following MATLAB commands will decode all the data in an autonomous data file. Note that this section assumes readers know MATLAB and how to refer to the MATLAB manual for details.

Note that opening the file with the 'b' (Big Endian) option of the MATLAB fopen() function, as in (1) below, makes fread() interpret the binary data in the Big Endian format; therefore, no byte swapping is needed.

Start MATLAB ( on a UNIX system, type matlab; the MATLAB working window will then show up )

    Open a data file (1) and Skip the 1st 68 bytes
    (BIR working variables, VEE settings and PLTFRMID) from the file (2).
>>> FID    = fopen( 'Type4AFormat.Data', 'r', 'b' );  % (1)
>>> status = fseek( FID, 68, 'bof' );                 % (2)

    Read in the next 10 bytes for the LATITUDE.
>>> a      = fread( FID, 10 );
>>> disp( char( a' ) )  % LATITUDE.

    Read in the next 12 bytes for the LONGITUDE.
>>> a      = fread( FID, 12 );
>>> disp( char( a' ) )  % LONGITUDE.

    Read in the next 46 bytes for the TIME_GMT.
>>> a      = fread( FID, 46 );
>>> disp( char( a' ) )  % Show TIME_GMT.

    Move to byte position 196 from the beginning of the file.
>>> status = fseek( FID, 196, 'bof' );

    Read in the next 4 bytes as a long integer for the Sample Rate.
>>> i      = fread( FID, 1, 'int32' );    % Read in 1 long (32-bit) integer.
>>> i      % Shows the SRATEHZ (Sample Rate)

    Read in the next 2 bytes as a short integer for the Sample Size.
>>> i      = fread( FID, 1, 'int16' );    % Read in 1 short (16-bit) integer.
>>> disp( ['Sample Size = ', num2str( i )] )

    Skip the 1st 152 bytes from the beginning.
>>> status = fseek( FID, 152, 'bof' );    % Move to the correct position.
    
    Read in the next 12 bytes for the PROGNAME
>>> a      = fread( FID, 12 );
>>> disp( char( a' ) )  % Shows the PROGNAME

    Make sure the PROGNAME is Not equal to any of the following names:
    CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c, CFxLogSP3i3_4.c with or w/o the ".c"
 
>>> nchan  = 1;  % Data Format 4A is a single channel data file.

    Assuming SAMPLES is 0, i.e. 1-byte data
    Decode data blocks as nchan x (pts = total data points per channel).
>>> status = fseek( FID, 256, 'bof'   );  % Skip the 1st 256 header bytes.
>>> blk    = fread( FID, [ nchan inf ], 'uchar' );  % Read in All the data after the header.
>>> rcd    = blk - 127;                   % convert the range into -127 & 128.


    Assuming SAMPLES is 3, i.e. 16-bit data
    Decode data blocks as nchan x (pts = total data points per channel).
>>> status = fseek( FID, 256, 'bof'    );  % Skip the 1st 256 header bytes.
>>> blk    = fread( FID, inf, 'uint16' );  % Read in All the data after the header.
>>> rcd    = blk' - 32768;  % convert the range into -32768 & 32767.

    ::: etc :::

>>> fclose( FID );  % Close the data file.
>>> quit            % Exit out of matlab.
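For readers without IDL or MATLAB, the whole-file read for 16-bit data (SAMPLES = 3) can be sketched in Python with an explicit byte swap. The function name read_all_16bit is hypothetical, and the sketch assumes a platform where the array type code "H" is 2 bytes wide, which is the usual case.

```python
import sys
from array import array

HEADER_BYTES = 256


def read_all_16bit(path):
    """Read every 16-bit sample after the header and return signed integers."""
    with open(path, "rb") as f:
        f.seek(HEADER_BYTES)           # skip the 256 header bytes
        raw = f.read()                 # read all the data after the header
    raw = raw[:len(raw) - (len(raw) % 2)]  # drop a trailing odd byte, if any
    values = array("H")                # unsigned short integers (assumed 2 bytes)
    values.frombytes(raw)
    if sys.byteorder == "little":
        values.byteswap()              # the file is Big Endian; swap on Little Endian
    return [v - 32768 for v in values]  # convert the range to -32768 ... 32767
```

On a Big Endian computer the swap is skipped, matching the Byte Swapping section above.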

Other Data File Formats

The other data file formats are Data Format Type 4B, Type 3 and Type 2. Note that Data Format Type 1 was experimental and is no longer in use.