Data Format Type 4A for the NOAA Autonomous Hydrophone Data Files from 2013 on
Data Format Types 4A and 4B have the same settings as Type 3.
Each data file begins with a 256-byte header; the rest of the bytes are data.
In formats 4A and 4B, the contents of the last 192 bytes of the header
differ from those of the header in data format Type 3.
The main difference between Data Format Types 4A and 4B lies in the last 8 bytes of the header: Type 4A does not contain the NCHAN (Number of Channels) parameter there, whereas Type 4B does. This document describes only Data Format Type 4A. See HPdataFormat4B.html for Data Format Type 4B.
Determine the Data Format Type: 4A or 4B
The PROGNAME parameter in the header is needed to find out whether the format type is 4A or 4B. PROGNAME is the Program Name, max. 12 characters long. It is set by the technician before the hydrophone is deployed. PROGNAME has the form CFxLogSP3izzz.c, with or without the 2 suffix characters ".c". For example, PROGNAME = CFxLogSP3i3_4.c or CFxLogSP3i2_3.
When the PROGNAME is equal to one of the names
CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c or
CFxLogSP3i3_4.c, with or without the 2 suffix characters ".c",
the data format is type 4B and the header has the NCHAN parameter.
See HPdataFormat4B.html for reading type 4B data files.
Otherwise, the data format is type 4A and the header has no NCHAN
parameter. NCHAN is then assumed to be 1.
From here on, only Data Format Type: 4A will be described.
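The rule above can be sketched in Python as follows. This is only an illustration, not part of the original instructions: the names TYPE_4B_PROGNAMES and format_type are made up for this example, and the sketch assumes the program name has already been recovered from the header as a text string.

```python
# Names listed above as identifying Data Format Type 4B, with the ".c" suffix removed.
TYPE_4B_PROGNAMES = {
    "CFxLogSP3i2_4",
    "CFxLogSP3i3_2",
    "CFxLogSP3i3_3",
    "CFxLogSP3i3_4",
}


def format_type(progname):
    """Return "4B" if progname is one of the listed names, otherwise "4A"."""
    # Assumption: unused characters of the field are NUL or blank padding.
    name = progname.strip(" \x00")
    if name.endswith(".c"):
        name = name[:-2]  # the ".c" suffix is optional, so compare without it
    return "4B" if name in TYPE_4B_PROGNAMES else "4A"


if __name__ == "__main__":
    for example in ("CFxLogSP3i3_4.c", "CFxLogSP3i2_3"):
        print(example, "->", format_type(example))
```

For a type 4A result, NCHAN is simply taken to be 1, as stated above.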
Key Header Information
The contents of the header are shown between the Byte Swapping and the Decoding Instructions sections below. To read the data correctly, the following parameters from the header will be needed:
PROGNAME = Program Name, e.g. CFxLogSP3i3_1.c or CFxLogSP3i2_3 for the type 4A format.
NCHAN = Number of Channels = 1, because a type 4A file is a single-channel data file.
TIME_GMT = Start Date & Time.
SRATEHZ = Sample Rate in hertz.
SAMPLES = Sample Size and type.
The PROGNAME (byte positions 152 to 163 of the header, with the 1st byte position counted as 0) has been described in the Determine the Data Format Type: 4A or 4B section above. Here the data format is assumed to be type 4A, i.e. single-channel data, so NCHAN = 1.
TIME_GMT (byte positions 90 to 135) contains the Year, Julian Day, Hour, Minute and Seconds. An example of TIME_GMT is
115 213:21:47:57:862
with the layout Year JDy:Hr:Mn:Sd:xxx, where the Year value needs to be adjusted by adding 1900, e.g. 1900 + 115 = 2015; JDy = Julian Day ( 1 to 366 ); Hr = Hour ( 0 to 23 ); Mn = Minute ( 0 to 59 ); and Sd:xxx = Seconds as Sd + xxx/1000, so e.g. 57:862 = 57.862.
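As an illustration of the TIME_GMT layout described above, the following Python sketch converts the field into a date and time. The function name parse_time_gmt is made up for this example, and the sketch assumes that the unused part of the 46-byte field is filled with NUL bytes or blanks, which this document does not state.

```python
from datetime import datetime, timedelta


def parse_time_gmt(field):
    """Convert a TIME_GMT field such as b"115 213:21:47:57:862" to a datetime."""
    # Assumption: the text ends at the first NUL byte, if there is one.
    text = field.split(b"\x00", 1)[0].decode("ascii").strip()
    year_part, rest = text.split(None, 1)
    jday, hour, minute, sec, msec = (int(p) for p in rest.strip().split(":"))
    # Year is stored as an offset from 1900; Julian Day 1 is January 1st.
    start_of_year = datetime(1900 + int(year_part), 1, 1)
    return start_of_year + timedelta(
        days=jday - 1, hours=hour, minutes=minute, seconds=sec, milliseconds=msec
    )


if __name__ == "__main__":
    print(parse_time_gmt(b"115 213:21:47:57:862"))
```

The difference between two such values, taken with total_seconds(), gives a time span in seconds with millisecond resolution.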
SRATEHZ (byte positions 196 to 199) is a Long integer value that indicates the nominal Sample Rate, e.g. 1000 Hz. Note that SRATEHZ should not be used as the true Sample Rate. The correct Sample Rate must be computed (see Compute the Sample Rate below), and the computed value should be very close to SRATEHZ.
SAMPLES (byte positions 200 to 201) is a Short integer value
that indicates how the data values are stored in the file after
the header. SAMPLES can be 0, 2 or 3, where
0 indicates the data after the header are byte values;
2 indicates the data after the header are 12-bit unsigned integers
stored in short integers, i.e., 2 bytes per data point, and
the 4 high-order bits must be set to zero in order to
interpret the value correctly; and
3 indicates the data after the header are 16-bit unsigned integers
stored in short integers, i.e., 2 bytes per data point.
For data format types 4A and 4B, SAMPLES will most likely
be equal to 3, i.e., 16-bit unsigned integers are used.
Also, see Byte Swapping below.
Compute the Sample Rate:
Because of the hardware setup, the Sample Rate (Data Points per Second) must be determined for each data file as the total number of data points in the file divided by the file's total time span in seconds.
(A) Total Data Points in a given data file will be
(File Size in Bytes - Header: 256 bytes )/(Data Size)/NCHAN
where Data Size is 1 if SAMPLES = 0 and Data Size is 2 if SAMPLES
= 2 or 3; NCHAN ≥ 1.
(B) Total Time in seconds will be the End Time - Start Time, where the Start Time is obtained from the given data file, e.g. 000011.DAT. The End Time, however, must be retrieved from the next data file, e.g. 000012.DAT, whose start time is used as the End Time.
Then (A)/(B) gives the correct Sample Rate for the data file 000011.DAT. If the next data file is not available or does not exist, either SRATEHZ or the Sample Rate computed from the last 2 data files will need to be used.
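The computation of (A)/(B) can be sketched in Python as shown here. The function names are made up for this example, and the start times are assumed to have already been converted to seconds on a common time scale (for instance from the TIME_GMT of each file).

```python
HEADER_BYTES = 256  # every data file starts with a 256-byte header


def total_data_points(file_size_bytes, samples, nchan=1):
    """(A): data points per channel in one data file."""
    if samples not in (0, 2, 3):
        raise ValueError("SAMPLES must be 0, 2 or 3")
    data_size = 1 if samples == 0 else 2  # bytes per data point
    return (file_size_bytes - HEADER_BYTES) // data_size // nchan


def computed_sample_rate(file_size_bytes, samples, start_sec, next_start_sec, nchan=1):
    """(A)/(B): data points divided by the time span up to the next file's start."""
    total_time = next_start_sec - start_sec  # (B)
    if total_time <= 0:
        raise ValueError("the next file must start after this file")
    return total_data_points(file_size_bytes, samples, nchan) / total_time


if __name__ == "__main__":
    # Hypothetical numbers: 60 s of 16-bit data at a nominal 1000 Hz.
    size = HEADER_BYTES + 2 * 60000
    print(computed_sample_rate(size, 3, 0.0, 60.0))
```

The result can be compared with SRATEHZ as a sanity check, since the two values should be very close.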
Byte (8-bit) Values: (most likely not used by data formats 4A and 4B)
When SAMPLES = 0, the data are 8-bit byte numbers. To get the
values, read them as unsigned 8-bit numbers, B, then convert
them into signed integers: I = B - 127.
Byte Swapping:
When SAMPLES = 2 (12-bit) or 3 (16-bit), the data are
stored in 2-byte short integers in Big Endian format,
i.e., the most-significant byte is stored first.
Byte Swapping for the short integers will be needed if a computer
with the Little Endian format is used. After that, the unsigned
values (UI2) need to be converted to signed integers: I2 = UI2
- N, where N = 2048 for 12-bit data or N = 32768 for 16-bit
data.
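The conversions described in the Byte (8-bit) Values and Byte Swapping sections can be sketched in Python as follows. The function name decode_samples is made up for this example. Using an explicit big-endian format in struct.unpack means no separate byte-swapping step is needed on a little-endian computer.

```python
import struct


def decode_samples(raw, samples):
    """Convert the raw bytes after the header into a list of signed integers."""
    if samples == 0:
        # 8-bit data: unsigned byte B becomes I = B - 127.
        return [b - 127 for b in raw]
    if samples not in (2, 3):
        raise ValueError("SAMPLES must be 0, 2 or 3")
    count = len(raw) // 2
    # ">" selects big-endian, "H" is an unsigned 16-bit integer.
    values = struct.unpack(">%dH" % count, raw[: count * 2])
    if samples == 2:
        # 12-bit data: clear the 4 high-order bits, then subtract 2048.
        return [(v & 0x0FFF) - 2048 for v in values]
    # 16-bit data: subtract 32768.
    return [v - 32768 for v in values]


if __name__ == "__main__":
    print(decode_samples(bytes([0x80, 0x00, 0x00, 0x00, 0xFF, 0xFF]), 3))
```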
Header Format for Autonomous Hydrophone Data Files Type 4A:
The header format is shown in the C programming language. Note
that the 1st and 2nd 32 bytes are the same as in data format Type
3; the remaining 192 header bytes are different in the Type 4A
data format.
typedef struct {
    //
    // ----- First 32 bytes of BIR header is working variables ------
    //
    char    BIRHdrID[4];        // "BIR\0"
    ushort  BIRVersion;         // version.release * 10
    ushort  BIRUserHeaderSize;  // size in bytes for user header
    ushort  BIRUnused;          // currently unused (zero)
    ulong   RTCsecs;            // current RTC seconds
    ushort  RTCticks;           // current RTC ticks
    //
    ulong   BIRCapacityBytes;   // total capacity of current drive
    ulong   BIRStartFreeBytes;  // free space on current drive at start
    ulong   BIRReceivedBytes;   // total number of bytes received (likely to wrap)
    ulong   BIRWrittenBytes;    // total number of bytes written (likely to wrap)
    //
    // ----- Second 32 bytes of BIR header is copy of VEE settings ------
    //
    long    CFPPBSZ;            // size of CompactFlash buffer, typ. 40MB
    long    RAMPPBSZ;           // size of data RAM PP buffer, typ. 16 to 64KB
    long    RAMHDBFSZ;          // size of CF to HD copy buffer, typ. 16 to 64KB
    long    MINFREESZ;          // minimum free space until switch to next drive
    //
    char    HDDOSDRV[4];        // DOS drive assigned to hard disk ("D:")
    short   NODRVTEST;          // testing without accessing drive
    short   UARTMONIT;          // sending diagnostics to RS-232 port
    short   FLOGFLAG;           // flag to log for major events (startup, spinups)
    short   BIADEVICE;          // device type attached to BigIDEA(s)
    short   CURBIA;             // index to current BigIDEA/drive
    short   CURPRTN;            // index to current DOS partition
}

// DO NOT CHANGE THESE -- THEY MUST REMAIN CONSTANT AND TOTAL 256
#define ID_LENMAX        4   //System ID length
#define LAT_LENMAX       10  //Latitude length
#define LONG_LENMAX      12  //Longitude max
#define GMT_LENMAX       46  //GMT char length
#define LOGF_NAME_MAX    14  //JHG-2003-07-28 max filename length for log file
#define EXPID_LENMAX     16  //EXP ID, usually year
#define PROGNAME_LENMAX  12  //Added by hm NOAA 09/30/2002

//
// The remaining header space of 192 bytes contains No NCHAN parameter, which means
// the PROGNAME will Not be equal to any 1 of the following names:
// CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c, CFxLogSP3i3_4.c with or w/o the ".c"
//
typedef struct   //user header space 192 bytes
{
    char    PLTFRMID[ID_LENMAX];        //System ID (e.g., G001) 4
    char    LATITUDE[LAT_LENMAX];       //Latitude in degrees 10  N45:02.356=N45deg 02.356min
    char    LONGITUDE[LONG_LENMAX];     //Longitude in degrees 12  W128:34.872=W128 deg 34.872min
    char    TIME_GMT[GMT_LENMAX];       //GMT time 46
    char    EXPID[EXPID_LENMAX];        //Exp ID 16
    char    PROGNAME[PROGNAME_LENMAX];  //Program name 12
    ushort  ACQVersion;                 // version.release * 10
    ushort  WARMUP;                     //Pre-amp warm up in sec prior to the A/D logging
    char    PROJID[ID_LENMAX];          //Project ID 4
    char    LOGFILE[LOGF_NAME_MAX];     //Filename for event logging 14 chars
    short   STARTUPS;                   //number of program resets we've seen
    short   MAXSTRTS;                   //Maximum allowable program resets up to 255
    long    MAXNUMFIL;                  //Maximum number of files
    short   GAIN;                       //additional pre-amp gain 0 to 3 with 6dB inc
    long    SRATEHZ;                    //sample rate in hertz. NEW===SRATEHZ is a long value.===NEW
    short   SAMPLES;                    //sample size and type 0=8 bit, 2=12bit(2 byte), 3=16 bit
    short   PWFILT;                     //Pre-whitening filter setting
    short   LOPASS;                     //Low pass filter cut off
    ushort  SLEEP;                      //sleep in hours before program launched
    ulong   ACTIVESEC;                  //Active logging period in second
    ulong   DUTYCYCLE;                  //Duty cycle in seconds
    short   HYDROSENS;                  //Hydrophone sensitivity
    char    PRAMPNAME[10];              //PreAmp revision up to 8-char long
    ulong   WAKEUP;                     //Time to start logging in sec since 01/01/1970 11/02/99 NOAA hm
    char    DAQNAME[10];                //DAQ board name
    char    HYDROSRN[6];                //Hydrophone serial number
    ushort  FILECOUNT;                  //File count
    short   TESTSEC;                    //Test sec prior to the real logging. Logs all 8 channel.
    short   STANDBY;                    //Standby in second before launching program
    char    dummy[2];
} ACQData;
Note that the last 4 parameters, FILECOUNT, TESTSEC, STANDBY and dummy[2], in the typedef struct { ... } ACQData above are unique to data format type 4A. They are not in the header for data format type 4B, except for TESTSEC, and its byte position there is different.
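As a compact illustration of the byte positions quoted in this document, the following Python sketch pulls the key fields out of a 256-byte header. The function name read_key_header_fields and the dictionary layout are made up for this example. The integers are read as big-endian, and the character fields are assumed to be NUL or blank padded.

```python
import struct

HEADER_BYTES = 256


def read_key_header_fields(header):
    """Return the key parameters from a 256-byte type 4A header."""
    if len(header) < HEADER_BYTES:
        raise ValueError("a full 256-byte header is required")

    def text(start, length):
        # Assumption: the text ends at the first NUL byte, if there is one.
        raw = header[start:start + length].split(b"\x00", 1)[0]
        return raw.decode("ascii", errors="replace").strip()

    (sratehz,) = struct.unpack_from(">l", header, 196)  # 4-byte long
    (samples,) = struct.unpack_from(">h", header, 200)  # 2-byte short
    return {
        "LATITUDE": text(68, 10),
        "LONGITUDE": text(78, 12),
        "TIME_GMT": text(90, 46),
        "PROGNAME": text(152, 12),
        "SRATEHZ": sratehz,
        "SAMPLES": samples,
    }


if __name__ == "__main__":
    # Type4AFormat.Data is the same placeholder file name used in the sessions below.
    with open("Type4AFormat.Data", "rb") as f:
        print(read_key_header_fields(f.read(HEADER_BYTES)))
```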
Instructions for Decoding Autonomous Hydrophone Data Files
The following instructions show one of many ways to decode a data file. The objective is to give users examples for reading the header and data. Note that the following code reads only the essential information from the header, i.e. what is needed to compute the sample rate and read the data correctly.
Decoding Autonomous Hydrophone Data (Format 4A) Block using IDL Software Package
The following IDL commands will decode an autonomous data file 1 block at a time. Note that this section assumes readers know IDL and can refer to the IDL manual for details.
Enter IDL ( on a UNIX system, type IDL and the IDL> prompt should show up )
IDL> OPENR, 10, 'Type4AFormat.Data'     ; Type4AFormat.Data is the file name.
IDL> BLK = ASSOC( 10, BYTARR( 256 ) )   ; 256 is the block size in bytes
                                        ; and it is also the Header size.
                                        ; BLK will allow users to get 256 bytes of data per read.
IDL> HDR = BLK[0]       ; Read in the Header: HDR which will be 1-D byte array
                        ; with 256 elements from 0 to 255.
; Show the LATITUDE, LONGITUDE & TIME_GMT.
IDL> PRINT, 'Latitude : ', STRING( HDR[068:077] )
IDL> PRINT, 'Longitude: ', STRING( HDR[078:089] )
IDL> PRINT, 'TIME_GMT : ', STRING( HDR[090:135] )
IDL> SRATEHZ = LONG( HDR, 196, 1 )      ; Sample Rate in Hz. Long Integer.
IDL> SAMPLES = FIX( HDR, 200, 1 )       ; Sample Size: 3 means 16-Bit Short Integer.
; If your computer is using the Little Endian Format, you must do the Byte Swapping.
; Do the following statement to find out.
IDL> LITTLE_ENDIAN = (BYTE(1,0,2))[0] EQ 1B     ; If LITTLE_ENDIAN is 1,
                        ; then your computer is using the Little Endian Format.
IDL> BYTEORDER, SRATEHZ, /LSWAP         ; Skip this if your computer is Big Endian.
IDL> BYTEORDER, SAMPLES, /SSWAP         ; Skip this if your computer is Big Endian.
IDL> PROGNAME = STRING( HDR[152:163] )  ; Program Name='CFxLogSP3i2_3', e.g.
IDL> PRINT, 'PROGNAME: ', PROGNAME
; Check to make sure that it is the data format: 4A.
; i.e. the PROGNAME is Not equal to 1 of the following names:
; CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c, CFxLogSP3i3_4.c with or w/o the ".c"
; If PROGNAME = 1 of the names above, get the NCHAN parameter as NCHAN = HDR[248]
IDL> NCHAN = 1          ; PROGNAME is not = 1 of the names above.
IDL> PRINT, 'SRATEHZ SampleRate: ', SRATEHZ
IDL> PRINT, 'SAMPLES Size, Type: ', SAMPLES
IDL> PRINT, 'PROGNAME          : ', PROGNAME
IDL> PRINT, 'NCHAN             : ', NCHAN
; Decode data blocks. Assuming SAMPLE Size Type is 0, i.e. 1-byte data.
IDL> B = BLK[1]         ; Read in the 2nd block, the 2nd set of 256 bytes.
IDL> RCD = B - 127      ; Convert the data into the range between -127 to 128.
                        ; where RCD will be an integer array of 256 elements.
IDL> DATA = TEMPORARY( RCD )    ; Optional: Assign RCD into DATA as 1-D array.
; Decode the next data block.
IDL> B = BLK[2] & RCD = B - 127 ; Process 2 IDL statements together.
; Note that the order of which block to read can be random, i.e.
; B = BLK[10] & ... and later B = BLK[5] & ... are OK.
::: etc :::
; Decode data blocks with SAMPLE Size Type = 3 (Short Integer) or 2 (12-Bits Integer).
IDL> B = BLK[1]         ; Read in the 2nd set of 256 bytes from the data file.
IDL> I2 = UINT( B, 0, 128 )     ; convert 256 bytes into 2-bytes unsigned integers
                                ; with 128 elements.
IDL> BYTEORDER, I2, /SSWAP      ; Required if Little Endian computer is used.
; Convert I2 to a 2-bytes signed integer array. Use only 1 of the next 2 statements.
IDL> I2 = I2 - 2048     ; if Sample Size = 2.
IDL> I2 = I2 - 32768    ; if Sample Size = 3.
IDL> DATA = TEMPORARY( I2 )     ; Optional: Assign I2 into DATA as 1-D array.
::: etc :::
IDL> CLOSE, 10          ; Close the data file when you are done.
IDL> EXIT
Decoding Autonomous Data Block using matlab Software Package
The following matlab commands will decode all the data in an autonomous data file. Note that this section assumes readers know matlab and can refer to the matlab manual for details.
Note that opening the file with the 'b' (Big Endian) option of the matlab fopen() function, see (1) below, makes matlab read the binary data in the Big Endian format; therefore, no byte swapping is needed.
Enter matlab ( on a UNIX system, type matlab and the matlab working window will show up )
Open a data file (1) and skip the 1st 68 bytes (BIR working variables, VEE setting variables and PLTFRMID) from the file (2).
>>> FID = fopen( 'Type4AFormat.Data', 'r', 'b' );   % (1)
>>> status = fseek( FID, 68, 'bof' );               % (2)
Read in the next 10 bytes for the LATITUDE.
>>> a = fread( FID, 10 );
>>> disp( char( a' ) )          % LATITUDE.
Read in the next 12 bytes for the LONGITUDE.
>>> a = fread( FID, 12 );
>>> disp( char( a' ) )          % LONGITUDE.
Read in the next 46 bytes for the TIME_GMT.
>>> a = fread( FID, 46 );
>>> disp( char( a' ) )          % Show TIME_GMT.
Skip the 1st 196 bytes from the beginning.
>>> status = fseek( FID, 196, 'bof' );
Read in the next 4 bytes as a long integer for the Sample Rate.
>>> i = fread( FID, 1, 'int32' );   % Read in 1 long (4-byte) integer.
>>> i                               % Shows the SRATEHZ (Sample Rate)
Read in the next 2 bytes as a short integer for the Sample Size.
>>> i = fread( FID, 1, 'int16' );   % Read in 1 short (2-byte) integer.
>>> disp( ['Sample Size = ', num2str( i )] )
Skip the 1st 152 bytes from the beginning.
>>> status = fseek( FID, 152, 'bof' );  % Move to the correct position.
Read in the next 12 bytes for the PROGNAME.
>>> a = fread( FID, 12 );
>>> disp( char( a' ) )          % Shows the PROGNAME
Make sure the PROGNAME is Not equal to 1 of the following names:
CFxLogSP3i2_4.c, CFxLogSP3i3_2.c, CFxLogSP3i3_3.c, CFxLogSP3i3_4.c with or w/o the ".c"
>>> nchan = 1;                  % Data Format 4A is a single channel data file.
Assuming SAMPLE Size Type is 0, i.e. 1-byte data.
Decode data blocks as nchan x (pts = total data points per channel).
>>> status = fseek( FID, 256, 'bof' );          % Skip the 1st 256 header bytes.
>>> blk = fread( FID, [ nchan inf ], 'uchar' ); % Read in All the data after the header.
>>> rcd = blk - 127;                            % convert the range into -127 & 128.
Assuming SAMPLE Size Type is 3, i.e. 16-bit data.
Decode data blocks as nchan x (pts = total data points per channel).
>>> status = fseek( FID, 256, 'bof' );          % Skip the 1st 256 header bytes.
>>> blk = fread( FID, inf, 'uint16' );          % Read in All the data after the header.
>>> rcd = blk' - 32768;                         % convert the range into -32768 & 32767.
::: etc :::
>>> fclose( FID );      % Close the data file.
>>> quit                % Exit out of matlab.
Other Data File Formats
See also Data Format Type 4B, Type 3 and Type 2. Note that Data Format Type 1 was used experimentally and is no longer in use.