(C)Copyright 2000 NTT Cyber Space Laboratories

TwinVQ data format manual

TwinVQ Ver. 2.3

2000.10.12

NTT Cyber Space Labs.
Media Processing Project


1. Introduction

1.1 data structure of TwinVQ

A TwinVQ compressed file consists of a header part followed by an audio data part, as shown in Fig. 1. The two parts must be stored contiguously, with nothing between them.


Fig.1 data structure of TwinVQ

1.1.1 Structure of header

The header part is built from data structures called "chunks". The header part itself has a chunk structure, and each item of header data is a subchunk of it, as shown in Fig. 3.

As an exception, the ID of the header chunk is 12 characters long: "TWIN" followed by 8 digits. The only currently valid ID is "TWIN97012000". This chunk is also called the "TWIN" chunk.


Fig.2 structure of header part

1.1.2 Audio data part

The audio data part consists of the ID "DATA" followed by the raw compressed audio data. The audio data part has no field indicating its size. The total audio data size can be specified by the DSIZ chunk. In this manual, "XXXX chunk" means the chunk whose ID is "XXXX".
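
The layout described in 1.1 can be sketched in Python as follows. This is only an illustrative sketch, not the sample program: it assumes that the 12-character header ID is followed directly by a 4-byte big-endian size and then by that many bytes of subchunk data, as for an ordinary chunk. The function name split_twinvq is hypothetical.

```python
import struct

HEADER_ID = b"TWIN97012000"  # the only currently valid header chunk ID


def split_twinvq(blob):
    """Split a TwinVQ file image into (header subchunk bytes, raw audio bytes).

    Assumption: the 12-character ID is followed by a 4-byte big-endian
    size and then by `size` bytes of subchunk data.
    """
    if blob[:12] != HEADER_ID:
        raise ValueError("not a TwinVQ header chunk")
    if len(blob) < 16:
        raise ValueError("truncated header chunk")
    (size,) = struct.unpack(">L", blob[12:16])
    header_end = 16 + size
    header_data = blob[16:header_end]
    if len(header_data) != size:
        raise ValueError("truncated header data")
    # The audio data part follows the header part with no gap.
    if blob[header_end:header_end + 4] != b"DATA":
        raise ValueError("audio data part does not start with DATA")
    # No size field follows "DATA"; the rest of the file is audio data.
    return header_data, blob[header_end + 4:]
```

Because the audio data part carries no size of its own, a reader that needs the length before reaching the end of the file would have to take it from the DSIZ chunk in the header.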

1.2 Definition of terms

chunk
general data unit containing three elements, namely ID, size and data, as shown in Fig. 3. The ID must be a string of four characters. The size is the number of bytes of the data, stored as an unsigned long.
subchunk
chunk stored in the data part of another chunk.
character string chunk
chunk whose data consists only of a character string that fills the data size. There is no termination character.
TwinVQ setup information
set of configuration information essential to the initialization of the TwinVQ encoder and decoder. It is stored in the COMM chunk. In the sample program, the headerInfo struct is used to exchange this information between modules.
TwinVQ audio data
raw compressed audio data stored in the audio data part.
ID3v2
tag format for MP3. The TwinVQ data format can include this data as one of the header chunks.


Fig. 3 structure of chunk
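
As an illustration of the chunk definition above, the following Python sketch writes and reads chunks as a 4-character ID, a 4-byte big-endian unsigned size, and the data. It assumes that consecutive chunks are packed back to back with no padding, which this manual does not state explicitly. The names make_chunk and iter_chunks are hypothetical.

```python
import struct


def make_chunk(chunk_id, payload):
    """Serialize one chunk: 4-character ID, big-endian size, data."""
    if len(chunk_id) != 4:
        raise ValueError("chunk ID must be four characters")
    return chunk_id.encode("latin-1") + struct.pack(">L", len(payload)) + payload


def iter_chunks(data):
    """Yield (chunk_id, payload) for each chunk stored in data.

    Assumption: chunks follow one another without padding.
    """
    pos = 0
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        chunk_id = data[pos:pos + 4].decode("latin-1")
        (size,) = struct.unpack(">L", data[pos + 4:pos + 8])
        payload = data[pos + 8:pos + 8 + size]
        if len(payload) != size:
            raise ValueError("truncated chunk data")
        yield chunk_id, payload
        pos += 8 + size
```

Since a subchunk is simply a chunk stored in the data of another chunk, the same iter_chunks function could be applied again to the payload of a containing chunk.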

1.3 definition of data type

All integers of two or more bytes must be big-endian.

type             bytes  description
char             1      character
char[]                  character array whose length is specified by the number in [].
                        If [] is empty, the length is defined by chunkSize.
byte             1      unsigned integer
byte[]                  array of arbitrary data whose length is specified by the number in [].
                        If [] is empty, the length is defined by chunkSize.
short            2      2-byte integer
unsigned short   2      2-byte unsigned integer
long             4      4-byte integer
unsigned long    4      4-byte unsigned integer
StringChunk             string chunk consisting of:
                          char[4]          chunkID
                          unsigned long    chunkSize
                          byte[chunkSize]  data
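
For readers working in Python, the fixed-size types above map naturally onto format strings of the standard struct module. The sketch below is one possible mapping, not part of the format definition; it assumes that "short" and "long" are signed two's-complement values, and the names FORMATS and read_value are hypothetical.

```python
import struct

# Big-endian struct format for each fixed-size type listed above.
# Assumption: "short" and "long" are signed; the unsigned variants are not.
FORMATS = {
    "char": ">c",
    "byte": ">B",
    "short": ">h",
    "unsigned short": ">H",
    "long": ">l",
    "unsigned long": ">L",
}


def read_value(type_name, data, offset=0):
    """Decode one value of the given type; return (value, next offset)."""
    fmt = FORMATS[type_name]
    (value,) = struct.unpack_from(fmt, data, offset)
    return value, offset + struct.calcsize(fmt)
```

With the ">" prefix, struct uses standard sizes (2 bytes for h/H, 4 bytes for l/L) regardless of the platform, which matches the byte counts in the table.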



2. detailed structure of subchunk

2.1 outline of header chunk

A header chunk stores header information. Its chunk ID is 4 characters long.

Header chunks are divided into three sets: "standard chunks", "extension chunks" and "auxiliary chunks".

standard chunk
Keeps the minimum header information held in struct headerInfo.

extension chunk
Keeps additional or detailed header information.

auxiliary chunk
This chunk (the SCND chunk) is a special subset of the extension chunks that can keep bilingual sub-information. All chunks other than the SCND chunk are called normal chunks.

2.2 list of standard chunk

All standard chunks are listed in Table 1. The COMM chunk is mandatory.

The "ID3 duplication" column shows the name of the ID3v2 field that can store the same information.

Table 1. list of standard chunks

chunk ID  contents                     data field              description                                ID3 duplication
COMM      setup information of codec   long channelMode        0: mono, 1: stereo
                                       long bitRate            integer in kbit/s
                                       long samplingRate       integer in kHz (e.g. 44.1 kHz -> 44)
                                       long securityLevel      always 0
NAME      title name                   char[ ] name            free form                                  TIT2
COMT      comment                      char[ ] comment         free form                                  COMM
AUTH      author                       char[ ] author          free form                                  TPE1/2/3/4
(c)       copyright                    char[ ] copyright       free form (see note)                       TCOP
FILE      compressed file name         char[ ] fileName        free form
DSIZ      size of DATA chunk (byte)    unsigned long dataSize  data size of compressed audio data (byte)  DSIZ

Note: the chunk ID of the copyright chunk consists of four characters ("(c)" followed by a white space).
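
The COMM chunk layout in Table 1 (four consecutive long fields) can be decoded as in the Python sketch below. This is an illustration under the assumption that the fields appear in the listed order and are signed 4-byte big-endian integers; CommInfo and parse_comm are hypothetical names and do not reproduce the headerInfo struct of the sample program.

```python
import struct
from collections import namedtuple

# Field order follows Table 1: channelMode, bitRate, samplingRate, securityLevel.
CommInfo = namedtuple(
    "CommInfo", "channel_mode bit_rate sampling_rate security_level"
)


def parse_comm(payload):
    """Decode the four long fields at the start of a COMM chunk payload."""
    if len(payload) < 16:
        raise ValueError("COMM chunk is shorter than four long fields")
    return CommInfo(*struct.unpack(">4l", payload[:16]))
```

Note that sampling_rate is the integer part in kHz as stored in the file (44 for 44.1 kHz), so a decoder still has to map it to the actual rate.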

2.3 list of extension chunk

All extension chunks are listed in Table 2. The list includes the SCND chunk (auxiliary chunk), which keeps auxiliary information.

Table 2. list of extension chunks

chunk ID  contents                               data field                          description       ID3 duplication
ALBM      album title                            char[ ] albumTitle                                    TALB
YEAR      recorded date                          short year                          0 means no data   YEAR/TYER
                                                 char month
ENCD      compressed date                        short year                          0 means no data   TDAT
                                                 char month
                                                 char day
                                                 char hour
                                                 char minute
                                                 char timeZone
TRAC      track number                           short trackNumber                                     TRCK
LYRC      lyrics                                 char[ ] lyrics                                        USLT
GUID      Globally Unique Identifier             byte[16] guid                                         UFID
ISRC      International Standard Recording Code  char[] isrc                         identifier of CD  TSRC
WORD      name of songwriter                     char[ ] wordsBy                                       TEXT
MUSC      name of composer                       char[ ] musicBy                                       TCOM
ARNG      name of arranger                       char[ ] arrangedBy
PROD      name of producer                       char[ ] producedBy
REMX      name of remixer                        char[ ] remixedBy                                     TPE4
CDCT      name of conductor                      char[ ] conductedBy                                   TPE3
SING      name of singer                         char[ ] singer                                        TPE1
BAND      name of band, orchestra, group         char[ ] band                                          TPE2
PRSN      name of player                         char[ ] personnel
LABL      name of label                          char[ ] label                                         TPUB
NOTE      liner notes                            char[ ] linerNotes
SCND      auxiliary information                  while ( StringChunk subChunks[ ])   see note 1        SCND/NAME <-> TIT3
EXTR      reserved
_ID3      reserved for ID3v2                     char[ ] data                        see note 2
_YMH      reserved chunk
_NTT      reserved chunk

Note 1 (SCND): This chunk is used for an alternative language. The main character strings are stored in normal chunks, while auxiliary character strings in an alternative language, such as an original name or a translated song title, can be stored in subchunks of SCND. The encoder can select the main language.
The contents of each subchunk must be a character string.
The first two characters of the subchunk string must indicate the character code, as follows:
'0': iso-8859-1
'1': Unicode
'2': S-JIS
'3': JIS
'4': EUC
A Unicode string uses UTF-16 encoding with a Unicode BOM. If there is no BOM, words are big-endian.

Note 2 (_ID3): Priority should be defined in case of conflict with TwinVQ.
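
The Unicode rule given for SCND subchunk strings (UTF-16 with a BOM, big-endian when the BOM is absent) can be sketched in Python as follows. The sketch assumes that the character-code prefix has already been removed from the bytes passed in; decode_utf16 is a hypothetical helper name.

```python
def decode_utf16(raw):
    """Decode UTF-16 bytes, honoring a BOM and defaulting to big-endian.

    Assumption: `raw` holds only the string bytes, with the leading
    character-code prefix of the subchunk already stripped.
    """
    if raw[:2] == b"\xfe\xff":   # big-endian BOM
        return raw[2:].decode("utf-16-be")
    if raw[:2] == b"\xff\xfe":   # little-endian BOM
        return raw[2:].decode("utf-16-le")
    return raw.decode("utf-16-be")  # no BOM: words are big-endian
```

The explicit BOM handling is used here because Python's plain "utf-16" codec falls back to the platform's native byte order when no BOM is present, which would not match the big-endian default stated above.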

2.4 list of auxiliary chunk

All auxiliary chunks are stored as subchunks of the SCND extension chunk. Each subchunk has the same ID as the corresponding normal chunk. The following IDs can be used:
NAME, COMT, AUTH, (c) , ALBM, LYRC, WORD, MUSC, ARNG, PROD, REMX, CDCT, SING, BAND, PRSN, LABL, NOTE
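
A reader of the SCND chunk might walk its subchunks and keep only the IDs listed above, roughly as in this Python sketch. It assumes that the subchunks are packed back to back with no padding and leaves the payloads undecoded, since decoding depends on the character-code prefix; parse_scnd and AUX_IDS are hypothetical names.

```python
import struct

# IDs permitted inside SCND, as listed in 2.4 ("(c) " ends with a space).
AUX_IDS = {
    "NAME", "COMT", "AUTH", "(c) ", "ALBM", "LYRC", "WORD", "MUSC", "ARNG",
    "PROD", "REMX", "CDCT", "SING", "BAND", "PRSN", "LABL", "NOTE",
}


def parse_scnd(payload):
    """Return {chunk ID: raw bytes} for the known subchunks of an SCND payload.

    Assumption: subchunks follow one another without padding.
    """
    result = {}
    pos = 0
    while pos < len(payload):
        if pos + 8 > len(payload):
            raise ValueError("truncated subchunk header")
        chunk_id = payload[pos:pos + 4].decode("latin-1")
        (size,) = struct.unpack(">L", payload[pos + 4:pos + 8])
        body = payload[pos + 8:pos + 8 + size]
        if len(body) != size:
            raise ValueError("truncated subchunk data")
        if chunk_id in AUX_IDS:  # subchunks with other IDs are skipped
            result[chunk_id] = body
        pos += 8 + size
    return result
```

Skipping unlisted IDs rather than rejecting the file is a design choice of this sketch; a stricter reader could raise an error instead.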

3. policy of header chunk