
---------------------------------------------------------------------------
<-=-=-=-=- Matthew Mclin to All -=-=-=-=>
 MM> Does anybody know the format of MOD/SAM/WAV/VOC file? Info on any
 MM> of those formats (how to read/write/play them using a PC Speaker or
 MM> LPT 1 with a mono DAC) would be greatly appreciated.

      You know, you are quite lucky that I just decided to pick up the
   Pascal echo even though I'm not a Pascal programmer. I have ALL of
   these file formats! Lucky you! I have had to search high and low all
   over the place for this junk and you're getting it all in one shot.
      Not only do I have those file formats, but I also understand how to
   play them back on the PC's Internal Speaker, LPT DACs, and Sound
   Blaster. I'll be posting that too.
      I have been interested in this field for quite a while, that's how I
   gather up all this information. If I had enough ambition, time, and
   patience, I'd probably write a book on it all because there is not ONE
   SINGLE book that explains how to play digital sound directly (ie,
   without special drivers), with such drivers, what the file formats are,
   and includes code to do all that stuff.
      Gee, I bet that would make a lot of money, perhaps I should do that
   after all.... Those guys on the 80XXX Assembler echo would probably be
   able to do a better job as they are more knowledgeable on this, but most
   of them are into writing demos and creating faster/better MOD players..
      Ok, since this will take up a lot of room, I'll be splitting it up
   into separate messages. The simplest stuff goes in this message.

 MM> I would also like info on raw sound data and how to edit/play it.

      Newbie to Digital Sound, eh? Well, you've come to the right place for
   information, or rather, the right person has come to you. Ok, the
   basics. A digital sound file is basically just a bunch of volume
   settings. On the PC, a volume setting of 128 is normally silence.
   Values farther away from 128 in either direction are louder depending
   on its distance from 128. 0 and 255 are the loudest volumes.
      One thing I should make clear, 128 is not necessarily silence. When
   making a recording, there is always background noise. So, what may
   sound like silence to you, is actually 126-130 or so.
      Now, you have probably seen those neat little graphs that some
   programs make when displaying a digital sound file. VEdit (which comes
   with the Sound Blaster) shows the waveform in the modify part of it. If
   you wanted to display a graph yourself, you could just load in a byte
   from the file, then, use that byte for the Y location. The X location
   is where in the file you are at (which byte). You just keep loading in
   bytes until the end of the screen.
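
      The graphing idea above can be sketched in a few lines of Python
   (not the language of this echo, just a convenient one for a runnable
   example). This is only a sketch: the function name, the text-mode
   "screen" and its width and height are assumptions made for
   illustration. It flips the Y axis so that 255 lands on the top row,
   since screen rows count downward.

```python
def waveform_rows(samples, width=64, height=16):
    """Render unsigned 8-bit samples as rows of text, one column per byte."""
    cols = samples[:width]                       # stop at the edge of the "screen"
    grid = [[" "] * len(cols) for _ in range(height)]
    for x, value in enumerate(cols):             # X is the position in the file
        y = (255 - value) * height // 256        # Y comes from the byte itself
        grid[y][x] = "*"
    return ["".join(row) for row in grid]
```

   Calling print("\n".join(waveform_rows(data))) on the bytes of a raw
   sound file would show the first 64 samples.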
      I could go on and on, but this is just a message, not a book! Hmm,
   you said you wanted to play a digital sound file on the PC's Internal
   Speaker and on a printer port DAC. Well, here comes that part. I'll
   explain usage of printer port DACs first because they are easier to
   understand.
      To play a VOC, WAV, SND, etc file on the DAC, you just read in one
   byte from the file, output it to the printer port, and do it again but
   on the next byte. To get the I/O address of the printer port, read the
   word at memory location 40h:8h for LPT1, 40h:0Ah for LPT2, 40h:0Ch for
   LPT3, and if on a non-ps/2, 40h:0Eh for LPT4.
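
      As a rough illustration of the address lookup, here is a Python
   sketch that decodes the four port words from a copy of the BIOS data
   area. It assumes `bda` holds the bytes starting at 40h:0h (how you
   obtain that dump is up to you), and it treats a word of 0 as "no port
   installed", which is the usual convention but is not stated above.
   The function name is made up for the example.

```python
import struct

def lpt_ports(bda):
    """Return the LPT base I/O addresses found in a BIOS data area dump."""
    # Four little-endian words at offsets 08h, 0Ah, 0Ch and 0Eh.
    words = struct.unpack_from("<4H", bda, 0x08)
    # Assumption: a zero word means that port is not present.
    return {"LPT%d" % (i + 1): port for i, port in enumerate(words) if port != 0}
```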
      The internal speaker is a bit more tricky, you have to do certain
   things to set it up correctly before outputting sound. Before you do
   ANY sound output, you must do the following (sorry, I'm not a Pascal
   programmer, so this is in Assembler):

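   Mov   al, 0B6h                      ;Please make note: This code was
   Out   43h, al                       ;written by a friend of mine in
   Mov   al, 0FFh                      ;Australia named Phil Inch. He
   Out   42h, al                       ;posted code in the 80x86 Assembler
   Mov   al, 0                         ;echo (GTPN, not Fido) for the
   Out   42h, al                       ;public domain. Thanks Phil!!
   Mov   al, 90h                       ;timer 2: low byte only, mode 0
   Out   43h, al
   In    al, 61h                       ;read the speaker control port
   Or    al, 3                         ;gate timer 2 and enable the speaker
   Out   61h, al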

      Ok, the above sets the timer chip up correctly. From there it is
   pretty simple. Get a byte from the sound file. Divide the byte by a
   'shift' number (I'll explain about this later). Then, output this new
   byte to port 42h. Repeat this for the whole file.
      Ok, now, about that shift value. The PC's Internal Speaker wasn't
   designed for playing digital sound on it, it's just that brainy guys
   like Phil have figured out how to do with software what should have
   been done with hardware.. Anyway, the PC's Internal Speaker isn't very
   loud, so the range of volumes is much less than on a Sound Blaster or
   printer port DAC. This shift value varies from computer to computer, it
   depends on the size of your speaker and other stuff. Generally, a
   shift value of 4 works on all computers. On my computer, I can get
   away with 3 on most files. The smaller the shift value, the louder
   the file will be played, but too small a shift value will cause
   distortion. Experiment!
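
      A small Python sketch of the scaling step, reading "divide" above
   literally. That reading is an assumption: if a bit shift is what was
   really meant, swap the `//` for `>>`. The function name is
   hypothetical, and the sketch only prepares the bytes; it does not
   touch the hardware.

```python
def scale_for_speaker(samples, shift=4):
    """Divide each unsigned 8-bit sample by the shift value."""
    if shift < 1:
        raise ValueError("shift must be at least 1")
    # A smaller shift leaves larger values, i.e. louder playback.
    return bytes(s // shift for s in samples)
```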
      After you are finished playing the sound file, you must put the
   timer chip back the way it was supposed to be, or otherwise the next
   program that tries to make a noise on the internal speaker will make
   the noise but will not stop! Here is the code for that (again, sorry
   about the Assembler, it's just that I'm not a Pascal programmer):

   Mov   al, 0B6h                      ;timer 2 back to mode 3 (square wave)
   Out   43h, al
   In    al, 61h
   And   al, 0FCh                      ;clear bits 0-1: speaker off
   Out   61h, al

      There, that should do it. I hope I haven't totally confused you.
   Please write back if you have ANY questions what-so-ever. Gee, I'm
   already on line 107, time to go to a new message!

 MM> Note that these .MOD
 MM> and .SAM files are in the Amiga Module format (just incase there are
 MM> any others). Oh, there's also the .SND files. Or even .MID/.MDI files
 MM> if you can play them thru a DAC on an LPT port or the PC Speaker. Note
 MM> that I don't have a Sound Blaster (or any other sound card). Thanks.

SAM Files:

      As far as I know, these do not contain any header or specific
   structure. They are just raw sound files. The only trick you have to
   remember about these files is that they are signed, which means that
   when bit 7 (the top bit) is set, the number is negative. When bit 7 is
   clear, the number is positive. This is completely different from
   digital sound files that originated on the PC. Remember, MOD and SAM
   files originated from the Amiga, so they have this weird encoding.

      To convert a signed file to an unsigned file, just read in one byte
   from the original file. Add 128 to that byte (keeping only the low 8
   bits). Output the answer to a new file. In the Amiga world, a byte of 0
   is equivalent to silence. A byte of -128 (or +127) is as loud as it
   gets on the Amiga. On the PC, however, 0 (or 255) is as loud as it
   gets. A byte of 128 is equivalent to silence on the PC. So, when we add
   128 to a -128, we get a zero, which is the same volume on the PC as
   -128 is on the Amiga.
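
      The conversion can be sketched in Python like this. The function
   name is made up for the example; the arithmetic is just the "add 128"
   rule from above, with the result wrapped to 8 bits.

```python
def signed_to_unsigned(data):
    """Convert signed 8-bit samples (Amiga style) to unsigned (PC style)."""
    # Each raw byte b holds a signed value; adding 128 modulo 256 moves
    # silence from 0 to 128 and the most negative value from -128 to 0.
    return bytes((b + 128) & 0xFF for b in data)
```

   Running the same function over the output converts it back again,
   since adding 128 twice wraps around to the original byte.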

WAV Files:

      The following text was written by Edward Schlunder and was based on
   information provided by Tony Cook on the GT Power Network's 80x86
   Assembler echo.

                               WAV File Format
                       By: Edward Schlunder. 5-17-93

 BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
 ---------------------------------------------------------------------------

 00 - 03        "RIFF"                        Just an identification block.
                                              The quotes are not included.

 04 - 07        ???                           This is a long integer. It
                                              tells the number of bytes that
                                              follow this field, i.e. the
                                              file length minus 8.

 08 - 11        "WAVE"                        Just another I.D. thing.

 12 - 15        "fmt "                        Just another I.D. thing. Note
                                              the trailing space.

 16 - 19        16, 0, 0, 0                   Size of the format block that
                                              follows (bytes 20 - 35).

 20 - 21        1, 0                          Format tag. 1 means plain
                                              (uncompressed) PCM.

 22 - 23        1, 0                          Channels. 1 is mono, 2 is
                                              stereo.

 24 - 27        ???                           Sample rate, or (in other
                                              words), samples per second.

 28 - 31        ???                           Average bytes per second.

 32 - 33        1, 0                          Block align: bytes per sample
                                              across all channels.

 34 - 35        8, 0                          Bits per sample. Ex: Sound
                                              Blaster can only do 8, Sound
                                              Blaster 16 can make 16.
                                              Normally, the only valid values
                                              are 8, 12, and 16.

 36 - 39        "data"                        Marker that comes just before
                                              the actual sample data.

 40 - 43        ???                           The number of bytes in the
                                              sample.

     Information from Tony Cook, Australia. GT Power 80x86 Assembler echo.
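
      For illustration, here is a Python sketch that decodes the 44 bytes
   laid out in the table above. It assumes the file has exactly that
   layout (the "data" marker sitting at byte 36); files with extra
   blocks in the header would need a smarter reader. The function and
   key names are made up for the example.

```python
import struct

def parse_wav_header(header):
    """Decode a 44-byte WAV header; all integers are little-endian."""
    (riff, riff_size, wave, fmt_id, fmt_size, format_tag, channels,
     sample_rate, bytes_per_sec, block_align, bits_per_sample,
     data_id, data_size) = struct.unpack("<4sI4s4sIHHIIHH4sI", header[:44])
    if (riff, wave, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a WAV file with the simple 44-byte header")
    return {
        "format_tag": format_tag,
        "channels": channels,
        "sample_rate": sample_rate,
        "bytes_per_sec": bytes_per_sec,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,       # number of sample bytes after byte 43
    }
```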


VOC File Format:

      This file format was written by Phil Inch on the 80x86 Assembler
   echo on the GTPN. Thanks Phil!!

BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
---------------------------------------------------------------------------

00 - 19        "Creative Voice File", 26     Just an identification block.
                                             The quotes are not included,
                                             and the 26 is ASCII 26 (1Ah) which
                                             is an end-of-file marker.  There-
                                             fore, if you TYPE a VOC file, you
                                             will just see Creative Voice File.

20 - 21        26, 00                        This is a low byte, high byte
                                             sequence which gives the offset
                                             of the first block of sound data
                                             in the file.  Currently this is
                                             26 ( 00 x 256 + 26 ) which is the
                                             length of the header, but it's
                                             probably good programming practice
                                             to read and use this value anyway
                                             in case the format changes later.

22 - 23        10,1                          These bytes give the version
                                             number of the VOC file, subnumber
                                             first, then main number.  The
                                             default, as you can see, is 1.10.

24 - 25        41,17                         These bytes are "check digits".
                                             These allow you to be absolutely
                                             SURE that you are working with a
                                             VOC file.  To use them, convert
                                             the version number (above) and
                                             this number to integers.  Do this
                                             with the formula below, where for
                                             convention the above bytes have
                                             been listed as byte1, byte2.

                                             (byte2*256)+byte1

                                             Therefore, for the default values
                                             we get the following integers:

                                             (1 x 256)+10     =  266
                                             (17 x 256)+41    = 4393

                                             When you add the two results, you
                                             get 4659.  If you do these calcs
                                             and get 4659, then you can be
                                             almost certain you're working with
                                             a VOC file.
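
The header checks above could be sketched in Python roughly as follows. The
function name is made up for the example, and the test is exactly the "sum
must be 4659" rule from the table, nothing more.

```python
import struct

VOC_MAGIC = b"Creative Voice File\x1a"    # 19 letters plus the 1Ah marker

def parse_voc_header(header):
    """Validate a 26-byte VOC header; return (data_offset, (major, minor))."""
    if header[:20] != VOC_MAGIC:
        raise ValueError("missing 'Creative Voice File' signature")
    # Three low-byte-first words: data offset, version, check digits.
    data_offset, version, check = struct.unpack_from("<3H", header, 20)
    if version + check != 4659:
        raise ValueError("check digits do not match the version number")
    return data_offset, (version >> 8, version & 0xFF)
```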

OK, that takes care of the header information.  I hope you realise that I'll
never get a registration for VOCHDR now!  Oh well <sigh> perhaps people will
buy my games!

   Having gotten to byte 26, we now start encountering data blocks.  There
are eight types in all, conveniently numbered 0 - 7.  For each block, the
first byte will always tell you the type.

For notational convenience, bx means byte x, eg b5 means byte 5.

BLOCK 0 - THE "END BLOCK"

   Structure:     Byte 1: '0' to denote "end block" type

   This block is located at the END of a VOC file.  When a VOC player
   encounters a block 0, it should stop playing the VOC file.


BLOCK 1 - THE "DATA BLOCK"

   Structure:     Byte 1: '1' to denote "data block" type

                       2: \
                       3: | These bytes give the length:
                       4: / b2 + (b3*256) + (b4*65536)

                       5: Sampling rate: Calculated as 1000000 / (256-b5)

                       6: Pack type byte:
                              0 = data is not packed
                              1 = data is packed to four bits
                              2 = data is packed to 2 bits
                              3 = data is packed to 1 bit

                       7: Actual sample data starts here
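
   As a hedged illustration, a data block could be decoded in Python like
   this. Two assumptions that the text above does not spell out: the length
   counts everything after byte 4 (so it includes the sampling rate and pack
   type bytes), and the rate is rounded down to a whole number. The function
   name is hypothetical.

```python
def parse_data_block(block):
    """Decode a type-1 block; block[0] is "byte 1" in the layout above."""
    if block[0] != 1:
        raise ValueError("not a data block")
    length = block[1] + (block[2] * 256) + (block[3] * 65536)
    sample_rate = 1000000 // (256 - block[4])    # byte 5
    pack_type = block[5]                         # byte 6
    # Assumption: length covers bytes 5 onward, so the samples end at
    # index 4 + length.
    samples = block[6:4 + length]
    return sample_rate, pack_type, samples
```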


BLOCK 2 - THE "MORE DATA BLOCK"

   Structure:     Byte 1: '2' to denote "more data block" type

                       2: \
                       3: | These bytes give the length:
                       4: / b2 + (b3*256) + (b4*65536)

                       5: Actual sample data starts here

   The point of this is simple:  If you have a sample that you want to chop
   up into smaller portions (the maximum block length in a VOC file is
   16,777,215 bytes but who's counting?), then define a "more data" block.
   This "carries over" the previously found sampling rate and pack type byte,
   so a "data block" should have been encountered earlier somewhere along
   the line.


BLOCK 3 - THE "SILENCE" BLOCK

   Structure:     Byte 1: '3' to denote "silence block" type

                       2: \
                       3: | These bytes give the length:
                       4: / b2 + (b3*256) + (b4*65536)

                          (Note that this value is usually 3 for a
                          silence block.)

                       5: Duration ( b5+(b6*256) ).  This gives the equivalent
                       6: number of bytes to "play" during the silence.

                       7: Sampling rate: Calculated as 1000000 / (256-b7)

   A silence block is used for long periods of silence.  When long silences
   are required, it's more efficient in size terms to insert one of these
   blocks, as seven bytes can then represent up to 65,536.

BLOCK 4 - THE "MARKER BLOCK"

   Structure:     Byte 1: '4' to denote "marker block" type

                       2: \
                       3: | The length of the block, as usual
                       4: /

                       5: Marker value, as low-high (ie b5 + (b6*256) )
                       6:

   The marker block is read by CT-VOICE.DRV.  When a marker block is
   encountered, the value in the marker value bytes (5 and 6) is copied into
   the status word specified when CT-VOICE was initialized.

   This allows your program to judge where in the sample you currently are,
   thus allowing for progress counters and the like.  It's also useful if
   you're trying to synchronize other processes to the playing of the sound.

   For example, by using appropriate marker blocks, you could send signals
   to your software to move the lips of a person on-screen in time with the
   speech in the VOC.  However, this does take some doing and a VERY good
   VOC editor!


BLOCK 5 - THE "MESSAGE BLOCK"

   Structure:     Byte 1: '5' to denote "message block" type

                       2: \
                       3: | The length of the block, as usual
                       4: /

                   5 - ?: Message, as ASCII text.

                       ?: 0, to denote end of text

   The message block simply allows you to embed text into a VOC file.
   Presumably you could use this to detect when other people have pinched
   your VOC files for their own applications.


BLOCK 6 - THE "REPEAT BLOCK"

   Structure:     Byte 1: '6' to denote "repeat block" type

                       2: \
                       3: | The length of the block, as usual
                       4: /

                       5: Number of times that data should be repeated
                       6: Total = 1 + b5 + (b6*256)

   Every "playable" data block between a block 6 and a block 7 will be repeated
   the number of times specified in b5 and b6.  Note that you add one to this
   value - the data blocks are ALWAYS played at least once.  However, if b5
   and b6 are zero, then you really don't need a repeat block, do you!

   I'm told that you cannot "nest" repeat blocks, but I've never tried it.
   This limitation would only apply to CT-VOICE.DRV I would have thought, but
   it depends how good other VOC players are.


BLOCK 7 - THE "END REPEAT BLOCK"

   Structure:     Byte 1: '7' to denote "end repeat block" type

                       2: \
                       3: | The length of the block, as usual
                       4: /

   This, as explained, marks the end of the block of blocks (!) that you wish
   to repeat.  Note that the "length" is always zero, so I don't know why
   the length bytes are required at all.
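
   Putting the block layouts together, a reader could walk a VOC file along
   these lines. This is only a sketch in Python: the function name is made
   up, it assumes each length counts the bytes that follow the 4-byte
   type/length prefix, and it hands back the raw payload without
   interpreting it.

```python
def iter_voc_blocks(data, offset=26):
    """Yield (block_type, payload) pairs, starting at the first block."""
    while offset < len(data):
        block_type = data[offset]
        if block_type == 0:                # end block: one byte, no length
            yield 0, b""
            return
        length = int.from_bytes(data[offset + 1:offset + 4], "little")
        yield block_type, data[offset + 4:offset + 4 + length]
        offset += 4 + length               # jump to the next block's type byte
```

   In real use `offset` should come from bytes 20 - 21 of the header rather
   than the default of 26.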
---------------------------------------------------------------------

This was picked up off the 80XXX Assembler echo on FidoNet. There are many
other file formats for MODs, but I have found this one to be most complete.

Protracker 2.3A Song/Module Format:
-----------------------------------

Offset  Bytes  Description
------  -----  -----------
   0     20    Songname. Remember to put trailing null bytes at the end...
               When written by ProTracker this will be only uppercase;
               there are only historical reasons for this. (And the
               historical reason is that Karsten Obarski, who made the
               first SoundTracker, was stupid.)

Information for sample 1-31:

Offset  Bytes  Description
------  -----  -----------
  20     22    Samplename for sample 1. Pad with null bytes. Will only be
               uppercase.  The samplenames are often used for storing
               messages from the author; in particular, samplenames
               starting with a '#' sign will generally be a message.  This
               convention is a result of a player called IntuiTracker
               displaying all samples starting with # as a message to the
               person playing the module.
  42      2    A WORD with samplelength for sample 1.  Stored as number of
               words.  Multiply by two to get real sample length in bytes.
               This is a big-endian number; for all PC programmers out
               there, this means that to get your 8-bit-originated format,
               you have to swap the two bytes.
  44      1    Lower four bits are the finetune value, stored as a signed
               four bit number. The upper four bits are not used, and
               should be set to zero.
               They should also be masked out when reading; you can never be
               sure what some stupid program could have stored here...
  45      1    Volume for sample 1. Range is $00-$40, or 0-64 decimal.
  46      2    Repeat point for sample 1. Stored as number of words offset
               from start of sample. Multiply by two to get offset in bytes.
  48      2    Repeat Length for sample 1. Stored as number of words in
               loop. Multiply by two to get replen in bytes.

Information for the next 30 samples starts here. It's just like the info for
sample 1.

Offset  Bytes  Description
------  -----  -----------
  50     30    Sample 2...
  80     30    Sample 3...
   .
   .
   .
 890     30    Sample 30...
 920     30    Sample 31...
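
The 31 sample records could be decoded in Python roughly like this. It is a
sketch only: the function and key names are made up for the example, and it
assumes a 31-sample module (the older 15-sample layout is not handled).

```python
import struct

def parse_mod_samples(module):
    """Decode the 31 sample records (30 bytes each) starting at offset 20."""
    samples = []
    for i in range(31):
        # '>' selects big-endian, so the WORD fields need no manual swap.
        name, length, finetune, volume, rep_start, rep_len = struct.unpack_from(
            ">22sHBBHH", module, 20 + 30 * i)
        finetune &= 0x0F                  # mask out the unused upper bits
        if finetune > 7:
            finetune -= 16                # signed four-bit value, -8..7
        samples.append({
            "name": name.rstrip(b"\0").decode("ascii", "replace"),
            "length": length * 2,         # stored in words, returned in bytes
            "finetune": finetune,
            "volume": volume,
            "repeat_start": rep_start * 2,
            "repeat_length": rep_len * 2,
        })
    return samples
```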

Offset  Bytes  Description
------  -----  -----------
.
 950      1    Songlength. Range is 1-128.
 951      1    This byte is set to 127, so that old trackers will search
               through all patterns when loading.
               Noisetracker uses this byte for restart, ProTracker doesn't.
 952    128    Song positions 0-127.  Each hold a number from 0-63 (or
               0-127) that tells the tracker what pattern to play at that
               position.
1080      4    The four letters "M.K." - This is something Mahoney & Kaktus
               inserted when they increased the number of samples from
               15 to 31. If it's not there, the module/song uses 15 samples
               or the text has been removed to make the module harder to
               rip. Startrekker puts "FLT4" or "FLT8" there instead.
               If there are more than 64 patterns, PT2.3 will insert M!K!
               here. (Hey - Noxious - why didn't you document the part here
               relating to YOUR OWN PROGRAM? -Vishnu)

Offset  Bytes  Description
------  -----  -----------
1084    1024   Data for pattern 00.
   .
   .
   .
xxxx  Number of patterns stored is equal to the highest patternnumber
      in the song position table (at offset 952-1079) plus one, since
      pattern numbering starts at 00.
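
A Python sketch of the pattern count, for illustration. It scans all 128
song positions and takes the count as the highest pattern number plus one,
because patterns are numbered from 0. The function names are hypothetical,
and the second helper assumes the sample data begins right after the last
pattern.

```python
def pattern_count(module):
    """Return how many 1024-byte patterns are stored in the module."""
    positions = module[952:952 + 128]     # song position table
    return max(positions) + 1             # numbering starts at pattern 00

def sample_data_offset(module):
    """Offset of the first byte after the pattern data."""
    return 1084 + 1024 * pattern_count(module)
```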

  Each note is stored as 4 bytes, and all four notes at each position in
the pattern are stored after each other.

00 -  chan1  chan2  chan3  chan4
01 -  chan1  chan2  chan3  chan4
02 -  chan1  chan2  chan3  chan4
etc.

Info for each note:

 _____byte 1_____   byte2_    _____byte 3_____   byte4_
/                \ /      \  /                \ /      \
0000          0000-00000000  0000          0000-00000000

Upper four    12 bits for    Lower four    Effect command.
bits of sam-  note period.   bits of sam-
ple number.                  ple number.
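
The bit layout in the diagram can be unpacked in Python as in the sketch
below. The function name is made up; the bit positions are taken straight
from the diagram.

```python
def decode_note(note):
    """Split one 4-byte pattern entry into (sample, period, effect)."""
    b1, b2, b3, b4 = note
    sample = (b1 & 0xF0) | (b3 >> 4)      # upper four bits + lower four bits
    period = ((b1 & 0x0F) << 8) | b2      # 12-bit note period
    effect = ((b3 & 0x0F) << 8) | b4      # 12-bit effect command
    return sample, period, effect
```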



      One thing you should keep in mind about MOD files is that they
   originated from the Amiga, so the samples are signed, see the
   discussion about SAM files for more information.

Note:
       Sounder and Sound Tool both use the same file extension, but have
 different file formats. To tell the difference, read the first 6 bytes
 of the file. If it matches the magic number for Sound Tool .SND files,
 it is a Sound Tool file. Else, it's a Sounder file or a raw file.
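
 The test could look like this in Python. It is a sketch: the function name
 is made up, and the magic number is taken to be the six bytes "SOUND"
 followed by 26 (1Ah), as listed in the Sound Tool table.

```python
def snd_kind(data):
    """Guess which kind of .SND file `data` came from."""
    if data[:6] == b"SOUND\x1a":
        return "Sound Tool"
    return "Sounder or raw"               # no way to tell these two apart here
```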


Sounder File Format:

 BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
 ---------------------------------------------------------------------------

 00 - 01        0, 0                          Bits per sample. Ex: Sound
                                              Blaster can only do 8, Sound
                                              Blaster 16 can make 16.
                                              Normally, the only valid value
                                              is 0, which is the code for an
                                              8 bit sample. Future versions
                                              of Sounder and DSOUND.DLL may
                                              allow 16 bit samples and such.

 02 - 03        ???                           Sampling rate. Currently, only
                                              22 KHz, 11 KHz, 7.33 KHz, and
                                              5.5 KHz are valid. If given a
                                              value like 9 KHz, it will be
                                              played at the next closest rate
                                              (in this case, 11 KHz). The
                                              sampling rate is calculated as
                                              follows:

                                              SampRate = Byte1 + (256 * Byte2)

 04 - 05        ???                           Volume to play the sample back
                                              at. Note: On the PC's Internal
                                              Speaker, there is a definite
                                              upper limit as to the volume,
                                              depending on the shift value
                                              (see below). The Sound Blaster
                                              and the Disney Sound Source
                                              aren't quite as restricted,
                                              but still are at some high
                                              value.

 06 - 07        4, 0                          Shift value. This is the number
                                              that each byte is divided by to
                                              "scale" the volume down to a
                                              point where the PC's Internal
                                              Speaker can handle it. See the
                                              discussion on playing back
                                              digitalized sound for more
                                              details.

   Information from Sounder text files and Sound Tool help (.HLP) files.
                       Rewritten by Edward Schlunder
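
 A minimal Python sketch of reading the four Sounder header words follows.
 The names are made up for the example, and it decodes only the 8 bytes in
 the table; where the sample data starts is not stated above, so the sketch
 does not guess.

```python
import struct

def parse_sounder_header(data):
    """Decode the four low-byte-first words at the start of a Sounder file."""
    bits_code, sample_rate, volume, shift = struct.unpack_from("<4H", data, 0)
    return {
        "bits_code": bits_code,           # 0 is the code for 8-bit samples
        "sample_rate": sample_rate,       # Byte1 + (256 * Byte2)
        "volume": volume,
        "shift": shift,
    }
```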


Sound Tool File Format:

 BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
 ---------------------------------------------------------------------------

 00 - 05        "SOUND", 26                   Just an identification thing.
                                              Helps a lot when you are trying
                                              to distinguish between Sounder
                                              .SND files and Sound Tool .SND
                                              files.

 08 - 11        ???                           This is the number of bytes in
                                              the sample. It is calculated as
                                              follows:

       ByteSam = Byte1 + (256 * Byte2) + (65536 * Byte3) + (16777216 * Byte4)

 12 - 15        ???                           This points to the first byte
                                              to play in the file. It is
                                              calculated the same way as the
                                              number of bytes in the sample
                                              (see above).

 16 - 19        ???                           This points to the last byte in
                                              the sample to play. Calculated
                                              the same as above.

 20 - 21        ???                           Sampling rate of the sample.
                                              Valid values are 22 KHz, 11 KHz,
                                              7.33 KHz, and 5.5 KHz, but if
                                              given a number not listed
                                              above, it will be played at the
                                              closest valid sampling rate.
                                              So, 9 KHz would be played at
                                              11 KHz.
                                              This is calculated as follows:
                                              SamRate =  Byte1 + (256 * Byte2)

 22 - 23        ???                           Bits per sample. Ex: Sound
                                              Blaster can only do 8, Sound
                                              Blaster 16 can make 16.
                                              Normally, the only valid value
                                              is 0, which is the code for an
                                              8 bit sample. Future versions
                                              of Sounder and DSOUND.DLL may
                                              allow 16 bit samples and such.

 24 - 25        ???                           Volume to play the sample back
                                              at. Note: On the PC's Internal
                                              Speaker, there is a definite
                                              upper limit as to the volume,
                                              depending on the shift value
                                              (see below). The Sound Blaster
                                              and the Disney Sound Source
                                              aren't quite as restricted,
                                              but still are at some high
                                              value.

 26 - 27        4, 0                          Shift value. This is the number
                                              that each byte is divided by to
                                              "scale" the volume down to a
                                              point where the PC's Internal
                                              Speaker can handle it. See the
                                              discussion on playing back
                                              digitalized sound for more
                                              details.

 28 - 123       ???                           This is the name of the sample.
                                              It is followed by an ASCII 0.

   Information from Sounder text files and Sound Tool help (.HLP) files.
                         Rewritten by Edward Schlunder
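
 For illustration, the Sound Tool header could be decoded in Python as in
 the sketch below. The function and key names are made up. The four-byte
 fields are read as ordinary 32-bit low-byte-first integers, and bytes
 06 - 07, which the table does not describe, are skipped.

```python
import struct

def parse_sound_tool_header(data):
    """Decode the 124-byte Sound Tool header described in the table."""
    if data[:6] != b"SOUND\x1a":
        raise ValueError("not a Sound Tool .SND file")
    n_bytes, first, last = struct.unpack_from("<3I", data, 8)
    sample_rate, bits_code, volume, shift = struct.unpack_from("<4H", data, 20)
    # The name ends at the first ASCII 0 inside bytes 28 - 123.
    name = data[28:124].split(b"\0", 1)[0].decode("ascii", "replace")
    return {
        "n_bytes": n_bytes, "first": first, "last": last,
        "sample_rate": sample_rate, "bits_code": bits_code,
        "volume": volume, "shift": shift, "name": name,
    }
```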
