"Assembler tutorial" by CAMERON CLARK

[Back to FAQ SWAG index] [Back to Main SWAG index] [Original]

The non-general use registers, What are they for?

***********************************************************************
*GENERAL REGISTERS *register*               definition                *
*                  *   AX   *  accumulator                   (16 bit) *
*                  *   AH   *  accumulator high-order byte   ( 8 bit) *
*                  *   AL   *  accumulator low order byte    ( 8 bit) *
*                  *   BX   *  base                          (16 bit) *
*                  *   BH   *  base high-order byte          ( 8 bit) *
*                  *   BL   *  base low-order byte           ( 8 bit) *
*                  *   CX   *  count                         (16 bit) *
*                  *   CH   *  count high order byte         ( 8 bit) *
*                  *   CL   *  count low order byte          ( 8 bit) *
*                  *   DX   *  data                          (16 bit) *
*                  *   DH   *  date high order byte          ( 8 bit) *
*                  *   DL   *  data low order byte           ( 8 bit) *
***********************************************************************
*SEGMENT REGISTERS *register*               definition                *
*                  *   CS   *  code  segment (16 bit)                 *
*                  *   DS   *  data  segment (16 bit)                 *
*                  *   SS   *  stack segment (16 bit)                 *
*                  *   ES   *  extra segment (16 bit)                 *
***********************************************************************
*INDEX REGISTERS   *register*               definition                *
*                  *   DI   *  destination index (16 bit)             *
*                  *   SI   *  source      index (16 bit)             *
***********************************************************************
*POINTERS          *register*               definition                *
*                  *   SP   *  stack       pointer (16 bit)           *
*                  *   BP   *  base        pointer (16 bit)           *
*                  *   IP   *  instruction pointer (16 bit)           *
***********************************************************************
*FLAGS               AF, CF, DF, IF, OF, PF, SF, TF, ZF               *
 
     Once again here is the list of registers [and flags] on you 80xxx
     processor. As was said before registers AX-DX are general use;
     however they do have some other special uses. In part II we saw
     that CX is known as the counting register because in is can be
     automatically inc/dec durring a loop. It can also be used to
     repeat a single type of machine instruction. [scambling for Doc]
     The "prefix" and the instructions are as follows:
 
     Prefix = REP
 
     Instructions = Movs,Lods,Stos,Ins, or outs
 
     We already know that Mov is used to move a value to a register or
     a register value some address, but what's a (lod,sto,ins,out)?
     That's were are other registers come into play.
 
     In order to do operation like in pascal, you must use to special
     pointers to get information that is far away from the current
     code being executed. These pointers can be thought of as [es:di],
     and [ds:si]. I like to label them as

     [es:di] -> put
     [ds:si] -> get
 
     These two are used to get or put the value that they point to.
     Notice that DS is called the data segment register, and it is
     used to "get" values from where DS points. To get back to pascal,
     lets use some examples.
 
     var x : array[1..22] of byte;
     [...]
     fillchar(X,32,#0); {I may have this in reverse}
 
     This procedure takes the address of X and fills is with 32 zero
     values.
     now...
     asm
       push es
       push di
 
       mov ax,seg X
       mov es,ax
       mov di,offset X
 
       mov cx,32d  {number of times to repeat}
       mov ax,0    {the "char" we want to move}
 
       REP         {tells the processor to repeat the next repeatable}
                   {instruction CX times                             }
           stosB   {store whats in AX to [es:di] and increment di}
 
       pop di
       pop es
     end;
 
     There's alot of extra stuff there and you may want to know why.
     The non-general use registers are used by pascal and they
     _must_ remain the same value in order for pascal to run correctly.
     That's why we push and pop
 
     what's push & pop? If you know, skip the next paragraph.
     Push and pop "Push" and "pop" values onto the stack. The stack is
     part of the operation system and it is used by most calling programs.
     Misuse of the stack can cause you program to crash the computer.
     Think of the stack as like a deck of cards. IF you put a Jack, then
     and Ace, and then a King in a pile you have just pushed Jack,Ace, and
     King. Know when you pick up the cards you "pop" King,Ace,Jack. This
     is known as a "last in first out" way of storage, and any storage
     using the scheme can be called a stack. A stack must always take
     the last value first and it can't jump around. What you put into
     the stack always comes out in reverse order. That's why when I
     push ES and DI, I must Pop Di then ES. The stack is a quick way
     of storing and retrieving values.
 
     Next, whats that "Seg" and "offset"?
     These are Key words that tell the compiler to substitute at that
     position the absolute address of the Variable refrenced. It's kinda
     like pascal's Addr(variable) procedure. In order for you to access
     variables that aren't part of a procedure call, its best to use
     this method. Remeber that ES:DI must point to the variable you
     are wanting to change in the code about - so you must tell the
     compiler - "I want you to give the location of this variable, and
     not the variables value" using the segment and offset key words.
 
     Why didn't you move the segment of array X dirrectly to DS?
     Answer, you can't. You can move the value of a register to a
     segment register, but you can't move immediate values.
     Illegal
        mov DS,938d
        mov DS,seg X
        mov CS,832d
        mov ES,21h
     Legal
        mov ax,938d
        mov ds,ax
        mov ax,seg X
        mov ds,ax
     or also legal
        push 938d
        pop  DS
        push seg X
        pop  DS

Tpas -> asmb
part III      [2 of 2]

This section may have a bit too much information for now. If you don't
quite get all of it, don't worry! Most of this section will be repeated
and explain in greater detail in the next upcomming sections.


var x : array[1..22] of byte;
 
[...]
 
asm
 push es
 push di
 
 mov ax,seg X
 mov es,ax
 mov di,offset X
 
 mov cx,32d  {number of times to repeat}
 mov ax,0    {the "char" we want to move}
 
 REP         {tells the processor to repeat the next repeatable}
             {instruction CX times                             }
    stosB   {store whats in AX to [es:di] and increment di}
 
 pop di
 pop es
end;
 
  Starting from the top, we know that we should push the original
  values of registers that aren't general use [non AX-DX]. Next,
  we know that we are telling the compiler to give us the address
  that X is located at and not to give us the value of X. Segment
  registers can only be assinged values by a Register or a Pop.
 
  That bring us to CX gets the number of times we want to REPeat,
  and AX get the value we want to move to X. We are know at the
  meaty instruction and that is STOSB. STOSB does a lot at once.
 
  First,  it moves the byte located in AL to [es:di].
  Second, it increments DI so it points to the next byte of X.
  Third,  CX is decremented
  fourth, if CX in not Zero then the process is repeated
 
  When you have a simple loop like
  " for I := 1 to 32 do x[i] := 0; ";
  then, the above is a _really_quick_ way to do the same thing.
  Unlike in Part II, this loop is not as general, and cant as
  easily be used to do many things durring a loop. The REP only
  works for certain instruction and can only repeat that instruction
  CX times.
 
  "OK, now what's LOD,INS,OUT ?"
  First off, the STOS can be use two other ways. There is a STOSW
  that moves a word at a time [AX] and advances DI two bytes. The
  other one is STOS and truthfully, I haven't and don't know how
  to use it yet.
 
  LODS is kinda the oposite of STOS. IF stosB, it moves the byte at
  DS:SI into AL, and if stosW, it moves the word at DS:SI into AX.
 
  !Unless you know your hardware ports, skip INS and OUT for now.!
  ]INS is not part of the 8088 and 8086 processor and won't execute.
  ]IT takes a byte or word [insB, insW] from the port listed in DX
  ]and moves it the addess located in [ES:ID].
  ]
  ]OUT takes a byte or word and moves it to a port. It can be done
  ]as follows
  ]
  ]OUT 13h,AL  {moves byte in AL to port 13h}
  ]OUT 13h,AX  {moves word in AX to port 13h}
  ]OUT DX,al   {move byte in al to port listed DX}
  ]out DX,ax   {move word in ax to port listed DX}
  ]
  ]outs, outsB,outsW are not part of the 8088 and 8086 instruction
  ]outsB moves byte [ds:si] to port listed in DX and advances SI
  ]outsW moves word [ds:si] to port listed in DX and advances SI
  ]I've not used OUTS & don't yet know how.
 
 
  That brings us back to the rest of the special use registers,
  And they are
     CS   *  code  segment (16 bit)                 *
     SS   *  stack segment (16 bit)                 *.
 
  My best advice is DON'T TOUCH! Your processor uses pointers
  just like you so it knows what to execute next and where the
  top of the stack is. CS:IP points the next instuction to execute
  and SS:SP points to the top of the stack.
  CS:IP are changed by jumps and calls.
  SS:SP are changed by pushes and pops
 
  -------------------------------------------------------------
  I screwed up a little bit in order so I need to rectify this
  now.
 
  I need to define immediate values, effective addresses, and so forth.
 
  mov ax,983d
  An immediate value is a byte or word that is part of the instruction.
  In the example above, 938d is the immediate value that is going to
  be moved into ax.
  "mov ax,bx" has no immediate values. The value is stored in BX and
  the word associated is not part of the instruction.
 
 
  mov ax,[0000]
  An effective address can be a word value that is part of the instuction
  but is a location and an immediate value. When this is compiled
  0000 is not moved into ax, the word pointed to by cs:[0000] is moved
  into ax. Why CS? If you don't overide the segment the compiler used
  defaults. Unfortunatly I don't readily know which are which, but
  seeing that you might be a beginner, this won't be a problem that
  has to be delt with know. In other word ignore for now.
 
  mov ax,[bx]
  This has [bx] as the effective address. The value in BX is thought
  of as an address and the word pointed to by BX is moved into ax.
 
  There are several addressing modes and I will try to explain them as
  I get to them and as they are needed.
 
 
  _I HIGHLY suggest that we start out with assembler procedure and
  functions in Pascal and leave it at that_! I think that asm should
  be used only for critical procedures that are too slow in pascal
  or are things not supported by native pascal. This way you code can
  be effiecient and still easily understandable.
 
  Next section Part IV - assembler functions and procedures in pascal.

tpas -> asmb
part IV - assembler functions and procedures in pascal.
     [1 of 2]
 
     Now that we know just a little bit, let learn through example.
     Translation from tpas to asm is not an exact science. There
     are several way to do about the same structure as pascal and
     it is this reason that ASM procedures can be so efficient. You
     can completetly eliminate the legnthy and general machine code
     that pascal produces when things like overflow,range checking,
     and so forth, are not as important as speeding things up.
 
     For the first example, we are going to write a procedure in
     Tpascal that clears the screen in mode $13. We will assume
     that we are already in mode $13.
 
     mov ax,13h
     int 10h    {bios call}
 
 
 
     procedure ClearScrn13;
     var i,j : word;
     begin
       for I := 0 to 199 do
           For J := 0 to 319 do
                 mem[$a000:(i*320) + j] := 0;
 
     end;
 
 
     First off, I assume you know that in ram at $a000:0000 is
     were the vga screen memory is stored. I can be thought of
     as an array of 64,000 bytes, and you must do the Row_Major
     math to access it by Row and Column.
     For every row there are 320 bytes, in order to access [2,2]
     of a [320,200] array we must pass up 320 bytes in the first row
     and and 1 more byte to get to [2,2].
 
     In this case, who cares about rows and colums. We know the
     size of the 1 dimesional array so when can just loop 64000 times.
     This keeps us from having to do any math - which will slow every-
     thing down.
 
     procedure clearScrn13asm;
     begin
     asm
       push es  {remeber to store the values of non-general}
       push di
 
       mov ax,0a000h
       mov es,ax      {es := $a000}
       xor di,di      {di := 0}
 
       mov cx,32000d   {cx := 64000 div 2;}
       xor ax,ax      {ax := 0;}
 
       rep stosW
 
       pop di
       pop es
     end;
     end;
 
     I know, you saying "what the f..?". Remember first we point our
     "put" pointer [es:di] to where we are going to store stuff. So what
     is "xor di,di"? "Xor" or is "exclusive or". It means either A or B
     but not A and B. The truth table is as follows
 
     true  xor false = true
     false xor true  = true
     false xor false = false
     true  xor true  = false
 
     in bits
     1 xor 0 = 1
     0 xor 1 = 1
     0 xor 0 = 0
     1 xor 1 = 0
 
     So any number xor'd it self is zero. For example, 48 decimal is 30
     hexidecimal. In binary 30 is 00110000. Now if you xor each
     individual bit by using the table above you get
 
     0 0 1 1 0 0 0 0   |
 xor 0 0 1 1 0 0 0 0   |
 -------------------   v
   = 0 0 0 0 0 0 0 0
 
     For each column use the table. From left to right 0 xor 0 is 0,
     0 xor 0 is 0, 1 xor 1 is 0 and so forth.
 
     "Why not mov di,0 ? or mov ax,0 ?"
     Code size. Mov di,0 will produce 3 bytes of code and "xor di,di"
     should produce 1 byte of code. For more information, get the
     opcode list from "ftp.intel.com". It list all possible and valid
     instructions.
 
 
     "Why is is mov cx,32000 and not mov cx,64000?"
     Well If you look at the repeated instruction, is will move
     a word at a time, and a word is two bytes. So moving 32000
     words is the same as moving 64000 bytes but just much
     quicker.
 
     Remeber the "REP stosW" will automatically mov ax to es:di
     and increment di and decrement cx until repeated 32000 times.
 
 
     Know lets use an example that won't let us use the repeat prefix.
     Lets mov a virtual screen to the physical screen.
 
     type V_scrn : array[0..319,0..199] of byte;
     var V_P : ^V_scrn;
 
     procedure Paste_scrn( P : V_P );
     var i,j : word;
     begin
       for I := 0 to 199 do
           For J := 0 to 319 do
                 mem[$a000:(i*320) + j] := V^[j,i];; {or ist ^V[j,i]?}
     end;
 
part IV - assembler functions and procedures in pascal.
      [2 of 2]
 
procedure Paste_scrnASM( P : V_P );
     begin
     asm
        push es
        push di
        push ds
        push si
 
        {: Setup or Get and Put pointers :}
 
        {get}
        mov ax,word ptr P[2]  {ds:si -> V_screen }
        mov ds,ax
        mov si,word ptr P
 
        {put}
        mov ax,0a000h         {es:di -> screen}
        mov es,ax
        xor di,di
 
        mov cx,32000d         { for I := 1 to 32000 do begin }
     @looper:
        lodsW                 {  ax  := -> get } { inc si }
        stosW                 {  put := ax     } { inc di }
        loop @looper          { end;           } { dec CX }
 
        pop si
        pop ds
        pop di
        pop es
     end;
     end;
 
 
     OK lets start with "P[2]" a Pointer type is 4 bytes in legnth.
     The first 2 bytes are the offset that it points to and the
     second 2 bytes are the segment that it point to.
     The "[]" symbols also mean indexing - much like an 1 dimensional
     array in pascal. Seeing that the Segment is 2 bytes passed the
     begining of the pointer, we tell th copmiler that we are talking
     about the word 2 bytes passed the begining of the 4 byte pointer.
 
     "word ptr?"
     If we didn't tell the copmiler that we want that this is a word
     and it an effective address and not a varialbe, we'd have a
     problem. We can't use the Offset and Seg key words; becuase, we
     we want the value of the of the pointer and not its location.
 
 
 
     Pascal takes all the parameters of the procedure and
     pushes them on the stack write before a call to the procedure.
     In a procedure when you use one of those variables you are actually
     using a values that is on the stack. When you use the "VAR" keyword
     is pushes the address of the variable on the stack, and therefore
     any changes made to that variable are actually chaning the value
     of the original variable. Otherwise, is just pushes the value of
     the variable on the stack and any changes to that value are made
     only in the stack and are thrown away after completion.
 
     example
 
     Procedure test(VAR X : byte; Y,Z : byte);
     begin
     end;
 
 
 
     a:= 1;
     b:= 2;
     c:= 3;
 
     When you call test(a,b,c);
 
     It pushed all the paramters starting backwards with C
 
     STACK
     SP ->   [ addr(A) ]   a
             [    2    ]   b
             [    3    ]   c
 
     to move C's value into AX you could say
        mov ax,word ptr Z
     to move B's value into AX you could say
        mov ax,word ptr Y
     to move A's value into AX you could say
        mov ax,word ptr X[2]
        mov ds,ax
        mov si,word ptr X     {[ds:si] -> A}
        lodsB                 {ax := [ds:si] }
 
     because the value of A is not on the stack, but A's address is.
 
 
     !Back to the code!
 
     Next es:di points to the vga memory in ram.
 
     This...
 
       mov cx,32000d         { for I := 1 to 32000 do begin }
     @looper:
        lodsW                 {  ax  := -> get } { inc si }
        stosW                 {  put := ax     } { inc di }
        loop @looper          { end;           } { dec CX }
 
     sets up our loop counter CX
     lodsW gets from out virtual screen the first word and increments si 2
     bytes [word size in bytes]
     AX holds the word
 
     stosW takes the word in AX and puts it to the screen and increments
     di 2 bytes.
 
     Loop decrements CX and compares it to zero, if its not equal to zero
     it jumps to @Looper [this changes the values of CS:IP].
 
     You get all that done with just a tinty bit of speedy code.
 
 
 
     I didn't get to functions this time, but in part V I'll get to functions
     and how to acccess pascal variables like a string.
 
 
     As always, I'll be glad to answer any questions. Just one warning,_all
     the code in these section have _NOT_ been compiled and are sure to have
     errors. If you find these errors then maybe these lessons are too simple
     for you and you may be ready to start learning from an intermediate to
     advanced asm book. However, I'll try to be as unassuming as possible,
     but if I loose you on some concept, just mail me.
 
     Later
     **%CpC%**
[Back to FAQ SWAG index] [Back to Main SWAG index] [Original]