Tuesday, November 20, 2007

IBM PC Assembly Language Tutorial 3

Notice that the key thing in all five phases is being selective. It is

easy to conclude that there is too much to learn unless you can throw

away what you don't need. Most of the rest of this talk is going to

deal with this very important question of what you need and don't

need to learn in each phase. In some cases, I will have to leave you

to do almost all of the learning; in others, I will teach a few salient

points, enough, I hope, to get you started. I hope you understand that

all I can do in an hour is get you started on the way.

Phase 1: Learn the architecture and instruction set

The Morse book might seem like a lot of book to buy for just two

really important chapters; other books devote a lot more space to

the instruction set and give you a big beautiful reference page on

each instruction. And, some of the other things in the Morse book,

although interesting, really aren't very vital and are covered too

sketchily to be of any real help. The reason I like the Morse book is

that you can just read it; it has a very conversational style, it is very

lucid, it tells you what you really need to know, and a little bit more

which is by way of background; because nothing really gets

belabored too much, you can gracefully forget the things you don't

use. And, I very much recommend READING Morse rather than

studying it. Get the big picture at this point.

Now, you want to concentrate on those things which are worth fixing

in memory. After you read Morse, you should relate what you have

learned to this outline.

1. You want to fix in your mind the idea of the four segment registers

CODE, DATA, STACK, and EXTRA. This part is pretty easy to

grasp. The 8086 and the 8088 use 20 bit addresses for memory,

meaning that they can address up to 1 megabyte of memory. But,

the registers and the address fields in all the instructions are no

more than 16 bits long.

So, how to address all of that memory? Their solution is to put

together two 16 bit quantities like this:

    calculation     SSSS0  ---- value in the relevant segment register SHL 4
    depicted in      AAAA  ---- apparent address from register or instruction
    hexadecimal     -----
                    RRRRR  ---- real address placed on address bus

In other words, any time memory is accessed, your program will

supply a sixteen bit address. Another sixteen bit address is

acquired from a segment register, left shifted four bits (one nibble)

and added to it to form the real address. You can control the values

in the segment registers and thus access any part of memory you

want. But the segment registers are specialized: one for code, one

for most data accesses, one for the stack (which we'll mention

again) and one "extra" one for additional data accesses.
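
As a minimal sketch of the arithmetic just described, here it is modeled in Python (chosen only because it can be checked on any machine; the function name physical_address is my own invention, not anything from the Intel manuals). It assumes the 8086/8088 behavior of keeping only 20 address bits, so a sum past 1 megabyte wraps around to zero:

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16 bit segment value and a 16 bit offset into a 20 bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left four bits (one nibble), add the offset, and
    # keep only the 20 bits that the 8086/8088 address bus carries.
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment/offset pairs can name the same real address.
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350
```

The second pair shows why converting everything to 20 bit form is rarely worth the trouble: many segment/offset combinations land on the same byte.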

Most people, when they first learn about this addressing scheme

become obsessed with converting everything to real 20 bit

addresses. After a while, though, you get used to thinking in

segment/offset form. You tend to get your segment registers set up

at the beginning of the program, change them as little as possible,

and think just in terms of symbolic locations in your program, as with

any assembly language.

EXAMPLE:

    MOV    AX,DATASEG
    MOV    DS,AX            ;Set value of Data segment
    ASSUME DS:DATASEG       ;Tell assembler DS is usable
    .......
    MOV    AX,PLACE         ;Access storage symbolically by 16 bit address

In the above example, the assembler knows that no special issues

are involved because the machine generally uses the DS register to

complete a normal data reference.

If you had used ES instead of DS in the above example, the

assembler would have known what to do, also. In front of the MOV

instruction which accessed the location PLACE, it would have

placed the ES segment prefix. This would tell the machine that ES

should be used, instead of DS, to complete the address.

Some conventions make it especially easy to forget about segment

registers. For example, any program of the COM type gets control

with all four segment registers containing the same value. This

program executes in a simplified 64K address space. You can go

outside this address space if you want but you don't have to.

2. You will want to learn what other registers are available and learn

their personalities:

AX and DX are general purpose registers. They become special

only when accessing machine and system interfaces.

CX is a general purpose register which is slightly specialized for

counting.

BX is a general purpose register which is slightly specialized for

forming base-displacement addresses.

AX-DX can be divided in half, forming AH, AL, BH, BL, CH, CL,

DH, DL.

SI and DI are strictly 16 bit. They can be used to form indexed

addresses (like BX) and they are also used to point to strings.

SP is hardly ever manipulated. It is there to provide a stack.

BP is a manipulable cousin to SP. Use it to access data which has

been pushed onto the stack.

Most sixteen bit operations are legal (even if unusual) when

performed in SI, DI, SP, or BP.
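
The way AX through DX divide in half can be sketched as follows. This is a Python model for illustration only; split_word and join_bytes are hypothetical helper names, and the values are made up:

```python
def split_word(word: int) -> tuple:
    """Return (high byte, low byte) of a 16 bit value, as AX splits into AH and AL."""
    return (word >> 8) & 0xFF, word & 0xFF


def join_bytes(high: int, low: int) -> int:
    """Rebuild the 16 bit register value from its two 8 bit halves."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ax = 0x12AB
ah, al = split_word(ax)   # ah is 0x12, al is 0xAB
al = 0xFF                 # like MOV AL,0FFH: only the low half changes
ax = join_bytes(ah, al)   # ax is now 0x12FF
```

The point is that AH and AL are not separate storage: writing either one changes the value you see in AX.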

3. You will want to learn the classifications of operations available

WITHOUT getting hung up in the details of how 8086 opcodes are

constructed.

8086 opcodes are complex. Fortunately, the assembler opcodes

used to assemble them are simple. When you read a book like

Morse, you will learn some things which are worth knowing but NOT

worth dwelling on.

a. 8086 and 8088 instructions can be broken up into subfields and

bits with names like R/M, MOD, S and W. These parts of the

instruction modify the basic operation in such ways as: whether it is 8

bit or 16 bit; if 16 bit, whether all 16 bits of the data are given;

whether the instruction is register to register, register to memory, or

memory to register; for operands which are registers, which register;

and for operands which are memory, what base and index registers

should be used in finding the data.

b. Also, some instructions are actually represented by several

different machine opcodes depending on whether they deal with

immediate data or not, or on other issues, and there are some

expedited forms which assume that one of the arguments is the

most commonly used operand, like AX in the case of arithmetic.

There is no point in memorizing any of this detail; just distill the

bottom line, which is, what kinds of operand combinations EXIST in

the instruction set and what kinds don't. If you ask the assembler to

ADD two things and the two things are things for which there is a

legal ADD instruction somewhere in the instruction set, the

assembler will find the right instruction and fill in all the modifier

fields for you.

I guess if you memorized all the opcode construction rules you

might have a crack at being able to disassemble hex dumps by eye,

like you may have learned to do somewhat with 370 assembler. I

submit to you that this feat, if ever mastered by anyone, would be in

the same class as playing the "Minute Waltz" in a minute; a curiosity

only.

Here is the basic matrix you should remember:

    Two operands:          One operand:

      R   <-- M              R
      M   <-- R              M
      R   <-- R              S  *
      R|M <-- I
      R|M <-- S  *
      S   <-- R|M  *

    * -- data moving instructions (MOV, PUSH, POP) only

    S -- segment register (CS, DS, ES, SS)

    R -- ordinary register (AX, BX, CX, DX, SI, DI, BP, SP,
         AH, AL, BH, BL, CH, CL, DH, DL)

    M -- one of the following
           pure address
           [BX]+offset
           [BP]+offset
           any of the above indexed by SI
           any of the first three indexed by DI

4. Of course, you want to learn the operations themselves. As I've

suggested, you want to learn the op codes as the assembler

presents them, not as the CPU machine language presents them.

So, even though there are many MOV op codes you don't need to

learn them. Basically, here is the instruction set:

a. Ordinary two operand instructions. These instructions perform an

operation and leave the result in place of one of the operands.

They are

1) ADD and ADC -- addition, with or without including a carry from

a previous addition

2) SUB and SBB -- subtraction, with or without including a borrow

from a previous subtraction

3) CMP -- compare. It is useful to think of this as a subtraction

with the answer being thrown away and neither operand actually

changed

4) AND, OR, XOR -- typical boolean operations

5) TEST -- like an AND, except the answer is thrown away and

neither operand is changed.

6) MOV -- move data from source to target

7) LDS, LES, LEA -- some specialized forms of MOV with side

effects

b. Ordinary one operand instructions. These can take any of the

operand forms described above. Usually, they perform the operation

and leave the result in the stated place:

1) INC -- increment contents

2) DEC -- decrement contents

3) NEG -- twos complement

4) NOT -- ones complement

5) PUSH -- value goes on stack (operand location itself

unchanged)

6) POP -- value taken from stack, replaces current value

c. Now you touch on some instructions which do not follow the

general operand rules but which require the use of certain registers.

The important ones are

1) The multiply and divide instructions

2) The "adjust" instructions which help in performing arithmetic

on ASCII or packed decimal data

3) The shift and rotate instructions. These have a restriction on

the second operand: it must either be the immediate value 1 or

the contents of the CL register.

4) IN and OUT which send or receive data from one of the

hardware ports (the processor allows 65,536; the PC decodes 1024).

5) CBW and CWD -- convert byte to word or word to doubleword

by sign extension
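
Sign extension is easy to misread, so here is a small sketch of what CBW and CWD compute. It is a Python model for illustration only; the function names simply mirror the mnemonics, and register values are passed in and returned rather than held in real registers:

```python
def cbw(al: int) -> int:
    """Model CBW: sign-extend the byte in AL to the 16 bit value left in AX."""
    al &= 0xFF
    # If bit 7 is set the byte is negative, so the new high byte is all ones.
    return (al | 0xFF00) if (al & 0x80) else al


def cwd(ax: int) -> tuple:
    """Model CWD: sign-extend AX into the register pair (DX, AX)."""
    ax &= 0xFFFF
    dx = 0xFFFF if (ax & 0x8000) else 0x0000
    return dx, ax


print(hex(cbw(0x7F)))   # 0x7f    (+127 stays +127)
print(hex(cbw(0xFE)))   # 0xfffe  (-2 stays -2)
```

Note that neither instruction takes an operand: the source and target registers are fixed, which is why they fall in this group.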

d. Flow of control instructions. These deserve study in themselves

and we will discuss them a little more. They include

1) CALL, RET -- call and return

2) INT, IRET -- interrupt and return-from-interrupt

3) JMP -- jump or "branch"

4) LOOP, LOOPNZ, LOOPZ -- special (and useful) instructions

which implement a counted loop similar to the 370 BCT instruction

5) various conditional jump instructions

e. String instructions. These implement a limited storage-to-

storage instruction subset and are quite powerful. All of them

have the property that

1) The source of data is described by the combination DS and SI.

2) The destination of data is described by the combination ES and

DI.

3) As part of the operation, the SI and/or DI register(s) is(are)

incremented or decremented so the operation can be repeated.

They include

1) CMPSB/CMPSW -- compare byte or word

2) LODSB/LODSW -- load byte or word into AL or AX

3) STOSB/STOSW -- store byte or word from AL or AX

4) MOVSB/MOVSW -- move byte or word

5) SCASB/SCASW -- compare byte or word with contents of AL or

AX

6) REP/REPE/REPNE -- a prefix which can be combined with any

of the above instructions to make them execute repeatedly across

a string of data whose length is held in CX.

f. Flag instructions: CLI, STI, CLD, STD, CLC, STC. These can set

or clear the interrupt (enabled), direction (for string operations), or

carry flags.

The addressing summary and the instruction summary given above

mask a lot of annoying little exceptions. For example, you can't

POP CS, and although the R <-- M form of LES is legal, the M <-- R

form isn't, etc. My advice is

a. Go for the general rules

b. Don't try to memorize the exceptions

c. Rely on common sense and the assembler to teach you about

exceptions over time. A lot of the exceptions cover things you

wouldn't want to do anyway.

5. A few instructions are rich enough and useful enough to warrant

careful study. Here are a few final study guidelines:

a. It is well worth the time learning to use the string instruction

set effectively. Among the most useful are

    REP MOVSB      ;moves a string
    REP STOSB      ;initializes memory
    REPNE SCASB    ;look up occurrence of character in string
    REPE CMPSB     ;compare two strings
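
To show what the repeat prefix actually does to the registers, here is a rough Python model of two of these combinations. It is a sketch under simplifying assumptions: DS and ES are taken to be equal (as in a COM program), memory is one 64K bytearray, and the function names and addresses are invented for the example. One further simplification: when CX starts at zero the real machine leaves the flags untouched, while this model just reports the zero flag as off.

```python
def rep_movsb(mem: bytearray, si: int, di: int, cx: int, df: int = 0):
    """Model REP MOVSB with DS = ES: copy CX bytes from [SI] to [DI].

    Returns the final (SI, DI, CX), as the registers would be left.
    """
    step = -1 if df else 1          # STD sets DF (count down), CLD clears it
    while cx != 0:
        mem[di] = mem[si]
        si = (si + step) & 0xFFFF   # 16 bit registers wrap around
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


def repne_scasb(mem: bytearray, di: int, cx: int, al: int, df: int = 0):
    """Model REPNE SCASB: scan up to CX bytes at [DI] for the byte in AL.

    Returns (DI, CX, ZF); ZF is True if the last byte compared equaled AL.
    """
    step = -1 if df else 1
    zf = False
    while cx != 0:
        zf = mem[di] == (al & 0xFF)
        di = (di + step) & 0xFFFF
        cx -= 1
        if zf:                      # REPNE stops as soon as a match is found
            break
    return di, cx, zf


mem = bytearray(0x10000)            # one 64K segment
mem[0x100:0x105] = b"HELLO"
si, di, cx = rep_movsb(mem, si=0x100, di=0x200, cx=5)
# mem[0x200:0x205] now holds b"HELLO"; si is 0x105, di is 0x205, cx is 0

di, cx, zf = repne_scasb(mem, di=0x200, cx=5, al=ord("L"))
# zf is True; di is 0x203, one past the first "L"; cx is 2
```

Notice that after a successful scan DI points one byte past the match, and CX holds the count of bytes not yet examined. Both facts are easy to forget when you turn the result into a position.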

b. Similarly, if you have never written for a stack machine before,

you will need to exercise PUSH and POP and get very comfortable

with them because they are going to be good friends. If you are

used to the 370, with lots of general purpose registers, you may

find yourself feeling cramped at first, with many fewer registers

and many instructions having register restrictions. But, you have

a hidden ally: you need a register and you don't want to throw

away what's in it? Just PUSH it, and when you are done, POP it

back. This can lead to abuse. Never have more than two

"expedient" PUSHes in effect and never leave something PUSHed

across a major header comment or for more than 15 instructions or

so. An exception is the saving and restoring of registers at

entrance to and exit from a subroutine; here, if the subroutine is

long, you should probably PUSH everything which the caller may

need saved, whether you will use the register or not, and POP it in

reverse order at the end.

Be aware that CALL and INT push return address information on

the stack and RET and IRET pop it off. It is a good idea to become

familiar with the structure of the stack.
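
If the structure of the stack is new to you, this toy model may help. It is written in Python purely for illustration; the class name Stack and the starting SP value are made up. It assumes the standard 8086 behavior: PUSH lowers SP by two and then stores a word, POP fetches a word and then raises SP by two.

```python
class Stack:
    """A toy model of the 8086 stack inside one 64K stack segment."""

    def __init__(self, sp: int = 0x0100):
        self.mem = bytearray(0x10000)
        self.sp = sp

    def push(self, word: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF            # SP moves down first
        self.mem[self.sp] = word & 0xFF             # low byte at lower address
        self.mem[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self) -> int:
        low = self.mem[self.sp]
        high = self.mem[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF            # SP moves back up afterwards
        return low | (high << 8)


stack = Stack()
ax, bx = 0x1111, 0x2222
stack.push(ax)            # PUSH AX
stack.push(bx)            # PUSH BX
ax, bx = 0, 0             # ... borrow the registers for something else ...
bx = stack.pop()          # POP BX  (last pushed, first popped)
ax = stack.pop()          # POP AX
```

The reverse-order POPs are the whole discipline: pop in the same order you pushed and the two registers come back swapped.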

c. In practice, to invoke system services you will use the INT

instruction. It is quite possible to use this instruction

effectively in a cookbook fashion without knowing precisely how it

works.

d. The transfer of control instructions (CALL, RET, JMP) deserve

careful study to avoid confusion. You will learn that these can be

classified as follows:

1) all three have the capability of being either NEAR (CS register

unchanged) or FAR (CS register changed)

2) JMPs and CALLs can be DIRECT (target is assembled into

instruction) or INDIRECT (target fetched from memory or register)

3) if NEAR and DIRECT, a JMP can be SHORT (less than 128

bytes away) or LONG

In general, the third issue is not worth worrying about. On a

forward jump which is clearly VERY short, you can tell the assembler

it is short and save one byte of code:

JMP SHORT CLOSEBY

On a backward jump, the assembler can figure it out for you. On a

forward jump of dubious length, let the assembler default to a

LONG form; at worst you waste one byte.

Also leave the assembler to worry about how the target address is

to be represented, in absolute form or relative form.
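
For the curious, the SHORT test the assembler applies can be sketched like this. It is a Python model for illustration only, with a made-up function name and made-up offsets. It assumes the usual encoding in which a SHORT jump occupies two bytes and its one byte signed displacement is measured from the address of the following instruction:

```python
def fits_short_jump(jump_offset: int, target_offset: int) -> bool:
    """Return True if a JMP at jump_offset can reach target_offset in SHORT form."""
    next_instruction = jump_offset + 2          # a SHORT jump is two bytes long
    displacement = target_offset - next_instruction
    return -128 <= displacement <= 127          # must fit in one signed byte


print(fits_short_jump(0x0100, 0x0181))   # True:  displacement is +127
print(fits_short_jump(0x0100, 0x0182))   # False: displacement is +128
```

On a backward jump the assembler already knows both offsets and can run this test itself; on a forward jump it has not seen the target yet, which is why you have to tell it.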

e. The conditional jump set is rather confusing when studied apart

from the assembler, but you do need to get a feeling for it. The

interactions of the sign, carry, and overflow flags can get your

mind stuttering pretty fast if you worry about it too much. What

it boils down to, though, is

    JZ             means what it says
    JNZ            means what it says
    JG (Greater)   means "if the SIGNED difference is positive"
    JA (Above)     means "if the UNSIGNED difference is positive"
    JL (Less)      means "if the SIGNED difference is negative"
    JB (Below)     means "if the UNSIGNED difference is negative"
    JC (Carry)     assembles the same as JB; it's an aesthetic choice
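
If you do want to see how the flags produce those meanings, here is a sketch in Python (for illustration only; cmp_flags and the four test functions are hypothetical names). It models a 16 bit CMP as a subtraction whose result is discarded, then applies the standard flag tests for each jump:

```python
def cmp_flags(a: int, b: int, bits: int = 16) -> dict:
    """Model CMP a,b: compute a - b, discard the result, keep the flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,
        "SF": bool(result & sign),
        "CF": a < b,                                  # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & sign),    # signed overflow
    }


def jg(f: dict) -> bool:
    return (not f["ZF"]) and f["SF"] == f["OF"]       # signed greater


def jl(f: dict) -> bool:
    return f["SF"] != f["OF"]                         # signed less


def ja(f: dict) -> bool:
    return (not f["CF"]) and (not f["ZF"])            # unsigned above


def jb(f: dict) -> bool:
    return f["CF"]                                    # unsigned below (same as JC)


# 0FFFFH is 65535 when read as unsigned but -1 when read as signed.
flags = cmp_flags(0xFFFF, 0x0001)
print(ja(flags), jg(flags))   # True False
print(jb(flags), jl(flags))   # False True
```

The same bit pattern is "above" and "less" at once, which is the whole reason there are two families of jumps. Pick the family that matches how you think of your data and let the flags take care of themselves.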

You should understand that all conditional jumps are inherently

DIRECT, NEAR, and "short"; the "short" part means that they can't

go more than 128 bytes in either direction. Again, this is

something you could easily imagine to be more of a problem than it is.

I follow this simple approach:

1) When taking an abnormal exit from a block of code, I always use

an unconditional jump. Who knows how far you are going to end

up jumping by the time the program is finished? For example, I

wouldn't code this:

    TEST AL,IDIBIT      ;Is the idiot bit on?
    JNZ  OYVEY          ;Yes. Go to general cleanup

Rather, I would probably code this:

    TEST AL,IDIBIT      ;Is the idiot bit on?
    JZ   NOIDIOCY       ;No. I am saved.
    JMP  OYVEY          ;Yes. What can we say...
NOIDIOCY:

The latter, of course, is a jump around a jump. Some would say

it is evil, but I submit it is hard to avoid in this language.

2) Otherwise, within a block of code, I use conditional jumps

freely. If the block eventually grows so long that the assembler

starts complaining that my conditional jumps are too long, I

a) consider reorganizing the block but

b) also consider changing some conditional jumps to their

opposite and using the "jump around a jump" approach as shown

above.

Enough about specific instructions!

6. Finally, in order to use the assembler effectively, you need to

know the default rules for which segment registers are used to

complete addresses in which situations.

a. CS is used to complete an address which is the target of a

NEAR DIRECT jump. On a NEAR INDIRECT jump, DS is used to

fetch the address from memory but then CS is used to complete the

address thus fetched. On FAR jumps, of course, CS is itself altered.

The instruction counter is always implicitly pointing in the code

segment.

b. SS is used to complete an address if BP is used in its formation.

Otherwise, DS is always used to complete a data address.

c. On the string instructions, the target is always formed from ES

and DI. The source is normally formed from DS and SI. If there is a

segment prefix, it overrides the source not the target.
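
Rule b can be sketched in a few lines. This is a Python model for illustration only; data_address, its parameters, and the register values are all invented for the example. It covers ordinary data references, not the string instructions of rule c, and it treats a segment prefix as a simple override of the default:

```python
def data_address(regs: dict, used: list, displacement: int = 0,
                 override: str = None) -> int:
    """Return the 20 bit address of a data reference formed from `used` registers."""
    # The 16 bit offset is the sum of the base/index registers plus displacement.
    offset = (sum(regs[r] for r in used) + displacement) & 0xFFFF
    # BP in the address means SS by default; otherwise DS. A prefix overrides both.
    segment = override or ("SS" if "BP" in used else "DS")
    return ((regs[segment] << 4) + offset) & 0xFFFFF


regs = {"DS": 0x2000, "SS": 0x3000, "ES": 0x4000,
        "BX": 0x0010, "BP": 0x0020, "SI": 0x0002, "DI": 0x0004}

print(hex(data_address(regs, ["BX", "SI"], 4)))          # 0x20016  (DS)
print(hex(data_address(regs, ["BP"], 6)))                # 0x30026  (SS)
print(hex(data_address(regs, ["BX"], override="ES")))    # 0x40010  (ES prefix)
```

In a COM program, where all four segment registers start out equal, all three references would land in the same 64K and you could ignore the distinction entirely.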
