CHAPTER 6   THE 86 INSTRUCTION SET

In this chapter we discuss in detail the instruction set supported by both the A86 and A386 assemblers. To use any of the 32-bit registers, the extra segment registers FS and GS, or the instructions marked with a "3", "4", or "5" in the instruction list, you need my A386 assembler, available only if you register both A86 and D86.

Effective Addresses

Most memory data accessing in the 86 family is accomplished via the mechanism of the effective address. Wherever an effective address specifier "eb", "ew", "ed", or "ev" appears in the list of instructions, you may use a wide variety of actual operands in that instruction. These include general registers, memory variables, and a variety of indexed memory quantities.

GENERAL REGISTERS: Wherever an "ew" appears, you can use any of the 16-bit registers AX,BX,CX,DX,SI,DI,SP, or BP. Wherever an "eb" appears, you can use any of the 8-bit registers AL,BL,CL,DL,AH,BH,CH, or DH. Whenever an "ed" occurs, you can (on A386) use any of the 32-bit registers EAX,EBX,ECX,EDX,ESI,EDI,EBP, or ESP. For A86, "ev" is the same as "ew"; for A386, it means you can use either a 16-bit or a 32-bit register. For example, the "ADD ev,rv" form subsumes the 16-bit register-to-register adds; e.g., ADD AX,BX; ADD SI,BP; ADD SP,AX. On A386, this form also includes 32-bit register-to-register adds; e.g., ADD EBX,ESI. At the machine level the 16-vs.-32 distinction is made by an "operand override" opcode byte, 66H, that A386 places before the instruction to signal a switch from 16 bits to 32 bits.

MEMORY VARIABLES: Wherever an "eb", "ew", "ed", or "ev" appears, you can use a memory variable of the indicated size: byte, word, doubleword, or either-word-or-doubleword. Variables are typically declared in the DATA segment, using a DD declaration for a doubleword variable, a DW declaration for a word variable, or a DB declaration for a byte variable. For example, you can declare variables:

  DATA_PTR  DW ?
  ESC_CHAR  DB ?
Later, you can load or store these variables:

  MOV ESC_CHAR,BL   ; store the byte variable ESC_CHAR
  MOV DATA_PTR,081  ; initialize DATA_PTR
  MOV SI,DATA_PTR   ; load DATA_PTR into SI for use
  LODSW             ; fetch the word pointed to by DATA_PTR

Alternatively, you can address specific unnamed memory locations by enclosing the location value in square brackets; for example,

  MOV AL,[02000]    ; load contents of location 02000 into AL

Note that A86 discerned from context (loading into AL) that a BYTE at 02000 was intended. Sometimes this is impossible, and you must specify byte or word:

  INC B[02000]      ; increment the byte at location 02000
  MOV W[02000],0    ; set the WORD at location 02000 to zero

INDEXED MEMORY: The 86 supports the use of certain registers as base pointers and index registers into memory. BX and BP are the base registers; SI and DI are the index registers. You may combine at most one base register, at most one index register, and a constant number into a run time pointer that determines the location of the effective address memory to be used in the instruction. These can be given explicitly, by enclosing the index registers in brackets:

  MOV AX,[BX]
  MOV CX,W[SI+17]
  MOV AX,[BX+SI+5]
  MOV AX,[BX][SI]5  ; another way to write the same instr.

Or, indexing can be accomplished by declaring variables in a based structure (see the STRUC directive in Chapter 9):

  STRUC [BP]        ; NOTE: based structures are unique to A86!
    BP_SAVE  DW ?   ; BP_SAVE is a word at [BP]
    RET_ADDR DW ?   ; RET_ADDR is a word at [BP+2]
    PARM1    DW ?   ; PARM1 is a word at [BP+4]
    PARM2    DW ?   ; PARM2 is a word at [BP+6]
  ENDS              ; end of structure
  INC PARM1         ; equivalent to INC W[BP+4]

Finally, indexing can be done by mixing explicit components with declared ones:

  TABLE DB 4,2,1,3,5
  MOV AL,TABLE[BX]  ; load byte number BX of TABLE

The 386 processor also supports indexing using any of the eight 32-bit general registers.
This type of indexing is of limited use for memory referencing from real-mode programs (most programs running under DOS), since offsets greater than 64K are disallowed in real mode (you will get a General Protection Fault if you try it). 32-bit indexing is, however, useful in conjunction with the LEA instruction, giving an extremely powerful register arithmetic instruction. For example, LEA ECX,[EAX+2*EBX+17000] performs two additions and a multiplication, all in a single machine instruction. Since no memory access is actually attempted, this kind of LEA usage is allowed in real-mode DOS programs.

In 32-bit indexing, you may use one or two of any of the 32-bit general registers. You may also scale one of the indexing registers, by multiplying it by 2, 4, or 8. You may also add or subtract a constant of any size up to doubleword capacity to the indexed quantity. If you use the same register twice and scale one of the instances of that register, you get, in effect, an odd-number scaling (3, 5, or 9) of that register; e.g., A386 will allow LEA EAX,[9*EBX] as an abbreviation for LEA EAX,[8*EBX+EBX]. Due to coding restrictions, the ESP register can be used only once within an indexed quantity, and cannot be scaled. Some more examples of 32-bit indexing are:

  XCHG DX,[EAX]
  MOV AL,[EAX+EBX]
  ADD EBX,[ESI+8*ECX+3391811]
  LEA ECX,[4*EBX-7]

Segmentation and Effective Addresses

The 86 family has four segment registers, CS, DS, ES, and SS, used to address memory. The 386 and later processors add two more segment registers, FS and GS. Each segment register points to 64K bytes of memory within the 1-megabyte memory space of the 86. (The start of the 64K is calculated by multiplying the segment register value by 16; i.e., by shifting the value left by one hex digit.) If your program's code, data and stack areas can all fit in the same 64K bytes, you can leave all the segment registers set to the same value.
In that case, you won't have to think about segment registers: no matter which one is used to address memory, you'll still get the same 64K. If your program needs more than 64K, you must point one or more segment registers to other parts of the memory space. In this case, you must take care that your memory references use the segment registers you intended.

Each effective address memory access has a default segment register, to be used if you do not explicitly specify which segment register you wish. For most effective addresses, the default segment register is DS. The exceptions are those effective addresses that use the BP register for indexing. All BP-indexed memory references have a default of SS. (This is because BP is intended to be used for addressing local variables, stored on the stack.)

If you wish your memory access to use a different segment register, you provide a segment override byte before the instruction containing the effective address operand. In the A86 language, you code the override by giving the name of the segment register you wish before the instruction mnemonic. For example, suppose you want to load the AL register with the memory byte pointed to by BX. If you code MOV AL,[BX], the DS register will be used to determine which 64K segment BX is pointing to. If you want the byte to come from the CS-segment instead, you code CS MOV AL,[BX]. Be aware that the segment override byte has effect only upon the single instruction that follows it. If you have a sequence of instructions requiring overrides, you must give an override byte before every instruction in the sequence. (In that case, you may wish to consider changing the value of the default segment register for the duration of the sequence.)

NOTE: This method for providing segment overrides is unique to the A86 assembler! The assemblers provided by Intel and IBM (MS-DOS) attempt to figure out segment allocation for you, and plug in segment override bytes "behind your back".
In order to do this, those assemblers require you to inform them which variables and structures are pointed to by which segment registers. That is what the ASSUME directive in those assemblers is all about. I wrote Intel's first 86 assembler, ASM86, so I have been watching the situation since day one. Over the years, I have concluded that the ASSUME mechanism creates far, far more confusion than it solves. So I scrapped it; and the result is an assembler with far less red tape. But if your program needs more than 64K, you do have to manage those segment registers yourself; so take care!

Effective Use of Effective Addresses

Remember that all of the common instructions of the 86 family allow effective addresses as operands. (The only major functions that don't are the AL/AX specific ones: multiply, divide, and input/output.) This means that you don't have to funnel many numbers through AL or AX just to do something with them. You can perform all the common arithmetic, PUSH/POP, and MOVes from any general register to any general register; from any memory location (indexed if you like) to any register; and (this is most often overlooked) from any register TO memory. The only thing you can't do in general is memory-to-memory. Among the more common operations that inexperienced 86 programmers overlook are:

  * setting memory variables to immediate values
  * testing memory variables, and comparing them to constants
  * preserving memory variables by PUSHing and POPping them
  * incrementing and decrementing memory variables
  * adding into memory variables

Encoding of Effective Addresses

This section outlines the number of program opcode bytes generated by effective-address specifications. This will let you make judgments when trying to keep your program as small as possible. The precise opcodes generated are explained in the text files EFF86.DOC in the A86 package, and EFF386.DOC in the A386 package.
Every instruction with a 16-bit effective address has an encoded byte, known as the effective address byte, following the instruction's main opcode. (For obscure reasons, Intel calls this byte the ModRM byte.) If the effective address is a memory variable, or an indexed memory location with a non-zero constant offset, then the effective address byte is immediately followed by the offset amount. Amounts in the range -128 to +127 are given by a single signed byte. Amounts outside that range are represented by a 2-byte offset.

In the instruction chart given later in this chapter, effective-address specification opcodes are denoted by a slash / followed either by the letter "r" or an octal digit. The meaning of the r-or-digit is explained in the EFF*.DOC files.

For example, the instruction DIV CL falls under the DIV eb form in the instruction chart. The instruction occupies two bytes: the main opcode byte 0F6H, followed by a single effective address byte with no constant offsets involved. Similarly, the instruction DIV B[BX] occupies two bytes. For DIV B[BX+7] you must add an offset byte for the 7, making a total of three bytes. For DIV B[BX+1000] you must add a 2-byte offset for the 1000, making a total of 4 bytes. For DIV B[02000] (more typically coded with a symbolic name such as DIV MY_VAR_NAME), the instruction is also 4 bytes: the main opcode byte, the effective address byte, and the 2-byte offset of the memory variable.

An anomalous case is the operand [BP]. The effective-address byte encoding for this particular operand was usurped by the simple-variable case. When A86 sees [BP], it must specify an 8-bit offset whose value is zero. Thus, the instruction DIV B[BP] occupies three bytes, not two. This anomaly does not apply to [BP+SI] or [BP+DI].

In A386, 32-bit indexing is signalled by a special address override opcode byte (67H) preceding the instruction. Following the override byte is the instruction's main opcode, followed by the effective-address specification.
For a simple memory variable, the specification consists of a single effective-address byte followed by the 4-byte offset of the variable. For indexing involving a single, non-scaled index register other than ESP, the specification consists of a single byte followed by the constant offset component. For indexing involving two registers, scaling, or the ESP register, there are two bytes followed by the constant offset component. The constant offset component occupies no space if the offset is zero, one byte if it is between -128 and +127, and 4 bytes otherwise. There is no provision for a 16-bit-word-sized offset if you are using 32-bit indexing.

Note the distinction between the address override byte (67H) and the operand override byte (66H). A86 must supply an address override when the instruction involves a memory operand whose address has 32 bits. A86 must supply an operand override when the data being manipulated has 32 bits. In general, when a 32-bit register name appears inside the square brackets, that's an address override; when it appears outside the square brackets, that's an operand override. Examples:

  MOV DX,[BX]     ; needs neither override in a 16-bit segment
  MOV DX,[EBX]    ; needs an address override
  MOV EDX,[BX]    ; needs an operand override
  MOV EDX,[EBX]   ; needs both overrides

Also note that the generation of these override bytes is handled automatically by A86 when it scans the operands to an instruction. The only exceptions to this are the no-operand string operations: REP MOVSW, LODSD, SCASB, etc. For these instructions, the operand size is signalled by the last letter (B, W, or D) of the mnemonic; however, the addressing mode is not signalled by the mnemonic. If you are in 16-bit mode, as all simple DOS programs are, you need to precede a string instruction with an explicit A4 prefix if you wish to use 32-bit addressing ([ESI] and/or [EDI] with count ECX).
If you are assembling to a 32-bit protected-mode segment (when that is implemented) you will need to use an explicit A2 prefix if you wish to use 16-bit addressing ([SI] and/or [DI] with count CX).

Here are some examples of instruction size involving 32-bit indexing in a real-mode segment:

DIV B[EBX] requires an address override byte, the single instruction opcode byte 0F6H, and an effective address byte: total 3 bytes.

DIV B[EBX+7] adds the offset byte 07, making the total 4 bytes.

DIV B[EBX+1000] forces the offset to be 4 bytes, making the total 7 bytes.

DIV B[EBX+EDI*2] does not require an offset, but the extra index register expands the effective address specifier to two bytes, making the total 4 bytes.

Similarly, DIV B[ESP] requires two effective address bytes (total 4 instruction bytes), because the ESP register is a special case.

Finally, DIV ES:D[EBX+EDI*2+1000] requires three overrides (segment override ES, operand override for the D, and address override for 32-bit indexing), the main opcode byte, two effective address opcode bytes, and a 4-byte offset: total 10 bytes.

The [BP] extra-byte anomaly applies, in 32-bit mode, to [EBP] as well. In fact, the anomaly also applies when another indexing register (scaled or not) is added to [EBP]. A386 must generate an offset byte whose value is 0 when it sees any no-offset forms involving [EBP].

The 386 and later processors, when running in protected mode, allow segments whose default word-size is 32 bits instead of 16 bits. In such segments, the usage of the operand and address override bytes is reversed: 32-bit operands do not require the operand-override byte, and 16-bit operands do. (8-bit operands never require an operand-override byte.) 32-bit memory addresses do not require an address-override byte; 16-bit addresses do. This mode will be recognized by A386 whenever the USE32 directive is used; however, at the time of this writing, this feature is not yet implemented.
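The byte counts quoted in this section can be checked with a little arithmetic. What follows is a rough Python sketch of the size rules as they are stated in the text, not of A86's actual code generator. The function names ea16_size and ea32_size and their parameters are invented for this illustration. The sketch assumes a one-byte main opcode and a real-mode (16-bit) segment, and leaves it to the caller to count the override bytes. It covers only the cases spelled out above; a lone scaled register, for instance, is rejected rather than guessed at.

```python
def disp_size(offset, wide):
    """Bytes taken by the constant offset: none if zero, one if it fits
    in a signed byte, otherwise `wide` (2 for 16-bit, 4 for 32-bit)."""
    if offset == 0:
        return 0
    if -128 <= offset <= 127:
        return 1
    return wide


def ea16_size(regs=(), offset=0, direct=False):
    """Size of main opcode + 16-bit effective address (hypothetical helper).

    regs   -- index/base registers inside the brackets, e.g. ("BX", "SI")
    direct -- True for a plain memory variable such as [02000]
    """
    size = 1 + 1                      # main opcode + effective address byte
    if direct:
        return size + 2               # 2-byte offset of the variable
    disp = disp_size(offset, 2)
    if tuple(regs) == ("BP",) and disp == 0:
        disp = 1                      # the [BP] anomaly: explicit zero byte
    return size + disp


def ea32_size(regs=(), scaled=False, offset=0, direct=False, prefixes=1):
    """Size of an instruction using 32-bit indexing (hypothetical helper).

    prefixes -- override bytes in front (1 = just the 67H address override)
    """
    size = prefixes + 1 + 1           # overrides + main opcode + EA byte
    if direct:
        return size + 4               # 4-byte offset of the variable
    if scaled and len(regs) < 2:
        raise ValueError("lone scaled register is not covered by this sketch")
    if len(regs) == 2 or scaled or "ESP" in regs:
        size += 1                     # second effective address byte
    disp = disp_size(offset, 4)
    if "EBP" in regs and disp == 0:
        disp = 1                      # the [EBP] anomaly, as described above
    return size + disp


if __name__ == "__main__":
    print(ea16_size(regs=("BX",), offset=1000))     # DIV B[BX+1000]
    print(ea32_size(regs=("EBX", "EDI"), scaled=True,
                    offset=1000, prefixes=3))       # DIV ES:D[EBX+EDI*2+1000]
```

Run as a script, this prints 4 and 10, the totals given above for DIV B[BX+1000] and DIV ES:D[EBX+EDI*2+1000]. The precise opcode bytes, as opposed to their count, remain the business of the EFF*.DOC files.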
All DOS programs, which run in real mode, have a default of 16 bits.

How to Read the Instruction Set Chart

The following chart summarizes the machine instructions you can program with A86. In order to use the chart, you need to learn the meanings of the specifiers (each given by 2 lower case letters) that follow most of the instruction mnemonics. Each specifier indicates the type of operand (register byte, immediate word, etc.) that follows the mnemonic to produce the given opcodes. The "v" type, for A86, is the same as "w" -- it denotes a 16-bit word. On A386, "v" denotes either a word or doubleword, depending on the presence of an operand override prefix byte.

"c" means the operand is a code label, pointing to a part of the program to be jumped to or called. A86 will also accept a constant offset in this place (or a constant segment-offset pair in the case of "cp"). "cb" is a label within about 128 bytes (in either direction) of the current location. "cv" is a label within the same code segment as this program; "cp" is a pair of constants separated by a colon -- the segment value to the left of the colon, and the offset to the right. The offset is always a word in A86; it can be either a word or a doubleword in A386. Note that in both the cb and cv cases, the object code generated is the offset from the location following the current instruction, not the absolute location of the label operand. In some assemblers (most notably for the Z-80 processor) you have to code this offset explicitly by putting "$-" before every relative jump operand in your source code. You do NOT need to, and should not do so with A86.

"e" means the operand is an Effective Address. The concept of an Effective Address is central to the 86 machine architecture, and thus to 86 assembly language programming. It is described in detail at the start of this chapter.
We summarize here by saying that an Effective Address is either a general purpose register, a memory variable, or an indexed memory quantity. For example, the instruction "ADD rb,eb" includes the instructions: ADD AL,BL, and ADD CH,BYTEVAR, and ADD DL,B[BX+17].

"i" means the operand is an immediate constant, provided as part of the instruction itself. "ib" is a byte-sized constant; "iw" is a constant occupying a full 16-bit word. The operand can also be a label, defined with a colon. In that case, the immediate constant which is the location of the label is used. Examples: "MOV rw,iw" includes the instructions: MOV AX,17, or MOV SI,VAR_ARRAY, where "VAR_ARRAY:" appears somewhere in the program, defined with a colon. NOTE that if VAR_ARRAY were defined without a colon, e.g., "VAR_ARRAY DW 1,2,3", then "MOV SI,VAR_ARRAY" would be a "MOV rw,ew" NOT a "MOV rw,iw". The MOV would move the contents of memory at VAR_ARRAY (in this case 1) into SI, instead of the location of the memory. To load the location, you can code "MOV SI,OFFSET VAR_ARRAY".

"m" means a memory variable or an indexed memory quantity; i.e., any Effective Address EXCEPT a register.

"r" means the operand is a general purpose register. The 8 "rb" registers are AL,BL,CL,DL,AH,BH,CH,DH; the 8 "rw" registers are AX,BX,CX,DX,SI,DI,BP,SP.

"rv/m" is used in the Bit Test instructions to denote either a word-or-doubleword register, or an array of bits in memory that can be any length.

NOTE: The following chart gives all instructions for all processors through the Pentium. You must take care to use only the instructions appropriate for the target processor of your program (the P switch will enforce this for you: see Chapter 3). If an instruction form does not run on all processors, there is a letter or digit just before the description field. "N" means the instruction runs only on NEC processors (which are rare nowadays).
A digit x means the instruction runs on the x86 or later: 1 for 186, 2 for 286, 3 for 386, 4 for 486, 5 for Pentium. Instructions with 3 or greater are recognized only by my A386 assembler, received only by those who register both A86 and D86.
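The rule for "cb" operands given earlier in this section (the object code holds the offset from the location following the instruction, not the absolute address of the label) comes down to one subtraction. Here is a small Python sketch of that arithmetic. The names short_jump_offset and offset_byte are invented for this illustration, and the 2-byte instruction length (one opcode byte plus the one offset byte) is an assumption that holds for an unprefixed short jump in a 16-bit segment.

```python
def short_jump_offset(jump_addr, target_addr, instr_len=2):
    """Return the signed offset a "cb" operand would carry.

    jump_addr   -- location of the jump instruction itself
    target_addr -- location of the label being jumped to
    instr_len   -- assumed size of the jump: opcode byte + offset byte
    """
    next_addr = jump_addr + instr_len   # location following the instruction
    offset = target_addr - next_addr
    if not -128 <= offset <= 127:
        raise ValueError("label is too far away for a cb operand")
    return offset


def offset_byte(offset):
    """The single object-code byte holding a signed offset (two's complement)."""
    return offset & 0xFF


if __name__ == "__main__":
    # A jump at 0100 to a label at 0120: offset counted from 0102.
    print(short_jump_offset(0x100, 0x120))
    # A jump to itself must go back over its own two bytes.
    print(hex(offset_byte(short_jump_offset(0x100, 0x100))))
```

The first call gives 30 (01E hex) and the second gives 0xfe. The reachable range works out to 126 bytes back and 129 bytes forward of the jump itself, which is the "about 128 bytes" mentioned above. A86 does this subtraction for you; the sketch is only meant to show what lands in the object code.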