    PROGPREF SEGMENT AT 0       ;Really a DSECT mapping the program prefix
             ORG   PROGPREF+6
    MEMSIZE  DW    ?            ;Size of available memory
    PROGPREF ENDS

Really, no matter whether the AT value represents truth or fiction, it is your responsibility, not the assembler's, to set up a segment register so that you can really reach the storage in question. So, you can't say

             MOV   AL,EQUIP

unless you first say something like

             MOV   AX,BIOSAREA  ;BIOSAREA becomes a symbol with value 40H
             MOV   ES,AX
             ASSUME ES:BIOSAREA

Enough about SEGMENT. The END statement is simple. It goes at the end of every assembly. When you are assembling a subroutine, you just say

             END

but when you are assembling the main routine of a program you say

             END   label

where 'label' is the place where execution is to begin.

Another pseudo-op illustrated in the program is ASSUME. ASSUME is like the USING statement in 370 assembler. However, ASSUME can ONLY refer to segment registers. The assembler uses ASSUME information to decide whether to assemble segment override prefixes and to check that the data you are trying to access is really accessible. In this case, we can reassure the assembler that both the CS and DS registers will address the section called HELLO at execution time. Actually, the SS and ES registers will too, but the assembler never needs to make use of this information.

I guess I have explained everything in the program except that ORG pseudo-op. ORG means the same thing as it does in many assembly languages. It tells the assembler to move its location counter to some particular address. In this case, we have asked the assembler to start assembling code hex 100 bytes from the start of the section called HELLO instead of at the very beginning. This simply reflects the way COM programs are loaded.

When a COM program is loaded by the system, the system sets up all four segment registers to address the same 64K of storage.
The first 100 hex bytes of that storage contain what is called the program prefix; this area is described in appendix E of the DOS manual. Your COM program physically begins after this. Execution begins with the first physical byte of your program; that is why the JMP instruction is there.

Wait a minute, you say, why the JMP instruction at all? Why not put the data at the end? Well, in a simple program like this I probably could have gotten away with that. However, I have the habit of putting data first and would encourage you to do the same because of the way the assembler has of assembling different instructions depending on the nature of the operand.

Unfortunately, sometimes the different choices of instruction which can assemble from a single opcode have different lengths. If the assembler has already seen the data when it gets to the instructions it has a good chance of reserving the right number of bytes on the first pass. If the data is at the end, the assembler may not have enough information on the first pass to reserve the right number of bytes for the instruction. Sometimes the assembler will complain about this, with something like "Forward reference is illegal", but at other times it will make some default assumption. On the second pass, if the assumption turned out to be wrong, it will report what is called a "Phase error," a very nasty error to track down. So get in the habit of putting data and equated symbols ahead of code.

OK. Maybe you understand the program now. Let's walk through the steps involved in making it into a real COM file.

1. The file should be created with the name HELLO.ASM (actually the name is arbitrary but the extension .ASM is conventional and useful)

2. ASM HELLO,,;

   (this is just one example of invoking the assembler; it uses the small assembler ASM, it produces an object file and a listing file with the same name as the source file.
   I am not going exhaustively into how to invoke the assembler, which the manual goes into pretty well. I guess this is the first time I have mentioned that there are really two assemblers; the small assembler ASM will run in a 64K machine and doesn't support macros. I used to use it all the time; now that I have a bigger machine and a lot of macro libraries I use the full function assembler MASM. You get both when you buy the package).

3. If you issue DIR at this point, you will discover that you have acquired HELLO.OBJ (the object code resulting from the assembly) and HELLO.LST (a listing file).

   I guess I can digress for a second here concerning the listing file. It contains TAB characters. I have found there are two good ways to get it printed and one bad way. The bad way is to use LPT1: as the direct target of the listing file or to try copying the LST file to LPT1 without first setting the tabs on the printer. The two good ways are to either

   a. direct it to the console and activate the printer with CTRL-PRTSC. In this case, DOS will expand the tabs for you.

   b. direct it to LPT1: but first send the right escape sequence to LPT1 to set the tabs every eight columns. I have found that on some early serial numbers of the IBM PC printer, tabs don't work quite right, which forces you to the first option.

4. LINK HELLO;

   (again, there are lots of linker options but this is the simplest. It takes HELLO.OBJ and makes HELLO.EXE). HELLO.EXE? I thought we were making a COM program, not an EXE program. Right. HELLO.EXE isn't really executable; it's just that the linker doesn't know about COM programs. That requires another utility. You don't have this utility if you are using DOS 1.0; you have it if you are using DOS 1.1 or DOS 2.0. Oh, by the way, the linker will warn you that you have no stack segment. Don't worry about it.

5. EXE2BIN HELLO HELLO.COM

   This is the final step. It produces the actual program you will execute.
   Note that you have to spell out HELLO.COM; for a nominally rational but actually perverse reason, EXE2BIN uses the default extension BIN instead of COM for its output file. At this point, you might want to erase HELLO.EXE; it looks a lot more useful than it is. Chances are you won't need to recreate HELLO.COM unless you change the source and then you are going to have to redo the whole thing.

6. HELLO

   You type hello, that invokes the program, it says

       HELLO YOURSELF!!!

   (oops, what did I do wrong....?)


What about subroutines?
_______________________

I started with a simple COM program because I actually think they are easier to create than subroutines to be called from high level languages, but maybe it's really the latter you are interested in. Here, I think you should get comfortable with the assembler FIRST with little exercises like the one above and also another one which I will finish up with.

Next you are ready to look at the interface information for your particular language. You usually find this in some sort of an appendix. For example, the BASIC manual has Appendix C on Machine Language Subroutines. The PASCAL manual buries the information a little more deeply: the interface to a separately compiled routine can be found in the Chapter on Procedures and Functions, in a subsection called Internal Calling Conventions.

Each language is slightly different, but here are what I think are some common issues in subroutine construction.

1. NEAR versus FAR? Most of the time, your language will probably call your assembler routine as a FAR routine. In this case, you need to make sure the assembler will generate the right kind of return. You do this with a PROC...ENDP statement pair.
   The PROC statement is probably a good idea for a NEAR routine too even though it is not strictly required:

    FAR linkage:                   | NEAR linkage:
                                   |
    ARBITRARY SEGMENT              | SPECIFIC  SEGMENT PUBLIC
              PUBLIC THENAME       |           PUBLIC THENAME
              ASSUME CS:ARBITRARY  |           ASSUME CS:SPECIFIC,DS:SPECIFIC
    THENAME   PROC FAR             |           ASSUME ES:SPECIFIC,SS:SPECIFIC
              ..... code and data  | THENAME   PROC NEAR
    THENAME   ENDP                 |           ..... code and data ....
    ARBITRARY ENDS                 | THENAME   ENDP
              END                  | SPECIFIC  ENDS
                                   |           END

   With FAR linkage, it doesn't really matter what you call the segment. You must declare the name by which you will be called in a PUBLIC pseudo-op and also show that it is a FAR procedure. Only CS will be initialized to your segment when you are called. Generally, the other segment registers will continue to point to the caller's segments.

   With NEAR linkage, you are executing in the same segment as the caller. Therefore, you must give the segment a specific name as instructed by the language manual. However, you may be able to count on all segment registers pointing to your own segment (sometimes the situation can be more complicated but I cannot really go into all of the details). You should be aware that the code you write will not be the only thing in the segment and will be physically relocated within the segment by the linker. However, all OFFSET references will be relocated and will be correct at execution time.

2. Parameters passed on the stack. Usually, high level languages pass parameters to subroutines by pushing words onto the stack prior to calling you. What may differ from language to language is the nature of what is pushed (OFFSET only or OFFSET and SEGMENT) and the order in which it is pushed (left to right, right to left within the CALL statement). However, you will need to study the examples to figure out how to retrieve the parameters from the stack.
   A useful fact to exploit is that a reference involving the BP register defaults to a reference to the stack segment. So, the following strategy can work:

    ARGS      STRUC
              DW    3 DUP(?)        ;Saved BP and return address
    ARG3      DW    ?
    ARG2      DW    ?
    ARG1      DW    ?
    ARGS      ENDS
         ...........
              PUSH  BP              ;save BP register
              MOV   BP,SP           ;Use BP to address stack
              MOV   ...,[BP].ARG2   ;retrieve second argument
              (etc.)

   This example uses something called a structure, which is only available in the large assembler; furthermore, it uses it without allocating it, which is not a well-documented option. However, I find the above approach generally pleasing. The STRUC is like a DSECT in that it establishes labels as being offset a certain distance from an arbitrary point; these labels are then used in the body of code by beginning them with a period; the construction ".ARG2" means, basically, " + (ARG2-ARGS)."

   What you are doing here is using BP to address the stack, accounting for the word where you saved the caller's BP and also for the two words which were pushed by the CALL instruction.

3. How big is the stack? BASIC only gives you an eight word stack to play with. On the other hand, it doesn't require you to save any registers except the segment registers. Other languages give you a liberal stack, which makes things a lot easier. If you have to create a new stack segment for yourself, the easiest thing is to place the stack at the end of your program and:

              CLI                      ;suppress interrupts while changing the stack
              MOV   SSAVE,SS           ;save old SS in local storage (old SP
                                       ; already saved in BP)
              MOV   SP,CS              ;switch
              MOV   SS,SP              ;the
              MOV   SP,OFFSET STACKTOP ;stack
              STI                      ;(maybe)

   Later, you can reverse these steps before returning to the caller. At the end of your program, you place the stack itself:

              DW    128 DUP(?)         ;stack of 128 words (liberal)
    STACKTOP  LABEL WORD

4. Make sure you save and restore those registers required by the caller.

5. Be sure to get the right kind of addressability.
   In the FAR call example, only CS addresses your segment. If you are careful with your ASSUME statements the assembler will keep track of this fact and generate CS prefixes when you make data references; however, you might want to do something like

              MOV   AX,CS              ;get current segment address
              MOV   DS,AX              ;To DS
              ASSUME DS:THISSEG

   Be sure you keep your ASSUMEs in synch with reality.


Learning about BIOS and the hardware
____________________________________

You can't do everything with DOS calls. You may need to learn something about the BIOS and about the hardware itself. In this, the Technical Reference is a very good thing to look at.

The first thing you look at in the Technical Reference, unless you are really determined to master the whole ball of wax, is the BIOS listings presented in Appendix A. Glory be: here is the whole 8K of ROM which deals with low level hardware support laid out with comments and everything. In fact, if you are just interested in learning what BIOS can do for you, you just need to read the header comments at the beginning of each section of the listing.

BIOS services are invoked by means of the INT instruction; the BIOS occupies interrupts 10H through 1FH and also interrupt 5H; actually, of these seventeen interrupts, five are used for user exit points or data pointers, leaving twelve actual services.

In most cases, a service deals with a particular hardware interface; for example, BIOS interrupt 10H deals with the screen. As with DOS function calls, many BIOS services can be passed a function code in the AH register and possibly other arguments.

I am not going to summarize the most useful BIOS features here; you will see some examples in the next sample program we will look at.
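
To make the calling convention concrete, here is a minimal sketch of a COM program that waits for one keystroke through BIOS interrupt 16H and echoes it through interrupt 10H. The segment name KEYECHO, the label START and the equate names are my own, chosen only for illustration. The function codes (00H to read a key, 0EH to write a character teletype-style) are the ones I understand the BIOS listing to document, so check them against the header comments in your own copy of the Technical Reference before relying on them.

```asm
KBD     EQU   16H               ;BIOS keyboard services INT code
SCREEN  EQU   10H               ;BIOS screen services INT code
GETKEY  EQU   00H               ;Function code: wait for and read a key
TTYOUT  EQU   0EH               ;Function code: write character as teletype
                                ; (equate name is hypothetical)
KEYECHO SEGMENT                 ;Hypothetical segment name
        ASSUME CS:KEYECHO,DS:KEYECHO
        ORG   100H              ;COM programs start after the program prefix
START:  MOV   AH,GETKEY         ;Function code goes in AH
        INT   KBD               ;Returns ASCII code in AL, scan code in AH
        MOV   AH,TTYOUT         ;Character to write is already in AL
        MOV   BX,0007H          ;BH = display page 0, BL = color (graphics modes)
        INT   SCREEN            ;Write the character and advance the cursor
        INT   20H               ;Return to DOS
KEYECHO ENDS
        END   START
```

This should assemble, link and convert with the same ASM, LINK and EXE2BIN steps used for HELLO. The point to notice is only the pattern: load AH with the function code, load any other registers the service wants, then issue INT with the number of the service.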
The other thing you might want to get into with the Tech reference is the description of some hardware options, particularly the asynch adapter, which are not well supported in the BIOS. The writeup on the asynch adapter is pretty complete.

Actually, the Tech reference itself is pretty complete and very nice as far as it goes. One thing which is missing from the Tech reference is information on the programmable peripheral chips on the system board. These include

    the 8259 interrupt controller
    the 8253 timer
    the 8237 DMA controller and
    the 8255 peripheral interface

To make your library absolutely complete, you should order the INTEL data sheets for these beasts.

I should say, though, that the only one I ever found I needed to know about was the interrupt controller. If you happen to have the 8086 Family User's Manual, the big book put out by INTEL, which is one of the things people sometimes buy to learn about 8086 architecture, there is an appendix there which gives an adequate description of the 8259.


A final example
_______________

I leave you with a more substantial example of code which illustrates some good elementary techniques; I won't claim its style is perfect, but I think it is adequate. I think this is a much more useful example than what you will get with the assembler:

             PAGE  61,132
             TITLE SETSCRN -- Establish correct monitor use at boot time
    ;
    ; This program is a variation on many which toggle the equipment flags
    ; to support the use of either video option (monochrome or color).
    ; The thing about this one is it prompts the user in such a way that he
    ; can select the use of the monitor he is currently looking at (or which
    ; is currently connected or turned on) without really having to know
    ; which is which. SETSCRN is a good program to put first in an
    ; AUTOEXEC.BAT file.
    ;
    ; This program is highly dependent on the hardware and BIOS of the IBMPC
    ; and is hardly portable, except to very exact clones. For this reason,
    ; BIOS calls are used in lieu of DOS function calls where both provide
    ; equal function.
    ;

OK. That's the first page of the program. Notice the PAGE statement, which you can use to tell the assembler how to format the listing. You give it lines per page and characters per line. I have mine set up to print on the host lineprinter; I routinely upload my listings at 9600 baud and print them on the host; it is faster than using the PC printer. There is also a TITLE statement. This simply provides a nice title for each page of your listing. Now for the second page:

             SUBTTL -- Provide .COM type environment and Data
             PAGE
    ;
    ; First, describe the one BIOS byte we are interested in
    ;
    BIOSDATA SEGMENT AT 40H     ;Describe where BIOS keeps his data
             ORG   10H          ;Skip parts we are not interested in
    EQUIP    DB    ?            ;Equipment flag location
    MONO     EQU   00110000B    ;These bits on if monochrome
    COLOR    EQU   11101111B    ;Mask to make BIOS think of the color board
    BIOSDATA ENDS               ;End of interesting part
    ;
    ; Next, describe some values for interrupts and functions
    ;
    DOS      EQU   21H          ;DOS Function Handler INT code
    PRTMSG   EQU   09H          ;Function code to print a message
    KBD      EQU   16H          ;BIOS keyboard services INT code
    GETKEY   EQU   00H          ;Function code to read a character
    SCREEN   EQU   10H          ;BIOS Screen services INT code
    MONOINIT EQU   02H          ;Value to initialize monochrome screen