This second release is in response to those who felt that the original text was too cryptic to understand. I assume that I was too brief in some areas, so this time I'll make an effort to cover everything in more detail and explain more of the examples. In addition, several new topics and tricks are covered here, including fading in/out and how the checkerboard trick was achieved. Oh, and the source to the AA loader is also packaged in the ZIP file. Enjoy...

------------------------------------------------------------------------------

             Writing (Graphic) VGA Intros/Loaders (Second Release)
                        By Fred Nietzche    11/08/91

All that this text will cover is the standard VGA video mode 13h (320x200 x 256 colors), which is the easiest to understand and also what is generally used to write the loaders. Again, if you would like more information concerning the higher resolution modes (and also the non-standard video modes), I encourage you to check out Power Graphics Programming by Michael Abrash, and also The Advanced Programmer's Guide to the EGA/VGA by George Sutty and Steve Blair.

VGA Memory Organization for Standard Mode 13h
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Before getting on to the fun stuff, you need a little knowledge of how something is displayed on the screen. Bear in mind, though, that this interpretation is simplified so that you can understand the basics. Later on, more components will be added to it.

Many times a second (about 70 in mode 13h), your VGA controller SCANS through its display memory, and whatever it picks up there is placed onto your screen. When writing loaders, our goal, then, is to modify that display memory so that when the VGA controller scans through it, it will show what we want on the screen.

The organization of this display memory, as it corresponds to how the controller interprets it, turns out to be fairly simple. The first reason is that each pixel on the screen corresponds to exactly 1 byte of display memory.
Thus, each offset of the address corresponds to 1 pixel. Take this for granted for now if you're not following. The second reason is that the memory map is linear. What does this mean? Starting out at offset 0 (coordinate (0,0)), the controller scans until offset 319 (319,0) for the first line of the screen. For the second line on the screen, it just continues scanning from offset 320 to 639, and for the third line, from offset 640 to 959. Examine the following example closely:

    ├──────────────── Display, Horizontal, 0..319 ─────────────────┤
 ┬  ┌──────────┬──────────┬──────────┐       ┌──────────┬──────────┐
 │  │Offset 0  │Offset 1  │Offset 2  │ ----> │Offset 318│Offset 319│
 │  │(0,0)     │(1,0)     │(2,0)     │       │(318,0)   │(319,0)   │
 V  ├──────────┼──────────┼──────────┤       ├──────────┼──────────┤
 e  │Offset 320│Offset 321│Offset 322│ ----> │Offset 638│Offset 639│
 r  │(0,1)     │(1,1)     │(2,1)     │       │(318,1)   │(319,1)   │
 t  ├──────────┼──────────┼──────────┤       ├──────────┼──────────┤
 ,  │Offset 640│Offset 641│Offset 642│ ----> │Offset 958│Offset 959│
    │(0,2)     │(1,2)     │(2,2)     │       │(318,2)   │(319,2)   │
 0  └──────────┴──────────┴──────────┘       └──────────┴──────────┘
 .
 .        |                                       |
 1        |                                       |
 9       \|/                                     \|/
 9
 │
 │
 ┴

Looking at this, the equation for calculating the Offset is:

        Offset = (320 * YCoord) + XCoord

For example, the pixel at (10,20) lives at offset (320 * 20) + 10 = 6410.

Now, all that is missing from the memory address is the segment. The VGA and EGA graphics modes, unlike CGA, all use 0A000h as their display segment. The completed display address then is given by:

        [ 0A000h : Offset ]

You can now put a pixel on the screen by filling in the display memory address with a BYTE value ranging from 0 to 255 (256 values in total), each one of these values representing a different color (to be re-defined in the next part).

Definition of 256 Colors, and the VGA Video DACs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a video mode allows for 256 colors, this means that a total of 256 colors may be displayed on the screen at once. But the definitions of these colors can vary, according to how YOU would like them.
For example, it would be feasible to change all of the 256 colors available to just BLUE. This would mean that everything on the screen would appear blue, no matter what the display memory addresses say. Why? Because each and every one of those 256 values has been DEFINED as blue. The controller keeps a table of what each of those 256 values has been designated as, and it consults that table to decide which color to place onto the screen. Our goal here, then, is to change the definitions of the colors stored in that table.

First off, though, you need an understanding of how to define a color. The spectrum of visible light (the colors) can be defined by just three components and their respective intensities: red, green, and blue (RGB). Using this information, and the fact that each component's intensity is limited to 64 levels (ranging from 0 to 63), you can calculate the total spectrum of colors that the card will allow.

        Red     Green     Blue
        64   x   64    x   64    =   262,144 colors available.

Changing a color is then just a matter of determining which value from the table you would like to change, and the three components that make up your color. There are four VGA Video DAC ports, but only three are needed to look up and change the values in the table. Those three are as follows:

03C7h - Read Index Register. When reading a value from the lookup table, placing a value here will set the read index of the table at that value. Reading the Data Register port three times after this will give you the three components of that color as stored in the table. After the third read, the read index of the table is automatically incremented by 1, so, taking this a little further, reading the Data Register three more times will give you the three components of the NEXT value stored in the table.

03C8h - Write Index Register. This operation is similar to the Read Index Register.
When modifying a value's three components in the lookup table, placing a value here will set the write index of the table at that value. The next three byte values you place into the Data Register will be the three components of the color for that value. Again, after this is completed, the write index for the table is incremented by 1, and the next three bytes that are put into the Data Register will be for the NEXT value in the table.

03C9h - DAC Data Register. This is where the data is read from and written to the table (also called the LUT, or Look-Up Table).

Note: I am uncertain whether the write index and the read index of the LUT are the same. I've found that when you set the write index and read index both at once, and then read from and write to the Data Register several times, the indices get out of step (meaning that they are not both incremented by one after each group of three reads/writes). There's a chance that they are one and the same, in which case you'll have to either read all the values at once and then write them, or set the read index, read three values, set the write index, and write the three values. The fading in and out examples that follow will treat them as the same. If you know otherwise, please inform me.

Now, combining this information with our knowledge of the display memory organization, here is an example that places a blue dot at coordinate (10,20) on the screen.

        ; Change color of "1" to blue.

        MOV     DX,03C8h        ; Write Index Register
        MOV     AL,01           ; We want to modify value 1 on the LUT
        OUT     DX,AL           ; Set write index to value 1
        INC     DX              ; Data Register
        MOV     AL,00           ; Red component is zero
        OUT     DX,AL           ; Set value's red on LUT to 0
        MOV     AL,00           ; Green component is zero
        OUT     DX,AL           ; Set value's green on LUT to 0
        MOV     AL,63           ; Blue component is 63 (max intensity)
        OUT     DX,AL           ; Set value's blue to 63

        ; Change display memory at coord (10,20) to value of 1.
        MOV     AX,0A000h       ; A segment register can't take an immediate,
        MOV     ES,AX           ; so set ES to display segment through AX
        MOV     DI,20*320+10    ; Calculate offset of (10,20) into DI
        MOV     BYTE PTR ES:[DI],1 ; Move "1" into that memory location

Note, however, that this coding is very inflexible and inefficient, and is done only for the purpose of example.

The Vertical Retrace
~~~~~~~~~~~~~~~~~~~~

The last basic component of writing VGA intros is knowing how to produce smooth animation. Referring back to my explanation in the first section, the controller scans the display memory, and in doing so, moves the electron gun with it, left to right. Upon reaching the end of the line, though, the electron gun must RETRACE back to the beginning of the next line before it continues reading the display memory. This retracing is called the Horizontal Retrace.

What we are interested in is the Vertical Retrace. After the electron gun finishes placing the display memory data onto the screen, it must RETRACE back up from the lower right hand corner of the screen to the upper left hand corner, before beginning again to scan the display memory and thus update the screen. This retracing is called the Vertical Retrace, and it is during this time that we should make our modifications to the display memory and LUT so that they will appear to "pop" into place. This is the key to smooth animation.

The video port and bit # that you can read to determine whether the electron gun is in its vertical retrace is 03DAh, bit #3. The following example illustrates its use. The first loop lets any retrace that is already underway finish, and the second loop falls through as soon as the next retrace begins, so that you get the whole retrace period to work in.

        MOV     DX,03DAh
InRetr: IN      AL,DX           ;Get value from port
        TEST    AL,08h          ;Is bit #3 on?
        JNZ     InRetr          ;Yes, already in a retrace, wait until off
NoRetr: IN      AL,DX
        TEST    AL,08h          ;Is bit #3 on?
        JZ      NoRetr          ;No, then wait until the retrace begins

If you didn't understand the above section, then take it for granted, and any time you want to modify the display memory and/or LUT, put this code in front before doing so.

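The fade-out example later in this text calls a procedure named WaitVerticalRetrace without listing it. As a minimal sketch only, such a procedure might look like the one below. It polls bit #3 of port 03DAh as described above: first it lets any retrace in progress finish, then it returns at the start of the next one. Saving and restoring AX and DX, and the label names, are assumptions made for this example and are not taken from the AA loader source.

```asm
; WaitVerticalRetrace - sketch of a reusable retrace wait.
; Assumption: the caller wants AX and DX preserved.
WaitVerticalRetrace PROC NEAR
        PUSH    AX
        PUSH    DX
        MOV     DX,03DAh        ; Port holding the retrace bit
WvrBusy:
        IN      AL,DX
        TEST    AL,08h          ; Bit #3 on = retrace in progress
        JNZ     WvrBusy         ; Let a retrace already underway finish
WvrIdle:
        IN      AL,DX
        TEST    AL,08h
        JZ      WvrIdle         ; Wait for the next retrace to begin
        POP     DX
        POP     AX
        RET
WaitVerticalRetrace ENDP
```

With something like this in place, a single CALL WaitVerticalRetrace goes in front of each batch of display memory or LUT changes.
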
Algorithm/Overhead/Misc
~~~~~~~~~~~~~~~~~~~~~~~

Before working out your algorithm, take into consideration that your graphics execution cycle is the most important part of the program, and that your goal is to optimize it with little regard for overhead time and memory. For your intro to deliver its full effect, smooth animation is the key, and achieving this requires that your processor does not spend time in the animation loop on work which could have been done in the overhead. By doing as many pre-calculations and memory writes as possible in the overhead, you lighten the load on the processor during the animation, and maybe free it for other animations you have in mind. This can be seen in the AA.EXE intro, which takes approximately 5-7 seconds of overhead (on my 20 MHz machine) before the program continues on to the animation.

Some Tricks of the Trade
~~~~~~~~~~~~~~~~~~~~~~~~

Now that you (hopefully) have the basics of the VGA down pat, I'm going to move on to some tricks and effects that can be accomplished using this knowledge. This section includes how to fade in/out, how to move color bars around using palette cycling, how to produce the checkerboard using palette cycling, and how to access the internal CGA character set in order to get some message across (the purpose of the intro in the first place!).

Fading In and Out
~~~~~~~~~~~~~~~~~

Fading a screen in and out is simply a matter of repeatedly incrementing/decrementing all the components of the values on the LUT until they either reach their final values (fade in) or go down to zero (fade out). The difficult part, though, is working out a fast algorithm to do this. The following is an example showing a general fade-out procedure. A fade-in procedure will not be included, but you can figure one out from this example.

        All_RGB DB      256*3 DUP (?)
                                        ;Define temporary work area
                                        ;(this belongs with your data)

        MOV     CX,64                   ;Lower intensities 64 times
OneCycle:
        CALL    WaitVerticalRetrace     ;Using already written proc
        MOV     DX,03C7h                ;Read Index
        XOR     AL,AL                   ;Set read index to 0
        OUT     DX,AL
        INC     DX
        INC     DX                      ;Data Register
        XOR     BX,BX                   ;Init work counter
ReadLoop:
        IN      AL,DX                   ;Read component
        MOV     BYTE PTR All_RGB[BX],AL ;Store component
        INC     BX                      ;Increment work counter
        CMP     BX,256*3                ;All 256*3 done yet?
        JL      ReadLoop                ;No, continue reading
        XOR     BX,BX                   ;Init work counter (dec)
DecLoop:
        CMP     BYTE PTR All_RGB[BX],0  ;Is it zero already?
        JZ      Continue                ;Yes, skip decrement
        DEC     BYTE PTR All_RGB[BX]    ;No, decrement
Continue:
        INC     BX                      ;Increment work counter
        CMP     BX,256*3                ;256*3 times already?
        JL      DecLoop                 ;No, continue decrementing
        CALL    WaitVerticalRetrace     ;Using already written proc
        MOV     DX,03C8h                ;Write Index
        MOV     AL,0                    ;Start writing at zero
        OUT     DX,AL
        INC     DX                      ;Data Register
        MOV     SI,OFFSET All_RGB       ;DS:SI point to work area
        PUSH    CX                      ;Store cycle number
        MOV     CX,256*3                ;Will repeat 256*3 times
        CLD                             ;Set SI forward
        REP     OUTSB                   ;Write ALL components from work
                                        ;area (186 or later instruction)
        POP     CX                      ;Restore cycle number
        LOOP    OneCycle                ;Continue 64 times...

I believe this is as fast as you can perform the operation when decrementing each component by 1, the reason being that all fade-out routines need to wait for the vertical retrace (for smoothness again), which takes up most of the time. If anyone comes up with anything simpler and faster, I would appreciate it if you'd let me know.

Color Bars
~~~~~~~~~~

I don't believe I ever defined what palette cycling meant in the last release, which may have caused a lot of confusion. Basically, it means having a static screen (very little or no modification of the display memory) and just changing the components of the LUT around for a nice effect. Example: color bars.

You've all seen the color bars on the Amigas and STs. It's also feasible on the PCs, although it takes a little bit more work.
To get horizontal color bars moving up and down, you simply assign a different color value in the display memory FOR EACH DISPLAY LINE that the bars will be moving across. Examine the following closely:

        ┌───┬────┬────┬────┬────┬─────┐
        │  1│   1│   1│   1│   1│  1  │ --->
        ├───┼────┼────┼────┼────┼─────┤
        │  2│   2│   2│   2│   2│  2  │ --->
        ├───┼────┼────┼────┼────┼─────┤
        │  3│   3│   3│   3│   3│  3  │ --->
        ├───┼────┼────┼────┼────┼─────┤
        │  4│   4│   4│   4│   4│  4  │ --->
        └───┴────┴────┴────┴────┴─────┘
          |                        |
          |                        |
         \|/                      \|/

        *The numbers in the boxes are color values*

Looking at this arrangement, we can now selectively choose which color values we want on, and which ones we want off (off = black, or whatever the background color happens to be). For example, say I defined a bar as having a width of 4. First I would set ALL of the colors involved in this palette cycling (one for each line you filled in and want the bar to travel across) to the background color (if black, then the RGB components would be 0,0,0). Then, if I wanted the bar at location 20, I would just fill in the RGB components of the bar at values 20, 21, 22, and 23. To move the bar down one, I would first black out values 20, 21, 22, and 23, and then fill in values 21, 22, 23, and 24. It's fairly straightforward.

You're now probably wondering, wouldn't it be easier, then, to just move the bars themselves around in display memory? No, because you would have to move complete lines around, which is very cumbersome. This method allows for quick line color changes, saving you more time for other animations.

I won't include an example of the actual routine in this text, but one of the intros I've packaged with this ZIP demonstrates this. Notice there that the bars are moving in a sine wave motion.

Accessing the CGA Character Set
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The whole point of an intro is to get some message across to the viewer (be it to brag about some group (most cases) or to advertise some board).
In order to do so, though, you must find a character set to print this message in, and one that will fit our tight requirements. These requirements are the size of the set, and how easily the character set can be obtained.

For the reasons stated above, although it is not exactly pretty, I have chosen to use the 8x8 CGA character set located within the BIOS. This set, being internal, takes up no additional space in the intro and is fairly easy to access. Starting at the memory address [ 0F000 : 0FA6E ], the characters are stored in memory line by line (or byte by byte, since each character is 8 x 8 bits). Thus, every 8 bytes, you will encounter the next character in the set.

Example: Suppose that the character set starts with the letter "A" (which it does not: since "A" is ASCII 65, it actually sits 65 characters, or 520 bytes, into the set).

        0F000 : 0FA6E   00110000        0F000 : 0FA76   11111100
        0F000 : 0FA6F   01111000        0F000 : 0FA77   01100110
        0F000 : 0FA70   11001100        0F000 : 0FA78   01100110
        0F000 : 0FA71   11001100        0F000 : 0FA79   01111100
        0F000 : 0FA72   11111100        0F000 : 0FA7A   01100110
        0F000 : 0FA73   11001100        0F000 : 0FA7B   01100110
        0F000 : 0FA74   11001100        0F000 : 0FA7C   11111100
        0F000 : 0FA75   00000000        0F000 : 0FA7D   00000000

And so on, with each character following the previous one according to the ASCII table.

Looking at the latest intros, though, it seems that more of the major groups are using the nice multi-colored lettering. That would be feasible, but would (a) take up extra space (even though it compresses well), and (b) take up too much of my time. How would you do it, you ask? There are two ways: either display the characters on the screen and capture each letter of the alphabet there (VERY tedious), or figure out how the character set is stored in the file and access it from there. Good luck with either of those. If anyone has already gotten a multi-colored set into a known format, please let me know about it...

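As a rough sketch of how the set described above could be read and plotted in mode 13h, the routine below walks the 8 bytes of one character starting at [ 0F000 : 0FA6E + 8 * code ] and plots a pixel for every bit that is set. DrawChar8x8 is a hypothetical routine: its name and register conventions are assumptions made for this example, and it assumes mode 13h is already set. As far as I know, the table at this address only holds ASCII codes 0 to 127, so the sketch does not handle anything above that.

```asm
; DrawChar8x8 (hypothetical) - plot one BIOS 8x8 character.
;   In:  AL = ASCII code (0..127)
;        AH = color value to plot
;        DI = display offset of the character's top-left pixel
;   Destroys AL, BX, CX, SI, DI.
DrawChar8x8 PROC NEAR
        PUSH    DS
        PUSH    ES
        MOV     BL,AL
        XOR     BH,BH
        SHL     BX,1
        SHL     BX,1
        SHL     BX,1            ; BX = code * 8 (8 bytes per character)
        ADD     BX,0FA6Eh       ; Offset of this character in the set
        MOV     SI,BX
        MOV     BX,0F000h
        MOV     DS,BX           ; DS:SI -> character data in BIOS
        MOV     BX,0A000h
        MOV     ES,BX           ; ES:DI -> display memory
        CLD
        MOV     CX,8            ; 8 lines per character
DcLine: LODSB                   ; AL = next line of the character
        MOV     BL,AL
        MOV     BH,8            ; 8 pixels per line
DcBit:  SHL     BL,1            ; Leftmost remaining bit -> carry
        JNC     DcSkip          ; Bit off, leave the background alone
        MOV     ES:[DI],AH      ; Bit on, plot the pixel
DcSkip: INC     DI
        DEC     BH
        JNZ     DcBit
        ADD     DI,320-8        ; Start of the next line down
        LOOP    DcLine
        POP     ES
        POP     DS
        RET
DrawChar8x8 ENDP
```

For instance, loading AL with 'A', AH with 1 and DI with 20*320+10 before the CALL would draw an "A" in color value 1 with its top-left corner at (10,20).
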
Swirling Letters (With the CGA Character Set)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Producing the swirling letters across the screen (first shown by the earlier TDT-TRSI intros) is simply a matter of combining pre-calculated coordinates with fast display memory writes.

First, you must calculate all the coordinates of the path that the letters will travel along (the swirling effect uses the sine wave). Since the height of the characters (of the CGA character set) is eight, you need eight of these paths. Shifting the starting phase of the angle by one increment (whose size is dependent upon your program) for each of these paths will give a 3-dimensional effect.

The next step is to think of some way of rapidly placing the pixels of the characters in the display memory according to the path. One consideration is the fact that calculating the offset for every path pixel during the graphics execution cycle is very cumbersome, and will slow down your animation. As a result, I pre-calculated not only the coordinates of the pixels, but their display memory offsets as well. The other item you should consider is the storage format of the characters of the message. By converting each bit of the character set into a byte beforehand, I don't waste any time during the graphics cycle stripping bits out of each byte to get the character printed on the screen.

After this has been completed, your graphics cycle should look similar (in process) to this pseudocode:

        { Wait for vertical retrace }
        { Delete (set to zero) section of screen where letters pass through }
        { Re-plot ALL the points of the characters, according to the path defined }
        { Repeat }

Examine CENTERPT.PAS and try tracing through the whole thing if you don't understand this portion of the text.

Checkerboard
~~~~~~~~~~~~

I got the idea for the checkerboard effect after looking at the TLS.EXE graphics demo.
Trying to figure out a way to compress this into something small was difficult, but I finally found that the easiest way to do this is through palette cycling.

So how do you go about palette cycling this mess? First of all, you need to break everything down into small pieces. Consider a standard checkerboard, instead of one going off into the horizon point. How would you go about placing the values in the display memory so that you could have a solid square moving about in all directions with palette cycling? Here's the layout of the values in the display memory (I'm considering a 4x4 pixel square now):

        01 02 03 04 05 06 07 08         (Figure 1)
        09 0A 0B 0C 0D 0E 0F 10
        11 12 13 14 15 16 17 18
        19 1A 1B 1C 1D 1E 1F 20

To produce a checkerboard in this setup, you would just set the values 01 to 04, 09 to 0C, 11 to 14, and 19 to 1C to the solid color (say blue), and the values 05 to 08, 0D to 10, 15 to 18, and 1D to 20 to the other color (say white). The result would be:

        B B B B W W W W
        B B B B W W W W
        B B B B W W W W
        B B B B W W W W

Where B = Blue, and W = White. To produce this checkerboard through the entire screen, you would just place squares of these values in the following order (relative to the display screen):

                                        (Figure 2)
        01 02 03 04 05 06 07 08
        09 0A 0B 0C 0D 0E 0F 10  ---> And so on...
        11 12 13 14 15 16 17 18
        19 1A 1B 1C 1D 1E 1F 20
        05 06 07 08 01 02 03 04
        0D 0E 0F 10 09 0A 0B 0C  --->
        15 16 17 18 11 12 13 14
        1D 1E 1F 20 19 1A 1B 1C
        01 02 03 04 05 06 07 08
        09 0A 0B 0C 0D 0E 0F 10  --->
        11 12 13 14 15 16 17 18
        19 1A 1B 1C 1D 1E 1F 20
                   |
                  \|/

Which produces the following:

        B B B B W W W W
        B B B B W W W W  ---> And so on...
        B B B B W W W W
        B B B B W W W W
        W W W W B B B B
        W W W W B B B B
        W W W W B B B B
        W W W W B B B B
               |
              \|/

Now, in order to "move" the squares left one, looking at Figure 1 and using the value layout in the display memory, I would just set the values to the following colors:

        B B B W W W W B
        B B B W W W W B
        B B B W W W W B
        B B B W W W W B

Use Figure 2, and see what happens to the rest of the display when the values are set to the colors above. And to move the squares down one, the colors corresponding to the values in Figure 1 are as follows:

        W W W W B B B B
        B B B B W W W W
        B B B B W W W W
        B B B B W W W W

Put these colors in Figure 2, and you can see that the rest of the display will also shift down one.

Now, determining which colors correspond to which values can take time (in fact, a lot!). So, what I did in the AA.EXE loader was pre-calculate the palette colors for every possible position and store each set in a separate array (in this case, this would be 8 x 8 arrays, or 64 arrays total). Then I assigned 4 pointers to each record, pointing to the records which would move it up, down, left, and right. So moving all the squares right one pixel just requires me to switch the current palette array to the one its "right" pointer refers to. Simple.

Now that you've got the standard checkerboard worked out, how do you relate this to a checkerboard with the perspective of going into the horizon point? I'm NOT going to go into the details of shifting the perspective, but for their relation, basically what you need to know is when to place a value in the display memory, and how many to put down. This can be done by either rounding (I found this did not work well) or truncating (better). Go through the AA.EXE code if you want to know how I did the perspectives and how I determined the placement of the values.

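As a rough sketch of the Figure 2 layout for the flat (no perspective) checkerboard, the routine below fills the whole mode 13h screen with the values 01h to 20h. For each pixel it takes the line within the 4-line block (Y AND 3) and the column within the 8-pixel cell (X AND 7), swaps the two 4-pixel halves on every other block of 4 lines, and stores line * 8 + column + 1. FillCheckerValues is a made-up name and is not taken from the AA.EXE source; the routine assumes mode 13h is already set and recomputes every value per pixel for clarity rather than speed.

```asm
; FillCheckerValues (hypothetical) - lay down the Figure 2 values.
;   Destroys AX, BX, CX, DX, DI and ES.
FillCheckerValues PROC NEAR
        MOV     AX,0A000h
        MOV     ES,AX           ; ES:DI -> display memory
        XOR     DI,DI           ; Start at offset 0, coordinate (0,0)
        CLD
        XOR     DX,DX           ; DX = Y (0..199, so DL holds all of it)
FcRow:  XOR     CX,CX           ; CX = X (0..319)
FcCol:  MOV     AL,DL
        AND     AL,3            ; Line within the 4-line block
        SHL     AL,1
        SHL     AL,1
        SHL     AL,1            ; AL = line * 8
        MOV     BL,CL           ; Low bits of X
        TEST    DL,4            ; Odd block of 4 lines?
        JZ      FcKeep
        XOR     BL,4            ; Yes, swap the two 4-pixel halves
FcKeep: AND     BL,7            ; Column within the 8-pixel cell
        ADD     AL,BL
        INC     AL              ; Values run from 01h to 20h
        STOSB                   ; Store value, move to next pixel
        INC     CX
        CMP     CX,320
        JB      FcCol
        INC     DX
        CMP     DX,200
        JB      FcRow
        RET
FillCheckerValues ENDP
```

Once the values are in place, the display memory never has to be touched again: setting values 01 to 20 in the LUT to the B/W patterns shown above is what makes the squares appear and move.
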
Line/Vector Letters
~~~~~~~~~~~~~~~~~~~

I just saw a new TDT-TRSI intro (these guys seem to be at the forefront of coming up with new ideas) incorporating line/vector drawn letters. Line/vector lettering has the advantage that the letters can follow certain shapes and increase/decrease in size, which overall produces a neat effect. I'm not going to go into the details, the major reason being that I haven't tried this yet, but I will run through a couple of important points.

The lettering used is defined by points and the lines connecting those points. You'll probably have to make up your own lettering as well as the storage format, since there are few pre-defined vector/line characters out there that are simple enough to be used in animation and small enough to be used in an intro. This part will take some time (one reason I haven't tried it).

A pre-calculated path is also needed for the characters to travel along. This time, though, you only need a top and a bottom path, and maybe a middle one if you want lines going through the middle. Because these characters are line drawings, you only need the paths on which the points will lie.

And finally, you'll also need a fast line drawing procedure, probably implemented in assembly for greatest speed. Basically, what you'll be doing is continually redrawing the letters as lines with shifting coordinates (with respect to the defined paths), so you'll need to be able to draw all the lines fast enough for smooth animation.

Be forewarned that this type of intro will take a lot of time (compared to the others). Make sure that you've got enough of it before starting...

Final Word
~~~~~~~~~~

Believe me, there is MUCH more information about the VGA than I've covered in this text. A lot of those functions, though, go into the higher resolution modes, which take more time to learn about. Functions such as split screen mode, modifying the dimensions of display memory, tricks with the non-standard modes, etc., are all available.
When you get the hang of mode 13h (or get bored with it, whichever comes first), I encourage you to delve into those higher modes I've been referring to...

------------------------------------------------------------------------------

Thanks for bearing through all 10 pages of this text. Whew! I don't think I've ever written anything so huge before, and given the time it took, I can almost guarantee that you won't see anything this big from me again.

Anyway, if you want to get in touch with me, you can reach me on my board, CenterPoint! BBS (301)309-0144, 9600+ only. I welcome all your ideas, executables, and sources (if you're willing to part with them). Impress me.

I also just got an Internet account (since I'm at UMCP now). If you want faster replies, send mail to nietzche@wam.umd.edu, and I'll usually reply within a day or two.

Later on..