***************************************************************************
***************************************************************************
The UCR Standard Library for Assembly Language Programmers,
Written By Randall Hyde and others, is
SHARE WARE
We do not want any registration fees for this software.
Now for the catch... It is more blessed to give than to receive.
If this software saves you time and effort and you enjoy using it,
our lives will be enriched knowing that others have appreciated our work.
We would like to share this wonderful feeling with you. If you like this
software and use it, we would like you to contribute at least one routine to
the library. Perhaps you think this library has some neat-o routines in it.
Imagine how nice it would become if everyone used their imagination to
contribute something useful to it.
We hereby release this software to the public domain. You can use it in any
way you see fit. However, we would appreciate it if you share this software
with others as much as it has been shared with you. That is not to suggest
that you give away software you have written with this package (We're not
quite as crazy as Richard Stallman, bless his heart), but if someone else would
like a copy of this library, please help them out. Naturally, we would be
tickled pink to receive credit in software that uses these routines (which is
the honorable thing to do) but we understand the way many corporations operate
and won't be terribly put off if you use it without giving due credit.
Enjoy!
If you have comments, bug reports, new code to contribute, etc., you can
reach us through (address and email are circa 1993, if you read this in
1999, don't count on it!):
rhyde@cs.ucr.edu (On Internet).
or
Randall Hyde
Dept of Computer Science
100 University Office Bldg
University of California
Riverside, Ca. 92521
COMMENTS ABOUT THE CODE:
************************
Please don't expect super optimal code here. Most of it is fairly mediocre
(from a size/speed point of view). Hopefully, you'll agree, it's the idea
that counts. If you do not like something I have done, you have got the
sources -- have at it. (Of course, it would be appreciated if you would
send any modifications to one of the E-MAIL addresses above.)
************************************* NOTE ************************************
Please understand the purpose of this code! This library is here to make
assembly language programming easy. The nature of this library encourages
people to write code in a fashion similar to that employed when they write
programs in a high level language like C. While this familiar style of
programming does make the task easier, it is not the most appropriate
approach to use when flat-out performance is what you're seeking. "C code
written with MOV instructions" is never as fast as pure assembly language
code employing the proper programming paradigm. Why mention this? Well,
some readers may have heard about assembly language's legendary performance
and they're expecting to achieve that using this library. While programs
written with this library may very well run faster than a comparable program
written in a HLL, you will not get fantastic performance improvement until
you stop thinking in HLLs and start "thinking" in assembly. The purpose
of this library is to help you *avoid* thinking in assembly language.
Therefore, this code will not help you achieve those fantastic performance levels
you've been hearing about; indeed, this library may stand in the way of that
goal. It's not that these routines are terribly slow, mind you. They just
encourage an inappropriate programming style if speed is what you're after.
On the other hand, since only 10-20% of the code of any given program
represents the time critical stuff (an argument long employed by HLL
supporters), there is nothing wrong with judicious use of this code within
a program that has to be fast. As usual, if performance is your primary
goal, you must study the problem and the program you generate very carefully
to isolate the time critical portions. If you are interested in high-
performance programming at the "micro-algorithm" level, you should take a look
at Michael Abrash's text "Zen of Assembly." This excellent book will explain
many ways to improve the performance of your code at the sub-algorithm level
(where assembly language really shines).
COMMENTS ABOUT THIS DOCUMENTATION:
**********************************
You will have to forgive us for the inconsistent style appearing throughout
this document. Keep in mind that this document has been prepared by many
different people. Keeping the styles consistent is a time consuming and
difficult task.
Whenever a routine's description claims that the flags are not affected,
you should not interpret this to mean that the routine preserves the flags.
Most routines do *not* preserve any of the flags. Such a statement simply
means that the routine does not *explicitly* return a value in one (or more)
of the flag bits.
Note that proper credit has been given to the author of each of the various
routines appearing in this library *except* for many written by Randall
Hyde. All routines without an author by-line were probably written by
Randall Hyde (unless we screwed up somewhere and forgot to put a name
in the documentation). Most of these routines were tested and documented
by various students in Randy Hyde's CS 13 (assembly language) and CS 191X /
CS 185 courses (Commercial Software Development). There are too many names
to mention here, but these students definitely deserve the credit for locating
numerous bugs in the code, providing many suggestions, and doing other work.
Of course, there have been numerous suggestions and bug notices from helpful
souls on BIX and the Internet, as well. Thank you all.
*NOTICE* We have noticed, from time to time, that there are routines in the
library which have not been documented. Perusing the source listings will
help you locate some library routines which have slipped through the cracks.
Also keep in mind that there isn't a one to one correspondence between
source files and library routines. Many of the source files contain
two or more library routines. Someday we will attempt to document which
files contain which routines, but that's in the future for now.
=============================================================================
Version History:
Version 00- Initial release as "Randy Hyde's Standard Library for 80x86
Assembly Language programmers"
Version 10- Initial release as "UCR Standard Library..." CS 191X
students did some testing and documentation in this release.
Version 20- More testing on several routines. Added floating point
library and several other routines.
Version 21- Fixed *MAJOR* bugs in floating point package. Added
11-1-91 several new routines. Included new "TEST" files with
the library. Also included SHELL.ASM file inadvertently
left out of Version 2.0.
Version 22- Made some minor modifications to puth, putl, ltoa, and htoa
11-14-91 as per suggestions made by David Holm and Terje Maithesen
Version 23- Made a small but *major* modification to the stdlib.a and
11-22-91 stdlib.a6 files to force library calls into the STDGRP group.
Otherwise the linker substituted bad segment addresses for
the far calls to the library routines. A real problem when
accessing variables in StdData.
Version 24- Yet more changes to fix the stupid MASM group/segment:offset
12-7-91 bug. Made various changes to the STDLIB.A file. Also fixed
a problem in the FP routines- forgot to declare sl_sefpa
public. Finally, created batch file to automatically unpack
everything from DOS (assuming presence of PKUNZIP somewhere
in the current path).
Version 25- Some new macros (DOS, ExitPgm), fixed a problem with the
12-25-91 PUTI routine, added some SmartArray items. Also added the
GetEnv routine.
Version 26- Maintenance release coinciding with the Dr. Dobb's article
2/20/92 in the March 1992 issue.
Version 27- SmartLists and interrupt driven serial routines added to
6/19/92 the libraries. Also created smaller include files for
each of the standard library categories. (note: the serial
routines actually existed prior to this release, they were
cleaned up and documented for this release). Fixed a couple
of truly disgusting bugs in the floating point package
(wouldn't properly print values like 8100 and hung whenever
encountering a zero value in FADD/FSUB).
Version 28- Modified MemInit to allow the programmer to specify how many
8/20/92 pages to reserve for the heap and the location of the heap.
Version 29- Added HeapStart routine to the memory management code so an
10/5/92 application could get the segment address of the start of
the heap. This is useful when you want to deallocate the
heap (by calling DOS' deallocate routine), for example, to
free up the heap memory so you can run another application.
What really needs to be done here is to write a dealloc
routine, but HeapStart offered some flexibility.
Version 30- Fixed bug in ATOH2 routine (it incremented DI once too far).
10/11/92- Also fixed the same bug in ATOI2, ATOU2, ATOL2, ATOUL2, etc.
3/16/93 Added StrTrim (m) and StrBlkDel (m) to the library. Added
the pattern matching package to the library. Added the date
and time routines (ATOD, DTOA, DTOA2, DTOAm, xDTOA, xDTOAm,
xDTOA2, ATOT, TTOA, TTOA2, TTOAm, xTTOA, xTTOAm, xTTOA2) to
the library. Fixed a bug in ATOI and ATOL which passed off
the ":" character as a numeric digit. Broke the MemInit
routine into two separate routines: MemInit & MemInit2 which
let the user specify the location of the heap or use all the
available memory. Also, no longer require that PSP be a
global variable (However, the library does require DOS 3.3
or later). Fixed a bug in PRINTF/PRINTFF (it did not
properly restore the flags and BP). Fixed a bug in the
LSFPO routine (thanks to Tim Farley for pointing this out).
Added the process manager package to the library.
Version 31- Fixed a bug in strstr which prevented it from matching a
6/10/93 substring at the beginning of a string. Added file
7/24/93 routines to the library. Added macros for strbdel and
8/1/93 strtrim to string.a. Fixed a bug in stricmpl, forgot to
copy a pointer into SI within the routine. Fixed a bug in
CPUID which crashed the machine when run on a 486.
Version 32- Fixed several bugs in the list routines. Added some actual
3/24/94 file routines to the library. Updated the documentation.
Version 33- Fixed some bugs in the floating point code. Fixed a bug
7/15/94 in the pattern matching code. Added new pattern
match routines. Changed the name of CPUID because it
conflicts with the Pentium instruction of the same name.
Fixed several bugs in the processes package. Fixed some
problems in the documentation (certain routines were listed
by the wrong name).
Version 34- Fixed several problems in the documentation. Some other
11/18/94 minor bug fixes including changing the CPUID name to
CPUIDENT (to avoid conflict with Pentium CPUID instruction).
Also modified IBML to use CPUIDENT rather than CPUID.
Version 35- Fixed a problem with a signed comparison in the pattern
matching code. It turned out that if you failed on the
first character of a string, it bombed the system. Also
changed the doc on patterns to fix an error.
Version 36- Fixed a bug in the FREAD routine. There are known bugs in
4/4/95 the floating point package, but we have not been able to get
a sample that reproduces them to determine the cause.
==============================================================================
ROUTINES WE WOULD LIKE TO HAVE:
*******************************
If you're interested in adding some routines to this
package, GREAT! Here are some suggestions.
1) Routines which manipulate directories (read/write/etc.)
2) We did it already!
3) Length-prefixed strings package.
4) A graphics package.
5) An object-oriented programming class library.
6) Floating point functions (e.g., SIN, COS, etc.)
7) Just about anything else appearing in a HLL "standard" library.
If you've got any ideas, we would love to discuss them with you. The best
way to reach us is through the E-MAIL addresses above.
MISSING ROUTINES TO BE SUPPLIED IN THE FUTURE:
**********************************************
Table Package
TblInit- Initializes a particular table.
TblEnter- Enters an item into a table.
TblLookup- Looks up an item in a table.
TblFree- Free up memory in use by a table.
Tree Package
<pretty much the same routines as the list package>
Set Package
<Generic set routines (not just character set routines) similar to cset pkg>
Processes Package
Sleep- Delays a process for some period of time.
YieldTo- Transfers control to a specific process.
Forkm- Allocates new PCB on the heap.
Sync- Halts a process until another process dies.
Join- Merges two processes together.
wait & release- Semaphore/synchronization primitives.
80386 Optimized Code
Despite the disclaimer about speed earlier in this document, we do have
plans to rewrite these routines for speed at some point in the future.
At that time we will write the code specifically for 80386 and later
processors (the code will probably be optimized for Pentium/586 processors
at that time). Stay tuned.
HOW TO USE THE STANDARD LIBRARY:
********************************
When you are ready to begin programming with the library, you should
copy the shell.asm file, provided in the package, to another file in
which you will be working, e.g., myprog.asm. The shell.asm file sets
up the machine (segments, etc.) as the UCR Standard Library expects
them. Results are undefined for other setups. Therefore, I strongly
suggest that, when you begin using these routines, you follow the
shell.asm format. Later, when you are familiar with the software,
you may wish to create your own shell.asm file, but it is wise to
initially use the one provided. The shell.asm file has comments which
tell you where to place your code, variables, etc.
There is an include file stdlib.a which
you should include in every assembly you perform which calls the stdlib
routines. SHELL.ASM already includes this file. *YOU MUST PLACE THE
INCLUDE STATEMENT OUTSIDE OF ANY SEGMENTS IN YOUR PROGRAM*, preferably
as the first line of your program (just like SHELL.ASM). If you place
this include directive inside a segment, certain assemblers/linkers
(especially MASM) will not properly assemble and link your programs.
They will assemble and link without error, but the resulting program
will not execute correctly.
The STDLIB.A file contains macros you can use to call each of the routines
in the standard library. For example, to call PRINTF you would use the
statement
printf
db "format string",0
dd other,vars
rather than "calling" printf. Printf is actually a macro; you cannot call
it directly (all of the standard library routines have names like "sl_printf"
and the macro issues a call to the appropriate routine). These macros have
two main purposes-- first, they differentiate calls to the standard library
routines (i.e., no "call" instruction is the difference); and second, they
contain some extra code to perform "smart linking" with MASM 5.1 & earlier,
TASM, and OPTASM. MASM 6.0 supports a new directive, externdef, which
eliminates the need for this extra code, but the extra code works nonetheless.
Starting with version 27, many of the standard library macros were separated
into smaller files. This speeds up assembly when you don't need *all* of
the routines in the library (the macro file is getting quite large).
STDLIB.A still exists and still loads everything, but you should get in the
habit of specifying the smaller files instead. For MASM 6.0 users, a
special set of include files "*.a6" are now available. MASM 6.0 seems to
run out of memory if you include "stdlib.a6" (which includes everything) so
you may have to include only those files you actually use.
All of the standard library routines, and most of their local data values,
are in a segment named "stdlib". You should not create such a segment unless
you plan on adding new routines to the standard library.
Note: if you want to use the pattern matching functions provided in the
pattern matching package, you will need to include the following
statement somewhere *after* the "include stdlib.a" or
"include pattern.a" statement:
matchfunc
This declares the necessary external names required by the pattern
matching operations. The SHELL.ASM file contains a commented-out
line with this statement. If you use pattern matching in programs
which start out as SHELL.ASM you can simply uncomment this line.
HOW THE STANDARD LIBRARY IS ORGANIZED:
**************************************
The documentation spec sheets for each of the standard library routines appear
in other files provided with the standard library. We've organized these
routines by category. The categories supported to date include
Standard Input Routines
Standard Output Routines
Conversion Routines
Utility Routines
String Handling Routines
Memory Management Routines
Character Set Routines
Floating Point Routines
File I/O
Miscellaneous Routines
Time & Date Routines
Smart List Routines
Serial Port I/O
Pattern Matching Package
Process Package
IF YOU WANT TO PLAY WITH THE SOURCE LISTINGS
********************************************
Most users will probably use the standard library routines in object form
and never worry about the actual implementation. If you, on the other hand,
want to get "under the hood" and take a look at how this code was written
(perhaps to fix a bug), all the source listings are provided with this
release.
We assemble the library for final distribution using TASM 3.0 with the
"/M3", "/jjumps", and "/ic:\stdlib\include" command line options. If you
do not specify these options you will probably get an assembly error.
All initial development of these routines was done with MASM. By writing the
code with MASM and then assembling the final release version with TASM we
could verify that the code worked with both assemblers.
That is, at least, until MASM 6.0 came along. All new routines written since
the introduction of MASM 6.0 were developed with MASM 6.0 and assembled with
TASM 3.0. They should compile with MASM 5.1 as well (though we haven't
verified this). HOWEVER, older routines written before the release of MASM 6
will probably not assemble properly under MASM 6.0 unless you specify the
MASM 5.1 compatibility options. Furthermore, routines written after the
release of MASM 6.0 take advantage of MASM/TASM's "branch out of range"
automatic correction and may produce errors when assembled under MASM 5.1.
Moral of the story-- If you're still using MASM 5.1 (or earlier) or TASM 2.0
(or earlier), *upgrade*!
Given the divergent paths that MASM 6.0 and TASM 3.0 are taking, it is
unlikely that we will continue to provide all future code in a form which
compiles under both assemblers. The windowing package we've created (but
have not released), for example, will only assemble under MASM 6.0. We will
always make sure that the object code works with any assembler/linker out
there, but it's unlikely we will continue to support both MASM and TASM
at the source level indefinitely (unless BORLAND gives us good
reason to do otherwise, like having a MASM 6.x compatibility mode). Sorry,
it's just too much work for so little return.
Of course, if you would volunteer to translate our MASM 6 code to TASM,
we'd be more than happy to give you full credit for your work.
Currently (6/93), MASM 6.0 and MASM 6.1 have some severe bugs which create
some major problems. As soon as a stable release appears we will convert
specifically to MASM 6.x.
Acknowledgements
================
There are far too many people who have their fingers in this package to
give full credit to everyone involved. Furthermore, this section was
added long after many hard-working people's efforts were forgotten.
If you are one of these people, send me (rhyde) email and I will
certainly rectify this situation.
Most of the routines in the library were written by Randy Hyde.
Those routines authored by someone else contain appropriate notes in the
comments found in the source listing.
Many thanks to those who have found problems in routines in the library.
This includes the students in CS 191x, CS 185, CS 162ABC, and CS 13 at
UC Riverside. They have made important contributions to this library
and their efforts are not forgotten.
Special thanks to the CS 191x class at UC Riverside who reorganized the
documentation from its original sorry state. Special thanks to Steve
Shah for his quick reference guide.
Last, but certainly not least, praise and glory to our Lord for giving us
all the talent to achieve this...
In the future, I will endeavor to keep this section up to date and provide
personal acknowledgements to those who have contributed to the success of
this library.
Conversion Routines
-------------------
The stdlib conversion routines follow a uniform convention for passing the
data to be converted and returning the result. Most routines accept and
return either an ASCII string of characters, pointed at by ES:DI, or an
integer, held in DX:AX. A 16-bit or 8-bit value is held in AX or AL
instead.
Since the input to be converted may be invalid (for example, a string that
does not contain a proper value to convert), we use the carry flag to show
error status. If the carry flag is set, an error has occurred; if the carry
flag is clear, the conversion succeeded.
Routine: ATOL (2)
------------------
Category: Conversion Routine
Registers on Entry: ES:DI- Points at string to convert
Registers on Return: DX:AX- Long integer converted from string
ES:DI- Points at first non-digit (ATOL2 only)
Flags Affected: Carry flag- Error status
Examples of Usage:
gets ;Get a string from user
ATOL ;Convert to a value in DX:AX
Description: ATOL converts the string of digits that ES:DI points at to a
long (signed) integer value and returns this value in DX:AX.
Note that the routine stops on the first non-digit.
If the string does not begin with a digit, this routine returns
zero. The only exception to the "string of digits" only rule is
that the number can have a preceding minus sign to denote a
negative number. Note that this routine does not allow leading
spaces. ATOL2 works in a similar fashion except it doesn't
preserve the DI register. That is, ATOL2 leaves DI pointing at
the first character beyond the string of digits. ATOL and ATOL2
both return with the carry flag clear if they translated the string
of digits without error. They return with the carry flag set if
overflow occurred.
Include: stdlib.a or conv.a
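The parsing rules above can be restated as a short executable model. The
sketch below is Python rather than 80x86 assembly and is not taken from the
library source. The name atol_model is hypothetical, the tuple it returns
stands in for DX:AX, DI, and the carry flag, and returning the unwrapped value
on overflow is an assumption (the description says only that carry is set).

```python
def atol_model(s, start=0):
    """Model ATOL2: return (value, index, carry) for the text at s[start:]."""
    i = start
    negative = i < len(s) and s[i] == "-"      # one optional leading minus
    if negative:
        i += 1
    magnitude = 0
    while i < len(s) and "0" <= s[i] <= "9":   # stop at the first non-digit
        magnitude = magnitude * 10 + ord(s[i]) - ord("0")
        i += 1
    value = -magnitude if negative else magnitude
    carry = not (-2**31 <= value <= 2**31 - 1)  # overflow of signed 32 bits
    return value, i, carry                      # ATOL itself leaves DI alone


print(atol_model("-1234xyz"))   # (-1234, 5, False)
print(atol_model(" 42"))        # (0, 0, False): leading spaces are not skipped
```

The second call shows why a caller that accepts free-form input would need to
skip blanks before converting.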
Routine: AtoUL (2)
-------------------
Category: Conversion Routine
Register on entry: ES:DI- address of the string to be converted
Register on return: DX:AX- 32-bit unsigned integer
ES:DI- Points at first character beyond digits (ATOUL2
only)
Flags affected: Carry flag- Set if error, clear if okay.
Examples of Usage:
les di, InputString
AtoUL
Description: AtoUL converts the string pointed at by ES:DI to a 32-bit
unsigned integer and returns that integer in DX:AX. If there is
an error in conversion, the carry flag will be set to one. If
there is not an error, the carry flag will be cleared to zero.
ATOUL2 does not preserve DI. It returns with DI pointing at
the first non-digit character in the string.
Include: stdlib.a or conv.a
Routine: ATOU (2)
--------------------
Category: Conversion Routine
Register on entry: ES:DI points at string to convert
Register on return: AX- unsigned 16-bit integer
ES:DI- points at first non-digit (ATOU2 only)
Flags affected: carry flag - error status
Example of Usage:
les DI, Str2Convrt
atou ;Convert to value in AX.
Description: ATOU converts an ASCII string of digits, pointed to by ES:DI,
to unsigned integer format. It places the unsigned 16-bit
integer, converted from the string, into the AX register.
ATOU works the same as ATOI, except it handles unsigned 16-bit
integers in the range 0..65535.
ATOU2 leaves DI pointing at the first non-digit in the string.
Include: stdlib.a or conv.a
Routine: ATOH (2)
-----------------
Category: Conversion Routine
Registers on Entry: ES:DI- Points to string to convert
Registers on Return: AX- Unsigned 16-bit integer converted from hex string
DI (ATOH2)- First character beyond string of hex digits
Flags Affected: Carry = Error status
Example of Usage:
les DI, Str2Convrt
atoh ;Convert to value in AX.
putw ;Print word in AX.
Description: ATOH converts a string of hexadecimal digits, pointed to by
ES:DI, into unsigned 16-bit numeric form. It returns the value in
the AX register. If there is an error in conversion, the carry
flag will be set to one. If there is not an error, the carry flag
will be clear. ATOH2 works the same except it leaves DI
pointing at the first character beyond the string of hex digits.
Include: stdlib.a or conv.a
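For readers who want to check their expectations of the hex converters, here
is a small Python model (a sketch, not the library's code). The name
atoh_model and its bits parameter are hypothetical. Two points are assumptions
because the description does not settle them: that lower-case digits are
accepted, and that the error reported through carry is a value too large for
the result register.

```python
def atoh_model(s, start=0, bits=16):
    """Model ATOH2 (bits=16) or ATOLH2 (bits=32): (value, index, carry)."""
    i, value = start, 0
    # Assumption: both upper- and lower-case hex digits are accepted.
    while i < len(s) and s[i] in "0123456789abcdefABCDEF":
        value = value * 16 + int(s[i], 16)
        i += 1
    carry = value >= (1 << bits)     # assumption: carry reports overflow
    return value, i, carry


print(atoh_model("1F2Ezz"))             # (7982, 4, False)
print(atoh_model("12345"))              # (74565, 5, True)
```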
Routine: ATOLH (2)
------------------
Category: Conversion Routine
Registers on Entry: ES:DI- Points to string to convert
Registers on Return: DX:AX- Unsigned 32-bit integer converted from hex string
DI (ATOLH2)- First character beyond string of hex digits
Flags Affected: Carry = Error status
Example of Usage:
les DI, Str2Convrt
atolh ;Convert to value in DX:AX
Description: ATOLH converts a string of hexadecimal digits, pointed to by
ES:DI, into unsigned 32-bit numeric form. It returns the value in
the DX:AX register. If there is an error in conversion, the carry
flag will be set to one. If there is not an error, the carry flag
will be clear. ATOLH2 works the same except it leaves the DI
register pointing at the first non-hex digit.
Include: stdlib.a or conv.a
Routine: ATOI (2)
-------------------
Category: Conversion Routine
Register on entry: ES:DI- Points at string to convert.
Register on return: AX- Integer converted from string.
DI (ATOI2)- First character beyond string of digits.
Flags affected: Carry flag- Error status
Examples of Usage:
les DI, Str2Convrt
atoi ;Convert to value in AX
Description: Works just like ATOL except it translates the string to a
signed 16-bit integer rather than a 32-bit long integer.
Include: stdlib.a or conv.a
Routine: ITOA (2,M)
------------------
Category: Conversion Routine
Registers on Entry: AX- Signed 16-bit value to convert to a string
ES:DI- Pointer to buffer to hold result (ITOA/ITOA2
only).
Registers on Return: ES:DI- Pointer to string containing converted
characters (ITOA/ITOAM only).
ES:DI- Pointer to zero-terminating byte of converted
string (ITOA2 only).
Flags Affected: Carry flag is set on memory allocation error (ITOAM only)
Examples of Usage:
mov ax, -1234
ITOAM ;Convert to string.
puts ;Print it.
free ;Deallocate string.
mov di, seg buffer
mov es, di
lea di, buffer
mov ax, -1234
ITOA ;Leaves string in BUFFER.
mov di, seg buffer
mov es, di
lea di, buffer
mov ax, -1234
ITOA2 ;Leaves string in BUFFER and
;ES:DI pointing at end of string.
Description: These routines convert an integer value to a string of
characters which represent that integer. AX contains the
signed integer you wish to convert.
ITOAM automatically allocates storage on the heap for the
resulting string, you do not have to pre-allocate this
storage. ITOAM returns a pointer to the (zero-terminated)
string in the ES:DI registers. It ignores the values in
ES:DI on input.
ITOA requires that the caller allocate the storage for the
string (maximum you will need is seven bytes) and pass a
pointer to this buffer in ES:DI. ITOA returns with ES:DI
pointing at the beginning of the converted string.
ITOA2 also requires that you pass in the address of a buffer
in the ES:DI register pair. However, it returns with ES:DI
pointing at the zero-terminating byte of the string. This
lets you easily build up longer strings via multiple calls
to routines like ITOA2.
Include: stdlib.a or conv.a
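The point of ITOA2 returning a pointer to the terminating zero is that the
next call can start writing exactly there. The Python sketch below models that
chaining with a bytearray standing in for the buffer and an integer offset
standing in for DI. The function name itoa2_model is hypothetical and the code
is an illustration, not the library's implementation.

```python
def itoa2_model(buf, di, ax):
    """Write signed 16-bit ax as decimal text at buf[di:], add a zero byte,
    and return the offset of that zero byte (what ITOA2 leaves in DI)."""
    ax &= 0xFFFF
    value = ax - 0x10000 if ax & 0x8000 else ax   # reinterpret as signed
    text = str(value).encode("ascii")
    buf[di:di + len(text)] = text
    buf[di + len(text)] = 0
    return di + len(text)


buf = bytearray(16)
di = itoa2_model(buf, 0, -1234)
buf[di] = ord(",")                    # overwrite the terminator with a comma
di = itoa2_model(buf, di + 1, 567)    # continue right after the comma
print(buf[:di].decode("ascii"))       # -1234,567
```

The worst case, -32768, needs six characters plus the zero byte, which is
where the seven-byte figure above comes from.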
Routine: UTOA (2,M)
---------------------
Category: Conversion Routine
Registers on entry: AX - unsigned 16-bit integer to convert to a string
ES:DI- Pointer to buffer to hold result (UTOA/UTOA2
only).
Registers on Return: ES:DI- Pointer to string containing converted
characters (UTOA/UTOAM only).
ES:DI- Pointer to zero-terminating byte of converted
string (UTOA2 only).
Flags affected: Carry set denotes malloc error (UTOAM only)
Example of Usage:
mov ax, 65000
utoam ;Convert to string on the heap.
puts ;Print it.
free ;Deallocate string.
mov di, seg buffer
mov es, di
lea di, buffer
mov ax, 65000
UTOA ;Leaves string in BUFFER.
mov di, seg buffer
mov es, di
lea di, buffer
mov ax, 65000
UTOA2 ;Leaves string in BUFFER and
;ES:DI pointing at end of string.
Description: UTOAx converts a 16-bit unsigned integer value in AX to a
string of characters which represents that value. UTOA,
UTOA2, and UTOAM behave in a manner analogous to ITOAx. See
the description of those routines for more details.
Include: stdlib.a or conv.a
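ITOAx and UTOAx receive the same sixteen bits in AX and differ only in how
they interpret them. The following Python sketch (hypothetical helper names,
not library code) shows the two readings of one bit pattern.

```python
def itoa_model(ax):
    """Text ITOA would produce: AX read as a signed 16-bit value."""
    ax &= 0xFFFF
    return str(ax - 0x10000 if ax & 0x8000 else ax)


def utoa_model(ax):
    """Text UTOA would produce: AX read as an unsigned 16-bit value."""
    return str(ax & 0xFFFF)


ax = -1234 & 0xFFFF             # the bit pattern 0FB2Eh
print(itoa_model(ax))           # -1234
print(utoa_model(ax))           # 64302
```

So loading AX with a negative constant and then calling UTOA prints a large
positive number, not the negative value.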
Routine: HTOA (2,M)
---------------------
Category: Conversion Routine
Registers on entry: AL - 8-bit integer to convert to a string
ES:DI- Pointer to buffer to hold result (HTOA/HTOA2
only).
Registers on Return: ES:DI- Pointer to string containing converted
characters (HTOA/HTOAM only).
ES:DI- Pointer to zero-terminating byte of converted
string (HTOA2 only).
Flags affected: Carry set denotes memory allocation error (HTOAM only)
Description: The HTOAx routines convert an 8-bit value in AL to the two-
character hexadecimal representation of that byte. Other
than that, they behave just like ITOAx/UTOAx. Note that
the resulting buffer must have at least three bytes for
HTOA/HTOA2.
Include: stdlib.a or conv.a
Routine: WTOA (2,M)
--------------------
Category: Conversion Routine
Registers on Entry: AX- 16-bit value to convert to a string
ES:DI- Pointer to buffer to hold result (WTOA/WTOA2
only).
Registers on Return: ES:DI- Pointer to string containing converted
characters (WTOA/WTOAM only).
ES:DI- Pointer to zero-terminating byte of converted
string (WTOA2 only).
Flags Affected: Carry set denotes memory allocation error (WTOAM only)
Example of Usage:
Like ITOA above (substitute WTOA, WTOA2, or WTOAM)
Description: WTOAx converts the 16-bit value in AX to a string of four
hexadecimal digits. It behaves exactly like HTOAx except
it outputs four characters (and requires a five byte buffer).
Include: stdlib.a or conv.a
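The buffer sizes quoted for HTOA (three bytes) and WTOA (five bytes) are the
digit count plus one terminating zero. A Python sketch of the fixed-width
output makes that visible. The helper name to_hex_model is hypothetical, and
the use of upper-case letters for A..F is an assumption.

```python
def to_hex_model(value, digits):
    """Return exactly `digits` hex characters followed by a zero byte."""
    value &= (1 << (4 * digits)) - 1          # keep only the low bits
    text = format(value, "0{}X".format(digits))
    return text.encode("ascii") + b"\x00"


print(to_hex_model(0x5A, 2))    # b'5A\x00'    HTOA: 2 digits + terminator
print(to_hex_model(0x1F, 4))    # b'001F\x00'  WTOA: 4 digits + terminator
```

Leading zeros are always present, so the buffer requirement does not depend on
the value being converted.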
Routine: LTOA (2,M)
--------------------
Category: Conversion Routine
Registers on entry: DX:AX (contains a signed 32 bit integer)
ES:DI- Pointer to buffer to hold result (LTOA/LTOA2
only).
Registers on Return: ES:DI- Pointer to string containing converted
characters (LTOA/LTOAM only).
ES:DI- Pointer to zero-terminating byte of converted
string (LTOA2 only).
Flags affected: Carry set if memory allocation error (LTOAM only)
Example of Usage:
mov di, seg buffer ;Get address of storage
mov es, di ; buffer.
lea di, buffer
mov ax, word ptr value
mov dx, word ptr value+2
ltoa
Description: LtoA converts the 32-bit signed integer in DX:AX to a string
of characters. LTOA stores the string at the address specified
in ES:DI (there must be at least twelve bytes available at
this address) and returns with ES:DI pointing at this buffer.
LTOA2 works the same way, except it returns with ES:DI
pointing at the zero terminating byte. LTOAM allocates
storage for the string on the heap and returns a pointer
to the string in ES:DI.
Include: stdlib.a or conv.a
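The twelve-byte requirement follows from the longest value LTOA can print. The
short Python check below derives the worst-case buffer size for each decimal
converter. The figures for ITOA and LTOA match the ones quoted in this
document; those for UTOA and ULTOA are derived the same way and are not stated
in the original text.

```python
def worst_case_bytes(lo, hi):
    """Longest decimal text over [lo, hi], plus one byte for the terminator."""
    return max(len(str(lo)), len(str(hi))) + 1


sizes = {
    "ITOA": worst_case_bytes(-2**15, 2**15 - 1),    # "-32768"
    "UTOA": worst_case_bytes(0, 2**16 - 1),         # "65535"
    "LTOA": worst_case_bytes(-2**31, 2**31 - 1),    # "-2147483648"
    "ULTOA": worst_case_bytes(0, 2**32 - 1),        # "4294967295"
}
print(sizes)   # {'ITOA': 7, 'UTOA': 6, 'LTOA': 12, 'ULTOA': 11}
```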
Routine: ULTOA (2,M)
---------------------
Category: Conversion Routine
Registers on Entry: DX:AX- Unsigned 32-bit value to convert to a string
ES:DI- Pointer to buffer to hold result (ULTOA/ULTOA2
only).
Registers on Return: ES:DI- Pointer to string containing converted
characters (ULTOA/ULTOAM only).
ES:DI- Pointer to zero-terminating byte of converted
string (ULTOA2 only).
Flags Affected: Carry is set if malloc error (ULTOAM only)
Example of Usage:
Like LTOA
Description: Like LTOA except this routine handles unsigned integer values.
Include: stdlib.a or conv.a
Routine: SPrintf (2,M)
-----------------------
Category: Conversion Routine
In-Memory Formatting Routine
Registers on entry: CS:RET - Pointer to format string and operands of the
sprintf routine
ES:DI- Address of buffer to hold output string
(sprintf/sprintf2 only)
Register on return: ES:DI register - pointer to a string containing
output data (sprintf/sprintfm only).
Pointer to zero-terminating byte at the
end of the converted string (sprintf2
only).
Flags affected: Carry is set if memory allocation error (sprintfm only).
Example of Usage:
sprintfm
db "I=%i, U=%u, S=%s",13,10,0
dd i,u,s
puts
free
Description: SPrintf is an in-memory formatting routine. It is similar to
C's sprintf routine.
The programmer selects the maximum length of the output string.
SPrintf works in a manner quite similar to printf, except sprintf
writes its output to a string variable rather than to the stdlib
standard output.
SPrintfm, by default, allocates 2048 characters for the string
and then deallocates any unnecessary storage. An external
variable, sp_MaxBuf, holds the number of bytes to allocate upon
entry into sprintfm. If you wish to allocate more or less than
2048 bytes when calling sprintfm, simply change the value of this
public variable (type is word). Sprintfm calls malloc to
allocate the storage dynamically. You should call free to
return this buffer to the heap when you are through with it.
Sprintf and Sprintf2 expect you to pass the address of a buffer
to them. You are responsible for supplying a sufficiently
sized buffer to hold the result.
Include: stdlib.a or conv.a
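The example above shows only SPrintfm. As a sketch of the buffer-passing
form, the following assumes i and u are word variables, that the operand
list holds their far addresses (as in the SScanf example below), and that
the 64 bytes reserved for the hypothetical buffer "OutBuf" are enough for
the formatted text. Nothing checks the buffer size for you.

OutBuf          db      64 dup (?)      ;Caller-supplied buffer (size is a guess).
                 .
                 .
                 .
                lesi    OutBuf          ;ES:DI points at the buffer.
                sprintf
                db      "I=%i, U=%u",13,10,0
                dd      i,u             ;Addresses of the operands.
                puts                    ;ES:DI still points at OutBuf.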
Routine: SScanf
----------------
Category: Conversion Routine
Formatted In-Memory Conversion Routine
Registers on Entry: ES:DI - points at string containing values to convert
Registers on return: None
Flags affected: None
Example of Usage:
; this code reads the values for i, j, and s from the characters
; starting at memory location Buffer.
lesi Buffer
SScanf
db "%i %i %s",0
dd i, j, s
Description: SScanf provides formatted input in a fashion analogous to scanf.
The difference is that scanf reads in a line of text from the
stdlib standard input whereas you pass the address of a sequence
of characters to SScanf in es:di.
Include: stdlib.a or conv.a
Routine: ToLower
-----------------
Category: Conversion Routine
Register on entry: AL- Character to (possibly) convert
to lower case.
Register on return: AL- Converted character.
Flags affected: None
Example of usage:
mov al, char
ToLower
Description: ToLower checks the character in the AL register. If it is an upper
case letter, ToLower converts it to lower case. If it is anything else,
ToLower leaves the value in AL unchanged. For high performance
this routine is implemented as a macro rather than as a
procedure call. This routine is so short you would spend more
time actually calling the routine than executing the code inside.
However, the code is definitely longer than a (far) procedure
call, so if space is critical and you're invoking this code
several times, you may want to convert it to a procedure call to
save a little space.
Include: stdlib.a or conv.a
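If you do want to trade speed for space as described above, one possible
approach is to expand the macro once inside a procedure and call that
procedure everywhere else. "LowerProc" is a hypothetical name; it is not
part of the library:

LowerProc       proc    far
                ToLower                 ;The macro expands here, once.
                ret
LowerProc       endp
                 .
                 .
                 .
                mov     al, char
                call    LowerProc       ;AL comes back converted.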
Routine: ToUpper
------------------
Category: Conversion Routine
Registers on Entry: AL- Character to (possibly) convert to upper case
Registers on Return: AL- Converted character
Flags Affected: None
Example of Usage:
mov al, char
ToUpper
Description: ToUpper checks the character in the AL register. If it is a lower
case letter, ToUpper converts it to upper case. If it is anything else,
ToUpper leaves the value in AL unchanged. For high performance
this routine is implemented as a macro rather than as a
procedure call (see ToLower, above).
Include: stdlib.a or conv.a
=====================
Date & Time Routines
=====================
These routines convert DOS system times and dates to/from ASCII strings.
They appear in this section rather than conversions because we eventually
intend to add date and time arithmetic to the package.
Note the time to string conversion routines do not output the hundredths of
a second. Most applications do not need (or want) this. If you want
hundredths of a second you can easily write a routine (using this code) or
modify the existing code to suit your purposes.
Routine: DTOA (2,m)
--------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
CX- Current year (in the range 1980-2099)
DL- Current day
DH- Current month
ES:DI- Points at buffer with room for at least
nine bytes (DTOA/DTOA2 only).
Registers on Return: ES:DI- Points at a string of the form MM/DD/YY
in a buffer allocated on the heap (DTOAM
only).
ES:DI- Points at the zero terminating byte at the
end of the string (DTOA2 only).
Flags Affected: carry- Set if memory allocation error (DTOAM only).
Example of Usage:
mov ah, 2ah ;Call DOS to get the system
int 21h ; date (also see xDTOA)
lesi TodaysDate ;Buffer to store string.
DTOA ;Convert date to string.
mov ah, 2ah
int 21h
lesi TodaysDate2
DTOA2
mov ah, 2ah
int 21h
DTOAM ;ES:DI is allocated on heap.
Description:
DTOA converts a DOS system date (in CX/DX) to an ASCII string and deposits
the characters into the buffer specified by ES:DI on input. This buffer must
be at least nine bytes long (eight bytes for mm/dd/yy plus the zero terminating
byte).
DTOA2 converts a DOS system date to an ASCII string just like DTOA above.
The only difference is that it does not preserve DI. It leaves DI pointing
at the zero terminating byte at the end of the string. This routine is useful
for building up long strings with a date somewhere in the middle.
DTOAM works like DTOA except you do not pass the pointer to a buffer in ES:DI.
Instead, DTOAM allocates nine bytes for the string on the heap. It returns
a pointer to this new string in ES:DI.
Include: stdlib.a or date.a
Routine: xDTOA (2,m)
---------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
ES:DI- Points at buffer with room for at least
nine bytes (xDTOA/xDTOA2 only).
Registers on Return: ES:DI- Points at a string of the form MM/DD/YY
in a buffer allocated on the heap (xDTOAM
only).
ES:DI- Points at the zero terminating byte at the
end of the string (xDTOA2 only).
Flags Affected: carry- Set if memory allocation error (xDTOAM only).
Example of Usage:
lesi TodaysDate ;Buffer to store string.
xDTOA ;Convert date to string.
lesi TodaysDate2
xDTOA2
xDTOAM ;ES:DI is allocated on heap.
Description:
These routines work just like DTOA, DTOA2, and DTOAM except you do not pass
the date to them. They call DOS to read the current system date and
convert that to a string.
Include: stdlib.a or date.a
Routine: LDTOA (2,m)
---------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
CX- Current year (in the range 1980-2099)
DL- Current day
DH- Current month
ES:DI- Points at buffer with room for at least
thirteen bytes (LDTOA/LDTOA2 only).
Registers on Return: ES:DI- Points at a string of the form "mmm dd, yyyy"
in a buffer allocated on the heap (LDTOAM
only).
ES:DI- Points at the zero terminating byte at the
end of the string (LDTOA2 only).
Flags Affected: carry- Set if memory allocation error (LDTOAM only).
Example of Usage:
mov ah, 2ah ;Call DOS to get the system
int 21h ; date (also see xLDTOA)
lesi TodaysDate ;Buffer to store string.
LDTOA ;Convert date to string.
mov ah, 2ah
int 21h
lesi TodaysDate2
LDTOA2
mov ah, 2ah
int 21h
LDTOAM ;ES:DI is allocated on heap.
Description:
These routines work just like the DTOA, DTOA2, and DTOAM routines except they
output their date in the form "mmm dd, yyyy", e.g., Jan 1, 1980.
Include: stdlib.a or date.a
Routine: xLDTOA (2,m)
---------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
ES:DI- Points at buffer with room for at least
thirteen bytes (xLDTOA/xLDTOA2 only).
Registers on Return: ES:DI- Points at a string of the form MMM DD, YYYY
in a buffer allocated on the heap (xLDTOAM
only).
ES:DI- Points at the zero terminating byte at the
end of the string (xLDTOA2 only).
Flags Affected: carry- Set if memory allocation error (xLDTOAM only).
Example of Usage:
lesi TodaysDate ;Buffer to store string.
xLDTOA ;Convert date to string.
lesi TodaysDate2
xLDTOA2
xLDTOAM ;ES:DI is allocated on heap.
Description:
Similar to xDTOA, xDTOA2, and xDTOAM except these routines produce strings of
the form "MMM DD, YYYY".
Include: stdlib.a or date.a
Routine: ATOD (2)
------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
ES:DI- Points at string containing date to convert.
Registers on Return: CX- Year (1980-2099)
DH- Month (1-12)
DL- Day (1-31)
ES:DI- Points at first non-date character (ATOD2 only)
Flags Affected: carry- Set if bad date format.
Example of Usage:
lesi TodaysDate ;Buffer containing string.
ATOD ;Convert string to date.
jc Error
lesi TodaysDate ;Buffer containing string.
ATOD2 ;Convert string to date.
jc Error
Description:
ATOD converts an ASCII string of the form "mm/dd/yy" or "mm-dd-yy" to a DOS
format date. It returns the carry flag set if there is a parse error (that
is, the string is not in one of these two forms) or if the month, day, or
year values are out of range (including specifying Feb 29th on non-leap years).
ATOD2 works just like ATOD except it does not preserve DI. It leaves DI
pointing at the first non-date character encountered in the string.
Include: stdlib.a or date.a
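Because ATOD returns the date in the same registers that DTOA and LDTOA
expect on entry (CX, DH, and DL), you can chain the two to reformat a date.
A minimal sketch, assuming "ShortDate" and "LongDate" are hypothetical
buffers in your program, "BadDate" is your own error label, LDTOA (like
DTOA) preserves ES:DI, and PUTS prints the string at ES:DI:

ShortDate       db      "12/25/91",0
LongDate        db      13 dup (?)      ;"mmm dd, yyyy" plus the zero byte.
                 .
                 .
                 .
                lesi    ShortDate
                ATOD                    ;CX=year, DH=month, DL=day.
                jc      BadDate         ;Parse error or value out of range.
                lesi    LongDate
                LDTOA                   ;Build the long form of the date.
                puts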
Routine: ATOT (2)
------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
ES:DI- Points at string containing time to convert.
Registers on Return: CH- Hour (0..23)
CL- Minutes (0..59)
DH- Seconds (0..59)
DL- Seconds/100 (0..99)
ES:DI- Points at first character which is not a part
of the parsed time (ATOT2 only).
Flags Affected: carry- Set if bad time format.
Example of Usage:
lesi CurrentTime ;Buffer containing string.
ATOT ;Convert string to time.
jc Error
lesi CurrentTime ;Buffer containing string.
ATOT2 ;Convert string to time.
jc Error
Description:
ATOT converts an ASCII string of the form "hh:mm:ss" or "hh:mm:ss.xxx" to a DOS
format time. It returns the carry flag set if there is a parse error (that
is, the string is not in one of these two forms) or if the hours, minutes,
seconds, or hundredth values are out of range. If the string does not contain
1/100ths of a second, this routine returns zero in DL.
ATOT2 works just like ATOT except it does not preserve DI. It leaves DI
pointing at the first character beyond the time characters.
Include: stdlib.a or time.a
Routine: TTOA (2,m)
--------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
CH- Hour (0..23)
CL- Minutes (0..59)
DH- Seconds (0..59)
DL- 1/100 seconds (0..99)
ES:DI- Points at buffer with room for at least
nine bytes (TTOA/TTOA2 only).
Registers on Return: ES:DI- Points at a string of the form hh:mm:ss
in a buffer allocated on the heap (TTOAM
only).
ES:DI- Points at the zero terminating byte at the
end of the string (TTOA2 only).
Flags Affected: carry- Set if memory allocation error (TTOAM only).
Example of Usage:
mov ah, 2ch ;Call DOS to get the system
int 21h ; time (also see xTTOA)
lesi CurrentTime ;Buffer to store string.
TTOA ;Convert Time to string.
mov ah, 2ch
int 21h
lesi CurTime2
TTOA2
mov ah, 2ch
int 21h
TTOAM ;ES:DI is allocated on heap.
Description:
TTOA converts the DOS system time in CX/DX to a string and stores the string
at the location specified by ES:DI. ES:DI must point at a buffer with at
least nine characters in it (for a string of the form hh:mm:ss followed by
a zero terminating byte).
TTOA2 works like TTOA except it does not preserve DI. It leaves DI pointing
at the zero terminating byte in the string. This is useful for generating
long strings in memory of which TTOA is one component.
TTOAM is like TTOA except it automatically allocates storage for the string
on the heap.
Include: stdlib.a or time.a
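The "2" variants are meant for building longer strings. The sketch below
assembles a time stamp of the form "mm/dd/yy hh:mm:ss" by calling DTOA2
and then TTOA. "Stamp" is a hypothetical buffer, and the code assumes the
DOS date and time calls leave ES and DI unchanged:

Stamp           db      18 dup (?)      ;8 + 1 + 8 characters plus zero byte.
                 .
                 .
                 .
                mov     ah, 2ah         ;DOS: get the system date.
                int     21h
                lesi    Stamp
                DTOA2                   ;ES:DI points at the zero byte.
                mov     byte ptr es:[di], ' '   ;Overwrite it with a space.
                inc     di
                mov     ah, 2ch         ;DOS: get the system time.
                int     21h
                TTOA                    ;Append hh:mm:ss and a zero byte.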
Routine: xTTOA (2,m)
---------------------
Category: Date/Time Routines
Author: Randall Hyde
Registers on Entry:
ES:DI- Points at buffer with room for at least
nine bytes (xTTOA/xTTOA2 only).
Registers on Return: ES:DI- Points at a string of the form HH:MM:SS
in a buffer allocated on the heap (xTTOAM
only).
ES:DI- Points at the zero terminating byte at the
end of the string (xTTOA2 only).
Flags Affected: carry- Set if memory allocation error (xTTOAM only).
Example of Usage:
lesi CurrentTime ;Buffer to store string.
xTTOA ;Convert time to string.
lesi CurTime2
xTTOA2
xTTOAM ;ES:DI is allocated on heap.
Description:
These routines work just like TTOA, TTOA2, and TTOAM except you do not pass
the time to them. They call DOS to read the current system time and
convert that to a string.
Include: stdlib.a or time.a
File I/O Routines
-----------------
Although MS-DOS provides some fairly decent file I/O facilities, the MS-DOS
file routines are all block oriented. That is, there is no simple routine
you can call to read a single character from a file (the most common case).
Although you can create a buffer consisting of a single byte and call MS-DOS
to read a single character into that buffer, this is very slow. The standard
library file I/O routines provide a set of buffered I/O routines for
sequentially accessed files. These routines are suitable only for files which you
sequentially read or sequentially write. They do not work with random access
files nor can you alternately read and write to the file. However, most file
accesses fall into the category of sequential read or write so the Standard
Library routines will work fine in most cases. In other cases, MS-DOS
provides a reasonable API so there really isn't a need for augmentation in
the Standard Library.
The Standard Library provides routines to OPEN a file (for reading or writing
only), CREATE a new file and open it for writing, CLOSE a file, FLUSH the file
buffers associated with a file, GET a character from a file, READ a block of
bytes from a file, PUT a single character to a file, or write a block of chars
to a file.
Note that you can use the standard I/O redirection operations to redirect the
standard input and output to routines which read and write bytes through a
file. Consider the following short routine:
Redir2File proc far
push ds
push es
push di
mov di, seg MyFileVar
mov ds, di
les di, ds:MyFileVar
fputc
pop di
pop es
pop ds
ret
Redir2File endp
This routine, when called, writes the character in AL to the file specified
by the file variable "MyFileVar" (see an explanation of the FPUTC routine
for more details). You can selectively redirect all of the standard output
routines through this procedure (hence sending all standard output to the
file) using the Standard Library SetOutAdrs routine:
lesi Redir2File
SetOutAdrs
<use print, printf, puts, puti, etc. here, all output
goes to the file rather than to the screen.>
lesi PutcStdOut ;Default DOS output
SetOutAdrs
<Now all output goes back to the DOS standard output>
You can also preserve the previous output address using the code:
lesi Redir2File
PushOutAdrs
<use print, printf, puts, puti, etc. here, all output
goes to the file rather than to the screen.>
PopOutAdrs
<Now all output goes back to the previous handler.>
You can do the same thing with the standard input routines when redirecting
input from a file, though this is less useful.
All file I/O routines in the library use a "File Variable" to keep track of
the specified file. *THIS IS NOT THE SAME THING AS A DOS FILE HANDLE!*
"FileVar" is a structure defined in the "file.a" include file. For each file
you open/close, you must create a unique file variable.
Routine: FOPEN
---------------
Category: File I/O
Registers on Entry: AX contains file open mode
(0=open for read, 1=open for write)
ES:DI points at a file variable.
DX:SI points at a file name.
Registers on return: Carry is set/clear for error/no error.
AX contains (DOS) error code if carry is set.
Flags affected:
Carry denotes error.
Example of Usage:
MyFileVar FileVar <>
MyFileName db "file.nam",0
.
.
.
mov ax, 0 ;Open for reading
lesi MyFileVar ;Ptr to file variable.
ldxi MyFileName ;Ptr to file name.
fopen
jc Error
Description:
fopen opens a sequential file for reading or writing. It calls DOS to
actually open the file and then sets up appropriate internal variables (in
the FileVar variable) to provide efficient blocked I/O.
Include: stdlib.a or file.a
Routine: FCREATE
-----------------
Category: File I/O
Registers on Entry:
ES:DI points at a file variable.
DX:SI points at a file name.
Registers on return: Carry is set/clear for error/no error.
AX contains (DOS) error code if carry is set.
Flags affected:
Carry denotes error.
Example of Usage:
MyFileVar FileVar <>
MyFileName db "file.nam",0
.
.
.
lesi MyFileVar ;Ptr to file variable.
ldxi MyFileName ;Ptr to file name.
fcreate
jc Error
Description:
fcreate opens a new file for writing. If the file already exists, fcreate
will delete it and create a new one. Other than this, the behavior is
quite similar to fopen.
Include: stdlib.a or file.a
Routine: FCLOSE
----------------
Category: File I/O
Registers on Entry:
ES:DI points at a file variable.
Registers on return: Carry is set/clear for error/no error.
AX contains (DOS) error code if carry is set.
Flags affected:
Carry denotes error.
Example of Usage:
MyFileVar FileVar <>
.
.
.
lesi MyFileVar ;Ptr to file variable.
fclose
jc Error
Description:
fclose closes a file opened by fcreate or fopen. Note that you *must* use
this call to close the file (rather than using DOS' close call). There may
be "hot" data present in internal buffers. This call flushes such data to
the file.
Note that you must make this call before quitting your application. DOS will
automatically close all files upon quitting, but DOS will not automatically
flush any hot data to disk upon program termination.
Include: stdlib.a or file.a
Routine: FFLUSH
----------------
Category: File I/O
Registers on Entry:
ES:DI points at a file variable.
Registers on return: Carry is set/clear for error/no error.
AX contains (DOS) error code if carry is set.
Flags affected:
Carry denotes error.
Example of Usage:
Ptr2FileVar dd MyFileVar
.
.
.
les di, Ptr2FileVar ;Ptr to file variable.
fflush
jc Error
Description:
fflush will write any "hot" data (data written to the file by an application
which is currently sitting in internal buffers) to the file. It is a good
idea to occasionally flush files to disk if you do not write the data to
the file all at once. This helps prevent loss of data in the event of an
abnormal termination.
Include: stdlib.a or file.a
Routine: FGETC
---------------
Category: File I/O
Registers on Entry:
ES:DI points at a file variable.
Registers on return: AL contains byte read (if no error, C=0).
AX contains (DOS) error code if carry is set.
Flags affected:
Carry denotes error.
Example of Usage:
Ptr2FileVar dd MyFileVar
.
.
.
les di, Ptr2FileVar ;Ptr to file variable.
fgetc
jc Error
<AL contains byte read at this point>
Description:
fgetc reads a single byte from a file opened for reading. On EOF the carry
flag will be set and AX will contain zero.
Include: stdlib.a or file.a
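Putting FOPEN, FGETC, and FCLOSE together, the following sketch copies a
file to the standard output one character at a time. It assumes the
library's PUTC routine prints the character in AL, and "OpenError" and
"ReadError" are hypothetical labels in your own program. The code reloads
ES:DI before each file call because this document does not say whether
FGETC preserves it.

MyFileVar       FileVar <>
MyFileName      db      "file.nam",0
                 .
                 .
                 .
                mov     ax, 0           ;Open for reading.
                lesi    MyFileVar
                ldxi    MyFileName
                fopen
                jc      OpenError
ReadLoop:       lesi    MyFileVar
                fgetc
                jc      ReadDone        ;EOF or error.
                putc                    ;Assumed: prints the char in AL.
                jmp     ReadLoop
ReadDone:       cmp     ax, 0           ;AX=0 means EOF, anything else
                jne     ReadError       ; is a DOS error code.
                lesi    MyFileVar
                fclose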
Routine: FREAD
---------------
Category: File I/O
Registers on Entry:
ES:DI points at a file variable.
DX:SI points at the destination block.
CX contains the number of bytes to read.
Registers on return: AX contains actual # of bytes read (if no error, C=0).
AX contains (DOS) error code if carry is set (AX=0
denotes EOF).
Flags affected:
Carry denotes error.
Example of Usage:
MyFileVar FileVar <>
MyBlock db 256 dup (?)
.
.
.
lesi MyFileVar ;Ptr to file variable.
ldxi MyBlock ;Place to put data.
mov cx, 256 ;# of bytes to read.
fread
jc Error
Description:
fread lets you read a block of bytes from a file opened for reading. This
call is generally *much* faster than reading a string of single bytes if you
want to read a large number of bytes at one time.
Include: stdlib.a or file.a
Routine: FPUTC
---------------
Category: File I/O
Registers on Entry:
ES:DI points at a file variable.
AL contains the character to write to the file.
Registers on return:
AX contains (DOS) error code if carry is set.
Flags affected:
Carry denotes error.
Example of Usage:
Ptr2FileVar dd MyFileVar
.
.
.
les di, Ptr2FileVar ;Ptr to file variable.
mov al, Char2Write
fputc
jc Error
Description:
fputc writes a single byte to a file opened for writing (or opened via the
fcreate call). It writes the byte in AL to the output file. Note that data
written via this call may not be written directly to the file. For performance
reasons the fputc routine buffers up the data in memory and writes large blocks
of data to the file. If you need to ensure that the data is properly written
to the file you will need to make a call to fclose or fflush.
Include: stdlib.a or file.a
Routine: FWRITE
----------------
Category: File I/O
Registers on Entry:
ES:DI points at a file variable.
DX:SI points at the source block.
CX contains the number of bytes to write.
Registers on return: AX contains actual # of bytes written (if no error).
AX contains (DOS) error code if carry is set (AX=0
denotes EOF).
Flags affected:
Carry denotes error.
Example of Usage:
MyFileVar FileVar <>
MyBlock db 256 dup (?)
.
.
.
lesi MyFileVar ;Ptr to file variable.
ldxi MyBlock ;Data to write.
mov cx, 256 ;# of bytes to write.
fwrite
jc Error
Description:
fwrite lets you write a block of bytes to a file opened for writing. This
call is generally *much* faster than writing a string of single bytes if you
want to write a large number of bytes at one time. Note that fwrite, like
fputc, buffers up data before writing it to disk. If you need to commit
data to the disk surface at some point, you must call the fflush or fclose
routines.
Include: stdlib.a or file.a
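A minimal sketch of the write side, combining FCREATE, FWRITE, and FCLOSE.
The names "OutFile", "OutName", "Msg", "MsgLen", and "Error" are
hypothetical; the file name must end with a zero byte because it is handed
to DOS.

OutFile         FileVar <>
OutName         db      "out.txt",0
Msg             db      "Hello, file",13,10
MsgLen          =       $-Msg           ;Length of Msg in bytes.
                 .
                 .
                 .
                lesi    OutFile
                ldxi    OutName
                fcreate                 ;Create the file, open for writing.
                jc      Error
                lesi    OutFile
                ldxi    Msg             ;Source block.
                mov     cx, MsgLen      ;# of bytes to write.
                fwrite
                jc      Error
                lesi    OutFile
                fclose                  ;Flushes buffered data to the file.
                jc      Error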
Floating Point Routines
-----------------------
The floating point routines provide a basic floating point package for
80x86 assembly language users. The floating point package deals with
four different floating point formats: IEEE 32-bit, 64-bit, and 80-bit
formats, and an internal 81-bit format. The external formats mostly
support the IEEE standard except for certain esoteric values such as
denormalized numbers, NaNs, infinities, and other such cases.
The package provides two "pseudo-registers", a floating point accumulator
and a floating point operand. It provides routines to load and store these
pseudo-registers from memory operands (using the various formats) and then
all other operations apply to these two operands. All computations use the
internal 81-bit floating point format. The package automatically converts
between the internal format and the external format when loading and storing
values.
Do not write code which assumes the internal format is 81 bits. This format
will change in the near future when I get a chance to add guard bits to
all the computations. If your code assumes 81 bits, it will break at that
point. Besides, there is no reason your code should count on the size of
the internal operations anyway. Stick with the IEEE formats and you'll
be much better off (since your code can be easily upgraded to deal with
numeric coprocessors).
WARNING: These routines have not been sufficiently tested as of 10/10/91.
Use them with care. Report any problems with these routines to Randy Hyde
via the electronic addresses provided in this document or by sending a
written report to UC Riverside. As I get more time, I will further test
these routines and add additional functions to the package.
*** Randy Hyde
Routine: lsfpa
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at a single precision (32-bit) value to load
Registers on return: None
Flags affected: None
Example of Usage:
les di, FPValue
lsfpa
Description: LSFPA loads a single precision floating point value into the
internal floating point accumulator. It also converts the
32-bit format to the internal 81-bit format used by the
floating point package.
Include: stdlib.a or fp.a
Routine: ssfpa
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at a single precision (32-bit) value where
this routine should store the floating point acc.
Registers on return: None
Flags affected: Carry set if conversion error.
Example of Usage:
les di, FPValue
ssfpa
Description: SSFPA stores the floating point accumulator into a single
precision variable in memory (pointed at by ES:DI). It
converts the value from the 81-bit format to the 32-bit
value before storing the result. The 64-bit mantissa used
by the FP package is rounded to 24 bits during the store.
The exponent could be out of range. If this occurs, SSFPA
returns with the carry flag set.
Include: stdlib.a or fp.a
Routine: ldfpa
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at a double precision (64-bit) value to load
Registers on return: None
Flags affected: None
Example of Usage:
les di, FPValue
ldfpa
Description: LDFPA loads a double precision floating point value into the
internal floating point accumulator. It also converts the
64-bit format to the internal 81-bit format used by the
floating point package.
Include: stdlib.a or fp.a
Routine: sdfpa
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at a double precision (64-bit) value where
this routine should store the floating point acc.
Registers on return: None
Flags affected: Carry set if conversion error.
Example of Usage:
les di, FPValue
sdfpa
Description: SDFPA stores the floating point accumulator into a double
precision variable in memory (pointed at by ES:DI). It
converts the value from the 81-bit format to the 64-bit
value before storing the result. The 64-bit mantissa used
by the FP package is rounded to 53 bits during the store.
The exponent could be out of range. If this occurs, SDFPA
returns with the carry flag set.
Include: stdlib.a or fp.a
Routine: lefpa
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at an extended precision (80-bit) value to
load
Registers on return: None
Flags affected: None
Example of Usage:
les di, FPValue
lefpa
Description: LEFPA loads an extended precision floating point value into
the internal floating point accumulator. It also converts the
80-bit format to the internal 81-bit format used by the
floating point package.
Include: stdlib.a or fp.a
Routine: lefpal
----------------
Category: Floating point Routine
Registers on entry: CS:RET points at an extended precision (80-bit) value to
load
Registers on return: None
Flags affected: None
Example of Usage:
lefpal
dt 1.345e-3
Description: LEFPAL loads an extended precision floating point value into
the internal floating point accumulator. It also converts the
80-bit format to the internal 81-bit format used by the
floating point package.
Unlike LEFPA, LEFPAL gets its operand directly from the code
stream. You must follow the call to lefpal with a ten-byte
(80-bit) floating point constant.
Include: stdlib.a or fp.a
Routine: sefpa
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at an extended precision (80-bit) value
where this routine should store the floating point acc.
Registers on return: None
Flags affected: Carry set if conversion error.
Example of Usage:
les di, FPValue
sefpa
Description: SEFPA stores the floating point accumulator into an extended
precision variable in memory (pointed at by ES:DI). It
converts the value from the 81-bit format to the 80-bit
value before storing the result.
The exponent could be out of range. If this occurs, SEFPA
returns with the carry flag set.
Include: stdlib.a or fp.a
Routine: lsfpo
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at a single precision (32-bit) value to load
Registers on return: None
Flags affected: None
Example of Usage:
les di, FPValue
lsfpo
Description: LSFPO loads a single precision floating point value into the
internal floating point operand. It also converts the
32-bit format to the internal 81-bit format used by the
floating point package.
Include: stdlib.a or fp.a
Routine: ldfpo
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at a double precision (64-bit) value to load
Registers on return: None
Flags affected: None
Example of Usage:
les di, FPValue
ldfpo
Description: LDFPO loads a double precision floating point value into the
internal floating point operand. It also converts the
64-bit format to the internal 81-bit format used by the
floating point package.
Include: stdlib.a or fp.a
Routine: lefpo
---------------
Category: Floating point Routine
Registers on entry: ES:DI points at an extended precision (80-bit) value to
load
Registers on return: None
Flags affected: None
Example of Usage:
les di, FPValue
lefpo
Description: LEFPO loads an extended precision floating point value into
the internal floating point operand. It also converts the
80-bit format to the internal 81-bit format used by the
floating point package.
Include: stdlib.a or fp.a
Routine: lefpol
----------------
Category: Floating point Routine
Registers on entry: CS:RET points at an extended precision (80-bit) value to
load
Registers on return: None
Flags affected: None
Example of Usage:
lefpol
dt 1.345e-3
Description: LEFPOL loads an extended precision floating point value into
the internal floating point operand. It also converts the
80-bit format to the internal 81-bit format used by the
floating point package.
Unlike LEFPO, LEFPOL gets its operand directly from the code
stream. You must follow the call to lefpol with a ten-byte
(80-bit) floating point constant.
Include: stdlib.a or fp.a
Routine: itof
--------------
Category: Floating point Routine
Registers on entry: AX contains a signed integer value
Registers on return: None
Flags affected: None
Example of Usage:
mov ax, -1234
itof
Description: ITOF converts the 16-bit signed integer in AX to a floating
point value, storing the result in the floating point
accumulator.
Include: stdlib.a or fp.a
Routine: utof
--------------
Category: Floating point Routine
Registers on entry: AX contains an unsigned integer value
Registers on return: None
Flags affected: None
Example of Usage:
mov ax, 1234
utof
Description: UTOF converts the 16-bit unsigned integer in AX to a floating
point value, storing the result in the floating point
accumulator.
Include: stdlib.a or fp.a
Routine: ultof
---------------
Category: Floating point Routine
Registers on entry: DX:AX contains an unsigned 32-bit integer value
Registers on return: None
Flags affected: None
Example of Usage:
mov dx, word ptr val32+2
mov ax, word ptr val32
ultof
Description: ULTOF converts the 32-bit unsigned integer in DX:AX to a
floating point value, storing the result in the floating
point accumulator.
Include: stdlib.a or fp.a
Routine: ltof
--------------
Category: Floating point Routine
Registers on entry: DX:AX contains a signed 32-bit integer value
Registers on return: None
Flags affected: None
Example of Usage:
mov dx, word ptr val32+2
mov ax, word ptr val32
ltof
Description: LTOF converts the 32-bit signed integer in DX:AX to a
floating point value, storing the result in the floating
point accumulator.
Include: stdlib.a or fp.a
Routine: ftoi
--------------
Category: Floating point Routine
Registers on entry: None
Registers on return: AX contains 16-bit signed integer
Flags affected: Carry is set if conversion error occurs.
Example of Usage:
ftoi
puti ;Print AX as integer value
Description: FTOI converts the floating point accumulator value to a
16-bit signed integer and returns the result in AX. If
the floating point number will not fit in AX, FTOI returns
with the carry flag set.
Include: stdlib.a or fp.a
Routine: ftou
--------------
Category: Floating point Routine
Registers on entry: None
Registers on return: AX contains 16-bit unsigned integer
Flags affected: Carry is set if conversion error occurs.
Example of Usage:
ftou
putu ;Print AX as an unsigned value
Description: FTOU converts the floating point accumulator value to a
16-bit unsigned integer and returns the result in AX. If
the floating point number will not fit in AX, FTOU returns
with the carry flag set.
Include: stdlib.a or fp.a
Routine: ftol
--------------
Category: Floating point Routine
Registers on entry: None
Registers on return: DX:AX contains a 32-bit signed integer
Flags affected: Carry is set if conversion error occurs.
Example of Usage:
ftol
putl ;Print DX:AX as integer value
Description: FTOL converts the floating point accumulator value to a
32-bit signed integer and returns the result in DX:AX. If
the floating point number will not fit in DX:AX, FTOL returns
with the carry flag set.
Include: stdlib.a or fp.a
Routine: ftoul
---------------
Category: Floating point Routine
Registers on entry: None
Registers on return: DX:AX contains a 32-bit unsigned integer
Flags affected: Carry is set if conversion error occurs.
Example of Usage:
ftoul
putul ;Print DX:AX as an integer value
Description: FTOUL converts the floating point accumulator value to a
32-bit unsigned integer and returns the result in DX:AX. If
the floating point number will not fit in DX:AX, FTOUL returns
with the carry flag set.
Include: stdlib.a or fp.a
Routine: fpadd
---------------
Category: Floating point Routine
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
fpadd
Description: FPADD adds the floating point operand to the floating point
accumulator leaving the result in the floating point
accumulator.
Include: stdlib.a or fp.a
Routine: fpsub
---------------
Category: Floating point Routine
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
fpsub
Description: FPSUB subtracts the floating point operand from the floating
point accumulator leaving the result in the floating point
accumulator.
Include: stdlib.a or fp.a
Routine: fpcmp
---------------
Category: Floating point Routine
Registers on entry: None
Registers on return: AX contains result of comparison.
Flags affected: As appropriate for a comparison. You can use the
conditional branches to check the comparison after
calling this routine. Be sure to use the *signed*
conditional jumps (e.g., JG, JGE, etc.).
Example of Usage:
fpcmp
jge FPACCgeFPOP
Description: FPCMP compares the floating point accumulator to the
floating point operand and sets the flags according to the
result of the comparison. It also returns a value in AX
as follows:
AX Result
-1 FPACC < FPOP
0 FPACC = FPOP
1 FPACC > FPOP
Include: stdlib.a or fp.a
Routine: fpmul
--------------
Category: Floating point Routine
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
fpmul
Description: FPMUL multiplies the floating point accumulator by the floating
point operand and leaves the result in the floating point
accumulator.
Include: stdlib.a or fp.a
Routine: fpdiv
---------------
Category: Floating point Routine
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
fpdiv
Description: FPDIV divides the floating point accumulator by the floating
point operand and leaves the result in the floating point
accumulator.
Include: stdlib.a or fp.a
Routine: ftoa (2,m)
--------------------
Category: Floating point Routine
Registers on entry: ES:DI points at buffer to hold result (ftoa/ftoa2 only)
AL- Field width for floating point value.
AH- Number of positions to the right of the dec pt.
Registers on return: ES:DI points at beginning of string (ftoa/ftoam only)
ES:DI points at zero terminating byte (ftoa2 only)
Flags affected: Carry is set if malloc error (ftoam only)
Example of Usage:
mov di, seg buffer
mov es, di
lea di, buffer
mov ah, 2 ;Two digits after "."
mov al, 10 ;Use a total of ten positions
ftoa
Description: FTOA (2,M) converts the value in the floating point accumulator
to a string of characters which represent that value. These
routines use a decimal representation. The value in AH is
the number of digits to put after the decimal point, AL
contains the total field width (including room for the sign
and decimal point). The field width specification works
just like Pascal or FORTRAN. If the number will not fit in
the specified field width, FTOA outputs a bunch of "#"
characters.
FTOA stores the converted string at the address specified by
ES:DI upon entry. There must be at least AL+1 bytes at this
address. It returns with ES:DI pointing at the start of this
buffer.
FTOA2 works just like FTOA except it does not preserve DI.
It returns with DI pointing at the zero terminating byte.
FTOAM allocates storage for the string on the heap and returns
a pointer to the converted string in ES:DI.
Note: this routine preserves the value in the floating point
accumulator but it wipes out the value in the floating point
operand.
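Because FTOA2 leaves DI at the zero terminating byte, it is convenient for
appending text to the converted number. The sketch below assumes a
hypothetical buffer named Buffer in your data segment and appends a percent
sign; note that the buffer needs one byte more than the AL+1 minimum to hold
the extra character.

```asm
; In the data segment (hypothetical name):
Buffer      db      16 dup (?)      ;Room for AL+1 bytes plus the "%"

; In the code segment:
            lesi    Buffer          ;ES:DI -> destination buffer
            mov     al, 8           ;Total field width
            mov     ah, 2           ;Two digits after "."
            ftoa2                   ;ES:DI -> zero terminating byte
            mov     byte ptr es:[di], '%'   ;Overwrite the terminator
            mov     byte ptr es:[di+1], 0   ;Write a new terminator
            lesi    Buffer          ;ES:DI -> start of the string again
            puts
            putcr
```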
Include: stdlib.a or fp.a
Routine: etoa (2,m)
--------------------
Category: Floating point Routine
Registers on entry: ES:DI points at buffer to hold result (etoa/etoa2 only)
AL- Field width for floating point value.
Registers on return: ES:DI points at beginning of string (etoa/etoam only)
ES:DI points at zero terminating byte (etoa2 only)
Flags affected: Carry is set if malloc error (etoam only)
Example of Usage:
mov al, 14 ;Use a total of 14 positions
etoam
puts
putcr
free
Description: ETOA (2,M) converts the value in the floating point accumulator
to a string of characters which represent that value. These
routines use an exponential (scientific notation)
representation. AL contains the field width, that is, the
number of print positions to use when outputting the
number. The field width specification works just like Pascal
or FORTRAN. If the number will not fit in the specified
field width, ETOA outputs a bunch of "#" characters.
ETOA stores the converted string at the address specified by
ES:DI upon entry. There must be at least AL+1 bytes at this
address. It returns with ES:DI pointing at the start of this
buffer.
ETOA2 works just like ETOA except it does not preserve DI.
It returns with DI pointing at the zero terminating byte.
ETOAM allocates storage for the string on the heap and returns
a pointer to the converted string in ES:DI.
Note: this routine preserves the value in the floating point
accumulator but it wipes out the value in the floating point
operand.
Include: stdlib.a or fp.a
Routine: atof
--------------
Category: Floating point Routine
Registers on entry: ES:DI points at a string containing the representation
of a floating point number in ASCII form.
Registers on return: None
Flags affected: None
Example of Usage:
les di, FPStr
atof
Description: ATOF converts the string pointed at by ES:DI into a floating
point value and leaves this value in the floating point
accumulator. Legal floating point values are described
by the following regular expression:
{" "}* {+ | -} ( ([0-9]+ {"." [0-9]*}) | ("." [0-9]+) )
{(e | E) {+ | -} [0-9] {[0-9]*}}
"{}" denote optional items.
"|" denotes OR.
"()" groups items together.
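As a rough illustration, the sketch below converts a string to floating
point with ATOF and then prints it back with FTOA. The names NumStr and
OutBuf are hypothetical, and the code assumes LESI loads ES:DI with the
address of a variable, as it does in the semaphore examples later in this
document.

```asm
; In the data segment (hypothetical names):
NumStr      db      "3.14159",0
OutBuf      db      16 dup (?)      ;At least AL+1 bytes

; In the code segment:
            lesi    NumStr          ;ES:DI -> ASCII text
            atof                    ;FPACC = value of the string
            lesi    OutBuf          ;ES:DI -> destination buffer
            mov     al, 10          ;Use a total of ten positions
            mov     ah, 3           ;Three digits after "."
            ftoa                    ;ES:DI -> start of converted string
            puts
            putcr
```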
Include: stdlib.a or fp.a
Memory Management Routines
--------------------------
The stdlib memory management routines let you dynamically allocate storage on
the heap. These routines are somewhat similar to those provided by the "C"
programming language. However, these routines do not perform garbage
collection, as this would introduce too many restrictions and have a very
adverse effect on speed.
The following paragraph gives a description of how the memory management
routines work. These routines may be updated in future revisions, however,
so you should never make assumptions about the structure of the memory
management record (described below) or else your code may not work on the
next revision.
The allocation/deallocation routines should be fairly fast. Malloc and free
use a modified first/next fit algorithm which lets the system quickly find a
memory block of the desired size without undue fragmentation problems (average
case). The memory manager data structure has an overhead of eight bytes
(meaning each malloc operation requires at least eight more bytes than you ask
for) and a granularity of 16 bytes. The overhead (eight bytes) per allocated
block may seem rather high, but that is part of the price to pay for faster
malloc and free routines. All pointers are far pointers and each new item is
allocated on a paragraph boundary. The current memory manager routines always
allocate (n+8) bytes, rounding up to the next multiple of 16 if the result is
not evenly divisible by sixteen. The first eight bytes of the structure are
used by the memory management routines, the remaining bytes are available for
use by the caller (malloc, et al., return a pointer to the first byte beyond
the memory management overhead structure).
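To make the sizing rule concrete, here is the arithmetic for a request of
100 bytes, written out as instructions. This only illustrates the (n+8)
rounding rule described above; it is not taken from the library source.

```asm
            mov     ax, 100         ;Bytes requested (n)
            add     ax, 8           ;Add the overhead: 108
            add     ax, 15          ;Bias so the AND below rounds up: 123
            and     ax, 0FFF0h      ;Clear the low four bits: 112
```

So a 100-byte request consumes 112 bytes (seven paragraphs) of heap space.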
NOTE: There was a major change in the way this package works starting with
version 30 of the library. Prior to version 30, MemInit required a parameter
in the DX register to determine where to allocate the heap and how much
storage to allocate. Furthermore, the older versions called DOS to deallocate
memory then reallocate it for the heap. Finally, the older versions required
that you set up a global variable "PSP" containing the program segment
prefix value.
As of version 30, MemInit was split into two routines: MemInit and MemInit2.
MemInit allocates all of available memory (like the standard version of the
earlier MemInit) whereas MemInit2 lets you specify the location and size
of the heap. The new version calls DOS to get the PSP (so you don't need
to declare the PSP variable just for MemInit). The new version does not
reallocate memory blocks with DOS calls (which created some problems,
especially with debugger programs). Finally the new versions work fine
with ".EXE" files which do not get all leftover memory allocated to them.
Most older STDLIB programs will work just fine with the new MemInit routine.
If you relied on MemInit to reallocate memory for you, or if you specified
the location of the heap, you will need to modify your program to use these
new versions of the MemInit routine.
Routine: MemInit
-----------------
Category: Memory Management Routine
Registers on Entry: Nothing
Globals Affected: zzzzzzseg - segment name of the last segment in your
program
Registers on return: CX - number of paragraphs actually reserved by MemInit
Flags affected: None
Example of Usage:
; Don't forget to set up
; zzzzzzseg before calling
; MemInit
MemInit
Description: This routine initializes the memory manager system. You must
call it before using any routines which call any of the memory
manager procedures (since a good number of the stdlib routines
call the memory manager, you should get in the habit of always
calling this routine.) The system will "die a horrible death"
if you call a memory manager routine (like malloc) without first
calling MemInit.
This routine expects you to define (and set up) a global
name: zzzzzzseg. "zzzzzzseg" is a dummy segment which
must be the name of the very last segment defined in your
program. MemInit uses the name of this segment to determine the
address of the last byte in your program. If you do not
declare this segment last, the memory manager will overwrite
anything which follows zzzzzzseg. The "shell.asm" file
provides you with a template for your programs which properly
defines this segment.
On return from MemInit, the CX register contains the number of
paragraphs actually allocated.
Include: stdlib.a or memory.a
Routine: MemInit2
------------------
Category: Memory Management Routine
Registers on Entry: ES- segment address of the start of the heap.
CX- Number of paragraphs to allocate for the heap.
Registers on return: None
Flags affected: None
Example of Usage:
mov cx, seg HeapSeg
mov es, cx
mov cx, HeapSize ;In paragraphs!
MemInit2
Description: This routine initializes the memory manager system. You must
call it before using any routines which call any of the memory
manager procedures (since a good number of the stdlib routines
call the memory manager, you should get in the habit of always
calling this routine.) The system will "die a horrible death"
if you call a memory manager routine (like malloc) without first
calling MemInit2 (or MemInit).
This routine lets you decide where the heap lies in memory
(as opposed to MemInit which uses all available bytes from
the end of your program to the end of memory).
Note: you should only call MemInit or MemInit2 once in your
program.
Include: stdlib.a or memory.a
Routine: Malloc
----------------
Category: Memory Management Routine
Registers on Entry: CX - number of bytes to reserve
Registers on return: CX - number of bytes actually reserved by Malloc
ES:DI - ptr to 1st byte of memory allocated by Malloc
Flags affected: Carry=0 if no error.
Carry=1 if insufficient memory.
Example of Usage:
mov cx, 256
Malloc
jnc GoodMalloc
print db "Insufficient memory to continue.",cr,lf,0
jmp Quit
GoodMalloc: mov byte ptr es:[di], 0 ;Init string to NULL
Description: Malloc is the workhorse routine you use to allocate a block of
memory. You give it the number of bytes you need and if it
finds a block large enough, it will allocate the requested
amount and return a pointer to that block.
Most memory managers require a small amount of overhead for each
block they allocate. Stdlib's (current) memory manager requires
an overhead of eight bytes. Furthermore, the granularity is 16
bytes. This means that Malloc always allocates blocks of memory
in paragraph multiples. Therefore, Malloc may actually reserve
more storage than you specify, and the value returned in
CX may be somewhat greater than the requested value. By setting
the minimum allocation size to a paragraph, however, the
overhead is reduced and the speed of Malloc is improved by a
considerable amount.
Stdlib's memory management does not do any garbage collection.
Doing so would place too many demands on Malloc's users.
Therefore, it is quite possible for you to fragment memory with
multiple calls to malloc, realloc, and free. You could wind up in
a situation where there is enough free memory to satisfy your
request, but there isn't a single contiguous block large enough
for the request. Malloc treats this as an insufficient memory
error and returns with the carry flag set.
If Malloc cannot allocate a block of the requested size, it
returns with the carry flag set. In this situation, the contents
of ES:DI are undefined. Attempting to dereference this pointer
will produce erratic and, perhaps, disastrous results.
Include: stdlib.a or memory.a
Routine: Realloc
-----------------
Category: Memory Management Routine
Registers on Entry: CX - number of bytes to reserve
ES:DI - pointer to block to reallocate.
Registers on return: CX - number of bytes actually reserved by Realloc.
ES:DI - pointer to first byte of memory allocated by
Realloc.
Flags affected: Carry = 0 if no error.
Carry = 1 if insufficient memory.
Example of Usage:
mov cx, 1024 ;Change block size to 1K
les di, CurPtr ;Get address of block into ES:DI
realloc
jc BadRealloc
mov word ptr CurPtr, di
mov word ptr CurPtr+2, es
Description: Realloc lets you change the size of an allocated block in the
heap. It allows you to make the block larger or smaller.
If you make the block smaller, Realloc simply frees (returns
to the heap) any leftover bytes at the end of the block. If
you make the block larger, Realloc goes out and allocates a
block of the requested size, copies the bytes from the old
block to the beginning of the new block (leaving the bytes at
the end of the new block uninitialized), and then frees the
old block.
Include: stdlib.a or memory.a
Routine: Free
--------------
Category: Memory Management Routine
Registers on Entry: ES:DI - pointer to block to deallocate
Registers on return: None
Flags affected: Carry = 0 if no error.
Carry = 1 if ES:DI doesn't point at an allocated block.
Example of Usage:
les di, HeapPtr
Free
Description: Free (possibly) deallocates storage allocated on the heap by
malloc or Realloc. Free returns this storage to the heap so other
code can reuse it later. Note, however, that Free doesn't
always return storage to the heap. The memory manager data
structure keeps track of the number of pointers currently
pointing at a block on the heap (see DupPtr, below). If you've
set up several pointers such that they point at the same block,
Free will not deallocate the storage until you've freed all of
the pointers which point at the block.
Free usually returns an error code (carry flag = 1) if you
attempt to Free a block which is not currently allocated or if
you pass it a memory address which was not returned by malloc
(or Realloc). By no means is this routine totally robust.
If you start calling free with arbitrary pointers in es:di
(which happen to be pointing into the heap) it is possible,
under certain circumstances, to confuse Free and it will attempt
to free a block it really should not.
This problem could be solved by adding a large amount of extra
code to the free routine, but it would slow it down considerably.
Therefore, a little safety has been sacrificed for a lot of
speed. Just make sure your code is correct and everything will
be fine!
Include: stdlib.a or memory.a
Routine: DupPtr
----------------
Category: Memory Management Routine
Registers on Entry: ES:DI - pointer to block
Registers on return: None
Flags affected: Carry = 0 if no error.
Carry = 1 if es:di doesn't point at an allocated block.
Example of Usage:
les di, Ptr
DupPtr
Description: DupPtr increments the pointer count for the block at the
specified address. Malloc sets this counter to one. Free
decrements it by one. If free decrements the value and it
becomes zero, free will release the storage to the heap for
other use. By using DupPtr you can tell the memory manager
that you have several pointers pointing at the same block
and that it shouldn't deallocate the storage until you free
all of those pointers.
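A minimal sketch of the pointer count in action follows. PtrA, PtrB, and
NoMemory are hypothetical names; the code saves both copies of the pointer
before calling DupPtr so that it does not depend on DupPtr preserving ES:DI.

```asm
; In the data segment (hypothetical names):
PtrA        dd      ?
PtrB        dd      ?

; In the code segment:
            mov     cx, 64
            malloc                  ;Pointer count = 1
            jc      NoMemory
            mov     word ptr PtrA, di
            mov     word ptr PtrA+2, es
            mov     word ptr PtrB, di
            mov     word ptr PtrB+2, es
            DupPtr                  ;Pointer count = 2

            les     di, PtrA
            free                    ;Count = 1, block stays allocated
            les     di, PtrB
            free                    ;Count = 0, block returns to the heap
```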
Include: stdlib.a or memory.a
Routine: IsInHeap
------------------
Category: Memory Management Routine
Registers on Entry: ES:DI - pointer to a block
Registers on return: None
Flags affected: Carry = 0 if ES:DI is a valid pointer.
Carry = 1 if not.
Example of Usage:
les di, MemPtr
IsInHeap
jc NotInHeap
Description: This routine lets you know if es:di contains the address of
a byte in the heap somewhere. It does not tell you if es:di
contains a valid pointer returned by malloc (see IsPtr, below).
For example, if es:di contains the address of some particular
element of an array (not necessarily the first element)
allocated on the heap, IsInHeap will return with the carry clear
denoting that es:di points somewhere in the heap. Keep in
mind that calling this routine does not validate the pointer;
it could be pointing at a byte which is part of the memory
manager data structure rather than at actual data (since the
memory manager maintains that information within the
bounds of the heap). This routine is mainly useful for seeing
if something is allocated on the heap as opposed to somewhere
else (like your code, data, or stack segment).
Include: stdlib.a or memory.a
Routine: IsPtr
---------------
Category: Memory Management Routine
Registers on Entry: ES:DI - pointer to block
Registers on return: None
Flags affected: Carry = 0 if es:di is a valid pointer.
Carry = 1 if not.
Example of Usage:
les di, MemPtr
IsPtr
jc NotAPtr
Description: IsPtr is much more specific than IsInHeap. This routine returns
the carry flag clear if and only if es:di contains the address
of a properly allocated (and currently allocated) block on the
heap. This pointer must be a value returned by Malloc, Realloc,
or DupPtr and that block must be currently allocated for IsPtr
to return the carry flag clear.
Include: stdlib.a or memory.a
Routine: BlockSize
-------------------
Category: Memory Management Routine
Registers on Entry: ES:DI - pointer to block
Registers on return: CX- Size of specified block (in bytes). Returns
zero if ES:DI does not point at a legal block.
Flags affected: None
Example of Usage:
les di, MemPtr
BlockSize
Description:
BlockSize returns the size (in bytes) of a block allocated on the heap.
If the block is not in the heap, this code returns zero in CX.
This routine does NOT verify that the block was actually allocated and
is still allocated. It just makes sure that the pointer points at a valid
location somewhere in the heap and returns the block size from the data
structure at the specified address. You are responsible for ensuring that
you do not use a deallocated memory block.
Include: stdlib.a or memory.a
Routine: MemAvail
------------------
Category: Memory Management Routine
Registers on Entry: None
Registers on return: CX- Size (in paragraphs) of largest free block on the heap
Flags affected: None
Example of Usage:
MemAvail
Description:
MemAvail returns the size (in paragraphs) of the largest free block on the
heap. You can use this call to determine if there is sufficient storage
for an object on the heap.
Include: stdlib.a or memory.a
Routine: MemFree
-----------------
Category: Memory Management Routine
Registers on Entry: None
Registers on return: CX- Size (in paragraphs) of all free blocks on the heap.
Flags affected: None
Example of Usage:
MemFree
Description:
MemFree returns the size (in paragraphs) of the free storage on the
heap. Note that this storage might be fragmented and not all of it may
be available for use by Malloc. To determine the largest free block
available use MemAvail.
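Since MemAvail reports paragraphs while Malloc takes a byte count, a check
before allocating has to convert units. The sketch below is one way to do
that, assuming the eight-byte overhead should be counted against the free
block (a conservative guess, not something the text above confirms).
BytesNeeded and NotEnough are hypothetical names.

```asm
BytesNeeded equ     1000

            MemAvail                ;CX = largest free block, in paragraphs
            cmp     cx, (BytesNeeded+8+15)/16   ;Paragraphs needed, rounded up
            jb      NotEnough
            mov     cx, BytesNeeded
            malloc
            jc      NotEnough       ;Still check carry; never trust ES:DI blindly
```

Here 1000 bytes plus eight bytes of overhead is 1008 bytes, or 63
paragraphs.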
Include: stdlib.a or memory.a
Process Manager Routines
------------------------
The UCR Standard Library Process package provides a simple preemptive
multitasking package for 8086 assembly language programmers. It also
provides a coroutine package and support for semaphores.
First, *AND THIS IS VERY IMPORTANT*, this package only supports the
8086, 8088, 80188, 80186, and 80286 processors operating in real mode.
You will need to make some minor modifications to the process package if
you wish to support the 32-bit x86 processors. The current process manager
only saves 16-bit registers, not the 32-bit registers of the 80386 and later.
You will, however, find it a relatively minor task to go in and modify this
code to support the 386 and later processors. We will probably add support
for these processors at a later date when time allows.
Second, *THIS IS ALSO VERY IMPORTANT*, keep in mind that DOS, BIOS, and many
of the routines in the standard library ARE NOT REENTRANT. Two processes
executing at the (apparent) same time cannot both be executing DOS, BIOS, or
the same standard library routines. It is unlikely that DOS or BIOS will
ever be made reentrant, and you shouldn't ever expect the standard library
to be made reentrant (far too much work). The standard library provides
semaphore support through which you can control access to critical resources
including DOS, BIOS, and the UCR Standard Library. If you are unfamiliar
with terms like reentrancy, semaphores, synchronization, and deadlock, you
should probably pick up a good text on operating systems and familiarize
yourself with these terms before attempting to use this package.
This process package provides three facilities to your assembly language
programs: A preemptive multitasking process manager, coroutine support, and
semaphore support. There are six routines associated with the preemptive
multitasking system: PRCSINIT, PRCSQUIT, FORK, KILL, DIE, and YIELD. There
are three routines associated with the coroutines package: COINIT, COCALL,
and COCALLL (though you'll rarely refer to COCALLL directly). Finally, there
are two routines to support semaphores: WAITSEMAPH and RLSSEMAPH.
The PRCSINIT and PRCSQUIT routines initialize and deinitialize the interrupt
system for the multitasking system. You must call PRCSINIT prior to
executing any of the preemptive multitasking routines or any of the semaphore
routines (the semaphore routines make sense only in the context of preemptive
multitasking). This initializes various internal variables and patches the
INT 8 interrupt vector (timer interrupt) to point at an internal routine
in the process manager. You must call PRCSQUIT when you are done with the
preemptive multitasking system; certainly you must call it before your
program terminates *FOR ANY REASON*. If you do not call PRCSQUIT, the system
will probably crash shortly after you try anything else after returning to
DOS since the timer interrupt will still be calling the routine left in
memory when your program terminates.
The process manager patches into the 1/18th second clock on the PC.
Therefore, the system will automatically perform a context switch every 55ms
or so. If your application reprograms the timer chip, this may produce
unexpected results. This may be particularly bothersome if you are running
a TSR which plays with the timer chip. Absolutely no attempt was made to
make this code robust enough to work in all cases with other code which
ties into the timer interrupt. Most well-written code will work fine, but
there are no guarantees.
The FORK routine lets you spawn a new process. For each call to FORK your
code makes, the FORK routine returns twice- once as the parent process and
once as the child process. FORK returns process ID information in the AX
and BX registers so that the code immediately following the FORK can figure
out if it's the parent or child process. FORK provides the basic (and only!)
mechanism for starting a second process.
The KILL and DIE routines let you terminate a process. KILL lets one process
terminate some other process (generally a child process). DIE lets a
process terminate itself.
The YIELD routine gives up the current process' time slice and lets some other
process take over.
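One plausible use of YIELD is a polite wait loop, where a process that has
nothing to do hands its time slice to someone else instead of spinning.
The sketch below assumes a hypothetical flag, ChildDone, which the child
process sets to one when it finishes, and assumes DS points at the data
segment both before and after the YIELD call.

```asm
; In the data segment (hypothetical flag, set to 1 by the child process):
ChildDone   db      0

; In the parent process:
WaitChild:  cmp     ChildDone, 0    ;Has the child finished yet?
            jne     ChildIsDone
            yield                   ;No: give away the rest of this time slice
            jmp     WaitChild
ChildIsDone:
```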
The semaphore routines, WaitSemaph and RlsSemaph, let you wait on a semaphore
or signal that semaphore, respectively. The PROCESS.A and PROCESS.A6
include files contain the definition of a semaphore type:
semaphore struc
SemaCnt dw 1
smaphrLst dd ?
endsmaphrlst dd ?
semaphore ends
The only field you should ever play around with is the SemaCnt field. This
value is the number of processes which are allowed to be in the critical
region controlled by the semaphore at one time. For most mutual exclusion
problems, this value should always be one. Do not modify this value once
the program starts running. The process package increments and decrements
this number to keep track of the number of processes waiting to use a
resource. If you want to allow two processes to share a resource at the
same time, you should declare your semaphore variable as follows:
MySemaPh semaphore <2>
You execute the WaitSemaPh routine to see if a semaphore is currently busy.
When you get back from the WaitSemaPh call, the resource protected by the
semaphore is exclusively yours until you execute the RlsSemaPh routine.
Note that when you call the WaitSemaPh routine, the specified resource may
already be in use, in which case your process will be suspended until the
resource is freed (and anyone waiting in line ahead of you has had their
shot at the resource). If you do not call the RlsSemaPh routine to free
the semaphore, any other process waiting on that resource will wait
indefinitely. Also note that if you call WaitSemaPh twice on a semaphore
without releasing it in between, your process and any other process which
waits on that resource will deadlock.
While semaphores solve a large number of synchronization and mutual exclusion
problems, their primary use in the UCR Standard Library is to prevent
reentrancy problems in DOS, BIOS, and the Standard Library itself. For example,
if you have two processes which print values to the display, attempting to
run both processes concurrently will crash the system if they both attempt
to print at the same time (since this will cause DOS to be reentered). A
simple solution is to use a DOS semaphore as follows:
In the data segment:
DOS semaphore {}
In Process 1:
lesi DOS
WaitSemaPh
print
db "Printed from process #1.",cr,lf,0
lesi DOS
RlsSemaPh
In Process 2:
lesi DOS
WaitSemaPh
print
db "Printed from process #2.",cr,lf,0
lesi DOS
RlsSemaPh
Semaphores guarantee mutual exclusion between the WaitSemaPh and RlsSemaPh
calls (for a particular semaphore variable, DOS in this case). Hence, once
process #1 enters its *CRITICAL REGION* by executing the WaitSemaPh call,
any attempt by process two to enter its critical region will cause process
two to suspend execution until process one executes the RlsSemaPh routine.
Coroutines provide "simulated" multitasking where the processes themselves
determine when to perform a context switch. This is quite similar to the
"cooperative multitasking" systems provided by Apple, Microsoft, and others
("cooperative multitasking" is a hyped up term to hide the fact that their
systems provide only multiprogramming, not multitasking). There are many
advantages and disadvantages to coroutines vs. multitasking. First of all,
reentrancy problems do not exist in a system using coroutines. Since you
control when one process switches to another, you can make sure that such
context switches do not occur in critical regions. Another advantage to
coroutines is that the processes themselves can determine which other process
gets the next access to the CPU. Finally, when a coroutine is executing,
it gets full access to the CPU to handle a time-critical operation without
fear of being preempted. On the other hand, poorly designed coroutines
provide a very crude approximation to multitasking and may actually hurt
the overall performance of the system.
The UCR Standard Library provides three routines to support coroutines:
COINIT, COCALL, and COCALLL. Generally, you'll only see COINIT and COCALL
in a program; the standard library automatically generates COCALLL calls
for certain types of COCALL statements. The COINIT routine initializes the
coroutine package and creates a process control block (PCB) for the currently
active routine. COCALL switches context to some other process. When one
process COCALLs another and that second process COCALLs the first, the first
process continues execution immediately after the first COCALL instruction
(so it behaves more like a return than a call). In general, you should not
think of COCALL as a "call" but rather as a "switch to some other process."
You may have coroutines and multitasking active at the same time, but you
should not make a COCALL to a process which is being time-sliced by the
multitasking system. I won't guarantee that this *won't* work, but it
seems sufficiently weird that something is bound to go wrong.
For those who are interested, the coroutine and multitasking packages
maintain the state of a process in a process control block (PCB) which
is the following structure:
pcb struc
NextProc dd ?
regsp dw ?
regss dw ?
regip dw ?
regcs dw ?
regax dw ?
regbx dw ?
regcx dw ?
regdx dw ?
regsi dw ?
regdi dw ?
regbp dw ?
regds dw ?
reges dw ?
regflags dw ?
PrcsID dw ?
StartingTime dd ?
StartingDate dd ?
CPUTime dd ?
pcb ends
As mentioned earlier, this code does not maintain the full state of the
80386 and later processors since it only saves the 16-bit register set.
If you like, you can easily change the definition of the PCB, and all the
code in the PROCESS.ASM file which refers to the PCB, and support full
32-bit operation. As usual, you should rarely, if ever, play around with
the internal fields of a PCB. That is for the process manager to do and you
could mess things up pretty badly if you're not careful.
Routine: prcsinit
-----------------
Category: Processes
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
prcsinit
Description:
Prcsinit initializes the process manager. Note that if you make a call to
the process manager, you *MUST* make a call to prcsquit before your program
quits. Failure to do so will crash the system in short order. Note that
you must even handle the case where the user types control-C or encounters
a critical error and aborts back to DOS.
This routine patches into the timer interrupt vector (INT 08) and may not
work properly with code in the system which is also messing around with the
timer interrupt.
Include: stdlib.a or process.a
Routine: prcsquit
-----------------
Category: Processes
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
prcsquit
Description:
Prcsquit deinitializes the process manager. It detaches the timer interrupt
from the internal routine and performs various other clean up chores. You
must call this routine before your program terminates if you've called
prcsinit somewhere in your program.
Note that you cannot call prcsquit twice in a row, although you can call
the prcsinit/prcsquit combinations as many times as you like in your code.
Include: stdlib.a or process.a
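The pairing requirement above can be sketched as follows. This is only an
illustration, not code from the library: CtrlCHandler and Main are made-up
names, and the fragment assumes the usual DOS services (INT 21h function 25h
to set a vector, function 4Ch to terminate). It ignores control-C so that the
program always reaches prcsquit; a critical error handler (INT 24h) would
need similar treatment and is not shown.

```asm
; Sketch only. CtrlCHandler and Main are illustrative names, not part
; of the library.

CtrlCHandler    proc    far
                iret                    ;Ignore control-C; program keeps running
CtrlCHandler    endp

Main            proc
                push    ds              ;DOS wants the handler address in DS:DX
                mov     ax, seg CtrlCHandler
                mov     ds, ax
                mov     dx, offset CtrlCHandler
                mov     ax, 2523h       ;DOS: set INT 23h (control-C) vector
                int     21h
                pop     ds

                prcsinit                ;Process manager hooks the timer interrupt

                ;<multitasking code goes here>

                prcsquit                ;Unhook the timer before leaving
                mov     ax, 4C00h       ;DOS: terminate with return code 0
                int     21h
Main            endp
```

DOS restores the INT 23h vector itself when the program terminates, so the
sketch does not put the old handler back.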
Routine: fork
-------------
Category: Processes
Registers on entry: ES:DI- Points at a PCB to hold the process info
for the new process. The regss and regsp
fields of this PCB must be initialized to
the top of a new stack for the new process.
Registers on return: AX- <Parent process> Returned containing zero.
<Child process> Child process ID.
BX- <Parent process> Child's process ID.
<Child process> Returned containing zero.
Flags affected: None
Example of Usage:
ChildPCB pcb {0, offset endstk, seg endstk}
.
.
.
lesi ChildPCB
fork
cmp ax, 0
jne DoChildProcess
<Parent process continues here>
.
.
.
ChildsStack db 1024 dup (?)
EndStk dw 0
Description:
Fork spawns a new process. You make a single call to fork but it returns
twice- once for the parent process (the original caller to fork) and once
for the child process (the new process created by fork). On entry to fork
ES:DI must point at a PCB for the new process which has the REGSP and
REGSS fields initialized to point at the last word in a stack set aside for
the child process. *THIS IS VERY IMPORTANT!*
On return, the code following the call to FORK can test to see whether the
parent is returning or the child is returning by looking at the value in
the AX register. AX will contain zero upon return to the parent process.
It will contain a non-zero value, which is the process ID, when returning
to the child process. When the parent process returns, FORK returns the
process ID of the child process in the BX register. The parent process can
use this value to kill the child process, should that become necessary later
on. If the child needs access to the parent's process ID, the parent process
should store away its process ID in a variable before calling the FORK
routine. Note that with the exception of the AX, BX, SP, and SS registers,
FORK preserves all register values and returns the same set of values to both
the parent and child processes. In particular, it preserves the value of
the DS register so the child will have access to any global variables in
use by the parent process.
FORK does *not* copy the parent's stack data to the child's stack. Upon
return from FORK, the child's stack is empty. If you call FORK after
pushing something on the stack (e.g., a return address because you've called
FORK inside some other procedure), that information will not be placed on the
stack of the child process. If you need such information pushed on the child's
stack, you will need to save SS:SP prior to calling FORK (in a global
variable) and then push the data pointed at by this saved value onto the
child's stack upon return. Of course, it's a whole lot easier if you simply
don't count on anything being on the child's stack when you get back from
FORK. In particular, don't call FORK from inside some nested routine and
expect the child process to return to the caller of the routine containing
FORK.
Include: stdlib.a or process.a
Routine: die
------------
Category: Processes
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
die
Description:
Die kills the current process. Control is transferred to some other process
in the process manager's ready-to-run queue. Control never returns back to
the current process.
Note: if the current process is the only process running, calling DIE may
crash the system.
Include: stdlib.a or process.a
Routine: kill
-------------
Category: Processes
Registers on entry: AX- Process ID of process to terminate.
Registers on return: None
Flags affected: None
Example of Usage:
mov ax, ProcessID
kill
Description:
KILL lets one process terminate another. Typically, a parent might kill a
child process, although any process which knows the process ID of some other
process can kill that other process.
If a process passes its own ID to KILL, the system behaves exactly as though
the process called the DIE routine.
If a process passes its own ID to KILL and it is the only process in the
system, the system may crash.
Include: stdlib.a or process.a
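As a rough sketch of how FORK, KILL, and YIELD fit together, the parent below
saves the child's process ID from BX and later uses it to terminate the child.
ChildID, ChildLoop, and ParentDone are hypothetical names, the data and code
are shown in one listing for brevity, and the fragment assumes prcsinit has
already been called and that the child is still running when the parent
kills it.

```asm
; Sketch only. ChildID, ChildLoop, and ParentDone are illustrative names.

;--- data segment ---
ChildID         dw      ?               ;Parent's copy of the child's process ID
ChildPCB        pcb     {0, offset EndStk, seg EndStk}
ChildsStack     db      1024 dup (?)
EndStk          dw      0

;--- code segment, after prcsinit ---
                lesi    ChildPCB
                fork
                cmp     ax, 0
                jne     ChildLoop       ;AX <> 0: this is the child
                mov     ChildID, bx     ;Parent: BX holds the child's ID

                ;<parent does its own work here>

                mov     ax, ChildID
                kill                    ;Parent terminates the child
                jmp     ParentDone

ChildLoop:      ;<child does one unit of work here>
                yield                   ;Give up the rest of this time slice
                jmp     ChildLoop       ;Child runs until the parent kills it

ParentDone:     ;<parent continues and eventually calls prcsquit>
```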
Routine: yield
--------------
Category: Processes
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
yield
Description:
YIELD voluntarily gives up the CPU. The remainder of the current time slice
is given to some other process and the current process goes to the end of the
process manager's ready-to-run queue. This call is particularly useful for
passing control between two cooperating processes where one process needs to
wait for some action to complete and you don't feel like using semaphores
to synchronize the two activities. This call is roughly equivalent to
"COCALL <next available process>".
Include: stdlib.a or process.a
Routine: coinit
---------------
Category: Processes
Registers on entry: ES:DI- Points at an empty PCB for the current process.
Registers on return: None
Flags affected: None
Example of Usage:
MainProcess pcb {}
.
.
.
lesi MainProcess
coinit
Description:
COINIT initializes the coroutine package and sets up an internal pointer
to the PCB specified by ES:DI (the "current coroutine" pcb). On the next
COCALL, the process state will be saved in the pcb you've specified on the
COINIT call. Note that you do not have to initialize this pcb in any way;
COINIT and COCALL take care of all that.
Include: stdlib.a or process.a
Routine: cocall/cocalll
-----------------------
Category: Processes
Registers on entry: ES:DI- Points at a PCB for the new coroutine
(COCALL only).
CS:IP- Points at a pointer to a PCB for the new
coroutine (COCALLL only).
Registers on return: None
Flags affected: None
Example of Usage:
OtherProcess pcb {----}
YetAnotherProcess pcb {----}
.
.
.
lesi OtherProcess
cocall
.
.
.
cocall YetAnotherProcess
Description:
COCALL switches context between two coroutines. There are two versions of
this call, although you use COCALL to invoke both of them: COCALL and COCALLL.
For COCALL, you must pass the address of a PCB in ES:DI. When calling COCALL
in this fashion, the operand field of the COCALL instruction must be blank
(see the example above). COCALLL expects the address of the pcb to follow
in the code stream. The COCALL macro looks for an operand and, if one is
present, it automatically creates the appropriate call to COCALLL and inserts
the address of the PCB in the code stream (again, see the example above).
Before you start a coroutine for the first time by calling COCALL, you must
properly initialize the pcb for that coroutine. You must provide initial
values for the regsp, regss, regip, and regcs fields of the pcb (fields two
through five in the pcb structure). Regsp and regss must point at the last
word of an appropriately sized stack for that coroutine; regip and regcs
must point at the initial entry point for the coroutine. For example, if you
want to switch between the current process and a coroutine named "CORTN", you
could use the following code:
MainCoRtn pcb {}
CoRtnPCB pcb {0,offset CoRtnStk, seg CoRtnStk,
offset CoRtn, seg CoRtn}
.
.
.
lesi MainCoRtn
coinit
.
.
.
cocall CoRtnPCB
.
.
.
CoRtn proc
.
.
.
cocall MainCoRtn
.
.
.
CoRtn endp
.
.
.
db 1024 dup (?)
CoRtnStk dw 0
Include: stdlib.a or process.a
Routine: waitsemaph
-------------------
Category: Processes
Registers on entry: ES:DI- Points at a semaphore variable
Registers on return: None
Flags affected: None
Example of Usage:
DOSsemaph semaphore {}
.
.
.
lesi DOSsemaph
WaitSemaPh
.
. <This is the critical section>
.
lesi DOSsemaph
RlsSemaPh
Description:
WaitSemaPh and RlsSemaPh protect critical regions in a multitasking
environment. WaitSemaPh expects you to pass the address of a semaphore
variable in ES:DI. If that particular semaphore is not currently in use,
WaitSemaPh marks the semaphore "in use" and immediately returns. If the
semaphore is already in use, WaitSemaPh queues up the current process
on a waiting queue and lets some other process start running. Once a process
is done with the resource protected by a semaphore, it must call RlsSemaPh
to release the semaphore back to the system. If any processes are waiting
on that semaphore, the call to RlsSemaPh will activate the first such process.
Note that a process must not make two successive calls to WaitSemaPh on a
particular semaphore variable without calling RlsSemaPh between the calls.
Doing so will cause a deadlock.
Include: stdlib.a or process.a
Routine: rlssemaph
------------------
Category: Processes
Registers on entry: ES:DI- Points at a semaphore variable
Registers on return: None
Flags affected: None
Example of Usage: See WaitSemaPh
Description:
RlsSemaPh releases a semaphore (also known as "signalling") that the current
process has acquired via a call to WaitSemaPh. Please see the WaitSemaPh
explanation for more details. You should not call RlsSemaPh without first
calling WaitSemaPh. Doing so may cause some inconsistencies in the system.
Include: stdlib.a or process.a
Interrupt-Driven Serial Port I/O Package
========================================
One major problem with the PC's BIOS is the lack of good interrupt driven
I/O support for the serial port. The BIOS provides a mediocre set of polled
I/O facilities, but completely drops the ball on interrupt driven I/O.
This set of routines in the standard library provides polled I/O support
to read and set the registers on the 8250 (or other comparable chip, e.g.,
16450) as well as read and write data (polled). In addition, there are
a pair of routines to initialize and disable the interrupt system as well
as perform I/O using interrupts.
Typical polled I/O session:
1. Initialize chip using polled I/O routines.
2. Read and write data using ComRead and ComWrite routines.
Typical interrupt driven I/O session:
1. Initialize chip using polled I/O routines.
2. Read and write data using ComIn and ComOut routines.
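The interrupt driven session might look something like the sketch below. The
line settings (9600 bps, 8 data bits, 1 stop bit, no parity), the use of ESC
to end the loop, and the label names are all assumptions chosen for
illustration. A polled session would use the same initialization calls but
skip ComInitIntr/ComDisIntr and use ComRead and ComWrite instead of ComIn
and ComOut.

```asm
; Sketch only. Settings and labels are illustrative assumptions.

                mov     ax, 9600        ;Initialize the chip with the
                ComBaud                 ; polled I/O routines first
                mov     ax, 8
                ComSize
                mov     ax, 1
                ComStop
                mov     ax, 0           ;No parity
                ComParity

                ComInitIntr             ;Switch to interrupt driven I/O

EchoLoop:       ComIn                   ;Wait for the next buffered character
                cmp     al, 1Bh         ;ESC ends the session (arbitrary choice)
                je      Done
                ComOut                  ;Echo the character back
                jmp     EchoLoop

Done:           ComDisIntr              ;Uninstall the ISR before exiting
```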
Of course, all the details of serial communications cannot be discussed
here- it's far too broad a subject. These routines, like most in the
library, assume you know what you're doing. They just make it a little
easier on you. If you don't understand anything about serial communications,
you *might* be able to use these routines, but they were not written with
that audience in mind. There are several good references on serial
communications; "C Programmer's Guide to Serial Communications" comes to mind. If
you've never looked at the 8250 or comparable chips before, you might want
to take a look at a reference such as this one if the routines in this
section don't make much sense.
Note: These routines are set up to use the COM1: hardware port. See the
source listings if you want to access a different serial port. Perhaps in
a future release we will modify this code to work with any serial port.
Routine: ComBaud
-----------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AX- BPS (baud rate): 110, 150, 300, 600, 1200,
2400, 4800, 9600, 19200
Registers on return: None
Flags affected: None
Example of Usage:
mov ax, 9600 ;Set system to 9600 bps
ComBaud
Description:
ComBaud programs the serial chip to change its "baud rate" (technically,
it's "bits per second" not baud rate). You load AX with the appropriate
bps value and call ComBaud, as above. Note: if AX is not one of the legal
values, ComBaud defaults to 19.2kbps.
Include: ser.a or stdlib.a
Routine: ComStop
-----------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AX- # of stop bits (1 or 2)
Registers on return: None
Flags affected: None
Example of Usage:
mov ax, 2 ;Set system to send 2 stop bits
ComStop
Description:
ComStop programs the serial chip to transmit the specified number of stop
bits when sending data. You load AX with the appropriate value and call
ComStop, as above. Note that this only affects the output data stream. The
serial chip on the PC will always work with one incoming stop bit, regardless
of the setting. Since additional stop bits slow down your data transmission
(by about 10%) and most devices work fine with one stop bit, you should
normally program the chip with one stop bit unless you encounter some
difficulties. The setting of this value depends mostly on the system you
are connecting to.
Include: ser.a or stdlib.a
Routine: ComSize
-----------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AX- # of data bits to transmit (5, 6, 7, or 8)
Registers on return: None
Flags affected: None
Example of Usage:
mov ax, 8 ;Set system to send 8 data bits
ComSize
Description:
ComSize programs the serial chip to transmit the specified number of data
bits when sending data. You load AX with the appropriate value and call
ComSize, as above. The setting of this value depends mostly on the system
you are connecting to.
Include: ser.a or stdlib.a
Routine: ComParity
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AX- Bits 0, 1, and 2 are defined as follows:
bit 0- 1 to enable parity, 0 to disable.
bit 1- 0 for odd parity, 1 for even.
bit 2- Stuck parity bit. If 1 and bit 0 is 1, then the parity bit
is always set to the inverse of bit 1.
Registers on return: None
Flags affected: None
Example of Usage:
mov ax, 0 ;Set NO parity
ComParity
.
.
.
mov ax, 11b ;Set even parity
ComParity
Description:
ComParity programs the serial chip to use various forms of parity error
checking. If bit zero of AX is zero, then this routine disables parity
checking and transmission. In this case, ComParity ignores the other
two bits (actually, the 8250 ignores them, ComParity just passes them
through). If bit zero is a one, and bit two is a zero, then bit #1
defines even/odd parity during transmission and receiving. If bit #0
is a one and bit two is a one, then the 8250 will always transmit the
inverse of bit #1 as the parity bit (forced on or off).
Include: ser.a or stdlib.a
Routine: ComRead
-----------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL- Character read from port
Flags affected: None
Example of Usage:
ComRead
mov Buffer, al
Description:
ComRead polls the port to see if a character is available in the on-chip
data register. If not, it waits until a character is available. Once
a character is available, ComRead reads it and returns this character in
the AL register.
Warning: do *not* use this routine while operating in the interrupt mode.
This routine is for polled I/O only.
Include: ser.a or stdlib.a
Routine: ComWrite
------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AL- Character to write to port
Registers on return: None
Flags affected: None
Example of Usage:
mov al, 'a'
ComWrite
Description:
ComWrite polls the port to see if the transmitter is busy. If so, it waits
until the current transmission is through. Once the 8250 is done with the
current character, ComWrite will put the character in AL into the 8250
transmit register.
Warning: do *not* use this routine while operating in the interrupt mode.
This routine is for polled I/O only.
Include: ser.a or stdlib.a
Routine: ComTstIn
------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL=0 if no char available, 1 if char available
Flags affected: None
Example of Usage:
Wait4Data: ComTstIn
cmp al, 0
je Wait4Data
Description:
ComTstIn polls the port to see if any input data is available. If so,
it returns a one in AL, else it returns a zero.
Warning: do *not* use this routine while operating in the interrupt mode.
This routine is for polled I/O only.
Include: ser.a or stdlib.a
Routine: ComTstOut
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL = 1 if xmitr available, 0 if not
Flags affected: None
Example of Usage:
WriteData: <Do Something>
ComTstOut
cmp al, 0
je WriteData
mov al, 'a'
ComWrite
Description:
ComTstOut polls the port to see if the transmitter is currently busy. If so,
it returns a zero in AL, else it returns a one.
Warning: do *not* use this routine while operating in the interrupt mode.
This routine is for polled I/O only.
Include: ser.a or stdlib.a
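One way to combine the polled test and transfer routines is a loop that never
blocks on input, sketched below. PollLoop and NoInput are hypothetical labels,
and the fragment assumes the chip has already been initialized and that
interrupt mode is off.

```asm
; Sketch only. Labels are illustrative; polled mode is assumed.

PollLoop:       ComTstIn
                cmp     al, 0
                je      NoInput         ;Nothing waiting: go do other work
                ComRead                 ;A character is ready, so no waiting here
                ComWrite                ;Echo it (waits while transmitter is busy)

NoInput:        ;<other work goes here>
                jmp     PollLoop
```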
Routine: ComGetLSR
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL = LSR value
Flags affected: None
Example of Usage:
ComGetLSR
<do something with value in LSR>
Description:
Reads the LSR (line status register) and returns this value in AL. The
LSR uses the following layout:
Line Status Register (LSR):
bit 0- Data Ready
bit 1- Overrun error
bit 2- Parity error
bit 3- Framing error
bit 4- Break Interrupt
bit 5- Transmitter holding register is empty.
bit 6- Transmit shift register is empty.
bit 7- Always zero.
Warning: In general, it is not a good idea to call this routine while
the interrupt system is active. It won't hurt anything, but the value
you get back may not reflect properly upon the last/next character you
read.
Include: ser.a or stdlib.a
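A polled receiver that wants to notice line errors could test the LSR bits
listed above before reading each character, roughly as in this sketch.
CheckLine and LineError are hypothetical labels. Keep in mind that on the
8250 reading the LSR clears the error bits, so the value should be tested
right away rather than read a second time.

```asm
; Sketch only. Labels are illustrative; polled mode is assumed.

CheckLine:      ComGetLSR
                test    al, 00011110b   ;Bits 1-4: overrun, parity, framing, break
                jnz     LineError
                test    al, 00000001b   ;Bit 0: data ready?
                jz      CheckLine       ;Not yet, keep polling
                ComRead                 ;Character is returned in AL
                ;<use the character in AL here>
                jmp     CheckLine

LineError:      ;<report or recover from the error here>
```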
Routine: ComGetMSR
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL = MSR value
Flags affected: None
Example of Usage:
ComGetMSR
<do something with value in MSR>
Description:
The MSR (modem status register) bits are defined as follows:
Modem Status Register (MSR):
bit 0- Delta CTS
bit 1- Delta DSR
bit 2- Trailing edge ring indicator
bit 3- Delta carrier detect
bit 4- Clear to send
bit 5- Data Set Ready
bit 6- Ring indicator
bit 7- Data carrier detect
Warning: In general, it is not a good idea to call this routine while
the interrupt system is active. It won't hurt anything, but the value
you get back may not reflect properly upon the last/next character you
read.
Include: ser.a or stdlib.a
Routine: ComGetMCR
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL = MCR value
Flags affected: None
Example of Usage:
ComGetMCR
<do something with value in MCR>
Description:
The MCR (modem control register) bits are defined as follows:
Modem Control Register (MCR):
bit 0- Data Terminal Ready (DTR)
bit 1- Request to send (RTS)
bit 2- OUT 1
bit 3- OUT 2
bit 4- Loop back control.
bits 5-7- Always zero.
The DTR and RTS bits control the function of these lines on the 8250.
They are useful mainly for polled I/O handshake operations (though they
*could* be used with interrupt I/O, it's rarely necessary unless your
main application is *really* slow and the data is coming in really fast).
Out1 and Out2 control output pins on the 8250. Keep in mind that on the PC
the OUT2 pin enables/disables the serial port interrupts. Play with this
*only* if you want to control the interrupt enable.
Loop back control is mainly useful for testing the serial port or checking
to see if a serial chip is present.
Include: ser.a or stdlib.a
Routine: ComSetMCR
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AL = new MCR value
Registers on return: None
Flags affected: None
Example of Usage:
mov al, NewMCRValue
ComSetMCR
Description:
This routine writes the value in AL to the modem control register. See
ComGetMCR for details on the MCR register.
Include: ser.a or stdlib.a
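Since the MCR packs several unrelated controls into one byte, it is usually
safest to read it, change only the bits of interest, and write it back. The
following sketch raises DTR and RTS and later drops RTS alone; the bit
positions come from the table under ComGetMCR, and everything else is left
as the library set it.

```asm
; Sketch only: read-modify-write of the modem control register.

                ComGetMCR
                or      al, 00000011b   ;Set bit 0 (DTR) and bit 1 (RTS)
                ComSetMCR

                ;<later, to ask the other end to stop sending>

                ComGetMCR
                and     al, 11111101b   ;Clear bit 1 (RTS), leave the rest alone
                ComSetMCR
```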
Routine: ComGetLCR
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL = LCR value
Flags affected: None
Example of Usage:
ComGetLCR
<do something with value in LCR>
Description:
The LCR (line control register) bits are defined as follows:
Line Control Register (LCR):
bits 0,1- Word length (00=5, 01=6, 10=7, 11=8 bits).
bit 2- Stop bits (0=1, 1=2 stop bits [1-1/2 if 5 data bits]).
bit 3- Parity enabled if one.
bit 4- 0 for odd parity, 1 for even parity (assuming bit 3 = 1).
bit 5- 1 for stuck parity.
bit 6- 1=force break.
bit 7- 1=Divisor latch access bit. 0=rcv/xmit access bit.
Since the standard library provides routines to initialize the serial chip
(which is the purpose of this port) you shouldn't really mess with this
port at all. You may, however, use ComGetLCR to see what the current
settings are before making any changes.
Warning: (applies mainly to ComSetLCR) DO NOT, UNDER ANY CIRCUMSTANCES,
CHANGE THE DIVISOR LATCH ACCESS BIT WHILE OPERATING IN INTERRUPT MODE.
The interrupt service routine assumes the rcv/xmit register is mapped in
whenever an interrupt occurs. If you must play with the divisor latch,
turn off interrupts before changing it. Always set the divisor latch
access bit back to zero before turning interrupts back on.
Include: ser.a or stdlib.a
Routine: ComSetLCR
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AL = new LCR value
Registers on return: None
Flags affected: None
Example of Usage:
; If this maps in the divisor latch, be sure we're not operating with
; serial interrupts!
mov al, NewLCRValue
ComSetLCR
Description:
This routine writes the value in AL to the line control register. See
ComGetLCR for details on the LCR register. Especially note the warning
about the divisor latch access bit.
Include: ser.a or stdlib.a
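As a small illustration of reading the current settings without changing
them, the sketch below decodes the word length and parity enable fields from
the LCR layout given under ComGetLCR. NoParity is a hypothetical label and
BL is used only as scratch storage.

```asm
; Sketch only: decode two fields of the line control register.

                ComGetLCR
                mov     bl, al          ;Keep a copy of the full LCR value
                and     al, 00000011b   ;Bits 0,1: 00=5, 01=6, 10=7, 11=8
                add     al, 5           ;AL = number of data bits (5-8)
                test    bl, 00001000b   ;Bit 3: parity enabled if one
                jz      NoParity
                ;<parity is enabled; bit 4 of BL selects odd/even>

NoParity:       ;<continue here>
```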
Routine: ComGetIIR
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL = IIR value
Flags affected: None
Example of Usage:
ComGetIIR
<do something with value in IIR>
Description:
The IIR (interrupt identification register) bits are defined as follows:
Interrupt ID Register (IIR):
bit 0- No interrupt is pending (interrupt pending if zero).
bits 1,2- Binary value denoting source of interrupt:
00-Modem status
01-Transmitter Hold Register Empty
10-Received Data Available
11-Receiver line status
bits 3-7- Always zero.
This value is of little use to anyone except the interrupt service routine.
The ISR is the only code which should really access this port.
Include: ser.a or stdlib.a
Routine: ComGetIER
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL = IER value
Flags affected: None
Example of Usage:
ComGetIER
<do something with value in IER>
Description:
The IER (interrupt enable register) bits are defined as follows:
Interrupt Enable Register (IER):
If one:
bit 0- Enables received data available interrupt.
bit 1- Enables transmitter holding register empty interrupt.
bit 2- Enables receiver line status interrupt.
bit 3- Enables the modem status interrupt.
bits 4-7- Always set to zero.
Normally, the interrupt initialization procedure sets up this port. You may
read or change its value as you deem necessary to control the types of
interrupts the system generates. Note that the interrupt service routine
(ISR) in the library ignores errors. You will need to modify the ISR if you
need to trap errors.
Include: ser.a or stdlib.a
Routine: ComSetIER
-------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AL = new IER value
Registers on return: None
Flags affected: None
Example of Usage:
mov al, NewIERValue
ComSetIER
Description:
Writes the value in AL to the IER. See ComGetIER for more details.
Include: ser.a or stdlib.a
Routine: ComInitIntr
---------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
ComInitIntr
Description:
Sets up the chip to generate interrupts and programs the PC to transfer
control to the library serial interrupt service routine when an interrupt
occurs. Note that other than interrupt initialization, this code does not
initialize the 8250 chip.
Include: ser.a or stdlib.a
Routine: ComDisIntr
--------------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: None
Flags affected: None
Example of Usage:
ComDisIntr
Description:
This routine uninstalls the ISR and programs the chip to stop the generation
of interrupts. You must call ComInitIntr after calling this routine to
turn the interrupt system back on.
Include: ser.a or stdlib.a
Routine: ComIn
---------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: None
Registers on return: AL=character read from buffer or port
Flags affected: None
Example of Usage:
ComIn
<Do something with AL>
Description:
ComIn is the input routine associated with interrupt I/O. It reads the
next available character from the serial input buffer. If no characters
are available in the buffer, it waits until the system receives one before
returning.
Include: ser.a or stdlib.a
Routine: ComOut
----------------
Author: Randall Hyde
Category: Serial Communications
Registers on entry: AL=Character to output
Registers on return: None
Flags affected: None
Example of Usage:
<Get character to write into AL>
ComOut
Description:
ComOut is the output routine associated with interrupt I/O. If the serial
transmitter isn't currently busy, it will immediately write the data to the
serial port. If it is busy, it will buffer the character up. In most cases
this routine returns quickly to its caller. The only time this routine
will delay is if the buffer is full and you cannot add any additional
characters to it.
Include: ser.a or stdlib.a
CHARSETS
Createsets Creates a set on the heap
Emptyset Cleans out set
Rangeset Add a range of values to a set
Addstr Add a group of characters to a set
Rmvstr Remove a string from a set
AddChar Add a single character to a set
Rmvchar Remove a single character from a set
Member Find if a character is in a set
CopySet Makes a verbatim copy of a set to another
SetUnion Computes the union of two sets
SetIntersect Computes the intersection of two sets into a third
SetDifference Removes items in second set which are in first
Nextitem Searches for the first character (item) in the set
pointing to its mask byte
Rmvitem Removes an item from a set
UTIL
ISize Calculate number of spaces needed to print signed integer
USize Calculate number of spaces needed to print unsigned integer
LSize Calculate number of spaces needed to print signed long integer
ULSize Calculate number of spaces needed to print unsigned long integer
IsAlNum Checks to see if AL is in the range of A-Z, a-z, 0-9
IsXDigit Checks to see if AL is in the range of A-F, a-f, 0-9
IsDigit Checks to see if AL is in the range of 0-9
IsAlpha Checks to see if AL is in the range of A-Z, a-z
IsLower Checks to see if AL is in the range of a-z
IsUpper Checks to see if AL is in the range of A-Z
STRINGS
The following string routines take as many as four different forms: strxxx,
strxxxl, strxxxm, and strxxxml. These routines differ in how they store
the destination string into memory and where they obtain their source strings.
Routines of the form strxxx generally expect a single source string address
in ES:DI or a source and destination string in ES:DI & DX:SI. If these
routines produce a string, they generally store the result into the buffer
pointed at by ES:DI upon entry. They return with ES:DI pointing at the
first character of the destination string.
Routines of the form strxxxl have a "literal source string". A literal
source string follows the call to the routine in the code stream. E.g.,
strcatl
db "Add this string to ES:DI",0
Routines of the form strxxxm automatically allocate storage for the
resulting string on the heap and return a pointer to this string in ES:DI.
Routines of the form strxxxml have a literal source string in the code
stream and allocate storage for the destination string on the heap.
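The literal forms can be chained, since each returns with ES:DI pointing at
the destination string. The sketch below builds a string in a caller-supplied
buffer and prints it. Buffer is a hypothetical variable that must be large
enough for the result, and the fragment assumes, as with the other routines
in the library, that puts prints the string at ES:DI.

```asm
; Sketch only. Buffer is an illustrative name; its size is an assumption.

Buffer          db      64 dup (?)      ;In the data segment

                lesi    Buffer          ;ES:DI points at the destination
                strcpyl
                db      "Hello",0       ;Literal source follows the call
                strcatl
                db      ", world",0     ;Appended to the string at ES:DI
                puts                    ;Print the string at ES:DI
                putcr
```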
Strcpy Copies string.
Strcpyl Copies string literal
StrDup Copies string to newly allocated memory
StrDupl Copies string to newly allocated memory from literal
Strlen Calculate length of string
Strcat Concatenate two strings
Strcatm Concatenate two strings, allocating enough memory for the final
resulting string on the heap
Strcatl Concatenate string from literal
Strcatml Concatenate string from literal to allocated memory
Strchr Searches for first occurrence of a character in a string
Strstr Searches for the position of a substring within another string
Strcmp Compares one string to another
Strcmpl Compares one string to literal string
Stricmp Compares one string to another disregarding case
Strupr Converts a string to uppercase
Struprm Copies string to heap, then converts to upper and returns address
Strlwr Convert string to lower case
Strlwrm Copies string to heap, converts, then returns new string
Strset Overwrites data on input string with character in AL
Strsetm Allocates new strings, then overwrites with character in AL
Strspan Compares strings, returning 1st position not equal
Strspanl Compares strings, returning 1st position not equal, literal
Strcspan Compares strings, returning 1st position that _IS_ equal
Strcspanl Compares strings, returning 1st position that _IS_ equal, literal
StrIns Inserts one string into another
StrInsl Inserts one string into another, literal
StrInsm Inserts one string into another after allocating memory
StrInsml Inserts one string into another after allocating memory, literal
StrDel Deletes characters from a string
StrDelm Deletes characters from a copy of a string
StrTrim Removes trailing spaces from a string
StrTrimm Removes trailing spaces from a copy of a string
StrBlkDel Removes leading spaces from a string
StrBlkDelm Removes leading spaces from a copy of a string
Strrev Reverses the characters in a string. ie: "BLAH" -> "HALB"
Strrevm Reverses the characters in a copy of a string
StrBDel Removes leading spaces from a string
StrBDelm Removes leading spaces from a copy of a string
ToHex Converts a stream of binary values into Intel Hex format
STDOUT
Putc Print a character out to stdout
PutCR Print a CR/LF to stdout
PutcStdOut Print a character to stdout
PutcBIOS Use BIOS to print a character to the _SCREEN_
GetOutAdrs Get the address of the current output routine
SetOutAdrs Redirects calls to output routine to user defined
PushOutAdrs Pushes current output address to internal stack
PopOutAdrs Pops output address from internal stack and sets
Puts Print a string to stdout
Puth Print a value out in hex format
Putw Print a value out in word hex format
Puti Print a value out in signed integer format
Putu Print a value out in unsigned integer format
Putl Print a value out in signed long integer format
Putul Print a value out in unsigned long integer format
PutISize Print a value out in signed integer format using minimum spaces
PutUSize Print a value out in unsigned integer format using minimum spaces
PutLSize Print a value out in signed long format using minimum spaces
PutULSize Print a value out in unsigned long format using minimum spaces
Print Print out a literal string
Printf Print out a literal string using C library type formatters
Printff Print out a literal string using C library type formatters. Also
supports printing out floating point values
STDIN
Getc Gets a character from STDIN
GetcStdIn Gets a character from STDIN
GetcBIOS Gets a character using BIOS. Redirection is not allowed
SetInAdrs Sets the address to the routine which you want to use for input
GetInAdrs Gets the address which is being used to take input
PushInAdrs Pushes the address of the input routine to an internal stack
PopInAdrs Pop the address of the input routine from an internal stack
Gets Get a string from STDIN
Getsm Get a string from STDIN and stuff into newly allocated buffer
Scanf Gets string from STDIN using C library type formatters
SERIAL PORT STUFF
ComBaud Inits the serial port to a user defined speed
ComStop Inits number of stop bits to use in transmission
ComSize Inits number of data bits to use in transmission
ComParity Inits the serial port as to whether or not to use parity checking
ComRead Reads character from serial port
ComWrite Transmits character to serial port
ComTstIn Checks to see if character is available in buffer. Does not read.
ComTstOut Checks if character can be transmitted
ComGetLSR Reads current status of Line Status Register
ComGetMSR Reads current status of Modem Status Register
ComGetMCR Reads current status of Modem Control Register
ComGetLCR Reads current status of Line Control Register
ComGetIIR Reads current status of Interrupt Identification Register
ComGetIER Reads current status of Interrupt Enable Register
ComSetMCR Writes value to Modem Control Register
ComSetLCR Writes value to Line Control Register
ComSetIER Writes value to Interrupt Enable Register
ComInitIntr Sets up interrupts and programs to control serial chip
ComDisIntr Uninstalls all programs installed with ComInitIntr
ComIn Reads character from serial port. Will wait if none available.
ComOut Writes character to serial port, waiting if port is busy.
PROCESS
Prcsinit Starts the process manager
Prcsquit Shuts down the process manager
Fork Spawns a new process
Die Kills the current process
Kill Lets one process terminate another
Yield Forces context switch, surrendering rest of current time slice
CoInit Inits the CoRoutine package
CoCall Switches context between two coroutines
CoCalll Switches context between two coroutines, passing info another way
WaitSemaph Protects critical regions in memory
RlsSemaPh Releases a semaphore that the current process has acquired
PATTERN
Spancset Match any number of characters belonging to a character set
Brkcset Match any number of characters which are *not* in a character set
MatchStr Matches a specified string
MatchToStr Match characters in string until specified substring
MatchChar Matches a single character
MatchChars Matches zero or more occurrences of the same character
MatchToChar Matches characters up to and including specified character
MatchToPat Matches all characters up to specified characters
Anycset Matches single character from a character set
NotAnycset Match single character which is not in character set
EOS Matches end of string
ARB Matches arbitrary number of characters
ARBNUM Matches arbitrary number of strings
Skip Matches "n" arbitrary characters.
POS Matches at position "n" in the string
RPOS Matches at position "n" from the end of the string
GOTOpos Moves to position in string
RGOTOpos Moves to position "n" from end of string
MISC
Random Generate a random number
Randomize Reseed random number generator based on time of day
cpuid Identifies CPU
Argc Return number of command line parameters
Argv Returns address to string of command line parameter specified
GetEnv Returns address of environment table information
DOS Invokes DOS INT 21h interrupt
ExitPgm Exits current program and returns to DOS
MEMORY
MemInit Initializes memory manager. Must be called first.
MemInit2 Initializes another part of memory manager
Malloc Dynamically allocate memory
Realloc Resize a block of memory already allocated with Malloc
Free Deallocate a chunk of memory allocated with Malloc
DupPtr Replicate a pointer to a chunk of memory so free won't deallocate
it until all the pointers are taken care of
IsInHeap Tells you if ES:DI points somewhere in the heap
IsPtr Tells you if ES:DI points to a properly allocated chunk of heap
BlockSize Returns size of block currently pointed to in the heap
MemAvail Returns size of largest free block on the heap
MemFree Returns size of total bytes free on the heap
LIST
CreateList Allocates storage for a list variable on the heap
AppendLast Add a node to the list
Remove1st Removes the first item from a list
Peek1st Looks at the first item from a list
Insert1st Inserts a node at the first node from a list
RemoveLast Removes the last node from a list
PeekLast Looks at the last item from a list
InsertCur Inserts a node into the list
InsertmCur Builds a node on the heap, then inserts that into the list
AppendCur Inserts a node into the list after the current node pointed to
AppendmCur Builds node on heap, then inserts that after current node
RemoveCur Removes current node from the list
Peek Looks at current node on the list
SetCur Sets the specified node as the current node
Insert Inserts a new node before a specified node in the list
Append Inserts a new node after a specified node in the list
Remove Removes the specified node from the list
FLOATING POINT (FP)
lsfpa Load single precision float value into internal accumulator
ssfpa Store single precision float value from accumulator to memory
ldfpa Load double precision float value into internal accumulator
sdfpa Store double precision float value from accumulator to memory
lefpa Load extended precision float value into internal accumulator
lefpal lefpa with a literal value after it in the code
sefpa Store extended precision float value from accumulator to memory
lsfpo lsfpa a value, then convert to extended precision
ldfpo ldfpa a value, then convert to extended precision
lefpo lefpa a value, then convert to extended precision
lefpol lefpo a value, with the value being literal in the code
itof Convert a 16bit signed integer to float
utof Convert a 16bit unsigned integer to float
ultof Convert a 32bit unsigned integer to float
ltof Convert a 32bit signed integer to float
ftoi Convert float number to signed 16bit integer format
ftou Convert float number to unsigned 16bit integer format
ftol Convert float number to signed 32bit integer format
ftoul Convert float number to unsigned 32bit integer format
fpadd Add float accumulator to float operand
fpsub Subtract float operand from the float accumulator
fpcmp Compare float accumulator to operand and set flags accordingly
fpmul Multiply float operand to float accumulator
fpdiv Divides float accumulator by operand
ftoa Converts float number into string, preserving DI
ftoa2 Converts float number into string, not preserving DI
ftoam Converts float to string, allocating enough space for string
etoa Convert float to string using scientific notation
etoa2 Works like etoa, except not preserving DI
etoam Works like etoa, this time allocating space on the heap for string
atof Converts string into float
DATE TIME
Date Converts DOS system date into string ( mm/dd/yy )
Date2 Converts DOS system date into string, not preserving DI
Datem Converts DOS system date into string allocated from heap
xDate Converts current DOS system date into string
xDate2 Converts current DOS system date into string, killing DI
xDatem Converts current DOS system date to string with memory from heap
lDate Converts DOS date into string ( mmm, dd, yyyy )
lDate2 Converts DOS date into string killing DI
lDatem Converts DOS date into string, memory allocated from heap
xlDate Converts current DOS date into string
xlDate2 Converts current DOS date into string killing DI
xlDatem Converts current DOS date into string allocated from heap
atod Converts string (mm/dd/yy or mm-dd-yy) into DOS date
atod2 Converts string into DOS date, killing DI
atot Converts string (hh:mm:ss or hh:mm:ss.xxx) into DOS time
atot2 Converts string into DOS time killing DI
time Converts DOS time to string
time2 Converts DOS time to string, killing DI
timem Converts DOS time to string, allocated from heap
xtime Converts current DOS time to string
xtime2 Converts current DOS time to string, killing DI
xtimem Converts current DOS time to string, allocated from heap
CONVERSION
atol Converts string of numbers to signed 32bit integer
atoul Converts string of numbers to unsigned 32bit integer
atou Converts string of numbers to unsigned 16bit integer
atoh Converts string of hex numbers to unsigned 16bit integer
atoh2 Converts string of hex numbers to unsigned 16bit int killing DI
atolh Converts string of hex numbers to unsigned 32bit int
atolh2 Converts string of hex numbers to unsigned 32bit int killing DI
atoi Converts string of numbers to signed 16bit integer
itoa Converts signed integer to string
itoam Converts signed integer to string, allocating space from heap
itoa2 Converts signed integer to string, killing DI
utoa Converts unsigned integer to string
utoam Converts unsigned integer to string, allocating space from heap
utoa2 Converts unsigned integer to string, killing DI
htoa Converts 8bit hex value to string
htoa2 Converts 8bit hex value to string, killing DI
htoam Converts 8bit hex value to string, allocating space from heap
wtoa Converts 16bit hex value to string
wtoa2 Converts 16bit hex value to string, killing DI
wtoam Converts 16bit hex value to string, allocating space from heap
ltoa Converts 32bit signed integer to string
ltoa2 Converts 32bit signed integer to string, killing DI
ltoam Converts 32bit signed integer to string, getting space from heap
ultoa Converts 32bit unsigned int to string
ultoa2 Converts 32bit unsigned int to string, killing DI
ultoam Converts 32bit unsigned int to string, getting space from heap
sprintf In memory print formatting
sprintf2 In memory print formatting, killing DI
sprintfm In memory print formatting, getting space from heap
sscanf In memory input formatting
sscanf2 In memory input formatting, killing DI
sscanfm In memory input formatting, getting space from heap
tolower Converts character to lowercase
toupper Converts character to uppercase
By: Steve Shah
sshah@ucrengr.ucr.edu
sshah@watserv.ucr.edu
sshah@mozart.ucr.edu
Pick one -- any one.......
Current version:
UCRASM 31
Compiled 1.0 -- June 7, 1993 10:40a
Standard Input Routines:
Character Input Routines
------------------------
The character input routines take input from either a standard
device (keyboard, etc.) or a standard library. After the character input
routines receive the characters, they place the characters on the stack
and/or return. The character input routines work much like the "C" character
input routines.
Routine: Getc
--------------
Category: Character Input Routine
Registers on Entry: None
Registers on Return: AL- Character from input device.
AH- 0 if eof, 1 if not eof.
Flags Affected: Carry- 0 if no error, 1 if error. If error occurs, AX
contains DOS error code.
Example of Usage:
getc
mov KbdChar, al
putc
Description: This routine reads a character from the standard input device.
This call is synchronous, that is, it does not return until a
character is available. The default input device is DOS
standard input.
Getc returns two types of values: extended ASCII (codes 1-255)
and IBM keyboard scan codes. If Getc returns a non-zero value,
you may interpret this value as an ASCII character. If Getc
returns zero, you must call Getc again to get the actual
keypress.
The second call returns an IBM PC keyboard scan code.
Since the user may redirect input from the DOS command line,
there is the possibility of encountering end-of-file (eof)
when calling getc. Getc uses the AH register to return eof
status. AH contains the number of characters actually read
from the standard input device. If it returns one, then
you've got a valid character. If it returns zero, you've
reached end of file. Note that pressing control-z forces an
end of file condition when reading data from the keyboard.
This routine returns the carry flag clear if the operation
was successful. It returns the carry flag set if some sort
of error occurred while reading the character. Note that eof
is not an error condition. Upon reaching the end of file,
Getc returns with the carry flag clear. When getc reads from
a file, control-z is not treated as an end-of-file marker;
it is simply read in as a character of the file.
If control-c is read from a keyboard device, it aborts the
program. When reading from something other than a keyboard
(files, serial ports), however, getc simply returns the
control-c character. Pressing control-break aborts the
program regardless of the input source.
Regarding CR/LF, if the input is from a device (e.g., keyboard,
serial port), getc returns whatever that device driver returns
(generally CR without a LF). However, if the input is from
a file, getc strips a single LF if it immediately follows
the CR.
When using getc, files operate in "cooked" mode, while
devices operate in "pseudo-cooked" mode, which means no
buffering and no CR -> CR/LF translation, but getc still
handles control-c and control-z.
See the sources for more information about GETC's internal
operation.
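The return conventions described above can be sketched as a short
read loop. This is only an illustrative fragment, meant to sit inside
a program that includes stdlib.a; the labels (ReadLoop, Echo,
ReadError, AtEof) are hypothetical names, not part of the library.

```asm
ReadLoop:       getc                    ;AL = character, AH = eof status
                jc      ReadError       ;Carry set: AX = DOS error code
                cmp     ah, 0
                je      AtEof           ;AH = 0: end of file
                cmp     al, 0
                jne     Echo            ;Non-zero: ordinary ASCII character
                getc                    ;Zero: second call fetches the scan
                jmp     ReadLoop        ; code, which this sketch ignores
Echo:           putc                    ;Echo the character in AL
                jmp     ReadLoop
ReadError:      print
                db      "Read error",13,10,0
AtEof:
```

The carry flag is tested before AH because, on an error, AX holds a
DOS error code rather than the character and eof status.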
Include: stdlib.a or stdin.a
Routine: GetcStdIn
--------------------
Category: Character Input Routine
Register on entry: None.
Register on return: AL- Character from input device.
AH- 0 if eof, 1 if not eof.
Flags affected: Carry- 0 if no error, 1 if error
(AX contains DOS error code if error occurs).
Example of Usage:
GetcStdIn
mov InputChr, al
putc
Description: This routine reads a character from the DOS standard input
device. This call is synchronous, that is, it does not return
until a character is available. See the description of Getc
above for more details.
The difference between Getc and GetcStdIn is that your
program can redirect Getc using other calls in this library.
GetcStdIn calls DOS directly without going through this
redirection mechanism.
Include: stdlib.a or stdin.a
Routine: GetcBIOS
-------------------
Category: Character Input Routine
Register on entry: None
Register on return: AL- Character from the keyboard.
AH- 1 (always).
Flags affected: Carry- 0 (always).
Example of Usage:
GetcBIOS
mov CharRead, al
putc
Description: This routine reads a character from the keyboard. This call is
synchronous, that is, it does not return until a character is
available.
Note that there is no special character processing. This
code does *not* check for EOF, control-C, or anything
else like that.
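Because GetcBIOS always reads the keyboard, it suits prompts that
must not be satisfied by redirected input. A minimal sketch (the
message text is only an example):

```asm
                print
                db      "Press any key to continue...",0
                GetcBIOS                ;Waits for a key; AL = character
                putcr                   ;Move to a new line afterwards
```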
Include: stdlib.a or stdin.a
Routine: SetInAdrs
-------------------
Category: Character Input Routine
Registers on Entry: ES:DI - address of new input routine
Registers on return: None
Flags affected:
Examples of Usage:
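mov di, seg NewInputRoutine ;A segment register cannot be
mov es, di ; loaded with an immediate value.
mov di, offset NewInputRoutine
SetInAdrs
.
.
.
les di, RoutinePtr ;Load a far pointer stored in memory.
SetInAdrs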
Description: This routine redirects the stdlib standard input so that it
calls the routine whose address you pass in es:di. The
routine (whose address appears in es:di) should be a "getc"
routine which reads a character from somewhere and returns
that character in AL. It should also return EOF status in
the AH register and error status in the carry flag (see
the description of GETC for more details).
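As an illustration of such a routine, the sketch below returns
characters from a zero-terminated string in memory and reports eof
at the terminating zero. Everything here is an assumption made for
the example: the names MemGetc, SrcText, and SrcPtr are hypothetical,
DS is assumed to address the segment holding SrcText and SrcPtr when
the routine runs, and the routine is declared far on the assumption
that the library calls it through the segment:offset address.

```asm
;In the data segment (hypothetical names):
SrcText         db      "hello",13,0
SrcPtr          dw      offset SrcText

;In the code segment:
MemGetc         proc    far
                push    bx
                mov     bx, SrcPtr
                mov     al, [bx]        ;Fetch the next character
                mov     ah, 0           ;Assume eof
                cmp     al, 0
                je      MGDone          ;At the terminator: AH = 0 (eof)
                inc     SrcPtr          ;Advance past the character
                mov     ah, 1           ;AH = 1: AL holds a valid character
MGDone:         pop     bx
                clc                     ;Carry clear: no error
                ret
MemGetc         endp

;Installing it:
                mov     di, seg MemGetc
                mov     es, di
                mov     di, offset MemGetc
                SetInAdrs
```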
Include: stdlib.a or stdin.a
Routine: GetInAdrs
--------------------
Category: Character Input Routine
Register on entry: None
Register on return: ES:DI - address of current input routine (called by Getc).
Flags affected: None
Example of Usage:
GetInAdrs
mov word ptr SaveInAdrs, di
mov word ptr SaveInAdrs+2, es
Description: You can use this function to get the address of the current
input routine, perhaps so you can save it or see if it is
currently pointing at some particular piece of code.
If you want to temporarily redirect the input and then restore
the original input routine, consider using
PushInAdrs/PopInAdrs described later.
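For comparison, a manual save and restore using GetInAdrs and
SetInAdrs might look like the following sketch. SaveInAdrs and
NewInputRoutine are placeholder names, as in the examples above.

```asm
;In the data segment:
SaveInAdrs      dd      ?

;In the code segment:
                GetInAdrs               ;ES:DI = current input routine
                mov     word ptr SaveInAdrs, di
                mov     word ptr SaveInAdrs+2, es
                mov     di, seg NewInputRoutine
                mov     es, di
                mov     di, offset NewInputRoutine
                SetInAdrs               ;Input now comes from NewInputRoutine
                getc
                les     di, SaveInAdrs  ;Reload the saved far pointer
                SetInAdrs               ;Restore the original routine
```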
Include: stdlib.a or stdin.a
Routine: PushInAdrs
---------------------
Category: Character Input Routine
Register on entry: ES:DI - Address of new input routine.
Register on return: None
Flags affected: Carry=0 if operation successful.
Carry=1 if there were already 16 items on the stack.
Example of Usage:
mov di, seg NewInputRoutine
mov es, di
mov di, offset NewInputRoutine
PushInAdrs
.
.
.
les di, RoutinePtr
PushInAdrs
Description: This routine "pushes" the current input address onto an
internal stack and then copies the value in es:di into the
current input routine pointer. The PushInAdrs and PopInAdrs
routines let you easily save and redirect the standard input
and then restore the original input routine address later on.
If you attempt to push more than 16 items on the stack,
PushInAdrs will ignore your request and return with the
carry flag set. If PushInAdrs is successful, it will
return with the carry flag clear.
Include: stdlib.a or stdin.a
Routine: PopInAdrs
--------------------
Category: Character Input Routine
Register on entry: None
Register on return: ES:DI - Points at the previous stdin routine before
the pop.
Example of Usage:
mov di, seg NewInputRoutine
mov es, di
mov di, offset NewInputRoutine
PushInAdrs
.
.
.
PopInAdrs
Description: PopInAdrs undoes the effects of PushInAdrs. It pops an item
off the internal stack and stores it into the input routine
pointer. The previous value in the input pointer is returned
in es:di.
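A push/pop pair that also checks for a full stack could be sketched
as follows. NewInputRoutine, CharRead, TooDeep, and Continue are
placeholder names for this example only.

```asm
                mov     di, seg NewInputRoutine
                mov     es, di
                mov     di, offset NewInputRoutine
                PushInAdrs              ;Save current routine, install new one
                jc      TooDeep         ;Carry set: 16 entries already saved
                getc                    ;Reads through NewInputRoutine
                mov     CharRead, al
                PopInAdrs               ;Restore saved routine; ES:DI changes
                jmp     Continue
TooDeep:        print
                db      "Input redirection nested too deeply",13,10,0
Continue:
```

PopInAdrs is skipped on the error path because nothing was pushed.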
Include: stdlib.a or stdin.a
Routine: Gets, Getsm
---------------------
Category: Character Input Routine
Register on entry: ES:DI- Pointer to input buffer (gets only).
Register on return: ES:DI- Address of input text.
AX- Error code, if an error occurs:
0- End of file encountered in middle of string.
1- Memory allocation error (getsm only).
Other- DOS error code.
Flags affected: Carry- 0 if no error, 1 if error.
Example of usage:
getsm ;Read a string from the
;keyboard
puts ;Print it
putcr ;Print a new line
free ;Deallocate storage for
;string.
mov di, seg buffer
mov es, di
lea di, buffer
gets
puts
putcr
Description: Reads a line of text from the stdlib standard input device.
You must pass a pointer to the recipient buffer in es:di to
the GETS routine. GETSM automatically allocates storage for
the string on the heap (up to 256 bytes) and returns a pointer
to this block in es:di.
Gets(m) returns all characters typed by the user except for the
carriage return (ENTER) key code. These routines return a
zero-terminated string (with es:di pointing at the string).
Exactly how Gets(m) treats the incoming data depends upon
the source device, however, you can usually count on Gets(m)
properly handling backspace (erases previous character),
escape (erase entire line), and ENTER (accept current line).
Other keys may affect Gets(m) as well. For example, Gets(m),
by default, calls Getc which, in turn, usually calls DOS'
standard input routine. If you type control-C or break while
reading from DOS' standard input, it may abort the program.
If an error occurs during input (e.g., EOF encountered in
the middle of a line) Gets(m) returns the error code in
AX. If no error occurs, Gets(m) preserves AX.
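The error convention can be checked as in the sketch below. The
buffer name InputLine and its 256-byte size are assumptions for this
example (the size mirrors the limit quoted for getsm); the labels are
likewise hypothetical.

```asm
;In the data segment:
InputLine       db      256 dup (?)

;In the code segment:
                mov     di, seg InputLine
                mov     es, di
                mov     di, offset InputLine
                gets                    ;ES:DI -> zero-terminated line
                jc      ReadFailed      ;AX = 0 (eof) or a DOS error code
                puts                    ;Echo the line
                putcr
                jmp     ReadDone
ReadFailed:     print
                db      "Could not read a line",13,10,0
ReadDone:
```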
Include: stdlib.a or stdin.a
Routine: Scanf
---------------
Category: Character Input Routine
Register on entry: None
Register on return: None
Flags affected: None
Example of usage:
scanf
db "%i %h %^s",0
dd i, x, sptr
Description: * Formatted input from stdlib standard input.
* Similar to C's scanf routine.
* Converts ASCII to integer, unsigned, character, string, hex,
and long values of the above.
Scanf provides formatted input in a fashion analogous to
printf's output facilities. Actually, it turns out that scanf
is considerably less useful than printf because it doesn't
provide reasonable error checking facilities (neither does C's
version of this routine). But for quick and dirty programs
whose input can be controlled in a rigid fashion (or if you're
willing to live by "garbage in, garbage out") scanf provides
a convenient way to get input from the user. Like printf, the
scanf routine expects you to follow the call with a format
string and then a list of (far pointer) memory addresses. The
items in the scanf format string take the following form: %^f,
where f represents d, i, x, h, u, c, s, ld, li, lx, or lu.
Like printf, the "^" symbol tells scanf that the address
following the format string is the address of a (far) pointer
to the data rather than the address of the data location itself.
By default, scanf automatically skips any leading whitespace
before attempting to read a numeric value. You can instruct
scanf to skip other characters by placing that character in the
format string. For example, the following call instructs scanf
to read three integers separated by commas (and/or whitespace):
scanf
db "%i,%i,%i",0
dd i1,i2,i3
Whenever scanf encounters a non-blank character in the format
string, it will skip that character (including multiple
occurrences of that character) if it appears next in the input
stream. Scanf always calls gets to read a new line of text
from stdlib's standard input. If scanf exhausts the format
list, it ignores any remaining characters on the line. If
scanf exhausts the input line before processing all of the
format items, it leaves the remaining variables unchanged.
Scanf always deallocates the storage allocated by gets.
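Since scanf leaves unprocessed variables unchanged, one possible
approach is to give each variable a default value before the call.
The sketch below assumes two word variables with the hypothetical
names RowCnt and ColCnt.

```asm
;In the data segment:
RowCnt          dw      0               ;Defaults if input is missing
ColCnt          dw      0

;In the code segment:
                print
                db      "Enter rows,columns: ",0
                scanf
                db      "%i,%i",0
                dd      RowCnt, ColCnt  ;Far pointers to the variables
                mov     ax, RowCnt
                puti
                putcr
                mov     ax, ColCnt
                puti
                putcr
```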
Include: stdlib.a or stdin.a
Character Output Routines
-------------------------
The stdlib character output routines allow you to print to the
standard output device. Although the processing of non-ASCII and control
characters is undefined, most output devices handle these characters
properly. In particular, they can handle return, line feed, back space,
and tab.
Most of the output routines in the standard library output data
through the Putc routine. They generally use the AX register upon
entry and print the character(s) to the standard output device by
calling DOS by default. The output is redirectable to a
user-written routine. The PutcBIOS routine, however, doesn't
use DOS. Instead it uses BIOS routines to print the character in AL,
using the INT 10h service for teletype-like output.
The print routines are similar to those in C; however, they differ
in their implementation. The print routine returns to the address
immediately following the terminating byte, so it is important
to remember to terminate your string with zero or you will print an
unexpected sequence of characters.
Routine: Putc
--------------
Category: Character Output Routine
Registers on Entry: AL- character to output
Registers on Return: None
Flags affected: None
Example of Usage:
mov al, 'C'
putc ;Prints "C" to std output.
Description: Putc is the primitive character output routine. Most other
output routines in the standard library output data through
this procedure. It prints the ASCII character in AL register.
The processing of control codes is undefined although most output
routines this routine links to should be able to handle return,
line feed, back space, and tab. By default, this routine calls
DOS to print the character to the standard output device. The
output is redirectable to a user-written routine.
Include: stdlib.a or stdout.a
Routine: PutCR
---------------
Category: Character Output Routine
Register on entry: None
Register on return: None
Flags affected: None
Example of Usage: PutCR
Description: Using PutCR is an easy way of printing a newline to the stdlib
standard output. It prints a newline (carriage return/line feed)
to the current standard output device.
Include: stdlib.a or stdout.a
Routine: PutcStdOut
-------------------
Category: Character Output Routine
Registers on Entry: AL- character to output
Registers on Return: None
Flags Affected: None
Example of Usage:
mov AL, 'C'
PutcStdOut ; Writes "C" to standard output
Description: PutcStdOut calls DOS to print the character in AL to the standard
output device. Although processing of non-ASCII characters and
control characters is undefined, most output devices handle these
characters properly. In particular, most output devices properly
handle return, line feed, back space, and tab. The output is
redirectable via DOS I/O redirection.
Include: stdlib.a or stdout.a
Routine: PutcBIOS
-----------------
Category: Character Output Routine
Registers on Entry: AL- character to print
Registers on Return: None
Flags Affected: None
Example of Usage:
mov AL, "C"
PutcBIOS
Description: PutcBIOS prints the character in AL using the BIOS routines,
using INT 10H/AH=14 (0Eh) for teletype-like output. Output through
this routine cannot be redirected; such output is always sent
to the video display on the PC (unless, of course, someone has
patched INT 10h). Handles return, line feed, back space, and
tab. Prints other control characters using the IBM Character
set.
Include: stdlib.a or stdout.a
Routine: GetOutAdrs
-------------------
Category: Character Output Routine
Registers on Entry: None
Registers on Return: ES:DI- address of current output routine (called by Putc)
Flags Affected: None
Example of Usage:
GetOutAdrs
mov word ptr SaveOutAdrs, DI
mov word ptr SaveOutAdrs+2, ES
Description: GetOutAdrs gets the address of the current output routine, perhaps
so you can save it or see if it is currently pointing at some
particular piece of code. If you want to temporarily redirect
the output and then restore the original output routine, consider
using PushOutAdrs/PopOutAdrs described later.
Include: stdlib.a or stdout.a
Routine: SetOutAdrs
--------------------
Category: Character Output Routine
Registers on Entry: ES:DI - address of new output routine
Registers on return: None
Flags affected: None
Example of Usage:
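mov di, seg NewOutputRoutine ;A segment register cannot be
mov es, di ; loaded with an immediate value.
mov di, offset NewOutputRoutine
SetOutAdrs
.
.
.
les di, RoutinePtr ;Load a far pointer stored in memory.
SetOutAdrs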
Description: This routine redirects the stdlib standard output so that it
calls the routine whose address you pass in es:di. Your routine
should expect the character in AL and must preserve all registers.
It should handle the printable ASCII characters and the four control
characters return, line feed, back space, and tab. (The routine
may be modified in the case that you wish to handle these codes
in a different fashion.)
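A user-written output routine might be sketched as below. This one
converts lowercase letters to uppercase before printing. The name
UpperPutc is hypothetical, and the routine is declared far on the
assumption that the library calls it through the segment:offset
address. It sends the character on with PutcStdOut rather than Putc,
since Putc would call back into the redirected routine.

```asm
UpperPutc       proc    far
                pushf                   ;Preserve flags and AX for the caller
                push    ax
                cmp     al, 'a'
                jb      UPSend
                cmp     al, 'z'
                ja      UPSend
                and     al, 5Fh         ;Convert 'a'..'z' to 'A'..'Z'
UPSend:         PutcStdOut              ;Print via DOS, bypassing redirection
                pop     ax
                popf
                ret
UpperPutc       endp

;Installing it:
                mov     di, seg UpperPutc
                mov     es, di
                mov     di, offset UpperPutc
                SetOutAdrs
```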
Include: stdlib.a or stdout.a
Routine: PushOutAdrs
---------------------
Category: Character Output Routine
Registers on Entry: ES:DI- Address of new output routine
Registers on Return: None
Flags Affected: Carry = 0 if operation is successful
Carry = 1 if there were already 16 items on the stack
Example of Usage:
mov DI, seg NewOutputRoutine
mov ES, DI
mov DI, offset NewOutputRoutine
PushOutAdrs
.
.
.
les DI, RoutinePtr
PushOutAdrs
Description: This routine "pushes" the current output address onto an internal
stack and then uses the value in es:di as the current output
routine address. The PushOutAdrs and PopOutAdrs routines let you
easily save and redirect the standard output and then restore the
original output routine address later on. If you attempt to push
more than 16 items on the stack, PushOutAdrs will ignore your
request and return with the carry flag set. If PushOutAdrs is
successful, it will return with the carry flag clear.
Include: stdlib.a or stdout.a
Routine: PopOutAdrs
--------------------
Category: Character Output Routine
Registers on Entry: None
Registers on Return: ES:DI- Points at the previous stdout routine before
the pop
Flags Affected: None
Example of Usage:
mov DI, seg NewOutputRoutine
mov ES, DI
mov DI, offset NewOutputRoutine
PushOutAdrs
.
.
.
PopOutAdrs
Description: PopOutAdrs undoes the effects of PushOutAdrs. It pops an item off
the internal stack and stores it into the output routine pointer.
The previous value in the output pointer is returned in es:di.
Defaults to PutcStdOut if you attempt to pop too many items off
the stack.
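A temporary redirection that checks for a full stack could be
sketched like this. NewOutputRoutine and NoRoom are placeholder names
for the example.

```asm
                mov     di, seg NewOutputRoutine
                mov     es, di
                mov     di, offset NewOutputRoutine
                PushOutAdrs             ;Save current routine, install new one
                jc      NoRoom          ;Carry set: 16 entries already saved
                print
                db      "Sent through NewOutputRoutine",13,10,0
                PopOutAdrs              ;Restore the saved routine
                print
                db      "Sent through the original routine",13,10,0
NoRoom:
```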
Include: stdlib.a or stdout.a
Routine: Puts
--------------
Category: Character Output Routine
Register on entry: ES:DI register - contains the address of the string
Register on return: None
Flags affected: None
Example of Usage:
les di, StrToPrt
puts
putcr
Description: Puts prints a zero-terminated string whose address appears
in es:di. Each character appearing in the string is printed
verbatim. There are no special escape characters. Unlike
the "C" routine by the same name, puts does not print a
newline after printing the string. Use putcr if you want
to print the newline after printing a string with puts.
Include: stdlib.a or stdout.a
Routine: Puth
--------------
Category: Character Output Routine
Register on entry: AL- Value to print
Register on return: AL
Flags affected: None
Example of Usage:
mov al, 1fh
puth
Description: The Puth routine prints the value in the AL register as two
hexadecimal digits. If the value in AL is between 0 and 0Fh,
puth will print a leading zero. This routine calls the stdlib
standard output routine (putc) to print all characters.
Include: stdlib.a or stdout.a
Routine: Putw
--------------
Category: Character Output Routine
Registers on Entry: AX- Value to print
Registers on Return: None
Flags Affected: None
Example of Usage:
mov AX, 0f1fh
putw
Description: The Putw routine prints the value in the AX register as four
hexadecimal digits (including leading zeros if necessary).
This routine calls the stdlib standard output routine (putc)
to print all characters.
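No 32 bit hex output routine appears in this section, but two Putw
calls can print a value held in DX:AX as eight hex digits. A minimal
sketch, saving AX on the stack so it does not depend on what Putw
leaves in the registers:

```asm
                push    ax              ;Save the low word
                mov     ax, dx
                putw                    ;Print the high word (four digits)
                pop     ax
                putw                    ;Print the low word (four digits)
```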
Include: stdlib.a or stdout.a
Routine: Puti
--------------
Category: Character Output Routine
Registers on Entry: AX- Value to print
Registers on Return: None
Flags Affected: None
Example of Usage:
mov AX, -1234
puti
Description: Puti prints the value in the AX register as a decimal integer.
This routine uses the exact number of screen positions required
to print the number (including a position for the minus sign, if
the number is negative). This routine calls the stdlib standard
output routine (putc) to print all characters.
Include: stdlib.a or stdout.a
Routine: Putu
--------------
Category: Character Output Routine
Register on entry: AX- Unsigned value to print.
Register on return: None
Flags affected: None
Example of Usage:
mov ax, 1234
putu
Description: Putu prints the value in the AX register as an unsigned integer.
This routine uses the exact number of screen positions required
to print the number. This routine calls the stdlib standard
output routine (putc) to print all characters.
Include: stdlib.a or stdout.a
Routine: Putl
--------------
Category: Character Output Routine
Register on entry: DX:AX- Value to print
Register on return: None
Flags affected: None
Example of Usage:
mov dx, 0ffffh
mov ax, -1234
putl
Description: Putl prints the value in the DX:AX registers as an integer.
This routine uses the exact number of screen positions
required to print the number (including a position for the
minus sign, if the number is negative). This routine calls
the stdlib standard output routine (putc) to print all
characters.
Include: stdlib.a or stdout.a
Routine: Putul
---------------
Category: Character Output Routine
Register on entry: DX:AX- Unsigned value to print
Register on return: None
Flags affected: None
Example of Usage:
mov dx, 12h
mov ax, 1234
putul
Description: Putul prints the value in the DX:AX registers as an unsigned
integer. This routine uses the exact number of screen
positions required to print the number. This routine calls
the stdlib standard output routine (putc) to print all
characters.
Include: stdlib.a or stdout.a
Routine: PutISize
------------------
Category: Character Output Routine
Registers on Entry: AX - Integer value to print
CX - Minimum number of print positions to use
Registers on return: None
Flags affected:
Example of Usage:
mov cx, 5
mov ax, I
PutISize
.
.
.
mov cx, 12
mov ax, J
PutISize
Description: PutISize prints the signed integer value in AX to the
stdlib standard output device using a minimum of n print
positions. CX contains n, the minimum field width for the
output value. The number (including any necessary minus sign)
is printed right justified in the output field.
If the number in AX requires more print positions than
specified by CX, PutISize uses however many print positions
are necessary to actually print the number. If you specify
zero in CX, PutISize uses the minimum number of print positions
required. Of course, Puti will also use the minimum number
of print positions without disturbing the value in the CX
register.
Note that, under no circumstances, will the number in AX
ever require more than 6 print positions (-32,767 requires
the most print positions).
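The effect of the field width can be seen in a short sketch that
prints two values in columns eight positions wide. The values are
arbitrary, and CX is reloaded before each call, as in the example
above.

```asm
                mov     cx, 8
                mov     ax, 25
                PutISize                ;Prints "      25"
                mov     cx, 8
                mov     ax, -1234
                PutISize                ;Prints "   -1234"
                putcr
```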
Include: stdlib.a or stdout.a
Routine: PutUSize
------------------
Category: Character Output Routine
Registers on entry: AX- Value to print
CX- Minimum field width
Registers on return: None
Flags affected: None
Example of usage:
mov cx, 8
mov ax, U
PutUSize
Description: PutUSize prints the value in AX as an unsigned decimal integer.
The minimum field width is specified by the value in CX.
Like PutISize above except this one prints unsigned values.
Note that the maximum number of print positions required by any
number (e.g., 65,535) is five.
Include: stdlib.a or stdout.a
Routine: PutLSize
------------------
Category: Character Output Routine
Register on entry: DX:AX- 32-bit value to print
CX- Minimum field width
Register on return: None
Flags affected: None
Example of Usage:
mov cx, 16
mov dx, word ptr L+2
mov ax, word ptr L
PutLSize
Description: PutLSize is similar to PutISize, except that it prints the long
integer value in DX:AX. Note that a value may require as many
as 11 print positions (e.g., -1,000,000,000).
Include: stdlib.a or stdout.a
Routine: PutULSize
-------------------
Category: Character Output Routine
Register on entry: DX:AX- 32-bit value to print
CX- Minimum field width
Register on return: None
Flags affected: None
Example of usage: mov cx, 8
mov dx, word ptr UL+2
mov ax, word ptr UL
PutULSize
Description: Prints the value in DX:AX as a long unsigned decimal integer.
Prints the number in a minimum field width specified by the
value in CX. Just like PutLSize above except this one prints
unsigned numbers rather than signed long integers. The largest
field width for such a value is 10 print positions.
Include: stdlib.a or stdout.a
Routine: Print
----------------
Category: Character Output Routine
Register on entry: CS:RET - Return address points at the string to print.
Register on return: None
Flags affected: None
Examples of Usage: print
db "Print this string to the display device"
db 13,10
db "This appears on a new line"
db 13,10
db 0
Description: Print lets you print string literals in a convenient
fashion. The string to print immediately follows the call
to the print routine. The string must contain a
zero terminating byte and may not contain any intervening
zero bytes. Since the print routine returns to the address
immediately following the zero terminating byte, forgetting
this byte or attempting to print a zero byte in the middle
of a literal string will cause print to return to an
unexpected instruction. This usually hangs up the machine.
Be very careful when using this routine!
Include: stdlib.a or stdout.a
Routine: Printf
----------------------
Category: Character Output Routine
Register on entry: CS:RET - Return address points at the format string
Register on return: None
Flags affected: None
Example of Usage:
printf
db "Indirect access to i: %^d",13,10,0
dd IPtr
printf
db "A string allocated on the heap: %-\.32^s"
db 13,10,0
dd SPtr
Description: Printf, like its "C" namesake, provides formatted output
capabilities for the stdlib package. A typical call to printf
always takes the following form:
printf
db "format string",0
dd operand1, operand2, ..., operandn
The format string is comparable to the one provided in the
"C" programming language. For most characters, printf simply
prints the characters in the format string up to the
terminating zero byte. The two exceptions are characters
prefixed by a backslash ("\") and characters prefixed by a
percent sign ("%"). Like C's printf, stdlib's printf uses
the backslash as an escape character and the percent sign as
a lead-in to a format item.
Printf uses the escape character ("\") to print special
characters in a fashion similar to, but not identical to C's
printf. Stdlib's printf routine supports the following
special characters:
* \r Print a carriage return (but no line feed).
* \n Print a new line character (carriage return/line feed).
* \b Print a backspace character.
* \t Print a tab character.
* \l Print a line feed character (but no carriage return).
* \f Print a form feed character.
* \\ Print the backslash character.
* \% Print the percent sign character.
* \0xhh Print ASCII code hh, represented by two hex digits.
C users should note a couple of differences between stdlib's
escape sequences and C's. First, use "\%" to print a percent
sign within a format string, not "%%". C doesn't allow the
use of "\%" because the C compiler processes "\%" at compile
time (leaving a single "%" in the object code) whereas printf
processes the format string at run-time. It would see a single
"%" and treat it as a format lead-in character. Stdlib's
printf, on the other hand, processes both the "\" and "%" at
run-time; therefore, it can distinguish "\%".
Strings of the form "\0xhh" must contain exactly two hex
digits. The current printf routine isn't robust enough to
handle sequences of the form "\0xh" which contain only a
single hex digit. Keep this in mind if you find printf
chopping off characters after you print a value.
There is absolutely no reason to use any escape character
sequence except "\0x00". Printf grabs all characters
following the call to printf up to the terminating zero byte
(which is why you need "\0x00" if you want to print the null
character; a literal zero byte would end the format string).
Stdlib's printf routine doesn't care how those characters got
there. In particular, you are not limited to using a single
string after the printf call. The following is perfectly
legal:
printf
db "This is a string",13,10
db "This is on a new line",13,10
db "Print a backspace at the end of this line:"
db 8,13,10,0
Your code will run a tiny amount faster if you avoid the use
of the escape character sequences. More importantly, the
escape character sequences take at least two bytes. You can
encode most of them as a single byte by simply embedding the
ASCII code for that byte directly into the code stream.
Don't forget, you cannot embed a zero byte into the code
stream. A zero byte terminates the format string. Instead,
use the "\0x00" escape sequence.
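For example, assuming the escape sequences behave as listed
above, the following two calls should print the same line. The
first uses the "\t" and "\n" escape sequences; the second embeds
the corresponding ASCII codes (9 for tab, 13 and 10 for carriage
return/line feed) and is one byte shorter because the tab takes
one byte rather than two. The strings are illustrative only:
        printf
        db "Name\tValue\n",0
        printf
        db "Name",9,"Value",13,10,0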
Format sequences always begin with "%". For each format
sequence you must provide a far pointer to the associated
data immediately following the format string, e.g.,
printf
db "%i %i",0
dd i,j
Format sequences take the general form "%s\cn^f" where:
* "%" is always the "%" character. Use "\%" if you
actually want to print a percent sign.
* s is either nothing or a minus sign ("-").
* "\c" is also optional, it may or may not appear in
the format item. "c" represents any printable
character.
* "n" represents a string of 1 or more decimal digits.
* "^" is just the caret (up-arrow) character.
* "f" represents one of the format characters: i, d, x,
h, u, c, s, ld, li, lx, or lu.
The "s", "\c", "n", and "^" items are optional; the "%" and
"f" items must be present. Furthermore, the order of these
items in the format item is very important. The "\c" entry,
for example, cannot precede the "s" entry. Likewise, the "^"
character, if present, must follow everything except the "f"
character(s).
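As an illustration of this ordering, the sketch below uses every
optional item at once. In "%-\*10^d" the "-" requests left
justification, "\*" selects "*" as the padding character, "10"
is the minimum field width, "^" marks the operand as a pointer to
a pointer, and "d" selects signed decimal output. IPtr is a
hypothetical dword variable that holds the far address of a
16-bit integer; it is not part of the library:
        printf
        db "[%-\*10^d]",13,10,0
        dd IPtr ;address of the pointer, not of the integer
By the rules given in this description, an integer value of 42
should print as "[42********]".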
The format characters i, d, x, h, u, c, s, ld, li, lx, and
lu control the output format for the data. The i and d
format characters perform identical functions: they tell
printf to print the following value as a 16-bit signed
decimal integer. The x and h format characters instruct
printf to print the specified value as a 16-bit or 8-bit
hexadecimal value (respectively). If you specify u, printf
prints the value as a 16-bit unsigned decimal integer.
Using c tells printf to print the value as a single character.
Using s tells printf that you're supplying the address of a
zero-terminated character string; printf prints that string.
The ld, li, lx, and lu entries are long (32-bit) versions of
d/i, x, and u. The corresponding address points at a 32-bit
value which printf will format and print to the standard output.
The following example demonstrates these format items:
printf
db "I= %i, U= %u, HexC= %h, HexI= %x, C= %c, "
db "S= %s",13,10
db "L= %ld",13,10,0
dd i,u,c,i,c,s,l
The number of far addresses (specified by operands to the "dd"
pseudo-opcode) must match the number of "%" format items in
the format string. Printf counts the number of "%" format
items in the format string and skips over this many far
addresses following the format string. If the number of
items does not match, the return address for printf will be
incorrect and the program will probably hang or otherwise
malfunction. Likewise (as for the print routine), the format
string must end with a zero byte. The addresses of the items
following the format string must point directly at the memory
locations where the specified data lies.
When used in the format above, printf always prints the
values using the minimum number of print positions for each
operand. If you want to specify a minimum field width, you
can do so using the "n" format option. A format item of the
format "%10d" prints a decimal integer using at least ten
print positions. Likewise, "%16s" prints a string using at
least 16 print positions. If the value to print requires
more than the specified number of print positions, printf
will use however many are necessary. If the value to print
requires fewer, printf will always print the specified number,
padding the value with blanks. Printf will print the value
right justified in the print field (regardless of the data's
type). If you want to print the value left justified in the
output field, use the "-" format character as a prefix to the
field width, e.g.,
printf
db "%-17s",0
dd string
In this example, printf prints the string using a 17 character
long field with the string left justified in the output field.
By default, printf blank fills the output field if the value
to print requires fewer print positions than specified by the
format item. The "\c" format item allows you to change the
padding character. For example, to print a value, right
justified, using "*" as the padding character you would use
the format item "%\*10d". To print it left justified you
would use the format item "%-\*10d". Note that the "-" must
precede the "\*". This is a limitation of the current
version of the software. The operands must appear in this
order.
Normally, the address(es) following the printf
format string must be far pointers to the actual data to print.
On occasion, especially when allocating storage on the heap
(using malloc), you may not know (at assembly time) the
address of the object you want to print. You may have only
a pointer to the data you want to print. The "^" format
option tells printf that the far pointer following the format
string is the address of a pointer to the data rather than
the address of the data itself. This option lets you access
the data indirectly.
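A minimal sketch of indirect access, assuming the StrDupl
routine described under the string handling routines: StrDupl
returns the address of a new heap string in ES:DI, the code saves
that address in a pointer variable, and "%^s" prints the string
through the pointer. SPtr and MallocError are hypothetical names
chosen for this example:
        strdupl
        db "Allocated on the heap",0
        jc MallocError ;carry set if allocation failed
        mov word ptr SPtr, di ;save offset of new string
        mov word ptr SPtr+2, es ;save segment of new string
        printf
        db "%^s",13,10,0
        dd SPtr ;address of the pointer variable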
Note: unlike C, stdlib's printf routine does not support
floating point output. Putting floating point into printf
would increase the size of this routine a tremendous amount.
Since most people don't need the floating point output
facilities, it doesn't appear here. Check out PRINTFF.
Include: stdlib.a or stdout.a
Routine: PRINTFF
-----------------
Category: Character Output Routine
Registers on Entry: CS:RET- Points at format string and other parameters.
Registers on Return: If your program prints floating point values, this
routine modifies the floating point accumulator and
floating point operand "pseudo-registers" in the
floating point package.
Flags Affected: None
Examples of Usage:
printff
db "I = %d, R = %7.2f F = %12.5e G = %9.2gf\n",0
dd i, r, f, g
Description:
This code works just like printf except it also allows the
output of floating point values. The output formats are
the following:
Single Precision:
mm.nnF- Prints a field width of mm chars with nn digits
appearing after the decimal point.
nnE- Prints a floating point value using scientific
notation in a field width of nn chars.
Double Precision:
mm.nnGF- As above, for double precision values.
nnGE- As above, for double precision values.
Extended Precision:
mm.nnLF- As above, for extended precision values.
nnLE- As above, for extended precision values.
Since PRINTFF supports everything PRINTF does, you should not
use both routines in the same program (just use PRINTFF). The
PRINTF & PRINTFF macros check for this and will print a warning
message if you've included both routines. Using both will not
cause your program to fail, but it will make your program
unnecessarily larger. You should not use PRINTFF unless you
really need to print floating point values. When you use
PRINTFF, it forces the linker to load in the entire floating
point package, making your program considerably larger.
Include: stdlib.a or fp.a
String Handling Routines
------------------------
Manipulating text is a major part of many computer applications. Typically,
a program inputs strings and interprets them. This interpretation may involve
chores such as extracting a certain part of the text, copying it, or comparing
it with other strings.
C provides a variety of string manipulation functions, so the stdlib
provides some C-like string handling routines (e.g., strcpy, strcmp).
As in C, a string is an array of characters terminated by a zero byte
(the null character). In general, ES:DI points at the input string of these
routines. Some routines set the carry flag to indicate an error.
The following string routines take as many as four different forms: strxxx,
strxxxl, strxxxm, and strxxxlm. These routines differ in how they store
the destination string into memory and where they obtain their source strings.
Routines of the form strxxx generally expect a single source string address
in ES:DI or a source and destination string in ES:DI & DX:SI. If these
routines produce a string, they generally store the result into the buffer
pointed at by ES:DI upon entry. They return with ES:DI pointing at the
first character of the destination string.
Routines of the form strxxxl have a "literal source string". A literal
source string follows the call to the routine in the code stream. E.g.,
strcatl
db "Add this string to ES:DI",0
Routines of the form strxxxm automatically allocate storage for the resulting
string on the heap and return a pointer to this string in ES:DI.
Routines of the form strxxxlm have a literal source string in the code
stream and allocate storage for the destination string on the heap.
Routine: Strcpy (l)
--------------------
Category: String Handling Routine
Registers on Entry: ES:DI - pointer to source string (Strcpy only)
CS:RET - pointer to source string (Strcpyl only)
DX:SI - pointer to destination string
Registers on return: ES:DI - points at the destination string
Flags affected: None
Example of Usage:
mov dx, seg Dest
mov si, offset Dest
mov di, seg Source
mov es, di
mov di, offset Source
Strcpy
mov dx, seg Dest
mov si, offset Dest
Strcpyl
db "String to copy",0
Description: Strcpy is used to copy a zero-terminated string from one
location to another. ES:DI points at the source string,
DX:SI points at the destination address. Strcpy copies all
bytes, up to and including the zero byte, from the source
address to the destination address. The target buffer must
be large enough to hold the string. Strcpy performs no error
checking on the size of the destination buffer.
Strcpyl copies the zero-terminated string immediately following
the call instruction to the destination address specified by
DX:SI. Again, this routine expects you to ensure that the
target buffer is large enough to hold the result.
Note: There are no "Strcpym" or "Strcpylm" routines. The
reason is simple: "StrDup" and "StrDupl" provide these functions
using names which are familiar to MSC and Borland C users.
Include: stdlib.a or strings.a
Routine: StrDup (l)
--------------------
Category: String Handling Routine
Register on entry: ES:DI - pointer to source string (StrDup
only). CS:RET - Pointer to source string
(StrDupl only).
Register on return: ES:DI - Points at the destination string
allocated on heap. Carry=0 if operation
successful. Carry=1 if insufficient
memory for new string.
Flags affected: Carry flag
Example of usage:
StrDupl
db "String for StrDupl",0
jc MallocError
mov word ptr Dest1, di
mov word ptr Dest1+2, es
;Create another copy of this string. Note that es:di
;points at Dest1 upon entry to StrDup, but it points
;at the new string on exit.
StrDup
jc MallocError
mov word ptr Dest2, di
mov word ptr Dest2+2, es
Description: StrDup and StrDupl duplicate strings. You pass them
a pointer to the string (in es:di for strdup, via
the return address for strdupl) and they allocate
sufficient storage on the heap for a copy of this
string. Then these two routines copy their source
strings to the newly allocated storage and return
a pointer to the new string in ES:DI.
Include: stdlib.a or strings.a
Routine: Strlen
----------------
Category: String Handling Routine
Registers on entry: ES:DI - pointer to source string.
Register on return: CX - length of specified string.
Flags Affected: None
Examples of Usage:
les di, String
strlen
mov sl, cx
printf
db "Length of '%^s' is %d\n",0
dd String, sl
Description: Strlen computes the length of the string whose address
appears in ES:DI. It returns the number of characters
up to, but not including, the zero terminating byte.
Include: stdlib.a or strings.a
Routine: Strcat (m,l,ml)
-------------------------
Category: String Handling Routine
Registers on Entry: ES:DI- Pointer to first string
DX:SI- Pointer to second string (Strcat and Strcatm only)
Registers on Return: ES:DI- Pointer to new string (Strcatm and Strcatml only)
Flags Affected: Carry = 0 if no error
Carry = 1 if insufficient memory (Strcatm and Strcatml
only)
Example of Usage: les DI, String1
mov DX, seg String2
lea SI, String2
Strcat ; String1 <- String1 + String2
les DI, String1
Strcatl ; String1 <- String1 +
db "Appended String",0 ; "Appended String",0
les DI, String1
mov DX, seg String2
lea SI, String2
Strcatm ; NewString <- String1 + String2
puts
free
les DI, String1
Strcatml ; NewString <- String1 +
db "Appended String",0 ; "Appended String",0
puts
free
Description: These routines concatenate two strings together. They differ
mainly in the location of their source and destination operands.
Strcat concatenates the string pointed at by DX:SI to the end of
the string pointed at by ES:DI in memory. Both strings must be
zero-terminated. The buffer pointed at by ES:DI must be large
enough to hold the resulting string. Strcat does NOT perform
bounds checking on the data.
Strcatm computes the length of the two strings pointed at by ES:DI
and DX:SI and attempts to allocate this much storage on the heap.
If it is not successful, Strcatm returns with the Carry flag set,
otherwise it copies the string pointed at by ES:DI to the heap,
concatenates the string DX:SI points at to the end of this string
on the heap, and returns with the Carry flag clear and ES:DI
pointing at the new (concatenated) string on the heap.
Strcatl and Strcatml work just like Strcat and Strcatm except you
supply the second string as a literal constant immediately AFTER
the call rather than pointing DX:SI at it (see examples above).
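The usage examples above omit error handling. A more defensive
call to Strcatm might test the carry flag before using ES:DI, as
in the following sketch. NoMemory and CatDone are hypothetical
labels, and String1 and String2 are declared as in the examples
above:
        les di, String1
        mov dx, seg String2
        lea si, String2
        strcatm ;ES:DI <- new string on the heap
        jc NoMemory ;carry set: allocation failed
        puts
        free ;release the new string
        jmp CatDone
NoMemory: print
        db "Insufficient memory for Strcatm",13,10,0
CatDone: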
Include: stdlib.a or strings.a
Routine: Strchr
----------------
Category: String Handling Routine
Register on entry: ES:DI- Pointer to string.
AL- Character to search for.
Register on return: CX- Position (starting at zero)
where Strchr found the character.
Flags affected: Carry=0 if Strchr found the character.
Carry=1 if the character was not present
in the string.
Example of usage:
les di, String
mov al, Char2Find
Strchr
jc NotPresent
mov CharPosn, cx
Description: Strchr locates the first occurrence of a character within a
string. It searches through the zero-terminated string pointed
at by es:di for the character passed in AL. If it locates the
character, it returns the position of that character in the CX
register. The first character in the string corresponds to the
location zero. If the character is not in the string, Strchr
returns the carry flag set. CX's value is undefined in that
case. If Strchr locates the character in the string, it
returns with the carry clear.
Include: stdlib.a or strings.a
Routine: Strstr (l)
--------------------
Category: String Handling Routine
Register on entry: ES:DI - Pointer to string.
DX:SI - Pointer to substring(strstr).
CS:RET - Pointer to substring (strstrl).
Register on return: CX - Position (starting at zero)
where Strstr/Strstrl found the
substring. Carry=0 if Strstr/
Strstrl found the substring.
Carry=1 if the substring was not
present in the string.
Flags affected: Carry flag
Example of usage :
les di, MainString
lea si, Substring
mov dx, seg Substring
Strstr
jc NoMatch
mov i, cx
printf
db "Found the substring '%s' at location %i\n",0
dd Substring, i
Description: Strstr searches for the position of a substring
within another string. ES:DI points at the
string to search through, DX:SI points at the
substring. Strstr returns the index into ES:DI's
string where DX:SI's string is found. If the
string is found, Strstr returns with the carry
flag clear and CX contains the (zero based) index
into the string. If Strstr cannot locate the
substring within the string ES:DI points at, it
returns the carry flag set. Strstrl works just
like Strstr except it expects the substring to
search for immediately after the call instruction
(rather than passing this address in DX:SI).
Include: stdlib.a or strings.a
Routine: Strcmp (l)
--------------------
Category: String Handling Routine
Registers on entry: ES:DI contains the address of the first string
DX:SI contains the address of the second string (strcmp)
CS:RET contains the address of the second string (strcmpl)
Register on return: CX contains the position where the two strings differ
Flags affected: Carry flag and zero flag (string1 > string2 if C + Z = 0)
(string1 < string2 if C = 1)
Example of Usage:
les di, String1
mov dx, seg String2
lea si, String2
strcmp
ja OverThere
les di, String1
strcmpl
db "Hello",0
jbe elsewhere
Description: Strcmp compares the first string, pointed at by ES:DI, with
the second string, pointed at by DX:SI. The carry and zero flags
contain the result, so unsigned branch instructions such as JA
or JB are recommended. If string1 equals string2, strcmp
returns with CX containing the offset of the zero byte in the
two strings.
Strcmpl compares the first string, pointed at by ES:DI, with
the literal string that follows the call (CS:RET). The carry
and zero flags contain the result, so unsigned branch
instructions such as JA or JB are recommended. If string1
equals the literal string, strcmpl returns with CX
containing the offset of the zero byte in the two strings.
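The following sketch shows a three-way branch on the result. It
assumes that strcmp sets the zero flag when the strings are equal
(as the JA/JBE usage above implies); AreEqual and IsLess are
hypothetical labels:
        les di, String1
        mov dx, seg String2
        lea si, String2
        strcmp
        je AreEqual ;Z=1: string1 = string2
        jb IsLess ;C=1: string1 < string2
        ;fall through: string1 > string2 (C=0 and Z=0)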
Include: stdlib.a or strings.a
Routine: Stricmp (l)
---------------------
Category: String Handling Routine
Registers on entry: ES:DI contains the address of the first string
DX:SI contains the address of the second string (stricmp)
CS:RET contains the address of the second string (stricmpl)
Register on return: CX contains the position where the two strings differ
Flags affected: Carry flag and zero flag (string1 > string2 if C + Z = 0)
(string1 < string2 if C = 1)
Example of Usage:
les di, String1
mov dx, seg String2
lea si, String2
stricmp
ja OverThere
les di, String1
stricmpl
db "Hello",0
jbe elsewhere
Description: This routine is virtually identical to strcmp (l) except it
ignores case when comparing the strings.
Include: stdlib.a or strings.a
Routine: Strupr (m)
--------------------
Category: String Handling Routine
Conversion Routine
Register on entry: ES:DI (contains the pointer to input string)
Register on return: ES:DI (contains the pointer to input string
with characters converted to upper case)
Note: struprm allocates storage for a new
string on the heap and returns the pointer
to this string in ES:DI.
Flags affected: Carry = 1 if memory allocation error (Struprm only).
Example of Usage:
les di, lwrstr1
strupr
puts
mov di, seg StrWLwr
mov es, di
lea di, StrWLwr
struprm
puts
free
Description: Strupr converts the input string pointed at by ES:DI to
upper case. It will actually modify the string you pass
to it.
Struprm first makes a copy of the string on the heap and
then converts the characters in this new string to upper
case. It returns a pointer to the new string in ES:DI.
Include: stdlib.a or strings.a
Routine: Strlwr (m)
--------------------
Category: String Handling Routine
Conversion Routine
Register on entry: ES:DI (contains the pointer to input string)
Register on return: ES:DI (contains the pointer to input string
with characters converted to lower case).
Flags affected: Carry = 1 if memory allocation error (strlwrm only)
Example of Usage:
les di, uprstr1
strlwr
puts
mov di, seg StrWLwr
mov es, di
lea di, StrWLwr
strlwrm
puts
free
Description: Strlwr converts the input string pointed at by ES:DI to
lower case. It will actually modify the string you pass
to it.
Strlwrm first copies the characters onto the heap and then
returns a pointer to this string after converting all the
alphabetic characters to lower case.
Include: stdlib.a or strings.a
Routine: Strset (m)
--------------------
Category: String Handling Routine
Register on entry: ES:DI contains the pointer to input string (StrSet only)
AL contains the character to copy
CX contains number of characters to allocate for
the string (Strsetm only)
Register on return: ES:DI pointer to newly allocated string (Strsetm only)
Flags affected: Carry set if memory allocation error (Strsetm only)
Example of Usage:
les di, string1
mov al, " " ;Blank fill string.
Strset
mov cx, 32
mov al, "*" ;Create a new string w/32
Strsetm ; asterisks.
puts
free
Description: Strset overwrites the data in the input string pointed at by
ES:DI with the character in AL.
Strsetm creates a new string on the heap with the number
of characters specified in CX. All characters in the string
are initialized with the value in AL.
Include: stdlib.a or strings.a
Routine: Strspan (l)
---------------------
Category: String Handling Routine
Registers on Entry: ES:DI - Pointer to string to scan
DX:SI - Pointer to character set (Strspan only)
CS:RET- Pointer to character set (Strspanl only)
Registers on Return: CX- First position in scanned string which does not
contain one of the characters in the character set
Flags Affected: None
Example of Usage:
les DI, String
mov DX, seg CharSet
lea SI, CharSet
Strspan ; find first position in String with a
mov i, CX ; char not in CharSet
printf
db "The first char which is not in CharSet "
db "occurs at position %d in String.\n",0
dd i
les DI, String
Strspanl ; find first position in String which
db "aeiou",0 ; is not a vowel
mov j, CX
printf
db "The first char which is not a vowel "
db "occurs at position %d in String.\n",0
dd j
Description: Strspan(l) scans a string, counting the number of characters which
are present in a second string (which represents a character set).
ES:DI points at a zero-terminated string of characters to scan.
DX:SI (strspan) or CS:RET (strspanl) points at another zero-
terminated string containing the set of characters to compare
against. The position of the first character in the string
pointed to by ES:DI which is NOT in the character set is returned.
If all the characters in the string are in the character set, the
position of the zero-terminating byte will be returned.
Although strspan and (especially) strspanl are very compact and
convenient to use, they are not particularly efficient. The
character set routines provide a much faster alternative at the
expense of a little more space.
Include: stdlib.a or strings.a
Routine: Strcspan, Strcspanl
-----------------------------
Category: String Handling Routine
Registers on Entry: ES:DI - Pointer to string to scan
DX:SI - Pointer to character set (Strcspan only)
CS:RET- Pointer to character set (Strcspanl only)
Registers on Return: CX- First position in scanned string which contains one
of the characters in the character set
Flags Affected: None
Example of Usage:
les DI, String
mov DX, seg CharSet
lea SI, CharSet
Strcspan ; find first position in String with a
mov i, CX ; char in CharSet
printf
db "The first char which is in CharSet "
db "occurs at position %d in String.\n",0
dd i
les DI, String
Strcspanl ; find first position in String which
db "aeiou",0 ; is a vowel.
mov j, CX
printf
db "The first char which is a vowel occurs "
db "at position %d in String.\n",0
dd j
Description: Strcspan(l) scans a string, counting the number of characters
which are NOT present in a second string (which represents a
character set). ES:DI points at a zero-terminated string of
characters to scan. DX:SI (strcspan) or CS:RET (strcspanl) points
at another zero-terminated string containing the set of characters
to compare against. The position of the first character in the
string pointed to by ES:DI which is in the character set is
returned. If none of the characters in the string is in the
character set, the position of the zero-terminating byte will be
returned.
Although strcspan and strcspanl are very compact and convenient to
use, they are not particularly efficient. The character set
routines provide a much faster alternative at the expense of a
little more space.
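Used together, Strspanl and Strcspanl can pick a word out of a
line of text. The sketch below skips leading blanks and then
measures the first word. It assumes that both routines leave
ES:DI unchanged (the register lists above show only CX as a
result) and that the string does not cross a segment boundary.
Line (a dword pointer to the text) and WordLen (a word variable)
are hypothetical:
        les di, Line
        strspanl ;CX = number of leading blanks
        db " ",0
        add di, cx ;ES:DI -> first non-blank character
        strcspanl ;CX = characters before the next blank
        db " ",0
        mov WordLen, cx ;length of the first word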
Include: stdlib.a or strings.a
Routine: StrIns (m,l,ml)
-------------------------
Category: String Handling Routine
Registers on Entry: ES:DI - Pointer to destination string (to insert into)
DX:SI - Pointer to string to insert
(StrIns and StrInsm only)
CX - Insertion point in destination string
Registers on Return: ES:DI - Pointer to new string (StrInsm and StrInsml only)
Flags Affected: Carry = 0 if no error
Carry = 1 if insufficient memory
(StrInsm and StrInsml only)
Example of Usage:
les DI, DestStr
mov DX, word ptr SrcStr+2
mov SI, word ptr SrcStr
mov CX, 5
StrIns ; Insert SrcStr before the 6th char of DestStr
les DI, DestStr
mov CX, 2
StrInsl ; Insert "Hello" before the 3rd char of DestStr
db "Hello",0
les DI, DestStr
mov DX, word ptr SrcStr+2
mov SI, word ptr SrcStr
mov CX, 11
StrInsm ; Create a new string by inserting SrcStr
; before the 12th char of DestStr
puts
putcr
free
Description: These routines insert one string into another string. ES:DI
points at the string into which you want to insert another. CX
contains the position (or index) where you want the string
inserted. This index is zero-based, so if CX contains zero, the
source string will be inserted before the first character in the
destination string. If CX contains a value larger than the size
of the destination string, the source string will be appended to
the destination string.
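For example, the following sketch prepends one literal and
appends another by using the two extreme insertion points. It
assumes that the buffer DestStr points at has room for the added
characters and that the string is shorter than 1000 characters,
so an insertion point of 1000 lies beyond its end:
        les di, DestStr
        mov cx, 0 ;insert before the first character
        StrInsl
        db "Start: ",0
        les di, DestStr
        mov cx, 1000 ;beyond the end: append
        StrInsl
        db " (end)",0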
StrIns inserts the string pointed at by DX:SI into the string
pointed at by ES:DI at position CX. The buffer pointed at by
ES:DI must be large enough to hold the resulting string. StrIns
does NOT perform bounds checking on the data.
StrInsm does not modify the source or destination strings, but
instead attempts to allocate a new buffer on the heap to hold the
resulting string. If it is not successful, StrInsm returns with
the Carry flag set, otherwise the resulting string is created and
its address is returned in the ES:DI registers.
StrInsl and StrInsml work just like StrIns and StrInsm except you
supply the second string as a literal constant immediately AFTER
the call rather than pointing DX:SI at it (see examples above).
Include: stdlib.a or strings.a
Routine: StrDel, StrDelm
-------------------------
Category: String Handling Routine
Registers on Entry: ES:DI - pointer to string
CX - deletion point in string
AX - number of characters to delete
Registers on return: ES:DI - pointer to new string (StrDelm only)
Flags affected: Carry = 1 if memory allocation error, 0 if okay
(StrDelm only).
Example of Usage:
les di, Str2Del
mov cx, 3 ; Delete starting at 4th char
mov ax, 5 ; Delete five characters
StrDel ; Delete in place
les di, Str2Del2
mov cx, 5
mov ax, 12
StrDelm
puts
free
Description: StrDel deletes characters from a string. It works by computing
the beginning and end of the deletion point. Then it copies all
the characters from the end of the deletion point to the end of
the string (including the zero byte) to the beginning of the
deletion point. This covers up (thereby effectively deleting)
the undesired characters in the string.
There are two degenerate cases to worry about: 1) when you
specify a deletion point which is beyond the end of the string;
and 2) when the deletion point is within the string but the
length of the deletion takes you beyond the end of the string.
In the first case StrDel simply ignores the deletion request. It
does not modify the original string. In the second case,
StrDel simply deletes everything from the deletion point to the
end of the string.
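The two degenerate cases are easier to see in a higher-level model. The
following C sketch mirrors the behavior described above; it is not the
library's code. The name str_del_model and its parameters are made up for
illustration, and the deletion point is assumed to be zero-based, as the
comment in the usage example ("Delete starting at 4th char" for CX = 3)
suggests.

```c
#include <string.h>

/* Hypothetical C model of StrDel's documented behavior (not library code).
   pos   - zero-based deletion point (the value passed in CX)
   count - number of characters to delete (the value passed in AX) */
static void str_del_model(char *s, size_t pos, size_t count)
{
    size_t len = strlen(s);

    if (pos >= len)             /* case 1: deletion point beyond the end */
        return;                 /* ignore the request, string unchanged  */
    if (count > len - pos)      /* case 2: deletion runs past the end    */
        count = len - pos;      /* delete through the end of the string  */

    /* Copy the tail (including the zero byte) down over the deleted span. */
    memmove(s + pos, s + pos + count, len - pos - count + 1);
}
```

With this model, deleting five characters at position 3 of "Hello, world"
leaves "Helorld".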
StrDelm works just like StrDel except it does not delete the
characters in place. Instead, it creates a new string on the
heap consisting of the characters up to the deletion point and
those following the characters to delete. It returns a pointer
to the new string on the heap in ES:DI, assuming that it
properly allocated the storage on the heap.
Include: stdlib.a or strings.a
Routine: StrTrim (m)
---------------------
Category: String Handling Routine
Registers on Entry: ES:DI - pointer to string
Registers on return: ES:DI - pointer to string (new string if StrTrimm)
Flags affected: Carry = 1 if memory allocation error, 0 if okay
(StrTrimm only).
Example of Usage:
les di, Str2Trim
StrTrim ; Delete in place
puts
les di, Str2Trim2
StrTrimm
puts
free
Description: StrTrim (m) removes trailing spaces from a string. StrTrim
removes the spaces in the specified string in place (by backing
up the zero terminating byte in the string). StrTrimm creates a
new copy of the string (on the heap) without the trailing spaces.
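"Backing up the zero terminating byte" can be sketched in C as follows. This
is only a model of the in-place variant as described above; the function
name str_trim_model is hypothetical and the code is not taken from the
library.

```c
#include <string.h>

/* Hypothetical C model of StrTrim's documented behavior (not library code):
   move the terminating zero byte back over any trailing spaces. */
static void str_trim_model(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && s[len - 1] == ' ')
        len--;                  /* step back over one trailing space */
    s[len] = '\0';              /* new position of the zero byte     */
}
```

Note that only trailing spaces are affected; leading and embedded spaces
stay where they are.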
Include: stdlib.a or strings.a
Routine: StrBlkDel (m)
-----------------------
Category: String Handling Routine
Registers on Entry: ES:DI - pointer to string
Registers on return: ES:DI - pointer to string (new string if StrBlkDelm)
Flags affected: Carry = 1 if memory allocation error, 0 if okay
(StrBlkDelm only).
Example of Usage:
les di, Str2Trim
StrBlkDel ; Delete in place
puts
les di, Str2Trim2
StrBlkDelm
puts
free
Description: StrBlkDel (m) removes leading spaces from a string. StrBlkDel
removes the spaces in the specified string, modifying that
string. StrBlkDelm creates a new copy of the string (on the
heap) without the leading spaces.
Include: stdlib.a or strings.a
Routine: StrRev, StrRevm
-------------------------
Author: Michael Blaszczak (.B ekiM)
Category: String Handling Routine
Registers on Entry: ES:DI - pointer to string
Registers on return: ES:DI - pointer to new string (StrRevm only).
Flags affected: Carry = 1 if memory allocation error, 0 if okay
(StrRevm only).
Example of Usage:
Description: StrRev reverses the characters in a string. StrRev reverses,
in place, the characters in the string that ES:DI points at.
StrRevm creates a new string on the heap (which contains the
characters in the string ES:DI points at, only reversed) and
returns a pointer to the new string in ES:DI. If StrRevm
cannot allocate sufficient memory for the string, it returns
with the carry flag set.
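For readers who want to check the in-place behavior off-target, here is a
small C model of what StrRev is described as doing. The name str_rev_model
is hypothetical; the library's own implementation is in assembly and may
differ in detail.

```c
#include <string.h>

/* Hypothetical C model of StrRev's documented behavior (not library code):
   swap characters from both ends, working toward the middle. */
static void str_rev_model(char *s)
{
    size_t len = strlen(s);
    size_t i, j;

    if (len < 2)
        return;                 /* nothing to swap */
    for (i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}
```

Applied to "Mike B." this yields ".B ekiM", which explains the author
credit above.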
Include: stdlib.a or strings.a
Routine: StrBDel (m)
---------------------
Author: Randall Hyde
Category: String Handling Routine
Registers on Entry: ES:DI - pointer to string
Registers on return: ES:DI - pointer to new string (StrBDelm only).
Flags affected: Carry = 1 if memory allocation error, 0 if okay
(StrBDelm only).
Example of Usage:
Description: StrBDel(m) deletes leading blanks from a string. StrBDel
operates on the string in place; StrBDelm creates a copy
(on the heap) of the string without the leading blanks.
Include: stdlib.a or strings.a
Routine: ToHex
---------------
Category: String Handling Routine/ Conversion Routine
Registers on Entry: ES:DI - pointer to byte array
BX- memory base address for bytes
CX- number of entries in byte array
Registers on return: ES:DI - pointer to Intel Hex format string.
Flags affected: Carry = 1 if memory allocation error, 0 if okay
Example of Usage:
mov bx, 100h ;Put data at address 100h in hex file.
mov cx, 10h ;Total of 16 bytes in this array.
les di, Buffer ;Pointer to data bytes
ToHex ;Convert to Intel HEX string format.
puts ;Print it.
Description:
ToHex converts a stream of binary values to Intel Hex format. Intel HEX format
is a common ASCII data interchange format for binary data. It takes the
following form:
: BB HHLL RR DDDD...DDDD SS <cr> <lf>
(Note: spaces were added for clarity; they are not actually present in the
hex string.)
BB is a pair of hex digits which represent the number of data bytes (The DD
entries) and is the value passed in CX.
HHLL is the hexadecimal load address for these data bytes (passed in BX).
RR is the record type. ToHex always produces data records with the RR field
containing "00". If you need to output other field types (usually just an
end record) you must create that string yourself. ToHex will not do it.
DD...DD is the actual data in hex form. This is the number of bytes specified
in the BB field.
SS is the two's complement of the checksum (which is the sum of the binary
values of the BB, HH, LL, RR, and all DD fields).
This routine allocates storage for the string on the heap and returns a pointer
to that string in ES:DI.
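The record layout and checksum rule above can be checked with a short C
model. This is a sketch of the format as described, not the library's code;
the function name to_hex_record and its parameters are hypothetical (data
and count stand in for ES:DI and CX, addr for BX), and it reports an
allocation failure by returning NULL where the library would set the carry
flag.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical C model of a ToHex-style data record (not library code).
   Returns a malloc'ed string the caller must free, or NULL on failure. */
static char *to_hex_record(const unsigned char *data, unsigned char count,
                           unsigned short addr)
{
    /* ':' BB HHLL RR, two digits per data byte, SS, CR LF, zero byte */
    size_t size = 1 + 2 + 4 + 2 + 2 * (size_t)count + 2 + 2 + 1;
    char *rec = malloc(size);
    char *p = rec;
    unsigned sum;
    unsigned i;

    if (rec == NULL)
        return NULL;

    /* The checksum covers BB, HH, LL and RR (RR is always 00 here). */
    sum = count + (addr >> 8) + (addr & 0xFF);
    p += sprintf(p, ":%02X%04X00", (unsigned)count, (unsigned)addr);

    for (i = 0; i < count; i++) {
        sum += data[i];
        p += sprintf(p, "%02X", (unsigned)data[i]);
    }

    /* SS is the two's complement of the low byte of the sum. */
    sprintf(p, "%02X\r\n", (0x100u - (sum & 0xFFu)) & 0xFFu);
    return rec;
}
```

For the three bytes 01 02 03 loaded at address 0100h, the sum of the fields
is 0Ah, so SS is F6 and the record reads ":03010000010203F6" followed by
<cr> <lf>.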
Include: stdlib.a or strings.a
Utility Routines
----------------
The following routines are all Utility Routines. The first routines listed
below compute the number of print positions required by 16-bit and 32-bit
signed and unsigned integer values. ULSize is like LSize except it treats
the value in DX:AX as an unsigned long integer. The next set of routines in
this section check the character in the AL register to see whether it is a
hexadecimal digit, if it is alphabetic, if it is a lower case alphabetic, if
it is an upper case alphabetic, and if it is numeric. Then there are some
miscellaneous routines (macros) which process command line parameters, invoke
DOS and exit the program.
Routine: ISize
---------------
Category: Utility Routine
Register on entry: AX- 16-bit value to compute the
output size for.
Register on return: AX- Number of print positions
required by this number (including
the minus sign, if necessary).
Flags affected: None
Example of usage:
mov ax, I
ISize
puti ;Prints positions
;req'd by I.
Description: This routine computes the number of print positions
required by a 16-bit signed integer value. ISize computes
the minimum number of character positions it takes to print
the signed decimal value in the AX register. If the number
is negative, it will include space for the minus sign in
the count.
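The count ISize returns can be modeled in a few lines of C. This is a
sketch of the documented result only; the name isize_model is hypothetical
and the library's assembly implementation may compute the value
differently.

```c
#include <stdint.h>

/* Hypothetical C model of ISize's documented result (not library code):
   number of characters needed to print a 16-bit signed decimal value. */
static unsigned isize_model(int16_t value)
{
    int32_t n = value;          /* widen so -32768 can be negated safely */
    unsigned width = 0;

    if (n < 0) {
        width = 1;              /* one position for the minus sign */
        n = -n;
    }
    do {                        /* count decimal digits; zero needs one */
        width++;
        n /= 10;
    } while (n != 0);
    return width;
}
```

Under this model the widest case is -32768, which needs six print
positions.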
Include: stdlib.a or util.a
Routine: USize
---------------
Category: Utility Routine
Register on entry: AX- 16 bit value to compute the
output size for
Register on return: AX- number of print positions
required by this number
Flags affected: None
Example of usage:
mov ax, I
USize
puti ;Prints positions
;req'd by I.
Description: This routine computes the number of print positions
required by a 16-bit unsigned integer value. USize computes
the minimum number of character positions it will take to
print the unsigned decimal value in the AX register. Since
the value is treated as unsigned, the count never includes
space for a minus sign.
Include: stdlib.a or util.a
Routine: LSize
---------------
Category: Utility Routine
Register on entry: DX:AX - 32-bit value to compute the
output size for.
Register on return: AX - Number of print positions
required by this number (including
the minus sign, if necessary).
Flags affected: None
Example of Usage:
mov ax, word ptr L
mov dx, word ptr L+2
LSize
puti ;Prints positions
;req'd by L.
Description: This routine computes the number of print positions
required by a 32-bit signed integer value. LSize computes
the minimum number of character positions it will take to
print the signed decimal value in the DX:AX registers. If
the number is negative, it will include space for the minus
sign in the count.
Include: stdlib.a or util.a
Routine: ULSize
----------------
Category: Utility Routine
Registers on Entry: DX:AX - 32-bit value to compute the output size for.
Registers on return: AX - number of print positions required by this number
Flags affected: None
Example of Usage:
mov ax, word ptr L
mov dx, word ptr L+2
ULSize
puti ; Prints positions req'd by L
Description: ULSize computes the minimum number of character
positions it will take to print an unsigned decimal
value in the DX:AX registers.
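The following C model shows how the DX:AX register pair forms the 32-bit
value and how the digit count follows from it. It is a sketch of the
documented result, not the library's code; the name ulsize_model and its
two parameters (standing in for DX and AX) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical C model of ULSize's documented result (not library code).
   dx holds the high word and ax the low word of the unsigned value. */
static unsigned ulsize_model(uint16_t dx, uint16_t ax)
{
    uint32_t n = ((uint32_t)dx << 16) | ax;
    unsigned width = 0;

    do {                        /* count decimal digits; zero needs one */
        width++;
        n /= 10;
    } while (n != 0);
    return width;
}
```

The result ranges from one position (for zero) to ten positions (for
4,294,967,295, i.e., DX:AX = FFFF:FFFF).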
Include: stdlib.a or util.a
Routine: IsAlNum
-----------------
Category: Utility routine
Register on entry: AL - character to check.
Register on return: None
Flags affected: Zero flag - set if character is alphanumeric,
clear if not.
Example of usage : mov al, char
IsAlNum
je IsAlNumChar
Description : This routine checks the character in the AL register to
see if it is in the range A-Z, a-z, or 0-9. Upon return,
you can use the JE instruction to check to see if the
character was in this range (or, conversely, you can use
JNE to see if it is not in range).
Include: stdlib.a or util.a
Routine: IsXDigit
------------------
Category: Utility Routine
Register on Entry: AL- character to check
Registers on Return: None
Flags Affected: Zero flag- Set if character is a hex digit, clear if not
Example of Usage: mov al, char
IsXDigit
je IsXDigitChar
Description: This routine checks the character in the AL register to
see if it is in the range A-F, a-f, or 0-9. Upon
return, you can use the JE instruction to check to see
if the character was in this range (or, conversely,
you can use JNE to see if it is not in the range).
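The range test itself is simple enough to state in C. This sketch assumes
ASCII character codes and returns a nonzero value where the library routine
would set the zero flag; the name is_xdigit_model is hypothetical.

```c
/* Hypothetical C model of the IsXDigit range test (not library code).
   Returns nonzero if c is in 0-9, A-F, or a-f; zero otherwise. */
static int is_xdigit_model(unsigned char c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'F') ||
           (c >= 'a' && c <= 'f');
}
```

The other character tests in this section (IsAlNum, IsDigit, IsAlpha,
IsLower, IsUpper) follow the same pattern with the ranges given in their
descriptions.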
Include: stdlib.a or util.a
Routine: IsDigit
------------------
Category: Utility Routine
Register on entry: AL- Character to check
Register on return: None
Flags affected: Zero flag- set if character is numeric, clear if not.
Example of Usage: mov al, char
IsDigit
je IsDecChar
Description: This routine checks the character in the AL register to
see if it is in the range 0-9. Upon return, you can use
the JE instruction to check to see if the character was
in the range (or, conversely, you can use JNE to see if it
is not in the range).
Include: stdlib.a or util.a
Routine: IsAlpha
------------------
Category: Utility Routine
Register on entry: AL- Character to check
Register on return: None
Flags affected: Zero flag- set if character is alphabetic, clear if not.
Example of Usage: mov al, char
IsAlpha
je IsAlChar
Description: This routine checks the character in the AL register to
see if it is in the range A-Z or a-z. Upon return, you
can use the JE instruction to check to see if the character
was in the range (or, conversely, you can use JNE to see
if it is not in the range).
Include: stdlib.a or util.a
Routine: IsLower
----------------
Category: Utility Routine
Registers on Entry: AL- character to test
Registers on Return: None
Flags Affected: Zero = 1 if character is a lower case alphabetic character
Zero = 0 if character is not a lower case alphabetic
character
Example of Usage: mov AL, char ; put char in AL
IsLower ; is char lower a-z?
je IsLowerChar ; if yes, jump to IsLowerChar
Description: This routine checks the character in the AL register to
see if it is in the range a-z. Upon return, you can use
the JE instruction to check and see if the character was
in this range (or you can use JNE to check and see if
the character was not in this range). This procedure is
implemented as a macro for high performance.
Include: stdlib.a or util.a
Routine: IsUpper
-----------------
Category: Utility Routine
Registers on Entry: AL- character to check
Registers on Return: None
Flags Affected: Zero flag - set if character is uppercase alpha, clear
if not.
Example of Usage: mov al, char
IsUpper
je IsUpperChar
Description: This routine checks the character in the AL register to
see if it is in the range A-Z. Upon return, you can use
the JE instruction to check to see if the character was in
this range (or, conversely, you can use JNE to see if it is
not in the range). This procedure is implemented as a macro
for high performance.
Include: stdlib.a or util.a
Linked list manipulation routines
=================================
These routines manipulate items in a linked list. Internally the system
represents the data as a doubly linked list, although your program should
not rely on the internal structure of the data structure.
There are two structures of interest defined in the LISTS.A file: LIST
and NODE. Use variables of type LIST to create brand new lists. Use
variables of type NODE to hold the entries in the list.
These structures take the following form:
List struc
Size dw ? ;Size, in bytes, of a node in the list
Head dd 0 ;Ptr to start of list
Tail dd 0 ;Ptr to end of list
Current dd 0 ;Pointer to current node
List ends
Node struc
Next dd ? ;Ptr to next node in list
Prev dd ? ;Ptr to prev node in list
NodeData db ?? ;Data immediately follows Prev
Node ends
There are two ways to create a new list: statically or dynamically.
Consider static allocation first. In this case, you create a list variable
by declaring an object of type LIST in a data segment, e.g.,
MyList list <25>
You *must* supply the size (in bytes) of a node in the list. Note that the
size should *not* include the eight bytes required for the next and prev
pointers. This allows you to change the internal structure of the list
(e.g., to a singly linked list) without having to change other code. You
can easily compute this as follows:
MyList list <(sizeof MyNode) - (sizeof Node)>
When you declare lists in this fashion, the definition automatically
initializes the list to an empty list.
You can also create a list dynamically by calling the CreateList routine.
To CreateList you must pass the size of a Node (not including the pointers)
in the CX register. It allocates storage for the list variable on the
heap and returns a pointer to this new (empty) list in es:di.
mov cx, (sizeof MyNode) - (sizeof Node)
CreateList
mov word ptr MyListPtr, di
mov word ptr MyListPtr+2, es
To create nodes for your list, you should "overload" the NODE definition
appearing in the LISTS.A file. This works best under MASM 6.0 and TASM 3.0,
which support object-oriented programming, though it isn't that difficult to
accomplish with other assemblers. A mechanism compatible with *all*
assemblers follows:
Creating a brand new node is easy; just do the following:
MyNode struc
db (size Node) dup (0) ;Inherit all fields from NODE.
Field1 db ? ;User-supplied fields for this
Field2 dw ? ; particular node type.
Field3 dd ? ; " " " "
Field4 real4 3.14159 ; " " " "
MyNode ends
Note that the NODE fields must appear *first* in the data structure.
The list manipulation routines assume that the list pointers in NODE appear
at the beginning of the structure.
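The same "header first" layout can be modeled in C, which may help if the
struc syntax is unfamiliar. This is only an analogy under stated
assumptions: the type and function names below are hypothetical, and C
pointers on a modern machine are not the four-byte far pointers the library
uses, so the header here is not necessarily eight bytes long. What carries
over is the rule that user data starts immediately after the list pointers
and that the payload size excludes them.

```c
#include <stddef.h>

/* Hypothetical C analogy of the NODE header (not library code). */
struct node {
    struct node *next;          /* ptr to next node in list */
    struct node *prev;          /* ptr to prev node in list */
};

/* A user node "inherits" the header by placing it first. */
struct my_node {
    struct node    link;        /* must be the first member */
    unsigned char  field1;      /* user-supplied fields follow */
    unsigned short field2;
    unsigned long  field3;
    float          field4;
};

/* Size to hand to the list: the node minus the list pointers. */
static size_t payload_size(void)
{
    return sizeof(struct my_node) - sizeof(struct node);
}

/* Address of the user data that follows the header. */
static void *node_data(struct node *n)
{
    return (unsigned char *)n + sizeof(struct node);
}
```

payload_size() corresponds to the (sizeof MyNode) - (sizeof Node)
expression used when declaring or creating a list.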
The CurrentNode field of the list data structure points at a "current" node
in the list. The current node is the last node operated on in the case of
insert, append, peek, etc. In the event a node is removed, the current node
will be the next node after the node removed. In general, the current node
can be thought of as a "cursor" which wanders through the list according to
the operations occurring. Since most list operations occur on the next node
in a list, keeping the CurrentNode field updated speeds up access to the
list.
You can use the following routines to implement the corresponding data
structures (which can all be implemented using lists):
FIFO Queues:
AppendLastm, AppendLast, Remove1st, and Peek1st (technically, using Peek1st
is cheating, but so what).
Deques (double ended queues):
All the FIFO routines plus Insert1stm, Insert1st, RemoveLast, and
PeekLast (PeekLast is cheating too).
Lists:
All of the above plus InsertCur, InsertmCur, AppendCur, AppendmCur,
RemoveCur, Insert, Insertm, Append, Appendm, Remove, SetCur, NextNode,
and PrevNode.
For those who care about such things, the UCR Standard Library implements
the list data structure using a doubly linked list. However, it is a
true generic (encapsulated) data type and your code need not be at all
concerned about the internal structure. Furthermore, assuming you treat
it like an encapsulated data structure, you can modify the internal list
structure and not break any programs which use the list data types.
Routine: CreateList
--------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: CX- Size of data (in bytes) to store at each node
Registers on return: ES:DI- Pointer to new list variable on heap
Flags affected: Carry set if CreateList cannot allocate sufficient
storage on the heap for the list variable.
Example of Usage:
mov cx, (sizeof MyNode) - (sizeof Node)
CreateList
jc ListError
mov word ptr ListVarPtr, di
mov word ptr ListVarPtr+2, es
Description:
CreateList allocates storage for a list variable on the heap and initializes
that variable to the empty list. It also sets up the size field of the
list variable based on the value passed in the CX register. It returns
a pointer to the newly created list in the ES:DI registers.
This routine initializes the CurrentNode field to NIL. Any node inserted
before or after the current node will be inserted as the first node in this
case.
Include: lists.a or stdlib.a
Routine: AppendLast (m)
------------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: DX:SI- Pointer to node to add to list (AppendLast)
DX:SI- Pointer to block of data (sans list stuff)
to add to end of list (AppendLastm)
ES:DI- Pointer to list.
Registers on return: ES:DI- Pointer to list.
Flags affected: Carry set if AppendLastm cannot allocate sufficient
storage on the heap for the new node.
Examples of Usage:
; Append data statically declared as ANode to the end of the list pointed at
; by the list variable "ListVar".
ldxi ANode
les di, ListVar
AppendLast
; Create a node from the data at address "MyData". Build the node on the
; heap and append this node to the end of the list pointed at by ListVar.
ldxi MyData
les di, ListVar
AppendLastm
jc BadListError
Description:
AppendLast and AppendLastm add a node to the end of a list. AppendLast works
with whole nodes. It is useful, for example, when moving a node from one
list to another or when dealing with nodes that were created statically in
the program. It requires nodes properly declared using the NODE data type
in the LISTS.A include file.
AppendLastm builds a new node on the heap and appends this node to the end
of the specified list. The difference between AppendLastm and AppendLast is
that AppendLastm does not require a predefined node. Instead, DX:SI points
at the data for the node (the number of bytes is specified by the ListSize
field of the LIST data type). AppendLastm allocates memory, copies the data
from DX:SI to the data field of the new node, and then links in the new node
to the specified list.
The new node added to the list becomes the CurrentNode.
Include: stdlib.a or lists.a
Routine: Remove1st
-------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list variable.
Registers on return: DX:SI- Pointer to node removed from the front of
the list (NIL if nothing in list).
Flags affected: Carry set if the list was empty.
Examples of Usage:
; The following loop removes all the items from a list and processes each
; item.
DoAllOfList: les di, MyList
Remove1st
jc DidItAll
<manipulate this item>
jmp DoAllOfList
DidItAll:
Description:
Remove1st removes the first item from a list and returns a pointer to that
item in DX:SI. If the list was empty, then it returns a NIL pointer in
DX:SI and returns with the carry flag set.
Note that you can use the AppendLast(m) and Remove1st routines to implement
and manipulate a FIFO queue data structure. Peek1st is another useful
routine which returns the first item on a list without removing it from
the list.
The second node in the list (the one after the node just removed) becomes
the new CurrentNode. If there are no additional nodes in the list, the
CurrentNode variable gets set to NIL.
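The FIFO behavior and the CurrentNode updates described for AppendLast,
Remove1st, and Peek1st can be modeled with an ordinary doubly linked list
in C. This is a sketch of the documented behavior only; the type and
function names are hypothetical, a NULL return stands in for "carry set",
and the library's internal representation may differ.

```c
#include <stddef.h>

/* Hypothetical C model of the list routines (not library code). */
struct node { struct node *next, *prev; };
struct list { struct node *head, *tail, *current; };

/* Model of AppendLast: link n at the tail; n becomes the current node. */
static void append_last(struct list *l, struct node *n)
{
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail != NULL)
        l->tail->next = n;
    else
        l->head = n;            /* list was empty */
    l->tail = n;
    l->current = n;
}

/* Model of Remove1st: unlink and return the head (NULL if list empty).
   The node after it, if any, becomes the current node. */
static struct node *remove_1st(struct list *l)
{
    struct node *n = l->head;

    if (n == NULL)
        return NULL;
    l->head = n->next;
    if (l->head != NULL)
        l->head->prev = NULL;
    else
        l->tail = NULL;         /* list is now empty */
    l->current = l->head;       /* NULL when nothing is left */
    n->next = n->prev = NULL;
    return n;
}

/* Model of Peek1st: return the head without removing it and make it
   the current node. */
static struct node *peek_1st(struct list *l)
{
    if (l->head != NULL)
        l->current = l->head;
    return l->head;
}
```

Appending at one end and removing from the other gives first-in, first-out
order, which is all a FIFO queue requires.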
Include: stdlib.a or lists.a
Routine: Peek1st
-----------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list variable.
Registers on return: DX:SI- Pointer to node at the beginning of
the list (NIL if nothing in list).
Flags affected: Carry set if the list was empty.
Examples of Usage:
les di, MyList
Peek1st
jc NothingThere
Description:
Peek1st is similar to Remove1st in that it returns a pointer to the first
item in a list (NIL if the list is empty). However, it does not remove the
item from the list. This is useful for performing a "non-destructive" read
of the first item in a FIFO queue.
This routine sets the CurrentNode field to the first node in the list.
Include: stdlib.a or lists.a
Routine: Insert1st (m)
-----------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: DX:SI- Pointer to node to add to list (Insert1st)
DX:SI- Pointer to block of data (sans list stuff)
to add to start of list (Insert1stm)
ES:DI- Pointer to list.
Registers on return: ES:DI- Pointer to list.
Flags affected: Carry set if Insert1stm cannot allocate sufficient
storage on the heap for the new node.
Examples of Usage:
; Insert data statically declared as ANode to the beginning of the list
; pointed at by the list variable "ListVar".
ldxi ANode
les di, ListVar
Insert1st
; Create a node from the data at address "MyData". Build the node on the
; heap and insert this node to the beginning of the list pointed at by
; ListVar.
ldxi MyData
les di, ListVar
Insert1stm
jc BadListError
Description:
Insert1st and Insert1stm add a node to the beginning of a list. Insert1st
works with whole nodes. It is useful, for example, when moving a node from one
list to another or when dealing with nodes that were created statically in
the program. It requires nodes properly declared using the NODE data type
in the LISTS.A include file.
Insert1stm builds a new node on the heap and inserts this node to the start
of the specified list. The difference between Insert1stm and Insert1st is
that Insert1stm does not require a predefined node. Instead, DX:SI points
at the data for the node (the number of bytes is specified by the ListSize
field of the LIST data type). Insert1stm allocates memory, copies the data
from DX:SI to the data field of the new node, and then links in the new node
to the specified list.
Note that Insert1st/Insert1stm can be used to create Deque data structures.
The newly inserted node becomes the CurrentNode in the list.
Include: stdlib.a or lists.a
Routine: RemoveLast
--------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list variable.
Registers on return: DX:SI- Pointer to node removed from the end of
the list (NIL if nothing in list).
Flags affected: Carry set if the list was empty.
Examples of Usage:
; The following loop removes all the items from a list and processes each
; item.
DoAllOfList: les di, MyList
RemoveLast
jc DidItAll
<manipulate this item>
jmp DoAllOfList
DidItAll:
Description:
RemoveLast removes the last item from a list and returns a pointer to that
item in DX:SI. If the list was empty, then it returns a NIL pointer in
DX:SI and returns with the carry flag set.
Note that you can use the Insert1st(m) and RemoveLast routines to implement
and manipulate a deque data structure (along with the FIFO routines:
AppendLast(m), Remove1st, and Peek1st). PeekLast is another useful
routine which returns the last item on a list without removing it from
the list.
The last node in the list (the one before the node just removed) becomes the
new CurrentNode in the list. If the list is empty, CurrentNode gets set to
NIL.
Include: stdlib.a or lists.a
Routine: PeekLast
------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list variable.
Registers on return: DX:SI- Pointer to node at the end of
the list (NIL if nothing in list).
Flags affected: Carry set if the list was empty.
Examples of Usage:
les di, MyList
PeekLast
jc NothingThere
Description:
PeekLast is just like Peek1st except it looks at the last node on the list
rather than the first. It does the same job as RemoveLast except it does
not remove the node from the list. Great for implementing Deques.
This routine also sets the CurrentNode field to point at the last node in
the list.
Include: stdlib.a or lists.a
Routine: InsertCur
-------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
DX:SI- Pointer to node to insert
Examples of Usage:
les di, MyList
ldxi NewNode
InsertCur
Description:
InsertCur inserts the node pointed at by DX:SI before the "current" node in
the list. The current node is the last one operated on by the software.
The newly inserted node becomes the CurrentNode in the list.
Include: stdlib.a or lists.a
Routine: InsertmCur
--------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
DX:SI- Pointer to data for node to insert
Flags on exit: Carry flag is set if malloc error occurs.
Examples of Usage:
les di, MyList
ldxi DataBlock
InsertmCur
jc Error
Description:
InsertmCur builds a new node on the heap (using the block of data pointed at
by DX:SI and the size of a node in the size field of the list variable) and
then inserts the new node before the "current" node in the list. The current
node is the last one operated on by the software.
This code treats the newly inserted node as the current node.
Include: stdlib.a or lists.a
Routine: AppendCur
-------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
DX:SI- Pointer to node to append
Examples of Usage:
les di, MyList
ldxi NewNode
AppendCur
Description:
AppendCur inserts the node pointed at by DX:SI after the "current" node in
the list. The current node is the last one operated on by the software.
The newly inserted node becomes the CurrentNode in the list.
Include: stdlib.a or lists.a
Routine: AppendmCur
--------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
DX:SI- Pointer to data for node to insert.
Flags on exit: Carry flag is set if malloc error occurs.
Examples of Usage:
les di, MyList
ldxi DataBlock
AppendmCur
jc MallocError
Description:
AppendmCur builds a new node on the heap (using the block of data pointed at
by DX:SI and the size of a node in the size field of the list variable) and
then inserts the new node after the "current" node in the list. The current
node is the last one operated on by the software.
This code treats the newly inserted node as the current node.
Include: stdlib.a or lists.a
Routine: RemoveCur
-------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
Registers on exit: DX:SI- Points at node removed from list (NIL if
no such node).
Flags on return: Carry set if the list was empty.
Examples of Usage:
les di, MyList
RemoveCur
jc EmptyList
Description:
RemoveCur removes the current node (pointed at by CurrentNode) from the list
and returns a pointer to this node in DX:SI. If the list was empty, RemoveCur
returns NIL in DX:SI and sets the carry flag.
This routine modifies CurrentNode so that it points at the next item in the
list (the node normally following the current node). If there is no such
node (i.e., CurrentNode pointed at the last node in the list upon calling
RemoveCur) then this routine stores the value of the *previous* node into
CurrentNode. If you use this routine to delete the only node remaining in
the list, it sets CurrentNode to NIL before leaving.
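The rule for where CurrentNode ends up (next node if there is one,
otherwise the previous node, otherwise NIL) can be written out in C as
follows. This is a model of the description above, not the library's code;
the names are hypothetical, and an empty list is assumed to have a NULL
current pointer.

```c
#include <stddef.h>

/* Hypothetical C model of RemoveCur (not library code). */
struct node { struct node *next, *prev; };
struct list { struct node *head, *tail, *current; };

/* Unlink and return the current node; NULL models "carry set". */
static struct node *remove_cur(struct list *l)
{
    struct node *n = l->current;

    if (n == NULL)
        return NULL;            /* empty list */

    if (n->prev != NULL)
        n->prev->next = n->next;
    else
        l->head = n->next;      /* removed the first node */

    if (n->next != NULL)
        n->next->prev = n->prev;
    else
        l->tail = n->prev;      /* removed the last node */

    /* Next node if present, else the previous one, else NULL. */
    l->current = (n->next != NULL) ? n->next : n->prev;

    n->next = n->prev = NULL;
    return n;
}
```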
Include: stdlib.a or lists.a
Routine: PeekCur
-----------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
Registers on exit: DX:SI- Points at the current node (i.e., contains
a copy of CurrentNode), NIL if the list
is empty.
Flags on return: Carry set if the list was empty.
Examples of Usage:
les di, MyList
PeekCur
jc EmptyList
Description:
PeekCur simply returns CurrentNode in DX:SI (assuming the list is not empty).
If the list is empty, it returns the carry flag set and NIL in DX:SI.
It does not affect the value of CurrentNode.
Include: stdlib.a or lists.a
Routine: SetCur
----------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
CX- Node number of new current node.
Registers on exit: DX:SI- Returned pointing at selected node.
NIL if the list is empty. Points at the
last node in the list if the value in CX
is greater than the number of nodes in the
list.
Flags on return: Carry set if the list was empty.
Examples of Usage:
les di, MyList
mov cx, NodeNum
SetCur
jc EmptyList
Description:
SetCur locates the specified node in the list and sets CurrentNode to the
address of that node. It also returns a pointer to that node in DX:SI.
If CX is greater than the number of nodes in the list (or zero) then
SetCur sets CurrentNode to the last node in the list. If the list is
empty, SetCur returns NIL in DX:SI and returns with the carry flag set.
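The one-based numbering and the "zero or too large selects the last node"
rule can be modeled in C like this. It is a sketch of the documented
behavior only; the names are hypothetical and a NULL return stands in for
"carry set".

```c
#include <stddef.h>

/* Hypothetical C model of SetCur (not library code). */
struct node { struct node *next, *prev; };
struct list { struct node *head, *tail, *current; };

/* Select node number n (1 = first node). A value of zero, or one larger
   than the number of nodes, selects the last node. Returns NULL and
   leaves the list alone if the list is empty. */
static struct node *set_cur(struct list *l, unsigned short n)
{
    struct node *p = l->head;
    unsigned long steps = (n != 0) ? n : 65536UL;  /* zero acts as "huge" */

    if (p == NULL)
        return NULL;

    while (steps > 1 && p->next != NULL) {
        p = p->next;
        steps--;
    }
    l->current = p;
    return p;
}
```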
Include: stdlib.a or lists.a
Routine: NextNode
------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
Registers on exit: DX:SI- Returned pointing at selected node.
NIL if the list is empty. Points at the
last node in the list if the current node
was the last node in the list
Flags on return: Carry set if the current node was the last node
or the list was empty.
Examples of Usage:
les di, MyList
NextNode
jc EmptyOrEnd
Description:
NextNode modifies the CurrentNode pointer so that it points to the next
node in the list, if there is one. It also returns a pointer to that node
in DX:SI. If the list is empty, or CurrentNode points at the last node
in the list, NextNode returns with the carry flag set.
Include: stdlib.a or lists.a
Routine: PrevNode
------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
Registers on exit: DX:SI- Returned pointing at selected node.
NIL if the list is empty. Points at the
1st node in the list if the current node
was the 1st node in the list
Flags on return: Carry set if the current node was the 1st node
or the list was empty.
Examples of Usage:
les di, MyList
PrevNode
jc EmptyOr1st
Description:
PrevNode modifies the CurrentNode pointer so that it points to the previous
node in the list, if there is one. It also returns a pointer to that node
in DX:SI. If the list is empty, or CurrentNode points at the 1st node
in the list, PrevNode returns with the carry flag set.
Include: stdlib.a or lists.a
Routine: Insert (m)
--------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
DX:SI- Address of node to insert (Insert)
DX:SI- Pointer to data block to create node from
(Insertm).
CX- Number of node to insert DX:SI in front of;
Note that the list is one-based. That is,
the number of the first node in the list is
one. Zero corresponds to the last node in
the list.
Flags on return: Carry set if malloc error occurs (Insertm only).
Examples of Usage:
les di, MyList
ldxi NewNode
mov cx, 5
Insert ;Inserts before Node #5.
; The following example builds a new node on the heap from the data at
; location "RawData" and inserts this before node #5 in MyList.
les di, MyList
ldxi RawData
mov cx, 5
Insertm
jc MallocError
Description:
Insert(m) inserts a new node before a specified node in the list. The node to
insert in front of is specified by the value in the CX register. The first
node in the list is node #1, the second is node #2, etc. If the value in
CX is greater than the number of nodes in the list (in particular, if CX
contains zero, which gets treated like 65,536) then Insert(m) appends the
new node to the end of the list.
Insertm allocates a new node on the heap (DX:SI points at the data fields
for the node). If a malloc error occurs, Insertm returns the carry flag
set.
CurrentNode gets set to the newly inserted node.
Include: stdlib.a or lists.a
Routine: Append (m)
--------------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
DX:SI- Address of node to insert (Append)
DX:SI- Pointer to data block to create node from
(Appendm).
CX- Number of node to insert DX:SI after;
Note that the list is one-based. That is,
the number of the first node in the list is
one. Zero corresponds to the last node in
the list.
Flags on return: Carry set if malloc error occurs (Appendm only).
Examples of Usage:
les di, MyList
ldxi NewNode
mov cx, 5
Append ;Inserts after Node #5.
; The following example builds a new node on the heap from the data at
; location "RawData" and inserts this after node #5 in MyList.
les di, MyList
ldxi RawData
mov cx, 5
Appendm
jc MallocError
Description:
Append(m) inserts a new node after a specified node in the list. The node to
insert after is specified by the value in the CX register. The first
node in the list is node #1, the second is node #2, etc. If the value in
CX is greater than the number of nodes in the list (in particular, if CX
contains zero, which gets treated like 65,536) then Append(m) appends the
new node to the end of the list.
Appendm allocates a new node on the heap (DX:SI points at the data fields
for the node). If a malloc error occurs, Appendm returns the carry flag
set.
CurrentNode gets set to the newly inserted node.
Include: stdlib.a or lists.a
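The node-numbering rules shared by Insert and Append can be modeled in a few
lines. The following is only a sketch in Python, with an ordinary list
standing in for the linked list; the function names insert_before and
insert_after are hypothetical and are not part of the library. The return
value plays the role of CurrentNode (the number of the newly inserted node).

```python
def insert_before(nodes, new_node, cx):
    """Model of Insert: place new_node in front of node number cx (one-based)."""
    # CX = 0 is treated like 65,536, so it always exceeds the node count.
    position = cx if cx != 0 else 65536
    if position > len(nodes):
        nodes.append(new_node)        # past the end: append to the list
        return len(nodes)             # number of the newly inserted node
    nodes.insert(position - 1, new_node)
    return position


def insert_after(nodes, new_node, cx):
    """Model of Append: place new_node after node number cx (one-based)."""
    position = cx if cx != 0 else 65536
    if position > len(nodes):
        nodes.append(new_node)        # past the end: append to the list
        return len(nodes)
    nodes.insert(position, new_node)
    return position + 1
```

With a three-node list, insert_before(nodes, "x", 2) makes "x" the new node
#2, while insert_after(nodes, "x", 2) makes it node #3. Passing zero in
either call places the new node at the end of the list.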
Routine: Remove
----------------
Author: Randall Hyde
Category: List Manipulation
Registers on entry: ES:DI- Pointer to list.
CX- # of node to delete from list.
Registers on exit: DX:SI- Points at node removed from list (NIL if
no such node).
Flags on return: Carry set if the list was empty.
Examples of Usage:
les di, MyList
mov cx, NodeNumbr
Remove
jc EmptyList
Description:
Remove removes the specified node (given by CX) from the list
and returns a pointer to this node in DX:SI. If the list was empty, Remove
returns NIL in DX:SI and sets the carry flag.
This routine modifies CurrentNode so that it points at the next item in the
list (the node normally following the current node). If there is no such
node (i.e., CurrentNode pointed at the last node in the list upon calling
Remove) then this routine stores the value of the *previous* node into
CurrentNode. If you use this routine to delete the only remaining node in the
list, it sets CurrentNode to NIL before leaving.
Include: stdlib.a or lists.a
IBM/L 1.0
(Instruction Benchmarking Language)
This program lets you time sequences of instructions to see how much time
they *really* take to execute. The cycle timings in most 80x86 assembly
language books are horribly inaccurate as they assume the absolute best
case. IBM/L lets you try out some instruction sequences and see how
much time they really take.
IBM/L uses the system 1/18th second clock and measures most executions
in terms of clock ticks. Therefore, it would be totally useless for
measuring the speed of a single instruction (since all instructions execute
in *much* less than 1/18th second). IBM/L works by repeatedly executing
a code sequence thousands (or millions) of times and measuring that amount
of time. IBM/L automatically subtracts away the loop overhead time.
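The arithmetic behind this approach can be sketched as follows. This is a
Python illustration only, not IBM/L's own code: the function name and its
parameters are hypothetical, and how IBM/L actually measures its loop
overhead is not described here. The tick rate used is the standard PC timer
rate of roughly 18.2 ticks per second.

```python
# The PC timer interrupt fires 1193182 / 65536 times per second (about 18.2).
TICKS_PER_SECOND = 1193182 / 65536


def seconds_per_execution(measured_ticks, overhead_ticks, executions):
    """Estimate the time of one execution from a long timed run.

    measured_ticks: ticks counted for the whole loop (hypothetical input)
    overhead_ticks: ticks attributed to the empty loop (hypothetical input)
    executions:     how many times the timed sequence actually ran
    """
    net_ticks = measured_ticks - overhead_ticks
    return net_ticks / TICKS_PER_SECOND / executions
```

For example, a run of one million executions that takes 200 ticks, of which
18 are loop overhead, works out to roughly ten microseconds per execution.
A single tick is about 55 milliseconds, which is why one execution alone
cannot be measured this way.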
IBM/L is a very crude program; something like the "Zen Timer" (from
Michael Abrash's book "Zen of Assembly Language") would be more appropriate
if you need absolutely accurate timings. The intent of this program
is to give you a good feeling for which instructions are faster than
others.
IBM/L programs begin with an optional data section. The data section begins
with a line containing "#DATA" and ends with a line containing "#ENDDATA".
All lines between these two lines are copied to an output assembly language
program inside the DSEG data segment. Typically you would put global
variables into the program at this point. As a general rule, you should not
use names which begin with a period. IBM/L prefaces all its names with a
period and you could run into a conflict were you to use such names.
Note: if you are using MASM 6.0 or later, you must select the option which
allows identifiers to begin with a period. MASM 5.1 and earlier do not
have a problem with such identifiers.
Example of a data section:
#DATA
I dw ?
J dw ?
K dd ?
ch1 db ? ;"ch" by itself is a register name and cannot be a label
ch2 db ?
#ENDDATA
These lines would be copied to a data segment in the created program.
Note that these names would be available to *all* code sequences you
place in the following code sections.
Following the data section are one or more code sections. A code section
consists of optional #REPETITIONS and #UNRAVEL statements followed by the
actual #CODE / #ENDCODE sections.
The #REPETITIONS statement takes the following form:
#REPETITIONS value1, value2
(The "#" must be in column one). "value1" and "value2" must be 16-bit integer
constants (less than or equal to 65,535).
This statement instructs IBM/L to generate a loop which repeats the following
code segment (value1 * value2) times. I used two values so I could use 16-bit
arithmetic (easy to perform in C/FLEX/BISON). If you do not specify any
repetitions at all, the default is value1=65535 and value2=5. Once you set
a repetitions value, that value remains in effect for all following code
sequences until you explicitly change it again.
In general, the bigger the value you choose, the more accurate the timing will
be. However, as you choose larger and larger values for the repetitions, the
program code segments will take longer and longer to execute. Remember,
the generated assembly language program will repeat the code sequences in a
loop the specified (value1*value2) number of times. Short, simple, instruction
sequences will execute much faster than long, complex, instruction sequences.
If you are interested in the straight-line execution times for some
instruction(s), placing those instructions in a tight loop may dramatically
affect IBM/L's accuracy. Don't forget, executing a control transfer
instruction (necessary for a loop) flushes the pre-fetch queue and has a big
effect on execution times. The "#UNRAVEL" statement lets you copy a block of
code several times in place (like unravelling a loop) thereby reducing the
overhead of the conditional jump instructions controlling the loop. The
"#UNRAVEL" statement takes the following form:
#UNRAVEL count
(The "#" must be in column one). "count" is a 16-bit integer constant
denoting the number of times IBM/L is to repeat the code in place.
Note that the specified code sequence in the #CODE section will actually
be executed (count*value1*value2) times, since the #UNRAVEL statement
repeats the code sequence "count" times inside the loop.
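The execution count is a simple product, and it can be worth computing before
a run to judge how long a test will take. The helper below is a hypothetical
Python sketch, not part of IBM/L. It checks the 16-bit limit and multiplies
the three values. The defaults for value1 and value2 are the ones stated
above; the default of 1 for count is an assumption, since the text does not
give a default for #UNRAVEL.

```python
def total_executions(value1=65535, value2=5, count=1):
    """Return count * value1 * value2, the number of timed executions."""
    for name, value in (("value1", value1), ("value2", value2), ("count", count)):
        # Each operand must fit in an unsigned 16-bit integer constant.
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"{name} must be a 16-bit constant, got {value}")
    return count * value1 * value2
```

For instance, "#unravel 16" combined with "#repetitions 32, 30000" gives
total_executions(32, 30000, 16), which is 15,360,000 executions. With no
statements at all, the stated defaults give 65535 * 5 = 327,675.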
In its most basic form, the #CODE section looks like the following:
#CODE ("Title")
%DO
<assembly statements>
#ENDCODE
The title can be any string you choose. IBM/L will display this title
when printing the timing results for this code section. IBM/L will take
the specified assembly statements and output them (multiple times if the
#UNRAVEL statement specifies) inside a loop. At run time the generated
assembly language source file will time this code and present a count,
in ticks, for one execution of this sequence.
Example:
#unravel 16 Execute the sequence 16 times inside the loop
#repetitions 32, 30000 Do this 32*30000 times
#code ("MOV AX, 0 Instruction")
%do
mov ax, 0
#endcode
The above code would generate an assembly language program which executes
the MOV AX, 0 instruction 16 * 32 * 30000 times and reports the amount of
time that it takes.
Most IBM/L programs have multiple code sections. New code sections can
immediately follow the previous ones, e.g.,
#unravel 16 Execute the sequence 16 times inside the loop
#repetitions 32, 30000 Do this 32*30000 times
#code ("MOV AX, 0 Instruction")
%do
mov ax, 0
#endcode
#code ("XOR AX, AX Instruction")
%do
xor ax, ax
#ENDCODE
The above sequence would execute the MOV AX, 0 and XOR AX, AX instructions
16*32*30000 times and report the amount of time necessary to perform
these instructions. By comparing the results you can determine which
instruction sequence is fastest.
All IBM/L programs must end with a "#END" statement. Therefore, the
correct form of the program above is
#unravel 16 Execute the sequence 16 times inside the loop
#repetitions 32, 30000 Do this 32*30000 times
#code ("MOV AX, 0 Instruction")
%do
mov ax, 0
#endcode
#code ("XOR AX, AX Instruction")
%do
xor ax, ax
#ENDCODE
#END
An example of a complete IBM/L program using all of the techniques we've
seen so far is
#data
even
i dw ?
db ?
j db ?
#enddata
#unravel 16 Execute the sequence 16 times inside the loop
#repetitions 32, 30000 Do this 32*30000 times
#code ("Aligned Word MOV")
%do
mov ax, i
#endcode
#code ("Unaligned word MOV")
%do
mov ax, j
#ENDCODE
#END
There are a couple of optional sections which may appear between the
"#CODE" and the "%DO" statements. The first of these is "%INIT" which begins
an initialization section. IBM/L emits initialization sections before the
loop and does not count their execution time when timing the loop. This lets
you set up important values prior to running a test which do not count
towards the timing. E.g.,
#data
i dd ?
#enddata
#repetitions 5,20000
#unravel 1
#code
%init
mov word ptr i, 0
mov word ptr i+2, 0
%do
mov cx, 200
lbl: inc word ptr i
jnz NotZero
inc word ptr i+2
NotZero: loop lbl
#endcode
#end
The code in the "%INIT" section executes only once and does not affect the
timing.
Sometimes you may want to use the "#UNRAVEL" statement to repeat a section
of code several times. However, there may be some statements which you
only want to execute once on each loop (that is, without copying the code
several times in the loop). The "%eachloop" section allows this. Note that
the code executed in the "%eachloop" section is not counted in the final
timing.
Example:
#data
i dw ?
j dw ?
#enddata
#repetitions 2,20000
#unravel 128
#code
%init -- The following is executed only once
mov i, 0
mov j, 0
%eachloop -- The following is executed only 40000 times, not 128*40000 times
inc j
%do
inc i
#endcode
#end
In the above code, IBM/L only counts the time required to increment i. It does
not time the instructions in the %init or %eachloop sections.
The code in the %eachloop section executes only once per loop iteration, even
if you use the "#unravel" statement (the "inc i" instruction above, for
example, executes 128 times per loop iteration because of #UNRAVEL). Sometimes
you may want some sequence of instructions to execute like those in the %do
section, but not get timed. The "%discount" section allows for this.
Here is the full form of an IBM/L source file:
#DATA
<data declarations>
#ENDDATA
#REPETITIONS value1, value2
#UNRAVEL count
#CODE
%INIT
<Initialization code, executed only once>
%EACHLOOP
<Loop initialization code, executed once on each pass through the loop>
%DISCOUNT
<Untimed statements, executed each time the %DO section executes>
%DO
<The statements you want to time>
#ENDCODE
<additional code sections>
#END
There are several sample files which demonstrate each of these sections
included with this package.
--------------------------------
How to use IBM/L
IBM/L was created using FLEX and BISON. As per FSF's license (indeed, going
beyond what they request) this package includes all the sources (C, ASSEMBLY,
FLEX, and BISON) for the program. Feel free to modify it as you see fit.
To use this package you need several files. IBML.EXE is the executable
program. You run it as follows:
c:> IBML filename.IBM
This reads an IBML source file (filename.IBM, above) and writes an assembly
language program to the standard output. Normally you would use I/O
redirection to capture this program as follows:
c:> IBML filename.IBM >filename.ASM
Once you create the assembly language source file, you can assemble and run
it. The resulting EXE file will display the timing results.
To properly run the IBML program, you must have the "IBMLINC.A" file in the
current working directory. This is a skeleton assembly language source file
into which IBM/L inserts your assembly source code. Feel free to modify this
file as you see fit. Keep in mind, however, that IBM/L expects certain
markers in the file (currently ";##") where it will insert the code.
Be careful how you deal with these existing markers if you modify the
IBMLINC.A file.
The output assembly language source file assumes the presence of the
UCR Standard Library for 80x86 Assembly Language Programmers. In particular,
it needs the STDLIB include files (stdlib.a) and the library file (stdlib.lib).
These must be present (or in your INCLUDE/LIB environment paths) or MASM
will not be able to properly assemble the output assembly language file.
There is a batch file included in this package which demonstrates the steps
necessary to run IBM/L on a test file.
Character Set Routines
----------------------
The character set routines let you deal with groups of characters as a set
rather than a string. A set is an unordered collection of objects where
membership (presence or absence) is the only important quality. The stdlib
set routines were designed to let you quickly check if an ASCII character is
in a set, to quickly add characters to a set or remove characters from a set.
These operations are the ones most commonly used on character sets. The
other operations (like union, intersection, difference, etc.) are useful, but
are not as popular as the former routines. Therefore, the data structure
has been optimized for sets to handle the membership and add/delete operations
at the slight expense of the others.
Character sets are implemented via bit vectors. A "1" bit means that an item
is present in the set and a "0" bit means that the item is absent from the
set. The most common implementation of a character set is to use thirty-two
consecutive bytes, eight bits per byte, giving 256 bits (one bit for each
character in the character set). While this makes certain operations (like
assignment, union, intersection, etc.) fast and convenient, other operations
(membership, add/remove items) run much slower. Since these are the more
important operations, a different data structure is used to represent sets.
A faster approach is to simply use a byte value for each item in the set.
This offers a major advantage over the thirty-two byte scheme: for operations
like membership it is very fast (since all you have got to do is index into
an array and test the resulting value). It has two drawbacks: first,
operations like set assignment, union, difference, etc., require 256 operations
rather than thirty-two; second, it takes eight times as much memory.
The first drawback, speed, is of little consequence. You will rarely use the
operations so affected, so the fact that they run a little slower hardly
matters. Wasting 224 bytes is a problem, however, especially
if you have a lot of character sets.
The approach used here is to allocate 272 bytes. The first eight bytes
contain bit masks, 1, 2, 4, 8, 16, 32, 64, 128. These masks tell you which bit
in the following 264 bytes is associated with the set. This facilitates
putting eight sets into 272 bytes (34 bytes per character set). This provides
almost the speed of the 256-byte set with only a two byte overhead per set
(compared with the thirty-two byte bit vector). In the
stdlib.a file there is a macro that lets you define a group of character
sets: set. The macro is used as follows:
set set1, set2, set3, ... , set8
You must supply between one and eight labels in the operand field. These are
the names of the sets you want to create. The set macro automatically
attaches these labels to the appropriate mask bytes in the set. The actual
bit patterns for the set begin eight bytes later (from each label).
Therefore, the byte corresponding to chr(0) is staggered by one byte for each
set (which explains the other eight bytes needed above and beyond the 256
required for the set). When using the set manipulation routines, you should
always pass the address of the mask byte (i.e., the seg/offset of one of the
labels above) to the particular set manipulation routine you are using.
Passing the address of the structure created with the macro above will
reference only the first set in the group.
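The layout described above can be modeled as follows. This is a Python sketch
of the data structure only, under the assumption that set number k (0..7) has
its mask byte at offset k and its entry for character c at offset k + 8 + c,
as the text implies. The function names are hypothetical and do not
correspond to the library's routines.

```python
GROUP_SIZE = 272     # bytes in a group of eight character sets
MASK_BYTES = 8       # one mask byte per set at the start of the group


def create_sets():
    """Return a group of eight empty sets with the mask bytes filled in."""
    group = bytearray(GROUP_SIZE)
    for k in range(MASK_BYTES):
        group[k] = 1 << k            # 1, 2, 4, ..., 128
    return group


def _cell(set_index, ch):
    # The entry for chr(0) sits eight bytes past the set's mask byte,
    # so each set's data is staggered by one byte from its neighbor's.
    return set_index + MASK_BYTES + ord(ch)


def add_char(group, set_index, ch):
    group[_cell(set_index, ch)] |= group[set_index]


def remove_char(group, set_index, ch):
    group[_cell(set_index, ch)] &= ~group[set_index] & 0xFF


def member(group, set_index, ch):
    # Membership is one indexed load and one bit test against the mask.
    return bool(group[_cell(set_index, ch)] & group[set_index])
```

Because every set uses a different bit, the staggered sets can share the same
bytes without disturbing each other. The highest offset touched is
7 + 8 + 255 = 270, which fits within the 272 bytes allocated.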
Note that you can use the set operations for fast pattern matching
applications. The set membership operation, for example, is much faster than the
strspan routine found in the string package. Proper use of character sets
can produce a program which runs much faster than some of the equivalent
string operations.
Note: there is a special include file in the INCLUDE directory, STDSETS.A,
which contains the bit definitions for eight commonly-used character sets:
Alpha (upper and lower case alphabetics), lower (lower case alphabetics),
upper (upper case alphabetics), digits ("0".."9"), xdigits (hexadecimal
digits: "0"-"9", "a"-"f", and "A"-"F"), alphanum (upper/lower case alpha
and digits), whitespace (spaces, tabs, carriage returns, and linefeeds),
and delimiters (whitespace plus ",", ";", "<", ">", and "|").
If you want to use these standard character sets in your program you must
include the STDSETS.A file in an appropriate (data) segment. Note that
including STDLIB.A or CHARSETS.A will not give the standard sets. You must
explicitly place an include STDSETS.A in your program to have access to
these sets.
Routine: Createsets
--------------------
Category: Character Set Routine
Registers on Entry: no parameters passed
Registers on return: ES:DI - pointer to eight sets
Flags affected: Carry = 0 if no error. Carry = 1 if insufficient
memory to allocate storage for sets.
Example of Usage:
Createsets
jc NoMemory
mov word ptr SetPtr, di
mov word ptr SetPtr+2, es
Description: Createsets allocates 272 bytes on the heap. This is sufficient
room for eight character sets. It then initializes the first
eight bytes of this storage with the proper mask values for
each set. Location es:0[di] gets set to 1, location es:1[di]
gets 2, location es:2[di] gets 4, etc. The Createsets routine
also initializes all of the sets to the empty set by clearing
all the bits to zero.
Include: stdlib.a or charsets.a
Routine: EmptySet
------------------
Category: Character Set Routine
Registers on Entry: ES:DI - pointer to first byte of desired set
Registers on return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 3 ; Point at 4th set in group.
Emptyset
Description: Emptyset clears out the bits in a character set to zero
(thereby setting it to the empty set). Upon entry, es:di must
point at the first byte of the character set you want to clear.
Note that this is not necessarily the address returned by
Createsets, which is the address of the first set only. Each
of the first eight bytes of a character set structure is the
address of a different set. ES:DI must point at one of
these bytes upon entry into Emptyset.
Include: stdlib.a or charsets.a
Routine: Rangeset
------------------
Category: Character Set Routine
Registers on entry: ES:DI (contains the address of the first byte of the set)
AL (contains the lower bound of the items)
AH (contains the upper bound of the items)
Registers on return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 4
mov al, 'A'
mov ah, 'Z'
rangeset
Description: This routine adds a range of values to a set with ES:DI as the
pointer to the set, AL as the lower bound of the set, and
AH as the upper bound of the set (AH has to be greater than
AL; otherwise, there will be an error).
Include: stdlib.a or charsets.a
Routine: Addstr (l)
--------------------
Category: Character Set Routine
Registers on Entry: ES:DI- pointer to first byte of desired set
DX:SI- pointer to string to add to set (Addstr only)
CS:RET-pointer to string to add to set (Addstrl only)
Registers on Return: None
Flags Affected: None
Example of Usage:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov dx, seg CharStr ;Pointer to string
lea si, CharStr ; chars to add to set.
addstr ;Union in these characters.
;
les di, SetPtr ;Point at first set in group.
addstrl
db "AaBbCcDdEeFf0123456789",0
;
Description: Addstr lets you add a group of characters to a set by
specifying a string containing the characters you want in
the set. To Addstr you pass a pointer to a zero-terminated
string in dx:si. Addstr will add (union) each character
from this string into the set.
Addstrl works the same way except you pass the string as
a literal string constant in the code stream rather than
via DX:SI.
Include: stdlib.a or charsets.a
Routine: Rmvstr (l)
--------------------
Category: Character Set Routine
Registers on entry: ES:DI contains the address of first byte of a set
DX:SI contains the address of string to be removed
from a set (Rmvstr only)
CS:RET pointer to string to remove from set (Rmvstrl only)
Registers on return: None
Flags affected: None
Example of Usage:
les di, SetPtr
mov dx, seg CharStr
lea si, CharStr
rmvstr
;Rmvstrl takes its string from the code stream, so DX:SI is not needed.
rmvstrl
db "ABCDEFG",0
Description: Rmvstr removes each character of a string from a set. ES:DI
points at the first byte of the set and DX:SI points at the
string of characters to remove.
For Rmvstrl, the string of characters to remove from the
set follows the call in the code stream.
Include: stdlib.a or charsets.a
Routine: AddChar
-----------------
Category: Character Set Routine
Registers on Entry: ES:DI- pointer to first byte of desired set
AL- character to add to the set
Registers on Return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 1 ;Point at 2nd set in group.
mov al, Ch2Add ;Character to add to set.
addchar
Description: AddChar lets you add a single character (passed in AL)
to a set.
Include: stdlib.a or charsets.a
Routine: Rmvchar
-----------------
Category: Character Set Routine
Registers on entry: ES:DI (contains the address of first byte of a set)
AL (contains the character to be removed)
Registers on return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 7 ;Point at eighth set in group.
mov al, Ch2Rmv
Rmvchar
Description: This routine removes the character in AL from a set.
ES:DI points to the set's mask byte. The corresponding
bit in the set is cleared to zero.
Include: stdlib.a or charsets.a
Routine: Member
----------------
Category: Character Set Routine
Registers on entry: ES:DI (contains the address of first byte of a set)
AL (contains the character to be compared)
Registers on return: None
Flags affected: Zero flag (Zero = 0 if the character is in the set
Zero = 1 if the character is not in the set)
Example of Usage:
les di, SetPtr
add di, 1
mov al, 'H'
member
jne IsInSet
Description: Member is used to find out if the character in AL is in a set
with ES:DI pointing to its mask byte. If the character is in
the set, the zero flag is set to 0. If not, the zero flag is
set to one.
Include: stdlib.a or charsets.a
Routine: CopySet
-----------------
Category: Character Set Routine
Register on entry: ES:DI- pointer to first byte of destination set.
DX:SI- pointer to first byte of source set.
Register on Return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 7 ;Point at 8th set in group.
mov dx, seg SetPtr2 ;Point at first set in group.
lea si, SetPtr2
copyset
Description: CopySet copies the items from one set to another. This is a
straight assignment, not a union operation. After the
operation, the destination set is identical to the source set,
both in terms of the elements present in the set and absent
from the set.
Include: stdlib.a or charsets.a
Routine: SetUnion
------------------
Category: Character Set Routine
Register on entry: ES:DI - pointer to first byte of destination set.
DX:SI - pointer to first byte of source set.
Register on return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 7 ;point at 8th set in group.
mov dx, seg SetPtr2 ;point at 1st set in group.
lea si, SetPtr2
setunion
Description: The SetUnion routine computes the union of two sets.
That is, it adds all of the items present in a source set
to a destination set. This operation preserves items
present in the destination set before the SetUnion
operation.
Include: stdlib.a or charsets.a
Routine: SetIntersect
----------------------
Category: Character Set Routine
Register on entry: ES:DI - pointer to first byte of destination set.
DX:SI - pointer to first byte of source set.
Register on return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 7 ;point at 8th set in group.
mov dx, seg SetPtr2 ;point at 1st set in group.
lea si, SetPtr2
setintersect
Description: SetIntersect computes the intersection of two sets, leaving
the result in the destination set. The new set consists
only of those items which previously appeared in
both the source and destination sets.
Include: stdlib.a or charsets.a
Routine: SetDifference
-----------------------
Category: Character Set Routine
Register on entry: ES:DI - pointer to the first byte of destination set.
DX:SI - pointer to the first byte of the source set.
Register on return: None
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 7 ;point at 8th set in group.
mov dx, seg SetPtr2 ;point at 1st set in group.
lea si, SetPtr2
setdifference
Description: SetDifference computes the result of (ES:DI) := (ES:DI) -
(DX:SI). The destination set is left with its original
items minus those items which are also in the source set.
Include: stdlib.a or charsets.a
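The four whole-set operations (CopySet, SetUnion, SetIntersect and
SetDifference) differ only in the rule applied to each of the 256 character
positions. The Python sketch below models them on the 272-byte group layout
described earlier. It is an illustration under that layout assumption, and
every function name in it is hypothetical rather than a library entry point.
A set is identified by its group and its index (0..7) within the group.

```python
MASK_BYTES = 8                       # mask bytes at the start of each group


def new_group():
    """Return an empty 272-byte group with its eight mask bytes filled in."""
    group = bytearray(272)
    for k in range(MASK_BYTES):
        group[k] = 1 << k
    return group


def _has(group, index, code):
    return bool(group[index + MASK_BYTES + code] & group[index])


def _put(group, index, code, present):
    cell = index + MASK_BYTES + code
    if present:
        group[cell] |= group[index]
    else:
        group[cell] &= ~group[index] & 0xFF


def _combine(dst, di, src, si, rule):
    # One test and one update per character code: 256 steps per operation.
    for code in range(256):
        _put(dst, di, code, rule(_has(dst, di, code), _has(src, si, code)))


def copy_set(dst, di, src, si):
    _combine(dst, di, src, si, lambda d, s: s)


def set_union(dst, di, src, si):
    _combine(dst, di, src, si, lambda d, s: d or s)


def set_intersect(dst, di, src, si):
    _combine(dst, di, src, si, lambda d, s: d and s)


def set_difference(dst, di, src, si):
    _combine(dst, di, src, si, lambda d, s: d and not s)


def add_chars(group, index, chars):
    for ch in chars:
        _put(group, index, ord(ch), True)


def items(group, index):
    """Return the members of a set as a string, in character-code order."""
    return "".join(chr(c) for c in range(256) if _has(group, index, c))
```

With the destination holding "AB" and the source holding "BC", union leaves
"ABC" in the destination, intersection leaves "B", and difference leaves "A".
The source set is never modified.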
Routine: Nextitem
------------------
Category: Character Set Routine
Registers on entry: ES:DI (contains the address of first byte of the set)
Registers on return: AL (contains the first item in the set)
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 7 ;Point at eighth set in group.
nextitem
Description: Nextitem searches for the first character (item) in the set
whose mask byte ES:DI points at and returns that character
in AL. If the set is empty, AL will contain zero.
Include: stdlib.a or charsets.a
Routine: Rmvitem
-----------------
Category: Character Set Routine
Registers on entry: ES:DI (contains the address of first byte of the set)
Registers on return: AL (contains the first item in the set)
Flags affected: None
Example of Usage:
les di, SetPtr
add di, 7
rmvitem
Description: Rmvitem locates the first available item in the set whose
mask byte ES:DI points at and removes it. AL will
return the item removed. If the set is empty, AL will
return zero.
Include: stdlib.a or charsets.a
Pattern Matching Routines
-------------------------
The UCR Standard Library contains a very rich set of (character string)
pattern matching routines. These routines were designed to mimic the
pattern matching primitives found in the SNOBOL4 programming language, so
they are very powerful indeed.
These routines are actually quite simple. They derive their power through
the use of a recursive, backtracking pattern matching algorithm and a
properly specified data structure: the "pattern". The data type for a
pattern is the following:
Pattern struc
MatchFunction dd ?
MatchParm dd 0
MatchAlternate dd 0
NextPattern dd 0
EndPattern dw ?
StartPattern dw ?
StrSeg dw ?
Pattern ends
The "MatchFunction" field is a pointer to a (far) procedure which tests the
current characters in the string. You could write your own functions for
this purpose if you choose; however, several important routines are already
provided in this package.
"MatchParm" represents a four-byte value which the pattern matching algorithm
passes to the MatchFunction routine. Typically it is a pointer to a string
or a character set though it could be any four-byte value.
"MatchAlternate" is a pointer to an alternate pattern to try if the current
pattern fails to match the string.
"NextPattern" is a pointer to another pattern in the current pattern list.
This lets you concatenate patterns to form more complex patterns.
"EndPattern", "StartPattern", and "StrSeg" are words filled in by the pattern
matching routine so other code can locate the characters matched by this
particular pattern. In general, you should not modify these values.
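The memory layout of this structure can be checked with a short sketch. The
Python code below is an illustration, not library code: it assumes the usual
MASM packing with no padding (four doublewords followed by three words,
22 bytes in all) and the usual 8086 far-pointer layout with the offset stored
in the low word and the segment in the high word. The helper names are
hypothetical, and the three trailing words, which the structure leaves
uninitialized, are simply packed as zero here.

```python
import struct

# Four "dd" fields followed by three "dw" fields, little-endian, unpadded.
PATTERN = struct.Struct("<4I3H")

FIELDS = ("MatchFunction", "MatchParm", "MatchAlternate", "NextPattern",
          "EndPattern", "StartPattern", "StrSeg")


def far_ptr(segment, offset):
    """Combine a segment and an offset into one 32-bit far pointer value."""
    return ((segment & 0xFFFF) << 16) | (offset & 0xFFFF)


def pack_pattern(match_function, match_parm=0, match_alternate=0,
                 next_pattern=0):
    """Build the 22 bytes of one pattern; the last three words start at 0."""
    return PATTERN.pack(match_function, match_parm, match_alternate,
                        next_pattern, 0, 0, 0)


def unpack_pattern(raw):
    """Return the fields of a packed pattern as a name-to-value mapping."""
    return dict(zip(FIELDS, PATTERN.unpack(raw)))
```

Under these assumptions MatchFunction sits at offset 0, MatchParm at 4,
MatchAlternate at 8, NextPattern at 12, EndPattern at 16, StartPattern at 18
and StrSeg at 20. A MatchChar pattern for a period, for example, would carry
the character code in the low-order byte of MatchParm.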
The MatchFunction is where most of the work actually takes place. Currently
the standard library provides 19 different MatchFunctions:
Spancset
Brkcset
MatchStr
MatchiStr
MatchToStr
MatchChar
MatchToChar
MatchChars
MatchToPat
Anycset
NotAnycset
EOS
ARB
ARBNUM
Skip
POS
RPOS
GOTOpos
and RGOTOpos
Note: in order to gain access to these match functions you must place the
statement:
matchfuncs
somewhere in your program after including "pattern.a" or "stdlib.a". This
macro defines the externals for the match functions. Since these match
functions do not get called in the same manner as other standard library
routines, they do not get automatically linked with your programs. There
is a commented-out line in the "shell.asm" file which invokes this macro.
Uncomment this line and you'll be in great shape.
A brief description of the matching functions:
Spancset will match any number of characters belonging to a character set.
(This includes zero characters.) You specify the character set via a pointer
to a UCR Stdlib CSET. The pointer to this character set goes in the
"MatchParm" field of the above structure.
Brkcset will match any number of characters which are *not* in a character
set. That is, it will match up to a character in the specified character
set or until the end of the string, whichever comes first. Once again,
"MatchParm" contains a pointer to the character set to use.
MatchStr matches a specified string (something like the strcmp routine).
The "MatchParm" field points at the zero terminated string to match.
MatchiStr is like MatchStr except it converts the input string to uppercase
before comparing against the specified string (the specified string should
contain all upper case characters or it will not match).
MatchToStr matches all characters in a string up to and including the
string specified by the "MatchParm" field.
MatchChar matches a single character. This character must appear in the
L.O. byte of the "MatchParm" field. Note that this routine must match
exactly one character.
MatchChars matches zero or more occurrences of the same character. Again,
the character appears in the L.O. byte of the "MatchParm" field.
MatchToChar matches all characters up to, and including, the character
specified in the L.O. byte of the "MatchParm" field.
MatchToPat matches all characters up to, and including, the characters
matched by the pattern specified in the "MatchParm" field.
Anycset matches a single character from a character set. As usual, the
"MatchParm" field points at the character set to test.
NotAnycset matches a single character which is *not* in the specified
character set. Once again, "MatchParm" points at the character set to use.
EOS matches the end of the string (that is, the zero terminating byte).
ARB matches an arbitrary number of characters.
ARBNUM matches an arbitrary (zero or more) number of strings matching the
pattern specified in the "MatchParm" field.
Skip matches "n" arbitrary characters. The number of characters to skip is
specified in the L.O. word of the "MatchParm" field.
POS matches if the matching routine is currently at position "n" in the
string. "n" is given by the L.O. word of the "MatchParm" field.
RPOS matches if the matching routine is currently at position "n" from the
end of the string. Again, "n" is the L.O. word of the "MatchParm" field.
GOTOpos moves to the position in the string specified by the L.O. word of
the "MatchParm" field. This routine fails if it tries to move backwards in
the string or it attempts to move beyond the end of the string.
RGOTOpos moves to the position in the string "n" chars from the end of the
string ("n" being the L.O. word of "MatchParm"). Fails if this moves you
backwards in the string.
To understand how to use these routines, a little pattern matching theory
is in order. Pattern matching is quite similar to string comparison except
you don't need to match an exact string. Instead, you can specify arbitrary
*patterns* of characters to match. For example, an identifier in a high
level language like Pascal consists of an alphabetic character followed by
zero or more alphanumeric characters. You cannot perform a single string
comparison which will accept all possible Pascal identifiers (indeed, that
single comparison would only accept a single Pascal identifier). However,
you can easily create a pattern which will accept all Pascal ids:
Anycset(A-Za-z) Spancset(A-Za-z0-9)
Anycset above matches a single character from the specified set (alphabetic)
and the Spancset function matches zero or more characters from the alpha-
numeric character set. If you were to take a character string and *match*
it against the above pattern, the match would __succeed__ if the string is a
valid Pascal identifier and __fail__ if the string is not a valid
Pascal identifier.
Note, by the way, that the above pattern actually matches any string which
*begins* with a valid Pascal identifier. If the match routine exhausts
the pattern before exhausting characters in the string, it considers the
match successful. If you wanted to match *only* a Pascal identifier you
would use a pattern like the following:
Anycset(A-Za-z) Spancset(A-Za-z0-9) EOS
The "EOS" pattern matches the end of the string, so this pattern matches Pascal
identifiers only.
Of course, you can use pattern matching to perform string comparisons using
the MatchStr function:
MatchStr("Hello world") EOS
However, this is a frightfully expensive way to do a simple string comparison.
(Okay, it really isn't that bad, but it does take several times longer than,
say, strcmp.)
Although you wouldn't match character strings this way, using strings within
patterns is quite useful. Consider the following which matches
"value=<fp number>"
where "<fp number>" denotes a decimal value:
MatchStr("value=") Spancset(0-9) MatchChar('.') Spancset(0-9)
This pattern requires that the string begin with "value=<fp number>" though
anything could follow the decimal value in the string. If you wanted the
string to exactly match the above pattern, you could put an EOS at the end
of the pattern.
What if you wanted to test for the presence of the above pattern *anywhere*
in the string (not necessarily at the beginning)? This is easily accomplished
by the pattern:
ARB MatchStr("value=") Spancset(0-9) MatchChar('.') Spancset(0-9)
ARB matches an arbitrary number of characters. At first you might think
"gee, if it matches any number of characters, what's to prevent it from
matching everything to the end of the string and causing the rest of the
pattern to fail?" Well, simply put, the pattern matching algorithm tries
its absolute best to succeed. So ARB will match just enough characters
so that the string "value=<fp number>" will match the rest of the pattern,
if it is present in the string. To achieve this, the matching algorithm
uses *backtracking*. In a nutshell, backtracking works as follows:
the current pattern matches as many characters as it can (all of them in the
case of ARB). It then tries to match the remaining characters against the
rest of the pattern. If that fails, then it backs up and tries again
(logically you can think of ARB giving up one character at a time from the
end of the string until the remaining patterns match). If there is no match
after backing up back to the starting point, the whole pattern fails.
If this sounds expensive (slow), well, it is. That's why you would never
try and use the pattern matching primitives for simple string comparisons.
That's also why you should try to avoid two adjacent patterns which match
the same set of characters (since ARB matches anything, it will match the
same characters as any adjacent pattern, hence it may be slow).
Another important feature to this pattern matching system is the ability
to do *alternation*. Consider the following:
MatchStr("black") | MatchStr("blue")
The "|" symbol above denotes alternation, and is read as "or". This pattern
matches the string "black" *or* the string "blue". Okay, now consider the
following (very typical example in pattern matching):
[MatchStr("black") | MatchStr("blue")] [MatchStr("bird") | MatchStr("berry")]
The above pattern matches the strings "blackbird", "bluebird", "blackberry",
or "blueberry". The alternation operator lets you choose "black" or "blue"
from the first pattern and "bird" or "berry" from the second pattern.
This description of pattern matching theory could go on for quite some time,
but this is not the place for it. This discussion is intended to serve as
only an appetizer for you. If you want additional information on pattern
matching theory, pick up any SNOBOL4 manual, especially "Algorithms in
SNOBOL4" (by Gimpel, if I remember correctly). Also, copies of Vanilla
SNOBOL are floating around with an electronic manual. Among other things,
this contains the phone number and address of Catspaw, which sells SNOBOL4 and
ICON products for various systems. They sell several texts on SNOBOL4 and
ICON (which are pattern matching languages). Since these library routines
are based on SNOBOL4, taking a look at SNOBOL4 will provide some insight
into these routines.
Now let's talk about how to actually specify a pattern in assembly language
using the pattern matching routines. As you probably figured out already,
you don't get to use the nice (SNOBOL4-like) syntax used in the preceding
examples. Indeed, if there is any pain associated with pattern matching
in this package, it's setting up the patterns in the first place.
A pattern is a linked list of objects all of type "PATTERN" (see the
structure definition earlier). You must fill in all but the last three
fields of this structure (or live with the default value of zero).
The Pascal identifier pattern mentioned above would look something
like the following:
pasid pattern <Anycset,alphabetic,0,pasid2>
pasid2 pattern <Spancset,alphanum>
(note: alphabetic and alphanum are standard character sets available in
UCR standard library. See the section on character sets for details).
The first pattern matches a single character from the "alphabetic" character
set. The second pattern matches zero or more characters from the "alphanum"
character set. In both cases the alternate field is zero (NULL/NIL) because
there is no alternate to match. In the first pattern above, the NextPattern
field contains the link to the "pasid2" pattern which concatenates the two
patterns. Pasid2 has no link field since it is the last pattern in the
list (if the field is not present, it defaults to zero/NIL/NULL).
To actually match a string against a pattern you load ES:DI with the address
of the string to test and DX:SI with the address of the first pattern in
your pattern list. CX contains the offset of the last character you wish
to check in the string, plus one (set CX to zero if you want to match all
characters in the string). Then you execute the "match" procedure. On return,
the carry flag denotes success (C=1) or failure (C=0), and AX contains the
offset of the character in the string immediately after the match.
lesi MyString
ldxi MyPattern
mov cx, 0
match
jc TheyMatched
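As a slightly fuller sketch, here is the "Pascal identifier only" pattern from
the theory discussion combined with the calling sequence above. The labels
(IdOnly, IdStr, NotAnId) are made up for this illustration, and "alphabetic"
and "alphanum" are assumed to be the standard character sets mentioned earlier.

```asm
; In your data segment (hypothetical labels):

IdOnly      pattern <Anycset,alphabetic,0,IdOnly2>  ;One letter...
IdOnly2     pattern <Spancset,alphanum,0,IdOnly3>   ;...zero or more letters/digits...
IdOnly3     pattern <EOS>                           ;...and then the end of the string.
IdStr       db      "Count2",0

; In your code segment:

            lesi    IdStr           ;ES:DI points at the string to test.
            ldxi    IdOnly          ;DX:SI points at the first pattern.
            mov     cx, 0           ;Zero means "match the entire string".
            match
            jnc     NotAnId         ;C=0: the string is not a lone identifier.
                                    ;C=1: AX is the offset just past the match.
```

Dropping the IdOnly3 link (and the EOS pattern) gives you the "begins with an
identifier" behavior described in the theory section.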
By default, all patterns match characters at the beginning of the string.
As you've seen earlier in the document, you can use ARB to allow the
pattern to match at some other point in the string. For example, the following
matches any string which contains one or more alphabetic characters followed
by one or more decimal digits:
HasAnAlpha pattern <ARB,0,0,HAA2>
HAA2 pattern <Anycset,alphabetic,0,HAA3>
HAA3 pattern <Spancset,alphabetic,0,HAA4>
HAA4 pattern <Anycset,digits>
Since ARB does not require any parameters, you can use any value for the
second parameter to "pattern". Zero is as good a value as any other.
Note that "Spancset" by itself would not be sufficient for the alphabetic
matches. Spancset matches zero or more characters. We need to match one
or more characters. That is why the pattern above needs the Anycset and
Spancset patterns.
The above pattern data type matches its pattern *anywhere* in the target
string. If you wanted to force a match at the end of the string, you could
use the following pattern:
HasAnAlpha pattern <ARB,0,0,HAA2>
HAA2 pattern <Anycset,alphabetic,0,HAA3>
HAA3 pattern <Spancset,alphabetic,0,HAA4>
HAA4 pattern <Anycset,digits,0,HAA5>
HAA5 pattern <Spancset,digits,0,HAA6>
HAA6 pattern <EOS>
EOS doesn't require a parameter, so the above lets all the fields except the
function name default to zero.
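The same technique covers the "value=<fp number>" example from the theory
discussion. The following is only a sketch: the labels are invented for the
illustration and "digits" is assumed to be the same character set used in the
patterns above.

```asm
; ARB MatchStr("value=") Spancset(0-9) MatchChar('.') Spancset(0-9)

ValAny      pattern <ARB,0,0,ValAny2>               ;Skip to wherever the rest matches.
ValAny2     pattern <MatchStr,ValueEq,0,ValAny3>    ;The literal text "value=".
ValAny3     pattern <Spancset,digits,0,ValAny4>     ;Zero or more digits.
ValAny4     pattern <MatchChar,'.',0,ValAny5>       ;The decimal point.
ValAny5     pattern <Spancset,digits>               ;Zero or more digits.

ValueEq     db      "value=",0
```

Because Spancset is happy to match zero characters, this sketch also accepts
"value=." with no digits at all. If you need at least one digit on either side
of the decimal point, put an Anycset pattern in front of the Spancset, just as
the alphabetic example above does.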
The MatchAlternate field contains the address of a pattern to match if the
current pattern fails (and *only* if the current pattern fails). Consider
the blue/blackbird/berry pattern described earlier. You can easily implement
that pattern with the following statements:
BB pattern <MatchStr,black,bluepat,BB2>
BB2 pattern <MatchStr,berry,birdpat>
bluepat pattern <MatchStr,blue,0,BB2>
birdpat pattern <MatchStr,bird>
black db "black",0
blue db "blue",0
bird db "bird",0
berry db "berry",0
If you match BB against the string "blackberry" BB will match the black,
then it will go to BB2 which matches the berry. This string doesn't use
either alternate.
If you match BB against the string "blueberry" BB immediately fails, so
it tries the alternate pattern, bluepat. Bluepat matches blue and then
goes on to the BB2 pattern which matches the berry.
If you match BB against the string "blackbird", BB matches black and then
tries to match BB2 against bird. BB2 fails and tries its alternate (birdpat)
which matches the characters "bird".
If you match BB against the string "bluebird", BB fails and tries its alter-
nate, bluepat. Bluepat matches "blue" and passes control to its next pattern,
which is BB2. BB2 tries to match "bird" and fails, so it passes control to
its alternate, birdpat, which matches bird.
The example above is pretty straightforward as far as alternation is concerned.
You can create some very sophisticated patterns with the alternation field.
For example, consider the following generic pattern:
[pat1 | ] [pat2 | pat3]
"[pat1 | ]" is an easy way of saying that pat1 is optional. You can easily
create this pattern as follows:
Pat1 pattern <pat1func,pat1parm,pat2,pat2>
Pat2 pattern <pat2func,pat2parm,pat3>
Pat3 pattern <pat3func,pat3parm>
If you match a string against Pat1 and the beginning of the string does not
match Pat1, then it tries Pat2 instead (and if that fails, it tries Pat3
before failing). If the string begins with a pattern matched by Pat1, the
matching algorithm then looks to see if the characters matching Pat1 are
followed by some character matching Pat2 or, alternately, Pat3.
There are all kinds of tricky ways you can use the alternation field to
create complex patterns and control the precedence of the pattern matching
algorithm. This short document cannot even begin to describe the
possibilities. You will need to experiment with this capability to discover
its true potential.
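To make the optional-pattern trick a little more concrete, here is a small
sketch which matches an optional minus sign followed by one or more decimal
digits, that is, [MatchChar('-') | ] Anycset(0-9) Spancset(0-9). The labels are
invented for this illustration and "digits" is assumed to be the character set
used in the earlier examples.

```asm
; The alternate field and the next field of OptSign both point at Digit1:
; if the '-' is there we eat it and go on to Digit1; if it is not there
; we simply try Digit1 at the same position.

OptSign     pattern <MatchChar,'-',Digit1,Digit1>
Digit1      pattern <Anycset,digits,0,Digit2>       ;At least one digit...
Digit2      pattern <Spancset,digits>               ;...then any further digits.
```

You would pass the address of OptSign to MATCH. Strings such as "-42" and "42"
should both succeed; a lone "-" should fail because Digit1 cannot match.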
Creating Your Own Pattern Functions
-----------------------------------
Although the UCR Stdlib pattern matching routines include many of the
functions you'll typically want to use for pattern matching, it's quite
possible you'll want to write your own pattern matching functions. This
is actually quite easy to do. The matching functions are all far procedures
which the Match procedure calls with the following parameters:
ES:DI- Points at the first character of the character string the function
should match against. The match function should never look at
characters before this string and it should not look beyond the end
of the string (which is marked by a zero terminating byte).
DS:SI- Contains the four-byte parameter found in the MatchParm field.
CX- Contains the last position plus one in the string you're allowed
to compare. Note that this may or may not point at the zero term-
inating byte. You must not scan beyond this character. Generally,
you can assume the zero terminating byte is at or after this location
in the string.
On return, AX must contain the address (offset into the string) of the last
character matched *plus one*. After your pattern matches, additional patterns
following in the pattern list will begin their matching at location ES:AX.
You must also return with the carry set if your match succeeded and with
the carry clear if your match failed.
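The following sketch shows about the smallest match function that obeys these
rules. MatchRange is a made-up name (it is not part of the library); it matches
a single character lying in a range whose lower bound is in the L.O. byte of
"MatchParm" and whose upper bound is in the next byte. That parameter layout is
purely an assumption of this example.

```asm
; MatchRange- matches one character in the range lo..hi.
; On entry SI holds the L.O. word of MatchParm: low byte = lo, high byte = hi.

MatchRange  proc    far             ;Must be a far proc!
            push    bx
            cmp     di, cx          ;Never look at or beyond position CX.
            jae     MRFail
            mov     bx, si          ;BL = lower bound, BH = upper bound.
            mov     al, es:[di]     ;Get the character at the match cursor.
            cmp     al, bl
            jb      MRFail
            cmp     al, bh
            ja      MRFail
            mov     ax, di
            inc     ax              ;AX = last char matched plus one.
            pop     bx
            stc                     ;Return success.
            ret

MRFail:     mov     ax, di          ;Return the failure position.
            pop     bx
            clc                     ;Return failure.
            ret
MatchRange  endp

; 6661h: low byte 61h = 'a', high byte 66h = 'f'.

HexLower    pattern <MatchRange,6661h>
```

Note that the function preserves every register it modifies except AX, which
carries the result back to the matching routine.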
Note that the MATCH procedure is fully recursive and reentrant. So you can
call MATCH recursively from inside your match function. This helps make
writing your own match routines much easier. (note: actually, you need to
call MATCH2, which is the reentrant version, from inside your match functions.)
As an example, let's consider the example above where we wanted to match
a string of one or more alphabetic characters followed by one or more
digits anywhere in a string. Consider the following pattern:
HAA pattern <ARB,0,0,HAA2>
HAA2 pattern <MatchAlpha,0,0,HAA3>
HAA3 pattern <MatchDigits>
The MatchAlpha and MatchDigits pattern functions are not provided in the
standard library, so we will have to write them. MatchAlpha matches one or
more alphabetic characters; MatchDigits matches one or more decimal digits.
Here are the routines that implement these two functions:
; Note that ES:DI & CX are already set up for these routines by the
; Match procedure.
MatchAlpha proc far ;Must be a far proc!
push dx
push si ;Preserve modified registers.
ldxi Alpha1 ;Get pointer to "Match one or more
match2 ; alpha" pattern and match it.
pop si
pop dx
ret
MatchAlpha endp
MatchDigits proc far ;Must be a far proc!
push dx
push si ;Preserve modified registers.
ldxi Digits1 ;Get pointer to "Match one or more
match2 ; digits" pattern and match it.
pop si
pop dx
ret
MatchDigits endp
Alpha1 pattern <Anycset,alpha,0,Alpha2>
Alpha2 pattern <Spancset,alpha>
Digits1 pattern <Anycset,digits,0,Digits2>
Digits2 pattern <Spancset,digits>
Note that the MatchAlpha and MatchDigits patterns do not require any
parameters from the MatchParm field, they intrinsically know what they
need to use.
Another way to accomplish the above is to write a generic "one or more
occurrences of a pattern" type of pattern. The following code implements
this:
; Assume the "MatchParm" field contains a pointer to the pattern we
; want to repeat one or more times:
OneOrMore proc far
push dx
push di
mov dx, ds ;Point DX:SI at pattern.
match2 ;Make sure we get at least 1.
jnc Fails
MatchMore: mov di, ax ;Move on in string.
match2
jc MatchMore
mov ax, di ;Return end of last successful match.
pop di
pop dx
stc ;Return success
ret
Fails: pop di
pop dx
clc ;Return failure
ret
OneOrMore endp
A pattern which would match one or more alphabetics with this would be:
Alpha1ormore pattern <OneOrMore,alphaset>
AlphaSet pattern <Anycset,alpha>
You would specify the "Alpha1ormore" pattern to match one or more alphabetic
characters.
Of course, you can write any arbitrary function you choose for your match
function; you do not need to call MATCH2 from within your match function.
For example, a simple routine which matches one or more alphabetics followed
by one or more digits could be written as follows:
AlphaDigits proc far
push di
cmp di, cx
jae Failure
mov al, es:[di]
and al, 5fh ;Convert l.c. -> U.C.
cmp al, 'A'
jb Failure
cmp al, 'Z'
ja Failure
DoTheMore0: inc di
cmp di, cx
jae Failure
mov al, es:[di]
and al, 5fh
cmp al, 'A'
jb TryDigits
cmp al, 'Z'
jbe DoTheMore0
TryDigits: mov al, es:[di]
xor al, '0' ;See if in range '0'..'9'
cmp al, 10
jae Failure
DoTheMore1: inc di
cmp di, cx
jae Success
mov al, es:[di]
xor al, '0'
cmp al, 10
jb DoTheMore1
Success: mov ax, di ;Return ending posn in AX.
pop di
stc ;Success!
ret
Failure: mov ax, di ;Return failure position.
pop di
clc ;Return failure.
ret
AlphaDigits endp
Note that the pattern matching function must return the failure position in
AX. Also note that the routine must *not* search beyond the point specified
in the CX register. These points did not appear in the previous code because
all of that was handled automatically by the MATCH2 routine. Of course,
the matching function must set or clear the carry flag depending upon the
success of the operation.
There is a really sneaky way to simulate the use of parentheses in a pattern
to override the normal left-to-right evaluation of a pattern. The SL_MATCH2
routine (which is what MATCH2 winds up calling) works quite well as a pattern
matching function. Consider the following pattern:
ParenPat pattern <sl_match2,HAA>
Now the ParenPat pattern will match anything the HAA pattern matches, however,
the system treats all of HAA as a single pattern (inside ParenPat) rather
than as a list of concatenated patterns. This is really important when using
PATGRAB to extract portions of a pattern. Patgrab can only extract the
characters belonging to a single pattern data structure (not a list). However,
ParenPat above is a single pattern data structure which maintains the infor-
mation for the entire string matched by HAA. Therefore, using patgrab on
ParenPat will extract the entire string matched by HAA.
Parenthetical operations can also simplify other patterns. Keep in mind,
however, that the Alternate pointer field in the pattern structure can
also be used to simulate parenthetical patterns without the expense of
generating new patterns (see the previous examples).
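Here is a sketch of how a parenthetical pattern and patgrab might work
together. All of the labels are invented for this illustration; "alpha" and
"digits" are assumed to be the character sets used in the examples above, and
PUTS, PUTCR, and FREE are assumed to be the usual standard library routines
for printing the string at ES:DI, printing a new line, and releasing heap
storage.

```asm
; In your data segment (hypothetical labels):

FindPart    pattern <ARB,0,0,Part>          ;Skip ahead in the string.
Part        pattern <sl_match2,Part1>       ;(Part1 Part2 Part3 Part4) as one unit.
Part1       pattern <Anycset,alpha,0,Part2>
Part2       pattern <Spancset,alpha,0,Part3>
Part3       pattern <Anycset,digits,0,Part4>
Part4       pattern <Spancset,digits>
TestStr     db      "item: abc123 end",0

; In your code segment:

            lesi    TestStr
            ldxi    FindPart
            mov     cx, 0           ;Match the entire string.
            match
            jnc     NoMatch
            lesi    Part            ;Grab everything matched inside the parens.
            patgrab
            jc      NoMemory        ;Carry set: no room on the heap.
            puts                    ;Print the extracted substring.
            putcr
            free                    ;Give the heap string back.
```

Exactly where the grabbed substring starts depends on how far ARB ends up
backing off, so treat this as an illustration of the mechanism rather than a
statement about which characters you will get.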
A Note About Performance:
There are two aspects to efficiency programmers worry about: speed and memory
usage. In the worst case, this pattern matching package does not fare well
in either category. It is possible that the routines in this package could
consume several kilobytes of stack space while matching a string; it is also
possible that the matching would take so long that it's impractical to use
this package. Fortunately, these worst case scenarios are pathological
cases which rarely occur (if ever, except in synthetic programs).
Space is the first issue to address. Each call to Match/Match2 can push
upwards of fifty bytes onto the stack. This is in addition to the stack
space required by the low-level matching function. Few of the built-in
matching functions push more than six or eight bytes, but you could write
your own matching function which pushes more. It is very easy to design
a (synthetic) pattern which forces a nested, recursive, call of MATCH2 for
each character in the string you are going to match. In such a case, this
package could require upwards of n*50 bytes on the stack where "n" is the
length of the string. For a string with 100 characters, you'd probably
need 5K of stack space for the pattern matching routines alone.
In general, patterns would rarely exhibit this type of behavior. Most low-
level pattern matching functions match several characters at once. Further-
more, you almost never encounter patterns in real life which require a
recursive call for each character in the string. Instead, most complex
patterns consist of simpler patterns concatenated together in a list. This
does not require a nested recursive call for each character in the string,
rather, the package makes a call, matches some characters, then returns; next,
the routines call the next sub-pattern in a similar fashion. Note however,
that the state (stack space) for the previous sub-pattern has been reclaimed
from the stack at that point.
In practice, a typical pattern might require as much as 1K free stack space.
However, keep in mind that this is a "typical worst case value" not an
absolute worst case value. Of course, you can control the amount of stack
space the pattern matching algorithms use by avoiding recursive pattern
definitions and avoiding parenthetical patterns (which stack up machine
state recursively) in complex patterns. Of course, limiting the size of
the strings you're matching the pattern against will help as well.
Generally, the stack space used by the pattern matching algorithm is of
little concern. Setting aside an extra kilobyte of memory, even five
kilobytes of memory, isn't a big problem for most programmers. Speed,
on the other hand, can be a problem. This pattern matching package uses
a generalized backtracking pattern matching algorithm. You can easily
devise a pattern which runs in O(x**n) time where "x" is an arbitrary value
you get to pick (basically the number of possible alternations on each
sub-pattern in the pattern) and "n" is the length of the string you match.
The "O()" function notation simply means that if it takes "m" units of time
with a string of length "n", it will take "m*x" units of time for a string
of length "n+1". Not very good. For some patterns it could easily take
longer than your lifetime to match a string whose length is 100 characters.
Once again, we're fortunate in the sense that this terrible performance occurs
so rarely you need not be too concerned about it. To achieve such bad per-
formance requires a specially prepared pattern matching a specially prepared
string. Such combinations do not normally exist in nature! However, while
100 year matching times may not occur much, most programmers are interested
in having their patterns match in milliseconds. It is very easy to devise
a pattern which takes seconds, minutes, or possibly even hours, to match
some types of strings (times measured in seconds are very likely for some
common strings and patterns, minutes are rather rare, and hours are very rare,
but still much more likely than the 100 year problem). The really sad part is
that slow matching times are almost always due to a poor choice of pattern
rather than an intrinsic problem with the pattern matching algorithm.
Typically, all performance problems are directly related to the amount of
backtracking which must occur to match a pattern. Backtracking almost always
occurs when you have two adjacent sub-patterns in a pattern where the
end of the first sub-pattern matches strings from the same set as the front
of the second sub-pattern. Consider the following pattern:
spancset(a-z " " ",") matchstr("hello")
If you match this pattern against the string "hi there, hello" the first
spancset pattern matches the entire string. The matchstr function would then
fail. At this point, backtracking kicks in and backs up one character in the
string. So the spancset function matches "hi there, hell" and the matchstr
function fails on "o". So the algorithm backs up one more character and
tries again. It keeps doing this until it backs up all the way to the
point where spancset matches "hi there, " and matchstr finally matches
"hello". To get around this problem, a better choice of pattern is probably
in order. Consider the following which generally does the same thing:
matchtostr("hello")
Matchtostr skips over all characters in a string up to the string "hello"
and leaves the "match cursor" pointing just beyond the "o" in "hello".
This matches almost what the previous pattern will match but it does it a
whole lot faster. It doesn't exactly match the previous pattern because
matchtostr matches any characters, not just (a-z " " ",") but for most
purposes this is exactly what you want anyway.
ARB is probably one of the worst offenders. Anytime you use ARB (with
a sub-pattern following it in the pattern list), you are almost guaranteed
that backtracking will occur. Therefore, you should attempt to avoid the
use of ARB in patterns where performance is a consideration.
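As a rough illustration, compare the two pattern lists below. Both look for
the text "value=" somewhere in the string. The labels are invented for this
sketch.

```asm
; Slower: ARB grabs characters and the matcher must back up, one
; character at a time, until MatchStr finally succeeds.

SlowPat     pattern <ARB,0,0,SlowPat2>
SlowPat2    pattern <MatchStr,ValueEq>

; Faster: a single primitive scans forward for the string, so no
; backtracking is needed to find it.

FastPat     pattern <MatchToStr,ValueEq>

ValueEq     db      "value=",0
```

In both cases a successful match leaves the match cursor just beyond the "="
so any sub-patterns you link on afterwards pick up at the same place.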
In general, you can often redesign a pattern data structure to avoid
overlaps in adjacent sub-patterns. Patterns which do not have such conflicts
will generally have reasonable performance. Of course, designing such patterns
takes more effort and testing; so it may not be worth the effort for quick
and dirty projects. On the other hand, if you execute the pattern match more
than a few times (or the pattern matching starts to take minutes rather than
seconds) it's probably worthwhile to redo the pattern.
Of course, this pattern matching package is not suitable for all pattern
matching tasks. The whole purpose of this package was to make pattern
matching in assembly language easy, not fast. You would never, for example,
want to write a lexical analyzer for a compiler using this package. It
would be too slow; tools like LEX and FLEX produce faster (much faster)
lexical analyzers, and it's easier to create them with LEX/FLEX
as well. Ditto for parsers using BISON or YACC. Likewise, there are many
times when pattern matching languages like AWK, ICON, SNOBOL4, etc., are
more appropriate than using the routines in this package. This package is
really intended for small pattern matching tasks inside a larger assembly
language program. For example, parsing command line parameters or
parsing input lines typed by the user to request some activity in your
assembly language program. While it's certainly possible to write a program
whose sole purpose is to solve some pattern matching problem, using this
package may not provide any better performance than, say, a SPITBOL (compiled
SNOBOL4) program and it would probably take you longer to write than the
comparable SNOBOL4 program.
Routine: MATCH
---------------
Category: Pattern Matching Routine
Author: Randall Hyde
Registers on Entry: ES:DI - pointer to source string
DX:SI - pointer to pattern
CX- offset of last valid position+1 in string.
Zero to match entire string.
Registers on return: AX- Position in string where the pattern stopped
matching.
Flags affected: carry- 0 denotes failure to match pattern.
1 denotes success.
Example of Usage:
lesi StringToTest
ldxi PatternToMatch
mov cx, 0 ;Match entire string.
MATCH
jnc DidNotMatch
Description:
MATCH is the general purpose matching subroutine provided in the standard
library to perform pattern matching. On entry, DX:SI must point at a
pattern (list) data structure. See the pattern.a include file (and the
documentation preceding this page) for more details on this data structure.
Also on entry, ES:DI should point at the first character where the pattern
matching is to begin. This need not be the beginning of a string, ES:DI could
point into the middle of a string; however, the pattern matching begins at
location ES:DI.
ES:CX, on entry, must point at the last byte to check *plus one*. This
typically points at the zero terminating byte of a string, but it could
point at some character in the string before the zero terminating byte.
If CX contains zero upon entry to the MATCH routine, the MATCH code will
automatically point CX at the zero byte in the string pointed at by ES:DI.
Include: stdlib.a or pattern.a
Routine: MATCH2
----------------
Category: Pattern Matching Routine
Author: Randall Hyde
Registers on Entry: ES:DI - pointer to source string
DX:SI - pointer to pattern
CX- offset of last valid position+1 in string.
Registers on return: AX- Position in string where the pattern stopped
matching.
Flags affected: carry- 0 denotes failure to match pattern.
1 denotes success.
Example of Usage:
;Typical usage in a matching function:
ldxi NewPattern
MATCH2
Description:
MATCH2 is a special, reentrant, version of MATCH. You would normally *not*
call this routine to perform pattern matching from your main program.
Instead, this routine is intended for use inside pattern matching functions
you write yourself. Please see the accompanying documentation for more
details.
Include: stdlib.a or pattern.a
Routine: patgrab
-----------------
Category: Pattern Matching Routine
Author: Randall Hyde
Registers on Entry: ES:DI - pointer to a pattern structure
Registers on return: ES:DI - String (on heap) corresponding to the chars
matched by the pattern.
Flags affected: carry- set if insufficient space on heap to allocate
the string.
Example of Usage:
lesi SomeString
ldxi SomePattern
mov cx, 0 ;Match entire string.
Match
lesi SomePattern
patgrab
Description:
You use patgrab to extract the substring matched by some particular pattern.
You always call this routine *after* calling MATCH. Match stores pointer
information away in the pattern data structure, patgrab extracts this infor-
mation and builds a string to your specifications.
To grab a string which spans several sub-patterns, you can use strcat to
combine the strings or a parenthetical pattern (see the documentation
preceding these routine descriptions for details).
Include: stdlib.a or pattern.a
Routine: spancset
------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, spancset is only invoked in a pattern data structure.
You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
SCexample pattern <spancset,alpha>
Description:
Spancset will skip over zero or more characters from the character set (cset)
specified by the "matchparm" field (the second operand above). This routine
always succeeds and returns the "match cursor" pointing at the first character
position beyond the matched characters in the source string.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: brkcset
-----------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, brkcset is only invoked in a pattern data structure.
You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
BCexample pattern <brkcset,alpha>
Description:
Brkcset skips over all characters which are *not* in the character set passed
in the "MatchParm" parameter. This routine always succeeds. It stops with
the match cursor pointing at the first character found in the specified
character set (it does not "eat" that character).
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
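Since brkcset always succeeds, it is usually followed by a pattern which
checks that the character it stopped on is really there. A small sketch
(the labels are invented, and "digits" is assumed to be the standard decimal
digit character set):

```asm
; Skip everything up to the first decimal digit, then insist that a
; digit is actually present at that position.

ToDigit     pattern <brkcset,digits,0,ADigit>
ADigit      pattern <Anycset,digits>
```

If the string contains no digits at all, ToDigit still succeeds but ADigit
fails, so the pattern list as a whole fails.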
Routine: matchstr
------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, matchstr is only invoked in a pattern data structure.
You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
MSexample pattern <matchstr,str2match>
Str2Match db "String to match",0
Description:
Matchstr compares the next characters in the source string against the
string pointed at by the "MatchParm" parameter. If the next set of char-
acters in the source string match, this routine succeeds and returns the
match cursor pointing one character beyond the matched string in the
source string. If the characters do not match, this routine fails and
does not modify the match cursor.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: matchtostr
--------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, matchtostr is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
MTSexample pattern <matchtostr,str2match>
Str2Match db "String to match",0
Description:
MatchToStr matches all characters in a string up to *and including* the
string specified by the "MatchParm" parameter. Note that this is a very
fast (comparatively) routine and is much faster than something like
ARB followed by MatchStr (or some other combination which will force
backtracking). This routine fails if it cannot find the specified string
in the source string (beyond the match cursor position).
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: matchchar
-------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, matchchar is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
MCexample pattern <matchchar, 'a'>
Description:
Matchchar tests a single character at the current match cursor position.
If the character in the L.O. byte of "MatchParm" is equal to the current
character in the source string, this routine passes over that character
in the string and returns success. Otherwise, it does not advance the
match cursor and returns failure.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: matchtochar
---------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, matchtochar is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
MTCexample pattern <matchtochar, 'a'>
Description:
MatchToChar matches all characters up to *and including* the character
specified by the L.O. byte of "MatchParm". It succeeds if it finds the
character in the string (in which case it returns the match cursor pointing
just beyond the specified character). It fails otherwise.
This is a relatively fast matching routine and should be used in place of
something like ARB followed by MATCHCHAR.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: matchchars
--------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, matchchars is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
MCsexample pattern <matchchars, 'a'>
Description:
This routine matches zero or more occurrences of the specified character
starting at the match cursor position. It always returns success. If it
matches one or more characters it leaves the match cursor pointing beyond
the last matched character.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
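The three single-character routines above (MatchChar, MatchToChar and
MatchChars) differ only in how far they move the match cursor. The C sketch
below models that difference. The function names are hypothetical, NULL stands
in for "failure", and treating a zero byte as a character that never matches
is an assumption of this sketch, not something the library documents.

```c
#include <stddef.h>
#include <string.h>

/* MatchChar model: consume one character if it equals c, else fail. */
const char *match_char(const char *cursor, char c)
{
    return (*cursor != '\0' && *cursor == c) ? cursor + 1 : NULL;
}

/* MatchToChar model: skip up to and including the first c, else fail. */
const char *match_to_char(const char *cursor, char c)
{
    const char *hit = (c != '\0') ? strchr(cursor, c) : NULL;
    return hit != NULL ? hit + 1 : NULL;
}

/* MatchChars model: consume zero or more copies of c; never fails. */
const char *match_chars(const char *cursor, char c)
{
    while (c != '\0' && *cursor == c)
        ++cursor;
    return cursor;
}
```

Given the source string "aaab" and the character 'a', match_char advances one
position, match_chars advances three, and match_to_char with 'b' advances
past the whole string.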
Routine: matchtopat
--------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, matchtopat is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
MTPexample pattern <matchtopat, SomePat>
SomePat pattern <arbitrary_pattern....>
Description:
Matchtopat matches an arbitrary number of characters up to *and including*
the characters matched by the pattern specified by the "MatchParm" parameter.
It succeeds if the parameter pattern matches at some position at or beyond
the match cursor. It fails if the parameter pattern matches nowhere.
MatchToPat uses "shy" pattern matching. That is, it first attempts to match
zero characters (the empty string) followed by the parameter pattern. If
this succeeds, it quits. Otherwise MatchToPat matches a single character
and tries to match the parameter pattern again. Each time it fails it matches
one additional character and tries again until there are no more characters
in the source string, at which point it fails.
SNOBOL4+ programmers: MatchToPat is the pattern which is most comparable to
the ARB PAT pattern in SNOBOL4+.
Whether or not MatchToPat is faster than ARB (stdlib version) followed by
some other pattern depends entirely on where in the source string that second
pattern matches.
ARB uses a greedy algorithm and backtracks to match any following patterns.
MatchToPat uses a shy algorithm which tries the parameter pattern first and
eats characters from the source string only if the parameter pattern fails.
If the match generally occurs earlier in the source string, MatchToPat will
be faster. If the match occurs later in the source string, ARB/PAT will
probably be faster. If matching generally fails, MatchToPat is marginally
faster.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
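The trade-off between the "shy" MatchToPat and the "greedy" ARB followed by a
pattern can be seen by counting how many times each approach has to try the
parameter pattern. The C sketch below is a model only. The names
shy_match_to_pat, greedy_arb_then_pat, subpat_fn and match_ab are all
hypothetical, and trying the parameter pattern once more at the very end of
the string is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parameter pattern: returns the cursor just
   past the match, or NULL on failure. */
typedef const char *(*subpat_fn)(const char *cursor);

/* Example parameter pattern: matches the literal "ab" at the cursor. */
const char *match_ab(const char *cursor)
{
    return strncmp(cursor, "ab", 2) == 0 ? cursor + 2 : NULL;
}

/* Shy: try the pattern first, eat one character only when it fails. */
const char *shy_match_to_pat(const char *cursor, subpat_fn pat, int *tries)
{
    for (;;) {
        const char *end;
        ++*tries;
        end = pat(cursor);
        if (end != NULL)
            return end;
        if (*cursor == '\0')
            return NULL;            /* no characters left to eat */
        ++cursor;
    }
}

/* Greedy: grab everything, then back up one character per failure. */
const char *greedy_arb_then_pat(const char *cursor, subpat_fn pat, int *tries)
{
    const char *p = cursor + strlen(cursor);
    for (;;) {
        const char *end;
        ++*tries;
        end = pat(p);
        if (end != NULL)
            return end;
        if (p == cursor)
            return NULL;            /* backed up to where we started */
        --p;
    }
}
```

With the source string "xxabyy" the shy version finds "ab" on its third try
while the greedy version needs five, because the match lies nearer the start
of the string. Note also that in this model the two approaches can return
different cursors when the pattern occurs more than once: the shy scan stops
at the first occurrence and the greedy scan at the last.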
Routine: anycset
-----------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, anycset is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
ACexample pattern <anycset, alpha>
Description:
Anycset matches a single character from the character set (cset) specified
by the "MatchParm" parameter. If the character at the match cursor position
is in this set, anycset advances the match cursor and succeeds. Otherwise
it does not advance the match cursor and anycset fails.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: notanycset
--------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, notanycset is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
NACexample pattern <notanycset, alpha>
Description:
NotAnycset matches a single character which is *not* in the character set
(cset) specified by the "MatchParm" parameter. If the character at the match
cursor position is not in this set, notanycset advances the match cursor and
succeeds. Otherwise it does not advance the match cursor and notanycset fails.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
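Anycset and NotAnycset are mirror images of each other, as the following C
sketch suggests. It is a model only: the function names are hypothetical, the
character set is represented here as a plain C string listing its members
(which is not the library's cset format), and having both routines fail at
the end of the source string is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Anycset model: consume one character if it is a member of `set`. */
const char *any_cset(const char *cursor, const char *set)
{
    return (*cursor != '\0' && strchr(set, *cursor) != NULL)
               ? cursor + 1 : NULL;
}

/* NotAnycset model: consume one character if it is NOT a member of `set`. */
const char *not_any_cset(const char *cursor, const char *set)
{
    return (*cursor != '\0' && strchr(set, *cursor) == NULL)
               ? cursor + 1 : NULL;
}
```

For any character other than the terminator, exactly one of the two calls
succeeds, and the successful one always advances the cursor by one position.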
Routine: EOS
-------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, EOS is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
EOSexample pattern <EOS>
Description:
EOS matches the end of the source string (that is, the zero terminating byte
of the source string). The standard library pattern matching package does
not require that a source string completely match a pattern for that pattern
to succeed. Instead, the pattern need only specify a prefix of that string.
If you want the pattern to match the entire string you must stick the EOS
pattern at the end of your pattern list.
Whether EOS succeeds or fails, it does *not* advance the match cursor.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
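The difference between matching a prefix and matching the entire string, as
described for EOS, can be sketched in C as follows. The function names are
hypothetical and the sketch stands in for a pattern list consisting of a
literal string optionally followed by EOS. It is not the library's
implementation.

```c
#include <stddef.h>
#include <string.h>

/* Literal match at the cursor: returns the cursor past `lit`, or NULL. */
const char *match_str(const char *cursor, const char *lit)
{
    size_t n = strlen(lit);
    return strncmp(cursor, lit, n) == 0 ? cursor + n : NULL;
}

/* EOS model: succeeds only at the terminating zero byte and never
   advances the cursor. */
const char *match_eos(const char *cursor)
{
    return *cursor == '\0' ? cursor : NULL;
}

/* Pattern list "lit" alone: the source only needs to start with lit. */
int matches_prefix(const char *src, const char *lit)
{
    return match_str(src, lit) != NULL;
}

/* Pattern list "lit" followed by EOS: the whole source must be lit. */
int matches_whole(const char *src, const char *lit)
{
    const char *after = match_str(src, lit);
    return after != NULL && match_eos(after) != NULL;
}
```

Here "hello world" satisfies the prefix form for "hello" but not the form
ending in EOS, while the source string "hello" satisfies both.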
Routine: ARB
-------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, ARB is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
ARBexample pattern <ARB>
Description:
ARB matches an arbitrary number of characters in a string (up to EOS).
It always succeeds.
SNOBOL4+ users: Stdlib's ARB function isn't exactly like SNOBOL4's. This
ARB function uses a "greedy" algorithm. It immediately grabs as many
characters as it can. If there is a pattern following ARB (and there generally
is) backtracking *will* occur. If you want an ARB operation which uses a
"shy" matching algorithm, take a look at the "MatchToPat" function.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: ARBNUM
----------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, ARBNUM is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
ANexample pattern <ARBNUM, SomePattern>
SomePattern pattern <some_other_pattern....>
Description:
ARBNUM matches zero or more occurrences of the pattern specified by the
"MatchParm" parameter. It always succeeds and leaves the match cursor pointing
beyond the last character matched in the source string. If it matches zero
occurrences of the specified pattern, it still succeeds and returns the
match cursor unchanged.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
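The "zero or more occurrences" behaviour of ARBNUM can be modelled as a simple
loop, shown here as a C sketch. The names arbnum, subpat_fn and match_digit
are hypothetical. The guard against a sub-pattern that succeeds without
consuming anything is an assumption added so the sketch always terminates,
and any backtracking the library may perform when a following pattern fails
is not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for the parameter pattern: returns the cursor just
   past the match, or NULL on failure. */
typedef const char *(*subpat_fn)(const char *cursor);

/* Example parameter pattern: one decimal digit. */
const char *match_digit(const char *cursor)
{
    return (*cursor >= '0' && *cursor <= '9') ? cursor + 1 : NULL;
}

/* ARBNUM model: apply `pat` repeatedly until it fails; always succeeds. */
const char *arbnum(const char *cursor, subpat_fn pat)
{
    for (;;) {
        const char *next = pat(cursor);
        if (next == NULL || next == cursor)   /* failed, or matched nothing */
            return cursor;
        cursor = next;
    }
}
```

Applied to "123abc" with match_digit, the model leaves the cursor on the 'a'.
Applied to "abc", it matches zero occurrences and returns the cursor
unchanged.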
Routine: Skip
--------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, Skip is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
Sexample pattern <Skip, 5, 0, NextPat>
Description:
Skip matches (skips over) the next "n" characters in the source string.
"n" is the L.O. word of the "MatchParm" parameter.
Skip succeeds if there are at least "n" characters left in the string. It fails
if there are fewer than "n" characters left. If you want Skip to succeed even
if there are fewer than "n" characters in the string you can use ARB as the
alternate pattern for Skip:
SARBexample pattern <Skip, 5, 0, ArbPat>
ArbPat pattern <ARB>
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
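As a model of Skip and of the Skip-with-ARB-alternate combination shown above,
consider the following C sketch. The function names are hypothetical, NULL
stands in for "failure", and ARB is modelled only by its initial greedy grab
of the rest of the string.

```c
#include <stddef.h>
#include <string.h>

/* Skip model: advance n characters, failing if the string ends first. */
const char *skip(const char *cursor, unsigned n)
{
    while (n > 0) {
        if (*cursor == '\0')
            return NULL;
        ++cursor;
        --n;
    }
    return cursor;
}

/* ARB model (greedy grab only): move the cursor to the end of the string. */
const char *arb(const char *cursor)
{
    return cursor + strlen(cursor);
}

/* Skip with ARB as its alternate: if Skip fails, fall back to ARB. */
const char *skip_or_arb(const char *cursor, unsigned n)
{
    const char *next = skip(cursor, n);
    return next != NULL ? next : arb(cursor);
}
```

With n equal to 5, the source string "abcdefg" leaves the cursor on the 'f'.
The source string "abc" makes plain skip fail, whereas skip_or_arb succeeds
with the cursor at the end of the string.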
Routine: POS
-------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, POS is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
Pexample pattern <POS, 5, 0, NextPat>
Description:
POS (position) succeeds if the match cursor is currently at the location
specified by the "MatchParm" parameter. It fails otherwise. Note: the
first character in the source string is at position zero.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: RPOS
--------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, RPOS is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
RPexample pattern <RPOS, 5, 0, NextPat>
Description:
RPOS (reverse position) succeeds if the match cursor is currently at the
specified location *from the end of the string*. It fails otherwise. Note: the
last character in the string is at RPOS position one (EOS is at position zero).
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
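The numbering used by POS and RPOS is easy to get wrong by one, so here is a
small C sketch of the two tests. The function names are hypothetical, and
`start` is assumed to point at the first character of the source string with
`cursor` somewhere inside it. Neither test moves the cursor.

```c
#include <stddef.h>
#include <string.h>

/* POS model: true if the cursor is n characters from the start. */
int pos_matches(const char *start, const char *cursor, size_t n)
{
    return (size_t)(cursor - start) == n;
}

/* RPOS model: true if exactly n characters remain before the terminator. */
int rpos_matches(const char *cursor, size_t n)
{
    return strlen(cursor) == n;
}
```

For the six-character string "abcdef", the 'a' is at POS 0 and RPOS 6, the
final 'f' is at POS 5 and RPOS 1, and the terminating zero byte is at RPOS 0.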
Routine: GotoPOS
-----------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, GotoPOS is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
GPexample pattern <GotoPOS, 5, 0, NextPat>
Description:
GotoPos moves the match cursor *forward* to the specified position. It fails
if that position does not exist in the string or if you attempt to move the
match cursor backwards in the string.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
Routine: RGotoPOS
------------------
Category: Pattern Matching Primitive
Author: Randall Hyde
Registers on Entry: N/A
Registers on return: N/A
Flags affected: N/A
Example of Usage:
(Note: Generally, RGotoPOS is only invoked in a pattern data struct-
ure. You would not normally call this code directly from your
program [though it is possible, see the source listings for
details].)
RGPexample pattern <RGotoPOS, 5, 0, NextPat>
Description:
RGotoPos moves the match cursor *forward* in the string to the point specified
by the "MatchParm" parameter. This value is the position in the string from
the *end* of the string. This function fails if you attempt to move the
match cursor backwards in the string or if the position does not exist in
the string.
Include: stdlib.a or patterns.a (and then invoke the "matchfuncs"
macro to obtain the external declaration for this function).
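GotoPOS and RGotoPOS both refuse to move the match cursor backwards. The C
sketch below models the two failure conditions. The function names are
hypothetical, `start` is assumed to point at the first character of the source
string, and treating the position of the terminating zero byte as one that
"exists" is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* GotoPOS model: move forward to position n counted from the start. */
const char *goto_pos(const char *start, const char *cursor, size_t n)
{
    size_t len = strlen(start);
    if (n > len || start + n < cursor)
        return NULL;        /* no such position, or it lies behind us */
    return start + n;
}

/* RGotoPOS model: move forward to position n counted from the end. */
const char *rgoto_pos(const char *start, const char *cursor, size_t n)
{
    size_t len = strlen(start);
    if (n > len || start + (len - n) < cursor)
        return NULL;        /* no such position, or it lies behind us */
    return start + (len - n);
}
```

With the eight-character string "abcdefgh" and the cursor at position 2,
goto_pos with 5 moves to the 'f' and rgoto_pos with 5 moves to the 'd'. With
the cursor already at position 6, goto_pos with 5 fails because the move would
be backwards.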