October 10, 1991                 RPSORT Reference                 Page i

Table of Contents
-----------------

Introduction  1
   How to Display RPSORT Built-in Syntax Screens  1
   Syntax Conventions  1
How To Exit Quickly From RPSORT  2
   Quick Exit From RPSORT When Output Goes To The Standard Output  2
General Description of How RPSORT Does A Sort  3
Summary Of RPSORT Syntax  4
Details Of RPSORT Syntax  5
   Options For Specifying Input And Output Files  5
      Using RPSORT As A Filter  5
      Specifying Input And Output Files Directly  5
   Specifying Lines Or Fixed Length Records  6
      Lines Are The Default  6
      /Fnnnn - Specifying Fixed Length Records  6
Detailed Description Of Sort Key Types Supported By RPSORT  7
   Sort Keys That Are Character Strings  7
      Default Case Insensitive Character Strings  7
      ASCII (Case Sensitive) Character Strings  7
      C Language Style Character Strings  7
      Turbo Pascal Style Character Strings  8
   Sort Keys That Are Binary Numbers  8
      Signed Binary Integers  8
      Unsigned Binary Integers  9
      BASICA And GWBASIC Floating Point Numbers  9
      Turbo Pascal Real Numbers  9
      Math Co-Processor Floating Point Numbers  10
Defining The Desired Sort Sequence To RPSORT  11
   Standard Defaults For Sort Keys  11
   Switches Which Set Defaults For Sort Keys  11
      /A - Sort all Text Keys in ASCII (Case Sensitive) Sequence  11
      /C - Make All Text Keys Be C Language Strings  11
      /P - Make All Text Keys Be Turbo Pascal Strings  11
      /R - Sort All Keys In Reverse (Descending Order)  11
   Defining Sort Keys  12
      Sort Key Definition Syntax  12
         Col - The Start Column For A Key  12
         Len - The Length Of A Key  12
         R - Sorting The Key In Reverse (Descending) Order  12
         A - Sorting The Key In ASCII (Case Sensitive) Order  12
         C - Specifying A C Language Type String  12
         P - Specifying A Turbo Pascal Type String  12
         I - Specifying A Signed Binary Integer  13
         U - Specifying An Unsigned Binary Integer  13
         F - Specifying A Math Co-processor Type Floating Point Number  13
         M - Specifying A BASIC Interpreter Type Floating Point Number  13
         T - Specifying A Turbo Pascal Type Real Number  13
   List Of Various Compiler And Interpreter Numeric Data Types  14
Miscellaneous Switches  15
   /Q - Suppressing Copyright And Completion Messages  15
   /Eerrfile - Directing Error Messages To A File  15
   /B - Ignoring Control Breaks Entered From The Keyboard  15
   /D - Delete Records Whose Sortkeys Duplicate Previous Record  15
   /N - Delete Null Lines  16
   /Td - Designate Drive To Be Used For Temporary Files.  16
   /Z - Ignore Ctrl-Z In Text File. Use Entire Physical File.  16
Efficiency Considerations  17
   Do ASCII Sort If Text Keys Are All Upper Case Or All Lower Case  17
   How Memory Size Affects RPSORT Speed And Need For Temp Disk Space  18
      Using CHKDSK Or MEM To Determine Free Memory  18
      Sorts Requiring No Merge Phase And No Temporary Files  19
      Sorts Requiring One Merge Phase And One Temporary File  20
      Sorts Requiring Two Or More Merge Phases And Two Temporary Files  21
   Deciding What Drives To Put Temporary And Output Files On  21
   Buffers Command In Your Config.Sys  22
   Using Disk Cache Programs  22
Special Situations  23
   Sorting Files That Contain Tabs  23
   Writing The Output To The File That Contained The Input  24
   Two Incompatibilities With The DOS SORT  24
Error Messages  25
   Error Numbers And Return Codes  25
   Syntax Error Messages  26-30
   DOS Version Before 2.0 Message  31
   Insufficient Memory Messages  31
   Line Or String Too Long Messages  31
   Input/Output Error Messages  32-33
   Never Should Happen Error Messages  33

October 10, 1991                 RPSORT Reference                 Page 1

Introduction

RPSORT is a sort utility that greatly improves upon the features and the performance of the sort utility distributed with Microsoft DOS.

First, RPSORT does everything that the DOS SORT does. Virtually any command that works with DOS SORT works with RPSORT and produces the same result.

But RPSORT does much more. It can sort very large files and supports multiple sort keys. It is extremely fast. I do not know of another sort utility that can outspeed it.

RPSORT sorts text files. These consist of lines each ended by CRLF (i.e. a carriage return and line feed). RPSORT also sorts files of fixed length records such as those produced by many BASIC, Pascal and C programs.

RPSORT supports numerous sort key types including regular text keys, C language strings, Turbo Pascal strings, signed and unsigned binary integers of any length and several types of binary floating point numbers.

RPSORT can delete null lines (consisting only of a CRLF). It can also delete records/lines whose sort keys duplicate those in a previous record/line.

A summary of RPSORT syntax appears on page 4 of this document. A comprehensive list of RPSORT examples can be found in the file EXAMPLES.DOC.

How to Display RPSORT Built-in Syntax Screens

Enter the RPSORT command with no parameters to see RPSORT's built-in syntax screens. Use the Page Down and Page Up keys to move through the screens. Press the Esc key when you are finished viewing the syntax screens.

Syntax Conventions

 . Items in square brackets ([]) are optional. Type the information inside the brackets but not the brackets themselves.
 . An item followed by an ellipsis (...) may be repeated several times.
 . Capital letters (A thru Z) and special characters (/ and ? and +) should be entered as they appear in the syntax except that you may enter lower case letters in place of the capital letters.
 . Words spelled out in lower case letters describe an item you are to enter. For example, where you see the word "inputfile" in the syntax, enter the path (if necessary) and the name of an input file.

File names and other parameters may be entered in lower or upper case as you choose.

October 10, 1991                 RPSORT Reference                 Page 2

How To Exit Quickly From RPSORT

RPSORT is very fast and can sort files containing hundreds of kilobytes and thousands of records in just a few seconds (I am assuming a 286 CPU and a hard disk).
However, if you are sorting a really large file (say 20 megabytes) then the execution time could be several minutes. If you start such a sort and then realize that you specified the wrong sort key(s), you can terminate the sort immediately as follows:

 . Enter a Ctrl-Break (i.e. hold down the Ctrl key and press the Break key).
 . Within a very few seconds, RPSORT will respond with the message:

      Do you wish to quit RPSORT?
      Press Esc to quit, any other key to continue.

 . If you do indeed wish to terminate the sort, press the Esc key. RPSORT will clean up properly by deleting any temporary files as well as any partial output file and then it will terminate.
 . If you decide you don't want to terminate the sort after all, press any key but the Esc key and the sort will continue.

After terminating the sort, as above, you can then re-enter the RPSORT command with the correct parameters.

There might be other reasons to terminate the sort. Perhaps you need the computer for some other purpose and can't wait for the sort to finish. In such cases, be aware that any work done by RPSORT will be lost. If you do the sort later on you will have to start it from the beginning.

If you want RPSORT to ignore any control break, use the /B switch. See "Miscellaneous Switches" on page 15.

Quick Exit From RPSORT When Output Goes To The Standard Output

If RPSORT is writing its output to the standard output, as in:

      RPSORT < inputfile > outputfile

then the termination proceeds a little differently:

 . As above you enter Ctrl-Break.
 . RPSORT simply terminates the sort within a few seconds without giving you a chance to change your mind. As above it deletes any temporary files but it does not delete the output file. It can't delete the latter because it doesn't know the name of a redirected output file.

October 10, 1991                 RPSORT Reference                 Page 3

General Description of How RPSORT Does A Sort

When you execute RPSORT you specify in the command line:

 . The source of the input data (one or more files) and the destination for the sorted output (either a file or the screen).
 . Whether the input is lines terminated by CRLF or fixed length records.
 . Optionally, one or more sort keys. These indicate:
    . The location of the key in the line or record.
    . The length of the key.
    . The type of key. Any of several string or numeric types.

If there are no sort key definitions, RPSORT assumes a default character string key consisting of the entire line or record.

The sort process involves the following:

 . RPSORT compares two records/lines at a time to determine which comes first and swaps them, if necessary, to put them in the right sequence. The comparisons continue until the entire input has been sequenced.
 . RPSORT uses the quicksort algorithm (invented by C. A. R. Hoare in 1962) to determine which records/lines to compare. This algorithm is very good at doing the sort with the minimum number of comparisons.
 . In comparing two records/lines, RPSORT compares the sort keys in the same sequence as their appearance in the command line until it finds an unequal compare or runs out of sort keys.
 . If all the sort keys are equal for two records, RPSORT breaks the tie by comparing the locations of the two records in the input. This maintains any inherent order in the file (i.e. if two or more records have identical sort keys then their order among themselves in the sorted output will be the same as it was in the input).
 . For files consisting of lines, some of the lines may be:
    . Too short to contain any part of a given sort key. Then, the sort key is taken to be a null string and sorts lower than anything else.
    . Or too short to contain the whole sort key. Then, the key comparison is done for the length of the shorter key. If the keys are equal for that length, the shorter key sorts low.
 . If the input file(s) are small enough to fit in the available memory space the sort is done in one pass in memory.
 . If the input is too big to fit into memory, it is read in chunks and each chunk is sorted and written to a temp file. Then RPSORT uses one or more merge phases to combine the chunks into the sorted output file.
 . RPSORT displays the elapsed time for the sort at the end.

October 10, 1991                 RPSORT Reference                 Page 4

Usage:
RPSORT [/Q] [/Eerrfile] [/]? [inputfile[+inputfile]] [outputfile]
       [/A] [/B] [/C] [/D] [/Fnnnn] [/N] [/P] [/R] [/Td] [/Z] [sort key defin. . .]

Sort key defin syntax:  /+ [col] [:len] [A] [C] [F] [I] [M] [P] [R] [T] [U]
---------------------------------------------------------------------------

Summary Of RPSORT Syntax

Input is one or more filespecs (including path if required) separated by plus signs. Output is a single filespec. Input filespec(s) must precede output filespec. Input file(s) are sorted together into the single output file. Wildcard characters are allowed in input filespecs and all files with matching names will be included. For example:

      RPSORT IPFILE*.DAT+C:\MYDIR\IP??FILE.DAT OPFILE.DAT

RPSORT can also be used as a filter. For example:

      RPSORT < IPFILE > OPFILE

By default, RPSORT assumes a text file with the entire line as a case insensitive sort key. This can be changed by some of the parameters below.

/Q         suppresses copyright and success messages. Must be first parameter.
/Eerrfile  specifies file to which error messages will go instead of the screen. Should precede any parameter except /Q.
/? or ?    displays built-in syntax screens.
/A         does an ASCII sort. Case sensitive (lower case not equal upper case).
/B         tells RPSORT to ignore any control break entered from the keyboard.
/C         specifies C language style text keys (terminated by a binary zero).
/D         deletes any record whose sortkeys duplicate those in a previous record.
/Fnnnn     says that the input consists of fixed length records of nnnn bytes.
/N         deletes any null lines (those consisting only of a CRLF sequence).
/P         specifies Pascal style text keys (first byte is length of string).
/R         specifies a reverse (descending order) sort.
/Td        designates drive to be used for temp files instead of default drive.
/Z         tells RPSORT to ignore Ctrl-Z in text file and use the entire file.

The /R switch applies to all sort keys. The /A, /C and /P switches apply to all text sort keys. They can't be over-ridden for an individual sort key.

A sort key definition starts with /+ and may include the following attributes. No spaces are allowed between the attributes:

col   is starting column of this key. Col 1 is the first col in the record.
:len  is the length of this key.
A     does an ASCII (case sensitive) sort for the key.
C     sorts this key as C language text key (terminated by a binary zero).
F     sorts this key as a 80x87 floating point number. Len is 4, 8 or 10.
I     sorts this key as a signed binary integer. This may be any length.
M     sorts this key as a BASICA floating point number. Len is 4 or 8.
P     sorts this key as Pascal text key (first byte is length of string).
R     does a reverse (descending) sort for this key.
T     sorts this key as a Turbo Pascal type "real" number. Len must be 6.
U     sorts this key as an unsigned binary integer. This may be any length.

Attributes F, I, M, P, T and U are only allowed for fixed length records.

October 10, 1991                 RPSORT Reference                 Page 5

Details Of RPSORT Syntax

There are three types of parameters:

 . Those that specify files (i.e. inputfile and outputfile).
 . Switches which consist of a slash and a letter plus possibly a file name, number or drive letter (e.g. /Q, /Eerrfile, /Fnnnn, /Td).
 . Sort key definitions each of which defines a single sort key.

The parameters can be entered in any sequence except that:

 . The inputfile(s) must always precede the outputfile.
 . The /Q switch (see /Q - Suppressing Copyright And Completion Messages) must precede any other parameter.
 . The /Eerrfile switch (see /Eerrfile - Directing Error Messages To A File) should precede everything but /Q.

Options For Specifying Input And Output Files

Using RPSORT As A Filter

RPSORT can be used as a filter which reads the standard input and writes to the standard output. For example:

      RPSORT < ipfile > opfile

The standard output need not be redirected and can go to the screen. The standard input must be redirected to a file or piped from the output of another program. RPSORT will not accept an input file directly from the keyboard.

If you take the input from the standard input then the output MUST go to the standard output.

Specifying Input And Output Files Directly

You can specify the input and output files directly. Input is one or more files separated by plus signs but output must be a single file. The filespecs may include a path. All input files are combined and sorted together into the single output file. Wildcard characters are allowed in input filespecs and all files with matching names are included. For example:

      RPSORT IPFILE*.DAT+C:\MYDIR\IP??FILE.DAT OPFILE.DAT

If the path and filename for the output filespec are the same as those of an existing file, the existing file will be replaced by the output from RPSORT. If you don't want to lose the existing file, use a different name for the output.

October 10, 1991                 RPSORT Reference                 Page 6

Specifying Lines Or Fixed Length Records

Lines Are The Default

By default, the file is assumed to consist of lines. A line is a sequence of characters terminated by CRLF. RPSORT also accepts the LFCR sequence as a line terminator.
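
To illustrate the line definition above, here is a minimal Python sketch that splits raw file bytes into lines on either CRLF or LFCR. It is only an illustration under stated assumptions: the function name split_lines is hypothetical, and the rule that a terminator at the very end of the data does not start another (empty) line is my assumption, not a description of RPSORT's internal code.

```python
import re

# Assumption: either two-byte sequence (CRLF or LFCR) ends a line.
_TERMINATOR = re.compile(rb"\r\n|\n\r")

def split_lines(data):
    """Split raw bytes into lines, accepting CRLF or LFCR as the terminator."""
    parts = _TERMINATOR.split(data)
    # A terminator at the very end leaves one empty trailing piece; drop it
    # so that a properly terminated last line is not followed by a null line.
    if parts and parts[-1] == b"":
        parts.pop()
    return parts

# "pear" ends with CRLF, "apple" ends with LFCR, then a null line (CRLF only),
# then a last line with no terminator at all.
sample = b"pear\r\napple\n\r\r\nfig"
lines = split_lines(sample)
print(lines)
```

A null line in the sense of this manual shows up here as an empty bytes object between two terminators.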
The lines may vary in length from null lines up to a maximum length of 32750. RPSORT will reject a file that contains a line longer than this.

If the last record in an input file does not terminate with CRLF or LFCR, RPSORT will append these two characters and display a message informing you of its action. If the input is two or more files, RPSORT will, if necessary, append a CRLF to terminate the last line in each of the files. RPSORT never assumes that a line starting in one file continues in the next.

Only character string sort keys are allowed in a file of lines. Binary numeric sort keys are not allowed.

/Fnnnn - Specifying Fixed Length Records

A file of fixed length records contains records all of the same length. The /Fnnnn switch tells RPSORT that the records are fixed length and the value you enter for nnnn specifies the length. For example, /F65 tells RPSORT that the file consists of 65 byte records.

Fixed length records need not end with a CRLF but if they do, those two bytes must be included in the length given by the /Fnnnn switch.

The maximum length you may specify is 32750. RPSORT would reject /F32751.

If the last record in the input is shorter than the length given in the /Fnnnn switch (i.e. the file length is not an exact multiple of nnnn), RPSORT ignores the last record and does not include it in the sorted output. RPSORT displays a message to inform you of its action. If the input consists of two or more files, RPSORT will skip a short last record in each of the input files. RPSORT never assumes that a record starting in one file continues in the next.

All key types supported by RPSORT are allowed in a file of fixed length records.

October 10, 1991                 RPSORT Reference                 Page 7

Detailed Description Of Sort Key Types Supported By RPSORT

Sort Keys That Are Character Strings

Default Case Insensitive Character Strings

This is the only sequence supported by the DOS SORT. The digits 0 through 9 come before the letters. Lower case letters sort equal to upper case letters. Foreign letters, punctuation and currency symbols sort equal to their American English equivalents.

ASCII (Case Sensitive) Character Strings

The sequence is according to the ASCII value assigned to each character. This puts the digits 0 through 9 before any letters and puts all of the upper case letters before any of the lower case letters. Foreign letters, punctuation and currency symbols sort higher than any of the above.

The ASCII value for each character is the code used internally by the computer to represent that character. An ASCII sort is the fastest possible sort because it requires no pre-processing of the characters.

You can specify this type of sort key by using either the /A switch (see page 11) or the A attribute (see page 12).

C Language Style Character Strings

C language strings are allocated some maximum length in your C program. This should be the length in the sort key definition. For example, if you define "char mystr[8]" in your C program then the compiler allocates 8 bytes and therefore the length specified to RPSORT should also be 8.

The actual character string, however, may be shorter. C language strings are terminated by a binary zero if they do not fill the allocated space. Therefore, RPSORT takes the length of a C style string to be the lesser of:

 . The length attribute (or if absent the default length).
 . The length up to but not including the first binary zero.

You can specify this type of sort key by using either the /C switch (see page 11) or the C attribute (see page 12).

October 10, 1991                 RPSORT Reference                 Page 8

Turbo Pascal Style Character Strings

Turbo Pascal strings are allocated some maximum length in your Pascal program. This should be the length given in the sort key definition. For example, if you define string[8] in your Pascal program then the compiler allocates 9 bytes to the string and therefore the length specified to RPSORT should also be 9.

The first byte in a Pascal string is a length byte. This contains a binary number which is the actual length of the string. The remaining bytes allow enough room for the longest possible string.

The length must be between 2 and 256 inclusive. These limits correspond to string[1] and string[255] respectively.

If RPSORT finds a length byte value in the file that is too large (i.e. greater than or equal to the specified length) it aborts. This would only occur if the sort key was incorrectly defined.

You can specify this type of sort key by using either the /P switch (see page 11) or the P attribute (see page 12). This type of sort key is only allowed for fixed length records.

Sort Keys That Are Binary Numbers

Signed Binary Integers

A signed binary integer is a two's complement binary integer that is stored low byte first, high byte last. This is the natural way for an 80X86 CPU to store binary integers. As far as I know, all language compilers and interpreters for IBM PCs and clones store them this way.

RPSORT allows signed binary integer sort keys to be any length from 1 up to the length of the record.

You can specify this type of sort key by using the I attribute (see page 13). This type of sort key is only allowed for fixed length records.

October 10, 1991                 RPSORT Reference                 Page 9

Unsigned Binary Integers

Unsigned binary integers, just like signed binary integers, are stored low byte first, high byte last.

RPSORT allows unsigned binary integer sort keys to be any length from 1 up to the length of the record.

You can specify this type of sort key by using the U attribute (see page 13). This type of sort key is only allowed for fixed length records.

BASICA And GWBASIC Floating Point Numbers

RPSORT supports binary floating point numbers as defined by the BASIC interpreter (prior to MS-DOS v5.0) and older versions of Microsoft QuickBASIC (prior to QB v4.0). The lengths that RPSORT will accept for these numbers are:

      Length = 4 for single precision numbers.
      Length = 8 for double precision numbers.

You can specify this type of sort key by using the M attribute (see page 13) and one of the lengths listed above. This type of sort key is only allowed for fixed length records.

Turbo Pascal Real Numbers

RPSORT supports Turbo Pascal numbers of type "real". The length need not be specified and is always 6.

You can specify this type of sort key by using the T attribute (see page 13). This type of sort key is only allowed for fixed length records.

This was the "real" type in the original version of Turbo Pascal and is still supported in version 6.0. To see how to sort the new 80x87 formats in Turbo Pascal (single, double, extended and comp) refer to the table on page 14. Also see the next section on "Math Co-Processor Floating Point Numbers".

October 10, 1991                 RPSORT Reference                 Page 10

Math Co-Processor Floating Point Numbers

RPSORT supports three types of math co-processor (i.e. 80x87) floating point numbers. The table below gives the lengths and names assigned to them by Intel and by three popular compilers.

      Length   Intel        QuickBasic   Turbo Pascal   Turbo C
      ------   ----------   ----------   ------------   -----------
         4     short real   single       single         float
         8     long real    double       double         double
        10     temp real    N/A          extended       long double

You can specify this type of sort key by using the F attribute (see page 13) and one of the lengths listed above. This type of sort key is only allowed for fixed length records.

RPSORT does not require a math co-processor to sort numbers of this type and does not use the 80x87 even if it is present.

Zero values returned by an 80x87 are marked as either a +0 or a -0. Some zero values arise from underflow. This occurs if a result is too small (i.e. has too negative an exponent) for the given numeric format (short real, long real or temp real). The 80x87 returns a zero result but keeps the sign of the small number. RPSORT sorts minus zeros as less than plus zeros. I could call this a deliberate feature in that it reflects as best as possible the true sequence of very small results but actually it's a natural consequence of the way I do the sort.

A result can be too large for the given numeric format. This is called overflow. Most compilers generate an error and do not store a result but an 80x87 can return special values denoting plus and minus infinity. RPSORT sorts plus infinity higher than any other value and minus infinity lower than any other value.

The 80x87 also generates special values for error conditions (e.g. taking the square root of a negative number). Any compiler would generate an error rather than store such values.
Still, RPSORT must do something if it finds them. I sort them the same as plus or minus infinity depending on their sign.

October 10, 1991                 RPSORT Reference                 Page 11

Defining The Desired Sort Sequence To RPSORT

Standard Defaults For Sort Keys

The following defaults are used by RPSORT unless you specify other defaults (see "Switches Which Set Defaults For Sort Keys") or specify different attributes in the sort key definition for a sort key (see "Defining Sort Keys").

 . The sort key consists of the entire record/line.
 . The sort key is a character string to be sorted per the same case insensitive sequence used by the DOS SORT. Digits 0 through 9 precede the letters. Lower case letters sort equal to upper case. Foreign letters, punctuation and currency symbols sort equal to their American English equivalents.
 . The sort will be in ascending (low to high) sequence.

Switches Which Set Defaults For Sort Keys

These switches change some of the defaults for sort keys. They can't be over-ridden by individual sort key definitions. Use them only if you want all your sort keys to have the same attributes. The /C and /P switches may be of particular interest to computer programmers.

The /A, /C and /P switches apply to all character string sort keys (i.e. they apply to any sort key that is not defined as being a binary numeric type). /C and /P are mutually exclusive but either may be used in conjunction with /A.

/A   makes the ASCII (case sensitive) sequence the default. Digits 0 through 9 precede the letters and all upper case letters precede any lower case letters. The sequence is per the ASCII code for each character.

/C   says that all string keys are C language character strings. See page 7 for a description of C style strings.

/P   says that all string keys are Turbo Pascal type strings. See page 8 for a description of Pascal style strings. The /P switch is only allowed for fixed length records.

/R   specifies a reverse sort. The sort will be in descending (high to low) sequence. /R applies to all the sort keys you define.

October 10, 1991                 RPSORT Reference                 Page 12

Defining Sort Keys

Sort Key Definition Syntax

If no sort key definitions are given, RPSORT assumes a single default sort key (see "Standard Defaults For Sort Keys" and "Switches Which Set Defaults For Sort Keys").

You may specify as many sort key definitions as you like provided that they fit within the command line (maximum of 127 bytes).

Sort key definitions consist of a /+ followed by a list of attributes with no spaces between them. You may, however, use spaces to separate one sort key definition from another or from a switch.

All attributes are optional. A sort key definition may be just /+, which gets you the same default sort key as when no sort key definitions are specified.

The following describes each of the attributes:

col    is the starting column for the key. This must be at least 1 but no more than 32750. For fixed length records, the maximum is the largest column such that there is enough room in the remainder of the record to hold the minimum legitimate key length for the given key type.

:len   is the length for this key (e.g. a seven byte key would be indicated by :7). The legitimate values for len depend on the type of the sort key.

R      specifies a reverse sort for this key. The key will be sorted in descending (high to low) sequence.

The next three attributes are used for character string keys. C and P are mutually exclusive but either may be used with A.
C and P may be of interest to computer programmers.

A      does an ASCII (case sensitive) sort for the key. The digits 0 through 9 precede the letters and all upper case letters precede any of the lower case letters.

C      says that this key is a C language character string. See page 7 for a description of C style strings.

P      says that this key is a Turbo Pascal character string. See page 8 for a description of Pascal style character strings. The P attribute is only allowed for fixed length records.

October 10, 1991                 RPSORT Reference                 Page 13

The next five attributes define binary numeric type keys which are only allowed for fixed length records. They are mutually exclusive. These attributes may be of interest to computer programmers.

The table on page 14 lists some programming language compilers and interpreters and indicates the appropriate type and length attributes to be used for each of their binary numeric data types.

I      sorts this key as a signed binary integer. These may be any length. See page 8 for additional details.

U      sorts this key as an unsigned binary integer. These may be any length. See page 9 for additional details.

F      sorts this key as a binary floating point number of the type produced by a math co-processor (i.e. an 80x87). RPSORT supports three precisions for 80x87 floating point numbers. The table below gives the lengths for each precision and the names assigned to them by Intel and three popular compilers.

      Length   Intel        QuickBasic   Turbo Pascal   Turbo C
      ------   ----------   ----------   ------------   -----------
         4     short real   single       single         float
         8     long real    double       double         double
        10     temp real    N/A          extended       long double

       RPSORT does not require a math co-processor to sort numbers of this type and does not use the 80x87 even if it is present. See page 10 for additional details concerning math co-processor floating point numbers.

M      sorts this key as a binary floating point number as defined by the BASIC interpreter (prior to MS-DOS v5.0) and older versions of Microsoft QuickBASIC (prior to QB v4.0). The len attribute can be 4 or 8. Use len = 4 for single precision numbers. Use len = 8 for double precision numbers.

T      sorts this key as a Turbo Pascal number of type "real". The len parameter need not be specified and is 6 by default. See page 9 for additional details.

October 10, 1991                 RPSORT Reference                 Page 14

The following table lists the type and length attributes for the binary numeric types available in a few programming language compilers and interpreters. If you are using a compiler that is not in this table, you should review the previous pages along with the programmer's guide for your compiler to see if any of the binary numeric types supported by RPSORT match those available with your compiler.

Compiler Or Interpreter        Number Type     Type Attribute   Length Attribute
-----------------------        -----------     --------------   ----------------
Microsoft QuickBASIC           Integer               I                  2
v4.0 and later &               Long                  I                  4
Microsoft QBASIC               Single                F                  4
                               Double                F                  8

Microsoft QuickBASIC           Integer               I                  2
v3.0, 8087                     Single                F                  4
                               Double                F                  8

IBM BASICA &                   Integer               I                  2
Microsoft GWBASIC &            Single                M                  4
Microsoft QuickBASIC           Double                M                  8
v1.0, v2.0 and v3.0 non-8087

Turbo Pascal                   Shortint              I                  1
v4.0 and later                 Integer               I                  2
                               Longint               I                  4
                               Byte                  U                  1
                               Word                  U                  2
                               Real                  T                  6
                               Single                F                  4
                               Double                F                  8
                               Extended              F                 10
                               Comp                  I                  8

Turbo Pascal                   Integer               I                  2
v3.0 8087                      Byte                  U                  1
                               Real                  F                  8

Turbo Pascal                   Integer               I                  2
v1.0, v2.0 and v3.0 non-8087   Byte                  U                  1
                               Real                  T                  6

Borland/Turbo C                signed char           I                  1
                               unsigned int          U                  2
                               short int             I                  2
                               int                   I                  2
                               unsigned long         U                  4
                               long                  I                  4
                               float                 F                  4
                               double                F                  8
                               long double           F                 10

October 10, 1991 RPSORT Reference Page 15

Miscellaneous Switches

/Q - Suppressing Copyright And Completion Messages.

   The /Q switch, if it is the first parameter, suppresses display of:

   . The Copyright message when the sort starts.
   . The "Sort successfully completed." message after a successful sort.

   Error messages, if any, will still be displayed.

/Eerrfile - Directing Error Messages To A File.

   This switch directs error and successful completion messages to the file designated by errfile instead of the screen. For example:

      /Ec:\mydir\myerrors

   Specify /Enul to send error messages to the DOS NUL file which means nowhere. Only the /Q switch, if any, should precede the /E switch.

/B - Ignoring Control Breaks Entered From The Keyboard.

   Tells RPSORT to ignore Ctrl-Break from the keyboard. This would be useful if you set up a batch file which includes RPSORT and you don't want the users of the batch file to be able to interrupt RPSORT.

/D - Delete Records Whose Sortkeys Duplicate Those In A Previous Record

   Tells RPSORT to delete any records/lines whose sort keys duplicate those in a previous one. This deletes records/lines even if they are not identical to a previous one since all that is required is that the sort keys be the same. To only delete identical records/lines, tack on /+a as the last sort key. This produces an equal compare only for identical records/lines. For example:

      RPSORT /D /+1:2

   deletes any lines whose first two bytes equal those on a previous line, while

      RPSORT /D /+1:2 /+a

   deletes only lines that are identical to a previous line.

October 10, 1991 RPSORT Reference Page 16

/N - Delete Null Lines

   This switch deletes all null lines (i.e. lines consisting only of a CRLF). Lines that are all spaces and thus look like null lines when you list them will not be deleted. This switch is not allowed for fixed length records for which it would be meaningless.

/Td - Designate Drive To Be Used For Temporary Files.

   Given enough memory, RPSORT loads the entire input into memory, sorts it and writes the sorted data to the output file. In such cases, RPSORT does not need to create any temporary files.

   If the input is larger than the available memory, RPSORT reads the file a chunk at a time, sorts each chunk and writes the sorted chunks to a temporary file. RPSORT then does one or more merge phases to combine the chunks into a single sorted output file.

   RPSORT normally puts temporary files on the default drive. The /T switch lets you specify an alternate drive. For example:

      /TC

   puts the temporary file(s) on your C drive. See the section on "Efficiency Considerations" for more details.

/Z - Ignore Ctrl-Z In Text File.
Use Entire Physical File.

   RPSORT (just like MS-DOS) treats Ctrl-Z as the end of a text file. This is usually the correct thing to do since Ctrl-Z, if present, normally follows the last byte of actual data.

   Sometimes, however, one or more Ctrl-Zs occur in the middle of a text file. Files downloaded from bulletin boards may contain garbage characters (such as Ctrl-Z) due to a noisy line. If you sort such a file, the sorted output is shorter than the original file because RPSORT uses only part of the input.

   The /Z switch tells RPSORT to ignore Ctrl-Zs and to use the entire input. RPSORT deletes any Ctrl-Zs except for one at the end of the file.

   /Z is not applicable to fixed length records where Ctrl-Z has no special meaning and is just taken as another data byte.

October 10, 1991 RPSORT Reference Page 17

Efficiency Considerations

Do ASCII Sort If Text Keys Are All Upper Case Or All Lower Case

An ASCII sort puts text keys in order according to the ASCII code assigned to each of the characters. This is the fastest possible sort because RPSORT can sequence records by directly comparing the sort keys without having to pre-process them in any way.

If a file contains both upper and lower case letters and you want all the keys starting with a lower case "a" to be together with the keys starting with an upper case "A" and so on, then you can't do an ASCII sort and must do a case insensitive sort.

However, if your file contains only upper case letters (or if it contains only lower case letters) then an ASCII sort will achieve the same result as a case insensitive sort but will be faster.

You specify an ASCII sort either by using the A attribute in each sort key:

   RPSORT /+1:5A /+12:7A INPUT.DAT OUTPUT.DAT

or by using the /A switch:

   RPSORT /A /+1:5 /+12:7 INPUT.DAT OUTPUT.DAT

If your files contain foreign letters, punctuation or currency symbols and you want these to sort the same as their American English equivalents then you must do a case insensitive sort.
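
The claim that an ASCII sort matches a case insensitive sort for single-case keys can be checked with a small sketch. This is Python written for illustration only, not RPSORT's own code. The names ascii_sort and folded_sort are hypothetical, and folding keys to upper case is only an assumed stand-in for RPSORT's case insensitive comparison (it does not model the handling of foreign letters or symbols).

```python
def ascii_sort(keys):
    # Compare raw character codes, as an ASCII (case sensitive) sort does.
    return sorted(keys)


def folded_sort(keys):
    # Assumed stand-in for a case insensitive sort: fold each key to
    # upper case before comparing. This costs extra work per key.
    return sorted(keys, key=str.upper)


upper_only = ["DELTA", "ALPHA", "CHARLIE", "BRAVO"]
mixed = ["apple", "Banana", "cherry", "Date"]

# With keys in a single case the two orders agree, so the cheaper
# ASCII comparison is enough.
same_for_upper = ascii_sort(upper_only) == folded_sort(upper_only)

# With mixed case the orders differ: ASCII puts every upper case
# letter ahead of every lower case letter.
same_for_mixed = ascii_sort(mixed) == folded_sort(mixed)

print(same_for_upper, same_for_mixed)
print(ascii_sort(mixed))
print(folded_sort(mixed))
```

The mixed list shows why the shortcut is only safe for single-case keys: the ASCII order places "Banana" and "Date" ahead of "apple".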
October 10, 1991 RPSORT Reference Page 18

How Memory Size Affects RPSORT Speed And Need For Temp Disk Space

The amount of memory (I mean conventional memory not Expanded or Extended memory) affects RPSORT's speed in the following ways:

   . If memory is big enough or conversely the file is small enough to do the sort in memory, in one pass, then the sort will be optimally fast.

   . Otherwise the input must be sorted a chunk at a time with the chunks being written to a temp file. Then one or more merge phases will be required to combine the chunks. If memory is very small and many merge phases are required, RPSORT would slow down dramatically.

The following pages contain a lot of nitty gritty detail about the conditions which force RPSORT to use temp disk space and how much temp disk space it might need. You can ignore these details if your situation meets either of the following conditions:

   . No temp files are needed if the free memory (see "Using CHKDSK Or MEM To Determine Free Memory" below) equals or exceeds the input size plus twice the line/record count plus 70,000. A 10,000 line 400,000 byte file requires 400,000 plus (2 * 10,000) plus 70,000 or a total of 490,000 bytes of free memory to sort the input without using temp disk space.

   . If the drive assigned to hold temp files (either the default drive or the drive specified in the /T switch) has twice as much space as the size of the input file, this will always be sufficient.

Using CHKDSK Or MEM To Determine Free Memory

To determine the amount of free memory in your system, use the CHKDSK command which gives you a display something like:

       362496 bytes total disk space
        53248 bytes in 2 hidden files
       303104 bytes in 36 user files
         6144 bytes available on disk

       655360 bytes total memory
       581168 bytes free

The free memory is on the last line (581168 in this example).

If you own MS-DOS 5.0 you can use the MEM command and get something like:

       655360 bytes total conventional memory
       655360 bytes available to MS-DOS
       564288 largest executable program size

Here the free memory appears on the "largest executable program size" line (564288 in this case).

October 10, 1991 RPSORT Reference Page 19

Sorts Requiring No Merge Phase And No Temporary Files

If possible, RPSORT will do a sort in a single pass without requiring any temporary files. Use the following steps to determine whether a given file can be sorted in a single pass:

   . First determine the amount of free memory (called FREEMEM below). See "Using CHKDSK Or MEM To Determine Free Memory" above.

   . Then the memory space required, by RPSORT, for the input equals:

        File Size + Twice The Number Of Records/Lines In The File

     This sum is called FILESPACE below. For example, if the file size were 453,868 bytes and it consisted of 8,323 lines then FILESPACE would equal 453,868 + 8,323 + 8,323 or 470,514 bytes.

   . RPSORT also requires some memory for itself and for buffers and tables. This depends on the size of FREEMEM:

     . If FREEMEM exceeds 170,000 bytes RPSORT reserves 70,000 bytes. In this case, a file can be sorted in one pass if:

          FILESPACE is less than FREEMEM - 70,000

     . If FREEMEM is less than 170,000 bytes then RPSORT reserves 18,000 bytes plus one-third of the remainder of FREEMEM for itself. This means that a file can be sorted in a single pass if:

                                  2 * (FREEMEM - 18,000)
          FILESPACE is less than  ----------------------
                                            3

     . If FREEMEM is less than approximately 30,000 bytes, then RPSORT will be unable to do the sort at all.

October 10, 1991 RPSORT Reference Page 20
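
The single pass test described above can be worked through in a few lines. The following is a rough Python sketch of the published rules, not code taken from RPSORT. The function names are hypothetical. The manual does not say which rule applies when FREEMEM is exactly 170,000, so the sketch assumes the second rule, and it treats the "approximately 30,000" floor as exactly 30,000.

```python
def filespace(file_size, record_count):
    # FILESPACE = file size + twice the number of records/lines.
    return file_size + 2 * record_count


def single_pass_limit(freemem):
    # Returns the value that FILESPACE must stay below for a one pass
    # sort, or None when there is too little memory to sort at all.
    if freemem < 30_000:          # assumed exact cut-off for "approximately"
        return None
    if freemem > 170_000:
        return freemem - 70_000   # RPSORT reserves 70,000 bytes
    # RPSORT reserves 18,000 bytes plus one-third of the remainder.
    return 2 * (freemem - 18_000) / 3


def fits_in_one_pass(file_size, record_count, freemem):
    limit = single_pass_limit(freemem)
    if limit is None:
        return False
    return filespace(file_size, record_count) < limit


# The manual's example file: 453,868 bytes in 8,323 lines.
print(filespace(453_868, 8_323))                  # 470514
# Free memory taken from the CHKDSK example (581,168 bytes).
print(fits_in_one_pass(453_868, 8_323, 581_168))  # True
# The same file with only 400,000 bytes free needs temp files.
print(fits_in_one_pass(453_868, 8_323, 400_000))  # False
```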

Sorts Requiring One Merge Phase And One Temporary File

When a single pass sort is not possible, RPSORT breaks up the file into "chunks" and sorts each chunk separately. Then it merges these chunks to produce the sorted output. Use the following steps to check whether a file can be sorted with a single merge phase using only a single temporary file the same size as the input file:

   . Compute FREEMEM and FILESPACE as described in the previous section.

   . Then compute the number of chunks (called #CHUNKS below) as follows and round up to the next higher integer:

     . If FREEMEM exceeds 170,000 bytes then:

                        FILESPACE
          #CHUNKS = ----------------
                    FREEMEM - 70,000

     . If FREEMEM is less than 170,000 then:

                         3 * FILESPACE
          #CHUNKS = ----------------------
                    2 * (FREEMEM - 18,000)

   . Now compute the maximum number of chunks that RPSORT can merge at one time (called MAXMERGE below) as follows and round down to the next lower integer:

     . If FREEMEM exceeds 315,000 then:

                     FREEMEM - 50,000
          MAXMERGE = ----------------
                          16,000

     . If FREEMEM exceeds 90,000 but is less than 315,000 then:

          MAXMERGE = 16

     . If FREEMEM is less than 90,000 then:

                     8 * (FREEMEM - 18,000)
          MAXMERGE = ----------------------
                             36,000

   . If #CHUNKS is less than or equal to MAXMERGE, then RPSORT will do a single merge phase sort using a single temp file the same size as the input file.

October 10, 1991 RPSORT Reference Page 21

Sorts Requiring Two Or More Merge Phases And Two Temporary Files

If necessary, RPSORT will do a multiple merge phase sort. This requires two temporary files each the size of the input file.

Actually, RPSORT doesn't abruptly go to a full second merge phase if it can't do the sort in one merge phase. If #CHUNKS is less than twice MAXMERGE it does a one and a fraction merge phase sort.
The first temp file (TEMP1) will be the same size as the input but the second (TEMP2) will be smaller as follows:

                      #CHUNKS - MAXMERGE + 1
      Size of TEMP2 = ---------------------- * Size of input file
                             #CHUNKS

Deciding What Drives To Put Temporary And Output Files On

Reading one file and writing another file concurrently on the same drive is generally inefficient because it requires that the drive head assembly constantly move back and forth between the two files. This can slow things down significantly.

RPSORT always finishes reading the input file before it starts writing the output file. This means there is no loss of efficiency if the input and output are on the same drive. Of course there must be enough room on this drive to hold the output file.

If a sort requires temporary files they are written at the same time as the input file is read. Similarly, temporary files are read at the same time as the output file is written. The drive assigned for temporary files must have enough space to hold the entire input and in some cases twice that much.

Temp files go to the default drive but you can over-ride this with the /T switch. If you have a big enough RAM disk, you should consider putting the temp files there. This could markedly enhance the performance of RPSORT.

If you don't use a RAM disk, you should assign temp files to a drive other than the ones on which the input and output files reside. This dictum is not absolute, however, as indicated by the following:

   . If you have only one hard drive and both the input and output files reside there, you are better off putting the temp files on the same hard drive than on a floppy.

   . If you are short of disk space, putting the temp files and the output file on the same drive could help because the output file might be able to reuse part of the space allocated to temp files.
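
When sizing the temp drive, it may help to run the #CHUNKS, MAXMERGE and TEMP2 formulas from the preceding pages as a short calculation. The Python below is an illustrative sketch of those formulas only, not RPSORT's internal logic. All function names are hypothetical. The manual leaves the exact boundary values (170,000, 90,000 and 315,000) unassigned, so the branch chosen for each boundary here is an assumption.

```python
def ceil_div(a, b):
    # Integer division rounded up, used for "round up to the next integer".
    return -(-a // b)


def num_chunks(filespace, freemem):
    if freemem > 170_000:
        return ceil_div(filespace, freemem - 70_000)
    return ceil_div(3 * filespace, 2 * (freemem - 18_000))


def max_merge(freemem):
    # Rounded down, as the manual specifies.
    if freemem > 315_000:
        return (freemem - 50_000) // 16_000
    if freemem >= 90_000:
        return 16
    return 8 * (freemem - 18_000) // 36_000


def temp2_size(input_size, chunks, maxmerge):
    # Size of the second temp file in a "one and a fraction" merge.
    return (chunks - maxmerge + 1) * input_size // chunks


def merge_plan(filespace, freemem):
    chunks = num_chunks(filespace, freemem)
    limit = max_merge(freemem)
    if chunks <= 1:
        # A FILESPACE exactly equal to the one pass limit is an edge
        # case that this sketch does not try to settle.
        return "no merge phase"
    if chunks <= limit:
        return "one merge phase"
    if chunks < 2 * limit:
        return "one and a fraction merge phases"
    return "two or more merge phases"


# A 1,400,000 byte file of 50,000 lines (FILESPACE 1,500,000)
# sorted with 100,000 bytes of free memory.
print(num_chunks(1_500_000, 100_000))      # 28
print(max_merge(100_000))                  # 16
print(merge_plan(1_500_000, 100_000))      # one and a fraction merge phases
print(temp2_size(1_400_000, 28, 16))       # 650000
```

In this example TEMP1 would take 1,400,000 bytes and TEMP2 another 650,000, so the temp drive would need a little over two megabytes free.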
October 10, 1991 RPSORT Reference Page 22

Buffers Command In Your Config.Sys

MS-DOS allocates disk buffers in memory to support read and write operations. The buffers are usually 512 bytes each. MS-DOS allocates 10 or 15 buffers depending on whether your system has less or more than 512K of memory. Some applications run faster with a larger number of buffers. You specify this in your config.sys file. For example:

   BUFFERS=30

On my computer (a 10 MHz 286 with a slow hard disk):

   . Sorting a modest size file (say up to a megabyte) speeds up little if at all when I increase the number of buffers.

   . Sorting a large file (say a few megabytes) speeds up by only a few percent with BUFFERS=20.

   . BUFFERS=30 produces an additional small improvement for very large files (upwards of ten megabytes).

To fine tune the performance of RPSORT on your system, sort files of the type and size typical for you and test the effect of various BUFFERS values. In any case, you probably will use the number of buffers that is optimal for your principal applications, not for RPSORT.

Using Disk Cache Programs

Disk cache programs (like SMARTDRV.SYS which is distributed as part of the MS-DOS 5.0 package) set aside an area of memory called the disk cache. Typically the disk cache is allocated in expanded or extended memory and may be quite large (i.e. a megabyte or more).

Disk cache programs intercept accesses to disk and retain data from the disk in the cache. If the data is required later on, the disk cache program can provide the data from memory rather than having to go to the disk drive which would be much slower. If the retained data is needed often enough then the performance of your system will improve. Otherwise, your system may slow down due to the overhead of the disk cache program.

I can't make any definitive statement as to how disk cache programs might improve or degrade the performance of your system.
If you contemplate using a disk cache program, I suggest that you perform experiments with caches of different sizes and possibly with different cache programs. These experiments should include the entire range of activities you perform on your system.

October 10, 1991 RPSORT Reference Page 23

Special Situations

Sorting Files That Contain Tabs

If the input file contains tabs, they may need to be expanded to the proper number of spaces to align your sort keys. RPSORT can't sort such a file correctly because it doesn't expand tabs.

As a convenience, I have included a program called RPTAB. It reads your file and produces an output file that is the same except that the tabs have been expanded. The syntax is:

   RPTAB input-filespec output-filespec [tabstop...]

The parameters must be given in the order defined above. Listing tab stops is optional. If you specify none, the default tab stops are at positions 1, 9, 17, 25, 33... and so on at intervals of eight columns.

If you specify tab stops they must be a sequence of integers each greater than the preceding one. The first tab stop is always column 1 and need not be given. The interval between the last two explicit tab stops implies subsequent tab stops at the same interval.

The following command expands tabs to the default tab stops:

   RPTAB MYTABS.DAT MYSPACES.DAT

The following command says that tab stops are at positions 1, 6, 15, 27, 39, 51... etc. The interval of 12 between 15 and 27 is propagated to subsequent tab stops:

   RPTAB MYTABS.DAT MYSPACES.DAT 6 15 27

After creating MYSPACES.DAT as in the above examples, you could use RPSORT to sort it in the usual way.

You can also use RPTAB for the reverse operation. This means to replace spaces by tabs whenever possible. The syntax is:

   RPTAB /T input-filespec output-filespec [tabstop...]

The syntax is identical except for the addition of the /T switch.
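
The tab stop rule used for expansion (column 1 is always a stop, and the interval between the last two stops repeats) can be sketched as follows. This is a Python illustration written for this page, not a translation of RPTAB.PAS. The function names are hypothetical. It assumes that the implicit stop at column 1 counts when only one stop is listed, and that a tab typed exactly at a stop moves on to the next stop.

```python
def next_tab_stop(col, stops=()):
    # Return the first tab stop strictly to the right of column col
    # (columns are numbered from 1).
    all_stops = [1] + [s for s in stops if s != 1]
    if len(all_stops) == 1:
        all_stops.append(9)          # default interval of eight columns
    interval = all_stops[-1] - all_stops[-2]
    for stop in all_stops:
        if stop > col:
            return stop
    # Past the last listed stop: keep repeating the final interval.
    last = all_stops[-1]
    steps = (col - last) // interval + 1
    return last + steps * interval


def expand_tabs(line, stops=()):
    out = []
    col = 1                          # column the next character will occupy
    for ch in line:
        if ch == "\t":
            target = next_tab_stop(col, stops)
            out.append(" " * (target - col))
            col = target
        else:
            out.append(ch)
            col += 1
    return "".join(out)


# Stops 6 15 27 imply 1, 6, 15, 27, 39, 51 and so on.
print([next_tab_stop(c, (6, 15, 27)) for c in (1, 6, 15, 27, 39)])
# "cd" lands in column 6 and "ef" in column 15.
print(repr(expand_tabs("ab\tcd\tef", (6, 15, 27))))
```

The sketch does not check that the listed stops are increasing, which RPTAB requires.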
If you have a text file with a lot of spaces, RPTAB can reduce its size while leaving it readable by many text processing utilities.

This package includes the source code for RPTAB in the file RPTAB.PAS. It is written in Turbo Pascal and was compiled with version 6.0. You may modify RPTAB in any way you choose. Please! Please! do not distribute any modified version under my name.

RPTAB.PAS consists of Pascal statements and assembly language sub-routines. The latter were written using Turbo Pascal's inline assembler (a very useful addition by Borland).

October 10, 1991 RPSORT Reference Page 24

Writing The Output To The File That Contained The Input

Nothing stops you from specifying the same file as both input and output in an RPSORT command. It is dangerous but it can be beneficial in some circumstances.

It is possible to do this because RPSORT never starts writing the output file until after it has finished reading the input file. Therefore it will not destroy the input before it has read it.

The danger is that after RPSORT has started writing the output file but before it has finished, your system may go down, possibly due to a power failure or a software or hardware problem. In this case the input would be destroyed and the output would not yet exist. This would mean the loss of your data unless you had backed up your file or it could be recreated in some way.

The benefit is realized when you must put the output file on the same drive as the input file but there is not enough space on the drive to hold both. By using the same file for input as for output you would re-use the same disk space and thus might be able to do a sort that otherwise you could not do.

Once again, don't do this unless you have backed up your data or you have some relatively easy way to recover it.

None of the above applies if RPSORT is being used as a filter.
In that case if the output file is the same as the input then the input file will be destroyed by DOS before RPSORT even starts executing.

Two Incompatibilities With The DOS SORT

There are two exceptions to the statement that any command that works with the DOS SORT will produce the same result with RPSORT:

   . RPSORT will not let you type the input file from the keyboard.

   . The DOS SORT tacks the CRLF, that ends a line, onto the sort key. RPSORT doesn't. Thus, RPSORT sorts null lines to the beginning of a file. The DOS SORT precedes them with any line whose sort key starts with a character like tab or formfeed whose ASCII value is less than that for CR.

October 10, 1991 RPSORT Reference Page 25

Error Messages

Error Numbers And Return Codes

Each type of error that RPSORT can detect has been assigned an error number which appears in the corresponding error message. For example:

   ERROR 049: No room on disk to write sorted output file.

When RPSORT terminates, it sets the "errorlevel" return code as follows:

   . If the sort was successful, RPSORT sets the return code to zero.

   . If one or more syntax errors are discovered, the relevant error messages are displayed and the sort is terminated. The return code is set to the error number for the first error detected.

   . If an error is discovered while executing the sort (typically some kind of input, output or insufficient memory error), the appropriate error message is displayed and the return code is set to the error number for that error.

The error numbers are broken down into groups as follows:

   Error Number Group              Range Of Error Numbers
   ----------------------          ----------------------
   Syntax Errors                          1 - 34
   DOS Version Before 2.0                   37
   Insufficient Memory                   40 - 41
   Line/String Too Long Errors           43 - 44
   Input/Output Errors                   46 - 54

There are also a number of error messages with error numbers in the range 59 through 74 which should never happen. Any of these could imply a bug in RPSORT.
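
A program that launches RPSORT and reads back its return code could map the code to one of the groups above. The lookup below is a minimal Python sketch built only from the table; the names ERROR_GROUPS and classify are hypothetical, and the "unassigned" label for numbers outside every listed range is an assumption.

```python
# (lowest number, highest number, group name) taken from the table above.
ERROR_GROUPS = [
    (1, 34, "syntax error"),
    (37, 37, "DOS version before 2.0"),
    (40, 41, "insufficient memory"),
    (43, 44, "line/string too long"),
    (46, 54, "input/output error"),
    (59, 74, "never should happen (possible RPSORT bug)"),
]


def classify(return_code):
    # Zero means the sort completed successfully.
    if return_code == 0:
        return "success"
    for low, high, name in ERROR_GROUPS:
        if low <= return_code <= high:
            return name
    return "unassigned"


print(classify(0))     # success
print(classify(19))    # syntax error
print(classify(49))    # input/output error
```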

If you run RPSORT from a batch file, you can test the return codes in statements like:

   IF ERRORLEVEL 1 GOTO SORTERR

This would catch any return code greater than or equal to one and thus any error at all. Another example:

   IF ERRORLEVEL 40 GOTO EXECERR

This would catch any return code greater than or equal to 40 and thus any sort execution error.

October 10, 1991 RPSORT Reference Page 26

Syntax Error Messages

When RPSORT parses the command line it displays messages for any syntax errors it finds. It always parses the complete command line and therefore may report several errors.

Many error messages display the bad parameter at the end of the message. For example:

   ERROR 019: Only one key length allowed: "/+13:5:7"

In the message listing below, the quoted word "badparm" stands for the bad parameter that RPSORT is complaining about.

RPSORT never executes the sort if it finds syntax errors but instead terminates immediately after displaying the last error message. The list of syntax error messages follows:

ERROR 001: Slash (/) must be followed by a parameter.

   A slash was followed by a space. Slash must always be followed by one of the switch characters or it must start a sort key definition.

ERROR 002: Illegal parameter: "badparm"

   This message is displayed when RPSORT finds an illegal parameter but can't figure out a more specific error to cite. It lists this message and the bad parameter that it objects to.

ERROR 003: Only one /X switch is allowed.

   /X in this message will either be /F, /E or /T. Each of these switches may only be specified once in an RPSORT command.

ERROR 004: /P and /C are incompatible.

   /P and /C are mutually exclusive. /P says that all character string sort keys are Pascal style strings while /C says that all character string sort keys are C language style strings. There is no way a character string can be both of these.

ERROR 005: Record len must be between 1 and 32,750 in: "badparm"

   "badparm" is a /Fnnnn switch specifying a record length that is either zero or greater than 32750. This is not allowed.

ERROR 006: Pascal string key only allowed in fixed len record: "badparm"

   "badparm" is a sort key definition specifying a Pascal style string (i.e. including the P attribute). This is only allowed if a /Fnnnn switch was specified to tell RPSORT that the file consists of fixed length records.

October 10, 1991 RPSORT Reference Page 27

ERROR 007: /P only allowed for fixed length records.

   The /P switch which says that all character string sort keys are Pascal style sort keys is only allowed if a /Fnnnn switch was specified to tell RPSORT that the file consists of fixed length records.

ERROR 008: Binary number key (F,I,M,T or U) only allowed in fixed len record: "badparm"

   "badparm" is a sort key definition listing one of the binary number attributes. These are only allowed if a /Fnnnn switch was specified to tell RPSORT that the file consists of fixed length records.

ERROR 009: /N switch not allowed for fixed length records.

   The /N switch, which says that null lines are to be deleted, is only allowed for a file consisting of lines. It is not allowed if a /Fnnnn switch has been specified.

ERROR 010: One and only one temp drive letter may be entered: "badparm"

   "badparm" is a /T switch specifying either no drive letters or more than one. It should list only a single drive to be used for temporary files (e.g. /TC).

ERROR 011: Non-existent drive: "badparm"

   "badparm" is a /T switch specifying a drive letter that does not exist in your system.

ERROR 012: Invalid character for the drive: "badparm"

   "badparm" is a /T switch specifying a non-alphabetic drive. A drive can only be specified by a letter.

ERROR 013: Start column must be between 1 and 32,750: "badparm"

   "badparm" is a sort key definition specifying a start column that is either zero or larger than 32750. This is not allowed.

ERROR 014: Start column must not exceed record len: "badparm"

   "badparm" is a sort key definition specifying a start column that is larger than the record length in the /Fnnnn switch. This is not allowed.

October 10, 1991 RPSORT Reference Page 28

ERROR 015: Only one start column allowed: "badparm"

   "badparm" is a sort key definition that specifies more than one start column for the sort key. This is not allowed.

ERROR 016: Error in sort key: "badparm"

   "badparm" is an erroneous sort key definition. RPSORT is unable to cite a more specific error.

ERROR 017: Key len must be between 1 and 32,750: "badparm"

   "badparm" is a sort key definition specifying a key length that is either zero or larger than 32750. This is not allowed.

ERROR 018: Key len is too big to fit in record: "badparm"

   "badparm" is a sort key definition containing a key length that would cause the key to extend beyond the end of the record as specified by the /Fnnnn switch.

ERROR 019: Only one key length allowed: "badparm"

   "badparm" is a sort key definition that specifies more than one key length for the sort key. This is not allowed.

ERROR 020: Length for 80x87 floating point number must be 4, 8 or 10: "badparm"

   "badparm" is a sort key definition specifying an 80x87 type floating point number (attribute F). The key length (either explicit or the implied key length to the end of the record) is not one of the legitimate values (4, 8 or 10).

ERROR 021: Length for GWBASIC/BASICA floating point number must be 4 or 8: "badparm"

   "badparm" is a sort key definition specifying a GWBASIC/BASICA type floating point number (attribute M). Its key length (either explicit or the implied key length to the end of the record) is not one of the legitimate values (4 or 8).

ERROR 022: Length for Turbo Pascal floating point number must be 6: "badparm"

   "badparm" is a sort key definition specifying a Turbo Pascal type floating point number (attribute T). It specifies a key length other than 6 which is the only legitimate value. It is not necessary to specify a key length for Turbo Pascal floating point numbers because RPSORT assumes the length 6 by default.

October 10, 1991 RPSORT Reference Page 29

ERROR 023: Length for Pascal strings must be between 2 and 256: "badparm"

   "badparm" is a sort key definition specifying a Turbo Pascal type string (attribute P). It specifies a key length less than 2 or more than 256 which are the limits for this type of string and correspond to string[1] and string[255] respectively.

ERROR 024: Length for Pascal strings must be between 2 and 256.

   The /P switch was specified telling RPSORT that all character string type sort keys were Turbo Pascal style strings but at least one of the character string sort key definitions gave a key length less than 2 or more than 256. Alternatively, one of them had no explicit key length but the implied key length to the end of the record was not in the required range.

ERROR 025: "P" and "C" attributes are incompatible: "badparm"

   "badparm" is a sort key definition that specifies both the P and C attributes, thus saying that the sort key is both a Pascal style string and a C style string. This is not possible.

ERROR 026: C attribute conflicts with /P: "badparm"
or
ERROR 026: P attribute conflicts with /C: "badparm"

   "badparm" is a sort key definition that specifies the C or P attribute. This conflicts with the opposite /P or /C switch thus implying that the sort key is both a C style string and a Pascal style string. This is not possible.

ERROR 027: Sort key cannot be both a binary number and a string: "badparm"

   "badparm" is a sort key definition specifying one of the attributes (A, C or P) appropriate to a character string key and also one of the attributes (F, I, M, T, U) appropriate to a binary number key. This is not allowed.

ERROR 028: Only one binary key type allowed in a sort key: "badparm"

   "badparm" is a sort key definition which includes more than one of the binary number attributes (F, I, M, T, U). A sort key can't be two different kinds of number.

October 10, 1991 RPSORT Reference Page 30

ERROR 029: Only one list of input files and a single output file may be given. Found additional file spec: "badparm"

   "badparm" is the third filespec or list of filespecs listed in the command line. The first filespec or list of filespecs separated by plus signs is taken to be the input. Then there should be a single filespec for the output. For example:

      RPSORT INPUT1.DAT+INPUT2.DAT OUTPUT.DAT

   No additional filespec is allowed.

ERROR 030: Multiple files not allowed in output spec: "badparm"

   The first filespec or list of filespecs separated by plus signs is taken to be the input. Subsequently, in the command line, you would enter the output filespec. This must be a single file.

ERROR 031: Misplaced plus sign in input file list: "badparm"

   The list of files you specify for the input must be separated by plus signs. There must be no spaces around the plus signs and there must be no plus sign before the first filespec or after the last filespec in the list.

ERROR 032: Input is redirected from the standard input, output must go to the standard output. The following file spec is illegal: "badparm"

   You redirected the standard input to a file. In this case the output must also go to the standard output. This can either be to the screen by default or can be redirected to a file.

ERROR 033: You must specify an input file.

   You did not specify an input file either explicitly or by redirecting the standard input to a file. RPSORT insists that its input come from a file specified in one of these two ways.

ERROR 034: No name specified for error file: "badparm"

   "badparm" is a /E switch that did not include a file name. The /E switch must include a file name. For example:

      /ESORTERRS.TXT

October 10, 1991 RPSORT Reference Page 31

DOS Version Before 2.0 Message

ERROR 037: RPSORT requires MS DOS version 2.00 or later.

   RPSORT uses MS-DOS functions that were added in version 2.0 and therefore can not run with an earlier version.

Insufficient Memory Messages

ERROR 040: Not enough memory. RPSORT requires 30,000 bytes.

   RPSORT can run in very small amounts of memory but there is a limit.

ERROR 041: Not enough memory to hold at least two records/lines at a time.

   If the records or lines in your file are large, you may need more available memory than the basic 30K RPSORT usually requires. You need room to hold at least two lines or records at a time plus you need memory to hold the RPSORT program and a few tables and other odds and ends. In the extreme case where your lines or records are 32000 bytes each, you might need some 90K of memory to run RPSORT.

Line Or String Too Long Messages

ERROR 043: Line exceeds max length of 32750 bytes.

   RPSORT found a line in the file that exceeded the maximum allowed length of 32750 bytes.

ERROR 044: Found Pascal string whose length byte exceeds specified key length.

   The binary number in the first byte of a Pascal string (the length byte) must be less than the length attribute specified in the sort key definition. Otherwise, the string would extend beyond the end of the key.

October 10, 1991 RPSORT Reference Page 32

Input/Output Error Messages

ERROR 046: No data in input file(s) so nothing to do.

   There was no data in the input file(s).
Either the size of the input file(s) was zero or the first byte of each input file was a Ctrl-Z, thus terminating the input file(s) at the very beginning.

ERROR 047: Input file not found: filename
The input file named by "filename" was not found. Check the spelling of the name or add a path if appropriate.

ERROR 048: Error reading input file.
Normally you would not see this error message from RPSORT. Usually if there is an uncorrectable error while reading a disk, DOS will tell you and then prompt you to specify:

    Abort, Retry, Ignore, Fail

Typically you would try R for retry a few times to see if it can get past the error. If not, you would probably enter A for abort, and RPSORT would never know what happened since DOS would terminate it. If, however, you entered F for fail, then DOS would return this information to RPSORT and RPSORT would display the ERROR 048 message. If you were to enter I for ignore, then DOS would return to RPSORT with no indication that the read had failed. RPSORT would assume that data from the file had actually been read into memory. This data would be garbage, but RPSORT would happily sort the garbage and produce a meaningless output file.

ERROR 049: No room on disk to write sorted output file.
There is not enough space on the drive you assigned for the output file to hold it.

ERROR 050: No room on disk to write temp file.
The drive you assigned for RPSORT temporary files has insufficient space to hold them.

ERROR 051: Unable to create temp file.
There was an error attempting to create a temporary file. Probably, the disk you assigned to hold temporary files doesn't have enough directory entries available in the current directory. RPSORT may require up to three directory entries for temporary files.

ERROR 052: Unable to create output file.
There was an error attempting to create the output file.
Probably, the drive and directory you assigned to the output file don't have any directory entries available.

ERROR 053: Unable to create error file: "badparm"
There was an error attempting to create the error message file. Probably, the drive and directory you assigned to the error file don't have any directory entries available.

ERROR 054: Ran out of space on disk attempting to write error file. Redirecting error messages to the screen.
There is not enough space on the drive you assigned for the error message file to hold it. RPSORT displays the current and any subsequent error messages on the screen before terminating.

Never Should Happen Error Messages

At a number of points in RPSORT, I check for errors resulting from the use of DOS functions under conditions where in principle no errors could occur. Any such error would imply the possibility of a bug in RPSORT. If any of these error messages are displayed, please send me a precise description of the circumstances. This would include the amount of memory available in your system (the amount reported by CHKDSK or MEM, not the total amount), the size of the file in bytes and the count of records or lines, whether the file consists of fixed length records or lines, and the kind of sort key(s) you were using.

ERROR 059: Error allocating memory.
An error occurred using memory allocation functions in MS-DOS.

ERROR 060: Unknown error accessing disk.
    through
ERROR 074: Unknown error accessing disk.
These messages all have the same text, but the error number would tell me where in the program the error occurred. All of these have to do with disk I/O.
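
The length-byte rule described under ERROR 044 can be illustrated with a short Python sketch. This is not RPSORT's own code. The function name pascal_string_text is hypothetical, and the sketch assumes the key field has already been cut out of the record at exactly the length given in the sort key definition.

```python
def pascal_string_text(field: bytes) -> bytes:
    """Return the text portion of a Pascal style string key field.

    `field` is assumed to hold the whole key: one length byte followed
    by the character data. Its size stands in for the key length from
    the sort key definition.
    """
    if not field:
        raise ValueError("empty key field")
    length_byte = field[0]
    # The length byte itself occupies one position in the key, so the
    # text can use at most len(field) - 1 bytes. A length byte that is
    # not less than the key length would run past the end of the key.
    if length_byte >= len(field):
        raise ValueError("length byte exceeds specified key length")
    return field[1:1 + length_byte]
```

For a 6 byte key, a length byte of 5 is the largest value that passes this check; a length byte of 6 is the kind of value ERROR 044 describes.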