TITLE:          SNR.EXE
VERSION:        4.0
PURPOSE:        Multiple simultaneous Search 'N' Replace program
DATE:           7/19/90
AUTHOR:         THOMAS A. LUNDIN
                c/o Graphics Unlimited Inc.
                3000 Second Street No.
                Minneapolis, MN 55411
                (612) 588-7571
DONATION:       $25 requested

DESCRIPTION:    SNR is a multi-string search-and-replace filter.
                Both text and binary files can be processed by the
                program, since SNR allows the definition of hex
                values in a search-and-replace equation.

                SNR will translate a file of any size that your
                system can handle (I've used it on 50-megabyte
                files).

                Up to 50 multi-character (m:n) equations can be
                entered into an SNR table, each of which can be a
                maximum of 200 characters in length. Additionally,
                up to 256 single-character (1:1) equations can be
                loaded into an SNR table.

                All search-and-replace operations are performed in
                a single pass.

                NEW! Context flag allows toggling between two
                different output strings for the same input string.

                See the note on BIT STRIPPING, further on.

REVISION        4.0 - 7/19/90
HISTORY:          Removed OS/2 compatibility. Nobody used it.

                  Allow 256 1:1 equations to be loaded in addition
                  to 50 m:n equations.

                  Added context flag.

                3.01 - 11/3/89
                  Program recompiled for family mode, meaning it
                  will run as-is under DOS, OS/2 real mode, and
                  OS/2 protected mode.

                  Also improved I/O throughput on fast disk drives.

                3.0 - 9/27/89
                  The program has been completely rewritten. The
                  pesky bugs that were present in versions 1.5 and
                  2.0 have been eliminated.

                  All binary characters INCLUDING NULLS can now be
                  searched and replaced. Nulls can also be entered
                  as part of a larger search-and-replace equation.

                  An automatic "bit-stripping" feature has been
                  added which removes the high-order bits from
                  input characters, forcing 8-bit codes into 7-bit
                  ASCII.
                  ------THIS IS NOW THE DEFAULT ACTION!!!------
                  The feature can be selectively disabled for
                  binary files (see BIT STRIPPING, below).
                  The equation length has been increased to 200
                  characters, which can be split into search and
                  replace sides of any length, as long as the total
                  does not exceed 200.

                  Equations no longer need to be in any specific
                  order.

                  The average speed of the program has been vastly
                  improved, especially for multiple and
                  high-occurrence search and replaces.

                  The program no longer needs to read itself for
                  runtime parameters.

OPERATION:      The command line invocation is:

                    snr [@]filename ext tablename [/d] <Enter>

                1. "filename" is any unambiguous DOS path/file
                   name. An optional at-sign (@) in front of the
                   filename indicates that multiple files are to be
                   processed from a directory list file, i.e., a
                   file that has been created from a redirected DIR
                   command. For example, the DOS command

                       dir *.txt >dirfile <Enter>

                   will create a disk file named "dirfile" which
                   contains a directory of all files with a ".TXT"
                   extension.

                2. "ext" is the filename extension you wish to
                   assign each output file. The output files are
                   created using the name of the input file plus
                   the extension specified here.

                   The choice of an extension is important; if you
                   accidentally choose an extension which is
                   already used by an input file, the result of the
                   process is unpredictable, and file data loss is
                   likely to occur. Use an extension which is sure
                   to be unique.

                   Extensions can be the DOS device names CON, NUL,
                   AUX, and PRN. In any of these four cases, an
                   output file is NOT created on disk; rather, the
                   output is redirected to the console (CON),
                   nowhere (NUL), the RS-232 port (AUX), or the
                   printer (PRN). Using CON is handy for a quick
                   preview of the conversion process before storing
                   it to disk.

                   NOTE: If you preview converted binary files to
                   CON, be aware that hex code 1A (DOS end-of-file)
                   will terminate the conversion display, perhaps
                   prematurely. This premature termination will NOT
                   occur when you store to disk.

                3. "tablename" is the unambiguous DOS path/file
                   name which contains the string translation
                   "equations".
                   Although no restrictions are placed upon the
                   tablename, for the sake of clarity it is
                   suggested that you adopt a consistent naming
                   scheme for your tables (say, with an *.SNR
                   extension). Tables are discussed in more detail
                   below.

                4. "/d" is an option which replaces the original
                   input file with the converted data. The EXT
                   specified on the command line is thus used
                   temporarily until the conversion is complete, at
                   which time the program performs an internal file
                   delete/rename operation.

                EXAMPLES...

                    C>snr tstfil.doc txt tst.snr <Enter>

                The above command line will process the input file
                "tstfil.doc" and create an output file "tstfil.txt"
                using the table "tst.snr".

                    C>snr @dirlist p1 sample.snr <Enter>

                The above command line will process a group of
                files contained in the directory list "dirlist" and
                create a group of output files with extensions of
                *.p1 using the table "sample.snr".

CONVERSION      SNR tables are ASCII text files which contain the
TABLES:         search-and-replace equations used by the program.
                Up to 306 of these equations can be entered in a
                single table -- 256 1:1 equations, and 50 m:n
                equations. Each m:n equation can consist of up to
                200 characters, split freely between the search
                side and the replacement side.

                Blank lines in a table will be ignored.

                A sample 1:1 equation would be:

                    A=a

                The above equation would translate an upper-case
                'A' to a lower-case 'a'.

                A sample m:n equation would be:

                    Now is the time=NOW IS THE TIME

                The above equation will translate the words "Now is
                the time" to all upper-case. Notice that spaces ARE
                SIGNIFICANT characters in an equation.

                An equation can be defined to ignore search text
                (that is, throw it away on output) simply by
                leaving out a replacement string, like this:

                    Now is the time=

                If you want to output word spaces at the end of a
                replacement string, but your text editor strips
                trailing spaces, you can define them as hex codes:

                    Now is the time =NOW IS THE TIME\20

                In fact, any hex code can be formed from a
                backslash followed by two hex digits (0-9, a-f,
                A-F).
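The equation syntax described so far can be illustrated with a short Python sketch. This is not SNR's own code (SNR is a compiled DOS program); the function names decode_side and parse_equation are hypothetical, and the sketch assumes that the first "=" on a line separates the search side from the replacement side. It does not handle comments, the context flag, or the control codes covered later in this file.

```python
import re
from typing import Tuple

# A backslash followed by exactly two hex digits (0-9, a-f, A-F).
_HEX_ESCAPE = re.compile(rb"\\([0-9a-fA-F]{2})")


def decode_side(text: str) -> bytes:
    """Expand \\xx hex codes in one side of an equation into raw bytes.

    Assumes the table text fits in a single-byte encoding (latin-1 here).
    A backslash that is not followed by two hex digits is left unchanged.
    """
    raw = text.encode("latin-1")
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def parse_equation(line: str) -> Tuple[bytes, bytes]:
    """Split 'search=replace' at the first '=' and decode both sides.

    Spaces are kept as-is because they are significant in an equation.
    An empty replacement side means the search text is thrown away.
    """
    search, _, replace = line.partition("=")
    return decode_side(search), decode_side(replace)


if __name__ == "__main__":
    print(parse_equation(r"Now is the time =NOW IS THE TIME\20"))
    # prints (b'Now is the time ', b'NOW IS THE TIME ')
```

Working on bytes rather than text keeps the sketch usable for binary data, which is the main reason hex codes exist in the table format.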
                You'd normally use hex codes to search or replace
                binary characters that can't be generated directly
                from the keyboard. For instance, a carriage
                return/line feed sequence (CRLF) can be specified
                like this:

                    \0d\0a\0d\0a=\0d\0a

                The above equation will convert two CRLFs in a row
                to a single CRLF.

                There are three ASCII characters which MUST be
                specified as hex codes in an SNR equation, since
                they have special meaning in their normal ASCII
                form. They are:

                    Backslash (\), which must be entered as \5c
                    Equals (=), which must be entered as \3d
                    Asterisk (*), which must be entered as \2a

                The end of a table is signified by a \\E on a line
                by itself. This code is optional, but recommended,
                since it will prevent the table processor from
                inadvertently reading past the end of your
                equations (some word processors may pad their last
                blocks with garbage, which the table processor
                would attempt to read as equation data).

                SNR provides for comments in its tables. Comments
                can be entered in a table as lines by themselves,
                or set off from an equation:

    \ This is a comment line by itself.
    \ A comment consists of a single backslash
    \ followed by one or more spaces.
    \0d\0a=\0d\0a   \ this will ensure that existing CRLF pairs are
                    \ left untouched
    \0d=\0d\0a      \ this equation will convert an isolated CR
                    \ into a CRLF
    \0a=\0d\0a      \ this equation will convert an isolated LF
                    \ into a CRLF
    \\E

                If you're a programmer, or you've used a macro
                language of some kind, you know the value of
                program comments. If you're new to the field of
                user programming, get into the habit of commenting
                your work. Believe me, you'll be glad you did.

                SNR will automatically sort equations by length
                when assembling a table in memory. SNR does not
                perform a sequential search of equations, so the
                order in which you enter your equations is
                immaterial.

                There is a provision to load 256 equations that
                have one character to search for, and one character
                to replace with (1:1).
                This greater capacity is available to allow you to
                develop equations that can translate a file's
                entire codeset (for instance, from ASCII to EBCDIC
                or vice versa). See the sample tables ASC2EBC.SNR
                and EBC2ASC.SNR for examples of tables that use 1:1
                equations.

CONTEXT FLAG:   There are probably instances where you'd like to
                toggle between two different replacement strings
                for the same search string. The context flag allows
                you to do this. Some example uses would be:

                - to compress multiple spaces from a document
                - to ignore any text between two codes
                - to make one code alternate as two different codes
                - to turn all-upper-case text into upper and lower

                ...and many other uses.

                The context flag has two states: on or off. (When
                SNR begins execution, the context flag's state is
                OFF). The flag's state can be tested in a search,
                and its state can be set or reset in a replacement.

                The "off" state of the context flag is entered as
                *00. The "on" state is entered as *01. The context
                flag is the last item entered in a search or
                replacement string. For example:

                    ABC*00=abc*01
                    ABC*01=XYZ*00

                In the above two equations, if the string "ABC" is
                read from the input file, and the context flag is
                OFF, then the string "abc" is written to the output
                file, and the context flag is set ON.

                If, on the other hand, the string "ABC" is read
                from the input file, and the context flag is ON,
                then the string "XYZ" is written to the output
                file, and the context flag is reset OFF.

                For example, if data from the input file looks like
                this:

                    ABCDEFG ABCDEFG ABCDEFG ABCDEFG

                ...it would convert to this:

                    abcDEFG XYZDEFG abcDEFG XYZDEFG

                Since the context flag is global in scope, you can
                define many different equations to test and set it
                in a single table. But beware of unwanted side
                effects -- remember that only one equation at a
                time will control the state of the flag, and its
                state may change between two equations that
                complement each other, resulting in misconversion.
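The two-equation toggle shown above (ABC*00=abc*01 and ABC*01=XYZ*00) can be traced with a small Python simulation. This is only a sketch of the behaviour described in this file, not SNR's actual matching code: the names parse and convert are hypothetical, hex codes are not expanded, and trying longer search strings first is an assumption based on the note that SNR sorts equations by length.

```python
def parse(line):
    """Parse 'SEARCH[*0n]=REPLACE[*0n]' into (search, test, replace, new_state)."""

    def split_flag(side):
        # A trailing *00 / *01 is the context flag; None means "not given".
        if side.endswith("*00"):
            return side[:-3], False
        if side.endswith("*01"):
            return side[:-3], True
        return side, None

    search_side, _, replace_side = line.partition("=")
    search, test = split_flag(search_side)
    replace, new_state = split_flag(replace_side)
    return search, test, replace, new_state


def convert(text, table):
    """Run text through the table in one pass, tracking one global flag."""
    # Assumption: longer search strings are tried first.
    equations = sorted((parse(line) for line in table), key=lambda e: -len(e[0]))
    flag = False  # the context flag is OFF when execution begins
    out, i = [], 0
    while i < len(text):
        for search, test, replace, new_state in equations:
            flag_ok = test is None or test == flag
            if search and flag_ok and text.startswith(search, i):
                out.append(replace)
                if new_state is not None:
                    flag = new_state  # set or reset by the replacement side
                i += len(search)
                break
        else:
            out.append(text[i])  # no equation matched: copy the character
            i += 1
    return "".join(out)


if __name__ == "__main__":
    table = ["ABC*00=abc*01", "ABC*01=XYZ*00"]
    print(convert("ABCDEFG ABCDEFG ABCDEFG ABCDEFG", table))
    # prints abcDEFG XYZDEFG abcDEFG XYZDEFG
```

Because the flag is a single variable shared by every equation in the table, adding more equations that test and set it changes the result for all of them, which is the source of the side effects mentioned above.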
                As an example of such a side effect:

                    ABC*00=abc*01
                    ABC*01=XYZ*00
                    EFG*00=efg*01
                    EFG*01=ZYX*00

                The above equations are similar to the previous
                example, except that we have defined two new
                equations that test and set the context flag.

                Given this example input data:

                    ABCDEFG ABCDEFG ABCDEFG ABCDEFG

                ...the result would convert to this:

                    abcDZYX abcDZYX abcDZYX abcDZYX

                The context flag is being toggled by only two
                equations:

                    ABC*00=abc*01   <--- this one
                    ABC*01=XYZ*00
                    EFG*00=efg*01
                    EFG*01=ZYX*00   <--- and this one

                If, on the other hand, the input data stream looked
                like this:

                    ABCDEFG DEFGABC ABCDEFG DEFGABC

                ...the result would look like this:

                    abcDZYX DefgXYZ abcDZYX DefgXYZ

                The thing to keep in mind if you are going to use
                the context flag is: it is OFF when the program
                begins; you need to enter an equation to set it ON;
                and then you need to have some equation that resets
                it OFF.

                See some of the sample .SNR tables for examples in
                the use of the context flag.

BIT STRIPPING:  SNR's default action is to map all incoming
                characters to 7-bit ASCII before running them
                through the equations. This is handy if you're
                converting an old WordStar file, and don't want to
                deal individually with the tagged characters.

                This action, however, is not at all welcome when
                you're dealing with files that make use of specific
                binary characters (examples: WordPerfect, XyWrite,
                others). In cases where you DON'T want 7-bit
                mapping to occur, type a \\L8 as the first line of
                a table; this will disable bit-stripping for that
                table.

                An interesting side-effect of the bit-stripping
                feature is illustrated by the table STRIP.SNR. It
                contains NO equations at all (except for the \\E),
                and yet it will effectively remove the high-order
                bits from any file that it is run through. Try it!

NOTES:          The program will run on any IBM PC/XT/AT series
                computer using MS-DOS 2.x or higher, with a minimum
                of 128K RAM.

                SNR was compiled with Turbo C++ 2.0, but was
                written using ANSI C, not C++ (yuk!).

DISCLAIMER:     This program is distributed as shareware.
                Use it, copy it, upload it, give it to your
                friends. Please distribute only the complete
                program in ZIP form, including all document and
                sample files.

                No warranties, either expressed or implied, are
                given by the author or distributor of the program,
                and the user accepts all risk of damage arising out
                of the application and use of the program.

CONTRIBUTION:   If you like SNR, please mail a contribution of $25
                to help defray the cost of developing and
                supporting the program (and to help my family with
                our moving expenses!). At that price, it's one of
                the bargains of the text processing world, and I'll
                bet you recoup that investment dozens, maybe
                hundreds, of times over.

                Please make checks payable to THOMAS A. LUNDIN
                (please, not to the company I work for) and mail to
                the address shown at the top of this DOC file.

-----------------------------------------------------------

Send comments/bug reports/contributions to:

                ╔═════════════════════════════╗
                ║ THOMAS A. LUNDIN            ║
                ║ c/o Graphics Unlimited Inc. ║
                ║ 3000 Second Street No.      ║
                ║ Minneapolis, MN 55411       ║
                ║ Daytime # (612) 588-7571    ║
                ╚═════════════════════════════╝

You can also reach me at my BBS home base:

                PC-ROCKLAND BBS
                If you can't find a program here,
                it probably doesn't exist!
                (914) 353-8153
                (Leave msg. for "Tom Lundin")

Thank you for using SNR.