Program usage documentation for the Free-DOS FIN program
(c) Copyright 1993-1996 by K. Heidenstrom.
Modified:
KH.19950130.001 Started work on interim version
KH.19950201.002 More work
KH.19950202.003 (FIN 1.4.0) First release
KH.19950203.004 Tidied up the usage examples
KH.19950521.005 (FIN 1.4.2) No changes required
KH.19950715.006 (FIN 1.4.3) Add -P switch
KH.19950716.007 (FIN 1.4.4) Add -L switch
KH.19950717.008 (FIN 1.4.5) Default to *.* if no pathspec but switches
KH.19951217.009 (FIN 1.4.6) Add -F switch
KH.19960218.010 (FIN 1.4.7) Add '6' and '3' parameters to -S option
1. LEGAL
The FIN.COM program is Copyright 1993-1996 by K. Heidenstrom. The
author may be reached at kheidens@actrix.gen.nz on the Internet or by
snail mail: K. Heidenstrom c/- P.O. Box 27-103, Wellington, New Zealand.
This program is free software. You may redistribute the source and
executable and/or modify the program under the terms of the GNU
General Public License as published by the Free Software Foundation;
either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but is provided "as-is", without any warranty of any kind, including
the implied warranty of merchantability or fitness for a particular
purpose. In no event will the author be liable for any damages of
any kind related to the use of this program. See the GNU General
Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
2. FUNCTION AND USAGE SYNTAX
The FIN program displays various pieces of information about a file or
a list of files. The program can recurse subdirectories if the -R
switch is given. Any combination of the following can be displayed:
Directory information (date, time, and size)   (-D switch)
Attributes (read-only/archive/system/hidden)   (-A switch)
Number of clusters occupied                    (-Sc switch)
CRCs or checksums                              (-Cxx switch)
Word and line counts                           (-Wn switch)
Histogram (frequency of values)                (-H switch)
If two or more files are processed, totals are displayed for:
Size
Number of clusters
Word and line counts
Histogram
The program usage syntax is:
FIN [-A] [-Cxx] [-D] [-F] [-H] [-L] [-P] [-R] [-Sc] [-Wn] Pathspec [...]
Each of the option switches enables a specific function, and each is
described separately below. The -R switch specifies recursing into
subdirectories; the -F, -L, and -P switches control how the output is
presented; the other switches enable the calculation and display of
specific types of information.
Following the option switches are one or more pathspecs. These may
contain an optional drive specification, optional directory path, and
optional file specification, which may include wildcards.
Note that all option switches must appear before the first pathspec.
Also, option switches may be run together, for example the "-A", "-D"
and "-R" switches may be combined into "-ADR".
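The way run-together switches break down can be sketched in Python.
This is only an illustration of the syntax described in this document,
not FIN's actual parser; the function name split_switches and the
greedy handling of the character after 'S' are assumptions.

```python
# Illustrative splitter for run-together FIN switches (not FIN's own code).
CHECK_FORMS = {"RC", "BS", "WS", "BX", "WX"}   # -CRC, -CBS, -CWS, -CBX, -CWX
SIMPLE = set("ADFHLPR")                        # switches that take no parameter


def split_switches(arg):
    """Split e.g. '-ADR' into ['A', 'D', 'R']; parameters are case-insensitive."""
    text = arg[1:].upper()                     # drop the leading '-' (or switchar)
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch in SIMPLE:
            out.append(ch)
            i += 1
        elif ch == "C" and text[i + 1:i + 3] in CHECK_FORMS:
            out.append(text[i:i + 3])          # -Cxx takes two more characters
            i += 3
        elif ch == "S" and i + 1 < len(text):
            out.append(text[i:i + 2])          # assumed: next character is 'c'
            i += 2
        elif ch == "W" and text[i + 1:i + 2] in ("1", "2"):
            out.append(text[i:i + 2])          # -W1 or -W2
            i += 2
        else:
            raise ValueError("illegal or incorrectly formed option: " + arg)
    return out
```

Under these assumptions, split_switches("-acrcds5w1") yields the
switches A, CRC, D, S5 and W1.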
If a directory is specified, with no filespec, the default file mask
'*.*' is appended. This can also be forced by a trailing backslash.
If no pathspec is given, and at least one switch is given, FIN will
use a default pathspec of '*.*'. For example, the command FIN -D is
equivalent to FIN -D *.*. However, if no switches are given, i.e.
FIN with no parameters, FIN will issue a usage message and terminate.
All parameters are case-insensitive. Switches may begin with '-' or
with the currently active DOS switch character (switchar) which is
normally '/'. FIN does not support the forward slash as a directory
separator character, regardless of the setting of the switchar.
If no parameters are specified, or if an illegal or incorrectly formed
option is specified, the program issues a usage summary message and
terminates, returning errorlevel 255.
2.1. DIRECTORY INFORMATION - THE -D SWITCH
If the -D switch is given, directory information is reported for each
file. Directory information consists of the file size, the file date,
and the file time.
The file size is reported in bytes.
The date is reported in numeric form. The order of the numbers is set
by the country code specified in CONFIG.SYS. There are three date
formats: the American format MM-DD-YY, the European format DD-MM-YY,
and the Japanese format YY.MM.DD. Also, the date separator character
may be '/' in some cases.
The time is reported in 24-hour format as hours and minutes. Seconds
are not displayed.
If two or more files are processed, FIN will display a totals message
at the end of processing, giving the total number of bytes in all
files, and/or total number of clusters occupied, and/or total number
of files, depending on the options used.
2.2. FILE ATTRIBUTES - THE -A SWITCH
If the -A switch is given, the attributes (read-only, archive, system
and hidden) are reported for each file. Each of these attributes may
be true or false. A true attribute is shown in upper case; a false
attribute is shown in lower case.
A common attribute display is 'rAsh'. This indicates that the file
is not read-only (the 'r'), does have its archive attribute set (the
'A'), and is not system or hidden (the 's' and 'h').
Note that the -A switch also enables processing of files with unusual
attributes, specifically hidden and system files. If -A is not given,
files which have their hidden or system attributes set are not listed
or processed.
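The attribute display can be sketched as follows. The bit values are
the standard DOS directory-entry attribute bits; the function name
attr_string is made up for this illustration, and the code is a sketch
rather than an extract from FIN.

```python
# Standard DOS directory-entry attribute bits.
READ_ONLY, HIDDEN, SYSTEM, ARCHIVE = 0x01, 0x02, 0x04, 0x20


def attr_string(attr):
    """Return e.g. 'rAsh': upper case if the attribute is set, lower case if not."""
    flags = (("r", READ_ONLY), ("a", ARCHIVE), ("s", SYSTEM), ("h", HIDDEN))
    return "".join(c.upper() if attr & bit else c for c, bit in flags)
```

For an ordinary file with only the archive bit set (20h), this gives
'rAsh', matching the common display described above.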
2.3. FILE CLUSTERS - THE -Sc SWITCH
The -Sc switch causes FIN to calculate and display the number of
clusters occupied by the file. A cluster is the smallest unit of
disk storage that can be allocated by DOS. Therefore even a 1-byte
file will always occupy at least one cluster of disk space. Cluster
sizes are always a power of two, and are typically in the range 512
bytes to 32K bytes. 360K and 720K floppies usually have 1K clusters,
1.2M and 1.44M floppies usually have 512-byte clusters, and hard disks
typically have 2K, 4K, 8K, 16K or 32K clusters.
The 'c' character may be a '*' (asterisk), a number (any of 1, 2,
3, 4, 5, 6, and 8), or a drive letter.
If 'c' is an asterisk, FIN uses the cluster size of the drive on
which each file resides to calculate the number of clusters it
occupies. The result is the number of clusters the file currently
occupies on that drive.
If 'c' is a number, this will be taken to indicate the cluster size
for which you wish to calculate the number of clusters occupied.
The number may be one of the following:
1 Calculate size based on 1K cluster size
2 Calculate size based on 2K cluster size
3 Calculate size based on 32K cluster size
4 Calculate size based on 4K cluster size
5 Calculate size based on 512-byte cluster size
6 Calculate size based on 16K cluster size
8 Calculate size based on 8K cluster size
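The arithmetic behind the numeric forms is a simple round-up division.
The sketch below uses the digit-to-size table given above; the names
CLUSTER_SIZES and clusters_needed are illustrative only. How FIN
itself reports a zero-length file is not stated here; this sketch
returns zero clusters for it.

```python
# Cluster size in bytes selected by each digit of the -Sc switch.
CLUSTER_SIZES = {
    "1": 1024, "2": 2048, "3": 32768, "4": 4096,
    "5": 512, "6": 16384, "8": 8192,
}


def clusters_needed(size_bytes, cluster_bytes):
    """Round the file size up to a whole number of clusters."""
    return (size_bytes + cluster_bytes - 1) // cluster_bytes
```

For example, an 87654-byte file needs 172 clusters of 512 bytes (-S5)
but only 3 clusters of 32K (-S3).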
If 'c' is a letter, it is taken to be a drive identifier. In this
case, FIN accesses the drive before processing files, to determine
its cluster size, and also reports the cluster size, the total number
of clusters, and the number of available clusters on the specified
drive, before the files are listed.
This option is useful for determining how to fit a number of files
onto an archival floppy disk, and may also have other uses.
If two or more files are processed, FIN will display a totals message
at the end of processing, giving the total number of clusters in all
files. If this option is used in conjunction with the -D option, the
total bytes, total clusters and number of files are all displayed.
2.4. THE -Cxx, -H, and -Wn SWITCHES
Unlike the other switches, these three switches cause FIN to read
and examine the contents of each file.
The -Cxx switch causes FIN to calculate a checksum or a CRC (a much
more robust type of checksum) on the contents of the file.
The -H switch causes FIN to calculate and dump a histogram of values
for the contents of the file.
The -Wn switch causes FIN to count the number of words and lines in
the file. This is only meaningful for text files.
If any of these three switches is specified, FIN must read the
contents of every file it processes, which takes longer than examining
directory information alone.
The -Cxx and -H switches are appropriate for use with any type of
file - text files, binary files, executable files, data files, etc.
The -Wn (word and line count) switch should only be used with text
files as it does not give a meaningful result on other file types.
2.5. CHECKSUM AND CRC - THE -Cxx SWITCH
CRCs and checksums are used to verify that data has not been corrupted.
They are often used in communications protocols in networks and in file
transfers such as Xmodem, Ymodem, and Zmodem, and also as a block check
on each sector on floppy disks and hard disks. When the data is sent,
or stored, it is accumulated using a particular algorithm (method) and
the result of the accumulation is stored along with the data. When the
data is received or read, the same algorithm is applied, and the result
is checked against the value that accompanies the data. If they differ,
then the data and/or the check value has been corrupted.
FIN can calculate block check values on the files and display them.
This is mainly useful as a quick check that two files are the same,
especially with the CRC check, which is very robust. If two copies
of the same file have the same CRC, the chances of their contents
being different are very low.
There are five forms of this switch:
-CRC 32-bit CRC (ANSI X3.66)
-CBS Checksum, byte-wise summed
-CWS Checksum, word-wise summed
-CBX Checksum, byte-wise XORed
-CWX Checksum, word-wise XORed
Each of these switches specifies a different algorithm for the
accumulation process. They are mutually exclusive, because FIN only
calculates one type of check at a time.
In all cases, the file is treated as a binary file, i.e. there is
no special text processing. This is the correct way to do this
processing, but it may lead to confusion as the file will give a
different checksum or CRC if presented in different forms. For
example, a text file in Unix format (end of line marked by a
linefeed) will give a different checksum or CRC from the same text
file in DOS format (end of line marked by a carriage return and
linefeed). Also, any Ctrl-Z characters (typically used in DOS to
mark the end of a text file) are not treated specially.
Each -Cxx form is now described.
2.5.1. CRC
The -CRC switch specifies a 32-bit CRC (cyclic redundancy check). The
version used by FIN is known as ANSI X3.66.
A CRC is an arithmetic check method based on the remainder after
polynomial division, which is too complex to explain here. CRCs are
widely accepted as a reliable and efficient check method. They are
used in Zmodem, on floppy disks, and in archivers such as PKZIP and
LHA.
If -CRC is specified, FIN outputs an eight digit hexadecimal value at
the start of each file information line, which corresponds to the 32-
bit CRC value that was calculated for the file.
The algorithm and data tables for the CRC portion of FIN were derived
from a program called CRC.C, written by Gary S. Brown, Copyright 1986,
which was free software. For details, see the source code to FIN.
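For readers who want to see the shape of a table-driven 32-bit CRC,
here is a minimal Python sketch. It uses the common reflected
polynomial 0EDB88320h, which generates the same lookup table as the
one published with CRC.C. The initial value of 0FFFFFFFFh and the
final inversion are the usual conventions for this CRC and are assumed
here; the FIN source remains the authority on what FIN actually does.

```python
def make_table():
    """Build the 256-entry lookup table for the reflected CRC-32 polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = make_table()


def crc32(data):
    """Return the 32-bit CRC of a bytes object, processed as raw binary."""
    crc = 0xFFFFFFFF                       # assumed initial value
    for b in data:
        crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF                # assumed final inversion


def crc_hex(data):
    """Format the CRC as eight hexadecimal digits."""
    return "%08X" % crc32(data)
```

With these conventions, the nine ASCII characters "123456789" give the
well-known check value CBF43926.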
2.5.2. CHECKSUM, BYTE-WISE SUMMED
Each 8-bit byte (character) from the file is added into a 16-bit
accumulator, which was initialized to zero. The accumulator may
overflow during the accumulation process. The result is displayed
in the first column of the file information line as a four-digit
hexadecimal value.
2.5.3. CHECKSUM, WORD-WISE SUMMED
Each pair of 8-bit bytes (characters) from the file is taken as a
16-bit number (the first byte becomes the low-order byte of the
number, the second byte becomes the high-order byte of the number),
and each resulting 16-bit value is added into a 16-bit accumulator,
which was initialized to zero. The accumulator may overflow during
the accumulation process. The result is displayed in the first column
of the file information line as a four-digit hexadecimal value.
If the file contains an odd number of bytes, the final byte is taken
as the low-order byte, with a high-order byte value of zero.
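The two summed checksums described above can be sketched in Python as
follows. The function names are illustrative only; the masking with
0FFFFh models the overflow of the 16-bit accumulator.

```python
def words_le(data):
    """Yield 16-bit words, first byte low-order; pad an odd final byte with zero."""
    for i in range(0, len(data), 2):
        lo = data[i]
        hi = data[i + 1] if i + 1 < len(data) else 0
        yield lo | (hi << 8)


def sum_bytes(data):
    """Byte-wise summed checksum in a 16-bit accumulator."""
    acc = 0
    for b in data:
        acc = (acc + b) & 0xFFFF           # discard any overflow
    return acc


def sum_words(data):
    """Word-wise summed checksum in a 16-bit accumulator."""
    acc = 0
    for w in words_le(data):
        acc = (acc + w) & 0xFFFF           # discard any overflow
    return acc
```

For the three bytes 01h 02h 03h, the byte-wise sum is 0006h and the
word-wise sum is 0201h + 0003h = 0204h.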
2.5.4. CHECKSUM, BYTE-WISE XORED
Each 8-bit byte (character) from the file is exclusive-ORed into a
16-bit accumulator, which was initialized to zero. The result is
displayed in the first column of the file information line as a
four-digit hexadecimal value, though the first two digits will always
be zero.
2.5.5. CHECKSUM, WORD-WISE XORED
Each pair of 8-bit bytes (characters) from the file is taken as a
16-bit number (the first byte becomes the low-order byte of the
number, the second byte becomes the high-order byte of the number),
and each resulting 16-bit value is exclusive-ORed into a 16-bit
accumulator, which was initialized to zero. The result is displayed
in the first column of the file information line as a four-digit
hexadecimal value.
If the file contains an odd number of bytes, the final byte is taken
as the low-order byte, with a high-order byte value of zero.
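The two XORed checksums can be sketched in the same way. Again the
names are illustrative only, and this is a sketch of the description
above rather than FIN's own code.

```python
def xor_bytes(data):
    """Byte-wise XOR checksum; the result never exceeds 00FFh."""
    acc = 0
    for b in data:
        acc ^= b
    return acc


def xor_words(data):
    """Word-wise XOR checksum; first byte of each pair is the low-order byte."""
    acc = 0
    for i in range(0, len(data), 2):
        lo = data[i]
        hi = data[i + 1] if i + 1 < len(data) else 0   # odd final byte
        acc ^= lo | (hi << 8)
    return acc
```

For the three bytes 01h 02h 03h, the byte-wise XOR is 0000h and the
word-wise XOR is 0201h XOR 0003h = 0202h.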
2.6. HISTOGRAMS - THE -H SWITCH
A histogram is a map of numbers of occurrences, or frequencies of
occurrence. Usually they are displayed graphically. If -H is
specified, FIN will generate a text-based dump of the histogram of
byte values in each file. The histogram dump consists of zero or
more lines, each of the form:
<nnnnnnn> of byte value <ddd> 0<xx>h '<c>'
The <nnnnnnn> value gives the number of times that the byte value
occurred in the file contents. The <ddd>, <xx> and <c> values are
decimal, hexadecimal, and ASCII representations of the byte value.
The ASCII representation is not shown if it is not a printable ASCII
character.
The histogram dump is always displayed in order of decreasing
frequency, i.e. the <nnnnnnn> values always decrease through the
dump.
If two or more files are processed, FIN will display a combined
histogram for all files processed, at the end of processing.
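A histogram dump of this kind can be sketched in Python. The exact
column widths, the order of byte values that occur equally often, and
the range treated as printable (32 to 126) are assumptions made for
this illustration; FIN's own output may differ in those details.

```python
from collections import Counter


def histogram_lines(data):
    """Return dump lines for a bytes object, most frequent byte value first."""
    lines = []
    counts = Counter(data)
    # Assumed tie-break: equal counts are listed in ascending byte value.
    for value, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        text = "%7d of byte value %3d 0%02Xh" % (n, value, value)
        if 32 <= value <= 126:             # assumed printable ASCII range
            text += " '%c'" % value
        lines.append(text)
    return lines
```

A combined histogram for several files can be obtained in this sketch
by adding the Counter objects together before sorting.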
2.7. WORD AND LINE COUNTS - THE -Wn SWITCH
Specifying the -Wn switch causes FIN to count the number of words
and lines in each file. The 'n' character specifies which definition
of 'word' to use. There are currently two word definitions. Both
definitions define character values as either word characters or
whitespace (whitespace is spaces, tab characters, newlines, and other
characters that may appear between words).
A word is defined as one or more word characters with whitespace on
both sides. Before the beginning of the file, and after the end of
the file, are assumed to be whitespace.
The two definitions are as follows. You may wish to refer to an ASCII
table when reading these definitions. Note that the definitions are
based on the standard ASCII character set and the IBM PC extended
(high-ASCII) character set.
Definition 1 (selected by -W1):
Byte range   Character range   Defined as
----------   ---------------   ----------
0-25         NUL to Ctrl-Y     Whitespace
26           Ctrl-Z            End of file marker
27-47        ESC to "/"        Whitespace
48-57        "0" to "9"        Word characters
58-64        ":" to "@"        Whitespace
65-90        "A" to "Z"        Word characters
91-96        "[" to "`"        Whitespace
97-122       "a" to "z"        Word characters
123-127      "{" to Del        Whitespace
128-154      "Ç" to "Ü"        Word characters
155-159      "¢" to "ƒ"        Whitespace
160-167      "á" to "º"        Word characters
168-255      "¿" to 0FFh       Whitespace
Definition 2 (selected by -W2):
Byte range   Character range   Defined as
----------   ---------------   ----------
0-25         NUL to Ctrl-Y     Whitespace
26           Ctrl-Z            End of file marker
27-32        ESC to " "        Whitespace
33-44        "!" to ","        Word characters
45           "-"               Whitespace
46-255       "." to 0FFh       Word characters
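The two definitions can be expressed as a short Python sketch. The
byte ranges come straight from the tables above. Treating Ctrl-Z as
the point where counting stops is an assumption about what "end of
file marker" means in practice, and the function names are
illustrative only.

```python
def is_word_char(b, definition):
    """Classify a byte value under word definition 1 or 2."""
    if definition == 1:
        return (48 <= b <= 57 or 65 <= b <= 90 or 97 <= b <= 122
                or 128 <= b <= 154 or 160 <= b <= 167)
    return 33 <= b <= 44 or 46 <= b <= 255     # definition 2


def count_words(data, definition=1):
    """Count runs of word characters bounded by whitespace or the file ends."""
    words = 0
    in_word = False
    for b in data:
        if b == 26:                # Ctrl-Z: assumed to end the counting
            break
        if is_word_char(b, definition):
            if not in_word:
                words += 1         # a new word starts here
            in_word = True
        else:
            in_word = False
    return words
```

The difference between the definitions shows up with punctuation:
"don't stop" is three words under definition 1 (the apostrophe is
whitespace) but two words under definition 2.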
Line counting is done using an algorithm that is designed to support
Unix files (end of line marked with linefeed), Apple format files
(end of line marked with carriage return), and DOS files (end of line
marked with a carriage return and linefeed) automatically.
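One way to count lines so that all three conventions work without
being told the file format is sketched below. This is not FIN's
algorithm, which is not given in this document. Counting a final line
that has no end-of-line marker, and stopping at Ctrl-Z, are both
assumptions of the sketch.

```python
def count_lines(data):
    """Count lines ended by CR, LF, or CR followed by LF."""
    lines = 0
    pending = False                # characters seen since the last line end
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b == 26:                # Ctrl-Z: assumed to end the counting
            break
        if b == 13:                # CR, optionally followed by LF
            lines += 1
            pending = False
            if i + 1 < n and data[i + 1] == 10:
                i += 1             # CR+LF counts as one line end
        elif b == 10:              # LF on its own
            lines += 1
            pending = False
        else:
            pending = True
        i += 1
    return lines + (1 if pending else 0)
```

With this approach the same two-line text gives a count of 2 whether
its lines end in linefeed, carriage return, or both.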
If selected, the word and line counts are displayed just before the
file name.
Note that word and line counting is only meaningful for text files.
Using the -Wn option with non-text files (e.g. executable program
files, overlay files, database files, configuration files, archive
files, etc) will not give a useful result.
2.8. FILENAMES ONLY - THE -F SWITCH
By default, at the end of each line, FIN displays a fully qualified
pathname (such as "C:\UTIL\FIN.COM", etc). If the -F switch is
given, FIN will display just the filename part of the full pathname,
e.g. just "FIN.COM" for the above example.
2.9. LISTFILE FORMAT - THE -L SWITCH
The -L switch tells FIN to produce narrow, listfile-format output.
This is useful when generating a packing list, or marking up a file
list in an archive. The -L switch overrides the -A, -Cxx, -D, -H, -Sc,
and -Wn switches; these switches are ignored if the -L switch is
given.
The output line format using -L is as follows:
FILENAME.EXT 12345 941225 0654
All fields in the output line are aligned, regardless of the length of
the filename. The filename will be space padded to ensure that the '.'
always appears in the ninth column (except if there is no extension,
in which case no '.' will be shown). The next field is the file size,
which may be up to 8 characters wide. If the file is larger than
99,999,999 bytes, this field will show '########'. The next field is
the date, which is always in YYMMDD format, and the last field is the
time, always in HHMM format. A listfile in this format can be sorted
by filename, extension, size, or date/time, using a program such as
DOS's SORT.
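The layout rules above can be sketched in Python. The single space
between fields and the right-justified size are assumptions made for
this illustration; only the column of the '.', the maximum size width,
the '########' overflow marker, and the YYMMDD and HHMM formats are
taken from the description. The function name is hypothetical.

```python
def listfile_line(name, size, year, month, day, hour, minute):
    """Format one listfile-style line with the '.' in the ninth column."""
    base, dot, ext = name.upper().partition(".")
    name_field = (base.ljust(8) + dot + ext).ljust(12)
    size_field = "%8d" % size if size <= 99999999 else "#" * 8
    date_field = "%02d%02d%02d" % (year % 100, month, day)
    time_field = "%02d%02d" % (hour, minute)
    return " ".join((name_field, size_field, date_field, time_field))
```

Because every field starts in a fixed column, lines built this way can
be sorted on any field by giving a sort program the starting column.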
2.10. RECURSE SUBDIRECTORIES - THE -R SWITCH
The -R switch specifies that FIN should recursively examine all
subdirectories below the specified (or default) directory and match
files in those subdirectories too.
Note that FIN does not travel 'up' the directory tree; it only
recurses downwards, or 'deeper', by adding directories to the end
of the path.
If you want FIN to scan all directories on a particular drive, specify
the root directory as the starting point and use the -R option; FIN
will scan the root directory and recursively scan all subdirectories,
and all of their subdirectories, etc, until it has scanned the entire
drive.
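The downward-only recursion can be illustrated with a short Python
sketch. It is a stand-in for the idea, not FIN's code: the function
name find_files is hypothetical, and Python's fnmatch wildcards are
similar to, but not identical with, DOS wildcards (for example, '*.*'
here does not match a name that has no extension).

```python
import fnmatch
import os


def find_files(start_dir, mask="*"):
    """Return paths under start_dir (searched downwards only) matching mask."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(start_dir):
        for name in sorted(filenames):
            # Compare in upper case so matching is case-insensitive.
            if fnmatch.fnmatchcase(name.upper(), mask.upper()):
                matches.append(os.path.join(dirpath, name))
    return matches
```

Starting the search at a drive's root directory visits every directory
on that drive, which is the behaviour described above.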
This feature makes FIN useful as a WHEREIS program. See the usage
examples (later) for details.
2.11. PAUSE AFTER EACH PAGE - THE -P SWITCH
The -P switch tells FIN to pause after each page of text has been
displayed, and prompt for a keypress, much like the /P option to
the DOS DIR command.
You can press any key at the prompt, but there are two special cases.
If you press Ctrl-C or Ctrl-Break, the program is terminated. If you
press Enter, the program displays only one more line before prompting
you again. Any other key will cause the next page of text to be
displayed.
2.12. OUTPUT LINE FORMAT
The format of the output lines in all modes except -L mode is:
[CRC/Chksum] [Clust] [Size Date Time] [Attrs] [Words Lines] Pathname
Each of the optional items shown is displayed if the corresponding
option switch was specified. The switches that correspond to the
five optional items listed above are, in order:
-Cxx
-Sc
-D
-A
-Wn
An example line, produced by the command 'FIN -acrcds5w1 file.ext'
could be:
FEDCBA98 123 87654 12-25-94 12:34 rAsh 12345 2345 C:\FILE.EXT
The items in the above line correspond directly to the output line
format shown earlier. Note that I have removed some padding spaces
from the above line to make it fit into 80 columns.
Histograms, if requested, are dumped on a line-by-line basis after
the other information has been listed.
Note that the file name is always displayed at the far end of the
output line. By default the file name shown is fully qualified with
drive specifier, full path starting at the root directory, and file
name, but if the -F switch is given, the drive specifier and path are
not displayed.
2.13. ERRORLEVELS
FIN always reports an errorlevel when it terminates. This may be
useful when FIN is used within batch files. A message will also be
issued if an error occurs. The errorlevels that may be returned by
FIN are as follows:
0     Normal completion
17    Unable to open file (if -Cxx, -H, or -Wn specified)
20    Read error on file (if -Cxx, -H, or -Wn specified)
96    Unable to obtain parameters for drive (cluster calculations)
162   Insufficient memory (approx. 64K of memory is required)
227   Internal bug error (no error message is issued)
255   Incorrect usage syntax
3. USAGE EXAMPLES
3.1. USING THE CRC FACILITY
The CRC facility of FIN is useful for comparing sets of files from
different computers. For example after backing up a set of files from
one computer to another, you can generate a list of CRCs of all files
and just compare the CRCs, rather than comparing the contents, which
would be slower.
A list of file CRCs can be included in a README file when a set of
files is distributed, as an assurance against possible corruption
of the files (undetected corruption is unlikely in any case, as most
archivers already use a 16-bit or 32-bit CRC).
3.2. USING FIN FOR FILE MAINTENANCE
I have a computer at home and another at work, and often files get out
of date on one machine. The command FIN -DR C:\ redirected to a file
on a floppy creates a full list of all files on the hard disk, which
can then be compared using a text comparison utility with a similar
file for the other machine, to determine what files are different, and
which machine has the newer file.
The CRCs could also be included, by changing the -DR options to
-CRCDR, though this may make list generation take an unreasonably
long time.
A list made in this way is also useful to include with backups as a
quick viewable list of the files that were backed up.
3.3. USING FIN AS A WHEREIS PROGRAM
FIN is useful for performing the standard 'WHEREIS' function. I use
a batch file called WHEREIS.BAT which contains the following lines:
@ECHO OFF
IF NOT .%1 == . GOTO Cont
ECHO Usage: WHEREIS filespec [drive:]
GOTO Exit
:Cont
FIN -ADPR %2\%1
:Exit
The usage for this batch file is:
WHEREIS <filespec> [<drive>:]
For example:
WHEREIS *.bak
WHEREIS lostfile.txt
WHEREIS *.exe e:
----//----