MERGING more than two files

One PCBoard sysop discovered that his PCBFILER "lost" descriptions when trying to sort huge files (around 6 megs or so), such as you would get if you were creating a master file list for a large BBS. MERGE can avoid this problem by merging the individual sorted directories. It would also be quicker, since sorting a huge file takes much longer than merging already sorted smaller files.

The unregistered MERGE can only merge two files at a time, so they would have to be merged sequentially. To merge four files (any duplicates will also be removed; the first occurrence, which is always from the first named list, is retained), a batch file might take the form:

    MERGE list1 list2 bigout
    MERGE list3 bigout
    MERGE list4 bigout

REMOVING "cross file" duplicates

Besides merging two filelists, MERGE can locate a file description that is duplicated in another directory. Sysops could use this feature to remove any duplicated descriptions from all of their filelists. One way to do it would be to write a batch file comparing each directory to another in turn. With a lot of directories, the batch file could be long. The registered version of MERGE, when it's released, will be able to do multiple filelists by reading an ASCII text file that contains the name of every directory. For now, if you want to compare four file lists, write something like:

    MERGE list1 list2 /1

    MERGE list1 list3 /1
    MERGE list2 list3 /1

    MERGE list1 list4 /1
    MERGE list2 list4 /1
    MERGE list3 list4 /1

(The grouping shows that the compares should be increased by one each time; this method can reduce the overall time, since a list could have been shrunk in size when it starts making its series of compares.)

Both of these examples assume the lists are sorted in file name ascending order (the file name is the first field).
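Since the pairwise batch file grows quickly with the number of directories, the compare lines could be generated rather than typed by hand. What follows is only a minimal sketch in Python, which is not part of MERGE; the function name compare_commands and the list names are made up for illustration. It prints the commands in the same grouped order as above and uses nothing beyond the "MERGE first second /1" form already shown.

```python
def compare_commands(lists):
    """Return MERGE compare lines, grouped as in the four-list example.

    `lists` is a sequence of filelist names (hypothetical names here).
    For each list after the first, every earlier list is compared
    against it, so each group is one compare longer than the last.
    """
    commands = []
    for j in range(1, len(lists)):
        for i in range(j):
            # The earlier list sits in the first position and keeps all
            # of its descriptions; "/1" is the compare switch from the text.
            commands.append(f"MERGE {lists[i]} {lists[j]} /1")
    return commands


if __name__ == "__main__":
    for line in compare_commands(["list1", "list2", "list3", "list4"]):
        print(line)
```

Redirecting the output to a file gives a batch file that can be run as usual. For n lists it holds n*(n-1)/2 compares, which is six for the four lists above.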
For compares, the list in the first position will keep all its descriptions, but any names in the second list that are the same as in the first will be removed. List4 might be something like your main upload or "catch all" directory, which would get all of its duplicates removed first, since the other directories would be organized by the type of files.

Running MERGE on a single file list will also remove all duplicates contained in that list, if any exist (the list MUST be sorted on the file name; the first occurrence is retained and any subsequent ones are discarded).

During unattended batch processing, you would probably want to run MERGE with the "/E" switch; otherwise, if an error is encountered in one of the files, MERGE will halt. Two other possible options are "/L" for a log file of all activities (also use "/VL" for a record of duplicate and/or error locations) and "/T" for a "trash" file of all removed duplicates (you can examine these later in case you want to ensure that a duplicate was removed from the directory you had intended). Using "/EA" with "/T" will also strip any descriptions with errors to the trash file, which you would then be able to correct for reinsertion. If the "output" directory list exists, add "/O" for no prompt before an overwrite.

REMOVING "cross file" duplicates: a different approach

There's an alternate method (there are probably many other ways) of deleting any duplicate descriptions from all file directories that won't require a batch file as long as the one above may become. This is perhaps less automated, but it may give more control over which directory retains a duplicated description. Depending on how you keep your lists organized, several steps below might be omitted:

1) Sort each list in date descending order, then name ascending order (the most recent date of any duplicate will occur first).

2) Merge each list on itself (removes any duplicate from that list).

    MERGE list1
    MERGE list2
    ...

3) Compare each list in turn to the upload directory (gets rid of any duplicates that are already categorized).

    MERGE list1 uploads /1
    MERGE list2 uploads /1
    ...

4) Merge all the lists together (you can omit "uploads"; it's done) using the trash option with "+K+N". The trash file - DUPENAME - is written without any headers or footers ("+N") and gets only the file names (the key, "+K") of all duplicates.

    MERGE list1 list2 master /tdupename+k+n
    MERGE list3 master /tdupename+k+n
    ...

5) Sort DUPENAME in name ascending order.

6) Merge DUPENAME on itself (the same description may have been in more than one directory, so DUPENAME could have duplicates).

    MERGE dupename

7) Compare the duplicate names to each list (you can omit "uploads" again). Use the trash option again without "+K" (whether you use "+N" depends on your preference for the last step).

    MERGE dupename list1 /tstripped.dup
    MERGE dupename list2 /tstripped.dup
    ...

8) The resulting lists will have all occurrences of duplicated descriptions removed (they will be in STRIPPED.DUP). If "+N" was used above, you can just append the stripped duplicates to your uploads filelist. If you don't want to sort the entire uploads directory afterwards (and merge it on itself), sort and run MERGE on the stripped duplicates before appending them. If you didn't use "+N", the directory each duplicate came from is in STRIPPED.DUP along with the descriptions. You can use an editor to go through this list and move each description to the desired directory (if some descriptions were in several different directories, you can decide which is the most appropriate).

Since DUPENAME contains the names of all duplicated descriptions, you may be able to use it to check that you only have one occurrence of that file on your hard drive or between your hard drive and CD-ROMs.

CREATING DIR files for a CD-ROM

A CD-ROM was found to have only one-line file descriptions.
You can create your own DIR lists using MERGE, a DIZ extractor, and CHK4DES (see PRODUCTS.DOC). Run the DIZ extractor on each file directory of the CD-ROM to create file lists from the FILE_ID.DIZ's. Depending on the DIZ extractor, the file names without DIZ's will either have some phrase stating no description was found or an empty description area. CHK4DES can be used to automatically remove all the names without descriptions from the generated lists. To ensure there is at least a one-line description for each file, merge each DIR list created by the extractor with the corresponding one-line description list provided on the CD-ROM, placing the DIZ list first.

    MERGE list1.diz list1.cdr list1.dir
    MERGE list2.diz list2.cdr list2.dir
    ...

You can check the "examples" selection in the "help" system for more ideas of MERGE's versatility, and I'd appreciate hearing about any new uses you may discover.