PRODUCT : Borland Version Tracker          NUMBER : 1340
VERSION : 2.x
OS      : DOS
DATE    : October 25, 1993                 PAGE   : 1/3
TITLE   : How does BVT detect a binary file?


How does the Borland Version Tracker detect a binary file?
==========================================================

This technical information document describes in detail the
procedure by which the Borland Version Tracker determines the
binary-ness of a file.

Binary Files
============

The Borland Version Tracker treats binary files separately for one
main reason. To determine the differences between current and older
versions efficiently (i.e. so that they use the least amount of
space when stored), one must have some regular marker spread
throughout the file that is to be compared. When a mismatch is
found, this marker allows the comparison to resynchronize (yes, you
can do it on a character-by-character basis, but only if you have a
spare Cray 1 at your disposal). Since binary files don't have any
regular markers (i.e. there isn't a newline at the end of every
line), they are treated differently than ASCII files.

Now binary-ness is something we all know when we see it (like art).
You edit a file with your text editor and see a whole lot of weird
characters, so you say, "Hmmm, must be a binary file." OK, so how do
you tell your version control product what weird is? Here's the
algorithm that was written to determine this:

- There are no newline characters in the file at all, or the file
  size is bigger than 10 times the global line size or 4096
  characters (whichever is bigger) and less than 0.2% of the
  characters are newline characters.
- More than 1% of the characters in the file are '\0'.
- More than 10% of the characters in the file are meta-characters.

Bearing in mind the sequence of events that precedes the
reconstruction of a non-terminal version in a sequence, it only
makes sense for a module to be either binary or non-binary across
all of its versions.
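The criteria listed above can be sketched in a few lines of Python.
This is an illustration only, not BVT's actual code. It assumes that
the criteria are alternatives (any one of them marks the file as
binary), that the marker character counted in the first criterion is
the newline, and that "meta-characters" means bytes with the high
bit set. The function name looks_binary and the line_size parameter
(standing in for the "global line size", with a placeholder default)
are hypothetical.

```python
def looks_binary(data: bytes, line_size: int = 256) -> bool:
    """Return True if any of the three criteria marks `data` as binary.

    `line_size` stands in for BVT's "global line size"; the default of
    256 is a placeholder, not a value taken from the product.
    """
    size = len(data)
    newlines = data.count(b"\n")  # assumption: the marker is the newline

    # Criterion 1, first half: no newline characters at all.
    if newlines == 0:
        return True

    # Criterion 1, second half: a large file in which fewer than 0.2%
    # of the characters are newlines.
    if size > max(10 * line_size, 4096) and newlines < 0.002 * size:
        return True

    # Criterion 2: more than 1% of the characters are '\0'.
    if data.count(b"\0") > 0.01 * size:
        return True

    # Criterion 3: more than 10% of the characters are meta-characters
    # (assumption: bytes with the high bit set, i.e. values >= 0x80).
    meta = sum(1 for byte in data if byte >= 0x80)
    return meta > 0.10 * size
```

Under these assumptions, ordinary source text such as
b"hello\nworld\n" comes out as non-binary, while a buffer with no
newlines, or one with a few percent of NUL bytes, comes out as
binary.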
Therefore, binary-ness is determined only at initial check-in time.
Once the determination is made as to whether or not a file is
binary, it stays with the file forever (or at least as long as the
file is associated with version control).

It is still desirable to compare two binary versions to determine
whether the version needs to be changed at all (i.e. whether the
user is checking in the same version that they checked out). This is
done via a three-step process. If any of the checks fails, the
comparison is aborted and check-in proceeds with the new version.

First, the lengths of the two versions are compared. If they are
equal, step two follows.

Second, a checksum is calculated as the binary file is copied into
the storage file (the copy goes ahead on the assumption that the two
versions differ and it is OK to proceed with the check-in; if they
turn out to be the same, the check-in will be rolled back). The
checksum is compared with the stored checksum from the previous
version. If they match, we go on to step three.

The third step is an actual byte-by-byte compare of the two versions
to ensure that they are absolutely identical. If they are, the
check-in is rolled back.

In order to save as much space as possible (remember, each binary
version is stored complete), binary versions are stored compressed.
The compression algorithm is Huffman. A decode table precedes the
compressed data, and the difference descriptor has its length set to
negative to indicate that the data is compressed. Since the decode
table occupies 512 bytes of space and compression/decompression
requires significant effort, the decision was made not to compress a
version whose size is less than 2K. This only somewhat arbitrary
size was chosen so that the decision whether or not to compress can
be made efficiently.
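The three-step comparison and the compression cutoff described above
can be sketched as follows. This is an illustration under stated
assumptions, not BVT's implementation. The document does not name
the checksum algorithm, so CRC-32 from Python's zlib module is used
as a stand-in. The sketch computes the checksum directly rather than
while copying into a storage file, and it takes "2K" to mean 2048
bytes. The names is_unchanged, should_compress and
COMPRESSION_CUTOFF are hypothetical.

```python
import zlib

# Assumption: "2K" means 2048 bytes.
COMPRESSION_CUTOFF = 2 * 1024


def is_unchanged(new: bytes, old: bytes, old_checksum: int) -> bool:
    """Return True if `new` is identical to `old`, so that the check-in
    should be rolled back. Any failed check ends the comparison early
    and the check-in proceeds with the new version."""
    # Step 1: compare the lengths of the two versions.
    if len(new) != len(old):
        return False

    # Step 2: compare checksums. CRC-32 is a stand-in here; the
    # document does not say which checksum BVT uses.
    if zlib.crc32(new) != old_checksum:
        return False

    # Step 3: byte-by-byte compare to rule out a checksum collision.
    return new == old


def should_compress(version_size: int) -> bool:
    """Versions smaller than the cutoff are stored uncompressed, since
    the 512-byte decode table would eat most of the savings."""
    return version_size >= COMPRESSION_CUTOFF
```

The ordering goes from cheapest to most expensive check, so the
byte-by-byte compare runs only when the length and the checksum have
both already matched.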
DISCLAIMER: You have the right to use this technical information
subject to the terms of the No-Nonsense License Statement that you
received with the Borland product to which this information
pertains.