=====================================
               6x86opt
Cyrix/IBM/ST 6x86 Processor Optimizer
=====================================
v0.78  (c) Mikael Johansson 1997
7.10.1997

Unregistered ShareWare version!


*WHAT THIS PROGRAM DOES*
------------------------

This program optimizes the Cyrix/IBM/ST 6x86 (M1/M2) processor.


*HOW IT DOES IT*
----------------

By setting the appropriate configuration bits on the CPU to their
appropriate values. Details on which bits are set are (very
surprisingly) found in the section *WHICH BITS ARE SET*.

A special address region configuration for the Linear Frame Buffer
(LFB) can also be set up. With Write Gathering enabled for the LFB,
writes to it are sped up 2 to 8 times, which increases video
performance in LFB modes.


*INSTALLING THE PROGRAM*
------------------------

The processor is restored to its default state on system reset => the
changes 6x86opt makes to your CPU (the only thing it does) are lost on
boot. This means that 6x86opt cannot make any permanent changes to
your system, so if it doesn't work properly, just don't run it and your
computer will behave like before.

But as you probably will want to make 6x86opt (and/or CPUIDEN) a
permanent part of your system, executing it manually every time you
boot your computer can be tedious. So it is a good idea to execute the
program(s) from your AUTOEXEC.BAT file. This file is run every time you
start your computer, and is located in the root directory of your
C-drive ( C:\AUTOEXEC.BAT ).

The 6x86opt package now comes complete with an installation program,
called INSTALL. For instructions on using it, see the next topic.
Below is a description of how to install 6x86opt manually:

To edit your AUTOEXEC.BAT, open the file in your favourite text
editor. Then insert a line like this somewhere in the middle:

  C:\drivers\6x86\6x86opt.exe [-parameter1 ...]

You must edit the path of the program according to the directory where
the 6x86opt.exe file is located.
If you don't know where it is, run 6x86opt with the -verbose parameter
and it will tell you its location (among other things).

When saving the file, remember to save it in DOS-ASCII text format (or
similar). If you use DOS Edit or something similar, this is the only
choice you have.

NOTE 1: If you also use CPUIDEN, insert the corresponding line in your
        AUTOEXEC.BAT _after_ the (possible) 6x86opt line. Some systems
        need this execution order, don't ask why (as I don't know :-)

NOTE 2: You need to have a space between all parameters, so your
        CPUIDEN line for example could look like this:
          C:\drivers\6x86\CPUIDEN.EXE -set


*INSTALLING WITH INSTALL.EXE*
-----------------------------

The INSTALL program can move your files to a new directory, and install
6x86opt and/or CPUIDEN in your AUTOEXEC.BAT file. You can choose the
parameters for the two programs before installing. The parameters are
described later in this document. It is not necessary to study the
parameters in detail. Without parameters 6x86opt will perform a default
optimization. But at least take a look at the -linbuf parameter!

INSTALL can also be used to uninstall the calls from AUTOEXEC.BAT and
to change the parameters that 6x86opt/CPUIDEN are executed with. The
first thing INSTALL asks is which of the above three actions you want
to perform.

The installation with INSTALL.EXE is simple. Just run INSTALL from the
directory where all the 6x86opt package files are located, and follow
the on-screen prompts. A list of available keys is always present at
the bottom of the screen. However, read through the following notes on
installing first:

INSTALL works best if the AUTOEXEC.BAT file doesn't already contain any
calls to 6x86opt/CPUIDEN. It is recommended that you try to uninstall
possible 6x86opt and CPUIDEN calls with INSTALL first!

The original AUTOEXEC.BAT file is backed up before any changes are
made.
The backup is named AUTOEXEC.6XO, and is located in the root of C:\

You must have _all_ the files of the 6x86opt package present in the
same directory.

INSTALL is not good at analyzing AUTOEXEC.BAT files with multiple
boot-up configurations (or other goto statements). If you want INSTALL
to install into such a .BAT, you should install the files "At the
beginning".

Restrictions of INSTALL:

INSTALL cannot at this time handle AUTOEXEC.BAT files longer than 250
lines, or lines longer than 255 characters. If you have multi-config or
a .BAT file longer than 250 lines, you most likely are experienced
enough to install the package without the help of INSTALL :-) If not,
contact me and inspire me to improve INSTALL in these areas.


*SKIPPING EXECUTION IN AUTOEXEC.BAT*
------------------------------------

If you have 6x86opt/CPUIDEN installed in your AUTOEXEC.BAT file, but
don't want them to fulfil their task, hold down the key when the
programs are about to execute. This will prevent them from doing
anything but telling you you have pressed down!


*COMMAND LINE PARAMETERS*
-------------------------

6x86opt can be invoked without any parameters for default optimization.
You can also define sixteen different command line parameters:

-config (-c) ; Execute the configuration file 6x86opt.cfg instead of
               the default optimization.
-x           ; Set bit 1 in DBR0. See the topic *REDUCED PERFORMANCE*
-FLOP   (-F) ; Fast Loop. Will unset the SLOP bit (CCR5:1).
-susphlt(-s) ; Set the SUSP_HLT bit (CCR2:3).
-SADS   (-S) ; Set the SADS bit (CCR2:1). (Used to be "do not unset")
-MMXPLUS(-M) ; Set the MMXPLUS bit (CCR7:0). M2 only. Don't know what
               it does, sounds promising though :-) Experiences with
               this are classified as "interesting feedback".
-linbuf (-l) ; Searches for a Linear Frame Buffer and tries to define
               an ARR/RCR for it allowing Write Gathering.
-manbuf (-m) ; Same as above, but the LFB address and size must be
               given like -manbuf:ADDR,SIZE. ADDR and SIZE in MB.
-killbuf(-k) ; Clears the ARR and RCR that have previously been set up
               for the LFB. You can either let 6x86opt find the LFB
               itself, or tell it the address or ARR number manually:
               -killbuf or -killbuf:ADDR or -killbuf:ARR
-justbuf(-j) ; When defined, only LFB related parameters are checked,
               and no optimization is done. Doesn't do anything by
               itself.
-ARR0+1 (-A) ; Set up ARR/RCR 0 and 1 to recommended values, i.e. video
               buffer and device/system ROM space.
-nowb   (-n) ; Do not enable caching and WriteBack. Note that this
               parameter doesn't disable these, just leaves them be.
-defbtb (-d) ; Do not change the BTB configuration.
-verbose(-v) ; What's going on. Shows what the program does during
               execution.
-peek   (-p) ; Does not change anything, just shows the bit states,
               and the LFB and Video Memory size.
-force  (-f) ; Forces execution of the program even if 6x86opt does
               not detect a 6x86 CPU. Can cause undefined behaviour.

The parameters in parentheses are abbreviations for the parameters.
If an unknown parameter is defined on the command line, 6x86opt will
show some info about itself.


*THE LINEAR FRAME BUFFER AND THE -linbuf AND -manbuf PARAMETERS*
----------------------------------------------------------------

As the Linear Frame Buffer defined by the VESA 2.0 standard is located
outside the area of physical memory, the 6x86 by default handles all
memory accesses to this area at the poorest performance. To increase
the performance of the LFB, an ARR and a corresponding RCR can be set
up for it, allowing Write Gathering (WG). When WG is enabled, multiple
writes to sequential addresses are gathered and issued in one write
cycle. The buffer is 64 bits wide, so for example 4 word writes are
written in one cycle instead of four. Of course only applications that
use the LFB, like Quake, get a performance increase from this.

When the -linbuf parameter is defined, 6x86opt will automatically
search for the LFB and the size of your video card memory, which also
is the size of the LFB.
It will then search the ARRs to see if an ARR is already set at this
address. If not, it will try to find an empty (zero-size) ARR. Once it
has decided which ARR to program, it programs it. The RCR will be set
with the bits WG and RCD (Region Cache Disable).

If 6x86opt has problems setting up an ARR, a warning message will be
shown. The optimization process will continue, but the exit code will
be 2.

If 6x86opt fails to detect your LFB or video memory correctly, or if
you for example want to define a region before loading a VESA 2.0
driver, you can manually define it with the -manbuf parameter. The
format is:

  -manbuf:ADDR,SIZE

where ADDR and SIZE must be given in megabytes (if your video card has
less than 1 MB of memory, you still must give this as a minimum size).
So if your LFB is located at 3584 MB and you have a video card with
2 MB of RAM, the -manbuf declaration would be: -manbuf:3584,2
(or -m:3584,2).

If you get a warning message complaining that the Address is not a
multiple of the Block Size, you should reconfigure the LFB Address so
that it is. This is a requirement of the 6x86 CPU.

If -linbuf fails in detecting the environment correctly, tell me about
it.

NOTE 1: If you have several -manbuf parameters on the command line,
        only the _last_ will be processed.

NOTE 2: -linbuf is executed before -manbuf, so if -manbuf defines the
        same address, but different size, the -manbuf definition will
        prevail.

NOTE 3: Only enough checking of the ADDR and SIZE definitions of
        -manbuf is done to make sure that 6x86opt will not (should
        not :) crash. Be careful in defining the right values for
        these.


*REMOVING THE LINEAR FRAME BUFFER SPEED UP WITH THE -killbuf PARAMETER*
-----------------------------------------------------------------------

You might want to remove the LFB speed up settings, if you for example
remove your VESA 2.0 driver, or are going to use Windows 3 (Sometimes
problems occur with this combination!). -killbuf is here to help you.
Invoke 6x86opt with the -killbuf parameter to reset the ARR/RCR that
was set up for LFB space. You have 2 main alternatives:

1) Use the automatic -killbuf. To do this, you must still have the LFB
   enabled. 6x86opt will detect it, and clear the ARR/RCR. Just write
   -killbuf (or -k) on the command line.

2) Use the manual -killbuf. If you have already disabled the LFB, or if
   6x86opt just won't detect it correctly, you can either specify the
   Starting Address of the LFB, or which ARR/RCR was set up for the
   LFB. Use the parameter format -killbuf:ADDR (ADDR in MB) or
   -killbuf:ARR. 6x86opt will interpret a value after the ':' as the
   ARR if the value is below 7, otherwise as the Starting Address.

If some problems are encountered, you will see a WARNING! message
materializing on your screen.


*THE -ARR0+1 PARAMETER*
-----------------------

When the -ARR0+1 (or -A) parameter is defined on the command line
6x86opt will set up these registers to the recommended values, so that
ARR/RCR0 defines the VGA video buffer, and ARR/RCR1 defines
device/system ROM space. The reason why this is not set by default is
that some BIOSes might already have set up these regions, but at other
addresses, and now use ARR0 and ARR1 for other purposes. I recommend
that you try this switch and see if you can detect a speed increase.
For the exact definition of how these are set, see 6X86SET.TXT.


*USING THE 6X86OPT.CFG CONFIGURATION FILE WITH -config*
-------------------------------------------------------

Instead of the default optimization, you can let 6x86opt execute a
config file generated by 6x86set. Just define the -config (or -c)
parameter at the command line. The config file holds information on how
to set all the bits on the system that 6x86set can handle. Note that
the config file must be named 6x86opt.cfg and be in the same directory
as 6x86opt.

If 6x86opt cannot find the configuration file, or detects that the
file is damaged, it will show a Warning! message.
A damaged file will not be processed to any extent.

For more info on creating a config file, use 6x86set's on-line help
(invoked by pressing in the program). Also see the file 6X86SET.TXT.


*REDUCED PERFORMANCE AND THE -x PARAMETER*
------------------------------------------

If running 6x86opt gives a decrease in performance you should give the
-x command line parameter to 6x86opt when executing it. When defined,
bit 1 of DBR0 will be set, and a performance increase should result.

Some systems need to have this parameter defined. Other systems can
also benefit from setting this bit. But as the documentation for the
bit (along with other whole registers) is only available to specific
OEM partners of Cyrix and IBM, I do not (yet) know exactly what it
does. Therefore it is not included in the default optimization process.
Feedback on experiences with this is very welcome!


*WINBENCH PROCESSOR SCORES*
---------------------------

The processor suites in WinBench give different results every test
run, and the second run is usually poorer than the first. To get
reliable results, run these (and other) test suites right after booting
the system with either the unoptimized or the optimized configuration.
You will find that the processor tests are not affected noticeably, as
these tests apparently have no use for the optimized settings. Graphics
scores should increase. So, the performance increase is dependent on
the application.

The performance increase is also dependent on how the BIOS sets the
bits on boot. If the BIOS "sets'em all" itself there can of course be
no improvement. I have not heard of any such systems. BIOSes that do
not support the 6x86 very well will see the highest improvement. In no
case should 6x86opt decrease overall performance (see previous topic).

And to my knowledge there is no way that the speed up of the LFB would
have the opposite effect. So if you for some reason only want to use
this, you can define the -justbuf parameter on the command line.
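
The Write Gathering arithmetic described in the topic *THE LINEAR FRAME
BUFFER AND THE -linbuf AND -manbuf PARAMETERS* can be illustrated with
a small Python sketch. This is not code from 6x86opt and it does not
touch any hardware; it only counts write cycles under a simplified
model. The function name write_cycles and its parameters are made up
for the illustration, and the model assumes that writes are gathered
only while they are exactly sequential and still fit in the 64-bit
(8 byte) buffer.

```python
def write_cycles(writes, gather=True, buffer_bytes=8):
    """Count write cycles for a list of (address, size_in_bytes) writes.

    Simplified model (an assumption, not the documented CPU behaviour):
    a write is gathered into the current buffer only if it starts exactly
    where the previous one ended and the gathered run still fits into
    buffer_bytes. Otherwise a new write cycle begins.
    """
    if not gather:
        # Without Write Gathering every write is its own cycle.
        return len(writes)

    cycles = 0
    start = end = None  # byte range currently held in the gather buffer
    for addr, size in writes:
        fits = (start is not None
                and addr == end
                and addr + size - start <= buffer_bytes)
        if fits:
            end = addr + size            # extend the gathered run
        else:
            cycles += 1                  # a new gathered write begins
            start, end = addr, addr + size
    return cycles


# Four sequential word (2 byte) writes, as in the example in the text.
words = [(0, 2), (2, 2), (4, 2), (6, 2)]
# Eight sequential byte writes and two sequential dword writes.
byte_writes = [(a, 1) for a in range(8)]
dwords = [(0, 4), (4, 4)]

for name, seq in (("words", words), ("bytes", byte_writes), ("dwords", dwords)):
    print(name, write_cycles(seq, gather=False), "->", write_cycles(seq))
```

Under this model the four word writes collapse from four cycles to one,
byte writes gain a factor of 8 and dword writes a factor of 2, which is
in line with the "2 to 8 times" range mentioned in *HOW IT DOES IT*.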


*CPUIDEN.EXE*
-------------

This program toggles/sets/unsets the state of the CPUIDEN bit in
CCR4:7. When set, the CPUID instruction can be executed. Programs that
understand this can get additional information about the processor
through this instruction. Some programs misinterpret this information,
and it might be better if these programs are left without the
additional info.

Setting the bit causes for example Win95 (before OSR2 == Win95b) to
identify the CPU as a Pentium (which is not necessarily a bad thing).
For this, CPUIDEN.EXE must be run from the AUTOEXEC.BAT file.

You could encounter problems by enabling the CPUID instruction, as some
badly written software could get tempted to execute some Pentium
specific instructions or functions if it misidentifies the 6x86.

When invoked without parameters CPUIDEN toggles the state of the bit.
The following command line parameters are recognized:

-set    (-s) ; Sets the bit (enables CPUID instruction)
-unset  (-u) ; Unsets the bit
-force  (-f) ; Forces execution even when CPUIDEN does not detect a
               6x86 CPU.


*SHAREWARE*
-----------

As of v0.76, the 6x86opt package is ShareWare. By now I have improved
6x86opt way beyond my original need to have two bits on my CPU set. So
if you like the package (or some parts of it), you should register!
For detailed information on this, see the file REGISTER.NOW

You can distribute the ShareWare package as long as all files are
included and unmodified. You may not charge more than a nominal fee for
distributing the package.

The files included in this package are:

6X86OPT.EXE     43712 bytes   6x86opt, the optimizer
6X86OPT.TXT     31955 bytes   This textfile
6X86SET.EXE    100512 bytes   6x86set, the setup utility
6X86SET.TXT      5383 bytes   Info on 6x86set
CPUIDEN.EXE     10448 bytes   CPUID instruction enabler/disabler/toggler
INSTALL.EXE     66400 bytes   Installer
REGISTER.NOW     4755 bytes   Information on registering the software


*CONTACTING THE AUTHOR*
-----------------------

If you encounter any bugs/misreports in the package, have suggestions,
questions, or especially if you know of any other optimization tricks
for the 6x86, I would like to hear from you, I like e-mail!

All info about the undocumented registers is greatly appreciated
(performance affecting or not)! If you know something about them that I
don't and tell me, you gain a Special Registered 6x86opt package :-O

I answer all my e-mail. If you don't receive an answer, mail me again
after checking the "reply-to" field in your mailer. About 1/30 of my
replies bounce back to me due to "user unknown" and other delivery
problems. And please don't e-mail me with any HTML-mail, it looks
horrible in a mail-reader that doesn't know what it's dealing with :-)

e-mail: mpjohans@kumpu.helsinki.fi
    or: Mikael.Johansson@helsinki.fi

Postman Pat employing mail:

  Mikael Johansson
  Kitarakuja 3C 220
  00420 Helsinki
  FINLAND


*WHICH BITS ARE SET*
--------------------

The following bits are set:

bit(s)  | reg:bit | why
------------------------------------------------------------------------
NO_LOCK | CCR1:4  | "With NO_LOCK set, previously noncacheable locked
        |         | cycles are executed as unlocked cycles and,
        |         | therefore, may be cached. This results in higher
        |         | CPU performance."
        |         | The reason the bit is not set as default is because
        |         | software that requires locked cycles might exist. I
        |         | have never had any problems with this.
------------------------------------------------------------------------
WL      | RCR7:2  | This one is related to the one above. It enables
        |         | weak locking for the memory region specified by
        |         | ARR7, i.e. all memory. I am not sure if this is
        |         | necessary, as my benchmark results vary. It does no
        |         | harm however.
------------------------------------------------------------------------
DTE_EN  | CCR4:4  | "DTE_EN allows Directory Table Entries (DTE) to be
        |         | cached on the 6x86 microprocessor. This provides a
        |         | performance improvement for some applications that
        |         | access and modify the page table frequently."
------------------------------------------------------------------------
WT_ALLOC| CCR5:0  | "Write Allocate (WT_ALLOC) allows L1 cache write
        |         | misses to cause a cache line allocation. This
        |         | feature improves the L1 cache hit rate resulting
        |         | in higher performance especially for Windows
        |         | applications."
------------------------------------------------------------------------
IORT2-0 | CCR4:2-0| I/O Recovery Time. Defines the minimum time between
        |         | I/O port accesses. Set to 111 == no delay.
------------------------------------------------------------------------
SUSP_HLT| CCR2:3  | If this bit is set, the HLT instruction causes the
        |         | CPU to enter low power suspend mode. This works OK
        |         | at least running Linux, doesn't affect DOS. Maybe
        |         | also OS/2 can make use of it. Doesn't affect
        |         | performance. Only set when -susphlt is defined!
------------------------------------------------------------------------

All quotations above are taken from the "IBM 6x86 Microprocessor BIOS
Writer's Guide", Document #40205.

The following bits are unset:

bit | reg:bit | why
------------------------------------------------------------------------
CD  | CR0:30  | When 'Cache Disable' is set, the cache is disabled(!)
    |         | Naturally the cache should be enabled. I do not believe
    |         | that this bit is ever set, but you never know.
------------------------------------------------------------------------
NW  | CR0:29  | When 'No Write Back' is set, the L1 cache operates in
    |         | Write Through mode. By unsetting the bit the cache
    |         | strategy will be set to Write Back.
------------------------------------------------------------------------

The above bits are not affected if the -nowb parameter is defined.
Changing them will also not work when 6x86opt is executed inside
Windows.

bit  | reg:bit | why
------------------------------------------------------------------------
SLOP | CCR5:1  | Unset to disable the slowdown of the LOOP instruction.
     |         | Only unset if the -FLOP parameter is defined.
------------------------------------------------------------------------
SADS | CCR2:1  | Slow ADS, when set the CPU inserts an idle cycle after
     |         | sampling of BRDY# and prior to asserting ADS# => will
     |         | be unset. If the -SADS parameter is defined, this bit
     |         | will be set instead.
------------------------------------------------------------------------

The following bits are also modified:

bit(s) | reg:bit | what and why
------------------------------------------------------------------------
MAPEN  | CCR3:7-4| Set to 0001 during execution to get access to all
       |         | register indexes. Set to 0000 after optimization.
------------------------------------------------------------------------
LOCK_NW| CCR2:2  | Unset so that the NW bit in CR0 can be modified.
       |         | After optimization its value is restored.
------------------------------------------------------------------------
ARREN  | CCR5:5  | Unset during execution so that the ARR/RCRs can be
       |         | safely programmed. If the -ARR0+1 parameter is
       |         | defined, or a successful LFB ARR/RCR is set up, ARREN
       |         | will be set before exit. Otherwise it will be
       |         | restored.
------------------------------------------------------------------------

6x86opt enables the Branch Target Buffer (BTB), configures it to store
target addresses for both near and far Change-Of-Flow instructions
(COFs), and enables its return stack.
See below:

The following undocumented bits are modified:

bit(s)  | reg:bit | what and comment
------------------------------------------------------------------------
MMXPLUS | CCR7:0  | This bit is set only if the -MMXPLUS parameter is
        |         | defined, and an M2 is detected. What it does I don't
        |         | know. Sounds nice anyway.
------------------------------------------------------------------------
unknown | DBR0:1  | Will be set if the -x parameter is defined.
------------------------------------------------------------------------
D_FWDBYP| DBR0:5  | Data Forwarding/Bypassing enable
------------------------------------------------------------------------
BTBTR_EN| DBR0:6  | Set to 1 to get access to Test Registers below TR3.
        |         | Unset after optimization.
------------------------------------------------------------------------
unknown | TR2i5:4 | If set, reduces performance => will be unset.
BTB_DIS | TR2i5:3 | BTB Disable. Will be unset.
RSTK_DIS| TR2i5:2 | BTB Return Stack Disable. Will be unset.
FARH_DIS| TR2i5:1 | BTB Far Hits Disable. Will be unset.
BTB_DIS | TR2i5:0 | BTB Disable. Will be unset.
------------------------------------------------------------------------

If the -defbtb parameter is defined, no BTB related registers are
affected.

When the -linbuf or -manbuf parameters are defined, 6x86opt tries to
set up an ARR and RCR for the Linear Frame Buffer, allowing Write
Gathering for this memory area. See the topic *LINEAR FRAME BUFFER*


*FURTHER OPTIMIZATION*
----------------------

It is possible that your system could be sped up even more by toggling
the right bits to the right states. Mess around with your 6x86 with
6x86set. But remember that your system might crash by setting the
"wrong" bits!

You might also want to check some bits that, when in the "wrong" state,
decrease the performance of your CPU, but that 6x86opt doesn't correct.
One such bit is SLOP (CCR5:1), which should be 0, but could have been
set by your system for some reason.
6x86opt doesn't unset it as default, as there could be a very good
reason for it to be set. You must define the -FLOP parameter for it to
be unset. See what happens.

Also, why not check the ARRs and RCRs with 6x86set.

Another thing that an interested person might want to check is whether
disabling SMM would affect performance (if it does, contact me!).

If anyone gets a performance gain from enabling caching/WG/WWO/... on
certain memory areas, drop me a line.

And then, just go through the registers with 6x86set, and search for
bits with comments like "reduces performance if set" and check them.
For more information, see the file 6X86SET.TXT!

And of course, I'm always interested in a setup that outperforms the
default one!


*EXIT CODES*
------------

The following exit codes can be generated:

0 : 6x86opt (at least thinks it) encountered no problems.
1 : 6x86opt was invoked with some unknown (or misdefined) parameter,
    or was pressed at init.
2 : 6x86opt issued a warning of some kind.
3 : 6x86opt detected a load error while processing the config file.
4 : 6x86opt did not detect a 6x86 CPU.

Any other codes _should_ never be generated.


*VERSION HISTORY*
-----------------

v0.64 : First released version. (5.11.1996)
v0.64b: Some comments added to the document. (22.11.1996)
v0.72 : Command line parameters added. The program now unsets the CD
        and NW bits in CR0. The document was clarified. (3.12.1996)
v0.73 : Added the -x and -verbose parameters. Some undocumented bits
        are now modified. Better code. (18.12.1996)
v0.74 : Added the -linbuf, -manbuf and -peek parameters. (15.1.1997)
v0.76 : Added the 6x86set utility. Added the -config, -killbuf,
        -justbuf and -susphlt parameters. Now can set up ARR0 and ARR1
        properly with the -ARR0+1 param. Does not set SUSP_HLT as a
        default anymore. Use -susphlt. Now sets IORT and TR2i5:4 OK.
        The exit codes got revised. Changed the distribution format
        from PD to SW. (15.3.1997)
v0.77 : Corrected a bug that set SLOP instead of IORT.
        (I knew it was a bad idea to have 13 different parameters...)
        Added the -FLOP parameter. Some typos corrected. (18.3.1997)
v0.77b: 6x86opt now unsets the SADS bit as default. Added the -SADS
        parameter to leave this bit unchanged. (5.5.1997)
v0.78 : Added preliminary M2 (6x86MX) support. INSTALL program included
        in the package. Added the -MMXPLUS parameter. The -SADS
        parameter now _sets_ the SADS bit instead of just leaving it
        be. FARH_DIS and DTE_EN bits not touched on M1 rev.<2.2
        Holding the key down as 6x86opt/CPUIDEN starts will now halt
        execution.


*WHY THIS PROGRAM EXISTS*
-------------------------

Because I could find no good optimizer for the processor anywhere. The
only one I found was IBM's M1OPT. This program however changes every
bit in the CCRs and everywhere else according to some model machine of
their own. For example all Power Management features are disabled even
if they were set before. This program also does not set NO_LOCK.


*TO BE DONE*
------------

Windows NT support: On hold for now, as I have not been able to get
  hold of the NT DDK yet :-/ However, Olivier Gilloire has written a
  nice tool for all NT owners, 6x86cfg :-) You can get it for example
  from Bryan Davis's very good 6x86 page at
  http://www.alternativecpu.com

6x86set: Support for M2. As soon as I get an M2 (Don't hold your breath
  for this one, I'm in no need (or position) to upgrade yet!)

INSTALL: Better .BAT analysis, better file error handling, better...

Some other things that I couldn't let delay v0.78 further.

Anything else that sounds good.


*SPECIAL THANKS*
----------------

Rich "Doc" Colley. For testing that helped deliver the -x parameter.
Darvin M. For a beer glass and other. ;-)
Barry Shilliday. For finding the SLOP bug so soon.
All helpful people that have beta tested my 6x86 related software.
Everyone who has sent me mail that helped improve 6x86opt.


*THE ULTIMATE HEAVY METAL BAND*
-------------------------------

Mercyful Fate


*THINGS*
--------

You use all files in this package entirely at your own risk.
Cyrix and IBM and everyone else have copyright on what they have.
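

As a closing illustration, the argument rules given in the topics on
the -manbuf and -killbuf parameters can be written down as a short
Python sketch. It is not taken from 6x86opt: the function names
parse_manbuf and parse_killbuf are hypothetical, raising ValueError
stands in for the warning messages the program prints, and it is an
assumption that the abbreviated forms (-m, -k) accept the same ':'
suffix as the long forms.

```python
def parse_manbuf(arg):
    """Parse '-manbuf:ADDR,SIZE' (or '-m:ADDR,SIZE'), values in MB.

    Returns (addr_mb, size_mb). The checks mirror the rules in the text:
    SIZE is at least 1 MB and ADDR must be a multiple of SIZE.
    """
    name, _, values = arg.partition(":")
    if name not in ("-manbuf", "-m") or "," not in values:
        raise ValueError("expected -manbuf:ADDR,SIZE")
    addr_mb, size_mb = (int(v) for v in values.split(","))
    if size_mb < 1:
        raise ValueError("SIZE must be at least 1 MB")
    if addr_mb % size_mb != 0:
        raise ValueError("Address is not a multiple of the Block Size")
    return addr_mb, size_mb


def parse_killbuf(arg):
    """Parse '-killbuf', '-killbuf:ADDR' or '-killbuf:ARR'.

    Returns ('auto', None), ('arr', n) or ('addr_mb', n). A value below 7
    is read as an ARR number, anything else as the LFB start address.
    """
    name, _, value = arg.partition(":")
    if name not in ("-killbuf", "-k"):
        raise ValueError("expected -killbuf[:ADDR|ARR]")
    if value == "":
        return ("auto", None)        # let the program find the LFB itself
    n = int(value)
    return ("arr", n) if n < 7 else ("addr_mb", n)


print(parse_manbuf("-manbuf:3584,2"))
print(parse_killbuf("-killbuf:3"))
print(parse_killbuf("-killbuf:3584"))
```

With the example from the text, -manbuf:3584,2 passes the check because
3584 is a multiple of 2, while an address such as 3585 with the same
size would be rejected.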