NOVELL TECHNICAL INFORMATION DOCUMENT ()

DSREPAIR.DOC for DSREPAIR.NLM version 2.23e for NetWare 4.0x Servers

The DSREPAIR.NLM file version 2.23e for NetWare 4.0x servers includes an enhancement that allows for editing replica rings and printing information about the local database (DIB). This document provides you with procedures and examples for determining when and how to use these enhancements.

This document contains important information that is not included in the NetWare 4.0x manual set or online Help or in the NetWare 4.0x release notes.

The following table of contents is provided for this document:

I.   Updating the DSREPAIR.NLM File to Version 2.23e
II.  Using the DSREPAIR Utility
     A. Using the Select Options
     B. Beginning the Repair Process
     C. Saving Replica Ring Information
     D. Checking Server Addresses
     E. Checking IDs In a Remote ID Table
     F. Checking Remote IDs In Replica Rings
     G. Moving NetWare 4.0x Servers to the Correct Tree
     H. Verifying Backlinks and External References
III. Using the DSREPAIR Utility to Edit the Replica Rings
     A. Using the Ring Repair Menu
        1. Changing the Replica Type of a Server Replica to the MASTER
        2. Editing the Replica Ring for a Partition
     B. Example of Repair Method

================================================================

I. Updating the DSREPAIR.NLM File to Version 2.23e

The DSREPAIR.NLM file version 2.23e is included in this package. This utility is supplied in an English-only version. You should copy the DSREPAIR.NLM file to all NetWare 4.0x servers that will run the DSREPAIR utility.

Prerequisites

Access to the server console or Supervisory rights to the SYS:SYSTEM directory.

Procedure

1. COPY the DSREPAIR.NLM file to the SYS:SYSTEM directory on every NetWare 4.0x server that you want to run the utility.

II. Using the DSREPAIR Utility

The DSREPAIR console command utility is provided with NetWare 4.0x software to repair problems with NetWare Directory Services (NDS) on a single-server basis.
It does not correct problems on other servers from a single, centralized location. It must be run on each server on which you want to correct Directory database errors.

IMPORTANT: DSREPAIR affects only the parts of the database stored on the server where you run it. To fix the entire database, you must run the utility on each server that contains a part of the database.

This utility checks and repairs the partitions and replicas stored on the server where you run it. The utility also checks and repairs NDS records, schema, bindery objects, and external references.

Some NDS database problems are not fatal, and NDS continues to operate. But if the database becomes corrupted, you get a message on the console that the server could not open the local database. In this case, run DSREPAIR or reinstall the NDS database to fix the problem so that the database can be opened.

Procedure

1. At the server console prompt, run DSREPAIR by typing

    dsrepair <Enter>

The utility locks the database and displays the following menu options:

1) Select Options
   View current DSREPAIR settings and select options from the following table. (When an option is on, an asterisk appears at the left of the option.)

2) Begin Repair
   Starts the repair process.

3) Save Replica Ring Information
   Allows you to save the replica ring information that you changed in the replica ring editor.

4) Check Server Addresses
   Checks the network address of every server known to this server against the address from SAP and verifies that the network address is correct.

5) Check IDs In Remote ID Table
   Checks the remote server ID table and repairs it.

6) Check Remote IDs In Replica Rings
   Contacts each server in each partition root's replica ring for every partition root on this server.

7) Move This Server To The Correct Tree
   Assists NetWare 4.01 and NetWare 4.02 servers in identifying the correct tree they reside in after a tree merge or tree rename operation using the DSMERGE.NLM.
8) Verify Backlinks and External References
   Prepares a NetWare 4.01 or NetWare 4.02 server for upgrading to NetWare 4.1.

A. Using the Select Options

The Select Options menu allows you to configure DSREPAIR for your particular environment and repair needs. Each option has an active and an inactive function that you can select. You toggle between the active and inactive function for each option by typing the number that corresponds with the option. An asterisk (*) is placed at the left side of each option when it is activated. The following table lists and defines the available options:

Option 1) Pause after each error/ Do not pause after each error

Causes DSREPAIR to pause after each error.

Option 2) Unattended repair/ Do not exit automatically upon completion

Causes DSREPAIR to operate in unattended mode (DSREPAIR runs and exits without intervention). You can also set this option by using the -U switch on the command line.

Option 3) Log errors to a file/ Do not log errors to a file

Designates a file where errors are logged. (Default: SYS:SYSTEM\DSREPAIR.LOG.) (You can also specify a file by using the -L log_filename switch on the command line. Turn this option off if you don't want to log errors.)

Option 4) New Replica Ring Editor Feature/ Do not repair replica ring

Causes DSREPAIR to display the Replica Repair Menu. See "Using the DSREPAIR Utility to Edit the Replica Rings" for more information.

Option 5) Check for valid mail directories/ Do not check for valid mail directories

Causes DSREPAIR to check SYS:MAIL for subdirectories, each of which must be named for the ID of a user in the NDS database. If an object with the ID of the directory name does not exist, the mail directory with that ID is removed.

Option 6) Check file system for valid trustee IDs/ Do not check file system for valid trustee IDs

Causes DSREPAIR to check the file system for valid trustees. DSREPAIR makes sure that every ID in the file system has a corresponding valid ID in the NDS database.
If not, the ID is purged from the database. This operation might take considerable time, depending on the size and number of volumes.

Option 7) Check for valid stream files/ Do not check for valid stream files

Causes DSREPAIR to check for valid stream files. DSREPAIR checks the NDS secure file area for valid stream files, each of which must correspond to a stream property of an object. If a stream file has no corresponding property, it is deleted.

Option 8) Return to the main menu.

B. Beginning the Repair Process

The repair process uses the option settings made in the "Select Options" menu. If the DSREPAIR utility finds any correctable problems, the changes are made in a temporary file set. When checking is completed, a prompt appears, even if no errors were reported, asking whether you want to save the temporary file set. To save the database, select "Yes" at the prompt.

If you save the changes, the "Invalid Trustee" check is performed on the mounted volumes. If trustee assignments exist for objects that have been deleted from the database, the trustee's rights are deleted from the file system. When the repair process is complete, the message "Repair process completed" appears.

C. Saving Replica Ring Information

Saving the replica ring information allows you to keep a record of all the servers within a particular replica ring. You can display this information or print it to a log file (the default filename is SYS:SYSTEM\DSREPAIR.LOG).

D. Checking Server Addresses

This option checks the network address of every server found in the database against the address from SAP (Service Advertising Protocol) and verifies that the network address is correct. If an NCP_SERVER class object does not have a NETWORK_ADDRESS property, one is added. NDS servers advertise their presence to other servers using SAP packets, which contain their name and address. If the address is incorrect, it is replaced with the address from SAP.
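
The address check described so far can be pictured with a short sketch. The Python below is only an illustration under stated assumptions, not DSREPAIR's implementation: the function name, the dictionary layout, the sample addresses, and the handling of a server that is absent from SAP (reported and left unchanged) are all hypothetical. It shows the two repairs the text describes: adding a missing NETWORK_ADDRESS and replacing one that disagrees with SAP.

```python
# Illustrative sketch only; this is not DSREPAIR source code.
# All names, data layouts, and sample addresses are hypothetical.

def check_server_addresses(directory, sap_table):
    """Reconcile stored server addresses against the addresses seen in SAP.

    directory: dict mapping server name -> stored address string, or None
               when the server object has no NETWORK_ADDRESS property.
    sap_table: dict mapping server name -> address advertised through SAP.
    Returns a list of (server, action) tuples in directory order.
    """
    actions = []
    for server, stored in directory.items():
        advertised = sap_table.get(server)
        if advertised is None:
            # Assumption: a server that is not advertising is only reported.
            actions.append((server, "not found in SAP"))
        elif stored is None:
            # Missing property: add one using the SAP address.
            directory[server] = advertised
            actions.append((server, "added"))
        elif stored != advertised:
            # Incorrect address: replace it with the SAP address.
            directory[server] = advertised
            actions.append((server, "replaced"))
        else:
            actions.append((server, "ok"))
    return actions


directory = {
    "SERV1": "01010101:000000000001",
    "SERV2": None,
    "SERV3": "0BAD0BAD:000000000001",
}
sap_table = {
    "SERV1": "01010101:000000000001",
    "SERV2": "02020202:000000000001",
    "SERV3": "03030303:000000000001",
}
for server, action in check_server_addresses(directory, sap_table):
    print(server, action)
```

With this sample data, SERV1 is left alone, SERV2 gains an address, and SERV3 has its address replaced.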
Also, all replica rings are searched to find the server whose address is being verified, and the address in the replica property is also checked. If the address is incorrect, a warning message is generated and displayed. A future version of the DSREPAIR utility will repair the address in the replica property.

NOTE: The local server's database is not locked while this operation is performed.

E. Checking IDs In a Remote ID Table

This option checks the remote server ID table and repairs it.

NOTE: The local server's database is not locked while this operation is performed.

The remote server ID table contains a list of ID pairs. The first ID (local ID) is the ID of a server in the Directory tree. The second ID (remote ID) is the ID of the server you are running DSREPAIR on as it exists in the remote server's database. Because ID numbers are specific and unique to each server, the ID of the server in another server's database will probably be different for every server. This server identifies itself to other servers by using the ID from the other servers' local database.

If DSREPAIR cannot connect to the other server (identified in the local database by the local ID), then no further checking is performed. If a server no longer exists in the tree, then the server object for that server should be deleted. Any external references to the server object are purged when the backlink process cannot resolve the object. This may take one or two days to resolve. As the server object and the external references to it are deleted, the local server ID is deleted from the table. If the deleted servers persist in the ID list, there is some other problem causing the ID checking to fail. If the server is down, then it will be checked later if the repair is run again when the server is back online.

DSREPAIR will verify that the remote ID is correct, then read the remote server's public key and authenticate to the remote server.
If the remote server's object is an external reference on the remote server's database, then it will report error -631 reading the public key. This is because you cannot read a property of an external reference remotely. Therefore, this is not a problem. The connection should still authenticate. If the authentication fails, then there is a damaged key somewhere within the tree. This will be addressed in a future release of DSREPAIR.

If the remote ID resolves to a server other than this one in the remote database, it may be because a rename or move has occurred and the Distinguished Name of this server has changed on the other server. If the remote ID does not seem to be this server's remote ID, DSREPAIR attempts to resolve this server's name on the remote server and put the correct ID in the remote ID table. If the server can authenticate, then the procedure worked and the table has been repaired.

If the local ID cannot be resolved in the local database, then the ID pair is deleted.

F. Checking Remote IDs In Replica Rings

This option contacts each server in the partition root's replica ring. This is done for every partition root on this server.

NOTE: The local server's database is not locked while this operation is performed.

The replica property contains an ID (local ID) for a remote server and the ID (remote ID) of the partition root in that server's database. DSREPAIR verifies that the remote partition ID is valid; if it is not, DSREPAIR tries to repair the remote ID or deletes the server from the replica ring.

G. Moving NetWare 4.0x Servers to the Correct Tree

When a tree merge or tree rename operation is performed in a NetWare 4.1 Directory tree that contains servers running NetWare versions 4.01 and 4.02, you might need to identify the correct tree on the NetWare 4.0x servers after the operation is completed.
If a NetWare 4.02 server is up and can communicate with the NetWare 4.1 server that is running DSMERGE, then the tree should rename properly on the NetWare 4.02 server. If a 4.01 server is in the tree, it will change its tree name, and then it will stop communicating with other servers. To resolve this, down the server and bring it up again, or run DSREPAIR and select the "Begin Repair" option to repair the local database.

If either a 4.01 or a 4.02 server is down during a rename tree or a merge tree operation, it will not be able to change its tree name when it is booted again. This affects only the servers in the source tree during a merge, because servers in the target tree do not change their tree name during a merge. The source tree is the one that the DSMERGE.NLM is running on.

To fix servers running NetWare 4.0x that cannot find the correct tree name after they are brought back up, run the "Move This Server To The Correct Tree" option. A list of all the tree names found in SAP is displayed. You should then select the tree that the server belongs to. You are then asked to log in to the tree to verify that you have rights to perform this operation. Log in as the ADMIN user or a user that has similar rights. The server then attempts to authenticate to the tree, and if this is successful, the tree name for the server is changed.

H. Verifying Backlinks and External References

This option is used to check a Directory database before upgrading a NetWare 4.01 or NetWare 4.02 server to NetWare 4.1.

External references are local placeholders for objects that reside in partitions stored on other servers in the Directory tree. These external reference objects are required locally because they have been referenced by the file system or by an object in a local partition. Backlinks are pointers on local objects to servers that contain an external reference of that object.
NDS maintains the backlinks and external references, and typically removes them when they are no longer needed. There are occasions, however, when these items either have not yet been removed or have failed to be removed.

External references and their associated real objects on another server can have different creation time stamps. This is not abnormal behavior in NetWare 4.01 and NetWare 4.02 environments. NetWare 4.1, however, uses these time stamps differently and requires an external reference and the actual object to have identical creation time stamps.

This option examines the local database integrity with regard to external references and backlinks and removes them when possible. Any time stamp mismatch is recorded in the DSREPAIR.LOG file.

This option should be used and the log file examined to be sure there are no problems reported in the local database before upgrading to NetWare 4.1.

III. Using the DSREPAIR Utility to Edit the Replica Rings

The DSREPAIR utility version 2.23e allows you to resolve errors that occurred because the MASTER replica for a partition is lost or a server within a specific replica ring no longer exists in the Directory tree. The errors occur primarily during partitioning operations such as join and split.

Use the DSREPAIR utility at the system console prompt to designate a MASTER replica and edit the replica rings for a partition.

Procedure

1. At the server console prompt, run DSREPAIR by typing

    dsrepair <Enter>

2. To set options for editing the replica ring, type "1".

3. Select the "New Replica Ring Editor Feature" option by typing "4".

4. Select option "8" to return to the main menu.

5. At the main menu screen, type "2" to begin checking the database.

Changes are saved in a temporary file set until all of the partitions and replicas on the server are checked. Following the local database repair, the "Ring Repair Menu" appears if replicas exist on the server.

A. Using the Ring Repair Menu

The "Ring Repair Menu" displays a list of all replica root objects stored on the server. The list may include the following replica types:

- MASTER
- SECONDARY
- READ-ONLY
- SUBORDINATE

Each replica is assigned a unique number, which appears on the left side of the screen. You should record the unique number that corresponds to the replica that you want to edit.

If more replicas are found than can be displayed on the first screen, you are prompted to press a key to see the next screen until the list is completely displayed. When the list is complete, you are prompted to select a replica by typing the unique number of the replica, or to type "0" (zero) to exit and continue with the repair.

When you select a replica, the following options appear:

1) Change Replica Type to Master
2) Edit Replica Ring

1. Changing the Replica Type to MASTER

Changing a replica type to MASTER is used to select a new MASTER replica for a partition that has lost its MASTER replica. A MASTER replica can be lost when

- The server containing the MASTER replica is uninstalled from the Directory tree.
- The server containing the MASTER replica is damaged or destroyed.

Without a MASTER replica, partition operations such as split and join cannot be performed. This is because all workstation utilities that perform partition operations contact the MASTER replica of a partition first to schedule these operations.

The replica contained on the server where you are running the DSREPAIR utility is changed to the new MASTER replica for a particular partition if the server name exists in the replica ring for that partition. If you want to set some other server that has a replica of the partition as the server containing the MASTER replica, you should run DSREPAIR on that server.

The following conditions apply when changing replicas to the MASTER replica:

SECONDARY or READ-ONLY

You should change only SECONDARY or READ-ONLY replicas to become a MASTER replica.
If another server is found in the replica ring that contains the MASTER replica for a given partition, it is changed to a SECONDARY. If the server that originally contained the MASTER replica is reintroduced into the Directory tree (comes back on line), the new replica ring with the new MASTER replica information will overwrite the old one. The original server that contained the MASTER replica is changed to contain a SECONDARY replica in the replica ring table.

SUBORDINATE

Only in the case of a complete loss of all replicas (MASTER, SECONDARY, READ-ONLY) for a given partition should you change a SUBORDINATE replica to a MASTER. SUBORDINATE replicas exist as reference partition roots for the partition; however, these are not real replicas of the partition.

Because SUBORDINATE replicas are only reference points, there are no copies of the objects, their properties, or their data left in the Directory tree. There may, however, be external reference objects subordinate to the SUBORDINATE reference partition roots that contain the names of the objects. These external reference objects were created on these servers because they needed object IDs to grant rights to the file system for these objects.

You will be able to see the partition with the workstation utilities; however, partition operations will fail, and when you try to view the servers that contain replicas of the partition, none will be displayed. This is because the workstation utilities do not show servers with SUBORDINATE type partition roots.

When you select a SUBORDINATE replica to be changed to the MASTER, a message appears alerting you to run DSREPAIR again on this server. This is because the external reference objects that were contained by the SUBORDINATE replica are now subordinate to a MASTER replica. This is an illegal condition in the Directory database. When DSREPAIR runs again, it will change the external reference objects to real objects with a base class of "unknown."
Be aware that DSREPAIR generates a lot of errors on the objects in this replica when this takes place.

At this point, you have a MASTER replica of the partition, and it contains some but probably not all of the objects that were in the original replica. However, these objects all have the object class of "unknown," and they have no properties or data. You can restore the replica from a tape drive with SMS, which restores all the properties and data for these objects, or, to clean up the tree, you can delete all the objects in the replica. Then join the partition root object with the parent partition, which changes the object from a partition root to just a container, and then delete the container object.

When the deletion is complete, you are returned to the list of replicas. You can select another replica and perform the operation again. Selecting "0" (zero) returns you to the main repair screen.

When the change is completed, a prompt appears asking you if you want to save the changes. Decide if you want to make changes to the database. To change the database, select "Yes" at the prompt.

If you save the changes, the utility then does an "Invalid Trustee" check on the mounted volumes. If trustee assignments exist for objects that have been deleted from the database, the trustee's rights are deleted from the file system. When the repair process is complete, the message "Repair process completed" appears.

2. Editing the Replica Ring for a Partition

Editing the replica ring for a partition primarily means modifying the list of all the servers that contain a replica of a given partition. You should edit the replica ring if any server containing a replica of a partition is uninstalled or does not exist in the Directory tree. Examples of this condition are as follows:

- The server was properly removed from the Directory tree with the INSTALL.NLM utility.
- The server object was deleted; however, some references to the server object still exist because of tree or synchronization failures.
- A server was physically damaged or removed from the tree without being uninstalled.

These conditions commonly generate the -625 error on the NDS server console when the SET DSTRACE=ON command is enabled. The -625 error prevents synchronization from completing properly because NDS attempts to update all servers in a replica ring with the latest information. This error also prevents objects marked for deletion from being purged, because all servers in a replica ring must be informed of a deletion. You should edit the replica ring and delete any server that no longer exists in the Directory tree.

When editing the replica ring, the following information is available:

- A list of all servers that contain a replica
- A unique number assigned to each server
- The type of replica that is contained on a server for that partition
- The server's Directory Name
- The current status of the server in the replica ring ("Present" or "Deleted")

You can delete a server from the replica ring by typing the unique number that corresponds with the server name. After a server is deleted from the ring, it cannot be added again.

CAUTION: If you delete a server that still exists in the Directory tree, that server will no longer be synchronized with the Directory database.

If you make a mistake and delete the wrong server, exit the ring editor by selecting "0" (zero) at each prompt, and return to the main repair screen. When you are prompted to save the database (DIB), select "N" for NO and run DSREPAIR again to access the ring editor and make the correct selection.

You may want to check the ring of each replica on this server. This is the only way to be sure that the failed server has been removed from all replica rings.

When the deletion is complete, you are returned to the list of replicas.
You can select another replica and perform the operation again. Selecting "0" (zero) returns you to the main repair screen.

When checking is completed, a prompt appears asking you if you want to save the changes. Decide if you want to make changes to the database. To change the database, select "Yes" at the prompt.

If you save the changes, the utility changes the database and then does an "Invalid Trustee" check on the mounted volumes. If trustee assignments exist for objects that have been deleted from the database, the trustee's rights are deleted from the file system. When the repair process is complete, the message "Repair process completed" appears.

WARNING: If you remove a server from the replica ring that still has a replica of the partition, you will need to contact a Novell Technical Support representative to correct the problem.

B. Example of Repair Method

Problem: A Server that Was Not Properly Removed

Scenario: The server SERV3-MASTER had been deleted from the tree for several months but still appeared in the DETROIT partition ring on server SERV4-MASTER. The DS console screen on a server in the same partition is generating -625 errors. The DS console screen appears as follows:

    SYNC: Start sync of partition OU=DETROIT.
    2D52D8D6:967 SYNC: Start outbound sync with server <CN=SERV1->
    2D52D8D6:239 SYNC: Start outbound sync with server <CN=SERV2->
    2D52D8D7:052 SYNC: Start outbound sync with server <CN=SERV3->
    2D52D90F:402 (16:23:59) SYNC: failed to communicate with server <CN=SERV3-> ERROR: -625
    2D52D914:962 SYNC: End sync of partition OU=DETROIT. All processed = NO.

To remove SERV3-MASTER from the ring, you should complete the following steps:

Procedure

1. Run the DSREPAIR.NLM on SERV4-MASTER.

2. Select option 1 from the main menu for selection options.

3. From the Selections Menu, choose option "4," "Repair replica ring with manual editor," by typing "4." You are returned to the Main Menu.

4. Select option 2 to start the repair process.
After running DSREPAIR, the following screen appears:

    Replica  Type         Name
    19       SECONDARY    "OU=UGH"
    20       MASTER       "OU=DETROIT"
    21       MASTER       "OU=FOO"
    22       SUBORDINATE  "OU=DOSFISH"

5. Choose the DETROIT partition ring by typing 20 <Enter>.

You can now change the replica on the server to the MASTER or modify the replica ring. This example demonstrates how to modify the replica ring. The following screen appears:

    NetWare 4.01 Directory Services Repair Utility
    You selected replica: OU=DETROIT
    Press "1" to set this server as MASTER in the replica ring.
    Press "2" to manually edit the replica ring.
    Enter selection, or "0" to quit: 2

6. Select option 2 to manually edit the replica ring by typing 2.

The following screen appears, which displays the current servers that are associated with the DETROIT partition ring. The server SERV3- is still present.

    NetWare 4.01 Directory Services Repair Utility
    Select a server in the ring of partition: OU=DETROIT

    Number  State    Type         Server
    1       Present  MASTER       CN=SERV4-
    2       Deleted  SUBORDINATE  CN=WARLOCKS
    3       Present  SUBORDINATE  CN=SERV3-
    4       Present  SUBORDINATE  CN=SERV2-
    5       Present  SUBORDINATE  CN=SERV1-

    Enter the number of the server to delete, or "0" to quit: 3

7. Select SERV3- (number 3) from the ring by typing 3.

This deletes the server from the replica ring and the following screen appears:

    NetWare 4.01 Directory Services Repair Utility
    Select a server in the ring of partition: OU=DETROIT

    Number  State    Type         Server
    1       Present  MASTER       CN=SERV4-
    2       Deleted  SUBORDINATE  CN=WARLOCKS
    3       Deleted  SUBORDINATE  CN=SERV3-
    4       Present  SUBORDINATE  CN=SERV2-
    5       Present  SUBORDINATE  CN=SERV1-

    Enter the number of the server to delete, or "0" to quit: 0

8. Exit the replica ring editor by typing "0" (zero).

================================================================

Disclaimer

Novell, Inc. makes no representations or warranties with respect to the contents or use of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc. reserves the right to revise this publication and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes.

Novell, Inc. makes no representations or warranties with respect to any NetWare software, and specifically disclaims any express or implied warranties of merchantability, title, or fitness for a particular purpose.

Distribution of any NetWare software is forbidden without the express written consent of Novell, Inc. Further, Novell reserves the right to discontinue distribution of any NetWare software.

Novell is not responsible for lost profits or revenue, loss of use of the software, loss of data, costs of recreating lost data, the cost of any substitute equipment or program, or claims by any party other than you. Novell strongly recommends a backup be made before any software is installed. Technical support for this software may be provided at the discretion of Novell.

Trademarks

Novell, Inc. has attempted to supply trademark information about company names, products, and services mentioned in this document. The following list of trademarks was derived from various sources.

Novell and NetWare are registered trademarks and NetWare 4.01, NetWare 4.02, NetWare 4, NetWare Client, NetWare Directory Services and NDS, and NLM are trademarks of Novell, Inc.

All other products and company names are trademarks or registered trademarks of their respective holders.