This chapter contains the following:
You should perform a database backup whenever you want to save the database so that you can restore it to its current state at a later time.
You can use the following methods to restore the database:
If the database is accidentally deleted from a node, use the fs2d daemon to replicate the database from another node in the pool.
If you want to be able to recreate the current configuration, use the build_cmgr_script script. You can then recreate the configuration by running the generated script.
If you want to retain a copy of the database and all node-specific information such as local logging, use the cdbBackup and cdbRestore commands.
If the database has been accidentally deleted from an individual administration node, you can replace it with a copy from another administration node. Do not use this method if the cluster database has been corrupted.
Do the following:
Stop the CXFS daemons by running the following command on each administration node:
# /etc/init.d/cxfs_cluster stop
Run cdbreinit on administration nodes that are missing the cluster database.
Restart the daemons by running the following command on each administration node:
# /etc/init.d/cxfs_cluster start
The fs2d daemon will then replicate the cluster database to those nodes from which it is missing.
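For example, the complete sequence is as follows (using the same full cdbreinit path shown in the examples later in this chapter):

On each administration node:
# /etc/init.d/cxfs_cluster stop

On each administration node missing the database:
# /usr/cluster/bin/cdbreinit

On each administration node:
# /etc/init.d/cxfs_cluster start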
You can use the build_cmgr_script command from one node in the cluster to create a cmgr script that will recreate the node, cluster, switch, and filesystem definitions for all nodes in the cluster database. You can then later run the resulting script to recreate a database with the same contents; this method can be used for missing or corrupted cluster databases.
Note: The build_cmgr_script script does not contain local logging information, so it cannot be used as a complete backup/restore tool.
To perform a database backup, use the build_cmgr_script script from one node in the cluster, as described in “Creating a cmgr Script Automatically” in Chapter 11.
Caution: Do not make configuration changes while you are using the build_cmgr_script command.
By default, this creates a cmgr script in the following location:
/tmp/cmgr_create_cluster_clustername_processID
You can specify another filename by using the -o option.
To perform a restore on all nodes in the pool, do the following:
Stop CXFS services for all nodes in the cluster.
Stop the cluster database daemons on each node.
Remove all copies of the old database by using the cdbreinit command on each node.
Execute the cmgr script (which was generated by the build_cmgr_script script) on the node that is defined first in the script. This will recreate the backed-up database on each node.
Note: If you want to run the generated script on a different node, you must modify the generated script so that the node is the first one listed in the script.
Restart cluster database daemons on each node.
For example, to back up the current database, clear the database, and restore the database to all administration nodes, do the following on administration nodes as directed:
On one:
# /var/cluster/cmgr-scripts/build_cmgr_script -o /tmp/newcdb
Building cmgr script for cluster clusterA ...
build_cmgr_script: Generated cmgr script is /tmp/newcdb

On one:
# stop cx_services for cluster clusterA

On each:
# /etc/init.d/cxfs_cluster stop

On each:
# /usr/cluster/bin/cdbreinit

On each:
# /etc/init.d/cxfs_cluster start

On the *first* node listed in the /tmp/newcdb script:
# /tmp/newcdb
The cdbBackup and cdbRestore commands back up and restore the cluster database and node-specific information, such as local logging information. You must run these commands individually on each node.
To perform a backup of the cluster database, use the cdbBackup command on each node.
Caution: Do not make configuration changes while you are using the cdbBackup command.
To perform a restore, run the cdbRestore command on each node. You can use this method for either a missing or corrupted cluster database. Do the following:
Stop CXFS services.
Stop cluster services on each node.
Remove the old database by using the cdbreinit command on each node.
Stop the cluster services again on each node (they were restarted automatically by cdbreinit in the previous step).
Use the cdbRestore command on each node.
Start cluster services on each node.
For example, to back up the current database, clear the database, and then restore the database to all administration nodes, do the following as directed on administration nodes in the cluster:
On each:
# /usr/cluster/bin/cdbBackup

On one:
# stop cx_services for cluster clusterA

On each:
# /etc/init.d/cxfs_cluster stop

On each:
# /usr/cluster/bin/cdbreinit

On each (again):
# /etc/init.d/cxfs_cluster stop

On each:
# /usr/cluster/bin/cdbRestore

On each:
# /etc/init.d/cxfs_cluster start
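If the pool contains many administration nodes, you can run the per-node cdbBackup step from a single host. The following is a minimal sketch, not a CXFS tool: it assumes passwordless ssh access as root, and the node names node1, node2, and node3 are hypothetical placeholders for your administration nodes.

#!/bin/sh
# Hypothetical list of administration nodes; replace with your own.
NODES="node1 node2 node3"
for n in $NODES; do
    # cdbBackup must be run individually on each node, as noted above.
    ssh root@$n /usr/cluster/bin/cdbBackup
done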
For more information, see the cdbBackup and cdbRestore man page.
The cxfs-config command displays and checks configuration information in the cluster database. You can run it on any administration node in the cluster.
By default, cxfs-config displays the following:
Cluster name and cluster ID
Tiebreaker node
Networks for CXFS kernel-to-kernel messaging
Note: Use of these networks is deferred.
Nodes in the pool:
Node name, node ID, cell ID, status (enabled or disabled), operating system, node function, fail policy, and NICs
CXFS filesystems:
Name, mount point (enabled means that the filesystem is configured to be mounted; if it is not mounted, there is an error)
Device name
Mount options
Potential metadata servers
Nodes that should have the filesystem mounted (if there are no errors)
Switches:
Switch name, user name to use when sending a telnet message, and mask (a hexadecimal string representing a 64-bit port bitmap that indicates the list of ports in the switch that will not be fenced; see the mask illustration after this list)
Ports on the switch that have a client configured for fencing at the other end
Warnings or errors
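To illustrate the mask format (a constructed value, not output from a real configuration): a mask of 000000000000000C has bits 2 and 3 set (hexadecimal C is binary 1100), so ports 2 and 3 of that switch would never be fenced; a mask of 0000000000000000, as in the example below, leaves every port eligible for fencing.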
For example:
thump# /usr/cluster/bin/cxfs-config
Global:
    cluster: topiary (id 1)
    tiebreaker: <none>

Networks:
    net 0: type tcpip 192.168.0.0 255.255.255.0
    net 1: type tcpip 134.14.54.0 255.255.255.0
    net 2: type tcpip 1.2.3.4 255.255.255.0

Machines:
    node leesa: node 6 cell 2 enabled Linux32 client_only
        fail policy: Fence
        nic 0: address: 192.168.0.164 priority: 1 network: 0
        nic 1: address: 134.14.54.164 priority: 2 network: 1

    node thud: node 8 cell 1 enabled IRIX client_admin
        fail policy: Fence
        nic 0: address: 192.168.0.204 priority: 1 network: 0
        nic 1: address: 134.14.54.204 priority: 2 network: 1

    node thump: node 1 cell 0 enabled IRIX server_admin
        fail policy: Fence
        nic 0: address: 192.168.0.186 priority: 1 network: 0
        nic 1: address: 134.14.54.186 priority: 2 network: 1

Filesystems:
    fs dxm: /mnt/dxm enabled
        device = /dev/cxvm/tp9500a4s0
        options = []
        servers = thump (1)
        clients = leesa, thud, thump

Switches:
    switch 0: admin@asg-fcsw1 mask 0000000000000000
        port 8: 210000e08b0ead8c thump
        port 12: 210000e08b081f23 thud

    switch 1: admin@asg-fcsw0 mask 0000000000000000

Warnings/errors:
    enabled machine leesa has fencing enabled but is not present in switch database
The command has the following options:
-ping contacts each NIC in the machine list and displays whether packets are transmitted and received. For example:
node leesa: node 6 cell 2 enabled Linux32 client_only
    fail policy: Fence
    nic 0: address: 192.168.0.164 priority: 1
        ping: 5 packets transmitted, 5 packets received, 0.0% packet loss
        ping: round-trip min/avg/max = 0.477/0.666/1.375 ms
    nic 1: address: 134.14.54.164 priority: 2
        ping: 5 packets transmitted, 5 packets received, 0.0% packet loss
        ping: round-trip min/avg/max = 0.469/0.645/1.313 ms
-xfs lists XFS information for each CXFS filesystem, such as size. For example:
Filesystems:
    fs dxm: /mnt/dxm enabled
        device = /dev/cxvm/tp9500a4s0
        options = []
        servers = thump (1)
        clients = leesa, thud, thump
        xfs:
            magic: 0x58465342
            blocksize: 4096
            uuid: 3459ee2e-76c9-1027-8068-0800690dac3c
            data size 17.00 Gb
-xvm lists XVM information for each CXFS filesystem, such as volume size and topology. For example:
Filesystems:
    fs dxm: /mnt/dxm enabled
        device = /dev/cxvm/tp9500a4s0
        options = []
        servers = thump (1)
        clients = leesa, thud, thump
        xvm:
            vol/tp9500a4s0          0         online,open
            subvol/tp9500a4s0/data  35650048  online,open
            slice/tp9500a4s0        35650048  online,open

            data size: 17.00 Gb
-check performs extra verification, such as comparing the XFS filesystem size with the XVM volume size for each CXFS filesystem. This option may take a few moments to execute.
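For example (the invocation only; the output depends on your configuration):

# /usr/cluster/bin/cxfs-config -check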
For more information, see the cxfs-config man page.