USDA ARS VCRU AFS File Servers

udebug

The following document was copied from http://www-01.ibm.com/support/docview.wss?rs=961&context=SSXMUG&dc=DB520&uid=swg21113428

The udebug program allows a system administrator to examine the Ubik status of a database server by specifying the port of the database server process to check. This program is distributed with AFS in the third tar file, and is normally installed in /usr/afs/bin.

Syntax:
% udebug -help
Usage: udebug -servers <server machine> [-port <IP port>] [-long ] [-help ]

As with most other AFS commands, the "-long" switch can be used to obtain more detailed output. The port numbers for the database servers may be obtained at the link following this document.


In each of the examples that follow, the following information applies:

"Host ###.###.###.###" shows the IP address of the host with which you are communicating.

"his time is 0" indicates that clocks are synchronized relative to the client's time.

"his time is n" indicates the number of seconds of skew between the local workstation clock and the server's clock. This number should be 2 or less. A skew greater than 5 to 10 seconds can cause synchronization problems. Note that the workstation's clock may be the one that is off, not necessarily the server's clock. To verify the server's clock, compare it with the other servers' times.

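The skew guidance above can be expressed as a small check. The following Python sketch is illustrative only: the function names and the three labels are assumptions, and the "watch" band for values from 3 to 5 is one reading of the text, not an AFS-defined threshold.

```python
import re

# Matches the first line of udebug output, e.g. "Host 192.55.207.13, his time is 1".
HOST_LINE = re.compile(r"Host (\d+\.\d+\.\d+\.\d+), his time is (-?\d+)")


def parse_host_line(line):
    """Return (ip_address, skew_seconds), or None if the line does not match."""
    match = HOST_LINE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def classify_skew(seconds):
    """Label a skew value; the labels and the middle band are assumptions."""
    skew = abs(seconds)
    if skew <= 2:
        return "ok"       # the text says the value should be 2 or less
    if skew <= 5:
        return "watch"    # assumed band between the two thresholds
    return "suspect"      # greater than 5: possible synchronization problems


host, skew = parse_host_line("Host 192.55.207.13, his time is 1")
print(host, skew, classify_skew(skew))
```

Because the skew is relative to the workstation running udebug, a "suspect" result on every server more likely points at the workstation's clock than at the servers.
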
"Vote: Last yes vote for ###.###.###.### at -xx" displays the server site that it last voted for and the number of seconds that have passed since that vote was performed. A vote is taken every 15 seconds to determine a sync site. Usually, the only server conducting a vote will be the sync site. However, if any server goes for 75 seconds without hearing from the sync site, then that server will initiate a vote. When a server is voting for an "unknown" file server you will see its vote for address 255.255.255.255. Note, the "-xx" should be less than 90.

"Last vote started at -#########" is the number of seconds that have passed since a yes vote was received. This value is either uninitialized or less than 90.

"Local db version is #########.#" is the version of the database at the local site. This should match the version listed with "Sync site's db version" described below. The first part of the version number is a time stamp indicating when the current sync site was elected. The second part following the "." starts at 1 and is incremented on every update made to the database.

"I am not sync site" is displayed if the server with which you are communicating is not the sync site.
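As a rough illustration of the two-part version number, the Python sketch below splits a version string and compares two versions. The function names are hypothetical, and reading the first part as Unix epoch seconds is an assumption; the text says only that it is a time stamp.

```python
from datetime import datetime, timezone


def parse_db_version(version):
    """Split 'timestamp.counter' into a pair of integers."""
    stamp_text, counter_text = version.split(".", 1)
    return int(stamp_text), int(counter_text)


def election_time(version):
    """Interpret the first part as Unix epoch seconds (an assumption)."""
    stamp, _counter = parse_db_version(version)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


def versions_match(local_version, sync_version):
    """True when the local version equals the sync site's version."""
    return parse_db_version(local_version) == parse_db_version(sync_version)


print(parse_db_version("719493856.463"))
print(election_time("637420836.1").isoformat())
print(versions_match("637420836.1", "0.0"))
```

A mismatch such as "637420836.1" against "0.0" is what a freshly started server reports before it has found a sync site.
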

"I am sync site until xx (x servers)" displays only if the server with which you are communicating is the sync site.

"Recovery state xx" will appear only on the sync site. A hex value is displayed that shows the status. The normal status is "1f" which will be shown if all of the bits are turned on. Any value other than "1f" could indicate problems, and the status of each bit should be investigated.

Bit descriptions counting from the right (i.e. 54321):
-The 1st bit (1) shows that it is the sync site.
-The 2nd bit (2) shows that all of the database versions it sees on the other database servers are okay.
-The 3rd bit (4) shows that the sync site has fetched the best database that it sees.
-The 4th bit (8) shows that it has changed the version number of its database.
-The 5th bit (16) shows that since it became the sync site, it has sent a copy of its database to the other database servers.
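
The bit descriptions above can be turned into a small decoder that reports which bits are missing from a "Recovery state" value. This is a sketch only; the table and function names are hypothetical, and the meanings are paraphrased from the list above.

```python
# (mask, meaning) pairs taken from the bit descriptions above.
RECOVERY_BITS = [
    (0x01, "is the sync site"),
    (0x02, "database versions on the other servers are okay"),
    (0x04, "has fetched the best database it sees"),
    (0x08, "has changed the version number of its database"),
    (0x10, "has sent its database to the other servers"),
]


def missing_recovery_bits(state_hex):
    """Return the meanings of every bit that is NOT set in the hex state."""
    state = int(state_hex, 16)
    return [meaning for mask, meaning in RECOVERY_BITS if not state & mask]


for value in ("1f", "f", "17"):
    print(value, missing_recovery_bits(value))
```

An empty list corresponds to the normal "1f" state; any other result names the bits to investigate.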

"Lowest host ###.###.###.### at -x" displays the IP address of the server with the lowest IP address and the number of seconds that have passed since it voted for the lowest IP address host.

"Sync host ###.###.###.### at -x" displays the IP address of the file server machine that is the sync site, and the number of seconds that have passed since it voted for the sync site.

"Sync site's db version is #########.#" indicates the version of the database which is housed at the sync site. This should match the version listed with "Local db version" above.

"n locked pages, n of them for write" indicates the status of currently active updates, if any. This normally will display 0. If it appears to be stuck at a value other than 0, contact AFS Product Support.

"This server started at -#########" displays the number of seconds since the last restart of the file server. This field is not always reinitialized.


Example displays:

When a server has just started, and before it votes for another server, the output will look like this:

% udebug server3 7002
Host 192.55.207.13, his time is 1
Vote: Last yes vote for 255.255.255.255 at -76 (not sync site);
Last vote started at -637421183
Local db version is 637420836.1
I am not sync site
Lowest host 192.54.226.1 at -5
Sync host 192.54.226.1 at -5
Sync site's db version is 0.0
0 locked pages, 0 of them for write
This server started at -637421183


When a server that is not the sync site votes for a sync site and finds one, the output looks like this:

% udebug server2 7002
Host 192.55.207.13, his time is 0
Vote: Last yes vote for 192.54.226.1 at -9 (sync site);
Last vote started at -9
Local db version is 637420836.1
I am not sync site
Lowest host 192.54.226.1 at -9
Sync host 192.54.226.1 at -9
Sync site's db version is 637420836.1
0 locked pages, 0 of them for write
This server started at -637421591
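
The checks described earlier for a server that is not the sync site (matching database versions, a recent yes vote, and a vote for a known server) could be scripted roughly as follows. This is a minimal sketch in Python that works on captured output text only; the helper names are hypothetical, and the sample is the output shown above.

```python
import re

SAMPLE = """\
Host 192.55.207.13, his time is 0
Vote: Last yes vote for 192.54.226.1 at -9 (sync site);
Last vote started at -9
Local db version is 637420836.1
I am not sync site
Lowest host 192.54.226.1 at -9
Sync host 192.54.226.1 at -9
Sync site's db version is 637420836.1
0 locked pages, 0 of them for write
This server started at -637421591
"""


def first_group(pattern, text):
    """Return the first capture group of pattern in text, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def check_non_sync_site(text):
    """Return a list of human-readable problems found in udebug output."""
    problems = []
    local = first_group(r"Local db version is (\S+)", text)
    sync = first_group(r"Sync site's db version is (\S+)", text)
    if local != sync:
        problems.append(f"db version mismatch: local {local}, sync site {sync}")
    vote_age = first_group(r"Last yes vote for \S+ at -(\d+)", text)
    if vote_age is None or int(vote_age) >= 90:
        problems.append(f"last yes vote age is {vote_age}, expected less than 90")
    if first_group(r"Last yes vote for (\S+)", text) == "255.255.255.255":
        problems.append("last yes vote was for an unknown server")
    return problems


print(check_non_sync_site(SAMPLE))
```

Run against the output of a server that has just started, the same function would report both the version mismatch and the vote for 255.255.255.255.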


The output from a sync site is displayed below. Note that the status of each of the other database servers is also displayed, with the following information:

"Server ###.###.###.###: (db #########.#)" shows the server IP address and the version of the database which that server has.

"last vote rcvd at -#" indicates the number of seconds that have passed since that server voted for the sync site (if vote was yes). This value is normally less than 16, and is always less than 90 unless one or more servers are down or unreachable.

"last beacon sent at -#" indicates the number of seconds that have passed since the sync site sent beacons to the other database servers to verify that they are up. This value is normally less than 16, and is always less than 90 unless one or more servers are down or unreachable.

"last vote was ..." displays the status of the last vote taken.

"dbcurrent=n" shows whether the database on that server is up to date with the database on the sync site. If n is 0, the database is not current; if n is 1, the database is current.

"up=n" displays if the server is responding to beacons. (n = 0 (not responding), n = 1 (responding))

"beaconSince=n" displays the number of seconds that have passed since the last response to a beacon was received. This number should be small if the server is up.

% udebug bigbird 7002
Host 158.98.3.3, his time is 1
Vote: Last yes vote for 158.98.3.3 at -1 (sync site); Last vote started at -1
Local db version is 719493856.463
I am sync site until 59 (3 servers)
Recovery state 1f
Sync site's db version is 719493856.463
0 locked pages, 0 of them for write
This server started at -5048149

Server 192.55.207.52: (db 719493856.463)
last vote rcvd at -2, last beacon sent at -2, last vote was yes
dbcurrent=1, up=1 beaconSince=1

Server 158.98.3.2: (db 719493856.463)
last vote rcvd at -2, last beacon sent at -2, last vote was yes
dbcurrent=1, up=1 beaconSince=1
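
The per-server entries in the sync site output are regular enough to parse. The following Python sketch extracts them and lists servers that look unhealthy. The regular expression, field names, and the definition of "unhealthy" are assumptions based on the sample above, not an official output format.

```python
import re

SAMPLE = """\
Server 192.55.207.52: (db 719493856.463)
last vote rcvd at -2, last beacon sent at -2, last vote was yes
dbcurrent=1, up=1 beaconSince=1

Server 158.98.3.2: (db 719493856.463)
last vote rcvd at -2, last beacon sent at -2, last vote was yes
dbcurrent=1, up=1 beaconSince=1
"""

# One match per three-line "Server ..." entry.
SERVER_ENTRY = re.compile(
    r"Server ([\d.]+): \(db (\d+\.\d+)\)\s+"
    r"last vote rcvd at -?(\d+), last beacon sent at -?(\d+), last vote was (\w+)\s+"
    r"dbcurrent=(\d), up=(\d) beaconSince=(\d+)"
)


def parse_servers(text):
    """Return one dict per server entry found in the text."""
    servers = []
    for m in SERVER_ENTRY.finditer(text):
        servers.append({
            "address": m.group(1),
            "db_version": m.group(2),
            "vote_age": int(m.group(3)),
            "beacon_age": int(m.group(4)),
            "last_vote": m.group(5),
            "dbcurrent": m.group(6) == "1",
            "up": m.group(7) == "1",
            "beacon_since": int(m.group(8)),
        })
    return servers


def unhealthy(servers):
    """Addresses of servers that are down, not current, or did not vote yes."""
    return [s["address"] for s in servers
            if not (s["up"] and s["dbcurrent"] and s["last_vote"] == "yes")]


print(unhealthy(parse_servers(SAMPLE)))
```

The vote and beacon ages are kept as integers so that they can be compared against the limits of 16 and 90 seconds described above.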

When running a single database server, a vote is never performed. Therefore, udebug will display "Vote: Last yes vote for 255.255.255.255", and the times displayed will get very large. In the first few lines of the output, the server reports that it is not the sync site. However, it indicates that it is the sync site later on in the output.

Here is what the output looks like for a single database server system:

% udebug server3 7003
Host 192.55.207.5, his time is 0
Vote: Last yes vote for 255.255.255.255 at -60244 (not sync site);
Last vote started at -637421924
Local db version is 637361685.1
I am sync site until 1510061723 (1 servers)
Recovery state 1f
Sync site's db version is 648534534.1
0 locked pages, 0 of them for write
This server started at -60239


Related information
AFS UDP Ports