We went over nslookup and the name server's debugging output in the last two chapters. Before we go on, let's introduce two new tools that can be useful in troubleshooting: named-xfer and name server database dumps.
named-xfer is the program named starts to perform zone transfers. named-xfer checks whether the slave's copy of the zone data is up-to-date, and transfers a new zone, if necessary. (In versions 4.9 and 8, named checks first if a zone is up to date in order to avoid starting up child processes when no transfer is necessary.)
In Chapter 12, Reading BIND Debugging Output, we showed you the debugging output a slave name server logged as it checked its zone. When the slave server transferred the zone, it started a child process (named-xfer) to pull the data to the local file system. We didn't tell you, however, that you can also start named-xfer manually, instead of waiting for named to start it, and that you can tell it to produce debugging output independently of named.
This can be useful if you're tracking down a problem with zone transfers but don't want to wait for named to schedule one. To test a zone transfer manually, you need to specify a number of command-line options:
/etc/named-xferUsage: xfer -z zone_to_transfer -f db_file -s serial_no [-d debug_level] [-l debug_log_file] [-t trace_file] [-p port] [-S] [-C class] servers...
When named starts named-xfer, it specifies the -z option (the zone named wants to check), the -f option (the name of the db file that corresponds to the zone, from named.boot), the -s option (the zone's serial number on the slave, from the current SOA record), and the addresses of the servers the secondary was instructed to load from (the IP addresses from the masters substatement in the zone statement in named.conf, or from the secondary directive in named.boot). If named is running in debug mode, it also specifies the debug level for named-xfer with the -d option.
When you run named-xfer manually, you can also specify the debug level on the command line with -d. (Don't forget, though, that debug levels above three will produce tons of debugging output if the transfer succeeds!) You can also specify an alternate filename for the debug file with the -l option. The default log file is /usr/tmp/xfer.ddt.XXXXXX, where XXXXXX is a suffix appended to preserve uniqueness, or a file by the same name in /var/tmp. And you can specify the name of the host to load from, instead of its IP address.
For example, you could check to see whether zone transfers from terminator were working, with the following command line:
/etc/named-xfer -z movie.edu -f /tmp/db.movie -s 0 terminator%
In this command, we specified a serial number of zero because we wanted to force named-xfer to attempt a zone transfer, even if it wasn't needed. Zero is a special serial number - named-xfer will transfer the zone regardless of the actual zone serial number. Also, we told named-xfer to put the new zone file in /tmp, rather than overwriting the zone's working data file.
We can tell if the transfer succeeded by looking at named-xfer's return value, which has four possible values:
The zone data is up-to-date and no transfer was needed
Indicates a successful transfer
The host(s) named-xfer queried can't be reached, or an error occurred and named-xfer may have logged an error message
An error occurred and named-xfer logged an error message
Poring over a dump of the name server's internal database - including cached information - can also help you track down problems. The INT signal causes named to dump its authoritative data, cache data, and hints data to named_dump.db in BIND's running directory (or in /usr/tmp/named_dump.db or /var/tmp/named_dump.db, for BIND 4). An example of a named_dump.db file follows. The authoritative data and cache entries, mixed together, appear first in the file. At the end of the file are the hints data:
; Dumped at Tue Jan 6 10:49:08 1998 ;; ++zone table++ ; 0.0.127.in-addr.arpa (type 1, class 1, source db.127.0.0) ; time=0, lastupdate=0, serial=1, ; refresh=0, retry=3600, expire=608400, minimum=86400 ; ftime=884015430, xaddr=[0.0.0.0], state=0041, pid=0 ;; --zone table-- ; Note: Cr=(auth,answer,addtnl,cache) tag only shown for non-auth RR's ; Note: NT=milliseconds for any A RR which we've used as a nameserver ; --- Cache & Data --- $ORIGIN . . 518375 IN NS G.ROOT-SERVERS.NET. ;Cr=auth [22.214.171.124] 518375 IN NS J.ROOT-SERVERS.NET. ;Cr=auth [126.96.36.199] 518375 IN NS K.ROOT-SERVERS.NET. ;Cr=auth [188.8.131.52] 518375 IN NS L.ROOT-SERVERS.NET. ;Cr=auth [184.108.40.206] 518375 IN NS M.ROOT-SERVERS.NET. ;Cr=auth [220.127.116.11] 518375 IN NS A.ROOT-SERVERS.NET. ;Cr=auth [18.104.22.168] 518375 IN NS H.ROOT-SERVERS.NET. ;Cr=auth [22.214.171.124] 518375 IN NS B.ROOT-SERVERS.NET. ;Cr=auth [126.96.36.199] 518375 IN NS C.ROOT-SERVERS.NET. ;Cr=auth [188.8.131.52] 518375 IN NS D.ROOT-SERVERS.NET. ;Cr=auth [184.108.40.206] 518375 IN NS E.ROOT-SERVERS.NET. ;Cr=auth [220.127.116.11] 518375 IN NS I.ROOT-SERVERS.NET. ;Cr=auth [18.104.22.168] 518375 IN NS F.ROOT-SERVERS.NET. ;Cr=auth [22.214.171.124] EDU 86393 IN SOA A.ROOT-SERVERS.NET. hostmaster.INTERNIC.NET. ( 1998010500 1800 900 604800 86400 ) ;Cr=addtnl [126.96.36.199] $ORIGIN 0.127.in-addr.arpa. 0 IN SOA cujo.movie.edu. root.cujo.movie.edu. ( 1998010600 10800 3600 608400 86400 ) ;Cl=5 IN NS cujo.movie.edu. ;Cl=5 $ORIGIN 0.0.127.in-addr.arpa. 1 IN PTR localhost. ;Cl=5 $ORIGIN EDU. PURDUE 172787 IN NS NS.PURDUE.EDU. ;Cr=addtnl [188.8.131.52] 172787 IN NS MOE.RICE.EDU. ;Cr=addtnl [184.108.40.206] 172787 IN NS PENDRAGON.CS.PURDUE.EDU. ;Cr=addtnl [220.127.116.11] 172787 IN NS HARBOR.ECN.PURDUE.EDU. ;Cr=addtnl [18.104.22.168] $ORIGIN movie.EDU. ;cujo 593 IN SOA A.ROOT-SERVERS.NET. hostmaster.INTERNIC. NET. ( ; 1998010500 1800 900 604800 86400 );EDU.; NXDOMAIN ;-$ ;Cr=auth [22.214.171.124] $ORIGIN RICE.EDU. MOE 172787 IN A 126.96.36.199 ;NT=84 Cr=addtnl [188.8.131.52] $ORIGIN PURDUE.EDU. CS 86387 IN NS pendragon.cs.PURDUE.edu. ;Cr=addtnl [184.108.40.206] 86387 IN NS ns.PURDUE.edu. ;Cr=addtnl [220.127.116.11] 86387 IN NS harbor.ecn.PURDUE.edu. ;Cr=addtnl [18.104.22.168] 86387 IN NS moe.rice.edu. ;Cr=addtnl [22.214.171.124] NS 172787 IN A 126.96.36.199 ;NT=4 Cr=addtnl [188.8.131.52] $ORIGIN ECN.PURDUE.EDU. HARBOR 172787 IN A 184.108.40.206 ;NT=6 Cr=addtnl [220.127.116.11] $ORIGIN CS.PURDUE.EDU. galt 86387 IN A 18.104.22.168 ;Cr=auth [22.214.171.124] PENDRAGON 172787 IN A 126.96.36.199 ;NT=20 Cr=addtnl [188.8.131.52] $ORIGIN ROOT-SERVERS.NET. K 604775 IN A 184.108.40.206 ;NT=10 Cr=answer [220.127.116.11] A 604775 IN A 18.104.22.168 ;NT=20 Cr=answer [22.214.171.124] L 604775 IN A 126.96.36.199 ;NT=8 Cr=answer [188.8.131.52] B 604775 IN A 184.108.40.206 ;NT=9 Cr=answer [220.127.116.11] M 604775 IN A 18.104.22.168 ;NT=20 Cr=answer [22.214.171.124] C 604775 IN A 126.96.36.199 ;NT=17 Cr=answer [188.8.131.52] D 604775 IN A 184.108.40.206 ;NT=11 Cr=answer [220.127.116.11] E 604775 IN A 18.104.22.168 ;NT=9 Cr=answer [22.214.171.124] F 604775 IN A 126.96.36.199 ;NT=73 Cr=answer [188.8.131.52] G 604775 IN A 184.108.40.206 ;NT=14 Cr=answer [220.127.116.11] H 604775 IN A 18.104.22.168 ;NT=160 Cr=answer [22.214.171.124] I 604775 IN A 126.96.36.199 ;NT=102 Cr=answer [188.8.131.52] J 604775 IN A 184.108.40.206 ;NT=21 Cr=answer [220.127.116.11] ; --- Hints --- $ORIGIN . . 3600 IN NS A.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS B.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS C.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS D.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS E.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS F.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS G.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS H.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS I.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS J.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS K.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS L.ROOT-SERVERS.NET. ;Cl=0 3600 IN NS M.ROOT-SERVERS.NET. ;Cl=0 $ORIGIN ROOT-SERVERS.NET. K 3600 IN A 18.104.22.168 ;NT=11 Cl=0 L 3600 IN A 22.214.171.124 ;NT=9 Cl=0 A 3600 IN A 126.96.36.199 ;NT=10 Cl=0 M 3600 IN A 188.8.131.52 ;NT=11 Cl=0 B 3600 IN A 184.108.40.206 ;NT=1288 Cl=0 C 3600 IN A 220.127.116.11 ;NT=21 Cl=0 D 3600 IN A 18.104.22.168 ;NT=1288 Cl=0 E 3600 IN A 22.214.171.124 ;NT=19 Cl=0 F 3600 IN A 126.96.36.199 ;NT=23 Cl=0 G 3600 IN A 188.8.131.52 ;NT=18 Cl=0 H 3600 IN A 184.108.40.206 ;NT=11 Cl=0 I 3600 IN A 220.127.116.11 ;NT=21 Cl=0 J 3600 IN A 18.104.22.168 ;NT=13 Cl=0
The name server that created this named_dump.db file was authoritative only for 0.0.127.in-addr.arpa. Only two names have been looked up by this server: galt.cs.purdue.edu and cujo.movie.edu. In the process of looking up galt, this server cached not only the address of galt, but also the list of name servers for purdue.edu and the addresses for those servers. The name cujo.movie.edu, however, doesn't really exist (nor does the domain movie.edu, except in our examples), so the server cached the negative response. In the dump file, the negative response is commented out (the line starts with a semicolon), and the reason is listed (NXDOMAIN) instead of real data. You'll notice the TTL is quite low (593). Negative responses are only cached for 10 minutes (600 seconds).
The hints section at the bottom of the file contains the data from the db.cache file. The TTL of the hints data is decremented, and it may go to zero, but the hints are never discarded.
Note that some of the resource records are followed by a semicolon and NT=. You will only see these on the address records of name servers. The number is a round-trip time calculation that the name server keeps so that it knows which name servers have responded most quickly in the past; the name server with the lowest round-trip time will be tried first the next time.
The cache data is easy to pick out - those entries have a credibility tag (Cr=) and the IP address of the server the data came from. The zone data and hint data are tagged with (Cl=), which is just a count of the level in the domain tree. (root is level 0, foo is level 1, foo.foo is level 2, etc.) Let's digress a moment to explain the concept of credibility.
One of the advances between version 4.8.3 and 4.9 is the addition of a credibility measure. This allows a name server to make more intelligent decisions about what to do with new data from a remote server.
A 4.8.3 name server only had two credibility levels - locally authoritative data, and everything else. The locally authoritative data were data from your zone files - your name server knew better than to update its internal copy of what came from your zone file. But, all data from remote name servers were considered equal.
Here is a situation that could happen, and the way a 4.8.3 server would deal with it. Suppose that your server looked up an address for terminator.movie.edu and received an authoritative answer from the movie.edu name server. (Remember, an authoritative answer is the best you can get.) Sometime later while looking up foo.ora.com, your server receives another address record for terminator.movie.edu, but this time as part of the delegation info for ora.com (which terminator.movie.edu backs up). The 4.8.3 name server would update the cached address record for terminator.movie.edu, even though the data came from the com name server instead of the authoritative movie.edu name server. Of course, the com and movie.edu name servers will have exactly the same data for terminator, so this won't be a problem, right? Yeah, and it never rains in southern California, either.
A 4.9 or 8 name server is more intelligent. Like a 4.8.3 name server, it still considers your zone data beyond any doubt. But a 4.9 name server distinguishes among the different data from remote name servers. Here is the hierarchy of remote data credibility from most credible to least:
These records are data from authoritative answers - the answer section of a response packet with the authoritative answer bit set.
These records are data from the rest of the response packet - the authority and additional. The authority section of the response contains NS records that delegate a domain to an authoritative name server. The additional section contains address records that may complete information in other sections (i.e., address records that go with NS records in the authority section).
There is one exception to this rule: when the server is priming its root name server cache, the records that would be at credibility addtnl are bumped up to answer credibility, to make them harder to change accidentally. Notice in the dump that the address records for root name servers are at credibility answer, but the address records for the purdue.edu name servers are at credibility addtnl.
In the situation just described, a 4.9 or 8 name server would not replace the authoritative data (credibility = auth) for terminator.movie.edu with the delegation data (credibility = addtnl) because the authoritative answer has higher credibility.
BIND version 4.9 added a feature, query logging, which can be used to help diagnose certain problems. When query logging is turned on, a running name server will log every query with syslog. This feature could help you find resolver configuration errors, because you can check that the name you think is being looked up really is the name being looked up.
First you must make sure that LOG_INFO messages are being logged by syslog for the facility daemon. Next you need to turn on query logging. This can be done three ways: for BIND 4.9, set options query-log in your name server boot file; for BIND 4.9 or BIND 8, start the name server with -q on the command line, or send a WINCH signal to a running name server. You'll start seeing syslog messages like this:
Feb 20 21:43:25 terminator named: XX /22.214.171.124/carrie.movie.edu/A Feb 20 21:43:32 terminator named: XX /126.96.36.199/188.8.131.52.in-addr.arpa/PTR
These messages include the IP address of the host that made the query and the query itself. Inverse queries will have a dash before the query type (i.e., an inverse query for an address record would be logged as "-A" instead of just "A"). After enough queries have been logged, you can turn off query logging by sending a WINCH signal to your name server.