
8.4 Changing TTLs

An experienced domain administrator needs to know how to set the time to live on his zone's data to his best advantage. The TTL on a resource record, remember, is the time in seconds any server can cache that record. So if the TTL for a particular resource record is 3600 (seconds), and a server outside your domain caches that record, it will have to remove the entry from its cache after an hour. If it needs the same data after the hour is up, it'll have to query your name servers again.

When we introduced TTLs, we emphasized that your choice of a TTL would dictate how current you'd keep copies of your data, at the cost of increased load on your name servers. A low TTL would mean that name servers outside your domain would have to get data from your name servers often, and would therefore keep current. On the other hand, your name servers would be peppered by their queries.

You don't have to choose a TTL once and for all, though. You can - and experienced administrators do - change TTLs periodically to suit your needs.

Suppose we know that one of our hosts is about to be moved to another network. This host is the movie.edu film library. It houses a large collection of files our site makes available to hosts on the Internet. During normal operation, outside name servers cache the address of our host according to the minimum TTL in the SOA record. (We set the movie.edu TTL to be one day in our sample files.) A name server caching the old address record just before the change could have the wrong address for as long as a day. A loss of connectivity for a day is unacceptable, though.

What can we do to minimize the loss of connectivity? We can lower the TTL, so that outside servers cache the address record for a shorter period. By reducing the TTL, we force the outside servers to update their data more frequently, which means that any changes we make when we actually move the system will be propagated to the outside world quickly. How low can we make the TTL? Unfortunately, we can't safely use a TTL of zero, which should mean "don't cache this record at all." Some older BIND 4 name servers can't return records with a TTL of zero, instead returning null answers or SERVFAIL errors. Small TTLs, like 30 seconds, are okay, though.

The easiest change is to lower the TTL in the SOA record in the db.movie file. If you don't place an explicit TTL on resource records in the db files, the name server applies this minimum TTL from the SOA record to each resource record. If you lower the minimum TTL field, though, the new, lower TTL applies to every record that has no explicit TTL, not just the address of the host being moved. The drawback to this approach is that your name server will be answering a lot more queries, since the querying servers will cache all the data in your zone for a shorter period. A better alternative is to put a different TTL only on the affected address record.

To add an explicit TTL on an individual resource record, place it before the IN in the class field. The TTL value is in seconds. Here's an example of an explicit TTL from db.movie:

cujo  3600 IN  A    192.253.253.5  ; explicit TTL of 1 hour

If you're observant, you may have noticed a potential problem: the explicit TTL on cujo's address is 3600 seconds, but the TTL field in the SOA record - ostensibly the minimum TTL for the zone - is higher. Which takes precedence?

If BIND followed the DNS RFCs, the TTL field in the SOA record would really define the minimum TTL value for all resource records in the zone. Thus, you could only specify explicit TTLs larger than this minimum. BIND name servers don't work this way, though. In other words, in BIND, "minimum" is not really minimum. Instead, BIND implements the minimum TTL field in the SOA record as a "default" TTL. If there is no TTL on a record, the minimum applies. If there is a TTL on the resource record, BIND allows it even if it is smaller than the minimum. That one record is sent out in responses with the smaller TTL, while all other records are sent out with the "minimum" TTL from the SOA record.
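The "default, not minimum" rule can be summarized in a few lines of Python. This is only a sketch of the behavior described above, not BIND's code; the function name effective_ttl and the constant SOA_MINIMUM are made up for illustration.

```python
from typing import Optional

SOA_MINIMUM = 86400  # one day, as in the sample movie.edu files


def effective_ttl(explicit_ttl: Optional[int], soa_minimum: int) -> int:
    """Return the TTL a BIND server would send for a record (hypothetical helper).

    An explicit TTL is used even when it is smaller than the SOA "minimum";
    the SOA value only fills in for records that carry no TTL of their own.
    """
    if explicit_ttl is not None:
        return explicit_ttl
    return soa_minimum


# cujo has an explicit TTL of one hour; other records have none.
print(effective_ttl(3600, SOA_MINIMUM))
print(effective_ttl(None, SOA_MINIMUM))
```

The first call models cujo's address record and the second models any record in db.movie that has no TTL field.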

You should also know that when giving out answers, a slave supplies the same TTL a primary does - that is, if a primary gives out a TTL of 86400 for a particular record, a slave will, too. The slave doesn't decrement the TTL according to how long it has been since it loaded the zone. So, if the TTL of a single resource record is set smaller than the SOA minimum, both the primary and slave name servers give out the resource record with the same, smaller TTL. If the slave name server has reached the expiration time for the zone, it expires the whole zone. It will never expire an individual resource record within a zone.

So BIND does allow you to put a small TTL on an individual resource record if you know that the data is going to change shortly. Thus, any server caching that data only caches it for a brief time. Unfortunately, while BIND makes tagging records with a small TTL possible, most domain administrators don't spend the time to do it. When a host changes address, you often lose connectivity to it for a while.

More often than not, the host having its address changed is not one of the main hubs on the site, so the outage impacts few people. If one of the mail hubs or a major ftp repository - like the film library - is moving, though, a day's loss of connectivity may be unacceptable. In cases like this, the domain administrator should plan ahead and reduce the TTL on the data to be changed.

Remember that the TTL on the affected data will need to be lowered before the change takes place. Reducing the TTL on a workstation's address record and changing the workstation's address simultaneously may do you little or no good; the address record may have been cached minutes before you made the change, and may linger until the old TTL times out. And be sure to factor in the time it'll take your slaves to load from your primary. For example, if your minimum TTL is 12 hours, and your refresh interval is 3 hours, be sure to lower the TTLs at least 15 hours ahead of time, so that by the time you move the host, all the long TTL records will have timed out. Of course, if all of your slaves are BIND 8 servers that use NOTIFY, the slaves shouldn't take the full refresh interval to synch up.
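The lead-time arithmetic is easy to get wrong under pressure, so here is a rough Python sketch of it. It assumes the worst case described above (a slave picks up the change a full refresh interval late, and a remote server caches the old record just before that). The function name and the move date are hypothetical.

```python
from datetime import datetime, timedelta


def latest_ttl_change(move_time: datetime, old_ttl: int, refresh: int) -> datetime:
    """Latest moment to lower the TTL so every long-TTL copy has expired by move_time.

    old_ttl and refresh are in seconds. Worst case: a slave waits the whole
    refresh interval, then hands out the old TTL one last time.
    """
    return move_time - timedelta(seconds=old_ttl + refresh)


# Hypothetical move date; 12-hour minimum TTL and 3-hour refresh from the example.
move = datetime(2024, 6, 15, 18, 0)
print(latest_ttl_change(move, 12 * 3600, 3 * 3600))
```

With these numbers the TTL has to be lowered 15 hours before the move. If your slaves use NOTIFY, the refresh term is pessimistic, but planning for it costs nothing.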

8.4.1 Changing Other SOA Values

We briefly mentioned increasing the refresh interval as a way of offloading your primary name server. Let's discuss refresh in a little more detail and go over the remaining SOA values, too.

The refresh value, you'll remember, controls how often a slave checks whether its data is up-to-date. The retry value then becomes the refresh time after the first failure to reach a master name server. The expire value determines how long data can be held before it's discarded, when a master is unreachable. Finally, the minimum TTL sets how long domain information may be cached.

Suppose we've decided we want the slaves to pick up new information every hour instead of every three hours. We change the refresh value to 3600 in each of the db files (or with the -o option to h2n). Since the retry is related to refresh, we should probably reduce retry, too - to every 15 minutes or so. Typically, the retry is less than the refresh, but that's not required.[10] Although lowering the refresh value will speed up the distribution of data, it will also increase the load on the master server, since the slaves will check more often. The added load isn't much, though; each slave makes a single SOA query during each zone's refresh interval to check its master's copy of the zone. So with two slave name servers, changing the refresh time from three hours to one hour will only generate four more queries (per zone) to the primary in any three-hour span.

[10] Actually, BIND 8 servers will warn you if refresh is set to less than ten times the retry interval.
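To check the query count in the two-slave example, here is a small Python sketch. It assumes each slave makes exactly one SOA query per refresh interval and ignores retries and NOTIFY; the function name is invented for illustration.

```python
def soa_queries(slaves: int, refresh: int, span: int) -> int:
    """SOA refresh queries the primary sees for one zone during span seconds.

    Simplifying assumption: every slave checks exactly once per refresh interval.
    """
    return slaves * (span // refresh)


SPAN = 3 * 3600  # a three-hour window

before = soa_queries(2, 3 * 3600, SPAN)  # refresh every three hours
after = soa_queries(2, 3600, SPAN)       # refresh every hour
print(after - before)  # additional queries per zone in the window
```

Under these assumptions the primary goes from two refresh queries to six in each three-hour window, which is the four extra queries mentioned in the text.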

If all of your slaves run BIND 8 and you use NOTIFY, of course, refresh doesn't mean as much. But if you have even one BIND 4 slave, your zone data will take up to the refresh interval to reach it.

Some older versions of BIND slaves stopped answering queries during a zone load. As a result, BIND was modified to spread out the zone loads, reducing the periods of unavailability. So, even if you set a low refresh interval, your slaves may not check exactly as often as you request. BIND name servers attempt a certain number of zone loads and then wait 15 minutes before trying another batch. On the other hand, BIND 4.9 and later may also refresh more often than the refresh interval. These newer BINDs will wait a random number of seconds between one-half the refresh interval and the full refresh interval to check serial numbers.
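The randomized refresh can be pictured with a short Python sketch. It only illustrates the range described above and is not taken from the BIND source; the function name is hypothetical and a uniform distribution is assumed.

```python
import random


def next_refresh_delay(refresh: int) -> int:
    """Pick a delay in seconds between half the refresh interval and the full interval.

    Assumption: the delay is drawn uniformly; the text only gives the range.
    """
    return random.randint(refresh // 2, refresh)


# With a three-hour refresh, each check lands 90 to 180 minutes after the last.
for _ in range(3):
    print(next_refresh_delay(3 * 3600))
```

The practical consequence is that a slave may check its master up to twice as often as the refresh value alone suggests.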

Expiration times on the order of a week are common - longer if you frequently have problems reaching your updating source. The expiration time should always be much larger than the retry and refresh intervals; if the expire time is smaller than the refresh interval, your slaves will expire their data before trying to load new data. BIND 8 will complain if you set an expire time less than refresh plus retry, less than twice retry, less than seven days, or greater than six months. Choosing an expire time that meets all of BIND 8's criteria is a good idea in most situations.
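If you'd like to sanity-check SOA values before loading a zone, the four criteria above are simple to encode. The Python sketch below is our own illustration: the function name and the message strings are hypothetical (they are not BIND's log messages), and "six months" is approximated as 183 days.

```python
from typing import List

DAY = 86400
SEVEN_DAYS = 7 * DAY
SIX_MONTHS = 183 * DAY  # assumption: six months taken as 183 days


def expire_complaints(refresh: int, retry: int, expire: int) -> List[str]:
    """Return the criteria from the text that the given SOA values violate (all in seconds)."""
    complaints = []
    if expire < refresh + retry:
        complaints.append("expire is less than refresh plus retry")
    if expire < 2 * retry:
        complaints.append("expire is less than twice retry")
    if expire < SEVEN_DAYS:
        complaints.append("expire is less than seven days")
    if expire > SIX_MONTHS:
        complaints.append("expire is greater than six months")
    return complaints


# Three-hour refresh, one-hour retry, one-week expire: nothing to complain about.
print(expire_complaints(3 * 3600, 3600, SEVEN_DAYS))
# A one-hour expire trips the first three checks.
print(expire_complaints(3 * 3600, 3600, 3600))
```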

If your data don't change much, you might consider raising the minimum TTL. The SOA's minimum TTL value is typically one day (86400 seconds), but you can make it longer. One week is about the longest value that makes sense for a TTL. Longer than that and you may find yourself unable to change bad, cached data in a reasonable amount of time.

