How NTP Works

Last update: 22-Feb-2011 17:55 UTC


Introduction and Overview

NTP time synchronization services are widely available in the public Internet. The public NTP subnet in early 2011 includes several thousand servers in most countries and on every continent of the globe, including Antarctica, and sometimes in space and on the sea floor. These servers support a total population estimated at over 25 million computers in the global Internet.

The NTP subnet operates as a hierarchy of levels, where each level is assigned a number called the stratum. Stratum 1 (primary) servers at the lowest level are directly synchronized to national time services via satellite, radio or telephone modem. Stratum 2 (secondary) servers at the next higher level are synchronized to stratum 1 servers, and so on. Normally, NTP clients and servers with a relatively small number of clients do not synchronize to public primary servers; several hundred public secondary servers operate at higher strata, and these are the preferred choice.

This page presents an overview of the NTP daemon included in this software distribution. We refer to this daemon as the reference implementation only because it was used to test and validate the NTPv4 specification RFC-5905. It is best read in conjunction with the briefings on the Network Time Synchronization Research Project page.


Figure 1. NTP Daemon Processes and Algorithms

The overall organization of the NTP daemon is shown in Figure 1. It is useful in this context to consider the daemon as both a client of upstream servers and as a server for downstream clients. It includes a pair of peer/poll processes for each reference clock or remote server used as a synchronization source. Packets are exchanged between the client and server using the on-wire protocol described in the white paper Analysis and Simulation of the NTP On-Wire Protocols. The protocol is resistant to lost, replayed or spoofed packets.

The poll process sends NTP packets at intervals ranging from 8 s to 36 hr. The peer process receives NTP packets and performs the packet sanity tests of the flash status word. Packets that fail one or more of these tests are summarily discarded. Otherwise, the peer process runs the on-wire protocol that uses four raw timestamps: the origin timestamp T1 upon departure of the client request, the receive timestamp T2 upon arrival at the server, the transmit timestamp T3 upon departure of the server reply, and the destination timestamp T4 upon arrival at the client. These timestamps, which are recorded by the rawstats option of the filegen command, are used to calculate the clock offset and roundtrip delay samples:

offset = [(T2 - T1) + (T3 - T4)] / 2
delay = (T4 - T1) - (T3 - T2).
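
As a concrete illustration, consider a minimal C sketch of this calculation; the function and variable names are illustrative, not those of the reference implementation, and doubles stand in for the 64-bit NTP timestamp format:

    #include <stdio.h>

    /* Compute the clock offset and roundtrip delay samples from the four
       raw on-wire timestamps, all expressed in seconds. */
    static void
    on_wire_samples(double t1, double t2, double t3, double t4,
                    double *offset, double *delay)
    {
        *offset = ((t2 - t1) + (t3 - t4)) / 2;
        *delay = (t4 - t1) - (t3 - t2);
    }

    int
    main(void)
    {
        double offset, delay;

        /* Hypothetical timestamps: request sent at 10.000 s, received by
           the server at 10.052 s, reply sent at 10.053 s and received by
           the client at 10.105 s. */
        on_wire_samples(10.000, 10.052, 10.053, 10.105, &offset, &delay);
        printf("offset %.3f s, delay %.3f s\n", offset, delay);
        return 0;
    }

Note that the offset calculation assumes the outbound and return paths have similar delays; an asymmetric path contributes an error of up to one-half the difference in the two delays.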

The algorithm described on the Clock Filter Algorithm page selects the offset and delay samples most likely to produce accurate results. Those servers that have passed the sanity tests are declared selectable. From the selectable population, the statistics are used by the algorithm described on the Clock Select Algorithm page to determine a number of truechimers according to correctness principles. From the truechimer population, the algorithm described on the Clock Cluster Algorithm page determines a number of survivors on the basis of statistical clustering principles. The algorithms described on the Mitigation Rules and the prefer Keyword page combine the survivor offsets, designate one of them as the system peer and produce the final offset used by the algorithm described on the Clock Discipline Algorithm page to adjust the system clock time and frequency. The clock offset and frequency are recorded by the loopstats option of the filegen command. For additional details about these algorithms, see the Architecture Briefing on the Network Time Synchronization Research Project page.

Statistics Budget

Each source is characterized by the offset and delay samples measured by the on-wire protocol, together with the dispersion and jitter calculated by the clock filter algorithm. In a window of eight samples, this algorithm selects the offset sample with the lowest delay, which generally represents the most accurate data. The selected samples become the peer offset and peer delay. The peer dispersion is determined as a weighted average of the dispersion samples in the window; it continues to grow at the same rate as the sample dispersion, 15 μs/s. Finally, the peer jitter is determined as the root-mean-square (RMS) average of the offset samples in the window relative to the selected offset sample. The peer offset, peer delay, peer dispersion and peer jitter are recorded by the peerstats option of the filegen command. Peer variables are displayed by the rv command of the ntpq program.
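
A minimal C sketch of the selection and jitter calculations follows, assuming the eight-stage window described above; the structure and function names are illustrative, and the RMS divisor follows the n-1 convention used in RFC 5905:

    #include <math.h>
    #include <stdio.h>

    #define NSTAGE 8                /* clock filter window size */

    struct sample {
        double offset;              /* offset sample, s */
        double delay;               /* delay sample, s */
    };

    /* Select the sample with the lowest delay; compute the peer jitter as
       the RMS of the offsets in the window relative to the selected one. */
    static void
    clock_filter(const struct sample win[NSTAGE],
                 double *p_offset, double *p_delay, double *p_jitter)
    {
        double sum = 0;
        int i, best = 0;

        for (i = 1; i < NSTAGE; i++)
            if (win[i].delay < win[best].delay)
                best = i;
        *p_offset = win[best].offset;
        *p_delay = win[best].delay;
        for (i = 0; i < NSTAGE; i++)
            sum += pow(win[i].offset - win[best].offset, 2);
        *p_jitter = sqrt(sum / (NSTAGE - 1));
    }

    int
    main(void)
    {
        /* Hypothetical window of eight offset/delay pairs. */
        const struct sample win[NSTAGE] = {
            { 0.012, 0.105 }, { 0.010, 0.098 }, { 0.015, 0.120 },
            { 0.009, 0.092 }, { 0.011, 0.101 }, { 0.014, 0.131 },
            { 0.010, 0.097 }, { 0.013, 0.110 }
        };
        double off, dly, jit;

        clock_filter(win, &off, &dly, &jit);
        printf("peer offset %.3f s, delay %.3f s, jitter %.4f s\n",
               off, dly, jit);
        return 0;
    }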

The clock filter algorithm continues to process packets in this way until the source is no longer reachable. Reachability is determined by an eight-bit shift register, which is shifted left by one bit as each poll packet is sent, with 0 replacing the vacated rightmost bit. Each time an update is received, the rightmost bit is set to 1. The source is considered reachable if any bit is set to 1 in the register; otherwise, it is considered unreachable.
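
The register logic is compact; a hypothetical C model (the names are illustrative) might look like this:

    #include <stdio.h>

    static unsigned char reach;     /* 8-bit reachability register */

    static void on_poll(void)   { reach <<= 1; }  /* 0 enters on the right */
    static void on_update(void) { reach |= 1; }   /* set rightmost bit */
    static int  reachable(void) { return reach != 0; }

    int
    main(void)
    {
        int i;

        on_poll();
        on_update();                 /* one successful exchange */
        for (i = 0; i < 8; i++)      /* then eight unanswered polls */
            on_poll();
        printf("reachable: %s\n", reachable() ? "yes" : "no");  /* "no" */
        return 0;
    }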

When a source becomes unreachable, a dummy sample with "infinite" dispersion is inserted in the shift register at each poll, thus displacing old samples. A server is considered selectable only if it is reachable and a timing loop would not be created. A timing loop occurs when the server is apparently synchronized to the client, or when the server is synchronized to the same server as the client.
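
A minimal sketch of the loop test, assuming the usual NTPv4 convention that a server's packet carries a reference ID naming its own synchronization source; the function and parameter names are illustrative:

    #include <stdint.h>

    /* Reject a server that would create a timing loop: it is either
       synchronized to us, or synchronized to the same source we are. */
    int
    timing_loop(uint32_t server_refid,  /* server's synchronization source */
                uint32_t our_addr,      /* our own address */
                uint32_t sys_refid)     /* our current synchronization source */
    {
        return server_refid == our_addr || server_refid == sys_refid;
    }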

The composition of the survivor population and the system peer selection are redetermined as each update from each source is received. The system variables are copied from the system peer variables of the same name, and the system stratum is set to one greater than the system peer stratum. System variables are displayed by the rv command of the ntpq program.

Like the peer dispersion, the system dispersion increases at the same rate, so even if all sources have become unreachable, the daemon appears to dependent clients with ever-increasing dispersion. It is important to understand that a server in this condition remains a reliable source of synchronization within its error bounds, as described in the next section.
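
In other words, the dispersion reported to clients is a simple linear function of the time since the last update; a sketch, assuming the 15 μs/s rate given above (names are illustrative):

    #define PHI 15e-6    /* dispersion growth rate, 15 us/s */

    /* Dispersion as seen by clients, t seconds after the last update. */
    double
    dispersion_now(double disp_at_update, double t)
    {
        return disp_at_update + PHI * t;
    }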

Quality of Service

The mitigation algorithms deliver several important statistics, including the system offset and system jitter, which are determined from the survivor statistics produced by the clock cluster algorithm. System offset is best interpreted as the maximum-likelihood estimate of the system clock offset, while system jitter is best interpreted as the expected error of this estimate. These statistics are reported by the loopstats option of the filegen command.

Of interest in this discussion is how the daemon determines the quality of service provided by a particular reference clock or remote server. This is determined from two statistics: expected error and maximum error. Expected error, or system jitter, is determined from various jitter components; it represents the nominal error in determining the mean clock offset.
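
The jitter components combine as a root-mean-square sum. Here is a sketch, assuming the two components used in RFC 5905 (the selection jitter from the cluster algorithm and the system peer jitter); the names are illustrative:

    #include <math.h>

    /* Expected error (system jitter) as the RMS sum of jitter components. */
    double
    system_jitter(double select_jitter, double peer_jitter)
    {
        return sqrt(select_jitter * select_jitter +
                    peer_jitter * peer_jitter);
    }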

Maximum error is determined from delay and dispersion contributions and represents the worst-case error due to all causes. To simplify the discussion, certain minor contributions to the maximum error statistic are ignored. Elsewhere in the documentation the maximum error is called the synchronization distance. If precision time kernel support is available, both the estimated error and maximum error are reported to user programs via the ntp_gettime() kernel system call. See the Kernel Model for Precision Timekeeping page for further information.
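
On systems with precision time kernel support (for example, Linux with glibc), these two statistics can be read directly; the ntp_gettime() interface reports them in microseconds:

    #include <stdio.h>
    #include <sys/timex.h>

    int
    main(void)
    {
        struct ntptimeval ntv;

        /* Returns the kernel clock state, or -1 on error. */
        if (ntp_gettime(&ntv) < 0) {
            perror("ntp_gettime");
            return 1;
        }
        printf("estimated error %ld us, maximum error %ld us\n",
               ntv.esterror, ntv.maxerror);
        return 0;
    }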

The maximum error is computed as one-half the root delay to the primary source of time (i.e., the primary reference clock), plus the root dispersion. The root variables are included in the NTP packet header received from each server. When calculating the maximum error, the root delay is the sum of the root delay in the packet and the peer delay, while the root dispersion is the sum of the root dispersion in the packet and the peer dispersion.

A source is considered selectable only if its maximum error is less than the select threshold, which is 1.5 s by default but can be changed according to client preference using the maxdist option of the tos command. A common consequence occurs when an upstream server loses all sources and its maximum error, as apparent to dependent clients, begins to increase. The clients are not aware of this condition and continue to accept synchronization as long as the maximum error is less than the select threshold.
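
Putting the last two paragraphs together, here is a minimal sketch of the maximum-error calculation and the select-threshold test; all values are in seconds and the names are illustrative:

    /* Maximum error (synchronization distance): one-half the root delay
       plus the root dispersion, each accumulated toward the primary
       reference clock. */
    double
    max_error(double pkt_rootdelay, double peer_delay,
              double pkt_rootdisp, double peer_disp)
    {
        double root_delay = pkt_rootdelay + peer_delay;
        double root_disp = pkt_rootdisp + peer_disp;

        return root_delay / 2 + root_disp;
    }

    /* Select-threshold test; maxdist defaults to 1.5 s (tos maxdist). */
    int
    selectable(double maxerror, double maxdist)
    {
        return maxerror < maxdist;
    }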

Although it might seem counterintuitive, a cardinal rule in the selection process is that, once a sample has been selected by the clock filter algorithm, older samples are no longer selectable. The same applies to the clock select algorithm: once the peer variables for a source have been selected, older variables of the same or other sources are no longer selectable. The reason for these rules is to limit the time delay in the clock discipline algorithm, which is necessary to preserve the optimum impulse response and thus the risetime and overshoot.

This means that not every sample can be used to update the peer variables and up to seven samples can be ignored between selected samples. This fact has been carefully considered in the discipline algorithm design with due consideration for feedback loop delay and minimum sampling rate. In engineering terms, even if only one sample in eight survives, the resulting sample rate is twice the Nyquist rate at any time constant and poll interval.