Synchronizing Internet Clock frequency protocol (sic)

Universidad de Buenos Aires - CONICET

Av. Paseo Colon 850 Buenos Aires C1063ACV Argentina +54 11 5285-0716 ihameli@cnet.fi.uba.ar http://cnet.fi.uba.ar/ignacio.alvarez-hamelin/

Universidad de Buenos Aires

Av. Paseo Colon 850 Buenos Aires C1063ACV Argentina +54 11 5285-0716 dsamanie@fi.uba.ar

Universidad de Buenos Aires

Av. Paseo Colon 850 Buenos Aires C1063ACV Argentina +54 11 5285-0716 ortegaalfredo@gmail.com

Deutsche Telekom

Heinrich-Hertz-Str. 3-7 Darmstadt 64297 Germany +49 6151 5812747 Ruediger.Geib@telekom.de

Area: General
Workgroup: TICTOC
Keywords: sic, frequency, clock synchronisation, MitM

Synchronizing Internet Clock Frequency (sic frequency) specifies a new secure method to synchronize difference clocks on the Internet, assuring smoothness (i.e., frequency stability) and robustness to man-in-the-middle (MitM) attacks. In 90% of all cases, sic frequency achieves a Maximum Time Interval Error (MTIE) of less than 25 microseconds over one minute. It is based on a regular packet exchange and works with commodity terminal hardware.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in .

There are different types of clock synchronization on the Internet. NTP remains one of the most popular because it needs no extra hardware and is practically standard in most operating system distributions. Its working principle relies on time servers with a precise clock source, such as an atomic clock or a GPS-based clock. For most needs, NTP provides accurate synchronization. Moreover, NTP has recently incorporated strategies to resist man-in-the-middle (MitM) attacks. NTP's potential accuracy is in the order of tens of milliseconds.

Synchronizing Internet Clock frequency (sic frequency) is a protocol providing synchronized difference clocks in two endpoints connected to the Internet. While synchronized absolute clocks aim at measuring the exact time difference between them, synchronized difference clocks allow measurements during identical time intervals at two locations. This is useful if loads, packet loss, or delay variation are to be measured. The sic frequency design is close to TSClocks (see below), but it takes advantage of statistics to perform better. sic frequency synchronization relies on Internet-based delay measurements. Route changes are frequent, so sic frequency includes their detection. Finally, our implementation also protects against MitM attacks by signing the measurements carried in each packet. sic frequency neither puts constraints on the quality of a server's clock nor limits the distance between synchronized end systems.

Another proposal is TSClocks, which takes advantage of the computer's internal clock. It is an attractive solution because it is inexpensive and can be used on any computer connected to the Internet. It was first proposed for LANs (Local Area Networks) and later extended to other situations. Its authors report a difference clock error of about fifty microseconds for a WAN connection with an RTT (Round Trip Time) of 40 ms.

When accuracy and stability are needed, further options arise, e.g., PTP (also defined as IEEE Std. 1588-2008). PTP, however, relies on specialized hardware to provide a highly accurate clock, and that hardware is required at each point to be synchronised. GPS (Global Positioning System) also requires specialized hardware at every measurement point. While GPS may be less expensive than PTP, a GPS unit needs a clear view of the sky to work, which may be costly or impossible in some locations.

Finally, we mention a methodology to measure delays in networks that is based on filtering: it selects some packets to perform the delay computation. The packet selection is based on the minimum and average RTT, and we show that both are statistically problematic to determine.

Synchronizing Internet Clock frequency (sic frequency) is a protocol providing synchronized difference clocks in two endpoints connected to the Internet. Synchronized difference clocks allow measurements during identical time intervals at two locations. This is useful if loads, packet loss or a variation in delay is to be measured. The model of typical Internet time-measurement is shown in .

In this model, sic frequency performs measurements with packets in the way shown in .

[Figure: timing diagram of one sic measurement exchange. The client sends the packet at t1 and receives the reply at t4 on its clock C_c [s]; the server receives it at t2 and replies at t3 on its clock C_s [s].]

Here, C_s is the server clock, C_c is the client clock, and t1...t4 are timestamps. The figure shows a horizontal time line for client and server. The diagonal lines depict a packet traversing some physical space (wires, routers, and switches). The packet travel times are not assumed to be identical, because routes and background load may differ in each direction. The difference between the client clock C_c and the server clock C_s can be modeled as:

    phi(t) = C_c(t) - C_s(t) ,     (1)

where phi is the absolute clock difference. If RTT is constant (i.e. little or no background load) and routes are symmetric in both directions, the difference between clocks can be computed as:

    phi[c->s] = t1 - ( t2 - RTT/2 ) ,     (2)

    phi[c<-s] = t4 - ( t3 + RTT/2 ) ,     (3)

and phi[c->s] = phi[c<-s]. The general equation for the RTT is:

    RTT = ( t4 - t1 ) - ( t3 - t2 ) .     (4)

Computing Equations 2 and 3 for this simplified case allows phi to be calculated as a function of RTT. Note that if routes are not symmetrical, it is impossible to determine the absolute difference between the clocks.

The sic frequency protocol is based on statistics and on observations of background traffic and network behavior. The RTT between two endpoints follows a heavy-tailed distribution, and an alpha-stable distribution is one possible model. This distribution is characterized by four parameters: the location “delta”, the stretching (scale) “gamma”, the tail “alpha”, and the symmetry “beta”. The location parameter is closely related to the mode of the distribution: delta > 0. The stretching is related to the dispersion: gamma > 0. The symmetry, -1 <= beta <= 1, indicates whether the distribution is skewed to the right (the tail decays to the left) for positive values, or in the opposite direction for negative ones. Finally, the tail alpha, defined in (0,2], indicates whether the distribution is Gaussian (alpha = 2), a power law without variance (alpha < 2), or a power law that also lacks a statistical mean (alpha <= 1). Alpha-stable distributions are the limit laws of the generalized Central Limit Theorem, which also covers distributions without variance or mean.

The phi(t) estimation then involves the subtraction of two alpha-stable random variables, which yields another alpha-stable distribution, but a symmetrical one. Because this result has a fixed mode and is symmetric, the median is a good estimator of the mode. Therefore, sic performs periodic measurements to infer the difference between two clocks on the Internet, taking advantage of these empirical observations. RTT measurements are taken every 1 second. The parameters of the simple skew model are estimated by the following equation:

    phi(t) = K + F * t ,     (5)

where phi(t) = C_c - C_s, K is a constant representing the absolute time difference between the client clock C_c and the server clock C_s, and F is the rate parameter. As sic frequency is a difference clock, we only estimate the frequency parameter “F”. Note that the “K” parameter cannot be estimated from endpoint measurements alone. Estimating “K” accurately is out of scope, and we use K = min(RTT)/2, as is done in several synchronization protocols under the assumption of symmetric paths. Considering the following asymmetry definition,

    A = 1 - t[c->s] / t[c<-s] ,     (6)

where t[c->s] is the minimum delay measured from the client to the server, and t[c<-s] is the minimum delay in the opposite direction. The maximum asymmetry of Equation 6 is A=1, which is unlikely, and it establishes min(RTT) as the hard bound for the error of K: if t[c<-s] approaches RTT, t[c->s] approaches zero. The difference between the two is phi(t), and this difference is hence close to min(RTT) if A=1. In our experiments, the error in the estimation of phi(t) was always less than min(RTT)/2.

Another problem with most synchronization protocols is the estimation of the minimum RTT, which depends on the time window within which the RTT is captured. A minimum RTT can only be measured in the absence of any cross traffic. In a first step, the minimum RTT measured during a window of 10 minutes (mRTT10m) is captured. Based on these values, the minimum RTT over a week (mRTTw) is determined. The RTT estimation error, RTTee, is defined as mRTT10m - mRTTw. The corresponding figure shows the RTT estimation error captured during an experiment where the minimum latency between probes was 9431 microseconds during one week, i.e., mRTTw = 9431 microseconds. Notice that mRTT10m varies a lot, and the observed values can be more than 450 microseconds above the minimum RTT over a week. This error is a consequence of the statistical behavior of the RTT, which can be modeled by the alpha-stable distribution.

Finally, it is commonly believed that, because of the NTP deployment, there is always an NTP server less than five hops away with an RTT of a few milliseconds. In the appendix we show a typical case in the Latin America region where the RTT differs notably between hosts in the same city (Buenos Aires). This example reveals that in some countries this desired situation may not be achievable, and other synchronization tools are needed.
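The window quantities used above can be sketched in C as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, RTTs are assumed to be integer microseconds, and the asymmetry is assumed to read A = 1 - t[c->s] / t[c<-s].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum of n RTT samples in microseconds (e.g. one 10-minute window). */
int64_t min_rtt_us(const int64_t *rtt, size_t n)
{
    assert(rtt != NULL && n > 0);
    int64_t m = rtt[0];
    for (size_t i = 1; i < n; i++)
        if (rtt[i] < m)
            m = rtt[i];
    return m;
}

/* RTT estimation error: RTTee = mRTT10m - mRTTw. */
int64_t rtt_estimation_error_us(int64_t mrtt10m, int64_t mrttw)
{
    return mrtt10m - mrttw;
}

/* Asymmetry, assumed form: A = 1 - t[c->s] / t[c<-s]. */
double path_asymmetry(double t_cs, double t_sc)
{
    assert(t_sc > 0.0);
    return 1.0 - t_cs / t_sc;
}
```

With the figures quoted in the text, a 10-minute minimum of 9881 microseconds against mRTTw = 9431 microseconds gives RTTee = 450 microseconds.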

The sic frequency protocol estimates phi(t) of Equation 5 using measurement statistics and taking advantage of the inherent RTT properties, i.e., the heavy-tailed distribution and its alpha-stable model. The basic sic frequency operation is to periodically send packets, estimate phi(t), and correct the local clock with:

    t_c = t - phi(t) ,     (7)

where t_c is the corrected time and t is the local clock time (notice that phi(t) is calculated according to Equation 1). The sic protocol also detects route changes by looking for a non-negligible difference between the minimum RTT of the current and of the past round-trip measurements. The next section also discusses different mechanisms to detect route changes by RTT evaluation.
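The per-packet quantities that feed this correction can be sketched in C. The function names are hypothetical, and the RTT expression is the usual four-timestamp form with the server processing time removed, which is an assumption here. Timestamps are UNIX microseconds.

```c
#include <stdint.h>

/* RTT without the server processing time (assumed four-timestamp form). */
int64_t sic_rtt_us(int64_t t1, int64_t t2, int64_t t3, int64_t t4)
{
    return (t4 - t1) - (t3 - t2);
}

/* Equation 2: phi[c->s] = t1 - (t2 - RTT/2), in microseconds.
 * The integer difference is taken first so that large UNIX timestamps
 * do not lose precision when converted to double. */
double sic_phi_cs_us(int64_t t1, int64_t t2, int64_t rtt)
{
    return (double)(t1 - t2) + (double)rtt / 2.0;
}
```

As an example, take a client clock running 1000 microseconds ahead of the server, a one-way delay of 5000 microseconds in each direction, and 200 microseconds of server processing: t1 = 1001000, t2 = 1005000, t3 = 1005200, t4 = 1011200. The functions return RTT = 10000 and phi = 1000, matching the offset that was assumed.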

This section presents the sic frequency algorithm. In addition, parameters and their definitions are introduced. Finally, formal packet formats are provided. The sic frequency protocol MUST sign the packets with the deterministic Elliptic Curve Digital Signature Algorithm (ECDSA) to protect sic frequency from MitM attacks. To avoid delays when a packet is signed, sic frequency signs packets in a deferred fashion. That is, each packet carries the signature of the previous packet (see the send_sic_packet and server algorithms).

sic frequency implementations MUST support the formal description specified by this section. Once activated, the sic frequency protocol MUST operate permanently while a client and a receiver exchange measurement packets. sic frequency works with three states: NOSYNC, PRESYNC, and SYNC. These states are triggered by the variables errsync, presync, and synck. Lines 1 to 4 of the client pseudocode initialize the required data structures and set the sic frequency state to NOSYNC. In NOSYNC state, a complete measurement window estimates the phi values by Equation 2 (see line 8). Notice that Equation 3 can also be used, or an average of both equations. In our experiments, using a single equation resulted in estimations with a smaller error. A possible explanation is that the measurements are then affected by the same type of traffic. The median of the measurement window is computed in line 9, while lines 10-12 are used to verify whether there is a path change in the measurements. When an appreciable difference (bounded by errRTT) is detected in line 13, the “else” clause is executed and the system restarts the cycle (see lines 17-22). Notice that line 13 verifies whether the absolute value of the difference between the minimum RTTs is lower than a percentage of the minimum over the complete RTT window.

The sic frequency algorithm specification is presented in three tables of pseudocode. The parameters are explained after the third table.
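The median of line 9 and the route-change test of line 13 can be sketched in C. This is an illustration under assumptions: the function names are hypothetical, WRTT is assumed to hold 2 * P RTTs with the oldest first (so the first half yields RTTf and the second half RTTl), and the "percentage of the minimum" is taken as errRTT times the smaller of the two minima.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples. Sorts a copy so the caller's window keeps its
 * arrival order. Returns 0 on success, -1 if memory is unavailable. */
int window_median(const double *w, size_t n, double *out)
{
    assert(w != NULL && out != NULL && n > 0);
    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, w, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    *out = (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
    free(tmp);
    return 0;
}

static double min_of(const double *w, size_t n)
{
    double m = w[0];
    for (size_t i = 1; i < n; i++)
        if (w[i] < m)
            m = w[i];
    return m;
}

/* True when the minimum RTT of the last p samples (RTTl) differs from
 * the minimum of the previous p samples (RTTf) by more than
 * err_rtt times the overall minimum. */
bool route_changed(const double *wrtt, size_t p, double err_rtt)
{
    assert(wrtt != NULL && p > 0);
    double rtt_f = min_of(wrtt, p);
    double rtt_l = min_of(wrtt + p, p);
    double rtt_min = rtt_l < rtt_f ? rtt_l : rtt_f;
    double diff = rtt_l > rtt_f ? rtt_l - rtt_f : rtt_f - rtt_l;
    return diff > err_rtt * rtt_min;
}
```

The median is insensitive to a single heavy-tail outlier: for the window {5, 1, 3, 100, 2} it is 3, whereas the mean would be 22.2.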

    14 | | | if (... >= presynck + P)) then
    15 | | | | presynck <- true
    16 | | | end if
    17 | | else
    18 | | | synck <- false, Wmedian <- 0
    19 | | | Wm <- 0, errsync <- epoch, n_to <- 0
    20 | | | epoch_sync <- INT_MAX - P, pre_sync <- INT_MAX - P
    21 | | | set(0, 0, NOSYNC)
    22 | | end if
    23 | | if ((synck == true) && (epoch >= epochsync + P)) then
    24 | | | (m, c) <- linear_fit(Wmedian)
    25 | | | actual_c <- c
    26 | | | actual_m <- (1-alpha) * m + alpha * actual_m
    27 | | | epochsync <- epoch, n_to <- 0
    28 | | | set(actual_m, actual_c, SYNC)
    29 | | else
    30 | | | if (epoch == errsync + MEDIAN_MAX_SIZE) then
    31 | | | | presync <- epoch
    32 | | | end if
    33 | | | if (epoch >= presync + P) then
    34 | | | | (actual_m, actual_c) <- linear_fit(Wmedian)
    35 | | | | synck <- true , epoch_sync <- epoch
    36 | | | | set(actual_m, actual_c, PRESYNC)
    37 | | | end if
    38 | | end if
    39 | else
    40 | | to <- false
    41 | end if
    42 end for
    =======================================================================

Several conditions must hold to pass from NOSYNC to PRESYNC. First, the “else” branch of line 29 must be taken, and the time elapsed between errsync and the current epoch must equal MEDIAN_MAX_SIZE (lines 30-32). Then, once a further period P has elapsed since presync, the condition in line 33 is true and the system reaches PRESYNC, providing an initial estimation of phi. If there is no route change, the condition in line 14 becomes true after another period P. The system is then in SYNC state, and it provides the estimation of phi(t) in line 28. Notice that the estimation of phi(t) is recomputed every period P unless a route change occurs (lines 13 and 17-22).

The function in line 6, (epoch, t1, t2, t3, t4, to) <- send_sic_packet(SERVER_IP, TIMEOUT), receives special treatment. It sends the signed packets described in the packet format. To avoid the processing delay caused by the signature computation, each packet carries the signature of the previous packet; if an error is detected, the synchronization is stopped just one loop later. The client side MUST implement the function send_sic_packet(SERVER_IP, TIMEOUT) as specified by its pseudocode. This function takes the timestamp t1 in line 1, and builds and sends the UDP packet in lines 2-3. Then, if there is no timeout, it takes the t4 timestamp (line 5) and, if no packets were lost, verifies the signature of the previous packet in lines 8-18. If the signature is not valid under the received certificate, the system MUST change to NOSYNC state immediately (see line 11). NOSYNC state MUST also be set if the limit of time without receiving packets, MAX_to, is reached. Finally, the function stores the received packet in prev_rcv_pck (a global variable) for use with the next packet (line 19). Notice that n_to, the number of lost packets, is a global variable, as is the epoch of the previous packet, e_prev.
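Lines 24-26 of the client pseudocode, the linear fit over Wmedian followed by the autoregressive update of the slope, can be sketched in C. The pseudocode does not state which abscissa linear_fit uses, so this sketch assumes the sample index (one median per RUNNING_TIME) and an ordinary least-squares fit. smooth_slope is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary least-squares fit y = m * x + c over the points (i, y[i]). */
void linear_fit(const double *y, size_t n, double *m, double *c)
{
    assert(y != NULL && m != NULL && c != NULL && n >= 2);
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        double x = (double)i;
        sx += x;
        sy += y[i];
        sxx += x * x;
        sxy += x * y[i];
    }
    /* The denominator is nonzero for n >= 2 because the x values differ. */
    double denom = (double)n * sxx - sx * sx;
    *m = ((double)n * sxy - sx * sy) / denom;
    *c = (sy - *m * sx) / (double)n;
}

/* Line 26: actual_m <- (1 - alpha) * m + alpha * actual_m. */
double smooth_slope(double actual_m, double m, double alpha)
{
    return (1.0 - alpha) * m + alpha * actual_m;
}
```

Written this way, with the RECOMMENDED alpha = 0.05 the new slope m carries a weight of 0.95 and the previous actual_m a weight of 0.05.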

The server sic algorithm uses prev_sic_P{}, a structure that stores the previously received signatures indexed by client address (CLIENT_add contains the client's IP address and UDP port), and prev_sig{}, which does the same for the previously sent signatures. Line 6 checks that the signature is either null, because this is the first packet, or valid. In both cases, the algorithm processes the packet in lines 7-11: it computes t3, builds the sic frequency packet, sends it, and computes its signature (stored to be sent in the next reply). Next, the current packet is stored in the prev_sic_P{} structure (line 13).

We provide a formal definition of each constant and variable used; the RECOMMENDED values are given in parentheses at the end of each description. These constants and variables MUST be represented in a sic frequency implementation. All the types MUST be respected. They are expressed in the C programming language running on a 64-bit processor.

Constants used for the sic frequency algorithm:

   RUNNING_TIME: the period between sic packets (1 second).
   MEDIAN_MAX_SIZE: the window size used to compute the median of the measurements (600).
   P: the period between estimations of phi (60).
   alpha: a float in [0,1], the coefficient of the autoregressive estimation of the slope of phi(t) (0.05).
   TIMEOUT: the maximum time in seconds to wait for a sic packet reply (0.8 seconds).
   SERVER_IP: the IP address of the server (IP version 4 or 6).
   errRTT: a float that bounds the maximum difference used to detect a route change (0.2).
   MAX_to: an integer representing the maximum number of lost packets (P/10).
   CERT: the public certificate of the other end; it is used to verify the signatures of the packets.
   UDP_PORT: an integer with the UDP port where the service is running on the server (4444).
   CLIENT_IP: the IP address of the client.

States used for the sic frequency algorithm:

   NOSYNC: a boolean indicating that it is not possible to correct the local time.
   PRESYNC: an integer indicating that sic is about (P * RUNNING_TIME) seconds away from synchronization.
   SYNC: a boolean indicating that sic is synchronized.

Variables used for the sic frequency algorithms:

   errsync: an integer with the UNIX timestamp epoch of the initial NOSYNC cycle. It is used to complete the window of measurements (Wm) to compute their medians.
   presync: an integer with the UNIX timestamp epoch of the initial PRESYNC cycle. It is used to wait (P * RUNNING_TIME) seconds before the linear fit of phi(t).
   synck: an integer with the UNIX timestamp epoch of the initial SYNC cycle. Every (P * RUNNING_TIME) seconds the phi(t) function is estimated.
   epochsync: an integer with the last UNIX timestamp epoch of synchronization. It is used to compute a new estimation of phi(t) every (P * RUNNING_TIME) seconds.
   epoch: an integer with the UNIX timestamp in seconds. It carries the initial epoch of each sic measurement packet.
   t1, t2, t3, t4: long long integers that store the UNIX timestamps in microseconds.
   actual_m: a double with the slope of the phi(t) estimation.
   actual_c: a double with the intercept of the phi(t) estimation.
   Wm: an array of doubles of size MEDIAN_MAX_SIZE. It stores the instantaneous estimates of phi(t).
   Wmedian: an array of doubles of size P. It stores the medians of Wm computed every RUNNING_TIME.
   WRTT: an array of doubles of size (2 * P). It stores the calculated RTTs of the last measurements.
   RTTl: a double with the minimum of the last P RTTs. It is used to detect changes in the route from the client to the server.
   RTTf: a double with the minimum of the previous P RTTs. It is used to detect changes in the route from the client to the server.
   n_to: an integer representing the number of lost packets in the current synchronization window P.
   e_prev: an integer with the UNIX timestamp epoch of the last valid packet.
   prev_rcv_pck: a sic packet structure holding the previously received packet.
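The definitions above can be gathered into C declarations roughly as follows. This is a sketch, not the published implementation: the SIC_ prefix, the struct layout, and sic_client_reset are hypothetical, and synck is declared as a boolean because the client pseudocode assigns it true and false. The reset mirrors lines 18-21 of the client pseudocode.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define SIC_RUNNING_TIME    1               /* seconds between sic packets */
#define SIC_MEDIAN_MAX_SIZE 600             /* size of the median window Wm */
#define SIC_P               60              /* period between phi estimations */
#define SIC_ALPHA           0.05            /* autoregressive coefficient */
#define SIC_TIMEOUT         0.8             /* seconds to wait for a reply */
#define SIC_ERR_RTT         0.2             /* route-change bound */
#define SIC_MAX_TO          (SIC_P / 10)    /* maximum number of lost packets */
#define SIC_UDP_PORT        4444

enum sic_state { NOSYNC, PRESYNC, SYNC };

struct sic_client {
    enum sic_state state;
    bool   synck;
    int    errsync, presync, epochsync;     /* UNIX epochs in seconds */
    int    n_to;                            /* lost packets in this window */
    double actual_m, actual_c;              /* slope and intercept of phi(t) */
    double wm[SIC_MEDIAN_MAX_SIZE];         /* instantaneous phi estimates */
    double wmedian[SIC_P];                  /* medians of wm */
    double wrtt[2 * SIC_P];                 /* recent RTTs */
};

/* Back to NOSYNC, as in lines 18-21 of the client pseudocode. */
void sic_client_reset(struct sic_client *s, int epoch)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);                /* windows, slope, n_to, synck */
    s->state = NOSYNC;
    s->errsync = epoch;
    s->epochsync = INT_MAX - SIC_P;
    s->presync = INT_MAX - SIC_P;
}
```

Setting presync and epochsync to INT_MAX - P keeps the tests epoch >= presync + P and epoch >= epochsync + P false until the variables are assigned a real epoch, without overflowing the addition.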

sic frequency uses UNIX timestamps in microseconds. Regarding Figure 2, the client takes a timestamp t1 just before it sends the packet. When the server receives the packet, it immediately takes t2, and just before the packet is sent back to the client, it takes t3. When the client receives the packet, it takes t4. The server does not need the timestamp t1 because the proposed protocol synchronizes a client with the server clock. This information could, however, be useful to the server in the future. The packets are shown in the figure that follows. They MUST be sent as UDP data, and each MUST have five fields. The first three correspond to t1 (client), t2 (server), and t3 (server); the last one is the signature of the sender's previous message (client or server), made with its private key. The timestamps t1, t2, and t3 MUST be UNIX timestamps in microseconds, represented as 64-bit long long integers in the C language. The client and server certificates SHOULD be valid and signed (for experimentation only, a user MAY use self-generated ones).

       f1     f2     f3           f4
    +----------------------------------------+
    | t1_c | t2_s | t3_s |    Sig_s n-1      |
    +----------------------------------------+
                Server --> Client
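A possible in-memory and on-the-wire representation of this packet is sketched below in C. Several points are assumptions that the text does not settle: the signature is taken as a fixed 64-byte field, the 64-bit timestamps are written in network (big-endian) byte order, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIC_SIG_LEN 64                      /* assumed fixed signature size */
#define SIC_PKT_LEN (3 * 8 + SIC_SIG_LEN)

struct sic_packet {
    int64_t t1, t2, t3;                     /* UNIX timestamps, microseconds */
    uint8_t sig_prev[SIC_SIG_LEN];          /* signature of previous packet */
};

static void put_be64(uint8_t *out, int64_t v)
{
    uint64_t u = (uint64_t)v;
    for (int i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(u & 0xff);
        u >>= 8;
    }
}

static int64_t get_be64(const uint8_t *in)
{
    uint64_t u = 0;
    for (int i = 0; i < 8; i++)
        u = (u << 8) | in[i];
    return (int64_t)u;
}

/* Serialize the packet into a UDP payload buffer. */
void sic_packet_pack(const struct sic_packet *p, uint8_t out[SIC_PKT_LEN])
{
    assert(p != NULL && out != NULL);
    put_be64(out, p->t1);
    put_be64(out + 8, p->t2);
    put_be64(out + 16, p->t3);
    memcpy(out + 24, p->sig_prev, SIC_SIG_LEN);
}

/* Parse a received UDP payload back into a packet. */
void sic_packet_unpack(const uint8_t in[SIC_PKT_LEN], struct sic_packet *p)
{
    assert(p != NULL && in != NULL);
    p->t1 = get_be64(in);
    p->t2 = get_be64(in + 8);
    p->t3 = get_be64(in + 16);
    memcpy(p->sig_prev, in + 24, SIC_SIG_LEN);
}
```

Serializing field by field avoids depending on the compiler's struct padding or the host byte order, so a client and a server on different architectures would agree on the layout.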

To deploy the sic frequency algorithm, at least one Server and one Client are needed. The Server can support multiple clients. The maximum number of clients is for further study. The Server clock is considered the master clock, and all clients synchronize with it. The Server side runs sic frequency as a server on UDP_PORT, as specified by the server algorithm. The Client runs the client algorithm and also SHOULD provide the corrected time as

    t_c = t - ( actual_m * t + actual_c ) .     (8)

Different ways of doing this task are possible:

- Providing a client capable of reading the variables actual_m and actual_c in shared memory and producing the result of Equation 8.
- Providing a service on a UDP port that answers timestamp queries with the corrected time of Equation 8.
- Another solution.
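Whichever option is chosen, the computation behind it is small. The C sketch below assumes that Equation 8 is the local time minus the fitted line, t_c = t - (actual_m * t + actual_c), and that t is expressed on the same time axis and in the same units that were used for the linear fit. Both function names are hypothetical.

```c
/* Fitted clock difference at time t: phi(t) = actual_m * t + actual_c. */
double sic_phi(double t, double actual_m, double actual_c)
{
    return actual_m * t + actual_c;
}

/* Corrected time, assumed form: t_c = t - phi(t). */
double sic_corrected_time(double t, double actual_m, double actual_c)
{
    return t - sic_phi(t, actual_m, actual_c);
}
```

A reader process attached to shared memory would call sic_corrected_time with the latest actual_m and actual_c; a UDP service would do the same before answering each query.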

In this section we present the proof of concept of sic through tests that we have already performed, together with the current implementation of sic in the C language. Our implementation is publicly available. This protocol implements protection against MitM attacks. The identity of the endpoints is guaranteed by signed certificates using the deterministic Elliptic Curve Digital Signature Algorithm (ECDSA). Server and Client should use signed and valid ECDSA certificates to ensure their identity, and each side is responsible for verifying the public certificate of the other side before running the algorithm.

To verify the sic proposal, we tested it using three hosts with GPS units. The first two were located in Buenos Aires, and the third in Los Angeles. We slightly modified the client algorithm to trigger each measurement using the PPS (pulse per second) signal provided by the GPS unit. Then, by recording the client and server clocks with the PPS signal, we can determine the real phi function of Equation 1 within the GPS error (which is several orders of magnitude smaller than the error of the sic frequency protocol). We use the MTIE (Maximum Time Interval Error), defined as follows:

    MTIE(s) = max | phi(t') - phi(t'') | ,     (9)

for every t' and t'' in the interval [t,t+s]; and we chose s=60 seconds.

We first used two hosts (Raspberry Pi 2) connected back to back to analyze the minimum achievable precision, yielding an MTIE of 15.8 microseconds at the 90th percentile. Then, we selected two real case studies, one national and one international. A figure shows the resulting MTIE, evaluated in 60-second intervals, for the experiments Buenos Aires-Buenos Aires (RTT of 10 ms) and Buenos Aires-Los Angeles (RTT of 198 ms). The 90th percentile corresponds to 18.35 microseconds for the Buenos Aires case and 25.4 microseconds for the Los Angeles case. The 97.5th percentile corresponds to 30 microseconds for the Buenos Aires case and 42 microseconds for the Los Angeles case. The quartiles are reported in a table. These measurements were performed during a week in each case.
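The MTIE evaluation can be sketched in C under the usual reading of the definition: within each interval, the error is the peak-to-peak variation of phi. The sketch assumes phi is sampled at a fixed rate, so an interval of s seconds becomes a window of win consecutive samples. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Peak-to-peak variation of phi over n consecutive samples. */
double interval_error(const double *phi, size_t n)
{
    assert(phi != NULL && n > 0);
    double lo = phi[0], hi = phi[0];
    for (size_t i = 1; i < n; i++) {
        if (phi[i] < lo)
            lo = phi[i];
        if (phi[i] > hi)
            hi = phi[i];
    }
    return hi - lo;
}

/* Largest interval error over every window of win samples in the series. */
double mtie(const double *phi, size_t n, size_t win)
{
    assert(phi != NULL && win >= 1 && win <= n);
    double worst = 0.0;
    for (size_t start = 0; start + win <= n; start++) {
        double e = interval_error(phi + start, win);
        if (e > worst)
            worst = e;
    }
    return worst;
}
```

The percentiles and quartiles quoted in the text would be statistics over the per-interval values returned by interval_error, one value per 60-second interval, rather than over the single maximum returned by mtie.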

We also conducted another test for NTP4 in Buenos Aires, Argentina. We used a host with GPS, whose PPS signal triggered a process to log actual timestamps. This host was also running NTP4 against the server time.afip.gov.ar, also located in Buenos Aires city, with an average RTT of 12 ms. Applying the same process as in the previous cases, we obtained the following quartiles for the MTIE of the NTP4 measurements: Q3: 9.1 ms, Q2: 5.2 ms, and Q1: 3.3 ms (also reported with the quartiles of the previous cases). Finally, the 90th percentile of the NTP4 MTIE is 11.1 ms. The comparison of NTP4 with sic frequency shows that this new method performs two orders of magnitude better.

This document presents the sic algorithm, which synchronizes host clock frequency over the Internet and is resistant to MitM attacks. It also provides the complete specification, the implementation, and the experimental results that support its working principle. In particular, sic frequency provides a clock rate stability of less than 1 ppm most of the time.

Following the enumeration of security threats for time protocols in packet-switched networks, the proposed signing of timing packets, based on a mechanism of secure key distribution, provides the following characteristics:

3.2.1 Packet Manipulation: Prevented by packet signature.
3.2.2 Spoofing: Prevented by packet signature and secure key distribution.
3.2.3 Replay Attack: Prevented by chain signing of packets.
3.2.4 Rogue Master Attack: Prevented by secure key distribution.
3.2.5 Packet Interception and Removal: If several packets are removed, the protocol does not reach the SYNC state.
3.2.6 Packet Delay Manipulation: Not prevented. Future versions may prevent this using over-specification of timing (using redundant masters).
3.2.7 L2/L3 DoS attacks: Not prevented. This can be prevented in future versions using over-specification of timing and redundant master time servers.
3.2.8 Cryptographic performance attacks: Not an issue in ECDSA.
3.2.9 DoS attacks against the time protocol: Prevented by secure key distribution.
3.2.10 Grandmaster Time source attack (GPS attacks): Not prevented. Future versions may prevent this using over-specification of timing (using several time servers).
3.2.11 Exploiting vulnerabilities in the time protocol: Not prevented; future vulnerabilities are unknown.
3.2.12 Network Reconnaissance: Not prevented in this version. No countermeasures for node anonymization were taken.

Packet delay manipulation is one of the hardest problems to solve because there are smart ways to attack any synchronization protocol. Even so, the sic frequency protocol can protect itself because it can identify several attacks of this type, i.e., it is challenging to mimic traffic behavior.

This memo makes no requests of IANA.

The authors thank Ethan Katz-Bassett, Zahaib Akhtar, the USC, and CAIDA for hosting the sic frequency testbed.

Robust synchronization of absolute and difference clocks over networks.

Chance and stability: stable distributions and their applications.

Influence of traffic in stochastic behavior of latency.

Synchronizing Internet Clocks.

Measurement of maximum time interval error for telecommunications clock stability characterization.

Definitions and terminology for synchronization in packet networks (Recommendation ITU-T G.8260).



    
This appendix shows an experiment to measure the RTT and the distance in hops from four different points to a time server in Buenos Aires city (the capital of Argentina). We ran the measurements twice from each of the four points, and we used one hundred packets to determine some statistical parameters. The traceroute measurements show that the number of hops and the RTT differ greatly from point to point and also vary a lot. For instance, the standard deviation, average, and maximum reveal huge variations. The case provided here is in Argentina: reaching an NTP server from 4 different points in Buenos Aires city.