Two human beings verbally compare the Short Authentication String, drawing the human brain directly into the protocol. And this is a Good Thing.
The Diffie-Hellman key exchange by itself does not provide protection against man-in-the-middle (MiTM) attacks. To authenticate the key exchange, ZRTP uses a Short Authentication String (SAS), which is essentially a cryptographic hash of the two Diffie-Hellman values. The SAS value is rendered to both ZRTP endpoints. To carry out authentication, this SAS value is read aloud to the communication partner over the voice connection. If the values on both ends do not match, it indicates the presence of a man-in-middle attack. If they do match, there is a high probability that no man-in-the-middle is present. The use of hash commitment in the DH exchange constrains the attacker to only one guess to generate the correct SAS in his attack, which means the SAS can be quite short. A 16-bit SAS, for example, provides the attacker only one chance out of 65536 of not being detected.
ZRTP provides a second layer of authentication against a MiTM attack, based on a form of key continuity. It does this by caching some hashed shared key material to use in the next call, to be mixed in with the next call’s DH shared secret, giving it key continuity properties analogous to SSH. If the MiTM is not present in the first call, he is locked out of subsequent calls. Thus, even if the SAS is never used, most MiTM attacks are stopped, because they weren’t present in the first call. ZRTP’s key continuity features have some self-healing properties that are better than the SSH approach.
ZRTP provides yet a third layer of protection against a MiTM attack. The IETF plans to add integrity protection to the delivery of SIP information, and that integrity protection will rely on a PKI. When this eventually deploys, ZRTP can take advantage of this. See RFC 6189 on how ZRTP can leverage an integrity-protected SIP layer to provide integrity protection for ZRTP’s Diffie-Hellman exchange in the media layer. This protects against a MiTM attack, without requiring the users to verbally compare the SAS. However, no VoIP clients yet offer a fully implemented SIP stack that provides end-to-end integrity protection for the delivery of SIP information. Thus, real-world implementations of ZRTP endpoints will continue to depend on SAS authentication for quite some time. Even after there is widespread availability of SIP products that offer integrity protection, many users will still be faced with the fact that the signaling path may be controlled by institutions that do not have the best interests of the end user in mind. In those cases, ZRTP’s built-in SAS authentication will remain the gold standard for the prudent user. That, plus the key continuity features.