Johns Hopkins University researchers have observed that when voice is compressed with a variable bit-rate (VBR) codec, the packet lengths vary depending on the types of sounds being compressed. This leaks a lot of information about the content even if the packets are encrypted, regardless of what encryption protocol is used. Silent Phone does not use VBR codecs.
Skype’s VBR codec leaks information regardless of the quality of the encryption. Let me be clear about this leakage of information: it doesn’t leak any cryptographic key material, and it doesn’t help the attacker actually break the crypto. The VBR codec leaks information about the content of the voice packets, because some sounds compress more than others. By looking at how much each packet of sound was compressed, which can be inferred from the packet size, it is possible to infer something about what kind of sound it is, such as a vowel or a sharp consonant. This undermines the usefulness of the encryption: some phrases can be identified with an accuracy of 50% to 90%. This is a serious vulnerability.
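To make the leak concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the per-sound frame sizes are made-up values rather than measurements of any real codec, the fixed `OVERHEAD` stands in for headers and an authentication tag, and the cipher is modeled as length-preserving. A real attack is statistical and works on sequences of sizes, not a one-to-one lookup.

```python
# Hypothetical VBR frame sizes in bytes per sound class (illustrative only,
# not taken from any real codec).
VBR_FRAME_BYTES = {"silence": 6, "vowel": 38, "fricative": 24, "plosive": 30}
CBR_FRAME_BYTES = 33  # a constant bit-rate codec emits the same size every frame
OVERHEAD = 12         # assumed fixed per-packet overhead (headers, auth tag)


def encrypted_sizes(sounds, vbr):
    """Return the on-the-wire packet sizes an eavesdropper observes."""
    sizes = []
    for sound in sounds:
        payload = VBR_FRAME_BYTES[sound] if vbr else CBR_FRAME_BYTES
        # Length-preserving encryption: ciphertext size = payload + overhead.
        sizes.append(payload + OVERHEAD)
    return sizes


def guess_sounds(sizes):
    """Attacker's view: map each observed size back to a sound class."""
    by_size = {b + OVERHEAD: s for s, b in VBR_FRAME_BYTES.items()}
    return [by_size.get(n, "?") for n in sizes]


utterance = ["silence", "plosive", "vowel", "fricative", "vowel", "silence"]
vbr_sizes = encrypted_sizes(utterance, vbr=True)
cbr_sizes = encrypted_sizes(utterance, vbr=False)

print(guess_sounds(vbr_sizes))  # sound classes recovered without any key
print(guess_sounds(cbr_sizes))  # constant sizes reveal nothing
```

In this toy model the attacker never touches the key or the plaintext, yet the VBR stream gives away the sequence of sound classes, while the constant bit-rate stream looks the same for every frame.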
Fortunately, not too many codecs use VBR. Speex is a VBR-capable codec, and some VoIP applications that use Speex allow the user to choose which codecs to enable. iSAC is a commercially licensed VBR codec, used by Skype. This means that Skype is vulnerable to VBR leakage regardless of the quality of Skype’s built-in crypto. Microsoft’s RTAudio also appears to be a VBR codec, and is used in Microsoft Office Communicator.
It also appears that voice activity detection (VAD) leaks information about the content of the conversation, but to a far lesser extent than VBR. This effect can be mitigated by lengthening the VAD “hangover time” by about 1 to 2 seconds. That would sharply reduce the information leakage, but it may be something that only the VoIP application developer can do, if the VAD parameters are tunable. For an end user, a simpler solution would be to avoid the use of VAD, if this is feasible in your situation. Examples of codecs that use VAD include AMR and G.722.2. If it’s not convenient to avoid all VAD codecs, keep in mind that the leakage from VAD is much less than the leakage from VBR.
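The effect of a longer hangover can be sketched with a small simulation. This is only an illustration under stated assumptions: 20 ms frames, a made-up speech pattern, an assumed baseline hangover of 0.2 seconds, and hypothetical helper names (`transmit_pattern`, `visible_gaps`). It does not model any particular codec's VAD.

```python
FRAME_MS = 20  # assumed frame duration


def transmit_pattern(speech, hangover_frames):
    """Return per-frame booleans: True when a packet is sent.

    After speech stops, the sender keeps transmitting for
    `hangover_frames` more frames before going quiet.
    """
    sent = []
    remaining = 0  # hangover frames still to send
    for active in speech:
        if active:
            remaining = hangover_frames
            sent.append(True)
        elif remaining > 0:
            remaining -= 1
            sent.append(True)
        else:
            sent.append(False)
    return sent


def visible_gaps(sent):
    """Count the pauses an eavesdropper can see (sent -> not sent transitions)."""
    return sum(1 for prev, cur in zip(sent, sent[1:]) if prev and not cur)


# Made-up conversation: speech separated by pauses of 0.3 s, 0.8 s and 3 s.
speech = ([True] * 50 + [False] * 15 + [True] * 50 + [False] * 40
          + [True] * 50 + [False] * 150 + [True] * 25)

short_hangover = 200 // FRAME_MS            # assumed baseline: 0.2 s
long_hangover = (200 + 1500) // FRAME_MS    # baseline lengthened by 1.5 s

print(visible_gaps(transmit_pattern(speech, short_hangover)))
print(visible_gaps(transmit_pattern(speech, long_hangover)))
```

With the short hangover all three pauses are visible on the wire. With the longer one the two short pauses are bridged by continued transmission, so only the 3-second pause remains observable.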
Some researchers have suggested that the VAD hangover time should be lengthened by a random amount, for example an amount drawn from a normal distribution over the range of 1 to 2 seconds. Most codecs that use VAD only allow a fixed amount of VAD hangover to be easily configured. It remains unclear whether a random hangover time is worth the extra effort; this requires further research.
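For readers who want to experiment, one way to draw such a random amount is sketched below. The mean of 1.5 seconds, the standard deviation of 0.25 seconds, and the function name `random_hangover_extension` are assumptions chosen for illustration; the text above only specifies the 1 to 2 second range. Values outside the range are simply re-drawn.

```python
import random


def random_hangover_extension(rng, low=1.0, high=2.0, mean=1.5, sigma=0.25):
    """Draw a hangover extension in seconds from a normal distribution,
    re-drawing until the value falls inside [low, high]."""
    while True:
        value = rng.gauss(mean, sigma)
        if low <= value <= high:
            return value


# SystemRandom draws from the operating system's entropy source instead of
# the predictable default generator.
rng = random.SystemRandom()
extensions = [random_hangover_extension(rng) for _ in range(5)]
print([round(x, 2) for x in extensions])
```

Each call would be made once per talk spurt, so that the point at which transmission stops no longer lines up with a fixed offset from the end of speech.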