*[Note. This is a slightly modified version of an article that
originally appeared in the December 2003 issue of Archive magazine.]*

In the past I've written about PGP and GnuPG for secure e-mail. This article is concerned mainly with the general principles of strong cryptography, rather than any particular software.

By 'strong' cryptography I mean here methods designed for the highest
security requirements, i.e. those that are likely to resist attacks by the
most powerful adversaries using the world's most advanced technology. I am
*not* speaking about methods that are merely adequate for hiding data
from members of one's family, for example.

Any computer file is represented as a sequence of bits, each having the
value 0 or 1. The basic element of all modern cryptography is a
*cipher*, which is a mathematical set of rules (known technically as an
*algorithm*) for converting one sequence of bits into another. The
original file, known as the *plaintext*, may contain an ordinary text
message, an image, sound, or anything else that can be represented in a
computer. The action of the cipher on the plaintext produces the
*ciphertext*, or vice versa.

The exact details of the algorithms which define modern ciphers, and the reasons why they work, are rather technical mathematically and need not concern us here. The competent design and analysis of strong ciphers requires extensive knowledge of the branch of mathematics known as number theory and also of the methods of cryptanalysis (the science of discovering weaknesses in cryptosystems and breaking them if possible).

A basic principle, adopted by all modern cryptographers, is that the algorithm which defines any cryptosystem must be publicly available. Only then is it possible for the cryptosystem to be subjected to peer review and critical analysis, so that users can have confidence in it. This important rule of cryptography is known as Kerckhoffs' Principle.

The precise way in which the plaintext is encrypted in any particular
case depends not only on the nature of the cipher, but also on a secret
*key*, which in general is simply a large number. In most ciphers, the
same secret key is used for encryption and decryption. Such a cipher is
called a *symmetric cipher*.

Key lengths are conventionally measured in bits, and most of the well known strong ciphers have key lengths between 128 and 256 bits (powers of 2 being most common). A cipher is considered strong if, after years of serious study by the world's best cryptographers, there is no known effective cryptanalytic attack against it. This means that the most efficient way of breaking an encrypted message without a knowledge of the key used to encrypt it is 'brute force', i.e. trying all possible keys.

Thus, the effort required to break an encrypted message is determined by
the number of possible keys, known as the *keyspace*. Knowing the speed
of a computer, it is easy to calculate how long it would take to search the
keyspace of a particular cipher.

As an example, consider a cipher that uses 128-bit keys. Each bit can be
either 0 or 1, so that there are 2^{128}, or approximately
3×10^{38}, keys. Suppose, for example, that ten billion (i.e.
10^{10}) computers are assigned to the task, each capable of testing
ten billion keys per second. In combination they could test 10^{20}
keys per second. Then the task of running through the entire keyspace would
take 3×10^{18} seconds, which is about 100 billion years.

To be fair, on average it would be necessary to run through only half the keyspace to hit upon the correct key, which would take 'only' 50 billion years. This is longer than the estimated age of the universe according to modern cosmology, which is about 15 billion years.

For every additional bit of the key, the time required to test all keys is doubled. It is left as an exercise for the reader to do the calculation for a 256-bit key.

The use of a cipher is by itself inadequate for the transmission of
secure messages by e-mail, because of the *key distribution problem*. If
a secure message is to be exchanged between people who have not previously
met or communicated in person, how can they agree upon a secret key? If they
have a secure channel for exchanging the key, they may as well exchange the
message itself by the same method!

Public-key cryptography, as used in the worldwide standard programs PGP
and GnuPG for secure e-mail, provides an elegant and practical solution to
this problem. In a public-key cryptosystem, each user has a *pair* of
keys: a secret key and a public key. Public keys can be widely publicised
openly. Anyone can send an encrypted message to a particular user by
encrypting it with the recipient's public key. The recipient can decrypt it
with his secret key, but nobody else can do this because nobody else has
access to that secret key.

Although the secret and public keys reverse the action of each other, it is computationally infeasible for anyone to calculate the secret key from a knowledge of the corresponding public key. With the sort of key lengths that are commonly used in practice, public-key cryptography has a degree of security that can match that of the popular symmetric ciphers.

In actual practice, a dual system is used. The message itself is encrypted with a symmetric cipher, using a randomly generated one-time key, and a public-key cryptosystem is used to encrypt the key for the cipher. The reason for this dual system is that symmetric ciphers work much faster than public-key cryptosystems of comparable strength. Therefore it is advantageous to apply the public-key cryptosystem to only the fixed-length key, but not to the message itself, which will usually be much longer.

As Phil Zimmermann, the creator of PGP, wrote in his user manual, "Beware of snake oil." The term 'snake oil', originally referring to quack medicine peddled by travelling salesmen, is used by cryptographers nowadays to refer to dubious encryption products.

Those who know little about serious cryptography often assume that a particular encryption method is safe if nobody has cracked it. However, as the eminent cryptographer Bruce Schneier pointed out in his Crypto-Gram Newsletter dated 15 February 2003, "That's actually backwards. In the world of cryptography, we assume something is broken until we have evidence to the contrary." By this he means that an encryption method can be fully trusted only if it has been subject to rigorous and critical analysis by experts to check its resistance to all known cryptanalytic attacks.

Unfortunately, one can find a lot of snake oil on the market for all major operating systems. I have even seen some for RISC OS. As an example of what exists, a huge list of snake oil, mainly for Windows, can be found on the website of the well known security expert Peter Gutmann at http://www.cs.auckland.ac.nz/~pgut001/links/products.html. That list includes both software written by amateurs and commercial software.

How can we distinguish trusted cryptographic software from snake oil? There are several important tests:

Is the source code public? If it isn't, the encryption method is not open to analysis by experts. Maybe the authors have something to hide or are not sufficiently confident about their software.

Do the authors of the cryptosystems used in the software have any cryptographic credentials? The design of a good cryptosystem cannot be entrusted to amateurs. There are many pitfalls, and even the professionals have made some serious blunders. All cryptosystems that are generally accepted in the world of cryptography have been created by experts whose work has been published in peer-reviewed journals of international standing.

Does the program use ridiculous key lengths? Enormous keys are unnecessary and give no guarantee of security. It is easy to create weak encryption methods using huge keys. At the present time, the best symmetric ciphers use key lengths like 128 or 256 bits, while the best public-key systems require key lengths like 1024 or 2048 bits for good security. There is a technical reason for this difference in key size, which need not be discussed here. If you come across encryption software which claims to use colossal key lengths much greater than these, this should be grounds for suspicion.

Does the program claim to use a 'one-time pad'? A one-time pad (which is essentially a sequence of random bits as long as the message itself) is the basis of the only known encryption method which can be proved mathematically to be absolutely secure. However, this method is totally impractical for most purposes of communication, since security is guaranteed only if the pad is conveyed in a secure manner from the sender to the recipient, for example by trusted courier. As its name suggests, a one-time pad must never be reused. If there is any possibility that the pad, or any part of it, might be reused, then no security is guaranteed. It is also important that the pad is generated in a truly random manner, not by some deterministic process. Many programs that claim to use a one-time pad violate these essential conditions.

These warning signs for snake oil are by no means foolproof, but they do serve as a useful guide for most users, who do not have the expertise or time to make an assessment of cryptographic software for themselves.

What encryption programs can be considered strong? PGP and GnuPG, the standard programs for secure e-mail, make use of only algorithms that are generally trusted by the cryptographic community, and their source code is available for public scrutiny (at least for most versions of PGP). Versions of one or both of these programs exist for all major operating systems, including RISC OS.

In addition to those programs for secure e-mail, there are many applications designed to protect one's own confidential data, using symmetric ciphers. All known RISC OS applications of this kind which make use of reputable cryptographic algorithms are collected on my website at http://www.queen.clara.net/pgp/acorn.html

Those applications are generally written by amateur programmers, not by professional cryptographers. However, they all make use of widely trusted cryptosystems, and they all make their source code publicly available for review. In essence, all such programs are simply front ends for the basic encryption algorithms which they use.

Nevertheless, the creation of such front ends can be a delicate task in itself. In contrast to ordinary programs not designed to ensure security, in the case of front ends for cryptosystems it is not sufficient to check merely that the program 'works', i.e. that it produces the expected output. It is also essential, for example, to ensure that the passphrase used to encrypt data (which in turn determines the key) cannot 'leak' in any way that might be accessible to hackers. For this reason, it is important that the entire source code be publicly available, not only for the encryption engine itself, but for the front end as well.

Some readers undoubtedly use other operating systems in addition to RISC OS, in which case there will be an even greater variety of encryption programs available to them. Therefore it may be worth listing some of the well established and generally trusted encryption algorithms. For symmetric ciphers, these include AES (also known as Rijndael), 3DES (or triple-DES), Blowfish, Twofish, CAST and IDEA. For public-key cryptography, the most commonly used cryptosystems are RSA and ElGamal (also known as the Diffie-Hellman system in some contexts). This list is by no means exhaustive, but it does cover the most commonly used systems.

If you come across encryption software which uses any of the cryptosystems mentioned above, or at least which does not show any of the warning signs for snake oil, there's a good chance that you can trust it.

Back to main PGP page for general information about PGP.