Octaves are arguably the most harmonious-sounding intervals in music. This is because an octave is defined as exactly double the frequency of its mother tone. The sound waves of the octave therefore have exactly half the length (period) and fit exactly twice into one wave of the mother tone. Sounding together, the two tones produce no audible interference between their sound waves, which is very pleasing to the ear in an arguably objective sense.
If, however, an octave were slightly but noticeably less (or more) than double the frequency, the sound waves would start interfering. This causes a slow, distinctive wavering in the sound, making it seem out of tune and ugly, again in an arguably objective sense.
The human voice is capable of singing octaves arbitrarily close to perfection. Doing so requires the singer to listen to and sense the mother tone, and then to produce sound with the voice while continuously adjusting its frequency towards a perfect doubling. When people think someone has a beautiful singing voice, this is often in large part because the person is very skilled at approximating such perfect frequencies with their voice (notwithstanding the art of expressivity through slight flatness or sharpness).
After octaves, the so-called fifths are the most important musical interval. A fifth arises from tripling the frequency, which again makes the sound waves fit perfectly (three times) within the wave of the mother tone. Tripling the frequency yields a tone that is higher than an octave, but we can halve that frequency again (thus moving an octave down) to obtain the fifth that is usually intended. This fifth therefore has exactly 3/2 times the frequency of the mother tone, and its pitch lies between the mother tone (1 times the frequency) and the octave (2 times the frequency).
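The two steps described above (triple the frequency, then move one octave down) can be written out with exact fractions. This is only a small sketch; the names `fifth_ratio` and `MOTHER_TONE_HZ` are made up for illustration, and 440 Hz is just an example value for the mother tone.

```python
from fractions import Fraction

MOTHER_TONE_HZ = 440  # example mother tone, chosen only for illustration


def fifth_ratio():
    """Frequency ratio of the fifth: triple the frequency, then halve it."""
    tripled = Fraction(3, 1)   # three times the mother tone's frequency
    return tripled / 2         # halving moves the tone one octave down


ratio = fifth_ratio()
print(ratio)                          # 3/2
print(1 < ratio < 2)                  # True: between mother tone and octave
print(float(MOTHER_TONE_HZ * ratio))  # 660.0
```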
Again, the human voice is capable of singing fifths arbitrarily close to perfection, no matter the frequency of the mother tone. This also holds for instruments on which pitches can be produced along a continuous spectrum, such as violins and cellos. However, when assigning frequencies to the tones of a keyboard (a process called tuning), it turns out to be impossible to always produce perfect fifths (given that octaves should sound perfect), unless there could be an infinite number of keys on a keyboard. In the following part, I explain the compromises that had to be made between the human ear, the physical world, and the discrete nature of a keyboard.
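One way to see the impossibility claimed above is to stack pure fifths and compare the result with stacked octaves. On a keyboard, twelve fifths span the same keys as seven octaves, but with the pure ratios 3/2 and 2 the two stacks do not land on the same frequency. More generally, a power of 3 is odd and a power of 2 is even, so no stack of pure fifths ever coincides exactly with a stack of octaves. The sketch below checks the twelve-fifths case with exact arithmetic; the function name is a hypothetical helper.

```python
from fractions import Fraction

PURE_FIFTH = Fraction(3, 2)
OCTAVE = Fraction(2, 1)


def fifths_vs_octaves(n_fifths, n_octaves):
    """Ratio between n stacked pure fifths and n stacked perfect octaves."""
    return PURE_FIFTH ** n_fifths / OCTAVE ** n_octaves


# Twelve fifths and seven octaves end on the same key of a keyboard,
# yet their exact frequency ratios differ.
mismatch = fifths_vs_octaves(12, 7)
print(mismatch)         # 531441/524288
print(float(mismatch))  # slightly above 1, so the fifths overshoot
```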
Whether by seemingly divine intervention or by brilliant mathematical insight, musicians some centuries ago decided to divide the octave into 12 tones. As a consequence, a keyboard has 12 steps within a single octave. Musicians often refer to these steps, perhaps quite arbitrarily, as half steps. On the keyboard, most of these half steps go from a white note to a black note or vice versa, and two of them go from a white note to a white note (from E to F and from B to C in the picture below). Why the octave on a keyboard starts with C is beyond reason; it is just custom. The following names are assigned to the 7 white notes on a keyboard within one octave: C, D, E, F, G, A, B.
In nearly all instruments, songs, concerts, orchestras, choirs, etc., it is customary to assign a frequency of 440 Hz to the A in one particular octave; let us call this the central octave. Different notation systems are used to indicate exactly which notes are meant, but I will label the notes within the central octave C0, D0, …, A0, B0 (where A0 is 440 Hz). Notes in the octave to the right will be named C1, D1, etc. (then C2, D2, etc., and so on). Notes in octaves to the left of the central octave I denote by C(-1), D(-1), etc. In theory, this gives us an infinite keyboard, but most keyboards offer about 5 (for organs) to 7 (for pianos) octaves, since frequencies below or above this range become inaudible or just plain ugly.
Choosing a frequency of 440 Hz for A0 and then demanding perfectly sounding octaves already fixes the frequencies of all the A’s. To name a few, we have A1 with 880 Hz, A2 with 1760 Hz, and A(-1) with 220 Hz. However, we need a way to assign frequencies to all the other notes as well – these are B, C, D, E, F, G, and the black notes. In order to do this, let us first look at how the black notes are named in practice.
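The frequencies of the A's follow directly from repeated doubling and halving. A minimal sketch, using this text's octave labels (0 for the central octave) and a hypothetical helper name `a_frequency`:

```python
A0_HZ = 440.0  # the A of the central octave


def a_frequency(octave):
    """Frequency of the A in the given octave (0 = central octave)."""
    return A0_HZ * 2.0 ** octave


for n in (-1, 0, 1, 2):
    print(f"A({n}): {a_frequency(n)} Hz")
# A(-1): 220.0 Hz
# A(0): 440.0 Hz
# A(1): 880.0 Hz
# A(2): 1760.0 Hz
```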
The 5 black notes have more than one possible name, but we start by naming them as follows (from left to right):
- Db, spoken: D flat – a half step below D;
- Eb, spoken: E flat – a half step below E;
- F#, spoken: F sharp – a half step above F;
- Ab, spoken: A flat – a half step below A;
- Bb, spoken: B flat – a half step below B.
One may be tempted to call Ab (A flat) by the name G# (G sharp) instead, just as the black note following F was named F#. As we will see later, this is okay: the modern tuning system allows us to say that G# and Ab are enharmonically equivalent on a keyboard, just like C# and Db, etc. Being enharmonically equivalent in this case means that both names are played with the same key, and therefore with the same frequency.
Going up a half step should make the sound a little higher (in frequency, in pitch), but we have not yet decided by exactly how much. Remember, however, that an octave doubles the frequency of a note, so we are looking for a way to raise the frequency at each half step such that twelve half steps together double the frequency.
A naive first approach would be to divide the frequency difference spanned by an octave into 12 equal steps in absolute distance. All twelve half steps between A0 at 440 Hz and A1 at 880 Hz would then be chosen as follows, using a step of 36.7 Hz (rounded):
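The values of this naive tuning can be recomputed with a few lines of Python. This is a sketch under the assumptions stated in the text (equal steps in Hz between 440 and 880); the note names follow the naming of the black notes given earlier, and the function name `linear_tuning` is made up for illustration.

```python
# Note names for the thirteen keys from A0 up to A1.
NOTE_NAMES = ["A", "Bb", "B", "C", "Db", "D", "Eb",
              "E", "F", "F#", "G", "Ab", "A"]


def linear_tuning(low_hz=440.0, high_hz=880.0, steps=12):
    """Frequencies from low_hz to high_hz in equal absolute (Hz) steps."""
    step = (high_hz - low_hz) / steps   # about 36.7 Hz
    return [low_hz + k * step for k in range(steps + 1)]


linear = linear_tuning()
for name, hz in zip(NOTE_NAMES, linear):
    print(f"{name:>2}: {hz:6.1f} Hz")
```

Rounded to one decimal, this gives 440.0, 476.7, 513.3, 550.0, 586.7, 623.3, 660.0, 696.7, 733.3, 770.0, 806.7, 843.3 and 880.0 Hz.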
This seems to work quite nicely for several reasons. First of all, we have perfect-sounding octaves on every note, as some additional calculations for other octaves will show. Second, since the frequency of Eb (660) is now exactly 3/2 times that of A (440), we have a perfect fifth on A exactly 6 half steps above it. Also, the C (550) relates to the 440 Hz of A by the rational factor 5/4. This interval (multiplying the frequency by 5 and moving down 2 octaves) is referred to as a pure major third, which is the next most consonant interval in music, as can again be argued quite objectively.
Although this tuning may seem quite promising at first glance, we encounter some unsettling problems when we start considering fifths on notes other than A. The song Can’t Help Falling in Love by Elvis Presley, for example, starts by singing a C (wise), then a fifth on the C (men), and then back to the C (say… only fools rush in). A perfect fifth on C does not exist in this tuning, however, since 3/2 times 550 equals 825, a frequency that cannot be produced by a keyboard tuned this way!
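The missing fifth can be verified by rebuilding the naive tuning and looking for 825 Hz among its keys. A small sketch; the variable names are illustrative only.

```python
# Naive tuning: thirteen keys from A0 (440 Hz) to A1 (880 Hz), equal Hz steps.
linear = [440.0 + k * 440.0 / 12 for k in range(13)]

c_hz = linear[3]             # C lies three half steps above A0
wanted_fifth = 1.5 * c_hz    # a perfect fifth is 3/2 times the frequency

# Distance from the wanted frequency to the closest available key.
gap = min(abs(hz - wanted_fifth) for hz in linear)

print(f"C = {c_hz:.1f} Hz, perfect fifth on C = {wanted_fifth:.1f} Hz")
print(f"closest key misses it by {gap:.1f} Hz")
```

The wanted 825 Hz falls roughly halfway between two neighbouring keys of this tuning, so no key comes close.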
The solution is the following: rather than repeatedly (twelve times) adding some fixed frequency c, which amounts to solving f(A1) = f(A0) + 12c for c, the frequency is in fact multiplied by some fixed factor c at every half step. To make sure that twelve such multiplications (12 half steps) constitute an octave, we must have c^12 = 2, and so c is the twelfth root of 2, which is approximately 1.0594631.
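The two ways of choosing c can be put side by side in code: the additive step solves 440 + 12c = 880, while the multiplicative factor solves c^12 = 2. A brief sketch with illustrative variable names:

```python
# Additive approach: twelve equal steps in Hz between A0 and A1.
additive_step = (880.0 - 440.0) / 12

# Multiplicative approach: twelve equal factors that together double.
factor = 2 ** (1 / 12)

print(f"{additive_step:.1f}")   # 36.7
print(f"{factor:.7f}")          # 1.0594631
print(abs(factor ** 12 - 2) < 1e-12)   # True: twelve half steps double
```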
Multiplying the frequency by this number at every half step, we obtain the following tuning, rounded to one decimal place:
At this point one may realize that this way of spacing resembles the pattern followed by the frets on a guitar, or by the lengths of organ pipes and piano strings.
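The resulting frequencies between A0 and A1 can be recomputed as follows. This is a sketch based on the factor derived above; the note names follow the naming of the black notes given earlier, and `equal_tempered` is a hypothetical helper name.

```python
# Note names for the thirteen keys from A0 up to A1.
NOTE_NAMES = ["A", "Bb", "B", "C", "Db", "D", "Eb",
              "E", "F", "F#", "G", "Ab", "A"]

SEMITONE = 2 ** (1 / 12)   # frequency factor of one half step


def equal_tempered(low_hz=440.0, steps=12):
    """Frequencies obtained by multiplying by SEMITONE at every half step."""
    return [low_hz * SEMITONE ** k for k in range(steps + 1)]


tuned = equal_tempered()
for name, hz in zip(NOTE_NAMES, tuned):
    print(f"{name:>2}: {hz:6.1f} Hz")
```

Rounded to one decimal, this gives 440.0, 466.2, 493.9, 523.3, 554.4, 587.3, 622.3, 659.3, 698.5, 740.0, 784.0, 830.6 and 880.0 Hz.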
A perfect fifth on C1 should have 3/2 times 523.3 Hz, which is approximately 785 Hz (3 times 523.3 is about 1570, divided by 2). The G1 in this tuning has 784 Hz and lies exactly seven half steps above C1. This is therefore a near-perfect fifth. The slight mistuning is inaudible to most, if not all, human ears.
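The comparison between the pure fifth on C1 and the G1 of the keyboard can be checked numerically. A minimal sketch, assuming A0 = 440 Hz and the half-step factor from above; C1 and G1 lie 3 and 10 half steps above A0, and all variable names are illustrative.

```python
SEMITONE = 2 ** (1 / 12)   # frequency factor of one half step
A0_HZ = 440.0

c1 = A0_HZ * SEMITONE ** 3     # C1: three half steps above A0
g1 = A0_HZ * SEMITONE ** 10    # G1: seven half steps above C1
pure_fifth_on_c1 = 1.5 * c1    # what a perfect fifth would require

# How far the keyboard's G1 falls short, relative to the pure fifth.
relative_gap = (pure_fifth_on_c1 - g1) / pure_fifth_on_c1

print(f"C1 = {c1:.1f} Hz, G1 = {g1:.1f} Hz, pure fifth = {pure_fifth_on_c1:.1f} Hz")
print(f"relative gap: {relative_gap:.2%}")   # about 0.11%
```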
More generally, multiplying a frequency seven times by the twelfth root of two, c ≈ 1.0594631, yields a total factor of c^7 = 1.0594631^7 ≈ 1.498. Therefore, seven half steps above every note there is a fifth on the keyboard with almost exactly 3/2 times the frequency. Because of this, there are 12 different tonalities we can play in. To understand why, let me first give an example of a scale.
The simplest and most common scale in modern music is the C major scale. It contains all the white notes, starting with C: C, D, E, F, G, A, B. Note that we can stack many fifths (each consisting of 7 half steps) within this collection of notes. Starting from C, we have
C → G → D → A → E → B
Going one more fifth up would give F#, which is not in our key. Using an F# at this point would make us feel that we have moved to the scale with G as the mother tone. This scale, built on the fifth (in this case G), is also referred to as the dominant scale.
The one tone missing from the stack of fifths above is F, more generally referred to as the subdominant. It lies a fifth (7 half steps) below C. Going another fifth below F yields Bb (B flat), which is not in our C major scale. Playing that tone would make us feel we have moved to the subdominant scale (F in this case).
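The stacking of fifths in both directions can be reproduced by counting half steps modulo 12. The sketch below is only an illustration: it uses one fixed name per key (the names introduced earlier), and `stack_fifths` is a hypothetical helper.

```python
# The twelve keys of one octave, starting at C, one fixed name per key.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F",
              "F#", "G", "Ab", "A", "Bb", "B"]
C_MAJOR = {"C", "D", "E", "F", "G", "A", "B"}
FIFTH = 7  # a fifth spans seven half steps


def stack_fifths(start, count, direction=1):
    """Notes reached by stacking `count` fifths up (+1) or down (-1)."""
    index = NOTE_NAMES.index(start)
    return [NOTE_NAMES[(index + direction * FIFTH * k) % 12]
            for k in range(count + 1)]


up = stack_fifths("C", 6)
down = stack_fifths("C", 2, direction=-1)
print(" → ".join(up))     # C → G → D → A → E → B → F#
print(" → ".join(down))   # C → F → Bb

# The last note of each chain is the first one outside C major.
print(up[-1] in C_MAJOR, down[-1] in C_MAJOR)   # False False
```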
Inspired by this stacking of pure fifths, we obtain the following notes in all 12 major scales:
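One way to generate these scales, sketched here as an illustration rather than as the method behind the original table, is to take seven consecutive fifths for each mother tone: one fifth below it up to five fifths above it, as was done for C. The code uses one fixed name per key, so a black note may appear under a different name than the conventional spelling of a given scale (for example Db where a musician would write C#). The name `major_scale` is a hypothetical helper.

```python
# The twelve keys of one octave, starting at C, one fixed name per key.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F",
              "F#", "G", "Ab", "A", "Bb", "B"]
FIFTH = 7  # a fifth spans seven half steps


def major_scale(tonic):
    """Seven notes from one fifth below to five fifths above the tonic,
    sorted upwards starting from the tonic."""
    root = NOTE_NAMES.index(tonic)
    offsets = sorted((FIFTH * k) % 12 for k in range(-1, 6))
    return [NOTE_NAMES[(root + offset) % 12] for offset in offsets]


# Visit the twelve mother tones in steps of a fifth, starting from C.
tonics = [NOTE_NAMES[(FIFTH * k) % 12] for k in range(12)]
for tonic in tonics:
    print(f"{tonic:>2} major: {' '.join(major_scale(tonic))}")
```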
Because of its circular structure, this process of moving up and down major scales via fifths is referred to as the circle of fifths (see picture below). Moving clockwise, thus going up a fifth, amounts to adding a sharp to the scale (or removing a flat), and moving counterclockwise, thus going down a fifth, adds a flat (or removes a sharp). The most commonly played keys are in the upper half of the circle; the less popular ones are in the lower half.
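The circular structure can be checked by stepping up a fifth (seven half steps) twelve times on the keyboard: every key is visited exactly once before the walk returns to its starting point. A small sketch, again with one fixed name per key:

```python
# The twelve keys of one octave, starting at C, one fixed name per key.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F",
              "F#", "G", "Ab", "A", "Bb", "B"]
FIFTH = 7  # a fifth spans seven half steps

# Walk clockwise around the circle, starting from C.
circle = [NOTE_NAMES[(FIFTH * k) % 12] for k in range(12)]
print(" → ".join(circle))
# C → G → D → A → E → B → F# → Db → Ab → Eb → Bb → F

# All twelve keys occur once, and one more fifth leads back to C.
print(len(set(circle)) == 12)             # True
print(NOTE_NAMES[(FIFTH * 12) % 12])      # C
```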
Because of the way we tuned our keyboard, every key in the circle of fifths sounds equally “out of tune” in its fifths and thirds. This allows us to modulate between different scales, providing a rich dimension to music composition.