Saturday, December 29, 2007

Quantum Software

Perhaps it is time to take stock of how far we have come with softwarephysics. Recall that softwarephysics is a simulated science, where we try to match up the effective theories of physics that describe the physical Universe, with the corresponding effective theories of softwarephysics that describe similar phenomena in the Software Universe. We began with the struggles of 18th century steam engine designers and the resulting development of thermodynamics. We saw that thermodynamics was an effective theory that described the macroscopic behavior of matter in terms of pressures, volumes, temperatures, and energy flows. We were also introduced to the second law of thermodynamics, which held that the entropy or disorder of the Universe was constantly increasing, and that the only way we could buck this trend and produce order out of disorder, like a car out of iron ore, was to degrade the low entropy chemical energy in a fuel into high entropy disordered energy, also known as heat. We found that entropy is a measure of the depreciation of the Universe, the total amount always increases and never decreases as a whole, but can be decreased locally with an expenditure of effort. We saw that software too was subject to the second law of thermodynamics in that software tended to depreciate, or increase in entropy, through the introduction of bugs whenever software was worked upon by programmers, and that the only way to add macroscopic functionality to software, and thereby reduce its entropy, was to degrade the chemical energy in a programmer’s brain into heat energy. We then drilled down deeper to another effective theory of physics called statistical mechanics, which took the macroscopic ideas of thermodynamics and examined them at the microscopic level of molecules and atoms. 
With statistical mechanics, we saw that entropy could also be viewed in terms of the microscopic disorder of the microstates that a system could exist in, and we used the entropy of poker hands to clarify this idea. The paradox of Maxwell’s Demon was presented next, and we learned how Leon Brillouin solved the paradox with his concept of information as a decrease in the entropy of a system at the microscopic level. We then used these ideas to show that software tended to increase in entropy, and decrease in information content, whenever programmers worked on software because there were many more “buggy” versions of a piece of software than correct versions. Next we examined the special theory of relativity and found that information, the stuff that Leon Brillouin used to solve Maxwell’s Demon, was just as real as matter or energy. In order to preserve the concept of causality, we had to introduce the limitation that information could not travel faster than the speed of light; the same limitation that special relativity imposed upon matter and energy. So the information that flows through our computer systems on a 24 x 7 basis is tangible stuff after all. Finally, we ended with the general theory of relativity and the concept of spacetime, which we extended to cyberspacetime as a stage upon which the events of the Software Universe and IT could unfold.

The ultimate purpose of all this is to collect enough effective theories in softwarephysics to be able to frame the fundamental problem of software. With the fundamental problem of software in hand, we will see that a biological approach to software is in order, and at that point, we will switch gears to cover the biological aspects of softwarephysics. We are about halfway there, and I just wanted to briefly stop to recap our progress to date because our next stop will bring us to the really strange physics of the 20th century, quantum mechanics, and its very counterintuitive assertions. As one of my old physics professors used to say, “You never really understand quantum mechanics; you just get used to it”. So be prepared to hold on tight!

The Software Universe is Quantized
In my last posting on cyberspacetime, we saw that the cyberspacetime of the Software Universe is quantized in both the cyberspace and time dimensions. Because the time dimension of cyberspacetime comes in quanta of time of less than a nanosecond (10⁻⁹ seconds) and the number of microprocessors on a server or PC is not readily apparent to end-users, the quantization of cyberspacetime vanishes at the macroscopic level familiar to IT professionals and layman end-users. For example, even the most economical of PCs are now equipped with a dual-core processor with two CPUs, and a modern data warehouse can scale up to 1,024 nodes, with each node containing 4 dual-core processors, for a total of 8,192 microprocessors. During the course of a single browser session on the Internet, end-users can easily interact with thousands of microprocessors over the span of trillions of nanoseconds, but this all just appears as one large continuous cyberspacetime to them.

For layman end-users and IT managers, the same can be said of software itself. They only view software macroscopically and are only interested in the macroscopic functions that software performs, the speed with which the functions execute, and the stability and reliability of its performance. For them, software is a continuous macroscopic substance. Programmers, on the other hand, are well aware of the quantization of software at the source code level. For programmers, software is composed of lines of source code. And each line of code has a microscopic purpose which translates into a macroscopic effect. In turn, the lines of code are composed of characters, which must be exactly correct in both number and kind. Each character of a line of code is composed of a series of 8 quantized bits, with each bit in one of two quantum states “1” or “0”, which can also be characterized as ↑ or ↓. For example, in the following line of code:

discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;

some sample characters have ASCII representations of:

C = 01000011 = ↓ ↑ ↓ ↓ ↓ ↓ ↑ ↑
H = 01001000 = ↓ ↑ ↓ ↓ ↑ ↓ ↓ ↓
N = 01001110 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↓
O = 01001111 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↑

The physical characteristics of each character in a line of code and its ultimate macroscopic effects are defined by the arrangement of its 8 quantized bits ↑ ↓.
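If you want to see these quantized bits for yourself, a few lines of Python will do it (a minimal sketch; the function names are my own invention):

```python
# Display the quantized 8-bit ASCII pattern of each character as 1s and 0s,
# and also in the up/down spin notation used above.
def char_to_bits(c):
    """Return the 8-bit ASCII representation of a character."""
    return format(ord(c), '08b')

def bits_to_arrows(bits):
    """Render each '1' bit as an up spin and each '0' bit as a down spin."""
    return ' '.join('\u2191' if b == '1' else '\u2193' for b in bits)

for c in 'CHNO':
    bits = char_to_bits(c)
    print(c, '=', bits, '=', bits_to_arrows(bits))
```

Run it, and you will get back exactly the bit patterns listed above.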

Is the Physical Universe Quantized Too?
Recall that the general theory of relativity relied upon Einstein’s Equivalence Principle, which held that the effects of gravity and acceleration were equivalent for all observations. Thus the slowing down of accelerated clocks could be used to predict the slowing down of clocks in a gravitational field and vice versa. The Equivalence Principle is a two-way street that allows physicists to jump back and forth between accelerated reference frames and gravitational fields. In softwarephysics we have a similar two-way street in the Equivalence Conjecture of softwarephysics:

Over the past 70 years, through the uncoordinated efforts of over 50 million independently acting programmers to provide the world with a global supply of software, the IT community has accidentally spent more than $10 trillion creating a computer simulation of the physical Universe on a grand scale – the Software Universe.

The Equivalence Conjecture is an outgrowth of Konrad Zuse’s Calculating Space, published in 1967, in which he proposed that the physical Universe was equivalent to a network of computers, and which gave birth to the concept of digital physics. Using nothing more than the Equivalence Conjecture of softwarephysics, we could logically predict that the physical Universe must be quantized too! I am just trying to have a little fun here. The discovery of the quantization of the physical Universe actually has a very long history going back thousands of years to Ancient Greece.

Atoms and the Void
It all began around 450 B.C., when Leucippus and his student Democritus proposed that the world was composed of a “void” filled with eternal, unchanging, quantized particles called atoms. The Greek word atomos means "uncuttable". According to Democritus, if you kept cutting a piece of gold in half, ultimately you would end up with a single uncuttable atom of gold. These atoms were eternal and unchanging, so the multitude of changes seen in the real world was an illusion, the result of combining or breaking apart combinations of atoms. This atomic view of the Greek atomists contrasted sharply with the philosophy of many of the other Greek philosophers of the day, which held that matter was a continuous substance composed of 4 elemental substances – fire, water, earth, and air. But if matter was really composed of atoms, what forces held the atoms together in combinations, and why didn’t they all just fall apart in a pile at the bottom of the void? These questions plagued the atomists and hampered the acceptance of atomic theory until the 20th century. In 1803, John Dalton, a schoolteacher, used the concept of atoms to explain why elemental substances always reacted in simple proportions by weight. For example, 12 grams of carbon always reacted with 32 grams of oxygen to form 44 grams of carbon dioxide, and 4 grams of hydrogen always combined with 32 grams of oxygen to form 36 grams of water. Dalton proposed that each element consisted of a unique kind of atom, and that these atoms could join with each other to form chemical compounds:

C + O2 → CO2
2 H2 + O2 → 2 H2O

Strangely enough, physics was the last branch of science to fully buy into the atomic theory. The strongly positivist school of thought, led by Ernst Mach, frowned upon atoms because they could not be directly observed. This began to change in 1897, when J. J. Thomson successfully isolated electrons in atoms by deflecting a cathode ray, a stream of electrons, in a vacuum tube with a magnetic and an electric field. Thomson discovered that the charge-to-mass ratio of the electrons from a variety of cathodes, made of different metals, was always the same. Thomson concluded that the electron must be a negatively charged particle, common to all atoms, and that all electrons were identical. Since normal matter has no net electrical charge, Thomson proposed that atoms consisted of a smeared out positive charge with embedded electrons – the plum pudding model of the atom.

In 1909, Hans Geiger and Ernest Marsden, under the direction of Ernest Rutherford, bombarded a thin gold foil, about 200 atoms thick, with alpha particles. An alpha particle is a helium nucleus, consisting of two protons and two neutrons, which manages to tunnel out of a much larger radioactive nucleus such as radon. An alpha particle obtains a high velocity, and consequently lots of kinetic energy, as it is rapidly pushed away from the mother radioactive nucleus by the many protons in the radioactive nucleus. The plum pudding model of the atom predicted that, as the alpha particles passed through the gold foil, they should only have been deflected by a small angle of a few degrees. The surprising result from this experiment was that some of the alpha particles were deflected by very large angles, with some actually reflecting right back into the radioactive source. In 1911, Rutherford proposed that the back scattering of the alpha particles was caused by a very small, positively charged nucleus, less than 3.4 x 10⁻¹⁴ meters in size, and about 100,000 times smaller than the overall size of the gold atoms. The negative charge of the surrounding electrons, orbiting the positively charged nucleus, neutralized the nucleus, yielding atoms with no net electrical charge. In this model, atoms were composed mostly of empty space, with negatively charged electrons orbiting a central positively charged nucleus, like the Earth orbiting the Sun.

Quantization of the Atom
As with relativity, this model presented a problem for classical electrodynamics. The orbiting electrons, moving in circles around the positively charged nucleus, would be constantly accelerating, and should, by classical electrodynamic theory, be constantly radiating off electromagnetic radiation. As the orbiting electrons in atoms radiated off all of their orbital kinetic energy, all the atoms in the Universe should immediately collapse in the blink of an eye. In an earlier posting, I cited a similar problem with classical electrodynamics that Max Planck confronted in 1900. According to classical electrodynamics, the walls of the room in which you are currently located should be at a temperature of absolute zero, having converted all of the energy of the free electrons in the walls of the room into ultraviolet light and x-rays. This was known as the “Ultraviolet Catastrophe” at the time, and is another example of an effective theory bumping up against the limitations of its effective range of reliable prediction. In 1900, Max Planck was able to resolve this dilemma by proposing that the energy of the oscillating electrons in the walls of your room was quantized into a set of discrete integer multiples of an elementary unit of energy E = hf.

E = nhf

n = 1, 2, 3, ...
h = Planck’s constant = 4.136 x 10⁻¹⁵ eV sec
f = frequency of the electron oscillation

A translation of his famous 1901 paper, the paper that first introduced the concept of quantization, is available online.

Max Planck regarded his inspiration/revelation of the quantization of the oscillation energy of the free electrons and their radiated energy as a mathematical trick to overcome the Ultraviolet Catastrophe. But in 1905, the same year that he published the special theory of relativity, Einstein proposed that Planck’s discovery was not a mathematical trick at all. Einstein proposed that sometimes light, an electromagnetic wave in classical electrodynamics, could also behave like a stream of quantized particles, that we now call photons, with energy:

E = hf
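With h expressed in electron-volt seconds, it is easy to check the energy of a single photon of visible light. Here is a quick back-of-the-envelope calculation in Python (a sketch; the function name and the sample frequency for green light are mine):

```python
# Energy of a single photon from its frequency, using Einstein's E = h*f.
h = 4.136e-15  # Planck's constant in eV*sec

def photon_energy_eV(f):
    """Energy in electron-volts of a photon with frequency f in Hz."""
    return h * f

# Green light has a frequency of roughly 5.6e14 Hz:
print(round(photon_energy_eV(5.6e14), 2))  # about 2.32 eV
```

Notice that the answer comes out to a couple of electron-volts, which is right at the energy scale of chemical reactions.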

In 1913, Niels Bohr adopted a similar strategy of quantization to resolve the problem of atoms spontaneously collapsing due to radiating away all of their orbital energy. Bohr proposed that the angular momentum L of the electrons was quantized too.

L = nħ

where ħ = h/2π and n = 1, 2, 3, …

Bohr proposed that electrons did not radiate energy so long as they remained in one of the quantized orbits about the nucleus.

Angular momentum is a measure of the amount of spin that a body has. In classical mechanics, the angular momentum of an electron orbiting a nucleus is:

L = mvr

m = mass of the electron
v = the velocity of the electron
r = radius of the electron’s orbit

and the angular momentum L can take on any value as the velocity v and radius r vary continuously. For example, the conservation of angular momentum is the reason that a skater spins faster when she pulls in her arms during a spin, and her angular momentum can take on any value whatsoever. But in Bohr’s model of the atom, the electrons could only take on quantized values of angular momentum with quantized amounts of energy. For hydrogen, which consists of a single electron orbiting a nucleus composed of a single proton, the energy states became:

En = -13.6 eV/n²

where n = 1, 2, 3, …

An electron-volt, eV, is a very small amount of energy. It is the amount of energy that an electron gains as it accelerates in a vacuum towards the positive pole of a one volt battery and away from the negative pole of the battery. Most chemical reactions have an energy of a few electron-volts per molecular reaction, so it is a convenient unit of energy for atomic theory.

The energies of the quantized states of hydrogen run as:

n = 1: E1 = -13.6/1 = -13.6 eV
n = 2: E2 = -13.6/4 = -3.4 eV
n = 3: E3 = -13.6/9 = -1.5 eV
n = 4: E4 = -13.6/16 = -0.85 eV

where n=1 is the lowest energy level of the atom. In the Bohr model of the atom, quantized photons of light are emitted or absorbed when electrons jump from one quantum state to another. For example, when an electron in the second energy level of hydrogen with n = 2 drops to the lowest quantum state of n = 1, a photon with energy:

∆E = -3.4 - (-13.6) = 10.2 eV

is emitted. Using the formula:

E = hf

it is easy to calculate the frequency of a photon with 10.2 eV of energy, and this is what is observed spectroscopically in the laboratory in confirmation of Bohr’s predicted value.
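Here is that calculation spelled out in a few lines of Python (a sketch, with my own function names):

```python
# Bohr-model energy levels of hydrogen, and the photon emitted when an
# electron drops from the n = 2 level to the n = 1 ground state.
h = 4.136e-15  # Planck's constant in eV*sec

def E_n(n):
    """Energy in eV of the n-th Bohr energy level of hydrogen."""
    return -13.6 / n**2

delta_E = E_n(2) - E_n(1)  # energy released by the falling electron
f = delta_E / h            # frequency of the emitted photon from E = h*f
print(round(delta_E, 1))   # 10.2 eV
print(f'{f:.2e}')          # about 2.47e+15 Hz, an ultraviolet photon
```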

In 1860 Gustav Kirchhoff and Robert Bunsen invented the modern spectroscope, consisting of a gas flame, a slit, a prism, and a small observing telescope. Kirchhoff and Bunsen introduced small amounts of materials into a flame and then allowed the colored light from the flame to pass through a slit and then on through a prism. They then observed the image of the slit with a small telescope. The prism caused the single slit to appear as a series of multiple slits, or lines, of different colors caused by the splitting up of the different frequencies of light within the colored light of the flame. You can repeat their experiment by sprinkling a little salt water in the flame of a gas range and observing the characteristic yellow color of excited sodium atoms. Kirchhoff and Bunsen did not know it at the time, but what was happening was that the high temperature of the flame was bouncing atoms around in the flame, causing some of the electrons in the atoms to get excited. As the electrons transitioned back to lower energy levels, they gave off photons of specific frequencies or colors. Their original paper describing the experiment is available online.

In 1924, Louis de Broglie proposed in his doctoral thesis, that if light waves could sometimes behave as particles, then perhaps particles, like electrons, could also behave like waves with a wavelength λ of:

λ = h/mv

λ = wavelength
h = Planck's constant
m = mass of the particle
v = the velocity of the particle
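To get a feel for the numbers, we can compute the de Broglie wavelength of one of Davisson and Germer’s electrons (a sketch in SI units; the velocity is my rough figure for an electron accelerated through about 54 volts):

```python
# de Broglie wavelength of a particle, lambda = h / (m*v), in SI units.
h = 6.626e-34    # Planck's constant in J*sec
m_e = 9.109e-31  # mass of the electron in kg

def de_broglie_wavelength(m, v):
    """Wavelength in meters of a particle of mass m (kg) moving at velocity v (m/s)."""
    return h / (m * v)

# An electron accelerated through about 54 volts moves at roughly 4.4e6 m/s:
lam = de_broglie_wavelength(m_e, 4.4e6)
print(f'{lam:.2e}')  # about 1.65e-10 m, comparable to the atomic spacing in a crystal
```

That wavelength is comparable to the spacing between the nickel atoms, which is why the crystal made such a good diffraction grating for electrons.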

In 1925, Clinton Davisson and Lester Germer were at Bell Labs bombarding a nickel target with electrons in a vacuum tube. During one of the runs of their experiment, the tube leaked causing the nickel to oxidize. To salvage the nickel target, they had to heat the nickel to a high temperature. Unknown to them, the heating of the nickel fused the nickel into several large crystals. When the fused nickel target was later bombarded with electrons again, they discovered that the electrons were now reflected back in a diffraction pattern. Diffraction patterns are a telling characteristic of waves. It is the diffraction of light waves off the closely spaced pits on a music CD that creates the rainbow sparkle that you see when you shine a light on it. The light waves diffract off the regularly spaced pattern of pits on a CD and get spread out at different angles by their frequencies (or wavelengths and colors), just as a prism spreads out light by its colors or frequencies. In fact, you can make a simple homemade spectroscope using a CD and two pieces of cardboard. Tape the two pieces of cardboard together, such that there is a narrow slit between the two pieces, about 1/16th of an inch wide. Now place the CD on a flat table with the label side down, and then put the cardboard with the slit, in front of a desk lamp containing a compact fluorescent bulb. Be sure to position the desk lamp, so that the photons from the slit have to do a bank shot off the CD at an angle to reach your eyes. Reflected in the CD you will see the bright white reflection of the slit, but you will also see the image of the slit as a series of colored circular slits or lines expanding out from the center of the CD. That is the diffraction pattern. If you look carefully, you will see a complex pattern of closely spaced lines; two red lines, a broad yellow line, a green line, a blue line, and an indigo line, with black spaces between the colored lines. 
This is the spectral signature of the excited mercury atoms in the compact fluorescent bulb, and it is firsthand proof of both the quantization of the energy levels in mercury atoms and of the emitted photons too. In a similar fashion, Davisson and Germer observed essentially the same thing when they accelerated electrons to specific energies and bombarded their crystalline nickel target; they too obtained a diffraction pattern of reflected electrons at specific angles. The electrons diffracted off the regularly spaced nickel atoms in the crystalline lattice of the nickel crystal at specific angles, just as the photons in your homemade spectroscope diffracted off the pattern of pits in the CD at specific angles with specific colors or frequencies. Davisson and Germer reported their results in a paper published in 1927, confirming de Broglie’s idea of the wavelike behavior of particles.

The idea that particles, like electrons, could behave as waves allowed for a reinterpretation of Bohr’s conjecture that the angular momentum of the electrons in a hydrogen atom was quantized as:

L = nħ

where ħ = h/2π and n = 1, 2, 3, …

We could now envision the electron as a standing wave, surrounding the central positively charged proton. Just as only certain wavelengths of sound can exactly fit into a pop bottle and resonate when you blow across the mouth of the bottle, only electrons with certain fixed wavelengths or energies could exactly fit around the circumference of an electron orbital about the proton.

The Bohr quantum model of the atom was very successful at predicting the spectra from hydrogen atoms, but less so for more complex atoms. It also did not explain things like why the spectral lines from atoms split into two or more lines when exposed to a strong magnetic field. As with all effective theories, it had its limitations. Clearly, some improvements were required.

Erwin Schrödinger
The next advance came from a 38-year-old physicist, Erwin Schrödinger, who was still struggling to make his mark in physics. The rule of thumb in physics is that most of the great advances come from people under the age of 30, probably because the inspiration/revelation step of the scientific method seems to come easier to the young at heart with less to lose. At this point, physicists had been working with the mathematics of waves for more than 100 years, and were well aware that all known waves obeyed a differential equation of a particular form known as the wave equation. Schrödinger was convinced that for the de Broglie hypothesis about the wavelike behavior of matter to advance, a wave equation for particles was required. So in the winter of 1926, Schrödinger packed up his latest mistress and left his home and wife in Vienna for a couple of weeks of sabbatical in a Swiss chalet. There he worked out a famous wave equation for particles, now known as the Schrödinger equation. The story goes that he had two pearls, one for each ear, that allowed him to work undisturbed by his mistress. His wife back in Vienna certainly was not a distraction either. He truly was young at heart at the age of 38 and ripe for some inspiration/revelation.

To understand the significance of all this, we need to delve a little into the mathematics of differential equations. Imagine a very long taut guitar string stretched between two walls that are separated by a large distance. If we pluck the string near the left wall, a pulse will begin to travel to the right. Figure 1 below is a snapshot of a small section of a possible pulse at a particular time as it moves to the right.

Figure 1

The motion of the pulse can be described by a scary looking differential equation, known as the wave equation for a stretched string, which describes how each small section of the string moves up and down as the pulse passes by.

The Wave Equation of a Stretched String

∂²y/∂x²  =  (μ/T) ∂²y/∂t²

We will be dealing a little bit with differential equations when we get to chaos theory in softwarephysics, so let’s spend a little time with the wave equation to show that it is really not so scary after all. The project plan for a large IT project can be pretty scary too, if you look at the whole thing at once. However, if you break it down into its individual tasks, it looks much less formidable. It’s important to take life one 2x4 at a time. The first term in the equation on the left is called the second partial derivative of the pulse with respect to the x-axis (distance along the string). It is just the curvature of a small section of the string as the pulse passes by (see Figure 1):

∂²y/∂x²
When this term is a big number, it means that the curvature of the string is large and the string has a peak or valley. When this term is a smaller number, it means that the curvature is low and the string is kind of flat. The variable μ is the density of the string. When μ is large, it means that the string is heavy; when μ is small, it means that the string is light. The variable T is the tension in the string, or how tightly the string is stretched. So if we just look at what we have so far, we see that the curvature of the string pulse is equal to the density of the string μ divided by the tension T of the string times “something else”. This makes sense. When the string is stretched very tightly, the tension T is large, so the curvature of the pulse should get flatter (smaller). Also, when the density μ of the string gets bigger, the curvature of the pulse should get larger too, because a heavy string should be less flattened by a tension T than a light flimsy string. Now we have to look at the “something else” term on the far right:

∂²y/∂t²
This term is called the second partial derivative with respect to time. It is just the curvature of the pulse in time – how fast a tiny section of the string accelerates up and down as the pulse passes by. A flat pulse will not accelerate up and down very quickly as it passes by, while a tightly peaked pulse will accelerate up and down quickly as the pulse passes by.

The wave equation for the stretched string now makes sense as a whole. It simply says that the curvature of the pulse along the x-axis gets bigger when the density μ of the string gets bigger, or when tension T gets smaller, or when the rate at which the string accelerates up and down as the pulse passes by gets bigger. When you work out the mathematics, the velocity of the pulse is given by:
v  =  √(T/μ)
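As a quick sanity check, we can plug in some numbers for a guitar-like string (a sketch; the tension and density are my own sample values):

```python
# Speed of a pulse on a stretched string, v = sqrt(T/mu).
import math

def pulse_speed(T, mu):
    """Speed in m/s of a pulse on a string with tension T (newtons) and density mu (kg/m)."""
    return math.sqrt(T / mu)

# A string under 80 newtons of tension with a density of 0.005 kg/m:
print(round(pulse_speed(80.0, 0.005), 1))  # about 126.5 m/s
```

Tightening the string (bigger T) or using a lighter string (smaller μ) both make the pulse move faster, just as the equation promises.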

The hard part about differential equations is solving them. You have to find a curve that meets the above requirements. When you take a course in differential equations, they teach you all sorts of ingenious techniques using calculus to guess what curvy line fits the requirements. For the problem at hand, the solution to the wave equation for a stretched string fixed at both ends, is a series of standing sine waves, which look something like Figure 2 below. The string can be excited into many standing waves, defined by a number n = 1, 2, 3,… which describes how many wavelengths of the standing wave just exactly fit between the two fixed points.

Schrödinger’s Equation
Working with Einstein’s equation for the energy of a photon and de Broglie’s equation for the wavelength of a particle, Schrödinger had a problem. Unlike the velocity of a wave on a string, which only depended upon the density of the string μ and its tension T, the velocity of a particle’s wave depended upon the wavelength of the particle λ:

v  =  h/(2mλ)

Schrödinger believed that a particle must really be a wavepacket of many superimposed waves of different wavelengths that added up in phase near the location of the particle. Given the above equation, the waves in the wavepacket would tend to move at different velocities because they all had different wavelengths. The traditional wave equation, like the one for a wave pulse on a string, would not work under such conditions. Schrödinger overcame this problem with the following compromise equation, that sort of looks like a traditional wave equation. Note that the wavefunction Ψ is just a wiggly line, like the pulse on our string, and is pronounced like the word “sigh”, and m is the mass of the particle.

-(ħ²/2m) ∂²Ψ/∂x²  =  iħ ∂Ψ/∂t

He had to make two modifications to the standard wave equation:

1. He used the first partial derivative with respect to time, instead of the second partial derivative with respect to time on the far right side of the “=” sign

2. The equation contained:
i  =  √(-1)

(or in other words  i²  =  -1  )

which meant that the Schrödinger equation was a complex differential equation, with an imaginary part containing the quantity “i”, the square root of -1. Now we all know that there is no “real” number that, when multiplied by itself (squared), produces a -1, but that does not scare off mathematicians! Several hundred years ago, mathematicians became comfortable with the idea of an “imaginary” number i, which they defined as the square root of -1. “Real” numbers, the kind of numbers that we are used to dealing with, are just numbers that do not have an imaginary part. A little later, physicists discovered that the Universe seemed to just love “imaginary” numbers. The imaginary number i started to pop up in all sorts of equations and was nearly as popular as π. Now the fact that Schrödinger’s equation contained an imaginary part meant that solutions to the equation, known as wavefunctions Ψ, would not be totally “real” either, because they would contain imaginary parts using the square root of -1. As we shall see, this implication created a problem for the interpretation of what exactly a wavefunction really was. All of the other waves we had experience with, like waves on a string, light waves, or water waves, were “real” functions or curvy lines. What did a complex wavefunction, with both “real” and “imaginary” parts, mean?
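Complex numbers are not just for mathematicians and physicists; most programming languages can handle them too. Python, for example, has them built in, writing the imaginary unit i as 1j:

```python
# Python's built-in complex numbers: the imaginary unit i is written 1j.
i = 1j
print(i * i)           # (-1+0j): i squared really is -1
z = 3 + 4j             # a complex number with real part 3 and imaginary part 4
print(abs(z))          # 5.0: the magnitude of z, sqrt(3*3 + 4*4)
print(z.conjugate())   # (3-4j): the conjugate flips the sign of the imaginary part
```

Keep that conjugate operation in mind; multiplying a complex number by its conjugate always yields a real number, a fact that will matter when we try to extract “real” predictions from complex wavefunctions.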

Schrödinger’s Time Independent Equation for a Single Dimension
We can simplify Schrödinger’s equation by getting rid of the part that depends upon time for the cases where the energy E does not change with time. That is certainly true for the electrons in an atom, so long as they remain in their orbits and do not jump from one orbit to another and emit or absorb a photon.

-(ħ²/2m) d²ψ(x)/dx²  +  V(x) ψ(x)  =  E ψ(x)

In this equation we use ψ for the wavefunction, instead of the full blown Ψ, because it does not contain the part that varies with time. ψ is still just a wiggly line, like the pulse on our string, and is still pronounced like the word “sigh”. In addition, we added a term V(x) which is another function or wiggly line that describes how the potential energy of the particle varies as it moves back and forth along the x-axis. Imagine a straight road that runs across a hilly landscape and that you are riding a bicycle. If you ride your bicycle up a hill, you can tell that you are increasing your potential energy V(x) because it is hard to pedal up the hill as you convert some of your kinetic energy into potential energy. Similarly, it is easy to coast down a hill on a bicycle, because some of the potential energy V(x) that you have stored away is converted back into kinetic energy. Another way of looking at this is that V(x) is also a way of describing a force. For example, the topography of the wiggly V(x) function seems to produce a force pulling you down the hill and another force impeding your progress up the hill. So the above formula allows you to calculate the wavefunction ψ for a particle subject to a force.

The Particle in a Box
Recall Figure 1 in my posting The Demon of Software, which depicted two containers full of molecules bouncing around. In 1872, Ludwig Boltzmann developed statistical mechanics by envisioning the molecules existing in a large number of microstates. This was many years before the arrival of quantum mechanics, so Boltzmann had to make up his microstates by arbitrarily force-fitting the molecules into little mathematical cubicles, like saying “I can tell by your energy that you are a perfect size 8”, at least approximately. Let us now apply Schrödinger’s equation to this problem to find the real microstates. To simplify the problem, imagine a single container of width “L” along the x-axis containing a single particle, such as an electron, in it. The electron is bouncing back and forth along the x-axis like a target in a shooting gallery. The electron only feels a force when it hits one of the walls on either the left or right side of the container at x = 0 or x = L, so the potential energy V(x) between the walls is zero. At the walls located at x = 0 and x = L, the electron feels an infinite force pushing it back into the container, so the potential energy hill V(x) at these two points is very steep and actually goes straight up to infinity.

When you solve Schrödinger’s equation for this problem, you get quantized solutions that are sine wavefunctions:
ψn(x) = √(2/L) sin(nπx/L)

n = 1, 2, 3, ...

En = n²h² / (8mL²)

n = 1, 2, 3, ...

m = mass of the particle (electron in this case)
L = width of the box
h = Planck’s constant
n = quantum number

For example for the lowest energy level where n = 1:
ψ1(x) = √(2/L) sin(πx/L)

E1 = h² / (8mL²)

What is happening here is that you can only fit sine waves into the box that have wavelengths that fit just right. The “n”s determine the number of wavelengths of the sine wavefunction that fits in the box properly and the energy En of each wavefunction. It is easier to see this as a plot rather than as an equation. Figure 2 shows the first three wavefunctions for n = 1, 2, and 3.

Figure 2

The above wavefunctions look a lot like the fundamental and first two harmonics of a vibrating guitar string, and indeed, are identical to what you get for a vibrating string.
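These quantized solutions are easy to play with numerically. Below is a quick sketch of mine (not from the original derivation) that evaluates ψn(x) and En for an electron in a 1-nanometer box; the SI values for Planck’s constant and the electron mass are numbers I am supplying, since the formulas above leave them symbolic.

```python
import math

H = 6.626e-34    # Planck's constant in J*s (assumed SI value)
M_E = 9.109e-31  # electron mass in kg (assumed SI value)

def psi_n(x, n, L):
    """The box wavefunction psi_n(x) = sqrt(2/L) * sin(n*pi*x/L)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def energy_n(n, L, m=M_E, h=H):
    """The quantized energy E_n = n^2 * h^2 / (8*m*L^2) in joules."""
    return n ** 2 * h ** 2 / (8.0 * m * L ** 2)

L = 1e-9  # a 1-nanometer box, roughly atomic scale

# The wavefunction vanishes at both walls for every n, and the
# energies grow as n squared: E_2 = 4*E_1, E_3 = 9*E_1, and so on.
for n in (1, 2, 3):
    print(n, round(psi_n(0.0, n, L), 6), energy_n(n, L))
```

Notice that the energy can never be zero: even the n = 1 ground state has E1 = h²/(8mL²), a purely quantum effect with no classical counterpart.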

Remember, I warned you about wavefunctions being strange. At first, nobody really knew what to do with them. Physicists are just like other people. They tend to take ideas that they are familiar with, and project these ideas onto things that are new to them. Physicists have a few popular models. Grab a small pebble and keep making it smaller in your mind, until it is infinitely small with a dimension of zero. When you are finished, you have a particle. Now take the particle and throw it into a still pond. The resulting waves that propagate away are another good model. Take a bar magnet, as Michael Faraday did, and cover it with a piece of paper. Then sprinkle some iron filings over it. The “lines of force” that you see are a field. Faraday called it a field because it reminded him of a freshly plowed farmer’s field. So when Schrödinger came up with his equation, nobody really knew what to make of the wavefunction solutions to the equation. Schrödinger thought they might have something to do with the electric field of electrons, but he wasn’t quite sure.

In 1926 Max Born came up with another interpretation. Born proposed that the wavefunction was really a probability wave. Imagine that a gang of young thugs moves into your neighborhood, and that they begin to knock off liquor stores late at night in the surrounding area. In order to avoid capture, the gang decides to only knock off each liquor store once in case the police are in hiding. If you plot the liquor stores that get knocked off versus time, you will see an expanding wave of crime. The odds of a particular liquor store getting knocked off increase as the wave passes by and diminish as the wave moves on. Recall that usually the wavefunctions that are solutions to Schrödinger’s equation are complex, meaning that they have both “real” and “imaginary” parts, and that we are only used to dealing with solutions to wave equations that are real. Born knew that it was a mathematical fact that whenever you multiplied a complex number or function by its complex conjugate, you always obtained a real number because all of the imaginary parts disappeared. To obtain the complex conjugate of a wavefunction Ψ, all you have to do is change i to –i wherever you see it. The resulting function is denoted as Ψ*.

Born proposed that the probability of finding a particle at a position x was:

Ψ*Ψ = |Ψ|²

at the point x. So to plot the probability of finding the electron in our 1-dimensional box, we just have to square the absolute value of its wavefunction. Figure 3 shows the resulting plot.

Figure 3 (click to enlarge)

Now here comes the interesting part. Classical mechanics predicts that the probability of finding the electron at any point should be the same for each point along the x-axis, because the electron is just bouncing back and forth like a target in a shooting gallery, and that the electron could have any energy at all, since the energy is just defined by how fast the electron is bouncing back and forth. However, the wavefunctions that are solutions to Schrödinger’s equation predict that the energy of the electron is quantized and comes in discrete allowed values. They also predict that the probability of finding the electron along the x-axis varies according to the energy state of the electron defined by its quantum number n. For the lowest energy level, where n = 1, the electron is most likely to be found near the center of the box because ψ*ψ has a peak there. That is not too bothersome. However, for the second energy level, where n = 2, something really strange happens. There is a high probability of finding the electron on either the left or right side of the box, but never in the center! How can an electron move back and forth in the box without ever passing through the center? For the higher energy levels, where n = 3, 4, 5, …, there are even more dead spots where ψ*ψ = 0, and the electron will never be found! This is just another example of the quantum strangeness that is built into our Universe.
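The dead spots are easy to verify numerically. Here is a small sketch of mine (not part of the original argument) applying Born’s rule to the box wavefunctions; for the box the ψn happen to be real, so ψ*ψ is just ψ²:

```python
import math

def prob_density(x, n, L):
    """Born's rule for the (real) box wavefunctions: psi*psi = (2/L)*sin^2(n*pi*x/L)."""
    psi = math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)
    return psi * psi

L = 1.0
center = L / 2.0

# n = 1: the probability density peaks at the center of the box.
# n = 2: the density at the center is zero -- the electron is never found there.
print(prob_density(center, 1, L))
print(prob_density(center, 2, L))
```

For n = 2 the density instead peaks at L/4 and 3L/4, on either side of the dead spot, just as Figure 3 shows.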

Quantum Mechanics of the Atom
Schrödinger was able to apply his new equation for the case of the hydrogen atom and derive its energy levels defined by the quantum number n by using a V(x) caused by the electrostatic force of the proton pulling on the electron. Again, these matched the spectroscopically observed energy levels of the hydrogen atom, also predicted by the Bohr model of the atom. Because electrons moving around a proton have some rotational motion, Schrödinger’s wavefunctions for the hydrogen atom also had two additional quantum numbers l and m, which defined quantized amounts of angular momentum, and these quantum numbers explained most of the splitting of spectral lines in a magnetic field that Bohr’s model did not. But there was still a small amount of additional spectral line splitting that Schrödinger’s model failed to predict. Schrödinger’s wavefunctions for electrons in atoms had one additional major failing.

In 1922, Otto Stern and Walther Gerlach performed an experiment which showed that electrons had an intrinsic quantized angular momentum, later called spin, and a small associated magnetic field like a tiny bar magnet. Stern and Gerlach shot hot silver atoms from an oven through a non-uniform magnetic field and found that the beam of silver atoms split into two beams. This was a little strange, since electrons are now thought of as fundamental point particles with a dimension of zero, so how could electrons have any angular momentum, if angular momentum is defined as L = mvr, and electrons have an r = 0? The other strange thing was that if electrons were really spinning like little tops, they should spin in all different directions, but when Stern and Gerlach performed their experiment, they always found that the electrons were either spinning in the same direction as their magnetic field or 180° in the opposite direction of their magnetic field. How did the electrons know how to align their spins in advance, before they got to the magnetic field? Again this is just quantum strangeness at work. Stern and Gerlach found that electrons are like little spinning magnets of dimension zero with a spin up ↑ or spin down ↓ of magnitude:

Sz = ± ½ ħ

In 1928 Paul Dirac realized that, from the standpoint of classical mechanics, the electrons orbiting the nucleus of an atom would have to move very quickly to overcome the electrostatic force pulling them into the positively charged nucleus. In fact, the electrons would have to move at about 30% of the speed of light, and would experience relativistic effects. Schrödinger had used the classical concept of the energy E of a particle in his equation, but had not taken into account any relativistic effects. When Dirac included these effects into a much more complicated form of the wave equation, the quantized spin of electrons popped out just as Stern and Gerlach had observed. The other thing that the Dirac equation predicted was that electrons came in two forms; one with a negative charge and positive energy and one with a positive charge and negative energy! We now call these positively charged electrons, positrons. Positrons were the first form of antimatter predicted by physics, and Carl D. Anderson actually observed a positron in 1932 in a cosmic ray experiment carried aloft in a balloon. With Dirac’s equation, we now have a model for the atom composed of a positively charged nucleus surrounded by electrons with wavefunctions defined by four quantum numbers n, l, m, and s.

In 1925, Wolfgang Pauli wondered why all of the electrons in an atom did not simply decay to their lowest energy level of n = 1 by emitting photons. He empirically proposed that each electron in an atom had to be in a unique quantum state. The implication of this proposal was that each electron had to have a unique wavefunction defined by a unique combination of quantum numbers n, l, m, and s. This meant that as you throw electrons into an atom, the electrons have to arrange themselves in shells about the nucleus, as the available slots for each combination of n, l, m, and s are taken up. This is a good thing, because the chemical characteristics of an atom are chiefly defined by the arrangement of its electrons in shells and especially by the electrons near the surface of the atom in the outer electron shells. If all electrons were at their lowest energy level of n = 1, you would not be sitting here contemplating atoms because there would be no chemical activity in the Universe, and you would not exist. It turns out that, theoretically, all of chemistry can be derived from the quantum mechanical model of atoms.
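Pauli’s bookkeeping can be sketched in a few lines of code (an illustration of mine, not Pauli’s). Counting the allowed (l, m, s) combinations for each principal quantum number n reproduces the familiar shell capacities:

```python
def shell_capacity(n):
    """Count the distinct quantum states for principal quantum number n:
    l runs from 0 to n-1, m runs from -l to +l, and s is spin up or spin down."""
    count = 0
    for l in range(n):
        for m in range(-l, l + 1):
            count += 2  # the two spin states, s = +1/2 and s = -1/2
    return count

# Each slot holds exactly one electron, so shell n holds 2n^2 electrons:
for n in (1, 2, 3, 4):
    print(n, shell_capacity(n))  # capacities 2, 8, 18, 32
```

This 2, 8, 18, 32 pattern underlies the shell structure that gives atoms their chemistry.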

Quantum Implications for Softwarephysics
Recall that the individual characters in a line of source code:

discountedTotalCost = (totalHours * ratePerHour) - costOfNormalOffset;

are each defined by 8 quantized bits, with each bit in one of two quantum states “1” or “0”, which can also be characterized as ↑ or ↓.

C = 01000011 = ↓ ↑ ↓ ↓ ↓ ↓ ↑ ↑
H = 01001000 = ↓ ↑ ↓ ↓ ↑ ↓ ↓ ↓
N = 01001110 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↓
O = 01001111 = ↓ ↑ ↓ ↓ ↑ ↑ ↑ ↑

We may think of each character in a line of code as an atom, and each line of code as a molecular chemical reaction which ultimately produces a macroscopic effect. The 8 quantized bits for each character are the equivalent of the spins of 8 electrons in 8 electron shells that may be either in a spin up↑ or spin down ↓ state. And the chemical characteristics of each character are determined by the arrangements of the spin up ↑ or spin down ↓ state of the bits in the character.
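The character-to-spin mapping above can be generated mechanically. A short sketch of mine, assuming plain 8-bit ASCII codes for the characters:

```python
def char_to_spins(c):
    """Render the 8 bits of a character's ASCII code as spins: 1 -> up, 0 -> down."""
    bits = format(ord(c), "08b")
    return bits, "".join("\u2191" if b == "1" else "\u2193" for b in bits)

# Running this reproduces the C, H, N, O table above.
for c in "CHNO":
    bits, spins = char_to_spins(c)
    print(c, "=", bits, "=", spins)
```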

I apologize for all of the math in this posting. If you feel a little confused, you are in good company. Nobody really understands any of this stuff.

Next time we will expand upon these ideas and develop the quantum mechanical formulation of softwarechemistry and try to delve a little into the cosmic interpretation of what all this means.

Steve Johnston

Saturday, December 15, 2007


Last time we explored Einstein’s special theory of relativity and saw how observers in relative motion with each other measured different times and distances between events. This created a problem for causality because there was no absolute “now”, so observers could disagree on the sequence of events and could possibly even observe an effect happen before its cause. The only way we were able to resolve this problem and preserve causality was to presume that information could not travel faster than light. We also saw how Einstein took a very positivistic approach to his new concept of space and time by only dealing with observable phenomena like the ticking of light clocks, the paths and timings of light beams, and the lengths of objects measured directly with yard sticks. Einstein made no reference to an absolute space or time, that we presume exists, but which we cannot directly measure.

People frequently sum this all up with the phrase that “everything is relative”, meaning that anything goes. But that is not the case at all. It turns out that my distances and times may disagree with yours, but I will always be able to convert them into your distances and times through the use of a set of handy conversion formulas known as the Lorentz transformations. For example, you might claim that a certain classic Led Zeppelin CD costs 10 euros, while I maintain that it goes for just under $15 U.S. dollars. That is because of the current currency transformation:

1 euro = 1.47139 U.S. dollars

We will always disagree on the exact numerical cost of items when you use euros and I use dollars, but we will always be able to convert one cost to the other with a transformation formula like the one above.

The Lorentz transformations all have the following term in them:
√(1 - v²/c²)

where v is the velocity of your friend’s spaceship speeding by, and c is the velocity of light. When v is 10% of the speed of light, then

v/c = 0.1


v²/c² = 0.01

√(1 - v²/c²) = √(1 - 0.01) = √0.99 = 0.99498743710... ≈ 1

For relative velocities under 10% of the speed of light, the Lorentz transformations are similar to the situation where:

1 euro ≈ 1 U.S. dollar

and everybody agrees on the numerical cost of goods. In such a situation, everybody agrees on the distances and times between events to a very good approximation, and Newtonian mechanics makes very accurate predictions.
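The arithmetic above generalizes to any speed. A small sketch of mine for the term √(1 - v²/c²) shows how quickly it departs from 1 as v approaches c:

```python
import math

def lorentz_term(v_over_c):
    """The factor sqrt(1 - v^2/c^2) that appears in the Lorentz transformations."""
    return math.sqrt(1.0 - v_over_c ** 2)

# Below about 10% of the speed of light the factor is essentially 1,
# which is why Newtonian mechanics works so well in everyday life.
for v in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"v/c = {v:4.2f}  ->  sqrt(1 - v^2/c^2) = {lorentz_term(v):.8f}")
```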

The Newtonian idea of a fixed and absolute space, providing a stage or background upon which the events of the Universe could unfold, was so appealing that it was very hard to part with. In Einstein’s original positivistic concept of relativity, there is no stage; there is only a dialog between the players in terms of their relative observations of each other. In 1907, Hermann Minkowski, a former university math professor of Einstein, realized that we could still have a stage for our players if we combined space and time into a 4-dimensional spacetime. By adding time as a 4th dimension to our traditional 3-dimensional world, we could unite space and time into spacetime. But how can you possibly combine the 3-dimensional world of x, y and z coordinates that are measured in feet, yards, miles, or meters with time that is measured in seconds? To understand this better, let us turn to an example from American football. The playing area of an official NFL football field is 100 yards in length and 160 feet in width. For some unknown reason, distances along the x-axis of a football field are measured in yards, while distances along the y-axis are measured in feet and inches from the sidelines. For example:

NFL Hash marks are the narrowest of all the levels of football play. They measure 70 feet, 9 inches from the sidelines and are 18 feet, 6 inches wide. (same as the width of the goal posts)

Now imagine that you are at a football game with a friend. Both of you watch a play where the quarterback finds that nobody is open and decides to run the ball himself (see top section of Figure 1). The quarterback receives the football at point A and proceeds to run in a straight line to point B where he is tackled. You observe that the quarterback made a gain of ∆x yards during the play, and that he managed to run ∆y feet across the field as well. Recall that ∆y gains across the field are meaningless in football, so long as you stay within the playing area defined by the sidelines. It turns out that your friend has started the celebration party a little too early with the help of Mr. Jack Daniels, and is in no condition to make observations, but does so just the same. Your friend sees the same play that you just witnessed, but is looking cockeyed at the field (see bottom section of Figure 1). Your friend observes that the quarterback made a gain of ∆x` yards during the play, and that he managed to run ∆y` feet across the field at the same time. When the two of you compare notes, you find that you disagree about both the X-dimensional gain in yards and the Y-dimensional gain in feet:

∆x ≠ ∆x`
∆y ≠ ∆y`

Being in a better mental condition than your friend, you remember the Pythagorean theorem from high school. You realize that if you both make your X-dimensional and Y-dimensional measurements in feet rather than in yards and feet, then you can use the Pythagorean theorem to find the total distance D that the quarterback ran. The total distance of the play should be the same for all observers no matter how they look at the playing field:

D² = ∆y² + ∆x²
D² = ∆y`² + ∆x`²

And sure enough, when all of the calculations are completed, you both agree that the quarterback ran a total of D feet before being tackled, even though you still disagree on the X-dimensional and Y-dimensional gains of the play.

Figure 1 (click to enlarge)
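We can check this agreement numerically. The sketch below (my own illustration, with made-up gains) rotates the same play into a “cockeyed” coordinate system and confirms that the Pythagorean distance D comes out the same for both observers:

```python
import math

def distance(dx, dy):
    """Pythagorean distance D, with both gains measured in feet."""
    return math.hypot(dx, dy)

def cockeyed_view(dx, dy, angle_rad):
    """The same displacement as seen in axes rotated by angle_rad."""
    return (dx * math.cos(angle_rad) + dy * math.sin(angle_rad),
            -dx * math.sin(angle_rad) + dy * math.cos(angle_rad))

# A made-up play: 30 feet downfield, 40 feet across the field.
dx, dy = 30.0, 40.0
dxp, dyp = cockeyed_view(dx, dy, math.radians(25))

print(distance(dx, dy))    # 50.0 feet
print(distance(dxp, dyp))  # also ~50.0 feet, though dx` and dy` differ
```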

Let’s extend the above example, which was conducted in space, to a similar scenario conducted in spacetime. The top section of Figure 2 is a spacetime plot of the next football play. Because it is hard to plot more than 2 dimensions at a time on a piece of paper, in a traditional spacetime plot we usually plot one of the normal 3-dimensions X, Y, or Z of space vs. time T. The normal convention is to plot T vs. X, and we shall do the same. The X-axis in the spacetime plot represents the distance down the length of the football field, as it did in Figure 1 above, and T represents the time of the play. In the second play, the quarterback again decides to run the ball, but this time he runs straight down the field in the X-dimension from point A to point B in a straight line and does not run across the field at all. With the T-dimension we measure the time it takes for the quarterback to run the play. As before, we measure the progress of the play as motion through spacetime with ∆x and ∆t components.

Now imagine that you and your friend are on spaceships in relative motion with each other watching the second play. Even though the quarterback only runs straight down the field in the X-dimension, you and your friend will again disagree on the number of yards the ball was carried ∆x and you will also disagree on how long the play lasted ∆t because your friend’s relative motion makes him look cockeyed at your spacetime! From the bottom section of Figure 2 we see that:

∆x ≠ ∆x`
∆t ≠ ∆t`

Is there some way to come to a mutual agreement for the second play as we did for the first play? Yes, just as the distance D measured in space is the same for all observers, the interval Ι in spacetime is also the same for all observers. The interval Ι is like the distance between events in spacetime. You and your friend will both agree on the interval Ι of the play, so long as you both measure time and space in meters, just as you both agreed on the distance D traveled in the first play when you both measured ∆x and ∆y in feet. After all, there is nothing profound about measuring time in seconds. A second is just some arbitrary unit of time that has something to do with the current rate of spin of the planet that we happen to find ourselves on at the moment. The Universe really doesn’t know about, or care about, seconds. To convert seconds to meters, we need to use the speed of light. The speed of light is 3 x 10⁸ meters/second, so to measure time in meters we need to use a conversion factor, just like we needed a conversion factor to change yards into feet:

1 yard = 3 feet

1 second = 3 x 10⁸ meters of time

Once you convert your time measurements into meters of time, you can apply a modified Pythagorean theorem:

Ι² = ∆t² - ∆x²
Ι² = ∆t`² - ∆x`²

Notice that to measure the interval Ι between events in spacetime, you need to use a “-“ sign instead of a “+” sign in the equation.

Once again, when all of the calculations are completed, you and your friend will now both agree on the interval Ι of the play, even though you will still disagree on the X-dimensional and T-dimensional measurements.

Figure 2 (click to enlarge)

An interesting thing about intervals is that since light travels exactly one meter of distance in one meter of time, the interval between any two events defined by the motion of a light beam is always zero. For example, the interval between a photon leaving the surface of the Sun and arriving at the Earth 8 minutes later, after traveling 93 million miles is zero! If you examine Figure 1 and Figure 2 in my posting Is Information Real? , you will find a similar situation. In this situation, there are two events – the flash of the flash unit and the recording of the flash of light in the camera. Let us say that the mirror of the LightClock is 5 meters from the flash unit and camera, giving a round trip distance of 10 meters in the Y-dimension. Your friend will observe that there is no X-dimensional distance between the flash of light and the recording of the flash by the camera because the flash goes straight up and down for him. He will observe some T-dimensional distance between the events because it will take 10 meters of time for the flash to go straight up to the mirror and reflect back into the camera. Notice that in this problem, we have to deal with 2 dimensions of space, ∆x and ∆y and one of time ∆t, so we need to tack on another dimension to our formula for the interval Ι:

Ι² = ∆t² - ∆x² - ∆y²

Figure 1
∆y = 10 meters of space
∆x = 0 meters of space
∆t = 10 meters of time
Ι = 0

In Figure 2 you will observe the same two events, but you will observe that there is an X-dimensional distance ∆x between the flash of light and the recording of the flash by the camera because the light beam has to do a bank shot off the mirror. You will also observe that the T-dimensional distance will be greater than 10 meters of time because the flash has to travel a longer distance to make the bank shot, and it will take more than 10 meters of time to do so.

Figure 2
∆y = 10 meters of space
∆x > 0 meters of space
∆t > 10 meters of time
Ι = 0

In both cases, you and your friend will disagree on the X-dimensional and T-dimensional differences ∆x and ∆t between the two events of the flash unit flashing and the camera recording the flash, but in both cases you will agree that the interval Ι is zero. The extra ∆x and ∆t you observe cancel each other out, so that you still obtain an interval Ι = 0 too, just like your friend.
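We can redo the LightClock bookkeeping numerically, following the convention used above of plugging the light path lengths into ∆y and ∆t. The sideways drift of 6 meters below is a made-up number of mine; any value works:

```python
import math

def interval_squared(dt, dx, dy):
    """Squared interval with time measured in meters: I^2 = dt^2 - dx^2 - dy^2."""
    return dt ** 2 - dx ** 2 - dy ** 2

# Friend's frame (Figure 1): the flash goes straight up and down,
# 10 meters of path, so dx = 0 and dt = 10 meters of time.
print(interval_squared(10.0, 0.0, 10.0))  # 0.0

# Our frame (Figure 2): the clock drifts sideways, say 6 meters,
# so the light does a bank shot, covering more path and taking longer:
dx = 6.0
dt = math.sqrt(dx ** 2 + 10.0 ** 2)  # about 11.66 meters of time
print(interval_squared(dt, dx, 10.0))  # 0.0 up to rounding
```

The extra ∆x and ∆t cancel exactly, which is why both observers agree that Ι = 0.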

In this analysis, I am using the convention for the interval Ι that Edwin Taylor and John Wheeler use in their excellent book on the special theory of relativity Spacetime Physics (1963) because it yields positive intervals for everyday experiences, which is more in keeping with our experiences of positive distances in space. However, most treatments of the interval reverse the signs on ∆x and ∆t and define the interval as:

s² = ∆x² - ∆t²

but that definition yields negative intervals between most day-to-day events in the Universe. The signs on the ∆x and ∆t are really just a meaningless human convention that the Universe is not concerned with. The important thing is which term is larger. For example, suppose you see a man walking past your home on the sidewalk and you want to calculate the interval between his footsteps. Let’s suppose that relative to you the man takes 1 meter footsteps every second in your reference frame. What interval do you measure? Going back to our formula:

Ι² = ∆t² - ∆x²

∆x = 1 meter of space
∆t = 3 x 10⁸ meters of time = 300,000,000 meters of time

Ι² = 300,000,000² - 1²
Ι² = 90,000,000,000,000,000 – 1
Ι² ≈ 90,000,000,000,000,000

Taking the square root we find:

Ι ≈ 300,000,000 meters

So in our reference frame, the man is mainly moving through spacetime in time and hardly moving through space at all as he walks by! This would be true even if he were driving past your home in a racecar or flying past in a jet airliner. As you sit there quietly at rest in your own reference frame, watching the man slowly stroll by, naturally your ∆x = 0, so you will find yourself traveling through 300,000,000 meters of spacetime every second. Even when you are just sitting still looking out the window, you will find yourself speeding through spacetime at the speed of light, but only in the time dimension! This is the reason nobody stumbled upon the special theory of relativity until 1905. Because we are all speeding through spacetime at the speed of light, mainly in the time dimension, we do not experience relativistic effects. Only when ∆x becomes large, and of the same magnitude as ∆t, do we experience the effects of the special theory of relativity. For example, if the man we were observing was able to run past our house at 50% of the speed of light, then his ∆x between footsteps would become 150,000,000 meters and our calculation of the interval Ι between his footsteps would drop significantly. This would make us see his wristwatch run slow and his body shrink in width as he rapidly flew by. Stranger yet, if we were to observe photons speeding by our home, we would measure an interval of:

Ι² = 300,000,000² – 300,000,000² = 0

because we would observe a ∆x = 300,000,000 meters and a ∆t = 300,000,000 meters.

But since all observers must agree upon the same interval between events, and the photons think that they are at rest with a ∆x = 0, the photons must calculate their intervals of zero as:

Ι² = 0² – 0² = 0

So in the reference frame of photons, which we see speeding by at the speed of light, there is no change in time or space for them – the photons are frozen in their spacetime and do not experience time or age at all! So for many photons, the Big Bang just happened and they are just sitting around waiting for something to happen, which is hard to do when your clock does not move.
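The three cases above, a stroller, a relativistic runner, and a photon, can be compared in one short sketch (mine; the runner at 50% of the speed of light is of course hypothetical):

```python
c = 300_000_000.0  # one second expressed in meters of time

def interval(dt, dx):
    """Interval between two events, Taylor & Wheeler convention: I = sqrt(dt^2 - dx^2)."""
    return (dt ** 2 - dx ** 2) ** 0.5

print(interval(c, 1.0))      # stroller: ~300,000,000 m, almost pure travel through time
print(interval(c, 0.5 * c))  # runner at 0.5c: ~259,807,621 m, noticeably smaller
print(interval(c, c))        # photon: 0.0, frozen in spacetime
```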

When ∆t is greater than ∆x, physicists say that events are timelike because time dominates the calculation of the interval, and when ∆x is greater than ∆t, physicists say that the events are spacelike because space dominates the calculation of the interval. When events A and B are timelike, it means that there is enough time between the events for a beam of light to reach from event A to event B, and consequently it is possible for event A to cause event B to happen. When events A and B are spacelike, it means that there is not enough time between the events for a beam of light to reach from event A to event B, and consequently it is not possible for event A to cause event B to happen.

The Stage Returns
In Minkowski’s reformulation of special relativity, we once again have a stage or background of spacetime upon which the events of the Universe can unfold. Einstein did not at first like this reformulation of his theory of relativity because he thought it was too close to the old Newtonian concept of absolute space. But after a while he warmed up to the idea, and found it of great use as he tried to generalize his theory of relativity for all observers. Recall that his original version of relativity, published in 1905, only covered the special case of observers moving at constant speeds in straight lines with no acceleration. However, if you go back to your spaceship and turn on the engine, Einstein’s first postulate that you cannot perform any experiment that reveals if you are moving relative to an absolute space literally flies out the window. When you accelerate, you can easily tell that you are moving relative to an absolute space because everything tumbles back towards the back end of your spaceship! For example, if you try pouring one of those little bottles of gin, while your airplane rapidly accelerates for takeoff, you will most likely end up with it all over your shirt!

The Equivalence Principle
In 1907, Einstein had another revelation that solved the above problem. He realized that if a man fell out of a window and was in free-fall, then all effects of gravity would simply disappear. For example, we frequently watch astronauts in free-fall on the Space Shuttle become “weightless”. With this inspiration/revelation, Einstein concluded that he could reinstate his first postulate of relativity by postulating that gravity and acceleration were equivalent. This has become known as the Equivalence Principle:

"we [...] assume the complete physical equivalence of a gravitational field and a corresponding acceleration of the reference system."

With the Equivalence Principle, it once again became possible to conclude that no experiment could be performed to determine if you were moving or standing still relative to an absolute space. For example, suppose you are on a spaceship and let go of one of those little cans of Spam, and you observe that it immediately falls to the floor. According to the Equivalence Principle, you will not be able to tell if you are sitting on a launch pad on Earth in a gravitational field, or if you are out in outer space accelerating upwards at 32 ft/sec². Similarly, if the Spam just sits there and floats like it does on the Space Shuttle when it is in orbit, you will not be able to tell if you are in free-fall in a gravitational field or far out in interstellar space just coasting along.

Spacetime Curvature
Einstein’s new postulate of the Equivalence Principle had even more far reaching consequences than Einstein’s original two postulates of special relativity, but at a high intellectual cost. The original special theory of relativity, published in 1905, used little more than high school algebra for the most part. On the other hand, Einstein’s new general theory of relativity, published in 1915, used tensor calculus and non-Euclidean geometry which taxed even the most math proficient physicists of the day. Einstein’s new general theory of relativity relied heavily on Minkowski’s concept of spacetime. Not only did we have our stage back, but now we found that the stage could be distorted, like watching a play in a curvy fun-house mirror.

Imagine a large merry-go-round that can spin very quickly. Start out with the merry-go-round not spinning and measure the radius R of the merry-go-round with a standard foot ruler. Next measure the circumference C of the merry-go-round using the same standard foot ruler. If you divide the circumference by the radius, you will get:

C / R = 2π = 2 × 3.141592… ≈ 6.283185…

This should not come as a surprise because that is what they taught you in high school geometry, also known as Euclidean geometry, where the three angles of a triangle add up to 180° and π is determined using a circle and the above formula.

Now spin up the merry-go-round to a high rate of rotational speed and measure the radius R and circumference C again. You will find that the radius R will not change because the rotational motion is perpendicular to the radius R and there is no relativistic shortening in that dimension. However, when you measure the circumference C, you will find that you get a larger number than before because the standard foot ruler you use will shorten in the direction of the rotational motion because of relativistic effects:

C / R > 2π

This is clearly not what you learned in high school geometry. According to the general theory of relativity, the reason that you no longer measure a value of 2π is that the merry-go-round is accelerating, and its spacetime is no longer flat, but is curved and deformed. When the merry-go-round spins, the direction of its motion on the circumference is constantly changing, and that is an acceleration. That’s why you feel a tug trying to pull you off the merry-go-round when it is spinning. If you put clocks along the circumference of the merry-go-round while it is spinning, you will also notice that they run slower than a clock placed at the center of the merry-go-round. Again the rotational motion along the circumference of the merry-go-round is causing time to slow down because of relativistic effects, and this is further evidence of the distortion of spacetime caused by the acceleration of the merry-go-round. If accelerations can curve spacetime, then using the Equivalence Principle, we can predict that gravitational fields must also curve spacetime too. In fact, the general theory of relativity proposes that gravity really is just a curvature of spacetime itself.
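The rim-ruler argument can be sketched numerically (my own illustration). Rulers laid along the spinning rim contract by √(1 - v²/c²), so more of them fit around the circumference and the measured C / R climbs above 2π:

```python
import math

def measured_c_over_r(rim_speed_over_c):
    """C/R as measured with rulers riding on the spinning disk: the rim rulers
    contract by sqrt(1 - v^2/c^2), so the measured circumference grows."""
    return 2.0 * math.pi / math.sqrt(1.0 - rim_speed_over_c ** 2)

print(measured_c_over_r(0.0))  # 2*pi: the disk at rest is flat and Euclidean
print(measured_c_over_r(0.6))  # about 7.85, noticeably more than 2*pi
```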

It’s easier to see how this works for the curvature of a 2-dimensional surface than for a 4-dimensional spacetime, so let’s go back to our football field. Imagine our football field again, but this time it is situated on the Equator of the Earth with its X-dimensional length aligned in an east-west direction along the Equator. If you add up the angles in a triangle on the football field by stretching a rope between three players, you will get 180°, just as you learned in geometry class in high school. Now let the size of the football field expand many times over, until the 50 yard line reaches the North and South Poles of the Earth. Don’t worry about what happens to the length of the football field during this example. If you now add up the angles in a triangle formed by a player on the Equator in South America, a player on the Equator in Africa, and a player at the North Pole, you will get more than 180°; in fact, the angles will add up to about 270° because each angle in the triangle will be close to 90°. This is a sure sign that the surface of the Earth is curved and not flat. Euclidean geometry, the kind of geometry you learned in high school, only works for flat surfaces. For curved surfaces, you have to use the non-Euclidean geometry developed by Bernhard Riemann in 1854.
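The 270° figure can be checked directly. The sketch below is my own construction, not from the text: it places the three players at the corners of an octant of a unit sphere and adds up the angles between the great-circle arcs joining them.

```python
import math

def sphere_angle(a, b, c):
    """Angle (in degrees) at vertex a between great-circle arcs a->b and a->c."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))
    def tangent(frm, to):
        # Project 'to' onto the plane tangent to the sphere at 'frm'
        d = dot(frm, to)
        t = [to[i] - d * frm[i] for i in range(3)]
        n = math.sqrt(dot(t, t))
        return [x / n for x in t]
    t1, t2 = tangent(a, b), tangent(a, c)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot(t1, t2)))))

# Octant triangle: two points on the Equator 90 degrees apart, plus the North Pole
south_america = (1.0, 0.0, 0.0)
africa        = (0.0, 1.0, 0.0)
north_pole    = (0.0, 0.0, 1.0)

angle_sum = (sphere_angle(south_america, africa, north_pole)
             + sphere_angle(africa, north_pole, south_america)
             + sphere_angle(north_pole, south_america, africa))
# angle_sum comes out to 270 degrees, not the Euclidean 180
```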

Now have the player in South America and the player in Africa both start walking due north towards the North Pole. As the two players proceed due north, they will find that they are slowly approaching each other and that they will finally meet at the North Pole. Even though both players were traveling due north in straight lines, it will seem as if there was some “force” pulling them closer together. Of course, there really was no force at all. The two players were merely walking in straight lines on a curved 2-dimensional surface defined by the surface of the Earth. It was the curvature of the 2-dimensional surface that brought them together, not some “force”. The general theory of relativity says the same thing about gravity. The apparent curved orbit of the Earth about the Sun is caused by the Earth moving in a straight line through a curved spacetime. The general theory of relativity proposes that matter, energy, and pressure all cause spacetime to deform. Near the Sun, it is mainly the Sun’s great mass that deforms spacetime, but the thermal energy of the Sun and the great pressure at its core provide lesser amounts of deformation as well. When an object moves through a deformed spacetime in a straight line, it seems to move in a curved line through space. We interpret the deflected motion of the object to be the result of the “force” of gravity, when really it is just a peculiarity of the distorted spacetime that the object is moving through. That is why all objects fall with the same acceleration, as Galileo observed. They are all just moving through the same deformed spacetime in straight lines.
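You can watch the two players converge with a few lines of arithmetic. In this sketch the function name and the 60° longitude separation are my own assumptions; it computes the east-west gap between two due-north walkers at increasing latitudes:

```python
import math

R_EARTH_MILES = 3959.0

def east_west_gap(delta_lon_deg, lat_deg, R=R_EARTH_MILES):
    """Distance in miles between two meridians delta_lon_deg apart, at latitude lat_deg."""
    return R * math.radians(delta_lon_deg) * math.cos(math.radians(lat_deg))

# Two players 60 degrees of longitude apart walk due north from the Equator.
gaps = [east_west_gap(60.0, lat) for lat in (0.0, 30.0, 60.0, 89.99)]
# The gap shrinks steadily and vanishes at the North Pole even though both
# players walk in straight lines - no "force" is pulling them together.
```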

All of the above can be expressed in terms of Einstein’s famous field equations for general relativity, which in their most concise form are:

Gμν = 8π Tμν      (in units where G = c = 1)

The expression on the left represents the curvature of spacetime and the expression on the right represents the matter and energy content of spacetime. The simplicity of the above formula is a bit misleading. It really represents a set of 10 coupled nonlinear partial differential equations. In Software Chaos, we will learn that we generally cannot solve nonlinear differential equations and must instead use computers to calculate numerical solutions for them. This is just the case for Einstein’s field equations. Exact solutions can only be calculated for the simplest of problems, while high-speed supercomputers are required for more complicated problems. However, approximate solutions to some problems can also be obtained by making simplifying assumptions.

To illustrate this process, suppose you have a Mercator projection map of the world and wish to determine the distance between Panama City near the Equator and London. The Mercator projection is a projection of the spherical Earth onto a cylinder surrounding the Earth, which produces a map with straight lines of both latitude and longitude forming a grid. The advantage of the Mercator projection is that courses of constant bearing appear as straight lines on the Mercator projection, so if you set your ship on a Northwest course, it will follow a straight line on your map. The problem with the Mercator projection is that it stretches both the north-south and east-west scales near the poles so that Greenland appears to have the same land area as Africa even though it is really 14 times smaller in area. Now if you try to find the distance between Panama City and London on the map using the Pythagorean theorem by measuring the distance along the Equator from Panama City to the longitude of London and then measuring the distance due north from the Equator to London you will find that the distance D that you obtain will not be correct:

D² ≠ ∆y² + ∆x²

However, you will find that by applying some correction factors to the above equation, you can once again find an equation that approximately yields the distance D between Panama City and London:

D² = A∆y² + B∆x²

The correction factors A and B are a reflection of the fact that space on a Mercator projection is distorted.
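As a rough illustration of how much the flat-map figure can be off, the sketch below (my own; the city coordinates are approximate and the function names are mine) compares the naive Pythagorean distance on the flat map grid with the true great-circle distance:

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Approximate coordinates (my assumption) for the two cities
panama = (9.0, -79.5)    # Panama City: latitude, longitude
london = (51.5, -0.1)    # London

# Naive Pythagorean distance: along the Equator, then due north, on the flat grid
km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0   # about 111 km per degree
dy = (london[0] - panama[0]) * km_per_degree
dx = (london[1] - panama[1]) * km_per_degree
flat_map_km = math.sqrt(dx ** 2 + dy ** 2)

great_circle_km = haversine_km(*panama, *london)
# The flat-map figure overshoots the true distance because the map grid
# distorts space - hence the need for correction factors A and B.
```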

Similarly, for gravitational fields that are not too strong, like the gravitational field around the Sun, the interval Ι can be approximately written for objects moving towards and away from the Sun as:

Ι² = (1 - 2GM/(c²x)) ∆t² - (1 + 2GM/(c²x)) ∆x²

G = Newton’s gravitational constant
M = the mass of the Sun
x = the distance from the Sun
c = the speed of light
∆x = a small change in the distance from the Sun
∆t = a small change in time

The above correction factors to the interval are again the result of spacetime being distorted by the mass of the Sun. So the special theory of relativity is simply the general theory of relativity applied to a flat spacetime, in which our original equation for the interval has no correction terms and is like the Pythagorean theorem for flat space. It turns out that for objects moving at orbital speeds about the Sun, such as the Sun’s planets, nearly all of this distortion to spacetime is due to the distortion of time, so Newtonian gravity is really just an illusion resulting from the distortion of time! For more on the general theory of relativity see Gravity from the Ground Up (2003) by Bernard Schutz, which is an amazing mathematical treatment of the general theory of relativity using only high school algebra!
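To get a feel for how small these correction factors are, the sketch below (constants rounded; the variable names are mine) evaluates the dimensionless correction 2GM/(c²x) at Earth's distance from the Sun:

```python
G = 6.674e-11        # Newton's gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # mass of the Sun, kg
C = 2.998e8          # speed of light, m/s
AU = 1.496e11        # Earth's distance from the Sun, m

# The dimensionless correction factor 2GM/(c^2 x) at Earth's orbit
correction = 2 * G * M_SUN / (C ** 2 * AU)
# About 2e-8: spacetime out at Earth's orbit is very nearly flat,
# which is why Newtonian gravity works so well there.
```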

The general theory of relativity is an incredibly accurate effective theory that takes over when Newtonian mechanics and Newtonian gravity break down at high velocities and large masses. It works over a very large range of conditions and predicts things like the slowing down of time in gravitational fields, the curvature of light beams near massive objects, the physics of black holes, and the cosmological expansion of spacetime. For example, we have been able to observe stellar systems composed of two neutron stars orbiting each other. A neutron star is a star that has reached the end of its life and has collapsed into a mass of neutrons so dense that a teaspoonful has a mass of 5 trillion tons, and its surface gravity is a trillion times stronger than Earth’s. As the two neutron stars orbit each other, they create gravitational waves, ripples in spacetime, that radiate energy away, causing an orbital decay of the pair. And the observed orbital decay agrees with Einstein’s predictions to one part in 10^14. Even your GPS unit relies on Einstein’s general theory of relativity. There are currently 31 GPS satellites orbiting at an altitude of 12,600 miles above the Earth, and each contains a very accurate atomic clock. The signals from the GPS satellites travel to your GPS unit at the speed of light, so by knowing the travel time of the signals from at least 4 of the GPS satellites, it is possible to determine your position on Earth very accurately. In order to do that, it is very important to have very accurate timing measurements. Because the GPS satellites are moving relative to your GPS unit on Earth, their onboard atomic clocks run slower and lose about 7.2 microseconds per day due to their high velocities.
But because the satellites are also further from the center of the Earth and in a weaker gravitational field, where spacetime is less deformed than on the surface of the Earth, their atomic clocks also run faster and gain 45.9 microseconds per day due to the weaker gravity out there. The net effect is a gain of 38.7 microseconds per day, so the GPS atomic clocks have to be purposefully built to run slow by 38.7 microseconds per day before they are launched, so that they will keep in sync with clocks on the surface of the Earth. If this correction were not made, an error in your position of about 7 miles per day would accrue, since light travels about 7 miles in 38.7 microseconds. But despite the great success of the general theory of relativity, it is still only an effective theory that does not work for very small things where quantum effects predominate, but that is a story for another time.
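Both clock corrections can be reproduced from first-order formulas. The sketch below is my own (constants rounded, and the GPS orbital radius is my assumption); it computes the special relativistic loss and the gravitational gain per day:

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24     # mass of the Earth, kg
C = 2.998e8            # speed of light, m/s
R_EARTH = 6.371e6      # radius of the Earth, m
R_GPS = 2.656e7        # GPS orbital radius (about 12,600 mile altitude), m
SECONDS_PER_DAY = 86400.0

# Special relativity: the satellite's orbital speed makes its clock run slow.
# First-order time dilation is v^2 / (2 c^2) per second of flight.
v = math.sqrt(G * M_EARTH / R_GPS)                  # about 3.9 km/s
sr_loss_us = (v ** 2 / (2 * C ** 2)) * SECONDS_PER_DAY * 1e6

# General relativity: weaker gravity aloft makes its clock run fast.
# First-order gravitational shift is the potential difference over c^2.
gr_gain_us = (G * M_EARTH * (1 / R_EARTH - 1 / R_GPS) / C ** 2
              * SECONDS_PER_DAY * 1e6)

net_gain_us = gr_gain_us - sr_loss_us               # roughly 38 microseconds/day
```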

Spacetime and Cosmology
The general theory of relativity had a huge impact on the new science of cosmology early in the 20th century because it could provide a description of the overall curvature of spacetime for the entire Universe as a whole. There will be a future posting devoted entirely to cybercosmology, but here is a brief thumbnail sketch of events as we currently understand them for the physical Universe. Our bubble Universe formed about 14 billion years ago in a Big Bang, perhaps from a rapidly expanding false vacuum (Andrei Linde 1986), or from a black hole in a previous bubble Universe (Lee Smolin 1992). The radius of the visible Universe, that portion of our bubble that we can see with telescopes, is currently only 14 billion light years because that is the furthest distance that light could have traveled towards us in the 14 billion years since the Big Bang. But the visible Universe is just a very tiny speck in our bubble which is more than 10^26 times larger, or perhaps, even infinitely large. Our bubble is so large because it went through a very rapid cosmic expansion called Inflation during the first 10^-32 seconds after the Big Bang that stretched the spacetime of our bubble until it was nearly flat (Alan Guth 1981). The spacetime of the Universe has continued to expand since the Inflation, but at a much lower rate of expansion. The rapid expansion of spacetime during the Inflation might sound like a violation of the restriction, imposed by the special theory of relativity, that nothing can move faster than the speed of light, but that is not the case. It is true that matter, energy, and information cannot travel faster than the speed of light through spacetime, but spacetime itself can expand much faster than the speed of light, and that is what happened during the Inflation.
The end result is that we can only see a very tiny portion of our bubble Universe out to a distance of 14 billion light years, but our bubble continues on to an infinite, or nearly infinite, size filled with galaxies we cannot yet see. At the instant of the Big Bang, a huge quantity of positive energy in the form of photons, and a matching amount of negative energy in the form of gravitational potential energy was released. The net amount of energy in our bubble Universe seems to be zero, which certainly is the case if spacetime is truly flat. So it appears that our bubble Universe was essentially made from “nothing”. It seems that today, most cosmologists think that there probably are an infinite number of other bubble universes, forming an infinite multiverse.

In the general theory of relativity, the physical Universe is a 4-dimensional spacetime. The Software Universe, on the other hand, is a 2-dimensional cyberspacetime, consisting of a time dimension and a cyberspace dimension. The world currently produces about 500 billion microprocessors per year, so there should be about 1 – 10 trillion active microprocessors at the present time. Deep down at the hardware level, cyberspacetime is not a continuum - both dimensions are quantized. The cyberspace dimension is defined by the 1 – 10 trillion currently active discrete microprocessors, wherever they might be, and the individual system clocks of each microprocessor quantize the time dimension. Microprocessors currently run with a clock speed of about 10^9 Hz, so the time dimension of cyberspacetime is quantized into quantum chunks of time of about 10^-9 second each.

As with spacetime in the physical Universe, cyberspacetime is the stage upon which the events of the Software Universe and IT unfold. The Software Universe began in a Big Bang about 2.1 billion seconds ago as a few bytes of machine code on Konrad Zuse’s Z3 computer in the spring of 1941, and has expanded and evolved into the complex Software Universe we see today, consisting of millions of terabytes of software residing on trillions of microprocessors. Cyberspacetime began on Earth, but has expanded to every planet in our solar system and is currently leaving our solar system on board the Pioneer 10 & 11 and Voyager 1 & 2 probes.

At the logical level, most familiar to IT professionals, the graininess caused by the quantization of cyberspacetime at the hardware level vanishes and so most IT professionals just view it as a large cyberspacetime continuum. Like the spacetime of the physical Universe, cyberspacetime is nearly flat. For example, I frequently work from home in a suburb of Chicago in the middle of the night (for me) with some of my fellow Middleware Operations team members who are located in India and with our Command Center which is located 26 miles from my home. Our Unix server farms and mainframe datacenters are scattered throughout the U.S. in different cities. But as far as I can tell, it is all just one large flat cyberspacetime to me. Programmers only have to worry about warped cyberspacetime if they write code for one of the 300 geosynchronous satellites orbiting at an altitude of 22,300 miles, one of the 31 GPS satellites orbiting at 12,600 miles, or distant space probes, because cyberspacetime is more warped here on Earth than way out there. But even we Earthbound programmers still have to contend with the limitation imposed by the special theory of relativity that requires that information cannot travel faster than the speed of light. This becomes a factor for telecommunication over satellite links and is why the software on the Spirit and Opportunity rovers on Mars must be located on board the rovers and not back on Earth in a JPL control room, since it can take up to 21 minutes for commands from JPL to reach Mars. When I worked for Amoco in the credit card processing section of IT, we had to worry about the 240 milliseconds that we lost in sending credit card transactions up from the gas stations to a geosynchronous satellite and back down to our mother Earth station in Tulsa.
When customers swiped their cards in a CRIND (Card Reader in Dispenser) at the pump, we lost 240 milliseconds because our transactions could not travel faster than the speed of light, and those 240 milliseconds could not be regained by tuning the network no matter how hard we tried.

The Software Universe is not made of things; it consists only of processes and flows of information between discrete events in the two dimensions of cyberspacetime. There is the illusion that the Software Universe is filled with real things, such as files and databases with purchase orders and inventory levels on them, but that is an illusion. For example, when you edit a file with Notepad, you are not interacting with a file; you are interacting with a process. Just press CTRL-ALT-DEL to open Windows Task Manager, click on the Processes tab, and look for notepad.exe. The PID is the process ID for the notepad process you are interacting with. The CPU Time is the distance the notepad process has traveled along the time dimension of cyberspacetime, measured in CPU cycles dedicated to the notepad process. Nothing ever interacts with "things" in cyberspacetime because the "things" have to first be read into the memory of a computer and placed under the control of a process. Only the processes interact with each other in cyberspacetime. Therefore, files and databases on disk drives, tapes, and CDs are not part of the Software Universe; they are part of the physical Universe. In the Software Universe, we have two processes called "read" and "write" which allow us to pass information into and out of cyberspacetime. So at the logical level, above the hardware level, the Software Universe is simply made up of information and was created out of “nothing”, just like the physical Universe. We shall see in a future posting on cybercosmology, that a Universe simply made out of information and composed only of interacting processes, might be a good model for the physical Universe too.
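As a small cross-platform illustration (a Python sketch standing in for Windows Task Manager; the busywork loop is my own), you can watch your own process's PID and its distance traveled along the time dimension:

```python
import os
import time

# Every running program is a process with a PID (its address on the
# cyberspace dimension) and an accumulated CPU time (its distance
# traveled along the time dimension of cyberspacetime).
pid = os.getpid()

# Burn a little CPU so the process moves along the time dimension.
total = sum(i * i for i in range(100_000))

cpu_seconds = time.process_time()  # CPU time consumed by this process so far
print(f"PID: {pid}, CPU time so far: {cpu_seconds:.6f} s")
```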

Next time we will drill down even deeper by turning to another effective theory of physics called quantum mechanics and explore the nature of the microstates of statistical mechanics, the structure of atoms, and software at the level of individual characters in source code.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston

Sunday, December 02, 2007

Is Information Real?

Last time we explored the nature of information and saw how the concept of information was used in statistical mechanics to solve the paradox of Maxwell’s Demon, and in the process, we took the concept of information from something you knew to something you could measure. But this raises the question, what are we measuring? Is information real, or is information just a useful human contrivance that we made up like the constellations we use for locating stars in the night sky? This is an important question for both physics and softwarephysics because it is where the two overlap the most. It is also an important question for IT professionals because we process information all day long, and as I demonstrated in my last posting, software is also a form of information subject to the mischief of the second law of thermodynamics.

Are You Moving or Standing Still?
Strangely, the answer to the question of the reality of information was resolved by an obscure clerk in the Swiss Patent Office in Bern in 1905 by the name of Albert Einstein, but this story starts out much earlier than that. The story really begins with another more fundamental question – are you currently moving or standing still? My fervent hope is that most of you are reading this posting in a quiet room and not on a Blackberry while driving your car in rush hour traffic as you fidget with an iPod and adjust your GPS navigation unit. So hopefully most of you would maintain that you are comfortably at rest in a room with familiar surroundings. But on second thought, recall that the Earth rotates on its axis at 1,038 miles/hour at the equator. I am writing these words in a suburb of Chicago at a latitude of 42°, so I am currently moving to the east at about 771 miles/hour relative to the center of the Earth. Also, don’t forget that the Earth orbits the Sun at 66,660 miles/hour, and that the Sun orbits the center of our galaxy at 132 miles/second, and that our galaxy is moving about 360 miles/second relative to the cosmic background radiation (CBR). So everything in the Universe, including you and me, is in a constant state of relative motion. And yet I do not feel a thing. The reason I do not feel all this motion is that, for the most part, I am moving in a straight line at a constant speed. For example, the Earth takes 365 days to orbit the Sun and complete a full 360° revolution about it. So that comes to about 1 degree/day. The width of your index finger at arm’s length also subtends an angle of about 1°. Now imagine being able to drive a car all day long in a straight line at 66,660 miles/hour, and find that at the end of the day, you have only deviated from your desired straight line path by the width of your index finger at arm’s length, when you look back at your starting point!
Most of us would likely congratulate ourselves on being able to drive in such a straight line. Because the circumference of the Earth’s orbit is over 584 million miles and it takes 365 days to cover that distance, the Earth essentially moves in a straight line over the course of a single day to a very good approximation.
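The 771 miles/hour figure quoted above follows from a one-line calculation: the ground's eastward speed falls off with the cosine of the latitude. A quick sketch (the function name is mine):

```python
import math

EQUATOR_SPEED_MPH = 1038.0   # rotational speed of the Earth's surface at the Equator

def rotation_speed_mph(latitude_deg):
    """Eastward speed of the ground at a given latitude, relative to the Earth's center."""
    return EQUATOR_SPEED_MPH * math.cos(math.radians(latitude_deg))

chicago = rotation_speed_mph(42.0)   # about 771 miles/hour, as quoted above
```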

Galileo’s Concept of Relative Motion
In 1543, Copernicus published On the Revolutions of the Heavenly Spheres in which he proposed a new model for the solar system that had the Earth revolve about the Sun, rather than having the Sun orbit about a fixed and stationary Earth located at the center of the Universe. Galileo was an early supporter of the Copernican theory because of his observations of the moons of Jupiter revolving about Jupiter and his observations of the phases of Venus as it orbited the Sun. He made these discoveries in 1610 with the first use of a telescope to perform astronomical observations. But the lack of any readily apparent terrestrial evidence for the motion of the Earth was a significant challenge for Galileo because it was the chief argument against the Copernican theory. If the Earth really was moving, why didn’t we feel it move? Galileo addressed this issue in 1632 in his Dialogue Concerning the Two Chief World Systems, in which he proposed that we do not feel the motion of the Earth because the motion of the Earth is, for all practical purposes, in a straight line as outlined above, and that observers cannot sense straight line motion at a constant speed. In a famous passage of this book, Galileo suggested performing a series of experiments down in the hold of a ship on a calm sea. Galileo proposed that if you throw a ball, watch dripping water, or perform any other experiment in the hold of a ship on a calm sea, you will obtain the same results if the ship is rapidly moving under sail or is standing still at anchor. No experiment you perform will allow you to tell if you are moving or standing still. Anybody who has ever poured one of those little bottles of gin on an airplane traveling at 550 miles/hour in a quiet sky can attest to the validity of Galileo’s observations.

Shut yourself up with some friend in the main cabin below decks on some large ship, and have with you there some flies, butterflies, and other small flying animals. Have a large bowl of water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel beneath it. With the ship standing still, observe carefully how the little animals fly with equal speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into the vessel beneath; and, in throwing something to your friend, you need throw it no more strongly in one direction than another, the distances being equal; jumping with your feet together, you pass equal spaces in every direction. When you have observed all these things carefully (though doubtless when the ship is standing still everything must happen in this way), have the ship proceed with any speed you like, so long as the motion is uniform and not fluctuating this way and that. You will discover not the least change in all the effects named, nor could you tell from any of them whether the ship was moving or standing still. In jumping, you will pass on the floor the same spaces as before, nor will you make larger jumps toward the stern than toward the prow even though the ship is moving quite rapidly, despite the fact that during the time that you are in the air the floor under you will be going in a direction opposite to your jump. In throwing something to your companion, you will need no more force to get it to him whether he is in the direction of the bow or the stern, with yourself situated opposite. The droplets will fall as before into the vessel beneath without dropping toward the stern, although while the drops are in the air the ship runs many spans. The fish in their water will swim toward the front of their bowl with no more effort than toward the back, and will go with equal ease to bait placed anywhere around the edges of the bowl. 
Finally the butterflies and flies will continue their flights indifferently toward every side, nor will it ever happen that they are concentrated toward the stern, as if tired out from keeping up with the course of the ship, from which they will have been separated during long intervals by keeping themselves in the air. And if smoke is made by burning some incense, it will be seen going up in the form of a little cloud, remaining still and moving no more toward one side than the other. The cause of all these correspondences of effects is the fact that the ship's motion is common to all the things contained in it, and to the air also. That is why I said you should be below decks; for if this took place above in the open air, which would not follow the course of the ship, more or less noticeable differences would be seen in some of the effects noted.

Although it did not occur to me to put these observations to the test when I was voyaging, I am sure that they would take place in the way you describe. In confirmation of this I remember having often found myself in my cabin wondering whether the ship was moving or standing still; and sometimes at a whim which I have supposed it going one way when its motion was the opposite....

This is a key point. Galileo proposed that all motion is relative, meaning that you can only define motion as a change in the distance between objects. There is no such thing as absolute motion relative to some absolute and fixed space. Galileo’s concept of relative motion was carried forward further by Gottfried Leibniz, a contemporary and strident rival of Newton, who fervently claimed that there was no such thing as absolute space; there only was relative motion between objects – absolute space was a fantasy of our common sense. So when you tell a police officer that you really did come to a full and complete stop at a stop sign, you really mean a full and complete stop relative to the stop sign and not some fixed and absolute space filling up the entire Universe. This flies in the face of our common sense notions about space and motion, and there were many objections to this idea of relative motion and an Earth moving about the Sun. Galileo was tried for heresy in 1633 by the Inquisition and was sentenced to life imprisonment under house arrest. The uncensored version of the Dialogue Concerning the Two Chief World Systems was banned until 1835.

Newton’s Concept of Absolute Space and Time
In the first few pages of Newton’s Principia, published in 1687, Newton proposed that there really was an absolute and fixed space filling the entire Universe that all objects existed in and moved through. This absolute fixed space was like a stage or background upon which the motions of all the objects in the Universe were played out. Newton admitted that, as Galileo had proposed, you could not measure this fixed and absolute space directly, but just the same, it still existed. Newton also proposed that there was a fixed and absolute universal time that all observers agreed upon. All observers agreed on the number of seconds it took the police officer to write out your ticket. The ideas of a fixed and absolute space and time are such common sense self-evident concepts that Newton almost dismissed dealing with them outright in the first few pages of the Principia because they seemed so obvious to him, but this turned out to ultimately lead to his undoing. It would take more than 200 years to reveal the flaws in his reasoning.

Electromagnetism Creates Problems
In the 19th century, great advances were made in the area of electrodynamics, the dynamical study of charged particles in the presence of electrical and magnetic fields. It began in 1820 when Hans Christian Oersted discovered that an electrical current in a wire produced a magnetic field. You can repeat this famous discovery by simply connecting a wire to a standard 9-volt battery, like the ones used in many toys, and watch the current deflect a compass needle. Shortly thereafter, Andre-Marie Ampere came up with a mathematical model which explained this phenomenon. When electrical charges move, they create a magnetic field, and when charged particles move in a magnetic field, they feel a magnetic force perpendicular to their motion that deflects them. Before you connect the wire to the 9-volt battery, there are many free electrons in the copper wire moving about in random directions creating magnetic fields that all cancel each other out. When you connect the battery, an electric field is created in the wire, and the free electrons in the wire begin to drift slowly towards the positive pole of the battery. Contrary to what many people believe, the electrons do not flow at the speed of light. In fact, they drift at the stupendous rate of about one foot per hour in the wire. It is the driving electrical field that travels at nearly the speed of light and which makes rapid telecommunications possible. So now we had a handy way of telling if something was standing still or moving. All you had to do was observe an electrically charged object. If it gave off a magnetic field, then you knew it was moving; if it did not give off a magnetic field, then you knew it was standing still. So Newton was “right” and Galileo was “wrong”, an observer could use electromagnetic experiments to tell if he was standing still or moving relative to absolute space. There was just one problem, the idea did not work. 
In 1901, Trouton and Noble conducted just such an experiment using a suspended charged capacitor. They tried to observe the magnetic field that should have been given off by the suspended charged capacitor as it moved through absolute space on board the Earth, as the Earth orbited the Sun. But they did not find any! Also, in 1887, Michelson and Morley conducted an experiment in which they measured the speed of light, an electromagnetic wave, in two perpendicular directions separated by an angle of 90°. The expected result was that a difference in the speed of light in the two directions would be observed because of the Earth’s motion about the Sun, but again, none was detected. The speed of light was always found to be the same even as the experimental rig was rotated in different directions while the Earth orbited the Sun at 66,660 miles/hour.
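As an aside, the glacial electron drift speed mentioned earlier can be estimated with the standard formula v = I / (nAq); the current and wire gauge in this sketch are my own assumptions, chosen to land in the "foot per hour" regime:

```python
# Estimate the electron drift speed in a copper wire: v = I / (n * A * q)
I = 3.0          # current in amperes (my assumption for a small battery circuit)
n = 8.5e28       # free electrons per cubic meter of copper
A = 3.31e-6      # cross-sectional area of 12-gauge wire, m^2
q = 1.602e-19    # charge of an electron, coulombs

drift_m_per_s = I / (n * A * q)
drift_ft_per_hour = drift_m_per_s * 3600 / 0.3048
# On the order of a foot per hour - the driving electric field travels at
# nearly light speed, but the electrons themselves barely crawl.
```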

Einstein’s Concept of Relative Space and Time
Albert Einstein graduated from college with a degree in physics in 1900, but could not find a job in academic physics, and instead, settled for a job as a patent clerk in the Swiss Patent Office in Bern from 1902 – 1908. In 1905, Einstein published On the Electrodynamics of Moving Bodies in which he proposed that Galileo was right after all. In this paper, Einstein proposed that you really could not conduct any experiment, including electromagnetic experiments, that would reveal if you were moving or standing still relative to an absolute space, because there was no such thing as an absolute space. All motion was relative to other objects just as Galileo had proposed from the start. In order for this to be true, Einstein had to raise two conjectures to the level of postulates:

1. The laws of physics are the same for all observers, even for observers moving relative to each other at constant speeds in straight lines.

2. The speed of light is the same for all observers, even for observers moving relative to each other at constant speeds in straight lines.

If the above two postulates were not true, you could easily tell if you were moving or standing still relative to an absolute space. All you would have to do is measure the speed of light in different directions, and if it were not the same in all directions, then you would know that you were moving relative to an absolute space.
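Einstein's second postulate is baked into the relativistic rule for combining velocities, which replaces the everyday sum u + v. A minimal sketch (the function name is mine):

```python
C = 299_792_458.0   # speed of light, m/s

def add_velocities(u, v, c=C):
    """Relativistic velocity addition: combined speeds never exceed c."""
    return (u + v) / (1 + u * v / c ** 2)

# A ship moving at half the speed of light turns on its headlights:
# both the ship's crew and a "stationary" observer measure the beam at c.
beam = add_velocities(0.5 * C, C)
```

For everyday speeds the correction term u·v/c² is negligible and the rule collapses back to ordinary addition, which is why we never notice it.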

You can read an English translation of On the Electrodynamics of Moving Bodies at

The first few sections are very enlightening and not that difficult.

In an earlier posting, I covered the steps in the scientific method. The above two postulates in On the Electrodynamics of Moving Bodies comprised the first step of the scientific method - inspiration/revelation. In the remainder of the paper, Einstein uses deductive rationalism to expand the postulates into a self-consistent theory by deducing the implications of the postulates. So let’s do the same. Imagine that you have a friend on a spaceship far from all the stars in our galaxy and that he has a special clock that we shall call a LightClock. The engine on his spaceship is turned off, and he is just coasting along in the vacuum of space. As far as your friend can tell, he is not moving because he does not feel any motion in the pit of his stomach, just as you do not feel any motion sitting in your quiet room reading this posting. The LightClock consists of a flash unit, a mirror, and a camera and works like this. The flash unit emits a burst of light that reflects off the mirror back into the camera. Every time the camera detects a reflected flash of light, the flash unit triggers again and sends out another burst of light. Each burst of light from the flash unit is considered a tick of the LightClock, and the time interval between ticks is used to tell time on your friend’s spaceship.

Figure 1

Now imagine that you are on an identical spaceship also far from all the stars of our galaxy with the engine turned off just coasting along. You have an identical LightClock on board your spaceship to keep time, and you also do not think that you are moving because you do not feel any motion in the pit of your stomach either. All of a sudden, you see your friend’s spaceship fly by at a high rate of speed relative to your spaceship. As your friend’s spaceship flies by, you see his LightClock ticking away with light flashes. Because your friend’s spaceship is moving relative to yours, you do not see the light rays moving straight up and down between the flash unit, mirror, and camera as he does. Instead, it appears to you that in order for the light flashes from the flash unit to reflect off the mirror and hit the camera, they have to move at an angle like a billiard ball in a bank shot because the camera will have moved to the right as the spaceship speeds by. The faster your friend’s spaceship speeds by, the greater the angle that is required for the bank shot.

Figure 2

From your perspective, the identical LightClock on your spaceship behaves just like Figure 1 above with the light rays going straight up and down from the flash unit to the mirror and back into the camera.

Now here comes the strange part when we start to apply Einstein’s two postulates. From your perspective, the bank shot path taken by the light rays in your friend’s LightClock, going up from the flash unit to the mirror at an angle, and finally coming back down to the camera at an angle, is clearly longer than the straight up and down path taken by the light rays in your identical LightClock. Since Einstein’s second postulate states that the speed of light is the same for all observers, from your perspective, the apparent time between light flashes in your friend’s LightClock must be greater than in your identical LightClock. Your conclusion must be that time is running slower for your friend than it is for you!

Now you might object that this is all nonsense. Yes, these strange LightClocks might have a problem keeping proper time when they move relative to each other, but nobody uses such strange timepieces, and “real” clocks would certainly not have a problem keeping in sync with each other on the two spaceships. This is where Einstein’s first postulate comes into play. If “real” clocks on board your friend’s spaceship did not behave just like his LightClock, then he would have a way to tell if he was moving relative to absolute space or standing still. All he would have to do is compare his LightClock to a “real” clock, and if the two clocks did not keep the same time, then he would know that he was “moving”. If the LightClock and “real” clock stayed in sync, then he would know that he was “standing still”. So the “real” clocks on your friend’s spaceship must slow down too, just like his LightClock. It goes further than that. Everything on your friend’s spaceship must slow down to keep in sync with the LightClock, including the biochemical aging reactions in your friend’s body. From your perspective, everything would have to move in slow motion on your friend’s spaceship.
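
We can turn the bank shot geometry into numbers. The following Python sketch derives the factor by which the moving LightClock appears to tick slower; the function name and the sample speeds are my own choices for illustration.

```python
import math

def dilation_factor(v_over_c: float) -> float:
    """How many times slower the moving LightClock appears to tick.

    From your perspective, the flash travels the hypotenuse of a right
    triangle: the vertical leg is the clock's height d, and the
    horizontal leg is how far the spaceship moved during half a tick.
    With the speed of light c the same in both frames:
        (c * t_moving / 2)**2 = d**2 + (v * t_moving / 2)**2
    and since d = c * t_rest / 2, solving for t_moving gives
        t_moving = t_rest / sqrt(1 - (v/c)**2)
    """
    return 1.0 / math.sqrt(1.0 - v_over_c**2)

for beta in (0.1, 0.5, 0.9, 0.998):
    print(f"at {beta:5.3f} of light speed, time runs {dilation_factor(beta):6.2f}x slower")
```

Notice that the factor barely differs from 1 at everyday speeds, but blows up as the relative speed approaches the speed of light.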
Some further analysis also reveals that the length of objects on your friend’s spaceship would also have to shrink relative to the same objects on your identical spaceship. In fact, his whole spaceship would shrink in length from your perspective as it flew past your spaceship. The faster his spaceship flew past you, the greater would be the slowing down of his time and the greater the shrinking of his spaceship. As his spaceship approached the speed of light, time on board would come nearly to a stop and his spaceship would shrink to the thickness of a piece of paper! Now for the strangest part of all. Since all we know is that the two spaceships are in relative motion with each other, your friend would see the very same thing when he looked at your LightClock and your spaceship! He would say that your LightClock was running slow, that time on board your spaceship had slowed down, and that your spaceship had shrunk too. And you would both be right!

The above implications of Einstein’s two postulates do some serious damage to our common sense notions of space and time, so we need to do some investigative work to check things out. To complete the final step of the scientific method, we need to use inductive empiricism to conduct experiments to test the deduced implications of Einstein’s postulates. Surprisingly, all known laboratory results confirm Einstein’s strange predictions. For example, when high energy cosmic rays (mainly fast moving protons) strike the upper atmosphere of the Earth, they create particles called muons at an altitude of about 30,000 feet. A muon is a fundamental particle very much like a very heavy electron, but unlike electrons, muons decay into other particles in 2.2 x 10⁻⁶ seconds. In order to conserve the momentum of the incoming high energy cosmic rays, the muons created by the collision of cosmic rays with atoms in the Earth’s upper atmosphere end up moving at about 99.8% of the speed of light. But with a lifetime of only 2.2 x 10⁻⁶ seconds, the muons should only travel about 2165 feet before decaying, so none of the muons should reach the Earth’s surface. Yet we do observe a large number of muons striking the Earth’s surface. The reason we observe muons at the surface of the Earth is that at a relative speed of 99.8% of the speed of light, time on board the muons slows down considerably, and the lifetime of the muons increases to 34.8 x 10⁻⁶ seconds from our perspective. This allows plenty of time for the muons to reach the Earth’s surface before decaying.
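
You can check the muon arithmetic yourself. Here is a small Python sketch of the calculation; the inputs are the rounded values quoted above, so the results are approximate.

```python
import math

C_FEET_PER_SEC = 9.836e8   # speed of light in feet per second
BETA = 0.998               # muon speed as a fraction of light speed
REST_LIFETIME = 2.2e-6     # muon lifetime in its own frame, in seconds

gamma = 1.0 / math.sqrt(1.0 - BETA**2)

# Without time dilation, the muon decays long before reaching the ground:
naive_range_ft = BETA * C_FEET_PER_SEC * REST_LIFETIME

# With time dilation, the muon's lifetime in our frame stretches by gamma:
dilated_lifetime = gamma * REST_LIFETIME
dilated_range_ft = BETA * C_FEET_PER_SEC * dilated_lifetime

print(f"gamma:            {gamma:.1f}")
print(f"naive range:      {naive_range_ft:,.0f} feet")
print(f"dilated lifetime: {dilated_lifetime * 1e6:.1f} microseconds")
print(f"dilated range:    {dilated_range_ft:,.0f} feet")
```

The dilated lifetime comes out to the 34.8 x 10⁻⁶ seconds quoted above, which stretches the muon’s range from about 2,160 feet to well over the 30,000 feet needed to reach the ground.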

There is no Absolute Now
Let’s look into another problem caused by the relative motion between the spaceships. Imagine the two spaceships are outfitted with some new timing equipment. This time, two flash units are installed, one at each end of the spaceship, and a camera is installed exactly in the middle of each spaceship. Suppose your friend observes that two bursts of light from the flash units on his spaceship arrive exactly at the same time at his camera. Since the camera is exactly midway between the two flash units, your friend must conclude that both flash units fired off simultaneously.

Figure 3

Suppose you observe the same two events as your friend’s spaceship flies past at a great speed relative to you. You also observe that the two flashes arrive at the camera at the same time. However, because his spaceship is moving relative to yours, you know that flash A had to happen before flash B. The light from flash A has to chase after the camera, while the camera runs into the oncoming light from flash B, so the light from flash A had to travel a greater distance to reach the camera than the light from flash B. Consequently, flash A happened before flash B from your perspective.

Figure 4

If there was a third spaceship going even faster to your right than your friend’s spaceship, observers on that spaceship would see just the opposite. They would contend that flash B happened before flash A!

Now we have a problem. We have three observers; one says that A and B happened at the same time; one says that A happened first and then B happened; and the last says that B happened first and then A happened. These are all true statements. Do not think that these observations are optical illusions caused by the observers not taking into account time lags due to the time it takes for light to reach them from different objects on the other spaceships. When all such adjustments are made, the conclusions of all three observers are found to be true.

In the Newtonian concept of space and time, such problems do not arise because everybody agrees on a common universal “now”. But in Einstein’s concept of space and time, there is no universal “now”. Each observer has his own “now”, and if I am moving relative to you, then my “now” has to be different from your “now”. This creates a problem for the concept of causality. Consider the situation where we think that flash A “causes” flash B to occur. Suppose we rig up a detector at B waiting for a signal from A along an Ethernet cable. When an IP packet from A arrives at B, B fires off a flash of light. We set up A so that it sends the IP packet after it flashes. So in this arrangement we think that flash A causes flash B to happen, so we had better not get into a situation where B happens before A! The only way we can manage that is to conclude that the IP packet cannot travel faster than the speed of light, since we are already using light beams for all of the timings in our example. If the IP packet or any other form of information does not travel faster than the speed of light, all observers will agree that A happens before B and we will not have a problem with causality.
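
This rule can be captured in a few lines of code. The Python sketch below labels a pair of events as safe for causality exactly when a signal at or below the speed of light could link them; the function names and the unit convention (c = 1) are my own choices for illustration.

```python
def interval_squared(dt: float, dx: float) -> float:
    """Spacetime interval (c*dt)**2 - dx**2 in units where c = 1.
    All observers compute the same sign for this quantity, even though
    they disagree about dt and dx individually."""
    return dt**2 - dx**2

def order_is_absolute(dt: float, dx: float) -> bool:
    """True when every observer agrees which event happened first.
    That holds exactly when a signal traveling at or below the speed
    of light could link the two events."""
    return interval_squared(dt, dx) >= 0.0

# Flash B occurs 1 time unit after flash A but 2 light-units away: the
# IP packet would have to travel at twice the speed of light, so some
# observers would see B happen before A - A cannot have caused B.
print(order_is_absolute(dt=1.0, dx=2.0))   # prints False

# Flash B occurs 3 time units after flash A at the same distance: a
# slower-than-light packet can make the trip, every observer agrees
# that A happened first, and A may safely cause B.
print(order_is_absolute(dt=3.0, dx=2.0))   # prints True
```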

In another supplemental paper on relativity published by Einstein in 1905, Does the Inertia of a Body Depend Upon Its Energy-Content?, he put forward an “Oh, by the way, I nearly forgot to tell you” additional concept:

E = mc²

Matter is a form of energy! Now watch the coming together of ideas. The concept of energy as formulated by Rudolf Clausius in the first law of thermodynamics is found to be equivalent to matter in Einstein’s reformulation of space and time, and the concept of information, which grew out of the second law of thermodynamics, Maxwell’s Demon, and the concept of entropy in statistical mechanics, is found to preserve the idea of causality in Einstein’s Universe of relative motion. And the conservation of energy outlined in the first law of thermodynamics matches up nicely with Lavoisier’s earlier discovery of the conservation of matter. Matter can turn into energy and energy can turn into matter, but the total amount of energy and matter in the Universe remains constant.

In this second paper on relativity, Einstein also showed that material objects could not move faster than the speed of light because it takes more and more energy to accelerate objects as they approach the speed of light, and it would take an infinite amount of energy for an object to surpass the speed of light. So now we have a very interesting finding. Matter, energy, and information cannot move faster than the speed of light. Based upon this finding, we can consider information to be just as real as energy and matter! So the answer to the question posed by this posting is yes, information is truly as “real” as matter and energy!
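
We can illustrate why the speed of light is unreachable with the relativistic kinetic energy formula KE = (gamma - 1)mc². This Python sketch, using an arbitrary 1 kilogram test mass for illustration, shows the energy cost exploding as the speed creeps toward c.

```python
import math

def kinetic_energy(mass_kg: float, v_over_c: float) -> float:
    """Relativistic kinetic energy (gamma - 1) * m * c**2 in joules."""
    c = 2.998e8  # speed of light in m/s
    gamma = 1.0 / math.sqrt(1.0 - v_over_c**2)
    return (gamma - 1.0) * mass_kg * c**2

# The energy needed to push a 1 kg object ever closer to the speed of
# light grows without limit - actually reaching c would take an
# infinite amount of energy:
for beta in (0.9, 0.99, 0.999, 0.999999):
    print(f"at {beta} of light speed: {kinetic_energy(1.0, beta):.2e} joules")
```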

Resistance to Outside Ideas
Einstein had great expectations for his two papers on relativity published in 1905. He had also published two other papers in 1905, one which helped launch the quantum mechanical revolution by proposing that light waves were quantized into particles we now call photons, and the other, which used Brownian motion to prove that atoms and molecules actually did exist after all, despite the misgivings of the physicists who gave Ludwig Boltzmann such grief over the idea. Einstein hoped that these papers would be his ticket out of the patent office. But to Einstein’s great surprise, there was a deafening silence. Again, Einstein was an outsider and not taken seriously by the physics community of the day. Luckily for Einstein, Max Planck read his work and thought that it was significant. With Planck’s endorsement, other physicists began to take a look too, and a grudging acceptance of Einstein’s ideas began to develop over the years. Einstein was not freed from his patent office until 1909, when he became an associate professor at the University of Zurich, after becoming a lecturer at the University of Bern in 1908. He finally became a full professor at the Eidgenössische Technische Hochschule in Zurich in 1912. Relativity remained controversial for many years, and Einstein never received the Nobel Prize for this work, but he was awarded the Nobel Prize in Physics in 1921 for his explanation of the photoelectric effect of light and photons.

The Introduction of the Concept of Effective Theories
Since Newtonian mechanics was built upon a foundation of absolute space and time, it had to be modified when Einstein’s concepts of relative space and time were introduced. Everything got a little cockeyed at that point because the foundation of physics had shifted. Because of the shift in the underlying foundation, strange things like E=mc² began to pop up. But for the most part, if you kept your speeds down below 10% of the speed of light, the relativistic adjustments to Newtonian mechanics were so small that they could be ignored. So Newtonian mechanics worked very well over the velocity range of 0% - 10% of the speed of light. That defined the velocity range over which Newtonian mechanics worked as an effective theory. Recall that an effective theory is an approximation of reality that only holds true over a certain restricted range of conditions and only provides a certain depth of understanding of the problem at hand.
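
A quick calculation makes the 10% rule of thumb concrete. This Python sketch, with sample speeds of my own choosing, shows how small the relativistic correction to Newtonian mechanics is at everyday speeds, and how quickly it grows beyond about 10% of the speed of light.

```python
import math

def relativistic_correction_percent(v_over_c: float) -> float:
    """Percent by which the relativistic factor gamma exceeds the
    Newtonian value of exactly 1 at a given fraction of light speed."""
    gamma = 1.0 / math.sqrt(1.0 - v_over_c**2)
    return (gamma - 1.0) * 100.0

# A jet airliner, a fast spacecraft, 10% of light speed, and 50%:
for beta in (8.3e-7, 1.0e-4, 0.1, 0.5):
    print(f"at {beta:.1e} of light speed: "
          f"{relativistic_correction_percent(beta):.3e}% correction")
```

At 10% of the speed of light the correction is only about half a percent, which is why Newtonian mechanics works so well as an effective theory below that speed.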

Einstein’s ideas were quite a shock to the early 20th century physicists because they were not used to the concept of effective theories. Physicists in the 18th and 19th centuries did not think that they were discovering effective theories that were only approximations of reality; they thought that they were discovering the true fundamental laws of the Universe that defined actual reality. It soon got worse. About the same time that relativity was struggling for acceptance, it became evident that Newtonian mechanics did not work very well for small things like atoms either. Again it was the interplay of Newtonian mechanics and classical electromagnetism that caused the problem. In an earlier posting, I discussed the Ultraviolet Catastrophe in which classical electromagnetism predicted that the walls of the room you are sitting in should be at a temperature of absolute zero, having converted all of their available energy into ultraviolet light and x-rays. Newtonian mechanics and classical electromagnetism also predicted that the electrons surrounding the nuclei of all the atoms in the Universe should rapidly convert all of their orbital energy into electromagnetic radiation and collapse in the blink of an eye. What was happening here was that the effective theories of Newtonian mechanics and classical electromagnetism were bumping up against the limits of their effective range of applicability. Beyond a limited range of velocities and sizes, the approximations of Newtonian mechanics and classical electrodynamics simply did not work and had to be supplemented by additional effective theories that covered these new ranges of conditions – relativity and quantum mechanics.

Implications From an IT Perspective
What lessons can we draw from all of this from an IT perspective? First of all, we can see that information plays a very significant role in the physical Universe, just as it does in the Software Universe that we live in as IT professionals. Secondly, it highlights the pitfalls of common sense. Since traditional computer science currently relies heavily on common sense for predicting the behavior of software, we need to be very wary of the “obvious” assumptions that IT common sense is built upon. Remember, IT common sense is just an unwritten effective theory of software behavior based upon our common sense experiences with software and certainly is subject to the constraints of all effective theories. IT common sense only works for a limited range of IT conditions, and it only provides a limited depth of understanding.

We should also take note of Einstein’s strong adherence to a positivist approach to a relative space and time versus Newton’s concept of an absolute space and time. In Einstein’s original conceptualization of relativity, we only deal with observable phenomena like the ticking of light clocks, the paths and timings of light beams, and the lengths of objects measured directly with yard sticks. Einstein does not make any reference to an absolute space or time that we presume exists, but which we cannot directly measure as Newton did in his Principia. In softwarephysics we also take a positivist point of view of software behavior. We do not care what software “really is”, we only care about how software is observed to behave.

The above material published in 1905 has become known as Einstein’s special theory of relativity because it covers the special case in which all of the observers are in relative motion with each other at constant speeds in straight lines. In 1915, Einstein generalized his theory by allowing observers to accelerate, and this has become known as Einstein’s general theory of relativity. Next time we will explore the general theory of relativity, spacetime, and extend these ideas to the concept of cyberspacetime.

As supplemental reading, you can find an excellent treatment of the above material, with very little math, in Professor John D. Norton’s course HPS 0410 Einstein for Everyone at:

Be sure to investigate the animated graphic on the Relativity of Simultaneity towards the middle of the webpage.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston