Sunday, May 10, 2015

Quantum Computing and the Many-Worlds Interpretation of Quantum Mechanics

In my last posting, The Software Universe as an Implementation of the Mathematical Universe Hypothesis, we explored Max Tegmark’s proposal that our physical Universe, and the Software Universe that we IT professionals and end-users are all immersed in, is simply an unchanging eternal mathematical structure that has always existed in a Platonic sense. In that posting, we discussed Max Tegmark’s proposal that there is a Level III multiverse composed of an infinite number of Level I and Level II multiverses that are constantly splitting due to Hugh Everett’s Many-Worlds Interpretation of quantum mechanics. In this posting, I would like to further explore the Many-Worlds Interpretation of quantum mechanics as it relates to quantum computing because many quantum computer researchers consider it key to the advancement of quantum computing.

The concept of quantum computing goes back to some early work in 1982 by Richard Feynman and David Deutsch, but it was David Deutsch who carried the idea forward and came up with the very first theoretical design of a quantum computer, similar to Alan Turing’s 1936 theoretical description for classical computers. Here is a link to David Deutsch’s seminal 1985 paper describing quantum computers and contrasting them with the classical computers that we work with today:

A more accessible outline of quantum computing can be found in David Deutsch’s book The Fabric of Reality (1997). Another very good book on quantum computing is Programming the Universe: A Quantum Computer Scientist Takes on the Cosmos (2006) by Seth Lloyd. Seth Lloyd is currently working on quantum computers at MIT and is the first quantum mechanical engineer in MIT’s Mechanical Engineering department. Seth Lloyd is recognized for proposing the very first technologically feasible design for a quantum computer. In his book, he proposes that the physical Universe is a huge quantum computer calculating how to behave and generates what we observe in the physical Universe, along the lines of Max Tegmark’s Level III multiverse. A good online synopsis of this idea is available in The Computational Universe (2002), in which he calculates the computing power of the entire physical Universe treated as one large quantum computer. You can find this fascinating paper at:

So why is the Many-Worlds Interpretation important to quantum computer research? Well, the whole point of quantum computation is that a quantum computer can perform a huge number of logical operations in parallel using a limited amount of hardware, while classical computers need dedicated hardware for each logical operation. For example, in a classical computer, like your laptop, a 1-bit memory location can hold a 1 or a 0, but in a quantum computer, a 1-qubit memory location can hold both a 1 and a 0 at the same time in a superposition of quantum states! In a classical computer, when your code reads the 1 or 0 at the top of an if-then-else block, it will do one thing or the other by branching either into the then-block of code or into the else-block of code. But in a quantum computer, a 1-qubit memory location can be in a superposition of being both 1 and 0 at the same time, so when the quantum computer reads the 1-qubit memory location, it logically splits into two quantum computers. One of the twin quantum computers performs the then-block, while the other quantum computer performs the else-block at the same time and in parallel. So with a quantum computer, you can have a single computer behave like a nearly infinite number of computers all working in parallel on the same problem at the same time. In that sense, a quantum computer would behave very much like Mickey’s water-carrying brooms in The Sorcerer's Apprentice segment of Walt Disney’s Fantasia, constantly splitting in two to perform a task at each logical branch of your program.
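To make the qubit idea a little more concrete, here is a minimal Python sketch. It is only a toy model under my own assumptions: a qubit is represented as a pair of complex amplitudes, and the names hadamard and amplitudes_needed are made up for this illustration rather than taken from any real quantum computing library. It shows a qubit that starts out as a plain 0 being put into an equal superposition of 0 and 1, and how quickly the number of amplitudes grows as qubits are added.

```python
import math

# Toy model (an assumption for illustration): a qubit is a pair of
# complex amplitudes (a0, a1) for the outcomes 0 and 1.

def hadamard(qubit):
    """Apply a Hadamard gate, turning a definite 0 or 1 into an equal superposition."""
    a0, a1 = qubit
    s = 1 / math.sqrt(2)
    return (s * (a0 + a1), s * (a0 - a1))

def amplitudes_needed(n_qubits):
    """Number of complex amplitudes needed to describe n qubits on a classical machine."""
    return 2 ** n_qubits

zero = (1 + 0j, 0 + 0j)   # behaves like a classical bit holding 0
both = hadamard(zero)     # now carries equal weight on 0 and on 1

print(both)
print([amplitudes_needed(n) for n in (1, 2, 10, 50)])
```

The second line of output is the interesting one. Describing 50 qubits classically takes 2 to the 50th power amplitudes, which is the sense in which a small amount of quantum hardware can stand in for an enormous number of classical memory locations.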

In his book, David Deutsch asks the very compelling question of where exactly are all of those computations being performed, if not in a huge number of parallel universes by a huge number of parallel quantum computers? That is why the Many-Worlds Interpretation of quantum mechanics seems so natural to those working on quantum computers. In fact, it is rather difficult to picture how a quantum computer could operate using the standard Copenhagen Interpretation of quantum mechanics. So let me refresh your memory on the Copenhagen Interpretation of quantum mechanics before proceeding.

In 1927, Niels Bohr and Werner Heisenberg proposed a very positivistic interpretation of quantum mechanics now known as the Copenhagen Interpretation. You see, Bohr was working at the University of Copenhagen Institute of Theoretical Physics at the time. The Copenhagen Interpretation contends that absolute reality does not really exist. Instead, there are an infinite number of potential realities, defined by the wavefunction ψ of a quantum system, and when we make a measurement of a quantum system, the wavefunction of the quantum system collapses into a single value that we observe, and thus brings the quantum system into reality (see Quantum Software for more on wavefunctions). This satisfied Max Born’s contention that wavefunctions are just probability waves. The Copenhagen Interpretation suffers from several philosophical problems though. For example, Eugene Wigner pointed out that the devices we use to measure quantum events are also made out of atoms which are quantum objects in themselves, so when an observation is made of a single atom of uranium to see if it has gone through a radioactive decay using a Geiger counter, the atomic quantum particles of the Geiger counter become entangled in a quantum superposition of states with the uranium atom. If the uranium has decayed, then the uranium atom and the Geiger counter are in one quantum state, and if the atom has not decayed, then the uranium atom and the Geiger counter are in a different quantum state. If the Geiger counter is fed into an amplifier, then we have to add in the amplifier too into our quantum superposition of states. If a physicist is patiently listening to the Geiger counter, we have to add him into the chain as well, so that he can write and publish a paper which is read by other physicists and is picked up by Time magazine for a popular presentation to the public. So when does the “measurement” actually take place? We seem to have an infinite regress. 
Wigner’s contention is that the measurement takes place when a conscious being first becomes aware of the observation. Einstein had a hard time with the Copenhagen Interpretation of quantum mechanics for this very reason because he thought that it verged upon solipsism. Solipsism is a philosophical idea from Ancient Greece. In solipsism, your Mind is the whole thing, and the physical Universe is just a figment of your imagination. So I would like to thank you very much for thinking of me and bringing me into existence! Einstein’s opinion of the Copenhagen Interpretation of quantum mechanics can best be summed up by his statement "Is it enough that a mouse observes that the Moon exists?". Einstein objected to the requirement for a conscious being to bring the Universe into existence, because in Einstein’s view, measurements simply revealed to us the condition of an already existing reality that does not need us around to make measurements in order to exist. But in the Copenhagen Interpretation, the absolute reality of Einstein does not really exist. Additionally, in the Copenhagen Interpretation, objects do not really exist until a measurement is taken, which collapses their associated wavefunctions, but the mathematics of quantum mechanics does not shed any light on how a measurement could collapse a wavefunction.

The collapse of the wavefunction is also a one-way street. According to the mathematics of quantum mechanics, a wavefunction changes with time in a deterministic manner, so, as in all of the other current effective theories of physics, its evolution is reversible in time and can be run backwards. This is also true in the Copenhagen Interpretation, so long as you do not observe the wavefunction and collapse it by the process of observing it. In the Copenhagen Interpretation, once you observe a wavefunction and collapse it, you cannot undo the collapse, so the process of observation becomes nonreversible in time. That means if you fire photons at a target, but do not observe them, it is possible to reverse them all in time and return the Universe back to its original state. That is how all of the other effective theories of physics currently operate. But in the Copenhagen Interpretation, if you do observe the outgoing photons you can never return the Universe back to its original state. This can best be summed up by the old quantum mechanical adage: look particle, don’t look wave. A good way to picture this in your mind is to think of a circular tub of water. If you drop a pebble into the exact center of a circular tub of water, a series of circular waves will propagate out from the center. Think of those waves as the wavefunction of an electron changing with time into the future according to the Schrödinger equation. When the circular waves hit the circular walls of the tub they will be reflected back to the center of the tub. Essentially, they can be viewed as moving backwards in time. This can happen in the Copenhagen Interpretation so long as the electron is never observed as its wavefunction moves forward or backward in time. However, if the wavefunction is observed and collapsed, it can never move backwards in time, so observation becomes a one-way street.
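The one-way street can be sketched in a few lines of Python. This is a toy under my own assumptions: the "wavefunction" is just two real amplitudes, the deterministic evolution is a simple rotation of those amplitudes, and the function names rotate and observe are hypothetical. The point is only that the rotation can be undone by rotating back, while the collapse throws away the information needed to recover the earlier state.

```python
import math
import random

def rotate(state, theta):
    """Deterministic, reversible evolution: rotate the two amplitudes by theta."""
    a0, a1 = state
    c, s = math.cos(theta), math.sin(theta)
    return (c * a0 - s * a1, s * a0 + c * a1)

def observe(state, rng):
    """Collapse: keep outcome 0 or 1 with probability |amplitude|^2, discard the rest."""
    p0 = abs(state[0]) ** 2
    return (1.0, 0.0) if rng.random() < p0 else (0.0, 1.0)

start = (1.0, 0.0)
evolved = rotate(start, 0.7)
restored = rotate(evolved, -0.7)                 # the same evolution run backwards
collapsed = observe(evolved, random.Random(0))   # seeded only to make runs repeatable

print(restored)    # agrees with start up to floating-point rounding
print(collapsed)   # one of the two definite outcomes; the angle 0.7 is gone
```

No rotation applied to the collapsed state can tell you that the state used to be rotated by 0.7, which is the sense in which observation is not reversible in this picture.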

In 1956, Hugh Everett, working on his Ph.D. under John Wheeler, proposed the Many-Worlds Interpretation of quantum mechanics as an alternative. The Many-Worlds Interpretation admits to an absolute reality but claims that there are an infinite number of absolute realities spread across an infinite number of parallel universes. In the Many-Worlds Interpretation, when electrons or photons encounter a two-slit experiment, they go through one slit or the other, and when they hit the projection screen they interfere with electrons or photons from other universes that went through the other slit! In Everett’s original version of the Many-Worlds Interpretation, the entire Universe splits into two distinct universes whenever a particle is faced with a choice of quantum states, and so all of these universes are constantly branching into an ever-growing number of additional universes. In the Many-Worlds Interpretation of quantum mechanics, the wavefunctions or probability clouds of electrons surrounding an atomic nucleus are the result of overlaying the images of many “real” electrons in many parallel universes. Thus, according to the Many-Worlds Interpretation, wavefunctions never collapse. They just deterministically evolve in an abstract mathematical Hilbert space and are reversible in time, like everything else in physics.

While doing research for The Software Universe as an Implementation of the Mathematical Universe Hypothesis I naturally consulted Max Tegmark’s HomePage at:

and I found a link there to Hugh Everett’s original 137-page Jan 1956 draft Ph.D. thesis in which he laid down the foundations for the Many-Worlds Interpretation. This is a rare document indeed because on March 1, 1957, Everett submitted a very compressed version of his theory in his final 36-page doctoral dissertation, "On the Foundations of Quantum Mechanics", after heavy editing by his thesis advisor John Wheeler to make his Ph.D. thesis more palatable to the committee that would be hearing his oral defense and also to not offend Niels Bohr, one of the founding fathers of the Copenhagen Interpretation and still one of its most prominent proponents. But years later John Wheeler really did want to know what Niels Bohr thought of Hugh Everett’s new theory and encouraged Everett to visit Copenhagen in order to meet with Bohr. Everett and his wife did finally travel to Copenhagen in March of 1959 and spent six weeks there. But by all accounts, the meeting between Bohr and Everett was a disaster, with Bohr not even discussing the Many-Worlds Interpretation with Everett.

Below is the link to Hugh Everett’s original 137-page Jan 1956 draft Ph.D. thesis:

I have also placed his thesis on Microsoft OneDrive at:

Since I love to read the original source documents for great ideas, like Copernicus’s On the Revolutions of the Celestial Spheres (1543), Galileo’s The Starry Messenger (1610) and Dialogue Concerning the Two Chief World Systems (1632), Newton’s Principia (1687), and Darwin’s On the Origin of Species (1859), I could not resist reading Hugh Everett’s original work too. So in this posting, I would like to step through Hugh Everett’s original Ph.D. thesis with you page by page, with a little translation along the way. To do that, let’s focus on the introduction and the concluding chapter of his original Ph.D. thesis, where he outlines what he is trying to achieve, and then skip over most of the math in the intervening chapters. For those chapters, I will only highlight his key findings as he builds his case for the Many-Worlds Interpretation.

For the remainder of this posting, direct quotes from Hugh Everett’s original Ph.D. thesis will be in blue, while my comments will be in black.

The Many-Worlds Interpretation


Hugh Everett, III


We begin, as a way of entering our subject, by characterizing a particular interpretation of quantum theory which, although not representative of the more careful formulations of some writers, is the most common form encountered in textbooks and university lectures on the subject.

With the very first sentence of his Ph.D. thesis, Hugh Everett throws down the gauntlet and begins by discussing the Copenhagen Interpretation of quantum mechanics and classifying it as not one of the more “careful formulations”. He is correct about the textbooks of the day exclusively teaching the Copenhagen Interpretation. I took my very first quantum mechanics course in 1970, and in those days the Copenhagen Interpretation was taught as a quantum mechanical fact. In fact, the textbooks of the day did not even refer to the idea of the act of measurement collapsing wavefunctions as the Copenhagen Interpretation because that would imply that other interpretations were even possible.

A physical system is described completely by a state function ψ, which is an element of a Hilbert space, and which furthermore gives information only concerning the probabilities of the results of various observations which can be made on the system. The state function ψ is thought of as objectively characterizing the physical system, i.e., at all times an isolated system is thought of as possessing a state function, independently of our state of knowledge of it. On the other hand, ψ changes in a causal manner so long as the system remains isolated, obeying a differential equation. Thus there are two fundamentally different ways in which the state function can change:

Hugh Everett begins his dissertation by stating what everybody already agrees upon in standard quantum mechanics. Every physical system, like a single electron, can be described by a wavefunction called ψ that is a solution to Schrödinger’s equation. Note that in his thesis Hugh Everett sometimes uses the term “state function” and sometimes the term “wave function”, rather than the term wavefunction ψ. All of these terms mean the same thing. They are just solutions to the Schrödinger wave equation, which Hugh Everett sometimes refers to simply as the “wave equation”. The wavefunction ψ is a wiggly line that extends over the whole Universe but has the greatest amplitude near where the electron is most likely to be found. The wavefunction ψ is also a complex function with both a real and imaginary part, so it has both an amplitude and a phase (see The Foundations of Quantum Computing for details).

The chief difference between quantum mechanics and classical mechanics is that in classical mechanics objects have definite properties, like a definite position or a definite velocity. This is not so in quantum mechanics. In quantum mechanics, objects can be in a mixture or superposition of states. For example, if you pin down the exact location of an electron in quantum mechanics, the electron is said to be in a certain state of position called an eigenstate and with a certain numerical position that is called an eigenvalue. In the Copenhagen Interpretation, the act of measurement takes an object that is in a superposition of states and collapses its wavefunction down into a particular eigenstate with a particular eigenvalue. And this is a totally probabilistic process. The wavefunction itself does not specifically say where the object is located in advance. The wavefunction just tells you the probability of observing specific eigenstates with specific eigenvalues, and this probability is obtained by finding the square of the wavefunction's amplitude at a given position. For example, observing a hydrogen atom which initially is in a superposition of many states might determine that the hydrogen atom is in its ground state eigenstate at a known energy level that is its energy eigenvalue.
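The rule that probabilities come from squared amplitudes is easy to sketch in Python. The function name born_probabilities and the particular amplitudes below are my own made-up example, not anything from Everett’s thesis. The energy eigenvalues use the familiar approximation of -13.6 eV divided by n squared for hydrogen.

```python
def born_probabilities(amplitudes):
    """Turn complex amplitudes into outcome probabilities |c|^2, normalized to sum to 1."""
    weights = [abs(c) ** 2 for c in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical hydrogen atom in a superposition of its first three energy eigenstates.
levels = [1, 2, 3]
amplitudes = [2 + 0j, 1j, 1 + 0j]   # deliberately unnormalized, with a phase on n=2

probabilities = born_probabilities(amplitudes)
for n, p in zip(levels, probabilities):
    energy_ev = -13.6 / n ** 2      # approximate energy eigenvalue in electron volts
    print(f"n={n}  E={energy_ev:.2f} eV  probability={p:.3f}")
```

Notice that the phase of an amplitude, like the factor of i on the n=2 term, drops out of the probability for a single measurement. Only the size of the amplitude matters here, although phases do matter when amplitudes are added together and interfere.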

Hugh Everett then goes on to define two ways the wavefunction ψ can change with time:

Process 1: A discontinuous change brought on by observation. In the standard Copenhagen Interpretation, this causes the wavefunction ψ of the electron, which is spread out over the entire Universe with decreasing amplitude as you get further away from where the electron is likely to be found, to suddenly collapse so that the amplitude of the wavefunction ψ of the electron becomes huge where the electron is observed.

Process 2: The electron is not observed, so its wavefunction ψ sort of smears out with time in a deterministic manner.
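To get a feel for how fast Process 2 smears things out, here is a small Python sketch using the standard textbook result for a free Gaussian wave packet, whose position spread grows as σ(t) = σ0 √(1 + (ħt / 2mσ0²)²). The assumptions are mine: a free, nonrelativistic electron, a Gaussian starting shape, and a starting spread of about one angstrom. The function name packet_width is hypothetical.

```python
import math

HBAR = 1.054571817e-34          # reduced Planck constant in joule-seconds
ELECTRON_MASS = 9.1093837e-31   # electron mass in kilograms

def packet_width(sigma0, t, mass=ELECTRON_MASS):
    """Position spread (meters) of a free Gaussian wave packet after t seconds."""
    return sigma0 * math.sqrt(1.0 + (HBAR * t / (2.0 * mass * sigma0 ** 2)) ** 2)

sigma0 = 1e-10   # assumed starting spread of about one angstrom
for t in (0.0, 1e-16, 1e-15, 1e-12):
    print(f"t = {t:.0e} s   spread = {packet_width(sigma0, t):.3e} m")
```

Under these assumptions, an unobserved electron that starts out pinned down to atomic dimensions spreads to several times that size within about a femtosecond, and it does so in a completely deterministic way. Nothing random happens until Process 1 is invoked.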

The question of the consistency of the scheme arises if one contemplates regarding the observer and his object-system as a single (composite) physical system. Indeed, the situation becomes quite paradoxical if we allow for the existence of more than one observer. Let us consider the case of one observer A, who is performing measurements upon a system S, the totality (A + S) in turn forming the object-system for another observer, B.

If we are to deny the possibility of B's use of a quantum mechanical description (wave function obeying wave equation) for A + S, then we must be supplied with some alternative description for systems which contain observers (or measuring apparatus). Furthermore, we would have to have a criterion for telling precisely what type of systems would have the preferred positions of "measuring apparatus" or "observer" and be subject to the alternate description. Such a criterion is probably not capable of rigorous formulation.

On the other hand, if we do allow B to give a quantum description to A + S, by assigning a state function ψ A+S, then, so long as B does not interact with A + S, its state changes causally according to Process 2, even though A may be performing measurements upon S. From B's point of view, nothing resembling Process 1 can occur (there are no discontinuities), and the question of the validity of A's use of Process 1 is raised. That is, apparently either A is incorrect in assuming Process 1, with its probabilistic implications, to apply to his measurements, or else B's state function, with its purely causal character, is an inadequate description of what is happening to A + S.

Basically, if I am observer A and I observe electron S using Process 1, according to the Copenhagen Interpretation I collapse the wavefunction ψ of the electron down to a single point in space. But I am just made up of a huge number of quantum particles too, just like the single electron that I observed. So if you, as observer B, do not watch me (as observer A) observing electron S, the wavefunction ψ A+S that describes me and the electron does not collapse and it continues to change in a deterministic manner according to Process 2. So either I, as observer A, do not really collapse the wavefunction of the electron with Process 1, or you, as observer B, do not let the combination of me and the electron evolve in time in an undisturbed manner according to Process 2. Thus, the Copenhagen Interpretation leads to a contradiction when more than one observer is involved.

To better illustrate the paradoxes which can arise from strict adherence to this interpretation we consider the following amusing, but extremely hypothetical drama. Isolated somewhere out in space is a room containing an observer, A, who is about to perform a measurement upon a system S. After performing his measurement he will record the result in his notebook. We assume that he knows the state function of S (perhaps as a result of previous measurement), and that it is not an eigenstate of the measurement he is about to perform. A, being an orthodox quantum theorist, then believes that the outcome of his measurement is undetermined and that the process is correctly described by Process 1.

In the meantime, however, there is another observer, B, outside the room, who is in possession of the state function of the entire room, including S, the measuring apparatus, and A, just prior to the measurement. B is only interested in what will be found in the notebook one week hence, so he computes the state function of the room for one week in the future according to Process 2. One week passes, and we find B still in possession of the state function of the room, which this equally orthodox quantum theorist believes to be a complete description of the room and its contents. If B's state function calculation tells beforehand exactly what is going to be in the notebook, then A is incorrect in his belief about the indeterminacy of the outcome of his measurement. We therefore assume that B's state function contains non-zero amplitudes over several of the notebook entries.

At this point, B opens the door to the room and looks at the notebook (performs his observation). Having observed the notebook entry, he turns to A and informs him in a patronizing manner that since his (B's) wave function just prior to his entry into the room, which he knows to have been a complete description of the room and its contents, had non-zero amplitude over other than the present result of the measurement, the result must have been decided only when B entered the room, so that A, his notebook entry, and his memory about what occurred one week ago had no independent objective existence until the intervention by B. In short, B implies that A owes his present objective existence to B's generous nature which compelled him to intervene on his behalf. However, to B's consternation, A does not react with anything like the respect and gratitude he should exhibit towards B, and at the end of a somewhat heated reply, in which A conveys in a colorful manner his opinion of B and his beliefs, he rudely punctures B's ego by observing that if B's view is correct, then he has no reason to feel complacent, since the whole present situation may have no objective existence, but may depend upon the future actions of yet another observer.

Clearly, in a Universe with more than one observer, the opening hypothesis of his thesis that wavefunctions change in time by either Process 1 or Process 2 cannot be right. Otherwise, nothing in the Universe would ever “really” happen until its very last sentient being took a peek into the room above and collapsed its very complicated wavefunction. Hugh Everett next proposes several alternative explanations.

It is now clear that the interpretation of quantum mechanics with which we began is untenable if we are to consider a universe containing more than one observer. We must therefore seek a suitable modification of this scheme, or an entirely different system of interpretation. Several alternatives which avoid the paradox are:

Alternative 1: To postulate the existence of only one observer in the universe. This is the solipsist position, in which each of us must hold the view that he alone is the only valid observer, with the rest of the universe and its inhabitants obeying at all times Process 2 except when under his observation.

This view is quite consistent, but one must feel uneasy when, for example, writing textbooks on quantum mechanics, describing Process 1, for the consumption of other persons to whom it does not apply.

Alternative 2: To limit the applicability of quantum mechanics by asserting that the quantum mechanical description fails when applied to observers, or to measuring apparatus, or more generally to systems approaching macroscopic size.

If we try to limit the applicability so as to exclude measuring apparatus, or in general systems of macroscopic size, we are faced with the difficulty of sharply defining the region of validity. For what n might a group of n particles be construed as forming a measuring device so that the quantum description fails? And to draw the line at human or animal observers, i.e., to assume that all mechanical apparata obey the usual laws, but that they are somehow not valid for living observers, does violence to the so-called principle of psycho-physical parallelism, and constitutes a view to be avoided, if possible. To do justice to this principle we must insist that we be able to conceive of mechanical devices (such as servomechanisms), obeying natural laws, which we would be willing to call observers.

Alternative 3: To admit the validity of the state function description, but to deny the possibility that B could ever be in possession of the state function of A + S. Thus one might argue that a determination of the state of A would constitute such a drastic intervention that A would cease to function as an observer.

The first objection to this view is that no matter what the state of A + S is, there is in principle a complete set of commuting operators for which it is an eigenstate, so that, at least, the determination of these quantities will not affect the state nor in any way disrupt the operation of A. There are no fundamental restrictions in the usual theory about the knowability of any state functions, and the introduction of any such restrictions to avoid the paradox must therefore require extra postulates.

The second objection is that it is not particularly relevant whether or not B actually knows the precise state function of A + S. If he merely believes that the system is described by a state function, which he does not presume to know, then the difficulty still exists. He must then believe that this state function changed deterministically, and hence that there was nothing probabilistic in A's determination.

Alternative 4: To abandon the position that the state function is a complete description of a system. The state function is to be regarded not as a description of a single system, but of an ensemble of systems, so that the probabilistic assertions arise naturally from the incompleteness of the description.

It is assumed that the correct complete description, which would presumably involve further (hidden) parameters beyond the state function alone, would lead to a deterministic theory, from which the probabilistic aspects arise as a result of our ignorance of these extra parameters in the same manner as in classical statistical mechanics.

Alternative 5: To assume the universal validity of the quantum description, by the complete abandonment of Process 1. The general validity of pure wave mechanics, without any statistical assertions, is assumed for all physical systems, including observers and measuring apparata. Observation processes are to be described completely by the state function of the composite system which includes the observer and his object-system, and which at all times obeys the wave equation (Process 2).

This brief list of alternatives is not meant to be exhaustive, but has been presented in the spirit of a preliminary orientation. We have, in fact, omitted one of the foremost interpretations of quantum theory, namely the position of Niels Bohr. The discussion will be resumed in the final chapter, when we shall be in a position to give a more adequate appraisal of the various alternate interpretations. For the present, however, we shall concern ourselves only with the development of Alternative 5.

It is evident that Alternative 5 is a theory of many advantages. It has the virtue of logical simplicity and it is complete in the sense that it is applicable to the entire universe. All processes are considered equally (there are no "measurement processes" which play any preferred role), and the principle of psycho-physical parallelism is fully maintained. Since the universal validity of the state function description is asserted, one can regard the state functions themselves as the fundamental entities, and one can even consider the state function of the whole universe. In this sense this theory can be called the theory of the "universal wave function", since all of physics is presumed to follow from this function alone. There remains, however, the question whether or not such a theory can be put into correspondence with our experience.

The present thesis is devoted to showing that this concept of a universal wave mechanics, together with the necessary correlation machinery for its interpretation, forms a logically self consistent description of a universe in which several observers are at work.

Alternative 5 is Hugh Everett’s Many-Worlds Interpretation of quantum mechanics. In this interpretation of quantum mechanics, he completely eliminates Process 1 as a way for wavefunctions to change with time. Instead, he plans to bring in the acts of measurement and observation under Process 2, and simply let the wavefunctions evolve with time according to the wave equation. In this interpretation of quantum mechanics, wavefunctions are the fundamental thing and provide all that can be known of the Universe. In fact, the whole Universe can be considered to be one single very complex wavefunction evolving with time. That is why he calls his theory the theory of the universal wavefunction.

We shall be able to introduce into the theory systems which represent observers. Such systems can be conceived as automatically functioning machines (servomechanisms) possessing recording devices (memory) and which are capable of responding to their environment. The behavior of these observers shall always be treated within the framework of wave mechanics. Furthermore, we shall deduce the probabilistic assertions of Process 1 as subjective appearances to such observers, thus placing the theory in correspondence with experience. We are then led to the novel situation in which the formal theory is objectively continuous and causal, while subjectively discontinuous and probabilistic. While this point of view thus shall ultimately justify our use of the statistical assertions of the orthodox view, it enables us to do so in a logically consistent manner, allowing for the existence of other observers. At the same time it gives a deeper insight into the meaning of quantized systems, and the role played by quantum mechanical correlations.

In order to bring about this correspondence with experience for the pure wave mechanical theory, we shall exploit the correlation between subsystems of a composite system which is described by a state function. A subsystem of such a composite system does not, in general, possess an independent state function. That is, in general a composite system cannot be represented by a single pair of subsystem states, but can be represented only by a superposition of such pairs of subsystem states…. there is no single state for Particle 1 alone or Particle 2 alone, but only the superposition of such cases.

In fact, to any arbitrary choice of state for one subsystem there will correspond a relative state for the other subsystem, which will generally be dependent upon the choice of state for the first subsystem, so that the state of one subsystem is not independent, but correlated to the state of the remaining subsystem. Such correlations between systems arise from interaction of the systems, and from our point of view all measurement and observation processes are to be regarded simply as interactions between observer and object-system which produce strong correlations.

Let one regard an observer as a subsystem of the composite system: observer + object-system. It is then an inescapable consequence that after the interaction has taken place there will not, generally, exist a single observer state. There will, however, be a superposition of the composite system states, each element of which contains a definite observer state and a definite relative object-system state. Furthermore, as we shall see, each of these relative object-system states will be, approximately, the eigenstates of the observation corresponding to the value obtained by the observer which is described by the same element of the superposition. Thus, each element of the resulting superposition describes an observer who perceived a definite and generally different result, and to whom it appears that the object-system state has been transformed into the corresponding eigenstate. In this sense the usual assertions of Process 1 appear to hold on a subjective level to each observer described by an element of the superposition. We shall also see that correlation plays an important role in preserving consistency when several observers are present and allowed to interact with one another (to "consult" one another) as well as with other object-systems.

In order to develop a language for interpreting our pure wave mechanics for composite systems we shall find it useful to develop quantitative definitions for such notions as the "sharpness" or "definiteness" of an operator A for a state ψ, and the "degree of correlation" between the subsystems of a composite system or between a pair of operators in the subsystems, so that we can use these concepts in an unambiguous manner. The mathematical development of these notions will be carried out in the next chapter (II) using some concepts borrowed from Information Theory. We shall develop there the general definitions of information and correlation, as well as some of their more important properties. Throughout Chapter II we shall use the language of probability theory to facilitate the exposition, and because it enables us to introduce in a unified manner a number of concepts that will be of later use. We shall nevertheless subsequently apply the mathematical definitions directly to state functions, by replacing probabilities by square amplitudes, without, however, making any reference to probability models.

Having set the stage, so to speak, with Chapter II, we turn to quantum mechanics in Chapter III. There we first investigate the quantum formalism of composite systems, particularly the concept of relative state functions, and the meaning of the representation of subsystems by noninterfering mixtures of states characterized by density matrices. The notions of information and correlation are then applied to quantum mechanics. The final section of this chapter discusses the measurement process, which is regarded simply as a correlation-inducing interaction between subsystems of a single isolated system. A simple example of such a measurement is given and discussed, and some general consequences of the superposition principle are considered.

This will be followed by an abstract treatment of the problem of Observation (Chapter IV). In this chapter we make use only of the superposition principle, and general rules by which composite system states are formed of subsystem states, in order that our results shall have the greatest generality and be applicable to any form of quantum theory for which these principles hold. (Elsewhere, when giving examples, we restrict ourselves to the non-relativistic Schrödinger Theory for simplicity.) The validity of Process 1 as a subjective phenomenon is deduced, as well as the consistency of allowing several observers to interact with one another.

Chapter V supplements the abstract treatment of Chapter IV by discussing a number of diverse topics from the point of view of the theory of pure wave mechanics, including the existence and meaning of macroscopic objects in the light of their atomic constitution, amplification processes in measurement, questions of reversibility and irreversibility, and approximate measurement.

The final chapter summarizes the situation, and continues the discussion of alternate interpretations of quantum mechanics.

With that, Hugh Everett ends the introduction of his thesis. Basically, he is proposing that Process 1, in which an external observer A observes a quantum system like an electron and causes a discontinuous change to the electron’s wavefunction ψ, is an illusion. Instead, the wavefunction of observer A becomes “correlated” with the wavefunction of the electron into a composite wavefunction of both observer A and the electron. This composite wavefunction then evolves in time according to the wave equation. Thus, Process 1 really does not exist. Everything in the Universe just evolves in time according to Process 2. In the following chapters, Hugh Everett goes on to explain how this evolution of a correlated composite wavefunction can produce all of the strange quantum mechanical things we observe in the lab.


The present chapter is devoted to the mathematical development of the concepts of information and correlation. As mentioned in the introduction we shall use the language of probability theory throughout this chapter to facilitate the exposition, although we shall apply the mathematical definitions and formulas in later chapters without reference to probability models. We shall develop our definitions and theorems in full generality, for probability distributions over arbitrary sets, rather than merely for distributions over real numbers, with which we are mainly interested at present. We take this course because it is as easy as the restricted development, and because it gives a better insight into the subject.

The first three sections develop definitions and properties of information and correlation for probability distributions over finite sets only. In section four the definition of correlation is extended to distributions over arbitrary sets, and the general invariance of the correlation is proved. Section five then generalizes the definition of information to distributions over arbitrary sets. Finally, as illustrative examples, sections seven and eight give brief applications to stochastic processes and classical mechanics, respectively.

Now the really heavy math begins once we leave the Introduction and proceed into the main body of Hugh Everett’s Ph.D. thesis. He first goes into the mathematics of probability distributions that would be familiar to anybody who deals with statistics. He is mainly concerned with joint probabilities and conditional distributions because he is focusing upon what happens when observer A observes electron S. He reminds the reader about what independence means because it is important to his proposal. If observer A never interacts with electron S it means that they are statistically independent and cannot affect each other, but if observer A does observe electron S it means they are correlated and that is a whole different story.

Independence means that the random variables take on values which are not influenced by the values of other variables with respect to which they are independent. That is, the conditional distribution of one of two independent variables, Y, conditioned upon the value x_i for the other, is independent of x_i, so that knowledge about one variable tells nothing of the other.

Next, he applies Claude Shannon’s concept of Information (1948) to the analysis (see Some More Information About Information for details) for joint distributions that are correlated and not independent. Remember, Claude Shannon’s formulation for the concept of Information hinges upon the amount of “surprise” there is in a signal composed of 1s and 0s, while in softwarephysics we use Leon Brillouin’s concept of Information as a form of negative entropy (see The Demon of Software for details). Hugh Everett goes on to conclude that if two things are not dependent upon each other, like me observing electron S1 and you observing a different electron S2, then the total amount of Information obtained is equal to the sum of the Information I get about electron S1 and the Information you get about electron S2.

For independent random variables X, Y, ... ,Z, the following relationship is easily proved:

(2.4) I_{XY...Z} = I_X + I_Y + ... + I_Z (X, Y, ..., Z independent),

so that the information of XY...Z is the sum of the individual quantities of information, which is in accord with our intuitive feeling that if we are given information about unrelated events, our total knowledge is the sum of the separate amounts of information. ….
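Equation (2.4) is easy to check with a few numbers. Here is a minimal Python sketch, assuming Hugh Everett's definition of the information of a distribution as the sum of P ln P (the negative of the Shannon entropy, in natural units). The function name `information` and the two example distributions are made up purely for illustration.

```python
import math

def information(probs):
    """Everett-style information: sum of P * ln(P), i.e. minus the Shannon entropy (in nats)."""
    return sum(p * math.log(p) for p in probs if p > 0)

# Two hypothetical, unrelated variables (illustrative numbers only).
p_x = [0.7, 0.3]
p_y = [0.5, 0.25, 0.25]

# For independent variables the joint distribution is the product of the marginals.
p_xy = [px * py for px in p_x for py in p_y]

i_x = information(p_x)
i_y = information(p_y)
i_xy = information(p_xy)

print(f"I_X + I_Y = {i_x + i_y:.6f}")
print(f"I_XY      = {i_xy:.6f}")
```

The two printed numbers agree to within floating-point rounding, which is all that (2.4) claims for the independent case.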

But what if there is some correlation when measuring two things that depend upon each other? Like measuring the market value of your home and your annual salary? Those two numbers are not independent of each other, so knowing one tells you something about the other.

….which we are told, the natural thing to do to arrive at a single number to measure the strength of correlation is to consider the expected change in information about X, given that we are to be told the value of Y. This quantity we call the correlation information, or for brevity, the correlation, of X and Y, and denote it by {X, Y}….

….Thus the correlation is symmetric between X and Y, and hence also equal to the expected change of information about Y given that we will be told the value of X. Furthermore, according to (3.3) the correlation corresponds precisely to the amount of "missing information" if we possess only the marginal distributions, i.e., the loss of information if we choose to regard the variables as independent.

Here he is asking us to suppose that we have a box full of interacting molecules and that we know the probability distributions for their individual positions and velocities. The information in those individual (marginal) distributions he calls the marginal information. But if molecule X bounces off molecule Y, then their positions and velocities will no longer be independent because molecule X has interacted with molecule Y, so the two molecules become correlated. Hugh Everett calls the information tied up with that correlation the correlation information {X,Y}, and it is symmetric: {X,Y} = {Y,X}. He then goes on to use his definition of correlation information to derive the conservation of information in classical mechanics (see The Demon of Software for details). Remember that physicists get very nervous about the idea of destroying information because then they cannot reverse the effective theories of physics in time.
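The symmetry of the correlation can also be checked numerically. The following Python sketch assumes the definition {X,Y} = I_XY - I_X - I_Y implied by the "missing information" remark above, and compares it with the expected change in information about one variable once the other is revealed. The joint distribution and the helper names `information`, `marginal` and `expected_gain` are hypothetical choices for illustration, not anything taken from the thesis.

```python
import math

def information(probs):
    """Everett-style information: sum of P * ln(P) (minus the Shannon entropy, in nats)."""
    return sum(p * math.log(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution of the variable at tuple position `axis` (0 for X, 1 for Y)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def expected_gain(joint, about, given):
    """Expected change in information about one variable once the other is revealed."""
    base = information(marginal(joint, about).values())
    gain = 0.0
    for g, pg in marginal(joint, given).items():
        conditional = [p / pg for key, p in joint.items() if key[given] == g]
        gain += pg * (information(conditional) - base)
    return gain

# Hypothetical joint distribution P(x, y) for two correlated binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

correlation = (information(joint.values())
               - information(marginal(joint, 0).values())
               - information(marginal(joint, 1).values()))

print(f"{{X,Y}} from I_XY - I_X - I_Y : {correlation:.6f}")
print(f"gain about X when told Y     : {expected_gain(joint, 0, 1):.6f}")
print(f"gain about Y when told X     : {expected_gain(joint, 1, 0):.6f}")
```

All three numbers come out the same, which is the symmetry {X,Y} = {Y,X} in action.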

we have proved that….and the total information is conserved.

Now it is known that the individual (marginal) position and momentum distributions tend to decay, except for rare fluctuations, into the uniform and Maxwellian distributions respectively, for which the classical entropy is a maximum. This entropy is, however, except for the factor of Boltzmann's constant, simply the negative of the marginal information

(7.4) I_marginal = I_{X_1} + I_{Y_1} + I_{Z_1} + ... + I_{p_{x_n}} + I_{p_{y_n}} + I_{p_{z_n}}

which thus tends towards a minimum. But this decay of marginal information is exactly compensated by an increase of the total correlation information

(7.5) {total} = I_total - I_marginal

since the total information remains constant. Therefore, if one were to define the total entropy to be the negative of the total information, one could replace the usual second law of thermodynamics by a law of conservation of total entropy, where the increase in the standard (marginal) entropy is exactly compensated by a (negative) correlation entropy. The usual second law then results simply from our renunciation of all correlation knowledge (stosszahlansatz), and not from any intrinsic behavior of classical systems. The situation for classical mechanics is thus in sharp contrast to that of stochastic processes, which are intrinsically irreversible.

Thus Hugh Everett ends Chapter II of his thesis by deriving a concept of Information that is very similar to Leon Brillouin’s concept of Information as being a form of negative entropy. Remember, Hugh Everett defined marginal information as the information about the individual molecules, and {total} as the total amount of correlation information that is created by the molecules bouncing off each other and becoming correlated. So we can rewrite equation (7.5) as:

I_total = I_correlation + I_marginal

which says that the total amount of Information in classical mechanics does not change and that information is conserved. For example, let’s say you start off with a box that initially only has molecules on the left side of the box. So you begin with lots of marginal information about the individual molecules because you know they are all on the left side of the box. But as time progresses, the molecules will bounce off each other and begin to scatter into the right side of the box, until the molecules finally become smoothly spread throughout the entire box, as the second law of thermodynamics predicts. In the process, the marginal information of the molecules will decrease with time, but at the same time, the correlation information of the molecules will increase as the molecules bounce off each other and become correlated, so the total amount of information remains constant.
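To see this bookkeeping at work, here is a toy Python sketch. It is only a discrete stand-in and not Hugh Everett's own phase-space argument: two coordinates each take one of four cells, and a made-up reversible "collision" rule (x, y) -> (x, (x + y) mod N) merely relabels the joint states. The names `collide` and `report`, the cell count and the starting probabilities are all assumptions chosen for illustration.

```python
import math
from itertools import product

def information(probs):
    """Everett-style information: sum of P * ln(P) (minus the Shannon entropy, in nats)."""
    return sum(p * math.log(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution of the coordinate at tuple position `axis`."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def collide(joint, n):
    """Toy reversible 'collision': (x, y) -> (x, (x + y) mod n). It only relabels joint states."""
    return {(x, (x + y) % n): p for (x, y), p in joint.items()}

def report(joint):
    """Return (I_total, I_marginal, correlation) for a two-coordinate joint distribution."""
    total = information(joint.values())
    marg = sum(information(marginal(joint, axis).values()) for axis in (0, 1))
    return total, marg, total - marg

N = 4  # number of cells per coordinate (hypothetical)
start = [0.85, 0.05, 0.05, 0.05]  # each coordinate starts sharply concentrated in one cell
joint = {(x, y): start[x] * start[y] for x, y in product(range(N), repeat=2)}

before = report(joint)
after = report(collide(joint, N))
for label, (total, marg, corr) in (("before", before), ("after", after)):
    print(f"{label}: I_total={total:.4f}  I_marginal={marg:.4f}  correlation={corr:.4f}")
```

Because the collision rule is a one-to-one relabeling, I_total does not move, while the marginal information drops and the correlation rises by exactly the same amount.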


Having mathematically formulated the ideas of information and correlation for probability distributions, we turn to the field of quantum mechanics. In this chapter we assume that the states of physical systems are represented by points in a Hilbert space, and that the time dependence of the state of an isolated system is governed by a linear wave equation.

It is well known that state functions lead to distributions over eigenvalues of Hermitian operators (square amplitudes of the expansion coefficients of the state in terms of the basis consisting of eigenfunctions of the operator) which have the mathematical properties of probability distributions (non-negative and normalized). The standard interpretation of quantum mechanics regards these distributions as actually giving the probabilities that the various eigenvalues of the operator will be observed, when a measurement represented by the operator is performed.

A feature of great importance to our interpretation is the fact that a state function of a composite system leads to joint distributions over subsystem quantities, rather than independent subsystem distributions, i.e., the quantities in different subsystems may be correlated with one another. The first section of this chapter is accordingly devoted to the development of the formalism of composite systems, and the connection of composite system states and their derived joint distributions with the various possible subsystem conditional and marginal distributions. We shall see that there exist relative state functions which correctly give the conditional distributions for all subsystem operators, while marginal distributions can not generally be represented by state functions, but only by density matrices.

In Section 2 the concepts of information and correlation, developed in the preceding chapter, are applied to quantum mechanics, by defining information and correlation for operators on systems with prescribed states. It is also shown that for composite systems there exists a quantity which can be thought of as the fundamental correlation between subsystems, and a closely related canonical representation of the composite system state. In addition, a stronger form of the uncertainty principle, phrased in information language, is indicated.

The third section takes up the question of measurement in quantum mechanics, viewed as a correlation producing interaction between physical systems. A simple example of such a measurement is given and discussed. Finally some general consequences of the superposition principle are considered.

Hugh Everett then goes on to discuss composite systems consisting of several parts. For example, suppose we have a system S composed of two electrons S1 and S2.

….It is well known that if the states of a pair of systems S1 and S2 are represented by points in Hilbert spaces H1 and H2 respectively, then the states of the composite system S = S1 + S2 (the two systems S1 and S2 regarded as a single system S) are represented correctly by points of the direct product of H1 and H2….

After a great deal of math, Hugh Everett concludes:

….Therefore there exists in general no state for S1 which correctly gives the marginal expectations for all operators in S1….

However, even though there is generally no single state describing marginal expectations, we see that there is always a mixture of states, .... which does yield the correct expectations.

This means that when two systems S1 and S2 interact with each other, there is no wavefunction for just S1 or S2 alone, but there is a wavefunction for the composite system of S1 and S2 together. That means that when you as system S1 observe an electron S2 there are no longer separate wavefunctions for you and the electron. Instead, you and the electron become entangled into a single composite wavefunction for both you and the electron. He goes on to summarize this as:

In summary, we have seen in this section that a state of a composite system leads to joint distributions over subsystem quantities which are generally not independent. Conditional distributions and expectations for subsystems are obtained from relative states, and subsystem marginal distributions and expectations are given by density matrices.

There does not, in general, exist anything like a single state for one subsystem of a composite system. That is, subsystems do not possess states independent of the states of the remainder of the system, so that the subsystem states are generally correlated. One can arbitrarily choose a state for one subsystem, and be led to the relative state for the other subsystem. Thus we are faced with a fundamental relativity of states, which is implied by the formalism of composite systems. It is meaningless to ask the absolute state of a subsystem - one can only ask the state relative to a given state of the remainder of the system….
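A small numerical example may help here. The Python sketch below assumes the simplest possible composite system, two two-state subsystems, and computes the density matrix of S1 by summing over the states of S2. The purity Tr(ρ²) equals 1 only when S1 can be described by a single state of its own. The function names and the two example states are illustrative assumptions.

```python
import math

def reduced_density_matrix(psi):
    """Density matrix of S1 after summing over S2; psi[i][k] is the amplitude of |i>|k>."""
    return [[sum(psi[i][k] * psi[j][k].conjugate() for k in range(2)) for j in range(2)]
            for i in range(2)]

def purity(rho):
    """Tr(rho^2): equals 1 only when rho describes a single (pure) state."""
    return sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2)).real

s = 1 / math.sqrt(2)

# Product state: (|0> + |1>)/sqrt(2) for S1 times |0> for S2. S1 has a state of its own.
product_state = [[complex(s), 0j], [complex(s), 0j]]

# Correlated state: (|0>|0> + |1>|1>)/sqrt(2). There is no single state for S1 alone.
entangled_state = [[complex(s), 0j], [0j, complex(s)]]

for name, psi in (("product", product_state), ("entangled", entangled_state)):
    rho = reduced_density_matrix(psi)
    print(f"{name}: purity of S1 = {purity(rho):.3f}")
```

For the product state the purity is 1, while for the correlated state it falls to 0.5, so S1 can only be described by a mixture (a density matrix) and not by a state function.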

Next, he discusses the marginal information of individual particles and the correlated information due to the particles interacting with each other by bouncing off each other in terms of operators acting upon their wavefunctions ψ. Remember in quantum mechanics, the wavefunction ψ of a particle is the whole deal and contains all of the information there is about the particle, like its position and velocity. In quantum mechanics that information is determined by applying mathematical operators to the wavefunction ψ. For example, if you want to know how much energy a particle has, there is a mathematical operator that you can apply to its wavefunction ψ that will give you an actual number. It’s like if you want to know how much money somebody has on them, you can apply an operation to them that frisks them down and checks all of their pockets for wallets, billfolds, loose bills and change and then adds it all up.

We wish to be able to discuss information and correlation for Hermitian operators A, B, ... , with respect to a state function ψ. These quantities are to be computed, through the formulas of the preceding chapter, from the square amplitudes of the coefficients of the expansion of ψ in terms of the eigenstates of the operators.

Finally, Hugh Everett notes that quantum mechanics is very much like classical mechanics with regard to the relationship between entropy and the Information that is obtained by applying the above mathematical operators. Remember, in classical mechanics, we saw that the correlation information is the information that arises from particles interacting with each other by bouncing off each other. The chief difference for quantum mechanics is that instead of particles bouncing off each other, we have mathematical operators operating on their wavefunctions:

….It is also interesting to note that the quantity - Trace(ρ ln ρ) is (apart from a factor of Boltzmann's constant) just the entropy of a mixture of states characterized by the density matrix ρ. Therefore the entropy of the mixture characteristic of a subsystem S1 for the state ψ^S = ψ^{S1+S2} is exactly matched by a correlation information {S1, S2}, which represents the correlation between any pair of operators A, B, which define the canonical representation. The situation is thus quite similar to that of classical mechanics.
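This matching of entropy and correlation can be illustrated with a two-term state. The Python sketch below assumes a composite state already written in canonical form, a|0>|0> + b|1>|1>, so that the density matrix of S1 is diagonal with entries a² and b². The mixing angle `theta` and the variable names are arbitrary choices for illustration.

```python
import math

def information(probs):
    """Everett-style information: sum of P * ln(P) (minus the Shannon entropy, in nats)."""
    return sum(p * math.log(p) for p in probs if p > 0)

theta = 0.6  # hypothetical mixing angle; other values between 0 and pi/2 behave the same way
a, b = math.cos(theta), math.sin(theta)

# In canonical form the results for A (on S1) and B (on S2) always match.
joint = {(0, 0): a * a, (1, 1): b * b}
p_a = [a * a, b * b]  # marginal for A, also the diagonal of the density matrix of S1
p_b = [a * a, b * b]  # marginal for B

correlation = information(joint.values()) - information(p_a) - information(p_b)
entropy_s1 = -information(p_a)  # -Trace(rho ln rho) for the diagonal rho of S1

print(f"correlation {{A,B}} = {correlation:.6f}")
print(f"entropy of S1     = {entropy_s1:.6f}")
```

The two printed values coincide, which is the sense in which the entropy of the subsystem mixture is "exactly matched" by the correlation information.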

Next, Hugh Everett takes up the thorny issue of measurement in quantum mechanics. Recall that in the Copenhagen Interpretation, measuring the position of an electron causes its wavefunction ψ to mysteriously collapse to a single point, and that is where you will find the electron.

We now consider the question of measurement in quantum mechanics, which we desire to treat as a natural process within the theory of pure wave mechanics. From our point of view there is no fundamental distinction between "measuring apparata" and other physical systems. For us, therefore, a measurement is simply a special case of interaction between physical systems - an interaction which has the property of correlating a quantity in one subsystem with a quantity in another….

….Nearly every interaction between systems produces some correlation however. Suppose that at some instant a pair of systems are independent, so that the composite system state function is a product of subsystem states ψ^S = ψ^{S1} ψ^{S2}. Then this condition obviously holds only instantaneously if the systems are interacting - the independence is immediately destroyed and the systems become correlated. We could, then, take the position that the two interacting systems are continually "measuring" one another, if we wished….

….Suppose that we have a system of only one coordinate, q, (such as position of a particle), and an apparatus of one coordinate r (for example the position of a meter needle)….

….This principle has the far reaching implication that for any possible measurement, for which the initial system state is not an eigenstate, the resulting state of the composite system leads to no definite system state nor any definite apparatus state. The system will not be put into one or another of its eigenstates with the apparatus indicating the corresponding value, and nothing resembling Process 1 can take place….

….Thus in general after a measurement has been performed there will be no definite system state nor any definite apparatus state, even though there is a correlation. It seems as though nothing can ever be settled by such a measurement. Furthermore this result is independent of the size of the apparatus, and remains true for apparatus of quite macroscopic dimensions….

Suppose, for example, that we coupled a spin measuring device to a cannonball, so that if the spin is up the cannonball will be shifted one foot to the left, while if the spin is down it will be shifted an equal distance to the right. If we now perform a measurement with this arrangement upon a particle whose spin is a superposition of up and down, then the resulting total state will also be a superposition of two states, one in which the cannonball is to the left, and one in which it is to the right. There is no definite position for our macroscopic cannonball!

This behavior seems to be quite at variance with our observations, since macroscopic objects always appear to us to have definite positions. Can we reconcile this prediction of the purely wave mechanical theory with experience, or must we abandon it as untenable? In order to answer this question we must consider the problem of observation itself within the framework of the theory.

To understand the above section we need a little background in experimental physics. Electrons have a quantum mechanical property called spin. You can think of an electron’s spin like the electron has a little built-in magnet. In fact, it is the spin of the little electron magnets that adds up to make the real magnets that you put on your refrigerator. When you throw an electron through a distorted magnetic field that is pointing up, the electron will pop out in one of two states. It will either be aligned with the magnetic field (called spin-up) or it will be pointing 180° away, in the opposite direction of the magnetic field (called spin-down). Both the spin-up and spin-down conditions are called eigenstates. Prior to the observation of the electron’s spin, the electron is in a superposition of states and is not in an eigenstate. Now if the electron in the eigenstate of spin-up is sent through the same magnetic field again, it will be found to pop out in the eigenstate of spin-up again. Similarly, a spin-down electron that is sent through the magnetic field again will also pop out as a spin-down electron. Now here is the strange part. If you rotate the magnetic field by 90° and send spin-up electrons through it, 50% of the electrons will pop out with a spin pointing to the left, and 50% will pop out with a spin pointing to the right. And you cannot predict in advance which way a particular spin-up electron will pop out. It might spin to the left, or it might spin to the right. The same goes for the spin-down electrons – 50% will pop out spinning to the left and 50% will pop out spinning to the right.

Figure 1 - In the Stern-Gerlach experiment we shoot electrons through a distorted magnetic field. Classically, we would expect the electrons to be spinning in random directions and the magnetic field should deflect them in random directions, creating a smeared out spot on the screen. Instead, we see that the act of measuring the spins of the electrons puts them into eigenstates with eigenvalues of spin-up or spin-down and the electrons are either deflected up or down. If we rotate the magnets by 90°, we find that the electrons are deflected to the right or to the left.
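The 100% and 50% figures quoted above follow from the standard rule that the probability of an outcome is the squared magnitude of the overlap between the outcome state and the incoming state. Here is a minimal Python sketch using two-component spin states. Which of the two rotated states gets called "left" and which "right" is an arbitrary labeling on my part, and the function name `probability` is just for illustration.

```python
import math

def probability(outcome, state):
    """Born-rule probability |<outcome|state>|^2 for two-component spin states."""
    amplitude = sum(o.conjugate() * s for o, s in zip(outcome, state))
    return abs(amplitude) ** 2

r = 1 / math.sqrt(2)
up, down = (1 + 0j, 0j), (0j, 1 + 0j)               # eigenstates for the original field direction
left, right = (r + 0j, r + 0j), (r + 0j, -r + 0j)   # eigenstates after rotating the field by 90 degrees

print(f"spin-up sent through the same field again:  P(up)    = {probability(up, up):.2f}")
print(f"spin-up sent through the rotated field:     P(left)  = {probability(left, up):.2f}")
print(f"spin-up sent through the rotated field:     P(right) = {probability(right, up):.2f}")
print(f"spin-down sent through the rotated field:   P(left)  = {probability(left, down):.2f}")
```

A spin-up electron is certain to come out spin-up again in the same field, but splits evenly between left and right once the field is rotated.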

In the above section, Hugh Everett is proposing that when a device, like our magnets above, measures the spin of an electron that is in an unknown state, and not in a spin-up or spin-down eigenstate, the device does not put the electron into a spin-up or spin-down eigenstate as the Copenhagen Interpretation maintains. Instead, the device and the electron enter into a correlated composite system state or combined wavefunction with an indeterminate spin of the electron.
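The cannonball example can be mimicked in a few lines of Python. This is only a sketch under simplifying assumptions: the composite state is stored as a dictionary of amplitudes keyed by (spin, pointer), and the measurement coupling is a made-up linear rule that sends "up" to a pointer reading of "left" and "down" to "right". The names `measure`, `is_product` and `RECORD` are hypothetical.

```python
import math

# Pointer reading produced for each spin value: the cannonball shifts left for up, right for down.
RECORD = {"up": "left", "down": "right"}

def measure(state):
    """Linear measurement coupling: the pointer records the spin and nothing collapses.

    `state` maps (spin, pointer) pairs to amplitudes, with the pointer starting at "ready".
    The rule is fixed on basis states and extended to superpositions by linearity.
    """
    return {(spin, RECORD[spin]): amp for (spin, _pointer), amp in state.items()}

def is_product(state):
    """A spin-plus-pointer state factorizes only if its 2x2 amplitude table has zero determinant."""
    table = [[state.get((s, p), 0.0) for p in ("left", "right")] for s in ("up", "down")]
    det = table[0][0] * table[1][1] - table[0][1] * table[1][0]
    return abs(det) < 1e-12

r = 1 / math.sqrt(2)
eigenstate_in = {("up", "ready"): 1.0}
superposition_in = {("up", "ready"): r, ("down", "ready"): r}

print(measure(eigenstate_in), "product state:", is_product(measure(eigenstate_in)))
print(measure(superposition_in), "product state:", is_product(measure(superposition_in)))
```

When the spin starts in an eigenstate, the result is an ordinary product state with a definite pointer reading. When it starts in a superposition, the result does not factorize, so neither the spin nor the pointer has a definite state of its own.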

In the next chapter Hugh Everett explains how this new worldview can be used to explain what we observe in the lab. In fact, he will propose that from the perspective of the measuring magnets and the electron, two independent observational histories will emerge, one with the measuring magnets finding a spin-up electron and one with the measuring magnets finding a spin-down electron, and both of these will be just as “real” as the other. For them, the Universe has essentially split in two, with each set in its own Universe. That is where the “Many-Worlds” in the Many-Worlds Interpretation of quantum mechanics comes from.


We shall now give an abstract treatment of the problem of observation. In keeping with the spirit of our investigation of the consequences of pure wave mechanics we have no alternative but to introduce observers, considered as purely physical systems, into the theory.

We saw in the last chapter that in general a measurement (coupling of system and apparatus) had the outcome that neither the system nor the apparatus had any definite state after the interaction - a result seemingly at variance with our experience. However, we do not do justice to the theory of pure wave mechanics until we have investigated what the theory itself says about the appearance of phenomena to observers, rather than hastily concluding that the theory must be incorrect because the actual states of systems as given by the theory seem to contradict our observations.

Recall that in Chapter III Hugh Everett demonstrated that when an observer O observes an electron, the wavefunction ψ of observer O and of whatever apparatus is used to observe the electron becomes entangled or “correlated” with the wavefunction ψ of the electron into a total state function ψ of the observer and the electron together, and that neither the observer O nor the electron has a separate wavefunction after the observation is made.

We shall see that the introduction of observers can be accomplished in a reasonable manner, and that the theory then predicts that the appearance of phenomena, as the subjective experience of these observers, is precisely in accordance with the predictions of the usual probabilistic interpretation of quantum mechanics.

We are faced with the task of making deductions about the appearance of phenomena on a subjective level, to observers which are considered as purely physical systems and are treated within the theory. In order to accomplish this it is necessary to identify some objective properties of such an observer (states) with subjective knowledge (i.e., perceptions). Thus, in order to say that an observer O has observed the event a, it is necessary that the state of O has become changed from its former state to a new state which is dependent upon a.

It will suffice for our purposes to consider our observers to possess memories (i.e., parts of a relatively permanent nature whose states are in correspondence with the past experience of the observer). In order to make deductions about the subjective experience of an observer it is sufficient to examine the contents of the memory.

As models for observers we can, if we wish, consider automatically functioning machines, possessing sensory apparata and coupled to recording devices capable of registering past sensory data and machine configurations. We can further suppose that the machine is so constructed that its present actions shall be determined not only by its present sensory data, but by the contents of its memory as well. Such a machine will then be capable of performing a sequence of observations (measurements), and furthermore of deciding upon its future experiments on the basis of past results. We note that if we consider that current sensory data, as well as machine configuration, is immediately recorded in the memory, then the actions of the machine at a given instant can be regarded as a function of the memory contents only, and all relevant experience of the machine is contained in the memory.

Now, remember this is 1956! There really weren’t many computers running around in 1956. I know because I was there. Still, Hugh Everett is now proposing to take human observers out of the equation and replace them with computers using Artificial Intelligence instead. This is a wise move because human observers use consciousness to record observations, and we still do not understand what consciousness is. By taking human observers out of the analysis he avoids that complication. No wonder that the Many-Worlds Interpretation seems to naturally lend itself to quantum computers. Computers were part of the analysis from the very beginning.

For such machines we are justified in using such phrases as "the machine has perceived A" or "the machine is aware of A" if the occurrence of A is represented in the memory, since the future behavior of the machine will be based upon the occurrence of A. In fact, all of the customary language of subjective experience is quite applicable to such machines, and forms the most natural and useful mode of expression when dealing with their behavior, as is well known to individuals who work with complex automata.

When dealing quantum mechanically with a system representing an observer we shall ascribe a state function, ψO to it. When the State ψO describes an observer whose memory contains representations of the events A,B, ... ,C we shall denote this fact by appending the memory sequence in brackets as a subscript, writing:

ψO[A,B, ... ,C]

The symbols A,B, ... ,C, which we shall assume to be ordered time wise, shall therefore stand for memory configurations which are in correspondence with the past experience of the observer. These configurations can be thought of as punches in a paper tape, impressions on a magnetic reel, configurations of a relay switching circuit, or even configurations of brain cells. We only require that they be capable of the interpretation "The observer has experienced the succession of events A,B, ... ,C." (We shall sometimes write dots in a memory sequence, [... A,B, ... ,C], to indicate the possible presence of previous memories which are irrelevant to the case being considered.)

Our problem is, then, to treat the interaction of such observer-systems with other physical systems (observations), within the framework of wave mechanics, and to deduce the resulting memory configurations, which we can then interpret as the subjective experiences of the observers.

The machine with Artificial Intelligence is going to make a series of observations A, B, C…. and record them in its memory. Hugh Everett concludes with this summary:

In the language of subjective experience, the observer which is described by a typical element, ψ'i,j...k, of the superposition has perceived an apparently random sequence of definite results for the observations. It is furthermore true, since in each element the system has been left in an eigenstate of the measurement, that if at this stage a redetermination of an earlier system observation Sl takes place, every element of the resulting final superposition will describe the observer with a memory configuration of the form [..., ai1, ..., ajl, ..., akr, ajl] in which the earlier memory coincides with the later – i.e., the memory states are correlated. It will thus appear to the observer which is described by a typical element of the superposition that each initial observation on a system caused the system to "jump" into an eigenstate in a random fashion and thereafter remain there for subsequent measurements on the same system. Therefore, qualitatively, at least, the probabilistic assertions of Process 1 appear to be valid to the observer described by a typical element of the final superposition.

So when you throw an electron through a nonuniform magnetic field, the machine with Artificial Intelligence is going to record that the electron randomly “jumps” into a spin-up eigenstate or a spin-down eigenstate and then continues to remain a spin-up or spin-down electron. Hugh Everett then proceeds to summarize all of this and explain how such an observer O that becomes entangled or “correlated” with an electron will leave behind in its memory a sequence of events that is exactly what we observe in the lab. The electron will seem to behave in a random manner until it is observed and put into a particular eigenstate, and then the electron will remain in that eigenstate until it is perturbed again. Since the latest observation supplies all of the possible information about the relative system state of the observer O and the electron, and previous observations are not correlated with it, the Heisenberg Uncertainty Principle is not violated either. If the observer O measures the electron’s velocity, a further measurement of its position will blur its velocity.
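For IT professionals, the idea may be easier to see in a few lines of code. Below is a minimal sketch, not Hugh Everett's own formalism, in which the combined state of the electron and the machine is a Python dictionary mapping each branch (spin, memory) to an amplitude. The function name measure and the dictionary layout are my own assumptions for illustration. An observation appends the observed spin to the memory in every branch, and nothing is ever collapsed or thrown away.

```python
import math

def measure(state):
    """Everett-style observation (illustrative): in every branch, append the
    electron's spin to the observer's memory. Amplitudes are left unchanged
    and no branch is discarded."""
    return {(spin, memory + (spin,)): amp for (spin, memory), amp in state.items()}

# Electron in an equal superposition of spin-up and spin-down; empty memory.
amp = 1 / math.sqrt(2)
state = {("up", ()): amp, ("down", ()): amp}

state = measure(state)  # first observation
state = measure(state)  # redetermination of the same quantity

# Memory contents and squared amplitude (weight) of each branch.
memories = sorted(memory for (_, memory) in state)
weights = {memory: abs(a) ** 2 for (_, memory), a in state.items()}

for memory in memories:
    print(memory, weights[memory])
```

Each branch ends up with a memory in which the second record repeats the first, so from inside any one branch it looks as if the electron “jumped” once and then stayed put.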

We can therefore summarize the situation for an arbitrary sequence of observations, upon the same or different systems in any order, and for which the number of observations of each quantity in each system is very large, with the following result:

Except for a set of memory sequences of measure nearly zero, the averages of any functions over a memory sequence can be calculated approximately by the use of the independent probabilities given by Process 1 for each initial observation, on a system, and by the use of the transition probabilities (2.23) for succeeding observations upon the same system. In the limit, as the number of all types of observations goes to infinity the calculation is exact, and the exceptional set has measure zero.

This prescription for the calculation of averages over memory sequences by probabilities assigned to individual elements is precisely that of the orthodox theory (Process 1). Therefore all predictions of the usual theory will appear to be valid to the observer in almost all observer states, since these predictions hold for almost all memory sequences.

In particular, the uncertainty principle is never violated, since, as above, the latest measurement upon a system supplies all possible information about the relative system state, so that there is no direct correlation between any earlier results of observation on the system, and the succeeding observation. Any observation of a quantity B, between two successive observations of quantity A (all on the same system) will destroy the one-one correspondence between the earlier and later memory states for the result of A. Thus for alternating observations of different quantities there are fundamental limitations upon the correlations between memory states for the same observed quantity, these limitations expressing the content of the uncertainty principle.

In conclusion, we have described in this section processes involving an idealized observer, processes which are entirely deterministic and continuous from the over-all viewpoint (the total state function is presumed to satisfy a wave equation at all times) but whose result is a superposition, each element of which describes the observer with a different memory state. We have seen that in almost all of these observer states it appears to the observer that the probabilistic aspects of the usual form of quantum theory are valid. We have thus seen how pure wave mechanics, without any initial probability assertions, can lead to these notions on a subjective level, as appearances to observers.

So if an observer throws lots of electrons through a nonuniform magnetic field the observer will perceive the electrons popping out randomly in spin-up and spin-down eigenstates, but what really is happening is that the observer and the electrons are splitting off into their own universes each time an electron goes through the magnetic field. One observer-electron pair splits off into a spin-up universe, while another observer-electron pair splits off into a spin-down universe.
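Hugh Everett's claim that the exceptional memory sequences have a measure approaching zero can be illustrated numerically. The sketch below is my own illustration under simple assumptions: each electron has a squared amplitude p of being recorded as spin-up, and the weight of a memory sequence is the product of the squared amplitudes of its records. The function name branch_weight_outside and the tolerance eps are hypothetical names chosen for this example.

```python
import math

def branch_weight_outside(n, p, eps):
    """Total squared amplitude of all memory sequences of length n whose
    fraction of 'up' records differs from p by more than eps."""
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) > eps:
            # comb(n, k) sequences contain exactly k 'up' records, and each
            # such sequence has weight p**k * (1 - p)**(n - k).
            total += math.comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

w10 = branch_weight_outside(10, 0.5, 0.1)
w100 = branch_weight_outside(100, 0.5, 0.1)
w1000 = branch_weight_outside(1000, 0.5, 0.1)

for n, w in ((10, w10), (100, w100), (1000, w1000)):
    print(n, w)
```

As the number of observations grows, the combined weight of the branches in which the observer records "wrong" statistics shrinks toward zero, which is the sense in which almost all observers see the usual probabilities.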

3 Several Observers

We shall now consider the consequences of our scheme when several observers are allowed to interact with the same systems, as well as with one another (communication). In the following discussion observers shall be denoted by O1, O2, ..., other systems by S1, S2, ..., and observables by operators A, B, C, ....

We shall also wish to allow communication among the observers, which we view as an interaction by means of which the memory sequences of different observers become correlated. (For example, the transfer of impulses from the magnetic tape memory of one mechanical observer to that of another constitutes such a transfer of information.)

Case 1: We allow two observers to separately observe the same quantity in a system, and then compare results.

After a bit of math, he concludes:

This means that observers who have separately observed the same quantity will always agree with each other.

For example, suppose a spin-up electron pops out of our measuring magnets and observer O1 measures it with a set of magnets and finds it to be a spin-up electron. If observer O2 observes the very same spin-up electron, he will also measure it to be a spin-up electron, and when observers O1 and O2 compare results they will agree. This agrees with what the Copenhagen Interpretation predicts.

Case 2: We allow two observers to measure separately two different, noncommuting quantities in the same system.

For Case 2 we could have one observer measure the spin of an electron along one axis, while the other observer measures the spin of the electron along a perpendicular axis. The components of an electron's spin along two different axes are noncommuting quantities. Noncommuting quantities are quantities that fall under the Heisenberg Uncertainty Principle, like the position and velocity of the electron, where the measurement of one affects the measurement of the other. For this case, Hugh Everett mathematically demonstrates that again the same results are obtained as predicted by the Copenhagen Interpretation.
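Whether two quantities commute can be checked directly from the operators that represent them. As a minimal sketch, I use the standard 2x2 Pauli matrices for spin measured along the z-axis and along the x-axis as my own choice of illustration. The helper names matmul and commutator are hypothetical and written out in plain Python so that nothing beyond the standard language is needed.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba. A zero matrix means the two quantities commute."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

sigma_x = [[0, 1], [1, 0]]    # spin along x (Pauli matrix)
sigma_z = [[1, 0], [0, -1]]   # spin along z (Pauli matrix)
identity = [[1, 0], [0, 1]]   # a trivial quantity that commutes with everything

zx = commutator(sigma_z, sigma_x)
z_id = commutator(sigma_z, identity)
print(zx)    # nonzero: spin along z and spin along x do not commute
print(z_id)  # all zeros: these two commute
```

A nonzero commutator is the mathematical statement that measuring one quantity disturbs the other.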

Case 3: We suppose that two systems S1 and S2 are correlated but no longer interacting, and that O1 measures property A in S1 and O2 property B in S2.

It is therefore seen that one observer's observation upon one system of a correlated, but non-interacting pair of systems, has no effect on the remote system, in the sense that the outcome or expected outcome of any experiments by another observer on the remote system are not affected. Paradoxes like that of Einstein-Rosen-Podolsky which are concerned with such correlated, non-interacting, systems are thus easily understood in the present scheme.

Case 3 is the basis for the infamous EPR (Einstein-Podolsky-Rosen) paradox that has caused so much grief for the Copenhagen Interpretation. The EPR Paradox goes like this. Suppose we prepare many pairs of quantum mechanically “entangled” electrons that conserve angular momentum. Each pair consists of one spin-up electron and one spin-down electron, but we do not know which is which at the onset. Now let the pairs of electrons fly apart and let two observers measure their spins. If observer A measures an electron there will be a 50% probability that he will find a spin-up electron and a 50% chance that he will find a spin-down electron, and the same goes for observer B: 50% of observer B’s electrons will be found to have a spin-up, while 50% will be found with a spin-down. Now the paradox of the EPR paradox, from the perspective of the Copenhagen Interpretation, is that when observer A and observer B come together to compare notes, they find that each time observer A found a spin-up electron, observer B found a spin-down electron, even though the electrons did not know which way they were spinning before the measurements were performed. Somehow when observer A measured the spin of an electron, it instantaneously changed the spin of the electron that observer B measured. Einstein hated this “spooky action at a distance” feature of the Copenhagen Interpretation that made physics nonlocal, meaning that things that were separated by great distances could still instantaneously change each other. He thought that it violated the speed limit of his Special Theory of Relativity, which does not allow information to travel faster than the speed of light. Einstein thought that the EPR paradox showed that quantum mechanics had to be incomplete. There had to be some “hidden variables” that allowed electrons to know if they “really” were a spin-up or spin-down electron.

Hugh Everett solves this problem by letting the electrons be in all possible spin states in a large number of parallel universes. When observers measure the spin of an electron, they really do not measure the spin of the electron. They really measure which universe they happen to be located in, and since everything in the Many-Worlds Interpretation relies on “correlated” composite wavefunctions, it should come as no surprise that when observer A and observer B come together, they find that their measurements of the electron spins are correlated.
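Both halves of Case 3, the perfect correlation and the absence of any effect on the remote system, can be checked with a short calculation. The sketch below is my own illustration and assumes the entangled pair is in the standard spin singlet state, with each observer free to tilt his magnets by an angle theta from the z-axis in the x-z plane. The function names spin_state, joint_probability and marginal_b are hypothetical.

```python
import math

def spin_state(theta, outcome):
    """Amplitudes (up, down) of the spin eigenstate along an axis tilted by
    theta from z in the x-z plane. outcome is +1 (up) or -1 (down)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (c, s) if outcome == +1 else (-s, c)

# Singlet state (|up,down> - |down,up>)/sqrt(2); index 0 = up, 1 = down.
SINGLET = {(0, 1): 1 / math.sqrt(2), (1, 0): -1 / math.sqrt(2)}

def joint_probability(theta_a, out_a, theta_b, out_b):
    """Probability that A records out_a and B records out_b."""
    a = spin_state(theta_a, out_a)
    b = spin_state(theta_b, out_b)
    amp = sum(a[i] * b[j] * psi for (i, j), psi in SINGLET.items())
    return amp ** 2  # all amplitudes here are real

def marginal_b(theta_a, theta_b, out_b):
    """B's probability of out_b, summed over whatever A recorded."""
    return sum(joint_probability(theta_a, out_a, theta_b, out_b)
               for out_a in (+1, -1))

# Same axis: the observers never record the same spin.
print(joint_probability(0.0, +1, 0.0, +1), joint_probability(0.0, +1, 0.0, -1))

# B's statistics do not depend on how A sets his magnets.
for theta_a in (0.0, 1.0, 2.5):
    print(theta_a, marginal_b(theta_a, 0.7, +1))
```

Observer B sees 50% spin-up no matter what observer A does, so no signal passes between them. The correlation only shows up when the two memory records are brought together and compared.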


We have now completed the abstract treatment of measurement and observation, with the deduction that the statistical predictions of the usual form of quantum theory (Process 1) will appear to be valid to all observers. We have therefore succeeded in placing our theory in correspondence with experience, at least insofar as the ordinary theory correctly represents experience.

We should like to emphasize that this deduction was carried out by using only the principle of superposition, and the postulate that an observation has the property that if the observed variable has a definite value in the object-system then it will remain definite and the observer will perceive this value. This treatment is therefore valid for any possible quantum interpretation of observation processes, i.e., any way in which one can interpret wave functions as describing observers, as well as for any form of quantum mechanics for which the superposition principle for states is maintained. Our abstract discussion of observation is therefore logically complete, in the sense that our results for the subjective experience of observers are correct, if there are any observers at all describable by wave mechanics.

In this chapter we shall consider a number of diverse topics from the point of view of our pure wave mechanics, in order to supplement the abstract discussion and give a feeling for the new viewpoint. Since we are now mainly interested in elucidating the reasonableness of the theory, we shall often restrict ourselves to plausibility arguments, rather than detailed proofs.

1. Macroscopic objects and classical mechanics

In the light of our knowledge about the atomic constitution of matter, any "object" of macroscopic size is composed of an enormous number of constituent particles. The wave function for such an object is then in a space of fantastically high dimension (3N, if N is the number of particles). Our present problem is to understand the existence of macroscopic objects, and to relate their ordinary (classical) behavior in the three dimensional world to the underlying wave mechanics in the higher dimensional space.

Let us begin by considering a relatively simple case. Suppose that we place in a box an electron and a proton, each in a definite momentum state, so that the position amplitude density of each is uniform over the whole box. After a time we would expect a hydrogen atom in the ground state to form, with ensuing radiation. We notice, however, that the position amplitude density of each particle is still uniform over the whole box. Nevertheless the amplitude distributions are now no longer independent, but correlated. In particular, the conditional amplitude density for the electron, conditioned by any definite proton (or centroid) position, is not uniform, but is given by the familiar ground state wave function for the hydrogen atom. What we mean by the statement, "a hydrogen atom has formed in the box," is just that this correlation has taken place - a correlation which insures that the relative configuration for the electron, for a definite proton position, conforms to the customary ground state configuration.

The wave function for the hydrogen atom can be represented as a product of a centroid wave function and a wave function over relative coordinates, where the centroid wave function obeys the wave equation for a particle with mass equal to the total mass of the proton-electron system. Therefore, if we now open our box, the centroid wave function will spread with time in the usual manner of wave packets, to eventually occupy a vast region of space. The relative configuration (described by the relative coordinate state function) has, however, a permanent nature, since it represents a bound state, and it is this relative configuration which we usually think of as the object called the hydrogen atom. Therefore, no matter how indefinite the positions of the individual particles become in the total state function (due to the spreading of the centroid), this state can be regarded as giving (through the centroid wave function) an amplitude distribution over a comparatively definite object, the tightly bound electron-proton system. The general state, then, does not describe any single such definite object, but a superposition of such cases with the object located at different positions.

In the above section, Hugh Everett proposes putting an electron and a proton in a box with each particle given a known initial momentum. Then according to the Heisenberg Uncertainty Principle, we cannot know anything about their positions, so they must be uniformly smeared out over the insides of the whole box, and they should stay that way forever. However, eventually, the electron and proton will interact and form a hydrogen atom, giving off a photon in the process. The two particles will then be defined by a composite correlated wavefunction that corresponds to the ground state of a hydrogen atom. This composite correlated wavefunction can be viewed as the product of a centroid wavefunction with the mass of a hydrogen atom and a relative wavefunction spread over coordinates relative to the proton. If we then open the box and release the hydrogen atom this centroid wavefunction will spread out all over the place as the hydrogen atom diffuses away from the box, but there still will be a relative component of the total composite wavefunction that represents the relative location of the electron with respect to the proton.
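For readers who like to see the mathematics, the factorization described above can be written out as follows. This is the standard textbook form for the hydrogen ground state rather than a formula copied from the thesis, and the symbols (r_e and r_p for the electron and proton positions, m_e and m_p for their masses, a_0 for the Bohr radius) are my own notation:

```latex
\psi(\mathbf{r}_e, \mathbf{r}_p) = \Phi(\mathbf{R})\,\varphi_{1s}(\mathbf{r}),
\qquad
\mathbf{R} = \frac{m_e \mathbf{r}_e + m_p \mathbf{r}_p}{m_e + m_p},
\qquad
\mathbf{r} = \mathbf{r}_e - \mathbf{r}_p,
\qquad
\varphi_{1s}(\mathbf{r}) = \frac{1}{\sqrt{\pi a_0^{3}}}\, e^{-|\mathbf{r}|/a_0}
```

Here Φ(R) is the centroid wavefunction that spreads out once the box is opened, while φ1s(r) is the bound relative wavefunction that keeps its shape and that we normally think of as "the hydrogen atom".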

In a similar fashion larger and more complex objects can be built up through strong correlations which bind together the constituent particles. It is still true that the general state function for such a system may lead to marginal position densities for any single particle (or centroid) which extend over large regions of space. Nevertheless we can speak of the existence of a relatively definite object, since the specification of a single position for a particle, or the centroid, leads to the case where the relative position densities of the remaining particles are distributed closely about the specified one, in a manner forming the comparatively definite object spoken of.

Suppose, for example, we begin with a cannonball located at the origin, described by a state function:


where the subscript indicates that the total state function ψ describes a system of particles bound together so as to form an object of the size and shape of a cannonball, whose centroid is located (approximately) at the origin, say in the form of a real gaussian wave packet of small dimensions, with variance σ₀² for each dimension.

If we now allow a long lapse of time, the centroid of the system will spread in the usual manner to occupy a large region of space....

It is not true that each individual particle spreads independently of the rest, in which case we would have a final state which is a grand superposition of states in which the particles are located independently everywhere. The fact that they are in bound states restricts our final state to a superposition of "cannonball" states. The wave function for the centroid can therefore be taken as a representative wave function for the whole object.

Similarly, in the above section Hugh Everett mathematically demonstrates that if a large number of particles constitute a cannonball with a composite wavefunction ψ[cj(0,0,0)] centered on the coordinates (0,0,0), then this composite wavefunction will indeed spread out with time, just like the wavefunction for a single unbound electron will spread out with time, but the individual particles will not spread out all over the place causing the cannonball to essentially evaporate. Thus large objects composed of bound particles will continue to behave as large objects composed of bound particles as time progresses. Next, he describes what an observer would record when observing the cannonball move through space. The observer would become correlated into a superposition of his wavefunction with that of the centroid cannonball wavefunction, and the cannonball will then appear to behave in a manner conforming to classical mechanics:
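To get a feel for how slowly the centroid of something as heavy as a cannonball spreads, we can use the standard textbook formula for a free Gaussian wave packet, σ(t) = σ0·sqrt(1 + (ħt / (2mσ0²))²), where σ0 is the initial standard deviation of the position probability density. This is a rough sketch under my own assumptions: the initial width of one nanometer, the 5 kg cannonball mass and the elapsed times are made-up illustrative numbers, and Hugh Everett's own convention for the width of his packet may differ by a constant factor.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J*s

def packet_width(sigma0, mass, t):
    """Standard deviation of a free Gaussian wave packet after time t,
    starting from a position standard deviation sigma0."""
    return sigma0 * math.sqrt(1 + (HBAR * t / (2 * mass * sigma0 ** 2)) ** 2)

SIGMA0 = 1e-9               # assumed initial width in meters
ELECTRON_MASS = 9.109e-31   # kg
CANNONBALL_MASS = 5.0       # kg (assumed)
ONE_NANOSECOND = 1e-9       # s
ONE_YEAR = 3.156e7          # s

# Factor by which each packet has grown relative to its initial width.
electron_growth = packet_width(SIGMA0, ELECTRON_MASS, ONE_NANOSECOND) / SIGMA0
cannonball_growth = packet_width(SIGMA0, CANNONBALL_MASS, ONE_YEAR) / SIGMA0

print("electron after one nanosecond:", electron_growth)
print("cannonball after one year:", cannonball_growth)
```

With these assumed numbers the free electron's packet grows by a factor of tens of thousands in a nanosecond, while the cannonball's centroid has not measurably spread after a year, which is why it takes a very long lapse of time before the spreading that Hugh Everett describes becomes significant for macroscopic objects.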

Let us now consider the result of an observation (considered along the lines of Chapter IV) performed upon a system of macroscopic bodies in a general state. The observer will not become aware of the fact that the state does not correspond to definite positions and momenta (i.e., he will not see the objects as "smeared out" over large regions of space) but will himself simply become correlated with the system - after the observation the composite system of objects + observer will be in a superposition of states, each element of which describes an observer who has perceived that the objects have nearly definite positions and momenta, and for whom the relative system state is a quasi-classical state in the previous sense, and furthermore to whom the system will appear to behave according to classical mechanics if his observation is continued. We see, therefore, how the classical appearance of the macroscopic world to us can be explained in the wave theory.

Since Hugh Everett has eliminated Process 1 from his theory, he next addresses what happens when an observation is made. For example, suppose observer A uses magnets to measure the spin of an electron and finds that the electron is a spin-up electron. In the Copenhagen Interpretation, the act of observing the electron will collapse its wavefunction into a spin-up eigenstate, and this collapse is an irreversible process that cannot be undone. Hugh Everett goes through some more mathematics using what he has already discussed above to come to a different conclusion:

3. Reversibility and irreversibility

….There is another way of looking at this apparent irreversibility within our theory which recognizes only Process 2. When an observer performs an observation the result is a superposition, each element of which describes an observer who has perceived a particular value. From this time forward there is no interaction between the separate elements of the superposition (which describe the observer as having perceived different results), since each element separately continues to obey the wave equation. Each observer described by a particular element of the superposition behaves in the future completely independently of any events in the remaining elements, and he can no longer obtain any information whatsoever concerning these other elements (they are completely unobservable to him).

The irreversibility of the measuring process is therefore, within our framework, simply a subjective manifestation reflecting the fact that in observation processes the state of the observer is transformed into a superposition of observer states, each element of which describes an observer who is irrevocably cut off from the remaining elements. While it is conceivable that some outside agency could reverse the total wave function, such a change cannot be brought about by any observer which is represented by a single element of a superposition, since he is entirely powerless to have any influence on any other elements.

There are, therefore, fundamental restrictions to the knowledge that an observer can obtain about the state of the universe. It is impossible for any observer to discover the total state function of any physical system, since the process of observation itself leaves no independent state for the system or the observer, but only a composite system state in which the object-system states are inextricably bound up with the observer states. As soon as the observation is performed, the composite state is split into a superposition for which each element describes a different object-system state and an observer with (different) knowledge of it. Only the totality of these observer states, with their diverse knowledge, contains complete information about the original object-system state - but there is no possible communication between the observers described by these separate states. Any single observer can therefore possess knowledge only of the relative state function (relative to his state) of any systems, which is in any case all that is of any importance to him….

So instead of the observer collapsing the wavefunction of the electron with his magnets, the observer splits into two observers. One observer sees a spin-up electron and the other observer sees a spin-down electron. These two observers are totally unaware of each other and are completely cut off from each other with no possibility to interact. In the same way, when a quantum computer reads a 1-qubit memory location that is in a superposition of 1 and 0 at the top of an if-then-else block, one instance of the quantum computer will execute the then-block, while the other instance of the quantum computer will execute the else-block.

….We take this opportunity to caution against a certain viewpoint which can lead to difficulties. This is the idea that, after an apparatus has interacted with a system, in "actuality" one or another of the elements of the resultant superposition described by the composite state-function has been realized to the exclusion of the rest, the existing one simply being unknown to an external observer (i.e., that instead of the superposition there is a genuine mixture). This position must be erroneous since there is always the possibility for the external observer to make use of interference properties between the elements of the superposition.

In the present example, for instance, it is in principle possible to deflect the two beams back toward one another with magnetic fields and recombine them in another inhomogeneous field, which duplicates the first, in such a manner that the original spin state (before entering the apparatus) is restored. This would not be possible if the original Stern-Gerlach apparatus performed the function of converting the original wave packet into a non-interfering mixture of packets for the two spin cases. Therefore the position that after the atom has passed through the inhomogeneous field it is "really" in one or the other beam with the corresponding spin, although we are ignorant of which one, is incorrect.

Shooting a beam of electrons through an inhomogeneous magnetic field will cause two beams to seemingly emerge, one with spin-up electrons and one with spin-down electrons. But according to Hugh Everett, each electron will end up in both beams, but in two separate universes, and each copy of the electron will be just as “real” as the other. Hugh Everett maintains that this must be so because, theoretically, it is possible to deflect the two beams coming out of a Stern-Gerlach device back together and recombine them so that the spin-up and spin-down electrons return to being electrons in their original superposition of spin-up and spin-down. Essentially, this is what our circular tub of water would do to the circular waves arising from dropping a pebble into the center of the circular tub of water.
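The difference between a genuine superposition and a mere mixture of “really spin-up” and “really spin-down” electrons shows up as soon as the recombined beam is measured along a different axis. Here is a minimal sketch under my own assumptions: the electron starts in an equal superposition of spin-up and spin-down, the recombination is perfect, and the final measurement is made along the x-axis. The function name prob_plus_x is hypothetical.

```python
import math

def prob_plus_x(amp_up, amp_down):
    """Probability of finding spin-up along x for the pure spin state
    amp_up|up> + amp_down|down> (both measured relative to the z-axis)."""
    overlap = (amp_up + amp_down) / math.sqrt(2)
    return abs(overlap) ** 2

# Perfectly recombined beams: the original superposition is restored.
p_superposition = prob_plus_x(1 / math.sqrt(2), 1 / math.sqrt(2))

# "Really in one beam or the other, we just don't know which": a 50/50 mixture.
p_mixture = 0.5 * prob_plus_x(1, 0) + 0.5 * prob_plus_x(0, 1)

print("superposition:", p_superposition)
print("mixture:", p_mixture)
```

The superposition gives spin-up along x essentially every time, while the mixture gives it only half the time. The two pictures make different predictions, which is why Hugh Everett insists that both elements of the superposition must be regarded as existing.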

It is therefore improper to attribute any less validity or "reality" to any element of a superposition than any other element, due to this ever present possibility of obtaining interference effects between the elements. All elements of a superposition must be regarded as simultaneously existing.

Below is Hugh Everett’s final chapter in its entirety where he nicely sums things up, without any mathematics at all.


We have shown that our theory based on pure wave mechanics, which takes as the basic description of physical systems the state function - supposed to be an objective description (i.e., in one-one, rather than statistical, correspondence to the behavior of the system) - can be put in satisfactory correspondence with experience. We saw that the probabilistic assertions of the usual interpretation of quantum mechanics can be deduced from this theory, in a manner analogous to the methods of classical statistical mechanics, as subjective appearances to observers - observers which were regarded simply as physical systems subject to the same type of description and laws as any other systems, and having no preferred position. The theory is therefore capable of supplying us with a complete conceptual model of the universe, consistent with the assumption that it contains more than one observer.

Because the theory gives us an objective description, it constitutes a framework in which a number of puzzling subjects (such as classical level phenomena, the measuring process itself, the inter-relationship of several observers, questions of reversibility and irreversibility, etc.) can be investigated in detail in a logically consistent manner. It supplies a new way of viewing processes, which clarifies many apparent paradoxes of the usual interpretation - indeed, it constitutes an objective framework in which it is possible to understand the general consistency of the ordinary view.

We shall now resume our discussion of alternative interpretations. There has been expressed lately a great deal of dissatisfaction with the present form of quantum theory by a number of authors, and a wide variety of new interpretations have sprung into existence. We shall now attempt to classify briefly a number of these interpretations, and comment upon them.

a. The "popular" interpretation. This is the scheme alluded to in the introduction, where ψ is regarded as objectively characterizing the single system, obeying a deterministic wave equation when the system is isolated but changing probabilistically and discontinuously under observation.

In its unrestricted form this view can lead to paradoxes like that mentioned in the introduction, and is therefore untenable. However, this view is consistent so long as it is assumed that there is only one observer in the universe (the solipsist position - Alternative 1 of the Introduction). This consistency is most easily understood from the viewpoint of our own theory, where we were able to show that all phenomena will seem to follow the predictions of this scheme to any observer. Our theory therefore justifies the personal adoption of this probabilistic interpretation, for purposes of making practical predictions, from a more satisfactory framework.

b. The Copenhagen interpretation. This is the interpretation developed by Bohr. The ψ function is not regarded as an objective description of a physical system (i.e., it is in no sense a conceptual model), but is regarded as merely a mathematical artifice which enables one to make statistical predictions, albeit the best predictions which it is possible to make. This interpretation in fact denies the very possibility of a single conceptual model applicable to the quantum realm, and asserts that the totality of phenomena can only be understood by the use of different, mutually exclusive (i.e., "complementary") models in different situations. All statements about microscopic phenomena are regarded as meaningless unless accompanied by a complete description (classical) of an experimental arrangement.

While undoubtedly safe from contradiction, due to its extreme conservatism, it is perhaps overcautious. We do not believe that the primary purpose of theoretical physics is to construct "safe" theories at severe cost in the applicability of their concepts, which is a sterile occupation, but to make useful models which serve for a time and are replaced as they are outworn.

Another objectionable feature of this position is its strong reliance upon the classical level from the outset, which precludes any possibility of explaining this level on the basis of an underlying quantum theory. (The deduction of classical phenomena from quantum theory is impossible simply because no meaningful statements can be made without pre-existing classical apparatus to serve as a reference frame.) This interpretation suffers from the dualism of adhering to a "reality" concept (i.e., the possibility of objective description) on the classical level but renouncing the same in the quantum domain.

There is some political maneuvering going on in the above passage. The “popular” interpretation really is the Copenhagen Interpretation, but Niels Bohr was still a living giant of quantum theory at the time, and it would not be wise for this Ph.D. thesis to be seen as a direct attack on the Copenhagen Interpretation and Niels Bohr. So Hugh Everett breaks apart the Copenhagen Interpretation into two parts. Part 1 he calls the “popular” interpretation in which wavefunctions mysteriously collapse when an observation is made. Part 2 he calls the Copenhagen Interpretation where wavefunctions are just a mathematical tool used to perform calculations after you set up a macroscopic experiment to make a quantum mechanical measurement. For example, the wavefunctions of electrons passing through a nonuniform magnetic field could be used to calculate that 50% will be observed to be spin-up electrons, while 50% will be observed to be spin-down electrons. So now we know that the “popular” interpretation that Hugh Everett has been attacking from the very first line of his thesis:

“a particular interpretation of quantum theory which, although not representative of the more careful formulations of some writers, is the most common form encountered in textbooks and university lectures on the subject.”

is really not Niels Bohr’s sacred Copenhagen Interpretation at all. It is really the “popular” interpretation that he has been attacking all along. I think we all would have practiced a similar maneuver in his shoes.

c. The "hidden variables" interpretation. This is the position (Alternative 4 of the Introduction) that ψ is not a complete description of a single system. It is assumed that the correct complete description, which would involve further (hidden) parameters, would lead to a deterministic theory, from which the probabilistic aspects arise as a result of our ignorance of these extra parameters in the same manner as in classical statistical mechanics.

The ψ function is therefore regarded as a description of an ensemble of systems rather than a single system. Proponents of this interpretation include Einstein, Bohm, Wiener and Siegel.

Einstein hopes that a theory along the lines of his general relativity, where all of physics is reduced to the geometry of space-time could satisfactorily explain quantum effects. In such a theory a particle is no longer a simple object but possesses an enormous amount of structure (i.e., it is thought of as a region of space-time of high curvature). It is conceivable that the interactions of such "particles" would depend in a sensitive way upon the details of this structure, which would then play the role of the "hidden variables". However, these theories are non-linear and it is enormously difficult to obtain any conclusive results. Nevertheless, the possibility cannot be discounted.

Bohm considers ψ to be a real force field acting on a particle which always has a well-defined position and momentum (which are the hidden variables of this theory). The ψ-field satisfying Schrödinger's equation is pictured as somewhat analogous to the electromagnetic field satisfying Maxwell's equations, although for systems of n particles the ψ-field is in a 3n-dimensional space. With this theory Bohm succeeds in showing that in all actual cases of measurement the best predictions that can be made are those of the usual theory, so that no experiments could ever rule out his interpretation in favor of the ordinary theory. Our main criticism of this view is on the grounds of simplicity - if one desires to hold the view that ψ is a real field then the associated particle is superfluous since, as we have endeavored to illustrate, the pure wave theory is itself satisfactory.

Wiener and Siegel have developed a theory which is more closely tied to the formalism of quantum mechanics. From the set N of all nondegenerate linear Hermitian operators for a system having a complete set of eigenstates, a subset I is chosen such that no two members of I commute and every element outside I commutes with at least one element of I. The set I therefore contains precisely one operator for every orientation of the principal axes of the Hilbert space for the system. It is postulated that each of the operators of I corresponds to an independent observable which can take any of the real numerical values of the spectrum of the operator. This theory, in its present form, is a theory of infinitely many "hidden variables," since a system is pictured as possessing (at each instant) a value for every one of these "observables" simultaneously, with the changes in these values obeying precise (deterministic) dynamical laws. However, the change of any one of these variables with time depends upon the entire set of observables, so that it is impossible ever to discover by measurement the complete set of values for a system (since only one "observable" at a time can be observed). Therefore, statistical ensembles are introduced, in which the values of all of the observables are related to points in a "differential space," which is a Hilbert space containing a measure for which each (differential space) coordinate has an independent normal distribution. It is then shown that the resulting statistical dynamics is in accord with the usual form of quantum theory.

It cannot be disputed that these theories are often appealing, and might conceivably become important should future discoveries indicate serious inadequacies in the present scheme (i.e., they might be more easily modified to encompass new experience). But from our viewpoint they are usually more cumbersome than the conceptually simpler theory based on pure wave mechanics. Nevertheless, these theories are of great theoretical importance because they provide us with examples that "hidden variables" theories are indeed possible.

d. The stochastic process interpretation. This is the point of view which holds that the fundamental processes of nature are stochastic (i.e., probabilistic) processes. According to this picture physical systems are supposed to exist at all times in definite states, but the states are continually undergoing probabilistic changes. The discontinuous probabilistic "quantum-jumps" are not associated with acts of observation, but are fundamental to the systems themselves.

A stochastic theory which emphasizes the particle, rather than wave, aspects of quantum theory has been investigated by Bopp. The particles do not obey deterministic laws of motion, but rather probabilistic laws, and by developing a general "correlation statistics" Bopp shows that his quantum scheme is a special case which gives results in accord with the usual theory. (This accord is only approximate and in principle one could decide between the theories. The approximation is so close, however, that it is hardly conceivable that a decision would be practically feasible.)

Bopp's theory seems to stem from a desire to have a theory founded upon particles rather than waves, since it is this particle aspect (highly localized phenomena) which is most frequently encountered in present day high-energy experiments (cloud chamber tracks, etc.). However, it seems to us to be much easier to understand particle aspects from a wave picture (concentrated wave packets) than it is to understand wave aspects (diffraction, interference, etc.) from a particle picture.

Nevertheless, there can be no fundamental objection to the idea of a stochastic theory, except on grounds of a naked prejudice for determinism. The question of determinism or indeterminism in nature is obviously forever undecidable in physics, since for any current deterministic [probabilistic] theory one could always postulate that a refinement of the theory would disclose a probabilistic [deterministic] substructure, and that the current deterministic [probabilistic] theory is to be explained in terms of the refined theory on the basis of the law of large numbers [ignorance of hidden variables]. However, it is quite another matter to object to a mixture of the two where the probabilistic processes occur only with acts of observation.

e. The wave interpretation. This is the position proposed in the present thesis, in which the wave function itself is held to be the fundamental entity, obeying at all times a deterministic wave equation.

This view also corresponds most closely with that held by Schrödinger. However, this picture only makes sense when observation processes themselves are treated within the theory. It is only in this manner that the apparent existence of definite macroscopic objects, as well as localized phenomena, such as tracks in cloud chambers, can be satisfactorily explained in a wave theory where the waves are continually diffusing. With the deduction in this theory that phenomena will appear to observers to be subject to Process 1, Heisenberg's criticism of Schrödinger’s opinion - that continuous wave mechanics could not seem to explain the discontinuities which are everywhere observed - is effectively met. The "quantum jumps" exist in our theory as relative phenomena (i.e., the states of an object-system relative to chosen observer states show this effect), while the absolute states change quite continuously.

The wave theory is definitely tenable and forms, we believe, the simplest complete, self-consistent theory.

We should like now to comment on some views expressed by Einstein. Einstein's criticism of quantum theory (which is actually directed more against what we have called the "popular" view than Bohr's interpretation) is mainly concerned with the drastic changes of state brought about by simple acts of observation (i.e., the infinitely rapid collapse of wave functions), particularly in connection with correlated systems which are widely separated so as to be mechanically uncoupled at the time of observation. At another time he put his feeling colorfully by stating that he could not believe that a mouse could bring about drastic changes in the universe simply by looking at it.

However, from the standpoint of our theory, it is not so much the system which is affected by an observation as the observer, who becomes correlated to the system.

In the case of observation of one system of a pair of spatially separated, correlated systems, nothing happens to the remote system to make any of its states more "real" than the rest. It had no independent states to begin with, but a number of states occurring in a superposition with corresponding states for the other (near) system. Observation of the near system simply correlates the observer to this system, a purely local process - but a process which also entails automatic correlation with the remote system. Each state of the remote system still exists with the same amplitude in a superposition, but now a superposition for which each element contains, in addition to a remote system state and correlated near system state, an observer state which describes an observer who perceives the state of the near system. From the present viewpoint all elements of this superposition are equally "real." Only the observer state has changed, so as to become correlated with the state of the near system and hence naturally with that of the remote system also. The mouse does not affect the universe - only the mouse is affected.

This is Hugh Everett’s solution to the EPR paradox. Recall that if we prepare many pairs of quantum mechanically “entangled” electrons that conserve angular momentum, initially each electron will be in a superposition of spin states because it has not been measured yet. In the Copenhagen Interpretation these electrons “really” do not know what their spins are at this point, but when observer A and observer B later measure their spins with a Stern-Gerlach device and then compare notes, they will find that whenever observer A measured a spin-up electron, observer B measured its twin as a spin-down electron. Since the electrons “really” did not know what their spins were before being measured, somehow measuring the spin of an electron “here” instantaneously determined the spin of its twin over “there”, and the “here” and “there” can be on the opposite ends of the visible Universe. In 1982 Alain Aspect actually conducted an experiment that validated this finding using photons instead of electrons, so this is not just a thought experiment. The Universe actually behaves like this!

Here is Hugh Everett’s solution. Each near electron is in a number of superposition states with its twin electron that is over “there”, and each of those superpositions must conserve angular momentum because that is the law, meaning that one electron is a spin-up and the other is a spin-down. When observer A “measures” a near electron, observer A becomes correlated with the near electron, and also with its twin electron over “there” because both of those electrons were already in a correlated superposition state to begin with. When observer B measures the twin electron over “there”, he becomes correlated with the twin electron, and consequently with the correlated superposition state of observer A and both electrons. That is why when observer A finds a spin-up electron, observer B finds a spin-down electron. Basically, observer A and observer B are not really measuring the spins of the electrons. Instead, they have really put together a very complex experiment that always places them into the same universe amongst many parallel universes. That is why the mouse does not affect the universe.
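For the IT professionals reading along, the bookkeeping in that explanation can be sketched in a few lines of Python. This is only a toy model under some loud assumptions: both observers measure spin along the same axis, the pair starts in the standard singlet state, and the function names (observe_near, observe_remote, branch_weights) are my own hypothetical labels, not anything from Hugh Everett’s thesis. An “observation” here just attaches an observer record to each element of the superposition and leaves every amplitude alone.

```python
import math

# Singlet state of two spin-1/2 electrons: (|up,down> - |down,up>) / sqrt(2).
# Keys are (near_spin, remote_spin); values are the amplitudes.
AMP = 1 / math.sqrt(2)
singlet = {("up", "down"): AMP, ("down", "up"): -AMP}


def observe_near(state):
    """Observer A measures the near electron (hypothetical toy model).

    Nothing collapses: each element of the superposition gains an observer
    record that copies the near spin, and the amplitude is left untouched.
    """
    return {(near, remote, "A saw " + near): amp
            for (near, remote), amp in state.items()}


def observe_remote(state):
    """Observer B measures the remote electron in the same way."""
    return {(near, remote, a_record, "B saw " + remote): amp
            for (near, remote, a_record), amp in state.items()}


def branch_weights(state):
    """Weight |amplitude|^2 carried by each element of the superposition."""
    return {branch: abs(amp) ** 2 for branch, amp in state.items()}


after_a = observe_near(singlet)      # A is now correlated with both electrons
after_b = observe_remote(after_a)    # B joins the same correlated superposition

for branch, weight in branch_weights(after_b).items():
    print(branch, round(weight, 3))
```

Only two elements survive in the final superposition, and in each of them the records of observer A and observer B disagree in exactly the way angular momentum conservation demands. No amplitude was ever changed along the way, which is the sense in which only the mouse is affected.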

Our theory in a certain sense bridges the positions of Einstein and Bohr, since the complete theory is quite objective and deterministic ("God does not play dice with the universe"), and yet on the subjective level, of assertions relative to observer states, it is probabilistic in the strong sense that there is no way for observers to make any predictions better than the limitations imposed by the uncertainty principle.

In conclusion, we have seen that if we wish to adhere to objective descriptions then the principle of the psycho-physical parallelism requires that we should be able to consider some mechanical devices as representing observers. The situation is then that such devices must either cause the probabilistic discontinuities of Process 1, or must be transformed into the superpositions we have discussed. We are forced to abandon the former possibility since it leads to the situation that some physical systems would obey different laws from the rest, with no clear means for distinguishing between these two types of systems. We are thus led to our present theory which results from the complete abandonment of Process 1 as a basic process. Nevertheless, within the context of this theory, which is objectively deterministic, it develops that the probabilistic aspects of Process 1 reappear at the subjective level, as relative phenomena to observers.

One is thus free to build a conceptual model of the universe, which postulates only the existence of a universal wave function which obeys a linear wave equation. One then investigates the internal correlations in this wave function with the aim of deducing laws of physics, which are statements that take the form: Under the conditions C the property A of a subsystem of the universe (subset of the total collection of coordinates for the wave function) is correlated with the property B of another subsystem (with the manner of correlation being specified). For example, the classical mechanics of a system of massive particles becomes a law which expresses the correlation between the positions and momenta (approximate) of the particles at one time with those at another time. All statements about subsystems then become relative statements, i.e., statements about the subsystem relative to a prescribed state for the remainder (since this is generally the only way a subsystem even possesses a unique state), and all laws are correlation laws.

The theory based on pure wave mechanics is a conceptually simple causal theory, which fully maintains the principle of the psycho-physical parallelism. It therefore forms a framework in which it is possible to discuss (in addition to ordinary phenomena) observation processes themselves, including the inter-relationships of several observers, in a logical, unambiguous fashion. In addition, all of the correlation paradoxes, like that of Einstein, Rosen, and Podolsky, find easy explanation.

While our theory justifies the personal use of the probabilistic interpretation as an aid to making practical predictions, it forms a broader frame in which to understand the consistency of that interpretation. It transcends the probabilistic theory, however, in its ability to deal logically with questions of imperfect observation and approximate measurement.

Since this viewpoint will be applicable to all forms of quantum mechanics which maintain the superposition principle, it may prove a fruitful framework for the interpretation of new quantum formalisms. Field theories, particularly any which might be relativistic in the sense of general relativity, might benefit from this position, since one is free to construct formal (non-probabilistic) theories, and supply any possible statistical interpretations later. (This viewpoint avoids the necessity of considering anomalous probabilistic jumps scattered about space-time, and one can assert that field equations are satisfied everywhere and everywhen, then deduce any statistical assertions by the present method.)

By focusing attention upon questions of correlations, one may be able to deduce useful relations (correlation laws analogous to those of classical mechanics) for theories which at present do not possess known classical counterparts. Quantized fields do not generally possess pointwise independent field values, the values at one point of space-time being correlated with those at neighboring points of space-time in a manner, it is to be expected, approximating the behavior of their classical counterparts. If correlations are important in systems with only a finite number of degrees of freedom, how much more important they must be for systems of infinitely many coordinates.

Finally, aside from any possible practical advantages of the theory, it remains a matter of intellectual interest that the statistical assertions of the usual interpretation do not have the status of independent hypotheses, but are deducible (in the present sense) from the pure wave mechanics, which results from their omission.

For the more mathematically gifted, I encourage you to try reading the full text of Hugh Everett’s original draft Ph.D. thesis. One reason John Wheeler had Hugh Everett heavily edit his original 137-page document down to his final 36-page doctoral dissertation was that he was afraid the departmental physicists on Hugh Everett’s dissertation committee would not understand the material, that the oral defense of his Ph.D. thesis would not go well, and that he might even fail the defense. So do not feel too bad if the mathematics goes way over your head.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston