Wednesday, July 01, 2015

Introduction to Softwarephysics

Softwarephysics is a simulated science for the simulated Software Universe that we are all immersed in. It is an approach to software development, maintenance, and support based upon concepts from physics, chemistry, biology, and geology that I have been using on a daily basis for over 35 years as an IT professional. For those of you not in the business, IT is short for Information Technology, commercial computer science. The purpose of softwarephysics is to explain why IT is so difficult, to suggest possible remedies, and to provide a direction for thought. If you are an IT professional, general computer user, or simply an individual interested in computer science, physics, chemistry, biology, or geology then softwarephysics might be of interest to you, if not in an entirely serious manner, perhaps at least in an entertaining one.

From 1975 to 1979, I was an exploration geophysicist exploring for oil, first with Shell, and then with Amoco. In 1979, I made a career change into IT, and spent about 20 years in development. For the past 14 years, I have been in IT operations, supporting middleware on WebSphere, JBoss, Tomcat, and ColdFusion. When I transitioned into IT from geophysics, I figured that if you could apply physics to geology, why not apply physics to software? So like the exploration team at Amoco that I had just left, consisting of geologists, geophysicists, paleontologists, geochemists, and petrophysicists, I decided to take all the physics, chemistry, biology, and geology that I could muster and throw it at the problem of software. The basic idea was that many concepts in physics, chemistry, biology, and geology suggested to me that the IT community had accidentally created a pretty decent computer simulation of the physical Universe on a grand scale, a Software Universe so to speak, and that I could use this fantastic simulation in reverse, to better understand the behavior of commercial software, by comparing software to how things behaved in the physical Universe. Softwarephysics depicts software as a virtual substance, and relies upon our understanding of the current theories in physics, chemistry, biology, and geology to help us model the nature of software behavior. So in physics we use software to simulate the behavior of the Universe, while in softwarephysics we use the Universe to simulate the behavior of software. Along these lines, we use the Equivalence Conjecture of Softwarephysics as an aid; it allows us to shift back and forth between the Software Universe and the physical Universe, and hopefully to learn something about one by examining the other:

The Equivalence Conjecture of Softwarephysics
Over the past 70 years, through the uncoordinated efforts of over 50 million independently acting programmers to provide the world with a global supply of software, the IT community has accidentally spent more than $10 trillion creating a computer simulation of the physical Universe on a grand scale – the Software Universe.

Logical Positivism and Effective Theories
Many IT professionals have a difficult time with softwarephysics because they think of physics as being limited to the study of real things like electrons and photons, and since software is not “real”, how can you possibly apply concepts from physics and the other sciences to software? To address this issue, softwarephysics draws heavily upon two concepts from physics that have served physics quite well over the past century – the concept of logical positivism and the concept of effective theories. This was not always the case. In the 17th, 18th, and 19th centuries, physicists mistakenly thought that they were actually discovering the fundamental laws of the Universe, which they thought were based upon real tangible things like particles, waves, and fields. Classical Newtonian mechanics (1687), thermodynamics (1850), and classical electrodynamics (1864) did a wonderful job of describing the everyday world at the close of the 19th century, but early in the 20th century it became apparent that the models upon which these very successful theories were based did not work very well for small things like atoms or for objects moving at high velocities or in strong gravitational fields. This provoked a rather profound philosophical crisis within physics at the turn of the century, as physicists worried that perhaps 300 years of work was about to go down the drain. The problem was that classical physicists confused their models of reality with reality itself, and when their classical models began to falter, their confidence in physics began to falter as well. This philosophical crisis was resolved with the adoption of the concepts of logical positivism and some new effective theories in physics. Quantum mechanics (1926) was developed for small things like atoms, the special theory of relativity (1905) was developed for objects moving at high velocities and the general theory of relativity (1915) was developed for objects moving in strong gravitational fields.

Logical positivism, usually abbreviated simply to positivism, is an enhanced form of empiricism, in which we do not care about how things “really” are; we are only interested in how things are observed to behave. With positivism, physicists only seek out models of reality - not reality itself. When we study quantum mechanics, we will find that the concept of reality gets rather murky in physics anyway, so this is not as great a loss as it might at first seem. By concentrating on how things are observed to behave, rather than on what things “really” are, we avoid the conundrum faced by the classical physicists. In retrospect, this idea really goes all the way back to the very foundations of physics. In Newton’s Principia (1687) he outlined Newtonian mechanics and his theory of gravitation, which held that the gravitational force between two objects was proportional to the product of their masses divided by the square of the distance between them. Newton knew that he was going to take some philosophical flak for proposing a mysterious force between objects that could reach out across the vast depths of space with no apparent mechanism, so he took a very positivistic position on the matter with the famous words:

I have not as yet been able to discover the reason for these properties of gravity from phenomena, and I do not feign hypotheses. For whatever is not deduced from the phenomena must be called a hypothesis; and hypotheses, whether metaphysical or physical, or based on occult qualities, or mechanical, have no place in experimental philosophy. In this philosophy particular propositions are inferred from the phenomena, and afterwards rendered general by induction.

Instead, Newton focused on how things were observed to move under the influence of his law of gravitational attraction, without worrying about what gravity “really” was.

The second concept, that of effective theories, is an extension of positivism. An effective theory is an approximation of reality that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand. For example, Newtonian mechanics is an effective theory that makes very good predictions for the behavior of objects moving at less than 10% of the speed of light and which are bigger than a very small grain of dust. These limits define the effective range over which Newtonian mechanics can be applied to solve problems. For very small things we must use quantum mechanics, and for very fast things or for things in strong gravitational fields, we must use relativity theory. So all of the current theories of physics, such as Newtonian mechanics, classical electrodynamics, thermodynamics, statistical mechanics, the special and general theories of relativity, quantum mechanics, and the quantum field theories of QED and QCD are effective theories that are based upon models of reality, and all these models are approximations - all these models are fundamentally "wrong", but at the same time, these effective theories make exceedingly good predictions of the behavior of physical systems over the limited ranges in which they apply. That is the goal of softwarephysics – to provide for an effective theory of software behavior that makes useful predictions of software behavior that are applicable to the day-to-day activities of IT professionals. So in softwarephysics, we adopt a very positivistic viewpoint of software; we do not care what software “really is”, we only care about how software is observed to behave and try to model those behaviors with an effective theory of software behavior that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand.

GPS satellites provide a very good example of positivism and effective theories at work. There are currently 31 GPS satellites orbiting at an altitude of 12,600 miles above the Earth, and each contains a very accurate atomic clock. The signals from the GPS satellites travel to your GPS unit at the speed of light, so by knowing the travel time of the signals from at least 4 of the GPS satellites, it is possible to determine your position on Earth very accurately. In order to do that, it is very important to have very accurate timing measurements. Newtonian mechanics is used to launch the GPS satellites to an altitude of 12,600 miles and to keep them properly positioned in orbit. Classical electrodynamics is then used to beam the GPS signals back down to Earth to the GPS unit in your car. Quantum mechanics is used to build the transistors on the chips onboard the GPS satellites and to understand the quantum tunneling of electrons in the flash memory chips used to store GPS data on the satellites. The special theory of relativity predicts that the onboard atomic clocks on the GPS satellites will run slower and lose about 7.2 microseconds per day due to their high velocities relative to an observer on the Earth. But at the same time, the general theory of relativity also predicts that because the GPS satellites are further from the center of the Earth and in a weaker gravitational field, where spacetime is less deformed than on the surface of the Earth, their atomic clocks also run faster and gain 45.9 microseconds per day due to the weaker gravitational field out there. The net effect is a gain of 38.7 microseconds per day, so the GPS satellite atomic clocks have to be purposefully built to run slow by 38.7 microseconds per day before they are launched, so that they will keep in sync with clocks on the surface of the Earth. If this correction were not made, an error in your position on the order of 10 kilometers per day would accrue. The end result of the combination of all these fundamentally flawed effective theories is that it is possible to pinpoint your location on Earth to an accuracy of 16 feet or better for as little as $100. But physics has done even better than that with its fundamentally flawed effective theories. By combining the effective theories of special relativity (1905) with quantum mechanics (1926), physicists were able to produce a new effective theory for the behavior of electrons and photons called quantum electrodynamics, or QED (1948), which was able to predict the gyromagnetic ratio of the electron, a measure of its intrinsic magnetic field, to an accuracy of 11 decimal places. As Richard Feynman has pointed out, this was like predicting the exact distance between New York and Los Angeles accurate to the width of a human hair!
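
For those who would like to check the arithmetic behind these figures, below is a minimal sketch in Python. The 7.2 and 45.9 microsecond figures are simply taken from the discussion above; everything else is ordinary arithmetic, so treat it as an illustration rather than as a GPS engineering specification.

    c = 299792458.0                # speed of light in meters/second

    special_rel_loss = -7.2e-6     # seconds/day lost due to the satellites' orbital velocity
    general_rel_gain = 45.9e-6     # seconds/day gained due to the weaker gravitational field
    net_drift = special_rel_loss + general_rel_gain

    print("Net clock drift: %.1f microseconds/day" % (net_drift * 1e6))
    # Net clock drift: 38.7 microseconds/day

    # A timing error of dt translates into a ranging error of roughly c * dt,
    # so the uncorrected position error grows on the order of:
    print("Position error: %.1f km/day" % (c * net_drift / 1000.0))
    # Position error: 11.6 km/day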

So Newtonian mechanics makes great predictions for the macroscopic behavior of GPS satellites, but it does not work very well for small things like the behavior of individual electrons within transistors, where quantum mechanics is required, or for things moving at high speeds or in strong gravitational fields where relativity theory must be applied. And all three of these effective theories are based upon completely contradictory models. General relativity maintains that spacetime is curved by matter and energy, but that matter and energy are continuous, while quantum mechanics maintains that spacetime is flat, but that matter and energy are quantized into chunks. Newtonian mechanics simply states that space and time are mutually independent dimensions and universal for all, with matter and energy being continuous. The important point is that all effective theories and scientific models are approximations – they are all fundamentally "wrong". But knowing that you are "wrong" gives you a great advantage over people who know that they are "right", because knowing that you are "wrong" allows you to seek improved models of reality. So please consider softwarephysics to simply be an effective theory of software behavior that is based upon models that are fundamentally “wrong”, but at the same time, fundamentally useful for IT professionals. So as you embark upon your study of softwarephysics, please always keep in mind that the models of softwarephysics are just approximations of software behavior, they are not what software “really is”. It is very important not to confuse models of software behavior with software itself, if one wishes to avoid the plight of the 19th century classical physicists.

If you are an IT professional and many of the above concepts are new to you, do not be concerned. This blog on softwarephysics is aimed at a diverse audience, but with IT professionals in mind. All of the above ideas will be covered at great length in the postings in this blog on softwarephysics and in a manner accessible to all IT professionals. Now it turns out that most IT professionals have had some introduction to physics in high school or in introductory college courses, but that presents an additional problem. The problem is that such courses generally only cover classical physics, and leave the student with a very good picture of physics as it stood in 1864! It turns out that the classical physics of Newtonian mechanics, thermodynamics, and classical electromagnetic theory were simply too good to discard and are still quite useful, so they are taught first to beginners and then we run out of time to cover the really interesting physics of the 20th century. Now imagine the problems that the modern world would face if we only taught similarly antiquated courses in astronomy, metallurgy, electrical and mechanical engineering, medicine, economics, biology, or geology that happily left students back in 1864! Since many of the best models for software behavior stem from 20th century physics, we will be covering a great deal of 20th century material in these postings – the special and general theories of relativity, quantum mechanics, quantum field theories, and chaos theory, but I hope that you will find that these additional effective theories are quite interesting on their own, and might even change your worldview of the physical Universe at the same time.

Unintended Consequences for the Scientific Community
As I mentioned at the close of my original posting on SoftwarePhysics, my initial intention for this blog on softwarephysics was to fulfill a promise I made to myself about 20 years ago to approach the IT community with the concept of softwarephysics a second time, following my less than successful attempt to do so in the 1980s, with the hope of helping the IT community to better cope with the daily mayhem of life in IT. However, in laying down the postings for this blog an unintended consequence arose in my mind as I became profoundly aware of the enormity of this vast computer simulation of the physical Universe that the IT community has so graciously provided to the scientific community free of charge and also of the very significant potential scientific value that it provides. One of the nagging problems for many of the observational and experimental sciences is that many times there is only one example readily at hand to study or experiment with, and it is very difficult to do meaningful statistics with a population of N=1.

But the computer simulation of the physical Universe that the Software Universe presents provides another realm for comparison. For example, both biology and astrobiology only have one biosphere on Earth to study and even physics itself has only one Universe with which to engage. Imagine the possibilities if scientists had another Universe readily at hand in which to work! This is exactly what the Software Universe provides. For example, in SoftwareBiology and A Proposal For All Practicing Paleontologists we see that the evolution of software over the past 70 years has closely followed the same path as life on Earth over the past 4.0 billion years, in keeping with Simon Conway Morris’s contention that convergence has played the dominant role in the evolution of life on Earth. In When Toasters Fly, we also see that software has evolved in fits and starts as portrayed by the punctuated equilibrium of Stephen Jay Gould and Niles Eldredge, and in The Adaptationist View of Software Evolution we explore the overwhelming power of natural selection in the evolution of software. In keeping with Peter Ward’s emphasis on mass extinctions dominating the course of evolution throughout geological time, we also see in SoftwareBiology that there have been several dramatic mass extinctions of various forms of software over the past 70 years as well, that have greatly affected the evolutionary history of software, and that between these mass extinctions, software has also tended to evolve through the gradual changes of Hutton’s and Lyell’s uniformitarianism. In Software Symbiogenesis and Self-Replicating Information, we also see the very significant role that parasitic/symbiotic relationships have played in the evolution of software, in keeping with the work of Lynn Margulis and also of Freeman Dyson’s two-stage theory of the origin of life on Earth. In The Origin of Software the Origin of Life, we explore Stuart Kauffman’s ideas on how Boolean nets of autocatalytic chemical reactions might have kick-started the whole thing as an emergent behavior of an early chaotic pre-biotic environment on Earth, and that if Seth Shostak is right, we will never end up talking to carbon-based extraterrestrial aliens, but to alien software instead. In Is the Universe Fine-Tuned for Self-Replicating Information? we explore the thermodynamics of Brandon Carter’s Weak Anthropic Principle (1973), as it relates to the generation of universes in the multiverse that are capable of sustaining intelligent life. Finally, in Programming Clay we revisit Alexander Graham Cairns-Smith’s theory (1966) that Gene 1.0 did not run on nucleic acids, but on clay microcrystal precursors instead.

Similarly for the physical sciences, in Is the Universe a Quantum Computer? we find a correspondence between TCP/IP and John Cramer’s Transactional Interpretation of quantum mechanics. In SoftwarePhysics and Cyberspacetime, we also see that the froth of CPU processes running with a clock speed of 10^9 Hz on the 10 trillion currently active microprocessors that comprise the Software Universe can be viewed as a slowed-down simulation of the spin-foam froth of interacting processes of loop quantum gravity running with a clock speed of 10^43 Hz that may comprise the physical Universe. And in Software Chaos, we examine the nonlinear behavior of software and some of its emergent behaviors and follow up in CyberCosmology with the possibility that vast quantities of software running on large nonlinear networks might eventually break out into consciousness in accordance with the work of George Dyson and Daniel Dennett. In Model-Dependent Realism - A Positivistic Approach to Realism we compare Steven Weinberg’s realism with the model-dependent realism of Stephen Hawking and Leonard Mlodinow and how the two worldviews affect the search for a Final Theory. Finally, in The Software Universe as an Implementation of the Mathematical Universe Hypothesis we at long last explore what software might really be, and discover that the Software Universe might actually be more closely related to the physical Universe than you might think.

The chief advantage of doing fieldwork in the Software Universe is that, unlike most computer simulations of the physical Universe, it is an unintended and accidental simulation, without any of the built-in biases that most computer simulations of the physical Universe suffer from. So you will truly be able to do fieldwork in a pristine and naturally occurring simulation, just as IT professionals can do fieldwork in the wild and naturally occurring simulation of software that the living things of the biosphere provide. Secondly, the Software Universe is a huge simulation that is far beyond the budgetary means of any institution or consortium by many orders of magnitude. So if you are an evolutionary biologist, astrobiologist, or paleontologist working on the origin and evolution of life in the Universe, or a physicist or economist working on the emergent behaviors of nonlinear systems and complexity theory, or a neurobiologist working on the emergence of consciousness in neural networks, or even a frustrated string theorist struggling with quantum gravity, it would be well worth your while to pay a friendly call upon the local IT department of a major corporation in your area. Start with a visit to the Command Center for their IT Operations department to get a global view of their IT infrastructure and to see how it might be of assistance to the work in your area of interest. From there you can branch out to the applicable area of IT that will provide the most benefit.

The Impact of Self-Replicating Information Upon the Planet
One of the key findings of softwarephysics is concerned with the magnitude of the impact upon the planet of self-replicating information.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

Basically, we have seen several waves of self-replicating information dominate the Earth:
1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Note that the self-replicating autocatalytic metabolic pathways of organic molecules, RNA and DNA have become so heavily intertwined over time that I now simply refer to them collectively as the “genes”. Over the past 4.0 billion years, the surface of the Earth has been totally reworked by three forms of self-replicating information – the genes, memes and software, with software rapidly becoming the dominant form of self-replicating information on the planet. For more on this see:

A Brief History of Self-Replicating Information
How to Use an Understanding of Self-Replicating Information to Avoid War
How to Use Softwarephysics to Revive Memetics in Academia
Is Self-Replicating Information Inherently Self-Destructive?
Is the Universe Fine-Tuned for Self-Replicating Information?
Self-Replicating Information

Softwarephysics and the Real World of Human Affairs
Having another universe readily at hand to explore, even a simulated universe like the Software Universe, necessarily has an impact upon one's personal philosophy of life, and allows one to draw certain conclusions about the human condition and what it’s all about, so as you read through the postings in this blog you will stumble across a bit of my own personal philosophy - definitely a working hypothesis still in the works. Along these lines you might be interested in a few postings where I try to apply softwarephysics to the real world of human affairs:

MoneyPhysics – my impression of the 2008 world financial meltdown.

The Fundamental Problem of Everything – If you Google "the fundamental problem of everything", this will be the only hit you get on the entire Internet, which is indicative of the fundamental problem of everything!

What’s It All About? and Genes, Memes and Software – my current working hypothesis on what it’s all about.

How to Use an Understanding of Self-Replicating Information to Avoid War – my current working hypothesis for how the United States can avoid getting bogged down again in continued war in the Middle East.

Hierarchiology and the Phenomenon of Self-Organizing Organizational Collapse - a modern extension of the classic Peter Principle that applies to all hierarchical organizations and introduces the Time Invariant Peter Principle.

Some Specifics About These Postings
The postings in this blog are supplemental reading for my course on softwarephysics for IT professionals entitled SoftwarePhysics 101 – The Physics of Cyberspacetime, which was originally designed to be taught as a series of seminars at companies where I was employed. Since softwarephysics essentially covers the simulated physics, chemistry, biology, and geology of an entire simulated universe, the slides necessarily just provide a cursory skeleton upon which to expound. The postings in this blog go into much greater depth. Because each posting builds upon its predecessors, the postings in this blog should be read in the reverse of the order in which the blog displays them, that is, from the oldest to the most recent, beginning with my original posting on SoftwarePhysics. In addition, several universities also now offer courses on Biologically Inspired Computing which cover some of the biological aspects of softwarephysics, and the online content for some of these courses can be found by Googling for "Biologically Inspired Computing" or "Natural Computing". At this point we will finish up with my original plan for this blog on softwarephysics with a purely speculative posting on CyberCosmology that describes the origins of the Software Universe, cyberspacetime, software and where they all may be heading. Since CyberCosmology will be purely speculative in nature, it will not be of much help to you in your IT professional capacities, but I hope that it might be a bit entertaining. Again, if you are new to softwarephysics, you really need to read the previous posts before taking on CyberCosmology. I will probably continue on with some additional brief observations about softwarephysics in the future, but once you have completed CyberCosmology, you can truly consider yourself to be a bona fide softwarephysicist.

For those of you following this blog, the posting dates on the posts may seem to behave in a rather bizarre manner. That is because in order to get the Introduction to Softwarephysics listed as the first post in the context root of http://softwarephysics.blogspot.com/ I have to perform a few IT tricks. When publishing a new posting, I simply copy the contents of the Introduction to Softwarephysics to a new posting called the New Introduction to Softwarephysics. Then I update the original Introduction to Softwarephysics entry with the title and content of the new posting to be published. I then go back and take “New” out of the title of the New Introduction to Softwarephysics. This way the Introduction to Softwarephysics always appears as the first posting in the context root of http://softwarephysics.blogspot.com/. The side effect of all this is that the real posting date of posts is the date that appears on the post that you get when clicking on the Newer Post link at the bottom left of the posting webpage.

SoftwarePhysics 101 – The Physics of Cyberspacetime is now available on Microsoft OneDrive.

SoftwarePhysics 101 – The Physics of Cyberspacetime - Original PowerPoint document

Entropy – A spreadsheet referenced in the document

BSDE – A 1989 document describing how to use BSDE - the Bionic Systems Development Environment - to grow applications from genes and embryos within the maternal BSDE software.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
http://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Tuesday, June 09, 2015

A Further Comment on Fermi's Paradox and the Galactic Scarcity of Software

I just finished reading:

If the Universe is Teeming with Aliens… Where is Everybody? Fifty Solutions to the Fermi Paradox and the Problem of Extraterrestrial Life (2002) by Stephen Webb.

In Where is Everybody? Stephen Webb briefly goes through 50 possible solutions to explain Fermi’s Paradox:

Fermi’s Paradox - If the Universe is just chock full of intelligent beings, why do we not see any evidence of their existence?

He divides the 50 solutions into three categories:

1. They Are Here
2. They Exist But Have Not Yet Communicated
3. They Do Not Exist

Many of these solutions can be found in Some Additional Thoughts on the Galactic Scarcity of Software, SETS - The Search For Extraterrestrial Software, CyberCosmology and The Sounds of Silence the Unsettling Mystery of the Great Cosmic Stillness. Stephen Webb can only spend a few pages on each proposed solution, but he does so in a very unbiased and even-handed manner, and his book probably does provide the most exhaustive analysis of the Fermi Paradox that I am aware of.

I think that, like most members of the scientific community, Stephen Webb has a very low level of confidence in most of the solutions that suggest that intelligent aliens have already made their presence known here on Earth. That leaves the solutions that maintain that intelligent aliens exist, but like us, do not yet have the necessary technology or desire to communicate with the rest of the galaxy, or that we indeed are alone in our galaxy and that intelligent aliens simply do not exist elsewhere in it. However, the solutions that propose that intelligent aliens do exist within our galaxy, but have not yet made their presence known to us, all suffer from the same problem. We, ourselves, are a mere 400 years into the Scientific Revolution, and yet we already have started to make our presence known to the rest of the galaxy through radio and television broadcasts. Certainly, within the next few centuries, the Earth will have sufficient technology to unleash von Neumann probes upon our galaxy, self-replicating robotic probes that travel from star system to star system building copies along the way. Studies have shown that once released, von Neumann probes should easily subdue our entire galaxy within a few million years, and since our galaxy is about 10 billion years old, we should already find ourselves knee-deep in alien von Neumann probes, but that obviously is not the case. That leaves the solutions that maintain that we are alone in the galaxy, and that alien intelligences simply do not exist. Most of those solutions hinge upon the Rare Earth Hypothesis presented in the classic Rare Earth (2000) by Peter Ward and Donald Brownlee. The Rare Earth Hypothesis maintains that our Earth and Solar System are a fluke of nature that is very hard to reproduce in the first place, and because our universe is also a very dangerous place for intelligent beings, it is very hard for a planet and planetary system to remain hospitable for intelligent beings for very long.
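
To get a feel for where the "few million years" figure for von Neumann probes comes from, here is a minimal back-of-the-envelope sketch in Python. All of the numbers in it - the probe speed, the hop distance between colonized star systems, and the replication time at each stop - are assumptions chosen purely for illustration, not results taken from any particular study.

    galaxy_diameter_ly = 100000    # rough diameter of the Milky Way in light-years
    probe_speed_c      = 0.1       # assumed probe speed as a fraction of the speed of light
    hop_distance_ly    = 10        # assumed distance between colonized star systems
    replication_years  = 500       # assumed time to build the next generation of probes

    hops = galaxy_diameter_ly / hop_distance_ly
    years_per_hop = hop_distance_ly / probe_speed_c + replication_years
    print("Rough sweep time: %.1f million years" % (hops * years_per_hop / 1e6))
    # Rough sweep time: 6.0 million years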

In the conclusion of Where is Everybody? Stephen Webb comes up with a personal solution for Fermi’s Paradox based upon a “death by a thousand cuts” explanation which I find very plausible. Now that we have essentially figured out how our Universe formed and evolved, how galaxies and stars formed and evolved, how our Solar System formed and evolved, how the Earth formed and evolved, how simple prokaryotic life formed and evolved, how more complex eukaryotic life formed and evolved, how complex multicellular life based upon eukaryotic cells formed and evolved, how complex neuronetworks formed and evolved within complex multicellular organisms, and how intelligent and self-aware organisms emerged from these complex neuronetworks, it becomes quite apparent that any disruption to this very complicated chain of events could easily derail the emergence of sentient beings.

To model this effect Stephen Webb cleverly uses the sieve of Eratosthenes as an example. The sieve of Eratosthenes is a simple algorithm that can be used to quickly filter out the non-prime numbers from a population of natural numbers. To understand this we need to review what natural and prime numbers are. The natural numbers are formed by simply taking “1” and adding “1” to it an infinite number of times. This of course leads to the familiar sequence of counting numbers “1, 2, 3, 4, 5, 6, 7, 8, 9, 10 …”. Prime numbers are simply the natural numbers greater than “1” that are only evenly divisible by “1” and themselves, yielding the sequence “2, 3, 5, 7, 11, 13, 17…”. Naturally, there are an infinite number of natural numbers because you can always add “1” to a number to obtain a number that is “1” greater, but around 300 B.C. Euclid proved that there are also an infinite number of prime numbers. Since not all natural numbers are prime numbers, the prime numbers form a proper subset of the natural numbers; strictly speaking, both sets are countably infinite and so are the same “size” of infinity, but the prime numbers become ever scarcer amongst the natural numbers as the numbers grow larger. The problem is that there is no formula that tells you in advance whether or not a particular natural number is going to be a prime number. However, there are statistical approaches, and these statistical approaches tell us that as the number of digits in a natural number increases, the odds of it being a prime number decrease because there are so many smaller natural numbers that could be a divisor of the natural number in question. The Prime Number Theorem states that for any natural number N that is sufficiently large, the odds of the number N being a prime number are very close to 1 / ln(N). Thus the odds that a natural number N with 1000 digits is prime are about one in 2300 because ln 10^1000 ~ 2302.6, whereas for a natural number with 2000 digits, about one in 4600 is prime because ln 10^2000 ~ 4605.2. The average gap between consecutive prime numbers amongst the first N natural numbers is also roughly ln(N), and so the gap between prime numbers increases as N increases. This means that as natural numbers get bigger, the odds of them being a prime number decrease and the gaps between prime numbers grow larger. So as N goes to infinity, the odds of N being a prime number go to zero. So even though there are an infinite number of prime numbers hidden amongst the infinite number of natural numbers, as N goes to infinity the vast majority of the natural numbers get eliminated from the set of prime numbers because they have a divisor that eliminates them from being a prime number. Stephen Webb proposes that a similar elimination process may have occurred amongst the 1 trillion planets and moons that are likely to be found within our galaxy, leaving the Earth as the sole safe harbor for intelligent technologically-capable beings. As each fluky turn of events that led to the emergence of intelligent technologically-capable beings on the Earth unfolded, it eliminated a number of the 1 trillion candidates, and perhaps eliminated them more than once in an “overkill” manner.
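
For those who would like to check the Prime Number Theorem figures quoted above, here is a minimal sketch in Python. Since ln(10^N) is just N * ln(10), it is nothing more than the arithmetic from the paragraph above.

    import math

    # A quick check of the Prime Number Theorem figures quoted above;
    # ln(10^N) is simply N * ln(10).
    for digits in (1000, 2000):
        log_n = digits * math.log(10)          # ln(10^digits)
        print("ln(10^%d) = %.1f, so roughly 1 in %.0f such numbers is prime"
              % (digits, log_n, log_n))
    # ln(10^1000) = 2302.6, so roughly 1 in 2303 such numbers is prime
    # ln(10^2000) = 4605.2, so roughly 1 in 4605 such numbers is prime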

To see how this works take a look at Figure 1 which is a pictorial depiction of the sieve of Eratosthenes. Recall that Eratosthenes was the first person to measure the size of the Earth by measuring the angular height of the Sun in the Egyptian cities of Syene and Alexandria at the summer solstice and then measuring the distance between the two cities. The sieve of Eratosthenes works like this. First you pick a population of natural numbers. In Figure 1 we chose the first 120 natural numbers. The number “2” is the first prime number. Then we begin to eliminate candidate prime numbers from the population of the first 120 natural numbers by multiplying the first prime number “2” by the sequence “2, 3, 4, 5 …”, yielding:

2 * 2 = 4
2 * 3 = 6
2 * 4 = 8
2 * 5 = 10

It is easy to see that this eliminates roughly 1/2 of the first 120 natural numbers. Then we do the same thing for the second prime number “3”, by multiplying it by the sequence “2, 3, 4, 5 …”, yielding:

3 * 2 = 6
3 * 3 = 9
3 * 4 = 12
3 * 5 = 15

This process would eliminate another 1/3 of the natural numbers, but there is some “overkill” in the process because the numbers “6” and “12” were already eliminated by the prime number “2”. We see similar “overkill” actions when performing the same process for prime numbers “5” and “7”. In fact, by the time we get to prime number “11” all of the natural numbers in the first 120 natural numbers that “11” would eliminate, like “22”, “33”, “44”, “55”, “66”, “77”, “88” and “99” have already been eliminated by smaller prime numbers than “11”.
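
For readers who would like to play with the sieve themselves, below is a minimal sketch of the sieve of Eratosthenes in Python that mirrors the elimination steps just described for the first 120 natural numbers.

    # A minimal sketch of the sieve of Eratosthenes.
    def sieve(limit):
        is_candidate = [True] * (limit + 1)      # start by assuming every number >= 2 is prime
        is_candidate[0] = is_candidate[1] = False
        for p in range(2, int(limit ** 0.5) + 1):
            if is_candidate[p]:
                # Eliminate every multiple of p; some of these were already
                # eliminated by smaller primes - the "overkill" described above.
                for multiple in range(p * 2, limit + 1, p):
                    is_candidate[multiple] = False
        return [n for n in range(2, limit + 1) if is_candidate[n]]

    print(sieve(120))
    # [2, 3, 5, 7, 11, 13, ..., 109, 113]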

Stephen Webb uses this “overkill” action of the sieve of Eratosthenes as a metaphor. Perhaps there is no single explanation for Fermi's Paradox, like only rocky planets with plate tectonics produce intelligent beings. Perhaps it is the myriad elements of the Rare Earth Hypothesis in combination with the inherent dangers of our galaxy that finally “overkills” all other planets and moons in our galaxy, leaving Earth alone with the only intelligent technologically-capable beings in the galaxy.

Figure 1 – Perhaps the myriad elements of the Rare Earth Hypothesis in combination essentially form a sieve of Eratosthenes that eliminates all other locations in our galaxy as possible candidates for intelligent technologically-capable beings.

The problem with this solution is that in recent years we have obtained evidence from the Kepler Mission:

http://kepler.nasa.gov/

that demonstrates that planets are quite common in our galaxy, and we have even found some Earth-like planets out there already. Researchers have even begun to obtain spectra from the atmospheres of such planets, so within a few years we may be able to detect the presence of carbon-based life forms from molecular signatures in the atmospheres of these observed exoplanets. So it is rather difficult to propose that all of the factors that tend to eliminate intelligent technologically-capable beings in our galaxy always come together to eliminate them with 100% efficiency. So let me try to add one more elimination factor to the long list that we already have.

Another Possible Explanation For Fermi’s Paradox
Whenever I read material about such matters, I always try to remember that we are now living in a very special time in the history of the self-replicating information that has been reworking the surface of our planet for the past 4.0 billion years. Recall that softwarephysics maintains that there are currently three forms of self-replicating information on the Earth – the genes, memes and software, with software rapidly becoming the dominant form of self-replicating information on the planet (see A Brief History of Self-Replicating Information for details). Given the fact that software will soon become the dominant form of self-replicating information on the planet, I seriously doubt that there will be any human beings, as we now know them, on the Earth a few centuries from now. Mankind, as we know it, will simply no longer exist. By that time, we will have become a well-dispersed transition fossil marking the arrival of machine intelligence on the Earth. Hopefully, we will have merged with the machines in a parasitic/symbiotic manner, like all of the preceding parasitic/symbiotic mergers that have marked the evolutionary history of self-replicating information on the Earth over the past 4.0 billion years, but those details remain to be seen.

In Programming the Universe: A Quantum Computer Scientist Takes on the Cosmos (2006) Seth Lloyd proposes that our Universe is simply a vast quantum computer calculating how to perform. Perhaps in 1,000 years when software has finally become the dominant form of self-replicating information on the planet and is running on huge networks of quantum computers, it will make no distinction between the “real” Universe and the “simulated” universes that it can easily cook up in its own hardware. Perhaps as we saw in Quantum Computing and the Many-Worlds Interpretation of Quantum Mechanics the software running on these vast networks of quantum computers of the future will come to realize that the Many-Worlds interpretation of quantum mechanics is indeed correct, and that the humans of long ago were simply a large collection of quantum particles constantly getting entangled or “correlated” with other quantum particles, and splitting off into parallel universes in the process. This constant splitting gave the long-forgotten humans the delusion that they were conscious beings, and led them to do very strange things, like look for similarly deluded entities.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
http://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Sunday, May 10, 2015

Quantum Computing and the Many-Worlds Interpretation of Quantum Mechanics

In my last posting, The Software Universe as an Implementation of the Mathematical Universe Hypothesis we explored Max Tegmark’s proposal that our physical Universe, and the Software Universe that we IT professionals and end-users are all immersed in, is simply an unchanging eternal mathematical structure that has always existed in a Platonic sense. In that posting we discussed Max Tegmark’s proposal that there is a Level III multiverse comprised of an infinite number of Level I and Level II multiverses that are constantly splitting due to Hugh Everett’s Many-Worlds Interpretation of quantum mechanics. In this posting I would like to further explore the Many-Worlds Interpretation of quantum mechanics as it relates to quantum computing because many quantum computer researchers consider it key to the advancement of quantum computing.

The concept of quantum computing goes back to some early work in 1982 by Richard Feynman and David Deutsch, but it was David Deutsch who carried the idea forward and came up with the very first theoretical design of a quantum computer, similar to Alan Turing’s 1936 theoretical description of classical computers. Here is a link to David Deutsch’s seminal 1985 paper describing quantum computers and contrasting them with the classical computers that we work with today:

http://folk.uio.no/ovrum/articles/deutsch85.pdf

A more accessible outline of quantum computing can be found in David Deutsch’s book The Fabric of Reality (1997). Another very good book on quantum computing is Programming the Universe: A Quantum Computer Scientist Takes on the Cosmos (2006) by Seth Lloyd. Seth Lloyd is currently working on quantum computers at MIT, and is the first quantum mechanical engineer in MIT’s Mechanical Engineering department. Seth Lloyd is recognized for proposing the very first technologically feasible design for a quantum computer. In his book he proposes that the physical Universe is a huge quantum computer calculating how to behave and generates what we observe in the physical Universe, along the lines of Max Tegmark’s Level III multiverse. A good online synopsis of this idea is available in The Computational Universe (2002), in which he calculates the computing power of the entire physical Universe treated as one large quantum computer. You can find this fascinating paper at:

http://www.edge.org/3rd_culture/lloyd2/lloyd2_p2.html

So why is the Many-Worlds Interpretation important to quantum computer research? Well, the whole point to quantum computation is that a quantum computer can perform a huge number of logical operations in parallel using a limited amount of hardware, while classical computers need dedicated hardware for each logical operation. For example, in a classical computer, like your laptop, a 1-bit memory location can hold a 1 or a 0, but in a quantum computer, a 1-qubit memory location can hold both a 1 and a 0 at the same time in a superposition of quantum states! In a classical computer, when your code reads the 1 or 0 at the top of an if-then-else block, it will do one thing or the other by branching either into the then-block of code or into the else-block of code. But in a quantum computer, a 1-qubit memory location can be in a superposition of being both 1 and 0 at the same time, so when the quantum computer reads the 1-qubit memory location, it logically splits into two quantum computers. One of the twin quantum computers performs the then-block, while the other quantum computer performs the else-block at the same time and in parallel. So with a quantum computer, you can have a single computer behave like an exponentially large number of computers all working in parallel on the same problem at the same time. In that sense, a quantum computer would behave very much like Mickey’s water-carrying brooms in The Sorcerer's Apprentice segment of Walt Disney’s Fantasia, constantly splitting in two to perform a task at each logical branch of your program:

https://www.youtube.com/watch?v=T8gOh0wEgLg
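
To make the qubit bookkeeping described above a little more concrete, here is a minimal classical sketch in Python. It only simulates the amplitudes and the branch-choosing measurement on an ordinary computer, so it gains none of the quantum parallelism - which is precisely the point of the exponential count at the end.

    import math, random

    # A qubit is a pair of complex amplitudes for the states |0> and |1>.
    # An equal superposition of 0 and 1:
    qubit = [1 / math.sqrt(2), 1 / math.sqrt(2)]

    def measure(amplitudes):
        # The probability of each outcome is the squared magnitude of its amplitude.
        p0 = abs(amplitudes[0]) ** 2
        return 0 if random.random() < p0 else 1

    # Reading the qubit at the top of an if-then-else block: each measurement
    # randomly "chooses a branch", which is where the one-computer-per-branch
    # picture of the Many-Worlds Interpretation comes from.
    print([measure(qubit) for _ in range(10)])
    # e.g. [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]

    # n qubits require 2**n amplitudes - the state that the text describes as
    # behaving like a huge number of computers working in parallel:
    n = 20
    print("Amplitudes needed for %d qubits: %d" % (n, 2 ** n))
    # Amplitudes needed for 20 qubits: 1048576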

In his book, David Deutsch asks the very compelling question of where exactly all of those computations are being performed, if not in a huge number of parallel universes by a huge number of parallel quantum computers? That is why the Many-Worlds Interpretation of quantum mechanics seems so natural to those working on quantum computers. In fact, it is rather difficult to picture how a quantum computer could operate using the standard Copenhagen Interpretation of quantum mechanics. So let me refresh your memory on the Copenhagen Interpretation of quantum mechanics before proceeding.

In 1927, Niels Bohr and Werner Heisenberg proposed a very positivistic interpretation of quantum mechanics now known as the Copenhagen Interpretation. You see, Bohr was working at the University of Copenhagen Institute of Theoretical Physics at the time. The Copenhagen Interpretation contends that absolute reality does not really exist. Instead, there are an infinite number of potential realities, defined by the wavefunction ψ of a quantum system, and when we make a measurement of a quantum system, the wavefunction of the quantum system collapses into a single value that we observe, and thus brings the quantum system into reality (see Quantum Software for more on wavefunctions). This satisfied Max Born’s contention that wavefunctions are just probability waves. The Copenhagen Interpretation suffers from several philosophical problems though. For example, Eugene Wigner pointed out that the devices we use to measure quantum events are also made out of atoms which are quantum objects in themselves, so when an observation is made of a single atom of uranium to see if it has gone through a radioactive decay using a Geiger counter, the atomic quantum particles of the Geiger counter become entangled in a quantum superposition of states with the uranium atom. If the uranium has decayed, then the uranium atom and the Geiger counter are in one quantum state, and if the atom has not decayed, then the uranium atom and the Geiger counter are in a different quantum state. If the Geiger counter is fed into an amplifier, then we have to add in the amplifier too into our quantum superposition of states. If a physicist is patiently listening to the Geiger counter, we have to add him into the chain as well, so that he can write and publish a paper which is read by other physicists and is picked up by Time magazine for a popular presentation to the public. So when does the “measurement” actually take place? We seem to have an infinite regress. Wigner’s contention is that the measurement takes place when a conscious being first becomes aware of the observation. Einstein had a hard time with the Copenhagen Interpretation of quantum mechanics for this very reason because he thought that it verged upon solipsism. Solipsism is a philosophical idea from Ancient Greece. In solipsism, your Mind is the whole thing, and the physical Universe is just a figment of your imagination. So I would like to thank you very much for thinking of me and bringing me into existence! Einstein’s opinion of the Copenhagen Interpretation of quantum mechanics can best be summed up by his statement "Is it enough that a mouse observes that the Moon exists?". Einstein objected to the requirement for a conscious being to bring the Universe into existence, because in Einstein’s view, measurements simply revealed to us the condition of an already existing reality that does not need us around to make measurements in order to exist. But in the Copenhagen Interpretation, the absolute reality of Einstein does not really exist. Additionally, in the Copenhagen Interpretation, objects do not really exist until a measurement is taken, which collapses their associated wavefunctions, but the mathematics of quantum mechanics does not shed any light on how a measurement could collapse a wavefunction.

The collapse of the wavefunction is also a one-way street. According to the mathematics of quantum mechanics a wavefunction changes with time in a deterministic manner, so, like all of the other current effective theories of physics, it is reversible in time and can be run backwards. This is also true in the Copenhagen Interpretation, so long as you do not observe the wavefunction and collapse it by the process of observing it. In the Copenhagen Interpretation, once you observe a wavefunction and collapse it, you cannot undo the collapse, so the process of observation becomes nonreversible in time. That means if you fire photons at a target, but do not observe them, it is possible to reverse them all in time and return the Universe back to its original state. That is how all of the other effective theories of physics currently operate. But in the Copenhagen Interpretation, if you do observe the outgoing photons you can never return the Universe back to its original state. This can best be summed up by the old quantum mechanical adage - look particle, don’t look wave. A good way to picture this in your mind is to think of a circular tub of water. If you drop a pebble into the exact center of a circular tub of water, a series of circular waves will propagate out from the center. Think of those waves as the wavefunction of an electron changing with time into the future according to the Schrödinger equation. When the circular waves hit the circular walls of the tub they will be reflected back to the center of the tub. Essentially, they can be viewed as moving backwards in time. This can happen in the Copenhagen Interpretation so long as the electron is never observed as its wavefunction moves forward or backward in time. However, if the wavefunction is observed and collapsed, it can never move backwards in time, so observation becomes a one-way street.

In 1956, Hugh Everett, working on his Ph.D. under John Wheeler, proposed the Many-Worlds Interpretation of quantum mechanics as an alternative. The Many-Worlds Interpretation admits an absolute reality, but claims that there are an infinite number of absolute realities spread across an infinite number of parallel universes. In the Many-Worlds Interpretation, when electrons or photons encounter a two-slit experiment, they go through one slit or the other, and when they hit the projection screen they interfere with electrons or photons from other universes that went through the other slit! In Everett’s original version of the Many-Worlds Interpretation, the entire Universe splits into two distinct universes whenever a particle is faced with a choice of quantum states, and so all of these universes are constantly branching into an ever-growing number of additional universes. In the Many-Worlds Interpretation of quantum mechanics, the wavefunctions or probability clouds of electrons surrounding an atomic nucleus are the result of overlaying the images of many “real” electrons in many parallel universes. Thus, according to the Many-Worlds Interpretation wavefunctions never collapse. They just deterministically evolve in an abstract mathematical Hilbert space and are reversible in time, like everything else in physics.

While doing research for The Software Universe as an Implementation of the Mathematical Universe Hypothesis I naturally consulted Max Tegmark’s HomePage at:

http://space.mit.edu/home/tegmark/mathematical.html

and I found a link there to Hugh Everett’s original 137-page Jan 1956 draft Ph.D. thesis in which he laid down the foundations for the Many-Worlds Interpretation. This is a rare document indeed because on March 1, 1957, Everett submitted a very compressed version of his theory in his final 36-page doctoral dissertation, "On the Foundations of Quantum Mechanics", after heavy editing by his thesis advisor John Wheeler to make his Ph.D. thesis more palatable to the committee that would be hearing his oral defense and also to not offend Niels Bohr, one of the founding fathers of the Copenhagen Interpretation and still one of its most prominent proponents. But years later John Wheeler really did want to know what Niels Bohr thought of Hugh Everett’s new theory and encouraged Everett to visit Copenhagen in order to meet with Bohr. Everett and his wife did finally travel to Copenhagen in March of 1959, and spent six weeks there. But by all accounts the meeting between Bohr and Everett was a disaster, with Bohr not even discussing the Many-Worlds Interpretation with Everett.

Below is the link to Hugh Everett’s original 137-page Jan 1956 draft Ph.D. thesis:

http://www.pbs.org/wgbh/nova/manyworlds/pdf/dissertation.pdf

I have also placed his thesis on Microsoft OneDrive at:

https://onedrive.live.com/redir?resid=21488ff1cf19c88b!1437&authkey=!ADIm_WTYLkbx90I&ithint=file%2cpdf

Since I love to read the original source documents for great ideas, like Copernicus’s On the Revolutions of the Celestial Spheres (1543), Galileo’s the Starry Messenger (1610) and Dialogue Concerning the Two Chief World Systems (1632), Newton’s Principia (1687), and Darwin’s On the Origin of Species (1859), I could not resist reading Hugh Everett’s original work too. So in this posting I would like to step through Hugh Everett’s original Ph.D. thesis with you page by page, with a little translation along the way. To do that, let’s focus on the introduction and the concluding chapter of his original Ph.D. thesis, where he outlines what he is trying to achieve, and then skip over most of the math in the intervening chapters. For those chapters I will only highlight his key findings as he builds his case for the Many-Worlds Interpretation.

For the remainder of this posting, direct quotes from Hugh Everett’s original Ph.D. thesis will be in blue, while my comments will be in black.

The Many-Worlds Interpretation

THE THEORY OF THE UNIVERSAL WAVEFUNCTION

Hugh Everett, III

I. INTRODUCTION

We begin, as a way of entering our subject, by characterizing a particular interpretation of quantum theory which, although not representative of the more careful formulations of some writers, is the most common form encountered in textbooks and university lectures on the subject.

With the very first sentence of his Ph.D. thesis, Hugh Everett throws down the gauntlet and begins by discussing the Copenhagen Interpretation of quantum mechanics and classifying it as not one of the more “careful formulations”. He is correct about the textbooks of the day exclusively teaching the Copenhagen Interpretation. I took my very first quantum mechanics course in 1970, and in those days the Copenhagen Interpretation was taught as a quantum mechanical fact. In fact, the textbooks of the day did not even refer to the idea of the act of measurement collapsing wavefunctions as the Copenhagen Interpretation because that would imply that other interpretations were even possible.

A physical system is described completely by a state function ψ, which is an element of a Hilbert space, and which furthermore gives information only concerning the probabilities of the results of various observations which can be made on the system. The state function ψ is thought of as objectively characterizing the physical system, i.e., at all times an isolated system is thought of as possessing a state function, independently of our state of knowledge of it. On the other hand, ψ changes in a causal manner so long as the system remains isolated, obeying a differential equation. Thus there are two fundamentally different ways in which the state function can change:

Hugh Everett begins his dissertation stating what everybody already agrees upon in classical quantum mechanics. Every physical system, like a single electron, can be described by a wavefunction called ψ that is a solution to Schrödinger’s equation. Note that in his thesis Hugh Everett sometimes uses the term “state function” and sometimes the term “wave function”, rather than the term wavefunction ψ. All of these terms mean the same thing. They are just solutions to the Schrödinger wave equation, which sometimes Hugh Everett refers to simply as the “wave equation”. The wavefunction ψ is a wiggly line that extends over the whole Universe, but has the greatest amplitude near where the electron is most likely to be found. The wavefunction ψ is also a complex function with both a real and imaginary part, so it has both an amplitude and a phase (See The Foundations of Quantum Computing for details).

The chief difference between quantum mechanics and classical mechanics is that in classical mechanics objects have definite properties, like a definite position or a definite velocity. This is not so in quantum mechanics. In quantum mechanics, objects can be in a mixture or superposition of states. For example, if you pin down the exact location of an electron in quantum mechanics, the electron is said to be in a certain state of position called an eigenstate, with a certain numerical position called an eigenvalue. In the Copenhagen Interpretation the act of measurement takes an object that is in a superposition of states and collapses its wavefunction down into a particular eigenstate with a particular eigenvalue. And this is a totally probabilistic process. The wavefunction itself does not specifically say where the object is located in advance. The wavefunction just tells you the probability of observing specific eigenstates with specific eigenvalues, and this probability is obtained by finding the square of the wavefunction's amplitude at a given position. For example, observing a hydrogen atom which initially is in a superposition of many states might determine that the hydrogen atom is in its ground state eigenstate, with a known energy level that is its energy eigenvalue.
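
Hugh Everett’s thesis of course contains no computer code, but to make the Born rule just described a bit more concrete, here is a little Python sketch (it assumes the numpy library is available); the three amplitudes are made-up numbers purely for illustration:

import numpy as np

# A toy quantum state: a superposition over three position eigenstates.
# The complex amplitudes below are invented purely for illustration.
psi = np.array([0.6, 0.0 + 0.8j, 0.0], dtype=complex)

# The Born rule: the probability of observing each eigenstate is the
# square of the magnitude of its amplitude, |psi_i|**2.
probabilities = np.abs(psi) ** 2
print(probabilities)        # [0.36 0.64 0.  ]
print(probabilities.sum())  # ~1.0, since the state is normalized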

Hugh Everett then goes on to define two ways the wavefunction ψ can change with time:

Process 1: A discontinuous change brought on by observation. In the standard Copenhagen Interpretation this causes the wavefunction ψ of the electron, which is spread out over the entire Universe with decreasing amplitude as you get further away from where the electron is likely to be found, to suddenly collapse, so that the amplitude of the wavefunction ψ of the electron becomes huge where the electron is observed.

Process 2: The electron is not observed, so its wavefunction ψ sort of smears out with time in a deterministic manner.
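
To see the difference between the two processes in code, here is a minimal Python sketch (numpy assumed, with a made-up two-state system and Hamiltonian). Process 2 is a smooth, reversible matrix multiplication, while Process 1 is a random, discontinuous projection bolted on from outside the wave equation:

import numpy as np

# Process 2: deterministic, causal evolution under the wave equation.
# For a two-state system with a toy Hamiltonian H, psi(t) = U psi(0),
# where U = exp(-iHt) is a unitary matrix built from the eigenvalues of H.
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])                  # illustrative Hamiltonian only
energies, V = np.linalg.eigh(H)
t = 0.3
U = V @ np.diag(np.exp(-1j * energies * t)) @ V.conj().T

psi0 = np.array([1.0, 0.0], dtype=complex)  # start in the first basis state
psi_t = U @ psi0                            # smooth, reversible change

# Process 1: the Copenhagen-style collapse upon observation.
# The state jumps to one eigenstate at random, with probabilities
# given by the squared amplitudes.
probs = np.abs(psi_t) ** 2
probs = probs / probs.sum()                 # guard against rounding error
outcome = np.random.choice(len(psi_t), p=probs)
collapsed = np.zeros_like(psi_t)
collapsed[outcome] = 1.0                    # the discontinuous jump
print(psi_t, "->", collapsed)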

The question of the consistency of the scheme arises if one contemplates regarding the observer and his object-system as a single (composite) physical system. Indeed, the situation becomes quite paradoxical if we allow for the existence of more than one observer. Let us consider the case of one observer A, who is performing measurements upon a system S, the totality (A + S) in turn forming the object-system for another observer, B.

If we are to deny the possibility of B's use of a quantum mechanical description (wave function obeying wave equation) for A + S, then we must be supplied with some alternative description for systems which contain observers (or measuring apparatus). Furthermore, we would have to have a criterion for telling precisely what type of systems would have the preferred positions of "measuring apparatus" or "observer" and be subject to the alternate description. Such a criterion is probably not capable of rigorous formulation.

On the other hand, if we do allow B to give a quantum description to A + S, by assigning a state function ψ A+S, then, so long as B does not interact with A + S, its state changes causally according to Process 2, even though A may be performing measurements upon S. From B's point of view, nothing resembling Process 1 can occur (there are no discontinuities), and the question of the validity of A's use of Process 1 is raised. That is, apparently either A is incorrect in assuming Process 1, with its probabilistic implications, to apply to his measurements, or else B's state function, with its purely causal character, is an inadequate description of what is happening to A + S.

Basically, if I am observer A and I observe electron S using Process 1, according to the Copenhagen Interpretation I collapse the wavefunction ψ of the electron down to a single point in space. But I am just made up of a huge number of quantum particles too, just like the single electron that I observed. So if you, as observer B, do not watch me (as observer A) observing electron S, the wavefunction ψ A+S that describes me and the electron does not collapse and it continues to change in a deterministic manner according to Process 2. So either I, as observer A, do not really collapse the wavefunction of the electron with Process 1, or you, as observer B, do not let the combination of me and the electron evolve in time in an undisturbed manner according to Process 2. Thus, the Copenhagen Interpretation leads to a contradiction when more than one observer is involved.

To better illustrate the paradoxes which can arise from strict adherence to this interpretation we consider the following amusing, but extremely hypothetical drama. Isolated somewhere out in space is a room containing an observer, A, who is about to perform a measurement upon a system S. After performing his measurement he will record the result in his notebook. We assume that he knows the state function of S (perhaps as a result of previous measurement), and that it is not an eigenstate of the measurement he is about to perform. A, being an orthodox quantum theorist, then believes that the outcome of his measurement is undetermined and that the process is correctly described by Process 1.

In the meantime, however, there is another observer, B, outside the room, who is in possession of the state function of the entire room, including S, the measuring apparatus, and A, just prior to the measurement. B is only interested in what will be found in the notebook one week hence, so he computes the state function of the room for one week in the future according to Process 2. One week passes, and we find B still in possession of the state function of the room, which this equally orthodox quantum theorist believes to be a complete description of the room and its contents. If B's state function calculation tells beforehand exactly what is going to be in the notebook, then A is incorrect in his belief about the indeterminacy of the outcome of his measurement. We therefore assume that B's state function contains non-zero amplitudes over several of the notebook entries.

At this point, B opens the door to the room and looks at the notebook (performs his observation). Having observed the notebook entry, he turns to A and informs him in a patronizing manner that since his (B's) wave function just prior to his entry into the room, which he knows to have been a complete description of the room and its contents, had non-zero amplitude over other than the present result of the measurement, the result must have been decided only when B entered the room, so that A, his notebook entry, and his memory about what occurred one week ago had no independent objective existence until the intervention by B. In short, B implies that A owes his present objective existence to B's generous nature which compelled him to intervene on his behalf. However, to B's consternation, A does not react with anything like the respect and gratitude he should exhibit towards B, and at the end of a somewhat heated reply, in which A conveys in a colorful manner his opinion of B and his beliefs, he rudely punctures B's ego by observing that if B's view is correct, then he has no reason to feel complacent, since the whole present situation may have no objective existence, but may depend upon the future actions of yet another observer.

Clearly, in a Universe with more than one observer, the opening hypothesis of his thesis that wavefunctions change in time by either Process 1 or Process 2 cannot be right. Otherwise, nothing in the Universe would ever “really” happen until its very last sentient being took a peek into the room above and collapsed its very complicated wavefunction. Hugh Everett next proposes several alternative explanations.

It is now clear that the interpretation of quantum mechanics with which we began is untenable if we are to consider a universe containing more than one observer. We must therefore seek a suitable modification of this scheme, or an entirely different system of interpretation. Several alternatives which avoid the paradox are:

Alternative 1: To postulate the existence of only one observer in the universe. This is the solipsist position, in which each of us must hold the view that he alone is the only valid observer, with the rest of the universe and its inhabitants obeying at all times Process 2 except when under his observation.

This view is quite consistent, but one must feel uneasy when, for example, writing textbooks on quantum mechanics, describing Process 1, for the consumption of other persons to whom it does not apply.

Alternative 2: To limit the applicability of quantum mechanics by asserting that the quantum mechanical description fails when applied to observers, or to measuring apparatus, or more generally to systems approaching macroscopic size.

If we try to limit the applicability so as to exclude measuring apparatus, or in general systems of macroscopic size, we are faced with the difficulty of sharply defining the region of validity. For what n might a group of n particles be construed as forming a measuring device so that the quantum description fails? And to draw the line at human or animal observers, i.e., to assume that all mechanical apparata obey the usual laws, but that they are somehow not valid for living observers, does violence to the so-called principle of psycho-physical parallelism, and constitutes a view to be avoided, if possible. To do justice to this principle we must insist that we be able to conceive of mechanical devices (such as servomechanisms), obeying natural laws, which we would be willing to call observers.

Alternative 3: To admit the validity of the state function description, but to deny the possibility that B could ever be in possession of the state function of A + S. Thus one might argue that a determination of the state of A would constitute such a drastic intervention that A would cease to function as an observer.

The first objection to this view is that no matter what the state of A + S is, there is in principle a complete set of commuting operators for which it is an eigenstate, so that, at least, the determination of these quantities will not affect the state nor in any way disrupt the operation of A. There are no fundamental restrictions in the usual theory about the knowability of any state functions, and the introduction of any such restrictions to avoid the paradox must therefore require extra postulates.

The second objection is that it is not particularly relevant whether or not B actually knows the precise state function of A + S. If he merely believes that the system is described by a state function, which he does not presume to know, then the difficulty still exists. He must then believe that this state function changed deterministically, and hence that there was nothing probabilistic in A's determination.

Alternative 4: To abandon the position that the state function is a complete description of a system. The state function is to be regarded not as a description of a single system, but of an ensemble of systems, so that the probabilistic assertions arise naturally from the incompleteness of the description.

It is assumed that the correct complete description, which would presumably involve further (hidden) parameters beyond the state function alone, would lead to a deterministic theory, from which the probabilistic aspects arise as a result of our ignorance of these extra parameters in the same manner as in classical statistical mechanics.

Alternative 5: To assume the universal validity of the quantum description, by the complete abandonment of Process 1. The general validity of pure wave mechanics, without any statistical assertions, is assumed for all physical systems, including observers and measuring apparata. Observation processes are to be described completely by the state function of the composite system which includes the observer and his object-system, and which at all times obeys the wave equation (Process 2).

This brief list of alternatives is not meant to be exhaustive, but has been presented in the spirit of a preliminary orientation. We have, in fact, omitted one of the foremost interpretations of quantum theory, namely the position of Niels Bohr. The discussion will be resumed in the final chapter, when we shall be in a position to give a more adequate appraisal of the various alternate interpretations. For the present, however, we shall concern ourselves only with the development of Alternative 5.

It is evident that Alternative 5 is a theory of many advantages. It has the virtue of logical simplicity and it is complete in the sense that it is applicable to the entire universe. All processes are considered equally (there are no "measurement processes" which play any preferred role), and the principle of psycho-physical parallelism is fully maintained. Since the universal validity of the state function description is asserted, one can regard the state functions themselves as the fundamental entities, and one can even consider the state function of the whole universe. In this sense this theory can be called the theory of the "universal wave function", since all of physics is presumed to follow from this function alone. There remains, however, the question whether or not such a theory can be put into correspondence with our experience.

The present thesis is devoted to showing that this concept of a universal wave mechanics, together with the necessary correlation machinery for its interpretation, forms a logically self consistent description of a universe in which several observers are at work.

Alternative 5 is Hugh Everett’s Many-Worlds Interpretation of quantum mechanics. In this interpretation of quantum mechanics he completely eliminates Process 1 as a way for wavefunctions to change with time. Instead, he plans to bring in the acts of measurement and observation under Process 2, and simply let the wavefunctions evolve with time according to the wave equation. In this interpretation of quantum mechanics wavefunctions are the fundamental thing and provide all that can be known of the Universe. In fact, the whole Universe can be considered to be one single very complex wavefunction evolving with time. That is why he calls his theory the theory of the universal wavefunction.

We shall be able to introduce into the theory systems which represent observers. Such systems can be conceived as automatically functioning machines (servomechanisms) possessing recording devices (memory) and which are capable of responding to their environment. The behavior of these observers shall always be treated within the framework of wave mechanics. Furthermore, we shall deduce the probabilistic assertions of Process 1 as subjective appearances to such observers, thus placing the theory in correspondence with experience. We are then led to the novel situation in which the formal theory is objectively continuous and causal, while subjectively discontinuous and probabilistic. While this point of view thus shall ultimately justify our use of the statistical assertions of the orthodox view, it enables us to do so in a logically consistent manner, allowing for the existence of other observers. At the same time it gives a deeper insight into the meaning of quantized systems, and the role played by quantum mechanical correlations.

In order to bring about this correspondence with experience for the pure wave mechanical theory, we shall exploit the correlation between subsystems of a composite system which is described by a state function. A subsystem of such a composite system does not, in general, possess an independent state function. That is, in general a composite system cannot be represented by a single pair of subsystem states, but can be represented only by a superposition of such pairs of subsystem states…. there is no single state for Particle 1 alone or Particle 2 alone, but only the superposition of such cases.

In fact, to any arbitrary choice of state for one subsystem there will correspond a relative state for the other subsystem, which will generally be dependent upon the choice of state for the first subsystem, so that the state of one subsystem is not independent, but correlated to the state of the remaining subsystem. Such correlations between systems arise from interaction of the systems, and from our point of view all measurement and observation processes are to be regarded simply as interactions between observer and object-system which produce strong correlations.

Let one regard an observer as a subsystem of the composite system: observer + object-system. It is then an inescapable consequence that after the interaction has taken place there will not, generally, exist a single observer state. There will, however, be a superposition of the composite system states, each element of which contains a definite observer state and a definite relative object-system state. Furthermore, as we shall see, each of these relative object-system states will be, approximately, the eigenstates of the observation corresponding to the value obtained by the observer which is described by the same element of the superposition. Thus, each element of the resulting superposition describes an observer who perceived a definite and generally different result, and to whom it appears that the object-system state has been transformed into the corresponding eigenstate. In this sense the usual assertions of Process 1 appear to hold on a subjective level to each observer described by an element of the superposition. We shall also see that correlation plays an important role in preserving consistency when several observers are present and allowed to interact with one another (to "consult" one another) as well as with other object-systems.

In order to develop a language for interpreting our pure wave mechanics for composite systems we shall find it useful to develop quantitative definitions for such notions as the "sharpness" or "definiteness" of an operator A for a state ψ, and the "degree of correlation" between the subsystems of a composite system or between a pair of operators in the subsystems, so that we can use these concepts in an unambiguous manner. The mathematical development of these notions will be carried out in the next chapter (II) using some concepts borrowed from Information Theory. We shall develop there the general definitions of information and correlation, as well as some of their more important properties. Throughout Chapter II we shall use the language of probability theory to facilitate the exposition, and because it enables us to introduce in a unified manner a number of concepts that will be of later use. We shall nevertheless subsequently apply the mathematical definitions directly to state functions, by replacing probabilities by square amplitudes, without, however, making any reference to probability models.

Having set the stage, so to speak, with Chapter II, we turn to quantum mechanics in Chapter III. There we first investigate the quantum formalism of composite systems, particularly the concept of relative state functions, and the meaning of the representation of subsystems by noninterfering mixtures of states characterized by density matrices. The notions of information and correlation are then applied to quantum mechanics. The final section of this chapter discusses the measurement process, which is regarded simply as a correlation-inducing interaction between subsystems of a single isolated system. A simple example of such a measurement is given and discussed, and some general consequences of the superposition principle are considered.

This will be followed by an abstract treatment of the problem of Observation (Chapter IV). In this chapter we make use only of the superposition principle, and general rules by which composite system states are formed of subsystem states, in order that our results shall have the greatest generality and be applicable to any form of quantum theory for which these principles hold. (Elsewhere, when giving examples, we restrict ourselves to the non-relativistic Schrödinger Theory for simplicity.) The validity of Process 1 as a subjective phenomenon is deduced, as well as the consistency of allowing several observers to interact with one another.

Chapter V supplements the abstract treatment of Chapter IV by discussing a number of diverse topics from the point of view of the theory of pure wave mechanics, including the existence and meaning of macroscopic objects in the light of their atomic constitution, amplification processes in measurement, questions of reversibility and irreversibility, and approximate measurement.

The final chapter summarizes the situation, and continues the discussion of alternate interpretations of quantum mechanics.

With that Hugh Everett ends the introduction of his thesis. Basically, he is proposing that Process 1, in which an external observer A observes a quantum system like an electron and causes a discontinuous change to the electron’s wavefunction ψ, is an illusion. Instead, the wavefunction of observer A becomes “correlated” with the wavefunction of the electron into a composite wavefunction of both observer A and the electron. This composite wavefunction then evolves in time according to the wave equation. Thus, Process 1 really does not exist. Everything in the Universe just evolves in time according to Process 2. In the following chapters, Hugh Everett goes on to explain how this evolution of a correlated composite wavefunction can produce all of the strange quantum mechanical things we observe in the lab.

II. PROBABILITY, INFORMATION, AND CORRELATION

The present chapter is devoted to the mathematical development of the concepts of information and correlation. As mentioned in the introduction we shall use the language of probability theory throughout this chapter to facilitate the exposition, although we shall apply the mathematical definitions and formulas in later chapters without reference to probability models. We shall develop our definitions and theorems in full generality, for probability distributions over arbitrary sets, rather than merely for distributions over real numbers, with which we are mainly interested at present. We take this course because it is as easy as the restricted development, and because it gives a better insight into the subject.

The first three sections develop definitions and properties of information and correlation for probability distributions over finite sets only. In section four the definition of correlation is extended to distributions over arbitrary sets, and the general invariance of the correlation is proved. Section five then generalizes the definition of information to distributions over arbitrary sets. Finally, as illustrative examples, sections seven and eight give brief applications to stochastic processes and classical mechanics, respectively.


Now the really heavy math begins once we leave the Introduction and proceed into the main body of Hugh Everett’s Ph.D. thesis. He first goes into the mathematics of probability distributions that would be familiar to anybody who deals with statistics. He is mainly concerned with joint probabilities and conditional distributions because he is focusing upon what happens when observer A observes electron S. He reminds the reader about what independence means because it is important to his proposal. If observer A never interacts with electron S, the two are statistically independent and cannot affect each other, but if observer A does observe electron S, they become correlated, and that is a whole different story.

Independence means that the random variables take on values which are not influenced by the values of other variables with respect to which they are independent. That is, the conditional distribution of one of two independent variables, Y, conditioned upon the value xi for the other, is independent of xi, so that knowledge about one variable tells nothing of the other.

Next he applies Claude Shannon’s concept of Information (1948) to the analysis (see Some More Information About Information for details) for joint distributions that are correlated and not independent. Remember, Claude Shannon’s formulation of the concept of Information hinges upon the amount of “surprise” there is in a signal composed of 1s and 0s, while in softwarephysics we use Leon Brillouin’s concept of Information as a form of negative entropy (see The Demon of Software for details). Hugh Everett goes on to conclude that if two things are not dependent upon each other, like me observing electron S1 and you observing a different electron S2, then the total amount of Information obtained is equal to the sum of the Information I get about electron S1 and the Information you get about electron S2.

For independent random variables X, Y, ... ,Z, the following relationship is easily proved:

(2.4)    I_XY...Z = I_X + I_Y + ... + I_Z        (X, Y, ... , Z independent),

so that the information of XY... Z is the sum of the individual quantities of information, which is in accord with our intuitive feeling that if we are given information about unrelated events, our total knowledge is the sum of the separate amounts of information. ….
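
Equation (2.4) is easy to check numerically. Here is a little Python sketch (numpy assumed, with toy distributions invented for illustration) using Everett’s convention, in which the information of a distribution is the sum of p·ln(p), the negative of the Shannon entropy:

import numpy as np

def info(p):
    # Everett-style information of a distribution: sum of p*ln(p)
    # (the negative of the Shannon entropy, measured in nats).
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]                      # treat 0*ln(0) as 0
    return np.sum(p * np.log(p))

# Two independent random variables X and Y with made-up distributions.
pX = np.array([0.5, 0.3, 0.2])
pY = np.array([0.7, 0.3])

# For independent variables the joint distribution is just the outer product.
pXY = np.outer(pX, pY)

# Equation (2.4): I_XY = I_X + I_Y when X and Y are independent.
print(info(pXY), info(pX) + info(pY))   # the two numbers agree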

But what if there is some correlation between two measured quantities that depend upon each other, like the market value of your home and your annual salary? Those two numbers are not independent of each other, so knowing one tells you something about the other.

….which we are told, the natural thing to do to arrive at a single number to measure the strength of correlation is to consider the expected change in information about X, given that we are to be told the value of Y. This quantity we call the correlation information, or for brevity, the correlation, of X and Y, and denote it by {X, Y}….

….Thus the correlation is symmetric between X and Y, and hence also equal to the expected change of information about Y given that we will be told the value of X. Furthermore, according to (3.3) the correlation corresponds precisely to the amount of "missing information" if we possess only the marginal distributions, i.e., the loss of information if we choose to regard the variables as independent.

Here he is saying: suppose you have a box full of interacting molecules and you know their individual positions and velocities. That information he defines as the marginal information of the marginal distributions of the molecules. But if molecule X bounces off molecule Y, their positions and velocities will no longer be independent, because molecule X has interacted with molecule Y, so their marginal distributions become correlated. Hugh Everett calls the information tied up in that correlation the correlation information {X,Y}, and it is symmetric: {X,Y} = {Y,X}. He then goes on to use his definition of correlation information to derive the conservation of information in classical mechanics (see The Demon of Software for details). Remember that physicists get very nervous about the idea of destroying information, because then they cannot reverse the effective theories of physics in time.
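
Here is a small Python sketch of that idea (numpy assumed; the joint distribution is invented for illustration, with X and Y playing the roles of coarse “low/high” bins for home value and salary). The correlation information {X,Y} is just the joint information minus the two marginal informations:

import numpy as np

def info(p):
    # Everett-style information: sum of p*ln(p) over the distribution.
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return np.sum(p * np.log(p))

# A made-up joint distribution in which X and Y are *not* independent.
pXY = np.array([[0.40, 0.10],    # rows: X = low, high
                [0.10, 0.40]])   # columns: Y = low, high

pX = pXY.sum(axis=1)             # marginal distribution of X
pY = pXY.sum(axis=0)             # marginal distribution of Y

# Correlation information {X,Y}: the information lost if we pretend the
# variables are independent, i.e. I_XY - (I_X + I_Y); it is manifestly
# symmetric between X and Y.
correlation = info(pXY) - (info(pX) + info(pY))
print(correlation)               # a positive number, about 0.19 nats here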

we have proved that….and the total information is conserved.

Now it is known that the individual (marginal) position and momentum distributions tend to decay, except for rare fluctuations, into the uniform and Maxwellian distributions respectively, for which the classical entropy is a maximum. This entropy is, however, except for the factor of Boltzman's constant, simply the negative of the marginal information

(7.4)    I_marginal = I_x1 + I_y1 + I_z1 + ... + I_pxn + I_pyn + I_pzn

which thus tends towards a minimum. But this decay of marginal information is exactly compensated by an increase of the total correlation information

(7.5)    {total} = I_total - I_marginal

since the total information remains constant. Therefore, if one were to define the total entropy to be the negative of the total information, one could replace the usual second law of thermodynamics by a law of conservation of total entropy, where the increase in the standard (marginal) entropy is exactly compensated by a (negative) correlation entropy. The usual second law then results simply from our renunciation of all correlation knowledge (stosszahlansatz), and not from any intrinsic behavior of classical systems. The situation for classical mechanics is thus in sharp contrast to that of stochastic processes, which are intrinsically irreversible.

Thus Hugh Everett ends Chapter II of his thesis by deriving a concept of Information that is very similar to Leon Brillouin’s concept of Information as being a form of negative entropy. Remember, Hugh Everett defined marginal information as the information about the individual molecules, and {total} as the total amount of correlation information that is created by the molecules bouncing off each other and becoming correlated. So we can rewrite equation (7.5) as:

I_total = I_correlation + I_marginal

which says that the total amount of Information in classical mechanics does not change and that information is conserved. For example, let’s say you start off with a box that initially only has molecules on the left side of the box. So you begin with lots of marginal information about the individual molecules because you know they are all in the left side of the box. But as time progresses, the molecules will bounce off each other and begin to scatter into the right side of the box, until the molecules finally become smoothly spread throughout the entire box, as the second law of thermodynamics predicts. In the process, the marginal information of the molecules will decrease with time, but at the same time the correlation information of the molecules will increase as the molecules bounce off each other and become correlated, so the total amount of information remains constant.
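
This conservation of total information can be demonstrated with a crude Python sketch (numpy assumed). The box is shrunk to four cells, there are only two “molecules”, and the “collision” is a reversible rule invented purely for illustration, (x, y) → (x, (x + y) mod 4), nothing like real molecular dynamics; but because the rule is reversible it conserves the total information while shifting it from marginal information into correlation information:

import numpy as np

def info(p):
    # Everett-style information: sum of p*ln(p) over the distribution.
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return np.sum(p * np.log(p))

N = 4                                    # four cells in the box
# Two molecules, each starting independently in the left half (cells 0 and 1).
pX = np.array([0.5, 0.5, 0.0, 0.0])
pY = np.array([0.5, 0.5, 0.0, 0.0])
joint = np.outer(pX, pY)                 # independent to start: correlation = 0

def collide(joint):
    # A toy, reversible "collision": (x, y) -> (x, (x + y) mod N).
    # Being a permutation of the joint states, it conserves total information.
    new = np.zeros_like(joint)
    for x in range(N):
        for y in range(N):
            new[x, (x + y) % N] += joint[x, y]
    return new

for label, j in [("before", joint), ("after ", collide(joint))]:
    marginal = info(j.sum(axis=1)) + info(j.sum(axis=0))
    total = info(j)
    print(label, "I_total =", round(total, 3),
          "  I_marginal =", round(marginal, 3),
          "  correlation =", round(total - marginal, 3))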

III. QUANTUM MECHANICS

Having mathematically formulated the ideas of information and correlation for probability distributions, we turn to the field of quantum mechanics. In this chapter we assume that the states of physical systems are represented by points in a Hilbert space, and that the time dependence of the state of an isolated system is governed by a linear wave equation.

It is well known that state functions lead to distributions over eigenvalues of Hermitian operators (square amplitudes of the expansion coefficients of the state in terms of the basis consisting of eigenfunctions of the operator) which have the mathematical properties of probability distributions (non-negative and normalized). The standard interpretation of quantum mechanics regards these distributions as actually giving the probabilities that the various eigenvalues of the operator will be observed, when a measurement represented by the operator is performed.

A feature of great importance to our interpretation is the fact that a state function of a composite system leads to joint distributions over subsystem quantities, rather than independent subsystem distributions, i.e., the quantities in different subsystems may be correlated with one another. The first section of this chapter is accordingly devoted to the development of the formalism of composite systems, and the connection of composite system states and their derived joint distributions with the various possible subsystem conditional and marginal distributions. We shall see that there exist relative state functions which correctly give the conditional distributions for all subsystem operators, while marginal distributions can not generally be represented by state functions, but only by density matrices.

In Section 2 the concepts of information and correlation, developed in the preceding chapter, are applied to quantum mechanics, by defining information and correlation for operators on systems with prescribed states. It is also shown that for composite systems there exists a quantity which can be thought of as the fundamental correlation between subsystems, and a closely related canonical representation of the composite system state. In addition, a stronger form of the uncertainty principle, phrased in information language, is indicated.

The third section takes up the question of measurement in quantum mechanics, viewed as a correlation producing interaction between physical systems. A simple example of such a measurement is given and discussed. Finally some general consequences of the superposition principle are considered.

Hugh Everett then goes on to discuss composite systems consisting of several parts. For example, suppose we have a system S composed of two electrons S1 and S2.

….It is well known that if the states of a pair of systems S1 and S2 are represented by points in Hilbert spaces H1 and H2 respectively, then the states of the composite system S = S1 + S2 (the two systems S1 and S2 regarded as a single system S) are represented correctly by points of the direct product of H1 and H2….

After a great deal of math, Hugh Everett concludes:

….Therefore there exists in general no state for S1 which correctly gives the marginal expectations for all operators in S1….

However, even though there is generally no single state describing marginal expectations, we see that there is always a mixture of states, .... which does yield the correct expectations.

which means that when two systems S1 and S2 interact with each other, there is no wavefunction for just S1 or S2 alone, but there is a wavefunction for the composite system of S1 and S2 together. That means that when you, as system S1, observe an electron S2, there are no longer separate wavefunctions for you and the electron. Instead, you and the electron become entangled into a single composite wavefunction for both you and the electron. He goes on to summarize this as:

In summary, we have seen in this section that a state of a composite system leads to joint distributions over subsystem quantities which are generally not independent. Conditional distributions and expectations for subsystems are obtained from relative states, and subsystem marginal distributions and expectations are given by density matrices.

There does not, in general, exist anything like a single state for one subsystem of a composite system. That is, subsystems do not possess states independent of the states of the remainder of the system, so that the subsystem states are generally correlated. One can arbitrarily choose a state for one subsystem, and be led to the relative state for the other subsystem. Thus we are faced with a fundamental relativity of states, which is implied by the formalism of composite systems. It is meaningless to ask the absolute state of a subsystem - one can only ask the state relative to a given state of the remainder of the system….
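
This “fundamental relativity of states” can be seen in a few lines of Python (numpy assumed). Take two spin-1/2 particles in the entangled singlet state, a standard textbook example rather than anything from Everett’s thesis itself: the composite system has a perfectly good state function, but when we trace out one particle the best we can say about the other is a 50/50 mixture described by a density matrix, just as the quoted passage asserts:

import numpy as np

# An entangled (singlet) state of two spin-1/2 particles:
# (|up,down> - |down,up>) / sqrt(2).  Basis order: uu, ud, du, dd.
psi = np.array([0.0, 1.0, -1.0, 0.0], dtype=complex) / np.sqrt(2)

# Density matrix of the composite system: a pure state, rho = |psi><psi|.
rho = np.outer(psi, psi.conj())

# Reduced density matrix of subsystem S1 alone (trace out subsystem S2).
rho1 = np.trace(rho.reshape(2, 2, 2, 2), axis1=1, axis2=3)
print(rho1.real)                       # 0.5 on the diagonal, 0 elsewhere
print(np.trace(rho1 @ rho1).real)      # purity 0.5 < 1: a mixture, so no
                                       # single state function exists for S1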

Next he discusses the marginal information of individual particles and the correlation information that arises when the particles interact with each other, now expressed in terms of operators acting upon their wavefunctions ψ. Remember, in quantum mechanics the wavefunction ψ of a particle is the whole deal and contains all of the information there is about the particle, like its position and velocity. In quantum mechanics that information is extracted by applying mathematical operators to the wavefunction ψ. For example, if you want to know how much energy a particle has, there is a mathematical operator that you can apply to its wavefunction ψ that will give you an actual number. It’s as if you want to know how much money somebody has on them: you can apply an operation to them that frisks them down, checks all of their pockets for wallets, billfolds, loose bills and change, and then adds it all up.
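
As a toy illustration of “applying an operator to get a number”, here is a Python sketch (numpy assumed) with a made-up two-level system whose two energy eigenvalues are simply invented; the expected energy is ⟨ψ|H|ψ⟩:

import numpy as np

# A made-up two-level system: H plays the role of the "how much energy?"
# operator, with invented energy eigenvalues of 1.0 and 3.0.
H = np.array([[1.0, 0.0],
              [0.0, 3.0]])

# A state that is a superposition of the two energy eigenstates,
# with probabilities 0.75 and 0.25.
psi = np.array([np.sqrt(0.75), np.sqrt(0.25)], dtype=complex)

# "Applying the operator" yields the expected energy <psi|H|psi>.
expected_energy = np.real(psi.conj() @ H @ psi)
print(expected_energy)    # 0.75*1.0 + 0.25*3.0 = 1.5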

We wish to be able to discuss information and correlation for Hermitian operators A, B, ... , with respect to a state function ψ. These quantities are to be computed, through the formulas of the preceding chapter, from the square amplitudes of the coefficients of the expansion of ψ in terms of the eigenstates of the operators.


Finally Hugh Everett notes that quantum mechanics is very much like classical mechanics with regard to the relationship between entropy and the Information that is obtained by applying the above mathematical operators. Remember, in classical mechanics we saw that the correlation information is the information that arises from particles interacting by bouncing off each other. The chief difference in quantum mechanics is that instead of particles bouncing off each other, we have mathematical operators operating on their wavefunctions:

….It is also interesting to note that the quantity - Trace(ρ ln ρ ) is (apart from a factor of Boltzman's constant) just the entropy of a mixture of states characterized by the density matrix ρ. Therefore the entropy of the mixture characteristic of a subsystem S1 for the state ψ^S = ψ^(S1 + S2) is exactly matched by a correlation information {S1 ,S2}, which represents the correlation between any pair of operators A, B, which define the canonical representation. The situation is thus quite similar to that of classical mechanics.
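
Continuing the little singlet-state sketch from above (numpy assumed), we can compute the quantity -Trace(ρ ln ρ) that Hugh Everett mentions, applied to the reduced density matrix of one particle of the entangled pair; in his accounting this subsystem entropy is exactly matched by the correlation information {S1, S2} tied up in the entanglement:

import numpy as np

def von_neumann_entropy(rho):
    # -Trace(rho ln rho), computed from the eigenvalues of rho.
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return -np.sum(evals * np.log(evals))

# Reduced density matrix of one particle of the entangled singlet pair.
rho1 = np.array([[0.5, 0.0],
                 [0.0, 0.5]])
print(von_neumann_entropy(rho1))   # ln(2), about 0.693 nats of entropy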


Next Hugh Everett takes up the thorny issue of measurement in quantum mechanics. Recall that in the Copenhagen Interpretation measuring the wavefunction ψ of an electron causes it to mysteriously collapse to a single point, and that is where you will find the electron.

We now consider the question of measurement in quantum mechanics, which we desire to treat as a natural process within the theory of pure wave mechanics. From our point of view there is no fundamental distinction between "measuring apparata" and other physical systems. For us, therefore, a measurement is simply a special case of interaction between physical systems - an interaction which has the property of correlating a quantity in one subsystem with a quantity in another….

….Nearly every interaction between systems produces some correlation however. Suppose that at some instant a pair of systems are independent, so that the composite system state function is a product of subsystem states ψ^S = ψ^S1 ψ^S2 . Then this condition obviously holds only instantaneously if the systems are interacting - the independence is immediately destroyed and the systems become correlated. We could, then, take the position that the two interacting systems are continually "measuring" one another, if we wished….

….Suppose that we have a system of only one coordinate, q, (such as position of a particle), and an apparatus of one coordinate r (for example the position of a meter needle)….

….This principle has the far reaching implication that for any possible measurement, for which the initial system state is not an eigenstate, the resulting state of the composite system leads to no definite system state nor any definite apparatus state. The system will not be put into one or another of its eigenstates with the apparatus indicating the corresponding value, and nothing resembling Process 1 can take place….

….Thus in general after a measurement has been performed there will be no definite system state nor any definite apparatus state, even though there is a correlation. It seems as though nothing can ever be settled by such a measurement. Furthermore this result is independent of the size of the apparatus, and remains true for apparatus of quite macroscopic dimensions….

Suppose, for example, that we coupled a spin measuring device to a cannonball, so that if the spin is up the cannonball will be shifted one foot to the left, while if the spin is down it will be shifted an equal distance to the right. If we now perform a measurement with this arrangement upon a particle whose spin is a superposition of up and down, then the resulting total state will also be a superposition of two states, one in which the cannonball is to the left, and one in which it is to the right. There is no definite position for our macroscopic cannonball!

This behavior seems to be quite at variance with our observations, since macroscopic objects always appear to us to have definite positions. Can we reconcile this prediction of the purely wave mechanical theory with experience, or must we abandon it as untenable? In order to answer this question we must consider the problem of observation itself within the framework of the theory.

To understand the above section we need a little background in experimental physics. Electrons have a quantum mechanical property called spin. You can think of an electron’s spin as if the electron had a little built-in magnet. In fact, it is the spins of these little electron magnets that add up to make the real magnets that you put on your refrigerator. When you throw an electron through a distorted magnetic field that is pointing up, the electron will pop out in one of two states. It will either be aligned with the magnetic field (called spin up) or it will be pointing in the opposite direction, 180° from the magnetic field (called spin down). Both the spin up and spin down conditions are called eigenstates. Prior to the observation of the electron’s spin, the electron is in a superposition of states and is not in an eigenstate. Now if the electron in the eigenstate of spin up is sent through the same magnetic field again, it will be found to pop out in the eigenstate of spin up again. Similarly, a spin down electron that is sent through the magnetic field again will also pop out as a spin down electron. Now here is the strange part. If you rotate the magnetic field by 90° and send spin up electrons through it, 50% of the electrons will pop out with a spin pointing to the left, and 50% will pop out with a spin pointing to the right. And you cannot predict in advance which way a particular spin up electron will pop out. It might spin to the left, or it might spin to the right. The same goes for the spin down electrons – 50% will pop out spinning to the left and 50% will pop out spinning to the right.

Figure 1 - In the Stern-Gerlach experiment we shoot electrons through a distorted magnetic field. Classically, we would expect the electrons to be spinning in random directions and the magnetic field should deflect them in random directions, creating a smeared out spot on the screen. Instead, we see that the act of measuring the spins of the electrons puts them into eigenstates with eigenvalues of spin up or spin down and the electrons are either deflected up or down. If we rotate the magnets by 90°, we find that the electrons are deflected to the right or to the left.
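
The 50/50 behavior described above falls straight out of the quantum formalism. Here is a little Python sketch (numpy assumed) that prepares a spin up electron and computes the Born-rule probabilities for a spin measurement along an axis rotated by an angle θ from the z-axis; the 0° and 90° cases reproduce the “same answer again” and “50/50” results just described:

import numpy as np

# Spin measured along a direction at angle theta from the z-axis (in the
# x-z plane), built from the Pauli matrices.
def spin_operator(theta):
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    return np.cos(theta) * sz + np.sin(theta) * sx

spin_up_z = np.array([1.0, 0.0], dtype=complex)   # electron prepared spin up

for theta_deg in (0, 90):
    evals, evecs = np.linalg.eigh(spin_operator(np.radians(theta_deg)))
    # Born rule: the probability of each outcome is |<eigenvector|state>|^2.
    # numpy orders the outcomes by eigenvalue: the "minus" outcome first.
    probs = np.abs(evecs.conj().T @ spin_up_z) ** 2
    print(theta_deg, "degrees:", np.round(probs, 3))
# 0 degrees:  [0. 1.]    - a spin up electron is always found spin up again
# 90 degrees: [0.5 0.5]  - left/right outcomes are 50/50 and unpredictable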

In the above section, Hugh Everett is proposing that when a device, like our magnets above, measures the spin of an electron that is in an unknown state, and not in a spin up or spin down eigenstate, the device does not put the electron into a spin up or spin down eigenstate as the Copenhagen Interpretation maintains. Instead, the device and the electron enter into a correlated composite system state, or combined wavefunction, in which the spin of the electron remains indeterminate.

In the next chapter Hugh Everett explains how this new worldview can be used to explain what we observe in the lab. In fact, he will propose that from the perspective of the measuring magnets and the electron, two independent observational histories will emerge, one with the measuring magnets finding a spin up electron and one with the measuring magnets finding a spin down electron, and each of these histories will be just as “real” as the other. For them, the Universe has essentially split in two, with each set in its own Universe. That is where the “Many-Worlds” in the Many-Worlds Interpretation of quantum mechanics comes from.
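
The whole idea can be caricatured in a few lines of Python (numpy assumed; the “device” is reduced to a single two-state pointer and the measurement interaction is a CNOT-style unitary, both gross simplifications invented for illustration). Nothing ever collapses; the interaction simply correlates the pointer with the electron, leaving a superposition of two complete records:

import numpy as np

# Electron spin basis: |up>, |down>.  Pointer basis: |reads_up>, |reads_down>.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# The electron starts in a superposition; the pointer starts in |reads_up>,
# standing in for a blank "ready" state (an admitted simplification).
electron = (up + down) / np.sqrt(2)
pointer = up

# The measurement interaction: a unitary that flips the pointer when the
# electron is spin down - a CNOT-style, correlation-producing interaction.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

before = np.kron(electron, pointer)   # a product state: not yet correlated
after = CNOT @ before                 # Process 2 only: no collapse anywhere
print(np.round(after.real, 3))
# Amplitudes 0.707 on |up>|reads_up> and on |down>|reads_down>, zero elsewhere:
# two "branches", each with a definite record, while neither the electron nor
# the pointer any longer has a state function of its own.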

IV. OBSERVATION

We shall now give an abstract treatment of the problem of observation. In keeping with the spirit of our investigation of the consequences of pure wave mechanics we have no alternative but to introduce observers, considered as purely physical systems, into the theory.

We saw in the last chapter that in general a measurement (coupling of system and apparatus) had the outcome that neither the system nor the apparatus had any definite state after the interaction - a result seemingly at variance with our experience. However, we do not do justice to the theory of pure wave mechanics until we have investigated what the theory itself says about the appearance of phenomena to observers, rather than hastily concluding that the theory must be incorrect because the actual states of systems as given by the theory seem to contradict our observations.

Recall that in Chapter III Hugh Everett demonstrated that when an observer O observes an electron, the wavefunction ψ of observer O, and of whatever apparatus is used to observe the electron, becomes entangled or “correlated” with the wavefunction ψ of the electron into a total state function ψ of the observer and the electron together, and that neither the observer O nor the electron has a separate wavefunction after the observation is made.

We shall see that the introduction of observers can be accomplished in a reasonable manner, and that the theory then predicts that the appearance of phenomena, as the subjective experience of these observers, is precisely in accordance with the predictions of the usual probabilistic interpretation of quantum mechanics.

We are faced with the task of making deductions about the appearance of phenomena on a subjective level, to observers which are considered as purely physical systems and are treated within the theory. In order to accomplish this it is necessary to identify some objective properties of such an observer (states) with subjective knowledge (i.e., perceptions). Thus, in order to say that an observer O has observed the event a, it is necessary that the state of O has become changed from its former state to a new state which is dependent upon a.

It will suffice for our purposes to consider our observers to possess memories (i.e., parts of a relatively permanent nature whose states are in correspondence with the past experience of the observer). In order to make deductions about the subjective experience of an observer it is sufficient to examine the contents of the memory.

As models for observers we can, if we wish, consider automatically functioning machines, possessing sensory apparata and coupled to recording devices capable of registering past sensory data and machine configurations. We can further suppose that the machine is so constructed that its present actions shall be determined not only by its present sensory data, but by the contents of its memory as well. Such a machine will then be capable of performing a sequence of observations (measurements), and furthermore of deciding upon its future experiments on the basis of past results. We note that if we consider that current sensory data, as well as machine configuration, is immediately recorded in the memory, then the actions of the machine at a given instant can be regarded as a function of the memory contents only, and all relevant experience of the machine is contained in the memory.

Now remember this is 1956! There really weren’t many computers running around in 1956. I know because I was there. Still, Hugh Everett is now proposing to take human observers out of the equation and replace them with computers using Artificial Intelligence instead. This is a wise move because human observers use consciousness to record observations, and we still do not understand what consciousness is. By taking human observers out of the analysis he avoids that complication. No wonder that the Many-Worlds Interpretation seems to naturally lend itself to quantum computers. Computers were part of the analysis from the very beginning.

For such machines we are justified in using such phrases as "the machine has perceived A" or "the machine is aware of A" if the occurrence of A is represented in the memory, since the future behavior of the machine will be based upon the occurrence of A. In fact, all of the customary language of subjective experience is quite applicable to such machines, and forms the most natural and useful mode of expression when dealing with their behavior, as is well known to individuals who work with complex automata.

When dealing quantum mechanically with a system representing an observer we shall ascribe a state function, ψ^O, to it. When the State ψ^O describes an observer whose memory contains representations of the events A,B, ... ,C we shall denote this fact by appending the memory sequence in brackets as a subscript, writing:

ψ^O[A, B, ... , C]

The symbols A,B, ... ,C, which we shall assume to be ordered time wise, shall therefore stand for memory configurations which are in correspondence with the past experience of the observer. These configurations can be thought of as punches in a paper tape, impressions on a magnetic reel, configurations of a relay switching circuit, or even configurations of brain cells. We only require that they be capable of the interpretation "The observer has experienced the succession of events A,B, ... ,C." (We shall sometimes write dots in a memory sequence, [. .. A,B, ... ,C], to indicate the possible presence of previous memories which are irrelevant to the case being considered.)

Our problem is, then, to treat the interaction of such observer-systems with other physical systems (observations), within the framework of wave mechanics, and to deduce the resulting memory configurations, which we can then interpret as the subjective experiences of the observers.

The machine with Artificial Intelligence is going to make a series of observations A, B, C…. and record them in its memory. Hugh Everett concludes with this summary:

In the language of subjective experience, the observer which is described by a typical element, ψ'i,j...k, of the superposition has perceived an apparently random sequence of definite results for the observations. It is furthermore true, since in each element the system has been left in an eigenstate of the measurement, that if at this stage a redetermination of an earlier system observation Sl takes place, every element of the resulting final superposition will describe the observer with a memory configuration of the form [... ,ai1 ,... , ajl , ... , akr,ajl ] in which the earlier memory coincides with the later – i. e., the memory states are correlated. It will thus appear to the observer which is described by a typical element of the superposition that each initial observation on a system caused the system to "jump" into an eigenstate in a random fashion and thereafter remain there for subsequent measurements on the same system. Therefore, qualitatively, at least, the probabilistic assertions of Process 1 appear to be valid to the observer described by a typical element of the final superposition.

So when you throw an electron through a nonuniform magnetic field, the machine with Artificial Intelligence is going to record that the electron randomly “jumps” into a spin up eigenstate or a spin down eigenstate and then continues to remain a spin up or spin down electron. Hugh Everett then proceeds to summarize all of this and explain how such an observer O that becomes entangled or “correlated” with an electron will leave behind in its memory a sequence of events that is exactly what we observe in the lab. The electron will seem to behave in a random manner until it is observed and put into a particular eigenstate, and then the electron will remain in that eigenstate until it is perturbed again. Since the latest observation supplies all of the possible information about the relative system state of the observer O and the electron, and previous observations are not correlated with it, the Heisenberg Uncertainty Principle is not violated either. If the observer O measures the electron’s velocity, a later measurement of its position will blur its velocity.

We can therefore summarize the situation for an arbitrary sequence of observations, upon the same or different systems in any order, and for which the number of observations of each quantity in each system is very large, with the following result:

Except for a set of memory sequences of measure nearly zero, the averages of any functions over a memory sequence can be calculated approximately by the use of the independent probabilities given by Process 1 for each initial observation, on a system, and by the use of the transition probabilities (2.23) for succeeding observations upon the same system. In the limit, as the number of all types of observations goes to infinity the calculation is exact, and the exceptional set has measure zero.

This prescription for the calculation of averages over memory sequences by probabilities assigned to individual elements is precisely that of the orthodox theory (Process 1). Therefore all predictions of the usual theory will appear to be valid to the observer in almost all observer states, since these predictions hold for almost all memory sequences.

In particular, the uncertainty principle is never violated, since, as above, the latest measurement upon a system supplies all possible information about the relative system state, so that there is no direct correlation between any earlier results of observation on the system, and the succeeding observation. Any observation of a quantity B, between two successive observations of quantity A (all on the same system) will destroy the one-one correspondence between the earlier and later memory states for the result of A. Thus for alternating observations of different quantities there are fundamental limitations upon the correlations between memory states for the same observed quantity, these limitations expressing the content of the uncertainty principle.

In conclusion, we have described in this section processes involving an idealized observer, processes which are entirely deterministic and continuous from the over-all viewpoint (the total state function is presumed to satisfy a wave equation at all times) but whose result is a superposition, each element of which describes the observer with a different memory state. We have seen that in almost all of these observer states it appears to the observer that the probabilistic aspects of the usual form of quantum theory are valid. We have thus seen how pure wave mechanics, without any initial probability assertions, can lead to these notions on a subjective level, as appearances to observers.

So if an observer throws lots of electrons through a nonuniform magnetic field, the observer will perceive the electrons popping out randomly in spin up and spin down eigenstates, but what is really happening is that the observer and the electrons are splitting off into their own universes each time an electron goes through the magnetic field. One observer-electron pair splits off into a spin up universe, while another observer-electron pair splits off into a spin down universe.

3. Several Observers

We shall now consider the consequences of our scheme when several observers are allowed to interact with the same systems, as well as with one another (communication). In the following discussion observers shall be denoted by O1, O2, ..., other systems by S1, S2, ..., and observables by operators A, B, C, ....

We shall also wish to allow communication among the observers, which we view as an interaction by means of which the memory sequences of different observers become correlated. (For example, the transfer of impulses from the magnetic tape memory of one mechanical observer to that of another constitutes such a transfer of information.)

Case 1: We allow two observers to separately observe the same quantity in a system, and then compare results.

After a bit of math, he concludes:

This means that observers who have separately observed the same quantity will always agree with each other.

For example, suppose a spin up electron pops out of our measuring magnets and observer O1 measures it with a set of magnets and finds it to be a spin up electron. If observer O2 observes the very same spin up electron, he will also measure it to be a spin up electron, and when observers O1 and O2 compare results they will agree. This agrees with what the Copenhagen Interpretation predicts.

Case 2: We allow two observers to measure separately two different, noncommuting quantities in the same system.

For Case 2 we could have one observer measure the spin of an electron along one axis, while the other observer measures its spin along a perpendicular axis. Spin measurements along two different axes are noncommuting quantities. Noncommuting quantities are quantities that fall under the Heisenberg Uncertainty Principle, like the position and velocity of the electron, where the measurement of one affects the measurement of the other. For this case Hugh Everett mathematically demonstrates that again the same results are obtained as predicted by the Copenhagen Interpretation.
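
For the curious, “noncommuting” is easy to check numerically. Here is a tiny Python sketch (numpy assumed) using the Pauli matrices for spin along the z-axis and the x-axis; two observables can have simultaneously definite values only if AB - BA = 0, and spin along two different axes fails that test:

import numpy as np

# Pauli matrices: spin measured along the z-axis and along the x-axis.
Sz = np.array([[1, 0], [0, -1]], dtype=complex)
Sx = np.array([[0, 1], [1, 0]], dtype=complex)

print(Sz @ Sx - Sx @ Sz)   # a nonzero matrix: the two observables do not commute
print(Sz @ Sz - Sz @ Sz)   # the trivial case: any observable commutes with itself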

Case 3: We suppose that two systems S1 and S2 are correlated but no longer interacting, and that O1 measures property A in S1 and O2 property B in S2.

It is therefore seen that one observer's observation upon one system of a correlated, but non-interacting pair of systems, has no effect on the remote system, in the sense that the outcome or expected outcome of any experiments by another observer on the remote system are not affected. Paradoxes like that of Einstein-Rosen-Podolsky which are concerned with such correlated, non-interacting, systems are thus easily understood in the present scheme.

Case 3 is the basis for the infamous EPR (Einstein-Podolsky-Rosen) paradox that has caused so much grief for the Copenhagen Interpretation. The EPR paradox goes like this. Suppose we prepare many pairs of quantum mechanically “entangled” electrons that conserve angular momentum. Each pair consists of one spin up electron and one spin down electron, but we do not know which is which at the outset. Now let the pairs of electrons fly apart and let two observers measure their spins. If observer A measures an electron, there will be a 50% probability that he will find a spin up electron and a 50% chance that he will find a spin down electron, and the same goes for observer B: 50% of observer B’s electrons will be found with a spin up, while 50% will be found with a spin down. Now the paradox of the EPR paradox, from the perspective of the Copenhagen Interpretation, is that when observer A and observer B come together to compare notes, they find that each time observer A found a spin up electron, observer B found a spin down electron, even though the electrons did not know which way they were spinning before the measurements were performed. Somehow when observer A measured the spin of an electron, it instantaneously changed the spin of the electron that observer B measured. Einstein hated this “spooky action at a distance” feature of the Copenhagen Interpretation that made physics nonlocal, meaning that things separated by great distances could still instantaneously change each other. He thought that it violated the speed-of-light limit of his Special Theory of Relativity, which does not allow information to travel faster than the speed of light. Einstein thought that the EPR paradox was the final nail in the coffin of quantum mechanics. There had to be some “hidden variables” that allowed electrons to know whether they “really” were spin up or spin down electrons. Hugh Everett solves this problem by letting the electrons be in all possible spin states in a large number of parallel universes. When observers measure the spin of an electron, they really do not measure the spin of the electron. They really measure which universe they happen to be located in, and since everything in the Many-Worlds Interpretation relies on “correlated” composite wavefunctions, it should come as no surprise that when observer A and observer B come together, they find that their measurements of the electron spins are correlated.
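In plain notation, the entangled pair that conserves angular momentum is the singlet state:

ψ(pair) = (1/√2)( |spin up>A |spin down>B − |spin down>A |spin up>B )

In Hugh Everett's reading, when observer A and observer B make their measurements they simply join this superposition; every term of the resulting composite wavefunction contains one spin up record and one spin down record, so the anticorrelation is built in from the start and nothing ever has to travel between A and B.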

V. SUPPLEMENTARY TOPICS

We have now completed the abstract treatment of measurement and observation, with the deduction that the statistical predictions of the usual form of quantum theory (Process 1) will appear to be valid to all observers. We have therefore succeeded in placing our theory in correspondence with experience, at least insofar as the ordinary theory correctly represents experience.

We should like to emphasize that this deduction was carried out by using only the principle of superposition, and the postulate that an observation has the property that if the observed variable has a definite value in the object-system then it will remain definite and the observer will perceive this value. This treatment is therefore valid for any possible quantum interpretation of observation processes, i.e., any way in which one can interpret wave functions as describing observers, as well as for any form of quantum mechanics for which the superposition principle for states is maintained. Our abstract discussion of observation is therefore logically complete, in the sense that our results for the subjective experience of observers are correct, if there are any observers at all describable by wave mechanics.

In this chapter we shall consider a number of diverse topics from the point of view of our pure wave mechanics, in order to supplement the abstract discussion and give a feeling for the new viewpoint. Since we are now mainly interested in elucidating the reasonableness of the theory, we shall often restrict ourselves to plausibility arguments, rather than detailed proofs.

1. Macroscopic objects and classical mechanics

In the light of our knowledge about the atomic constitution of matter, any "object" of macroscopic size is composed of an enormous number of constituent particles. The wave function for such an object is then in a space of fantastically high dimension (3N, if N is the number of particles). Our present problem is to understand the existence of macroscopic objects, and to relate their ordinary (classical) behavior in the three dimensional world to the underlying wave mechanics in the higher dimensional space.

Let us begin by considering a relatively simple case. Suppose that we place in a box an electron and a proton, each in a definite momentum state, so that the position amplitude density of each is uniform over the whole box. After a time we would expect a hydrogen atom in the ground state to form, with ensuing radiation. We notice, however, that the position amplitude density of each particle is still uniform over the whole box. Nevertheless the amplitude distributions are now no longer independent, but correlated. In particular, the conditional amplitude density for the electron, conditioned by any definite proton (or centroid) position, is not uniform, but is given by the familiar ground state wave function for the hydrogen atom. What we mean by the statement, "a hydrogen atom has formed in the box," is just that this correlation has taken place - a correlation which insures that the relative configuration for the electron, for a definite proton position, conforms to the customary ground state configuration.

The wave function for the hydrogen atom can be represented as a product of a centroid wave function and a wave function over relative coordinates, where the centroid wave function obeys the wave equation for a particle with mass equal to the total mass of the proton-electron system. Therefore, if we now open our box, the centroid wave function will spread with time in the usual manner of wave packets, to eventually occupy a vast region of space. The relative configuration (described by the relative coordinate state function) has, however, a permanent nature, since it represents a bound state, and it is this relative configuration which we usually think of as the object called the hydrogen atom. Therefore, no matter how indefinite the positions of the individual particles become in the total state function (due to the spreading of the centroid), this state can be regarded as giving (through the centroid wave function) an amplitude distribution over a comparatively definite object, the tightly bound electron-proton system. The general state, then, does not describe any single such definite object, but a superposition of such cases with the object located at different positions.

In the above section Hugh Everett proposes putting an electron and a proton in a box, with each particle given a known initial momentum. Then according to the Heisenberg Uncertainty Principle we cannot know anything about their positions, so they must be uniformly smeared out over the inside of the whole box, and they should stay that way forever. However, eventually the electron and proton will interact and form a hydrogen atom, giving off a photon in the process. The two particles will then be described by a composite correlated wavefunction that corresponds to the ground state of a hydrogen atom. This composite correlated wavefunction can be viewed as the product of a centroid wavefunction for a particle with the mass of a hydrogen atom and a relative wavefunction defined over the coordinates of the electron relative to the proton. If we then open the box and release the hydrogen atom, the centroid wavefunction will spread out all over the place as the hydrogen atom diffuses away from the box, but there will still be a relative component of the total composite wavefunction that represents the location of the electron relative to the proton.
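The separation Hugh Everett uses here can be written out explicitly. With xe and xp the positions of the electron and the proton, define the center of mass X = (me·xe + mp·xp)/(me + mp) and the relative coordinate r = xe − xp. The bound state then factors as:

ψ(xe, xp) = φ(X)·u(r)

where u(r) is the familiar hydrogen ground state wavefunction over the relative coordinate and φ(X) is the centroid wave packet, the part that spreads out once the box is opened.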

In a similar fashion larger and more complex objects can be built up through strong correlations which bind together the constituent particles. It is still true that the general state function for such a system may lead to marginal position densities for any single particle (or centroid) which extend over large regions of space. Nevertheless we can speak of the existence of a relatively definite object, since the specification of a single position for a particle, or the centroid, leads to the case where the relative position densities of the remaining particles are distributed closely about the specified one, in a manner forming the comparatively definite object spoken of.

Suppose, for example, we begin with a cannonball located at the origin, described by a state function:

ψ[cj(0,0,0)]

where the subscript indicates that the total state function ψ describes a system of particles bound together so as to form an object of the size and shape of a cannonball, whose centroid is located (approximately) at the origin, say in the form of a real gaussian wave packet of small dimensions, with variance σ₀² for each dimension.

If we now allow a long lapse of time, the centroid of the system will spread in the usual manner to occupy a large region of space....

It is not true that each individual particle spreads independently of the rest, in which case we would have a final state which is a grand superposition of states in which the particles are located independently everywhere. The fact that they are in bound states restricts our final state to a superposition of "cannonball" states. The wave function for the centroid can therefore be taken as a representative wave function for the whole object.

Similarly, in the above section Hugh Everett mathematically demonstrates that if we have a large number of particles that constitute a cannonball, with a composite wavefunction ψ[cj(0,0,0)] centered on the coordinates (0,0,0), then this composite wavefunction will indeed spread out with time, just like the wavefunction for a single unbound electron spreads out with time, but the individual particles will not spread out all over the place and cause the cannonball to essentially evaporate. Thus large objects composed of bound particles will continue to behave as large objects composed of bound particles as time progresses. Next, he describes what an observer would record when observing the cannonball move through space. The observer would become correlated into a superposition of his wavefunction with that of the cannonball's centroid wavefunction, and the cannonball will then appear to behave in a manner conforming to classical mechanics:

Let us now consider the result of an observation (considered along the lines of Chapter IV) performed upon a system of macroscopic bodies in a general state. The observer will not become aware of the fact that the state does not correspond to definite positions and momenta (i.e., he will not see the objects as "smeared out" over large regions of space) but will himself simply become correlated with the system - after the observation the composite system of objects + observer will be in a superposition of states, each element of which describes an observer who has perceived that the objects have nearly definite positions and momenta, and for whom the relative system state is a quasi-classical state in the previous sense, and furthermore to whom the system will appear to behave according to classical mechanics if his observation is continued. We see, therefore, how the classical appearance of the macroscopic world to us can be explained in the wave theory.
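The centroid spreading mentioned above is the standard result for a free Gaussian wave packet: a packet of mass m that starts with variance σ₀² in each dimension grows with time as

σ²(t) = σ₀² + ( ħt/(2mσ₀) )²

For an electron this growth is rapid, but for something with the mass of a cannonball, say 10 kilograms with a σ₀ of one millimeter (illustrative numbers, not Everett's), the second term remains utterly negligible even over the age of the Universe, which is why the cannonball's centroid stays, for all practical purposes, as sharply localized as classical mechanics assumes.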

Since Hugh Everett has eliminated Process 1 from his theory, he next addresses what happens when an observation is made. For example, suppose observer A uses magnets to measure the spin of an electron and finds that the electron is a spin up electron. In the Copenhagen Interpretation the act of observing the electron collapses its wavefunction into a spin up eigenstate, and this collapse is an irreversible process that cannot be undone. Hugh Everett goes through some more mathematics, using what he has already discussed above, to come to a different conclusion:

3. Reversibility and irreversibility

….There is another way of looking at this apparent irreversibility within our theory which recognizes only Process 2. When an observer performs an observation the result is a superposition, each element of which describes an observer who has perceived a particular value. From this time forward there is no interaction between the separate elements of the superposition (which describe the observer as having perceived different results), since each element separately continues to obey the wave equation. Each observer described by a particular element of the superposition behaves in the future completely independently of any events in the remaining elements, and he can no longer obtain any information whatsoever concerning these other elements (they are completely unobservable to him).

The irreversibility of the measuring process is therefore, within our framework, simply a subjective manifestation reflecting the fact that in observation processes the state of the observer is transformed into a superposition of observer states, each element of which describes an observer who is irrevocably cut off from the remaining elements. While it is conceivable that some outside agency could reverse the total wave function, such a change cannot be brought about by any observer which is represented by a single element of a superposition, since he is entirely powerless to have any influence on any other elements.

There are, therefore, fundamental restrictions to the knowledge that an observer can obtain about the state of the universe. It is impossible for any observer to discover the total state function of any physical system, since the process of observation itself leaves no independent state for the system or the observer, but only a composite system state in which the object-system states are inextricably bound up with the observer states. As soon as the observation is performed, the composite state is split into a superposition for which each element describes a different object-system state and an observer with (different) knowledge of it. Only the totality of these observer states, with their diverse knowledge, contains complete information about the original object-system state - but there is no possible communication between the observers described by these separate states. Any single observer can therefore possess knowledge only of the relative state function (relative to his state) of any systems, which is in any case all that is of any importance to him….

So instead of the observer collapsing the wavefunction of the electron with his magnets, the observer splits into two observers. One observer sees a spin up electron and the other observer sees a spin down electron. These two observers are totally unaware of each other and are completely cut off from each other, with no possibility of interaction. This is how, when a quantum computer reads a 1-qubit memory location that is in a superposition of 1 and 0 at the top of an if-then-else block, one instance of the quantum computer executes the then-block, while the other instance executes the else-block.
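As an illustration of this branching bookkeeping (and emphatically not of how a real quantum computer is programmed), here is a minimal Python sketch along the lines of Everett's picture; all of the names in it are invented for the example. Instead of collapsing the qubit, the "measurement" splits every existing branch into a then-branch and an else-branch, each carrying its own amplitude and its own record of what its copy of the computer went on to execute.

from dataclasses import dataclass
from math import sqrt

@dataclass
class Branch:
    amplitude: float   # amplitude of this element of the superposition
    qubit: int         # the value this branch's observer recorded (0 or 1)
    log: list          # what this branch's computer went on to execute

def measure_and_branch(branches):
    # Everett-style "measurement": no collapse, every existing branch splits in two.
    new_branches = []
    for b in branches:
        for value, amp in ((0, 1 / sqrt(2)), (1, 1 / sqrt(2))):
            new_branches.append(Branch(b.amplitude * amp, value, list(b.log)))
    return new_branches

def run_if_then_else(branch):
    # Each branch independently executes the block selected by its own qubit value.
    if branch.qubit == 1:
        branch.log.append("then-block executed")
    else:
        branch.log.append("else-block executed")

# One starting branch whose qubit was prepared in the superposition (|0> + |1>)/sqrt(2).
universe = [Branch(amplitude=1.0, qubit=None, log=[])]
universe = measure_and_branch(universe)   # the read at the top of the if-then-else
for b in universe:
    run_if_then_else(b)

for b in universe:
    # The Born weight |amplitude|**2 is what each branch's observer calls "probability".
    print(f"qubit={b.qubit}  weight={b.amplitude ** 2:.2f}  log={b.log}")

Running the sketch prints two branches, each with Born weight 0.50, one having executed the then-block and the other the else-block, matching the two instances of the computer described in the paragraph above.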

….We take this opportunity to caution against a certain viewpoint which can lead to difficulties. This is the idea that, after an apparatus has interacted with a system, in "actuality" one or another of the elements of the resultant superposition described by the composite state-function has been realized to the exclusion of the rest, the existing one simply being unknown to an external observer (i.e., that instead of the superposition there is a genuine mixture). This position must be erroneous since there is always the possibility for the external observer to make use of interference properties between the elements of the superposition.

In the present example, for instance, it is in principle possible to deflect the two beams back toward one another with magnetic fields and recombine them in another inhomogeneous field, which duplicates the first, in such a manner that the original spin state (before entering the apparatus) is restored. This would not be possible if the original Stern-Gerlach apparatus performed the function of converting the original wave packet into a non-interfering mixture of packets for the two spin cases. Therefore the position that after the atom has passed through the inhomogeneous field it is "really" in one or the other beam with the corresponding spin, although we are ignorant of which one, is incorrect.

Shooting a beam of electrons through an inhomogeneous magnetic field will cause two beams to seemingly emerge, one with spin up electrons and one with spin down electrons. But according to Hugh Everett each electron ends up in each beam, in two separate universes, and each electron is just as “real” as the other. Hugh Everett maintains that this must be so because theoretically it is possible to reflect the electrons coming out of a Stern-Gerlach device back through the device and return the spin up and spin down electrons to the original superposition of spin up and spin down. Essentially, this is what our circular tub of water would do to the circular waves arising from dropping a pebble into the center of the circular tub of water.
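The distinction Hugh Everett is drawing can be stated compactly: a superposition like a|spin up> + b|spin down> carries a definite phase relation between its two pieces, and a second, reversed Stern-Gerlach apparatus can exploit that phase to reassemble the original beam; a genuine statistical mixture of electrons that are “really” spin up or “really” spin down carries no such phase, so the reassembly would fail. The mere possibility, in principle, of this interference is his argument that both elements of the superposition are equally real.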

It is therefore improper to attribute any less validity or "reality" to any element of a superposition than any other element, due to this ever present possibility of obtaining interference effects between the elements. All elements of a superposition must be regarded as simultaneously existing.

Below is Hugh Everett’s final chapter in its entirety where he nicely sums things up, without any mathematics at all.

VI. DISCUSSION

We have shown that our theory based on pure wave mechanics, which takes as the basic description of physical systems the state function - supposed to be an objective description (i.e., in one-one, rather than statistical, correspondence to the behavior of the system) - can be put in satisfactory correspondence with experience. We saw that the probabilistic assertions of the usual interpretation of quantum mechanics can be deduced from this theory, in a manner analogous to the methods of classical statistical mechanics, as subjective appearances to observers - observers which were regarded simply as physical systems subject to the same type of description and laws as any other systems, and having no preferred position. The theory is therefore capable of supplying us with a complete conceptual model of the universe, consistent with the assumption that it contains more than one observer.

Because the theory gives us an objective description, it constitutes a framework in which a number of puzzling subjects (such as classical level phenomena, the measuring process itself, the inter-relationship of several observers, questions of reversibility and irreversibility, etc.) can be investigated in detail in a logically consistent manner. It supplies a new way of viewing processes, which clarifies many apparent paradoxes of the usual interpretation - indeed, it constitutes an objective framework in which it is possible to understand the general consistency of the ordinary view.

We shall now resume our discussion of alternative interpretations. There has been expressed lately a great deal of dissatisfaction with the present form of quantum theory by a number of authors, and a wide variety of new interpretations have sprung into existence. We shall now attempt to classify briefly a number of these interpretations, and comment upon them.

a. The "popular" interpretation. This is the scheme alluded to in the introduction, where ψ is regarded as objectively characterizing the single system, obeying a deterministic wave equation when the system is isolated but changing probabilistically and discontinuously under observation.

In its unrestricted form this view can lead to paradoxes like that mentioned in the introduction, and is therefore untenable. However, this view is consistent so long as it is assumed that there is only one observer in the universe (the solipsist position - Alternative 1 of the Introduction). This consistency is most easily understood from the viewpoint of our own theory, where we were able to show that all phenomena will seem to follow the predictions of this scheme to any observer. Our theory therefore justifies the personal adoption of this probabilistic interpretation, for purposes of making practical predictions, from a more satisfactory framework.

b. The Copenhagen interpretation. This is the interpretation developed by Bohr. The ψ function is not regarded as an objective description of a physical system (i.e., it is in no sense a conceptual model), but is regarded as merely a mathematical artifice which enables one to make statistical predictions, albeit the best predictions which it is possible to make. This interpretation in fact denies the very possibility of a single conceptual model applicable to the quantum realm, and asserts that the totality of phenomena can only be understood by the use of different, mutually exclusive (i.e., "complementary") models in different situations. All statements about microscopic phenomena are regarded as meaningless unless accompanied by a complete description (classical) of an experimental arrangement.

While undoubtedly safe from contradiction, due to its extreme conservatism, it is perhaps overcautious. We do not believe that the primary purpose of theoretical physics is to construct "safe" theories at severe cost in the applicability of their concepts, which is a sterile occupation, but to make useful models which serve for a time and are replaced as they are outworn.

Another objectionable feature of this position is its strong reliance upon the classical level from the outset, which precludes any possibility of explaining this level on the basis of an underlying quantum theory. (The deduction of classical phenomena from quantum theory is impossible simply because no meaningful statements can be made without pre-existing classical apparatus to serve as a reference frame.) This interpretation suffers from the dualism of adhering to a "reality" concept (i.e., the possibility of objective description) on the classical level but renouncing the same in the quantum domain.

There is some political maneuvering going on in the above passage. The “popular” interpretation really is the Copenhagen Interpretation, but Niels Bohr was still a living giant of quantum theory at the time, and it would not have been wise for this Ph.D. thesis to be seen as a direct attack on the Copenhagen Interpretation and Niels Bohr. So Hugh Everett breaks the Copenhagen Interpretation apart into two parts. Part 1 he calls the “popular” interpretation, in which wavefunctions mysteriously collapse when an observation is made. Part 2 he calls the Copenhagen Interpretation, in which wavefunctions are just a mathematical tool used to perform calculations after you have set up a macroscopic experiment to make a quantum mechanical measurement. For example, the wavefunctions of electrons passing through a nonuniform magnetic field could be used to calculate that 50% will be observed to be spin up electrons, while 50% will be observed to be spin down electrons. So now we know that the interpretation that Hugh Everett has been attacking from the very first line of his thesis:

“a particular interpretation of quantum theory which, although not representative of the more careful formulations of some writers, is the most common form encountered in textbooks and university lectures on the subject.”

is really not Niels Bohr’s sacred Copenhagen Interpretation at all, but only the “popular” interpretation that he carefully split off from it. I think we all would have practiced a similar maneuver in his shoes.

c. The "hidden variables" interpretation. This is the position (Alternative 4 of the Introduction) that '" is not a complete description of a single system. It is assumed that the correct complete description, which would involve further (hidden) parameters, would lead to a deterministic theory, from which the probabilistic aspects arise as a result of our ignorance of these extra parameters in the same manner as in classical statistical mechanics.

The ψ function is therefore regarded as a description of an ensemble of systems rather than a single system. Proponents of this interpretation include Einstein, Bohm, Wiener and Siegal.

Einstein hopes that a theory along the lines of his general relativity, where all of physics is reduced to the geometry of space-time could satisfactorily explain quantum effects. In such a theory a particle is no longer a simple object but possesses an enormous amount of structure (i.e., it is thought of as a region of space-time of high curvature). It is conceivable that the interactions of such "particles" would depend in a sensitive way upon the details of this structure, which would then play the role of the "hidden variables". However, these theories are non-linear and it is enormously difficult to obtain any conclusive results. Nevertheless, the possibility cannot be discounted.

Bohm considers ψ to be a real force field acting on a particle which always has a well-defined position and momentum (which are the hidden variables of this theory). The ψ-field satisfying Schrödinger's equation is pictured as somewhat analogous to the electromagnetic field satisfying Maxwell's equations, although for systems of n particles the ψ-field is in a 3n-dimensional space. With this theory Bohm succeeds in showing that in all actual cases of measurement the best predictions that can be made are those of the usual theory, so that no experiments could ever rule out his interpretation in favor of the ordinary theory. Our main criticism of this view is on the grounds of simplicity - if one desires to hold the view that ψ is a real field then the associated particle is superfluous since, as we have endeavored to illustrate, the pure wave theory is itself satisfactory.

Wiener and Siegal have developed a theory which is more closely tied to the formalism of quantum mechanics. From the set N of all nondegenerate linear Hermitian operators for a system having a complete set of eigenstates, a subset I is chosen such that no two members of I commute and every element outside I commutes with at least one element of I. The set I therefore contains precisely one operator for every orientation of the principal axes of the Hilbert space for the system. It is postulated that each of the operators of I corresponds to an independent observable which can take any of the real numerical values of the spectrum of the operator. This theory, in its present form, is a theory of infinitely many "hidden variables," since a system is pictured as possessing (at each instant) a value for every one of these "observables" simultaneously, with the changes in these values obeying precise (deterministic) dynamical laws. However, the change of any one of these variables with time depends upon the entire set of observables, so that it is impossible ever to discover by measurement the complete set of values for a system (since only one "observable" at a time can be observed). Therefore, statistical ensembles are introduced, in which the values of all of the observables are related to points in a "differential space," which is a Hilbert space containing a measure for which each (differential space) coordinate has an independent normal distribution. It is then shown that the resulting statistical dynamics is in accord with the usual form of quantum theory.

It cannot be disputed that these theories are often appealing, and might conceivably become important should future discoveries indicate serious inadequacies in the present scheme (i.e., they might be more easily modified to encompass new experience). But from our viewpoint they are usually more cumbersome than the conceptually simpler theory based on pure wave mechanics. Nevertheless, these theories are of great theoretical importance because they provide us with examples that "hidden variables" theories are indeed possible.

d. The stochastic process interpretation. This is the point of view which holds that the fundamental processes of nature are stochastic (i.e., probabilistic) processes. According to this picture physical systems are supposed to exist at all times in definite states, but the states are continually undergoing probabilistic changes. The discontinuous probabilistic "quantum-jumps" are not associated with acts of observation, but are fundamental to the systems themselves.

A stochastic theory which emphasizes the particle, rather than wave, aspects of quantum theory has been investigated by Bopp. The particles do not obey deterministic laws of motion, but rather probabilistic laws, and by developing a general "correlation statistics" Bopp shows that his quantum scheme is a special case which gives results in accord with the usual theory. (This accord is only approximate and in principle one could decide between the theories. The approximation is so close, however, that it is hardly conceivable that a decision would be practically feasible.)

Bopp's theory seems to stem from a desire to have a theory founded upon particles rather than waves, since it is this particle aspect (highly localized phenomena) which is most frequently encountered in present day high-energy experiments (cloud chamber tracks, etc.). However, it seems to us to be much easier to understand particle aspects from a wave picture (concentrated wave packets) than it is to understand wave aspects (diffraction, interference, etc.) from a particle picture.

Nevertheless, there can be no fundamental objection to the idea of a stochastic theory, except on grounds of a naked prejudice for determinism. The question of determinism or indeterminism in nature is obviously forever undecidable in physics, since for any current deterministic [probabilistic] theory one could always postulate that a refinement of the theory would disclose a probabilistic [deterministic] substructure, and that the current deterministic [probabilistic] theory is to be explained in terms of the refined theory on the basis of the law of large numbers [ignorance of hidden variables]. However, it is quite another matter to object to a mixture of the two where the probabilistic processes occur only with acts of observation.

e. The wave interpretation. This is the position proposed in the present thesis, in which the wave function itself is held to be the fundamental entity, obeying at all times a deterministic wave equation.

This view also corresponds most closely with that held by Schrödinger. However, this picture only makes sense when observation processes themselves are treated within the theory. It is only in this manner that the apparent existence of definite macroscopic objects, as well as localized phenomena, such as tracks in cloud chambers, can be satisfactorily explained in a wave theory where the waves are continually diffusing. With the deduction in this theory that phenomena will appear to observers to be subject to Process 1, Heisenberg's criticism of Schrödinger’s opinion - that continuous wave mechanics could not seem to explain the discontinuities which are everywhere observed - is effectively met. The "quantum jumps" exist in our theory as relative phenomena (i.e., the states of an object-system relative to chosen observer states show this effect), while the absolute states change quite continuously.

The wave theory is definitely tenable and forms, we believe, the simplest complete, self-consistent theory.

We should like now to comment on some views expressed by Einstein. Einstein's criticism of quantum theory (which is actually directed more against what we have called the "popular" view than Bohr's interpretation) is mainly concerned with the drastic changes of state brought about by simple acts of observation (i.e., the infinitely rapid collapse of wave functions), particularly in connection with correlated systems which are widely separated so as to be mechanically uncoupled at the time of observation. At another time he put his feeling colorfully by stating that he could not believe that a mouse could bring about drastic changes in the universe simply by looking at it.

However, from the standpoint of our theory, it is not so much the system which is affected by an observation as the observer, who becomes correlated to the system.

In the case of observation of one system of a pair of spatially separated, correlated systems, nothing happens to the remote system to make any of its states more "real" than the rest. It had no independent states to begin with, but a number of states occurring in a superposition with corresponding states for the other (near) system. Observation of the near system simply correlates the observer to this system, a purely local process - but a process which also entails automatic correlation with the remote system. Each state of the remote system still exists with the same amplitude in a superposition, but now a superposition for which each element contains, in addition to a remote system state and correlated near system state, an observer state which describes an observer who perceives the state of the near system. From the present viewpoint all elements of this superposition are equally "real." Only the observer state has changed, so as to become correlated with the state of the near system and hence naturally with that of the remote system also. The mouse does not affect the universe - only the mouse is affected.


This is Hugh Everett’s solution to the EPR paradox. Recall that if we prepare many pairs of quantum mechanically “entangled” electrons that conserve angular momentum, initially each electron will be in a mixture of spin states because it has not been measured yet. In the Copenhagen Interpretation these electrons “really” do not know what their spins are at this point, but when observer A and observer B later measure their spins with a Stern-Gerlach device and then compare notes, they will find that whenever observer A measured a spin up electron, observer B measured its twin as a spin down electron. Since the electrons “really” did not know what their spins were before being measured, somehow measuring the spin of an electron “here” instantaneously determined the spin of its twin over “there”, and the “here” and “there” can be on opposite ends of the visible Universe. In 1982 Alain Aspect actually conducted an experiment that validated this finding, using photons instead of electrons, so this is not just a thought experiment. The Universe actually behaves like this!

Here is Hugh Everett’s solution. Each near electron is in a superposition of states with its twin electron over “there”, and each element of that superposition must conserve angular momentum because that is the law, meaning that one electron is spin up and the other is spin down. When observer A “measures” a near electron, observer A becomes correlated with the near electron, and also with its twin electron over “there”, because both of those electrons were already in a correlated superposition state to begin with. When observer B measures the twin electron over “there”, he becomes correlated with the twin electron, and consequently with the correlated superposition state of observer A and both electrons. That is why when observer A finds a spin up electron, observer B finds a spin down electron. Basically, observer A and observer B are not really measuring the spins of the electrons. Instead, they have really put together a very complex experiment that always places both of them into the same universe among many parallel universes. That is why the mouse does not affect the universe.
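A similar minimal Python sketch (again just branch bookkeeping with invented names) shows why the notes always agree: when we enumerate the branches of the entangled pair together with the records of the two observers, every surviving branch contains exactly one spin up record and one spin down record, so perfect anticorrelation holds in every universe without any signal passing between observer A and observer B.

from math import sqrt

# The entangled pair written as a list of branches: (amplitude, spin A sees, spin B sees).
singlet = [(+1 / sqrt(2), "spin up", "spin down"),
           (-1 / sqrt(2), "spin down", "spin up")]

# When A and B measure, each branch simply acquires the matching memory records;
# no branch ever pairs an A record with the "wrong" B record, because no such
# term exists in the superposition to begin with.
branches = [(amp, {"A remembers": a, "B remembers": b}) for amp, a, b in singlet]

for amp, records in branches:
    print(f"weight={amp ** 2:.2f}  {records}")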

Our theory in a certain sense bridges the positions of Einstein and Bohr, since the complete theory is quite objective and deterministic ("God does not play dice with the universe"), and yet on the subjective level, of assertions relative to observer states, it is probabilistic in the strong sense that there is no way for observers to make any predictions better than the limitations imposed by the uncertainty principle.

In conclusion, we have seen that if we wish to adhere to objective descriptions then the principle of the psycho-physical parallelism requires that we should be able to consider some mechanical devices as representing observers. The situation is then that such devices must either cause the probabilistic discontinuities of Process 1, or must be transformed into the superpositions we have discussed. We are forced to abandon the former possibility since it leads to the situation that some physical systems would obey different laws from the rest, with no clear means for distinguishing between these two types of systems. We are thus led to our present theory which results from the complete abandonment of Process 1 as a basic process. Nevertheless, within the context of this theory, which is objectively deterministic, it develops that the probabilistic aspects of Process 1 reappear at the subjective level, as relative phenomena to observers.

One is thus free to build a conceptual model of the universe, which postulates only the existence of a universal wave function which obeys a linear wave equation. One then investigates the internal correlations in this wave function with the aim of deducing laws of physics, which are statements that take the form: Under the conditions C the property A of a subsystem of the universe (subset of the total collection of coordinates for the wave function) is correlated with the property B of another subsystem (with the manner of correlation being specified). For example, the classical mechanics of a system of massive particles becomes a law which expresses the correlation between the positions and momenta (approximate) of the particles at one time with those at another time. All statements about subsystems then become relative statements, i.e., statements about the subsystem relative to a prescribed state for the remainder (since this is generally the only way a subsystem even possesses a unique state), and all laws are correlation laws.

The theory based on pure wave mechanics is a conceptually simple causal theory, which fully maintains the principle of the psycho-physical parallelism. It therefore forms a framework in which it is possible to discuss (in addition to ordinary phenomena) observation processes themselves, including the inter-relationships of several observers, in a logical, unambiguous fashion. In addition, all of the correlation paradoxes, like that of Einstein, Rosen, and Podolsky, find easy explanation.

While our theory justifies the personal use of the probabilistic interpretation as an aid to making practical predictions, it forms a broader frame in which to understand the consistency of that interpretation. It transcends the probabilistic theory, however, in its ability to deal logically with questions of imperfect observation and approximate measurement.

Since this viewpoint will be applicable to all forms of quantum mechanics which maintain the superposition principle, it may prove a fruitful framework for the interpretation of new quantum formalisms. Field theories, particularly any which might be relativistic in the sense of general relativity, might benefit from this position, since one is free to construct formal (non-probabilistic) theories, and supply any possible statistical interpretations later. (This viewpoint avoids the necessity of considering anomalous probabilistic jumps scattered about space-time, and one can assert that field equations are satisfied everywhere and everywhen, then deduce any statistical assertions by the present method.)

By focusing attention upon questions of correlations, one may be able to deduce useful relations (correlation laws analogous to those of classical mechanics) for theories which at present do not possess known classical counterparts. Quantized fields do not generally possess pointwise independent field values, the values at one point of space-time being correlated with those at neighboring points of space-time in a manner, it is to be expected, approximating the behavior of their classical counterparts. If correlations are important in systems with only a finite number of degrees of freedom, how much more important they must be for systems of infinitely many coordinates.

Finally, aside from any possible practical advantages of the theory, it remains a matter of intellectual interest that the statistical assertions of the usual interpretation do not have the status of independent hypotheses, but are deducible (in the present sense) from the pure wave mechanics, which results from their omission.

For the more mathematically gifted, I encourage you to try reading the full text of Hugh Everett’s original draft Ph.D. thesis. One reason John Wheeler had Hugh Everett heavily edit his original 137-page document down to the final 36-page doctoral dissertation was that Wheeler was afraid the departmental physicists on Everett’s dissertation committee would not understand the material, that the oral defense of the Ph.D. thesis would not go well, and that it could possibly even lead to a failure to pass the thesis defense. So do not feel too bad if the mathematics goes way over your head.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
http://softwarephysics.blogspot.com/

Regards,
Steve Johnston