Monday, September 17, 2007

How To Think Like A Scientist

I was just about to tell you about applying science to computer science, when I realized I was getting ahead of myself. First I need to define what I mean by science. As with all of softwarephysics, this is my own operational definition. However, I think it is pretty close to the mainstream concept of what science is as held by the majority of the scientific community.

First of all, science is a way of thinking. Science has a methodology to aid in this way of thinking which has been very successful over the past 400 years. The purpose of the scientific method is to formulate theories or models of the Universe. A scientific model is a simplified approximation of reality that allows people to gain insight into the real structure and operation of true reality. Scientists create models to explain observations, predict future observations, and provide direction for thought. The scientific method is a little different than the way most people think in their daily lives, so let’s examine some of the ways people come up with ideas with a little help from our philosophical friends.

There are three main approaches to gaining knowledge:

1. Inspiration/Revelation
These are ideas that just come out of the blue with no apparent source. I find that IT people are very good at this. For example, on a conference call for a website outage I am frequently surprised at the incredible level of troubleshooting skill of many of the participants. I frequently wonder to myself “Where did that insight come from?” when somebody nails a root cause out of the blue.

Most of the great ideas in science have also come from inspiration/revelation. For example, in 1900 Max Planck had the insight that he could solve the Ultraviolet Catastrophe by assuming that charged particles in the walls of a room could only oscillate with certain fixed or quantized energies. The classical electromagnetic theory of the day predicted that the room you are currently sitting in should be bathed in a lethal level of ultraviolet light and x-rays, and that the walls of the room should be at a temperature of absolute zero, having turned over all of their available energy into zapping you to death. This was clearly evidence of a theory missing the mark by a wide margin! Planck thought that his quantized energy solution was just a mathematical trick, but in 1905 Einstein had the revelation that maybe this was not just a trick. Maybe light did not always behave as an electromagnetic wave. Maybe light sometimes behaved like a stream of particles we now call photons that only came in fixed or quantized amounts of energy. The fixed energy of the photons would match up with the fixed energies of the charged particles in the walls of your room. In 1924, Louis de Broglie had another revelation and suggested that particles, like electrons, might behave like waves too, just as a stream of photons sometimes behaved like an electromagnetic wave. In 1925 and 1926, Werner Heisenberg and Erwin Schrödinger developed quantum mechanics based upon these insights, and in 1947 the transistor, the forerunner of the transistors in your PC, was invented at Bell Labs based upon quantum mechanics.

The limits of Inspiration/Revelation:
You never know for sure that your idea is correct.

2. Deductive Rationalism
With deductive rationalism you make a few postulates, which usually come from inspiration/revelation, and then you deduce additional ideas or truths from them using pure rational thought. Plato and Descartes were big fans of deductive rationalism. It goes like this:

If A = B
And B = C
Then A = C

The limits of deductive rationalism:
In 1931, Kurt Gödel proved that no self-consistent mathematical theory powerful enough to express arithmetic could deduce all truths, and that no such theory could prove its own self-consistency (that it does not contradict itself). So you cannot deduce all truths.

3. Inductive Empiricism
With inductive empiricism you make a lot of observations and then reverse the deductive rationalism process. Aristotle and John Locke were big fans of inductive empiricism. If I observe that A = C 99.99% of the time, then I will assume that A is really equal to C, and I will chalk up the 0.01% discrepancy to observational error. I don’t know anything about B at this point because I have no observations of B’s state. However, if I make some more observations and find that B = C 99.99% of the time, then I will infer that B is really equal to C, and therefore, that A is really equal to B too.

If A = C 99.99% of the time
And B = C 99.99% of the time
Then B = C, A = C, and A = B
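
The inference above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 99.99% threshold as a hard cutoff, and the observation data are all hypothetical, made up to mirror the A, B, and C example.

```python
from fractions import Fraction

# Assumed cutoff: treat two quantities as "really equal" if they agree
# in at least 99.99% of observations (the figure used in the text).
THRESHOLD = Fraction(9999, 10000)


def agreement_rate(observations):
    """Return the fraction of (x, y) observation pairs in which x == y."""
    matches = sum(1 for x, y in observations if x == y)
    return Fraction(matches, len(observations))


def infer_equal(observations, threshold=THRESHOLD):
    """Inductive step: infer equality when agreement meets the threshold."""
    return agreement_rate(observations) >= threshold


# Hypothetical field data: 10,000 paired readings, one discrepancy each.
a_vs_c = [(1, 1)] * 9999 + [(1, 2)]
b_vs_c = [(1, 1)] * 9999 + [(2, 1)]

a_equals_c = infer_equal(a_vs_c)
b_equals_c = infer_equal(b_vs_c)

# Deductive step layered on top of the two inductive conclusions.
a_equals_b = a_equals_c and b_equals_c
```

Note that A = B is never observed directly here. It is deduced from two inductive conclusions, so it inherits the uncertainty of both.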

The limits of empirical induction:
The above may all just be coincidences, and you have to have good technology in order to make accurate observations. Most Ancient Greek philosophers did not like inductive empiricism because they thought that all physical measurements on Earth were debased and corrupt. They believed in the power of pure uncorrupted rational thought. This was largely due to the poor level of measurement technology they possessed at the time (they had no Wily). But even in the 17th century, when Galileo was demonstrating experiments to his patrons that proved, contrary to Aristotle’s teachings, that all bodies fell with the same acceleration, his patrons thought his experimental demonstrations were magic tricks!

People get into trouble when they only use one or two of the above three approaches to knowledge to make decisions. I know that I do. Politicians have frequently been known to not use any of them at all! The power of the scientific method is that it uses all three of the above approaches to knowledge. Like the checks and balances in the U.S. Constitution, this helps to keep you out of trouble.

The Scientific Method
1. Formulate a set of hypotheses based upon inspiration/revelation with a little empirical inductive evidence mixed in.

2. Expand the hypotheses into a self-consistent model or theory by deducing the implications of the hypotheses.

3. Use more empirical induction to test the model or theory by analyzing many documented field observations or performing controlled experiments to see if the model or theory holds up. It helps to have a healthy level of skepticism at this point. As philosopher Karl Popper has pointed out, you cannot prove a theory to be true; you can only prove it to be false. Galileo pointed out that the truth is not afraid of scrutiny; the more you pound on the truth, the more you confirm its validity.

Effective Theories
The next concept that we need to understand is that of effective theories. Physics currently does not have an all-encompassing unifying theory or model. Researchers are looking for a TOE – Theory of Everything in physics, but currently we do not have one. Instead, we have a series of pragmatic effective theories. An effective theory is an approximation of reality that only works over a certain range of conditions. For example, Newtonian mechanics allowed us to put men on the Moon, but it cannot explain how atoms work or why the clocks on GPS satellites run faster than clocks on Earth. All of the current theories in physics are effective theories that only work over a certain range of conditions. Physics currently comes in three sizes – Small, Medium, and Large:

• Small – less than 10^-10 meter and tiny masses
Quantum Mechanics – atomic bombs and transistors
• Medium – 19th Century Classical Physics
Newtonian Mechanics – space shuttle launches
Maxwell’s Electromagnetic Theory – electric motors
Thermodynamics – air conditioners
• Large – greater than 20,000 miles/sec or very massive objects
Einstein’s General Theory of Relativity – cosmology, black holes, and GPS satellites

Since all of the other sciences are built upon a foundation of underlying effective theories in physics, that means that all of science is “wrong”! But knowing that you are “wrong” gives you a huge advantage over people who know that they are “right” because knowing that you are “wrong” allows you to keep an open mind to search for models that are better approximations of reality.

In addition to covering different ranges of conditions, effective theories also come in different levels of depth with more profound effective theories providing deeper levels of insight. For example, Charles’ Law is a very high-level effective theory that states that at a constant pressure, the volume of a gas in a cylinder is proportional to the absolute temperature of the gas. If you double the absolute temperature of a gas in a cylinder having a freely moving piston, its volume will expand and double in size. A more profound effective theory for the same phenomenon is called statistical mechanics, which views the gas as a large number of molecules bouncing around in the cylinder. When you double the absolute temperature of the gas, you double the average kinetic energy of the molecules, so they bounce around faster and take up more room. An even deeper effective theory is called quantum mechanics, which views the molecules as standing waves in the cylinder.
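
To show just how high-level Charles’ Law is, here is a minimal Python sketch of it. The function names and sample values are my own assumptions for illustration. The one physical point it relies on is that the proportionality holds for absolute temperature (kelvins), so doubling a Celsius reading does not double the volume.

```python
def celsius_to_kelvin(temp_celsius):
    """Convert a Celsius temperature to kelvins (absolute temperature)."""
    return temp_celsius + 273.15


def charles_volume(initial_volume, initial_kelvin, final_kelvin):
    """Charles' Law at constant pressure: V2 = V1 * (T2 / T1), T in kelvins."""
    if initial_kelvin <= 0 or final_kelvin <= 0:
        raise ValueError("absolute temperatures must be positive")
    return initial_volume * final_kelvin / initial_kelvin


# Doubling the absolute temperature doubles the volume.
volume_doubled = charles_volume(1.0, 300.0, 600.0)

# Doubling the Celsius reading (20 C to 40 C) gives only a modest expansion.
volume_celsius = charles_volume(
    1.0, celsius_to_kelvin(20.0), celsius_to_kelvin(40.0)
)
```

The whole theory fits in one line of arithmetic and says nothing about molecules or standing waves, yet it is good enough to make predictions over its range of conditions.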

The goal of softwarephysics is to provide a pragmatic high-level effective theory of software behavior at a level of complexity similar to that of Charles’ Law. Having an effective theory of software behavior is useful because it allows you to make day-to-day IT decisions with more confidence. For example, suppose you learn 30 minutes before your maintenance window goes down that you have a new EJB that must go into production, but that it corrupts 0.5% of a certain new database transaction. A young programmer on your team quickly produces a “fixed” version of the EJB, but he does not have time to regression test it. Do you put the “fixed” EJB into production, or do you go with the one with the known 0.5% bug with the hope that the corrupted database records can be corrected later? As we shall see later, softwarephysics helps in such situations.

The Most Difficult Thing in Science
The final concept of the scientific method is the most difficult for human beings. In science, you are not allowed to believe in things. You are not allowed to accept models or theories without supporting evidence. However, you are allowed to have a level of confidence in models and theories. For example, I do not “believe” in Newtonian mechanics because I know that it is “wrong”, but I do have a high level of confidence that it could launch me into an Earth orbit. I might get blown up on the launch pad, but like all of our astronauts, I would bet my life on Newtonian mechanics getting me into an Earth orbit instead of plunging me into the Sun if I do my calculations properly! Similarly, I have a low level of confidence in the old miasma theory of disease. In the early 19th century, it was thought by the scientific community that diseases were caused by miasma, a substance found in foul smelling air. And there was a lot of empirical evidence to support this model. For example, people who lived near foul smelling 19th century rivers were more prone to dying of cholera than people who lived further from the rivers. We had death certificate data to prove that empirical fact. If you were running a cesspool cleaning business in the 19th century, you knew that on the first day of work your rookies were likely to get sick and vomit when they were exposed to the miasma from their first cesspool and a few days later they might come down with a fever and die on you! The miasma theory of disease even had predictive power! If you were running a 19th century cesspool cleaning business in the middle of a cholera epidemic, and you shut down your operation during the epidemic, while your competitors kept theirs open, you would probably enjoy a larger market share when the epidemic subsided. This just highlights the dangers of relying too heavily on the inductive empiricism approach to gaining knowledge.

As a human being, it is hard not to believe in things. I have been married for 32 years and I have two wonderful adult children. And I truly believe in them all! If somebody confronted me with incontrovertible evidence that one of my children had embezzled funds, my first thought would be that there must be some horrible mistake. However, in scientific matters, you are not allowed this luxury.

The Scientific Method and Softwarephysics
So what does all of this have to do with softwarephysics? Softwarephysics is a high-level effective theory of software behavior. It is a simulated science for the simulated Software Universe that we are all immersed in. Let me explain. In the 1970s, I was an exploration geophysicist writing FORTRAN software to simulate geophysical observations for oil companies. When I transitioned into IT in 1979, it seemed like I was trapped in a frantic computer simulation, just like the ones I used to program for oil companies. After a few months in Amoco’s IT department, I had the following inspiration/revelation:

The Equivalence Conjecture of Softwarephysics

Over the past 70 years, through the uncoordinated efforts of over 50 million independently acting programmers to provide the world with a global supply of software, the IT community has accidentally spent more than $10 trillion creating a computer simulation of the physical Universe on a grand scale – the Software Universe.

I soon realized that I could use this simulation in reverse. By understanding how the physical Universe behaved, I could predict how the Software Universe would react to stimuli, and I proceeded to deduce many implications for software behavior based upon this insight. This was a bit of a role reversal; in physics we use software to simulate the behavior of the Universe, while in softwarephysics we use the Universe to simulate the behavior of software.

The one problem that I have always had with softwarephysics has been with the confirmation of the model via inductive empiricism. How do you produce and analyze large amounts of documented field observations of software behavior or run controlled experiments for a simulated science? “Hey, Boss, I would like to run a double-blind experiment where we install software into production, but only half of it goes through UAT testing. The other half comes straight from the programmers as is, and we don’t know which is which in advance.” Unfortunately, I have always had a full-time job without the luxury of graduate students! So I am relying on 30+ years of personal anecdotal observation of software behavior to offer softwarephysics as a working hypothesis.

Next time I will describe why applying science to computer science is a good idea using the challenges faced by steam engine designers in the 18th century as a case study.

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston
