Sunday, December 08, 2013

Some Additional Thoughts on the Galactic Scarcity of Software

One of the essential findings of softwarephysics is that there are currently three forms of self-replicating information upon the Earth – the genes, memes, and software, with software, rapidly becoming the dominant form of self-replicating information on the planet and, indeed, within our very own Solar System too. For more on this, see A Brief History of Self-Replicating Information. In several other previous postings, we have discussed the significance of Fermi’s Paradox as it relates to the scarcity of software within our galaxy.

Fermi’s Paradox - If the Universe is just chock full of intelligent beings, why do we not see any evidence of their existence?

The corollary to Fermi’s Paradox being that, if the Universe is just chock full of intelligent beings, it should also be just chock full of software, and we should see evidence of that software because the effects of software are the easiest to observe from afar. Basically, I have suggested two explanations for our inability to detect the presence of any intelligent beings or software within our galaxy, beyond that which is already found within our very own Solar System:

1. In SETS - The Search For Extraterrestrial Software and CyberCosmology, I suggested that the Rare Earth Hypothesis presented in the classic Rare Earth (2000) of Peter Ward and Donald Brownlee, may indicate that we are indeed the very first technological civilization to arise within our Milky Way galaxy that was capable of initiating the origin of software. With this explanation, the reason that our radio telescopes are not currently choked with tons of friendly intragalactic SPAM for the construction of alien computers, and the free downloads of alien software, is that there just are no other civilizations out there in our galaxy that have ever attained the ability to kick off the origin of software. Similarly, the reason we do not find ourselves knee-high in von Neumann probes, stuffed with alien software, is that there is nobody out there in our galaxy that ever built and launched the very first one. Calculations show that even with the very limited level of technology that is nearly within our grasp, one could easily build self-replicating von Neumann probes that could explore the entire Milky Way galaxy in as little as 5 million years, and that you could do that by only releasing a handful of these von Neumann probes upon a galaxy to get things going. Indeed, this should be so easy to do that the costs of doing so should not be a hindering factor for any technological civilization worth its salt.

2. In The Sounds of Silence the Unsettling Mystery of the Great Cosmic Stillness, I suggested a far more disturbing and sinister explanation. I proposed that the discovery of science and technology itself snuffs out all technological civilizations with an efficiency of nearly 100% before they can unleash software upon a galaxy. The problem is that when a civilization stumbles upon science and technology, the old-time meme-complexes of the past do not go away, and doing things like giving iPhones, iPads, thermonuclear weapons, missile systems, and billions of machines capable of spewing out 24 billion tons of carbon dioxide each year to very ancient political and religious meme-complexes snuffs out alien civilizations before the software they generate even has a chance to take over and begin to explore the galaxy.

Most likely, the explanation for Fermi’s Paradox stems from a combination of both factors, and possibly, a few more to boot. Personally, neither of these explanations is very appealing to me, but since I truly do love science and technology, the second is much more disturbing for me than the first. I just finished reading Alone in the Universe – Why Our Planet is Unique (2011) by one of my most favorite authors, John Gribbin. If you ever want somebody to explain really complex scientific ideas in understandable terms, you should always turn to John Gribbin. I have read many of his books in the past, and I have just put the remainder on my reading list for the future. In Alone in the Universe – Why Our Planet is Unique, John Gribbin provides some additional supporting evidence for the Rare Earth Hypothesis in addition to the evidence found in the classic Rare Earth of Peter Ward and Donald Brownlee. For example, Gribbin points out that out of the dozens of civilizations that Homo sapiens has churned out in the past 10,000 years, only one was even able to figure out that the Earth was not the center of the Universe, and then go on to develop an advanced technology. The fact that human beings were probably just as smart 200,000 years ago as they are today, and that it took a full 200,000 years for them to develop an advanced technology, does not bode well for the development of civilizations capable of developing technology and software. And even today, much of the world’s population is still scientifically illiterate. For example, many of the members of the United States Congress are clearly scientifically illiterate, and proudly so.

But the most disturbing new piece of evidence supporting the Rare Earth Hypothesis is the fact that a nearby star, Gliese 710, currently at a distance of 63.8 light years, is heading our way. In about 1.36 million years Gliese 710 will be within 1.10 light years of the Sun and will likely be able to disturb billions of very distant comets in the Sun’s Oort cloud. Some of these very distant comets will then be perturbed into orbits that carry them into the inner regions of our Solar System and on a collision course with the Earth. It is possible that the Earth will then be subjected to mass extinction collisions on a weekly basis for many millions of years that rival or even exceed the one that killed off the dinosaurs 65 million years ago in the K-T mass extinction. Essentially, the Earth will return to a period like the Late Heavy Bombardment which occurred 3.8 – 4.1 billion years ago, and which left many craters on the Moon and the Earth. Clearly, only the simplest of single-celled life forms on Earth could survive such a massive onslaught of incoming comets, and even they would have a very tough go of it. John Gribbin points out that if these incoming comets had started to arrive upon the Earth just 400 years ago, instead of 1.36 million years into the future, they would have prevented the Earth from ever having achieved a technological civilization during its entire 4.567 billion year history capable of exploring our galaxy with software. Our Sun is increasing in brightness by a factor of about 1% every 100 million years, as the amount of helium in the Sun’s core continuously increases, and consequently, increases the density and gravitational strength of the Sun’s core, since a helium nucleus has a mass equal to the mass of four hydrogen nuclei. The increasing gravitational pull of the Sun’s core requires a counteracting increase in the pressure within the Sun’s core to match the increasing gravitational pull of the core. This means that the remaining hydrogen protons within the Sun’s core must move faster at a higher temperature to increase the core’s pressure. The faster-moving hydrogen protons cause the proton-proton nuclear reaction running within the Sun’s core to run faster and release more energy at a higher rate. This increased rate of the production of energy within the Sun’s core has to go someplace, so the Sun ends up radiating more energy into space, and the poor Earth just happens to be in the way. In The Life and Death of Planet Earth (2002), Peter Ward explains that rocky planets like our Earth probably can only support complex multicellular life for about 1 billion years. Our Earth has already burned up about 600 million years of that 1 billion years of opportunity, so the odds of complex multicellular life arising once again upon the Earth after a second Late Heavy Bombardment episode are quite slim indeed, not to mention the improbability of those complex multicellular life forms going on to evolve into intelligent beings capable of developing technology and software. So the arrival of Gliese 710 in 1.36 million years may be all she wrote. It might be the technological end of the line for our Earth.

Therefore it seems that, once again, that although Brandon Carter’s Weak Anthropic Principle (1973) may guarantee that all intelligent beings will only find themselves in universes capable of sustaining intelligent beings, it does not guarantee that the universe that they find themselves in will be overly friendly to intelligent beings.

The Weak Anthropic Principle - Intelligent beings will only find themselves existing in universes capable of sustaining intelligent beings.

It seems that our Universe certainly just barely qualifies as a home for intelligent beings.

So Why is there Any Software At All?
If intelligent beings and software are indeed so very rare in our Universe, one must then ask the question why is there any software to be found at all? Perhaps the answer to that question can be found at the very end of another one of John Gribbin’s books, In search of the multiverse: parallel worlds, hidden dimensions, and the ultimate quest for the frontiers of reality (2009). The currently emerging general consensus amongst cosmologists seems to be that our Universe is just one member of an infinitely large and eternal Multiverse. This paradigm very nicely explains Brandon Carter’s Weak Anthropic Principle – it is just a selection process at work. Of the infinite number of universes in the Multiverse, only a very small fraction need to be suitable for intelligent beings and software. But since a small fraction of an infinite number is still infinite, that provides plenty of opportunities within the Multiverse for intelligent beings. So what kind of universe should an intelligent being expect to find itself in? Well, most likely an intelligent being should expect to find itself in a universe such as ours that just barely makes the grade because D- universes like ours will vastly outnumber the universes that are extremely friendly to the existence of intelligent beings. The other reason that cosmologists are leaning towards the existence of a Multiverse is that the string theorists are coming up with at least 10500 different ways to make a universe with string theory, as outlined in The Cosmic Landscape (2006) by Leonard Susskind. A Multiverse, therefore, solves many problems at both the very largest and the very smallest of scales.

At the very end of In search of the multiverse : parallel worlds, hidden dimensions, and the ultimate quest for the frontiers of reality, John Gribbin considers Lee Smolin’s theory that universes arise in the Multiverse when black holes form in a preceding universe, as outlined in Smolin’s The Life of the Cosmos (1997). In The Life of the Cosmos, Lee Smolin proposes that a kind of Darwinian natural selection occurs in the Cosmic Landscape of the Multiverse. Universes that are very good at making black holes will spawn many offspring universes, while those that are not as good at making black holes will not. Thus over an infinite amount of time, the universes that are good at making black holes will come to dominate the Cosmic Landscape. For Smolin, the intelligent beings in our Universe are just a fortuitous by-product of making black holes because, in order for a universe to make black holes, it must exist for many billions of years, and do other useful things, like easily make carbon in the cores of stars, and all of these factors aid in the formation of intelligent beings, even if those intelligent beings might be quite rare in such a universe. I have always liked Lee Smolin’s theory about black holes in one universe spawning new universes in the Multiverse, but I have always been bothered by the idea that intelligent beings are just a by-product of black hole creation because it seems a bit too coincidental. Towards the end of In search of the multiverse : parallel worlds, hidden dimensions, and the ultimate quest for the frontiers of reality, John Gribbin proposes a solution to that quandary. Perhaps intelligent beings in a preceding universe might be responsible for creating the next generation of universes in the Multiverse by attaining the ability to create black holes on a massive scale. For example, people at CERN are currently trying to create mini-black holes with the LHC collider. Currently, it is thought that there is a supermassive black hole at the center of the Milky Way Galaxy and apparently all other galaxies as well. In addition to the supermassive black holes found at the centers of galaxies, there are also numerous stellar-mass black holes that form when the most massive stars in the galaxies end their lives in supernova explosions. For example, our Milky Way galaxy contains several hundred billion stars, and about one out of every thousand of those stars is massive enough to become a black hole. Therefore, our galaxy should contain about 100 million stellar-mass black holes. Actually, the estimates run from about 10 million to a billion black holes in our galaxy, with 100 million black holes being the best order of magnitude guess. So let us presume that it took the current age of the Milky Way galaxy, about 10 billion years, to produce 100 million black holes naturally. Currently, the LHC collider at CERN can produce at least 100 million collisions per second, which is about the number of black holes that the Milky Way galaxy produced in 10 billion years. Now imagine that we could build a collider that produced 100 million black holes per second. Such a prodigious rate of black hole generation would far surpass the natural rate of black hole production of our galaxy by a factor of about 1020. Clearly, if only a single technological civilization with such technological capabilities should arise anytime during the entire history of each galaxy within a given universe, such a universe would spawn a huge number of offspring universes, compared to those universes that could not sustain intelligent beings with such capabilities. As Lee Smolin pointed out, we would then see natural selection in action again because the Multiverse would come to be dominated by universes in which it was easy for intelligent beings to make black holes with a minimum of technology. The requirements simply would be that it was very easy to produce black holes by a technological civilization, and that the universe in which these very rare technological civilizations find themselves is at least barely capable of supporting intelligent beings. It seems that these requirements describe the state of our Universe quite nicely.

This hypothesis helps to explain why our Universe seems to be such a botched job from the perspective of providing a friendly home for intelligent beings and software. All that is required for a universe to dominate the Cosmic Landscape of the Multiverse is for it to meet the bare minimum of requirements for intelligent beings to evolve, and more importantly, allow those intelligent beings to easily create black holes within them. Since software is needed in all such universes to run the machines that generate the black holes, that explains why our Universe is capable of supporting software, but just barely so, and that is why software is so rare within our galaxy and Universe.

This hypothesis also highlights our responsibilities as sentient beings in a Universe that has become self-aware. If our Universe really was created by intelligent beings in another universe within the Multiverse, these intelligent beings surely evolved from more humble origins, and probably never attained much of a higher moral standing than ourselves. That is why it is incumbent upon us as sentient beings to develop our own moral codes of conduct and then to follow them, and to not succumb to the mindless compulsions of the genes, memes and software to self-replicate at all costs, with all the associated downsides of a ruthless nature to survive.

I realize that much of this is wild speculation, but it does help to explain all of the current evidence at hand. That does not necessarily mean that it is true. However, the beauty of this explanation for the current state of the Multiverse is that it does help to heal some of the wounds between the “believers” and the “non-believers” of the modern world. As Stuart Kauffman points out in Reinventing the Sacred: A New View of Science, Reason, and Religion (2008), there is a compelling need for us to bridge this gap between the “believers” and the “non-believers” of the world. Stuart Kauffman suggests that we all can agree upon the sacredness of the emergent nonlinear Majesty of our Universe, and use this profound realization as a model for the God of our choosing. John Gribbin has another excellent book, Deep Simplicity – Bringing Order to Chaos and Complexity (2004) which describes the emergent Majesty of dissipative nonlinear systems far from equilibrium, as does At Home in the Universe (1995) by Stuart Kauffman. For more information on these topics please also see Software Chaos and The Origin of Software the Origin of Life.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Sunday, September 01, 2013

Some More Information About Information

For IT professionals it is very important to have a clear understanding of the nature of information in order for us to effectively perform our jobs, and having a good working definition of information is key to that understanding. Unfortunately, currently there are several different formulations for the concept of information to be found within differing fields of study, and this makes the concept of information even more confusing than it needs to be. In softwarephysics we have been exclusively using Leon Brillouin’s concept of information, also known as negentropy, to explore the nature of information in an IT setting because Brillouin’s concept of information seems to be the most useful formulation for IT professionals. In many softwarephysics postings, we have also seen many close correlations between the activities of IT professionals and the functions performed by living biological systems, so Brillouin’s concept of information should also be quite valuable for biologists as well.

In Entropy - the Bane of Programmers and The Demon of Software, we discussed the second law of thermodynamics and Leon Brillouin’s concept of information. Brillouin defined the change in information content of a system undergoing a change as the difference between the initial entropy of the system and the final entropy of the system after the change has been made:

     ∆Information = Si - Sf
     Si = initial entropy
     Sf = final entropy

Where the entropy of a system is defined by Boltzmann’s equation for entropy:

     S = k ln(N)
     N = number of microstates or ways of producing the macrostate of the system.

To make these ideas a bit easier to understand, we enlisted the use of poker hands to illustrate the above concepts, and we discovered that drawing three Aces to yield a hand with four Aces and a 2 of clubs resulted in a large increase in the information content of your hand, especially if your original hand consisted of an Ace of spades, 2 of clubs, 7 of hearts, 10 of diamonds and an 8 of spades. We also saw that physicists got very nervous if you started talking about destroying information because the idea of destroying information threatened all of their current effective theories. Currently, all of the theories of physics are reversible in time, meaning that they work just as well moving backwards in time as they do moving forwards in time, and if it is possible to destroy the information necessary to return a system to its initial state, then the Universe is not reversible in nature as the current theories of physics predict, and the theories of physics then collapse.

However, as IT professionals we all know that it is indeed very easy to destroy information, like destroying the code for a functional program by simply including a few typos, so what is going on? It all boils down to how you want to define the concept of information in a manner that is useful to your area of work. The reason I like Leon Brillouin’s formulation for the concept of information is that it easily highlights the difference between useful information and useless information. In Brillouin’s formulation, things containing lots of useful information have very low entropies, and consequently, are very improbable and rare things, like the code for a totally bug-free program. In Brillouin’s formulation of information, it is also very easy to turn useful low-entropy information into useless high-entropy information by simply scrambling the useful information into useless information. For example, if you take the source code file for an apparently bug-free program and scramble it with some random additions and deletions, the odds are that you will most likely end up with a low-information high-entropy mess. That is because the odds of creating an even better version of the program by means of inserting random additions and deletions are quite low while turning it into a mess are quite high. There are simply too many ways of messing up the program to win at that game. As we have seen this is simply the second law of thermodynamics in action. Leon Brillouin’s formulation for the concept of information also goes hand in hand with our current understanding of energy. According to the first law of thermodynamics, energy cannot be created nor destroyed, but the second law of thermodynamics explains that it is very easy to degrade useful low-entropy energy, like the energy in gasoline, into the useless high-entropy waste heat energy that your car’s engine generates in abundance as you drive along. Similarly, using Brillouin’s formulation for information, the second law of thermodynamics is constantly degrading low-entropy useful information into high-entropy useless information, as the entropy of the entire Universe itself constantly increases with time.

The Concept of Entropy and Information in Communications Theory
However, to make matters more confusing, there is another formulation for the concepts of entropy and information to be found in communications theory, developed by Claude Shannon in 1948. This formulation for the concept of information is very valuable in communications theory because it is useful in developing ways of encoding information digitally and then transmitting the digital data over a noisy channel with a limited bandwidth. It also can be used to define the speed with which digital information can be transmitted over a channel with a given bandwidth. No doubt, the cable company in your area runs many TV commercials making fun of the very slow DSL Internet connections that your local telephone company provides, in order to persuade you to obtain your Internet connection from the cable company. That is because the bandwidth of coaxial cable is much higher than the bandwidth of the simple copper wire that your telephone company possesses, and consequently, your cable company can push much more digital data down their coaxial cables than your poor phone company can push down their plain copper wires. Personally, I have one of those very slow phone company DSL Internet connections, rather than a high-speed coaxial connection from my cable company because my phone company has had more than 100 years of experience with keeping connections up, while my cable company only has about 30 years of experience with providing very spotty availability. As a member of MidOps on pager call for problems, it is much more important for me to have an Internet connection with an availability as close as possible to 100%, rather than to have the bandwidth necessary to push lots of bits down the line. That is because I really do not generate lots of bits of information as I push the buttons on my laptop at a human speed to fix problems.

I am not going to delve much further into communications theory because it is really not needed by softwarephysics, but I do wish to explain Claude Shannon’s concept of information because, unfortunately, it also uses the terms of information and entropy, but in a different way than does Leon Brillouin, and consequently, can lead to a great deal of confusion. Claude Shannon was interested in encoding information digitally and then transmitting it over a noisy channel. To do that he realized that all information could be encoded by using a series of 1s and 0s called bits. We do this in IT today as well. For example, in the ASCII code set we assign A = “01000001” and Z = “01011010”. Similarly, we can define the integer 25 = “00011001” using a base 2 or binary representation. Now once we convert characters and numbers into binary 1s and 0s we can then send them down a communications channel as a series of voltage spikes. Of course, all real communications channels are noisy, so there is always the possibility of misreading a message composed of 1s and 0s due to noise on the channel. To overcome that problem, you have to use an encoding scheme that not only encodes the information to be transmitted, but that also contains information that allows you to tell when a message has been garbled by noise. But in order to do that Shannon first had to figure out how much information was in a message in the first place. For example, both of the messages down below are encoded with a total of 16 1s and 0s:

     0000100000000000
     1001110010100101

However, the first message consists mainly of 0s, so it seems that it should be easier to apply some kind of error detection and correction scheme to the first message, compared to the second message, because the 1s are so rare in the first message. Doing the same thing for the second message should be much harder because the second message is composed of eight 0s and eight 1s. This led Shannon to conclude that the second message must contain more information than the first message. He also concluded that the 1s in the first message must contain more information than the 0s because the 1s were much less probable than the 0s, and consequently, the arrival of a 1 had much more significance than the arrival of a 0 in the message. Using this line of reasoning, Shannon proposed that if the probability of receiving a 0 in a message was p and the probability of receiving a 1 in a message was q, then the information H in the arrival of a single 1 or 0 must not simply be one bit of information. Instead, it must depend upon the probabilities p and q of the arriving 1s and 0s:

     H(p) = - p log2p -  q log2q

Since in this case the message is only composed of 1s and 0s, it follows that:

     q =  1 -  p

Figure 1 shows a plot of the information H(p) of the arrival of a 1 or 0 as a function of p, the probability of a 0 arriving in a message when the message is only composed of 1s and 0s:

Figure 1 – A plot of Shannon’s Information/Entropy equation H(p) versus the probability p of finding a 0 in a message composed solely of 1s and 0s

Notice that the graph peaks to a value of 1.0 when p = 0.50 and has a value of zero when p = 0.0 or p = 1.0. Now if p = 0.50 that means that q = 0.50 too because:

     q =  1 -  p

Substituting p = 0.50 and q = 0.50 into the above equation yields the information content of an arriving 0 or 1 in a message, and we find that it is equal to one full bit of information:

     H(0.50)  =  -(0.50) log2(0.50) - (0.50) log2(0.50)  =  -log2(0.50)  =  1

And we see that value of H(0.50) on the graph in Figure 1 does indeed have a value of 1 bit.

To fully understand the above equation it is necessary to review the concept of logarithms. The above equation uses a base 2 logarithm because we are concerned with messages composed of only two symbols 1 and 0. Recall that the familiar log function found on scientific calculators is a base 10 logarithm log10 and that:

     100 = 1 so the log10(1) = 0
     101 = 10 so the log10(10) = 1
     102 = 100 so the log10(100) = 2
     103 = 1000 so the log10(1000) = 3

Similarly for a base 2 logarithm:

     20 = 1 so the log2(1) = 0
     21 = 2 so the log2(2) = 1
     22 = 4 so the log2(4) = 2
     23 = 8 so the log2(8) = 3

And for numbers that are less than 1.0, like probabilities, the logarithms are negative:

     10-1 = 0.1 so the log10(0.1) = -1
     10-2 = 0.01 so the log10(0.01) = -2
     10-3 = 0.001 so the log10(0.001) = -3

Similarly for a base 2 logarithm:

     2-1 = 0.5 so the log2(0.5) = -1
     2-2 = 0.25 so the log2(0.25) = -2
     2-3 = 0.125 so the log2(0.125) = -3

Now suppose the arriving message consists only of 0s. In that case, p = 1.0 and q = 0.0, and the information content of an incoming 0 or 1 is H(1.0) and calculates out to a value of 0.0 in our equation and also in the plot of H(p) in Figure 1. This simply states that a message consisting simply of arriving 0s contains no information at all. Similarly, a message consisting only of 1s would have a p = 0.0 and a q = 1.0, and our equation and plot calculate a value of H(0.0) = 0.0 too, meaning that a message simply consisting of 1s conveys no information at all as well. What we see here is that seemingly a “messy” message consisting of many 1s and 0s conveys lots of information, while a “neat” message consisting solely of 1s or 0s conveys no information at all. When the probability of receiving a 1 or 0 in a message is 0.50 – 0.50, each arriving bit contains one full bit of information, but for any other mix of probabilities, like 0.80 – 0.20, each arriving bit contains less than a full bit of information. From the graph in Figure 1, we see that when a message has a probability mix of 0.80 – 0.20 that each arriving 1 or 0 only contains about 0.72 bits of information. The graph also shows that it does not matter whether the 1s or the 0s are the more numerous bits because the graph is symmetric about the point p = 0.50, so a 0.20 – 0.80 mix of 1s and 0s also only delivers 0.72 bits of information for each arriving 1 or 0.

Claude Shannon went on to generalize his formula for H(p) to include cases where there were more than two symbols used to encode a message:

     H(p) = - Σ p(x) log2 p(x)

The above formula says that if you use 2, 3, 4, 5 …. different symbols to encode information, just add up the probability of each symbol multiplied by the log2 of the probability of each symbol in the message. For example, suppose we choose the symbols 00, 01, 10, and 11 to send messages and that the probability of sending a 1 or a 0 are both 0.50. That means the probability p for each symbol 00, 01, 10 and 11 is 0.25 because each symbol is equally likely. So how much information does each of these two-digit symbols now contain? If we substitute the values into Shannon’s equation we get an answer of 2 full bits of information:

     H(0.25, 0.25, 0.25, 0.25) =  - 0.25 log2(0.25) - 0.25 log2(0.25)  - 0.25 log2(0.25) - 0.25 log2(0.25)  = 
     - log2(0.25) = 2

which makes sense because each symbol is composed of two one-bit symbols. In general, if all the symbols we use are n bits long, they will then all contain n bits of information each. For example, in biology genes are encoded in DNA using four bases A, C, T and G. A codon consists of 3 bases and each codon codes for a particular amino acid or is an end of file Stop codon. On average, prokaryotic bacterial genes code for about 400 amino acids using 1200 base pairs. If we assume that the probability distribution for all four bases, A, C, T and G are the same for all the bases in a gene, namely a probability of 0.25, then we can use our analysis above to conclude that each base contains 2 bits of information because we are using 4 symbols to encode the information. That means a 3 base codon contains 6 bits of information and a protein consisting of 400 amino acids contains 2400 bits of information or 300 bytes of information in IT speak.

Now here is where the confusion comes in about the nature of information. All along, using Brillouin’s formulation for the concept of information, we have been saying that “messy” things have lots of entropy and little information, while “neat” things have very little entropy and lots of information. The story goes that Claude Shannon was not quite sure what to call his formula for H(p). Then one day in 1949 he happened to visit the mathematician and early computer pioneer John von Neumann, and that is when information and entropy got mixed together in communications theory:

”My greatest concern was what to call it. I thought of calling it ‘information’, but the word was overly used, so I decided to call it ‘uncertainty’. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, ‘You should call it entropy, for two reasons. In the first place, your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage.”

Unfortunately, with that piece of advice, we ended up equating information with entropy in communications theory.

So in Claude Shannon's Information Theory people calculate the entropy, or information content of a message, by mathematically determining how much “surprise” there is in a message. For example, in Claude Shannon's Information Theory, if I transmit a binary message consisting only of 1s or only of 0s, I transmit no useful information because the person on the receiving end only sees a string of 1s or a string of 0s, and there is no “surprise” in the message. For example, the messages “1111111111” or “0000000000” are both equally boring and predictable, with no real “surprise” or information content at all. Consequently, the entropy, or information content, of each bit in these messages is zero, and the total information of all the transmitted bits in the messages is also zero because they are both totally predictable and contain no “surprise”. On the other hand, if I transmit a signal containing an equal number of 1s and 0s, there can be lots of “surprise” in the message because nobody can really tell in advance what the next bit will bring, and each bit in the message then has an entropy, or information content, of one full bit of information.

This concept of entropy and information content is very useful for people who work with transmission networks and on error detection and correction algorithms for those networks, but it is not very useful for IT professionals. For example, suppose you had a 10-bit software configuration file and the only “correct” configuration for your particular installation consisted of 10 1s in a row like this “1111111111”. In Claude Shannon's Information Theory that configuration file contains no information because it contains no “surprise”. However, in Leon Brillouin’s formulation of information there would be a total of N = 210 possible microstates or configuration files for the 10-bit configuration file, and since the only “correct” version of the configuration file for your installation is “1111111111” there are only N = 1 microstates that meet that condition.

Using the formulas we learned in The Demon of Software, we can now calculate the entropy of our single “correct” 10-bit configuration file and the entropy of all possible 10-bit configuration files:

Boltzman's Definition of Entropy
S = ln(N)
N = Number of microstates

Leon Brillouin’s Definition of Information
∆Information = Si - Sf
Si = initial entropy
Sf = final entropy

as:

Sf = ln(1) = 0

Si = ln(210) = ln (1024) = 6.93147

So using Leon Brillouin’s formulation for the concept of information the Information content of a single “correct” 10-bit configuration file is:

Si - Sf = 6.93147 – 0 = 6.93147

which, if you look at the table in The Demon of Software, contains a little more information than drawing a full house in poker without drawing any additional cards and would be even less likely for you to stumble upon by accident than drawing a full house.

So in Claude Shannon's Information Theory, a very “buggy” 10 MB executable program file would contain just as much information and would require just as many network resources to transmit as transmitting a bug-free 10 MB executable program file. Clearly, Claude Shannon's Information Theory formulations for the concepts of information and entropy are less useful for IT professionals than are Leon Brillouin’s formulations for the concepts of information and entropy.

What John von Neumann was trying to tell Claude Shannon was that his formula for H(p) looked very much like Boltzmann’s equation for entropy:

     S = k ln(N)

The main difference was that Shannon was using a base 2 logarithm, log2 in his formula, while Boltzmann used a base e natural logarithm ln or loge in his formula for entropy. But given the nature of logarithms, that really did not matter much. To see why pull up the scientific calculator on your PC. It will have an ln button to find base e natural logarithms (loge) and a log button to find normal base 10 (log10) logarithms.

Figure 2 – Scientific calculators have an ln button to find base e natural logarithms loge and a log button to find normal base 10 log10 logarithms

It is very easy to convert from one logarithm to another by just multiplying by a constant:

     log2(x) = 3.3219 log10(x) = 3.3219 log(x)

     ln(x) = loge(x) = 2.3025 log10(x) = 2.3025 log(x)

Try using the calculator to validate that:

     log(100) = 2

     log2(100) = 2 * 3.3219 = 6.6438
     (hint try using the the xy button to do a 26.6438)

     ln(100) = 4.6052

Notice that the log of numbers less than 1.0, like probabilities such as 0.5 are always negative. For example log10(0.5) = -0.3010 and log2(0.5) = 3.3219 * -0.3010 = -1.0 as we saw above. Because the log of a number that is less than 1.0 is always negative, Shannon had to use minus signs in his equation to have it yield an H(p) Information/Entropy value that was positive.

The main point of confusion arises because in communications theory the concepts of information and entropy pertain to encoding and transmitting information, while in IT and many other disciplines, like biology, we are more interested in the amounts of useful and useless information in a message. For example, in communications theory, the code for a buggy 300,000 byte program contains just as much information as a totally bug-free 300,000 byte version of the same program and would take just as much bandwidth and network resources to transmit accurately over a noisy channel as transmitting the bug-free version of the program. Similarly, in communications theory a poker hand consisting of four Aces and a 2 of clubs contains just as much information and is just as “valuable” as any other 5-card poker hand because the odds of being dealt any particular card is 1/52 for all the cards in a deck, and therefore, all messages consisting of 5 cards contain exactly the same amount of information. Similarly, all genes that code for a protein consisting of 400 amino acids all contain exactly the same amount of information, no matter what those proteins might be capable of doing. However, in both biology and IT we know that just one incorrect amino acid in a protein or one incorrect character in a line of code can have disastrous effects, so in those disciplines the quantity of useful information is much more important than the number of bits of data to be transmitted accurately over a communications channel.

Of course, the concepts of useful and useless information lie in the eye of the beholder to some extent. Brillouin’s formula attempts to quantify this difference, but his formula relies upon Boltzmann’s equation for entropy, and Boltzmann’s equation has always had the problem of how do you define a macrostate? There really is no absolute way of defining one. For example, suppose I invented a new version of poker in which I defined the highest ranking hand to be an Ace of spades, 2 of clubs, 7 of hearts, 10 of diamonds and an 8 of spades. The odds of being dealt such a hand are 1 in 2,598,964 because there are 2,598,964 possible poker hands, and using Boltzmann’s equation that hand would have a very low entropy of exactly 0.0 because N = 1 and ln(1) = 0.0. Necessarily, the definition of a macrostate has to be rather arbitrary and tailored to the problem at hand. But in both biology and IT we can easily differentiate between macrostates that work as opposed to macrostates that do not work, like comparing a faulty protein or a buggy program with a functional protein or program.

An IT Application of These Concepts
Recently at my place of employment, we lost the system disk for one of our nodes in a WebSphere Cell consisting of 6 servers or nodes. When the problem was first detected by UnixOps, I was paged out to bring down WebSphere on the server so that UnixOps could work with IBM that night to install a new disk for the server. When the server finally came back up, I restarted WebSphere on the server as usual, and had some programmers validate the applications in the Cell with test transactions. This is always a problematic effort because only 1/6th of the validation transactions actually hit the affected server, while the other 5/6ths of the validation transactions hit the other healthy servers, so this form of validation is not very effective, and in this case did not uncover a problem we had with the new disk. As usual, UnixOps had to restore the system disk using backup tapes, and unfortunately, an old obsolete tnsnames.ora file was installed on the disk by mistake. The tnsnames.ora file is an Oracle configuration file that defines database addresses for establishing connections to Oracle databases, and because we now had an old obsolete file on the server, certain WebSphere datasources were not working properly, and were causing some intermittent errors for some of our website end-users. When the problem was finally detected, we paged out Oracle DBA to take a look, and working with UnixOps, they pulled a more recent version of the tnsnames.ora file from the backup tapes. MidOps upper management was a bit perturbed that the wrong tnsnames.ora file had been restored to the server, so I was given the task of comparing the tnsnames.ora file on the affected server with the other tnsnames.ora files on the other 5 servers in the Cell. I pulled the tnsnames.ora file from each server, and using the Unix diff command, I compared all of the files for differences. I then found that there were actually two different versions of the tnsnames.ora file on the servers in the Cell. To make matters worse, I actually found four different versions of the tnsnames.ora file on the twin WebSphere Cell that we use to balance traffic with, for a total of 6 different versions of the tnsnames.ora file on 12 servers. IT Management was very surprised to find so many different versions of the tnsnames.ora file floating around on the WebSphere Cell servers because it is such a critical configuration file, and requested that a single “correct” tnsnames.ora file be distributed to all of the servers.

Being a softwarephysicist, I immediately began to cringe at the thought of trying to unravel all of the spaghetti to obtain a single “correct” composite tnsnames.ora file to be used by all of the servers. Softwarephysics told me that embarking upon such an endeavor was a very dangerous thing indeed. From the above analysis, we see that all 6 versions of the tnsnames.ora file are members of the same macrostate of being a functional file with zero errors for their particular server, and consequently, all contained the same amount of useful information. Coming up with a single composite tnsnames.ora file and installing it on all 12 servers would constitute a change, and according to the second law of thermodynamics, the total amount of entropy in the Universe must increase and the total amount of useful information must decrease when such a change is made. The trick to pulling off such a change, without causing an outage, is to dump the required increase of entropy and diminished amount of useful information into waste heat. But that is a very difficult thing to do indeed, and the odds are that something will be overlooked in coming up with a composite tnsnames.ora file that works for all 12 servers because such a tnsnames.ora file would have far fewer constituent microstates, like a royal flush in poker, and would thus have a much lower entropy and contain much more useful information than the 6 versions of the file that we currently have floating around in production. Now having a composite tnsnames.ora file that works for all 12 servers would be the ideal situation, and it would indeed contain more useful information than the other tnsnames.ora files because, not only would it work on all 12 servers, but if the file were lost on one server, we could simply copy the file from one of the other servers, rather than trying to pull the file from a backup tape. But as we have seen, trying to produce such a composite tnsnames.ora file would be swimming upstream against Nature’s desire to constantly increase the amount of entropy and disorder in the Universe. That is why in IT we generally follow the prescription of “If it ain’t broke, don’t fix it”.

Whenever I find myself in such awkward situations, like having 6 versions of the same file, I always lament IT’s reluctance to heed the tenets of softwarephysics. Softwarephysics maintains that the only way to prevent such high-entropy messy situations from occurring is to rigorously follow standards at all times and for all things, as the biosphere does, and not to cave into the expediency of the moment to rush nonstandard software into production in order to hit a deadline. Oh what a tangled web you weave, when first you practice to NOT FOLLOW STANDARDS!

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Wednesday, July 24, 2013

Enablement – the Definitive Characteristic of Living Things

In my last posting A Proposal for an Odd Collaboration to Explore the Origin of Life with IT Professionals, I proposed that it might be worthwhile for researchers working on the origin of life or astrobiology to add some IT professionals to their research teams, and I also offered to become a junior member of such an odd collaboration. I am glad to report that since then I have had some very positive feedback on my proposal. Among others, I have had some very interesting email exchanges with Professor Stuart Kauffman, and also a very interesting phone conversation with him as well. In The Origin of Software the Origin of Life and A Brief History of Self-Replicating Information, we first showcased some of Stuart Kauffman’s many contributions to scientific thought, especially his concept of the apparent emergence of self-organized order in nonlinear systems far from thermodynamic equilibrium that he calls “order for free”, and also of his work with Boolean networks that provide a model for the autocatalytic networks of organic molecules that most likely kicked off the evolution of life on Earth. In this posting, I would like to explore another one of Stuart Kauffman’s very exciting new concepts called Enablement, by discussing some of the ideas in the paper and video presentation below, with the intent of proposing that Enablement may be the long sought for definitive characteristic of living things. Please see:

No entailing laws, but enablement in the evolution of the biosphere
Giuseppe Longo, Maël Montévil, Stuart Kauffman (Jan 11, 2012)

Stuart Kauffman also gave a very interesting talk at MIT on October 19, 2011, a few months before the above paper was published, covering the essentials of the paper, and which can be viewed at:

http://vimeo.com/30875984

Stuart Kauffman has been a long-time advocate of moving beyond reductionism in order to understand complex systems. In the above paper with Giuseppe Longo and Maël Montévil, the authors formalize a new concept that Kauffman calls Enablement. Imagine a pool table with all the balls racked up in the center. Next, we shoot a cue ball into the closely packed pool balls and observe what happens. In order to keep track of what we are observing, we establish some arbitrary state variables that seem to do a good job of keeping track of what is going on. In this example, we might choose the X and Y position of each ball and also its velocity components Vx and Vy relative to the pool table as arbitrary state variables to keep track of the system of pool balls. These state variables would probably do a better job of keeping track of what is going on than the colors of the pool balls, which most likely will not change much during the interactions of the pool balls. Physicists would say that the changing values of the state variables of X, Y, Vx and Vy with time constitute a phase space, and that we could calculate the trajectory of each ball in that phase space using a deterministic theory of motion, like Newton’s laws of motion or even quantum mechanics if we had to because the evolution of wavefunctions over time is deterministic. All we would need to do would be to find the initial conditions of the pool balls and the boundary conditions defined by the size and shape of the pool table, and then simply integrate the differential equations that define the deterministic theory of motion that we use for the calculation, using the initial and boundary conditions to define the unique analytic solution for the problem at hand. Now according to Laplace, given the initial conditions of the cue ball’s original position and velocity, plus a deterministic theory of motion like Newton’s laws, and the boundary conditions defined by the size and shape of the pool table, we could exactly predict how the system of balls would evolve over time because we could predict the trajectory of each ball in the phase space we defined. Kauffman calls this capability of predicting trajectories in phase space “entailment” because we can “entail” the future course of each pool ball in the system of pool balls. Laplace was so impressed by Newtonian thought that he famously proclaimed that entailment could be extended to the entire Universe. Given Newtonian mechanics and knowing the positions of all the atoms in the Universe and their motions in 3D space at one particular time, would allow one to predict all future events in the Universe. However, with the rise of deterministic chaos theory in the last century, we realized that Laplace was not entirely correct in his analysis because nearly all systems in the Universe are nonlinear, and we also discovered that nonlinear systems are very sensitive to initial conditions, so although we could theoretically predict the trajectory of each pool ball in phase space, in practice we cannot do so because we cannot determine the initial conditions of the system with infinite precision and accuracy. See Software Chaos for more details on deterministic chaos.

The concept of Enablement goes one step further. Kauffman contends that what makes complex systems like the biosphere different from others is that the biosphere can change the boundary conditions of its phase space as it proceeds along a particular trajectory in phase space by means of taking advantage of Darwinian preadaptations. Personally, I prefer to use Stephen Gould’s term of exaptation, rather than Darwin’s preadaptation, because it removes the vague inference of a teleological intent found in the term preadaptation. Kauffman points out that the first difficulty we would have in extending our pool table analysis to the biosphere would be in choosing the proper state variables to be used to keep track of the biosphere because we do not know in advance where the biosphere may be heading on its own. For example, suppose in our pool table example we only used unnumbered pool balls, and relied upon their colors to keep track of their individual positions and movements. If we had then placed a green filter over our pool table lights, suddenly, the green number 6 ball on the green felt would have disappeared in the green light, and would have become invisible, like a neutrino back in the 1920s. In this case, color would indeed have become an important state variable to keep track of, but as Wolfgang Pauli did in 1930 with neutrinos, we could still deduce the existence of the green number 6 ball by means of the conservation of energy, momentum, and angular momentum of the other balls. Now from a purely Darwinian perspective, the seemingly proper state variables to use for the biosphere should be measures of utility that provide a survival advantage to the possessor, but specifically defining such state variables in advance would be very difficult indeed. Instead, let us just imagine an infinite and unbounded phase space of utility, consisting of all possible uses of all possible things. For Kauffman, the concept of Enablement means that as organisms evolve along a particular trajectory in utility phase space, they may develop a function for one purpose that can be easily exapted for another purpose, like the trajectory of lungfish in the deep past that led to the development of swim bladders by exapting the functions of their primitive lungs into the function of a primitive swim bladder, and consequently, led to the development of swim bladders that could be used for neutral buoyancy locomotion. Paleontologists currently believe that swim bladders evolved when lungfish, scurrying about from pond to pond, took in water into their primitive lungs and survived. Suddenly, their primitive lungs took on a new function, that of neutral buoyancy locomotion, that had nothing to do with the original purpose of lungs to take in oxygen, but which could be put to good use for low-energy locomotion, and provided a survival advantage to its possessors and to their progeny. With the concept of Enablement, Kauffman maintains that the potential phase space of utility of the biosphere at any given time is essentially infinite and unbounded because with each new added function the biosphere comes up with, a new “AdjacentPossible” in its available portion of the infinite and unbounded utility phase space also becomes available. For example, in Kauffman’s MIT presentation he uses the analogy of the near infinite uses of a screwdriver. The screwdriver is also one of my own favorite analogies for explaining the Darwinian concept of preadaptation or exaptation. In Self-Replicating Information and When Toasters Fly I used a similar analogy for screwdrivers:

What happens is that organisms develop a primitive function for one purpose, through small incremental changes, and then discover, through serendipity, that this new function can also be used for something completely different. This new use will then further evolve via inheritance, innovation and natural selection. For example, we have all upon occasion used a screwdriver as a wood chisel in a pinch. Sure the screwdriver was meant to turn screws, but it does a much better job at chipping out wood than your fingernails, so in a pinch it will do quite nicely. Now just imagine Darwin’s processes of inheritance, innovation and natural selection at work selecting for screwdrivers with broader and sharper blades and a butt more suitable for the blows from a hammer, and soon you will find yourself with a good wood chisel. At some distant point in the future, screwdrivers might even disappear for the want of screws, leaving all to wonder how the superbly adapted wood chisels came to be.

As an IT professional, you probably do this all the time. How often do you write code from scratch? I know that I never do. I simply find the closest piece of existing code that I have on hand and then turn the screwdriver into a wood chisel through small incremental changes to the code, by testing each small change to see how closely my screwdriver has evolved towards being a wood chisel. And I think that most of us also code using this Darwinian process of inheritance, innovation and natural selection too. I am a rather lazy programmer, so many times rather than thinking through a new chunk of code during the iterative process of coding and testing, I will simply make an “educated guess” at the new code to be introduced. After 35 years of coding, you begin to code by “ear”. Many times, I can fall upon the correct code after a few shots of directed random change, and that sure beats racking your brain over new code. Surprisingly, sometimes I even come up with “better” code through this Darwinian process than if I sat down and carefully thought it all through. This has probably been going on since 1945 when Konrad Zuse wrote the first “Guten Tag Welt!” program in Plankalkuel – just speculating here on the origin of the compulsion for all programmers, new to a computer language, to write the obligatory “Hello World!” program as their first effort. So the basic idea of grabbing some old code or architectural design elements from a couple of older Applications and slowly modifying them through an iterative process of inheritance, innovation and natural selection into a new Application is no stranger to IT. As Simon Conway Morris commented in Life’s Solution (2003) "How much of a complex organism, say a humanoid, has evolved at a much earlier stage, especially in terms of molecular architecture? In other words, how much of us is inherent in a single-celled eukaryote, or even a bacterium? Conversely, we are patently more than microbes, so how many genuinely evolutionary novelties can we identify that make us what we are? It has long been recognized that evolution is a past master at co-option and jury-rigging: redeploying existing structures and cobbling them together in sometimes quite surprising ways. Indeed, in many ways that is evolution”. When I first read these words, I accidentally misread the quote as "Indeed, in many ways that is IT”.


In his MIT presentation, Kauffman goes on to describe how the opportunistic and serendipitous use of preadaptations or the exploitation of what he calls the AdjacentPossible in utility phase space changes the very boundary conditions of the utility phase space available to the biosphere at any given time. And once an evolutionary trajectory has been deflected by such an enabling event, like the development of swim bladders, it opens a whole new ecological niche within the biosphere in a manner of "radical emergence" because now parasites can take up residency in the swim bladders that never before existed. It is as if the size and shape of our pool table were constantly changing due to the trajectories of the pool balls themselves. The key insight here is that the biosphere has changed the boundary conditions of its utility phase space over time by means of exapting already existing functions into new and unpredictable uses. Certainly, the crossing over of "dead" molecules to "live" molecules in the deep past must have been one of those dramatic Enabling events of radical emergence. To Kauffman, such diverted paths through utility phase space “enable” the course of evolution of the biosphere through utility phase space, rather than “entail” the course of evolution through utility phase space in a Newtonian sense. Because we cannot predefine the ever-changing boundary conditions of the utility phase space, or even the state variables to use to create a utility phase space in the first place, the old concepts of integrating differential equations subject to initial conditions and boundary conditions that has served science so well in the past cannot be used to predict the evolution of complex interacting things like the biosphere. Kauffman then goes on to explain that the concept of Enablement also extends to the econosphere and to the evolution of technology over time as well. Kauffman explains to the students at MIT that it is much easier to invent new technologies today than it was 50,000 years ago because the utility phase space of technology has exploded over the past 50,000 years, and consequently has a much larger AdjacentPossible than it did in the past. And with each new innovation that one of the MIT graduates might come up with in the future, the utility phase space of technology will again increase in size in new and unexpected ways, with the technologies they invent being used in unanticipated manners.

We have certainly seen Enablement in action in the evolution of software over the past 70 years, and also in the evolution of the memes as well. After all, both the econosphere and the technosphere are meme-complexes, as are all of the other cultural artifacts and activities of mankind. See SoftwareBiology, When Toasters Fly and How to Use Softwarephysics to Revive Memetics in Academia for more details on how Enablement has combined with the Darwinian mechanisms of inheritance, innovation and natural selection to shape the evolution of the genes, memes, and software.

With this background in hand, I would now like to propose that the concept of Enablement may be the defining characteristic of “living things”. I believe this leads to a unique and unambiguous definition of life for the very first time:

Life – A form of self-replicating information with the ability to change the boundary conditions of its utility phase space in unpredictable ways by means of exaptation.

For many years I have been advocating the idea that there are now three Replicators on the planet in a Dawkinsian sense – genes, memes, and software, but I have always struggled with what makes these Replicators different from other forms of replicating information, like a quartz crystal forming in a melt. I am now confident that the defining characteristic of “life” is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its utility phase space. So I would like to add this defining characteristic to my previous seven:

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

The Characteristics of Self-Replicating Information
All forms of self-replicating information have some common characteristics.

1. All self-replicating information evolves over time through the Darwinian processes of inheritance, innovation and natural selection, which endows self-replicating information with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity.

2. All self-replicating information begins spontaneously as a parasitic mutation that obtains energy, information and sometimes matter from a host.

3. With time, the parasitic self-replicating information takes on a symbiotic relationship with its host.

4. Eventually, the self-replicating information becomes one with its host through the symbiotic integration of the host and the self-replicating information.

5. Ultimately, the self-replicating information replaces its host as the dominant form of self-replicating information.

6. Most hosts are also forms of self-replicating information.

7. All self-replicating information has to be a little bit nasty in order to survive.

8. The defining characteristic of self-replicating information is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space.

As human beings, we have a tendency to overdo the labeling of things, as if the very act of us naming something gives it some semblance of reality, and I think this has been one of our problems in defining what life is in an unambiguous way. For the purposes of softwarephysics, I have always maintained that organic life, memes, and software are all forms of a higher, more encompassing whole, that I call Self-Replicating Information. Richard Dawkins and Susan Blackmore call these entities Replicators, and indeed Susan Blackmore has also proposed that there are now three Replicators on the planet. For Susan Blackmore, technical memes or “temes” are the third type of Replicator, as she outlined in her TED presentation at:

Memes and "temes"
https://www.ted.com/talks/susan_blackmore_on_memes_and_temes

Similarly, I have deemed software as the third Replicator because of the impact that the rise of software has had on the planet over the past 70 years, and because there may soon come a day when software will break free and begin to write itself. But in all cases, I think that the defining characteristic of the Replicators is their ability to change the boundary conditions of their utility phase space in new and unpredictable ways by means of exaptation, and that is why the genes, memes and software all seem to have a life of their own, independent of human intention.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Sunday, May 26, 2013

A Proposal for an Odd Collaboration to Explore the Origin of Life with IT Professionals

Currently, there are a number of diverse collaborations throughout the world exploring the origin of life on Earth and elsewhere. These teams are normally composed of members from the hard sciences, such as astronomers, astrophysicists, biologists, biochemists, chemists, geologists, geochemists, geophysicists, and physicists. I would like to propose that these diverse teams also take on a few additional participants from the Computer Science departments of their home universities and also a number of IT professionals from the IT departments of several major corporations throughout the world. The purpose of this strange collaboration would be to use the origin and evolution of commercial software over the past 70 years as a model for the origin and evolution of life on Earth and elsewhere. My hope would be that amongst such a diverse team something would click – somebody from outside IT might see something in what the IT community has painstakingly built over the past 70 years that rings a bell in their domain of experience. I think this effort could be conducted with very little cost using remote WebEx conferences, instant messaging, and email over the Internet. I would be very much interested in participating in such a collaboration as a junior member, with the intention of recruiting additional IT professionals from the pool of computer science graduates from the participating universities who happen to have some interest in bioinformatics or biologically inspired computing, and who have also moved into the world of corporate IT. Let me explain.

Currently, I am in the Middleware Operations group of the IT department of a major US corporation and support all of its externally facing websites and also all of the internal applications used to run the business. I graduated from the University of Illinois in 1973 with a B.S. in Physics and from the University of Wisconsin in 1975 with an M.S. in Geophysics, and from 1975 – 1979 I was an exploration geophysicist exploring for oil, first with Shell, and then with Amoco. I started programming in 1972, and in 1979 I decided to make a career change and become an IT professional in Amoco’s IT department. When I first transitioned into IT from geophysics, I figured if you could apply physics to geology; why not apply physics to software? So like the exploration team at Amoco that I had just left, consisting of geologists, geophysicists, paleontologists, geochemists, and petrophysicists, I decided to take all the physics, chemistry, biology, and geology that I could muster and throw it at the problem of software. The basic idea was that many concepts in physics, chemistry, biology, and geology suggested to me that the IT community had accidentally created a pretty decent computer simulation of the physical Universe on a grand scale, a Software Universe so to speak, and that I could use this fantastic simulation in reverse, to better understand the behavior of commercial software by comparing software to how things behaved in the physical Universe. Softwarephysics depicts software as a virtual substance and relies upon our understanding of the current theories in physics, chemistry, biology, and geology to help us model the nature of software behavior. So in physics, we use software to simulate the behavior of the Universe, while in softwarephysics we use the Universe to simulate the behavior of software.

I will soon be turning 62 years old and heading into the homestretch, so a few years back I started this blog on softwarephysics to share what I had discovered over the years with the rest of the IT community. My initial intention for my blog on softwarephysics was to help the IT community to better cope with the daily mayhem of life in IT. However, in laying down the postings for this blog an unintended consequence arose in my mind as I became profoundly aware of the enormity of this vast computer simulation of the physical Universe that the IT community had so graciously provided to the scientific community free of charge, and also of the very significant potential scientific value that it provided. One of the nagging problems for many of the observational and experimental sciences is that many times there is only one example readily at hand to study or experiment with, and it is very difficult to do meaningful statistics with a population of N=1. But the computer simulation of the physical Universe that the Software Universe presents provides another realm for comparison. For example, both biology and astrobiology only have one biosphere on Earth to study and even physics itself has only one Universe with which to engage. Imagine the possibilities if scientists had another Universe readily at hand in which to work! This is exactly what the Software Universe provides.

Currently, there are many researchers working on the origin of life on Earth and elsewhere, but the problem is that on Earth we are missing the rocks from the very first billion years of the Earth’s history when life first arose, so workers in the field are left to draw historical inferences in deep time based upon the modern metabolic pathways, RNA, and DNA we still have at hand today, and also upon biochemical simulations in the lab that are based upon those inferences. And even if we do find life on other planets, we will most likely be faced with the same challenge of not being able to figure out how it all happened.

My suggestion would be that everybody is looking just a couple of levels too low in the hierarchy of self-replicating information. Living things are just one form of self-replicating information, and all forms of self-replicating information have many characteristics in common as they battle the second law of thermodynamics in a nonlinear Universe. Currently, there are three forms of self-replicating information on the Earth – the genes, memes, and software, with software rapidly becoming the dominant form of self-replicating information on the planet. However, of the three, the only form of self-replicating information that we have a good history of is software, going all the way back to May of 1941 when Konrad Zuse cranked up his Z3 computer for the very first time. So the best model for the origin of life might be obtained by studying the hodge-podge of precursors, false starts, and failed attempts that led to the origin and early evolution of software, with particular attention paid to the parasitic/symbiotic relationships that allowed software to bootstrap itself into existence.

Yes, there are many other examples of universal Darwinism at work in the Universe, such as the evolution of languages or political movements, but I think that the origin and evolution of software provides a unique example because both programmers and living things are faced with nearly identical problems. A programmer must assemble a huge number of characters into complex patterns of source code to instruct a computer to perform useful operations. Similarly, living things must assemble an even larger number of atoms into complex molecules in order to perform the functions of life. And because the Universe is largely nonlinear in nature, meaning that small changes to initial conditions will most likely result in dramatic, and many times, lethal outcomes for both software and living things, the evolutionary history of living things on Earth and of software have both converged upon very similar solutions to overcome the effects of the second law of thermodynamics in a nonlinear Universe. For example, both living things and software went through a very lengthy prokaryotic architectural period, with little internal structure, to be followed by a eukaryotic architectural period with a great deal of internal structure, which later laid the foundations for forms with a complex multicellular architecture. And both also experienced a dramatic Cambrian explosion in which large multicellular systems arose consisting of huge numbers of somatic cells that relied upon the services of large numbers of cells to be found within a number of discrete organs.

Also, software presents a much clearer distinction between the genotype and phenotype of a system than do other complex systems, like languages or other technologies that also undergo evolutionary processes. The genotype of software is determined by the source code files of programs, while the phenotype of software is expressed by the compiled executable files that run upon a computer and that are generated from the source code files by a transcription process similar to the way genes are transcribed into proteins. Also, like a DNA or RNA sequence, source code provides a very tangible form of self-replicating information that can be studied over historical time without ambiguity. Source code is also not unique, in that many different programs, and even programs written in different languages can produce executable files with identical phenotypes or behaviors.

Currently, many researchers working on the origin of life and astrobiology are trying to produce computer simulations to help investigate how life could have originated and evolved at its earliest stages. But trying to incorporate all of the relevant elements into a computer simulation is proving to be a very daunting task indeed. Why not simply take advantage of the naturally occurring $10 trillion computer simulation that the IT community has already patiently evolved over the past 70 years and has already run for 2.2 billion seconds? It has been hiding there in plain sight the whole time for anybody with a little bit of daring and flair to explore.

Some might argue that this is an absurd proposal because software currently is a product of the human mind, while biological life is not a product of intelligent design. Granted, biological life is not a product of intelligent design, but neither is the human mind. The human mind and biological life are both the result of natural processes at work over very long periods of time. This objection simply stems from the fact that we are all still, for the most part, self-deluded Cartesian dualists at heart, with seemingly a little “me” running around within our heads that just happens to have the ability to write software and to do other challenging things. But since the human mind is a product of natural processes in action, so is the software that it produces. See:

The Ghost in the Machine the Grand Illusion of Consciousness

Still, I realize that there might be some hesitation to participate in this collaboration because it might be construed by some as an advocacy of intelligent design, but that is hardly the case. The evolution of software over the past 70 years has essentially been a matter of Darwinian inheritance, innovation and natural selection converging upon similar solutions to that of biological life. For example, it took the IT community about 60 years of trial and error to finally stumble upon an architecture similar to that of complex multicellular life that we call SOA – Service Oriented Architecture. The IT community could have easily discovered SOA back in the 1960s if it had adopted a biological approach to software and intelligently designed software architecture to match that of the biosphere. Instead, the worldwide IT architecture we see today essentially evolved on its own because nobody really sat back and designed this very complex worldwide software architecture; it just sort of evolved on its own through small incremental changes brought on by many millions of independently acting programmers through a process of trial and error. When programmers write code, they always take some old existing code first and then modify it slightly by making a few changes. Then they add a few additional new lines of code and test the modified code to see how far they have come. Usually, the code does not work on the first attempt because of the second law of thermodynamics, so they then try to fix the code and try again. This happens over and over until the programmer finally has a good snippet of new code. Thus, new code comes into existence through the Darwinian mechanisms of inheritance coupled with innovation and natural selection. Some might object that this coding process of software is actually a form of intelligent design, but that is not the case. It is important to differentiate between intelligent selection and intelligent design. In softwarephysics we extend the concept of natural selection to include all selection processes that are not supernatural in nature, so for me, intelligent selection is just another form of natural selection. This is really nothing new. Predators and prey constantly make “intelligent” decisions about what to pursue and what to evade, even if those “intelligent” decisions are only made with the benefit of a few interconnected neurons or molecules. So in this view, the selection decisions that a programmer makes after each iteration of working on some new code really are a form of natural selection. After all, programmers are just DNA survival machines with minds infected with memes for writing software, and the selection processes that the human mind undergo while writing software are just as natural as the Sun drying out worms on a sidewalk or a cheetah deciding upon which gazelle in a herd to pursue.

For example, when IT professionals slowly evolved our current $10 trillion worldwide IT architecture over the past 2.2 billion seconds, they certainly did not do so with the teleological intent of creating a simulation of the evolution of the biosphere. Instead, like most organisms in the biosphere, these IT professionals were simply trying to survive just one more day in the frantic world of corporate IT. It is hard to convey the daily mayhem and turmoil of corporate IT to outsiders. In 1979, I had been working continuously on geophysical models and simulations in Fortran and Basic for my thesis and for oil companies, ever since taking CS 101 at the University of Illinois back in 1972, but when I made a career change from being an exploration geophysicist at Amoco to become a systems analyst in Amoco’s IT department in 1979, I was in complete shock. When I first hit the floor of Amoco’s IT department on one very scary Monday morning, I suddenly found myself surrounded by countless teams of IT professionals, all running around like the Mad Hatter in Alice in Wonderland. After a couple of terrorizing weeks on this new job, it seemed to me like I was trapped in a frantic computer simulation, like the ones that I had been programming for the past seven years, hopelessly buried in punch card decks and fan-fold listings. But I quickly realized that all IT jobs essentially boiled down to simply pushing buttons. All you had to do was to push the right buttons, in the right sequence, at the right time, and with zero errors. How hard could that be? Well, it turned out to be very difficult indeed, and in response I began to subconsciously work on softwarephysics to try to figure out why this job was so hard, and how I could dig myself out of the mess that I had gotten myself into. After a while, it dawned on me that the fundamental problem was the second law of thermodynamics operating in a nonlinear simulated universe. The second law made it very difficult to push the right buttons in the right sequence and at the right time because there were so many erroneous combinations of button pushes. Writing and maintaining software was like looking for a needle in a huge utility phase space. There just were nearly an infinite number of ways of pushing the buttons “wrong”. The other problem was that we were working in a very nonlinear utility phase space, meaning that pushing just one button incorrectly usually brought everything crashing down. Next, I slowly began to think of pushing the correct buttons in the correct sequence as stringing together the correct atoms into the correct sequence to make molecules in chemical reactions that could do things. I also knew that living things were really great at doing that. Living things apparently overcame the second law of thermodynamics by dumping entropy into heat as they built low entropy complex molecules from high entropy simple molecules and atoms. I then began to think of each line of code that I wrote as a step in a biochemical pathway. The variables were like organic molecules composed of characters or “atoms” and the operators were like chemical reactions between the molecules in the line of code. The logic in several lines of code was the same thing as the logic found in several steps of a biochemical pathway, and a complete function was the equivalent of a full-fledged biochemical pathway in itself. But one nagging question remained - how could I take advantage of these similarities to save myself? That’s a long story, but in 1985 I started working on BSDE– the Bionic Systems Development Environment, which was used at Amoco to “grow” software biologically from an “embryo” by having programmers turn on and off a set of “genes”. The second half of my original softwarephysics posting provides more information on BSDE:

SoftwarePhysics

To have some fun with softwarephysics, and to see how it might help with exploring the origin of life, please take a look at the postings down below. To get to any other posting, just use the Blog Archive links in the upper right-hand corner of each posting.

A Brief History of Self-Replicating Information

The Driving Forces of Software Evolution

A Proposal For All Practicing Paleontologists

How Software Evolves

Self-Replicating Information

SoftwareBiology
Skip down to the section on SoftwarePaleontology

The Origin of Software the Origin of Life

Programming Clay

Using the Evolution of Software as a Model for Astrobiologists

An IT Perspective of the Cambrian Explosion

Using the Origin of Software as a Model for the Origin of Life

An IT Perspective on the Origin of Chromatin, Chromosomes and Cancer

Software Embryogenesis

Introduction to Softwarephysics

Some Pertinent Observations Already Gleaned From the History of IT
1. It’s all about self-replicating information.
Living things are just one form of self-replicating information that happen to carry along their own hardware with them. The concept of living things is really just an artificial human classification, and that is why it is nearly impossible for us to define. Perhaps, in some sense, the vitalists had it right all along; the mysterious vital force they sought was simply the driving force of self-replicating information seeking to survive. So the definition of life might really just hinge upon an arbitrary level of information density. A quartz crystal forming in a melt, or an ice crystal forming from many mobile water molecules plugging into a lattice, are also primitive forms of self-replicating information, just with a lower information density and a higher entropy than we are accustomed to finding in living things, but it really is just a matter of degree. So do not think in terms of investigating the origin of life, rather, think in terms of investigating the early history of self-replicating information on the Earth. After all, in the end, everything is just made of dead atoms. See:

The Demon of Software

2. Self-replicating information is very opportunistic.
Self-replicating information is very opportunistic and will exapt whatever hardware happens to be available at the time, and will certainly jump ship if something better comes along. For example, software started out on electrical relays and then proceeded to vacuum tubes, discrete transistors, integrated circuits with thousands of transistors, integrated circuits with millions of transistors, integrated circuits with billions of transistors, and will probably jump ship again to optical chips in the next decade or so. Now 4 billion years ago, the only available hardware on the newly formed Earth was organic monomers brought to the Earth by comets and asteroids or generated here by abiotic processes, and rock-forming minerals. All rocks in the Earth’s crust are formed from rock-forming silicate minerals that are composed of polymerized silica tetrahedrons. The silica tetrahedrons are made from one atom of silicon and four atoms of oxygen and are very similar in structure to methane. The silica tetrahedrons have a net charge of -4 so they polymerize into single chains, double chains, sheets, or 3-dimensional frameworks with positive cations of Ca++, Na+, K+, Fe++, or Mg++ interspersed in the crystalline lattice to neutralize the negative charge. So in a sense, they are the silicon equivalent of organic molecules. The rock-forming minerals are essentially the structural proteins that hold rocks together, and their 3-dimensional structures are key to rock properties, just as the structure of alpha chains and beta sheets are key to the properties of proteins. The minerals in rocks are usually formed at very high temperatures and usually under very high pressures too, so they are very much out of thermodynamic equilibrium at the Earth’s surface or at the bottom of the sea. Granted, there may be hundreds of kinds of rocks, but no matter the rock, they all eventually chemically weather down into mud and sand. The mud is formed from sheet-like polymers of silica tetrahedrons called clay minerals, and the sand comes from quartz grains which are made from very tough frameworks of pure silica tetrahedrons. It’s the H+ ions, mainly from carbonic and other acids, that break down the minerals by working themselves into the crystalline lattices to replace the positive cations. Since the silicates are very much like organic molecules, the odds are that our very distant ancestors were some kind of hybrid of the two. Also, life probably first arose several hundred meters below the Earth’s surface in the pore fluids circulating through rocks near the first spreading centers when plate tectonics was first initiated. These environments would have been safe from the late heavy bombardment 3.8 – 4.1 billion years ago. So make friends with a good geochemist. See:

Programming Clay

3. Self-replicating information easily forms parasitic/symbiotic relationships.
Lynn Margulis’s endosymbiotic theory seems to be universal for all forms of self-replicating information. The genes, memes, and software on Earth are all currently deeply intertwined in very complex parasitic/symbiotic relationships amongst them all. The Earth could certainly not support a population of 7 billion people without them all competing and also working together. Similarly, I vividly remember the early 1990s, when it was predicted that LANs composed of “high-speed” Intel 386 PCs running at a whopping 33 MHz would make IBM’s mainframes obsolete, and indeed, IBM nearly did go bankrupt in those days. However, today we find that IBM mainframes running z/OS, Unix servers, client PCs running Windows or Mac, and smartphones have all formed a heavily interdependent hybridized parasitic/symbiotic relationship, and so has the software running upon them.

4. Convergence plays a major role in the evolution of self-replicating information.
As Daniel Dennett put it, there are only a certain number of “Good Tricks” for living things to discover, and the same goes for all forms of self-replicating information. Over the past 70 years, software architecture has very closely recapitulated the same path through Design Space that living things did billions of years ago on Earth. Software went through a lengthy period of prokaryotic organization, which was followed by a period of eukaryotic organization, which finally led to the foundations of multicellular organization. And over the past decade, software has seen a Cambrian explosion, in which large numbers of somatic objects use the services of large numbers of service objects in service organs. See the SoftwarePaleontology section of:

SoftwareBiology

Similarly, in

Crocheting Software

we see that crochet and knitting patterns are precursors to computer software that evolved in parallel branches to software, but were never really in the same line of descent. I imagine on the early Earth there were similar branches of self-replicating information that were never our distant ancestors, but might have been unrelated competitors at the time. They might even still exist today in some isolated environments.

5. Beware of the memes lurking within your mind.
Meme-complexes are very conservative in nature and have a very great reluctance to adopting new memes from outside that might threaten the very existence of the entire meme-complex. This is especially true of scientific meme-complexes, and rightly so. Scientific meme-complexes must always be on guard to prevent the latest crackpot idea from taking hold. But if you look to the history of science, the downside to all this is that nearly all of the great scientific breakthroughs were delayed by 10 – 50 years, patiently awaiting acceptance by a scientific meme-complex. That is why Thomas Kuhn found that scientific meme-complexes were so very reluctant to adopt paradigm shifts. The rigidity of scientific meme-complexes today holds back scientific progress because it prevents people from working together on difficult problems, like the origin of life.

Some very good books on the subject are:

The Meme Machine (1999) by Susan Blackmore,

Virus of the Mind: The New Science of the Meme (1996) by Richard Brodie

Also, see:

How to Use Softwarephysics to Revive Memetics in Academia

6. Adopt a positivistic approach using effective theories.
Softwarephysics adopts a very positivistic view of software in that we do not care about what software “really” is; we only care about how software is observed to behave, and we only attempt to model this behavior with a set of effective theories. Positivism is an enhanced form of empiricism, in which we do not care about how things “really” are; we are only interested in how things are observed to behave. With positivism, physicists only seek out models of reality - not reality itself. Effective theories are an extension of positivism. An effective theory is an approximation of reality that only holds true over a certain restricted range of conditions and only provides for a certain depth of understanding of the problem at hand. For example, Newtonian mechanics works very well for objects moving in weak gravitational fields at less than 10% of the speed of light and which are larger than a very small mote of dust. For things moving at high velocities or in strong gravitational fields we must use relativity theory, and for very small things like atoms, we must use quantum mechanics. All of the current theories of physics, such as Newtonian mechanics, classical electrodynamics, thermodynamics, statistical mechanics, the special and general theories of relativity, quantum mechanics, and quantum field theories like QED (quantum electrodynamics) are just effective theories that are based upon models of reality, and all these models are approximations - all these models are fundamentally "wrong", but at the same time, these effective theories make exceedingly good predictions of the behavior of physical systems over the limited ranges in which they apply, and that is all positivism hopes to achieve. Remember, all of chemistry is just an approximation of QED, and QED is also just an approximate effective theory that explains nearly all of the behaviors of electrons but cannot explain the gravitational attraction between electrons. When all is said and done, biology and chemistry are all just about electrons in various quantum states, and electrons are nearly massless particles made of who knows what. See:

Model-Dependent Realism - A Positivistic Approach to Realism

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston