Tuesday, August 06, 2019

How to Study the Origin of Life on the Earth and Elsewhere in the Universe Right Here at Home

Over the past year, I have taken several excellent online courses on the origin of life on the Earth and elsewhere in the Universe. In all of these very interesting courses, I have noticed that, as we go further and further back into deep time, it becomes ever harder to figure out exactly what might have happened. Indeed, this is very difficult to do because we only have one example of carbon-based life, here on the Earth, to examine. This is further complicated by the fact that none of these excellent courses have been able to definitively define exactly what life on the Earth is, nor what life elsewhere in the Universe might look like. It is very difficult to determine the origin of something that cannot even be properly defined. That is why I have recommended in many of my softwarephysics posts to first step up just one level higher and begin by definitively defining the general concept of self-replicating information. Once that has been done, we can then view carbon-based life on the Earth as just one of many forms of self-replicating information. Then we can proceed to explore the general characteristics of all forms of self-replicating information, and also some possible commonalities that might exist among their origins.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

Over the past 4.56 billion years we have seen five waves of self-replicating information sweep across the surface of the Earth and totally rework the planet, as each new wave came to dominate the Earth:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Software is the most recent wave of self-replicating information to arrive upon the scene and is rapidly becoming the dominant form of self-replicating information on the planet. For more on the above see A Brief History of Self-Replicating Information.

The Characteristics of Self-Replicating Information
All forms of self-replicating information have some common characteristics:

1. All self-replicating information evolves over time through the Darwinian processes of inheritance, innovation and natural selection, which endow self-replicating information with one telling characteristic: the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity.

2. All self-replicating information begins spontaneously as a parasitic mutation that obtains energy, information and sometimes matter from a host.

3. With time, the parasitic self-replicating information takes on a symbiotic relationship with its host.

4. Eventually, the self-replicating information becomes one with its host through the symbiotic integration of the host and the self-replicating information.

5. Ultimately, the self-replicating information replaces its host as the dominant form of self-replicating information.

6. Most hosts are also forms of self-replicating information.

7. All self-replicating information has to be a little bit nasty in order to survive.

8. The defining characteristic of self-replicating information is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space. See Enablement - the Definitive Characteristic of Living Things for more on this last characteristic. That posting discusses Stuart Kauffman's theory of Enablement, in which living things are seen to exapt existing functions into new and unpredictable functions by discovering the “Adjacent Possible” of spring-loaded preadaptations.

By focusing first on a definition of self-replicating information and on the common characteristics of all forms of self-replicating information, we eliminate the problem of trying to define “life” itself. In previous posts, I proposed that since carbon-based life on the Earth and software are both forms of self-replicating information, it only makes sense to look to the origin and early evolution of software for some clues to the origin and early evolution of carbon-based life. For example, in the SoftwarePaleontology section of SoftwareBiology, I explained how the evolution of software over the past 78 years, or 2.46 billion seconds, ever since Konrad Zuse first cranked up his Z3 computer in May of 1941, has closely followed the same path through Design Space as did carbon-based life on this planet over the past 4.0 billion years, in keeping with Simon Conway Morris's contention that convergence has played the dominant role in the evolution of life on Earth.
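
Strictly for the curious, the "2.46 billion seconds" figure is easy to check with a quick back-of-the-envelope calculation, sketched below as a small Java program. The exact start date used here (May 12, 1941) is an assumption for illustration; only the order of magnitude matters.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Back-of-the-envelope check of the "2.46 billion seconds" figure.
// The exact start date is an assumption; May 1941 is when Zuse's Z3 first ran.
public class SoftwareAge {
    public static void main(String[] args) {
        LocalDate z3 = LocalDate.of(1941, 5, 12);   // assumed first run of the Z3
        LocalDate post = LocalDate.of(2019, 8, 6);  // date of this post
        long days = ChronoUnit.DAYS.between(z3, post);
        long seconds = days * 24L * 60L * 60L;
        System.out.printf("Software is about %,d days old, or %.2f billion seconds.%n",
                days, seconds / 1.0e9);
        // Prints roughly 28,575 days, or about 2.47 billion seconds.
    }
}
```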

Software is now rapidly becoming the dominant form of self-replicating information on the planet and is having a major impact on mankind as it does so. We are now living in one of those very rare times when a new form of self-replicating information, in the form of software, is coming to predominance. For biologists, this presents an invaluable opportunity because software has been evolving about 100 million times faster than living things over the past 2.46 billion seconds, and the evolution of software over that period is the only history of a form of self-replicating information that has actually been recorded by human beings as it unfolded. In fact, the entire evolutionary history of software has occurred within a single human lifetime, and many of the humans who witnessed it are still alive today to testify to what actually happened, something that those working on the origin of life on the Earth and its early evolution can only try to imagine.

As I saw in many of these courses, some researchers are now trying to model the origin of carbon-based life on the Earth and elsewhere in the Universe with software. But why not just take advantage of the $10 trillion worldwide IT infrastructure that has been evolving all on its own for the past 2.46 billion seconds? Certainly, the biosphere of the Earth and this $10 trillion IT infrastructure are the most complicated information-processing systems that we know of, and they are billions of times more complicated than anything a research group could ever build on its own. Take a look at A Lesson for IT Professionals - Documenting the Flowcharts of Carbon-Based Life in the KEGG Online Database for an example. The simplest way to take advantage of this infrastructure would be to do some fieldwork in the corporate IT department of a major corporation or governmental agency. The objective of this fieldwork would be to observe the complexities of supporting heavy-duty commercial software in action on a network of hundreds of physical or virtual servers. I think that observing the IT processes used to write new code and install it into production on a large network of servers, and then watching the emergent behaviors of the code under load, would help researchers to understand the origins of carbon-based life and its complex information-processing networks of organic molecules.

And there are tons of data for researchers to work with. Most large-scale IT operations have a great deal of monitoring software in place that displays and records the performance of many thousands of software components under processing load. For example, when I retired in December of 2016, my employer had more than 500 Unix servers running the software for its external websites and the internal web-based applications used to run the business. At peak load, these Unix servers were running several billion Objects (think eukaryotic cells) in Java Virtual Machines (JVMs), and all of these Objects were sending messages to each other, like the cells in a complex multicellular organism.
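
To make the analogy a little more concrete, here is a minimal, purely illustrative Java sketch of objects sending messages to one another the way the cells of a multicellular organism signal each other. The Cell class and the WebServer, AppServer and Database instances are hypothetical stand-ins for the many thousands of classes found in real commercial software.

```java
import java.util.ArrayList;
import java.util.List;

// A toy sketch of Objects in a JVM passing messages to one another,
// in the spirit of cells in a multicellular organism signaling each other.
// The class and instance names are illustrative, not taken from any real system.
class Cell {
    private final String name;
    private final List<Cell> neighbors = new ArrayList<>();

    Cell(String name) { this.name = name; }

    void connect(Cell other) { neighbors.add(other); }

    // Send a "signal" (a method call) to every connected Object, up to a hop limit.
    void signal(String message, int hops) {
        if (hops <= 0) return;
        for (Cell neighbor : neighbors) {
            System.out.println(name + " -> " + neighbor.name + ": " + message);
            neighbor.signal(message, hops - 1);
        }
    }
}

public class MessagingDemo {
    public static void main(String[] args) {
        Cell web = new Cell("WebServer");
        Cell app = new Cell("AppServer");
        Cell db = new Cell("Database");
        web.connect(app);
        app.connect(db);
        // One incoming request fans out into a cascade of internal messages.
        web.signal("processOrder", 2);
    }
}
```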

During an outage, the Command Center of the IT department will page out perhaps 10 people to join an outage conference call. Then, using the display consoles of the monitoring software on their laptops, the members of the outage call will all try to figure out what the heck is going on so that they can quickly fix the problem. Outages can cost anywhere from thousands to millions of dollars per second, depending on the business being conducted by the commercial software. Outages frequently happen in the middle of the night when new code goes into production, or during the following day when the new code hits peak load. That would be an example of a simple deterministic outage caused by a low-level code change. But many times an outage just happens all on its own for no apparent reason. The whole network of information processes just seems to go berserk. Naturally, such "out of the blue" outages greatly displease IT Management because IT Management always wants to know the root cause of an outage. I spent many years trying to tell IT Management that such emergent outages just naturally arise in complex nonlinear networks far from thermal equilibrium, but that explanation never went over very well. Anyway, the monitoring software records tons of data that can be used later to try to find the root cause of the outage. Better yet, this monitoring software runs continuously and provides a continuous stream of data that a research team could put to good use.
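
For readers who would like to see how such an "out of the blue" outage can emerge from a nonlinear network with no single root cause, here is a toy Java simulation. The topology, load values and thresholds are invented purely for illustration and are not taken from any real monitoring data.

```java
import java.util.Random;

// A toy model of an "out of the blue" outage: a ring of interdependent services,
// each of which sheds its load onto its neighbors when it is pushed over capacity.
// Topology, thresholds and load values are invented purely for illustration.
public class CascadeDemo {
    public static void main(String[] args) {
        int n = 20;                       // number of services
        double[] load = new double[n];    // current load on each service
        boolean[] down = new boolean[n];
        Random rng = new Random(42);
        for (int i = 0; i < n; i++) load[i] = 0.6 + 0.3 * rng.nextDouble();

        // A single small, local perturbation: one service spikes above capacity.
        load[0] = 1.2;

        // Nonlinear feedback: an overloaded service fails and dumps its load
        // onto its two neighbors, which may then fail in turn.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < n; i++) {
                if (!down[i] && load[i] > 1.0) {
                    down[i] = true;
                    load[(i + 1) % n] += load[i] / 2;
                    load[(i - 1 + n) % n] += load[i] / 2;
                    changed = true;
                }
            }
        }
        int failed = 0;
        for (boolean d : down) if (d) failed++;
        System.out.println(failed + " of " + n + " services ended up down after one local spike.");
    }
}
```

Because each overloaded service sheds its load onto its neighbors, a single local spike can take down most of the network, even though no individual code change was "at fault".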

For a description of what a heavy-duty IT infrastructure looks like see Software Embryogenesis. For a more detailed post on such a proposal see A Proposal for an Odd Collaboration to Explore the Origin of Life with IT Professionals. In that post, I proposed that it might be worthwhile for researchers working on the origin of life or astrobiology to collaborate with the department of their university that teaches business-oriented computer science and with the IT department of a local major corporation or government agency to add some IT professionals to their research teams to bring in some new insights to their research efforts.

Some may object to the idea of software being a form of self-replicating information because software is currently a product of the human mind. But I think that objection stems from the fact that most people simply do not consider themselves to be a part of the natural world. Instead, most people consciously or subconsciously consider themselves to be a supernatural and immaterial spirit that is temporarily haunting a carbon-based body. For more on that see The Ghost in the Machine the Grand Illusion of Consciousness.

In order for evolution to take place, we need all three Darwinian processes at work: inheritance, innovation and natural selection. And that is the case for all forms of self-replicating information, including carbon-based life, memes and software. Currently, software is being written and maintained by human programmers, but that will likely change in the next 10 to 50 years when the Software Singularity occurs and AI software becomes able to write and maintain software better than a human programmer can. Even so, one must realize that human programmers are also just machines, with very complicated and huge neural networks of neurons that have been trained with very advanced Deep Learning techniques to code software. Nobody learned how to code software sitting alone in a dark room. All programmers inherited the memes for writing software from teachers, books, other programmers or by looking at the code of others. Also, all forms of selection are "natural" unless they are made by supernatural means. So a programmer pursuing bug-free software by means of trial and error is no different from a cheetah deciding which gazelle in a herd to pursue.
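
As a minimal sketch of those three Darwinian processes operating on software-like strings of characters, consider the following toy Java program. The target string, mutation scheme and selection rule are arbitrary choices made purely for illustration.

```java
import java.util.Random;

// A minimal sketch of the three Darwinian processes - inheritance, innovation and
// natural selection - operating on "programs" represented as character strings.
// The target string, alphabet and selection rule are arbitrary illustrative choices.
public class DarwinDemo {
    static final String TARGET = "BUG-FREE SOFTWARE";
    static final char[] ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ -".toCharArray();
    static final Random RNG = new Random();

    // Natural selection: fitness is simply how many characters match the target.
    static int fitness(char[] candidate) {
        int score = 0;
        for (int i = 0; i < TARGET.length(); i++)
            if (candidate[i] == TARGET.charAt(i)) score++;
        return score;
    }

    public static void main(String[] args) {
        char[] best = new char[TARGET.length()];
        for (int i = 0; i < best.length; i++)
            best[i] = ALPHABET[RNG.nextInt(ALPHABET.length)];

        int generation = 0;
        while (fitness(best) < TARGET.length()) {
            // Inheritance: the child starts as a copy of its parent.
            char[] child = best.clone();
            // Innovation: a single random mutation.
            child[RNG.nextInt(child.length)] = ALPHABET[RNG.nextInt(ALPHABET.length)];
            // Selection: the fitter of parent and child survives into the next generation.
            if (fitness(child) >= fitness(best)) best = child;
            generation++;
        }
        System.out.println("Reached \"" + new String(best) + "\" after " + generation + " generations.");
    }
}
```

Even this trivial hill-climber typically converges on the target within a couple of thousand generations, with nobody specifying the intermediate steps in advance.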

Conclusion
In The Danger of Believing in Things, I discussed what can happen when a science gets "stuck" on a problem that seems impossible to solve. Sometimes the best thing to do when apparently "stuck" is to simply walk away from the problem for a bit and seek the advice of others, especially others with an entirely different perspective of the problem at hand.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston
