Thursday, December 31, 2015

Machine Learning and the Ascendance of the Fifth Wave

As I have frequently said in the past, the most significant finding of softwarephysics is that it is all about self-replicating information:

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

Indeed, in The Software Universe as an Implementation of the Mathematical Universe Hypothesis and An Alternative Model of the Software Universe we saw that perhaps our observable Universe is just one instance of a Big Bang of mathematical information that exploded out 13.7 billion years ago into a new universe amongst an infinite Multiverse of self-replicating forms of mathematical information, in keeping with John Wheeler's famous "It from Bit" supposition. Unfortunately, since we are causally disconnected from all of these other possible Big Bang instances, and even causally disconnected from the bulk of our own Big Bang Universe, we most likely will never know if such is the case.

However, closer to home we do not suffer from such a constraint, and we certainly have seen how the surface of our planet has been totally reworked by many successive waves of self-replicating information, as each wave came to dominate the Earth:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Software is currently the most recent wave of self-replicating information to arrive upon the scene and is rapidly becoming the dominant form of self-replicating information on the planet. For more on this see:

A Brief History of Self-Replicating Information
Self-Replicating Information
Is Self-Replicating Information Inherently Self-Destructive?
Enablement - the Definitive Characteristic of Living Things
Is the Universe Fine-Tuned for Self-Replicating Information?
How to Use an Understanding of Self-Replicating Information to Avoid War
The Great War That Will Not End
How to Use Softwarephysics to Revive Memetics in Academia

As we have seen, as each new wave of self-replicating information came to dominate the Earth, it kept all of its predecessors around because each of the previous waves was necessary for the survival of the newest wave. For example, currently software is being generated by the software-creating memes residing in the minds of human programmers, and those memes depend upon the DNA, RNA and metabolic pathways of the distant past for their existence today. But does that necessarily have to always be so for software? Perhaps not. By combining some of the other key findings of softwarephysics, along with some of the recent advances in Machine Learning, it may be possible for software to one day write itself, and that day may not be that far into the future. Let me illustrate such a process with an example.

In my new job, I now have to configure middleware software, rather than support middleware software, as I did in my previous position in Middleware Operations. My old joke was that Middleware Operations did not make the light bulbs, we just screwed them in and kept them lit. But now I actually have to make the light bulbs, and making middleware light bulbs is much closer to my Applications Development roots because it requires a great deal of string manipulation with zero defects. You see, for the first 20 years of my IT career I was a programmer in Applications Development, but I have been out of Applications Development since the mid-1990s, and one thing has really changed since then. We did not have instant messaging back in the mid-1990s. Back then I would come into the office, and if I were into some heavy coding, I would simply send my phone to "voicemail" so that people could not interrupt my train of thought while coding. Most of the day I would be deep into my "coding zone" and totally oblivious to my surroundings, but periodically I would take a break from coding to read the relatively small amount of email that I received each day. We did not have group email accounts in those days either, so I did not receive hundreds of meaningless emails each day from "reply to all" junkies that really did not apply to me in the slightest way. I have now found that the combination of a constant stream of ASAP instant messages from fellow workers and the thousands of meaningless emails I receive each day means that it is very difficult to do coding or configuration work because I am in a state of constant interruption by others, with no time to really think about what I am doing.

To help with all of this, I am now writing MISE Korn shell commands, as much as possible, to automate the routine string manipulations that I need to perform (see Stop Pushing So Many Buttons for details). MISE (Middleware Integrated Support Environment) is currently a toolkit of 1831 Unix aliases, pointing to Korn shell scripts, that I use to do my work, and that I have made available to my fellow teammates doing Middleware work for my present employer. For example, my most recent effort was a MISE command called fac that formats firewall rule requests by reading an input file and outputting a fac.csv file that can be displayed in Excel. The Excel fac.csv file is in the exact format required by our Firewall Rule Request software, and I can just copy/paste some cells from the generated Excel fac.csv file into the Firewall Rule Request software with zero errors. I also wrote a MISE command called tcn that can read the same fac input file after the firewall rules have been generated by NetOps. The MISE tcn command reads the fac input file and conducts connectivity tests from all of the source servers to the destination servers at the destination ports.
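The idea behind the tcn command can be sketched in just a few lines. The real MISE commands are Korn shell scripts; the sketch below is Python, the function name and simplified interface are my own invention, and it only tests from the local host (the real tcn fans the tests out from all of the source servers):

```python
import socket

def check_connectivity(dest_servers, ports, timeout=3):
    """Try a TCP connection to every destination:port pair and report
    OPEN or BLOCKED. (A real tester like tcn would also run the tests
    from each source server, e.g. over ssh; this sketch only tests
    from the local host.)"""
    results = {}
    for dest in dest_servers:
        for port in ports:
            try:
                # create_connection performs the full TCP handshake
                with socket.create_connection((dest, int(port)), timeout=timeout):
                    results[(dest, port)] = "OPEN"
            except OSError:
                # refused, timed out, or unreachable -- the rule is not working
                results[(dest, port)] = "BLOCKED"
    return results
```

Looping over every (destination, port) pair like this is exactly the tedious, error-prone button pushing that the real tcn command automates away.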

The challenge I have with writing new MISE Korn shell commands is that I am constantly being peppered by ASAP instant message requests from other employees while trying to code the MISE Korn shell commands, which means I really no longer have any time to think about what I am coding. But under such disruptive conditions, I have found that my good old Darwinian biological approach to software really pays off because it minimizes the amount of thought that is required. For example, for my latest MISE effort, I wanted to read an input file containing many records like this:

#FrontEnd Apache Servers_______________Websphere Servers________________Ports
SourceServer1;SourceServer2;SourceServer3 DestServer1;DestServer2;DestServer3 Port1;Port2

and output a fac.csv file like this:

#FrontEnd Apache Servers_______________Websphere Servers________________Ports
S_IP1;S_IP2;S_IP3 D_IP1;D_IP2;D_IP3 Port1;Port2

Where S_IP1 is the IP address of SourceServer1. MISE has other commands that easily display server names based upon what the servers do, so it is very easy to display the necessary server names in a Unix session, and then to copy/paste the names into the fac input file. Remember, one of the key rules of softwarephysics is to minimize button pushing, by doing copy/paste operations as much as possible. So the MISE fac command just needed to read the first file and spit out the second file after doing all of the nslookups to get the IP addresses of the servers on the input file. Seems pretty simple. But the MISE fac command also had to spit out the original input file into the fac.csv file with all of its comment and blank records, and then a translated version of the input file with the server names translated to IP addresses, and finally a block of records with all of the comment and blank records removed that could be easily copy/pasted into the Firewall Rule Request software, and with all of the necessary error checking code, it came to 229 lines of Korn shell script.
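Stripped of all of the error checking, the core of such a command is small. Here is a rough sketch in Python rather than Korn shell, with an invented function name, so it is an illustration of the idea rather than the real 229-line script:

```python
import socket

def fac(infile, outfile="fac.csv"):
    """Sketch of the fac idea: echo the input file, then a copy with
    server names translated to IP addresses, then a comment-free block
    ready to copy/paste into the Firewall Rule Request software.
    (The real command is a Korn shell script with far more checking.)"""
    with open(infile) as f:
        lines = [ln.rstrip("\n") for ln in f]

    def translate(line):
        if not line.strip() or line.startswith("#"):
            return line                      # pass comments and blanks through
        sources, dests, ports = line.split()
        to_ips = lambda names: ";".join(socket.gethostbyname(n)
                                        for n in names.split(";"))
        return f"{to_ips(sources)} {to_ips(dests)} {ports}"

    translated = [translate(ln) for ln in lines]
    data_only = [ln for ln in translated
                 if ln.strip() and not ln.startswith("#")]
    with open(outfile, "w") as out:
        out.write("\n".join(lines + translated + data_only) + "\n")
```

The three blocks written to the output file correspond to the three sections described above: the original input, the translated version, and the paste-ready block with comments and blanks removed.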

The first thing I did was to find some old code in my MISE bin directory that was somewhat similar to what I needed. I then made a copy of the inherited code and began to evolve it into what I needed through small incremental changes between ASAP interruptions. Basically, I did not think through the code at all. I just kept pulling in tidbits of code from old MISE commands as needed to get my new MISE command closer to the desired output, or I tried adding some new code at strategic spots based upon heuristics and my 44 years of coding experience, without thinking it through at all. I just wanted to keep making progress towards my intended output with each try, using the Darwinian concepts of inheriting the code from my most current version of the MISE command, coupled with some new variations to it, and then testing it to see if I came any closer to the desired output. If I did get closer, then the selection process meant that the newer MISE command became my current best version; otherwise, I fell back to its predecessor and tried again. Each time I got a little closer, I made a backup copy of the command, like fac.b1, fac.b2, fac.b3, fac.b4..., so that I could always come back to an older version in case I found myself going down the wrong evolutionary path. It took about 21 versions to arrive at the final one that did all that I wanted, and that took me several days because I could only code for 10 - 15 minutes at a time between ASAP interruptions. This development concept is known as genetic programming in computer science. Genetic programming has never really made a significant impact on IT, but I think that is about to change.
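The inherit-vary-test-select loop I followed by hand can be written down directly. A generic sketch in Python, where the function names and the simple line-matching closeness measure are my own inventions, not part of MISE:

```python
def closeness(actual, desired):
    """Score a candidate's output: fraction of desired lines reproduced."""
    hits = sum(1 for a, d in zip(actual, desired) if a == d)
    return hits / max(len(desired), 1)

def evolve(start, vary, run, desired, generations=1000):
    """Inherit the current best version, apply a variation, test it,
    and keep it only if its output is closer to the desired output --
    saving a numbered backup (like fac.b1, fac.b2, ...) at each step."""
    best = start
    best_score = closeness(run(best), desired)
    backups = []
    for _ in range(generations):
        candidate = vary(best)                      # inheritance + variation
        score = closeness(run(candidate), desired)
        if score > best_score:                      # selection
            best, best_score = candidate, score
            backups.append((f"fac.b{len(backups) + 1}", best))
        if best_score == 1.0:                       # desired output reached
            break
    return best, backups
```

In my manual version of this loop, I was the vary() function and, more importantly, the selection test; the question is whether software can take over both roles.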

Now my suspicion has always been that some kind of software could also perform the same tasks as I outlined above, only much faster and more accurately, because there is not a great deal of "intelligence" required by the process, and I think that the dramatic progress we have seen with Machine Learning, and especially with Deep Learning, over the past 5 - 10 years provides evidence that such a thing is actually possible. Currently, Machine Learning is making lots of money for companies that analyze the huge amounts of data that Internet traffic generates. By analyzing huge amounts of data, described by huge "feature spaces" with tens of thousands of dimensions, it is possible to find patterns through pure induction. Then by using deduction, based on the parameters and functions discovered by induction, it is possible to predict things like which emails are spam or which movies a Netflix subscriber might enjoy. Certainly, similar techniques could be used to deduce whether a new version of a piece of software is closer to the desired result than its parent, and if so, create a backup copy and continue on with the next iteration step to evolve the software under development into a final product.
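To make that concrete, imagine replacing my eyeball comparison of outputs with a classifier that has learned, from past accept/reject decisions, which candidate outputs count as "closer". A toy perceptron sketch of that selection step, using only the standard library; the feature encoding and names are mine, purely for illustration:

```python
def perceptron_train(samples, labels, epochs=50, lr=0.1):
    """Induction: learn weights from feature vectors of past candidates,
    where label 1 means 'this candidate was closer to the desired output'."""
    w = [0.0] * (len(samples[0]) + 1)               # last weight is the bias
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x + [1.0]))
            pred = 1 if score > 0 else 0
            for i, xi in enumerate(x + [1.0]):      # update only on mistakes
                w[i] += lr * (y - pred) * xi
    return w

def accept(w, x):
    """Deduction / selection step: keep a new candidate version only if
    the learned model scores its features as an improvement."""
    return sum(wi * xi for wi, xi in zip(w, x + [1.0])) > 0
```

The accept() function could then be dropped straight into the selection step of the Darwinian development loop in place of a human judgment.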

The most impressive thing about modern Machine Learning techniques is that they carry with them all of the characteristics of a true science. With Machine Learning one forms a simplifying hypothesis, or model, that describes the behaviors of a complex dynamical system based upon induction, by observing a large amount of empirical data. Using the hypothesis, or model, one can then predict the future behavior of the system and of similar systems. This finally quells my major long-term gripe that computer science does not use the scientific method. For more on this see How To Think Like A Scientist. I have long maintained that hardware has improved by a factor of 10 million since I began programming back in 1972, while the way we create and maintain software has only improved by a factor of about 10 over the same interval, because the hardware guys used the scientific method to make improvements, while the software guys did not. Just imagine what would happen if we could generate software a million times faster and cheaper than we do today!
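The induction/deduction split is easy to see in even the simplest possible model. Fitting a straight line to observed data is induction; applying the fitted line to a new case is deduction. A minimal sketch, with invented function names:

```python
def fit_line(xs, ys):
    """Induction: summarize observed (x, y) pairs as y = a*x + b
    using ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx                    # slope and intercept

def predict(model, x):
    """Deduction: apply the induced model to a new, unseen case."""
    a, b = model
    return a * x + b
```

Real Machine Learning models do the same two things with vastly more parameters and dimensions, but the scientific shape of the process, hypothesis from data, then prediction from hypothesis, is identical.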

My thought experiment about inserting a Machine Learning selection process into a Darwinian development do-loop may seem a bit too simplistic to be practical, but in Stop Pushing So Many Buttons, I also described how 30 years ago in the IT department of Amoco, I had about 30 programmers using BSDE (the Bionic Systems Development Environment) to grow software biologically from embryos by turning genes on and off. BSDE programmers put several million lines of code into production at Amoco using the same Darwinian development process that I described above for the MISE fac command. So if we could replace the selection process step in a Darwinian development do-loop with Machine Learning techniques, I think we really could improve software generation by a factor of a million. More importantly, because BSDE was written using the same kinds of software that it generated, I was able to use BSDE to generate code for itself. The next generation of BSDE was grown inside of its maternal release, and over a period of seven years, from 1985 to 1992, more than 1,000 generations of BSDE were generated, and BSDE slowly evolved into a very sophisticated tool through small incremental changes. I imagine that by replacing the selection process step with Machine Learning, those 7 years could have been compressed into 7 hours or maybe 7 minutes - who knows? Now just imagine a similar positive feedback loop taking place within the software that was writing itself and constantly improving with each iteration through the development loop. Perhaps it could be all over for us in a single afternoon!

Although most IT professionals will certainly not look kindly upon the idea of becoming totally obsolete at some point in the future, it is important to be realistic about the situation because all resistance is futile. Billions of years of history have taught us that nothing can stop self-replicating information once it gets started. Self-replicating information always finds a way. Right now there are huge amounts of money to be made by applying Machine Learning techniques to the huge amounts of computer-generated data we have at hand, so many high-tech companies are heavily investing in it. At the same time, other organizations are looking into software that generates software, to break the high-cost barriers of software generation. So this is just going to happen as software becomes the next dominant form of self-replicating information on the planet. And as I pointed out in The Economics of the Coming Software Singularity and The Enduring Effects of the Obvious Hiding in Plain Sight, IT professionals will not be alone in going extinct. Somehow the oligarchies that currently rule the world will need to figure out a new way to organize societies as all human labor eventually goes to a value of zero. In truth, that decision too will most likely be made by software.

For more on Machine Learning please see:

Introduction to Machine Learning Theory and Its Applications: A Visual Tutorial with Examples - by Nick McCrea

A Deep Learning Tutorial: From Perceptrons to Deep Networks - by Ivan Vasilev

I recently audited Professor Andrew Ng's excellent online class at Stanford University:

Machine Learning

This is an excellent course, especially for developers who would actually like to build a real Machine Learning application for their company. The class exercises use a high-level language called Octave, which can be downloaded for free, to do the heavy lifting of the huge matrix and linear algebra manipulations that Machine Learning requires. Although the math is something you might see in an advanced-level university physics or math course, Professor Ng does an amazing job of explaining the ideas in a manner accessible to IT professionals. Struggling through the Octave code also brings home what the complex mathematical notation is really trying to say. I have found that IT professionals tend to shy away from mathematical notation because they find it intimidating. But in reality, complex mathematical notation can always be expanded into the simple mathematical processes it is abbreviating, and when you do that in code, it is not so scary after all.
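For example, the linear regression cost function from the course, J(θ) = (1/2m) Σ (hθ(x) − y)², looks intimidating on paper but is nothing more than a couple of loops when expanded into code. Sketched here in Python rather than the course's Octave:

```python
def cost(theta, X, y):
    """J(theta) = 1/(2m) * sum over i of (h_theta(x_i) - y_i)^2,
    written out as plain loops. X is a list of feature lists, each
    with a leading 1.0 for the intercept term."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sum(t * x for t, x in zip(theta, xi))   # h_theta(x) = theta^T x
        total += (h - yi) ** 2                      # squared prediction error
    return total / (2 * m)
```

In the actual course exercises the same computation is done as a single vectorized matrix expression, but the loops above are exactly the simple processes that the notation abbreviates.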

Comments are welcome at

To see all posts on softwarephysics in reverse order go to:

Steve Johnston