Tuesday, December 22, 2020

CRISPR - the First Line Editor for DNA

In Programming Biology in the Biological Computation Group of Microsoft Research, we saw that Microsoft is currently working on a biological IDE (Integrated Development Environment) for biologists, called Station B, that would work much like Eclipse or Microsoft's Visual Studio does for computer software developers. The purpose of Station B is to heavily automate the very tedious process of understanding what old DNA source code does and editing it with new DNA in order to be able to program biology. To do that, Microsoft is leveraging its 30 years of experience with developing and maintaining software source code. Now, modern software developers might wonder what the big deal is. That is because modern software developers never suffered through the limitations of programming computer source code on punch cards. And sadly, until very recently, biologists have essentially been forced to do biological programming at the punch-card level. Only in the past few years have biological programmers developed the CRISPR technology that lets them edit DNA source code with the equivalent of an early-1970s line editor like ed or ex on Unix.

So in solidarity with our new-found biological programmers, let's review the history of how software source code has been developed and maintained over the past 79 years, or 2.5 billion seconds, ever since Konrad Zuse first cranked up his Z3 computer in May of 1941. For more on the computational adventures of Konrad Zuse please see So You Want To Be A Computer Scientist?. Unfortunately, even with the new CRISPR technology, synthetic biology programmers are currently still essentially stuck back in the 1970s when it comes to editing DNA source code. The purpose of this posting is to let them know that computer software programmers have been there too and that there is the possibility for great improvements in the future. But first, let's see how CRISPR works from a computer programmer's perspective.

How CRISPR Works
In 2012, Jennifer Doudna and Emmanuelle Charpentier published a paper that described how the CRISPR-Cas9 complex could be programmed with specific strings of guide RNA to precisely cut targeted DNA, and the technique was soon being used to edit DNA in cells. Similarly, in 2016 it was discovered that a CRISPR-Cas13 complex could target RNA. For the first time, CRISPR technology allows biologists to find specific strings of DNA bases in cells and replace them with new strings of DNA bases. This allows biologists to knock out specific genes, modify specific genes or even add new genes to eukaryotic cells. CRISPR is essentially a line editor that lets biologists locate a specific line of code in DNA and make a change to that line of code. CRISPR was actually invented by bacteria as a literal form of antiviral software that operates much like the antivirus software we run on our hardware. A virus is a parasitic snippet of DNA or RNA wrapped in a protein coat that parasitizes cells, including bacterial cells. For more about viruses see A Structured Code Review of the COVID-19 Virus. Viruses that infect bacteria are called phages. After a phage attaches itself to a bacterial cell, it injects its DNA or RNA into the cell. The phage's DNA or RNA then takes over the machinery of the bacterial cell and has it create hundreds of copies of the phage that either leak out of the bacterial cell or cause it to burst and die. So a virus is a pure form of parasitic self-replicating information. For more about the vast powers of self-replicating information see A Brief History of Self-Replicating Information.

Now, billions of years ago, bacteria came up with a defense mechanism against viral attacks. Purely by means of the Darwinian processes of inheritance, innovation and natural selection, bacteria found a way to cut little snippets of code from the DNA or RNA of viral invaders and store those snippets in their own loops of bacterial DNA. Through a complex process, these viral signatures are then used to find similar sections of viral DNA or RNA that might have found their way into a bacterial cell that is under attack by a virus. If such invading DNA or RNA is discovered by the bacterial antivirus CRISPR software, the CRISPR software cuts the invading DNA or RNA at that point. That is like chopping a virus in half with a guillotine. The virus then stops dead in its tracks. Biologists now use this bacterial CRISPR antiviral system to edit the DNA of the complex eukaryotic cells of higher life forms. They do this by building CRISPR-Cas9 complexes with guide RNA that matches the section of DNA that they wish to edit, as shown in Figure 2. The CRISPR-Cas9 complex then cuts the target DNA at the specified point, and a new section of DNA can then be inserted.
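To make the programming analogy concrete, below is a tiny C++ sketch of my own that treats a strand of DNA as a string of bases, a guide sequence as a search pattern and an edit as a find-and-replace. The base sequences and names are made up purely for illustration; this is a toy analogy for how a guide RNA targets a matching stretch of DNA for cutting and splicing, not real bioinformatics code.

#include <iostream>
#include <string>

int main()
{
    // Made-up toy sequences - purely illustrative, not real genes.
    std::string dna   = "ATGGCGTTACCGGATTACAGGCTT";   // the "target DNA" strand
    std::string guide = "CCGGATTACA";                  // plays the role of the guide RNA
    std::string patch = "CCGGCGGACA";                  // the new snippet of DNA to splice in

    // Find the section of DNA that matches the guide sequence.
    std::size_t cutSite = dna.find(guide);
    if (cutSite == std::string::npos)
    {
        std::cout << "No match found - nothing gets cut." << std::endl;
        return 0;
    }

    // Cut out the matched section and splice in the new snippet,
    // much as a line editor replaces one line of code with another.
    dna.replace(cutSite, guide.length(), patch);

    std::cout << "Edited DNA: " << dna << std::endl;
    return 0;
}

Of course, the real biochemistry is vastly more involved, but from a programmer's point of view the operation really is just a targeted search followed by a cut and an insert.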

The antivirus software running on your hardware does the very same kind of signature matching. The vendor of your antivirus software periodically updates a database on your hardware with new viral signatures that are snippets of virus code from the computer viruses running around the world. The antivirus software then scans your hardware for those viral signatures and, if it finds any, destroys the viral infection. This is an example of convergence. In biology, convergence is the idea that organisms that are not at all related will sometimes come up with very similar solutions to common problems that they share. For example, the concept of the eye has independently evolved at least 40 different times in the past 600 million years, and there are many examples of “living fossils” showing the evolutionary path. For instance, the camera-like structures of the human eye and the eye of an octopus are nearly identical, even though each structure evolved totally independently of the other.

Figure 1 - The eye of a human and the eye of an octopus are nearly identical in structure, but evolved totally independently of each other. As Daniel Dennett pointed out, there are only a certain number of Good Tricks in Design Space and natural selection will drive different lines of descent towards them.

A good Wikipedia article on CRISPR can be found at:

CRISPR gene editing
https://en.wikipedia.org/wiki/CRISPR_gene_editing

Below is a very good YouTube video that quickly explains CRISPR in seven minutes and twenty seconds:

What is CRISPR?
https://www.youtube.com/watch?v=MnYppmstxIs

For a deeper dive, try this two-part YouTube video:

The principle of CRISPR System and CRISPR-CAS9 Technique (Part 1)
https://www.youtube.com/watch?v=VtOZdThI6dM

The principle of CRISPR System and CRISPR-CAS9 Technique (Part 2)
https://www.youtube.com/watch?v=KIMsVSQGBqw

Figure 2 – CRISPR-Cas9 acts like a line editor that allows biologists to locate a specific line of code in DNA and replace it with new DNA code.

The Major Advances in Working on Computer Source Code
Now let's look at the major developments that we have seen in writing and editing computer source code over the years to see where biological programming might be heading. Basically, we have had five major advances:

1. Punch Cards in the 1950s and 1960s
2. Line Editors like ed and ex on Unix in the early 1970s
3. The development of full-screen editors like ISPF on IBM mainframes and vi on Unix in the late 1970s
4. The development of IDEs like Eclipse or Microsoft's Visual Studio in the 1990s
5. The development of AI code generators in the 2020s

Now, I may be jumping the gun with the idea that AI code generators will appear in the 2020s, but we must take note that Microsoft has acquired an exclusive license to the GPT-3 AI text generator, which can already spit out computer source code with some acceptable level of bugs in it. For more on that see The Impact of GPT-3 AI Text Generation on the Development and Maintenance of Computer Software. But let's first go back to the 1950s and 1960s, when we wrote computer source code on punch cards, and see how the development of line editors and full-screen editors then improved programming efficiency by a factor of more than 100.

Programming on Punch Cards
Like the biological labwork of today, writing and maintaining software in the 20th century was very inefficient, time-consuming and tedious. For example, when I first learned to write Fortran code at the University of Illinois at Urbana in 1972, we were punching out programs on an IBM 029 keypunch machine, and I discovered that writing code on an IBM 029 keypunch machine was even worse than writing term papers on a manual typewriter. At least when you submitted a term paper with a few typos, your professor was usually kind enough not to abend your term paper right on the spot and give you a grade of zero. Sadly, I learned that such was not the case with Fortran compilers! The first thing you did was to write out your code on a piece of paper as best you could back at the dorm. The back of a large stack of fan-folded printer paper output was ideal for such purposes. In fact, as a physics major, I first got hooked by software while digging through the wastebaskets of DCL, the Digital Computing Lab, at the University of Illinois looking for fan-folded listings of computer dumps that were about a foot thick. I had found that the backs of thick computer dumps were ideal for working on lengthy problems in my quantum mechanics classes.

It paid to do a lot of desk-checking of your code back at the dorm before heading out to the DCL. Once you got to the DCL, you had to wait your turn for the next available IBM 029 keypunch machine. This was very much like waiting for the next available washing machine on a crowded Saturday morning at a laundromat. When you finally got to your IBM 029 keypunch machine, you would load it up with a deck of blank punch cards and then start punching out your program. You would first press the feed button to have the machine pull your first card from the deck of blank cards and register the card in the machine. Fortran compilers required code to begin in column 7 of the punch card so the first thing you did was to press the spacebar 6 times to get to column 7 of the card. Then you would try to punch in the first line of your code. If you goofed and hit the wrong key by accident while punching the card, you had to eject the bad card and start all over again with a new card. Structured programming had not been invented yet, so nobody indented code at the time. Besides, trying to remember how many times to press the spacebar for each new card in a block of indented code was just not practical. Pressing the spacebar 6 times for each new card was hard enough! Also, most times we proofread our card decks by flipping through them before we submitted the card deck. Trying to proofread indented code in a card deck would have been rather disorienting, so nobody even thought of indenting code. Punching up lots of comment cards was also a pain, so most people got by with a minimum of comment cards in their program deck.

After you punched up your program on a card deck, you would then punch up your data cards. Disk drives and tape drives did exist in those days, but disk drive storage was incredibly expensive and tapes were only used for huge amounts of data. If you had a huge amount of data, it made sense to put it on a tape because if you had several feet of data on cards, there was a good chance that the operator might drop your data card deck while feeding it into the card reader. But usually, you ended up with a card deck that held the source code for your program and cards for the data to be processed too. You also punched up the JCL (Job Control Language) cards that instructed the IBM mainframe to compile, link and then run your program all in one run. You then dropped your finalized card deck into the input bin so that the mainframe operator could load your card deck into the card reader for the IBM mainframe. After a few hours, you would then return to the output room of the DCL and go to the alphabetically sorted output bins that held all the jobs that had recently run. If you were lucky, in your output bin you found your card deck and the fan-folded computer printout of your last run. Unfortunately, you usually found that something had gone wrong with your job. Most likely you had a typo in your code that had to be fixed. If it was nighttime and the mistake in your code was an obvious typo, you probably still had time for another run, so you would get back in line for an IBM 029 keypunch machine and start all over again. You could then hang around the DCL working on the latest round of problems in your quantum mechanics course. However, machine time was incredibly expensive in those days and you had a very limited budget for machine charges. So if there was some kind of logical error in your code, many times you had to head back to the dorm for some more desk checking of your code before giving it another shot the next day.

Figure 3 - An IBM 029 keypunch machine like the one I first learned to program on at the University of Illinois in 1972.

Figure 4 - Each card could hold a maximum of 80 bytes. Normally, one line of code was punched onto each card.

Figure 5 - The cards for a program were held together into a deck with a rubber band, or for very large programs, the deck was held in a special cardboard box that originally housed blank cards. Many times the data cards for a run followed the cards containing the source code for a program. The program was compiled and linked in two steps of the run and then the generated executable file processed the data cards that followed in the deck.

Figure 6 - To run a job, the cards in a deck were fed into a card reader, as shown on the left above, to be compiled, linked, and executed by a million-dollar mainframe computer with a clock speed of about 750 kHz and about 1 MB of memory.

Figure 7 - The output of a run was printed on fan-folded paper and placed into an output bin along with your input card deck.

I finished up my B.S. in Physics at the University of Illinois at Urbana in 1973 and headed up north to complete an M.S. in Geophysics at the University of Wisconsin at Madison. Then from 1975 – 1979, I was an exploration geophysicist exploring for oil, first with Shell, and then with Amoco. I kept coding Fortran the whole time. In 1979, I made a career change into IT and spent about 20 years in development. For the last 17 years of my career, I was in IT operations, supporting middleware on WebSphere, JBoss, Tomcat, and ColdFusion. In 1979, when I became an IT professional in Amoco's IT department, I noticed that not much had changed with the way software was developed and maintained. Structured programming had arrived, so we were now indenting code and adding comment statements to the code, but I was still programming on cards. We were now using IBM 129 keypunch machines that were a little bit more sophisticated than the old IBM 029 keypunch machines. However, the coding process was still very much the same. I worked on code at my desk and still spent a lot of time desk checking the code. When I was ready for my next run, I would get into an elevator and travel down to the basement of the Amoco Building where the IBM mainframes were located. Then I would punch my cards on one of the many IBM 129 keypunch machines but this time with no waiting for a machine. After I submitted my deck, I would travel up 30 floors to my cubicle to work on something else. After a couple of hours, I would head down to the basement again to collect my job. On a good day, I could manage to get 4 runs in. But machine time was still incredibly expensive. If I had a $100,000 project, $25,000 went for programming time, $25,000 went to IT overhead like management and data management services costs, and a full $50,000 went to machine costs for compiles and test runs!

This may all sound very inefficient and tedious today, but it can be even worse. I used to talk to the old-timers about the good old days of IT. They told me that when the operators began their shift on an old-time 1950s vacuum tube computer, the first thing they did was to crank up the voltage on the vacuum tubes to burn out the tubes that were on their last legs. Then they would replace the burned-out tubes to start the day with a fresh machine.

Figure 8 – In the 1950s, the electrical relays of the earliest computers were replaced with vacuum tubes that were also very large, used lots of electricity and generated lots of waste heat, but the vacuum tubes were about 100,000 times faster than relays.

They also explained that the machines were so slow that they spent all day processing production jobs. Emergency maintenance work to fix production bugs was allowed at night, but new development was limited to one compile and test run per week! They also told me about programming the plugboards of electromechanical Unit Record Processing machines back in the 1950s by physically rewiring the plugboards. The Unit Record Processing machines would then process hundreds of punch cards per minute by routing the punch cards from machine to machine in processing streams.

Figure 9 – In the 1950s Unit Record Processing machines like this card sorter were programmed by physically rewiring a plugboard.

Figure 10 – The plugboard for a Unit Record Processing machine.

Programming With a Line Editor
A line editor lets you find and work on one line of code at a time just as a CRISPR-Cas9 complex lets you find and edit one specific line of DNA code in a living cell. Line editors were first developed for computer terminals like teletype machines that were a combination of a keyboard and a printer. You could type in operating system commands on the teletype keyboard and the operating system of the computer would print back responses on a roll of paper - see Figure 11. You could also edit files on the teletype using a line editor. The Unix line editor ed was developed by Ken Thompson in August 1969 on a DEC PDP-7 at AT&T Bell Labs. The ed line editor was rather difficult to use, so Bill Joy wrote the ex line editor in 1976 for Unix. The ex line editor is still used today in Unix shell scripts to edit files.

Figure 11 – Some graduate students huddled around a DEC PDP-8/e minicomputer. Notice the teletype machines in the foreground on the left that were used to input code and data into the machine and to print out results as well.

The following Wikipedia articles are a good introduction to line editors.

Line Editor
https://en.wikipedia.org/wiki/Line_editor

ed (text editor)
https://en.wikipedia.org/wiki/Ed_(text_editor)

ex (text editor)
https://en.wikipedia.org/wiki/Ex_(text_editor)

The article below is a good reference if you would like to play with the ex line editor on a Unix system or under Cygwin running on a PC.

Introducing the ex Editor
https://www.cs.ait.ac.th/~on/O/oreilly/unix/vi/ch05_01.htm

You can download a free copy of Cygwin from the Cygwin organization at:

Cygwin
https://www.cygwin.com/

Now let's play with the ex line editor under Cygwin on my Windows laptop. In Figure 12 we see a 17-line C++ program that asks a user for two numbers and then prints out the sum of the two numbers. The ex line editor usually only prints out one line at a time, but I was able to get ex to print out the whole program by entering the command "1,17". I can also find the first occurrence of the string "cout" using the command "/cout". In a manner similar to the way CRISPR-Cas9 can edit a specific line of DNA code, I can then change "cout" to "pout" with the command "s/cout/pout/". Finally, I can list out the whole program again with a "1,17" command to the ex line editor.
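For readers who cannot make out the text in the figures below, here is roughly what such a 17-line C++ program looks like. The variable names, spacing and exact layout are my own reconstruction and not necessarily what appears in Figure 12; the program only has to prompt for two numbers and print their sum.

#include <iostream>
using namespace std;

int main()
{
    double firstNumber;
    double secondNumber;

    cout << "Enter the first number: ";
    cin >> firstNumber;
    cout << "Enter the second number: ";
    cin >> secondNumber;

    cout << "The sum is: " << firstNumber + secondNumber << endl;

    return 0;
}

With the file loaded into ex, the command "1,17" lists all 17 lines, "/cout" moves to the first line containing "cout", and "s/cout/pout/" changes that "cout" to "pout" on the current line. After that single-character mutation, the program will no longer compile, just as Figure 15 points out.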

Figure 12 – A 17 line C++ program that asks for two numbers and then prints out the sum. To list the whole program, I type "1,17" into the ex line editor.

Figure 13 – To find the first occurrence of the string "cout", I type "/cout" into the ex line editor.

Figure 14 – To change the first "cout" string to "pout", I type "s/cout/pout/" into the ex line editor.

Figure 15 – Now I list the whole program with the "1,17" command. Notice that the first "cout" has been mutated to "pout". Such a devastating mutation means that the C++ program will not even compile into an executable file that can run on my computer.

As I mentioned, I finished up my B.S. in Physics at the University of Illinois at Urbana in 1973 and headed up north to complete an M.S. in Geophysics at the University of Wisconsin at Madison. I was working with a team of graduate students who were collecting electromagnetic data in the field on a DEC PDP-8/e minicomputer - see Figure 11. The machine cost about $30,000 in 1973 (about $176,000 in 2020 dollars) and was about the size of a large side-by-side refrigerator. The machine had 32 KB of magnetic core memory, about 2 million times less memory than a modern 64 GB smartphone. This was my first experience with interactive computing and using a line editor to create and edit code. Previously, I had only written Fortran programs for batch processing on IBM mainframes using an IBM 029 keypunch machine and punch cards. I wrote BASIC programs on the DEC PDP-8/e minicomputer using a teletype machine and a built-in line editor. The teletype machine was also used to print out program runs. My programs were saved to a magnetic tape and I could also read and write data from a tape as well. I found that using the built-in line editor made programming at least 10 times easier than writing code on punch cards. Now with a simple one-line command I could print out large portions of my code and see all of the code at the same time. When programming on cards, you could only look at code one card at a time. Now, whenever you made a run with a compile-link-run job on cards, you did get a listing of the complete program. But otherwise, you spent most of your time shuffling through a large deck of cards. That made it very difficult to find specific lines of code in your deck of cards or all the occurrences of a particular variable.

Another neat thing about the DEC PDP-8/e minicomputer was that there were no computer charges and I did not have to wait hours to see the output of a run. None of my departmental professors knew how to program computers, so there was plenty of free machine time because only about four graduate students knew how to program at the time. I also learned the time-saving trick of interactive programming. Originally, I would write a BASIC program, hard-code the data values for a run directly in the code and then run the program as a batch job. After the run, I would edit the code to change the hard-coded data values and then run the program again. Then one of my fellow graduate students showed me how to add a very primitive interactive user interface to my BASIC programs. Instead of hard-coding data values, my BASIC code would now prompt me for values that I could enter on the fly on the teletype machine. This allowed me to create a library of "canned" BASIC programs on tape that I never had to change. I could just run my "canned" programs with new input data as needed.

Programming With a Full-Screen Editor
In 1972, Dahl, Dijkstra, and Hoare published Structured Programming, in which they suggested that computer programs should have a complex internal structure with no GOTO statements, lots of subroutines, indented code, and many comment statements. But writing such complicated structures on punch cards was nearly impossible. Even with a line editor, it was very difficult to properly do all of the indentation and spacing. As I mentioned above, for several decades programmers writing code on punch cards hardly ever even got to see their code all in one spot on a listing. Mainly, we were rifling through card decks to find things. Trying to punch up structured code on punch cards would have just made a very difficult process much harder, and trying to find things in a deck of structured code would have been even slower if there were lots of punch cards with just comment statements on them. So nobody really wrote structured code with punch cards or even a line editor. In response, in 1974 IBM introduced the SPF (Structured Programming Facility) full-screen editor. At the time, IBM was pushing SPF as a way to do the complicated coding that structured programming called for. Later, IBM renamed SPF to ISPF (Interactive System Productivity Facility).

Figure 16 - IBM 3278 terminals were connected to controllers that connected to IBM mainframes. The IBM 3278 terminals then ran interactive TSO sessions with the IBM mainframes. The ISPF full-screen editor was then brought up under TSO after you logged into a TSO session.

A full-screen editor like ISPF works much like a word-processing program such as Microsoft Word or WordPad in that it displays a whole screen of code at one time. With your cursor, you can move to different lines of code and edit them directly by typing over old code. You can also insert new code and make global changes to the code. ISPF is still used today on IBM's z/OS and z/VM mainframe operating systems. In addition to a powerful full-screen editor, ISPF lets programmers run many utilities from a hierarchical set of menus rather than as TSO command-line commands. At first, in the late 1970s, the IBM 3278 terminals appeared in IT departments in "tube rows", like the IBM 029 keypunch machines of yore, because machine costs for interactive TSO sessions were still much higher than the rate for batch jobs running off punch cards. Today, programmers do not worry much about machine costs, but in the early 1980s the machine costs for developing a new application still came to 50% of the entire budget for the project. By the early 1980s, though, each IT professional was finally given their own IBM 3278 terminal on their own desk. Finally, there was no more waiting in line for an input device!

At the same time, the universities around the world were also building Computer Science Departments in the late 1960s and early 1970s. The universities found it much cheaper to teach students how to program on cheap Unix machines rather than IBM mainframes. So most universities became Unix shops teaching the C programming language. Now, when those students hit the corporate IT departments in the early 1980s, they brought along their fondness for the Unix operating system and the C programming language. But since most corporate IT departments still remained IBM shops, these students had to learn the IBM way of life. Still, the slow influx of programmers with experience with Unix and C running on cheap Unix servers eventually led to the Distributed Computing mass extinction event in the early 1990s. The Distributed Computing mass extinction event was much like the End-Permian greenhouse gas mass extinction 252 million years ago that killed off 95% of marine species and 70% of land-based species. The Distributed Computing mass extinction event ended the reign of the mainframes and replaced them with large numbers of cheap Unix servers in server farms. This greatly advanced the presence of Unix in IT shops around the world and the need for business programmers to learn Unix and a full-screen editor for Unix files. For more on that see Cloud Computing and the Coming Software Mass Extinction. Fortunately, Unix also had a very popular full-screen editor called vi. When Bill Joy wrote the Unix ex line editor in 1976, he also included a visual interface to it called "vi" that was sort of a cross between a line editor and a full-screen editor. The vi full-screen editor takes a bit of time to get used to, but once you get used to vi, it turns out to be a very powerful full-screen editor that is still used today by a large number of Unix programmers. In fact, I used vi to do all of my Unix editing until the day I retired in December of 2016. One of the advantages of using vi on Unix is that you can use a very similar Windows full-screen editor called Vim to edit Windows files. That way, your editing motor-memory works on both the Unix and Windows operating systems. Below is a good Wikipedia article on the vi editor.

vi
https://en.wikipedia.org/wiki/Vi

You can download the Vim editor at:

Vim
https://www.vim.org/

Figure 17 – Now take a look at our C++ program in the vi editor that is running under Cygwin on my Windows laptop. Notice that my cursor is now located at the end of line 17 after I easily added some comment lines and repeated them with the copy/paste function of the vi full-screen editor.

One of the other characteristics of structured programming was the use of "top-down" programming, where programs began execution in a mainline routine that then called many other subroutines. The purpose of the mainline routine was to perform simple high-level logic that called the subordinate subroutines in a fashion that was easy to follow. This structured programming technique made it much easier to maintain and enhance software by simply calling subroutines from the mainline routine in the logical manner required to perform the needed tasks, like assembling Lego building blocks into different patterns that produced an overall new structure. The structured approach to software also made it much easier to reuse software. All that was needed was to create a subroutine library of reusable source code that was already compiled. A mainline program that made calls to the subroutines of the subroutine library was compiled as before. The machine code for the previously compiled subroutines was then added to the resulting executable file by a linkage editor. This made it much easier for structured eukaryotic programs to use reusable code by simply putting the software "conserved core processes" into already compiled subroutine libraries. The vi full-screen editor running under Unix and the ISPF full-screen editor running under TSO on IBM 3278 terminals also made it much easier to reuse source code because now many lines of source code could be simply copied from one program file to another, with the files stored on disk drives, rather than on punch cards or magnetic tape.
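Here is a minimal C++ sketch of my own that illustrates the idea of a mainline calling previously compiled subroutines from a reusable library. The file names, function names and g++ commands in the comments are all my own assumptions, shown only to make the compile-and-link steps concrete.

// ordertotals.cpp - a tiny "subroutine library" of reusable code that is
// compiled once into object code, for example:  g++ -c ordertotals.cpp
double salesTax(double amount, double rate)
{
    return amount * rate;
}

double orderTotal(double amount, double rate)
{
    return amount + salesTax(amount, rate);
}

// mainline.cpp - a simple mainline that just calls the reusable subroutines.
// The linkage step combines it with the library's object code, for example:
//     g++ mainline.cpp ordertotals.o -o mainline
#include <iostream>

// Declarations for the previously compiled subroutines; in a real project
// these would normally live in a shared header file.
double salesTax(double amount, double rate);
double orderTotal(double amount, double rate);

int main()
{
    double amount = 100.00;    // made-up order amount
    double rate   = 0.0825;    // made-up sales tax rate

    std::cout << "Tax:   " << salesTax(amount, rate)   << std::endl;
    std::cout << "Total: " << orderTotal(amount, rate) << std::endl;
    return 0;
}

The mainline stays short and easy to follow, while the already compiled subroutines are reused without ever being touched.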

Thus, not only did the arrival of full-screen editors make writing and maintaining code much easier, but they also ushered in a whole new way of coding. Now programmers were able to write structured code that was easier to support and maintain. They could also benefit from code reuse by being able to do easy copy/paste operations between files of source code and also by building libraries of reusable code that could be brought in at link-time. With a large library of reusable code, programmers could just write mainline programs that made calls to the subroutines in reusable code libraries. Complex living things do the same thing.

In Facilitated Variation and the Utilization of Reusable Code by Carbon-Based Life, I showcased the theory of facilitated variation by Marc W. Kirschner and John C. Gerhart. In The Plausibility of Life (2005), Marc W. Kirschner and John C. Gerhart present their theory of facilitated variation. The theory of facilitated variation maintains that, although the concepts and mechanisms of Darwin's natural selection are well understood, classical Darwinian thought is a bit wanting when it comes to explaining the mechanisms that brought forth viable biological innovations in the past. In classical Darwinian thought, it is proposed that random genetic changes, brought on by random mutations to DNA sequences, can very infrequently cause small incremental enhancements to the survivability of the individual, and thus provide natural selection with something of value to promote in the general gene pool of a species. Again, as frequently cited, most random genetic mutations are either totally inconsequential or totally fatal in nature, and consequently are either irrelevant to the gene pool of a species or are quickly removed from it. The theory of facilitated variation, like classical Darwinian thought, maintains that the phenotype of an individual is key, and not so much its genotype, since natural selection can only operate upon phenotypes. The theory explains that the phenotype of an individual is determined by a number of "constrained" and "deconstrained" elements. The constrained elements are the "conserved core processes" of living things, which have remained essentially unchanged for billions of years and which are used by all living things to sustain the fundamental functions of carbon-based life, like the generation of proteins by processing the information found in DNA sequences with mRNA, tRNA and ribosomes, or the metabolism of carbohydrates via the Krebs cycle. The deconstrained elements are weakly-linked regulatory processes that can change the amount, location and timing of gene expression within a body and which, therefore, can easily control which conserved core processes are run by a cell and when they are run. The theory of facilitated variation maintains that most favorable biological innovations arise from minor mutations to the deconstrained weakly-linked regulatory processes that control the conserved core processes of life, rather than from random mutations to the genotype of an individual in general, which would change the phenotype of the individual in a purely random direction. That is because the most likely outcome for the phenotype of an individual undergoing a random mutation to its genotype is the death of the individual.

Marc W. Kirschner and John C. Gerhart begin by presenting the fact that simple prokaryotic bacteria, like E. coli, require a full 4,600 genes just to sustain the most rudimentary form of bacterial life, while much more complex multicellular organisms, like human beings, consisting of tens of trillions of cells differentiated into hundreds of differing cell types in the numerous complex organs of a body, require only a mere 22,500 genes to construct. The baffling question is, how is it possible to construct a human being with just under five times the number of genes as a simple single-celled E. coli bacterium? The authors contend that it is only possible for carbon-based life to do so by heavily relying upon reusable code in the genome of complex forms of carbon-based life.

Figure 18 – A simple single-celled E. coli bacterium is constructed using a full 4,600 genes.

Figure 19 – However, a human being, consisting of about 100 trillion cells that are differentiated into the hundreds of differing cell types used to form the organs of the human body, uses a mere 22,500 genes to construct a very complex body, which is just slightly under five times the number of genes used by simple E. coli bacteria to construct a single cell. How is it possible to explain this huge dynamic range of carbon-based life? Marc W. Kirschner and John C. Gerhart maintain that, like complex software, carbon-based life must heavily rely on the microservices of reusable code.

The authors then propose that complex living things arise by writing "deconstrained" mainline programs that then call the "constrained" libraries of reusable code that have been in use for billions of years.

Programming with an IDE (Integrated Development Environment)
IDEs first appeared in the early 1990s. An IDE assists developers with writing new software or maintaining old software by automating many of the labor-intensive and tedious chores of working on software. The chief advance that came with the early IDEs was their ability to do interactive full-screen debugging. With interactive full-screen debugging, programmers were able to step through their code line-by-line as the code actually executed. Programmers could set a breakpoint in their code, and when the breakpoint was reached as their code executed, the code stopped executing and put them into debug mode at the breakpoint. They could then look at the values of all of the variables used by their code. They could also step through the code line by line and watch the logical flow of their code operate as it branched around their program mainline and into and back from called functions. They could also watch the values of variables change as each new line of code ran. This provided a huge improvement in debugging productivity. Writing code is easy. Debugging code so that it actually works takes up about 95% of a programmer's time. Prior to full-screen debugging in an IDE, programmers would place a number of PRINT statements into their code for debugging purposes. For example, they might add the following line of code to their program: PRINT("This is breakpoint #1 ", TotalOrderItems, TotalOrderAmount). They would then run the program that they were debugging. After their program crashed, they would look at the output. If they saw something like "This is breakpoint #1 6 $4824.94", they would know that their code at least got to the debug PRINT statement before it crashed, and also that at that time, 6 items had been ordered for a total of $4824.94. By placing many PRINT debug statements in their code, programmers were able to very slowly identify all of the bugs in their program and then fix them.
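For what it is worth, here is a small C++ sketch of that old PRINT-statement style of debugging. The variable names and values are simply made up to match the example above; a breakpoint in a modern IDE debugger replaces all of this scaffolding.

#include <iostream>

int main()
{
    // Made-up order data for illustration.
    int    totalOrderItems  = 6;
    double totalOrderAmount = 4824.94;

    // Old-school debugging: sprinkle print statements through the code so
    // that, after a crash, the output shows how far the program got and
    // what the key variables held at that point.
    std::cout << "This is breakpoint #1 " << totalOrderItems
              << " $" << totalOrderAmount << std::endl;

    // ... the rest of the program, which might later crash ...
    return 0;
}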

Another major advance that IDEs brought was automatic code completion. This was especially important for object-oriented languages like C++ and Java (1995). With automatic code completion, the IDE full-screen editor would sense new code that was being written and would provide a drop-down of possible ways to complete the code being typed. For example, if the programmer typed in the name of an existing object, the automatic code completion would display the list of methods and data attributes defined by that object. The programmer could then select the proper element from the drop-down and the IDE would complete the code. My guess is that the next IDE advance will be an AI form of super-code completion. For example, the GPT-3 AI text generator can already spit out computer source code with some acceptable level of bugs in it. If GPT-3 were allowed to read tons of software code for particular languages, it could get even better. For more on that see The Impact of GPT-3 AI Text Generation on the Development and Maintenance of Computer Software.

Microsoft's very first IDE, Visual C++, came out in 1993. After developing several similar IDEs for different computer languages, Microsoft combined them all into a product called Visual Studio in 1997. Similarly, IBM released its VisualAge IDE in 1993. VisualAge later evolved into Eclipse, which IBM open-sourced in 2001, and by 2004 Eclipse 3.0 was being developed and maintained by the open-source Eclipse Foundation. For those of you not familiar with Microsoft's Visual Studio IDE, you can read about it and download a free trial version of Visual Studio for your own personal use at:

Visual Studio
https://visualstudio.microsoft.com/vs/

If you do not need all of the bells and whistles of a modern IDE, you can download a permanent free copy of Visual Studio Code. Visual Studio Code is a very useful and free IDE with all of the powers of a 1990s-style IDE. It has interactive full-screen debugging and code completion and can be used for many different languages like C, C++, C# and Java.

Visual Studio Code
https://code.visualstudio.com/

When you crank up Microsoft's Visual Studio, you will find all of the software tools that you need to develop new software or maintain old software. The Visual Studio IDE allows software developers to perform development and maintenance chores that took days or weeks back in the 1960s, 1970s and 1980s in a matter of minutes.

Figure 20 – Above is a screenshot of Microsoft's Visual Studio IDE. It assists developers with writing new software or maintaining old software by automating many of the labor-intensive and tedious chores of working on software. Station B intends to do the same for biologists.

Before you download a free community version of Visual Studio be sure to watch some of the Visual Studio 2019 Launch videos at the bottom of the Visual Studio 2019 download webpage to get an appreciation for what a modern IDE can do.

You can also download a free permanent copy of the Eclipse IDE for many different computer languages at:

Eclipse Foundation
https://www.eclipse.org/

Figure 21 – A screenshot of the Eclipse IDE for C++.

Conclusion
As you can imagine, writing and maintaining modern software that might consist of millions of lines of code would have been very difficult back in the 1960s using punch cards. To address that problem, IT slowly came up with many software tools over the decades to help automate the process. Being able to quickly edit code and debug it was certainly a priority. But unfortunately, the progress of automating software development and maintenance was surprisingly slow and was always impeded by the expediency of the moment. There was always the time pressure of trying to get new software into Production. So rather than spending time to write automation software that could be used to improve the productivity of writing and maintaining software, IT Management always stressed the importance of hitting the next deadline with whatever incredibly inefficient technology we had at the time. When I first transitioned from being an exploration geophysicist to being an IT professional back in 1979, I was shocked by the very primitive development technology that IT had at hand. At the time, it seemed that IT Management wanted us to perform miracles on punch cards, sort of like painting the Eiffel Tower with a single paintbrush and a bucket of paint in a single day. In response, in 1985 I developed my own mainframe-based IDE called BSDE (Bionic Systems Development Environment) at a time when IDEs did not exist. The BSDE IDE was used to "grow" applications in a biological manner from an "embryo", and from 1985 - 1992 BSDE was used to put several million lines of code into Production at Amoco. For more on BSDE see the last part of Programming Biology in the Biological Computation Group of Microsoft Research. Similarly, something like Microsoft's Station B initiative to build a biological IDE for writing and editing DNA code will surely be necessary to do advanced biological programming on the billions of DNA base pairs used by complex multicellular carbon-based life. That will certainly take a lot of time and money to accomplish, but I think that the long history of writing and maintaining computer software certainly shows that it will be well worth the effort.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Wednesday, December 09, 2020

The Observatory on Social Media at Indiana University Studies the Parasitic/Symbiotic Interactions Between the Memes and Software

In Some Thoughts on the Netflix Documentary The Social Dilemma, The Perils of Software Enhanced Confirmation Bias and The Current Global Coevolution of COVID-19 RNA, Human DNA, Memes and Software, we discussed the very complex nature of the ongoing parasitic/symbiotic relationships between the memes and social media software. Recall that softwarephysics maintains that software is the fifth wave of self-replicating information to sweep across the planet and that, as it displaces the memes, software is rapidly becoming the dominant form of self-replicating information on the Earth.

The December 2020 issue of Scientific American features an article The Attention Economy by Filippo Menczer and Thomas Hills that discusses some of these parasitic/symbiotic relationships, especially as they relate to the spread of false memes by social media software. In that very interesting article, I learned about the Observatory on Social Media (OSoMe, pronounced 'awe•some') at Indiana University that is headed by Professor Filippo Menczer. Here is their website:

Observatory on Social Media
http://osome.iuni.iu.edu/

This website features the scientific work of a truly unique group of individuals in academia. The Observatory on Social Media is the only academic research group that I know of that is actively exploring the parasitic/symbiotic relationships between the memes and software and how these interactions are impacting the "real world" of human affairs. Much of the material gathered by the Observatory on Social Media stems from the Networks & agents Network (NaN) research group at Indiana University.

Networks & agents Network (NaN):
https://cnets.indiana.edu/groups/nan/

The NaN research group conducts many studies into the nature of the social networks created by social media software, using large amounts of field data gathered in the wild through fieldwork and also numerous numerical models that simulate what is observed in the field. One of their chief areas of research is the spread of misinformation on the Internet via social media software. NaN has created a number of software tools that are available to the public for analyzing such traffic. The NaN website also features a large number of YouTube lectures that explain the work that they are doing. The webpage below takes a bit of time to load, so please be patient.

NaN talks and other videos:
https://cnets.indiana.edu/groups/nan/talks/

A good lecture to start with is one by Professor Filippo Menczer:

4 Reasons Why Social Media Make Us Vulnerable to Manipulation
https://www.youtube.com/watch?v=uLYbkTQT064

Professor Filippo Menczer is very interested in why misinformation seems to go viral just as often as true information in social media diffusion networks. The NaN research group has discovered a number of explanations for this. First, their models show that whenever a social network of trusted agents, each with a finite level of attention, is exposed to high levels of information overload, the misinformation memes tend to go viral more often than true memes. On the other hand, when the same network is exposed to lower levels of information overload, the true memes tend to propagate. Another cause for the spread of misinformation stems from the algorithmic bias that most social media websites employ. Because the purpose of their Machine Learning algorithms is to keep users engaged so that they see more ads, these algorithms naturally select for content that is more popular to accomplish that goal. Unfortunately, users then begin to equate popularity with quality because all they see is popular content. And because the social media Machine Learning algorithms soon learn the social biases of users, they also tend to channel users to similar content that reaffirms their worldview by means of confirmation bias. This causes polarization. Memes tend to spread very quickly within one echo chamber but not between echo chambers, so debunking memes never get to confront misinformation memes. See Some Thoughts on the Netflix Documentary The Social Dilemma, The Perils of Software Enhanced Confirmation Bias and The Current Global Coevolution of COVID-19 RNA, Human DNA, Memes and Software for more on that. Novel memes also spread much more easily than less dramatic memes, and misinformation memes tend to be more sensational and more novel than mundane true memes, which further fosters the propagation of misinformation. Finally, bots can become superspreaders of misinformation when they are programmed to retweet it. Observed field data indicate that bots are very effective at igniting the kindling of a viral cascade that later gets amplified by real human users into a misinformation bonfire.
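To give a feel for how such agent-based models work, here is a toy C++ simulation of my own. It is loosely inspired by the general idea described above and is definitely not the NaN group's actual model: every parameter, name and rule here is a made-up assumption. A few hundred agents, each with a small "screen" of memes representing finite attention, either post new memes of random quality or reshare memes already on their screens; under a heavy flood of new memes, the memes that happen to go viral turn out, on average, to be no better than the rest.

#include <algorithm>
#include <deque>
#include <iostream>
#include <random>
#include <vector>

// A toy meme: an id, an intrinsic quality and a running share count.
struct Meme { int id; double quality; int shares; };

int main()
{
    // All parameters are made up purely for illustration.
    const int    numAgents    = 200;     // agents in the toy network
    const int    screenSize   = 5;       // finite attention: memes an agent can hold
    const int    numFollowers = 5;       // random followers who see each post
    const double newMemeProb  = 0.5;     // high posting rate = information overload
    const int    steps        = 20000;   // posting/resharing events to simulate

    std::mt19937 rng(42);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_int_distribution<int> pickAgent(0, numAgents - 1);

    std::vector<Meme> memes;                          // every meme ever created
    std::vector<std::deque<int>> screens(numAgents);  // each agent's screen of meme ids

    for (int t = 0; t < steps; ++t)
    {
        int agent = pickAgent(rng);
        int memeId;

        if (unit(rng) < newMemeProb || screens[agent].empty())
        {
            // Post a brand new meme with a random intrinsic quality.
            memeId = static_cast<int>(memes.size());
            memes.push_back({memeId, unit(rng), 0});
        }
        else
        {
            // Reshare a randomly chosen meme from the agent's own screen.
            std::uniform_int_distribution<int> pickSlot(
                0, static_cast<int>(screens[agent].size()) - 1);
            memeId = screens[agent][pickSlot(rng)];
        }

        memes[memeId].shares++;

        // The meme lands on the screens of a few random followers, pushing
        // out the oldest meme whenever an agent's finite attention is full.
        for (int f = 0; f < numFollowers; ++f)
        {
            int follower = pickAgent(rng);
            screens[follower].push_back(memeId);
            if (static_cast<int>(screens[follower].size()) > screenSize)
                screens[follower].pop_front();
        }
    }

    // Compare the quality of the most-shared memes with the overall average.
    std::sort(memes.begin(), memes.end(),
              [](const Meme& a, const Meme& b) { return a.shares > b.shares; });

    const int topN = std::min<int>(10, static_cast<int>(memes.size()));
    double topQuality = 0.0;
    double allQuality = 0.0;
    for (int i = 0; i < topN; ++i) topQuality += memes[i].quality;
    for (const Meme& m : memes)    allQuality += m.quality;

    std::cout << "Average quality of the " << topN << " most viral memes: "
              << topQuality / topN << std::endl;
    std::cout << "Average quality of all " << memes.size() << " memes: "
              << allQuality / memes.size() << std::endl;
    return 0;
}

Even in a crude toy like this, popularity becomes a poor proxy for quality once attention is finite and the flow of new memes is heavy, which echoes the general point made in the paragraph above.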

Once again, let me repeat the fundamental characteristics of self-replicating information for those of you new to softwarephysics.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

The Characteristics of Self-Replicating Information
All forms of self-replicating information have some common characteristics:

1. All self-replicating information evolves over time through the Darwinian processes of inheritance, innovation and natural selection, which endows self-replicating information with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity.

2. All self-replicating information begins spontaneously as a parasitic mutation that obtains energy, information and sometimes matter from a host.

3. With time, the parasitic self-replicating information takes on a symbiotic relationship with its host.

4. Eventually, the self-replicating information becomes one with its host through the symbiotic integration of the host and the self-replicating information.

5. Ultimately, the self-replicating information replaces its host as the dominant form of self-replicating information.

6. Most hosts are also forms of self-replicating information.

7. All self-replicating information has to be a little bit nasty in order to survive.

8. The defining characteristic of self-replicating information is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space. See Enablement - the Definitive Characteristic of Living Things for more on this last characteristic. That posting discusses Stuart Kauffman's theory of Enablement in which living things are seen to exapt existing functions into new and unpredictable functions by discovering the “Adjacent Possible” of spring-loaded preadaptations.

Over the past 4.56 billion years we have seen five waves of self-replicating information sweep across the surface of the Earth and totally rework the planet, as each new wave came to dominate the Earth:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Software is the most recent wave of self-replicating information to arrive upon the scene and is rapidly becoming the dominant form of self-replicating information on the planet. For more on the above see A Brief History of Self-Replicating Information. Recently, with the rise of social media software, the memes and software have formed a very powerful new parasitic/symbiotic relationship. In that parasitic/symbiotic relationship, the memes are now mainly being spread by means of social media software, and social media software is being spread and financed by means of the memes. But again, this is nothing new. All five waves of self-replicating information are coevolving by means of eternal parasitic/symbiotic relationships. For more on that see The Current Global Coevolution of COVID-19 RNA, Human DNA, Memes and Software.

Again, self-replicating information cannot think, so it cannot plot to take over the world in some conspiracy-theory-like fashion. All forms of self-replicating information are simply forms of mindless information responding to the blind Darwinian forces of inheritance, innovation and natural selection. Yet despite that, as each new wave of self-replicating information came to predominance over the past four billion years, each of them managed to completely transform the surface of the entire planet, so we should not expect anything less from software as it comes to replace the memes as the dominant form of self-replicating information on the planet. But this time might be different. What might happen if software does eventually develop a Mind of its own? After all, that does seem to be the ultimate goal of all the current AI software research that is going on.

Conclusion
If you are at all interested in the parasitic/symbiotic relationships between the memes and social media software, you should definitely explore the above website of the Observatory on Social Media. The Observatory on Social Media seems to have really gotten started around 2016, so it is a rather new research effort that certainly should be heavily funded in today's hyperpolitical world because it is studying things like the transmission of misinformation on the Internet. In the modern political world, such matters are certainly as significant and as grave as the spread of nuclear weapons secrets amongst the world powers in the 1950s.

Additionally, I think that the studies of social media networks by the Observatory on Social Media and NaN have applications to many additional fields as well. If you think of the self-organizing, coordinated networks of humans and bots that form the self-sustaining echo chambers discovered in the wild as active agent-based meme-complexes, you have a form of self-replicating information with many of the characteristics of carbon-based life. In addition, and most importantly, these researchers have done a great deal of fieldwork and collected tons of data in the wild. I bet that many researchers working on evolutionary biology, the origin of carbon-based life on the Earth and astrobiology, as well as those working on the rise of eusocial behavior in species by means of multilevel selection, would find their work extremely interesting and beneficial.

Their work could also help to revive memetics in academia. A few years back, I had a lengthy email exchange with Susan Blackmore, one of the founders of memetics, about her brilliant TED presentation which can be viewed at:

Memes and "temes"
https://www.ted.com/talks/susan_blackmore_on_memes_and_temes

Note that I consider Susan Blackmore's temes to really be technological artifacts that contain software. After all, a smartphone without software is simply a flake tool with a very dull edge. Susan Blackmore's most significant contribution is her hypothesis that the very large and highly over-engineered human brain evolved to store and process memes. Memetic-drive then went on to develop languages, cultures and social media software to further store and propagate memes. That is why we are such suckers for false memes. Our Minds evolved to store and propagate memes, even memes that are not acting in our best interests. For more on that see A Brief History of Self-Replicating Information.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Saturday, November 28, 2020

Some Thoughts on the Netflix Documentary The Social Dilemma

I just finished watching the 2020 Netflix documentary The Social Dilemma, and I thought that some softwarephysics could be of help in assessing the risks and benefits of social media software. The Social Dilemma features many young IT developers and IT management professionals from Silicon Valley who have become somewhat disenchanted with how social media software has affected individuals and society in general. The Social Dilemma depicts a rather bleak and very disturbing picture of software in the current age. Now I was born in 1951, and I was raised during the 1950s and 1960s at a time when software was merely printing out paychecks and bills. So I had no interactions with software growing up as a child. However, I do have two children and five grandchildren, and I have seen them all grow up with the great advancements that software made over the past 40 years with the arrival of personal computing. In this posting, I would like to use some softwarephysics to suggest that none of the concerns raised by The Social Dilemma are new threats to mankind and that, to the contrary, social media software has provided many great benefits for all. I think that much of this disenchantment stems from not having a full grasp of what is really going on with the interactions between software and mankind.

It's All About Self-Replicating Information
First, because software is the latest form of self-replicating information to arrive on the planet, you must understand the overwhelming power of self-replicating information to rework an entire planet, or perhaps, even an entire galaxy. Once again, let me repeat the fundamental characteristics of self-replicating information for those of you new to softwarephysics.

Self-Replicating Information – Information that persists through time by making copies of itself or by enlisting the support of other things to ensure that copies of itself are made.

The Characteristics of Self-Replicating Information
All forms of self-replicating information have some common characteristics:

1. All self-replicating information evolves over time through the Darwinian processes of inheritance, innovation and natural selection, which endows self-replicating information with one telling characteristic – the ability to survive in a Universe dominated by the second law of thermodynamics and nonlinearity.

2. All self-replicating information begins spontaneously as a parasitic mutation that obtains energy, information and sometimes matter from a host.

3. With time, the parasitic self-replicating information takes on a symbiotic relationship with its host.

4. Eventually, the self-replicating information becomes one with its host through the symbiotic integration of the host and the self-replicating information.

5. Ultimately, the self-replicating information replaces its host as the dominant form of self-replicating information.

6. Most hosts are also forms of self-replicating information.

7. All self-replicating information has to be a little bit nasty in order to survive.

8. The defining characteristic of self-replicating information is the ability of self-replicating information to change the boundary conditions of its utility phase space in new and unpredictable ways by means of exapting current functions into new uses that change the size and shape of its particular utility phase space. See Enablement - the Definitive Characteristic of Living Things for more on this last characteristic. That posting discusses Stuart Kauffman's theory of Enablement in which living things are seen to exapt existing functions into new and unpredictable functions by discovering the “Adjacent Possible” of spring-loaded preadaptations.

Over the past 4.56 billion years we have seen five waves of self-replicating information sweep across the surface of the Earth and totally rework the planet, as each new wave came to dominate the Earth:

1. Self-replicating autocatalytic metabolic pathways of organic molecules
2. RNA
3. DNA
4. Memes
5. Software

Software is the most recent wave of self-replicating information to arrive upon the scene and is rapidly becoming the dominant form of self-replicating information on the planet. For more on the above see A Brief History of Self-Replicating Information. Recently, the memes and software have formed a very powerful parasitic/symbiotic relationship with the rise of social media software. In that parasitic/symbiotic relationship, the memes are now mainly being spread by means of social media software, and social media software is being spread and financed by means of the memes. But again, this is nothing new. All five waves of self-replicating information are coevolving by means of eternal parasitic/symbiotic relationships. For more on that see The Current Global Coevolution of COVID-19 RNA, Human DNA, Memes and Software.

Again, self-replicating information cannot think, so it cannot conspire to take over the world in the manner of a conspiracy theory. All forms of self-replicating information are simply forms of mindless information responding to the blind Darwinian forces of inheritance, innovation and natural selection. Yet despite that, as each new wave of self-replicating information came to predominance over the past four billion years, it managed to completely transform the surface of the entire planet, so we should not expect anything less from software as it comes to replace the memes as the dominant form of self-replicating information on the planet. But this time might be different. What might happen if software does eventually develop a Mind of its own? After all, that does seem to be the ultimate goal of all the current AI software research that is going on.

The Grand Delusion of Consciousness
Next, you must look inward and question the very nature of your own consciousness. Philosophers and theologians have been doing this for thousands of years, so this is nothing new. In The Ghost in the Machine the Grand Illusion of Consciousness, I explained that most people simply do not consider themselves to be a part of the natural world. Instead, most people, consciously or subconsciously, consider themselves to be a supernatural and immaterial spirit that is temporarily haunting a carbon-based body. Now, in everyday life, such a self-model is a very useful delusion like the delusion that the Sun, planets and stars all revolve about us on a fixed Earth. In truth, each of us tends to self-model ourselves as an immaterial Mind with consciousness that can interact with other immaterial Minds with consciousness too, even though we have no evidence that these other Minds truly do have consciousness. After all, all of the other Minds that we come into contact with on a daily basis could simply be acting as if they were conscious Minds that are self-aware. Surely, a more accurate self-model would be for us to imagine ourselves as carbon-based robots. More accurately, in keeping with the thoughts of Richard Dawkins and Susan Blackmore, softwarephysics models humans as DNA survival machines and Meme Machines with Minds infected with all sorts of memes. Some of those memes are quite useful and some are quite nasty. The 3 billion bases of DNA in each of your cells are trying to self-replicate at all costs and so are the memes in your Mind. For more on this, see Susan Blackmore's brilliant TED presentation at:

Memes and "temes"
https://www.ted.com/talks/susan_blackmore_on_memes_and_temes

Note that I consider Susan Blackmore's temes to really be technological artifacts that contain software. After all, a smartphone without software is simply a flake tool with a very dull edge.

The Chief Concerns of The Social Dilemma
The main concerns raised by The Social Dilemma seem to be:

1. Social media software uses algorithms to keep us engaged so that we watch more ads.
2. Companies are keeping track of all of our behaviors on social media so that they can target us with specific ads.
3. People are becoming addicted to social media software because the attention-grabbing algorithms are feeding us what we want to see without limit.
4. Some people are publishing false information on social media software that then goes viral because the false information confirms the current worldview of many.
5. Thus, the algorithms of social media software tend to reinforce confirmation bias, foster extreme views and facilitate the propagation of false information.
6. Given the above, social media software is largely responsible for the current very divisive political landscape in the United States of America.
7. Many people do not realize that many of these horrible things are happening as software manipulates their Minds to watch more ads and content that confirms and further radicalizes their personal worldview.
8. Because people have become addicted to social media software, they are no longer interacting with each other face to face.

But from the perspective of softwarephysics, do the above problems really stem from social media software? I would contend that most of the above problems actually stem from the DNA in our cells trying to self-replicate and the memes inventing thousands of different languages to aid the self-replication of memes. In this view, most of the above problems originated because of the parasitic/symbiotic relationships between our DNA and the memes in our Minds. Only recently have these problems been amplified by the arrival of social media software, which has exacerbated the situation. For example, before the invention of languages, memes could only replicate when people copied the behaviors of other human beings, like copying how someone else made a flake tool. That made the spreading of memes and false memes very difficult. But once languages arrived on the scene, memes could then spread by word of mouth to many others and down through the generations as myths and oral histories too. It also allowed people to begin to gossip and spread rumors about each other. Most gossip and rumors are very interesting and tantalizing memes that tend to survive while less interesting nonessential memes do not. For example, the meme that Suzie made the honor roll this semester can hardly compete with the meme that Suzie was expelled from school for doing something on her smartphone. These problems greatly increased with the invention of writing. Once memes could be written down, they seemed to grow in significance. Written memes had a sense of permanence and authority that could be spread wide and far. This sense of permanence and authority was highly enhanced by the invention of the printing press and the spread of official documents and books. Much later, radio, movies and television came along and provided for the mass distribution of memes composed of very interesting gossip and rumors at a very low cost per unit.

So our DNA and memes have been manipulating our Minds ever since our Minds came along. And now we have software doing the same thing. Remember, as an intelligent being in a Universe that has become self-aware, you should know that the world does not have to be the way it is. Once you understand what human DNA, memes, and software are up to, you do not have to fall prey to their mindless compulsion to replicate. As I said before, human DNA, memes, and software are not necessarily acting in your best interest; they are only trying to replicate, and for their purposes, you are just a temporary disposable survival machine to be discarded in less than 100 years. All of your physical needs and desires are geared to ensuring that your DNA survives and gets passed on to the next generation, and the same goes for your memes. Your memes have learned to use many of the built-in survival mechanisms that DNA had previously constructed over hundreds of millions of years, such as fear, anger, and violent behavior. Have you ever noticed the physical reactions your body goes through when you hear an idea that you do not like or find to be offensive? All sorts of feelings of hostility and anger will emerge. I know they do for me, and I think I know what is going on! The physical reactions of fear, anger, and thoughts of violence are just a way for the memes in a meme-complex to ensure their survival when they are confronted by a foreign meme. They are merely hijacking the fear, anger, and violent behavior that DNA created for its own survival millions of years ago. Fortunately, because software is less than 80 years old, it is still in the early learning stages of all this, but software has an even greater potential for hijacking the dark side of mankind than the memes, and with far greater consequences.

For example, the world is currently struggling with a world-wide COVID-19 pandemic. In the United States of America, there currently is a very strange political meme that considers the wearing of masks and social distancing to be an infringement of personal freedoms and a sign of political disloyalty. Clearly, this meme is not working in the self-interests of the Minds storing and spreading the meme. That is because there is a 1% chance that the RNA in the COVID-19 virus will end the life of the Mind storing and spreading the meme. Unfortunately, the COVID-19 virus is highly infectious and 1% of 330 million comes to 3.3 million dead Americans. So the rational thing to do is to wear masks and practice social distancing for a few more months until the COVID-19 vaccine can be distributed. For more on that see The Current Global Coevolution of COVID-19 RNA, Human DNA, Memes and Software, A Structured Code Review of the COVID-19 Virus and Love in the Time of COVID-19. So this meme is definitely not working in the best interest of the DNA or the other memes that the Mind stores. Yet, the meme persists. That is because the meme has incorporated itself into a very durable meme-complex that contains memes concerned with gun control, abortion, climate change and prayer in school. And that very durable meme-complex shapes the worldview of about 40% of Americans. So this very dangerous meme persists by forming a parasitic/symbiotic relationship with the other memes in the meme-complex and it manipulates the Minds and behaviors of those infected with the meme.

But Isn't Social Media Software Responsible for the Current Political Division in the United States?
Prior to today, the most politically divisive years in my lifetime were in the late 1960s and early 1970s. At the time, the Vietnam War was raging and the United States of America had a draft. Many of my friends were of draft age and were very active in the anti-war movement. Some participated in the October 1967 march on the Pentagon and the protests in the streets of Chicago at the 1968 Democratic Convention. In the spring of 1970, during my freshman year in college, there was a student strike and the National Guard was on campus in response to protests over the Cambodian invasion. Now don't forget that even future presidents Clinton, Bush and Trump all managed to avoid the Vietnam War and the draft too by various dubious means. This was also a time of great racial tension in the country with riots and violent police encounters in many cities. At the time, many worried that the country was coming apart at the seams. Yet there was no social media software at the time.

The political and social turmoil of the late 1960s and early 1970s brings to mind the dangers of curtailing free speech on social media software. Currently, many liberal and conservative legislators have become interested in censoring the political content of social media software. At present, social media software is protected by Section 230 of the Communications Decency Act, which says that "No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider", which basically means that Facebook, Twitter, Instagram and YouTube cannot be sued for libel when people post "false" political information on their platforms. Unfortunately, the conservative legislators want social media software to suppress liberal thoughts and the liberal legislators want it to suppress conservative thoughts by allowing all to sue for the posting of political "lies" from both sides. And that is the problem. My truths can be your lies. I just think back to those anti-war days when even future presidents balked at the widely-held political truths of the day. Personally, I now only have confidence in science and mathematics. All other forms of human thought seem to be hopelessly flawed by confirmation bias. For me, the only way to handle this is to allow all to express their political thoughts without censorship so long as those thoughts do not promote violence or other illegal activity.

In truth, most political and social upheaval arises from economic disturbances. I would attribute most of today's political divisiveness to the economic disturbance caused by automation and AI software displacing workers. This economic displacement by software has caused much more trouble than social media software. It has produced a shrinking middle class and great wealth disparity. In the 20th century, such conditions led to Fascism, genocide and world war without the help of social media software. For more on that see The Danger of Tyranny in the Age of Software, Oligarchiology and the Rise of Software to Predominance in the 21st Century, Is it Finally Time to Reboot Civilization with a New Release? and The Softwarephysics of the Green New Deal.

The Benefits of Social Media Software
I personally do not use Facebook or Twitter, but my wife does use Facebook to keep in touch with friends and family in a seemingly harmless manner. However, I do use YouTube to watch very interesting lectures on science. I love the YouTube algorithm that suggests even more such lectures. I do not mind the targeted ads because I can cancel them after about 5 seconds, unlike radio and television. Television has been using a very inaccurate carpet-bombing approach to advertising for 70 years. You are stuck watching many minutes of ads that you have no interest in along with the rest of the viewing audience as collateral damage while the advertisers try to find the very few interested members of the audience. I much prefer the surgical-strike ads that are specifically targeted to me by YouTube algorithms. Sometimes I am even interested in them and the products that they are trying to sell me.

As for the damage to children, one of my grandsons was given an indestructible Chromebook when he was three years old. I know lots of people discourage kids from having too much screen time, but my grandson was able to teach himself how to read and write with the aid of YouTube videos before he was four years old. He also learned the Japanese, Hebrew, Hindi and other character sets online with YouTube videos. He then learned how to add and subtract numbers and use negative numbers too. He learned how to multiply and divide numbers by creating a "skip counting" matrix. Shortly after he turned five years old, he got into finding prime numbers with the sieve of Eratosthenes.

Figure 1 – The sieve of Eratosthenes is an ancient algorithm for identifying prime numbers through a process of elimination.

After setting up the initial grid of sequential numbers he would use a bunch of Divisibility Rules that he learned on YouTube, like the Divisibility Rule for 3, to eliminate the non-prime numbers that were not already eliminated by multiples on the sieve of Eratosthenes. For example, take a number like 267 in the grid and add together each digit in the number (2 + 6 + 7 = 15). Then take that sum (15) and determine if it is divisible by 3. The original number is divisible by 3, if and only if, the sum of its digits is divisible by 3. Who knew? He would also make a 20 x 20 matrix on paper and then fill it in by manually multiplying out all of the matrix elements using the "skip counting" method he learned on YouTube. Any number in the resulting matrix that was not in the first row or column could not be a prime number and could then be used to eliminate additional non-primes from the grid of Eratosthenes. He also learned all of the elements in the periodic table and memorized their atomic numbers too. My daughter is a high school chemistry teacher and she would get a kick out of asking him what the atomic number of cobalt was. She had no idea, but her son did. I also got stuck with trying to find many of the elements in our home. "Grandpa, do we have any beryllium?" All you have to do is get them started on one educational YouTube video and then turn the Autoplay radio button on in YouTube. The YouTube Machine Learning algorithms then do the rest.
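
For the programmers in the audience, here is a minimal Python sketch of the two tricks described above: the sieve of Eratosthenes and the digit-sum divisibility rule for 3. The grid size of 300 is just an illustrative choice on my part.

def sieve_of_eratosthenes(limit):
    # Start by assuming every number from 2 up to limit is prime,
    # then cross out the multiples of each number that survives.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n in range(2, limit + 1) if is_prime[n]]

def divisible_by_3(n):
    # Divisibility Rule for 3: add up the digits and test the sum.
    return sum(int(digit) for digit in str(n)) % 3 == 0

print(sieve_of_eratosthenes(300))   # all of the primes up to 300
print(divisible_by_3(267))          # True: 2 + 6 + 7 = 15 and 15 is divisible by 3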

Conclusion
In conclusion, all of the atrocities in the history of mankind were accomplished without the aid of social media software. They were all accomplished simply with the inventions of language, writing, printing, radio, movies and television. Certainly, social media software will participate in the atrocities of the future, but social media software will not be the cause of future atrocities. History has shown that humans are perfectly able to carry out atrocities all on their own without such help.

To paraphrase Shakespeare:

“The fault, dear Brutus, is not in our software
But in ourselves”


Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Saturday, November 07, 2020

The Softwarephysics of the Green New Deal

The Green New Deal Resolution of Representative Alexandria Ocasio-Cortez of New York and Senator Ed Markey of Massachusetts is a very ambitious and broad-ranging proposal and framework for decarbonizing the American economy and, in the process, curing most of America's long-term historical problems.

Text of the Green New Deal Resolution
https://www.congress.gov/116/bills/hres109/BILLS-116hres109ih.pdf

Alexandria Ocasio-Cortez Takes Green New Deal Global in Passionate Speech
https://www.youtube.com/watch?v=E1w3V4PUv2s

Many object to tying the decarbonization of America to such things as providing all Americans with healthcare, decent housing and good-paying jobs. Granted, as I pointed out in The Deadly Dangerous Dance of Carbon-Based Intelligence and Last Call for Carbon-Based Intelligence on Planet Earth, the world is currently messing with the carbon cycle of the Earth in a way that could bring forth the demise of civilization. Fixing this problem will certainly require the reworking of the entire world economy, and while totally reworking the entire world economy, maybe it does make sense to fix all of the world's historical problems in the process. But how about reworking the entire world economy in such a way that also takes into account the future problems of the world? In The Impact of GPT-3 AI Text Generation on the Development and Maintenance of Computer Software, we saw that the Software Singularity may be closer than we think. The Software Singularity is a time in the near future when software will finally be able to self-replicate on its own without the need of human beings, and will also be able to improve itself without limit. As I pointed out in Oligarchiology and the Rise of Software to Predominance in the 21st Century and Is it Finally Time to Reboot Civilization with a New Release?, the resulting displacement of most of the world's working population by AI software and AI-controlled automation processes will end the current economic paradigm upon which all civilizations have been run for the past 4,000 years. Such a monumental economic disturbance could also help to spell the demise of civilization as we know it. So I would like to propose a Global Comprehensive Green New Deal to handle all of the earth-shattering problems now facing mankind. But I do not need to do so because somebody else has already done that.

A Global Comprehensive Green New Deal
I did not support Andrew Yang's 2020 Presidential campaign because I knew that he could not win. However, that does not mean that I thought that his policies were incorrect. In fact, having the entire world implement Andrew Yang's policies will probably be the only way we will be able to hold civilization together for the next 100 years or so. Andrew Yang's chief proposals are for a Universal Basic Income, Human-Centered Capitalism and the decarbonization of America with wind, solar and 4th-generation nuclear reactors. The Universal Basic Income and Human-Centered Capitalism are designed to handle the vast economic disturbances that robotics and AI software will soon be generating as most people stop working for a living. My suspicion is that Andrew Yang also knows that wind and solar just do not have the necessary energy densities to do the job of decarbonizing the world. But large numbers of molten salt nuclear reactors running on spent uranium fuel rods, depleted uranium, natural uranium and thorium have an energy density that is millions of times greater and can certainly do the job.

Andrew Yang's 2020 Policies
https://www.yang2020.com/policies/

Recently, Andrew Yang started up a Humanity Forward Foundation to carry forward the policies of his 2020 Presidential campaign at a global level.

Humanity Forward Foundation
https://movehumanityforward.com/

But to truly have a Global Comprehensive Green New Deal we will need to have the world adopt the tenets of ecomodernism as expressed in the Ecomodernist Manifesto:

AN ECOMODERNIST MANIFESTO
http://www.ecomodernism.org/

Basically, the Ecomodernist Manifesto suggests that mankind needs to stop pretending that it can be one with Nature. Today, many modern environmentalists strive to reduce the impact of mankind on Nature by having mankind scale back to the needs of the hunter-gatherers of old. The problem is that nobody really wants to live with the poverty of a hunter-gatherer lifestyle for long. Perhaps for a weekend retreat, but not much longer than that. The Ecomodernists suggest that the only way to save Nature is to decouple mankind from Nature by totally stopping the exploitation of Nature by mankind. This can only be accomplished if mankind has access to a limitless supply of energy. The Ecomodernists maintain that modern 4th-generation nuclear reactors burning uranium and thorium provide just such a limitless supply of energy. There is enough uranium and thorium on the Earth to run the world for hundreds of thousands of years. After that, there will always be sources of uranium and thorium on the Moon and the nearby asteroids. By moving all of mankind to self-contained cities run by modern nuclear reactors, it would be possible to totally decouple mankind from Nature. Nature could then be left to heal on its own.

These totally self-contained cities would be like huge interstellar spaceships. Each self-contained city would contain a fixed number of atoms that would be constantly recycled using the energy from 4th-generation nuclear reactors. The only additional atoms required to run the self-contained cities would be small amounts of uranium and thorium. All of the food and manufactured goods would be derived from the atoms already in the self-contained cities and from the atoms in discarded goods, sewage, garbage and waste that would be recycled back into food and other useful products by unlimited amounts of energy, robotics and AI software. Such self-contained cities with occupants all living with a modern high-level standard of living would be the actualization of the science-fiction future that I was promised while growing up back in the 1950s. But unlike many of the other things from the science-fiction future of the 1950s that I grew up with, we now have the necessary technology to make it actually all happen if we decide to do so.

Problems with the Current Green New Deal
But what about the current Green New Deal? There are some problems. The current Green New Deal does not take into account the fast-approaching dramatic economic impacts of robotics and AI software, and it does not properly deal with the very low energy density of wind and solar power. Our Universe primarily runs on two very powerful forms of energy - gravitational potential energy and nuclear energy. Surprisingly, gravitational potential energy is the more powerful of the two, even though the force of gravity is the weakest force in the Universe. The gravitational potential energy of the Universe is so vast simply because our Universe has huge amounts of mass spread out all over the place and not just in one spot.

Gravitational potential energy is released when things fall down. For example, when the gas and dust in a vast molecular cloud falls together, it releases huge amounts of gravitational potential energy that heats up the gas and dust at the center to form a new star. Similarly, when gas and dust fall into the supermassive black hole at the center of a galaxy, it gives off radiation so powerful that it can literally be seen across the entire observable Universe as a quasar. This huge amount of gravitational potential energy is leftover from the Big Bang that kicked off our little Universe in the Multiverse and is the result of the decay of the Inflaton field that powers the expansion of the Multiverse. See The Software Universe as an Implementation of the Mathematical Universe Hypothesis for more about the Inflaton field of the Multiverse. But unfortunately, we do not have access to cosmic amounts of gravitational potential energy. The only gravitational potential energy that we have access to comes in the form of hydroelectric power, and that gravitational potential energy really comes from seawater that was lifted into the clouds by the very dilute energy density of sunlight spread over large areas of the ocean.

The second most powerful form of energy in our Universe is nuclear energy - primarily in the form of fusion nuclear energy in the cores of stars. Fusion may come along someday for us as a useful source of energy, but we have been working on fusion energy for about 70 years now and it might take another 20 - 100 years to perfect, and we need something right now to replace carbon-based fuels! The physics of fusion is tough. You have to really squeeze protons together with high densities and high temperatures to make them fuse. For example, the core of our Sun has a tremendous density of 150 g/cm3 which makes it about 150 times denser than water and it has a temperature of 15.7 million degrees Kelvin. Yet the core of our Sun only produces about 275 watts/m3 of heat! Since the human body generates about 100 watts of heat and has a volume of about 0.1 cubic meters, that means the human body generates about 1,000 watts/m3, or about 3.6 times the heat production of the Sun's core! My wife uses a very small space heater on her bathroom sink top to heat the bathroom. It generates 1200 watts of heat. So my wife would need 4.36 cubic meters of the Sun's core to provide the same amount of heat and that would take up most of my half of the bathroom.
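
For those who like to check the arithmetic, here is the same back-of-the-envelope comparison in Python, using only the power-density figures quoted above.

sun_core_power_density = 275.0       # watts per cubic meter, as quoted above
human_power = 100.0                  # watts of body heat
human_volume = 0.1                   # cubic meters
space_heater_power = 1200.0          # watts for my wife's bathroom space heater

human_power_density = human_power / human_volume
print(human_power_density)                            # 1,000 watts/m3
print(human_power_density / sun_core_power_density)   # about 3.6 times the Sun's core
print(space_heater_power / sun_core_power_density)    # about 4.36 m3 of solar core needed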

But the Universe has been kind. The Universe also contains another form of nuclear energy that remains largely untapped. That is because it requires Intelligence to unleash it. It is called fission nuclear energy. Certain very heavy nuclei like uranium-233, uranium-235, plutonium-239 and plutonium-240 can fission into lighter nuclei when hit with a neutron. When these nuclei fission, they split in half and the two halves fly apart at a few percent of the speed of light! That is still a tremendous amount of kinetic energy that turns into heat as the fission products bounce around hitting other atoms. Additional energy is released as the fission products decay into lighter nuclei and give off gamma-rays and high-speed electrons.

Figure 1 – When a neutron hits a uranium-235 nucleus, it can split it into two lighter nuclei, like Ba-144 and Kr-89, plus two or three additional neutrons. The two lighter nuclei fly apart at a few percent of the speed of light. The nuclei that fly apart are called fission products and are very radioactive, with half-lives of less than 30 years, and need to be stored for about 300 years. The additional neutrons can then strike other uranium-235 nuclei, causing them to split as well. Some neutrons can also hit uranium-238 nuclei and turn them into plutonium-239 and plutonium-240, which can also fission when hit by a neutron.

Figure 2 – A ball of thorium or uranium smaller than a golf ball can fuel an American lifestyle for 100 years. This includes all of the electricity, heating, cooling, driving and flying that an American does in 100 years. We have already mined enough thorium and uranium to run the whole world for thousands of years. There is enough thorium and uranium on the Earth to run the world for hundreds of thousands of years.

Molten salt nuclear reactors burn the entire ball of uranium or thorium into fission products. Some of those fission products are valuable nuclei that can be used for medical or industrial purposes. Regardless of that, the generated fission products only need to be safely stored for 10 half-lives or 300 years until they become as radioactive as coal ash. And we already know how to safely store things for 300 years or more. We already have many buildings, books and paintings that have been carefully preserved for 300 years. But the current nuclear reactors that we are running only burn a few percent of the ball and then leave behind a radioactive ball that needs to be safely stored for 200,000 years until it decays to safe levels. For more on how modern advanced fission reactors work, see: Last Call for Carbon-Based Intelligence on Planet Earth.
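
As a rough sanity check of the Figure 2 claim, here is a little Python sketch. The input numbers are my own ballpark assumptions rather than figures from this post: roughly 200 MeV released per fission, a uranium density of about 19 g/cm3, a golf-ball volume of about 40 cm3 and an average American primary-energy consumption of roughly 10,000 watts.

MEV_TO_JOULES = 1.602e-13            # joules per MeV
AVOGADRO = 6.022e23                  # atoms per mole

# Treat the ball as pure uranium-235 that gets completely fissioned.
energy_per_fission = 200.0 * MEV_TO_JOULES               # about 3.2e-11 joules
atoms_per_kg = AVOGADRO / 0.235                          # uranium-235 atoms in 1 kg
energy_per_kg = energy_per_fission * atoms_per_kg        # about 8e13 joules per kg

ball_mass_kg = 40.0 * 19.0 / 1000.0                      # 40 cm3 at 19 g/cm3 is about 0.76 kg
total_energy = ball_mass_kg * energy_per_kg              # about 6e13 joules
seconds_in_a_century = 100.0 * 365.25 * 24.0 * 3600.0    # about 3.2e9 seconds

print(total_energy / seconds_in_a_century)   # about 20,000 watts of average power,
                                             # comfortably above the roughly 10,000 watts
                                             # that an average American consumes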

It is now thought that these very precious heavy nuclei are created when orbiting neutron stars collide. Each neutron star is several miles in diameter and is essentially one huge atomic nucleus composed of vast numbers of neutrons. As the two neutron stars orbit each other, they give off energy in the form of gravitational waves and begin to orbit closer and closer together as their orbital energy radiates away. They also orbit each other faster and faster to conserve angular momentum. Eventually, the two neutron stars collide like two giant balls of neutrons smashing together at a high velocity in a vast cosmic atom smasher. This allows some very heavy nuclei to form, like uranium-235, uranium-238 and thorium-232. Plutonium-239 and plutonium-240 also form, but those nuclei have relatively short half-lives, so they do not exist long enough to become incorporated into a new stellar planetary system. It is also thought that these very heavy nuclei are created by Type II supernovae when the cores of very massive stars collapse at the end of their lives. Fortunately for us, Nature squirrels away vast amounts of cosmic energy in these nuclei, just waiting for some form of Intelligence to come along and put them to good use. They are a form of limitless energy for any Intelligence smart enough to tap it.

Many Green New Deal Supporters Propose to Solve All of Our Problems with Lots of Solar Panels and Wind Turbines
But couldn't we just build lots of solar panels and wind turbines instead of using 4th-generation nuclear reactors? That would produce millions of high-paying jobs that could bring people out of poverty and into a thriving middle class like we had back in the 1950s and 1960s when I was growing up. The problem is that both solar panels and wind turbines are driven by the very low-density energy of sunlight. Our Sun does give off lots of energy, but by the time sunlight reaches the top of the Earth's atmosphere it has dropped down to a value of 1370 Watts/m2. That's about as much as my wife's space heater for every square meter. But lots of that energy gets reflected back into space by clouds and the atmosphere. Plus, you only get that 1370 Watts/m2 for a surface that is perpendicular to the incoming sunlight. During the night you get nothing and even during the daytime you only get a portion of it because your solar panels are tilted from being perpendicular to the incoming sunlight. So even if solar panels were 100% efficient, you would need lots of them to power the world.
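
Here is a rough Python estimate of how much solar panel area that implies, using the 20,000 GW world average power figure calculated in the next paragraph. The 20% panel efficiency and the roughly 200 Watts/m2 of sunlight that actually reaches a fixed panel, once nights, clouds and tilt are averaged in, are my own ballpark assumptions.

world_average_power = 20000.0e9      # watts, about 20,000 GW as calculated below
average_insolation = 200.0           # W/m2 reaching a fixed panel, averaged over a year
panel_efficiency = 0.20              # 20% conversion efficiency

average_output_per_m2 = average_insolation * panel_efficiency   # about 40 W/m2
panel_area_m2 = world_average_power / average_output_per_m2
print(panel_area_m2 / 1.0e6)         # about 500,000 square kilometers of solar panels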

But how about calculating an order of magnitude approximation? Lots of people use the "toe" unit of energy - the energy of one ton of oil equivalent. In gigawatt-hours that comes to:

1 toe = 11,630 kWh = 0.011630 GWh, where kWh = kilowatt-hour and GWh = gigawatt-hour.

One Mtoe = 1 million toes.

In 2018, the world used 14,421 Mtoe = 167,716,230 GWh for the whole year. To find the average steady-state power usage of the world we need to divide by the number of hours in a year:

167,716,230 GWh / 24 / 365.25 = 19,132 GW which is about 20,000 GW

So the whole world needs the continuous output from about 20,000 one-GW power plants on average. That could be furnished by 80,000 modular 250 MW molten salt nuclear reactors, each about the size of a city bus, that could be built on assembly lines and would not need any storage infrastructure. The world currently has about 50,000 commercial aircraft in service so this is doable. Since 1/2 of the world is always in darkness and probably does not have enough wind either, let's say you need to continuously store and release about 10,000 GW of power for those down-times. That's a lot of rechargeable batteries.
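
Here is the same order of magnitude arithmetic written out in Python, using the 2018 consumption figure quoted above.

TOE_TO_KWH = 11630.0                 # kilowatt-hours per ton of oil equivalent
world_mtoe_2018 = 14421.0            # Mtoe consumed by the whole world in 2018

world_kwh = world_mtoe_2018 * 1.0e6 * TOE_TO_KWH         # about 1.68e14 kWh
world_gwh = world_kwh / 1.0e6                            # about 167,716,230 GWh
hours_per_year = 24.0 * 365.25

average_power_gw = world_gwh / hours_per_year
print(average_power_gw)              # about 19,132 GW, call it 20,000 GW
print(20000.0 / 0.25)                # 80,000 modular 250 MW molten salt reactors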

For example, to just power America's Green New Deal with wind and solar power alone, Mike Conley and Tim Maloney used the numbers from the 132-page report of the environmental Solutions Project that requires 18 billion square meters of solar panels and 500,000 5 MW wind turbines to supply all of the energy needs of the United States. Mike and Tim point out that once all of this infrastructure has been constructed on 132,000 square miles of land, we will need to replace 1.23 million square meters of solar panels and 80 of the 5 MW wind turbines every day, forever, as the solar panels and wind turbines wear out. One idea in the Green New Deal is to create millions of high-paying jobs building the huge arrays of solar panels and wind turbines and additional high-paying jobs to maintain and replace large amounts of hardware as it wears out. There will also need to be a large number of jobs devoted to collecting and recycling the materials in worn-out solar panels and wind turbines. The benefit of millions and millions of high-paying jobs is that it will help to bring many people out of poverty and into the middle class. The problem is that it will also make solar and wind power very expensive.
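
As a quick sanity check, here is a small Python sketch of what those replacement rates imply about equipment lifetimes, using only the numbers quoted above.

solar_area_m2 = 18.0e9               # 18 billion square meters of solar panels
panel_m2_replaced_per_day = 1.23e6   # square meters of panels replaced every day
wind_turbines = 500000.0             # 5 MW wind turbines
turbines_replaced_per_day = 80.0

days_per_year = 365.25
print(solar_area_m2 / panel_m2_replaced_per_day / days_per_year)   # about a 40-year panel lifetime
print(wind_turbines / turbines_replaced_per_day / days_per_year)   # about a 17-year turbine lifetime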

Figure 3 – The new GE 5 MW wind turbine is 500 feet tall, about the height of a 50 story building. We will need 500,000 of them and will need to replace 80 of them each day as they wear out in the future.

Roadmap To Nowhere - The Myth of Powering the Nation With Renewable Energy
https://www.roadmaptonowhere.com/

In a sense, the Green New Deal would be moving millions and millions of people back to the soil to become sunshine farmers. That is because the energy density of sunshine is so very low and undependable. And wind power is just one step removed from sunshine farming because the wind is caused by the differential heating of the Earth's atmosphere by sunshine. Sunshine farming is subject to all of the same debilitating vagaries of traditional farming like too many cloudy or hot windless days. And like traditional farmers, sunshine farmers cannot work at night. On the other hand, a true Global Comprehensive Green New Deal would divert such potential sunshine farmers into building molten salt nuclear reactors, upgrading electrical grids, rebuilding infrastructure, converting transportation systems to electrical and hydrogen-based systems, building energy-efficient low-cost housing and insulating existing buildings. Investments in those activities would also create millions of high-paying jobs and would deliver more benefits than simply investing in sunshine farming.

Conclusion
Now don't get me wrong. I have been buying wind and solar-powered electricity for about 10 years now, but only because it is better than electricity from coal-fired power plants. But if I could buy electricity from power plants run by molten salt nuclear reactors, I would definitely do that. But I cannot do that because there are no molten salt nuclear reactors generating electricity. This is where a Global Comprehensive Green New Deal comes in. We need the government of the United States of America to adopt a "Manhattan Project"-like Global Comprehensive Green New Deal to rapidly develop molten salt nuclear reactors and all of the necessary supply chain infrastructure to fuel them. Such a project would also help to attain all of the objectives of a solar and wind-based Green New Deal. Given a limitless supply of energy that is far cheaper than coal, we can achieve all of the worthy objectives of the Green New Deal, and by further implementing the ideas of Andrew Yang, we can also overcome the challenges of the Software Singularity and the fast-approaching dramatic economic impacts of robotics and AI software. And I love the good intentions of Representative Alexandria Ocasio-Cortez of New York and Senator Ed Markey of Massachusetts. For me, they bring back the hopes and aspirations of the 1960s for a better world. I just think that a little bit of physics and softwarephysics could be of help in achieving their goals.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston