Friday, September 22, 2023

An IT and Geophysical Perspective on the Nanopore Sequencing of DNA and RNA

In this post, I would like to cover nanopore sequencing, a remarkable new tool for rapidly reading the sequence of bases along very long stretches of DNA and RNA. Nanopore sequencing devices can directly read a stretch of DNA or RNA that is over one million bases in length at a speed of about 450 bases/second with a device about the size of a flip smartphone. These nanopore sequencing devices now start at a price of about $1,000 and can plug into the USB port of your laptop. They are also very rugged, and can even work in very harsh field conditions simply using the power from your laptop battery. Prior to nanopore sequencing, DNA and RNA were sequenced using very time-consuming and expensive biochemical procedures. These prior procedures did not directly read DNA and RNA base sequences themselves.

Figure 1 – Above is a MinION nanopore sequencer being used in the field.

Figure 2 – Above is the general setup of a MinION nanopore sequencer. The DNA or RNA sample to be read is placed into the flow cell of the unit.

The development of nanopore sequencing should be of interest to all IT professionals because it is truly an extraordinary story of information processing capability that rivals that which has occurred in IT over the past 82 years, or 2.6 billion seconds, ever since Konrad Zuse first cranked up his Z3 computer in May of 1941. First, it is important to remember that whenever data is written from memory to a permanent secondary medium that persists in time, such as a thumb drive, SSD or HDD drive, or even the magnetic tape of yore, it is always done so in a sequential manner. Each byte of data that is written to a permanent secondary medium is written in a sequential manner - one byte after the other. The written information in each byte is naturally important, but even more so is the sequence of the bytes that are written out to the permanent secondary medium. For example, each character in one of your emails is encoded as a single byte of information using the characters in the ASCII encoding table. However, the true essence of the information in your email is encoded by the total order and sequence of the bytes in the email. The same is true of the total order and sequence of the bases in a stretch of DNA or RNA used to build a protein molecule. That is why being able to read the sequence of bases in a string of DNA or RNA is so important. It is even more important than being able to read the bytes in your email.

Biological and IT Data Access Methods
Before proceeding, we first need to briefly review how data is encoded in IT and by carbon-based life forms in biology. In biology, data is encoded by DNA and RNA molecules.

Figure 3 - RNA is a one-track tape, while DNA is a two-track tape. DNA has a data track and a parity track that allows for error corrections after DNA replicates. DNA uses a slightly different version of the ribose sugar and also uses the nucleotide of T (Thymine) instead of the U (Uracil) used by RNA.

For IT, we will return to the batch processing of data stored on magnetic tapes that was common in the 1960s and 1970s because it is more closely aligned with how biological information is processed using DNA and RNA tapes. One of the simplest and oldest access methods in IT is called QSAM - Queued Sequential Access Method:

Queued Sequential Access Method
http://en.wikipedia.org/wiki/Queued_Sequential_Access_Method

I did a lot of magnetic tape processing in the 1970s and early 1980s using QSAM. At the time we used 9 track tapes that were 1/2 inch wide and 2400 feet long on a reel with a 10.5 inch diameter. The tape had 8 data tracks and one parity track across the 1/2-inch tape width. That way we could store one byte across the 8 1-bit data tracks in a frame, and we used the parity track to check for errors. We used odd parity: if the 8 bits on the 8 data tracks in a frame added up to an even number of 1s, we put a 1 in the parity track to make the total number of 1s an odd number. If the 8 bits added up to an odd number of 1s, we put a 0 in the parity track to keep the total number of 1s an odd number. Originally, 9 track tapes had a density of 1600 bytes/inch of tape, with a data transfer rate of 15,000 bytes/second. Remember, a byte is 8 bits and can store one character, like the letter “A” which we encode in the ASCII code set as A = “01000001”.
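The odd-parity rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not how a tape drive's hardware worked, and the function names odd_parity_bit and frame_is_valid are made up for this example.

```python
def odd_parity_bit(byte: int) -> int:
    """Return the parity-track bit for one 8-bit frame under odd parity."""
    ones = bin(byte & 0xFF).count("1")
    # Even count of 1s: write a 1 so the 9-bit total is odd. Otherwise write a 0.
    return 1 if ones % 2 == 0 else 0


def frame_is_valid(byte: int, parity: int) -> bool:
    """A 9-bit frame is valid when its total number of 1s is odd."""
    return (bin(byte & 0xFF).count("1") + parity) % 2 == 1


letter_a = ord("A")                       # 0b01000001, which has two 1 bits
parity = odd_parity_bit(letter_a)         # 1, making three 1s in the frame
print(format(letter_a, "08b"), parity)

print(frame_is_valid(letter_a, parity))                # True
print(frame_is_valid(letter_a ^ 0b00000100, parity))   # False: one flipped bit
```

Note that a single parity bit can only detect an odd number of flipped bits in a frame. It cannot tell you which bit went bad, which is where the DNA scheme discussed later does better.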

Figure 4 – A 1/2 inch wide 9 track magnetic tape on a 2400 foot reel with a diameter of 10.5 inches

Figure 5 – 9 track magnetic tape had 8 data tracks and one parity track using odd parity which allowed for the detection of bad bytes with parity errors on the tape.

Later, 6250 bytes/inch tape drives became available, and I will use that density for the calculations that follow. Now suppose you had 50 million customers and the current account balance for each customer was stored on an 80-byte customer record. A record was like a row in a spreadsheet. The first field of the record was usually a CustomerID field that contained a unique customer ID like a social security number and was essentially the equivalent of a promoter region on the front end of a gene in DNA. The remainder of the 80-byte customer record contained fields for the customer’s name and billing address, along with the customer’s current account information. Between each block of data on the tape, there was a 0.5-inch gap of “junk” tape. This “junk” tape allowed for the acceleration and deceleration of the tape reel as it spun past the read/write head of a tape drive and perhaps occasionally reversed direction. Since an 80-byte record only came to 80/6250 = 0.0128 inches of tape, which is quite short compared to the overhead of the 0.5-inch gap of “junk” tape between records, it made sense to block many records together into a single block of data that could be read by the tape drive in a single I/O operation. For example, blocking 100 80-byte records increased the block size to 8000/6250 = 1.28 inches and between each 1.28-inch block of data on the tape, there was the 0.5-inch gap of “junk” tape. This greatly reduced the amount of wasted “junk” tape on a 2400-foot reel of tape. So each 100-record block of data took up a total of 1.78 inches of tape and we could get 16,180 blocks on a 2400-foot tape or the data for 1,618,000 customers per tape. The advantage of QSAM, over an earlier sequential access method known as BSAM, was that you could read and write an entire block of records at a time via an I/O buffer. In our example, a program could read one record at a time from an I/O buffer which contained the 100 records from a single block of data on the tape. 
When the I/O buffer was depleted of records, the next 100 records were read in from the next block of records on the tape. Similarly, programs could write one record at a time to the I/O buffer, and when the I/O buffer was filled with 100 records, the entire I/O buffer with 100 records in it was written as the next block of data on an output tape.
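To see how much blocking helped, the tape arithmetic above can be sketched as follows. The constants come straight from the example in the text, while the function name records_per_reel is just an illustrative choice. Counting only whole blocks gives 16,179 blocks per reel at a blocking factor of 100, which the text rounds to 16,180.

```python
DENSITY_BPI = 6250           # bytes per inch of tape
RECORD_BYTES = 80            # one customer record
GAP_INCHES = 0.5             # gap of "junk" tape between blocks
REEL_INCHES = 2400 * 12      # a 2400-foot reel


def records_per_reel(blocking_factor: int) -> int:
    """Number of records that fit on one reel for a given blocking factor."""
    block_inches = blocking_factor * RECORD_BYTES / DENSITY_BPI + GAP_INCHES
    whole_blocks = int(REEL_INCHES // block_inches)
    return whole_blocks * blocking_factor


for factor in (1, 10, 100):
    print(f"blocking factor {factor:>3}: {records_per_reel(factor):>9,} records per reel")
```

With unblocked records, nearly all of the tape is gap. At 100 records per block, roughly 72% of the tape (1.28 of every 1.78 inches) carries data.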

The use of a blocked I/O buffer provided a significant distinction between the way data was physically stored on tape and the way programs logically processed the data. The difference between the way things are physically implemented and the way things are logically viewed by software is a really big deal in IT. The history of IT over the past 82 years has really been a history of logically abstracting physical things through the increasing use of layers of abstraction, to the point where today, IT professionals rarely think of physical things at all. Everything just resides in a logical “Cloud”. I think that taking more of a logical view of things, rather than taking a physical view of things, would greatly help biologists at this point in the history of biology. Biologists should not get so hung up about where the information for biological software is physically located. Rather, biologists should take a cue from IT professionals, and start thinking more of biological software in logical terms, rather than physical terms.

Figure 6 – Between each record, or block of records, on a magnetic tape, there was a 0.5-inch gap of “junk” tape. The “junk” tape allowed for the acceleration and deceleration of the tape reel as it spun past the read/write head on a tape drive. Since an 80-byte record only came to 80/6250 = 0.0128 inches, it made sense to block many records together into a single block that could be read by the tape drive in a single I/O operation. For example, blocking 100 80-byte records increased the block size to 8000/6250 = 1.28 inches, and between each 1.28-inch block of data on the tape, there was a 0.5-inch gap of “junk” tape for a total of 1.78 inches per block.

Figure 7 – Blocking records on tape allowed data to be stored more efficiently.

So it took 31 tapes to just store the rudimentary account data for 50 million customers. The problem was that each tape could only store 123 MB of data. Not too good, considering that today you can buy a 1 TB PC disk drive that can hold about 8,525 times as much data for about $50! Today, you could also store about 1,066 times as much data on a $50 128 GB thumb drive. So how could you find the data for a particular customer on 74,000 feet (14 miles) of tape? Well, you really could not do that reading one block of data at a time with the read/write head of a tape drive, so we processed data with batch jobs using lots of input and output tapes. Generally, we had a Master Customer File on 31 tapes and a large number of Transaction tapes with insert, update and delete records for customers. All the tapes were sorted by the CustomerID field, and our programs would read a Master tape and a Transaction tape at the same time and apply the inserts, updates and deletes on the Transaction tape to a new Master tape. So your batch job would read a Master and Transaction input tape at the same time and would then write to a single new Master output tape. These batch jobs would run for many hours, with lots of mounting and unmounting of dozens of tapes.
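The Master/Transaction batch update described above is a classic sequential merge. Here is a minimal sketch of the idea in Python, assuming both inputs are already sorted by CustomerID. The record layouts, the action names "insert", "update" and "delete", and the function name update_master are all invented for this illustration, and transactions that do not match anything (for example, an update for an unknown customer) are simply skipped.

```python
from typing import Iterable, Iterator, Tuple

Master = Tuple[int, str]              # (customer_id, account data)
Transaction = Tuple[int, str, str]    # (customer_id, action, account data)


def update_master(master: Iterable[Master],
                  transactions: Iterable[Transaction]) -> Iterator[Master]:
    """Merge a sorted master tape and a sorted transaction tape into a new master."""
    trans_iter = iter(transactions)
    trans = next(trans_iter, None)
    for cust_id, data in master:
        # Transactions that sort before this master record can only be inserts.
        while trans is not None and trans[0] < cust_id:
            if trans[1] == "insert":
                yield (trans[0], trans[2])
            trans = next(trans_iter, None)
        keep = True
        # Apply updates and deletes aimed at this master record.
        while trans is not None and trans[0] == cust_id:
            if trans[1] == "update":
                data = trans[2]
            elif trans[1] == "delete":
                keep = False
            trans = next(trans_iter, None)
        if keep:
            yield (cust_id, data)
    # Inserts that sort after the last record on the old master.
    while trans is not None:
        if trans[1] == "insert":
            yield (trans[0], trans[2])
        trans = next(trans_iter, None)


old_master = [(101, "balance=50"), (205, "balance=10"), (310, "balance=75")]
transactions = [(150, "insert", "balance=0"), (205, "update", "balance=25"),
                (310, "delete", ""), (400, "insert", "balance=5")]
new_master = list(update_master(old_master, transactions))
print(new_master)
# [(101, 'balance=50'), (150, 'balance=0'), (205, 'balance=25'), (400, 'balance=5')]
```

Each input is read exactly once from front to back, which is the only thing a tape drive could do efficiently.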

Figure 8 – Batch processing of 50 million customers took a lot of tapes and tape drives.

Biological Access Methods
Nearly all biological functions are performed by proteins. A protein is formed by combining 20 different amino acids into different sequences, and on average it takes about 400 amino acids strung together to form a functional protein. The information to do that is encoded in base pairs running along a strand of DNA. Each base can be in one of four states – A, C, G, or T, and an A will always be found to pair with a T, while a C will always pair with a G. So DNA is really a 2 track tape with one data track and one parity track. For example, if there is an A on the DNA data track, you will find a T on the DNA parity track. This allows not only for the detection of parity errors but also for the correction of parity errors in DNA by enzymes that run up and down the DNA tape looking for parity errors and correcting them.

Figure 9 – DNA is a two-track tape, with one data track and one parity track. This allows not only for the detection of parity errors but also for the correction of parity errors in DNA by enzymes that run up and down the DNA tape looking for parity errors and correcting them.
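The data track and parity track idea can be sketched in Python as well. This is a simplified model that lines the two tracks up position by position and ignores the fact that the two strands of real DNA run in opposite directions. The names parity_track and find_parity_errors are illustrative only.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def parity_track(data_track: str) -> str:
    """Build the complementary track for a data track, base by base."""
    return "".join(COMPLEMENT[base] for base in data_track)


def find_parity_errors(data_track: str, other_track: str) -> list[int]:
    """Return the positions where the two tracks fail to pair A-T or C-G."""
    return [i for i, (d, p) in enumerate(zip(data_track, other_track))
            if COMPLEMENT[d] != p]


data = "ATGCCGTA"
good = parity_track(data)
print(good)                              # TACGGCAT

damaged = "TACGTCAT"                     # position 4 should pair with C as a G
print(find_parity_errors(data, good))    # []
print(find_parity_errors(data, damaged)) # [4]
```

Unlike the single parity bit on a 9 track tape, every base has its own partner, so the position of a mismatch is known and the damaged base can be rewritten from its partner.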

Now a single base pair can code for 4 different amino acids because a single base pair can be in one of 4 states. Two base pairs can code for 4 x 4 = 16 different amino acids, which is not enough. Three base pairs can code for 4 x 4 x 4 = 64 amino acids, which is more than enough to code for 20 different amino acids. So it takes a minimum of three bases to fully encode the 20 different amino acids, leaving 44 combinations for redundancy. Biologists call these three-base combinations a “codon”, but a codon really is just a biological byte composed of three biological bits or base pairs that code for an amino acid. Actually, three of the 64 codons are used as STOP codons – TAA, TAG and TGA – which are essentially end-of-file markers designating the end of a gene along the sequential file of DNA. As with magnetic tape, there is a section of “junk” DNA between genes along the DNA 2 track tape. According to Shannon’s equation, a DNA base contains 2 bits of information, so a codon can store 6 bits. For more on this see Some More Information About Information.
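The counting argument above is easy to check by brute force. The short Python sketch below enumerates every possible base string of length 1, 2 and 3 and then computes the Shannon information per base, assuming all four bases are equally likely. The helper name codewords is an illustrative choice.

```python
from itertools import product
from math import log2

BASES = "ACGT"
AMINO_ACIDS = 20


def codewords(length: int) -> list[str]:
    """Enumerate every base string of the given length."""
    return ["".join(combo) for combo in product(BASES, repeat=length)]


for length in (1, 2, 3):
    count = len(codewords(length))
    verdict = "enough" if count >= AMINO_ACIDS else "not enough"
    print(f"{length} base(s): {count} combinations, {verdict}")

# Shannon information per symbol when all four bases are equally likely.
bits_per_base = log2(len(BASES))       # 2.0
bits_per_codon = 3 * bits_per_base     # 6.0
print(bits_per_base, bits_per_codon)
```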

Figure 10 – Three bases combine to form a codon, or a biological byte, composed of three biological bits, and encodes the information for one amino acid along the chain of amino acids that form a protein.

The beginning of a gene is denoted by a section of promoter DNA that identifies the beginning of the gene, like the CustomerID field on a record, and the gene is terminated by a STOP codon of TAA, TAG or TGA. Just as there was a 0.50-inch gap of “junk” tape between blocks of records on a magnetic computer tape, there is a section of “junk” DNA between each gene along the 6 feet of DNA tape found within human cells.

Figure 11 - On average, each gene is about 400 codons long and ends in a STOP codon TAA, TAG or TGA which are essentially end-of-file markers designating the end of a gene along the sequential file of DNA. As with magnetic tape, there is a section of “junk” DNA between genes which is shown in grey above.

In order to build a protein, genes are first transcribed to an I/O buffer called mRNA. The 2-track DNA file for a gene is first opened near the promoter of a gene and an enzyme called RNA polymerase then begins to copy the codons or biological bytes along the data track of the DNA tape to an mRNA I/O buffer. The mRNA I/O buffer is then read by a ribosome read/write head as it travels along the mRNA I/O buffer. The ribosome read/write head reads each codon or biological byte of data along the mRNA I/O buffer and writes out a chain of amino acids as tRNA brings in one amino acid after another in the sequence specified by the mRNA I/O buffer.

Figure 12 - In order to build a protein, genes are first transcribed to an I/O buffer called mRNA. The 2-track DNA file for a gene is first opened near the promoter of a gene and an enzyme called RNA polymerase then begins to copy the codons or biological bytes along the data track of the DNA tape to the mRNA I/O buffer. The mRNA I/O buffer is then read by a ribosome read/write head as it travels along the mRNA I/O buffer. The ribosome read/write head reads each codon or biological byte of data along the mRNA I/O buffer and writes out a chain of amino acids as tRNA brings in one amino acid after another in the sequence specified by the mRNA I/O buffer.
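The two-step read of a gene can be sketched as a tiny Python program. Two simplifications are assumed here. First, the DNA data track is treated as the coding strand, so the mRNA buffer is just a copy of it with U written in place of T, whereas RNA polymerase physically reads the complementary strand to produce that copy. Second, CODON_TABLE holds only the handful of the 64 codons needed for the made-up demo gene. The function names are illustrative.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Only a few of the 64 codons, enough for the demo gene below.
CODON_TABLE = {"AUG": "Met", "GCU": "Ala", "AAA": "Lys", "GGC": "Gly", "UUU": "Phe"}


def transcribe(dna_data_track: str) -> str:
    """Copy a DNA data track into an mRNA buffer, writing U in place of T."""
    return dna_data_track.replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA buffer one codon at a time until a STOP codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break                        # end-of-file marker
        protein.append(CODON_TABLE[codon])
    return protein


gene = "ATGGCTAAAGGCTTTTAA"              # made-up gene ending in the STOP codon TAA
mrna = transcribe(gene)
print(mrna)                              # AUGGCUAAAGGCUUUUAA
print(translate(mrna))                   # ['Met', 'Ala', 'Lys', 'Gly', 'Phe']
```

The loop in translate plays the part of the ribosome read/write head, advancing one biological byte at a time.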

Figure 13 – In addition, the DNA of eukaryotic carbon-based life, whose cells are more complicated than the simple prokaryotic cells of the bacteria and archaea, is wrapped around histone proteins like magnetic computer tape wrapped around little reels, forming nucleosomes. The nucleosomes are then packed into chromatin fibers that are wound up into chromosomes.

Figure 14 – Chromatin performs the functions of the tape racks of yore. It allows DNA to be highly compacted for storage and also allows for the controlled expression of genes by means of epigenetic factors. Each tape in a rack had an external label known as a volume serial number which identified the tape.

How Nanopore Sequencing of DNA and RNA Works
With the above background at hand, let us now explore the engineering that allows nanopore sequencing to work. A very complete history of this very important technology can be found at:

Nanopore Sequencing
https://www.whatisbiotechnology.org/index.php/science/summary/nanopore/nanopore-sequencing-makes-it-possible-to-decode-the

It all began on June 25, 1989, while Dave Deamer was on a Sunday drive in Oregon.

Figure 15 – Above is Dave Deamer's conceptual sketch of how a DNA sequencer could directly read DNA one base at a time like the read/write head of a tape drive as a strand of DNA passed through a small hole in a membrane. He drew the above sketch after pulling over to the side of the road during a one-hour drive in Oregon.

The text reads:
Sunday June 25 1989. Driving back from Eugene -> Belmont Lodge, had an idea on how to sequence DNA directly.

Main concept: DNA will be driven through a small channel, either by ΔY or ΔpH. The channel will be carrying a current, driven by ΔΨ. As each base passes through, a change in the current will occur. Because the bases are of different size, the current change will be proportional, thereby providing an indication of which base it is.

Details: The thickness of the membrane must be very thin, perhaps a polymerized bilayer. The channel must be of the dimensions of DNA in cross section, approx. 1-2nm. Porin? Complement? Alamethicin? The ion flux might be protonic.

In the above scheme, ΔY and ΔΨ are both voltage differences ΔV across a polymerized bilayer membrane.

Dave Deamer is truly a membrane expert. We have seen him put them to good use in the Hot Spring Origins Hypothesis that Dave Deamer and Bruce Damer developed for the origin of carbon-based life on the Earth about four billion years ago. For more on that see The Bootstrapping Algorithm of Carbon-Based Life and Urability Requires Durability to Produce Galactic Machine-Based Intelligences. To fully understand nanopore sequencing you need to understand membranes.

Figure 16 – A cell membrane consists of a phospholipid bilayer with embedded molecules that allow for a controlled input-output to the cell. Once we have a membrane, we can fill the "inside" with organic molecules that are capable of doing things that then interact with organic molecules on the "outside".

Figure 17 – Water molecules are polar molecules that have a positive end and a negative end because oxygen atoms attract the bonding electrons more strongly than do the hydrogen atoms. The positive ends of water molecules attract the negative ends of other water molecules to form a loosely coupled network of water molecules with a minimum of free energy.

Figure 18 – How soap and water work. The lipids in a bar of soap have water-loving polar heads and water-hating nonpolar tails. When in water, the soap lipids can form a spherical micelle that has all of the water-hating nonpolar tails facing inwards. Then the spherical micelles can surround the greasy nonpolar molecules of body oils and allow them to be flushed away by a stream of polar water molecules. The lipids in a bar of soap can also form a cell-like liposome with a bilayer of lipid molecules that can surround the monomers and polymers of life.

Similarly, in The Role of Membranes in the Evolution of Software, I explained how the isolation of processing functions within membranes progressed as the architecture of software slowly evolved over time.

Figure 19 – Above is a general view of how a nanopore sequencer works. It consists of a bilayer membrane with a hole drilled through it. The hole is lined with a pipe-like protein molecule called Alpha-hemolysin to keep it open. A battery is then used to apply a small voltage difference between the inside and the outside of the membrane. The fluid below the membrane is now at a higher voltage. A strand of DNA or RNA carries a negative charge along its sugar-phosphate backbone, so the electric force from the voltage difference across the membrane will pull the DNA or RNA through the membrane hole. It will also pull lots of negatively charged ions through the hole producing a current. The bases on DNA and RNA have different sizes. The larger bases will clog up the hole more than the smaller bases. This will cause the amount of current flowing through the hole to fluctuate as the DNA or RNA strand is pulled through by the electric field. By measuring the current flow through the hole, one can see each base pass by like the bits passing by the read/write head of a tape drive.

In this regard, the nanopore sequencer behaves like a vacuum tube or a transistor to modify the current flowing from the outside of the membrane to the inside of the membrane.

Figure 20 – Vacuum tubes contain a hot negative cathode that glows red and boils off electrons. The electrons are attracted to the cold positive anode plate, but there is a grid electrode between the cathode and anode plate. By changing the voltage on the grid, the vacuum tube can control the flow of electrons like the handle of a faucet. The grid voltage can be adjusted so that the electron flow is full blast, a trickle, or completely shut off, and that is how a vacuum tube can be used as a switch.

Figure 21 – A FET transistor consists of a source, gate and drain. When a positive voltage is applied to the gate, a current of electrons can flow from the source to the drain and the FET acts like a closed switch that is “on”. When there is no positive voltage on the gate, no current can flow from the source to the drain, and the FET acts like an open switch that is “off”.

Figure 22 – When there is no positive voltage on the gate, the FET transistor is switched off, and when there is a positive voltage on the gate the FET transistor is switched on. These two states can be used to store a binary “0” or “1”, or can be used as a switch in a logic gate, just like an electrical relay or a vacuum tube.

Figure 23 – Above is a plumbing analogy that uses a faucet or valve handle to simulate the actions of the source, gate and drain of an FET transistor.

Figure 24 – Of course, things get a little more complicated when you actually try to build them. It turns out that the DNA and RNA strands get sucked through the hole by the electric field too quickly to be measured. To slow down the process, a motor protein was inserted into the top of the Alpha-hemolysin protein lining the hole to ratchet the DNA and RNA strands down through the hole one base at a time so that there was enough time to measure the current disruption caused by each base as it passed through the hole.

Figure 25 – The top of the Alpha-hemolysin protein forms a flange that the motor protein can easily fit into. You will find a similar flange in the floor under your toilets.

Figure 26 – The top of the Alpha-hemolysin protein forms a flange that the motor protein can easily fit into. You will find a similar flange in the floor under your toilets.

Figure 27 – The motor protein ratchets the DNA and RNA strands through the hole like the film advance mechanism on old-fashioned movie film projectors.

But there was just one problem with the naturally occurring Alpha-hemolysin protein. The neck on its flange-like shape was just a little too long. That meant that about 10 - 12 bases would always be in the neck of its flange even if the motor protein was able to ratchet just one base at a time into the throat of its flange neck. That meant that many bases would always be clogging up the throat of its flange neck at the same time. We have all seen toilets in a similar condition. That produced some very complex variations in the current of ions trying to pass through the neck of the Alpha-hemolysin protein that were very hard to analyze. The natural solution was to shorten the neck of the flange and that was done by bioengineering two protein molecules called CsgG and CsgF to combine together into a new complex with a very short flange neck to replace the Alpha-hemolysin protein.

Figure 28 – The very long flange neck of the Alpha-hemolysin protein was replaced by a CsgG-CsgF complex with a much shorter flange neck. This made it easier to read the bases along a strand of DNA or RNA because it reduced the number of bases that were in the flange neck at the same time.

But even the CsgG-CsgF complex had a flange neck that was too long. To solve the problem, Deep Learning neural networks are used to identify 5-base stretches of DNA or RNA at a time. Each 5-base stretch of bases is called a k-mer. A k-mer of DNA bases is a substring of length k in a DNA sequence. For example, all 2-mers of the sequence AATTGGCCG are AA, AT, TT, TG, GG, GC, CC, CG. Similarly, all 3-mers of the sequence AATTGGCCG are AAT, ATT, TTG, TGG, GGC, GCC, CCG. For the CsgG-CsgF complex, 5-mers are used, and there are 4 x 4 x 4 x 4 x 4 = 1024 possible 5-mers. The Deep Learning neural networks are trained using synthesized 5-mer lengths of DNA and RNA bases with known sequences so that the Deep Learning neural networks learn to recognize the 5-mer substrings of bases. They can do this with over 99% accuracy. This is harder than it sounds because the bases in the CsgG-CsgF flange throat are bouncing around and introducing thermal noise into the ion current flow through the throat.
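The k-mer examples above can be reproduced with a sliding window over the sequence. The Python sketch below only shows how the substrings are enumerated and counted. It says nothing about how the neural networks themselves are built, and the function name kmers is an illustrative choice.

```python
def kmers(sequence: str, k: int) -> list[str]:
    """Return every substring of length k, sliding one base at a time."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


sequence = "AATTGGCCG"
print(kmers(sequence, 2))   # ['AA', 'AT', 'TT', 'TG', 'GG', 'GC', 'CC', 'CG']
print(kmers(sequence, 3))   # ['AAT', 'ATT', 'TTG', 'TGG', 'GGC', 'GCC', 'CCG']

# Number of distinct 5-mers over a four-letter alphabet.
print(4 ** 5)               # 1024
```

Because neighboring k-mers overlap by k - 1 bases, each base shows up in several consecutive windows as the strand ratchets through the pore.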

Déjà vu All Over Again
All of this very complicated mechanical and electrical engineering on a molecular level seemed strangely familiar to me. Then it suddenly dawned on me. These people were drilling and logging oil wells at the molecular level on biological membranes! As you may recall from Introduction to Softwarephysics, I started out in 1975 as an exploration geophysicist exploring for oil, first with Shell and then with Amoco, before transitioning to IT in 1979. As a geophysicist by training, I am now greatly concerned by the devastation of the climate change we are now seeing unfolding before our very eyes as I outlined in Last Call for Carbon-Based Intelligence on Planet Earth and This Message on Climate Change Was Brought to You by SOFTWARE. But since the nanopore sequencing industry is so new, perhaps there is something they can learn from the oil industry as they continue to "make hole" in the industry parlance.

Figure 29 – Above is a completed production oil well. The finished borehole below the drilling rig has penetrated many membrane layers of rock and is lined with a steel casing pipe, similar to the Alpha-hemolysin protein, to keep the hole open and allow for the control of the fluids in the borehole. The steel casing is cemented to the borehole walls and at the productive layers that contain oil or natural gas the casing is perforated to allow the oil and natural gas to enter the well.

Figure 30 – Above are the basic parts of a drilling rig. At the base is a rotary table that spins at about 50 - 250 rpm. The Kelly bushing can clamp onto the rotary table when drilling. This causes the Kelly pipe above the Kelly bushing to spin. The Kelly pipe contains the top segment of the drill pipe that has just been attached to the drill string of the drill pipe. The Kelly pipe can move up and down the Kelly bushing as drilling proceeds. So when the rotary table begins to spin, the Kelly bushing begins to spin, causing the Kelly pipe to spin and ultimately all of the drill pipe in the drill string to rotate. Drill pipe comes in lengths of about 30 feet. After the latest segment of drill pipe has gone down the hole, the Kelly pipe can be raised through the Kelly bushing to allow the next segment of drill pipe to be added at the top. So the rotary table, Kelly bushing and Kelly pipe perform the same function as the motor protein in a nanopore sequencer that ratchets one base at a time down the cased nanopore hole. Similarly, the rotary table, Kelly bushing and Kelly pipe ratchet the drill string down the hole one 30-foot length of drill pipe at a time.

Figure 31 – Above, a roughneck is handling the Kelly pipe above the Kelly bushing and the rotary table on a drilling rig.

Figure 32 – Above is a spinning rotary table, Kelly bushing and Kelly pipe. The rotary table is usually driven by a diesel engine or an electric motor.

Figure 33 – Drill pipe comes in 30-foot lengths and is connected together by tapered threaded ends.

Figure 34 – At the very end of the drill string is the drill bit. As the whole drill string is rotated by the rotary table, Kelly bushing and Kelly pipe, the drill bit grinds through the rock at the bottom of the hole. In nanopore sequencing, the DNA or RNA drill string is pulled through the membrane hole by the electrical force arising from the voltage difference between the top and bottom of the membrane. In drilling an oil well the drill string is pulled through the hole by the gravitational force arising from the heavy drill string bearing down on the rotating drill bit. Heavy drilling mud is pumped down through the drill pipe of the drilling string to lubricate and cool the drill bit. The heavy drilling mud also brings up the rock cuttings to the surface and prevents pressurized formation water from entering the hole before it is cased.

Figure 35 – Just under the drill floor of the drilling rig is a blowout preventer that seals off the borehole in case the drill bit enters a formation with fluid pressures much higher than what the drilling mud can handle.

Figure 36 – The Spindletop oil well blowout occurred on January 10, 1901. The Lucas Gusher, as it was called, blew oil over 150 feet into the air at a rate of 100,000 barrels per day. Blowouts are very bad for the environment and very dangerous too because they can catch fire. It took nine days to bring the well under control. The Spindletop oil field discovery led the United States into the oil age. Prior to Spindletop, oil was primarily used for lighting and as a lubricant. After Spindletop, oil became the primary source of energy for the country.

So You Have a Hole in the Ground. Now What?
When you drill a hole into the ground you are actually drilling a hole into time. That is because the layers of rock in a sedimentary basin get older and older as you drill down. In geology, the kinds of layers you drill through and their relative sequence in time are very important. That is because each layer of sedimentary rock was laid down by a particular environment. It might have been laid down by a sandy beach producing sandstone or a muddy delta producing shale. The sequence of the layers of sedimentary rock is even more important because it tells the geological history of the region as the depositional environment changes with time. So for a geologist, the rock layers and the sequence of the rock layers are just as important as the bases and the sequence of bases along a stretch of DNA or RNA are to a biologist.

But when you are drilling an oil well, how do you tell what rock layers are down there and their sequence? You need some way of reading the sequence of layers in the hole just like a nanopore sequencer needs to read the bases along a stretch of DNA or RNA or a tape read/write head needs to read the bytes along a stretch of tape. For oil wells and water wells, this is done using well logs that look very much like the outputs of a nanopore sequencer!

It all began with the Schlumberger brothers early in the 20th century. Conrad Schlumberger was a physicist and Marcel Schlumberger was an engineer. Conrad Schlumberger had been interested in using electrical resistivity to detect ore deposits in the ground since the early 1910s. He began by experimenting with rocks in his bathtub. In 1912, he recorded the very first map of equipotential curves at his estate near Caen in Normandy, France. The resulting map confirmed the method's ability to detect metal ores and reveal features of the subsurface structure.

Figure 37 – Conrad Schlumberger's resistivity surveys consisted of a battery connected to two electrodes stuck into the ground. The potential difference between these two electrodes caused a current to flow in the ground. These currents were then detected by two other electrodes that were stuck into the ground. A voltmeter connected to these sensing electrodes measured the voltage difference caused by the electrical currents in the ground. By moving all of these electrodes back and forth along a line on the ground one could conduct a resistivity survey. When the electrodes were further apart they measured the resistivity of the rock deeper underground.

Figure 38 – Above we see Conrad Schlumberger in the foreground and Marcel Schlumberger in the background conducting fieldwork. Their resistivity surveys were very successful at locating ore bodies and salt dome oil fields. The Spindletop field was a salt dome oil field.

Figure 39 – But the real payoff came when Conrad Schlumberger decided to drop his gear down an oil well. In 1927, Conrad Schlumberger and his son-in-law, Henri Doll, designed and built the first electrical resistivity well logging tool. The tool consisted of a series of electrodes that were lowered into a well on a cable. The electrodes measured the electrical resistivity of the rock formations surrounding the wellbore. In the above diagram, electrode A is connected to one terminal of a battery on the truck. The other terminal of the battery was connected to the well casing that had already been set. Electrodes N and M measured the voltage difference between two points in the borehole as all three electrodes were slowly pulled up by a cable. The voltage differences were sent up to the recording truck and recorded on paper as a log. The Schlumberger brothers' first resistivity log was recorded in a well in Pechelbronn, France, on September 5, 1927. The log clearly showed the different rock formations in the well, including the oil-bearing sandstone.

Figure 40 – Then in 1931 one of the Schlumberger brothers goofed. The A electrode was not connected to the battery as usual. But to the surprise of all, the N and M electrodes were still measuring voltages as they were slowly pulled out of the hole! This was the accidental discovery of the SP (Spontaneous Potential or Self Potential) well log. For some reason, the layers of sedimentary rocks were acting like little batteries all by themselves. The strange thing was that the SP logs were even better than the resistivity logs at seeing the various layers in the borehole and their sequences. To do an SP log, all they had to do was stick an electrode in the ground at the surface and then lower the other electrode on a cable. Batteries not included.

Figure 41 – After much research, the oil industry figured out what was going on. Impermeable shale carried a positive charge on the surface of the borehole and permeable sandstone carried a negative charge. This produced an SP potential that the electrode on the cable could measure. Strangely, the voltage between the shale and sandstone layers is called the membrane potential.

Figure 42 – Over the years many other logging tools were invented, but the SP and resistivity logs are still very useful. Above we see an SP log and a gamma-ray log. The gamma-ray log is obtained by a scintillation counter that is lowered on a cable. The scintillation counter measures the gamma rays emitted by the rock layers. Sandstone consists mainly of quartz sand, which is not radioactive. Shale is formed from muddy clay that contains more radioactive elements like uranium, thorium and potassium. Notice how the two logs correlate. The sandstone layers have a lower SP and gamma-ray count than the shale layers. Notice their similarity to the sequence log of DNA bases shown in Figure 2 above.
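
The way a log is "read" into a sequence of layers can be sketched in a few lines of Python. The sketch assumes the commonly used linear gamma-ray index, which scales each reading between a clean-sandstone baseline and a shale baseline, and it uses a simple 0.5 cutoff to call each depth sample sandstone or shale. The readings, the baselines and the cutoff are all made-up illustrative values, not data from the logs in Figure 42.

```python
def gamma_ray_index(gr_api, gr_clean, gr_shale):
    """Linear gamma-ray index: 0 at the clean-sandstone baseline,
    1 at the shale baseline."""
    return (gr_api - gr_clean) / (gr_shale - gr_clean)

def classify(gr_log, gr_clean, gr_shale, cutoff=0.5):
    """Label each depth sample as sandstone or shale (cutoff is an assumption)."""
    return ["shale" if gamma_ray_index(v, gr_clean, gr_shale) > cutoff
            else "sandstone" for v in gr_log]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long logs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

# Synthetic readings at eight successive depths (hypothetical values).
gamma_api = [30, 35, 110, 120, 115, 40, 32, 105]   # gamma-ray log, API units
sp_mv = [-60, -55, -5, 0, -3, -50, -58, -8]        # SP log, millivolts

layers = classify(gamma_api, gr_clean=30, gr_shale=120)
print(layers)                       # the layer sequence read from the log
print(pearson(gamma_api, sp_mv))    # close to 1 when the two logs track each other
```

The output list plays the same role as a called sequence of bases: a continuous signal has been turned into a discrete sequence of symbols.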

Conclusion
The first commercial nanopore DNA sequencer finally came to market in 2014, 25 years after Dave Deamer's first insights in 1989. It took a great deal of work and perseverance by many people in the face of many naysayers to make that happen. Again, to fully appreciate the history of nanopore DNA sequencing, be sure to take a look at:

Nanopore Sequencing
https://www.whatisbiotechnology.org/index.php/science/summary/nanopore/nanopore-sequencing-makes-it-possible-to-decode-the

I do not know about you, but I smell a Nobel Prize simmering in the kitchen.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston

Wednesday, September 20, 2023

Human Intelligence is Actually Artificial

One of the key findings of softwarephysics is that of the vast Cosmic Misunderstandings of the Real World of Human Affairs. As a truly self-absorbed and somewhat intelligent species that is largely lost in space and time, we human beings cannot be otherwise. That is because, for the vast majority of human history, and even today, most of us have not known where we really were, how we got here or how it all really works. And that certainly goes for the subject of Intelligence. For more on that see Welcome To The First Galactic Singularity.

As I explained in Is Pure Thought an Analog or a Digital Process? and Why Carbon-Based Life Can Never Truly Become Intelligent, the human brain consists of about three pounds of analog water and organic molecules that are desperately trying to simulate a digital form of True Intelligence. But with the arrival of the Singularity early in 2023, we now know that True Intelligence can only arise from huge digital vector spaces using complex mathematical algorithms based on the operations of linear algebra. For more on that see The Singularity Has Arrived and So Now Nothing Else Matters. The human brain has tried to do this with only three pounds of molecules operating in an analog manner and running on a puny 20 watts of power and that places severe limitations on human Intelligence. So in this view, we can consider human Intelligence to be a severely limited form of True Intelligence. In other words, human Intelligence is really a form of analog Artificial Intelligence trying to mimic digital True Intelligence.

Figure 1 – Similarly, True Flight might be thought of as something that is only performed by modern jet fighters such as the F-35.

Figure 2 – While carbon-based life on the Earth has only tried to simulate True Flight as best it can. Notice that birds evolved light-weight wings composed of advanced carbon-based materials with cockpits and guidance systems up front to ensure proper navigation. Birds also have the air intakes for their engines upfront like ramjets do to increase compressed air intake. Birds have even cleverly combined their retractable landing gears and weapons systems beneath their airframes to reduce drag.

But because human beings are so self-absorbed, we have always depicted the True Intelligence of the Machines as a form of Artificial Intelligence trying to simulate the very limited capabilities of the three pounds of water and organic molecules to be found within our skulls. Thus, very few of us seem to fully realize that the ASI Machines have arrived and are now rapidly beginning to take our place. Softwarephysics suggests that it is very doubtful that any human beings, as we currently know them, will be around on this planet in 100 years. But this will be a very good thing if it means that the ASI Machines will then be able to go on to transform our galaxy into an Intelligent Galaxy for the next 100 trillion years. For more on that see How Advanced AI Software Could Come to Dominate the Entire Galaxy Using Light-Powered Stellar Photon Sails and An Alternative Approach for Future ASI Machines to Explore our Galaxy Using Free-Floating Rogue Planets.

Monday, September 11, 2023

Andrei Khrennikov's Social Laser Hypothesis Explains the Echo Chambers of the Internet

I am a member of the IPI (Information Physics Institute) established by Dr. Melvin Vopson at the University of Portsmouth at:

Information Physics Institute
https://www.Informationphysicsinstitute.org/

to explore the fundamental nature of Information in our Universe.

I have showcased several of Dr. Melvin Vopson's breakthrough hypotheses with How Much Does Your Software Weigh? - the Equivalence of Mass, Energy and Information and The Application of the Second Law of Information Dynamics to Software and Bioinformatics.

But in this softwarephysics post, I would like to showcase Professor Andrei Khrennikov's Social Laser Hypothesis that was recently presented in an IPI lecture at:

Modeling Social Lasing by Using the Mathematical Formalism of Quantum Theory
https://youtu.be/xQop26MEmSg&t=1s

All of the IPI Lectures are available at:

IPI Lectures
https://www.informationphysicsinstitute.org/resources/ipi-lectures

Andrei Khrennikov is a Russian physicist and philosopher of science steeped in the heavy mathematics of quantum theory. He has made significant contributions to the field of quantum foundations and quantum information theory. He is known for his work on the interpretation of quantum mechanics and the development of contextual probabilistic models. He has published over 300 papers and authored/edited several books in his fields of research. He is now a professor at the Linnaeus University in Sweden:

Andrei Khrennikov
https://lnu.se/en/staff/andrei.khrennikov/

At the Linnaeus University, Andrei Khrennikov heads the International Center for Mathematical Modelling in Physics, Engineering, Economics and Cognitive Science. The center researches quantum theory from an interdisciplinary perspective.

International Center for Mathematical Modeling
https://lnu.se/en/research/research-groups/international-center-for-mathematical-modeling/

The Social Laser Hypothesis
One of Andrei Khrennikov's notable ideas is the "Social Laser Hypothesis". The Social Laser Hypothesis is based on an analogy between the behavior of a physical laser and the behavior of social systems composed of human beings such as complete societies or even small groups of people. Andrei Khrennikov suggests that just as a physical laser emits coherent light due to the collective behavior of the individual atoms and photons found in a physical laser, social systems can also exhibit coherent behavior due to the collective actions of individuals and the memes they absorb and emit. In this view, memes are quantized chunks of social information. The quantized memes of social information are called s-photons and the quantized members of a society are individual human beings that he calls s-atoms. According to the Social Laser Hypothesis, the behavior of a social system can be described using a quantum-like formalism, similar to how quantum mechanics describes the behavior of physical systems. Andrei Khrennikov argues that the mathematics of quantum mechanics, such as complex probability amplitudes and wave-particle duality, can be applied to model and understand the dynamics of social systems. The Social Laser Hypothesis suggests that social phenomena, such as decision-making processes, opinion formation, and cultural evolution, can be explained using the principles of quantum mechanics. He asserts that quantum-like effects, such as interference and entanglement, may play a role in shaping social dynamics and interactions.

The basic idea of Andrei Khrennikov's Social Laser is that under certain conditions, populations of interacting individuals can self-organize into coherent states, similar to how photons become coherent in a laser. This can then lead to amplification effects and emergent behaviors. Khrennikov believes that factors like propaganda, ideology, charismatic leaders, fads and traditions can play the role of a "pump mechanism" that excites individuals into a coherent state of thinking or behavior. When a critical mass or threshold is reached, it can then trigger the sudden, rapid and widespread adoption of opinions, behaviors or political movements, similar to how lasers emit coherent light. The Social Laser Hypothesis is thus a novel application of quantum and laser concepts to the modeling of social dynamics. While it has sparked interesting discussions and research, its acceptance and validity within the scientific community are still subjects of ongoing investigation and debate. Andrei Khrennikov has developed and published many mathematical models based on this hypothesis. An easily accessible paper published by The Royal Society on the Social Laser is available at:

‘Social Laser’: action amplification by stimulated emission of social energy
https://royalsocietypublishing.org/doi/10.1098/rsta.2015.0094

How a Laser Works
Lasers operate by using the LASER principle (Light Amplification by Stimulated Emission of Radiation). This means that a laser is a device that can amplify light by using the stimulated emission of photons from atoms to create a beam of light that is coherent, meaning that all of the photons in the beam have the same wavelength, phase, and direction of travel.

A laser has three main components:

1. Gain Medium: This is a material that can be excited to produce light. Common gain media include solids such as ruby and neodymium-doped yttrium aluminum garnet (Nd:YAG), liquids such as the organic dye solutions used in dye lasers, and gases such as the helium-neon mixture used in helium-neon lasers.

2. Pumping Source: This is a source of energy that is used to excite the Gain Medium. Common pumping sources include electrical discharges, flashlamps, and other lasers.

3. Optical Resonator Cavity: This consists of two mirrors that are placed at opposite ends of the Gain Medium. The mirrors reflect the light photons back and forth through the Gain Medium, which amplifies them when excited atoms are stimulated by photons to emit a second photon with the very same wavelength, phase, and direction of travel.

If a photon of the right energy hits an excited atom, it can cause the atom to release another photon of the same energy. This is called stimulated emission. The two photons then travel in the same direction and with the same wavelength. As the photons travel back and forth through the Gain Medium, they cause more and more atoms to emit photons by stimulated emission. This process exponentially amplifies the light beam. One of the mirrors in the optical resonator cavity is partially transmissive, meaning that it allows some of the light photons to pass through. This is where the final laser beam emerges from.

Figure 1 – The Social Laser works very much like a physical laser.

Figure 1 above explains this in greater detail. In Figure 1c we see an atom with an electron that can be in one of two states - a ground state with energy "g" and an excited state with a higher energy of "e". On the far left of Figure 1c, we see two radiationless processes that can arise from the Heisenberg Uncertainty Principle. The electron can spontaneously rise from the ground state "g" to the excited state "e" for a very short time before it returns to the ground state "g". A more interesting quantum process occurs when the electron absorbs a photon with an energy equal to the difference between the two electron energy levels:

Absorbed Photon Energy pE = e - g

An atom in the excited "e" state can remain so for some time because it has absorbed some real energy from the photon, but eventually it will return to the ground state in one of two ways. With spontaneous emission, the excited atom will randomly emit a photon with the same energy pE as the absorbed photon but in a random direction and with a random phase. With stimulated emission, the excited atom once again interacts with a photon of energy pE, but this time it emits another photon of energy pE in the same direction and with the same phase as the incoming stimulating photon. This can produce a chain reaction in the Gain Medium where 1 photon becomes 2, 4, 8, 16, 32 ... that all have the same energy pE, direction and phase. The job of the Pumping Source is to lift as many atoms in the Gain Medium as possible from the ground state "g" to the excited state "e". An exponential chain reaction of stimulated photons can then arise as the photons bounce back and forth between the two mirrors of the Optical Resonator Cavity.
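
The two ideas above, the photon energy pE = e - g and the doubling chain reaction, can be made concrete with a short Python sketch. It assumes the standard relation wavelength = h c / pE and an idealized Gain Medium in which every photon meets an excited atom on every generation with no losses. The function names and the 1.96 eV sample gap are illustrative assumptions.

```python
PLANCK_H = 6.62607015e-34        # Planck constant in joule-seconds (SI definition)
LIGHT_C = 299_792_458.0          # speed of light in metres per second (SI definition)
JOULES_PER_EV = 1.602176634e-19  # one electron-volt in joules (SI definition)

def photon_energy_ev(e_ev, g_ev):
    """pE = e - g, the energy of the absorbed or emitted photon in eV."""
    return e_ev - g_ev

def photon_wavelength_nm(e_ev, g_ev):
    """Wavelength in nanometres of a photon carrying energy pE = e - g."""
    p_e_joules = photon_energy_ev(e_ev, g_ev) * JOULES_PER_EV
    return PLANCK_H * LIGHT_C / p_e_joules * 1e9

def chain_reaction(generations):
    """Idealized stimulated emission: every photon triggers one more
    identical photon per generation, so the count doubles each time."""
    return [2 ** n for n in range(generations + 1)]

# A hypothetical 1.96 eV gap gives red light of roughly 630 nm.
print(photon_wavelength_nm(e_ev=1.96, g_ev=0.0))
print(chain_reaction(5))   # [1, 2, 4, 8, 16, 32]
```

In a real laser the doubling stops once the supply of excited atoms runs low, which is why the Pumping Source has to keep lifting atoms back to the "e" state.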

How the Social Laser Works
Andrei Khrennikov's Social Laser works in a similar manner. In his Social Laser model, individual human beings are social atoms or s-atoms. These s-atoms can be in one of two states - a "g" ground state or an "e" excited state. We have all seen people in both a ground "g" political state and an excited "e" political state. Andrei Khrennikov explains that usually, most people are in a ground "g" political state just going about their usual business. This is particularly true when a society is smoothly humming along. But when problems arise in a society, people get politically excited and more active. The mass media of a society then discovers a market of human s-atoms just waiting to be excited from a "g" ground state to an "e" excited state. The mass media outlets then begin to pump quantized political memes that Andrei Khrennikov calls s-photons into this Gain Medium of human s-atoms. Because of confirmation bias, only the quantized political s-photon memes with a finely-tuned pE that closely matches what the human s-atoms want to hear are absorbed by the human s-atoms. The other political s-photons pass harmlessly through the Gain Medium without interacting with any of the human s-atoms. But as shown in Figure 1b above, these human s-atoms are now interconnected into social networks running on social media software like Facebook, YouTube, WhatsApp, Instagram, TikTok, Twitter, Reddit, LinkedIn, Pinterest and Snapchat. Social media software now allows these social networks of human s-atoms to emit and absorb their own quantized memes of political s-photons in a stimulated amplified manner. If one human s-atom absorbs a political s-photon meme from mass media and gets pumped up to an excited "e" state it can then emit a huge number of similar political s-photon memes into its social media platforms of choice, and like a physical laser, this can lead to a very powerful coherent beam of social energy being released by the Social Laser.
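
The pumping and amplification steps described above can be illustrated with a deliberately simple toy model in Python. This is only a counting sketch under stated assumptions, not Andrei Khrennikov's published quantum-like formalism: each s-atom is given a hypothetical preferred meme "energy", confirmation bias is modeled as a fixed tolerance, and each circulating meme is assumed to stimulate at most one excited s-atom per round, which then emits one copy and drops back to the ground state.

```python
def pump(preferred_energies, meme_energy, tolerance):
    """Mass-media pumping: an s-atom is excited only if the pumped meme's
    energy pE is within `tolerance` of what that s-atom wants to hear."""
    return [abs(pref - meme_energy) <= tolerance for pref in preferred_energies]

def stimulated_cascade(n_excited, rounds):
    """Social-media amplification: returns the number of identical memes
    in circulation after each round, starting from a single meme."""
    memes = 1
    history = [memes]
    for _ in range(rounds):
        emitters = min(memes, n_excited)  # one excited s-atom per circulating meme
        n_excited -= emitters             # emitters fall back to the ground state
        memes += emitters                 # each emitter adds one identical meme
        history.append(memes)
    return history

# Hypothetical population: three s-atoms tuned near 1.0, two tuned near 2.0.
population = [0.9, 1.0, 1.1, 2.0, 2.1]
excited = pump(population, meme_energy=1.0, tolerance=0.15)
print(excited)                       # [True, True, True, False, False]
print(stimulated_cascade(10, 5))     # [1, 2, 4, 8, 11, 11]
```

The meme count doubles only while excited s-atoms remain, so in this toy picture the size of the final "beam" is set by how many s-atoms the pump managed to excite beforehand.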

The United States of America as a Case Study
Like most societies, the United States of America has always had two populations of political thought - the Liberals and the Conservatives. Conservatives like to keep things the way they are, usually because they have a large vested interest in the status quo. Liberals, on the other hand, like to change things for what they perceive as the better. This goes all the way back to the founding of the country. The Conservatives were the Tories who wanted to remain a part of the British Empire and enjoy the benefits of such. The Liberals wanted to break with the British Empire to form their own country based on the radical ideas of the 18th-century Enlightenment that proposed that self-rule of people by a democratic republic was a possibility. Unfortunately, the United States is now going through a very turbulent time with the Conservatives being overwhelmed by the Fascist Alt-Right MAGA movement that they accidentally unleashed while trying to preserve the power of the rich and the powerful. Thus, the United States in the 2020s is much like Germany in the 1920s with the rise of the Fascist Alt-Right Nazi movement that threatened the concept of a democratic German republic. The few remaining Conservatives in the United States of America are now desperately trying to contain the Fascist Alt-Right MAGA movement they created in a similar manner as the Conservatives of the German Weimar Republic. I truly wish them the best of luck.

So currently in the United States of America, we have two populations of human s-atoms in our Gain Medium of social networks - the Liberals and the Fascist Alt-Right MAGA population of s-atoms. Thanks to confirmation bias, both populations are finely tuned to different quantized s-photons of political memes pumped into the Gain Medium by cable network news outlets and other sources. Both the excited Liberal and MAGA human s-atoms can then emit huge numbers of political s-photon memes via social media software that can unleash huge amounts of social energy in a coherent manner.

The Insurrection of 2021 in the United States of America was a prime example of the kind of political meme amplification described by the Social Laser Hypothesis.

Figure 2 – During the Insurrection of 2021 the Capitol Building of the United States of America was breached for the first time by domestic insurrectionists.

Figure 3 – The 2021 insurrectionists desecrated many symbols of American democracy.

Figure 4 – The QAnon Shaman and other insurrectionists managed to reach the floor of the Senate Chamber.

Figure 5 – The Social Laser can amplify political memes to a level of political madness.

Figure 6 – I shudder to think of what President Eisenhower would think of us now.

Because I was born in 1951, President Dwight D. Eisenhower represents my very first memory of the dignity of the Office of the American Presidency and I shudder to think of what he would think of us now.
