Saturday, July 27, 2019

A Lesson for IT Professionals - Documenting the Flowcharts of Carbon-Based Life in the KEGG Online Database

Whether you are an IT professional in Applications Development, coding and maintaining complex software, or in Operations, trying to run complex software, you have no doubt experienced the anxiety of starting out on a new IT job or moving to a new area within your current employer's IT Department. Normally, the first thing your new group will do is to show you a chart similar to Figure 1. Figure 1 shows a simplified flowchart for the NAS system. Of course, being new to the group, you barely know what the NAS system does, but you are at least thankful that your new group started you out by showing you a simplified flowchart of the NAS system instead of a detailed flowchart of the NAS system! Unfortunately, most times you will later find out that a detailed flowchart for the NAS system does not even exist. Instead, the details of the NAS system can only be found within the minds of the developers in the Applications Development group as an unwritten oral history. That unwritten oral history has to be slowly learned by figuratively sitting around the weekly campfires of the Applications Development group meetings. Developers also slowly learn parts of the code for the NAS system on their own by working on portions of the NAS system.

Figure 1 – Above is a simplified flowchart for the NAS system.

Unfortunately, IT has never been very good at documenting software. This was most evident when I was in Middleware Operations supporting the software for high-volume corporate websites. When we had a website outage, the Command Center would page out perhaps 10 IT people into an outage call - representatives from the Call Centers, the IT Manager on call, Applications Development, Middleware Operations, Database Operations, Unix Operations, Network Operations and IT Security. As the members of the call dialed in, the Command Center would repeatedly explain what was going on to the new arrivals. Modern complex software is now heavily monitored by monitoring software that displays tons of information on consoles and real-time graphs of processing loads, response times and throughput levels. Usually, during an outage, a few icons on the main console would turn red and then you could drill down a hierarchy of displays that featured real-time graphs at lower and lower processing levels to help figure out what the problem was. Naturally, this all created huge amounts of data that were hard to relate to each other in real time because of the sheer volume. So many times we would then ask Applications Development to explain to the outage call, at a high level, what each troubled software component did and what it depended on, to help the outage team troubleshoot the problem. Of course, many times that required Applications Development to quickly come up with the answers to totally impossible questions, like how many people in the United States of America go by the nickname of "Chuck". But that was not a problem. The members of the outage team would all simply pretend without question that Applications Development could do such things off the top of their heads on an outage call. For more on the career benefits of advanced pretending skills see Hierarchiology and the Phenomenon of Self-Organizing Organizational Collapse.

During the last decade, the monitoring software industry has tried to automate the very complex documentation of software by recording data on how software behaves under processing load and then inferring the high-level and detailed processing flowcharts for the software by induction. Unfortunately, such attempts to determine processing flowcharts by sheer induction often just generated flowcharts in which every software component depended on every other software component, and that was not of much help.
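To see why sheer induction can end up with everything depending on everything, here is a minimal Python sketch. It is not how any particular monitoring product works. It assumes a deliberately naive rule (two components are linked if they were ever busy during the same monitoring window), and the component names and window data are hypothetical.

```python
from itertools import combinations


def infer_edges_by_cooccurrence(windows):
    """Naive induction: link every pair of components that were
    busy during the same monitoring window."""
    edges = set()
    for active in windows:
        for a, b in combinations(sorted(active), 2):
            edges.add((a, b))
    return edges


# Hypothetical data: under heavy load, every component is busy
# in every monitoring window.
components = {"web", "app", "mq", "db"}
windows_under_load = [components, components, components]

edges = infer_edges_by_cooccurrence(windows_under_load)
max_edges = len(components) * (len(components) - 1) // 2
print(len(edges), "of", max_edges, "possible edges inferred")
```

Because all of the components are busy at the same time under load, the naive rule links every possible pair, and the resulting "flowchart" says nothing about which component actually calls which.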

Now, over the past 100 years, biochemists have also been trying to figure out the processing flowcharts of carbon-based life by induction, by deduction and by performing experiments. This flowcharting activity has been greatly enhanced by the rise of bioinformatics. A fine example of this is KEGG (the Kyoto Encyclopedia of Genes and Genomes). KEGG is an online collection of databases detailing genomes, biochemical pathways, diseases, drugs, and biochemical molecules. KEGG is available at:

https://www.genome.jp/kegg/kegg2.html

A good description of KEGG is available on Wikipedia at:

https://en.wikipedia.org/wiki/KEGG
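For IT professionals who would rather explore from a script than from a browser, KEGG also exposes a plain-text REST-style interface. The Python sketch below is only an illustration under stated assumptions: the base URL https://rest.kegg.jp and the list/pathway operation are taken from my understanding of that interface, and the function names are made up for this example. Check KEGG's own documentation and usage terms before relying on it. The two sample lines are for offline illustration and were not downloaded by this code.

```python
from urllib.request import urlopen

# Assumption: KEGG's public REST-style endpoint.
KEGG_REST_BASE = "https://rest.kegg.jp"


def parse_kegg_list(text):
    """Parse tab-separated 'identifier<TAB>description' lines into a dict."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        identifier, _, description = line.partition("\t")
        entries[identifier.strip()] = description.strip()
    return entries


def fetch_reference_pathways(timeout=30):
    """Download the list of reference pathways (needs network access)."""
    with urlopen(f"{KEGG_REST_BASE}/list/pathway", timeout=timeout) as response:
        return parse_kegg_list(response.read().decode("utf-8"))


# Offline illustration using two sample lines in the assumed format.
sample = (
    "map00010\tGlycolysis / Gluconeogenesis\n"
    "map00020\tCitrate cycle (TCA cycle)\n"
)
pathways = parse_kegg_list(sample)
for identifier, name in pathways.items():
    print(identifier, "->", name)
```

Calling fetch_reference_pathways() would return the same kind of dictionary for the full list, which can then serve as a starting index for drilling down into individual pathway flowcharts.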

Figure 2 – Above is a simplified flowchart of the metabolic pathways used by carbon-based life.

Figure 3 – Above is a high-level flowchart of the metabolic pathways used by carbon-based life as presented in the KEGG online database. You can use the KEGG online database to drill down into the above chart and to dig down into the flowcharts for many other biological processes.

I encourage all IT professionals to try out the KEGG online database and drill down into the documented flowcharts of carbon-based life, both to appreciate the complexities of carbon-based life and to see an excellent example of how to document complex information flows. Softwarephysics has long suggested that a biological approach to software is necessary for software to progress to such levels of complexity. For more on that see Agile vs. Waterfall Programming and the Value of Having a Theoretical Framework.

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston