Wednesday, October 28, 2020

The Impact of GPT-3 AI Text Generation on the Development and Maintenance of Computer Software

If you are a busy IT development professional, you probably have not had much time to keep up with the advances being made in AI text generation. But since software source code is just text, you should be paying attention, especially with the arrival of GPT-3 last May. Microsoft has since obtained an exclusive license to GPT-3, so you should definitely be paying attention to how AI-generated source code may affect your job in the near future. To get started, please view the following two YouTube videos.

From Essays to Coding, This New A.I. Can Write Anything
https://www.youtube.com/watch?v=Te5rOTcE4J4

The next one features an interview with GPT-3 itself in a Turing-Test-like manner. It is a little bit scary.

What It's Like To be a Computer: An Interview with GPT-3
https://www.youtube.com/watch?v=PqbB07n_uQ4

How Does GPT-3 Do It?
GPT-3 uses Machine Learning neural networks with tremendous Deep Learning capabilities to produce coherent text of all kinds, including computer source code. All Machine Learning techniques essentially use the concepts of Universal Darwinism - inheritance, innovation and natural selection to work their wonders. All Machine Learning techniques take some input data and then apply an initial model to the data that tries to explain the data. Next, a selection process is applied by first measuring how well the model explains the data and then keeping those models around that do well. The surviving models are then mutated slightly with some innovation. The mutated models are then inherited by the Machine Learning processes into a new generation of models that are then subjected to the same Machine Learning processes all over again to see if any improvements have been made. This Darwinian process of inheritance, innovation and natural selection is then repeated over and over again until a final model is output.
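The loop of inheritance, innovation and natural selection described above can be sketched as a toy program. This is only an illustration under stated assumptions: the model is a straight line (slope, intercept), the data, population size and mutation scale are made up, and the function names are hypothetical. It is not how GPT-3 itself is trained; large neural networks like GPT-3 are adjusted by gradient descent rather than by random mutation.

```python
import random

def fitness(model, data):
    """Negative squared error: higher means the model explains the data better."""
    slope, intercept = model
    return -sum((slope * x + intercept - y) ** 2 for x, y in data)

def mutate(model, rng, scale=0.1):
    """Innovation: copy a parent model with small random changes."""
    slope, intercept = model
    return (slope + rng.gauss(0, scale), intercept + rng.gauss(0, scale))

def evolve(data, generations=200, population_size=20, survivors=5, seed=0):
    rng = random.Random(seed)
    # Initial models are random guesses (an assumption made for this sketch).
    population = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                  for _ in range(population_size)]
    best_scores = []
    for _ in range(generations):
        # Natural selection: rank the models and keep only the best ones.
        population.sort(key=lambda m: fitness(m, data), reverse=True)
        parents = population[:survivors]
        best_scores.append(fitness(parents[0], data))
        # Inheritance plus innovation: children are mutated copies of survivors.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - survivors)]
        population = parents + children
    best_model = max(population, key=lambda m: fitness(m, data))
    return best_model, best_scores

# Made-up data generated from the line y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(-5, 6)]
best_model, best_scores = evolve(data)
print("best (slope, intercept):", best_model)
```

Because the surviving parents are carried into the next generation unchanged, the best score can never get worse from one generation to the next, which is the "keeping those models around that do well" step in miniature.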

Developers writing computer source code manually today also use these same concepts of Universal Darwinism to produce code. For example, developers rarely code up the source code for new software from scratch. Instead, developers take old existing code from previous applications, from the applications of others in their development group, or perhaps even from the Internet itself as a starting point, and then use the Darwinian processes of inheritance, innovation and natural selection to evolve the software into the final product. The developer currently does this with a very tedious manual process of:

Borrow some old code → modify code → test → fix code → test → Borrow some more old code → modify code → test → fix code → test ....

To understand how GPT-3 simulates the above process, begin with some excellent YouTube videos by Luis Serrano. If you are totally new to Machine Learning, begin with this first video, which covers all of the major machine learning approaches currently being followed:

A Friendly Introduction to Machine Learning
https://www.youtube.com/watch?v=IpGxLWOIZy4

The next video covers neural networks and the Deep Learning that large neural networks can perform. GPT-3 relies on a very large Deep Learning neural network that has read a large portion of the text on the Internet and has taught itself how to talk, much as a human toddler does by listening to the people around it.

A Friendly Introduction to Deep Learning and Neural Networks
https://www.youtube.com/watch?v=BR9h47Jtqyw

The next video explains how recurrent neural networks use feedback loops, like the feedback loops found in electronic circuits and biochemical reactions, to stabilize output.

A Friendly Introduction to Recurrent Neural Networks
https://www.youtube.com/watch?v=UNmqTiOnRfg

The next video explains how generative neural networks with positive feedback can be paired with neural networks with negative feedback. This same mechanism is used to control the expression of genes stored in DNA.

A Friendly Introduction to Generative Adversarial Networks (GANs)
https://www.youtube.com/watch?v=8L11aMN5KY8

With the above background, next take a look at the following excellent posts by Jay Alammar.

Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/

The Illustrated Transformer
https://jalammar.github.io/illustrated-transformer/

Finally, read this last post, which explains how GPT-3 puts all of the above to work.

How GPT3 Works - Visualizations and Animations
http://jalammar.github.io/how-gpt3-works-visualizations-animations/
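To get a feel for the basic idea of generating text one word at a time, here is a minimal sketch, assuming a tiny made-up training text and hypothetical function names. It simply counts which word most often follows each word and then extends a prompt with the most frequent follower. GPT-3 does something vastly more sophisticated with a Transformer network and billions of parameters, so treat this only as a toy illustration of predicting the next word from text that has already been read.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, prompt, max_words=5):
    """Extend a non-empty prompt one word at a time with the most frequent follower."""
    output = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)

# Made-up training text for this sketch.
corpus = "test the code then fix the code then test the code again"
model = train_bigrams(corpus)
print(generate(model, "fix", max_words=3))  # prints "fix the code then"
```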

How Will GPT-3 Affect My Programming Job?
I don't think that GPT-3 has brought us to the Software Singularity yet, but we are getting closer. Here is a YouTube video that seems to have a good take on the subject:

GPT3 - Will AI replace programmers?
https://www.youtube.com/watch?v=u5MmL3nqvfE

One commenter wisely noted, "I'm not scared about GPT-3. I'm scared about GPT-5."

But I can imagine Microsoft incorporating GPT-3 into their Visual Studio IDE. Since GPT-3 is much too large to run on a single PC and needs to run in the Cloud, Microsoft Visual Studio would need to run in a metered manner with Cloud support for GPT-3, GPT-4, GPT-5... Also, GPT-3 could be fine-tuned for specific computer languages to improve its performance. Currently, GPT-3 generates code based only on whatever code it happened to read on the Internet. So I think that GPT-3 will only slightly alter the normal development cycle by removing many of the tedious aspects of coding that do not require actual thinking. It will probably look more like this:

Generate some code with GPT-3 → modify code → test → fix code → test → Generate some code with GPT-3 → modify code → test → fix code → test ....

I say this because, back in 1985, I began working on an IDE with a built-in code generator called BSDE - the Bionic Systems Development Environment. This was while I was a programmer in the IT department of Amoco. BSDE was an early mainframe-based IDE (Integrated Development Environment like Eclipse or Microsoft Visual Studio) at a time when there were no IDEs. During the 1980s, about 30 programmers in Amoco's IT department used BSDE to grow several million lines of production code by growing applications from embryos in an interactive mode. BSDE would first generate an embryo application for a programmer by reading the genes for the application. The application genes were stored on a sequential file containing the DDL statements for the DB2 tables and indexes that the application was going to use to store data. The embryo was then grown within BSDE by the programmer injecting reusable BSDE skeleton code segments into the embryo and by turning on the embryo's genes to generate SQL code on the fly. This continued until the application grew to maturity and BSDE delivered the completed application into Production. The development process looked something like this:

Generate some code with BSDE → modify code → test → fix code → test → Generate some code with BSDE → modify code → test → fix code → test ....
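The idea of reading DDL "genes" and generating SQL from them can be sketched in a few lines. This is not BSDE's actual implementation, which ran on a mainframe in the 1980s; it is a minimal modern illustration, and the table, column names and function names are all hypothetical. The parser only handles one simple CREATE TABLE statement and would not cope with column types that contain commas, such as DECIMAL(9,2).

```python
import re

def parse_ddl(ddl):
    """Extract the table name and column names from one simple CREATE TABLE statement."""
    match = re.search(r"CREATE\s+TABLE\s+(\w+)\s*\((.*)\)", ddl,
                      re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no CREATE TABLE statement found")
    table = match.group(1)
    # Each comma-separated piece starts with the column name.
    columns = [part.split()[0] for part in match.group(2).split(",") if part.strip()]
    return table, columns

def generate_sql(table, columns):
    """Generate skeleton SQL statements from the parsed table definition."""
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    return {
        "select": f"SELECT {column_list} FROM {table}",
        "insert": f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})",
    }

# Hypothetical DDL for illustration only.
ddl = """
CREATE TABLE EMPLOYEE (
    EMP_ID INTEGER NOT NULL,
    LAST_NAME VARCHAR(30),
    SALARY INTEGER
)
"""
table, columns = parse_ddl(ddl)
sql = generate_sql(table, columns)
print(sql["select"])
print(sql["insert"])
```

The programmer would still modify, test and fix whatever comes out of a generator like this, which is why the cycle above keeps its test and fix steps.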

For more about BSDE take a look at Agile vs. Waterfall Programming and the Value of Having a Theoretical Framework.

How Will GPT-3 Affect the Rest of Society?
Since most white-collar jobs deal primarily with reading and writing text, and that text does not have to be as perfect as computer source code has to be, other white-collar jobs beyond those in IT are probably at more risk from AI text generators than IT jobs are. As I pointed out in Is it Finally Time to Reboot Civilization with a New Release?, Oligarchiology and the Rise of Software to Predominance in the 21st Century, and The Danger of Tyranny in the Age of Software, society will soon need to come to grips with running civilizations in which most people do not work as we currently know it. In that regard, perhaps your next job interview will be conducted by a GPT-3 avatar!

Comments are welcome at scj333@sbcglobal.net

To see all posts on softwarephysics in reverse order go to:
https://softwarephysics.blogspot.com/

Regards,
Steve Johnston
