AI Connections 2024: Part 1: Introduction

A nine-part series on AI Connections 2024, building a terrain map and posing research questions that The Fragile Sea will track during the year. Commercial exploitation, how we might use AI, rapid development vectors, and risks are also discussed.

Pre-dawn Pataua river 5:50:06 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

1. The series - what is it about?

Welcome to this nine-part series on AI Connections, 2024. The intention is to contribute to assembling a coherent map of AI, as a matter of record; to organise logically the vast amount of research and rapid progress in AI, noting connections and research questions into 2024; and thereafter to see how these questions might be progressed and resolved during the year.

The series approaches AI from the following vectors:

Part 1: Introducing the series – this post.

Part 2: Technologies – systems, limitations, exploration avenues  

Part 3: Commercial uses - breakthroughs, opportunities, investments, industries, services, workforce

Part 4: Neural architectures and sentience – the theory of mind, neurobiological alignment, consciousness

Part 5: Meaning, Language, and Data - linguistics, semantics, psychology, philosophy, data sources

Part 6: Ethics, oversight and legal – ethics, control, regulation, copyright

Part 7: Media and social – companion AI, publishing, AI in the newsroom, fiction, esports, metaverse, social media, the arts

Part 8: Future humanity – solving humanity’s main resource issues, human benefits, human enhancements, transhumanism

Part 9: Summaries and wrap-up – a concise revisit of main points from the series.

In this post introducing the series, the focus will be on context and background. It runs a little longer than the parts that follow; we trust that will not detract from your engagement and enjoyment of the series.

Over the year, the status and progress of the research questions raised will be reported. Thereafter, every year around the same time, we will review how the questions posed have been answered or have changed during the year, and set new research questions for the succeeding year.

This is a living series: any inaccuracies are entirely the fault of the author and will be amended as needed in the bi-weekly newsletters throughout the year.

May you find the series, along with the continuing newsletters and research notes here at The Fragile Sea, enjoyable, valuable, and rewarding of your time. You can subscribe to the free bi-weekly newsletter here.

Photo of pre-dawn Pataua river 5:50:15 am Jan 1st 2024, Northland, New Zealand
Pre-dawn Pataua river 5:50:15 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

2. AI - looking back

The first book I recall purchasing on AI was in 1995, though I remember reading other works earlier: ‘AI: The Tumultuous History of The Search for Artificial Intelligence’ [1], released in 1993, written by the Canadian entrepreneur and researcher in artificial intelligence and image processing, Daniel Crevier, PhD.

Picking up the book again is to recall how thrilling it was to read the first time, detailing the “dramatic successes and equally dramatic failures of a half-century search for AI.” That book sent me on a journey: since then, it has been a personal fascination to follow the course of AI history.

We have a wealth of sources to draw on: papers, books, articles, studies, and online sources, including AI institutes and author rankings by publications [2], the US National AI research institutes [3], global universities, institutes, and their courses and programmes [4], and tens of prominent and obscure hubs and sources for reports, surveys, courses, and programmes of research and development. AI is already big business in media.

From this, one might easily assume that the current AI ‘wave’ has been a long time coming. The history of AI is luminous with key innovations along the way, accompanied by AI Winters, dashed expectations, and very clever breakthroughs, a large portion of which have been modelled on human neuroscience.

So, welcome, let us begin with background.

3. From whence we came: the shoulders of giants

Daniel Crevier’s book goes back to the god Amon in Ancient Egypt around 800 B.C. (!) and then brings us through luminaries in the nineteenth and early twentieth centuries, such as Leonardo Torres Quevedo, who built relay-activated devices to play chess at the turn of the century, and pioneers at the intersection of mathematics, computing, networks, and AI: Alan Turing, Marvin Minsky, Bertrand Russell, John von Neumann, David Hilbert, Claude Shannon, and others.

We should acknowledge also Ada Lovelace, Grace Hopper, Katherine Johnson, and other more recent female and male coding, computing, and AI experts [5], [6], [7]. The 75-page 'Annotated History of Modern AI and Deep Learning’ by Jürgen Schmidhuber details the contributions of experts in logic, math, and computing [8]. Any such list is inevitably unfair to whoever is unjustly left out; the Wikipedia 'Timeline of artificial intelligence' captures the extraordinary history of AI from antiquity to the present [9].

Taking account also of logic and neurobiology, it is worth noting the following high points:

  • Gottfried Wilhelm Leibniz (1646-1716) first developed the chain rule of calculus, for differentiating composed functions [10], [12]; a related chain rule in probability calculates “any component of the joint distribution of a set of random variables using only conditional probabilities” [11]. The calculus chain rule is employed in AI, particularly in backpropagation (of which more below; a short sketch of it at work follows this list).
  • Santiago Ramón y Cajal received the Nobel Prize in 1906, with Camillo Golgi, for uncovering the microscopic ‘tree-like’ structure of the brain, brain cells, and neurons [13].
  • The first artificial recurrent neural network (RNN), the ‘Ising model’, was created in 1925 by Wilhelm Lenz and Ernst Ising [9].
  • The first computational model of a neuron was proposed by Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943, and modelled in an electrical circuit [14].
  • In 1949, Donald Hebb wrote 'The Organization of Behavior' [15], in which he argued that if two nerves fire at the same time, the connection between them is strengthened [16] (or very nearly at the same time, as Lisa Feldman Barrett notes in ‘Seven and a Half Lessons About the Brain’ [17]). This principle, Hebbian learning, is a foundational design idea in learning algorithms for artificial neural networks.
  • Paul Werbos, in his 1974 dissertation (later republished in book form as ‘The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting’), first described the process of training artificial neural networks through backpropagation of errors [18]. He was also a pioneer of recurrent neural networks [19].
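
To make the chain rule concrete, here is a minimal sketch in Python (illustrative values only, not anyone's production code): a loss built from three composed stages, differentiated by multiplying the derivative of each stage together, which is exactly the product backpropagation computes layer by layer.

```python
import numpy as np

# The chain rule as backpropagation uses it: the loss L(w) = (sigmoid(w*x) - y)^2
# is a composition of three stages, so dL/dw is a product of three derivatives.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y, w = 2.0, 1.0, 0.5          # one input, one target, one weight
z = w * x                        # stage 1: linear step
a = sigmoid(z)                   # stage 2: activation
L = (a - y) ** 2                 # stage 3: squared error

dL_da = 2 * (a - y)              # derivative of stage 3
da_dz = a * (1 - a)              # derivative of the sigmoid (stage 2)
dz_dw = x                        # derivative of stage 1
dL_dw = dL_da * da_dz * dz_dw    # chain rule: multiply the stage derivatives

print(f"analytic gradient:  {dL_dw:.6f}")

# sanity check against a finite-difference estimate
eps = 1e-6
L_eps = (sigmoid((w + eps) * x) - y) ** 2
print(f"numerical gradient: {(L_eps - L) / eps:.6f}")
```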

Daniel Crevier brings us through what he calls the "golden years of 1956-63", and quotes the organisers of the Dartmouth Conference on Artificial Intelligence as saying “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it” ([1] p. 26).

It must have been the Workshop to which he was referring: Crevier cites the ‘Conference Organiser’, quoted in 1956. The Dartmouth Workshop was held in 1956 at Dartmouth College, Hanover, New Hampshire, and was “widely considered to be the founding event of artificial intelligence as a field” [20], whereas the Dartmouth Conference on AI was first held in 1961 [21].

Perhaps it was optimism that led to a belief that electronic systems, machines, could recreate, and even outdo, human intelligence. In any event, Crevier writes that early AI researchers turned away from trying to create the “workings of the brain in networks of artificial neurons” (electronic circuits), thinking that the processes could be emulated more efficiently by the emerging computer technology (Ibid., p.71).

As we shall see, a focus on symbolic representation (in software) was thought to have led to an extended AI winter. Later, the focus returned to neurobiology as a fundamental design approach underpinning AI systems - apart from one innovation that was thought not to arise in biological brains: backpropagation (BP). The backward propagation of errors is “an algorithm that is designed to test for errors working back from output nodes to input nodes. It is an important mathematical tool for improving the accuracy of predictions in data mining and machine learning” [22]. As we shall see, there are many experts who believe some form of BP exists in human neurobiology.

Photo of low tide, pre-dawn Pataua bar 5:54:06 am Jan 1st 2024, Northland, New Zealand
Low tide, pre-dawn Pataua bar 5:54:06 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

4. The perceptron

Of particular note, in Crevier’s recounting, is Frank Rosenblatt’s creation of the Perceptron, in a paper issued in 1957 at the Cornell Aeronautical Laboratory Inc., entitled ‘The Perceptron: A Perceiving and Recognizing Automaton’ [23], and a year later in a second paper in the Psychological Review, entitled ‘The Perceptron: A Probabilistic Model For Information Storage And Organization In The Brain’ [24].

The perceptron was indeed modelled on the human brain as an artificial neuron and was demonstrated in 1957 using an IBM 704 computer, a 15 (US) ton behemoth using vacuum tube logic circuitry that failed on average every eight hours but was seen at the time as being exceptionally reliable [25]. In July 1958, Science News provocatively headlined an article ‘Perceptron thinks’, asking whether it might replace the human brain [26].
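
As a rough illustration of what a perceptron computes, here is a sketch of the textbook learning rule (not Rosenblatt's hardware or his exact formulation): weighted inputs pass through a threshold, and the weights are nudged only when a prediction is wrong. On a linearly separable task such as logical AND it converges; it would fail on XOR, of which more below.

```python
import numpy as np

# A minimal single-layer perceptron trained on logical AND (linearly separable).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])                     # AND truth table

w = np.zeros(2)
b = 0.0
lr = 0.1

for epoch in range(10):
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0      # threshold activation
        # Rosenblatt-style update: adjust weights only on a mistake
        w += lr * (target - pred) * xi
        b += lr * (target - pred)

print(w, b)                                         # a separating line for AND
print([(1 if xi @ w + b > 0 else 0) for xi in X])   # [0, 0, 0, 1]
```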

The perceptron was initially flawed, as Grace Lindsay recounts, with additional colour, in her recent book ‘Models of the Mind’ ([27] p.71). “The reign of the perceptron” she writes, “was cut short in 1969.” It was a book, ‘Perceptrons: An Introduction to Computational Geometry’ [28], by Rosenblatt’s classmate Marvin Minsky, and Seymour Papert, that put paid to its earliest form of implementation, having the effect, as Crevier also recounts, of “a rock thrown into the buzzing activity of this little pool” ([1] p.105).

The Wikipedia entry on perceptrons is more forthright still, noting that the book was at the centre of a "long-standing controversy in the study of artificial intelligence. It is claimed that pessimistic predictions made by the authors were responsible for a change in the direction of research in AI, concentrating efforts on so-called 'symbolic' systems, a line of research that petered out and contributed to the so-called AI winter of the 1980s when AI's promise was not realized” [29].

But that was not the end of it. As Grace Lindsay recounts, “the period that followed (their book) became known as the dark ages of connectionism… but if the hype was excessive… so too was the backlash”. Multi-layer perceptrons started the rehabilitation; though they are mentioned in Minsky and Papert’s book, Lindsay notes that the authors were dismissive of them. Their book simply notes multilayer networks with a diagram, stating that it “ought to be possible” but “we have not investigated this” ([28] p.206).

The story gets more interesting still: the Wikipedia entry for deep learning does not mention Minsky and Papert’s book at all, nor the limitations of the first perceptron, and commences the section on Rosenblatt with a reference to Charles Tappert, Professor of Computer Science, addressing a 2019 IEEE conference on computational intelligence, in which he said: “Rosenblatt's 1962 book" (‘Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms’ [30]), “introduced a multilayer perceptron (MLP)” [31].

Why is this important? The emphasis in the Wikipedia entry on Rosenblatt’s 1962 book, and a multilayer perceptron, prior to, and without mentioning, Minsky and Papert’s book of 1969, seems to be at odds with both Crevier's and Lindsay’s accounts, and others also concur. A 2017 paper on the origin of deep learning stated that the known limitation was highlighted by Minsky and Papert (1969) when they “attacked the limitations of perceptrons by emphasizing that perceptrons cannot solve functions like XOR… As a result, very little research was done in this area until about the 1980s” [32].

XOR, or exclusive OR, is a widely used mathematical construct in logic gates, cryptography, networks, and AI: “With two inputs, XOR is true if and only if the inputs differ (one is true, one is false). With multiple inputs, XOR is true if and only if the number of true inputs is odd. Some informal ways of describing XOR are 'one or the other but not both', 'either one or the other', and 'A or B, but not A and B'" [33], [34]. This is useful for obfuscation and diffusion in cryptography, for encryption, decryption, error checking, and fault tolerance in networks, and for pseudo-random number generation. It is “extremely common as a component in more complex ciphers” [34], [35].
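
Two small Python illustrations of the properties just described (sketches only): the odd-parity truth table that no single straight line, and hence no single perceptron, can separate, and XOR's self-inverse property, the basis of the XOR cipher (the key byte here is invented for illustration).

```python
# XOR truth table: output is 1 iff the number of 1-inputs is odd.
# Plotted on a plane, the two 1-outputs sit diagonally opposite each other,
# so no single straight line (no single perceptron) can separate the classes.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, a ^ b)      # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0

# XOR's self-inverse property, the basis of the XOR cipher:
message = b"attack at dawn"
key = bytes([0x5A] * len(message))                  # illustrative repeated key byte
cipher = bytes(m ^ k for m, k in zip(message, key))
plain = bytes(c ^ k for c, k in zip(cipher, key))   # XORing with the same key undoes it
assert plain == message
```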

In his 2019 address, Tappert noted that Rosenblatt should be recognised as a father of deep learning, along with Hinton, LeCun, and Bengio, who had just received the Turing Award as the fathers of the deep learning revolution. Considering the achievements made by each of them, Rosenblatt is undoubtedly in that company, as is Turing himself [36]. However, the Wikipedia entry on deep learning is odd in some respects, one of them being that it refers to Geoff Hinton only from 1995 onwards.

In 1986, David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams released a paper in Nature entitled ‘Learning representations by back-propagating errors’ [37], which was the beginning, Lindsay writes, of the “connectionist revival story” ([27] p. 75). It solved a multi-year problem in AI, that is, how to train multi-layer artificial networks. “It remains to this day”, she writes, “the dominant way in which artificial neural networks are trained to do interesting tasks” (Ibid., p. 75). But the origins of backpropagation go back further.

5. Backpropagation - what is it?

BP is one of the foundational pillars of artificial intelligence, “the most successful algorithm used to train artificial neural networks” [38]. Though somewhat more complicated in practice, the basic notion is to compare the network's outputs with the desired outputs and feed the resulting error signal back through the layers, weighting and scoring the iterations to train the model, with the inclusion also of fine-tuning and directed algorithms to modulate and improve results [39], [40]. The algorithm is designed to adjust the weights, minimising the error between actual output and predicted output, through multiple iterations [41].
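
A minimal sketch of that loop in Python (illustrative only; real systems add many refinements): a two-layer network trained by backpropagation on XOR, the very task a single perceptron cannot solve.

```python
import numpy as np

# Two-layer network trained by backpropagation on XOR (a sketch, not production code).
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)   # hidden layer, 4 units
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)   # output layer
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 1.0

for step in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: the chain rule carries the output error back to every weight
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # update each weight in the direction that reduces the error
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(3))   # approaches [[0], [1], [1], [0]] for most initialisations
```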

It was thought that BP was biologically implausible in the human brain, though many papers have questioned this [42], [43], and the paper above notes “In neuroscience, models based on backpropagation have helped to understand how information is processed in the visual system. However, it was not possible to fully rely on these insights, as backpropagation was so far seen as unrealistic for the brain to implement. Our work provides strong confidence to remove such concerns and thus could lead to a series of future works on understanding the brain with backpropagation” [38].

It has also been an area of considerable research: one paper notes “We suspect that the learning algorithm employed by the brain is indeed rather similar to backpropagation” [44]. After considering many algorithms, its authors conclude that the ‘general idea’ that computed error signals inform the update of synaptic strengths is “almost certainly how the brain learns”. Several other papers also discuss its likelihood in neurological processes [44], [45], [46], [47], [48], [49].

Backpropagation has its own unique and fascinating history:

Paul Werbos released his 1974 dissertation in book form in 1994, entitled ‘The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting’ [18].

A paper entitled ‘On the Origin of Deep Learning’ tracks back to Aristotle’s Associationism, 300 BCE, which is a theory that “the mind is a set of conceptual elements that are organised as associations between these elements.” The paper refers to a 1989 paper entitled ‘Theory of the backpropagation neural network’ by Robert Hecht-Nielsen [50], which includes an appendix on a “speculative neurophysiological model illustrating how the backpropagation neural network architecture might plausibly be implemented in the mammalian brain” [32].

Jürgen Schmidhuber’s 75-page 'Annotated History of Modern AI and Deep Learning’ [8] tracks back to Leibniz’s chain rule (1676) and includes his own pivotal role in modern AI developments.

Photo of almost sunrise Pataua river 6:16:04 am Jan 1st 2024, Northland, New Zealand
Almost sunrise Pataua river 6:16:04 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

6. RNN, ANN, and CNN Neural Networks

The progress of neural networks has proceeded through recurrent, artificial, and convolutional neural networks (RNN, ANN, and CNN). As noted, RNNs were invented in 1925 by Wilhelm Lenz and Ernst Ising; the first ‘trainable neural network’ (ANN) is recognised as the perceptron, first demonstrated by Frank Rosenblatt in 1957 [51] (and of course the later multi-layer perceptron).

The CNN architecture was introduced by Kunihiko Fukushima in 1980, known as the "neocognitron” [52]. Other sources note Yann LeCun’s invention of ConvNets in the 1980s: “inspired by the earlier works of Japanese computer scientist Kunihiko Fukushima, ConvNets (are) modelled after the brain’s visual cortex, a part that handles sight” [53]. It is from this stream that AI has learned to recognise and produce images.

It is worth noting the differences between all three architectures, because they have led to the current emergence of AI into a new spring of mainstream awareness:

i) An ANN employs a group of multiple perceptrons at each layer. It is a feed-forward neural network (FFN) because inputs are processed only in the forward direction [54]. Wikipedia notes deep neural networks (DNNs) as a form of ANNs: "There are different types of neural networks but they always consist of the same components: neurons, synapses, weights, biases, and functions. These components as a whole function similarly to a human brain and can be trained like any other machine learning (ML) algorithm… Deep architectures include many variants of a few basic approaches” [55].

ii) Convolutional neural networks (CNN) are popular and employ a variation of multilayer perceptrons based on three main layer types: convolutional, pooling, and fully connected. They exhibit high accuracy in image recognition, though they also have significant limitations: “The first layers focus on interpreting simple features in an image such as its edges and colors. As the image processes through layers, the network can recognize complex features such as object shapes. Finally, the deepest layer can identify the target object” [56]. CNNs are used in computer vision and have been applied to acoustic modelling for automatic speech recognition (ASR) [55].

iii) Recurrent neural networks (RNN) are more complex. They save the output of processing nodes and feed the result back into the model (not in one direction only). This is how the model is said to learn to predict the outcome of a layer. If the network’s prediction is incorrect, then the system self-learns and continues working towards the correct prediction during backpropagation: "An RNN remembers each and every piece of information through time. It is useful in time series prediction only because of the feature to remember previous inputs as well. This is called Long Short Term Memory (LSTM)” [54]. Simplilearn has a good introduction to LSTM, as an RNN that can retain long-term dependencies in sequential data [57]. Other sources are equally good [58], [59], [60], [61]. A small sketch of the recurrence follows this list.
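
Here is a minimal Python sketch of that recurrence (sizes and weights invented for illustration): the same cell is applied at every timestep, and the hidden state is the loop that carries memory forward. An LSTM replaces this simple update with gated updates so that information can survive much longer sequences.

```python
import numpy as np

# A single vanilla RNN cell stepping through a sequence: the hidden state h
# carries information forward in time, which is what lets the network
# 'remember' previous inputs.
rng = np.random.default_rng(1)
d_in, d_hidden = 3, 5
Wx = rng.normal(0, 0.5, (d_in, d_hidden))      # input-to-hidden weights
Wh = rng.normal(0, 0.5, (d_hidden, d_hidden))  # hidden-to-hidden: the recurrence
b = np.zeros(d_hidden)

sequence = rng.normal(0, 1, (7, d_in))         # seven timesteps of 3-dim input
h = np.zeros(d_hidden)
for x_t in sequence:
    h = np.tanh(x_t @ Wx + h @ Wh + b)         # new state depends on the input AND the previous state

print(h)   # a compressed summary of everything seen so far
```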

7. The emergence of large language models (LLMs)

The innovations beyond these types of neural networks have led us to the current emergence of large language models built on Transformers, utilising parallel self-attention (called ‘multi-head’ attention), proposed by Vaswani et al. at Google Brain and released in a paper entitled ‘Attention Is All You Need’, in 2017 [62]. (It was Jakob Uszkoreit, a senior software engineer on the Google team, who came up with the name Transformers. The team paper notes all contributors as being equal.)

Multi-head attention, and the end-to-end processes of transformers, are extremely complex, but very clever. Inputs (characters, words, and graphics, encoded as vectors and transformed into queries, keys, and values), together with their contextual positions relative to other inputs, are fed in repeatedly from a very large dataset and iterated as parallel processes. By concatenating all the results from the parallel processes, outputs are much improved on previous methods, presenting predicted values and positions in context. That’s a full lunch to digest, but it is the essence of the Transformer self-attention breakthrough in LLMs. Multi-head self-attention is the term for multiple parallel processes, each concentrating on its own iterations, and then concatenating the results.
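
For the curious, here is a minimal numpy sketch of the mechanism just described (dimensions and random weights are invented for illustration): each token's query scores every key, softmax turns the scores into weights over the values, and several heads do this in parallel before their results are concatenated.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # each query scores every key; softmax turns scores into weights over the values
    d_k = Q.shape[-1]
    return softmax(Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)) @ V

# Multi-head: split the model dimension into independent heads, attend in
# parallel, then concatenate the results back together.
rng = np.random.default_rng(42)
seq_len, d_model, n_heads = 6, 16, 4
d_head = d_model // n_heads
x = rng.normal(size=(seq_len, d_model))        # token embeddings (positions already added)

Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv

# reshape (seq, d_model) -> (heads, seq, d_head)
split = lambda M: M.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
heads = attention(split(Q), split(K), split(V))            # each head attends independently
out = heads.transpose(1, 0, 2).reshape(seq_len, d_model)   # concatenate the heads
print(out.shape)   # (6, 16)
```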

The limitations of CNNs [53] led to Transformers, as noted in the abstract to the Vaswani et al., paper: “The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder… We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely” [62].

It should be noted, however, that prior to this 2017 paper, an attention mechanism was proposed in 2014 for machine translation by Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio [63], and a mechanism similar to a transformer was proposed in 1992 by Jürgen Schmidhuber [64] (see also [65]).

In relation to graphics and visual attention mechanisms, the paper previously referenced, ‘On the Origin of Deep Learning’ notes that “when humans look at an image, we do not scan it bit by bit or stare at the whole image, but we focus on some major part of it and gradually build the context after capturing the gist". Attention mechanisms were first discussed by Larochelle and Hinton (2010) [66], and Denil et al., (2012) [67] (see also [32]).

Ketan Doshi provides a good overview in a Medium series: ‘Transformers Explained Visually (Part 3): Multi-head Attention, Deep Dive: A Gentle Guide to the Inner Workings of Self-Attention,… in Plain English' [68]. Other good explanations can be found in these references [69], [70], [71].

The capability of inputting images is just one example of how far the innovation has come. An image (transformed as above) gets split into multiple channels, whereupon each channel focuses on certain parts, to “extract edges and gradients, textures and patterns, parts of objects and then finally objects”. It then iterates through multi-head attention. Vision Transformers are explained well in these two references [72], [73].
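
A small sketch of the patch-embedding step at the front of a Vision Transformer (sizes invented for illustration; real ViTs also prepend a class token and add position embeddings before the attention layers): the image is cut into patches, each flattened and linearly projected into a token for multi-head attention to process.

```python
import numpy as np

# Splitting an image into patches and projecting each to an embedding:
# the first step of a Vision Transformer (illustrative sketch).
rng = np.random.default_rng(7)
H = W = 32; C = 3; P = 8                       # 32x32 RGB image, 8x8 patches
image = rng.random((H, W, C))

patches = (image.reshape(H // P, P, W // P, P, C)
                .transpose(0, 2, 1, 3, 4)
                .reshape(-1, P * P * C))       # (16, 192): 16 flattened patches

d_model = 64
W_embed = rng.normal(0, 0.02, (P * P * C, d_model))
tokens = patches @ W_embed                     # (16, 64): one 'token' per patch
print(tokens.shape)
```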

Photo of sunrise Pataua river 6:19:48 am Jan 1st 2024, Northland, New Zealand
Sunrise Pataua river 6:19:48 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

8. The ecosystem around transformer models

Since that paper, a vast ecosystem of large language models, datasets, ancillary testing mechanisms, APIs, cloud user environments, coding tools, input construction (known as ‘prompt engineering’), and training courses has exploded like a big bang, based mainly on the transformer architecture and attention. Fine-tuning of models, parameters, limitations, problems, load balancing, and innovation all proceed at an exceptional pace.

Groundbreaking papers, alongside the Vaswani et al. paper, are:

i) ‘LLaMA: Open and Efficient Foundation Language Models’ (Feb 2023) [74],   

ii) The 85-page “A Survey of Large Language Models” (June 2023; pages 59 to 85 are references) [75], and

iii) This article: “10 AI Research Papers Every AI Enthusiast Should Read” captures a significant cross-section of relevant papers and subject matter [76].

AI has proceeded rapidly from this point, with a torrent of papers, updated LLMs, and new approaches to engineering prompts (inputs), and to methods for more assurance in outputs. The recent AI Spring, however, can only be understood in the context of perceptrons, backpropagation, transformers, and attention.

At a topographic level, LLMs use deep learning, a subset of machine learning, which is itself a subset of AI. Topographic ‘terrain maps’ for AI are discussed further below.

9. Some limitations and issues

But even as backpropagation is pivotal, herein lie some of the issues and limitations with AI large language models built on this history and architecture. Geoff Hinton was quoted in 2017 as saying he was “deeply suspicious” of backpropagation, and that maybe it was time to “throw it all away and start again” [77]. He didn’t think it was the way the brain works. One of the issues is known as catastrophic forgetting, where a machine learning process loses memory of a previous task or process to focus on a present one. Later, in December 2022, he proposed a Forward-Forward algorithm that would replace “the forward and backward passes of backpropagation by two forward passes, one with positive (i.e. real) data and the other with negative data which could be generated by the network itself” [78]. The research continues, aligning neuroscience with AI.
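
A highly simplified, single-layer sketch of the Forward-Forward idea as described in [78] (the data, dimensions, and threshold here are invented for illustration, and Hinton's paper contains much more detail): 'goodness' is the sum of squared activations, and the layer is trained locally, with no backward pass, to give high goodness to positive data and low goodness to negative data.

```python
import numpy as np

# Forward-Forward sketch: two forward passes, positive and negative, and a
# purely local weight update per layer -- no backpropagation of errors.
rng = np.random.default_rng(3)
d_in, d_hidden, theta, lr = 10, 20, 2.0, 0.03
W = rng.normal(0, 0.1, (d_in, d_hidden))

def layer(x):
    return np.maximum(0.0, x @ W)                    # a ReLU layer

for step in range(2000):
    x_pos = rng.normal(+1.0, 1.0, d_in)              # stand-in for real (positive) data
    x_neg = rng.normal(-1.0, 1.0, d_in)              # stand-in for negative data
    for x, is_pos in ((x_pos, True), (x_neg, False)):
        h = layer(x)
        goodness = (h ** 2).sum()
        p = 1 / (1 + np.exp(-(goodness - theta)))    # P(sample is positive), from goodness
        coeff = (1 - p) if is_pos else -p            # logistic-loss gradient w.r.t. goodness
        W += lr * coeff * 2 * np.outer(x, h)         # local update; h is zero where ReLU is inactive

print("goodness, positive-like input:", (layer(rng.normal(+1.0, 1.0, d_in)) ** 2).sum())
print("goodness, negative-like input:", (layer(rng.normal(-1.0, 1.0, d_in)) ** 2).sum())
```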

There are other issues also with LLMs: power requirements, dataset sizes, ethics, copyright, cost, and so forth; this series in 2024 will discuss them all. Kate Crawford’s recent book ‘The Atlas of AI’ [79] is a good place to start.

Even with the Long Short Term Memory functionality of RNNs, maintaining a memory state and combining memories in time from across that state is still a key challenge in large language models, and there is a quadratic growth in cost when scaling certain parameters, such as context length, to obtain better results. Some of the emergent behaviours of larger models are also concerning, if not fascinating; we will discuss these also in other parts of the series and throughout 2024.

From the histories, as noted, symbiotic research in neuroscience and artificial intelligence has proceeded. Grace Lindsay’s book goes on to offer more insights concerning the brain, memory, and AI, including the rehabilitation of Richard Semon’s work on engrams, and their subsequent identification as “multiscale networks of neurons” [80], [81]. We will return to Daniel Crevier’s AI history shortly.

Artificial neurons – perceptrons – continue to be the key factor in modelling human intelligence in artificial neural architecture. A good ‘definitive perceptron tutorial’, and its ongoing relevance to AI, can be found in this reference [82].

Photo of definitely sunrise Pataua river 6:21:52 am Jan 1st 2024, Northland, New Zealand
Definitely sunrise Pataua river 6:21:52 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

10. What about consciousness?

The results have been so surprising, and appear so nearly like human outputs, that it is little wonder the question of consciousness, controversial at the best of times, has returned to prominence. Many experts unequivocally deny that AI, in its current form, is capable of sentience, but are we sure current AI systems don’t have the potential capability, however nascent, to exhibit human consciousness?

As artificial intelligence proceeds rapidly on these foundations, the question of sentience, or consciousness in machines, is unavoidable. It is not so easy to answer, and brings definitional issues, linguistics, semantics, the meaning of meaning, and controversy.

Over the past few years, considerable debate has continued across the spectrum of arguments, from ‘LLMs are just glorified copy-paste recall systems’ to ‘LLMs exhibit near human-like cognition, and we don’t understand how’ (the so-called ‘black-box’ analogy). If we stare at the spectrum long enough, all arguments make compelling points, but none of them has, in my view, at least so far, defeated the others. We must know what we mean when we infer or anthropomorphise consciousness. (To 'anthropomorphise' simply means to attribute human behaviours or characteristics to non-human or inanimate objects.)

11. How do we define consciousness?

When a human can pick up a photo, and instantly recall from many years past, emotions, sights, sounds, tastes, songs, any variety of environmental conditions (wind, rain), and background history, i.e., the ‘story’ of the image, how can that be possible in silicon?

As will be discussed in Part Four: Neural architectures and sentience, humans must come to some agreement on definitions for degrees and types of consciousness - for example, will implanted memories, like Rachael’s (Sean Young) in the movie ‘Blade Runner’, suffice? Or a new form or degree of consciousness? Does it matter whether it resides in carbon biology or silicon? Can it be the same? Can emotions be attached to implanted memories in silicon?

AI anthropomorphism, that is, the tendency to infer human attributes in inanimate objects (characterised as transference), is highly relevant here. As a stand-alone subject, it affords considerable academic study and a wealth of papers, as represented in this paper: 'AI anthropomorphism and its effect on users' self-congruence and self–AI integration' [83].

In Part Four of this series, we look at how close AI is to consciousness, the emergence of unexpected behaviours, and the ‘black box’ analogy. For now, it is understandable to expect that we might make huge mistakes in thinking and acting as though AI is conscious, in attributing AI systems with that status too readily; any attribution of a consciousness classification may have unintended consequences.

Or perhaps we might shortly be deluded into thinking a system has become ‘conscious’, even as it remains controlled behind the scenes. ChatGPT can respond nicely, and it seems that manners can be coded quite readily; in other words, the tone and timbre of responses may not arise purely from predictive iterations over the data. We might, for example, readily attribute a response with more ‘authority’ because it appears to be ‘speaking our language’.

Either way, a powerful potential force has been unleashed, but how will we know? That is where the focus should be: not whether AI will, or won’t, or when, but answering that question, how will we know? We need to arrive at common definitions for degrees of sentience, or at least agree on what forms or degrees of consciousness we are talking about. In Part Four, we will discuss how humans seem to be talking past each other when it comes to consciousness. A new acronym is created - NORK: no one really knows.

Photo of a brand new day Pataua river 6:24:18 am Jan 1st 2024, Northland, New Zealand
A brand new day Pataua river 6:24:18 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

12. Security and sentience are linked

And what of security? One of the rational pathways seems to be that we will need independent verification - almost like a certifying body or bodies. In other words, there may be no complete technical solution within systems, within AI, to safety and verity, nor to determining whether, and what type of, consciousness a system presents; security and consciousness seem to be inextricably linked (see for example [84], [85], [86], [87]).

These are the subjects to be discussed in other parts of this series and throughout 2024.

13. Onwards, with roadblocks in the path

Let us now return to Crevier and beyond, to complete this introduction up to date. After the ‘golden years of 1956-63’, AI research did return, later, to the human brain, and much of the current AI architecture is directly informed by, and designed from, biological neural architecture. Indeed, as we shall see in later parts of this series, brain science is now benefitting in return from AI.

But we are not up to date yet. After these golden years, Crevier then discusses the conquest of microworlds, 1963-70, noting work at MIT and Stanford, which “inspired other efforts.” As Crevier reports, from media accounts at the time, robots were now able to “look through television cameras”, and “interpret what they saw”. (Note: see how easy it is to slip into loaded terminology? What does ‘interpret what they saw’ mean?). He writes also that AI “struck an apparently fatal blow against the now rival science of artificial neural networks, by exposing basic fatal flaws in current neural network research” (Ibid., p.73).

Crevier goes on to discuss the ‘dramatic failure’ of automatic language translation in the late 1960s, the limitations of models, and the disappointments of DARPA in their Speech Understanding Programme with Carnegie Mellon, in a chapter entitled ‘Clouds on the AI Horizon’.

Crevier’s final chapters are luminous, discussing the critical role of games in AI history; the size of human memory and its capacity; the struggle of computers and memory constraints at the time to keep up; the tension between hardware and software; the fascinating concept that AI software scientists jumped into boots too large for them when hardware colleagues provided them with machines they “can’t quite handle.”

Crevier concludes that AI will be mostly beneficial, “seeping into all human activities”, however, “in the long-term AI remains immensely threatening” (Ibid, p. 341). This was 1993.

14. AI winters

After first reading that book, the meme of ‘AI Winters’ emerged in my own lived experience; it now has its own Wikipedia entry, identifying two major ‘dark’ periods (1974-1980 and 1987-1993), along with several smaller episodes [88]. This meme was later challenged, robustly, by researchers, among them computer scientist Ray Kurzweil, in papers and in two notable books, ‘The Age of Spiritual Machines’ (1999) [89] and ‘The Singularity Is Near’ (2005) [90].

There is little doubt that there were several dead-ends, setbacks, and failures, as the Wikipedia entry presents; but note Kurzweil’s contention, from the latter book, that "many observers still think the AI winter was the end of the story, and that nothing since has come of the AI field. Yet today many thousands of AI applications are deeply embedded in the infrastructure of every industry”.

We haven’t, of course, touched on the historic development of other areas in AI, robotics, agent systems and so forth; it is best, then, to attempt to define how AI might be sub-categorised.

Photo of small surf on Pataua bar, 6:51:48 am Jan 1st 2024, Northland, New Zealand
If only it was a little bigger - Pataua bar, 6:51:48 am Jan 1st 2024, Northland, New Zealand | © Brent Smith

15. Towards a taxonomy: defining AI categories

One paper [91] conducted a search between 2010 and 2019 on the development and application of AI in scientific text, by training SciBERT, a “pretrained language model based on BERT" [92], to “address the lack of high-quality, large-scale, labeled scientific data (on AI)” [93].

BERT stands for Bidirectional Encoder Representations from Transformers. It is an open-source machine learning framework for natural language processing (NLP).

Their approach was fascinating. First, they addressed the total number of papers submitted to arXiv from 2010 to 2019, well over one million. They implemented SciBERT classifiers on arXiv metadata and subject labels and considered that, of the 39 computing subjects, six were directly relevant to AI, namely “Artificial Intelligence, Computer Vision, Computation and Language (Natural Language Processing), Machine Learning, Multiagent Learning, and Robotics” (acknowledging that this delineation was ‘contestable’). After training SciBERT on the subsets, they then applied their models to “scientific text in larger corpora: Clarivate Web of Science (WoS), Digital Science Dimensions, and Microsoft Academic Graph (MAG).”

Their results focused on using large language model tools to “demonstrate high-classification performance” but unfortunately, not on producing a data-driven subject sub-categorisation, as they noted: “developing guidelines for labeling publications for AI-relevance would require addressing definitional questions we sidestepped in this work; it would represent a departure from using the implicit delineation of the field provided by arXiv preprints”.

So, it was not useful for a topographical terrain map of AI based on released papers, which would have been nice. Search strategies for AI are a subject of ongoing research: another paper compares a new strategy with “three other recent search strategies of artificial intelligence”, and the authors make their tool available “for other researchers to use and refine” [94]. The six delineators noted above, while useful, don’t capture a complete cross-section of AI areas for research mapping, since one of them, “artificial intelligence”, is a generic, overarching term.

In terms of mapping out the AI terrain, there are four, and no doubt more, good references for subject matter topography, as follows:

  • Jonathan Shriftman has developed ‘A Beginner’s Guide to The Generative AI Infrastructure Stack’, a pyramidal stack of the building blocks of Generative AI, in a deeply knowledgeable article that reflects his already successful commercial forays into AI [95].
  • Andreessen Horowitz provides a useful ‘High-level tech stack: Infrastructure, models, and apps’ in an article entitled ‘Who Owns the Generative AI Platform?’ by Matt Bornstein, Guido Appenzeller, and Martin Casado [96].
  • Ukrainian AI scientist, Viacheslav Osaulenko, has written a reflective, excellent series of articles on AI, Part Four includes a drawn terrain map of AI in the form of a human brain [97], [98].
  • Andreessen Horowitz also provides a lower level ‘Emerging LLM App Stack’ and a detailed explanation which are also very useful, by Matt Bornstein and Rajko Radovanovic [99].

In looking at AI subject mapping, then, from multiple sources, there is no wide agreement on the separation of AI subsets, since they overlap. Nevertheless, a hesitant first attempt at delineating AI into the following sub-fields (alphabetically) is made here; they do appear generally organised as such in several sources:

  • Cognitive computing – (some sources expressly delineate CC as not part of AI)
  • Computer vision
  • Distributed Artificial Intelligence (DAI)
    • Single-agent systems
    • Multi-agent systems
    • AI chatbots, companion AI chatbots
  • Expert systems
  • Future worlds
    • Metaverse environments (education, gaming, virtual reality)
  • Fuzzy logic
  • Machine learning (ML)
    • Deep learning – reinforcement learning, etc
    • Large language models, transformer architecture, attention, etc
    • Neural networks
  • Natural language processing (NLP)
    • Speech translation - different to NLP but both used in combination [100], [101]
    • Chatbots
  • Robotics
    • Brain-computer interface robotics
    • Humanoid and non-humanoid robotics
    • Multi-agent systems
    • Companion AI robotics
    • Industrial robotics (e.g. agricultural, medical, nuclear cleanup, drones, nanobots)

We should note that industrial uses don't always map cleanly to a technology taxonomy, for example, biometric security, such as facial recognition technology, combines tools and systems from a number of subheads in the taxonomy, with non-AI components.

In Part Two: Technologies, we will elaborate and track progress using the taxonomy above as AI sub-heads. They will also be useful in Part 6: Ethics, oversight and legal.

It seems that AI has sprung into mass awareness out of nowhere, with an emergent concern, or perhaps fear, of the imminent arrival of some sort of parallel consciousness, whatever that is, one that could supersede human sentience. That is still to be uncovered, but it should be clear from the above that AI did not come from nowhere, and that we have not suddenly begotten monsters.

The current systems in LLMs, robotics, and other areas, as we will see in Part 2: Technologies, are extraordinarily clever; some incredible minds have created what we are now seeing, standing on the shoulders of a long line of giants. That the systems are not perfect is clear, and there is considerable potential for harm as well as enormous potential for good. But they do represent, in total, a major innovation breakthrough after many false dawns.

16. Up to date, January 2024

In May 2023, when Dr Geoff Hinton announced that he was leaving Google, it was widely reported. He was quoted as saying that he had once thought the idea of "this stuff" becoming more intelligent than humans was way off, perhaps 30 to 50 years away. “Obviously”, he said, “I no longer think that” [102].

He wanted to step away from the body corporate, without criticism, believing that the tech giants were locked in a competition that might be unstoppable. His concern was that the "average person" might not be able to discern what was true anymore. He also noted emergent attributes, that is, unexpected behaviours, from systems that were not entirely transparent as to how the behaviours and outputs came about. These systems have been termed ‘black boxes’ [103], [104], but that characteristic has been disputed by some researchers [105].

Previously, in March 2023, The Future of Life Institute, Cambridge, MA, released an Open Letter calling to “immediately pause for at least 6 months the training of AI systems more powerful than GPT-4” [106]. As of March 22nd, 2023 (after which, apparently, no more signatures were added), 33,709 signatures had been gathered, including Steve Wozniak, Elon Musk, Andrew Yang, and a host of prominent, expert, and ‘average’ persons [107].

Around the same time as the letter was released, Eliezer Yudkowsky, co-founder and research fellow at the Machine Intelligence Research Institute (MIRI), Berkeley, California, went further still. In a widely reported Time interview, he maintained that “If you can’t be sure whether you’re creating a self-aware AI, this is alarming not just because of the moral implications of the ‘self-aware’ part, but because being unsure means you have no idea what you are doing and that is dangerous and you should stop” [108].

To date, there seems little evidence of a pause.

In a profound sense, the concerns of major experts, and the promise of the technology, are both vital. In July 2023, the Guardian reported that Joe Biden had announced major tech firms agreeing to eight measures encouraging responsible practices [109]. Nascent regulatory efforts are stirring, with the ‘EU AI Act: first regulation on artificial intelligence’ [110], and the US Congress being urged ‘all hands on deck’ to advance comprehensive legislation [111].

This is a start, even though previously, in March 2023, it was widely reported that tech companies were ditching their AI ethics staff, and some commentators took issue with the Open Letter. One AI expert, Joanna Bryson, noted that “We don't need AI to be arbitrarily slowed, we need AI products to be safe” [112]. Yet it is not clear that there is any consensus around what is ‘safe’. Nor whether safe can stay safe, i.e., can it be controlled?

What about dark use? What about warfare and military use? On August 27th, 2023, Eric Lipton, writing in The New York Times, reported on AI bringing “the robot wingman to aerial combat”, noting that the “Pentagon is starting to embrace the potential of a rapidly emerging technology, with far-reaching implications for war-fighting tactics” [113].

Photo of patterns in the ocean, Pataua, 06:35:18 Jan 1st, 2024, Northland, New Zealand
Patterns in the ocean, Pataua, 06:35:18 Jan 1st, 2024, Northland, New Zealand | © Brent Smith

17. Landmark EU AI Regulation

In March 2021, the European Union published a proposal entitled 'Laying Down Harmonised Rules On Artificial Intelligence (Artificial Intelligence Act) And Amending Certain Union Legislative Acts' [114].

On December 8th 2023, the European Union agreed on details of the AI Act and the various harmonisations and amendments to existing acts. We will discuss the implications and summarise the relevant sections of the Act in Part 6: Ethics, oversight and legal.

The (EU) AI Act is a positive and leading step and is timely, since in April 2023, researchers at the Centre for the Governance of AI surveyed 13,000 people across 11 countries and found, across the two regions of the United States and Europe, that 91% of respondents agreed "AI is a technology that requires careful management" [115].

We should note also that regulation alone is not sufficient to satisfy a maturity model capable of guiding pursuits that raise questions around ethics, sentience, copyright, and other aspects of AI progress. Copyright wars are not far away, and datasets may be exhausted sooner than anticipated. Maybe we need new thinking around royalties, patent law, and copyright: fair reward for fair use. Maybe also, this is an opportunity to cleanse the world's databases, and make long-forgotten knowledge and wisdom available again.

One influential voice, Marc Andreessen, has a cheerful, optimistic approach to AI, excoriating doomsayers, in a piece entitled ‘Why AI Will Save The World’ [116]. In it, he notes: “What AI offers us is the opportunity to profoundly augment human intelligence to make all of these outcomes of intelligence – and many others, from the creation of new medicines to ways to solve climate change to technologies to reach the stars – much, much better from here.”

This is an important voice since we readily report the downsides to almost everything, and so a well-reasoned view on positivity, from an experienced investor and contributor, needs the space also to breathe. His last paragraph is particularly powerful.

Nevertheless, there will always be competing views, some quite opposite, and a necessary debating space in which to evaluate the benefits and potential drawbacks of AI.

Photo of morning on Pataua beach, 07:12:39 Jan 1st, 2024, Northland, New Zealand
Morning! Pataua beach, 07:12:39 Jan 1st, 2024, Northland, New Zealand | © Brent Smith

18. This year, 2024

It is a luminous time to see how this develops. In the coming months, through my newsletter and the research notes on my website, The Fragile Sea, I seek to communicate, simply and accurately, what I see, as a practical realist, a technologist, a research writer, and a human. We can make it work to benefit humanity, the environment, our outlook as a species, our co-habitation and regeneration of Earth, and of course, our future. We can do it. We can also control it so that it does work for the betterment of our world.

What follows in the next parts of this series are the details that emerge from the taxonomic sub-categories of AI we have identified, in defined areas that prompt research questions to pursue in 2024.

In the coming year, I will write about the technologies, the biological and artificial neurological architectures, the innovations, investments, and opportunities, the workforce realignments almost certain to come, and the social vectors, including ideas about control, ethics, the question of consciousness, and the social paths that we might take, to understand, live with, and guide, what is almost certainly an emergent enabler, and a powerful force.

I am delighted to welcome you as a reader to this series, hopefully to The Fragile Sea bi-weekly newsletter (on many subjects, not only AI), and to the Research Notes as they are released (a good many will be free to read, some are premium).

I would be delighted to welcome you as a subscriber to the free bi-weekly newsletter here.

This completes the Introduction to the nine-part series: ‘AI Connections 2024’.

In part two, we will discuss the technologies currently in the foreground, in the taxonomic areas of AI identified above.

Thank you for reading, Take care, Brent.






[1]:        D. Crevier, ‘AI: the tumultuous history of the search for artificial intelligence’ New York, NY: Basic Books, 1993.

[2]:        AIRankings.org, ‘AIRankings.’ (updated)  https://airankings.org/

[3]:        NSF, ‘National AI Research Institutes.’ May 04, 2023.  https://nsf-gov-resources.nsf.gov/2023-05/AI_Research_Institutes_Map_2023.pdf?VersionId=GtBfiPXUI3e_RePJ6Ub2y5UVfhPdKKct

[4]:        AI Neurons, ‘Global AI Research Institutes,’ neurons.AI (updated) https://neurons.ai/directory/research-institutes/

[5]:        ML Concepts, ‘The Impact of Women in AI and ML: A History of Progress and Innovation.’  Mar, 2023: https://www.linkedin.com/pulse/impact-women-ai-ml-history-progress-innovation-ml-concepts-com/

[6]:        G. L. Great Learning Team, ‘Top 12 AI Leaders and Researchers you Should Know in 2022,’ Great Learning Blog, Nov 2023: https://www.mygreatlearning.com/blog/ai-researchers-and-leaders/

[7]:        J. Parkins IQA, ‘7 Women Leaders in AI, Machine Learning, and Robotics | LinkedIn.’ Sep, 2021: https://www.linkedin.com/pulse/7-women-leaders-ai-machine-learning-robotics-jade-parkins-iqa/

[8]:        J. Schmidhuber, ‘Annotated History of Modern AI and Deep Learning.’ arXiv, Dec. 29, 2022. doi: https://doi.org/10.48550/arXiv.2212.11279.

[9]:        Wikipedia, ‘Timeline of artificial intelligence,’ Wikipedia. 2024:   https://en.wikipedia.org/w/index.php?title=Timeline_of_artificial_intelligence&oldid=1173824592

[10]:      DeepAI, ‘Chain Rule,’ DeepAI, 2023: https://deepai.org/machine-learning-glossary-and-terms/chain-rule

[11]:      Wikipedia, ‘Chain rule (probability),’ Wikipedia, 2024: https://en.wikipedia.org/w/index.php?title=Chain_rule_(probability)&oldid=1165554070

[12]:      Wikipedia, ‘Gottfried Wilhelm Leibniz,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=Gottfried_Wilhelm_Leibniz&oldid=1174056600

[13]:      Wikipedia, ‘Santiago Ramón y Cajal,’ Wikipedia, 2024: https://en.wikipedia.org/w/index.php?title=Santiago_Ram%C3%B3n_y_Cajal&oldid=1167304926

[14]:      W. S. McCulloch and W. Pitts, ‘A logical calculus of the ideas immanent in nervous activity,’ Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, Dec. 1943, doi: https://doi.org/10/djsbj6.

[15]:      D. O. Hebb, ‘The Organization of Behavior A NEUROPSYCHOLOGICAL THEORY’ John Wiley and Sons, 1949.  https://pure.mpg.de/rest/items/item_2346268_3/component/file_2346267/content

[16]:      R. E. Brown, ‘Donald O. Hebb and the Organization of Behavior: 17 years in the writing,’ Molecular Brain, vol. 13, no. 1, p. 55, Apr. 2020, doi: https://doi.org/10/gsn6xd.

[17]:      L. F. Barrett, ‘Seven and a Half Lessons About the Brain’, Main Market edition. Picador, 2021.

[18]:      P. J. Werbos, ‘The roots of backpropagation: from ordered derivatives to neural networks and political forecasting’ in Adaptive and learning systems for signal processing, communications, and control. New York: Wiley, 1994.

[19]:      Wikipedia, ‘Paul Werbos,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=Paul_Werbos&oldid=1146656080

[20]:      Wikipedia, ‘Dartmouth workshop,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=Dartmouth_workshop&oldid=1168445773

[21]:      Wikipedia, ‘Dartmouth Conference,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=Dartmouth_Conference&oldid=1161696727

[22]:      A. Zola and J. Vaughan, ‘What is a backpropagation algorithm and how does it work?,’ Enterprise AI, 2023:   https://www.techtarget.com/searchenterpriseai/definition/backpropagation-algorithm

[23]:      F. Rosenblatt, ‘The Perceptron: A Perceiving and Recognizing Automaton,’ Cornell Aeronautical Laboratory Inc., 1957.  https://blogs.umass.edu/brain-wars/files/2016/03/rosenblatt-1957.pdf

[24]:      F. Rosenblatt, ‘The perceptron: A probabilistic model for information storage and organization in the brain.,’ Psychological Review, vol. 65, no. 6, pp. 386–408, 1958, doi: https://doi.org/10/fg6wr5.

[25]:      Wikipedia, ‘IBM 704,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=IBM_704&oldid=1182402772

[26]:      Science News, ‘‘Perceptron’ Thinks,’ Science News archive, Jul. 19, 1958:   https://www.sciencenews.org/archive/perceptron-thinks

[27]:      G. Lindsay, ‘Models of the Mind: How Physics, Engineering and Mathematics Have Shaped Our Understanding of the Brain’ Bloomsbury Sigma, 2022.

[28]:      M. Minsky and S. Papert, ‘Perceptrons: an introduction to computational geometry’ Expanded ed. Cambridge, Mass: MIT Press, 1988.

[29]:      Wikipedia, ‘Perceptrons: an introduction to computational geometry,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=Perceptrons_(book)&oldid=1159221874

[30]:      F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, 1962.  https://books.google.ie/books/about/Principles_of_Neurodynamics.html?id=7FhRAAAAMAAJ&redir_esc=y

[31]:      C. C. Tappert, ‘Who Is the Father of Deep Learning?,’ in 2019 International Conference on Computational Science and Computational Intelligence (CSCI), Dec. 2019, pp. 343–348. doi: https://doi.org/10/gn7j5x.

[32]:      H. Wang and B. Raj, ‘On the Origin of Deep Learning.’ arXiv, Mar. 02, 2017. http://arxiv.org/abs/1702.07800

[33]:      Wikipedia, ‘Exclusive or,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=Exclusive_or&oldid=1173105833

[34]:      Wikipedia, ‘XOR gate,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=XOR_gate&oldid=1174826100

[35]:      Wikipedia, ‘XOR cipher,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=XOR_cipher&oldid=1145752501

[36]:      I. Sample, ‘Race to AI: the origins of artificial intelligence, from Turing to ChatGPT,’ The Guardian, Oct. 28, 2023. https://www.theguardian.com/technology/2023/oct/28/artificial-intelligence-origins-turing-to-chatgpt

[37]:      D. E. Rumelhart, G. E. Hinton, and R. J. Williams, ‘Learning representations by back-propagating errors,’ Nature, vol. 323, no. 6088, Art. no. 6088, Oct. 1986, doi: https://doi.org/10/cvjdpk.

[38]:      Y. Song, T. Lukasiewicz, Z. Xu, and R. Bogacz, ‘Can the Brain Do Backpropagation? —Exact Implementation of Backpropagation in Predictive Coding Networks,’ Adv Neural Inf Process Syst, vol. 33, pp. 22566–22579, 2020,  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7610561/

[39]:      M. A. Nielsen, ‘Neural Networks and Deep Learning,’ 2015 http://neuralnetworksanddeeplearning.com

[40]:      A. Al-Masri, ‘Backpropagation in a Neural Network: Explained | Built In.’ Aug, 2023:  https://builtin.com/machine-learning/backpropagation-neural-network

[41]:      Sciencedirect, ‘Backpropagation - an overview | ScienceDirect Topics.’, 2022:  https://www.sciencedirect.com/topics/chemical-engineering/backpropagation

[42]:      I. Pozzi, S. Bohté, and P. Roelfsema, ‘A Biologically Plausible Learning Rule for Deep Learning in the Brain.’ arXiv, Jul. 02, 2019. doi: https://doi.org/10.48550/arXiv.1811.01768.

[43]:      A. H. Marblestone, G. Wayne, and K. P. Kording, ‘Toward an Integration of Deep Learning and Neuroscience,’ Frontiers in Computational Neuroscience, vol. 10, 2016, doi: https://doi.org/10/gcsgr2.

[44]:      J. Campbell, ‘The Considerations of Biological Plausibility in Deep Learning,’ 2022:   https://journals.library.cornell.edu/index.php/CURJ/article/download/660/618/203

[45]:      T. P. Lillicrap, A. Santoro, L. Marris, C. J. Akerman, and G. Hinton, ‘Backpropagation and the brain,’ Nat Rev Neurosci, vol. 21, no. 6, pp. 335–346, Jun. 2020, doi: https://doi.org/10/ggsc7t.

[46]:      J. C. R. Whittington and R. Bogacz, ‘Theories of Error Back-Propagation in the Brain,’ Trends in Cognitive Sciences, vol. 23, no. 3, pp. 235–250, Mar. 2019, doi: https://doi.org/10/gfwbtx.

[47]:      G. Hinton, ‘How to do backpropagation in a brain,’ Nov. 2014: https://www.cs.toronto.edu/~hinton/backpropincortex2014.pdf

[48]:      Y. Bengio, D.-H. Lee, J. Bornschein, T. Mesnard, and Z. Lin, ‘Towards Biologically Plausible Deep Learning.’ arXiv, Aug. 08, 2016. doi: https://doi.org/10.48550/arXiv.1502.04156.

[49]:      Y. H. Liu, S. Smith, S. Mihalas, and E. Shea-Brown, ‘Biologically-plausible backpropagation through arbitrary timespans via local neuromodulators,’ 2022: https://openreview.net/pdf?id=jPx7vYUNUCt

[50]:      R. Hecht-Nielsen, ‘Theory of the backpropagation neural network,’ in International 1989 Joint Conference on Neural Networks, 1989, vol. 1, pp. 593–605. doi: https://doi.org/10.1109/IJCNN.1989.118638.

[51]:      L. Hardesty, ‘Explained: Neural networks,’ MIT News, Massachusetts Institute of Technology, Apr. 2017: https://news.mit.edu/2017/explained-neural-networks-deep-learning-0414

[52]:      Wikipedia, ‘History of artificial neural networks,’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=History_of_artificial_neural_networks&oldid=1166512188

[53]:      S. Goled, ‘The beginning of the end for Convolutional Neural Networks?,’ Analytics India Magazine, Mar. 2022: https://analyticsindiamag.com/the-beginning-of-the-end-for-convolutional-neural-networks/

[54]:      Abhishekg25, ‘Difference between ANN, CNN and RNN,’ GeeksforGeeks, 2023:   https://www.geeksforgeeks.org/difference-between-ann-cnn-and-rnn/

[55]:      Wikipedia, ‘Deep learning,’ Wikipedia, 2024: https://en.wikipedia.org/w/index.php?title=Deep_learning&oldid=1178398312

[56]:      M. Memon, ‘ANN vs CNN vs RNN: Neural Networks Guide,’ Levity, 2023: https://levity.ai/blog/neural-networks-cnn-ann-rnn

[57]:      M. Banoula, ‘Introduction to Long Short-Term Memory (LSTM),’ Simplilearn, 2023: https://www.simplilearn.com/tutorials/artificial-intelligence-tutorial/lstm

[58]:      C. Olah, ‘Understanding LSTM Networks,’ colah’s blog, 2015: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

[59]:      J. Brownlee, ‘A Gentle Introduction to Long Short-Term Memory Networks by the Experts,’ MachineLearningMastery.com, Jul. 2021: https://machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/

[60]:      aakarsha, ‘Deep Learning | Introduction to Long Short Term Memory,’ GeeksforGeeks, 2023: https://www.geeksforgeeks.org/deep-learning-introduction-to-long-short-term-memory/

[61]:      S. Saxena, ‘What is LSTM? Introduction to Long Short-Term Memory,’ Analytics Vidhya, 2021:   https://www.analyticsvidhya.com/blog/2021/03/introduction-to-long-short-term-memory-lstm/

[62]:      A. Vaswani et al., ‘Attention Is All You Need.’ arXiv, Dec. 05, 2017. http://arxiv.org/abs/1706.03762

[63]:      D. Bahdanau, K. Cho, and Y. Bengio, ‘Neural Machine Translation by Jointly Learning to Align and Translate.’ arXiv, May 19, 2016. http://arxiv.org/abs/1409.0473

[64]:      J. Schmidhuber, ‘Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks,’ Neural Computation, vol. 4, no. 1, pp. 131–139, Jan. 1992, doi: https://doi.org/10.1162/neco.1992.4.1.131.

[65]:      Wikipedia, ‘Transformer (machine learning model),’ Wikipedia, 2024:  https://en.wikipedia.org/w/index.php?title=Transformer_(machine_learning_model)&oldid=1192805371

[66]:      H. Larochelle and G. E. Hinton, ‘Learning to combine foveal glimpses with a third-order Boltzmann machine,’ in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2010.  https://proceedings.neurips.cc/paper/2010/hash/677e09724f0e2df9b6c000b75b5da10d-Abstract.html

[67]:      M. Denil, L. Bazzani, H. Larochelle, and N. de Freitas, ‘Learning Where to Attend with Deep Architectures for Image Tracking,’ Neural Computation, vol. 24, pp. 2151–2184, Apr. 2012, doi: https://doi.org/10.1162/NECO_a_00312.

[68]:      K. Doshi, ‘Transformers Explained Visually (Part 3): Multi-head Attention, deep dive,’ Medium, Jan. 2021: https://towardsdatascience.com/transformers-explained-visually-part-3-multi-head-attention-deep-dive-1c1ff1024853

[69]:      G. Kalra, ‘Attention Networks: A simple way to understand Multi-Head Attention,’ Medium, Jul. 2022: https://medium.com/@geetkal67/attention-networks-a-simple-way-to-understand-multi-head-attention-3bc3409c4312

[70]:      S. Cristina, ‘The Transformer Attention Mechanism,’ MachineLearningMastery.com, Jan. 2023: https://machinelearningmastery.com/the-transformer-attention-mechanism/

[71]:      S. Cristina, ‘How to Implement Multi-Head Attention from Scratch in TensorFlow and Keras,’ MachineLearningMastery.com, Jan. 2023: https://machinelearningmastery.com/how-to-implement-multi-head-attention-from-scratch-in-tensorflow-and-keras/

[72]:      A. Islam, ‘Vision Transformers: An Innovative Approach to Image Processing!,’ The Pythoneers, Feb. 2023: https://medium.com/pythoneers/vision-transformers-an-innovative-approach-to-image-processing-3387c398d67f

[73]:      M. Sharique, ‘Vision Transformer: Attention in Images,’ Analytics Vidhya, Jul. 2021: https://medium.com/analytics-vidhya/vision-transformer-attention-in-images-a86d0a8adc1b

[74]:      H. Touvron et al., ‘LLaMA: Open and Efficient Foundation Language Models.’ arXiv, Feb. 27, 2023. http://arxiv.org/abs/2302.13971

[75]:      W. X. Zhao et al., ‘A Survey of Large Language Models.’ arXiv, Jun. 29, 2023. http://arxiv.org/abs/2303.18223

[76]:      A. Islam, ‘10 AI Research Paper Every AI Enthusiast Should Read,’ Medium, Jan. 2023: https://ai.plainenglish.io/10-ai-research-paper-every-ai-enthusiast-should-read-cdf3a6cf2cff

[77]:      S. LeVine, ‘Artificial intelligence pioneer says we need to start over,’ Axios, Sep. 2017: https://www.axios.com/2017/12/15/artificial-intelligence-pioneer-says-we-need-to-start-over-1513305524

[78]:      G. Hinton, ‘The Forward-Forward Algorithm: Some Preliminary Investigations.’ arXiv, Dec. 26, 2022.  http://arxiv.org/abs/2212.13345

[79]:      K. Crawford, ‘Atlas of AI: power, politics, and the planetary costs of artificial intelligence,’ New Haven; London: Yale University Press, 2021.

[80]:      Picower Institute, ‘Engrams emerging as the basic unit of memory,’ Jan. 2020: https://picower.mit.edu/news/engrams-emerging-basic-unit-memory

[81]:      Cengage, ‘Semon, Richard (1859–1918),’ Encyclopedia.com: https://www.encyclopedia.com/psychology/encyclopedias-almanacs-transcripts-and-maps/semon-richard-1859-1918

[82]:      J. R., PhD, ‘From Basic Gates to Deep Neural Networks: The Definitive Perceptron Tutorial,’ Medium, Apr. 2023: https://towardsdatascience.com/the-definitive-perceptron-guide-fd384eb93382

[83]:      A. Alabed, A. Javornik, and D. Gregory-Smith, ‘AI anthropomorphism and its effect on users’ self-congruence and self–AI integration: A theoretical framework and research agenda,’ Technological Forecasting and Social Change, vol. 182, p. 121786, Sep. 2022, doi: https://doi.org/10/gr9g25.

[84]:      I. Mader, ‘‘Conscious’ or ‘sentient’ AI: Severe Legal, Risk & Security Implications,’ Medium, Sep. 2023: https://medium.com/@IsabellaMader_/conscious-or-sentient-ai-severe-legal-risk-security-implications-d6c3bee5c822

[85]:      M. Comiter, ‘Attacking Artificial Intelligence: AI’s Security Vulnerability and What Policymakers Can Do About It,’ Belfer Center for Science and International Affairs, Aug. 2019: https://www.belfercenter.org/publication/AttackingAI

[86]:      R. Morrison, ‘LaMDA is not sentient but human-like AI poses an ‘increasing security risk,’’ Tech Monitor, Jun. 2022: https://techmonitor.ai/technology/ai-and-automation/human-level-speech-ai-poses-increasing-security-risk

[87]:      Mailchimp, ‘From Automation to Sentience: Understanding the Future of AI for Businesses,’ Mailchimp, 2022: https://mailchimp.com/resources/ai-sentient/

[88]:      Wikipedia, ‘AI winter,’ Wikipedia, 2024: https://en.wikipedia.org/w/index.php?title=AI_winter&oldid=1167794818

[89]:      R. Kurzweil, ‘The age of spiritual machines: when computers exceed human intelligence,’ New York, NY: Penguin Books, 2000.

[90]:      R. Kurzweil, ‘The singularity is near: when humans transcend biology,’ London: Duckworth, 2018.

[91]:      J. Dunham, J. Melot, and D. Murdick, ‘Identifying the Development and Application of Artificial Intelligence in Scientific Text.’ arXiv, May 28, 2020. doi: https://doi.org/10.48550/arXiv.2002.07143.

[92]:      J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, ‘BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,’ in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota: Association for Computational Linguistics, Jun. 2019, pp. 4171–4186. doi: https://doi.org/10/ggbwf6.

[93]:      I. Beltagy, K. Lo, and A. Cohan, ‘SciBERT: A Pretrained Language Model for Scientific Text.’ arXiv, Sep. 10, 2019. doi: https://doi.org/10.48550/arXiv.1903.10676.

[94]:      N. Liu, P. Shapira, and X. Yue, ‘Tracking developments in artificial intelligence research: constructing and applying a new search strategy,’ Scientometrics, vol. 126, no. 4, pp. 3153–3192, Apr. 2021, doi: https://doi.org/10/gh6gq3.

[95]:      J. Shriftman, ‘The Building Blocks of Generative AI,’ Medium, Jul. 2023: https://shriftman.medium.com/the-building-blocks-of-generative-ai-a75350466a2f

[96]:      M. Bornstein, G. Appenzeller, and M. Casado, ‘Who Owns the Generative AI Platform?,’ Andreessen Horowitz, Jan. 2023: https://a16z.com/who-owns-the-generative-ai-platform/

[97]:      V. Osaulenko, ‘AI Territory. Content so far…,’ Medium, Dec. 2020: https://ai-territory.medium.com/ai-territory-content-so-far-2e71d5d26bc6

[98]:      V. Osaulenko, ‘The Map of Artificial Intelligence (2020),’ The Startup, Dec. 2020: https://medium.com/swlh/the-map-of-artificial-intelligence-2020-2c4f446f4e43

[99]:      M. Bornstein and R. Radovanovic, ‘Emerging Architectures for LLM Applications,’ Andreessen Horowitz, Jun. 2023: https://a16z.com/emerging-architectures-for-llm-applications/

[100]:    N. Kumawat, ‘Difference between Natural Language Processing and Speech Recognition,’ InsideAIML, 2020:   https://insideaiml.com/blog/differencebetween-natural-language-processing-and-speech-recognition-1059

[101]:    Stone Water, ‘Is speech recognition part of NLP?,’ Quora, 2021:  https://www.quora.com/Is-speech-recognition-part-of-NLP

[102]:    C. Metz, ‘‘The Godfather of A.I.’ Leaves Google and Warns of Danger Ahead,’ The New York Times, May 01, 2023. https://www.nytimes.com/2023/05/01/technology/ai-google-chatbot-engineer-quits-hinton.html

[103]:    P. J. Blazek, ‘Why we will never open deep learning’s black box,’ Medium, Mar. 2022: https://towardsdatascience.com/why-we-will-never-open-deep-learnings-black-box-4c27cd335118

[104]:    A. Mirza, ‘Decoding the Black Box: Interpretable Machine Learning Models,’ Medium, Jun. 2023: https://levelup.gitconnected.com/decoding-the-black-box-interpretable-machine-learning-models-3af5dee617ea

[105]:    K. Miller, ‘AI’s Ostensible Emergent Abilities Are a Mirage,’ Stanford HAI, May 2023: https://hai.stanford.edu/news/ais-ostensible-emergent-abilities-are-mirage

[106]:    Wikipedia, ‘Eliezer Yudkowsky,’ Wikipedia, Jul. 12, 2023: https://en.wikipedia.org/w/index.php?title=Eliezer_Yudkowsky&oldid=1165050185

[107]:    Future of Life Institute, ‘Pause Giant AI Experiments: An Open Letter,’ Future of Life Institute, Mar. 2023: https://futureoflife.org/open-letter/pause-giant-ai-experiments/

[108]:    E. Yudkowsky, ‘The Open Letter on AI Doesn’t Go Far Enough,’ Time, Mar. 2023: https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/

[109]:    K. Paul, J. Bhuiyan, and D. Rushe, ‘Top tech firms commit to AI safeguards amid fears over pace of change,’ The Guardian, Jul. 21, 2023. https://www.theguardian.com/technology/2023/jul/21/ai-ethics-guidelines-google-meta-amazon

[110]:    European Parliament News, ‘EU AI Act: first regulation on artificial intelligence,’ Jun. 2023: https://www.europarl.europa.eu/news/en/headlines/society/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence

[111]:    C. Lima, ‘Schumer launches ‘all hands on deck’ push to regulate AI,’ Washington Post, Jun. 21, 2023.  https://www.washingtonpost.com/technology/2023/06/21/ai-regulation-us-senate-chuck-schumer/

[112]:    The Daily Upside, ‘Tech Companies Ditch AI Ethics Staff As Panic Rises,’ The Motley Fool, Mar. 2023: https://www.fool.com/investing/2023/03/29/tech-companies-ditch-ai-ethics-staff-as-panic-rise/

[113]:    E. Lipton, ‘A.I. Brings the Robot Wingman to Aerial Combat,’ The New York Times, Aug. 27, 2023.  https://www.nytimes.com/2023/08/27/us/politics/ai-air-force.html

[114]:    EU, ‘Laying Down Harmonised Rules On Artificial Intelligence (Artificial Intelligence Act) And Amending Certain Union Legislative Acts,’ Apr. 22, 2021: https://eur-lex.europa.eu/resource.html?uri=cellar:e0649735-a372-11eb-9585-01aa75ed71a1.0001.02/DOC_1&format=PDF

[115]:    N. Dreksler et al., ‘Preliminary Survey Results: US and European Publics Overwhelmingly and Increasingly Agree That AI Needs to Be Managed Carefully,’ GovAI Blog, Apr. 2023: https://www.governance.ai/post/increasing-consensus-ai-requires-careful-management

[116]:    M. Andreessen, ‘Why AI Will Save the World,’ Andreessen Horowitz, Jun. 2023: https://a16z.com/ai-will-save-the-world/