Age of AI: Everything you need to know about artificial intelligence

AI is appearing in seemingly every corner of modern life, from music and media to business and productivity, even dating. There’s so much it can be hard to keep up — so read on to find out everything from the latest big developments to the terms and companies you need to know in order to stay current in this fast-moving field.

To begin with, let’s just make sure we’re all on the same page: what is AI?

Artificial intelligence, a term often used interchangeably with machine learning, is a kind of software system based on neural networks, a technique that was pioneered decades ago but has only recently blossomed thanks to powerful new computing resources. AI has enabled effective voice and image recognition, as well as the ability to generate synthetic imagery and speech. And researchers are hard at work making it possible for an AI to browse the web, book tickets, tweak recipes, and more.

Oh, but if you’re worried about a Matrix-type rise of the machines — don’t be. We’ll talk about that later!

Our guide to AI has three main parts, each of which we update regularly and all of which can be read in any order:

First, the most fundamental concepts you need to know as well as more recently important ones.
Next, an overview of the major players in AI and why they matter.
And last, a curated list of the recent headlines and developments that you should be aware of.

By the end of this article you’ll be about as up to date as anyone can hope to be these days. We will also be updating and expanding it as we press further into the age of AI.

AI 101

One of the wild things about AI is that although the core concepts date back more than 50 years, few of them were familiar to even the tech-savvy before very recently. So if you feel lost, don’t worry — everyone is.

And one thing we want to make clear up front: although it’s called “artificial intelligence,” that term is a little misleading. There’s no one definition of intelligence out there, but what these systems do is definitely closer to calculators than brains. The input and output of this calculator is just a lot more flexible. You might think of artificial intelligence like artificial coconut — it’s imitation intelligence.

With that said, here are the basic terms you’ll find in any discussion of AI.

Neural network

Our brains are largely made of interconnected cells called neurons, which mesh together to form complex networks that perform tasks and store information. Recreating this amazing system in software has been attempted since the ’60s, but the processing power required wasn’t widely available until 15-20 years ago, when GPUs let digitally defined neural networks flourish. At their heart they are just lots of dots and lines: the dots are data and the lines are statistical relationships between those values. As in the brain, this can create a versatile system that quickly takes an input, passes it through the network, and produces an output. This system is called a model.
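To make the "dots and lines" picture a little more concrete, here is a minimal sketch of a tiny network in plain Python. The layer sizes, the hand-picked weights, and the `forward` helper are all made up for illustration; real networks learn their weights during training and are vastly larger.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, layers):
    """Pass inputs through each layer; a layer is a list of (weights, bias) pairs, one per neuron."""
    values = inputs
    for layer in layers:
        values = [
            sigmoid(sum(w * v for w, v in zip(weights, values)) + bias)
            for weights, bias in layer
        ]
    return values

# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output.
layers = [
    [([0.5, -0.6], 0.1), ([0.9, 0.2], -0.3)],
    [([1.2, -0.8], 0.05)],
]

output = forward([1.0, 0.0], layers)
print(output)
```

The input goes in one end, each "line" scales a value on its way to the next "dot," and a single number comes out the other end.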

Model

The model is the actual collection of code that accepts inputs and returns outputs. The similarity in terminology to a statistical model or a modeling system that simulates a complex natural process is not accidental. In AI, model can refer to a complete system like ChatGPT, or pretty much any AI or machine learning construct, whatever it does or produces. Models come in various sizes, meaning both how much storage space they take up and how much computational power they take to run. And these depend on how the model is trained.

Training

To create an AI model, the neural networks making up the base of the system are exposed to a bunch of information in what’s called a dataset or corpus. In doing so, these giant networks create a statistical representation of that data. This training process is the most computation-intensive part, meaning it takes weeks or months (you can kind of go as long as you want) on huge banks of high-powered computers. The reason for this is that not only are the networks complex, but datasets can be extremely large: billions of words or images that must be analyzed and given representation in the giant statistical model. On the other hand, once the model is done cooking it can be much smaller and less demanding when it’s being used, a process called inference.
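As a rough sketch of what training means in practice, the toy example below fits a model with just two parameters to four data points by repeatedly nudging the parameters to shrink the error. The dataset, the `train` function, and the step count and learning rate are illustrative assumptions, not how any particular AI system is built; real training applies the same basic idea to billions of parameters.

```python
# Toy dataset: inputs paired with the outputs we want the model to learn (y = 2x + 1).
dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def train(dataset, steps=2000, learning_rate=0.05):
    """Fit prediction = weight * x + bias by gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0  # the model's two parameters start out knowing nothing
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in dataset:
            error = (weight * x + bias) - y
            grad_w += 2 * error * x / len(dataset)
            grad_b += 2 * error / len(dataset)
        # Nudge each parameter in the direction that reduces the error.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

weight, bias = train(dataset)
print(round(weight, 2), round(bias, 2))
```

After enough passes over the data the two parameters should settle very close to 2 and 1, the pattern hidden in the dataset.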

Inference

When the model is actually doing its job, we call that inference, in very much the traditional sense of the word: stating a conclusion by reasoning about available evidence. Of course it is not exactly “reasoning,” but statistically connecting the dots in the data it has ingested and, in effect, predicting the next dot. For instance, given the prompt “Complete the following sequence: red, orange, yellow…” it would find that these words correspond to the beginning of a list it has ingested, the colors of the rainbow, and infer the next item until it has produced the rest of that list. Inference is generally much less computationally costly than training: think of it like looking through a card catalog rather than assembling it. Big models still have to run on supercomputers and GPUs, but smaller ones can be run on a smartphone or something even simpler.
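The rainbow example can be sketched with a deliberately crude stand-in for a language model: a table that counts which word tends to follow which. The three-line `corpus` and the `complete` function are hypothetical, and real models use far richer statistics than word pairs, but the "predict the next dot" loop has the same shape.

```python
from collections import Counter, defaultdict

# A tiny stand-in for training data; a real model ingests billions of words.
corpus = [
    "red orange yellow green blue indigo violet",
    "red orange yellow green blue indigo violet",
    "yellow green blue sky",
]

# "Training": count which word follows which.
following = defaultdict(Counter)
for line in corpus:
    words = line.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1

def complete(prompt, max_words=10):
    """Inference: repeatedly append the most frequent next word."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = following.get(words[-1])
        if not candidates:
            break  # nothing ever followed this word in the corpus
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(complete("red orange yellow"))
# prints: red orange yellow green blue indigo violet
```

Note that the lookup is cheap compared with building the table, which mirrors the card catalog comparison.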

Generative AI

Everyone is talking about generative AI, and this broad term just means an AI model that produces an original output, like an image or text. Some AIs summarize, some reorganize, some identify, and so on — but an AI that actually generates something (whether or not it “creates” is arguable) is especially popular right now. Just remember that just because an AI generated something, that doesn’t mean it is correct, or even that it reflects reality at all! Only that it didn’t exist before you asked for it, like a story or painting.

Today’s top terms

Beyond the basics, here are the AI terms that are most relevant here in mid-2023.

Large language model

The most influential and versatile form of AI available today, large language models are trained on pretty much all the text making up the web and much of English literature. Ingesting all this results in a foundation model (read on) of enormous size. LLMs are able to converse and answer questions in natural language and imitate a variety of styles and types of written documents, as demonstrated by the likes of ChatGPT, Claude, and LLaMA. While these models are undeniably impressive, it must be kept in mind that they are still pattern recognition engines, and when they answer they are attempting to complete a pattern they have identified, whether or not that pattern reflects reality. LLMs frequently hallucinate in their answers, which we will come to shortly.

Foundation model

Training a huge model from scratch on huge datasets is costly and complex, and so you don’t want to have to do it any more than you have to. Foundation models are the big from-scratch ones that need supercomputers to run, but they can be trimmed down to fit in smaller containers, usually by reducing the number of parameters. You can think of those as the total dots the model has to work with, and these days the count can be in the millions, billions, or even trillions.
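To get a feel for how parameter counts balloon, here is a small sketch that tallies the parameters in a simple fully connected network: one weight per connection between layers plus one bias per neuron. The `count_parameters` helper and the layer sizes are illustrative assumptions; production models use more elaborate architectures, but the arithmetic scales the same way.

```python
def count_parameters(layer_sizes):
    """Weights (one per connection) plus biases (one per neuron) in a dense network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

small = count_parameters([784, 128, 10])
large = count_parameters([784, 4096, 4096, 10])
print(small, large)
# prints: 101770 20037642
```

Widening the middle layers takes the count from about a hundred thousand to about twenty million, which hints at why shrinking a model usually means cutting parameters.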

Fine tuning

A foundation model like GPT-4 is smart, but it’s also a generalist by design: it absorbed everything from Dickens to Wittgenstein to the rules of Dungeons & Dragons, but none of that is helpful if you want it to help you write a cover letter for your resumé. Fortunately, models can be fine tuned by giving them a bit of extra training using a specialized dataset, for instance a few thousand job applications that happen to be lying around. This gives the model a much better sense of how to help you in that domain without throwing away the general knowledge it has collected from the rest of its training data.

Reinforcement learning from human feedback, or RLHF, is a special kind of fine tuning you’ll hear about a lot — it uses data from humans interacting with the LLM to improve its communication skills.

Diffusion

From a paper on an advanced post-diffusion technique, you can see how an image can be reproduced from even very noisy data.

Image generation can be done in numerous ways, but by far the most successful as of today is diffusion, which is the technique at the heart of Stable Diffusion, Midjourney, and other popular generative AIs. Diffusion models are trained by showing them images that are gradually degraded by adding digital noise until there is nothing left of the original. By observing this, diffusion models learn to do the process in reverse as well, gradually adding detail to pure noise in order to form an arbitrarily defined image. We’re already starting to move beyond this for images, but the technique is reliable and relatively well understood, so don’t expect it to disappear any time soon.
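The degrading half of that process is simple enough to sketch. The example below blends a toy eight-pixel "image" with random noise in growing proportions until nothing of the original is left. The `add_noise` helper, the pixel values, and the `schedule` are assumptions chosen for illustration; real systems use carefully tuned schedules, and the learned reverse step requires a trained neural network, which is not shown here.

```python
import math
import random

def add_noise(pixels, keep, rng):
    """Blend each pixel with Gaussian noise; `keep` is the fraction of signal retained."""
    return [
        math.sqrt(keep) * p + math.sqrt(1.0 - keep) * rng.gauss(0.0, 1.0)
        for p in pixels
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
image = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # a toy 8-pixel "image"

# Forward process: each step keeps less of the original until only noise remains.
schedule = [1.0, 0.9, 0.5, 0.1, 0.0]
noisy_versions = [add_noise(image, keep, rng) for keep in schedule]

for keep, version in zip(schedule, noisy_versions):
    print(keep, [round(v, 2) for v in version])
```

The first row is the untouched image and the last is pure noise; a diffusion model is trained on many such sequences so that it can learn to walk them backward.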

Hallucination

Originally this was a problem of certain imagery in training slipping into unrelated output, such as buildings that seemed to be made of dogs due to an over-prevalence of dogs in the training set. Now an AI is said to be hallucinating when, because it has insufficient or conflicting data in its training set, it just makes something up.

This can be either an asset or a liability. An AI asked to create original or even derivative art is hallucinating its output; an LLM can be told to write a love poem in the style of Yogi Berra, and it will happily do so, despite such a thing not existing anywhere in its dataset. But it can be an issue when a factual answer is desired: models will confidently present a response that is half real, half hallucination. At present there is no easy way to tell which is which except checking for yourself, because the model itself doesn’t actually know what is “true” or “false”; it is only trying to complete a pattern as best it can.

AGI or strong AI

Artificial General Intelligence, or strong AI, is not really a well-defined concept, but the simplest explanation is that it is an intelligence that is powerful enough not just to do what people do, but learn and improve itself like we do. Some worry that this cycle of learning, integrating those ideas, and then learning and growing faster will be a self-perpetuating one that results in a super-intelligent system that is impossible to restrain or control. Some have even proposed delaying or limiting research to forestall this possibility.

It’s a scary idea, sure, and movies like The Matrix and Terminator have explored what might happen if AI spirals out of control and attempts to eliminate or enslave humanity. But these stories are not grounded in reality. The appearance of intelligence we see in things like ChatGPT is an impressive act, but has little in common with the abstract reasoning and dynamic multi-domain activity that we associate with “real” intelligence. While it’s near-impossible to predict how things will progress, it may be helpful to think of AGI as something like interstellar space travel: we all understand the concept and are seemingly working toward it, but at the same time we’re incredibly far from achieving anything like it. And due to the immense resources and fundamental scientific advances required, no one is going to just suddenly accomplish it by accident!

AGI is interesting to think about, but there’s no sense borrowing trouble when, as commentators point out, AI is already presenting real and consequential threats today despite, and in fact largely due to, its limitations. No one wants Skynet, but you don’t need a superintelligence armed with nukes to cause real harm: people are losing jobs and falling for hoaxes today. If we can’t solve those problems, what chance do we have against a T-1000?

LeackStat 2023