An LLM (“large language model”) is a type of artificial intelligence designed to understand and compose text. LLMs can be fine-tuned to execute a range of tasks. These include content creation, data interpretation, sentiment analysis, and customer service. Multimodal models can even analyze and generate images, videos, and audio. LLMs’ processing capacities are powered by artificial neural networks that are trained on very large datasets (hence the modifier large in their name).
Below, we demystify large language models and spell out their workings in the simplest of terms. You’ll learn why LLMs are important, what they can do, how they work, and what their future looks like.
{toc}
Why are LLMs so important?
LLMs are important because they can carry out tasks that previously required human intellect and effort. LLMs can take over human activities like formatting presentations, responding to customer support tickets, or creating variants of a Facebook ad, and they often execute these tasks more accurately, more efficiently, and at a lower cost. By assuming tedious tasks, LLMs give people the ability to spend their time on higher-level decision-making and management.
What are LLMs capable of?
LLMs are capable of various language processing tasks. They can write new content, summarize existing content, translate documents, and interpret data supplied by human users. Many LLMs are also skilled conversationalists that can convincingly mimic human dialogue.
Some multimodal LLMs can also generate music, images, and videos. That said, these models usually use language modeling in conjunction with other generative AI techniques to produce outputs that are not text.
Are chatbots LLMs?
Most contemporary chatbots are powered by LLMs, but this was not always the case. The first chatbot models relied on a set of rules and decision trees to answer human queries. Many businesses still employ these bots for customer service; you’ll know you’re using one if it quickly runs out of answers and transfers you to a human agent.
However, LLM technology enables advanced language processing, which helps modern chatbots simulate human conversations and produce more complex and accurate outputs. ChatGPT, Google Gemini, and Claude are examples of these LLM-powered chatbots.
What is the difference between LLMs and generative AI?
LLMs are a subset of generative AI designed specifically for written text. Where generative AI can produce various types of outputs that include text, images, videos, and music, LLMs focus on understanding and composing text.
Many modern AI products combine LLMs with other forms of generative AI. For example, ChatGPT’s language processing power comes from its GPT-3.5 and GPT-4 LLMs, while DALL-E allows the chatbot to create images. DALL-E is a different type of generative AI: a text-to-image model that draws on techniques such as Contrastive Language-Image Pre-Training (CLIP) to understand text much like an LLM does, while learning contextual connections between words and imagery.
What is an example of an LLM?
GPT is a famous example of an LLM. It stands for “Generative Pretrained Transformer,” a name that stems from its neural network architecture (more on LLM functionality below).
OpenAI launched the first GPT in 2018 under the name GPT-1. Since its launch, the LLM has steadily grown in power and complexity. Today GPT-4 serves as the foundation of ChatGPT — one of the world’s best-known generative AI products.
How can I use an LLM?
You can use an LLM for a variety of language-processing and content-creation tasks, such as:
- Content writing
- Visual content creation
- Code generation
Content writing
LLMs are designed to process text, so writing content is a natural function for them. You can use an LLM-based AI to compose new text, or review and edit existing passages.
LLM-powered products such as ChatGPT and Gemini can be used in all types of content writing, ranging from technical documents to poetry. However, there are some fundamental differences in their outputs: ChatGPT’s text tends to be more convoluted and pedantic, while Gemini’s tends to be more casual and engaging. Both AIs (and other LLMs, for that matter) make frequent errors and leave a heavy AI signature in their outputs, so fact-checking and editing remain crucial human tasks in AI content creation.
Visual content creation
You can use multimodal LLMs to produce visual content, such as images, videos, and presentations. These LLMs incorporate generative AI architecture that matches the language in your inputs to graphic elements based on training data.
Midjourney is an AI product commonly used to generate images. This AI relies on an LLM that works together with a diffusion model to create imagery based on user inputs. Midjourney’s outputs are sophisticated and creative at a glance, but getting the desired image usually takes several iterations with fine-tuned prompts.
Likewise, Plus AI is an LLM-based presentation maker. You can use Plus AI to generate presentation-ready slide decks from a prompt. The AI tool lets you choose from a vast catalog of templates and generates content and relevant images for every slide.
Code generation
Certain LLMs are trained to help you write code. For example, ChatGPT can generate code in Java, Python, PHP, and most other common programming languages. Recent advancements in LLMs’ power and performance have made more of this AI-produced code executable as written.
How does an LLM work?
LLMs work by performing complex mathematics to generate the best string of output “tokens” for any given set of input “tokens.” The LLM’s capabilities stem from its key component, the transformer. The transformer is a type of artificial neural network that converts sequences input by the user (such as text or code) into output text. LLM architectures comprise layers of transformers.
Here’s a simple explanation of how an LLM transformer produces its outputs.
- Tokenization. An input sequence is broken up into tokens — the most basic bits of data LLMs work with. These bits of text may include entire words, subwords, or single characters.
- Mapping. With the input sequence tokenized, the LLM’s transformer maps each token to a unique numerical representation. Once the transformer sees the input text sequence as a set of numerical data, it’s able to process it.
- Contextualization. At this step, the transformer “understands” the input sequence. It gauges the semantic distance (a measure of relatedness) between tokens to identify the context of each token in relation to the others. To perform this step, the LLM employs an attention mechanism, a system loosely inspired by the way humans focus on the most relevant parts of a sentence, which establishes each token’s contextual weight, or relevance, within the sequence. The attention mechanism’s ability to recognize contextual patterns comes from its training (which we’ll discuss below).
- Output. Once the LLM understands the input tokens, it relies on its training to predict the most probable output tokens. These tokens get converted from numerical representations into text — which is what the user sees.
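To make these steps more concrete, here’s a heavily simplified sketch in Python using NumPy. The tiny vocabulary, toy dimensions, and random weight matrices are all illustrative assumptions; a real LLM learns billions of weights during training, uses a far more sophisticated tokenizer, and stacks many such layers.

```python
import numpy as np

np.random.seed(0)

# 1. Tokenization: map a tiny vocabulary of words to integer IDs.
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
tokens = [vocab[w] for w in "the cat sat on the".split()]

# 2. Mapping: look up a numerical vector (embedding) for each token ID.
d_model = 8                                   # embedding size (toy value)
embeddings = np.random.randn(len(vocab), d_model)
x = embeddings[tokens]                        # shape: (sequence_length, d_model)

# 3. Contextualization: scaled dot-product self-attention.
#    Each token's query is compared against every token's key to produce
#    attention weights -- the contextual relevance scores described above.
W_q, W_k, W_v = (np.random.randn(d_model, d_model) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
contextualized = weights @ V                  # each row now mixes in its context

# 4. Output: project the last position onto the vocabulary and pick the
#    most probable next token.
W_out = np.random.randn(d_model, len(vocab))
logits = contextualized[-1] @ W_out
next_token_id = int(np.argmax(logits))
print("Predicted next token:", list(vocab)[next_token_id])
```

With random weights, the “prediction” here is meaningless; in a trained model, these same matrices have been tuned so that the predicted token is a plausible continuation of the input.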
How are LLMs trained?
LLMs are trained through exposure to massive datasets that teach the model to understand the context of different words and sequences and generate outputs based on prediction. The datasets used in training vary with the model but may include Common Crawl, Wikipedia, GitHub, and other resources. Likewise, depending on the model, the learning method can be unsupervised, self-supervised, or semi-supervised. Most machine learning approaches involve a combination of the three methods, along with additional techniques such as fine-tuning and Reinforcement Learning from Human Feedback (RLHF).
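To illustrate what “self-supervised” means in practice, here’s a minimal sketch: the training label for each position is simply the token that comes next in the raw text, so no human annotation is needed. The token IDs below are made up for the example.

```python
# Self-supervised next-token prediction: the "label" for each position is
# the token that follows it in the raw text, so the data labels itself.
token_ids = [17, 4, 92, 8, 51, 4]   # a tokenized training sequence (toy values)

# Build (context, target) training pairs by shifting the sequence by one.
training_pairs = [
    (token_ids[: i + 1], token_ids[i + 1])   # (everything so far, next token)
    for i in range(len(token_ids) - 1)
]

for context, target in training_pairs:
    print(f"context={context} -> predict {target}")
```

During training, the model’s parameters are nudged so that its predicted next token matches the target more often, repeated across trillions of such pairs.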
What is the most advanced LLM?
Several AI products contend for the “most advanced LLM” title, but it’s hard to pinpoint a winner. That’s because developers seldom disclose all of their LLM’s performance data.
The complexity of an LLM (meaning its capacity to execute different tasks) is usually quantified using its parameter count and context window. While we know the context window sizes of most language models on the market, the parameter counts of many LLMs are shrouded in mystery. The best we can do is compare LLMs using estimated parameter counts and known context windows, two metrics explained below.
LLM parameters
Parameters are the internal settings an LLM learns to help itself understand and compose text. The model adjusts these values throughout training, refining them with each prediction it makes. The more parameters a trained LLM has, the more capacity it has to process linguistic nuances.
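As a rough illustration of what gets counted, here’s a sketch that tallies the weights in a stack of hypothetical transformer layers. The dimensions are invented toy values (and the tally ignores embeddings, biases, and normalization layers), so the result shouldn’t be read as any real model’s size.

```python
# Rough parameter count for a stack of hypothetical transformer layers.
# All dimensions are invented toy values, not any real model's configuration.
d_model = 4096        # width of each token's vector representation
d_ff = 4 * d_model    # width of the feed-forward sublayer

attention_params = 4 * d_model * d_model     # W_q, W_k, W_v, W_o matrices
feed_forward_params = 2 * d_model * d_ff     # up- and down-projection matrices
layer_params = attention_params + feed_forward_params

num_layers = 32
total = num_layers * layer_params
print(f"~{total / 1e9:.1f}B parameters across {num_layers} layers")
```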
LLM context windows
The context window of an LLM refers to the number of input tokens the model can process. This metric determines the size of the input and has a positive correlation with the model’s performance. That’s because larger context windows let the LLM process more tokens, and the more tokens the model is exposed to at once, the more nuanced its understanding of the input text.
Also, larger context windows translate into longer “attention spans” for LLMs. In practical terms, if the LLM you’re using can’t remember earlier parts of your conversation, it’s because those parts have moved outside of its context window.
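Here’s a small sketch of that behavior: it counts tokens with the open-source tiktoken tokenizer and drops the oldest turns of a conversation once the total exceeds a deliberately tiny, made-up window. The window size and messages are assumptions for illustration only.

```python
import tiktoken  # OpenAI's open-source tokenizer library

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_WINDOW = 50          # made-up, tiny window for illustration

conversation = [
    "User: What's a good name for a bakery?",
    "Assistant: How about 'Rolling in Dough'?",
    "User: Ha! Can you suggest three more?",
    "Assistant: Sure: 'Knead to Know', 'Flour Power', and 'The Daily Bread'.",
    "User: What was the first name you suggested?",
]

def count_tokens(messages):
    return sum(len(enc.encode(m)) for m in messages)

# Drop the oldest messages until everything fits in the context window.
# Once the first suggestion falls outside the window, the model can no
# longer "remember" it -- exactly the behavior described above.
while count_tokens(conversation) > CONTEXT_WINDOW and len(conversation) > 1:
    conversation.pop(0)

print(f"{len(conversation)} messages kept, {count_tokens(conversation)} tokens")
```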
What are the major challenges facing LLMs?
Here are four of the most significant challenges facing LLMs:
- Contextual limits on inputs and outputs
- Factual inaccuracy
- Bias
- Training costs
LLMs have input and output limits
The newest LLMs have wide context windows that can accept lengthy text sequences of hundreds of thousands of tokens, but even these have limits. To analyze a document that exceeds the window, the user (or the application) must break it up into smaller chunks. Splitting up the inputs limits the LLM’s ability to grasp the context of the entire data corpus, and this reduces the quality of the model’s output.
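A common workaround is to split the document into overlapping chunks that each fit within the window, roughly like this sketch (the chunk and overlap sizes are arbitrary assumptions, and real pipelines usually measure chunks in tokens rather than words):

```python
def chunk_document(text, chunk_size=500, overlap=50):
    """Split a long document into overlapping word-based chunks.

    chunk_size and overlap are arbitrary illustrative values; production
    pipelines typically measure chunks in tokens, not words.
    """
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
    return chunks

# Each chunk is sent to the LLM separately, so context that spans two
# chunks can be lost -- the quality trade-off described above.
long_document = "word " * 1200
for i, chunk in enumerate(chunk_document(long_document)):
    print(f"chunk {i}: {len(chunk.split())} words")
```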
LLMs are often inaccurate
Even the most powerful LLMs, like GPT-4 and Gemini 1.5, can struggle with factual accuracy. What’s more, the models often show no signs of doubt in their answers. This false confidence is a major headache for people using AI for research and writing. These LLM hallucinations usually stem from gaps in the training data relevant to the input, or from the model’s inability to recall that data. That’s why human fact-checking is still a must any time a language model’s output is used for vital tasks.
LLMs may be biased
Many common language models have been caught outputting text riddled with unmistakable racial and gender stereotypes. LLMs often have these biases because of the datasets on which they’ve been pre-trained and the lack of RLHF fine-tuning that would help filter out prejudiced responses.
Language models are costly to train and run
Complex language models with high parameter counts and massive context windows take millions, and sometimes billions, of dollars to train. The lofty price tag is driven by the high cost of Graphics Processing Units (GPUs), which the models use for processing. Likewise, the RLHF techniques needed to fine-tune pre-trained models demand lots of expensive human labor and expertise. Finally, once trained, these models are also very expensive to run.
What is the future of LLMs?
The future of LLMs is conjectural, and academic opinions differ on how far their progress will go. Some scholars argue that technological singularity — a point where machines’ intelligence surpasses humans’ — is imminent and will cause unforeseen consequences for humanity. Others believe that LLMs’ lack of self-awareness and sensory experience will obstruct their path to achieving (let alone exceeding) human intellect. Despite these conflicting opinions, most experts agree that language models will become more accurate and dependable, with a greater capacity to execute specialized tasks.
FAQs about LLMs
Is GPT an LLM?
Yes, GPT is a type of LLM. GPT (short for “Generative Pretrained Transformer”) is a type of neural network architecture that allows LLMs to understand human inputs and compose text in response. GPTs were first launched by OpenAI in 2018, and currently power the company’s well-known AI chatbot, ChatGPT.
Is BERT an LLM?
Yes, BERT is an LLM. BERT (an initialism for “Bidirectional Encoder Representations from Transformers”) is a type of LLM that uses transformer architecture, much like GPT. Google developed BERT to improve its ability to understand the context behind search queries. Today, Google Search algorithms continue using BERT to make search results more relevant to user queries.
Do LLMs understand?
LLMs can process and respond to language with impressive fluency. However, whether they understand the meaning of the inputs they receive and the outputs they generate is debatable. LLMs are mechanisms that convert text into numerical representations and match the input to a response based on their training data. It’s the LLMs’ ability to manipulate language that allows them to produce conversational output, which, in turn, creates the illusion of understanding.
Are LLMs conscious?
Whether LLMs are conscious is the subject of debate, but much of the evidence rejects this possibility. Most crucially, LLMs lack subjective experience — they have no senses, emotions, or even true self-awareness (although they can mimic the latter well). That said, current obstacles in the way of LLM consciousness may be only temporary, and conscious and sentient forms of artificial intelligence may be possible in the future.
What is the history of LLMs?
The history of LLMs takes root in semantics — a field of linguistics that studies the structure, origins, and meaning of language. Semantics emerged in the late nineteenth century and eventually spurred the development of Natural Language Processing (NLP) — a field of computer science that forms the basis of modern LLMs. The first chatbots using some form of NLP were produced in the 1960s, with MIT’s ELIZA being a prominent example. The last few decades of the twentieth century saw vital LLM innovations like neural networks, while deep learning and transformer architecture were developed more recently to produce the first LLM-based generative AI tools, like ChatGPT.