Frontiers

Overestimating the capabilities of AI models like ChatGPT can lead to unreliable applications.

Mikhail Burtsev, Martin Reeves, and Adam Job November 30, 2023 Reading Time: 11 minutes

Large language models (LLMs) appear poised to transform businesses. Their ability to generate detailed, creative responses to queries in plain language and code has sparked a wave of enthusiasm that led ChatGPT to reach 100 million users faster than any other technology after its launch. Subsequently, investors poured more than $40 billion into artificial intelligence startups in the first half of 2023, more than 20% of all global venture capital investments, and companies ranging from seed-stage startups to tech giants are developing new applications of the technology.

But while LLMs are incredibly powerful, their ability to generate humanlike text can invite us to falsely credit them with other human capabilities, leading to misapplications of the technology. With a deeper understanding of how LLMs work and their fundamental limitations, managers can make more informed decisions about how LLMs are used in their organizations, addressing their shortcomings with a mix of complementary technologies and human governance.

The Mechanics of LLMs

An LLM is fundamentally a machine learning model designed to predict the next element in a sequence of words. Earlier, more rudimentary language models operated sequentially, drawing from a probability distribution of words within their training data to predict the next word in a sequence. (Think of your smartphone keyboard suggesting the next word in a text message.) However, these models lack the ability to consider the larger context in which a word appears and its multiple meanings and associations.
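To make the keyboard analogy concrete, here is a minimal sketch of such a rudimentary model. It is an illustration only, not any product's actual implementation: the function names and the toy training text are invented for this example, and the model simply counts which word most often follows the previous one.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy training text (hypothetical); a real model would use far more data.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(corpus)

print(predict_next_word(model, "the"))  # most frequent word after "the"
```

Because the prediction depends only on the single preceding word, the sketch cannot tell whether "the" appears in a sentence about pets or about furniture, which is the context limitation described above.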

The advent of a newer neural network architecture, the transformer, marked a significant evolution toward modern LLMs. Transformers allow neural networks to process large chunks of text simultaneously in order to establish stronger relationships between words and the context in which they appear. Training these transformers on increasingly enormous volumes of text has led to a leap in sophistication that enables LLMs to generate humanlike responses to prompts.
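The idea of relating every word to every other word at once can be sketched with a heavily simplified version of the attention calculation used in transformers. This is an assumption-laden toy: it omits the learned query, key, and value projections, multiple attention heads, and positional information that real models use, and the two-dimensional word vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Blend each word vector with all others, weighted by similarity.

    Simplified scaled dot-product attention: the input vectors serve
    directly as queries, keys, and values (no learned projections).
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this word to every word in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        # Context-aware representation: weighted average of all vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs, all_weights


# Hypothetical embeddings for a three-word sequence.
word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(word_vectors)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights shows how strongly one word draws on every word in the sequence, including distant ones, rather than only on its immediate predecessor.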

This capability of an LLM depends on several important factors, including the size of the model, expressed as the number of trainable weights (known as parameters); the quality and volume of the training data (measured in tokens, which refer to words or subword units); and the maximum size of input that the model can accept as a prompt (known as its context window size).
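A rough sketch can show what a context window means in practice. The assumptions here are loud ones: real LLMs split text into subword tokens with a trained tokenizer, whereas this example splits on whitespace, and the policy of keeping only the most recent tokens is just one possible choice. The function names are hypothetical.

```python
def tokenize(text):
    """Stand-in tokenizer: one token per whitespace-separated word.

    Real models use subword tokenizers, so actual token counts differ.
    """
    return text.split()


def fit_to_context_window(prompt, context_window_size):
    """Keep only as many tokens as the model can accept as input.

    Assumed policy: when the prompt is too long, drop the oldest tokens
    and keep the most recent ones.
    """
    if context_window_size <= 0:
        raise ValueError("context_window_size must be positive")
    tokens = tokenize(prompt)
    if len(tokens) <= context_window_size:
        return tokens
    return tokens[-context_window_size:]


prompt = "please summarize the quarterly sales report for the board"
visible = fit_to_context_window(prompt, context_window_size=5)
print(len(tokenize(prompt)), "tokens in prompt;", len(visible), "fit in the window")
print(visible)
```

Anything that falls outside the window is simply invisible to the model, which is why context window size limits how much material an LLM can take into account in a single prompt.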

About the Authors

Mikhail Burtsev, Ph.D., is a Landau AI Fellow at the London Institute for Mathematical Sciences, former scientific director of the Artificial Intelligence Research Institute, and author of more than 100 papers in the field of AI. Martin Reeves is chairman of the BCG Henderson Institute, which focuses on business strategy. Adam Job, Ph.D., is director of the Strategy Lab at the BCG Henderson Institute.

