It is convenient to think that by killing the head of the snake with fire and fury, the gophers will leave the garden alone. Recent history in the Middle East has shown that scenario to be false; instead, the gophers reemerge more emboldened, and there are holes everywhere in the garden.
Language Models
ChatGPT has put the AI revolution into full swing. The question is, have we really solved the fundamental problem of accessing information and communicating it effectively?
It is exciting that we can now have rich dialogue with a computer. For two decades, we’ve asked Google what the most relevant website is for our query. Now, we ask the wise man GPT. And now we can instruct the wise man to write for us. As exciting as this is, we are making the errant assumption that GPT is wise.
The large language model underlying GPT does not think. It predicts likely continuations of text from statistical patterns in its training data (which is not all trusted authorities or great works of authorship) that map to users’ inputs. There is no reasoning going on here. Its outputs are not a formulation of scholars like an encyclopedia, or a ranking of relevance like Google search. Thus, as a societal matter, we should be wary that rapidly increasing the pace of content creation based on information that may neither be factually accurate nor serve to inform the reader could crowd out the very information that made the internet useful.
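To make that mechanism concrete, here is a deliberately tiny sketch of what “predicting text from patterns in the data” means. The toy corpus and the bigram table are illustrative assumptions; GPT uses a neural network with billions of parameters, but the underlying task is the same: continue the text, not reason about it.

```python
import random
from collections import defaultdict

# A made-up "training corpus" (assumption for the sketch).
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Count which words follow which -- a bigram table, the crudest possible
# version of "statistical patterns in the training data."
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(prompt_word: str, length: int = 6) -> str:
    """Extend a prompt by repeatedly sampling a statistically likely next word."""
    out = [prompt_word]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the"))  # e.g. "the dog sat on the mat and" -- pattern-following, not reasoning
```

The output reads like English because the training text was English, not because anything understood the prompt.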
There are solutions. The goal should be to create a system that can synthesize information, reason through it, make it easier to find the trusted authorities, offer up coherent perspectives, and, like GPT, author works. To accomplish this, we have to tackle some underlying technical challenges.
First, anyone who has sat at the backend of a search engine observing user queries has realized that a ton of searches are vague, brief, or ambiguous. When I handed friends an app with a search bar for image generation, they searched for things like “soccer,” “woman dancing,” and “dog with flowers.” If you asked an artist to draw one of those descriptions, he would likely scrunch his forehead before peppering you with clarifying questions. There are ways to predict certain things about what you might mean based on your previous searches, previous searches by other users, and external data. However, like the artist, the algorithm cannot read your mind. In sum, there is a “garbage in” problem.
As an attorney, a significant portion of my job is asking follow-up questions to gather more information. Asking context-specific questions means gathering necessary information while making a probability calculation about whether the person is willing or able to be more specific. Many journalists will attest that it’s typically easier to get someone to tell you a story than to describe something with specificity. And yet software engineers are often fearful that asking clarifying questions puts too much friction in the interface and drives users away. Nonetheless, without solving the “garbage in” problem, the outputs will continue to be inaccurate, random, or not very useful.
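As a sketch of what pushing back on “garbage in” might look like, the snippet below flags queries that are too vague and asks a follow-up before generating anything. The vagueness heuristic and the canned follow-up questions are assumptions made for illustration, not a production design.

```python
# Canned follow-up questions for the vague queries mentioned above
# (hypothetical examples, keyed verbatim for simplicity).
CLARIFIERS = {
    "soccer": "A match, a single player, or kids in a park?",
    "woman dancing": "What style of dance, and in what setting?",
    "dog with flowers": "What breed of dog, and where are the flowers?",
}

def needs_clarification(query: str) -> bool:
    """Crude heuristic: very short queries carry too little detail to act on."""
    return len(query.split()) < 4

def handle(query: str) -> str:
    if needs_clarification(query):
        follow_up = CLARIFIERS.get(query, "Can you describe the scene in more detail?")
        return f"Before generating anything: {follow_up}"
    return f"Generating an image for: {query!r}"

print(handle("soccer"))
print(handle("a golden retriever holding tulips in a sunlit kitchen"))
```

The mechanics are simple; the open design question is how much of this friction users will tolerate before they leave.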
The second challenge is structuring software to contextualize and synthesize information. The Onion, a humor publication, is not the same as the Harvard Medical Journal. Double-blind studies are not the same as pop psychology claiming that alcohol is healthy. These rules and hierarchies of authorities are teachable, and therefore they are programmable. Relating information from different domains, formulating complex hypotheses, and assembling experiments are things that deep learning is well equipped to do. Deep learning, and machine learning more broadly, is a category of statistical algorithms. LLMs are merely one tool in the toolkit, but at the step of contextualizing and synthesizing, they are the wrong tool.
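One hedged illustration of “hierarchies of authority are programmable”: assign each source type a hand-tuned authority score and blend it with topical relevance when ranking. The tiers, weights, and example documents below are assumptions invented for the sketch.

```python
# Hand-assigned authority tiers (assumptions for the sketch).
AUTHORITY = {
    "peer_reviewed_journal": 1.0,
    "mainstream_news": 0.6,
    "pop_psychology_blog": 0.3,
    "satire": 0.0,
}

def score(doc: dict, relevance: float, authority_weight: float = 0.5) -> float:
    """Blend topical relevance with the authority of the source type."""
    authority = AUTHORITY.get(doc["source_type"], 0.1)
    return (1 - authority_weight) * relevance + authority_weight * authority

docs = [
    {"title": "Double-blind trial on moderate alcohol use", "source_type": "peer_reviewed_journal"},
    {"title": "Why a nightly glass of wine makes you healthier", "source_type": "pop_psychology_blog"},
    {"title": "Area man declares wine a vegetable", "source_type": "satire"},
]

# All three are equally "relevant" to a query about alcohol and health;
# the authority term is what separates them.
for doc in sorted(docs, key=lambda d: score(d, relevance=0.8), reverse=True):
    print(doc["title"])
```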
Finally, expressing information in the most digestible and compelling way is another hard problem. Think about the difference between how your kindergarten teacher spoke to you, how a poet laureate writes, and how an attorney advises her client. On the one hand, there is colorful wording and a variety of rhetorical tactics (narratives, analogies, and metaphors among them) that allow you to connect with the language. On the other, there is a spectrum running from specificity to understandability. An explanation can generally be simple and reductive or complex and detailed, but not both. A superior solution would allow users to adjust the level of detail they desire. But the reduction of detail should never distort the meaning. This is a particular concern for specialized domains like law and medicine, where the risk of error is very high.
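A minimal sketch of such a detail dial, assuming the same underlying facts have been rendered at several depths in advance; the example content and the three-level scale are invented for illustration.

```python
# The same facts rendered at three depths (invented example content).
EXPLANATIONS = {
    1: "Ibuprofen calms swelling and pain.",
    2: "Ibuprofen reduces inflammation by blocking enzymes (COX) involved in producing pain signals.",
    3: ("Ibuprofen is a nonsteroidal anti-inflammatory drug that inhibits the COX-1 and COX-2 "
        "enzymes, reducing prostaglandin synthesis and therefore inflammation and pain."),
}

def explain(detail_level: int) -> str:
    """Return the explanation at the requested depth, clamped to the available range."""
    level = max(1, min(3, detail_level))
    return EXPLANATIONS[level]

print(explain(1))  # the kindergarten-teacher register
print(explain(3))  # the specialist register
```

The point of keeping every level anchored to the same facts is that dialing the detail down simplifies the explanation without distorting it.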
Solving these challenges is the next frontier in language models. It is how we make the internet a better encyclopedia, instead of a graveyard of word sequences mashed together in a gazillion-parameter language model with a human face. Oh, Socrates, where art thou?