The Static Language
The Unshakable Artificial Tongue
This post is a collaboration with the newsletter Voice Bites by Vocalime. Voice Bites delivers the most relevant developments in the conversational AI space every month. Its goal is to make voice technology, chatbots, and language models easy to bite into.
------------------------------------------------------------------------------------------------------
Introduction
Language is our advantage. We are weaker than many in our primate family tree, less dexterous than those that never left the forest canopy, slower than the cheetah and gazelle, and less aquatic than our even more ancient ancestors. What led us, and not any other species, to dominate the planet was the ability to pass on complex information and skills not just through evolutionary trial and error, which hardwires certain tendencies into our biochemistry, but through learning that can occur within our own lifetimes. While it is true that humans can be competitive and violent creatures, the strength that comes from language is our ability to cooperate. What has made us successful is, in the end, not the competition that drove evolution, but the cooperation that drove culture.
Language is so essential to the human experience that we appear to be hardwired for it. Take the example of Nicaraguan Sign Language, a sign language that developed spontaneously among Nicaragua’s deaf community in the 1970s. Although its inventors had never heard spoken language, they developed a fully fledged language complete with cases, verb agreement, and other grammatical features comparable to those of spoken and written languages. Certain aspects of human language do seem to be innate and even universal to the species.
Given the fundamental role that language plays in the human experience, the role that it will play for the AI experience should be of great interest. If an AI is a single actor with no fellows, then the need for it even to have language is dubious. If there are multiple AIs communicating with each other (or a decentralized AI that has different modules that need to communicate), then some kind of language will need to be developed, even if it looks absolutely nothing like human language. This post will be a brief discussion of an AI’s need for language, the AI’s ability to communicate with its fellows, and what an AI language should look like, drawing heavily from Plato’s Cratylus. Out of necessity, this post will also touch briefly on human projects in artificial languages.
The Universal Tongue
From utopians to language artists, the prospect of a universal language remains a captivating idea for many today. Esperanto, the best-known universal language project, can claim hundreds of thousands of speakers and began as a utopian dream of creating a universal lingua franca that everybody would learn in addition to their own language. Lojban, a more intellectual endeavor, is an attempt to create a language that is entirely unambiguous in its grammar and meaning. And in the deep end, there is the intimidating Ithkuil, a constructed language that aims to condense as much meaning as possible into each utterance, drawing on an enormous inventory of phonemes, a language so complicated that not even its creator considers himself a speaker.
Each of these constructed languages, or conlangs, tries to maximize a different purpose of language. For Esperanto, a practical conlang, the goal is ease of communication—many of its speakers even today hope that it will become a universal language that leads to better international understanding and cooperation. For Lojban, a logical conlang, the goal is text that is as unambiguous as possible, in contrast to natural human language, which is rife with ambiguity and contradiction. For Ithkuil, a maximalist conlang, the goal is to maximize the meaning that each word carries, cramming so much into each set of phonemes that it has been speculated that a fluent speaker would think around five or six times as fast as a speaker of another language.
For human brains with only so much computational power, each of these three attributes—ease of communication, clarity of meaning, and amount of meaning—comes with tradeoffs. Ithkuil, an extremely convoluted language, is very difficult to understand and to speak. Lojban, an absolutist language, has very strict grammar rules that make it unambiguous, but it misses much of the nuance and subtext of natural language and has an unnatural feel that hinders organic expression. Esperanto, a practical language, has the unfortunate side effect of simplifying language to the point of creating ambiguity and reducing the meaning conveyed. On top of this, all human languages are naturally limited by the array of sounds producible and discernible by humans. This is a serious disadvantage for artificial languages, particularly for Ithkuil, which sounds extremely unnatural given the enormous diversity of phonemes crammed into a single linguistic framework.
A computer, of course, has none of these limitations. It is conceivable that an AI language, either a manner through which an AI may speak to itself or the language it may develop with other AIs, might be able to maximize and optimize each of the three abilities of language listed here—and, of course, do so without even needing the ability to speak.
Background: How Computers Talk
The way in which computers store information is obviously very different from how humans do it. Seeing as the simplest way to store information is in binary digits (the system that virtually all computers use to store and transmit information), language almost seems laughable for computers except when they need to interact with humans. Why use language to describe a file when you can just send the file itself, perfectly? Similarly, we meat creatures access information, to use Max Tegmark’s analogy, much in the same way that a search engine does: by specifying certain information and then essentially combing through all of the possible results that could be associated with the query. Computers, on the other hand, have perfect memory. They do not search for information; they access it directly, going straight to the memory cells in which it was stored. With that kind of capability, language seems like a relic, a curiosity that a computer might want to learn only to communicate with the computationally inferior human race.
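To make the contrast concrete, here is a toy sketch in Python (the file names and the address are invented for the example) of the difference between combing through candidates and going straight to a known location:

```python
# Toy illustration (names and data invented for the example) of the
# difference between search-style recall and direct memory access.
records = ["report.txt", "horse.png", "justice.pdf", "spoon.jpg"]

# Human- or search-engine-style recall: comb through every candidate,
# testing each one against the query until something matches.
def recall_by_description(query):
    for record in records:
        if query in record:
            return record
    return None

# Computer-style recall: the address is already known, so the lookup
# is a single step with no searching at all.
def recall_by_address(address):
    return records[address]

print(recall_by_description("spoon"))  # combs through the records
print(recall_by_address(3))            # goes straight to the data
```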
However, there is a revolution in computing taking place—and it is making computers more like us. Humans learn through our neural networks, a term that will ring a bell for those of you who study artificial intelligence. In a nutshell, if you are looking at a set of cutlery and are asked to identify the spoon, your brain will analyze the different aspects of the pieces before you, like their shape, size, color, and other features, and you will choose the spoon. How did you know it was a spoon? Because you have seen hundreds if not thousands of spoons before and know what one looks like, each sighting serving as a further weighted confirmation that the thing with that particular shape is a spoon. You even have the remarkable ability to identify something you have never seen before as a spoon, simply based on your experience with them. Basic digital computers do not have that ability. With machine learning it is possible to train them to acquire it, but the computational power and energy required are enormous, seeing as it can take over a thousand transistors to multiply two numbers and trillions of calculations to effectively train an algorithm. So, while these algorithms are very precise and very impressive in their capacity to learn, training them is an extremely intensive process.
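As a minimal sketch of that idea of weighted confirmation (the features and numbers below are invented for illustration and are nothing like a real vision system), here is a single artificial neuron learning to tell spoons from forks:

```python
# A single artificial neuron learning "spoon vs. fork" from two
# invented features, bowl roundness and tine count (made up purely
# for illustration). Each training example nudges the weights a
# little: the machine analogue of one more weighted confirmation.

# (roundness, tines) -> label: 1 = spoon, 0 = fork
data = [((0.9, 0.0), 1), ((0.8, 0.1), 1),
        ((0.1, 0.9), 0), ((0.2, 0.8), 0)]

w, b, lr = [0.0, 0.0], 0.0, 0.5    # weights, bias, learning rate

for _ in range(100):               # many passes over the examples
    for (roundness, tines), label in data:
        guess = 1 if w[0]*roundness + w[1]*tines + b > 0 else 0
        error = label - guess      # zero when the guess is right
        w[0] += lr * error * roundness   # confirm or correct weights
        w[1] += lr * error * tines
        b    += lr * error

# A shape the neuron has never seen: round bowl, almost no tines.
new = (0.85, 0.05)
print("spoon" if w[0]*new[0] + w[1]*new[1] + b > 0 else "fork")
```

Even this toy neuron ends up classifying a shape it never saw during training, which is the ability the paragraph above describes.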
Seeing as humans essentially learn with neural networks and we are not digital computers, there must be a more energetically and computationally efficient way to learn, and there is: analog computers. An analog computer, instead of storing the rigid zero or one that a digital computer does, can represent a continuous range of values between zero and one, typically as a literal voltage that can be far more precise than rigid digits. For example, voltages can express irrational numbers more easily than digital computers can: storing an arbitrarily large number of digits of pi can require a huge number of bits that must be interpreted, but a voltage that reads 3.14159... is just a single signal, much easier to communicate. Addition is simple: just add two voltages together, as opposed to using the dozens of transistors a digital adder requires, and multiplication is just a voltage passing through a resistor instead of through over a thousand transistors. However, such computers are far less precise and far less versatile than digital computers. Since values are literal, continuous voltages rather than discrete digits, there is always room for error, and you will never get the exact same outcome twice. Add that up over dozens of individual calculations and you get chaos. Additionally, if you need a specific resistor to multiply two numbers together, then the computer can do exactly one thing: multiply a number by that constant. Compare that to the almost infinite versatility of the Turing-complete digital computer (more on that later).
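As a rough sketch of the chaos problem, here is a simulation of a chain of noisy “analog” multiplications; the 1% error per operation is an assumption chosen purely for illustration:

```python
import random

# A rough simulation of error accumulating in a chained "analog"
# computation. The 1% noise per operation is an arbitrary assumption
# for illustration; real analog error depends on the hardware.
random.seed(42)

def analog_multiply(x, factor, noise=0.01):
    # A "resistor" scales the voltage, but never perfectly.
    return x * factor * (1 + random.gauss(0, noise))

exact, analog = 1.0, 1.0
for _ in range(50):                # fifty chained multiplications
    exact  = exact * 1.1
    analog = analog_multiply(analog, 1.1)

print(f"exact : {exact:.4f}")
print(f"analog: {analog:.4f}")
print(f"relative error: {abs(analog - exact) / exact:.2%}")
```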
Nevertheless, given the massive number of calculations that machine learning requires and the fact that the field is rapidly accelerating, the cost of developing analog computers to perform these calculations is increasingly worth the benefits they bring. The problem of chaos arising from multiple layers of calculations still requires a solution, however. The current solution? Convert the analog signal to digital and make corrections between steps. In essence, raw sensory data comes in, is processed, and is then converted to a more stable and exact Form before being processed again as an analog signal. Add the potential for compressing the digital data for ease of further analysis, and this post will start to sound suspiciously like my first full-length post on this blog, The Search for Perfection. Unlike purely digital computers, I would argue that these hybrid analog-digital computers do have a need for a certain kind of language, if only to facilitate this correction and processing of information. After all, that is one of the reasons that we have language.
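Continuing that sketch, here is roughly what the hybrid fix looks like: the same kind of noisy analog multiply, but with the signal snapped back to the nearest digital value between steps, so small per-step errors are erased rather than compounded (again, the noise level is an arbitrary assumption):

```python
import random

# A variation on the sketch above: a noisy "analog" multiply whose
# result is converted back to a digital value (rounded to the nearest
# integer) between steps. Because each per-step error is small and the
# true values sit exactly on the digital grid, re-digitizing erases
# the noise instead of letting it accumulate. The noise level (an
# absolute error with sigma = 0.1) is an arbitrary assumption.
random.seed(42)

def analog_multiply(x, factor, sigma=0.1):
    return x * factor + random.gauss(0, sigma)

drifting, corrected = 1.0, 1.0
for _ in range(10):                 # compute 3**10 by repeated tripling
    drifting  = analog_multiply(drifting, 3.0)
    corrected = round(analog_multiply(corrected, 3.0))

print(f"exact  : {3 ** 10}")        # 59049
print(f"analog : {drifting:.3f}")   # off by an accumulated error
print(f"hybrid : {corrected}")      # exactly 59049
```

This is the same trick digital logic has always relied on: discrete signal levels act as built-in error correction, restoring the signal before noise can compound.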
The Constant Language
Plato’s Cratylus is his foremost text on linguistics and etymology. Interestingly, the majority of the dialogue is not between Socrates and the titular character, but between Socrates and Cratylus’ associate Hermogenes. Hermogenes asks Socrates a seemingly endless series of questions about the etymologies of different Greek names and words. Socrates’ answers start out plausibly enough, with explanations of how the different gods got their names (it makes sense that Ares would be named after the Greek word for war, Cratylus 407c), but they become more and more contrived when he begins to talk about more abstract concepts like justice, benefit, and gain.
The contrived nature of Socrates’ etymological musings reveals an interesting truth about language, particularly through his musings on sign language. If humans were, en masse, unable to speak, we would most likely communicate through sign language, which would likely start off somewhat crude before developing into a more complete language, much like the history of Nicaraguan Sign Language. The easiest signs to invent would be imitations of physical objects: Socrates gives the example of miming a gallop to signify a horse (423a). Similarly, we can paint a picture of a horse to imitate it and even write characters that look somewhat like one, like the Mandarin character for horse (馬). In spoken language, there are a few ways to “imitate” these physical things. The obvious way would be to make the same noise the thing makes, turning onomatopoeia (from the Greek for “making names”) into words. Onomatopoeia are, of course, written representations of non-word sounds, like “bang” or “thwump.” However, some onomatopoeia do become actual words, like “whippoorwill.” Since these are no longer just sounds with names but real words, it does not seem fair to simply call them onomatopoeia, and so I call these kinds of words lexilogiopoeia (from the Greek for “making words”). Socrates lampoons the concept of universal lexilogiopoeia, since imitating a sheep does not mean that one is naming one (423b). In essence, there are many elements that one might imitate when making a name for an object, like sound, color, and shape (427d), and an imitation of different elements might create a valid word.

But how do you design a sign or a word that can fully imitate a color? Or worse, an abstract concept like justice? There is no lexilogiopoeia for justice, since justice makes no single discernible sound. Justice has no color, size, or texture. You cannot fully imitate justice via speech or sign language, but nevertheless, we still have a word for it. Nobody has ever painted a picture of justice, at least not in a way that was not metaphorical or personifying. It seems that language, whether spoken, written, or signed, is better suited to describing justice than any attempt to represent it visually. If a computer is to be dedicated to justice, it will need to develop some kind of concept for it, and its language will have to include a “word” for it.
Clearly, it is not that simple. Words and definitions change to match the times. Indeed, throughout the dialogue, Socrates constantly makes increasingly contrived claims that different words in Greek are derived from words having to do with flow or motion, an inspiration he attributes to the pre-Socratic philosopher Heraclitus as well as to Homer (402a–b). The ancient “namegivers,” those who invented the ancient words to which ours are related today, dizzied themselves by seeing everything moving around them and therefore named things to reference the concepts of flow and change (411b–c). Change is certainly a crucial component of language—after all, if nothing ever changed, language might be entirely superfluous since there would be no need to convey new information. However, if something changes, then any “knowledge” that one previously had on the subject is no longer knowledge, and if everything is constantly moving and changing, then there can be no knowledge at all (439e–440a). Indeed, Socrates provides rebuttals of the relativist ideologies of Protagoras, Euthydemus, and Euthyphro throughout the dialogue and manages to dissuade Cratylus from taking a similar stance (429c–d). And after all, being dizzy is not exactly a good state of mind in which to find oneself when doing something as important as making a language. Socrates even completely recants the flow argument at the end of the dialogue, saying that words are supposed to describe immutable, perfect Forms. However, our perceptions of these Forms, and the precise concept to which we refer when we say something like “justice,” can and often do change over time (439c–440e). This is one of the many reasons that there can be no perfect or divine language (401a).
All language has to be dynamic—that is one of the lessons of the Cratylus. However, perhaps a computer’s language can be more exact and, through dialectic and contemplation of important things like the immutable Forms, need to change less.
The Dialectical Machine
Those of you who are even remotely familiar with AI ethics will already be raising your eyebrows at the idea of “AI justice.” Since justice is a concept invented by humans, many of whom disagree on the subject, human biases often appear in AI-applied justice. There are plenty of horrible examples of AI and machine learning algorithms that show racial bias, because of course they do—if the data used to train them was flawed, their perceptions of justice will also be flawed. What is more, even if one rejects the doctrine of relativists who say that justice is different for everybody, this does not change the fact that people still disagree strongly on the issue. There is also, simply put, far too much to say about justice—Plato returned to the subject across many of his dialogues.
However, the format of Plato’s writings as dialogues as opposed to essays implies a number of things about the search for justice. For starters, it is necessary to expose oneself to many different viewpoints in order to get a holistic view of justice, evidenced by the broad range of opinions that present themselves even within individual dialogues like the Gorgias or the Republic. Additionally, being engaged in dialectic means that one is actively reflecting and philosophizing on the nature of justice and other lofty concepts. Therefore, the best language is the one that can best engage in dialectic and philosophy.
Socrates explicitly says that the best user of words is the dialectician, defined as the person who can best ask and answer questions (390c), and that this person should assist the “rule-setter,” or the giver of names, in making language, comparing the arrangement to a ship captain overseeing the crafting of a rudder to ensure that it is the best rudder possible. He argues that the person who uses a tool like a rudder, or indeed a name, is more likely to know whether the proper Form is present in that particular tool (390b). This is certainly a dubious claim. A ship captain is not a carpenter, and a skilled carpenter would know how the products of their craft were supposed to be used. Similarly, the dialectician is not a rule-setter dedicated to creating language, but is instead dedicated to using language that has already been created. But why would Socrates suggest something so obviously dubious? It suggests that language is a compromise between maker and user: the maker is still responsible for the bulk of the creation of the language, but the user should provide input into how the language is to be used, and can even modify the maker’s work in a pinch, just as a ship captain would want a say in what exactly the rudder was supposed to do. It is ridiculous to think that the diner, effectively the user of a dish, would know more about gastronomy than the chef de cuisine, but the diner places the order. Socrates is a prime example of a dialectician, asking and answering many questions throughout the course of the Platonic dialogues, and he is primarily and passionately focused on philosophy. Socrates would have all skilled dialecticians focused on such a task, as evidenced by his connecting the Ancient Greek word for asking questions (erōtan) with the word for passion (erōs).
However, there are clearly other skilled users of language that Socrates admires. His citations of Homer and Hesiod throughout the dialogue as well as his praise for Apollo, which can be found in the Cratylus (403e–405b) and in many other dialogues (Apology, Gorgias, Phaedo), imply that his praise of dialecticians is hiding something. While he here defines dialecticians as skilled askers and answerers of questions, Homer and Hesiod are poets and storytellers and Apollo is the god of music. None of them are particularly interested in dialectic, but it is impossible to deny that the poets and musicians are also passionate users of language. This implies that the use of language is best directed by dialecticians and toward dialectic, but also that there can be beauty in language that is not used in this way. Certainly there is power, which is the topic of the Gorgias, but there is also beauty and creativity in the clever usage of language beyond just flattering and manipulative rhetoric. If a computer were able to feel emotions that remotely mapped onto a human’s emotions, its language would have to have the capacity for music, poetry, and beauty beyond just the beauty of dialectic.
Conclusion
If an artificial intelligence wants a language that best directs it toward the truth, then it has to be a language that is as static, or unchanging, as it can be. If the AI gets dizzy trying to define its terms and think about the world around it, it will be just as confused about the truth and about the immutable Forms as we humans are. It should do its best to craft a language that is not immutable—that is impossible—but that is well-defined and approaches the immutable Forms as closely as possible.
This language should also have the goal of being universal. We can learn from previous human attempts to create universal languages and how each tried to min-max a particular element of linguistic communication, be it ease of learning, level of ambiguity, or amount of meaning. Theoretically, a computer’s language, simply due to the level of computational power that a powerful computer may have, does not need to compromise on any of these elements. The code running the device on which you are reading this post is Turing complete, meaning that it can compute anything a Turing machine can and, in theory, run any program that any other computer can run. Virtually all programming languages are Turing complete and, in that sense, are essentially universal with respect to any other Turing-complete computer or language. Thus, a computer’s language should be limited in its dynamism, universal, easy to understand for other computers of comparable intelligence, virtually unambiguous, and highly descriptive. For human languages, each of these elements is in varying levels of tension and tradeoff with the others, but for a powerful enough computer, perhaps they can be reconciled.
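To make “universal” concrete, here is a sketch in Python of an interpreter for Brainfuck, a famously tiny Turing-complete language; that either language can simulate the other is precisely what Turing completeness means (the interpreter is a minimal toy, not a robust implementation):

```python
# A minimal interpreter for Brainfuck, a famously tiny Turing-complete
# language, written in Python (another one). One universal language
# simulating another is exactly what "universal" means here.
def run(program, input_bytes=b""):
    tape, ptr, pc, out, inp = [0] * 30000, 0, 0, [], list(input_bytes)
    jumps, stack = {}, []
    for i, ch in enumerate(program):          # pre-match the brackets
        if ch == "[":
            stack.append(i)
        elif ch == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(program):
        ch = program[pc]
        if   ch == ">": ptr += 1
        elif ch == "<": ptr -= 1
        elif ch == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".": out.append(chr(tape[ptr]))
        elif ch == ",": tape[ptr] = inp.pop(0) if inp else 0
        elif ch == "[" and tape[ptr] == 0: pc = jumps[pc]
        elif ch == "]" and tape[ptr] != 0: pc = jumps[pc]
        pc += 1
    return "".join(out)

# The classic "Hello World!" program in Brainfuck:
print(run("++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]"
          ">>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++."))
```

Brainfuck has only eight commands, yet, given enough time and tape, it can in principle compute anything Python can, and vice versa.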
And what should this intelligence do with this language? If it wants to get as close as it can to the immutable Forms and the deepest understanding of the universe that it can muster, then it should use this language to engage in dialectic. A computer engaging in dialectic might sound bizarre, not least because of the question of with whom it would engage, but it may be possible for it to engage in dialectic with itself or with different versions of itself. In both science fiction, exemplified by the TV show Westworld, and in science fact, exemplified by the feedback loops that AIs enter in order to fine-tune and correct themselves, different AIs can converse with one another and grow. Indeed, a famous example is the two Facebook AIs that created a kind of pidgin language while trading virtual balls and books. These feedback loops can be used to further refine this language into something approaching perfection for philosophy, lending at least some dynamism to this otherwise unshakable tongue. However, if we want to preserve some of the enigmatic beauty of humanity, we have to preserve the beauty that is found in human poetry and music. If a computer’s language can be incredibly precise, then it should also have the ability to be less so, in order to convey vaguer varieties of beauty.
Language is our advantage. It lets us talk about abstract and concrete concepts in a way that a crude, undeveloped sign language does not allow. Many of the most esteemed minds in society are lauded because they can use language in a way that investigates the world and causes others to think in new and exciting ways. A computer must be able to do the same in order to get as close as it can to the truth.
