Language, Personality, and Linear Algebra

One of my favourite hobbies is learning languages, and a peculiar thing I’ve noticed is that I seem to have different “personalities” in different languages. In English, the language I grew up speaking at school, I feel more academic. In Hindi, which I spoke at home, I feel more friendly and warm. I notice similar effects in the other languages I have learned. Of course it’s not as if I have entirely different personalities, but the subtle differences are there. I used to think this was just a personal feeling, but it turns out to be a well-documented phenomenon among multilinguals.

To the physicist within me, this didn’t make sense. To see why, let’s start with the assumption that the main purpose of language is to describe human existence. That could be physical things in the real world, emotions/feelings in the mind or anything that we could imagine. Language evolved to fully describe human existence, which in some broad average sense should be the same across all societies/civilisations.

Some societies may lag behind others in science so their languages might not have words for things such as “magnetohydrodynamics”, but differences like these are marginal. When it comes to the fundamental elements of human existence — like those that shape your personality — the possibility space is uniform when averaged over cultures. People from all backgrounds are capable of feeling emotions like love, hate, happiness, or sadness. All civilisations are equally capable of innovation and creativity. If need be, they can all invent words to describe new phenomena they learn about, like “magnetohydrodynamics”.

So all languages ultimately aim to describe the same object: the abstract sample space of possibilities that makes up human existence. If they all describe the same object, then there shouldn’t be any real difference between languages. In more mathematical terms, all languages should be isomorphic to each other in some sense, since they all have the same power to express the fundamental elements of human existence. As Shakespeare put it, “That which we call a rose by any other name would smell as sweet”.

Where do these different personalities come from then? There’s nothing fundamental within one language that makes it more academic or friendly than another. This phenomenon must come from the sociocultural context in which the language originated. Although they all have the same expressive power, different languages will naturally highlight those elements of human existence that are most often expressed in their originating cultures.

So languages are, in some fundamental sense, isomorphic to one another, but different languages highlight different aspects of the space of human existence more prominently. When viewed this way, a language is comparable to the linear basis of a vector space.

Think about that most classic of freshman physics problems: the block sliding down a wedge. As any good teacher will tell you, the first step is to define the x and y axes along which you resolve all your forces. You could use the traditional horizontal x and vertical y, but your life will be much easier if you rotate that basis so that x runs parallel to the wedge’s surface and y runs perpendicular to it. Of course, both bases will let you solve the problem (they span the same space and describe the same vectors), but the rotated basis naturally highlights the dynamics relevant to the problem. Similarly, every language is equally capable of describing reality, but each language naturally highlights those elements that show up within the dynamics of its originating culture. If we imagine human existence as some type of vector space, then languages are the bases of this space.
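The change of basis in the wedge problem can be sketched in a few lines of Python. This is only an illustration under assumptions that are mine, not part of the original problem statement: a frictionless wedge rising to the right at 30 degrees, a 2 kg block, and a helper I have called to_wedge_basis.

```python
import math

def to_wedge_basis(vx, vy, theta):
    """Re-express a vector given in horizontal/vertical components
    in a basis rotated by theta: x' along the slope, y' normal to it."""
    c, s = math.cos(theta), math.sin(theta)
    # Each new component is a dot product with a rotated unit vector.
    along = vx * c + vy * s     # e_x' = (cos theta, sin theta)
    normal = -vx * s + vy * c   # e_y' = (-sin theta, cos theta)
    return along, normal

# Assumed example values (hypothetical, chosen only for illustration).
m, g = 2.0, 9.81
theta = math.radians(30)

# Gravity in the traditional basis points straight down.
gravity = (0.0, -m * g)

# The same force in the wedge basis.
along, normal = to_wedge_basis(gravity[0], gravity[1], theta)

print("along the slope:", along)    # equals -m * g * sin(theta)
print("normal to slope:", normal)   # equals -m * g * cos(theta)
```

The force is the same vector in both descriptions, and its length is unchanged. In the rotated basis, though, the component that drives the block down the slope sits in a single coordinate instead of being spread across two.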

The phenomenon of words “lost in translation” lends itself nicely to this model. Consider the Sanskrit word dharma, which arises in Hinduism, Buddhism, and Jainism. Wikipedia says that “there is no direct single-word translation for dharma in European languages”, but that doesn’t mean a translation isn’t possible; we can just use multiple words. Dharma is some combination of “righteousness”, “merit”, and “religious/moral duty”, among other things. If we really wanted a precise meaning, nothing stops us from using arbitrarily many words to express the abstract concept of dharma in the basis of the English language. It’s just expressed more cleanly as one word in Sanskrit because this particular value is important in Indic culture. In bra-ket notation (developed for quantum mechanics but well suited to abstract thought experiments like this), we could write dharma as a possibly infinite linear combination of concepts more natural in English:

|dharma⟩ = a|righteousness⟩ + b|merit⟩ + c|religious duty⟩ + …
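To make the expansion of dharma concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the "concept space" has only three dimensions, the "English basis" is an orthonormal set of invented vectors, and the resulting values of a, b, and c mean nothing linguistically. It shows only the mechanics: with an orthonormal basis, each coefficient is the inner product of a basis vector with the vector being expanded.

```python
# Toy model: a 3-dimensional "concept space". All numbers are invented
# for illustration; they are not measurements of any real language.

def dot(u, v):
    """Inner product of two real vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

# In the hypothetical "Sanskrit basis", dharma is a single basis vector.
dharma = (1.0, 0.0, 0.0)

# A hypothetical orthonormal "English basis", written in the
# coordinates of the Sanskrit basis.
english_basis = {
    "righteousness": (2 / 3, 2 / 3, 1 / 3),
    "merit": (2 / 3, -1 / 3, -2 / 3),
    "religious duty": (1 / 3, -2 / 3, 2 / 3),
}

# For an orthonormal basis, each coefficient is <basis vector|dharma>.
coefficients = {word: dot(vec, dharma) for word, vec in english_basis.items()}

# Summing coefficient * basis vector recovers the original vector.
reconstructed = tuple(
    sum(coefficients[word] * english_basis[word][i] for word in english_basis)
    for i in range(3)
)

for word, coeff in coefficients.items():
    print(word, round(coeff, 3))
```

One word suffices in the first basis, while the second needs three nonzero coefficients to say the same thing. That is the sense in which a single Sanskrit word can require a phrase in English.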

This vector space picture has already been applied in natural language processing. One of the fundamental techniques in NLP is word embedding, where words are converted, using statistics gathered from large bodies of text, into vectors in a vector space. Techniques like this lend some concrete support to the mental model of languages as bases of a vector space.
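As a rough illustration of how words can become vectors, the sketch below builds vectors from raw co-occurrence counts in a three-sentence corpus that I made up. This is a much cruder stand-in for real embedding methods, which learn dense vectors from very large corpora; the corpus and the helper name cosine are assumptions for the example.

```python
import math
from collections import Counter, defaultdict

# A tiny invented corpus; real embeddings use vastly more text.
corpus = [
    "cat drinks milk",
    "dog drinks water",
    "physics describes motion",
]

# Represent each word by how often every other word shares a sentence with it.
cooc = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, word in enumerate(words):
        for j, other in enumerate(words):
            if i != j:
                cooc[word][other] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u)  # a Counter returns 0 for missing keys
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

sim_cat_dog = cosine(cooc["cat"], cooc["dog"])
sim_cat_physics = cosine(cooc["cat"], cooc["physics"])

print("cat vs dog:", sim_cat_dog)
print("cat vs physics:", sim_cat_physics)
```

Here "cat" and "dog" end up closer to each other than "cat" and "physics", simply because both appear alongside "drinks". Words used in similar contexts point in similar directions, which is the basic intuition behind treating meaning geometrically.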

Languages can remain “isomorphic” yet evoke different “personalities”, much as different bases for the same vector space single out different directions as their basis vectors. Just as we can see a physics problem in a totally different light when we change our basis, we open a new window onto the world when we learn a new language.