mittmattmutt's blog

What language do we speak on social media?

How does the internet change language and how we use it? It might seem little: after all, I am writing this in standard English and that’s how you are understanding it. Or it might seem a lot: you are, almost certainly, a stranger to me, and for most of humanity’s history, most humans have not talked to strangers in such a way. So you might think that while the grammar of language hasn’t changed, the uses to which it’s put have.

My aim here is to explore this thought by introducing some tools from linguistics and the philosophy of language, and using them to analyse a particular type of internet language: Twitter English. I will suggest that looking at this through the linguistic lens we can see good arguments against the conclusion of the last paragraph: Twitter English, I will suggest, has a different grammar to standard English. It is a different language.

This is a very important conclusion. If we are, despite ourselves, speaking a different language in speaking on Twitter (and other such platforms), then we should care whether we are speaking a good language. And we should be suspicious that we aren’t: after all, should we expect that a language whose structure is determined by a handful of engineers working on behalf of people looking to maximise profit is the best language we could be speaking? No, I think, and so I’ll suggest ways that we could improve Twitter English.

A Framework For Thinking About Language

What language is is a big and scary question (most ‘what is x’ questions are). But we can make things easier by just stipulating that language is what some linguists study, and then taking a look at the sort of thing they do.

For our purposes, we can understand a language as a bundle of types of properties, or again as layers of rules. Consider this sample sentence:

(1) Ume is barking

There are a bunch of things we can say about it. It has syntactic, semantic, and pragmatic properties (it has other properties too, for example ones to do with sound; since I know very little about phonology, the field which studies such things, and since anyway sound seems less important in the written media of the internet, I will ignore these properties).

Syntactically, we can tell that this sentence is well-formed, in a way that, for example, and outside of Yoda-English, the following isn’t:

(2) Barking Ume is

More generally, the study of syntax has unearthed a series of rules that govern when a sentence is well-formed, and specifying these rules is an important part of describing a language.

Our sentence also has semantic properties: it means something. It describes the current actions of a particular dog. Other sentences mean other things: ‘Rooibos ate kibble’ describes the past actions of a particular cat. Some sentences mean nothing: ‘Sincerity ate magic’ seems like it’s nonsense. As with syntax, to give an account of a language we need to specify its semantics. This involves saying what its words mean, but also, more importantly, how the meaning of a complex expression, like a sentence, is determined by the meanings of its parts and its structure. This fact, that semantics is concerned with the structure of meaning, will be important later.

Finally, our sentence has pragmatic properties: there are facts about how and why it is used. The pragmatics of our sentence will typically, although not always, be pretty straightforward (it is used to state a canine-related fact). But there are more interesting cases.

For example, pragmatics is invoked to explain why, if one is asked how a date went, and one replies

(3) Well, the starter I ordered was nice

One manages to use that sentence, which literally just says something nice about food, to convey that the date went badly. For another example, pragmatics is also used to explain how we can perform actions by saying words. Speech act theory points out, for example, that when I say

(4) I name you ‘Joey Joe Joe’

I am not reporting on some existing state of affairs: rather, I am bringing about the state of affairs of the person’s being called Joey Joe Joe. In saying those words, I cause a change in the world, a change that makes that currently unnamed person have a name.

Phonology, syntax, semantics, pragmatics — these we will take as the core features of a language (although, as said, I won’t say anything more about phonology). Using these concepts as tools to get a grip on the subject, we can now ask: what, if anything, is distinctive about social media language? A surprising amount, it turns out.

Twitter English Is Not English

Twitter English, of course, looks like English. It has the same syntax, for the most part: if we want to report on the actions of our dog, we have to say something like (1) and can’t say something like (2). And the words on Twitter seem to mean the same as they do in regular English. Based on that, you might think it’s the same language.

But that is, I think, not the best way to look at it. One reason for this is that the basic pragmatics — what one does with the language — are so wildly different from the pragmatics of English and other natural languages that I think important facts are occluded by treating Twitter English as standard English.

Thus think about what we can and cannot do with tweets: we cannot say long things without threading tweets. That’s a biggie. We cannot, outside of DMs, address individual people, or small groups, without possibly letting all our mutual followers see what we’re saying. To speak on Twitter is, ipso facto, to address possibly hundreds of people, possibly most of whom one doesn’t know. That’s just impossible in normal English. (Try it: say something now, while reading this. Did you address hundreds of people? If you did, I beseech you, stop reading this and return to whatever more important thing it is you’re doing.)

If we think that the pragmatics are part of the language, then given the wildly different pragmatics of twitter, we should hold that the language we speak on it is different from our everyday language.

Hmm, well, maybe. That argument is only at best mildly convincing. But I can do more: I can show that both the syntax and the semantics of Twitter English are different from normal English. In order to show that, however, we need to explore the nature of semantic theory some more.

Some Principles of Semantics

Here are two principles:

(P1) The core meaning of an expression is what anyone perceiving a use of that expression will understand.

(P2) All the core meaning of an expression is syntactically realized.

(Note for people who know about this stuff: I’m going to be making a ton of simplifying assumptions. I’m pretty confident that if you add in the complexity the points will still stand, but doing so would make this post unreadable to normal people, and I want normal people to read it.)

Let me illustrate these two principles: imagine you’re walking quickly through a mall. You hear a man say ‘I want that coat!’, walk some more, and hear another man, outside another store, also say ‘I want that coat!’ There’s a good sense in which these speakers say different things: the former (let’s say that his name is Joey Joe Joe, and that he’s standing outside a Gap) says something that’s true provided Joey Joe Joe wants Gap’s New Spring Parka. The latter, Shabadoo Junior, is standing outside an Urban Outfitters, and says that Shabadoo Junior wants Urban Outfitters’ Best Pressed Denim Jacket.

Although the two speakers say different things, nevertheless there’s something like a shared meaning graspable by me: that the person speaking wants some coat that they can currently perceive and which has their attention (roughly). Call that core meaning: the thing that anyone hearing an expression understands (for cognoscenti: this sounds like Kaplanian character, but that isn’t quite what I mean, as will become apparent).

To see the second principle, change the example to ‘I want the coat!’ Again plausibly, the first is saying something like that he wants the coat in the window of the Gap, while the second is saying something like that he wants the coat in the window of the Urban Outfitters. However, it’s plausible (although philosophers, being the sort of people they are, have tried to deny it) that in each case, they said, syntactically speaking, the same sentence. And since we’ve agreed that they’ve also both expressed the same core meaning, it seems plausible to say that all and only core meaning is syntactically realized. On this story, ‘in the window of the Gap’ and ‘in the window of the Urban Outfitters’ aren’t core meaning, and so aren’t part of the syntax of the sentence.

I admit, to properly defend these principles would take a lot of work; I won’t do it here. You’ll just have to trust me.

Now note the following crucially important fact:

(*) Perceiving a tweet involves perceiving facts about its reception

So, by P1

(*) Facts about its reception are part of its core meaning

And, by P2

(*) Facts about its reception are syntactically realized

The reason for the first claim is just that those facts (the number of replies, retweets, and favourites) are part of the tweet object that is displayed on the timeline. To see a tweet is to see those things, and so by the independently plausible principles they are part of its core meaning and thus the syntax:

[Image: a tweet as it appears on one’s timeline. Note that information about its reception, its retweets and likes, is automatically included.]

And this makes, I think, independent sense: we can say that the unit of significance, on Twitter, is not the (normal) English sentence embedded in it, but rather the whole thing. The syntax of a tweet, then, consists of an English kernel along with metadata and reception information. We could represent it like so: S,[g],[p],DT,R,RT,F, where the bracketed things (g=gif, p=picture) are optional, S=the English sentence, DT=date time, R=replies, RT=retweets, and F=favourites.
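For readers who think better in code, here is a rough sketch of that representation as a Python data structure. The class name, the field names, the `render` method, and the sample values are all my own illustrative assumptions; nothing here is taken from Twitter's actual software. The only point is that the unit being displayed is the whole bundle, not just the sentence.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Tweet:
    """One tweet unit, S,[g],[p],DT,R,RT,F (hypothetical field names)."""
    sentence: str                  # S: the English kernel
    posted_at: datetime            # DT: date and time
    replies: int                   # R: reception information
    retweets: int                  # RT: reception information
    favourites: int                # F: reception information
    gif: Optional[str] = None      # [g]: optional
    picture: Optional[str] = None  # [p]: optional

    def render(self) -> str:
        """Flatten the whole unit, not just the kernel, into one string."""
        parts = [self.sentence]
        if self.gif is not None:
            parts.append(f"[gif: {self.gif}]")
        if self.picture is not None:
            parts.append(f"[picture: {self.picture}]")
        parts.append(self.posted_at.strftime("%Y-%m-%d %H:%M"))
        parts.append(f"{self.replies} replies")
        parts.append(f"{self.retweets} retweets")
        parts.append(f"{self.favourites} favourites")
        return " | ".join(parts)


# Made-up sample data, purely for illustration.
tweet = Tweet("Ume is barking", datetime(2019, 3, 1, 9, 30),
              replies=2, retweets=5, favourites=11)
rendered = tweet.render()
```

Notice that there is no way to build a `Tweet` here without supplying the reception counts: on this picture they are obligatory constituents, while the gif and picture are optional ones.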

Embedding Arguments

Here’s another argument for the claim that reception information is part of the core meaning of a tweet. In the theory of meaning, semantics, a very important type of argument is the embedding argument. In an embedding argument, you test your analysis of a bit of language by embedding that bit of language in a more complicated linguistic environment and making sure that your analysis still holds up.

In particular, here is an important principle:

(P3) The core meaning of an expression is preserved when embedded

Many of the most famous arguments in semantics make use of something like this principle. To see it in action, consider definite descriptions, expressions like ‘the president’.

Here’s an initially plausible analysis of definite descriptions: they simply serve to stand for the thing that they describe. The core meaning of ‘the president’, then, would simply be Trump.

An embedding argument shows that that can’t be right, because that purported core meaning isn’t preserved when ‘the president’ is embedded in certain environments. Thus consider:

(5) In 1934, the president was a democrat

This sentence doesn’t, on its most plausible reading, say that Trump was a democrat in 1934 — he didn’t exist then. Rather, it says that the president at that time was a democrat. But if Trump himself is the core meaning, and core meaning is preserved, it should say that in 1934 he was a democrat. Since it doesn’t, we can conclude that the core meaning of ‘the president’ is not Trump.

Here’s one other example. Some people think that only fact-stating language is really meaningful. Talk about God, aesthetics, morality, and other purported non-empirical stuff is not in the business of talking about the world. Rather, when you say something like

(6) Murder is wrong

You’re expressing your attitude towards murder. It’s as if, people say, you were saying ‘Boo, murder!’ You’re not really saying something, any more than you’re saying something when you boo a sports team, or say ouch, or scream, or laugh. (6) has no core meaning.

Again, though, an embedding argument causes problems. If that sentence has no meaning, then any sentence of which it is part should also have no meaning. If core meaning, and only core meaning, is preserved in embedding, and there is no core meaning, then nothing is preserved, and embeddings should be meaningless.

But they aren’t. Embedding our sentence in an if-then environment yields a perfectly meaningful sentence:

(7) If murder is wrong, then murderers should be imprisoned.

(By contrast, ‘if Boo murder!, murderers should be imprisoned’ truly is meaningless.)

Examples like this could be multiplied. The point is: embedding arguments are a very good way to learn about the core meaning of an expression.

Now here’s a claim: retweets are embeddings. Just as I embed ‘the president was a democrat’ under ‘In 1934’, so I embed a tweet by clicking retweet, which we could imagine is as if I said ‘I retweet’ (remember back in the day retweets were explicitly marked by ‘RT’ in the tweet body — they were clearly embeddings back then).

But note, retweets preserve information about other retweets, favourites, and replies. By P3, then, information about retweets, favourites, and replies is part of the core meaning of a tweet.
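The retweets-as-embeddings idea can be sketched in a few lines of Python. This is a toy model under my own assumptions (the `Tweet` and `Retweet` classes, the `displayed` function, and the account name are all hypothetical, not a description of how Twitter stores anything). What it shows is just the structural point: the retweet wraps the entire tweet, so whatever is displayed for the original is still displayed once it is embedded.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Tweet:
    sentence: str
    replies: int
    retweets: int
    favourites: int


@dataclass(frozen=True)
class Retweet:
    retweeter: str   # who clicked retweet (hypothetical account name below)
    embedded: Tweet  # the whole tweet is embedded, not only its sentence


def displayed(item: Union[Tweet, Retweet]) -> str:
    """What a reader sees: an embedded tweet keeps its reception information."""
    if isinstance(item, Retweet):
        return f"{item.retweeter} retweeted: {displayed(item.embedded)}"
    return (f"{item.sentence} ({item.replies} replies, "
            f"{item.retweets} retweets, {item.favourites} favourites)")


original = Tweet("Ume is barking", replies=2, retweets=5, favourites=11)
shown = displayed(Retweet("a_follower", original))
```

Here `displayed(original)` appears unchanged inside `shown`, which is the P3-style observation: the counts survive embedding.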

And here is yet a third piece of evidence: increasingly people, when writing pretend tweets of their own for comedic purposes, explicitly write in reception information. I couldn’t find any examples quickly, unfortunately — if I do, I’ll add one later.

(Note, incidentally, that the methodology I have pursued here, a combination of looking at how people actually use language on Twitter and invoking theoretical tools, is a good illustration of how working semanticists actually provide analyses of bits of language. Although it might seem like an unserious object of study, in studying Twitter English one is doing proper linguistic analysis.)

So what?

I have belaboured this point but only because I think it is monumentally significant in several respects. Our pretheoretic understanding of language is that someone says something, others evaluate it, but the evaluation and what is said are different. If you buy my argument, then this isn’t so for Twitter English: facts about a tweet’s reception are part of its meaning. This is a massive change.

(Skippable technical aside: in fact, recent work in philosophy of language and linguistics has room for reception in the theory of meaning. Properly to explain this would be a bit difficult, but consider a sentence like:

(8) Kefir is delicious

I think kefir is delicious, and so am wont to utter this sentence. Others disagree: they might hear me say that sentence and refuse to assent to it. Here is what some take to be compelling: we have a case of faultless disagreement. It’s disagreement: there is some one claim we’re disputing. But it’s faultless: neither of us is really wrong. You have your tastes, I have mine, and neither is correct. One way to make sense of this is to say that our sentence isn’t true once and for all, but can be variably true or false relative to different audiences. Truth is relative.

To find out whether a sentence like this is true, it’s not enough to look out into the world and see how things stand. Rather, you have to also look to a particular person and their particular taste: if, according to their taste, kefir is delicious, then the sentence is true relative to them; if not, not.

It’s thought that this can explain faultless disagreement: there is some object of disagreement, namely whether kefir is delicious, but the way we evaluate it for truth or falsity is different to the way we evaluate ‘Ume is barking’ for truth and falsity. And the big picture point is that this way of evaluation seems to bring in something like reception: how the audience, and in particular the audience’s taste, is. This is known as relativism about truth, and I apologise to any philosophers of language having conniptions about the imprecision in the above.)

Returning to the main line of argument: it’s a big change to introduce facts about reception into the meaning of an expression. And it’s a fact we should be worried about. Because, after all, we don’t get to decide whether to convey reception information as part of the meaning of what we say. It’s an automatic feature of the Twitter software, software designed by engineers at the bidding of executives a central motive of whom is to extract profit from our activities.

It’s pretty widely acknowledged that by communicating on social media we are enriching the tech companies who own the platform. My argument takes this a step further: not only do our actions benefit them, but the very things we say are out of our control and determined by those companies. We are alienated from what we can mean by the platforms by which we communicate on the internet.

It’s worth thinking a bit about exactly what the intruded core meaning of a tweet is. If it’s notable that tweets include reception information, it’s also notable what they don’t contain. For example, they don’t contain details about how big the original audience was, but that is surely important, since a tweet seen by hundreds of thousands of people originally that gets a couple of hundred retweets is much less noteworthy than one that originally was seen only by a small audience. Arguably, it would be better were that information part of the core meaning of the tweet.

But that suggests a thought: let’s grant that platforms determine what we mean. Then a question we can ask is: what sort of institutions would enable us to mean better things? Granted that Twitter English is inferior in certain ways, what would a superior internet language look like?

Making Internet Language Better

A venerable philosophical and scientific idea has it that our languages can and should be improved. In the 19th century, a flurry of work arose with people trying to design universal languages, of which Esperanto and Volapük were two of the earliest and most famous. Around the same time, the German philosopher/logician Gottlob Frege developed his Begriffsschrift, the goal of which was to make a language that didn’t suffer from the defects of vagueness that afflict natural language. And yet more recently, philosophers interested in conceptual engineering have noted that it’s almost certainly wrong that our language is as good as it could be, and given this fact, we should try to engineer our language to make it better.

So let’s ask: what would a good internet language look like? I am interested in hearing others’ views on this, but here are some initial thoughts:

It would be good if there were ways to indicate one’s confidence in a given claim. Arguably, we seldom outright believe things: we rather think them probable, likely, very likely, almost certain, etc. In real life, but especially in an environment where characters count, these important qualifications might tend to fall out, and we can wander into making stronger claims than we in fact hold. This is bad.

But if we realize that meaning is determined by the interface, then we can design the interface to make it automatic that such information is included. We could include a confidence bar, for example, where you indicate how strong your confidence is in the thing you’re saying, which would automatically show up as part of the tweet. Arguably this could be very beneficial.
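To make the confidence bar idea a bit more concrete, here is a minimal sketch of what it might look like. Everything in it is an assumption of mine: the `HedgedTweet` name, the choice to store confidence as a number between 0 and 1, and the bracketed percentage format. The one design point it illustrates is that the confidence is a required part of the thing posted, so it cannot quietly fall out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HedgedTweet:
    sentence: str
    confidence: float  # hypothetical scale: 0.0 (no idea) to 1.0 (certain)

    def __post_init__(self) -> None:
        # Refuse to build a tweet whose confidence is off the scale.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def render(self) -> str:
        """The confidence is shown automatically alongside the sentence."""
        return f"{self.sentence} [confidence: {round(self.confidence * 100)}%]"


hedged = HedgedTweet("Kefir is delicious", confidence=0.8)
shown = hedged.render()
```

A real interface would presumably use a slider or a few labelled steps rather than a raw number, but the effect on the displayed tweet would be the same.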

Here’s another example. There are more or less no limits to how much you can tweet. And this is reflective of a fact about real world speech, namely that the amount you can speak is limited only by your physical capacities. No one has the right to limit how much you speak, because speaking is a bodily action over which you have control.

But maybe that’s bad. Maybe we should prefer a language the pragmatics of which excluded people from talking constantly: maybe that would lead such people to prioritize what they say, increasing the overall quality of discourse.

And that is something that could easily be implemented in Twitter English (while being impossible to implement in normal English). A timer could be introduced such that you could only tweet so many times in a given period. It’s an open question whether that would make language better, but it is something worth exploring.
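Here is one way such a timer might work, as a small sketch. I am assuming a sliding window (at most so many tweets in any stretch of a given length); the `TweetTimer` name, its parameters, and the numbers in the example are invented for illustration, and other designs (a fixed daily quota, say) would do just as well.

```python
from collections import deque


class TweetTimer:
    """Allow at most max_tweets in any window of period_seconds."""

    def __init__(self, max_tweets: int, period_seconds: float) -> None:
        self.max_tweets = max_tweets
        self.period_seconds = period_seconds
        self._sent = deque()  # timestamps of tweets still inside the window

    def try_tweet(self, now: float) -> bool:
        """Return True and record the tweet if it is allowed at time `now`."""
        # Forget tweets that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.period_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_tweets:
            return False  # quota used up: the speaker has to wait
        self._sent.append(now)
        return True


# Two tweets per hour; times are in seconds and entirely made up.
timer = TweetTimer(max_tweets=2, period_seconds=3600)
results = [timer.try_tweet(t) for t in (0, 10, 20, 3601)]
```

In this run the third attempt is refused, and the fourth goes through because the first tweet has by then dropped out of the hour-long window.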

These are merely illustrations. The important point is to realize that we could make better platforms, and that could help us communicate better.


The language of social media is a new sort of language. This is monumentally important, and both worrisome and hope-inspiring. It is worrisome because what we say is determined in part by the structure of the platforms we use, and we don’t have good reason to trust that the designers and owners of those platforms are concerned with letting us communicate in the best way possible.

But it’s hope-inspiring because, recognising this, we recognise the possibility of designing platforms, and thus ways of communicating, that are better than those we currently have. The goal of the constructed languages of the 19th century was to foster world peace by eliminating language barriers, and although that didn’t go so well, the prospect of improving how we live by improving how we speak is one that we should still look on with excitement.