What It’s Like Inside the World’s First Emoji Convention 😜

Linguists such as Tyler Schnoebelen, who has done several studies on emoji and emoticons, delved into the power and diversity of the images as a method of communication, while joining the chorus of his peers who say, no, emoji are not a language. He noted how the meaning of one image will differ depending on who is texting it, where, and to whom. For example, emoji users in the United States might never end their war over whether 🙏 is a high five or a prayer, but it means something entirely different in Japan: “thank you.” The meaning of emoji, just like that of words, isn’t static either. “Language changes,” says Schnoebelen, “and emoji are changing.”

Emoji, Press
Tyler Schnoebelen
Emoji linguistics

True language universals can be hard to find, but two of the most solid are (1) languages change and (2) people are really good at adapting to what’s handy. We’ll explore the ways emoji are changing, the ways they haven’t, and where to look for hot spots of innovation.

Emoji
Tyler Schnoebelen
Extreme language in presidential debates: Reagan, Trump and everyone in between

If you follow politics in America even a little bit, you know that Republicans talk a lot about taxes and that Donald Trump loves the word tremendous. But how do these rank relative to each other and to what Democrats (and Hillary Clinton, in particular) tend to talk about? Well, one finding is that over the years, Republican candidates have been even more preoccupied with Hillary Clinton than they have been with Ronald Reagan. Another finding is that the debates for the current election have been ~157% more negative than all previous debates.

U.S. presidential debates through the eyes of a computer

This post wraps up a series I’ve been doing on using machine learning models to understand recent American political debates (here and here). By taking all the transcripts of the debates since last year, I show which words and phrases most distinguish debaters’ styles and issues. Training a computer to identify speakers is usually thought of as a way of doing forensics or personalization. But here, I’m interested in something closer to summarization. If you can pick one section of talk for each candidate from the last debate, which moments are most consistent with everything they’ve said up to then?

The most Clintonian and Trumpian moments of the VP Debate

Last week, I wrote about the most Clintonian and Trumpesque moments of the first presidential debate. In this post, I’m going to ask the same question of the vice presidential debate: at what moments did Tim Kaine and Mike Pence sound most like the candidates at the top of their tickets? We’ll then dissect the results to see how different assumptions about the data would affect the results of this kind of linguistic style analysis.

Tyler Schnoebelen
More data beats better algorithms

Most academic papers and blogs about machine learning focus on improvements to algorithms and features. At the same time, the widely acknowledged truth is that throwing more training data into the mix beats work on algorithms and features. This post will get down and dirty with algorithms and features vs. training data by looking at a 12-way classification problem: people accusing banks of unfair, deceptive, or abusive practices.

Nattering Nabobs of Negativity: Bigrams, “Nots,” and Text Classification

You can get pretty far in text classification just by treating documents as bags of words where word order doesn’t matter. So you’d treat “It’s not reliable and it’s not cheap” the same as “It’s cheap and it’s not not reliable”, even though the first is a strong indictment and the second is a qualified recommendation. Surely it’s dangerous to ignore the ways words come together to make meaning, right?
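To make the point concrete, here is a minimal sketch (not the post's own code) that compares the two example sentences as bags of single words and as bags of bigrams. The helper names `tokens`, `bag_of_words`, and `bag_of_bigrams` are made up for illustration, and the tokenizer is a deliberately naive whitespace split.

```python
from collections import Counter


def tokens(text):
    # Naive tokenizer (an assumption for this sketch): lowercase,
    # drop commas and periods, split on whitespace.
    return text.lower().replace(",", "").replace(".", "").split()


def bag_of_words(text):
    # Count each word; word order is discarded.
    return Counter(tokens(text))


def bag_of_bigrams(text):
    # Count adjacent word pairs, which keeps a little local order.
    words = tokens(text)
    return Counter(zip(words, words[1:]))


a = "It's not reliable and it's not cheap"
b = "It's cheap and it's not not reliable"

same_unigrams = bag_of_words(a) == bag_of_words(b)
same_bigrams = bag_of_bigrams(a) == bag_of_bigrams(b)
print(same_unigrams, same_bigrams)
```

Under this tokenization the two sentences contain exactly the same words with the same counts, so a unigram model cannot tell them apart, while pairs such as ("not", "cheap") and ("not", "not") appear in only one of them.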

Why emojis are here to stay as an essential part of our language

Despite its obvious influence on how we communicate, the vast emoji set is not considered a language in itself. Tyler Schnoebelen, who probably knows more about emojis than anyone else in the world, is a computational linguist who wrote his PhD thesis at Stanford University on the use of emoticons and emojis. He told Time magazine that there are rules and patterns in how we use emojis, but there is no grammar or language structure for forming a sentence with them.

Press
Tyler Schnoebelen
Training an AI doctor

Some of the earliest applications of artificial intelligence in healthcare were in diagnosis—it was a major push in expert systems, for example, where you aim to build up a knowledge base that lets software be as good as a human clinician. Expert systems hit their peak in the late 1980s, but required a lot of knowledge to be encoded by people who had lots of other things to do. Hardware was also a problem for AI in the 1980s.

Failed vs. fighting: the linguistic differences between speeches at the RNC and the DNC conventions

We know that Republicans and Democrats talk differently, but what’s the best way to describe these differences? Commentators note the relative darkness of the Republican National Convention and the focus on optimism and higher production quality at the Democratic National Convention. Looking at the words speakers use helps, but you can’t just use simple frequency (for details, check out the methodology section at the bottom).
