"Oh yeah, our software can TOTALLY detect sarcasm. :rolleyes:"
Shady sales and marketing people have been saying this for years.
Unfortunately, many of the prospective clients on the receiving end of this snarky remark don't realize that the comment is sarcastic until they start reviewing the results.
The truth is that sentiment analysis is a long way from solving the problem of sarcasm, and many linguistics experts still believe it can't be done reliably.
Problem 1 - People can't even reliably interpret sarcasm
"You really need to stop letting things blow your head up because you’re not that cute," is not sarcastic. It's just a blunt statement.
Strangely, if asked whether the statement is sarcastic, a large portion of the population might identify it as such.
That's according to a study by Patricia Rockwell in 2006.
In her report, 'Yeah, right!': A Linguistic Analysis of Self-reported Sarcastic Messages and Their Contexts, Rockwell found that when 218 participants in her study were asked to write down a sarcastic message:
- Only 163 responded
- Of those 163, only 73 (44%) produced responses that were sarcastic
Reality: Many people don't even understand what sarcasm is or how to express it.
Problem 2 - Sarcastic expressions vary widely and often require context
Sarcasm is rarely in-your-face and obvious; the art of sarcasm is, in fact, not being blatant. Even so, context plays a huge role in interpreting it.
Take these three examples:
1. The Big Lebowski
2. The Onion on Obama being elected president
3. Twitter user on a diet
Looking for a pattern in these quotes? Here's the clearest one: all three require context to be understood as sarcastic. For the tweet, it's the #sarcasm hashtag.
Sure, there are instances where this isn't the case. For example, "@Delta Losing my bag is a great way to keep me as a customer."
But more often than not, sarcasm isn't this blatant, and a statement must be surrounded by some sort of context to be understood as sarcastic.
Problem 3 - Training sets for machine learning are limited and can be overfit
Overfitting the model. It's a common term in statistics and machine learning.
In basic terms, it means that an algorithm has started to memorize the dataset used to train the system instead of learning the patterns within that dataset.
It's kind of like studying for a math test by memorizing the answers to the pre-test rather than learning how to actually solve the problems.
This is one thing that makes sarcasm a particularly difficult aspect of language to train machines to understand.
Machines memorize the fact that certain tweets are sarcastic but never understand why, and therefore can't reliably identify sarcasm within a new set of data.
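The memorization problem can be seen in a deliberately silly toy sketch: a "classifier" that just stores every training tweet alongside its label. All tweets and labels below are invented for illustration, and no real model is this crude, but the failure mode is the same one overfitting produces.

```python
# Toy sketch of memorization vs. generalization. The "classifier" simply
# stores every training tweet with its label (all examples are invented).

train = {
    "great, another delayed flight #sarcasm": True,
    "i love waiting on hold for an hour #sarcasm": True,
    "the weather is lovely today": False,
}

def memorizing_classifier(tweet):
    # Perfect on training data, useless on anything unseen: it has
    # memorized answers rather than learned any pattern of sarcasm.
    return train.get(tweet)  # None means "no idea"

# 100% accuracy on the training set...
assert all(memorizing_classifier(t) == label for t, label in train.items())

# ...but no prediction at all for a new, clearly sarcastic tweet.
print(memorizing_classifier("wow, my bag got lost again. fantastic service"))
# prints: None
```

A real overfit model fails less visibly: instead of returning nothing on unseen data, it returns confident guesses based on memorized surface details, which is why it can look accurate in training and still perform poorly in the wild.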
Furthermore, a study by researchers at Rutgers called into question the reliability of sarcasm training sets themselves, finding that,
"...low performance of human coders in the classification task of sarcastic tweets suggests that gold standards built by using labels given by human coders other than tweets’ authors may not be reliable."
Not only is it difficult to build an accurate, representative sample, but the complexity of sarcasm can often lead the algorithm to overfit.
Potential solution - Context aware weighting
Let's go back to context.
One of the most promising solutions for training machines to identify sarcasm is a method constructed by David Bamman and Noah A. Smith that looks for context around tweets.
Rather than just analyzing tweets on their own, the model constructed by Bamman and Smith also looks at attributes of the author (author features), attributes of the intended recipient of a tweet (audience features), and the attributes of responses to potentially sarcastic tweets (response features).
On their sample set of data, Bamman and Smith found that adding each feature set incrementally improved the accuracy of the model.
When using all features, baseline accuracy increased from 75.4% to 85.1%.
One very important thing to note is that the data sample used in this study was created using self-identified sarcastic tweets that included the hashtags #sarcastic or #sarcasm.
The hashtags were used for classification but removed during testing.
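The layering idea can be sketched as follows. This is only an illustration of the general approach, not Bamman and Smith's actual implementation; every feature name, field, and value below is invented for the example.

```python
# Illustrative sketch of layering tweet, author, audience, and response
# features into one vector for a classifier. All feature names and data
# fields here are hypothetical, not from the original paper.

def tweet_features(text):
    # Features drawn from the tweet text itself
    return {
        "num_exclaims": text.count("!"),
        "has_hyperbole": any(w in text.lower() for w in ("totally", "best", "great")),
    }

def author_features(author):
    # Features drawn from the author's profile and history (assumed fields)
    return {"author_sarcasm_rate": author.get("sarcasm_rate", 0.0)}

def audience_features(author, recipient):
    # Features about the author/recipient relationship (assumed fields)
    return {"mutual_mentions": float(recipient["id"] in author.get("mentioned", set()))}

def response_features(replies):
    # Features from responses to the potentially sarcastic tweet
    return {
        "reply_count": len(replies),
        "replies_with_lol": sum("lol" in r.lower() for r in replies),
    }

def build_vector(text, author, recipient, replies):
    # Each added feature set enriches what the classifier sees
    features = {}
    features.update(tweet_features(text))
    features.update(author_features(author))
    features.update(audience_features(author, recipient))
    features.update(response_features(replies))
    return features

vec = build_vector(
    "Losing my bag is a great way to keep me as a customer",
    {"id": "u1", "sarcasm_rate": 0.3, "mentioned": {"delta"}},
    {"id": "delta"},
    ["lol right?"],
)
print(sorted(vec))
```

The design point is simply that each feature set adds signal the tweet text alone lacks, which is why accuracy in the study improved as feature sets were stacked.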
Bringing context into the equation is probably the best available approach to getting more reliable results. Yet the question remains whether the success of this model can be replicated on less overtly sarcastic datasets.
Though promising approaches exist, fundamental problems remain
In their study, Identifying Sarcasm in Twitter: A Closer Look, researchers noted,
"We found that automatic classification can be as good as human classification; however, the accuracy is still low. Our results demonstrate the difficulty of sarcasm classification for both humans and machine learning methods."
Sure, when somebody identifies something as sarcastic with a hashtag or an explicit statement, it's easy to deal with. The problem is when it isn't stated explicitly.
Given all that we now know, it makes perfect sense why software providers with a sentiment analysis component would tell you that their software can detect sarcasm. Right?... #Sarcasm
The bottom line is this: Using natural language processing to detect sarcasm on the internet still has a long way to go and may never be particularly reliable.