Whether you’re a social media analyst scrambling for data before a pitch or a consumer insights team working to deliver new findings, the research side of analyzing social media data can be daunting.
High expectations from your C-suite, internal teams, or clients can put extreme pressure on you to come up with impressive findings. However, regardless of who receives the data, the accuracy and authenticity of what you provide rests solely on your shoulders.
The consequences of delivering bad social media analysis can range from bombing a new client pitch to costing a brand millions, maybe even billions, of dollars (for example, when you advise your own or a client’s brand to go for “X” based on your research).
All too often, researchers are tempted to begin with the conclusion instead of working towards it. Here are the two biggest mistakes made in social media research and how to avoid them.
#1 Misrepresenting Data
This problem tends to arise when an analyst tries to validate a claim with data that makes up too small a share of the sample to be significant. When this happens, the statements become less reliable and misrepresent what the data actually shows.
For example: You have been tasked with finding out what factors are most important to moms when shopping for poultry and set your research up as follows:
- You have a sample of 100,000 identified moms talking online about buying chicken over the past year.
- You then make a claim that when moms are purchasing poultry, having cage free chicken is an important factor in their purchasing behavior.
- The list of factors also includes price, value packs, and antibiotic free chicken.
Out of the 100,000 identified moms, only 112 mentioned cage free chicken when discussing their purchases.
Yet 823 mention price, 744 mention value packs, and 1,265 mention antibiotic free chicken, all of which come up far more often than cage free chicken.
- In this case, cage free chicken is not an important enough factor to be listed.
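To make the comparison concrete, here is a minimal Python sketch that turns the counts from the example into shares of the sample and ranks them. The counts come from the example above; the variable names and the `share_of_sample` helper are illustrative assumptions, not part of any particular analytics tool.

```python
# Mention counts taken from the poultry example above.
total_moms = 100_000
mentions = {
    "cage free": 112,
    "price": 823,
    "value packs": 744,
    "antibiotic free": 1_265,
}


def share_of_sample(count, total):
    """Return the percentage of the sample that mentioned a factor."""
    return 100 * count / total


# Share of the sample for each factor, ranked from most to least mentioned.
shares = {factor: share_of_sample(n, total_moms) for factor, n in mentions.items()}
ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)

for factor, pct in ranked:
    print(f"{factor}: {pct:.2f}% of identified moms")
```

Seeing every factor as a share of the same sample makes it harder to single out one factor just because it fits the story you hoped to tell.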
TIP: Set an “at least” percentage to hit
Setting an “at least” percentage means that for a variable (search query) to be deemed significant, it must account for at least a set share, for example 5 percent, of the total sources/mentions. Choose the percentage based on the number of sources or mentions you are working with.
Determining a solid sample number with social media data can be difficult because the choice is left to the analyst. Use your best judgment and ask honestly what a good sample size would be based on the number of sources/posts available.
If there is a sample of 100,000 sources, one may go with 5 percent (at least 5,000 sources), or with 15 percent for a sample of 100 (at least 15 sources), since 5 percent may be too low in the second instance. Applying this tip when analyzing social media data will help surface more relevant insights.
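The “at least” rule can be sketched as a small check. This is only an illustration under the assumptions stated in the tip: the function names `min_sources` and `is_significant` are hypothetical, and the percentage remains a judgment call for the analyst.

```python
import math


def min_sources(total_sources, at_least_pct):
    """Smallest number of sources that meets the "at least" percentage."""
    return math.ceil(total_sources * at_least_pct / 100)


def is_significant(count, total_sources, at_least_pct):
    """True if a variable's source count reaches the chosen threshold."""
    return count >= min_sources(total_sources, at_least_pct)


# Example: a small sample of 100 sources with a 15 percent threshold.
threshold = min_sources(100, 15)
print(threshold, is_significant(12, 100, 15), is_significant(20, 100, 15))
```

Writing the threshold down before looking at the results is the point: it keeps you from lowering the bar afterwards to rescue a favorite finding.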
#2 Forcing the Data
When you force the data, you’re jumping to a conclusion without looking further into the data.
As a hypothetical example, let’s say you are looking at different diets commonly used in the United States and decide to focus on Paleo:
- Over a two-year period, women accounted for 65% of all online conversations mentioning “Paleo Diet,” while men accounted for the remaining 35%.
- Instead of jumping directly to saying that more women than men prefer the Paleo diet, dig deeper into why women talk about the Paleo diet more than men do.
- You may find that women are talking more about Paleo because a large fraction of the conversations are women describing cooking and baking difficulties on the Paleo diet.
- So, do women really prefer the Paleo diet more than men do? To answer that question, you’ll have to look at more than just a general share of voice score.
Avoiding the Conclusion-First Trap
So, how exactly can you make sure you avoid both mistakes? Here are two questions to ask yourself:
- Ask yourself, “How did I get this answer?” This simply means walking through your methodology. You may be surprised at how you derived your answer once you talk through your steps.
- Look at the number of sources (the total number of people talking about your particular search query, not the number of mentions) and ask yourself, “Is this representative of my claim?” If you have 200 sources that are specifically talking about disliking McDonald’s Big Mac out of 800K sources talking about Big Mac in general, don’t say, “People who talk about McDonald’s Big Mac express a lot of negativity towards it.” 200 sources out of 800K is far too small a share to support that claim.
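As a quick sanity check on the Big Mac example, the share behind the claim can be computed directly. This is a minimal sketch using the figures above; the `share_pct` helper is a hypothetical name used only for illustration.

```python
def share_pct(part, total):
    """Return part as a percentage of total."""
    return 100 * part / total


negative_sources = 200    # sources specifically disliking the Big Mac
total_sources = 800_000   # sources talking about the Big Mac in general

negative_share = share_pct(negative_sources, total_sources)
print(f"{negative_share:.3f}% of sources back the claim")
```

A share this small describes a sliver of the conversation, not “people who talk about McDonald’s Big Mac” as a whole.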