Social Media Data Mining – A Powerful Tool for Decision Making

posted by Jing Tang on Thursday, May 9, 2019

Over the past twenty years, the flow of information has increased dramatically, largely due to advances in technology and the adoption of social media. The time between an issue first emerging and its being noticed on a large scale has shrunk, and that acceleration has affected nearly every aspect of life. Membership organizations, once accustomed to long lead times in issue management, must now identify issues, understand their implications for their members and their organization, and act appropriately within a much more compressed, faster-moving cycle.

Social media data mining is one type of big data mining. It combines data from Twitter, Facebook, Instagram, YouTube, and other platforms and runs a series of analyses on it. Membership organizations can use the results to better understand their members, make quicker and more appropriate decisions, and target potential new clients.

We applied social media data mining to a sample dataset of 10,000 hockey fans, identified by encrypted email addresses, with more than 4 million social media data records across all participants. The exercise shows that, even with very little experience in the hockey field (we watch hockey games on occasion), rich insights can be drawn from the data. The data structure is shown below (Figure 1). The majority of participants came from the United States; the top three states, in descending order, were California, New York, and Illinois (Figure 2).

Figure 1

Figure 2
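A ranking like the one summarized in Figure 2 comes down to a simple tally. The sketch below is only an illustration: the records are made up, and the field name "state" and the helper top_states are assumptions, not part of the actual dataset or analysis.

```python
from collections import Counter

# Made-up participant records for illustration only.
# The "state" field name is an assumption about how location might be stored.
participants = (
    [{"state": "California"}] * 4
    + [{"state": "New York"}] * 3
    + [{"state": "Illinois"}] * 2
    + [{"state": "Texas"}] * 1
    + [{"state": None}]  # a record with no usable location
)

def top_states(records, n=3):
    """Return the n most common states as (state, count) pairs, largest first."""
    counts = Counter(r["state"] for r in records if r.get("state"))
    return counts.most_common(n)

print(top_states(participants))
```

Records without a state are skipped rather than counted as their own category, which keeps missing data out of the ranking.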

Based on participants’ personalities (measured by IBM Watson Personality Insights values) and their interests in different brands, media sources, and celebrities, we applied statistical methods to identify multi-dimensional, self-manifesting “tribes” and gained membership insights for each tribe. The members of each tribe share a distinctive personality profile, common interests, and a similar age group, gender, and race. Overall, we identified 8 naturally occurring tribes among the 10,000 participants. For instance, the first two tribes are defined as “Sociables” and “Challenge Seekers”.
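The article does not name the statistical method used to find the tribes. As one possible illustration, the sketch below groups participants with plain k-means clustering, which is a stand-in and not necessarily the method applied here. The two-dimensional scores (a hypothetical "need for challenge" and "agreeableness") and the function name kmeans are assumptions made for the example.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means on equal-length numeric vectors; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct participants chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Made-up scores: (need for challenge, agreeableness), each between 0 and 1.
scores = [
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.85),  # low challenge, high agreeableness
    (0.90, 0.20), (0.80, 0.30), (0.85, 0.25),  # high challenge, low agreeableness
]
centroids, labels = kmeans(scores, k=2)
print(labels)
```

In a real analysis each participant would have many more dimensions (personality values plus brand, media, and celebrity interests), and the number of groups would need to be chosen from the data rather than fixed in advance.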

There are 890 participants in the hockey sample data identified as Sociables. Members of this tribe need harmony and love, like to consider others, and trust people. They don’t need to be challenged, don’t like high-intensity activities, and are prone to anxiety and vulnerability. Compared to the other tribes, this tribe likes the “Ellen Show” the most and is more focused on local news, events, and entertainment gossip. Interaction with Sociables is therefore likely to be most effective when it appeals to their need for practical, modest activities. This tribe may also respond to opportunities to be part of a “like-minded” group.

The Challenge Seekers, by contrast, comprise 2,114 hockey fans. They need a high level of challenge and adventure, strive for achievement through self-discipline, active imagination, and energy, and have low levels of anxiety and depression. This tribe doesn’t often exhibit agreeableness or orderliness. Its members tend to travel a lot for business and watch international and national news. To appeal to this tribe successfully, adopt a strategy of presenting challenges for them to overcome. Challenges could be individual or group focused.
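Statements such as one tribe liking a show more than the others rest on comparing how common an interest is within each tribe. The sketch below shows one way that comparison could be computed. The member records, the "follows" field, and the helper interest_share_by_tribe are all hypothetical and carry no real results.

```python
from collections import defaultdict

# Made-up member records; "tribe" and "follows" are assumed field names.
members = [
    {"tribe": "Sociables", "follows": {"Ellen Show", "Local News"}},
    {"tribe": "Sociables", "follows": {"Ellen Show"}},
    {"tribe": "Sociables", "follows": {"Local News"}},
    {"tribe": "Challenge Seekers", "follows": {"International News"}},
    {"tribe": "Challenge Seekers", "follows": {"Ellen Show", "National News"}},
    {"tribe": "Challenge Seekers", "follows": {"National News"}},
]

def interest_share_by_tribe(rows, interest):
    """Fraction of each tribe's members who follow the given interest."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row["tribe"]] += 1
        if interest in row["follows"]:
            hits[row["tribe"]] += 1
    return {tribe: hits[tribe] / totals[tribe] for tribe in totals}

shares = interest_share_by_tribe(members, "Ellen Show")
top_tribe = max(shares, key=shares.get)  # tribe with the highest share
print(shares, top_tribe)
```

Using a share rather than a raw count matters because the tribes differ in size: a larger tribe can have more followers of a show in absolute terms while being less interested in it proportionally.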

Both tribes have distinct characteristics, and without social media data mining it would not be practical to segment a dataset of this size and extract these insights manually. With accurate information for targeting the right population, strategy making becomes more efficient. This hockey fan sample dataset shows what can be done with social media data mining. The same statistical analysis can be applied to any topic with a relevant social media dataset, such as water quality, food nutrition, or policy issues. Moreover, we could bring in Twitter content and apply text mining for a more complete analysis; please see an example here for a text mining analysis. Working with real data enabled DIS analysts to develop a service that provides the client with comprehensive insights regarding an issue or topic.

About The Author

Jing Tang

As a Statistician for Decision Innovation Solutions, Jing Tang is responsible for analyzing agricultural data to help clients make better strategic business decisions, and for helping co-workers improve model prediction and model estimation preference. Jing works with clients and coworkers on da ...