A Quick Overview of Data Analytics and Sentiment Analysis!

Data Analytics is just a fancy name for the discipline of using a computer to analyze large amounts of data by applying a set of instructions (aka an algorithm or model) to that data. Why, you ask? The purpose is to fulfill a specific need, solve a problem, or answer a specific question. In our case, we ultimately want to answer the question: What is the attitude, or sentiment, of the person who wrote the article, blog, or post? The sentiment is usually defined as positive, neutral, or negative. Assigning one of these labels is known as classification, and it is a form of descriptive data analytics.

Great, now that we know what we are looking for, we need to figure out how to get the information (data). Here again, the computer is very useful. A vast amount of information is available and legal for the general public to use, and we use the computer to collect that information for us.

The Journey Continues!

The next step is to define exactly how we are going to go from the raw information to a sentiment value. We use a "model" to do this for us. What is great is that there are widely used models available to the public. Such a model uses keywords and their combinations to classify the text as positive, neutral, or negative. What is even better is that we can develop our own model and train it on the specific types of comments that we are interested in. The computer is actually learning how to determine the sentiment based on how we train it.
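
To make the idea of "training" a little more concrete, here is a minimal sketch in Python. It assumes a tiny, hand-labeled set of example comments (invented here purely for illustration) and uses a simple word-counting approach known as Naive Bayes. The names TRAINING_DATA, train, and classify are hypothetical, and a real model would need far more examples.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, hand-labeled training comments (for illustration only).
TRAINING_DATA = [
    ("great product, I love it", "positive"),
    ("excellent service and great support", "positive"),
    ("terrible experience, I hate it", "negative"),
    ("awful service and terrible support", "negative"),
    ("the package arrived on tuesday", "neutral"),
    ("the store opens at nine", "neutral"),
]


def tokenize(text):
    """Lowercase the text and split it into words without punctuation."""
    words = (w.strip(".,!?\"'") for w in text.lower().split())
    return [w for w in words if w]


def train(examples):
    """Count how often each word appears under each sentiment label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Pick the label whose training words best match the new text."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from how common the label is, then add word evidence.
        score = math.log(label_counts[label] / total_examples)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                # Add-one smoothing keeps unseen word/label pairs above zero.
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_DATA)
print(classify("I love the great support", model))
print(classify("what a terrible, awful product", model))
```

Changing the examples in TRAINING_DATA changes what the model learns, which is the sense in which the computer's answers depend on how we train it.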

Once the model is in place, we can classify the sentiment of the article, blog, or post based on the keywords and how they are used.
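
As a rough illustration of classifying by keywords and how they are used, here is a minimal Python sketch. The word lists below are made up for this example (real keyword lists are far larger), and the only "usage" rule it applies is flipping a keyword's meaning when a negation word such as "not" comes right before it. The function name keyword_sentiment is hypothetical.

```python
# Hypothetical keyword lists; real sentiment lexicons are far larger.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "awful", "sad"}
NEGATIONS = {"not", "never", "no"}


def keyword_sentiment(text):
    """Classify text as positive, neutral, or negative from its keywords."""
    words = [w.strip(".,!?\"'") for w in text.lower().split()]
    score = 0
    for i, word in enumerate(words):
        if word in POSITIVE_WORDS:
            value = 1
        elif word in NEGATIVE_WORDS:
            value = -1
        else:
            continue  # not a keyword, so it does not affect the score
        # "How the word is used": a negation directly before it flips it.
        if i > 0 and words[i - 1] in NEGATIONS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(keyword_sentiment("I love this great blog"))
print(keyword_sentiment("This post is not good"))
print(keyword_sentiment("The article was published on Monday"))
```

Text with no keywords at all, or with positive and negative keywords that cancel out, comes back as neutral in this sketch.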

The Challenge!

How would you classify the sentiment of the following statement: "ABC stock crushed XYZ"? It could be positive, negative, or neutral. If you own ABC stock, it is a positive sentiment. If you own XYZ stock, it is a negative sentiment. If you don't own either, you don't care, so it is a neutral sentiment. This illustrates the challenge in sentiment analysis. No model is, or can be, 100% accurate, because the sentiment is determined not only by the author but also by the reader.
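
The reader's side of this can be written down as a tiny Python sketch. The function reader_sentiment and its parameters are hypothetical and simply encode the three cases described above. A reader who owns both stocks is not covered in the text, so treating that case as neutral is an assumption made here.

```python
def reader_sentiment(winner, loser, holdings):
    """Sentiment of '<winner> stock crushed <loser>' for a given reader.

    `holdings` is the set of stocks the reader owns.
    """
    owns_winner = winner in holdings
    owns_loser = loser in holdings
    if owns_winner and not owns_loser:
        return "positive"
    if owns_loser and not owns_winner:
        return "negative"
    # Owns neither stock. Owning both is also treated as neutral here,
    # which is an assumption and not something the article states.
    return "neutral"


# The same sentence, read by three different readers.
print(reader_sentiment("ABC", "XYZ", {"ABC"}))
print(reader_sentiment("ABC", "XYZ", {"XYZ"}))
print(reader_sentiment("ABC", "XYZ", set()))
```

The sentence never changes, only the reader's holdings do, which is exactly the information a model working from the text alone does not have.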

But does this mean we throw it out? Absolutely not. Answering the question of what is happening opens up our view to a second question: Why is this happening? This is called "diagnostic" analytics, and it is the second type of Data Analytics. With the above example, we can begin to discover why ABC stock crushed XYZ. We are using Data Analytics as a time-saving and efficient tool for our growth, enrichment, and understanding.
