What Is the Twitter Sentiment Analysis Tool and Why Do People Use It?

It feels like cheating but it really isn't

By Giorgi Mikhelidze • Published 4 years ago • 4 min read

The Twitter sentiment analysis tool has been around ever since APIs first started to appear on large social platforms. Developers have been writing fairly involved code to identify people’s reactions towards a specific product or company.

The tool keeps getting better as AI technology develops, thanks to skilled engineers at Google, Facebook and sometimes even Twitter itself.

The reason sentiment analysis works so well on Twitter is the character limit. The AI tends to be much more accurate with short texts, and since Twitter gives you a maximum of 280 characters, the job is simply easier for the algorithm.

What do people use this algorithm for?

One of the primary uses of this technology is to determine the sentiment towards a specific financial asset. For example, people may try to find out whether investors are positive about the future of Bitcoin or some other currency.

The algorithm displays the average score based on up to millions of tweets and lets the user know whether the sentiment is positive or negative.

Another use is to determine the sentiment of a company’s audience. This usually requires a large following, although the result gets more distorted the more replies a specific page has. For example, a small company on Twitter used sentiment analysis to find out in what context people were mentioning its brand and how they were reacting to its product or content.

However, to truly understand how the tool works, we need to go much deeper into its technical composition.

How the tool works

To use the tool to its fullest, you will need three main resources.

Those are the Twitter API, a lexicon library and a little knowledge of Python, Java or any other language that can run on the server side. This is basically Data Science 101, so for those who have at least some experience in it, it should be a piece of cake.

The first task is to import the Twitter API client as well as the lexicon library into the file where you will write the code.

Once that’s finished, you will have to pick a specific keyword. For example, let’s say we want to find out what people think about Bitcoin. To do so, we need to put the keyword “Bitcoin” into the code, but we also need to set the timeframe.

Why? Because a tweet that was published in 2015 isn’t necessarily going to help us gauge where the price of Bitcoin is heading a week from now. So, you can create a variable that taps into the date the API reports for each tweet and set it to no earlier than October 2019. This limits the results to only the most relevant tweets that could affect the price in the near future.
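To make the keyword and timeframe step more concrete, here is a minimal sketch in Python. It does not call the real Twitter API. The SAMPLE_TWEETS list, the filter_tweets function and the field names "text" and "created_at" are all made up for illustration and simply stand in for whatever your API client returns.

```python
from datetime import datetime, timezone

# Hypothetical sample data standing in for the results of a Twitter API call.
SAMPLE_TWEETS = [
    {"text": "Bitcoin is going to the moon",
     "created_at": datetime(2019, 11, 2, tzinfo=timezone.utc)},
    {"text": "I sold all my bitcoin, bad idea",
     "created_at": datetime(2015, 3, 14, tzinfo=timezone.utc)},
    {"text": "Nice weather today",
     "created_at": datetime(2019, 12, 1, tzinfo=timezone.utc)},
]


def filter_tweets(tweets, keyword, not_before):
    """Keep tweets that mention the keyword and were posted on or after not_before."""
    keyword = keyword.lower()
    return [
        tweet for tweet in tweets
        if keyword in tweet["text"].lower() and tweet["created_at"] >= not_before
    ]


KEYWORD = "Bitcoin"
NOT_BEFORE = datetime(2019, 10, 1, tzinfo=timezone.utc)  # no earlier than October 2019

recent = filter_tweets(SAMPLE_TWEETS, KEYWORD, NOT_BEFORE)
print([tweet["text"] for tweet in recent])
```

With a real API client you would normally pass the keyword and the start date in the request itself, so that the filtering happens on Twitter's side rather than in your own code.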

Once all of the code is finished, it is time to run the program. The console will output something similar to this: Sentiment: -0.00000132.

That’s just a number I made up, so the result of your search for “Bitcoin” will be different. Now, how does the algorithm calculate this number? Simple: by assigning a specific number to the sentiment of each tweet. But how does it break a tweet down? Through the lexicon library we imported earlier.

Breaking down the tweet

The lexicon library comes with sentiment scores already assigned to specific words. The scores range from -1 (very negative) to +1 (very positive). For example, the word “bad” would have a score of -0.5, but a combination of words such as “very” and “bad” would have a score of -1. It’s quite simple.
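The lookup described above can be sketched in a few lines of Python. This is only an illustration: the LEXICON and INTENSIFIERS dictionaries and their values are invented to match the "bad" and "very bad" example, and real lexicon libraries use much larger word lists and their own rules for combining words.

```python
# Hypothetical mini-lexicon; real lexicon libraries ship thousands of scored words.
LEXICON = {"good": 0.5, "bad": -0.5, "terrible": -0.8}

# Assumed intensifiers that multiply the score of the next sentiment word.
INTENSIFIERS = {"very": 2.0}


def score_words(words):
    """Return one score per sentiment word, clamped to the range -1 to +1."""
    scores = []
    multiplier = 1.0
    for word in words:
        word = word.lower()
        if word in INTENSIFIERS:
            # Remember the boost and apply it to the next sentiment word.
            multiplier *= INTENSIFIERS[word]
        elif word in LEXICON:
            raw = LEXICON[word] * multiplier
            scores.append(max(-1.0, min(1.0, raw)))
            multiplier = 1.0
        else:
            # A neutral word breaks the link between intensifier and sentiment word.
            multiplier = 1.0
    return scores


print(score_words(["bad"]))          # [-0.5]
print(score_words(["very", "bad"]))  # [-1.0]
```

The clamp keeps boosted scores inside the -1 to +1 range, so "very terrible" cannot end up below -1.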

Now, once our algorithm finds a tweet that contains the keyword we specified, it has to break the whole sentence down. This is called tokenization. Words turn into tokens, which are then looked up in the lexicon library.

Each token is given its sentiment score, and only then are the tokens considered in context.

Once the context has been analyzed, the average score of the whole Tweet will be output in the console.
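Here is a rough sketch of the tokenize, look up and average steps for a single tweet. The regular expression used for tokenization, the tiny LEXICON and the negation rule used as a stand-in for "context" are all assumptions made for this example, not how any particular library does it.

```python
import re

# Hypothetical mini-lexicon with scores between -1 and +1.
LEXICON = {"good": 0.5, "great": 0.8, "bad": -0.5, "terrible": -0.8}

# Assumed context rule: a negation right before a sentiment word flips its sign.
NEGATIONS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_tweet(text):
    """Average the scores of all sentiment words found in the tweet."""
    tokens = tokenize(text)
    scores = []
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue  # neutral words do not affect the score
        score = LEXICON[token]
        if i > 0 and tokens[i - 1] in NEGATIONS:
            score = -score
        scores.append(score)
    # A tweet with no sentiment words is treated as neutral.
    return sum(scores) / len(scores) if scores else 0.0


print(tokenize("Bitcoin is NOT bad!"))     # ['bitcoin', 'is', 'not', 'bad']
print(score_tweet("Bitcoin is NOT bad!"))  # 0.5
```

Averaging only over the words found in the lexicon is one possible choice. Dividing by the total number of tokens instead would pull long, mostly neutral tweets closer to zero.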

This happens to every single tweet our algorithm finds with the specified keyword and timeframe. Sometimes there could be hundreds, and sometimes even millions.

It then compiles all of these sentiment scores and outputs an average. It’s almost impossible to get a complete -1 or +1 score. Usually, it’s somewhere in between these numbers.
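The final averaging step could look something like the sketch below. The per-tweet scores are invented sample values, and the 0.05 threshold used to call a result positive or negative is an arbitrary assumption chosen for this example.

```python
from statistics import mean


def overall_sentiment(tweet_scores):
    """Average the per-tweet scores; an empty list counts as neutral."""
    return mean(tweet_scores) if tweet_scores else 0.0


def label(score, threshold=0.05):
    """Turn the average into a readable verdict (the threshold is an assumption)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Invented per-tweet scores, each between -1 and +1.
tweet_scores = [0.5, -0.25, 0.0, 0.75]

average = overall_sentiment(tweet_scores)
print(f"Sentiment: {average}")  # Sentiment: 0.25
print(label(average))           # positive
```

Because every tweet score lies between -1 and +1, the average does too, and a mix of opinions naturally lands somewhere in the middle.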

But once there is a legitimate number, an investor can see how people view a specific asset for the near future and structure their investment strategy accordingly, in both the short and the long run.

By the way, everything I just explained happens in no more than a second.
