Can you predict a Bitcoin market crash with Twitter?

A live-tracking Dash app

Posted on March 20, 2018

I built a Dash web app that tracks tweets, analyzes their sentiment, and plots the results on a graph. Everything runs live, and the graph updates after each incoming tweet. I was able to find a pattern in user sentiment that could serve as a potential indicator of dramatic market crashes.

What is Dash

Dash is a Python framework by Plotly for building analytical web applications. It runs on top of Plotly.js, React, and Flask, and ties modern UI elements like dropdowns, sliders, and graphs to your analytical Python code.

There are clear instructions on its website for the installation of Dash.

The Script

Part 1: Collecting Tweets

In one of my previous blog posts, on extracting data from Facebook and Twitter, I listed the steps for setting up a Twitter App. Once it is set up, note down your consumer key, consumer secret, access token and access secret, which are needed to retrieve tweets.

## Importing Libraries 
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
import sqlite3
from textblob import TextBlob
from unidecode import unidecode
import time
from guess_language import guess_language

You'll then have to provide your four Twitter access keys after setting up the Twitter App.

#consumer key, consumer secret, access token, access secret.

ckey='your_twitter_consumer_key'
csecret='your_twitter_consumer_secret'
atoken='your_twitter_access_token'
asecret='your_twitter_access_secret'

In the next step, we initialize a SQLite database to store the retrieved tweets.

conn = sqlite3.connect('bitcoin.db')
c = conn.cursor()

def create_table():
    c.execute("CREATE TABLE IF NOT EXISTS sentiment(unix REAL, tweet TEXT, sentiment REAL)")
    conn.commit()

create_table()
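As a quick sanity check of this schema, the sketch below runs the same CREATE TABLE statement, inserts one row, and reads it back. The in-memory database and the sample row are assumptions made for illustration; the post itself writes to bitcoin.db.

```python
import sqlite3

# Assumption: an in-memory database stands in for 'bitcoin.db'
# so that the example leaves no file behind.
conn = sqlite3.connect(':memory:')
c = conn.cursor()

# Same schema as in the post.
c.execute("CREATE TABLE IF NOT EXISTS sentiment(unix REAL, tweet TEXT, sentiment REAL)")

# Hypothetical sample row: millisecond timestamp, tweet text, polarity.
sample_row = (1521504000000, 'bitcoin is doing great', 0.8)
c.execute("INSERT INTO sentiment (unix, tweet, sentiment) VALUES (?, ?, ?)", sample_row)
conn.commit()

# Read the rows back, newest first, as the Dash callback later does.
rows = c.execute(
    "SELECT unix, tweet, sentiment FROM sentiment ORDER BY unix DESC"
).fetchall()
print(rows)
conn.close()
```

Because the unix column is declared REAL, SQLite stores the timestamp as a floating-point number even when an integer is inserted.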

After retrieving a tweet, we use the TextBlob library to estimate its sentiment. guess_language is another library, used here to keep only English tweets.

Many tweets are scored as neutral (polarity 0.0). These are unlikely to help our predictions, so they are filtered out as well.

class listener(StreamListener):

    def on_data(self, data):
        try:
            data = json.loads(data)
            tweet = unidecode(data['text'])
            time_ms = data['timestamp_ms']
            analysis = TextBlob(tweet)
            sentiment = analysis.sentiment.polarity
            print(time_ms, tweet, sentiment)
            if sentiment != 0.0 and guess_language(tweet) == 'en':
                c.execute("INSERT INTO sentiment (unix, tweet, sentiment) VALUES (?, ?, ?)",
                      (time_ms, tweet, sentiment))
                conn.commit()

        except KeyError as e:
            print(str(e))
        return True

    def on_error(self, status):
        print(status)


while True:
    try:
        auth = OAuthHandler(ckey, csecret)
        auth.set_access_token(atoken, asecret)
        twitterStream = Stream(auth, listener())
        twitterStream.filter(track=["bitcoin"])
    except Exception as e:
        print(str(e))
        time.sleep(5)
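The listener catches KeyError because the stream also delivers messages that are not tweets (rate-limit and delete notices, for example), and these have no top-level 'text' field. The sketch below isolates that parsing step using only the standard library. The helper name extract_tweet and both sample payloads are hypothetical and exist only for illustration.

```python
import json

def extract_tweet(raw):
    """Return (timestamp_ms, text) for a tweet payload, or None otherwise."""
    payload = json.loads(raw)
    try:
        return int(payload['timestamp_ms']), payload['text']
    except KeyError:
        # Non-tweet messages lack these top-level fields, so skip them.
        return None

# Hypothetical payloads, heavily trimmed compared to real stream messages.
tweet_msg = json.dumps({'text': 'bitcoin to the moon',
                        'timestamp_ms': '1521504000000'})
limit_msg = json.dumps({'limit': {'track': 5}})

print(extract_tweet(tweet_msg))
print(extract_tweet(limit_msg))
```

Returning None for anything that is not a tweet keeps the storage step free of special cases.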

Part 2: Building the Dash App

After installing Dash by following the instructions on its website, we import the libraries and initialize the Dash App.

# Importing Libraries

import dash
from dash.dependencies import Input, Output, Event
import dash_core_components as dcc
import dash_html_components as html
import plotly
import plotly.graph_objs as go
import sqlite3
import pandas as pd

app = dash.Dash(__name__)
app.title='Bitcoin Sentiments'

Dash apps run in the browser. In the next step, we design a simple UI with an input field and a line graph.

app_colors = {
    'background': '#0C0F0A',
    'text': '#FFFFFF',
}

app.layout = html.Div(
    [   html.Div(className='container', children=[html.H1('Live Twitter Sentiment of Bitcoins', style={'color':"#CECECE", 'padding':'10px', 'word-spacing':'1em'}),
                                                  dcc.Input(id='sentiment_term', value='bitcoin', type='text', style={'color':'black' ,'margin':'10px'}),
                                                  ]),
        dcc.Graph(id='live-graph', animate=True),
        dcc.Interval(
            id='graph-update',
            interval=1*1000
        ),
    ], style={'backgroundColor': app_colors['background']},
)

The "inputs" and "outputs" of our application interface are described declaratively through the app.callback decorator.

Whenever an input property changes, the function that the callback decorator wraps will get called automatically. Dash provides the function with the new value of the input property as an input argument and Dash updates the property of the output component with whatever was returned by the function.
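To make the idea concrete without depending on Dash, here is a toy sketch of a decorator that registers a function and re-runs it whenever its input changes. This is not Dash's implementation or API; the names callback, set_value, callbacks and state are all invented for this illustration.

```python
# Toy illustration only: this is NOT Dash's implementation or API.
callbacks = {}  # maps an input id to (output id, function)
state = {}      # current property values, keyed by component id

def callback(output_id, input_id):
    """Register the decorated function to run when input_id changes."""
    def decorator(func):
        callbacks[input_id] = (output_id, func)
        return func
    return decorator

def set_value(component_id, value):
    """Store a new value and run the callback registered for it, if any."""
    state[component_id] = value
    if component_id in callbacks:
        output_id, func = callbacks[component_id]
        # The function receives the new input value; its return value
        # becomes the new value of the output.
        state[output_id] = func(value)

@callback(output_id='greeting', input_id='name')
def make_greeting(name):
    return 'Hello, ' + name

set_value('name', 'bitcoin')
print(state['greeting'])
```

The decorated function never has to be called by hand, which is the declarative style the Dash decorator provides for real components.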

@app.callback(Output('live-graph', 'figure'),
              [Input(component_id='sentiment_term', component_property='value')],
              events=[Event('graph-update', 'interval')])

The function below reads data from the SQLite database and plots it on a graph. Polarity values are rescaled from the range -1 to 1 onto a 0 to 100 scale, and the resample method then smooths the curve by taking the mean of these values over every 30-minute window.

def update_graph_scatter(sentiment_term):
    try:
        conn = sqlite3.connect('bitcoin.db')
        c = conn.cursor()
        df = pd.read_sql("SELECT * FROM sentiment WHERE tweet LIKE ? ORDER BY unix DESC LIMIT 50000", conn ,params=('%' + sentiment_term + '%',))
        df.sort_values('unix', inplace=True)
        df['date'] = pd.to_datetime(df['unix'], unit='ms') + pd.Timedelta('05:30:00')
        df.set_index('date', inplace=True)

        max_, min_ = 1.0, -1.0
        df['sentiment_smoothed'] = round((((df['sentiment'] - min_) * (100)) / (max_ - min_)))
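        # Average only the numeric score column; the text column cannot be averaged.
        df = df[['sentiment_smoothed']].resample('30min').mean()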

        df.dropna(inplace=True)

        X = df.index
        Y = df.sentiment_smoothed.values

        data = plotly.graph_objs.Scatter(
                x=X,
                y=Y,
                name='Scatter',
                mode= 'lines+markers'
                )

        return {'data': [data],'layout' : go.Layout(xaxis=dict(range=[min(X),max(X)]),
                                                    yaxis=dict(range=[min(Y),max(Y)]),
                                                    font={'color':app_colors['text']},
                                                    plot_bgcolor = app_colors['background'],
                                                    paper_bgcolor = app_colors['background'],
                                                   )}

    except Exception as e:
        with open('errors.txt','a') as f:
            f.write(str(e))
            f.write('\n')
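The rescaling and the 30-minute averaging can also be followed by hand. The sketch below reproduces both steps with the standard library only. It approximates what the pandas code does rather than replacing it; the helper names rescale and bucket_means and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict

BUCKET_MS = 30 * 60 * 1000  # 30 minutes in milliseconds

def rescale(polarity, min_=-1.0, max_=1.0):
    """Map a polarity in [min_, max_] onto a 0-100 scale."""
    return round((polarity - min_) * 100 / (max_ - min_))

def bucket_means(rows):
    """Average the rescaled sentiment per 30-minute bucket.

    rows: iterable of (timestamp_ms, polarity) pairs.
    Returns a dict mapping each bucket start (ms) to its mean score.
    """
    buckets = defaultdict(list)
    for ts_ms, polarity in rows:
        start = int(ts_ms) // BUCKET_MS * BUCKET_MS
        buckets[start].append(rescale(polarity))
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}

# Hypothetical rows: two tweets in the first half hour, one in the next.
rows = [(0, 0.5), (10 * 60 * 1000, -0.5), (40 * 60 * 1000, 1.0)]
print(bucket_means(rows))  # {0: 50.0, 1800000: 100.0}
```

A polarity of 0.5 becomes 75 and -0.5 becomes 25, so the first bucket averages to 50, the midpoint of the scale.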

Dash apps run on port 8050 by default, but you can choose a custom port, which also lets you run multiple apps at the same time. Here the port is set to 8000, so you can access the App at http://127.0.0.1:8000/ in your browser.

if __name__ == '__main__':
    app.run_server(debug=False, port=8000)

Part 3: Data Visualization

I was able to collect over 200,000 Bitcoin tweets in 2 days. Comparing the sentiment with the CoinDesk Bitcoin price gave interesting results.


There is a definite resemblance between the ups and downs of the Bitcoin price and the Twitter sentiment around the same time. We did not see any rapid changes in the market during this period, and fluctuations in the range of hundreds are normal for the cryptocurrency. This could also be why the sentiment takes some time to reflect changes in the actual Bitcoin price (you can see a lag in the sentiment graph).

However, rapid changes in the Bitcoin price are usually triggered by events such as government policy decisions or hacks. In those cases the news strikes first, so the sentiment should precede the change in the Bitcoin price.
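One way to put a number on whether sentiment lags or leads the price is to correlate the two series at different shifts and pick the shift with the highest correlation. The sketch below is not the analysis used in this post. It is a minimal illustration with synthetic, invented series, and the helper names pearson and best_lag are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def best_lag(price, sentiment, max_lag):
    """Return the lag (in samples) at which sentiment best matches price.

    A positive lag means sentiment follows price; a negative lag means
    sentiment leads price.
    """
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            p, s = price[:len(price) - lag], sentiment[lag:]
        else:
            p, s = price[-lag:], sentiment[:len(sentiment) + lag]
        scores[lag] = pearson(p, s)
    return max(scores, key=scores.get)

# Synthetic data, invented for illustration only: sentiment is a scaled
# copy of price delayed by two samples.
price = [10, 12, 11, 15, 14, 18, 17, 16, 20, 19, 23, 21]
sentiment = [50, 50] + [p * 2 for p in price[:-2]]

print(best_lag(price, sentiment, max_lag=3))  # 2
```

With real data, both series would first need to be sampled on the same time grid, for example the 30-minute averages plotted by the app.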

I will continue tracking the sentiments and will update if we encounter such a situation.

Tracking Politics

Twitter sentiment can be especially useful for predicting elections. I ran a similar analysis to study the mood around two popular figures, "Narendra Modi" and "Rahul Gandhi".

Polls

For now Modi leads ever so slightly.