Grassroots SEO: Knowing Your Twitter Audience

Creating a blog isn't the hard part; reaching your audience is. Just like any startup or new business venture, you should research your market if you intend to monetize your brand. And when it comes to finding an audience interested in programming, automation and other nerdy things, the pool seems to evaporate like water in a heat exchanger. Luckily, we have Python to find our audience for us while we sleep and work our day jobs.

Social Media Market Share

Where to Find Our People

As of April 2016, Facebook has a cool 1.6 billion users, and Twitter 320 million. LinkedIn may only have 100 million users, but those interacting on LinkedIn are more professionally focused and may be interested in reading a technical blog. These are the "Big 3". As we previously discussed regarding SEO, visibility is a huge deal.

Market Research – Automated

In order to identify our consumer base (those likely to be interested in the AML blog), we can start with Twitter. Twitter is like a focus group for the 21st century: there is so much data out there about your audience, and you only need the tools to extract it. The Twitter API and a Python package called Tweepy are all you will need.

Objective: Using Python, we will learn about our audience, extract keywords they seem to be interested in, generate new words to search for and get to know our people.

Method:

  1. Get a Twitter Account and start a new app
  2. Create a list of keywords that represent your blog
  3. Start coding in Python




Creating a Blog Audience is Tough Business

Since we do not want to create another "Twitter Bot", we will not automatically follow those that we identify. According to our objective, we are trying to understand our audience, not chase them away! We can start with a few keywords that describe AML, such as: Python, Data Science, Programming, and Automation. Tweepy takes these words and returns only the tweets that contain them.

"""--------------The Process------------
Tweepy ref - http://docs.tweepy.org/en/v3.5.0/api.html
0. Get a twitter account and create a new app -- apps.twitter.com
1. Have a list of things you think your consumers are interested in
2. Open the Twitter stream and analyze the matches from the Twitter API
3. Keep an index of screen_names, tweets and any other key information you may be interested in
-maybe even keep links to other blogs to scrape yourself for new key words??
4. Post that data into a database or save it to a text file 
5. Extend - Start gaining followers by "favouriting" their content
6. Extend Even Further - Create some charts for your Chief Marketing Officer (CMO) with pandas
"""

Think of this code as “Listening to your customer”. In fact, we’ll be importing the StreamListener() from the Tweepy package. Check out the imports:

from tweepy import Stream, API
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time #we will use this to limit our consumption
import json #to capture the tweets
import re   #to pull links out of tweets
#also use MySQL to save the data
import MySQLdb
#what if we want to make some graphs with the data
import pandas as pd
import matplotlib.pyplot as plt

You are going to need a consumer key, consumer secret, access token and access secret so that the Twitter API can authenticate you through OAuth. You get these from Twitter when you create your app.

ckey = 'your consumer key'
csecret = 'your consumer secret'
atoken = 'your access token'
asecret = 'your access secret'

Strategy First

Before we go any further with the main loop, let’s list the questions we would like answered from this code:

  • Who is saying what?
  • Where are they saying it from and in what language? (Who knows, maybe we want to be international??)
  • Are there any other blogs we should follow that our consumers are talking about?
  • Can we save our competitors' blogs to web-crawl later?
  • Maybe we should blog about topics they do or don’t cover?

Code Second

Using the StreamListener() that we imported from the Tweepy package, we can now subclass it in our own listener() class. on_data() is a Tweepy method that receives the Twitter stream in JSON format. If you are familiar with Python, this should get you pretty pumped, because the returned data can be loaded into a Python dictionary for easy data manipulation.

class listener(StreamListener):
    def on_data(self, data):
        try:
            print data
            #the stream hands us JSON, which loads neatly into a dictionary
            all_data = json.loads(data)

Now that all of our data is readily accessible, we start pulling interesting information.

            tweetTime = all_data["created_at"]
            tweet = all_data["text"]
            username = all_data["user"]["screen_name"]
            #who is saying what?
            print "At ", tweetTime, " ", username, " says --\n ", tweet
            print "\n Oh No they Didn't!!!"
            #where are they saying it from? in what language?
            country = all_data["place"]["country"] if all_data.get("place") else ""
            language = all_data["lang"]
            #are there any blogs we should follow that our consumers are talking about?
            #we can check our competitors' blogs
            #maybe we should blog about topics they do or don't cover
            importantLinks = getLinks(tweet)

The last line of code here calls getLinks(tweet), a user-defined function that does a regex search for anything that looks like a link starting with "http". That means we're looking for links a user posted within their tweet, and we can save those links for later.
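The full code below defines getLinks() with re.search, which returns only the first link it finds. If you want every link in a tweet instead, a small variant using re.findall (assuming links are delimited by whitespace, quotes or angle brackets) could look like this:

```python
import re

# a variant of getLinks that collects every link in a tweet, not just the
# first; assumes links are delimited by whitespace, quotes or angle brackets
def get_all_links(text):
    return re.findall(r'https?://[^\s<>"]+|www\.[^\s<>"]+', text)

links = get_all_links('New post http://example.com/blog via www.python.org')
# links -> ['http://example.com/blog', 'www.python.org']
```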

#open the stream and start listening
#authorize ourselves using OAuth with the ckey and csecret
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
#open the twitter stream by using our class
twitterStream = Stream(auth, listener())
#use the Stream's filter handler
twitterStream.filter(track=["python", "automate", "programming", "data science"])
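Remember that comment on `import time` about limiting our consumption? Tweepy closes the stream whenever on_data() returns False, so one way to keep an unattended run from drinking from the firehose forever is a capped listener. Here's a minimal sketch (shown without the StreamListener base class so it stands alone; in practice it would extend StreamListener like our listener above):

```python
# a hypothetical listener that disconnects itself after a fixed number of
# tweets; in real use it would subclass tweepy's StreamListener
class CappedListener(object):
    def __init__(self, max_tweets=100):
        self.max_tweets = max_tweets
        self.seen = 0

    def on_data(self, data):
        self.seen += 1
        # tweepy treats a False return from on_data as "close the stream"
        return self.seen < self.max_tweets
```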

Extending My Code

Eventually you will see opportunities to extend this code. Maybe you will want to grow your search terms: right now we are only tracking four topics, and we would eventually want to grow our terms along with our audience. Speaking of growth, the data will pile up fast, so we should probably dump everything into a database. import MySQLdb, anyone?
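And those CMO charts from step 6 of The Process? Once the listener has been running a while, pandas can turn the saved lines into a quick "who talks about our niche the most" chart. A sketch, assuming twitDB.csv holds one "User: &lt;name&gt; Text: &lt;tweet&gt;" line per tweet exactly as our listener writes it (a few sample lines stand in for the file here, and the usernames are made up):

```python
import pandas as pd

# sample lines standing in for twitDB.csv, in the listener's
# "User: <name> Text: <tweet>" format; usernames are hypothetical
lines = [
    "User: data_dana Text: automating my reports with python",
    "User: loop_larry Text: data science is just counting, fight me",
    "User: data_dana Text: pandas makes this too easy",
]

rows = []
for line in lines:
    user, text = line[len("User: "):].split(" Text: ", 1)
    rows.append({"username": user, "tweet": text})

df = pd.DataFrame(rows)
top = df["username"].value_counts()  # tweets per screen_name
# top.plot(kind="bar")  # matplotlib chart, ready for the CMO
```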


Python Gives You Wings!

-j


Full Code

from tweepy import Stream, API
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time #we will use this to limit our consumption
import json #to capture the tweets
import re   #to pull links out of tweets
#also use MySQL to save the data
import MySQLdb
#what if we want to make some graphs with the data
import pandas as pd
import matplotlib.pyplot as plt

"""--------------The Process------------
Tweepy ref - http://docs.tweepy.org/en/v3.5.0/api.html
0. Get a twitter account and create a new app -- apps.twitter.com
1. Have a list of things you think your consumers are interested in
2. Open the Twitter stream and analyze the matches from the Twitter API
3. Keep an index of screen_names, tweets and any other key information you may be interested in
-maybe even keep links to other blogs to scrape yourself for new key words??
4. Post that data into a database or save it to a text file
5. Extend - Start gaining followers by "favouriting" their content
6. Extend Even Further - Create some charts for your Chief Marketing Officer (CMO) with pandas
"""

ckey = 'your consumer key'
csecret = 'your consumer secret'
atoken = 'your access token'
asecret = 'your access secret'

def getLinks(text):
    #regex search for anything that looks like a link
    regex = r'https?://[^\s<>"]+|www\.[^\s<>"]+'
    match = re.search(regex, text)
    if match:
        return match.group()
    return ''

class listener(StreamListener):
    def on_data(self, data):
        try:
            print data
            #the stream hands us JSON, which loads neatly into a dictionary
            all_data = json.loads(data)
            tweetTime = all_data["created_at"]
            tweet = all_data["text"]
            username = all_data["user"]["screen_name"]
            #who is saying what?
            print "At ", tweetTime, " ", username, " says --\n ", tweet
            print "\n Oh No they Didn't!!!"
            #where are they saying it from? in what language?
            country = all_data["place"]["country"] if all_data.get("place") else ""
            language = all_data["lang"]
            #are there any blogs we should follow that our consumers are talking about?
            #we can check our competitors' blogs
            #maybe we should blog about topics they do or don't cover
            importantLinks = getLinks(tweet)
            #twitter uses colons sometimes so we have to come up with a different method
            saveThis = str("User: " + username + " Text: " + tweet)
            #save into a csv or a db
            saveCSV = "Y"
            saveDB = "N"
            ###if you want to save to a csv
            if saveCSV == "Y":
                saveFile = open('twitDB.csv', 'a')
                saveFile.write(saveThis)
                saveFile.write('\n')
                saveFile.close()
            #if saveDB == "Y":
            #    #replace mysql.server with "localhost" if you are running via your own server!
            #    #server   MySQL username    MySQL pass  Database name.
            #    conn = MySQLdb.connect("mysql.server","beginneraccount","cookies","beginneraccount$tutorial")
            #    c = conn.cursor()
            #    c.execute("INSERT INTO rolodex (tweetTime, username, tweet, importantLinks, country, language) VALUES (%s,%s,%s,%s,%s,%s)",
            #              (tweetTime, username, tweet, importantLinks, country, language))
            #    conn.commit()
            #keep the light on
            return True
        except BaseException, e:
            print "failed ondata ", str(e)
            time.sleep(5)

    def on_error(self, status):
        print status

#open the stream and start listening
#authorize ourselves using OAuth with the ckey and csecret
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
#open the twitter stream by using our class
twitterStream = Stream(auth, listener())
#use the Stream's filter handler
twitterStream.filter(track=["python", "automate", "programming", "data science"])
