Building a blog site isn’t too tough; reaching your audience is the hard part. Just like any startup or new business venture, you should research your market if you intend to monetize your brand. But when it comes to finding an audience interested in programming, automation and other nerdy things, the pool seems to evaporate like water in a heat exchanger. Luckily we have Python to find our audience for us while we sleep and work our day job.
Where to Find Our People
As of April 2016, Facebook has a cool 1.6 billion users and Twitter has 320 million. LinkedIn may only have 100 million users, but those interacting on LinkedIn are more professionally focused and may be interested in reading a technical blog. So these are the “Big 3”. As we discussed previously regarding SEO, visibility is a huge deal.
Market Research – Automated
To identify our consumer base (those likely to be interested in the AML blog), we can start with Twitter. Twitter is like a focus group for the 21st century: there is so much data out there about your audience that you only need the tools to extract it. The Twitter API and a Python package called Tweepy are what you will need.
Objective: Using Python, we will learn about our audience, extract keywords they seem to be interested in, generate new words to search for and get to know our people.
- Get a Twitter Account and start a new app
- Create a list of keywords that represent your blog
- Start coding in Python
Creating a Blog Audience is Tough Business
Since we do not want to create another “Twitter Bot”, we will not be automatically following those that we identify. According to our objective, we are trying to understand our audience, not chase them away! We can start with a few keywords that describe AML such as: Python, Data Science, Programming, and Automation. Tweepy takes these words and returns only the tweets that have these words in them.
Think of this code as “Listening to your customer”. In fact, we’ll be importing the StreamListener() from the Tweepy package. Check out the imports:
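A plausible version of those imports is sketched below. It assumes Tweepy 3.x, where StreamListener lived in tweepy.streaming; Tweepy 4.0 merged StreamListener into Stream, so the import is guarded rather than assumed to succeed. The HAVE_TWEEPY_3 flag is a hypothetical name used only for this illustration.

```python
import json  # turn the raw stream payload into a Python dictionary
import re    # find links inside the tweet text

try:
    # Tweepy 3.x layout (an assumption); later releases removed StreamListener.
    from tweepy import OAuthHandler, Stream
    from tweepy.streaming import StreamListener
    HAVE_TWEEPY_3 = True
except ImportError:
    # Tweepy is missing or too new: keep the names defined so the script loads.
    OAuthHandler = Stream = StreamListener = None
    HAVE_TWEEPY_3 = False
```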
You are going to need four alphanumeric credentials so that the Twitter API can authenticate you through OAuth: a consumer key, a consumer secret, an access token and an access secret. You get these from Twitter when you register your app.
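One way to keep those four values out of your source code is to read them from environment variables. This is a minimal sketch, not the original listing: the variable names and both helper functions are hypothetical, and build_auth() assumes Tweepy's OAuthHandler and set_access_token() are available in your installed version.

```python
import os

# Hypothetical environment variable names; pick whatever naming you like.
CREDENTIAL_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing Twitter credentials: " + ", ".join(missing))
    return {name: env[name] for name in CREDENTIAL_VARS}


def build_auth(creds):
    """Build the OAuth handler. Tweepy is imported lazily, only when needed."""
    import tweepy

    auth = tweepy.OAuthHandler(
        creds["TWITTER_CONSUMER_KEY"], creds["TWITTER_CONSUMER_SECRET"]
    )
    auth.set_access_token(
        creds["TWITTER_ACCESS_TOKEN"], creds["TWITTER_ACCESS_SECRET"]
    )
    return auth
```

Calling build_auth(load_credentials()) fails fast with a readable message if a key is missing, which beats a cryptic authentication error later.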
Before we go any further with the main loop, let’s list the questions we would like answered from this code:
- Who is saying what?
- Where are they saying it from and in what language? (Who knows, maybe we want to be international??)
- Are there any other blogs we should follow that our consumers are talking about?
- Can we save our competitors’ blogs to web-crawl later?
- Maybe we should blog about topics they do or don’t cover?
Using the StreamListener() that we imported from the Tweepy package, we can now create our own Listener() class that inherits from it. on_data() is actually a Tweepy method, and it receives each item from the Twitter stream as JSON. If you are familiar with Python, this should get you pretty pumped, because the returned data can now be put into a Python dictionary for easy data manipulation.
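A minimal sketch of such a listener might look like the following. It assumes the Tweepy 3.x StreamListener base class and falls back to a plain object when that import is unavailable, so the JSON handling can still be tried without Tweepy. The tweets list is an illustrative addition, not part of Tweepy.

```python
import json

try:
    from tweepy.streaming import StreamListener  # Tweepy 3.x (assumed)
except ImportError:
    StreamListener = object  # stand-in so the sketch runs without Tweepy 3.x


class Listener(StreamListener):
    """Collects each incoming tweet as a Python dictionary."""

    def __init__(self):
        super().__init__()
        self.tweets = []  # illustrative in-memory store

    def on_data(self, data):
        tweet = json.loads(data)  # raw JSON string -> dict
        self.tweets.append(tweet)
        return True  # in Tweepy 3.x, True keeps the stream open

    def on_error(self, status):
        print("Stream error:", status)
        return False  # in Tweepy 3.x, False disconnects the stream
```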
Now that all of our data is readily accessible, we start pulling interesting information.
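As a rough illustration of pulling out answers to the questions above, the sketch below reads a few fields from the tweet dictionary. The field names (text, lang, user.screen_name, user.location) follow the classic Twitter tweet JSON; the summarize() helper and the sample tweet are made up for this example.

```python
def summarize(tweet):
    """Pick out who said what, from where, and in which language."""
    user = tweet.get("user") or {}
    return {
        "who": user.get("screen_name"),
        "what": tweet.get("text", ""),
        "where": user.get("location"),  # free text typed by the user, may be None
        "language": tweet.get("lang"),
    }


# Made-up sample payload, shaped like a tweet dictionary.
sample = {
    "text": "Automating my reports with #Python",
    "lang": "en",
    "user": {"screen_name": "example_user", "location": "Austin, TX"},
}

print(summarize(sample))
```

Using .get() everywhere keeps the listener from crashing on stream items that are not tweets, such as delete notices, which lack these fields.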
The last line of code here is getLinks(tweet), which is a user-defined function that does a regex search for anything that has “http” in it. That means we’re looking for links a user posted within their tweet. We can save these links for later.
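The body of getLinks() is not shown here, so the following is only a guess at what such a helper could look like. It assumes the tweet is a dictionary with a "text" key and treats any run of non-whitespace characters starting with http:// or https:// as a link.

```python
import re

# Anything starting with http:// or https:// up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def getLinks(tweet):
    """Return every link found in the tweet's text, in order of appearance."""
    text = tweet.get("text", "")
    return URL_PATTERN.findall(text)


links = getLinks({"text": "Read https://example.com/post and http://example.org now"})
print(links)
```

A pattern this simple will also swallow trailing punctuation stuck to a link, so the saved links may need a little cleanup before you crawl them.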
Extending My Code
Eventually you will see opportunities to extend this code. Maybe you will want to grow your search terms. Right now we are only using four topics, but we would eventually want to grow our terms along with our audience. And since the stream of tweets is endless, we should probably dump everything in a database. Import MySQLdb anyone?
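One possible way to grow the search terms is to count the hashtags that show up alongside the keywords we already track and suggest the most frequent new ones. This is a sketch under that assumption, not the author's method; suggest_terms() and the sample texts are hypothetical.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def suggest_terms(texts, known_terms, top_n=5):
    """Return the most common hashtags that are not already search terms."""
    # Normalize known terms the way hashtags are written: lowercase, no spaces.
    known = {term.lower().replace(" ", "") for term in known_terms}
    counts = Counter(
        tag.lower()
        for text in texts
        for tag in HASHTAG.findall(text)
    )
    ranked = [tag for tag, _ in counts.most_common() if tag not in known]
    return ranked[:top_n]


# Made-up tweet texts for illustration.
texts = [
    "Learning #Python and #pandas",
    "#pandas tips for #DataScience",
    "#pandas #numpy",
]
keywords = ["Python", "Data Science", "Programming", "Automation"]
print(suggest_terms(texts, keywords))
```

Reviewing the suggestions by hand before adding them keeps the search from drifting toward whatever happens to be trending.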
Python Gives You Wings!