Export your Twitter Followers to text and MSSQL Database

I would like a better Idea of who is joining and who is leaving my twitter feed.  The apps I have found are just short of horrendous.

Note: Thjs works for me based on the twitter feeds that I follow and follow me.  There is a holy grail expression that says all lines not starting with @, however I am having difficulty getting this to work in Notepad++.  Even after this, there are still about 5 lines I have to clean up.  This is a dirty way to do it, and hopefully Ill find a better way as I go.

In the meantime, until an app that actually works is released, below is the process I take to clean up the list.  I use notepad++ for this.  Note that the search functionality has three types, all that are used.  If you select the wrong type it will error out or give unexpected results.

This is something I am doing on the fly.  I may have typo’s or a not completely efficient system.  I do this at least once a week and will update this process as I go.

Note that items within the ‘ ‘ are the actual terms.  Do not include the ‘ in the actual field.

Note that when copying the terms from this page, do not inadvertently include the blank space prior to or after the ‘.  This will effect the results of RegEx expressions.

Copy paste follower list from twitter.com into notepad++.  Make sure to get the entire list.

Notepad++ (RegEx) Find all in current document ‘@.*’ .  Select and copy all results to New tab

(RegEx) Find/replace ‘.*:’ with null
(Extended)  Find ‘ ? ‘ Replace with \n \t

(RegEx) Find ‘ .*[^\r][\n] ‘ Replace with Null

(RegEx) Find ‘ \t ‘ Replace with Null

(Normal) search ‘Protected account’ and ‘Verified Account’  replace with null

For some cleanup.

(RegEx) Find ‘ \n[^@]([A-Z][a-z].*) ‘ Replace with Null

(RegEx) Find ‘ \n\s[^@]([A-Z][a-z].*) ‘ Replace with Null.

(RegEx) Find all in current document ‘ @ *. ‘ and copy to new doc.

(RegEx) Find ‘  .*:  ‘ and replace with Null.

Manual Cleanup

If you are planning on putting this into a database by turning it into an Importable CSV, I reccomend running the following on the results which should simply be a list of all of your followers starting with potentially a space or two and then the @ symbol:

  • (RegEx) Find ‘ ^ .* @ ‘ replace with ‘ <Date Format of your choice>, @ ‘ For instance I put the following in the replace field, ‘ 5-15-2013, @ ‘.

After running through this process you should have a clean import file.  Take a quick look through the file just to be certain.

Once I have verified it is clean I use Microsoft SQL Import utility to do a quick import of my csv file into a database.  I created a database named ‘Social’ with a table named ‘Twitter’.  The table has only two fields at the moment.

  • Date – datetime
  • User – varchar[50]

You can let the import utility create your table for you the first time you run it.  After selecting the imput file, be certain to change the name of the fields and the type of field under the advanced section.  The User field time is fine with the default.  Be sure though to change the Date field from the default varchar to Date.

After configureing the input, be sure to name your table.

All subsequent times you run this, be sure to name your fields in a consistent manner and select the correct variable type.  When you get to the table selection field it will default to creating a new table.  Be certain to click on the table name and select the table you created in the first import.  By default, when you select an existing table, it should change the import type to append.  It doesn’t hurt to hit the ‘Edit Mappings’ button on this just to be certain.

So far, just because I wanted to see who left me, I wrote a quick and dirty query.  Ill create a more advanced view with appropriate parameters in an upcoming session.  For the time being, this is the query I use to see:

  • Who I lost

use <databasename>

select *
from Followers
where Date = ‘2013-05-13’
and [User] not in (select [user] from followers where Date = ‘2013-05-15’)

  • Who I gained

use <databasename>

select * 
from Followers
where Date = ‘2013-05-15’
and [User] not in (select [user] from followers where Date = ‘2013-05-13’)

The above queries allow you to compare Date 1 with Date 2.  This should be used only if you have two single dates to compare.  If like me, you want to know who left or who gained over a time period, these queries will help.

  • Who I lost

use <databasename>

select distinct([User])
from Followers
where Date between ‘2013-05-13’ and ‘2013-05-15’
and [User] not in (select [user] from followers where Date = ‘2013-05-16’)

  • Who I gained

use <databasename>

select distinct([User])
from Followers
where Date = ‘2013-05-16’
and [User] not in (select [user] from followers where Date between ‘2013-05-13’ and ‘2013-05-15’)

Basically the first query says, “Make a list of all unique items which have an associated date from date1 to date2.  Then make another list of users with a date from date2(+1 day).  Then show me a list of all users from the first list who do not have a corresponding entry in the second list.”  More simply, “If an entry exists in the between dates, but does not exist in the current date, then the follower has left.”

(If I can figure out the Not Operator in Notepad++, this should cover everything I am looking to do in one step .*[@].*  Ill look at this next time I do this.)

 

Micah Norman

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: