I had an interesting quick task last week, and I’m allowed to post 3 pics from it. I got to visualise the users of an internet company based on a TSV (tab-separated values) file. They generate this user-stats file daily, and each row represents one user. But they couldn’t tell whether their iterations and changes were making any progress, except by rough assessment while scrolling through the file. They wanted it done in the shell, as they would mostly use it over ssh, so I decided to use awk.
Sign-ins in last 14 days
One thing I could get from that file is whether a user has signed in during the last 14 days. I decided to visualise it like this. Each -/¤ represents one user, and they are ordered by sign-up date. A green ¤ means the user did log in during the last 14 days.
We can nicely see how green fields become more and more frequent with time.
What this doesn’t tell us is whether this is because of user churn or because the web product got better with time. Probably both, but to what degree?
Users that signed in on a second day
To find that out I needed some datapoint that isn’t affected by distance from now, and it was there: did the user sign in on any day other than the first?
So this tells us a few things. From the start in 2009 to Jun 2011 this rate was improving considerably. Then it peaked and flattened out. From May 2012 it actually had some drops, which could mean that new iterations reduced the quality of the web app. Because the drops are short and the rate returns to approximately the plateau level between them (and based on the dates, which show an increased number of users per month), it could also mean that the web app got some lower-quality visitors who just checked it out. These things are typical if you get promoted in some media, which brings a spike of users, but they are there out of curiosity and not necessarily looking for exactly such an app at that particular time.
What I would dare to judge is that the app quality as perceived by first-time users hasn’t improved in any meaningful way since Jun 2011, which was a shock to them.
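If you want numbers next to the picture, a rough per-month rate can be pulled out with awk as well. This is only a sketch and not what I ran for them: `monthly_rate` is a made-up name, it assumes the sign-up date sits in column 4 in the form 1-Dec-2012, that the flags sit in column 7, and that both "*" and "+" count as "yes". Which position in column 7 holds the second-day flag is something you would have to check in your own file, so it is passed as an argument.

```shell
#!/bin/bash
# Sketch: per-month share of users whose flag at position `char` of column 7 is set.
# Assumptions (not confirmed by the original data): "*" and "+" both mean "set",
# and column 4 is a sign-up date such as 1-Dec-2012.
monthly_rate() {
  awk -v char="$1" -F"\t" '
    {
      split($4, d, "-")                        # d[2] = month name, d[3] = year
      key = d[2] "-" d[3]
      if (!(key in total)) order[++n] = key    # remember months in first-seen order
      total[key]++
      x = substr($7, char, 1)
      if (x == "*" || x == "+") hit[key]++
    }
    END {
      for (j = 1; j <= n; j++) {
        k = order[j]
        printf("%s\t%d/%d\t%.0f%%\n", k, hit[k], total[k], 100 * hit[k] / total[k])
      }
    }'
}
```

You would use it like `monthly_rate 2 < users.tsv` (the file name is just an example). Since the file is ordered by sign-up date, the months come out in chronological order without any date parsing.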
How many users upgraded from free plan
Supposedly 37signals once wrote (and everybody keeps parroting it to this company all the time) that “free users don’t convert to paid users, and most paid users start as paid users”.
They knew that users did upgrade, but never knew exactly how many, so I visualised that too. Stars are currently active paid users. Blue stars are the ones that started as paid users, green ones are those that upgraded:
Well the parrots should maybe keep their beaks closed. It turns out that 30 out of 71 currently active paying users came here by upgrading from the free plan. It also seems that the upgraders are much less “churny”, which makes sense.
This says nothing against 37signals, they have their own data. But it does say that what holds for A doesn’t necessarily hold for B.
—
update: I decided to show the script for the first 2 tables (it accepts the width and what to measure), but don’t shoot me for its fugliness, I just hacked away until it did what I wanted, and I am fine with it:
#!/bin/bash
awk -v width=$1 -v char=$2 -F"\t" '{
  x=substr($7,char,1);
  if(x=="+"){p++;x="\033[1;31m¤\033[0m"};
  if(x=="*"){p++;x="\033[1;32m¤\033[0m"};
  printf("%s ", x);
  i++;
  if(i>width){
    i=0;
    printf(" | ");
    printf("\033[1;34m %s\033[0m\t", $4);
    while(p-->0){printf("_")};
    p=0;
    print""
  }
} END {print""}'
The TSV looked something like this:
1234 lite 1 1-Dec-2012 00-xxx-0000 1-Dec-2012 .***
1235 pro 1 2-Dec-2012 00-xxx-0000 6-Dec-2012 +**-
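To make it clearer what the second argument picks, here is a stripped-down, colour-free sketch run against the two sample rows above. It is not the script itself: `flags_row` is a name I made up for the illustration, and it assumes the fields are separated by real tabs and that column 7 carries one flag character per measurement.

```shell
#!/bin/bash
# Sketch only: print the flag at one position of column 7 for every user.
# Assumption: input is tab-separated and column 7 is a string of one-character flags.
flags_row() {
  # $1 = 1-based position inside column 7 (the "what to measure" argument)
  awk -v char="$1" -F"\t" '{ printf("%s ", substr($7, char, 1)) } END { print "" }'
}

# The two sample rows, rebuilt with explicit tab separators.
sample=$(printf '1234\tlite\t1\t1-Dec-2012\t00-xxx-0000\t1-Dec-2012\t.***\n1235\tpro\t1\t2-Dec-2012\t00-xxx-0000\t6-Dec-2012\t+**-\n')

printf '%s\n' "$sample" | flags_row 1   # first flag of each user
printf '%s\n' "$sample" | flags_row 4   # fourth flag of each user
```

The real script does the same substr() on column 7, then swaps "+" and "*" for coloured ¤ characters and breaks the line every `width` users with the sign-up date at the end.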