Web application user visualization (in shell with awk)

I had an interesting quick task last week, and I’m allowed to post 3 pics from it. I got to visualise users of an internet company based on some TSV (tab separated value) file. They generate this user-stats file daily where each row represents one user. But they couldn’t see if they are making any progress with their iterations and changes or not, except by approximate assessment while scrolling the file. They wanted it done in shell as they would mostly use it over ssh so I decided to use awk.

s in last 14 days

One thing I could get from that file is the info if a user has signed-in in last 14 days. I decided to visualise it like this. Each -/¤ represents one user, they are ordered by sign up date. The green ¤ did login in last 14 days.

We can nicely see how green fields become more and more frequent with time.


What this doesn’t answer us is if this is because of user churn or did the web product get better with time. Probably both but to what degree?

Users that signed-up for the second day

To find out that I needed some datapoint that isn’t affected by distance from now, and it was there. Did the user sign-in for any other than the first day.


So this tells us few things. From start in 2009 to Jun 2011 this rate was improving considerably. Then it peaked and flatted out. From May 2012 it actually had some drops which could mean that new iterations reduced the quality of the web app. Because drops are short and rate returns to approximate plateau level between them (and based on dates which show increased number of users per month) it could also mean that the web app got some less quality visitors that just checked out the webapp. These things are typical if you get promoted in some media, which brings a spike of users, but they are there for curiosity and not necesarry looking for exactly such app at that particular time.

What I would dare to judge is that the perceived first user’s app quality didn’t improve in any meaningfull way sice Jun. 2011, which was a shock to them.

How many users upgraded from free plan

Suposedly 37signals once wrote (and everybody keeps parroting it to this company all the time) that “free users don’t convert to paid users, and most of paid users start as paid users”.

They knew that users did upgrade on them, but never knew exactly how many, so I visulised that too. Stars are currently active paid users. Blue stars are the one that started as paid users, greens are the ones that upgraded:


Well the parrots should maybe keep their beaks closed. It turns out that 30 out of 71 currently active paying users came here by upgrading from the free plan. It also seems that the upgraders are much less “churny”, which makes sense.

This says nothing against 37signals, they have their own data. But it does say that what holds for A doesn’t necesarily hold for B.

update: I decided to show the script for the first 2 tables (it accepts the width and what to measure), but don’t shoot me for its fugliness, I just hacked away until it did what I wanted, and I am fine with it:

awk -v width=$1 -v char=$2 -F"\t" '{x=substr($7,char,1);if(x=="+"){p++;x="33[1;31m¤33[0m"};if(x=="*"){p++;x="33[1;32m¤33[0m"};printf("%s ", x);i++;if(i>width){i=0;printf(" | ");printf("33[1;34m %s33[0m\t", $4);while(p-->0){printf("_")};p=0;print""}} END {print""}'

The TSV looked something like this:

1234    lite   1       1-Dec-2012      00-xxx-0000     1-Dec-2012      .***
1235    pro    1       2-Dec-2012      00-xxx-0000     6-Dec-2012      +**-   

~follow me on twiter~


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s