Recently I described a way to do [A/B testing with Google Analytics and R](/2015/08/20/doing-a-b-testing-with-google-analytics-and-r/). That post covered tests with a two-way outcome: bounced or not bounced.

But what do we do when the metric has more than two possible outcomes, such as sessionDuration or pageLoadTime? For these, Google only gives you an aggregated value like avgSessionDuration or avgPageLoadTime. You don’t get any information about the distribution of the metric.
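To see why an average alone is not enough, here is a small sketch in R with two made-up sets of session durations. The vectors `version.a` and `version.b` are hypothetical, not real Analytics data. Both have the same mean, yet the sessions behind them look very different.

```r
# Hypothetical session durations in seconds (illustrative values only)
version.a <- rep(60, 1000)                  # every session lasts 60 s
version.b <- c(rep(5, 900), rep(555, 100))  # mostly 5 s, a few very long ones

# The aggregated value is identical for both versions ...
mean.a <- mean(version.a)
mean.b <- mean(version.b)

# ... but the median shows how different the typical session is
median.a <- median(version.a)
median.b <- median(version.b)

print(c(mean.a = mean.a, mean.b = mean.b))
print(c(median.a = median.a, median.b = median.b))
```

An aggregated metric such as avgSessionDuration cannot tell these two cases apart, which is why per-session values are needed.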

So you have to do it on your own! As I showed last time, we define another customDimension. This customDimension gets a unique value for every session. Using PHP you can do something like this:

    $dimensionValue2 = md5(session_id() . "some_salt");

The salt is required to hide the real session ID.

Once this value is added as customDimension to Google Analytics

  ga('set', 'dimension2', dimensionValue2);

we can use it to separate the sessions.

Retrieving data

Using RGoogleAnalytics we can fetch the data:

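    library(RGoogleAnalytics)

    # start.date, end.date, table.id, cache and token are defined elsewhere.
    # caching.dir and caching are kept as in the original post; they may need
    # a version of the package that supports them.
    query.list <- Init(start.date = start.date,
                       end.date = end.date,
                       # dimension1 = A/B version, dimension2 = hashed session id
                       dimensions = "ga:date,ga:dimension1,ga:dimension2",
                       metrics = "ga:pageviewsPerSession",
                       max.results = 10000,
                       table.id = table.id,
                       caching.dir = "cache",
                       caching = cache)

    ga.query <- QueryBuilder(query.list)
    # Query day by day so that each request stays below max.results
    data.perDay <- GetReportData(ga.query, token, split_daywise = TRUE, delay = 0)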

Because we now have the value of pageviewsPerSession for each individual session, we have the data to compute the density for both versions (encoded in dimension1).
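A minimal sketch of that density computation in base R might look as follows. It does not use the real query result: the data frame `sessions` is simulated, and its column names (`dimension1`, `dimension2`, `pageviewsPerSession`) are an assumption about what the fetched data looks like, so adjust them to your own result.

```r
# Simulated stand-in for the fetched data (hypothetical values and column names)
set.seed(1)
n <- 500
sessions <- data.frame(
  dimension1 = rep(c("A", "B"), each = n),              # A/B version
  dimension2 = sprintf("session%04d", seq_len(2 * n)),  # one id per session
  pageviewsPerSession = c(rpois(n, 3), rpois(n, 4)) + 1,
  stringsAsFactors = FALSE
)

# One kernel density estimate per version
by.version <- split(sessions$pageviewsPerSession, sessions$dimension1)
densities <- lapply(by.version, density, from = 0)

# Shared axis limits so both curves fit into one plot
x.range <- range(sapply(densities, function(d) range(d$x)))
y.max <- max(sapply(densities, function(d) max(d$y)))

plot(densities[["A"]], xlim = x.range, ylim = c(0, y.max),
     main = "pageviewsPerSession by version",
     xlab = "pageviews per session")
lines(densities[["B"]], lty = 2)
legend("topright", legend = names(densities), lty = c(1, 2))
```

Since pageviewsPerSession is a count, a histogram or bar plot per version would be an equally reasonable choice; the kernel density is used here only because the text talks about densities.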

So doing some aggregating and plotting we get something like this:

[Figure: density for A/B testing]