RGoogleAnalytics is a great R package for accessing Google Analytics data from R via the API. The package was written by Tatvic, a web analytics consulting company from India. They also regularly offer free webinars about web analytics.

The excellent R blog R-bloggers.com has also published several posts about RGoogleAnalytics.

I use this R package regularly, but I came across some shortcomings. First, I missed caching of the data I retrieved from Google. During development I fetch the same data over and over again, so it would be nice if this data could be cached and I didn’t have to query the Google API on each run. Unfortunately, the package doesn’t seem to get much attention from Tatvic, so I decided to fork it on GitHub and patch it. You can get the fork on GitHub.

Caching

To use caching, you have to add two parameters to the call to Init:

# Build the query; caching.dir and caching are the two parameters added by the fork
query.list <- Init(start.date = as.character(start.date),
                   end.date = as.character(end.date),
                   dimensions = "ga:isoYearIsoWeek",
                   metrics = "ga:users,ga:sessions,ga:bounceRate,ga:pageviews,ga:pageviewsPerSession,ga:avgSessionDuration",
                   max.results = 10000,
                   table.id = table.id,
                   caching.dir = "cache",
                   caching = TRUE)

# Validate the query and fetch the report (served from the cache on later runs)
ga.query <- QueryBuilder(query.list)
data <- GetReportData(ga.query, token, split_daywise = FALSE, delay = 0)

The parameter caching.dir specifies the directory where the cached data is stored. You can use a relative path (as in the example above), which is then resolved against the current working directory.

You can switch caching on and off with the parameter caching.
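
To illustrate the idea behind this kind of caching, here is a minimal sketch in base R. It is not the code from my fork: the helpers cache_key and cached_fetch and the fetch argument are hypothetical names chosen for this example. The assumption is that every query parameter is a single value, so the parameters can be turned into a file name, and that the result can be stored with saveRDS.

```r
# Hypothetical helper (not part of RGoogleAnalytics): derive a stable
# file name from the query parameters.
cache_key <- function(query.list) {
  # Sort by name so the order of the parameters does not change the key
  sorted <- query.list[order(names(query.list))]
  text <- paste(names(sorted), unlist(sorted), sep = "=", collapse = "&")
  # Base R only hashes files, so write the text to a temporary file first
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeLines(text, tmp)
  unname(tools::md5sum(tmp))
}

# Hypothetical wrapper: `fetch` stands for whatever function really asks the API.
cached_fetch <- function(query.list, fetch, caching.dir = "cache", caching = TRUE) {
  if (!caching) {
    return(fetch(query.list))
  }
  dir.create(caching.dir, showWarnings = FALSE, recursive = TRUE)
  path <- file.path(caching.dir, paste0(cache_key(query.list), ".rds"))
  if (file.exists(path)) {
    # Cache hit: no request is sent
    return(readRDS(path))
  }
  result <- fetch(query.list)
  saveRDS(result, path)
  result
}
```

With a wrapper like this, the first run sends the request and every later run with the same parameters reads the file from caching.dir instead. Deleting the directory forces a fresh download.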

Pagination

Another shortcoming was the handling of pagination. With the original package you have to activate pagination explicitly with the parameter paginate_query. I wanted pagination to start automatically whenever the result of a query requires it. So now you get a message that pagination is needed, and you still get the full data.
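
The logic of automatic pagination can be sketched roughly like this. Again, this is an illustration and not the code from the fork: fetch_all_pages and fetch_page are hypothetical names, and I assume that each page comes back as a list with the rows of that page and the total number of rows the query matches.

```r
# Hypothetical sketch: `fetch_page(start.index, max.results)` is assumed to
# return list(rows = <data.frame>, total = <number of matching rows>).
fetch_all_pages <- function(fetch_page, max.results = 10000) {
  first <- fetch_page(start.index = 1, max.results = max.results)
  pages <- list(first$rows)
  if (first$total > max.results) {
    # Tell the user, but carry on and fetch the remaining pages anyway
    message("Pagination needed: ", first$total, " rows in total")
    starts <- seq(from = max.results + 1, to = first$total, by = max.results)
    for (s in starts) {
      page <- fetch_page(start.index = s, max.results = max.results)
      pages[[length(pages) + 1]] <- page$rows
    }
  }
  # Combine all pages into one data frame
  do.call(rbind, pages)
}
```

The point of this design is that the caller never has to know in advance whether a query fits into one page.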

Splitting daywise

Splitting a query daywise is the way to avoid sampling. The original code had problems when the last day of the split range returned an empty result. I fixed this issue as well.
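
As a rough sketch of what daywise splitting with empty days looks like: the function below queries each day on its own and drops days without data before combining the rest. The names fetch_daywise and fetch_day are hypothetical and only serve this example. I assume that an empty day comes back as NULL or as a data frame with zero rows.

```r
# Hypothetical sketch: `fetch_day(day)` is assumed to return a data.frame
# for a single Date, or NULL / zero rows when that day has no data.
fetch_daywise <- function(start.date, end.date, fetch_day) {
  days <- seq(as.Date(start.date), as.Date(end.date), by = "day")
  # Index into `days` so every element keeps its Date class
  parts <- lapply(seq_along(days), function(i) fetch_day(days[i]))
  # Keep only days that returned data; this also covers an empty last day
  keep <- vapply(parts, function(p) !is.null(p) && nrow(p) > 0, logical(1))
  # Returns NULL if no day in the range had any data
  do.call(rbind, parts[keep])
}
```

Filtering before the rbind means that it does not matter where in the range the empty day is.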

So if you use RGoogleAnalytics and have run into the same issues, feel free to give my version a try. Any feedback is welcome.