Caching and other enhancements to RGoogleAnalytics
RGoogleAnalytics is a great R package for accessing Google Analytics data from R via the API. The package was written by Tatvic, a web analytics consulting company from India. They also regularly offer free webinars about web analytics.
The excellent R blog R-bloggers.com has also published several posts about RGoogleAnalytics:
- How to extract Google Analytics data in R using RGoogleAnalytics
- Query Multiple Google Analytics View IDs with R
I use this R package regularly, but I came across some shortcomings. First, I missed caching of the data I retrieve from Google. During development I fetch the same data over and over again, so it would be nice if this data could be cached and I didn’t have to query the Google API on each run. Unfortunately, this package doesn’t seem to get very much attention from Tatvic. So I decided to fork the package on GitHub and patch it. You can get the fork at GitHub.
Caching
To use caching, you have to add two parameters to the Init method: caching.dir and caching.
The parameter caching.dir specifies the directory where the cached data is stored. It’s possible to use a relative path, which is then resolved against the current working directory.
You can switch caching on or off with the parameter caching.
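To illustrate the idea behind these two parameters, here is a minimal sketch of file-based caching in base R. It is not the code from the fork: cached_fetch, fetch_fun and demo_fetch are hypothetical names made up for this example, and only the parameter names caching.dir and caching are taken from the text above. The sketch assumes that a query can be identified by a string.

```r
# Hypothetical helper (not part of RGoogleAnalytics): wraps any fetch
# function with a simple file cache keyed by the query string.
cached_fetch <- function(query, fetch_fun, caching.dir = "cache", caching = TRUE) {
  if (!caching) {
    return(fetch_fun(query))
  }
  dir.create(caching.dir, showWarnings = FALSE, recursive = TRUE)

  # Derive a stable file name from the query text using base R only:
  # write the query to a temporary file and take its MD5 checksum.
  key_file <- tempfile()
  on.exit(unlink(key_file))
  writeLines(query, key_file)
  key <- unname(tools::md5sum(key_file))
  cache_file <- file.path(caching.dir, paste0(key, ".rds"))

  # Cache hit: read the stored result instead of calling the API.
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }

  # Cache miss: fetch the data and store it for the next run.
  result <- fetch_fun(query)
  saveRDS(result, cache_file)
  result
}

# Demo with a fake fetch function that counts how often it is called.
fetch_calls <- 0
demo_fetch <- function(query) {
  fetch_calls <<- fetch_calls + 1
  data.frame(query = query, visits = 42)
}

demo_dir <- tempfile()  # throwaway cache directory for the demo
first  <- cached_fetch("metrics=ga:visits", demo_fetch, caching.dir = demo_dir)
second <- cached_fetch("metrics=ga:visits", demo_fetch, caching.dir = demo_dir)
```

In the demo, the second call is served from the cache directory, so the fake fetch function runs only once.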
Pagination
Another shortcoming was the handling of pagination. With the original package you have to activate pagination explicitly with the parameter paginate_query. I decided that pagination should start automatically whenever the response to a query needs it. So now you get a message that pagination is needed, but you also get the full data.
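The behaviour described above can be sketched as a simple loop. This is only an illustration under assumptions, not the code from the fork: fetch_all and fetch_page are hypothetical names, and fetch_page is assumed to return a list holding the rows of one page together with the total number of results.

```r
# Hypothetical sketch of automatic pagination (not the fork's actual code).
# fetch_page(start.index, max.results) is assumed to return a list with
#   rows          - a data frame holding one page of results
#   total.results - the total number of rows available for the query
fetch_all <- function(fetch_page, max.results = 10000) {
  start.index <- 1
  pages <- list()
  repeat {
    page <- fetch_page(start.index, max.results)
    pages[[length(pages) + 1]] <- page$rows
    start.index <- start.index + max.results
    # Stop once the next start index lies beyond the last available row.
    if (start.index > page$total.results) break
    # Tell the user once that more than one request is needed.
    if (length(pages) == 1) message("Pagination needed, fetching remaining pages ...")
  }
  do.call(rbind, pages)
}

# Demo with a fake data source of 25 rows and a page size of 10.
demo_data <- data.frame(id = 1:25)
demo_fetch_page <- function(start.index, max.results) {
  end.index <- min(start.index + max.results - 1, nrow(demo_data))
  list(rows = demo_data[start.index:end.index, , drop = FALSE],
       total.results = nrow(demo_data))
}
all_rows <- fetch_all(demo_fetch_page, max.results = 10)
```

The demo needs three requests and returns all 25 rows in a single data frame.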
Splitting daywise
Splitting a query day by day is a way to avoid sampling. The original code had a problem when the last day of the split range returned an empty result. I fixed this issue as well.
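As a rough sketch of the idea, assuming a hypothetical function fetch_day that returns a data frame for a single day (or NULL when there is no data), a day-by-day split that tolerates empty days could look like this. The names fetch_daywise, fetch_day and demo_fetch_day are made up for the example and are not the fork's API.

```r
# Hypothetical sketch of a day-by-day split (not the fork's actual code).
# fetch_day(day) is assumed to return a data frame for one day, or NULL
# (or a data frame with zero rows) when that day has no data.
fetch_daywise <- function(start.date, end.date, fetch_day) {
  days <- seq(as.Date(start.date), as.Date(end.date), by = "day")
  results <- lapply(seq_along(days), function(i) fetch_day(days[i]))
  # Drop empty days so that they cannot break the final rbind,
  # regardless of where in the range they occur.
  non_empty <- Filter(function(x) !is.null(x) && nrow(x) > 0, results)
  if (length(non_empty) == 0) {
    return(NULL)
  }
  do.call(rbind, non_empty)
}

# Demo: the last day of the range returns no data.
demo_fetch_day <- function(day) {
  if (day == as.Date("2014-01-03")) {
    return(NULL)
  }
  data.frame(date = format(day), visits = 1)
}
daily <- fetch_daywise("2014-01-01", "2014-01-03", demo_fetch_day)
```

Here the empty third day is skipped, and the result contains the rows for the first two days.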
So if you use RGoogleAnalytics and have run into the same issues, feel free to give my version a try. Any feedback is welcome.