Working with tree-based hierarchies using data.tree

Recently I tried to visualize a hierarchy with Tableau Desktop. The problem was that the hierarchy was tree-based and therefore had a variable depth: each row had an id and a parent_id. Normally, hierarchies in Tableau are defined by combining a fixed set of fields, such as product category, product group and product id.

Handling tree-based hierarchies seems to be a lot more complex. I found a plugin at https://github.com/tableau/extension-hierarchy-navigator-sandboxed but this only works online.

So I asked myself how I could handle this using R. I found the R package data.tree at https://github.com/gluc/data.tree. In this post I describe how I use this package to preprocess my data.
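To make the id/parent_id problem concrete, here is a minimal base R sketch of the underlying idea: turning a variable-depth tree into fixed level columns of the kind Tableau expects. It is not the data.tree approach itself, and the example data, the column names and the helper `path_to_root` are all made up for illustration. It also assumes the data contains no cycles.

```r
# Hypothetical example data: each row has an id and a parent_id (NA marks the root).
nodes <- data.frame(
  id        = c(1, 2, 3, 4, 5),
  parent_id = c(NA, 1, 1, 2, 4),
  name      = c("All", "Hardware", "Software", "Laptops", "Ultrabooks"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: walk from a node up to the root and
# return the names on that path, ordered from root to node.
path_to_root <- function(id, nodes) {
  path <- character(0)
  while (!is.na(id)) {
    row  <- match(id, nodes$id)
    path <- c(nodes$name[row], path)
    id   <- nodes$parent_id[row]
  }
  path
}

paths     <- lapply(nodes$id, path_to_root, nodes = nodes)
max_depth <- max(lengths(paths))

# Pad every path with NA so that all rows get the same number of level columns.
level_cols <- t(vapply(
  paths,
  function(p) c(p, rep(NA_character_, max_depth - length(p))),
  character(max_depth)
))
colnames(level_cols) <- paste0("level_", seq_len(max_depth))

# One row per node, with its full path spread over level_1 ... level_n.
flat <- cbind(nodes, level_cols, stringsAsFactors = FALSE)
```

Each row of `flat` then carries its complete path from the root, so the level columns can be combined into an ordinary fixed-depth hierarchy.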

Using R 4.1.0 and R 4.0.5 on macOS using RSwitch

Recently, R version 4.1.0 was released on CRAN (see https://cran.r-project.org).

The macOS version was one day late, but yesterday it was released, too. I wanted to test the new version while still being able to go back to the old one. “No problem”, I thought, “I’m already using RSwitch”. RSwitch is a tiny Mac program which allows you to switch between installed R versions.

But unfortunately RSwitch told me that R version 4.0.5 was incomplete after I had installed version 4.1.0.

Accessing PIWIK PRO from R

The main tool for tracking activity on a website is Google Analytics. But more and more websites are switching to other tools such as Matomo or PIWIK PRO due to GDPR.

My employer decided to switch to PIWIK PRO, too. So I was looking for a way to access the data PIWIK PRO was collecting in order to process it with R. When we used Google Analytics as our web analytics tool, I used RGoogleAnalytics. I added some enhancements, such as caching the data and splitting the requests into daily chunks, to handle sampling issues with Google Analytics.
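The two enhancements mentioned above, daily chunks and caching, can be sketched in base R roughly as follows. This is only an illustration of the pattern, not the code I actually used: `fetch_day` is a hypothetical stand-in for a real API request, and `fetch_range` and the column names are made up as well.

```r
# Hypothetical stand-in for a real API call; returns one row for the requested day.
fetch_day <- function(day) {
  data.frame(date = day, sessions = NA_integer_)
}

# Fetch a date range one day at a time. Results are stored in an
# environment keyed by date, so days already fetched are not requested again.
fetch_range <- function(from, to, cache = new.env()) {
  days <- seq(as.Date(from), as.Date(to), by = "day")
  chunks <- lapply(seq_along(days), function(i) {
    key <- format(days[i])
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, fetch_day(days[i]), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  })
  # Combine the daily chunks into one data frame.
  do.call(rbind, chunks)
}

cache  <- new.env()
result <- fetch_range("2021-05-01", "2021-05-03", cache = cache)
```

Requesting small daily chunks keeps each query small, which is what helps with sampling, and passing the same `cache` to a later call with an overlapping date range only fetches the days that are still missing.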

Unfortunately I could not find any R package providing access to PIWIK PRO data. So I wrote my own: piwikproR.