forcats::fct_lump_n() with weights "overall"

Sometimes I want to combine categories that don’t have much impact on my analysis. The best way to do this is with one of the forcats::fct_lump*() functions. But I often struggle to remember how to use weights to rank the categories. That’s because the main use case of fct_lump() is a factor vector with many repeated values: it keeps the n most frequent levels and combines the rest as “Other”.
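As a reminder of the weighted variant, here is a minimal sketch. It assumes the data is already aggregated to one row per category; the data frame `sales` and its columns `category` and `revenue` are made up for illustration.

```r
library(forcats)

# Hypothetical pre-aggregated data: one row per category plus a weight column.
sales <- data.frame(
  category = factor(c("a", "b", "c", "d", "e")),
  revenue  = c(50, 30, 10, 6, 4)
)

# Keep the 2 categories with the largest summed weight and lump the rest.
# Without `w`, every category appears once, so counts alone could not rank them.
sales$category_lumped <- fct_lump_n(sales$category, n = 2, w = sales$revenue)

# Total revenue per lumped category.
totals <- tapply(sales$revenue, sales$category_lumped, sum)
print(totals)
```

The `w` argument is what turns the ranking from "most frequent" into "largest total weight", which is the part I tend to forget.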

JSON, NULL values and as_tibble

When working with data provided by common APIs, you will almost always come across JSON-formatted data. R’s rjson::fromJSON transforms JSON into R lists. So far so good. But converting those lists to a tibble using tibble::as_tibble will fail when the JSON (and therefore the list) contains NULL values. So you have to replace them before building the tibble.
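One way to do the replacement, sketched for a flat (non-nested) record only. The JSON string and its field names are invented for this example, and nested lists would need a recursive version.

```r
library(tibble)

# Hypothetical API response with a null field.
json <- '{"id": 1, "name": "Alice", "email": null}'

# rjson turns the JSON object into a named list; "email" comes back as NULL.
record <- rjson::fromJSON(json)

# Replace every top-level NULL with NA so each element is a length-1 vector.
cleaned <- lapply(record, function(value) if (is.null(value)) NA else value)

# Now the list converts to a one-row tibble.
result <- as_tibble(cleaned)
print(result)
```

Using a plain logical NA is the simplest choice here; if the column type matters, a typed NA such as NA_character_ could be substituted per field.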

How to use nflfastR with Google BigQuery?

Lately I wanted to play around with nflfastR. It’s a great package that gives you access to the NFL’s play-by-play data since 1999. It lets you download all the data and store it in several different databases.

Unfortunately I ran into trouble when I tried to import the data to Google’s BigQuery.

Where does bigrquery store credential-tokens on a Mac?

This blog post is mainly a reminder to myself of where I can delete this information.

But what’s this all about?

Recently I installed bigrquery and ran queries against Google’s BigQuery cloud database before I had installed gcloud (and completed the authorization described at https://docs.getdbt.com/reference/warehouse-setups/bigquery-setup#local-oauth-gcloud-setup).

So R asked me to authorize the session in a browser. So far so good.

But every time I started a new R session, I was asked whether I wanted to use the already-known account or a different one.

So I was wondering where this account information was stored and how I could delete this information.
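A quick way to look this up from within R, as a sketch. It assumes that bigrquery hands its OAuth handling to the gargle package, so that gargle's token cache is the place to look.

```r
# Assumption: bigrquery delegates authentication and token caching to gargle.
library(gargle)

# Print a situation report: the cache directory in use and the tokens in it.
gargle_oauth_sitrep()
```

On a Mac I would expect the reported directory to be somewhere under ~/Library/Caches, but that is an assumption; the path printed by the report is the one to trust. Deleting a token file from that directory should make R forget the corresponding account.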