Blog

Most Recurring Word on each Country's Wikipedia Page

A friend recently gave me a fun book: "Brilliant maps. An atlas for curious minds". Among its many beautiful maps, one shows the most frequent word on each country’s English Wikipedia page. Let's try to reproduce it using R.

Introducing my new R package {BFS}

The Swiss Federal Statistical Office, or BFS from “Bundesamt für Statistik” in German, provides a rich public database. As a Swiss citizen and R enthusiast, I wanted to easily access its datasets directly from R. So I created the BFS package.

The Evolution of Regional Inequalities Around the World

Last month "Nature" published a paper introducing new data on regional human development across the globe. I couldn't resist having a look at this new database and trying some exploratory analysis.

Is There Gender Equality in the Pokémon Universe?

Last week I came across the fun dataset of Reddit's current Data Is Beautiful DataViz monthly challenge, "Information on All 802 Pokemon", and was surprised to discover a `percentage_male` variable. So I decided to dig further into this gender dimension of the Pokémon universe.

Slides of my Talk at the R Users Meetup Geneva

This July I had the chance to speak at the R Users Meetup Geneva. I shared what I learned from my year-long exploration of the tidyverse through blogging. It was such a pleasure to meet and talk with other R users as passionate as I am!

Can you Guess a Cuisine from its Ingredients?

Cooking is sometimes used as a metaphor for data preparation in machine learning. To practice my skills in machine learning, I decided to look for a dataset related to cooking. I found a Kaggle dataset where the goal of the competition is to predict the category of a dish’s cuisine given a list of its ingredients.

Reproducing The Economist's Most Popular Map of 2017

I am a big fan of The Economist's charts, in particular the ones published in the Graphic detail blog. According to their Christmas countdown, The Economist's most popular map of 2017 was this one (see below). In this article, we will reproduce it with `ggplot2`, `sf` and `leaflet`.

Predicting the Number of Swiss Phone Calls per Hour

I recently heard that one of the major Swiss telecommunications providers, Swisscom AG, decided to share data on an open data portal. As a Swiss citizen, I was curious to see if I could get my hands on interesting datasets.

My Package {polyglot} Is Now on CRAN

A few months ago, I wrote my first package, called {learner}. I finally took the time to put it on CRAN. Following the advice of a CRAN team member, I changed its name. I had to agree: it sounded like a machine learning package. So now the {learner} package is dead, long live {polyglot}!

Do the Rich Countries Always Win?

Like most of us, I watched the Winter Olympic Games. But after seeing the medal table, I had the impression that the richer a country was, the more medals it won. But was that really the case? And if so, to what extent?

Which Marvel Characters and Movies are the Most Central?

To begin this year, I was looking for a quick project related to social network visualization. In this blog post, we will find out which characters and movies are the most central in the Marvel cinematic universe.

The Lack of Female Protagonists in Children’s Movies

Some time ago, I read a very interesting post from Giora Simchoni, who used machine learning to estimate how many children's books have a central female character. Based on a Goodreads list, he showed that for every book with a central female character, there are between 1.1 and 1.3 books written with a male protagonist. I couldn't help thinking that his analysis should be replicated on children's movies.

Introducing My Second R Package, {bfsdata}

The {bfsdata} package makes the data from the Swiss Federal Statistical Office (or BFS for "Bundesamt für Statistik") easily accessible to R users. It lets you search, download and read BFS datasets directly from the R console.

Who's The Most Popular Tennis Player on Twitter?

In the world of men's tennis, only four players, known as the "Big Four", have dominated the main tournaments since 2004. You probably know their names: Roger Federer, Rafael Nadal, Novak Djokovic and Andy Murray. As they regularly make the headlines all over the world, I wanted to know more about their popularity on Twitter with the R packages {rtweet} and {tidytext}.

James Bond Tourism

James Bond must be the most well-travelled man in the history of movies. As it is summertime, let's have a look at the 007 travel locations in the 24 films of the spy franchise with the R packages {maps}, {ggplot2} and {ggmap}.

Introducing {learner}, My First Package!

The {learner} package lets you use the R console as an interactive learning environment in order to memorize any dataset you want. Its main goal is to put foreign language vocabulary learning in the R workflow, so R can also be used to study languages or anything related to flashcards.

Orwell’s 1984, An (Un)Sentimental Analysis

Dystopian books are trendy. After Donald Trump's election, George Orwell's novel 1984 hit the No. 1 spot in Amazon’s book sales chart. Since I also love this book, let's run a text analysis using the R package {tidytext}.

Marvel vs DC Comics

Superhero adaptations shine at the box office. Marvel's The Avengers and Avengers: Age of Ultron are the 5th and 7th biggest hits at the international box office. I felt like comparing the commercial success of the two companies that share a near monopoly on superheroes: Marvel Studios and DC Entertainment.
