Category: visualization

Your own personalized motivational GIF

Your own personalized motivational GIF

One of the R OpenSci ozunconference 2018 projects was all about gganimate. The guys and gals dubbed this project learngganimate and dedicated an expansive GitHub repository to it. As part of this project, they created some — let’s call it — very creative GIFS.

One of their GIFs I particularly liked, copied below. Using the OpenSci syn package they looked up synonyms for cool and printed those in some nice colors.

On GitHub, you can find the original code for this project. However, I didn’t get it working on my machine — due to recent updates to the gganimate package — so I had to create my own version, which you find below.

devtools::install_github("ropenscilabs/syn") # only needed for first-time install
devtools::install_github('thomasp85/gganimate') # install the most recent version of gganimate

library(syn)
library(ggplot2)
library(gganimate)
library(dplyr)

set.seed(1) # for reproducibility purposes

synonyms <- syn("great") # store synonyms for your word of chosing

n = 15 # number of synonyms to sample
time = 3 # their position in the plot as well as the duration of their display

# generate dataframe with random synonyms sentences and assigned locations
sentences_df <- data_frame(
sentence = paste("#rstats ==", sample(synonyms, n), "!!")
, x = time
, y = seq(time, time * n, time)
)

# generate the actual plot
ggplot(sentences_df,
aes(x, -y,
label = sentence,
group = sentence,
fill = sentence)) +
geom_label(size = 10, colour = "white", label.size = 0.3) +
transition_components(id = sentence, time = y,
enter_length = n * time + time ,
exit_length = n * time + time) +
scale_fill_viridis_d() +
theme_void() +
theme(legend.position = "none") ->
plot1

# animate the plot
animate(plot1, nframes = n * time + time)

This code renders the following GIF:

Try to play around with the code to change the GIF:

  • Change the set.seed argument to get different synonyms in there,
  • Change the n to include more or less words,
  • Change the x and y variables to position the labels differently,
  • Change the size, colour, and fill of the geom_label function to change the label design,
  • Or change the transition_components arguments to change the display timing.

Moreover, you could change the sentence variable to something to motivate yourself. For instnace, in the following code, I changed it to include my name, and synonyms for the word good. Moreover, I picked a different gganimate function — transition_time — to display the labels according to a different pattern. 

set.seed(2) # for reproducibility purposes

# generate dataframe with random synonyms sentences and assigned locations
sentences_df <- data_frame(
sentence = paste0("Paul is ", sample(syn("good"), n), "!")
, x = time
, y = seq(time, time * n, time)
)

# generate the actual plot
ggplot(sentences_df,
aes(x, -y,
label = sentence,
group = sentence,
fill = sentence)) +
geom_label(size = 10, colour = "white", label.size = 0.3) +
transition_time(time = y) +
scale_fill_viridis_d() +
theme_void() +
theme(legend.position = "none") ->
plot2

# animate the plot
animate(plot2, nframes = n * time + time)

I think the result is very pleasing, comforting, and positive! Except maybe for the dinkum bit, but fortunately neither I or thesaurus.com know what that means, so it might as well be positive : )

If you go about creating your own animations, you can save them using the save_animation function of the gganimate package. Good luck!


PS. The code to generate the GIF at the top of this blog is posted below. It uses another gganimate function called transition_states:

set.seed(3) # for reproducibility purposes

time = 5
n = 5

# generate dataframe with random synonyms sentences and assigned locations
sentences_df <- data_frame(
sentence = paste0("You are ", sample(syn("amazing"), n), "!")
, x = runif(n)
, y = seq(time, time * n, time)
)

# generate the actual plot
ggplot(sentences_df,
aes(x, -y,
label = sentence,
group = sentence,
fill = sentence)) +
geom_label(size = 12, colour = "white", label.size = 0.5) +
transition_states(states = sentence, transition_length = time, state_length = time) +
theme_void() +
theme(legend.position = "none") +
coord_cartesian(xlim = c(-0.5, 1.5)) ->
plot3

# animate the plot
animate(plot3, nframes = n * time + time)

Animated Citation Gates turned into Selection Gates

Bret Beheim — senior researcher at the Max Planck Institute for Evolutionary Anthropology — posted a great GIF animation of the response to his research survey. He calls the figure citation gates, relating the year of scientific publication to the likelihood that the research materials are published open-source or accessible.

To generate the visualization, Bret used R’s base plotting functionality combined with Thomas Lin Pedersen‘s R package tweenrto animate it.

Bret shared his R code for the above GIF of his citation gates on GitHub. With the open source code, this amazing visual display inspired others to make similar GIFs for their own projects. For example, Anne-Wil Kruijt’s dance of the confidence intervals:

A spin-off of the citation gates: A gif showing confidence intervals of sample means.

Applied to a Human Resource Management context, we could use this similar animation setup to explore, for instance, recruitment, selection, or talent management processes.

Unfortunately, I couldn’t get the below figure to animate properly yet, but I am working on it (damn ggplot2 facets). It’s a quick simulation of how this type of visualization could help to get insights into the recruitment and selection process for open vacancies.

The figure shows how nearly 200 applicants — sorted by their age — go through several selection barriers. A closer look demonstrates that some applicants actually skip the screening and assessment steps and join via a fast lane in the first interview round, which could happen, for instance, when there are known or preferred internal candidates. When animated, such insights would become more clearly visible.

Chatterplots

Chatterplots

I’ve mentioned before that I dislike wordclouds (for instance here, or here) and apparently others share that sentiment. In his recent Medium blog, Daniel McNichol goes as far as to refer to the wordcloud as the pie chart of text data! Among others, Daniel calls wordclouds disorienting, one-dimensional, arbitrary and opaque and he mentions their lack of order, information, and scale. 

Wordcloud of the negative characteristics of wordclouds, via Medium

Instead of using wordclouds, Daniel suggests we revert to alternative approaches. For instance, in their Tidy Text Mining with R book, Julia Silge and David Robinson suggest using bar charts or network graphs, providing the necessary R code. Another alternative is provided in Daniel’s blogthe chatterplot!

While Daniel didn’t invent this unorthodox wordcloud-like plot, he might have been the first to name it a chatterplot. Daniel’s chatterplot uses a full x/y cartesian plane, turning the usually only arbitrary though exploratory wordcloud into a more quantitatively sound, information-rich visualization.

R package ggplot’s geom_text() function — or alternatively ggrepel‘s geom_text_repel() for better legibility — is perfectly suited for making a chatterplot. And interesting features/variables for the axis — apart from the regular word frequencies — can be easily computed using the R tidytext package. 

Here’s an example generated by Daniel, plotting words simulatenously by their frequency of occurance in comments to Hacker News articles (y-axis) as well as by the respective popularity of the comments the word was used in (log of the ranking, on the x-axis).

[CHATTERPLOTs arelike a wordcloud, except there’s actual quantitative logic to the order, placement & aesthetic aspects of the elements, along with an explicit scale reference for each. This allows us to represent more, multidimensional information in the plot, & provides the viewer with a coherent visual logic& direction by which to explore the data.

Daniel McNichol via Medium

I highly recommend the use of these chatterplots over their less-informative wordcloud counterpart, and strongly suggest you read Daniel’s original blog, in which you can also find the R code for the above visualizations.

Mathematical aRt

Marcus Volz is a research fellow at the University of Melbourne, studying geometric networks, optimisation and computational geometry. He’s interested in visualisation, and always looking for opportunities to represent complex information in novel ways to accelerate learning and uncover the unexpected.

One of Marcus’ hobbies is the visualization of mathematical patterns and statistical algorithms via R. He has a whole portfolio full of them, including a Github page with all the associated R code. For my recent promotion, my girlfriend asked Marcus to generate a K-nearest neighbors visual and she had it printed on a large canvas.

20181109_143559.jpg

The picture contains about 10.000 points, randomly uniformly distributed across x and y, connected by lines with their closest other points. Marcus shared the code to generate such k-nearest neighbor algorithm plots here on Github. So if you know your way around R, you could make your own version:

#' k-nearest neighbour graph
#'
#' Computes a k-nearest neighbour graph for a given set of points. Refer to the \href{https://en.wikipedia.org/wiki/Nearest_neighbor_graph}{Wikipedia article} for details.
#' @param points A data frame with x, y coordinates for the points
#' @param k Number of neighbours
#' @keywords nearest neightbour graph
#' @export
#' @examples
#' k_nearest_neighbour_graph()

k_nearest_neighbour_graph <- function(points, k=8) {
  get_k_nearest <- function(points, ptnum, k) {
    xi <- points$x[ptnum]
    yi <- points$y[ptnum]     points %>%
      dplyr::mutate(dist = sqrt((x - xi)^2 + (y - yi)^2)) %>%
      dplyr::arrange(dist) %>%
      dplyr::filter(row_number() %in% seq(2, k+1)) %>%
      dplyr::mutate(xend = xi, yend = yi)
  }
  
  1:nrow(points) %>%
    purrr::map_df(~get_k_nearest(points, ., k))
}

Those less versed in R can use Marcus package mathart. With this package, Marcus shares many more visual depictions of cool algorithms! You can install the package and several dependencies with the following lines of code:

install.packages(c("devtools", "mapproj", "tidyverse", "ggforce", "Rcpp"))
devtools::install_github("marcusvolz/mathart")
devtools::install_github("marcusvolz/ggart")

Subsequently, you can visualize all kinds of cool stuff, like for instance rapidly exploring random trees (see this Wikipedia article for details):

# Generate rrt edges
set.seed(1)
df <- rapidly_exploring_random_tree() %>% mutate(id = 1:nrow(.))

# Create plot
ggplot() +
  geom_segment(aes(x, y, xend = xend, yend = yend, size = -id, alpha = -id), df, lineend = "round") +
  coord_equal() +
  scale_size_continuous(range = c(0.1, 0.75)) +
  scale_alpha_continuous(range = c(0.1, 1)) +
  theme_blankcanvas(margin_cm = 0)
rrt
Via https://github.com/marcusvolz/mathart

This k-d tree (see this Wikipedia article for details) is also amazing:

result <- kdtree(mathart::points)

ggplot() +
  geom_segment(aes(x, y, xend = xend, yend = yend), result) +
  coord_equal() +
  xlim(0, 10000) + ylim(0, 10000) +
  theme_blankcanvas(margin_cm = 0)
kdtree
Via https://github.com/marcusvolz/mathart

This page of Marcus’ mathart Github repository contains the code exact code for these and many other visualizations of algorithms and statistical phenomena. Do check it out if you’re interested!

 

Also, check out the “Fun” section of my R tips and tricks list for more cool visuals you can generate in R!

dygraphs

dygraphs

Today I learned about dygraphs, a fast, flexible open source JavaScript charting library. As everything in JavaScript, the charts produced by dygraphs integrate completely in the webbrowser and are thus very functional and interactive. See, for instance, the below where the graph highlights the y-axis value for both time series in the graph based on the x-axis value of my mouse location (January 24 2009). Very cool!

1.png

While I am no JS hero, the webpage includes a dypgrahs tutorial, as well as a playground environment.

Fortunately, I do know my way around R, and of course someone had already integrated dypgrahs in R in the form of the dygraphs R package. It works like a charm!

install.packages("dygraphs")
library("dygraphs")

dygraph(AirPassengers)

Also in R, your dygraphs are fully interactive, with my mouse hoevering over June 1951 in the below example.

2.PNG

And you can add all kinds of cool elements and modifications to the graphs, such as for instance a range selector:

dygraph(AirPassengers) %>% dyRangeSelector()

3.PNG

For the full range of visualization options dygraphs offers in R, please do have a look at the official RStudio page.

100 amazing color palettes including their Hex codes

100 amazing color palettes including their Hex codes

TJ Mahr hinted to this Canva webpage on Twitter. It contains 100 beautiful color palettes including their hexadecimal color codes. For instance, these three below.

The great thing is that these color palettes are include in the ggthemes package in R. Hence, the following code uses this Nightlife palette directly in an R script, resulting in the plot below.

library(ggplot2)
library(ggthemes)

ggplot(mtcars) +
  aes(x = disp, y = mpg, color = factor(cyl)) +
  geom_point(size = 6) +
  ggthemes::scale_color_canva(palette = "Nightlife")

Rplot

What’s your favorite color palette among these 100?