Data visualizations that make smart use of icons have a way of conveying information that sticks. Dataviz professionals like Moritz Stefaner know this and use the practice in their daily work.
A recent #tidytuesday entry by Georgios Karamanis demonstrates how easy it is to integrate visual icons in your data figures when you write code in R. You can simply store the URL location of an icon as a data column, and map it to an aesthetic using the ggimage::geom_image function, which extends ggplot2.
Do have a closer look at Georgios’ GitHub repository for week 21 of tidytuesday. You will probably have to alter the code a bit to get it to work, though!
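As a minimal sketch of the idea (not Georgios’ actual code), the snippet below stores icon locations in a data column and maps that column to the image aesthetic of geom_image. It assumes the ggimage package is installed; the data values and the example.com URLs are hypothetical placeholders that you would swap for real image locations.

```r
library(ggplot2)
library(ggimage)  # provides geom_image(); assumed to be installed

# Hypothetical data: one row per item, with the icon location stored as a column.
# The URLs are placeholders, not real icons.
animals <- data.frame(
  name  = c("cat", "dog", "bird"),
  count = c(12, 18, 7),
  icon  = c(
    "https://example.com/icons/cat.png",
    "https://example.com/icons/dog.png",
    "https://example.com/icons/bird.png"
  )
)

# Map the icon column to the image aesthetic, just like any other variable
p <- ggplot(animals, aes(x = name, y = count)) +
  geom_image(aes(image = icon), size = 0.1)

# print(p) fetches and draws the icons, so it needs URLs that actually resolve
```

Building the plot object does not touch the network; the icons are only fetched when the plot is drawn.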
For those who haven’t moved away from base R plotting functions yet, here’s a good StackOverflow item showing how to use icons in both base R and tidyverse.
Disclaimer: This page contains one or more links to Amazon. Any purchases made through those links provide us with a small commission that helps to host this blog.
Data visualization and the (in)effective communication of information are salient topics on this blog. I just love to read and write about best practices related to data visualization (or bad practices), or to explore novel types of complex graphs. However, I am not always online, and I am equally fond of reading about data visualization offline.
These amazing books about data visualization are written by some of the leading experts in the dataviz scene:
Looking for a custom typeface to use in your data visualizations? Google Fonts is an awesome databank of nearly a thousand font families you can access, download, and use for free.
Some time back the animations below went sort of viral in the statistical programming community. In them, economics professor Nick Huntington-Klein demonstrates step-by-step how statistical tests estimate effect sizes.
I've been getting used to gganimate and thought it would be useful to put together some illustrations of what various causal inference methods *actually do to data* and how they work. Here, for example, is what it means to control for a (binary) variable pic.twitter.com/lmEvJSPQgY
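To make the idea in that tweet concrete, here is a rough base R sketch of what controlling for a binary variable does to the data. It is not Nick's animation code: the simulated data, the variable names (w, x, y), and the effect sizes are all assumptions made for illustration. The sketch removes the group means of x and y within each level of w and then looks at the relationship that is left over.

```r
set.seed(1)

# Simulated data (hypothetical): w is a binary confounder that shifts both x and y
n <- 200
w <- rbinom(n, 1, 0.5)
x <- 2 * w + rnorm(n)
y <- 3 * w + 0.5 * x + rnorm(n)  # the effect of x on y is set to 0.5 by construction

# Naive slope, ignoring w: mixes the effect of x with the group differences
raw_slope <- coef(lm(y ~ x))[["x"]]

# "Controlling for w": subtract the group mean of each variable within each level of w
x_res <- x - ave(x, w)
y_res <- y - ave(y, w)
controlled_slope <- coef(lm(y_res ~ x_res))[["x_res"]]

# The same slope comes out of a regression that includes w directly
full_slope <- coef(lm(y ~ x + w))[["x"]]
```

Comparing raw_slope with controlled_slope shows how much of the naive relationship was due to differences between the two groups, and controlled_slope matches full_slope up to floating-point error.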
The BBC data team developed an R package (bbplot) that makes creating publication-ready graphics in their in-house style with R’s ggplot2 library more reproducible, and makes it easier for people new to R to create graphics.
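As a small sketch of how this looks in practice, the snippet below adds the package's bbc_style() theme to an ordinary ggplot2 chart. It assumes bbplot has been installed from the BBC's GitHub repository (it is not on CRAN); the mtcars data, the colour, and the titles are my own choices for illustration, not taken from the BBC's materials.

```r
library(ggplot2)
library(bbplot)  # assumed installed from GitHub (bbc/bbplot); not available on CRAN

# A plain scatter plot of a built-in dataset, restyled with the BBC theme
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "#1380A1", size = 3) +
  bbc_style() +
  labs(
    title    = "Heavier cars use more fuel",
    subtitle = "Weight (1,000 lbs) versus miles per gallon, mtcars data"
  )
```

The package also provides finalise_plot() for saving a chart with a source line; check its documentation for the exact arguments before relying on it.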
Apart from sharing several best practices related to data visualization, they walk you through the steps and R code to create graphs such as the below:
The data team at the Economist also felt a need to share their lessons learned via Medium. They show some of their most misleading, confusing, and failing graphics of the past years, and share the following mistakes and their remedies:
Truncating the scale (image #1 below)
Forcing a relationship by cherry-picking scales
Choosing the wrong visualisation method (image #2 below)
Taking the “mind-stretch” a little too far (image #3 below)
Confusing use of colour (image #4 below)
Including too much detail
Lots of data, not enough space
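The first mistake on this list, truncating the scale, is easy to reproduce. The sketch below uses two made-up values (they are not from the Economist's data) and compares a bar chart that starts at zero with one whose y-axis is zoomed in. It assumes only that ggplot2 is installed.

```r
library(ggplot2)

# Hypothetical values, made up purely for illustration
scores <- data.frame(group = c("A", "B"), value = c(98, 102))

# Honest version: bars start at zero
p_zero <- ggplot(scores, aes(x = group, y = value)) +
  geom_col()

# Truncated version: zooming the y-axis to 95-105 hides most of each bar
baseline <- 95
p_truncated <- p_zero + coord_cartesian(ylim = c(baseline, 105))

# Ratio of the drawn bar heights (B relative to A) under each baseline
ratio_zero      <- scores$value[2] / scores$value[1]
ratio_truncated <- (scores$value[2] - baseline) / (scores$value[1] - baseline)
```

With these numbers the real difference is about 4 percent, yet in the truncated chart bar B is drawn more than twice as tall as bar A.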
Moreover, they share the data behind these failing and repaired data visualizations:
All of 538’s data visualizations are just stunningly beautiful and often very ingenious, using new chart formats to display complex patterns. Moreover, the range of topics they cover is huge: anything from their traditional background, politics, to great cover stories on sumo wrestling and pricey wine.
Josh Starmer is an assistant professor in the genetics department of the University of North Carolina at Chapel Hill.
But more importantly: Josh is the mastermind behind StatQuest!
StatQuest is a YouTube channel (and website) dedicated to explaining complex statistical concepts, such as data distributions, probability, or novel machine learning algorithms, in simple terms.
Once you watch one of Josh’s “Stat-Quests”, you immediately recognize the effort he has put into this project. Using great visuals, a just-about-right pace, and relatable examples, Josh makes statistics accessible to everyone. For instance, take this series on logistic regression:
And do you really know what happens under the hood when you run a principal component analysis? After this video you will:
Or, if you are more interested in learning the fundamental concepts behind machine learning, Josh has some videos for you, for instance on bias and variance or gradient descent:
With nearly 200 videos and counting, StatQuest is truly an amazing resource for students and teachers on topics related to statistics and data analytics. For some of the concepts, Josh even posted videos running you through the analysis steps and results interpretation in the R language.
StatQuest started out as an attempt to explain statistics to my co-workers – who are all genetics researchers at UNC-Chapel Hill. They did these amazing experiments, but they didn’t always know what to do with the data they generated. That was my job. But I wanted them to understand that what I do isn’t magic – it’s actually quite simple. It only seems hard because it’s all wrapped up in confusing terminology and typically communicated using equations. I found that if I stripped away the terminology and communicated the concepts using pictures, it became easy to understand.
Over time I made more and more StatQuests and now it’s my passion on YouTube.