18 Pitfalls of Data Visualization

Maarten Lambrechts is a data journalist whose work I follow closely online, with great delight. Recently, he shared his slide deck on the 18 most common data visualization pitfalls on Twitter. You will probably be familiar with most of them already, but some (like #14) were new to me:

  1. Save pies for dessert
  2. Don’t cut bars
  3. Don’t cut time axes
  4. Label directly
  5. Use colors deliberately
  6. Avoid chart junk
  7. Scale circles by area
  8. Avoid double axes
  9. Correlation is no causality
  10. Don’t do 3D
  11. Sort on the data
  12. Tell the story
  13. 1 chart, 1 message
  14. Common scales on small mult’s
  15. #Endrainbow
  16. Normalise data on maps
  17. Sometimes best map is no map
  18. All maps lie
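
Rule 7 is easy to get wrong in code: if a circle's radius is made proportional to the value, its area grows with the square of the value and differences look exaggerated. Below is a minimal Python sketch of area-based scaling. The function name, the `max_radius` parameter, and the example values are made up for illustration and do not come from the slide deck.

```python
import math

def radius_for_value(value, max_value, max_radius=20.0):
    """Return a radius such that the circle's AREA is proportional to value.

    The largest value gets max_radius; all names here are hypothetical.
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    if value < 0:
        raise ValueError("value must be non-negative")
    # Area ~ r^2, so the radius has to scale with the square root of the value.
    return max_radius * math.sqrt(value / max_value)

values = [100, 50, 25]  # toy data
radii = [radius_for_value(v, max(values)) for v in values]
areas = [math.pi * r ** 2 for r in radii]
```

With this scaling, the circle for 50 covers half the area of the circle for 100, which is what a reader intuitively compares.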

Even though most of the 18 rules above seem quite obvious, even the European Commission seems to break them every now and then:

Can you spot what’s wrong with this graph?

Circular Map Cutouts in R

Katie Jolly wanted to surprise a friend with a nice geeky gift: a custom-made map cutout. Using R and some visual fine-tuning in Inkscape, she was able to make the map below.

A detailed write-up of how Katie got to this product is posted here.

Basically, R’s tigris package provided all the road data, and the ArcGIS Open Data Hub provided the neighborhood boundaries. Fortunately, the sf package is great for transforming and manipulating geospatial data, and includes functions that can be used to retrieve a subset of roads based on their distance to a centroid. With this subset, Katie could then build these wonderful plots in no time with ggplot2.
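
To illustrate the idea of subsetting by distance to a centroid, here is a rough sketch in plain Python. It is not Katie's R code and it does not use sf. It assumes coordinates are already projected to a planar system (so straight-line distance is meaningful), it approximates the centroid as the mean of the boundary vertices, and it only filters points rather than clipping road lines at the circle's edge. All function names and coordinates are hypothetical.

```python
import math

def vertex_centroid(vertices):
    """Approximate a centroid as the mean of the boundary vertices.

    This is a simplification: a true polygon centroid weights by area.
    """
    xs, ys = zip(*vertices)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def within_radius(points, center, radius):
    """Keep the points whose planar distance to center is at most radius."""
    cx, cy = center
    return [(x, y) for x, y in points
            if math.hypot(x - cx, y - cy) <= radius]

# Toy neighborhood boundary and road points (made-up planar coordinates).
neighborhood = [(0, 0), (4, 0), (4, 4), (0, 4)]
road_points = [(2, 2), (3, 2), (6, 2), (2, 5.5), (0, 0)]

center = vertex_centroid(neighborhood)
cutout = within_radius(road_points, center, radius=3)
```

In a real workflow, a geospatial library would handle projections and clip the line geometries against a circular buffer around the centroid.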

What to consider when choosing colors for data visualization, by DataWrapper.de

Lisa Charlotte Rost of DataWrapper often writes about data visualization, and lately she has focused on the (im)proper use of color in visualization. In this recent blog post, she gives a bunch of great tips and best practices, some of which I copied below:

Gradient colors can be great to show a pattern but, for categorical data, it is often easier to highlight the most important values with colored bars, positions (like in a dot plot) or even areas. [https://blog.datawrapper.de/colors/]
If you need more than seven colors in a chart, consider using another chart type or to group categories together. [https://blog.datawrapper.de/colors/]
Consider using the same color for the same variables, but do differentiate between categories, even across graphics. [https://blog.datawrapper.de/colors/]
Using grey for less important elements in your chart makes your highlight colors (which should be reserved for your most important data points) stick out even more.  [https://blog.datawrapper.de/colors/]
Consider color-blind people. There are many different types of color blindness: Use an online tool or Datawrapper’s automatic colorblind-check. [https://blog.datawrapper.de/colors/]
You can find additional useful tips in the original DataWrapper blog.

Xenographics: Unusual charts and maps

Xeno.graphics is the collection of unusual charts and maps that Maarten Lambrechts maintains. It’s a repository of novel, innovative, and experimental visualizations to inspire you, to fight xenographphobia, and to popularize new chart types.

For instance, have you ever before heard of a time curve? These are very useful to visualize the development of a relationship over time.

Time curves are based on the metaphor of folding a timeline visualization into itself so as to bring similar time points close to each other. This metaphor can be applied to any dataset where a similarity metric between temporal snapshots can be defined, thus it is largely datatype-agnostic. [https://xeno.graphics/time-curve]
The UpSet plot is another example of an up-and-coming visualization. It shows the overlaps or intersections between sets in a dataset, for instance in the social network of #rstats Twitter heroes, as in the example below from the Xenographics website.

Understanding relationships between sets is an important analysis task. The major challenge in this context is the combinatorial explosion of the number of set intersections if the number of sets exceeds a trivial threshold. To address this, we introduce UpSet, a novel visualization technique for the quantitative analysis of sets, their intersections, and aggregates of intersections. [https://xeno.graphics/upset-plot/]
The necklace map below was new to me too, and what exactly it does is still not entirely clear to me.

In a necklace map, the regions of the underlying two-dimensional map are projected onto intervals on a one-dimensional curve (the necklace) that surrounds the map regions. Symbols are scaled such that their area corresponds to the data of their region and placed without overlap inside the corresponding interval on the necklace. [https://xeno.graphics/necklace-map/]
There are hundreds of other interesting charts, maps, figures, and plots, so do have a look yourself. Moreover, the xenographics collection is still growing. If you know of one that isn’t here already, please submit it. You can also expect some posts about certain topics around xenographics.


Evolving Floorplans – by Joel Simon

Joel Simon is the genius behind an experimental project exploring optimized school blueprints. Joel used graph-contraction and ant-colony pathing algorithms as growth processes, which could generate elementary school designs optimized for all kinds of characteristics: walking time, hallway usage, outdoor views, and escape routes just to name a few.

Two generated designs, minimizing the traffic flow (left) as well as escape routes (right) [original]
Other designs tried to maximize the number of windows, resulting in seemingly random open courtyards [original]

The original floor plan [original]
Definitely check out the original write-up if you are interested in the details behind the generation process! Or have a look at some of Joel’s other projects.

Become a data-driven Sommelier by text mining wine reviews

Aleszu Bajak at Storybench.org published a great demonstration of the power of text mining. He used the R tidytext package to analyse 150,000 wine reviews which Zach Thoutt had scraped from Wine Enthusiast in November of 2017.

Aleszu started his analysis on only the French wines, with a simple word count per region:

[original blog]
Next, he applied TF-IDF to surface the words that are most characteristic for specific French wine regions — words used often in combination with that specific region, but not in relation to other regions.

[original blog]
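
To make the TF-IDF step more concrete, here is a small sketch in plain Python. It is not Aleszu's script, which uses R and tidytext. It uses a common textbook formulation (term frequency times the natural log of the inverse document frequency) and treats all words of one region as a single document. The region names and words below are invented toy data, not results from the wine reviews.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF scores.

    docs: dict mapping a document name to its list of words.
    Returns a dict mapping each name to a {word: score} dict.
    """
    n_docs = len(docs)
    # In how many documents does each word occur at least once?
    doc_freq = Counter()
    for words in docs.values():
        doc_freq.update(set(words))

    scores = {}
    for name, words in docs.items():
        counts = Counter(words)
        total = len(words)
        scores[name] = {
            # tf: share of the document taken up by the word;
            # idf: log of (number of documents / documents containing the word).
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores

# Toy data: placeholder regions and words, for illustration only.
toy_reviews = {
    "region_a": ["fruit", "tannins", "cherry", "cherry"],
    "region_b": ["fruit", "crisp", "apple"],
    "region_c": ["fruit", "tannins", "spice"],
}
scores = tf_idf(toy_reviews)
top_a = max(scores["region_a"], key=scores["region_a"].get)
```

A word that appears in every region ("fruit" in this toy data) gets a score of zero, while a word concentrated in a single region floats to the top for that region.
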
The data also contained some price information, which Aleszu mapped onto France with ggplot2 and the maps package to show which French wine regions are generally more costly.

[original blog]
On the full dataset, Aleszu also demonstrated that there is a strong relationship between price and points, meaning that, in general, more expensive wines seem to get better reviews:

[original blog]
You can find the full script and more details in the original blog.