Tag: graph

Visualize graph, diagrams, and proces flows with graphviz.it

Graphviz.it is a free online tool to create publication-ready diagrams in an interactive fashion. It uses

It uses graphviz-d3-renderer Bower module and adds editor and live preview of code. Try it on Graphviz fiddling website.

Here are some examples:

The github page hosts more details and you can even follow the development on twitter.

OriginLab’s Graph Gallery: A blast from the past

Continuing my recent line of posts on data visualization resources, I found another repository in my inbox: OriginLab’s GraphGallery!

If I’m being honest, I would personally advice you to look at the dataviz project instead, if you haven’t heard of that one yet.

However, OriginLab might win in terms of sentiment. It has this nostalgic look of the ’90s, and apparently people really used it during that time. Nevertheless, despite looking old, the repo seems to be quite extensive, with nearly 400 different types of data visualizations:

Quantity isn’t everything though, as some of the 400 entries are disgustingly horrible:

What I do like about this OriginLab repo is that it has an option to sort its contents using a random order. This really facilitates discovery of new pearls:

Thanks to Maarten Lambrechts for sharing this resource on twitter a while back!

This visualisation tool I've never heard of is used by "500.000 scientists and engineers" and has an amazing gallery of 388 different chart types and variations https://t.co/0N3Td6jhql pic.twitter.com/A7g4DeEGEv
— Maarten Lambrechts (@maartenzam) September 20, 2019

The Mental Game of Python, by Raymond Hettinger

YouTube recommended I’d watch this recorded presentation by Raymond Hettinger at PyBay2019 last October. Quite a long presentation for what I’d normally watch, but what an eye-openers it contains!

Raymond Hettinger is a Python core developer and in this video he presents 10 programming strategies in these 60 minutes, all using live examples. Some are quite obvious, but the presentation and examples make them very clear. Raymond presents some serious programming truths, and I think they’ll stick.

First, Raymond discusses chunking and aliasing. He brings up the theory that the human mind can only handle/remember 7 pieces of information at a time, give or take 2. Anything above proves to much cognitive load, and causes discomfort as well as errors. Hence, in a programming context, we need to make sure programmers can use all 7 to improve the code, rather than having to decypher what’s in front of them. In a programming context, we do so by modularizing and standardizing through functions, modules, and packages. Raymond uses the Python random module to hightlight the importance of chunking and modular code. This part was quite long, but still interesting.

For the next two strategies, Raymond quotes the Feinmann method of solving problems: “(1) write down a clear problem specification; (2) think very, very hard; (3) write down a solution”. Using the example of a tree walker, Raymond shows how the strategies of incremental development and solving simpler programs can help you build programs that solve complex problems. This part only lasts a couple of minutes but really underlines the immense value of these strategies.

Next, Raymond touches on the DRY principle: Don’t Repeat Yourself. But in a context I haven’t seen it in yet, object oriented programming [OOP], classes, and inherintance.

Raymond continues to build his arsenal of programming strategies in the next 10 minutes, where he argues that programmers should repeat tasks manually until patterns emerge, before they starting moving code into functions. Even though I might not fully agree with him here, he does have some fun examples of file conversion that speak in his case.

Lastly, Raymond uses the graph below to make the case that OOP is a graph traversal problem. According to Raymond, the Python ecosystem is so rich that there’s often no need to make new classes. You can simply look at the graph below. Look for the island you are currently on, check which island you need to get to, and just use the methods that are available, or write some new ones.

Raymond Hettinger: ” OOP is a graph traversal problem”

While there were several more strategies that Raymond wanted to discuss, he doesn’t make it to the end of his list of strategies as he spend to much time on the first, chunking bit. Super curious as to the rest? Contact Raymond on Twitter.

Overviews of Graph Classification and Network Clustering methods

Thanks to Sebastian Raschka I am able to share this great GitHub overview page of relevant graph classification techniques, and the scientific papers behind them. The overview divides the algorithms into four groups:

Moreover, the overview contains links to similar collections on community detection, classification/regression trees and gradient boosting papers with implementations.

As well as a link to relevant graph classification benchmark datasets.

"Awesome Graph Classification" — A collection of graph classification methods, covering embedding, deep learning, graph kernel, and factorization papers with reference implementations https://t.co/ugpL3xSvf1
— Sebastian Raschka (@rasbt) July 16, 2019

Creating plots with custom icons for data points

Data visualizations that make smart use of icons have a way of conveying information that sticks. Dataviz professionals like Moritz Stefaner know this and use the practice in their daily work.

A recent #tidytuesday entry by Georgios Karamanis demonstrates how easy it is to integrate visual icons in your data figures when you write code in R. You can simply store the URL location of an icon as a data column, and map it to an aesthetic using the ggplot2::geom_image function.

Via https://github.com/gkaramanis/tidytuesday/tree/master/week-21

Do have a closer look at Georgios’ github repository for week 21 of tidytuesday. You will probably have to alter the code a bit to get it to work. though!

For those who haven’t moved away from base R plotting functions yet, here’s a good StackOverflow item showing how to use icons in both base R and tidyverse.

Now that I think of it, the above probably uses the same methods that were used to make this amazing Game of Thrones map in R.

Learn from the Pros: How media companies visualize data

Past months, multiple companies shared their approaches to data visualization and their lessons learned.

Click the companies in the list below to jump to their respective section

The Financial Times
The Britisch Broadcast Corporation
The Economist
FiveThirtyEight

Financial Times

The Financial Times (FT) released a searchable database of the many data visualizations they produced over the years. Some lovely examples include:

Graphic showing what May needs to happen to get her deal over the line when MPs vote on Friday — Data visualization belonging to a recent Brexit piece by the FT, viahttps://www.ft.com/graphics

Dutch housing graphic — Searching the FT database for *European House Prices* via https://www.ft.com/graphics returns this map of the Netherlands.

BBC

The BBC released a free cookbook for data visualization using R programming. Here is the associated Medium post announcing the book.

The BBC data team developed an R package (bbplot) which makes the process of creating publication-ready graphics in their in-house style using R’s ggplot2 library a more reproducible process, as well as making it easier for people new to R to create graphics.

Apart from sharing several best practices related to data visualization, they walk you through the steps and R code to create graphs such as the below:

One of the graphs the BBC cookbook will help you create, via https://bbc.github.io/rcookbook/

Ultimately, you should be able to reproduce these BBC style graphics, viahttps://medium.com/bbc-visual-and-data-journalism/how-the-bbc-visual-and-data-journalism-team-works-with-graphics-in-r-ed0b35693535

Economist

The data team at the Economist also felt a need to share their lessons learned via Medium. They show some of their most misleading, confusing, and failing graphics of the past years, and share the following mistakes and their remedies:

Truncating the scale (image #1 below)
Forcing a relationship by cherry-picking scales
Choosing the wrong visualisation method (image #2 below)
Taking the “mind-stretch” a little too far (image #3 below)
Confusing use of colour (image #4 below)
Including too much detail
Lots of data, not enough space

Moreover, they share the data behind these failing and repaired data visualizations:

Via https://medium.economist.com/mistakes-weve-drawn-a-few-8cdd8a42d368

FiveThirtyEight

I could not resist including this (older) overview of the 52 best charts FiveThirtyEight claimed they made.

All 538’s data visualizations are just stunningly beautiful and often very
ingenious, using new chart formats to display complex patterns. Moreover, the range of topics they cover is huge. Anything ranging from their traditional background — politics — to great cover stories on sumo wrestling and pricy wine.