
Flow charts and process diagrams with Draw.io & VS Code

A flowchart is a picture of the separate steps of a process in sequential order. It is super useful for organizing and interpreting business processes, IT systems, or computer algorithms.

Example of a very simple flowchart

I draw flowcharts and process diagrams all the time in my daily work as a data scientist!

Drawing out the business process is often a first step in any project, in order to really understand the underlying business workflow and problems. I find that doing so makes it much easier to spot opportunities.

Moreover, when designing a machine learning or data science architecture (with data coming from different sources, being manipulated using different workflows, and ending up in models feeding multiple business processes), drawing the whole shebang out really helps me keep an overview.

There are licensed software programs such as Microsoft Visio that allow you to create flowcharts. But there are also numerous free applications that can help you draw up a flow chart.

It's easier than ever to create beautiful flowcharts from Data Visualizer - Microsoft Tech Community — Via Microsoft Tech Community

Draw.io or app.diagrams.net is my favorite free online application.

How to create flow charts in draw.io - draw.io — Via Draw.io

It makes it easy to create beautiful flowcharts and process diagrams.

Here’s another great static example:

How to customise the draw.io interface in Confluence Cloud : draw.io Helpdesk

Moreover, Draw.io integrates easily with other suites, like Google Drive, OneDrive, et cetera.

Now, some fellow geek out there — Henning Dieterichs — actually built an unofficial draw.io plugin for Visual Studio Code.

I’ve recently transitioned to VS Code for all my Python programming, so I really welcome this cool feature. It integrates all the flow chart functionality of draw.io right there in your IDE. Incredible!

Here’s a demo:

Here’s another demo, but with a light theme, showing how easy it is to export your diagrams to a shareable PNG file.

Moreover, thanks to VS Code’s amazing “Live Share” feature, you can even collaborate with colleagues and build a flow chart together, simultaneously:

There are many more features to this plugin. You can write and change the JavaScript code behind the objects to tailor it completely to your theme and tastes. Or if you prefer working with XML, you can just alter that code. Everything seems to work like a charm.
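To give an idea of what working with the XML could look like, here is a minimal sketch that builds a three-step flowchart with Python's standard library. It assumes the uncompressed `mxfile` / `mxGraphModel` layout that draw.io can save and open. The function name, cell ids, labels, style string, and coordinates are all made up for this example, so treat it as a starting point rather than a reference for the file format.

```python
import xml.etree.ElementTree as ET


def build_flowchart(steps):
    """Return an <mxfile> element with one box per step, joined by arrows."""
    mxfile = ET.Element("mxfile")
    diagram = ET.SubElement(mxfile, "diagram", name="Page-1")
    model = ET.SubElement(diagram, "mxGraphModel")
    root = ET.SubElement(model, "root")

    # Two structural cells: the root cell ("0") and the default layer ("1").
    ET.SubElement(root, "mxCell", id="0")
    ET.SubElement(root, "mxCell", id="1", parent="0")

    previous_id = None
    for index, label in enumerate(steps):
        cell_id = f"step{index}"
        box = ET.SubElement(
            root, "mxCell",
            id=cell_id, value=label, vertex="1", parent="1",
            style="rounded=1;whiteSpace=wrap;html=1;",  # assumed style string
        )
        # Stack the boxes vertically, 100 px apart (arbitrary layout choice).
        geometry = ET.SubElement(
            box, "mxGeometry",
            x="40", y=str(40 + 100 * index), width="160", height="60",
        )
        geometry.set("as", "geometry")  # "as" is a Python keyword, so use set()

        if previous_id is not None:
            # Connect the previous box to this one with an edge cell.
            arrow = ET.SubElement(
                root, "mxCell",
                id=f"edge{index}", edge="1", parent="1",
                source=previous_id, target=cell_id,
            )
            edge_geometry = ET.SubElement(arrow, "mxGeometry", relative="1")
            edge_geometry.set("as", "geometry")
        previous_id = cell_id
    return mxfile


flowchart = build_flowchart(["Collect data", "Train model", "Deploy"])
xml_text = ET.tostring(flowchart, encoding="unicode")

# To try it in the editor, save the text under a hypothetical file name:
# with open("pipeline.drawio", "w", encoding="utf-8") as handle:
#     handle.write(xml_text)
```

Opening the saved file in VS Code with the plugin installed should show three stacked boxes connected by arrows, which you can then restyle by hand.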

Have a look at the plugin yourself: https://github.com/hediet/vscode-drawio

Note:
I am in no way affiliated with Draw.io, Microsoft, Visual Studio Code, or the author of this plugin.
I just get enthusiastic : )

All buildings in the Netherlands, color coded by year of construction

Could you guess that you are looking at Amsterdam?

Maybe you spotted the canals?

Bert Spaan color-coded every building in the Netherlands according to its year of construction and visualized the resulting map of nearly 10 million buildings in a JavaScript Leaflet webpage.

It resulted in this wonderful map, which my screenshots don’t do justice. So have a look yourself!

https://github.com/waagsociety/buildings/tree/gh-pages/high-res

Select the right data visualization or chart type

I found this amazing website data-to-viz.com that helps you select the right data visualization or chart type for your data.

Got numeric data? Two variables? No inherent order? Just a few data points? Pick a boxplot, histogram, or scatterplot!

Categorical data? There’s a separate decision tree for those!

There’s a whole world of possible chart types you can choose from. The website explains how they work and when to use which type.

The website also warns you about some common mistakes in data visualization.

The cover image is a poster you can buy to support the authors of data-to-viz.com!

10 Tips for Effective Dashboard Design by Deloitte

My colleague prof. Jack van Wijk pointed me towards these great guidelines by Deloitte on how to design an effective dashboard.

Click to access deloitte-nl-data-analytics-10-commandments-effective-dashboarding.pdf

Some of these rules apply to data visualization more generally. Still, the Deloitte 10 commandments form a good checklist when designing a dashboard.

Here’s my interpretation of the 10 rules:

Know your message or goal
Choose the chart that conveys your message best
Use a grid to bring order to your dashboard
Use color only to highlight and draw attention
Remove unnecessary elements
Avoid information overload
Design for ease of use
Text is as important as charts
Design for multiple devices (desktop, tablet, mobile, …)
Recycle good designs (by others)

In terms of recycling the good work by others operating in the data visualization field, check out:

I just love how Deloitte uses example visualizations to help convey what makes a good (dashboard) chart:

10 Guidelines to Better Table Design

Jon Schwabish recently proposed ten guidelines for better table design.

Next to the academic paper, Jon shared his recommendations in a Twitter thread.

I recently published "Ten Guidelines for Better Tables" in the Journal of Benefit Cost Analysis (@benefitcost) on ways to improve your data tables.

Here's a thread summarizing the 10 guidelines.

Full paper is here: https://t.co/VSGYnfg7iP pic.twitter.com/W6qbsktioL
— Jon Schwabish (@jschwabish) August 3, 2020

Let me summarize them for you:

Right-align your numbers
Left-align your texts
Use decimals appropriately (one or two is often enough)
Display units (e.g., $, %) sparingly (e.g., only on first row)
Highlight outliers
Highlight column headers
Use subtle highlights and dividers
Use white space between rows and columns
Use white space (or dividers) to highlight groups
Use visualizations for large tables

Image: Highlights in a table. Via twitter.com/jschwabish/status/1290324966190338049/photo/2
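A few of these guidelines are easy to try out in a plain-text table. The sketch below is my own illustration, not code from Jon's paper. It uses only Python's standard library and made-up numbers to right-align numbers, left-align text, stick to one decimal, show the unit on the first row only, and add a subtle divider under the header. The function name `format_table` and its parameters are hypothetical.

```python
def format_table(headers, rows, decimals=1, unit="%"):
    """Render (label, number) rows as an aligned plain-text table."""
    # Use the same number of decimals for every value.
    numbers = [f"{value:.{decimals}f}" for _, value in rows]
    # Show the unit on the first row only; pad the other rows with spaces
    # so the decimal points still line up.
    numbers = [
        text + (unit if i == 0 else " " * len(unit))
        for i, text in enumerate(numbers)
    ]
    label_width = max([len(headers[0])] + [len(label) for label, _ in rows])
    number_width = max([len(headers[1])] + [len(text) for text in numbers])

    lines = [
        # "<" left-aligns the text column, ">" right-aligns the number column.
        f"{headers[0]:<{label_width}}  {headers[1]:>{number_width}}",
        # A subtle divider under the column headers.
        "-" * (label_width + 2 + number_width),
    ]
    for (label, _), text in zip(rows, numbers):
        lines.append(f"{label:<{label_width}}  {text:>{number_width}}")
    return "\n".join(lines)


# Hypothetical data, purely for illustration.
rows = [("Amsterdam", 12.34), ("Rotterdam", 8.0), ("Utrecht", 103.27)]
table = format_table(("City", "Growth"), rows)
print(table)
```

Because every number has the same number of decimals and is right-aligned, the decimal points end up in one column, which makes magnitudes easy to compare at a glance.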

How most statistical tests are linear models

Jonas Kristoffer Lindeløv wrote a great visual explanation of how the most common statistical tests (t-test, ANOVA, ANCOVA, etc.) are all linear models under the hood.

Jonas’ original blog uses R programming to visually show how the tests work, what the linear models look like, and how different approaches result in the same statistics.

George Ho later created a Python version of the same visual explanation.

If I had been taught statistics and methodology this way, I sure would have struggled less! Have a look yourself: https://lindeloev.github.io/tests-as-linear/
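To give a flavor of the idea, here is a small sketch of my own in plain Python. It is not taken from Jonas' or George's posts. It assumes the classic two-sample Student's t-test with equal variances and uses made-up measurements. The t statistic computed the textbook way equals the t statistic of the slope in a linear model y = b0 + b1 * x, where x is 0 for the first group and 1 for the second.

```python
import math
from statistics import mean, variance


def students_t(group_a, group_b):
    """Classic two-sample t statistic, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_b) - mean(group_a)) / standard_error


def slope_t(x, y):
    """t statistic of the slope in the least-squares fit y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx            # slope: difference between the group means
    b0 = y_bar - b1 * x_bar     # intercept: mean of the group coded as 0
    residual_ss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    standard_error = math.sqrt(residual_ss / (n - 2) / s_xx)
    return b1 / standard_error


# Made-up measurements for two groups.
group_a = [4.1, 5.0, 4.7, 5.3, 4.4]
group_b = [5.9, 6.2, 5.1, 6.8, 6.0, 5.7]

# Dummy-code group membership: 0 for group A, 1 for group B.
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

t_classic = students_t(group_a, group_b)
t_linear = slope_t(x, y)
print(t_classic, t_linear)  # the two values agree up to floating-point error
```

The slope b1 is simply the difference between the two group means, and the residual variance of the fit is the pooled variance, which is why both routes give the same statistic.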