Tag: resources

10 Guidelines to Better Table Design

10 Guidelines to Better Table Design

Jon Schwabisch recently proposed ten guidelines for better table design.

Next to the academic paper, Jon shared his recommendations in a Twitter thread.

Let me summarize them for you:

  • Right-align your numbers
  • Left-align your texts
  • Use decimals appropriately (one or two is often enough)
  • Display units (e.g., $, %) sparsely (e.g., only on first row)
  • Highlight outliers
  • Highlight column headers
  • Use subtle highlights and dividers
  • Use white space between rows and columns
  • Use white space (or dividers) to highlight groups
  • Use visualizations for large tables
Afbeelding
Highlights in a table. Via twitter.com/jschwabish/status/1290324966190338049/photo/2
Afbeelding
Visualizations in a table. Via twitter.com/jschwabish/status/1290325409570197509/photo/3
Afbeelding
Example of a well-organized table. Via twitter.com/jschwabish/status/1290325663543627784/photo/2
The 10 Fundamental Concepts of JavaScript

The 10 Fundamental Concepts of JavaScript

Another pearl of a resource on Twitter is this thread by Madison on 10 of fundamentalal concepts of Javascript — and programming in general for that matter.

For your convience, I copied the links below. Just click them to browse to the resource and learn more about the concept

Click to learn more about each concept

  1. Variables & Scoping
  2. Data types
  3. Objects, Funtions & Arrays
  4. Document Object Model (DOM)
  5. Prototypes & this.
  6. Events
  7. Flow Control (specifically, for-loops)
  8. Security & (web) Accesibility
  9. Good coding practices (to which I’ve linked before)
  10. Async

This 10-step list was compiled as apart of this interesting podcast, which I recommend you listen to as well.

Want to learn more?

According to many, this is the best book to continue learning more about JavaScript.

There’s a (now classic) conference talk that comes with this book, which I can also recommend you watch:

treevis.net – A Visual Bibliography of Tree Visualizations

treevis.net – A Visual Bibliography of Tree Visualizations

Last week I cohosted a professional learning course on data visualization at JADS. My fellow host was prof. Jack van Wijk, and together we organized an amazing workshop and poster event. Jack gave two lectures on data visualization theory and resources, and mentioned among others treevis.net, a resource I was unfamiliar with up until then.

treevis.net is a lot like the dataviz project in the sense that it is an extensive overview of different types of data visualizations. treevis is unique, however, in the sense that it is focused on specifically visualizations of hierarchical data: multi-level or nested data structures.

Hans-Jörg Schulz — professor of Computer Science at Aarhus University in Denmark — maintains the treevis repo. At the moment of writing, he has compiled over 300 different types of hierachical data visualizations and displays them on this website.

As an added bonus, the repo is interactive as there are several ways to filter and look for the visualization type that best fits your data and needs.

Most resources come with added links to the original authors and the original papers they were first published in, so this is truly a great resources for those interested in doing a deep dive into data visualization. Do have a look yourself!

Anomaly Detection Resources

Anomaly Detection Resources

Carnegie Mellon PhD student Yue Zhao collects this great Github repository of anomaly detection resources: https://github.com/yzhao062/anomaly-detection-resources

The repository consists of tools for multiple languages (R, Python, Matlab, Java) and resources in the form of:

  1. Books & Academic Papers
  2. Online Courses and Videos
  3. Outlier Datasets
  4. Algorithms and Applications
  5. Open-source and Commercial Libraries/Toolkits
  6. Key Conferences & Journals

Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify outlying objects that are deviant from the general data distribution. Outlier detection has been proven critical in many fields, such as credit card fraud analytics, network intrusion detection, and mechanical unit defect detection.

https://github.com/yzhao062/anomaly-detection-resources

Quick Access — Table of Contents

Getting started with Python in Visual Studio Code

Getting started with Python in Visual Studio Code

After several years of proscrastinating, the inevitable finally happened: Three months ago, I committed to learning Python!

I must say that getting started was not easy. One afternoon three months ago, I sat down, motivated to get started. Obviously, the first step was to download and install Python as well as something to write actual Python code. Coming from R, I had expected to be coding in a handy IDE within an hour or so. Oh boy, what was I wrong.

Apparently, there were already a couple of versions of Python present on my computer. And apparently, they were in grave conflict. I had one for the R reticulate package; one had come with Anaconda; another one from messing around with Tensorflow; and some more even. I was getting all kinds of error, warning, and conflict messages already, only 10 minutes in. Nothing I couldn’t handle in the end, but my good spirits had dropped slightly.

With Python installed, the obvious next step was to find the RStudio among the Python IDE’s and get working in that new environment. As an rational consumer, I went online to read about what people recommend as a good IDE. PyCharm seemed to be quite fancy for Data Science. However, what’s this Spyder alternative other people keep talking about? Come again, there are also Rodeo, Thonny, PyDev, and Wing? What about those then? A whole other group of Pythonista’s said that, as I work in Data Science, I should get Anaconda and work solely in Jupyter Notebooks! Okay…? But I want to learn Python to broaden my skills and do more regular software development as well. Maybe I start simple, in a (code) editor? However, here we have Atom, Sublime Text, Vim, and Eclipse? All these decisions. And I personally really dislike making regrettable decisions or committing to something suboptimal. This was already taking much, much longer than the few hours I had planned for setup.

This whole process demotivated so much that I reverted back to programming in R and RStudio the week after. However, I had not given up. Over the course of the week, I brought the selection back to Anaconda Jupyter Notebooks, PyCharm, and Atom, and I was ready to pick one. But wait… What’s this Visual Studio Code (VSC) thing by Microsoft. This looks fancy. And it’s still being developed and expanded. I had already been working in Visual Studio learning C++, and my experiences had been good so far. Moreover, Microsoft seems a reliable software development company, they must be able to build a good IDE? I decided to do one last deepdive.

The more I read about VSC and its features for Python, the more excited I got. Hey, VSC’s Python extension automatically detects Python interpreters, so it solves my conflicts-problem. Linting you say? Never heard of it, but I’ll have it. Okay, able to run notebooks, nice! Easy debugging, testing, and handy snippets… Okay! Machine learning-based IntelliSense autocompletes your Python code – that sounds like something I’d like. A shit-ton of extensions? Yes please! Multi-language support – even tools for R programming? Say no more! I’ll take it. I’ll take it all!

Linting messages in the editor and the Problems panel
Linting in VSC provides code suggestions

My goods friends at Microsoft were not done yet though. To top it all of, they have documented everything so well. It’s super easy to get started! There are numerous ordered pages dedicated to helping you set up and discover your new Python environment in VSC:

The Microsoft VSC pages also link to some more specific resources:

  • Editing Python in VS Code: Learn more about how to take advantage of VS Code’s autocomplete and IntelliSense support for Python, including how to customize their behvior… or just turn them off.
  • Linting Python: Linting is the process of running a program that will analyse code for potential errors. Learn about the different forms of linting support VS Code provides for Python and how to set it up.
  • Debugging Python: Debugging is the process of identifying and removing errors from a computer program. This article covers how to initialize and configure debugging for Python with VS Code, how to set and validate breakpoints, attach a local script, perform debugging for different app types or on a remote computer, and some basic troubleshooting.
  • Unit testing Python: Covers some background explaining what unit testing means, an example walkthrough, enabling a test framework, creating and running your tests, debugging tests, and test configuration settings.
IntelliSense and autocomplete for Python code
Python IntelliSense in VSC makes real-time code autocomplete suggestions

My Own Python Journey

So three months in I am completely blown away at how easy, fun, and versatile the language is. Nearly anything is possible, most of the language is intuitive and straightforward, and there’s a package for anything you can think of. Although I have spent many hours, I am very happy with the results. I did not get this far, this quickly, in any other language. Let me share some of the stuff I’ve done the past three months.

I’ve mainly been building stuff. Some things from scratch, others by tweaking and recycling other people’s code. In my opinion, reusing other people’s code is not necessarily bad, as long as you understand what the code does. Moreover, I’ve combed through lists and lists of build-it-yourself projects to get inspiration for projects and used stuff from my daily work and personal life as further reasons to code. I ended up building:

  • my own Twitter bot, based off of this blog, which I’ll cover in a blog soon
  • my own email bot, based off of this blog, which I’ll cover in a blog soon. It sends me cheerful pictures and updates
  • my own version of this Google images scraper
  • my own version of this Glassdoor scraper
  • a probabilistic event occurance simulator, which I’ll share in a blog post soon
  • a tournament schedule generator that takes in participants, teams (sizes), timeslots, etc and outputs when and where teams needs to play each other
  • a company simulator that takes in growth patterns and generates realistic HR data, which I plan to use in one of my next courses
  • a tiny neural network class, following this Youtube tutorial
  • solutions to the first 31 problems of Project Euler, which I highly recommend you try to solve yourself!
  • solutions to the first dozen problems posed in Automate the Boring Stuff with Python. This book and online tutorial forces you to get your hands dirty right from the start. Simply amazing content and the learning curve is precisely good

I’ve also watched and read a lot:

Although it is no longer maintained, you might find some more, interesting links on my Python resources page or here, for those transitioning from R. If only the links to the more up-to-date resources pages. Anyway, hope this current blog helps you on your Python journey or to get Python and Visual Studio Code working on your computer. Please feel free to share any of the stories, struggles, or successes you experience!

Helpful resources for A/B testing

Helpful resources for A/B testing

Brandon Rohrer — (former) data scientist at Microsoft, iRobot, and Facebook — asked his network on Twitter and LinkedIn to share their favorite resources on A/B testing. It produced a nice list, which I summarized below.

The order is somewhat arbitrary, and somewhat based on my personal appreciation of the resources.

Cover image via Optimizely