Tag: business

Flow charts and process diagrams with Draw.io & VS Code

A flowchart is a picture of the separate steps of a process in sequential order. It is super useful for organizing and interpreting business processes, IT systems, or computer algorithms.

Example of a very simple flowchart

I draw flowcharts and process diagrams all the time in my daily work as a data scientist!

Drawing out the business process is often a first step in any project, in order to really understand the underlying business workflow and problems. I feel doing so greatly facilitates opportunity finding.

Moreover, when designing a machine learning or data science architecture (with data coming from different sources, being manipulated using different workflows, and ending up in models feeding multiple business processes), drawing the whole shebang out really helps me keep an overview.

There are licensed software programs such as Microsoft Visio that allow you to create flowcharts. But there are also numerous free applications that can help you draw up a flow chart.

It's easier than ever to create beautiful flowcharts from Data Visualizer - Microsoft Tech Community — Via Microsoft Tech Community

Draw.io (also available at app.diagrams.net) is my favorite free online application.

How to create flow charts in draw.io - draw.io — Via Draw.io

It allows the easy creation of beautiful flowcharts and process diagrams.

Here’s another great static example:

How to customise the draw.io interface in Confluence Cloud : draw.io Helpdesk

Moreover, Draw.io easily integrates with other suites, like Google Drive, OneDrive, et cetera.

Now, some fellow geek out there — Henning Dieterichs — actually built an unofficial draw.io plugin for Visual Studio Code.

I’ve recently transitioned to VS Code for all my Python programming, so I really welcome this cool feature. It integrates all the flow chart functionality of draw.io right there in your IDE. Incredible!

Here’s a demo:

Here’s another demo, but with a light theme, showing how easy it is to export your diagrams to a shareable PNG file.

Moreover, thanks to VS Code’s amazing “Live Share” feature, you can even collaborate with colleagues and build a flow chart together, simultaneously:

Now there are many more features to this plugin. You can write and change the JavaScript code behind the objects to tailor them completely to your theme and tastes. Or if you prefer working with XML, you can just alter that code. Everything seems to work like a charm.
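To give a feel for the XML behind a diagram, here is a minimal Python sketch that generates a tiny flowchart. It assumes the uncompressed mxGraphModel layout that draw.io uses when it stores diagrams as plain XML. The function name `build_flowchart`, the cell ids, and the style strings are illustrative choices of mine, not something taken from the plugin.

```python
import xml.etree.ElementTree as ET


def build_flowchart(steps):
    """Return draw.io-style XML with one box per step, linked top to bottom.

    Assumption: the uncompressed <mxfile>/<diagram>/<mxGraphModel> layout.
    """
    mxfile = ET.Element("mxfile")
    diagram = ET.SubElement(mxfile, "diagram", {"name": "Page-1"})
    model = ET.SubElement(diagram, "mxGraphModel")
    root = ET.SubElement(model, "root")

    # Cells "0" and "1" are the root cell and the default layer.
    ET.SubElement(root, "mxCell", {"id": "0"})
    ET.SubElement(root, "mxCell", {"id": "1", "parent": "0"})

    previous_id = None
    for index, label in enumerate(steps):
        node_id = f"step-{index}"
        node = ET.SubElement(root, "mxCell", {
            "id": node_id,
            "value": label,
            "style": "rounded=1;whiteSpace=wrap;html=1;",  # illustrative style
            "vertex": "1",
            "parent": "1",
        })
        ET.SubElement(node, "mxGeometry", {
            "x": "40", "y": str(40 + index * 100),
            "width": "160", "height": "60", "as": "geometry",
        })
        if previous_id is not None:
            # Connect the previous box to this one with an arrow.
            edge = ET.SubElement(root, "mxCell", {
                "id": f"edge-{index}",
                "style": "endArrow=classic;html=1;",  # illustrative style
                "edge": "1",
                "parent": "1",
                "source": previous_id,
                "target": node_id,
            })
            ET.SubElement(edge, "mxGeometry", {"relative": "1", "as": "geometry"})
        previous_id = node_id

    return ET.tostring(mxfile, encoding="unicode")


if __name__ == "__main__":
    print(build_flowchart(["Collect data", "Train model", "Deploy"]))
```

Saving the printed string to a file with a .drawio extension should let you open and tweak it in the editor, though I'd treat that as something to try out rather than a guarantee.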

Have a look at the plugin yourself: https://github.com/hediet/vscode-drawio

Note:
I am in no way affiliated with Draw.io, Microsoft, Visual Studio Code, or the author of this plugin.
I just get enthusiastic : )

Practical Tools for Human-Centered Design

Google’s guidebook to human-centered AI design refers to the Design Kit, which contains numerous tools to help you design products with user experience in mind.

The Design Kit website contains many practical methods, tools, case studies, and many more resources to help you in the design process.

Human-centered design is a practical, repeatable approach to arriving at innovative solutions. Think of these Methods as a step-by-step guide to unleashing your creativity, putting the people you serve at the center of your design process to come up with new answers to difficult problems.

The Design Kit methods section provides some seriously handy guidelines to help you design your products with the customer in mind. For each method, it offers a step-by-step process guideline, neat worksheets to record the information you collect in the process, and a video explanation.

Example method screenshot from designkit.org/methods/frame-your-design-challenge

Google’s Guidebook for AI Product Development

I came across another great set of curated resources by one of the teams at Google:

The People + AI Guidebook.

The People + AI Guidebook was written to help user experience (UX) professionals and product managers follow a human-centered approach to AI.
The Guidebook’s recommendations are based on data and insights from over a hundred individuals across Google product teams, industry experts, and academic research.
These six chapters follow the product development flow, and each one has a related worksheet to help turn guidance into action.

The People + AI Guidebook is one of the products of the PAIR (People + AI Research) team.

Here are the direct links to the six guidebook chapters:

You can find links to the related worksheets here.

Determine optimal sample sizes for business value in A/B testing, by Chris Said

A/B testing is a method of comparing two versions of something against each other to determine which is better. A/B tests are often mentioned in e-commerce contexts, where the things we are comparing are web pages.

Business leaders and data scientists alike face a difficult trade-off when running A/B tests: How big should the A/B test be? In other words, after collecting how many data points, or running for how many days, should we decide whether A or B is the better way to go?

This is a trade-off because the sample size of an A/B test determines its statistical power, while collecting more data takes more time. Statistical power, in simple terms, is the probability that an A/B test shows an effect if there really is one. In general, the more data you collect, the higher the odds of finding the real effect and making the right decision.

By default, researchers often aim for 80% power, with a 5% significance cutoff. But is this general guideline really optimal for the tradeoff between costs and benefits in your specific business context? Chris thinks not.
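To make the 80% power / 5% significance default concrete, here is a minimal Python sketch of the textbook normal-approximation formula for the sample size per group when comparing two conversion rates. This is the conventional calculation, not Chris's cost-benefit optimization. The function name and the example conversion rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate users needed per group for a two-sided two-proportion test.

    Uses the standard normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance cutoff
    z_power = NormalDist().inv_cdf(power)          # desired power
    variance_sum = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    return ceil((z_alpha + z_power) ** 2 * variance_sum / effect ** 2)


if __name__ == "__main__":
    # Hypothetical example: 10% baseline conversion, hoping to detect 12%.
    print(sample_size_per_group(0.10, 0.12))
    # The same effect at 90% power needs noticeably more users.
    print(sample_size_per_group(0.10, 0.12, power=0.90))
```

Note how quickly the required sample grows when the effect you want to detect shrinks: halving the difference roughly quadruples the sample size. That is exactly why the duration question matters for companies with limited traffic.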

Chris Said wrote a great three-part blog series in which he explains how you can mathematically determine the optimal duration of A/B testing in your own company setting:

Part I: General Overview. Starts with a mostly non-technical overview and ends with a section called “Three lessons for practitioners”.
Part II: Expected lift. A more technical section that quantifies the benefits of experimentation as a function of sample size.
Part III: Aggregate time-discounted lift. A more technical section that quantifies the costs of experimentation as a function of sample size. It then combines costs and benefits into a closed-form expression that can be optimized. Ends with an FAQ.
Chris Said (via)

Moreover, Chris provides three practical pieces of advice that underline that 80% statistical power is not always the best option:

You should run “underpowered” experiments if you have a very high discount rate
You should run “underpowered” experiments if you have a small user base
Nevertheless, it’s far better to run your experiment too long than too short

Simulations show that for Chris’ hypothetical company and A/B test, 38 days would be the optimal period of time to gather data.
via chris-said.io/2020/01/10/optimizing-sample-sizes-in-ab-testing-part-I/

Chris ran all his simulations in Python and shared the notebooks.

Free Springer Books during COVID19

Update: Unfortunately, Springer removed the free access to its books.

Book publisher Springer just released over 400 book titles that can be downloaded free of charge following the coronavirus outbreak.

Here’s the full overview: https://link.springer.com/search?facet-content-type=%22Book%22&package=mat-covid19_textbooks&facet-language=%22En%22&sortOrder=newestFirst&showAll=true

Most of these books will normally set you back about $50 to $150, so this is a great deal!

There are many titles on computer science, programming, business, psychology, and here are some specific titles that might interest my readership:

Note that I only got to page 8 of 21, so there are many more free interesting titles out there!


A/B testing and Statistics at Etsy, by Emily Robinson

Generating numbers is easy; generating numbers you should trust is hard!

Emily Robinson is a data scientist at Etsy, an e-commerce website for handmade and vintage products. In the #rstats community, Emily is nearly as famous as her brother David Robinson, whom we know from the tidytext R-package.

Like any large tech company, Etsy relies heavily on statistics to improve their way of doing business. In their case, data from real-life experiments provide the business intelligence that allows effective decision-making. For instance, they experiment with the layout of their buttons, with the text shown near products, or with the suggestions made after a search query. To detect whether such changes have (ever so) small effects on Etsy’s KPIs (e.g., conversion), data scientists such as Emily rely on traditional A/B testing.

In a 40-minute presentation, Emily explains how statistical issues such as skewed distributions, outliers, and power are dealt with at Etsy, among other things by using bootstrapping and simulations. Moreover, 30 minutes in, Emily shares her lessons when it comes to working with (less stats-savvy) business stakeholders. For instance, how to help identify and transform business questions into data questions and back into business solutions, or how to deal with the desire to peek at the results of experiments early.
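As a rough illustration of why bootstrapping helps with skewed metrics, here is a minimal Python sketch of a percentile bootstrap confidence interval for the difference in means between two groups. This is a generic textbook version under my own assumptions, not Etsy's actual code. The function name and the simulated log-normal "order values" are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_diff_ci(control, variant, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap CI for mean(variant) - mean(control)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resampled_control = rng.choices(control, k=len(control))
        resampled_variant = rng.choices(variant, k=len(variant))
        diffs.append(fmean(resampled_variant) - fmean(resampled_control))
    diffs.sort()
    lower_index = int((1 - confidence) / 2 * n_resamples)
    upper_index = int((1 + confidence) / 2 * n_resamples) - 1
    return diffs[lower_index], diffs[upper_index]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical skewed metric: log-normally distributed order values.
    control = [rng.lognormvariate(3.0, 1.0) for _ in range(500)]
    variant = [rng.lognormvariate(3.1, 1.0) for _ in range(500)]
    lower, upper = bootstrap_diff_ci(control, variant)
    print(f"95% CI for the difference in means: [{lower:.2f}, {upper:.2f}]")
```

Because the interval is read off the resampled differences directly, it does not lean on the metric being normally distributed, which is handy when a few very large orders dominate the mean.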

Overall, I can recommend the presentation below, the slides of which you can find on Emily’s GitHub.