Tag: github

Awesome R Shiny Resources & Extensions

Awesome R Shiny Resources & Extensions

Rob Gilmore curates a github repo listing resources for working with Shiny, the R web framework and dashboarding tool.

Nan Xiao curates a second repository, listing awesome R packages offer that extensions to Shiny, like extended UI or server components.

They should be your go-to resources when looking for anything Shiny!

Shiny Resources

Extensions

Become a Data Science Professional

Become a Data Science Professional

Amit Ness gathered an impressive list of learning resources for becoming a data scientist.

It’s great to see that he shares them publicly on his github so that others may follow along.

But beware, this learning guideline covers a multi-year process.

Amit’s personal motto seems to be “Becoming better at data science every day“.

Completing the hyperlinked list below will take you several hundreds days at the least!

Learning Philosophy:

Index

Flow charts and process diagrams with Draw.io & VS Code

Flow charts and process diagrams with Draw.io & VS Code

A flowchart is a picture of the separate steps of a process in sequential order. It it super useful to organize and interpret business processes, IT systems, or computer algorithms.

Icon Process #369494 - Free Icons Library
Example of a very simple flowchart

I draw flowcharts and process diagrams all the time in my daily work as a data scientist!

Drawing out the business process is often a first step in any project, in order to really understand the underlying business workflow and problems. I feel doing so greatly facilitates opportunity finding.

Moreover, when designing a machine learning or data science architecture — with data coming from different sources, being manipulated using different workflows, and ending up in models feeding multiple business processes — drawing the whole she-bang out really helps me personally to keep overview.

There are licensed software programs such as Microsoft Visio that allow you to create flowcharts. But there are also numerous free applications that can help you draw up a flow chart.

It's easier than ever to create beautiful flowcharts from Data Visualizer -  Microsoft Tech Community
Via Microsoft Tech Community

Draw.io or app.diagrams.net is my favorite free online application.

How to create flow charts in draw.io - draw.io
Via Draw.io

It allows the easy creation of beatiful flowcharts and process diagrams.

Here’s another great static example:

How to customise the draw.io interface in Confluence Cloud : draw.io  Helpdesk

Moreover, Draw.io easily integrates with other suites, like google drive, one drive, et cetera.

Now, some fellow geek out there — Henning Dieterichs — actually built an unofficial draw.io plugin for Visual Studio Code.

I’ve recently transitioned to VS Code for all my Python programming, so I really welcome this cool feature. It integrates all the flow chart functionality of draw.io right there in your IDE. Incredible!

Here’s a demo:

Via github

Here’s another demo, but with a light theme, showing how easy it is to export your diagrams to a shareable png file.

Via github

Moreover, due to VS Code’s amazing “LiveShare” feature, you can even collaborate with colleagues and build a flow chart together, simulatenously:

via Github

Now there are many more features to this plugin. You can write and change the JavaScript code behind the objects to tailor it completely to your theme and tastes. Or if you prefer working with XML, you can just alter that code. Everything seems to work as a charm.

Have a look at the plugin yourself: https://github.com/hediet/vscode-drawio


Note:
I am in no way affiliated with Draw.io, Microsoft, Visual Studio Code, or the author of this plugin.
I just get enthusiastic : )

Repository of Production Machine Learning

Repository of Production Machine Learning

The Institute for Ethical Machine Learning compiled this amazing curated list of open source libraries that will help you deploy, monitor, version, scale, and secure your production machine learning.

🔍 Explaining predictions & models🔏 Privacy preserving ML📜 Model & data versioning
🏁 Model Training Orchestration💪 Model Serving and Monitoring🤖 Neural Architecture Search
📓 Reproducible Notebooks📊 Visualisation frameworks🔠 Industry-strength NLP
🧵 Data pipelines & ETL🏷️ Data Labelling🗞️ Data storage
📡 Functions as a service🗺️ Computation distribution📥 Model serialisation
🧮 Optimized calculation frameworks💸 Data Stream Processing🔴 Outlier and Anomaly Detection
🌀 Feature engineering🎁 Feature Stores⚔ Adversarial Robustness
💰 Commercial Platforms
Direct links to the sections of the Github repo

The Institute for Ethical Machine Learning is a think-tank that brings together with technology leaders, policymakers & academics to develop standards for ML.

#100DaysOfCode: Machine Learning & Data Visualization

#100DaysOfCode: Machine Learning & Data Visualization

2018 seemed to be the year of challenges going viral on the web. Most of them were plain stupid and/or dangerous. However, one viral challenge I did like: #100DaysOfCode

1. Code minimum an hour every day for the next 100 days.

2. Tweet your progress every day with the #100DaysOfCode hashtag.

3. Each day, reach out to at least two people on Twitter who are also doing the challenge

100 Days of Code rulebook

Many (aspiring) programming professionals competed in this challenge, sharing their learning journeys in domains from web development, machine learning, or data visualization.

With this blog, I wanted to share two of those learning journeys that stood out for me.

Machine learning

First, there’s Avik Jain’s 100 days of Machine Learning code repository on Github. Avik’s repository contains all learning activities he followed during the 53 days of programming he completed. Some of Avik’s entries really stood out, and I particularly liked his educational infographics:

Just look at the wonderful design and visual aids on this decision tree for dummies infographic, pseudocode and all:

Day 23: Decision trees for dummies. This just looks fabulous right?!

Apart from the infographics, Avik also links to many very well produced tutorials that helped him improve his machine learning skills. Such as the free Python for Data Science Handbook Avik worked through, or this Youtube tutorial on deep learning in Python with Tensorflow and Keras:

Although Avik didn’t seem to have completed the full 100 days, many others did.

Data visualization

I have blogged about Hannah Yan Han‘s 100 days of code project before, but she definately deserves another mention here. Her 100 days revolved around data science, data visualization, and storytelling using both R and Python. You can find her #100DaysOfCode Medium page here, and her associated Github repository here.

For example, one day Hannah explored where instant noodles come from, how they are served, and whether people like them or not.

A different day she would examine which sports are the thoughest:

Or how scientific researchers migrate across the globe:

Hannah used many different plot types in those 100 days. Also some lesser known ones, like these upset plots on TED talk data:

Heck, she even made her own R package to generate Mondriaan-like paintings on one of the days:

What I found so great about Hannah’s project is that she picked a novel dataset every couple of days. Moreover, she used a extremely large variety of different visualization formats. All visuals were equally beautiful, but Hannah made sure to pick the right one for the purpose she was trying to serve. If you are interested in data visualization, you seriously should check out Hannah’s 100DaysOfCode Medium page.