Tag: neuralnetwork

Python resources (free courses, books, & cheat sheets)

Find more comprehensive Python repositories:
Vinta’s awesome Python Github repository, the easy Python docs, the Python Wiki Beginners Guide, or CourseDuck’s overview of free Python courses!

My list of Python resources is still quite short so if you have additions, please comment below or contact me! There are separate overviews for Data Science, Machine Learning, & Statistics resources in general, and for R resources and SQL resources in specific.

LAST UPDATED: 11-11-2018

Cheat sheets:

Courses:

Books:

Data Science, Machine Learning, & Statistics resources (free courses, books, tutorials, & cheat sheets)

Welcome to my repository of data science, machine learning, and statistics resources. Software-specific material has to a large extent been listed under their respective overviews: R Resources & Python Resources. I also host a list of SQL Resources and datasets to practice programming. If you have any additions, please comment or contact me!

LAST UPDATED: 21-05-2018

Courses:

Video:

Books:

Sentiment Lexicons:

Cheatsheets:

Other:

Google Fonts – huge collection of text fonts
Checkmycolours.com – check whether your colours have enough contrast
Vischeck.com – check whether your images are colorblind-friendly
Coblis – Color Blind Simulation
Color Oracle – color blind simulation
Chrome color enhancer – customizable color filter for website browsing

Predict the Sentimental Response to your Facebook Posts

Max Woolf writes machine learning blogs on his personal blog, minimaxir, and posts open-source code repositories on his GitHub. He is a former Apple Software QA Engineer and graduated from Carnegie Mellon University. I have published his work before, for instance, this short ggplot2 tutorial by MiniMaxir, but his new project really amazed me.

Max developed a Facebook web scaper in Python. This tool gathers all the posts and comments of Facebook Pages (or Open Facebook Groups) and the related metadata, including post message, post links, and counts of each reaction on the post. The data is then exported to a CSV file, which can be imported into any data analysis program like Excel, or R.

The data format returned by the Facebook scaper.

Max put his scraper to work and gathered a ton of publicly available Facebook posts and their metadata between 2016 and 2017.

However, this was only the beginning. In a follow-up project, Max trained a recurrent neural network (or RNN) on these 2016-2017 data in order to predict the proportionate reactions (love, wow, haha, sad, angry) to any given text. Now, he has made this neural network publicly available with the Python 2/3 module and R package, reactionrnn, which builds on Keras/TensorFlow (see Keras: Deep Learning in R or Python within 30 seconds & R learning: Neural Networks).

Python implementation

For Python, reactionrnn can be installed from pypi via pip:

python3 -m pip install reactionrnn

You may need to create a venv (python3 -m venv <path>) first.

from reactionrnn import reactionrnn

react = reactionrnn()
react.predict("Happy Mother's Day from the Chicago Cubs!")

[('love', 0.9765), ('wow', 0.0235), ('haha', 0.0), ('sad', 0.0), ('angry', 0.0)]

R implementation

For R, you can install reactionrnn from this GitHub repo with devtools (working on resolving issues to get package on CRAN):

# install.packages('devtools')
devtools::install_github("minimaxir/reactionrnn", subdir="R-package")

library(reactionrnn)
react <- reactionrnn()
react %>% predict("Happy Mother's Day from the Chicago Cubs!")

      love        wow       haha        sad      angry 
0.97649449 0.02350551 0.00000000 0.00000000 0.00000000

You can view a demo of common features in this Jupyter Notebook for Python, and this R Notebook for R.

Notes

reactionrnn is trained on Facebook posts of 2016 and 2017 and will often yield responses that are characteristic for this corpus.
reactionrnn will only use the first 140 characters of any given text.
Max intends to build a web-based implementation using Keras.js
Max also intends to improve the network (longer character sequences and better performance) and released it as a commercial product if any venture capitalists are interested.
Max’s projects are open-source and supported by his Patreon, any monetary contributions are appreciated and will be put to good creative use.

Must read: Computer Age Statistical Inference (Efron & Hastie, 2016)

Statistics, and statistical inference in specific, are becoming an ever greater part of our daily lives. Models are trying to estimate anything from (future) consumer behaviour to optimal steering behaviours and we need these models to be as accurate as possible. Trevor Hastie is a great contributor to the development of the field, and I highly recommend the machine learning books and courses that he developed, together with Robert Tibshirani. These you may find in my list of R Resources (Cheatsheets, Tutorials, & Books).

Today I wanted to share another book Hastie wrote, together with Bradley Efron, another colleague of his at Stanford University. It is called Computer Age Statistical Inference (Efron & Hastie, 2016) and is a definite must read for every aspiring data scientist because it illustrates most algorithms commonly used in modern-day statistical inference. Many of these algorithms Hastie and his colleagues at Stanford developed themselves and the book handles among others:

Regression:
- Logistic regression
- Poisson regression
- Ridge regression
- Jackknife regression
- Least angle regression
- Lasso regression
- Regression trees
Bootstrapping
Boosting
Cross-validation
Random forests
Survival analysis
Support vector machines
Kernel smoothing
Neural networks
Deep learning
Bayesian statistics

Visualizing Neural Networks in Processing

Coding Train is a Youtube channel by Daniel Shiffman that covers anything from the basics of programming languages like JavaScript (with p5.js) and Java (with Processing) to generative algorithms like physics simulation, computer vision, and data visualization. In particular, these latter topics, which Shiffman bundles under the label “the Nature of Code”, draw me to the channel.

In a recent series, Daniel draws from his free e-book to create his seven-video playlist where he elaborates on the inner workings of neural networks, visualizing the entire process as he programs the algorithm from scratch in Processing (Java). I recommend the two videos below consisting of the actual programming, especially for beginners who want to get an intuitive sense of how a neural network works.

PS. I tend to watch them on double speed.

Part 1:

Part 2:

R learning: Neural Networks

Artificial neural networks (ANNs) are computing systems inspired by the human brain. They can teach themselves to do tasks, simply by considering examples of the tasks’ outcome. For example, they can learn to identify images that contain cats by analyzing example images that have been tagged “cat” or “no cat”. When given enough examples, the neural network can autonomously determine whether “untagged” images include cats or not (Wikipedia). If you want to learn more and have 20 minutes to spare, I can recommend this YouTube video by Brandon Rohrer.

Neural networks are commonly used for those machine learning problems where there is a vast amount of (complex) data available. Some toy examples include fingerprint recognition, language translation, car steering behaviours, object detection, text generation, and doodle recognition (by Google). Chances are pretty high that any system that makes complex recommendations these days (e.g., “Is this John in the picture?”, “Did you mean “South End Taco’s” instead of “Sout En dTacos”?”) has a neural net running in the background.

http://www.r-exercises.com designs tutorials for beginning programmers in R. On their website they host a learning series on neural networks, consisting of three sets of exercises: Part 1, Part 2, and Part 3. Afterwards, you can check your performance with the solutions: Solutions 1, Solutions 2, and Solutions 3.

Keep on learning!

P.S. afterwards you might want to check out this package and API for deep learning in R and Python.