Category: entertainment

Wordle with NLP for Data Scientists

I have played my fair share of Wordle.

I’m not necessarily good at it, but most days I get to solve the puzzle.

The experience is completely different with Semantle — a Wordle-inspired puzzle in which you also need to guess the word of the day.

Unlike in Wordle, Semantle gives you unlimited guesses though. And, boy, you will need many!

Like Wordle, Semantle gives you hints as to how close your guesses were to the secret word of the day.

However, where Wordle shows you how good your guesses were in terms of the letters used, Semantle evaluates the semantic similarity of your guesses to the secret word. For the 1000 most similar words to the secret word, it will show you its closeness like in the picture above.

This semantic similarity comes from the domain of Natural Language Processing — NLP — and this basically reflects how often words are used in similar contexts in natural language.

For instance, the words “love” and “hate” may seem like opposites, but they will often score similarly in grammatical sentences. According to the semantle FAQ the actual opposite of “love” is probably something like “Arizona Diamondbacks”, or “carburetor”.

Another example is last day’s solution (15 March 2022), when the secret word was circle. The ten closest words you could have guessed include circles and semicircle, but more distinctive words such as corner and clockwise.

Further downfield you could have guessed relatively close words like saucer, dot, parabola, but I would not have expected words like outwaited, weaved, and zipped.

The creator of Semantle scored the semantic similarity for almost all words used in the English language, by training a so-called word2vec model based on a very large dataset of news articles (GoogleNews-vectors-negative300.bin from late 2021).

Now, every day, one word is randomly selected as the secret word, and you can try to guess which one it is. I usually give up after 300 to 400 guesses, but my record was 76 guesses for uncovering the secret word world.

Try it out yourself: https://semantle.novalis.org/

And do share your epic wins and fails!

Infographic of Psychological Biases in Decision Making

I just love psychological experiments around our human biases.

In this case, Dan White visualized some of the psychological biases mentioned in Richard Shotton‘s book “The Choice Factory“.

These biases make for irrational human behavior in the way we make daily decisions.

For example, you will be prepared to pay more for a cookie, when there are less of them in the jar. The generic principle here is that we assign higher valuations to objects under conditions of scarcity.

Once you are aware of such psychological biases, you will start to notice how they are (mis)used nearly everywhere these days. Particularly in sales and marketing. In restaurants, shops, online, and in virtually any case where we act as a consumer, we are subconciously influenced to make certain purchasing decision.

Nudging, is what they call these attempts to manipulate your behavior.

Maybe not so ethical, but still these infographics look amazing and these biases are good to be aware of!

Disclaimer: This page contains one or more links to Amazon.
Any purchases made through those links provide us with a small commission that helps to host this blog.

Using OpenCV to win Mobile games

OpenCV is open-source library with tools and functionalities that support computer vision. It allows your computer to use complex mathematics to detect lines, shapes, colors, text and what not.

OpenCV was originally developed by Intel in 2000 and sometime later someone had the bright idea to build a Python module on top of it.

Using a simple…

pip install opencv-python

…you can now use OpenCV in Python to build advanced computer vision programs.

And this is exactly what many professional and hobby programmers are doing. Specifically, to get their computer to play (and win) mobile app games.

ZigZag

In ZigZag, you are a ball speeding down a narrow pathway and your only mission is to avoid falling off.

Using OpenCV, you can get your computer to detect objects, shapes, and lines.

This guy set up an emulator on his computer, so the computer can pretend to be a mobile device. Then he build a program using Python’s OpenCV module to get a top score

You can find the associated code here, but note that will need to set up an emulator yourself before being able to run this code.

Kick Ya Chop

In Kick Ya Chop, you need to stomp away parts of a tree as fast as you can, without hitting any of the branches.

This guy uses OpenCV to perform image pattern matching to allow his computer to identify and avoid the trees braches. Find the code here.

Whack ‘Em All

We all know how to play Whack a Mole, and now this computer knows how to too. Code here.

Pong

This last game also doesn’t need an introduction, and you can find the code here.

Is this machine learning or AI?

If you’d ask me, the videos above provide nice examples of advanced automation. But there’s no real machine learning or AI involved.

Yes, sure, the OpenCV package uses pre-trained neural networks under the hood, and you can definitely call those machine learning. But the programmers who now use the opencv library just leverage the knowledge stored in those network to create very basal decision rules.

IF pixel pattern of mole
THEN whack!
ELSE no whack.

To me, it’s only machine learning when there’s really some learning going on. A feedback loop with performance improvement. And you may call it AI, IMO, when the feedback loop is more or less autonomous.

Fortunately, programmers have also been taking a machine learning/AI approach to beating games. Specifically using reinforcement learning. Think of famous applications like AlphaGo and AlphaStar. But there are also hobby programmers who use similar techniques. For example, to get their computer to obtain highscores on Trackmania.

In a later post, I’ll dive into those in more detail.

How a File Format Exposed a Crossword Scandal

Vincent Warmerdam shared this Youtube video which I thoroughly enjoyed watched. It’s about Saul Pwanson, a software engineer whose hobby project got a little out of hand.

In 2016, Saul Pwanson designed a plain-text file format for crossword puzzle data, and then spent a couple of months building a micro-data-pipeline, scraping tens of thousands of crosswords from various sources.

After putting all these crosswords in a simple uniform format, Saul used some simple command line commands to check for common patterns and irregularities.

Surprisingly enough, after visualizing the results, Saul discovered egregious plagiarism by a major crossword editor that had gone on for years.

Ultimately, 538 even covered the scandal:

I thoroughly enjoyed watching this talk on Youtube.

Saul covers the file format, data pipeline, and the design choices that aided rapid exploration; the evidence for the scandal, from the initial anomalies to the final damning visualization; and what it’s like for a data project to get 15 minutes of fame.

I tried to localize the dataset online, but it seems Saul’s website has since gone offline. If you do happen to find it, please do share it in the comments!

The sound of colleagues

Missing our colleagues? The rustic sounds of the office jungle?

Try out these customizeable office noises.

Building a new desktop!

I recently decided to buy a new computer.

While looking for laptops, it struck me that they can be so expensive for the hardware you get. I actually don’t need to my computer to be mobile, as most of the time it just sits in my study.

Hence, I opted for buying a desktop. And even better, I decided to build one myself!

I thought building a PC was going to be all complex and technical, but it’s actually really easy! I hope I can inspire you to try out for yourself as well.

Basically, you need only need 6 parts to build a computer:

Casing
Power supply
Motherboard
Processor (CPU)
Hard drive (SSD)
Memory (RAM)
Optional: Graphics card (GPU)
Optional: (extra) Fans

Desktop Computer Components (With images) | Computer history, Old ... — Via Pinterest (look at that old school case & speakers)

So I did some research into what hardware to buy. Specifically, I wanted a PC that could handle some deep learning and some of the newer video games. Hence, I decided on this setup:

Casing: Be Quiet! Base with pre-installed fans
Power supply: Cooler Master V550 Gold
Motherboard: MSI B450-A Pro Max
Processor (CPU): AMD Ryzen 5 3600X
Hard drive (SSD): Crucial P1 1TB
Memory (RAM): Crucial Ballistix 3200MHz 2x8GB (I got grey ones)
Graphics card (GPU): MSI GeForce RTX 2060 Super Armor OC

Note: these are affiliate links.
If you buy a similar setup, it will generate a few bucks used to keep my website live!

My setup totalled to about €1100 or $1200, but it may depend on the vendors you pick. Nonetheless, the CPU and the GPU are definitely the most expensive (and important).

I did not buy any additional fans, as the Be Quiet base already had some pre-installed. However, I think it might be better to install extra’s.

Actually, it’s very easy to upgrade (or downgrade) your system. You can easily switch out modules to decrease or increase the performance (and cost). For instance, you can install another two memory cards on your motherboard, or simply spend more on a GPU.

After everything was delivered to my house, I thought the hard part started: building the desktop and putting everything together. But actually, this only took me about an hour or two, with the help of some great tutorials on Youtube: