Tag: paulvanderlaken

Reviewing year 4 of paulvanderlaken.com

Reviewing year 4 of paulvanderlaken.com

Despite the pandemic, 2020 has been a great year for me.

Professionally, I grew into my role as data science product owner. And next to this, I got more and more freelance side gigs. Mostly teaching, but also some consultancy projects. Unfortunately, all my start-up ideas failed miserably again this year, yet I’ll keep trying : )

Personally, 2020 was also generous to us. We have a family expansion coming in 2021! (Un)Fortunately, the whole quarantaine situation provided a lot of time to make our house baby-ready!

A year in numbers

2020 was also a great year for our blog.

Here are some statistics. We reached 300 followers, on the last day of the year! Who could have imagined that?!

Statistic20192020delta
Views107.828150.59940%
Visitors70.870100.53942%
Followers15930089%
Posts9672-25%
Comments405948%
per post0,420,8297%
Likes11686-26%
per post1,211,19-1%

This tremendous growth of the website is despite me posting a lot less frequently this year.

After a friend’s advice, I started posting less, but more regularly.

Can you spot the pattern in my 2020 posting behavior?

Compare that to my erratic 2019 posting:

Now my readers have got something to look forward to every Tuesday!

Yet, is Tuesday really the best day for me to post my stuff?

You seem to prefer visiting my blog on Wednesdays.

Let me know what you think in the comments!

I am looking forward to what 2021 has in store for my blogging. I guess a baby will result in even less posts… But we’ll just focus on quality over quantity!

I hope I can keep up with the exponential growth:

Best new articles in 2020

There are many ways in which you could define the quality of an article.

For me, the most obvious would be to look at some view-based metric. Something like the number of views, or the number of unique visitors.

Yet, some articles have been online longer than others. So maybe we should focus on the average views per day. Still these you can expect to be increase as articles have been in existance longer.

In my opinion, how an article attract viewers over time tells an interesting story. For instance, how stable are the daily viewer numbers? Are they rising? This is often indicative that external websites link to my article. Which implies it holds valuable information to a specific readership. In turn, this suggests that the article is likely to continue attracting viewers in the future.

Here is an abstract visualization. Every line represents and article. Every line/article starts in the lower left corner. On the x-axis you see the number of days since posting. So lines slowly move right, the longer they have been on my website. On the y-axis you see the total viewers it attracted.

You can see three types of blog articles: (1) articles that attract 90% of their views within the first month, (2) articles that generate a steady flow of visitors, (3) articles that never attract (m)any readers.

Here’s a different way of visualizing those same articles: by their average daily visitors (x) and the standard deviation in daily visitors (y).

Basically, I hope to write articles that get many daily visitors (high x). Yet, I also hope that my articles have either have stable (or preferably increasing) visitor numbers. This would mean that they either score low on y, or that y increases over time.

By these measures, my best articles of 2020 are, in my opinion:

  1. Bayesian statistics using R, Python, & Stan
  2. Automatically create perfect .gitignore file
  3. Create a publication-ready correlation matrix
  4. Simulating and visualizing the Monthy Hall problem in R & Python
  5. How most statistical tests are linear

Best all time reads

For the first time, my blog roll & archives were the most visited page of my website this year! A whopping 13k views!!

With regard to the most visited pages of this year, not much has changed since 2019. We see some golden oldies and I once again conclude that my viewership remains mostly R-based:

  1. R resources
  2. New to R?
  3. R tips and tricks
  4. The house always wins
  5. Simple correlation analysis in R
  6. Visualization innovations
  7. Beating battleships with algorithms and AI
  8. Regular expressions in R
  9. Learn project-based programming
  10. Simpson’s paradox

Which articles haven’t you read?

Did you know you can search for keywords or tags using the main page?

A new blog post every Tuesday!

A new blog post every Tuesday!

Hi dear readers!

I’ve been blogging for just over three years now. Writing my first blogs in January 2017 in a small café in London out of pastime, I had never imagined that I would actually attract a readership. And while my blog grew in visitors and followers, it always remained just a hobby for me to summarize and post stuff I was reading or working on at the time.

Still, I want to make sure that you — my readership — get the most out of the time I invest in my blogs. Some blogs are more interesting than others, I am aware. However, I notice that it’s not only the content, but also the time and momentum of posting that draws in you readers.

Hence, I propose the following: Rather than erratically posting everything at the time of writing, I’m going to provide you with some regularity. As of today…

I will post a new blog every Tuesday, at 15:00 GMT+1

“Why that specific time?” I hear you asking. Well, it’s the perfect time as all my followers are still awake on the same day!

  • My Asian followers can read the blog just before or in bed,
  • My African and European readers can look forward to their commute home, and
  • My North- and South-American fans can enjoy it with their morning coffee!

Join 1,415 other subscribers

Over the course of last year, I posted nearly 100 blogs. If you do some quick maths, you will conclude that 1 post every Tuesday is a lot less.

And this is where you come in:

Please let me know at what time you would prefer to have me post a second blog every (other) week. You can do so in the comments, by sending me a direct message, or by reaching out to me on twitter or LinkedIn.

Image via uoe.co.uk/uoe-post-office-services

Top-19 articles of 2019

Top-19 articles of 2019

With only one day remaining in 2019, let’s review the year. 2019 was my third year of blogging and it went by even quicker than the previous two!

Personally, it has been a busy year for me: I started a new job, increased my speaking and teaching activities, bought and moved to my new house, and got married op top of that!

Fortunately, I also started working parttime. This way, I could still reserve some time for learning and sharing my learnings. And sharing I did:

I posted 95 blogs in 2019!
That means one new post every 4 days!

paulvanderlaken.com improved its online footprint as well. We received over 100k visitors in 2019! And many of you subscribed and sticked around. Our little community now includes 55 more members than it did last year! And that is not even including the followers to my new twitter bot Artificial Stupidity!

Thank you for your continued interest!

Join 1,415 other subscribers

Now, I am always curious as to what brings you to my website, so let’s have a look at some 2019 statistics (which I downloaded via my new Python scraper).

Most read articles

There is clearly a power distribution in the quantity with which you read my blogs.

Some blogs consistently attract dozens of visitors each day. Others have only handful of visitors over the course of a year.

These are the 19 articles which were most read in 2019. Hyperlinks are included below the bar chart. It’s a nice combination of R programming, machine learning, HR-related materials, and some entertainment (games & gambling) in between.

Which have and haven’t you read?

  1. R resources
  2. R tips and tricks
  3. New to R?
  4. Books for the modern, data-driven HR professional
  5. The house always wins
  6. Visualization innovations
  7. Simple correlation analysis in R
  8. Beating battleships with algorithms and AI
  9. Regular expressions in R
  10. Simpson’s paradox
  11. Visualizing the k-means clustering algorithm
  12. Survival of the best fit
  13. Datasets to practice and learn data science
  14. Identifying dirty twitter bots
  15. Game of Thrones map
  16. Screeps
  17. Northstar
  18. The difference between DS, ML, and AI visualized
  19. Light GBM vs. XGBoost

Rising stars

Half of these most read articles have actually been published in 2017 or ’18 already. However, of the 95 articles published in 2019, some also demonstrate promising visitor patterns:

The People Analytics books, Visual innovations, and AI Battleships are in the top 19, and several others made it too.

Some of these newer blogs haven’t had the time to mature and redeem their place yet though. Regardless, I have high hopes!

Particularly for Neural Synesthesia, which was easily one of my greatest WOW-moments for ML applications in 2019. It’s truly mesmerizing to see a GAN traverse its latent space.

Reading & posting patterns

I have been posting quite regularly throughout the year. Apart from a holiday to Thailand during the start of January, and the start of my new job in February.

While I write and post most of my blogs in the weekend, I guess I should consider postponing publishing. As you guys are mostly active during Tuesdays and Wedsnesdays!

Statistical summary of 2019

What better way to end 2019 than with a statistical summary?

I have posted more and shorter blogs, and you’ve rewarded me with visits and more likes (also per post). However, we need more discussion!

Statistic2018 2019 Δ
Views85614107388+25%
Unique visitors5759470615+23%
Posts6195+56%
Words / post518371-40%
Likes51111118%
Comments2416-33%
As of 29/12/2019

2020 Outlook

It took some time to get started, but halfway 2017 my blog started attracting an audience. People stayed on during 2018, and visitor number continued to increase through 2019.

With an ongoing expansion from R into Python, and an increased focus on sharing resources, applications, and novelties related to data visualization and machine learning, I have a lot more in store for 2020!

I hope you stick around for the ride!

Please like, subscribe, share, and comment, and we’ll make sure 2020 will be at least as interesting and full of (machine) learning as 2019 has been!

Join 1,415 other subscribers

Two years of paulvanderlaken.com

Yesterday was the second anniversary of my website. I also reflected on this moment last year, and I thought to continue the tradition in 2019.

Let me start with a great, big
THANK YOU
to all my readers for continuing to visit my website!

You are the reason I continue to write down what I read. And maybe even the reason I continued reading and learning last year, despite all other distractions [my “real” job and my PhD : )].

Also a big thank you to all my followers on Twitter and LinkedIn, and those who have taken the time to comment or like my blogs. All of you make that I gain energy from writing this blog!

With that said, let’s start the review of the past year on my blog.

Most popular blog posts of 2018

Most importantly, let’s examine what you guys liked. Which blogs attracted the most visitors? What did you guys read?

Unfortunately, WordPress does not allow you to scrape their statistics pages. However, I was able to download monthly data manually, which I could then visualize to show you some trends.

The visual below shows the cumulative amount of visitors attracted by each blog I’ve written in 2018. Here follow links to the top 8 blogs in terms of visitor numbers this year:

  1. “What’s the difference between data science, machine learning, and artificial intelligence?”, visualized. received 4355 visits. Following a viral blog by David Robinson, I try to demystify the popular terminology.
  2. The House Always Wins: Simulating 5,000,000 Games of Baccarat a.k.a. Punto Banco received 3079 views. After a visit to Holland Casino, I thought it’d be fun to approximate the odds of gambling through statistical simulation.
  3. Bayesian data analysis for newcomers received 2253 views. It contains the link to an open access paper explaining the basics of Bayesian analysis.
  4. Identifying “Dirty” Twitter Bots with R and Python received 2247 views. It tells the story of two programmers who uncover networks of filthy social media accounts.
  5. rstudio::conf 2018 summary received 1514 views. It provides links to the most salient talks and presentations of the yearly R gathering.
  6. R tips & tricks is relatively new and has only yet received 1212 views. Seperate from the R resources guide, this new list contains all the quick tricks that help you program more effectively in R.
  7. Super Resolution: A Photo Enhancer AI received 891 views and elaborates on the development of new tools that can upgrade photo and video data quality.
  8. ggstatsplot: Creating graphics including statistical details is also relatively new but already attracted 810 visitors. It explains the novel visualization package in R that allows you to quickly create elaborate statistical plots.

Biggest failures of 2018

Where there’s success, there’s failure. Some of my posts did not get a lot of attention by my readership. That’s unfortunate, as I really only take the time to blog about the stuff that I deem interesting enough. Were these failed blog posts just unlucky, or am I biased and were they simply really bad and uninteresting?

You be the judge! Here are some of the least read posts of 2018:

General statistics

Now, let’s move to some general statistics: in 2018, paulvanderlaken.com received 85.614 views, by 57.594 unique visitors. I posted 61 new blogs, consisting of a total of 31.598 words. Fifty-one visitors liked one of my posts, and 24 visitors took the time to post a comment of their own (my replies included, probably).

Compared to last year, my website did pretty well!

20172018Δ
Views3849085614122%
Unique visitors2694957594114%
Posts10061-39%
Words / post625518-17%
Likes355146%
Comments9924-76%

However, the above statistics do not properly reflect the development of my website. For instance, I only really started generating traffic after my first viral post (i.e., Harry Plotter). The below graph takes that into account and better reflects the development of the traffic to my website.

The upward trend in traffic looks promising!

All time favorites

Looking back to the start of paulvanderlaken.com, let’s also examine which blogs have been performing well ever since their conception.

Clearly, most people have been coming for the R resources overview, as demonstrated by the visual below. Moreover, the majority of blog posts has not been visited much — only a handful ever cross the 1000 views mark.

The blogs that attracted a large public in 2017 (such as the original Harry Plotter and its sequal, and the Kaggle 2017 DS survey) have phased out a bit.

Fortunately, the introductory guide for newcomers to R is still kickstarting many programming careers! And on an additional positive note, more and more visitors seem to inspect the homepage and archives.

Redirected visitors

Finally , let’s have a closer look as to what brought people to my website.The below visualizes the main domains that redirected visitors.

Search engines provided the majority of traffic in both 2017 and 2018 –
mainly Google; to a lesser extent, DuckDuckGo and Bing (who in his right mind uses Norton Safe Search?!). My Twitter visitors increased in 2018 as compared to 2017, as did my traffic from this specific Quora page.

And that concludes my two year anniversary of paulvanderlaken.com review. I hope you enjoyed it, and that you will return to my website for the many more years to come : )

I end with a big shout out to my most loyal readers!
104 people have subscribed to my website (as of 2019-01-22)
and receive an update wherener I post a new blog.

Thank you for your continued support!

Want to join this group of elite followers?
Press the Follow button 
in the right toolbar, or at the bottom of this blog post.

Univers Interview: “Algorithms haven’t replaced the HR manager yet”

Univers Interview: “Algorithms haven’t replaced the HR manager yet”

The magazine of Tilburg University — Univers — recently interviewed me on my PhD research on People Analytics and data-driven Human Resource management. The Dutch write-up by interviewer Ron Vaessen you can find here, but is unfortunately available in Dutch only.

The full text of my dissertation can be accessed in a flipbook here or downloaded directly via this link.

I have also dedicated several blogs to more background information. A small extract on the ethics of people analytics and machine learning in HR I posted here. Those interested in visualizing survival curves like I did can see this post. Curious about the cover design, read this post

One year of paulvanderlaken.com

One year ago, I registered the domain paulvanderlaken.com with three reasons in mind: (1) I wanted an online environment to store and showcase my pet projects, (2) to share and promote some of the great blogs and research others had been writing, and (3) to show others what I was doing on my path to “data science“. The year has been just amazing. I could not have imagined the amount of positive sentiment I received from friends, family, acquaintances, and old classmates. But, most of all, the nice reactions from complete strangers across the globe! Thank you all so much for the positive response.

To my surprise, some of my stuff actually got read!

Some random stats:

In one year, I wrote 103 blogs which got over 42,000 views by nearly 30,000 visitors. 97.5% of these views occurred in the last six months. Most referrals came via Google (45%), reddit (18%), LinkedIn (8%), Facebook (8%), and Twitter (4%), and my blogs were shared a total of 241 times. Now, 51 people follow my blog, which is best viewed on Tuesdays (31%) and around 15:00h CET (6%).

views.png
My views between January 2017 and 2018, made with ggplot2 in R.

Although my personal learning is still the main reason I maintain this blog, I am very glad people seem to enjoy tagging along. Hopefully, I can continue to discover and write about data (analysis) during the coming 12 months. For now, I’d want to thank my readers for their continued interest and, in particular, my girlfriend for coping with the numerous evenings and weekend I have wasted on my pet projects. Nonetheless, it was definitely worth the effort!

Hope to see you again soon,

Paul