Cascading Stylesheets — or CSS — is the first technology you should start learning after HTML. While HTML is used to define the structure and semantics of your content, CSS is used to style it and lay it out. For example, you can use CSS to alter the font, color, size, and spacing of your content, split it into multiple columns, or add animations and other decorative features.
I was personally encoutered CSS in multiple stages of my Data Science career:
When I started using (R) markdown (see here, or here), I could present my data science projects as HTML pages, styled through CSS.
When I got more acustomed to building web applications (e.g., Shiny) on top of my data science models, I had to use CSS to build more beautiful dashboard layouts.
When I was scraping data from Ebay, Amazon, WordPress, and Goodreads, my prior experiences with CSS & HTML helped greatly to identify and interpret the elements when you look under the hood of a webpage (try pressing CTRL + SHIFT + C).
I know others agree with me when I say that the small investment in learning the basics behind HTML & CSS pay off big time:
I read that Mozilla offers some great tutorials for those interested in learning more about “the web”, so here are some quicklinks to their free tutorials:
Obviously, I want to track and store the versions of my programs and the changes between them. I probably don’t have to tell you that git is the tool to do so.
Normally, you’d have a .gitignore file in your project folder, and all files that are not listed (or have patterns listed) in the .gitignore file are backed up online.
However, when you are working in multiple languages simulatenously, it can become a hassle to assure that only the relevant files for each language are committed to Github.
Each language will have their own “by-files”. R projects come with .Rdata, .Rproj, .Rhistory and so on, whereas Python projects generate pycaches and what not. These you don’t want to commit preferably.
Here you simply enter the operating systems, IDEs, or Programming languages you are working with, and it will generate the appropriate .gitignore contents for you.
Let’s try it out
For my current project, I am working with Python and R in Visual Studio Code. So I enter:
And Voila, I get the perfect .gitignore including all specifics for these programs and languages:
# Created by https://www.gitignore.io/api/r,python,visualstudiocode
# Edit at https://www.gitignore.io/?templates=r,python,visualstudiocode
### Python ###
# Byte-compiled / optimized / DLL files
# C extensions
# Distribution / packaging
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
# Installer logs
# Unit test / coverage reports
# Scrapy stuff:
# Sphinx documentation
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
# celery beat schedule file
# SageMath parsed files
# Spyder project settings
# Rope project settings
# Mr Developer
# mkdocs documentation
# Pyre type checker
### R ###
# History files
# Session Data files
# User-specific files
# Example code in package build process
# Output files from R CMD build
# Output files from R CMD check
# RStudio files
# produced vignettes
# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
# knitr and R markdown default cache directories
# Temporary files created by R markdown
### R.Bookdown Stack ###
# R package: bookdown caching files
### VisualStudioCode ###
### VisualStudioCode Patch ###
# Ignore all local history of files
# End of https://www.gitignore.io/api/r,python,visualstudiocode
Regular expression (also abbreviated to regex) really is a powertool any programmer should know. It was and is one of the things I most liked learning, as it provides you with immediate, godlike powers that can speed up your (data science) workflow tenfold.
I’ve covered many regex related topics on this blog already, but thought I’d combine them and others in a nice curated overview — for myself, and for you of course, to use.
If you have any materials you liked, but are missing, please let me know!
Tait Brown was annoyed at the Victoria Police who had spent $86 million Australian dollars on developing the BlueNet system which basically consists of an license-plate OCR which crosschecks against a car theft database.
Anyway, he built a system that can identify license plates, read them, and should be able to cross check them with a criminal database.
I really liked reading about this project, so please do so if you’re curious via the links below: