If it’s not in source control, it doesn’t exist.

Collaborative research

Data scientists and statisticians, as opposed to closet mathematicians, rarely do things in a vacuum.

In every project you have at least one other collaborator, future-you. You don’t want future-you to curse past-you.

Hadley Wickham

Why version control?

What should an employer look for when they see a certification on a résumé?

For our program, and likely data science in general, they should look at the applicant’s GitHub page. They should see interesting projects and code contributions.

Available version control tools

We use Git in this course.

Git

I’m an egotistical bastard, and I name all my projects after myself. First ‘Linux’, now ‘git’.

Linus Torvalds

Centralized vs distributed version control

SVN (Subversion) is a centralized version control system:

Git is a distributed version control system:

What do I need to use Git?


Git workflow

Git survival commands

Git basic usage
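
A minimal sketch of the basic edit, stage, commit cycle, run in a throwaway directory. The file name `analysis.R`, its contents, and the user identity are assumptions made for illustration; they are not taken from the course material.

```shell
# Work in a temporary directory so nothing else on your machine is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

git init -q                                   # create an empty repository (./.git)
git config user.name "Demo Student"           # hypothetical identity, stored for this repo only
git config user.email "student@example.com"

echo "x <- rnorm(10)" > analysis.R            # hypothetical file to put under version control
git status --short                            # prints '?? analysis.R' (untracked)
git add analysis.R                            # stage the new file
git commit -q -m "Add first draft of analysis script"

echo "summary(x)" >> analysis.R               # edit the tracked file
git diff                                      # review the unstaged change
git add analysis.R                            # stage the change
git commit -q -m "Summarize simulated data"

git log --oneline                             # lists the two commits, newest first
```

`git status` and `git diff` change nothing, so they are safe to run at any point to see where you stand.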

Branching in Git
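
One way to try branching is sketched below: create a branch, commit on it, switch back, and merge. The branch name `develop`, the file `README.md`, and the user identity are hypothetical. The name of the initial branch (`master` or `main`) depends on your Git version and configuration, so the sketch reads it from the repository instead of assuming it.

```shell
# Throwaway repository with one commit on the initial branch.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo Student"           # hypothetical identity, this repo only
git config user.email "student@example.com"

echo "Homework 1" > README.md
git add README.md
git commit -q -m "Start homework"
main_branch="$(git rev-parse --abbrev-ref HEAD)"   # 'master' or 'main', depending on setup

git checkout -q -b develop                    # create the branch and switch to it
echo "Draft answer to Q1" >> README.md
git commit -q -am "Draft answer to question 1"

git checkout -q "$main_branch"                # back on the initial branch: README.md has one line again
git merge -q develop                          # bring in the work (a fast-forward here)
git branch -d develop                         # the merged branch is no longer needed
git log --oneline                             # both commits are now on the initial branch
```

Because the initial branch did not move while `develop` was being worked on, the merge is a fast-forward and creates no separate merge commit.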



Sample session | Getting started with homework

On GitHub.com:

On your local machine:
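
A hedged sketch of what the local side of such a session might look like: clone, add a file, commit, push. So that it runs without a network connection or a GitHub account, a local bare repository (`remote.git`) stands in for the repository created on GitHub.com; with GitHub you would pass your repository's URL to `git clone` instead. The file name `hw1.md` and the user identity are also assumptions.

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# Assumption: a local bare repository plays the role of the GitHub repository.
git init -q --bare remote.git

git clone -q remote.git homework              # with GitHub: git clone <your repository URL>
cd homework
git config user.name "Demo Student"           # hypothetical identity, this repo only
git config user.email "student@example.com"

echo "# Homework 1" > hw1.md                  # hypothetical homework file
git add hw1.md
git commit -q -m "Add homework 1 skeleton"
git push -q origin HEAD                       # publish the current branch to the remote

git status --short                            # prints nothing: the working tree is clean
```

Cloning an empty repository prints a warning that can be ignored; the first push creates the branch on the remote.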

Etiquette for using Git

Write every commit message like the next person who reads it is an axe-wielding maniac who knows where you live.