Over the years, I have given numerous talks and tutorials on Git and Github to help onboard folks to these vital dev tools. A lot of people find themselves overwhelmed by the seemingly infinite possibilities of such a sophisticated tool. But it turns out that day-to-day use of Git and Github is often limited to a small subset of operations.
Git is the de-facto Google Docs for developers — but on many levels, I argue, better. Through powerful techniques like branching, you can vastly improve the organization of your code, and increase your ability to cooperate with other developers on your team.
Of course, learning Git does not happen in one day; there is so much to learn! However, with this guide, you will be able to do 80% of what your daily work requires. The point of this blog post is not to make you a Git guru, but to get you started with Git and up to speed quickly. So, if you already know how to use Git, this may not be the right blog post for you!
EDIT: (June 16) In line with recent changes to Github's naming conventions, I have also changed the naming conventions of this blog post.
Prerequisites
I will be assuming, for the purposes of this quick tutorial, the following:
- Working Github account
- Git installed on your local machine
2-minute tl;dr
Before I dive into the details, I'll give you a quick 2-minute tl;dr in case you are in a rush to get things done.
- Clone a repository: Clone a repository from Github using the git clone command. This copies the repository's code and history to your machine, so you can collaborate with other folks on platforms like Github.
- Pull, pull, pull! Say your collaborator has pushed exciting new code in new_file.py and you want to access it. Well, look no further! You can use the git pull command to “download” all the new changes that have been “uploaded” to Github.
- Add, Commit, Push: The three musketeers of Git. If you want to upload (or push) your code to Github, you need to run all three of these commands. More on exactly what they do below!
  - git add <file-name> (e.g., git add new_file.py)
  - git commit -m "<new-message>" (e.g., git commit -m "This is new code I wrote!")
  - git push origin <branch> (e.g., git push origin main)
- Branch out - make it your own! Use branches to work on your own code, and never work directly on the main branch. Branches essentially give you a copy of the code that you can mess around with until you are ready to make a pull request. To make a branch, run git checkout -b <branch-name>. When you are ready to make a pull request, click on the pull request tab on Github and choose to merge your branch into main!
Other Fantastic Resources
My guide here isn’t perfect, and it is not a catch-all guide to Git. For some other fantastic resources on Git, I highly recommend that you check out:
- MIT’s Missing Semester of Your CS Education. The Git lecture gives you a solid picture of Git and Github and builds it from the bottom up. In my tutorial, I give you a more abstracted view, just to get you started. But if you are in the mood to really learn how Git works, so you don’t make crazy mistakes, check it out!
- Dangit Guide: This is a euphemized version of the original guide with expletives. It teaches you how to get unstuck in response to common issues that arise in Git. I definitely recommend that you check the guide out as you get more proficient with using Git.
- How to Write Good Commit Messages: Writing good commit messages is crucial, especially when you aren’t just hacking with your friends (even when you are hacking with your friends, good commit messages might save you an hour or two worth of debugging). This blog post will convince you further why and also teach you good commit message standards.
- StackOverflow: Well, you saw this coming. A true programmer lives in StackOverflow. StackOverflow is a fantastic resource if you are stuck and need immediate help and a quick way out of your problems. If StackOverflow doesn’t answer your question, you might be better off deleting your local repository and re-cloning (literally).
Preliminary Terminology
When you work with folks who are familiar with Git and Github, you may come across terminology that you are unfamiliar with. This blog post is no different - and indeed, I have tried to use precise terminology. Here, I will give some easy-to-understand, yet very loose, definitions for Git terms (Google ahead for the more precise definitions!)
- Repository: You will often hear people shorten this word to “repo”. Repo refers to the codebase containing all the code of your project. More specifically, it contains all the history of your project, so you can go back and “checkout” your project to previous states.
- Checkout: For most use cases, checkout is the “undo” button for your Git history, where you can switch back to a former commit. More precisely, checkout allows you to switch between different versions of the repo (different branches, different times in history, etc.).
- Push: Uploading local content to your remote Git repository. In other words, pushing your files means sending the changes you have made on your local computer to the Github repository, where everyone will be able to see them.
- Pull: Downloading remote Git repository content to your local drive. When you are collaborating with other folks, this is how you get the changes they have pushed.
- Branches: Branches are effectively different versions of the same codebase, so that you can store multiple copies of the same repository, all with their unique changes. When you want to add a new feature or fix a bug - no matter how big or how small - you make a new branch to record your changes, instead of pushing to the main branch (which should only contain production-ready code).
- Pull Request: You will often hear people shorten this to “PR”. This is a Github feature rather than a Git command, and you use it when you want to merge the changes from your branch (e.g., brian-fix) into some other branch (typically the main branch). It comes in really handy because it streamlines the review process for your code.
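To make the "undo button" idea behind checkout concrete, here is a minimal sketch you can run in a scratch directory. Everything in it (the demo directory, notes.txt, the commit messages) is a made-up stand-in rather than part of any real project, and the author identity is set only so the commits succeed on a machine with no Git configuration.

```shell
set -eu
demo="$(mktemp -d)"   # hypothetical scratch directory
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init --quiet "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is called main

# Two commits, so the repo has some history to travel through.
echo "version 1" > notes.txt
git add notes.txt
git commit --quiet -m "First version"
first_commit="$(git rev-parse HEAD)"    # remember where the first snapshot lives

echo "version 2" > notes.txt
git add notes.txt
git commit --quiet -m "Second version"

# Check out the older commit: the files go back to that earlier state.
git checkout --quiet "$first_commit"
old_contents="$(cat notes.txt)"
echo "$old_contents"

# Check out main again to return to the latest state.
git checkout --quiet main
cat notes.txt
```

Nothing is lost when you look at an old commit this way; checking out main brings you back to the newest snapshot.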
Step 1: Cloning a Github Repo
While Git can be used locally, one of the reasons Git is so powerful is that it enables you to collaborate with other developers on platforms like Github.
First, go to the directory where you want to clone the project to. In my case, my go-to directory is my Desktop, so I would run:
cd ~/Desktop
Second, pick an existing repository (or create a new one on Github) and clone it:
git clone <repo-name>
where you should replace <repo-name> with the repository's HTTPS link or SSH link. I highly recommend using SSH, but that takes a little more to set up. To find the HTTPS or SSH link on Github, click on the Clone or download button:
Clone or download button on the tensorflow repository
After cloning, you should see the new repo right in your directory. You can then cd into the directory and get right to hacking and having fun making new changes to the repo!
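If you would like to try cloning before pointing Git at a real Github link, the sketch below builds a small stand-in "remote" on your own disk and clones that instead. The paths (demo, seed, remote.git, my-project) are hypothetical names chosen for this example; with a real project you would pass the HTTPS or SSH link to git clone in place of the local path.

```shell
set -eu
demo="$(mktemp -d)"   # hypothetical scratch directory
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Build a tiny stand-in for a Github repo, with one commit in it.
git init --quiet "$demo/seed"
cd "$demo/seed"
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is called main
echo "# Demo project" > README.md
git add README.md
git commit --quiet -m "Add README"
git clone --quiet --bare "$demo/seed" "$demo/remote.git"

# The step from the text: go where you want the project, then clone it.
cd "$demo"
git clone --quiet "$demo/remote.git" my-project
cd my-project

ls                   # the files from the repository
git log --oneline    # the history came along too
git remote -v        # "origin" points back at what you cloned
```

The name origin that shows up here is the same origin used in the git pull and git push commands later on.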
Step 2: Git Pull
Imagine you are collaborating with a friend of yours (Alice). Alice has pushed her new changes to the Git repo, and you, being an enthusiastic computer scientist, have also made changes that you would now like to push, when all of a sudden you see this message:
! [rejected] your-branch -> your-branch (non-fast-forward)
Oh no! Has Git gone mad? Don't worry - Git is telling you that if you want to push your changes to the remote Git repo, you should first pull the changes from the remote repo and then push your new changes, so that your local copy is synced with the remote copy! To pull in the new changes, simply run:
git pull origin <branch-name> (e.g., git pull origin main)
Before you push or make changes to your code, make it a habit to run git pull every time!
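The Alice scenario can be reproduced safely on your own machine. This is a sketch under a few assumptions: the "remote" is a local stand-in (remote.git), alice and you are two hypothetical clones of it, and the --no-rebase and --no-edit flags on git pull are my additions so the example merges without opening an editor or asking how to reconcile the two histories.

```shell
set -eu
demo="$(mktemp -d)"   # hypothetical scratch directory
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in remote with one commit on main.
git init --quiet "$demo/seed"
cd "$demo/seed"
git symbolic-ref HEAD refs/heads/main
echo "hello" > README.md
git add README.md
git commit --quiet -m "Add README"
git clone --quiet --bare "$demo/seed" "$demo/remote.git"

# Alice and you each clone the same repo.
git clone --quiet "$demo/remote.git" "$demo/alice"
git clone --quiet "$demo/remote.git" "$demo/you"

# Alice pushes first.
cd "$demo/alice"
echo "print('from Alice')" > new_file.py
git add new_file.py
git commit --quiet -m "Add new_file.py"
git push --quiet origin main

# You commit without pulling, so your push is turned away.
cd "$demo/you"
echo "my notes" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"
if git push --quiet origin main 2>/dev/null; then
    push_before_pull="accepted"
else
    push_before_pull="rejected"
fi
echo "push before pull: $push_before_pull"

# Pull first, then push again.
git pull --quiet --no-rebase --no-edit origin main
git push --quiet origin main
ls   # your files plus Alice's new_file.py
```

After the pull, your copy contains both sets of changes, which is why the second push goes through.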
Step 3: Git add, Git commit, Git push
Quick Overview
The three musketeers of Git: add, commit, push. These three always go together when you want to push changes you have made to the Github repo so you can collaborate with others. The combination of these three commands, done right, will let you “push” your changes to the original codebase, or to your branch in the repo. Let's take a look at what each of these actions lets you do:
- git add: Adds the files you selected to the staging area. The staging area is a buffer that holds your changes before you ship them off with a commit. To borrow a shipping analogy, the staging area is a loading dock where you get to decide which products actually get shipped.
- git commit: When you commit a file, you are saying, “yes, I want this file in my Git history”. In other words, this is like clicking on the “save” button on your Word document - you are saying that you would like to, at the very least, keep a snapshot of that file. When you commit, you are also asked to add a message. This message describes what the new change does. For example, if I made a change to a file that fixes a bug caused by a typo my friend made, my message might look like: Fix typo in new.py (or more realistically, I would probably write does this work now?)
- git push: Woohoo! You have now made a snapshot of the file you want to push forward in your Git history, and all you have left to do is send that snapshot off to your remote Github repository. git push will do the job and send your commit off to the remote Github repository!
So, how exactly do we use these in practice? Let me get you started!
Add, Commit, Push How-To
Let's set up our scenario: we are in a Git folder, ~/Desktop/new-git-folder, and we run ls -a to list our files:
(base) ~/Desktop/new-git-folder ls -a
.git .gitignore new_file.py README.md
I want to push my new_file.py to the Github repository. How can I do that? Well - let's remember the three commands: add, commit, push:
git add new_file.py
git commit -m "Add new file to the repo"
git push origin main
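To watch a file move through the staging area, it helps to run git status --short between the steps. Below is a minimal sketch of the same three commands against a local stand-in remote; the scratch paths and the placeholder file contents are assumptions made for this example, not part of the scenario above.

```shell
set -eu
demo="$(mktemp -d)"   # hypothetical scratch directory
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in remote that already holds README.md and .gitignore.
git init --quiet "$demo/seed"
cd "$demo/seed"
git symbolic-ref HEAD refs/heads/main
echo "# Demo" > README.md
touch .gitignore
git add README.md .gitignore
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$demo/seed" "$demo/remote.git"
git clone --quiet "$demo/remote.git" "$demo/new-git-folder"
cd "$demo/new-git-folder"

echo "print('hello')" > new_file.py
git status --short     # "?? new_file.py": Git sees the file but is not tracking it

git add new_file.py
git status --short     # "A  new_file.py": on the loading dock, ready to ship

git commit --quiet -m "Add new file to the repo"
git status --short     # prints nothing: the snapshot is saved in local history

git push --quiet origin main   # send the snapshot to the remote
```

Until the final push, everything lives only on your machine, so a commit on its own is not yet visible to your collaborators.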
When adding files to Git staging, you might also see people run git add * or git add . instead. I personally strongly discourage these practices because they will often pick up unwanted files (like your data or, if you are a Mac user, .DS_Store). These files will eventually clutter up Github, and cleaning them up will be a mess. Besides, having these unwanted files on Github will make you look unprofessional.
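Here is a small sketch of the problem, along with one common safety net: listing unwanted files in .gitignore (the file that already appears in the directory listing above). The data/ directory and the file names are hypothetical. The --dry-run flag only reports what git add would stage, so nothing is added until the explicit git add at the end.

```shell
set -eu
demo="$(mktemp -d)"   # hypothetical scratch directory
git init --quiet "$demo/repo"
cd "$demo/repo"

echo "print('hello')" > new_file.py
touch .DS_Store                       # the kind of clutter macOS leaves behind
mkdir data
echo "1,2,3" > data/big.csv           # stand-in for data you do not want on Github

# Preview what "git add ." would pick up: the clutter comes along too.
git add --dry-run .

# Tell Git to ignore the unwanted files, then preview again.
printf '.DS_Store\ndata/\n' > .gitignore
git add --dry-run .

# Naming files explicitly is still the safest habit.
git add new_file.py .gitignore
git status --short
```

Even with a .gitignore in place, adding files by name keeps you aware of exactly what goes into each commit.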
Step 4: Branching and Pull Requests
Quick Overview
This is not a requirement, but a very strong recommendation if you want to use Git properly. In fact, I always require folks who work with me, in student organizations or in research, to use branching and pull requests, because they help us avoid 90% of Git errors and because they allow effective quality control. If you are messing around with your own Github repo, this is probably not so necessary - but I still highly recommend that you use good practice.
So what is branching? Branching is making a copy of the codebase that you can work from without affecting the original codebase. In more classical CS terms, consider this to be a pointer to a copy of your original codebase. This means that you can make your own fixes, create as many bugs as you want, and crash the program, all without affecting the original work. When you are happy with your changes, you submit a PR so that your branch can be merged into the original codebase (meaning you update the original codebase with your changes).
Branching How-To
Now, let's talk about the mechanics of how this works. First, from your original Git codebase, create your own branch. You can do this the slower way, creating a branch with git branch and then switching to it with git checkout, or you can do it all in one step:
git checkout -b <branch-name>
For example, if I want to fix a bug related to Issue #31 on Github (don't worry too much about what this means - but in Github, people can raise Issues on your work that you can work on), I might make a branch like so:
git checkout -b fix-issue-31
Once I have done so, I have created a new branch fix-issue-31 and checked it out (meaning, I am now on that branch). I highly recommend that you customize your Terminal setup so you can see which branch you are on. Otherwise, you can type:
git branch
which will show you all your local branches and which branch you are currently on, like so:
(base) ~/Desktop/new-git-folder git branch
* fix-issue-31
main
where the branch with the asterisk is the branch you are currently on.
Once you are done, follow our usual procedure of add and commit. However, on push, change it up a little, so that you push to the correct branch:
git push origin <branch-name>
In my case, I would type git push origin fix-issue-31.
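Putting the branching steps together, the sketch below creates fix-issue-31, commits on it, and pushes it to a local stand-in remote. The paths and file contents are hypothetical. The last commands check the claim made earlier: main stays untouched while you work on your branch.

```shell
set -eu
demo="$(mktemp -d)"   # hypothetical scratch directory
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in remote with one commit on main.
git init --quiet "$demo/seed"
cd "$demo/seed"
git symbolic-ref HEAD refs/heads/main
echo "# Demo" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$demo/seed" "$demo/remote.git"
git clone --quiet "$demo/remote.git" "$demo/work"
cd "$demo/work"

# Create the branch and switch to it in one step.
git checkout --quiet -b fix-issue-31
git branch                      # the asterisk marks fix-issue-31

# The usual add and commit, but on the new branch.
echo "print('fixed')" > new_file.py
git add new_file.py
git commit --quiet -m "Fix issue 31"

# Push to the branch of the same name on the remote.
git push --quiet origin fix-issue-31

git branch -r                   # the remote now has fix-issue-31 alongside main
git log --oneline main          # main still has only the initial commit
```

On Github, that pushed branch is what you would pick when opening the pull request in the next step.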
Pull Request How-To
Now it is time to make a new pull request. First, head right over to Github and click on the Pull Request button as outlined below:
Then, move right along to the New Pull Request button:
Once you are there, you should be able to submit a new Pull Request. Then, you can collaborate with other folks who will review your work, and you can continue to update your branch until everyone is satisfied and thinks your code should be merged into the main branch. And there you have it! Woohoo!