Forking a project to make your own modifications lets you integrate your contributions easily. However, if you don't send those modifications back upstream - that is, back to the parent repository - you risk losing track of them, and the history of your fork can diverge from the parent's. To make sure all contributors draw from the same place, you need to understand how git forking and git upstream interact. In this blog post, I will introduce the basics, cover some common gotchas, and leave you with a cool tip to stay ahead of the curve.
Git Upstream: Keeping Up to Date#
Let me first explain the common setup and basic workflow for interacting with the upstream repository.
In a standard setup, you usually have an origin and an upstream remote - the latter being the gatekeeper of the project, or the true source you want to contribute to.
First, make sure you have set up a remote for the upstream repository and have also set up an origin:
git remote -v
origin [email protected]:my-user/some-project.git (fetch)
origin [email protected]:my-user/some-project.git (push)
If you don't have an upstream, you can easily add it with the remote command.
git remote add upstream [email protected]:some-gatekeeper-maintainer/some-project.git
Check that the remote has been added successfully:
git remote -v
origin [email protected]:my-user/some-project.git (fetch)
origin [email protected]:my-user/some-project.git (push)
upstream [email protected]:some-gatekeeper-maintainer/some-project.git (fetch)
upstream [email protected]:some-gatekeeper-maintainer/some-project.git (push)
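If you set up forks often, the check-then-add steps above can be wrapped in a small helper. This is only a minimal sketch: the function name ensure_upstream is hypothetical, and the URL you pass in is whatever the parent repository of your project happens to be.

```shell
# Add an "upstream" remote only if one is not configured yet.
# ensure_upstream is a hypothetical helper name, not a git command.
ensure_upstream() {
  local url="$1"
  if git remote get-url upstream >/dev/null 2>&1; then
    # A remote called "upstream" already exists; leave it untouched.
    echo "upstream already set to $(git remote get-url upstream)"
  else
    git remote add upstream "$url" && echo "upstream added: $url"
  fi
}
```

Calling it a second time with a different URL leaves the existing remote alone, so it is safe to run from a setup script.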
Now you can fetch the latest changes from the upstream repository. Repeat this whenever you want to get updates:
git fetch upstream
(If the project has tags that have not been merged into master, you should also run git fetch upstream --tags.)
In general, you want to keep your local master branch as a close mirror of the upstream master branch and do any work in feature branches, as they may later become pull requests.
At this point, it doesn't matter whether you use merge or rebase, as the result is usually the same. Let's use merge:
git checkout master
git merge upstream/master
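The fetch-and-merge routine above is easy to script. The sketch below assumes the branch is called master and the remote is called upstream, as in this post; sync_master is a hypothetical name. It passes --ff-only so that the merge refuses to run when local master has commits the upstream lacks, which is one way to keep master a close mirror.

```shell
# Bring local master up to date with upstream/master.
# sync_master is a hypothetical helper; it assumes a remote named "upstream".
sync_master() {
  git fetch upstream &&
  git checkout master &&
  # --ff-only aborts instead of creating a merge commit if master has diverged.
  git merge --ff-only upstream/master
}
```

If the fast-forward is refused, local master contains work that probably belongs in a feature branch.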
When you want to share some work with the upstream maintainers, create a feature branch from master and, when you're satisfied, push it to your remote repository.
You can also rebase the feature branch onto the latest upstream master before publishing it, so that the upstream maintainers have a clean set of commits (preferably one) to evaluate:
git checkout -b feature-x
# some work and some commits happen
# some time passes
git fetch upstream
git rebase upstream/master
If you need to squash several commits into one, you can use the powerful interactive rebase at this point.
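Interactive rebase (git rebase -i upstream/master) opens an editor, so it is awkward to show in a snippet. As a rough non-interactive stand-in, which is a different technique and not interactive rebase itself, the sketch below collapses everything since the branch point into a single commit with git reset --soft. The name squash_onto_upstream is hypothetical, and it assumes the upstream/master ref fetched earlier.

```shell
# Collapse all commits made since branching off upstream/master into one.
# squash_onto_upstream is a hypothetical helper, not a git command.
squash_onto_upstream() {
  local message="$1" base
  # Find the commit where the current branch left upstream/master.
  base=$(git merge-base HEAD upstream/master) || return 1
  # Move the branch pointer back but keep every change staged, then commit once.
  git reset --soft "$base" && git commit -m "$message"
}
```

Usage would look like squash_onto_upstream "Add feature x", run on the feature branch with a clean working tree.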
Publish with git fork#
After going through the above steps, publish your work in the remote fork with a simple push.
git push origin feature-x
If you need to update the feature-x branch after publishing it, for example in response to feedback from the upstream maintainers, you have a few options:
1. Create a new branch that includes your updates and the updates from the upstream.
2. Merge the updates from the upstream into your local branch, which records a merge commit and clutters the upstream repository.
3. Rebase your local branch on top of the updated upstream base and then force push to the remote branch:
git push -f origin feature-x
Personally, I prefer to keep the history as clean as possible and choose option three, but different teams have different workflows. Note: You should only do this when working in your own fork; rewriting the history of shared repositories and branches is something you should never do.
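If a plain force push feels too blunt even on your own fork, git also offers --force-with-lease, which only overwrites the remote branch if it still points where your last fetch or push saw it. The wrapper below is a small sketch of that idea; publish_rebased is a hypothetical name, and it assumes the fork is the remote called origin, as above.

```shell
# Force-push a rebased branch to the fork, but refuse if the remote branch
# has moved since we last saw it.
# publish_rebased is a hypothetical helper, not a git command.
publish_rebased() {
  # Default to the branch that is currently checked out.
  local branch="${1:-$(git rev-parse --abbrev-ref HEAD)}"
  git push --force-with-lease origin "$branch"
}
```

For example, publish_rebased feature-x replaces the published feature-x only if nobody else has pushed to it in the meantime.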
Tip of the day: Ahead/Behind number in the prompt#
After fetching, git status will show you how many commits you are ahead of or behind the remote branch that your local branch tracks. Wouldn't it be better if you could see this information right in your faithful command prompt? That's what I thought, so I got out my bash chopsticks and made it happen.
Here's what it looks like on your prompt after you set it up:
nick-macbook-air:~/dev/projects/stash[1|94]$
All you need to add to your .bashrc or equivalent is a single function:
function ahead_behind {
    curr_branch=$(git rev-parse --abbrev-ref HEAD);
    curr_remote=$(git config branch.$curr_branch.remote);
    curr_merge_branch=$(git config branch.$curr_branch.merge | cut -d / -f 3);
    git rev-list --left-right --count $curr_branch...$curr_remote/$curr_merge_branch | tr -s '\t' '|';
}
You can use this new function ahead_behind to enrich your bash prompt to achieve the desired effect. I'll leave the coloring work to the reader.
Sample prompt:
export PS1="\h:\w[\$(ahead_behind)]$"
Internal Structure#
For those who like details and explanations, here's how it works.
We get the symbolic name of the current HEAD, that is, the current branch.
curr_branch=$(git rev-parse --abbrev-ref HEAD);
We get the remote that the current branch points to.
curr_remote=$(git config branch.$curr_branch.remote);
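We get the branch on the remote that the current branch is configured to merge with. The setting is stored as a full ref such as refs/heads/master, so a cheap Unix trick keeps only the third slash-separated field, which is the branch name. Note that this truncates branch names that themselves contain a slash.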
curr_merge_branch=$(git config branch.$curr_branch.merge | cut -d / -f 3);
Now we have what we need to collect the number of commits ahead and behind.
git rev-list --left-right --count $curr_branch...$curr_remote/$curr_merge_branch | tr -s '\t' '|';
We use the old Unix tr to convert TABs to separators |.
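The function above prints git errors into the prompt when you are outside a repository or on a branch with no upstream, and the cut trick shortens branch names that contain a slash. A possible variant, offered as a sketch rather than a replacement, lets git resolve the tracked branch itself through the @{upstream} revision syntax. The name ahead_behind_safe is hypothetical.

```shell
# Print "ahead|behind" for the current branch, or nothing at all when there
# is no repository or no upstream configured.
# ahead_behind_safe is a hypothetical name for this variant.
ahead_behind_safe() {
  local counts
  # @{upstream} resolves to the branch the current branch tracks.
  counts=$(git rev-list --left-right --count 'HEAD...@{upstream}' 2>/dev/null) || return 0
  # rev-list separates the two counts with a TAB; turn it into "|".
  printf '%s' "$counts" | tr -s '\t' '|'
}
```

It can be dropped into the same PS1 in place of ahead_behind; when it prints nothing, the brackets in the prompt are simply empty.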
Getting started with git upstream#
That's the basic drill with git upstream - how to set it up, create new branches, collect changes, publish with git fork, and a handy tip to see how many commits you are ahead of or behind your remote branch.