My GitHub workflow - managing forks

Keeping my fork updated

last updated: 17th May 2022

Stage 0: git setup

Workflow assumptions:

  • The upstream project is on GitHub
  • The upstream project has a develop branch
  • Developers contribute by forking the upstream project and creating a feature branch
  • Feature branches can be long-lived
  • I have no commit rights on any branch of the upstream project
  • I forked the upstream project from the GitHub web interface
  • I have cloned my fork to my local development environment
  • To submit my changes I create a Pull Request against the upstream develop branch
  • My local development environment is a macOS system

enable git rerere

If the feature branch is long-lived, several rebases from develop may be needed.
Conflicts may emerge. Because a rebase replays the feature-related changes each time,
the same conflicts tend to recur on every rebase. Use git rerere to have git remember
how a conflict was resolved the previous time it occurred.

$ git config --global rerere.enabled true
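
To see rerere at work, here is a minimal sketch. It assumes a throwaway repository created with mktemp and uses a local (not global) setting so nothing on your machine is changed. The file name f and the commit messages are made up for the demonstration.

```shell
# Throwaway repository so the global git config is left untouched.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo" && git config user.email "demo@example.com"
git config rerere.enabled true        # local equivalent of the --global setting
export GIT_EDITOR=true                # never open an editor on --continue

# One file changed differently on develop and on the feature branch.
echo base > f && git add f && git commit -q -m "base"
git checkout -q -b my-feature-branch
echo feature > f && git commit -q -am "feature change"
git checkout -q develop
echo upstream > f && git commit -q -am "upstream change"
git checkout -q my-feature-branch
before=$(git rev-parse HEAD)

# First rebase: conflict, resolved by hand; rerere records the resolution.
git rebase develop >/dev/null 2>&1 || true
echo resolved > f && git add f
git rebase --continue >/dev/null 2>&1

# Put the branch back and rebase again: the same conflict comes back,
# but rerere rewrites the file with the recorded resolution.
git reset -q --hard "$before"
git rebase develop >/dev/null 2>&1 || true
cat f
```

The second rebase still stops so that you can check the result; you then only need git add and git rebase --continue.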

add upstream remote

before adding upstream, the output of git remote -v may look like this

$ git remote -v
origin	https://github.com/rija/project.git (fetch)
origin	https://github.com/rija/project.git (push)

then add upstream remote:

$ git remote add upstream https://github.com/client/project.git
$ git remote -v
origin	https://github.com/rija/project.git (fetch)
origin	https://github.com/rija/project.git (push)
upstream	https://github.com/client/project.git (fetch)
upstream	https://github.com/client/project.git (push)
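
If this setup is repeated across several clones, the step can be made safe to re-run. The following is only a sketch: it assumes a throwaway repository standing in for the local clone, and reuses the example URLs from above.

```shell
# Throwaway repository standing in for the local clone of my fork.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git remote add origin https://github.com/rija/project.git

upstream_url=https://github.com/client/project.git
# "git remote get-url" fails when the remote does not exist yet,
# so the remote is only added the first time this runs.
if ! git remote get-url upstream >/dev/null 2>&1; then
  git remote add upstream "$upstream_url"
fi
git remote -v
```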

Stage 1: Update my develop branch from upstream's develop branch

$ git checkout develop
$ git fetch upstream
$ git rebase upstream/develop
$ git push origin

Stage 2: Update my feature branch from the develop branch

$ git checkout my-feature-branch
$ git fetch origin
$ git rebase develop

This starts from a temporary copy of the develop branch up to its HEAD, then replays the changes from my-feature-branch one by one on top of that.
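
A small sketch of what the replay does, assuming a throwaway repository with made-up file names: after the rebase the feature commits sit on top of develop, and they have new commit IDs.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo" && git config user.email "demo@example.com"

# develop and the feature branch diverge after the "base" commit.
echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b my-feature-branch
echo one > feature.txt && git add feature.txt && git commit -q -m "feature 1"
echo two >> feature.txt && git commit -q -am "feature 2"
git checkout -q develop
echo new > upstream.txt && git add upstream.txt && git commit -q -m "upstream work"

git checkout -q my-feature-branch
old_tip=$(git rev-parse HEAD)
git rebase -q develop
new_tip=$(git rev-parse HEAD)

# The two feature commits are now listed above "upstream work".
git log --oneline --graph my-feature-branch
echo "tip before: $old_tip"
echo "tip after:  $new_tip"
```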

variation: rebasing on specific commit

To reduce the conflict surface of the process, I often rebase incrementally onto specific commits, so that the upstream changes I have to keep in mind while going through each conflict stay manageable. The syntax is almost the same: instead of the develop branch, you give the abbreviated commit ID (its first 7 or so characters) of the commit on the develop branch onto which you want to rebase:

$ git rebase 6198c11

The above is saying "rebase my feature branch onto the develop branch at commit 6198c11"
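
One way to automate the incremental approach is to walk through the upstream commits in order and rebase onto each in turn, stopping at the first conflict. This is a sketch only, run in a throwaway repository with made-up files, not a description of a built-in git feature.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo" && git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b my-feature-branch
echo work > feature.txt && git add feature.txt && git commit -q -m "feature work"
git checkout -q develop
for n in 1 2 3; do
  echo "$n" > "upstream$n.txt" && git add "upstream$n.txt"
  git commit -q -m "upstream change $n"
done
git checkout -q my-feature-branch

# Commits that are on develop but not yet under the feature branch, oldest first.
git log --oneline --reverse my-feature-branch..develop

# Rebase onto each of them in turn; on a conflict the loop stops so it can
# be resolved by hand before carrying on.
for c in $(git rev-list --reverse my-feature-branch..develop); do
  git rebase -q "$c" || break
done
```

In practice you would probably pick only a few intermediate commits rather than every one of them.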

Stage 3: Figuring out where we are and what the current patch is for

Once the process starts, git shows its progress in real time:

...
Applying: Remove prototype code
Applying: Remove references to prototype in Dockerfile for nginx
Applying: Remove prototype step from Gitlab CI config
Applying: Fix dataset view acceptance tests for file table settings
...
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.

In theory, if there are no conflicts, that's the only thing you see.
On my long-lived branch (2000+ commits), there are always conflicts.
When the process encounters a snag, it will try to resolve it. If all strategies fail,
it will output the problem description and hand us the prompt to fix the conflict.

This is the right opportunity to gauge the situation with the following helpful commands:

  • git status: current state of files in the rebasing branch
  • git am --show-current-patch (or git rebase --show-current-patch): show the content of the patch that failed
  • git ls-files -u: see which files are in conflict
  • git diff --diff-filter=U: see what the conflicts are

Sometimes the output of the second command can be confusing to read on the terminal, so I copy the commit id at the top of the patch and query GitHub for more readable details about the files and changes involved:

https://github.com/rija/gigadb-website/commit/ab0a7f836fbfd8381e2b8faba76984c6e042af09

Stage 4: Manually resolving conflicts

See which files are in conflict:

$ git ls-files -u
100644 9b42055da84e099659ee6246fb9f5bdb1f034de6 2	tests/behat.yml
100644 d1a9a1ebbe16aa2f7f57c23b4fd57ec906446aa7 3	tests/behat.yml

See what the conflicts are:

$ git diff --diff-filter=U
diff --cc tests/behat.yml
index 9b42055d,d1a9a1eb..00000000
--- a/tests/behat.yml
+++ b/tests/behat.yml
@@@ -3,7 -3,7 +3,11 @@@ default
      features:  features
      bootstrap: features/bootstrap
    context:
++<<<<<<< HEAD
 +      class:  'MyMainContext'
++=======
+       class:  'AuthorWorkflowContext'
++>>>>>>> Author-names (#81): setting up test infrastructure
    extensions:
      Behat\MinkExtension\Extension:
        base_url: 'http://lvh.me:9170/'

The numbers 2 and 3 in the output of git ls-files -u are index stage numbers. There is sometimes a 1 too.
Stage 1 is the common ancestor, stage 2 is "ours" and stage 3 is "theirs". During a rebase, "ours" is the branch being rebased onto (here develop, i.e. HEAD) and "theirs" is the feature branch commit being replayed, which is the reverse of a merge. (The mapping may also differ if your workflow differs from what's described in Stage 0.)

To show the conflicted file as it is in each of those stages, without checking anything out:

$ git show :2:tests/behat.yml
$ git show :3:tests/behat.yml
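
To check which side each stage number holds during a rebase, here is a minimal sketch in a throwaway repository. The file f and its contents are made up so that each version is easy to recognise.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo" && git config user.email "demo@example.com"

# The same file changed on both sides of the fork point.
echo base > f && git add f && git commit -q -m "base"
git checkout -q -b my-feature-branch
echo feature > f && git commit -q -am "feature change"
git checkout -q develop
echo upstream > f && git commit -q -am "upstream change"
git checkout -q my-feature-branch
git rebase develop >/dev/null 2>&1 || true   # stops on the conflict

git ls-files -u
echo "stage 1 (common ancestor):                $(git show :1:f)"
echo "stage 2 (ours = develop):                 $(git show :2:f)"
echo "stage 3 (theirs = commit being replayed): $(git show :3:f)"
```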

Fixing the conflict

In cases where the fix is simply to accept one of those versions, here is how you accept a version and move on.
(Let's say we want to accept the version from :2)

$ git show :2:tests/behat.yml > tests/behat.yml
$ git add tests/behat.yml

(there's a quicker way of doing the same thing shown below in "Patterns of conflicts" using git checkout)

If the fix is not that simple, investigate what corrections are necessary, make them, and then use git add to signal that the conflict is resolved.

git rebase --skip or git rebase --continue?

This depends on whether the patch being replayed still needs to be applied once the fix has been made.

The conflicting patch can be consulted in .git/rebase-apply/patch (or .git/rebase-merge/patch, depending on the rebase backend in use)

If the patch still matters, use git rebase --continue
If the fix makes the patch redundant, use git rebase --skip

If the conflicting patch only contains debugging changes (e.g. made while investigating issues on CI) that you know won't matter further down the line of commits to replay, it can be skipped.

Patterns of conflicts

If you repeatedly rebase a long-running branch, you will soon notice patterns of conflicts, some of which have a repeatable resolution. Here are a few for which the resolution, although manual, can be systematised, provided you have a grasp of the upstream changes made since the fork or the last rebase (helped by using git rebase <commit id>).

1. You know feature branch changes for all conflicted files are the right version to keep

$ git diff --name-only --diff-filter=U | grep protected | xargs git checkout --theirs
$ git diff --name-only --diff-filter=U | xargs git add
$ git rebase --continue

2. You know upstream changes for all conflicted files are the right version to keep

$ git rebase --skip

3. Keep some files from the upstream branch and others from feature branch

$ git ls-files -u
$ git checkout --theirs <list of conflicted files to checkout from feature branch>
$ git checkout --ours <list of conflicted files to checkout from upstream branch>
$ git diff --name-only --diff-filter=U | xargs git add
$ git rebase --continue
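
Pattern 3 can be tried out safely with the sketch below. It assumes a throwaway repository with two made-up files, a.txt and b.txt, both in conflict: a.txt is kept from the feature branch and b.txt from upstream.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo" && git config user.email "demo@example.com"
export GIT_EDITOR=true                # never open an editor on --continue

echo base > a.txt && echo base > b.txt
git add a.txt b.txt && git commit -q -m "base"
git checkout -q -b my-feature-branch
echo feature > a.txt && echo feature > b.txt && git commit -q -am "feature change"
git checkout -q develop
echo upstream > a.txt && echo upstream > b.txt && git commit -q -am "upstream change"
git checkout -q my-feature-branch
git rebase develop >/dev/null 2>&1 || true   # stops with both files in conflict

git checkout --theirs a.txt    # during a rebase, "theirs" is the feature branch
git checkout --ours b.txt      # and "ours" is the upstream side
git diff --name-only --diff-filter=U | xargs git add
git rebase --continue >/dev/null 2>&1

cat a.txt b.txt
```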

Stage 5: What if things go wrong

git rebase --abort will cancel the rebase as if it was never attempted.

However, all the conflicts already resolved will be discarded too. That's why it's better to rebase onto a commit ID rather than a branch, so that you can take in upstream changes in smaller chunks, iteratively. That way, if you have to abort, little conflict-resolution work is lost.

Sometimes, when git crashed, I had to use git rebase --quit.

Stage 6: Finalising the rebase

At the end, when the rebase completes, I normally configure the project and run all the tests.

Only when that's done and working do I push. A plain push:

$ git push origin

will fail with the error:

error: failed to push some refs to 'git@github.com:rija/gigadb-website.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

DO NOT follow the advice from these hints, otherwise we will end up with duplicate commits for the changes made in the feature branch.
This is because the rebasing process has given new commit IDs to the feature branch commits that follow the commit onto which the rebase was made.
Therefore, the git remote server sees them as distinct commits and cannot connect them with the old commits.

Instead we do a force push to replace the remote branch with the rebased branch:

$ git push --force origin

Warning: Replacing the remote branch that way is strongly discouraged on branches that have been exposed to other developers. In this case, as per the workflow scenario in Stage 0, it's not a problem, as each developer forks the upstream project for their own feature and does not share that fork directly. Instead the feature branch is merged through a pull request into a develop branch (or a reviewer branch).
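
If you want a safety net anyway, git push has a --force-with-lease option, which refuses to overwrite the remote branch if it has moved since you last fetched it. The sketch below is an illustration rather than part of the workflow above. It assumes a local bare repository standing in for the fork on GitHub, and uses git commit --amend in place of a real rebase to give the commit a new ID.

```shell
work=$(mktemp -d) && cd "$work" || exit 1
git init -q --bare origin.git            # stands in for my fork on GitHub
git clone -q origin.git clone 2>/dev/null
cd clone || exit 1
git symbolic-ref HEAD refs/heads/my-feature-branch
git config user.name "Demo" && git config user.email "demo@example.com"

echo one > f && git add f && git commit -q -m "one"
git push -q -u origin my-feature-branch

# Rewriting the commit gives it a new ID, just as a rebase does.
git commit -q --amend -m "one (rewritten)"

# A plain push is rejected because the remote tip is not an ancestor any more.
if ! git push -q origin my-feature-branch 2>/dev/null; then
  echo "plain push rejected"
fi

# Overwrite the remote branch, but only if it still is where we last saw it.
git push -q --force-with-lease origin my-feature-branch
git ls-remote origin refs/heads/my-feature-branch
```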

What if the feature branch is too big and its review by the maintainers becomes difficult?

An approach I like is to pick a set of files from my feature branch and move them into a new branch based on the upstream branch. That way, I can submit a PR for the smaller branch, and keep doing that until merging the feature branch becomes a formality.

  1. Check out your feature branch
$ git checkout epic-work
  2. Rebase it onto upstream's tip commit (let's call that commit 61f0b29), following (or not) the notes from this document
  3. Create a branch from there
$ git checkout -b smaller-subset
  4. Reset the new branch to upstream's tip commit, leaving the working tree as it is
$ git reset --mixed 61f0b29
  5. All the changes you have made in epic-work are now local changes. You can pick the files you want to be part of smaller-subset by staging and committing them, and then discard the remaining unstaged changes (git restore leaves untracked files in place):
$ git add selected_file.txt selected_dir some_dir/another_file.php
$ git commit -m "smaller set of changes to merge together"
$ git push --set-upstream origin smaller-subset
$ git restore .
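
The steps above can be rehearsed in a throwaway repository before trying them on real work. This sketch assumes two made-up files and no remote, so the push is left out. The upstream tip is held in a variable instead of being a real commit ID like 61f0b29.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo" && git config user.email "demo@example.com"

echo base > selected_file.txt && echo base > other_file.txt
git add . && git commit -q -m "base"
upstream_tip=$(git rev-parse HEAD)       # plays the role of 61f0b29

# The big feature branch, already sitting on top of the upstream tip.
git checkout -q -b epic-work
echo "epic change" > selected_file.txt && git commit -q -am "epic 1"
echo "epic change" > other_file.txt && git commit -q -am "epic 2"

# Branch off, then move the branch pointer back while keeping the files.
git checkout -q -b smaller-subset
git reset -q --mixed "$upstream_tip"

# Keep one file for the smaller PR, discard the rest of the local changes.
git add selected_file.txt
git commit -q -m "smaller set of changes to merge together"
git restore .
git log --oneline develop..smaller-subset
```

epic-work itself is left untouched, so the process can be repeated for the next subset.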

How to reset local branch to be the same as its remote counterpart

$ git branch -vv
* my-work 1734edbaf [origin/my-work] W...
  develop             575c54e30 [origin/develop] Fix(DatasetViewTest): C...
$ git fetch origin
$ git reset --hard origin/my-work

How to remove the last n local commits

$ git reset --hard HEAD~n
$ git push --force origin

Use a file from another branch

$ git checkout <name of another branch> -- <file of interest>

Resources

Useful tools

  • Sublime Merge: my preferred GUI for git. It's comprehensive and I use it for branch management and tracking, file and line history, and for staging changes at the level of hunks inside files
  • FileMerge (part of Apple's Xcode tools): I use it to compare directories between the rebased branch and a checkout from before the rebase