blg.tch.re – Rewrite a Git history with git-filter-repo (Apr. 10, 2024)

Rewrite a Git history with git-filter-repo

Today I had to rewrite a Git history to remove an AWS account ID from a README. Even though an AWS account ID is not sensitive data by itself, it is good practice to keep it private to reduce the attack surface.

Of course, rewriting a Git history has consequences, and should be done carefully.

When you do want to rewrite it, you will easily find many solutions to remove a particular file, with git filter-branch [1] or bfg --delete-files [2].

But editing a file's content is a bit less straightforward. One solution is to use bfg --replace-text [3]. Unfortunately, bfg requires a Java Runtime Environment to run, and I would prefer to avoid installing one if possible…

In the end, the good old git-filter-repo [4] Python script did the job nicely.

Installation could not be easier on my Debian Linux: sudo apt install git-filter-repo.

Then you have to write a /tmp/replacements file containing the data you want to replace in your history. The pattern is <SEARCH>==><REPLACEMENT>, one pattern per line. If you leave out ==> and the replacement, the match is replaced with ***REMOVED***. See the manual for the full syntax.

AWS_ACCOUNT_ID=12345==>AWS_ACCOUNT_ID=*****
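If you would rather not type the real ID into the rules file, a regex rule is one possible variant. The sketch below writes such a rule with a heredoc. It relies on the regex: prefix and the ==> separator described in the git-filter-repo manual, and it assumes the ID to hide is the usual 12 digits of an AWS account ID (the 12345 in this post is only a placeholder).

```shell
# Write a single regex rule to the file used in this post.
# The quoted 'EOF' stops the shell from expanding anything inside the rule.
cat > /tmp/replacements <<'EOF'
regex:AWS_ACCOUNT_ID=[0-9]{12}==>AWS_ACCOUNT_ID=*****
EOF
```

The regex is a Python regular expression, so [0-9]{12} matches exactly twelve digits after the equals sign.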

git-filter-repo recommends running on a fresh clone of the project. I would say it's a good idea, since I needed a few retries before getting the result I wanted.
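Getting that fresh clone could look like the sketch below. The fresh_clone helper, the URL and the directory name are all made up for illustration. The --no-local flag only matters when the source is a path on your own disk, which is the case the git-filter-repo documentation singles out.

```shell
# Hypothetical helper: clone SOURCE into a new directory DEST for the rewrite.
# --no-local forces a regular transfer even when SOURCE is a local path,
# so the result is a real copy rather than a hardlinked one.
fresh_clone() {
  git clone --no-local "$1" "$2"
}

# Usage (placeholder URL and directory):
#   fresh_clone git@example.com:me/project.git project-rewrite
#   cd project-rewrite
```

Working in a throwaway directory also means a failed attempt costs nothing: delete the directory, clone again, retry.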

$ git-filter-repo --replace-text /tmp/replacements
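Since it took me a few tries, a quick check after each run helps. The sketch below uses git log -S, which lists the commits whose diff adds or removes a given string. The commits_touching helper is a name I made up, and 12345 is still the placeholder ID from this post.

```shell
# Hypothetical helper: count commits, on any ref, whose diff adds or removes
# the given string. Zero means the string no longer appears in the history.
commits_touching() {
  git log --all --oneline -S"$1" | wc -l | tr -d ' '
}

# Usage, from inside the rewritten clone:
#   commits_touching 'AWS_ACCOUNT_ID=12345'
```

Anything other than 0 means the rule did not match everywhere, so adjust the replacements file and try again on a new clone.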

Once you're happy with your new history, you will have to reconfigure your Git remote origin before force-pushing, because git-filter-repo deletes it to help avoid accidentally repushing to the same repo.
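That last step could be sketched as follows. The republish helper and the URL are placeholders for illustration, so use the URL your original clone pointed at.

```shell
# Hypothetical helper: point origin at URL again, then overwrite the remote
# branches and tags with the rewritten history.
republish() {
  git remote add origin "$1" &&
  git push --force --all origin &&
  git push --force --tags origin
}

# Usage (placeholder URL):
#   republish git@example.com:me/project.git
```

Keep in mind that every existing clone still holds the old history, so anyone else working on the repo will need to clone again rather than pull.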