git-filter-repo: The Fast, Safe Git History Rewriting Tool You Need
git-filter-repo is a modern replacement for git filter-branch that rewrites history orders of magnitude faster, with far fewer gotchas, and is now officially recommended by the Git project.
I've spent years battling slow, error-prone Git history rewrites. Every time I needed to strip a large file or split a monorepo, filter-branch would churn for hours, and its many gotchas meant it could silently mangle the result. That's why I was excited to discover git-filter-repo, a Python rewrite tool that the Git project's own filter-branch documentation now points users to. In this review, I'll show you why it's become my go-to tool for any history surgery.
What is git-filter-repo?
git-filter-repo is a command-line tool for rewriting Git repository history. It was created by Elijah Newren to replace the notoriously slow and buggy git filter-branch. The tool is a single Python script that leverages Git's fast-export and fast-import mechanisms, giving it blazing speed and a robust design. It can handle everything from removing large files and directories to splitting repositories and renaming tags. The Git project now officially recommends filter-repo over filter-branch.
Why I starred it
In my own work on a fitness tracking app, I often need to extract a module from a shared monorepo or purge accidentally committed credentials. With filter-branch, these tasks were so painful I'd avoid them. git-filter-repo changed that. Its speed is shocking: a rewrite that took filter-branch 40 minutes completes in under a minute. Its safety checks help too: it refuses to rewrite anything that doesn't look like a fresh clone unless I pass --force, and commits that don't need to change normally keep their hashes, so I can trust the result. It's the tool I wish I'd had years ago.
How it works
git-filter-repo works by:
- Exporting the repository history using git fast-export into a stream.
- Applying user-specified filters (path, email, message, etc.) to that stream.
- Importing the filtered stream back using git fast-import.
- Optionally performing additional cleanups like removing empty commits.
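The export, filter, import steps above can be sketched with plain Git. This is only an illustration of the pipeline shape, not how git-filter-repo is implemented: the real tool parses the stream into objects, whereas the sed filter here is a crude text swap that would also rewrite matching text inside file contents. The scratch directory, user name, and email addresses are made up for the example.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area; nothing here touches an existing repository.
work=$(mktemp -d)
git init -q "$work/src"
cd "$work/src"
git config user.name "Demo User"
git config user.email "old@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# 1. Export the history as a text stream (git fast-export).
# 2. Filter the stream (here: swap the email on author/committer lines).
# 3. Import the filtered stream into a fresh repository (git fast-import).
git init -q "$work/dst"
git fast-export --all \
  | sed 's/<old@example.com>/<new@example.com>/' \
  | git -C "$work/dst" fast-import --quiet

# Read the author email back from the rewritten history.
rewritten_email=$(git -C "$work/dst" log -1 --all --format=%ae)
echo "$rewritten_email"
```

The source repository is left untouched, so the original and rewritten histories can be compared side by side.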
Key concepts:
- Analyzers: Evaluate your repo before rewriting (e.g., --analyze to find large blobs).
- Path filters: Include (--path) or exclude (--path combined with --invert-paths); rename with --path-rename.
- Commit callbacks: Use --commit-callback with Python code for custom logic.
- Safety: Refuses to run on anything that does not look like a fresh clone unless you pass --force; commits normally keep their original hashes if unchanged.
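As a rough sketch of two of these concepts together, the script below removes one path with --invert-paths and edits commit messages with a --commit-callback in a throwaway repository. The file names, commit message, and scratch directory are hypothetical. --force appears only because a freshly initialised scratch repo is not a fresh clone; on a real repository you would work in a fresh clone instead. The rewrite is skipped if git-filter-repo is not on your PATH.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository; file names and messages are made up.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "print('app')" > app.py
echo "token=abc123" > secrets.txt
git add app.py secrets.txt
git commit -q -m "WIP: add app"

if git filter-repo --version >/dev/null 2>&1; then
  # --invert-paths turns the --path selection into a removal list.
  # The callback body is Python; commit.message is a bytes object.
  git filter-repo --force \
    --invert-paths --path secrets.txt \
    --commit-callback 'commit.message = commit.message.replace(b"WIP: ", b"")'
  files=$(git ls-tree -r --name-only HEAD)
  subject=$(git log -1 --format=%s)
  echo "files: $files"
  echo "subject: $subject"
else
  echo "git-filter-repo not found on PATH; skipping the rewrite" >&2
fi
```

Keeping the callback to a single expression on bytes objects is usually enough for message or email tweaks; anything longer is easier to maintain in a separate Python file.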
Quick start
Installation is trivial: download the single git-filter-repo script and place it in your PATH. Ensure you have Python 3.6+ and Git 2.36+.
Now, let's say you want to remove a large file from history:
# Create a fresh clone to avoid issues
git clone --bare https://github.com/example/repo.git /tmp/repo.git
cd /tmp/repo.git
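# Strip every blob larger than 10 MB from all commits (this is what drops giant-file.mp4)
# To remove one file by name instead: git filter-repo --invert-paths --path giant-file.mp4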
git filter-repo --strip-blobs-bigger-than 10M
# Verify the size reduction
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' | awk '/^blob/ {sum+=$2} END {print sum}'
This rewrites history in seconds, even for repos with thousands of commits.
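Before choosing a size threshold, it can help to see which blobs are actually the largest. The sketch below uses only plain Git on a hypothetical scratch repository (the file names and sizes are invented) and prints the biggest blob as a size and a path.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository with one small and one larger file.
scratch=$(mktemp -d)
git init -q "$scratch"
cd "$scratch"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "small" > notes.txt
dd if=/dev/zero of=big.bin bs=1000 count=200 2>/dev/null
git add notes.txt big.bin
git commit -q -m "Add files"

# List blobs as "<size> <path>", largest first, and keep the top entry.
# %(rest) makes cat-file split each input line at the first whitespace,
# so the path printed by rev-list is carried through to the output.
largest=$(git rev-list --objects --all \
  | git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' \
  | awk '$1 == "blob" {print $2, $3}' \
  | sort -rn \
  | head -n 1)
echo "$largest"
```

On a real repository, raising the head count to 10 or 20 gives a quick shortlist of candidates for --strip-blobs-bigger-than or for removal by path. Paths containing spaces would be truncated by this awk print, so treat the output as a rough guide.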
Real-world example
Extracting a subdirectory into a new repository with a prefix, a common monorepo task:
# Clone the original repo (bare)
git clone --bare https://github.com/org/monorepo.git /tmp/monorepo.git
cd /tmp/monorepo.git
# Keep only the 'packages/utils' folder and move it to 'utils/'
git filter-repo --path packages/utils/ --path-rename packages/utils/:utils/
# Update remote and push to a new repo
git remote add new-origin https://github.com/org/utils.git
git push new-origin --mirror
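This splits off the utils package with its full history, renaming all paths as it goes. Because filter-repo removes the origin remote after rewriting, the new remote has to be added explicitly before pushing. The speed is breathtaking: even a repo with 20k commits processes in minutes.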
Pros and cons
Pros
- Blazing fast—often 10-100x faster than filter-branch.
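- Safer: refuses to rewrite anything that is not a fresh clone unless you pass --force, works on bare repos, and expires reflogs and repacks afterwards so stripped data is really gone.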
- Rich feature set: path renaming, email obfuscation, tag renaming, and more.
- Single-file script; easy to vendor or install.
- Actively maintained; the Git project recommends it.
Cons
- Requires Python 3.6+ and Git 2.36+ (older systems may need upgrades).
- Not a drop-in replacement syntax; must learn new flags.
- Some advanced use cases need Python callbacks, which can be intimidating.
- No GUI; command-line only.
Alternatives
- git filter-branch: The built-in but deprecated tool—slow and error-prone.
- BFG Repo Cleaner: Great for removing large files, but limited in scope; no longer actively developed.
- git rebase: Good for small, interactive history edits, but not for large-scale rewrites.
My verdict — should you use it?
Absolutely, yes—if you ever need to rewrite Git history for any reason. git-filter-repo is the undisputed champion in speed, safety, and features. Developers maintaining monorepos or managing legacy repos with accidental blobs will save hours. The only reason to skip it is if you're stuck on an ancient Git version. Otherwise, install it today; your future self will thank you.