Introducing git-recover
I'm old enough to remember the Norton UNERASE command: it was part of the old Norton Utilities for MS-DOS. It made clever use of the FAT filesystem to find recently deleted files, show them to you and let you undelete them.
git-recover brings that idea to your Git repositories.
Every time you add a version of a file to your Git repository - that is to say, every time you run git add - Git puts a copy of that file in its object database. That means that if you accidentally delete a file you were working on, and you ever ran git add on it, you can probably recover it.
I thought you'd never ask!
(Wait, you didn't? If you really aren't interested in the nitty gritty of how Git manages the index, then I guess you can skip this section. But who isn't interested in that!?!)
Git's index is a "staging area" that will become the next commit. If you recall from my discussion about how Git works, a commit in Git is a snapshot of the entire repository at a single point in time. And the index is also a snapshot: it contains a list of all the files in the repository that will make up the next commit.
You can see this if you look at the index, and Git provides a tool to do just that: git ls-files --stage. When I've just cloned a repository:
% git clone /tmp/foo_repo .
% git ls-files --stage
100644 6af0abcdfc7822d5f87315af1bb3367484ee3c0c 0 foo.txt
And when I add a new file to this repository, I can inspect the index again, and will see the new file:
% git add bar.txt
% git ls-files --stage
100644 ce013625030ba8dba906f756967f9e9ca394464a 0 bar.txt
100644 6af0abcdfc7822d5f87315af1bb3367484ee3c0c 0 foo.txt
Note that the entry for bar.txt contains the object ID of the file. When you run git add, Git actually adds the file to its object database, takes the resulting object ID (the SHA-1 hash of the file's contents, prefixed with a short blob header) and places that in the index.
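To check this relationship yourself, here is a minimal sketch that builds a throwaway repository and compares the object ID recorded in the index with the one git hash-object computes for the same file. The temporary directory and the file contents are placeholders, not the files from the example above.

```shell
#!/bin/sh
set -e

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Stage a file; this writes a blob into .git/objects.
echo "some changes" > bar.txt
git add bar.txt

# The second column of the index listing is the blob's object ID.
index_id=$(git ls-files --stage bar.txt | awk '{print $2}')

# hash-object computes the ID Git would assign to the file's contents.
hashed_id=$(git hash-object bar.txt)

echo "index: $index_id"
echo "hash:  $hashed_id"

# The object with that ID really exists in the object database.
git cat-file -t "$index_id"    # prints: blob
```

The two IDs should be identical, which is why the index only needs to store the ID rather than the contents.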
You can see the file on disk - Git has added it to the repository as a loose object:
% ls -Flas .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
4 -r--r--r-- 1 ethomson staff 21 14 Jun 23:58 .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
So Git has prepared this new file for our commit. But what if we don't commit this file? What if, instead, we git rm it? Or if we make some more changes to bar.txt and add those instead?
% echo "different changes" > bar.txt
% git add bar.txt
Now we've overwritten our original changes to bar.txt:
% git ls-files --stage
100644 4a95512212b2f24397fe2df5a2554935bd0a032a 0 bar.txt
100644 6af0abcdfc7822d5f87315af1bb3367484ee3c0c 0 foo.txt
You can see that the object ID for bar.txt has changed - reflecting our new file. But what's happened to the original file we added? Where is object ce01362?
It's still in our object database:
% ls -Flas .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
4 -r--r--r-- 1 ethomson staff 21 14 Jun 23:58 .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
But we never committed it, so this object is not pointed to by any commit in the graph. Nor is it in our index anymore. This unreferenced blob is "garbage" and - eventually - Git will garbage collect it.
But until it does, we can recover it!
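One way to find such blobs by hand is git fsck, which can report objects that nothing references. The sketch below is not necessarily how git-recover does it internally; it recreates the situation above in a throwaway repository (the file contents and the committer identity are placeholders) and reads the orphaned blob back with git cat-file.

```shell
#!/bin/sh
set -e

# Throwaway repository with one commit, like the freshly cloned example.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "foo" > foo.txt
git add foo.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Add foo.txt"

# Stage a first version of bar.txt and remember its object ID...
echo "original changes" > bar.txt
git add bar.txt
old_id=$(git ls-files --stage bar.txt | awk '{print $2}')

# ...then stage a replacement, which drops the first blob from the index.
echo "different changes" > bar.txt
git add bar.txt

# List blobs that no ref, reflog or index entry can reach.
orphans=$(git fsck --unreachable --no-progress 2>/dev/null |
    awk '$2 == "blob" {print $3}')
echo "$orphans"

# Read the orphaned blob's contents back out of the object database.
recovered=$(git cat-file -p "$old_id")
echo "$recovered"    # prints: original changes
```

The only ID listed should be the one saved in old_id, since every other object is still reachable from the commit or the index.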
The simplest way to use git-recover is to use it in interactive mode: just run git recover -i. It will show you the first few lines of each "orphaned" file - those that were once git added to the repository but were never committed - and let you recover them (or not).
% git recover -i
Recoverable orphaned git blobs:
61c2562a7b851b69596f0bcad1d8f54c400be977 (Thu 15 Jun 2017 12:20:22 CEST)
> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
> veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
> commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
Recover this file? [y,n,v,f,q,?]:
You can also run git-recover without any arguments, and it will show you all the "orphaned" blobs that you can recover. You can then inspect an object to decide if it's something that you're interested in (using git show).
% git recover
Recoverable orphaned git blobs:
61c2562a7b851b69596f0bcad1d8f54c400be977 Thu 15 Jun 2017 12:20:22 CEST
When you find the object that you want to recover, you can run git-recover <objectid> to pull it out of the object database and write it to disk.
You can specify the filename to write with the (optional) -f flag:
% git recover 61c2562 -f greeking.txt
Writing 61c2562: greeking.txt.
Specifying the filename is helpful, because you may have rules set up in your .gitattributes file on a per-file or per-extension basis. Using the -f flag will make sure that these rules are applied when the file is written.
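To illustrate why the path matters, here is a sketch using plain git cat-file, whose --filters and --path options (present in recent versions of Git) apply the attributes configured for a given path when writing a blob out. It shows the underlying Git behaviour with a made-up line-ending rule and made-up file contents; it is not a claim about how git-recover implements -f.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical rule: check out .txt files with CRLF line endings.
echo "*.txt text eol=crlf" > .gitattributes

# Store a blob directly; the object database holds it with LF endings.
id=$(printf 'line one\nline two\n' | git hash-object -w --stdin)

# Raw contents, exactly as stored.
git cat-file -p "$id" > raw.out

# Contents as they would be checked out at the path greeking.txt.
git cat-file --filters --path=greeking.txt "$id" > greeking.txt

# Compare sizes: the filtered copy has one extra CR byte per line.
raw_size=$(wc -c < raw.out | tr -d ' ')
filtered_size=$(wc -c < greeking.txt | tr -d ' ')
echo "raw: $raw_size bytes, filtered: $filtered_size bytes"
```

Without a path, Git has no way to know which attribute rules apply, so the blob comes out exactly as stored.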
git-recover is a shell script - you can just download it and go. When you put git-recover in your PATH, then it becomes a proper git command, and you can run git recover (notice the space instead of the dash).
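This works for any executable named git-<something> that Git can find in your PATH. As a sketch of the mechanism, the following creates a made-up git-hello script in a temporary directory (both names are hypothetical) and runs it as a Git subcommand.

```shell
#!/bin/sh
set -e

# A temporary directory standing in for wherever you keep your scripts.
bindir=$(mktemp -d)

# A trivial executable whose name starts with "git-".
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom git command"
EOF
chmod +x "$bindir/git-hello"

# Put the directory on PATH; Git now resolves "git hello" to git-hello.
PATH="$bindir:$PATH"
export PATH

output=$(git hello)
echo "$output"    # prints: hello from a custom git command
```

Installing git-recover follows the same pattern: copy the script somewhere on your PATH and make sure it is executable.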
Please open an issue or a pull request if you have problems or improvements.