Skip to content

Commit 8dc22e8

Browse files
pks-tgitster
authored andcommitted
builtin/index-pack: fix deferred fsck outside repos
When asked to perform object consistency checks via the `--fsck-objects` flag we verify that each object part of the pack is valid. In general, this check can even be performed outside of a Git repository: we don't need an initialized object database as we simply read the object from the packfile directly. But there's one exception: a subset of the object checks may be deferred to a later point in time. For now, this only concerns ".gitmodules" and ".gitattributes" files: whenever we see a tree referencing these files we queue them for a deferred check. This is done because we need to do some extra checks for those files to ensure that they are well-formed, and these checks need to be done regardless of whether the corresponding blobs are part of the packfile or not. This works inside a repository, but unfortunately the logic leads to a segfault when running outside of one. This is because we eventually call `odb_read_object()`, which will crash because the object database has not been initialized. There's multiple options here: - We could in theory create a purely in-memory database with only a packfile store that contains the single packfile. We don't really have the infrastructure for this yet though, and it would end up being quite hacky. - We could refuse to perform consistency checks outside of a repository. But most of the checks work alright, so this would be a regression. - We can skip the finalizing consistency checks when running outside of a repository. This is not as invasive as skipping all checks, but it's not great to randomly skip a subset of tests, either. None of these options really feel perfect. The first one would be the obvious choice if easily possible. There's another option though: instead of skipping the final object checks, we can die if there are any queued object checks. With this change we now die exactly if and only if we would have previously segfaulted. Like this we ensure that objects that _may_ fail the consistency checks won't be silently skipped, and at the same time we give users a much better error message. Refactor the code accordingly and add a test that would have triggered the segfault. Note that we also move down the logic to add the packfile to the store. There is no point doing this any earlier than right before we execute `fsck_finish()`, and it ensures that the logic to set up and perform the consistency check is self-contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent 5d795b3 commit 8dc22e8

4 files changed

Lines changed: 47 additions & 3 deletions

File tree

builtin/index-pack.c

Lines changed: 18 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1640,7 +1640,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
16401640
rename_tmp_packfile(&final_index_name, curr_index_name, &index_name,
16411641
hash, "idx", 1);
16421642

1643-
if (do_fsck_object)
1643+
if (do_fsck_object && startup_info->have_repository)
16441644
packfile_store_load_pack(the_repository->objects->packfiles,
16451645
final_index_name, 0);
16461646

@@ -2110,8 +2110,23 @@ int cmd_index_pack(int argc,
21102110
else
21112111
close(input_fd);
21122112

2113-
if (do_fsck_object && fsck_finish(&fsck_options))
2114-
die(_("fsck error in pack objects"));
2113+
if (do_fsck_object) {
2114+
/*
2115+
* We cannot perform queued consistency checks when running
2116+
* outside of a repository because those require us to read
2117+
* from the object database, which is uninitialized.
2118+
*
2119+
* TODO: we may eventually set up an in-memory object database,
2120+
* which would allow us to perform these queued checks.
2121+
*/
2122+
if (!startup_info->have_repository &&
2123+
fsck_has_queued_checks(&fsck_options))
2124+
die(_("cannot perform queued object checks outside "
2125+
"of a repository"));
2126+
2127+
if (fsck_finish(&fsck_options))
2128+
die(_("fsck error in pack objects"));
2129+
}
21152130

21162131
free(opts.anomaly);
21172132
free(objects);

fsck.c

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1350,6 +1350,12 @@ int fsck_finish(struct fsck_options *options)
13501350
return ret;
13511351
}
13521352

1353+
bool fsck_has_queued_checks(struct fsck_options *options)
1354+
{
1355+
return !oidset_equal(&options->gitmodules_found, &options->gitmodules_done) ||
1356+
!oidset_equal(&options->gitattributes_found, &options->gitattributes_done);
1357+
}
1358+
13531359
void fsck_options_clear(struct fsck_options *options)
13541360
{
13551361
free(options->msg_type);

fsck.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -248,6 +248,13 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
248248
*/
249249
int fsck_finish(struct fsck_options *options);
250250

251+
/*
252+
* Check whether there are any checks that have been queued up and that still
253+
* need to be run. Returns `false` iff `fsck_finish()` wouldn't perform any
254+
* actions, `true` otherwise.
255+
*/
256+
bool fsck_has_queued_checks(struct fsck_options *options);
257+
251258
/*
252259
* Clear the fsck_options struct, freeing any allocated memory.
253260
*/

t/t5302-pack-index.sh

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -293,4 +293,20 @@ test_expect_success 'too-large packs report the breach' '
293293
grep "maximum allowed size (20 bytes)" err
294294
'
295295

296+
# git-index-pack(1) uses the default hash algorithm outside of the repository,
297+
# and it has no way to tell it otherwise. So we can only run this test with the
298+
# default hash algorithm, as it would otherwise fail to parse the tree.
299+
test_expect_success DEFAULT_HASH_ALGORITHM 'index-pack --fsck-objects outside of a repo' '
300+
test_when_finished "rm -rf repo" &&
301+
git init repo &&
302+
(
303+
cd repo &&
304+
printf "100644 blob $(test_oid 001)\t.gitattributes\n" >tree &&
305+
git mktree --missing <tree >tree-oid &&
306+
git pack-objects <tree-oid pack &&
307+
test_must_fail nongit git index-pack --fsck-objects "$(pwd)"/pack-*.pack 2>err &&
308+
test_grep "cannot perform queued object checks outside of a repository" err
309+
)
310+
'
311+
296312
test_done

0 commit comments

Comments
 (0)