Skip to content

fix: do not mistake an EOCD signature inside the comment for the real record#966

Open
spokodev wants to merge 1 commit into
Stuk:mainfrom
spokodev:fix/eocd-signature-in-comment
Open

fix: do not mistake an EOCD signature inside the comment for the real record#966
spokodev wants to merge 1 commit into
Stuk:mainfrom
spokodev:fix/eocd-signature-in-comment

Conversation

@spokodev

Copy link
Copy Markdown

Problem

JSZip.loadAsync cannot read a valid ZIP whose archive comment contains the
end of central directory signature (PK\x05\x06). It even fails on JSZip's own
output:

const comment = "\x50\x4b\x05\x06" + "Z".repeat(18);
const zip = new JSZip();
zip.file("a.txt", "hi");
const buf = await zip.generateAsync({ type: "nodebuffer", comment });
await JSZip.loadAsync(buf);
// throws: "End of data reached (data length = 132, asked index = 23262). Corrupted zip ?"
// expected: loads, a.txt === "hi"

libarchive reads the exact same bytes correctly, so this is a JSZip-side
parsing bug rather than a malformed archive:

$ bsdtar -tvf poc.zip
-rw-rw-r--  0 0      0           2 a.txt

Root cause

readEndOfCentral() locates the EOCD record with
lastIndexOfSignature(CENTRAL_DIRECTORY_END), which returns the occurrence
closest to EOF, and uses it with no validation. When PK\x05\x06 appears inside
the comment, that false positive sits after the real record. Reading its
"comment length" field then yields garbage (in the repro, 23130), and the
reader walks past the end of the buffer, raising "End of data reached".

For the repro the signature appears at offsets 88 (real) and 110 (inside the
comment); the parser picks 110.

Fix

APPNOTE.TXT 4.3.16 states the genuine EOCD satisfies
offset + 22 + commentLength === fileLength (22 being the fixed record size).
findEndOfCentral() scans the candidate signatures backward and returns the
first one matching this invariant. If none matches (e.g. an archive with
trailing bytes appended after the EOCD), it falls back to the last occurrence,
preserving the previous behavior for the prepended/appended-bytes cases.

This required a small addition to lastIndexOfSignature: an optional
endIndex argument so the scan can continue backward past a rejected
candidate.

Tests

Added a load test that round-trips an archive whose comment starts with the
EOCD signature. It fails on main (the "End of data reached" throw) and passes
with the fix. The existing prepended-bytes, appended-bytes and zip64 extra-bytes
load tests continue to pass, as does the rest of the suite.

… record

loadAsync used the last "PK\x05\x06" signature in the file as the end of
central directory record, without validating it. When the archive comment
itself contains those bytes, the false positive is picked and reading the
bogus comment-length field runs past the end of the buffer, throwing
"Corrupted zip ... End of data reached" on otherwise valid archives
(including JSZip's own generateAsync output, which libarchive reads fine).

Per APPNOTE.TXT 4.3.16 the genuine EOCD satisfies
`offset + 22 + commentLength === fileLength`. Scan the candidate signatures
backward and prefer the one matching this invariant, falling back to the
last occurrence so archives with appended/prepended bytes keep working.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant