Commit 130e396

Optimise unescape
1 parent d6dd27d commit 130e396

1 file changed

Lines changed: 9 additions & 5 deletions

lib/rdf/ntriples/reader.rb

@@ -178,6 +178,8 @@ def self.parse_literal(input, **options)
     ESCAPE_CHARS_ESCAPED_REGEXP = Regexp.union(
       ESCAPE_CHARS_ESCAPED.keys
     ).freeze
+    # Combined pattern for a single-pass unescape (UCHAR first, then escape chars)
+    UNESCAPE_COMBINED = Regexp.union(UCHAR, ESCAPE_CHARS_ESCAPED_REGEXP).freeze

     ##
     # @param [String] string
@@ -190,11 +192,13 @@ def self.unescape(string)
       # greatly reduces the number of allocations and the processing time.
       string = string.dup.force_encoding(Encoding::UTF_8) unless string.encoding == Encoding::UTF_8

-      string
-        .gsub(UCHAR) do
-          [($1 || $2).hex].pack('U*')
-        end
-        .gsub(ESCAPE_CHARS_ESCAPED_REGEXP, ESCAPE_CHARS_ESCAPED)
+      # Early return when nothing to unescape: avoids string allocation entirely.
+      return string unless string.match?(UNESCAPE_COMBINED)
+
+      # Single pass handles both \uXXXX/\UXXXXXXXX and backslash escape chars.
+      string.gsub(UNESCAPE_COMBINED) do |match|
+        ($1 || $2) ? [($1 || $2).hex].pack('U*') : ESCAPE_CHARS_ESCAPED[match]
+      end
     end

     ##
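
The change above can be tried in isolation with the sketch below. The constants are simplified stand-ins (an assumption: the real `UCHAR` and `ESCAPE_CHARS_ESCAPED` in lib/rdf/ntriples/reader.rb cover more cases and may be defined differently), and `unescape_two_pass` / `unescape_single_pass` are hypothetical names used only to put the old and new bodies side by side.

```ruby
# Stand-in constants for illustration only; the real definitions live in
# lib/rdf/ntriples/reader.rb and cover more escapes.
UCHAR = /\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})/.freeze
ESCAPE_CHARS_ESCAPED = {
  '\\t'  => "\t",
  '\\n'  => "\n",
  '\\r'  => "\r",
  '\\"'  => '"',
  '\\\\' => '\\'
}.freeze
ESCAPE_CHARS_ESCAPED_REGEXP = Regexp.union(ESCAPE_CHARS_ESCAPED.keys).freeze

# Regexp.union keeps the two UCHAR capture groups, so $1/$2 still work.
UNESCAPE_COMBINED = Regexp.union(UCHAR, ESCAPE_CHARS_ESCAPED_REGEXP).freeze

# Old body (hypothetical name): two gsub passes, each building a new string.
def unescape_two_pass(string)
  string
    .gsub(UCHAR) { [($1 || $2).hex].pack('U*') }
    .gsub(ESCAPE_CHARS_ESCAPED_REGEXP, ESCAPE_CHARS_ESCAPED)
end

# New body (hypothetical name): early return, then one gsub pass.
def unescape_single_pass(string)
  return string unless string.match?(UNESCAPE_COMBINED)

  string.gsub(UNESCAPE_COMBINED) do |match|
    # A UCHAR match sets $1 or $2; otherwise look up the escape sequence.
    ($1 || $2) ? [($1 || $2).hex].pack('U*') : ESCAPE_CHARS_ESCAPED[match]
  end
end
```

With these stand-ins, an input that contains no escapes is returned as the same object rather than a copy, and `'caf\\u00E9'` and `'a\\tb'` unescape identically under both versions. The two versions can differ when an escaped backslash precedes `u`: for the six-character input `\\u0041`, the single pass consumes `\\` first and yields `\u0041`, whereas the two-pass version matches `\u0041` in its first pass and yields `\A`. Whether that difference matters for real N-Triples input is worth checking against the reader's specs.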
