@@ -21,11 +21,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
2121- ` std::expected<T, E> ` for error handling (replaces ` tl::expected ` )
2222- ` std::format ` for string formatting (replaces ` fmt::format ` )
2323- ` std::ranges ` for range-based algorithms and views (replaces ` range-v3 ` )
24- - ` uni-algo ` library for Unicode processing
24+ - Custom Unicode/IDNA implementation (header-only)
2525
2626** Key Advantages** :
2727- ** Header-only** - just include and use, no linking required
28- - ** Minimal external dependencies** - only requires ` uni-algo ` for Unicode support
28+ - ** Zero external dependencies** - completely self-contained for core URL parsing
2929
3030## Building
3131
@@ -35,17 +35,16 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
3535- C++23-compliant compiler (GCC 13+, Clang 16+, MSVC 2022 17.6+)
3636
3737** Optional** (automatically disabled with warnings if not found):
38- - ` uni-algo ` for full Unicode/IDNA processing
3938- ` catch2 ` for tests
4039- ` nlohmann-json ` for JSON functionality
4140
4241To install optional dependencies:
4342``` bash
4443cd ${VCPKG_ROOT}
45- ./vcpkg install uni-algo catch2 nlohmann-json
44+ ./vcpkg install catch2 nlohmann-json
4645```
4746
48- ** Note** : The library will work for basic URL parsing even without dependencies, but IDNA/Punycode (internationalized domain names) require ` uni-algo ` .
47+ **Note**: The core library is completely self-contained with zero external dependencies. Unicode/IDNA/Punycode support is built in via a custom header-only implementation.
4948
5049### Configure and Build
5150
@@ -64,6 +63,7 @@ cmake --build _build
6463Key build options:
6564- ` skyr_BUILD_TESTS ` (ON): Build tests
6665- ` skyr_BUILD_WPT ` (OFF): Build Web Platform Tests runner
66+ - ` skyr_BUILD_BENCHMARKS ` (OFF): Build performance benchmarks
6767- ` skyr_ENABLE_FILESYSTEM_FUNCTIONS ` (ON): Enable filesystem::path conversion
6868- ` skyr_ENABLE_JSON_FUNCTIONS ` (ON): Enable JSON serialization
6969- ` skyr_BUILD_WITHOUT_EXCEPTIONS ` (OFF): Build without exceptions
@@ -222,6 +222,159 @@ Test data comes from the official WPT repository:
222222
223223This ensures compliance testing against the latest WhatWG URL specification test cases.
224224
225+ ## Benchmarks
226+
227+ ** Performance benchmarks** measure runtime URL parsing speed to identify optimization opportunities and track performance regressions.
228+
229+ ### Philosophy
230+
231+ - ** Measure, don't guess** - Profile before optimizing
232+ - ** Real-world scenarios** - Tests diverse URL patterns (ASCII, IDN, IPv6, percent-encoded, etc.)
233+ - ** Actionable metrics** - Reports average µs/URL and throughput (URLs/second)
234+ - ** Optional** - Not required for normal development (disabled by default)
235+
236+ ### Building Benchmarks
237+
238+ ``` bash
239+ cmake \
240+ -B _build \
241+ -G "Ninja" \
242+ -Dskyr_BUILD_BENCHMARKS=ON \
243+ .
244+ cmake --build _build --target url_parsing_bench
245+ ```
246+
247+ ### Running Benchmarks
248+
249+ ``` bash
250+ # Default: 10,000 iterations × 34 URLs = 340,000 parses
251+ ./_build/benchmark/url_parsing_bench
252+
253+ # Custom iteration count (100,000 iterations)
254+ ./_build/benchmark/url_parsing_bench 100000
255+
256+ # Quick test (1,000 iterations)
257+ ./_build/benchmark/url_parsing_bench 1000
258+ ```
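
The run sizes above are the iteration count multiplied by the number of test URLs. The following is a minimal sketch of a timing loop with that shape, not the project's actual benchmark source. `BenchResult`, `run_benchmark`, and the `parse` callable are hypothetical names introduced for illustration; `parse` stands in for the real parser, whose API is not shown here.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical result type, for illustration only.
struct BenchResult {
  std::size_t successful = 0;
  std::size_t failed = 0;
  std::chrono::steady_clock::duration elapsed{};
};

// Parse every URL `iterations` times and time the whole run.
// `parse` is any callable that takes a std::string_view and returns
// true on success (an assumption; the real parser is not shown here).
template <class ParseFn>
BenchResult run_benchmark(const std::vector<std::string>& urls,
                          std::size_t iterations, ParseFn parse) {
  BenchResult result;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    for (const auto& url : urls) {
      if (parse(std::string_view{url})) {
        ++result.successful;
      } else {
        ++result.failed;
      }
    }
  }
  result.elapsed = std::chrono::steady_clock::now() - start;
  return result;
}
```

A monotonic clock (`std::chrono::steady_clock`) is used so that system clock adjustments cannot distort the measured interval.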
259+
260+ ### Example Output
261+
262+ ```
263+ =================================================
264+ URL Parsing Benchmark Results
265+ =================================================
266+
267+ Configuration:
268+ Test URLs: 34 unique patterns
269+ Iterations: 10000
270+ Total URLs: 340000
271+
272+ Results:
273+ Total time: 820 ms
274+ Successful: 330000 (97.1%)
275+ Failed: 10000 (2.9%)
276+
277+ Performance:
278+ Average: 2.412 µs/URL
279+ Throughput: 414634 URLs/second
280+
281+ =================================================
282+ ```
283+
284+ ### Interpreting Results
285+
286+ ** Good performance (on modern hardware):**
287+ - Average: < 5 µs/URL
288+ - Throughput: > 200,000 URLs/second
289+
290+ ** Investigate if:**
291+ - Average: > 10 µs/URL
292+ - Throughput: < 100,000 URLs/second
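
Both reported metrics follow directly from the total time and the number of parses. The sketch below shows the arithmetic and applies the thresholds listed above. `BenchMetrics`, `compute_metrics`, and `classify` are hypothetical helpers, not part of the benchmark itself, and the "borderline" label for the unclassified 5-10 µs range is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical helper type, for illustration only.
struct BenchMetrics {
  double average_us_per_url;  // mean time per parse, in microseconds
  double urls_per_second;     // throughput
};

// Derive both metrics from total wall-clock time and the parse count.
inline BenchMetrics compute_metrics(double total_ms, std::uint64_t total_urls) {
  assert(total_ms > 0.0 && total_urls > 0);
  const double n = static_cast<double>(total_urls);
  return {total_ms * 1000.0 / n, n / (total_ms / 1000.0)};
}

// Classify an average against the thresholds listed above.
inline std::string_view classify(double average_us_per_url) {
  if (average_us_per_url < 5.0) return "good";
  if (average_us_per_url > 10.0) return "investigate";
  return "borderline";  // assumption: the 5-10 us range is not classified above
}
```

With the figures from the example output (820 ms for 340,000 parses), `compute_metrics(820.0, 340000)` gives about 2.412 µs/URL and about 414,634 URLs/second, which `classify` reports as "good".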
293+
294+ ### Profiling
295+
296+ To find actual performance bottlenecks, use profiling tools:
297+
298+ ** macOS (Instruments - requires Xcode):**
299+ ``` bash
300+ # First, install Xcode from App Store or https://developer.apple.com/download/
301+ # Verify: xctrace version
302+
303+ cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
304+ cmake --build _build --target url_parsing_bench
305+
306+ # Profile with xctrace (modern replacement for 'instruments' command)
307+ xctrace record --template 'Time Profiler' \
308+ --output /tmp/url_bench.trace \
309+ --launch -- ./_build/benchmark/url_parsing_bench 50000
310+
311+ # Open results in Instruments GUI
312+ open /tmp/url_bench.trace
313+ ```
314+
315+ ** macOS (sample - built-in, no Xcode needed):**
316+ ``` bash
317+ cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
318+ cmake --build _build --target url_parsing_bench
319+ sample url_parsing_bench 10 -wait -file /tmp/profile.txt &
320+ ./_build/benchmark/url_parsing_bench 50000
321+ open /tmp/profile.txt
322+ ```
323+
324+ ** Linux (perf):**
325+ ``` bash
326+ cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
327+ cmake --build _build --target url_parsing_bench
328+ perf record -g ./_build/benchmark/url_parsing_bench 50000
329+ perf report
330+ ```
331+
332+ ** Linux/macOS (Valgrind, where supported):**
333+ ``` bash
334+ cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
335+ cmake --build _build --target url_parsing_bench
336+ valgrind --tool=callgrind ./_build/benchmark/url_parsing_bench 1000
337+ qcachegrind callgrind.out.* # macOS: brew install qcachegrind
338+ # Linux: kcachegrind
339+ ```
340+
341+ ### Test Coverage
342+
343+ The benchmark tests 34 diverse URL patterns:
344+ - Simple ASCII URLs (http, https, ftp)
345+ - URLs with query parameters and fragments
346+ - URLs with authentication (user:pass@host)
347+ - URLs with non-default ports
348+ - Internationalized domain names (IDN): ` http://example.إختبار/ ` , ` https://münchen.de/ `
349+ - Unicode in paths: ` http://example.com/π ` , ` https://example.org/文档/ `
350+ - Percent-encoded URLs: ` http://example.com/path%20with%20spaces `
351+ - Complex real-world URLs (Google search, GitHub, Wikipedia)
352+ - IPv4 addresses: ` http://192.168.1.1/ ` , ` https://127.0.0.1:8443/ `
353+ - IPv6 addresses: ` http://[::1]/ ` , ` https://[2001:db8::1]/ `
354+ - Edge cases: file://, data:, mailto:
355+
356+ ### Performance Expectations
357+
358+ ** Typical results on modern hardware (Apple M1/M2, Intel i7+, AMD Ryzen):**
359+ - Average: 2-4 µs/URL
360+ - Throughput: 250,000 - 500,000 URLs/second
361+
362+ ** Why this is fast enough:**
363+ - Most applications parse URLs once per request
364+ - A typical HTTP request takes 10-100ms
365+ - At 2-4 µs per parse, URL parsing is roughly 0.002-0.04% of total request time
366+ - Bottleneck is almost never URL parsing
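
As a rough order-of-magnitude check, the share of a request spent on parsing can be computed from the ranges quoted above. This sketch assumes exactly one parse per request; `parse_share_percent` is a hypothetical helper name.

```cpp
// Percentage of one request's duration spent parsing a single URL.
constexpr double parse_share_percent(double parse_us, double request_ms) {
  return parse_us / (request_ms * 1000.0) * 100.0;
}

// Slowest quoted parse (4 us) against the fastest quoted request (10 ms).
static_assert(parse_share_percent(4.0, 10.0) < 0.1);
// Fastest quoted parse (2 us) against the slowest quoted request (100 ms).
static_assert(parse_share_percent(2.0, 100.0) < 0.01);
```

Even in the least favourable combination, parsing stays below a tenth of a percent of the request.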
367+
368+ ### Before Adding Dependencies
369+
370+ Before adding external libraries like simdutf for "faster UTF conversion":
371+
372+ 1. **Profile first** - Use profiling tools to find real bottlenecks
373+ 2. **Measure UTF time** - Is UTF conversion > 10% of runtime?
374+ 3. **Consider trade-offs** - Zero dependencies vs marginal speedup
375+
376+ The benchmark helps answer: "Is optimization worth the complexity?"
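
The trade-off in step 3 can be estimated with Amdahl's law: if a fraction `p` of the runtime is made `s` times faster, the overall speedup is `1 / ((1 - p) + p / s)`. The sketch below encodes that formula; `overall_speedup` is a hypothetical helper, and the numbers in the note that follows are illustrative, not measurements of this library.

```cpp
// Amdahl's law: overall speedup when a fraction `p` of total runtime
// (0 <= p <= 1) is accelerated by a factor `s` (s > 0).
constexpr double overall_speedup(double p, double s) {
  return 1.0 / ((1.0 - p) + p / s);
}
```

For example, if UTF conversion were 10% of runtime and a faster library made it 4 times faster, `overall_speedup(0.10, 4.0)` is about 1.08, an 8% gain at most. That is the figure to weigh against giving up the zero-dependency property.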
377+
225378## Code Structure
226379
227380** Directory Layout** :
@@ -279,11 +432,10 @@ Aliases for compatibility:
279432## Key Dependencies
280433
281434- ** C++23 standard library** : ` std::expected ` , ` std::format ` , ` std::ranges `
282- - ** uni-algo** : Unicode algorithms and IDNA processing
283435- ** nlohmann-json** (optional): JSON serialization
284- - ** Catch2** (tests): Testing framework
436+ - ** Catch2** (optional, tests only ): Testing framework
285437
286- ** Key advantage** : Minimal external dependencies - only requires ` uni-algo ` for Unicode support. All other modern C++ features (` expected ` , ` format ` , ` ranges ` ) are provided by the standard library!
438+ ** Key advantage** : Zero external dependencies for core URL parsing! All modern C++ features (` expected ` , ` format ` , ` ranges ` ) and Unicode/IDNA support are either from the standard library or custom header-only implementations.
287439
288440## Code Quality Tools
289441