
Commit 06908d1

Benchmark (#175)
* Clean up redundant dependencies
* Add parsing benchmark
* Update documentation
* Clean up documentation and workflows
1 parent c41de1a commit 06908d1

12 files changed

Lines changed: 513 additions & 46 deletions


.clang-format

Lines changed: 2 additions & 2 deletions
@@ -103,8 +103,8 @@ IncludeCategories:
 - Regex: '^<[^/]+>$'
 Priority: 2
 SortPriority: 2
-# 3. Third-party libraries (uni-algo, catch2, nlohmann)
-- Regex: '^<(uni|catch2|nlohmann)/'
+# 3. Third-party libraries (catch2, nlohmann)
++- Regex: '^<(catch2|nlohmann)/'
 Priority: 3
 SortPriority: 3
 # 4. Project headers (skyr/...)

.github/workflows/clang-tidy.yml

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ jobs:
 ./bootstrap-vcpkg.sh
 export CXXFLAGS="-stdlib=libc++"
 export LDFLAGS="-stdlib=libc++ -lc++abi"
-./vcpkg install uni-algo nlohmann-json
+./vcpkg install nlohmann-json
 
 - name: Install Catch2
 run: |

.github/workflows/sanitizers.yml

Lines changed: 0 additions & 11 deletions
@@ -115,17 +115,6 @@ jobs:
 git checkout -b master origin/master
 ./bootstrap-vcpkg.sh
 
-# For Clang builds, use custom triplet with libc++ and set compiler
-if [[ "${{ matrix.config.use_libcxx }}" == "true" ]]; then
-export CC=${{ matrix.config.cc }}
-export CXX=${{ matrix.config.cxx }}
-./vcpkg install uni-algo \
---triplet x64-linux-libcxx \
---overlay-triplets=${GITHUB_WORKSPACE}/cmake/vcpkg-triplets
-else
-./vcpkg install uni-algo
-fi
-
 - name: Configure
 shell: cmake -P {0}
 run: |

.github/workflows/skyr-url-ci.yml

Lines changed: 4 additions & 4 deletions
@@ -298,11 +298,11 @@ jobs:
 if [[ "${{ matrix.config.cxx }}" == clang++* ]]; then
 export CC=${{ matrix.config.cc }}
 export CXX=${{ matrix.config.cxx }}
-./vcpkg install catch2 uni-algo nlohmann-json \
+./vcpkg install catch2 nlohmann-json \
 --triplet x64-linux-libcxx \
 --overlay-triplets=${GITHUB_WORKSPACE}/cmake/vcpkg-triplets
 else
-./vcpkg install catch2 uni-algo nlohmann-json
+./vcpkg install catch2 nlohmann-json
 fi
 
 - name: Install LLVM Clang on macOS
@@ -338,7 +338,7 @@ jobs:
 git checkout -b master origin/master
 export
 ./bootstrap-vcpkg.sh
-./vcpkg install catch2 uni-algo nlohmann-json
+./vcpkg install catch2 nlohmann-json
 
 - name: Install vcpkg (Windows)
 id: vcpkg_windows
@@ -348,7 +348,7 @@ jobs:
 git clone https://github.com/microsoft/vcpkg.git ${Env:GITHUB_WORKSPACE}\vcpkg
 cd ${Env:GITHUB_WORKSPACE}\vcpkg
 .\bootstrap-vcpkg.bat
-vcpkg install catch2 uni-algo nlohmann-json --triplet x64-windows
+vcpkg install catch2 nlohmann-json --triplet x64-windows
 
 
 - name: Configure

.github/workflows/wpt.yml

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ jobs:
 git fetch origin master
 git checkout -b master origin/master
 ./bootstrap-vcpkg.sh
-./vcpkg install uni-algo nlohmann-json
+./vcpkg install nlohmann-json
 
 - name: Configure
 shell: cmake -P {0}

CLAUDE.md

Lines changed: 160 additions & 8 deletions
@@ -21,11 +21,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 - `std::expected<T, E>` for error handling (replaces `tl::expected`)
 - `std::format` for string formatting (replaces `fmt::format`)
 - `std::ranges` for range-based algorithms and views (replaces `range-v3`)
-- `uni-algo` library for Unicode processing
+- Custom Unicode/IDNA implementation (header-only)
 
 **Key Advantages**:
 - **Header-only** - just include and use, no linking required
-- **Minimal external dependencies** - only requires `uni-algo` for Unicode support
+- **Zero external dependencies** - completely self-contained for core URL parsing
 
 ## Building
 
@@ -35,17 +35,16 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 - C++23-compliant compiler (GCC 13+, Clang 16+, MSVC 2022 17.6+)
 
 **Optional** (automatically disabled with warnings if not found):
-- `uni-algo` for full Unicode/IDNA processing
 - `catch2` for tests
 - `nlohmann-json` for JSON functionality
 
 To install optional dependencies:
 ```bash
 cd ${VCPKG_ROOT}
-./vcpkg install uni-algo catch2 nlohmann-json
+./vcpkg install catch2 nlohmann-json
 ```
 
-**Note**: The library will work for basic URL parsing even without dependencies, but IDNA/Punycode (internationalized domain names) require `uni-algo`.
+**Note**: The library is completely self-contained with zero external dependencies. Unicode/IDNA/Punycode support is built-in via custom header-only implementation.
 
 ### Configure and Build
 
@@ -64,6 +63,7 @@ cmake --build _build
 Key build options:
 - `skyr_BUILD_TESTS` (ON): Build tests
 - `skyr_BUILD_WPT` (OFF): Build Web Platform Tests runner
+- `skyr_BUILD_BENCHMARKS` (OFF): Build performance benchmarks
 - `skyr_ENABLE_FILESYSTEM_FUNCTIONS` (ON): Enable filesystem::path conversion
 - `skyr_ENABLE_JSON_FUNCTIONS` (ON): Enable JSON serialization
 - `skyr_BUILD_WITHOUT_EXCEPTIONS` (OFF): Build without exceptions
@@ -222,6 +222,159 @@ Test data comes from the official WPT repository:
 
 This ensures compliance testing against the latest WhatWG URL specification test cases.
 
+## Benchmarks
+
+**Performance benchmarks** measure runtime URL parsing speed to identify optimization opportunities and track performance regressions.
+
+### Philosophy
+
+- **Measure, don't guess** - Profile before optimizing
+- **Real-world scenarios** - Tests diverse URL patterns (ASCII, IDN, IPv6, percent-encoded, etc.)
+- **Actionable metrics** - Reports average µs/URL and throughput (URLs/second)
+- **Optional** - Not required for normal development (disabled by default)
+
+### Building Benchmarks
+
+```bash
+cmake \
+-B _build \
+-G "Ninja" \
+-Dskyr_BUILD_BENCHMARKS=ON \
+.
+cmake --build _build --target url_parsing_bench
+```
+
+### Running Benchmarks
+
+```bash
+# Default: 10,000 iterations × 34 URLs = 340,000 parses
+./_build/benchmark/url_parsing_bench
+
+# Custom iteration count (100,000 iterations)
+./_build/benchmark/url_parsing_bench 100000
+
+# Quick test (1,000 iterations)
+./_build/benchmark/url_parsing_bench 1000
+```
+
+### Example Output
+
+```
+=================================================
+URL Parsing Benchmark Results
+=================================================
+
+Configuration:
+Test URLs: 34 unique patterns
+Iterations: 10000
+Total URLs: 340000
+
+Results:
+Total time: 820 ms
+Successful: 330000 (97.1%)
+Failed: 10000 (2.9%)
+
+Performance:
+Average: 2.412 µs/URL
+Throughput: 414634 URLs/second
+
+=================================================
+```
+
+### Interpreting Results
+
+**Good performance (on modern hardware):**
+- Average: < 5 µs/URL
+- Throughput: > 200,000 URLs/second
+
+**Investigate if:**
+- Average: > 10 µs/URL
+- Throughput: < 100,000 URLs/second
+
+### Profiling
+
+To find actual performance bottlenecks, use profiling tools:
+
+**macOS (Instruments - requires Xcode):**
+```bash
+# First, install Xcode from App Store or https://developer.apple.com/download/
+# Verify: xctrace version
+
+cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
+cmake --build _build --target url_parsing_bench
+
+# Profile with xctrace (modern replacement for 'instruments' command)
+xctrace record --template 'Time Profiler' \
+--output /tmp/url_bench.trace \
+--launch -- ./_build/benchmark/url_parsing_bench 50000
+
+# Open results in Instruments GUI
+open /tmp/url_bench.trace
+```
+
+**macOS (sample - built-in, no Xcode needed):**
+```bash
+cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
+cmake --build _build --target url_parsing_bench
+# -wait makes sample block until the benchmark process exists
+sample url_parsing_bench 10 -wait -file /tmp/profile.txt &
+./_build/benchmark/url_parsing_bench 50000
+wait  # let sample finish writing the report
+open /tmp/profile.txt
+```
+
+**Linux (perf):**
+```bash
+cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
+cmake --build _build --target url_parsing_bench
+perf record -g ./_build/benchmark/url_parsing_bench 50000
+perf report
+```
+
+**Valgrind-supported platforms (Linux; not available on Windows):**
+```bash
+cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
+cmake --build _build --target url_parsing_bench
+valgrind --tool=callgrind ./_build/benchmark/url_parsing_bench 1000
+# Callgrind writes its output to callgrind.out.<pid>
+qcachegrind callgrind.out.* # macOS: brew install qcachegrind
+# Linux: kcachegrind
+```
+
+### Test Coverage
+
+The benchmark tests 34 diverse URL patterns:
+- Simple ASCII URLs (http, https, ftp)
+- URLs with query parameters and fragments
+- URLs with authentication (user:pass@host)
+- URLs with non-default ports
+- Internationalized domain names (IDN): `http://example.إختبار/`, `https://münchen.de/`
+- Unicode in paths: `http://example.com/π`, `https://example.org/文档/`
+- Percent-encoded URLs: `http://example.com/path%20with%20spaces`
+- Complex real-world URLs (Google search, GitHub, Wikipedia)
+- IPv4 addresses: `http://192.168.1.1/`, `https://127.0.0.1:8443/`
+- IPv6 addresses: `http://[::1]/`, `https://[2001:db8::1]/`
+- Edge cases: file://, data:, mailto:
+
+### Performance Expectations
+
+**Typical results on modern hardware (Apple M1/M2, Intel i7+, AMD Ryzen):**
+- Average: 2-4 µs/URL
+- Throughput: 250,000 - 500,000 URLs/second
+
+**Why this is fast enough:**
+- Most applications parse URLs once per request
+- A typical HTTP request takes 10-100ms
+- At 2-4 µs per URL, parsing is roughly 0.002-0.04% of total request time
+- Bottleneck is almost never URL parsing
+
+### Before Adding Dependencies
+
+Before adding external libraries like simdutf for "faster UTF conversion":
+
+1. **Profile first** - Use profiling tools to find real bottlenecks
+2. **Measure UTF time** - Is UTF conversion > 10% of runtime?
+3. **Consider trade-offs** - Zero dependencies vs marginal speedup
+
+The benchmark helps answer: "Is optimization worth the complexity?"
+
 ## Code Structure
 
 **Directory Layout**:
@@ -279,11 +432,10 @@ Aliases for compatibility:
 ## Key Dependencies
 
 - **C++23 standard library**: `std::expected`, `std::format`, `std::ranges`
-- **uni-algo**: Unicode algorithms and IDNA processing
 - **nlohmann-json** (optional): JSON serialization
-- **Catch2** (tests): Testing framework
+- **Catch2** (optional, tests only): Testing framework
 
-**Key advantage**: Minimal external dependencies - only requires `uni-algo` for Unicode support. All other modern C++ features (`expected`, `format`, `ranges`) are provided by the standard library!
+**Key advantage**: Zero external dependencies for core URL parsing! All modern C++ features (`expected`, `format`, `ranges`) and Unicode/IDNA support are either from the standard library or custom header-only implementations.
 
 ## Code Quality Tools
 

CMakeLists.txt

Lines changed: 7 additions & 7 deletions
@@ -28,6 +28,7 @@ option(skyr_BUILD_TESTS "Build the URL tests." ON)
 option(skyr_BUILD_WPT "Build Web Platform Tests runner (report-only, not unit tests)." OFF)
 option(skyr_BUILD_DOCS "Build the URL documentation." OFF)
 option(skyr_BUILD_EXAMPLES "Build the URL examples." OFF)
+option(skyr_BUILD_BENCHMARKS "Build performance benchmarks." OFF)
 option(skyr_FULL_WARNINGS "Build the library with all warnings turned on." ON)
 option(skyr_WARNINGS_AS_ERRORS "Treat warnings as errors." ON)
 option(skyr_BUILD_WITHOUT_EXCEPTIONS "Build without exceptions." OFF)
@@ -50,13 +51,6 @@ if (skyr_IS_TOP_LEVEL_PROJECT)
 set(CMAKE_CXX_STANDARD_REQUIRED ON)
 endif()
 
-# C++23 standard library provides std::expected, std::format, std::ranges
-# Only external dependency is uni-algo for Unicode processing
-find_package(uni-algo CONFIG)
-if (NOT uni-algo_FOUND)
-message(WARNING "uni-algo not found. Install with: vcpkg install uni-algo")
-endif()
-
 if (skyr_ENABLE_JSON_FUNCTIONS)
 find_package(nlohmann_json CONFIG)
 if (NOT nlohmann_json_FOUND)
@@ -115,6 +109,12 @@ if (skyr_BUILD_EXAMPLES)
 add_subdirectory(examples)
 endif()
 
+# Benchmarks
+if (skyr_BUILD_BENCHMARKS)
+message(STATUS "[skyr-url] Configuring benchmarks")
+add_subdirectory(benchmark)
+endif()
+
 ## Install
 include(CMakePackageConfigHelpers)
 include(GNUInstallDirs)

README.md

Lines changed: 7 additions & 11 deletions
@@ -32,13 +32,13 @@ This library provides:
 * **`std::expected`** for error handling (no external dependency needed!)
 * **`std::format`** for string formatting
 * **`std::ranges`** for functional-style operations
-* Minimal external dependencies - only `uni-algo` required for Unicode/IDNA support
+* **Zero external dependencies** - Unicode/IDNA/Punycode support built-in
 
 ## Using the library
 
 This project requires:
 * A **C++23 compliant compiler** (GCC 13+, Clang 16+, MSVC 2022 17.6+)
-* **uni-algo** library (optional but recommended for full IDNA support)
+* **No external dependencies** for core URL parsing
 
 ### ``vcpkg``
 
@@ -62,7 +62,7 @@ call to ``bootstrap-vcpkg.sh`` with ``bootstrap-vcpkg.bat``.
 
 ### Installing dependencies using `vcpkg`
 
-Using `vcpkg`, install the library dependencies:
+Using `vcpkg`, install the optional test dependencies:
 
 ```bash
 > cd ${VCPKG_ROOT}
@@ -71,10 +71,10 @@ Using `vcpkg`, install the library dependencies:
 > git fetch origin master
 > git checkout -b master origin/master
 > ./bootstrap-vcpkg.sh
-> ./vcpkg install uni-algo catch2 nlohmann-json
+> ./vcpkg install catch2 nlohmann-json
 ```
 
-**Note**: Only `uni-algo` is required for the library itself. `catch2` and `nlohmann-json` are only needed for tests and JSON functionality.
+**Note**: The library has zero dependencies. `catch2` and `nlohmann-json` are only needed for tests and optional JSON functionality.
 
 ### Building the project with `CMake` and `Ninja`
 
@@ -209,17 +209,13 @@ This library leverages **C++23 standard library features**:
 * **`std::format`** (C++23) - String formatting
 * **`std::ranges`** (C++20/23) - Range algorithms and views
 
-**External dependencies**:
-* [uni-algo](https://github.com/uni-algo/uni-algo) - Unicode algorithms and IDNA processing (optional but recommended)
+**Core library**: **Zero external dependencies!**
+* Unicode/IDNA/Punycode support via custom header-only implementation
 
 **Test dependencies** (optional):
 * [Catch2](https://github.com/catchorg/catch2) - Testing framework
 * [nlohmann-json](https://github.com/nlohmann/json) - JSON support (for optional JSON features)
 
-## Acknowledgements
-
-This library includes Unicode processing support from [uni-algo](https://github.com/uni-algo/uni-algo).
-
 ## Platform support
 
 Look at the [GitHub Actions Status](https://github.com/cpp-netlib/url/actions)
