Merged
Changes from 2 commits
18 changes: 18 additions & 0 deletions .github/workflows/ci.yml
@@ -91,6 +91,12 @@ jobs:
cmake_gen: Ninja
cmake_flags: "-DMERVE_USE_SIMDUTF=ON -DMERVE_SANITIZE=ON"
name_suffix: " (simdutf, ASAN)"
# Ubuntu with error location tracking
- os: ubuntu-22.04
cxx: g++-12
cmake_gen: Ninja
cmake_flags: "-DMERVE_ENABLE_ERROR_LOCATION=ON"
name_suffix: " (error-location)"

name: ${{ matrix.os }} ${{ matrix.cxx || 'MSVC' }}${{ matrix.name_suffix || '' }}
runs-on: ${{ matrix.os }}
@@ -164,10 +170,22 @@ jobs:
working-directory: rust
run: cargo clippy -- -D warnings

- name: Clippy (error-location)
working-directory: rust
run: cargo clippy --features error-location -- -D warnings

- name: Test
working-directory: rust
run: cargo test

- name: Test (no default features)
working-directory: rust
run: cargo test --no-default-features

- name: Test (error-location)
working-directory: rust
run: cargo test --features error-location

- name: Test (no default features + error-location)
working-directory: rust
run: cargo test --no-default-features --features error-location
34 changes: 33 additions & 1 deletion README.md
@@ -132,6 +132,25 @@ const std::optional<lexer_error>& get_last_error();

Returns the last parse error, if any.

### `lexer::get_last_error_location`

```cpp
const std::optional<error_location>& get_last_error_location();
```

Returns the location of the last parse error, or `std::nullopt` if no location
is available. Location tracking is only enabled when merve is built with
`MERVE_ENABLE_ERROR_LOCATION`.
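
As a minimal sketch of consuming that return value, the helper below renders an optional location as text and falls back to a fixed message when no location was recorded. `describe_location` is a hypothetical name, not part of merve, and the struct is a local mirror of `lexer::error_location` (documented in the next section) so the snippet compiles without the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Local mirror of lexer::error_location so the sketch is self-contained.
// With the real library, include the merve header and drop this struct.
struct error_location {
  std::uint32_t line;    // 1-based
  std::uint32_t column;  // 1-based
  std::size_t offset;    // 0-based byte offset
};

// Hypothetical helper (not part of merve): formats an optional location,
// handling the "no location recorded" case explicitly.
std::string describe_location(const std::optional<error_location>& loc) {
  if (!loc) {
    return "location unavailable";
  }
  return "line " + std::to_string(loc->line) + ", column " +
         std::to_string(loc->column) + " (byte offset " +
         std::to_string(loc->offset) + ")";
}
```

In a real program the argument would be `lexer::get_last_error_location()`, read right after a failed `lexer::parse_commonjs()` call, since the value is global state that a later parse may overwrite.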

### `lexer::error_location`

```cpp
struct error_location {
  uint32_t line;    // 1-based
  uint32_t column;  // 1-based
  size_t offset;    // 0-based byte offset
};
```
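
To illustrate how the three fields relate, here is a rough sketch that derives a 1-based line and column from a 0-based byte offset. It is not merve's implementation: `locate` is a hypothetical helper, it treats only `\n` as a line break, and it counts columns in bytes, which are assumptions this README does not confirm.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Local mirror of lexer::error_location so the sketch is self-contained.
struct error_location {
  std::uint32_t line;    // 1-based
  std::uint32_t column;  // 1-based
  std::size_t offset;    // 0-based byte offset
};

// Hypothetical helper (not part of merve): walks the bytes before `offset`
// and counts line breaks. Assumes '\n' line endings and byte-based columns.
error_location locate(std::string_view source, std::size_t offset) {
  if (offset > source.size()) {
    offset = source.size();  // clamp out-of-range offsets to end of input
  }
  std::uint32_t line = 1;
  std::uint32_t column = 1;
  for (std::size_t i = 0; i < offset; ++i) {
    if (source[i] == '\n') {
      ++line;      // a newline starts the next line...
      column = 1;  // ...and resets the column
    } else {
      ++column;
    }
  }
  return {line, column, offset};
}
```

For the input `"exports.foo = 1;\nimport x"`, byte offset 17 is the first byte of the second line, so this sketch reports line 2, column 1.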

## C API

merve provides a C API (`merve_c.h`) for use from C programs, FFI bindings, or any language that can call C functions. The C API is compiled into the merve library alongside the C++ implementation.
@@ -141,11 +160,14 @@ merve provides a C API (`merve_c.h`) for use from C programs, FFI bindings, or a
```c
#include "merve_c.h"
#include <stdio.h>
#include <string.h>

int main(void) {
const char* source = "exports.foo = 1;\nexports.bar = 2;\n";

merve_analysis result = merve_parse_commonjs(source, strlen(source));
merve_error_loc err_loc = {0, 0, 0};
merve_analysis result = merve_parse_commonjs_ex(
source, strlen(source), &err_loc);

if (merve_is_valid(result)) {
size_t count = merve_get_exports_count(result);
@@ -157,6 +179,10 @@ int main(void) {
}
} else {
printf("Parse error: %d\n", merve_get_last_error());
if (err_loc.line != 0) {
printf(" at line %u, column %u (byte offset %zu)\n",
err_loc.line, err_loc.column, err_loc.offset);
}
}

merve_free(result);
@@ -180,12 +206,14 @@ Found 2 exports:
| `merve_string` | Non-owning string reference (`data` + `length`). Not null-terminated. |
| `merve_analysis` | Opaque handle to a parse result. Must be freed with `merve_free()`. |
| `merve_version_components` | Struct with `major`, `minor`, `revision` fields. |
| `merve_error_loc` | Error location (`line`, `column`, `offset`). `{0,0,0}` means unavailable. |

#### Functions

| Function | Description |
|----------|-------------|
| `merve_parse_commonjs(input, length)` | Parse CommonJS source. Returns a handle (NULL only on OOM). |
| `merve_parse_commonjs_ex(input, length, out_err)` | Parse CommonJS source and optionally fill error location. |
| `merve_is_valid(result)` | Check if parsing succeeded. NULL-safe. |
| `merve_free(result)` | Free a parse result. NULL-safe. |
| `merve_get_exports_count(result)` | Number of named exports found. |
@@ -198,6 +226,9 @@ Found 2 exports:
| `merve_get_version()` | Version string (e.g. `"1.0.1"`). |
| `merve_get_version_components()` | Version as `{major, minor, revision}`. |

Build with `-DMERVE_ENABLE_ERROR_LOCATION=ON` to have `merve_parse_commonjs_ex`
report non-zero locations; without it, `out_err` is always set to `{0, 0, 0}`.
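
Because the C API reports an unavailable location as `{0, 0, 0}`, C++ callers may prefer to fold that sentinel into `std::optional`. A minimal sketch, assuming a local mirror of `merve_error_loc` and a hypothetical `to_optional` helper (neither the mirror nor the helper ships with merve):

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Local mirror of the C struct so the sketch is self-contained.
// With the real library, include "merve_c.h" and drop this definition.
struct merve_error_loc {
  std::uint32_t line;    // 1-based, 0 when unavailable
  std::uint32_t column;  // 1-based, 0 when unavailable
  std::size_t offset;    // 0-based byte offset
};

// Hypothetical helper (not part of merve): maps the zeroed sentinel to
// std::nullopt and passes every other location through unchanged.
std::optional<merve_error_loc> to_optional(const merve_error_loc& loc) {
  if (loc.line == 0 && loc.column == 0 && loc.offset == 0) {
    return std::nullopt;
  }
  return loc;
}
```

Since `line` is 1-based, checking `line != 0` alone (as the usage example does) is enough to detect a real location; the helper compares all three fields only to mirror the documented `{0, 0, 0}` convention literally.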

#### Error Constants

| Constant | Value | Description |
@@ -344,6 +375,7 @@ ctest --test-dir build
| `MERVE_TESTING` | `ON` | Build test suite |
| `MERVE_BENCHMARKS` | `OFF` | Build benchmarks |
| `MERVE_USE_SIMDUTF` | `OFF` | Use simdutf for optimized string operations |
| `MERVE_ENABLE_ERROR_LOCATION` | `OFF` | Track parse error source locations |
| `MERVE_SANITIZE` | `OFF` | Enable address sanitizer |

### Building with simdutf
1 change: 1 addition & 0 deletions cmake/lexer-flags.cmake
@@ -1,6 +1,7 @@
option(MERVE_LOGGING "verbose output (useful for debugging)" OFF)
option(MERVE_SANITIZE "Sanitize addresses" OFF)
option(MERVE_USE_SIMDUTF "Use simdutf for optimized string operations" OFF)
option(MERVE_ENABLE_ERROR_LOCATION "Track parse error source locations" OFF)
if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
option(MERVE_SANITIZE_BOUNDS_STRICT "Sanitize bounds (strict): only for GCC" OFF)
endif()
26 changes: 26 additions & 0 deletions include/merve/parser.h
@@ -3,6 +3,7 @@

#include "merve/version.h"

#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
@@ -37,6 +38,18 @@ enum lexer_error {
TEMPLATE_NEST_OVERFLOW, ///< Template literal nesting too deep
};

/**
* @brief Source location information for a parse error.
*
* - line and column are 1-based.
* - offset is 0-based and measured in bytes from the start of input.
*/
struct error_location {
  uint32_t line;
  uint32_t column;
  size_t offset;
};

/**
* @brief Type alias for export names.
*
@@ -146,6 +159,19 @@ std::optional<lexer_analysis> parse_commonjs(std::string_view file_contents);
*/
const std::optional<lexer_error>& get_last_error();

/**
* @brief Get the location of the last failed parse operation.
*
* @return const std::optional<error_location>& The last error location, or
* std::nullopt if unavailable.
*
* @note This is global state and may be overwritten by subsequent calls
* to parse_commonjs().
* @note Location tracking is enabled when built with
* MERVE_ENABLE_ERROR_LOCATION.
*/
const std::optional<error_location>& get_last_error_location();

} // namespace lexer

#endif // MERVE_PARSER_H
33 changes: 33 additions & 0 deletions include/merve_c.h
@@ -39,6 +39,20 @@ typedef struct {
int revision;
} merve_version_components;

/**
* @brief Source location for a parse error.
*
* - line and column are 1-based.
* - offset is 0-based and measured in bytes from the start of input.
*
* A zeroed location (`{0, 0, 0}`) means the location is unavailable.
*/
typedef struct {
  uint32_t line;
  uint32_t column;
  size_t offset;
} merve_error_loc;

/* Error codes corresponding to lexer::lexer_error values. */
#define MERVE_ERROR_TODO 0
#define MERVE_ERROR_UNEXPECTED_PAREN 1
@@ -74,6 +88,25 @@ extern "C" {
*/
merve_analysis merve_parse_commonjs(const char* input, size_t length);

/**
* Parse CommonJS source code and optionally return error location.
*
* Behaves like merve_parse_commonjs(). If @p out_err is non-NULL, it is always
* written:
* - On success: set to {0, 0, 0}.
* - On parse failure with known location: set to that location.
* - On parse failure without available location: set to {0, 0, 0}.
*
* @param input Pointer to the JavaScript source (need not be
* null-terminated). NULL is treated as an empty string.
* @param length Length of the input in bytes.
* @param out_err Optional output pointer for parse error location.
* @return A handle to the parse result, or NULL on out-of-memory.
* Use merve_is_valid() to check if parsing succeeded.
*/
merve_analysis merve_parse_commonjs_ex(const char* input, size_t length,
merve_error_loc* out_err);

/**
* Check whether the parse result is valid (parsing succeeded).
*
2 changes: 1 addition & 1 deletion rust/Cargo.lock

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions rust/Cargo.toml
@@ -26,6 +26,8 @@ default = ["std"]
libcpp = []
# enable allocations
std = []
# enable parse error location support
error-location = []

[package.metadata.docs.rs]
rustdoc-args = ["--cfg", "docsrs"]
29 changes: 29 additions & 0 deletions rust/README.md
@@ -46,6 +46,13 @@ Requires `libc++` to be installed:
merve = { version = "...", features = ["libcpp"] }
```

**error-location**: Enables location-aware parse errors via
`parse_commonjs_with_location`.

```toml
merve = { version = "...", features = ["error-location"] }
```

## API

### `parse_commonjs`
@@ -58,6 +65,17 @@ Parse CommonJS source code and extract export information. The returned
`Analysis` borrows from `source` because export names may point directly into
the source buffer (zero-copy).

### `parse_commonjs_with_location` (`error-location` feature)

```rust
pub fn parse_commonjs_with_location(
    source: &str,
) -> Result<Analysis<'_>, LocatedLexerError>
```

Like `parse_commonjs`, but on failure returns a `LocatedLexerError` that carries
`kind: LexerError` plus an optional location (`line`, `column`, `offset`).

### `Analysis<'a>`

| Method | Returns | Description |
@@ -100,6 +118,17 @@ Returned when the input contains ESM syntax or malformed constructs:

`LexerError` implements `Display` and, with the `std` feature, `std::error::Error`.

### `LocatedLexerError` (`error-location` feature)

```rust
pub struct LocatedLexerError {
    pub kind: LexerError,
    pub location: Option<ErrorLocation>,
}
```

`ErrorLocation` uses 1-based `line`/`column` and 0-based byte `offset`.

### Versioning helpers

```rust
4 changes: 4 additions & 0 deletions rust/build.rs
@@ -219,6 +219,10 @@ fn main() {
build.include(&deps);
build.cpp(true).std("c++20").warnings(false);

if env::var_os("CARGO_FEATURE_ERROR_LOCATION").is_some() {
build.define("MERVE_ENABLE_ERROR_LOCATION", Some("1"));
}

// Target handling
let target_str = env::var("TARGET").unwrap();
let target: Vec<String> = target_str.split('-').map(|s| s.into()).collect();