📚 Refined Explorer — Complete Technical Documentation

A Yazi-like terminal file explorer built in Modern C++ with a persistent LMDB-backed search engine.

📋 Table of Contents

Project Overview
Architecture
Dependencies
Installation & Build
Configuration
Features Deep Dive
Indexing Architecture Deep Dive
Complexity Analysis
Testing
Project Structure
Troubleshooting & FAQs
Future Roadmap

1. Project Overview

Refined Explorer is a high-performance, terminal-based file manager written in C++17. It is inspired by Yazi and Ranger but goes further by embedding a persistent, full-text search engine directly into the file explorer — powered by LMDB (Lightning Memory-Mapped Database).

What makes it unique?

Feature	Standard TUI Explorers	Refined Explorer
Navigation	✅	✅
File Operations	✅	✅
Filename Search	Basic (`find`)	✅ Recursive + Filtered
Content Search	❌ Relies on `grep`	✅ Persistent inverted index (LMDB)
Real-time Index Updates	❌	✅ FSEvents (macOS) / inotify (Linux)
Search Speed	~300-800ms (`grep -R`)	~5-20ms (indexed)

2. Architecture

The application has a clean layered architecture:

┌────────────────────────────────────────────────────┐
│                  Terminal UI Layer                 │
│  (3 panels: metadata | file list | preview)        │
│                  src/tui/                          │
└──────────────────────┬─────────────────────────────┘
                       │ keypresses / commands
┌──────────────────────▼─────────────────────────────┐
│              Application Core Layer                │
│  Navigation · Commands · Selection · Dir Cache     │
│               src/core/                            │
└────────┬──────────────────────────┬────────────────┘
         │                          │
┌────────▼─────────┐   ┌────────────▼────────────────┐
│  Utility Layer   │   │    Indexing & Search Layer   │
│  text_utils.cpp  │   │  inverted_index.cpp          │
│  format.cpp      │   │  traversal.cpp               │
│  file_utils.cpp  │   │  watcher_mac/linux.cpp       │
│  src/utils/      │   │  src/index/                  │
└──────────────────┘   └─────────────────────────────┘
                                   │
                       ┌───────────▼──────────────┐
                       │   LMDB on-disk Database  │
                       │  ~/.cache/refined-explorer│
                       └──────────────────────────┘

Core Data Flow

User launches the app → main.cpp runs loadConfig(), startIndexing(), initializeNavigation(), then navigate().
Background thread starts crawling the indexingRoot, tokenizing files, and persisting them into LMDB.
Watcher thread listens for real-time filesystem events (file created/deleted/renamed/modified) and re-indexes changed files.
User navigates → the UI redraws from the directory cache or a fresh filesystem scan.
User searches → the query is tokenized, word IDs are looked up in LMDB, inode sets are intersected, and paths are resolved.

3. Dependencies

Core Language & Standard Library

Dependency	Version	Purpose
C++ Standard	C++17	`std::filesystem`, structured bindings, `if constexpr`
POSIX	Standard	`stat`, `opendir`, `readdir`, `fcntl`, `signal`, `ioctl`
pthreads	POSIX	`std::thread` background workers

External Libraries

Library	Install	Purpose
LMDB	`brew install lmdb` (macOS) / `apt install liblmdb-dev` (Linux)	Persistent B+ Tree key-value store for the inverted index
CoreServices (macOS only)	Included in Xcode	`FSEvents` API for real-time filesystem monitoring
inotify (Linux only)	Built into Linux kernel	Real-time filesystem event monitoring

Build Tools

Tool	Version	Purpose
CMake	3.10+	Cross-platform build system
Compiler	Apple Clang (macOS) / GCC 11+ (Linux)	C++17 compilation
GoogleTest	v1.14.0	Auto-fetched via `FetchContent` for unit testing

Important

On macOS, you must use Apple Clang (not Homebrew GCC). Apple's CoreServices.framework uses "Blocks" syntax (^) which only Clang supports. Set your compiler with CC=clang CXX=clang++ cmake ..

4. Installation & Build

Step 1 — Install Dependencies

macOS:

brew install lmdb cmake
# Apple Clang is already available via Xcode Command Line Tools
xcode-select --install

Ubuntu/Debian:

sudo apt-get update
sudo apt-get install -y liblmdb-dev cmake g++ build-essential

Step 2 — Clone & Build

# Clone the repo
git clone https://github.com/your-username/refined-explorer.git
cd refined-explorer

# Create build directory
mkdir build && cd build

# Configure (macOS - use Apple Clang explicitly)
CC=clang CXX=clang++ cmake ..

# Build with parallel jobs (nproc is Linux-only; on macOS use -j$(sysctl -n hw.ncpu))
cmake --build . -j$(nproc)

The build outputs two binaries in build/:

refined_explorer — the main application
tests_runner — the automated test suite

Step 3 — Run

# Start from current directory
./build/refined_explorer

# Start from a specific path
./build/refined_explorer /Users/you/Documents

Note

On first run, LMDB opens at ~/.cache/refined-explorer/lmdb/. The background indexer begins crawling the indexingRoot defined in config.yml. This is a one-time crawl; subsequent launches re-use the persistent index and only re-index changed files.

5. Configuration

Create or edit config.yml in the project root:

performance:
  workers: 5            # Thread count for multi-threaded folder size calculation
  indexing: true        # Enable/disable the background LMDB indexer
  indexing_root: /Users/you/Developer   # Root directory to index

Key	Type	Default	Description
`performance.workers`	int	4	Threads for `getFolderSizeMT()` (directory size computation)
`performance.indexing`	bool	false	Master switch for LMDB indexing
`performance.indexing_root`	string	`$HOME`	The root directory the indexer crawls

The config is parsed at startup in system.cpp:loadConfig(). If the file is missing, safe defaults are used.

6. Features Deep Dive

6.1 3-Panel Terminal UI

The UI is divided into three columns rendered with ANSI escape codes directly to stdout, with no ncurses dependency:

┌──────────────────┬────────────────────┬──────────────────────┐
│   LEFT PANEL     │   MIDDLE PANEL     │   RIGHT PANEL        │
│  File Metadata   │   File List        │   Preview            │
│  ─────────────── │   ───────────────  │   ───────────────    │
│  Name: main.cpp  │ > src/             │  #include "..."      │
│  Size: 712 B     │   include/         │  int main() {        │
│  User: varalika  │   tests/           │    loadConfig();     │
│  Perms: rwxr-xr-x│   CMakeLists.txt  │    startIndexing();  │
│  Modified: ...   │   README.md        │    navigate();       │
└──────────────────┴────────────────────┴──────────────────────┘

Left Panel: Populated by file_details.cpp. Calls stat() for file metadata (size, timestamps, permissions), getpwuid() for username, getgrgid() for group name.
Middle Panel: Rendered from app.nav.fileList (the in-memory directory listing from the cache).
Right Panel: Calls isBinaryFile() (reads first 512 bytes), then renders file content or directory listing.
Terminal Resize: SIGWINCH signal is registered to handleResize(). The handler sets a flag, and the main loop calls handleResizeIfNeeded() which re-queries terminal size with ioctl(TIOCGWINSZ) and redraws.
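
The flag-based resize handling described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the TermSize struct, the installResizeHandler() helper, the global flag name, and the 24x80 fallback are assumptions made for the example. Only handleResize(), handleResizeIfNeeded(), SIGWINCH, and ioctl(TIOCGWINSZ) come from the text.

```cpp
#include <csignal>
#include <sys/ioctl.h>
#include <unistd.h>

// Flag written by the signal handler; sig_atomic_t is safe to set from a handler.
static volatile std::sig_atomic_t g_resizePending = 0;

// Async-signal-safe handler: it only records that a resize happened.
extern "C" void handleResize(int) { g_resizePending = 1; }

// Hypothetical container for the terminal dimensions.
struct TermSize { int rows; int cols; };

// Called from the main loop. Returns true when a resize was pending and
// fills `out` with the new size.
bool handleResizeIfNeeded(TermSize& out) {
    if (!g_resizePending) return false;
    g_resizePending = 0;
    winsize ws{};
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0 && ws.ws_row > 0) {
        out = {ws.ws_row, ws.ws_col};
    } else {
        out = {24, 80};  // assumed fallback when stdout is not a terminal
    }
    return true;
}

// Hypothetical setup helper: registers the handler for SIGWINCH.
void installResizeHandler() { std::signal(SIGWINCH, handleResize); }
```

Keeping the handler down to a single flag write avoids calling non-reentrant functions (such as drawing code) from signal context; the redraw happens later on the main thread.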

6.2 Navigation System

Navigation state is managed in NavigatorState (from myheader.h):

struct NavigatorState {
    std::string currPath;       // Current directory being viewed
    std::string prevPath;       // Previously visited directory
    std::vector<std::string> fileList; // Visible files in currPath
    int xcurr    = 1;          // Cursor row on screen (1-indexed)
    int up_screen  = 0;        // Index of first visible file
    int down_screen = 0;       // Count of files scrolled past bottom
    std::stack<NavState> backStack;    // History for ← navigation
    std::stack<NavState> forwardStack; // History for redo
};

Cursor & Scroll Algorithm (navigator.cpp:normalizeRange):

The cursor is clamped to always be within [1, min(rowSize, visibleFiles)]. The scroll offset up_screen is clamped to [0, max(0, total - rowSize)].

if (up_screen < 0)        up_screen = 0
if (up_screen > maxUp)    up_screen = maxUp       // maxUp = max(0, total - rowSize)
if (xcurr < 1)            xcurr = 1
if (xcurr > maxX)         xcurr = maxX            // maxX = min(rowSize, visible)
down_screen = max(0, total - up_screen - rowSize)
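
As a compilable sketch of the clamping rules above: the ScrollState struct and the function signature are assumptions for illustration (the real normalizeRange operates on NavigatorState), and `visible` is assumed here to mean the number of entries from the top of the window downward.

```cpp
#include <algorithm>

// Hypothetical subset of NavigatorState used for this example.
struct ScrollState {
    int xcurr = 1;        // cursor row on screen (1-indexed)
    int up_screen = 0;    // index of first visible file
    int down_screen = 0;  // files below the visible window
};

// total   = number of entries in the directory
// rowSize = number of rows available in the file list panel
void normalizeRange(ScrollState& s, int total, int rowSize) {
    const int maxUp = std::max(0, total - rowSize);
    s.up_screen = std::clamp(s.up_screen, 0, maxUp);

    const int visible = total - s.up_screen;  // assumed definition of "visible"
    const int maxX = std::max(1, std::min(rowSize, visible));
    s.xcurr = std::clamp(s.xcurr, 1, maxX);

    s.down_screen = std::max(0, total - s.up_screen - rowSize);
}
```

Clamping up_screen before xcurr matters: the cursor bound depends on how many entries remain below the window top.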

Keybindings:

Key	Action	Implementation
`↑`	Cursor up (scroll if at top of window)	`navigation.cpp`
`↓`	Cursor down (scroll if at bottom of window)	`navigation.cpp`
`→`	Enter directory / open file	`navigation.cpp:handleEnterAction()`
`←` / `Backspace`	Go to parent	Pops `backStack`
`:cd <path>`	Jump to absolute path	`navigator.cpp:navigateToAbsolutePath()`

6.3 Directory Cache

File: src/core/dir_cache.cpp

To avoid repeated filesystem syscalls (opendir/readdir) when entering the same directory multiple times, entries are cached in memory:

struct CacheState {
    std::unordered_map<std::string, std::vector<std::string>> dirCache;
    const int max_cache_entries = 1000000;
};

Cache Lookup Logic (getDirectoryCount):

Check if dirCache[path] exists → if yes, return the cached listing immediately. O(1) average.
If not, call fs::directory_iterator, collect and sort filenames alphabetically, store in cache, return listing.
For operations that change directory contents (create, delete, rename, paste), invalidateDirCache(path) removes the stale entry so the next visit does a fresh scan.

Complexity: Cache hit → O(1). Cache miss → O(N log N) where N = number of files in directory (for sorting).
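
The lookup and invalidation logic might look roughly like the sketch below. The DirCacheSketch type and its list()/invalidate() method names are hypothetical; the source only names dirCache, getDirectoryCount, and invalidateDirCache(path), and the eviction policy tied to max_cache_entries is not shown.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <system_error>
#include <unordered_map>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

struct DirCacheSketch {
    std::unordered_map<std::string, std::vector<std::string>> dirCache;

    // Returns the sorted listing for `path`, scanning the filesystem only on a miss.
    const std::vector<std::string>& list(const std::string& path) {
        auto it = dirCache.find(path);
        if (it != dirCache.end()) return it->second;  // cache hit: O(1) average

        std::vector<std::string> names;
        std::error_code ec;
        for (fs::directory_iterator di(path, ec), end; !ec && di != end; di.increment(ec))
            names.push_back(di->path().filename().string());
        std::sort(names.begin(), names.end());        // cache miss: O(N log N)
        return dirCache.emplace(path, std::move(names)).first->second;
    }

    // Drop a stale entry after create/delete/rename/paste.
    void invalidate(const std::string& path) { dirCache.erase(path); }
};
```

Until invalidate() is called, a directory changed on disk keeps returning the old listing, which is why every mutating command has to invalidate the affected path.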

6.4 File Operations

All operations are implemented in src/core/commands.cpp:

Copy (`c` key)

If selectedFiles is non-empty:
    clipboard ← all paths in selectedFiles
Else:
    clipboard ← [currPath/fileAtCursor]

Uses app.selection.clipboard (a std::vector<std::string>)
Complexity: O(S) where S = number of selected files

Paste (`p` key)

Uses std::filesystem::copy with overwrite_existing flag.
Recursive directory copy via copy_options::recursive.
Complexity: O(F) where F = total files being copied

Delete (`d` key)

Uses std::filesystem::remove_all for full recursive deletion.
Clears the selection state after.
Complexity: O(F) where F = total files being deleted

Rename (`:rename <new_name>`)

Uses POSIX rename() syscall — atomic on the same filesystem.
Complexity: O(1)

Create File (`:create_file <name>`)

Uses std::ofstream to create an empty file.
Complexity: O(1)

Create Directory (`:create_dir <name>`)

Uses POSIX mkdir() syscall with mode 0777; the effective permissions are reduced by the process umask.
Complexity: O(1)

6.5 Selection System

The selection state is a std::unordered_set<std::string> of absolute file paths:

struct SelectionState {
    std::vector<std::string> clipboard;
    std::unordered_set<std::string> selectedFiles;
};

Key	Action
`Space`	Toggle current file in/out of `selectedFiles`
`u`	Clear all selections (`selectedFiles.clear()`)
`c` / `d`	Operate on all `selectedFiles`

Toggle complexity: O(1) average (hash set insert/erase).
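
A toggle over the hash set can be written in a single insert/erase pass. The free function below is an illustrative sketch; its name and signature are assumptions, not the project's API.

```cpp
#include <string>
#include <unordered_set>

// Toggle `path` in the selection; returns true if the path is selected afterwards.
bool toggleSelection(std::unordered_set<std::string>& selectedFiles, const std::string& path) {
    auto [it, inserted] = selectedFiles.insert(path);
    if (!inserted) selectedFiles.erase(it);  // already selected: deselect
    return inserted;
}
```

Using the result of insert() avoids a separate find() call, so each toggle costs one hash lookup on average.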

6.6 Command Mode

Press : to enter command mode. The input loop reads a line of text and calls processCommand(commandLine) in src/core/command_processor.cpp.

The processor splits the input into tokens and dispatches:

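std::vector<std::string> args;
std::stringstream ss(commandLine);
std::string word;
while (ss >> word) args.push_back(word);   // split on whitespace

if (args.empty()) return;                  // blank input: nothing to dispatch (assumes a void function)
std::string command = args[0];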

Full Command Reference:

Command	Example	Description
`:rename <new>`	`:rename report_v2.md`	Rename the file under cursor
`:create_file <n>`	`:create_file todo.txt`	Create a new empty file
`:create_dir <n>`	`:create_dir projects`	Create a new directory
`:cd <path>`	`:cd /Users/me/docs`	Jump to an absolute path (supports spaces)
`:search <q>`	`:search readme`	Recursive filename search
`:search --file <q>`	`:search --file config`	Search files only
`:search --dir <q>`	`:search --dir test`	Search directories only
`:find <tokens>`	`:find async lambda`	Content search via LMDB index (AND)
`:find --dir <tokens>`	`:find --dir python script`	Content search, filtered to current dir
`:help`	`:help`	Show in-app help
`:q`	`:q`	Exit command mode
`:exit`	`:exit`	Quit the application

Note

:cd supports paths with spaces because the parser joins all tokens after cd with a space: for (int i = 2; i < args.size(); i++) absPath += " " + args[i];
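
The tokenize-then-rejoin behaviour can be sketched as two small helpers. The names splitArgs and joinCdPath are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a command line on whitespace.
std::vector<std::string> splitArgs(const std::string& commandLine) {
    std::vector<std::string> args;
    std::istringstream ss(commandLine);
    for (std::string word; ss >> word;) args.push_back(word);
    return args;
}

// Rebuild the path argument of `cd` from args[1..], joining with single spaces.
std::string joinCdPath(const std::vector<std::string>& args) {
    if (args.size() < 2) return "";
    std::string absPath = args[1];
    for (std::size_t i = 2; i < args.size(); ++i) absPath += " " + args[i];
    return absPath;
}
```

One consequence of this approach is that runs of several spaces inside a path collapse to a single space, so a directory whose name contains consecutive spaces cannot be reached this way.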

6.7 Filename Search (`:search`)

File: src/core/search_engine.cpp, src/core/commands.cpp:searchCommand

This is a recursive directory crawl starting from app.nav.currPath. It does not use the index.

Process:

Lowercase the query: transform(..., ::tolower)
Call searchAnything(path, filename, check_file, check_dir) which uses fs::recursive_directory_iterator.
Time the search and log it.
Display results with displaySearchResults().

Flags:

:search <q> → searches both files and directories
:search --file <q> → check_dir = false
:search --dir <q> → check_file = false

Complexity: O(N) where N = total files and directories under current path (linear scan).
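
A linear crawl of this kind could be sketched as below. The searchNames function, its parameters, and the choice to skip unreadable directories are assumptions for illustration; the project's searchAnything may differ in signature and error handling.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Case-insensitive substring match on entry names under `root`.
std::vector<std::string> searchNames(const fs::path& root, const std::string& query,
                                     bool checkFile, bool checkDir) {
    std::vector<std::string> hits;
    const std::string needle = toLower(query);
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec), end;
    for (; !ec && it != end; it.increment(ec)) {
        const bool isDir = it->is_directory(ec);
        if (ec) { ec.clear(); continue; }                 // entry vanished or unreadable
        if ((isDir && !checkDir) || (!isDir && !checkFile)) continue;
        if (toLower(it->path().filename().string()).find(needle) != std::string::npos)
            hits.push_back(it->path().string());
    }
    return hits;
}
```

Passing checkDir = false corresponds to `:search --file`, and checkFile = false to `:search --dir`.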

6.8 Content Search (`:find`)

File: src/index/inverted_index.cpp:search()

This uses the persistent LMDB inverted index for fast multi-word content search with AND semantics: a file matches only if it contains every query word.

Step-by-step process:

Step 1 — Tokenize the query:

// Query: "async lambda"
// After normalizeWord():  tokens = ["async", "lambda"]

Step 2 — Open a read-only LMDB transaction

Step 3 — For each token, look up its ID and collect matching inodes:

db_word2id["async"] → word_id = 42
db_inverted[42]     → {ino=1001, ino=2040, ino=5500}  // files containing "async"

db_word2id["lambda"] → word_id = 71
db_inverted[71]      → {ino=2040, ino=5500}  // files containing "lambda"

A fileCounter[ino]++ map counts how many query words each inode matches.

Step 4 — Intersect results (AND semantics):

int required = tokens.size(); // 2
for (auto& [ino, cnt] : fileCounter)
    if (cnt == required)  // only files matching ALL words
        resolve_path(ino);
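
The counting approach in Steps 3 and 4 can be exercised in isolation with in-memory posting lists. In this sketch the intersectAll function and the Inode alias are hypothetical, and two assumptions are made: the query tokens are distinct, and each posting list contains an inode at most once (which DUPSORT storage would give).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Inode = std::uint64_t;

// postings[i] is the posting list (inodes) for the i-th distinct query token.
// Returns the inodes present in every list, sorted ascending.
std::vector<Inode> intersectAll(const std::vector<std::vector<Inode>>& postings) {
    std::unordered_map<Inode, std::size_t> fileCounter;
    for (const auto& list : postings)
        for (Inode ino : list) ++fileCounter[ino];

    std::vector<Inode> result;
    const std::size_t required = postings.size();
    for (const auto& [ino, cnt] : fileCounter)
        if (cnt == required) result.push_back(ino);  // matched every token
    std::sort(result.begin(), result.end());
    return result;
}
```

With the posting lists from Step 3, {1001, 2040, 5500} for "async" and {2040, 5500} for "lambda", only inodes 2040 and 5500 reach a count of 2 and survive.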

Step 5 — Resolve inode → path (macOS):

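// macOS-specific volfs path resolution (rootDev and ino are assumed to be numeric here)
std::string volPath = "/.vol/" + std::to_string(rootDev) + "/" + std::to_string(ino);
int fd = open(volPath.c_str(), O_RDONLY);   // open the file by device + inode
char pathBuf[MAXPATHLEN];                   // F_GETPATH requires a MAXPATHLEN-sized buffer
if (fd >= 0 && fcntl(fd, F_GETPATH, pathBuf) != -1) { /* kernel wrote the absolute path into pathBuf */ }
if (fd >= 0) close(fd);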

Complexity Summary:

Operation	Complexity	Note
Word ID lookup	O(log V)	V = vocabulary size; LMDB B+ tree lookup
Inode set retrieval	O(K)	K = number of files with this word
Inode intersection	O(W × K)	W = words in query, K = avg posting list size
Path resolution	O(R)	R = matching results
Total search	≈ O(W × K)	A few milliseconds for typical queries (about 5-20 ms, per the overview table)

6.9 Real-Time Filesystem Watcher

The explorer tracks filesystem changes so the index stays fresh when you add, delete, or rename files.

Platform-specific implementation:

Platform	API Used	File
macOS	`FSEvents` (Apple CoreServices)	`watcher_mac.cpp`
Linux	`inotify`	`watcher_linux.cpp`

Why not kqueue on macOS?
kqueue requires one open file descriptor per watched directory. For large trees (100k+ directories), this quickly exhausts the per-process file descriptor limit. FSEvents monitors entire directory subtrees with a single handle.

macOS FSEvents Flow:

FSEventStreamCreate() — registers callback for the indexingRoot.
Runs on a dedicated CFRunLoop thread.
Events are pushed into app.indexing.eventQueue (mutex-protected).
Background worker thread consumes events and calls index.indexPath() or index.removePath().
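
The hand-off between the watcher thread and the worker can be sketched as a small mutex-protected queue. The EventQueue class, the WatcherEvent struct, and the use of a condition variable with a close() method are assumptions for illustration; the source only states that app.indexing.eventQueue is mutex-protected and names the WatcherEventType values.

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <utility>

enum class WatcherEventType { CREATE, DELETE, RENAME, MODIFY };

// Hypothetical event record: what happened and to which path.
struct WatcherEvent {
    WatcherEventType type;
    std::string path;
};

class EventQueue {
public:
    // Called from the watcher (producer) thread.
    void push(WatcherEvent ev) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push(std::move(ev));
        }
        cv_.notify_one();
    }

    // Called from the worker (consumer) thread. Blocks until an event is
    // available; returns std::nullopt once the queue is closed and drained.
    std::optional<WatcherEvent> pop() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return std::nullopt;
        WatcherEvent ev = std::move(q_.front());
        q_.pop();
        return ev;
    }

    // Signals shutdown so a blocked consumer can exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            closed_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<WatcherEvent> q_;
    bool closed_ = false;
};
```

A worker loop would call pop() repeatedly and dispatch CREATE/MODIFY events to index.indexPath() and DELETE events to index.removePath(), stopping when pop() returns no value.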

Event Types:

Flag	Mapped To
`kFSEventStreamEventFlagItemCreated`	`WatcherEventType::CREATE`
`kFSEventStreamEventFlagItemRemoved`	`WatcherEventType::DELETE`
`kFSEventStreamEventFlagItemRenamed`	`WatcherEventType::RENAME`
`kFSEventStreamEventFlagItemModified`	`WatcherEventType::MODIFY`

7. Indexing Architecture Deep Dive

LMDB Database Schema

The index is stored in a single LMDB environment at ~/.cache/refined-explorer/lmdb/ with 5 named sub-databases:

db_files    :  ino    (uint64) → mtime   (uint64)    [unique, INTEGERKEY]
db_word2id  :  word   (str)    → word_id (uint32)    [unique]
db_id2word  :  word_id(uint32) → word    (str)       [unique, INTEGERKEY]
db_inverted :  word_id(uint32) →+ ino   (uint64)    [DUPSORT | DUPFIXED]
db_forward  :  ino    (uint64) →+ word_id(uint32)   [DUPSORT | DUPFIXED]

The →+ notation means one key maps to multiple sorted values (DUPSORT).

Why Inodes Instead of Paths?

Storing full path strings for every file was the main source of index bloat. Instead, we store inodes (file IDs assigned by the filesystem, unique within a single volume):

Each inode is a uint64_t (8 bytes) vs a full path string (50-200+ bytes).
Inodes survive renames — if you rename a file, its inode doesn't change.
On macOS, inodes are resolved to paths at query time using /.vol/<dev>/<ino> and fcntl(F_GETPATH).

Word Normalization Pipeline

Every token from a filename or file content goes through normalizeWord():

Input: "Hello_World!"
  ↓ character filter (allow alnum + [@#_-$&])
  ""   ← '!' is not allowed → entire word rejected

The filter is all-or-nothing: if any character is disallowed, the entire word is rejected rather than stripped:

for (char c : word) {
    if (!(isalnum(c) || c == '@' || ... ))
        return "";  // reject entire word
}

A word that passes the filter, such as "Hello_World", then goes through three more steps:

Lowercased: "hello_world"
Stopword check: removed if in the 100+ word stopword list
Length limit: rejected if ≥ 128 characters
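
Put together, the pipeline might look like the sketch below. The allowed character set and the 128-character limit come from the description above; the four-entry stopword set is a hypothetical stand-in for the real list, and the length check is done first here purely for convenience (the outcome is the same).

```cpp
#include <cctype>
#include <string>
#include <unordered_set>

// Hypothetical, tiny stopword list for illustration; the real list has 100+ entries.
static const std::unordered_set<std::string> kStopwords = {"the", "and", "is", "of"};

// Returns the normalized token, or "" when the word is rejected.
std::string normalizeWord(const std::string& word) {
    if (word.empty() || word.size() >= 128) return "";   // length limit
    std::string out;
    out.reserve(word.size());
    for (unsigned char c : word) {
        const bool allowed = std::isalnum(c) || c == '@' || c == '#' || c == '_' ||
                             c == '-' || c == '$' || c == '&';
        if (!allowed) return "";                          // one bad char rejects the word
        out.push_back(static_cast<char>(std::tolower(c)));
    }
    if (kStopwords.count(out)) return "";                 // stopword check
    return out;
}
```

Iterating as unsigned char avoids undefined behaviour when std::isalnum or std::tolower receives a negative char value from non-ASCII input.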

Differential Indexing (Startup Crawl)

On every launch, the indexer does a differential crawl instead of a full re-index:

getLastSyncTime() reads the last indexed timestamp from db_files[ino=0] (a sentinel metadata entry).
During traversal, each file's mtime is compared to its stored mtime.
Only new or modified files are re-indexed.
setLastSyncTime(now) updates the sentinel.

This makes subsequent startups very fast — only changed files are processed.
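
The per-file decision made during the differential crawl can be sketched with an in-memory map standing in for db_files. The FileStamp struct and the selectChanged function are hypothetical names; the real implementation reads and writes LMDB instead of an unordered_map.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical record produced by the traversal for each file.
struct FileStamp {
    std::uint64_t ino;
    std::uint64_t mtime;
};

// stored: ino -> mtime recorded at the last crawl (stand-in for db_files).
// Returns the inodes that need (re)indexing and updates `stored` in place.
std::vector<std::uint64_t> selectChanged(
        std::unordered_map<std::uint64_t, std::uint64_t>& stored,
        const std::vector<FileStamp>& seen) {
    std::vector<std::uint64_t> toIndex;
    for (const auto& f : seen) {
        auto it = stored.find(f.ino);
        if (it == stored.end() || it->second != f.mtime) {  // new or modified
            toIndex.push_back(f.ino);
            stored[f.ino] = f.mtime;
        }
    }
    return toIndex;
}
```

Running the same crawl twice without any filesystem change yields an empty work list the second time, which is what makes later startups cheap.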

8. Complexity Analysis

Summary Table

Feature	Time Complexity	Space Complexity	Notes
Directory listing (cache hit)	O(1)	O(N) cached	N = files in dir
Directory listing (cache miss)	O(N log N)	O(N)	N = files in dir (sort)
Cursor move Up/Down	O(1)	O(1)	Simple counter update
`:cd` navigate absolute	O(P)	O(P)	P = path segments
`:search` (filename)	O(N)	O(R)	N = all files in subtree
`:find` (content, LMDB)	O(W × K)	O(R)	W = words, K = avg posting size
Word ID lookup (LMDB)	O(log V)	—	V = vocab size, B+ tree
Add file to index	O(T × log V)	O(T)	T = tokens in file
Remove file from index	O(T × log V)	—	Scans forward index
Copy N files	O(F)	—	F = total files copied (recursive)
Delete N files	O(F)	—	Recursive remove
Toggle selection	O(1) avg	—	Hash set
Binary file detection	O(1)	—	Reads first 512 bytes only

Search Performance vs grep

Query: "async lambda" (in a project with 50,000 files)

grep -r "async" . | grep "lambda"   →  ~400ms   (reads every file)
:find async lambda                  →  ~8ms     (indexed: O(log V) LMDB lookup per word)

9. Testing

The project uses GoogleTest (v1.14.0) for automated unit testing.

Running Tests

cd build
./tests_runner
# or with JSON output:
./tests_runner --gtest_output=json:test_results.json

Tests also run automatically whenever cmake --build . rebuilds the tests_runner target, via the POST_BUILD CMake hook (a build that leaves tests_runner up to date does not rerun them):

add_custom_command(TARGET tests_runner POST_BUILD
    COMMAND tests_runner --gtest_output=json:test_results.json
    COMMAND ${CMAKE_COMMAND} -E remove_directory "${CMAKE_SOURCE_DIR}/tests/dummy"
    COMMENT "Running automated tests and cleaning up..."
)

Test Coverage (110 Cases, 10 Modules)

Module	File	Cases	What's Tested
TextUtils	`test_utils.cpp`	24	`normalizeWord`: symbols, stopwords, case, length limits
FormatUtils	`test_utils.cpp`	13	`humanReadableSize` (B→TB), `truncateStr` edge cases
FileUtils	`test_file_utils.cpp`	12	`isReadable`, `isBinaryFile`, `isDirectory`, permissions
Navigation	`test_navigation.cpp`	13	`normalizeRange`, `scrollToIndex`, `isUnderCurrentDir`
DirCache	`test_dir_cache.cpp`	6	Cache hit/miss, `invalidateCache`, sorted order
Commands	`test_commands.cpp`	15	`cd` with spaces, create/rename/delete, paste collision
Search	`test_search.cpp`	6	Partial match, case insensitive, `--file`/`--dir` flags
InvertedIndex	`test_index.cpp`	12	LMDB open/close, multi-word AND, persistence, word limits
Config	`test_config.cpp`	3	Default values, toggle indexing, worker count
Selection	`test_selection.cpp`	6	Single/multi select, clear, clipboard update
TOTAL		110	100% Pass Rate ✅

Sandbox Directory

Tests use an isolated tests/dummy/ directory created and filled at test start. This directory is automatically deleted after tests complete (via the CMake POST_BUILD hook), so it never pollutes the workspace.

10. Project Structure

refined-explorer/
├── CMakeLists.txt          # Build system: core_lib, refined_explorer, tests_runner
├── config.yml              # Runtime configuration (workers, indexing root)
├── main.cpp                # Entry point: loadConfig → startIndexing → navigate
├── include/
│   └── myheader.h          # Single project-wide header: all structs, enums, extern declarations
├── src/
│   ├── core/               # Application logic
│   │   ├── command_processor.cpp  # Parses & dispatches : commands
│   │   ├── commands.cpp           # copy, paste, delete, rename, search
│   │   ├── dir_cache.cpp          # Directory listing cache (unordered_map)
│   │   ├── file_details.cpp       # Left panel: stat(), getpwuid(), getgrgid()
│   │   ├── file_utils.cpp         # isDirectory, isRegularFile, isReadable, isBinaryFile
│   │   ├── navigation.cpp         # Key event loop, scroll logic, enter/back
│   │   ├── navigation_init.cpp    # Builds backStack from startup path
│   │   ├── navigator.cpp          # normalizeRange, scrollToIndex, navigateToAbsolutePath
│   │   ├── search_engine.cpp      # :search recursive crawl
│   │   └── system.cpp             # loadConfig, signal handlers, getFolderSizeMT
│   ├── index/              # Indexing engine
│   │   ├── inverted_index.cpp     # LMDB: open/close/indexPath/removePath/search
│   │   ├── index_runner.cpp       # startIndexing: opens LMDB, starts watcher + worker
│   │   ├── traversal.cpp          # fd-based recursive directory traversal for crawl
│   │   ├── watcher_mac.cpp        # FSEvents watcher (macOS)
│   │   └── watcher_linux.cpp      # inotify watcher (Linux)
│   ├── tui/                # Terminal UI rendering
│   │   └── ...             # render functions for 3 panels, status bar, help screen
│   └── utils/              # Shared utilities
│       ├── format.cpp             # humanReadableSize, truncateStr
│       └── text_utils.cpp         # normalizeWord, STOPWORDS set
├── tests/
│   ├── test_main.cpp       # GTest environment setup
│   ├── stubs.cpp           # Stubs for functions requiring a live terminal
│   ├── test_commands.cpp
│   ├── test_config.cpp
│   ├── test_dir_cache.cpp
│   ├── test_file_utils.cpp
│   ├── test_index.cpp
│   ├── test_navigation.cpp
│   ├── test_search.cpp
│   ├── test_selection.cpp
│   └── test_utils.cpp
├── build/                  # CMake build artifacts (generated, not committed)
│   ├── refined_explorer    # Main binary
│   ├── tests_runner        # Test binary
│   └── test_results.json   # Latest test report
├── logs/
│   └── debug.log           # Runtime log output from logMessage()
├── README.md               # Quick-start guide
├── DOCUMENTATION.md        # This file
└── notes.md                # Development journal

11. Troubleshooting & FAQs

"LMDB: cannot open environment"

Ensure ~/.cache/refined-explorer/ is writable.
Run: mkdir -p ~/.cache/refined-explorer/lmdb

`:find` returns no results

Check that indexing: true is set in config.yml.
Check logs/debug.log for "LMDB Inode-Index opened" — if absent, LMDB failed to open.
Wait ~5-30 seconds after launch for the initial crawl to complete.
Try :search as a fallback — it doesn't need the index.

"Binary File" shown in preview

The file's first 512 bytes contain a null byte or non-printable character.
isBinaryFile() uses this heuristic. This is expected for images, .pdf, compiled binaries, etc.

Build fails on macOS with CoreServices error

Switch from Homebrew GCC to Apple Clang: CC=clang CXX=clang++ cmake ..

`refined_explorer` binary was accidentally deleted

Simply rebuild: cd build && cmake .. && cmake --build . -j8

Why is the `build/` folder ~38MB?

What	Size	Needed?
`_deps/` (GoogleTest source)	~23 MB	Only for re-building tests
`CMakeFiles/` (object files)	~7.6 MB	Only for incremental builds
`lib/` (compiled static libs)	~2.2 MB	Only for linking
Binaries (`refined_explorer`, `tests_runner`)	~2.5 MB	Yes, to run

You can safely delete everything except the binaries and test_results.json if you're not doing active development.

12. Future Roadmap

🔥 High Priority

SIMD-Accelerated Set Intersection

Instead of iterating through posting lists with a hash counter, use SIMD (AVX2/NEON) instructions to do bitwise AND across compressed inode bitsets — targeting sub-millisecond query latency for queries spanning millions of files.

PDF / DOCX Format Support

Integrate Poppler (PDF) or a lightweight XML parser (DOCX is ZIP+XML) to extract text content and tokenize it for indexing. This turns the explorer into a universal document search engine.

Delta Inode Encoding

Instead of storing raw 8-byte inodes, store the delta between consecutive sorted inodes:

Full:  [1001, 1006, 1015, 1100]
Delta: [1001,    5,    9,   85]

Small deltas → better compression → smaller index. Especially effective when inodes are clustered.
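
The encoding and its inverse are short enough to sketch directly. The function names deltaEncode and deltaDecode are hypothetical, and this roadmap item is not implemented in the project yet; the sketch only illustrates the transformation shown above.

```cpp
#include <cstdint>
#include <vector>

// Input must be sorted ascending; the first element is stored as-is.
std::vector<std::uint64_t> deltaEncode(const std::vector<std::uint64_t>& sorted) {
    std::vector<std::uint64_t> out;
    out.reserve(sorted.size());
    std::uint64_t prev = 0;
    for (std::uint64_t v : sorted) {
        out.push_back(v - prev);  // gap to the previous inode
        prev = v;
    }
    return out;
}

// Rebuilds the original sorted list with a running sum.
std::vector<std::uint64_t> deltaDecode(const std::vector<std::uint64_t>& deltas) {
    std::vector<std::uint64_t> out;
    out.reserve(deltas.size());
    std::uint64_t acc = 0;
    for (std::uint64_t d : deltas) {
        acc += d;
        out.push_back(acc);
    }
    return out;
}
```

On its own the delta list is no smaller, since each gap still occupies 8 bytes here; the space saving only appears once the gaps are stored with a variable-length integer encoding.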

🧠 Future Vision

Trie-Based Prefix Search (Autocomplete)

Build a Trie over the word dictionary:

         root
          ├── a
          │   └── s
          │       └── y
          │           └── n
          │               └── c  ← word_id=42 (isEnd=true)
          └── l
              └── ...

A prefix query like "asy" traverses the trie and collects all word_ids below that node. This enables instant search-as-you-type autocomplete in command mode.
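
A minimal trie with prefix collection might look like this. The WordTrie and TrieNode types are hypothetical, the word_id 43 used in the usage note is invented for the example, and an ordered std::map is used for child links for simplicity rather than speed.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool isEnd = false;         // true if a word ends at this node
    std::uint32_t wordId = 0;   // valid only when isEnd is true
};

class WordTrie {
public:
    void insert(const std::string& word, std::uint32_t wordId) {
        TrieNode* node = &root_;
        for (char c : word) {
            auto& child = node->children[c];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->isEnd = true;
        node->wordId = wordId;
    }

    // Collect the word_ids of every word that starts with `prefix`.
    std::vector<std::uint32_t> withPrefix(const std::string& prefix) const {
        const TrieNode* node = &root_;
        for (char c : prefix) {
            auto it = node->children.find(c);
            if (it == node->children.end()) return {};
            node = it->second.get();
        }
        std::vector<std::uint32_t> ids;
        collect(*node, ids);
        return ids;
    }

private:
    // Pre-order walk of the subtree below `node`.
    static void collect(const TrieNode& node, std::vector<std::uint32_t>& ids) {
        if (node.isEnd) ids.push_back(node.wordId);
        for (const auto& entry : node.children) collect(*entry.second, ids);
    }

    TrieNode root_;
};
```

After inserting "async" (42) and a second word "asynchronous" (43), withPrefix("asy") returns both IDs, which could then be fed into the same posting-list lookup that `:find` uses.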

Cross-Volume Indexing

Currently limited to a single indexingRoot. Future: support multiple roots with separate LMDB environments or a partition table.

Documentation auto-generated from source code at version: April 2026.
