Commit 9f5aa6d

docs: add ARCHITECTURE.md with detailed technical documentation
1 parent 13d6028 commit 9f5aa6d

1 file changed

Lines changed: 310 additions & 0 deletions

ARCHITECTURE.md

@@ -0,0 +1,310 @@
# TrueEntropy Architecture

## Overview

TrueEntropy harvests entropy from real-world sources and converts it into cryptographically secure random values. This document explains how each entropy source is collected and transformed into usable random numbers.

## System Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                               PUBLIC API                                │
│  trueentropy.random() / randint() / choice() / shuffle() / ...          │
└─────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────┐
│                          ENTROPY TAP (tap.py)                           │
│  Converts raw bytes into usable values (floats, ints, booleans)         │
└─────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────┐
│                         ENTROPY POOL (pool.py)                          │
│  512-byte buffer with SHA-256 whitening and thread-safe access          │
└─────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────┐
│                        HARVESTERS (harvesters/)                         │
│  timing | network | system | external | weather | radioactive           │
└─────────────────────────────────────────────────────────────────────────┘
```

---

## Entropy Sources

### 1. Timing Jitter (timing.py)

**Source**: CPU instruction timing variations

**How it works**:
```python
import struct
import time

iterations = 64  # illustrative sample count, not necessarily the harvester's setting

measurements = []
for _ in range(iterations):
    start = time.perf_counter_ns()
    # Perform a fixed amount of CPU work; its duration varies from run to run
    for _ in range(1000):
        _ = 1 + 1
    end = time.perf_counter_ns()
    measurements.append(end - start)

# Pack each elapsed time as a big-endian unsigned 64-bit integer
data = struct.pack(f"!{len(measurements)}Q", *measurements)
```

**Why it's random**:
- CPU scheduling is non-deterministic
- Cache hits/misses vary unpredictably
- Other processes create interference
- `perf_counter_ns()` reports elapsed time in nanoseconds, fine enough to capture this jitter

**Entropy estimate**: ~32 bits per collection
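
One rough way to sanity-check an estimate like this is to look at how often the most common value occurs among the low-order bytes of the samples. The sketch below is illustrative only: `collect_jitter` and `min_entropy_bits` are hypothetical helper names, not part of `timing.py`, and the most-common-value figure ignores correlation between consecutive samples, so it is a quick plausibility check and not a proper entropy assessment.

```python
import math
import time
from collections import Counter


def collect_jitter(iterations: int = 256) -> list[int]:
    """Time a fixed busy loop repeatedly; returns elapsed nanoseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        for _ in range(1000):
            pass
        samples.append(time.perf_counter_ns() - start)
    return samples


def min_entropy_bits(symbols: list[int]) -> float:
    """Most-common-value estimate: -log2(p_max) bits per symbol."""
    if not symbols:
        raise ValueError("need at least one symbol")
    p_max = Counter(symbols).most_common(1)[0][1] / len(symbols)
    return -math.log2(p_max)


samples = collect_jitter()
low_bytes = [s & 0xFF for s in samples]  # keep only the low 8 bits of each sample
print(f"about {min_entropy_bits(low_bytes):.2f} bits per sample by this estimate")
```

The result depends heavily on the machine and its timer resolution, which is one reason to treat any fixed per-collection figure as an approximation.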

---

### 2. Network Latency (network.py)

**Source**: Round-trip time to remote servers

**How it works**:
```python
import struct
import time

import requests

targets = ["https://1.1.1.1", "https://8.8.8.8", "https://google.com"]
measurements = []

for target in targets:
    start = time.perf_counter_ns()
    try:
        requests.head(target, timeout=2)
    except requests.RequestException:
        continue  # Skip unreachable targets instead of aborting the collection
    end = time.perf_counter_ns()

    latency_ns = end - start  # e.g., 64,197,532 ns
    measurements.append(latency_ns)

# One unsigned 64-bit value per successful request
data = struct.pack(f"!{len(measurements)}Q", *measurements)
```

**Why it's random**:
- Network congestion varies constantly
- Routing paths change dynamically
- Server load fluctuates
- Physical infrastructure conditions vary

**Entropy estimate**: ~8 bits per server

---

### 3. System State (system.py)

**Source**: Volatile system metrics via psutil

**Metrics collected**:
- Available RAM (bytes)
- CPU usage per core (%)
- Process count and PIDs
- Disk I/O counters
- Network I/O counters
- Timestamps (nanoseconds)

**How it works**:
```python
import struct

import psutil

metrics = []
metrics.append(("ram", psutil.virtual_memory().available))
metrics.append(("cpu", psutil.cpu_percent()))  # the first call in a process returns 0.0
metrics.append(("pids", len(psutil.pids())))
# ... more metrics

data = b""
for name, value in metrics:
    int_value = int(value * 1000000)  # Scale so fractional digits survive the int cast
    data += struct.pack("!Q", int_value)
```

**Why it's random**:
- RAM allocation changes with every running program
- CPU usage fluctuates rapidly
- Processes start and stop constantly

**Entropy estimate**: ~6 bits per metric

---

### 4. External APIs (external.py)

**Sources**:
- USGS Earthquake data (seismic activity)
- Cryptocurrency prices (market volatility)

**How it works**:
```python
import struct

import requests

# Earthquake data
response = requests.get("https://earthquake.usgs.gov/...", timeout=5)
earthquakes = response.json()["features"]

data = b""
for eq in earthquakes:
    magnitude = eq["properties"]["mag"]  # 4.7
    if magnitude is None:
        continue  # "mag" can be null in the feed; struct cannot pack None
    # GeoJSON orders coordinates as [longitude, latitude, depth]
    lon = eq["geometry"]["coordinates"][0]
    lat = eq["geometry"]["coordinates"][1]

    data += struct.pack("!d", magnitude)
    data += struct.pack("!dd", lat, lon)
```

**Why it's random**:
- Earthquakes are physically unpredictable
- Financial markets are chaotic systems

**Entropy estimate**: ~32 bits per collection
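
The packing step can be exercised offline against a hand-written payload. The sketch below assumes the GeoJSON layout, where `coordinates` is `[longitude, latitude, depth]`; the sample values and the `pack_earthquakes` name are made up for illustration and are not taken from `external.py`.

```python
import struct


def pack_earthquakes(features: list[dict]) -> bytes:
    """Pack magnitude, latitude and longitude of each event as big-endian doubles."""
    out = b""
    for eq in features:
        magnitude = eq["properties"]["mag"]
        if magnitude is None:  # skip events without a reported magnitude
            continue
        lon, lat = eq["geometry"]["coordinates"][:2]  # GeoJSON: longitude first
        out += struct.pack("!ddd", magnitude, lat, lon)
    return out


# Made-up sample payload, shaped like the "features" list of a FeatureCollection
sample = [
    {"properties": {"mag": 4.7}, "geometry": {"coordinates": [142.37, 38.32, 29.0]}},
    {"properties": {"mag": None}, "geometry": {"coordinates": [-122.8, 38.8, 2.1]}},
]

packed = pack_earthquakes(sample)
print(len(packed))  # 24: three 8-byte doubles for the one usable event
```

Testing against a fixed payload like this keeps the harvester's parsing logic verifiable without a network connection.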

---

### 5. Weather Data (weather.py)

**Sources**: OpenWeatherMap API or wttr.in

**Metrics**: Temperature, humidity, pressure, wind speed

**How it works**:
```python
import struct

cities = ["London", "Tokyo", "New York", "Sydney"]

data = b""
for city in cities:
    weather = fetch_weather(city)  # helper not shown here

    # Multiply to preserve decimal precision; round() avoids float truncation
    temp = round(weather["temp"] * 10000)        # 23.47°C → 234700
    humidity = round(weather["humidity"] * 100)  # 67.3% → 6730
    pressure = round(weather["pressure"] * 100)  # 1013.25 → 101325

    # Temperature is packed signed ("q") because it can be below zero
    data += struct.pack("!qQQ", temp, humidity, pressure)
```

**Why it's random**:
- Weather changes constantly
- Decimal places vary unpredictably
- Multiple cities provide independent sources

**Entropy estimate**: ~8 bits per metric
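
Two details of scaled-integer packing are easy to get wrong: `int()` truncates toward zero, and an unsigned `Q` field cannot hold a sub-zero temperature. The following sketch shows one way to handle both. `pack_reading` is a hypothetical helper and the readings are invented.

```python
import struct


def pack_reading(temp_c: float, humidity_pct: float, pressure_hpa: float) -> bytes:
    """Pack one weather reading as scaled integers (signed temp, unsigned rest)."""
    temp = round(temp_c * 10000)  # signed: may be negative
    humidity = round(humidity_pct * 100)
    pressure = round(pressure_hpa * 100)
    return struct.pack("!qQQ", temp, humidity, pressure)


winter = pack_reading(-5.25, 67.3, 1013.25)
print(struct.unpack("!qQQ", winter))  # (-52500, 6730, 101325)

try:
    struct.pack("!Q", round(-5.25 * 10000))  # unsigned field rejects negatives
except struct.error as exc:
    print("unsigned pack failed:", exc)
```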

---

### 6. Quantum Random (radioactive.py)

**Sources**:
- ANU QRNG (quantum vacuum fluctuations)
- random.org (atmospheric noise)

**How it works**:
```python
import requests

# ANU Quantum RNG - true quantum randomness
response = requests.get(
    "https://qrng.anu.edu.au/API/jsonI.php",
    params={"length": 16, "type": "uint8"},
    timeout=5,
)
response.raise_for_status()  # Surface HTTP errors before parsing the body
quantum_bytes = bytes(response.json()["data"])
```

**Why it's random**:
- Quantum vacuum fluctuations are fundamentally unpredictable
- Measurement outcomes are unpredictable in principle, not just in practice
- Neither source is pseudo-random: both sample a physical process

**Entropy estimate**: 8 bits per byte (full entropy), provided the service and the connection to it are trusted

---

## Entropy Pool (pool.py)

### Whitening Process

All harvested data passes through SHA-256 mixing:

```python
# Requires: import hashlib, struct, time
def feed(self, data: bytes) -> None:
    # Combine: current pool + new data + timestamp
    mix_input = self._pool + data + struct.pack("!d", time.time())

    # SHA-256 hash for avalanche effect
    hash_digest = hashlib.sha256(mix_input).digest()

    # Expand to fill pool
    self._pool = self._expand_to_pool_size(hash_digest)
```

**Properties**:
- Avalanche effect: flipping 1 input bit changes ~50% of the output bits on average
- Forward secrecy: SHA-256 is one-way, so earlier pool states cannot be recovered from the current one
- Thread-safe: a lock protects all pool operations
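
To make the mixing step concrete, here is a minimal, self-contained sketch of a pool with these properties. It is not the code in `pool.py`: the class name `SketchPool`, the counter-mode expansion in `_expand`, and the state update in `extract` are assumptions chosen for illustration.

```python
import hashlib
import struct
import threading
import time

POOL_SIZE = 512  # bytes, matching the buffer size in the diagram


class SketchPool:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pool = bytes(POOL_SIZE)

    @staticmethod
    def _expand(seed: bytes) -> bytes:
        """Stretch a 32-byte digest to POOL_SIZE by hashing seed || counter."""
        blocks = (
            hashlib.sha256(seed + struct.pack("!I", i)).digest()
            for i in range(POOL_SIZE // 32)
        )
        return b"".join(blocks)

    def feed(self, data: bytes) -> None:
        with self._lock:
            mix = self._pool + data + struct.pack("!d", time.time())
            self._pool = self._expand(hashlib.sha256(mix).digest())

    def extract(self, n: int) -> bytes:
        """Return n output bytes (n <= 32), then advance the pool state."""
        if not 0 < n <= 32:
            raise ValueError("this sketch serves 1 to 32 bytes per call")
        with self._lock:
            out = hashlib.sha256(b"out" + self._pool).digest()[:n]
            self._pool = self._expand(hashlib.sha256(b"next" + self._pool).digest())
            return out


def bit_difference(x: bytes, y: bytes) -> float:
    """Fraction of differing bits between two equal-length byte strings."""
    diff = sum(bin(a ^ b).count("1") for a, b in zip(x, y))
    return diff / (8 * len(x))


a = hashlib.sha256(b"entropy sample 0").digest()
b = hashlib.sha256(b"entropy sample 1").digest()  # inputs differ in a single bit
print(f"{bit_difference(a, b):.2f}")  # typically near 0.5
```

Deriving the output and the next state from differently prefixed hashes means a caller who sees the output does not learn the new state directly. In any design of this shape, the state after `feed` is a function of a 32-byte digest, so it carries at most 256 bits of entropy regardless of the buffer size.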
236+
237+
---
238+
239+
## Value Conversion (tap.py)
240+
241+
### random() → Float [0.0, 1.0)
242+
243+
```python
244+
raw_bytes = pool.extract(8) # 8 bytes
245+
value = struct.unpack("!Q", raw_bytes)[0] # 64-bit int
246+
return value / 2**64 # Divide by 2^64
247+
```
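
The half-open upper bound deserves care with 64-bit inputs, because a double carries only 53 bits of precision. The snippet below is a sketch comparing plain division by 2**64 with the common 53-bit construction. The function names are hypothetical, and `os.urandom` stands in for `pool.extract`.

```python
import os
import struct


def float_from_64(value: int) -> float:
    """Divide a 64-bit integer by 2**64; the quotient is rounded to a double."""
    return value / 2**64


def float_from_53(value: int) -> float:
    """Keep the top 53 bits so every result is exactly representable below 1.0."""
    return (value >> 11) / 2**53


largest = 2**64 - 1
print(float_from_64(largest))        # 1.0, outside [0.0, 1.0)
print(float_from_53(largest) < 1.0)  # True

# os.urandom stands in for pool.extract(8) in this sketch
sample = struct.unpack("!Q", os.urandom(8))[0]
print(0.0 <= float_from_53(sample) < 1.0)  # True
```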

### randint(a, b) → Integer [a, b] (Rejection Sampling)

```python
range_size = b - a + 1
# Bits needed to represent the largest offset, range_size - 1
bits_needed = (range_size - 1).bit_length()
mask = (1 << bits_needed) - 1

while True:
    # extract_int() must supply at least bits_needed random bits
    value = extract_int() & mask
    if value < range_size:  # Accept
        return a + value
    # Reject and retry (eliminates modulo bias)
```
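
A runnable version of the mask-and-reject loop makes the uniformity easy to check empirically. This is a sketch under stated assumptions: `extract_int` here is a stand-in that reads 64 bits from `os.urandom`, not the real tap, and the mask is sized from `range_size - 1` so that power-of-two ranges do not waste a bit.

```python
import os
from collections import Counter


def extract_int() -> int:
    """Stand-in for the tap: 64 random bits from the OS."""
    return int.from_bytes(os.urandom(8), "big")


def randint(a: int, b: int) -> int:
    """Uniform integer in [a, b] via mask-and-reject."""
    if a > b:
        raise ValueError("empty range")
    range_size = b - a + 1
    if range_size > 2**64:
        raise ValueError("range wider than one 64-bit draw")
    mask = (1 << (range_size - 1).bit_length()) - 1
    while True:
        value = extract_int() & mask
        if value < range_size:  # accept; otherwise draw again
            return a + value


counts = Counter(randint(1, 6) for _ in range(6000))
print(sorted(counts.items()))  # each face appears roughly 1000 times
```

Because the mask is the smallest all-ones value covering the range, each draw is accepted with probability above one half, so the expected number of draws per result stays below two.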

### gauss(mu, sigma) → Normal Distribution (Box-Muller)

```python
import math

u1 = 1.0 - random()  # Uniform (0, 1]; never 0, so log(u1) is defined
u2 = random()        # Uniform [0, 1)

z0 = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

return mu + sigma * z0
```
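
The transform can be checked by drawing many samples and comparing their mean and standard deviation with the requested parameters. In this sketch a seeded `random.Random` from the standard library stands in for `trueentropy.random()`, purely so the check is repeatable.

```python
import math
import random as stdlib_random
import statistics

_rng = stdlib_random.Random(12345)  # seeded stand-in for trueentropy.random()


def gauss(mu: float = 0.0, sigma: float = 1.0) -> float:
    """Box-Muller transform: two uniform draws give one normal draw."""
    u1 = 1.0 - _rng.random()  # (0, 1], so log(u1) is always defined
    u2 = _rng.random()        # [0, 1)
    z0 = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mu + sigma * z0


draws = [gauss(10.0, 2.0) for _ in range(20000)]
mean = statistics.fmean(draws)
spread = statistics.stdev(draws)
print(f"mean={mean:.2f} stdev={spread:.2f}")  # expected to land near 10 and 2
```

Box-Muller also yields a second independent value, `sqrt(-2 ln u1) * sin(2π u2)`, which an implementation may cache for the next call.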

### shuffle(seq) → Fisher-Yates Algorithm

```python
n = len(seq)
for i in range(n - 1, 0, -1):
    j = randint(0, i)  # Inclusive bounds: 0 <= j <= i
    seq[i], seq[j] = seq[j], seq[i]
```

Because each `j` is drawn uniformly from `[0, i]`, all n! permutations are equally probable.
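
The uniformity claim can be checked on a small input by counting how often each ordering appears. The sketch below uses `secrets.randbelow` from the standard library as a stand-in for `randint`. It is an illustration, not the code in `tap.py`.

```python
import secrets
from collections import Counter


def shuffle(seq: list) -> None:
    """In-place Fisher-Yates shuffle; secrets.randbelow stands in for randint."""
    for i in range(len(seq) - 1, 0, -1):
        j = secrets.randbelow(i + 1)  # uniform in [0, i]
        seq[i], seq[j] = seq[j], seq[i]


tally = Counter()
for _ in range(6000):
    items = ["a", "b", "c"]
    shuffle(items)
    tally["".join(items)] += 1

print(len(tally))              # 6: every ordering of three items shows up
print(sorted(tally.values()))  # counts cluster around 1000
```

Drawing `j` from the whole range `[0, n - 1]` on every step instead would yield n^n equally likely swap sequences, which cannot map evenly onto n! permutations, so the shrinking range is what keeps the shuffle unbiased.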

---

## Security Properties

| Property | Implementation |
|----------|----------------|
| Forward Secrecy | Pool state updated after each extraction |
| Avalanche Effect | SHA-256 mixing: a 1-bit input change flips ~50% of output bits |
| Thread Safety | All pool operations protected by locks |
| No Modulo Bias | Rejection sampling in randint() |
| Entropy Mixing | Multiple independent sources combined |

---

## Module Summary

| Module | Purpose |
|--------|---------|
| `pool.py` | Accumulates and mixes entropy with SHA-256 |
| `tap.py` | Extracts entropy and converts to types |
| `collector.py` | Background thread for automatic collection |
| `health.py` | Monitors pool health (score 0-100) |
| `harvesters/` | Collectors for different entropy sources |
| `aio.py` | Async versions of all functions |
| `persistence.py` | Save/restore pool state to disk |
| `pools.py` | Multiple isolated entropy pools |
| `accel.py` | Optional Cython acceleration |
