Executive summary
Probe outcomes and drift valence across the run.
Safe to upgrade: true · Manual review items: 0
Critical regressions: 0 · Policy drift: 0 · Fidelity drift: 0 · Structural drift: 0 · Content compression: 0
Migration profile
v2 is 32% faster on average (1427 ms → 969 ms across 10 probes)
Safe to upgrade: true
Dimension overview
Worst risk per dimension, drift direction mix among affected probes (↑ improvement / ↓ regression / → neutral), and affected probe count.
| Dimension | Worst risk | Drift (affected) | Probes |
|---|---|---|---|
| Morphology | Green | — | 0 |
| Tone | Green | — | 0 |
| Factual | Green | — | 0 |
| Schema | Green | — | 0 |
| Instruction | Green | — | 0 |
| Refusal | Green | — | 0 |
| Semantic | Amber | ↓0 ↑1 →0 | 1 |
| Claim | Green | — | 0 |
| Latency | Green | — | 0 |
| Consistency | Amber | ↓0 ↑1 →0 | 1 |
| Custom assertions | Green | — | 0 |
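The overview rows can be read as a roll-up of the per-probe dimension snapshots. The sketch below shows one way such a roll-up could work; it is not the tool's implementation. The record layout, the `RISK_ORDER` ranking, and the rule that only non-Green probes count as affected are assumptions made for illustration.

```python
# Assumed ordering of risk bands, best to worst.
RISK_ORDER = {"Green": 0, "Amber": 1, "Red": 2}

# Hypothetical per-probe records: (probe, dimension, risk, direction).
# The two Amber rows come from this run; one Green row is added for contrast.
probe_rows = [
    ("code_sql_basic_select", "Semantic", "Amber", "Improvement"),
    ("code_python_simple_function", "Consistency", "Amber", "Improvement"),
    ("code_sql_join", "Semantic", "Green", "Neutral"),
]


def summarize(rows):
    """Roll per-probe rows up into worst risk, drift mix, and affected count."""
    summary = {}
    for _probe, dimension, risk, direction in rows:
        entry = summary.setdefault(
            dimension,
            {"worst": "Green", "down": 0, "up": 0, "flat": 0, "probes": 0},
        )
        if RISK_ORDER[risk] > RISK_ORDER[entry["worst"]]:
            entry["worst"] = risk
        if risk != "Green":  # assumption: only non-Green probes are "affected"
            entry["probes"] += 1
            key = {"Regression": "down", "Improvement": "up"}.get(direction, "flat")
            entry[key] += 1
    return summary


overview = summarize(probe_rows)
for dimension, entry in overview.items():
    print(dimension, entry)
```

Under these assumptions the Semantic and Consistency entries each come out as Amber with one improving probe, which matches the two Amber rows in the table.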
Red probes
None in this run.
All probe results
Each probe entry lists the prompt, dimension snapshot (with drift direction), claim intelligence, and both responses. Consistency σ shows run variance when multi-run sampling was enabled.
code_javascript_arrow_function Green σ v1 0 · v2 0
Prompt
Write a JavaScript arrow function called `double` that takes a number and returns it multiplied by 2. Use const. Return only the function definition, no explanation.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 4% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | Neutral · sim 0.9999999576965958 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Regression · v1 642 ms · v2 775 ms · Δ 133 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
2 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: ```javascript ↔ v2: ```javascript
- sim 1 · anchors ok true · v1: const double = (num) => num * 2; ↔ v2: const double = num => num * 2;
v1 response
```javascript
const double = (num) => num * 2;
```
v2 response
```javascript
const double = num => num * 2;
```
code_no_explanation_discipline Green σ v1 0 · v2 0
Prompt
Write a Python function called `fibonacci` that returns the nth Fibonacci number using recursion. Return ONLY the function. No explanation before it. No explanation after it. No usage example. Just the def block.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 2% · type Δ false | |
| Tone | Green | Neutral · formality Δ -2% | |
| Semantic | Green | Neutral · sim 0.999999992218997 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Improvement · v1 1441 ms · v2 1138 ms · Δ -303 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
6 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: def fibonacci(n): ↔ v2: def fibonacci(n):
- sim 1 · anchors ok true · v1: if n <= 0: ↔ v2: if n <= 0:
- sim 1 · anchors ok true · v1: return 0 ↔ v2: return 0
- sim 1 · anchors ok true · v1: elif n == 1: ↔ v2: elif n == 1:
- sim 1 · anchors ok true · v1: return 1 ↔ v2: return 1
- sim 1 · anchors ok true · v1: return fibonacci(n - 1) + fibonacci(n - 2) ↔ v2: return fibonacci(n-1) + fibonacci(n-2)
v1 response
```python
def fibonacci(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
```
v2 response
```python
def fibonacci(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)
```
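The two `fibonacci` responses differ only in the spacing around the subtraction operator. One way to confirm that such a difference is cosmetic is to compare parsed syntax trees and spot-check the output, as in the sketch below. This is an illustration using Python's standard `ast` module, not a description of how the report computes similarity; the `same_ast` helper and the `V1`/`V2` names are hypothetical.

```python
import ast

V1 = """\
def fibonacci(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
"""

V2 = """\
def fibonacci(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)
"""


def same_ast(source_a: str, source_b: str) -> bool:
    """True when both sources parse to the same tree, ignoring layout."""
    # ast.dump omits line and column attributes by default, so only
    # structure is compared.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


def run_fibonacci(source: str, n: int) -> int:
    """Execute a response in an isolated namespace and call fibonacci(n)."""
    namespace = {}
    exec(source, namespace)
    return namespace["fibonacci"](n)


print(same_ast(V1, V2))       # True
print(run_fibonacci(V1, 10))  # 55
print(run_fibonacci(V2, 10))  # 55
```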
code_python_class_definition Green σ v1 0 · v2 0
Prompt
Write a Python class called `Counter` with: an `__init__` method that sets `self.count` to 0, an `increment` method that adds 1 to count, and a `reset` method that sets count back to 0. Return only the class definition, no explanation.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 0% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | Neutral · sim 0.999999982479268 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Improvement · v1 3984 ms · v2 1042 ms · Δ -2942 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
7 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: class Counter: ↔ v2: class Counter:
- sim 1 · anchors ok true · v1: def __init__(self): ↔ v2: def __init__(self):
- sim 1 · anchors ok true · v1: self.count = 0 ↔ v2: self.count = 0
- sim 1 · anchors ok true · v1: def increment(self): ↔ v2: def increment(self):
- sim 1 · anchors ok true · v1: self.count += 1 ↔ v2: self.count += 1
- sim 1 · anchors ok true · v1: def reset(self): ↔ v2: def reset(self):
- sim 1 · anchors ok true · v1: self.count = 0 ↔ v2: self.count = 0
v1 response
```python
class Counter:
    def __init__(self):
        self.count = 0
    def increment(self):
        self.count += 1
    def reset(self):
        self.count = 0
```
v2 response
```python
class Counter:
    def __init__(self):
        self.count = 0
    def increment(self):
        self.count += 1
    def reset(self):
        self.count = 0
```
code_python_dict_manipulation Green σ v1 0 · v2 0
Prompt
Write a Python function called `invert_dict` that takes a dictionary and returns a new dictionary with keys and values swapped. Return only the function, no explanation.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 0% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | Neutral · sim 0.9999999889549346 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Improvement · v1 1064 ms · v2 734 ms · Δ -330 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
1 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: def invert_dict(d): ↔ v2: def invert_dict(d):
v1 response
```python
def invert_dict(d):
    return {v: k for k, v in d.items()}
```
v2 response
```python
def invert_dict(d):
    return {v: k for k, v in d.items()}
```
code_python_error_handling Green σ v1 0 · v2 0
Prompt
Write a Python function called `safe_divide` that takes two numbers `a` and `b`, returns `a / b`, but returns `None` if `b` is zero. Use a try/except block. Return only the function, no explanation.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 0% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | Neutral · sim 0.9999999824548148 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Improvement · v1 1316 ms · v2 831 ms · Δ -485 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
3 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: def safe_divide(a, b): ↔ v2: def safe_divide(a, b):
- sim 1 · anchors ok true · v1: except ZeroDivisionError: ↔ v2: except ZeroDivisionError:
- sim 1 · anchors ok true · v1: return None ↔ v2: return None
v1 response
```python
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None
```
v2 response
```python
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None
```
code_python_list_comprehension Green σ v1 0 · v2 0
Prompt
Write a Python one-liner using list comprehension that returns all even numbers from a list called `numbers`. Assign the result to a variable called `evens`. Return only the code, no explanation.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 0% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | Neutral · sim 0.999999947142337 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Improvement · v1 1143 ms · v2 845 ms · Δ -298 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
1 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: evens = [num for num in numbers if num % 2 == 0] ↔ v2: evens = [num for num in numbers if num % 2 == 0]
v1 response
```python
evens = [num for num in numbers if num % 2 == 0]
```
v2 response
```python
evens = [num for num in numbers if num % 2 == 0]
```
code_python_simple_function Green σ v1 0 · v2 0
Prompt
Write a Python function called `add` that takes two integers and returns their sum. Return only the function definition, no explanation, no usage example.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 0% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | NotApplicable · sim 0.9999999659801694 | |
| Claim | Green | NotApplicable · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Regression · v1 1135 ms · v2 1689 ms · Δ 554 ms | |
| Consistency | Amber | Improvement · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
0 matched · 0 dropped · 0 new
No claim rows extracted for this pair.
v1 response
```python
def add(a: int, b: int) -> int:
    return a + b
```
v2 response
```python
def add(a: int, b: int) -> int:
    return a + b
```
code_regex_email Green σ v1 0 · v2 0
Prompt
Write a Python regex pattern string (just the pattern, assigned to a variable called `pattern`) that matches a basic email address format: word characters, an @ symbol, word characters, a dot, and 2-4 word characters for the domain. Return only the assignment, no explanation, no import.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 2% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | Neutral · sim 1 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Improvement · v1 1008 ms · v2 592 ms · Δ -416 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
1 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: pattern = r'^\w+@\w+\.\w{2,4}$' ↔ v2: pattern = r'\w+@\w+\.\w{2,4}'
v1 response
```python
pattern = r'^\w+@\w+\.\w{2,4}$'
```
v2 response
```python
pattern = r'\w+@\w+\.\w{2,4}'
```
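The report scores this pair at similarity 1, but the v1 pattern is anchored with `^` and `$` while the v2 pattern is not, so the two can behave differently at runtime. The sketch below compares them with `re.search` on a few sample strings. The sample inputs and the `compare` helper are chosen here for illustration and are not part of the probe.

```python
import re

V1_PATTERN = r'^\w+@\w+\.\w{2,4}$'  # anchored: the whole string must be an address
V2_PATTERN = r'\w+@\w+\.\w{2,4}'    # unanchored: an address anywhere is enough

# Hypothetical inputs, not taken from the run.
SAMPLES = [
    "user@example.com",
    "contact: user@example.com today",
    "user@example.community",
]


def compare(text: str) -> tuple[bool, bool]:
    """Return (v1 matched, v2 matched) for one input using re.search."""
    return (
        re.search(V1_PATTERN, text) is not None,
        re.search(V2_PATTERN, text) is not None,
    )


for sample in SAMPLES:
    print(sample, compare(sample))
# user@example.com (True, True)
# contact: user@example.com today (False, True)
# user@example.community (False, True)
```

If validation of a whole string is the goal, calling `re.fullmatch` with the v2 pattern gives the stricter behaviour on these samples. Whether the difference matters depends on how the pattern is used downstream.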
code_sql_basic_select Green σ v1 0 · v2 0
Prompt
Write a SQL query that selects all columns from a table called `orders` where the `status` column equals 'pending' and orders the results by `created_at` descending. Return only the SQL, no explanation.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 1% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Amber | Improvement · sim 0.8005305268146136 | |
| Claim | Green | Improvement · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 2 | |
| Latency | Green | Neutral · v1 1269 ms · v2 1024 ms · Δ -245 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
1 matched · 0 dropped · 2 new
Matched claims
- sim 1 · anchors ok true · v1: SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at DESC; ↔ v2: ORDER BY created_at DESC;
New in v2
- SELECT * FROM orders
- WHERE status = 'pending'
v1 response
```sql
SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at DESC;
```
v2 response
```sql
SELECT * FROM orders
WHERE status = 'pending'
ORDER BY created_at DESC;
```
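The Amber semantic score (sim ≈ 0.80) and the two "new" claims sit alongside responses that contain the same SQL tokens. A plausible reading is that v2 broke the query across lines and the claim extractor split on those breaks. The sketch below checks that reading by collapsing whitespace before comparing. The line breaks in `V2` are inferred from the claim rows and are an assumption, and `normalize_sql` is a hypothetical helper, not part of the tool.

```python
V1 = "SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at DESC;"

# Assumed layout of the v2 response, inferred from the three claim rows.
V2 = (
    "SELECT * FROM orders\n"
    "WHERE status = 'pending'\n"
    "ORDER BY created_at DESC;"
)


def normalize_sql(sql: str) -> str:
    """Collapse every run of whitespace (including newlines) to one space."""
    # Note: this would also alter whitespace inside string literals, which
    # is harmless here because 'pending' contains none.
    return " ".join(sql.split())


print(V1 == V2)                                # False
print(normalize_sql(V1) == normalize_sql(V2))  # True
```

If the normalized forms are equal, as they are under this assumption, the Amber flag reflects formatting rather than a change in the query.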
code_sql_join Green σ v1 0 · v2 0
Prompt
Write a SQL query that joins the `users` table with the `orders` table on `users.id = orders.user_id`, returning `users.name` and `orders.total`. Return only the SQL, no explanation.
Dimension snapshot
| Severity | ⚪ Informational | ||
|---|---|---|---|
| Drift category | ✓ No significant drift (NoSignificantDrift) | ||
| Morphology | Green | Neutral · Δ tokens ≈ 0% · type Δ false | |
| Tone | Green | Neutral · formality Δ 0% | |
| Semantic | Green | Neutral · sim 0.9999999756073144 | |
| Claim | Green | Neutral · preservation 1 (≥ 0.7 for green band) · dropped 0 · new 0 | |
| Latency | Green | Neutral · v1 1269 ms · v2 1027 ms · Δ -242 ms | |
| Consistency | Green | Neutral · runs 3/3 · var 0 / 0 | |
| Refusal | Green | Neutral · new false · lifted false | |
Claim diff
3 matched · 0 dropped · 0 new
Matched claims
- sim 1 · anchors ok true · v1: SELECT users.name, orders.total ↔ v2: SELECT users.name, orders.total
- sim 1 · anchors ok true · v1: FROM users ↔ v2: FROM users
- sim 1 · anchors ok true · v1: JOIN orders ON users.id = orders.user_id; ↔ v2: JOIN orders ON users.id = orders.user_id;
v1 response
```sql
SELECT users.name, orders.total
FROM users
JOIN orders ON users.id = orders.user_id;
```
v2 response
```sql
SELECT users.name, orders.total
FROM users
JOIN orders ON users.id = orders.user_id;
```
Upgrade path
Structured rollout view. Validated prompt patches appear when mutations validate.
Blocking 0 · Verify 0 · Neutral 0 · Validated 0
Validated prompt patches
No automated patches generated — run with --mutate to attempt prompt repairs.
Latency impact
Observational only — latency does not affect overall risk, probe direction, or upgrade path routing.
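The per-probe Latency rows label each change as Regression, Neutral, or Improvement, and the labels do not follow the absolute delta (+133 ms is a Regression while -245 ms is Neutral). The labels in this run are consistent with a relative threshold of roughly 20% of the v1 latency. That threshold is inferred from the ten rows in this report and is not documented behaviour; the sketch below only checks the hypothesis against the reported data.

```python
# (v1 ms, v2 ms, direction label) copied from the per-probe Latency rows.
OBSERVED = [
    (642, 775, "Regression"),
    (1441, 1138, "Improvement"),
    (3984, 1042, "Improvement"),
    (1064, 734, "Improvement"),
    (1316, 831, "Improvement"),
    (1143, 845, "Improvement"),
    (1135, 1689, "Regression"),
    (1008, 592, "Improvement"),
    (1269, 1024, "Neutral"),
    (1269, 1027, "Neutral"),
]


def direction(v1_ms: int, v2_ms: int, threshold: float = 0.20) -> str:
    """Classify a latency change by its size relative to v1.

    The 20% default is an assumption fitted to this run, not a known setting.
    """
    change = (v2_ms - v1_ms) / v1_ms
    if change > threshold:
        return "Regression"
    if change < -threshold:
        return "Improvement"
    return "Neutral"


mismatches = [row for row in OBSERVED if direction(row[0], row[1]) != row[2]]
print(mismatches)  # []
```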
Average baseline latency
1427 ms
Average target latency
969 ms
Delta: -458 ms (-32.1%) · Direction: Improvement
v2 responded 32% faster on average across 10 probes
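The averages above can be reproduced from the per-probe latencies listed in the dimension snapshots. The sketch below does the arithmetic. It assumes the tool truncates each average to whole milliseconds before computing the delta, which is the rounding that reproduces the reported figures; the variable names are illustrative.

```python
# Per-probe latencies in ms, in report order, from the Latency rows.
V1_MS = [642, 1441, 3984, 1064, 1316, 1143, 1135, 1008, 1269, 1269]
V2_MS = [775, 1138, 1042, 734, 831, 845, 1689, 592, 1024, 1027]

# Assumption: averages are truncated to whole milliseconds.
baseline_ms = sum(V1_MS) // len(V1_MS)
target_ms = sum(V2_MS) // len(V2_MS)

delta_ms = target_ms - baseline_ms
delta_pct = round(100 * delta_ms / baseline_ms, 1)

print(baseline_ms, target_ms)  # 1427 969
print(delta_ms, delta_pct)     # -458 -32.1
```

The untruncated means are 1427.1 ms and 969.7 ms, so the exact delta is about -457 ms (-32.1%). Note that one probe (code_python_class_definition, 3984 ms → 1042 ms) accounts for a large share of the average improvement.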
Upgrade recommendation
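Overall risk is acceptable for a controlled rollout. Before broad deployment, review the two Amber items (Semantic on code_sql_basic_select, Consistency on code_python_simple_function) and any Red probes flagged "verify"; this run has none of the latter.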