Skip to content

split_secret #81

@ansaripublic56-rgb

Description

@ansaripublic56-rgb

The Python implementation of split_secret (in shamir.py) generates polynomial coefficients directly from entropy bytes without enforcing uniqueness. The JavaScript reference implementation (and the BIP specification) rejects duplicate coefficients via rejection sampling. This divergence creates a statistical weakness: duplicate coefficients reduce the polynomial's effective degree, lowering the threshold needed to recover the secret.

Affected files

pybtc/functions/shamir.py – line ~73
jsbtc/src/functions/shamir_secret_sharing.js – lines 88–95

Details

JavaScript (correct, matches BIP spec)

for (let i = 0; i < threshold - 1; i++) {
    do {
        w = e[ePointer++];
    } while (q.includes(w));   // ← uniqueness enforced
    q.push(w);
}

Python (vulnerable, no uniqueness check)

for i in range(threshold - 1):
    if e_i < len(e):
        a = e[e_i]
        e_i += 1
    else:
        e = generate_entropy(hex=False)
        a = e[0]
        e_i = 1
    q.append(a)   # ← no check for duplicates

Cryptographic impact
For a k-of-n scheme, the polynomial has degree k-1. If any two coefficients are equal, the polynomial simplifies:

f(x) = secret + a1·x + a2·x² + …   with a1 = a2
→ f(x) = secret + a1·(x + x²)

The effective degree drops from k-1 to 1. An attacker needs only 2 shares (instead of k) to recover the secret for that byte.

Probability of duplicate coefficients per byte: 1/256 ≈ 0.39%.

For a 16‑byte secret (as in BIP39), the probability that at least one byte is affected is about 6%.

This weakens the security of the scheme below its advertised threshold, especially for small thresholds (e.g., 2‑of‑3, 3‑of‑5). The JavaScript version correctly avoids this by rejecting duplicates.

Proof of concept

python
import os
import collections

def simulate_py(threshold, trials=500_000):
    dups = 0
    for _ in range(trials):
        coeffs = [os.urandom(1)[0] for _ in range(threshold - 1)]
        if len(set(coeffs)) < len(coeffs):
            dups += 1
    return dups / trials

def simulate_js(threshold, trials=500_000):
    dups = 0
    for _ in range(trials):
        coeffs = []
        while len(coeffs) < threshold - 1:
            b = os.urandom(1)[0]
            if b not in coeffs:
                coeffs.append(b)
        if len(set(coeffs)) < len(coeffs):
            dups += 1
    return dups / trials

threshold = 3
print(f"Python duplicate rate: {simulate_py(threshold)*100:.2f}%")   # ~0.38%
print(f"JS     duplicate rate: {simulate_js(threshold)*100:.2f}%")   # 0.00%

Proposed fix
Apply the same rejection sampling used in the JS implementation:

python
# After the line `q = [secret[b]]`:
for i in range(threshold - 1):
    while True:
        if e_i < len(e):
            a = e[e_i]
            e_i += 1
        else:
            e = generate_entropy(hex=False)
            a = e[0]
            e_i = 1
        if a not in q:    # reject duplicates
            break
    q.append(a)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions