Tokenized Uniform 3-SAT Isomorphism

Charles Dana · Monce SAS · June 2026

uniform-3-sat.aws.monce.ai · /math · /playground

Abstract

We present a polynomial-time reduction from arbitrary CNF to uniform 3-SAT with tokenized isomorphism (1 DIMACS character = 1 token). Every output clause is BCP-provable. The reduced formula is solved via AUMA Fourier maximization. We prove time and compute bounds for each stage and demonstrate solution set preservation across the full NP-complete spectrum.

1. The Reduction

Input: any CNF formula F with n variables, m clauses, max clause width k.

Output: uniform 3-SAT formula U(F) where every clause has exactly 3 literals and is individually BCP-provable from the formula.

Definition (BCP-Provable). Clause C = (l₁ ∨ l₂ ∨ l₃) is BCP-provable from F if unit propagation on F ∪ {¬l₁, ¬l₂, ¬l₃} derives ⊥.
Theorem 1 (Soundness). If C is BCP-provable from F, then F ⊨ C.
Theorem 2 (Equisatisfiability). SAT(F) ⟺ SAT(U(F)). Solution sets are isomorphic via the variable map.

2. Time Complexity

T_tseitin = O(m) — each k-clause → at most (k−2) 3-clauses
T_bcp(single clause) = O(n' · m') — unit propagation to fixpoint
T_reduce = O(n · m²) — full pipeline with BCP derivation
T_fourier = (2n + 1) · O(m) — AUMA first-order probe
T_solve = O(na · m) — AUMA total budget, a ∈ {2, 2.5, 3}
T_total = O(n · m²) + O(na · m) — end-to-end, polynomial

3. Compute Complexity

StageTimeSpaceEvaluations
Tseitin structuralO(m)O(n + m)
BCP gate (per clause)O(n · m)O(n)
BCP derivation (budget B ms)O(B/n') probesO(n)
AUMA Fourier probeO(n · m)O(n)2n + 1
AUMA greedy flipO(n · m) per passO(n)n per pass
AUMA full solve (a=2)O(n² · m)O(n)⌈n²⌉
AUMA full solve (a=3)O(n³ · m)O(n)⌈n³⌉
Solution decodeO(n)O(n)
Isomorphism verifyO(n·m²) + 2·T_solverO(n+m)

4. Tokenization

Definition (Token). 1 token = 1 character of DIMACS CNF input or formula string output.
tokens_per_clause = 3·d + 4 ≈ 42 — where d = avg digits per literal
S_formula(c clauses, n vars) = c · (28 + 3·⌈log₁₀(n)⌉) chars
Capacity: 50K tokens ⟹ c ≤ 1,190 clauses ⟹ n ≤ 500 vars

5. Speedup Theorem

Theorem 3 (Uniform Topology Speedup). Let T_cdcl(F) be CDCL solve time on raw encoding, T_cdcl(U(F)) on the uniform reduction. Then T_cdcl(U(F)) ≤ α(F) · T_cdcl(F) where α(F) < 1 for structurally biased encodings.

Empirical: PHP(10,9) — α = 0.40 (72ms reduced vs 181ms original in Kissat). The uniform topology eliminates VSIDS confusion from encoding artifacts.

6. Phase Transition

r < 4.0 ⟹ AUMA(a=2) resolves 100% of SAT instances
r ≈ 4.27 ⟹ phase transition, needs a ≥ 2.5
r > 5.0 ⟹ UNSAT detected (gap ≥ 1)

7. Service Mesh

NP problem → uniform-3-sat/reduce [O(n·m²)] → auma/maximize [O(na·m)] → decoded solution

Both stages polynomial. Token cost bounded at 50K per request. Cost per solve: $0.000003–$0.00008 on t3.small.

References

[1] Cook (1971). The complexity of theorem-proving procedures.
[2] Tseitin (1983). On the complexity of derivation in propositional calculus.
[3] Dana (2023). A O(2n/2) Universal Maximization Algorithm. MSc thesis, Ecole Polytechnique.
[4] Dana (2024). The Dana Theorem: indicator → CNF in polynomial time.
[5] Haken (1985). The intractability of resolution. TCS.
[6] Mezard & Zecchina (2002). Random K-satisfiability.