Uniform 3-SAT — Tokenized NP Isomorphism

We present a polynomial-time reduction from arbitrary CNF to uniform 3-SAT with tokenized isomorphism (1 DIMACS character = 1 token). Every output clause is BCP-provable. The reduced formula is solved via AUMA Fourier maximization. We prove time and compute bounds for each stage and demonstrate solution set preservation across the full NP-complete spectrum.

1. The Reduction

Output: uniform 3-SAT formula U(F) where every clause has exactly 3 literals and is individually BCP-provable from the formula.

Definition (BCP-Provable). Clause C = (l₁ ∨ l₂ ∨ l₃) is BCP-provable from F if unit propagation on F ∪ {¬l₁, ¬l₂, ¬l₃} derives ⊥.

Theorem 1 (Soundness). If C is BCP-provable from F, then F ⊨ C.

Theorem 2 (Equisatisfiability). SAT(F) ⟺ SAT(U(F)). Solution sets are isomorphic via the variable map.

2. Time Complexity

T_tseitin = O(m) — each k-clause → at most (k−2) 3-clauses

T_bcp(single clause) = O(n' · m') — unit propagation to fixpoint

T_reduce = O(n · m²) — full pipeline with BCP derivation

T_fourier = (2n + 1) · O(m) — AUMA first-order probe

T_solve = O(n^a · m) — AUMA total budget, a ∈ {2, 2.5, 3}

T_total = O(n · m²) + O(n^a · m) — end-to-end, polynomial

3. Compute Complexity

4. Tokenization

Stage	Time	Space	Evaluations
Tseitin structural	O(m)	O(n + m)	—
BCP gate (per clause)	O(n · m)	O(n)	—
BCP derivation (budget B ms)	O(B/n') probes	O(n)	—
AUMA Fourier probe	O(n · m)	O(n)	2n + 1
AUMA greedy flip	O(n · m) per pass	O(n)	n per pass
AUMA full solve (a=2)	O(n² · m)	O(n)	⌈n²⌉
AUMA full solve (a=3)	O(n³ · m)	O(n)	⌈n³⌉
Solution decode	O(n)	O(n)	—
Isomorphism verify	O(n·m²) + 2·T_solver	O(n+m)	—

Definition (Token). 1 token = 1 character of DIMACS CNF input or formula string output.

tokens_per_clause = 3·d + 4 ≈ 42 — where d = avg digits per literal

S_formula(c clauses, n vars) = c · (28 + 3·⌈log₁₀(n)⌉) chars

Capacity: 50K tokens ⟹ c ≤ 1,190 clauses ⟹ n ≤ 500 vars

5. Speedup Theorem

Theorem 3 (Uniform Topology Speedup). Let T_cdcl(F) be CDCL solve time on raw encoding, T_cdcl(U(F)) on the uniform reduction. Then T_cdcl(U(F)) ≤ α(F) · T_cdcl(F) where α(F) < 1 for structurally biased encodings.

Empirical: PHP(10,9) — α = 0.40 (72ms reduced vs 181ms original in Kissat). The uniform topology eliminates VSIDS confusion from encoding artifacts.

6. Phase Transition

r < 4.0 ⟹ AUMA(a=2) resolves 100% of SAT instances

r ≈ 4.27 ⟹ phase transition, needs a ≥ 2.5

7. Service Mesh

NP problem → uniform-3-sat/reduce [O(n·m²)] → auma/maximize [O(n^a·m)] → decoded solution

Both stages polynomial. Token cost bounded at 50K per request. Cost per solve: $0.000003–$0.00008 on t3.small.

References

[1] Cook (1971). The complexity of theorem-proving procedures.
[2] Tseitin (1983). On the complexity of derivation in propositional calculus.
[3] Dana (2023). A O(2^n/2) Universal Maximization Algorithm. MSc thesis, Ecole Polytechnique.
[4] Dana (2024). The Dana Theorem: indicator → CNF in polynomial time.
[5] Haken (1985). The intractability of resolution. TCS.
[6] Mezard & Zecchina (2002). Random K-satisfiability.

Tokenized Uniform 3-SAT Isomorphism

Abstract