FormalRV

Arithmetic 1464 declarations in 45 modules

FormalRV.Arithmetic.Correctness

FormalRV/Arithmetic/Correctness.lean
FormalRV.BQAlgo.Correctness: REUSABLE correctness primitives for Gate-IR-encoded arithmetic circuits.

## Status (2026-05-12)

This module provides the bridge from `Gate` IR (in `Framework.Gate`) to classical-basis-state semantics (in `Framework.PadAction`'s `f_to_vec` infrastructure), so that any future arithmetic-circuit review can state correctness theorems of the form:

> on classical input `f : Nat → Bool`, running the circuit produces
> the basis state corresponding to the expected output function.

Per CLAUDE.md hard rule "build a reusable framework, not one-off proofs", lemmas in this file are stated generically over Gate IR constructions. They are then applied to specific circuits (`gidney_adder_bit_step`, `prefix_and_step`, ...) in their own files.

**Reusable primitives (this file):**
- `gate_ccx_acts_on_basis`: Gate.CCX's classical-state action
- `gate_cx_acts_on_basis`: Gate.CX's classical-state action
- `gate_x_acts_on_basis`: Gate.X's classical-state action
- `gate_seq_acts_on_basis`: sequential composition propagation

**Application sites (other files):**
- `BQAlgo/RippleCarryAdder.lean`: `gidney_adder_bit_step` correctness
- `BQAlgo/UnaryLookup.lean`: `prefix_and_step` correctness
- (future) Gidney measurement-AND with extended Gate IR
theoremgate_ccx_acts_on_basis
theorem gate_ccx_acts_on_basis (dim a b c : Nat)
    (ha : a < dim) (hb : b < dim) (hc : c < dim)
    (hab : a ≠ b) (hac : a ≠ c) (hbc : b ≠ c) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim (Gate.CCX a b c)) * f_to_vec dim f
      = f_to_vec dim (update f c (xor (f c) (f a && f b)))
A `Gate.CCX a b c` applied to a classical basis state `f_to_vec dim f` XORs the AND of bits `a` and `b` into bit `c`. This is the Gate-IR-level statement of the Toffoli's classical action, derived from `Framework.PadAction.f_to_vec_CCX_proved` via `Gate.toUCom_CCX`.
theoremgate_ccx_acts_on_basis_symm
theorem gate_ccx_acts_on_basis_symm (dim a b c : Nat)
    (ha : a < dim) (hb : b < dim) (hc : c < dim)
    (hab : a ≠ b) (hac : a ≠ c) (hbc : b ≠ c) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim (Gate.CCX a b c)) * f_to_vec dim f
      = f_to_vec dim (update f c (xor (f c) (f b && f a)))
Symmetric variant: CCX is unchanged by swapping controls. Just a notational convenience using `Bool.and_comm`.
theoremgate_cx_acts_on_basis
theorem gate_cx_acts_on_basis (dim c t : Nat)
    (hc : c < dim) (ht : t < dim) (hct : c ≠ t) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim (Gate.CX c t)) * f_to_vec dim f
      = f_to_vec dim (update f t (xor (f t) (f c)))
A `Gate.CX c t` applied to a classical basis state XORs bit `c` into bit `t`. Derived from `Framework.PadAction.f_to_vec_CNOT_proved`.
theoremgate_x_acts_on_basis
theorem gate_x_acts_on_basis (dim n : Nat) (h : n < dim) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim (Gate.X n)) * f_to_vec dim f
      = f_to_vec dim (update f n (!f n))
A `Gate.X n` applied to a classical basis state flips bit `n`. Derived from `Framework.PadAction.f_to_vec_X_uc_eval`.
theoremgate_cx_cx_id_on_basis
theorem gate_cx_cx_id_on_basis (dim c t : Nat)
    (hc : c < dim) (ht : t < dim) (hct : c ≠ t) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim (Gate.seq (Gate.CX c t) (Gate.CX c t)))
      * f_to_vec dim f
      = f_to_vec dim f
Applying `Gate.CX c t` twice to a classical basis state restores the original state. SQIR/SQIR/Equivalences.v line 109 analog (CNOT involution) lifted to the Gate IR / basis-action level. Direct lift of `f_to_vec_CNOT_CNOT` from `Framework/PadAction.lean`.
theoremgate_ccx_ccx_id_on_basis
theorem gate_ccx_ccx_id_on_basis (dim a b c : Nat)
    (ha : a < dim) (hb : b < dim) (hc : c < dim)
    (hab : a ≠ b) (hac : a ≠ c) (hbc : b ≠ c) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim (Gate.seq (Gate.CCX a b c) (Gate.CCX a b c)))
      * f_to_vec dim f
      = f_to_vec dim f
Applying `Gate.CCX a b c` twice to a classical basis state restores the original state. SQIR analog: CCX is self-inverse (Toffoli is an involution). Direct lift of `f_to_vec_CCX_involutive` via `Matrix.mul_assoc` (the SQIR form takes nested multiplication; our Gate.seq form takes a single composed CCX-CCX product).
theoremgate_x_x_id_on_basis
theorem gate_x_x_id_on_basis (dim n : Nat) (h : n < dim) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim (Gate.seq (Gate.X n) (Gate.X n)))
      * f_to_vec dim f
      = f_to_vec dim f
Applying `Gate.X n` twice to a classical basis state restores the original state. SQIR/SQIR/Equivalences.v line 68 analog (X_X_id) lifted to the Gate IR / basis-action level. Direct lift of `f_to_vec_X_X` from `Framework/PadAction.lean`. Completes the three-gate involution family (X, CX, CCX).
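At the bit level, the three involution lemmas above rest on two Boolean facts: XOR-ing the same value into a bit twice cancels, and negating twice cancels. The following is a minimal Lean sketch of those facts only, not the project's proofs. It uses core `!=` on `Bool` as XOR in place of the project's `xor`, and the theorem names are hypothetical.

```lean
-- XOR-ing `y` into `x` twice returns `x`. For CCX, `y` plays the role of
-- `f a && f b`; for CX it plays the role of `f c`. `x` is the target bit.
theorem xor_twice_cancels (x y : Bool) : ((x != y) != y) = x := by
  cases x <;> cases y <;> rfl

-- Flipping a bit twice returns it (the X case).
theorem not_twice_cancels (x : Bool) : (!(!x)) = x := by
  cases x <;> rfl
```

The controls are untouched by the first application (the hypotheses `a ≠ c` and `b ≠ c` guarantee this for CCX), which is why the same value `y` is XOR-ed in both times.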
theoremgate_seq_acts_on_basis
theorem gate_seq_acts_on_basis (dim : Nat) (g₁ g₂ : Gate)
    (f g h : Nat → Bool)
    (h₁ : uc_eval (Gate.toUCom dim g₁) * f_to_vec dim f = f_to_vec dim g)
    (h₂ : uc_eval (Gate.toUCom dim g₂) * f_to_vec dim g = f_to_vec dim h) :
    uc_eval (Gate.toUCom dim (Gate.seq g₁ g₂)) * f_to_vec dim f
      = f_to_vec dim h
Sequential composition acts on basis states by composing the per-gate basis-state functions. Derived from `uc_eval_seq` (right-to-left matrix multiplication on `seq`).
defGate.applyNat
def Gate.applyNat : Gate → (Nat → Bool) → (Nat → Bool)
  | Gate.I,         f => f
  | Gate.X q,       f => update f q (!f q)
  | Gate.CX c t,    f => update f t (xor (f t) (f c))
  | Gate.CCX a b c, f => update f c (xor (f c) (f a && f b))
  | Gate.seq g₁ g₂, f => Gate.applyNat g₂ (Gate.applyNat g₁ f)
Boolean-function semantics of a `Gate` IR term as a transformation on `Nat → Bool` (the function-form parallel of `Framework.Semantics.apply` on `Fin n → Bool`). Uses the project's local `Framework.update`, matching `gate_*_acts_on_basis` exactly.
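To see the Boolean semantics in action, here is a self-contained toy model that can be checked without the rest of the project. It is a sketch under assumptions: the `Gate` type and `update` below are local stand-ins for `Framework.Gate` and `Framework.update`, core `!=` on `Bool` is used as XOR, and `input` is a hypothetical test state.

```lean
namespace ApplySketch

-- Local stand-in for the project's Gate IR (assumed shape).
inductive Gate where
  | I | X (q : Nat) | CX (c t : Nat) | CCX (a b c : Nat) | seq (g₁ g₂ : Gate)

-- Point update of a bit function (stand-in for `Framework.update`).
def update (f : Nat → Bool) (i : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = i then v else f j

def applyNat : Gate → (Nat → Bool) → (Nat → Bool)
  | .I,         f => f
  | .X q,       f => update f q (!(f q))
  | .CX c t,    f => update f t (f t != f c)
  | .CCX a b c, f => update f c (f c != (f a && f b))
  | .seq g₁ g₂, f => applyNat g₂ (applyNat g₁ f)

-- Hypothetical input: bits 0 and 1 set, everything else clear.
def input : Nat → Bool := fun i => i == 0 || i == 1

-- Both controls are set, so the Toffoli flips target bit 2.
example : applyNat (.CCX 0 1 2) input 2 = true := by decide

-- A position the gate does not mention is left alone.
example : applyNat (.CCX 0 1 2) input 5 = input 5 := by decide

-- `seq` runs the left gate first, then the right one.
example (g₁ g₂ : Gate) (f : Nat → Bool) :
    applyNat (.seq g₁ g₂) f = applyNat g₂ (applyNat g₁ f) := rfl

end ApplySketch
```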
theoremuc_eval_toUCom_acts_on_basis
theorem uc_eval_toUCom_acts_on_basis (dim : Nat) (g : Gate)
    (h_wt : Gate.WellTyped dim g) (f : Nat → Bool) :
    uc_eval (Gate.toUCom dim g) * f_to_vec dim f
      = f_to_vec dim (Gate.applyNat g f)
**The Gate → BaseUCom → basis-state adapter.** For any well-typed `Gate` IR term `g`, the matrix action of `uc_eval (Gate.toUCom dim g)` on the classical basis state `f_to_vec dim f` equals the basis state of `Gate.applyNat g f`. Proved by structural induction on `g`, using the existing per-gate basis action lemmas plus `gate_seq_acts_on_basis` for composition.

**Usage path to `f_modmult_circuit_MMI`**: given a future modular multiplier `g_modmult : Gate` with a Boolean-function correctness theorem `Gate.applyNat g_modmult (encode_pair x 0) = encode_pair (a*x%N) 0`, combine with this adapter and `f_to_vec_eq_basis_padEquiv` (in `Framework/PadAction.lean`) to obtain the `uc_eval ... * basis_vector ... = basis_vector ...` shape that `MultiplyCircuitProperty` requires.
theoremtoUCom_acts_on_basis_of_applyNat_index
theorem toUCom_acts_on_basis_of_applyNat_index
    {dim : Nat} {g : Gate}
    (h_wt : Gate.WellTyped dim g)
    (inputIndex outputIndex : Nat) (f : Nat → Bool)
    (h_input : f_to_vec dim f = basis_vector (2^dim) inputIndex)
    (h_output : f_to_vec dim (Gate.applyNat g f)
                  = basis_vector (2^dim) outputIndex) :
    uc_eval (Gate.toUCom dim g) * basis_vector (2^dim) inputIndex
      = basis_vector (2^dim) outputIndex
**Index-form Gate → BaseUCom → basis_vector adapter** (the `MultiplyCircuitProperty`-shaped specialisation). Given: a well-typed `Gate` term `g`, a Boolean bit-function `f` that encodes some input as the basis state at `inputIndex`, and the fact that `Gate.applyNat g f` re-encodes the output as the basis state at `outputIndex`, the matrix action of `uc_eval (Gate.toUCom dim g)` on `basis_vector (2^dim) inputIndex` yields exactly `basis_vector (2^dim) outputIndex`.

This is precisely the shape of `MultiplyCircuitProperty`'s `uc_eval c (basis_vector …) = basis_vector …` clause; downstream, supply `inputIndex := x * 2^anc`, `outputIndex := (a * x % N) * 2^anc`, and a Boolean encoding `f` of `x` in the data register with the ancilla zeroed.
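The index arithmetic is easy to check on a small instance. The sketch below evaluates the two index formulas for illustrative values (`a = 2`, `N = 5`, `x = 3`, two ancilla qubits); the helper names `inputIndex` and `outputIndex` are hypothetical and only mirror the argument names of the theorem.

```lean
-- Index of the input basis state: data value `x` shifted past `anc` zeroed ancilla bits.
def inputIndex (x anc : Nat) : Nat := x * 2 ^ anc

-- Index of the expected output basis state after multiplying by `a` modulo `N`.
def outputIndex (a N x anc : Nat) : Nat := (a * x % N) * 2 ^ anc

#eval inputIndex 3 2        -- 12
#eval outputIndex 2 5 3 2   -- 4
```

Multiplying by `2 ^ anc` leaves the low `anc` bits of the index at zero, which is the index-level counterpart of "the ancilla zeroed".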
theoremGate.applyNat_oob
theorem Gate.applyNat_oob
    {dim : Nat} {g : Gate}
    (h_wt : Gate.WellTyped dim g)
    (f : Nat → Bool)
    {i : Nat} (hi : dim ≤ i) :
    Gate.applyNat g f i = f i
**Out-of-range preservation of `Gate.applyNat`.** For a `Gate` that is well-typed at `dim` qubits, `Gate.applyNat g f i = f i` for every position `i ≥ dim`. In other words, the gate's Boolean semantics only touches positions `< dim`; any position beyond the declared dimension is fixed.

Proved by induction on `g`:
- `I`: identity, trivial.
- `X q`, `CX c t`, `CCX a b c`: `update f _ _ i = f i` whenever `i` differs from the updated index, which follows from `i ≥ dim` and the corresponding bound from `Gate.WellTyped`.
- `seq g₁ g₂`: chain the two inductive hypotheses.

This is the bit-level analogue of "the gate matrix is padded with identity on the OOB qubits"; used downstream to satisfy the out-of-range branch of `eq_encodeDataZeroAnc_of_data_anc_oob` for modular-multiplier circuits.

FormalRV.Arithmetic.Cuccaro.Cuccaro

FormalRV/Arithmetic/Cuccaro/Cuccaro.lean
FormalRV.BQAlgo.Cuccaro: the Cuccaro–Draper–Kutin–Moulton ripple-carry adder, encoded as concrete `Gate` data over the Framework IR.

Per CLAUDE.md "Paper-claim-first workflow", every claim has the form `paper_claim_X` (the paper's stated number) plus `X_meets_paper_claim` (a theorem that our derivation matches it). Either the proof closes (paper verified for this component) or it doesn't (gap found).

This file covers cost claims (T-count). Semantic correctness (does the MAJ gadget actually compute the majority function on bits?) lives in `BQAlgo/CuccaroCorrectness.lean`.

Refs:
- Cuccaro, Draper, Kutin, Moulton, "A new quantum ripple-carry addition circuit" (arXiv:quant-ph/0410184).
- SQIR/examples/shor/ModMult.v (Coq encoding we're mirroring).
- SQIR/examples/shor/ResourceShor.v `bcgcount_MAJ` ≤ 3 (gate count, not T-count). Each MAJ has 1 CCX + 2 CX, so under the textbook 7-T Toffoli decomposition, T-count is 7 per MAJ.
defpaper_claim_MAJ_tcount
def paper_claim_MAJ_tcount : Nat
Per-MAJ T-count claim. Source chain: SQIR `bcgcount_MAJ` ≤ 3 (gate count) ⟹ 1 Toffoli + 2 CX per MAJ ⟹ 7 T-gates per Toffoli (textbook) ⟹ 7 T-gates per MAJ.
defpaper_claim_UMA_tcount
def paper_claim_UMA_tcount : Nat
Per-UMA T-count claim. Same derivation as MAJ.
defcuccaro_MAJ
def cuccaro_MAJ (a b c : Nat) : Gate
Cuccaro MAJ gadget.
defcuccaro_UMA
def cuccaro_UMA (a b c : Nat) : Gate
Cuccaro UMA gadget.
theoremMAJ_meets_paper_claim
theorem MAJ_meets_paper_claim (a b c : Nat) :
    tcount (cuccaro_MAJ a b c) = paper_claim_MAJ_tcount
✅ MAJ meets the paper claim (T-count = 7).
theoremUMA_meets_paper_claim
theorem UMA_meets_paper_claim (a b c : Nat) :
    tcount (cuccaro_UMA a b c) = paper_claim_UMA_tcount
✅ UMA meets the paper claim (T-count = 7).
example(example)
example : tcount (cuccaro_MAJ 0 1 2) = 7
example(example)
example : tcount (cuccaro_UMA 0 1 2) = 7
example(example)
example : tcount (seq (cuccaro_MAJ 0 1 2) (cuccaro_UMA 0 1 2)) = 14
theoremMAJ_tcount_label_invariant
theorem MAJ_tcount_label_invariant (a b c a' b' c' : Nat) :
    tcount (cuccaro_MAJ a b c) = tcount (cuccaro_MAJ a' b' c')
The MAJ T-count is 7 for *every* qubit assignment, not just (0,1,2).
theoremUMA_tcount_label_invariant
theorem UMA_tcount_label_invariant (a b c a' b' c' : Nat) :
    tcount (cuccaro_UMA a b c) = tcount (cuccaro_UMA a' b' c')
The UMA T-count is 7 for *every* qubit assignment.
theoremMAJ_UMA_pair_tcount
theorem MAJ_UMA_pair_tcount (a b c a' b' c' : Nat) :
    tcount (seq (cuccaro_MAJ a b c) (cuccaro_UMA a' b' c')) = 14
Parametric MAJ+UMA pair cost: 14 T for any qubit assignment.
defcuccaro_maj_chain
def cuccaro_maj_chain : Nat → Nat → Gate
  | 0,     _       => I
  | n + 1, q_start =>
      seq (cuccaro_MAJ q_start (q_start + 1) (q_start + 2))
          (cuccaro_maj_chain n (q_start + 2))
A chain of `n` MAJ gadgets, each operating on a triple of consecutive qubits starting at `q_start`, then `q_start + 2`, then `q_start + 4`, ... (the Cuccaro ripple structure).
defcuccaro_uma_chain
def cuccaro_uma_chain : Nat → Nat → Gate
  | 0,     _       => I
  | n + 1, q_start =>
      seq (cuccaro_UMA q_start (q_start + 1) (q_start + 2))
          (cuccaro_uma_chain n (q_start + 2))
A chain of `n` UMA gadgets in the same ripple structure.
theoremtcount_cuccaro_maj_chain
theorem tcount_cuccaro_maj_chain (n q_start : Nat) :
    tcount (cuccaro_maj_chain n q_start) = 7 * n
T-count of an n-block MAJ chain is exactly `7 * n` (no cross-block savings from gate-level optimization alone).
theoremtcount_cuccaro_uma_chain
theorem tcount_cuccaro_uma_chain (n q_start : Nat) :
    tcount (cuccaro_uma_chain n q_start) = 7 * n
T-count of an n-block UMA chain is exactly `7 * n`.
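The `7 * n` results follow from two facts: each gadget contains one CCX (7 T) and only T-free gates otherwise, and `tcount` is additive over `seq`. The sketch below replays that argument in a self-contained toy model. Everything in it is an assumption made for illustration: the local `Gate` and `tcount` stand in for the Framework definitions, and the body of `maj` (two CX followed by one CCX) is one arrangement consistent with the "1 CCX + 2 CX" description, not necessarily the project's `cuccaro_MAJ`. It also assumes a Lean 4 toolchain recent enough to ship the `omega` tactic in core.

```lean
namespace ChainSketch

inductive Gate where
  | I | X (q : Nat) | CX (c t : Nat) | CCX (a b c : Nat) | seq (g₁ g₂ : Gate)

-- Textbook cost model: 7 T per Toffoli, everything else T-free.
def tcount : Gate → Nat
  | .CCX _ _ _ => 7
  | .seq g₁ g₂ => tcount g₁ + tcount g₂
  | _          => 0

-- Assumed MAJ body: 2 CX + 1 CCX.
def maj (a b c : Nat) : Gate :=
  .seq (.CX c b) (.seq (.CX c a) (.CCX a b c))

def majChain : Nat → Nat → Gate
  | 0,     _ => .I
  | n + 1, q => .seq (maj q (q + 1) (q + 2)) (majChain n (q + 2))

theorem tcount_maj (a b c : Nat) : tcount (maj a b c) = 7 := rfl

theorem tcount_majChain (n q : Nat) : tcount (majChain n q) = 7 * n := by
  induction n generalizing q with
  | zero => rfl
  | succ n ih =>
    -- Peel off one block; `tcount` of a `seq` is the sum of its parts.
    show tcount (maj q (q + 1) (q + 2)) + tcount (majChain n (q + 2)) = 7 * (n + 1)
    rw [tcount_maj, ih]
    omega

end ChainSketch
```

The induction generalizes over the start qubit because each recursive call shifts it by 2; the count itself never depends on the qubit labels.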
defcuccaro_n_bit_adder_skeleton
def cuccaro_n_bit_adder_skeleton (n q_start : Nat) : Gate
A simplified n-bit Cuccaro adder skeleton: `n` MAJs forward, then `n` UMAs back. Real Cuccaro has additional CX corrections at the boundaries (paper p. 22-24); for the T-count this skeleton is exact since CX is T-free.
theoremtcount_cuccaro_n_bit_adder_skeleton
theorem tcount_cuccaro_n_bit_adder_skeleton (n q_start : Nat) :
    tcount (cuccaro_n_bit_adder_skeleton n q_start) = 14 * n
**n-bit adder T-count is `14 * n`**, verified from gate-level construction (not taken as a paper input). This re-derives the "per-block" claim qianxu uses in Eq. E3 from the actual Cuccaro gate sequence.
example(example)
example : tcount (cuccaro_n_bit_adder_skeleton 4 0) = 56
Smoke: 4-bit adder skeleton has 14 × 4 = 56 T-gates.
defoptimize_ccx_pair_top
def optimize_ccx_pair_top : Gate → Gate
  | seq (CCX a b c) (CCX a' b' c') =>
      if a = a' ∧ b = b' ∧ c = c' then I
      else seq (CCX a b c) (CCX a' b' c')
  | seq g₁ g₂ => seq g₁ g₂
  | I => I
  | X q => X q
  | CX a b => CX a b
  | CCX a b c => CCX a b c
Top-level CCX-pair-removal: if the outermost `seq` contains the same `CCX a b c` on both sides, replace with `I` (zero T-count). All other shapes are returned unchanged.
example(example)
example : tcount (optimize_ccx_pair_top (seq (CCX 0 1 2) (CCX 0 1 2))) = 0
Smoke test: the optimization detects an identical adjacent CCX pair and reduces T-count from 14 to 0.
example(example)
example : tcount (optimize_ccx_pair_top (seq (CCX 0 1 2) (CCX 0 1 3))) = 14
Smoke test: when the two CCX's differ (different target), the optimizer leaves the circuit unchanged.
example(example)
example : optimize_ccx_pair_top (seq (X 0) (CX 0 1)) = seq (X 0) (CX 0 1)
Smoke test: non-CCX shapes are passed through.
theoremtcount_optimize_ccx_pair_top_le
theorem tcount_optimize_ccx_pair_top_le (g : Gate) :
    tcount (optimize_ccx_pair_top g) ≤ tcount g
The optimization never increases T-count: it either rewrites a matching CCX pair to `I` (drops 14 T's) or leaves the circuit unchanged. Combined with the semantic justification in `Framework.PadAction.CCX_CCX_id`, this is a real T-count monotonicity proof for a top-level circuit rewrite.
theoremgcount_optimize_ccx_pair_top_le
theorem gcount_optimize_ccx_pair_top_le (g : Gate) :
    gcount (optimize_ccx_pair_top g) ≤ gcount g
Gate-count monotonicity for the top-level CCX-pair rewrite. Same case structure as the T-count version: pair-match drops gcount from 2 to 0, all other shapes are unchanged.
defoptimize_ccx_pairs_deep
def optimize_ccx_pairs_deep : Gate → Gate
  | seq g₁ g₂ =>
      optimize_ccx_pair_top
        (seq (optimize_ccx_pairs_deep g₁) (optimize_ccx_pairs_deep g₂))
  | g => g
Recursive deep CCX-pair-removal: bottom-up, optimize children first, then apply the top-level rewrite. Catches nested patterns that `optimize_ccx_pair_top` alone misses.
example(example)
example :
    tcount (optimize_ccx_pairs_deep
      (seq (X 0) (seq (CCX 0 1 2) (CCX 0 1 2)))) = 0
Smoke test: nested CCX pair inside a seq is detected. Without the deep optimizer, `optimize_ccx_pair_top` alone wouldn't touch a CCX pair hidden behind a `seq (X 0) (...)`.
example(example)
example :
    tcount (optimize_ccx_pairs_deep (seq (CCX 0 1 2) (CCX 0 1 2))) = 0
Smoke test: deep optimizer also catches the trivial top-level case.
theoremtcount_optimize_ccx_pairs_deep_le
theorem tcount_optimize_ccx_pairs_deep_le (g : Gate) :
    tcount (optimize_ccx_pairs_deep g) ≤ tcount g
Deep optimization is also T-count-monotone-non-increasing. Inductive proof: assume both children's T-counts are bounded above by their pre-optimization values (IH), then chain through `seq` additivity and the top-level monotonicity result.
defoptimize_ccx_iter
def optimize_ccx_iter : Nat → Gate → Gate
  | 0, g => g
  | n + 1, g => optimize_ccx_iter n (optimize_ccx_pairs_deep g)
Nat-fueled iteration of `optimize_ccx_pairs_deep`. Useful as a fixpoint driver: pick an upper bound on the number of passes (e.g., bounded by gate count) and the result is guaranteed to be no worse than the input.
theoremtcount_optimize_ccx_iter_le
theorem tcount_optimize_ccx_iter_le (n : Nat) (g : Gate) :
    tcount (optimize_ccx_iter n g) ≤ tcount g
Iterated optimization preserves T-count monotonicity. Inductive on the fuel: each step composes the deep optimizer's bound with the previous iterates'.
example(example)
example : tcount (optimize_ccx_iter 5 (seq (CCX 0 1 2) (CCX 0 1 2))) = 0
Smoke: a top-level CCX pair optimizes to T-count 0. A single pass already collapses it to `I`; the remaining fuel (5 here) leaves `I` unchanged.
example(example)
example : tcount (optimize_ccx_iter 10 (CCX 0 1 2)) = 7
Smoke: a single (un-pairable) CCX is not affected by any number of iterations — T-count stays at 7.
defoptimize_I_top
def optimize_I_top : Gate → Gate
  | seq I g => g
  | seq g I => g
  | g       => g
Top-level identity-elimination: drops `I` from either side of an outermost `seq`.
theoremtcount_optimize_I_top
theorem tcount_optimize_I_top (g : Gate) :
    tcount (optimize_I_top g) = tcount g
Identity-elimination preserves T-count exactly (since `tcount I = 0`).
theoremtcount_optimize_I_top_le
theorem tcount_optimize_I_top_le (g : Gate) :
    tcount (optimize_I_top g) ≤ tcount g
T-count monotonicity follows trivially from exact equality.
theoremgcount_optimize_I_top
theorem gcount_optimize_I_top (g : Gate) :
    gcount (optimize_I_top g) = gcount g
I-elimination also preserves gate count exactly, since `gcount I = 0` (identity gates don't count).
theoremgcount_optimize_I_top_le
theorem gcount_optimize_I_top_le (g : Gate) :
    gcount (optimize_I_top g) ≤ gcount g
example(example)
example : tcount (optimize_I_top (seq I (CCX 0 1 2))) = 7
Smoke: `optimize_I_top` drops a leading `I`, so `seq I (CCX)` keeps T-count 7. This is the shape the deep CCX optimizer leaves behind in a cascading pattern: `seq (CCX) (seq (CCX ; CCX) (CCX))` goes deep → `seq (CCX) (seq I (CCX))`. There the `I` sits below the top level, so top-level I-elimination doesn't fire on it; a recursive application would. This smoke only checks the top-level behavior.
example(example)
example : optimize_I_top (seq (CCX 0 1 2) I) = CCX 0 1 2
Smoke: `seq (CCX) I` reduces to plain `CCX`.
defoptimize_I_pairs_deep
def optimize_I_pairs_deep : Gate → Gate
  | seq g₁ g₂ =>
      optimize_I_top
        (seq (optimize_I_pairs_deep g₁) (optimize_I_pairs_deep g₂))
  | g => g
Recursive deep identity-elimination: bottom-up, optimize children first, then apply the top-level rewrite.
theoremtcount_optimize_I_pairs_deep
theorem tcount_optimize_I_pairs_deep (g : Gate) :
    tcount (optimize_I_pairs_deep g) = tcount g
Deep I-elimination preserves T-count exactly (every step is exact).
defoptimize_full
def optimize_full (g : Gate) : Gate
The full single-pass optimizer: first reduce CCX pairs (which may introduce `I` placeholders), then sweep out the resulting `I`s.
theoremtcount_optimize_full_le
theorem tcount_optimize_full_le (g : Gate) :
    tcount (optimize_full g) ≤ tcount g
The combined optimizer is also T-count-monotone-non-increasing. Since deep I-elimination preserves T-count exactly, the combined bound is just the deep CCX bound.
example(example)
example : tcount (optimize_full
    (seq I (seq (CCX 0 1 2) (CCX 0 1 2)))) = 0
Smoke: directly-adjacent CCX pair under an `I` wrapper. The CCX pair is detected by the deep CCX-elim (returns `seq I I`), then the I-elim collapses it to `I`.
example(example)
example : tcount (optimize_full
    (seq (CCX 0 1 2) (seq (CCX 0 1 2) I))) = 14
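Smoke: a CCX pair separated by a `seq` wrapper is NOT collapsed by a single `optimize_full` pass. CCX elimination runs first and sees `seq (CCX) (seq (CCX) I)`, which contains no directly adjacent pair. I-elimination then rewrites the inner `seq (CCX) I` to `CCX`, making the pair adjacent, but by then the CCX phase of this pass has already finished. Documents the known limitation: one pass isn't a fixpoint.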
defoptimize_full_iter
def optimize_full_iter : Nat → Gate → Gate
  | 0,     g => g
  | n + 1, g => optimize_full_iter n (optimize_full g)
Nat-fueled iteration of `optimize_full`. Each fuel step alternates CCX-pair removal and I-elimination, so two iterations suffice for the associativity-blocked example above.
theoremtcount_optimize_full_iter_le
theorem tcount_optimize_full_iter_le (n : Nat) (g : Gate) :
    tcount (optimize_full_iter n g) ≤ tcount g
Iterated combined optimization is monotone non-increasing in T-count. Inductive on fuel via `Nat.le_trans` and the single-pass bound.
example(example)
example : tcount (optimize_full_iter 2
    (seq (CCX 0 1 2) (seq (CCX 0 1 2) I))) = 0
Smoke: the associativity-blocked case from above now collapses to `I` (T-count 0) after 2 iterations. First iteration runs CCX-elim + I-elim, exposing a new top-level `seq (CCX) (CCX)` pair that the second iteration eliminates.
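The one-pass versus two-pass behavior can be replayed end to end in a small standalone model. This is a sketch under assumptions: the local `Gate`, `tcount`, and the shortened names (`ccxTop`, `ccxDeep`, `iTop`, `iDeep`, `full`, `fullIter`, `blocked`) are hypothetical stand-ins that follow the definitions shown in this module, with `full` assumed to be deep I-elimination applied after deep CCX-pair removal.

```lean
namespace OptSketch

inductive Gate where
  | I | X (q : Nat) | CX (c t : Nat) | CCX (a b c : Nat) | seq (g₁ g₂ : Gate)

def tcount : Gate → Nat
  | .CCX _ _ _ => 7
  | .seq g₁ g₂ => tcount g₁ + tcount g₂
  | _          => 0

-- Top-level rewrite: an identical adjacent CCX pair becomes `I`.
def ccxTop : Gate → Gate
  | .seq (.CCX a b c) (.CCX a' b' c') =>
      if a = a' ∧ b = b' ∧ c = c' then .I
      else .seq (.CCX a b c) (.CCX a' b' c')
  | g => g

-- Bottom-up: optimize children, then try the top-level rewrite.
def ccxDeep : Gate → Gate
  | .seq g₁ g₂ => ccxTop (.seq (ccxDeep g₁) (ccxDeep g₂))
  | g => g

def iTop : Gate → Gate
  | .seq .I g => g
  | .seq g .I => g
  | g         => g

def iDeep : Gate → Gate
  | .seq g₁ g₂ => iTop (.seq (iDeep g₁) (iDeep g₂))
  | g => g

-- One pass: CCX-pair removal first, then I-elimination.
def full (g : Gate) : Gate := iDeep (ccxDeep g)

def fullIter : Nat → Gate → Gate
  | 0,     g => g
  | n + 1, g => fullIter n (full g)

-- The pair is hidden behind `seq (CCX) I` on the right.
def blocked : Gate := .seq (.CCX 0 1 2) (.seq (.CCX 0 1 2) .I)

-- Pass 1 only strips the `I`, leaving `seq (CCX) (CCX)`.
example : tcount (full blocked) = 14 := by decide

-- Pass 2 sees the now-adjacent pair and removes it.
example : tcount (fullIter 2 blocked) = 0 := by decide

end OptSketch
```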
theoremgcount_optimize_ccx_pairs_deep_le
theorem gcount_optimize_ccx_pairs_deep_le (g : Gate) :
    gcount (optimize_ccx_pairs_deep g) ≤ gcount g
theoremgcount_optimize_I_pairs_deep
theorem gcount_optimize_I_pairs_deep (g : Gate) :
    gcount (optimize_I_pairs_deep g) = gcount g
theoremgcount_optimize_full_le
theorem gcount_optimize_full_le (g : Gate) :
    gcount (optimize_full g) ≤ gcount g
theoremgcount_optimize_full_iter_le
theorem gcount_optimize_full_iter_le (n : Nat) (g : Gate) :
    gcount (optimize_full_iter n g) ≤ gcount g
theoremgcount_optimize_ccx_pair_top_strict_on_pair
theorem gcount_optimize_ccx_pair_top_strict_on_pair (a b c : Nat) :
    gcount (optimize_ccx_pair_top (seq (CCX a b c) (CCX a b c))) <
      gcount (seq (CCX a b c) (CCX a b c))
If a CCX-CCX pair appears at the top level (same triple on both sides), the top-level optimizer strictly decreases gcount.
theoremtcount_optimize_ccx_pair_top_strict_on_pair
theorem tcount_optimize_ccx_pair_top_strict_on_pair (a b c : Nat) :
    tcount (optimize_ccx_pair_top (seq (CCX a b c) (CCX a b c))) <
      tcount (seq (CCX a b c) (CCX a b c))
Same statement for T-count: 14 → 0 is strict.
defhas_ccx_pair
def has_ccx_pair : Gate → Bool
  | seq (CCX a b c) (CCX a' b' c') =>
      (decide (a = a')) && (decide (b = b')) && (decide (c = c'))
  | seq g₁ g₂ => has_ccx_pair g₁ || has_ccx_pair g₂
  | _ => false
Decidable predicate: does `g` contain an adjacent CCX-CCX pair anywhere? Recurses into `seq` children. Used as the hypothesis for the future strict-decrease theorem.
example(example)
example : has_ccx_pair (seq (CCX 0 1 2) (CCX 0 1 2)) = true
Smoke: direct CCX pair is detected.
example(example)
example : has_ccx_pair (seq (X 0) (seq (CCX 0 1 2) (CCX 0 1 2))) = true
Smoke: nested CCX pair under X is detected via recursion.
example(example)
example : has_ccx_pair (seq (CCX 0 1 2) (CCX 0 1 3)) = false
Smoke: differing-triple CCX pair returns false.
example(example)
example : has_ccx_pair (seq (X 0) (CX 0 1)) = false
Smoke: no CCX at all → false.
example(example)
example : has_ccx_pair
    (seq (X 0) (seq (X 1) (seq (CCX 0 1 2) (CCX 0 1 2)))) = true
Smoke: deeply nested CCX pair behind two X's.
theoremgcount_optimize_ccx_pairs_deep_strict_on_pair
theorem gcount_optimize_ccx_pairs_deep_strict_on_pair (a b c : Nat) :
    gcount (optimize_ccx_pairs_deep (seq (CCX a b c) (CCX a b c))) <
      gcount (seq (CCX a b c) (CCX a b c))
Strict-decrease witness for the deep optimizer at the simplest input shape: a top-level CCX-CCX pair. The deep optimizer recurses into each child (both CCXs return themselves), then top-level matches the pair → `I`. So gcount drops 2 → 0 strictly. This is the "easy" seed for the future general theorem `has_ccx_pair g = true → gcount (optimize_ccx_pairs_deep g) < gcount g`.
theoremtcount_optimize_ccx_pairs_deep_strict_on_pair
theorem tcount_optimize_ccx_pairs_deep_strict_on_pair (a b c : Nat) :
    tcount (optimize_ccx_pairs_deep (seq (CCX a b c) (CCX a b c))) <
      tcount (seq (CCX a b c) (CCX a b c))
Same for T-count: deep optimizer drops 14 → 0 strictly on a pair.
theoremgcount_optimize_ccx_pairs_deep_strict_seq_X_pair
theorem gcount_optimize_ccx_pairs_deep_strict_seq_X_pair (q a b c : Nat) :
    gcount (optimize_ccx_pairs_deep
      (seq (X q) (seq (CCX a b c) (CCX a b c)))) <
    gcount (seq (X q) (seq (CCX a b c) (CCX a b c)))
Strict-decrease for a CCX pair nested under an X wrapper: the deep optimizer recurses, eliminates the inner CCX pair (replacing with `I`), then leaves `seq (X q) I` at the top. gcount drops 3 → 1. Demonstrates that strict-decrease propagates through the recursive structure of the deep optimizer when a pair exists anywhere below.
theoremtcount_optimize_ccx_pairs_deep_strict_seq_X_pair
theorem tcount_optimize_ccx_pairs_deep_strict_seq_X_pair (q a b c : Nat) :
    tcount (optimize_ccx_pairs_deep
      (seq (X q) (seq (CCX a b c) (CCX a b c)))) <
    tcount (seq (X q) (seq (CCX a b c) (CCX a b c)))
T-count strict-decrease for the same nested-under-X case: 14 → 0 (the X has tcount 0; only the CCX pair contributes).
theoremgcount_optimize_ccx_pairs_deep_strict_pair_seq_X
theorem gcount_optimize_ccx_pairs_deep_strict_pair_seq_X (a b c q : Nat) :
    gcount (optimize_ccx_pairs_deep
      (seq (seq (CCX a b c) (CCX a b c)) (X q))) <
    gcount (seq (seq (CCX a b c) (CCX a b c)) (X q))
Symmetric form: CCX pair on the LEFT of an X wrapper. Deep optimizer reduces gcount 3 → 1 and tcount 14 → 0.
theoremtcount_optimize_ccx_pairs_deep_strict_pair_seq_X
theorem tcount_optimize_ccx_pairs_deep_strict_pair_seq_X (a b c q : Nat) :
    tcount (optimize_ccx_pairs_deep
      (seq (seq (CCX a b c) (CCX a b c)) (X q))) <
    tcount (seq (seq (CCX a b c) (CCX a b c)) (X q))
theoremgcount_optimize_ccx_pairs_deep_strict_pair_left
theorem gcount_optimize_ccx_pairs_deep_strict_pair_left (a b c : Nat) (g : Gate) :
    gcount (optimize_ccx_pairs_deep
      (seq (seq (CCX a b c) (CCX a b c)) g)) <
    gcount (seq (seq (CCX a b c) (CCX a b c)) g)
Parametric: CCX pair on the LEFT, any gate `g` on the right. The deep optimizer collapses the pair to `I`, then leaves `seq I (deep g)` at top-level (no further match). Strict because the pair drops 2 gcount; `g`-side is monotone non-increasing.
theoremgcount_optimize_ccx_pairs_deep_strict_pair_right
theorem gcount_optimize_ccx_pairs_deep_strict_pair_right (g : Gate) (a b c : Nat) :
    gcount (optimize_ccx_pairs_deep
      (seq g (seq (CCX a b c) (CCX a b c)))) <
    gcount (seq g (seq (CCX a b c) (CCX a b c)))
Symmetric parametric: CCX pair on the RIGHT, any gate `g` on the left. Same shape as the `_left` variant: collapse the inner pair to `I`. The top-level optimizer can't be definitionally reduced on `seq (deep g) I` (since `deep g` is opaque), but its universal monotonicity bound is enough.
theoremgcount_optimize_ccx_pairs_deep_strict_via_left
theorem gcount_optimize_ccx_pairs_deep_strict_via_left (g₁ g₂ : Gate)
    (ih₁ : gcount (optimize_ccx_pairs_deep g₁) < gcount g₁) :
    gcount (optimize_ccx_pairs_deep (seq g₁ g₂)) < gcount (seq g₁ g₂)
If the LEFT child's deep optimization strictly decreases gcount, so does the seq's.
theoremgcount_optimize_ccx_pairs_deep_strict_via_right
theorem gcount_optimize_ccx_pairs_deep_strict_via_right (g₁ g₂ : Gate)
    (ih₂ : gcount (optimize_ccx_pairs_deep g₂) < gcount g₂) :
    gcount (optimize_ccx_pairs_deep (seq g₁ g₂)) < gcount (seq g₁ g₂)
Symmetric: if the RIGHT child's deep optimization strictly decreases gcount, so does the seq's.
theoremgcount_optimize_ccx_pairs_deep_strict
theorem gcount_optimize_ccx_pairs_deep_strict (g : Gate)
    (h : has_ccx_pair g = true) :
    gcount (optimize_ccx_pairs_deep g) < gcount g
**Main strict-decrease theorem.** If a gate contains any adjacent CCX-CCX pair (anywhere, as found by the recursive `has_ccx_pair` detector), the deep optimizer strictly reduces gcount. This is the well-founded termination prerequisite for an unfueled fixpoint.
theoremgcount_optimize_full_strict
theorem gcount_optimize_full_strict (g : Gate)
    (h : has_ccx_pair g = true) :
    gcount (optimize_full g) < gcount g
Strict-decrease lifted to `optimize_full = I-deep ∘ CCX-deep`. Since I-elim preserves gcount exactly, the strict drop comes entirely from the CCX-elim phase. Direct 3-line chain.
defoptimize_to_fixpoint
def optimize_to_fixpoint (g : Gate) : Gate
Iterate `optimize_full` until no adjacent CCX pair remains. Well-founded recursion on `gcount g`. The `_h` proof of `has_ccx_pair g = true` is unused inside the `then` branch but consumed by `decreasing_by` — Lean 4 allows the `_` prefix while still permitting references in proof-obligation blocks.
theoremoptimize_to_fixpoint_eq_self_of_no_pair
theorem optimize_to_fixpoint_eq_self_of_no_pair (g : Gate)
    (h : has_ccx_pair g = false) :
    optimize_to_fixpoint g = g
Easy direction: when `g` has no pair, `optimize_to_fixpoint g = g`. Just unfolds the `else` branch of the wf-definition.
theoremoptimize_to_fixpoint_eq_recurse_of_pair
theorem optimize_to_fixpoint_eq_recurse_of_pair (g : Gate)
    (h : has_ccx_pair g = true) :
    optimize_to_fixpoint g = optimize_to_fixpoint (optimize_full g)
One-step unfolding when `g` has a pair: `optimize_to_fixpoint g = optimize_to_fixpoint (optimize_full g)`.
theoremhas_ccx_pair_optimize_to_fixpoint
theorem has_ccx_pair_optimize_to_fixpoint (g : Gate) :
    has_ccx_pair (optimize_to_fixpoint g) = false
**Fixpoint property.** The optimizer terminates at an output with no remaining CCX pairs. Proved by well-founded recursion on `gcount g`, with `gcount_optimize_full_strict` as the decreasing bound.
theoremtcount_optimize_to_fixpoint_le
theorem tcount_optimize_to_fixpoint_le (g : Gate) :
    tcount (optimize_to_fixpoint g) ≤ tcount g
T-count monotonicity for the fixpoint operator. Same WF-recursive proof pattern as the fixpoint property, chaining the IH with `tcount_optimize_full_le` for the recursive step.
theoremgcount_optimize_to_fixpoint_le
theorem gcount_optimize_to_fixpoint_le (g : Gate) :
    gcount (optimize_to_fixpoint g) ≤ gcount g
Same monotonicity for gate-count.
defassoc_right_step
def assoc_right_step : Gate → Gate
  | seq (seq a b) c => seq a (seq b c)
  | g => g
Single top-level right-rotation: turns `seq (seq a b) c` into `seq a (seq b c)`. All other shapes pass through unchanged.
theoremtcount_assoc_right_step
theorem tcount_assoc_right_step (g : Gate) :
    tcount (assoc_right_step g) = tcount g
The rotation preserves T-count exactly.
theoremgcount_assoc_right_step
theorem gcount_assoc_right_step (g : Gate) :
    gcount (assoc_right_step g) = gcount g
The rotation preserves gate count exactly.
example(example)
example (a b c : Nat) (q : Nat) :
    assoc_right_step (seq (seq (CCX a b c) (CCX a b c)) (X q))
      = seq (CCX a b c) (seq (CCX a b c) (X q))
Smoke: rotating `seq (seq A B) C` gives `seq A (seq B C)`.
example(example)
example (q1 q2 : Nat) : assoc_right_step (seq (X q1) (X q2)) = seq (X q1) (X q2)
Smoke: rotating a non-left-leaning seq is a no-op.
defassoc_right_iter
def assoc_right_iter : Nat → Gate → Gate
  | 0, g => g
  | n + 1, g => assoc_right_iter n (assoc_right_step g)
Nat-fueled iteration of `assoc_right_step` at the top level. With enough fuel, the outer seq tree becomes right-leaning.
theoremtcount_assoc_right_iter
theorem tcount_assoc_right_iter (n : Nat) (g : Gate) :
    tcount (assoc_right_iter n g) = tcount g
Iterating rotations preserves T-count exactly. Induction on fuel + each step's preservation.
theoremgcount_assoc_right_iter
theorem gcount_assoc_right_iter (n : Nat) (g : Gate) :
    gcount (assoc_right_iter n g) = gcount g
Same exact preservation for gate count.
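As a rough illustration of how the fuelled rotation behaves, the following self-contained Lean sketch replays the two definitions on a toy gate type. `ToyGate`, `rotStep`, and `rotIter` are hypothetical stand-ins introduced only for this example; the real `Gate` type has more constructors.

```lean
-- Toy stand-in for `Gate` (assumption: only the constructors needed here).
inductive ToyGate where
  | X   : Nat → ToyGate
  | seq : ToyGate → ToyGate → ToyGate

namespace ToyGate

/-- One top-level right rotation, mirroring `assoc_right_step`. -/
def rotStep : ToyGate → ToyGate
  | seq (seq a b) c => seq a (seq b c)
  | g => g

/-- Fuelled iteration, mirroring `assoc_right_iter`. -/
def rotIter : Nat → ToyGate → ToyGate
  | 0, g => g
  | n + 1, g => rotIter n (rotStep g)

-- Two units of fuel flatten a left spine of depth two.
example :
    rotIter 2 (seq (seq (seq (X 0) (X 1)) (X 2)) (X 3))
      = seq (X 0) (seq (X 1) (seq (X 2) (X 3))) := rfl

-- Extra fuel is harmless: once the root's left child is not a `seq`,
-- every further step is a no-op.
example : rotIter 5 (seq (X 0) (X 1)) = seq (X 0) (X 1) := rfl

end ToyGate
```

Because each step only reorders the same leaves, any count defined leaf-by-leaf (such as `tcount` or `gcount`) is unchanged, which is what the preservation theorems above state.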

FormalRV.Arithmetic.Cuccaro.CuccaroAddConst

FormalRV/Arithmetic/Cuccaro/CuccaroAddConst.lean
FormalRV.BQAlgo.CuccaroAddConst: exact-budget Cuccaro add-constant primitive.
Tick 45: build the add-constant primitive on top of the `cuccaro_n_bit_adder_full` machinery from Ticks 41-44. `cuccaro_addConstGate bits q_start c` implements `target ← (target + c) mod 2^bits` in place on the target/b register at positions `q_start + 2i + 1`, using a "prepare + adder + unprepare" pattern that re-encodes the constant `c` into the read/a register, runs the full Cuccaro adder, then unprepares the read register back to zero. Total qubit budget: `2*bits + 1` starting at `q_start`, which matches SQIR's `modmult_rev_anc bits = 2*bits + 1` exactly.
Structure:
- `cuccaro_prepareConstRead`: XOR each read position with `c.testBit i`.
- Per-position lemmas: action at read positions vs everywhere else.
- `cuccaro_addConstGate`: composed gate.
- Decoded correctness: target = `(x + c) % 2^bits`, read restored to 0, carry-in restored to false.
- WellTyped + packaged primitive.
defcuccaro_prepareConstRead
def cuccaro_prepareConstRead : Nat → Nat → Nat → Gate
  | 0,     _,       _ => Gate.I
  | n + 1, q_start, c =>
      seq (cuccaro_prepareConstRead n q_start c)
          (cond (c.testBit n) (Gate.X (q_start + 2 * n + 2)) Gate.I)
**Constant-read preparation.** For each bit `i < bits`, applies `X` at the read-register position `q_start + 2*i + 2` iff `c.testBit i`. The gate is self-inverse on the affected positions (since X² = I).
theoremcuccaro_prepareConstRead_at_other
theorem cuccaro_prepareConstRead_at_other
    (bits q_start c q : Nat)
    (hq : ∀ i, i < bits → q ≠ q_start + 2 * i + 2)
    (f : Nat → Bool) :
    Gate.applyNat (cuccaro_prepareConstRead bits q_start c) f q = f q
**Frame: prepare doesn't touch positions outside the read range.** If `q` is not equal to any read position `q_start + 2*i + 2` (i < bits), the prepare gate leaves `f q` unchanged.
theoremcuccaro_prepareConstRead_at_read
theorem cuccaro_prepareConstRead_at_read
    (bits q_start c j : Nat) (hj : j < bits) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_prepareConstRead bits q_start c) f
        (q_start + 2 * j + 2)
      = xor (f (q_start + 2 * j + 2)) (c.testBit j)
**Action at read positions.** At read-position `q_start + 2*j + 2` for `j < bits`, the prepare gate XORs `c.testBit j` into the existing value.
theoremcuccaro_prepareConstRead_wellTyped
theorem cuccaro_prepareConstRead_wellTyped
    (bits q_start c dim : Nat) (h : q_start + 2 * bits + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_prepareConstRead bits q_start c)
**WellTyped: prepare fits in `q_start + 2*bits + 1` qubits.**
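The behaviour of the prepare gate can be sketched on a plain Boolean model, where a classical state is a function from wire index to bit. The names `flipAt`, `prepareRead`, and `allFalse` are hypothetical helpers for this illustration and are not part of FormalRV; the model only mirrors the recursion shown in `cuccaro_prepareConstRead`.

```lean
/-- Flip the bit on wire `q` (the classical action of an X gate). -/
def flipAt (f : Nat → Bool) (q : Nat) : Nat → Bool :=
  fun p => if p = q then !(f p) else f p

/-- Classical model of the prepare gate: flip read wire `q + 2*i + 2`
    exactly when bit `i` of the constant `c` is set. -/
def prepareRead : Nat → Nat → Nat → (Nat → Bool) → (Nat → Bool)
  | 0, _, _, f => f
  | n + 1, q, c, f =>
      let g := prepareRead n q c f
      if c.testBit n then flipAt g (q + 2 * n + 2) else g

def allFalse : Nat → Bool := fun _ => false

-- c = 5 = 0b101 with bits = 3 and q_start = 0: read wires are 2, 4, 6.
#eval (List.range 7).map (prepareRead 3 0 5 allFalse)
-- [false, false, true, false, false, false, true]

-- Applying prepare twice restores the all-zero state (X² = I).
#eval (List.range 7).map (prepareRead 3 0 5 (prepareRead 3 0 5 allFalse))
-- [false, false, false, false, false, false, false]
```

Only wires 2 and 6 change in the first evaluation, which matches the frame and action lemmas: the carry-in wire 0 and the target wires 1, 3, 5 are untouched.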
defcuccaro_addConstGate
def cuccaro_addConstGate (bits q_start c : Nat) : Gate
**Exact-budget Cuccaro add-constant gate.** Implements `target ← (target + c) mod 2^bits` in place via prepare-adder-unprepare. Total qubit budget: `2*bits + 1` starting at `q_start`.
theoremcuccaro_addConstGate_target_bit
theorem cuccaro_addConstGate_target_bit
    (bits q_start c x i : Nat) (hi : i < bits)
    (hc : c < 2^bits) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * i + 1)
      = (x + c).testBit i
**Target bit at position `q_start + 2*i + 1` for `i < bits` after the addConstGate**: equals `(x + c).testBit i`. Proved by tracing the three-stage composition:
- After prepare₁ on input `cuccaro_input_F q_start false 0 x`: carry-in = false, b-bits = x.testBit, a-bits = c.testBit (XOR'd in).
- After full adder: sum bit = `(x + c).testBit i` via the sum-bit theorem and `Adder.sumfb_eq_testBit_add_gen`.
- After prepare₂: target b-bit position unchanged (prepare touches only a-positions).
theoremcuccaro_addConstGate_read_bit
theorem cuccaro_addConstGate_read_bit
    (bits q_start c x i : Nat) (hi : i < bits) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * i + 2)
      = false
**Read bit at position `q_start + 2*i + 2` for `i < bits` after the addConstGate**: equals `false` (restored to zero). Trace:
- After prepare₁: a-bit at q_start+2*i+2 = false ⊕ c.testBit i = c.testBit i.
- After full adder: a preserved (= c.testBit i) by `_a_restored`.
- After prepare₂: c.testBit i ⊕ c.testBit i = false.
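The XOR cancellation behind this trace can be checked in isolation. The sketch below uses a locally defined `bxor` (a hypothetical helper, chosen so the example does not depend on which XOR name a given Lean version exports) and treats the adder stage as leaving the read bit unchanged, as the trace states.

```lean
/-- Boolean XOR, defined locally for this sketch. -/
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

-- A cleared read bit: prepare₁ XORs in `cbit`, prepare₂ XORs it out again.
example (cbit : Bool) : bxor (bxor false cbit) cbit = false := by
  cases cbit <;> rfl

-- More generally, XOR-ing the same bit twice is the identity,
-- which is why the prepare gate is self-inverse on the read wires.
example (r cbit : Bool) : bxor (bxor r cbit) cbit = r := by
  cases r <;> cases cbit <;> rfl
```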
theoremcuccaro_addConstGate_carry_in_bit
theorem cuccaro_addConstGate_carry_in_bit
    (bits q_start c x : Nat) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c)
        (cuccaro_input_F q_start false 0 x) q_start = false
**Carry-in at position `q_start` after the addConstGate**: equals `false` (restored).
theoremcuccaro_addConstGate_target_decode
theorem cuccaro_addConstGate_target_decode
    (bits q_start c x : Nat) (hc : c < 2^bits) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (cuccaro_addConstGate bits q_start c)
          (cuccaro_input_F q_start false 0 x))
      = (x + c) % 2^bits
**HEADLINE: decoded target correctness.** After running `cuccaro_addConstGate bits q_start c` on `cuccaro_input_F q_start false 0 x`, the target register decodes to `(x + c) % 2^bits`.
theoremcuccaro_addConstGate_read_decode
theorem cuccaro_addConstGate_read_decode
    (bits q_start c x : Nat) :
    cuccaro_read_val bits q_start
        (Gate.applyNat (cuccaro_addConstGate bits q_start c)
          (cuccaro_input_F q_start false 0 x))
      = 0
**Decoded read restoration.** After running the addConstGate, the read register decodes to `0`.
theoremcuccaro_addConstGate_wellTyped
theorem cuccaro_addConstGate_wellTyped
    (bits q_start c dim : Nat) (h : q_start + 2 * bits + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_addConstGate bits q_start c)
**WellTyped: addConstGate fits in `q_start + 2*bits + 1` qubits.**
theoremcuccaro_addConstGate_clean
theorem cuccaro_addConstGate_clean
    (bits q_start c x : Nat) (hc : c < 2^bits) :
    Gate.WellTyped (q_start + (2 * bits + 1))
        (cuccaro_addConstGate bits q_start c)
    ∧ cuccaro_target_val bits q_start
          (Gate.applyNat (cuccaro_addConstGate bits q_start c)
            (cuccaro_input_F q_start false 0 x))
        = (x + c) % 2^bits
    ∧ cuccaro_read_val bits q_start
          (Gate.applyNat (cuccaro_addConstGate bits q_start c)
            (cuccaro_input_F q_start false 0 x))
        = 0
**HEADLINE: packaged Cuccaro add-constant primitive.** For any `bits`, `q_start`, `c < 2^bits`, and `x`, the addConstGate:
- is WellTyped at dimension `q_start + (2*bits + 1)`;
- writes `(x + c) % 2^bits` into the target register;
- restores the read register to 0.
Restoration of the carry-in qubit to `false` is not one of the three conjuncts of this statement; it is stated separately as `cuccaro_addConstGate_carry_in_bit`.

FormalRV.Arithmetic.Cuccaro.CuccaroCompare

FormalRV/Arithmetic/Cuccaro/CuccaroCompare.lean
FormalRV.BQAlgo.CuccaroCompare: exact-budget comparator from the Cuccaro MAJ-chain forward pass.
Tick 47: build the first exact-budget comparison primitive by reading the top carry of the Cuccaro MAJ chain BEFORE the reverse UMA chain uncomputes it.
Mathematical idea: to compare `x` with `N`, add the two's-complement constant `K := 2^bits - N` to `x` and read the `bits`-th carry bit: `carry_out_bit = decide (N ≤ x)`. The reverse UMA chain in `cuccaro_n_bit_adder_full` erases this carry; the forward-only gate retains it at position `q_start + 2*bits`.
This file proves:
- the arithmetic helper relating `Adder.carry false bits` to `(a + b).testBit bits` for a, b < 2^bits;
- the comparator's top-carry value = `decide (N ≤ x)` (and its negation = `decide (x < N)`).
IMPORTANT: this is a FORWARD-ONLY gate. It leaves the workspace in a "dirty" state: the MAJ chain has propagated XOR'd values through every register position. A separate reverse pass is needed to uncompute the workspace, which destroys the flag. Tick 48+ will address how to use this flag before uncomputation (a future decision-point flagged in QUESTIONS.md).
theoremtestBit_top_of_sum_eq_decide_ge
private theorem testBit_top_of_sum_eq_decide_ge
    (bits a b : Nat) (ha : a < 2^bits) (hb : b < 2^bits) :
    (a + b).testBit bits = decide (2^bits ≤ a + b)
**Top-bit-of-sum lemma** (private helper). For `a, b < 2^bits`, the `bits`-th bit of `a + b` equals `decide (2^bits ≤ a + b)`.
theoremAdder_carry_top_eq_testBit_sum
private theorem Adder_carry_top_eq_testBit_sum
    (bits a b : Nat) (ha : a < 2^bits) (hb : b < 2^bits) :
    Adder.carry false bits (fun i => a.testBit i) (fun i => b.testBit i)
      = (a + b).testBit bits
**Carry-out via top bit of sum** (private helper). For `a, b < 2^bits`, the carry-out of a `bits`-bit addition equals the `bits`-th bit of `a + b`.
theoremadd_twos_complement_carry_out_eq
theorem add_twos_complement_carry_out_eq
    (bits N x : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Adder.carry false bits
        (fun i => (2^bits - N).testBit i)
        (fun i => x.testBit i)
      = decide (N ≤ x)
**HEADLINE arithmetic helper**: for `0 < N ≤ 2^bits` and `x < 2^bits`, the carry-out of adding the two's-complement constant `2^bits - N` to `x` equals `decide (N ≤ x)`.
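The arithmetic fact can be sanity-checked by brute force at the integer level. In the sketch below, `topCarry` and `agreesWithCompare` are hypothetical names for this illustration; the carry-out is modelled as the quotient of the sum by `2^bits`, which is an assumption standing in for `Adder.carry`.

```lean
/-- Carry out of a `bits`-bit addition of `x` and the constant `2^bits - N`,
    modelled as "the sum reaches `2^bits`". -/
def topCarry (bits N x : Nat) : Bool :=
  (x + (2 ^ bits - N)) / 2 ^ bits == 1

/-- Check `topCarry = decide (N ≤ x)` for all `0 < N ≤ 2^bits` and `x < 2^bits`.
    The inner index `k` ranges over `0 .. 2^bits - 1`, so `N = k + 1`. -/
def agreesWithCompare (bits : Nat) : Bool :=
  (List.range (2 ^ bits)).all fun x =>
    (List.range (2 ^ bits)).all fun k =>
      topCarry bits (k + 1) x == decide (k + 1 ≤ x)

#eval agreesWithCompare 3  -- true
#eval topCarry 3 5 6       -- true:  6 + (8 - 5) = 9 ≥ 8, and 5 ≤ 6
#eval topCarry 3 5 4       -- false: 4 + (8 - 5) = 7 < 8, and 4 < 5
```

The hypothesis `0 < N` matters: for `N = 0` the constant `2^bits - N` would be `2^bits`, which no longer fits in `bits` bits.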
defcuccaro_compareConstForwardGate
def cuccaro_compareConstForwardGate (bits q_start N : Nat) : Gate
**Forward-only Cuccaro comparison gate.** Prepares the two's-complement constant `K := 2^bits - N` in the read register, then runs the MAJ chain. The top carry at position `q_start + 2*bits` holds `decide (N ≤ x)`. This gate does NOT uncompute the workspace: the remaining positions hold XOR'd intermediate values from the MAJ chain. This is by design, since uncomputing would erase the flag.
theoremcuccaro_compareConstForward_top_carry
theorem cuccaro_compareConstForward_top_carry
    (bits q_start N x : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.applyNat (cuccaro_compareConstForwardGate bits q_start N)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * bits)
      = decide (N ≤ x)
**HEADLINE: top-carry of the forward comparator = `decide (N ≤ x)`.** After running `cuccaro_compareConstForwardGate bits q_start N` on `cuccaro_input_F q_start false 0 x`, the qubit at position `q_start + 2*bits` holds the comparison flag `decide (N ≤ x)`.
theoremcuccaro_compareConstForward_underflow
theorem cuccaro_compareConstForward_underflow
    (bits q_start N x : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    !(Gate.applyNat (cuccaro_compareConstForwardGate bits q_start N)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * bits))
      = decide (x < N)
**Underflow version**: the negation of the top carry equals `decide (x < N)`.
theoremcuccaro_compareConstForwardGate_wellTyped
theorem cuccaro_compareConstForwardGate_wellTyped
    (bits q_start N dim : Nat) (h : q_start + 2 * bits + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_compareConstForwardGate bits q_start N)
**WellTyped: the forward comparator fits in `q_start + 2*bits + 1` qubits.**
theoremcuccaro_compareConstForwardGate_primitive
theorem cuccaro_compareConstForwardGate_primitive
    (bits q_start N x : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.WellTyped (q_start + (2 * bits + 1))
        (cuccaro_compareConstForwardGate bits q_start N)
    ∧ Gate.applyNat (cuccaro_compareConstForwardGate bits q_start N)
          (cuccaro_input_F q_start false 0 x) (q_start + 2 * bits)
        = decide (N ≤ x)
**Packaged exact-budget comparator forward gate.**

FormalRV.Arithmetic.Cuccaro.CuccaroCorrectness

FormalRV/Arithmetic/Cuccaro/CuccaroCorrectness.lean
FormalRV.BQAlgo.CuccaroCorrectness: semantic correctness of the Cuccaro MAJ and UMA gadgets, proved against the Framework's RCIR-level bit-vector semantics. This is the Lean counterpart of SQIR's `Lemma MAJ_correct` in `RCIR.v`: Lean-native, computable, and small enough that `decide` discharges every case. The key correctness fact: applied to bits (a, b, c), the MAJ gate writes the **majority** of a, b, c into bit c, while transforming bit a → a ⊕ c and bit b → b ⊕ c (so MAJ is reversible: UMA undoes it).
theoremcuccaro_MAJ_writes_xor_a
theorem cuccaro_MAJ_writes_xor_a (a b c : Bool) :
    apply (cuccaro_MAJ 0 1 2) (mkState3 a b c) 0 = xor a c
After MAJ a b c, bit `a` becomes `a ⊕ c` (XOR with the original c).
theoremcuccaro_MAJ_writes_xor_b
theorem cuccaro_MAJ_writes_xor_b (a b c : Bool) :
    apply (cuccaro_MAJ 0 1 2) (mkState3 a b c) 1 = xor b c
After MAJ a b c, bit `b` becomes `b ⊕ c`.
theoremcuccaro_MAJ_writes_majority
theorem cuccaro_MAJ_writes_majority (a b c : Bool) :
    apply (cuccaro_MAJ 0 1 2) (mkState3 a b c) 2 = majority a b c
After MAJ a b c, bit `c` becomes the **majority** of (a, b, c).
theoremMAJ_then_UMA_restores_a
theorem MAJ_then_UMA_restores_a (a b c : Bool) :
    apply (seq (cuccaro_MAJ 0 1 2) (cuccaro_UMA 0 1 2)) (mkState3 a b c) 0 = a
UMA after MAJ on the same triple restores bit 0 (a ⊕ c → a).
theoremMAJ_then_UMA_writes_sum
theorem MAJ_then_UMA_writes_sum (a b c : Bool) :
    apply (seq (cuccaro_MAJ 0 1 2) (cuccaro_UMA 0 1 2)) (mkState3 a b c) 1
      = xor (xor a b) c
UMA after MAJ writes the **sum bit** `a ⊕ b ⊕ c` into qubit 1. This is the per-bit output of a full adder.
theoremMAJ_then_UMA_restores_c
theorem MAJ_then_UMA_restores_c (a b c : Bool) :
    apply (seq (cuccaro_MAJ 0 1 2) (cuccaro_UMA 0 1 2)) (mkState3 a b c) 2 = c
UMA after MAJ restores bit 2 (the carry-in).
theoremcuccaro_MAJ_at_a
theorem cuccaro_MAJ_at_a
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ a b c) f a = xor (f a) (f c)
**MAJ local semantics at the `a` wire.** Applied to `f`, the gate writes `f a ⊕ f c` at position `a`.
theoremcuccaro_MAJ_at_b
theorem cuccaro_MAJ_at_b
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ a b c) f b = xor (f b) (f c)
**MAJ local semantics at the `b` wire.** Applied to `f`, the gate writes `f b ⊕ f c` at position `b`.
theoremcuccaro_MAJ_at_c
theorem cuccaro_MAJ_at_c
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ a b c) f c
      = majority (f a) (f b) (f c)
**MAJ local semantics at the `c` wire.** Applied to `f`, the gate writes the boolean majority of `(f a, f b, f c)` at position `c`.
theoremcuccaro_MAJ_at_other
theorem cuccaro_MAJ_at_other
    (a b c q : Nat) (h_qa : q ≠ a) (h_qb : q ≠ b) (h_qc : q ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ a b c) f q = f q
**MAJ local semantics at any unrelated wire.** Applied to `f`, the gate is identity at positions outside `{a, b, c}`.
theoremcuccaro_UMA_at_c
theorem cuccaro_UMA_at_c
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_UMA a b c) f c
      = xor (f c) (f a && f b)
**UMA local semantics at the `c` wire.** Applied to `f`, the gate writes `f c ⊕ (f a AND f b)` at position `c` (the CCX action).
theoremcuccaro_UMA_at_a
theorem cuccaro_UMA_at_a
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_UMA a b c) f a
      = xor (f a) (xor (f c) (f a && f b))
**UMA local semantics at the `a` wire.** After UMA, position `a` holds `f a ⊕ f c ⊕ (f a AND f b)`.
theoremcuccaro_UMA_at_b
theorem cuccaro_UMA_at_b
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_UMA a b c) f b
      = xor (f b) (xor (f a) (xor (f c) (f a && f b)))
**UMA local semantics at the `b` wire.** After UMA, position `b` holds `f b ⊕ f a ⊕ f c ⊕ (f a AND f b)`.
theoremcuccaro_UMA_at_other
theorem cuccaro_UMA_at_other
    (a b c q : Nat) (h_qa : q ≠ a) (h_qb : q ≠ b) (h_qc : q ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_UMA a b c) f q = f q
**UMA local semantics at any unrelated wire.** Applied to `f`, the gate is identity at positions outside `{a, b, c}`.
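The wire-level formulas for MAJ and UMA can be combined on a three-bit state to see why UMA after MAJ yields `(a, a ⊕ b ⊕ c, c)`. The sketch below is a hedged Boolean model, not the FormalRV gates themselves: `bxor`, `maj3`, `majStep`, and `umaStep` are hypothetical names, and the two step functions simply transcribe the `_at_a`, `_at_b`, `_at_c` statements above.

```lean
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

def maj3 (a b c : Bool) : Bool := (a && b) || (b && c) || (a && c)

/-- MAJ on wires (a, b, c): a ↦ a ⊕ c, b ↦ b ⊕ c, c ↦ majority. -/
def majStep : Bool × Bool × Bool → Bool × Bool × Bool
  | (a, b, c) => (bxor a c, bxor b c, maj3 a b c)

/-- UMA on wires (a, b, c): c ↦ c ⊕ (a ∧ b), a ↦ a ⊕ c', b ↦ b ⊕ a'. -/
def umaStep : Bool × Bool × Bool → Bool × Bool × Bool
  | (a, b, c) =>
    let c' := bxor c (a && b)
    let a' := bxor a c'
    (a', bxor b a', c')

-- UMA after MAJ restores wires 0 and 2 and leaves the sum bit on wire 1.
example (a b c : Bool) :
    umaStep (majStep (a, b, c)) = (a, bxor (bxor a b) c, c) := by
  cases a <;> cases b <;> cases c <;> rfl
```

The case split covers all eight inputs, which is the same finite check that the `mkState3` theorems earlier in this module perform.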
theoremcuccaro_MAJ_wellTyped
theorem cuccaro_MAJ_wellTyped
    (dim a b c : Nat) (ha : a < dim) (hb : b < dim) (hc : c < dim)
    (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) :
    Gate.WellTyped dim (cuccaro_MAJ a b c)
**WellTyped for `cuccaro_MAJ`.**
theoremcuccaro_UMA_wellTyped
theorem cuccaro_UMA_wellTyped
    (dim a b c : Nat) (ha : a < dim) (hb : b < dim) (hc : c < dim)
    (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) :
    Gate.WellTyped dim (cuccaro_UMA a b c)
**WellTyped for `cuccaro_UMA`.**
defcuccaro_input_F
def cuccaro_input_F (q_start : Nat) (c_in : Bool) (a b : Nat) (q : Nat) : Bool
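**Cuccaro register-level input encoding.** Given `a`, `b : Nat` (the two inputs as binary numbers) and `c_in : Bool` (the carry-in), produces the initial bit-function `Nat → Bool` for the Cuccaro layout: position `q_start` holds `c_in`, position `q_start + 2i + 1` holds bit `i` of `b`, and position `q_start + 2i + 2` holds bit `i` of `a`.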
defcuccaroAdderSpec
def cuccaroAdderSpec (bits a b : Nat) : Nat
**Cuccaro spec: integer-level sum-modulo-2^bits.** The integer-level specification of a `bits`-bit addition.
theoremcuccaro_input_F_at_c_in
theorem cuccaro_input_F_at_c_in (q_start : Nat) (c_in : Bool) (a b : Nat) :
    cuccaro_input_F q_start c_in a b q_start = c_in
**Sanity: the encoding at the carry-in position.**
theoremcuccaro_input_F_at_b
theorem cuccaro_input_F_at_b
    (q_start i : Nat) (c_in : Bool) (a b : Nat) :
    cuccaro_input_F q_start c_in a b (q_start + 2 * i + 1) = b.testBit i
**Sanity: the encoding at the i-th `b` position (`q_start + 2i + 1`).**
theoremcuccaro_input_F_at_a
theorem cuccaro_input_F_at_a
    (q_start i : Nat) (c_in : Bool) (a b : Nat) :
    cuccaro_input_F q_start c_in a b (q_start + 2 * i + 2) = a.testBit i
**Sanity: the encoding at the i-th `a` position (`q_start + 2i + 2`).**
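To make the interleaved layout concrete, here is a minimal sketch of one function that satisfies the three sanity lemmas. The body of `cuccaro_input_F` is not shown in this listing, so `toyInputF` is an assumption: it is one possible definition consistent with the stated values, and its behaviour below `q_start` (returning `false`) is chosen arbitrarily.

```lean
/-- Hypothetical input encoding: wire `qStart` is the carry-in, odd offsets
    hold bits of `b`, even offsets (≥ 2) hold bits of `a`. -/
def toyInputF (qStart : Nat) (cIn : Bool) (a b q : Nat) : Bool :=
  if q < qStart then false
  else if q = qStart then cIn
  else
    let k := q - qStart - 1       -- offset past the carry-in wire
    if k % 2 = 0 then b.testBit (k / 2) else a.testBit (k / 2)

-- a = 2 (0b10), b = 1 (0b01), c_in = true, q_start = 0.
-- Wires: 0 = c_in, 1 = b₀, 2 = a₀, 3 = b₁, 4 = a₁.
#eval (List.range 5).map (toyInputF 0 true 2 1)
-- [true, true, false, false, true]
```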

FormalRV.Arithmetic.Cuccaro.CuccaroDecoded

FormalRV/Arithmetic/Cuccaro/CuccaroDecoded.lean
FormalRV.BQAlgo.CuccaroDecoded: integer-level decoded specification of the exact-budget full Cuccaro adder.
Tick 44 bridges the bit-level symbolic correctness of `cuccaro_n_bit_adder_full` (proved in `CuccaroFull.lean`) to the Nat-level statement `cuccaro_target_val bits q_start (output) = (a + b + c_in) % 2^bits` using the framework's existing `Adder.carry`/`Adder.sumfb` machinery (proved in `RippleCarryAdder.lean`). This is the natural next step toward closing the original SQIR placeholder axioms: the adder primitive now matches the integer arithmetic spec, exposing a clean composable interface.
Structure of this file:
- decoders: `cuccaro_target_val`, `cuccaro_read_val`.
- decoder sanity lemmas on `cuccaro_input_F`.
- `cuccaro_target_val_eq_sum_when_bits_match` (generic bit-stream→Nat).
- `cuccaro_carry_eq_Adder_carry` (bridge to framework `Adder.carry`).
- decoded correctness theorems for the full adder.
- packaged primitive `cuccaro_n_bit_adder_full_primitive`.
defcuccaro_target_val
def cuccaro_target_val (bits q_start : Nat) (f : Nat → Bool) : Nat
Decoder: value of the target/b register at width `bits`, LSB-first. Bit at `q_start + 2i + 1` contributes weight `2^i`.
defcuccaro_read_val
def cuccaro_read_val (bits q_start : Nat) (f : Nat → Bool) : Nat
Decoder: value of the read/a register at width `bits`, LSB-first. Bit at `q_start + 2i + 2` contributes weight `2^i`.
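The two decoders can be pictured as LSB-first weighted sums over the interleaved wires. The definitions below are a hedged sketch: the bodies of `cuccaro_target_val` and `cuccaro_read_val` are not shown here, so `toyTargetVal`, `toyReadVal`, and `sample` are hypothetical and only follow the weights described in the two docstrings.

```lean
/-- Sum of `2^i` over target wires `q + 2*i + 1` that are set, for `i < n`. -/
def toyTargetVal : Nat → Nat → (Nat → Bool) → Nat
  | 0, _, _ => 0
  | n + 1, q, f => toyTargetVal n q f + (if f (q + 2 * n + 1) then 2 ^ n else 0)

/-- Sum of `2^i` over read wires `q + 2*i + 2` that are set, for `i < n`. -/
def toyReadVal : Nat → Nat → (Nat → Bool) → Nat
  | 0, _, _ => 0
  | n + 1, q, f => toyReadVal n q f + (if f (q + 2 * n + 2) then 2 ^ n else 0)

-- Wires 1 and 5 are target bits 0 and 2; wire 4 is read bit 1.
def sample : Nat → Bool := fun q => q == 1 || q == 5 || q == 4

#eval toyTargetVal 3 0 sample  -- 5
#eval toyReadVal 3 0 sample    -- 2
```

Since each decoder sums at most `2^0 + ... + 2^(bits-1)`, the result is always below `2^bits`, which is what the two `_lt` theorems that follow state.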
theoremcuccaro_target_val_lt
theorem cuccaro_target_val_lt (bits q_start : Nat) (f : Nat → Bool) :
    cuccaro_target_val bits q_start f < 2^bits
theoremcuccaro_read_val_lt
theorem cuccaro_read_val_lt (bits q_start : Nat) (f : Nat → Bool) :
    cuccaro_read_val bits q_start f < 2^bits
theoremcuccaro_target_val_eq_sum_when_bits_match
theorem cuccaro_target_val_eq_sum_when_bits_match
    (bits q_start S : Nat) (f : Nat → Bool)
    (h : ∀ i, i < bits → f (q_start + 2 * i + 1) = S.testBit i) :
    cuccaro_target_val bits q_start f = S % 2^bits
**Generic bit-stream-to-Nat lemma for the target decoder.** If `f` matches `S.testBit i` at all target positions for `i < bits`, then `cuccaro_target_val bits q_start f = S % 2^bits`. Same shape as `gidney_target_val_eq_sum_when_bits_match` but for the Cuccaro layout.
theoremcuccaro_read_val_eq_sum_when_bits_match
theorem cuccaro_read_val_eq_sum_when_bits_match
    (bits q_start S : Nat) (f : Nat → Bool)
    (h : ∀ i, i < bits → f (q_start + 2 * i + 2) = S.testBit i) :
    cuccaro_read_val bits q_start f = S % 2^bits
**Generic bit-stream-to-Nat lemma for the read decoder.** Same shape as the target-decoder lemma, with read positions `q_start + 2*i + 2` in place of target positions.
theoremcuccaro_target_val_input
theorem cuccaro_target_val_input
    (bits q_start a b : Nat) (c_in : Bool) (hb : b < 2^bits) :
    cuccaro_target_val bits q_start (cuccaro_input_F q_start c_in a b) = b
With `b < 2^bits`, the input encoding decodes the target register to `b`.
theoremcuccaro_read_val_input
theorem cuccaro_read_val_input
    (bits q_start a b : Nat) (c_in : Bool) (ha : a < 2^bits) :
    cuccaro_read_val bits q_start (cuccaro_input_F q_start c_in a b) = a
With `a < 2^bits`, the input encoding decodes the read register to `a`.
theoremmajority_eq_xor_pairs
theorem majority_eq_xor_pairs (a b c : Bool) :
    Boolean.majority a b c
      = xor (xor (a && b) (b && c)) (a && c)
**Boolean majority = XOR-pairwise-AND.** Local algebraic identity used by the carry bridge.
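The identity is a finite Boolean fact, so it can be checked by exhausting the eight cases. In this sketch `maj3` and `bxor` are local, hypothetical stand-ins for `Boolean.majority` and `xor`; the OR-of-ANDs form of majority is an assumption about how the library defines it.

```lean
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- Majority as "at least two of three", written as an OR of pairwise ANDs. -/
def maj3 (a b c : Bool) : Bool := (a && b) || (b && c) || (a && c)

-- The OR form and the XOR form agree on every input.
example (a b c : Bool) :
    maj3 a b c = bxor (bxor (a && b) (b && c)) (a && c) := by
  cases a <;> cases b <;> cases c <;> rfl
```

The two forms agree because the number of true pairwise ANDs is 0, 1, or 3, never 2, so OR and XOR of the three terms coincide.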
theoremcuccaro_carry_eq_Adder_carry
theorem cuccaro_carry_eq_Adder_carry
    (f : Nat → Bool) (q_start k : Nat) :
    cuccaro_carry f q_start k
      = Adder.carry (f q_start) k
          (fun i => f (q_start + 2 * i + 1))
          (fun i => f (q_start + 2 * i + 2))
**Carry-function bridge.** The Cuccaro carry function on a state `f` and origin `q_start` equals the framework `Adder.carry` on the corresponding bit streams, with carry-in `f q_start`. Bit stream conventions:
- f-stream of `Adder.carry`: `i ↦ f (q_start + 2i + 1)` (the b-bits).
- g-stream of `Adder.carry`: `i ↦ f (q_start + 2i + 2)` (the a-bits).
Note: `Adder.carry` is symmetric in its two streams (`Adder.carry_sym`), so the order doesn't affect the carry.
theoremcuccaro_n_bit_adder_full_target_decode_carry
theorem cuccaro_n_bit_adder_full_target_decode_carry
    (bits q_start a b : Nat) (c_in : Bool) (ha : a < 2^bits) (hb : b < 2^bits) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (cuccaro_n_bit_adder_full bits q_start)
          (cuccaro_input_F q_start c_in a b))
      = (a + b + c_in.toNat) % 2^bits
**HEADLINE: decoded target-register correctness for arbitrary carry-in.** After running the full Cuccaro adder on `cuccaro_input_F q_start c_in a b`, the target register decodes to `(a + b + c_in.toNat) % 2^bits`.
theoremcuccaro_n_bit_adder_full_target_decode
theorem cuccaro_n_bit_adder_full_target_decode
    (bits q_start a b : Nat) (ha : a < 2^bits) (hb : b < 2^bits) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (cuccaro_n_bit_adder_full bits q_start)
          (cuccaro_input_F q_start false a b))
      = (a + b) % 2^bits
**HEADLINE: decoded target-register correctness for carry-in `false`.** After running the full Cuccaro adder on `cuccaro_input_F q_start false a b`, the target register decodes to `(a + b) % 2^bits`.
theoremcuccaro_n_bit_adder_full_read_decode
theorem cuccaro_n_bit_adder_full_read_decode
    (bits q_start a b : Nat) (c_in : Bool) (ha : a < 2^bits) :
    cuccaro_read_val bits q_start
        (Gate.applyNat (cuccaro_n_bit_adder_full bits q_start)
          (cuccaro_input_F q_start c_in a b))
      = a
**Decoded read-register restoration.** After running the full Cuccaro adder, the read register still decodes to `a`.
theoremcuccaro_n_bit_adder_full_carry_in_decode
theorem cuccaro_n_bit_adder_full_carry_in_decode
    (bits q_start a b : Nat) (c_in : Bool) :
    Gate.applyNat (cuccaro_n_bit_adder_full bits q_start)
        (cuccaro_input_F q_start c_in a b) q_start = c_in
**Decoded carry-in restoration.** After running the full Cuccaro adder on `cuccaro_input_F q_start c_in a b`, the carry-in qubit at `q_start` still holds `c_in`.
theoremcuccaro_n_bit_adder_full_primitive
theorem cuccaro_n_bit_adder_full_primitive
    (bits q_start a b : Nat) (ha : a < 2^bits) (hb : b < 2^bits) :
    Gate.WellTyped (q_start + (2 * bits + 1))
        (cuccaro_n_bit_adder_full bits q_start)
    ∧ cuccaro_target_val bits q_start
          (Gate.applyNat (cuccaro_n_bit_adder_full bits q_start)
            (cuccaro_input_F q_start false a b))
        = (a + b) % 2^bits
    ∧ cuccaro_read_val bits q_start
          (Gate.applyNat (cuccaro_n_bit_adder_full bits q_start)
            (cuccaro_input_F q_start false a b))
        = a
**HEADLINE: exact-budget Cuccaro adder primitive.** For any `bits`, `q_start`, and `a, b < 2^bits`, the full Cuccaro adder (with carry-in initialized to `false`):
- is WellTyped at dimension `q_start + (2*bits + 1)`;
- writes `(a + b) % 2^bits` into the target register;
- preserves the read register `a`.
Restoration of the carry-in qubit is not one of the three conjuncts of this statement; it is stated separately as `cuccaro_n_bit_adder_full_carry_in_decode`.

FormalRV.Arithmetic.Cuccaro.CuccaroFull

FormalRV/Arithmetic/Cuccaro/CuccaroFull.lean
FormalRV.BQAlgo.CuccaroFull: the BOUNDARY-CORRECTED Cuccaro adder.
Tick 42: per the third-party Python sanity check (`scripts/check_cuccaro_adder.py`), the existing `cuccaro_n_bit_adder_skeleton` (forward MAJ-chain + forward UMA-chain) is NOT a correct in-place adder for n ≥ 2: it fails 606 of 680 test cases. The fix is to **REVERSE the UMA chain order**: apply `UMA_{n-1}, UMA_{n-2}, ..., UMA_1, UMA_0` (descending) rather than `UMA_0, ..., UMA_{n-1}` (ascending). With that single structural fix, the simulator passes all 680 cases (n = 1..4, c_in ∈ {F, T}, all a, b < 2^n).
This module defines the corrected `cuccaro_n_bit_adder_full` and proves WellTyped. Semantic correctness on the chain level is left as the next-tick deliverable.
Layout (matches `cuccaro_input_F` in `BQAlgo/CuccaroCorrectness.lean`):
- pos q_start + 0: c_in (carry-in).
- pos q_start + 2i + 1: bit i of b (target register; becomes (a+b+c_in) mod 2^n).
- pos q_start + 2i + 2: bit i of a (read register; preserved).
- Total: 2*n + 1 qubits.
This matches SQIR's `modmult_rev_anc n = 2*n + 1` budget EXACTLY, making this the natural exact-budget primitive for closing the original SQIR placeholders.
defcuccaro_uma_chain_reverse
def cuccaro_uma_chain_reverse : Nat → Nat → Gate
  | 0,     _       => I
  | n + 1, q_start =>
      seq (cuccaro_uma_chain_reverse n (q_start + 2))
          (cuccaro_UMA q_start (q_start + 1) (q_start + 2))
Reverse UMA chain: `UMA_{n-1}, UMA_{n-2}, ..., UMA_0`, applied in descending order on consecutive triples starting at `q_start`.
defcuccaro_n_bit_adder_full
def cuccaro_n_bit_adder_full (n q_start : Nat) : Gate
**Boundary-corrected n-bit Cuccaro adder.** Forward MAJ chain followed by **reverse** UMA chain. Validated by exhaustive Boolean simulation for n = 1..4 (see `scripts/check_cuccaro_adder.py`). Layout: `2 * n + 1` qubits starting at `q_start`; matches `cuccaro_input_F`.
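The exhaustive simulation idea can be sketched directly in Lean on a classical model. Everything below is a hypothetical stand-in rather than the FormalRV definitions: `majG` and `umaG` transcribe the wire-level MAJ/UMA semantics stated in `CuccaroCorrectness`, `majChain` assumes the ascending recursion described for `cuccaro_maj_chain`, and the check fixes `q_start = 0` and carry-in `false`.

```lean
abbrev St := Nat → Bool

def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

def maj3 (a b c : Bool) : Bool := (a && b) || (b && c) || (a && c)

/-- Overwrite wire `q` with value `v`. -/
def upd (f : St) (q : Nat) (v : Bool) : St := fun p => if p = q then v else f p

/-- MAJ: a ↦ a ⊕ c, b ↦ b ⊕ c, c ↦ majority. -/
def majG (a b c : Nat) (f : St) : St :=
  upd (upd (upd f a (bxor (f a) (f c))) b (bxor (f b) (f c))) c (maj3 (f a) (f b) (f c))

/-- UMA: c ↦ c ⊕ (a ∧ b), a ↦ a ⊕ c', b ↦ b ⊕ a'. -/
def umaG (a b c : Nat) (f : St) : St :=
  let fc := bxor (f c) (f a && f b)
  let fa := bxor (f a) fc
  upd (upd (upd f a fa) b (bxor (f b) fa)) c fc

/-- Ascending MAJ chain: MAJ at `q` first, then the rest from `q + 2`. -/
def majChain : Nat → Nat → St → St
  | 0, _, f => f
  | n + 1, q, f => majChain n (q + 2) (majG q (q + 1) (q + 2) f)

/-- Descending UMA chain: the upper triples first, UMA at `q` last. -/
def umaChainRev : Nat → Nat → St → St
  | 0, _, f => f
  | n + 1, q, f => umaG q (q + 1) (q + 2) (umaChainRev n (q + 2) f)

def adder (n : Nat) (f : St) : St := umaChainRev n 0 (majChain n 0 f)

/-- Input layout at `q_start = 0`, carry-in `false`. -/
def toyInput (a b : Nat) : St := fun q =>
  if q = 0 then false
  else if q % 2 = 1 then b.testBit ((q - 1) / 2) else a.testBit ((q - 2) / 2)

def targetVal : Nat → St → Nat
  | 0, _ => 0
  | n + 1, f => targetVal n f + (if f (2 * n + 1) then 2 ^ n else 0)

def readVal : Nat → St → Nat
  | 0, _ => 0
  | n + 1, f => readVal n f + (if f (2 * n + 2) then 2 ^ n else 0)

/-- All `a, b < 2^n`: target holds `(a + b) % 2^n`, read register still holds `a`. -/
def checkAll (n : Nat) : Bool :=
  (List.range (2 ^ n)).all fun a =>
    (List.range (2 ^ n)).all fun b =>
      let out := adder n (toyInput a b)
      targetVal n out == (a + b) % 2 ^ n && readVal n out == a

#eval checkAll 2  -- true
```

The width is kept at `n = 2` so the closure-based state stays cheap to evaluate; it is also the smallest width at which the ordering of the UMA chain matters.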
theoremtcount_cuccaro_uma_chain_reverse
theorem tcount_cuccaro_uma_chain_reverse (n q_start : Nat) :
    tcount (cuccaro_uma_chain_reverse n q_start) = 7 * n
T-count of the reverse UMA chain is `7 * n`.
theoremtcount_cuccaro_n_bit_adder_full
theorem tcount_cuccaro_n_bit_adder_full (n q_start : Nat) :
    tcount (cuccaro_n_bit_adder_full n q_start) = 14 * n
T-count of the full adder: `14 * n`. Same as the (incorrect) skeleton — reordering doesn't change cost.
example(example)
example : tcount (cuccaro_n_bit_adder_full 4 0) = 56
Smoke: 4-bit full adder has 56 T-gates.
theoremcuccaro_maj_chain_wellTyped
theorem cuccaro_maj_chain_wellTyped
    (n q_start dim : Nat) (h : q_start + 2 * n + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_maj_chain n q_start)
The MAJ chain of `n` steps starting at `q_start` is well-typed in any dimension `dim` containing the touched range `[q_start, q_start + 2n]`.
theoremcuccaro_uma_chain_reverse_wellTyped
theorem cuccaro_uma_chain_reverse_wellTyped
    (n q_start dim : Nat) (h : q_start + 2 * n + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_uma_chain_reverse n q_start)
The reverse UMA chain is well-typed in any dimension containing `[q_start, q_start + 2n]`.
theoremcuccaro_n_bit_adder_full_wellTyped
theorem cuccaro_n_bit_adder_full_wellTyped
    (n q_start dim : Nat) (h : q_start + 2 * n + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_n_bit_adder_full n q_start)
**HEADLINE: Full Cuccaro adder is well-typed.** In any dimension `dim ≥ q_start + 2n + 1` (covers all touched qubits, the highest being `q_start + 2n` for n ≥ 1), the corrected full adder is structurally well-typed. Proved by structural composition of MAJ-chain WellTyped with reverse-UMA-chain WellTyped.
example(example)
example : Gate.WellTyped 5 (cuccaro_n_bit_adder_full 2 0)
theoremcuccaro_maj_chain_frame_below
theorem cuccaro_maj_chain_frame_below
    (n q_start : Nat) (f : Nat → Bool) (q : Nat) (h : q < q_start) :
    Gate.applyNat (cuccaro_maj_chain n q_start) f q = f q
**Frame lemma for the MAJ chain: positions strictly below `q_start` are unchanged.** The chain touches only qubits `[q_start, q_start + 2n]`, so anything below is preserved. Proved by induction on `n` using `cuccaro_MAJ_at_other`.
theoremcuccaro_uma_chain_reverse_frame_below
theorem cuccaro_uma_chain_reverse_frame_below
    (n q_start : Nat) (f : Nat → Bool) (q : Nat) (h : q < q_start) :
    Gate.applyNat (cuccaro_uma_chain_reverse n q_start) f q = f q
**Frame lemma for the reverse UMA chain: positions strictly below `q_start` are unchanged.** Analogous to the MAJ-chain frame.
theoremcuccaro_n_bit_adder_full_frame_below
theorem cuccaro_n_bit_adder_full_frame_below
    (n q_start : Nat) (f : Nat → Bool) (q : Nat) (h : q < q_start) :
    Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f q = f q
**Frame lemma for the full adder: positions strictly below `q_start` are unchanged.** Composition of the MAJ-chain and reverse-UMA-chain frame lemmas.
theoremcuccaro_maj_chain_frame_above
theorem cuccaro_maj_chain_frame_above
    (n q_start : Nat) (f : Nat → Bool) (q : Nat)
    (h : q_start + 2 * n + 1 ≤ q) :
    Gate.applyNat (cuccaro_maj_chain n q_start) f q = f q
The MAJ chain doesn't touch positions `≥ q_start + 2n + 1`.
theoremcuccaro_uma_chain_reverse_frame_above
theorem cuccaro_uma_chain_reverse_frame_above
    (n q_start : Nat) (f : Nat → Bool) (q : Nat)
    (h : q_start + 2 * n + 1 ≤ q) :
    Gate.applyNat (cuccaro_uma_chain_reverse n q_start) f q = f q
The reverse UMA chain doesn't touch positions `≥ q_start + 2n + 1`.
theoremcuccaro_n_bit_adder_full_frame_above
theorem cuccaro_n_bit_adder_full_frame_above
    (n q_start : Nat) (f : Nat → Bool) (q : Nat)
    (h : q_start + 2 * n + 1 ≤ q) :
    Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f q = f q
**The full adder doesn't touch positions `≥ q_start + 2n + 1`.** Together with the frame-below lemma, nothing outside `[q_start, q_start + 2n]` is changed.
theoremcuccaro_maj_chain_at_first_a
theorem cuccaro_maj_chain_at_first_a
    (n q_start : Nat) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_maj_chain (n + 1) q_start) f q_start
      = xor (f q_start) (f (q_start + 2))
**First MAJ-chain step at position `q_start` (the first MAJ's `a` wire).** After `cuccaro_maj_chain (n+1) q_start`, position `q_start` holds `xor (f q_start) (f (q_start + 2))`, the result of MAJ_0's `a`-wire action, since the recursive sub-chain (starting at `q_start + 2`) doesn't touch positions below `q_start + 2`.
theoremcuccaro_maj_chain_at_first_b
theorem cuccaro_maj_chain_at_first_b
    (n q_start : Nat) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_maj_chain (n + 1) q_start) f (q_start + 1)
      = xor (f (q_start + 1)) (f (q_start + 2))
**First MAJ-chain step at position `q_start + 1` (the first MAJ's `b` wire).** After `cuccaro_maj_chain (n+1) q_start`, position `q_start + 1` holds `xor (f (q_start + 1)) (f (q_start + 2))`.
defcuccaro_carry
def cuccaro_carry (f : Nat → Bool) (q_start : Nat) : Nat → Bool
  | 0     => f q_start
  | k + 1 => Boolean.majority
               (cuccaro_carry f q_start k)
               (f (q_start + 2 * k + 1))
               (f (q_start + 2 * k + 2))
**Classical Cuccaro carry function.** Given a state `f` and a register origin `q_start`, `cuccaro_carry f q_start k` is the carry into bit-position k of the addition encoded by `f` (per the layout `pos q_start = c_in; pos q_start + 2i + 1 = b_i; pos q_start + 2i + 2 = a_i`). Defined recursively via the majority function (which is the classical full-adder carry-out).
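A small worked instance may help read the recursion. The sketch below re-declares the carry function on a local majority (`maj3` and `toyCarry` are hypothetical names; `maj3` is assumed to agree with `Boolean.majority`) and evaluates it on the 2-bit addition `3 + 1`.

```lean
def maj3 (a b c : Bool) : Bool := (a && b) || (b && c) || (a && c)

/-- Carry into bit `k`, following the recursion of `cuccaro_carry`. -/
def toyCarry (f : Nat → Bool) (q : Nat) : Nat → Bool
  | 0     => f q
  | k + 1 => maj3 (toyCarry f q k) (f (q + 2 * k + 1)) (f (q + 2 * k + 2))

-- a = 3 (0b11), b = 1 (0b01), c_in = false at q_start = 0:
-- wire 1 (b₀), wire 2 (a₀), and wire 4 (a₁) are set.
def st : Nat → Bool := fun q => q == 1 || q == 2 || q == 4

-- Carries into bits 0, 1, 2.
#eval (List.range 3).map (toyCarry st 0)
-- [false, true, true]
```

The final entry is the carry out of the 2-bit addition: `3 + 1 = 4 = 0b100`, so the sum overflows two bits, which is the value the MAJ chain leaves at position `q_start + 2*n`.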
theoremcuccaro_carry_after_MAJ0_shift
theorem cuccaro_carry_after_MAJ0_shift
    (q_start : Nat) (f : Nat → Bool) (k : Nat) :
    cuccaro_carry (Gate.applyNat (cuccaro_MAJ q_start (q_start + 1) (q_start + 2)) f)
                  (q_start + 2) k
      = cuccaro_carry f q_start (k + 1)
**Shift lemma.** Applying `MAJ_0` (the first chain step) and then evaluating the carry function from the shifted origin `q_start + 2` gives the same value as the original carry function at the next index. This is the algebraic glue for the chain-invariant induction.
theoremcuccaro_maj_chain_at_carry_a
theorem cuccaro_maj_chain_at_carry_a
    (n q_start : Nat) (f : Nat → Bool) (i : Nat) (hi : i < n) :
    Gate.applyNat (cuccaro_maj_chain n q_start) f (q_start + 2 * i)
      = xor (cuccaro_carry f q_start i) (f (q_start + 2 * i + 2))
**MAJ-chain invariant at the carry positions `q_start + 2*i` (i < n).**
theoremcuccaro_maj_chain_at_b_xor
theorem cuccaro_maj_chain_at_b_xor
    (n q_start : Nat) (f : Nat → Bool) (i : Nat) (hi : i < n) :
    Gate.applyNat (cuccaro_maj_chain n q_start) f (q_start + 2 * i + 1)
      = xor (f (q_start + 2 * i + 1)) (f (q_start + 2 * i + 2))
**MAJ-chain invariant at the `b`-bit positions `q_start + 2*i + 1` (i < n).**
theoremcuccaro_maj_chain_at_top_carry
theorem cuccaro_maj_chain_at_top_carry
    (n q_start : Nat) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_maj_chain n q_start) f (q_start + 2 * n)
      = cuccaro_carry f q_start n
**MAJ-chain invariant at the top position `q_start + 2*n`, which holds the final carry `c_n`.**
theoremcuccaro_UMA_undo_MAJ_a
theorem cuccaro_UMA_undo_MAJ_a
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_UMA a b c)
        (Gate.applyNat (cuccaro_MAJ a b c) f) a
      = f a
**Algebraic UMA-after-MAJ identity (a-wire).** Applying UMA to the state after a MAJ on the same triple restores the original `a` value at the a-wire. This is the symbolic version of `MAJ_then_UMA_restores_a`.
theoremcuccaro_UMA_undo_MAJ_c
theorem cuccaro_UMA_undo_MAJ_c
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_UMA a b c)
        (Gate.applyNat (cuccaro_MAJ a b c) f) c
      = f c
**Algebraic UMA-after-MAJ identity (c-wire).** Restores the original `c` value at the c-wire.
theoremcuccaro_UMA_undo_MAJ_b
theorem cuccaro_UMA_undo_MAJ_b
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_UMA a b c)
        (Gate.applyNat (cuccaro_MAJ a b c) f) b
      = xor (xor (f a) (f b)) (f c)
**Algebraic UMA-after-MAJ identity (b-wire).** Writes the sum bit `f a XOR f b XOR f c` at the b-wire.
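The three wire identities can be checked exhaustively on a value-level model. This is a sketch under an assumption: it uses the textbook Cuccaro decomposition (MAJ as `CX c b; CX c a; CCX a b c`, UMA as `CCX a b c; CX c a; CX a b`), which is consistent with the statements above but is not confirmed to be how `cuccaro_MAJ` and `cuccaro_UMA` are defined in the library. The names `bxor`, `majStep`, and `umaStep` are hypothetical.

```lean
namespace MajUmaSketch

/-- Boolean XOR, defined locally so the sketch needs no library lemma. -/
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- MAJ on wire values `(a, b, c)`: CX c b; CX c a; CCX a b c (assumed layout). -/
def majStep : Bool × Bool × Bool → Bool × Bool × Bool
  | (a, b, c) =>
    let a' := bxor a c
    let b' := bxor b c
    (a', b', bxor c (a' && b'))

/-- UMA on wire values `(a, b, c)`: CCX a b c; CX c a; CX a b (assumed layout). -/
def umaStep : Bool × Bool × Bool → Bool × Bool × Bool
  | (a, b, c) =>
    let c' := bxor c (a && b)
    let a' := bxor a c'
    (a', bxor b a', c')

/-- UMA after MAJ restores `a` and `c` and writes `a ⊕ b ⊕ c` on the b-wire. -/
example : ∀ a b c : Bool,
    umaStep (majStep (a, b, c)) = (a, bxor (bxor a b) c, c) := by
  intro a b c
  cases a <;> cases b <;> cases c <;> rfl

end MajUmaSketch
```

The case split covers all eight inputs, so the check is a finite truth table rather than an induction.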
theoremcuccaro_n_bit_adder_full_carry_in_restored
theorem cuccaro_n_bit_adder_full_carry_in_restored
    (n q_start : Nat) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f q_start = f q_start
**Carry-in restoration: position `q_start` is unchanged by the full Cuccaro adder.**
theoremcuccaro_n_bit_adder_full_a_restored
theorem cuccaro_n_bit_adder_full_a_restored
    (n q_start : Nat) (f : Nat → Bool) (i : Nat) (hi : i < n) :
    Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f (q_start + 2 * i + 2)
      = f (q_start + 2 * i + 2)
**Read register restoration: position `q_start + 2*i + 2` is unchanged by the full Cuccaro adder for any `i < n`.** This is the second of the three positional invariants: the `a` register (stored at the read positions) is preserved by the full adder. The proof follows the same induction pattern as `_carry_in_restored`, splitting on `i = 0` vs `i ≥ 1`. For `i = 0`, the local `cuccaro_UMA_undo_MAJ_c` identity applies after using the IH for the sub-adder's carry-in restoration. For `i ≥ 1`, the sub-adder's a-restoration IH handles it directly, with the outer UMA_0 and MAJ_0 leaving the position untouched.
theoremcuccaro_n_bit_adder_full_sum_bit
theorem cuccaro_n_bit_adder_full_sum_bit
    (n q_start : Nat) (f : Nat → Bool) (i : Nat) (hi : i < n) :
    Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f (q_start + 2 * i + 1)
      = xor (xor (cuccaro_carry f q_start i) (f (q_start + 2 * i + 1)))
            (f (q_start + 2 * i + 2))
**Sum-bit invariant: at position `q_start + 2*i + 1` (for `i < n`), the full Cuccaro adder produces the sum bit `c_i ⊕ b_i ⊕ a_i`.** This is the third and final positional invariant for the full adder. With the carry-in and a-restoration theorems above, this completes the symbolic specification of `cuccaro_n_bit_adder_full`. Proof structure: induction on `n`, splitting `i` into `i = 0` (UMA_0's b-wire action plus `cuccaro_UMA_undo_MAJ_b` at the local level) and `i ≥ 1` (sub-adder's IH plus carry-shift bridging).
theoremcuccaro_n_bit_adder_full_correct
theorem cuccaro_n_bit_adder_full_correct
    (n q_start : Nat) (f : Nat → Bool) :
    (Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f q_start = f q_start) ∧
    (∀ i, i < n →
        Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f (q_start + 2 * i + 1)
          = xor (xor (cuccaro_carry f q_start i) (f (q_start + 2 * i + 1)))
                (f (q_start + 2 * i + 2))) ∧
    (∀ i, i < n →
        Gate.applyNat (cuccaro_n_bit_adder_full n q_start) f (q_start + 2 * i + 2)
          = f (q_start + 2 * i + 2))
**HEADLINE: symbolic correctness of the full Cuccaro adder.** For any input state `f`, the full Cuccaro adder of length `n` starting at `q_start`:
- restores the carry-in at position `q_start`;
- produces sum bit `c_i ⊕ b_i ⊕ a_i` at position `q_start + 2*i + 1` for each `i < n` (where `c_i` is the cumulative classical carry);
- restores the read register `a_i` at position `q_start + 2*i + 2` for each `i < n`.
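The three conjuncts can be observed on an executable model. The sketch below is an assumption-laden illustration in core Lean 4: states are plain functions `Nat → Bool`, the gates use the textbook Cuccaro MAJ/UMA decomposition, and the adder is the recursion "MAJ_0, sub-adder at `q + 2`, UMA_0" suggested by the proof descriptions above. None of `upd`, `cx`, `ccx`, `majGate`, `umaGate`, `adder`, `encode`, `decodeB`, or `decodeA` is the library's own definition.

```lean
namespace AdderSketch

def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- Point update of a bit assignment. -/
def upd (f : Nat → Bool) (p : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = p then v else f q

/-- Classical action of CX: XOR the control bit into the target. -/
def cx (ctrl tgt : Nat) (f : Nat → Bool) : Nat → Bool :=
  upd f tgt (bxor (f tgt) (f ctrl))

/-- Classical action of CCX: XOR the AND of two controls into the target. -/
def ccx (c1 c2 tgt : Nat) (f : Nat → Bool) : Nat → Bool :=
  upd f tgt (bxor (f tgt) (f c1 && f c2))

/-- MAJ: CX c b; CX c a; CCX a b c (assumed decomposition). -/
def majGate (a b c : Nat) (f : Nat → Bool) : Nat → Bool :=
  ccx a b c (cx c a (cx c b f))

/-- UMA: CCX a b c; CX c a; CX a b (assumed decomposition). -/
def umaGate (a b c : Nat) (f : Nat → Bool) : Nat → Bool :=
  cx a b (cx c a (ccx a b c f))

/-- Full adder: MAJ_0, then the sub-adder at `q + 2`, then UMA_0. -/
def adder : Nat → Nat → (Nat → Bool) → (Nat → Bool)
  | 0,     _, f => f
  | n + 1, q, f =>
      umaGate q (q + 1) (q + 2) (adder n (q + 2) (majGate q (q + 1) (q + 2) f))

/-- Input with `q_start = 0`: `c_in = false`, `b = x` at odd positions,
    `a = y` at even positions from 2 upward. -/
def encode (x y : Nat) : Nat → Bool :=
  fun p =>
    if p = 0 then false
    else if p % 2 = 1 then x.testBit ((p - 1) / 2)
    else y.testBit ((p - 2) / 2)

/-- Read back the `b` register (positions `2*i + 1`). -/
def decodeB (n : Nat) (f : Nat → Bool) : Nat :=
  (List.range n).foldl (fun acc i => acc + (if f (2 * i + 1) then 2 ^ i else 0)) 0

/-- Read back the `a` register (positions `2*i + 2`). -/
def decodeA (n : Nat) (f : Nat → Bool) : Nat :=
  (List.range n).foldl (fun acc i => acc + (if f (2 * i + 2) then 2 ^ i else 0)) 0

#eval decodeB 3 (adder 3 0 (encode 5 2))  -- 7: sum bits land in the b register
#eval decodeA 3 (adder 3 0 (encode 5 2))  -- 2: the a register is restored
#eval adder 3 0 (encode 5 2) 0            -- false: carry-in is restored

end AdderSketch
```

The model has no carry-out wire, so the `b` register holds the sum modulo `2^n`, in line with the per-position statement of the theorem.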

FormalRV.Arithmetic.Cuccaro.CuccaroModReduce

FormalRV/Arithmetic/Cuccaro/CuccaroModReduce.lean
FormalRV.BQAlgo.CuccaroModReduce: exact-budget Cuccaro modular-reduction skeleton + formal blocker.
Tick 48: factor the Cuccaro subtract-constant primitive into its forward-only and reverse-only components, prove WellTyped for both, prove their composition equals the full subtract, and formalize the conclusion that no clean exact-budget modular reduction can be built from the current primitives without an additional qubit.
Structure:
- `cuccaro_subConstForwardOnlyGate`: prepare(K) ; MAJ chain. Exposes the comparison flag at the top carry position.
- `cuccaro_subConstReverseOnlyGate`: UMA chain ; prepare(K). Completes the subtraction when run after the forward gate.
- `cuccaro_subConst_forward_reverse_pointwise_eq`: pointwise equality of forward+reverse with the full subtract.
- Flag-behavior theorem (reuses the Tick 47 result).
- Blocker documentation: simulation (script `check_cuccaro_modreduce.py`) confirms no exact-budget candidate gives clean modular reduction.
defcuccaro_subConstForwardOnlyGate
def cuccaro_subConstForwardOnlyGate (bits q_start N : Nat) : Gate
**Forward-only Cuccaro subtract gate.** Prepares the two's-complement constant `K = 2^bits - N` in the read register, then runs the MAJ chain. Leaves the workspace in a dirty intermediate state but exposes the comparison flag at position `q_start + 2*bits`. This is the same gate as `cuccaro_compareConstForwardGate` from Tick 47, introduced under a name that matches the subtraction-decomposition framing of Tick 48.
defcuccaro_subConstReverseOnlyGate
def cuccaro_subConstReverseOnlyGate (bits q_start N : Nat) : Gate
**Reverse-only Cuccaro subtract gate.** Runs the reverse UMA chain, then unprepares the constant. When run AFTER the forward gate on the same input, the composition computes the full clean subtract.
theoremcuccaro_subConstForwardOnlyGate_wellTyped
theorem cuccaro_subConstForwardOnlyGate_wellTyped
    (bits q_start N dim : Nat) (h : q_start + 2 * bits + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_subConstForwardOnlyGate bits q_start N)
theoremcuccaro_subConstReverseOnlyGate_wellTyped
theorem cuccaro_subConstReverseOnlyGate_wellTyped
    (bits q_start N dim : Nat) (h : q_start + 2 * bits + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_subConstReverseOnlyGate bits q_start N)
theoremcuccaro_subConst_forward_reverse_pointwise_eq
theorem cuccaro_subConst_forward_reverse_pointwise_eq
    (bits q_start N : Nat) (f : Nat → Bool) (q : Nat) :
    Gate.applyNat
      (seq (cuccaro_subConstForwardOnlyGate bits q_start N)
            (cuccaro_subConstReverseOnlyGate bits q_start N)) f q
      = Gate.applyNat (cuccaro_subConstGate bits q_start N) f q
**Pointwise equality of forward ; reverse with the full subtract.**
theoremcuccaro_subConstForwardOnly_top_carry
theorem cuccaro_subConstForwardOnly_top_carry
    (bits q_start N x : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.applyNat (cuccaro_subConstForwardOnlyGate bits q_start N)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * bits)
      = decide (N ≤ x)
**Flag behavior of the forward-only subtract**: at the top carry position, the value is `decide (N ≤ x)`. Reused from Tick 47.
theoremcuccaro_subConstSkeleton_flag_value_at_use_point
theorem cuccaro_subConstSkeleton_flag_value_at_use_point
    (bits q_start N x : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.applyNat
        (cuccaro_subConstForwardOnlyGate bits q_start N)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * bits)
      = decide (N ≤ x)
**HEADLINE: flag-controlled action specification.** At the point where any "use the flag" operation is inserted between forward and reverse, the qubit at `q_start + 2 * bits` holds `decide (N ≤ x)`. This is the contract any candidate modular-reduction skeleton must satisfy.
theoremcuccaro_subConstGate_not_modular_reduction
theorem cuccaro_subConstGate_not_modular_reduction
    (bits q_start N x : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N < 2^bits) (hx : x < N) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (cuccaro_subConstGate bits q_start N)
          (cuccaro_input_F q_start false 0 x))
      ≠ x % N
**Formal blocker: no candidate single-step modular reduction.** The bare subtract-constant primitive does not compute modular reduction. Specifically, for `1 ≤ bits`, any `N` with `0 < N < 2^bits`, and any `x < N`, the bare subtract gives the WRONG result: the decoded target differs from `x % N`. This is proved by reduction to the existing `cuccaro_subConstSpec_of_lt` lemma: in the underflow case the spec equals `x + 2^bits - N`, whereas `x % N = x` because `x < N`, and `x + 2^bits - N ≠ x` because `N < 2^bits`.
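The arithmetic core of the blocker can be stated without any circuit. The following is a sketch in core Lean 4, where `P` is a hypothetical stand-in for `2^bits`; it only covers the final inequality, not the gate-level decode.

```lean
/-- With `x < N < P`, the wrapped difference `x + P - N` never equals `x % N`. -/
example (x N P : Nat) (hN : N < P) (hx : x < N) : x + P - N ≠ x % N := by
  rw [Nat.mod_eq_of_lt hx]  -- `x % N = x` since `x < N`
  omega                     -- `x + P - N > x` since `N < P`

-- Concrete instance: bits = 2, N = 3, x = 1 gives 1 + 4 - 3 = 2, but 1 % 3 = 1.
example : 1 + 2 ^ 2 - 3 ≠ 1 % 3 := by decide
```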

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRCondAdd

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRCondAdd.lean
FormalRV.BQAlgo.CuccaroSQIRCondAdd: SQIR-style conditional add-constant / subtract-constant gates and dirty-flag modular adder.
Tick 54: build the conditional add/sub primitives needed to turn the Tick 53 mod-2^bits skeleton into a true mod-N add-constant primitive.
Route chosen: B (masked constant preparation). Our Gate IR has X, CX, CCX but no controlled-CCX. Following the existing Gidney-route `prepareMaskedConstRead`/`conditionalAddConstGate` pattern, we use `CX(flagPos, read_pos_i)` for each set bit `i` of the constant.
This file lands:
- `sqir_prepareMaskedConstRead`: definition.
- per-position semantics (at_read, at_other) for masked prepare.
- `sqir_prepareMaskedConstRead_wellTyped`.
- `sqir_conditionalAddConstGate`: prepare(masked) ; full_adder ; prepare(masked).
- WellTyped for the conditional add.
- `sqir_conditionalSubConstGate`: alias for add by 2^bits - N.
Semantic correctness (target decode) for conditional add/sub and the dirty-flag modular adder is left for a follow-up tick due to the depth of the input-state equivalence argument needed.
defsqir_prepareMaskedConstRead
def sqir_prepareMaskedConstRead : Nat → Nat → Nat → Nat → Gate
  | 0,     _,       _, _       => Gate.I
  | n + 1, q_start, N, flagPos =>
      seq (sqir_prepareMaskedConstRead n q_start N flagPos)
          (cond (N.testBit n) (Gate.CX flagPos (q_start + 2 * n + 2)) Gate.I)
**Masked constant preparation**: for each `i < bits`, applies `CX flagPos (q_start + 2*i + 2)` exactly when `N.testBit i` is true, and the identity otherwise.
theoremsqir_prepareMaskedConstRead_at_other
theorem sqir_prepareMaskedConstRead_at_other
    (bits q_start N flagPos q : Nat)
    (hq : ∀ i, i < bits → q ≠ q_start + 2 * i + 2)
    (f : Nat → Bool) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f q = f q
**Frame**: the masked prepare gate doesn't touch any position that is not one of the read positions `q_start + 2*i + 2` (`i < bits`).
theoremsqir_prepareMaskedConstRead_at_flagPos
theorem sqir_prepareMaskedConstRead_at_flagPos
    (bits q_start N flagPos : Nat)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (f : Nat → Bool) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f flagPos = f flagPos
**Frame for flagPos**: as long as `flagPos` isn't a read position, the masked prepare gate doesn't touch `flagPos` (CX's control is read, not written).
theoremsqir_prepareMaskedConstRead_at_read
theorem sqir_prepareMaskedConstRead_at_read
    (bits q_start N flagPos j : Nat) (hj : j < bits) (f : Nat → Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f
        (q_start + 2 * j + 2)
      = xor (f (q_start + 2 * j + 2)) (f flagPos && N.testBit j)
**Action at read positions**: at `q_start + 2*j + 2` for `j < bits`, the value is XORed with `(f flagPos && N.testBit j)`.
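The frame and read-position statements can be seen together on a function-level model of the definition above. This is a sketch, assuming core Lean 4 and modelling `Gate.CX` directly by its classical action; `upd`, `maskedPrepare`, and `flagOnly` are hypothetical names, not library declarations.

```lean
namespace MaskedPrepareSketch

def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- Point update of a bit assignment. -/
def upd (f : Nat → Bool) (p : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = p then v else f q

/-- Classical action of the masked prepare: for each set bit `n` of `N`,
    XOR the flag into read position `q + 2*n + 2`. -/
def maskedPrepare : Nat → Nat → Nat → Nat → (Nat → Bool) → (Nat → Bool)
  | 0,     _, _, _,    f => f
  | n + 1, q, N, flag, f =>
      let g := maskedPrepare n q N flag f
      let tgt := q + 2 * n + 2
      if N.testBit n then upd g tgt (bxor (g tgt) (g flag)) else g

/-- All-zero state except for the flag position. -/
def flagOnly (flagPos : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = flagPos then v else false

-- bits = 3, q_start = 2, N = 5 (binary 101), flagPos = 1.
-- Read positions are 4, 6, 8.
#eval [4, 6, 8].map (maskedPrepare 3 2 5 1 (flagOnly 1 true))   -- [true, false, true]
#eval [4, 6, 8].map (maskedPrepare 3 2 5 1 (flagOnly 1 false))  -- [false, false, false]
#eval maskedPrepare 3 2 5 1 (flagOnly 1 true) 1                 -- true: flag untouched

end MaskedPrepareSketch
```

With the flag set, the read register receives the bits of `N`; with the flag clear, the state is unchanged.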
theoremsqir_prepareMaskedConstRead_wellTyped
theorem sqir_prepareMaskedConstRead_wellTyped
    (bits q_start N flagPos dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2) :
    Gate.WellTyped dim (sqir_prepareMaskedConstRead bits q_start N flagPos)
defsqir_conditionalAddConstGate
def sqir_conditionalAddConstGate (bits q_start N flagPos : Nat) : Gate
**Conditional add-constant gate**: adds `N` to the target register iff the flag is true. Uses masked prepare to encode the constant `N` into the read register conditionally on the flag value.
defsqir_conditionalSubConstGate
def sqir_conditionalSubConstGate (bits q_start N flagPos : Nat) : Gate
**Conditional sub-constant gate**: subtracts `N` from the target iff the flag is true. Implemented as conditional-add of `2^bits - N` (two's complement).
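The two's-complement trick is plain modular arithmetic and can be checked on concrete numbers. A small sketch in core Lean 4, with `bits = 4` and `N = 5` chosen arbitrarily for illustration:

```lean
-- Flag true, no underflow: x = 11. Adding 2^4 - 5 = 11 acts as subtracting 5.
example : (11 + (2 ^ 4 - 5)) % 2 ^ 4 = 11 - 5 := by decide

-- Flag true, underflow: x = 3 < N. The result wraps to 3 + 16 - 5 = 14.
example : (3 + (2 ^ 4 - 5)) % 2 ^ 4 = 14 := by decide

-- Flag false: nothing is added, so the target is unchanged.
example : (11 + 0) % 2 ^ 4 = 11 := by decide
```

The underflow case is why a comparator flag is needed before the conditional subtract can be used for reduction mod `N`.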
theoremsqir_conditionalAddConstGate_wellTyped
theorem sqir_conditionalAddConstGate_wellTyped
    (bits q_start N flagPos dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2) :
    Gate.WellTyped dim (sqir_conditionalAddConstGate bits q_start N flagPos)
theoremsqir_conditionalSubConstGate_wellTyped
theorem sqir_conditionalSubConstGate_wellTyped
    (bits q_start N flagPos dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2) :
    Gate.WellTyped dim (sqir_conditionalSubConstGate bits q_start N flagPos)
defsqir_style_modAddConst_dirtyFlag_candidate
def sqir_style_modAddConst_dirtyFlag_candidate
    (bits q_start N c flagPos : Nat) : Gate
**Dirty-flag modular add-constant candidate**: addConst(c) ; compareConst(N) ; conditionalSubConst(N). After this gate:
- target = `(x + c) % N` (when `x, c < N`);
- read register restored to 0;
- carry-in restored to false;
- flag (at `flagPos`) holds `decide (N ≤ (x + c) % 2^bits)`, which is DIRTY.
The flag is dirty because we don't uncompute the comparator. A clean modular add-constant requires either:
- a flag-uncompute step (e.g., another comparator with the right polarity), or
- accepting the dirty flag at the modAdd level and tracking it in the calling context.
For Shor's modular multiplier, the inner loops typically need a clean flag, so the next milestone is to uncompute the flag.
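The three-stage pipeline can be mirrored at the level of natural numbers. This is a sketch of the intended arithmetic only, assuming core Lean 4; `modAddModel` is a hypothetical name and says nothing about the gate-level construction.

```lean
namespace DirtyFlagSketch

/-- Arithmetic model of addConst(c) ; compareConst(N) ; conditionalSubConst(N).
    Returns the decoded target and the (dirty) flag. -/
def modAddModel (bits N c x : Nat) : Nat × Bool :=
  let s := (x + c) % 2 ^ bits                  -- addConst(c), mod 2^bits
  let flag := decide (N ≤ s)                   -- compareConst(N)
  let k := if flag then 2 ^ bits - N else 0    -- masked two's-complement constant
  ((s + k) % 2 ^ bits, flag)                   -- conditionalSubConst(N)

-- bits = 4, N = 5.
#eval modAddModel 4 5 3 4  -- (2, true):  (4 + 3) % 5 = 2, flag left set
#eval modAddModel 4 5 2 1  -- (3, false): (1 + 2) % 5 = 3, flag clear

end DirtyFlagSketch
```

In the first call the flag ends up `true` even though the target is correct, which is the dirtiness described above.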
theoremsqir_style_modAddConst_dirtyFlag_candidate_wellTyped
theorem sqir_style_modAddConst_dirtyFlag_candidate_wellTyped
    (bits q_start N c flagPos dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_distinct_top : flagPos ≠ q_start + 2 * bits) :
    Gate.WellTyped dim
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
theoremsqir_prepareMaskedConstRead_eq_id_at_flag_false
theorem sqir_prepareMaskedConstRead_eq_id_at_flag_false
    (bits q_start N flagPos : Nat) (f : Nat → Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_false : f flagPos = false) (q : Nat) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f q
      = f q
**Masked prepare with flag = false is identity** (per position).
theoremsqir_prepareMaskedConstRead_eq_unmasked_at_flag_true
theorem sqir_prepareMaskedConstRead_eq_unmasked_at_flag_true
    (bits q_start N flagPos : Nat) (f : Nat → Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_true : f flagPos = true) (q : Nat) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f q
      = Gate.applyNat (cuccaro_prepareConstRead bits q_start N) f q
**Masked prepare with flag = true equals `cuccaro_prepareConstRead N`** (per position).
theoremsqir_prepareMaskedConstRead_eq_id_fun
theorem sqir_prepareMaskedConstRead_eq_id_fun
    (bits q_start N flagPos : Nat) (f : Nat → Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_false : f flagPos = false) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f = f
**Function-level: masked prepare = id when flag = false.**
theoremsqir_prepareMaskedConstRead_eq_unmasked_fun
theorem sqir_prepareMaskedConstRead_eq_unmasked_fun
    (bits q_start N flagPos : Nat) (f : Nat → Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_true : f flagPos = true) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f
      = Gate.applyNat (cuccaro_prepareConstRead bits q_start N) f
**Function-level: masked prepare = `cuccaro_prepareConstRead N` when flag = true.**
theoremsqir_conditionalAddConstGate_apply_false_fun
theorem sqir_conditionalAddConstGate_apply_false_fun
    (bits q_start N flagPos : Nat) (g : Nat → Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_false : g flagPos = false)
    (h_flag_disjoint : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos) g
      = Gate.applyNat (cuccaro_n_bit_adder_full bits q_start) g
**HEADLINE: false-flag reduction.** When the flag value in the input state is `false`, the conditional add gate behaves like the bare full Cuccaro adder.
theoremsqir_conditionalAddConstGate_apply_true_fun
theorem sqir_conditionalAddConstGate_apply_true_fun
    (bits q_start N flagPos : Nat) (g : Nat → Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_true : g flagPos = true)
    (h_flag_disjoint : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos) g
      = Gate.applyNat (cuccaro_addConstGate bits q_start N) g
**HEADLINE: true-flag reduction.** When the flag value in the input state is `true`, the conditional add gate behaves like `cuccaro_addConstGate N`.
theoremcuccaro_carry_update_outside_locality
theorem cuccaro_carry_update_outside_locality
    (f : Nat → Bool) (q_start k p : Nat) (v : Bool)
    (h_p_outside : p < q_start ∨ q_start + 2 * k + 1 ≤ p) :
    cuccaro_carry (update f p v) q_start k = cuccaro_carry f q_start k
**Locality**: `cuccaro_carry f q_start k` doesn't depend on the input at positions outside its computation support, i.e. at any `p` with `p < q_start` or `q_start + 2*k + 1 ≤ p`.
theoremcuccaro_MAJ_commute_update
theorem cuccaro_MAJ_commute_update
    (a b c flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c)
    (h_neq_a : flagPos ≠ a) (h_neq_b : flagPos ≠ b) (h_neq_c : flagPos ≠ c) :
    Gate.applyNat (cuccaro_MAJ a b c) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_MAJ a b c) f) flagPos v
**`cuccaro_MAJ` commutes with `update` outside its wires.**
theoremcuccaro_UMA_commute_update
theorem cuccaro_UMA_commute_update
    (a b c flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c)
    (h_neq_a : flagPos ≠ a) (h_neq_b : flagPos ≠ b) (h_neq_c : flagPos ≠ c) :
    Gate.applyNat (cuccaro_UMA a b c) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_UMA a b c) f) flagPos v
**`cuccaro_UMA` commutes with `update` outside its wires.**
theoremcuccaro_maj_chain_commute_update_outside_workspace
theorem cuccaro_maj_chain_commute_update_outside_workspace
    (bits q_start flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_maj_chain bits q_start) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_maj_chain bits q_start) f) flagPos v
**`cuccaro_maj_chain` commutes with `update` outside its workspace.**
theoremcuccaro_uma_chain_reverse_commute_update_outside_workspace
theorem cuccaro_uma_chain_reverse_commute_update_outside_workspace
    (bits q_start flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_uma_chain_reverse bits q_start) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_uma_chain_reverse bits q_start) f) flagPos v
**`cuccaro_uma_chain_reverse` commutes with `update` outside its workspace.**
theoremcuccaro_n_bit_adder_full_commute_update_outside_workspace
theorem cuccaro_n_bit_adder_full_commute_update_outside_workspace
    (bits q_start flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_n_bit_adder_full bits q_start) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_n_bit_adder_full bits q_start) f) flagPos v
**`cuccaro_n_bit_adder_full` commutes with `update` outside its workspace.**
theoremcuccaro_n_bit_adder_full_update_outside_workspace_at
theorem cuccaro_n_bit_adder_full_update_outside_workspace_at
    (bits q_start flagPos : Nat) (v : Bool) (f : Nat → Bool) (p : Nat)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hp_in : q_start ≤ p ∧ p < q_start + 2 * bits + 1) :
    Gate.applyNat (cuccaro_n_bit_adder_full bits q_start) (update f flagPos v) p
      = Gate.applyNat (cuccaro_n_bit_adder_full bits q_start) f p
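**HEADLINE: full-adder locality at workspace under outside update.** Updating a position outside the workspace does not change the adder's output at any workspace position `p`.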
theoremcuccaro_n_bit_adder_full_preserves_outside_workspace
theorem cuccaro_n_bit_adder_full_preserves_outside_workspace
    (bits q_start flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_n_bit_adder_full bits q_start)
        (update f flagPos v) flagPos = v
**HEADLINE: full-adder preserves flagPos value.** A value `v` written at an outside-workspace `flagPos` is still `v` after the adder runs.
theoremcuccaro_prepareConstRead_commute_update_outside_workspace
theorem cuccaro_prepareConstRead_commute_update_outside_workspace
    (bits q_start c flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_prepareConstRead bits q_start c) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_prepareConstRead bits q_start c) f) flagPos v
**`cuccaro_prepareConstRead` commutes with `update` outside its workspace.**
theoremcuccaro_addConstGate_commute_update_outside_workspace
theorem cuccaro_addConstGate_commute_update_outside_workspace
    (bits q_start c flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_addConstGate bits q_start c) f) flagPos v
**`cuccaro_addConstGate` commutes with `update` outside its workspace.**
theoremcuccaro_addConstGate_update_outside_workspace_at
theorem cuccaro_addConstGate_update_outside_workspace_at
    (bits q_start c flagPos : Nat) (v : Bool) (f : Nat → Bool) (p : Nat)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hp_in : q_start ≤ p ∧ p < q_start + 2 * bits + 1) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c) (update f flagPos v) p
      = Gate.applyNat (cuccaro_addConstGate bits q_start c) f p
**HEADLINE: addConstGate locality at workspace under outside update.**
theoremcuccaro_addConstGate_preserves_outside_workspace
theorem cuccaro_addConstGate_preserves_outside_workspace
    (bits q_start c flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c)
        (update f flagPos v) flagPos = v
**HEADLINE: addConstGate preserves flagPos value.**
theoremsqir_conditionalAddConstGate_target_decode
theorem sqir_conditionalAddConstGate_target_decode
    (bits q_start N x flagPos : Nat) (flag : Bool)
    (hbits : 1 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos flag))
      = (x + (if flag then N else 0)) % 2^bits
**HEADLINE Deliverable C: conditional add target decode.** The decoded target is `(x + N) % 2^bits` when the flag is true and `x % 2^bits` when it is false.
theoremsqir_conditionalAddConstGate_carry_in_restored
theorem sqir_conditionalAddConstGate_carry_in_restored
    (bits q_start N x flagPos : Nat) (flag : Bool)
    (hbits : 1 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos flag) q_start = false
**Conditional add carry-in restored.**
theoremsqir_conditionalAddConstGate_read_decode
theorem sqir_conditionalAddConstGate_read_decode
    (bits q_start N x flagPos : Nat) (flag : Bool)
    (hbits : 1 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_read_val bits q_start
        (Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos flag)) = 0
**Conditional add read register restored.**
theoremsqir_conditionalAddConstGate_flag_preserved
theorem sqir_conditionalAddConstGate_flag_preserved
    (bits q_start N x flagPos : Nat) (flag : Bool)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos flag) flagPos = flag
**Conditional add flag preserved.**
theoremsqir_conditionalAddConstGate_clean
theorem sqir_conditionalAddConstGate_clean
    (bits q_start N x flagPos dim : Nat) (flag : Bool)
    (hbits : 1 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.WellTyped dim (sqir_conditionalAddConstGate bits q_start N flagPos)
    ∧ cuccaro_target_val bits q_start
          (Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos)
            (update (cuccaro_input_F q_start false 0 x) flagPos flag))
        = (x + (if flag then N else 0)) % 2^bits
**HEADLINE Deliverable E: packaged clean conditional add.**
theoremsqir_conditionalSubConstGate_target_decode
theorem sqir_conditionalSubConstGate_target_decode
    (bits q_start N x flagPos : Nat) (flag : Bool)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_conditionalSubConstGate bits q_start N flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos flag))
      = (x + (if flag then 2^bits - N else 0)) % 2^bits
**HEADLINE Deliverable F: conditional sub target decode.**

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRDirtyFlag

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRDirtyFlag.lean
(no documented top-level declarations)

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRDirtyFlag.CuccaroCleanModularAddCorrectness

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRDirtyFlag/CuccaroCleanModularAddCorrectness.lean
theoremsqir_style_modAddConst_clean_candidate_flag_restored
theorem sqir_style_modAddConst_clean_candidate_flag_restored
    (bits N c x : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N) :
    Gate.applyNat
        (sqir_style_modAddConst_clean_candidate bits 2 N c 1)
        (update (cuccaro_input_F 2 false 0 x) 1 false) 1
      = false
**HEADLINE Deliverable D: clean-candidate flag restoration.** At `flagPos` (here fixed to position 1, with `q_start = 2`), the clean candidate restores the input flag value `false`.
theoremsqir_style_modAddConst_clean_candidate_target_decode_qstart
theorem sqir_style_modAddConst_clean_candidate_target_decode_qstart
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat
          (sqir_style_modAddConst_clean_candidate bits q_start N c flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos false))
      = (x + c) % N
**R7d^xxix-L-3.9′ DELIVERABLE: q_start-parametric clean candidate target preservation.** q_start-parametric port of `sqir_style_modAddConst_clean_candidate_target_decode`. Replaces the hard-coded layout `q_start = 2`, `flagPos = 1` with free parameters and the standard outside-workspace hypotheses. The decoded target after the clean candidate equals `(x + c) % N`, regardless of where the workspace and flag sit.
Dependencies (all already q_start-parametric):
- `sqir_style_modAddConst_dirtyFlag_state_eq` (CuccaroSQIRDirtyFlag.lean:1378);
- `cuccaro_target_val_eq_sum_when_bits_match` (CuccaroDecoded.lean:102);
- `sqir_style_compareConst_candidate_workspace_restored_at_general` (CuccaroSQIRDirtyFlag.lean:568);
- `cuccaro_input_F_at_b` (CuccaroCorrectness.lean:240).
theoremsqir_style_modAddConst_clean_candidate_target_decode
theorem sqir_style_modAddConst_clean_candidate_target_decode
    (bits N c x : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N) :
    cuccaro_target_val bits 2
        (Gate.applyNat
          (sqir_style_modAddConst_clean_candidate bits 2 N c 1)
          (update (cuccaro_input_F 2 false 0 x) 1 false))
      = (x + c) % N
**HEADLINE Deliverable E: clean candidate target preservation.** For the fixed layout `q_start = 2`, `flagPos = 1`, the clean candidate's decoded target equals `(x + c) % N`.
theoremsqir_style_modAddConst_clean_candidate_clean
theorem sqir_style_modAddConst_clean_candidate_clean
    (bits N c x : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_modAddConst_clean_candidate bits 2 N c 1)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat
            (sqir_style_modAddConst_clean_candidate bits 2 N c 1)
            (update (cuccaro_input_F 2 false 0 x) 1 false))
        = (x + c) % N
**HEADLINE Deliverable F: clean candidate full bundle.** WellTyped, target = `(x + c) % N`, read restored, top carry restored, and flag restored.
theoremsqir_style_modAddConst_clean_gate_zero_eq
theorem sqir_style_modAddConst_clean_gate_zero_eq
    (bits N : Nat) :
    sqir_style_modAddConst_clean_gate bits N 0 = Gate.I
The wrapper at `c = 0` reduces to `Gate.I`.
theoremsqir_style_modAddConst_clean_gate_zero_clean
theorem sqir_style_modAddConst_clean_gate_zero_clean
    (bits N x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < N) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_modAddConst_clean_gate bits N 0)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat
            (sqir_style_modAddConst_clean_gate bits N 0)
            (update (cuccaro_input_F 2 false 0 x) 1 false))
        = x
    ∧ cuccaro_read_val bits 2
          (Gate.applyNat
**Deliverable B — c = 0 bundle.** At `c = 0` the gate is the identity, so all 5 conjuncts (`Gate.WellTyped`, target = `x`, read = 0, top carry = false, flag = false) reduce to facts about the input encoding.
theoremsqir_style_modAddConst_clean_gate_clean
theorem sqir_style_modAddConst_clean_gate_clean
    (bits N c x : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc : c < N) (hx : x < N) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_modAddConst_clean_gate bits N c)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat
            (sqir_style_modAddConst_clean_gate bits N c)
            (update (cuccaro_input_F 2 false 0 x) 1 false))
        = (x + c) % N
**HEADLINE Deliverable C — total clean modular add-constant theorem.** For all `c < N` (including `c = 0`), the wrapper's output satisfies five conjuncts: `Gate.WellTyped`, target = `(x + c) % N`, read = 0, top carry = false, and flag = false.
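The arithmetic behind the target conjunct can be checked independently of the circuit. The following is a standalone Lean 4 sketch, not part of FormalRV: `M` stands in for `2^bits`, and the lemma names `modAdd_no_overflow` and `modAdd_by_compare` are hypothetical. It illustrates why `c < N`, `x < N`, and `2 * N ≤ 2^bits` are enough: the plain sum fits in the register, and a single conditional subtraction of `N` yields `(x + c) % N`.

```lean
-- Standalone sketch (core Lean 4 only, no FormalRV imports).
-- `M` plays the role of `2^bits`; all names here are illustrative.

/-- With `x, c < N` and `2 * N ≤ M`, the plain sum does not overflow the register. -/
theorem modAdd_no_overflow (M N c x : Nat)
    (hN2 : 2 * N ≤ M) (hc : c < N) (hx : x < N) :
    x + c < M := by
  omega

/-- Compare-and-subtract computes the modular sum:
subtract `N` once exactly when `N ≤ x + c`. -/
theorem modAdd_by_compare (N c x : Nat) (hc : c < N) (hx : x < N) :
    (x + c) % N = if N ≤ x + c then x + c - N else x + c := by
  by_cases h : N ≤ x + c
  · -- Here `x + c < 2 * N`, so one subtraction lands below `N`.
    have hlt : x + c - N < N := by omega
    rw [if_pos h, Nat.mod_eq_sub_mod h, Nat.mod_eq_of_lt hlt]
  · -- Here the sum is already reduced.
    have hlt : x + c < N := by omega
    rw [if_neg h, Nat.mod_eq_of_lt hlt]
```

The second lemma matches the add, compare, conditionally subtract shape suggested by the gate names in this module; whether the actual proof is organised this way is an assumption.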
theoremsqir_style_modAddConst_clean_gate_clean_from_BasicSetting
theorem sqir_style_modAddConst_clean_gate_clean_from_BasicSetting
    (a r N m n c x : Nat)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n)
    (hc : c < N) (hx : x < N) :
    Gate.WellTyped (sqir_modmult_rev_anc (n + 1))
        (sqir_style_modAddConst_clean_gate (n + 1) N c)
    ∧ cuccaro_target_val (n + 1) 2
          (Gate.applyNat
            (sqir_style_modAddConst_clean_gate (n + 1) N c)
            (update (cuccaro_input_F 2 false 0 x) 1 false))
        = (x + c) % N
    ∧ cuccaro_read_val (n + 1) 2
**HEADLINE Deliverable D — BasicSetting-derived total clean mod-add-constant theorem.** At `bits := n + 1`, the SQIR-faithful sizing `2*N ≤ 2^(n+1)` follows from `BasicSetting`, removing the explicit `hN`, `hN2`, `hN_pos` preconditions.
theoremsqir_controlledAddConstPow2_target_decode
theorem sqir_controlledAddConstPow2_target_decode
    (bits q_start c x controlIdx : Nat) (control : Bool)
    (hbits : 1 ≤ bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (h_control_distinct : ∀ i, i < bits → controlIdx ≠ q_start + 2 * i + 2)
    (h_control_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_conditionalAddConstGate bits q_start c controlIdx)
          (update (cuccaro_input_F q_start false 0 x) controlIdx control))
      = (x + (if control then c else 0)) % 2^bits
**HEADLINE Task 3 — controlled add-mod-2^bits target decode.**
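The right-hand side `(x + (if control then c else 0)) % 2^bits` splits into two readable cases. The sketch below is a standalone Lean 4 lemma (hypothetical name `controlled_add_cases`, with `M` standing in for `2^bits`); it only restates the arithmetic and says nothing about the gate itself.

```lean
-- Standalone sketch (core Lean 4 only). `M` plays the role of `2^bits`.

/-- Control off: the target value `x` is unchanged (it is already below `M`).
Control on: the target becomes `(x + c) % M`. -/
theorem controlled_add_cases (M c x : Nat) (control : Bool) (hx : x < M) :
    (x + (if control then c else 0)) % M
      = if control then (x + c) % M else x := by
  cases control with
  | false => simpa using Nat.mod_eq_of_lt hx
  | true => simp
```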
theoremsqir_controlledCompareConst_at_control_true_eq_unmasked_fun
theorem sqir_controlledCompareConst_at_control_true_eq_unmasked_fun
    (bits q_start c controlIdx flagPos : Nat) (g : Nat → Bool)
    (h_control_distinct : ∀ i, i < bits → controlIdx ≠ q_start + 2 * i + 2)
    (h_control_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (h_control_ne_flag : controlIdx ≠ flagPos)
    (h_control_true : g controlIdx = true) :
    Gate.applyNat (sqir_controlledCompareConst bits q_start c controlIdx flagPos) g
      = Gate.applyNat (sqir_style_compareConst_candidate bits q_start c flagPos) g
**Helper — `sqir_controlledCompareConst` reduces to `sqir_style_compareConst_candidate` (same constant `c`) when `g controlIdx = true`.** Function-level equality.
theoremsqir_style_controlledModAddConst_gate_zero_eq
theorem sqir_style_controlledModAddConst_gate_zero_eq
    (bits N controlIdx : Nat) :
    sqir_style_controlledModAddConst_gate bits 2 N 0 controlIdx 1 = Gate.I
**Total wrapper at c = 0 reduces to `Gate.I`.**
theoremsqir_style_controlledModAddConst_gate_zero_clean
theorem sqir_style_controlledModAddConst_gate_zero_clean
    (bits N x controlIdx : Nat) (control : Bool)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_controlledModAddConst_gate bits 2 N 0 controlIdx 1)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat
            (sqir_style_controlledModAddConst_gate bits 2 N 0 controlIdx 1)
**HEADLINE partial Deliverable F — c = 0 bundle for the controlled modular add-constant wrapper.**
theoremGate.applyNat_CX_at_control_false_eq_id_fun
theorem Gate.applyNat_CX_at_control_false_eq_id_fun
    (control target : Nat) (f : Nat → Bool) (h : f control = false) :
    Gate.applyNat (Gate.CX control target) f = f
**Deliverable B — CX with control = false is the identity.**
theoremGate.applyNat_CX_at_control_true_eq_X_fun
theorem Gate.applyNat_CX_at_control_true_eq_X_fun
    (control target : Nat) (f : Nat → Bool) (h : f control = true) :
    Gate.applyNat (Gate.CX control target) f = Gate.applyNat (Gate.X target) f
**Deliverable B — CX with control = true equals X(target).**
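Both CX facts are easy to see in a toy classical model. The sketch below is self-contained Lean 4 and does not use the FormalRV definitions: `applyX` and `applyCX` are hypothetical stand-ins for `Gate.applyNat (Gate.X _)` and `Gate.applyNat (Gate.CX _ _)`, assumed to act on a bit assignment `f : Nat → Bool` in the usual way.

```lean
-- Toy model (assumption: NOT the FormalRV `Gate.applyNat`).

/-- Toy classical X: flip bit `t`, leave every other position alone. -/
def applyX (t : Nat) (f : Nat → Bool) (j : Nat) : Bool :=
  if j = t then !f t else f j

/-- Toy classical CX: flip bit `t` exactly when bit `c` is set. -/
def applyCX (c t : Nat) (f : Nat → Bool) (j : Nat) : Bool :=
  if j = t then (if f c then !f t else f t) else f j

/-- With the control bit clear, CX leaves the whole state unchanged. -/
theorem applyCX_control_false (c t : Nat) (f : Nat → Bool) (h : f c = false) :
    applyCX c t f = f := by
  funext j
  by_cases hj : j = t
  · subst hj
    simp [applyCX, h]
  · simp [applyCX, hj]

/-- With the control bit set, CX acts as X on the target. -/
theorem applyCX_control_true (c t : Nat) (f : Nat → Bool) (h : f c = true) :
    applyCX c t f = applyX t f := by
  funext j
  simp [applyCX, applyX, h]
```

Stating both results as function equalities (via `funext`) rather than pointwise mirrors the `_fun` suffix used in this module.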
theoremcuccaro_maj_chain_top_carry_on_input_F_zero_a
theorem cuccaro_maj_chain_top_carry_on_input_F_zero_a
    (bits q_start x : Nat) (hbits : 1 ≤ bits) (hx : x < 2^bits) :
    Gate.applyNat (cuccaro_maj_chain bits q_start)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * bits) = false
**Helper — maj_chain on `cuccaro_input_F` with `a = 0` has top carry = false.** Derived from `cuccaro_compareConstForward_top_carry` with `N = 2^bits` (reducing the prepare to identity).
theoremsqir_controlledCompareConst_at_control_false_on_input_F_eq_id_fun
theorem sqir_controlledCompareConst_at_control_false_on_input_F_eq_id_fun
    (bits q_start c controlIdx flagPos x : Nat)
    (hbits : 1 ≤ bits) (hx : x < 2^bits)
    (h_control_distinct : ∀ i, i < bits → controlIdx ≠ q_start + 2 * i + 2)
    (h_control_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (h_control_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_controlledCompareConst bits q_start c controlIdx flagPos)
        (update (cuccaro_input_F q_start false 0 x) controlIdx false)
      = update (cuccaro_input_F q_start false 0 x) controlIdx false
**Deliverable A — controlled comparator at control = false is identity on `cuccaro_input_F`-shaped input.**
theoremsqir_style_compareConst_candidate_on_input_F_x_lt_N_eq_id_fun
theorem sqir_style_compareConst_candidate_on_input_F_x_lt_N_eq_id_fun
    (bits N x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < N) :
    Gate.applyNat (sqir_style_compareConst_candidate bits 2 N 1)
        (cuccaro_input_F 2 false 0 x)
      = cuccaro_input_F 2 false 0 x
**Deliverable A — uncontrolled comparator identity on `cuccaro_input_F` when `x < N`.** Since `decide(N ≤ x) = false`, the comparator XORs false into flagPos (no change), and workspace + outside positions are preserved.
theoremcuccaro_input_F_at_controlIdx_outside_eq_false
theorem cuccaro_input_F_at_controlIdx_outside_eq_false
    (bits x controlIdx : Nat) (hx : x < 2^bits)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx) :
    cuccaro_input_F 2 false 0 x controlIdx = false
**Helper — `cuccaro_input_F` at `controlIdx` outside workspace is `false`.**
theoremupdate_input_F_controlIdx_false_eq_F
theorem update_input_F_controlIdx_false_eq_F
    (bits x controlIdx : Nat) (hx : x < 2^bits)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx) :
    update (cuccaro_input_F 2 false 0 x) controlIdx false
      = cuccaro_input_F 2 false 0 x
**Helper — `update F controlIdx false = F` when F at controlIdx is false.**
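The helper is an instance of a general fact: overwriting a position with the value it already holds changes nothing. A minimal standalone Lean 4 sketch follows; the `update` defined here is a local toy definition assumed to behave like the project's `update`, and `update_eq_self` is a hypothetical name.

```lean
-- Toy model (assumption: the project's `update` behaves like this one).

/-- Point update of a bit assignment: position `i` gets `v`, the rest is `f`. -/
def update (f : Nat → Bool) (i : Nat) (v : Bool) (j : Nat) : Bool :=
  if j = i then v else f j

/-- Writing back the value already stored at `i` is a no-op. -/
theorem update_eq_self (f : Nat → Bool) (i : Nat) (v : Bool) (h : f i = v) :
    update f i v = f := by
  funext j
  by_cases hj : j = i
  · subst hj
    simp [update, h]
  · simp [update, hj]

/-- The shape used above: the state is `false` at `controlIdx`, so writing `false` there is the identity. -/
example (F : Nat → Bool) (controlIdx : Nat) (h : F controlIdx = false) :
    update F controlIdx false = F :=
  update_eq_self F controlIdx false h
```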
theoremsqir_style_controlledModAddConst_candidate_control_false_state_eq
theorem sqir_style_controlledModAddConst_candidate_control_false_state_eq
    (bits N c x controlIdx : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    Gate.applyNat
        (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
        (update (cuccaro_input_F 2 false 0 x) controlIdx false)
      = update (cuccaro_input_F 2 false 0 x) controlIdx false
**HEADLINE Deliverable C — control = false state equality for the controlled mod-N add candidate.**
theoremsqir_style_controlledModAddConst_candidate_target_decode_control_false
theorem sqir_style_controlledModAddConst_candidate_target_decode_control_false
    (bits N c x controlIdx : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    cuccaro_target_val bits 2
        (Gate.applyNat
          (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
          (update (cuccaro_input_F 2 false 0 x) controlIdx false))
      = x
**Control=false target decode = x.**
theoremcuccaro_input_F_at_controlIdx_outside_eq_false_qstart
theorem cuccaro_input_F_at_controlIdx_outside_eq_false_qstart
    (q_start bits x controlIdx : Nat) (hx : x < 2^bits)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    cuccaro_input_F q_start false 0 x controlIdx = false
**q_start-parametric variant** of `cuccaro_input_F_at_controlIdx_outside_eq_false`. Same fact, fully parametric.
theoremupdate_input_F_controlIdx_false_eq_F_qstart
theorem update_input_F_controlIdx_false_eq_F_qstart
    (q_start bits x controlIdx : Nat) (hx : x < 2^bits)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    update (cuccaro_input_F q_start false 0 x) controlIdx false
      = cuccaro_input_F q_start false 0 x
**q_start-parametric variant** of `update_input_F_controlIdx_false_eq_F`.
theoremsqir_style_compareConst_candidate_on_input_F_x_lt_N_eq_id_fun_qstart
theorem sqir_style_compareConst_candidate_on_input_F_x_lt_N_eq_id_fun_qstart
    (bits q_start N flagPos x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < N)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x)
      = cuccaro_input_F q_start false 0 x
**q_start-parametric variant** of `sqir_style_compareConst_candidate_on_input_F_x_lt_N_eq_id_fun`. Adds an explicit `hflag_out` hypothesis so `flagPos` can be at any outside-workspace position.
theoremsqir_style_controlledModAddConst_candidate_control_false_state_eq_qstart
theorem sqir_style_controlledModAddConst_candidate_control_false_state_eq_qstart
    (bits q_start N c x controlIdx flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat
        (sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos)
        (update (cuccaro_input_F q_start false 0 x) controlIdx false)
      = update (cuccaro_input_F q_start false 0 x) controlIdx false
**q_start-parametric variant** of `sqir_style_controlledModAddConst_candidate_control_false_state_eq`. When the control is false, the controlled mod-add candidate is the identity on the appropriate base state. Parametric in both `q_start` and `flagPos`.
theoremsqir_style_controlledModAddConst_candidate_target_decode_control_false_qstart
theorem sqir_style_controlledModAddConst_candidate_target_decode_control_false_qstart
    (bits q_start N c x controlIdx flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat
          (sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos)
          (update (cuccaro_input_F q_start false 0 x) controlIdx false))
**PRIMARY L-3.6′ DELIVERABLE: q_start-parametric control = false target-decode.** The candidate controlled mod-add gate, applied to the zero-accumulator Cuccaro base with the control bit set to false, decodes to `x` at the target. Parametric in both `q_start` and `flagPos`.
theoremsqir_style_controlledModAddConst_candidate_workspace_control_false_qstart
theorem sqir_style_controlledModAddConst_candidate_workspace_control_false_qstart
    (bits q_start N c x controlIdx flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    cuccaro_read_val bits q_start
          (Gate.applyNat
            (sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos)
            (update (cuccaro_input_F q_start false 0 x) controlIdx false))
**R7d^xxix-L-3.7′ DELIVERABLE: q_start-parametric control=false workspace bundle (4-conjunct).** Mirrors `sqir_style_controlledModAddConst_candidate_workspace_control_false` but parametric in `q_start` and `flagPos`. Both `controlIdx` and `flagPos` must lie OUTSIDE the Cuccaro workspace `[q_start, q_start + 2 * bits + 1)` and be distinct. After the candidate gate is applied to `(update F controlIdx false)`:
1. `cuccaro_read_val bits q_start` of the output = 0;
2. position `q_start + 2 * bits` (top carry) = false;
3. position `flagPos` = false;
4. position `controlIdx` = false.

Closes via the already-landed `sqir_style_controlledModAddConst_candidate_control_false_state_eq_qstart`.
theoremsqir_style_controlledModAddConst_candidate_clean_control_false_qstart
theorem sqir_style_controlledModAddConst_candidate_clean_control_false_qstart
    (bits q_start N c x dim controlIdx flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_control_lt_dim : controlIdx < dim)
    (h_flag_lt_dim : flagPos < dim) :
    Gate.WellTyped dim
**R7d^xxix-L-3.8′ DELIVERABLE: q_start-parametric control=false clean bundle.** Bundles the already-proved q_start-parametric facts for `sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos` applied to `(update F controlIdx false)`:
1. `Gate.WellTyped dim` of the candidate gate.
2. target decoded value = `x` (no-op on the target).
3. read register = 0.
4. top carry position (`q_start + 2 * bits`) = false.
5. `flagPos` = false.
6. `controlIdx` = false.

Parametric in `q_start`, `flagPos`, `controlIdx`, AND the ambient dimension `dim`. Wrapper over:
- `sqir_style_controlledModAddConst_candidate_target_decode_control_false_qstart`,
- `sqir_style_controlledModAddConst_candidate_workspace_control_false_qstart`,
- the six existing q_start-parametric WellTyped sub-lemmas (`sqir_conditionalAddConstGate_wellTyped`, `sqir_style_compareConst_candidate_wellTyped`, `sqir_conditionalSubConstGate_wellTyped`, `cuccaro_maj_chain_wellTyped`, `cuccaro_maj_chain_inv_wellTyped`, `sqir_prepareMaskedConstRead_wellTyped`).

No new infrastructure introduced; control=true direction NOT touched.
theoremsqir_style_controlledModAddConst_candidate_workspace_control_false
theorem sqir_style_controlledModAddConst_candidate_workspace_control_false
    (bits N c x controlIdx : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    cuccaro_read_val bits 2
          (Gate.applyNat
            (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
            (update (cuccaro_input_F 2 false 0 x) controlIdx false))
        = 0
**Control=false bundle (4-conjunct):** read = 0, top carry = false, flag = false, controlIdx = false.

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRDirtyFlag.CuccaroControlledModularAddCorrectness

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRDirtyFlag/CuccaroControlledModularAddCorrectness.lean
## Tick 69 — Control preservation helpers + control=true work.
theoremcuccaro_addConstGate_preserves_outside_workspace_at
theorem cuccaro_addConstGate_preserves_outside_workspace_at
    (bits q_start c controlIdx : Nat) (g : Nat → Bool)
    (h_control_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c) g controlIdx = g controlIdx
**Deliverable A — addConstGate preserves any outside-workspace position.**
theoremsqir_conditionalAddConstGate_preserves_outside
theorem sqir_conditionalAddConstGate_preserves_outside
    (bits q_start c flagPos controlIdx : Nat) (g : Nat → Bool)
    (h_control_distinct : ∀ i, i < bits → controlIdx ≠ q_start + 2 * i + 2)
    (h_control_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    Gate.applyNat (sqir_conditionalAddConstGate bits q_start c flagPos) g controlIdx
      = g controlIdx
**Deliverable C — conditionalAddConstGate preserves outside-workspace position (when distinct from read positions and flag position).**
theoremsqir_conditionalSubConstGate_preserves_outside
theorem sqir_conditionalSubConstGate_preserves_outside
    (bits q_start N flagPos controlIdx : Nat) (g : Nat → Bool)
    (h_control_distinct : ∀ i, i < bits → controlIdx ≠ q_start + 2 * i + 2)
    (h_control_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    Gate.applyNat (sqir_conditionalSubConstGate bits q_start N flagPos) g controlIdx
      = g controlIdx
**conditionalSubConstGate preserves outside-workspace position.**
theoremsqir_controlledCompareConst_preserves_control_outside
theorem sqir_controlledCompareConst_preserves_control_outside
    (bits q_start c controlIdx flagPos : Nat) (g : Nat → Bool)
    (h_control_distinct : ∀ i, i < bits → controlIdx ≠ q_start + 2 * i + 2)
    (h_control_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (h_control_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_controlledCompareConst bits q_start c controlIdx flagPos) g controlIdx
      = g controlIdx
**`sqir_controlledCompareConst` preserves outside-workspace position (when distinct from read positions and flagPos).**
theoremsqir_style_controlledModAddConst_candidate_preserves_control
theorem sqir_style_controlledModAddConst_candidate_preserves_control
    (bits N c x controlIdx : Nat) (control : Bool)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    Gate.applyNat
        (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
        (update (cuccaro_input_F 2 false 0 x) controlIdx control) controlIdx
      = control
**Partial Deliverable D — control bit is preserved through the controlled mod-N add candidate.**
theoremGate.applyNat_X_commute_update_outside_fun
theorem Gate.applyNat_X_commute_update_outside_fun
    (target controlIdx : Nat) (v : Bool) (f : Nat → Bool) (h : controlIdx ≠ target) :
    Gate.applyNat (Gate.X target) (update f controlIdx v)
      = update (Gate.applyNat (Gate.X target) f) controlIdx v
**X commutes with `update` at an outside position (≠ target).**
theoremGate.applyNat_CX_commute_update_outside_fun
theorem Gate.applyNat_CX_commute_update_outside_fun
    (control target controlIdx : Nat) (v : Bool) (f : Nat → Bool)
    (h_ne_control : controlIdx ≠ control) (h_ne_target : controlIdx ≠ target) :
    Gate.applyNat (Gate.CX control target) (update f controlIdx v)
      = update (Gate.applyNat (Gate.CX control target) f) controlIdx v
**CX commutes with `update` at an outside position (≠ control and ≠ target).**
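The two commutation lemmas above follow one pattern: a gate that reads and writes only its own wires cannot observe a write at a different position. The sketch below reproduces that pattern in a self-contained toy model; `update`, `applyX`, and `applyCX` are local, hypothetical definitions assumed to mirror the project's classical semantics, not the FormalRV ones.

```lean
-- Toy model (assumption: NOT the FormalRV definitions).

def update (f : Nat → Bool) (i : Nat) (v : Bool) (j : Nat) : Bool :=
  if j = i then v else f j

/-- Toy classical X: flip bit `t`. -/
def applyX (t : Nat) (f : Nat → Bool) (j : Nat) : Bool :=
  if j = t then !f t else f j

/-- Toy classical CX: flip bit `t` exactly when bit `c` is set. -/
def applyCX (c t : Nat) (f : Nat → Bool) (j : Nat) : Bool :=
  if j = t then (if f c then !f t else f t) else f j

/-- X on `t` commutes with a write at `i ≠ t`. -/
theorem applyX_commute_update (t i : Nat) (v : Bool) (f : Nat → Bool)
    (h : i ≠ t) :
    applyX t (update f i v) = update (applyX t f) i v := by
  funext j
  have h' : ¬ t = i := fun e => h e.symm
  by_cases hjt : j = t
  · subst hjt
    simp [applyX, update, h']
  · simp [applyX, update, hjt]

/-- CX on `(c, t)` commutes with a write at `i` distinct from both wires. -/
theorem applyCX_commute_update (c t i : Nat) (v : Bool) (f : Nat → Bool)
    (hc : i ≠ c) (ht : i ≠ t) :
    applyCX c t (update f i v) = update (applyCX c t f) i v := by
  funext j
  have hc' : ¬ c = i := fun e => hc e.symm
  have ht' : ¬ t = i := fun e => ht e.symm
  by_cases hjt : j = t
  · subst hjt
    simp [applyCX, update, hc', ht']
  · simp [applyCX, update, hjt]
```

Commutation for composite gates plausibly follows by chaining such single-gate facts along the gate sequence, though the actual proofs in this module may be organised differently.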
theoremsqir_prepareMaskedConstRead_commute_update_outside_workspace
theorem sqir_prepareMaskedConstRead_commute_update_outside_workspace
    (bits q_start N flagPos controlIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) (update f controlIdx v)
      = update (Gate.applyNat (sqir_prepareMaskedConstRead bits q_start N flagPos) f)
              controlIdx v
**Deliverable A — masked prepare commutes with `update` at outside position.**
theoremcuccaro_maj_chain_inv_commute_update_outside_workspace_fun
theorem cuccaro_maj_chain_inv_commute_update_outside_workspace_fun
    (bits q_start flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_maj_chain_inv bits q_start) (update f flagPos v)
      = update (Gate.applyNat (cuccaro_maj_chain_inv bits q_start) f) flagPos v
**Function-level commute for `cuccaro_maj_chain_inv`.** Lifts the existing position-level theorem to a function equality.
theoremsqir_style_compareConst_candidate_commute_update_outside_fun
theorem sqir_style_compareConst_candidate_commute_update_outside_fun
    (bits q_start N flagPos controlIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (update f controlIdx v)
      = update (Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos) f)
              controlIdx v
**Deliverable B — comparator commutes with `update` at outside position.**
theoremsqir_conditionalAddConstGate_commute_update_outside_fun
theorem sqir_conditionalAddConstGate_commute_update_outside_fun
    (bits q_start N flagPos controlIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos) (update f controlIdx v)
      = update (Gate.applyNat (sqir_conditionalAddConstGate bits q_start N flagPos) f)
              controlIdx v
**Deliverable C — conditionalAdd commutes with `update` at outside position.**
theoremsqir_conditionalSubConstGate_commute_update_outside_fun
theorem sqir_conditionalSubConstGate_commute_update_outside_fun
    (bits q_start N flagPos controlIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_conditionalSubConstGate bits q_start N flagPos) (update f controlIdx v)
      = update (Gate.applyNat (sqir_conditionalSubConstGate bits q_start N flagPos) f)
              controlIdx v
**Deliverable C — conditionalSub commutes with `update` at outside position.**
theoremsqir_style_modAddConst_clean_candidate_commute_update_control_outside_qstart
theorem sqir_style_modAddConst_clean_candidate_commute_update_control_outside_qstart
    (bits q_start N c controlIdx flagPos : Nat) (v : Bool) (f : Nat → Bool)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_style_modAddConst_clean_candidate bits q_start N c flagPos)
        (update f controlIdx v)
      = update (Gate.applyNat (sqir_style_modAddConst_clean_candidate bits q_start N c flagPos) f)
              controlIdx v
**R7d^xxix-L-3.10′ helper: q_start-parametric clean modadd candidate commutes with `update` at controlIdx outside workspace ∪ {flagPos}.** q_start-parametric port of `sqir_style_modAddConst_clean_candidate_commute_update_control_outside`.

All sub-deps already q_start-parametric:
- `cuccaro_addConstGate_commute_update_outside_workspace` (CuccaroSQIRCondAdd.lean:685);
- `sqir_style_compareConst_candidate_commute_update_outside_fun` (CuccaroSQIRDirtyFlag.lean:3132);
- `sqir_conditionalSubConstGate_commute_update_outside_fun` (:3174);
- `Gate.applyNat_X_commute_update_outside_fun` (:3039, generic).
theoremsqir_style_controlledModAddConst_candidate_control_true_state_eq_qstart
theorem sqir_style_controlledModAddConst_candidate_control_true_state_eq_qstart
    (bits q_start N c x controlIdx flagPos : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat
        (sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos)
        (update (cuccaro_input_F q_start false 0 x) controlIdx true)
      = update (Gate.applyNat
                  (sqir_style_modAddConst_clean_candidate bits q_start N c flagPos)
**R7d^xxix-L-3.10′ HEADLINE: q_start-parametric control=true state equality for the controlled mod-N add candidate.** q_start-parametric port of `sqir_style_controlledModAddConst_candidate_control_true_state_eq`. The 5-stage rewrite chain is mirrored with `2 → q_start`, `1 → flagPos`, free `controlIdx`, with the standard outside-workspace hypotheses on both `controlIdx` and `flagPos`, plus distinctness. The state equality lifts the controlled candidate's action (under external control = true) to the uncontrolled clean candidate applied to `cuccaro_input_F q_start false 0 x`, with the `controlIdx` slot pinned to `true` on both sides.

Dependencies (all already q_start-parametric or just landed):
- `sqir_style_modAddConst_clean_candidate_commute_update_control_outside_qstart` (above, this tick);
- `sqir_conditionalAddConstGate_apply_true_fun` (CuccaroSQIRCondAdd.lean:379);
- `sqir_conditionalSubConstGate_preserves_outside` (:2951);
- `sqir_style_compareConst_candidate_frame_outside` (:175);
- `cuccaro_addConstGate_preserves_outside_workspace_at` (:2918);
- `sqir_controlledCompareConst_at_control_true_eq_unmasked_fun` (:2091);
- `Gate.applyNat_CX_at_control_true_eq_X_fun` (generic CX).
theoremsqir_style_modAddConst_clean_candidate_commute_update_control_outside
theorem sqir_style_modAddConst_clean_candidate_commute_update_control_outside
    (bits N c controlIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    Gate.applyNat (sqir_style_modAddConst_clean_candidate bits 2 N c 1)
        (update f controlIdx v)
      = update (Gate.applyNat (sqir_style_modAddConst_clean_candidate bits 2 N c 1) f)
              controlIdx v
**HEADLINE Deliverable D — clean modadd candidate commutes with `update` at controlIdx outside workspace ∪ {flagPos = 1}.**
theoremcuccaro_target_val_update_outside_workspace
theorem cuccaro_target_val_update_outside_workspace
    (bits q_start controlIdx : Nat) (v : Bool) (Y : Nat → Bool)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    cuccaro_target_val bits q_start (update Y controlIdx v)
      = cuccaro_target_val bits q_start Y
**Helper — `cuccaro_target_val` is invariant under `update` at outside controlIdx.**
theoremcuccaro_read_val_update_outside_workspace
theorem cuccaro_read_val_update_outside_workspace
    (bits q_start controlIdx : Nat) (v : Bool) (Y : Nat → Bool)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx) :
    cuccaro_read_val bits q_start (update Y controlIdx v)
      = cuccaro_read_val bits q_start Y
**Helper — `cuccaro_read_val` is invariant under `update` at outside controlIdx.**
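Both invariance helpers say that a decoder which only reads positions inside the workspace ignores writes outside it. The sketch below shows the idea on a toy decoder, in standalone Lean 4. The `decode` defined here is hypothetical and reads the contiguous block `[q, q + n)`; the real `cuccaro_target_val` and `cuccaro_read_val` read interleaved positions, so this is an analogy under stated assumptions, not the project's definition.

```lean
-- Toy model (assumption: NOT the FormalRV decoders or `update`).

def update (f : Nat → Bool) (i : Nat) (v : Bool) (j : Nat) : Bool :=
  if j = i then v else f j

/-- Little-endian value of the `n` bits stored at positions `q, q+1, ..., q+n-1`. -/
def decode (q : Nat) (f : Nat → Bool) : Nat → Nat
  | 0 => 0
  | n + 1 => decode q f n + (if f (q + n) then 2 ^ n else 0)

/-- A write at a position outside `[q, q + n)` does not change the decoded value. -/
theorem decode_update_outside (q i : Nat) (v : Bool) (f : Nat → Bool) (n : Nat) :
    (i < q ∨ q + n ≤ i) → decode q (update f i v) n = decode q f n := by
  induction n with
  | zero => intro _; simp [decode]
  | succ k ih =>
    intro h
    -- `i` is also outside the shorter block, and differs from the new top position.
    have hk : i < q ∨ q + k ≤ i := by omega
    have hne : ¬ q + k = i := by omega
    simp [decode, update, ih hk, hne]
```

The hypothesis has the same "below the start or at/after the end" shape as `hcontrol_out` in the helpers above.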
theoremsqir_style_controlledModAddConst_candidate_control_true_state_eq
theorem sqir_style_controlledModAddConst_candidate_control_true_state_eq
    (bits N c x controlIdx : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
        (update (cuccaro_input_F 2 false 0 x) controlIdx true)
      = update (Gate.applyNat (sqir_style_modAddConst_clean_candidate bits 2 N c 1)
                  (update (cuccaro_input_F 2 false 0 x) 1 false)) controlIdx true
**HEADLINE Deliverable A — control=true state equality for the controlled mod-N add candidate.**
theoremsqir_style_controlledModAddConst_candidate_target_decode_control_true_qstart
theorem sqir_style_controlledModAddConst_candidate_target_decode_control_true_qstart
    (bits q_start N c x controlIdx flagPos : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat
          (sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos)
          (update (cuccaro_input_F q_start false 0 x) controlIdx true))
      = (x + c) % N
**R7d^xxix-L-3.11′ DELIVERABLE: q_start-parametric control=true target decode.** q_start-parametric port of `sqir_style_controlledModAddConst_candidate_target_decode_control_true`. Three-step thin consequence of the L-3.10′ state equality:
1. rewrite via `_control_true_state_eq_qstart` to expose the uncontrolled clean candidate applied to `cuccaro_input_F q_start` with the `controlIdx` slot wrapped in `update _ controlIdx true`;
2. strip the outer `update controlIdx true` via `cuccaro_target_val_update_outside_workspace` (controlIdx lies outside the workspace by hypothesis);
3. close with `_modAddConst_clean_candidate_target_decode_qstart` (L-3.9′).
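The three steps form a plain rewrite chain, which can be seen with all circuit details abstracted away. In the standalone Lean 4 sketch below every name is hypothetical: `ctrlOut` stands for the controlled candidate's output state, `cleanOut` for the uncontrolled clean candidate's output, `upd` for `update _ controlIdx true`, and `decode` for the target decoder.

```lean
-- Abstract sketch of the three-step argument; all names are illustrative.

theorem decode_via_state_eq {State : Type}
    (decode : State → Nat) (upd : State → State)
    (ctrlOut cleanOut : State) (r : Nat)
    -- Step 1 input: the state equality (controlled output = pinned clean output).
    (h_state : ctrlOut = upd cleanOut)
    -- Step 2 input: the decoder ignores the outside write.
    (h_inv : ∀ Y, decode (upd Y) = decode Y)
    -- Step 3 input: the uncontrolled target decode.
    (h_clean : decode cleanOut = r) :
    decode ctrlOut = r := by
  rw [h_state, h_inv, h_clean]
```

Instantiating `r` with `(x + c) % N` gives the shape of the statement above.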
theoremsqir_style_controlledModAddConst_candidate_target_decode_control_true
theorem sqir_style_controlledModAddConst_candidate_target_decode_control_true
    (bits N c x controlIdx : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
          (update (cuccaro_input_F 2 false 0 x) controlIdx true))
      = (x + c) % N
**Deliverable B — control=true target decode.**
theoremsqir_style_modAddConst_clean_candidate_flag_restored_qstart
theorem sqir_style_modAddConst_clean_candidate_flag_restored_qstart
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_clean_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false) flagPos
      = false
**R7d^xxix-L-3.12′ helper: q_start-parametric clean-candidate flag restoration.** At `flagPos`, the uncontrolled clean candidate restores the flag to `false`. Direct q_start port of `sqir_style_modAddConst_clean_candidate_flag_restored` (line 1462).
theoremsqir_style_controlledModAddConst_candidate_workspace_control_true_qstart
theorem sqir_style_controlledModAddConst_candidate_workspace_control_true_qstart
    (bits q_start N c x controlIdx flagPos : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    cuccaro_read_val bits q_start
          (Gate.applyNat
            (sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos)
            (update (cuccaro_input_F q_start false 0 x) controlIdx true))
        = 0
**R7d^xxix-L-3.12′ DELIVERABLE: q_start-parametric control=true workspace bundle.** 4-conjunct workspace bundle when the external control bit is `true`. After applying the controlled mod-N add candidate to `(update (cuccaro_input_F q_start false 0 x) controlIdx true)`:
1. `cuccaro_read_val` (read register) = 0;
2. position `q_start + 2 * bits` (top carry) = false;
3. position `flagPos` = false;
4. position `controlIdx` = true (preserved external control).
The proof strategy mirrors the hard-coded version (line 3481+) but uses the L-3.10′ state-eq + inline read/top-carry computation + `_flag_restored_qstart` (helper above).
theoremsqir_style_controlledModAddConst_candidate_workspace_control_true
theorem sqir_style_controlledModAddConst_candidate_workspace_control_true
    (bits N c x controlIdx : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    cuccaro_read_val bits 2
          (Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
            (update (cuccaro_input_F 2 false 0 x) controlIdx true))
        = 0
    ∧ Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
          (update (cuccaro_input_F 2 false 0 x) controlIdx true) (2 + 2 * bits)
**Deliverable C: control=true workspace bundle (4-conjunct).**
theoremsqir_style_controlledModAddConst_candidate_target_decode
theorem sqir_style_controlledModAddConst_candidate_target_decode
    (bits N c x controlIdx : Nat) (control : Bool) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
          (update (cuccaro_input_F 2 false 0 x) controlIdx control))
      = if control then (x + c) % N else x
**HEADLINE Deliverable D: combined controlled target decode.**
theoremsqir_style_controlledModAddConst_candidate_workspace
theorem sqir_style_controlledModAddConst_candidate_workspace
    (bits N c x controlIdx : Nat) (control : Bool) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    cuccaro_read_val bits 2
          (Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
            (update (cuccaro_input_F 2 false 0 x) controlIdx control))
        = 0
    ∧ Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
          (update (cuccaro_input_F 2 false 0 x) controlIdx control) (2 + 2 * bits)
**Deliverable E: combined workspace bundle.**
theoremsqir_style_controlledModAddConst_candidate_clean_qstart
theorem sqir_style_controlledModAddConst_candidate_clean_qstart
    (bits q_start N c x dim controlIdx flagPos : Nat) (control : Bool)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_control_lt_dim : controlIdx < dim)
    (h_flag_lt_dim : flagPos < dim) :
    Gate.WellTyped dim
**R7d^xxix-L-3.13′ DELIVERABLE: q_start-parametric controlled candidate clean bundle (combined over both control branches).** 6-conjunct bundle parametric in `q_start`, `flagPos`, `controlIdx`, `dim`, and `control : Bool`:
1. `Gate.WellTyped dim` of the candidate gate;
2. target decode = `if control then (x + c) % N else x`;
3. read register = 0;
4. position `q_start + 2 * bits` (top carry) = false;
5. position `flagPos` = false;
6. position `controlIdx` = `control` (preserved external control).
Mechanical case-split on `control`: the false branch is fully delegated to `_clean_control_false_qstart` (L-3.8′); the true branch reuses `_clean_control_false_qstart` only to extract the control-independent `Gate.WellTyped`, then assembles the remaining five conjuncts from `_target_decode_control_true_qstart` (L-3.11′) and `_workspace_control_true_qstart` (L-3.12′). No new arithmetic.
theoremsqir_style_controlledModAddConst_candidate_clean
theorem sqir_style_controlledModAddConst_candidate_clean
    (bits N c x controlIdx : Nat) (control : Bool) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1)
    (h_control_workspace_lt : controlIdx < sqir_modmult_rev_anc bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
            (update (cuccaro_input_F 2 false 0 x) controlIdx control))
**Deliverable F: controlled candidate clean bundle for c > 0.**
theoremsqir_style_controlledModAddConst_gate_zero_eq_qstart
theorem sqir_style_controlledModAddConst_gate_zero_eq_qstart
    (bits q_start N controlIdx flagPos : Nat) :
    sqir_style_controlledModAddConst_gate bits q_start N 0 controlIdx flagPos = Gate.I
**R7d^xxix-L-3.14′ helper: q_start-parametric `c = 0` reduction.** The total wrapper at `c = 0` is the identity gate, regardless of `q_start`/`flagPos`/`controlIdx`.
theoremsqir_style_controlledModAddConst_gate_zero_clean_qstart
theorem sqir_style_controlledModAddConst_gate_zero_clean_qstart
    (bits q_start N x dim controlIdx flagPos : Nat) (control : Bool)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim) :
    Gate.WellTyped dim
        (sqir_style_controlledModAddConst_gate bits q_start N 0 controlIdx flagPos)
    ∧ cuccaro_target_val bits q_start
**R7d^xxix-L-3.14′ helper: q_start-parametric `c = 0` clean bundle.** When `c = 0` the total wrapper is `Gate.I`, so all six conjuncts reduce to facts about the input state. Mirrors `sqir_style_controlledModAddConst_gate_zero_clean` (line 2150) with general `q_start`, `flagPos`, and free `dim`. Uses `cuccaro_input_F_at_outside_eq_false` for the flag conjunct instead of the hard-coded `unfold + if_pos`.
theoremsqir_style_controlledModAddConst_gate_clean_qstart
theorem sqir_style_controlledModAddConst_gate_clean_qstart
    (bits q_start N c x dim controlIdx flagPos : Nat) (control : Bool)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_control_lt_dim : controlIdx < dim)
    (h_flag_lt_dim : flagPos < dim) :
    Gate.WellTyped dim
**R7d^xxix-L-3.14′ DELIVERABLE: q_start-parametric total wrapper clean theorem.** Combines the `c = 0` case (delegated to `_gate_zero_clean_qstart` above, with the target conjunct re-massaged to match the headline's `if control then (x+c)%N else x` shape at `c = 0`) and the `c > 0` case (delegated to the L-3.13′ `_candidate_clean_qstart`). Mirrors `sqir_style_controlledModAddConst_gate_clean` (line 3871) with general `q_start`, `flagPos`, `controlIdx`, and free `dim`.
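The case split on `c` can be pictured with a small reference model of the target-register specification. The sketch below is standalone Lean and does not use any FormalRV definitions; `ctrlModAddSpec` is a hypothetical name introduced only for illustration, and the concrete values (`N = 7`, `c = 4`, `x = 5`) are arbitrary.

```lean
-- Hypothetical reference model of the target conjunct (not a FormalRV definition).
def ctrlModAddSpec (N c x : Nat) (control : Bool) : Nat :=
  if control then (x + c) % N else x

-- control = true: the constant is added modulo N.
example : ctrlModAddSpec 7 4 5 true = 2 := by decide

-- control = false: the target is left unchanged.
example : ctrlModAddSpec 7 4 5 false = 5 := by decide

-- At c = 0 both branches collapse to x, which is what an identity gate produces.
example (N x : Nat) (hx : x < N) (control : Bool) :
    ctrlModAddSpec N 0 x control = x := by
  cases control <;> simp [ctrlModAddSpec, Nat.mod_eq_of_lt hx]
```

The last example is the toy analogue of re-massaging the target conjunct at `c = 0`: with `x < N`, `(x + 0) % N` and `x` coincide, so the `if` expression no longer depends on `control`.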
theoremsqir_style_controlledModAddConst_gate_clean
theorem sqir_style_controlledModAddConst_gate_clean
    (bits N c x controlIdx : Nat) (control : Bool) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1)
    (h_control_workspace_lt : controlIdx < sqir_modmult_rev_anc bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_controlledModAddConst_gate bits 2 N c controlIdx 1)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c controlIdx 1)
            (update (cuccaro_input_F 2 false 0 x) controlIdx control))
**HEADLINE Deliverable G: total wrapper clean theorem.**
theoremsqir_style_controlledModAddConst_gate_clean_from_BasicSetting
theorem sqir_style_controlledModAddConst_gate_clean_from_BasicSetting
    (a r N m n c x controlIdx : Nat) (control : Bool)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n)
    (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * (n + 1) + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1)
    (h_control_workspace_lt : controlIdx < sqir_modmult_rev_anc (n + 1)) :
    Gate.WellTyped (sqir_modmult_rev_anc (n + 1))
        (sqir_style_controlledModAddConst_gate (n + 1) 2 N c controlIdx 1)
    ∧ cuccaro_target_val (n + 1) 2
          (Gate.applyNat (sqir_style_controlledModAddConst_gate (n + 1) 2 N c controlIdx 1)
            (update (cuccaro_input_F 2 false 0 x) controlIdx control))
**HEADLINE Deliverable H: BasicSetting-derived total wrapper clean theorem.**

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRDirtyFlag.CuccaroDirtyFlagStageCorrectness

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRDirtyFlag/CuccaroDirtyFlagStageCorrectness.lean
## Deliverable A — Arithmetic theorem for dirty-flag modular reduction.
theoremsqir_dirty_modadd_arith
theorem sqir_dirty_modadd_arith
    (bits N x c : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N) :
    (x + c + (if decide (N ≤ x + c) then 2^bits - N else 0)) % 2^bits
      = (x + c) % N
**HEADLINE Deliverable A: dirty-flag modular reduction arithmetic.** For `x, c < N` and `2*N ≤ 2^bits`, `(x + c + (if decide (N ≤ x+c) then 2^bits - N else 0)) % 2^bits = (x + c) % N`.
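A quick way to read this identity: when `N ≤ x + c`, adding `2^bits - N` yields `2^bits + (x + c - N)`, which wraps modulo `2^bits` to `x + c - N`; otherwise nothing is added and `x + c < N` is already reduced. The following standalone Lean checks illustrate both branches on concrete values chosen here for illustration (`bits = 4`, `N = 7`); they are sanity checks, not a substitute for the general theorem.

```lean
-- bits = 4, N = 7, so 2 * N = 14 ≤ 16 = 2^4.

-- Overflow branch: x = 5, c = 4, x + c = 9 ≥ 7.
-- 9 + (16 - 7) = 18, and 18 % 16 = 2 = 9 % 7.
example :
    (5 + 4 + (if decide (7 ≤ 5 + 4) then (2 : Nat) ^ 4 - 7 else 0)) % (2 : Nat) ^ 4
      = (5 + 4) % 7 := by decide

-- No-overflow branch: x = 2, c = 3, x + c = 5 < 7, so nothing is added.
example :
    (2 + 3 + (if decide (7 ≤ 2 + 3) then (2 : Nat) ^ 4 - 7 else 0)) % (2 : Nat) ^ 4
      = (2 + 3) % 7 := by decide
```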
theoremcuccaro_addConstGate_output_eq_cuccaro_input_F
theorem cuccaro_addConstGate_output_eq_cuccaro_input_F
    (bits q_start c x : Nat) (hbits : 1 ≤ bits)
    (hc : c < 2^bits) (hx : x < 2^bits) (h_sum : x + c < 2^bits) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c)
        (cuccaro_input_F q_start false 0 x)
      = cuccaro_input_F q_start false 0 (x + c)
**HEADLINE: post-addConst function equality.** Applying `cuccaro_addConstGate bits q_start c` to `cuccaro_input_F q_start false 0 x` gives `cuccaro_input_F q_start false 0 (x+c)` as a function, provided `x + c < 2^bits`.
theoremsqir_dirty_modadd_after_add_state_eq
theorem sqir_dirty_modadd_after_add_state_eq
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_addConstGate bits q_start c)
        (update (cuccaro_input_F q_start false 0 x) flagPos false)
      = update (cuccaro_input_F q_start false 0 (x + c)) flagPos false
**HEADLINE Deliverable A: post-add state with external flag.**
theoremsqir_style_compareConst_candidate_frame_outside
theorem sqir_style_compareConst_candidate_frame_outside
    (bits q_start N flagPos : Nat) (f : Nat → Bool)
    (q : Nat) (h_q_ne_flagPos : q ≠ flagPos)
    (h_q_outside : q < q_start ∨ q_start + 2 * bits + 1 ≤ q) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos) f q = f q
**`sqir_style_compareConst_candidate` frame at positions outside workspace ∪ {flagPos}.** Layer order after `simp [applyNat_seq]` (outermost first): prepare₂ → maj_inv → CX → maj → prepare₁. We strip from the outside in.
theoremsqir_dirty_modadd_after_compare_state_eq
theorem sqir_dirty_modadd_after_compare_state_eq
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (update (cuccaro_input_F q_start false 0 (x + c)) flagPos false)
      = update (cuccaro_input_F q_start false 0 (x + c)) flagPos (decide (N ≤ x + c))
**HEADLINE Deliverable B: post-compare state with external flag.**
theoremsqir_style_modAddConst_dirtyFlag_target_decode
theorem sqir_style_modAddConst_dirtyFlag_target_decode
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat
          (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos false))
      = (x + c) % N
**HEADLINE Deliverable D: dirty-flag mod-N target decode.**
theoremsqir_style_modAddConst_dirtyFlag_read_decode
theorem sqir_style_modAddConst_dirtyFlag_read_decode
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_read_val bits q_start
        (Gate.applyNat
          (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos false))
      = 0
**HEADLINE Deliverable A: read register restored after the dirty-flag candidate.**
theoremsqir_style_modAddConst_dirtyFlag_carry_in_restored
theorem sqir_style_modAddConst_dirtyFlag_carry_in_restored
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false) q_start
      = false
**HEADLINE Deliverable A (continued): carry-in restored.**
theoremsqir_style_modAddConst_dirtyFlag_flag_value
theorem sqir_style_modAddConst_dirtyFlag_flag_value
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false) flagPos
      = decide (N ≤ x + c)
**HEADLINE Deliverable A (continued): flag holds `decide (N ≤ x + c)`.** The flag is DIRTY: it stores the comparison result, not the input `false`. Naming the field `dirtyFlag` is mandatory; do not advertise this as clean modular addition.
theoremsqir_style_modAddConst_dirtyFlag_candidate_wellTyped_sqir_dim
theorem sqir_style_modAddConst_dirtyFlag_candidate_wellTyped_sqir_dim
    (bits q_start N c flagPos : Nat) (hbits : 1 ≤ bits)
    (h_workspace : q_start + 2 * bits + 1 ≤ sqir_modmult_rev_anc bits)
    (h_flag : flagPos < sqir_modmult_rev_anc bits)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_distinct_top : flagPos ≠ q_start + 2 * bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
**HEADLINE Deliverable B: WellTyped at the SQIR-faithful dimension `sqir_modmult_rev_anc bits = 3 * bits + 11`.**
theoremsqir_style_modAddConst_dirtyFlag_clean_except_flag
theorem sqir_style_modAddConst_dirtyFlag_clean_except_flag
    (bits q_start N c x flagPos dim : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.WellTyped dim
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
    ∧ cuccaro_target_val bits q_start
**HEADLINE Deliverable C: packaged dirty-flag mod-N add bundle.** Provides WellTyped, target decode, read restored, carry restored, and the dirty flag value, all under the dirty-flag precondition set (`2*N ≤ 2^bits`, `x < N`, `c < N`, `flagPos` above workspace).
theoremsqir_style_modAddConst_dirtyFlag_candidate_wellTyped_sqir_layout
theorem sqir_style_modAddConst_dirtyFlag_candidate_wellTyped_sqir_layout
    (bits N c : Nat) (hbits : 1 ≤ bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_modAddConst_dirtyFlag_candidate bits 2 N c 1)
**Deliverable D (partial): WellTyped at the exact SQIR layout `q_start = 2, flagPos = 1, dim = sqir_modmult_rev_anc bits`.**
theoremBasicSetting_twoN_le_pow_succ
theorem BasicSetting_twoN_le_pow_succ
    (a r N m n : Nat)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n) :
    2 * N ≤ 2 ^ (n + 1)
**HEADLINE Deliverable E, sizing relation: `BasicSetting a r N m n` implies `2 * N ≤ 2^(n + 1)`.** Reading: Shor's data register width is `n` bits; the dirty-flag modular adder must be instantiated at `bits := n + 1` (one extra bit) so that intermediate `x + c` cannot overflow before the comparator sees the top carry. This matches SQIR's `n + 1`-bit workspace per modular addition.
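The arithmetic core of the sizing relation can be sketched in isolation. The snippet below assumes that the only fact taken from `BasicSetting` is a bound of the form `N < 2^n` (as in SQIR's definition of `BasicSetting`); the actual FormalRV definition is not shown on this page, so treat that hypothesis as an assumption rather than the library's proof.

```lean
-- Assumption: the sizing hypothesis supplies N < 2 ^ n.
example (N n : Nat) (h : N < 2 ^ n) : 2 * N ≤ 2 ^ (n + 1) := by
  rw [Nat.pow_succ]  -- rewrites 2 ^ (n + 1) to 2 ^ n * 2
  omega              -- the goal is linear once 2 ^ n is treated as an atom
```

With `bits := n + 1` this is exactly the `hN2 : 2 * N ≤ 2^bits` hypothesis required by the dirty-flag theorems above.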
theoremcuccaro_input_F_at_outside_eq_false
theorem cuccaro_input_F_at_outside_eq_false
    (q_start bits x flagPos : Nat)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hx : x < 2^bits) :
    cuccaro_input_F q_start false 0 x flagPos = false
Helper: `cuccaro_input_F q_start false 0 x` evaluates to `false` at any position outside the workspace `[q_start, q_start + 2*bits]`.
theoremsqir_style_compareConst_candidate_flag_general
theorem sqir_style_compareConst_candidate_flag_general
    (bits q_start N x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x) flagPos
      = decide (N ≤ x)
**Generalized flag-copy theorem.** For any `flagPos` outside the workspace (below OR above), the SQIR-style comparator candidate outputs `decide (N ≤ x)` at `flagPos`.
theoremsqir_style_compareConst_candidate_workspace_restored_at_general
theorem sqir_style_compareConst_candidate_workspace_restored_at_general
    (bits q_start N flagPos : Nat) (f : Nat → Bool)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (q : Nat) (hq_lower : q_start ≤ q) (hq_upper : q < q_start + 2 * bits + 1) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos) f q
      = f q
**Generalized workspace restoration (at-position).** At any workspace position, the SQIR-style comparator candidate restores the input value, for any `flagPos` outside the workspace.
theoremsqir_dirty_modadd_after_compare_state_eq_general
theorem sqir_dirty_modadd_after_compare_state_eq_general
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (update (cuccaro_input_F q_start false 0 (x + c)) flagPos false)
      = update (cuccaro_input_F q_start false 0 (x + c))
              flagPos (decide (N ≤ x + c))
**Generalized compare-state equality** (Tick 59 Deliverable B, relaxed to `hflag_out`).
theoremsqir_style_modAddConst_dirtyFlag_target_decode_general
theorem sqir_style_modAddConst_dirtyFlag_target_decode_general
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat
          (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos false))
      = (x + c) % N
**Generalized dirty-flag mod-N add target decode** (Tick 59 D, relaxed to `hflag_out`).
theoremsqir_style_modAddConst_dirtyFlag_read_decode_general
theorem sqir_style_modAddConst_dirtyFlag_read_decode_general
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_read_val bits q_start
        (Gate.applyNat
          (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
          (update (cuccaro_input_F q_start false 0 x) flagPos false))
      = 0
**Generalized workspace conjuncts** (Tick 60 A, relaxed to `hflag_out`).
theoremsqir_style_modAddConst_dirtyFlag_carry_in_restored_general
theorem sqir_style_modAddConst_dirtyFlag_carry_in_restored_general
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false) q_start
      = false
theoremsqir_style_modAddConst_dirtyFlag_flag_value_general
theorem sqir_style_modAddConst_dirtyFlag_flag_value_general
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false) flagPos
      = decide (N ≤ x + c)
theoremsqir_style_modAddConst_dirtyFlag_clean_except_flag_general
theorem sqir_style_modAddConst_dirtyFlag_clean_except_flag_general
    (bits q_start N c x flagPos dim : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : ∀ i, i < bits → flagPos ≠ q_start + 2 * i + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.WellTyped dim
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
    ∧ cuccaro_target_val bits q_start
**Generalized clean-except-flag bundle** (Tick 60 C, relaxed to `hflag_out`). Supports both above- AND below-workspace flag.
theoremsqir_style_compareConst_candidate_flag_sqir_layout
theorem sqir_style_compareConst_candidate_flag_sqir_layout
    (bits N x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.applyNat (sqir_style_compareConst_candidate bits 2 N 1)
        (cuccaro_input_F 2 false 0 x) 1
      = decide (N ≤ x)
**Deliverable A: SQIR-layout comparator flag-copy.**
theoremsqir_style_compareConst_candidate_clean_sqir_layout
theorem sqir_style_compareConst_candidate_clean_sqir_layout
    (bits N x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_compareConst_candidate bits 2 N 1)
    ∧ Gate.applyNat (sqir_style_compareConst_candidate bits 2 N 1)
          (cuccaro_input_F 2 false 0 x) 1
        = decide (N ≤ x)
    ∧ (∀ i, i < bits →
        Gate.applyNat (sqir_style_compareConst_candidate bits 2 N 1)
          (cuccaro_input_F 2 false 0 x) (2 + 2 * i + 2)
          = false)
**Deliverable B: SQIR-layout clean comparator bundle.**
theoremsqir_style_modAddConst_dirtyFlag_target_decode_sqir_layout
theorem sqir_style_modAddConst_dirtyFlag_target_decode_sqir_layout
    (bits N c x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N) :
    cuccaro_target_val bits 2
        (Gate.applyNat
          (sqir_style_modAddConst_dirtyFlag_candidate bits 2 N c 1)
          (update (cuccaro_input_F 2 false 0 x) 1 false))
      = (x + c) % N
**Deliverable C: SQIR-layout dirty-flag mod-N add target decode.**
theoremsqir_style_modAddConst_dirtyFlag_clean_except_flag_sqir_layout
theorem sqir_style_modAddConst_dirtyFlag_clean_except_flag_sqir_layout
    (bits N c x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_modAddConst_dirtyFlag_candidate bits 2 N c 1)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat
            (sqir_style_modAddConst_dirtyFlag_candidate bits 2 N c 1)
            (update (cuccaro_input_F 2 false 0 x) 1 false))
        = (x + c) % N
    ∧ cuccaro_read_val bits 2
**Deliverable D: SQIR-layout dirty-flag mod-N add clean-except-flag bundle.**
theoremsqir_style_modAddConst_dirtyFlag_clean_except_flag_from_BasicSetting
theorem sqir_style_modAddConst_dirtyFlag_clean_except_flag_from_BasicSetting
    (a r N m n c x : Nat)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n)
    (hx : x < N) (hc : c < N) :
    Gate.WellTyped (sqir_modmult_rev_anc (n + 1))
        (sqir_style_modAddConst_dirtyFlag_candidate (n + 1) 2 N c 1)
    ∧ cuccaro_target_val (n + 1) 2
          (Gate.applyNat
            (sqir_style_modAddConst_dirtyFlag_candidate (n + 1) 2 N c 1)
            (update (cuccaro_input_F 2 false 0 x) 1 false))
        = (x + c) % N
    ∧ cuccaro_read_val (n + 1) 2
**Deliverable E: BasicSetting-based SQIR-layout corollary.** Combines the SQIR-layout bundle with the sizing relation from `BasicSetting`. Instantiates `bits := n + 1` as the canonical workspace width per `BasicSetting_twoN_le_pow_succ`.
lemmaprepareMaj_at_top_eq_after_update
lemma prepareMaj_at_top_eq_after_update
    (bits q_start N x flagPos : Nat) (flag : Bool)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (cuccaro_maj_chain bits q_start)
        (Gate.applyNat (cuccaro_prepareConstRead bits q_start (2^bits - N))
          (update (cuccaro_input_F q_start false 0 x) flagPos flag))
        (q_start + 2 * bits)
      = decide (N ≤ x)
Helper for the XOR flag theorem: the inner `(prepare; maj)` block at `q_start + 2*bits` (top carry) equals `decide (N ≤ x)` even when the input has an outside `update` at `flagPos`.
theoremsqir_style_compareConst_candidate_flag_xor
theorem sqir_style_compareConst_candidate_flag_xor
    (bits q_start N x flagPos : Nat) (flag : Bool)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos flag) flagPos
      = xor flag (decide (N ≤ x))
**HEADLINE Task 1: comparator flag-XOR semantics.** For any initial flag value `flag`, the SQIR-style comparator at `flagPos` returns `flag XOR decide (N ≤ x)`. This is the key polarity result needed for any flag-uncomputation construction.
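Why XOR semantics matter for uncomputation can be seen from two Boolean facts: XOR into a clean flag copies the comparison bit, and XOR-ing the same bit a second time restores the original flag. The sketch below is standalone Lean; `bxor` is a hypothetical stand-in defined here for self-containment, not the `xor` used in the theorem statement.

```lean
-- Hypothetical stand-in for Boolean XOR, defined by cases on the first argument.
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

-- Starting from a clean flag, one comparator pass copies the comparison bit.
example (b : Bool) : bxor false b = b := by
  cases b <;> rfl

-- A second pass against the same comparison bit restores the flag.
example (flag b : Bool) : bxor (bxor flag b) b = flag := by
  cases flag <;> cases b <;> rfl
```

In a flag-uncomputation construction the second pass generally sees a different register value, so the comparison bit has to be related back to the first one by a separate arithmetic argument.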
theoremsqir_style_compareConst_candidate_flag_xor_sqir_layout
theorem sqir_style_compareConst_candidate_flag_xor_sqir_layout
    (bits N x : Nat) (flag : Bool) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.applyNat (sqir_style_compareConst_candidate bits 2 N 1)
        (update (cuccaro_input_F 2 false 0 x) 1 flag) 1
      = xor flag (decide (N ≤ x))
**SQIR-layout corollary of Task 1.**
theoremdecide_c_le_xc_mod_N_eq_not_decide_N_le_xc
theorem decide_c_le_xc_mod_N_eq_not_decide_N_le_xc
    (N x c : Nat) (hN_pos : 0 < N) (hc_pos : 0 < c)
    (hx : x < N) (hc : c < N) :
    decide (c ≤ (x + c) % N) = ! decide (N ≤ x + c)
**HEADLINE: arithmetic identity for clean candidate.** For `0 < c`, `x < N`, `c < N`, the comparator's result on the reduced target `(x+c) % N` is precisely the negation of the dirty flag.
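The intuition is that a wraparound leaves `(x + c) % N = x + c - N`, which is below `c` because `x < N`, while without wraparound `(x + c) % N = x + c` is at least `c`. The standalone Lean checks below illustrate both cases on values chosen here for illustration (`N = 7`, `c = 4`); they are concrete instances only.

```lean
-- N = 7, c = 4.

-- x = 5: x + c = 9 wraps to 2, so `c ≤ 2` fails while `N ≤ 9` holds.
example : decide (4 ≤ (5 + 4) % 7) = (!decide (7 ≤ 5 + 4)) := by decide

-- x = 2: x + c = 6 does not wrap, so `c ≤ 6` holds while `N ≤ 6` fails.
example : decide (4 ≤ (2 + 4) % 7) = (!decide (7 ≤ 2 + 4)) := by decide
```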
lemmacuccaro_prepareConstRead_zero_eq_id_fun
lemma cuccaro_prepareConstRead_zero_eq_id_fun
    (bits q_start : Nat) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_prepareConstRead bits q_start 0) f = f
Helper: `prepare(0)` is identity.
lemmacuccaro_addConstGate_zero_eq_full_adder_fun
lemma cuccaro_addConstGate_zero_eq_full_adder_fun
    (bits q_start : Nat) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_addConstGate bits q_start 0) f
      = Gate.applyNat (cuccaro_n_bit_adder_full bits q_start) f
Helper: on any input, `addConstGate(0)` agrees with the full adder.
theoremsqir_style_modAddConst_dirtyFlag_target_bit
theorem sqir_style_modAddConst_dirtyFlag_target_bit
    (bits q_start N c x flagPos i : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N) (hi : i < bits)
    (h_flag_distinct : ∀ j, j < bits → flagPos ≠ q_start + 2 * j + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false)
        (q_start + 2 * i + 1)
      = ((x + c) % N).testBit i
**HEADLINE Deliverable A: dirty-flag target bit theorem.** At each target position `q_start + 2*i + 1` for `i < bits`, the dirty-flag candidate's output bit equals `((x + c) % N).testBit i`.
theoremsqir_style_modAddConst_dirtyFlag_read_bit
theorem sqir_style_modAddConst_dirtyFlag_read_bit
    (bits q_start N c x flagPos i : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N) (hi : i < bits)
    (h_flag_distinct : ∀ j, j < bits → flagPos ≠ q_start + 2 * j + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false)
        (q_start + 2 * i + 2)
      = false
**HEADLINE Deliverable B: dirty-flag read bit theorem.** At each read position `q_start + 2*i + 2` for `i < bits`, the dirty-flag candidate's output bit is `false`.
theoremsqir_style_modAddConst_dirtyFlag_frame_outside
theorem sqir_style_modAddConst_dirtyFlag_frame_outside
    (bits q_start N c flagPos : Nat) (f : Nat → Bool)
    (h_flag_distinct : ∀ j, j < bits → flagPos ≠ q_start + 2 * j + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (q : Nat) (h_q_ne_flagPos : q ≠ flagPos)
    (h_q_outside : q < q_start ∨ q_start + 2 * bits + 1 ≤ q) :
    Gate.applyNat (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos) f q
      = f q
Frame: dirty-flag candidate preserves values at positions outside workspace ∪ {flagPos}.
theoremsqir_style_modAddConst_dirtyFlag_state_eq
theorem sqir_style_modAddConst_dirtyFlag_state_eq
    (bits q_start N c x flagPos : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hx : x < N) (hc : c < N)
    (h_flag_distinct : ∀ j, j < bits → flagPos ≠ q_start + 2 * j + 2)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat
        (sqir_style_modAddConst_dirtyFlag_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false)
      = update (cuccaro_input_F q_start false 0 ((x + c) % N))
              flagPos (decide (N ≤ x + c))
**HEADLINE Deliverable C: dirty-flag state equality.** As a function, the post-dirty-flag state equals `update (cuccaro_input_F q_start false 0 ((x+c) % N)) flagPos (decide (N ≤ x+c))`.
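At the level of natural numbers, the state equality says that the target ends up holding `(x + c) % N` and the flag records whether a reduction happened. A minimal sketch, assuming a hypothetical helper `dirtySpec` that is not part of the library and ignores the qubit layout entirely:

```lean
/-- Hypothetical Nat-level summary of the dirty-flag stage: (target, flag). -/
def dirtySpec (N c x : Nat) : Nat × Bool :=
  ((x + c) % N, decide (N ≤ x + c))

-- N = 7, c = 3, x = 5: the sum 8 wraps to 1 and the flag is raised.
example : dirtySpec 7 3 5 = (1, true) := by decide

-- N = 7, c = 3, x = 2: no wrap, the sum 5 is kept and the flag stays low.
example : dirtySpec 7 3 2 = (5, false) := by decide
```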

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRDirtyFlag.CuccaroModularAddDefinitions

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRDirtyFlag/CuccaroModularAddDefinitions.lean
## Tick 62 — Clean modular-add candidate definition.
defsqir_style_modAddConst_clean_candidate
def sqir_style_modAddConst_clean_candidate
    (bits q_start N c flagPos : Nat) : Gate
**Clean modular add-constant candidate** for `0 < c < N`.
Structure: dirty-flag candidate ; compareConst(c) ; X(flagPos).
The compareConst(c) XORs `decide(c ≤ (x+c) % N) = ¬decide(N ≤ x+c)` into the flag, then X negates it. Net flag effect: `flag → ¬(flag XOR decide(N ≤ x+c) XOR ¬decide(N ≤ x+c)) = ¬(flag XOR true) = flag`, so the flag is restored. The cleanup also re-touches the target / read / carry workspace, but by the comparator's workspace_restored property these end up at the same values as after the dirty-flag stage.
**Caveat on `c = 0`:** `compareConst(0)` cannot be implemented in `bits` bits because `K = 2^bits` overflows the read register. For `c = 0` the modular add is the identity and the dirty flag is already `false`; the clean candidate is correct only for `0 < c`. A wrapper that dispatches `c = 0` to identity is straightforward but introduces a conditional gate structure (deferred).
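The flag-restoration argument is a two-variable Boolean identity, so it can be checked by exhausting the four cases. In this sketch `d` stands for `decide (N ≤ x + c)`, and exclusive or is written with `!=`, which on `Bool` coincides with XOR; both choices are notational assumptions made for the example.

```lean
-- Dirty stage:  flag ↦ flag != d
-- Comparator:   XOR in `!d`
-- Final X gate: negate the result
example : ∀ flag d : Bool, (!((flag != d) != (!d))) = flag := by
  intro flag d
  cases flag <;> cases d <;> rfl
```

The identity holds for every starting value of the flag, which is why the cleanup does not need the flag to start at `false`.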
defsqir_style_modAddConst_clean_gate
def sqir_style_modAddConst_clean_gate (bits N c : Nat) : Gate
**Deliverable A: total clean modular add-constant gate.** Wraps the clean candidate (which requires `0 < c`) so that the `c = 0` case dispatches to the identity gate. This is the official clean mod-add-constant primitive at the SQIR-faithful layout `q_start = 2, flagPos = 1, dim = sqir_modmult_rev_anc bits`.
defsqir_controlledCompareConst
def sqir_controlledCompareConst
    (bits q_start c controlIdx flagPos : Nat) : Gate
**Controlled compareConst**: masked-prepare variant of `sqir_style_compareConst_candidate`. When the control qubit at `controlIdx` is `false`, the gate is the identity at every position; when it is `true`, the gate is equivalent to `sqir_style_compareConst_candidate bits q_start c flagPos`.
defsqir_style_controlledModAddConst_candidate
def sqir_style_controlledModAddConst_candidate
    (bits q_start N c controlIdx flagPos : Nat) : Gate
**Controlled SQIR-style mod-N add-constant candidate** for `0 < c`.
defsqir_style_controlledModAddConst_gate
def sqir_style_controlledModAddConst_gate
    (bits q_start N c controlIdx flagPos : Nat) : Gate
**Total controlled SQIR mod-N add-constant** wrapper handling `c = 0`.

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRModAdd

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRModAdd.lean
FormalRV.BQAlgo.CuccaroSQIRModAdd: SQIR-style modular add-constant SKELETON.
Tick 53: build the first SQIR-style modular adder skeleton.
Background: SQIR's `modadder21` (ModMult.v lines 134-137) is register-to-register modular addition `[M][x][y] → [M][(x+y) % M][y]`. For our Lean development targeting Shor's modular multiplier (which multiplies by a CLASSICAL constant), we adapt this to a register-to-CONSTANT modular addition: `target ← (target + c) mod N`.
The SQIR sequence for register-to-register:
    swapper02 ; adder01 ; swapper02 ;        -- target ← (target + y) mod 2^n
    comparator01 ;                           -- flag ← decide (M ≤ target)
    bygatectrl 1 (subtractor01) ; bcx 1 ;    -- conditional sub of M, flip flag
    swapper02 ; bcinv (comparator01) ;       -- uncompute flag (swap to undo logic)
    swapper02.
Our adapted sequence for register-to-constant (with the constant c and modulus N):
    cuccaro_addConstGate c ;                 -- target ← (target + c) mod 2^bits
    sqir_style_compareConst_candidate N ;    -- flag ← decide (N ≤ target)
    [conditional sub of N] ;                 -- target ← target - N if flag = 1
    [flag uncompute].
This file lands the SKELETON `addConst c ; compareConst N` (Tick 53, Deliverable 6 fallback per directive). The conditional-sub + flag-uncompute steps are deferred to Tick 54+.
Reason for split: the conditional subtract requires either a controlled-CCX (not in our IR), or a manual controlled re-encoding of the subtractor. Both are substantial work and deserve their own tick.
This tick proves:
- WellTyped.
- Flag = `decide (N ≤ x + c)` (after the skeleton).
- Target decode = `(x + c) % 2^bits`.
- Read register restored to 0.
- Carry-in qubit restored to 0.
Together these properties characterize the skeleton's behavior precisely.
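The two observable outputs of the skeleton listed above can be summarized by a pair of natural-number formulas. A minimal sketch, assuming a hypothetical helper `skeletonSpec` that is not in the library; both examples keep `x + c` below `2^bits` so that no wraparound is involved:

```lean
/-- Hypothetical Nat-level summary of the skeleton: (target, flag). -/
def skeletonSpec (bits N c x : Nat) : Nat × Bool :=
  ((x + c) % 2 ^ bits, decide (N ≤ x + c))

-- bits = 3, N = 5, c = 4, x = 3: the sum 7 fits in 3 bits and N ≤ 7,
-- so the flag is set.
example : skeletonSpec 3 5 4 3 = (7, true) := by decide

-- bits = 3, N = 5, c = 1, x = 2: the sum 3 is below N, so the flag stays clear.
example : skeletonSpec 3 5 1 2 = (3, false) := by decide
```

In the first example the target still holds `7`, not `7 % 5`; producing the reduced value is the job of the deferred conditional-subtract step.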
defsqir_style_modAddConst_skeleton
def sqir_style_modAddConst_skeleton
    (bits q_start N c flagPos : Nat) : Gate
**Skeleton modular add-constant** (Tick 53). Composes the clean add-const and clean compare-const primitives. The result has the target register holding `(x + c) mod 2^bits` and the external flag at `flagPos` holding `decide (N ≤ x + c)`.
theoremsqir_style_modAddConst_skeleton_wellTyped
theorem sqir_style_modAddConst_skeleton_wellTyped
    (bits q_start N c flagPos dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : flagPos ≠ q_start + 2 * bits) :
    Gate.WellTyped dim
        (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
theoremsqir_style_modAddConst_skeleton_target_bit
theorem sqir_style_modAddConst_skeleton_target_bit
    (bits q_start N c x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    ∀ i, i < bits →
      Gate.applyNat (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * i + 1)
      = (x + c).testBit i
**Target bit after the skeleton.** At each target position `q_start + 2*i + 1` for `i < bits`, the output equals `(x+c).testBit i`.
theoremsqir_style_modAddConst_skeleton_read_bit
theorem sqir_style_modAddConst_skeleton_read_bit
    (bits q_start N c x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    ∀ i, i < bits →
      Gate.applyNat (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * i + 2)
      = false
**Read bit after the skeleton.** At each read position `q_start + 2*i + 2` for `i < bits`, the output equals `false`.
theoremsqir_style_modAddConst_skeleton_carry_in_bit
theorem sqir_style_modAddConst_skeleton_carry_in_bit
    (bits q_start N c x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
        (cuccaro_input_F q_start false 0 x) q_start = false
**Carry-in qubit after the skeleton.** At position `q_start`, the output equals `false`.
theoremsqir_style_modAddConst_skeleton_target_decode
theorem sqir_style_modAddConst_skeleton_target_decode
    (bits q_start N c x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
          (cuccaro_input_F q_start false 0 x))
      = (x + c) % 2^bits
**HEADLINE: decoded target correctness.** After the skeleton, the target register decodes to `(x + c) % 2^bits`.
theoremsqir_style_modAddConst_skeleton_read_decode
theorem sqir_style_modAddConst_skeleton_read_decode
    (bits q_start N c x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_read_val bits q_start
        (Gate.applyNat (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
          (cuccaro_input_F q_start false 0 x))
      = 0
**Decoded read restoration.**
theoremsqir_style_modAddConst_skeleton_clean
theorem sqir_style_modAddConst_skeleton_clean
    (bits q_start N c x flagPos dim : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_distinct : flagPos ≠ q_start + 2 * bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.WellTyped dim
        (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
    ∧ cuccaro_target_val bits q_start
          (Gate.applyNat (sqir_style_modAddConst_skeleton bits q_start N c flagPos)
            (cuccaro_input_F q_start false 0 x))
**HEADLINE: packaged skeleton primitive.** Bundles WellTyped + target decode + read restored + carry-in restored. The flag-behavior theorem is DEFERRED to Tick 54 (requires the input-state equivalence argument and is needed for the controlled-sub-N step).

FormalRV.Arithmetic.Cuccaro.CuccaroSQIRStyle

FormalRV/Arithmetic/Cuccaro/CuccaroSQIRStyle.lean
FormalRV.BQAlgo.CuccaroSQIRStyle: SQIR-style compute-CNOT-uncompute comparator candidate.
Tick 49 / Recovery of SQIR/RCIR exact-budget construction.
CRITICAL DISCOVERY (Tick 49 source inspection of `SQIR/examples/shor/ModMult.v`):
- **SQIR's actual `modmult_rev_anc n = 3 * n + 11`** (line 72 of `ModMult.v`), NOT `2 * n + 1` as the Lean placeholder in `SQIRPort/Shor.lean:4563` claims.
- The Lean comment at line 4560-4562 incorrectly says "the specific RCIR implementation in Coq uses a similar linear-in-n count". The actual SQIR value is `3n + 11`, with `3` non-overlapping n-bit registers + 2 designated flag bits at positions 0 and 1 + additional scratch.
Consequence: the "exact-budget" framing in Ticks 41-48 was based on a too-tight Lean placeholder. The real SQIR budget gives substantial room for a designated flag qubit.
SQIR's comparator01 (ModMult.v line 121-124):
```
comparator01 n := (bcx 0; negator0 n); highb01 n; bcinv (bcx 0; negator0 n).
highb01 n := MAJseq n; bccnot (1 + n) 1; bcinv (MAJseq n).
```
- Position 1 is the designated FLAG bit.
- `bccnot (1 + n) 1`: CNOT from the top carry (at position `1 + n`) to the flag (position 1).
- The compute-CNOT-uncompute pattern: MAJseq forward, copy carry to flag, MAJseq reverse.
This file ports the compute-CNOT-uncompute structure to our Cuccaro Gate IR. We:
- Define `cuccaro_MAJ_inv` (the gate-level inverse of MAJ).
- Define `cuccaro_maj_chain_inv` (the chain-level inverse).
- Prove the local MAJ inverse identity: `MAJ ; MAJ_inv = id`.
- Define `sqir_style_compareConst_candidate` matching SQIR's pattern, parameterized over an explicit flag qubit position `flagPos`.
- Prove WellTyped (assuming `flagPos < dim`).
defcuccaro_MAJ_inv
def cuccaro_MAJ_inv (a b c : Nat) : Gate
**Inverse of the Cuccaro MAJ gate.** Since each component gate (CX, CCX) is self-inverse, the inverse is the reversed sequence.
defcuccaro_maj_chain_inv
def cuccaro_maj_chain_inv : Nat → Nat → Gate
  | 0,     _       => I
  | n + 1, q_start =>
      seq (cuccaro_maj_chain_inv n (q_start + 2))
          (cuccaro_MAJ_inv q_start (q_start + 1) (q_start + 2))
**Inverse of the n-step Cuccaro MAJ chain.**
theoremcuccaro_MAJ_inv_wellTyped
theorem cuccaro_MAJ_inv_wellTyped
    (dim a b c : Nat) (ha : a < dim) (hb : b < dim) (hc : c < dim)
    (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) :
    Gate.WellTyped dim (cuccaro_MAJ_inv a b c)
theoremcuccaro_maj_chain_inv_wellTyped
theorem cuccaro_maj_chain_inv_wellTyped
    (n q_start dim : Nat) (h : q_start + 2 * n + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_maj_chain_inv n q_start)
defsqir_modmult_rev_anc
def sqir_modmult_rev_anc (n : Nat) : Nat
**SQIR-faithful ancilla count** (Coq `ModMult.v` line 72). SQIR uses `3 * n + 11` ancilla qubits for `modmult_rev`, NOT the `2 * n + 1` Lean placeholder. We expose this separately to avoid silently patching the Lean placeholder while still making the real SQIR value available for parallel SQIR-faithful Lean development.
theoremsqir_modmult_total_dim
theorem sqir_modmult_total_dim (n : Nat) :
    n + sqir_modmult_rev_anc n = 4 * n + 11
Total dimension under SQIR-faithful ancilla count: `4 * n + 11`.
theoremsqir_modmult_anc_diff_from_lean_placeholder
theorem sqir_modmult_anc_diff_from_lean_placeholder (n : Nat) :
    sqir_modmult_rev_anc n
      = FormalRV.SQIRPort.modmult_rev_anc n + (n + 10)
Arithmetic gap between Lean placeholder and SQIR source. The placeholder undercounts ancilla by `n + 10`.
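Both counting statements are linear arithmetic over `Nat`. The sketch below restates them with two local stand-ins, `sqirAnc` and `placeholderAnc`, whose names are hypothetical and whose bodies are taken from the counts quoted in the descriptions above.

```lean
/-- Stand-in for the SQIR-faithful ancilla count `3 * n + 11`. -/
def sqirAnc (n : Nat) : Nat := 3 * n + 11

/-- Stand-in for the older placeholder count `2 * n + 1`. -/
def placeholderAnc (n : Nat) : Nat := 2 * n + 1

-- Total dimension: n data qubits plus the ancilla.
example (n : Nat) : n + sqirAnc n = 4 * n + 11 := by
  show n + (3 * n + 11) = 4 * n + 11
  omega

-- The placeholder undercounts by n + 10.
example (n : Nat) : sqirAnc n = placeholderAnc n + (n + 10) := by
  show 3 * n + 11 = 2 * n + 1 + (n + 10)
  omega
```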
theoremcuccaro_MAJ_inv_at_a
theorem cuccaro_MAJ_inv_at_a
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ_inv a b c) f a
      = xor (f a) (xor (f c) (f a && f b))
**MAJ_inv at the `a` wire.** Composes CCX, CX c a, CX c b left-to-right; at position `a` only the `CX c a` step writes a new value.
theoremcuccaro_MAJ_inv_at_b
theorem cuccaro_MAJ_inv_at_b
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ_inv a b c) f b
      = xor (f b) (xor (f c) (f a && f b))
**MAJ_inv at the `b` wire.**
theoremcuccaro_MAJ_inv_at_c
theorem cuccaro_MAJ_inv_at_c
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ_inv a b c) f c
      = xor (f c) (f a && f b)
**MAJ_inv at the `c` wire.** Only the first CCX writes here.
theoremcuccaro_MAJ_inv_at_other
theorem cuccaro_MAJ_inv_at_other
    (a b c q : Nat) (h_qa : q ≠ a) (h_qb : q ≠ b) (h_qc : q ≠ c) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_MAJ_inv a b c) f q = f q
**MAJ_inv at any unrelated wire.**
theoremcuccaro_MAJ_followed_by_MAJ_inv_eq_id
theorem cuccaro_MAJ_followed_by_MAJ_inv_eq_id
    (a b c : Nat) (h_ab : a ≠ b) (h_ac : a ≠ c) (h_bc : b ≠ c)
    (f : Nat → Bool) (q : Nat) :
    Gate.applyNat (seq (cuccaro_MAJ a b c) (cuccaro_MAJ_inv a b c)) f q = f q
**Local inverse identity** (per position). Applying MAJ followed by MAJ_inv to a state restores the original at every position.
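On three wires the identity reduces to eight Boolean cases. The sketch below is a toy model, not the library's `Gate` IR: `Wires`, `ccx`, `cxCA`, `cxCB`, `maj` and `majInv` are hypothetical names, `!=` is used for XOR on `Bool`, and MAJ is assumed to be the reversed sequence of the MAJ_inv gate order described above.

```lean
/-- Toy state: the three wires `(a, b, c)` as Booleans. -/
abbrev Wires := Bool × Bool × Bool

def ccx : Wires → Wires
  | (a, b, c) => (a, b, c != (a && b))   -- CCX a b c: writes wire c
def cxCA : Wires → Wires
  | (a, b, c) => (a != c, b, c)          -- CX c a: writes wire a
def cxCB : Wires → Wires
  | (a, b, c) => (a, b != c, c)          -- CX c b: writes wire b

/-- Inverse MAJ: CCX, then CX c a, then CX c b. -/
def majInv (s : Wires) : Wires := cxCB (cxCA (ccx s))

/-- MAJ, taken here as the same three gates in reverse order. -/
def maj (s : Wires) : Wires := ccx (cxCA (cxCB s))

-- MAJ followed by MAJ_inv restores all three wires.
theorem majInv_maj : ∀ s : Wires, majInv (maj s) = s
  | (a, b, c) => by cases a <;> cases b <;> cases c <;> rfl
```

The toy model fixes the three wires by position, so the distinctness hypotheses `a ≠ b`, `a ≠ c`, `b ≠ c` of the real theorem are built in rather than assumed.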
theoremcuccaro_maj_chain_inv_after_chain_eq_id
theorem cuccaro_maj_chain_inv_after_chain_eq_id
    (n q_start : Nat) (g : Nat → Bool) :
    Gate.applyNat (cuccaro_maj_chain_inv n q_start)
        (Gate.applyNat (cuccaro_maj_chain n q_start) g) = g
**Chain inverse identity (function-level).** Applying the chain followed by its inverse to any state returns the original state.
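The chain-level identity follows the usual "undo in reverse order" pattern. A generic sketch of the inductive step, with `f` playing the role of the head MAJ, `g` the rest of the chain, and `fi`, `gi` their inverses (all four names are placeholders, not library definitions):

```lean
-- If `fi` undoes `f` and `gi` undoes `g`, then running `f`, then `g`,
-- then `gi`, then `fi` returns the starting state.
example {α : Type} (f g fi gi : α → α)
    (hf : ∀ x, fi (f x) = x) (hg : ∀ x, gi (g x) = x) :
    ∀ x, fi (gi (g (f x))) = x := by
  intro x
  rw [hg, hf]
```

This matches the shape of `cuccaro_maj_chain_inv`, which runs the inverse of the tail before the inverse of the head.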
theoremcuccaro_maj_chain_followed_by_inv_eq_id
theorem cuccaro_maj_chain_followed_by_inv_eq_id
    (n q_start : Nat) (f : Nat → Bool) (q : Nat) :
    Gate.applyNat
        (seq (cuccaro_maj_chain n q_start)
             (cuccaro_maj_chain_inv n q_start)) f q = f q
**Chain inverse identity** (per position). Pointwise corollary.
defsqir_style_compareConst_candidate
def sqir_style_compareConst_candidate
    (bits q_start N flagPos : Nat) : Gate
**SQIR-style compare-constant candidate gate** with explicit flag position. Uses the compute-CNOT-uncompute pattern: workspace restored, flag XOR'd with the comparison result.
theoremsqir_style_compareConst_candidate_wellTyped
theorem sqir_style_compareConst_candidate_wellTyped
    (bits q_start N flagPos dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_distinct : flagPos ≠ q_start + 2 * bits) :
    Gate.WellTyped dim
        (sqir_style_compareConst_candidate bits q_start N flagPos)
**WellTyped for the SQIR-style comparator candidate.** Requires both the workspace range `q_start + 2*bits + 1 ≤ dim` AND `flagPos < dim`, plus `flagPos ≠ q_start + 2 * bits` (the CNOT's control and target must differ).
theoremcuccaro_maj_chain_inv_frame_above
theorem cuccaro_maj_chain_inv_frame_above
    (n q_start : Nat) (f : Nat → Bool) (q : Nat)
    (h : q_start + 2 * n + 1 ≤ q) :
    Gate.applyNat (cuccaro_maj_chain_inv n q_start) f q = f q
The inverse MAJ chain doesn't touch positions strictly above its support (i.e., `q ≥ q_start + 2*n + 1`).
theoremcuccaro_maj_chain_inv_frame_below
theorem cuccaro_maj_chain_inv_frame_below
    (n q_start : Nat) (f : Nat → Bool) (q : Nat) (h : q < q_start) :
    Gate.applyNat (cuccaro_maj_chain_inv n q_start) f q = f q
The inverse MAJ chain doesn't touch positions strictly below its support (i.e., `q < q_start`).
theoremcuccaro_input_F_above_eq_false
theorem cuccaro_input_F_above_eq_false
    (q_start bits x q : Nat) (h_above : q_start + 2 * bits + 1 ≤ q) (hx : x < 2^bits) :
    cuccaro_input_F q_start false 0 x q = false
For an input `cuccaro_input_F q_start false 0 x` with `x < 2^bits`, all positions strictly above `q_start + 2*bits` evaluate to `false`.
theoremsqir_style_compareConst_candidate_flag
theorem sqir_style_compareConst_candidate_flag
    (bits q_start N x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x) flagPos
      = decide (N ≤ x)
**HEADLINE: flag-copy theorem.** After running the SQIR-style comparator candidate on the input encoding `cuccaro_input_F q_start false 0 x`, the external flag qubit at `flagPos` holds `decide (N ≤ x)`.
theoremsqir_style_compareConst_candidate_underflow_flag
theorem sqir_style_compareConst_candidate_underflow_flag
    (bits q_start N x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    !(Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x) flagPos)
      = decide (x < N)
**Underflow polarity**: negation of the flag gives `decide (x < N)`.
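The two polarities can be compared on concrete values. The numbers below (`N = 5`, `x = 6` and `x = 3`) are illustrative only, and the flag is represented directly by the Boolean `decide (N ≤ x)` rather than by a qubit.

```lean
-- x = 6 ≥ N: the flag is set, so its negation reports "no underflow".
example : decide ((5 : Nat) ≤ 6) = true := by decide
example : (!decide ((5 : Nat) ≤ 6)) = decide ((6 : Nat) < 5) := by decide

-- x = 3 < N: the flag is clear, so its negation reports "underflow".
example : decide ((5 : Nat) ≤ 3) = false := by decide
example : (!decide ((5 : Nat) ≤ 3)) = decide ((3 : Nat) < 5) := by decide
```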
theoremsqir_style_compareConst_candidate_clean_flag
theorem sqir_style_compareConst_candidate_clean_flag
    (bits q_start N x flagPos dim : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_flag : flagPos < dim)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.WellTyped dim
        (sqir_style_compareConst_candidate bits q_start N flagPos)
    ∧ Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
          (cuccaro_input_F q_start false 0 x) flagPos
        = decide (N ≤ x)
**HEADLINE: packaged SQIR-style comparator primitive (flag-only).** Combines WellTyped at the SQIR-faithful dimension with the flag-copy theorem. Workspace restoration is established structurally by construction (forward-CX-uncompute pattern) but the full per-position bit-level workspace-restoration theorem requires a "function locality" lemma not yet proved (see status note below).
theoremcuccaro_prepareConstRead_self_inverse_at
theorem cuccaro_prepareConstRead_self_inverse_at
    (bits q_start c : Nat) (f : Nat → Bool) (q : Nat) :
    Gate.applyNat (cuccaro_prepareConstRead bits q_start c)
        (Gate.applyNat (cuccaro_prepareConstRead bits q_start c) f) q
      = f q
**Prepare self-inverse (per position).**
theoremcuccaro_prepareConstRead_self_inverse
theorem cuccaro_prepareConstRead_self_inverse
    (bits q_start c : Nat) (f : Nat → Bool) :
    Gate.applyNat (cuccaro_prepareConstRead bits q_start c)
        (Gate.applyNat (cuccaro_prepareConstRead bits q_start c) f) = f
**Prepare self-inverse (function-level).**
theoremcuccaro_maj_chain_inv_commute_update_outside_workspace
theorem cuccaro_maj_chain_inv_commute_update_outside_workspace
    (bits q_start flagPos : Nat) (v : Bool)
    (f : Nat → Bool)
    (hflag_outside : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (p : Nat) (hp_lower : q_start ≤ p) (hp_upper : p < q_start + 2 * bits + 1) :
    Gate.applyNat (cuccaro_maj_chain_inv bits q_start)
        (update f flagPos v) p
      = Gate.applyNat (cuccaro_maj_chain_inv bits q_start) f p
**Locality / commute-with-outside-update**: the inverse MAJ chain commutes with updating the input at any position outside its workspace range, when queried at any workspace position.
theoremsqir_style_compareConst_candidate_workspace_restored_at
theorem sqir_style_compareConst_candidate_workspace_restored_at
    (bits q_start N flagPos : Nat) (f : Nat → Bool)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos)
    (q : Nat) (hq_lower : q_start ≤ q) (hq_upper : q < q_start + 2 * bits + 1) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos) f q
      = f q
**HEADLINE: workspace restoration (at-position).** At any workspace position `q ∈ [q_start, q_start + 2*bits]`, the SQIR-style comparator candidate restores the input value.
theoremsqir_style_compareConst_candidate_target_restored
theorem sqir_style_compareConst_candidate_target_restored
    (bits q_start N x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    ∀ i, i < bits →
      Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * i + 1)
      = x.testBit i
**Target register restored**: at each target position `q_start + 2*i + 1` for `i < bits`, the output equals `x.testBit i`.
theoremsqir_style_compareConst_candidate_read_restored
theorem sqir_style_compareConst_candidate_read_restored
    (bits q_start N x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    ∀ i, i < bits →
      Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * i + 2)
      = false
**Read register restored**: at each read position `q_start + 2*i + 2` for `i < bits`, the output equals `false`.
theoremsqir_style_compareConst_candidate_carry_in_restored
theorem sqir_style_compareConst_candidate_carry_in_restored
    (bits q_start N x flagPos : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x) q_start = false
**Carry-in qubit restored**: at position `q_start`, the output equals `false`.
theoremsqir_style_compareConst_candidate_top_carry_restored
theorem sqir_style_compareConst_candidate_top_carry_restored
    (bits q_start N x flagPos : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
        (cuccaro_input_F q_start false 0 x) (q_start + 2 * bits) = false
**Top-carry qubit restored**: at position `q_start + 2*bits`, the output equals the input value (= `0.testBit (bits - 1)` for `a = 0`, which is `false`).
theoremsqir_style_compareConst_candidate_clean
theorem sqir_style_compareConst_candidate_clean
    (bits q_start N x flagPos : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_workspace : q_start + 2 * bits + 1 ≤ sqir_modmult_rev_anc bits)
    (h_flag : flagPos < sqir_modmult_rev_anc bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_compareConst_candidate bits q_start N flagPos)
    ∧ Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
          (cuccaro_input_F q_start false 0 x) flagPos
        = decide (N ≤ x)
    ∧ (∀ i, i < bits →
**HEADLINE: FULLY CLEAN SQIR-style comparator primitive.** At the SQIR-faithful dimension `3*bits + 11`:
- WellTyped;
- `flagPos` gets `decide (N ≤ x)`;
- read register fully restored to `0`;
- target register fully restored to `x.testBit`;
- carry-in qubit restored to `false`;
- top-carry qubit restored to `false`.
theoremsqir_style_compareConst_candidate_target_decode_restored
theorem sqir_style_compareConst_candidate_target_decode_restored
    (bits q_start N x flagPos : Nat)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits)
    (h_flag_above : q_start + 2 * bits + 1 ≤ flagPos) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_style_compareConst_candidate bits q_start N flagPos)
          (cuccaro_input_F q_start false 0 x))
      = x
**Decoded target restoration**: the decoded target register after the comparator equals `x`.
theoremsqir_style_compareConst_candidate_wellTyped_sqir_dim
theorem sqir_style_compareConst_candidate_wellTyped_sqir_dim
    (bits q_start N flagPos : Nat) (hbits : 1 ≤ bits)
    (h_workspace : q_start + 2 * bits + 1 ≤ sqir_modmult_rev_anc bits)
    (h_flag : flagPos < sqir_modmult_rev_anc bits)
    (h_distinct : flagPos ≠ q_start + 2 * bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits)
        (sqir_style_compareConst_candidate bits q_start N flagPos)
**WellTyped at the SQIR-faithful dimension `sqir_modmult_rev_anc bits = 3 * bits + 11`.** The SQIR-style candidate fits comfortably: it needs the workspace positions below `q_start + 2*bits + 1` plus one flag qubit, which is much less than the full SQIR ancilla budget.

FormalRV.Arithmetic.Cuccaro.CuccaroSubConst

FormalRV/Arithmetic/Cuccaro/CuccaroSubConst.lean
FormalRV.BQAlgo.CuccaroSubConst: exact-budget Cuccaro subtract-constant primitive + flag-feasibility analysis.
Tick 46:
- Define `cuccaro_subConstGate` as add-by-two's-complement.
- Prove subtract correctness via wraparound spec.
- Prove arithmetic split lemmas (no-underflow vs underflow cases).
- Analyze whether the clean exact-budget subtract exposes a borrow/comparison flag.
Conclusion (Deliverable D, formal): the CLEAN exact-budget Cuccaro subtract-constant primitive restores all non-target ancilla to their canonical zero values. The only informative output is the target register itself, which encodes `(x + 2^bits - N) mod 2^bits`, a function that distinguishes `x < N` from `x ≥ N` via its value but NOT via any single ancilla bit. Therefore an exact-budget modular-reduction step cannot read the borrow flag from a single qubit of this gate's output; a different construction (forward-only comparator copying the top carry before reverse uncompute, or a modified primitive that reserves a flag qubit) is required for the next layer.
This file does NOT extend the Cuccaro budget. It identifies the precise structural blocker for the SQIR-axiom-closure pipeline.
defcuccaro_subConstGate
def cuccaro_subConstGate (bits q_start N : Nat) : Gate
**Cuccaro subtract-constant gate** (exact-budget). Implemented as add by the two's-complement of `N`.
defcuccaro_subConstSpec
def cuccaro_subConstSpec (bits N x : Nat) : Nat
**Wraparound spec for subtract.** The target register after a subtract-constant equals `(x + (2^bits - N)) mod 2^bits`. In the non-underflow case (`x ≥ N`) this reduces to `x - N`; in the underflow case (`x < N`) it equals `x + 2^bits - N`.
theoremcuccaro_subConstSpec_of_le
theorem cuccaro_subConstSpec_of_le
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx : x < 2^bits) (hle : N ≤ x) :
    cuccaro_subConstSpec bits N x = x - N
**Non-underflow case.** When `N ≤ x`, the wraparound spec reduces to integer subtraction.
theoremcuccaro_subConstSpec_of_lt
theorem cuccaro_subConstSpec_of_lt
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx : x < N) :
    cuccaro_subConstSpec bits N x = x + 2^bits - N
**Underflow case.** When `x < N`, the wraparound spec equals `x + 2^bits - N`.
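Both branches of the wraparound spec can be seen on a 4-bit example. The helper `subSpec` below is a hypothetical stand-in written from the formula `(x + (2^bits - N)) mod 2^bits` quoted in the spec's description; it is not the library definition itself.

```lean
/-- Stand-in for the wraparound spec (hypothetical name). -/
def subSpec (bits N x : Nat) : Nat := (x + (2 ^ bits - N)) % 2 ^ bits

-- bits = 4, N = 5, x = 9 (no underflow): 9 + 11 = 20 wraps to 4 = 9 - 5.
example : subSpec 4 5 9 = 9 - 5 := by decide

-- bits = 4, N = 5, x = 3 (underflow): 3 + 11 = 14 = 3 + 16 - 5.
example : subSpec 4 5 3 = 3 + 2 ^ 4 - 5 := by decide
```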
theoremcuccaro_subConstGate_target_decode
theorem cuccaro_subConstGate_target_decode
    (bits q_start N x : Nat) (h1N : 1 ≤ N) (hN : N ≤ 2^bits) :
    cuccaro_target_val bits q_start
      (Gate.applyNat (cuccaro_subConstGate bits q_start N)
        (cuccaro_input_F q_start false 0 x))
    = cuccaro_subConstSpec bits N x
**HEADLINE: subtract-constant target decode.** After `cuccaro_subConstGate bits q_start N` on `cuccaro_input_F q_start false 0 x`, the target register decodes to `cuccaro_subConstSpec bits N x`.
theoremcuccaro_subConstGate_wellTyped
theorem cuccaro_subConstGate_wellTyped
    (bits q_start N dim : Nat) (h : q_start + 2 * bits + 1 ≤ dim) :
    Gate.WellTyped dim (cuccaro_subConstGate bits q_start N)
**Subtract-constant WellTyped.**
theoremcuccaro_subConstGate_clean
theorem cuccaro_subConstGate_clean
    (bits q_start N x : Nat) (h1N : 1 ≤ N) (hN : N ≤ 2^bits) :
    Gate.WellTyped (q_start + (2 * bits + 1))
        (cuccaro_subConstGate bits q_start N)
    ∧ cuccaro_target_val bits q_start
          (Gate.applyNat (cuccaro_subConstGate bits q_start N)
            (cuccaro_input_F q_start false 0 x))
        = cuccaro_subConstSpec bits N x
    ∧ cuccaro_read_val bits q_start
          (Gate.applyNat (cuccaro_subConstGate bits q_start N)
            (cuccaro_input_F q_start false 0 x))
        = 0
**HEADLINE: packaged clean subtract-constant primitive.**
- WellTyped at dimension `q_start + (2*bits + 1)`;
- target decode = `cuccaro_subConstSpec bits N x`;
- read register restored to 0;
- carry-in qubit restored to false.
theoremcuccaro_subConstGate_clean_state_loses_underflow_info
theorem cuccaro_subConstGate_clean_state_loses_underflow_info
    (bits q_start N x : Nat) (h1N : 1 ≤ N) (hN : N ≤ 2^bits) :
    -- Carry-in restored to false.
    (Gate.applyNat (cuccaro_subConstGate bits q_start N)
        (cuccaro_input_F q_start false 0 x) q_start = false)
    ∧
    -- Every read-register qubit is false.
    (∀ i, i < bits →
        Gate.applyNat (cuccaro_subConstGate bits q_start N)
          (cuccaro_input_F q_start false 0 x) (q_start + 2 * i + 2) = false)
**Formal blocker: clean subtract leaves NO single bit holding the borrow flag.** Every ancilla qubit within the `2*bits + 1` adder budget is restored. In particular:
- the carry-in qubit at `q_start` holds `false`;
- every read-register qubit at `q_start + 2*i + 2` (i < bits) holds `false`.
Consequence: an exact-budget modular-reduction step that needs the borrow flag cannot extract it from any single output qubit of the clean subtract-constant gate. A different construction is required (see PROGRESS.md tick-46 status for the path forward).
theoremcuccaro_subConstSpec_underflow_range
theorem cuccaro_subConstSpec_underflow_range
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    (x < N → 2^bits - N ≤ cuccaro_subConstSpec bits N x)
    ∧ (N ≤ x → cuccaro_subConstSpec bits N x < 2^bits - N)
**Target encodes borrow via Nat-order, not via a single Boolean.** In the underflow case, the target value lies in `[2^bits - N, 2^bits - 1]`; in the non-underflow case it lies in `[0, 2^bits - N - 1]`.
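The two ranges can be probed at their endpoints. As a sketch, `wrapSub` is a hypothetical re-statement of the wraparound formula `(x + (2^bits - N)) mod 2^bits`, and the parameters `bits = 4`, `N = 5` (so the threshold `2^4 - 5` is 11) are chosen only for illustration.

```lean
/-- Hypothetical re-statement of the wraparound formula. -/
def wrapSub (bits N x : Nat) : Nat := (x + (2 ^ bits - N)) % 2 ^ bits

-- Underflow inputs (x < 5) land at or above the threshold 11.
example : 2 ^ 4 - 5 ≤ wrapSub 4 5 0 := by decide   -- value 11
example : 2 ^ 4 - 5 ≤ wrapSub 4 5 4 := by decide   -- value 15

-- Non-underflow inputs (5 ≤ x < 16) land strictly below it.
example : wrapSub 4 5 5 < 2 ^ 4 - 5 := by decide    -- value 0
example : wrapSub 4 5 15 < 2 ^ 4 - 5 := by decide   -- value 10
```

Telling the two ranges apart requires comparing the whole target value against the threshold, which is the point of the blocker: no single output qubit carries that comparison.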

FormalRV.Arithmetic.GateToUCom

FormalRV/Arithmetic/GateToUCom.lean
FormalRV.BQAlgo.GateToUCom — translation from the BQ-Algo `Gate` IR (used for cost accounting + optimization) to the Framework `BaseUCom` (used for semantic reasoning). The translation is faithful in the obvious sense: each `Gate` constructor maps to its `BaseUCom` analog, and `seq` becomes `UCom.seq`. This enables lifting BQ-Algo optimization theorems (tcount/gcount monotonicity) to BaseUCom semantic-preservation proofs via the existing `Framework` layer. Status: translation function + structural unfolding lemmas. Semantic preservation theorems (e.g., `uc_eval (toUCom (optimize_full g)) = uc_eval (toUCom g)`) are the natural next milestones.
def Gate.toUCom
noncomputable def Gate.toUCom (dim : Nat) : Gate → BaseUCom dim
  | Gate.I            => BaseUCom.ID 0
  | Gate.X q          => BaseUCom.X q
  | Gate.CX c t       => BaseUCom.CNOT c t
  | Gate.CCX a b c    => BaseUCom.CCX a b c
  | Gate.seq g₁ g₂    => UCom.seq (Gate.toUCom dim g₁) (Gate.toUCom dim g₂)
Translate a BQ-Algo `Gate` into a `BaseUCom dim`. Identity, X, CNOT, and Toffoli have direct analogs; `Gate.seq` becomes `UCom.seq`. Marked `noncomputable` because `BaseUCom` carries real-valued matrix data downstream.
theorem uc_eval_toUCom_optimize_ccx_pair_top_pair
theorem uc_eval_toUCom_optimize_ccx_pair_top_pair {dim : Nat} (a b c : Nat)
    (h0 : 0 < dim)
    (ha : a < dim) (hb : b < dim) (hc : c < dim)
    (hab : a ≠ b) (hac : a ≠ c) (hbc : b ≠ c) :
    uc_eval (Gate.toUCom dim
      (optimize_ccx_pair_top (Gate.seq (Gate.CCX a b c) (Gate.CCX a b c))))
      = uc_eval (Gate.toUCom dim (Gate.seq (Gate.CCX a b c) (Gate.CCX a b c)))
Semantic preservation of the top-level CCX-pair rewrite on the matching-triple case. `uc_eval` of the optimized output (which is `BaseUCom.ID 0`) equals `uc_eval` of the input (`UCom.seq CCX CCX`); both reduce to the identity matrix.
theorem uc_eval_toUCom_optimize_ccx_pair_top_I
theorem uc_eval_toUCom_optimize_ccx_pair_top_I {dim : Nat} :
    uc_eval (Gate.toUCom dim (optimize_ccx_pair_top Gate.I))
      = uc_eval (Gate.toUCom dim Gate.I)
theorem uc_eval_toUCom_optimize_ccx_pair_top_X
theorem uc_eval_toUCom_optimize_ccx_pair_top_X {dim q : Nat} :
    uc_eval (Gate.toUCom dim (optimize_ccx_pair_top (Gate.X q)))
      = uc_eval (Gate.toUCom dim (Gate.X q))
theorem uc_eval_toUCom_optimize_ccx_pair_top_CX
theorem uc_eval_toUCom_optimize_ccx_pair_top_CX {dim a b : Nat} :
    uc_eval (Gate.toUCom dim (optimize_ccx_pair_top (Gate.CX a b)))
      = uc_eval (Gate.toUCom dim (Gate.CX a b))
theorem uc_eval_toUCom_optimize_ccx_pair_top_CCX
theorem uc_eval_toUCom_optimize_ccx_pair_top_CCX {dim a b c : Nat} :
    uc_eval (Gate.toUCom dim (optimize_ccx_pair_top (Gate.CCX a b c)))
      = uc_eval (Gate.toUCom dim (Gate.CCX a b c))
theorem uc_eval_toUCom_optimize_ccx_pair_top_pair_diff
theorem uc_eval_toUCom_optimize_ccx_pair_top_pair_diff {dim : Nat}
    (a b c a' b' c' : Nat) (h : ¬ (a = a' ∧ b = b' ∧ c = c')) :
    uc_eval (Gate.toUCom dim
      (optimize_ccx_pair_top (Gate.seq (Gate.CCX a b c) (Gate.CCX a' b' c'))))
      = uc_eval (Gate.toUCom dim (Gate.seq (Gate.CCX a b c) (Gate.CCX a' b' c')))
When the two CCXs have differing triples, the optimizer leaves the circuit unchanged.
def Gate.WellTyped
def Gate.WellTyped (dim : Nat) : Gate → Prop
  | Gate.I            => 0 < dim
  | Gate.X q          => q < dim
  | Gate.CX a b       => a < dim ∧ b < dim ∧ a ≠ b
  | Gate.CCX a b c    => a < dim ∧ b < dim ∧ c < dim ∧ a ≠ b ∧ a ≠ c ∧ b ≠ c
  | Gate.seq g₁ g₂    => Gate.WellTyped dim g₁ ∧ Gate.WellTyped dim g₂
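A `Gate` is well-typed in a `dim`-qubit context iff every qubit index it mentions is below `dim`, the control and target of each CX are distinct, and the two controls and target of each CCX are pairwise distinct. The identity gate additionally requires `0 < dim`.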
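A minimal sketch of how the predicate unfolds on concrete gates follows. The `Gate` type here is a toy copy of the five-constructor IR, declared in its own namespace so the block is self-contained; it is an illustration, not the library's declaration.

```lean
namespace WellTypedSketch

-- Toy copy of the IR, assumed from the constructors listed in this module.
inductive Gate where
  | I
  | X (q : Nat)
  | CX (c t : Nat)
  | CCX (a b c : Nat)
  | seq (g₁ g₂ : Gate)

def WellTyped (dim : Nat) : Gate → Prop
  | Gate.I         => 0 < dim
  | Gate.X q       => q < dim
  | Gate.CX a b    => a < dim ∧ b < dim ∧ a ≠ b
  | Gate.CCX a b c => a < dim ∧ b < dim ∧ c < dim ∧ a ≠ b ∧ a ≠ c ∧ b ≠ c
  | Gate.seq g₁ g₂ => WellTyped dim g₁ ∧ WellTyped dim g₂

-- A Toffoli on three distinct qubits of a 3-qubit register is well-typed.
example : WellTyped 3 (Gate.CCX 0 1 2) :=
  show 0 < 3 ∧ 1 < 3 ∧ 2 < 3 ∧ 0 ≠ 1 ∧ 0 ≠ 2 ∧ 1 ≠ 2 by decide

-- Using qubit 0 for both controls violates the `a ≠ b` conjunct.
example : ¬ WellTyped 3 (Gate.CCX 0 0 2) :=
  show ¬ (0 < 3 ∧ 0 < 3 ∧ 2 < 3 ∧ 0 ≠ 0 ∧ 0 ≠ 2 ∧ 0 ≠ 2) by decide

end WellTypedSketch
```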
theorem uc_eval_toUCom_optimize_ccx_pair_top
theorem uc_eval_toUCom_optimize_ccx_pair_top {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    uc_eval (Gate.toUCom dim (optimize_ccx_pair_top g))
      = uc_eval (Gate.toUCom dim g)
**Unified semantic preservation** for the top-level CCX-pair rewrite. Combines the matching-pair case (uses `CCX_CCX_eq_one`) with all no-op cases (`rfl` + `if_neg`).
theorem Gate.WellTyped_optimize_ccx_pair_top
theorem Gate.WellTyped_optimize_ccx_pair_top {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    Gate.WellTyped dim (optimize_ccx_pair_top g)
Well-typedness is preserved by the top-level CCX-pair rewrite. The interesting case: when the optimizer fires on a CCX-CCX pair, the output `I` requires `0 < dim`, which we can extract from the inner CCXs' well-typedness.
theorem Gate.WellTyped_optimize_ccx_pairs_deep
theorem Gate.WellTyped_optimize_ccx_pairs_deep {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    Gate.WellTyped dim (optimize_ccx_pairs_deep g)
Well-typedness is preserved by the deep optimizer. Inductive on `g`.
theorem uc_eval_toUCom_optimize_ccx_pairs_deep
theorem uc_eval_toUCom_optimize_ccx_pairs_deep {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    uc_eval (Gate.toUCom dim (optimize_ccx_pairs_deep g))
      = uc_eval (Gate.toUCom dim g)
**Semantic preservation for the deep optimizer.** Inductive on `g`, using the top-level unified theorem at each `seq` step.
theorem uc_eval_toUCom_optimize_I_top
theorem uc_eval_toUCom_optimize_I_top {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    uc_eval (Gate.toUCom dim (optimize_I_top g))
      = uc_eval (Gate.toUCom dim g)
Semantic preservation for the top-level I-elimination rewrite. The interesting cases: `seq I g → g` and `seq g I → g`. Both use `uc_eval_ID_eq_one`.
theorem Gate.WellTyped_optimize_I_top
theorem Gate.WellTyped_optimize_I_top {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    Gate.WellTyped dim (optimize_I_top g)
Well-typedness is preserved by the top-level I-elimination rewrite. `seq I g → g` and `seq g I → g` only drop an I (which is well-typed iff 0 < dim, propagated from any inner CCX or the seq's other half).
theorem Gate.WellTyped_optimize_I_pairs_deep
theorem Gate.WellTyped_optimize_I_pairs_deep {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    Gate.WellTyped dim (optimize_I_pairs_deep g)
Well-typedness is preserved by the deep I-elimination optimizer.
theorem uc_eval_toUCom_optimize_I_pairs_deep
theorem uc_eval_toUCom_optimize_I_pairs_deep {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    uc_eval (Gate.toUCom dim (optimize_I_pairs_deep g))
      = uc_eval (Gate.toUCom dim g)
Semantic preservation for the deep I-elimination optimizer.
theorem uc_eval_toUCom_optimize_full
theorem uc_eval_toUCom_optimize_full {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    uc_eval (Gate.toUCom dim (optimize_full g))
      = uc_eval (Gate.toUCom dim g)
**Semantic preservation for the full optimizer.** Composes the CCX-deep and I-deep preservations.
theorem Gate.WellTyped_optimize_full
theorem Gate.WellTyped_optimize_full {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    Gate.WellTyped dim (optimize_full g)
Well-typedness is preserved by `optimize_full`. Compose the two deep preservations.
theorem Gate.WellTyped_optimize_to_fixpoint
theorem Gate.WellTyped_optimize_to_fixpoint {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    Gate.WellTyped dim (optimize_to_fixpoint g)
Well-typedness is preserved by the WF-recursive fixpoint operator. Same shape as the cost-monotonicity proofs: case-split on `has_ccx_pair g`, recurse via `_eq_recurse_of_pair`, base case via `_eq_self_of_no_pair`.
theorem uc_eval_toUCom_optimize_to_fixpoint
theorem uc_eval_toUCom_optimize_to_fixpoint {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    uc_eval (Gate.toUCom dim (optimize_to_fixpoint g))
      = uc_eval (Gate.toUCom dim g)
**Semantic preservation for the WF-recursive fixpoint operator.** Closes the certification stack: the unfueled `optimize_to_fixpoint` is formally proven to terminate, produce pair-free output, never increase `tcount` or `gcount`, and preserve `uc_eval`.
theorem optimize_to_fixpoint_uc_equiv
theorem optimize_to_fixpoint_uc_equiv {dim : Nat}
    (g : Gate) (h_wt : Gate.WellTyped dim g) :
    UCom.equiv (Gate.toUCom dim (optimize_to_fixpoint g))
               (Gate.toUCom dim g)
UCom.equiv-form of the WF-fixpoint preservation. Clients reasoning in the UCom semantic layer can use this directly.
theorem uc_eval_optimize_to_fixpoint_cuccaro_MAJ
theorem uc_eval_optimize_to_fixpoint_cuccaro_MAJ {dim : Nat}
    (a b c : Nat) (h_wt : Gate.WellTyped dim (cuccaro_MAJ a b c)) :
    uc_eval (Gate.toUCom dim (optimize_to_fixpoint (cuccaro_MAJ a b c)))
      = uc_eval (Gate.toUCom dim (cuccaro_MAJ a b c))
Cuccaro MAJ: the certified optimizer preserves its semantics (which is non-trivially the majority function on bits).
theorem uc_eval_optimize_to_fixpoint_cuccaro_UMA
theorem uc_eval_optimize_to_fixpoint_cuccaro_UMA {dim : Nat}
    (a b c : Nat) (h_wt : Gate.WellTyped dim (cuccaro_UMA a b c)) :
    uc_eval (Gate.toUCom dim (optimize_to_fixpoint (cuccaro_UMA a b c)))
      = uc_eval (Gate.toUCom dim (cuccaro_UMA a b c))
Cuccaro UMA analog.
example
example (a b c : Nat) :
    tcount (cuccaro_MAJ a b c) + tcount (cuccaro_UMA a b c) = 14
Documented limitation: on `seq MAJ UMA`, the natural CCX-CCX boundary between MAJ-end and UMA-start is NOT caught by the current optimizer because association blocks the pattern. `seq MAJ UMA` has shape `seq (seq ... (CCX a b c)) (seq (CCX a b c) ...)` where the two CCXs are at different nesting depths. T-count stays at 14 (= 7 + 7) without an associativity-normalizing preprocessor. Smoke test: optimizer leaves it at 14 T per `MAJ_UMA_pair_tcount`.
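The association issue can be reproduced on a toy model. The sketch below assumes a top-level rewrite that only matches `seq (CCX ..) (CCX ..)` directly; `optTop` and the local `Gate` type are hypothetical stand-ins for `optimize_ccx_pair_top` and the real IR.

```lean
namespace AssocSketch

-- Toy copy of the IR (illustrative only).
inductive Gate where
  | I
  | X (q : Nat)
  | CX (c t : Nat)
  | CCX (a b c : Nat)
  | seq (g₁ g₂ : Gate)

-- Hypothetical top-level rewrite: cancel two identical adjacent Toffolis.
def optTop : Gate → Gate
  | Gate.seq (Gate.CCX a b c) (Gate.CCX a' b' c') =>
      if a = a' ∧ b = b' ∧ c = c' then Gate.I
      else Gate.seq (Gate.CCX a b c) (Gate.CCX a' b' c')
  | g => g

-- Directly adjacent pair: the rewrite fires.
example : optTop (Gate.seq (Gate.CCX 0 1 2) (Gate.CCX 0 1 2)) = Gate.I := rfl

-- Same two Toffolis, but one closes the left subtree and the other opens the
-- right subtree: the pattern does not match and the term is returned unchanged.
example :
    optTop (Gate.seq (Gate.seq (Gate.X 0) (Gate.CCX 0 1 2))
                     (Gate.seq (Gate.CCX 0 1 2) (Gate.X 0)))
      = Gate.seq (Gate.seq (Gate.X 0) (Gate.CCX 0 1 2))
                 (Gate.seq (Gate.CCX 0 1 2) (Gate.X 0)) := rfl

end AssocSketch
```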
theorem uc_eval_toUCom_assoc_right_step
theorem uc_eval_toUCom_assoc_right_step {dim : Nat} (g : Gate) :
    uc_eval (Gate.toUCom dim (assoc_right_step g))
      = uc_eval (Gate.toUCom dim g)
The single-step associativity rotation `assoc_right_step` preserves `uc_eval` semantics. Reduces to `Matrix.mul_assoc` after unfolding `UCom.seq`'s right-to-left matrix multiplication.
theorem assoc_right_step_uc_equiv
theorem assoc_right_step_uc_equiv {dim : Nat} (g : Gate) :
    UCom.equiv (Gate.toUCom dim (assoc_right_step g))
               (Gate.toUCom dim g)
UCom.equiv form: the rotation produces an equivalent circuit.
theorem uc_eval_toUCom_assoc_right_iter
theorem uc_eval_toUCom_assoc_right_iter {dim : Nat} (n : Nat) (g : Gate) :
    uc_eval (Gate.toUCom dim (assoc_right_iter n g))
      = uc_eval (Gate.toUCom dim g)
Iterated rotation preserves uc_eval. Induction on fuel + each step's semantic preservation.

FormalRV.Arithmetic.MCPBridge

FormalRV/Arithmetic/MCPBridge.lean
FormalRV.BQAlgo.MCPBridge — promotion of a Gate-IR Boolean semantics into the `MultiplyCircuitProperty` shape required by `SQIRPort/Shor.lean`. This module imports both `BQAlgo.Correctness` (the structural `Gate.applyNat` → `f_to_vec` adapter) and `SQIRPort.Shor` (the declarations of `uc_eval` and `MultiplyCircuitProperty`). The single exported theorem is `toUCom_satisfies_MultiplyCircuitProperty_of_applyNat`: given a `Gate` IR term `g` together with an encoding `encode` of data-register inputs into bit-functions, and proofs that (a) `f_to_vec (n+anc) (encode x) = basis_vector … (x · 2^anc)`, and (b) `f_to_vec (n+anc) (Gate.applyNat g (encode x)) = basis_vector … ((a · x mod N) · 2^anc)`, conclude that `Gate.toUCom (n+anc) g` satisfies `MultiplyCircuitProperty a N n anc`. This is the exact statement consumed by `f_modmult_circuit_MMI`.
theorem toUCom_satisfies_MultiplyCircuitProperty_of_applyNat
theorem toUCom_satisfies_MultiplyCircuitProperty_of_applyNat
    {a N n anc : Nat} {g : Gate}
    (h_wt : Gate.WellTyped (n + anc) g)
    (encode : Nat → (Nat → Bool))
    (h_input_encoded :
      ∀ x : Nat, x < N →
        f_to_vec (n + anc) (encode x)
          = FormalRV.Framework.basis_vector (2^(n+anc)) (x * 2^anc))
    (h_output_encoded :
      ∀ x : Nat, x < N →
        f_to_vec (n + anc) (Gate.applyNat g (encode x))
          = FormalRV.Framework.basis_vector (2^(n+anc))
**Gate IR ⟹ `MultiplyCircuitProperty` promotion.** Given a well-typed `Gate` term `g` on `n+anc` qubits, plus an encoding `encode : Nat → (Nat → Bool)` of inputs as bit-functions that (i) Boolean-encodes `x` as the basis state `|x · 2^anc⟩` (data register holds `x`, ancilla holds 0), and (ii) under the Gate IR's Boolean semantics, `g`'s action takes the encoded input to the encoded image `|(a · x mod N) · 2^anc⟩`, the compiled `Gate.toUCom (n+anc) g` satisfies `MultiplyCircuitProperty a N n anc`. This is the exact precondition demanded by `f_modmult_circuit_MMI` in `SQIRPort/Shor.lean`; once a constructive `Gate`-level modular multiplier `g_modmult` is supplied with the two encoding lemmas (i)/(ii), the axiom can be discharged by `toUCom_satisfies_MultiplyCircuitProperty_of_applyNat`.
theorem toUCom_satisfies_MultiplyCircuitProperty_of_applyNat_ext
theorem toUCom_satisfies_MultiplyCircuitProperty_of_applyNat_ext
    {a N n anc : Nat} {g : Gate}
    (h_wt : Gate.WellTyped (n + anc) g)
    (encode : Nat → (Nat → Bool))
    (h_encode :
      ∀ y : Nat, y < N →
        f_to_vec (n + anc) (encode y)
          = FormalRV.Framework.basis_vector (2^(n+anc)) (y * 2^anc))
    (h_apply :
      ∀ x : Nat, x < N →
        Gate.applyNat g (encode x) = encode ((a * x) % N)) :
    FormalRV.SQIRPort.MultiplyCircuitProperty a N n anc
**Extensional (purely Boolean) Gate IR ⟹ `MultiplyCircuitProperty`.** This is the cleanest user-facing adapter for discharging `f_modmult_circuit_MMI`: the output obligation is now a *purely Boolean function equality* `Gate.applyNat g (encode x) = encode ((a * x) % N)` which contains no matrix, vector, or `f_to_vec` machinery. The matrix-level lift is entirely handled inside this theorem by appealing to `toUCom_satisfies_MultiplyCircuitProperty_of_applyNat` and `h_encode`. The only encoding-level hypothesis required is `h_encode` (single direction: bit-function → basis-vector at packed index `y * 2^anc`), which only has to be proved *once* for the chosen encoding scheme, not separately for every `x` and every image `(a * x) % N`. No extra side condition such as `0 < N` is needed: the bound is extracted from `x < N` via `Nat.lt_of_le_of_lt (Nat.zero_le _) hxN`, and then `(a * x) % N < N` follows from `Nat.mod_lt`.
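The bound-extraction step named at the end of the docstring can be isolated as a standalone fact. This is a minimal sketch using only core `Nat` lemmas; the hypothesis name `hxN` follows the docstring.

```lean
-- `0 < N` is recovered from `x < N`, so `Nat.mod_lt` applies without an
-- explicit positivity hypothesis on `N`.
example (a x N : Nat) (hxN : x < N) : (a * x) % N < N :=
  Nat.mod_lt _ (Nat.lt_of_le_of_lt (Nat.zero_le _) hxN)
```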
def encodeDataZeroAnc
def encodeDataZeroAnc (n anc : Nat) (x : Nat) : Nat → Bool
Canonical Boolean encoding of the input register for the modular multiplier on `n` data qubits + `anc` ancilla qubits. Defined as `nat_to_funbool (n + anc) (x * 2^anc)`: the bit-function that produces the basis state `|x⟩|0_anc⟩` at index `x · 2^anc` via the big-endian `funbool_to_nat` convention.
theorem f_to_vec_encodeDataZeroAnc
theorem f_to_vec_encodeDataZeroAnc {n anc y : Nat} (hy : y < 2^n) :
    f_to_vec (n + anc) (encodeDataZeroAnc n anc y)
      = FormalRV.Framework.basis_vector (2^(n+anc)) (y * 2^anc)
The canonical encoding produces the basis state at index `y · 2^anc`. Direct specialisation of `basis_vector_eq_f_to_vec_nat` for the `x · 2^anc` family of indices. The required bound `y * 2^anc < 2^(n+anc)` follows from `y < 2^n` via `Nat.mul_lt_mul_of_pos_right`.
theorem encodeDataZeroAnc_data
theorem encodeDataZeroAnc_data
    {n anc x i : Nat}
    (_hx : x < 2^n) (hi : i < n) :
    encodeDataZeroAnc n anc x i
      = FormalRV.Framework.nat_to_funbool n x i
**Data-bit accessor.** For positions `i < n`, the canonical encoding's bit equals the big-endian `nat_to_funbool n x i`. Proof: `n + anc - 1 - i = (n - 1 - i) + anc`, and dividing `x * 2^anc` by `2^((n-1-i)+anc)` cancels `2^anc` via `Nat.mul_div_mul_right`.
theorem encodeDataZeroAnc_anc
theorem encodeDataZeroAnc_anc
    {n anc x j : Nat}
    (_hx : x < 2^n) (hj : j < anc) :
    encodeDataZeroAnc n anc x (n + j) = false
**Ancilla zero accessor.** For positions `n + j` with `j < anc`, the canonical encoding's bit is `false`. Proof: `n + anc - 1 - (n + j) = anc - 1 - j`, and `x * 2^anc / 2^(anc-1-j) = x * 2^(j+1)` (which is even, so `% 2 = 0`, so `decide (… = 1) = false`).
theorem encodeDataZeroAnc_ext
theorem encodeDataZeroAnc_ext
    {n anc x y : Nat}
    (hx : x < 2^n) (hy : y < 2^n)
    (hdata :
      ∀ i, i < n →
        encodeDataZeroAnc n anc x i = encodeDataZeroAnc n anc y i) :
    x = y
**Extensional injectivity of `encodeDataZeroAnc` on data bits.** If the data positions `0..n-1` of `encodeDataZeroAnc n anc x` and `encodeDataZeroAnc n anc y` agree pointwise, and both `x, y < 2^n`, then `x = y`. Proof: combine `encodeDataZeroAnc_data` (data bits = `nat_to_funbool n _`), `funbool_to_nat_congr` (agreement on `[0, n)` ⇒ same `funbool_to_nat`), and `funbool_to_nat_nat_to_funbool` (left inverse on `x < 2^n`).
theorem encodeDataZeroAnc_oob
theorem encodeDataZeroAnc_oob
    {n anc x i : Nat}
    (hanc_pos : 0 < anc) (hi : n + anc ≤ i) :
    encodeDataZeroAnc n anc x i = false
**Out-of-range accessor.** For positions `i ≥ n + anc`, the canonical encoding's bit is `false`, provided `anc ≥ 1`. Proof: the saturating Nat truncation gives `(n + anc) - 1 - i = 0`, so the value reduces to `decide ((x · 2^anc) % 2 = 1)`; `2^anc % 2 = 0` for `anc ≥ 1`, so the product is even and the decide returns `false`.
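The three accessor lemmas (data band, ancilla band, out of range) can be seen on one small input. The definitions below are hypothetical stand-ins reconstructed from the proof sketches in the docstrings above; the library's `nat_to_funbool` and `encodeDataZeroAnc` may be phrased differently.

```lean
-- Assumed big-endian bit extraction: bit i of an n-bit value x.
def natToFunboolSketch (n x i : Nat) : Bool :=
  decide ((x / 2^(n - 1 - i)) % 2 = 1)

-- Assumed canonical encoding: x in the data register, zeros in the ancilla.
def encodeSketch (n anc x : Nat) : Nat → Bool :=
  natToFunboolSketch (n + anc) (x * 2^anc)

-- n = 3 data bits, anc = 2 ancillas, x = 5 = 0b101.
-- Packed index: 5 * 2^2 = 20 = 0b10100.
example : encodeSketch 3 2 5 0 = true  := by decide  -- data band, most significant bit
example : encodeSketch 3 2 5 1 = false := by decide
example : encodeSketch 3 2 5 2 = true  := by decide
example : encodeSketch 3 2 5 3 = false := by decide  -- ancilla band
example : encodeSketch 3 2 5 4 = false := by decide
example : encodeSketch 3 2 5 7 = false := by decide  -- out of range (anc ≥ 1)
```

At position 7 the exponent `(3 + 2) - 1 - 7` truncates to `0`, so the bit is the parity of `20`, which matches the out-of-range argument.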
theorem eq_encodeDataZeroAnc_of_data_anc_oob
theorem eq_encodeDataZeroAnc_of_data_anc_oob
    {n anc y : Nat} {f : Nat → Bool}
    (hanc_pos : 0 < anc)
    (hy : y < 2^n)
    (hdata :
      ∀ i, i < n →
        f i = FormalRV.Framework.nat_to_funbool n y i)
    (hanc :
      ∀ j, j < anc →
        f (n + j) = false)
    (hoob :
      ∀ i, n + anc ≤ i →
**Full function reconstruction.** Any bit-function `f : Nat → Bool` that agrees with `nat_to_funbool n y` on the data band `[0, n)`, is `false` on the ancilla band `[n, n + anc)`, and is `false` outside `[0, n + anc)`, equals `encodeDataZeroAnc n anc y` as a function (under `0 < anc` and `y < 2^n`). Proved by `funext` + the three accessor lemmas `encodeDataZeroAnc_data`, `encodeDataZeroAnc_anc`, and `encodeDataZeroAnc_oob`. This is exactly the shape future modmult-correctness proofs need: the conclusion is the **function** equality `f = encodeDataZeroAnc n anc y`, not just pointwise equality on a finite band. Conversely, the hypotheses are local bit-by-bit statements that gate-IR correctness proofs naturally produce.
theorem Gate.applyNat_eq_encodeDataZeroAnc_of_data_anc
theorem Gate.applyNat_eq_encodeDataZeroAnc_of_data_anc
    {n anc y : Nat} {g : Gate} {input : Nat → Bool}
    (hanc_pos : 0 < anc) (hy : y < 2^n)
    (h_wt : Gate.WellTyped (n + anc) g)
    (hdata :
      ∀ i, i < n →
        Gate.applyNat g input i
          = FormalRV.Framework.nat_to_funbool n y i)
    (hanc :
      ∀ j, j < anc →
        Gate.applyNat g input (n + j) = false)
    (hinput_oob :
**`Gate.applyNat`-specific wrapper of `eq_encodeDataZeroAnc_of_data_anc_oob`.** For a well-typed `Gate` on `n + anc` qubits, applied to an input function whose OOB region (positions `i ≥ n + anc`) is already zero, pointwise agreement on the data band `[0, n)` and on the ancilla band `[n, n + anc)` suffices to conclude the *function* equality `Gate.applyNat g input = encodeDataZeroAnc n anc y`. The OOB branch of the reconstruction is discharged automatically by `Gate.applyNat_oob` together with the user-supplied `hinput_oob`. This is exactly the shape downstream modmult-correctness proofs will produce: data-region semantic correctness of the arithmetic circuit plus ancilla-restoration of the workspace, then this lemma packages them into the function equality consumed by `toUCom_satisfies_MultiplyCircuitProperty_of_applyNat_encodeDataZeroAnc`.
theorem toUCom_satisfies_MultiplyCircuitProperty_of_applyNat_encodeDataZeroAnc
theorem toUCom_satisfies_MultiplyCircuitProperty_of_applyNat_encodeDataZeroAnc
    {a N n anc : Nat} {g : Gate}
    (h_wt : Gate.WellTyped (n + anc) g)
    (hN : N ≤ 2^n)
    (h_apply :
      ∀ x : Nat, x < N →
        Gate.applyNat g (encodeDataZeroAnc n anc x)
          = encodeDataZeroAnc n anc ((a * x) % N)) :
    FormalRV.SQIRPort.MultiplyCircuitProperty a N n anc
      (Gate.toUCom (n + anc) g)
**Encoding-specific `MultiplyCircuitProperty` adapter.** Instantiates `toUCom_satisfies_MultiplyCircuitProperty_of_applyNat_ext` with the canonical `encodeDataZeroAnc` encoding. The user-side hypothesis reduces to the purely Boolean equality `Gate.applyNat g (encodeDataZeroAnc n anc x) = encodeDataZeroAnc n anc ((a * x) % N)`, with the additional bound `N ≤ 2^n` (necessary for `y < N` to imply `y < 2^n` so the encoding theorem applies). All matrix-vector machinery, all bit-order convention, and all index arithmetic are now hidden inside this theorem; downstream Boolean modmult correctness proofs need only reason about `Gate.applyNat`.
theorem uc_well_typed_toUCom_of_Gate_WellTyped
theorem uc_well_typed_toUCom_of_Gate_WellTyped
    (dim : Nat) (g : Gate) (h : Gate.WellTyped dim g) :
    FormalRV.SQIRPort.uc_well_typed (Gate.toUCom dim g)
**General `Gate.WellTyped` ⟹ `uc_well_typed (Gate.toUCom ...)` bridge.** For any `Gate` IR term `g`, structural well-typedness at dimension `dim` implies the compiled `BaseUCom` is well-typed at `dim`. Proven by structural induction on `g`.
theorem f_modmult_gate_family_uc_well_typed
theorem f_modmult_gate_family_uc_well_typed
    (bits N a multBits : Nat) (hbits : 1 ≤ bits) :
    ∀ i, FormalRV.SQIRPort.uc_well_typed
            (Gate.toUCom (multBits + (adder_n_qubits (bits + 1) + 1))
              (f_modmult_gate_family bits N a multBits i))
**`f_modmult_gate_family` is `uc_well_typed` at every iterate.** The analog of `f_modmult_circuit_uc_well_typed` for our gate family (at the Shor-compatible total dimension `multBits + (adder_n_qubits (bits+1) + 1)`). Note: this discharges the well-typedness obligation for OUR family, not directly for the SQIR-derived `f_modmult_circuit` (which is itself a top-level axiom; see QUESTIONS.md 2026-05-28 03:24 for the in-place/layout gap analysis).
theorem reverse_register_swap_encodeDataZeroAnc_to_mult_state_init
theorem reverse_register_swap_encodeDataZeroAnc_to_mult_state_init
    (bits multBits x : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (hx : x < 2^multBits) :
    Gate.applyNat
      (reverse_register_swap multBits 0 (adder_n_qubits (bits + 1)))
      (encodeDataZeroAnc multBits (adder_n_qubits (bits + 1) + 1) x)
    = mult_state_init bits multBits x
**HEADLINE: Reverse SWAP converts `encodeDataZeroAnc` to `mult_state_init`.** Applied to `encodeDataZeroAnc multBits (adder_n_qubits (bits+1) + 1) x`, the reverse-pairing SWAP between positions `[0, multBits)` and `[adder_n_qubits, adder_n_qubits + multBits)` produces `mult_state_init bits multBits x`.
theorem reverse_register_swap_involution
theorem reverse_register_swap_involution
    (bits multBits : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1) (f : Nat → Bool) :
    Gate.applyNat (reverse_register_swap multBits 0 (adder_n_qubits (bits + 1)))
      (Gate.applyNat (reverse_register_swap multBits 0 (adder_n_qubits (bits + 1))) f)
    = f
**Reverse-pairing SWAP is involutive.** Applying `reverse_register_swap multBits 0 (adder_n_qubits (bits+1))` twice returns the original state. This follows from the `_at_A`/`_at_B` position-level lemmas: each A-side position swaps to its B-side partner and back, and other positions are untouched.
theorem reverse_register_swap_mult_state_init_to_encodeDataZeroAnc
theorem reverse_register_swap_mult_state_init_to_encodeDataZeroAnc
    (bits multBits y : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (hy : y < 2^multBits) :
    Gate.applyNat
      (reverse_register_swap multBits 0 (adder_n_qubits (bits + 1)))
      (mult_state_init bits multBits y)
    = encodeDataZeroAnc multBits (adder_n_qubits (bits + 1) + 1) y
**Converse bridge: `mult_state_init` → `encodeDataZeroAnc`.** By involution applied to the forward bridge: since `reverse_register_swap` is involutive and converts `encodeDataZeroAnc x` to `mult_state_init x`, applying it once more to `mult_state_init x` yields `encodeDataZeroAnc x`.
def modMultInPlaceShor
def modMultInPlaceShor (bits N a ainv multBits : Nat) : Gate
**Shor-shaped in-place modular multiplier gate.** Three-stage composition: SWAP → in-place multiplier → SWAP. Takes `encodeDataZeroAnc` input and produces `encodeDataZeroAnc` output with the data register replaced by `(a*x) mod N`.
theorem modMultInPlaceShor_wellTyped
theorem modMultInPlaceShor_wellTyped
    (bits N a ainv multBits : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1) (h_multBits_pos : 0 < multBits) :
    Gate.WellTyped (multBits + (adder_n_qubits (bits + 1) + 1))
      (modMultInPlaceShor bits N a ainv multBits)
**WellTyped for `modMultInPlaceShor`.**
theorem modMultInPlaceShor_correct
theorem modMultInPlaceShor_correct
    (bits N a ainv multBits x : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (ha_pos : 0 < a) (ha_lt : a < N)
    (hainv_pos : 0 < ainv) (hainv_lt : ainv < N)
    (h_inv : a * ainv % N = 1)
    (hx_lt : x < N)
    (h_const_pos_a : ∀ j, j < multBits → 0 < (a * 2^j) % N)
    (h_const_pos_inv : ∀ j, j < multBits → 0 < ((N - ainv) % N * 2^j) % N) :
**HEADLINE: Layout-converting in-place modular multiplier correctness.** Applied to `encodeDataZeroAnc multBits (adder_n_qubits (bits+1) + 1) x`, the gate produces `encodeDataZeroAnc multBits (adder_n_qubits (bits+1) + 1) ((a*x) % N)`. This is the exact shape required by `toUCom_satisfies_MultiplyCircuitProperty_of_applyNat_encodeDataZeroAnc`.
theorem modMultInPlaceShor_MultiplyCircuitProperty
theorem modMultInPlaceShor_MultiplyCircuitProperty
    (bits N a ainv multBits : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (ha_pos : 0 < a) (ha_lt : a < N)
    (hainv_pos : 0 < ainv) (hainv_lt : ainv < N)
    (h_inv : a * ainv % N = 1)
    (h_const_pos_a : ∀ j, j < multBits → 0 < (a * 2^j) % N)
    (h_const_pos_inv : ∀ j, j < multBits → 0 < ((N - ainv) % N * 2^j) % N) :
    FormalRV.SQIRPort.MultiplyCircuitProperty a N multBits
**HEADLINE: `modMultInPlaceShor` satisfies `MultiplyCircuitProperty`.** The compiled `BaseUCom (multBits + (adder_n_qubits (bits+1) + 1))` from `Gate.toUCom` satisfies the SQIR-shape modular-multiplication property required by `Shor_correct_var` / `Shor_correct`. This is the structural Phase 6 obligation, blocked since Tick 10 (out-of-place vs in-place, layout mismatch) and now closed via path (A).
def our_modmult_family
noncomputable def our_modmult_family (bits N a ainv multBits : Nat) :
    Nat → FormalRV.SQIRPort.BaseUCom
            (multBits + (adder_n_qubits (bits + 1) + 1))
The Shor-shaped modular multiplication family indexed by QPE iterate. At iterate `i`, the gate multiplies by `a^(2^i) mod N` in-place. Each per-iterate gate uses `(a^(2^i)) % N` as its base multiplier (so the constant fits in `[0, N)`) and `(ainv^(2^i)) % N` as its modular inverse (since `(a*ainv) ≡ 1 (mod N)` implies `(a*ainv)^(2^i) ≡ 1 (mod N)`, hence `(a^(2^i)) * (ainv^(2^i)) ≡ 1 (mod N)`).
theorem our_modmult_family_uc_well_typed
theorem our_modmult_family_uc_well_typed
    (bits N a ainv multBits : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1) (h_multBits_pos : 0 < multBits) :
    ∀ i, FormalRV.SQIRPort.uc_well_typed
            (our_modmult_family bits N a ainv multBits i)
**WellTyped for the squared-power family.** For every iterate `i`, the compiled `BaseUCom` is well-typed at the Shor dimension.
theorem MultiplyCircuitProperty_mod_invariance
theorem MultiplyCircuitProperty_mod_invariance
    (a N n anc : Nat) (c : FormalRV.SQIRPort.BaseUCom (n + anc))
    (h : FormalRV.SQIRPort.MultiplyCircuitProperty (a % N) N n anc c) :
    FormalRV.SQIRPort.MultiplyCircuitProperty a N n anc c
**`MultiplyCircuitProperty` is invariant under modular reduction of the multiplier.** Since the MCP property mentions `a` only inside `(a * x) % N`, reducing `a` modulo `N` doesn't change the property.
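The arithmetic fact behind this invariance can be stated on its own. The sketch below uses only the core lemmas `Nat.mul_mod` and `Nat.mod_mod`; it illustrates the underlying identity and is not the library's proof.

```lean
-- Reducing the multiplier modulo N first does not change the product modulo N.
example (a N x : Nat) : (a % N * x) % N = (a * x) % N := by
  rw [Nat.mul_mod (a % N) x N, Nat.mod_mod, ← Nat.mul_mod]

-- Concrete instance: a = 17, N = 5, x = 3.
-- Left side: 2 * 3 = 6, and 6 % 5 = 1. Right side: 51 % 5 = 1.
example : (17 % 5 * 3) % 5 = (17 * 3) % 5 := by decide
```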
theorem our_modmult_family_mcp_per_iterate
theorem our_modmult_family_mcp_per_iterate
    (bits N a ainv multBits : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (h_inv_pow : ∀ i, (a^(2^i) % N) * (ainv^(2^i) % N) % N = 1)
    (h_pow_a_pos : ∀ i, 0 < a^(2^i) % N)
    (h_pow_ainv_pos : ∀ i, 0 < ainv^(2^i) % N)
    (h_const_pos_a_iter : ∀ i j, j < multBits → 0 < (a^(2^i) % N * 2^j) % N)
    (h_const_pos_inv_iter :
      ∀ i j, j < multBits → 0 < ((N - ainv^(2^i) % N) % N * 2^j) % N) :
**`our_modmult_family` satisfies `MultiplyCircuitProperty` at every iterate.** Combined with the WellTyped from Tick 26, this is the `ModMulImpl` evidence required by `Shor_correct_var`.
theorem our_modmult_family_ModMulImpl
theorem our_modmult_family_ModMulImpl
    (bits N a ainv multBits : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (h_inv_pow : ∀ i, (a^(2^i) % N) * (ainv^(2^i) % N) % N = 1)
    (h_pow_a_pos : ∀ i, 0 < a^(2^i) % N)
    (h_pow_ainv_pos : ∀ i, 0 < ainv^(2^i) % N)
    (h_const_pos_a_iter : ∀ i j, j < multBits → 0 < (a^(2^i) % N * 2^j) % N)
    (h_const_pos_inv_iter :
      ∀ i j, j < multBits → 0 < ((N - ainv^(2^i) % N) % N * 2^j) % N) :
**`our_modmult_family` is a `ModMulImpl`.** Direct reformulation of `our_modmult_family_mcp_per_iterate`.
theorem Shor_correct_with_our_family
theorem Shor_correct_with_our_family
    (bits N a ainv multBits m r : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m multBits)
    (h_inv_pow : ∀ i, (a^(2^i) % N) * (ainv^(2^i) % N) % N = 1)
    (h_pow_a_pos : ∀ i, 0 < a^(2^i) % N)
    (h_pow_ainv_pos : ∀ i, 0 < ainv^(2^i) % N)
    (h_const_pos_a_iter : ∀ i j, j < multBits → 0 < (a^(2^i) % N * 2^j) % N)
    (h_const_pos_inv_iter :
**HEADLINE: Shor's success-probability bound for our concrete in-place modular multiplier family.** Direct application of `Shor_correct_var` with `u := our_modmult_family bits N a ainv multBits`, using Tick 26's WellTyped and Tick 27's `ModMulImpl`. The user must supply `BasicSetting a r N m multBits` (the order-and-bounds hypothesis on `(a, r, N, m, multBits)`) plus the modular-arithmetic conditions required by Tick 27.
theorem coprime_mod_pos
theorem coprime_mod_pos (a N : Nat) (hN : 1 < N) (h_cop : Nat.Coprime a N) :
    0 < a % N
**`Nat.Coprime a N` and `1 < N` imply `0 < a % N`.** If `a % N = 0` then `N ∣ a`, so `N` divides `gcd a N = 1`, which forces `N = 1` and contradicts `1 < N`.
theorem coprime_pow
theorem coprime_pow (a N k : Nat) (h_cop : Nat.Coprime a N) :
    Nat.Coprime (a^k) N
**`gcd(a, N) = 1 → gcd(a^k, N) = 1`** via `Nat.Coprime.pow_left`.
theorem coprime_pow_mod_pos
theorem coprime_pow_mod_pos (a N k : Nat) (hN : 1 < N) (h_cop : Nat.Coprime a N) :
    0 < a^k % N
**`gcd(a, N) = 1` and `1 < N` imply `0 < a^k % N` for all `k`.** Combines `coprime_pow` and `coprime_mod_pos`.
theorem coprime_mul_pow_two_mod_pos
theorem coprime_mul_pow_two_mod_pos
    (a N k j : Nat) (hN : 1 < N) (h_cop : Nat.Coprime a N)
    (h_cop_two : Nat.Coprime 2 N) :
    0 < (a^k % N * 2^j) % N
**`gcd(a, N) = 1` and `gcd(2, N) = 1` (with `1 < N`) imply `0 < (a^k % N * 2^j) % N`.** The per-bit coprimality condition needed by `our_modmult_family`'s hypotheses, derived from a base coprimality of `a` and `2` with `N`.
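A small numeric check shows why coprimality with `2` is part of the hypotheses. The values `N = 15`, `a = 7` and the even-modulus counterexample `N = 4`, `a = 3` are illustrative choices, not taken from the library.

```lean
-- N = 15 is odd and gcd(7, 15) = 1: the shifted constants stay nonzero.
example : 0 < (7^1 % 15 * 2^0) % 15 := by decide   -- 7 % 15 = 7
example : 0 < (7^1 % 15 * 2^3) % 15 := by decide   -- 56 % 15 = 11

-- With an even modulus the conclusion can fail even though gcd(3, 4) = 1:
-- 3 * 2^2 = 12, and 12 % 4 = 0.
example : (3^1 % 4 * 2^2) % 4 = 0 := by decide
```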
theorem coprime_of_mul_mod_one
theorem coprime_of_mul_mod_one (a ainv N : Nat) (h_inv : a * ainv % N = 1) :
    Nat.Coprime a N
**`a * ainv % N = 1` implies `Nat.Coprime a N`.**
theorem coprime_inv_of_mul_mod_one
theorem coprime_inv_of_mul_mod_one (a ainv N : Nat) (h_inv : a * ainv % N = 1) :
    Nat.Coprime ainv N
**`a * ainv % N = 1` implies `Nat.Coprime ainv N`.**
theoremmul_pow_mod_one
theorem mul_pow_mod_one (a ainv N k : Nat) (hN : 1 < N) (h_inv : a * ainv % N = 1) :
    (a^k % N) * (ainv^k % N) % N = 1
**`a * ainv % N = 1` and `1 < N` imply `(a^k % N) * (ainv^k % N) % N = 1` for all `k`.**
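As a quick sanity check of the power-inverse relation, the sketch below evaluates it at the illustrative values `N = 15`, `a = 7`, `ainv = 13` (chosen here to match the worked example later in this module). It uses only core Lean 4 and does not depend on the module's lemmas.

```lean
-- Base inverse relation: 7 * 13 = 91 = 6 * 15 + 1.
example : 7 * 13 % 15 = 1 := by decide

-- The relation survives taking k-th powers on both sides; here k = 2 and k = 4.
-- 7^2 % 15 = 4 and 13^2 % 15 = 4, so the product is 16 % 15 = 1.
example : (7 ^ 2 % 15) * (13 ^ 2 % 15) % 15 = 1 := by decide
example : (7 ^ 4 % 15) * (13 ^ 4 % 15) % 15 = 1 := by decide
```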
theoremShor_correct_with_our_family_coprime
theorem Shor_correct_with_our_family_coprime
    (bits N a ainv multBits m r : Nat)
    (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (hN : N ≤ 2^bits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (h_N_gt_one : 1 < N)
    (h_cop_a : Nat.Coprime a N)
    (h_cop_two : Nat.Coprime 2 N)
    (h_inv : a * ainv % N = 1)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m multBits) :
**HEADLINE: Shor success-probability bound from minimal coprimality hypotheses.** Bundles Tick 28's `Shor_correct_with_our_family` with the derivations from `1 < N`, `Nat.Coprime a N`, `Nat.Coprime 2 N` (N odd), and `a * ainv % N = 1`. This is the SIMPLEST user-facing Shor success-probability theorem for our concrete in-place modular multiplier construction.
theoremBasicSetting_intro
theorem BasicSetting_intro
    (a r N m n : Nat)
    (h_a_pos : 0 < a) (h_a_lt : a < N)
    (h_ord : FormalRV.SQIRPort.Order a r N)
    (h_m_lo : N^2 < 2^m) (h_m_hi : 2^m ≤ 2 * N^2)
    (h_n_lo : N < 2^n) (h_n_hi : 2^n ≤ 2 * N) :
    FormalRV.SQIRPort.BasicSetting a r N m n
**Constructor for `BasicSetting`.** Bundles the four component conditions into the single anonymous-constructor form.
theoremcoprime_two_of_odd
theorem coprime_two_of_odd (N : Nat) (h_odd : Odd N) : Nat.Coprime 2 N
**`Nat.Coprime 2 N` from `Odd N`.** Direct invocation of `Odd.coprime_two_left`. Useful for users who think of "N odd" rather than "gcd(2, N) = 1".
theoremcoprime_two_iff_odd
theorem coprime_two_iff_odd (N : Nat) : Nat.Coprime 2 N ↔ Odd N
**`Nat.Coprime 2 N` iff `Odd N`.**
theoremShor_correct_with_our_family_at_canonical_dim
theorem Shor_correct_with_our_family_at_canonical_dim
    (N a ainv : Nat)
    (h_N_gt_one : 1 < N)
    (h_a_pos : 0 < a) (h_a_lt : a < N)
    (h_cop_a : Nat.Coprime a N)
    (h_cop_two : Nat.Coprime 2 N)
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.probability_of_success a
        (FormalRV.SQIRPort.ord a N) N
        (Nat.log2 (2 * N^2)) (Nat.log2 (2 * N))
        (adder_n_qubits (Nat.log2 (2 * N) + 1) + 1)
        (our_modmult_family (Nat.log2 (2 * N)) N a ainv (Nat.log2 (2 * N)))
**HEADLINE: Shor success-probability bound at canonical Shor parameters.** Specializes `Shor_correct_with_our_family_coprime` at `multBits := Nat.log2 (2 * N)` and `m := Nat.log2 (2 * N^2)` (the canonical Shor sizing), automatically deriving the `BasicSetting` log2 bounds from `1 < N`. This mirrors the canonical-dim choice in `Shor_correct` but uses our concrete in-place gate.
example(example)
example :
    FormalRV.SQIRPort.probability_of_success 7
        (FormalRV.SQIRPort.ord 7 15) 15
        (Nat.log2 (2 * 15^2)) (Nat.log2 (2 * 15))
        (adder_n_qubits (Nat.log2 (2 * 15) + 1) + 1)
        (our_modmult_family (Nat.log2 (2 * 15)) 15 7 13 (Nat.log2 (2 * 15)))
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 15 : ℝ)^4
**Concrete instantiation: Shor's bound at N=15, a=7, ainv=13.** Demonstrates that the canonical-dim theorem's hypotheses are fully decidable for concrete small N: every hypothesis closes by `decide`. N=15 is the smallest odd composite that is not a prime power (the standard Shor warm-up example); a=7 is coprime to 15 with order 4 (`7^4 = 2401 = 160*15 + 1`); ainv = 13 (since `7 * 13 = 91 = 6*15 + 1`).
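The numeric side conditions behind this instantiation can be reproduced independently. The sketch below uses only core Lean 4 (`Nat.log2`, `decide`) and none of the module's definitions. It checks the canonical sizes, the log2 bounds of the kind `BasicSetting_intro` asks for, and the order of 7 modulo 15.

```lean
#eval Nat.log2 (2 * 15)        -- 4  (data register size n)
#eval Nat.log2 (2 * 15 ^ 2)    -- 8  (phase register size m)

-- Bounds on m: N^2 < 2^m ≤ 2 * N^2, i.e. 225 < 256 ≤ 450.
example : 15 ^ 2 < 2 ^ 8 ∧ 2 ^ 8 ≤ 2 * 15 ^ 2 := by decide

-- Bounds on n: N < 2^n ≤ 2 * N, i.e. 15 < 16 ≤ 30.
example : 15 < 2 ^ 4 ∧ 2 ^ 4 ≤ 2 * 15 := by decide

-- 7 has order 4 modulo 15: no smaller positive power is congruent to 1.
example : 7 ^ 1 % 15 ≠ 1 ∧ 7 ^ 2 % 15 ≠ 1 ∧ 7 ^ 3 % 15 ≠ 1 ∧ 7 ^ 4 % 15 = 1 := by
  decide
```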
theoremBasicSetting_at_canonical_dim
theorem BasicSetting_at_canonical_dim
    (N a r : Nat) (h_N_gt_one : 1 < N)
    (h_a_pos : 0 < a) (h_a_lt : a < N)
    (h_ord : FormalRV.SQIRPort.Order a r N) :
    FormalRV.SQIRPort.BasicSetting a r N
      (Nat.log2 (2 * N^2)) (Nat.log2 (2 * N))
**`BasicSetting` at canonical Shor dimensions.** For any `1 < N`, `0 < a < N`, and `Order a r N`, the `BasicSetting` predicate holds at `m := Nat.log2 (2 * N^2)` and `n := Nat.log2 (2 * N)`. This packages the log2-bound derivations used by Shor's canonical-dim theorems for reuse.
example(example)
example :
    FormalRV.SQIRPort.probability_of_success 2
        (FormalRV.SQIRPort.ord 2 21) 21
        (Nat.log2 (2 * 21^2)) (Nat.log2 (2 * 21))
        (adder_n_qubits (Nat.log2 (2 * 21) + 1) + 1)
        (our_modmult_family (Nat.log2 (2 * 21)) 21 2 11 (Nat.log2 (2 * 21)))
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 21 : ℝ)^4
**Concrete instantiation: Shor's bound at N=21, a=2, ainv=11.** N=21 = 3·7 is the second-smallest odd composite that is not a prime power; a=2 has order 6 mod 21 (since 2^6 = 64 = 3·21 + 1 and no smaller positive power of 2 is 1 mod 21); ainv = 11 (since 2 · 11 = 22 = 21 + 1). As in Tick 33, every hypothesis closes by `decide`.
theoremmodMultInPlaceShor_qubit_count_at_canonical
theorem modMultInPlaceShor_qubit_count_at_canonical (N : Nat) :
    Nat.log2 (2 * N) + (adder_n_qubits (Nat.log2 (2 * N) + 1) + 1)
    = 4 * Nat.log2 (2 * N) + 6
**Total qubit count of `modMultInPlaceShor` at canonical Shor dimensions.** The gate occupies `4 * Nat.log2 (2 * N) + 6` qubits. Comparison with SQIR's placeholder `f_modmult_circuit`, where `n = Nat.log2 (2 * N)`:
- SQIR's `f_modmult_circuit` has dimension `n + modmult_rev_anc n = 3n + 1`.
- Our gate has dimension `multBits + (adder_n_qubits (multBits+1) + 1) = 4n + 6`.
- Overhead: `n + 5` more ancilla qubits than SQIR's placeholder.

This is the explicit cost of using the FAITHFULLY VERIFIED Gidney ripple-carry adder + in-place wrapper approach. The overhead pays for kernel-clean correctness across the entire Shor pipeline.
theoremmodMultInPlaceShor_qubit_count
theorem modMultInPlaceShor_qubit_count (bits multBits : Nat) :
    multBits + (adder_n_qubits (bits + 1) + 1)
    = multBits + 3 * bits + 6
**General total qubit count formula.**
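The two qubit-count formulas reduce to linear arithmetic once the adder width is known. The sketch below uses a hypothetical stand-in `adderQubitsSketch k := 3 * k + 2`, an assumption inferred from the stated identity `adder_n_qubits (bits + 1) + 1 = 3 * bits + 6`; the real `adder_n_qubits` is defined elsewhere in the project and may be phrased differently.

```lean
/-- Hypothetical stand-in for `adder_n_qubits` (assumed closed form `3k + 2`). -/
def adderQubitsSketch (k : Nat) : Nat := 3 * k + 2

-- General formula: multiplier register plus adder block plus one flag qubit.
example (bits multBits : Nat) :
    multBits + (adderQubitsSketch (bits + 1) + 1) = multBits + 3 * bits + 6 := by
  unfold adderQubitsSketch
  omega

-- Canonical case bits = multBits = n gives 4n + 6.
example (n : Nat) : n + (adderQubitsSketch (n + 1) + 1) = 4 * n + 6 := by
  unfold adderQubitsSketch
  omega
```

For `N = 15` the canonical size is `n = 4`, so under this assumption the gate would occupy `4 * 4 + 6 = 22` qubits.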
theoremBasicSetting_at_canonical_dim_from_coprime
theorem BasicSetting_at_canonical_dim_from_coprime
    (N a : Nat) (h_N_gt_one : 1 < N)
    (h_a_pos : 0 < a) (h_a_lt : a < N)
    (h_cop : Nat.Coprime a N) :
    FormalRV.SQIRPort.BasicSetting a (FormalRV.SQIRPort.ord a N) N
      (Nat.log2 (2 * N^2)) (Nat.log2 (2 * N))
**`BasicSetting` at canonical Shor dim from coprimality alone.** Variant of `BasicSetting_at_canonical_dim` (Tick 34) that takes `Nat.Coprime a N` instead of `Order a r N`, and uses `r := ord a N` as the order. The `Order` proof is derived internally via `ord_Order`.
theoremShor_correct_parametric_modmult
theorem Shor_correct_parametric_modmult
    (a r N m n anc : Nat)
    (f : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n)
    (h_mmi : FormalRV.SQIRPort.ModMulImpl a N n anc f)
    (h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (f i)) :
    FormalRV.SQIRPort.probability_of_success a r N m n anc f
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
**HEADLINE: Parametric Shor success-probability bound.** Direct re-export of `FormalRV.SQIRPort.Shor_correct_var` (in `PostQFT.lean`), highlighting that this theorem is already parametric in:
- `n` (data register size).
- `anc` (ancilla count; ANY natural number).
- `f : Nat → BaseUCom (n + anc)` (the modmult family, called `u` in `Shor_correct_var`).

No hardcoding of `modmult_rev_anc n`. Any family satisfying `BasicSetting`, `ModMulImpl`, and per-iterate `uc_well_typed` yields the canonical Shor success-probability bound `≥ κ / (Nat.log2 N)^4`.
theoremShor_correct_with_our_family_from_parametric
theorem Shor_correct_with_our_family_from_parametric
    (bits N a ainv multBits m r : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m multBits)
    (h_inv_pow : ∀ i, (a^(2^i) % N) * (ainv^(2^i) % N) % N = 1)
    (h_pow_a_pos : ∀ i, 0 < a^(2^i) % N)
    (h_pow_ainv_pos : ∀ i, 0 < ainv^(2^i) % N)
    (h_const_pos_a_iter : ∀ i j, j < multBits → 0 < (a^(2^i) % N * 2^j) % N)
    (h_const_pos_inv_iter :
**Our family instantiated via the parametric Shor theorem.** A thin wrapper around:
- `Shor_correct_parametric_modmult` (the parametric Shor theorem).
- `our_modmult_family_ModMulImpl` (Tick 27: `ModMulImpl` evidence).
- `our_modmult_family_uc_well_typed` (Tick 26: `WellTyped` evidence).

The user supplies the standard Shor hypotheses plus the per-iterate coprimality conditions; this theorem packages everything for our concrete `our_modmult_family`.
theoremsqir_placeholder_axioms_status
theorem sqir_placeholder_axioms_status :
    -- Our family has its own canonical total qubit count (4n + 6 vs SQIR's 3n + 1):
    ∀ bits multBits : Nat,
      multBits + (adder_n_qubits (bits + 1) + 1)
      = multBits + 3 * bits + 6
**Documentation theorem: SQIR placeholder axioms remain unchanged.** The original SQIR `f_modmult_circuit a ainv N n` (with `BaseUCom (n + modmult_rev_anc n)` shape) and its companion axioms remain placeholders in `SQIRPort/Shor.lean`. The concrete verified replacement is `our_modmult_family bits N a ainv multBits` with `BaseUCom (multBits + (adder_n_qubits (bits + 1) + 1))` shape. The parametric Shor theorem `Shor_correct_parametric_modmult` accepts EITHER shape (or any other satisfying the predicate-level hypotheses), so no dimension splicing or `modmult_rev_anc` redefinition is needed. This theorem holds trivially (its statement is just the qubit-count identity) and serves as a documentation anchor.
theoremour_modmult_family_hypotheses_from_inverse
theorem our_modmult_family_hypotheses_from_inverse
    (N a ainv multBits : Nat)
    (h_N_gt_one : 1 < N)
    (h_cop_two : Nat.Coprime 2 N)
    (h_inv : a * ainv % N = 1) :
    (∀ i, (a^(2^i) % N) * (ainv^(2^i) % N) % N = 1)
    ∧ (∀ i, 0 < a^(2^i) % N)
    ∧ (∀ i, 0 < ainv^(2^i) % N)
    ∧ (∀ i j, j < multBits → 0 < (a^(2^i) % N * 2^j) % N)
    ∧ (∀ i j, j < multBits → 0 < ((N - ainv^(2^i) % N) % N * 2^j) % N)
**Deliverable A: bundled per-iterate hypothesis generator.** Given `1 < N`, `a * ainv % N = 1`, and `Nat.Coprime 2 N`, derives all 5 of the per-iterate hypotheses required by `Shor_correct_with_our_family_from_parametric`.
theoremShor_correct_with_verified_modexp
theorem Shor_correct_with_verified_modexp
    (bits N a ainv multBits m r : Nat)
    (hbits : 1 ≤ bits)
    (hN_pos : 0 < N)
    (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_multBits_pos : 0 < multBits)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m multBits)
    (h_N_gt_one : 1 < N)
    (h_cop_two : Nat.Coprime 2 N)
    (h_inv : a * ainv % N = 1) :
**HEADLINE Deliverable B: Clean final theorem for the verified modular-exponentiation family.** The minimal-assumption form of the end-to-end Shor success-probability bound for our concrete in-place modular multiplier construction.

Mathematical assumptions (genuinely necessary):
- `1 < N`: non-trivial Shor instance.
- `Nat.Coprime 2 N`: N is odd (required so that `2^j` is coprime to N for the per-bit constant positivity).
- `a * ainv % N = 1`: the modular inverse relation. From this, `Nat.Coprime a N` and `Nat.Coprime ainv N` are derived internally.
- `BasicSetting a r N m multBits`: the standard Shor order + log2 bounds.

Structural sizing assumptions:
- `1 ≤ bits`, `multBits ≤ bits + 1`, `0 < multBits`.
- `N ≤ 2^bits`, `N ≤ 2^multBits`.
theoremfinal_review_status
theorem final_review_status :
    -- (1) Total qubit count of our family at canonical Shor dim.
    (∀ N, Nat.log2 (2 * N) + (adder_n_qubits (Nat.log2 (2 * N) + 1) + 1)
          = 4 * Nat.log2 (2 * N) + 6) ∧
    -- (2) Our gate uses more ancilla qubits than SQIR's modmult_rev_anc n = 2n + 1.
    -- Concretely: our (3*bits + 6) - SQIR's (2*bits + 1) = bits + 5 more ancillas.
    (∀ bits : Nat,
        adder_n_qubits (bits + 1) + 1 = 3 * bits + 6) ∧
    -- (3) Original SQIR axioms remain as placeholders for the SQIR-size circuit;
    --     the verified replacement is `our_modmult_family` (separate gate).
    (∀ bits multBits : Nat,
        multBits + (adder_n_qubits (bits + 1) + 1)
**Deliverable D: Final review theorem documenting the project state.** This theorem packages three structural facts as a triple conjunction:
1. The verified replacement gate's total qubit count formula.
2. The ancilla-count comparison with SQIR's `modmult_rev_anc n`.
3. The fact that SQIR's `f_modmult_circuit`-family axioms remain untouched placeholders (independent of our verified replacement).

Each conjunct is decidable / provable; the theorem serves as a documentation anchor for the final project state.
theoremour_modmult_family_anc_strictly_exceeds_sqir
theorem our_modmult_family_anc_strictly_exceeds_sqir (n : Nat) :
    FormalRV.SQIRPort.modmult_rev_anc n + (n + 5)
    = adder_n_qubits (n + 1) + 1
**Deliverable A: explicit ancilla count mismatch.** For all `n ≥ 0`, our family's ancilla budget `adder_n_qubits (n + 1) + 1 = 3n + 6` is strictly greater than SQIR's `modmult_rev_anc n = 2n + 1`. Difference: `n + 5 ≥ 5` ancillas.
theoremour_modmult_family_dim_strictly_exceeds_sqir
theorem our_modmult_family_dim_strictly_exceeds_sqir (n : Nat) :
    n + FormalRV.SQIRPort.modmult_rev_anc n + (n + 5)
    = n + (adder_n_qubits (n + 1) + 1)
**Total-dimension mismatch.** With `bits = multBits = n`, our total dimension `n + (adder_n_qubits (n + 1) + 1) = 4n + 6` exceeds SQIR's `n + modmult_rev_anc n = 3n + 1` by `n + 5`.
theoremsqir_anc_ne_our_anc
theorem sqir_anc_ne_our_anc (n : Nat) (h_n_pos : 0 < n) :
    n + FormalRV.SQIRPort.modmult_rev_anc n
    ≠ n + (adder_n_qubits (n + 1) + 1)
**Dimension mismatch between SQIR's oracle and ours.** The statement is an inequality of dimension indices: `n + modmult_rev_anc n = 3n + 1` differs from `n + (adder_n_qubits (n + 1) + 1) = 4n + 6`. `BaseUCom (3n + 1)` and `BaseUCom (4n + 6)` are therefore indexed at DIFFERENT dimensions. This is the formal obstacle that PREVENTS pointing `f_modmult_circuit` (return type `BaseUCom (n + modmult_rev_anc n) = BaseUCom (3n + 1)`) at our gate (return type `BaseUCom (n + (adder_n_qubits (n + 1) + 1)) = BaseUCom (4n + 6)`).
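The arithmetic behind the mismatch can be checked on the closed forms quoted in these docstrings (`modmult_rev_anc n = 2n + 1` and `adder_n_qubits (n + 1) + 1 = 3n + 6`). The sketch below substitutes those closed forms directly rather than using the project's definitions, so it illustrates the counting argument only.

```lean
-- Ancilla excess: SQIR's 2n + 1 plus n + 5 equals our 3n + 6.
example (n : Nat) : (2 * n + 1) + (n + 5) = 3 * n + 6 := by omega

-- Total dimensions 3n + 1 and 4n + 6 never coincide.
example (n : Nat) : n + (2 * n + 1) ≠ n + (3 * n + 6) := by omega
```

On the closed forms the inequality holds for every `n`, including `n = 0`, so the positivity hypothesis in the theorem is not what drives the mismatch.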
theoremsqir_axiom_closure_obstruction
theorem sqir_axiom_closure_obstruction
    (a ainv N n : Nat) (_h_a_lt : a < N) (_h_ainv_lt : ainv < N)
    (_h_inv : a * ainv % N = 1) (h_n_pos : 0 < n) :
    -- SQIR's expected oracle type:
    let sqir_dim
**Closure obstruction theorem (Deliverable C as a Lean statement).** Composite documentation theorem stating three facts about the closure of the original SQIR axioms:
1. SQIR's expected oracle type is `BaseUCom (n + modmult_rev_anc n) = BaseUCom (3n + 1)`.
2. Our family's oracle type is `BaseUCom (n + (adder_n_qubits (n + 1) + 1)) = BaseUCom (4n + 6)`.
3. These dimensions are not equal for any `n ≥ 1` (witnessed by the strictly positive ancilla excess `n + 5`).

Combined effect: any further closure of `f_modmult_circuit`, `f_modmult_circuit_MMI`, `f_modmult_circuit_uc_well_typed` must construct a new oracle family at the EXACT SQIR type, not embed our family.

FormalRV.Arithmetic.ModularAdder

FormalRV/Arithmetic/ModularAdder.lean
(no documented top-level declarations)

FormalRV.Arithmetic.ModularAdder.ModularAdderControlledPipeline

FormalRV/Arithmetic/ModularAdder/ModularAdderControlledPipeline.lean
theoremcontrolled_step5_true
theorem controlled_step5_true
    (bits N c x controlIdx flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc_pos : 0 < c) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (conditionalAddConstGate (bits + 1) (2^(bits+1) - c) controlIdx)
      (update (update (adder_input_F (bits + 1) 0 ((x + c) % N)) controlIdx true)
              flagIdx (decide ((x + c) < N)))
    = update (update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits c ((x + c) % N)))
                controlIdx true) flagIdx (decide ((x + c) < N))
Intermediate: applying step 5 of controlled pipeline (controlled sub c) with controlBit = true takes target from `(x+c) % N` to `subConstPow2WideSpec bits c ((x+c) % N)`.
theoremcontrolled_step6_true
theorem controlled_step6_true
    (bits N c x controlIdx flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc_pos : 0 < c) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (Gate.CCX controlIdx (target_idx bits) flagIdx)
      (update (update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits c ((x + c) % N)))
                  controlIdx true) flagIdx (decide ((x + c) < N)))
    = update (update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits c ((x + c) % N)))
                controlIdx true) flagIdx true
Intermediate: applying step 6 of the controlled pipeline (second CCX flag-copy) with controlBit = true sets flagIdx to `true` (the XOR of the comparison flag and its complement).
theoremcontrolled_step7_true
theorem controlled_step7_true
    (bits c x controlIdx flagIdx : Nat) (y : Nat)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (Gate.CX controlIdx flagIdx)
      (update (update (adder_input_F (bits + 1) 0 y) controlIdx true) flagIdx true)
    = update (update (adder_input_F (bits + 1) 0 y) controlIdx true) flagIdx false
Intermediate: applying step 7 of the controlled pipeline (a CX from controlIdx flipping flagIdx) takes flagIdx from `true` to `false`.
theoremcontrolled_step8_true
theorem controlled_step8_true
    (bits N c x controlIdx flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc_pos : 0 < c) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (conditionalAddConstGate (bits + 1) c controlIdx)
      (update (update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits c ((x + c) % N)))
                  controlIdx true) flagIdx false)
    = update (update (adder_input_F (bits + 1) 0 ((x + c) % N)) controlIdx true) flagIdx false
Intermediate: applying step 8 of controlled pipeline (final controlled add c) takes target from `subConstPow2WideSpec bits c ((x + c) % N)` to `(x + c) % N` via algebraic cancellation.
theoremcontrolledModAddConstGate_correct_true
theorem controlledModAddConstGate_correct_true
    (bits N c x : Nat) (controlIdx flagIdx : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N) (hc_pos : 0 < c) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (controlledModAddConstGate bits N c controlIdx flagIdx)
      (update (adder_input_F (bits + 1) 0 x) controlIdx true)
    = update (adder_input_F (bits + 1) 0 ((x + c) % N)) controlIdx true
**Tick 6p HEADLINE: `controlBit = true` branch.** When the control bit is `true`, the full 8-step pipeline produces target = `(x + c) % N` with all workspace restored.
theoremcontrolledModAddConstGate_correct
theorem controlledModAddConstGate_correct
    (bits N c x : Nat) (controlBit : Bool) (controlIdx flagIdx : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N) (hc_pos : 0 < c) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (controlledModAddConstGate bits N c controlIdx flagIdx)
      (update (adder_input_F (bits + 1) 0 x) controlIdx controlBit)
    = update (adder_input_F (bits + 1) 0 (if controlBit then (x + c) % N else x))
        controlIdx controlBit
**Tick 6 HEADLINE: full `controlledModAddConstGate_correct`.** For any `controlBit`, the 8-step pipeline produces target = `if controlBit then (x + c) % N else x` with all workspace restored.
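At the level of natural numbers, the gate's specification is a one-line function, and the comparison flag `decide (x + c < N)` seen in the intermediate steps corresponds to a single conditional subtraction. The sketch below is a classical model only; `ctrlModAddSpec` is a hypothetical helper name, not a project definition, and the proof assumes the core lemmas `Nat.mod_eq_of_lt` and `Nat.mod_eq_sub_mod` are available under those names in the toolchain in use.

```lean
/-- Hypothetical classical specification of the controlled modular add. -/
def ctrlModAddSpec (N c x : Nat) (controlBit : Bool) : Nat :=
  if controlBit then (x + c) % N else x

#eval ctrlModAddSpec 15 7 11 true   -- 3, since (11 + 7) % 15 = 3
#eval ctrlModAddSpec 15 7 11 false  -- 11, target left unchanged

-- For x, c < N, one conditional subtraction of N implements the reduction.
example (N c x : Nat) (hx : x < N) (hc : c < N) :
    (x + c) % N = if x + c < N then x + c else x + c - N := by
  by_cases h : x + c < N
  · rw [if_pos h, Nat.mod_eq_of_lt h]
  · have hlt : x + c - N < N := by omega
    rw [if_neg h, Nat.mod_eq_sub_mod (Nat.le_of_not_lt h), Nat.mod_eq_of_lt hlt]
```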
theoremmodMultConstGateAux_zero
theorem modMultConstGateAux_zero (bits N a multBits : Nat) :
    modMultConstGateAux bits N a multBits 0 = Gate.I
`modMultConstGateAux ... 0 = Gate.I` by definition.
theoremmodMultConstGate_zero
theorem modMultConstGate_zero (bits N a : Nat) :
    modMultConstGate bits N a 0 = Gate.I
`modMultConstGate ... 0 = Gate.I` (zero-bit multiplier is the identity).
theoremmodMultConstGateAux_succ
theorem modMultConstGateAux_succ (bits N a multBits k : Nat) :
    modMultConstGateAux bits N a multBits (k + 1)
    = Gate.seq
        (modMultConstGateAux bits N a multBits k)
        (controlledModAddConstGate bits N ((a * 2^k) % N)
          (adder_n_qubits (bits + 1) + k)
          (adder_n_qubits (bits + 1) + multBits))
Recursive unfolding: `modMultConstGateAux ... (k+1)` is the seq of the `k`-step and the controlled add at bit `k`.
theoremmodMultConstGateAux_wellTyped
theorem modMultConstGateAux_wellTyped
    (bits N a multBits k : Nat) (hbits : 1 ≤ bits) (hk : k ≤ multBits) :
    Gate.WellTyped (adder_n_qubits (bits + 1) + multBits + 1)
      (modMultConstGateAux bits N a multBits k)
Well-typedness of the auxiliary gate at width `adder_n_qubits (bits+1) + multBits + 1` for any `k ≤ multBits`.
theoremmodMultConstGate_wellTyped
theorem modMultConstGate_wellTyped
    (bits N a multBits : Nat) (hbits : 1 ≤ bits) :
    Gate.WellTyped (adder_n_qubits (bits + 1) + multBits + 1)
      (modMultConstGate bits N a multBits)
**Well-typedness of `modMultConstGate`.** The full multiplier gate is well-typed at width `adder_n_qubits (bits+1) + multBits + 1` (adder block + `multBits` multiplier qubits + 1 flag qubit).
theoremmodMultConstGateAux_correct_zero
theorem modMultConstGateAux_correct_zero
    (bits N a multBits : Nat) (f : Nat → Bool) :
    Gate.applyNat (modMultConstGateAux bits N a multBits 0) f = f
Base case: the zero-step multiplier auxiliary gate is identity.
theoremmodMultConstGate_correct_zero
theorem modMultConstGate_correct_zero
    (bits N a : Nat) (f : Nat → Bool) :
    Gate.applyNat (modMultConstGate bits N a 0) f = f
Special case at `multBits = 0`: the full multiplier gate is identity (no multiplier bits to control).
theoremmodMultConstGateAux_apply_succ
theorem modMultConstGateAux_apply_succ
    (bits N a multBits k : Nat) (f : Nat → Bool) :
    Gate.applyNat (modMultConstGateAux bits N a multBits (k + 1)) f
    = Gate.applyNat
        (controlledModAddConstGate bits N ((a * 2^k) % N)
          (adder_n_qubits (bits + 1) + k)
          (adder_n_qubits (bits + 1) + multBits))
        (Gate.applyNat (modMultConstGateAux bits N a multBits k) f)
State-level unfolding for the recursive step.
theoremcontrolledModAddConstGate_commute_update_outer
theorem controlledModAddConstGate_commute_update_outer
    (bits N c controlIdx flagIdx p : Nat) (v : Bool) (hbits : 1 ≤ bits)
    (hp_dim : adder_n_qubits (bits + 1) ≤ p)
    (h_p_ne_ctrl : p ≠ controlIdx) (h_p_ne_flag : p ≠ flagIdx) :
    ∀ (f : Nat → Bool),
      Gate.applyNat (controlledModAddConstGate bits N c controlIdx flagIdx)
          (update f p v)
      = update (Gate.applyNat (controlledModAddConstGate bits N c controlIdx flagIdx) f)
          p v
**Commute lemma for `controlledModAddConstGate`.** The gate commutes with an `update _ p v` when `p` is outside the gate's read/write set: `p ≥ adder_n_qubits (bits+1)` (above the adder block), `p ≠ controlIdx`, and `p ≠ flagIdx`. This is the key infrastructure for the inductive multiplier correctness proof, where each iteration's gate must commute past updates at OTHER multiplier-bit positions.
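The simplest instance of this commutation pattern is two point updates at distinct positions. The sketch below states it with Mathlib's `Function.update` and `Function.update_comm`; whether the module's `update` is that same function is an assumption made here for illustration.

```lean
import Mathlib

-- Updates at distinct positions p ≠ q can be applied in either order.
example (f : Nat → Bool) (p q : Nat) (u v : Bool) (h : p ≠ q) :
    Function.update (Function.update f p u) q v
      = Function.update (Function.update f q v) p u :=
  Function.update_comm h u v f
```

The commute lemma extends this from a single update to a whole gate: if every position the gate reads or writes differs from `p`, the gate's action can be moved past `update _ p v`.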
theoremmodMultConstGateAux_commute_update_outer
theorem modMultConstGateAux_commute_update_outer
    (bits N a multBits k p : Nat) (v : Bool) (hbits : 1 ≤ bits)
    (hk : k ≤ multBits) (hp : adder_n_qubits (bits + 1) + multBits < p) :
    ∀ (f : Nat → Bool),
      Gate.applyNat (modMultConstGateAux bits N a multBits k) (update f p v)
      = update (Gate.applyNat (modMultConstGateAux bits N a multBits k) f) p v
**Commute lemma for `modMultConstGateAux`.** At positions strictly above the multiplier circuit's flag (i.e., `p > adder_n_qubits (bits+1) + multBits`), an `update _ p v` commutes through the full multiplier auxiliary gate. Proven directly via `applyNat_commute_update_above_dim` applied to `modMultConstGateAux_wellTyped`.
theoremmodMultConstGate_commute_update_outer
theorem modMultConstGate_commute_update_outer
    (bits N a multBits p : Nat) (v : Bool) (hbits : 1 ≤ bits)
    (hp : adder_n_qubits (bits + 1) + multBits < p) :
    ∀ (f : Nat → Bool),
      Gate.applyNat (modMultConstGate bits N a multBits) (update f p v)
      = update (Gate.applyNat (modMultConstGate bits N a multBits) f) p v
**Commute lemma for `modMultConstGate`.** Specialization of the aux-level commute lemma at `k = multBits`.
theoremmodMultConstGateAux_commute_update_mult_pos_above
theorem modMultConstGateAux_commute_update_mult_pos_above
    (bits N a multBits k j : Nat) (v : Bool) (hbits : 1 ≤ bits)
    (hk : k ≤ multBits) (hjk : k ≤ j) (hj : j < multBits) :
    ∀ (f : Nat → Bool),
      Gate.applyNat (modMultConstGateAux bits N a multBits k)
          (update f (adder_n_qubits (bits + 1) + j) v)
      = update (Gate.applyNat (modMultConstGateAux bits N a multBits k) f)
          (adder_n_qubits (bits + 1) + j) v
**`modMultConstGateAux` commute lemma at a multiplier-bit position.** For positions in the multiplier-bit range `p = adder_n_qubits (bits+1) + j` with `j < multBits` AND `j ≥ k` (i.e., a multiplier bit that has NOT yet been touched by iterations `0, 1, ..., k-1`), `update _ p v` commutes through `modMultConstGateAux bits N a multBits k`. Proven by induction on `k`, using `controlledModAddConstGate_commute_update_outer` for the step.
theoremmult_input_F_aux_succ
theorem mult_input_F_aux_succ (bits multBits m i : Nat) (f : Nat → Bool) :
    mult_input_F_aux bits multBits m (i + 1) f
    = update (mult_input_F_aux bits multBits m i f)
             (adder_n_qubits (bits + 1) + i) (Nat.testBit m i)
Recursion unfolding for the aux at `i+1`.
theoremmult_input_F_aux_at_mult_pos
theorem mult_input_F_aux_at_mult_pos
    (bits multBits m i j : Nat) (hj : j < i) (f : Nat → Bool) :
    mult_input_F_aux bits multBits m i f (adder_n_qubits (bits + 1) + j)
    = Nat.testBit m j
Decoder at multiplier-bit positions: `mult_input_F_aux ... i f` at position `adder_n_qubits (bits+1) + j` returns `Nat.testBit m j`, when `j < i` (i.e., bit `j` has been written by some iteration ≤ i-1).
theoremmult_input_F_aux_at_non_mult_pos
theorem mult_input_F_aux_at_non_mult_pos
    (bits multBits m i p : Nat)
    (h_outside : p < adder_n_qubits (bits + 1) ∨ adder_n_qubits (bits + 1) + i ≤ p)
    (f : Nat → Bool) :
    mult_input_F_aux bits multBits m i f p = f p
Decoder at non-multiplier positions: `mult_input_F_aux ... i f` at position `p` outside the multiplier-bit range `[adder_n_qubits (bits+1), adder_n_qubits (bits+1) + i)` equals `f p`.
theoremmult_input_F_at_mult_pos
theorem mult_input_F_at_mult_pos
    (bits multBits x m j : Nat) (hj : j < multBits) :
    mult_input_F bits multBits x m (adder_n_qubits (bits + 1) + j)
    = Nat.testBit m j
Top-level decoder at multiplier-bit position.
theoremmult_input_F_at_non_mult_pos
theorem mult_input_F_at_non_mult_pos
    (bits multBits x m p : Nat)
    (h_outside : p < adder_n_qubits (bits + 1)
                 ∨ adder_n_qubits (bits + 1) + multBits ≤ p) :
    mult_input_F bits multBits x m p = adder_input_F (bits + 1) 0 x p
Top-level decoder at non-multiplier positions: equal to `adder_input_F (bits+1) 0 x`.
theoremmult_input_F_aux_commute_update_above
theorem mult_input_F_aux_commute_update_above
    (bits multBits m i j : Nat) (hj : i ≤ j) (v : Bool) (f : Nat → Bool) :
    mult_input_F_aux bits multBits m i (update f (adder_n_qubits (bits + 1) + j) v)
    = update (mult_input_F_aux bits multBits m i f)
             (adder_n_qubits (bits + 1) + j) v
`mult_input_F_aux` commutes with an `update _ (adder_n_qubits (bits+1) + j) v` when `j ≥ i` (i.e., the first `i` iterations have not touched position `adder_n_qubits (bits+1) + j`).
theoremmult_input_F_isolate_k
theorem mult_input_F_isolate_k
    (bits multBits x m k : Nat) (hk : k < multBits) :
    mult_input_F bits multBits x m
    = mult_input_F_aux bits multBits m multBits
        (update (adder_input_F (bits + 1) 0 x)
                (adder_n_qubits (bits + 1) + k) (Nat.testBit m k))
**`mult_input_F` isolation at position `k`.** For `k < multBits`, the full multiplier-encoded input is equal to `mult_input_F_aux` at iteration `multBits` applied to a base that already carries the k-th multiplier update on `adder_input_F`. Writing `pos k` for `adder_n_qubits (bits+1) + k`: the k-th iteration of the aux overwrites `pos k` with the same value (`Nat.testBit m k`), so the additional update is absorbed; outside the multiplier range the update at `pos k` is transparent.
theoremmult_input_F_aux_absorb_at_k_position
theorem mult_input_F_aux_absorb_at_k_position
    (bits multBits m k : Nat) (f : Nat → Bool) :
    mult_input_F_aux bits multBits m (k + 1)
        (update f (adder_n_qubits (bits + 1) + k) (Nat.testBit m k))
    = mult_input_F_aux bits multBits m k
        (update f (adder_n_qubits (bits + 1) + k) (Nat.testBit m k))
Absorption lemma. By `mult_input_F_aux_succ`, the aux at iteration `k + 1` is the aux at iteration `k` followed by an outer `update` that writes `Nat.testBit m k` at position `adder_n_qubits (bits+1) + k`. Here the base already carries that value at that position, and the aux at iteration `k` does not touch it, so the outer update rewrites a value that is already present and is a no-op.
theoremCMAcg_on_mult_input_F_aux_iso
theorem CMAcg_on_mult_input_F_aux_iso
    (bits N c x m multBits k : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N) (hc_pos : 0 < c) (hc : c < N) (hk : k < multBits) :
    ∀ i, i ≤ multBits →
    Gate.applyNat
      (controlledModAddConstGate bits N c
        (adder_n_qubits (bits + 1) + k) (adder_n_qubits (bits + 1) + multBits))
      (mult_input_F_aux bits multBits m i
        (update (adder_input_F (bits + 1) 0 x)
                (adder_n_qubits (bits + 1) + k) (Nat.testBit m k)))
    = mult_input_F_aux bits multBits m i
Inductive helper for the single-step correctness on `mult_input_F`.
theoremcontrolledModAddConstGate_on_mult_input_F
theorem controlledModAddConstGate_on_mult_input_F
    (bits N c x m multBits k : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N) (hc_pos : 0 < c) (hc : c < N) (hk : k < multBits) :
    Gate.applyNat
      (controlledModAddConstGate bits N c
        (adder_n_qubits (bits + 1) + k) (adder_n_qubits (bits + 1) + multBits))
      (mult_input_F bits multBits x m)
    = mult_input_F bits multBits
        (if Nat.testBit m k then (x + c) % N else x) m
**Single-step correctness for `controlledModAddConstGate` on `mult_input_F`.** Applied to the multiplier-encoded input `mult_input_F bits multBits x m`, the controlled modular-add gate (controlled by the `k`-th multiplier qubit, with shared flag at position `adder_n_qubits (bits+1) + multBits`) advances the adder's target register from `x` to `(x + c) % N` when bit `k` of `m` is set, or leaves it unchanged otherwise.
lemmam_mod_two_pow_succ_eq
lemma m_mod_two_pow_succ_eq (m k : Nat) :
    m % 2^(k+1) = m % 2^k + (m / 2^k % 2) * 2^k
**Bit decomposition for the next power of two.** `m mod 2^(k+1) = m mod 2^k + (testBit m k as Nat) * 2^k`.
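A small numeric check makes the decomposition concrete. The values `m = 13`, `m = 10`, and `k = 2` below are arbitrary illustrative choices, and the snippet uses only core Lean 4.

```lean
-- m = 13 = 0b1101, k = 2: bit 2 is set.
-- 13 % 8 = 5 and 13 % 4 + (13 / 4 % 2) * 4 = 1 + 1 * 4 = 5.
example : 13 % 2 ^ (2 + 1) = 13 % 2 ^ 2 + (13 / 2 ^ 2 % 2) * 2 ^ 2 := by decide

#eval Nat.testBit 13 2   -- true, matching the factor 13 / 4 % 2 = 1

-- m = 10 = 0b1010, k = 2: bit 2 is clear, so nothing is added.
-- 10 % 8 = 2 and 10 % 4 + (10 / 4 % 2) * 4 = 2 + 0 * 4 = 2.
example : 10 % 2 ^ (2 + 1) = 10 % 2 ^ 2 + (10 / 2 ^ 2 % 2) * 2 ^ 2 := by decide
```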
theoremmodMultConstGateAux_correct
theorem modMultConstGateAux_correct
    (bits N a multBits x m : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < N)
    (h_const_pos : ∀ j, j < multBits → 0 < (a * 2^j) % N) :
    ∀ k, k ≤ multBits →
    Gate.applyNat (modMultConstGateAux bits N a multBits k)
                  (mult_input_F bits multBits x m)
    = mult_input_F bits multBits ((x + a * (m % 2^k)) % N) m
**Inductive correctness for `modMultConstGateAux`.** At iteration `k ≤ multBits`, the aux gate has advanced the adder's target from `x` to `(x + a * (m mod 2^k)) mod N`, given that each per-bit constant `(a * 2^j) % N` is non-zero for `j < multBits`.
theoremmodMultConstGate_correct
theorem modMultConstGate_correct
    (bits N a multBits x m : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < N)
    (hm : m < 2^multBits)
    (h_const_pos : ∀ j, j < multBits → 0 < (a * 2^j) % N) :
    Gate.applyNat (modMultConstGate bits N a multBits)
                  (mult_input_F bits multBits x m)
    = mult_input_F bits multBits ((x + a * m) % N) m
**Modular multiplier correctness.** When `m < 2^multBits`, the modular multiplier gate sends the adder's target from `x` to `(x + a * m) mod N`, while preserving the multiplier register `m` and the flag. Equivalent form: each multiplier-bit `i` contributes `(a * 2^i) mod N` to the target when set.
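The per-bit accumulation can be pictured with a purely arithmetic model. The sketch below is not the Gate-IR definition: `modMultModel` is a hypothetical helper that mirrors the recursion of `modMultConstGateAux` on natural numbers, adding `(a * 2^k) % N` modulo `N` whenever bit `k` of `m` is set.

```lean
/-- Hypothetical arithmetic model of the multiplier loop (illustration only):
    after `k` steps the accumulator has absorbed bits `0 .. k-1` of `m`. -/
def modMultModel (N a m x : Nat) : Nat → Nat
  | 0     => x
  | k + 1 =>
    let acc := modMultModel N a m x k
    if m.testBit k then (acc + (a * 2^k) % N) % N else acc

-- N = 15, a = 7, m = 5 (binary 101), x = 3, multBits = 3.
#eval modMultModel 15 7 5 3 3   -- 8
#eval (3 + 7 * 5) % 15          -- 8
```

Both evaluations agree, matching the closed form `(x + a * m) % N` in the theorem statement for this one instance.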
theoremmult_state_init_at_mult_pos
theorem mult_state_init_at_mult_pos
    (bits multBits x j : Nat) (hj : j < multBits) :
    mult_state_init bits multBits x (adder_n_qubits (bits + 1) + j)
    = Nat.testBit x j
Decoder at multiplier-bit positions.
theoremmult_state_init_at_non_mult_pos
theorem mult_state_init_at_non_mult_pos
    (bits multBits x p : Nat)
    (h_outside : p < adder_n_qubits (bits + 1)
                 ∨ adder_n_qubits (bits + 1) + multBits ≤ p) :
    mult_state_init bits multBits x p = adder_input_F (bits + 1) 0 0 p
Decoder at non-multiplier positions: zero.
theoremmodMultConstGate_on_init_correct
theorem modMultConstGate_on_init_correct
    (bits N a multBits x : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < 2^multBits)
    (h_const_pos : ∀ j, j < multBits → 0 < (a * 2^j) % N) :
    Gate.applyNat (modMultConstGate bits N a multBits)
                  (mult_state_init bits multBits x)
    = mult_input_F bits multBits ((a * x) % N) x
**Modular multiplier on the initial input state.** When applied to `mult_state_init bits multBits x` (multiplier register holds `x`, adder zeroed), the gate produces a state whose adder-target register encodes `(a * x) mod N` while the multiplier register `x` is preserved. Hypotheses ensure each per-bit constant `(a * 2^j) % N` is positive (Shor's coprimality condition) and `x < 2^multBits`.
theoremmodMultConstGate_wellTyped_at_shor_dim
theorem modMultConstGate_wellTyped_at_shor_dim
    (bits N a multBits : Nat) (hbits : 1 ≤ bits) :
    Gate.WellTyped (multBits + (adder_n_qubits (bits + 1) + 1))
      (modMultConstGate bits N a multBits)
**WellTyped corollary at the Shor-compatible dimension.** Setting `n := multBits` (the data register size) and `anc := adder_n_qubits (bits+1) + 1` (the workspace including the flag), the modular multiplier gate is well-typed at dimension `n + anc`, matching the shape required by `encodeDataZeroAnc n anc` and `MultiplyCircuitProperty a N n anc`.
theoremf_modmult_step_gate_wellTyped
theorem f_modmult_step_gate_wellTyped
    (bits N a multBits i : Nat) (hbits : 1 ≤ bits) :
    Gate.WellTyped (multBits + (adder_n_qubits (bits + 1) + 1))
      (f_modmult_step_gate bits N a multBits i)
**WellTyped** for the step gate at the Shor-compatible dimension.
theoremf_modmult_step_gate_wellTyped_aux
theorem f_modmult_step_gate_wellTyped_aux
    (bits N a multBits i : Nat) (hbits : 1 ≤ bits) :
    Gate.WellTyped (adder_n_qubits (bits + 1) + multBits + 1)
      (f_modmult_step_gate bits N a multBits i)
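**WellTyped** at the original aux dimension `adder_n_qubits (bits + 1) + multBits + 1`, which is numerically equal to the Shor-compatible dimension.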
theoremf_modmult_step_gate_on_init_correct
theorem f_modmult_step_gate_on_init_correct
    (bits N a multBits i x : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < 2^multBits)
    (h_const_pos :
      ∀ j, j < multBits → 0 < ((a^(2^i) % N) * 2^j) % N) :
    Gate.applyNat (f_modmult_step_gate bits N a multBits i)
                  (mult_state_init bits multBits x)
    = mult_input_F bits multBits ((a^(2^i) * x) % N) x
**Step correctness on the initial state.** Applied to `mult_state_init bits multBits x`, the step gate at iterate `i` produces a state whose adder-target register holds `(a^(2^i) * x) % N` while the multiplier register `x` is preserved. Hypotheses ensure each per-bit constant `((a^(2^i) % N) * 2^j) % N` is positive (the analogue of Shor's coprimality condition for the squared base).
theoremf_modmult_gate_family_wellTyped
theorem f_modmult_gate_family_wellTyped
    (bits N a multBits : Nat) (hbits : 1 ≤ bits) :
    ∀ i, Gate.WellTyped (multBits + (adder_n_qubits (bits + 1) + 1))
            (f_modmult_gate_family bits N a multBits i)
**Family-level WellTyped.** For every iterate `i`, the gate `f_modmult_gate_family bits N a multBits i` is `Gate.WellTyped` at the Shor-compatible dimension `n + anc = multBits + (adder_n_qubits (bits+1) + 1)`.
theoremf_modmult_gate_family_on_init_correct
theorem f_modmult_gate_family_on_init_correct
    (bits N a multBits : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_const_pos :
      ∀ i, ∀ j, j < multBits → 0 < ((a^(2^i) % N) * 2^j) % N) :
    ∀ i x, x < 2^multBits →
      Gate.applyNat (f_modmult_gate_family bits N a multBits i)
                    (mult_state_init bits multBits x)
      = mult_input_F bits multBits ((a^(2^i) * x) % N) x
**Family-level out-of-place correctness on the initial state.** For each iterate `i`, applied to `mult_state_init bits multBits x`, the family member produces a state with adder-target register holding `(a^(2^i) * x) mod N` and multiplier register `x` preserved.
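To make the family of constants concrete, the sketch below lists `a^(2^i) % N` for the illustrative choice `a = 7`, `N = 15` and checks the positivity hypothesis `h_const_pos` for `i, j < 4`. The parameter values are assumptions for the example, not values taken from the library.

```lean
-- Constants a^(2^i) % N for a = 7, N = 15, i = 0 .. 3.
#eval (List.range 4).map (fun i => 7^(2^i) % 15)   -- [7, 4, 1, 1]

-- Per-bit constants ((a^(2^i) % N) * 2^j) % N are all positive for i, j < 4.
#eval (List.range 4).all (fun i =>
  (List.range 4).all (fun j => decide (0 < ((7^(2^i) % 15) * 2^j) % 15)))   -- true
```

Positivity holds here because every constant is coprime to 15, so no multiple by a power of two vanishes modulo 15.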
theoremqubit_swap_wellTyped
theorem qubit_swap_wellTyped (dim a b : Nat)
    (ha : a < dim) (hb : b < dim) (hab : a ≠ b) :
    Gate.WellTyped dim (qubit_swap a b)
Well-typedness for `qubit_swap`.
theoremqubit_swap_correct
theorem qubit_swap_correct (a b : Nat) (f : Nat → Bool) (hab : a ≠ b) :
    Gate.applyNat (qubit_swap a b) f
    = update (update f a (f b)) b (f a)
**Boolean-state correctness for SWAP.** Applied to `f`, the swap gate produces a state with values at positions `a` and `b` exchanged.
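The exchange can be traced on a small Boolean state. The sketch below is a standalone model, assuming the swap is built as `CX a b ; CX b a ; CX a b`; the helpers `upd`, `cx`, `swap3`, and `f0` are hypothetical stand-ins and not the library's `update`, `Gate.CX`, or `qubit_swap`.

```lean
/-- Hypothetical stand-in for the library's `update`. -/
def upd (f : Nat → Bool) (i : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = i then v else f j

/-- Classical action of CX: XOR the control bit into the target bit. -/
def cx (c t : Nat) (f : Nat → Bool) : Nat → Bool :=
  upd f t (f t != f c)

/-- Three-CNOT swap, applied in the order CX a b ; CX b a ; CX a b. -/
def swap3 (a b : Nat) (f : Nat → Bool) : Nat → Bool :=
  cx a b (cx b a (cx a b f))

/-- Test state: only position 0 is set. -/
def f0 : Nat → Bool := fun j => j == 0

#eval (swap3 0 1 f0 0, swap3 0 1 f0 1, swap3 0 1 f0 2)   -- (false, true, false)
```

The set bit moves from position 0 to position 1 and position 2 is untouched, which is the shape of the right-hand side `update (update f a (f b)) b (f a)`.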
theoremregister_swap_aux_succ
theorem register_swap_aux_succ
    (offsetA offsetB k : Nat) :
    register_swap_aux offsetA offsetB (k + 1)
    = Gate.seq (register_swap_aux offsetA offsetB k)
               (qubit_swap (offsetA + k) (offsetB + k))
Recursion unfolding for `register_swap_aux`.
theoremregister_swap_aux_wellTyped
theorem register_swap_aux_wellTyped
    (dim offsetA offsetB k : Nat) (hdim : 0 < dim)
    (hA : offsetA + k ≤ dim) (hB : offsetB + k ≤ dim)
    (h_disjoint : offsetA + k ≤ offsetB ∨ offsetB + k ≤ offsetA) :
    Gate.WellTyped dim (register_swap_aux offsetA offsetB k)
**WellTyped for `register_swap_aux`.** Requires non-empty `dim`, both offset ranges fitting inside `dim`, and the two ranges being disjoint.
theoremregister_swap_wellTyped
theorem register_swap_wellTyped
    (dim multBits offsetA offsetB : Nat) (hdim : 0 < dim)
    (hA : offsetA + multBits ≤ dim) (hB : offsetB + multBits ≤ dim)
    (h_disjoint : offsetA + multBits ≤ offsetB ∨ offsetB + multBits ≤ offsetA) :
    Gate.WellTyped dim (register_swap multBits offsetA offsetB)
**WellTyped for `register_swap`.** Same hypotheses as `register_swap_aux_wellTyped`, with the iteration count fixed to `multBits`.
theoremregister_swap_aux_at_other
theorem register_swap_aux_at_other
    (offsetA offsetB n : Nat) (f : Nat → Bool) (q : Nat)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA)
    (h_outside_A : q < offsetA ∨ offsetA + n ≤ q)
    (h_outside_B : q < offsetB ∨ offsetB + n ≤ q) :
    Gate.applyNat (register_swap_aux offsetA offsetB n) f q = f q
**Correctness at "other" positions** of `register_swap_aux`. At any position outside both `[offsetA, offsetA + n)` and `[offsetB, offsetB + n)`, the gate is identity.
theoremregister_swap_aux_at_A
theorem register_swap_aux_at_A
    (offsetA offsetB n : Nat) (f : Nat → Bool) (j : Nat) (hj : j < n)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA) :
    Gate.applyNat (register_swap_aux offsetA offsetB n) f (offsetA + j)
    = f (offsetB + j)
**Correctness at A positions**: at `offsetA + j` for `j < n`, the gate returns `f (offsetB + j)`.
theoremregister_swap_aux_at_B
theorem register_swap_aux_at_B
    (offsetA offsetB n : Nat) (f : Nat → Bool) (j : Nat) (hj : j < n)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA) :
    Gate.applyNat (register_swap_aux offsetA offsetB n) f (offsetB + j)
    = f (offsetA + j)
**Correctness at B positions**: at `offsetB + j` for `j < n`, the gate returns `f (offsetA + j)`.
theoreminv_mul_mod_eq_self
theorem inv_mul_mod_eq_self (a ainv N x : Nat) (hN : 0 < N)
    (hx : x < N) (hainv : ainv < N) (h_inv : a * ainv % N = 1) :
    ainv * (a * x % N) % N = x
**Modular-inverse "undo" identity.** If `a * ainv ≡ 1 (mod N)`, `x < N`, and `ainv < N`, then `ainv * (a*x mod N) mod N = x`.
theoremmod_inv_cancel_identity
theorem mod_inv_cancel_identity (a ainv N x : Nat) (hN : 0 < N)
    (hx : x < N) (hainv : ainv < N) (h_inv : a * ainv % N = 1) :
    (x + (N - ainv) * (a * x % N)) % N = 0
**Modular cancellation by the additive-inverse-mod-N coefficient.** If `a * ainv ≡ 1 (mod N)`, `x < N`, `ainv < N`, then `(x + (N - ainv) * (a*x mod N)) mod N = 0`. This is the algebraic identity that justifies the third stage of the in-place modular multiplier wrapper.
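Both modular-inverse identities can be checked on a small instance. The sketch below takes `N = 15`, `a = 7`, `ainv = 13`, and `x = 4`, values chosen only for illustration, and uses core `Nat` arithmetic rather than the library theorems.

```lean
-- 7 * 13 = 91 = 6 * 15 + 1, so 13 is an inverse of 7 modulo 15.
example : 7 * 13 % 15 = 1 := by decide

-- inv_mul_mod_eq_self: 7 * 4 % 15 = 13 and 13 * 13 % 15 = 169 % 15 = 4.
example : 13 * (7 * 4 % 15) % 15 = 4 := by decide

-- mod_inv_cancel_identity: (4 + (15 - 13) * 13) % 15 = 30 % 15 = 0.
example : (4 + (15 - 13) * (7 * 4 % 15)) % 15 = 0 := by decide
```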
theoremmult_target_swap_aux_succ
theorem mult_target_swap_aux_succ (bits k : Nat) :
    mult_target_swap_aux bits (k + 1)
    = Gate.seq (mult_target_swap_aux bits k)
               (qubit_swap (adder_n_qubits (bits + 1) + k) (target_idx k))
Recursion unfolding for `mult_target_swap_aux`.
theoremmult_target_swap_aux_wellTyped
theorem mult_target_swap_aux_wellTyped
    (bits multBits k : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1) (hk : k ≤ multBits) :
    Gate.WellTyped (adder_n_qubits (bits + 1) + multBits + 1)
      (mult_target_swap_aux bits k)
**WellTyped for `mult_target_swap_aux`.** At dimension `adder_n_qubits (bits + 1) + multBits + 1` (numerically equal to the Shor-compatible dimension `multBits + (adder_n_qubits (bits + 1) + 1)`), each constituent `qubit_swap (adder_n_qubits (bits + 1) + k) (target_idx k)` is well-typed when `k ≤ multBits ≤ bits + 1`.
theoremmult_target_swap_wellTyped
theorem mult_target_swap_wellTyped
    (bits multBits : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1) :
    Gate.WellTyped (adder_n_qubits (bits + 1) + multBits + 1)
      (mult_target_swap bits multBits)
**WellTyped for `mult_target_swap`.**
theoremmodMultInPlace_wellTyped
theorem modMultInPlace_wellTyped
    (bits N a ainv multBits : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1) :
    Gate.WellTyped (adder_n_qubits (bits + 1) + multBits + 1)
      (modMultInPlace bits N a ainv multBits)
**WellTyped for `modMultInPlace`.**
theoremmodMultInPlace_wellTyped_at_shor_dim
theorem modMultInPlace_wellTyped_at_shor_dim
    (bits N a ainv multBits : Nat) (hbits : 1 ≤ bits)
    (h_multBits_le : multBits ≤ bits + 1) :
    Gate.WellTyped (multBits + (adder_n_qubits (bits + 1) + 1))
      (modMultInPlace bits N a ainv multBits)
**In-place WellTyped at the Shor-compatible dimension.**
theoremmult_target_swap_aux_at_other
theorem mult_target_swap_aux_at_other
    (bits n : Nat) (f : Nat → Bool) (q : Nat)
    (h_n_le : n ≤ bits + 1)
    (h_outside : ∀ k, k < n →
      q ≠ adder_n_qubits (bits + 1) + k ∧ q ≠ target_idx k) :
    Gate.applyNat (mult_target_swap_aux bits n) f q = f q
**At-other for `mult_target_swap_aux`.** If `q` is not equal to any swap-paired position (multiplier-side or target-side) up to iteration `n`, then the gate is identity at `q`. Requires `n ≤ bits + 1` to ensure each swap-pair has distinct positions.
theoremmult_target_swap_aux_at_mult
theorem mult_target_swap_aux_at_mult
    (bits n : Nat) (f : Nat → Bool) (j : Nat) (hj : j < n)
    (h_n_le : n ≤ bits + 1) :
    Gate.applyNat (mult_target_swap_aux bits n) f
      (adder_n_qubits (bits + 1) + j)
    = f (target_idx j)
**At multiplier-side position**: at `adder_n_qubits (bits + 1) + j` for `j < n`, the gate returns `f (target_idx j)`. Requires `n ≤ bits + 1`.
theoremmult_target_swap_aux_at_target
theorem mult_target_swap_aux_at_target
    (bits n : Nat) (f : Nat → Bool) (j : Nat) (hj : j < n)
    (h_n_le : n ≤ bits + 1) :
    Gate.applyNat (mult_target_swap_aux bits n) f (target_idx j)
    = f (adder_n_qubits (bits + 1) + j)
**At target-side position**: at `target_idx j` for `j < n`, the gate returns `f (adder_n_qubits (bits + 1) + j)`. Requires `n ≤ bits + 1`.

FormalRV.Arithmetic.ModularAdder.ModularAdderDefinitions

FormalRV/Arithmetic/ModularAdder/ModularAdderDefinitions.lean
## Boolean-level specifications
defmodAddConstSpec
def modAddConstSpec (N c x : Nat) : Nat
Spec for modular addition by a constant under arbitrary modulus `N`.
defaddConstPow2Spec
def addConstPow2Spec (bits c x : Nat) : Nat
Spec specialized to `N = 2^bits` (the case the patched Gidney adder implements natively without any extra circuitry).
defsubConstPow2Spec
def subConstPow2Spec (bits N x : Nat) : Nat
Wraparound spec for subtraction by `N` modulo `2^bits`.
defsubConstPow2WideSpec
def subConstPow2WideSpec (bits N x : Nat) : Nat
Wraparound-subtraction spec at widened bit-count `bits + 1`.
defprepareMaskedConstRead
def prepareMaskedConstRead : Nat → Nat → Nat → Gate
  | 0,     _, _       => Gate.I
  | k + 1, N, flagIdx =>
      Gate.seq (prepareMaskedConstRead k N flagIdx)
               (if N.testBit k then Gate.CX flagIdx (read_idx k) else Gate.I)
Prepare the read register by XORing each `read_idx k` (for `k < bits`) with `flag ∧ N.testBit k`, where the flag bit lives at `flagIdx`. Implemented as a CX cascade conditioned on the bit pattern of `N`.
defconditionalAddConstGate
def conditionalAddConstGate (bits N flagIdx : Nat) : Gate
Conditional add-back gate: prepare the read register with the masked constant `flag ∧ N`, run the patched Gidney adder, un-prepare the read register. The result computes `target := (x + (if flag then N else 0)) mod 2^bits` without using any controlled-CCX (CCCX) gate.
defprepareConstRead
def prepareConstRead : Nat → Nat → Gate
  | 0,     _ => Gate.I
  | k + 1, c => Gate.seq (prepareConstRead k c)
                  (if c.testBit k then Gate.X (read_idx k) else Gate.I)
Unconditionally prepare `read_idx k := c.testBit k` for `k < bits` by applying `X (read_idx k)` whenever `c.testBit k = true`. When applied to a zero read register, sets it to the bits of `c`; applied again (involutive), it clears the read register back to zero.
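The involutive behaviour described above can be reproduced in a miniature model. Everything in the sketch below is hypothetical scaffolding: `MiniGate`, `upd`, `prepRead`, and the layout `readAt k = 3 * k` are assumptions made for illustration and are not the library's `Gate`, `update`, `prepareConstRead`, or `read_idx`.

```lean
/-- Hypothetical stand-in for the library's `update`. -/
def upd (f : Nat → Bool) (i : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = i then v else f j

/-- Hypothetical miniature gate IR with only the constructors needed here. -/
inductive MiniGate where
  | I
  | X (q : Nat)
  | seq (g h : MiniGate)

/-- Classical action on a Boolean state; `seq g h` runs `g` first, then `h`. -/
def MiniGate.apply : MiniGate → (Nat → Bool) → (Nat → Bool)
  | .I,       f => f
  | .X q,     f => upd f q (!f q)
  | .seq g h, f => h.apply (g.apply f)

/-- Same recursion shape as `prepareConstRead`, with the read layout passed in. -/
def prepRead (idx : Nat → Nat) : Nat → Nat → MiniGate
  | 0,     _ => MiniGate.I
  | k + 1, c =>
    MiniGate.seq (prepRead idx k c)
      (if c.testBit k then MiniGate.X (idx k) else MiniGate.I)

def readAt (k : Nat) : Nat := 3 * k          -- assumed layout, illustration only
def allFalse : Nat → Bool := fun _ => false  -- zeroed register

-- c = 5 (binary 101) sets read bits 0 and 2, i.e. positions 0 and 6.
#eval (List.range 9).map ((prepRead readAt 3 5).apply allFalse)
-- [true, false, false, false, false, false, true, false, false]

-- Applying the preparation a second time clears the register again.
#eval (List.range 9).map
  ((prepRead readAt 3 5).apply ((prepRead readAt 3 5).apply allFalse))
-- [false, false, false, false, false, false, false, false, false]
```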
defaddConstGate
def addConstGate (bits c : Nat) : Gate
Composable constant-add gate: prepare read with `c`, run the patched Gidney adder, unprepare read. Takes a clean `adder_input_F bits 0 x` and produces target = `(x + c) mod 2^bits`, with read register restored to zero and carries cleared.
defsubConstGate
def subConstGate (bits N : Nat) : Gate
Composable constant-sub gate, expressed as wraparound addition of `2^bits - N`. This implements `(x + (2^bits - N)) mod 2^bits`, which equals `(x - N) mod 2^bits` over the two's-complement view.
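The wraparound view of subtraction can be checked numerically. The sketch below uses `bits = 4` and `N = 5`, chosen for illustration, and covers both the case `x ≥ N` and the underflow case `x < N`.

```lean
-- x = 9 ≥ N: (9 + (16 - 5)) % 16 = 20 % 16 = 4 = 9 - 5.
example : (9 + (2^4 - 5)) % 2^4 = 9 - 5 := by decide

-- x = 3 < N: (3 + (16 - 5)) % 16 = 14 = 16 + 3 - 5, the wrapped-around value.
example : (3 + (2^4 - 5)) % 2^4 = 2^4 + 3 - 5 := by decide
```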
defmodAddConstArithmeticSpec
def modAddConstArithmeticSpec (bits N c x : Nat) : Nat
Arithmetic-level spec for the widened modular-addition pipeline at width `bits + 1`. Composes: subtract-`N` after add-`c`, conditionally add back `N` when the comparison flag indicates underflow.
defcopyTargetHighBitToFlag
def copyTargetHighBitToFlag (bits flagIdx : Nat) : Gate
Flag-copy gate: a single CX from `target_idx bits` into `flagIdx`.
defmodAddConstGate_dirtyFlag
def modAddConstGate_dirtyFlag (bits N c flagIdx : Nat) : Gate
The full DIRTY-FLAG modular add-constant gate. Pipeline: `addConstGate (bits+1) c ; subConstGate (bits+1) N ; copyTargetHighBitToFlag bits flagIdx ; conditionalAddConstGate (bits+1) N flagIdx`. The result has the low `bits` target bits encoding `(x + c) mod N`, but the flag bit at `flagIdx` is LEFT DIRTY at `decide ((x + c) < N)`. Flag uncomputation is handled in a later tick.
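The four-step pipeline can be followed on natural numbers. The sketch below is an arithmetic model only, assuming each adder step acts as addition modulo `2^(bits + 1)` on the target; `dirtyFlagModel` is a hypothetical helper and says nothing about the read and carry registers.

```lean
/-- Hypothetical arithmetic model of the dirty-flag pipeline at width `bits + 1`.
    Returns the low `bits` target bits together with the flag. -/
def dirtyFlagModel (bits N c x : Nat) : Nat × Bool :=
  let w := 2^(bits + 1)
  let s := (x + c) % w                          -- add c
  let d := (s + (w - N)) % w                    -- subtract N with wraparound
  let flag := d.testBit bits                    -- copy the high target bit to the flag
  let r := (d + (if flag then N else 0)) % w    -- add N back when the flag is set
  (r % 2^bits, flag)

-- bits = 4, N = 13, c = 5.
#eval dirtyFlagModel 4 13 5 10   -- (2, false): (10 + 5) % 13 = 2 and 15 ≥ 13
#eval dirtyFlagModel 4 13 5 3    -- (8, true):  (3 + 5) % 13 = 8 and 8 < 13
```

In both runs the second component matches `decide ((x + c) < N)`, the value the flag is left holding.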
defflagUncomputeGate
def flagUncomputeGate (bits c flagIdx : Nat) : Gate
Reversible flag-uncompute gate: `subConstGate c ; CX (target_idx bits) flagIdx ; X flagIdx ; addConstGate c`. Restores `flagIdx` to false while leaving the target, read, and carry registers unchanged.
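The same arithmetic view explains why the flag returns to `false`. The sketch below assumes the sub and add steps act at width `bits + 1` and that the incoming flag is `decide (m ≥ c)`, as in the hypothesis of `flagUncomputeGate_correct`; `flagUncomputeModel` is a hypothetical helper.

```lean
/-- Hypothetical arithmetic model of the flag-uncompute sequence.
    Returns the restored target value and the final flag. -/
def flagUncomputeModel (bits c m : Nat) (flag : Bool) : Nat × Bool :=
  let w := 2^(bits + 1)
  let d := (m + (w - c)) % w             -- subtract c with wraparound
  let flag1 := flag != d.testBit bits    -- CX from the high target bit into the flag
  let flag2 := !flag1                    -- X on the flag
  ((d + c) % w, flag2)                   -- add c back

-- bits = 4, c = 5.
#eval flagUncomputeModel 4 5 2 (decide (2 ≥ 5))   -- (2, false)
#eval flagUncomputeModel 4 5 8 (decide (8 ≥ 5))   -- (8, false)
```

After subtracting `c`, the high bit is set exactly when `m < c`, so XORing it into a flag holding `decide (m ≥ c)` always yields `true`, and the final X clears it.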
defmodAddConstGate
def modAddConstGate (bits N c : Nat) : Gate
**Clean modular add-constant gate**. Composition of the dirty-flag pipeline with the flag-uncompute step. The internal flag bit lives at `adder_n_qubits (bits + 1)`.
defcontrolledModAddConstGate
def controlledModAddConstGate (bits N c controlIdx flagIdx : Nat) : Gate
Controlled modular add-constant gate. Eight-step pipeline: controlled add `c` ; controlled sub `N` ; controlled flag-copy ; flag-controlled add-back `N` ; controlled sub `c` ; controlled flag-copy ; controlled X flag ; controlled add `c`.
defmodMultConstGateAux
def modMultConstGateAux (bits N a multBits : Nat) : Nat → Gate
  | 0 => Gate.I
  | k+1 =>
    Gate.seq
      (modMultConstGateAux bits N a multBits k)
      (controlledModAddConstGate bits N ((a * 2^k) % N)
        (adder_n_qubits (bits + 1) + k)
        (adder_n_qubits (bits + 1) + multBits))
Auxiliary recursive gate for the modular multiplier: applies controlled modular-add of `(a * 2^i) % N` for bits `i = 0, 1, ..., k-1`. The parameter `multBits` is the TOTAL multiplier width (used to position the shared flag qubit); `k` is the recursion index running from 0 up to `multBits`.
defmodMultConstGate
def modMultConstGate (bits N a multBits : Nat) : Gate
Modular multiplier gate: applies `controlledModAddConstGate` for each bit of the multiplier register, accumulating `(a * m) % N` into the adder's target register, where `m` is the natural-number value of the multiplier register.
defmult_input_F_aux
def mult_input_F_aux (bits multBits m : Nat) : Nat → (Nat → Bool) → (Nat → Bool)
  | 0, f => f
  | i+1, f =>
    update (mult_input_F_aux bits multBits m i f)
           (adder_n_qubits (bits + 1) + i) (Nat.testBit m i)
Auxiliary recursive helper for the multiplier-encoded input: starting from `f`, applies an `update _ (adder_n_qubits (bits+1) + j) (Nat.testBit m j)` for each `j = 0, 1, ..., i-1`, in order. The last update written is at `j = i - 1`.
defmult_input_F
def mult_input_F (bits multBits x m : Nat) : Nat → Bool
**Multiplier-encoded input.** Starts from `adder_input_F (bits+1) 0 x` (which puts value `x` in the adder's target register and 0 elsewhere within the adder block; `false` outside), then fills the multiplier qubits at positions `adder_n_qubits (bits+1) + j` (for `j = 0, ..., multBits - 1`) with the bits of `m`.
defmult_state_init
def mult_state_init (bits multBits x : Nat) : Nat → Bool
Initial state for the multiplier: the multiplier register holds `x`, the adder block and flag are zeroed.
deff_modmult_step_gate
def f_modmult_step_gate (bits N a multBits i : Nat) : Gate
The `i`-th step of the QPE multiplication cascade: multiplication by the constant `a^(2^i) mod N` applied to the multiplier-encoded state.
deff_modmult_gate_family
def f_modmult_gate_family (bits N a multBits : Nat) : Nat → Gate
Modular multiplication gate family indexed by QPE iterate.
defqubit_swap
def qubit_swap (a b : Nat) : Gate
Two-qubit SWAP: exchanges the values at qubits `a` and `b` via the standard three-CNOT decomposition.
defregister_swap_aux
def register_swap_aux (offsetA offsetB : Nat) : Nat → Gate
  | 0 => Gate.I
  | k+1 => Gate.seq (register_swap_aux offsetA offsetB k)
                    (qubit_swap (offsetA + k) (offsetB + k))
Auxiliary recursive register-swap helper. At iteration count `n`, applies pairwise `qubit_swap (offsetA + k) (offsetB + k)` for `k = 0, 1, ..., n - 1`.
defregister_swap
def register_swap (multBits offsetA offsetB : Nat) : Gate
Register-level SWAP: exchanges two `multBits`-wide registers at positions `[offsetA, offsetA + multBits)` and `[offsetB, offsetB + multBits)`.
defmult_target_swap_aux
def mult_target_swap_aux (bits : Nat) : Nat → Gate
  | 0 => Gate.I
  | k+1 => Gate.seq (mult_target_swap_aux bits k)
                    (qubit_swap (adder_n_qubits (bits + 1) + k) (target_idx k))
Auxiliary recursive multiplier-target SWAP at iteration count `n`: swaps `(adder_n_qubits (bits+1) + k, target_idx k)` for `k = 0, ..., n - 1`.
defmult_target_swap
def mult_target_swap (bits multBits : Nat) : Gate
Multiplier-target SWAP: pairwise exchanges multiplier-register qubits at `adder_n_qubits (bits+1) + k` with adder-target qubits at `target_idx k`, for `k = 0, ..., multBits - 1`.
defmodMultInPlace
def modMultInPlace (bits N a ainv multBits : Nat) : Gate
**In-place modular multiplier gate.** Three-stage composition:
1. `modMultConstGate bits N a multBits`: OOPmul(a), `|x⟩|0⟩ → |x⟩|a*x mod N⟩`.
2. `mult_target_swap bits multBits`: exchanges multiplier and target registers, `|x⟩|a*x mod N⟩ → |a*x mod N⟩|x⟩`.
3. `modMultConstGate bits N ((N - ainv) % N) multBits`: adds `(N - ainv) * (a*x mod N)` to the target, yielding 0 by `mod_inv_cancel_identity`.

Net effect: `|a*x mod N⟩|0⟩`. The multiplier register holds the input `x` initially; after the gate, it holds `(a * x) mod N`, with adder and flag clean. This is exactly the in-place semantics of `MultiplyCircuitProperty`.
defreverse_register_swap_aux
def reverse_register_swap_aux (n offsetA offsetB : Nat) : Nat → Gate
  | 0 => Gate.I
  | k+1 => Gate.seq (reverse_register_swap_aux n offsetA offsetB k)
                    (qubit_swap (offsetA + k) (offsetB + (n - 1 - k)))
Auxiliary recursive reverse-pairing register SWAP at iteration count `k`: at step k, swaps `(offsetA + k, offsetB + (n - 1 - k))`.
defreverse_register_swap
def reverse_register_swap (n offsetA offsetB : Nat) : Gate
Reverse-pairing register SWAP: exchanges positions `[offsetA, offsetA + n)` and `[offsetB, offsetB + n)` with index reversal (position `offsetA + i` swaps with `offsetB + (n - 1 - i)`).
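The index reversal is easiest to read off a listing of the swapped pairs. The sketch below enumerates them for `n = 4`, `offsetA = 10`, `offsetB = 20`, values picked only for illustration.

```lean
-- Pairs (offsetA + k, offsetB + (n - 1 - k)) for k = 0 .. n - 1.
#eval (List.range 4).map (fun k => (10 + k, 20 + (4 - 1 - k)))
-- [(10, 23), (11, 22), (12, 21), (13, 20)]
```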

FormalRV.Arithmetic.ModularAdder.ModularAdderForwardFaithfulness

FormalRV/Arithmetic/ModularAdder/ModularAdderForwardFaithfulness.lean
theoremgidney_adder_forward_faithful_full_preserves_above
theorem gidney_adder_forward_faithful_full_preserves_above
    (w : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * w ≤ p) :
    Gate.applyNat (gidney_adder_forward_faithful_full w) f p = f p
`forward_faithful_full w` preserves positions `≥ 3 * w`.
theoremgidney_adder_forward_faithful_full_reverse_patched_preserves_above
theorem gidney_adder_forward_faithful_full_reverse_patched_preserves_above
    (w : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * w ≤ p) :
    Gate.applyNat (gidney_adder_forward_faithful_full_reverse_patched w) f p = f p
`forward_faithful_full_reverse_patched w` preserves positions `≥ 3 * w`.
theoremgidney_final_cx_cascade_preserves_above
theorem gidney_final_cx_cascade_preserves_above
    (w : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * w ≤ p) :
    Gate.applyNat (gidney_final_cx_cascade w) f p = f p
`final_cx_cascade w` preserves positions `≥ 3 * w`.
theoremgidney_adder_full_faithful_no_measurement_patched_preserves_above
theorem gidney_adder_full_faithful_no_measurement_patched_preserves_above
    (w : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * w ≤ p) :
    Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched w) f p = f p
**Headline frame lemma**: the full patched Gidney adder of width `w` preserves positions `p ≥ 3 * w`. This is the tight bound: the cascade touches positions up to `carry_idx (w-1) = 3w - 1` for `w ≥ 2`.
theoremprepareConstRead_preserves_above
theorem prepareConstRead_preserves_above
    (bits c : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * bits ≤ p) :
    Gate.applyNat (prepareConstRead bits c) f p = f p
`prepareConstRead bits c` preserves positions `≥ 3 * bits`.
theoremaddConstGate_preserves_above_actual
theorem addConstGate_preserves_above_actual
    (bits c : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * bits ≤ p) :
    Gate.applyNat (addConstGate bits c) f p = f p
**Composable frame**: `addConstGate bits c` preserves positions `≥ 3 * bits`.
theoremsubConstGate_preserves_above_actual
theorem subConstGate_preserves_above_actual
    (bits N : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * bits ≤ p) :
    Gate.applyNat (subConstGate bits N) f p = f p
**Composable frame**: `subConstGate bits N` preserves positions `≥ 3 * bits`.
theoremaddConstGate_preserves_gap_read
theorem addConstGate_preserves_gap_read
    (bits c : Nat) (f : Nat → Bool) :
    Gate.applyNat (addConstGate (bits + 1) c) f (read_idx (bits + 1))
      = f (read_idx (bits + 1))
theoremaddConstGate_preserves_gap_target
theorem addConstGate_preserves_gap_target
    (bits c : Nat) (f : Nat → Bool) :
    Gate.applyNat (addConstGate (bits + 1) c) f (target_idx (bits + 1))
      = f (target_idx (bits + 1))
theoremsubConstGate_preserves_gap_read
theorem subConstGate_preserves_gap_read
    (bits N : Nat) (f : Nat → Bool) :
    Gate.applyNat (subConstGate (bits + 1) N) f (read_idx (bits + 1))
      = f (read_idx (bits + 1))
theoremsubConstGate_preserves_gap_target
theorem subConstGate_preserves_gap_target
    (bits N : Nat) (f : Nat → Bool) :
    Gate.applyNat (subConstGate (bits + 1) N) f (target_idx (bits + 1))
      = f (target_idx (bits + 1))
theoremaddConstGate_modAdd_step1_state_eq
theorem addConstGate_modAdd_step1_state_eq
    (bits N c x : Nat) (hbits : 1 ≤ bits) (hN : N ≤ 2^bits)
    (hx : x < N) (hc : c < N) :
    Gate.applyNat (addConstGate (bits + 1) c) (adder_input_F (bits + 1) 0 x)
    = adder_input_F (bits + 1) 0 (x + c)
**Strong normal-form for step 1**: `addConstGate (bits + 1) c` applied to the clean input `adder_input_F (bits + 1) 0 x` produces a function extensionally equal to `adder_input_F (bits + 1) 0 (x + c)`. No reduction modulo `2^(bits + 1)` appears because `x + c < 2 * N ≤ 2^(bits + 1)`. This supersedes the WEAK `_state_normal` form above.
theoremsubConstGate_modAdd_step2_state_eq
theorem subConstGate_modAdd_step2_state_eq
    (bits N s : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hs : s < 2 * N) :
    Gate.applyNat (subConstGate (bits + 1) N) (adder_input_F (bits + 1) 0 s)
    = adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits N s)
**Strong normal-form for step 2**: `subConstGate (bits + 1) N` applied to the clean input `adder_input_F (bits + 1) 0 s` produces a function extensionally equal to `adder_input_F (bits + 1) 0 y` where `y := subConstPow2WideSpec bits N s`.
theoremadder_input_F_at_high
theorem adder_input_F_at_high
    (w a b k : Nat) (hk : 3 * w ≤ k) :
    adder_input_F w a b k = false
Helper: `adder_input_F w a b` is `false` at any position `≥ 3 * w` (all working positions are below `3 * w`, and out-of-range bits of `a` and `b` are zero by the `decide(k/3 < w)` guard).
theoremconditionalAddConstGate_target_bit
theorem conditionalAddConstGate_target_bit
    (bits N flagIdx y i : Nat) (flag : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hy : y < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) (hi : i < bits) :
    Gate.applyNat (conditionalAddConstGate bits N flagIdx)
      (update (adder_input_F bits 0 y) flagIdx flag) (target_idx i)
    = (y + (if flag then N else 0)).testBit i
Bit-level conditional add-back: applied to an `update (adder_input_F bits 0 y) flagIdx flag` input (target holds `y`, read/carry zero, flag at `flagIdx ≥ adder_n_qubits bits`), the gate writes `(y + (if flag then N else 0)).testBit i` at `target_idx i` for `i < bits`.
theoremmodAddConstGate_dirtyFlag_target_decode
theorem modAddConstGate_dirtyFlag_target_decode
    (bits N c x flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx) :
    gidney_target_val bits
      (Gate.applyNat (modAddConstGate_dirtyFlag bits N c flagIdx)
        (adder_input_F (bits + 1) 0 x))
    = (x + c) % N
**Tick 2 HEADLINE**: the dirty-flag modular add-constant gate decodes its target register (low `bits` bits) to `(x + c) mod N`.
theoremmodAddConstGate_dirtyFlag_after_three_steps_eq
theorem modAddConstGate_dirtyFlag_after_three_steps_eq
    (bits N c x flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx) :
    Gate.applyNat (Gate.seq (addConstGate (bits + 1) c)
      (Gate.seq (subConstGate (bits + 1) N)
        (copyTargetHighBitToFlag bits flagIdx)))
      (adder_input_F (bits + 1) 0 x)
    = update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits N (x + c))) flagIdx
        (decide ((x + c) < N))
Intermediate: the state after the first three steps (add ; sub ; copy-flag) of `modAddConstGate_dirtyFlag` is extensionally equal to `update (adder_input_F (bits+1) 0 y) flagIdx (decide ((x+c)<N))`, where `y := subConstPow2WideSpec bits N (x+c)`.
theoremmodAddConstGate_dirtyFlag_workspace
theorem modAddConstGate_dirtyFlag_workspace
    (bits N c x flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx) :
    Gate.WellTyped (flagIdx + 1) (modAddConstGate_dirtyFlag bits N c flagIdx)
    ∧ (∀ i, i < bits + 1 →
        Gate.applyNat (modAddConstGate_dirtyFlag bits N c flagIdx)
          (adder_input_F (bits + 1) 0 x) (read_idx i) = false)
    ∧ (∀ i, i < bits + 1 →
        Gate.applyNat (modAddConstGate_dirtyFlag bits N c flagIdx)
          (adder_input_F (bits + 1) 0 x) (carry_idx i) = false)
    ∧ Gate.applyNat (modAddConstGate_dirtyFlag bits N c flagIdx)
**Tick 3 HEADLINE: dirty-flag workspace theorem**. The `modAddConstGate_dirtyFlag` is WellTyped at the enlarged dimension `flagIdx + 1`, restores the read register to zero, clears the carry register, and places the comparison flag `decide ((x + c) < N)` at `flagIdx`. The flag bit is DIRTY: it is not restored to false.
theoremaddConstGate_state_eq_general
theorem addConstGate_state_eq_general
    (bits c x : Nat) (hbits : 2 ≤ bits) (hc : c < 2^bits) (hx : x < 2^bits) :
    Gate.applyNat (addConstGate bits c) (adder_input_F bits 0 x)
    = adder_input_F bits 0 ((x + c) % 2^bits)
General state-eq: `addConstGate bits c` applied to a clean input `adder_input_F bits 0 x` produces `adder_input_F bits 0 ((x + c) % 2^bits)`, under just `c < 2^bits` and `x < 2^bits`.
theoremsubConstGate_state_eq_general
theorem subConstGate_state_eq_general
    (bits N x : Nat) (hbits : 2 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.applyNat (subConstGate bits N) (adder_input_F bits 0 x)
    = adder_input_F bits 0 (subConstPow2Spec bits N x)
General state-eq for subConstGate. Follows from `addConstGate_state_eq_general` via the definition `subConstGate = addConstGate (2^bits - N)`.
theoremflagUncomputeGate_correct
theorem flagUncomputeGate_correct
    (bits c flagIdx m : Nat) (hbits : 1 ≤ bits) (hc_pos : 0 < c)
    (hc : c < 2^bits) (hm : m < 2^bits)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx) :
    Gate.applyNat (flagUncomputeGate bits c flagIdx)
      (update (adder_input_F (bits + 1) 0 m) flagIdx (decide (m ≥ c)))
    = adder_input_F (bits + 1) 0 m
**Tick 4 HEADLINE: flag uncomputation correctness**. Given a state of the form `update (adder_input_F (bits+1) 0 m) flagIdx (decide (m ≥ c))` (target encoding `m < 2^bits`, flag stored at out-of-band `flagIdx`), the flag-uncompute gate restores the state to a clean `adder_input_F (bits+1) 0 m`: the flag becomes false, and target, read, and carry are unchanged.
theoremflagUncomputeGate_wellTyped
theorem flagUncomputeGate_wellTyped
    (bits c flagIdx : Nat) (hbits : 1 ≤ bits) (hc_pos : 0 < c) (hc : c < 2^bits)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx) :
    Gate.WellTyped (flagIdx + 1) (flagUncomputeGate bits c flagIdx)
WellTyped at `flagIdx + 1`. The two adder sub-gates fit within `adder_n_qubits (bits + 1) ≤ flagIdx`, and the CX and X explicitly touch `flagIdx`, so all four sub-gates are WellTyped at dimension `flagIdx + 1`.
theoremmodAddConstArithmeticSpec_lt_pow_bits
theorem modAddConstArithmeticSpec_lt_pow_bits
    (bits N c x : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N) :
    modAddConstArithmeticSpec bits N c x < 2^bits
Auxiliary: `modAddConstArithmeticSpec bits N c x < 2^bits` under modular hypotheses. Both flag cases produce a value in `[0, N - 1]`, hence `< 2^bits`.
theoremmodAddConstArithmeticSpec_eq_mod
theorem modAddConstArithmeticSpec_eq_mod
    (bits N c x : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N) :
    modAddConstArithmeticSpec bits N c x = (x + c) % N
`modAddConstArithmeticSpec` equals `(x + c) mod N` (the high bit is zero, so the mod-`2^(bits+1)` mask is the value itself).
theoremconditionalAddConstGate_preserves_above_not_flag
theorem conditionalAddConstGate_preserves_above_not_flag
    (bits N flagIdx : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * bits ≤ p) :
    Gate.applyNat (conditionalAddConstGate bits N flagIdx) f p = f p
`conditionalAddConstGate bits N flagIdx` preserves positions `≥ 3 * bits`.
theoremmodAddConstGate_dirtyFlag_preserves_above_not_flag
theorem modAddConstGate_dirtyFlag_preserves_above_not_flag
    (bits N c flagIdx : Nat) (f : Nat → Bool) (p : Nat)
    (hp : 3 * (bits + 1) ≤ p) (h_p_ne_flag : p ≠ flagIdx) :
    Gate.applyNat (modAddConstGate_dirtyFlag bits N c flagIdx) f p = f p
`modAddConstGate_dirtyFlag bits N c flagIdx` preserves positions `≥ 3*(bits + 1)` that are not `flagIdx`.
theoremmodAddConstGate_dirtyFlag_state_eq
theorem modAddConstGate_dirtyFlag_state_eq
    (bits N c x flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx) :
    Gate.applyNat (modAddConstGate_dirtyFlag bits N c flagIdx)
      (adder_input_F (bits + 1) 0 x)
    = update (adder_input_F (bits + 1) 0 ((x + c) % N)) flagIdx (decide ((x + c) < N))
**Strong state-eq for `modAddConstGate_dirtyFlag`.** The output is extensionally equal to the canonical "input form" with target encoding `(x + c) mod N` and the flag bit at `flagIdx` holding `decide ((x+c)<N)`.
theoremmodAddConstGate_state_eq
theorem modAddConstGate_state_eq
    (bits N c x : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc_pos : 0 < c) (hc : c < N) :
    Gate.applyNat (modAddConstGate bits N c) (adder_input_F (bits + 1) 0 x)
    = adder_input_F (bits + 1) 0 ((x + c) % N)
**Tick 5 HEADLINE: clean modular add-constant.** Applied to `adder_input_F (bits + 1) 0 x`, the clean modular adder produces `adder_input_F (bits + 1) 0 ((x + c) mod N)`: a full state-eq, with the target encoding the modular sum and ALL workspace (read, carry, internal flag) restored.
theoremmodAddConstGate_clean
theorem modAddConstGate_clean
    (bits N c x : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc_pos : 0 < c) (hc : c < N) :
    Gate.WellTyped (adder_n_qubits (bits + 1) + 1) (modAddConstGate bits N c)
    ∧ gidney_target_val bits
        (Gate.applyNat (modAddConstGate bits N c) (adder_input_F (bits + 1) 0 x))
      = (x + c) % N
    ∧ (∀ i, i < bits + 1 →
        Gate.applyNat (modAddConstGate bits N c) (adder_input_F (bits + 1) 0 x)
          (read_idx i) = false)
    ∧ (∀ i, i < bits + 1 →
        Gate.applyNat (modAddConstGate bits N c) (adder_input_F (bits + 1) 0 x)
**Bundled clean theorem**: `WellTyped`, decoded target, and read / carry / flag all restored. Derives from `modAddConstGate_state_eq`.
theoremcontrolledModAddConstGate_wellTyped
theorem controlledModAddConstGate_wellTyped
    (bits N c controlIdx flagIdx : Nat) (hbits : 1 ≤ bits)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.WellTyped (flagIdx + 1) (controlledModAddConstGate bits N c controlIdx flagIdx)
`controlledModAddConstGate` is `WellTyped` at `flagIdx + 1` when `controlIdx` and `flagIdx` are both out-of-band, with `controlIdx < flagIdx`.
theoremconditionalAddConstGate_state_eq
theorem conditionalAddConstGate_state_eq
    (bits N flagIdx x : Nat) (flag : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) :
    Gate.applyNat (conditionalAddConstGate bits N flagIdx)
      (update (adder_input_F bits 0 x) flagIdx flag)
    = update (adder_input_F bits 0 ((x + (if flag then N else 0)) % 2^bits)) flagIdx flag
**`conditionalAddConstGate` full state-eq.** Applied to `update (adder_input_F bits 0 x) flagIdx flag`, the gate produces `update (adder_input_F bits 0 ((x + (if flag then N else 0)) % 2^bits)) flagIdx flag`: the flag is preserved, target = `(x + flag·N) mod 2^bits`, and read / carry are restored.
theoremprepareMaskedConstRead_commute_update_outer
theorem prepareMaskedConstRead_commute_update_outer
    (bits N flagIdx p : Nat) (v : Bool)
    (h_p_ne_flag : p ≠ flagIdx)
    (h_p_ne_read : ∀ i, i < bits → p ≠ read_idx i) :
    ∀ (f : Nat → Bool),
      Gate.applyNat (prepareMaskedConstRead bits N flagIdx) (update f p v)
      = update (Gate.applyNat (prepareMaskedConstRead bits N flagIdx) f) p v
`prepareMaskedConstRead bits N flagIdx` commutes with `update _ p v` when `p` is outside the gate's read/write set: `p ≠ flagIdx` (not read as control) and `p ≠ read_idx k` for any `k < bits` (not written).
theoremconditionalAddConstGate_commute_update_outer
theorem conditionalAddConstGate_commute_update_outer
    (bits N flagIdx p : Nat) (v : Bool)
    (hbits : 2 ≤ bits)
    (hp_dim : adder_n_qubits bits ≤ p)
    (h_p_ne_flag : p ≠ flagIdx) :
    ∀ (f : Nat → Bool),
      Gate.applyNat (conditionalAddConstGate bits N flagIdx) (update f p v)
      = update (Gate.applyNat (conditionalAddConstGate bits N flagIdx) f) p v
`conditionalAddConstGate bits N flagIdx` commutes with `update _ p v` when `p` is outside the gate's actual support: `p ≥ adder_n_qubits bits` and `p ≠ flagIdx`. Composes prep + adder + prep commute lemmas.
theoremconditionalAddConstGate_state_eq_with_outer
theorem conditionalAddConstGate_state_eq_with_outer
    (bits N flagIdx outerIdx x : Nat) (flag outerVal : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx)
    (hOuter : adder_n_qubits bits ≤ outerIdx) (hOuter_ne_flag : outerIdx ≠ flagIdx) :
    Gate.applyNat (conditionalAddConstGate bits N flagIdx)
      (update (update (adder_input_F bits 0 x) flagIdx flag) outerIdx outerVal)
    = update
        (update (adder_input_F bits 0 ((x + (if flag then N else 0)) % 2^bits))
          flagIdx flag) outerIdx outerVal
State-eq for `conditionalAddConstGate` lifted past an outer update at `outerIdx`. This is the form that lets us chain through `controlledModAddConstGate`'s 8 steps where each sub-state has both `flagIdx` and `controlIdx` updates active simultaneously.
theoremcollapse_flag_false_update_at_high
theorem collapse_flag_false_update_at_high
    (n flagIdx outerIdx x : Nat) (outerVal : Bool)
    (hflag_high : 3 * n ≤ flagIdx) :
    update (update (adder_input_F n 0 x) flagIdx false) outerIdx outerVal
    = update (adder_input_F n 0 x) outerIdx outerVal
Helper: an `update` to `false` at a high `flagIdx` is a no-op on `adder_input_F n 0 x` (since `adder_input_F` is already `false` at any position `≥ 3 * n`). Used in the `controlBit = false` chain proof to insert or remove redundant `flagIdx` updates so that state forms match the shape expected by `conditionalAddConstGate_state_eq_with_outer`.
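The no-op behaviour can be checked on a toy model. The sketch below rests on an assumption: `updateSketch` is a hypothetical point-update on `Nat → Bool` written for illustration, and the library's own `update` may be defined differently. The lemma shows that writing back the value already stored at `p` leaves the function unchanged.

```lean
-- Hypothetical stand-in for the library's `update`; not the FormalRV definition.
def updateSketch (f : Nat → Bool) (p : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = p then v else f q

-- Writing the value that is already stored at `p` changes nothing.
theorem updateSketch_same (f : Nat → Bool) (p : Nat) :
    updateSketch f p (f p) = f := by
  funext q
  by_cases h : q = p <;> simp [updateSketch, h]
```

In the helper above, `adder_input_F n 0 x` is already `false` at every index `≥ 3 * n`, so the `false` written at `flagIdx` is exactly such a write-back.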
theoremconditionalAddConstGate_identity_when_flag_false
theorem conditionalAddConstGate_identity_when_flag_false
    (bits N flagIdx x : Nat) (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) :
    Gate.applyNat (conditionalAddConstGate bits N flagIdx)
      (update (adder_input_F bits 0 x) flagIdx false)
    = update (adder_input_F bits 0 x) flagIdx false
Corollary of `conditionalAddConstGate_state_eq` for `flag = false`: the gate is identity on the canonical input form.
theoremconditionalAddConstGate_identity_when_flag_false_with_outer
theorem conditionalAddConstGate_identity_when_flag_false_with_outer
    (bits N flagIdx outerIdx x : Nat) (outerVal : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx)
    (hOuter : adder_n_qubits bits ≤ outerIdx) (hOuter_ne_flag : outerIdx ≠ flagIdx) :
    Gate.applyNat (conditionalAddConstGate bits N flagIdx)
      (update (update (adder_input_F bits 0 x) flagIdx false) outerIdx outerVal)
    = update (update (adder_input_F bits 0 x) flagIdx false) outerIdx outerVal
Corollary of `conditionalAddConstGate_state_eq_with_outer` for `flag = false`: the gate is identity on the *double-update* form, useful when chaining through `controlledModAddConstGate`'s steps.
theoremcontrolledModAddConstGate_correct_false
theorem controlledModAddConstGate_correct_false
    (bits N c x : Nat) (controlIdx flagIdx : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N) (hc_pos : 0 < c) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (controlledModAddConstGate bits N c controlIdx flagIdx)
      (update (adder_input_F (bits + 1) 0 x) controlIdx false)
    = update (adder_input_F (bits + 1) 0 x) controlIdx false
**Tick 6g HEADLINE: `controlBit = false` branch of `controlledModAddConstGate_correct`.** When the control bit is `false`, the entire 8-step controlled modular-add pipeline is the identity: target / read / carry / flag are all unchanged. Proved by chaining 8 identity rewrites.
theoremcontrolled_step1_true
theorem controlled_step1_true
    (bits c x controlIdx : Nat) (hbits : 1 ≤ bits)
    (hc_succ : c < 2^(bits+1)) (hxc_lt : x + c < 2^(bits+1))
    (hx_succ : x < 2^(bits+1))
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx) :
    Gate.applyNat (conditionalAddConstGate (bits + 1) c controlIdx)
      (update (adder_input_F (bits + 1) 0 x) controlIdx true)
    = update (adder_input_F (bits + 1) 0 (x + c)) controlIdx true
Intermediate: applying step 1 of the controlled pipeline (controlled add of `c`) with `controlBit = true` gives target = `x + c`.
theoremcontrolled_step2_true
theorem controlled_step2_true
    (bits N c x controlIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx) :
    Gate.applyNat (conditionalAddConstGate (bits + 1) (2^(bits+1) - N) controlIdx)
      (update (adder_input_F (bits + 1) 0 (x + c)) controlIdx true)
    = update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits N (x + c))) controlIdx true
Intermediate: applying step 2 of the controlled pipeline (controlled subtraction of `N`) with `controlBit = true` takes the target from `x + c` to `subConstPow2WideSpec bits N (x+c)`.
theoremcontrolled_step3_true
theorem controlled_step3_true
    (bits N c x controlIdx flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (Gate.CCX controlIdx (target_idx bits) flagIdx)
      (update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits N (x + c))) controlIdx true)
    = update (update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits N (x + c)))
                controlIdx true) flagIdx (decide ((x + c) < N))
Intermediate: applying step 3 of the controlled pipeline (CCX flag-copy) with `controlBit = true` puts `decide ((x+c) < N)` into `flagIdx`.
theoremcontrolled_step4_true
theorem controlled_step4_true
    (bits N c x controlIdx flagIdx : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N)
    (hcontrolIdx : adder_n_qubits (bits + 1) ≤ controlIdx)
    (hflagIdx : controlIdx < flagIdx) :
    Gate.applyNat (conditionalAddConstGate (bits + 1) N flagIdx)
      (update (update (adder_input_F (bits + 1) 0 (subConstPow2WideSpec bits N (x + c)))
                  controlIdx true) flagIdx (decide ((x + c) < N)))
    = update (update (adder_input_F (bits + 1) 0 ((x + c) % N))
                controlIdx true) flagIdx (decide ((x + c) < N))
Intermediate: applying step 4 of the controlled pipeline (flag-controlled add-back of `N`) takes the target from `subConstPow2WideSpec bits N (x+c)` to `(x + c) % N` when the flag holds `decide ((x+c) < N)`.
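The four `controlBit = true` steps can be traced at the level of plain arithmetic. The sketch below is illustrative only: `pipelineSketch` is a hypothetical name, and its second step assumes that the widened subtraction computes `(s + (2^(bits+1) - N)) mod 2^(bits+1)`, which is inferred from the surrounding docstrings rather than taken from the definition of `subConstPow2WideSpec`. The flag is read off as the high bit of the widened result, modelled here as the test `2^bits ≤ y`.

```lean
-- Hypothetical arithmetic model of steps 1-4; not a FormalRV definition.
def pipelineSketch (bits N c x : Nat) : Nat × Bool :=
  let s := x + c                                  -- step 1: add c
  let y := (s + (2^(bits+1) - N)) % 2^(bits+1)    -- step 2: subtract N at width bits+1
  let flag := decide (2^bits ≤ y)                 -- step 3: copy the high bit of y
  let t := (y + (if flag then N else 0)) % 2^(bits+1)  -- step 4: add N back if flagged
  (t, flag)

-- bits = 3, N = 5, c = 3, x = 4: s = 7 is not below N, so nothing is added back.
example : pipelineSketch 3 5 3 4 = ((4 + 3) % 5, decide (4 + 3 < 5)) := by decide

-- bits = 3, N = 5, c = 3, x = 1: s = 4 is below N, so step 4 adds N back.
example : pipelineSketch 3 5 3 1 = ((1 + 3) % 5, decide (1 + 3 < 5)) := by decide
```

In both runs the first component agrees with `(x + c) % N` and the second with `decide ((x + c) < N)`, matching the target and flag values stated for steps 3 and 4.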

FormalRV.Arithmetic.ModularAdder.ModularAdderPowerOfTwoCase

FormalRV/Arithmetic/ModularAdder/ModularAdderPowerOfTwoCase.lean
## Power-of-2 modular adder (the easy case)

The patched Gidney adder implements `(a + b) mod 2^bits` in the target register when applied to `adder_input_F bits a b`. With `a = c` (constant in the read register) and `b = x` (data in the target register), the output target register decodes to `(x + c) mod 2^bits`. This is just a renaming wrapper around `gidney_adder_patched_target_decode` (in `RippleCarryAdder.lean`), exposed under a name the modular-multiplication layer can call directly.
theorempatched_adder_add_const_pow2
theorem patched_adder_add_const_pow2
    (bits c x : Nat) (hbits : 2 ≤ bits) (hc : c < 2^bits) (hx : x < 2^bits) :
    gidney_target_val bits
      (Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
        (adder_input_F bits c x))
    = addConstPow2Spec bits c x
**The patched Gidney adder implements `(x + c) mod 2^bits`.** With the constant `c` placed in the read register and the data `x` placed in the target register, applying the patched full faithful no-measurement Gidney adder writes `(x + c) mod 2^bits` into the target register.
theorempatched_adder_add_const_pow2_bundled
theorem patched_adder_add_const_pow2_bundled
    (bits c x : Nat) (hbits : 2 ≤ bits) (hc : c < 2^bits) (hx : x < 2^bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_full_faithful_no_measurement_patched bits)
    ∧ gidney_target_val bits
        (Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
          (adder_input_F bits c x))
      = addConstPow2Spec bits c x
    ∧ (∀ i, i < bits →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
          (adder_input_F bits c x) (read_idx i) = c.testBit i)
    ∧ (∀ i, i < bits →
**Bundled `(x + c) mod 2^bits` primitive.** Combines the power-of-2 modular-addition spec, the patched-adder `WellTyped`, the read-register preservation (constant `c` survives), and the carry clearing (workspace zeroed). This is the single theorem a modular-multiplication layer should call when adding a constant modulo `2^bits`.
theorempatched_adder_sub_const_pow2
theorem patched_adder_sub_const_pow2
    (bits N x : Nat) (hbits : 2 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    gidney_target_val bits
      (Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
        (adder_input_F bits (2^bits - N) x))
    = subConstPow2Spec bits N x
**The patched Gidney adder with `read = 2^bits - N` implements the wraparound subtraction.** For `0 < N ≤ 2^bits` and `x < 2^bits`, applying the patched adder to `adder_input_F bits (2^bits - N) x` decodes the target register to `(x + (2^bits - N)) mod 2^bits`.
theoremsubConstPow2Spec_of_le
theorem subConstPow2Spec_of_le
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx : x < 2^bits) (hle : N ≤ x) :
    subConstPow2Spec bits N x = x - N
No-underflow case: `N ≤ x` ⇒ `subConstPow2Spec bits N x = x - N`.
theoremsubConstPow2Spec_of_lt
theorem subConstPow2Spec_of_lt
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx_lt : x < N) :
    subConstPow2Spec bits N x = x + 2^bits - N
Underflow case: `x < N` ⇒ `subConstPow2Spec bits N x = x + 2^bits - N`.
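The two cases can be checked on small numbers. The sketch below uses a hypothetical `subWrapSketch` that follows the formula `(x + (2^bits - N)) mod 2^bits` given in the docstring of `patched_adder_sub_const_pow2`; it is an illustrative model and not the definition of `subConstPow2Spec`.

```lean
-- Hypothetical model of the wraparound subtraction; not the FormalRV definition.
def subWrapSketch (bits N x : Nat) : Nat :=
  (x + (2^bits - N)) % 2^bits

-- No underflow (N ≤ x): the result is the ordinary difference x - N.
example : subWrapSketch 3 5 6 = 6 - 5 := by decide

-- Underflow (x < N): the result wraps around to x + 2^bits - N.
example : subWrapSketch 3 5 2 = 2 + 2^3 - 5 := by decide
```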
theoremsubConstPow2WideSpec_high_bit_of_le
theorem subConstPow2WideSpec_high_bit_of_le
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx : x < 2^bits) (hle : N ≤ x) :
    (subConstPow2WideSpec bits N x).testBit bits = false
Arithmetic high-bit lemma, no-underflow case: when `N ≤ x` the widened result equals `x - N`, which fits in `bits` bits, so bit `bits` is `false`.
theoremsubConstPow2WideSpec_high_bit_of_lt
theorem subConstPow2WideSpec_high_bit_of_lt
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx_lt : x < N) :
    (subConstPow2WideSpec bits N x).testBit bits = true
Arithmetic high-bit lemma, underflow case: when `x < N ≤ 2^bits` the widened result lies in `[2^bits, 2^(bits+1))`, so bit `bits` is `true`.
theoremsubConstPow2WideSpec_high_bit
theorem subConstPow2WideSpec_high_bit
    (bits N x : Nat) (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    (subConstPow2WideSpec bits N x).testBit bits = decide (x < N)
**Main high-bit theorem**: bit `bits` of the widened-subtraction result is exactly the comparison flag `decide (x < N)`.
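A small numeric check makes the comparison-flag reading concrete. The sketch assumes that the widened subtraction is `(x + (2^(bits+1) - N)) mod 2^(bits+1)`; `wideSubSketch` is a hypothetical name and not the definition of `subConstPow2WideSpec`. For a value below `2^(bits+1)`, bit `bits` is set exactly when the value is at least `2^bits`, so the high bit is modelled as that comparison.

```lean
-- Hypothetical model of the widened subtraction; not the FormalRV definition.
def wideSubSketch (bits N x : Nat) : Nat :=
  (x + (2^(bits+1) - N)) % 2^(bits+1)

-- bits = 3, N = 5, x = 2 (underflow): the result 13 is ≥ 2^3, so the high bit is set.
example : wideSubSketch 3 5 2 = 13
    ∧ decide (2^3 ≤ wideSubSketch 3 5 2) = decide (2 < 5) := by decide

-- bits = 3, N = 5, x = 6 (no underflow): the result 1 is < 2^3, so the high bit is clear.
example : wideSubSketch 3 5 6 = 1
    ∧ decide (2^3 ≤ wideSubSketch 3 5 6) = decide (6 < 5) := by decide
```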
theorempatched_adder_sub_const_underflow_flag
theorem patched_adder_sub_const_underflow_flag
    (bits N x : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.applyNat
      (gidney_adder_full_faithful_no_measurement_patched (bits + 1))
      (adder_input_F (bits + 1) (2^(bits + 1) - N) x)
      (target_idx bits)
    = decide (x < N)
**Gate-level underflow flag theorem** (Deliverable C). Instantiating the patched Gidney adder at width `bits + 1` with `read = 2^(bits + 1) - N`, the target bit at position `bits` is exactly `decide (x < N)`.
theoremtestBit_add_two_pow_below
theorem testBit_add_two_pow_below
    (y i n : Nat) (h : i < n) :
    (y + 2^n).testBit i = y.testBit i
**Helper**: bit `i` of `y + 2^n` equals bit `i` of `y` when `i < n` (adding a power of 2 at position `n` doesn't affect lower bits).
theorempatched_adder_sub_const_low_bits
theorem patched_adder_sub_const_low_bits
    (bits N x i : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < 2^bits) (hi : i < bits) :
    Gate.applyNat
      (gidney_adder_full_faithful_no_measurement_patched (bits + 1))
      (adder_input_F (bits + 1) (2^(bits + 1) - N) x)
      (target_idx i)
    = (subConstPow2Spec bits N x).testBit i
**Gate-level low-bits theorem** (Deliverable D). At the widened adder, the lower `bits` target positions decode to the bits of `subConstPow2Spec bits N x`; that is, they hold the wraparound subtraction value (mod `2^bits`) just as the narrow adder would.
theoremprepareMaskedConstRead_preserves_outside
theorem prepareMaskedConstRead_preserves_outside
    (bits N flagIdx : Nat) (f : Nat → Bool) (p : Nat)
    (h : ∀ i, i < bits → p ≠ read_idx i) :
    Gate.applyNat (prepareMaskedConstRead bits N flagIdx) f p = f p
Outside the read register's `[0, bits)` window, `prepareMaskedConstRead` acts as the identity (in particular: target, carry, and `flagIdx` are preserved).
theoremprepareMaskedConstRead_at_read_idx
theorem prepareMaskedConstRead_at_read_idx
    (bits N flagIdx : Nat) (f : Nat → Bool) (j : Nat) (hj : j < bits)
    (h_flag_disj_read : ∀ i, i < bits → flagIdx ≠ read_idx i) :
    Gate.applyNat (prepareMaskedConstRead bits N flagIdx) f (read_idx j) =
    xor (f (read_idx j)) (f flagIdx && N.testBit j)
At `read_idx j` (for `j < bits`), `prepareMaskedConstRead` XORs the existing value with `f flagIdx && N.testBit j` — i.e. it conditionally flips the read bit based on the flag and the constant pattern.
theoremapplyNat_commute_update_above_dim
theorem applyNat_commute_update_above_dim
    (dim : Nat) (g : Gate) (h_wt : Gate.WellTyped dim g)
    (f : Nat → Bool) (p : Nat) (v : Bool) (h_p : dim ≤ p) :
    Gate.applyNat g (update f p v) = update (Gate.applyNat g f) p v
Any `WellTyped dim` gate commutes with `update _ p v` for `p ≥ dim`.
theoremadder_input_F_at_read_idx_eq
theorem adder_input_F_at_read_idx_eq
    (n a b j : Nat) (hj : j < n) :
    adder_input_F n a b (read_idx j) = a.testBit j
theoremadder_input_F_eq_outside_read_in_range
theorem adder_input_F_eq_outside_read_in_range
    (n a b k : Nat) (h : ∀ j, j < n → k ≠ read_idx j) :
    adder_input_F n a b k = adder_input_F n 0 b k
theoremprepareMaskedConstRead_yields_input_F
theorem prepareMaskedConstRead_yields_input_F
    (bits N flagIdx x : Nat) (flag : Bool)
    (h_disj : ∀ j, j < bits → flagIdx ≠ read_idx j) :
    Gate.applyNat (prepareMaskedConstRead bits N flagIdx)
      (update (adder_input_F bits 0 x) flagIdx flag)
    = update (adder_input_F bits (if flag then N else 0) x) flagIdx flag
**Key intermediate theorem.** Applying `prepareMaskedConstRead` to `update (adder_input_F bits 0 x) flagIdx flag` yields `update (adder_input_F bits (if flag then N else 0) x) flagIdx flag`; that is, the read register has been re-loaded with the **conditionally masked** constant `if flag then N else 0`.
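At the level of a single read bit, the masking reduces to a Boolean identity. The sketch below is a standalone illustration and does not use any FormalRV definitions; it writes XOR as `!=` on `Bool`, with `b` standing in for `N.testBit j`.

```lean
-- Starting from a zero read bit, XOR-ing in `flag && b` leaves `b` when the
-- flag is set and `false` otherwise: the bit of `if flag then N else 0`.
example (flag b : Bool) :
    (false != (flag && b)) = (if flag then b else false) := by
  cases flag <;> cases b <;> rfl

-- XOR-ing the same masked bit in twice returns the original read bit.
example (r flag b : Bool) :
    ((r != (flag && b)) != (flag && b)) = r := by
  cases r <;> cases flag <;> cases b <;> rfl
```

The second identity is consistent with the read register returning to zero once the masked constant is XOR-ed in a second time.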
theoremconditionalAddConstGate_target_decode
theorem conditionalAddConstGate_target_decode
    (bits N flagIdx x : Nat) (flag : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (h_flag : adder_n_qubits bits ≤ flagIdx) :
    gidney_target_val bits
      (Gate.applyNat (conditionalAddConstGate bits N flagIdx)
        (update (adder_input_F bits 0 x) flagIdx flag))
    = (x + (if flag then N else 0)) % 2^bits
**Conditional add-back target decode.** Applied to `update (adder_input_F bits 0 x) flagIdx flag` (read register zero, target register `x`, carry register zero, flag at `flagIdx`), the `conditionalAddConstGate` produces target register equal to `(x + (if flag then N else 0)) mod 2^bits`.
theoremGate.WellTyped.mono
theorem Gate.WellTyped.mono {dim dim' : Nat} {g : Gate}
    (h : Gate.WellTyped dim g) (h_le : dim ≤ dim') :
    Gate.WellTyped dim' g
**WellTyped monotonicity**: `WellTyped` is preserved under dimension enlargement. Generic helper, applies to any `Gate`.
theoremprepareMaskedConstRead_wellTyped
theorem prepareMaskedConstRead_wellTyped
    (bits N flagIdx : Nat) (h_flag : adder_n_qubits bits ≤ flagIdx) :
    Gate.WellTyped (flagIdx + 1) (prepareMaskedConstRead bits N flagIdx)
`prepareMaskedConstRead` is `WellTyped` in dimension `flagIdx + 1` whenever the flag is placed above the adder's working register.
theoremconditionalAddConstGate_read_restored
theorem conditionalAddConstGate_read_restored
    (bits N x flagIdx : Nat) (flag : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) :
    ∀ i, i < bits →
      Gate.applyNat (conditionalAddConstGate bits N flagIdx)
        (update (adder_input_F bits 0 x) flagIdx flag) (read_idx i)
      = false
**Deliverable A: read register restored.** After the full conditional add-back, every in-range read position is back to zero (the read register served only as scratch space during the underlying adder).
theoremconditionalAddConstGate_carries_cleared
theorem conditionalAddConstGate_carries_cleared
    (bits N x flagIdx : Nat) (flag : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) :
    ∀ i, i < bits →
      Gate.applyNat (conditionalAddConstGate bits N flagIdx)
        (update (adder_input_F bits 0 x) flagIdx flag) (carry_idx i)
      = false
**Deliverable B: carry register cleared.** Every in-range carry position is `false` after the full conditional add-back (carries are fully cleared by the inner patched Gidney adder, and the outer prep cascade touches no carry positions).
theoremconditionalAddConstGate_flag_preserved
theorem conditionalAddConstGate_flag_preserved
    (bits N x flagIdx : Nat) (flag : Bool)
    (hbits : 2 ≤ bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) :
    Gate.applyNat (conditionalAddConstGate bits N flagIdx)
      (update (adder_input_F bits 0 x) flagIdx flag) flagIdx = flag
**Deliverable C: flag preserved.** The flag bit at `flagIdx` survives the full conditional add-back unchanged. Follows from the adder commuting past the flag update (by `WellTyped` framing) and both preps preserving positions outside the read range.
theoremconditionalAddConstGate_wellTyped
theorem conditionalAddConstGate_wellTyped
    (bits N flagIdx : Nat) (hbits : 2 ≤ bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) :
    Gate.WellTyped (flagIdx + 1) (conditionalAddConstGate bits N flagIdx)
**Deliverable D: `WellTyped` at `flagIdx + 1`.** The whole conditional add-back gate is `WellTyped` in the enlarged dimension that includes the out-of-band flag bit.
theoremconditionalAddConstGate_clean
theorem conditionalAddConstGate_clean
    (bits N x flagIdx : Nat) (flag : Bool)
    (hbits : 2 ≤ bits) (hN : N < 2^bits) (hx : x < 2^bits)
    (hflagIdx : adder_n_qubits bits ≤ flagIdx) :
    Gate.WellTyped (flagIdx + 1) (conditionalAddConstGate bits N flagIdx)
    ∧ gidney_target_val bits
        (Gate.applyNat (conditionalAddConstGate bits N flagIdx)
          (update (adder_input_F bits 0 x) flagIdx flag))
      = (x + (if flag then N else 0)) % 2^bits
    ∧ (∀ i, i < bits →
        Gate.applyNat (conditionalAddConstGate bits N flagIdx)
          (update (adder_input_F bits 0 x) flagIdx flag) (read_idx i) = false)
**Deliverable E: bundled clean primitive.** The headline characterisation of `conditionalAddConstGate`: `WellTyped`, correct target decode, read register restored, carries cleared, flag preserved. This is the one theorem downstream consumers should call.
theoremprepareConstRead_preserves_outside
theorem prepareConstRead_preserves_outside
    (bits c : Nat) (f : Nat → Bool) (p : Nat)
    (h : ∀ i, i < bits → p ≠ read_idx i) :
    Gate.applyNat (prepareConstRead bits c) f p = f p
Outside the read register's `[0, bits)` window, `prepareConstRead` is the identity (so target, carry, and any extra ancillas are preserved).
theoremprepareConstRead_at_read_idx
theorem prepareConstRead_at_read_idx
    (bits c : Nat) (f : Nat → Bool) (j : Nat) (hj : j < bits) :
    Gate.applyNat (prepareConstRead bits c) f (read_idx j) =
    xor (f (read_idx j)) (c.testBit j)
At `read_idx j` (for `j < bits`), `prepareConstRead` XORs the value with `c.testBit j`.
theoremprepareConstRead_yields_input_F
theorem prepareConstRead_yields_input_F
    (bits c x : Nat) :
    Gate.applyNat (prepareConstRead bits c) (adder_input_F bits 0 x)
    = adder_input_F bits c x
`prepareConstRead bits c` applied to `adder_input_F bits 0 x` produces exactly `adder_input_F bits c x` — i.e., the read register has been loaded with the bits of `c`.
theoremprepareConstRead_wellTyped
theorem prepareConstRead_wellTyped
    (bits c : Nat) :
    Gate.WellTyped (adder_n_qubits bits) (prepareConstRead bits c)
`prepareConstRead bits c` is WellTyped at the adder's natural dimension `adder_n_qubits bits = 3*bits + 2`.
theoremaddConstGate_clean
theorem addConstGate_clean
    (bits c x : Nat) (hbits : 2 ≤ bits) (hc : c < 2^bits) (hx : x < 2^bits) :
    Gate.WellTyped (adder_n_qubits bits) (addConstGate bits c)
    ∧ gidney_target_val bits
        (Gate.applyNat (addConstGate bits c) (adder_input_F bits 0 x))
      = (x + c) % 2^bits
    ∧ (∀ i, i < bits →
        Gate.applyNat (addConstGate bits c) (adder_input_F bits 0 x) (read_idx i) = false)
    ∧ (∀ i, i < bits →
        Gate.applyNat (addConstGate bits c) (adder_input_F bits 0 x) (carry_idx i) = false)
**Bundled clean primitive** for `addConstGate`. Takes a clean `adder_input_F bits 0 x` and produces: `WellTyped` at the natural dimension `adder_n_qubits bits`; target decodes to `(x + c) mod 2^bits`; read register restored to zero; carries cleared.
theoremsubConstGate_clean
theorem subConstGate_clean
    (bits N x : Nat) (hbits : 2 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hx : x < 2^bits) :
    Gate.WellTyped (adder_n_qubits bits) (subConstGate bits N)
    ∧ gidney_target_val bits
        (Gate.applyNat (subConstGate bits N) (adder_input_F bits 0 x))
      = subConstPow2Spec bits N x
    ∧ (∀ i, i < bits →
        Gate.applyNat (subConstGate bits N) (adder_input_F bits 0 x) (read_idx i) = false)
    ∧ (∀ i, i < bits →
        Gate.applyNat (subConstGate bits N) (adder_input_F bits 0 x) (carry_idx i) = false)
**Bundled clean primitive** for `subConstGate`. Follows directly from `addConstGate_clean` with `c = 2^bits - N`.
theoremsubConstPow2WideSpec_high_bit_bounded_sum_of_le
theorem subConstPow2WideSpec_high_bit_bounded_sum_of_le
    (bits N s : Nat) (hN : N ≤ 2^bits) (hle : N ≤ s) (hs : s < 2 * N) :
    (subConstPow2WideSpec bits N s).testBit bits = false
Generalized no-underflow high-bit lemma. When `N ≤ s` and `s < 2*N`, the widened result equals `s - N`, which fits in `bits` bits, so bit `bits` is `false`. Drops the `s < 2^bits` assumption of `subConstPow2WideSpec_high_bit_of_le`.
theoremsubConstPow2WideSpec_high_bit_bounded_sum_of_lt
theorem subConstPow2WideSpec_high_bit_bounded_sum_of_lt
    (bits N s : Nat) (hN : N ≤ 2^bits) (hlt : s < N) :
    (subConstPow2WideSpec bits N s).testBit bits = true
Generalized underflow high-bit lemma for `s < N` and `N ≤ 2^bits`. Identical to `subConstPow2WideSpec_high_bit_of_lt`, restated here as a named entry point for the post-add-step comparison flag.
theoremsubConstPow2WideSpec_high_bit_bounded_sum
theorem subConstPow2WideSpec_high_bit_bounded_sum
    (bits N s : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hs : s < 2 * N) :
    (subConstPow2WideSpec bits N s).testBit bits = decide (s < N)
**Generalized main high-bit theorem** for the widened subtraction under `s < 2*N`. After the first add-step of the modular-adder pipeline, the intermediate sum is bounded by `2*N` (not `2^bits`), yet the widened subtraction's high bit still equals `decide (s < N)`.
theorempatched_adder_sub_const_underflow_flag_bounded_sum
theorem patched_adder_sub_const_underflow_flag_bounded_sum
    (bits N s : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hs : s < 2 * N) :
    Gate.applyNat
      (gidney_adder_full_faithful_no_measurement_patched (bits + 1))
      (adder_input_F (bits + 1) (2^(bits + 1) - N) s)
      (target_idx bits)
    = decide (s < N)
**Generalized gate-level underflow flag.** After the first add-step of a modular adder, the intermediate sum `s` may have `s ≥ 2^bits` but always satisfies `s < 2*N`. The widened patched Gidney adder's target bit at position `bits` is exactly `decide (s < N)` under this weaker bound.
theoremmodAdd_sum_bound
theorem modAdd_sum_bound
    (bits N x c : Nat) (hN : N ≤ 2^bits) (hx : x < N) (hc : c < N) :
    x + c < 2^(bits + 1)
After widened add, the sum fits in `bits + 1` bits.
theoremmodAdd_sum_lt_twoN
theorem modAdd_sum_lt_twoN
    (N x c : Nat) (hx : x < N) (hc : c < N) :
    x + c < 2 * N
After widened add, the sum is bounded by `2N` (the tighter bound needed by the generalized underflow theorem).
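Both bounds are elementary. As a standalone illustration that does not depend on any FormalRV definitions, the `2N` bound follows by linear arithmetic, and a concrete instance shows why the widened register is needed: the sum can exceed `2^bits` while staying below `2 * N` and `2^(bits+1)`.

```lean
-- Two values below N sum to less than 2 * N.
example (N x c : Nat) (hx : x < N) (hc : c < N) : x + c < 2 * N := by omega

-- bits = 3, N = 7, x = 6, c = 5: the sum 11 is at least 2^3 = 8,
-- yet it stays below 2 * N = 14 and below 2^(3+1) = 16.
example : 2^3 ≤ 6 + 5 ∧ 6 + 5 < 2 * 7 ∧ 6 + 5 < 2^(3+1) := by decide
```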
theoremmodAddConstArithmeticSpec_correct
theorem modAddConstArithmeticSpec_correct
    (bits N c x : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N) (hc : c < N) :
    modAddConstArithmeticSpec bits N c x % 2^bits = (x + c) % N
**Widened modular-add pipeline correctness** (arithmetic level). For `0 < N ≤ 2^bits` and `x, c < N`, the low `bits` bits of the widened pipeline result equal `(x + c) mod N`.
theoremmodAddConstArithmeticSpec_low_bit_correct
theorem modAddConstArithmeticSpec_low_bit_correct
    (bits N c x i : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (hx : x < N) (hc : c < N) (hi : i < bits) :
    (modAddConstArithmeticSpec bits N c x).testBit i
    = ((x + c) % N).testBit i
Bit-level form of `modAddConstArithmeticSpec_correct`: bit `i` of the pipeline result (for `i < bits`) equals bit `i` of `(x + c) % N`.
theoremmodAdd_step1_target_decode
theorem modAdd_step1_target_decode
    (bits N c x : Nat) (hbits : 1 ≤ bits) (hN : N ≤ 2^bits)
    (hx : x < N) (hc : c < N) :
    gidney_target_val (bits+1)
      (Gate.applyNat (addConstGate (bits+1) c) (adder_input_F (bits+1) 0 x))
    = x + c
**Step 1: first add.** Applied to a clean `adder_input_F (bits+1) 0 x`, `addConstGate (bits+1) c` decodes its target register to `x + c` (no overflow, since `x + c < 2^(bits+1)`).
theoremmodAdd_step2_flag_at_target_idx_bits
theorem modAdd_step2_flag_at_target_idx_bits
    (bits N s : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hs : s < 2 * N) :
    Gate.applyNat (addConstGate (bits+1) (2^(bits+1) - N))
      (adder_input_F (bits+1) 0 s) (target_idx bits)
    = decide (s < N)
**Step 2: subtract `N`, observe comparison flag at `target_idx bits`.** Applied to an *idealized* `adder_input_F (bits+1) 0 s` (i.e., target holds `s` and read/carry are zero), `addConstGate (bits+1) (2^(bits+1) - N)` makes the bit at `target_idx bits` equal `decide (s < N)`.
theoremmodAdd_step3_target_decode
theorem modAdd_step3_target_decode
    (bits N flagIdx y : Nat) (flag : Bool)
    (hbits : 1 ≤ bits) (hN : N < 2^(bits+1)) (hy : y < 2^(bits+1))
    (hflagIdx : adder_n_qubits (bits+1) ≤ flagIdx) :
    gidney_target_val (bits+1)
      (Gate.applyNat (conditionalAddConstGate (bits+1) N flagIdx)
        (update (adder_input_F (bits+1) 0 y) flagIdx flag))
    = (y + (if flag then N else 0)) % 2^(bits+1)
**Step 3: conditional add-back.** Applied to the idealized `update (adder_input_F (bits+1) 0 y) flagIdx flag` (target holds `y`, read/carry zero, flag bit at out-of-band `flagIdx`), the `conditionalAddConstGate (bits+1) N flagIdx` decodes target to `(y + (if flag then N else 0)) mod 2^(bits+1)`, which is exactly the `modAddConstArithmeticSpec` value when `y = subConstPow2WideSpec bits N s` and `flag = decide (s < N)`.
theoremaddConstGate_target_bit
theorem addConstGate_target_bit
    (bits c x i : Nat) (hbits : 2 ≤ bits) (hc : c < 2^bits) (hx : x < 2^bits)
    (hi : i < bits) :
    Gate.applyNat (addConstGate bits c) (adder_input_F bits 0 x) (target_idx i)
    = ((x + c) % 2^bits).testBit i
Bit-level form of `addConstGate_clean`'s target-decode line: applied to `adder_input_F bits 0 x`, the gate's value at `target_idx i` (for `i < bits`) equals bit `i` of `(x + c) % 2^bits`.
theoremaddConstGate_target_bit_no_overflow
theorem addConstGate_target_bit_no_overflow
    (bits N c x i : Nat) (hbits : 1 ≤ bits) (hN : N ≤ 2^bits)
    (hx : x < N) (hc : c < N) (hi : i < bits + 1) :
    Gate.applyNat (addConstGate (bits + 1) c) (adder_input_F (bits + 1) 0 x) (target_idx i)
    = (x + c).testBit i
No-overflow corollary for widened addition. When `x, c < N ≤ 2^bits`, the widened sum `x + c` fits in `bits + 1` bits, so bit `i` of the target is `(x + c).testBit i` (no mod needed).
theoremaddConstGate_modAdd_step1_state_normal
theorem addConstGate_modAdd_step1_state_normal
    (bits N c x : Nat) (hbits : 1 ≤ bits) (hN : N ≤ 2^bits)
    (hx : x < N) (hc : c < N) :
    (∀ i, i < bits + 1 →
      Gate.applyNat (addConstGate (bits + 1) c) (adder_input_F (bits + 1) 0 x) (target_idx i)
      = (x + c).testBit i)
    ∧ (∀ i, i < bits + 1 →
      Gate.applyNat (addConstGate (bits + 1) c) (adder_input_F (bits + 1) 0 x) (read_idx i)
      = false)
    ∧ (∀ i, i < bits + 1 →
      Gate.applyNat (addConstGate (bits + 1) c) (adder_input_F (bits + 1) 0 x) (carry_idx i)
      = false)
After step 1, the read register is zero, carries are cleared, and target bits 0..bits encode `(x + c)` (no overflow under `x, c < N`). This is the WEAK normal-form: it does NOT claim function equality at positions outside the working range.
theoremsubConstGate_modAdd_step2_state_normal
theorem subConstGate_modAdd_step2_state_normal
    (bits N s : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hs : s < 2 * N) :
    (∀ i, i < bits + 1 →
      Gate.applyNat (subConstGate (bits + 1) N) (adder_input_F (bits + 1) 0 s) (target_idx i)
      = (subConstPow2WideSpec bits N s).testBit i)
    ∧ Gate.applyNat (subConstGate (bits + 1) N) (adder_input_F (bits + 1) 0 s) (target_idx bits)
      = decide (s < N)
    ∧ (∀ i, i < bits + 1 →
      Gate.applyNat (subConstGate (bits + 1) N) (adder_input_F (bits + 1) 0 s) (read_idx i)
      = false)
    ∧ (∀ i, i < bits + 1 →
Weak normal-form for step 2. Same caveat as step 1: working positions only.
theoremcopyTargetHighBitToFlag_correct
theorem copyTargetHighBitToFlag_correct
    (bits flagIdx : Nat) (f : Nat → Bool) (h_init : f flagIdx = false) :
    Gate.applyNat (copyTargetHighBitToFlag bits flagIdx) f flagIdx
    = f (target_idx bits)
Correctness: when the flag bit is initially `false`, the gate sets it to the value of `target_idx bits`.
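The statement has the shape of a single CX from `target_idx bits` into the flag. As a sketch under that assumption (the actual definition of `copyTargetHighBitToFlag` is not shown here), the classical action can be modelled with hypothetical helpers `upd` and `cxModel`:

```lean
-- Hypothetical classical-state helpers; not FormalRV definitions.
def upd (f : Nat → Bool) (i : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = i then v else f j

-- CX as a classical update: the target bit is XOR-ed with the control bit.
def cxModel (ctrl tgt : Nat) (f : Nat → Bool) : Nat → Bool :=
  upd f tgt (f tgt != f ctrl)

-- State with only position 5 set; position 9 plays the role of a clean flag.
def st : Nat → Bool := fun j => j == 5

example : cxModel 5 9 st 9 = st 5 := by decide  -- flag receives the control bit (1)
example : cxModel 4 9 st 9 = st 4 := by decide  -- same when the control bit is 0
example : cxModel 5 9 st 5 = st 5 := by decide  -- the control position is unchanged
```

The hypothesis `f flagIdx = false` matters: with a dirty flag the same update would store the XOR of the two bits rather than a copy.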
theoremcopyTargetHighBitToFlag_preserves_working
theorem copyTargetHighBitToFlag_preserves_working
    (bits flagIdx : Nat) (f : Nat → Bool) (p : Nat)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx)
    (h_p_lt : p < adder_n_qubits (bits + 1)) :
    Gate.applyNat (copyTargetHighBitToFlag bits flagIdx) f p = f p
Frame: when `flagIdx` is out-of-band (`flagIdx ≥ adder_n_qubits (bits+1)`), the flag-copy gate preserves all positions strictly inside the working dimension.
theoremcopyTargetHighBitToFlag_wellTyped
theorem copyTargetHighBitToFlag_wellTyped
    (bits flagIdx : Nat)
    (hflagIdx : adder_n_qubits (bits + 1) ≤ flagIdx) :
    Gate.WellTyped (flagIdx + 1) (copyTargetHighBitToFlag bits flagIdx)
WellTyped at the enlarged dimension `flagIdx + 1`.
theoremgidney_adder_bit_step_faithful_first_preserves_above
theorem gidney_adder_bit_step_faithful_first_preserves_above
    (f : Nat → Bool) (p : Nat) (hp : 5 ≤ p) :
    Gate.applyNat gidney_adder_bit_step_faithful_first f p = f p
theoremgidney_adder_bit_step_faithful_interior_preserves_above
theorem gidney_adder_bit_step_faithful_interior_preserves_above
    (i : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * i + 5 ≤ p) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior i) f p = f p
theoremgidney_adder_bit_step_faithful_last_preserves_above
theorem gidney_adder_bit_step_faithful_last_preserves_above
    (i : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * i + 3 ≤ p) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last i) f p = f p
theoremgidney_adder_bit_step_faithful_first_reverse_preserves_above
theorem gidney_adder_bit_step_faithful_first_reverse_preserves_above
    (f : Nat → Bool) (p : Nat) (hp : 5 ≤ p) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse f p = f p
theoremgidney_adder_bit_step_faithful_interior_reverse_preserves_above
theorem gidney_adder_bit_step_faithful_interior_reverse_preserves_above
    (i : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * i + 5 ≤ p) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse i) f p = f p
theoremgidney_adder_bit_step_faithful_last_reverse_preserves_above
theorem gidney_adder_bit_step_faithful_last_reverse_preserves_above
    (i : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * i + 3 ≤ p) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse i) f p = f p
theoremgidney_adder_bit_step_faithful_first_reverse_patched_preserves_above
theorem gidney_adder_bit_step_faithful_first_reverse_patched_preserves_above
    (f : Nat → Bool) (p : Nat) (hp : 5 ≤ p) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse_patched f p = f p
theoremgidney_adder_bit_step_faithful_interior_reverse_patched_preserves_above
theorem gidney_adder_bit_step_faithful_interior_reverse_patched_preserves_above
    (i : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * i + 5 ≤ p) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse_patched i) f p = f p
theoremgidney_adder_bit_step_faithful_last_reverse_patched_preserves_above
theorem gidney_adder_bit_step_faithful_last_reverse_patched_preserves_above
    (i : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * i + 3 ≤ p) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse_patched i) f p = f p
theoremgidney_adder_forward_with_propagation_preserves_above
theorem gidney_adder_forward_with_propagation_preserves_above
    (k : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * k + 2 ≤ p) :
    Gate.applyNat (gidney_adder_forward_with_propagation k) f p = f p
`forward_with_propagation k` preserves positions `≥ 3 * k + 2`.
theoremgidney_adder_forward_with_propagation_reverse_patched_preserves_above
theorem gidney_adder_forward_with_propagation_reverse_patched_preserves_above
    (k : Nat) (f : Nat → Bool) (p : Nat) (hp : 3 * k + 2 ≤ p) :
    Gate.applyNat (gidney_adder_forward_with_propagation_reverse_patched k) f p = f p
`forward_with_propagation_reverse_patched k` preserves positions `≥ 3 * k + 2`.

FormalRV.Arithmetic.ModularAdder.ModularAdderSwapSemantics

FormalRV/Arithmetic/ModularAdder/ModularAdderSwapSemantics.lean
### Tick 18 — SWAP semantics on `mult_input_F`.
theoremmult_target_swap_on_mult_input_F
theorem mult_target_swap_on_mult_input_F
    (bits multBits x m : Nat)
    (h_multBits_le : multBits ≤ bits + 1)
    (hx : x < 2^multBits) (hm : m < 2^multBits) :
    Gate.applyNat (mult_target_swap bits multBits)
                  (mult_input_F bits multBits x m)
    = mult_input_F bits multBits m x
**HEADLINE: SWAP exchanges multiplier-register and target-register values on `mult_input_F`.** Applied to `mult_input_F bits multBits x m` (multiplier holds `m`, target holds `x`), the multiplier-target SWAP produces `mult_input_F bits multBits m x` (multiplier holds `x`, target holds `m`). Requires `multBits ≤ bits + 1` (multiplier no wider than adder) and `x, m < 2^multBits` (so they fit in the multBits-wide register and have no high bits leaking into unswapped positions).
theoremmodMultInPlace_correct
theorem modMultInPlace_correct
    (bits N a ainv multBits x : Nat)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits)
    (h_multBits_le : multBits ≤ bits + 1)
    (h_N_le_pow_multBits : N ≤ 2^multBits)
    (ha_pos : 0 < a) (ha_lt : a < N)
    (hainv_pos : 0 < ainv) (hainv_lt : ainv < N)
    (h_inv : a * ainv % N = 1)
    (hx_lt : x < N)
    (h_const_pos_a : ∀ j, j < multBits → 0 < (a * 2^j) % N)
    (h_const_pos_inv : ∀ j, j < multBits → 0 < ((N - ainv) % N * 2^j) % N) :
    Gate.applyNat (modMultInPlace bits N a ainv multBits)
**HEADLINE: `modMultInPlace` is a correct in-place modular multiplier.** Applied to `mult_state_init bits multBits x` (multiplier register holds `x`, adder zeroed), the gate produces `mult_input_F bits multBits 0 ((a * x) % N)`: the multiplier register now holds the result `a*x mod N` and the adder is zeroed.
Hypotheses:
- Structural: `1 ≤ bits`, `multBits ≤ bits + 1`, `N ≤ 2^multBits`.
- Modular: `0 < N`, `N ≤ 2^bits`, `0 < a < N`, `0 < ainv < N`, `a * ainv ≡ 1 (mod N)`.
- Input: `x < N`.
- Non-zero per-bit constants: for every `j < multBits`, both `(a * 2^j) % N` and `((N - ainv) % N * 2^j) % N` are non-zero, as required by the `modMultConstGate_correct` invocations.
theoremreverse_register_swap_aux_succ
theorem reverse_register_swap_aux_succ
    (n offsetA offsetB k : Nat) :
    reverse_register_swap_aux n offsetA offsetB (k + 1)
    = Gate.seq (reverse_register_swap_aux n offsetA offsetB k)
               (qubit_swap (offsetA + k) (offsetB + (n - 1 - k)))
Recursion unfolding for `reverse_register_swap_aux`.
theoremreverse_register_swap_aux_wellTyped
theorem reverse_register_swap_aux_wellTyped
    (dim n offsetA offsetB k : Nat) (hdim : 0 < dim)
    (hA : offsetA + n ≤ dim) (hB : offsetB + n ≤ dim)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA)
    (hk : k ≤ n) :
    Gate.WellTyped dim (reverse_register_swap_aux n offsetA offsetB k)
**WellTyped for `reverse_register_swap_aux`.** Disjoint ranges suffice.
theoremreverse_register_swap_wellTyped
theorem reverse_register_swap_wellTyped
    (dim n offsetA offsetB : Nat) (hdim : 0 < dim)
    (hA : offsetA + n ≤ dim) (hB : offsetB + n ≤ dim)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA) :
    Gate.WellTyped dim (reverse_register_swap n offsetA offsetB)
**WellTyped for `reverse_register_swap`.**
theoremreverse_register_swap_aux_at_other
theorem reverse_register_swap_aux_at_other
    (n offsetA offsetB k : Nat) (f : Nat → Bool) (q : Nat)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA)
    (hk : k ≤ n)
    (h_outside : ∀ i, i < k →
      q ≠ offsetA + i ∧ q ≠ offsetB + (n - 1 - i)) :
    Gate.applyNat (reverse_register_swap_aux n offsetA offsetB k) f q = f q
**Correctness at "other" positions** of `reverse_register_swap_aux`. At positions outside both `[offsetA, offsetA + k)` and `[offsetB + n - k, offsetB + n)` (the touched range up to iteration `k`), the gate is identity.
theoremreverse_register_swap_aux_at_A
theorem reverse_register_swap_aux_at_A
    (n offsetA offsetB k : Nat) (f : Nat → Bool) (j : Nat) (hj : j < k)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA)
    (hk : k ≤ n) :
    Gate.applyNat (reverse_register_swap_aux n offsetA offsetB k) f
      (offsetA + j)
    = f (offsetB + (n - 1 - j))
**At A-side position**: at `offsetA + j` (j < k), the gate returns `f (offsetB + (n - 1 - j))`. The reversed-pairing semantics.
theoremreverse_register_swap_aux_at_B
theorem reverse_register_swap_aux_at_B
    (n offsetA offsetB k : Nat) (f : Nat → Bool) (j : Nat) (hj : j < k)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA)
    (hk : k ≤ n) :
    Gate.applyNat (reverse_register_swap_aux n offsetA offsetB k) f
      (offsetB + (n - 1 - j))
    = f (offsetA + j)
**At B-side position (reversed)**: at `offsetB + (n - 1 - j)` (j < k), the gate returns `f (offsetA + j)`. The dual of `_at_A`.
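The reversed pairing is easy to see on a small classical model. The sketch below follows the recursion stated in `reverse_register_swap_aux_succ` (apply the first `k` swaps, then swap `offsetA + k` with `offsetB + (n - 1 - k)`); `swapAt`, `revSwapAux` and `onlyA0` are hypothetical names, and a swap is modelled directly as an exchange of two positions.

```lean
-- Hypothetical model of a qubit swap on a classical state.
def swapAt (p q : Nat) (f : Nat → Bool) : Nat → Bool :=
  fun j => if j = p then f q else if j = q then f p else f j

-- First k swaps of the reversed register swap, mirroring the `_succ` unfolding.
def revSwapAux (n offA offB : Nat) : Nat → (Nat → Bool) → Nat → Bool
  | 0, f => f
  | k + 1, f => swapAt (offA + k) (offB + (n - 1 - k)) (revSwapAux n offA offB k f)

-- Registers A = [0, 3) and B = [3, 6); only A-side bit 0 is set.
def onlyA0 : Nat → Bool := fun j => j == 0

example : revSwapAux 3 0 3 3 onlyA0 5 = true := by decide   -- A[0] lands at B[2]
example : revSwapAux 3 0 3 3 onlyA0 0 = false := by decide  -- A[0] now holds old B[2]
example : revSwapAux 3 0 3 3 onlyA0 7 = onlyA0 7 := by decide -- outside both ranges
```

The three witnesses correspond to the `_at_B`, `_at_A` and `_at_other` lemmas respectively, with `j = 0` and `k = n = 3`.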

FormalRV.Arithmetic.RCIR

FormalRV/Arithmetic/RCIR.lean
FormalRV.BQAlgo.RCIR — backward-compat shim. The IR `RCIRGate` and its `tcount` originally lived here; they have been promoted to the `Framework` layer (see `Framework/Gate.lean` and `Framework/Semantics.lean`) so that BQ-Arch and BQ-Code modules can also reason about gates / circuits semantically. This file just re-exports `Gate` under the legacy name `RCIRGate`. New code should `import FormalRV.Core.Gate` directly and use `Gate`.
abbrevRCIRGate
abbrev RCIRGate
Legacy alias — use `FormalRV.Framework.Gate` for new code.

FormalRV.Arithmetic.RippleCarryAdder

FormalRV/Arithmetic/RippleCarryAdder.lean
(no documented top-level declarations)

FormalRV.Arithmetic.RippleCarryAdder.RippleCarryAdderDecideWitnesses

FormalRV/Arithmetic/RippleCarryAdder/RippleCarryAdderDecideWitnesses.lean
example(example)
example :
    Gidney.post_last_bit_invariant 2 1 0
      (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 0))
**Decide-witness on (n=2, a=1, b=0)** (Iter 187). No-carry case.
example(example)
example :
    Gidney.post_last_bit_invariant 3 3 1
      (gidney_forward_faithful_full_post_state 3 (adder_input_F 3 3 1))
**Decide-witness on (n=3, a=3, b=1)** (Iter 187). Multi-bit carry.
theoremGidney.post_last_bit_invariant_holds
theorem Gidney.post_last_bit_invariant_holds (n a b : Nat)
    (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    Gidney.post_last_bit_invariant n a b
      (gidney_forward_faithful_full_post_state n (adder_input_F n a b))
**Parametric `post_last_bit_invariant_holds`** (Iter 188, 2026-05-13). For any n ≥ 2 with valid bounds, applying the full forward cascade to `adder_input_F n a b` produces a state satisfying `Gidney.post_last_bit_invariant`.
Proof strategy:
- Destructure n = m+2 and unfold via the recursive def's third clause to `gidney_last_bit_post_state (m+1) ∘ gidney_propagation_post_state (m+1)`.
- Apply Iter 179's `propagation_step_invariant_holds (m+1)` for the inner state and extract the 4 facts at positions {c_m, c_{m+1}, r_{m+1}, t_{m+1}}.
- Apply Iter 171's `gidney_last_bit_preserves` to get post(c_{m+1}) = c_{m+2}.
- For each j and each conjunct, split on the j = m+1 carry case (use preserves) versus the frame case (use Iter 173's last-bit frame plus the propagation invariant clause, which always reduces to the propagated branch since j ≤ m+1 for all j < m+2).
example(example)
example :
    Gidney.post_forward_final_cx_invariant 2 1 1
      (gidney_final_cx_cascade_post_state 2
        (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 1)))
**Decide-witness for the post-forward-final-CX invariant on (n=2, a=1, b=1)** (Iter 183). Validates the invariant on the instance where the original `TODO_gidney_classical_action` fails (per the Iter 182 counterexample), confirming that the invariant matches the actual classical action.
example(example)
example :
    Gidney.post_forward_final_cx_invariant 2 1 0
      (gidney_final_cx_cascade_post_state 2
        (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 0)))
**Decide-witness on (n=2, a=1, b=0)** (Iter 183). The case where no carry is generated (c_1 = 0), so target_1 = a_1 ⊕ b_1 = 0 happens to equal sum_1 = 0.
example(example)
example :
    Gidney.post_forward_final_cx_invariant 3 3 1
      (gidney_final_cx_cascade_post_state 3
        (gidney_forward_faithful_full_post_state 3 (adder_input_F 3 3 1)))
**Decide-witness on (n=3, a=3, b=1)** (Iter 183). Multi-bit carry propagation: 3 + 1 = 4, which is 100 in binary. The invariant predicts target_0 = a_0 ⊕ b_0 = 0, target_1 = a_1 ⊕ b_1 = 1, target_2 = a_2 ⊕ b_2 = 0. The sum bits, least significant first, are 0, 0, 1. So target_1 differs from sum_1 (1 vs 0), and target_2 differs from sum_2 (0 vs 1). The invariant correctly captures the actual post-state.
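The bit values quoted for this instance can be reproduced with ordinary arithmetic. In this sketch `bitOf` is a hypothetical helper that mirrors `Nat.testBit` using division and remainder; it says nothing about the circuit itself, only about the numbers 3, 1 and 4.

```lean
-- Hypothetical helper: bit i of x, computed with plain Nat arithmetic.
def bitOf (x i : Nat) : Bool := decide (x / 2 ^ i % 2 = 1)

-- a = 3, b = 1: the XOR bits a_j ⊕ b_j are 0, 1, 0 for j = 0, 1, 2.
example : (bitOf 3 0 != bitOf 1 0) = false := by decide
example : (bitOf 3 1 != bitOf 1 1) = true := by decide
example : (bitOf 3 2 != bitOf 1 2) = false := by decide

-- The bits of 3 + 1 = 4 are 0, 0, 1 for j = 0, 1, 2.
example : bitOf (3 + 1) 0 = false := by decide
example : bitOf (3 + 1) 1 = false := by decide
example : bitOf (3 + 1) 2 = true := by decide
```

Positions 1 and 2 are exactly where the two sequences disagree, matching the docstring.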
theoremgidney_final_cx_cascade_preserves_carry
theorem gidney_final_cx_cascade_preserves_carry
    (n k : Nat) (f : Nat → Bool) :
    gidney_final_cx_cascade_post_state n f (carry_idx k) = f (carry_idx k)
**Frame condition: final-CX cascade preserves carry positions.** For any depth n and any k, the cascade doesn't touch carry_k.
theoremgidney_final_cx_cascade_preserves_read
theorem gidney_final_cx_cascade_preserves_read
    (n k : Nat) (f : Nat → Bool) :
    gidney_final_cx_cascade_post_state n f (read_idx k) = f (read_idx k)
**Frame condition: final-CX cascade preserves read positions.** For any depth n and any k, the cascade doesn't touch read_k.
theoremgidney_final_cx_cascade_target_outside
theorem gidney_final_cx_cascade_target_outside
    (n j : Nat) (hj : n ≤ j) (f : Nat → Bool) :
    gidney_final_cx_cascade_post_state n f (target_idx j) = f (target_idx j)
**Frame condition: final-CX cascade preserves target_j for j ≥ n.** Target positions at or above the cascade depth are untouched.
theoremgidney_final_cx_cascade_target_action
theorem gidney_final_cx_cascade_target_action
    (n j : Nat) (hj : j < n) (f : Nat → Bool) :
    gidney_final_cx_cascade_post_state n f (target_idx j)
      = xor (f (target_idx j)) (f (read_idx j))
**Action of final-CX cascade on target_j for j < n**: the post-state XORs the input's read_j into target_j.
theoremGidney.post_forward_final_cx_invariant_holds
theorem Gidney.post_forward_final_cx_invariant_holds (n a b : Nat)
    (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    Gidney.post_forward_final_cx_invariant n a b
      (gidney_final_cx_cascade_post_state n
        (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
**Parametric `post_forward_final_cx_invariant_holds`** (Iter 189, 2026-05-13). For any n ≥ 2 with valid bounds, applying `gidney_final_cx_cascade_post_state n` to the post-forward state `gidney_forward_faithful_full_post_state n (adder_input_F n a b)` yields a state satisfying `Gidney.post_forward_final_cx_invariant`.
**This is THE parametric provable end-state theorem at the forward + final-CX layer**, per Iter 182's review finding. Composes Iter 188's `post_last_bit_invariant_holds` with Iter 184's 4 final-CX structural lemmas:
- **carry_j**: `final_cx_cascade_preserves_carry` + Iter 188 → `c_{j+1}`. ✓
- **read_j**: `final_cx_cascade_preserves_read` + Iter 188 → `a_j ⊕ c_j`. ✓
- **target_j**: `final_cx_cascade_target_action` (j < n) → `f(t_j) ⊕ f(r_j)`. From Iter 188: `f(t_j) = b_j ⊕ c_j`, `f(r_j) = a_j ⊕ c_j`. So target_j post-CX = `(b_j ⊕ c_j) ⊕ (a_j ⊕ c_j) = a_j ⊕ b_j`. The c_j contributions cancel; this is exactly Iter 182's review finding made parametric. ✓
The remaining gap to the headline `gidney_classical_action`: target_j is `a_j ⊕ b_j` here, but `sum_j = a_j ⊕ b_j ⊕ c_j`. The reverse cascade (separate, awaits Iter 191+ + John's QUESTIONS.md #1 approval) re-XORs c_j into target_j to produce sum_j.
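The cancellation used for target_j is a pure Boolean identity, so it can be checked by exhausting the eight cases. This is a standalone sketch; XOR on `Bool` is written as `!=` to avoid depending on any particular name for the XOR function.

```lean
-- (b ⊕ c) ⊕ (a ⊕ c) = a ⊕ b: the two carry contributions cancel.
example (a b c : Bool) : ((b != c) != (a != c)) = (a != b) := by
  cases a <;> cases b <;> cases c <;> rfl
```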
theoremgidney_classical_action_without_reverse_is_false
theorem gidney_classical_action_without_reverse_is_false :
    ¬ (∀ (n a b : Nat), 0 < n → a < 2^n → b < 2^n →
        ∀ i, i < n →
          gidney_final_cx_cascade_post_state n
            (gidney_forward_faithful_full_post_state n (adder_input_F n a b))
            (target_idx i)
          = adder_sum_bit_classical a b i)
**Phase A end-to-end review finding (negation, proven 2026-05-22)**: the conjecture *"the Gidney adder's forward + final-CX cascade alone (no reverse cascade) computes the classical sum"* is FALSE.
HISTORY: this slot used to hold a sorried theorem named `TODO_gidney_classical_action` asserting the (false) positive form. Iter 182 (2026-05-13) supplied a machine-checked counterexample at (n=2, a=1, b=1), proving that the positive form was unprovable as stated; see `gidney_classical_action_unprovable_at_1_plus_1` below. The corrected headline `gidney_classical_action_with_reverse` (proven at line ~5709) is the canonical semantic-correctness theorem. The honest record of the review finding lives here as a proven negation theorem (no sorry): the universally-quantified positive conjecture is impossible because it fails at the specific witness (n=2, a=1, b=1, i=1).
theoremgidney_classical_action_unprovable_at_1_plus_1
theorem gidney_classical_action_unprovable_at_1_plus_1 :
    ¬ (∀ i, i < 2 →
        gidney_final_cx_cascade_post_state 2
          (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 1))
          (target_idx i)
        = adder_sum_bit_classical 1 1 i)
**REVIEW FINDING (Iter 182, 2026-05-13)**: machine-checked counterexample establishing that `TODO_gidney_classical_action` is UNPROVABLE as currently stated. For the instance `(n=2, a=1, b=1)` (all hypotheses satisfied: `0 < 2`, `1 < 4`, `1 < 4`), the conclusion `∀ i, i < 2, forward+final-CX(target_i) = (a+b).testBit i` fails at `i=1`:
- Forward+final-CX on `adder_input_F 2 1 1` yields target_1 = 0 (decide-witnessed at lines ~2395-2404 via `inputF_1_plus_1`).
- `(1+1).testBit 1 = 2.testBit 1 = 1`.
- 0 ≠ 1. ∎
The forward + final-CX cascade produces `target_j = a_j ⊕ b_j` for `j ≥ 1` (the two `c_j` contributions from forward propagation cancel via the final-CX `t_j ⊕= r_j`). But the classical sum is `sum_j = a_j ⊕ b_j ⊕ c_j`, which is OFF by `c_j` whenever `c_j = 1`.
**The full Gidney adder requires the REVERSE cascade.** Its per-step `CX(c_{j-1}, t_j)` re-XORs `c_j` into target_j, fixing the gap. Hence the headline theorem should be:
```
gidney_forward_faithful_full_reverse_post_state n
  (gidney_final_cx_cascade_post_state n
    (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
  (target_idx i)
= adder_sum_bit_classical a b i
```
(i.e., forward + final-CX + REVERSE, applied left-to-right.) See QUESTIONS.md (entry 2026-05-13 #1) for the proposed theorem-statement fix awaiting John's approval.
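The arithmetic side of the counterexample can be replayed without the circuit. The sketch below takes the value `a_1 ⊕ b_1` for target_1 from the invariant described above (it does not evaluate the gates) and compares it with bit 1 of `1 + 1`; `bitOf` is a hypothetical helper mirroring `Nat.testBit`.

```lean
-- Hypothetical helper: bit i of x, computed with plain Nat arithmetic.
def bitOf (x i : Nat) : Bool := decide (x / 2 ^ i % 2 = 1)

-- Without the reverse cascade, target_1 holds a_1 ⊕ b_1 = 0 for a = b = 1.
example : (bitOf 1 1 != bitOf 1 1) = false := by decide

-- Bit 1 of the classical sum 1 + 1 = 2 is 1, so the two values differ.
example : bitOf (1 + 1) 1 = true := by decide

-- XOR-ing in the carry into bit 1 (a_0 ∧ b_0) closes the gap.
example :
    ((bitOf 1 1 != bitOf 1 1) != (bitOf 1 0 && bitOf 1 0)) = bitOf (1 + 1) 1 := by
  decide
```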
example(example)
example :
    let post
**Decide-witness for full reverse on (n=2, a=1, b=1)** (Iter 191). Confirms that applying the reverse cascade to the post-final-CX state of (1+1) restores `target_1 = 1 = sum_1`, fixing the Iter 182 counterexample. The reverse cascade DOES compute the sum bits; Iter 106's older comment was wrong.
example(example)
example :
    let post
**Decide-witness on (n=3, a=3, b=1)** (Iter 191). Multi-bit.
theoremgidney_interior_bit_reverse_post_state_in_bits
theorem gidney_interior_bit_reverse_post_state_in_bits
    (i : Nat) (hi : 0 < i) (f : Nat → Bool) :
    (gidney_interior_bit_reverse_post_state i f) (carry_idx i)
      = xor (xor (f (carry_idx i)) (f (carry_idx (i - 1))))
            (f (read_idx i) && f (target_idx i))
    ∧ (gidney_interior_bit_reverse_post_state i f) (read_idx (i + 1))
        = xor (f (read_idx (i + 1))) (f (carry_idx i))
    ∧ (gidney_interior_bit_reverse_post_state i f) (target_idx (i + 1))
        = xor (f (target_idx (i + 1))) (f (carry_idx i))
**Interior-bit reverse in-bits structural lemma (PROVEN, Iter 195, 2026-05-13)**. Analog of Iter 167's `gidney_interior_bit_post_state_in_bits` for the reverse direction. Captures the pure structural action of `gidney_interior_bit_reverse_post_state i` on an arbitrary input `f` (no input invariant assumed). Computed by walking the 4 chained updates of the def:
- **post(c_i)** = `(f(c_i) ⊕ f(c_{i-1})) ⊕ (f(r_i) ∧ f(t_i))`. Outermost update (gate 4: CCX undo) adds `(r_i ∧ t_i)` to the previous c_i value, which itself was modified by gate 3 (chain CX) to be `f(c_i) ⊕ f(c_{i-1})`.
- **post(r_{i+1})** = `f(r_{i+1}) ⊕ f(c_i)` (gate 2 propagates original c_i back through r_{i+1}).
- **post(t_{i+1})** = `f(t_{i+1}) ⊕ f(c_i)` (gate 1 propagates back through t_{i+1}).
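The four chained updates can be traced on a toy state. This sketch assumes the gate order given in the docstring (two CX gates out of c_i, then the chain CX, then the CCX undo) and passes the six positions explicitly, assumed pairwise distinct, rather than relying on the library's index layout; `upd`, `interiorReverseModel` and `st0` are hypothetical names.

```lean
-- Hypothetical classical-state update helper.
def upd (f : Nat → Bool) (i : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = i then v else f j

-- Hypothetical model of the interior reverse step on explicit positions.
def interiorReverseModel (cPrev c r t rNext tNext : Nat) (f : Nat → Bool) :
    Nat → Bool :=
  let f1 := upd f tNext (f tNext != f c)     -- gate 1: CX(c, tNext)
  let f2 := upd f1 rNext (f1 rNext != f1 c)  -- gate 2: CX(c, rNext)
  let f3 := upd f2 c (f2 c != f2 cPrev)      -- gate 3: chain CX(cPrev, c)
  upd f3 c (f3 c != (f3 r && f3 t))          -- gate 4: CCX(r, t, c)

-- Positions: cPrev = 0, c = 1, r = 2, t = 3, rNext = 4, tNext = 5.
-- Input state: cPrev, c, r are set; t, rNext, tNext are clear.
def st0 : Nat → Bool := fun j => decide (j < 3)

example : interiorReverseModel 0 1 2 3 4 5 st0 1 = false := by decide -- (1⊕1)⊕(1∧0)
example : interiorReverseModel 0 1 2 3 4 5 st0 4 = true := by decide  -- 0 ⊕ c
example : interiorReverseModel 0 1 2 3 4 5 st0 5 = true := by decide  -- 0 ⊕ c
```

Gates 1 and 2 read the original value at `c` because `c` is only modified from gate 3 onward, which is why the two propagated bits use `f(c_i)` rather than the updated carry.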
theoremgidney_last_bit_reverse_post_state_in_bits
theorem gidney_last_bit_reverse_post_state_in_bits
    (i : Nat) (hi : 0 < i) (f : Nat → Bool) :
    (gidney_last_bit_reverse_post_state i f) (carry_idx i)
      = xor (xor (f (carry_idx i)) (f (carry_idx (i - 1))))
            (f (read_idx i) && f (target_idx i))
**Last-bit reverse in-bits structural lemma (PROVEN, Iter 195, 2026-05-13)**. Analog of Iter 169's `gidney_last_bit_post_state_in_bits` for the reverse direction. The last-bit-reverse has only 2 gates (no propagation), so it only modifies `c_i`:
- **post(c_i)** = `(f(c_i) ⊕ f(c_{i-1})) ⊕ (f(r_i) ∧ f(t_i))`.
theoremgidney_first_bit_reverse_preserves
theorem gidney_first_bit_reverse_preserves
    (a b : Nat) (f : Nat → Bool)
    (h_r0 : f (read_idx 0) = a.testBit 0)
    (h_t0 : f (target_idx 0) = xor (a.testBit 0) (b.testBit 0))
    (h_c0 : f (carry_idx 0) = Adder.carry false 1 a.testBit b.testBit)
    (h_r1 : f (read_idx 1)
              = xor (a.testBit 1) (Adder.carry false 1 a.testBit b.testBit))
    (h_t1 : f (target_idx 1) = xor (a.testBit 1) (b.testBit 1)) :
    let post
**First-bit reverse classical-action lemma (PROVEN, Iter 193, 2026-05-13)**. Analog of Iter 165's `gidney_first_bit_preserves` for the reverse direction. Given a state `f` matching the post-forward-final-CX invariant at positions {r_0, t_0, c_0, r_1, t_1}, applying `gidney_first_bit_reverse_post_state` produces:
- **post(c_0) = a_0** (a "dirty carry": restored to a_0, NOT to false. This is consistent with Iter 106's older "dirty carries" observation in the file's reverse smoke tests.)
- **post(r_1) = a_1** (carry XOR'd out, restored to input).
- **post(t_1) = sum_1 = a_1 ⊕ b_1 ⊕ c_1**: the SUM BIT. The reverse cascade's first step XORs c_1 into target_1, completing the sum that the forward+final-CX had pending.
This is the CRITICAL semantic step that fixes the Iter 182 review finding: the reverse re-XORs the math carry (which the qubit c_0 holds post-forward) into target_1.
The dirty post(c_0) = a_0 calculation:
post(c_0) = c_1 ⊕ (r_0 ∧ t_0)
= (a_0 ∧ b_0) ⊕ (a_0 ∧ (a_0 ⊕ b_0))
= (a_0 ∧ b_0) ⊕ (a_0 ∧ ¬b_0)
= a_0 ∧ (b_0 ⊕ ¬b_0)
= a_0 ∧ true
= a_0. ∎
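The dirty-carry calculation reduces to a two-variable Boolean identity, which can be confirmed by case analysis. A standalone sketch, with XOR on `Bool` written as `!=`:

```lean
-- (a_0 ∧ b_0) ⊕ (a_0 ∧ (a_0 ⊕ b_0)) = a_0, checked on all four inputs.
example (a0 b0 : Bool) : ((a0 && b0) != (a0 && (a0 != b0))) = a0 := by
  cases a0 <;> cases b0 <;> rfl
```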
example(example)
example :
    let f
**Decide-witness for `gidney_first_bit_reverse_preserves` on (a=1, b=1)** (Iter 193). Validates that the lemma statement holds for the post-forward+final-CX state of the (1+1) instance.
example(example)
example :
    let f
**Decide-witness on (a=3, b=1) at n=3** (Iter 193). Multi-bit.
example(example)
example :
    Gidney.post_full_reverse_invariant 2 1 1
      (gidney_full_reverse_post_state 2
        (gidney_final_cx_cascade_post_state 2
          (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 1))))
**Decide-witness on (n=2, a=1, b=1)** (Iter 197). Validates the richer Iter 197 invariant on the Iter 182 counterexample case.
example(example)
example :
    Gidney.post_full_reverse_invariant 3 3 1
      (gidney_full_reverse_post_state 3
        (gidney_final_cx_cascade_post_state 3
          (gidney_forward_faithful_full_post_state 3 (adder_input_F 3 3 1))))
**Decide-witness on (n=3, a=3, b=1)** (Iter 197). Multi-bit.
example(example)
example :
    Gidney.reverse_step_invariant 2 2 1 1
      (gidney_full_reverse_post_state 2
        (gidney_final_cx_cascade_post_state 2
          (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 1))))
**Smoke decide-witness at k=n=2, (a,b) = (1,1)** (the Iter 182 counterexample case). When the step index equals the register width, the predicate covers every j and matches the witnessed `post_full_reverse_invariant` at line 4615.
theoremGidney.reverse_step_invariant_zero
theorem Gidney.reverse_step_invariant_zero (n a b : Nat) (post : Nat → Bool) :
    Gidney.reverse_step_invariant 0 n a b post
**Base case of the cascade induction**: when `k = 0`, the step-indexed invariant is vacuously true because the quantifier range `n - 0 ≤ j ∧ j < n` simplifies to `n ≤ j ∧ j < n`, which is unsatisfiable. No assumption on `post` is needed. This is the starting point for the inductive proof of `TODO_post_full_reverse_invariant_holds`: the parametric `reverse_step_invariant k n a b _` will be lifted from k=0 up to k=n via a `_succ` step that uses Iter 194 (first-bit reverse) + Iter 195 (interior `in_bits`) + Iter 201 (interior `computes_sum`) + the cascade-frame property.
theoremGidney.reverse_step_invariant_n_iff_post_full_reverse_invariant
theorem Gidney.reverse_step_invariant_n_iff_post_full_reverse_invariant
    (n a b : Nat) (post : Nat → Bool) :
    Gidney.reverse_step_invariant n n a b post ↔
      Gidney.post_full_reverse_invariant n a b post
**k=n bridge** to the original `Gidney.post_full_reverse_invariant`: when the step index equals the register width, the step-indexed predicate's quantifier range `n - n ≤ j ∧ j < n` simplifies to `0 ≤ j ∧ j < n`, which is the same range as the post-full-reverse invariant. This is the closing composition step for `TODO_post_full_reverse_invariant_holds`: once a `_succ` lemma lifts the predicate from k=0 up to k=n, this iff turns `reverse_step_invariant n _ _ _ _` into the goal.
theoremGidney.reverse_step_invariant_apply
theorem Gidney.reverse_step_invariant_apply
    (k n a b j : Nat) (post : Nat → Bool)
    (h_inv : Gidney.reverse_step_invariant k n a b post)
    (h_lo : n - k ≤ j) (h_hi : j < n) :
    post (target_idx j) = adder_sum_bit_classical a b j ∧
      post (read_idx j) = a.testBit j
**Specialization-at-j helper**: given the step-indexed predicate and witnesses that position `j` is in its quantifier range, extract the (target, read) correctness pair at `j`. A trivial 1-line application of the predicate; named for readability in downstream cascade-induction proofs that need to invoke the invariant at a specific position.
theoremGidney.reverse_step_invariant_weaken
theorem Gidney.reverse_step_invariant_weaken
    (k n a b : Nat) (post : Nat → Bool)
    (h : Gidney.reverse_step_invariant (k + 1) n a b post) :
    Gidney.reverse_step_invariant k n a b post
**Weakening**: a larger step index strengthens the invariant (covers more positions), so `inv_{k+1} → inv_k`. Useful when a cascade-induction proof has established the strong form and needs to extract a weaker one for a sub-case. Direct from the definition: every `j` with `n - k ≤ j` also satisfies `n - (k+1) ≤ j` (via `omega`), so the hypothesis at step `k+1` applies to each position that step `k` quantifies over.
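The base case, the k=n bridge and the weakening step all come down to linear facts about the quantifier range `n - k ≤ j ∧ j < n` with truncated subtraction on `Nat`. A sketch of those range facts alone, assuming a recent Lean 4 toolchain where `omega` is available in core (the invariant itself is not modelled):

```lean
-- k = 0: the range n - 0 ≤ j < n is empty, so the invariant is vacuous.
example (n j : Nat) (h1 : n - 0 ≤ j) (h2 : j < n) : False := by omega

-- k = n: the lower bound n - n ≤ j is always satisfied, leaving only j < n.
example (n j : Nat) : n - n ≤ j := by omega

-- Weakening: a position in the range for k is also in the range for k + 1.
example (n k j : Nat) (h : n - k ≤ j) : n - (k + 1) ≤ j := by omega
```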
--     lemma and then inducting on the propagation chain.
--   - The `_succ_via_step_property` engine still applies for each
--     propagation step (k=1 first instantiated via interior_reverse(n-2),
--     k=n-1 via first_bit_reverse). Target_0 needs separate handling
--     (set by final-CX, preserved by every reverse step — Iter 200
--     frame lemmas cover this).
theoremGidney.reverse_step_invariant_succ_via_step_property
theorem Gidney.reverse_step_invariant_succ_via_step_property
    (k n a b : Nat) (post post' : Nat → Bool)
    (ih : Gidney.reverse_step_invariant k n a b post)
    (_hk : k < n)
    (h_step_target :
      post' (target_idx (n - k - 1)) = adder_sum_bit_classical a b (n - k - 1))
    (h_step_read :
      post' (read_idx (n - k - 1)) = a.testBit (n - k - 1))
    (h_frame_target : ∀ j, n - k ≤ j → j < n →
                        post' (target_idx j) = post (target_idx j))
    (h_frame_read : ∀ j, n - k ≤ j → j < n →
                      post' (read_idx j) = post (read_idx j)) :
theoremgidney_interior_bit_reverse_computes_sum
theorem gidney_interior_bit_reverse_computes_sum
    (j a b : Nat) (hj : 0 < j) (f : Nat → Bool)
    (h_cj : f (carry_idx j) = Adder.carry false (j + 1) a.testBit b.testBit)
    (h_tj1 : f (target_idx (j + 1))
              = xor (a.testBit (j + 1)) (b.testBit (j + 1))) :
    let post
**Interior-bit reverse computes one sum bit** (PROVEN, Iter 201): given a state `f` whose values at {c_j, r_{j+1}, t_{j+1}} match the post-forward+final-CX invariant, applying `interior_reverse(j)` produces `target_{j+1} = sum_{j+1}`. XOR identity: `(a_{j+1} ⊕ b_{j+1}) ⊕ c_{j+1} = sum_{j+1}` (since `sumfb false a b (j+1) = c_{j+1} ⊕ a_{j+1} ⊕ b_{j+1}`). Proof composes Iter 195's `gidney_interior_bit_reverse_post_state_in_bits` with Iter 199's `Adder.sumfb_eq_testBit_add`.
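The XOR identity relies on the usual ripple-carry recursion agreeing with ordinary addition. The sketch below spells that out with hypothetical helpers (`bitOf`, `carryInto`, `sumBit`); they are stand-ins for illustration and are not claimed to match the definitions of `Adder.carry` or `sumfb`, including their indexing conventions.

```lean
-- Hypothetical helper: bit i of x, computed with plain Nat arithmetic.
def bitOf (x i : Nat) : Bool := decide (x / 2 ^ i % 2 = 1)

-- Hypothetical ripple-carry recursion: the carry INTO bit i, with carry-in 0.
def carryInto (a b : Nat) : Nat → Bool
  | 0 => false
  | i + 1 =>
    (bitOf a i && bitOf b i) || (carryInto a b i && (bitOf a i != bitOf b i))

-- sum_i = a_i ⊕ b_i ⊕ c_i.
def sumBit (a b i : Nat) : Bool := (bitOf a i != bitOf b i) != carryInto a b i

-- Spot checks against ordinary addition.
example : sumBit 3 1 1 = bitOf (3 + 1) 1 := by decide
example : sumBit 3 1 2 = bitOf (3 + 1) 2 := by decide
example : sumBit 1 1 1 = bitOf (1 + 1) 1 := by decide
```

These are decide-witnesses on specific inputs only; the general statement is what the cited `Adder.sumfb_eq_testBit_add` is described as providing.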
theoremgidney_first_bit_reverse_preserves_target_0
theorem gidney_first_bit_reverse_preserves_target_0 (f : Nat → Bool) :
    gidney_first_bit_reverse_post_state f (target_idx 0) = f (target_idx 0)
**First-bit reverse preserves target_0** (Iter 200). The first-bit reverse modifies {t_1, r_1, c_0}; not target_idx 0 (= position 1, distinct from target_idx 1 = position 4).
theoremgidney_interior_bit_reverse_preserves_target_0
theorem gidney_interior_bit_reverse_preserves_target_0
    (i : Nat) (hi : 0 < i) (f : Nat → Bool) :
    gidney_interior_bit_reverse_post_state i f (target_idx 0) = f (target_idx 0)
**Interior-bit reverse preserves target_0** for `i ≥ 1`. The interior reverse at i modifies {t_{i+1}, r_{i+1}, c_i} (c_i is written twice); target_0 is distinct from all of these for i ≥ 1.
theoremgidney_last_bit_reverse_preserves_target_0
theorem gidney_last_bit_reverse_preserves_target_0
    (i : Nat) (hi : 0 < i) (f : Nat → Bool) :
    gidney_last_bit_reverse_post_state i f (target_idx 0) = f (target_idx 0)
**Last-bit reverse preserves target_0** for `i ≥ 1`.
theoremgidney_propagation_reverse_preserves_target_0
theorem gidney_propagation_reverse_preserves_target_0
    (K : Nat) (f : Nat → Bool) :
    gidney_propagation_reverse_post_state K f (target_idx 0) = f (target_idx 0)
**Propagation reverse cascade preserves target_0**. By induction on `K` over the propagation_reverse_post_state def (which only invokes first/interior reverses, all of which preserve target_0).
theoremgidney_full_reverse_preserves_target_0
theorem gidney_full_reverse_preserves_target_0 (n : Nat) (f : Nat → Bool) :
    gidney_full_reverse_post_state n f (target_idx 0) = f (target_idx 0)
**Full reverse cascade preserves target_0**. For `n ≥ 2`, the full reverse cascade applies last_reverse(n-1) followed by propagation_reverse(n-1); both preserve target_0.
theoremgidney_interior_bit_reverse_post_state_preserves_outside
theorem gidney_interior_bit_reverse_post_state_preserves_outside
    (i : Nat) (f : Nat → Bool) (q : Nat)
    (h_ci : q ≠ carry_idx i)
    (h_ri1 : q ≠ read_idx (i + 1))
    (h_ti1 : q ≠ target_idx (i + 1)) :
    gidney_interior_bit_reverse_post_state i f q = f q
**Interior-bit reverse frame condition** (Iter 206). Positions other than {c_i, r_{i+1}, t_{i+1}} are unchanged. Generic frame analog of Iter 173's forward interior-step frame.
theoremgidney_interior_bit_reverse_preserves_low
theorem gidney_interior_bit_reverse_preserves_low
    (i : Nat) (hi : 0 < i) (q : Nat) (hq : q < 5) (f : Nat → Bool) :
    gidney_interior_bit_reverse_post_state i f q = f q
**Interior-bit reverse preserves low positions** (Iter 206). For i ≥ 1, the interior reverse modifies only indices ≥ 5, so every position q < 5 is unchanged.
theoremgidney_first_bit_reverse_low_dependence
theorem gidney_first_bit_reverse_low_dependence
    (g h : Nat → Bool) (q : Nat) (hq : q < 5)
    (h_eq : ∀ p, p < 5 → g p = h p) :
    gidney_first_bit_reverse_post_state g q
    = gidney_first_bit_reverse_post_state h q
**First-bit reverse depends only on inputs at low positions** (Iter 206). For q < 5, the first-bit reverse's output at q is determined by the input's values at positions {0, 1, 2, 3, 4}. Therefore, if g and h agree on those positions, first_rev g and first_rev h agree at q.
theoremgidney_last_bit_reverse_post_state_preserves_outside
theorem gidney_last_bit_reverse_post_state_preserves_outside
    (i : Nat) (f : Nat → Bool) (q : Nat) (h_q : q ≠ carry_idx i) :
    gidney_last_bit_reverse_post_state i f q = f q
**Last-bit reverse frame condition** (Iter 203). Positions other than `carry_idx i` are unchanged.
theoremGidney.last_reverse_target_read_frame
theorem Gidney.last_reverse_target_read_frame
    (i j : Nat) (f : Nat → Bool) :
    gidney_last_bit_reverse_post_state i f (target_idx j) = f (target_idx j)
    ∧ gidney_last_bit_reverse_post_state i f (read_idx j) = f (read_idx j)
**Last-reverse target/read frame** (2026-05-14 tick, anchors the cascade-induction k=0 → k=1 step). The last-bit reverse modifies ONLY `carry_idx i` (see def line 2637), so it is the identity on every `target_idx j` and `read_idx j`. The frame holds universally (for ALL i, j) because the qubit layout `read_j = 3j`, `target_j = 3j + 1`, `carry_j = 3j + 2` gives disjoint mod-3 residues: `target_idx j ≠ carry_idx i` and `read_idx j ≠ carry_idx i` for any (i, j). No `j < n` bound is needed. This is the matching frame for the outer cascade's first step (`last_reverse(n-1)` in `gidney_full_reverse_post_state`). Once the cascade-induction proof factors through `propagation_reverse`, this lemma transfers the post-final-CX target/read state across the last-reverse layer unchanged.
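The mod-3 disjointness argument can be sketched in standalone Lean 4. Everything in `FrameSketch` is a hypothetical stand-in rather than the repository's definitions: `upd` plays the role of `update`, and the layout `3j`, `3j + 1`, `3i + 2` is written inline instead of going through `read_idx`, `target_idx`, `carry_idx`. The sketch assumes a toolchain in which the `omega` tactic is available.

```lean
namespace FrameSketch

/-- Hypothetical stand-in for `update`: overwrite position `k` with `v`. -/
def upd (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = k then v else f q

/-- Writing at `k` leaves every other position unchanged. -/
theorem upd_ne (f : Nat → Bool) (k q : Nat) (v : Bool) (h : q ≠ k) :
    upd f k v q = f q := by
  show (if q = k then v else f q) = f q
  exact if_neg h

/-- Residue 1 (target) never collides with residue 2 (carry). -/
theorem target_ne_carry (i j : Nat) : 3 * j + 1 ≠ 3 * i + 2 := by omega

/-- Residue 0 (read) never collides with residue 2 (carry). -/
theorem read_ne_carry (i j : Nat) : 3 * j ≠ 3 * i + 2 := by omega

/-- Any write to the carry slot of bit `i` is invisible at the target and
    read slots of every bit `j`, with no bound relating `i` and `j`. -/
theorem carry_write_frame (f : Nat → Bool) (i j : Nat) (v : Bool) :
    upd f (3 * i + 2) v (3 * j + 1) = f (3 * j + 1)
    ∧ upd f (3 * i + 2) v (3 * j) = f (3 * j) :=
  ⟨upd_ne f _ _ v (target_ne_carry i j), upd_ne f _ _ v (read_ne_carry i j)⟩

end FrameSketch
```

Since the last-bit reverse is described as two chained updates on `carry_i`, applying `carry_write_frame` once per update would give a frame statement of the same shape as the one above.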
theoremGidney.reverse_step_invariant_preserved_by_last_reverse
theorem Gidney.reverse_step_invariant_preserved_by_last_reverse
    (k n a b i : Nat) (f : Nat → Bool)
    (h : Gidney.reverse_step_invariant k n a b f) :
    Gidney.reverse_step_invariant k n a b
      (gidney_last_bit_reverse_post_state i f)
**`reverse_step_invariant` transfers across last-reverse** (2026-05-14 tick). Since `last_reverse(i)` only modifies `carry_idx i` (per `last_reverse_target_read_frame`), every target/read claim in `reverse_step_invariant k n a b f` is preserved when `f` is replaced by `last_bit_reverse i f`. This is the structural lemma that lets the outer cascade `gidney_full_reverse_post_state` factor through last_reverse. It does not by itself establish `inv_n` after the propagation_reverse cascade (starting from `last_reverse(n-1) post_final_cx`). It is the dual statement: if `inv_k` already held BEFORE last_reverse, it still holds AFTER. It is useful as a frame helper in the cascade-induction proof.
theoremGidney.reverse_step_invariant_preserved_by_propagation_reverse_zero
theorem Gidney.reverse_step_invariant_preserved_by_propagation_reverse_zero
    (k n a b : Nat) (f : Nat → Bool)
    (h : Gidney.reverse_step_invariant k n a b f) :
    Gidney.reverse_step_invariant k n a b
      (gidney_propagation_reverse_post_state 0 f)
**K=0 trivial preservation**: `propagation_reverse(0)` is definitionally the identity (see def line 4334), so any invariant on `f` carries directly to `propagation_reverse(0) f`. The proof is `:= h`, by reduction.
theoremGidney.reverse_step_invariant_K_holds_after_propagation_reverse_K_zero_only
theorem Gidney.reverse_step_invariant_K_holds_after_propagation_reverse_K_zero_only
    (n a b : Nat) (_hn : 1 < n) (input : Nat → Bool) :
    Gidney.reverse_step_invariant 0 n a b
      (gidney_propagation_reverse_post_state 0 input)
theoremgidney_last_bit_reverse_preserves_low
theorem gidney_last_bit_reverse_preserves_low
    (i : Nat) (hi : 0 < i) (q : Nat) (hq : q < 5) (f : Nat → Bool) :
    gidney_last_bit_reverse_post_state i f q = f q
**Last-bit reverse preserves the low-position frame** (Iter 203, 2026-05-13). For i ≥ 1, the last-bit reverse only modifies `carry_idx i = 3i + 2 ≥ 5`. Positions 0..4 (= read_0, target_0, carry_0, read_1, target_1) are all preserved.
theoremgidney_propagation_reverse_eq_first_rev_low
theorem gidney_propagation_reverse_eq_first_rev_low
    (K : Nat) (hK : 0 < K) (g : Nat → Bool) (q : Nat) (hq : q < 5) :
    gidney_propagation_reverse_post_state K g q
    = gidney_first_bit_reverse_post_state g q
**Propagation reverse cascade equals first reverse on low positions** (Iter 206). For K ≥ 1 and q < 5, propagation_reverse(K) g equals first_reverse g at q.
theoremgidney_full_reverse_eq_first_rev_low
theorem gidney_full_reverse_eq_first_rev_low
    (n : Nat) (hn : 1 < n) (f : Nat → Bool) (q : Nat) (hq : q < 5) :
    gidney_full_reverse_post_state n f q
    = gidney_first_bit_reverse_post_state f q
**Full reverse cascade equals first reverse on low positions** (Iter 206). For n ≥ 2 and q < 5, full_reverse(n) f equals first_reverse f at q.
theoremgidney_classical_action_with_reverse_n2_target_1
theorem gidney_classical_action_with_reverse_n2_target_1 (a b : Nat)
    (ha : a < 4) (hb : b < 4) :
    gidney_full_reverse_post_state 2
      (gidney_final_cx_cascade_post_state 2
        (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 a b)))
      (target_idx 1)
    = adder_sum_bit_classical a b 1
**Headline j=1 case for n=2** (Iter 205, PROVEN parametrically over a, b for n=2). Composes:
- n=2 def unfolding: `full_reverse(2) f = first_reverse (last_reverse(1) f)`.
- Iter 203's `gidney_last_bit_reverse_preserves_low` (positions 0-4 unchanged by last_reverse(1)).
- Iter 189's `post_forward_final_cx_invariant_holds` (post-CX values).
- Iter 194's `gidney_first_bit_reverse_preserves` (target_1 = sum_1).
- Iter 199's `Adder.sumfb_eq_testBit_add` (XOR identity).
theoremgidney_classical_action_with_reverse_target_0
theorem gidney_classical_action_with_reverse_target_0
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    gidney_full_reverse_post_state n
      (gidney_final_cx_cascade_post_state n
        (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
      (target_idx 0)
    = adder_sum_bit_classical a b 0
**Headline j=0 case PROVEN parametrically** (Iter 202, 2026-05-13). For any n ≥ 2 and valid a, b, the j=0 case of `TODO_gidney_classical_action_with_reverse` holds: target_0 after full forward + final-CX + reverse = `adder_sum_bit_classical a b 0`. Composes:
- Iter 200's `gidney_full_reverse_preserves_target_0` (target_0 unchanged by full reverse cascade).
- Iter 189's `Gidney.post_forward_final_cx_invariant_holds` (post-CX target_0 = a_0 ⊕ b_0).
- Iter 163's `Adder.testBit_add_zero` ((a+b).testBit 0 = a_0 ⊕ b_0).
theoremgidney_classical_action_with_reverse_target_1
theorem gidney_classical_action_with_reverse_target_1
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    gidney_full_reverse_post_state n
      (gidney_final_cx_cascade_post_state n
        (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
      (target_idx 1)
    = adder_sum_bit_classical a b 1
**Headline j=1 case PROVEN parametrically over n** (Iter 207, 2026-05-13). Uses Iter 206's `gidney_full_reverse_eq_first_rev_low` to reduce the full reverse cascade at target_idx 1 (= 4 < 5) to just first_reverse, then applies Iter 194 with hypotheses verified from Iter 189's invariant.
theoremgidney_first_bit_reverse_preserves_target_above
theorem gidney_first_bit_reverse_preserves_target_above
    (j : Nat) (hj : 1 < j) (f : Nat → Bool) :
    gidney_first_bit_reverse_post_state f (target_idx j) = f (target_idx j)
**First-bit reverse preserves target_j for j ≥ 2** (Iter 209). Modifies {c_0, r_1, t_1}; for j ≥ 2, target_idx j = 3j+1 ≥ 7 > 4.
theoremgidney_first_bit_reverse_preserves_read_above
theorem gidney_first_bit_reverse_preserves_read_above
    (j : Nat) (hj : 1 < j) (f : Nat → Bool) :
    gidney_first_bit_reverse_post_state f (read_idx j) = f (read_idx j)
**First-bit reverse preserves read_j for j > 1** (2026-05-14 tick, read-side analog of `_preserves_target_above`). Modifies {t_1, r_1, c_0}; for j > 1, read_idx j = 3j ≠ any of those.
theoremgidney_interior_bit_reverse_preserves_target_above
theorem gidney_interior_bit_reverse_preserves_target_above
    (i j : Nat) (hij : i + 1 < j) (f : Nat → Bool) :
    gidney_interior_bit_reverse_post_state i f (target_idx j) = f (target_idx j)
**Interior-bit reverse preserves target_j for j > i+1** (Iter 209). Modifies {c_i, r_{i+1}, t_{i+1}}; for j > i+1, target_idx j = 3j+1 > 3(i+1)+1 = t_{i+1}.
theoremgidney_interior_bit_reverse_preserves_read_above
theorem gidney_interior_bit_reverse_preserves_read_above
    (i j : Nat) (hij : i + 1 < j) (f : Nat → Bool) :
    gidney_interior_bit_reverse_post_state i f (read_idx j) = f (read_idx j)
**Interior-bit reverse preserves read_j for j > i+1** (2026-05-14 tick, read-side analog). Same proof structure as the target version with read_idx in place of target_idx.
theoremgidney_propagation_reverse_preserves_target_above
theorem gidney_propagation_reverse_preserves_target_above
    (K j : Nat) (hjK : K < j) (f : Nat → Bool) :
    gidney_propagation_reverse_post_state K f (target_idx j) = f (target_idx j)
**Propagation reverse preserves target_j for j > K** (Iter 209). For K ≥ 0 and j > K, propagation_reverse(K) preserves target_idx j. By induction on K.
theoremgidney_propagation_reverse_preserves_read_above
theorem gidney_propagation_reverse_preserves_read_above
    (K j : Nat) (hjK : K < j) (f : Nat → Bool) :
    gidney_propagation_reverse_post_state K f (read_idx j) = f (read_idx j)
**Propagation reverse preserves read_j for j > K** (2026-05-14 tick, read-side analog of `_preserves_target_above` at line 5404). Same induction-on-K structure with `read_idx` in place of `target_idx`.
theoremgidney_interior_bit_reverse_at_target_low_dependence
theorem gidney_interior_bit_reverse_at_target_low_dependence
    (i : Nat) (hi : 0 < i) (g h : Nat → Bool)
    (h_t : g (target_idx (i + 1)) = h (target_idx (i + 1)))
    (h_c : g (carry_idx i) = h (carry_idx i)) :
    gidney_interior_bit_reverse_post_state i g (target_idx (i + 1))
    = gidney_interior_bit_reverse_post_state i h (target_idx (i + 1))
**Interior reverse at target_(i+1) only depends on inputs at {t_{i+1}, c_i}** (Iter 211). If g and h agree at those two positions, then interior_reverse(i) g and interior_reverse(i) h agree at target_(i+1).
theoremgidney_interior_bit_reverse_at_read_low_dependence
theorem gidney_interior_bit_reverse_at_read_low_dependence
    (i : Nat) (hi : 0 < i) (g h : Nat → Bool)
    (h_r : g (read_idx (i + 1)) = h (read_idx (i + 1)))
    (h_c : g (carry_idx i) = h (carry_idx i)) :
    gidney_interior_bit_reverse_post_state i g (read_idx (i + 1))
    = gidney_interior_bit_reverse_post_state i h (read_idx (i + 1))
**Interior reverse at read_(i+1) only depends on inputs at {r_{i+1}, c_i}** (2026-05-14 tick). Read-side analog of `_at_target_low_dependence`. Same proof structure with `.2.1` (read component of Iter 195's `_in_bits` triple) instead of `.2.2`.
theoremgidney_propagation_reverse_at_target_eq_interior_reverse
theorem gidney_propagation_reverse_at_target_eq_interior_reverse
    (K j : Nat) (hj : 1 < j) (hjK : j ≤ K) (g : Nat → Bool) :
    gidney_propagation_reverse_post_state K g (target_idx j)
    = gidney_interior_bit_reverse_post_state (j - 1) g (target_idx j)
**Propagation reverse at target_j reduces to interior_reverse(j-1)** (Iter 211). For j ∈ [2, K], propagation_reverse(K) g (target_idx j) equals interior_reverse(j-1) g (target_idx j): the cascade reduces to a single per-bit step. Proof: induction on K.
- K=1: vacuous (j ∈ [2, 1] is empty).
- K=m+2: propagation_reverse(m+2) g = propagation_reverse(m+1) (interior_reverse(m+1) g).
  - Subcase j = m+2: interior_reverse(m+1) computes target_j; the later cascade preserves it (Iter 209's preserves_target_above with j > m+1).
  - Subcase j ≤ m+1: by IH, propagation_reverse(m+1) (...) (target_j) = interior_reverse(j-1) (interior_reverse(m+1) g) (target_j). And interior_reverse(m+1) preserves t_j and c_{j-1} (both ≤ 3j+1 ≤ 3(m+1)+1 < 3(m+1)+2), so by at_target_low_dependence, this equals interior_reverse(j-1) g (target_j).

FormalRV.Arithmetic.RippleCarryAdder.RippleCarryAdderDefinitions

FormalRV/Arithmetic/RippleCarryAdder/RippleCarryAdderDefinitions.lean
## Register indexing for the ripple-carry adder
qianxu Fig. 4(a) interleaves three registers along the qubit axis: `read[0], target[0], carry[0], read[1], target[1], carry[1], …` For an `n`-bit adder, the wire count is `read = n+1`, `target = n+1`, `carry = n` (with an extra read/target for the overflow bit). The figure shows n=4: 14 lines total (5+5+4). We choose the convention: qubit indices interleave in groups of three, with the final read/target on top of the carry chain. `read[i] = 3*i`, `target[i] = 3*i + 1`, `carry[i] = 3*i + 2`.
defread_idx
def read_idx (i : Nat) : Nat
Qubit index for the i-th read bit.
deftarget_idx
def target_idx (i : Nat) : Nat
Qubit index for the i-th target bit.
defcarry_idx
def carry_idx (i : Nat) : Nat
Qubit index for the i-th carry bit.
defadder_n_qubits
def adder_n_qubits (n : Nat) : Nat
Total qubits required for an n-bit adder: 3n+2 (n carries + n+1 reads + n+1 targets - the final i has no carry).
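As a quick sanity check of the layout and the qubit count, here is a minimal standalone Lean 4 sketch. The bodies of `read_idx`, `target_idx`, `carry_idx` and `adder_n_qubits` are not shown in this listing, so the definitions below are assumptions taken from the prose (`3*i`, `3*i + 1`, `3*i + 2`, `3n+2`) and are given hypothetical names in their own namespace.

```lean
namespace LayoutSketch

-- Assumed bodies, read off the docstrings above (hypothetical names).
def readIdx   (i : Nat) : Nat := 3 * i
def targetIdx (i : Nat) : Nat := 3 * i + 1
def carryIdx  (i : Nat) : Nat := 3 * i + 2
def nQubits   (n : Nat) : Nat := 3 * n + 2

-- The n = 4 figure: 5 reads + 5 targets + 4 carries = 14 wires.
example : nQubits 4 = 14 := rfl

-- Positions quoted elsewhere in this listing.
example : targetIdx 0 = 1 := rfl
example : targetIdx 1 = 4 := rfl
example : carryIdx 1 = 5 := rfl

-- The wire count (n+1) + (n+1) + n agrees with 3n + 2 for every n.
example (n : Nat) : (n + 1) + (n + 1) + n = nQubits n := by
  unfold nQubits
  omega

end LayoutSketch
```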
defripple_carry_unit_stub
def ripple_carry_unit_stub (i : Nat) : Gate
Per-bit computation unit at bit position `i`. CURRENTLY a stub using `cuccaro_MAJ` on the (read[i], target[i], carry[i]) triple — this MUST be replaced with the exact Fig. 4(a) gate sequence in the next tick (the figure has additional CX gates threading carry[i-1] into the unit).
defgidney_adder_bit_step
def gidney_adder_bit_step (i : Nat) : Gate
⚠️ **COST-ONLY SKELETON — does NOT compute addition.** This simplified step omits the carry-propagation CXs (Iter 53 finding below), so for `i > 0` it XORs the wrong value into `carry[i]`. The semantically-correct, basis-state-proven steps are `gidney_adder_bit_step_faithful_{first,interior,last}`, composed into the correct forward pass `gidney_adder_forward_faithful_full`. This skeleton is retained ONLY for T-count accounting, which provably equals the correct adder's (`gidney_cost_skeleton_eq_faithful`).
defgidney_adder_forward
def gidney_adder_forward : Nat → Gate
  | 0       => Gate.I
  | n + 1   => Gate.seq (gidney_adder_forward n) (gidney_adder_bit_step n)
⚠️ **COST-ONLY SKELETON** forward pass (built on the wrong `gidney_adder_bit_step`). NOT semantically correct — use `gidney_adder_forward_faithful_full`. Retained only for its T-count, which equals the correct adder's by `gidney_cost_skeleton_eq_faithful`.
defgidney_adder_uncompute
def gidney_adder_uncompute : Nat → Gate
  | 0       => Gate.I
  | n + 1   => Gate.seq (gidney_adder_bit_step n) (gidney_adder_uncompute n)
Reverse pass cascade: emits `bit_step n-1, n-2, ..., 0` in reverse order. Like `prefix_and_uncompute`, this gives `n` Toffolis.
defgidney_final_cx_cascade
def gidney_final_cx_cascade : Nat → Gate
  | 0       => Gate.I
  | n + 1   => Gate.seq (gidney_final_cx_cascade n)
                        (Gate.CX (read_idx n) (target_idx n))
Final CX cascade — one `CX(read[i], target[i])` per bit, stamping the sum onto the target register. Source: `qq_gidney_adder.py:122-123`.
defgidney_adder_full
def gidney_adder_full (n : Nat) : Gate
⚠️ **COST-ONLY SKELETON** full adder (forward + reverse + final CX) built on the wrong bit-step — NOT semantically correct. Its T-count (`tcount_gidney_adder_full = 14n`) is valid and is what the Shor cost model binds to; the correct, basis-state-proven adder is the faithful cascade (`gidney_adder_forward_faithful_full` + reverse + `gidney_final_cx_cascade`). See `gidney_cost_skeleton_eq_faithful` for the cost equivalence with the correct adder.
defgidney_adder_bit_with_measurement_uncompute_tcount
def gidney_adder_bit_with_measurement_uncompute_tcount : Nat
Per-bit Gidney adder with measurement-based uncomputation: one full Gidney-AND cycle per bit. T-count per bit = 7.
defgidney_adder_full_with_measurement_uncompute_tcount
def gidney_adder_full_with_measurement_uncompute_tcount (n : Nat) : Nat
n-bit Gidney adder T-count with measurement-based uncomputation: `7n`. Derived structurally from `GidneyAND_cycle_tcount_eq_seven` (Iter 43) applied at each bit.
defgidney_adder_bit_step_faithful_interior
def gidney_adder_bit_step_faithful_interior (i : Nat) : Gate
Faithful Gidney bit-step at an interior bit `i ≥ 1` (not the last). Emits 4 gates: CCX + chain-CX + 2 propagation CXs. **This is the review-correct encoding** per `qq_gidney_adder.py:58-73`.
defgidney_adder_forward_faithful_interior
def gidney_adder_forward_faithful_interior : Nat → Gate
  | 0       => Gate.I
  | n + 1   => Gate.seq
                 (gidney_adder_forward_faithful_interior n)
                 (gidney_adder_bit_step_faithful_interior (n + 1))
Faithful interior cascade: composes `gidney_adder_bit_step_faithful_interior (k+1)` for `k = 0..n-1`. Emits exactly `4n` gates (n Toffolis + 3n CXs).
defgidney_bit_step_faithful_post_state
def gidney_bit_step_faithful_post_state (i : Nat) (f : Nat → Bool) : Nat → Bool
The post-state after applying `gidney_adder_bit_step_faithful_interior i` to a basis state `f_to_vec dim f`, expressed as four chained `update`s. This is the **explicit semantic action** of the faithful bit-step:
- step 1 (CCX): carry[i] ⊕= read[i] ∧ target[i]
- step 2 (CX): carry[i] ⊕= carry[i-1]
- step 3 (CX): read[i+1] ⊕= carry[i] (post-step-2 value)
- step 4 (CX): target[i+1] ⊕= carry[i] (post-step-2 value)
With pre-XORed inputs from the previous bit's propagation, the post-step-2 carry[i] equals Gidney's carry formula `((read[i] ⊕ prev) ∧ (target[i] ⊕ prev)) ⊕ prev`.
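The four chained updates can be sketched as a standalone Lean 4 definition. This is an illustration under stated assumptions, not the repository's `gidney_bit_step_faithful_post_state`: `upd` is a hypothetical stand-in for `update`, the index layout `3i`, `3i + 1`, `3i + 2` is inlined, and Boolean XOR is written with `!=` to avoid depending on a particular `xor` name.

```lean
namespace BitStepSketch

/-- Hypothetical stand-in for `update`: overwrite position `k` with `v`. -/
def upd (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = k then v else f q

/-- Assumed classical action of the 4-gate interior step at bit `i ≥ 1`. -/
def interiorPost (i : Nat) (f : Nat → Bool) : Nat → Bool :=
  let r  := 3 * i            -- read[i]
  let t  := 3 * i + 1        -- target[i]
  let c  := 3 * i + 2        -- carry[i]
  let cp := 3 * (i - 1) + 2  -- carry[i-1]
  let f1 := upd f  c (f c != (f r && f t))        -- step 1: CCX
  let f2 := upd f1 c (f1 c != f1 cp)              -- step 2: chain CX
  let f3 := upd f2 (r + 3) (f2 (r + 3) != f2 c)   -- step 3: CX into read[i+1]
  upd f3 (t + 3) (f3 (t + 3) != f3 c)             -- step 4: CX into target[i+1]

/-- Demo state: read[1] = target[1] = 1, everything else 0. -/
def demo : Nat → Bool
  | 3 => true
  | 4 => true
  | _ => false

example : interiorPost 1 demo 5 = true := by decide  -- carry[1] = 1 ∧ 1
example : interiorPost 1 demo 6 = true := by decide  -- read[2] picks up carry[1]
example : interiorPost 1 demo 7 = true := by decide  -- target[2] picks up carry[1]
example : interiorPost 1 demo 3 = true := by decide  -- read[1] is untouched

end BitStepSketch
```

Steps 3 and 4 read `carry[i]` from `f2` and `f3`, which is the post-step-2 value because neither update touches position `c`.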
defgidney_adder_bit_step_faithful_interior_reverse
def gidney_adder_bit_step_faithful_interior_reverse (i : Nat) : Gate
Gate-reverse of `gidney_adder_bit_step_faithful_interior i`.
defgidney_cascade_post_state
def gidney_cascade_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0    , f => f
  | n + 1, f =>
      gidney_bit_step_faithful_post_state (n + 1)
        (gidney_cascade_post_state n f)
Cascade-level post-state: fold of `gidney_bit_step_faithful_post_state` over bits 1..n. Matches the recursive structure of `gidney_adder_forward_faithful_interior`.
structureBitDisjointness
structure BitDisjointness (dim i : Nat) : Prop
**Bit-disjointness hypothesis for bit `i`**: bundles the 12 conditions needed for the per-bit correctness theorem. Decidable on any concrete `i, dim`; for the parametric cascade we quantify over the bit index.
defgidney_adder_bit_step_faithful_first
def gidney_adder_bit_step_faithful_first : Gate
Faithful Gidney bit-step at i=0 (first bit). Emits 3 gates: CCX + 2 propagation CXs (no chain CX since `carry[-1]` doesn't exist).
defgidney_first_bit_post_state
def gidney_first_bit_post_state (f : Nat → Bool) : Nat → Bool
Post-state of `gidney_adder_bit_step_faithful_first` on basis states: CCX writes `(read[0] ∧ target[0])` into `carry[0]`, then propagation CXs XOR `carry[0]` into `read[1]` and `target[1]`. No chain CX (since there's no prev carry).
defgidney_adder_bit_step_faithful_first_reverse
def gidney_adder_bit_step_faithful_first_reverse : Gate
Gate-reverse of `gidney_adder_bit_step_faithful_first`. Emits `CX(carry[0], target[1]) ; CX(carry[0], read[1]) ; CCX(read[0], target[0], carry[0])` — the original three gates in reverse order.
defgidney_adder_bit_step_faithful_last
def gidney_adder_bit_step_faithful_last (i : Nat) : Gate
Faithful Gidney bit-step at the **last interior bit** `i ≥ 1` (no propagation CXs). Emits 2 gates: CCX + chain CX.
defgidney_last_bit_post_state
def gidney_last_bit_post_state (i : Nat) (f : Nat → Bool) : Nat → Bool
Post-state of `gidney_adder_bit_step_faithful_last i`: CCX writes `(read[i] ∧ target[i])` into `carry[i]`, then chain CX XORs `carry[i-1]` into `carry[i]`. No propagation.
defgidney_adder_bit_step_faithful_last_reverse
def gidney_adder_bit_step_faithful_last_reverse (i : Nat) : Gate
The **gate-reversed last-bit step** at index `i`: `seq (CX carry[i-1] carry[i]) (CCX read[i] target[i] carry[i])`. Mirrors `gidney_adder_bit_step_faithful_last i`'s gate order.
defgidney_adder_bit_step_reverse
def gidney_adder_bit_step_reverse (i : Nat) : Gate
Gate-reverse of `gidney_adder_bit_step i`. At i = 0 both forward and reverse are the same single CCX (CCX is self-inverse); at i > 0 the reverse is `CX · CCX` (gate-order swap of forward's `CCX · CX`).
defgidney_adder_uncompute_proper
def gidney_adder_uncompute_proper : Nat → Gate
  | 0       => Gate.I
  | n + 1   => Gate.seq (gidney_adder_bit_step_reverse n)
                        (gidney_adder_uncompute_proper n)
**Proper reverse cascade**: gate-by-gate inverse of `gidney_adder_forward`. Emits `bit_step_reverse n-1, n-2, ..., 0` in reverse order. Each gate-reverse swaps the CCX·CX → CX·CCX within the bit-step.
defgidney_adder_forward_with_propagation
def gidney_adder_forward_with_propagation : Nat → Gate
  | 0       => Gate.I
  | 1       => gidney_adder_bit_step_faithful_first
  | n + 2   => Gate.seq (gidney_adder_forward_with_propagation (n + 1))
                        (gidney_adder_bit_step_faithful_interior (n + 1))
Helper: cascade of `n` faithful bit-steps, each WITH propagation to the next bit. Bit 0 uses `..._first`; bits 1..n-1 use `..._interior`. For `n = 0`: identity. For `n ≥ 1`: first ; interior(1) ; interior(2) ; ... ; interior(n-1). All bits in this cascade have propagation CXs to bit `i+1`; pair this with `gidney_adder_bit_step_faithful_last` to get a full adder cascade (the LAST bit has no propagation).
defgidney_adder_forward_faithful_full
def gidney_adder_forward_faithful_full : Nat → Gate
  | 0       => Gate.I
  | 1       => Gate.I
  | n + 2   => Gate.seq (gidney_adder_forward_with_propagation (n + 1))
                        (gidney_adder_bit_step_faithful_last (n + 1))
**Faithful full forward cascade for an n-bit Gidney adder**. Glues first/interior/last bit-steps per `qq_gidney_adder.py`'s actual gate structure.
- `n = 0` or `n = 1`: degenerate, returns `Gate.I`.
- `n = k + 2`: `forward_with_propagation (k + 1) ; last (k + 1)`. Bits 0..k each emit a CCX plus propagation CXs (the first bit emits CCX + 2 CXs; each interior bit emits CCX + chain CX + 2 propagation CXs); bit k+1 emits CCX + chain CX (last, no propagation).
Concrete: for `n = 33` (RSA-2048 q_A=33 adder block), this is `forward_with_propagation 32 ; last 32` = 33 Toffolis = 231 T.
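The "one Toffoli per bit" count can be checked on a toy model. The following standalone Lean 4 sketch uses a hypothetical `SketchGate` type, not the repository's `Gate` IR. The gate lists follow the prose descriptions of the first, interior and last steps, with the index layout `3i`, `3i + 1`, `3i + 2` inlined, and a small `n` is used so the check stays cheap.

```lean
namespace ToffoliCountSketch

/-- Hypothetical miniature gate IR, for counting only. -/
inductive SketchGate where
  | I
  | CX (c t : Nat)
  | CCX (a b t : Nat)
  | seq (g h : SketchGate)

/-- Number of CCX gates in a circuit. -/
def SketchGate.toffolis : SketchGate → Nat
  | .CCX _ _ _ => 1
  | .seq g h => g.toffolis + h.toffolis
  | _ => 0

/-- First bit: CCX + 2 propagation CXs. -/
def firstStep : SketchGate :=
  .seq (.CCX 0 1 2) (.seq (.CX 2 3) (.CX 2 4))

/-- Interior bit `i`: CCX + chain CX + 2 propagation CXs. -/
def interiorStep (i : Nat) : SketchGate :=
  .seq (.CCX (3 * i) (3 * i + 1) (3 * i + 2))
    (.seq (.CX (3 * (i - 1) + 2) (3 * i + 2))
      (.seq (.CX (3 * i + 2) (3 * (i + 1)))
            (.CX (3 * i + 2) (3 * (i + 1) + 1))))

/-- Last bit `i`: CCX + chain CX, no propagation. -/
def lastStep (i : Nat) : SketchGate :=
  .seq (.CCX (3 * i) (3 * i + 1) (3 * i + 2))
       (.CX (3 * (i - 1) + 2) (3 * i + 2))

def withPropagation : Nat → SketchGate
  | 0     => .I
  | 1     => firstStep
  | n + 2 => .seq (withPropagation (n + 1)) (interiorStep (n + 1))

def forwardFull : Nat → SketchGate
  | 0     => .I
  | 1     => .I
  | n + 2 => .seq (withPropagation (n + 1)) (lastStep (n + 1))

-- A 4-bit forward cascade contains 4 Toffolis, hence 28 T at 7 T each.
example : (forwardFull 4).toffolis = 4 := by decide
example : 7 * (forwardFull 4).toffolis = 28 := by decide

end ToffoliCountSketch
```

The same recursion at `n = 33` is what yields the 33 Toffolis quoted above.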
defgidney_propagation_post_state
def gidney_propagation_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0    , f => f
  | 1    , f => gidney_first_bit_post_state f
  | n + 2, f =>
      gidney_bit_step_faithful_post_state (n + 1)
        (gidney_propagation_post_state (n + 1) f)
Post-state of `gidney_adder_forward_with_propagation n` on `f`. Recursion matches the def's three clauses (0, 1, n+2).
defgidney_forward_faithful_full_post_state
def gidney_forward_faithful_full_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0    , f => f
  | 1    , f => f
  | n + 2, f =>
      gidney_last_bit_post_state (n + 1)
        (gidney_propagation_post_state (n + 1) f)
Post-state of the **faithful full forward cascade** on `f`. Composes `propagation_post_state (n+1)` (bits 0..n with propagation) with `last_bit_post_state (n+1)` (bit n+1 with no propagation).
defgidney_final_cx_cascade_post_state
def gidney_final_cx_cascade_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0    , f => f
  | n + 1, f =>
      let f'
Post-state of `gidney_final_cx_cascade n`: nested chain of `update target[i] (target[i] XOR read[i])` for i = 0..n-1.
defgidney_adder_forward_with_propagation_reverse
def gidney_adder_forward_with_propagation_reverse : Nat → Gate
  | 0       => Gate.I
  | 1       => gidney_adder_bit_step_faithful_first_reverse
  | n + 2   => Gate.seq (gidney_adder_bit_step_faithful_interior_reverse (n + 1))
                        (gidney_adder_forward_with_propagation_reverse (n + 1))
Reverse of `gidney_adder_forward_with_propagation`: emits `interior_reverse (n-1), interior_reverse (n-2), ..., interior_reverse 1, first_reverse`, i.e. the forward steps in reverse order.
defgidney_adder_forward_faithful_full_reverse
def gidney_adder_forward_faithful_full_reverse : Nat → Gate
  | 0       => Gate.I
  | 1       => Gate.I
  | n + 2   => Gate.seq (gidney_adder_bit_step_faithful_last_reverse (n + 1))
                        (gidney_adder_forward_with_propagation_reverse (n + 1))
Reverse of `gidney_adder_forward_faithful_full`. Emits `last_reverse (n+1), interior_reverse(n)..., first_reverse`.
defgidney_adder_full_faithful_no_measurement
def gidney_adder_full_faithful_no_measurement : Nat → Gate
  | 0       => Gate.I
  | 1       => Gate.I
  | n + 2   => Gate.seq
                (Gate.seq (gidney_adder_forward_faithful_full (n + 2))
                          (gidney_final_cx_cascade (n + 2)))
                (gidney_adder_forward_faithful_full_reverse (n + 2))
**Full no-measurement faithful Gidney adder**. For `n+2` bits: composes the faithful forward cascade (Iter 79) with the final CX cascade (Iter 21) and the faithful reverse cascade (Iter 83). Total T-count: 7(n+2) forward + 0 final CX + 7(n+2) reverse = 14(n+2) T-gates, i.e. 2(n+2) Toffolis at 7 T each. Edge cases: the arguments `0` and `1` return `Gate.I`.
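The T-count arithmetic can be spelled out in a few lines of standalone Lean 4. The names below are hypothetical; the only inputs taken from the text are 7 T per Toffoli, one Toffoli per bit in each of the forward and reverse passes, and no T-gates in the final CX cascade.

```lean
namespace TCountSketch

/-- Assumed cost of one Toffoli, in T-gates. -/
def tPerToffoli : Nat := 7

/-- T-count of one pass (forward or reverse) over `m` bits. -/
def passT (m : Nat) : Nat := tPerToffoli * m

/-- Forward pass + final CX cascade (0 T) + reverse pass. -/
def fullT (m : Nat) : Nat := passT m + 0 + passT m

theorem fullT_eq (m : Nat) : fullT m = 14 * m := by
  unfold fullT passT tPerToffoli
  omega

-- Instantiated at the `n + 2` bits of the recursive clause.
example (n : Nat) : fullT (n + 2) = 14 * (n + 2) := fullT_eq (n + 2)

end TCountSketch
```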
defgidney_adder
def gidney_adder (n : Nat) : Gate
**The canonical, semantically-correct Gidney ripple-carry adder.** Alias for the faithful, basis-state-proven, no-measurement adder (`gidney_adder_full_faithful_no_measurement`). This, NOT the cost-only `gidney_adder_full` skeleton, is the adder the Shor cost model binds to (`adderToff_eq`), and the canonical name downstream code should use.
abbrevzeroF
abbrev zeroF : Nat → Bool
The all-zero input function.
definputF_1_plus_0
def inputF_1_plus_0 : Nat → Bool
Input function for `read = (1, 0), target = (0, 0), carry = (0, 0)`. Encoded as `i == 0`: true only at index 0 (read_0 = 1), false everywhere else.
definputF_1_plus_1
def inputF_1_plus_1 : Nat → Bool
  | 0 => true   -- read_0 = a_0 = 1
  | 1 => true   -- target_0 = b_0 = 1
  | _ => false  -- read_1 = a_1 = 0, target_1 = b_1 = 0, carries = 0
Input for `(a=1, b=1)` 2-bit addition.
definputF_3_plus_1
def inputF_3_plus_1 : Nat → Bool
  | 0 => true   -- read_0 = a_0 = 1
  | 1 => true   -- target_0 = b_0 = 1
  | 3 => true   -- read_1 = a_1 = 1
  -- target_1, carries, read_2, target_2 all default to false
  | _ => false
Input for `(a=3, b=1)` 3-bit addition.
definputF_7_plus_1
def inputF_7_plus_1 : Nat → Bool
  | 0 => true   -- read_0 = a_0 = 1
  | 1 => true   -- target_0 = b_0 = 1
  | 3 => true   -- read_1 = a_1 = 1
  | 6 => true   -- read_2 = a_2 = 1
  -- read_3 = a_3 = 0, target_1, target_2, target_3, carries all 0
  | _ => false
Input for `(a=7, b=1)` 4-bit addition.
defgidney_first_bit_reverse_post_state
def gidney_first_bit_reverse_post_state (f : Nat → Bool) : Nat → Bool
Post-state of `gidney_adder_bit_step_faithful_first_reverse` on a basis-state input `f`. Three chained updates matching the three gates' classical actions.
defgidney_interior_bit_reverse_post_state
def gidney_interior_bit_reverse_post_state (i : Nat) (f : Nat → Bool) : Nat → Bool
Post-state of `gidney_adder_bit_step_faithful_interior_reverse i` on a basis-state input `f`. Four chained updates matching the four gates' classical actions.
defgidney_last_bit_reverse_post_state
def gidney_last_bit_reverse_post_state (i : Nat) (f : Nat → Bool) : Nat → Bool
Post-state of `gidney_adder_bit_step_faithful_last_reverse i` on a basis-state input `f`. Two chained updates on `carry_i` matching the two gates' classical actions.
defadder_input_F
def adder_input_F (n a b : Nat) (k : Nat) : Bool
**Generic input encoding** for the n-bit Gidney adder. Maps qubit index `k` to its Boolean value when the adder's input is `(a, b, 0_carries)`:
- `k = 3i` (read_i): bit `i` of `a` if `i < n`, else false.
- `k = 3i + 1` (target_i): bit `i` of `b` if `i < n`, else false.
- `k = 3i + 2` (carry_i): false.
Decide-witnessed below to match the existing concrete `inputF_*` defs.
defAdder.carry
def Adder.carry (b₀ : Bool) : Nat → (Nat → Bool) → (Nat → Bool) → Bool
  | 0,     _, _ => b₀
  | n + 1, f, g =>
      let c
**Classical carry function** (SQIR ModMult.v:497 port). Given a carry-in `b₀ : Bool` and two bit-streams `f g : Nat → Bool`, `Adder.carry b₀ n f g` is the carry-out after processing bits `0..n-1` of `f + g`.
defAdder.sumfb
def Adder.sumfb (b₀ : Bool) (f g : Nat → Bool) (i : Nat) : Bool
**Classical sum-bit function** (SQIR ModMult.v:638 port). `Adder.sumfb b₀ f g i = carry b₀ i f g ⊕ f i ⊕ g i`, i.e. bit `i` of the sum `f + g` with carry-in `b₀`.
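To make the carry/sum-bit pair concrete, here is a minimal standalone Lean 4 sketch. The body of `Adder.carry` is truncated in this listing, so the recurrence below is an assumption: the usual majority-style carry `(f n ∧ g n) ⊕ (f n ∧ c) ⊕ (g n ∧ c)`. All names live in a hypothetical `CarrySketch` namespace, and Boolean XOR is written with `!=`.

```lean
namespace CarrySketch

/-- Assumed majority-style carry recurrence (the actual body may differ). -/
def carry (b₀ : Bool) : Nat → (Nat → Bool) → (Nat → Bool) → Bool
  | 0,     _, _ => b₀
  | n + 1, f, g =>
      let c := carry b₀ n f g
      (f n && g n) != ((f n && c) != (g n && c))

/-- Sum bit as stated in the docstring: carry ⊕ f i ⊕ g i. -/
def sumfb (b₀ : Bool) (f g : Nat → Bool) (i : Nat) : Bool :=
  (carry b₀ i f g != f i) != g i

/-- Bit-stream of 3 (binary 011). -/
def bits3 : Nat → Bool
  | 0 => true
  | 1 => true
  | _ => false

/-- Bit-stream of 1 (binary 001). -/
def bits1 : Nat → Bool
  | 0 => true
  | _ => false

-- 3 + 1 = 4 = binary 100: sum bits 0 and 1 are 0, bit 2 is 1.
example : sumfb false bits3 bits1 0 = false := by decide
example : sumfb false bits3 bits1 1 = false := by decide
example : sumfb false bits3 bits1 2 = true  := by decide

end CarrySketch
```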
defGidney.forward_cascade_post_invariant
def Gidney.forward_cascade_post_invariant
    (n a b : Nat) (post : Nat → Bool) : Prop
**End-of-forward-cascade invariant**: characterizes the state after `gidney_forward_faithful_full n` has processed all `n` bits of inputs `a` and `b`. Quantum quantities (read/target/carry positions) are characterized as classical functions of `a`, `b`, and the classical carry chain `Adder.carry`. SQIR-style: this is the `forall i, predicate` form, not the per-step `msma`.
defGidney.propagation_step_invariant
def Gidney.propagation_step_invariant
    (k n a b : Nat) (post : Nat → Bool) : Prop
**Step-indexed propagation invariant** (Iter 175, analog of SQIR's msma/msmb/msmc at ModMult.v:631). After `k` propagation iterations of `gidney_propagation_post_state k f` on input `adder_input_F n a b`:
- For `j < k`: carry_j = c_{j+1} (= Adder.carry false (j+1) ...); else (j ≥ k, unchanged): carry_j = false.
- For `j ≤ k`: read_j = a_j ⊕ c_j (propagated; note c_0 = false so j=0 collapses to read_0 = a_0); else (j > k, unchanged): read_j = a_j.
- Same for target_j.
Indexing: k=0 means "before first step", k=1 means "after first-bit step", k=K means "after first-bit + (K-1) interior steps" (= bits 0..K-1 processed). This is the non-trivial induction invariant for cascade composition (rule 1: mirror SQIR's MAJseq'_correct structure).
defgidney_interior_bit_post_state
def gidney_interior_bit_post_state (i : Nat) (f : Nat → Bool) : Nat → Bool
**Forward post-state of interior bit-step at `i`** (Iter 166). Defined analogously to `gidney_first_bit_post_state` but for the interior 4-gate step at position `i ≥ 1`. Gate sequence (from `gidney_adder_bit_step_faithful_interior i`):
1. CCX(read_i, target_i, carry_i)
2. CX(carry_{i-1}, carry_i): chain carry from previous
3. CX(carry_i, read_{i+1}): propagation
4. CX(carry_i, target_{i+1}): propagation
Each gate's classical action is an XOR-update at the target.
def adder_sum_bit_classical
def adder_sum_bit_classical (a b i : Nat) : Bool
**Classical specification**: bit `i` of `(a + b) mod 2^n`, the value the i-th target qubit SHOULD hold after the full forward + final-CX cascade (per Iter 106's finding, the reverse cascade only undoes propagation but not the sum).
def Gidney.post_last_bit_invariant
def Gidney.post_last_bit_invariant
    (n a b : Nat) (post : Nat → Bool) : Prop
**End-state invariant for forward cascade (forward only, no final-CX)** (Iter 187, 2026-05-13). Captures the EXACT classical action of `gidney_forward_faithful_full_post_state n` on `adder_input_F n a b` for an n-bit adder (n ≥ 2). For all `j < n`:
- **carry_j** = `c_{j+1}` (the last-bit step writes carry_{n-1} = c_n; propagation writes all earlier carries).
- **read_j** = `a_j ⊕ c_j` (forward propagation; `c_0 = 0` collapses this to `a_0` for j = 0).
- **target_j** = `b_j ⊕ c_j` (forward propagation; analogous).
Compared to `Gidney.post_forward_final_cx_invariant` (Iter 183): same carry and read clauses; the target clause is `b_j ⊕ c_j` here vs `a_j ⊕ b_j` post-final-CX (the final-CX layer XORs read_j into target_j, canceling c_j).
Composition path: `gidney_forward_faithful_full_post_state n = gidney_last_bit_post_state (n-1) ∘ gidney_propagation_post_state (n-1)` (for n ≥ 2 via the recursive def's third clause). The propagation invariant at step (n-1) gives all positions j < n propagated, with `carry_{n-1} = false` (still unprocessed); the last-bit step at position n-1 writes `carry_{n-1} = c_n` and doesn't touch read/target. Combining: all 3 invariants hold for j ∈ [0, n-1].
def Gidney.post_forward_final_cx_invariant
def Gidney.post_forward_final_cx_invariant
    (n a b : Nat) (post : Nat → Bool) : Prop
**End-state invariant for forward + final-CX cascade** (Iter 183, 2026-05-13). Captures the EXACT classical action of `gidney_final_cx_cascade_post_state n ∘ gidney_forward_faithful_full_post_state n` on `adder_input_F n a b` for an n-bit adder (n ≥ 2). For all `j < n`:
- **carry_j** = `c_{j+1}` (= `Adder.carry false (j+1) a.testBit b.testBit`). All carries 0..n-1 hold the math carry into the next position.
- **read_j** = `a_j ⊕ c_j` (forward propagation; `c_0 = 0` collapses this to `a_0` for j = 0).
- **target_j** = `a_j ⊕ b_j`. The forward step writes target_j = b_j ⊕ c_j (propagation); the final-CX layer adds read_j = a_j ⊕ c_j; `(b_j ⊕ c_j) ⊕ (a_j ⊕ c_j) = a_j ⊕ b_j`. The c_j contributions **cancel** in the XOR, which is exactly the review finding of Iter 182.
**Key review insight**: `target_j` is NOT the classical sum bit `a_j ⊕ b_j ⊕ c_j`. The reverse cascade is essential to re-XOR `c_j` into target_j (via the per-step `CX(c_{j-1}, t_j)` gates), completing the sum-bit computation. This invariant captures the correct intermediate state at the boundary between forward + final-CX and the reverse cascade.
Per Iter 182 finding + QUESTIONS.md #1 (2026-05-13): this invariant is the **provable** parametric statement at this layer. The headline sum-bit theorem (target_j = sum_j) needs the additional reverse cascade composition, awaiting John's approval of the restatement.
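The cancellation in the target clause is a three-variable Boolean identity and can be checked by exhaustive case analysis. A minimal standalone sketch (the helper `Sketch.bxor` is a local stand-in for the library's `xor`):

```lean
namespace Sketch

def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

-- Final-CX cancellation: (b ⊕ c) ⊕ (a ⊕ c) = a ⊕ b for all Booleans.
example (a b c : Bool) : bxor (bxor b c) (bxor a c) = bxor a b := by
  cases a <;> cases b <;> cases c <;> rfl

-- With a = b = c = 1, a ⊕ b is 0 while the sum bit a ⊕ b ⊕ c is 1,
-- so the post-final-CX target is not yet the sum bit.
example : bxor true true ≠ bxor (bxor true true) true := by decide

end Sketch
```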
def gidney_propagation_reverse_post_state
def gidney_propagation_reverse_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0       , f => f
  | 1       , f => gidney_first_bit_reverse_post_state f
  | n + 2   , f =>
      gidney_propagation_reverse_post_state (n + 1)
        (gidney_interior_bit_reverse_post_state (n + 1) f)
**Post-state of the reverse propagation cascade** (Iter 191). Mirrors `gidney_adder_forward_with_propagation_reverse` (line 1826): apply `interior_reverse (n+1)`, then recurse, ending with `first_reverse`. For `n+2`: applies interior_reverse from i=n+1 down to i=1, then first_reverse.
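The order in which the recursion fires its steps can be made explicit by recording labels instead of applying post-state functions. This is an illustrative sketch only; `Step` and `propagationReverseOrder` are hypothetical names that mirror the shape of the three-clause recursion.

```lean
namespace Sketch

/-- Labels for the reverse steps; stand-ins for the real post-state functions. -/
inductive Step where
  | first
  | interior (i : Nat)

/-- Steps in the order they are applied to the input state (head is applied first). -/
def propagationReverseOrder : Nat → List Step
  | 0     => []
  | 1     => [Step.first]
  | n + 2 => Step.interior (n + 1) :: propagationReverseOrder (n + 1)

-- Width 3: interior_reverse 2, then interior_reverse 1, then first_reverse.
example : propagationReverseOrder 3
    = [Step.interior 2, Step.interior 1, Step.first] := rfl

end Sketch
```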
def gidney_full_reverse_post_state
def gidney_full_reverse_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0       , f => f
  | 1       , f => f
  | n + 2   , f =>
      gidney_propagation_reverse_post_state (n + 1)
        (gidney_last_bit_reverse_post_state (n + 1) f)
**Post-state of the full forward+reverse cascade** (Iter 191). Mirrors `gidney_adder_forward_faithful_full_reverse` (line 1849): apply `last_reverse (n+1)`, then `propagation_reverse (n+1)`.
def Gidney.post_full_reverse_invariant
def Gidney.post_full_reverse_invariant
    (n a b : Nat) (post : Nat → Bool) : Prop
**Post-full-reverse invariant** (Iter 197, 2026-05-13). Captures target+read state after the full forward+final-CX+reverse cascade. For `n ≥ 2`:
- `target_j = sum_j = (a + b).testBit j` (the SUM bit, completing the review chain).
- `read_j = a.testBit j` (RESTORED to its input value via the reverse cascade's r_{i+1} ⊕= c_i operation, which inverts the forward propagation r_{i+1} ⊕= c_i).
def Gidney.reverse_step_invariant
def Gidney.reverse_step_invariant
    (k n a b : Nat) (post : Nat → Bool) : Prop
**Step-indexed reverse-cascade invariant** (Iter 2026-05-14, the `inv_k` from the comment-bridge above the `sorry` at `TODO_post_full_reverse_invariant_holds`). `reverse_step_invariant k n a b post` says: after `k` reverse steps have fired on the post-final-CX state, every position `j ∈ [n - k, n - 1]` has been corrected to its final target/read values (target_j = sum_j, read_j = a.testBit j). Positions `j < n - k` have NOT yet been corrected and are excluded from the predicate's quantifier.
- `k = 0`: empty predicate (no j satisfies `n ≤ j < n`).
- `k = 1`: covers j = n-1 (the last-bit reverse target).
- `k = n`: covers all j ∈ [0, n-1], i.e., equivalent to `Gidney.post_full_reverse_invariant n a b post`.
The cascade-induction proof of `TODO_post_full_reverse_invariant_holds` factors as: state `reverse_step_invariant k _` for k=1..n, induct on k via Iter 194/195/200/201 + frame, conclude at k=n.
def gidney_read_val
def gidney_read_val : Nat → (Nat → Bool) → Nat
  | 0,     _ => 0
  | n + 1, f =>
      gidney_read_val n f + (if f (read_idx n) then 2^n else 0)
Decoder: value of the `read` register at width `n`, LSB-first. Bit at `read_idx i = 3*i` contributes weight `2^i`.
def gidney_target_val
def gidney_target_val : Nat → (Nat → Bool) → Nat
  | 0,     _ => 0
  | n + 1, f =>
      gidney_target_val n f + (if f (target_idx n) then 2^n else 0)
Decoder: value of the `target` register at width `n`, LSB-first. Bit at `target_idx i = 3*i + 1` contributes weight `2^i`.
def gidney_carry_val
def gidney_carry_val : Nat → (Nat → Bool) → Nat
  | 0,     _ => 0
  | n + 1, f =>
      gidney_carry_val n f + (if f (carry_idx n) then 2^n else 0)
Decoder: value of the `carry` register at width `n`, LSB-first. Bit at `carry_idx i = 3*i + 2` contributes weight `2^i`.
def inputF_1_plus_1_tickD
def inputF_1_plus_1_tickD : Nat → Bool
  | 0 => true   -- read_idx 0 = 0:  read[0] = 1 (LSB)
  | 1 => true   -- target_idx 0 = 1: target[0] = 1 (LSB)
  | _ => false  -- read[1] = 0, target[1] = 0, carry[0] = carry[1] = 0
Concrete 1+1 input (LSB-first): `read = 1, target = 1` at width 2.
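The three decoders share one shape and differ only in the index map, so their behaviour on the 1+1 input can be illustrated with a single generic decoder. This is a hedged sketch: `Sketch.regVal` and `Sketch.input` are hypothetical names, and the index maps `3*i`, `3*i + 1`, `3*i + 2` are the ones stated in the decoder docstrings.

```lean
namespace Sketch

/-- Generic LSB-first decoder: bit `i` of the register lives at position `idx i`. -/
def regVal (idx : Nat → Nat) : Nat → (Nat → Bool) → Nat
  | 0,     _ => 0
  | n + 1, f => regVal idx n f + (if f (idx n) then 2 ^ n else 0)

/-- Same bit pattern as the concrete 1+1 input: positions 0 and 1 set. -/
def input : Nat → Bool
  | 0 => true
  | 1 => true
  | _ => false

-- Before any gate runs: read = 1, target = 1, carry = 0.
example : regVal (fun i => 3 * i)     2 input = 1 := by decide
example : regVal (fun i => 3 * i + 1) 2 input = 1 := by decide
example : regVal (fun i => 3 * i + 2) 2 input = 0 := by decide

end Sketch
```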
def gidney_adder_bit_step_faithful_first_reverse_patched
def gidney_adder_bit_step_faithful_first_reverse_patched : Gate
Patched first-bit reverse step: existing first-reverse followed by `CX(read_idx 0, carry_idx 0)` to clear `carry_idx 0`.
def gidney_adder_bit_step_faithful_interior_reverse_patched
def gidney_adder_bit_step_faithful_interior_reverse_patched (i : Nat) :
    Gate
Patched interior-bit reverse step: existing interior-reverse followed by `CX(read_idx i, carry_idx i)` to clear `carry_idx i`.
def gidney_adder_bit_step_faithful_last_reverse_patched
def gidney_adder_bit_step_faithful_last_reverse_patched (i : Nat) :
    Gate
Patched last-bit reverse step: existing last-reverse followed by `CX(read_idx i, carry_idx i)` to clear `carry_idx i`.
def gidney_adder_forward_with_propagation_reverse_patched
def gidney_adder_forward_with_propagation_reverse_patched : Nat → Gate
  | 0       => Gate.I
  | 1       => gidney_adder_bit_step_faithful_first_reverse_patched
  | n + 2   =>
      Gate.seq (gidney_adder_bit_step_faithful_interior_reverse_patched (n + 1))
               (gidney_adder_forward_with_propagation_reverse_patched (n + 1))
Patched propagation reverse cascade.
def gidney_adder_forward_faithful_full_reverse_patched
def gidney_adder_forward_faithful_full_reverse_patched : Nat → Gate
  | 0       => Gate.I
  | 1       => Gate.I
  | n + 2   =>
      Gate.seq (gidney_adder_bit_step_faithful_last_reverse_patched (n + 1))
               (gidney_adder_forward_with_propagation_reverse_patched (n + 1))
Patched full reverse cascade.
def gidney_adder_full_faithful_no_measurement_patched
def gidney_adder_full_faithful_no_measurement_patched : Nat → Gate
  | 0       => Gate.I
  | 1       => Gate.I
  | n + 2   =>
      Gate.seq
        (Gate.seq (gidney_adder_forward_faithful_full (n + 2))
                  (gidney_final_cx_cascade (n + 2)))
        (gidney_adder_forward_faithful_full_reverse_patched (n + 2))
Patched full faithful no-measurement Gidney adder: forward + final-CX + **patched** reverse.

FormalRV.Arithmetic.RippleCarryAdder.RippleCarryAdderPropagationReverse

FormalRV/Arithmetic/RippleCarryAdder/RippleCarryAdderPropagationReverse.lean
theorem gidney_propagation_reverse_at_read_eq_interior_reverse
theorem gidney_propagation_reverse_at_read_eq_interior_reverse
    (K j : Nat) (hj : 1 < j) (hjK : j ≤ K) (g : Nat → Bool) :
    gidney_propagation_reverse_post_state K g (read_idx j)
    = gidney_interior_bit_reverse_post_state (j - 1) g (read_idx j)
**Propagation reverse at read_j reduces to interior_reverse(j-1)** (2026-05-14 tick, read-side analog of the target version at line ~5488). For j ∈ [2, K], `propagation_reverse(K) g (read_idx j)` equals `interior_reverse(j-1) g (read_idx j)`. Same induction-on-K + case-split structure as the target version, with read_idx in place of target_idx and using the read-side preserves/dependence helpers (`_preserves_read_above`, `_at_read_low_dependence`).
theorem gidney_classical_action_with_reverse_target_geq_2
theorem gidney_classical_action_with_reverse_target_geq_2
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n)
    (j : Nat) (hj : 2 ≤ j) (hjn : j < n) :
    gidney_full_reverse_post_state n
      (gidney_final_cx_cascade_post_state n
        (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
      (target_idx j)
    = adder_sum_bit_classical a b j
**Headline j ≥ 2 case** (Iter 208 STATED, sorried). For j ∈ [2, n-1], target_idx j after full forward+CX+reverse equals sum_j. The relevant per-step is `interior_reverse(j-1)`, which fires at cascade step (n-j+1). Proof structure (pending):
- "High-position frame": earlier reverses (last_reverse(n-1), interior_reverse(n-2), ..., interior_reverse(j)) all modify positions ≥ 3j+2 (= carry_idx j, the minimum). They preserve interior_reverse(j-1)'s input positions ≤ 3j+1.
- Apply Iter 201's `gidney_interior_bit_reverse_computes_sum` with hypotheses verified from post-CX (Iter 189).
- "Low-position frame": later reverses (interior_reverse(j-2), ..., first_reverse) all modify positions ≤ 3j-2. They preserve target_idx j = 3j+1.
- Conclude full_reverse n f (target_idx j) = sum_j.
Estimated 60-100 lines for the structural framing. The per-step computes_sum + frame conditions are a mechanical mirror of the forward cascade pipeline (Iter 175-181).
theorem gidney_first_bit_reverse_preserves_read_0
theorem gidney_first_bit_reverse_preserves_read_0 (f : Nat → Bool) :
    gidney_first_bit_reverse_post_state f (read_idx 0) = f (read_idx 0)
**First-bit reverse preserves read_0** (2026-05-14 tick). Mirror of `_preserves_target_0` at line 4933. first_bit_reverse modifies {target_1, read_1, carry_0} = {4, 3, 2}; read_idx 0 = 0 is none of these positions.
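The frame argument is an instance of a general fact about point updates: writing at position `k` cannot change the value at any `j ≠ k`. A minimal standalone sketch, with `Sketch.update` and `Sketch.update_ne` as hypothetical stand-ins for the library's `update` and its frame lemma; the written values and the order of the three writes are arbitrary assumptions, since only the positions matter.

```lean
namespace Sketch

/-- Overwrite position `k` of the bit function `f` with `v`. -/
def update (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = k then v else f j

/-- Frame lemma: an update at `k` leaves every other position `j` unchanged. -/
theorem update_ne (f : Nat → Bool) (k j : Nat) (v : Bool) (h : j ≠ k) :
    update f k v j = f j := by
  show (if j = k then v else f j) = f j
  exact if_neg h

-- Writes at positions 4, 3, 2 (target_1, read_1, carry_0), with arbitrary
-- values and in an ASSUMED order, never touch position 0 (read_0).
example (f : Nat → Bool) (v₁ v₂ v₃ : Bool) :
    update (update (update f 4 v₁) 3 v₂) 2 v₃ 0 = f 0 := rfl

end Sketch
```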
theorem gidney_classical_action_with_reverse_read_0
theorem gidney_classical_action_with_reverse_read_0
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    gidney_full_reverse_post_state n
      (gidney_final_cx_cascade_post_state n
        (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
      (read_idx 0)
    = a.testBit 0
**Headline j=0 read case PROVEN parametrically over n** (2026-05-14 tick, read-side analog of `_with_reverse_target_0` at line 5296). Uses `gidney_full_reverse_eq_first_rev_low` (since read_idx 0 = 0 < 5) to reduce to first_bit_reverse, then the just-proven `_first_bit_reverse_preserves_read_0` frame, then the `post_forward_final_cx_invariant` simplification at j=0: `xor a_0 (Adder.carry false 0 a b) = xor a_0 false = a_0`.
theorem gidney_classical_action_with_reverse_read_1
theorem gidney_classical_action_with_reverse_read_1
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    gidney_full_reverse_post_state n
      (gidney_final_cx_cascade_post_state n
        (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
      (read_idx 1)
    = a.testBit 1
**Headline j=1 read case PROVEN parametrically over n** (2026-05-14 tick, read-side analog of `_with_reverse_target_1` at line 5317). Uses `gidney_full_reverse_eq_first_rev_low` (read_idx 1 = 3 < 5) to reduce to first_bit_reverse, then Iter 194's `.2.1` directly gives `first_bit_reverse f (read_idx 1) = a.testBit 1`.
theorem gidney_classical_action_with_reverse_read_geq_2
theorem gidney_classical_action_with_reverse_read_geq_2
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n)
    (j : Nat) (hj : 2 ≤ j) (hjn : j < n) :
    gidney_full_reverse_post_state n
      (gidney_final_cx_cascade_post_state n
        (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
      (read_idx j)
    = a.testBit j
**Read-side analog of `_with_reverse_target_geq_2`** (2026-05-14 tick). For j ∈ [2, n-1], the read_j position after the full forward+CX+reverse cascade equals `a.testBit j`. Same proof structure as the target version, using the read-side parametric `_at_read_eq_interior_reverse` and the read component (`.2.1`) of Iter 195's `_post_state_in_bits`, with XOR cancellation `xor (xor a_j c_j) c_j = a_j`.
theorem gidney_classical_action_with_reverse_assembled
theorem gidney_classical_action_with_reverse_assembled
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    ∀ i, i < n →
      gidney_full_reverse_post_state n
        (gidney_final_cx_cascade_post_state n
          (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
        (target_idx i)
      = adder_sum_bit_classical a b i
**HEADLINE: TODO_gidney_classical_action_with_reverse PROVEN** (Iter 208 ASSEMBLY, modulo Iter 208's j ≥ 2 sorry). Combines:
- Iter 202: j=0 case PARAMETRIC.
- Iter 207: j=1 case PARAMETRIC over n.
- Iter 208: TODO_..._target_geq_2 for j ∈ [2, n-1] (sorried).
theorem gidney_classical_action_with_reverse
theorem gidney_classical_action_with_reverse (n a b : Nat)
    (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    ∀ i, i < n →
      gidney_full_reverse_post_state n
        (gidney_final_cx_cascade_post_state n
          (gidney_forward_faithful_full_post_state n (adder_input_F n a b)))
        (target_idx i)
      = adder_sum_bit_classical a b i
**HEADLINE: Iter 191's restated headline, NOW PROVEN (Iter 213, 2026-05-13)**. The parametric semantic-correctness theorem with the REVERSE cascade. The Gidney ripple-carry adder is now Verified per CLAUDE.md taxonomy. Note: this theorem statement was originally drafted at line ~4605 as `TODO_gidney_classical_action_with_reverse` (sorried, Iter 191). Iter 213 derives it via `gidney_classical_action_with_reverse_assembled`.
theorem Gidney.reverse_step_invariant_n_minus_1_after_propagation_reverse
theorem Gidney.reverse_step_invariant_n_minus_1_after_propagation_reverse
    (n a b : Nat) (hn : 1 < n) (_ha : a < 2^n) (_hb : b < 2^n)
    (input : Nat → Bool)
    (h_input : Gidney.post_forward_final_cx_invariant n a b input)
    (_h_t0 : input (target_idx 0) = adder_sum_bit_classical a b 0) :
    Gidney.reverse_step_invariant (n - 1) n a b
      (gidney_propagation_reverse_post_state (n - 1) input)
**Direct (non-K-inductive) cascade target** (2026-05-14 tick). For register width `n ≥ 2`, the parametric `propagation_reverse(n-1)` applied to the post-final-CX state produces a state satisfying `Gidney.reverse_step_invariant (n - 1) n a b _`.
**Proof structure**: case-split on `j` in the predicate quantifier:
- `j = 1`: use `gidney_propagation_reverse_eq_first_rev_low` to reduce propagation_reverse(n-1) at target_idx 1 / read_idx 1 to first_bit_reverse, then Iter 194's `gidney_first_bit_reverse_preserves` closes both, with the target side using `sumfb_eq_testBit_add` for the XOR identity.
- `1 < j ≤ n - 1` (TODO_case_j_gt_1): use `gidney_propagation_reverse_at_target_eq_interior_reverse` to reduce to interior_reverse(j-1), then Iter 201.
theorem Gidney.post_full_reverse_invariant_holds
theorem Gidney.post_full_reverse_invariant_holds
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    Gidney.post_full_reverse_invariant n a b
      (gidney_full_reverse_post_state n
        (gidney_final_cx_cascade_post_state n
          (gidney_forward_faithful_full_post_state n (adder_input_F n a b))))
**Closing composition** (2026-05-14 tick). For every n ≥ 2 and valid a, b inputs, the full forward + final-CX + reverse cascade state satisfies `Gidney.post_full_reverse_invariant`: every target_j equals sum_j AND every read_j equals a.testBit j.
- Target side: closed via the existing `gidney_classical_action_with_reverse` (Iter 213 assembly).
- Read side (TODO_read_via_direct): bridge from the new `_n_minus_1_after_propagation_reverse` (which proves the read side for the SIMPLER input `propagation_reverse(n-1) f` without the outer last_reverse layer) to the actual cascade `propagation_reverse(n-1) (last_reverse(n-1) f)`. The bridge requires showing that propagation_reverse is c_{n-1}-independent on read positions (since last_reverse modifies only c_{n-1}). ~30 lines of frame argument, deferred to next tick.
example (example)
example :
    Gidney.post_full_reverse_invariant 2 1 1
      (gidney_full_reverse_post_state 2
        (gidney_final_cx_cascade_post_state 2
          (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 1))))
**Milestone validation** (2026-05-14 tick): the proven theorem fires correctly on the Iter 182 counterexample case (n=2, a=1, b=1), the same instance where the original `TODO_gidney_classical_action` was found to be UNPROVABLE as stated. Confirms semantic-correctness closure at the smallest non-trivial input. Review hygiene (via `mcp__lean-lsp__lean_verify`, 2026-05-14): `Gidney.post_full_reverse_invariant_holds` depends only on `propext` and `Quot.sound`, both standard Lean foundational axioms. No custom axioms. See `notes/axiom-hygiene.md`.
example (example)
example : tcount (gidney_adder_full_faithful_no_measurement 33) = 462
**RSA-2048 adder T-count = 462** (Iter 262). For the maximum adder size in the RSA-2048 Shor's circuit (q_A = 33, qianxu p. 22), `tcount (gidney_adder_full_faithful_no_measurement 33) = 14·33 = 462`. Per qianxu Eq. E3: τ_adder = 25 q_A τ_s = 825 τ_s. The 462 T gates are the underlying T-count; the per-bit cost derived from it (here 14n / q_A = 14) becomes a verified-correctness building block.
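The two numeric claims in this docstring are plain arithmetic and can be spot-checked independently of the circuit definitions. A tiny sketch (it checks only the products, not `tcount` itself):

```lean
-- T-count at q_A = 33, assuming 14 T gates per bit position.
example : 14 * 33 = 462 := by decide

-- Coefficient of τ_s in Eq. E3 at q_A = 33.
example : 25 * 33 = 825 := by decide
```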
example (example)
example :
    tcount (gidney_adder_full_faithful_no_measurement
              qianxu_q_A_RSA2048)
      = gidney_adder_RSA2048_T_count_verified
**Bridge: verified parametric T-count matches the RSA-2048 paper-claim anchor** (Iter 263). Closes the review's paper-claim-first discipline (CLAUDE.md): the gate-faithful adder's T-count at q_A=33 matches the `gidney_adder_RSA2048_T_count_verified` paper-claim constant in `PaperClaims.lean`.
theorem gidney_adder_bit_step_0_applyNat
theorem gidney_adder_bit_step_0_applyNat (f : Nat → Bool) :
    Gate.applyNat (gidney_adder_bit_step 0) f
      = update f (carry_idx 0)
          (xor (f (carry_idx 0))
               (f (read_idx 0) && f (target_idx 0)))
`Gate.applyNat` form of `gidney_adder_bit_step_0_correct`. The i=0 step is a single CCX; its applyNat semantics matches the single-bit Toffoli update directly.
theorem gidney_adder_bit_step_faithful_first_applyNat
theorem gidney_adder_bit_step_faithful_first_applyNat (f : Nat → Bool) :
    Gate.applyNat gidney_adder_bit_step_faithful_first f
      = gidney_first_bit_post_state f
`Gate.applyNat` form of `gidney_adder_bit_step_faithful_first_correct`. The first-bit step's `applyNat` action is exactly the three-update chain captured by `gidney_first_bit_post_state`.
theorem gidney_adder_bit_step_faithful_interior_applyNat
theorem gidney_adder_bit_step_faithful_interior_applyNat
    (i : Nat) (f : Nat → Bool) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior i) f
      = gidney_bit_step_faithful_post_state i f
`Gate.applyNat` form of `gidney_adder_bit_step_faithful_interior_correct`. The interior step's `applyNat` action is exactly the four-update chain captured by `gidney_bit_step_faithful_post_state`.
theorem gidney_adder_bit_step_faithful_last_applyNat
theorem gidney_adder_bit_step_faithful_last_applyNat
    (i : Nat) (f : Nat → Bool) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last i) f
      = gidney_last_bit_post_state i f
`Gate.applyNat` form of `gidney_adder_bit_step_faithful_last_correct`. The last-bit step's `applyNat` action is exactly the two-update chain captured by `gidney_last_bit_post_state`.
theorem gidney_final_cx_cascade_applyNat
theorem gidney_final_cx_cascade_applyNat :
    ∀ (n : Nat) (f : Nat → Bool),
      Gate.applyNat (gidney_final_cx_cascade n) f
        = gidney_final_cx_cascade_post_state n f
  | 0,     _ => rfl
  | n + 1, f =>
`Gate.applyNat` form of the final CX cascade. The cascade is a sequence of `CX(read[i], target[i])` for `i = 0..n-1`; its `applyNat` action is the chained `update` exactly captured by `gidney_final_cx_cascade_post_state`.
theorem gidney_adder_forward_with_propagation_applyNat
theorem gidney_adder_forward_with_propagation_applyNat :
    ∀ (n : Nat) (f : Nat → Bool),
      Gate.applyNat (gidney_adder_forward_with_propagation n) f
        = gidney_propagation_post_state n f
  | 0,     _ => rfl
  | 1,     _ => rfl
  | n + 2, f =>
`Gate.applyNat` form of the n-bit Gidney forward propagation cascade. Composes per-bit-step `Gate.applyNat` identities (Tick B) via the seq case. Base cases (`n = 0, 1`) and the inductive case all reduce to a single rewrite through the recursive identity + the per-step wrapper.
theorem gidney_adder_forward_faithful_full_applyNat
theorem gidney_adder_forward_faithful_full_applyNat :
    ∀ (n : Nat) (f : Nat → Bool),
      Gate.applyNat (gidney_adder_forward_faithful_full n) f
        = gidney_forward_faithful_full_post_state n f
  | 0,     _ => rfl
  | 1,     _ => rfl
  | n + 2, f =>
`Gate.applyNat` form of the full Gidney forward pass. The `applyNat` action is the propagation post-state through bit n-1 chained with the last-bit step at position n-1.
theorem gidney_read_val_lt
theorem gidney_read_val_lt : ∀ (n : Nat) (f : Nat → Bool),
    gidney_read_val n f < 2^n
  | 0,     _ => by simp [gidney_read_val]
  | n + 1, f =>
Decoder bound: `read_val < 2^n` for any bit-function.
theorem gidney_target_val_lt
theorem gidney_target_val_lt : ∀ (n : Nat) (f : Nat → Bool),
    gidney_target_val n f < 2^n
  | 0,     _ => by simp [gidney_target_val]
  | n + 1, f =>
Decoder bound: `target_val < 2^n`.
theorem gidney_carry_val_lt
theorem gidney_carry_val_lt : ∀ (n : Nat) (f : Nat → Bool),
    gidney_carry_val n f < 2^n
  | 0,     _ => by simp [gidney_carry_val]
  | n + 1, f =>
Decoder bound: `carry_val < 2^n`.
example (example)
example :
    gidney_target_val 2
      (Gate.applyNat (gidney_adder_full_faithful_no_measurement 2)
        inputF_1_plus_1_tickD) = 2
**Target register is correct**: after the full faithful no-measurement adder, target encodes `1 + 1 = 2`.
example (example)
example :
    gidney_read_val 2
      (Gate.applyNat (gidney_adder_full_faithful_no_measurement 2)
        inputF_1_plus_1_tickD) = 1
**Read register is preserved**: after the full faithful no-measurement adder, read = 1 (unchanged).
example (example)
example :
    gidney_carry_val 2
      (Gate.applyNat (gidney_adder_full_faithful_no_measurement 2)
        inputF_1_plus_1_tickD) = 3
**Carry register is NOT cleared**: after the full faithful no-measurement adder, carry = 3 (binary `11`), not 0. This is the open gap that blocks a verified modular adder built on this circuit.
theorem gidney_adder_bit_step_faithful_first_reverse_applyNat
theorem gidney_adder_bit_step_faithful_first_reverse_applyNat
    (f : Nat → Bool) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse f
      = gidney_first_bit_reverse_post_state f
`Gate.applyNat` form of the first-bit reverse step.
theorem gidney_adder_bit_step_faithful_interior_reverse_applyNat
theorem gidney_adder_bit_step_faithful_interior_reverse_applyNat
    (i : Nat) (f : Nat → Bool) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse i) f
      = gidney_interior_bit_reverse_post_state i f
`Gate.applyNat` form of the interior-bit reverse step.
theorem gidney_adder_bit_step_faithful_last_reverse_applyNat
theorem gidney_adder_bit_step_faithful_last_reverse_applyNat
    (i : Nat) (f : Nat → Bool) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse i) f
      = gidney_last_bit_reverse_post_state i f
`Gate.applyNat` form of the last-bit reverse step.
theorem gidney_adder_forward_with_propagation_reverse_applyNat
theorem gidney_adder_forward_with_propagation_reverse_applyNat :
    ∀ (n : Nat) (f : Nat → Bool),
      Gate.applyNat (gidney_adder_forward_with_propagation_reverse n) f
        = gidney_propagation_reverse_post_state n f
  | 0,     _ => rfl
  | 1,     _ => rfl
  | n + 2, f =>
`Gate.applyNat` form of the n-bit propagation reverse cascade.
theorem gidney_adder_forward_faithful_full_reverse_applyNat
theorem gidney_adder_forward_faithful_full_reverse_applyNat :
    ∀ (n : Nat) (f : Nat → Bool),
      Gate.applyNat (gidney_adder_forward_faithful_full_reverse n) f
        = gidney_full_reverse_post_state n f
  | 0,     _ => rfl
  | 1,     _ => rfl
  | n + 2, f =>
`Gate.applyNat` form of the full Gidney reverse cascade.
theorem gidney_adder_full_faithful_no_measurement_applyNat
theorem gidney_adder_full_faithful_no_measurement_applyNat
    (n : Nat) (f : Nat → Bool) :
    Gate.applyNat (gidney_adder_full_faithful_no_measurement (n + 2)) f
      = gidney_full_reverse_post_state (n + 2)
          (gidney_final_cx_cascade_post_state (n + 2)
            (gidney_forward_faithful_full_post_state (n + 2) f))
`Gate.applyNat` form of the full faithful no-measurement Gidney adder for `n ≥ 2` (the only width at which the adder does non-trivial work; `n = 0` and `n = 1` are `Gate.I`). Composes the three Tick C forward wrappers + the new reverse wrapper.
theorem gidney_adder_full_faithful_no_measurement_target_correct
theorem gidney_adder_full_faithful_no_measurement_target_correct
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    ∀ i, i < n →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement n)
        (adder_input_F n a b) (target_idx i)
      = adder_sum_bit_classical a b i
**`Gate.applyNat`-form arithmetic correctness, target register.** For `n ≥ 2`, the full faithful Gidney adder applied to the standard 2-operand input encoding writes the correct sum bits into the target register. Lift of `gidney_classical_action_with_reverse` (Iter 213) through `gidney_adder_full_faithful_no_measurement_applyNat`.
theorem gidney_adder_full_faithful_no_measurement_read_correct_0
theorem gidney_adder_full_faithful_no_measurement_read_correct_0
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    Gate.applyNat (gidney_adder_full_faithful_no_measurement n)
        (adder_input_F n a b) (read_idx 0)
      = a.testBit 0
**`Gate.applyNat`-form read-register preservation, j = 0.**
theorem gidney_adder_full_faithful_no_measurement_read_correct_1
theorem gidney_adder_full_faithful_no_measurement_read_correct_1
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    Gate.applyNat (gidney_adder_full_faithful_no_measurement n)
        (adder_input_F n a b) (read_idx 1)
      = a.testBit 1
**`Gate.applyNat`-form read-register preservation, j = 1.**
theorem gidney_adder_full_faithful_no_measurement_read_correct_geq_2
theorem gidney_adder_full_faithful_no_measurement_read_correct_geq_2
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n)
    (j : Nat) (hj : 2 ≤ j) (hjn : j < n) :
    Gate.applyNat (gidney_adder_full_faithful_no_measurement n)
        (adder_input_F n a b) (read_idx j)
      = a.testBit j
**`Gate.applyNat`-form read-register preservation, j ≥ 2.**
theorem gidney_adder_full_faithful_no_measurement_read_correct
theorem gidney_adder_full_faithful_no_measurement_read_correct
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    ∀ i, i < n →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement n)
        (adder_input_F n a b) (read_idx i)
      = a.testBit i
**`Gate.applyNat`-form read-register preservation, all positions.** Assembles the three cases above.
theorem gidney_adder_full_does_not_clear_carries_in_general
theorem gidney_adder_full_does_not_clear_carries_in_general :
    ¬ (∀ n a b, 1 < n → a < 2^n → b < 2^n → ∀ i, i < n →
        (Gate.applyNat (gidney_adder_full_faithful_no_measurement n)
          (adder_input_F n a b)) (carry_idx i) = false)
**Formalized Tick D finding**: the full faithful no-measurement Gidney adder does NOT clear the carry register in general. Proof: machine-checked counterexample at `(n=2, a=1, b=1, i=0)`. The existing Iter 191 work proves target-bit correctness and read-register preservation, but it does NOT (and, as this theorem shows, CANNOT) also establish carry-zeroing. This is the precise structural defect that blocks a verified modular adder built on this circuit: modular reduction requires clean ancillas to compare and conditionally subtract, but the existing adder leaves carries dirty whenever the carry chain is non-trivial.
theorem patched_n2_clears_carries
theorem patched_n2_clears_carries :
    ∀ a b, a < 4 → b < 4 → ∀ i, i < 2 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched 2)
        (adder_input_F 2 a b) (carry_idx i) = false
**Patched adder clears carries, n=2 exhaustive**. Over all `(a, b) ∈ [0, 4) × [0, 4)`, every carry position of the patched full faithful no-measurement Gidney adder is `false`.
theorem patched_n2_target_correct
theorem patched_n2_target_correct :
    ∀ a b, a < 4 → b < 4 → ∀ i, i < 2 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched 2)
          (adder_input_F 2 a b) (target_idx i)
        = adder_sum_bit_classical a b i
**Patched adder target correctness, n=2 exhaustive**.
theorem patched_n2_read_preserved
theorem patched_n2_read_preserved :
    ∀ a b, a < 4 → b < 4 → ∀ i, i < 2 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched 2)
          (adder_input_F 2 a b) (read_idx i)
        = a.testBit i
**Patched adder read preservation, n=2 exhaustive**.
theorem patched_n3_clears_carries
theorem patched_n3_clears_carries :
    ∀ a b, a < 8 → b < 8 → ∀ i, i < 3 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched 3)
        (adder_input_F 3 a b) (carry_idx i) = false
**Patched adder clears carries, n=3 exhaustive**. 192 cases.
theorem patched_n3_target_correct
theorem patched_n3_target_correct :
    ∀ a b, a < 8 → b < 8 → ∀ i, i < 3 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched 3)
          (adder_input_F 3 a b) (target_idx i)
        = adder_sum_bit_classical a b i
**Patched adder target correctness, n=3 exhaustive**. 192 cases.
theorem patched_n3_read_preserved
theorem patched_n3_read_preserved :
    ∀ a b, a < 8 → b < 8 → ∀ i, i < 3 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched 3)
          (adder_input_F 3 a b) (read_idx i)
        = a.testBit i
**Patched adder read preservation, n=3 exhaustive**. 192 cases.
theorem patched_carry_bool_identity
theorem patched_carry_bool_identity (A B C : Bool) :
    xor (xor (xor (xor (xor (A && B) (B && C)) (A && C)) C)
              ((xor A C) && (xor A B)))
        (xor A C)
      = false
**Boolean identity at the heart of the patch.** Given the carry recurrence `MAJ(A, B, C) = (A∧B) ⊕ (B∧C) ⊕ (A∧C)`, the patched reverse step's effect on `c[i]` reduces to `MAJ ⊕ C ⊕ ((A⊕C) ∧ (A⊕B)) ⊕ (A⊕C)`, which is identically `false` for all Booleans `A`, `B`, `C`. The role of each term in the patched step:
- `MAJ(A, B, C)`: invariant value of `c[i]` (the post-forward carry).
- `C`: invariant value of `c[i-1]` (chained out by `CX(c[i-1], c[i])`).
- `(A⊕C) ∧ (A⊕B)`: `r[i] ∧ t[i]` after final-CX, written into `c[i]` by the reverse CCX.
- `A⊕C`: `r[i]` after final-CX, written into `c[i]` by the patch's CX.
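Because the identity ranges over only eight Boolean assignments, it can be checked by exhaustive case analysis. The following is a minimal, self-contained sketch in core Lean 4 with no project imports; the name `patched_carry_bool_identity_sketch` is hypothetical, and the tactic script is an assumption rather than the proof used in the source file.

```lean
-- Hypothetical restatement of the identity; not the project's own proof.
theorem patched_carry_bool_identity_sketch (A B C : Bool) :
    xor (xor (xor (xor (xor (A && B) (B && C)) (A && C)) C)
              ((xor A C) && (xor A B)))
        (xor A C)
      = false := by
  -- Split each Boolean into `false`/`true`; every one of the 8 goals
  -- then reduces to `false = false` by computation.
  cases A <;> cases B <;> cases C <;> rfl
```

Algebraically, `(A⊕C) ∧ (A⊕B)` expands to `A ⊕ (A∧B) ⊕ (A∧C) ⊕ (B∧C)`, so adding `MAJ` leaves `A`, and `A ⊕ C ⊕ (A⊕C)` cancels to `false`.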
theorempatched_last_reverse_clears_carry_under_invariant
theorem patched_last_reverse_clears_carry_under_invariant
    (i : Nat) (a b : Nat) (f : Nat → Bool)
    (h_c   : f (carry_idx i)       = Adder.carry false (i + 1) a.testBit b.testBit)
    (h_cm1 : f (carry_idx (i - 1)) = Adder.carry false i       a.testBit b.testBit)
    (h_r   : f (read_idx i)        = xor (a.testBit i) (Adder.carry false i a.testBit b.testBit))
    (h_t   : f (target_idx i)      = xor (a.testBit i) (b.testBit i)) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse_patched i) f
        (carry_idx i) = false
**Patched last-reverse step clears `carry_idx i`** for `i ≥ 1`, under the post-forward-final-CX invariant at position `i`.
theorempatched_last_reverse_preserves_non_carry
theorem patched_last_reverse_preserves_non_carry
    (i : Nat) (f : Nat → Bool) (k : Nat) (h_k : k ≠ carry_idx i) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse_patched i) f k
      = f k
**Patched last-reverse step preserves every position outside `carry_idx i`** (frame condition).
theorempatched_interior_reverse_clears_carry_under_invariant
theorem patched_interior_reverse_clears_carry_under_invariant
    (i : Nat) (a b : Nat) (f : Nat → Bool)
    (h_c   : f (carry_idx i)       = Adder.carry false (i + 1) a.testBit b.testBit)
    (h_cm1 : f (carry_idx (i - 1)) = Adder.carry false i       a.testBit b.testBit)
    (h_r   : f (read_idx i)        = xor (a.testBit i) (Adder.carry false i a.testBit b.testBit))
    (h_t   : f (target_idx i)      = xor (a.testBit i) (b.testBit i)) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse_patched i) f
        (carry_idx i) = false
**Patched interior-reverse step clears `carry_idx i`** for `i ≥ 1`, under the post-forward-final-CX invariant at position `i`.
theoremfirst_reverse_post_state_preserves_read_0
theorem first_reverse_post_state_preserves_read_0 (f : Nat → Bool) :
    (gidney_first_bit_reverse_post_state f) (read_idx 0) = f (read_idx 0)
Frame helper: `gidney_first_bit_reverse_post_state` doesn't touch `read_idx 0`.
theorempatched_first_reverse_clears_carry_under_invariant
theorem patched_first_reverse_clears_carry_under_invariant
    (a b : Nat) (f : Nat → Bool)
    (h_r0 : f (read_idx 0)   = a.testBit 0)
    (h_t0 : f (target_idx 0) = xor (a.testBit 0) (b.testBit 0))
    (h_c0 : f (carry_idx 0)  = Adder.carry false 1 a.testBit b.testBit)
    (h_r1 : f (read_idx 1)   = xor (a.testBit 1) (Adder.carry false 1 a.testBit b.testBit))
    (h_t1 : f (target_idx 1) = xor (a.testBit 1) (b.testBit 1)) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse_patched f
        (carry_idx 0) = false
**Patched first-reverse step clears `carry_idx 0`** under the post-forward-final-CX invariant at position 0. The proof uses the existing `gidney_first_bit_reverse_preserves` (Iter 194), which states that the unpatched first-reverse step produces `post(c_0) = a.testBit 0`; the patch's `CX(read_idx 0, carry_idx 0)` then XORs this with `f (read_idx 0) = a.testBit 0`, yielding `false`.
theorempatched_interior_reverse_preserves_outside
theorem patched_interior_reverse_preserves_outside
    (i : Nat) (f : Nat → Bool) (k : Nat)
    (h_k_c   : k ≠ carry_idx i)
    (h_k_ri1 : k ≠ read_idx (i + 1))
    (h_k_ti1 : k ≠ target_idx (i + 1)) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse_patched i) f k = f k
theorempatched_first_reverse_preserves_outside
theorem patched_first_reverse_preserves_outside
    (f : Nat → Bool) (k : Nat)
    (h_k_c0 : k ≠ carry_idx 0)
    (h_k_r1 : k ≠ read_idx 1)
    (h_k_t1 : k ≠ target_idx 1) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse_patched f k = f k
theorempropagation_reverse_patched_preserves_carry_above
theorem propagation_reverse_patched_preserves_carry_above (m : Nat) :
    ∀ (f : Nat → Bool) (j : Nat), j > m →
      Gate.applyNat (gidney_adder_forward_with_propagation_reverse_patched (m + 1)) f
        (carry_idx j) = f (carry_idx j)
Frame for the propagation cascade: `gidney_adder_forward_with_propagation_reverse_patched (m+1)` preserves every `carry_idx j` for `j > m`. Proved by induction on `m` using the per-step frame lemmas above.
theorempatched_first_reverse_clears_carry_minimal
theorem patched_first_reverse_clears_carry_minimal
    (a b : Nat) (f : Nat → Bool)
    (h_r0 : f (read_idx 0)   = a.testBit 0)
    (h_t0 : f (target_idx 0) = xor (a.testBit 0) (b.testBit 0))
    (h_c0 : f (carry_idx 0)  = Adder.carry false 1 a.testBit b.testBit) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse_patched f
        (carry_idx 0) = false
Minimal-hypothesis version of the patched first-reverse step's carry-clearance (drops the `h_r1`, `h_t1` hypotheses that the earlier proof used via `gidney_first_bit_reverse_preserves`). This is the form needed by the cascade-level induction. Proved directly by structural unfolding + the boundary case `Adder.carry false 1 = MAJ(a_0, b_0, false) = a_0 ∧ b_0`.
theorempatched_propagation_reverse_cascade_clears_carries
theorem patched_propagation_reverse_cascade_clears_carries
    (m a b : Nat) :
    ∀ (f : Nat → Bool),
      (∀ j, j ≤ m →
        f (carry_idx j)   = Adder.carry false (j + 1) a.testBit b.testBit
        ∧ f (read_idx j)  = xor (a.testBit j) (Adder.carry false j a.testBit b.testBit)
        ∧ f (target_idx j) = xor (a.testBit j) (b.testBit j)) →
      ∀ i, i ≤ m →
        Gate.applyNat (gidney_adder_forward_with_propagation_reverse_patched (m + 1)) f
          (carry_idx i) = false
**Arbitrary-`m` propagation-cascade carry-clearance.** Under the post-forward-final-CX invariant at positions `0..m`, the patched propagation cascade `gidney_adder_forward_with_propagation_reverse_patched (m+1)` makes every `carry_idx i` (for `i ≤ m`) `false`. Proof: induction on `m`. Base case is the first-reverse step (using the minimal-hypothesis version). Inductive step uses `patched_interior_reverse_clears_carry_under_invariant` for the high-bit case, `propagation_reverse_patched_preserves_carry_above` to preserve the high carry across the rest of the cascade, and the inductive hypothesis for lower bits, with `patched_interior_reverse_preserves_outside` showing the invariant survives the interior step.
theorempatched_full_reverse_cascade_clears_carries
theorem patched_full_reverse_cascade_clears_carries
    (n a b : Nat) (f : Nat → Bool)
    (h_inv : ∀ j, j ≤ n + 1 →
      f (carry_idx j)   = Adder.carry false (j + 1) a.testBit b.testBit
      ∧ f (read_idx j)  = xor (a.testBit j) (Adder.carry false j a.testBit b.testBit)
      ∧ f (target_idx j) = xor (a.testBit j) (b.testBit j)) :
    ∀ i, i ≤ n + 1 →
      Gate.applyNat (gidney_adder_forward_faithful_full_reverse_patched (n + 2)) f
        (carry_idx i) = false
**Arbitrary-`n` full-reverse-cascade carry-clearance.** Under the post-forward-final-CX invariant at positions `0..n+1`, the patched full reverse cascade `gidney_adder_forward_faithful_full_reverse_patched (n+2)` makes every `carry_idx i` (for `i ≤ n+1`) `false`.
theoremgidney_adder_full_faithful_no_measurement_patched_clears_carries
theorem gidney_adder_full_faithful_no_measurement_patched_clears_carries
    (n a b : Nat) (ha : a < 2^(n + 2)) (hb : b < 2^(n + 2)) :
    ∀ i, i ≤ n + 1 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched (n + 2))
        (adder_input_F (n + 2) a b) (carry_idx i) = false
**Arbitrary-`n` patched-adder carry-clearance on `adder_input_F`.** The patched full faithful no-measurement Gidney adder, applied to the standard two-operand input `adder_input_F (n+2) a b`, leaves every carry position `carry_idx i` (for `i ≤ n+1`) cleared to `false`. Proof: combine the Tick C wrappers (forward + final_cx applyNat identities), the existing `Gidney.post_forward_final_cx_invariant_holds` (Iter 188 + Iter 189), and the new `patched_full_reverse_cascade_clears_carries` cascade theorem above.
theorempatched_first_reverse_eq_unpatched_at_non_c0
theorem patched_first_reverse_eq_unpatched_at_non_c0
    (f : Nat → Bool) (k : Nat) (h_k : k ≠ carry_idx 0) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse_patched f k
      = Gate.applyNat gidney_adder_bit_step_faithful_first_reverse f k
theorempatched_interior_reverse_eq_unpatched_at_non_ci
theorem patched_interior_reverse_eq_unpatched_at_non_ci
    (i : Nat) (f : Nat → Bool) (k : Nat) (h_k : k ≠ carry_idx i) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse_patched i) f k
      = Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse i) f k
theorempatched_last_reverse_eq_unpatched_at_non_ci
theorem patched_last_reverse_eq_unpatched_at_non_ci
    (i : Nat) (f : Nat → Bool) (k : Nat) (h_k : k ≠ carry_idx i) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse_patched i) f k
      = Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse i) f k
theoremunpatched_interior_reverse_preserves_outside
theorem unpatched_interior_reverse_preserves_outside
    (i : Nat) (f : Nat → Bool) (k : Nat)
    (h_k_c   : k ≠ carry_idx i)
    (h_k_ri1 : k ≠ read_idx (i + 1))
    (h_k_ti1 : k ≠ target_idx (i + 1)) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse i) f k = f k
theoremunpatched_first_reverse_preserves_outside
theorem unpatched_first_reverse_preserves_outside
    (f : Nat → Bool) (k : Nat)
    (h_k_c0 : k ≠ carry_idx 0) (h_k_r1 : k ≠ read_idx 1) (h_k_t1 : k ≠ target_idx 1) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse f k = f k
theoremunpatched_last_reverse_preserves_non_carry
theorem unpatched_last_reverse_preserves_non_carry
    (i : Nat) (f : Nat → Bool) (k : Nat) (h_k : k ≠ carry_idx i) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse i) f k = f k
theoremupdate_update_comm
theorem update_update_comm (f : Nat → Bool) (a b : Nat) (u w : Bool) (h : a ≠ b) :
    update (update f a u) b w = update (update f b w) a u
Two `update`s at different positions commute.
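To make the commutation concrete, here is a minimal sketch in core Lean 4. It assumes `update` is ordinary pointwise function update on `Nat → Bool`; the local definition `upd` and the theorem name are hypothetical stand-ins, not the project's own declarations.

```lean
/-- Hypothetical stand-in for the project's `update`:
    overwrite position `i` with `v`, leave every other position alone. -/
def upd (f : Nat → Bool) (i : Nat) (v : Bool) (j : Nat) : Bool :=
  if j = i then v else f j

/-- Writes at distinct positions `a ≠ b` can be performed in either order. -/
theorem upd_upd_comm_sketch (f : Nat → Bool) (a b : Nat) (u w : Bool)
    (h : a ≠ b) :
    upd (upd f a u) b w = upd (upd f b w) a u := by
  funext j
  -- Four cases on whether `j` hits `a` and/or `b`;
  -- the case `j = a ∧ j = b` contradicts `h`.
  by_cases ha : j = a <;> by_cases hb : j = b <;> simp_all [upd]
```

The hypothesis `a ≠ b` is essential: at a shared position the later write wins, so the two orders would disagree whenever `u ≠ w`.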
theoremapplyNat_CX_commute_update_disjoint
theorem applyNat_CX_commute_update_disjoint
    (c t : Nat) (f : Nat → Bool) (p : Nat) (v : Bool)
    (h_p_c : p ≠ c) (h_p_t : p ≠ t) :
    Gate.applyNat (Gate.CX c t) (update f p v)
      = update (Gate.applyNat (Gate.CX c t) f) p v
`applyNat (CX c t)` commutes with `update _ p v` when `p` is disjoint from both `c` and `t`.
theoremapplyNat_CCX_commute_update_disjoint
theorem applyNat_CCX_commute_update_disjoint
    (a b c : Nat) (f : Nat → Bool) (p : Nat) (v : Bool)
    (h_p_a : p ≠ a) (h_p_b : p ≠ b) (h_p_c : p ≠ c) :
    Gate.applyNat (Gate.CCX a b c) (update f p v)
      = update (Gate.applyNat (Gate.CCX a b c) f) p v
`applyNat (CCX a b c)` commutes with `update _ p v` when `p` is disjoint from `a`, `b`, and `c`.
theoremapplyNat_seq_commute_update
theorem applyNat_seq_commute_update
    (g₁ g₂ : Gate) (f : Nat → Bool) (p : Nat) (v : Bool)
    (h₁ : ∀ f', Gate.applyNat g₁ (update f' p v) = update (Gate.applyNat g₁ f') p v)
    (h₂ : ∀ f', Gate.applyNat g₂ (update f' p v) = update (Gate.applyNat g₂ f') p v) :
    Gate.applyNat (Gate.seq g₁ g₂) (update f p v)
      = update (Gate.applyNat (Gate.seq g₁ g₂) f) p v
Sequential composition of gates commutes with `update _ p v` when each constituent gate does.
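The composition argument is purely equational. The sketch below, in core Lean 4, models a gate's classical action as a function on bit assignments and abstracts `update _ p v` as an arbitrary map `u`; the names `St`, `Act`, and `seq_commute_sketch` are hypothetical, and it is an assumption here that `Gate.seq g₁ g₂` applies `g₁` first.

```lean
/-- Classical bit assignments and actions on them (hypothetical names). -/
abbrev St := Nat → Bool
abbrev Act := St → St

/-- If `u` commutes with `g₁` and with `g₂`, it commutes with
    "run `g₁`, then `g₂`". -/
theorem seq_commute_sketch (g₁ g₂ u : Act)
    (h₁ : ∀ f, g₁ (u f) = u (g₁ f))
    (h₂ : ∀ f, g₂ (u f) = u (g₂ f)) (f : St) :
    g₂ (g₁ (u f)) = u (g₂ (g₁ f)) := by
  -- Push `u` outward past `g₁`, then past `g₂`.
  rw [h₁, h₂]
```

Quantifying `h₁` and `h₂` over all states matters: the second rewrite is applied at the intermediate state `g₁ f`, not at the original `f`.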
theoremunpatched_first_reverse_commute_update_at_c_above
theorem unpatched_first_reverse_commute_update_at_c_above
    (f : Nat → Bool) (j : Nat) (hj : j > 0) (v : Bool) :
    Gate.applyNat gidney_adder_bit_step_faithful_first_reverse (update f (carry_idx j) v)
      = update (Gate.applyNat gidney_adder_bit_step_faithful_first_reverse f) (carry_idx j) v
Unpatched first-reverse step commutes with update at `c[j]` (`j ≥ 1`).
theoremunpatched_interior_reverse_commute_update_at_c_above
theorem unpatched_interior_reverse_commute_update_at_c_above
    (i : Nat) (hi : 0 < i) (f : Nat → Bool) (j : Nat) (hj : j > i) (v : Bool) :
    Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse i)
      (update f (carry_idx j) v)
      = update (Gate.applyNat (gidney_adder_bit_step_faithful_interior_reverse i) f)
          (carry_idx j) v
Unpatched interior-reverse step commutes with update at `c[j]` (`j > i`).
theoremunpatched_last_reverse_commute_update_at_c_above
theorem unpatched_last_reverse_commute_update_at_c_above
    (i : Nat) (hi : 0 < i) (f : Nat → Bool) (j : Nat) (hj : j > i) (v : Bool) :
    Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse i) (update f (carry_idx j) v)
      = update (Gate.applyNat (gidney_adder_bit_step_faithful_last_reverse i) f) (carry_idx j) v
Unpatched last-reverse step commutes with update at `c[j]` (`j > i`).
theoremunpatched_propagation_reverse_commute_update_at_c_above
theorem unpatched_propagation_reverse_commute_update_at_c_above (m : Nat) :
    ∀ (g : Nat → Bool) (v : Bool) (j : Nat), j > m →
      Gate.applyNat (gidney_adder_forward_with_propagation_reverse (m + 1))
        (update g (carry_idx j) v)
        = update (Gate.applyNat (gidney_adder_forward_with_propagation_reverse (m + 1)) g)
            (carry_idx j) v
Unpatched propagation cascade commutes with update at `c[j]` (`j > m`).

FormalRV.Arithmetic.RippleCarryAdder.RippleCarryAdderQubitCounts

FormalRV/Arithmetic/RippleCarryAdder/RippleCarryAdderQubitCounts.lean
example(example)
example : adder_n_qubits 4 = 14
Smoke: 4-bit adder uses 14 qubits (matches Fig. 4(a)).
example(example)
example : read_idx 0 = 0 ∧ target_idx 0 = 1 ∧ carry_idx 0 = 2
Smoke: indexing is monotone within a bit position.
example(example)
example : read_idx 1 = 3 ∧ target_idx 1 = 4 ∧ carry_idx 1 = 5
Smoke: indexing is monotone across bit positions.
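The smoke checks above are consistent with a three-qubits-per-bit layout. The following core Lean 4 sketch assumes `read_idx i = 3i`, `target_idx i = 3i + 1`, `carry_idx i = 3i + 2`, and `adder_n_qubits n = 3n + 2`; these formulas are inferred from the stated values, and the camel-case names are hypothetical stand-ins for the project's definitions.

```lean
-- Assumed layout, inferred from the smoke checks (hypothetical names).
def readIdx (i : Nat) : Nat := 3 * i
def targetIdx (i : Nat) : Nat := 3 * i + 1
def carryIdx (i : Nat) : Nat := 3 * i + 2
def adderQubits (n : Nat) : Nat := 3 * n + 2

example : adderQubits 4 = 14 := rfl
example : readIdx 0 = 0 ∧ targetIdx 0 = 1 ∧ carryIdx 0 = 2 := ⟨rfl, rfl, rfl⟩
example : readIdx 1 = 3 ∧ targetIdx 1 = 4 ∧ carryIdx 1 = 5 := ⟨rfl, rfl, rfl⟩

-- Within one bit position the three indices are strictly increasing.
example (i : Nat) : readIdx i < targetIdx i ∧ targetIdx i < carryIdx i := by
  unfold readIdx targetIdx carryIdx
  constructor <;> omega
```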
theoremtcount_ripple_carry_unit_stub
theorem tcount_ripple_carry_unit_stub (i : Nat) :
    tcount (ripple_carry_unit_stub i) = 7
T-count of one stub unit = 7 (single CCX inside MAJ).
theoremtcount_gidney_adder_bit_step
theorem tcount_gidney_adder_bit_step (i : Nat) :
    tcount (gidney_adder_bit_step i) = 7
Each Gidney-adder forward step is exactly 1 Toffoli = 7 T-gates. Proof: CCX contributes 7 T; CX (if present) contributes 0.
example(example)
example : tcount (gidney_adder_bit_step 0) = 7
Concrete smoke checks: tcount per step is 7 for any specific i.
example(example)
example : tcount (gidney_adder_bit_step 5) = 7
example(example)
example : tcount (gidney_adder_bit_step 100) = 7
theoremgcount_gidney_adder_bit_step
theorem gcount_gidney_adder_bit_step (i : Nat) :
    gcount (gidney_adder_bit_step i) = if i = 0 then 1 else 2
Gate count of one bit step: 1 at `i = 0` (the Toffoli alone) and 2 for `i > 0` (the Toffoli plus one CX, since each CX counts as 1 gate in `gcount`). Derived from the inner gate sequence.
theoremtcount_gidney_adder_forward
theorem tcount_gidney_adder_forward (n : Nat) :
    tcount (gidney_adder_forward n) = 7 * n
T-count of the full n-bit Gidney forward cascade: 7n (1 Toffoli × 7 T per bit × n bits). **First gate-derived recovery of qianxu Eq. E3's "q_A Toffoli gates" for the q_A-bit adder** — the adder-side analog of `tcount_prefix_and_cascade` for the lookup.
example(example)
example : tcount (gidney_adder_forward 4) = 28
Concrete: 4-bit Gidney forward cascade has 28 T-gates = 4 Toffolis.
example(example)
example : tcount (gidney_adder_forward 33) = 7 * 33
A 33-bit Gidney forward cascade (qianxu's RSA-2048 adder block, q_A=33, Eq. E3) has 33 Toffolis = 231 T-gates.
theoremtcount_gidney_adder_uncompute
theorem tcount_gidney_adder_uncompute (n : Nat) :
    tcount (gidney_adder_uncompute n) = 7 * n
T-count of the reverse pass: also `7n` (same Toffolis, different order).
theoremtcount_gidney_final_cx_cascade
theorem tcount_gidney_final_cx_cascade (n : Nat) :
    tcount (gidney_final_cx_cascade n) = 0
The final CX cascade is tcount-zero (only CXs, no Toffolis).
theoremtcount_gidney_adder_full
theorem tcount_gidney_adder_full (n : Nat) :
    tcount (gidney_adder_full n) = 14 * n
**Total T-count of the full n-bit Gidney adder (no-measurement upper bound): `14 n`**. Composition: forward (7n) + reverse (7n) + final CX (0). Under measurement-based uncomputation, the reverse contributes 0; that is the optimization qianxu's "q_A Toffoli gates" claim relies on. The 14n here is the gate-level no-optimization bound; the 7n claim requires the measurement trick.
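The `7n` and `14n` counts follow from a simple additive cost model. The sketch below, in core Lean 4, uses a toy gate type `G` with hypothetical names throughout; it is not the project's `Gate` IR, and the bit step is simplified to one Toffoli followed by one CX. It assumes a recent Lean 4 toolchain in which `induction` presents the successor case as `k + 1`.

```lean
/-- Toy gate IR (hypothetical; the project's `Gate` type differs). -/
inductive G where
  | skip : G
  | cx : G
  | ccx : G
  | seq : G → G → G

/-- T-count convention from the text: 7 per Toffoli, 0 per CX. -/
def G.tcount : G → Nat
  | .skip => 0
  | .cx => 0
  | .ccx => 7
  | .seq a b => a.tcount + b.tcount

/-- Simplified bit step: one Toffoli followed by one CX. -/
def step : G := .seq .ccx .cx

/-- `n` bit steps in sequence. -/
def cascade : Nat → G
  | 0 => .skip
  | n + 1 => .seq (cascade n) step

theorem tcount_cascade (n : Nat) : (cascade n).tcount = 7 * n := by
  induction n with
  | zero => rfl
  | succ k ih =>
    -- Both facts hold by unfolding definitions.
    have hstep : step.tcount = 7 := rfl
    have hunfold :
        (cascade (k + 1)).tcount = (cascade k).tcount + step.tcount := rfl
    omega

/-- Forward pass plus an explicit reverse pass of equal cost. -/
theorem tcount_full (n : Nat) :
    (G.seq (cascade n) (cascade n)).tcount = 14 * n := by
  have h := tcount_cascade n
  have hseq : (G.seq (cascade n) (cascade n)).tcount
      = (cascade n).tcount + (cascade n).tcount := rfl
  omega
```

In this toy model the reverse pass is represented by a second copy of the cascade, which has the same T-count because reversing the gate order does not change which gates appear.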
example(example)
example : tcount (gidney_adder_full 4) = 56
Concrete: 4-bit full Gidney adder = 56 T (= 8 Toffolis × 7T).
theoremgidney_adder_forward_tcount_matches_PaperClaims
theorem gidney_adder_forward_tcount_matches_PaperClaims (n : Nat) :
    tcount (gidney_adder_forward n) = 7 * gidney_total_toffolis_n_bit_adder n
**Bridge theorem**: the T-count of the Lean-encoded Gidney forward cascade equals `7 ·` the paper-claim Toffoli count. This connects the gate-derived value in `RippleCarryAdder.lean` to the data def in `PaperClaims.lean`, formally certifying that the latter is no longer paper-stated but Lean-gate-sequence-derived.
example(example)
example :
    tcount (gidney_adder_forward 33) = 7 * gidney_total_toffolis_n_bit_adder 33
Concrete bridge check at n=33 (RSA-2048 q_A=33 case): 33 Toffolis = 231 T-gates, both sides agree.
theoremgidney_no_measurement_vs_measurement_gap
theorem gidney_no_measurement_vs_measurement_gap (n : Nat) :
    tcount (gidney_adder_full n)
      = 2 * (7 * gidney_total_toffolis_n_bit_adder n)
**Review finding theorem**: the no-measurement gate-level T-count of the n-bit Gidney adder is exactly `2 ·` the paper's measurement-based claim. This is the formal statement of the structural Gidney-optimization assumption.
example(example)
example :
    tcount (gidney_adder_full 33) = 462
    ∧ 7 * gidney_total_toffolis_n_bit_adder 33 = 231
Concrete: at n=33 (RSA-2048 adder block), no-measurement bound is 14 × 33 = 462 T-gates, vs paper's 7 × 33 = 231 T-gates.
theoremgidney_adder_full_with_measurement_uncompute_tcount_eq
theorem gidney_adder_full_with_measurement_uncompute_tcount_eq (n : Nat) :
    gidney_adder_full_with_measurement_uncompute_tcount n = 7 * n
**Review-gap closure theorem**: the n-bit Gidney adder T-count with measurement-based uncomputation equals `7n`, matching qianxu Eq. E3's claim. This is the formal derivation of the previously paper-stated count from the Lean-encoded Gidney-AND primitive.
theoremgidney_full_vs_measurement_uncompute_factor
theorem gidney_full_vs_measurement_uncompute_factor (n : Nat) :
    tcount (gidney_adder_full n)
      = 2 * gidney_adder_full_with_measurement_uncompute_tcount n
**The review-gap factor of 2** is now explicit: the gate-explicit 14n bound (`tcount_gidney_adder_full n`) is exactly `2 ×` the measurement-uncomputation 7n bound. Both are formally derived in Lean; the difference is the Gidney trick.
example(example)
example :
    gidney_adder_full_with_measurement_uncompute_tcount 33 = 231
    ∧ tcount (gidney_adder_full 33) = 462
Concrete RSA-2048 (q_A=33): with Gidney measurement trick, T-count = 231 (paper figure); without, 462 (Lean explicit-reverse).
theoremgidney_adder_bit_step_0_correct
theorem gidney_adder_bit_step_0_correct (dim : Nat) (f : Nat → Bool)
    (h0 : read_idx 0 < dim) (h1 : target_idx 0 < dim) (h2 : carry_idx 0 < dim) :
    uc_eval (Gate.toUCom dim (gidney_adder_bit_step 0)) * f_to_vec dim f
      = f_to_vec dim
          (update f (carry_idx 0)
            (xor (f (carry_idx 0)) (f (read_idx 0) && f (target_idx 0))))
**`gidney_adder_bit_step 0` correctness**: on a classical basis state, the i=0 step XORs `(read[0] ∧ target[0])` into `carry[0]`. This is the Toffoli action: `(a, b, c) ↦ (a, b, c ⊕ (a ∧ b))`.
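The Toffoli action can be illustrated directly on bit assignments, without the matrix semantics. This is a minimal core Lean 4 sketch; `ccxAct` and `demoInput` are hypothetical names, and the positions 0, 1, 2 assume the layout `read_idx 0 = 0`, `target_idx 0 = 1`, `carry_idx 0 = 2` stated in the smoke checks.

```lean
/-- Classical action of a Toffoli with controls `a`, `b` and target `c`
    on a bit assignment (hypothetical model, not the project's `applyNat`). -/
def ccxAct (a b c : Nat) (f : Nat → Bool) (k : Nat) : Bool :=
  if k = c then xor (f c) (f a && f b) else f k

/-- Sample input: read[0] = target[0] = true, every other bit false. -/
def demoInput : Nat → Bool := fun k => k == 0 || k == 1

-- The carry bit receives false ⊕ (true ∧ true).
example : ccxAct 0 1 2 demoInput 2 = true := by decide
-- Control bits are unchanged.
example : ccxAct 0 1 2 demoInput 0 = true := by decide
-- Positions outside the gate are unchanged.
example : ccxAct 0 1 2 demoInput 5 = false := by decide
```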
theoremtcount_gidney_adder_bit_step_faithful_interior
theorem tcount_gidney_adder_bit_step_faithful_interior (i : Nat) :
    tcount (gidney_adder_bit_step_faithful_interior i) = 7
T-count of the faithful interior bit-step: still 7 (1 Toffoli + 3 CXs, with CXs contributing 0 T). Matches qianxu's "q_A Toffoli gates per q_A-bit adder" claim.
theoremgcount_gidney_adder_bit_step_faithful_interior
theorem gcount_gidney_adder_bit_step_faithful_interior (i : Nat) :
    gcount (gidney_adder_bit_step_faithful_interior i) = 4
Gate count of the faithful interior bit-step: **4 gates** (vs the simplified encoding's 2). The 2 extra CXs are the propagation CXs the Iter 19 encoding was missing.
example(example)
example : tcount (gidney_adder_bit_step_faithful_interior 3) = 7
Concrete: at i=3 (interior bit), the faithful encoding has tcount 7 and gcount 4.
example(example)
example : gcount (gidney_adder_bit_step_faithful_interior 3) = 4
theoremtcount_gidney_adder_forward_faithful_interior
theorem tcount_gidney_adder_forward_faithful_interior (n : Nat) :
    tcount (gidney_adder_forward_faithful_interior n) = 7 * n
**T-count of the faithful interior cascade is `7n`**, matching the paper-claimed q_A Toffolis per q_A-bit adder. Same headline count as the Iter 20 simplified cascade: the propagation CXs are tcount-zero, so they change only the gate count, not the T-count.
theoremgcount_gidney_adder_forward_faithful_interior
theorem gcount_gidney_adder_forward_faithful_interior (n : Nat) :
    gcount (gidney_adder_forward_faithful_interior n) = 4 * n
**Gate count is `4n`** (vs the Iter 20 simplified cascade's `2n`). This is the **honest gate-count comparison** between the Lean-faithful encoding and qianxu Fig. 4(a).
example(example)
example : tcount (gidney_adder_forward_faithful_interior 33) = 231
Concrete: at n=33 (RSA-2048 adder block), faithful interior cascade has 231 T-gates (33 Toffolis × 7) and 132 total gates (33 × 4).
example(example)
example : gcount (gidney_adder_forward_faithful_interior 33) = 132
theoremfaithful_and_simplified_tcount_agree
theorem faithful_and_simplified_tcount_agree (n : Nat) :
    tcount (gidney_adder_forward_faithful_interior n)
      = tcount (gidney_adder_forward n)
The faithful cascade matches the simplified cascade's T-count (both 7n) but NOT its gate count (simplified: ~2n; faithful: 4n). This formalizes the review narrative: paper's "q_A Toffolis" count is preserved by either encoding, but only the faithful encoding correctly implements the carry.
theoremgidney_adder_bit_step_faithful_interior_correct
theorem gidney_adder_bit_step_faithful_interior_correct
    (dim i : Nat) (f : Nat → Bool)
    (hri : read_idx i < dim) (hti : target_idx i < dim)
    (hci : carry_idx i < dim) (hcim1 : carry_idx (i - 1) < dim)
    (hri1 : read_idx (i + 1) < dim) (hti1 : target_idx (i + 1) < dim)
    (h_rt : read_idx i ≠ target_idx i)
    (h_rc : read_idx i ≠ carry_idx i)
    (h_tc : target_idx i ≠ carry_idx i)
    (h_cc : carry_idx (i - 1) ≠ carry_idx i)
    (h_ci_ri1 : carry_idx i ≠ read_idx (i + 1))
    (h_ci_ti1 : carry_idx i ≠ target_idx (i + 1)) :
    uc_eval (Gate.toUCom dim (gidney_adder_bit_step_faithful_interior i))
**Faithful bit-step correctness on classical basis states** (Iter 57). For `i ≥ 1` interior bits, the four-gate sequence acts on `f_to_vec dim f` to produce the chained-update state `gidney_bit_step_faithful_post_state i f`. Proved by three applications of the reusable `gate_seq_acts_on_basis` bridge + the per-gate primitives `gate_ccx_acts_on_basis` and `gate_cx_acts_on_basis`.
theoremtcount_gidney_adder_bit_step_faithful_interior_reverse
theorem tcount_gidney_adder_bit_step_faithful_interior_reverse (i : Nat) :
    tcount (gidney_adder_bit_step_faithful_interior_reverse i) = 7
T-count of interior gate-reverse: 7 (matches forward).
theoremgcount_gidney_adder_bit_step_faithful_interior_reverse
theorem gcount_gidney_adder_bit_step_faithful_interior_reverse (i : Nat) :
    gcount (gidney_adder_bit_step_faithful_interior_reverse i) = 4
Gate-count of interior gate-reverse: 4 (matches forward).
theoremgidney_adder_bit_step_faithful_interior_fwd_rev_eq_one
theorem gidney_adder_bit_step_faithful_interior_fwd_rev_eq_one
    (dim i : Nat)
    (hri : read_idx i < dim) (hti : target_idx i < dim)
    (hci : carry_idx i < dim) (hcim1 : carry_idx (i - 1) < dim)
    (hri1 : read_idx (i + 1) < dim) (hti1 : target_idx (i + 1) < dim)
    (h_rt : read_idx i ≠ target_idx i)
    (h_rc : read_idx i ≠ carry_idx i)
    (h_tc : target_idx i ≠ carry_idx i)
    (h_cc : carry_idx (i - 1) ≠ carry_idx i)
    (h_ci_ri1 : carry_idx i ≠ read_idx (i + 1))
    (h_ci_ti1 : carry_idx i ≠ target_idx (i + 1)) :
    uc_eval (Gate.toUCom dim
**Interior forward · reverse = identity** at matrix level. The 3 CXs cancel pairwise (CNOT involution × 3) and the CCX-pair cancels. Mirrors Iter 81's first-bit pattern but with one more gate (4 gates → 4 involution pairs).
theorembit_disjointness_of_dim_bound
theorem bit_disjointness_of_dim_bound (dim i : Nat)
    (h1 : 1 ≤ i) (hd : 3 * i + 5 ≤ dim) :
    BitDisjointness dim i
**Parametric BitDisjointness derivation (Iter 61)**: all 12 disjointness conditions follow from a single dim-size bound `3*i + 5 ≤ dim` (covering the highest qubit index `target_idx (i+1) = 3i+4`), plus `1 ≤ i` (so `carry_idx (i-1)` is a distinct qubit). Reduces the review interface from 12 manual conditions to a single `omega`-style bound, per the new CLAUDE.md hard rule on reusable framework + readability.
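A few of these conditions can be discharged from the two bounds by linear arithmetic alone. The core Lean 4 sketch below writes the indices out explicitly, assuming the layout `read_idx i = 3i`, `target_idx i = 3i + 1`, `carry_idx i = 3i + 2` (an inference from the values quoted in this file, not the project's definitions), and checks four representative conditions rather than the full `BitDisjointness` structure.

```lean
-- Assumed layout: read_idx i = 3*i, target_idx i = 3*i + 1, carry_idx i = 3*i + 2.
-- Conjuncts, in order:
--   target_idx (i+1) < dim
--   carry_idx (i-1) < dim
--   carry_idx (i-1) ≠ carry_idx i
--   carry_idx i ≠ read_idx (i+1)
example (dim i : Nat) (h1 : 1 ≤ i) (hd : 3 * i + 5 ≤ dim) :
    3 * (i + 1) + 1 < dim
    ∧ 3 * (i - 1) + 2 < dim
    ∧ 3 * (i - 1) + 2 ≠ 3 * i + 2
    ∧ 3 * i + 2 ≠ 3 * (i + 1) := by
  refine ⟨?_, ?_, ?_, ?_⟩ <;> omega
```

The third conjunct is where `1 ≤ i` is needed: with truncated subtraction on `Nat`, `i = 0` would give `i - 1 = 0` and the two carry indices would coincide.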
theorembit_disjointness_for_cascade
theorem bit_disjointness_for_cascade (dim n : Nat) (h : 3 * n + 5 ≤ dim) :
    ∀ i, 1 ≤ i → i ≤ n → BitDisjointness dim i
**Cascade-level dim bound** suffices to derive BitDisjointness at every i in 1..n: a single `3*n + 5 ≤ dim` assumption covers all interior bits. Reduces the cascade-correctness interface to ONE quantifier-free hypothesis.
example(example)
example : 3 * 33 + 5 ≤ 104
Concrete: at RSA-2048 (q_A = 33), dim ≥ 3·33 + 5 = 104 suffices. Note that `adder_n_qubits 33 = 3·33 + 2 = 101`; the +3 over adder_n_qubits comes from the "next bit" propagation indices used by the interior bit-step.
theoremtcount_gidney_adder_bit_step_faithful_first
theorem tcount_gidney_adder_bit_step_faithful_first :
    tcount gidney_adder_bit_step_faithful_first = 7
T-count of the first-bit step: 7 (1 Toffoli; 2 CXs are tcount-0).
theoremgcount_gidney_adder_bit_step_faithful_first
theorem gcount_gidney_adder_bit_step_faithful_first :
    gcount gidney_adder_bit_step_faithful_first = 3
Gate count of the first-bit step: 3 (vs 4 for interior bits; no chain CX).
theoremgidney_adder_bit_step_faithful_first_correct
theorem gidney_adder_bit_step_faithful_first_correct
    (dim : Nat) (f : Nat → Bool)
    (hr0 : read_idx 0 < dim) (ht0 : target_idx 0 < dim)
    (hc0 : carry_idx 0 < dim) (hr1 : read_idx 1 < dim)
    (ht1 : target_idx 1 < dim)
    (h_rt : read_idx 0 ≠ target_idx 0)
    (h_rc : read_idx 0 ≠ carry_idx 0)
    (h_tc : target_idx 0 ≠ carry_idx 0)
    (h_c_r1 : carry_idx 0 ≠ read_idx 1)
    (h_c_t1 : carry_idx 0 ≠ target_idx 1) :
    uc_eval (Gate.toUCom dim gidney_adder_bit_step_faithful_first)
      * f_to_vec dim f
**First-bit correctness on classical basis states** (Iter 65). Proves `gidney_adder_bit_step_faithful_first` acts on `f_to_vec dim f` to produce `f_to_vec dim (gidney_first_bit_post_state f)`. Proof via two applications of `gate_seq_acts_on_basis` + the per-gate primitives.
theoremfirst_bit_disjointness_of_dim_bound
theorem first_bit_disjointness_of_dim_bound (dim : Nat) (h : 5 ≤ dim) :
    read_idx 0 < dim ∧ target_idx 0 < dim ∧ carry_idx 0 < dim
    ∧ read_idx 1 < dim ∧ target_idx 1 < dim
    ∧ read_idx 0 ≠ target_idx 0 ∧ read_idx 0 ≠ carry_idx 0
    ∧ target_idx 0 ≠ carry_idx 0
    ∧ carry_idx 0 ≠ read_idx 1 ∧ carry_idx 0 ≠ target_idx 1
The first-bit disjointness conditions are all decidable from the indexing (read_idx 0 = 0, target_idx 0 = 1, carry_idx 0 = 2, read_idx 1 = 3, target_idx 1 = 4). At dim ≥ 5 all 10 conditions hold.
theoremtcount_gidney_adder_bit_step_faithful_first_reverse
theorem tcount_gidney_adder_bit_step_faithful_first_reverse :
    tcount gidney_adder_bit_step_faithful_first_reverse = 7
T-count of the first-bit gate-reverse: 7 (matches forward).
theoremgcount_gidney_adder_bit_step_faithful_first_reverse
theorem gcount_gidney_adder_bit_step_faithful_first_reverse :
    gcount gidney_adder_bit_step_faithful_first_reverse = 3
Gate-count of the first-bit gate-reverse: 3 (matches forward).
theoremgidney_adder_bit_step_faithful_first_fwd_rev_eq_one
theorem gidney_adder_bit_step_faithful_first_fwd_rev_eq_one
    (dim : Nat)
    (hr0 : read_idx 0 < dim) (ht0 : target_idx 0 < dim)
    (hc0 : carry_idx 0 < dim) (hr1 : read_idx 1 < dim) (ht1 : target_idx 1 < dim)
    (h_rt : read_idx 0 ≠ target_idx 0)
    (h_rc : read_idx 0 ≠ carry_idx 0)
    (h_tc : target_idx 0 ≠ carry_idx 0)
    (h_c_r1 : carry_idx 0 ≠ read_idx 1)
    (h_c_t1 : carry_idx 0 ≠ target_idx 1) :
    uc_eval (Gate.toUCom dim
              (Gate.seq gidney_adder_bit_step_faithful_first
                        gidney_adder_bit_step_faithful_first_reverse))
**First-bit forward · reverse = identity** at matrix level. The two propagation CXs cancel pairwise (CNOT involution), and the CCX-pair cancels (CCX involution). Mirrors Iter 69's `..._faithful_last_fwd_rev_id` pattern but for the first-bit step (3 gates instead of 2).
theoremtcount_gidney_adder_bit_step_faithful_last
theorem tcount_gidney_adder_bit_step_faithful_last (i : Nat) :
    tcount (gidney_adder_bit_step_faithful_last i) = 7
T-count of the last-bit step: 7 (1 Toffoli; CX is tcount-0).
theoremgcount_gidney_adder_bit_step_faithful_last
theorem gcount_gidney_adder_bit_step_faithful_last (i : Nat) :
    gcount (gidney_adder_bit_step_faithful_last i) = 2
Gate count of the last-bit step: **2** (vs interior's 4, first-bit's 3). The last bit drops both propagation CXs.
theoremgidney_adder_bit_step_faithful_last_correct
theorem gidney_adder_bit_step_faithful_last_correct
    (dim i : Nat) (f : Nat → Bool)
    (hri : read_idx i < dim) (hti : target_idx i < dim)
    (hci : carry_idx i < dim) (hcim1 : carry_idx (i - 1) < dim)
    (h_rt : read_idx i ≠ target_idx i)
    (h_rc : read_idx i ≠ carry_idx i)
    (h_tc : target_idx i ≠ carry_idx i)
    (h_cc : carry_idx (i - 1) ≠ carry_idx i) :
    uc_eval (Gate.toUCom dim (gidney_adder_bit_step_faithful_last i))
      * f_to_vec dim f
      = f_to_vec dim (gidney_last_bit_post_state i f)
**Last-bit correctness on classical basis states** (Iter 67).
theoremgidney_adder_bit_step_faithful_last_fwd_rev_id
theorem gidney_adder_bit_step_faithful_last_fwd_rev_id
    (dim i : Nat) (f : Nat → Bool)
    (hri : read_idx i < dim) (hti : target_idx i < dim)
    (hci : carry_idx i < dim) (hcim1 : carry_idx (i - 1) < dim)
    (h_rt : read_idx i ≠ target_idx i)
    (h_rc : read_idx i ≠ carry_idx i)
    (h_tc : target_idx i ≠ carry_idx i)
    (h_cc : carry_idx (i - 1) ≠ carry_idx i) :
    uc_eval (Gate.toUCom dim
              (Gate.seq (gidney_adder_bit_step_faithful_last i)
                        (gidney_adder_bit_step_faithful_last_reverse i)))
      * f_to_vec dim f
**Forward · reverse (last-bit) = identity on basis states**. The two CX gates cancel (CX involution); the two CCX gates cancel (CCX involution). Composed correctly via the reusable framework.
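At the level of individual bits, both involutions reduce to the fact that XORing the same value twice is the identity. The following core Lean 4 sketch states this for the target bit of each gate; the theorem names are hypothetical, and the statements concern only the classical bit-level action, not the matrix-level identity proved in the source.

```lean
/-- Applying CX twice restores the target bit `t` (control `c` is unchanged). -/
theorem cx_twice_sketch (c t : Bool) : xor (xor t c) c = t := by
  cases c <;> cases t <;> rfl

/-- Applying CCX twice restores the target bit `c` (controls `a`, `b` unchanged). -/
theorem ccx_twice_sketch (a b c : Bool) :
    xor (xor c (a && b)) (a && b) = c := by
  cases a <;> cases b <;> cases c <;> rfl
```

The cancellation relies on the controls being unchanged between the two applications, which is why the reverse step applies the same gates in the opposite order.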
theoremgidney_adder_forward_faithful_interior_correct
theorem gidney_adder_forward_faithful_interior_correct
    (dim : Nat) (hdim : 0 < dim) (f : Nat → Bool) :
    ∀ n, (∀ i, 1 ≤ i → i ≤ n → BitDisjointness dim i) →
    uc_eval (Gate.toUCom dim (gidney_adder_forward_faithful_interior n))
      * f_to_vec dim f
      = f_to_vec dim (gidney_cascade_post_state n f)
  | 0    , _ =>
**Faithful n-bit cascade correctness**: given disjointness on each bit position 1..n, the cascade acts on `f_to_vec dim f` to produce `f_to_vec dim (gidney_cascade_post_state n f)`. Proof by induction on n. **First Verified-tier theorem for the n-bit Gidney adder forward cascade.**
theoremgidney_adder_bit_step_succ_simplified
theorem gidney_adder_bit_step_succ_simplified (dim i : Nat) (f : Nat → Bool)
    (hri : read_idx (i+1) < dim) (hti : target_idx (i+1) < dim)
    (hci : carry_idx (i+1) < dim) (hci' : carry_idx i < dim)
    (hrt : read_idx (i+1) ≠ target_idx (i+1))
    (hrc : read_idx (i+1) ≠ carry_idx (i+1))
    (htc : target_idx (i+1) ≠ carry_idx (i+1))
    (hcc : carry_idx i ≠ carry_idx (i+1)) :
    let f'
Action of the simplified `gidney_adder_bit_step (i+1)` on basis states: XORs `(read[i+1] ∧ target[i+1]) ⊕ carry[i]` into `carry[i+1]`. **This is NOT Gidney's actual carry** (see review-gap note above); proving it here makes the discrepancy explicit.
theoremtcount_gidney_adder_bit_step_reverse
theorem tcount_gidney_adder_bit_step_reverse (i : Nat) :
    tcount (gidney_adder_bit_step_reverse i) = 7
T-count of the gate-reverse: same 7 as forward (same gates, swapped order).
theoremgcount_gidney_adder_bit_step_reverse
theorem gcount_gidney_adder_bit_step_reverse (i : Nat) :
    gcount (gidney_adder_bit_step_reverse i) = (if i = 0 then 1 else 2)
Gate-count of the gate-reverse: 1 at i=0, 2 at i>0 (matches forward).
theoremgidney_adder_bit_step_fwd_rev_eq_one
theorem gidney_adder_bit_step_fwd_rev_eq_one (dim i : Nat)
    (hri : read_idx i < dim) (hti : target_idx i < dim)
    (hci : carry_idx i < dim)
    (h_rt : read_idx i ≠ target_idx i)
    (h_rc : read_idx i ≠ carry_idx i)
    (h_tc : target_idx i ≠ carry_idx i)
    (hcim1 : i ≠ 0 → carry_idx (i - 1) < dim)
    (h_cc : i ≠ 0 → carry_idx (i - 1) ≠ carry_idx i) :
    uc_eval (Gate.toUCom dim
              (Gate.seq (gidney_adder_bit_step i)
                        (gidney_adder_bit_step_reverse i)))
      = (1 : Matrix (Fin (2^dim)) (Fin (2^dim)) ℂ)
**Matrix-level per-bit involution**: `bit_step i · bit_step_reverse i = 1`. Proven for all `i` (both branches) under the standard bit-disjointness hypotheses. The i = 0 branch needs `read_idx 0 = 0, target_idx 0 = 1, carry_idx 0 = 2` (auto-derived from the `read_idx`/`target_idx`/`carry_idx` defs and the disjointness hypotheses); the i > 0 branch mirrors `gidney_adder_bit_step_faithful_last_fwd_rev_id` (Iter 69) structurally. **This is the per-bit collapse used in Iter 74's cascade induction**: `uc_eval (cascade (n+1) · uncompute (n+1))` re-associates to `uc_eval (cascade n) · uc_eval (bit_step n · bit_step_reverse n) · uc_eval (uncompute n)`, and the middle factor collapses to 1 by this lemma.
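The guard `i ≠ 0` on the previous-carry hypotheses can be motivated with a small, self-contained sketch. The definitions `readIdx`, `targetIdx`, and `carryIdx` below are local stand-ins, not the library's functions; the `3i`, `3i+1`, `3i+2` layout is assumed from the index values quoted in these docstrings.

```lean
namespace IndexSketch

-- Local stand-ins for the library's index functions (assumed layout).
def readIdx   (i : Nat) : Nat := 3 * i
def targetIdx (i : Nat) : Nat := 3 * i + 1
def carryIdx  (i : Nat) : Nat := 3 * i + 2

-- The three qubits of one bit position are pairwise distinct for every i.
example (i : Nat) :
    readIdx i ≠ targetIdx i ∧ readIdx i ≠ carryIdx i
      ∧ targetIdx i ≠ carryIdx i := by
  unfold readIdx targetIdx carryIdx
  refine ⟨?_, ?_, ?_⟩ <;> omega

-- For i ≠ 0 the previous carry is a different qubit from the current one.
example (i : Nat) (hi : i ≠ 0) : carryIdx (i - 1) ≠ carryIdx i := by
  unfold carryIdx
  omega

-- At i = 0, truncated subtraction gives 0 - 1 = 0, so the two coincide.
-- This is why the previous-carry hypotheses must be guarded by i ≠ 0.
example : carryIdx (0 - 1) = carryIdx 0 := rfl

end IndexSketch
```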
theoremtcount_gidney_adder_uncompute_proper
theorem tcount_gidney_adder_uncompute_proper (n : Nat) :
    tcount (gidney_adder_uncompute_proper n) = 7 * n
T-count of the proper reverse: 7n (same gates, reversed).
theoremgidney_adder_forward_uncompute_proper_eq_one
theorem gidney_adder_forward_uncompute_proper_eq_one
    (dim : Nat) (hdim : 0 < dim) :
    ∀ n, 3 * n ≤ dim →
    uc_eval (Gate.toUCom dim
              (Gate.seq (gidney_adder_forward n)
                        (gidney_adder_uncompute_proper n)))
      = (1 : Matrix (Fin (2^dim)) (Fin (2^dim)) ℂ)
  | 0    , _ =>
**Matrix-level forward · proper-uncompute = identity**. The n-bit Gidney forward cascade composed with its proper (gate-reversed) uncomputation is the identity matrix. Proof by structural recursion on n, mirroring Iter 74's `prefix_and_cascade_uncompute_eq_one`. **Hypothesis**: a single `3 * n ≤ dim` bound suffices (the highest qubit touched at bit position k is `carry_idx k = 3k+2`, so all bits 0..n-1 fit when `3n ≤ dim`). **Fourth Verified-tier review chain** (adder side, mirror of Iter 74). Confirms that the simplified-bit-step forward cascade IS reversible by its proper inverse without measurement.
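The claim that one bound `3 * n ≤ dim` covers every bit position can be checked in isolation. This is a minimal sketch: `carryIdx` is a local stand-in that assumes the `3k+2` layout stated above, and it is not the library definition.

```lean
namespace DimBoundSketch

-- Local stand-in for `carry_idx` (assumed layout: 3k + 2).
def carryIdx (k : Nat) : Nat := 3 * k + 2

-- The highest qubit at any bit position k < n lies strictly below dim.
example (dim n k : Nat) (hk : k < n) (hbd : 3 * n ≤ dim) :
    carryIdx k < dim := by
  unfold carryIdx
  omega

end DimBoundSketch
```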
theoremtcount_gidney_adder_forward_with_propagation
theorem tcount_gidney_adder_forward_with_propagation : ∀ n,
    tcount (gidney_adder_forward_with_propagation n) = 7 * n
  | 0     => by decide
  | 1     => by decide
  | n + 2 =>
T-count of the propagation cascade: `7n` (each bit contributes 1 Toffoli).
theoremgcount_gidney_adder_forward_with_propagation
theorem gcount_gidney_adder_forward_with_propagation : ∀ n,
    gcount (gidney_adder_forward_with_propagation n)
      = if n = 0 then 0 else 4 * n - 1
  | 0     => by decide
  | 1     => by decide
  | n + 2 =>
Gate-count of the propagation cascade. Bit 0 contributes 3 gates (1 CCX + 2 propagation CXs); each interior bit contributes 4 (1 CCX + 1 chain CX + 2 propagation CXs). Total: `3 + 4·(n-1) = 4n - 1` for `n ≥ 1`; `n = 0` gives 0 gates. Because `Nat` has no negative numbers, the correction cannot be written as `4n + (if n = 0 then 0 else -1)`, so the statement instead branches on `n = 0`: `if n = 0 then 0 else 4 * n - 1`.
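The counting argument reduces to linear arithmetic over `Nat`, which `omega` can discharge. The following is an illustrative sketch of that arithmetic only; it does not refer to the library's `gcount`.

```lean
-- 3 gates for bit 0 plus 4 per interior bit gives 4n - 1 when n ≥ 1.
example (n : Nat) (hn : 1 ≤ n) : 3 + 4 * (n - 1) = 4 * n - 1 := by omega

-- With truncated subtraction, 4 * 0 - 1 already evaluates to 0, so on Nat
-- the branching form and the plain form `4 * n - 1` agree for every n.
example (n : Nat) : (if n = 0 then 0 else 4 * n - 1) = 4 * n - 1 := by
  split <;> omega
```

The explicit `if` in the theorem statement still makes the `n = 0` case readable without relying on truncated subtraction.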
theoremtcount_gidney_adder_forward_faithful_full
theorem tcount_gidney_adder_forward_faithful_full (n : Nat) :
    tcount (gidney_adder_forward_faithful_full (n + 2)) = 7 * (n + 2)
T-count of the faithful full forward cascade: `7n` for `n ≥ 2`. Matches qianxu Eq. E3's `q_A` Toffolis per adder (T-count = 7 · q_A).
theoremgidney_cost_skeleton_eq_faithful
theorem gidney_cost_skeleton_eq_faithful (n : Nat) :
    tcount (gidney_adder_forward (n + 2))
      = tcount (gidney_adder_forward_faithful_full (n + 2))
**Cost-equivalence (Iter 53 review-gap closure).** The COST-ONLY skeleton forward pass (`gidney_adder_forward`, which is *not* semantically the adder) and the semantically-correct faithful forward pass (`gidney_adder_forward_faithful_full`, proven on basis states) have the **same T-count**. (The Shor cost model now binds *directly* to the faithful adder via `adderToff_eq`; this records that the deprecated skeleton was always cost-equivalent, because the gates it omits are carry-propagation CXs, which are T-free.)
theoremgcount_gidney_adder_forward_faithful_full
theorem gcount_gidney_adder_forward_faithful_full (n : Nat) :
    gcount (gidney_adder_forward_faithful_full (n + 2)) = 4 * (n + 2) - 3
Gate-count of the faithful full forward cascade: `4n - 3` for `n ≥ 2`. Decomposes as 3 (first) + 4·(n-2) (interiors) + 2 (last) = 4n - 3.
example(example)
example : tcount (gidney_adder_forward_faithful_full 4) = 28
Concrete: 4-bit faithful Gidney adder = 28 T-gates = 4 Toffolis. (Matches `qq_gidney_adder.py` for a 4-bit instance.)
example(example)
example : tcount (gidney_adder_forward_faithful_full 33) = 7 * 33
Concrete: 33-bit faithful Gidney adder (RSA-2048 q_A=33 block) = 231 T-gates = 33 Toffolis.
theoremgidney_adder_forward_with_propagation_correct
theorem gidney_adder_forward_with_propagation_correct
    (dim : Nat) (hdim : 0 < dim) (f : Nat → Bool) :
    ∀ n, 3 * n + 2 ≤ dim →
    uc_eval (Gate.toUCom dim (gidney_adder_forward_with_propagation n))
      * f_to_vec dim f
      = f_to_vec dim (gidney_propagation_post_state n f)
  | 0    , _ =>
**Propagation cascade correctness**: given a single dim-bound `3 * n + 2 ≤ dim` (covering all qubits up through bit position n-1's propagation to bit n), the cascade acts on `f_to_vec dim f` to produce `f_to_vec dim (gidney_propagation_post_state n f)`. Proof by structural recursion on the three-clause def:
- n=0: Gate.I, trivially preserves.
- n=1: apply `gidney_adder_bit_step_faithful_first_correct` with first-bit disjointness derived from dim ≥ 5.
- n+2: `gate_seq_acts_on_basis` + IH (propagation n+1) + per-bit interior correctness at position n+1 (via `bit_disjointness_of_dim_bound`).
theoremgidney_adder_forward_faithful_full_correct
theorem gidney_adder_forward_faithful_full_correct
    (dim : Nat) (hdim : 0 < dim) (f : Nat → Bool) (n : Nat)
    (hbd : 3 * (n + 2) ≤ dim) :
    uc_eval (Gate.toUCom dim (gidney_adder_forward_faithful_full (n + 2)))
      * f_to_vec dim f
      = f_to_vec dim (gidney_forward_faithful_full_post_state (n + 2) f)
**Faithful full forward cascade correctness** (Phase A review anchor at the basis-state level): on `(n+2)`-bit input `f`, the cascade `gidney_adder_forward_faithful_full (n+2)` acts as `gidney_forward_faithful_full_post_state (n+2)` on basis states. Combines `gidney_adder_forward_with_propagation_correct` (propagation, this iter) with `gidney_adder_bit_step_faithful_last_correct` (last bit, Iter 67). Single dim-bound hypothesis `3*(n+2) ≤ dim` covers all qubits including the (n+1)-th carry.
theoremgidney_final_cx_cascade_correct
theorem gidney_final_cx_cascade_correct
    (dim : Nat) (hdim : 0 < dim) (f : Nat → Bool) :
    ∀ n, 3 * n ≤ dim →
    uc_eval (Gate.toUCom dim (gidney_final_cx_cascade n)) * f_to_vec dim f
      = f_to_vec dim (gidney_final_cx_cascade_post_state n f)
  | 0    , _   =>
**Final CX cascade correctness** on classical basis states. Single dim-bound hypothesis `3 * n ≤ dim` covers all qubits `target_idx (n-1) = 3n - 2 < dim` (for n ≥ 1). Proof by structural recursion on `n`:
- n = 0: cascade is `Gate.I`; trivially preserves.
- n + 1: `gate_seq_acts_on_basis` + IH + per-step `gate_cx_acts_on_basis` with disjointness via `omega`.
theoremtcount_gidney_adder_forward_with_propagation_reverse
theorem tcount_gidney_adder_forward_with_propagation_reverse : ∀ n,
    tcount (gidney_adder_forward_with_propagation_reverse n) = 7 * n
  | 0     => by decide
  | 1     => by decide
  | n + 2 =>
T-count of the propagation reverse cascade: 7n (same gates as forward, reversed).
theoremtcount_gidney_adder_forward_faithful_full_reverse
theorem tcount_gidney_adder_forward_faithful_full_reverse (n : Nat) :
    tcount (gidney_adder_forward_faithful_full_reverse (n + 2)) = 7 * (n + 2)
T-count of the faithful full reverse cascade: 7n for `n ≥ 2`.
theoremgidney_adder_forward_with_propagation_fwd_rev_eq_one
theorem gidney_adder_forward_with_propagation_fwd_rev_eq_one
    (dim : Nat) (hdim : 0 < dim) :
    ∀ n, 3 * n + 2 ≤ dim →
    uc_eval (Gate.toUCom dim
              (Gate.seq (gidney_adder_forward_with_propagation n)
                        (gidney_adder_forward_with_propagation_reverse n)))
      = (1 : Matrix (Fin (2^dim)) (Fin (2^dim)) ℂ)
  | 0    , _ =>
**Cascade-level forward · reverse = identity** for the propagation cascade. By structural recursion on `n`: collapse the middle `interior fwd · interior rev` pair via Iter 82's `..._interior_fwd_rev_eq_one`, then apply IH. Base cases:
- n = 0: both are Gate.I; product is ID·ID = 1.
- n = 1: just first_fwd · first_rev = 1 by Iter 81's involution.

Inductive step n+2: `(forward (n+1) ; interior (n+1)) ; (interior_reverse (n+1) ; reverse (n+1))`. Reassociate the matrix product, collapse the middle interior pair via Iter 82, drop it via `Matrix.one_mul`, and apply the IH to forward (n+1) · reverse (n+1).
theoremgidney_adder_forward_faithful_full_fwd_rev_eq_one
theorem gidney_adder_forward_faithful_full_fwd_rev_eq_one
    (dim : Nat) (hdim : 0 < dim) (n : Nat)
    (hbd : 3 * (n + 2) ≤ dim) :
    uc_eval (Gate.toUCom dim
              (Gate.seq (gidney_adder_forward_faithful_full (n + 2))
                        (gidney_adder_forward_faithful_full_reverse (n + 2))))
      = (1 : Matrix (Fin (2^dim)) (Fin (2^dim)) ℂ)
**Faithful full forward · reverse = identity (cascade level)** for the `(n+2)`-bit Gidney adder. Combines `..._with_propagation_fwd_rev_eq_one` (propagation cascade) + Iter 69's `..._last_fwd_rev_id` (last bit) via matrix reassociation.
theoremtcount_gidney_adder_full_faithful_no_measurement
theorem tcount_gidney_adder_full_faithful_no_measurement (n : Nat) :
    tcount (gidney_adder_full_faithful_no_measurement (n + 2)) = 14 * (n + 2)
T-count of the full no-measurement faithful adder for `(n+2)` bits: `14(n+2)`. Derived from the gate sequence: 7(n+2) (forward) + 0 (final CX = pure CXs) + 7(n+2) (reverse).
example(example)
example : tcount (gidney_adder_full_faithful_no_measurement 4) = 56
Concrete: 4-bit full faithful adder = 56 T-gates = 8 Toffolis.
example(example)
example : tcount (gidney_adder_full_faithful_no_measurement 33) = 14 * 33
Concrete: 33-bit full faithful adder (RSA-2048 q_A=33) = 14 · 33 = 462 T-gates = 66 Toffolis. **No-measurement upper bound** (Gidney measurement trick would halve this to 33 Toffolis = 231 T).
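The figures quoted for the `q_A = 33` block are plain arithmetic, using the 7-T-gates-per-Toffoli convention found throughout these docstrings. A quick sanity sketch, independent of the library's `tcount`:

```lean
-- No-measurement adder: 14 T-gates per bit, 33 bits.
example : 14 * 33 = 462 := by decide
-- 462 T-gates correspond to 66 Toffolis at 7 T-gates each.
example : 462 / 7 = 66 := by decide
-- The measurement trick would halve this to 231 T-gates, i.e. 33 Toffolis.
example : 462 = 2 * 231 := by decide
example : 231 / 7 = 33 := by decide
```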
theoremgidney_adder_full_faithful_no_measurement_vs_measurement_factor
theorem gidney_adder_full_faithful_no_measurement_vs_measurement_factor
    (n : Nat) :
    tcount (gidney_adder_full_faithful_no_measurement (n + 2))
      = 2 * gidney_adder_full_with_measurement_uncompute_tcount (n + 2)
**Gate-faithful no-measurement vs measurement-trick factor** (Iter 88). Strengthens `gidney_full_vs_measurement_uncompute_factor` (Iter 25, simplified bit-step) to the **gate-faithful** Gidney adder. The faithful encoding emits the same Toffoli count (14n T-gates), but is now backed by `qq_gidney_adder.py`'s full gate sequence and the Phase A semantic/structural correctness chain (Iter 65/57/67 per-bit + Iter 80 cascade forward + Iter 83 matrix-level inverse + Iter 86 reverse correctness). The factor of 2 remains the **measurement-uncomputation review gap**: faithful no-measurement T-count = 14n = 2 · (measurement paper-claim count 7n).

FormalRV.Arithmetic.RippleCarryAdder.RippleCarryAdderRSA2048Resource

FormalRV/Arithmetic/RippleCarryAdder/RippleCarryAdderRSA2048Resource.lean
example(example)
example :
    gidney_adder_full_with_measurement_uncompute_tcount 33 = 231
    ∧ tcount (gidney_adder_full_faithful_no_measurement 33) = 462
Concrete RSA-2048 (q_A=33): with Gidney measurement trick, T-count = 231 (paper figure); without (faithful gate-explicit), 462 — the factor of 2 review gap.
theoremgidney_adder_forward_faithful_full_reverse_correct
theorem gidney_adder_forward_faithful_full_reverse_correct
    (dim : Nat) (hdim : 0 < dim) (f : Nat → Bool) (n : Nat)
    (hbd : 3 * (n + 2) ≤ dim) :
    uc_eval (Gate.toUCom dim (gidney_adder_forward_faithful_full_reverse (n + 2)))
      * f_to_vec dim (gidney_forward_faithful_full_post_state (n + 2) f)
      = f_to_vec dim f
**Reverse cascade correctness on basis states**, derived as a corollary of Iter 80 (forward correctness) + Iter 83 (matrix-level forward · reverse = 1). On any classical basis state `f_to_vec dim (gidney_forward_faithful_full_post_state (n+2) f)`, the reverse cascade returns `f_to_vec dim f`.
theoremgidney_adder_full_faithful_no_measurement_unfold
theorem gidney_adder_full_faithful_no_measurement_unfold
    (dim : Nat) (hdim : 0 < dim) (f : Nat → Bool) (n : Nat)
    (hbd : 3 * (n + 2) ≤ dim) :
    uc_eval (Gate.toUCom dim (gidney_adder_full_faithful_no_measurement (n + 2)))
      * f_to_vec dim f
      = uc_eval (Gate.toUCom dim (gidney_adder_forward_faithful_full_reverse (n + 2)))
          * f_to_vec dim
              (gidney_final_cx_cascade_post_state (n + 2)
                (gidney_forward_faithful_full_post_state (n + 2) f))
**Full faithful adder structural unfolding** on classical basis states. The action of `gidney_adder_full_faithful_no_measurement` on `f_to_vec dim f` is expressed as `uc_eval(reverse) * f_to_vec(cx_post(forward_post f))`, where `forward_post = gidney_forward_faithful_full_post_state` and `cx_post = gidney_final_cx_cascade_post_state`. The reverse cascade is left symbolic; closing it to a final basis state requires the arithmetic-semantics theorem (Iter 88-89). This unfolding gives the structural skeleton needed to derive the end-to-end `(a, b, 0) → (a, a+b mod 2^n, 0)` theorem.
theoremgidney_first_bit_post_state_on_zero
theorem gidney_first_bit_post_state_on_zero :
    gidney_first_bit_post_state zeroF = zeroF
First-bit step on zero input gives zero. Each of the three updates writes `xor false false = false`, hence is a no-op by `Function.update_eq_self`.
theoremgidney_bit_step_faithful_post_state_on_zero
theorem gidney_bit_step_faithful_post_state_on_zero (i : Nat) :
    gidney_bit_step_faithful_post_state i zeroF = zeroF
Bit-step (interior) on zero input gives zero. Same pattern as first-bit: each update writes false.
theoremgidney_last_bit_post_state_on_zero
theorem gidney_last_bit_post_state_on_zero (i : Nat) :
    gidney_last_bit_post_state i zeroF = zeroF
Last-bit step on zero input gives zero.
theoremgidney_propagation_post_state_on_zero
theorem gidney_propagation_post_state_on_zero : ∀ n,
    gidney_propagation_post_state n zeroF = zeroF
  | 0     => rfl
  | 1     => gidney_first_bit_post_state_on_zero
  | n + 2 =>
Propagation cascade on zero input gives zero. Induction on n.
theoremgidney_forward_faithful_full_post_state_on_zero
theorem gidney_forward_faithful_full_post_state_on_zero : ∀ n,
    gidney_forward_faithful_full_post_state n zeroF = zeroF
  | 0     => rfl
  | 1     => rfl
  | n + 2 =>
Full forward cascade on zero input gives zero.
theoremgidney_final_cx_cascade_post_state_on_zero
theorem gidney_final_cx_cascade_post_state_on_zero : ∀ n,
    gidney_final_cx_cascade_post_state n zeroF = zeroF
  | 0     => rfl
  | n + 1 =>
Final CX cascade on zero input gives zero. Induction on n — each CX(read_i, target_i) writes `target_i ⊕= false = target_i`, a no-op.
theoremgidney_adder_full_faithful_no_measurement_on_zero
theorem gidney_adder_full_faithful_no_measurement_on_zero
    (dim : Nat) (hdim : 0 < dim) (n : Nat)
    (hbd : 3 * (n + 2) ≤ dim) :
    uc_eval (Gate.toUCom dim (gidney_adder_full_faithful_no_measurement (n + 2)))
      * f_to_vec dim zeroF
      = f_to_vec dim zeroF
**End-to-end smoke test**: full faithful Gidney adder on the all-zero input gives back the all-zero output. The simplest arithmetic claim `0 + 0 = 0 mod 2^n` verified at the gate level. Proof: combine Iter 87's structural unfolding with the zero-input lemmas above to reduce the full adder's action to `uc_eval(reverse) * f_to_vec(zero)`. Then apply Iter 86's reverse correctness (with f = zero, since `forward_post(zero) = zero`) to get `f_to_vec(zero)`.
example(example)
example :
    let post
**Concrete forward action check** at every qubit position for the 2-bit adder on `inputF_1_plus_0`. After the forward cascade (first-bit step + last-bit step):
- read_0 stays 1 (it is a CCX control; with target_0 = 0 the CCX writes 1 ∧ 0 = 0 into carry_0, so nothing changes)
- target_0 stays 0
- carry_0 = 0 (read_0 ∧ target_0 = 1 ∧ 0 = 0)
- read_1, target_1, carry_1 all stay 0 (no propagation since carry_0 = 0).

All 6 positions evaluate by `decide`, confirming the forward cascade preserves the state on this input. The arithmetic interpretation: forward correctly determines that no carries are generated.
example(example)
example :
    let post
**Concrete final-CX action check** for the 2-bit adder on the forward-post-state above. The final CX cascade applies:
- CX(read_0, target_0): target_0 ⊕= read_0 = 0 ⊕ 1 = 1.
- CX(read_1, target_1): target_1 ⊕= read_1 = 0 ⊕ 0 = 0.

After final CX, target = (1, 0), the sum 1 + 0 = 1 ✓.
example(example)
example :
    let post
**Forward post-state on (1, 1) input**: carry_0 generated, propagation flips read_1 and target_1 to 1, but the last-bit step's CCX·CX leaves carry_1 = 0.
example(example)
example :
    let post
**Final CX post-state on (1, 1) input**: `target_0 = 0` (sum-bit-0 = a XOR b XOR carry_in = 1 ⊕ 1 ⊕ 0 = 0 ✓), `target_1 = 0` (at this point target_1 has been XOR'd with read_1 = 1, its value after the propagation CX, so 1 ⊕ 1 = 0, which is NOT the sum bit; the reverse cascade is needed to restore target_1 = 1 via the propagation undo).
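The (1, 1) trace can be replayed on a toy classical model. Everything in this sketch is a local assumption: `bxor`, `upd`, `cx`, `ccx`, `input11`, `forward2`, and `finalCx2` are hypothetical helpers, the qubit layout is `read_i = 3i`, `target_i = 3i+1`, `carry_i = 3i+2`, and the gate order is inferred from the docstrings rather than taken from the library definitions.

```lean
namespace TraceSketch

/-- Boolean XOR, defined locally so the sketch needs no imports. -/
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- Overwrite one position of a classical state. -/
def upd (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = k then v else f j

/-- Classical action of CX: `t ⊕= c`. -/
def cx (c t : Nat) (f : Nat → Bool) : Nat → Bool :=
  upd f t (bxor (f t) (f c))

/-- Classical action of CCX: `t ⊕= a ∧ b`. -/
def ccx (a b t : Nat) (f : Nat → Bool) : Nat → Bool :=
  upd f t (bxor (f t) (f a && f b))

/-- a = 1, b = 1: only read_0 (qubit 0) and target_0 (qubit 1) are set. -/
def input11 : Nat → Bool := fun j => j == 0 || j == 1

/-- Assumed 2-bit forward cascade: first-bit step, then last-bit step. -/
def forward2 (f : Nat → Bool) : Nat → Bool :=
  cx 2 5 (ccx 3 4 5 (cx 2 4 (cx 2 3 (ccx 0 1 2 f))))

/-- Final CX cascade: CX(read_i, target_i) for i = 0, 1. -/
def finalCx2 (f : Nat → Bool) : Nat → Bool :=
  cx 3 4 (cx 0 1 f)

-- Forward post-state on qubits 0..5: (1, 1, 1, 1, 1, 0).
example : (List.range 6).map (forward2 input11)
    = [true, true, true, true, true, false] := by decide

-- Post-final-CX state: (1, 0, 1, 1, 0, 0); both target bits are 0 here.
example : (List.range 6).map (finalCx2 (forward2 input11))
    = [true, false, true, true, false, false] := by decide

end TraceSketch
```

In this model target = (0, 0) after the final CX, which matches the observation that the reverse cascade is still needed to obtain the sum bits of 1 + 1 = 2.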
example(example)
example :
    let post
**Forward post-state on (3, 1) input** (9 qubits checked).
example(example)
example :
    let post
**Final CX post-state on (3, 1) input**: target = (0, 1, 0) = "010" LSB-first = **2**, NOT the expected sum 4 = "100". The reverse cascade is required to flip target_2 from 0 to 1 (via interior_reverse's CX(carry_1, target_2)) to obtain the correct sum. Same review pattern as Iter 106's 2-bit `1+1=2`.
example(example)
example :
    let post
**Forward post-state on (7, 1) input** at all 12 qubits. Carry chain: carry_0=1, carry_1=1, carry_2=1, carry_3=0 (last-bit step's chain CX cancels). Propagation flips read_1, read_2, read_3 (via CX with carries of 1) and target_1, target_2, target_3.
example(example)
example :
    let post
**Final CX post-state on (7, 1) input**: target_0 = 1⊕1 = 0 (sum bit 0), target_3 = 1⊕1 = 0 (NOT the sum bit 3, which should be 1 for 8 = "1000" binary; the reverse cascade is needed to flip target_3 from 0 to 1).
example(example)
example :
    -- Starting state after forward + final CX on inputF_1_plus_1:
    -- (1, 0, 1, 1, 0, 0) i.e., read=(1,1), target=(0,0), carry=(1,0).
    -- This is the 2-bit case: first_bit_reverse touches only the bit-0 and
    -- bit-1 indices of the 6-qubit state.
    -- After first-bit reverse on this state:
    -- - CX(2, 4): target_1 ⊕= carry_0(=1) → target_1 = 0 ⊕ 1 = 1.
    -- - CX(2, 3): read_1 ⊕= carry_0(=1) → read_1 = 1 ⊕ 1 = 0.
    -- - CCX(0, 1, 2): carry_0 ⊕= read_0(=1) ∧ target_0(=0) → carry_0 = 1 ⊕ 0 = 1.
    let prev
**Smoke test on `inputF_1_plus_1` (a=1, b=1)**: starting from the post-final-CX state `(read=(1,1), target=(0,0), carry=(1,0))` after the 2-bit Gidney adder's forward + final CX, the first-bit reverse acts on it. Verify the post-state via decide.
example(example)
example :
    let prev
**Smoke test on `inputF_3_plus_1`**: starting from the post-(forward+final-CX) state of the 3-bit adder, apply the interior-bit reverse at i=1. Verify the post-state at all 9 qubits via decide.
example(example)
example :
    let prev
**Smoke test on `inputF_7_plus_1`**: starting from the post-(forward+final-CX) state of the 4-bit (a=7, b=1) adder, apply the last-bit reverse at i=3. The chain CX flips carry_3 from 0 to 1 (since carry_2=1); the CCX undo then conditions on (read_3=1, target_3=0) → AND=false, so carry_3 stays at 1. Verify the post-state at all 12 qubits via decide.
theoremAdder.carry_false_zero
theorem Adder.carry_false_zero (n : Nat) :
    Adder.carry false n (fun _ => false) (fun _ => false) = false
**Smoke lemma**: carry with carry-in zero, both inputs zero, yields zero. SQIR's `carry_false_0_l` analog ([ModMult.v:514](../../../SQIR/examples/shor/ModMult.v)).
theoremAdder.carry_sym
theorem Adder.carry_sym (b₀ : Bool) (n : Nat) (f g : Nat → Bool) :
    Adder.carry b₀ n f g = Adder.carry b₀ n g f
**Smoke lemma**: carry is symmetric in its two bit-stream arguments. SQIR's `carry_sym` analog ([ModMult.v:506](../../../SQIR/examples/shor/ModMult.v)).
theoremAdder.sumfb_zero
theorem Adder.sumfb_zero (f g : Nat → Bool) :
    Adder.sumfb false f g 0 = xor (f 0) (g 0)
**Smoke lemma**: sum-bit at position 0 with carry-in zero is just `f 0 ⊕ g 0`. Direct from def + carry's base case.
theoremAdder.carry_succ
theorem Adder.carry_succ (b₀ : Bool) (n : Nat) (f g : Nat → Bool) :
    Adder.carry b₀ (n + 1) f g
      = xor (xor (f n && g n) (g n && Adder.carry b₀ n f g))
            (f n && Adder.carry b₀ n f g)
**Carry recurrence in explicit form**: `Adder.carry b₀ (n+1) f g` equals `MAJ(f n, g n, Adder.carry b₀ n f g)` written out via XOR and AND. Auxiliary lemma for downstream proofs that need the recurrence as a rewrite rule (rather than via `unfold`, which expands too aggressively).
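That the XOR/AND expression on the right-hand side really is a three-input majority can be confirmed by exhausting the eight Boolean cases. In this sketch `bxor` and `maj` are local helper definitions introduced only for illustration; they are not the library's names.

```lean
namespace CarrySketch

/-- Boolean XOR, defined locally so the sketch needs no imports. -/
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- Textbook majority of three bits, for comparison. -/
def maj (a b c : Bool) : Bool := (a && b) || (b && c) || (a && c)

-- (a ∧ b) ⊕ (b ∧ c) ⊕ (a ∧ c) agrees with majority on all 8 inputs.
example (a b c : Bool) :
    bxor (bxor (a && b) (b && c)) (a && c) = maj a b c := by
  cases a <;> cases b <;> cases c <;> rfl

end CarrySketch
```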
theoremAdder.testBit_add_zero
theorem Adder.testBit_add_zero (a b : Nat) :
    (a + b).testBit 0 = xor (a.testBit 0) (b.testBit 0)
**Base case of the classical-correctness bridge** (Iter 163, new): `(a + b).testBit 0 = a.testBit 0 ⊕ b.testBit 0`. This is the i=0 specialization of `Adder.sumfb_eq_testBit_add`. The proof goes via Nat's mod-2 arithmetic: `Nat.testBit n 0 ↔ n % 2 = 1`, and `(a + b) % 2 = (a % 2 + b % 2) % 2` (which equals `a % 2 ⊕ b % 2` for Bool-valued mods). This closes the base case of the planned induction on i for `TODO_sumfb_eq_testBit_add`.
lemmaAdder.carry_shift_one
lemma Adder.carry_shift_one (b₀ : Bool) (a b k : Nat) :
    Adder.carry b₀ (k + 1) (fun i => a.testBit i) (fun i => b.testBit i)
    = Adder.carry (Adder.carry b₀ 1 (fun i => a.testBit i) (fun i => b.testBit i))
        k (fun i => (a / 2).testBit i) (fun i => (b / 2).testBit i)
**Carry-shift auxiliary lemma** (Iter 199, 2026-05-13). Relates `Adder.carry b₀ (k+1)` on (a, b) to `Adder.carry initial k` on (a/2, b/2), where `initial = Adder.carry b₀ 1 a b = MAJ(a_0, b_0, b₀)`. Proof by induction on k: the carry recurrence `carry _ (k+1) = MAJ(...)` + `Nat.testBit_add_one` gives `(a/2).testBit m = a.testBit (m+1)`.
theoremAdder.sumfb_eq_testBit_add_gen
theorem Adder.sumfb_eq_testBit_add_gen (b₀ : Bool) (a b i : Nat) :
    Adder.sumfb b₀ (fun k => a.testBit k) (fun k => b.testBit k) i
      = (a + b + b₀.toNat).testBit i
**Strengthened classical-correctness bridge with carry-in** (Iter 196, 2026-05-13). Generalizes `Adder.sumfb_eq_testBit_add` by adding a carry-in parameter `b₀ : Bool`, which lets the inductive step thread through `Nat.testBit_add_one` + `Nat.add_div` decomposition cleanly. Base case (i=0) is the existing `Adder.testBit_add_zero` analog extended with b₀; succ case is named-sorried per Iter 190's strategy doc (uses the gen IH applied to a/2, b/2, new carry-in derived from `Nat.add_div` decomposition).
theoremAdder.sumfb_eq_testBit_add
theorem Adder.sumfb_eq_testBit_add (a b i : Nat) :
    Adder.sumfb false (fun k => a.testBit k) (fun k => b.testBit k) i
      = (a + b).testBit i
**The classical-correctness bridge, parametric** (Iter 196 PROVEN via gen helper). `sumfb` on Nat-derived bit-streams equals `testBit (a+b)`. SQIR's `sumfb_correct_carry0` analog. Was sorried as `TODO_sumfb_eq_testBit_add` until Iter 196. Now derived from `Adder.sumfb_eq_testBit_add_gen` by specializing `b₀ = false` (and using `Bool.toNat false = 0`). Iter 196 also introduced a new sorry `TODO_sumfb_eq_testBit_add_gen_succ` for the gen-helper's succ case. Net sorry delta = 0; the new sorry has cleaner inductive structure.
example(example)
example :
    Adder.sumfb false (fun k => (3 : Nat).testBit k)
                      (fun k => (1 : Nat).testBit k) 0
      = ((3 : Nat) + 1).testBit 0
    ∧ Adder.sumfb false (fun k => (3 : Nat).testBit k)
                        (fun k => (1 : Nat).testBit k) 1
        = ((3 : Nat) + 1).testBit 1
    ∧ Adder.sumfb false (fun k => (3 : Nat).testBit k)
                        (fun k => (1 : Nat).testBit k) 2
        = ((3 : Nat) + 1).testBit 2
    ∧ Adder.sumfb false (fun k => (3 : Nat).testBit k)
                        (fun k => (1 : Nat).testBit k) 3
**Small-instance validation** of the bridge at `(a=3, b=1)`. Sum = 4 = 0b100. Decide-witnesses confirm the statement `sumfb false ... i = (3+1).testBit i` for i = 0, 1, 2, 3.
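The same (3, 1) instance can be replayed without `Nat.testBit` by writing the bit-streams out by hand. The definitions below are assumptions made for illustration: `carry` follows the recurrence stated in `Adder.carry_succ`, `sumfb` is assumed to be "carry XOR the two input bits" in the style of SQIR, and `bits3`, `bits1`, `bits4` are hypothetical hand-written bit-streams.

```lean
namespace SumBitSketch

/-- Boolean XOR, defined locally so the sketch needs no imports. -/
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- Assumed carry into position `n`, following the MAJ recurrence. -/
def carry (b₀ : Bool) : Nat → (Nat → Bool) → (Nat → Bool) → Bool
  | 0,     _, _ => b₀
  | n + 1, f, g =>
      bxor (bxor (f n && g n) (g n && carry b₀ n f g)) (f n && carry b₀ n f g)

/-- Assumed sum bit: carry into position `i` XOR the two input bits. -/
def sumfb (b₀ : Bool) (f g : Nat → Bool) (i : Nat) : Bool :=
  bxor (carry b₀ i f g) (bxor (f i) (g i))

def bits3 : Nat → Bool := fun k => k == 0 || k == 1   -- 3 = 0b011
def bits1 : Nat → Bool := fun k => k == 0             -- 1 = 0b001
def bits4 : Nat → Bool := fun k => k == 2             -- 4 = 0b100

-- Sum bits 0..3 of 3 + 1 match the bits of 4.
example : (List.range 4).map (sumfb false bits3 bits1)
    = (List.range 4).map bits4 := by decide

end SumBitSketch
```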
example(example)
example :
    Adder.sumfb false (fun k => (7 : Nat).testBit k)
                      (fun k => (1 : Nat).testBit k) 0
      = ((7 : Nat) + 1).testBit 0
    ∧ Adder.sumfb false (fun k => (7 : Nat).testBit k)
                        (fun k => (1 : Nat).testBit k) 3
        = ((7 : Nat) + 1).testBit 3
**Small-instance validation** at `(a=7, b=1)`. Sum = 8 = 0b1000. Bit 0/1/2 of 8 = false; bit 3 of 8 = true.
example(example)
example :
    Gidney.forward_cascade_post_invariant 4 7 1
      (gidney_forward_faithful_full_post_state 4 inputF_7_plus_1)
**Validation on the (7, 1) 4-bit case**: decide-witnesses that the invariant predicate is SATISFIED by the actual forward cascade post-state computed by `gidney_forward_faithful_full_post_state 4 inputF_7_plus_1`. This confirms the invariant statement matches the observed post-state (Iter 116's decide-table). The parametric "for all `a b n`" claim will be a separate SORRIED theorem below.
example(example)
example :
    Gidney.propagation_step_invariant 1 3 3 1
      (gidney_propagation_post_state 1 (adder_input_F 3 3 1))
**Validation on (3, 1) n=3 k=1**: after the first-bit step (k=1) on `adder_input_F 3 3 1`, the propagation invariant holds at all 3 positions. Decide-witness via manual match.
theoremadder_input_F_at_bottom
theorem adder_input_F_at_bottom (n a b : Nat) :
    adder_input_F n a b 2 = false
**Preliminary lemma** (partial: bottom 3 positions only): `adder_input_F n a b` evaluates as expected at qubit indices 0, 1, 2 (positions handled by the first-bit step).
theoremadder_input_F_at_read_idx
theorem adder_input_F_at_read_idx
    (n a b j : Nat) (hj : j < n) :
    adder_input_F n a b (read_idx j) = a.testBit j
**`adder_input_F` at `read_idx j`**: evaluates to `a.testBit j` when `j < n`.
theoremadder_input_F_at_target_idx
theorem adder_input_F_at_target_idx
    (n a b j : Nat) (hj : j < n) :
    adder_input_F n a b (target_idx j) = b.testBit j
**`adder_input_F` at `target_idx j`**: evaluates to `b.testBit j` when `j < n`.
theoremadder_input_F_at_carry_idx
theorem adder_input_F_at_carry_idx
    (n a b j : Nat) :
    adder_input_F n a b (carry_idx j) = false
**`adder_input_F` at `carry_idx j`**: always `false` (carry register starts clean). No bound on `j` needed.
theoremadder_input_F_at_first_bit_positions
theorem adder_input_F_at_first_bit_positions
    (n a b : Nat) (hn : 1 < n) :
    adder_input_F n a b 0 = a.testBit 0
    ∧ adder_input_F n a b 1 = b.testBit 0
    ∧ adder_input_F n a b 2 = false
    ∧ adder_input_F n a b 3 = a.testBit 1
    ∧ adder_input_F n a b 4 = b.testBit 1
**`adder_input_F` evaluation at the 5 first-bit-step positions** (Iter 165). Closes the gap between `adder_input_F n a b` (which is parameterized by Nat `n a b`) and `(a.testBit 0, b.testBit 0, false, a.testBit 1, b.testBit 1)` (which is pure Bool). The hypothesis `hn : 1 < n` is needed for positions 3 and 4 (where `k / 3 = 1`, so `decide (1 < n) = true` is required to reduce the `decide` guard). Together with `gidney_first_bit_post_state_in_bits` (Iter 164), this unblocks the proof of `TODO_gidney_first_bit_preserves`.
theoremGidney.propagation_step_invariant_base_k0
theorem Gidney.propagation_step_invariant_base_k0
    (n a b : Nat) (_ha : a < 2^n) (_hb : b < 2^n) :
    Gidney.propagation_step_invariant 0 n a b
      (gidney_propagation_post_state 0 (adder_input_F n a b))
**Base case k=0 of the cascade induction** (Iter 176, PROVEN). The invariant `Gidney.propagation_step_invariant 0 n a b` holds for the input `adder_input_F n a b`. `propagation_post_state 0 f = f`, so this reduces to showing `adder_input_F` has the right values at all positions. Uses the 3 evaluation lemmas above.
example(example)
example :
    let pre
**Last-bit smoke-test** (Iter 169): apply `gidney_last_bit_post_state` at i=1 to the post-first-bit state of `inputF_1_plus_1` (2-bit adder). Expected: carry_1 = MAJ(0, 0, 1) = 0 (chain CX cancels CCX write). Note: `gidney_last_bit_post_state` was originally defined at line 1081 (Iter 67). This tick adds the bit-extraction lemma.
theoremgidney_last_bit_post_state_in_bits
theorem gidney_last_bit_post_state_in_bits
    (i : Nat) (hi : 0 < i) (f : Nat → Bool)
    (h_cinit : f (carry_idx i) = false) :
    (gidney_last_bit_post_state i f) (carry_idx i)
      = xor (f (read_idx i) && f (target_idx i)) (f (carry_idx (i - 1)))
**Bit-extraction helper for last-bit step** (Iter 169). Mirrors Iter 164 (first-bit) and Iter 167 (interior). Last step has only 2 gates; single conjunct (only carry_i is touched).
theoremgidney_first_bit_post_state_preserves_outside
theorem gidney_first_bit_post_state_preserves_outside
    (f : Nat → Bool) (k : Nat)
    (h_c0 : k ≠ carry_idx 0)
    (h_r1 : k ≠ read_idx 1)
    (h_t1 : k ≠ target_idx 1) :
    (gidney_first_bit_post_state f) k = f k
**First-bit step frame condition**: positions other than {carry_0, read_1, target_1} (= {2, 3, 4}) are unchanged.
theoremgidney_last_bit_post_state_preserves_outside
theorem gidney_last_bit_post_state_preserves_outside
    (i : Nat) (f : Nat → Bool) (k : Nat)
    (h_ci : k ≠ carry_idx i) :
    (gidney_last_bit_post_state i f) k = f k
**Last-bit step frame condition**: positions other than {carry_i} are unchanged. (Last-bit only writes to carry_i.)
theoremgidney_last_bit_preserves
theorem gidney_last_bit_preserves (i a b : Nat) (hi : 0 < i) (f : Nat → Bool)
    (h_ri : f (read_idx i)
              = xor (a.testBit i) (Adder.carry false i a.testBit b.testBit))
    (h_ti : f (target_idx i)
              = xor (b.testBit i) (Adder.carry false i a.testBit b.testBit))
    (h_cim1 : f (carry_idx (i - 1))
                = Adder.carry false i a.testBit b.testBit)
    (h_ci : f (carry_idx i) = false) :
    (gidney_last_bit_post_state i f) (carry_idx i)
      = Adder.carry false (i + 1) a.testBit b.testBit
**Last-bit-step preservation theorem (PROVEN, Iter 171)**. Adapter from Iter 169's bit-extraction helper to the carry recurrence. Simpler than interior (no propagation). Given a state `f` satisfying the "step (i-1) END invariant" (i.e., position i-1 fully processed, position i clean):
- `f(read_i) = a_i ⊕ c`, `f(target_i) = b_i ⊕ c`
- `f(carry_{i-1}) = c` where `c = Adder.carry false i a.testBit b.testBit`
- `f(carry_i) = false`

Applying `gidney_last_bit_post_state i` yields:
- `post(carry_i) = c_{i+1} = Adder.carry false (i+1) a.testBit b.testBit`

No propagation to position (i+1) since this is the last bit. The carry-out identity `((a⊕c) ∧ (b⊕c)) ⊕ c = MAJ(a,b,c)` is the same as interior.
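The carry-out identity used in this step is a purely Boolean fact and can be checked by case analysis. A minimal sketch follows, with `bxor` and `maj` as local helpers defined only for this example (not the library's definitions):

```lean
namespace CarryOutSketch

/-- Boolean XOR, defined locally so the sketch needs no imports. -/
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

/-- Textbook majority of three bits. -/
def maj (a b c : Bool) : Bool := (a && b) || (b && c) || (a && c)

-- ((a ⊕ c) ∧ (b ⊕ c)) ⊕ c = MAJ(a, b, c), by exhausting all 8 cases.
example (a b c : Bool) :
    bxor (bxor a c && bxor b c) c = maj a b c := by
  cases a <;> cases b <;> cases c <;> rfl

end CarryOutSketch
```

Here `a` and `b` stand for the original input bits and `c` for the incoming carry, so `bxor a c` and `bxor b c` play the roles of the propagated `read_i` and `target_i` values.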
example(example)
example :
    -- The interior step at i=1 transforms inputF_3_plus_1's post-first-bit state.
    -- inputF_3_plus_1 (a=3, b=1) → first-bit step → interior step at i=1.
    let post_first
**Smoke-test**: `gidney_interior_bit_post_state 1` on the (3, 1) 3-bit input matches the existing decide-witnessed post-state. Validates the def's correctness on a concrete instance before attempting the parametric bit-extraction proof.
theoremgidney_interior_bit_post_state_eq
theorem gidney_interior_bit_post_state_eq
    (i : Nat) (f : Nat → Bool) :
    gidney_interior_bit_post_state i f
      = gidney_bit_step_faithful_post_state i f
**Bridge lemma** (Iter 172): the Iter 166-defined `gidney_interior_bit_post_state` is identical to the existing `gidney_bit_step_faithful_post_state` (line 570) used by the propagation cascade. Both have the same 4-update body, so the equality is provable by `rfl`. Iter 166 inadvertently introduced this duplicate def. The bridge lets us apply Iter 170's `gidney_interior_bit_preserves` (which uses the Iter 166 name) to the cascade's interior steps (which use the existing name).
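The `rfl` bridge pattern can be illustrated in isolation. This is a minimal sketch under the assumption of two hypothetical definitions, `stepOld` and `stepNew` (not FormalRV names), that were written with the same body; the real definitions have a 4-update body instead of the one-line body used here.

```lean
namespace BridgeSketch

-- Two hypothetical definitions with the same body under different names.
def stepOld (f : Nat → Bool) : Nat → Bool := fun k => !(f k)
def stepNew (f : Nat → Bool) : Nat → Bool := fun k => !(f k)

-- The bodies coincide after unfolding, so `rfl` closes the goal.
theorem stepNew_eq_stepOld (f : Nat → Bool) : stepNew f = stepOld f := rfl

-- A lemma stated for the new name ...
theorem stepNew_zero (f : Nat → Bool) : stepNew f 0 = !(f 0) := rfl

-- ... is transported to the old name by rewriting with the bridge.
theorem stepOld_zero (f : Nat → Bool) : stepOld f 0 = !(f 0) := by
  rw [← stepNew_eq_stepOld]
  exact stepNew_zero f

end BridgeSketch
```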
theoremgidney_interior_bit_post_state_preserves_outside
theorem gidney_interior_bit_post_state_preserves_outside
    (i : Nat) (f : Nat → Bool) (k : Nat)
    (h_ci : k ≠ carry_idx i)
    (h_ri1 : k ≠ read_idx (i + 1))
    (h_ti1 : k ≠ target_idx (i + 1)) :
    (gidney_interior_bit_post_state i f) k = f k
**Interior-bit step frame condition** (Iter 173): positions other than {carry_i, read_{i+1}, target_{i+1}} are unchanged by the interior-bit step at position `i`.
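A frame condition of this shape reduces to one fact about pointwise updates: reading a position that was not written returns the old value. A minimal sketch, assuming a hypothetical `upd` that stands in for the `update` function used in the listing (its actual definition is not shown here):

```lean
namespace FrameSketch

-- Hypothetical pointwise update on bit assignments.
def upd (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = k then v else f j

-- Reading at a position different from the written one returns the old value.
theorem upd_ne (f : Nat → Bool) (k j : Nat) (v : Bool) (h : j ≠ k) :
    upd f k v j = f j := by
  show (if j = k then v else f j) = f j
  exact if_neg h

-- A step that writes three positions leaves every other position unchanged.
theorem three_writes_frame (f : Nat → Bool) (p q r j : Nat) (u v w : Bool)
    (hp : j ≠ p) (hq : j ≠ q) (hr : j ≠ r) :
    upd (upd (upd f p u) q v) r w j = f j := by
  rw [upd_ne _ _ _ _ hr, upd_ne _ _ _ _ hq, upd_ne _ _ _ _ hp]

end FrameSketch
```

The three hypotheses of `three_writes_frame` play the role of `h_ci`, `h_ri1`, and `h_ti1` in the theorem above.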
theoremgidney_interior_bit_post_state_in_bits
theorem gidney_interior_bit_post_state_in_bits
    (i : Nat) (hi : 0 < i) (f : Nat → Bool)
    (h_cinit : f (carry_idx i) = false) :
    (gidney_interior_bit_post_state i f) (carry_idx i)
      = xor (f (read_idx i) && f (target_idx i)) (f (carry_idx (i - 1)))
    ∧ (gidney_interior_bit_post_state i f) (read_idx (i + 1))
        = xor (f (read_idx (i + 1)))
              ((gidney_interior_bit_post_state i f) (carry_idx i))
    ∧ (gidney_interior_bit_post_state i f) (target_idx (i + 1))
        = xor (f (target_idx (i + 1)))
              ((gidney_interior_bit_post_state i f) (carry_idx i))
**Bit-extraction helper for interior step** (Iter 167, PROVEN). Analog of Iter 164's first-bit version. Proven via `omega`-derived index inequalities plus an `update_neq` chain.
theoremgidney_interior_bit_preserves
theorem gidney_interior_bit_preserves (i a b : Nat) (hi : 0 < i) (f : Nat → Bool)
    (h_ri : f (read_idx i)
              = xor (a.testBit i) (Adder.carry false i a.testBit b.testBit))
    (h_ti : f (target_idx i)
              = xor (b.testBit i) (Adder.carry false i a.testBit b.testBit))
    (h_cim1 : f (carry_idx (i - 1))
                = Adder.carry false i a.testBit b.testBit)
    (h_ci : f (carry_idx i) = false)
    (h_ri1 : f (read_idx (i + 1)) = a.testBit (i + 1))
    (h_ti1 : f (target_idx (i + 1)) = b.testBit (i + 1)) :
    let post
**Interior-bit-step preservation theorem (PROVEN, Iter 170)**. Adapter from Iter 167's bit-extraction helper to the classical-carry-recurrence form.
Given a state `f` satisfying the "step (i-1) END invariant", where `c = Adder.carry false i a.testBit b.testBit`:
- `f(read_i) = a_i ⊕ c`, `f(target_i) = b_i ⊕ c` (propagated by prev step)
- `f(carry_{i-1}) = c` (carry from prev step)
- `f(carry_i) = false` (carry register unmodified up to position i)
- `f(read_{i+1}) = a_{i+1}`, `f(target_{i+1}) = b_{i+1}` (unchanged from input)
Applying `gidney_interior_bit_post_state i` yields a state satisfying the "step i END invariant":
- `post(carry_i) = c_{i+1} = Adder.carry false (i+1) a.testBit b.testBit`
- `post(read_{i+1}) = a_{i+1} ⊕ c_{i+1}`
- `post(target_{i+1}) = b_{i+1} ⊕ c_{i+1}`
The carry-out identity: `((a_i ⊕ c) ∧ (b_i ⊕ c)) ⊕ c = MAJ(a_i, b_i, c)`.
theoremgidney_first_bit_post_state_in_bits
theorem gidney_first_bit_post_state_in_bits
    (f : Nat → Bool) (h2 : f 2 = false) :
    (gidney_first_bit_post_state f) 2 = (f 0 && f 1)
    ∧ (gidney_first_bit_post_state f) 3 = xor (f 3) (f 0 && f 1)
    ∧ (gidney_first_bit_post_state f) 4 = xor (f 4) (f 0 && f 1)
**Bit-extraction helper for first-bit step** (Iter 164): captures the classical action of `gidney_first_bit_post_state` on an arbitrary input function `f`, parameterized by the 5 relevant bit values at positions 0, 1, 2, 3, 4.
Per Iter 162 reflection pattern A (bit-extraction): take Bool values as inputs, NOT a free Nat. This avoids the "decide on free Nat vars" obstacle entirely: the proof is pure Bool case-analysis (16 sub-goals over the 4 free Bool vars).
The relationship, for `gidney_first_bit_post_state f` at positions 2 (carry_0), 3 (read_1), 4 (target_1):
- post 2 = f 0 ∧ f 1 (CCX write)
- post 3 = f 3 ⊕ (f 0 ∧ f 1) (CX propagation)
- post 4 = f 4 ⊕ (f 0 ∧ f 1) (CX propagation)
Note that `f 2` (carry_0's initial value) is XOR'd into the CCX write, but for our adder input `f 2 = false`, so the XOR is trivial. We absorb this via `h2 : f 2 = false`.
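The Bool-level content of these three equations can be sketched without the indexing machinery. Below, `f0`, `f1`, `f3`, `f4` are the four free bit values, the carry bit is fixed to `false`, and `bxor` is a hypothetical local stand-in for `xor`; none of these names come from FormalRV.

```lean
namespace FirstBitSketch

-- Hypothetical local XOR helper, standing in for `xor` in the text.
def bxor : Bool → Bool → Bool
  | true,  b => !b
  | false, b => b

-- Model of the three writes with carry_0 initially `false`:
--   post 2 = false ⊕ (f0 ∧ f1)      (CCX write)
--   post 3 = f3 ⊕ post 2            (CX propagation)
--   post 4 = f4 ⊕ post 2            (CX propagation)
theorem first_bit_writes (f0 f1 f3 f4 : Bool) :
    bxor false (f0 && f1) = (f0 && f1)
    ∧ bxor f3 (bxor false (f0 && f1)) = bxor f3 (f0 && f1)
    ∧ bxor f4 (bxor false (f0 && f1)) = bxor f4 (f0 && f1) := by
  -- 2^4 = 16 sub-goals, each closed by evaluation.
  cases f0 <;> cases f1 <;> cases f3 <;> cases f4 <;> exact ⟨rfl, rfl, rfl⟩

end FirstBitSketch
```

Working with Bool parameters rather than a free `Nat` keeps every sub-goal a closed term, which is the point of the bit-extraction pattern described above.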
theoremgidney_first_bit_preserves
theorem gidney_first_bit_preserves (n a b : Nat)
    (hn : 1 < n) (_ha : a < 2^n) (_hb : b < 2^n) :
    let post
**First-bit-step preservation theorem (PROVEN, Iter 165)**: applying `gidney_first_bit_post_state` to the encoded input `adder_input_F n a b` (with `n ≥ 2`) produces a state where `carry_0 = c_1`, `read_1 = a_1 ⊕ c_1`, `target_1 = b_1 ⊕ c_1`, where `c_1 = Adder.carry false 1 (a.testBit) (b.testBit) = a_0 ∧ b_0`.
**Proof** (post Iter 162 reflection's pattern A bit-extraction): glue `gidney_first_bit_post_state_in_bits` (Iter 164, pure Bool case-bash) with `adder_input_F_at_first_bit_positions` (Iter 165 preliminary, uses `hn : 1 < n` to evaluate the `decide` guards). Closes the original `TODO_gidney_first_bit_preserves` from Iter 160.
theoremGidney.propagation_step_invariant_k1
theorem Gidney.propagation_step_invariant_k1
    (n a b : Nat) (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n) :
    Gidney.propagation_step_invariant 1 n a b
      (gidney_propagation_post_state 1 (adder_input_F n a b))
**Inductive step k=0 → k=1 of cascade induction** (Iter 177, PROVEN). Applying `gidney_first_bit_post_state` to `adder_input_F n a b` produces a state satisfying the step-1 invariant. Uses `gidney_first_bit_preserves` (touched positions), the frame condition, and `adder_input_F` evaluations (outside positions).
theoremTODO_gidney_propagation_step_invariant_step
theorem TODO_gidney_propagation_step_invariant_step
    (k n a b : Nat) (hk : 1 ≤ k) (hk_n : k + 1 < n)
    (hn : 1 < n) (ha : a < 2^n) (hb : b < 2^n)
    (h_prev : Gidney.propagation_step_invariant k n a b
                (gidney_propagation_post_state k (adder_input_F n a b))) :
    Gidney.propagation_step_invariant (k + 1) n a b
      (gidney_propagation_post_state (k + 1) (adder_input_F n a b))
**Inductive step k → k+1 of cascade induction** (Iter 178, SORRIED). For k ≥ 1 (so we apply an interior step at position k), if the state satisfies the step-k invariant, then applying the interior step at position k yields a state satisfying the step-(k+1) invariant.
Connects to the cascade via the recursive step: `gidney_propagation_post_state (k + 2) f = gidney_bit_step_faithful_post_state (k + 1) (gidney_propagation_post_state (k + 1) f)`. With the bridge `gidney_interior_bit_post_state_eq`, we can use `gidney_interior_bit_preserves` (Iter 170) for the touched positions and `gidney_interior_bit_post_state_preserves_outside` (Iter 173) for the rest.
SORRIED: the full proof requires extracting hypotheses from `h_prev` (the step-k invariant) at 6+ positions, then applying the interior preserves + frame condition. Estimated ~50-80 lines of careful Lean. Punted to keep this tick bounded; the pattern is established by Iter 177's first-bit version. See [Iter 174 reflection](AutoScript/reflection.md) for the completion plan.
theoremGidney.propagation_step_invariant_holds
theorem Gidney.propagation_step_invariant_holds
    (k n a b : Nat) (hkn : k < n) (hn : 1 < n)
    (ha : a < 2^n) (hb : b < 2^n) :
    Gidney.propagation_step_invariant k n a b
      (gidney_propagation_post_state k (adder_input_F n a b))
**Parametric propagation invariant** (Iter 179, PROVEN, but depends on Iter 178's sorried step lemma). By induction on `k`:
- Base case k=0: `propagation_step_invariant_base_k0`.
- k=1: `propagation_step_invariant_k1`.
- k ≥ 2: `TODO_gidney_propagation_step_invariant_step`.
The result: for any k with `k + 1 ≤ n` (the hypothesis `hkn : k < n`), `gidney_propagation_post_state k (adder_input_F n a b)` satisfies the step-k invariant. With the structural recursion form, the induction goes via `Nat.rec`.
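The three-way case split above (two base cases, then a step lemma that is only available from `k ≥ 1`) can be sketched as an abstract induction scheme. Here `P k` is a hypothetical stand-in for "the step-k invariant holds"; the bounds `k < n` and `k + 1 < n` carried by the real statements are omitted to keep the sketch short.

```lean
namespace CascadeInductionSketch

theorem invariant_holds (P : Nat → Prop)
    (base0 : P 0) (base1 : P 1)
    (step : ∀ k, 1 ≤ k → P k → P (k + 1)) :
    ∀ k, P k := by
  intro k
  induction k with
  | zero => exact base0                       -- k = 0: first base case
  | succ k ih =>
    cases k with
    | zero => exact base1                     -- k = 1: second base case
    | succ j =>
      -- k = j + 2: apply the step lemma at j + 1 ≥ 1.
      exact step (j + 1) (Nat.le_add_left 1 j) ih

end CascadeInductionSketch
```

In this shape the final theorem is only as trustworthy as `step`, which matches the status note: the induction is proven, but it inherits the `sorry` from the step lemma.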
example(example)
example :
    (∀ k, k < 6 →
       adder_input_F 2 1 0 k = inputF_1_plus_0 k)
**Generic ↔ concrete check #1**: `adder_input_F 2 1 0` matches `inputF_1_plus_0` at all 6 qubits of the 2-bit adder.
example(example)
example :
    (∀ k, k < 6 →
       adder_input_F 2 1 1 k = inputF_1_plus_1 k)
**Generic ↔ concrete check #2**: `adder_input_F 2 1 1` matches `inputF_1_plus_1` at all 6 qubits.
example(example)
example :
    (∀ k, k < 9 →
       adder_input_F 3 3 1 k = inputF_3_plus_1 k)
**Generic ↔ concrete check #3**: `adder_input_F 3 3 1` matches `inputF_3_plus_1` at all 9 qubits.
example(example)
example :
    (∀ k, k < 12 →
       adder_input_F 4 7 1 k = inputF_7_plus_1 k)
**Generic ↔ concrete check #4**: `adder_input_F 4 7 1` matches `inputF_7_plus_1` at all 12 qubits.
example(example)
example :
    adder_sum_bit_classical 7 1 0 = false
    ∧ adder_sum_bit_classical 7 1 1 = false
    ∧ adder_sum_bit_classical 7 1 2 = false
    ∧ adder_sum_bit_classical 7 1 3 = true
**Classical sum-bit concrete check**: bit 0 of (7+1)=8 is 0, bit 1 is 0, bit 2 is 0, bit 3 is 1 (binary "1000").
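The same four bit values can be cross-checked against core `Nat.testBit`. This sketch assumes that `adder_sum_bit_classical a b i` agrees with `(a + b).testBit i`, as the `_correct_bits` wrapper elsewhere in this listing suggests; the definition of `adder_sum_bit_classical` itself is not shown here.

```lean
-- 7 + 1 = 8 is binary "1000": bits 0..2 are clear and bit 3 is set.
example : Nat.testBit (7 + 1) 0 = false := by decide
example : Nat.testBit (7 + 1) 1 = false := by decide
example : Nat.testBit (7 + 1) 2 = false := by decide
example : Nat.testBit (7 + 1) 3 = true := by decide
```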
example(example)
example :
    Gidney.post_last_bit_invariant 2 1 1
      (gidney_forward_faithful_full_post_state 2 (adder_input_F 2 1 1))
**Decide-witness for `post_last_bit_invariant` on (n=2, a=1, b=1)** (Iter 187). Validates that after the forward cascade only (no final-CX), `target_1 = b_1 ⊕ c_1 = 0 ⊕ 1 = 1` (still propagated, not yet canceled). This is the state BEFORE the final-CX layer.

FormalRV.Arithmetic.RippleCarryAdder.RippleCarryAdderUncomputeCascade

FormalRV/Arithmetic/RippleCarryAdder/RippleCarryAdderUncomputeCascade.lean
theoremunpatched_full_reverse_commute_update_at_c_above
theorem unpatched_full_reverse_commute_update_at_c_above
    (n : Nat) (g : Nat → Bool) (v : Bool) (j : Nat) (hj : j > n + 1) :
    Gate.applyNat (gidney_adder_forward_faithful_full_reverse (n + 2)) (update g (carry_idx j) v)
      = update (Gate.applyNat (gidney_adder_forward_faithful_full_reverse (n + 2)) g)
          (carry_idx j) v
Unpatched full reverse cascade commutes with update at `c[j]` (`j > n+1`).
theoremunpatched_propagation_reverse_indep_input_at_c_above
theorem unpatched_propagation_reverse_indep_input_at_c_above
    (m : Nat) (g : Nat → Bool) (v : Bool) (k : Nat) (h_k : k ≠ carry_idx (m + 1)) :
    Gate.applyNat (gidney_adder_forward_with_propagation_reverse (m + 1))
      (update g (carry_idx (m + 1)) v) k
    = Gate.applyNat (gidney_adder_forward_with_propagation_reverse (m + 1)) g k
**Input-independence of the unpatched propagation cascade** (Deliverable A): changing the input at `carry_idx (m+1)` (above the cascade's range) does not affect the output at any other position.
theoremunpatched_full_reverse_indep_input_at_c_above
theorem unpatched_full_reverse_indep_input_at_c_above
    (n : Nat) (g : Nat → Bool) (v : Bool) (k : Nat) (h_k : k ≠ carry_idx (n + 2)) :
    Gate.applyNat (gidney_adder_forward_faithful_full_reverse (n + 2))
      (update g (carry_idx (n + 2)) v) k
    = Gate.applyNat (gidney_adder_forward_faithful_full_reverse (n + 2)) g k
Input-independence of the unpatched full reverse cascade at `c[n+2]`.
theorempatched_unpatched_propagation_reverse_eq_at_target
theorem patched_unpatched_propagation_reverse_eq_at_target (m : Nat) :
    ∀ (g : Nat → Bool) (i : Nat),
      Gate.applyNat (gidney_adder_forward_with_propagation_reverse_patched (m + 1)) g
        (target_idx i)
        = Gate.applyNat (gidney_adder_forward_with_propagation_reverse (m + 1)) g
            (target_idx i)
Patched propagation cascade equals unpatched at `target_idx i`.
theorempatched_unpatched_propagation_reverse_eq_at_read
theorem patched_unpatched_propagation_reverse_eq_at_read (m : Nat) :
    ∀ (g : Nat → Bool) (i : Nat),
      Gate.applyNat (gidney_adder_forward_with_propagation_reverse_patched (m + 1)) g
        (read_idx i)
        = Gate.applyNat (gidney_adder_forward_with_propagation_reverse (m + 1)) g
            (read_idx i)
Patched propagation cascade equals unpatched at `read_idx i`.
theorempatched_full_reverse_eq_unpatched_at_target
theorem patched_full_reverse_eq_unpatched_at_target
    (n : Nat) (g : Nat → Bool) (i : Nat) :
    Gate.applyNat (gidney_adder_forward_faithful_full_reverse_patched (n + 2)) g (target_idx i)
      = Gate.applyNat (gidney_adder_forward_faithful_full_reverse (n + 2)) g (target_idx i)
Patched full reverse cascade equals unpatched at `target_idx i`.
theorempatched_full_reverse_eq_unpatched_at_read
theorem patched_full_reverse_eq_unpatched_at_read
    (n : Nat) (g : Nat → Bool) (i : Nat) :
    Gate.applyNat (gidney_adder_forward_faithful_full_reverse_patched (n + 2)) g (read_idx i)
      = Gate.applyNat (gidney_adder_forward_faithful_full_reverse (n + 2)) g (read_idx i)
Patched full reverse cascade equals unpatched at `read_idx i`.
theoremgidney_adder_full_faithful_no_measurement_patched_target_correct
theorem gidney_adder_full_faithful_no_measurement_patched_target_correct
    (n a b : Nat) (ha : a < 2^(n + 2)) (hb : b < 2^(n + 2)) :
    ∀ i, i < n + 2 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched (n + 2))
        (adder_input_F (n + 2) a b) (target_idx i)
      = adder_sum_bit_classical a b i
**Patched full adder, target register correctness** (Deliverable C₁).
theoremgidney_adder_full_faithful_no_measurement_patched_read_preserved
theorem gidney_adder_full_faithful_no_measurement_patched_read_preserved
    (n a b : Nat) (ha : a < 2^(n + 2)) (hb : b < 2^(n + 2)) :
    ∀ i, i < n + 2 →
      Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched (n + 2))
        (adder_input_F (n + 2) a b) (read_idx i)
      = a.testBit i
**Patched full adder, read register preservation** (Deliverable C₂).
theoremgidney_adder_full_faithful_no_measurement_patched_correct
theorem gidney_adder_full_faithful_no_measurement_patched_correct
    (n a b : Nat) (ha : a < 2^(n + 2)) (hb : b < 2^(n + 2)) :
    (∀ i, i < n + 2 →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched (n + 2))
          (adder_input_F (n + 2) a b) (read_idx i)
        = a.testBit i)
    ∧ (∀ i, i < n + 2 →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched (n + 2))
          (adder_input_F (n + 2) a b) (target_idx i)
        = adder_sum_bit_classical a b i)
    ∧ (∀ i, i ≤ n + 1 →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched (n + 2))
**Full patched-adder correctness, packaged theorem** (Deliverable D). For the Option-1 carry-clearing patched Gidney adder on `adder_input_F (n+2) a b`:
1. The read register is preserved (= original `a` bits).
2. The target register equals the classical sum bits.
3. The carry register is fully cleared.
theoremnat_mod_two_pow_succ_eq
theorem nat_mod_two_pow_succ_eq (x n : Nat) :
    x % 2^(n + 1) = x % 2^n + (if x.testBit n then 2^n else 0)
Helper: `x % 2^(n+1) = x % 2^n + (testBit x n) * 2^n`, with the Boolean `testBit x n` read as 0 or 1. Standard identity, not in Mathlib in this exact form.
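A concrete instance makes the identity easy to sanity-check. The sketch below uses only core `Nat` operations on the hypothetical sample values `x = 13`, `n = 2`, chosen for illustration.

```lean
-- Instance x = 13 (binary 1101), n = 2: bit 2 is set, so 2^2 is added.
example : 13 % 2 ^ 3 = 5 := by decide
example : 13 % 2 ^ 2 = 1 := by decide
example : Nat.testBit 13 2 = true := by decide

-- Both sides of the identity evaluate to 5 = 1 + 2^2.
example :
    13 % 2 ^ (2 + 1) = 13 % 2 ^ 2 + (if Nat.testBit 13 2 then 2 ^ 2 else 0) := by
  decide
```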
theoremgidney_target_val_eq_sum_when_bits_match
theorem gidney_target_val_eq_sum_when_bits_match
    (bits S : Nat) (f : Nat → Bool)
    (h : ∀ i, i < bits → f (target_idx i) = S.testBit i) :
    gidney_target_val bits f = S % 2^bits
If a bit-function's target-register positions match the bits of `S`, then `gidney_target_val` decodes the target register to `S % 2^bits`.
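The decoding statement can be illustrated with a toy decoder. `decodeBits` below is a hypothetical little-endian decoder over the first `n` bits of a bit function; the actual `gidney_target_val` reads positions `target_idx i` and may be defined differently, so this is only a sketch of the idea.

```lean
namespace DecodeSketch

-- Hypothetical decoder: sum of 2^i over the set bits i < n.
def decodeBits (bit : Nat → Bool) : Nat → Nat
  | 0 => 0
  | n + 1 => decodeBits bit n + (if bit n then 2 ^ n else 0)

-- If the bits are those of S = 13, decoding 3 bits gives 13 % 2^3.
example : decodeBits (Nat.testBit 13) 3 = 13 % 2 ^ 3 := by decide

end DecodeSketch
```

The recursive case of `decodeBits` has the same shape as the `x % 2^(n+1)` helper, which suggests why that helper is the natural induction step for a theorem of this form.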
theoremgidney_adder_full_faithful_no_measurement_patched_correct_bits
theorem gidney_adder_full_faithful_no_measurement_patched_correct_bits
    (bits a b : Nat) (hbits : 2 ≤ bits) (ha : a < 2^bits) (hb : b < 2^bits) :
    (∀ i, i < bits →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
          (adder_input_F bits a b) (read_idx i) = a.testBit i)
    ∧ (∀ i, i < bits →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
          (adder_input_F bits a b) (target_idx i) = (a + b).testBit i)
    ∧ (∀ i, i < bits →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
          (adder_input_F bits a b) (carry_idx i) = false)
**Deliverable A**: bits-parameter wrapper of the packaged correctness theorem. For any `bits ≥ 2` and `a, b < 2^bits`, the patched full faithful no-measurement Gidney adder preserves the read register, writes the classical sum bits into the target register, and clears the carry register.
theoremgidney_adder_patched_target_decode
theorem gidney_adder_patched_target_decode
    (bits a b : Nat) (hbits : 2 ≤ bits) (ha : a < 2^bits) (hb : b < 2^bits) :
    gidney_target_val bits
      (Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
        (adder_input_F bits a b))
    = (a + b) % 2^bits
**Deliverable B**: decoded target-register correctness. After the patched full faithful no-measurement Gidney adder runs on `adder_input_F bits a b`, the target register decodes to `(a + b) mod 2^bits`.
theoremgidney_adder_bit_step_faithful_first_wellTyped
theorem gidney_adder_bit_step_faithful_first_wellTyped
    (bits : Nat) (hbits : 2 ≤ bits) :
    Gate.WellTyped (adder_n_qubits bits) gidney_adder_bit_step_faithful_first
theoremgidney_adder_bit_step_faithful_interior_wellTyped
theorem gidney_adder_bit_step_faithful_interior_wellTyped
    (bits i : Nat) (hi_pos : 0 < i) (hi_lt : i < bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_bit_step_faithful_interior i)
theoremgidney_adder_bit_step_faithful_last_wellTyped
theorem gidney_adder_bit_step_faithful_last_wellTyped
    (bits i : Nat) (hi_pos : 0 < i) (hi_lt : i < bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_bit_step_faithful_last i)
theoremgidney_adder_bit_step_faithful_first_reverse_patched_wellTyped
theorem gidney_adder_bit_step_faithful_first_reverse_patched_wellTyped
    (bits : Nat) (hbits : 2 ≤ bits) :
    Gate.WellTyped (adder_n_qubits bits)
      gidney_adder_bit_step_faithful_first_reverse_patched
theoremgidney_adder_bit_step_faithful_interior_reverse_patched_wellTyped
theorem gidney_adder_bit_step_faithful_interior_reverse_patched_wellTyped
    (bits i : Nat) (hi_pos : 0 < i) (hi_lt : i < bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_bit_step_faithful_interior_reverse_patched i)
theoremgidney_adder_bit_step_faithful_last_reverse_patched_wellTyped
theorem gidney_adder_bit_step_faithful_last_reverse_patched_wellTyped
    (bits i : Nat) (hi_pos : 0 < i) (hi_lt : i < bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_bit_step_faithful_last_reverse_patched i)
theoremgidney_adder_forward_with_propagation_wellTyped
theorem gidney_adder_forward_with_propagation_wellTyped
    (bits : Nat) (hb2 : 2 ≤ bits) :
    ∀ k, k ≤ bits →
      Gate.WellTyped (adder_n_qubits bits)
        (gidney_adder_forward_with_propagation k)
theoremgidney_adder_forward_faithful_full_wellTyped
theorem gidney_adder_forward_faithful_full_wellTyped
    (bits : Nat) (hb2 : 2 ≤ bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_forward_faithful_full bits)
theoremgidney_final_cx_cascade_wellTyped
theorem gidney_final_cx_cascade_wellTyped
    (bits : Nat) :
    ∀ k, k ≤ bits →
      Gate.WellTyped (adder_n_qubits bits) (gidney_final_cx_cascade k)
theoremgidney_adder_forward_with_propagation_reverse_patched_wellTyped
theorem gidney_adder_forward_with_propagation_reverse_patched_wellTyped
    (bits : Nat) (hb2 : 2 ≤ bits) :
    ∀ k, k ≤ bits →
      Gate.WellTyped (adder_n_qubits bits)
        (gidney_adder_forward_with_propagation_reverse_patched k)
theoremgidney_adder_forward_faithful_full_reverse_patched_wellTyped
theorem gidney_adder_forward_faithful_full_reverse_patched_wellTyped
    (bits : Nat) (hb2 : 2 ≤ bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_forward_faithful_full_reverse_patched bits)
theoremgidney_adder_full_faithful_no_measurement_patched_wellTyped
theorem gidney_adder_full_faithful_no_measurement_patched_wellTyped
    (bits : Nat) (hb2 : 2 ≤ bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_full_faithful_no_measurement_patched bits)
**Deliverable C**: full patched-adder WellTyped at the natural dimension `adder_n_qubits bits = 3 * bits + 2`.
theoremgidney_adder_patched_primitive
theorem gidney_adder_patched_primitive
    (bits a b : Nat) (hbits : 2 ≤ bits) (ha : a < 2^bits) (hb : b < 2^bits) :
    Gate.WellTyped (adder_n_qubits bits)
      (gidney_adder_full_faithful_no_measurement_patched bits)
    ∧ gidney_target_val bits
        (Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
          (adder_input_F bits a b))
      = (a + b) % 2^bits
    ∧ (∀ i, i < bits →
        Gate.applyNat (gidney_adder_full_faithful_no_measurement_patched bits)
          (adder_input_F bits a b) (read_idx i) = a.testBit i)
    ∧ (∀ i, i < bits →
**Deliverable D**: bundled reusable patched-adder primitive combining WellTyped, decoded target correctness, read preservation, and carry clearing. It is the single theorem the modular-addition layer should call.

FormalRV.Arithmetic.SQIRModMult

FormalRV/Arithmetic/SQIRModMult.lean
(no documented top-level declarations)

FormalRV.Arithmetic.SQIRModMult.ModExpCount

FormalRV/Arithmetic/SQIRModMult/ModExpCount.lean
FormalRV.Arithmetic.SQIRModMult.ModExpCount: EXACT T-counts of two concrete mod-exp-shaped `Gate` IR chains. The counts are exact and machine-checked; the LABELS below are carefully honest about what each chain is (a counting audit, 2026-06-03, flagged earlier overclaims).
TWO chains, TWO numbers. Do NOT confuse them:
- `shorModExp` (this section): chains the OUT-OF-PLACE `sqir_modmult_const_gate` (8·bits² Toffoli/step). T-count EXACTLY `112·bits³` (= 16·bits³ Toffoli; `16·2048³ = 137 438 953 472`). ⚠ This is a COUNTING MODEL only: an out-of-place multiplier writes `a·x` into a FRESH accumulator with no feedback, so a chain of them does NOT compute modular exponentiation. It is NOT the term the verified Shor algorithm uses. Keep it only as the per-step structural skeleton.
- `shorModExpVerified` (below): chains the IN-PLACE verified oracle `sqir_modmult_MCP_gate` (16·bits² Toffoli/step), the term the verified Shor theorem actually uses. T-count EXACTLY `224·bits³` (= 32·bits³ Toffoli; `32·2048³ = 274 877 906 944` = 2× the above, the in-place forward+uncompute factor). This is the honest verified-oracle arithmetic figure.
HONEST STATUS (CLAUDE.md "semantic correctness before resource counts"): the COUNTS are exact, but NEITHER chain has a proof that it computes `a^x mod N`. Each per-step multiplier is semantically verified (`const_gate`: (a·m)%N decode; `MCP`: MultiplyCircuitProperty), but the chain-realizes-modular-exponentiation theorem is NOT proved here (the verified mod-exp semantics lives in `Shor_correct_verified_no_modmult_axioms` via `controlled_powers`, a DIFFERENT BaseUCom-level term, with no bridge to these Gate chains yet). So both chains are SCAFFOLDED (count-only), and the `2·bits` exponent-register multiplicity is structural.
EXACT counts derived by induction (math: the 2048 circuit is never built), no `sorry`/`axiom`.
defshorModExpChain
def shorModExpChain (m bits N a : Nat) : Gate
COUNTING-MODEL chain of `m` OUT-OF-PLACE `const_gate` multipliers (step `k` multiplies by `a^(2^k)`). ⚠ Not a valid modular-exponentiation circuit (out-of-place = no feedback) and NOT the verified Shor oracle term — kept only for its per-step Toffoli structure. For the verified-oracle chain use `shorModExpVerified`.
theoremtcount_shorModExpChain
theorem tcount_shorModExpChain (m bits N a : Nat)
    (hcop : Nat.Coprime a N) (hodd : Odd N) (h1 : 1 < N) :
    tcount (shorModExpChain m bits N a) = m * (56 * bits ^ 2)
**EXACT** T-count of the `m`-fold modular-multiplier chain: `m · 56·bits²`, for any valid Shor base (`gcd(a,N)=1`, `N` odd, `N>1`, so every multiplier step is non-trivial).
defshorModExp
def shorModExp (bits N a : Nat) : Gate
COUNTING-MODEL mod-exp skeleton: `2·bits` out-of-place multipliers. ⚠ Not a valid mod-exp (no feedback) and not the verified oracle — see header; use `shorModExpVerified`.
theoremtcount_shorModExp
theorem tcount_shorModExp (bits N a : Nat)
    (hcop : Nat.Coprime a N) (hodd : Odd N) (h1 : 1 < N) :
    tcount (shorModExp bits N a) = 112 * bits ^ 3
**EXACT** T-count of the out-of-place counting-model chain `shorModExp`: `112·bits³` (= `16·bits³` Toffoli). Exact count of THIS concrete term; the term is a counting model, NOT the verified Shor circuit (header).
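The arithmetic linking the per-step and whole-chain figures can be spot-checked with closed numerals. The ratio of 7 T per Toffoli used below is not stated explicitly in the listing; it is inferred from the quoted pairs (8 vs. 56 and 16 vs. 112) and should be read as an assumption.

```lean
-- Per-step cost from the text: 8·bits² Toffoli and 56·bits² T,
-- so the implied ratio is 7 T per Toffoli.
example : 7 * 8 = 56 := by decide
example : 7 * 16 = 112 := by decide

-- Chaining 2·bits steps at bits = 4 (the size of the `shorModExp 4 15 7` instance).
example : (2 * 4) * (56 * 4 ^ 2) = 112 * 4 ^ 3 := by decide

-- Toffoli figure quoted in the module header for bits = 2048.
example : 16 * 2048 ^ 3 = 137438953472 := by decide
```

These checks only confirm the numerals; the general statement for arbitrary `bits` is the induction-based theorem above.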
example(example)
example : tcount (shorModExp 4 15 7) = 112 * 4 ^ 3
defshorModExpMCPChain
def shorModExpMCPChain (m bits N a ainv : Nat) : Gate
theoremtcount_shorModExpMCPChain
theorem tcount_shorModExpMCPChain (m bits N a ainv : Nat)
    (hcop : Nat.Coprime a N) (hcopinv : Nat.Coprime ainv N)
    (hpos : 0 < ainv) (hlt : ainv < N) (hodd : Odd N) (h1 : 1 < N) :
    tcount (shorModExpMCPChain m bits N a ainv) = m * (112 * bits ^ 2)
defshorModExpVerified
def shorModExpVerified (bits N a ainv : Nat) : Gate
Mod-exp-shaped chain of `2·bits` VERIFIED in-place MCP oracles — the honest arithmetic figure (each step is the term the verified Shor theorem uses). ⚠ Count-only/SCAFFOLDED: no proof yet that the chain computes `a^x mod N` (header); `2·bits` is structural.
theoremtcount_shorModExpVerified
theorem tcount_shorModExpVerified (bits N a ainv : Nat)
    (hcop : Nat.Coprime a N) (hcopinv : Nat.Coprime ainv N)
    (hpos : 0 < ainv) (hlt : ainv < N) (hodd : Odd N) (h1 : 1 < N) :
    tcount (shorModExpVerified bits N a ainv) = 224 * bits ^ 3
**EXACT** T-count of the verified-oracle chain `shorModExpVerified`: `224·bits³` (= `32·bits³` Toffoli; twice `shorModExp`, the in-place forward+uncompute factor). This is the count on the verified-oracle building block; it is count-only (mod-exp semantics not proved for the chain; see header).
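The doubled figures can be spot-checked in the same way. The sample arguments `15`, `7`, `13` are taken from the `shorModExpVerified 2 15 7 13` instance in this module; that 13 acts as the modular inverse of 7 is an observation about those numbers, not something the theorem's hypotheses state.

```lean
-- In-place step: 112·bits² T, twice the 56·bits² of the out-of-place step.
example : 2 * 56 = 112 := by decide

-- Chaining 2·bits steps at bits = 2.
example : (2 * 2) * (112 * 2 ^ 2) = 224 * 2 ^ 3 := by decide

-- The sample arguments satisfy 7 · 13 ≡ 1 (mod 15).
example : (7 * 13) % 15 = 1 := by decide

-- Toffoli figure quoted in the module header for bits = 2048.
example : 32 * 2048 ^ 3 = 274877906944 := by decide
```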
example(example)
example : tcount (shorModExpVerified 2 15 7 13) = 224 * 2 ^ 3

FormalRV.Arithmetic.SQIRModMult.SQIRModMultAccumulatorRange

FormalRV/Arithmetic/SQIRModMult/SQIRModMultAccumulatorRange.lean
theoremsqir_swap_acc_mult_at_target_out_range_qstart
theorem sqir_swap_acc_mult_at_target_out_range_qstart
    (bits q_start k i : Nat) (hk : k ≤ bits) (hi_ge : k ≤ i) (hi_bits : i < bits)
    (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux_qstart bits q_start k) f
        (sqir_target_idx_qstart q_start i)
      = f (sqir_target_idx_qstart q_start i)
q_start port of `sqir_swap_acc_mult_aux_at_target_out_range` (line 3118). At an accumulator bit `i ≥ k`, swap output = input.
theoremsqir_swap_acc_mult_at_target_in_range_qstart
theorem sqir_swap_acc_mult_at_target_in_range_qstart
    (bits q_start k i : Nat) (hk : k ≤ bits) (hi_k : i < k) (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux_qstart bits q_start k) f
        (sqir_target_idx_qstart q_start i)
      = f (sqir_mult_control_idx_qstart bits q_start i)
q_start port of `sqir_swap_acc_mult_aux_at_target_in_range` (line 3139). At an accumulator bit `i < k`, swap output = input at the matched multiplier position.
theoremsqir_swap_acc_mult_at_mult_in_range_qstart
theorem sqir_swap_acc_mult_at_mult_in_range_qstart
    (bits q_start k i : Nat) (hk : k ≤ bits) (hi_k : i < k) (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux_qstart bits q_start k) f
        (sqir_mult_control_idx_qstart bits q_start i)
      = f (sqir_target_idx_qstart q_start i)
q_start port of `sqir_swap_acc_mult_aux_at_mult_in_range` (line 3164). At a multiplier bit `i < k`, swap output = input at matched target.
theoremsqir_swap_acc_mult_at_other_qstart
theorem sqir_swap_acc_mult_at_other_qstart
    (bits q_start k q : Nat) (hk : k ≤ bits) (f : Nat → Bool)
    (h_q_not_target : ∀ i, i < k → q ≠ sqir_target_idx_qstart q_start i)
    (h_q_not_mult : ∀ i, i < k → q ≠ sqir_mult_control_idx_qstart bits q_start i) :
    Gate.applyNat (sqir_swap_acc_mult_aux_qstart bits q_start k) f q = f q
q_start port of `sqir_swap_acc_mult_aux_at_other` (line 3189). At any position outside the swap range, output = input.
theoremsqir_swap_acc_mult_apply_qstart
theorem sqir_swap_acc_mult_apply_qstart
    (bits q_start m acc : Nat) (hbits : 1 ≤ bits)
    (hm : m < 2^bits) (hacc : acc < 2^bits) :
    Gate.applyNat (sqir_swap_acc_mult_qstart bits q_start)
        (sqir_mult_input_F_qstart bits q_start m acc)
      = sqir_mult_input_F_qstart bits q_start acc m
q_start port of `sqir_swap_acc_mult_apply` (line 3215). Full SWAP correctness on `sqir_mult_input_F_qstart`.
theoremsqir_target_idx_ne_mult_control_idx
theorem sqir_target_idx_ne_mult_control_idx
    (bits i j : Nat) (hi : i < bits) :
    sqir_target_idx i ≠ sqir_mult_control_idx bits j
theoremsqir_swap_acc_mult_aux_succ_eq
theorem sqir_swap_acc_mult_aux_succ_eq (bits k : Nat) :
    sqir_swap_acc_mult_aux bits (k + 1)
      = Gate.seq (sqir_swap_acc_mult_aux bits k)
          (qubit_swap (sqir_target_idx k) (sqir_mult_control_idx bits k))
theoremsqir_swap_acc_mult_aux_wellTyped
theorem sqir_swap_acc_mult_aux_wellTyped
    (bits k : Nat) (hbits : 1 ≤ bits) (hk : k ≤ bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits) (sqir_swap_acc_mult_aux bits k)
**WellTyped for `sqir_swap_acc_mult_aux`.**
theoremsqir_swap_acc_mult_wellTyped
theorem sqir_swap_acc_mult_wellTyped
    (bits : Nat) (hbits : 1 ≤ bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits) (sqir_swap_acc_mult bits)
theoremsqir_swap_acc_mult_aux_at_mult_out_range
theorem sqir_swap_acc_mult_aux_at_mult_out_range
    (bits k i : Nat) (hk : k ≤ bits) (hi_ge : k ≤ i) (hi_bits : i < bits) (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux bits k) f (sqir_mult_control_idx bits i)
      = f (sqir_mult_control_idx bits i)
**At a multiplier bit `i ≥ k`, swap output = input.**
theoremsqir_swap_acc_mult_aux_at_target_out_range
theorem sqir_swap_acc_mult_aux_at_target_out_range
    (bits k i : Nat) (hk : k ≤ bits) (hi_ge : k ≤ i) (hi_bits : i < bits) (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux bits k) f (sqir_target_idx i)
      = f (sqir_target_idx i)
**At an accumulator bit `i ≥ k`, swap output = input.**
theoremsqir_swap_acc_mult_aux_at_target_in_range
theorem sqir_swap_acc_mult_aux_at_target_in_range
    (bits k i : Nat) (hk : k ≤ bits) (hi_k : i < k) (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux bits k) f (sqir_target_idx i)
      = f (sqir_mult_control_idx bits i)
**At an accumulator bit `i < k`, swap output = input at the matched multiplier position.**
theoremsqir_swap_acc_mult_aux_at_mult_in_range
theorem sqir_swap_acc_mult_aux_at_mult_in_range
    (bits k i : Nat) (hk : k ≤ bits) (hi_k : i < k) (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux bits k) f (sqir_mult_control_idx bits i)
      = f (sqir_target_idx i)
**At a multiplier bit `i < k`, swap output = input at the matched target position.**
theoremsqir_swap_acc_mult_aux_at_other
theorem sqir_swap_acc_mult_aux_at_other
    (bits k q : Nat) (hk : k ≤ bits) (f : Nat → Bool)
    (h_q_not_target : ∀ i, i < k → q ≠ sqir_target_idx i)
    (h_q_not_mult : ∀ i, i < k → q ≠ sqir_mult_control_idx bits i) :
    Gate.applyNat (sqir_swap_acc_mult_aux bits k) f q = f q
**At any position outside the swap range, output = input.**
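The position-wise lemmas above all describe the classical action of a swap on a bit assignment. A minimal sketch for a single swap, assuming a hypothetical `swapAt` (the listing's `qubit_swap` is a `Gate` whose definition is not shown, so this models only its intended action on basis states):

```lean
namespace SwapSketch

-- Hypothetical classical action of swapping positions p and q.
def swapAt (f : Nat → Bool) (p q : Nat) : Nat → Bool :=
  fun j => if j = p then f q else if j = q then f p else f j

-- Reading the first swapped position returns the old value at the second.
theorem swapAt_left (f : Nat → Bool) (p q : Nat) : swapAt f p q p = f q := by
  show (if p = p then f q else if p = q then f p else f p) = f q
  exact if_pos rfl

-- Positions outside {p, q} are unchanged.
theorem swapAt_other (f : Nat → Bool) (p q j : Nat)
    (hp : j ≠ p) (hq : j ≠ q) : swapAt f p q j = f j := by
  show (if j = p then f q else if j = q then f p else f j) = f j
  rw [if_neg hp, if_neg hq]

end SwapSketch
```

The cascade lemmas then follow the same split for each `k`: positions with `i < k` have been swapped, positions with `i ≥ k` and all other positions have not.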
theoremsqir_target_idx_value
theorem sqir_target_idx_value (i : Nat) :
    sqir_target_idx i = 2 + 2 * i + 1
**Sanity helper:** `sqir_target_idx i = 2 + 2*i + 1`.
theoremsqir_swap_acc_mult_apply
theorem sqir_swap_acc_mult_apply
    (bits m acc : Nat) (hbits : 1 ≤ bits)
    (hm : m < 2^bits) (hacc : acc < 2^bits) :
    Gate.applyNat (sqir_swap_acc_mult bits) (sqir_mult_input_F bits m acc)
      = sqir_mult_input_F bits acc m
**Full SWAP correctness on `sqir_mult_input_F`.**
theoremsqir_modmult_inverse_clear_arith
theorem sqir_modmult_inverse_clear_arith
    (N a ainv x : Nat) (hN_pos : 0 < N) (hx : x < N) (h_ainv_le : ainv ≤ N)
    (h_inv : (a * ainv) % N = 1) :
    (x + ((N - ainv) % N) * ((a * x) % N)) % N = 0
**Modular inverse clear arithmetic.** If `(a * ainv) % N = 1`, then `(x + ((N - ainv) % N) * ((a * x) % N)) % N = 0`.
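The identity can be sanity-checked on a small instance. The sketch below assumes only core Lean 4 (no project imports); the values `N = 7`, `a = 3`, `ainv = 5`, `x = 4` are illustrative choices and do not come from the project.

```lean
-- Illustrative values (assumed, not from the project): N = 7, a = 3, ainv = 5.
-- Hypothesis check: (a * ainv) % N = 15 % 7 = 1.
example : (3 * 5) % 7 = 1 := by decide

-- With x = 4: (a * x) % N = 12 % 7 = 5 and (N - ainv) % N = 2,
-- so the left-hand side is (4 + 2 * 5) % 7 = 14 % 7 = 0.
example : (4 + ((7 - 5) % 7) * ((3 * 4) % 7)) % 7 = 0 := by decide
```

Informally, `(N - ainv) % N` plays the role of `-ainv` modulo `N`, so the sum is `x - ainv * a * x`, which is congruent to `x - x = 0`.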
theoremsqir_modmult_inplace_candidate_state_eq
theorem sqir_modmult_inplace_candidate_state_eq
    (bits N a ainv x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1) :
    Gate.applyNat (sqir_modmult_inplace_candidate bits N a ainv) (sqir_mult_input_F bits x 0)
      = sqir_mult_input_F bits ((a * x) % N) 0
**In-place modular multiplier candidate target theorem.** After applying the in-place wrapper to `(x, 0)`, the resulting state is `((a*x) % N, 0)`; i.e., the original "multiplier" register now holds the product, and the accumulator is cleared.
theoremsqir_modmult_inplace_candidate_state_eq_qstart
theorem sqir_modmult_inplace_candidate_state_eq_qstart
    (bits q_start N a ainv x flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat
        (sqir_modmult_inplace_candidate_qstart bits q_start N a ainv flagPos)
        (sqir_mult_input_F_qstart bits q_start x 0)
      = sqir_mult_input_F_qstart bits q_start ((a * x) % N) 0
q_start port of `sqir_modmult_inplace_candidate_state_eq`. After applying the q_start in-place wrapper to `(x, 0)`, the resulting state is `((a*x) % N, 0)` — the original "multiplier" register now holds the product, and the accumulator is cleared.
theoremsqir_modmult_inplace_candidate_target_decode_qstart
theorem sqir_modmult_inplace_candidate_target_decode_qstart
    (bits q_start N a ainv x flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    cuccaro_target_val bits q_start
        (Gate.applyNat
          (sqir_modmult_inplace_candidate_qstart bits q_start N a ainv flagPos)
          (sqir_mult_input_F_qstart bits q_start x 0))
q_start port of `sqir_modmult_inplace_candidate_target_decode` (line 3708): after the in-place wrapper, the decoded target value is `0`.
theoremsqir_modmult_inplace_candidate_mult_bit_qstart
theorem sqir_modmult_inplace_candidate_mult_bit_qstart
    (bits q_start N a ainv x k flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1) (hk : k < bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat
        (sqir_modmult_inplace_candidate_qstart bits q_start N a ainv flagPos)
        (sqir_mult_input_F_qstart bits q_start x 0)
        (sqir_mult_control_idx_qstart bits q_start k)
q_start port of `sqir_modmult_inplace_candidate_mult_bit` (line 3721): the multiplier register decodes bit-by-bit to `((a*x) % N).testBit k`.
theoremsqir_modmult_inplace_candidate_clean_qstart
theorem sqir_modmult_inplace_candidate_clean_qstart
    (bits q_start N a ainv x flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    cuccaro_target_val bits q_start
        (Gate.applyNat
          (sqir_modmult_inplace_candidate_qstart bits q_start N a ainv flagPos)
          (sqir_mult_input_F_qstart bits q_start x 0)) = 0
q_start port of `sqir_modmult_inplace_candidate_clean` (line 3733). The clean bundle restating the in-place state-eq pointwise: `cuccaro_target_val = 0`; `cuccaro_read_val = 0`; every position below `q_start` is `false` (q_start generalisation of the old `flag_0`/`flag_1` conjuncts at positions 0 and 1); top-carry position `q_start + 2*bits` is `false`; multiplier-bit decoding at every `sqir_mult_control_idx_qstart bits q_start k` equals `((a*x) % N).testBit k`.
theoremsqir_modmult_inplace_candidate_target_decode
theorem sqir_modmult_inplace_candidate_target_decode
    (bits N a ainv x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_modmult_inplace_candidate bits N a ainv) (sqir_mult_input_F bits x 0))
      = 0
**In-place modular multiplier candidate, target decoded.**
theoremsqir_modmult_inplace_candidate_mult_bit
theorem sqir_modmult_inplace_candidate_mult_bit
    (bits N a ainv x k : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1) (hk : k < bits) :
    Gate.applyNat (sqir_modmult_inplace_candidate bits N a ainv)
        (sqir_mult_input_F bits x 0) (sqir_mult_control_idx bits k)
      = ((a * x) % N).testBit k
**In-place modular multiplier candidate, multiplier register decoded to `(a*x) % N`.**
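To make the bit-by-bit reading concrete, the following sketch evaluates `Nat.testBit` on one product. It assumes only core Lean 4; the values `a = 3`, `x = 4`, `N = 7` are illustrative and not taken from the project.

```lean
-- Illustrative values (assumed): a = 3, x = 4, N = 7.
-- (a * x) % N = 12 % 7 = 5, which is 0b101 in binary.
#eval (3 * 4) % 7                    -- 5

-- Bit k of the product, as the theorem's right-hand side reads it.
#eval Nat.testBit ((3 * 4) % 7) 0    -- true
#eval Nat.testBit ((3 * 4) % 7) 1    -- false
#eval Nat.testBit ((3 * 4) % 7) 2    -- true
```

With `bits = 3`, the theorem would then say that the three multiplier positions hold `true`, `false`, `true` after the in-place wrapper runs on input `x = 4`.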
theoremsqir_modmult_inplace_candidate_clean
theorem sqir_modmult_inplace_candidate_clean
    (bits N a ainv x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N)
    (h_inv : (a * ainv) % N = 1) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_modmult_inplace_candidate bits N a ainv)
          (sqir_mult_input_F bits x 0)) = 0
    ∧ cuccaro_read_val bits 2
        (Gate.applyNat (sqir_modmult_inplace_candidate bits N a ainv)
          (sqir_mult_input_F bits x 0)) = 0
    ∧ Gate.applyNat (sqir_modmult_inplace_candidate bits N a ainv)
**In-place modular multiplier, clean bundle.**
theoremsqir_mult_input_F_shifted_below_bits
theorem sqir_mult_input_F_shifted_below_bits
    (bits x acc q : Nat) (hq : q < bits) :
    sqir_mult_input_F_shifted bits x acc q = false
theoremsqir_mult_input_F_shifted_above_bits
theorem sqir_mult_input_F_shifted_above_bits
    (bits x acc q : Nat) (hq : bits ≤ q) :
    sqir_mult_input_F_shifted bits x acc q
      = sqir_mult_input_F bits x acc (q - bits)
theoremsqir_mult_input_F_shifted_at_shifted_control_bit
theorem sqir_mult_input_F_shifted_at_shifted_control_bit
    (bits x acc k : Nat) (hk : k < bits) :
    sqir_mult_input_F_shifted bits x acc (bits + sqir_mult_control_idx bits k)
      = x.testBit k
theoremGate.shift_seq
theorem Gate.shift_seq (off : Nat) (g h : Gate) :
    Gate.shift off (Gate.seq g h)
      = Gate.seq (Gate.shift off g) (Gate.shift off h)
theoremGate.applyNat_shift_at_lo
theorem Gate.applyNat_shift_at_lo
    (off : Nat) (g : Gate) (f : Nat → Bool) (q : Nat) (hq : q < off) :
    Gate.applyNat (Gate.shift off g) f q = f q
**At positions below `off`, a shifted gate acts as identity.**
theoremGate.applyNat_shift_at_hi
theorem Gate.applyNat_shift_at_hi
    (off : Nat) (g : Gate) (f : Nat → Bool) (q : Nat) (hq : off ≤ q) :
    Gate.applyNat (Gate.shift off g) f q
      = Gate.applyNat g (fun r => f (off + r)) (q - off)
**At positions ≥ `off`, a shifted gate acts as the original gate on the function `r ↦ f (off + r)`.**
theoremGate.shift_wellTyped
theorem Gate.shift_wellTyped
    {off dim : Nat} {g : Gate} (h : Gate.WellTyped dim g) :
    Gate.WellTyped (off + dim) (Gate.shift off g)
**`Gate.shift` is WellTyped at the larger dimension.**
theoremsqir_encode_to_mult_adapter_disjoint
theorem sqir_encode_to_mult_adapter_disjoint (bits : Nat) :
    0 + bits ≤ bits + sqir_mult_control_idx bits 0
      ∨ bits + sqir_mult_control_idx bits 0 + bits ≤ 0
Disjointness of swap ranges (used in `reverse_register_swap` lemmas).
theoremsqir_encode_to_mult_adapter_wellTyped
theorem sqir_encode_to_mult_adapter_wellTyped
    (bits : Nat) (hbits : 1 ≤ bits) :
    Gate.WellTyped (sqir_total_dim bits) (sqir_encode_to_mult_adapter bits)
theoremcuccaro_input_F_zero_at_workspace
theorem cuccaro_input_F_zero_at_workspace
    (q : Nat) (hq : q < 2 + 2 * (0 : Nat) + 1 ∨ True) :
    cuccaro_input_F 2 false 0 0 q = false
Helper: workspace value of `cuccaro_input_F 2 false 0 0` is always false.
theoremsqir_encode_to_mult_adapter_correct
theorem sqir_encode_to_mult_adapter_correct
    (bits x : Nat) (hbits : 1 ≤ bits) (hx : x < 2^bits) :
    Gate.applyNat (sqir_encode_to_mult_adapter bits)
        (encodeDataZeroAnc bits (sqir_modmult_rev_anc bits) x)
      = sqir_mult_input_F_shifted bits x 0
**Adapter correctness: `encodeDataZeroAnc → sqir_mult_input_F_shifted`.**
theoremreverse_register_swap_involution_general
theorem reverse_register_swap_involution_general
    (n offsetA offsetB : Nat)
    (h_disjoint : offsetA + n ≤ offsetB ∨ offsetB + n ≤ offsetA)
    (f : Nat → Bool) :
    Gate.applyNat (reverse_register_swap n offsetA offsetB)
      (Gate.applyNat (reverse_register_swap n offsetA offsetB) f)
    = f
**General reverse-register-swap involution.** Applying `reverse_register_swap n offsetA offsetB` twice yields identity, given disjoint ranges.
theoremsqir_encode_to_mult_adapter_involution
theorem sqir_encode_to_mult_adapter_involution
    (bits : Nat) (f : Nat → Bool) :
    Gate.applyNat (sqir_encode_to_mult_adapter bits)
      (Gate.applyNat (sqir_encode_to_mult_adapter bits) f) = f
**Adapter is self-inverse.**
theoremsqir_encode_to_mult_adapter_reverse
theorem sqir_encode_to_mult_adapter_reverse
    (bits y : Nat) (hbits : 1 ≤ bits) (hy : y < 2^bits) :
    Gate.applyNat (sqir_encode_to_mult_adapter bits)
        (sqir_mult_input_F_shifted bits y 0)
      = encodeDataZeroAnc bits (sqir_modmult_rev_anc bits) y
**Adapter reverse direction: `sqir_mult_input_F_shifted → encodeDataZeroAnc`.**
theoremsqir_modmult_inplace_shifted_correct
theorem sqir_modmult_inplace_shifted_correct
    (bits N a ainv x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N) (h_inv : (a * ainv) % N = 1) :
    Gate.applyNat (sqir_modmult_inplace_shifted bits N a ainv)
        (sqir_mult_input_F_shifted bits x 0)
      = sqir_mult_input_F_shifted bits ((a * x) % N) 0
**Shifted in-place multiplier correctness.**
theoremsqir_modmult_inplace_shifted_wellTyped
theorem sqir_modmult_inplace_shifted_wellTyped
    (bits N a ainv : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits) :
    Gate.WellTyped (sqir_total_dim bits) (sqir_modmult_inplace_shifted bits N a ainv)
theoremsqir_modmult_MCP_gate_apply_encode
theorem sqir_modmult_MCP_gate_apply_encode
    (bits N a ainv x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (hx : x < N) (h_inv : (a * ainv) % N = 1) :
    Gate.applyNat (sqir_modmult_MCP_gate bits N a ainv)
        (encodeDataZeroAnc bits (sqir_modmult_rev_anc bits) x)
      = encodeDataZeroAnc bits (sqir_modmult_rev_anc bits) ((a * x) % N)
**MCP-layout gate apply theorem.** The composed gate maps `encodeDataZeroAnc bits anc x` to `encodeDataZeroAnc bits anc ((a*x) % N)`.
theoremsqir_modmult_MCP_gate_wellTyped
theorem sqir_modmult_MCP_gate_wellTyped
    (bits N a ainv : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits) :
    Gate.WellTyped (sqir_total_dim bits) (sqir_modmult_MCP_gate bits N a ainv)
**MCP-layout gate WellTyped.**
theoremsqir_modmult_MCP_gate_satisfies_MultiplyCircuitProperty
theorem sqir_modmult_MCP_gate_satisfies_MultiplyCircuitProperty
    (bits N a ainv : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (h_ainv_le : ainv ≤ N) (h_inv : (a * ainv) % N = 1) :
    FormalRV.SQIRPort.MultiplyCircuitProperty a N bits (sqir_modmult_rev_anc bits)
      (Gate.toUCom (sqir_total_dim bits) (sqir_modmult_MCP_gate bits N a ainv))
**HEADLINE: MCP-layout gate satisfies `MultiplyCircuitProperty`.**
theorempow_iter_inverse_mod
theorem pow_iter_inverse_mod
    (a ainv N i : Nat) (hN_ge_2 : 2 ≤ N) (h_inv : a * ainv % N = 1) :
    ((a^(2^i)) % N) * ((ainv^(2^i)) % N) % N = 1
**Per-iterate modular inverse arithmetic.** If `(a * ainv) % N = 1` and `N ≥ 2`, then for every `i`, `((a^(2^i)) % N) * ((ainv^(2^i)) % N) % N = 1`.
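A few concrete iterates illustrate the statement. This is a sketch in core Lean 4 with illustrative values `a = 3`, `ainv = 5`, `N = 7` (assumed, not from the project); it checks `i = 0, 1, 2` only and is not a substitute for the general proof.

```lean
-- Illustrative values (assumed): a = 3, ainv = 5, N = 7, so a * ainv % N = 1.

-- i = 0: (3 % 7) * (5 % 7) % 7 = 15 % 7 = 1
example : ((3 ^ (2 ^ 0)) % 7) * ((5 ^ (2 ^ 0)) % 7) % 7 = 1 := by decide

-- i = 1: (9 % 7) * (25 % 7) % 7 = 2 * 4 % 7 = 1
example : ((3 ^ (2 ^ 1)) % 7) * ((5 ^ (2 ^ 1)) % 7) % 7 = 1 := by decide

-- i = 2: (81 % 7) * (625 % 7) % 7 = 4 * 2 % 7 = 1
example : ((3 ^ (2 ^ 2)) % 7) * ((5 ^ (2 ^ 2)) % 7) % 7 = 1 := by decide
```

Each squaring step squares both `a` and `ainv`, so their product stays congruent to `1` modulo `N`.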
theoremMultiplyCircuitProperty_of_mod
theorem MultiplyCircuitProperty_of_mod
    {c N n anc : Nat} {U : FormalRV.Framework.BaseUCom (n + anc)}
    (hN_pos : 0 < N) (h_modN : FormalRV.SQIRPort.MultiplyCircuitProperty (c % N) N n anc U) :
    FormalRV.SQIRPort.MultiplyCircuitProperty c N n anc U
**MCP up-to-mod lifting.** If a unitary satisfies `MultiplyCircuitProperty (c % N)`, then it also satisfies `MultiplyCircuitProperty c` (since `(c * x) % N = ((c % N) * x) % N`).
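The arithmetic fact behind the lifting can be checked on concrete numbers. The sketch below uses core Lean 4 only; the values of `c`, `x`, and `N` are illustrative assumptions.

```lean
-- Illustrative values (assumed): c = 10, x = 4, N = 7.
-- Left: 40 % 7 = 5.  Right: (3 * 4) % 7 = 12 % 7 = 5.
example : (10 * 4) % 7 = ((10 % 7) * 4) % 7 := by decide

-- A second instance: c = 23, x = 6, N = 7.
-- Left: 138 % 7 = 5.  Right: (2 * 6) % 7 = 12 % 7 = 5.
example : (23 * 6) % 7 = ((23 % 7) * 6) % 7 := by decide
```

Because the multiplier only ever enters through `(c * x) % N`, reducing `c` modulo `N` first does not change which basis state the circuit is required to produce.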
theoremf_modmult_circuit_verified_per_iterate
theorem f_modmult_circuit_verified_per_iterate
    (a ainv N n i : Nat) (hN_ge_2 : 2 ≤ N) (hN : N ≤ 2^(n + 1)) (hN2 : 2 * N ≤ 2^(n + 1))
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.MultiplyCircuitProperty
      (a^(2^i)) N (n + 1) (sqir_modmult_rev_anc (n + 1))
      (f_modmult_circuit_verified a ainv N n i)
**Per-iterate `MultiplyCircuitProperty` for the verified family.**
theoremf_modmult_circuit_verified_MMI
theorem f_modmult_circuit_verified_MMI
    (a ainv N n : Nat) (hN_ge_2 : 2 ≤ N) (hN : N ≤ 2^(n + 1)) (hN2 : 2 * N ≤ 2^(n + 1))
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.ModMulImpl a N (n + 1) (sqir_modmult_rev_anc (n + 1))
      (f_modmult_circuit_verified a ainv N n)
**`ModMulImpl` for the verified family.**
theoremf_modmult_circuit_verified_uc_well_typed
theorem f_modmult_circuit_verified_uc_well_typed
    (a ainv N n : Nat) (hN_pos : 0 < N) (hN : N ≤ 2^(n + 1)) (hN2 : 2 * N ≤ 2^(n + 1)) :
    ∀ i, FormalRV.SQIRPort.uc_well_typed (f_modmult_circuit_verified a ainv N n i)
**`uc_well_typed` for every iterate of the verified family.**
theoremf_modmult_circuit_verified_MMI_from_BasicSetting
theorem f_modmult_circuit_verified_MMI_from_BasicSetting
    (a r N m n ainv : Nat) (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n)
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.ModMulImpl a N (n + 1) (sqir_modmult_rev_anc (n + 1))
      (f_modmult_circuit_verified a ainv N n)
**`ModMulImpl` from `BasicSetting`** (n+1 dimension).
theoremf_modmult_circuit_verified_uc_well_typed_from_BasicSetting
theorem f_modmult_circuit_verified_uc_well_typed_from_BasicSetting
    (a r N m n ainv : Nat) (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n) :
    ∀ i, FormalRV.SQIRPort.uc_well_typed (f_modmult_circuit_verified a ainv N n i)
**`uc_well_typed` from `BasicSetting`**.
theoremf_modmult_circuit_verified_bits_MMI
theorem f_modmult_circuit_verified_bits_MMI
    (a ainv N bits : Nat) (hbits : 1 ≤ bits) (hN_ge_2 : 2 ≤ N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits) (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.ModMulImpl a N bits (sqir_modmult_rev_anc bits)
      (f_modmult_circuit_verified_bits a ainv N bits)
**MMI for the bits-parameterized family.**
theoremf_modmult_circuit_verified_bits_uc_well_typed
theorem f_modmult_circuit_verified_bits_uc_well_typed
    (a ainv N bits : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits) :
    ∀ i, FormalRV.SQIRPort.uc_well_typed (f_modmult_circuit_verified_bits a ainv N bits i)
**`uc_well_typed` for the bits-parameterized family.**
theoremShor_correct_with_sqir_verified_modmult_bits
theorem Shor_correct_with_sqir_verified_modmult_bits
    (a r N m bits ainv : Nat) (hbits : 1 ≤ bits)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m bits)
    (hN2 : 2 * N ≤ 2^bits)
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.probability_of_success a r N m bits
      (sqir_modmult_rev_anc bits)
      (f_modmult_circuit_verified_bits a ainv N bits)
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
**Verified Shor probability bound, bits-parameterized.** If the user provides `BasicSetting a r N m bits` (which is generally INCOMPATIBLE with our sizing requirement `2 * N ≤ 2^bits`; see the documentation block above), the Shor success-probability bound holds for the verified family at dimension `bits + sqir_modmult_rev_anc bits`. Both hypotheses can be satisfied simultaneously ONLY when `2 * N = 2^bits` (i.e., `N` is a power of 2), because `BasicSetting` also requires `2^bits ≤ 2 * N`. For general `N` the hypotheses are contradictory and this theorem is vacuous; see Status D in PROGRESS.md / Tick 80 commit.
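The reason the two hypotheses pin `N` down is plain antisymmetry of `≤`. The sketch below isolates that step in core Lean 4; it assumes, following the docstrings in this project, that `BasicSetting` supplies the upper bound `2^bits ≤ 2 * N`. The hypothesis names are illustrative.

```lean
-- Sketch only: `h_upper` stands for the bound assumed to come from `BasicSetting`,
-- `h_sizing` for the verified circuit's sizing requirement.
example (N bits : Nat)
    (h_upper : 2 ^ bits ≤ 2 * N)
    (h_sizing : 2 * N ≤ 2 ^ bits) :
    2 ^ bits = 2 * N :=
  Nat.le_antisymm h_upper h_sizing
```

So any `N` admitted by both hypotheses satisfies `2 * N = 2^bits`, which rules out every modulus that is not a power of 2.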

FormalRV.Arithmetic.SQIRModMult.SQIRModMultBasicSettingProofs

FormalRV/Arithmetic/SQIRModMult/SQIRModMultBasicSettingProofs.lean
theoremBasicSettingRelaxed_of_BasicSetting
theorem BasicSettingRelaxed_of_BasicSetting
    {a r N m n : Nat} (h : FormalRV.SQIRPort.BasicSetting a r N m n) :
    BasicSettingRelaxed a r N m n
`BasicSetting → BasicSettingRelaxed` (drops the upper-bound conjunct).
theoremVerifiedCircuitSizing_canonical_pow2_succ
theorem VerifiedCircuitSizing_canonical_pow2_succ
    (N : Nat) (hN : 0 < N) :
    VerifiedCircuitSizing N (Nat.log2 (2 * N) + 1)
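**Canonical sizing**: `bits = Nat.log2 N + 1` gives `2*N ≤ 2^bits` only when `N` is an exact power of 2, since `2^(Nat.log2 N + 1) = 2 * 2^(Nat.log2 N) ≤ 2*N` for `N > 0`; we therefore use `Nat.log2 (2*N) + 1` as a generic choice, for which `2*N < 2^bits` holds for every `N > 0`.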
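A quick numeric comparison of the two candidate choices of `bits` may help. The sketch uses core Lean 4 `#eval` only and assumes that the sizing condition in question is `2 * N ≤ 2^bits`; the sample moduli `15` and `16` are illustrative.

```lean
-- N = 15 (not a power of 2).
#eval Nat.log2 15 + 1                                   -- 4
#eval decide (2 * 15 ≤ 2 ^ (Nat.log2 15 + 1))           -- false (30 > 16)
#eval Nat.log2 (2 * 15) + 1                             -- 5
#eval decide (2 * 15 ≤ 2 ^ (Nat.log2 (2 * 15) + 1))     -- true  (30 ≤ 32)

-- N = 16 (a power of 2): the smaller choice already suffices.
#eval decide (2 * 16 ≤ 2 ^ (Nat.log2 16 + 1))           -- true  (32 ≤ 32)
```

The generic choice `Nat.log2 (2*N) + 1` can spend one more bit than strictly necessary, in exchange for a sizing lemma with no side condition beyond `0 < N`.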
theorems_closest_ub_relaxed
theorem s_closest_ub_relaxed (a r N m n k : Nat)
    (h_basic : BasicSettingRelaxed a r N m n) (h_k_lt : k < r) :
    FormalRV.SQIRPort.s_closest m k r < 2^m
**Relaxed `s_closest_ub`.**
theorems_closest_injective_relaxed
theorem s_closest_injective_relaxed
    (a r N m n : Nat) (h_basic : BasicSettingRelaxed a r N m n) :
    ∀ i j : Nat, i < r → j < r →
      FormalRV.SQIRPort.s_closest m i r = FormalRV.SQIRPort.s_closest m j r → i = j
**Relaxed `s_closest_injective`**: same proof as the original, adjusted for the relaxed hypothesis.
theoremr_found_1_relaxed_with_bound
theorem r_found_1_relaxed_with_bound
    (a r N m n k : Nat) (h_basic_r : BasicSettingRelaxed a r N m n)
    (h_2n_bound : 2 ^ n ≤ 2 * N) (h_k_lt : k < r) (h_coprime : Nat.gcd k r = 1) :
    FormalRV.SQIRPort.r_found (FormalRV.SQIRPort.s_closest m k r) m r a N = 1
**Relaxed `r_found_1` (with bound)**: The existing `r_found_1` is stated for `BasicSetting`, which requires `2^n ≤ 2*N`, although its proof chain discards the n-bound throughout. Rather than re-proving the entire chain, this lemma takes the bound as an explicit extra hypothesis (`h_2n_bound : 2^n ≤ 2*N`), reconstructs `BasicSetting` from `BasicSettingRelaxed` plus that bound, and falls through to the existing `r_found_1`. Call sites that cannot supply the bound should not use this lemma. For the SQIR `Shor_correct_var` chain the bound IS available (since `BasicSetting` holds). For our verified family with `bits = n + 1` the bound is NOT available; we sidestep this by using the existing `Shor_correct_var` at `n = bits`, where `BasicSetting` also holds, which is the route taken in Tick 80.
theoremShor_correct_var_relaxed_with_bound
theorem Shor_correct_var_relaxed_with_bound
    (a r N m n anc : Nat) (u : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (h_basic_r : BasicSettingRelaxed a r N m n)
    (h_2n_bound : 2 ^ n ≤ 2 * N)
    (h_modmul : FormalRV.SQIRPort.ModMulImpl a N n anc u)
    (h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (u i)) :
    FormalRV.SQIRPort.probability_of_success a r N m n anc u
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
**Relaxed `Shor_correct_var` (with bound)**: takes the upper bound explicitly so that the proof obligations are visible.
theoremShor_correct_with_sqir_verified_modmult_relaxed
theorem Shor_correct_with_sqir_verified_modmult_relaxed
    (a r N m bits ainv : Nat)
    (h_basic_r : BasicSettingRelaxed a r N m bits)
    (h_bits : VerifiedCircuitSizing N bits)
    (h_2n_bound : 2 ^ bits ≤ 2 * N)
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.probability_of_success a r N m bits
      (sqir_modmult_rev_anc bits)
      (f_modmult_circuit_verified_bits a ainv N bits)
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
theoremBasicSetting_at_canonical_n_of_BasicSettingRelaxed
theorem BasicSetting_at_canonical_n_of_BasicSettingRelaxed
    (a r N m bits : Nat) (h_basic_r : BasicSettingRelaxed a r N m bits) :
    FormalRV.SQIRPort.BasicSetting a r N m (Nat.log2 (2 * N))
**Canonical-n bridge**: From `BasicSettingRelaxed` at any `bits`, we can construct `BasicSetting` at `n_canonical = Nat.log2 (2*N)`.
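The bridge relies on `n = Nat.log2 (2*N)` landing in the window `N < 2^n ≤ 2*N`. That window is an assumption about the shape of `BasicSetting`'s size conjuncts, inferred from the accessor lemmas and docstrings in this module. A numeric spot check in core Lean 4, with illustrative moduli:

```lean
-- N = 15: Nat.log2 30 = 4, and 15 < 16 ≤ 30.
#eval Nat.log2 (2 * 15)                                                       -- 4
#eval decide (15 < 2 ^ Nat.log2 (2 * 15) ∧ 2 ^ Nat.log2 (2 * 15) ≤ 2 * 15)    -- true

-- N = 21: Nat.log2 42 = 5, and 21 < 32 ≤ 42.
#eval Nat.log2 (2 * 21)                                                       -- 5
#eval decide (21 < 2 ^ Nat.log2 (2 * 21) ∧ 2 ^ Nat.log2 (2 * 21) ≤ 2 * 21)    -- true
```

Since `2^(Nat.log2 k) ≤ k < 2^(Nat.log2 k + 1)` for `k > 0`, taking `k = 2*N` gives both halves of the window at once.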
theoremr_found_1_relaxed
theorem r_found_1_relaxed (a r N m bits k : Nat)
    (h_basic_r : BasicSettingRelaxed a r N m bits)
    (h_k_lt : k < r) (h_coprime : Nat.gcd k r = 1) :
    FormalRV.SQIRPort.r_found (FormalRV.SQIRPort.s_closest m k r) m r a N = 1
**Relaxed `r_found_1`**: same conclusion as the original, but hypothesis weakened to `BasicSettingRelaxed`. Uses the canonical-n bridge since the conclusion `r_found (...) = 1` does not mention `n`.
theoremBasicSettingRelaxed_a_pos
theorem BasicSettingRelaxed_a_pos
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) : 0 < a
theoremBasicSettingRelaxed_a_lt
theorem BasicSettingRelaxed_a_lt
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) : a < N
theoremBasicSettingRelaxed_order
theorem BasicSettingRelaxed_order
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) :
    FormalRV.SQIRPort.Order a r N
theoremBasicSettingRelaxed_Nsq_lt
theorem BasicSettingRelaxed_Nsq_lt
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) : N^2 < 2^m
theoremBasicSettingRelaxed_pow_le_2Nsq
theorem BasicSettingRelaxed_pow_le_2Nsq
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) : 2^m ≤ 2 * N^2
theoremBasicSettingRelaxed_N_lt_pow_n
theorem BasicSettingRelaxed_N_lt_pow_n
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) : N < 2^n
theoremBasicSettingRelaxed_N_le_pow_n
theorem BasicSettingRelaxed_N_le_pow_n
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) : N ≤ 2^n
theoremBasicSettingRelaxed_N_pos
theorem BasicSettingRelaxed_N_pos
    {a r N m n : Nat} (h : BasicSettingRelaxed a r N m n) : 0 < N
theoremqpe_semantics_measurement_eq_from_lsb_relaxed
theorem qpe_semantics_measurement_eq_from_lsb_relaxed
    (a r N m n anc k : Nat)
    (f : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (h_basic_r : BasicSettingRelaxed a r N m n)
    (h_modmul : FormalRV.SQIRPort.ModMulImpl a N n anc f)
    (h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (f i)) :
    FormalRV.SQIRPort.prob_partial_meas
        (FormalRV.Framework.basis_vector (2^m)
          (FormalRV.SQIRPort.s_closest m k r))
        (FormalRV.SQIRPort.Shor_final_state m n anc f)
    = FormalRV.SQIRPort.prob_partial_meas
        (FormalRV.Framework.basis_vector (2^m)
theoremQPE_MMI_correct_from_Shor_orbit_state_relaxed
theorem QPE_MMI_correct_from_Shor_orbit_state_relaxed
    (a r N m n anc k : Nat)
    (f : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (β : Fin r → Matrix (Fin (2^(n + anc))) (Fin 1) ℂ)
    (h_basic_r : BasicSettingRelaxed a r N m n)
    (_h_mmi : FormalRV.SQIRPort.ModMulImpl a N n anc f)
    (_h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (f i))
    (h_k_lt : k < r)
    (h_orth : ∀ j j' : Fin r,
       ∑ y : Fin (2^(n + anc)), starRingEnd ℂ ((β j') y 0) * (β j) y 0
       = if j = j' then (1 : ℂ) else 0)
    (actual_state : Matrix (Fin (2^(m + (n + anc)))) (Fin 1) ℂ)
theoremQPE_MMI_correct_assuming_orbit_factorization_relaxed
theorem QPE_MMI_correct_assuming_orbit_factorization_relaxed
    (a r N m n anc k : Nat)
    (f : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (h_basic_r : BasicSettingRelaxed a r N m n)
    (h_mmi : FormalRV.SQIRPort.ModMulImpl a N n anc f)
    (h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (f i))
    (h_k_lt : k < r)
    (h_orbit_exists :
        ∃ (β : Fin r → Matrix (Fin (2^(n + anc))) (Fin 1) ℂ)
          (actual_state : Matrix (Fin (2^(m + (n + anc)))) (Fin 1) ℂ),
          ((∀ j j' : Fin r,
             ∑ y : Fin (2^(n + anc)),
theoremQPE_MMI_correct_modulo_qpe_semantics_relaxed
theorem QPE_MMI_correct_modulo_qpe_semantics_relaxed
    (a r N m n anc k : Nat)
    (f : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (h_basic_r : BasicSettingRelaxed a r N m n)
    (h_mmi : FormalRV.SQIRPort.ModMulImpl a N n anc f)
    (h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (f i))
    (h_k_lt : k < r)
    (h_qpe_semantics :
      FormalRV.SQIRPort.prob_partial_meas
          (FormalRV.Framework.basis_vector (2^m)
            (FormalRV.SQIRPort.s_closest m k r))
          (FormalRV.SQIRPort.Shor_final_state m n anc f)
theoremQPE_MMI_correct_relaxed
theorem QPE_MMI_correct_relaxed
    (a r N m n anc k : Nat)
    (f : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (h_basic_r : BasicSettingRelaxed a r N m n)
    (h_mmi : FormalRV.SQIRPort.ModMulImpl a N n anc f)
    (h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (f i))
    (h_k_lt : k < r) :
    FormalRV.SQIRPort.prob_partial_meas
        (FormalRV.Framework.basis_vector (2^m)
          (FormalRV.SQIRPort.s_closest m k r))
        (FormalRV.SQIRPort.Shor_final_state m n anc f)
      ≥ 4 / (Real.pi^2 * (r : ℝ))
theoremShor_correct_var_relaxed
theorem Shor_correct_var_relaxed
    (a r N m n anc : Nat) (u : Nat → FormalRV.SQIRPort.BaseUCom (n + anc))
    (h_basic_r : BasicSettingRelaxed a r N m n)
    (h_modmul : FormalRV.SQIRPort.ModMulImpl a N n anc u)
    (h_wt : ∀ i, i < m → FormalRV.SQIRPort.uc_well_typed (u i)) :
    FormalRV.SQIRPort.probability_of_success a r N m n anc u
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
theoremShor_correct_with_sqir_verified_modmult_usable
theorem Shor_correct_with_sqir_verified_modmult_usable
    (a r N m bits ainv : Nat)
    (h_basic_r : BasicSettingRelaxed a r N m bits)
    (h_sizing : VerifiedCircuitSizing N bits)
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.probability_of_success a r N m bits
      (sqir_modmult_rev_anc bits)
      (f_modmult_circuit_verified_bits a ainv N bits)
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
**Fully usable verified SQIR Shor theorem**: no residual upper bound on `2^bits` from `BasicSetting`.
theoremShor_correct_with_sqir_verified_modmult_canonical_bits
theorem Shor_correct_with_sqir_verified_modmult_canonical_bits
    (a r N m ainv : Nat)
    (h_basic_r : BasicSettingRelaxed a r N m (Nat.log2 (2*N) + 1))
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.probability_of_success a r N m (Nat.log2 (2*N) + 1)
      (sqir_modmult_rev_anc (Nat.log2 (2*N) + 1))
      (f_modmult_circuit_verified_bits a ainv N (Nat.log2 (2*N) + 1))
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
**Canonical-bits corollary**: bits = `Nat.log2 (2*N) + 1`.
theoremShor_correct_verified_no_modmult_axioms
theorem Shor_correct_verified_no_modmult_axioms
    (a r N m ainv : Nat)
    (h_basic_r : BasicSettingRelaxed a r N m (Nat.log2 (2*N) + 1))
    (h_inv : a * ainv % N = 1) :
    FormalRV.SQIRPort.probability_of_success a r N m (Nat.log2 (2*N) + 1)
      (sqir_modmult_rev_anc (Nat.log2 (2*N) + 1))
      (f_modmult_circuit_verified_bits a ainv N (Nat.log2 (2*N) + 1))
      ≥ FormalRV.SQIRPort.κ / (Nat.log2 N : ℝ)^4
**Verified Shor's algorithm correctness theorem (no placeholder axioms).** Alias for `Shor_correct_with_sqir_verified_modmult_canonical_bits` under a name that signals its axiom-free status.

FormalRV.Arithmetic.SQIRModMult.SQIRModMultBitPositioning

FormalRV/Arithmetic/SQIRModMult/SQIRModMultBitPositioning.lean
theoremsqir_mult_control_idx_outside_modadd_workspace
theorem sqir_mult_control_idx_outside_modadd_workspace
    (bits j : Nat) :
    sqir_mult_control_idx bits j < 2
      ∨ 2 + (2 * bits + 1) ≤ sqir_mult_control_idx bits j
theoremsqir_mult_control_idx_ne_flag
theorem sqir_mult_control_idx_ne_flag
    (bits j : Nat) :
    sqir_mult_control_idx bits j ≠ 1
theoremsqir_mult_control_idx_ne_top_carry
theorem sqir_mult_control_idx_ne_top_carry
    (bits j : Nat) :
    sqir_mult_control_idx bits j ≠ 2 + 2 * bits
theoremsqir_mult_control_idx_lt_sqir_dim
theorem sqir_mult_control_idx_lt_sqir_dim
    (bits j : Nat) (hj : j < bits) :
    sqir_mult_control_idx bits j < sqir_modmult_rev_anc bits
theoremsqir_mult_control_idx_outside_modadd_workspace_form
theorem sqir_mult_control_idx_outside_modadd_workspace_form
    (bits j : Nat) :
    sqir_mult_control_idx bits j < 2
      ∨ 2 + 2 * bits + 1 ≤ sqir_mult_control_idx bits j
theoremsqir_mult_control_idx_injective
theorem sqir_mult_control_idx_injective
    (bits j j' : Nat) (h : sqir_mult_control_idx bits j = sqir_mult_control_idx bits j') :
    j = j'
Distinct multiplier bits map to distinct positions.
theoremsqir_mult_input_control_bit
theorem sqir_mult_input_control_bit
    (bits m acc j : Nat) (hj : j < bits) :
    sqir_mult_input_F bits m acc (sqir_mult_control_idx bits j) = m.testBit j
**Multiplier bit at `sqir_mult_control_idx bits j` is `m.testBit j`.**
theoremsqir_mult_input_target_decode
theorem sqir_mult_input_target_decode
    (bits m acc : Nat) (hacc : acc < 2 ^ bits) :
    cuccaro_target_val bits 2 (sqir_mult_input_F bits m acc) = acc
**Decoded target register equals `acc` (for `acc < 2^bits`).**
theoremsqir_mult_input_read_decode
theorem sqir_mult_input_read_decode
    (bits m acc : Nat) :
    cuccaro_read_val bits 2 (sqir_mult_input_F bits m acc) = 0
**Decoded read register is 0.**
theoremsqir_mult_input_flag_0_false
theorem sqir_mult_input_flag_0_false
    (bits m acc : Nat) :
    sqir_mult_input_F bits m acc 0 = false
**Flag bits are false.**
theoremsqir_mult_input_flag_1_false
theorem sqir_mult_input_flag_1_false
    (bits m acc : Nat) :
    sqir_mult_input_F bits m acc 1 = false
theoremsqir_mult_input_top_carry_false
theorem sqir_mult_input_top_carry_false
    (bits m acc : Nat) (hbits : 1 ≤ bits) :
    sqir_mult_input_F bits m acc (2 + 2 * bits) = false
**Top carry is false (when bits ≥ 1).** The top carry position `2 + 2*bits = 2 + 2*(bits-1) + 2` is the highest read register bit in the Cuccaro encoding with `a = 0`.
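The index identity in the docstring depends on `bits ≥ 1` because natural-number subtraction truncates at zero. A minimal sketch, assuming a Lean 4 toolchain recent enough to ship the `omega` tactic without extra imports:

```lean
-- For bits ≥ 1 the two ways of writing the top-carry position agree.
example (bits : Nat) (hbits : 1 ≤ bits) :
    2 + 2 * bits = 2 + 2 * (bits - 1) + 2 := by
  omega

-- At bits = 0 they differ: 0 - 1 = 0 in Nat, so the right-hand side is 4, not 2.
example : 2 + 2 * 0 ≠ 2 + 2 * (0 - 1) + 2 := by decide
```

This is consistent with the theorem carrying the hypothesis `hbits : 1 ≤ bits`.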
theoremsqir_modmult_acc_spec_zero
theorem sqir_modmult_acc_spec_zero (N a m : Nat) :
    sqir_modmult_acc_spec N a m 0 = 0
theoremsqir_modmult_acc_spec_succ_false
theorem sqir_modmult_acc_spec_succ_false
    (N a m k : Nat) (h : m.testBit k = false) :
    sqir_modmult_acc_spec N a m (k + 1) = sqir_modmult_acc_spec N a m k
theoremsqir_modmult_acc_spec_succ_true
theorem sqir_modmult_acc_spec_succ_true
    (N a m k : Nat) (h : m.testBit k = true) :
    sqir_modmult_acc_spec N a m (k + 1)
      = (sqir_modmult_acc_spec N a m k + (a * 2 ^ k) % N) % N
theoremsqir_modmult_acc_spec_lt
theorem sqir_modmult_acc_spec_lt (N a m k : Nat) (hN_pos : 0 < N) :
    sqir_modmult_acc_spec N a m k < N
For `0 < N`, the accumulator after any prefix is in `[0, N)`.
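The three recurrence lemmas above determine the accumulator spec completely. The sketch below rebuilds a function with exactly those recurrences; `accSpec` is a hypothetical stand-in written for illustration, not the project's definition of `sqir_modmult_acc_spec`, and the sample inputs are assumptions.

```lean
-- Hypothetical reconstruction from the `_zero`, `_succ_false`, `_succ_true` lemmas:
-- scan the multiplier `m` bit by bit, adding `(a * 2^k) % N` whenever bit `k` is set.
def accSpec (N a m : Nat) : Nat → Nat
  | 0 => 0
  | k + 1 =>
    if m.testBit k then (accSpec N a m k + (a * 2 ^ k) % N) % N
    else accSpec N a m k

-- N = 7, a = 3, m = 4 (0b100): only bit 2 contributes, (3 * 4) % 7 = 5.
#eval accSpec 7 3 4 3    -- 5
#eval (3 * 4) % 7        -- 5

-- N = 7, a = 3, m = 6 (0b110): bit 1 gives 6, bit 2 gives (6 + 5) % 7 = 4.
#eval accSpec 7 3 6 3    -- 4
#eval (3 * 6) % 7        -- 4
```

In both samples the value after processing all three bits matches `(a * m) % N`, which is the behaviour the prefix-gate lemmas are building toward. Every `true` branch ends in `% N`, which is why the result stays below `N` whenever `0 < N`.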
theoremsqir_modmult_prefix_gate_zero_eq_I
theorem sqir_modmult_prefix_gate_zero_eq_I
    (bits N a : Nat) :
    sqir_modmult_prefix_gate bits N a 0 = Gate.I
theoremsqir_modmult_prefix_gate_succ_eq
theorem sqir_modmult_prefix_gate_succ_eq
    (bits N a k : Nat) :
    sqir_modmult_prefix_gate bits N a (k + 1)
      = seq (sqir_modmult_prefix_gate bits N a k) (sqir_modmult_step_gate bits N a k)
theoremsqir_controlledCompareConst_commute_update_outside_fun
theorem sqir_controlledCompareConst_commute_update_outside_fun
    (bits q_start c controlIdx flagPos updateIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hupdate_out : updateIdx < q_start ∨ q_start + 2 * bits + 1 ≤ updateIdx)
    (hupdate_ne_flag : updateIdx ≠ flagPos)
    (hupdate_ne_ctrl : updateIdx ≠ controlIdx) :
    Gate.applyNat (sqir_controlledCompareConst bits q_start c controlIdx flagPos)
        (update f updateIdx v)
      = update (Gate.applyNat (sqir_controlledCompareConst bits q_start c controlIdx flagPos) f)
              updateIdx v
**Controlled compareConst commutes with `update` at an outside position distinct from the inner `controlIdx` and `flagPos`.**
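The shape of these commute-with-`update` lemmas can be seen on a toy gate. The sketch below is an illustration only: `toyUpdate`, `toyCX` and `toySample` are hypothetical names, and `toyCX` is a simple CX-like classical action, not the `sqir_controlledCompareConst` gate. It checks pointwise that applying the gate after an update at an untouched position gives the same state as updating after the gate.

```lean
-- Hypothetical stand-in for the point update on classical states.
def toyUpdate (f : Nat → Bool) (i : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = i then v else f q

-- A CX-like classical action: XOR bit `c` into bit `t`.
def toyCX (c t : Nat) (f : Nat → Bool) : Nat → Bool :=
  toyUpdate f t (f t != f c)

-- An arbitrary classical input: true exactly at even positions.
def toySample : Nat → Bool := fun q => q % 2 == 0

-- The update position 4 differs from the control (0) and the target (1),
-- so both orders agree at every position 0..5.
#eval (List.range 6).map fun q =>
  toyCX 0 1 (toyUpdate toySample 4 true) q
    == toyUpdate (toyCX 0 1 toySample) 4 true q
-- [true, true, true, true, true, true]
```

The real lemmas need the extra side conditions (`updateIdx ≠ flagPos`, `updateIdx ≠ controlIdx`, outside the workspace) for the same reason the toy needs the update position to differ from `c` and `t`.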
theoremsqir_style_controlledModAddConst_gate_commute_update_outside_fun
theorem sqir_style_controlledModAddConst_gate_commute_update_outside_fun
    (bits N c controlIdx updateIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hupdate_out : updateIdx < 2 ∨ 2 + (2 * bits + 1) ≤ updateIdx)
    (hupdate_ne_flag : updateIdx ≠ 1)
    (hupdate_ne_control : updateIdx ≠ controlIdx) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c controlIdx 1)
        (update f updateIdx v)
      = update (Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c controlIdx 1) f)
              updateIdx v
**Deliverable A: the controlled modular add-constant gate commutes with `update` at outside positions (distinct from the flag and `controlIdx`).**
theoremsqir_mult_control_idx_outside_modadd_workspace_form_qstart
theorem sqir_mult_control_idx_outside_modadd_workspace_form_qstart
    (bits q_start j : Nat) :
    sqir_mult_control_idx_qstart bits q_start j < q_start
      ∨ q_start + 2 * bits + 1 ≤ sqir_mult_control_idx_qstart bits q_start j
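The j-th multiplier bit lies outside the shifted Cuccaro workspace `[q_start, q_start + 2 * bits + 1)`. The statement uses the disjunctive "below or above" form that the commute lemmas take as a hypothesis; the index itself sits above the workspace. Port of `sqir_mult_control_idx_outside_modadd_workspace_form` (line 63).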
theoremsqir_mult_control_idx_ne_flag_qstart
theorem sqir_mult_control_idx_ne_flag_qstart
    (bits q_start j flagPos : Nat) (h_flag_lt : flagPos < q_start) :
    sqir_mult_control_idx_qstart bits q_start j ≠ flagPos
The j-th multiplier bit is distinct from any chosen `flagPos` strictly below the shifted workspace. Port of `sqir_mult_control_idx_ne_flag` (line 45).
theoremsqir_mult_control_idx_lt_dim_qstart
theorem sqir_mult_control_idx_lt_dim_qstart
    (bits q_start j dim : Nat) (hj : j < bits)
    (h_dim : q_start + (2 * bits + 1) + bits ≤ dim) :
    sqir_mult_control_idx_qstart bits q_start j < dim
The j-th multiplier bit fits in a dimension that covers the workspace plus the multiplier register. Port of `sqir_mult_control_idx_lt_sqir_dim` (line 57) generalised to a free `dim` parameter.
theoremsqir_mult_control_idx_injective_qstart
theorem sqir_mult_control_idx_injective_qstart
    (bits q_start j j' : Nat)
    (h : sqir_mult_control_idx_qstart bits q_start j
          = sqir_mult_control_idx_qstart bits q_start j') :
    j = j'
Distinct multiplier bits map to distinct positions. Port of `sqir_mult_control_idx_injective` (line 72).
theoremsqir_mult_input_control_bit_qstart
theorem sqir_mult_input_control_bit_qstart
    (bits q_start m acc j : Nat) (hj : j < bits) :
    sqir_mult_input_F_qstart bits q_start m acc
        (sqir_mult_control_idx_qstart bits q_start j) = m.testBit j
The multiplier bit at `sqir_mult_control_idx_qstart bits q_start j` is `m.testBit j`. Port of `sqir_mult_input_control_bit` (line 103).
theoremsqir_style_controlledModAddConst_gate_commute_update_outside_fun_qstart
theorem sqir_style_controlledModAddConst_gate_commute_update_outside_fun_qstart
    (bits q_start N c controlIdx flagPos updateIdx : Nat) (v : Bool) (f : Nat → Bool)
    (hupdate_out : updateIdx < q_start ∨ q_start + 2 * bits + 1 ≤ updateIdx)
    (hupdate_ne_flag : updateIdx ≠ flagPos)
    (hupdate_ne_control : updateIdx ≠ controlIdx) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c controlIdx flagPos)
        (update f updateIdx v)
      = update (Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
                                  controlIdx flagPos) f)
              updateIdx v
q_start-parametric commute helper for the controlled mod-add gate. Port of `sqir_style_controlledModAddConst_gate_commute_update_outside_fun` (line 276): the gate commutes with an `update` at any position outside its workspace and distinct from both `controlIdx` and `flagPos`. All sub-helpers are already q_start-parametric:
- `sqir_conditionalAddConstGate_commute_update_outside_fun` (CuccaroSQIRDirtyFlag.lean:3157);
- `sqir_style_compareConst_candidate_commute_update_outside_fun` (:3132);
- `sqir_conditionalSubConstGate_commute_update_outside_fun` (:3174);
- `sqir_controlledCompareConst_commute_update_outside_fun` (this file:249);
- `Gate.applyNat_CX_commute_update_outside_fun` (CuccaroSQIRDirtyFlag.lean:3039).
theoreminstall_mult_bits_skip_j_at_workspace_eq
theorem install_mult_bits_skip_j_at_workspace_eq
    (bits m j num_bits : Nat) (f : Nat → Bool) (q : Nat)
    (hq : q < 2 + 2 * bits + 1) :
    install_mult_bits_skip_j bits m j num_bits f q = f q
**`install_mult_bits_skip_j` at workspace positions.** At any position `q < 2 + 2 * bits + 1` (the flag bits and the Cuccaro workspace), the installs leave `q` unchanged, because they update only multiplier positions.
theoreminstall_mult_bits_skip_j_at_mult_k_eq
theorem install_mult_bits_skip_j_at_mult_k_eq
    (bits m j num_bits k : Nat) (f : Nat → Bool)
    (h_k_lt : k < num_bits) (h_k_ne_j : k ≠ j) :
    install_mult_bits_skip_j bits m j num_bits f (sqir_mult_control_idx bits k)
      = m.testBit k
**`install_mult_bits_skip_j` at multiplier position `k`** (`k < num_bits`, `k ≠ j`): installs `m.testBit k`.
theoreminstall_mult_bits_skip_j_at_j_eq
theorem install_mult_bits_skip_j_at_j_eq
    (bits m j num_bits : Nat) (f : Nat → Bool) :
    install_mult_bits_skip_j bits m j num_bits f (sqir_mult_control_idx bits j)
      = f (sqir_mult_control_idx bits j)
**`install_mult_bits_skip_j` at the skipped position `j`.** Installs never touch position `controlIdx_j` (always skipped), so the install returns `f (controlIdx_j)`.
theoreminstall_mult_bits_skip_j_at_above_eq
theorem install_mult_bits_skip_j_at_above_eq
    (bits m j num_bits : Nat) (h_num_le : num_bits ≤ bits) (f : Nat → Bool) (q : Nat)
    (hq : q ≥ 2 + 2 * bits + 1 + bits) :
    install_mult_bits_skip_j bits m j num_bits f q = f q
**`install_mult_bits_skip_j` at positions above the multiplier register.** For `q ≥ 2 + 2 * bits + 1 + bits`, the installs leave `q` unchanged.
theoremsqir_mult_input_F_eq_install_with_j
theorem sqir_mult_input_F_eq_install_with_j
    (bits m acc j : Nat) (hj : j < bits) (hacc : acc < 2 ^ bits) :
    sqir_mult_input_F bits m acc
      = install_mult_bits_skip_j bits m j bits
          (update (cuccaro_input_F 2 false 0 acc) (sqir_mult_control_idx bits j) (m.testBit j))
**`sqir_mult_input_F` decomposes as `install_mult_bits_skip_j` applied to `update (cuccaro_input_F) controlIdx_j (m.testBit j)`.**
theoreminstall_mult_bits_skip_j_at_workspace_eq_qstart
theorem install_mult_bits_skip_j_at_workspace_eq_qstart
    (bits q_start m j num_bits : Nat) (f : Nat → Bool) (q : Nat)
    (hq : q < q_start + 2 * bits + 1) :
    install_mult_bits_skip_j_qstart bits q_start m j num_bits f q = f q
q_start-parametric: install chain does not modify positions strictly below `q_start + 2 * bits + 1`. Port of `install_mult_bits_skip_j_at_workspace_eq` (line 456).
theoreminstall_mult_bits_skip_j_at_mult_k_eq_qstart
theorem install_mult_bits_skip_j_at_mult_k_eq_qstart
    (bits q_start m j num_bits k : Nat) (f : Nat → Bool)
    (h_k_lt : k < num_bits) (h_k_ne_j : k ≠ j) :
    install_mult_bits_skip_j_qstart bits q_start m j num_bits f
        (sqir_mult_control_idx_qstart bits q_start k) = m.testBit k
q_start-parametric: install chain at multiplier position `k` (`k < num_bits`, `k ≠ j`) installs `m.testBit k`. Port of `install_mult_bits_skip_j_at_mult_k_eq` (line 476).
theoreminstall_mult_bits_skip_j_at_j_eq_qstart
theorem install_mult_bits_skip_j_at_j_eq_qstart
    (bits q_start m j num_bits : Nat) (f : Nat → Bool) :
    install_mult_bits_skip_j_qstart bits q_start m j num_bits f
        (sqir_mult_control_idx_qstart bits q_start j)
      = f (sqir_mult_control_idx_qstart bits q_start j)
q_start-parametric: install chain at the skipped position `j` is preserved from the base state. Port of `install_mult_bits_skip_j_at_j_eq` (line 503).
theoreminstall_mult_bits_skip_j_at_above_eq_qstart
theorem install_mult_bits_skip_j_at_above_eq_qstart
    (bits q_start m j num_bits : Nat) (h_num_le : num_bits ≤ bits)
    (f : Nat → Bool) (q : Nat)
    (hq : q ≥ q_start + 2 * bits + 1 + bits) :
    install_mult_bits_skip_j_qstart bits q_start m j num_bits f q = f q
q_start-parametric: install chain identity above the multiplier register. Port of `install_mult_bits_skip_j_at_above_eq` (line 522).
theoremsqir_mult_input_F_eq_install_with_j_qstart
theorem sqir_mult_input_F_eq_install_with_j_qstart
    (bits q_start m acc j : Nat) (hj : j < bits) (hacc : acc < 2 ^ bits) :
    sqir_mult_input_F_qstart bits q_start m acc
      = install_mult_bits_skip_j_qstart bits q_start m j bits
          (update (cuccaro_input_F q_start false 0 acc)
            (sqir_mult_control_idx_qstart bits q_start j) (m.testBit j))
q_start-parametric bridge: `sqir_mult_input_F_qstart bits q_start m acc` decomposes as `install_mult_bits_skip_j_qstart` applied to `update (cuccaro_input_F q_start false 0 acc) (sqir_mult_control_idx_qstart bits q_start j) (m.testBit j)`. Port of `sqir_mult_input_F_eq_install_with_j` (line 544).
theoremcuccaro_target_val_through_install_mult
theorem cuccaro_target_val_through_install_mult
    (bits m j N c : Nat) (num_bits : Nat) (f : Nat → Bool) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
            (sqir_mult_control_idx bits j) 1)
          (install_mult_bits_skip_j bits m j num_bits f))
      = cuccaro_target_val bits 2
          (Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
            (sqir_mult_control_idx bits j) 1) f)
**`cuccaro_target_val` is invariant under installing multiplier bits on the gate-applied state.** Each installed update is at position `controlIdx_k` (outside the workspace), so by Deliverable A (commute with `update`) and `cuccaro_target_val_update_outside_workspace`, the target register's decoded value is unchanged.
theoremsqir_modmult_step_target_decode
theorem sqir_modmult_step_target_decode
    (bits N a j m acc : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_modmult_step_gate bits N a j)
          (sqir_mult_input_F bits m acc))
      = if m.testBit j then (acc + (a * 2 ^ j) % N) % N else acc
**Deliverable C: one-step modular-multiplier target decode.** After applying `sqir_modmult_step_gate bits N a j` to `sqir_mult_input_F bits m acc`, the decoded target register equals `if m.testBit j then (acc + (a * 2^j) % N) % N else acc`.
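The decoded value can be evaluated directly as a classical function. The sketch below is a hedged illustration: `stepSpec` is a hypothetical name for the right-hand side of the theorem, and the final `example` shows one plausible reading of why `2 * N ≤ 2^bits` is assumed (the unreduced sum of two residues stays below the register capacity, written here as an abstract bound `P`). That reading is an assumption, not something the listing states.

```lean
-- Hypothetical name for the right-hand side of the decode theorem.
def stepSpec (N a j m acc : Nat) : Nat :=
  if m.testBit j then (acc + (a * 2 ^ j) % N) % N else acc

-- N = 15, a = 7, m = 5 (binary 101), acc = 7.
#eval stepSpec 15 7 2 5 7  -- bit 2 is set: (7 + 28 % 15) % 15 = 5
#eval stepSpec 15 7 1 5 7  -- bit 1 is clear: 7

-- Assumed motivation for `hN2`: with both summands below N and
-- 2 * N ≤ P, the raw sum fits below the capacity P.
example (N acc c P : Nat) (hacc : acc < N) (hc : c < N) (hN2 : 2 * N ≤ P) :
    acc + c < P := by
  omega
```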
theoremcuccaro_target_val_through_install_mult_qstart
theorem cuccaro_target_val_through_install_mult_qstart
    (bits q_start m j N c flagPos : Nat) (num_bits : Nat) (f : Nat → Bool)
    (h_flag_lt_qstart : flagPos < q_start) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
            (sqir_mult_control_idx_qstart bits q_start j) flagPos)
          (install_mult_bits_skip_j_qstart bits q_start m j num_bits f))
      = cuccaro_target_val bits q_start
          (Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
            (sqir_mult_control_idx_qstart bits q_start j) flagPos) f)
q_start-parametric: installing multiplier bits (other than the j-th) outside the Cuccaro workspace does not change the gate's decoded target value. Port of `cuccaro_target_val_through_install_mult` (line 782).
theoremsqir_modmult_step_target_decode_qstart
theorem sqir_modmult_step_target_decode_qstart
    (bits q_start N a j m acc dim flagPos : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
          (sqir_mult_input_F_qstart bits q_start m acc))
      = if m.testBit j then (acc + (a * 2 ^ j) % N) % N else acc
**R7d^xxix-L-3.15c HEADLINE: q_start-parametric one-step modular-multiplier target decode.** After applying `sqir_modmult_step_gate_qstart bits q_start N a j flagPos` to `sqir_mult_input_F_qstart bits q_start m acc`, the decoded target register equals `if m.testBit j then (acc + (a * 2^j) % N) % N else acc`. Port of `sqir_modmult_step_target_decode` (line 820). Uses the L-3.14′ `sqir_style_controlledModAddConst_gate_clean_qstart`, the L-3.15a control-index facts, the L-3.15b install bridge, and this tick's `_through_install_mult_qstart`. The `flagPos < q_start` hypothesis matches the hard-coded `flagPos = 1 < 2 = q_start` case and ensures `flagPos` is distinct from every multiplier-bit position.
theoremcuccaro_read_val_through_install_mult
theorem cuccaro_read_val_through_install_mult
    (bits m j N c : Nat) (num_bits : Nat) (f : Nat → Bool) :
    cuccaro_read_val bits 2
        (Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
            (sqir_mult_control_idx bits j) 1)
          (install_mult_bits_skip_j bits m j num_bits f))
      = cuccaro_read_val bits 2
          (Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
            (sqir_mult_control_idx bits j) 1) f)
**Read register invariant through install.**
theoremapplyNat_modmult_through_install_at_workspace
theorem applyNat_modmult_through_install_at_workspace
    (bits m j N c : Nat) (num_bits q : Nat) (f : Nat → Bool)
    (hq_ws : q < 2 + 2 * bits + 1) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
        (sqir_mult_control_idx bits j) 1)
      (install_mult_bits_skip_j bits m j num_bits f) q
      = Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
        (sqir_mult_control_idx bits j) 1) f q
**Position-wise invariance through install at workspace positions.**
theoremapplyNat_modmult_through_install_at_j
theorem applyNat_modmult_through_install_at_j
    (bits m j N c : Nat) (num_bits : Nat) (f : Nat → Bool) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
        (sqir_mult_control_idx bits j) 1)
      (install_mult_bits_skip_j bits m j num_bits f) (sqir_mult_control_idx bits j)
      = Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
        (sqir_mult_control_idx bits j) 1) f (sqir_mult_control_idx bits j)
**Position-wise invariance through install at the `controlIdx_j` position.**
theoremsqir_modmult_step_workspace
theorem sqir_modmult_step_workspace
    (bits N a j m acc : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) :
    cuccaro_read_val bits 2
        (Gate.applyNat (sqir_modmult_step_gate bits N a j)
          (sqir_mult_input_F bits m acc)) = 0
    ∧ Gate.applyNat (sqir_modmult_step_gate bits N a j)
          (sqir_mult_input_F bits m acc) (2 + 2 * bits) = false
    ∧ Gate.applyNat (sqir_modmult_step_gate bits N a j)
          (sqir_mult_input_F bits m acc) 1 = false
    ∧ Gate.applyNat (sqir_modmult_step_gate bits N a j)
**Deliverable D: one-step workspace preservation.** After applying the step gate, the read register is 0, the top carry is false, the flag bit is false, and the multiplier control bit `j` is preserved as `m.testBit j`.
theoremcuccaro_read_val_through_install_mult_qstart
theorem cuccaro_read_val_through_install_mult_qstart
    (bits q_start m j N c flagPos : Nat) (num_bits : Nat) (f : Nat → Bool)
    (h_flag_lt_qstart : flagPos < q_start) :
    cuccaro_read_val bits q_start
        (Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
            (sqir_mult_control_idx_qstart bits q_start j) flagPos)
          (install_mult_bits_skip_j_qstart bits q_start m j num_bits f))
      = cuccaro_read_val bits q_start
          (Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
            (sqir_mult_control_idx_qstart bits q_start j) flagPos) f)
q_start-parametric: installing multiplier bits (other than the j-th) outside the Cuccaro workspace preserves the decoded read/workspace value. Port of `cuccaro_read_val_through_install_mult` (line 958).
theoremapplyNat_modmult_through_install_at_workspace_qstart
theorem applyNat_modmult_through_install_at_workspace_qstart
    (bits q_start m j N c flagPos : Nat) (num_bits q : Nat) (f : Nat → Bool)
    (h_flag_lt_qstart : flagPos < q_start)
    (hq_ws : q < q_start + 2 * bits + 1) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
        (sqir_mult_control_idx_qstart bits q_start j) flagPos)
      (install_mult_bits_skip_j_qstart bits q_start m j num_bits f) q
      = Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
        (sqir_mult_control_idx_qstart bits q_start j) flagPos) f q
q_start-parametric: install chain commutes with the step gate at any single workspace position `q < q_start + 2 * bits + 1`. Port of `applyNat_modmult_through_install_at_workspace` (line 990).
theoremapplyNat_modmult_through_install_at_j_qstart
theorem applyNat_modmult_through_install_at_j_qstart
    (bits q_start m j N c flagPos : Nat) (num_bits : Nat) (f : Nat → Bool)
    (h_flag_lt_qstart : flagPos < q_start) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
        (sqir_mult_control_idx_qstart bits q_start j) flagPos)
      (install_mult_bits_skip_j_qstart bits q_start m j num_bits f)
        (sqir_mult_control_idx_qstart bits q_start j)
      = Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
        (sqir_mult_control_idx_qstart bits q_start j) flagPos) f
          (sqir_mult_control_idx_qstart bits q_start j)
q_start-parametric: install chain commutes with the step gate at the j-th multiplier control position. Port of `applyNat_modmult_through_install_at_j` (line 1022).
theoremsqir_modmult_step_workspace_qstart
theorem sqir_modmult_step_workspace_qstart
    (bits q_start N a j m acc dim flagPos : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    cuccaro_read_val bits q_start
        (Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
          (sqir_mult_input_F_qstart bits q_start m acc)) = 0
    ∧ Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
          (sqir_mult_input_F_qstart bits q_start m acc) (q_start + 2 * bits) = false
**R7d^xxix-L-3.15d HEADLINE: q_start-parametric one-step modular-multiplier workspace preservation.** After applying `sqir_modmult_step_gate_qstart` to `sqir_mult_input_F_qstart`:
1. the read register decodes to 0;
2. the top carry position (`q_start + 2 * bits`) is `false`;
3. `flagPos` is `false`;
4. the j-th multiplier control position holds `m.testBit j`.
Port of `sqir_modmult_step_workspace` (line 1055).
theoremsqir_mult_input_flag_0_false_qstart
theorem sqir_mult_input_flag_0_false_qstart
    (bits q_start m acc : Nat) (h_lt : 0 < q_start) :
    sqir_mult_input_F_qstart bits q_start m acc 0 = false
q_start-parametric input flag-0 mini-helper. At position 0 of `sqir_mult_input_F_qstart`, the bit is `false`, provided position 0 is below the Cuccaro workspace start. Port of `sqir_mult_input_flag_0_false` (line 143).
theoremsqir_mult_input_flag_1_false_qstart
theorem sqir_mult_input_flag_1_false_qstart
    (bits q_start m acc : Nat) (h_lt : 1 < q_start) :
    sqir_mult_input_F_qstart bits q_start m acc 1 = false
q_start-parametric input flag-1 mini-helper. At position 1 of `sqir_mult_input_F_qstart`, the bit is `false`, provided position 1 is below the Cuccaro workspace start. Port of `sqir_mult_input_flag_1_false` (line 151).
theoremsqir_modmult_prefix_gate_qstart_zero_eq_I
theorem sqir_modmult_prefix_gate_qstart_zero_eq_I
    (bits q_start N a flagPos : Nat) :
    sqir_modmult_prefix_gate_qstart bits q_start N a flagPos 0 = Gate.I
q_start-parametric prefix gate at 0 windows is the identity.
theoremsqir_modmult_prefix_gate_qstart_succ_eq
theorem sqir_modmult_prefix_gate_qstart_succ_eq
    (bits q_start N a flagPos k : Nat) :
    sqir_modmult_prefix_gate_qstart bits q_start N a flagPos (k + 1)
      = seq (sqir_modmult_prefix_gate_qstart bits q_start N a flagPos k)
            (sqir_modmult_step_gate_qstart bits q_start N a k flagPos)
q_start-parametric prefix gate at `k + 1` windows.
theoremsqir_modmult_step_at_untouched_pos_qstart
theorem sqir_modmult_step_at_untouched_pos_qstart
    (bits q_start N a j flagPos m acc q : Nat) (hj : j < bits)
    (h_input : sqir_mult_input_F_qstart bits q_start m acc q = false)
    (h_q_out : q < q_start ∨ q_start + 2 * bits + 1 ≤ q)
    (h_q_ne_flag : q ≠ flagPos)
    (h_q_ne_ctrl_j : q ≠ sqir_mult_control_idx_qstart bits q_start j) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc) q = false
q_start-parametric: the step gate does not touch positions outside its support. At any `q` outside the workspace and distinct from both `flagPos` and the j-th multiplier control position, the gate's output equals the input's value, so a position that is `false` in the input stays `false`. Port of `sqir_modmult_step_at_untouched_pos` (line 1712).
theoremsqir_modmult_step_flag0_false_qstart
theorem sqir_modmult_step_flag0_false_qstart
    (bits q_start N a j flagPos m acc : Nat) (hbits : 1 ≤ bits) (hj : j < bits)
    (h_qstart_ge_2 : 2 ≤ q_start)
    (h_flag_ne_0 : (0 : Nat) ≠ flagPos) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc) 0 = false
q_start-parametric: step gate's output at flag-0 position is `false`. Port of `sqir_modmult_step_flag0_false` (line 1734).
theoremsqir_modmult_step_above_layout_false_qstart
theorem sqir_modmult_step_above_layout_false_qstart
    (bits q_start N a j flagPos m acc q : Nat) (hbits : 1 ≤ bits) (hj : j < bits)
    (hq : q ≥ q_start + 2 * bits + 1 + bits)
    (h_flag_lt_qstart : flagPos < q_start) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc) q = false
q_start-parametric: step gate's output above the multiplier register (for `q ≥ q_start + 2 * bits + 1 + bits`) is `false`. Port of `sqir_modmult_step_above_layout_false` (line 1747).
theoremsqir_style_modAddConst_clean_candidate_carry_in_restored_qstart
theorem sqir_style_modAddConst_clean_candidate_carry_in_restored_qstart
    (bits q_start N c x flagPos : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc : c < N) (hx : x < N)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos) :
    Gate.applyNat (sqir_style_modAddConst_clean_candidate bits q_start N c flagPos)
        (update (cuccaro_input_F q_start false 0 x) flagPos false) q_start = false
q_start-parametric: the uncontrolled clean modular-add candidate restores the carry-in to `false`. Port of `sqir_style_modAddConst_clean_candidate_carry_in_restored` (line 1637).
theoremsqir_style_controlledModAddConst_candidate_carry_in_restored_qstart
theorem sqir_style_controlledModAddConst_candidate_carry_in_restored_qstart
    (bits q_start N c x controlIdx flagPos : Nat) (control : Bool)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_style_controlledModAddConst_candidate bits q_start N c controlIdx flagPos)
        (update (cuccaro_input_F q_start false 0 x) controlIdx control) q_start = false
q_start-parametric: controlled candidate carry-in restored. Dispatches on `control`. Port of `sqir_style_controlledModAddConst_candidate_carry_in_restored` (line 1657).
theoremsqir_style_controlledModAddConst_gate_carry_in_restored_qstart
theorem sqir_style_controlledModAddConst_gate_carry_in_restored_qstart
    (bits q_start N c x controlIdx flagPos : Nat) (control : Bool)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < q_start ∨ q_start + 2 * bits + 1 ≤ controlIdx)
    (hflag_out : flagPos < q_start ∨ q_start + 2 * bits + 1 ≤ flagPos)
    (hcontrol_ne_flag : controlIdx ≠ flagPos) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c controlIdx flagPos)
        (update (cuccaro_input_F q_start false 0 x) controlIdx control) q_start = false
q_start-parametric: wrapper-level controlled mod-add carry-in restored. Adds the `c = 0` identity case. Port of `sqir_style_controlledModAddConst_gate_carry_in_restored` (line 1684).
theoremsqir_style_controlledModAddConst_gate_commute_install_qstart
theorem sqir_style_controlledModAddConst_gate_commute_install_qstart
    (bits q_start m j N c flagPos num_bits : Nat) (f : Nat → Bool)
    (h_flag_lt_qstart : flagPos < q_start) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
        (sqir_mult_control_idx_qstart bits q_start j) flagPos)
      (install_mult_bits_skip_j_qstart bits q_start m j num_bits f)
      = install_mult_bits_skip_j_qstart bits q_start m j num_bits
          (Gate.applyNat (sqir_style_controlledModAddConst_gate bits q_start N c
            (sqir_mult_control_idx_qstart bits q_start j) flagPos) f)
q_start-parametric: the controlled mod-add wrapper gate commutes with the entire install stack. Port of `sqir_style_controlledModAddConst_gate_commute_install` (line 1362).
theoremsqir_modmult_step_carry_in_restored_qstart
theorem sqir_modmult_step_carry_in_restored_qstart
    (bits q_start N a j flagPos m acc : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N)
    (h_flag_lt_qstart : flagPos < q_start) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc) q_start = false
q_start-parametric: step gate's output at the carry-in position (`q_start`) is `false`. Port of `sqir_modmult_step_carry_in_restored` (line 1764).

FormalRV.Arithmetic.SQIRModMult.SQIRModMultDefinitions

FormalRV/Arithmetic/SQIRModMult/SQIRModMultDefinitions.lean
## Tick 73: Multiplier control register layout
Reuses the SQIR-faithful Cuccaro mod-add layout (`q_start = 2`, `flagPos = 1`, top carry = `2 + 2*bits`). The multiplier control register starts at `2 + 2*bits + 1` (immediately after the top carry), so the `j`-th multiplier bit sits at position `2 + 2*bits + 1 + j`.
defsqir_mult_control_idx
def sqir_mult_control_idx (bits j : Nat) : Nat
Multiplier bit `j` lives at this position in the layout.
defsqir_mult_input_F
def sqir_mult_input_F (bits m acc : Nat) : Nat → Bool
Input state for the modular multiplier. Layout:
- Positions 0 and 1 are flag bits (both false).
- Positions `2..2+2*bits-1` encode the Cuccaro state for the accumulator (carry-in = false, read register = 0, target = `acc`).
- Position `2 + 2*bits` is the top carry (false, since the accumulator is in `[0, 2^bits)`).
- Position `2 + 2*bits + 1 + j` is the j-th multiplier bit (`m.testBit j`).
- Positions above the multiplier register are false.
defsqir_modmult_step_gate
def sqir_modmult_step_gate (bits N a j : Nat) : Gate
**One-step modular multiplier gate (controlled add of `(a * 2^j) % N`).**
defsqir_modmult_acc_spec
def sqir_modmult_acc_spec (N a m : Nat) : Nat → Nat
  | 0       => 0
  | k + 1   =>
    if m.testBit k then
      (sqir_modmult_acc_spec N a m k + (a * 2 ^ k) % N) % N
    else
      sqir_modmult_acc_spec N a m k
Recursive specification of the accumulator after processing the first `k` multiplier bits. Models the classical shift-and-accumulate loop.
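To see the shift-and-accumulate loop in action, the recursion can be copied into a standalone definition and evaluated. This is only a sketch: `accSpecDemo` is a hypothetical local name with the same body as `sqir_modmult_acc_spec`, and the numbers are an arbitrary small example.

```lean
-- Hypothetical local copy of the recursion shown above.
def accSpecDemo (N a m : Nat) : Nat → Nat
  | 0     => 0
  | k + 1 =>
    if m.testBit k then
      (accSpecDemo N a m k + (a * 2 ^ k) % N) % N
    else
      accSpecDemo N a m k

-- N = 15, a = 7, m = 5 (binary 101): accumulator after 0, 1, 2, 3 bits.
#eval (List.range 4).map (accSpecDemo 15 7 5)  -- [0, 7, 7, 5]

-- After all three bits the accumulator agrees with (a * m) % N.
#eval (7 * 5) % 15                              -- 5
```

Bit 1 of `m` is clear, which is why the accumulator stays at 7 between prefixes 1 and 2.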
defsqir_modmult_prefix_gate
def sqir_modmult_prefix_gate (bits N a : Nat) : Nat → Gate
  | 0       => Gate.I
  | k + 1   => seq (sqir_modmult_prefix_gate bits N a k) (sqir_modmult_step_gate bits N a k)
Multiplier prefix gate: applies `sqir_modmult_step_gate` for `j = 0, 1, ..., k-1` in order.
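The ordering claim can be checked on a toy gate type. In the sketch below, `ToyGate`, `toyPrefix` and `ToyGate.order` are hypothetical stand-ins (the real `Gate` IR and `sqir_modmult_step_gate` are not reproduced); each step gate is abstracted to its index `j`, and `order` lists the steps in execution order.

```lean
-- Hypothetical miniature gate IR: identity, an opaque step, and sequencing.
inductive ToyGate where
  | I : ToyGate
  | step : Nat → ToyGate
  | seq : ToyGate → ToyGate → ToyGate

-- Same recursion shape as `sqir_modmult_prefix_gate`.
def toyPrefix : Nat → ToyGate
  | 0     => ToyGate.I
  | k + 1 => ToyGate.seq (toyPrefix k) (ToyGate.step k)

-- Flatten a gate into the list of step indices, left to right.
def ToyGate.order : ToyGate → List Nat
  | .I       => []
  | .step j  => [j]
  | .seq g h => ToyGate.order g ++ ToyGate.order h

#eval ToyGate.order (toyPrefix 3)  -- [0, 1, 2]
```

Because the recursive call is the left argument of `seq`, lower bit indices run first.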
defsqir_modmult_const_gate
def sqir_modmult_const_gate (bits N a : Nat) : Gate
The full multiplier gate (process all `bits` multiplier bits).
defsqir_mult_control_idx_qstart
def sqir_mult_control_idx_qstart (bits q_start j : Nat) : Nat
q_start-parametric multiplier-bit position. Generalises `sqir_mult_control_idx bits j = 2 + (2 * bits + 1) + j` to free `q_start`.
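The index arithmetic behind the control-index lemmas is linear, so `omega` can discharge it once the definition is unfolded. The sketch assumes the body `q_start + (2 * bits + 1) + j`, which is inferred from the sentence above and is not shown in this listing; `toyCtrlIdx` is a hypothetical name.

```lean
-- Assumed body, generalising `2 + (2 * bits + 1) + j` to a free start.
def toyCtrlIdx (bits qStart j : Nat) : Nat :=
  qStart + (2 * bits + 1) + j

-- The index lies at or above the end of the workspace.
example (bits qStart j : Nat) :
    qStart + 2 * bits + 1 ≤ toyCtrlIdx bits qStart j := by
  unfold toyCtrlIdx
  omega

-- Distinct multiplier bits map to distinct positions.
example (bits qStart j j' : Nat)
    (h : toyCtrlIdx bits qStart j = toyCtrlIdx bits qStart j') : j = j' := by
  unfold toyCtrlIdx at h
  omega

#eval toyCtrlIdx 3 2 0  -- 9
```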
defsqir_mult_input_F_qstart
def sqir_mult_input_F_qstart (bits q_start m acc : Nat) : Nat → Bool
q_start-parametric input state for the modular multiplier. Layout (free `q_start`):
- Positions `q < q_start + 2 * bits + 1`: Cuccaro state (`cuccaro_input_F q_start false 0 acc`).
- Positions `q_start + 2 * bits + 1 + j` for `j < bits`: multiplier bit `m.testBit j`.
- Positions above the multiplier register: `false`.
Port of `sqir_mult_input_F` (line 95).
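The three regions of this layout can be modelled as a plain function on positions. The sketch below is an assumption-laden illustration: `toyInput` is a hypothetical name, and the Cuccaro block is passed in as an opaque parameter `cuccaro` because the encoding of `cuccaro_input_F` is not shown here.

```lean
-- Hypothetical model of the layout; `cuccaro` stands in for
-- `cuccaro_input_F q_start false 0 acc`.
def toyInput (bits qStart m : Nat) (cuccaro : Nat → Bool) : Nat → Bool :=
  fun q =>
    let base := qStart + 2 * bits + 1      -- first multiplier position
    if q < base then cuccaro q             -- flags and Cuccaro workspace
    else if q < base + bits then m.testBit (q - base)  -- multiplier bits
    else false                             -- everything above

-- bits = 3, qStart = 2, m = 5 (binary 101), all-false Cuccaro block:
-- only positions 9 and 11 (multiplier bits 0 and 2) are true.
#eval (List.range 13).map (toyInput 3 2 5 (fun _ => false))
```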
defsqir_modmult_step_gate_qstart
def sqir_modmult_step_gate_qstart (bits q_start N a j flagPos : Nat) : Gate
q_start-parametric one-step modular-multiplier gate. Conditionally adds `(a * 2^j) % N` to the accumulator at workspace q_start = `q_start`, controlled by the multiplier bit at `sqir_mult_control_idx_qstart bits q_start j`, with dirty flag at `flagPos`. Port of `sqir_modmult_step_gate` (line 178).
definstall_mult_bits_skip_j
def install_mult_bits_skip_j (bits m j : Nat) : Nat → (Nat → Bool) → (Nat → Bool)
  | 0,     f => f
  | n + 1, f =>
    if n = j then install_mult_bits_skip_j bits m j n f
    else update (install_mult_bits_skip_j bits m j n f) (sqir_mult_control_idx bits n) (m.testBit n)
Recursively install multiplier bits `k = 0, ..., num_bits - 1` from `m`, **skipping** bit `j`.
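A small evaluation shows the skip behaviour. In this sketch, `toyUpd`, `toyIdx` and `installSkipDemo` are hypothetical local names; `toyIdx` assumes the hard-coded layout `2 + (2 * bits + 1) + j`, and `toyUpd` assumes `update` is the usual point update on functions.

```lean
-- Assumed point update on classical states.
def toyUpd (f : Nat → Bool) (i : Nat) (v : Bool) : Nat → Bool :=
  fun q => if q = i then v else f q

-- Assumed multiplier-bit position for the q_start = 2 layout.
def toyIdx (bits j : Nat) : Nat := 2 + (2 * bits + 1) + j

-- Same recursion as `install_mult_bits_skip_j`.
def installSkipDemo (bits m j : Nat) : Nat → (Nat → Bool) → (Nat → Bool)
  | 0,     f => f
  | n + 1, f =>
    if n = j then installSkipDemo bits m j n f
    else toyUpd (installSkipDemo bits m j n f) (toyIdx bits n) (m.testBit n)

-- bits = 3, m = 7 (all bits set), skip j = 1, start from the all-false state.
-- Multiplier positions are 9, 10, 11; position 10 keeps its old value.
#eval [9, 10, 11].map (installSkipDemo 3 7 1 3 (fun _ => false))
-- [true, false, true]
```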
definstall_mult_bits_skip_j_qstart
def install_mult_bits_skip_j_qstart (bits q_start m j : Nat) :
    Nat → (Nat → Bool) → (Nat → Bool)
  | 0,     f => f
  | n + 1, f =>
    if n = j then install_mult_bits_skip_j_qstart bits q_start m j n f
    else update (install_mult_bits_skip_j_qstart bits q_start m j n f)
                (sqir_mult_control_idx_qstart bits q_start n) (m.testBit n)
q_start-parametric: recursively install multiplier bits `k = 0, ..., num_bits - 1` from `m`, **skipping** bit `j`. Port of `install_mult_bits_skip_j` (line 447).
defsqir_modmult_prefix_gate_qstart
def sqir_modmult_prefix_gate_qstart
    (bits q_start N a flagPos : Nat) : Nat → Gate
  | 0       => Gate.I
  | k + 1   => seq (sqir_modmult_prefix_gate_qstart bits q_start N a flagPos k)
                   (sqir_modmult_step_gate_qstart bits q_start N a k flagPos)
q_start-parametric prefix gate. Applies `sqir_modmult_step_gate_qstart` for `j = 0, 1, ..., k - 1` in order. Port of `sqir_modmult_prefix_gate` (line 228).
defsqir_modmult_const_gate_qstart
def sqir_modmult_const_gate_qstart (bits q_start N a flagPos : Nat) : Gate
q_start-parametric full multiplier gate. Process all `bits` multiplier bits. Port of `sqir_modmult_const_gate` (line 242).
defsqir_modmult_acc_spec_from
def sqir_modmult_acc_spec_from (N a m acc : Nat) : Nat → Nat
  | 0     => acc
  | k + 1 =>
    if m.testBit k then
      (sqir_modmult_acc_spec_from N a m acc k + (a * 2 ^ k) % N) % N
    else
      sqir_modmult_acc_spec_from N a m acc k
**Accumulator spec from a starting value.** Like `sqir_modmult_acc_spec` but starts at `acc` instead of `0`. Used by the in-place modular multiplier uncompute step.
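As a quick check of the starting-value variant, the recursion can be evaluated on small numbers. `accSpecFromDemo` below is a hypothetical local copy of the definition above; the inputs are arbitrary.

```lean
-- Hypothetical local copy of `sqir_modmult_acc_spec_from`.
def accSpecFromDemo (N a m acc : Nat) : Nat → Nat
  | 0     => acc
  | k + 1 =>
    if m.testBit k then
      (accSpecFromDemo N a m acc k + (a * 2 ^ k) % N) % N
    else
      accSpecFromDemo N a m acc k

-- N = 15, a = 7, m = 5, three multiplier bits.
#eval accSpecFromDemo 15 7 5 3 3  -- 8, starting from acc = 3
#eval (3 + 7 * 5) % 15            -- 8, the closed form for this input
#eval accSpecFromDemo 15 7 5 0 3  -- 5, the zero-start case
```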
defsqir_target_idx_qstart
def sqir_target_idx_qstart (q_start i : Nat) : Nat
q_start-parametric: index of the accumulator (target) bit `i` in the Cuccaro layout. Port of `sqir_target_idx`.
defsqir_swap_acc_mult_aux_qstart
def sqir_swap_acc_mult_aux_qstart (bits q_start : Nat) : Nat → Gate
  | 0     => Gate.I
  | k + 1 => Gate.seq (sqir_swap_acc_mult_aux_qstart bits q_start k)
                      (qubit_swap (sqir_target_idx_qstart q_start k)
                                  (sqir_mult_control_idx_qstart bits q_start k))
q_start-parametric recursive swap of accumulator bits `[0, k)` with multiplier bits `[0, k)`. Port of `sqir_swap_acc_mult_aux`.
defsqir_swap_acc_mult_qstart
def sqir_swap_acc_mult_qstart (bits q_start : Nat) : Gate
q_start-parametric full SWAP of accumulator with multiplier register. Port of `sqir_swap_acc_mult`.
defsqir_modmult_inplace_candidate_qstart
def sqir_modmult_inplace_candidate_qstart
    (bits q_start N a ainv flagPos : Nat) : Gate
q_start-parametric in-place modular multiplier wrapper. Implements `x ↦ (a * x) % N` in the multiplier register using:
1. `const_gate_qstart(a)`: compute `(a * x) % N` into the accumulator.
2. `swap_acc_mult_qstart`: swap the accumulator and multiplier registers.
3. `const_gate_qstart((N - ainv) % N)`: uncompute the old `x` by accumulating `(N - ainv) * (a*x) % N ≡ -x (mod N)`, leaving the accumulator = 0.
Correctness (sub-tick L-3.15g.2) will require `(a * ainv) % N = 1`. Port of `sqir_modmult_inplace_candidate`.
defsqir_target_idx
def sqir_target_idx (i : Nat) : Nat
Index of the accumulator (target) bit `i` in the SQIR layout.
defsqir_swap_acc_mult_aux
def sqir_swap_acc_mult_aux (bits : Nat) : Nat → Gate
  | 0     => Gate.I
  | k + 1 => Gate.seq (sqir_swap_acc_mult_aux bits k)
                      (qubit_swap (sqir_target_idx k) (sqir_mult_control_idx bits k))
Recursive swap of accumulator bits `[0, k)` with multiplier bits `[0, k)`.
defsqir_swap_acc_mult
def sqir_swap_acc_mult (bits : Nat) : Gate
Full SWAP of accumulator (target) register with multiplier register.
defsqir_modmult_inplace_candidate
def sqir_modmult_inplace_candidate (bits N a ainv : Nat) : Gate
**In-place modular multiplier wrapper.** Implements `x ↦ (a*x) % N` in the multiplier register using:
1. `const_gate(a)`: compute `(a*x) % N` into the accumulator.
2. `swap_acc_mult`: swap the accumulator and multiplier registers.
3. `const_gate((N - ainv) % N)`: uncompute the old `x` by accumulating `(N - ainv) * (a*x) % N ≡ -x (mod N)`, leaving the accumulator = 0.
Requires `(a * ainv) % N = 1` (i.e., `ainv` is the modular inverse of `a`).
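The three steps can be followed at the level of register values. The sketch below is a classical trace only, not the gate construction: `inplaceTrace` is a hypothetical helper, and each multiplier pass is modelled by the `(acc + a * m) % N` form stated in `sqir_modmult_const_gate_state_eq_from`.

```lean
/-- Classical trace of the three steps on registers `(mult, acc)`, starting from `(x, 0)`.
    Hypothetical helper for illustration only. -/
def inplaceTrace (N a ainv x : Nat) : Nat × Nat :=
  let acc1  := (a * x) % N                            -- step 1: acc = (a*x) % N, mult still x
  let mult2 := acc1                                   -- step 2: swap the two registers
  let acc2  := x
  let acc3  := (acc2 + ((N - ainv) % N) * mult2) % N  -- step 3: add (N - ainv) * mult
  (mult2, acc3)

-- N = 15, a = 7, ainv = 13 (7 * 13 = 91 = 6 * 15 + 1), x = 4.
#eval inplaceTrace 15 7 13 4   -- expected: (13, 0)

-- The accumulator is cleared for every x < 15.
#eval (List.range 15).all fun x => inplaceTrace 15 7 13 x == ((7 * x) % 15, 0)
-- expected: true
```

In this example step 3 adds `2 * (7x % 15)`, so the accumulator ends at `(x + 14x) % 15 = 0`, which is why the inverse condition on `ainv` is needed.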
defsqir_total_dim
def sqir_total_dim (bits : Nat) : Nat
**Total dimension for the MCP-layout SQIR multiplier.** The sum of `bits` (external data register) and `sqir_modmult_rev_anc bits` (SQIR ancilla/workspace block).
defsqir_mult_input_F_shifted
def sqir_mult_input_F_shifted (bits x acc : Nat) : Nat → Bool
**Shifted SQIR input function.** The internal SQIR layout shifted up by `bits`, so that positions `[0, bits)` are reserved for the external data register and positions `[bits, bits + sqir_modmult_rev_anc bits)` for the SQIR block.
defGate.shift
def Gate.shift (off : Nat) : Gate → Gate
  | Gate.I        => Gate.I
  | Gate.X q      => Gate.X (off + q)
  | Gate.CX a b   => Gate.CX (off + a) (off + b)
  | Gate.CCX a b c => Gate.CCX (off + a) (off + b) (off + c)
  | Gate.seq g h  => Gate.seq (Gate.shift off g) (Gate.shift off h)
Shift all gate positions up by `off`.
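To illustrate the effect of the shift, here is a standalone sketch. The `Sketch.Gate` type below is an assumption that contains only the five constructors visible in the definition above; the framework's actual `Gate` type may differ.

```lean
namespace Sketch

/-- Minimal gate IR with the five constructors matched above (assumed shape). -/
inductive Gate where
  | I   : Gate
  | X   : Nat → Gate
  | CX  : Nat → Nat → Gate
  | CCX : Nat → Nat → Nat → Gate
  | seq : Gate → Gate → Gate

/-- Shift every qubit index in the gate up by `off`. -/
def Gate.shift (off : Nat) : Gate → Gate
  | Gate.I         => Gate.I
  | Gate.X q       => Gate.X (off + q)
  | Gate.CX a b    => Gate.CX (off + a) (off + b)
  | Gate.CCX a b c => Gate.CCX (off + a) (off + b) (off + c)
  | Gate.seq g h   => Gate.seq (Gate.shift off g) (Gate.shift off h)

-- Shifting by 3 relabels every wire and keeps the sequencing structure.
example :
    Gate.shift 3 (Gate.seq (Gate.X 0) (Gate.CCX 0 1 2))
      = Gate.seq (Gate.X 3) (Gate.CCX 3 4 5) := rfl

end Sketch
```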
defencode_data_pos
def encode_data_pos (bits j : Nat) : Nat
Position of `x.testBit j` in the big-endian `encodeDataZeroAnc` encoding (for `j < bits`).
defshifted_sqir_control_idx
def shifted_sqir_control_idx (bits j : Nat) : Nat
Shifted SQIR position of multiplier control bit `j`.
defsqir_encode_to_mult_adapter
def sqir_encode_to_mult_adapter (bits : Nat) : Gate
**Layout adapter from `encodeDataZeroAnc` to shifted SQIR layout.** Reuses the existing `reverse_register_swap` primitive: position `i` of the encoded data register (`[0, bits)`) is swapped with position `(3*bits + 3) + (bits - 1 - i)` of the shifted SQIR multiplier register.
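The swap partner of each data position follows directly from the formula in the docstring. A small sketch, with `adapterPartner` as a hypothetical name for that index map:

```lean
/-- Swap partner of encoded data position `i`, copied from the docstring formula
    (hypothetical helper name). -/
def adapterPartner (bits i : Nat) : Nat :=
  (3 * bits + 3) + (bits - 1 - i)

-- For bits = 3 the multiplier block starts at 3 * 3 + 3 = 12 and is filled in reverse.
#eval (List.range 3).map (adapterPartner 3)   -- expected: [14, 13, 12]
```

The reversal reflects that the data register is big-endian (see `encode_data_pos`) while the partner positions are visited in the opposite order.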
defsqir_modmult_inplace_shifted
def sqir_modmult_inplace_shifted (bits N a ainv : Nat) : Gate
**Shifted in-place modular multiplier gate.**
defsqir_modmult_MCP_gate
def sqir_modmult_MCP_gate (bits N a ainv : Nat) : Gate
**MCP-layout gate.** Three-stage composition: adapter → shifted in-place multiplier → adapter.
deff_modmult_circuit_verified
noncomputable def f_modmult_circuit_verified (a ainv N n : Nat) :
    Nat → FormalRV.Framework.BaseUCom ((n + 1) + sqir_modmult_rev_anc (n + 1))
**Verified modular-multiplier oracle family** at SQIR-faithful dimension `(n + 1) + sqir_modmult_rev_anc (n + 1)`.
deff_modmult_circuit_verified_bits
noncomputable def f_modmult_circuit_verified_bits (a ainv N bits : Nat) :
    Nat → FormalRV.Framework.BaseUCom (bits + sqir_modmult_rev_anc bits)
**Bits-parameterized verified modular-multiplier family.**
defBasicSettingRelaxed
def BasicSettingRelaxed (a r N m n : Nat) : Prop
**Relaxed BasicSetting** without the tight upper bound `2^n ≤ 2*N`. Keeps every conjunct mathematically used by the Shor proof. **Deprecated 2026-05-29 (Phase R2):** use `VerifiedShor.ShorSetting` for new code. This definition is kept as the implementation; `VerifiedShor.ShorSetting` is an `abbrev` for it.
defVerifiedCircuitSizing
def VerifiedCircuitSizing (N bits : Nat) : Prop
**Sizing predicate** for the verified SQIR modular multiplier. **Deprecated 2026-05-29 (Phase R2):** use `VerifiedShor.CircuitSizing` for new code. This definition is kept as the implementation; `VerifiedShor.CircuitSizing` is an `abbrev` for it.

FormalRV.Arithmetic.SQIRModMult.SQIRModMultPrefixInvariant

FormalRV/Arithmetic/SQIRModMult/SQIRModMultPrefixInvariant.lean
## Tick 74 — Deliverable F: Prefix invariant starter.
theoremsqir_modmult_prefix_target_decode_zero
theorem sqir_modmult_prefix_target_decode_zero
    (bits N a m : Nat) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_modmult_prefix_gate bits N a 0) (sqir_mult_input_F bits m 0))
      = sqir_modmult_acc_spec N a m 0
**Prefix invariant, base case (`k = 0`).** The 0-step prefix gate is the identity, so the target register is just the encoded `acc = 0`.
theoremsqir_style_controlledModAddConst_gate_commute_install
theorem sqir_style_controlledModAddConst_gate_commute_install
    (bits m j N c num_bits : Nat) (f : Nat → Bool) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
        (sqir_mult_control_idx bits j) 1)
      (install_mult_bits_skip_j bits m j num_bits f)
      = install_mult_bits_skip_j bits m j num_bits
          (Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c
            (sqir_mult_control_idx bits j) 1) f)
**Function-level commute of step gate with install.** The controlled mod-add wrapper gate commutes with the entire install stack: each install update is at `controlIdx_k` for some `k ≠ j` (by construction of `install_mult_bits_skip_j`), and those positions lie outside the gate's support.
theoremsqir_modmult_step_preserves_all_control_bits
theorem sqir_modmult_step_preserves_all_control_bits
    (bits N a m acc j k : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hacc : acc < N) (hj : j < bits) (hk : k < bits) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j)
        (sqir_mult_input_F bits m acc) (sqir_mult_control_idx bits k)
      = m.testBit k
**Deliverable A: All control bits preserved by one step.** The one-step gate `sqir_modmult_step_gate bits N a j` preserves every multiplier control bit `k < bits` as `m.testBit k`. This generalizes Tick 74's Deliverable D from `k = j` to all `k < bits`.
theoremcuccaro_target_val_eq_implies_bits_match
theorem cuccaro_target_val_eq_implies_bits_match
    (bits q_start S : Nat) (f : Nat → Bool)
    (hS : S < 2^bits) (h : cuccaro_target_val bits q_start f = S) :
    ∀ i, i < bits → f (q_start + 2 * i + 1) = S.testBit i
**Converse to `cuccaro_target_val_eq_sum_when_bits_match`.** For `S < 2^bits`, if `cuccaro_target_val bits q_start f = S`, then each target bit `i < bits` matches `S.testBit i`. This follows from uniqueness of binary representation and is useful for deducing per-bit information from a `target_val` equality. It is a forward-looking utility lemma; in Tick 75 it is not yet consumed by `sqir_modmult_step_state_eq` (deferred to Tick 76).
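A small worked instance may help. The sketch below assumes that the target value is the little-endian sum of the bits at positions `q_start + 2 * i + 1`, which is the index pattern in the theorem statement. `targetVal` and `f5` are hypothetical names, and the real `cuccaro_target_val` may be defined differently.

```lean
/-- Assumed decoding of the target register: bit `i` lives at `q_start + 2 * i + 1`. -/
def targetVal (bits q_start : Nat) (f : Nat → Bool) : Nat :=
  (List.range bits).foldl
    (fun (s i : Nat) => s + (if f (q_start + 2 * i + 1) then 2 ^ i else 0)) 0

/-- Example state with q_start = 2: target bits 0 and 2 set (positions 3 and 7). -/
def f5 : Nat → Bool := fun q => q == 3 || q == 7

#eval targetVal 3 2 f5   -- expected: 5

-- Per-bit reading of the same fact: each target position agrees with `Nat.testBit 5`.
#eval (List.range 3).all fun i => f5 (2 + 2 * i + 1) == Nat.testBit 5 i
-- expected: true
```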
theoremcuccaro_read_val_eq_implies_bits_match
theorem cuccaro_read_val_eq_implies_bits_match
    (bits q_start S : Nat) (f : Nat → Bool)
    (hS : S < 2^bits) (h : cuccaro_read_val bits q_start f = S) :
    ∀ i, i < bits → f (q_start + 2 * i + 2) = S.testBit i
**Converse to `cuccaro_read_val_eq_sum_when_bits_match`.**
theoremsqir_modmult_step_state_normal
theorem sqir_modmult_step_state_normal
    (bits N a j m acc : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) :
    let acc'
**Deliverable B: One-step state-normal theorem.** Combines Tick 74's target decode, Tick 74's workspace preservation, and Deliverable A's all-control-bits preservation into a unified finite-state characterization of the step gate's output.
theoremsqir_modmult_acc_spec_eq_mul_mod_pow
theorem sqir_modmult_acc_spec_eq_mul_mod_pow
    (N a m k : Nat) (hN_pos : 0 < N) :
    sqir_modmult_acc_spec N a m k = (a * (m % 2^k)) % N
**Strong recurrence form.**
theoremsqir_modmult_acc_spec_eq_mul_mod
theorem sqir_modmult_acc_spec_eq_mul_mod
    (bits N a m : Nat) (hN_pos : 0 < N) (hm : m < 2^bits) :
    sqir_modmult_acc_spec N a m bits = (a * m) % N
**Deliverable E: Accumulator-spec equals modular product.** For `m < 2^bits`, the bit-by-bit accumulator equals `(a * m) % N`.
theoremsqir_style_modAddConst_clean_candidate_carry_in_restored
theorem sqir_style_modAddConst_clean_candidate_carry_in_restored
    (bits N c x : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc : c < N) (hx : x < N) :
    Gate.applyNat (sqir_style_modAddConst_clean_candidate bits 2 N c 1)
        (update (cuccaro_input_F 2 false 0 x) 1 false) 2 = false
**Clean candidate carry-in (`q_start = 2`) restored to `false`.** Chains through `dirtyFlag → compareConst c → X(1)`:
- `dirtyFlag` restores carry-in via `dirtyFlag_carry_in_restored_general`.
- `compareConst c` preserves all workspace positions via `compareConst_candidate_workspace_restored_at_general`.
- `X(1)` doesn't touch position 2 (since 2 ≠ 1).
theoremsqir_style_controlledModAddConst_candidate_carry_in_restored
theorem sqir_style_controlledModAddConst_candidate_carry_in_restored
    (bits N c x controlIdx : Nat) (control : Bool)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc_pos : 0 < c) (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    Gate.applyNat (sqir_style_controlledModAddConst_candidate bits 2 N c controlIdx 1)
        (update (cuccaro_input_F 2 false 0 x) controlIdx control) 2 = false
**Controlled candidate carry-in restored.** Dispatches on `control`:
- `control = false`: identity (via `control_false_state_eq`).
- `control = true`: chains through `clean_candidate_carry_in_restored`.
theoremsqir_style_controlledModAddConst_gate_carry_in_restored
theorem sqir_style_controlledModAddConst_gate_carry_in_restored
    (bits N c x controlIdx : Nat) (control : Bool)
    (hbits : 1 ≤ bits) (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hc : c < N) (hx : x < N)
    (hcontrol_out : controlIdx < 2 ∨ 2 + 2 * bits + 1 ≤ controlIdx)
    (hcontrol_ne_flag : controlIdx ≠ 1) :
    Gate.applyNat (sqir_style_controlledModAddConst_gate bits 2 N c controlIdx 1)
        (update (cuccaro_input_F 2 false 0 x) controlIdx control) 2 = false
**Wrapper-level controlled mod-add carry-in restored.** Adds the `c = 0` identity case to the candidate version.
theoremsqir_modmult_step_at_untouched_pos
theorem sqir_modmult_step_at_untouched_pos
    (bits N a j m acc q : Nat) (hj : j < bits)
    (h_input : sqir_mult_input_F bits m acc q = false)
    (h_q_out : q < 2 ∨ 2 + (2 * bits + 1) ≤ q)
    (h_q_ne_flag : q ≠ 1)
    (h_q_ne_ctrl_j : q ≠ sqir_mult_control_idx bits j) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j)
        (sqir_mult_input_F bits m acc) q = false
**Generic helper: step gate doesn't touch positions outside its support.** At any `q` outside the workspace and distinct from both the flag position and `controlIdx_j`, the gate's output equals the input's value (via commute + `update_self`).
theoremsqir_modmult_step_flag0_false
theorem sqir_modmult_step_flag0_false
    (bits N a j m acc : Nat) (hbits : 1 ≤ bits) (hj : j < bits) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j)
        (sqir_mult_input_F bits m acc) 0 = false
**Step gate's output at flag bit 0 is `false`.**
theoremsqir_modmult_step_above_layout_false
theorem sqir_modmult_step_above_layout_false
    (bits N a j m acc q : Nat) (hbits : 1 ≤ bits) (hj : j < bits)
    (hq : q ≥ 2 + 2 * bits + 1 + bits) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j)
        (sqir_mult_input_F bits m acc) q = false
**Step gate's output above the multiplier register is `false`.** Holds for every `q ≥ 2 + 2 * bits + 1 + bits`.
theoremsqir_modmult_step_carry_in_restored
theorem sqir_modmult_step_carry_in_restored
    (bits N a j m acc : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j)
        (sqir_mult_input_F bits m acc) 2 = false
**Step gate's output at carry-in (position 2) is `false`.**
theoremsqir_modmult_step_target_bit
theorem sqir_modmult_step_target_bit
    (bits N a j m acc i : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) (hi : i < bits) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j)
        (sqir_mult_input_F bits m acc) (2 + 2 * i + 1)
      = (if m.testBit j then (acc + (a * 2^j) % N) % N else acc).testBit i
**Step gate's output at target bit `i` equals `acc'.testBit i`.** Uses the per-bit converse `cuccaro_target_val_eq_implies_bits_match` plus the Tick 74 `sqir_modmult_step_target_decode`.
theoremsqir_modmult_step_read_bit
theorem sqir_modmult_step_read_bit
    (bits N a j m acc i : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) (hi : i < bits) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j)
        (sqir_mult_input_F bits m acc) (2 + 2 * i + 2) = false
**Step gate's output at read bit `i` is `false`.**
theoremsqir_modmult_step_target_bit_qstart
theorem sqir_modmult_step_target_bit_qstart
    (bits q_start N a j flagPos m acc i : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) (hi : i < bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc) (q_start + 2 * i + 1)
      = (if m.testBit j then (acc + (a * 2^j) % N) % N else acc).testBit i
q_start-parametric: step gate's output at target/b-register position `q_start + 2 * i + 1` decodes the i-th bit of the advanced accumulator. Port of `sqir_modmult_step_target_bit` (line 2063).
theoremsqir_modmult_step_read_bit_qstart
theorem sqir_modmult_step_read_bit_qstart
    (bits q_start N a j flagPos m acc i : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) (hi : i < bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc) (q_start + 2 * i + 2) = false
q_start-parametric: step gate's output at read/a-register position `q_start + 2 * i + 2` is `false`. Port of `sqir_modmult_step_read_bit` (line 2161).
theoremsqir_modmult_step_preserves_all_control_bits_qstart
theorem sqir_modmult_step_preserves_all_control_bits_qstart
    (bits q_start N a m acc j k flagPos : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hacc : acc < N) (hj : j < bits) (hk : k < bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (dim : Nat)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc)
          (sqir_mult_control_idx_qstart bits q_start k)
      = m.testBit k
q_start-parametric: step gate preserves every multiplier control bit `k < bits` as `m.testBit k`. Port of `sqir_modmult_step_preserves_all_control_bits` (line 1717).
theoremsqir_modmult_step_state_eq
theorem sqir_modmult_step_state_eq
    (bits N a j m acc : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N) :
    Gate.applyNat (sqir_modmult_step_gate bits N a j) (sqir_mult_input_F bits m acc)
      = sqir_mult_input_F bits m
          (if m.testBit j then (acc + (a * 2^j) % N) % N else acc)
**Deliverable D: One-step state equality (function-level).** After applying the step gate to `sqir_mult_input_F bits m acc`, the state is exactly `sqir_mult_input_F bits m acc'` where `acc' = if m.testBit j then (acc + (a*2^j)%N) % N else acc`.
theoremsqir_modmult_step_state_eq_qstart
theorem sqir_modmult_step_state_eq_qstart
    (bits q_start N a j flagPos m acc dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hj : j < bits) (hacc : acc < N)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat (sqir_modmult_step_gate_qstart bits q_start N a j flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc)
      = sqir_mult_input_F_qstart bits q_start m
          (if m.testBit j then (acc + (a * 2^j) % N) % N else acc)
theoremsqir_modmult_prefix_state_eq
theorem sqir_modmult_prefix_state_eq
    (bits N a m k : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hk : k ≤ bits) :
    Gate.applyNat (sqir_modmult_prefix_gate bits N a k) (sqir_mult_input_F bits m 0)
      = sqir_mult_input_F bits m (sqir_modmult_acc_spec N a m k)
**Deliverable E: Prefix state equality.** By induction on `k`, the prefix gate's output on `sqir_mult_input_F bits m 0` equals `sqir_mult_input_F bits m (acc_spec ... k)`. Each inductive step uses `sqir_modmult_step_state_eq` together with the accumulator recurrence.
theoremsqir_modmult_prefix_target_decode
theorem sqir_modmult_prefix_target_decode
    (bits N a m k : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hk : k ≤ bits) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_modmult_prefix_gate bits N a k) (sqir_mult_input_F bits m 0))
      = sqir_modmult_acc_spec N a m k
**Deliverable F: Prefix target decode (corollary of E).** The decoded target after applying the prefix gate equals the accumulator spec at `k`.
theoremsqir_modmult_const_gate_target_decode
theorem sqir_modmult_const_gate_target_decode
    (bits N a m : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hm : m < 2^bits) :
    cuccaro_target_val bits 2
        (Gate.applyNat (sqir_modmult_const_gate bits N a) (sqir_mult_input_F bits m 0))
      = (a * m) % N
**Deliverable G: Full modular multiplier target theorem.** After applying `sqir_modmult_const_gate bits N a` to `sqir_mult_input_F bits m 0`, the target register decodes to `(a*m) % N`.
theoremsqir_mult_input_target_decode_qstart
theorem sqir_mult_input_target_decode_qstart
    (bits q_start m acc : Nat) (hacc : acc < 2 ^ bits) :
    cuccaro_target_val bits q_start (sqir_mult_input_F_qstart bits q_start m acc) = acc
q_start-parametric: decoded target of the initial input state equals the accumulator value (assuming `acc < 2^bits`). Port of `sqir_mult_input_target_decode` (line 115).
theoremsqir_modmult_prefix_state_eq_qstart
theorem sqir_modmult_prefix_state_eq_qstart
    (bits q_start N a m k flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hk : k ≤ bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat (sqir_modmult_prefix_gate_qstart bits q_start N a flagPos k)
        (sqir_mult_input_F_qstart bits q_start m 0)
      = sqir_mult_input_F_qstart bits q_start m (sqir_modmult_acc_spec N a m k)
q_start-parametric prefix state equality. Port of `sqir_modmult_prefix_state_eq` (line 2416).
theoremsqir_modmult_prefix_target_decode_qstart
theorem sqir_modmult_prefix_target_decode_qstart
    (bits q_start N a m k flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hk : k ≤ bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_modmult_prefix_gate_qstart bits q_start N a flagPos k)
          (sqir_mult_input_F_qstart bits q_start m 0))
      = sqir_modmult_acc_spec N a m k
q_start-parametric prefix target decode (corollary of prefix state equality). Port of `sqir_modmult_prefix_target_decode` (line 2443).
theoremsqir_modmult_const_gate_target_decode_qstart
theorem sqir_modmult_const_gate_target_decode_qstart
    (bits q_start N a m flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hm : m < 2^bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    cuccaro_target_val bits q_start
        (Gate.applyNat (sqir_modmult_const_gate_qstart bits q_start N a flagPos)
          (sqir_mult_input_F_qstart bits q_start m 0))
      = (a * m) % N
**R7d^xxix-L-3.15e.5 HEADLINE: q_start-parametric full modular multiplier target decode.** After applying `sqir_modmult_const_gate_qstart bits q_start N a flagPos` to `sqir_mult_input_F_qstart bits q_start m 0`, the target register decodes to `(a * m) % N`. Port of `sqir_modmult_const_gate_target_decode` (line 2461).
theoremsqir_mult_input_at_below_qstart_eq_false_qstart
theorem sqir_mult_input_at_below_qstart_eq_false_qstart
    (bits q_start m acc q : Nat) (hq : q < q_start) :
    sqir_mult_input_F_qstart bits q_start m acc q = false
q_start-parametric: `sqir_mult_input_F_qstart` at any position strictly below `q_start` is `false`. Generalises the hard-coded `sqir_mult_input_flag_0_false` / `_flag_1_false` to any flagPos < q_start.
theoremsqir_mult_input_read_decode_qstart
theorem sqir_mult_input_read_decode_qstart
    (bits q_start m acc : Nat) :
    cuccaro_read_val bits q_start (sqir_mult_input_F_qstart bits q_start m acc) = 0
q_start-parametric: the read register of `sqir_mult_input_F_qstart` is 0. Port of `sqir_mult_input_read_decode` (line 129).
theoremsqir_mult_input_top_carry_false_qstart
theorem sqir_mult_input_top_carry_false_qstart
    (bits q_start m acc : Nat) (hbits : 1 ≤ bits) :
    sqir_mult_input_F_qstart bits q_start m acc (q_start + 2 * bits) = false
q_start-parametric: the top-carry position `q_start + 2 * bits` of `sqir_mult_input_F_qstart` is `false`. Port of `sqir_mult_input_top_carry_false` (line 162).
theoremsqir_modmult_const_gate_workspace_qstart
theorem sqir_modmult_const_gate_workspace_qstart
    (bits q_start N a m flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hm : m < 2^bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    cuccaro_read_val bits q_start
          (Gate.applyNat (sqir_modmult_const_gate_qstart bits q_start N a flagPos)
            (sqir_mult_input_F_qstart bits q_start m 0))
        = 0
    ∧ Gate.applyNat (sqir_modmult_const_gate_qstart bits q_start N a flagPos)
**R7d^xxix-L-3.15f HEADLINE: q_start-parametric constant-multiplier workspace bundle.** For the full multiplier `sqir_modmult_const_gate_qstart bits q_start N a flagPos` applied to `sqir_mult_input_F_qstart bits q_start m 0`:
1. the read register decodes to 0;
2. the top carry position `q_start + 2 * bits` is `false`;
3. the dirty-flag position `flagPos` is `false`;
4. every multiplier control bit `k < bits` is preserved as `m.testBit k`.
The proof routes through `sqir_modmult_prefix_state_eq_qstart` (L-3.15e.5) at `k = bits` and `sqir_modmult_acc_spec_eq_mul_mod` (q_start-independent) to reshape the post-gate state, then reads each conjunct off the input state shape. Port of the workspace conjuncts of `sqir_modmult_const_gate_clean` (line 2629).
theoremsqir_modmult_const_gate_state_eq
theorem sqir_modmult_const_gate_state_eq
    (bits N a m : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hm : m < 2^bits) :
    Gate.applyNat (sqir_modmult_const_gate bits N a) (sqir_mult_input_F bits m 0)
      = sqir_mult_input_F bits m ((a * m) % N)
**Deliverable G corollary: Full multiplier state equality.** After the full multiplier, the state is `sqir_mult_input_F bits m ((a*m)%N)`.
theoremsqir_modmult_step_gate_wellTyped
theorem sqir_modmult_step_gate_wellTyped
    (bits N a j : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits) (hj : j < bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits) (sqir_modmult_step_gate bits N a j)
**Step gate is WellTyped at `sqir_modmult_rev_anc bits`.**
theoremsqir_modmult_prefix_gate_wellTyped
theorem sqir_modmult_prefix_gate_wellTyped
    (bits N a k : Nat) (hbits : 1 ≤ bits) (hN_pos : 0 < N)
    (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits) (hk : k ≤ bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits) (sqir_modmult_prefix_gate bits N a k)
**Prefix gate is WellTyped at `sqir_modmult_rev_anc bits`.**
theoremsqir_modmult_const_gate_clean
theorem sqir_modmult_const_gate_clean
    (bits N a m : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hm : m < 2^bits) :
    Gate.WellTyped (sqir_modmult_rev_anc bits) (sqir_modmult_const_gate bits N a)
    ∧ cuccaro_target_val bits 2
          (Gate.applyNat (sqir_modmult_const_gate bits N a) (sqir_mult_input_F bits m 0))
        = (a * m) % N
    ∧ cuccaro_read_val bits 2
          (Gate.applyNat (sqir_modmult_const_gate bits N a) (sqir_mult_input_F bits m 0))
        = 0
    ∧ Gate.applyNat (sqir_modmult_const_gate bits N a) (sqir_mult_input_F bits m 0) 0
**Deliverable H: Clean modular-multiplier bundle.** For the full multiplier gate `sqir_modmult_const_gate bits N a`:
- WellTyped at `sqir_modmult_rev_anc bits`.
- Target decoded to `(a * m) % N`.
- Read = 0.
- Flag bits 0, 1 = false.
- Top carry (position `2 + 2*bits`) = false.
- All multiplier control bits preserved as `m.testBit k`.
theoremsqir_modmult_const_gate_clean_from_BasicSetting
theorem sqir_modmult_const_gate_clean_from_BasicSetting
    (a r N m n x_mult : Nat)
    (h_basic : FormalRV.SQIRPort.BasicSetting a r N m n)
    (hm : x_mult < 2^(n + 1)) :
    Gate.WellTyped (sqir_modmult_rev_anc (n + 1))
        (sqir_modmult_const_gate (n + 1) N a)
    ∧ cuccaro_target_val (n + 1) 2
          (Gate.applyNat (sqir_modmult_const_gate (n + 1) N a)
            (sqir_mult_input_F (n + 1) x_mult 0)) = (a * x_mult) % N
    ∧ cuccaro_read_val (n + 1) 2
          (Gate.applyNat (sqir_modmult_const_gate (n + 1) N a)
            (sqir_mult_input_F (n + 1) x_mult 0)) = 0
**Deliverable I: BasicSetting specialization of the full multiplier clean bundle.** For BasicSetting parameters (which give `N ≤ 2^(n+1)` and `2*N ≤ 2^(n+1)`), the clean bundle holds at `bits = n + 1`.
theoremsqir_modmult_acc_spec_from_zero
theorem sqir_modmult_acc_spec_from_zero (N a m acc : Nat) :
    sqir_modmult_acc_spec_from N a m acc 0 = acc
theoremsqir_modmult_acc_spec_from_succ_true
theorem sqir_modmult_acc_spec_from_succ_true
    (N a m acc k : Nat) (h : m.testBit k = true) :
    sqir_modmult_acc_spec_from N a m acc (k + 1)
      = (sqir_modmult_acc_spec_from N a m acc k + (a * 2 ^ k) % N) % N
theoremsqir_modmult_acc_spec_from_succ_false
theorem sqir_modmult_acc_spec_from_succ_false
    (N a m acc k : Nat) (h : m.testBit k = false) :
    sqir_modmult_acc_spec_from N a m acc (k + 1)
      = sqir_modmult_acc_spec_from N a m acc k
theoremsqir_modmult_acc_spec_from_lt
theorem sqir_modmult_acc_spec_from_lt
    (N a m acc k : Nat) (hN_pos : 0 < N) (hacc : acc < N) :
    sqir_modmult_acc_spec_from N a m acc k < N
For `0 < N`, the accumulator-from-start stays in `[0, N)` if `acc < N`.
theoremsqir_modmult_acc_spec_from_eq_mod_pow
theorem sqir_modmult_acc_spec_from_eq_mod_pow
    (N a m acc k : Nat) (hN_pos : 0 < N) (hacc : acc < N) :
    sqir_modmult_acc_spec_from N a m acc k = (acc + a * (m % 2^k)) % N
**Closed form for `acc_spec_from`**: equals `(acc + a * (m % 2^k)) % N` for `acc < N`.
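The closed form can be spot-checked against the recurrence for every prefix length. This is a sanity check on one example, not a proof; `accSpecFrom` is a hypothetical standalone copy of `sqir_modmult_acc_spec_from`, and the values `N = 15`, `a = 7`, `m = 5`, `acc = 3` are arbitrary.

```lean
/-- Standalone copy of the accumulator recurrence (hypothetical name). -/
def accSpecFrom (N a m acc : Nat) : Nat → Nat
  | 0     => acc
  | k + 1 =>
    if m.testBit k then
      (accSpecFrom N a m acc k + (a * 2 ^ k) % N) % N
    else
      accSpecFrom N a m acc k

-- Compare recurrence and closed form for k = 0, 1, 2, 3.
#eval (List.range 4).all fun k =>
  accSpecFrom 15 7 5 3 k == (3 + 7 * (5 % 2 ^ k)) % 15
-- expected: true
```

The truncation `m % 2^k` is what makes the statement hold at every intermediate `k`, since only the low `k` bits of `m` have been processed at that point.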
theoremsqir_modmult_acc_spec_from_eq_add_mul_mod
theorem sqir_modmult_acc_spec_from_eq_add_mul_mod
    (bits N a m acc : Nat) (hN_pos : 0 < N) (hacc : acc < N) (hm : m < 2^bits) :
    sqir_modmult_acc_spec_from N a m acc bits = (acc + a * m) % N
For `m < 2^bits` and `acc < N`, the final accumulator equals `(acc + a*m) % N`.
theoremsqir_modmult_prefix_state_eq_from
theorem sqir_modmult_prefix_state_eq_from
    (bits N a m acc k : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hacc : acc < N) (hk : k ≤ bits) :
    Gate.applyNat (sqir_modmult_prefix_gate bits N a k) (sqir_mult_input_F bits m acc)
      = sqir_mult_input_F bits m (sqir_modmult_acc_spec_from N a m acc k)
**Generalized prefix state equality** for arbitrary starting accumulator.
theoremsqir_modmult_const_gate_state_eq_from
theorem sqir_modmult_const_gate_state_eq_from
    (bits N a m acc : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hacc : acc < N) (hm : m < 2^bits) :
    Gate.applyNat (sqir_modmult_const_gate bits N a) (sqir_mult_input_F bits m acc)
      = sqir_mult_input_F bits m ((acc + a * m) % N)
**Generalized full multiplier state equality** for arbitrary starting accumulator.
theoremsqir_modmult_prefix_state_eq_from_qstart
theorem sqir_modmult_prefix_state_eq_from_qstart
    (bits q_start N a m acc k flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hacc : acc < N) (hk : k ≤ bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat (sqir_modmult_prefix_gate_qstart bits q_start N a flagPos k)
        (sqir_mult_input_F_qstart bits q_start m acc)
      = sqir_mult_input_F_qstart bits q_start m
          (sqir_modmult_acc_spec_from N a m acc k)
q_start-parametric: prefix state-eq generalised to an arbitrary starting accumulator. Port of `sqir_modmult_prefix_state_eq_from`. Uses the q_start-INDEPENDENT `sqir_modmult_acc_spec_from` chain plus the L-3.15e.4 `sqir_modmult_step_state_eq_qstart`.
theoremsqir_modmult_const_gate_state_eq_from_qstart
theorem sqir_modmult_const_gate_state_eq_from_qstart
    (bits q_start N a m acc flagPos dim : Nat) (hbits : 1 ≤ bits)
    (hN_pos : 0 < N) (hN : N ≤ 2^bits) (hN2 : 2 * N ≤ 2^bits)
    (hacc : acc < N) (hm : m < 2^bits)
    (h_flag_lt_qstart : flagPos < q_start)
    (h_workspace : q_start + 2 * bits + 1 ≤ dim)
    (h_dim_covers_mult : q_start + (2 * bits + 1) + bits ≤ dim) :
    Gate.applyNat (sqir_modmult_const_gate_qstart bits q_start N a flagPos)
        (sqir_mult_input_F_qstart bits q_start m acc)
      = sqir_mult_input_F_qstart bits q_start m ((acc + a * m) % N)
q_start-parametric: full multiplier state-eq generalised to an arbitrary starting accumulator. Port of `sqir_modmult_const_gate_state_eq_from`.
theoremsqir_target_idx_ne_mult_control_idx_qstart
theorem sqir_target_idx_ne_mult_control_idx_qstart
    (bits q_start i j : Nat) (hi : i < bits) :
    sqir_target_idx_qstart q_start i
      ≠ sqir_mult_control_idx_qstart bits q_start j
Target index disjoint from multiplier control index.
theoremsqir_swap_acc_mult_aux_qstart_succ_eq
theorem sqir_swap_acc_mult_aux_qstart_succ_eq (bits q_start k : Nat) :
    sqir_swap_acc_mult_aux_qstart bits q_start (k + 1)
      = Gate.seq (sqir_swap_acc_mult_aux_qstart bits q_start k)
          (qubit_swap (sqir_target_idx_qstart q_start k)
                      (sqir_mult_control_idx_qstart bits q_start k))
Unfold lemma for `sqir_swap_acc_mult_aux_qstart`.
theoremsqir_target_idx_qstart_value
theorem sqir_target_idx_qstart_value (q_start i : Nat) :
    sqir_target_idx_qstart q_start i = q_start + 2 * i + 1
Sanity helper: `sqir_target_idx_qstart` value.
theoremsqir_swap_acc_mult_at_mult_out_range_qstart
theorem sqir_swap_acc_mult_at_mult_out_range_qstart
    (bits q_start k i : Nat) (hk : k ≤ bits) (hi_ge : k ≤ i) (hi_bits : i < bits)
    (f : Nat → Bool) :
    Gate.applyNat (sqir_swap_acc_mult_aux_qstart bits q_start k) f
        (sqir_mult_control_idx_qstart bits q_start i)
      = f (sqir_mult_control_idx_qstart bits q_start i)
q_start port of `sqir_swap_acc_mult_aux_at_mult_out_range` (line 3097). At a multiplier bit `i ≥ k`, swap output = input.

FormalRV.Arithmetic.SQIRModMult.ToffoliCount

FormalRV/Arithmetic/SQIRModMult/ToffoliCount.lean
FormalRV.Arithmetic.SQIRModMult.ToffoliCount: a proved Toffoli (T-count) bound on the SAME Gate IR term that the SQIR port proves computes modular multiplication. The verified modular multiplier `sqir_modmult_const_gate bits N a` is proved to write `(a · m) % N` into the accumulator (`sqir_modmult_const_gate_target_decode`). Here we derive a closed-form UPPER BOUND on its T-count by structural induction over its Gate IR, `tcount ≤ 56·bits²` (i.e. ≤ 8·bits² Toffolis), so for the first time a modular-arithmetic building block has count AND semantics on one verified circuit term.
Layer counts (each derived, not asserted):
- prepare-reads: tcount 0 (only X / CX / cond)
- cuccaro_maj_chain_inv: tcount 7·n
- conditional add / sub: tcount 14·bits (one Cuccaro adder, 14·bits)
- compare / ctrl-compare: tcount 14·bits (two maj-chains, 7·bits each)
- controlled mod-add: tcount 56·bits (four 14·bits sub-blocks)
- modmult prefix (k steps): tcount ≤ k·56·bits
- modmult const (bits): tcount ≤ 56·bits²
No `sorry`, no new `axiom`.
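The layer counts above compose by plain arithmetic. The following Lean sketch only re-checks that arithmetic; it does not use the FormalRV definitions, and the namespace `LayerCountSketch` is a hypothetical name introduced for illustration.

```lean
namespace LayerCountSketch

-- Two maj-chains of 7·bits each give the 14·bits compare cost.
example (bits : Nat) : 7 * bits + 7 * bits = 14 * bits := by omega

-- Four 14·bits sub-blocks give the 56·bits controlled mod-add cost.
example (bits : Nat) : 4 * (14 * bits) = 56 * bits := by omega

-- One Toffoli is counted as 7 T-gates, so 56·bits T-gates is 8·bits Toffolis.
example (bits : Nat) : 56 * bits = 7 * (8 * bits) := by omega

-- Concrete instance of the quadratic bound at bits = 4:
-- 4 steps of at most 56·4 T-gates each total 56·4² = 896.
example : 4 * (56 * 4) = 56 * 4 ^ 2 := by decide

end LayerCountSketch
```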
theoremtcount_cuccaro_prepareConstRead
theorem tcount_cuccaro_prepareConstRead (n q c : Nat) :
    tcount (cuccaro_prepareConstRead n q c) = 0
theoremtcount_sqir_prepareMaskedConstRead
theorem tcount_sqir_prepareMaskedConstRead (n q N f : Nat) :
    tcount (sqir_prepareMaskedConstRead n q N f) = 0
theoremtcount_cuccaro_MAJ_inv
theorem tcount_cuccaro_MAJ_inv (a b c : Nat) : tcount (cuccaro_MAJ_inv a b c) = 7
theoremtcount_cuccaro_maj_chain_inv
theorem tcount_cuccaro_maj_chain_inv (n q : Nat) :
    tcount (cuccaro_maj_chain_inv n q) = 7 * n
theoremtcount_sqir_conditionalAddConstGate
theorem tcount_sqir_conditionalAddConstGate (bits q N f : Nat) :
    tcount (sqir_conditionalAddConstGate bits q N f) = 14 * bits
theoremtcount_sqir_conditionalSubConstGate
theorem tcount_sqir_conditionalSubConstGate (bits q N f : Nat) :
    tcount (sqir_conditionalSubConstGate bits q N f) = 14 * bits
theoremtcount_sqir_style_compareConst_candidate
theorem tcount_sqir_style_compareConst_candidate (bits q N f : Nat) :
    tcount (sqir_style_compareConst_candidate bits q N f) = 14 * bits
theoremtcount_sqir_controlledCompareConst
theorem tcount_sqir_controlledCompareConst (bits q c ci f : Nat) :
    tcount (sqir_controlledCompareConst bits q c ci f) = 14 * bits
theoremtcount_sqir_style_controlledModAddConst_candidate
theorem tcount_sqir_style_controlledModAddConst_candidate (bits q N c ci f : Nat) :
    tcount (sqir_style_controlledModAddConst_candidate bits q N c ci f) = 56 * bits
theoremtcount_sqir_style_controlledModAddConst_gate_le
theorem tcount_sqir_style_controlledModAddConst_gate_le (bits q N c ci f : Nat) :
    tcount (sqir_style_controlledModAddConst_gate bits q N c ci f) ≤ 56 * bits
theoremtcount_sqir_modmult_step_gate_le
theorem tcount_sqir_modmult_step_gate_le (bits N a j : Nat) :
    tcount (sqir_modmult_step_gate bits N a j) ≤ 56 * bits
theoremtcount_sqir_modmult_prefix_gate_le
theorem tcount_sqir_modmult_prefix_gate_le (bits N a k : Nat) :
    tcount (sqir_modmult_prefix_gate bits N a k) ≤ k * (56 * bits)
theoremtcount_sqir_modmult_const_gate_le
theorem tcount_sqir_modmult_const_gate_le (bits N a : Nat) :
    tcount (sqir_modmult_const_gate bits N a) ≤ 56 * bits ^ 2
**T-count UPPER BOUND on the verified modular multiplier.** The SAME Gate term `sqir_modmult_const_gate bits N a` that `sqir_modmult_const_gate_target_decode` proves computes `(a · m) % N` costs at most `56·bits²` T-gates (≤ 8·bits² Toffolis).
theoremtcount_sqir_style_controlledModAddConst_gate_eq
theorem tcount_sqir_style_controlledModAddConst_gate_eq
    (bits q N c ci f : Nat) (hc : c ≠ 0) :
    tcount (sqir_style_controlledModAddConst_gate bits q N c ci f) = 56 * bits
theoremtcount_sqir_modmult_step_gate_eq
theorem tcount_sqir_modmult_step_gate_eq
    (bits N a j : Nat) (h : (a * 2 ^ j) % N ≠ 0) :
    tcount (sqir_modmult_step_gate bits N a j) = 56 * bits
theoremtcount_sqir_modmult_prefix_gate_eq
theorem tcount_sqir_modmult_prefix_gate_eq
    (bits N a k : Nat) (h : ∀ j, j < k → (a * 2 ^ j) % N ≠ 0) :
    tcount (sqir_modmult_prefix_gate bits N a k) = k * (56 * bits)
theoremtcount_sqir_modmult_const_gate_eq
theorem tcount_sqir_modmult_const_gate_eq
    (bits N a : Nat) (h : ∀ j, j < bits → (a * 2 ^ j) % N ≠ 0) :
    tcount (sqir_modmult_const_gate bits N a) = 56 * bits ^ 2
**EXACT T-count of the verified modular multiplier.** `sqir_modmult_const_gate bits N a` (PROVED to compute `(a·m) % N`) costs EXACTLY `56·bits²` T-gates whenever every step constant `(a · 2^j) % N` with `j < bits` is non-zero, so the compiled circuit contains exactly this many T-gates.
theoremmodmult_step_const_ne_zero
theorem modmult_step_const_ne_zero
    (a N : Nat) (hcop : Nat.Coprime a N) (hodd : Odd N) (h1 : 1 < N) (j : Nat) :
    (a * 2 ^ j) % N ≠ 0
theoremtcount_sqir_modmult_const_gate_shor
theorem tcount_sqir_modmult_const_gate_shor
    (bits N a : Nat) (hcop : Nat.Coprime a N) (hodd : Odd N) (h1 : 1 < N) :
    tcount (sqir_modmult_const_gate bits N a) = 56 * bits ^ 2
**EXACT T-count of the verified modular multiplier, for any valid Shor base.**
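A small concrete check can make the side conditions of `modmult_step_const_ne_zero` tangible. The sketch below assumes the sample values N = 15, a = 7 and four steps (j = 0..3); these values and the namespace `StepConstSketch` are illustrative choices, not taken from the source.

```lean
namespace StepConstSketch

-- N = 15 is odd and coprime to a = 7, so each step constant is non-zero.
example :
    (7 * 2 ^ 0) % 15 ≠ 0 ∧ (7 * 2 ^ 1) % 15 ≠ 0 ∧
    (7 * 2 ^ 2) % 15 ≠ 0 ∧ (7 * 2 ^ 3) % 15 ≠ 0 := by decide

-- Without oddness the claim can fail: for N = 8, a = 1 the constant vanishes at j = 3.
example : (1 * 2 ^ 3) % 8 = 0 := by decide

-- Without coprimality it can fail too: for N = 15, a = 15 it vanishes at j = 0.
example : (15 * 2 ^ 0) % 15 = 0 := by decide

end StepConstSketch
```

The two failing cases suggest why the exact-count theorems carry a non-zero hypothesis while the `≤` bounds do not.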
theoremtcount_qubit_swap
theorem tcount_qubit_swap (a b : Nat) : tcount (qubit_swap a b) = 0
theoremtcount_Gate_shift
theorem tcount_Gate_shift (off : Nat) (g : Gate) : tcount (Gate.shift off g) = tcount g
theoremtcount_sqir_swap_acc_mult_aux
theorem tcount_sqir_swap_acc_mult_aux (bits k : Nat) :
    tcount (sqir_swap_acc_mult_aux bits k) = 0
theoremtcount_sqir_swap_acc_mult
theorem tcount_sqir_swap_acc_mult (bits : Nat) : tcount (sqir_swap_acc_mult bits) = 0
theoremtcount_reverse_register_swap_aux
theorem tcount_reverse_register_swap_aux (n oa ob k : Nat) :
    tcount (reverse_register_swap_aux n oa ob k) = 0
theoremtcount_reverse_register_swap
theorem tcount_reverse_register_swap (n oa ob : Nat) :
    tcount (reverse_register_swap n oa ob) = 0
theoremtcount_sqir_encode_to_mult_adapter
theorem tcount_sqir_encode_to_mult_adapter (bits : Nat) :
    tcount (sqir_encode_to_mult_adapter bits) = 0
theoremtcount_sqir_modmult_inplace_candidate_eq
theorem tcount_sqir_modmult_inplace_candidate_eq
    (bits N a ainv : Nat)
    (ha : ∀ j, j < bits → (a * 2 ^ j) % N ≠ 0)
    (hb : ∀ j, j < bits → ((N - ainv) % N * 2 ^ j) % N ≠ 0) :
    tcount (sqir_modmult_inplace_candidate bits N a ainv) = 112 * bits ^ 2
EXACT T-count of the in-place modular multiplier = `112·bits²` when both constant multipliers (`a` and `(N-ainv)%N`) have all non-zero steps.
theoremtcount_sqir_modmult_MCP_gate_eq
theorem tcount_sqir_modmult_MCP_gate_eq
    (bits N a ainv : Nat)
    (ha : ∀ j, j < bits → (a * 2 ^ j) % N ≠ 0)
    (hb : ∀ j, j < bits → ((N - ainv) % N * 2 ^ j) % N ≠ 0) :
    tcount (sqir_modmult_MCP_gate bits N a ainv) = 112 * bits ^ 2
**EXACT T-count of the VERIFIED MCP oracle term** = `112·bits²` (same conditions).
theoremcoprime_modsub
theorem coprime_modsub (ainv N : Nat)
    (hcopinv : Nat.Coprime ainv N) (hpos : 0 < ainv) (hlt : ainv < N) :
    Nat.Coprime ((N - ainv) % N) N
The uncompute constant `(N-ainv)%N` is coprime to `N` when `ainv` is (and `0<ainv<N`).
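As a worked instance of `coprime_modsub`, the sketch below assumes N = 15 and ainv = 13, where 13 happens to be the inverse of 7 modulo 15. These numbers and the namespace `ModSubSketch` are illustrative assumptions; the check uses `Nat.gcd` directly rather than the project's `Nat.Coprime` hypothesis.

```lean
namespace ModSubSketch

-- 13 is the inverse of 7 modulo 15: 7 * 13 = 91 = 6 * 15 + 1.
example : (7 * 13) % 15 = 1 := by decide

-- The uncompute constant is (15 - 13) % 15 = 2.
example : (15 - 13) % 15 = 2 := by decide

-- That constant is still coprime to 15.
example : Nat.gcd ((15 - 13) % 15) 15 = 1 := by decide

end ModSubSketch
```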
theoremtcount_sqir_modmult_MCP_gate_shor
theorem tcount_sqir_modmult_MCP_gate_shor
    (bits N a ainv : Nat) (hcop : Nat.Coprime a N) (hcopinv : Nat.Coprime ainv N)
    (hpos : 0 < ainv) (hlt : ainv < N) (hodd : Odd N) (h1 : 1 < N) :
    tcount (sqir_modmult_MCP_gate bits N a ainv) = 112 * bits ^ 2
**EXACT T-count of the verified MCP oracle, for any valid Shor base + inverse.**

FormalRV.Arithmetic.UnaryLookup

FormalRV/Arithmetic/UnaryLookup.lean
(no documented top-level declarations)

FormalRV.Arithmetic.UnaryLookup.UnaryLookupDefinitions

FormalRV/Arithmetic/UnaryLookup/UnaryLookupDefinitions.lean
## Register indexing for the unary lookup circuit
Layout (top to bottom): ctrl[0], then `n_addr` pairs of (address[i], and[i]), then `n_word` word qubits.
Index assignment:
- ctrl_idx = 0
- address_idx i = 1 + 2*i (i = 0..n_addr-1)
- and_idx i = 1 + 2*i + 1 (i = 0..n_addr-1)
- word_idx n_addr j = 1 + 2*n_addr + j (j = 0..n_word-1)
defulookup_ctrl_idx
def ulookup_ctrl_idx : Nat
Qubit index for the controller bit (top wire in Fig. 4(b)).
defulookup_address_idx
def ulookup_address_idx (i : Nat) : Nat
Qubit index for the i-th address bit.
defulookup_and_idx
def ulookup_and_idx (i : Nat) : Nat
Qubit index for the i-th ancilla AND bit (interleaved with address).
defulookup_word_idx
def ulookup_word_idx (n_addr j : Nat) : Nat
Qubit index for the j-th word bit, given the number of address bits.
defunary_lookup_n_qubits
def unary_lookup_n_qubits (n_addr n_word : Nat) : Nat
Total qubits required for an `n_addr`-address-bit, `n_word`-word-bit unary lookup: 1 + 2*n_addr + n_word.
defunary_lookup_stub
def unary_lookup_stub (_n_addr _n_word : Nat) : Gate
Placeholder: the empty lookup (Iter 1 only encodes indexing).
defprefix_and_step
def prefix_and_step (i : Nat) : Gate
One step of the prefix-AND cascade at bit `i`:
- i=0 → CCX(ctrl, address[0], and[0])
- i>0 → CCX(and[i-1], address[i], and[i])
Faithful translation of `PyCircuits/lookups/unary_lookup_qrisp.py:build_prefix_and_cascade`.
defprefix_and_cascade
def prefix_and_cascade : Nat → Gate
  | 0       => Gate.I
  | n + 1   => Gate.seq (prefix_and_cascade n) (prefix_and_step n)
The full forward prefix-AND cascade for `n_addr` address bits, composed via `Gate.seq`.
defprefix_and_uncompute_step
def prefix_and_uncompute_step (i : Nat) : Gate
One reverse step of the prefix-AND cascade — same gate as the forward step (CCX is self-inverse) but emitted in reverse order in `prefix_and_uncompute`. Provided as a separate def for clarity even though structurally `prefix_and_step` already encodes the gate.
defprefix_and_uncompute
def prefix_and_uncompute : Nat → Gate
  | 0       => Gate.I
  | n + 1   => Gate.seq (prefix_and_step n) (prefix_and_uncompute n)
The full reverse uncomputation cascade. Emits `prefix_and_step (n-1)`, then `prefix_and_step (n-2)`, ..., then `prefix_and_step 0`. Together with `prefix_and_cascade n`, it forms the no-measurement upper bound: `2n` Toffolis in total.
defprefix_and_compute_and_uncompute
def prefix_and_compute_and_uncompute (n : Nat) : Gate
The no-measurement upper bound: forward + reverse cascade uses `2n` Toffolis. This represents the gate-level cost WITHOUT the Gidney-style measurement trick. The paper's optimization gets the per-iteration cost down to `n` (forward only, reverse uses measurements).
defx_gates_from_indices
def x_gates_from_indices : List Nat → Gate
  | []      => Gate.I
  | i :: xs => Gate.seq (x_gates_from_indices xs) (Gate.X i)
Helper: emit X gates at each index in the list.
defcx_gates_from_indices
def cx_gates_from_indices (ctrl : Nat) : List Nat → Gate
  | []        => Gate.I
  | tgt :: xs => Gate.seq (cx_gates_from_indices ctrl xs) (Gate.CX ctrl tgt)
Helper: emit CX gates with a fixed control and each target in the list.
defunary_lookup_iteration
def unary_lookup_iteration (n_addr : Nat)
    (addr_flip_idxs word_cnot_idxs : List Nat) : Gate
One iteration of the unary lookup loop targeting a specific address value. `addr_flip_idxs` is the list of address-bit indices to X-flip (so the cascade fires for the target value). `word_cnot_idxs` is the list of word-bit indices to write (per the table row at that address).
defunary_lookup_multi_iteration
def unary_lookup_multi_iteration (n_addr : Nat) :
    List (List Nat × List Nat) → Gate
  | []                     => Gate.I
  | (flips, cnots) :: rest =>
      Gate.seq (unary_lookup_multi_iteration n_addr rest)
               (unary_lookup_iteration n_addr flips cnots)
Compose `unary_lookup_iteration` for a list of `(addr_flips, word_cnots)` data tuples. Each tuple is one iteration of the lookup loop.
defprefix_and_step_post_state
def prefix_and_step_post_state (i : Nat) (f : Nat → Bool) : Nat → Bool
Per-step post-state: applying `prefix_and_step i` XORs `(prev ∧ address[i])` into `and[i]`, where `prev = ctrl` at i=0 and `and[i-1]` at i>0.
defprefix_and_cascade_post_state
def prefix_and_cascade_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0    , f => f
  | n + 1, f => prefix_and_step_post_state n (prefix_and_cascade_post_state n f)
Cascade post-state: fold of per-step post-states over bits 0..n-1. Matches the recursive structure of `prefix_and_cascade`.
structureULookupBitDisjointness
structure ULookupBitDisjointness (dim i : Nat) : Prop
Disjointness bundle for a single bit of the lookup prefix-AND cascade. The five conditions follow from the indexing structure `ulookup_*_idx i = 1 + 2*i (+1)`.
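The disjointness conditions come down to parity and offset arguments on the index formulas. The sketch below re-declares toy copies of the documented index functions (`ctrlIdx`, `addressIdx`, `andIdx` are hypothetical names, not the project's `ulookup_*_idx`) and checks three representative facts; it does not reproduce the actual fields of `ULookupBitDisjointness`.

```lean
namespace IndexSketch

-- Toy copies of the documented layout (hypothetical names).
def ctrlIdx : Nat := 0
def addressIdx (i : Nat) : Nat := 1 + 2 * i
def andIdx (i : Nat) : Nat := 1 + 2 * i + 1

-- Address wires are odd and AND wires are even, so they never collide.
example (i j : Nat) : addressIdx i ≠ andIdx j := by
  unfold addressIdx andIdx; omega

-- The control wire 0 lies strictly below every address wire.
example (i : Nat) : ctrlIdx ≠ addressIdx i := by
  unfold ctrlIdx addressIdx; omega

-- Consecutive AND ancillas are distinct, as the chain step
-- CCX(and[i], address[i+1], and[i+1]) requires.
example (i : Nat) : andIdx i ≠ andIdx (i + 1) := by
  unfold andIdx; omega

end IndexSketch
```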
defgray_code_unary_lookup_toffoli_count
def gray_code_unary_lookup_toffoli_count (n_addr q_a : Nat) : Nat
Gray-code-amortized Toffoli count for a q_a-bit unary lookup: `n_addr` (initial cascade) + `(2^q_a - 1)` (one Toffoli per subsequent iteration).
defgray_code_residual_ratio
def gray_code_residual_ratio (n_addr q_a : Nat) : Nat × Nat
**Two-step closure roadmap**: the lookup review-gap (12× at q_a=6) decomposes as 2× (Gidney AND, closed Iter 43-44) × 6× (Gray-code, scaffolded here). With Gray-code Toffoli count = `n_addr + 2^q_a - 1`, the ratio Lean-Gray-code / paper-claim is `(n_addr + 2^q_a - 1) / 2^q_a ≈ 1.08` at n_addr = q_a = 6 (69/64), **down from 6× to ~8% residual**. The residual is the initial-cascade bookkeeping discussed above.
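The numbers in this roadmap can be re-derived with a few lines of arithmetic. The sketch assumes the count formula exactly as documented (`n_addr + (2^q_a - 1)`); `grayCount` is a hypothetical toy name, and the actual return value of `gray_code_residual_ratio` is not modelled.

```lean
namespace GrayCountSketch

-- Toy copy of the documented formula (hypothetical name).
def grayCount (nAddr qA : Nat) : Nat := nAddr + (2 ^ qA - 1)

-- At n_addr = q_a = 6: 6 + 63 = 69 Toffolis, against the paper's 2^6 = 64.
example : grayCount 6 6 = 69 := by decide

-- Ratio 69/64 in percent, rounded down: 107, i.e. roughly 8% residual.
example : 69 * 100 / 64 = 107 := by decide

-- For comparison, the no-measurement Toffoli count 2·n_addr·2^q_a is 12 · 64.
example : 2 * 6 * 2 ^ 6 = 12 * 64 := by decide

end GrayCountSketch
```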
abbrevzeroFLook
abbrev zeroFLook : Nat → Bool
The all-zero input function (local re-abbreviation; the adder side defines a `zeroF` in its own namespace).
definputF_lookup_ctrl_addr_10
def inputF_lookup_ctrl_addr_10 : Nat → Bool
  | 0 => true   -- ctrl = 1
  | 1 => true   -- address_0 = 1
  | _ => false  -- and_0, address_1, and_1, ... = 0
Input for the lookup: ctrl=1 (qubit 0), address_0=1 (qubit 1), everything else false (and_0, address_1, and_1 all 0).
definputF_lookup_ctrl_addr_11
def inputF_lookup_ctrl_addr_11 : Nat → Bool
  | 0 => true   -- ctrl = 1
  | 1 => true   -- address_0 = 1
  | 3 => true   -- address_1 = 1
  | _ => false
**Another variant**: with `ctrl=1, address=11`, both AND ancillas should end up set to 1.
definputF_lookup_q3_addr_110
def inputF_lookup_q3_addr_110 : Nat → Bool
  | 0 => true   -- ctrl = 1
  | 1 => true   -- addr_0 = 1
  | 3 => true   -- addr_1 = 1
  | _ => false  -- addr_2 = 0; and ancillas all 0
Input for `q_a = 3` lookup: ctrl=1, address = (1, 1, 0) LSB-first. The cascade should compute:
- and_0 = ctrl ∧ addr_0 = 1 ∧ 1 = 1
- and_1 = and_0 ∧ addr_1 = 1 ∧ 1 = 1
- and_2 = and_1 ∧ addr_2 = 1 ∧ 0 = 0
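The expected ancilla values for this input can be replayed on a purely classical toy model. The sketch below is an assumption-laden stand-in: `upd`, `step`, `cascade` and `input110` are hypothetical names that mimic the documented post-state definitions with the interleaved layout (address[i] at `1 + 2*i`, and[i] at `1 + 2*i + 1`), and Boolean `!=` plays the role of XOR.

```lean
namespace CascadeSketch

-- Toy point update on classical states (hypothetical helper).
def upd (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = k then v else f j

-- Toy per-step post-state: XOR (prev ∧ address[i]) into and[i],
-- where prev is ctrl at i = 0 and and[i-1] otherwise.
def step (i : Nat) (f : Nat → Bool) : Nat → Bool :=
  let prev := if i = 0 then f 0 else f (1 + 2 * (i - 1) + 1)
  upd f (1 + 2 * i + 1) (f (1 + 2 * i + 1) != (prev && f (1 + 2 * i)))

-- Fold of the per-step post-states over bits 0..n-1.
def cascade : Nat → (Nat → Bool) → (Nat → Bool)
  | 0,     f => f
  | n + 1, f => step n (cascade n f)

-- ctrl = 1, address = (1, 1, 0) LSB-first, all AND ancillas 0.
def input110 : Nat → Bool
  | 0 => true
  | 1 => true
  | 3 => true
  | _ => false

example : cascade 3 input110 2 = true  := by decide  -- and_0
example : cascade 3 input110 4 = true  := by decide  -- and_1
example : cascade 3 input110 6 = false := by decide  -- and_2

end CascadeSketch
```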
definputF_lookup_q3_addr_111
def inputF_lookup_q3_addr_111 : Nat → Bool
  | 0 => true   -- ctrl
  | 1 => true   -- addr_0
  | 3 => true   -- addr_1
  | 5 => true   -- addr_2
  | _ => false  -- and ancillas
Input with all 3 address bits set: ctrl=1, address = (1, 1, 1). All ANDs fire to 1.
defLookup.address_and
def Lookup.address_and (ctrl : Bool) (addr : Nat) : Nat → Bool
  | 0     => ctrl
  | n + 1 => Lookup.address_and ctrl addr n && addr.testBit n
**Math AND of `ctrl` with the first `n` bits of `addr`**. `address_and ctrl addr n = ctrl ∧ addr.testBit 0 ∧ ... ∧ addr.testBit (n-1)`.
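A quick evaluation shows how the recursion unfolds on a small address. The sketch repeats the documented recursion under the hypothetical name `addressAnd` so that it is self-contained; the sample address 3 (binary 011) is an illustrative choice.

```lean
namespace AddressAndSketch

-- Toy copy of the documented recursion (hypothetical name).
def addressAnd (ctrl : Bool) (addr : Nat) : Nat → Bool
  | 0     => ctrl
  | n + 1 => addressAnd ctrl addr n && addr.testBit n

-- addr = 3 = 0b011: bits 0 and 1 are set, bit 2 is clear.
#eval addressAnd true 3 2   -- true:  ctrl ∧ bit 0 ∧ bit 1
#eval addressAnd true 3 3   -- false: bit 2 is clear
#eval addressAnd false 3 2  -- false: ctrl is clear

end AddressAndSketch
```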
defLookup.cascade_step_invariant
def Lookup.cascade_step_invariant (k n : Nat) (ctrl : Bool) (addr : Nat)
    (post : Nat → Bool) : Prop
**Step-indexed prefix-AND cascade invariant** (Iter 219, analog of Iter 175's `Gidney.propagation_step_invariant`). After `k` steps of the prefix-AND cascade applied to a state where `f(ctrl_idx) = ctrl`, `f(address_idx i) = addr.testBit i`, and `f(and_idx i) = false` for all i < n:
- For i < k (computed): post(and_idx i) = ctrl ∧ ⋀_{j ≤ i} addr.testBit j.
- For i ≥ k (untouched): post(and_idx i) = false.
defLookup.x_flip_post_state
def Lookup.x_flip_post_state : List Nat → (Nat → Bool) → (Nat → Bool)
  | [], f => f
  | i :: xs, f =>
    let f'
**Classical post-state of `x_gates_from_indices xs`**: starting from `f`, apply X-flips to the indices in `xs` in the order matching the Gate.seq nesting (tail first, head last). With unique indices, the net effect is to XOR each listed position with `true`.
defLookup.cnot_layer_post_state
def Lookup.cnot_layer_post_state (ctrl : Nat) : List Nat → (Nat → Bool) → (Nat → Bool)
  | [], f => f
  | tgt :: xs, f =>
    let f'
**Classical post-state of `cx_gates_from_indices ctrl xs`**: each CX(ctrl, tgt) does `tgt := tgt ⊕ ctrl`. In the order matching the Gate.seq nesting, the tail is applied first. **Crucially**: the control wire `ctrl` is never the target of any CX in this layer, so its value is preserved across the layer (see `cnot_layer_post_state_ctrl_unchanged` below).
defprefix_and_uncompute_post_state
def prefix_and_uncompute_post_state : Nat → (Nat → Bool) → (Nat → Bool)
  | 0    , f => f
  | n + 1, f => prefix_and_uncompute_post_state n (prefix_and_step_post_state n f)
**Boolean post-state of the reverse cascade**: applies `prefix_and_step_post_state` in the reverse order (n-1, n-2, ..., 0), matching `prefix_and_uncompute n = seq (step (n-1)) (...) (step 0)`.
defLookup.iteration_post_state
def Lookup.iteration_post_state
    (n_addr : Nat) (addr_flip_idxs word_cnot_idxs : List Nat)
    (f : Nat → Bool) : Nat → Bool
**Boolean post-state of `unary_lookup_iteration`**. The 5-stage composition mirrors the Gate.seq structure of `unary_lookup_iteration`: `flips · cascade · cnots · uncompute · flips`.
defLookup.AllWordIdx
def Lookup.AllWordIdx (n_addr : Nat) (xs : List Nat) : Prop
**All elements of `xs` are word-register indices** (i.e., ≥ 1 + 2·n_addr). Captures the structural condition that CNOT targets in a lookup iteration write to the word register, not the ctrl/address/and registers.
defLookup.multi_iteration_post_state
def Lookup.multi_iteration_post_state (n_addr : Nat) :
    List (List Nat × List Nat) → (Nat → Bool) → (Nat → Bool)
  | [],                     f => f
  | (flips, cnots) :: rest, f =>
      Lookup.iteration_post_state n_addr flips cnots
        (Lookup.multi_iteration_post_state n_addr rest f)
**Boolean post-state of `unary_lookup_multi_iteration`**. Recursive fold matching the gate-level structure: each `(flips, cnots)` tuple in the iter list contributes one application of `iteration_post_state`.
defLookup.iter_triggers
def Lookup.iter_triggers (ctrl : Bool) (addr : Nat) (n_addr : Nat)
    (flips : List Nat) : Bool
**Iter trigger predicate** (pure classical): true iff the iter's prefix-AND chain fires on input `(ctrl, addr)`, that is, iff `ctrl` is true and the effective address (addr XOR flip mask) is all-ones on the first `n_addr` bits. In terms of `Lookup.address_and`, this is `Lookup.address_and ctrl effective_addr n_addr` where `effective_addr.testBit i = xor (addr.testBit i) (decide (ulookup_address_idx i ∈ flips))`.
defLookup.multi_iteration_xor_value
def Lookup.multi_iteration_xor_value
    (ctrl : Bool) (addr : Nat) (n_addr : Nat) :
    List (List Nat × List Nat) → Nat → Bool
  | [], _ => false
  | (flips, cnots) :: rest, p =>
      xor (decide (p ∈ cnots) && Lookup.iter_triggers ctrl addr n_addr flips)
          (Lookup.multi_iteration_xor_value ctrl addr n_addr rest p)
**Multi-iteration XOR contribution at a word position** (pure classical). For a word position `p`, the boolean XOR contribution is `XOR` over all iters of `(p ∈ cnots_i) AND (iter_i triggers)`.
defLookup.effective_addr
def Lookup.effective_addr (addr : Nat) (flips : List Nat) : Nat → Nat
  | 0     => 0
  | n + 1 =>
    let lower
**Effective address Nat construction (Iter 253 reform via `Nat.lor`)**. Recursively builds a Nat whose i-th bit (for i < n) equals `xor (addr.testBit i) (decide (ulookup_address_idx i ∈ flips))`. Uses bitwise OR (`|||`) instead of addition. With OR, the testBit characterization (Iter 254) is straightforward via `Nat.testBit_or` and `Nat.testBit_two_pow`.
defLookup.multi_iteration_xor_value_via_address_and
def Lookup.multi_iteration_xor_value_via_address_and
    (ctrl : Bool) (addr : Nat) (n_addr : Nat) :
    List (List Nat × List Nat) → Nat → Bool
  | [], _ => false
  | (flips, cnots) :: rest, p =>
      xor (decide (p ∈ cnots) &&
           Lookup.address_and ctrl
             (Lookup.effective_addr addr flips n_addr) n_addr)
          (Lookup.multi_iteration_xor_value_via_address_and ctrl addr n_addr rest p)
**Classical XOR contribution at a word position** (via address_and). Recursive fold matching the multi-iter post-state structure.

FormalRV.Arithmetic.UnaryLookup.UnaryLookupGateDerivations

FormalRV/Arithmetic/UnaryLookup/UnaryLookupGateDerivations.lean
## Smoke tests matching Fig. 4(b)'s example (n_addr=3, n_word=6)
example(example)
example : unary_lookup_n_qubits 3 6 = 13
1 + 2·3 + 6 = 13 qubits, matching the 13 horizontal wires in Fig. 4(b)'s example diagram.
example(example)
example : ulookup_ctrl_idx = 0
ctrl[0] is wire 0.
example(example)
example : ulookup_address_idx 0 = 1 ∧ ulookup_and_idx 0 = 2
address[0] is wire 1, and[0] is wire 2 (highlighted red in the figure).
example(example)
example : ulookup_address_idx 1 = 3 ∧ ulookup_and_idx 1 = 4
address[1] is wire 3, and[1] is wire 4 (also highlighted red).
example(example)
example : ulookup_address_idx 2 = 5 ∧ ulookup_and_idx 2 = 6
address[2] is wire 5, and[2] is wire 6.
example(example)
example : ulookup_word_idx 3 0 = 7 ∧ ulookup_word_idx 3 5 = 12
word[0..5] are wires 7..12.
example(example)
example : tcount (unary_lookup_stub 3 6) = 0
Smoke: stub has T-count 0 (placeholder; real circuit has many).
theoremgcount_prefix_and_step
theorem gcount_prefix_and_step (i : Nat) : gcount (prefix_and_step i) = 1
Each cascade step is exactly 1 Toffoli (`gcount = 1`), regardless of which branch of the `if` fires.
theoremtcount_prefix_and_step
theorem tcount_prefix_and_step (i : Nat) : tcount (prefix_and_step i) = 7
Each cascade step is exactly 7 T-gates (`tcount = 7`).
example(example)
example : tcount (prefix_and_step 0) = 7
example(example)
example : tcount (prefix_and_step 5) = 7
example(example)
example : gcount (prefix_and_step 7) = 1
example(example)
example : tcount (prefix_and_cascade 3) = 21
The 3-bit prefix-AND cascade has exactly 3 Toffolis = 21 T-gates.
example(example)
example : gcount (prefix_and_cascade 3) = 3
theoremgcount_prefix_and_cascade
theorem gcount_prefix_and_cascade (n : Nat) :
    gcount (prefix_and_cascade n) = n
General Toffoli count: an `n`-bit prefix-AND cascade has exactly `n` Toffolis. **Gate-derived** from the recursive definition — no paper claim involved. Matches Iter 5 Python's predicted `n_addr` Toffolis.
theoremtcount_prefix_and_cascade
theorem tcount_prefix_and_cascade (n : Nat) :
    tcount (prefix_and_cascade n) = 7 * n
T-count of the cascade is `7n` (one Toffoli = 7 T after decomposition).
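The `7n` count follows by induction on the recursive definition. The sketch below shows the shape of such a proof on a deliberately reduced toy gate IR; `Gate`, `tcount` and `chain` here are hypothetical stand-ins inside the namespace `TCountSketch`, and the qubit indices in `chain` are arbitrary since they do not affect the count.

```lean
namespace TCountSketch

-- Toy gate IR (hypothetical; the real `Gate` type has more constructors).
inductive Gate where
  | I
  | CCX (a b c : Nat)
  | seq (g₁ g₂ : Gate)

-- Toy T-count: 7 per Toffoli, additive over sequencing.
def tcount : Gate → Nat
  | .I         => 0
  | .CCX _ _ _ => 7
  | .seq g₁ g₂ => tcount g₁ + tcount g₂

-- A chain with one Toffoli per step, shaped like `prefix_and_cascade`.
def chain : Nat → Gate
  | 0     => .I
  | n + 1 => .seq (chain n) (.CCX n (n + 1) (n + 2))

theorem tcount_chain (n : Nat) : tcount (chain n) = 7 * n := by
  induction n with
  | zero => rfl
  | succ k ih =>
    -- Unfolding one layer is definitional; the remaining arithmetic is linear.
    show tcount (chain k) + 7 = 7 * (k + 1)
    omega

end TCountSketch
```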
theoremgcount_prefix_and_uncompute_step
theorem gcount_prefix_and_uncompute_step (i : Nat) :
    gcount (prefix_and_uncompute_step i) = 1
Each uncompute step is exactly 1 Toffoli.
theoremgcount_prefix_and_uncompute
theorem gcount_prefix_and_uncompute (n : Nat) :
    gcount (prefix_and_uncompute n) = n
Toffoli count of the reverse cascade: also exactly `n` Toffolis.
theoremtcount_prefix_and_uncompute
theorem tcount_prefix_and_uncompute (n : Nat) :
    tcount (prefix_and_uncompute n) = 7 * n
T-count of the reverse cascade: `7n`.
theoremgcount_prefix_and_compute_and_uncompute
theorem gcount_prefix_and_compute_and_uncompute (n : Nat) :
    gcount (prefix_and_compute_and_uncompute n) = 2 * n
theoremtcount_x_gates_zero
theorem tcount_x_gates_zero (xs : List Nat) : tcount (x_gates_from_indices xs) = 0
All X-gate sequences are T-count zero.
theoremtcount_cx_gates_zero
theorem tcount_cx_gates_zero (ctrl : Nat) (xs : List Nat) :
    tcount (cx_gates_from_indices ctrl xs) = 0
All CX-gate sequences are T-count zero.
theoremgcount_x_gates_from_indices
theorem gcount_x_gates_from_indices (xs : List Nat) :
    gcount (x_gates_from_indices xs) = xs.length
Gate-count of `x_gates_from_indices xs` is the list length: one X per index, identity contributes 0.
theoremgcount_cx_gates_from_indices
theorem gcount_cx_gates_from_indices (ctrl : Nat) (xs : List Nat) :
    gcount (cx_gates_from_indices ctrl xs) = xs.length
Gate-count of `cx_gates_from_indices ctrl xs` is the list length.
theoremtcount_unary_lookup_iteration
theorem tcount_unary_lookup_iteration (n_addr : Nat)
    (addr_flip_idxs word_cnot_idxs : List Nat) :
    tcount (unary_lookup_iteration n_addr addr_flip_idxs word_cnot_idxs)
      = 14 * n_addr
The iteration body has T-count `14 · n_addr` regardless of the address pattern or word pattern (only the two cascades contribute T).
theoremgcount_unary_lookup_iteration
theorem gcount_unary_lookup_iteration (n_addr : Nat)
    (addr_flip_idxs word_cnot_idxs : List Nat) :
    gcount (unary_lookup_iteration n_addr addr_flip_idxs word_cnot_idxs)
      = 2 * addr_flip_idxs.length + 2 * n_addr + word_cnot_idxs.length
**Gate-count of one iteration body**: `2·|addr_flips| + 2·n_addr + |word_cnots|`. Decomposes as: forward+reverse X-flip layers contribute `2·|addr_flips|`; forward+reverse prefix-AND cascades contribute `2·n_addr`; word CNOTs contribute `|word_cnots|`. Derived purely from the gate sequence of `unary_lookup_iteration`, with no paper-claim constant. This is the **leaf gate-count review claim** for one iteration body, mirroring `tcount_unary_lookup_iteration` (Iter 14) but at the structural (all-gate) level.
theoremtcount_unary_lookup_multi_iteration
theorem tcount_unary_lookup_multi_iteration (n_addr : Nat)
    (iters : List (List Nat × List Nat)) :
    tcount (unary_lookup_multi_iteration n_addr iters)
      = 14 * n_addr * iters.length
T-count of the multi-iteration cascade: `14 · n_addr · |iters|` regardless of the data carried in the iterations (each iteration contributes a fixed `14 · n_addr`, only Toffolis matter).
example(example)
example :
    tcount (unary_lookup_multi_iteration 3
              [([], []), ([], []), ([], []), ([], []),
               ([], []), ([], []), ([], []), ([], [])])
      = 336
Concrete: at n_addr=3 with 8 iterations (= 2^3), total T-count is `14 · 3 · 8 = 336`. This is the **no-measurement** bound; the paper's `2^q_a = 8` Toffolis = 56 T requires Gidney measurement + Gray-code amortization.
theoremgcount_unary_lookup_multi_iteration
theorem gcount_unary_lookup_multi_iteration (n_addr : Nat)
    (iters : List (List Nat × List Nat)) :
    gcount (unary_lookup_multi_iteration n_addr iters)
      = 2 * n_addr * iters.length
        + 2 * (iters.map (fun p => p.1.length)).sum
        + (iters.map (fun p => p.2.length)).sum
**Gate-count of the multi-iteration cascade** (Iter 77): each iteration contributes its data-dependent gcount `2·|flips_i| + 2·n_addr + |cnots_i|` (Iter 76 leaf), and the multi-iteration gcount is the sum of those. Expressed as a sum: total gates = `2·n_addr · |iters| + 2 · (Σᵢ |flipsᵢ|) + (Σᵢ |cnotsᵢ|)`. Derived purely from the gate sequence via induction on the iter-list, using `gcount_unary_lookup_iteration` (Iter 76) at each step. Mirrors `tcount_unary_lookup_multi_iteration` but aggregates data-dependent gate counts (vs T-count's uniform `14 · n_addr` per iteration).
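The closed form can be checked on a small data set. The sketch below sums the documented per-iteration count over a two-element list and compares it with the closed form; `iterGcount` and `multiGcount` are hypothetical toy functions over the data tuples only, not the project's `gcount` on gates, and the sample indices are arbitrary.

```lean
namespace GcountSketch

-- Toy per-iteration gate count, copied from the documented closed form.
def iterGcount (nAddr : Nat) (p : List Nat × List Nat) : Nat :=
  2 * p.1.length + 2 * nAddr + p.2.length

-- Toy multi-iteration count: add up the per-iteration counts.
def multiGcount (nAddr : Nat) (iters : List (List Nat × List Nat)) : Nat :=
  iters.foldl (fun acc p => acc + iterGcount nAddr p) 0

-- Two iterations at n_addr = 3 with 1 + 2 = 3 flips and 2 + 1 = 3 CNOTs:
-- 2·3·2 + 2·3 + 3 = 21 gates.
example :
    multiGcount 3 [([1], [7, 8]), ([1, 3], [9])] = 2 * 3 * 2 + 2 * 3 + 3 := by
  decide

end GcountSketch
```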
theoremunary_lookup_two_factor_gap
theorem unary_lookup_two_factor_gap (n_addr : Nat)
    (iters : List (List Nat × List Nat)) :
    tcount (unary_lookup_multi_iteration n_addr iters)
      = 2 * n_addr * (7 * iters.length)
**Lookup review finding theorem**: the no-measurement / no-Gray-code T-count of the n_addr-bit unary lookup with `iters.length` iterations is `2 · n_addr` times the paper's T-count of `7 · iters.length` (one Toffoli, i.e. 7 T-gates, per iteration). At `iters.length = 2^q_a`, this gives the full two-factor gap.
example(example)
example :
    tcount (unary_lookup_multi_iteration 6
              (List.replicate 64 ([], [])))
      = 5376
Concrete at q_a=6 (RSA-2048 case), simulated with a list of 64 empty-data iterations: `14 · 6 · 64 = 5376` T-gates (Lean no-measurement) vs `7 · 64 = 448` T-gates (paper, with full optimization). The two-factor gap is `2 · 6 = 12` ×.
example(example)
example : 5376 = 12 * 448
The 12× gap at q_a=6 is exactly `2 · n_addr` — formally captured.
theoremunary_lookup_factor_decomposition_2_times_n_addr
theorem unary_lookup_factor_decomposition_2_times_n_addr
    (n_addr : Nat) (iters : List (List Nat × List Nat)) :
    tcount (unary_lookup_multi_iteration n_addr iters)
      = 2 * (n_addr * (7 * iters.length))
**Review factor decomposition** (Iter 119): the `2 · n_addr` multiplier of `unary_lookup_two_factor_gap` factors into:
- **2**: no-measurement factor (matches the adder's `gidney_adder_full_faithful_no_measurement_vs_measurement_factor` from Iter 88; uses explicit-reverse instead of Gidney's measurement-AND trick).
- **n_addr**: no-Gray-code factor (lookup-specific; qianxu's Gray-code amortization reduces n_addr Toffolis per cascade to 1 amortized across consecutive iterations).
Concrete decomposition at q_a=6 (RSA-2048 inner-product lookup): Lean T-count = 12 × paper claim = (2 measurement × 6 Gray-code) × paper.
theoremunary_lookup_factor_decomposition_n_addr_times_2
theorem unary_lookup_factor_decomposition_n_addr_times_2
    (n_addr : Nat) (iters : List (List Nat × List Nat)) :
    tcount (unary_lookup_multi_iteration n_addr iters)
      = n_addr * (2 * (7 * iters.length))
Mirror decomposition: `n_addr · (2 · 7 · iters.length)`. Same total but groups by the Gray-code factor first.
theoremprefix_and_step_zero_correct
theorem prefix_and_step_zero_correct (dim : Nat) (f : Nat → Bool)
    (h0 : ulookup_ctrl_idx < dim)
    (h1 : ulookup_address_idx 0 < dim)
    (h2 : ulookup_and_idx 0 < dim) :
    uc_eval (Gate.toUCom dim (prefix_and_step 0)) * f_to_vec dim f
      = f_to_vec dim
          (update f (ulookup_and_idx 0)
            (xor (f (ulookup_and_idx 0))
                 (f ulookup_ctrl_idx && f (ulookup_address_idx 0))))
**`prefix_and_step 0` correctness**: on a classical basis state, the i=0 step XORs `(ctrl ∧ address[0])` into `and[0]`. The standard unary-cascade base case. Proven via the Iter 52 reusable `gate_ccx_acts_on_basis` framework.
theoremprefix_and_step_succ_correct
theorem prefix_and_step_succ_correct (dim i : Nat) (f : Nat → Bool)
    (h_and_i  : ulookup_and_idx i < dim)
    (h_addr   : ulookup_address_idx (i + 1) < dim)
    (h_and_i1 : ulookup_and_idx (i + 1) < dim) :
    uc_eval (Gate.toUCom dim (prefix_and_step (i + 1))) * f_to_vec dim f
      = f_to_vec dim
          (update f (ulookup_and_idx (i + 1))
            (xor (f (ulookup_and_idx (i + 1)))
                 (f (ulookup_and_idx i) && f (ulookup_address_idx (i + 1)))))
**`prefix_and_step (i+1)` correctness**: on a classical basis state, the i>0 step XORs `(and[i] ∧ address[i+1])` into `and[i+1]`. The chain step of the unary cascade. Proven via `gate_ccx_acts_on_basis`.
example(example)
example (dim : Nat) (f : Nat → Bool)
    (h_and_2 : ulookup_and_idx 2 < dim)
    (h_addr_3 : ulookup_address_idx 3 < dim)
    (h_and_3 : ulookup_and_idx 3 < dim) :
    uc_eval (Gate.toUCom dim (prefix_and_step 3)) * f_to_vec dim f
      = f_to_vec dim
          (update f (ulookup_and_idx 3)
            (xor (f (ulookup_and_idx 3))
                 (f (ulookup_and_idx 2) && f (ulookup_address_idx 3))))
Concrete: this is the statement of `prefix_and_step_succ_correct` at i=2, that is, `prefix_and_step 3` (a chain step). The per-step action XORs `and[2] ∧ address[3]` into `and[3]`. Note that `prefix_and_step 3` takes the i>0 branch (since 3 ≠ 0).
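The per-step statements can be mirrored at the Boolean level. The following is a minimal sketch, not the FormalRV definitions: `ctrlIdx`, `addrIdx`, `andIdx`, `upd`, `stepPost` and `input` are hypothetical stand-ins, and the index layout (control at 0, `address_i` at `1 + 2*i`, `and_i` at `2 + 2*i`) is assumed from the parity remarks elsewhere in this file.

```lean
namespace StepSketch

-- Assumed qubit layout: ctrl at 0, address_i at 1 + 2*i, and_i at 2 + 2*i.
def ctrlIdx : Nat := 0
def addrIdx (i : Nat) : Nat := 1 + 2 * i
def andIdx (i : Nat) : Nat := 2 + 2 * i

-- Pointwise update of a classical bit assignment.
def upd (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = k then v else f j

-- Toy post-state of one step; `!=` on `Bool` is XOR.
def stepPost (i : Nat) (f : Nat → Bool) : Nat → Bool :=
  match i with
  | 0     => upd f (andIdx 0) (f (andIdx 0) != (f ctrlIdx && f (addrIdx 0)))
  | k + 1 => upd f (andIdx (k + 1))
               (f (andIdx (k + 1)) != (f (andIdx k) && f (addrIdx (k + 1))))

-- Input with and_2 (qubit 6) and address_3 (qubit 7) set, everything else clear.
def input : Nat → Bool := fun j => j == 6 || j == 7

-- Step 3 takes the chain branch and sets and_3 (qubit 8) to and_2 ∧ address_3.
example : stepPost 3 input (andIdx 3) = true := by decide
-- Other qubits are untouched, for instance the control.
example : stepPost 3 input ctrlIdx = input ctrlIdx := by decide

end StepSketch
```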
theoremprefix_and_step_correct
theorem prefix_and_step_correct (dim i : Nat) (f : Nat → Bool)
    (h_ctrl : ulookup_ctrl_idx < dim)
    (h_and_pred : ulookup_and_idx (i - 1) < dim)
    (h_addr : ulookup_address_idx i < dim)
    (h_and : ulookup_and_idx i < dim) :
    uc_eval (Gate.toUCom dim (prefix_and_step i)) * f_to_vec dim f
      = f_to_vec dim (prefix_and_step_post_state i f)
*Unified per-step correctness**: combines the i=0 and i>0 cases via the new `prefix_and_step_post_state`. Useful as the inductive step in the cascade correctness proof below.
theoremprefix_and_step_involutive
theorem prefix_and_step_involutive (dim i : Nat) (f : Nat → Bool)
    (h_ctrl : ulookup_ctrl_idx < dim)
    (h_and_pred : ulookup_and_idx (i - 1) < dim)
    (h_addr : ulookup_address_idx i < dim)
    (h_and : ulookup_and_idx i < dim) :
    uc_eval (Gate.toUCom dim (Gate.seq (prefix_and_step i) (prefix_and_step i)))
      * f_to_vec dim f
      = f_to_vec dim f
*Lookup `prefix_and_step` is involutive at the gate-IR level.** For any `i`, applying `prefix_and_step i` twice acts as identity on classical basis states. Direct lift of `gate_ccx_ccx_id_on_basis` via case-splitting on `i = 0`. **First Verified-tier lookup-side involution** — building block for Iter 71's `prefix_and_cascade · prefix_and_uncompute = identity` proof (the lookup analog of Iter 69's adder-side closure).
theoremprefix_and_step_step_eq_one
theorem prefix_and_step_step_eq_one (dim i : Nat)
    (h_ctrl : ulookup_ctrl_idx < dim)
    (h_and_pred : ulookup_and_idx (i - 1) < dim)
    (h_addr : ulookup_address_idx i < dim)
    (h_and : ulookup_and_idx i < dim) :
    uc_eval (Gate.toUCom dim (Gate.seq (prefix_and_step i) (prefix_and_step i)))
      = (1 : Matrix (Fin (2^dim)) (Fin (2^dim)) ℂ)
*Matrix-level form of `prefix_and_step_involutive`** (Iter 71): `uc_eval (seq (step i) (step i)) = 1`, independent of any basis vector. Useful for cascade-level proofs where we re-associate matrix products and need to collapse pairs to 1 in the middle. Proven via case-split on `i = 0` and reduction to `CCX_CCX_eq_one` (matrix-level CCX involution from PadAction).
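The involution rests on a Boolean fact: XORing the same AND value into the target twice restores it. A small sketch in plain Lean 4 (the name `xor_and_cancel` is hypothetical, and `!=` on `Bool` plays the role of `xor`):

```lean
-- XOR-ing `a && b` into `c` twice gives back `c`, for all eight bit patterns.
theorem xor_and_cancel (a b c : Bool) :
    ((c != (a && b)) != (a && b)) = c := by
  cases a <;> cases b <;> cases c <;> rfl
```

The controls `a` and `b` are not written by the step, so the second application sees the same AND value as the first.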
theoremprefix_and_cascade_correct
theorem prefix_and_cascade_correct
    (dim : Nat) (hdim : 0 < dim) (f : Nat → Bool) :
    ∀ n, (∀ i, i < n → ULookupBitDisjointness dim i) →
    uc_eval (Gate.toUCom dim (prefix_and_cascade n)) * f_to_vec dim f
      = f_to_vec dim (prefix_and_cascade_post_state n f)
  | 0    , _ =>
*Faithful n-step prefix-AND cascade correctness**: given disjointness on each bit 0..n-1, the cascade acts on `f_to_vec dim f` to produce `f_to_vec dim (prefix_and_cascade_post_state n f)`. Proof by structural recursion on n, using `gate_seq_acts_on_basis` + IH + per-step correctness (Iter 63). *Second Verified-tier review chain (lookup side, mirroring Iter 58 for the adder).**
theoremprefix_and_cascade_uncompute_eq_one
theorem prefix_and_cascade_uncompute_eq_one
    (dim : Nat) (hdim : 0 < dim) :
    ∀ n, (∀ i, i < n → ULookupBitDisjointness dim i) →
    uc_eval (Gate.toUCom dim
              (Gate.seq (prefix_and_cascade n) (prefix_and_uncompute n)))
      = (1 : Matrix (Fin (2^dim)) (Fin (2^dim)) ℂ)
  | 0    , _ =>
*Matrix-level cascade · uncompute = identity**. The n-step forward cascade composed with the n-step reverse cascade is the identity matrix. Proof by structural induction on n, re-associating the matrix products to expose the per-step `prefix_and_step · prefix_and_step` involution (`prefix_and_step_step_eq_one` from Iter 71). *Third Verified-tier review chain (lookup side)** — composition of the n-step forward cascade (Iter 64) with its uncomputation is the identity matrix. Confirms that without measurement-based uncomputation, the lookup ancillas ARE faithfully reset to zero on the basis-state image.
theoremunary_lookup_tcount_matches_PaperClaims
theorem unary_lookup_tcount_matches_PaperClaims (q_a : Nat)
    (iters : List (List Nat × List Nat))
    (hlen : iters.length = qianxu_E9_lookup_gate_derived_count q_a) :
    tcount (unary_lookup_multi_iteration q_a iters)
      = 2 * q_a * (7 * qianxu_E9_lookup_gate_derived_count q_a)
*Bridge theorem**: at `n_addr = q_a` address bits and `iters.length = 2^q_a` iterations (the full unary loop), the Lean no-measurement no-Gray-code T-count is `2 · q_a · 7 · qianxu_E9_lookup_gate_derived_count q_a`. This formally connects the gate-derived count to the PaperClaims data def, parallel to Iter 22's `gidney_adder_forward_tcount_matches_PaperClaims`.
example(example)
example :
    tcount (unary_lookup_multi_iteration 6 (List.replicate 64 ([], [])))
      = 2 * 6 * (7 * qianxu_E9_lookup_gate_derived_count 6)
Concrete bridge check at q_a=6 (RSA-2048 case): with 64 iterations, Lean encodes 5376 T-gates = 2 · 6 · 7 · 64.
example(example)
example : gray_code_unary_lookup_toffoli_count 6 6 = 69
For RSA-2048 (q_a=6, n_addr=6): Gray-code count = 69 Toffolis.
example(example)
example :
    gray_code_unary_lookup_toffoli_count 6 6
      - qianxu_E9_lookup_gate_derived_count 6 = 5
Gap analysis: at q_a=6, Lean Gray-code count (69) is 5 more than the paper's exact `2^q_a = 64` Toffoli claim. The +5 is the initial cascade cost (n_addr - 1, since the first Toffoli is already counted in 2^q_a per Gidney).
example(example)
example : gray_code_residual_ratio 6 6 = (69, 64)
For RSA-2048: Lean Gray-code 69 vs paper 64; residual ~8%.
theoremgray_code_residual_eq_n_addr_minus_one
theorem gray_code_residual_eq_n_addr_minus_one (n_addr q_a : Nat) (h : 0 < n_addr)
    (_hq : 0 < q_a) :
    gray_code_unary_lookup_toffoli_count n_addr q_a
      = qianxu_E9_lookup_gate_derived_count q_a + (n_addr - 1)
The Gray-code Toffoli count exceeds the paper's `2^q_a` claim by exactly `n_addr - 1` — the initial-cascade setup cost.
example(example)
example :
    gray_code_unary_lookup_toffoli_count 6 6
      = qianxu_E9_lookup_gate_derived_count 6 + (6 - 1)
Concrete: at RSA-2048 (n_addr=6, q_a=6), residual = 5 = 6 - 1.
theoremlookup_review_gap_closure
theorem lookup_review_gap_closure (n_addr q_a : Nat) (h : 0 < n_addr) (hq : 0 < q_a) :
    gray_code_unary_lookup_toffoli_count n_addr q_a
      - qianxu_E9_lookup_gate_derived_count q_a
      = n_addr - 1
*Lookup-side review closure**: the Lean count exceeds the paper count by EXACTLY `n_addr - 1` Toffolis (the initial-cascade setup). Combined with Iter 44's Gidney closure, the original 12× gap at q_a=6 is **fully attributed**: 6× from Gray-code (now formalized, residual `n_addr - 1 = 5`), 2× from Gidney (Iter 44, closed).
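The gap arithmetic can be replayed over plain `Nat`. This is a sketch under stated assumptions: `paperCount` and `grayCount` are hypothetical stand-ins that model the paper count as `2^q_a` and the Gray-code count as that plus `n_addr - 1`, as the docstrings above describe; they are not the FormalRV definitions.

```lean
namespace GrayGapSketch

-- Assumption: the paper-side count is modelled as 2^q_a.
def paperCount (q_a : Nat) : Nat := 2 ^ q_a

-- Assumption: Gray-code count = paper count plus (n_addr - 1) setup Toffolis.
def grayCount (n_addr q_a : Nat) : Nat := paperCount q_a + (n_addr - 1)

example : grayCount 6 6 = 69 := by decide
example : grayCount 6 6 - paperCount 6 = 5 := by decide

-- The gap is n_addr - 1 for every q_a; `omega` treats `paperCount q_a` as an atom.
theorem gap (n_addr q_a : Nat) :
    grayCount n_addr q_a - paperCount q_a = n_addr - 1 := by
  unfold grayCount
  omega

end GrayGapSketch
```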
theoremprefix_and_step_post_state_on_zero
theorem prefix_and_step_post_state_on_zero (i : Nat) :
    prefix_and_step_post_state i zeroFLook = zeroFLook
`prefix_and_step` on zero input gives zero (both `i = 0` and `i > 0` branches). Single CCX writes `xor false (false ∧ false) = false`, a no-op via `Function.update_eq_self`.
theoremprefix_and_cascade_post_state_on_zero
theorem prefix_and_cascade_post_state_on_zero : ∀ n,
    prefix_and_cascade_post_state n zeroFLook = zeroFLook
  | 0     => rfl
  | n + 1 =>
Prefix-AND cascade on zero input gives zero. Induction on n.
theoremprefix_and_cascade_uncompute_on_zero
theorem prefix_and_cascade_uncompute_on_zero
    (dim : Nat) (hdim : 0 < dim) (n : Nat)
    (hyp : ∀ i, i < n → ULookupBitDisjointness dim i) :
    uc_eval (Gate.toUCom dim
              (Gate.seq (prefix_and_cascade n) (prefix_and_uncompute n)))
      * f_to_vec dim zeroFLook
      = f_to_vec dim zeroFLook
*Cascade and its uncompute compose to identity on the zero state vector**. Direct corollary of Iter 74's matrix-level `prefix_and_cascade_uncompute_eq_one`, applied to `f_to_vec dim zeroFLook`. This is the lookup analog of Iter 89's adder zero-input smoke test (modulo the absence of the final-CX cascade analog on the lookup side).
example(example)
example :
    let post
**Concrete prefix-AND cascade action check** for a 2-step cascade on `inputF_lookup_ctrl_addr_10`. After two steps:
- and_0 (qubit 2) = ctrl ∧ address_0 = 1 ∧ 1 = 1 ✓
- and_1 (qubit 4) = and_0 ∧ address_1 = 1 ∧ 0 = 0 ✓

`decide` reduces the nested `update` chain at each specific qubit index. This verifies that the cascade correctly computes the AND-chain on a non-trivial input.
example(example)
example :
    let post
example(example)
example :
    let post
*3-step cascade decide-check on (ctrl=1, addr=110)**. The final AND ancilla (and_2 at qubit 6) is 0 because addr_2 = 0 breaks the chain.
example(example)
example :
    let post
*3-step cascade decide-check on (ctrl=1, addr=111)**. All AND ancillas fire to 1 (the chain propagates fully).
example(example)
example :
    tcount (unary_lookup_iteration 3 [0, 2] [0, 1, 3]) = 42
*Concrete iteration tcount** at q_a=3, |flips|=2, |cnots|=3: T-count = 14·3 = 42 (data-independent — only Toffolis count).
example(example)
example :
    gcount (unary_lookup_iteration 3 [0, 2] [0, 1, 3]) = 13
*Concrete iteration gcount** at the same instance: gcount = 2·|flips| + 2·n_addr + |cnots| = 2·2 + 2·3 + 3 = 13.
example(example)
example :
    tcount (unary_lookup_multi_iteration 3
              [([], []), ([0], [1]), ([1], [2]), ([0, 1], [0, 1])])
      = 168
*Multi-iteration concrete tcount** at q_a=3 with 4 iterations: T-count = 14·3·4 = 168 (data-independent).
example(example)
example :
    tcount (unary_lookup_multi_iteration 3
              (List.replicate 8 ([], []))) = 336
**qianxu Fig. 4b instance** (q_a=3, q_w=6, full 2^3=8 iterations). The figure's red-highlighted Toffolis and bit-flip pattern determine the per-iteration data, but the T-count is data-independent, so the example uses empty flip and CNOT lists. It counts the no-measurement no-Gray-code bound: 14·3·8 = 336 T-gates total.
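The count formulas quoted in the examples above can be checked as plain arithmetic. In this sketch `tcountIter` and `gcountIter` are hypothetical closed forms read off the docstrings (14 T gates per address bit per iteration, and `2·|flips| + 2·n_addr + |cnots|` gates); they are not the FormalRV `tcount` and `gcount` functions.

```lean
namespace CountSketch

-- Assumption: each iteration costs 2 * n_addr Toffolis at 7 T gates each,
-- independent of the flip and CNOT data.
def tcountIter (n_addr : Nat) : Nat := 2 * n_addr * 7

-- Assumption: gate count = two X-flip layers + 2 * n_addr Toffolis + word CNOTs.
def gcountIter (n_addr flips cnots : Nat) : Nat := 2 * flips + 2 * n_addr + cnots

example : tcountIter 3 = 42 := by decide
example : gcountIter 3 2 3 = 13 := by decide
example : 4 * tcountIter 3 = 168 := by decide   -- four iterations
example : 8 * tcountIter 3 = 336 := by decide   -- full 2^3 loop

end CountSketch
```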
example(example)
example :
    Lookup.cascade_step_invariant 0 3 true 3
      (fun i => if i = ulookup_ctrl_idx then true
                else if i = ulookup_address_idx 0 then true
                else if i = ulookup_address_idx 1 then true
                else if i = ulookup_address_idx 2 then false
                else false)
*Decide-witness on (n=3, k=0, ctrl=true, addr=3=0b011)** (Iter 219). No cascade steps applied: all and qubits are false.
example(example)
example :
    Lookup.cascade_step_invariant 2 3 true 3
      (fun i =>
        if i = ulookup_ctrl_idx then true
        else if i = ulookup_address_idx 0 then true
        else if i = ulookup_address_idx 1 then true
        else if i = ulookup_address_idx 2 then false
        else if i = ulookup_and_idx 0 then true   -- and_0 = ctrl ∧ addr_0 = 1
        else if i = ulookup_and_idx 1 then true   -- and_1 = and_0 ∧ addr_1 = 1
        else false)
*Decide-witness on (n=3, k=2, ctrl=true, addr=3)** (Iter 219). After 2 steps: and_0 = and_1 = true (chain of ANDs), and_2 = false.
example(example)
example :
    Lookup.cascade_step_invariant 3 3 true 3
      (fun i =>
        if i = ulookup_ctrl_idx then true
        else if i = ulookup_address_idx 0 then true
        else if i = ulookup_address_idx 1 then true
        else if i = ulookup_address_idx 2 then false
        else if i = ulookup_and_idx 0 then true
        else if i = ulookup_and_idx 1 then true
        else if i = ulookup_and_idx 2 then false
        else false)
*Decide-witness on (n=3, k=3, ctrl=true, addr=3)** (Iter 219). Full cascade: and_2 = (1 ∧ 1) ∧ 0 = 0 (the top bit kills it).
theoremLookup.cascade_step_preserves
theorem Lookup.cascade_step_preserves
    (k n : Nat) (hk : k < n) (ctrl : Bool) (addr : Nat) (f : Nat → Bool)
    (h_ctrl : f ulookup_ctrl_idx = ctrl)
    (h_addr : ∀ i, i < n → f (ulookup_address_idx i) = addr.testBit i)
    (h_inv : Lookup.cascade_step_invariant k n ctrl addr f) :
    Lookup.cascade_step_invariant (k + 1) n ctrl addr
      (prefix_and_step_post_state k f)
**Per-step cascade invariant preservation** (Iter 220). Given an initial state `f` satisfying the step-`k` cascade invariant (with `ctrl` and `address` contents fixed in `f`), applying `prefix_and_step_post_state k` yields a state satisfying the step-`k+1` invariant. The proof case-splits on the position `i`:
- `i = k`: the updated qubit. Compute the new value as `prev ∧ addr.testBit k`, where `prev = ctrl` if `k = 0` and `prev = and_{k-1} = address_and ctrl addr k` otherwise. By the definition of `address_and`, this is `address_and ctrl addr (k+1)`.
- `i ≠ k`: untouched (frame condition via `update_neq`). The step-`k` value carries through unchanged.
theoremprefix_and_step_post_state_frame
theorem prefix_and_step_post_state_frame
    (k : Nat) (f : Nat → Bool) (j : Nat)
    (h_neq : j ≠ ulookup_and_idx k) :
    prefix_and_step_post_state k f j = f j
**Per-step frame condition**: `prefix_and_step_post_state k f` agrees with `f` outside `ulookup_and_idx k`. Both the k=0 and k>0 branches of the post-state definition write to a single qubit (`ulookup_and_idx 0` and `ulookup_and_idx k` respectively).
theoremprefix_and_cascade_post_state_frame_ctrl
theorem prefix_and_cascade_post_state_frame_ctrl
    (n : Nat) (f : Nat → Bool) :
    prefix_and_cascade_post_state n f ulookup_ctrl_idx = f ulookup_ctrl_idx
*Cascade frame for the ctrl qubit**: the n-step cascade post-state agrees with `f` at `ulookup_ctrl_idx`. Proof by structural recursion on n; each step writes to `ulookup_and_idx _ ≠ ulookup_ctrl_idx = 0`.
theoremprefix_and_cascade_post_state_frame_addr
theorem prefix_and_cascade_post_state_frame_addr
    (n : Nat) (f : Nat → Bool) (j : Nat) :
    prefix_and_cascade_post_state n f (ulookup_address_idx j)
      = f (ulookup_address_idx j)
*Cascade frame for the address bits**: the n-step cascade post-state agrees with `f` at every `ulookup_address_idx j`. Address indices have parity 1 (`1 + 2*j`); and indices have parity 0 (`2 + 2*i`), so they are always disjoint.
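The parity argument is linear arithmetic, so it can be discharged by `omega`. A short sketch, writing the index formulas `1 + 2*j` and `2 + 2*i` out directly instead of using the FormalRV index functions:

```lean
-- Odd address indices never collide with even and-indices.
example (i j : Nat) : 1 + 2 * j ≠ 2 + 2 * i := by omega

-- Every and-index is at least 2, so it is never the control qubit 0.
example (i : Nat) : 2 + 2 * i ≠ 0 := by omega
```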
theoremLookup.cascade_step_invariant_holds
theorem Lookup.cascade_step_invariant_holds
    (k n : Nat) (hk : k ≤ n) (ctrl : Bool) (addr : Nat) (f : Nat → Bool)
    (h_ctrl : f ulookup_ctrl_idx = ctrl)
    (h_addr : ∀ i, i < n → f (ulookup_address_idx i) = addr.testBit i)
    (h_clean : ∀ i, i < n → f (ulookup_and_idx i) = false) :
    Lookup.cascade_step_invariant k n ctrl addr
      (prefix_and_cascade_post_state k f)
**Cascade invariant holds at every step `k ≤ n`**. By induction on `k`:
- `k = 0`: the cascade post-state is `f`, which has all `and_idx` qubits clean by hypothesis. This matches `if i < 0 then ... else false = false`.
- `k+1` step: by IH, the k-step cascade satisfies the step-`k` invariant. The cascade frame lemmas (Iter 221) ensure ctrl and address are preserved, so `Lookup.cascade_step_preserves` (Iter 220) lifts the step-`k` invariant on `cascade_post k f` to the step-`k+1` invariant on `cascade_post (k+1) f = step_post k (cascade_post k f)`.
theoremprefix_and_cascade_top_bit_eq_address_and
theorem prefix_and_cascade_top_bit_eq_address_and
    (n : Nat) (hn : 0 < n) (ctrl : Bool) (addr : Nat) (f : Nat → Bool)
    (h_ctrl : f ulookup_ctrl_idx = ctrl)
    (h_addr : ∀ i, i < n → f (ulookup_address_idx i) = addr.testBit i)
    (h_clean : ∀ i, i < n → f (ulookup_and_idx i) = false) :
    prefix_and_cascade_post_state n f (ulookup_and_idx (n - 1))
      = Lookup.address_and ctrl addr n
*Top-bit corollary** (Iter 223, 2026-05-13). After the n-step cascade, the top and-bit `ulookup_and_idx (n - 1)` carries the full `Lookup.address_and ctrl addr n` value (`ctrl ∧ ⋀_{j < n} addr.testBit j`). Direct specialization of `cascade_step_invariant_holds` at k = n and i = n - 1. This is the "trigger bit" read by the word-CNOT layer of the lookup iteration body — the value that decides whether the table row fires on this iteration's address. Lookup analog of Iter 199's `Adder.sumfb_eq_testBit_add` (final-bit extraction from the forward cascade).
example(example)
example :
    let f : Nat → Bool
*Decide-witness on (n=3, ctrl=true, addr=7=0b111)** (Iter 223). Full address all-ones: top and-bit = ctrl ∧ 1 ∧ 1 ∧ 1 = true.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on (n=3, ctrl=true, addr=3=0b011)** (Iter 223). Top address bit (addr_2) is 0, killing the chain: top and-bit = false.
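The top-bit statement can be replayed on a toy Boolean model. Everything below is a hypothetical stand-in (`upd`, `stepPost`, `cascadePost`, `bitAt`, `addressAnd`, `input`), written from the descriptions in this file and assuming the layout control at 0, `address_i` at `1 + 2*i`, `and_i` at `2 + 2*i`; `bitAt` plays the role of `Nat.testBit`.

```lean
namespace CascadeSketch

def ctrlIdx : Nat := 0
def addrIdx (i : Nat) : Nat := 1 + 2 * i
def andIdx (i : Nat) : Nat := 2 + 2 * i

def upd (f : Nat → Bool) (k : Nat) (v : Bool) : Nat → Bool :=
  fun j => if j = k then v else f j

-- One prefix-AND step; `!=` on `Bool` is XOR.
def stepPost (i : Nat) (f : Nat → Bool) : Nat → Bool :=
  match i with
  | 0     => upd f (andIdx 0) (f (andIdx 0) != (f ctrlIdx && f (addrIdx 0)))
  | k + 1 => upd f (andIdx (k + 1))
               (f (andIdx (k + 1)) != (f (andIdx k) && f (addrIdx (k + 1))))

-- Forward cascade: steps 0, 1, ..., n-1.
def cascadePost : Nat → (Nat → Bool) → (Nat → Bool)
  | 0,     f => f
  | n + 1, f => stepPost n (cascadePost n f)

-- Bit `i` of `addr`, by repeated halving (stand-in for `Nat.testBit`).
def bitAt : Nat → Nat → Bool
  | addr, 0     => addr % 2 == 1
  | addr, i + 1 => bitAt (addr / 2) i

-- ctrl ∧ addr_0 ∧ ... ∧ addr_{n-1}.
def addressAnd (ctrl : Bool) (addr : Nat) : Nat → Bool
  | 0     => ctrl
  | k + 1 => addressAnd ctrl addr k && bitAt addr k

-- Clean 3-bit input: ctrl set, address register holds `addr`, and-bits clear.
def input (addr : Nat) : Nat → Bool := fun j =>
  if j = ctrlIdx then true
  else if j = addrIdx 0 then bitAt addr 0
  else if j = addrIdx 1 then bitAt addr 1
  else if j = addrIdx 2 then bitAt addr 2
  else false

example : cascadePost 3 (input 7) (andIdx 2) = addressAnd true 7 3 := by decide
example : cascadePost 3 (input 3) (andIdx 2) = false := by decide

end CascadeSketch
```

The second check corresponds to addr = 0b011: the cleared top address bit stops the chain at `and_2`.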
theoremLookup.x_flip_post_state_frame
theorem Lookup.x_flip_post_state_frame
    (xs : List Nat) (f : Nat → Bool) (j : Nat) (h : j ∉ xs) :
    Lookup.x_flip_post_state xs f j = f j
*X-flip layer frame condition**: positions not in the flip list are unchanged by the layer.
theoremLookup.cnot_layer_post_state_frame
theorem Lookup.cnot_layer_post_state_frame
    (ctrl : Nat) (xs : List Nat) (f : Nat → Bool) (j : Nat) (h : j ∉ xs) :
    Lookup.cnot_layer_post_state ctrl xs f j = f j
**CNOT-layer frame condition**: positions not in the target list are unchanged; `j ∉ xs` is the only hypothesis in the statement. (Preservation of the control itself is stated as a separate lemma, since CX never targets ctrl when `ctrl ∉ xs`.)
theoremLookup.cnot_layer_post_state_ctrl_unchanged
theorem Lookup.cnot_layer_post_state_ctrl_unchanged
    (ctrl : Nat) (xs : List Nat) (f : Nat → Bool) (h_ctrl_not_tgt : ctrl ∉ xs) :
    Lookup.cnot_layer_post_state ctrl xs f ctrl = f ctrl
*CNOT-layer preserves the control qubit** (Iter 224). The control `ctrl` is never the target of any CX in this layer. (For this lemma we additionally need `ctrl ∉ xs`, since CX(ctrl, ctrl) is malformed in our gate IR but the post-state def doesn't enforce that.)
example(example)
example :
    let f : Nat → Bool
*Decide-witness on x-flip layer**: starting from f ≡ false on positions {0, 1, 2}, flipping {0, 2} produces (true, false, true).
example(example)
example :
    let f : Nat → Bool
*Decide-witness on CNOT layer**: with ctrl=0 (true) and targets {1, 2, 3} (initially all false), each XORs with ctrl → all become true.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on CNOT layer with ctrl=false**: when control is false, no XOR fires; targets remain at their initial values.
theoremLookup.x_flip_post_state_at
theorem Lookup.x_flip_post_state_at
    (xs : List Nat) (h_nodup : xs.Nodup) (f : Nat → Bool) (j : Nat)
    (h_in : j ∈ xs) :
    Lookup.x_flip_post_state xs f j = ! (f j)
*X-flip value-at-element**: for `j ∈ xs` with `xs.Nodup`, the layer flips `f j` exactly once.
theoremLookup.cnot_layer_post_state_at
theorem Lookup.cnot_layer_post_state_at
    (ctrl : Nat) (xs : List Nat) (h_nodup : xs.Nodup)
    (h_ctrl_not_in : ctrl ∉ xs) (f : Nat → Bool) (tgt : Nat)
    (h_in : tgt ∈ xs) :
    Lookup.cnot_layer_post_state ctrl xs f tgt = xor (f tgt) (f ctrl)
*CNOT-layer value-at-element**: for `tgt ∈ xs` with `xs.Nodup` AND `ctrl ∉ xs` (so the control wire is preserved), the layer XORs `f tgt` with `f ctrl` exactly once.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on x_flip_post_state_at**: with f = false everywhere, flipping {0, 2}, queried at j=2 (in the list): result = true.
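To see why the value-at-element lemma needs `xs.Nodup`, consider a toy model of the layer. This is a sketch only: `xFlipPost` and `allFalse` are hypothetical, and folding single-bit flips over the list is an assumed shape for `Lookup.x_flip_post_state`.

```lean
namespace XFlipSketch

-- Assumed shape of the layer: fold single-bit flips over the index list.
def xFlipPost (xs : List Nat) (f : Nat → Bool) : Nat → Bool :=
  xs.foldl (fun (g : Nat → Bool) (k : Nat) =>
    fun j => if j = k then !(g j) else g j) f

def allFalse : Nat → Bool := fun _ => false

-- Distinct indices: each listed position flips once, others are framed.
example : xFlipPost [0, 2] allFalse 2 = true := by decide
example : xFlipPost [0, 2] allFalse 1 = false := by decide

-- A duplicated index flips twice and cancels, so "flipped exactly once"
-- fails without `Nodup`.
example : xFlipPost [0, 0] allFalse 0 = false := by decide

end XFlipSketch
```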
example(example)
example :
    let f : Nat → Bool
*Decide-witness on cnot_layer_post_state_at**: with f(0)=true and f(2)=false, CNOT layer ctrl=0, targets {1,2,3}, queried at tgt=2 (in the list): result = false ⊕ true = true.
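The role of the `ctrl ∉ xs` hypothesis shows up in a toy model of the CNOT layer. As a sketch under assumptions: `cnotPost` and `input` are hypothetical, and folding `target ^= control` over the target list is an assumed shape for `Lookup.cnot_layer_post_state`.

```lean
namespace CnotSketch

-- Assumed shape of the layer: fold `target ^= control` over the target list.
-- `!=` on `Bool` is XOR.
def cnotPost (ctrl : Nat) (xs : List Nat) (f : Nat → Bool) : Nat → Bool :=
  xs.foldl (fun (g : Nat → Bool) (t : Nat) =>
    fun j => if j = t then (g t != g ctrl) else g j) f

-- Control (qubit 0) set, targets clear.
def input : Nat → Bool := fun j => j == 0

-- Each target picks up the control value.
example : cnotPost 0 [1, 2, 3] input 2 = true := by decide
-- The control is untouched when it is not a target.
example : cnotPost 0 [1, 2, 3] input 0 = true := by decide
-- With the control listed as a target it is XORed with itself and cleared,
-- so the later target 1 sees `false`.
example : cnotPost 0 [0, 1] input 1 = false := by decide

end CnotSketch
```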
theoremLookup.x_flip_post_state_involution
theorem Lookup.x_flip_post_state_involution
    (xs : List Nat) (h_nodup : xs.Nodup) (f : Nat → Bool) :
    Lookup.x_flip_post_state xs (Lookup.x_flip_post_state xs f) = f
**X-flip layer involution**: with `xs.Nodup`, applying the X-flip layer twice returns the original `f`. By funext + case-split on `j ∈ xs` vs `j ∉ xs`, using value-at-element (Iter 225) for the in-list case and the frame lemma (Iter 224) for the not-in-list case.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on x-flip involution** at (xs=[0,2], f=fun i => i = 1). Both layers cancel; result equals input.
theoremprefix_and_step_post_state_at_and_zero
theorem prefix_and_step_post_state_at_and_zero (f : Nat → Bool) :
    prefix_and_step_post_state 0 f (ulookup_and_idx 0)
      = xor (f (ulookup_and_idx 0))
            (f ulookup_ctrl_idx && f (ulookup_address_idx 0))
*Step post-state value at the and-bit (k=0 branch)**.
theoremprefix_and_step_post_state_at_and_succ
theorem prefix_and_step_post_state_at_and_succ
    (k : Nat) (hk : k ≠ 0) (f : Nat → Bool) :
    prefix_and_step_post_state k f (ulookup_and_idx k)
      = xor (f (ulookup_and_idx k))
            (f (ulookup_and_idx (k - 1)) && f (ulookup_address_idx k))
*Step post-state value at the and-bit (k>0 branch)**.
theoremprefix_and_step_post_state_involution
theorem prefix_and_step_post_state_involution
    (k : Nat) (f : Nat → Bool) :
    prefix_and_step_post_state k (prefix_and_step_post_state k f) = f
*Boolean-level step involution**: applying `prefix_and_step_post_state k` twice yields the identity. The step's only write is to `ulookup_and_idx k`, XORing it with a frame (`f ctrl ∧ f addr_0` for k=0, `f and_{k-1} ∧ f addr_k` for k>0). The frame depends only on positions OTHER than `and_k`, so the second application sees the SAME frame value, and the XOR cancels. Holds for arbitrary `f` — no clean-state hypothesis.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on step involution at k=0**: arbitrary input, apply step 0 twice, get input back.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on step involution at k=2** (k > 0 branch).
theoremprefix_and_cascade_uncompute_post_state_eq_id
theorem prefix_and_cascade_uncompute_post_state_eq_id
    (n : Nat) (f : Nat → Bool) :
    prefix_and_uncompute_post_state n (prefix_and_cascade_post_state n f) = f
**Boolean-level cascade · uncompute = identity**. Applying the forward n-step cascade post-state then the n-step uncompute post-state returns to the input `f`. Proof by induction on n + Iter 227's step involution. Boolean-level analog of Iter 76's matrix-level `prefix_and_cascade_uncompute_eq_one`.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on cascade · uncompute = id** at n=3 with a small concrete input function.

FormalRV.Arithmetic.UnaryLookup.UnaryLookupIterationCorrectness

FormalRV/Arithmetic/UnaryLookup/UnaryLookupIterationCorrectness.lean
theoremLookup.cnot_layer_post_state_preserves_and_bit
theorem Lookup.cnot_layer_post_state_preserves_and_bit
    (n_addr : Nat) (ctrl_idx : Nat) (word_cnot_idxs : List Nat)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (f : Nat → Bool) (k : Nat) (hk : k < n_addr) :
    Lookup.cnot_layer_post_state ctrl_idx word_cnot_idxs f (ulookup_and_idx k)
      = f (ulookup_and_idx k)
*CNOT layer with word-register targets preserves any and-bit** at `ulookup_and_idx k` for `k < n_addr`. By the frame lemma (Iter 224) + disjointness `and_idx k = 2 + 2*k < 1 + 2*n_addr ≤ word_idx _ j`.
theoremLookup.cnot_layer_post_state_preserves_ctrl
theorem Lookup.cnot_layer_post_state_preserves_ctrl
    (n_addr : Nat) (ctrl_idx : Nat) (word_cnot_idxs : List Nat)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (f : Nat → Bool) :
    Lookup.cnot_layer_post_state ctrl_idx word_cnot_idxs f ulookup_ctrl_idx
      = f ulookup_ctrl_idx
*CNOT layer with word targets preserves the ctrl qubit** (qubit 0). Special case of the general ctrl-preservation lemma; the layer's declared control is `and_idx (n_addr - 1)` which is NOT `ulookup_ctrl_idx = 0`, and word targets all exceed 0.
theoremLookup.cnot_layer_post_state_preserves_address
theorem Lookup.cnot_layer_post_state_preserves_address
    (n_addr : Nat) (ctrl_idx : Nat) (word_cnot_idxs : List Nat)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (f : Nat → Bool) (i : Nat) (hi : i < n_addr) :
    Lookup.cnot_layer_post_state ctrl_idx word_cnot_idxs f (ulookup_address_idx i)
      = f (ulookup_address_idx i)
*CNOT layer with word targets preserves each address qubit** `ulookup_address_idx i` for `i < n_addr`. Word indices start at `1 + 2*n_addr`, while address indices are `1 + 2*i ≤ 1 + 2*(n_addr - 1) < 1 + 2*n_addr`.
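The disjointness bounds behind the three preservation lemmas are linear inequalities. A brief sketch with the index formulas written out (first word index `1 + 2*n_addr`, `and_k` at `2 + 2*k`, `address_i` at `1 + 2*i`) instead of the FormalRV index functions:

```lean
-- and_k sits strictly below the first word index when k < n_addr.
example (k n_addr : Nat) (hk : k < n_addr) :
    2 + 2 * k < 1 + 2 * n_addr := by omega

-- address_i sits strictly below the first word index when i < n_addr.
example (i n_addr : Nat) (hi : i < n_addr) :
    1 + 2 * i < 1 + 2 * n_addr := by omega
```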
example(example)
example :
    let f : Nat → Bool
*Decide-witness**: with n_addr=3 and word_cnot_idxs = [7, 8, 12] (all valid word indices ≥ 1 + 2*3 = 7), the and-bit at position `ulookup_and_idx 2 = 6` is preserved by the CNOT layer.
theoremprefix_and_uncompute_post_state_frame_ctrl
theorem prefix_and_uncompute_post_state_frame_ctrl
    (n : Nat) (f : Nat → Bool) :
    prefix_and_uncompute_post_state n f ulookup_ctrl_idx = f ulookup_ctrl_idx
*Uncompute frame at ctrl_idx**: the n-step uncompute post-state preserves `ulookup_ctrl_idx`. Direct analog of Iter 221's `prefix_and_cascade_post_state_frame_ctrl`.
theoremprefix_and_uncompute_post_state_frame_addr
theorem prefix_and_uncompute_post_state_frame_addr
    (n : Nat) (f : Nat → Bool) (j : Nat) :
    prefix_and_uncompute_post_state n f (ulookup_address_idx j)
      = f (ulookup_address_idx j)
*Uncompute frame at every address bit**: preserves `ulookup_address_idx j` for any `j`.
theoremprefix_and_cascade_post_state_frame_word
theorem prefix_and_cascade_post_state_frame_word
    (n n_addr : Nat) (f : Nat → Bool) (j : Nat) (hn : n ≤ n_addr) :
    prefix_and_cascade_post_state n f (ulookup_word_idx n_addr j)
      = f (ulookup_word_idx n_addr j)
*Cascade frame at every word bit**: preserves `ulookup_word_idx n_addr j` for any `j` (word indices `≥ 1 + 2·n_addr` are disjoint from and-indices `≤ 2·n` for the cascade's n-many writes).
theoremprefix_and_uncompute_post_state_frame_word
theorem prefix_and_uncompute_post_state_frame_word
    (n n_addr : Nat) (f : Nat → Bool) (j : Nat) (hn : n ≤ n_addr) :
    prefix_and_uncompute_post_state n f (ulookup_word_idx n_addr j)
      = f (ulookup_word_idx n_addr j)
*Uncompute frame at every word bit**: symmetric to the cascade word-frame.
theoremLookup.iteration_post_state_preserves_ctrl
theorem Lookup.iteration_post_state_preserves_ctrl
    (n_addr : Nat) (addr_flip_idxs word_cnot_idxs : List Nat)
    (h_ctrl_not_flip : ulookup_ctrl_idx ∉ addr_flip_idxs)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (f : Nat → Bool) :
    Lookup.iteration_post_state n_addr addr_flip_idxs word_cnot_idxs f
        ulookup_ctrl_idx = f ulookup_ctrl_idx
**Iteration preserves ctrl**. Requires `ulookup_ctrl_idx ∉ addr_flip_idxs` (X-flip layers don't touch ctrl) and `AllWordIdx n_addr word_cnot_idxs` (CNOT-on-word doesn't touch ctrl, which has index 0 < 1 + 2·n_addr).
theoremLookup.iteration_post_state_preserves_address
theorem Lookup.iteration_post_state_preserves_address
    (n_addr : Nat) (addr_flip_idxs word_cnot_idxs : List Nat)
    (h_flip_nodup : addr_flip_idxs.Nodup)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (f : Nat → Bool) (i : Nat) (hi : i < n_addr) :
    Lookup.iteration_post_state n_addr addr_flip_idxs word_cnot_idxs f
        (ulookup_address_idx i) = f (ulookup_address_idx i)
*Iteration preserves every address bit** `ulookup_address_idx i` for `i < n_addr`. The two outer X-flip layers cancel by involution (Iter 226), and the inner 3 stages each preserve address bits via register-level frame lemmas.
theoremprefix_and_cascade_post_state_frame_general
theorem prefix_and_cascade_post_state_frame_general
    (n : Nat) (f : Nat → Bool) (j : Nat)
    (h : ∀ k, k < n → j ≠ ulookup_and_idx k) :
    prefix_and_cascade_post_state n f j = f j
*General cascade frame**: positions outside `{ulookup_and_idx k : k < n}` are unchanged by the n-step forward cascade.
theoremprefix_and_uncompute_post_state_frame_general
theorem prefix_and_uncompute_post_state_frame_general
    (n : Nat) (f : Nat → Bool) (j : Nat)
    (h : ∀ k, k < n → j ≠ ulookup_and_idx k) :
    prefix_and_uncompute_post_state n f j = f j
*General uncompute frame**: positions outside `{ulookup_and_idx k : k < n}` are unchanged by the n-step reverse uncompute. Symmetric to the cascade general frame above.
theoremLookup.iteration_post_state_preserves_outside_word_targets
theorem Lookup.iteration_post_state_preserves_outside_word_targets
    (n_addr : Nat) (addr_flip_idxs word_cnot_idxs : List Nat)
    (h_flip_addr : ∀ x ∈ addr_flip_idxs,
                       ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (f : Nat → Bool) (p : Nat) (h_p_word : 1 + 2 * n_addr ≤ p)
    (h_not_target : p ∉ word_cnot_idxs) :
    Lookup.iteration_post_state n_addr addr_flip_idxs word_cnot_idxs f p = f p
**Iteration preserves any word-register position not in CNOT targets**. Requires:
- `addr_flip_idxs` are all valid address indices (so they don't include word positions).
- `word_cnot_idxs` consists of word indices (`AllWordIdx`); note that the signature above does not take this as an explicit hypothesis.
- `p` is in the word register (`1 + 2·n_addr ≤ p`) and not in the CNOT target list.
theoremLookup.iteration_post_state_at_word_target
theorem Lookup.iteration_post_state_at_word_target
    (n_addr : Nat) (hn : 0 < n_addr)
    (addr_flip_idxs word_cnot_idxs : List Nat)
    (h_word_nodup : word_cnot_idxs.Nodup)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (h_flip_addr : ∀ x ∈ addr_flip_idxs,
                       ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (f : Nat → Bool) (p : Nat) (h_in : p ∈ word_cnot_idxs) :
    Lookup.iteration_post_state n_addr addr_flip_idxs word_cnot_idxs f p
      = xor (f p)
            (prefix_and_cascade_post_state n_addr
              (Lookup.x_flip_post_state addr_flip_idxs f)
*Iteration's trigger XOR at word targets**. For any `p ∈ word_cnot_idxs` (a target of the middle CNOT layer), the iteration post-state is `f p XOR T`, where `T = prefix_and_cascade_post_state n_addr (x_flip_post_state addr_flip_idxs f) (ulookup_and_idx (n_addr - 1))` is the cascade's top-bit trigger.
theoremprefix_and_step_post_state_commute_update_word
theorem prefix_and_step_post_state_commute_update_word
    (k n_addr : Nat) (hk : k < n_addr)
    (f : Nat → Bool) (p : Nat) (v : Bool)
    (h_p : 1 + 2 * n_addr ≤ p) :
    prefix_and_step_post_state k (Function.update f p v)
      = Function.update (prefix_and_step_post_state k f) p v
*Step commutes with word-update**: if `p ≥ 1 + 2·n_addr` (a word position) and `k < n_addr`, then applying step `k` after an update at `p` is the same as updating after step `k`.
theoremprefix_and_uncompute_post_state_commute_update_word
theorem prefix_and_uncompute_post_state_commute_update_word
    (n n_addr : Nat) (hn : n ≤ n_addr)
    (f : Nat → Bool) (p : Nat) (v : Bool)
    (h_p : 1 + 2 * n_addr ≤ p) :
    prefix_and_uncompute_post_state n (Function.update f p v)
      = Function.update (prefix_and_uncompute_post_state n f) p v
*Uncompute commutes with word-update**: applying uncompute to an update at a word position equals updating after the uncompute. Direct induction on `n` using Iter 237's step commutation.
theoremprefix_and_uncompute_post_state_at_and_invariant_under_cnot_layer
theorem prefix_and_uncompute_post_state_at_and_invariant_under_cnot_layer
    (n n_addr : Nat) (hn : n ≤ n_addr)
    (ctrl_idx : Nat) (cnots : List Nat)
    (h_cnots_word : Lookup.AllWordIdx n_addr cnots)
    (f : Nat → Bool) (k : Nat) (hk : k < n_addr) :
    prefix_and_uncompute_post_state n
      (Lookup.cnot_layer_post_state ctrl_idx cnots f) (ulookup_and_idx k)
      = prefix_and_uncompute_post_state n f (ulookup_and_idx k)
*CNOT-layer invariance at and-bits**: the n-step uncompute output at any and-bit position is unchanged when the input is preprocessed by a CNOT layer with word-register targets. Proof: induction on the CNOT target list, using `prefix_and_uncompute_post_state_commute_update_word` at each list step.
theoremLookup.iteration_post_state_preserves_and
theorem Lookup.iteration_post_state_preserves_and
    (n_addr : Nat) (addr_flip_idxs word_cnot_idxs : List Nat)
    (h_flip_addr : ∀ x ∈ addr_flip_idxs,
                       ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (f : Nat → Bool) (k : Nat) (hk : k < n_addr) :
    Lookup.iteration_post_state n_addr addr_flip_idxs word_cnot_idxs f
        (ulookup_and_idx k) = f (ulookup_and_idx k)
*Iteration preserves every and-bit** at `ulookup_and_idx k` for `k < n_addr`. The proof composes Iter 226 X-flip frame + Iter 238 CNOT-uncompute congruence + Iter 229 cascade·uncompute=id.
theoremLookup.unary_lookup_iteration_correct
theorem Lookup.unary_lookup_iteration_correct
    (n_addr : Nat) (hn : 0 < n_addr)
    (addr_flip_idxs word_cnot_idxs : List Nat)
    (h_flip_addr : ∀ x ∈ addr_flip_idxs,
                       ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_flip_nodup : addr_flip_idxs.Nodup)
    (h_word : Lookup.AllWordIdx n_addr word_cnot_idxs)
    (h_word_nodup : word_cnot_idxs.Nodup)
    (f : Nat → Bool) :
    -- (1) Word targets get XOR'd with the trigger.
    (∀ p, p ∈ word_cnot_idxs →
      Lookup.iteration_post_state n_addr addr_flip_idxs word_cnot_idxs f p
*Headline: `unary_lookup_iteration` classical action**. For valid inputs (every flip index is an address index; every entry of `word_cnot_idxs` is a word-register index), the iteration post-state has the following form at every position `p`:
1. `p ∈ word_cnot_idxs`: `xor (f p) trigger`, written by the CNOT layer with the cascade-top-bit trigger.
2. `p = ulookup_ctrl_idx`: preserved.
3. `p = ulookup_address_idx i` for `i < n_addr`: restored to `f p` (X-flip layers cancel by involution).
4. `p = ulookup_and_idx k` for `k < n_addr`: restored to `f p`, so a clean and-bit ends clean (cascade · uncompute = id, modulo CNOT-layer invariance at and-bits).
5. `p` a word index, `p ∉ word_cnot_idxs`: preserved.
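The five cases can be illustrated with a small executable summary model. This is a sketch under stated assumptions, not the gate-level circuit: `addrIdx`, `effBit`, `trigger`, and `iterPost` are hypothetical names, the layout (ctrl at position 0, address bit `i` at `1 + 2 * i`, words from position 7 when `n = 3`) is inferred from the decide-witnesses on this page, and the and-bit cascade is collapsed into a direct AND over the effective address bits.

```lean
namespace IterSketch

/-- Assumed layout: address bit `i` lives at position `1 + 2 * i`. -/
def addrIdx (i : Nat) : Nat := 1 + 2 * i

/-- Effective address bit `i`: the input bit, toggled when its position is in `flips`. -/
def effBit (f : Nat → Bool) (flips : List Nat) (i : Nat) : Bool :=
  f (addrIdx i) != flips.contains (addrIdx i)

/-- Trigger: the ctrl bit (position 0) AND the first `n` effective address bits. -/
def trigger (f : Nat → Bool) (flips : List Nat) : Nat → Bool
  | 0 => f 0
  | n + 1 => trigger f flips n && effBit f flips n

/-- Summary of one iteration: word targets are XOR'd with the trigger, all else is kept. -/
def iterPost (n : Nat) (flips cnots : List Nat) (f : Nat → Bool) : Nat → Bool :=
  fun p => if cnots.contains p then (f p != trigger f flips n) else f p

/-- ctrl = 1 and addr = 111: positions 0, 1, 3, 5 are set. -/
def f111 : Nat → Bool := fun q => [0, 1, 3, 5].contains q

/-- ctrl = 1 and addr = 110 (addr_0 = 0): positions 0, 3, 5 are set. -/
def f110 : Nat → Bool := fun q => [0, 3, 5].contains q

-- Case 1: a word target is flipped when the trigger fires.
example : iterPost 3 [] [7, 8] f111 7 = true := by decide
-- Case 5: a word position outside the targets is preserved.
example : iterPost 3 [] [7, 8] f111 9 = false := by decide
-- With flips = [1], addr = 110 has effective address 111, so the trigger fires.
example : iterPost 3 [1] [7] f110 7 = true := by decide
-- Case 3: the flipped address bit reads as its input value afterwards.
example : iterPost 3 [1] [7] f110 1 = f110 1 := by decide

end IterSketch
```

The model makes cases 2 to 5 hold by construction, which is exactly what the real theorem has to prove for the circuit; the sketch is only meant to make the shape of the post-state concrete.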
theoremLookup.cascade_top_bit_under_x_flip
theorem Lookup.cascade_top_bit_under_x_flip
    (n_addr : Nat) (hn : 0 < n_addr)
    (addr_flip_idxs : List Nat)
    (h_flip_addr : ∀ x ∈ addr_flip_idxs,
                       ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (ctrl : Bool) (effective_addr : Nat) (f : Nat → Bool)
    (h_ctrl : f ulookup_ctrl_idx = ctrl)
    (h_eff_addr : ∀ i, i < n_addr →
        Lookup.x_flip_post_state addr_flip_idxs f (ulookup_address_idx i)
          = effective_addr.testBit i)
    (h_clean : ∀ i, i < n_addr → f (ulookup_and_idx i) = false) :
    prefix_and_cascade_post_state n_addr
*Trigger value under X-flip = `address_and` at effective address**. Specialization of Iter 223's `prefix_and_cascade_top_bit_eq_address_and` to the X-flipped state used in `unary_lookup_iteration`.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on (n_addr=3, no flips, cnots=[7,8], addr=111)**. Trigger fires (all 3 address bits = 1), so word_0 and word_1 get flipped from false to true. Uses `native_decide` for build speed.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on (n_addr=3, no flips, cnots=[7,8], addr=011)**. Trigger does NOT fire (addr_2 = 0 kills the chain), so word_0 and word_1 stay at their input value (false).
example(example)
example :
    let f : Nat → Bool
*Decide-witness on (n_addr=3, flips=[1], cnots=[7], addr=110)**. With flips=[1] (= addr_0), the effective address has addr_0 flipped: 0 XOR 1 = 1, while addr_1 = 1 and addr_2 = 1 are unchanged. So effective_addr = 111, and the trigger fires. Word_0 is flipped from false to true. Note: the address bits are RESTORED to their input by the outer x_flip layers, so addr_0 (= 0 originally) reads as false at the end.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on ctrl preservation** (n_addr=3, mixed instance). Validates `iteration_post_state_preserves_ctrl` concretely.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on word-not-in-targets preservation** (n_addr=3). Word position 9 (= ulookup_word_idx 3 2 = word_2) is NOT in word_cnot_idxs=[7,8], so it's preserved.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on the multi-iteration post-state at n_addr=3, 2 iters**. With iters = [(flips=[], cnots=[7]), (flips=[], cnots=[7])], the cnot at word position 7 fires TWICE (both iters trigger if addr=111). 2 XORs cancel, leaving word_0 unchanged.
theoremLookup.multi_iteration_post_state_preserves_outside_all_cnots
theorem Lookup.multi_iteration_post_state_preserves_outside_all_cnots
    (n_addr : Nat) (iters : List (List Nat × List Nat))
    (h_flip_addr_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        ∀ x ∈ flips, ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (f : Nat → Bool) (p : Nat) (h_p_word : 1 + 2 * n_addr ≤ p)
    (h_not_in_any : ∀ flips cnots, (flips, cnots) ∈ iters → p ∉ cnots) :
    Lookup.multi_iteration_post_state n_addr iters f p = f p
*Multi-iteration post-state frame**: positions p with `1 + 2*n_addr ≤ p` and outside the UNION of every iter's `cnots` are preserved. By induction on the iter list, using `iteration_post_state_preserves_outside_word_targets` (Iter 235) at each step.
example(example)
example : Lookup.iter_triggers true 7 3 [] = true
*Decide-witness on iter_triggers**: with no flips, addr=111 (all-ones), n_addr=3, ctrl=true → trigger fires.
example(example)
example : Lookup.iter_triggers true 3 3 [] = false
*Decide-witness on iter_triggers**: with no flips, addr=011, n_addr=3, ctrl=true → trigger does NOT fire (addr_2 = 0).
example(example)
example : Lookup.iter_triggers true 3 3 [5] = true
*Decide-witness on iter_triggers**: with flips=[ulookup_address_idx 2] (= [5]), addr=011, n_addr=3, ctrl=true → effective_addr = 111 (addr_2 toggled to 1), trigger fires.
example(example)
example :
    Lookup.multi_iteration_xor_value true 7 3 [([], [7]), ([], [7])] 7 = false
*Decide-witness on multi_iteration_xor_value**: 2 iters both targeting word_0=7, both trigger on addr=111, both contribute → XOR cancels → false.
example(example)
example :
    Lookup.multi_iteration_xor_value true 7 3 [([], [7])] 7 = true
*Decide-witness on multi_iteration_xor_value**: 1 iter targeting word_0=7, triggers on addr=111 → contributes true.
theoremLookup.effective_addr_lt_two_pow
theorem Lookup.effective_addr_lt_two_pow
    (addr : Nat) (flips : List Nat) (n_addr : Nat) :
    Lookup.effective_addr addr flips n_addr < 2 ^ n_addr
*Effective address is strictly below 2^n_addr**. By induction on n_addr, using `Nat.bitwise_lt_two_pow` (`x, y < 2^n → bitwise f x y < 2^n`).
example(example)
example : Lookup.effective_addr 3 [] 3 = 3
*Decide-witness on effective_addr**: addr=3 (=0b011), no flips, n=3. Result = 3 (the input pattern is preserved since no flips).
theoremLookup.effective_addr_testBit
theorem Lookup.effective_addr_testBit
    (addr : Nat) (flips : List Nat) (n_addr i : Nat) (hi : i < n_addr) :
    (Lookup.effective_addr addr flips n_addr).testBit i
      = xor (addr.testBit i) (decide (ulookup_address_idx i ∈ flips))
*testBit characterization of effective_addr** (Iter 254). For `i < n_addr`, the i-th bit of `effective_addr addr flips n_addr` equals the X-flipped i-th bit pattern. Direct induction on `n_addr` using `Nat.testBit_or`, `Nat.testBit_two_pow`, and `Nat.testBit_lt_two_pow` (via `effective_addr_lt_two_pow` from Iter 253).
example(example)
example : Lookup.effective_addr 3 [5] 3 = 7
*Decide-witness on effective_addr**: addr=3 (=0b011), flips=[5] (=addr_idx 2), n=3. Bit 2 is toggled from 0 → 1, giving 7 (=0b111).
example(example)
example : Lookup.effective_addr 7 [1, 3] 3 = 4
*Decide-witness on effective_addr**: addr=7 (=0b111), flips=[1,3] (=addr_idx 0, 1), n=3. Bits 0 and 1 toggled to 0, bit 2 unchanged → 4 (=0b100).
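The effective-address computation and its `testBit` characterization can be reproduced with a small stand-alone definition. This is a sketch only: `effAddr` is a hypothetical re-implementation (the real `Lookup.effective_addr` may be defined differently, for example via bitwise OR), and the address-index layout `1 + 2 * i` is an assumption read off the witnesses above.

```lean
namespace EffAddrSketch

/-- Hypothetical effective address over the low `n` bits:
    bit `i` of `addr`, toggled when position `1 + 2 * i` is in `flips`. -/
def effAddr (addr : Nat) (flips : List Nat) : Nat → Nat
  | 0 => 0
  | n + 1 =>
    let bit := addr.testBit n != flips.contains (1 + 2 * n)
    effAddr addr flips n + (if bit then 2 ^ n else 0)

-- The three witnesses from this page.
example : effAddr 3 [] 3 = 3 := by decide
example : effAddr 3 [5] 3 = 7 := by decide
example : effAddr 7 [1, 3] 3 = 4 := by decide

-- The result stays below 2 ^ 3.
example : effAddr 7 [1, 3] 3 < 2 ^ 3 := by decide

-- Bit 2 of the result is the input bit XOR'd with membership of position 5 in `flips`.
example : (effAddr 3 [5] 3).testBit 2
    = (Nat.testBit 3 2 != ([5] : List Nat).contains 5) := by decide

end EffAddrSketch
```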
example(example)
example :
    Lookup.iter_triggers true 7 3 []
      = Lookup.address_and true (Lookup.effective_addr 7 [] 3) 3
*Decide-witness consistency**: iter_triggers and address_and on effective_addr agree on small instances. Witnessing the BRIDGE THEOREM that Iter 251 will prove parametrically.
example(example)
example :
    Lookup.iter_triggers true 3 3 [5]
      = Lookup.address_and true (Lookup.effective_addr 3 [5] 3) 3
theoremLookup.iteration_post_state_at_word_target_via_address_and
theorem Lookup.iteration_post_state_at_word_target_via_address_and
    (n_addr : Nat) (hn : 0 < n_addr)
    (flips cnots : List Nat)
    (h_flip_addr : ∀ x ∈ flips, ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_cnots_nodup : cnots.Nodup)
    (h_word : Lookup.AllWordIdx n_addr cnots)
    (ctrl : Bool) (addr effective_addr : Nat) (g : Nat → Bool)
    (h_ctrl : g ulookup_ctrl_idx = ctrl)
    (h_addr : ∀ i, i < n_addr → g (ulookup_address_idx i) = addr.testBit i)
    (h_eff_addr : ∀ i, i < n_addr →
        Lookup.x_flip_post_state flips g (ulookup_address_idx i)
          = effective_addr.testBit i)
*Iteration at word target via address_and** (Iter 251). For `p ∈ cnots` and a user-supplied `effective_addr` matching the X-flipped address pattern, the post-state at p is `xor (g p) (address_and ctrl effective_addr n_addr)`.
example(example)
example :
    let f : Nat → Bool
*Decide-witness on the chaining lemma** (n_addr=3, no flips, addr=7). effective_addr = 7 (no flip change). iteration at p=7 = xor (g 7) (address_and true 7 3) = xor false true = true.
theoremLookup.multi_iteration_post_state_preserves_ctrl
theorem Lookup.multi_iteration_post_state_preserves_ctrl
    (n_addr : Nat) (iters : List (List Nat × List Nat))
    (h_flip_addr_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        ∀ x ∈ flips, ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_word_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        Lookup.AllWordIdx n_addr cnots)
    (f : Nat → Bool) :
    Lookup.multi_iteration_post_state n_addr iters f ulookup_ctrl_idx
      = f ulookup_ctrl_idx
*Multi-iter preserves ctrl**: the bit at `ulookup_ctrl_idx` is unchanged after any list of iterations satisfying the hypotheses.
theoremLookup.multi_iteration_post_state_preserves_address
theorem Lookup.multi_iteration_post_state_preserves_address
    (n_addr : Nat) (iters : List (List Nat × List Nat))
    (h_flip_nodup_all : ∀ flips cnots, (flips, cnots) ∈ iters → flips.Nodup)
    (h_word_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        Lookup.AllWordIdx n_addr cnots)
    (f : Nat → Bool) (i : Nat) (hi : i < n_addr) :
    Lookup.multi_iteration_post_state n_addr iters f (ulookup_address_idx i)
      = f (ulookup_address_idx i)
*Multi-iter preserves every address bit** for `i < n_addr`.
theoremLookup.x_flip_post_state_xor
theorem Lookup.x_flip_post_state_xor
    (xs : List Nat) (h_nodup : xs.Nodup) (f : Nat → Bool) (j : Nat) :
    Lookup.x_flip_post_state xs f j = xor (f j) (decide (j ∈ xs))
*X-flip post-state as XOR with membership** (utility): for `xs.Nodup`, `x_flip_post_state xs f j = xor (f j) (decide (j ∈ xs))`. Unifies the Iter 224 frame + Iter 225 value-at-element under a single expression.
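A small model shows both the XOR-with-membership form and why the `Nodup` hypothesis matters. This is a sketch under assumptions: `xFlip` is a hypothetical stand-in for `Lookup.x_flip_post_state`, taken here to apply one bit flip per list element in order.

```lean
namespace XFlipSketch

/-- Hypothetical X-flip layer: negate the bit at each listed position, in order. -/
def xFlip : List Nat → (Nat → Bool) → Nat → Bool
  | [], f => f
  | x :: xs, f => xFlip xs (fun q => if q = x then !f q else f q)

/-- All-zero input state. -/
def zero : Nat → Bool := fun _ => false

-- For a duplicate-free list the result is `f j` XOR'd with membership of `j`.
example : xFlip [1, 3] zero 3 = (zero 3 != ([1, 3] : List Nat).contains 3) := by decide
example : xFlip [1, 3] zero 5 = (zero 5 != ([1, 3] : List Nat).contains 5) := by decide

-- With a duplicate the two flips cancel, so the membership formula would be wrong.
example : xFlip [1, 1] zero 1 = false := by decide

end XFlipSketch
```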
theoremLookup.multi_iteration_post_state_preserves_and
theorem Lookup.multi_iteration_post_state_preserves_and
    (n_addr : Nat) (iters : List (List Nat × List Nat))
    (h_flip_addr_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        ∀ x ∈ flips, ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_word_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        Lookup.AllWordIdx n_addr cnots)
    (f : Nat → Bool) (k : Nat) (hk : k < n_addr) :
    Lookup.multi_iteration_post_state n_addr iters f (ulookup_and_idx k)
      = f (ulookup_and_idx k)
*Multi-iter preserves every and-bit** for `k < n_addr`.
theoremLookup.multi_iteration_post_state_at_word_target_in_head_iter
theorem Lookup.multi_iteration_post_state_at_word_target_in_head_iter
    (n_addr : Nat) (hn : 0 < n_addr)
    (head_flips head_cnots : List Nat)
    (rest : List (List Nat × List Nat))
    (h_head_flip_addr : ∀ x ∈ head_flips,
                            ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_head_flip_nodup : head_flips.Nodup)
    (h_head_cnots_nodup : head_cnots.Nodup)
    (h_head_word : Lookup.AllWordIdx n_addr head_cnots)
    (h_flip_addr_all : ∀ flips cnots, (flips, cnots) ∈ rest →
        ∀ x ∈ flips, ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_flip_nodup_all : ∀ flips cnots, (flips, cnots) ∈ rest → flips.Nodup)
*Multi-iter chaining at word target**: at a word target `p` in the HEAD iter's `head_cnots`, the multi-iter post-state on `(head_flips, head_cnots) :: rest` equals the rest's post-state XOR'd with `Lookup.address_and ctrl (Lookup.effective_addr addr head_flips n_addr) n_addr`.
theoremLookup.unary_lookup_multi_iteration_correct
theorem Lookup.unary_lookup_multi_iteration_correct
    (n_addr : Nat) (hn : 0 < n_addr)
    (iters : List (List Nat × List Nat))
    (h_flip_addr_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        ∀ x ∈ flips, ∃ i, i < n_addr ∧ x = ulookup_address_idx i)
    (h_flip_nodup_all : ∀ flips cnots, (flips, cnots) ∈ iters → flips.Nodup)
    (h_cnots_nodup_all : ∀ flips cnots, (flips, cnots) ∈ iters → cnots.Nodup)
    (h_word_all : ∀ flips cnots, (flips, cnots) ∈ iters →
        Lookup.AllWordIdx n_addr cnots)
    (ctrl : Bool) (addr : Nat) (f : Nat → Bool)
    (h_ctrl : f ulookup_ctrl_idx = ctrl)
    (h_addr : ∀ i, i < n_addr → f (ulookup_address_idx i) = addr.testBit i)
*HEADLINE: multi-iteration unary lookup classical action**. For a word position `p` in some iter's cnots, the multi-iter post-state is `xor (f p) (cumulative_xor_value)`, where the cumulative value is the XOR of the trigger contributions from each iter whose cnots include `p`.
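The cumulative value can be sketched as a fold over the iteration list. This is an illustration only: `cumulativeXor` is a hypothetical name, and each iteration is simplified to a pair of an already-computed trigger bit and its `cnots` list, whereas the real statement derives the trigger from `ctrl`, `addr`, and the iteration's flips.

```lean
namespace MultiIterSketch

/-- XOR together the trigger bits of those iterations whose `cnots` contain `p`. -/
def cumulativeXor (p : Nat) : List (Bool × List Nat) → Bool
  | [] => false
  | (fires, cnots) :: rest =>
    ((fires && cnots.contains p) != cumulativeXor p rest)

-- Two firing iterations on the same target cancel.
example : cumulativeXor 7 [(true, [7]), (true, [7])] = false := by decide
-- One firing iteration contributes `true`.
example : cumulativeXor 7 [(true, [7])] = true := by decide
-- Non-firing iterations and iterations with other targets contribute nothing.
example : cumulativeXor 7 [(true, [7]), (false, [7]), (true, [8])] = true := by decide

end MultiIterSketch
```

The first two checks mirror the `multi_iteration_xor_value` witnesses for `[([], [7]), ([], [7])]` and `[([], [7])]` at `addr = 111`.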
example(example)
example (addr_flip_idxs word_cnot_idxs : List Nat) :
    tcount (unary_lookup_iteration 6 addr_flip_idxs word_cnot_idxs) = 84
*RSA-2048 lookup single-iteration T-count = 84** (Iter 262). For q_a = 6 (qianxu p. 22 max table-row size for RSA-2048), `tcount (unary_lookup_iteration 6 _ _) = 14·6 = 84`.
example(example)
example :
    tcount (unary_lookup_multi_iteration 6
              (List.replicate 64 ([], []))) = 5376
*RSA-2048 lookup multi-iteration T-count = 5376** (Iter 262) for the full 2^6 = 64 iterations covering all addresses. This is the **no-measurement, no-Gray-code upper bound**; qianxu's optimized claim of 2^q_a = 64 Toffolis = 448 T (at 7 T per Toffoli) requires BOTH the Gidney measurement trick (factor 2) AND Gray-code amortization (factor q_a = 6). See Iter 28 review finding for the analysis of the factor-of-12 gap (5376/448 = 12).
example(example)
example :
    tcount (unary_lookup_multi_iteration 6
              (List.replicate 64 ([], [])))
    = 14 * 6 * (List.replicate 64 (([] : List Nat), ([] : List Nat))).length
*RSA-2048 lookup multi-iteration symbolic form** (Iter 262): parametric `14 · n_addr · |iters|` instantiated at (6, 64).
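The arithmetic behind these counts can be checked directly. A minimal sketch: `toyTCount` is a hypothetical closed form mirroring `14 · n_addr · |iters|`, not the project's `tcount`, and the 7-T-per-Toffoli cost used for the comparison is an assumption.

```lean
namespace TCountSketch

/-- Hypothetical closed form: 14 T per address bit per iteration. -/
def toyTCount (nAddr iters : Nat) : Nat := 14 * nAddr * iters

-- Single iteration at 6 address bits.
example : toyTCount 6 1 = 84 := by decide
-- All 2^6 iterations.
example : toyTCount 6 (2 ^ 6) = 5376 := by decide
-- Assuming 7 T per Toffoli, 2^6 Toffolis cost 448 T ...
example : 7 * 2 ^ 6 = 448 := by decide
-- ... which is a factor of 12 = 2 * 6 below the unoptimized count.
example : toyTCount 6 (2 ^ 6) = 12 * (7 * 2 ^ 6) := by decide

end TCountSketch
```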
example(example)
example (addr_flip_idxs word_cnot_idxs : List Nat) :
    tcount (unary_lookup_iteration
              qianxu_E9_q_a_RSA2048
              addr_flip_idxs word_cnot_idxs)
      = unary_lookup_iteration_RSA2048_T_count_verified
*Bridge: verified single-iter T-count matches the RSA-2048 paper-claim anchor** (Iter 263).
example(example)
example :
    tcount (unary_lookup_multi_iteration
              qianxu_E9_q_a_RSA2048
              (List.replicate (2 ^ qianxu_E9_q_a_RSA2048)
                ([], [])))
      = unary_lookup_multi_RSA2048_no_meas_T_count_verified
*Bridge: verified multi-iter T-count matches the RSA-2048 no-measurement paper-claim anchor** (Iter 263).