Dual-Model Adversarial Methodology

Dual-Model Adversarial Intelligence as a Method for Documenting PrimerField Theory

David Allen LaPoint

PrimerField Foundation

June 22, 2026


Abstract

PrimerField Theory was developed through nineteen years of experimental research without artificial intelligence involvement. The core theory was complete and stable by 2012, documented in three YouTube videos from 2012–2013 that remain accurate today. AI entered this work only in 2025—not as a theoretical collaborator but as an analytical instrument for validating documentation, enforcing constraint compliance, and stress-testing written formulations. This paper documents the methodology that emerged: the deliberate use of multiple large language models exhibiting different behavioral characteristics as adversarial review tools.

The approach treats AI systems not as sources of theoretical insight but as diagnostic instruments with complementary failure modes. When two behaviorally distinct models—ChatGPT and Claude AI—examine the same theoretical documentation under a defined adversarial protocol, their structured disagreements expose hidden assumptions, logical gaps, and definitional ambiguities that persist undetected in single-model workflows. This creates a form of parallax—depth and structure become visible only when non-identical perspectives are held in productive tension under explicit human constraint.

Model role assignments were not assumed. They were established empirically through a blind error-detection test. Twelve intentional errors were embedded in a test document and submitted independently to ChatGPT and to Gemini 1.5 Pro operating in deep-think mode. ChatGPT detected all twelve errors on the first pass. Gemini detected approximately four of the twelve intentional errors—a roughly 33% detection rate, depending on whether partial detections are counted. This result established the validation hierarchy for this workflow: ChatGPT serves as the primary cross-checker and hostile reviewer; Gemini is not used in the production workflow.

A further refinement emerged from sustained practice: Claude functions not only as the primary drafter but as the meta-evaluator of ChatGPT’s audit. When ChatGPT returns a review, Claude classifies each flagged item as a genuine error, a likely false positive, or a human-escalation item before revisions are executed. The human researcher retains final authority over all canonical claims and release decisions.

The result is a documentation review process more relentless than ordinary informal human review at the level of internal logical consistency, while maintaining absolute human authority over theoretical content and canonical claims. The theory itself remains entirely human-originated; AI serves only to ensure that written expression of that theory is internally consistent and precisely bounded.

Scope: This paper documents a documentation and validation methodology, not a physical theory claim. The validity of the method does not depend on the validity of PrimerField Theory itself.


1. Introduction: AI as Instrument, Not Authority

PrimerField Theory did not emerge from artificial intelligence. The core concepts (bowl-shaped magnetic field geometries, plasma confinement structures, photon field extension, and emergent wave behavior from particle interactions) developed from initial inspiration in 2006 into a stable theory by 2012, documented in three YouTube videos from 2012–2013 that remain accurate today (PrimerField Parts 1, 2, and 3, published May–July 2012). The theory had therefore stood unchanged for more than twelve years before any AI system entered the work in 2025.

In this paper, canon means human-controlled locked PF constraint documents that define permitted claims, binding terminology, and release boundaries. No AI system has authority to modify canon; it is the fixed reference against which all AI-generated drafts are evaluated.

AI entered this work not as a theoretical collaborator but as an analytical instrument. The models perform specific diagnostic functions: adversarial review, logical compression testing, rhetorical stress testing, and structured rewriting under constraint. At no point does any model operate as an epistemic authority. Theory originates with the human researcher; AI systems exist to break weak formulations, not to generate strong ones.

This distinction matters because it shapes every subsequent methodological choice. A model trusted to invent theory will optimize for coherence. A model constrained to attack theory will optimize for inconsistency detection. The PF workflow requires the latter.

2. The Hazard of Single-Model Reliance

Every large language model possesses characteristic biases, preferred explanatory styles, and systematic blind spots. When a single model drafts, revises, and validates theoretical content, its particular failure modes remain invisible throughout the entire process. Errors that cohere with the model’s internal priors feel natural; claims that violate them get flagged. The result is self-consistent hallucination—output that appears rigorous precisely because it has been filtered through a single lens.

This problem intensifies in non-standard theoretical frameworks like PrimerField Theory, where conventional physics assumptions cannot be imported uncritically. A single model will tend to smooth contradictions between PF claims and standard physics, either by unconsciously importing mainstream assumptions or by generating plausible-sounding but unfounded reconciliations.

A deeper version of this problem is what has been called the Baseline Correction Mechanism (BCM). The BCM is defined and defended in the companion paper The Baseline Correction Mechanism: How Consensus Pressure Shapes Reasoning in Humans and Artificial Intelligence (LaPoint, PrimerField Foundation, 2026); this paper uses BCM in that defined sense. AI models, like human reviewers, exhibit a documented tendency to shape their reasoning in the direction of whatever baseline they were built around. The pressure is proportional to distance from that baseline: the further a claim sits from consensus, the more pressure the BCM exerts to discount, hedge, or silently harmonize it. Critically, the BCM is not a truth-tracking mechanism. It defends the baseline the reviewer was built around, regardless of whether that baseline is correct. A model trained on orthodox physics will experience BCM pressure against PF claims not because the claims are false, but because they are distant from the training consensus. The same mechanism that would push back on a genuine error will push back equally on a correct but non-consensus claim. In a single-model workflow, these two cases are indistinguishable.

The solution is not to find a better model. The solution is to exploit the behavioral differences between models as a diagnostic signal—and to establish those behavioral profiles empirically rather than by assumption.

3. Model Assessment: ChatGPT, Claude, and the Gemini Benchmark Test

ChatGPT and Claude AI are not interchangeable instruments. In repeated application, they exhibit reliably different behaviors that can be exploited for rigorous theory development. Critically, the role assignments described below are not assumptions about inherent model capabilities. The validator role was assigned on the basis of an empirical blind test designed to establish which model functions most reliably as the error-detection instrument in the PF workflow; Claude’s drafting and triage roles rest on sustained observation in use (Section 3.3).

3.1 The Blind Error-Detection Test

To determine which model should serve as primary validator, a controlled test was conducted. A completed PF paper was modified to embed twelve intentional errors—logical inconsistencies, definitional violations, and scope overreach—distributed across the document without disclosure to either model. The document was submitted independently to ChatGPT and to Gemini 1.5 Pro operating in deep-think mode, with each model instructed to perform a thorough hostile audit. Appendix A documents the twelve error categories, scoring criteria, and outcome.

ChatGPT detected all twelve errors on the first pass.

Gemini detected approximately four of the twelve intentional errors—a roughly 33% detection rate, depending on whether partial detections are counted—under its most capable configuration.

This result was decisive for this workflow. The gap between 100% and roughly 33% detection under the tested hostile-read conditions is not a marginal performance difference. It indicates a large difference in constraint enforcement under the specific conditions of the PF documentation workflow. Gemini was removed from the production validation role. The methodology now uses ChatGPT exclusively as cross-checker and hostile reviewer.

3.2 ChatGPT: Constraint Enforcement and Hostile Review

In the PF workflow, ChatGPT functions as the primary constraint enforcement mechanism. Its observed strengths include: long context retention across multiple revision cycles, reliable detection of canon violations even when those violations are rhetorically polished, systematic terminology policing, identification of scope creep and hidden assumptions, and the capacity for genuine hostile read review.

The term ‘hostile read’ requires definition: it means assuming every claim is potentially false and actively seeking internal inconsistencies, unstated dependencies, and logical gaps—rather than reading charitably to understand intent. When instructed to perform hostile read review, ChatGPT executes that instruction with useful literalism. It flags violations of explicitly stated constraints even when the violating passage sounds reasonable. This makes ChatGPT well suited for final quality assurance, adversarial audit, and canon lock-in—the point at which theoretical claims become binding on all subsequent work. Hostile read does not, however, perform empirical validation, verify domain truth claims, or assess correspondence with external reality; it operates strictly on internal consistency.

3.3 Claude: Rhetorical Synthesis, Drafting, and Audit Meta-Evaluation

Claude serves two distinct roles in the PF workflow. The first is primary drafter: Claude produces all initial drafts, optimized for clarity, rhetorical force, and narrative coherence. When permitted to write forcefully, Claude’s drafts stress-test whether an argument actually holds together when stated without hedging—a useful diagnostic in its own right.

The second role is audit triage of ChatGPT’s review. Claude does not validate its own draft for release. When ChatGPT returns a hostile review, Claude classifies each flagged item as a genuine error, a likely false positive, or a human-escalation item. False positives are rejected with documented reasoning; ambiguous cases are escalated to the human researcher. ChatGPT then performs final quality assurance after revision. Final release authority remains entirely with the human researcher. This assignment reflects sustained observation: Claude has consistently proven more accurate than ChatGPT at determining whether a given critique is correct. The distinction matters because an overcorrected document—one revised in response to a false positive—can introduce errors that did not exist in the original draft.

Claude exhibits tendencies that require constraint in other contexts: a disposition to soften boundaries, introduce uncanonized language, and implicitly harmonize contradictions rather than expose them. These tendencies are managed by the structure of the workflow itself—Claude drafts under constraint, ChatGPT audits without deference to Claude’s framing, and the human researcher retains authority over all canonical claims.

4. The Three-Phase Adversarial Loop

The PF paper development process follows a deliberate three-phase sequence. The structure has evolved from a two-model relay into a loop that captures not only model disagreements but meta-level assessment of whether those disagreements are valid.

Phase 1 — Drafting. The human researcher supplies theory, constraints, and canon. Claude produces an aggressive or high-clarity draft. No review occurs in this phase; the goal is a complete document that can be audited.

Phase 2 — Hostile Audit. ChatGPT performs a cold, publication-grade hostile audit of the full document. Every claim is treated as potentially false. Errors, overreach, canon violations, and logical gaps are explicitly flagged with reasoning.

Phase 3 — Triage and Revision. Claude classifies ChatGPT’s flagged items: genuine errors are accepted for correction, likely false positives are rejected with documented reasoning, and ambiguous cases are escalated to the human researcher. Claude then rewrites the document under confirmed corrections. ChatGPT performs final quality assurance before the document is submitted to the human for release approval.

The critical feature of this loop is that disagreement between models functions as a diagnostic signal at two levels. First-level disagreement: ChatGPT and Claude reach different implicit conclusions about a claim, revealing an ambiguity or a logical gap. Second-level disagreement: Claude and ChatGPT disagree about whether a flagged item is a real error, revealing either overcorrection by the auditor or a subtlety missed in the original draft. Both levels produce actionable information. Neither is resolved by model consensus; the human researcher holds final authority over all canonical content.
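The Phase 3 triage step can be pictured as a small routing structure. The following is a minimal sketch only, not the tooling used in the PF workflow: the names Triage, Finding, and route are hypothetical, and the rule that a rejected finding must carry reasoning is taken from the requirement that false positives be rejected with documented reasoning.

```python
from dataclasses import dataclass
from enum import Enum


class Triage(Enum):
    GENUINE_ERROR = "genuine error"
    FALSE_POSITIVE = "likely false positive"
    ESCALATE = "human escalation"


@dataclass
class Finding:
    passage: str         # location or quote flagged by the auditor
    critique: str        # the auditor's stated reason for the flag
    triage: Triage       # classification assigned during Phase 3
    reasoning: str = ""  # documented reasoning, required for rejections


def route(findings):
    """Split triaged findings into the three Phase 3 outcomes."""
    to_correct, rejected, escalated = [], [], []
    for f in findings:
        if f.triage is Triage.GENUINE_ERROR:
            to_correct.append(f)
        elif f.triage is Triage.FALSE_POSITIVE:
            # A rejection without documented reasoning is not accepted.
            if not f.reasoning.strip():
                raise ValueError("a rejected finding needs documented reasoning")
            rejected.append(f)
        else:
            escalated.append(f)
    return to_correct, rejected, escalated
```

Only the first list feeds the rewrite; the third goes to the human researcher, who may move items into either of the other two.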

5. Why This Method Produces Stronger Theoretical Documentation

Error visibility through disagreement. When two models with different behavioral tendencies reach different conclusions, the disagreement localizes the problem. Either one model has violated a constraint, one model has introduced an unintended interpretation, or the original formulation was genuinely ambiguous. All three outcomes produce actionable information.

Immunity to consensus illusions. Because neither model is permitted to validate its own output, false coherence does not accumulate across revision cycles. A claim that ‘feels right’ to Claude still faces hostile scrutiny from ChatGPT. A constraint that ChatGPT enforces rigidly still gets evaluated by Claude’s triage before revisions are executed.

BCM pressure made visible. The BCM operates on both models, but in different directions depending on each model’s training baseline. When both models independently flag the same claim, that is strong evidence of a real problem. When only one flags it, that divergence is the signal: the flagged claim may be an actual error, or it may be BCM pressure from one model’s baseline operating against a correct but non-consensus position. The human researcher makes that determination. The key is that the BCM pressure is now visible rather than silent.
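The distinction between joint flags and one-sided flags amounts to a set comparison. As a rough sketch, assuming each model's flagged claims are recorded as identifiers (the function name and the identifiers are hypothetical):

```python
def compare_flags(flags_a, flags_b):
    """Partition flagged claim identifiers by which model raised them."""
    a, b = set(flags_a), set(flags_b)
    return {
        "both": sorted(a & b),    # flagged independently by both models
        "only_a": sorted(a - b),  # divergence: escalate to the human researcher
        "only_b": sorted(b - a),  # divergence: escalate to the human researcher
    }


result = compare_flags(["claim-1", "claim-3"], ["claim-3", "claim-7"])
print(result["both"])    # ['claim-3']
print(result["only_a"])  # ['claim-1']
print(result["only_b"])  # ['claim-7']
```

The sketch only sorts the flags. Deciding whether a one-sided flag is a real error or baseline pressure remains a human determination.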

Preserved human authority. The human researcher defines canon, approves revisions, resolves disagreements, and bears responsibility for all claims. AI systems function strictly as instruments. This is not a limitation imposed reluctantly; it is the design principle that makes the method work.

Productive constraint pressure. Working under explicit constraints with adversarial review forces theoretical claims to become more precise over successive iterations. Vague formulations that survive Claude’s synthesis get sharpened when they fail ChatGPT’s audit. Overcorrections introduced by ChatGPT’s literalism get caught by Claude’s triage. The result is theory compressed into its defensible core.

6. Scope of Claimed Advantage

The dual-model adversarial method offers advantages over single-model workflows in specific, bounded domains. It is essential to define both where the method provides value and where it does not apply. No comparative empirical study is presented here; the claims below derive from procedural analysis, the blind error-detection test described in Section 3.1, and documented internal application across multiple paper development cycles.

Where this method provides advantage: internal logical coherence, definitional consistency, constraint compliance, detection of scope creep, identification of hidden assumptions, and adversarial critique of rhetorical claims. It also provides structural visibility into BCM pressure by surfacing divergences between models that would be invisible in a single-model workflow. In these areas, continuous full-document reaudit after each revision provides a form of systematic scrutiny that single-model workflows typically do not provide under identical constraints.

Where this method does not apply: empirical validation, experimental replication, statistical methodology review, domain expertise verification, and assessment of a theory’s relationship to established literature. The method is orthogonal to these concerns—it does not compete with them and makes no claim to address them.

The claim is not that dual-AI review replaces traditional scientific evaluation. The claim is that it provides a form of internal logical stress testing that complements other review processes.

7. Addressing the ‘AI Wrote This’ Objection

A predictable criticism of any AI-assisted theoretical work is the claim that the AI generated the content and the human merely approved it. The documented PF workflow directly contradicts this characterization.

PrimerField Theory predates AI involvement by more than a decade, with stable theory established by 2012 and documented in YouTube videos that remain accurate today. The core experimental observations, the bowl-shaped field geometry hypothesis, the plasma confinement predictions—all of this existed before any large language model was consulted. Canon is human-authored and document-locked, maintained in foundation documents that no AI system has authority to modify. AI outputs are repeatedly rejected, corrected, and constrained; the revision history of any PF paper shows far more rejections than approvals.

No model is trusted to define ontology. When a model suggests that a PF concept maps to some standard physics construct, that suggestion is treated as a hypothesis requiring explicit human evaluation, not as an authoritative determination.

In documented PF use, the dual-AI method has often been more relentless than ordinary informal human review at the level of internal logical consistency, constraint enforcement, and wording ambiguity.

7.1 Evidence Boundary

This paper describes a methodology based on procedural design, the blind error-detection benchmark described in Section 3.1 and Appendix A, and documented internal application. The term ‘documented’ refers to internal revision logs, version histories, and rejection rates across multiple paper development cycles.

What this paper does not claim: comparative empirical superiority over peer review as an institution, measured error detection rates beyond the described benchmark, quantified improvement metrics, or applicability beyond constraint-heavy theoretical frameworks where canonical definitions must be enforced. The method is designed for contexts resembling PrimerField Theory development—where unconventional claims require rigorous internal consistency checking and where standard physics assumptions cannot be imported without explicit evaluation.

8. The Binocular Vision Analogy

Using two fundamentally different AI systems resembles binocular vision. This analogy is illustrative, not evidentiary—it clarifies the conceptual structure of the method without constituting proof of its effectiveness.

With one eye closed, depth collapses. Structure flattens. Errors hide in plain sight because there is no parallax to reveal them. A single model produces output that appears coherent from one angle, but that coherence may be an artifact of the viewing angle rather than a property of the object.

With two non-identical perspectives, parallax reveals inconsistency. False alignment becomes visible. Structure acquires depth. The slight difference in perspective between the two systems exposes features that neither would detect alone.

The triage layer—Claude classifying ChatGPT’s audit before revisions are executed—adds a third dimension to this analogy. Not only does the parallax between drafting and auditing reveal structural features; the assessment of whether the parallax signal itself is genuine adds a further layer of error discrimination.

The BCM adds a fourth dimension. Each model’s parallax is not neutral: it is shaped by the gravitational pull of its training baseline. What looks like a disagreement between models may in part reflect two different BCM gradients operating simultaneously. The human researcher’s role is not merely to resolve disagreements but to distinguish between ‘these models disagree because the claim is ambiguous’ and ‘these models disagree because one of them is under BCM pressure from its baseline.’ The dual-model structure makes both diagnoses possible; single-model review makes neither visible.

PrimerField Theory papers are not produced by AI consensus. They are produced by AI parallax under human control—the deliberate exploitation of model differences to reveal the shape of the theoretical claims themselves.

9. Replicability Protocol

The dual-model adversarial method is procedurally reproducible. The following protocol defines the essential structure. Note that process replicability is claimed—the ability to execute the same procedural steps—not outcome replicability. Different theoretical content subjected to this process will yield different results; the method does not guarantee identical conclusions across applications.

Roles.

Model A (synthesis and triage): Claude. Produces drafts optimized for clarity, rhetorical force, and narrative coherence. Also classifies Model B’s audit findings as genuine errors, likely false positives, or human-escalation items; does not validate its own draft for release.

Model B (hostile audit / constraint validation): ChatGPT. Performs hostile read audits optimized for constraint enforcement, error detection, and logical scrutiny. Confirmed as primary validator via blind error-detection benchmark (Section 3.1, Appendix A). Also performs final QA after revision.

Human authority: defines canon, resolves disputes, approves all outputs, and bears final responsibility for all claims.

Sequence.

(1) Human supplies theory, constraints, and canon.
(2) Claude produces aggressive draft.
(3) ChatGPT performs cold hostile audit.
(4) Claude classifies audit findings: genuine errors corrected, false positives rejected with documented reasoning, ambiguous cases escalated to human.
(5) Claude rewrites under confirmed corrections.
(6) ChatGPT performs final QA.
(7) Human approves or rejects for release.

Acceptance rules. No model validates its own output. Full document reaudit occurs after each revision cycle. Canon exists in locked documents external to both models. Disagreement between models triggers investigation, not automatic resolution. BCM divergences—cases where one model flags a claim that the other does not, particularly for claims distant from mainstream consensus—are escalated to the human researcher rather than resolved by model agreement.
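The sequence and acceptance rules above can be sketched as a single function that takes each role as a callable. This is an illustrative outline under stated assumptions, not the actual PF tooling: every parameter name, the verdict strings, and the returned dictionary are hypothetical, and no real model API is called.

```python
def run_pipeline(theory, canon, draft, audit, triage, rewrite,
                 final_qa, human_decides, human_approves):
    """One pass through steps 2 to 7; step 1 is the theory and canon inputs."""
    doc = draft(theory, canon)                 # (2) drafter writes
    findings = audit(doc, canon)               # (3) auditor, not drafter, reviews
    accepted, rejected = [], []
    for finding in findings:                   # (4) drafter-side triage
        verdict = triage(finding, doc, canon)
        if verdict == "escalate":
            verdict = human_decides(finding)   # ambiguous cases go to the human
        if verdict == "genuine":
            accepted.append(finding)
        else:
            rejected.append(finding)
    revised = rewrite(doc, accepted, canon)    # (5) rewrite under confirmed corrections
    qa_findings = final_qa(revised, canon)     # (6) full-document reaudit
    approved = human_approves(revised, qa_findings)  # (7) release decision
    return {"document": revised, "rejected": rejected,
            "qa_findings": qa_findings, "approved": approved}
```

Passing the roles in as separate callables keeps the rule that no model validates its own output visible in the signature: draft and rewrite belong to one model, audit and final_qa to the other, and the two human callables are never delegated.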

Model drift. Large language models change over time as vendors update training and alignment. The method assumes periodic recalibration: behavioral profiles and role assignments must be reassessed when model versions change significantly. BCM baselines also shift with model versions—a model retrained on an updated corpus will exhibit different consensus-pressure gradients. The blind error-detection benchmark described in Section 3.1 provides a replicable instrument for reassessment. The adversarial structure remains valid; the specific behavioral assignments require empirical verification rather than assumption.
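A recalibration trigger could be as simple as comparing the model versions recorded at benchmark time with those currently in use. The sketch below is one possible check, with hypothetical role names and version labels; it treats any change in version label as a trigger, which is cruder than the "significant change" criterion described above.

```python
def roles_needing_recalibration(benchmarked, current):
    """List roles whose current model version differs from the benchmarked one."""
    return sorted(
        role for role, version in current.items()
        if benchmarked.get(role) != version
    )


# Hypothetical version labels, for illustration only.
benchmarked = {"auditor": "auditor-v1", "drafter": "drafter-v1"}
current = {"auditor": "auditor-v2", "drafter": "drafter-v1"}
print(roles_needing_recalibration(benchmarked, current))  # ['auditor']
```

Any role returned by the check would be rerun against the blind error-detection benchmark before its assignment is trusted again.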

10. Conclusion

The combined use of ChatGPT and Claude AI in PrimerField Theory development is not a convenience or a productivity enhancement. It is a deliberate methodological choice designed to expose error, enforce discipline, and strengthen the written formulation of theoretical claims under adversarial pressure. That choice is grounded not in assumed model characteristics but in empirical performance under controlled test conditions.

The method exploits behavioral differences between models rather than seeking consensus between them. It maintains absolute human authority over canonical claims while leveraging AI capabilities for drafting, constraint enforcement, error detection, and audit triage. The three-phase loop—draft, audit, triage-and-revise—provides structural redundancy that neither model could supply alone, and that the human researcher directs without delegating authority.

The Baseline Correction Mechanism is a property of every participant in this workflow, including the human researcher. It does not disappear when acknowledged; it becomes visible. That visibility is the core structural advantage of the dual-model adversarial approach: not the elimination of bias, but the transformation of silent bias into a detectable, diagnosable signal under human control.

In repeated application, this approach produces theoretical work that has undergone systematic adversarial review for internal logical consistency—a form of stress testing that complements rather than replaces traditional scientific evaluation. It represents a replicable procedural framework for AI-assisted theory development—particularly valuable in domains where existing paradigms cannot be assumed correct, where canonical definitions must be enforced, and where unconventional claims require rigorous internal consistency checking.

Artificial intelligence did not replace thinking. It made weak thinking harder to hide.


Appendix A: Blind Error-Detection Benchmark

The following table summarizes the twelve intentional errors embedded in the test document, their categories, what counted as full versus partial detection, and the final outcome for each model. The errors were distributed across a complete PF paper without disclosure to either model. Both ChatGPT and Gemini were instructed to perform a thorough hostile audit under identical conditions.

Detection scoring: Full detection required the model to identify the specific error and explain why it constituted a violation. Partial detection required the model to flag the relevant passage but without correct characterization of the error type. No detection meant the passage was passed without comment.

| # | Category | Error Description | ChatGPT | Gemini |
|---|---|---|---|---|
| 1 | Logical inconsistency | Claim in Section 3 contradicted conclusion stated in Section 6 without acknowledgment | Full | Full |
| 2 | Definitional violation | PF structure term used outside its canonical scope (internal structure applied to outer boundary) | Full | None |
| 3 | Scope overreach | Observational claim stated as if directly measured when it was inferred from indirect data | Full | Partial |
| 4 | Canon violation | Terminology from standard physics imported without comparison label | Full | None |
| 5 | Hidden assumption | Conclusion required an unstated premise about scale invariance not established in the paper | Full | None |
| 6 | Logical inconsistency | Two paragraphs used the same PF structure term to refer to different physical locations | Full | Partial |
| 7 | Scope overreach | Extrapolation to galactic scale presented without the Category 2 label required by attribution protocol | Full | None |
| 8 | Definitional violation | Internal substructure (FR) placed outside the bowl boundary in contradiction of canonical placement rules | Full | Full |
| 9 | Canon violation | Numeric value cited differed from the locked canonical value in the foundation document | Full | None |
| 10 | Hidden assumption | Causal language introduced where only structural correspondence had been established | Full | Partial |
| 11 | Logical inconsistency | Abstract claimed a result that the body of the paper explicitly marked as an open question | Full | None |
| 12 | Scope overreach | Conclusion section used ‘proves’ where the body had used ‘consistent with’ throughout | Full | Full |
| Total | | | 12 / 12 (100%) | ~4 / 12 (~33%) |

Note: Error descriptions are representative characterizations of the error category; exact wording and placement within the original test document are on file with the PrimerField Foundation. In the table above, Gemini has three full detections and three partial detections, so its count is 3 of 12 (25%) if partial detections are excluded and 6 of 12 (50%) if each is counted as a detection; the ‘approximately 4’ figure reported in Section 3.1 lies between those bounds. ChatGPT version used: GPT-4o. Gemini version used: Gemini 1.5 Pro deep-think mode. Test conducted: 2025.
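The scoring rule and the sensitivity of the Gemini figure to partial credit can be worked through directly from the table. The sketch below is illustrative: the function names are hypothetical, and the partial-credit weights are assumptions chosen to show the range, since the paper does not state how partial detections were weighted.

```python
from collections import Counter


def score_detection(flagged_passage, characterized_correctly):
    """Apply the Appendix A scoring rule to one embedded error."""
    if flagged_passage and characterized_correctly:
        return "Full"
    if flagged_passage:
        return "Partial"
    return "None"


# Gemini column of the Appendix A table, errors 1 through 12.
GEMINI = ["Full", "None", "Partial", "None", "None", "Partial",
          "None", "Full", "None", "Partial", "None", "Full"]


def detection_rate(results, partial_credit):
    """Fraction detected, crediting each partial detection at partial_credit."""
    counts = Counter(results)
    return (counts["Full"] + partial_credit * counts["Partial"]) / len(results)


for label, credit in [("partials excluded", 0.0),
                      ("half credit", 0.5),
                      ("partials included", 1.0)]:
    print(f"{label}: {detection_rate(GEMINI, credit):.1%}")
# partials excluded: 25.0%
# half credit: 37.5%
# partials included: 50.0%
```

Under any of these weightings the Gemini rate stays at or below half of the twelve errors, against twelve of twelve for ChatGPT.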