Discur

Intelligence that learns, never forgets, and tunes itself

Built on hypercomplex algebra, Discur replaces gradient descent, backpropagation, and hand-tuned hyperparameters with five operations derived from a single eight-dimensional number system. Discur achieved 95.2% accuracy on MNIST with zero gradient computation in the classifier and no hand-tuning.

Mechanism

Routing

Each node in the trie decomposes incoming data along the seven quaternionic subalgebras of the octonions, indexed by the Fano plane. The child whose subalgebra captures the strongest projection receives the input. Routing requires no learned parameters: the subalgebra structure is a property of octonionic multiplication itself.
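As a sketch of this parameter-free routing, the projection of an input onto each quaternionic subalgebra can be read directly off its coordinates. The Fano-line labeling below (e_i · e_{i+1} = e_{i+3}, indices mod 7) is one common convention and may differ from the one used here; `FANO_LINES` and `route` are illustrative names, not from the source.

```python
# Parameter-free routing sketch: project an input octonion onto each of the
# seven quaternionic subalgebras and pick the strongest. The subalgebra of the
# Fano line (i, j, k) is span{1, e_i, e_j, e_k}, so its squared projection is
# just a sum of four squared coordinates.
FANO_LINES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]

def route(x):
    """x: length-8 coefficient list (1, e1, ..., e7). Returns the index of the
    Fano line whose subalgebra captures the largest squared projection of x."""
    def proj_sq(line):
        i, j, k = line
        return x[0] ** 2 + x[i] ** 2 + x[j] ** 2 + x[k] ** 2
    return max(range(7), key=lambda idx: proj_sq(FANO_LINES[idx]))

# An input concentrated on e3 and e5 routes to the one line containing both.
print(FANO_LINES[route([0.0, 0.0, 0.0, 0.7, 0.0, 0.7, 0.0, 0.1])])  # -> (2, 3, 5)
```

Because the subalgebras are fixed by the algebra itself, nothing here is learned: routing is a handful of squared-coordinate sums and an argmax.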

Novelty detection

The associator [a, b, c] = (ab)c - a(bc) measures whether a triple of octonions lies within a common quaternionic subalgebra. When an input produces small associator norms with existing children, it is algebraically compatible. When the norm is large for all children, the input represents genuinely novel structure and triggers a branching event.
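This compatibility test can be demonstrated concretely. The sketch below builds a standard octonion product from the Fano triples (convention e_i · e_{i+1} = e_{i+3}, indices mod 7; the basis labeling used here may differ, though the associator behaves the same in any labeling) and evaluates the associator on compatible and incompatible triples.

```python
TRIPLES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]
MUL = {}  # (i, j) -> (k, sign): e_i * e_j = sign * e_k for distinct imaginary units
for a, b, c in TRIPLES:
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        MUL[(x, y)] = (z, 1.0)
        MUL[(y, x)] = (z, -1.0)

def omul(p, q):
    """Octonion product of length-8 coefficient lists (1, e1, ..., e7)."""
    r = [0.0] * 8
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i == 0:
                k, s = j, 1.0
            elif j == 0:
                k, s = i, 1.0
            elif i == j:
                k, s = 0, -1.0
            else:
                k, s = MUL[(i, j)]
            r[k] += s * pi * qj
    return r

def associator_norm(a, b, c):
    """|(ab)c - a(bc)|: zero iff a, b, c sit in a common quaternionic subalgebra."""
    lhs, rhs = omul(omul(a, b), c), omul(a, omul(b, c))
    return sum((u - v) ** 2 for u, v in zip(lhs, rhs)) ** 0.5

def unit(i):
    e = [0.0] * 8
    e[i] = 1.0
    return e

# e1, e2, e4 share the subalgebra of Fano line (1, 2, 4): compatible, route here.
print(associator_norm(unit(1), unit(2), unit(4)))  # -> 0.0
# e1, e2, e3 span multiple subalgebras: maximally incompatible, branch instead.
print(associator_norm(unit(1), unit(2), unit(3)))  # -> 2.0
```

A branching rule would then compare these norms against a threshold across all existing children.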

Content update

Nodes accumulate information through octonionic composition. The update rule composes the existing node content with new input via multiplication. Because octonionic multiplication is norm-preserving, sequential compositions neither explode nor vanish, and any composition can be reversed through algebraic inversion.
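A minimal sketch of this update rule, assuming a standard octonion product (Fano convention e_i · e_{i+1} = e_{i+3}) and reading "composition" as plain right-multiplication by the input; the exact update used in practice may differ.

```python
import random

TRIPLES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]
MUL = {}  # (i, j) -> (k, sign): e_i * e_j = sign * e_k for distinct imaginary units
for a, b, c in TRIPLES:
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        MUL[(x, y)] = (z, 1.0)
        MUL[(y, x)] = (z, -1.0)

def omul(p, q):
    """Octonion product of length-8 coefficient lists (1, e1, ..., e7)."""
    r = [0.0] * 8
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i == 0:
                k, s = j, 1.0
            elif j == 0:
                k, s = i, 1.0
            elif i == j:
                k, s = 0, -1.0
            else:
                k, s = MUL[(i, j)]
            r[k] += s * pi * qj
    return r

def norm(o):
    return sum(v * v for v in o) ** 0.5

def inv(o):
    """Octonion inverse: conjugate over squared norm."""
    n2 = sum(v * v for v in o)
    return [o[0] / n2] + [-v / n2 for v in o[1:]]

random.seed(0)
def rand_unit():
    v = [random.gauss(0.0, 1.0) for _ in range(8)]
    n = norm(v)
    return [u / n for u in v]

content, x = rand_unit(), rand_unit()
updated = omul(content, x)               # compose the new input into the node
print(abs(norm(updated) - 1.0) < 1e-12)  # multiplication preserves the norm -> True
# Alternativity gives content^{-1} * (content * x) = x: the update is reversible.
recovered = omul(inv(content), updated)
print(max(abs(u - v) for u, v in zip(recovered, x)) < 1e-12)  # -> True
```

The reversal step relies on the octonions being alternative: any subalgebra generated by two elements is associative, so left-multiplying by the inverse exactly undoes the composition.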

Consistency verification

Before committing an update, the trie checks whether the change is consistent with prior predictions stored in the node's memory buffer. It computes the counterfactual: given the proposed update, how would past predictions have differed? If the discrepancy exceeds a threshold, the update is rejected and escalated to a branching decision. This reasoning is enabled by algebraic invertibility.
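A hedged sketch of the counterfactual check. The prediction map is not specified here, so purely for illustration a node's "prediction" for a buffered input y is modeled as content · y, and `THRESHOLD` is a hypothetical value.

```python
TRIPLES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]
MUL = {}  # (i, j) -> (k, sign): e_i * e_j = sign * e_k for distinct imaginary units
for a, b, c in TRIPLES:
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        MUL[(x, y)] = (z, 1.0)
        MUL[(y, x)] = (z, -1.0)

def omul(p, q):
    """Octonion product of length-8 coefficient lists (1, e1, ..., e7)."""
    r = [0.0] * 8
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i == 0:
                k, s = j, 1.0
            elif j == 0:
                k, s = i, 1.0
            elif i == j:
                k, s = 0, -1.0
            else:
                k, s = MUL[(i, j)]
            r[k] += s * pi * qj
    return r

def unit(i):
    e = [0.0] * 8
    e[i] = 1.0
    return e

def consistency_delta(current, proposed, buffer):
    """Max gap between counterfactual and original predictions over the buffer."""
    def dist(p, q):
        return sum((u - v) ** 2 for u, v in zip(p, q)) ** 0.5
    return max(dist(omul(proposed, y), omul(current, y)) for y in buffer)

THRESHOLD = 0.5  # hypothetical rejection threshold

current = unit(1)
buffer = [unit(3), unit(5), unit(6)]
print(consistency_delta(current, current, buffer))   # no change -> 0.0
delta = consistency_delta(current, unit(2), buffer)  # propose replacing e1 by e2
print(delta > THRESHOLD)  # large discrepancy: reject, escalate to branching -> True
```

The key point is that the counterfactual costs only a few multiplications per buffered input, because the old content can always be divided back out.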

Structural health

Five invariants derived from the algebra monitor the trie's structural integrity: composition error bounds limit depth, compression efficiency flags redundant subtrees, subalgebra cleanliness measures branch separation, prediction consistency tracks update stability, and associator health detects routing strain. These replace the role of loss functions in gradient-trained systems.

Architecture

The octonions are an eight-dimensional number system extending the quaternions. Like the quaternions, they are non-commutative. Unlike the quaternions, they are also non-associative: the expression (ab)c does not always equal a(bc). This failure of associativity, far from being a deficiency, provides the octonionic trie with its core computational signal.

The octonions contain exactly seven quaternionic subalgebras, each a four-dimensional subspace where multiplication is associative. These subalgebras are indexed by the lines of the Fano plane, a finite projective geometry with seven points and seven lines. Each imaginary basis unit belongs to exactly three subalgebras, creating an overlapping structure that the trie uses as its routing channels.
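The overlap structure is easy to verify by counting. The line set below follows the convention e_i · e_{i+1} = e_{i+3} (indices mod 7), one common labeling among several.

```python
# Check that every imaginary unit e1..e7 lies on exactly three of the seven
# Fano lines, the overlap structure used as routing channels.
from collections import Counter

FANO_LINES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]
counts = Counter(i for line in FANO_LINES for i in line)
print(sorted(counts.items()))  # each of the 7 units appears exactly 3 times
```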

The associator is the mathematical object at the center of the octonionic trie. For three octonions a, b, and c, it measures how far their product deviates from associativity:

[a, b, c] = (ab)c - a(bc)
(1)

The associator vanishes when all three arguments lie in a common quaternionic subalgebra. It is nonzero precisely when the triple spans multiple subalgebras. For unit octonions, the associator norm is bounded between 0 and 2, with an expected value of approximately 1.09 for uniformly random inputs on the 7-sphere.
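These bounds can be checked numerically. The sketch below assumes a standard octonion product in the Fano convention e_i · e_{i+1} = e_{i+3}; the sampled mean varies with the draw but should land near the quoted 1.09.

```python
import random

TRIPLES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]
MUL = {}  # (i, j) -> (k, sign): e_i * e_j = sign * e_k for distinct imaginary units
for a, b, c in TRIPLES:
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        MUL[(x, y)] = (z, 1.0)
        MUL[(y, x)] = (z, -1.0)

def omul(p, q):
    """Octonion product of length-8 coefficient lists (1, e1, ..., e7)."""
    r = [0.0] * 8
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i == 0:
                k, s = j, 1.0
            elif j == 0:
                k, s = i, 1.0
            elif i == j:
                k, s = 0, -1.0
            else:
                k, s = MUL[(i, j)]
            r[k] += s * pi * qj
    return r

def associator_norm(a, b, c):
    lhs, rhs = omul(omul(a, b), c), omul(a, omul(b, c))
    return sum((u - v) ** 2 for u, v in zip(lhs, rhs)) ** 0.5

random.seed(1)
def rand_unit():
    """Uniform sample on the 7-sphere via a normalized Gaussian vector."""
    v = [random.gauss(0.0, 1.0) for _ in range(8)]
    n = sum(u * u for u in v) ** 0.5
    return [u / n for u in v]

samples = [associator_norm(rand_unit(), rand_unit(), rand_unit()) for _ in range(2000)]
print(0.0 <= min(samples) and max(samples) <= 2.0)  # bound holds -> True
print(round(sum(samples) / len(samples), 2))        # sample mean, near 1.09
```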

In the trie, the associator answers a routing question: given a new input x, an existing child c, and their parent node n, does x belong in the same branch as c? The answer is graded, not binary.

A small associator norm indicates algebraic compatibility: x and c share a subalgebra in the context of n, and x is routed to c for composition. A large norm across all children signals novelty: x is algebraically incompatible with every existing branch and triggers the creation of a new node. Intermediate values represent partial compatibility, where the composition proceeds but the consistency check examines whether the update destabilizes prior predictions.

The resulting structure is a dynamically growing tree. Each node stores a single octonion. Edges are labeled with the quaternionic subalgebra that the child most strongly activates. The branching factor is bounded by seven (the number of Fano plane lines), and the depth is limited by composition error accumulation.
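A minimal data-structure sketch matching this description; the class, field, and method names are illustrative, not taken from the source.

```python
# Hypothetical node layout for the trie: one octonion of content per node,
# children keyed by Fano line (so the branching factor is capped at seven),
# plus a small memory buffer of recent inputs for the consistency check.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Line = Tuple[int, int, int]

@dataclass
class TrieNode:
    content: List[float]                               # length-8 octonion coefficients
    children: Dict[Line, "TrieNode"] = field(default_factory=dict)
    buffer: List[List[float]] = field(default_factory=list)

    def add_child(self, line: Line, node: "TrieNode") -> None:
        if line not in self.children and len(self.children) >= 7:
            raise ValueError("branching factor is bounded by the 7 Fano lines")
        self.children[line] = node

root = TrieNode(content=[1.0] + [0.0] * 7)
root.add_child((1, 2, 4), TrieNode(content=[0.0, 1.0] + [0.0] * 6))
print(len(root.children))  # -> 1
```

Keying children by Fano line makes the branching bound structural rather than a tunable limit.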

[Figure: example trie with edges labeled by quaternionic subalgebras S₁–S₇]

Invariants

The octonionic trie monitors its own structural health through five invariants derived from the algebra. These invariants replace the role of loss functions in gradient-trained systems. Each produces a scalar signal without labeled data or external supervision.

Composition error bound

Sequential octonionic multiplication accumulates floating-point error. The composition error bound tracks this accumulation at each depth level. When the error exceeds a threshold, the node stops accepting new children, establishing a natural resolution limit per branch.

\varepsilon(d) = \frac{|x^{-1} \cdot (o \cdot x) - o|}{|o|}
(2)
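As an illustrative companion to this bound, the snippet below measures norm drift under deep composition, assuming a standard octonion product; the drift threshold one would gate depth on is hypothetical.

```python
import random

TRIPLES = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7), (5, 6, 1), (6, 7, 2), (7, 1, 3)]
MUL = {}  # (i, j) -> (k, sign): e_i * e_j = sign * e_k for distinct imaginary units
for a, b, c in TRIPLES:
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        MUL[(x, y)] = (z, 1.0)
        MUL[(y, x)] = (z, -1.0)

def omul(p, q):
    """Octonion product of length-8 coefficient lists (1, e1, ..., e7)."""
    r = [0.0] * 8
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i == 0:
                k, s = j, 1.0
            elif j == 0:
                k, s = i, 1.0
            elif i == j:
                k, s = 0, -1.0
            else:
                k, s = MUL[(i, j)]
            r[k] += s * pi * qj
    return r

def norm(o):
    return sum(v * v for v in o) ** 0.5

random.seed(2)
def rand_unit():
    v = [random.gauss(0.0, 1.0) for _ in range(8)]
    n = norm(v)
    return [u / n for u in v]

# In exact arithmetic a product of unit octonions stays on the unit sphere, so
# any norm drift after deep composition is pure floating-point error.
o = [1.0] + [0.0] * 7
for _ in range(2000):
    o = omul(o, rand_unit())
drift = abs(norm(o) - 1.0)
print(drift < 1e-10)  # rounding error stays tiny even at depth 2000 -> True
```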

Compression efficiency

The trie evaluates whether a subtree's structural complexity is justified by its contribution to prediction quality. A high ratio of nodes to effective capacity signals redundancy. Subtrees that consume structure without improving predictions are candidates for consolidation through sibling absorption.

\rho = \frac{\text{number of nodes}}{\text{effective prediction capacity}}
(3)

Subalgebra cleanliness

At each internal node, the pairwise associators between children measure how distinct their occupied subalgebras are. High cleanliness means the branches represent genuinely different algebraic directions. Low cleanliness suggests two children are too similar and should be merged.

\text{cleanliness}(n) = \min_{i \neq j} |[c_i, c_j, n]|
(4)

Prediction consistency

Before committing an update, the trie estimates how the change would have affected prior predictions stored in the node's memory buffer. The maximum discrepancy across all buffered predictions determines whether the update proceeds. Persistently high discrepancy indicates that the node is attempting to represent fundamentally incompatible information.

\delta(n) = \max_{y \in \text{buffer}(n)} |\text{counterfactual}(y, o') - \text{original}(y)|
(5)

Associator health

The mean associator norm at each node, averaged over recent inputs, measures how well the routing structure accommodates incoming data. A persistently high value signals that the node's children are not organizing data effectively and that restructuring may be needed.

\alpha(n) = \frac{1}{K} \sum_{k=1}^{K} |[x_k, c_k^*, n]|
(6)

Results

Experiment | Result | Context
MNIST Classification | 95.2% | CNN encoder, 60,000 training samples. Within 3 percentage points of k-nearest neighbors on the same features.
Stability-Plasticity | 0% forgetting | 97.7% overall accuracy across 7 sequential categories (342/350 test samples).
Novelty Detection | 5x spike ratio | Associator norm at category transitions vs. within-category baseline. All transitions exceed the 99th percentile of within-category norms.

Zero gradient computation in the classifier. All experiments use the octonionic trie as a direct replacement for trained classifiers.

Comparison

The octonionic trie shares individual features with existing memory-augmented architectures but combines them through a single algebraic substrate rather than independent engineered components.

 | NTM/DNC | SDM | HTM | Oct-Trie
Training | Backpropagation | None | Hebbian | None (algebraic)
Routing | Learned attention | Hamming distance | Spatial pooling | Subalgebra decomposition
Consistency check | None | None | None | Algebraic inversion

Paper

The full technical treatment, including proofs, experimental methodology, and open questions, is available on GitHub.

Read the paper

research@discur.ai · github.com/discur-ai