Skip to content

neocoretechs/Relatrix

Repository files navigation

A newer, easier way of looking at and manipulating data.

The Relatrix:

Full featured database which not only builds on the Key/Value Deep Store concept with full transaction checkpointing and recovery and embedded and server models, but provides a novel way of storing, retrieving and relating complex unstructured object data.

Data can be expressed as freeform JSON packets that can be related ad-hoc. For instance, here are some concrete examples:

fromObject: { Image1: [{"count":1,"detections":[{"name":"refrigerator"}]}], timestamp: 1779166030035 }
mapObject: { Image2: [{"count":1,"detections":[{"name":"toilet"}]}], timestamp: 1779166070035 }
toObject: { timestamp: 1779166050035, Image3: [{"count":0,"detections":[{"name":"alligator"}]}]}

Building relationships is as easy as saying:
Relatrix.store([fromObject],[mapObject],[toObject]); // This stores a functional relationship
and a query for that set is as simple as:

Stream stream = (Stream) Relatrix.findStream('?', '?', '?');
stream.forEach(e -> Stream.of(e).forEach(g -> System.out.println("Element A:"+g)));

Or using the old Iterator model:
Iterator iterator = Relatrix.findSet('?',[mapObject],'?'); // This retrieves all domain objects and range objects mapped through [mapObject]

Toward an easier, more flexible, data management paradigm

In order to provide structure to unstructured data, an otherwise esoteric branch of math called category theory provides us with a perfect description of how to add increasing levels of structure through its algebraic rules. The Relatrix is a new data management system whose underlying principles are loosely based on this branch of mathematics. The Relatrix is a framework that manages unstructured data by the mapping of the functional relationships within the data sets. These relationships can be simple, standard key/value relationships, or a more complex triplet that contains three objects. The objects in the triplets can themselves be relationships that can be composed into more complex relationships. Relationships are defined as objects comprised of an overarching domain object, with a functional mapping object, that relates the domain to a range object.
Data is retrieved as sets presented as standard iterators or streams. There are a number of simple yet powerful sets of retrieval methods utilizing wildcards and concrete objects, and class and object level set return indicators, to get the data back out without getting lost in deep, dark, thickets common to schemaless data stores. It's built atop Meta RocksDb, so the underlying framework is mature, high performance, and reassuringly active and supported.

To get slightly more technical; imagine that rather than being able to map keys one-to-one like a relational database you can map them through a function that adds more relevance. We can then assign equivalence through isomorphism, or functional similarities between data. The relationships themselves are objects which can be composed into higher level relationships that open up the data to perform analysis not possible with conventional databases. To compose two relationships to an association:
Relatrix.store([fromObject1],[mapObject1],Relatrix.store([fromObject2],[mapObject2].[toObject2])); // This composes relationships

Stream stream = (Stream) Relatrix.findStream([fromObject1],'','?', true); // This returns all range objects mapped to [fromObject1] through ANY map object in parallel, including the relationship stored above

Stream stream = (Stream) Relatrix.findStream(('','',''); // This makes ready for consumption by stream all relationships as identity objects

public class VisualCortex {
	public static void main(String[] args) throws Exception {
		Relatrix.setTablespaceDirectory(args[0]);
		Stream<Result> stream = (Stream<Result>) Relatrix.findStream('?', '?', '?', true);
		Map<Object, Map<Object, Map<Object, Long>>> nameCount = stream.collect(Collectors.groupingBy(b -> b[0].toString(),
		Collectors.groupingBy(d -> d[1].toString(),
		Collectors.groupingBy(e -> e[2].toString(), Collectors.counting()))));
		nameCount.forEach((name, count) -> {
			System.out.println(name + ":" + count);
		});
	}
}

The Category Theory Connection

The realization that any key/value store to can be leveraged into a semantic data model has its roots in Category Theory and its foundational math concept called the Yoneda Lemma.

A Key-Value (K/V) store is not just a data structure; it is a concrete representation of a mathematical construct called a Presheaf, and the Yoneda Embedding of Category Theory is the exact mathematical bridge that proves it.

The structure of a K/V store mirrors category theory natively.

  1. The Key/Value Store as a Category (f{KV})

Let us lay out a simple category, which we will call (f{KV}):

Objects: The Keys and Values themselves are the objects (or types) in our category ((A, B, C)). Morphisms: A key-value lookup is a directed function. If a key (A) maps to a value (B), the database lookup operation is a morphism, or function f mapping A to B: (f: A to B). Composition: If key (A) maps to a value (B) ((f: A to B)), and that value (B) happens to be used as a key that maps to a value (C) ((g: B to C)), you can compose them. The operation of looking up (A), getting (B), and immediately looking up (B) to get (C) is the composed morphism g composed with f mapping A to C: (g o f: A to C). Identity: The identity morphism (id_A) is simply passing a key and getting itself back (a no-op lookup).

  1. Enter the Yoneda Embedding: The Ultimate Proof

The Yoneda Lemma and its embedding (f{Y}) state something beautiful: Any object in a category can be completely and perfectly defined by its relationships to all other objects.

Mathematically, it embeds a category (C) into a Functor Category (specifically, the category of Presheaves, (f{Set}^C^op).

Let's translate exactly what that means into database terms:

The Hom-Functor is a Query: In category theory, (Hom)(-, A) is a functor that takes any object and returns the set of all morphisms pointing into (A). In your database, this is the exact equivalent of running a query: "Give me every key that maps to the value (A)." Objects as Lookups: The Yoneda embedding maps your abstract semantic object (A) to a functor: (A mapsto (Hom)(-,A))This means that instead of defining an object by its internal state (like a platform-specific Java object), the object is defined entirely by the sum total of all lookups that point to it. The Embedding is "Full and Faithful": This is the critical partial proof. Because the Yoneda embedding is full and faithful, it guarantees that the structure of the data relationships is perfectly preserved when moving into this lookup-based functor space. You lose absolutely zero semantic information by flattening your world into key/value lookup pairs.

  1. Functors as Database Migrations

Because a K/V store inherently models this structure, a Functor ((F: f{KV}_1 to f{KV}_2)) becomes the mathematical definition of a perfect data migration or API adapter.

If you change your underlying schema, a Functor ensures that every key mapping in your old store ((A to B)) is cleanly and deterministically mapped to a new key mapping ((F(A) to F(B))) in the new store, preserving the operational composition of lookups.

A Design that Capitalizes on This

Most K/V stores are "dumb" flat storage. By recognizing that the key space can host a tree structure (Domain + MapFn), we are actively implementing the Grothendieck Construction—a way of building a single flattened category out of a collection of varying ad-hoc functor spaces.

The Composed Morphism Prefix Extractor acts as a natural projector, pulling the root object out of a composed chain, ensuring that even when morphisms stack infinitely, they remain rooted in a stable, queryable mathematical coordinate.

The architectural roots of this framework give us:

Natural Transformations modeling the translation layer between serialized Java historical data and new deterministic formats. Inverse morphisms (Isomorphisms) to navigate the semantic graph backward from Range to Domain.

Many semantic models (such as SDM or Vbase) treated databases as rich graphs of structural mappings rather than flat tabular entities. By using a structural triple lookup model ({domain} to {morphism} to {range})) using category theory and posets (partially ordered sets), we can build an elegant, algebra-driven query execution engine. The Relatrix semantic database system (http://github.com/neocoretechs/Relatrix) uses such a model.

  1. The Geometry of the findset Poset Query

In Relatrix, when you invoke, for example ,the query findset('?', '', object), where object can be a literal or a composed morphism, '' is a wildcard, and '?' means 'return the column', we are computing a Fiber (or Fiber Product) in the category of the data.

Because the data is stored sequentially in the underlying key/value engine from Meta called RocksDB, this query evaluates with clean physical execution:

Prefix Alignment: The engine targets the storage boundaries based on the object coordinates. Wildcard Streaming: The * mapping parameter tells the iterator to accept any morphic relation. Poset Projection: The ? return token limits the resulting stream to a single-column poset representing the matching domains.

Because everything is structured on a poset, your result isn't just a flat list of matching IDs. It inherits the Subtyping / Specialization Hierarchy intrinsic to the semantic model, preserving the structural order of the graph natively.

  1. Isomorphisms: The Key to Automated Schema Discovery

The realization that Isomorphisms are the ultimate tool for schema discovery is mathematically precise. In category theory, an isomorphism is a morphism (f: A to B) that has an absolute two-sided inverse (g: B to A) such that: (g o f=id_A and f o g=B)

If your system can automatically identify or compute inverse morphisms, schema discovery transforms from a loose string-parsing heuristic into a strict, verifiable constraint verification loop.

A. Reversing the Lens with Functorial Pairs

In the (d to m to r) design, finding a schema without an isomorphism requires you to crawl every single entry to see what links where.

However, if a relationship is isomorphic, the existence of the forward mapping (f) guarantees the existence of a unique backward mapping (f^{-1}).

To track this without destroying write performance, implement the inverse mapping directly into the native database storage JSON CBOR key strategy as a twin record. When you store:

Forward Key: (< {Domain}, {MapFn} > to {Range}) Isomorphic Reverse Key: (<{Range}, {MapFn}^{-1} > to {Domain})

B. Executing Dynamic Type Inference via Posets

When you have an isomorphic pair, the schema discovery system doesn't need a central schema registry. It can discover the type graph dynamically using the findset primitives.

For example, to discover what structural type a keyless object (like a JPEG payload) belongs to, you can use the isomorphism to map backward up the poset:

 [ Universal Schema Poset Space ]
                               ▲
                               │ (Isomorphic Inverse Map)
                      [ findset('?', '*', object) ]
                               │
            ┌──────────────────┴──────────────────┐
            ▼                                     ▼
   [ Domain Entity X ]                   [ Domain Entity Y ] 

By querying findset('?', '*', object), you pull the column of domains that map to it. Because that result is a poset, you can immediately compute the Least Upper Bound (LUB) of those domains.

That LUB is the mathematically precise, narrowest possible "Schema Class" or "Type" that the object can safely occupy.

  1. How the RocksDB Storage Layout Accelerates This

By pairing the findset logic with the RocksDB prefix extractor, the system avoids running complex graph traversal algorithms entirely.

When the database class preprocessor emits compiled key classes, it generates both the forward and reverse mapping variants. A findset query shifts from a complex relational multi-table join into a targeted binary sweep:

public class PosetQueryEngine {

    public RocksIterator executeFindSet(Object domainPattern, Object mapPattern, Object rangeTarget) {
        ReadOptions ro = new ReadOptions().setPrefixSameAsStart(true);
        
        if (domainPattern.equals("?") && mapPattern.equals("*")) {
            // If the query is findset('?', '*', concrete_range)
            // 1. Convert the rangeTarget object into its isomorphic reverse lookup prefix
            byte[] reversePrefix = convertToReverseLookupPrefix(rangeTarget);
            
            // 2. Point the iterator directly at the reverse index partition
            RocksIterator iter = rocksDb.newIterator(ro);
            iter.seek(reversePrefix);
            
            // 3. This streams back the domains in exact poset order instantly!
            return iter; 
        }
        throw new UnsupportedOperationException("Query pattern complex composition variant");
    }
} 

The Ultimate Theoretical Win

By combining a semantic database foundation with a category-theoretic K/V structure, we have sidestepped the core issue of standard Graph databases (which must traverse pointers across disk blocks) and Relational databases (which must compute costly runtime table joins).

The Relatrix system treats schema discovery as the automatic computation of poset bounds, using your CBOR comparator to guarantee that the data remains organized in strict, mathematical order.

About

A new type of data management system with its roots in category theory.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors