Overview – Precision Gesture Interaction

Description

xtan.ai explores a new approach to precision gesture interaction for professional 3D workflows such as digital content creation (DCC), CAD and virtual production.

Instead of relying purely on AI-based depth estimation, the system is designed around metric stereo geometry and deterministic tracking pipelines. The goal is to provide stable, low-latency spatial interaction that can be reliably integrated into professional software tools.

It is important to distinguish between two fundamentally different approaches:

Direct AI-based recognition from camera images, where gestures are inferred directly from raw visual input. This approach is often less transparent, less deterministic, and more error-prone under difficult real-world conditions. It currently dominates many mainstream use cases.
Geometry-first 3D reconstruction followed by structured recognition, where the system first reconstructs stable, high-quality 3D data without relying on AI for the initial perception stage. Only afterward is the resulting 3D representation passed to a higher-level model, such as a graph convolutional network (GCN), for gesture interpretation. When the 3D signal is clean and stable, this approach is typically more robust, more transparent, and often better suited to real-time use, with lower error rates. It can also simplify model training, improve the usefulness of augmentation, and make it easier to scale the recognition pipeline to additional gestures or application domains.
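To illustrate how reconstructed 3D keypoints could be handed to a GCN, the following minimal sketch builds the symmetrically normalized adjacency matrix commonly used in graph convolutions and applies one propagation step. It is an illustration only, not the project's model: the three-keypoint chain, the function names `normalized_adjacency` and `propagate`, and the coordinates are assumptions, and the learned weights and nonlinearity of a real GCN layer are omitted.

```python
import math

def normalized_adjacency(num_nodes, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    # Start from the identity so every keypoint keeps its own features.
    a = [[1.0 if i == j else 0.0 for j in range(num_nodes)]
         for i in range(num_nodes)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    degree = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(degree[i] * degree[j])
             for j in range(num_nodes)]
            for i in range(num_nodes)]

def propagate(a_hat, features):
    """Mix each keypoint's features with those of its graph neighbours."""
    n, dim = len(features), len(features[0])
    return [[sum(a_hat[i][k] * features[k][d] for k in range(n))
             for d in range(dim)]
            for i in range(n)]

# Hypothetical chain of three keypoints (e.g. joints of one finger),
# each carrying a metric 3D position in metres.
edges = [(0, 1), (1, 2)]
keypoints_m = [[0.00, 0.00, 0.50], [0.00, 0.03, 0.50], [0.00, 0.05, 0.49]]

a_hat = normalized_adjacency(3, edges)
smoothed = propagate(a_hat, keypoints_m)
```

Because the input is already a metric 3D skeleton, the graph structure is fixed by anatomy, and only the per-keypoint features change from frame to frame.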

🧠 Motion Interpretation

What this layer does: It converts poses and keypoints into high-level intents using gesture grammars, state machines, and context rules (tool modes, constraints, safety).

The layer handles:

debounce
disambiguation
confidence scoring
temporal consistency

The result is a set of deterministic, low-latency interaction events suitable for professional applications.
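As a rough sketch of how debounce, confidence scoring, and temporal consistency might fit together, the example below gates per-frame gesture labels through a confidence threshold and a stable-frame counter before emitting start and end events. The class name `GestureDebouncer`, its parameters, and the event strings are hypothetical and not part of MotionCoder.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class GestureDebouncer:
    """Emit an event only after a label has been stable for several frames."""
    min_confidence: float = 0.8      # assumed threshold for accepting a frame
    min_stable_frames: int = 3       # assumed debounce window, in frames
    _candidate: Optional[str] = field(default=None, init=False)
    _count: int = field(default=0, init=False)
    _active: Optional[str] = field(default=None, init=False)

    def update(self, label: Optional[str], confidence: float) -> Optional[str]:
        # Low-confidence detections are treated as "no gesture".
        if label is not None and confidence < self.min_confidence:
            label = None
        # Count how long the same label has been observed in a row.
        if label == self._candidate:
            self._count += 1
        else:
            self._candidate = label
            self._count = 1
        # Switch state only once the candidate has been stable long enough.
        if self._count >= self.min_stable_frames and self._candidate != self._active:
            previous = self._active
            self._active = self._candidate
            if self._active is not None:
                return f"start:{self._active}"
            return f"end:{previous}"
        return None

debouncer = GestureDebouncer()
frames = [("pinch", 0.9)] * 3 + [("pinch", 0.5), ("pinch", 0.9)] + [(None, 0.0)] * 3
events = [debouncer.update(label, conf) for label, conf in frames]
```

In this trace, the single low-confidence frame does not end the pinch; the gesture ends only after three consecutive frames without a confident detection. The same input sequence always yields the same events, which is the deterministic behaviour this layer aims for.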

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
MotionCoder	Real-time gesture interpretation engine with state machine and context logic.	Apache-2.0	🟡 Planned	MotionCoder

🎥 Hardware Recommendation

For high-precision tracking, the system can be combined with EdgeTrack stereo rigs, which provide synchronized NIR stereo capture and geometry-based depth reconstruction.
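For readers unfamiliar with geometry-based depth reconstruction, the sketch below shows the standard triangulation relation for a rectified stereo pair, Z = f · B / d, under a pinhole camera model. It is a generic illustration rather than EdgeTrack's pipeline; the function names and the example intrinsics (focal length, baseline, principal point) are assumptions.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Metric depth for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_length_px * baseline_m / disparity_px

def triangulate(x_left_px, y_px, x_right_px,
                focal_length_px, baseline_m, cx_px, cy_px):
    """Back-project a matched pixel pair to a 3D point in the left camera frame."""
    disparity = x_left_px - x_right_px
    z = depth_from_disparity(disparity, focal_length_px, baseline_m)
    x = (x_left_px - cx_px) * z / focal_length_px
    y = (y_px - cy_px) * z / focal_length_px
    return (x, y, z)

# Assumed example values: 1000 px focal length, 10 cm baseline,
# principal point at (640, 360).
point_m = triangulate(700.0, 360.0, 600.0,
                      focal_length_px=1000.0, baseline_m=0.10,
                      cx_px=640.0, cy_px=360.0)
```

With these assumed values, a disparity of 100 px corresponds to a depth of 1 m. Since depth is inversely proportional to disparity, a fixed matching error in pixels produces a depth error that grows roughly with the square of the distance, which is one reason baseline and synchronization matter for precision work.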

More information about the hardware layer can be found here:

🔌 Connectors

What this layer does: It maps interaction intents from MotionCoder to application-native actions such as operators, hotkeys, API calls, or engine events.

Each connector module (Coder2XY) targets a specific software ecosystem and translates MotionCoder output into application commands.

🧩 Module	📝 Short Description	📲 Target System	⚖️ License	⚠️ Notes	🚦 Status	🔗 Link
Coder2Blender	Add-on/API bridge: gestures → operators, hotkeys, nodes.	Blender	MIT	—	🟡 Research (API exploration)	coming soon
Coder2Unreal	Plugin bridge: gestures → Blueprint/C++ events.	Unreal Engine	MIT	—	🟡 Planned	coming soon
Coder2Dassault	Macro/API bridge: gestures → CAD commands.	Dassault (SolidWorks/CATIA)	MIT	—	🟠 Targeted for next year	coming soon
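The connector idea can be sketched as a small dispatch table that binds intent names to application-specific handlers. Everything in this example is hypothetical: the `Connector` class, the intent names, and the command strings are placeholders, and the handlers return descriptive strings instead of calling any real Blender, Unreal, or CAD API.

```python
from typing import Callable, Dict, Optional

class Connector:
    """Maps interaction intents to actions of one target application."""

    def __init__(self, target: str) -> None:
        self.target = target
        self._handlers: Dict[str, Callable[[dict], str]] = {}

    def bind(self, intent: str, handler: Callable[[dict], str]) -> None:
        # Register which action should run for a given intent name.
        self._handlers[intent] = handler

    def dispatch(self, intent: str, params: dict) -> Optional[str]:
        # Unknown intents are ignored so the core engine never depends
        # on what a particular adapter supports.
        handler = self._handlers.get(intent)
        if handler is None:
            return None
        return handler(params)

# Hypothetical adapter: a real one would invoke the host application's API here.
blender_like = Connector("blender")
blender_like.bind("grab_move", lambda p: f"translate {p['delta']}")
blender_like.bind("confirm", lambda p: "apply")

result = blender_like.dispatch("grab_move", {"delta": (0.1, 0, 0)})
ignored = blender_like.dispatch("unknown_intent", {})
```

Keeping the intent vocabulary on one side and the handlers on the other means a change in a target application's API touches only the bound handlers.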

Why separate MotionCoder and Coder2XY?

Modularity: MotionCoder (gesture detection and semantics) remains independent from any target application.

Reuse: One interpretation engine can support many software integrations.

Breadth: Potentially 100+ software targets (CAD, DCC, robotics tools, assistive interfaces).

Maintainability: API changes affect only the relevant adapter, not the core engine.

Portability: Enables fast integration into new software ecosystems.

🕹️ Peripherals (optional)

This layer includes optional hardware devices designed to improve ergonomics and interaction precision.

Examples include:

clutch/confirm buttons
mode switches
haptic feedback

These devices communicate via BLE or USB and are designed to avoid NIR interference, ensuring compatibility with camera-based tracking systems.

Note: These peripherals do not require MotionCoder and can operate as standard input devices (HID).

🧩 Module	📝 Short Description	🔌 Hardware / Dependencies	⚖️ License	⚠️ Notes	🚦 Status	🔗 Link
Pen3D	Tracked 3D pen with buttons and optional haptic feedback.	Optional ESP32-S3 (BLE) or mechanical design.	Apache-2.0	BLE GATT notifications, deep-sleep wake-on-button, optional USB-CDC debug. Designed for 850 nm NIR environments.	🟡 Planned	Pen3D
HMDone	Minimal VR headset concept using external marker-based tracking only.	Works with high-resolution HMDs (e.g. Pimax Crystal, Valve headsets).	Apache-2.0	Designed for multi-view NIR tracking. Inside-out tracking intentionally ignored.	🟠 Later	HMDone

🗺️ Roadmap

Coming soon.

The project is currently in the research and prototyping phase, focusing on:

architecture definition
stereo-based tracking experiments
gesture interaction models
early software integrations

More details will be published as the ecosystem evolves. 🚀
