Overview – Multi-View Stereo Tracking System

Description

EdgeTrack is an open multi-view tracking architecture built around RAW-first capture, precise timing, hardware synchronization, and host-side fusion. It is designed as a transparent and deterministic foundation for stereo and multi-rig tracking systems, without depending on closed vendor pipelines.

The architecture itself is application-independent and intentionally kept general-purpose. It can serve as a pure technical foundation for a wide range of use cases, including gesture interaction, 3D keypoint extraction, spatial input, robotics, teleoperation, and more.

This repository serves as the central overview and concept documentation for EdgeTrack, including architectural notes, design principles, and related system documents.

Alongside classical stereo pipelines, EdgeTrack may also support optional neural stereo methods for multi-view processing, including acceleration on GPU-based hardware where appropriate. These AI-assisted components are optional and complement the core geometry-first architecture rather than replacing it.

EdgeTrack is hardware-agnostic and designed to remain highly flexible across different system classes, including ARM or x86 platforms, industrial cameras, and camera modules such as MIPI CSI.

📚 Documentation & Resources

Coming soon.

⏱️ Layer 1 – Timing

What this layer does:

This layer provides the timing backbone of the system. It controls trigger distribution, phase sequencing, and synchronized IR illumination across one or more camera rigs, enabling deterministic capture timing and stable multi-device operation.

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
TDMStrobe	Time-division-multiplexed IR illumination and trigger system with phase control (A/B/C/D) for precise multi-camera synchronization	Apache-2.0	🟡 In progress	TDMStrobe
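
As a rough illustration of time-division multiplexing with the four phases named above, the sketch below splits one frame period into equal slots and gives each phase (A to D) its own trigger offset, so that only one strobe is active at a time. The names `PhaseSlot` and `build_schedule`, the equal-slot layout, the guard interval, and the 8 ms frame period are all assumptions made for illustration; none of them are taken from TDMStrobe.

```python
from dataclasses import dataclass

PHASES = ("A", "B", "C", "D")


@dataclass(frozen=True)
class PhaseSlot:
    phase: str
    trigger_offset_us: int  # delay from frame start to trigger / strobe-on
    strobe_us: int          # how long the IR strobe stays on


def build_schedule(frame_period_us: int, guard_us: int = 100) -> list[PhaseSlot]:
    """Split one frame period into equal, non-overlapping slots (hypothetical layout)."""
    slot_us = frame_period_us // len(PHASES)
    strobe_us = slot_us - guard_us  # leave a dark gap before the next phase
    if strobe_us <= 0:
        raise ValueError("frame period too short for the requested guard interval")
    return [
        PhaseSlot(phase, index * slot_us, strobe_us)
        for index, phase in enumerate(PHASES)
    ]


schedule = build_schedule(frame_period_us=8000)  # assumed 8 ms frame period
for slot in schedule:
    print(slot.phase, slot.trigger_offset_us, slot.strobe_us)
```

The guard interval keeps one phase's strobe from bleeding into the next phase's exposure. A real controller would also have to account for sensor exposure time and trigger latency.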

🎥 Layer 2 – Capture

What this layer does:

This layer handles sensor-side image acquisition and edge-side preprocessing. It captures raw camera data, prepares it for downstream stereo or fusion stages, and preserves precise timing alignment with the timing layer.

Depending on configuration, it can output RAW streams, ROI metadata, preview streams, or lightweight edge-side inference results.

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
EdgeTrack	RAW10 mono capture pipeline running on ARM-based systems (e.g., Raspberry Pi, Jetson), designed for deterministic stereo acquisition	Apache-2.0	🟡 In progress	EdgeTrack
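
To make the RAW10 format listed for the EdgeTrack module more concrete, the following sketch unpacks pixel data under the assumption that it arrives in the MIPI CSI-2 packed RAW10 layout: four pixels in five bytes, where the first four bytes carry the upper eight bits of each pixel and the fifth byte carries the four pairs of low bits. Whether EdgeTrack uses this packed layout is an assumption, and `unpack_raw10` is a hypothetical helper, not part of the project.

```python
def unpack_raw10(packed: bytes) -> list[int]:
    """Unpack CSI-2 style packed RAW10 data into 10-bit pixel values."""
    if len(packed) % 5 != 0:
        raise ValueError("packed RAW10 length must be a multiple of 5 bytes")
    pixels = []
    for base in range(0, len(packed), 5):
        low_bits = packed[base + 4]  # 2 LSBs of each of the 4 pixels
        for i in range(4):
            high = packed[base + i] << 2          # upper 8 bits
            low = (low_bits >> (2 * i)) & 0b11    # pixel i uses bits [2i+1:2i]
            pixels.append(high | low)
    return pixels


sample = bytes([0xFF, 0x00, 0x80, 0x01, 0b11_10_01_00])
pixels = unpack_raw10(sample)
print(pixels)  # [1020, 1, 514, 7]
```

Real sensor buffers often pad each row to an alignment boundary, so a production unpacker would need to handle the row stride separately. This sketch ignores padding.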

When using industrial cameras with direct host output, Layer 2 can be omitted entirely, and the image data can be forwarded directly to Layer 3.

⚙️ Layer 3 – Host-side Stereo Compute

What this layer does:

For ARM-based edge nodes, this layer is optional and only needed when computationally heavy stereo processing is required. For setups built on industrial cameras with direct host output, however, this layer is typically required.

Instead of performing stereo reconstruction directly on the edge device, RAW data is streamed to a host PC, where dense or ROI-based disparity and depth computation is executed before the results are forwarded to the fusion layer.

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
CoreStereo	Host-side stereo processing module: ingests synchronized RAW or rectified stereo streams and performs disparity/depth reconstruction (dense or ROI-based), including optional filtering and confidence estimation	Apache-2.0	🟡 Planned	CoreStereo
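
The step from disparity to depth can be sketched with the standard relation for a rectified stereo pair, Z = f · B / d, where f is the focal length in pixels, B the baseline in metres, and d the disparity in pixels. The function name, the parameter names, and the minimum-disparity cutoff below are illustrative assumptions, not CoreStereo's interface.

```python
import math


def depth_from_disparity(
    disparity_px: float,
    focal_px: float,
    baseline_m: float,
    min_disparity_px: float = 0.1,
) -> float:
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px < min_disparity_px:
        # Near-zero disparity means the point is effectively at infinity
        # (or the match is invalid); avoid dividing by a tiny number.
        return math.inf
    return focal_px * baseline_m / disparity_px


# Assumed example values: 700 px focal length, 10 cm baseline.
depth_m = depth_from_disparity(disparity_px=35.0, focal_px=700.0, baseline_m=0.10)
print(depth_m)
```

Because depth is inversely proportional to disparity, a fixed disparity error costs more depth accuracy at long range. This is one reason a wider baseline or a confidence estimate is useful for distant points.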

If not needed, Layer 3 can be skipped entirely, and data can be sent directly to Layer 4.

🔗 Layer 4 – Multi-View Fusion

What this layer does:

This layer runs on a host system and performs multi-view spatial fusion.

It aggregates multiple stereo rigs, applies time synchronization, calibration refinement, and bundle adjustment, and produces stable, structured spatial outputs.

Outputs include:

3D keypoints / skeletons
Dense or sparse depth
Motion signals
Structured spatial representations

These outputs are designed for direct use in:

Robotics
Teleoperation
SLAM / mapping
Spatial input systems
Gesture-based interaction

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
CoreFusion	Aggregates 2–4 synchronized stereo rigs over LAN; performs multi-view calibration, bundle adjustment, outlier rejection, and low-latency fusion to produce stable 3D keypoints and spatial signals	Apache-2.0	🟡 Planned	CoreFusion
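
One simple way to combine per-rig estimates of the same keypoint, with outlier rejection, is sketched below: take the per-axis median as a robust centre, drop estimates that lie too far from it, and average the rest weighted by confidence. This is a deliberately minimal stand-in, assuming all estimates are already expressed in a shared, calibrated world frame. It is not CoreFusion's method, and `fuse_keypoint` and the 5 cm threshold are hypothetical.

```python
import math
from statistics import median

Point = tuple[float, float, float]


def fuse_keypoint(
    estimates: list[tuple[Point, float]], max_dev_m: float = 0.05
) -> Point:
    """Fuse (xyz, confidence) estimates of one keypoint from several rigs."""
    if not estimates:
        raise ValueError("need at least one estimate")
    # Robust centre: per-axis median across rigs.
    center = tuple(median(p[axis] for p, _ in estimates) for axis in range(3))
    # Keep only estimates close to the centre and with positive confidence.
    inliers = [
        (p, w) for p, w in estimates
        if w > 0 and math.dist(p, center) <= max_dev_m
    ]
    if not inliers:
        return center
    total = sum(w for _, w in inliers)
    return tuple(sum(p[axis] * w for p, w in inliers) / total for axis in range(3))


estimates = [
    ((1.00, 2.00, 3.00), 1.0),
    ((1.02, 2.00, 3.00), 1.0),
    ((1.50, 2.40, 3.00), 0.5),  # inconsistent rig, should be rejected
]
fused = fuse_keypoint(estimates)
print(fused)
```

A full fusion stage would more likely triangulate from the calibrated views and refine the result with bundle adjustment. The median-and-average approach here only illustrates the role of outlier rejection.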

🧠 Layer 5 – Motion Interpretation (Optional)

What this layer does:

This layer converts poses and keypoints into high-level intents using gesture grammars, state machines, and context rules (tool modes, constraints, safety). It handles debouncing, disambiguation, and confidence scoring, and produces deterministic, low-latency events.

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
MotionCoder	Real-time gestures/intents, state machine, context logic.	Apache-2.0	🟡 Planned	MotionCoder
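
The debounce and confidence-scoring behaviour described above can be sketched as a small two-state machine with hysteresis: a gesture only starts after several consecutive high-confidence frames and only ends after several consecutive low-confidence frames. The class name, thresholds, and frame counts are assumptions for illustration and do not describe MotionCoder's actual design.

```python
from typing import Optional


class DebouncedGesture:
    """Emit 'start' / 'end' events only after N consecutive consistent frames."""

    def __init__(self, on_threshold: float = 0.8, off_threshold: float = 0.5,
                 hold_frames: int = 3) -> None:
        self.on_threshold = on_threshold
        self.off_threshold = off_threshold
        self.hold_frames = hold_frames
        self.active = False
        self._streak = 0  # consecutive frames voting for a state change

    def update(self, confidence: float) -> Optional[str]:
        if not self.active:
            wants_change = confidence >= self.on_threshold
        else:
            wants_change = confidence < self.off_threshold
        self._streak = self._streak + 1 if wants_change else 0
        if self._streak >= self.hold_frames:
            self.active = not self.active
            self._streak = 0
            return "start" if self.active else "end"
        return None


detector = DebouncedGesture()
confidences = [0.9, 0.2, 0.9, 0.9, 0.9, 0.6, 0.3, 0.3, 0.3]
events = [detector.update(c) for c in confidences]
print(events)
```

Using separate on and off thresholds keeps a confidence value hovering near a single cutoff from toggling the gesture. The same input sequence always yields the same events, which matches the deterministic behaviour this layer aims for.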

🕹️ Peripherals (Optional)

What this layer does:

Purpose-built devices that improve ergonomics and precision (e.g., clutch/confirm, mode switches, haptic cues). They communicate over BLE or USB and avoid IR emission so that they do not interfere with cameras in NIR setups.

Note: These peripherals don’t require MotionCoder. They work like standard input devices (e.g., HID) and can be used independently.

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
Pen3D	Tracked 3D pen input with buttons and optional haptics.	Apache-2.0	🟡 Planned	Pen3D
HMDone	Minimal VR headset with external marker-based tracking only.	Apache-2.0	🟠 Later	HMDone

🛠️ Example Use Cases

What this section shows:

This section shows how the stack can be applied across different domains, including robot grippers, motion capture, 3D scanning, robotics, and workspace perception.

Note: These are example application areas built on top of the stack.

🧩 Module	📝 Short Description	⚖️ License	🚦 Status	🔗 Link
MoCap	Marker-based motion capture for tracking body, hand, or object movement in 3D space.	Apache-2.0	🟠 Later	MoCap
3DScan	Multi-view 3D scanning for geometry capture, reconstruction, and measurement.	Apache-2.0	🟠 Later	3DScan
PerceptGrid	External multi-camera perception layer for robots, safety monitoring, and shared workspace understanding.	Apache-2.0	🟠 Later	PerceptGrid

More modules and application areas will be added over time.

🗺️ Roadmap

Coming soon. The project is currently in the research and prototyping phase. 🚀
