GenomDB is a distributed storage system for genomic data. GenomDB can store both SAM and BAM files.
Each read of the SAM file will be split across multiple nodes in the network.
The index will translate GQL queries into the get calls that need to be made against the data.
- Raft Consensus: Ensures all nodes have consistent data
- Distributed Storage: Run multiple nodes locally or across networks
- HTTP API: Simple REST API for get/put operations
- Automatic Leader Election: Raft handles leader election and failover
The easiest way to run the cluster is using Docker Compose:
- Build and start all nodes:
docker-compose up -d
- Initialize the cluster (add nodes 2 through 5):
./scripts/docker-init.sh
Or manually:
# Add node2
curl -X POST http://127.0.0.1:8001/join \
  -H "Content-Type: application/json" \
  -d '{"node_id": "node2", "node_addr": "node2:9026"}'
# Add node3
curl -X POST http://127.0.0.1:8001/join \
  -H "Content-Type: application/json" \
  -d '{"node_id": "node3", "node_addr": "node3:9027"}'
# Add node4
curl -X POST http://127.0.0.1:8001/join \
  -H "Content-Type: application/json" \
  -d '{"node_id": "node4", "node_addr": "node4:9028"}'
# Add node5
curl -X POST http://127.0.0.1:8001/join \
  -H "Content-Type: application/json" \
  -d '{"node_id": "node5", "node_addr": "node5:9029"}'
- View logs:
docker-compose logs -f
- Open the monitoring dashboard:
- Stop the cluster:
docker-compose down
- Stop and remove volumes (clean slate):
docker-compose down -v
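The manual /join calls above follow a regular pattern (nodeN is reachable at nodeN:9024+N), so they can also be scripted. The following is a minimal sketch in Go, not the contents of scripts/docker-init.sh. The helper names joinBody and postJoin are hypothetical, and treating any status below 300 as success is an assumption about the API. By default the program only prints the payloads; pass `send` as the first argument to actually POST them to the leader.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"time"
)

// joinRequest mirrors the JSON body used by the curl commands above.
type joinRequest struct {
	NodeID   string `json:"node_id"`
	NodeAddr string `json:"node_addr"`
}

// joinBody builds the /join payload for node n, following the pattern in the
// manual commands: node2 -> node2:9026, ..., node5 -> node5:9029.
func joinBody(n int) ([]byte, error) {
	id := fmt.Sprintf("node%d", n)
	return json.Marshal(joinRequest{
		NodeID:   id,
		NodeAddr: fmt.Sprintf("%s:%d", id, 9024+n),
	})
}

// postJoin sends one join request to the leader's HTTP API.
// Assumption: any status below 300 means the join was accepted.
func postJoin(client *http.Client, leaderURL string, body []byte) error {
	resp, err := client.Post(leaderURL+"/join", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("join failed: %s", resp.Status)
	}
	return nil
}

func main() {
	// Dry run unless the first argument is "send".
	send := len(os.Args) > 1 && os.Args[1] == "send"
	client := &http.Client{Timeout: 5 * time.Second}

	for n := 2; n <= 5; n++ {
		body, err := joinBody(n)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("POST /join %s\n", body)
		if !send {
			continue
		}
		if err := postJoin(client, "http://127.0.0.1:8001", body); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}
```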
To rebuild/restart node processes automatically when .go files change, run:
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build
Or with Make:
make docker-dev-up
Detached mode:
make docker-dev-up-d
make docker-dev-up-d now starts containers and runs cluster init (/join for node2 through node5).
If Raft gets stuck after address/config changes (for example heartbeats to 127.0.0.1:9025 inside containers), reset dev volumes and re-init:
make docker-dev-reset
The hot-reload stack uses air inside each node container and bind-mounts your workspace, so saving Go files triggers recompilation and a process restart.
Stop hot-reload stack:
docker compose -f docker-compose.yml -f docker-compose.dev.yml down

Put a value:
sendex run sendex/put.yml
Note: Writes must go through the leader. If you hit a follower, it will redirect you to the leader.

Get a value:
sendex run sendex/get.yml
Note: Reads can be performed on any node (leader or follower).

Ping a node:
sendex run sendex/ping.yml

Add a node to the cluster:
curl -X POST http://127.0.0.1:8001/join \
  -H "Content-Type: application/json" \
  -d '{"node_id": "node2", "node_addr": "127.0.0.1:9026"}'
Note: Only the leader can add nodes.

Check node status:
curl http://127.0.0.1:8001/status
Returns node information including Raft state, leader, peer list, keys, and in-memory store values.
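The note that a follower redirects writes to the leader matters for client code, because the client has to re-send the request body to the new address. The sketch below illustrates this with two stand-in servers from net/http/httptest, not real GenomDB nodes. It assumes the redirect is an HTTP 307 response; the /put path and the JSON payload are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// putViaFollower starts a stand-in leader and follower, sends a write to the
// follower, and returns the body of the final response.
func putViaFollower(payload string) (string, error) {
	// Stand-in leader: accepts the write and echoes it back.
	leader := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		fmt.Fprintf(w, "leader stored %s", body)
	}))
	defer leader.Close()

	// Stand-in follower: answers with a 307 pointing at the leader.
	// Assumption: GenomDB followers redirect in this way.
	follower := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, leader.URL+r.URL.Path, http.StatusTemporaryRedirect)
	}))
	defer follower.Close()

	// http.Post builds a request with a replayable body, so the default
	// client re-sends the POST and its body when it follows the 307.
	resp, err := http.Post(follower.URL+"/put", "application/json", strings.NewReader(payload))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	out, err := io.ReadAll(resp.Body)
	return string(out), err
}

func main() {
	out, err := putViaFollower(`{"key":"k","value":"v"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

If the redirect were a 301, 302, or 303 instead, Go's default client would follow it with a GET and drop the body, so a write client would need to handle that case explicitly.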
Each node requires a configuration file with:
- database: Path to the BoltDB database file
- server.host: HTTP server host
- server.port: HTTP server port
- raft.node_id: Unique node identifier
- raft.bind_addr: Raft bind host (Raft port is derived as server.port + 1024)
- raft.advertise_addr: Optional Raft advertise host (port also derived as server.port + 1024)
- raft.data_dir: Directory for Raft logs and snapshots
- raft.peers: List of peer Raft addresses (empty for first node)
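The port derivation described above can be made concrete with a short sketch. The Go type and field names below are assumptions that mirror the documented keys rather than the project's actual config structs, the example values are made up, and falling back to the bind host when raft.advertise_addr is empty is likewise an assumption.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// raftPortOffset is the documented rule: Raft port = server.port + 1024.
const raftPortOffset = 1024

// ServerConfig and RaftConfig mirror the documented keys (names assumed).
type ServerConfig struct {
	Host string
	Port int
}

type RaftConfig struct {
	NodeID        string
	BindAddr      string
	AdvertiseAddr string // optional
	DataDir       string
	Peers         []string
}

type Config struct {
	Database string
	Server   ServerConfig
	Raft     RaftConfig
}

// RaftBindAddr returns the host:port that Raft binds to.
func (c Config) RaftBindAddr() string {
	return net.JoinHostPort(c.Raft.BindAddr, strconv.Itoa(c.Server.Port+raftPortOffset))
}

// RaftAdvertiseAddr returns the host:port advertised to peers.
// Assumption: an empty advertise host falls back to the bind host.
func (c Config) RaftAdvertiseAddr() string {
	host := c.Raft.AdvertiseAddr
	if host == "" {
		host = c.Raft.BindAddr
	}
	return net.JoinHostPort(host, strconv.Itoa(c.Server.Port+raftPortOffset))
}

func main() {
	// Hypothetical values for a first node running under Docker.
	cfg := Config{
		Database: "data/node1.db",
		Server:   ServerConfig{Host: "0.0.0.0", Port: 8001},
		Raft: RaftConfig{
			NodeID:        "node1",
			BindAddr:      "0.0.0.0",
			AdvertiseAddr: "node1",
			DataDir:       "data/node1",
		},
	}
	fmt.Println(cfg.RaftBindAddr())      // 0.0.0.0:9025
	fmt.Println(cfg.RaftAdvertiseAddr()) // node1:9025
}
```

With server.port 8001 the derived Raft port is 9025, which matches the 127.0.0.1:9025 address mentioned in the troubleshooting note and the 9026 to 9029 ports used when joining node2 through node5.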
Example config files are provided in the configs/ directory:
- config-node1.yml / config-node1-docker.yml - First node (bootstraps cluster)
- config-node2.yml / config-node2-docker.yml - Second node
- config-node3.yml / config-node3-docker.yml - Third node
- config-node4.yml / config-node4-docker.yml - Fourth node
- config-node5.yml / config-node5-docker.yml - Fifth node
Note: The -docker.yml variants are configured for Docker networking (using service names instead of localhost).
- First Node: Bootstraps the Raft cluster
- Additional Nodes: Start and wait to be added to the cluster
- Leader Election: Raft automatically elects a leader
- Consensus: All writes go through the leader and are replicated to followers
- Reads: Can be performed on any node (eventually consistent)
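Because every write has to be replicated through the leader, a Raft cluster only makes progress while a majority of its nodes is reachable. The sketch below works out that majority for a few cluster sizes, including the five-node setup used here. The function names are illustrative and not part of GenomDB.

```go
package main

import "fmt"

// quorum returns the number of nodes that must acknowledge a write
// (a strict majority of the cluster).
func quorum(n int) int {
	return n/2 + 1
}

// faultTolerance returns how many nodes can fail while a quorum remains.
func faultTolerance(n int) int {
	return n - quorum(n)
}

func main() {
	for _, n := range []int{1, 3, 5} {
		fmt.Printf("nodes=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```

For five nodes this gives a quorum of 3, so the cluster keeps accepting writes with up to two nodes down.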
- Raft logs and snapshots are stored in data/<node_id>/
- Key-value data is stored in the FSM (in-memory, persisted via snapshots)
- BoltDB database files store additional metadata
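To illustrate what "in-memory, persisted via snapshots" means for the key-value data, here is a minimal standalone sketch of such a state machine. It does not use any Raft library and is not GenomDB's actual FSM: the command format, the JSON snapshot encoding, and all names are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// command is a hypothetical log entry format for this sketch.
type command struct {
	Op    string `json:"op"`
	Key   string `json:"key"`
	Value string `json:"value,omitempty"`
}

// kvFSM keeps all key-value data in memory.
type kvFSM struct {
	mu   sync.RWMutex
	data map[string]string
}

func newKVFSM() *kvFSM {
	return &kvFSM{data: make(map[string]string)}
}

// Apply executes one committed log entry against the in-memory map.
func (f *kvFSM) Apply(entry []byte) error {
	var c command
	if err := json.Unmarshal(entry, &c); err != nil {
		return err
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	switch c.Op {
	case "put":
		f.data[c.Key] = c.Value
	case "delete":
		delete(f.data, c.Key)
	default:
		return fmt.Errorf("unknown op %q", c.Op)
	}
	return nil
}

// Get reads a key from the in-memory map.
func (f *kvFSM) Get(key string) (string, bool) {
	f.mu.RLock()
	defer f.mu.RUnlock()
	v, ok := f.data[key]
	return v, ok
}

// Snapshot serializes the whole map so it can be written to disk.
func (f *kvFSM) Snapshot() ([]byte, error) {
	f.mu.RLock()
	defer f.mu.RUnlock()
	return json.Marshal(f.data)
}

// Restore replaces the in-memory state with a previously taken snapshot.
func (f *kvFSM) Restore(snap []byte) error {
	data := make(map[string]string)
	if err := json.Unmarshal(snap, &data); err != nil {
		return err
	}
	f.mu.Lock()
	f.data = data
	f.mu.Unlock()
	return nil
}

func main() {
	a := newKVFSM()
	if err := a.Apply([]byte(`{"op":"put","key":"read1","value":"ACGT"}`)); err != nil {
		panic(err)
	}
	snap, err := a.Snapshot()
	if err != nil {
		panic(err)
	}

	// A restarted node rebuilds its state from the snapshot.
	b := newKVFSM()
	if err := b.Restore(snap); err != nil {
		panic(err)
	}
	v, ok := b.Get("read1")
	fmt.Println(v, ok) // ACGT true
}
```

In this model, anything applied after the last snapshot is recovered by replaying the Raft log, which is why both are kept under the node's data directory.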