This repository downloads, converts, and loads metadata from DBNomics into the graph database, and links it to public knowledge graphs.
NOTE: The scripts and tools in this repo still contain many hard-coded directories. These will eventually be made configurable, but until then the repo may be difficult to use without modification. Search for `/mnt/data` to find the paths you may need to replace.
The `docker-compose.yml` file defines all of the services required to serve the Loqu backend: Virtuoso, LIMES, Typesense, the Loqu API container (which you should build from that repository), and a simple reverse proxy that load-balances requests to the Search and API services based on the Host header.
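As a rough illustration of Host-header routing, the sketch below sends a request to the proxy with an explicit Host header and prints the HTTP status that comes back. The proxy address and the hostnames `api.example.test` and `search.example.test` are placeholders, not values from this repository, and the sketch assumes `curl` is installed. Setting `CURL=echo` prints the command instead of running it.

```shell
# Hypothetical default: replace with the address your reverse proxy listens on.
PROXY_URL="${PROXY_URL:-http://localhost:80}"

# Print the HTTP status code the proxy returns for the given Host header.
# CURL can be overridden (e.g. CURL=echo) to inspect the command without running it.
probe_host() {
    ${CURL:-curl} -sS -o /dev/null -w '%{http_code}\n' -H "Host: $1" "$PROXY_URL/"
}

# Example calls (placeholder hostnames):
#   probe_host api.example.test
#   probe_host search.example.test
```

If the two hostnames return different responses from the same proxy address, the Host-based routing is taking effect.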
- Docker
- Docker Compose
- httpie, for configuring Typesense
- Rclone, for downloading or uploading transformed data (TODO: maybe replace with gsutil)
- Go and NodeJS, for the transformation scripts (TODO: think about containerizing)
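A quick way to check that the tools listed above are installed is sketched below. The helper name `missing_tools` is made up for this example. The binary names are the usual ones (`http` for httpie, `node` for NodeJS) and may differ on your system.

```shell
# Print the name of every given tool that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call:
#   missing_tools docker docker-compose http rclone go node
```

An empty result means every listed tool was found.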
Follow the steps below to get the source data and create a Loqu backend:
- Download and transform the source data from DBNomics
  - Download the data with `./scraper/main.go` as `.json` files
  - Run `node ./converter/run.mjs` to convert to JSON-LD
  - Run `go run ./jsonld2nt/main.go` to convert to N-Triples
  - (optional) Validate a subset of your data using the Data Cube integrity constraints with `run.sh` in `integrity-constraints`
- Run `docker-compose up` to start the backend. Make sure you have a `.env` file that sets the required variables (e.g. the location of the data from the previous step)
- Load data into the search engine
  - `./typesense/dataset-loader/schemas/mk_collections.sh`
  - `DBNOMICS_DATA_DIR=<your path> go run ./typesense/dataset-loader/bin/main.go`
- Load data into the database
  - Docker exec into the Virtuoso database and run `/scripts/load.sh`
- Create links between concepts
  - Write your LIMES configs in the `limes` folder, then submit them with `submit.sh`
  - TODO: store these in a better place (e.g. cloud object storage)
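The main steps above could be chained in one script, roughly as sketched below. Several details are assumptions rather than facts from this repository: that the scraper is started with `go run`, that the Virtuoso service is named `virtuoso` in `docker-compose.yml`, and that `/mnt/data` is a sensible default data directory. The `-d` flag is added so that `docker-compose up` returns control to the script. The optional validation step and the LIMES step are left out. By default the script only prints each command; set `RUN=1` to execute them.

```shell
# Assumed defaults: adjust both to match your setup.
DATA_DIR="${DBNOMICS_DATA_DIR:-/mnt/data}"
VIRTUOSO_SERVICE="${VIRTUOSO_SERVICE:-virtuoso}"   # hypothetical service name

# Print a command; execute it only when RUN=1.
step() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

# Run the steps in order, stopping at the first failure.
pipeline() {
    step go run ./scraper/main.go &&   # assumption: the scraper is invoked via "go run"
    step node ./converter/run.mjs &&
    step go run ./jsonld2nt/main.go &&
    step docker-compose up -d &&
    step ./typesense/dataset-loader/schemas/mk_collections.sh &&
    step env DBNOMICS_DATA_DIR="$DATA_DIR" go run ./typesense/dataset-loader/bin/main.go &&
    step docker-compose exec "$VIRTUOSO_SERVICE" /scripts/load.sh
}

# Example calls:
#   pipeline          # dry run, prints the commands
#   RUN=1 pipeline    # executes them
```

Running the dry run first is a cheap way to confirm the paths and the service name before anything touches the database.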