ra_bio

Public runtime library for microorganism risk-reference data derived from NITE/MRINDA.

ra_bio is the canonical non-MCP API for direct consumers. It ships a bundled SQLite database generated by ra_bio_scraper.

What it does

resolves microorganism names against canonical names and synonyms
supports fuzzy search for misspellings and historical names
returns aggregated organism profiles with structured law/risk annotations
keeps raw evidence under risk_annotations while also exposing grouped sections like regulations and biosafety
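
To make the `limit` and `min_score` search parameters concrete, here is a minimal sketch of threshold-based fuzzy ranking. It uses `difflib.SequenceMatcher` from the standard library as a stand-in; this is an assumption for illustration only, not ra_bio's actual scoring method, and `fuzzy_rank` is a hypothetical helper that is not part of the package.

```python
from difflib import SequenceMatcher


def fuzzy_rank(query, names, limit=5, min_score=0.6):
    """Rank candidate names by similarity to the query (illustrative only)."""
    scored = []
    for name in names:
        # ratio() returns a similarity in [0, 1]; compare case-insensitively.
        score = SequenceMatcher(None, query.lower(), name.lower()).ratio()
        if score >= min_score:
            scored.append((name, round(score, 3)))
    # Best match first, then truncate to the requested number of hits.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Names taken from the examples in this README.
candidates = ["Granulicatella adiacens", "Aliivibrio salmonicida", "Anaplasma bovis"]
ranked = fuzzy_rank("Granulicatella adiacen", candidates)
print(ranked[0])
```

A truncated query such as "Granulicatella adiacen" still ranks the full name first because only one character is missing.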

Install

uv add git+https://github.com/Ameyanagi/ra_bio.git

pip install "ra-bio @ git+https://github.com/Ameyanagi/ra_bio.git"

Usage

from ra_bio import get_bio_database, search_organisms, lookup_bio_profile

db = get_bio_database()

lookup = db.lookup(query="Mortierella wolfii", language="ja")
search = db.search(query="Granulicatella adiacen", limit=5, min_score=0.6)
sources = db.get_source_snapshots()

search_via_helper = search_organisms(query="Granulicatella adiacen", limit=5, min_score=0.6)
lookup_via_helper = lookup_bio_profile(query="Mortierella wolfii", language="ja")

Examples

Search a synonym or misspelling and get the current accepted name:

from ra_bio import get_bio_database

db = get_bio_database()
result = db.search(query="Mortierella wolfii", limit=3, min_score=0.6)

print(result["hits"][0]["canonical_name"])
# Actinomortierella wolfii

Look up a canonical profile with host and disease information:

from ra_bio import get_bio_database

db = get_bio_database()
profile = db.lookup(query="Vibrio salmonicida", language="ja")

print(profile["profile"]["canonical_name"])
# Aliivibrio salmonicida

print(profile["profile"]["hosts"])
# ['サケ科魚類']

print(profile["profile"]["diseases"])
# ['冷水性ビブリオ病*']

Look up law and biosafety annotations in a stable structure:

from ra_bio import get_bio_database

db = get_bio_database()
profile = db.lookup(query="Anaplasma bovis", language="ja")

print(profile["profile"]["regulations"]["cartagena"]["values"])
# ['クラス2']

print(profile["profile"]["biosafety"]["bsl_bsj"]["values"])
# ['BSL2']

Inspect the bundled source-update metadata:

from ra_bio import get_bio_database

db = get_bio_database()
for row in db.get_source_snapshots():
    print(row["dataset_id"], row["source_filename"], row["source_version"], row["fetched_at"])

Example search hit (one entry of result["hits"]):

{
  "cluster_id": "ORG-000093",
  "canonical_name": "Actinomortierella wolfii",
  "preferred_scientific_name": "Actinomortierella wolfii",
  "datasets": ["fungi"],
  "score": 1.0,
  "match_type": "exact_raw",
  "matched_value": "Mortierella wolfii",
  "match_sources": ["canonical_name", "scientific_name", "synonym:m"],
  "regulation_keys": [],
  "biosafety_keys": ["trba"]
}
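
A hit like the one above records both the value that matched and the canonical name, so a caller can tell when a query resolved through a synonym. The following is a small sketch that works on the plain dictionary shown above; `describe_hit` is a hypothetical helper, not part of ra_bio, and it assumes the field names in the sample are present in every hit.

```python
def describe_hit(hit: dict) -> str:
    """Summarize one search hit, noting when the match came via another name."""
    canonical = hit["canonical_name"]
    matched = hit.get("matched_value", canonical)
    score = hit.get("score", 0.0)
    if matched != canonical:
        # The query matched a synonym or historical name.
        match_type = hit.get("match_type", "unknown")
        return f"{matched} -> {canonical} (score {score:.2f}, {match_type})"
    return f"{canonical} (score {score:.2f})"


# Subset of the sample hit from this README.
sample_hit = {
    "cluster_id": "ORG-000093",
    "canonical_name": "Actinomortierella wolfii",
    "score": 1.0,
    "match_type": "exact_raw",
    "matched_value": "Mortierella wolfii",
}

print(describe_hit(sample_hit))
# Mortierella wolfii -> Actinomortierella wolfii (score 1.00, exact_raw)
```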

Example lookup summary:

{
  "matched": true,
  "profile": {
    "canonical_name": "Granulicatella adiacens",
    "scientific_names": [
      "Abiotrophia adiacens",
      "Granulicatella adiacens",
      "Streptococcus adjacens"
    ],
    "datasets": ["bacteria"],
    "regulations": {
      "cartagena": {
        "values": ["クラス2"]
      }
    },
    "biosafety": {
      "bsl_bsj": {
        "values": ["BSL1*"]
      },
      "trba": {
        "values": ["2"]
      }
    }
  }
}
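
The grouped sections share one shape: each key maps to an object with a `values` list. A defensive reader for that shape might look like the sketch below. `collect_values` is a hypothetical helper, and the handling of an unmatched lookup (a payload with `"matched": false` and no profile) is an assumption, since only a matched payload is shown here.

```python
def collect_values(payload: dict, section: str) -> dict[str, list[str]]:
    """Map each key in a grouped section (e.g. "regulations") to its values."""
    if not payload.get("matched"):
        # Assumption: unmatched lookups carry no usable profile.
        return {}
    grouped = payload.get("profile", {}).get(section, {})
    return {key: list(entry.get("values", [])) for key, entry in grouped.items()}


# Subset of the sample lookup summary from this README.
sample_lookup = {
    "matched": True,
    "profile": {
        "canonical_name": "Granulicatella adiacens",
        "regulations": {"cartagena": {"values": ["クラス2"]}},
        "biosafety": {
            "bsl_bsj": {"values": ["BSL1*"]},
            "trba": {"values": ["2"]},
        },
    },
}

print(collect_values(sample_lookup, "biosafety"))
# {'bsl_bsj': ['BSL1*'], 'trba': ['2']}
```

Returning an empty dictionary for missing sections lets callers iterate without checking for key errors first.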

Example source snapshot metadata:

[
  {
    "dataset_id": "bacteria",
    "source_filename": "risk_bacteria_20260120.csv",
    "source_version": "20260120"
  },
  {
    "dataset_id": "bacteria_fish",
    "source_filename": "risk_bacteria_fish_20240924.csv",
    "source_version": "20240924"
  },
  {
    "dataset_id": "fungi",
    "source_filename": "risk_fungi.xlsx",
    "source_version": "20260120"
  }
]
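
Snapshot rows like these can be indexed by dataset to check which source is oldest. The sketch below works on the sample rows only. Both helpers are hypothetical, and the comparison assumes `source_version` is always a `YYYYMMDD` string, as it is in the sample.

```python
def versions_by_dataset(snapshots):
    """Index snapshot rows as {dataset_id: source_version}."""
    return {row["dataset_id"]: row["source_version"] for row in snapshots}


def oldest_dataset(snapshots):
    """Return the dataset_id with the earliest source_version.

    Assumes YYYYMMDD strings, so string order matches date order.
    """
    return min(snapshots, key=lambda row: row["source_version"])["dataset_id"]


# Sample rows from this README.
sample_snapshots = [
    {"dataset_id": "bacteria", "source_filename": "risk_bacteria_20260120.csv", "source_version": "20260120"},
    {"dataset_id": "bacteria_fish", "source_filename": "risk_bacteria_fish_20240924.csv", "source_version": "20240924"},
    {"dataset_id": "fungi", "source_filename": "risk_fungi.xlsx", "source_version": "20260120"},
]

print(versions_by_dataset(sample_snapshots)["fungi"])
# 20260120
print(oldest_dataset(sample_snapshots))
# bacteria_fish
```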

Public runtime API:

get_bio_database(db_path: str | None = None)
search_organisms(query, mode="auto", dataset=None, limit=20, min_score=0.6, db_path=None)
lookup_bio_profile(query=None, scientific_name=None, language="ja", db_path=None)
get_bio_source_snapshots(db_path=None)
get_bio_runtime_status(db_path=None)
BioDatabase.get_source_snapshots()
BioDatabase.get_runtime_status()
BioDatabase.lookup(query=None, scientific_name=None, language="ja")
BioDatabase.search(query, mode="auto", dataset=None, limit=20, min_score=0.6)

Lookup payload highlights:

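profile["regulations"]: law and regulatory annotations, accessed through stable keys
profile["biosafety"]: biosafety annotations such as BSL and TRBA
profile["designations"]: designation categories such as fish pathogens, plant pathogens, and indoor-environment organisms
profile["pathogen_profiles"]: host and disease profiles derived from the fish-disease dataset
profile["risk_annotations"]: evidence kept close to the original source data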

Default behavior:

if db_path is omitted, ra_bio uses the packaged bundled SQLite database
db_path may point to:
- a direct SQLite file
- a checked-out ra_bio directory containing bio.sqlite3
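
The resolution rules above can be sketched as follows. This is a guess at the behavior described, not the package's implementation: `resolve_db_path` and `BUNDLED` are hypothetical names, and the bundled location simply reuses the packaged path named in this README.

```python
import tempfile
from pathlib import Path

# Packaged path as named in this README; used here only as a placeholder.
BUNDLED = Path("src/ra_bio/data/bio.sqlite3")


def resolve_db_path(db_path, bundled=BUNDLED):
    """Pick the SQLite file to open, following the documented defaults."""
    if db_path is None:
        # No override: fall back to the packaged database.
        return Path(bundled)
    candidate = Path(db_path)
    if candidate.is_dir():
        # A checked-out ra_bio directory: look for bio.sqlite3 inside it.
        return candidate / "bio.sqlite3"
    # Otherwise treat the value as a direct SQLite file path.
    return candidate


with tempfile.TemporaryDirectory() as checkout:
    print(resolve_db_path(checkout).name)
    # bio.sqlite3
```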

Runtime artifact

The canonical runtime artifact is the bundled SQLite database:

packaged path: src/ra_bio/data/bio.sqlite3
published repo artifact: bio.sqlite3

Installed consumers should normally rely on the packaged database. The public repo intentionally stays small: raw CSV/HTML retention is handled by ra_bio_scraper, not by ra_bio.

Data sources

The current dataset is derived from these NITE/MRINDA downloads:

https://www.nite.go.jp/mrinda/list/risk/download/bacteria
https://www.nite.go.jp/mrinda/list/risk/download/bacteria_fish
https://www.nite.go.jp/mrinda/list/risk/download/fungi

Source update metadata

Source update information is part of the public runtime data.

the SQLite bundle stores source_filename, source_version, fetched_at, and content_hash
the public repo keeps parsed/source_snapshots.jsonl
consumers can inspect the same information via BioDatabase.get_source_snapshots()
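
Because the same metadata is available both as JSONL and through the database, the two can be cross-checked. The sketch below assumes each line of parsed/source_snapshots.jsonl is one JSON object with the fields listed above, which this README does not spell out. Both helpers are hypothetical, and the sample lines are built from values shown elsewhere in this README.

```python
import json


def parse_snapshot_lines(lines):
    """Parse JSONL text lines into dictionaries, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def mismatched_datasets(file_rows, db_rows, field="source_version"):
    """List dataset_ids whose `field` differs between the two row sets."""
    db_index = {row["dataset_id"]: row.get(field) for row in db_rows}
    return sorted(
        row["dataset_id"]
        for row in file_rows
        if db_index.get(row["dataset_id"]) != row.get(field)
    )


# Stand-in for lines read from the JSONL file (format assumed).
jsonl_lines = [
    '{"dataset_id": "bacteria", "source_filename": "risk_bacteria_20260120.csv", "source_version": "20260120"}',
    "",
    '{"dataset_id": "fungi", "source_filename": "risk_fungi.xlsx", "source_version": "20260120"}',
]

file_rows = parse_snapshot_lines(jsonl_lines)
# With a real database, db_rows would come from BioDatabase.get_source_snapshots().
print(mismatched_datasets(file_rows, file_rows))
# []
```

Passing `field="content_hash"` would compare hashes instead, provided both sources expose that field.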

Detailed raw CSV files, extracted CSVs, and HTML snapshots are retained in ra_bio_scraper.
