DBCLS · Life Sciences

TogoMCP

Model Context Protocol Server for Life Sciences Research

MCP Server Endpoint
https://togomcp.rdfportal.org/mcp

Summary

TogoMCP is a Model Context Protocol (MCP) server developed by DBCLS that gives LLM agents access to more than 20 major biological and biomedical databases through a single endpoint. It lets AI assistants such as Claude, ChatGPT, and Gemini help researchers query, explore, and integrate complex biological data using natural language.

Through SPARQL queries, RDF data exploration, and ID conversion services, TogoMCP bridges the gap between AI assistants and the rich data landscape of life sciences. Whether you're a biologist exploring disease–protein associations, a chemist searching for drug candidates, or a data scientist integrating information across multiple domains, TogoMCP provides a powerful toolkit for knowledge discovery.

🔬 Multi-Database Access

Query proteins, genes, chemicals, diseases, pathways, and more across 20+ integrated databases including UniProt, PubChem, ChEMBL, PDB, Reactome, ClinVar, and others — all through a single MCP endpoint.

🌐 SPARQL & RDF-Based

Built on Semantic Web technologies. TogoMCP exposes SPARQL endpoints from the RDF Portal, enabling precise, structured queries with rich cross-references between datasets.

🔗 ID Conversion

Powered by TogoID, the server converts identifiers across 65+ biological databases — including cross-category conversions (e.g., disease IDs → gene IDs) with semantic relationship annotations.

🤖 AI-Ready

Designed for integration with LLM-based assistants. Compatible with Claude Desktop, ChatGPT, and Gemini CLI. No bioinformatics expertise required — use natural language to explore data.

Preprint

TogoMCP is described in the following preprint. Please cite it if you use TogoMCP in your research.

Kinjo, A. R., Yamamoto, Y., Bustamante-Larriet, S., Labra-Gayo, J.-E., & Fujisawa, T. (2026). TogoMCP: Natural Language Querying of Life-Science Knowledge Graphs via Schema-Guided LLMs and the Model Context Protocol. bioRxiv. https://doi.org/10.64898/2026.03.19.713030

Usage Examples

The following examples illustrate how AI assistants powered by TogoMCP can tackle complex life sciences research questions by orchestrating queries across multiple databases.

1. CVD Protein Targets & Small Molecule Inhibitors

Prompt: Find proteins that are associated with both cardiovascular diseases and have known small molecule inhibitors in ChEMBL, and classify them according to disease and drug availability.

Response

The system loaded the ChEMBL schema via get_MIE_file, ran parallel target searches across CVD areas (RAAS, coagulation, lipid pathways, ion channels), then issued comprehensive SPARQL queries against the EBI endpoint to retrieve approved (Phase 4) inhibitors anchored on MeSH disease IRIs (cco:hasMesh) and human single-protein targets. It used TogoID to bridge ChEMBL target IDs to UniProt accessions and compiled an interactive classification dashboard.

Tools Used

TogoMCP_Usage_Guide get_MIE_file search_chembl_target run_sparql togoid_convertId

Key Results

  • 18 human protein targets confirmed with approved inhibitors across 6 CVD disease categories (dyslipidemia, hypertension, heart failure, thrombosis, stroke, arrhythmia)
  • ACE (P12821) has the largest number of distinct approved drug molecules — 12 ACE inhibitors
  • SLC12A3/NCC (P55017) has the broadest chemical diversity — 8+ thiazide diuretics
  • HMGCR (P04035) — the highest-profile target; statins remain the most prescribed CV drugs globally
  • SGLT2 (P31639) and Neprilysin (P08473) represent the most impactful recent heart-failure approvals
  • Factor Xa (P00742) inhibitors have largely supplanted warfarin (VKORC1) for AF-related stroke prevention
  • PCSK9 (Q8NBP7) and ANGPTL3 (Q9Y5C1) are the newest lipid-lowering targets, using mAb / siRNA mechanisms
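
The classification step in this example can be sketched in a few lines of Python. This is a minimal illustration, not the code the assistant actually ran: the row layout (UniProt accession, disease category, drug name) and the function names `classify_targets` and `rank_by_drug_count` are assumptions, and the sample rows are hand-picked textbook pairs rather than query output.

```python
from collections import defaultdict

# Illustrative rows only: (UniProt accession, disease category, approved drug).
# In a real session these would come from the run_sparql result after
# togoid_convertId has mapped ChEMBL target IDs to UniProt accessions.
ROWS = [
    ("P12821", "hypertension", "captopril"),
    ("P12821", "hypertension", "enalapril"),
    ("P12821", "heart failure", "enalapril"),
    ("P04035", "dyslipidemia", "atorvastatin"),
]


def classify_targets(rows):
    """Group rows by target, collecting distinct diseases and drugs."""
    summary = defaultdict(lambda: {"diseases": set(), "drugs": set()})
    for accession, disease, drug in rows:
        summary[accession]["diseases"].add(disease)
        summary[accession]["drugs"].add(drug)
    return dict(summary)


def rank_by_drug_count(summary):
    """Return accessions ordered by number of distinct approved drugs."""
    return sorted(summary, key=lambda acc: len(summary[acc]["drugs"]), reverse=True)


summary = classify_targets(ROWS)
for acc in rank_by_drug_count(summary):
    entry = summary[acc]
    print(acc, sorted(entry["diseases"]), len(entry["drugs"]))
```

Using sets keeps the counts in terms of distinct drug molecules per target, which matches how the key results above are phrased.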

2. Human Enzyme Distribution by EC Class

Prompt: Determine the distribution of human enzymes by their EC number classes.

Response

The system loaded the UniProt MIE to confirm the up:enzyme predicate and the mandatory up:reviewed 1 filter, then ran a single aggregating SPARQL query against the SIB endpoint, anchored on up:organism <…/taxonomy/9606>. EC class was extracted from the enzyme IRI by stripping the prefix and taking the first digit before the dot, giving counts of unique protein–EC-class associations across all seven EC classes.

Tools Used

get_MIE_file run_sparql

Key Results

  • 4,652 unique human protein–EC-class associations from UniProt Swiss-Prot (counts reflect associations, not unique proteins — bifunctional enzymes carry >1 EC number)
  • EC 2 – Transferases: 1,832 (39.4%) — kinases, methyltransferases, glycosyltransferases, acetyltransferases
  • EC 3 – Hydrolases: 1,732 (37.2%) — proteases, phosphatases, lipases, nucleases
  • EC 1 – Oxidoreductases: 546 (11.7%) — cytochrome P450s, dehydrogenases, peroxidases
  • EC 4 Lyases (155, 3.3%); EC 5 Isomerases (160, 3.4%); EC 6 Ligases (124, 2.7%); EC 7 Translocases (103, 2.2%)
  • EC 2 + EC 3 ≈ 77% — characteristic metazoan signature, reflecting the dominance of phosphorylation and proteolytic processing in human signalling
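
The EC-class extraction and the percentages above can be reproduced with a short Python sketch. The query text is an approximation of the kind of aggregating query described, not the exact query that was run: the enzyme IRI prefix and the boolean form of the `up:reviewed` filter are assumptions to check against the UniProt MIE file. The counts are copied from the key results.

```python
ENZYME_PREFIX = "http://purl.uniprot.org/enzyme/"

# Sketch of an aggregating query. The source describes the filter as
# `up:reviewed 1`; `true` is used here as an assumed equivalent boolean form.
QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT ?ecClass (COUNT(DISTINCT ?protein) AS ?n)
WHERE {
  ?protein a up:Protein ;
           up:reviewed true ;
           up:organism taxon:9606 ;
           up:enzyme ?enzyme .
  BIND(SUBSTR(STRAFTER(STR(?enzyme), "http://purl.uniprot.org/enzyme/"), 1, 1) AS ?ecClass)
}
GROUP BY ?ecClass
ORDER BY ?ecClass
"""


def ec_class_from_iri(iri):
    """Return the top-level EC class as an int, or None if it cannot be parsed."""
    if not iri.startswith(ENZYME_PREFIX):
        return None
    ec_number = iri[len(ENZYME_PREFIX):]   # e.g. "2.7.11.1" or "3.4.-.-"
    first = ec_number.split(".", 1)[0]     # text before the first dot
    return int(first) if first.isdigit() else None


# Association counts per EC class, taken from the key results above.
COUNTS = {1: 546, 2: 1832, 3: 1732, 4: 155, 5: 160, 6: 124, 7: 103}


def shares(counts):
    """Percentage of all associations that fall in each EC class."""
    total = sum(counts.values())
    return {ec: round(100 * n / total, 1) for ec, n in counts.items()}


print(sum(COUNTS.values()))
print(shares(COUNTS))
```

Because the query counts distinct proteins within each class, a bifunctional enzyme with EC numbers in two classes contributes to both groups, which is why the total is a number of associations rather than a number of proteins.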

3. Comparative Chemical Analysis: Natural Products vs Synthetic Compounds

Prompt: Use ChEBI (chemical classification) + ChEMBL (bioactivity) + PubChem (chemical descriptors) to compare the bioactivity profiles and chemical diversity of natural products versus synthetic compounds across different therapeutic areas.

Response

The system loaded the MIEs for ChEBI, ChEMBL, and PubChem, then ran five SPARQL queries: ChEBI biological roles under the CHEBI:33245 natural-product hierarchy via OWL restrictions; ChEMBL approved-drug counts grouped by ATC therapeutic area; PubChem MW distribution for FDA-approved drugs (via the sio:SIO_000008 hub-and-spoke pattern); a cross-graph ChEMBL × ChEBI join confirming NP-derived approved drugs; and a NP pharmacological-role query with mass statistics. Results were synthesised into an interactive dashboard.

Tools Used

TogoMCP_Usage_Guide get_MIE_file run_sparql

Key Results

  • ATC distribution of approved drugs in ChEMBL: N (Nervous, 320) > L (Antineoplastic, 236) > J (Anti-infectives, 225) > A (Alimentary/Metabolism, 219) > C (Cardiovascular, 211)
  • Estimated NP-derived fraction by therapeutic area: J ~65% (β-lactams, macrolides), P ~55% (artemisinin, avermectin), L ~48% (taxol, vincristine), C ~25% (digoxin, lovastatin), N ~5% (BBB constraints rule out high-MW / high-TPSA NPs)
  • Physicochemical fingerprint — NP vs synthetic: MW ~580 vs ~320 Da; stereocenters ~6.2 vs 1.8; TPSA ~148 vs 82 Ų; XLogP3 ~1.8 vs 3.2 — NPs occupy a higher-MW, more stereorich, more polar region of chemical space
  • FDA-approved drug MW distribution (PubChem): 300–500 Da bucket dominates (2,672 drugs); the >500 Da tail (2,081 drugs) is disproportionately populated by NP-derived macrolides, terpenoids, and polyketides
  • Cross-database coverage gap: only ~35K of ChEMBL's 1.9M molecules carry a ChEBI xref, so the RDF-confirmed NP count is a strict lower bound — literature estimates fill the gap
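
To put the physicochemical contrast and the coverage gap into numbers, the following sketch works through the arithmetic. All values are the approximate figures quoted in the key results above; the dictionary keys and the helper name `fold_differences` are illustrative choices, not part of TogoMCP.

```python
# Approximate values copied from the key results above.
NP = {"mw_da": 580, "stereocenters": 6.2, "tpsa_a2": 148, "xlogp3": 1.8}
SYNTHETIC = {"mw_da": 320, "stereocenters": 1.8, "tpsa_a2": 82, "xlogp3": 3.2}


def fold_differences(a, b):
    """Ratio a/b for each shared descriptor, rounded to one decimal place."""
    return {key: round(a[key] / b[key], 1) for key in a}


# Cross-database coverage: ChEMBL molecules that carry a ChEBI cross-reference.
CHEMBL_MOLECULES = 1_900_000
WITH_CHEBI_XREF = 35_000
coverage_pct = round(100 * WITH_CHEBI_XREF / CHEMBL_MOLECULES, 1)

print(fold_differences(NP, SYNTHETIC))
print(f"ChEBI xref coverage: {coverage_pct}%")
```

A coverage of under 2% is what makes the RDF-confirmed natural-product count a lower bound rather than an estimate.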

Setup Guide

Connect TogoMCP to your AI assistant in minutes. Choose your platform below.

Method 1: Custom Connectors (Recommended — Claude Pro, Team, or Enterprise)

  1. Open Claude and navigate to Settings → Connectors.
  2. Click "Add custom connector" at the bottom of the page.
  3. Enter the MCP server URL: https://togomcp.rdfportal.org/mcp
  4. Click "Add" to complete the setup.
  5. In your chat interface, click the "Search and tools" button to enable the connector.

Method 2: JSON Configuration (Alternative — Claude Desktop App)

  1. Navigate to Settings → Developer → Edit Config. This opens claude_desktop_config.json.
  2. Add the following configuration:
{
  "mcpServers": {
    "togomcp": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"]
    }
  }
}
  3. Save the file and restart Claude Desktop.
ℹ️ For more details, see the Claude custom connectors documentation.
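
If claude_desktop_config.json already lists other MCP servers, the togomcp entry should be merged in rather than pasted over the file. The sketch below shows one way to do that in Python; the helper name `add_mcp_server` and the sample "other" server are hypothetical, and the function works on JSON text so you can decide yourself where the file is read from and written to.

```python
import json

# Entry taken from the configuration shown in Method 2.
TOGOMCP_ENTRY = {
    "command": "npx",
    "args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"],
}


def add_mcp_server(config_text, name, entry):
    """Return updated JSON text with `entry` registered under mcpServers[name].

    Existing servers and unrelated top-level keys are left untouched.
    An empty file is treated as an empty configuration.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = entry
    return json.dumps(config, indent=2)


# Hypothetical existing configuration with one other server already defined.
existing = '{"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}'
updated = add_mcp_server(existing, "togomcp", TOGOMCP_ENTRY)
print(updated)
```

Write the returned text back to the config file and restart Claude Desktop, as in the last step of Method 2.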

ChatGPT — Plus, Pro, Business, or Enterprise Plans Required

Availability of MCP connectors in ChatGPT depends on your plan. The workspace steps below are written for Business/Enterprise workspaces.

  1. Go to Workspace Settings → Permissions & Roles and enable "Developer mode / Create custom MCP connectors".
  2. Navigate to Settings → Connectors → Create.
  3. Enter the following details:
    • Connector name: TogoMCP
    • MCP Server URL: https://togomcp.rdfportal.org/mcp
    • Description: Access to life sciences databases via TogoMCP
  4. Click "Create" and authorize the connector.
  5. In your chat, use the "+" menu to select "Developer Mode" and enable TogoMCP.
ℹ️ Developer mode requires confirmation for write actions and shows tool usage explicitly. See the ChatGPT MCP documentation for more details.

Gemini CLI — Streamable HTTP Configuration

TogoMCP uses Streamable HTTP transport (not SSE). Configure your settings.json as follows:

{
  "mcpServers": {
    "togomcp": {
      "httpUrl": "https://togomcp.rdfportal.org/mcp"
    }
  }
}

Save the file — Gemini CLI will automatically detect the new MCP server on next launch.

ℹ️ For full setup instructions, see the Gemini CLI MCP documentation.
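
For readers wondering what Streamable HTTP means in practice: the client sends JSON-RPC messages as HTTP POST requests to the single /mcp URL and declares that it accepts either a plain JSON response or an event stream. The Python sketch below only builds such a request and does not send it. The protocol version string and the client name are assumptions; a real client would negotiate the version with the server.

```python
import json
import urllib.request

ENDPOINT = "https://togomcp.rdfportal.org/mcp"


def build_initialize_request(endpoint=ENDPOINT):
    """Build (but do not send) an MCP `initialize` request over Streamable HTTP."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",   # assumed; negotiated in practice
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The client must be prepared for a JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


request = build_initialize_request()
print(request.get_method(), request.full_url)
```

In day-to-day use Gemini CLI handles this exchange for you; the sketch is only meant to show why the setting is `httpUrl` rather than an SSE URL.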

Available Databases

TogoMCP integrates over 20 major life sciences databases, covering proteins, genes, chemicals, diseases, pathways, taxonomy, and more.

UniProt
Comprehensive protein sequence and functional information. 444M+ proteins with curated Swiss-Prot entries, cross-references to 200+ databases, sequences, domains, and functions.
PubChem
Public chemical database with 119M compounds, 1.7M bioassays, molecular descriptors, bioactivity data, and links to genes, proteins, pathways, and diseases.
ChEMBL
Manually curated database of bioactive molecules with 2.4M+ compounds, 20M bioactivity measurements, and compound–target–activity data for drug discovery.
PDB (Protein Data Bank)
3D structural data for 204K+ protein and nucleic acid structures from X-ray crystallography, NMR, and cryo-EM. Linked to UniProt and EMDB.
jPOST
Curated proteomics repository of reanalysed mass-spectrometry submissions: 598 projects bundling 1M+ identified Proteins (linked to UniProt), 3.7M Peptides, 10M+ PSMs, and PTMs typed against Unimod and PSI-MS.
ChEBI
Chemical Entities of Biological Interest ontology with 217,000+ entities. Hierarchical classification of small molecules, atoms, ions, functional groups, and biological roles.
Reactome
Curated knowledgebase of 22,000+ biological pathways across 30+ species, covering molecular interactions, biochemical reactions, complexes, and disease associations.
Rhea
Expert-curated database of 17,078 atom-balanced biochemical reactions, linked to ChEBI compounds, EC numbers, and metabolic pathway databases.
BRENDA
Manually curated enzyme database covering enzymes classified by EC number across organisms, with substrates, products, inhibitors, activators, cofactors, tissue expression, and subcellular localization.
Gene Ontology (GO)
Controlled vocabulary for gene and gene product attributes across all organisms, covering biological processes, molecular functions, and cellular components.
Ensembl
Comprehensive genome annotations for 100+ species including genes, transcripts, proteins, and exons. Cross-referenced to UniProt, HGNC, and OMIM.
NCBI Gene
Gene database with 57M+ entries covering protein-coding genes, ncRNAs, and pseudogenes across all organisms, with chromosomal locations and orthology data.
HGNC
HUGO Gene Nomenclature Committee — the authoritative source for approved human gene symbols and names, with chromosomal locations and cross-references to NCBI Gene, Ensembl, UniProt, OMIM, and dozens more.
OMA
Orthologous MAtrix — phylogenomics-based ortholog and paralog inference covering 17.4M proteins from 2,927 species, organized into Hierarchical Orthologous Groups (HOGs) for comparative genomics and gene family analysis.
Bgee
Curated gene expression database integrating RNA-Seq, Affymetrix, EST, and in situ hybridization data across animal species, with calls tagged by anatomy (UBERON), developmental stage, sex, and strain for cross-species expression comparison.
ClinVar
Aggregates information on genomic variation and its relationship to human health. 3.5M+ variant records with clinical interpretations, gene associations, and disease conditions.
MedGen
NCBI's portal for medical genetics with 233,000+ clinical concepts covering diseases, phenotypes, and clinical findings. Integrates OMIM, Orphanet, HPO, and MONDO.
MONDO
Monarch Disease Ontology integrating multiple disease databases into a unified classification with cross-references to OMIM, Orphanet, DOID, MeSH, ICD, and 35+ databases.
MeSH
NLM's Medical Subject Headings — controlled vocabulary for biomedical literature indexing, with hierarchical descriptors and qualifiers across 16 main categories.
NCBI Taxonomy
Biological taxonomic classification covering 3M+ organisms from bacteria to mammals, with hierarchical relationships, scientific names, and genetic code assignments.
PubMed
Bibliographic information for biomedical literature from MEDLINE, including publication metadata, abstracts, authors, and MeSH annotations.
PubTator
Biomedical entity annotations extracted from PubMed using text mining. Disease and Gene annotations linked to articles for literature-based knowledge discovery.
BacDive
Bacterial Diversity Metadatabase with 97,000+ strain records covering taxonomy, morphology, physiology, and molecular data for bacteria and archaea.
MediaDive
Culture media database from DSMZ with 3,289 standardized recipes for bacteria, archaea, fungi, and microalgae, including ingredients and growth conditions.
NANDO
Japanese intractable (rare) disease ontology with 2,777 disease classes, multilingual labels, and cross-references to international disease ontologies.
DDBJ
DNA Data Bank of Japan: nucleotide sequences with genomic annotations, organism metadata, taxonomic classification, and functional annotations.
GlyCosmos
Glycoscience portal integrating glycan structures, glycoproteins, glycosylation sites, glycogenes, and lectin–glycan interactions for multi-species glycobiology research.
AMR Portal
Antimicrobial resistance surveillance data with 1.7M+ phenotypic susceptibility tests and 1.1M+ genotypic AMR features from bacterial isolates worldwide.

Available Tools

TogoMCP exposes a rich set of tools for searching, querying, and converting life sciences data.

📋 Database & Information
list_databases
List all available databases with descriptions and coverage.
find_databases
Token-efficient discovery — filter the database catalog by keywords and/or category.
list_categories
List database categories with their member databases — pair with find_databases(category=…).
get_MIE_file
Get the metadata file containing the ShEx schema and SPARQL examples for a specific database. Call this before writing queries.
get_sparql_endpoints
Retrieve available SPARQL endpoints from the RDF Portal.
get_graph_list
List named graphs available in a given database.
🔍 Keyword Search Tools
search_uniprot_entity
Search UniProt proteins by name, description, or disease association.
search_chembl_molecule
Search ChEMBL molecules and bioactive compounds.
search_chembl_target
Search ChEMBL protein targets and drug targets.
search_pdb_entity
Search PDBj entries including structures, chemical components, and peptides.
search_reactome_entity
Search Reactome pathways and biological reactions.
search_rhea_entity
Search Rhea biochemical reactions by keyword.
search_mesh_descriptor
Search MeSH medical concepts, descriptors, and controlled vocabulary terms.
⚡ SPARQL Query
run_sparql
Execute custom SPARQL queries on any RDF database in the portal. For best results, use get_MIE_file first to understand the database schema and available properties.
🔄 ID Conversion (TogoID)
togoid_convertId
Convert identifiers between databases (e.g., UniProt to ChEMBL, PubChem to ChEBI).
togoid_countId
Count how many IDs can be converted between two databases.
togoid_getAllDataset
Get configuration for all available datasets in TogoID.
togoid_getDataset
Get configuration details for a specific TogoID dataset.
togoid_getAllRelation
Get all possible conversion relationships between databases.
togoid_getRelation
Get relationship details between two specific databases.
🧬 NCBI E-utilities
ncbi_esearch
Search NCBI databases (Gene, Taxonomy, ClinVar, MedGen, PubMed, PubChem) using NCBI field tags.
ncbi_esummary
Get summary information for NCBI IDs retrieved via esearch.
ncbi_efetch
Fetch full records (sequences, data, etc.) for NCBI IDs.
ncbi_list_databases
List all supported NCBI databases with descriptions and example queries.
🧪 PubChem-specific
get_pubchem_compound_id
Look up a PubChem compound ID from a compound name.
get_compound_attributes_from_pubchem
Retrieve detailed compound attributes and molecular descriptors from PubChem RDF.
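
Under the hood, an MCP client invokes each of these tools with a JSON-RPC `tools/call` message. The sketch below builds the two messages for the recommended workflow: fetch the schema with get_MIE_file, then query with run_sparql. The envelope follows the MCP specification, but the argument names `dbname` and `sparql_query` are assumptions made for illustration; the real parameter names are reported by the server in its `tools/list` response.

```python
import json
from itertools import count

_ids = count(1)  # simple generator of request IDs


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` message for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: load the schema and example queries for the target database.
schema_request = tool_call("get_MIE_file", {"dbname": "uniprot"})

# Step 2: run a query written with that schema in mind.
# Argument names here are hypothetical; check tools/list for the real ones.
query_request = tool_call(
    "run_sparql",
    {"dbname": "uniprot", "sparql_query": "SELECT * WHERE { ?s ?p ?o } LIMIT 5"},
)

print(json.dumps(schema_request, indent=2))
print(json.dumps(query_request, indent=2))
```

AI assistants generate these messages automatically, so you normally only see the tool names listed in the "Tools Used" rows of the usage examples.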

Other MCP Servers

TogoMCP works well alongside these complementary MCP servers for richer research workflows.

PubDictionaries MCP Server

PubDictionaries provides text annotation services for biomedical literature, helping identify and map biological entities such as genes, proteins, diseases, and chemicals in text.

Use Case: Annotate research papers to extract entity mentions, then use TogoMCP to retrieve comprehensive data about those entities from UniProt, ChEMBL, or other databases.

PubMed MCP Server

The PubMed MCP server provides access to the world's largest biomedical literature database, enabling article search, metadata retrieval, and full-text access from PubMed Central.

Use Case: Search PubMed for relevant literature, then use TogoMCP to retrieve detailed molecular and pathway information about entities mentioned in those papers.

OLS4 MCP Server

The Ontology Lookup Service (OLS4) from EMBL-EBI provides access to biomedical ontologies, enabling standardization of terminology and exploration of hierarchical relationships between biological concepts.

Use Case: Explore ontology terms and their relationships in OLS4, then query TogoMCP using standardized ontology identifiers for precise data retrieval.

Related Resources

Explore the ecosystem of tools and organizations behind TogoMCP.

RDF Portal

https://rdfportal.org

The NBDC RDF Portal is a comprehensive repository of semantic life sciences data developed by DBCLS and NBDC. It hosts 21+ RDF datasets comprising over 45.5 billion triples, all quality-reviewed for interoperability and SPARQL queryability. TogoMCP's SPARQL queries run against this portal's unified endpoint.

TogoID

https://togoid.dbcls.jp

TogoID is an identifier conversion service by DBCLS that bridges 65+ life science databases. Unlike traditional converters, it supports cross-category conversions (e.g., disease IDs → gene IDs) with semantic relationship annotations. TogoMCP's ID conversion tools are powered by TogoID's API.

DBCLS

https://dbcls.rois.ac.jp/en/

The Database Center for Life Science (DBCLS) is a Japanese research institute under ROIS, founded in 2007. It conducts research on database integration, Semantic Web technologies, and bioinformatics resources. DBCLS organizes the annual BioHackathon and monthly SPARQLthon events, and develops tools like TogoID, TogoTV, and TogoMCP.

Source Code

dbcls/togomcp

The TogoMCP source code is open and available on GitHub. Contributions, bug reports, and feature requests are welcome.

https://github.com/dbcls/togomcp