TogoMCP – Model Context Protocol Server for Life Sciences

Summary

TogoMCP is a comprehensive Model Context Protocol (MCP) server developed by DBCLS that provides LLM agents with seamless access to a vast ecosystem of life sciences databases. It integrates over 20 major biological and biomedical databases, enabling AI assistants like Claude, ChatGPT, and Gemini to help researchers query, explore, and integrate complex biological data using natural language.

Through SPARQL queries, RDF data exploration, and ID conversion services, TogoMCP bridges the gap between AI assistants and the rich data landscape of life sciences. Whether you're a biologist exploring disease–protein associations, a chemist searching for drug candidates, or a data scientist integrating information across multiple domains, TogoMCP provides a powerful toolkit for knowledge discovery.

🔬 Multi-Database Access

Query proteins, genes, chemicals, diseases, pathways, and more across 20+ integrated databases including UniProt, PubChem, ChEMBL, PDB, Reactome, ClinVar, and others — all through a single MCP endpoint.

🌐 SPARQL & RDF-Based

Built on Semantic Web technologies. TogoMCP exposes SPARQL endpoints from the RDF Portal, enabling precise, structured queries with rich cross-references between datasets.

🔗 ID Conversion

Powered by TogoID, the server converts identifiers across 65+ biological databases — including cross-category conversions (e.g., disease IDs → gene IDs) with semantic relationship annotations.

🤖 AI-Ready

Designed for integration with LLM-based assistants. Compatible with Claude Desktop, ChatGPT, and Gemini CLI. No bioinformatics expertise required — use natural language to explore data.

Usage Examples

The following examples illustrate how AI assistants powered by TogoMCP can tackle complex life sciences research questions by orchestrating queries across multiple databases.

CVD Protein Targets & Small Molecule Inhibitors

Prompt: Find proteins that are associated with both cardiovascular diseases and have known small molecule inhibitors in ChEMBL, and classify them according to disease and drug availability.

Response

The system loaded the ChEMBL schema via get_MIE_file, ran parallel target searches across CVD areas (RAAS, coagulation, lipid pathways, ion channels), then issued comprehensive SPARQL queries against the EBI endpoint to retrieve approved (Phase 4) inhibitors anchored on MeSH disease IRIs (cco:hasMesh) and human single-protein targets. It used TogoID to bridge ChEMBL target IDs to UniProt accessions and compiled an interactive classification dashboard.

Tools Used

TogoMCP_Usage_Guide, get_MIE_file, search_chembl_target, run_sparql, togoid_convertId

Key Results

18 human protein targets confirmed with approved inhibitors across 6 CVD disease categories (dyslipidemia, hypertension, heart failure, thrombosis, stroke, arrhythmia)
ACE (P12821) has the largest number of distinct approved drug molecules — 12 ACE inhibitors
SLC12A3/NCC (P55017) has the broadest chemical diversity — 8+ thiazide diuretics
HMGCR (P04035) — the highest-profile target; statins remain the most prescribed CV drugs globally
SGLT2 (P31639) and Neprilysin (P08473) represent the most impactful recent heart-failure approvals
Factor Xa (P00742) inhibitors have largely supplanted warfarin (VKORC1) for AF-related stroke prevention
PCSK9 (Q8NBP7) and ANGPTL3 (Q9Y5C1) are the newest lipid-lowering targets, using mAb / siRNA mechanisms

Human Enzyme Distribution by EC Class

Prompt: Determine the distribution of human enzymes by their EC number classes.

Response

The system loaded the UniProt MIE to confirm the up:enzyme predicate and the mandatory up:reviewed 1 filter, then ran a single aggregating SPARQL query against the SIB endpoint, anchored on up:organism <…/taxonomy/9606>. EC class was extracted from the enzyme IRI by stripping the prefix and taking the first digit before the dot, giving counts of unique protein–EC-class associations across all seven EC classes.
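The EC-class extraction described above can be sketched in Python. This is a minimal illustration, not the query the system actually ran: the IRI prefix `http://purl.uniprot.org/enzyme/` is an assumption about how UniProt enzyme resources are named, and the protein labels in the sample rows are hypothetical placeholders.

```python
from collections import Counter

# Assumed IRI prefix for UniProt enzyme resources (not stated in the text above).
ENZYME_PREFIX = "http://purl.uniprot.org/enzyme/"


def ec_class_from_iri(enzyme_iri: str) -> str:
    """Strip the prefix and return the digits before the first dot."""
    if not enzyme_iri.startswith(ENZYME_PREFIX):
        raise ValueError(f"not an enzyme IRI: {enzyme_iri}")
    ec_number = enzyme_iri[len(ENZYME_PREFIX):]  # e.g. "2.7.11.1"
    return ec_number.split(".", 1)[0]            # e.g. "2"


def count_associations(rows):
    """Count unique (protein, EC class) pairs from (protein, enzyme IRI) rows."""
    unique_pairs = {(protein, ec_class_from_iri(iri)) for protein, iri in rows}
    return Counter(ec_class for _, ec_class in unique_pairs)


# Hypothetical rows: PROT_A carries two EC 2 numbers and one EC 3 number.
sample_rows = [
    ("PROT_A", ENZYME_PREFIX + "2.7.11.1"),
    ("PROT_A", ENZYME_PREFIX + "2.7.10.2"),
    ("PROT_A", ENZYME_PREFIX + "3.1.3.16"),
    ("PROT_B", ENZYME_PREFIX + "3.4.21.5"),
]
sample_counts = count_associations(sample_rows)
print(dict(sample_counts))
```

Because pairs are deduplicated before counting, a protein with two EC numbers in the same class contributes once to that class, while a bifunctional enzyme spanning two classes contributes once to each.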

Tools Used

get_MIE_file, run_sparql

Key Results

4,652 unique human protein–EC-class associations from UniProt Swiss-Prot (counts reflect associations, not unique proteins — bifunctional enzymes carry >1 EC number)
EC 2 – Transferases: 1,832 (39.4%) — kinases, methyltransferases, glycosyltransferases, acetyltransferases
EC 3 – Hydrolases: 1,732 (37.2%) — proteases, phosphatases, lipases, nucleases
EC 1 – Oxidoreductases: 546 (11.7%) — cytochrome P450s, dehydrogenases, peroxidases
EC 4 Lyases (155, 3.3%); EC 5 Isomerases (160, 3.4%); EC 6 Ligases (124, 2.7%); EC 7 Translocases (103, 2.2%)
EC 2 + EC 3 ≈ 77% — characteristic metazoan signature, reflecting the dominance of phosphorylation and proteolytic processing in human signalling
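The percentages above follow directly from the per-class counts. A short Python check, using only the figures listed in these results:

```python
# Association counts per EC class, copied from the results above.
ec_counts = {
    "EC 1": 546,
    "EC 2": 1832,
    "EC 3": 1732,
    "EC 4": 155,
    "EC 5": 160,
    "EC 6": 124,
    "EC 7": 103,
}

total = sum(ec_counts.values())

# Share of each class as a percentage of all associations, to one decimal.
shares = {name: round(100 * n / total, 1) for name, n in ec_counts.items()}

# Combined share of transferases and hydrolases.
ec2_ec3_share = round(100 * (ec_counts["EC 2"] + ec_counts["EC 3"]) / total, 1)

for name, share in shares.items():
    print(f"{name}: {ec_counts[name]} ({share}%)")
print(f"total: {total}, EC 2 + EC 3: {ec2_ec3_share}%")
```

The seven counts sum to the reported total of 4,652, and the combined EC 2 + EC 3 share comes to 76.6%, which the summary rounds to 77%.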

Comparative Chemical Analysis: Natural Products vs Synthetic Compounds

Prompt: Use ChEBI (chemical classification) + ChEMBL (bioactivity) + PubChem (chemical descriptors) to compare the bioactivity profiles and chemical diversity of natural products versus synthetic compounds across different therapeutic areas.

Response

The system loaded the MIEs for ChEBI, ChEMBL, and PubChem, then ran five SPARQL queries: ChEBI biological roles under the CHEBI:33245 natural-product hierarchy via OWL restrictions; ChEMBL approved-drug counts grouped by ATC therapeutic area; PubChem MW distribution for FDA-approved drugs (via the sio:SIO_000008 hub-and-spoke pattern); a cross-graph ChEMBL × ChEBI join confirming NP-derived approved drugs; and a NP pharmacological-role query with mass statistics. Results were synthesised into an interactive dashboard.

Tools Used

TogoMCP_Usage_Guide, get_MIE_file, run_sparql

Key Results

ATC distribution of approved drugs in ChEMBL: N (Nervous, 320) > L (Antineoplastic, 236) > J (Anti-infectives, 225) > A (Alimentary/Metabolism, 219) > C (Cardiovascular, 211)
Estimated NP-derived fraction by therapeutic area: J ~65% (β-lactams, macrolides), P ~55% (artemisinin, avermectin), L ~48% (taxol, vincristine), C ~25% (digoxin, lovastatin), N ~5% (BBB constraints rule out high-MW / high-TPSA NPs)
Physicochemical fingerprint — NP vs synthetic: MW ~580 vs ~320 Da; stereocenters ~6.2 vs 1.8; TPSA ~148 vs 82 Å²; XLogP3 ~1.8 vs 3.2 — NPs occupy a higher-MW, more stereorich, more polar region of chemical space
FDA-approved drug MW distribution (PubChem): 300–500 Da bucket dominates (2,672 drugs); the >500 Da tail (2,081 drugs) is disproportionately populated by NP-derived macrolides, terpenoids, and polyketides
Cross-database coverage gap: only ~35K of ChEMBL's 1.9M molecules carry a ChEBI xref, so the RDF-confirmed NP count is a strict lower bound — literature estimates fill the gap

Setup Guide

Connect TogoMCP to your AI assistant in minutes. Choose your platform below.

Method 1: Custom Connectors (Recommended — Claude Pro, Team, or Enterprise)

Open Claude and navigate to Settings → Connectors.
Click "Add custom connector" at the bottom of the page.
Enter the MCP server URL: https://togomcp.rdfportal.org/mcp
Click "Add" to complete the setup.
In your chat interface, click the "Search and tools" button to enable the connector.

Method 2: JSON Configuration (Alternative — Claude Desktop App)

Navigate to Settings → Developer → Edit Config. This opens claude_desktop_config.json.
Add the following configuration:

{
  "mcpServers": {
    "togomcp": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"]
    }
  }
}

Save the file and restart Claude Desktop.
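If claude_desktop_config.json already lists other MCP servers, the togomcp entry should be merged in rather than pasted over the whole file. A minimal Python sketch of that merge, working on an in-memory dictionary; the "other-server" entry is a hypothetical placeholder, and reading or writing the real file is left out because its location varies by platform.

```python
import json
from copy import deepcopy

# The togomcp entry from the configuration shown above.
TOGOMCP_ENTRY = {
    "command": "npx",
    "args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"],
}


def add_togomcp(config: dict) -> dict:
    """Return a copy of config with the togomcp server added or replaced."""
    merged = deepcopy(config)
    merged.setdefault("mcpServers", {})["togomcp"] = deepcopy(TOGOMCP_ENTRY)
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "example-cmd", "args": []}}}
updated = add_togomcp(existing)
print(json.dumps(updated, indent=2))
```

The function copies its input, so the original configuration stays untouched until the merged result is written back deliberately.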

ℹ️ For more details, see the Claude custom connectors documentation.

ChatGPT — Plus, Pro, Business, or Enterprise Plans Required

Creating a custom MCP connector in ChatGPT requires developer mode. The steps below follow the Business/Enterprise workspace flow, where an admin must enable it first.

Go to Workspace Settings → Permissions & Roles and enable "Developer mode / Create custom MCP connectors".
Navigate to Settings → Connectors → Create.
Enter the following details:
- Connector name: TogoMCP
- MCP Server URL: https://togomcp.rdfportal.org/mcp
- Description: Access to life sciences databases via TogoMCP
Click "Create" and authorize the connector.
In your chat, use the "+" menu to select "Developer Mode" and enable TogoMCP.

ℹ️ Developer mode requires confirmation for write actions and shows tool usage explicitly. See the ChatGPT MCP documentation for more details.

Gemini CLI — Streamable HTTP Configuration

TogoMCP uses Streamable HTTP transport (not SSE). Configure your settings.json as follows:

{
  "mcpServers": {
    "togomcp": {
      "httpUrl": "https://togomcp.rdfportal.org/mcp"
    }
  }
}

Save the file — Gemini CLI will automatically detect the new MCP server on next launch.
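The transport note matters because the key used in the server entry selects the transport. The Python sketch below encodes one reading of the Gemini CLI convention, which is an assumption to verify against the Gemini CLI documentation: "httpUrl" for Streamable HTTP, "url" for SSE, and "command" for a local stdio process. The helper name `transport_of` is hypothetical.

```python
def transport_of(entry: dict) -> str:
    """Guess the transport of an mcpServers entry from the key it uses.

    Assumed convention: "httpUrl" -> Streamable HTTP, "url" -> SSE,
    "command" -> stdio.
    """
    if "httpUrl" in entry:
        return "streamable-http"
    if "url" in entry:
        return "sse"
    if "command" in entry:
        return "stdio"
    raise ValueError("entry has none of: httpUrl, url, command")


# The entry from the settings.json shown above.
gemini_entry = {"httpUrl": "https://togomcp.rdfportal.org/mcp"}
print(transport_of(gemini_entry))
```

Under this reading, putting the TogoMCP address under "url" instead of "httpUrl" would make the client attempt SSE, which the server does not use.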

ℹ️ For full setup instructions, see the Gemini CLI MCP documentation.

Available Databases

TogoMCP integrates over 20 major life sciences databases, covering proteins, genes, chemicals, diseases, pathways, taxonomy, and more.

UniProt

Comprehensive protein sequence and functional information. 444M+ proteins with curated Swiss-Prot entries, cross-references to 200+ databases, sequences, domains, and functions.

PubChem

Public chemical database with 119M compounds, 1.7M bioassays, molecular descriptors, bioactivity data, and links to genes, proteins, pathways, and diseases.

ChEMBL

Manually curated database of bioactive molecules with 2.4M+ compounds, 20M bioactivity measurements, and compound–target–activity data for drug discovery.

PDB (Protein Data Bank)

3D structural data for 204K+ protein and nucleic acid structures from X-ray crystallography, NMR, and cryo-EM. Linked to UniProt and EMDB.

jPOST

Curated proteomics repository of reanalysed mass-spectrometry submissions: 598 projects bundling 1M+ identified Proteins (linked to UniProt), 3.7M Peptides, 10M+ PSMs, and PTMs typed against Unimod and PSI-MS.

ChEBI

Chemical Entities of Biological Interest ontology with 217,000+ entities. Hierarchical classification of small molecules, atoms, ions, functional groups, and biological roles.

Reactome

Curated knowledgebase of 22,000+ biological pathways across 30+ species, covering molecular interactions, biochemical reactions, complexes, and disease associations.

Rhea

Expert-curated database of 17,078 atom-balanced biochemical reactions, linked to ChEBI compounds, EC numbers, and metabolic pathway databases.

BRENDA

Manually curated enzyme database covering enzymes classified by EC number across organisms, with substrates, products, inhibitors, activators, cofactors, tissue expression, and subcellular localization.

Gene Ontology (GO)

Controlled vocabulary for gene and gene product attributes across all organisms, covering biological processes, molecular functions, and cellular components.

Ensembl

Comprehensive genome annotations for 100+ species including genes, transcripts, proteins, and exons. Cross-referenced to UniProt, HGNC, and OMIM.

NCBI Gene

Gene database with 57M+ entries covering protein-coding genes, ncRNAs, and pseudogenes across all organisms, with chromosomal locations and orthology data.

HGNC

HUGO Gene Nomenclature Committee — the authoritative source for approved human gene symbols and names, with chromosomal locations and cross-references to NCBI Gene, Ensembl, UniProt, OMIM, and dozens more.

OMA

Orthologous MAtrix — phylogenomics-based ortholog and paralog inference covering 17.4M proteins from 2,927 species, organized into Hierarchical Orthologous Groups (HOGs) for comparative genomics and gene family analysis.

Bgee

Curated gene expression database integrating RNA-Seq, Affymetrix, EST, and in situ hybridization data across animal species, with calls tagged by anatomy (UBERON), developmental stage, sex, and strain for cross-species expression comparison.

ClinVar

Aggregates information on genomic variation and its relationship to human health. 3.5M+ variant records with clinical interpretations, gene associations, and disease conditions.

MedGen

NCBI's portal for medical genetics with 233,000+ clinical concepts covering diseases, phenotypes, and clinical findings. Integrates OMIM, Orphanet, HPO, and MONDO.

MONDO

Monarch Disease Ontology, integrating multiple disease databases into a unified classification with cross-references to OMIM, Orphanet, DOID, MeSH, ICD, and 35+ databases.

MeSH

NLM's Medical Subject Headings — controlled vocabulary for biomedical literature indexing, with hierarchical descriptors and qualifiers across 16 main categories.

NCBI Taxonomy

Biological taxonomic classification covering 3M+ organisms from bacteria to mammals, with hierarchical relationships, scientific names, and genetic code assignments.

PubMed

Bibliographic information for biomedical literature from MEDLINE, including publication metadata, abstracts, authors, and MeSH annotations.

PubTator

Biomedical entity annotations extracted from PubMed using text mining. Disease and Gene annotations linked to articles for literature-based knowledge discovery.

BacDive

Bacterial Diversity Metadatabase with 97,000+ strain records covering taxonomy, morphology, physiology, and molecular data for bacteria and archaea.

MediaDive

Culture media database from DSMZ with 3,289 standardized recipes for bacteria, archaea, fungi, and microalgae, including ingredients and growth conditions.

NANDO

Japanese intractable (rare) disease ontology with 2,777 disease classes, multilingual labels, and cross-references to international disease ontologies.

DDBJ

DNA Data Bank of Japan: nucleotide sequences with genomic annotations, organism metadata, taxonomic classification, and functional annotations.

GlyCosmos

Glycoscience portal integrating glycan structures, glycoproteins, glycosylation sites, glycogenes, and lectin–glycan interactions for multi-species glycobiology research.

AMR Portal

Antimicrobial resistance surveillance data with 1.7M+ phenotypic susceptibility tests and 1.1M+ genotypic AMR features from bacterial isolates worldwide.

Available Tools

TogoMCP exposes a rich set of tools for searching, querying, and converting life sciences data.

📋 Database & Information

list_databases

List all available databases with descriptions and coverage.

find_databases

Token-efficient discovery — filter the database catalog by keywords and/or category.

list_categories

List database categories with their member databases — pair with find_databases(category=…).

get_MIE_file

Get the metadata file containing the ShEx schema and SPARQL examples for a specific database. Call this before writing queries.

get_sparql_endpoints

Retrieve available SPARQL endpoints from the RDF Portal.

get_graph_list

List named graphs available in a given database.

🔍 Keyword Search Tools

search_uniprot_entity

Search UniProt proteins by name, description, or disease association.

search_chembl_molecule

Search ChEMBL molecules and bioactive compounds.

search_chembl_target

Search ChEMBL protein targets and drug targets.

search_pdb_entity

Search PDBj entries including structures, chemical components, and peptides.

search_reactome_entity

Search Reactome pathways and biological reactions.

search_rhea_entity

Search Rhea biochemical reactions by keyword.

search_mesh_descriptor

Search MeSH medical concepts, descriptors, and controlled vocabulary terms.

⚡ SPARQL Query

run_sparql

Execute custom SPARQL queries on any RDF database in the portal. For best results, use get_MIE_file first to understand the database schema and available properties.
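The recommended order (schema first, then query) can be sketched as two MCP tool calls. In MCP, a client invokes a tool with a JSON-RPC 2.0 request whose method is "tools/call". The Python sketch below only builds the request bodies and sends nothing; the argument names "dbname" and "sparql_query" are hypothetical placeholders, since the real parameter names come from the tool schemas the server advertises.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 "tools/call" request body for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Generic probe query; a real query would use the properties from the MIE file.
query = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"

calls = [
    # Step 1: load the schema and example queries (argument name is hypothetical).
    tool_call("get_MIE_file", {"dbname": "uniprot"}),
    # Step 2: run the query (argument names are hypothetical).
    tool_call("run_sparql", {"dbname": "uniprot", "sparql_query": query}),
]

for call in calls:
    print(json.dumps(call))
```

In practice the AI assistant's MCP client generates these requests itself. The sketch only shows what "use get_MIE_file first" looks like at the protocol level.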

🔄 ID Conversion (TogoID)

togoid_convertId

Convert identifiers between databases (e.g., UniProt to ChEMBL, PubChem to ChEBI).

togoid_countId

Count how many IDs can be converted between two databases.

togoid_getAllDataset

Get configuration for all available datasets in TogoID.

togoid_getDataset

Get configuration details for a specific TogoID dataset.

togoid_getAllRelation

Get all possible conversion relationships between databases.

togoid_getRelation

Get relationship details between two specific databases.

🧬 NCBI E-utilities

ncbi_esearch

Search NCBI databases (Gene, Taxonomy, ClinVar, MedGen, PubMed, PubChem) using NCBI field tags.

ncbi_esummary

Get summary information for NCBI IDs retrieved via esearch.

ncbi_efetch

Fetch full records (sequences, data, etc.) for NCBI IDs.

ncbi_list_databases

List all supported NCBI databases with descriptions and example queries.

🧪 PubChem-specific

get_pubchem_compound_id

Look up a PubChem compound ID from a compound name.

get_compound_attributes_from_pubchem

Retrieve detailed compound attributes and molecular descriptors from PubChem RDF.

Other MCP Servers

TogoMCP can be combined with the following complementary MCP servers to build richer research workflows.

PubDictionaries MCP Server

https://pubdictionaries.org/mcp

PubDictionaries provides text annotation services for biomedical literature, helping identify and map biological entities such as genes, proteins, diseases, and chemicals in text.

Use Case: Annotate research papers to extract entity mentions, then use TogoMCP to retrieve comprehensive data about those entities from UniProt, ChEMBL, or other databases.

PubMed MCP Server

Claude Support Article

The PubMed MCP server provides access to the world's largest biomedical literature database, enabling article search, metadata retrieval, and full-text access from PubMed Central.

Use Case: Search PubMed for relevant literature, then use TogoMCP to retrieve detailed molecular and pathway information about entities mentioned in those papers.

OLS4 MCP Server

https://www.ebi.ac.uk/ols4/mcp

The Ontology Lookup Service (OLS4) from EMBL-EBI provides access to biomedical ontologies, enabling standardization of terminology and exploration of hierarchical relationships between biological concepts.

Use Case: Explore ontology terms and their relationships in OLS4, then query TogoMCP using standardized ontology identifiers for precise data retrieval.

Related Resources

Explore the ecosystem of tools and organizations behind TogoMCP.

RDF Portal

https://rdfportal.org

The NBDC RDF Portal is a comprehensive repository of semantic life sciences data developed by DBCLS and NBDC. It hosts 21+ RDF datasets comprising over 45.5 billion triples, all quality-reviewed for interoperability and SPARQL queryability. TogoMCP's SPARQL queries run against this portal's unified endpoint.

TogoID

https://togoid.dbcls.jp

TogoID is an identifier conversion service by DBCLS that bridges 65+ life science databases. Unlike traditional converters, it supports cross-category conversions (e.g., disease IDs → gene IDs) with semantic relationship annotations. TogoMCP's ID conversion tools are powered by TogoID's API.

DBCLS

https://dbcls.rois.ac.jp/en/

The Database Center for Life Science (DBCLS) is a Japanese research institute under ROIS, founded in 2007. It conducts research on database integration, Semantic Web technologies, and bioinformatics resources. DBCLS organizes the annual BioHackathon and monthly SPARQLthon events, and develops tools like TogoID, TogoTV, and TogoMCP.

Source Code

dbcls/togomcp

The TogoMCP source code is open and available on GitHub. Contributions, bug reports, and feature requests are welcome.

https://github.com/dbcls/togomcp
