TogoMCP

Model Context Protocol Server for Life Sciences Research

MCP Server Endpoint:
https://togomcp.rdfportal.org/mcp

Summary

TogoMCP is a comprehensive Model Context Protocol (MCP) server that provides LLM agents with seamless access to a vast ecosystem of life sciences databases. Developed by the Database Center for Life Science (DBCLS), TogoMCP integrates over 20 major biological and biomedical databases, offering researchers a powerful toolkit for cross-database queries, data integration, and knowledge discovery.

Through SPARQL queries, RDF data exploration, and ID conversion services, TogoMCP enables AI assistants like Claude to help researchers query, link, and integrate data across these databases.

Whether you're a biologist exploring disease-protein associations, a chemist searching for drug candidates, or a researcher integrating data across multiple domains, TogoMCP bridges the gap between AI assistants and the rich data landscape of life sciences.

Usage Examples

Example 1: Multi-Database Integration for Drug Discovery

Prompt:

"Find proteins that are associated with both cardiovascular diseases and have known small molecule inhibitors in ChEMBL, and classify them according to disease and drug availability."

Response:

The system executed a comprehensive multi-database search strategy, querying MeSH for cardiovascular disease terms, then searching ChEMBL for related drug targets, and executing complex SPARQL queries to retrieve detailed bioactivity data.

Tools Used:
  • search_mesh_entity - Found cardiovascular disease MeSH terms
  • search_chembl_target - Searched for cardiovascular drug targets
  • get_MIE_file - Retrieved ChEMBL database schema
  • run_sparql - Executed complex queries across ChEMBL database

Key Results:
  • Successfully identified and classified 75+ human proteins associated with cardiovascular diseases
  • 15+ validated targets for hypertension with approved drugs
  • 20+ targets for coronary artery disease in various development stages
  • 12+ validated targets for heart failure
  • Ion channels with 60+ different protein subtypes and inhibitors
  • 50+ targets with approved drugs, 15+ in Phase 3 trials

Example 2: Enzyme Classification Analysis

Prompt:

"Determine the distribution of human enzymes by their EC number classes."

Response:

The system retrieved the UniProt database schema and executed targeted SPARQL queries to extract and analyze all human enzymes with EC number classifications from Swiss-Prot.

Tools Used:
  • get_MIE_file - Retrieved UniProt database schema
  • run_sparql - Queried UniProt for human enzymes with EC numbers

Key Results:
  • Analyzed 4,442 unique human enzymes from UniProt Swiss-Prot
  • EC 2 (Transferases): 1,823 proteins (41.0%) - Most abundant
  • EC 3 (Hydrolases): 1,655 proteins (37.3%) - Second most abundant
  • EC 1 (Oxidoreductases): 545 proteins (12.3%)
  • EC 4 (Lyases): 151 proteins (3.4%)
  • EC 5 (Isomerases): 148 proteins (3.3%)
  • EC 6 (Ligases): 124 proteins (2.8%)
  • EC 7 (Translocases): 103 proteins (2.3%)
  • The dominance of transferases and hydrolases (78.3% combined) reflects the importance of cellular signaling and regulatory processes
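
The percentages above can be rechecked from the listed counts. The short Python sketch below recomputes each share against the 4,442 total reported in the results; the variable and function names are illustrative only and are not part of TogoMCP.

```python
# Counts per EC class, copied from the Key Results above.
EC_COUNTS = {
    "EC 1 (Oxidoreductases)": 545,
    "EC 2 (Transferases)": 1823,
    "EC 3 (Hydrolases)": 1655,
    "EC 4 (Lyases)": 151,
    "EC 5 (Isomerases)": 148,
    "EC 6 (Ligases)": 124,
    "EC 7 (Translocases)": 103,
}
TOTAL_UNIQUE_ENZYMES = 4442  # "unique human enzymes" figure from the results


def share(count, total=TOTAL_UNIQUE_ENZYMES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage per class, and the number of (enzyme, class) assignments overall.
SHARES = {name: share(n) for name, n in EC_COUNTS.items()}
CLASS_ASSIGNMENTS = sum(EC_COUNTS.values())

if __name__ == "__main__":
    for name, pct in sorted(SHARES.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {EC_COUNTS[name]} proteins ({pct}%)")
    print("total class assignments:", CLASS_ASSIGNMENTS)
```

The per-class counts add up to 4,549, which is more than the 4,442 unique enzymes. This suggests that some proteins carry EC numbers from more than one class, although the results above do not say so explicitly.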

Example 3: Comparative Chemical Analysis

Prompt:

"Use ChEBI (chemical classification) + ChEMBL (bioactivity) + PubChem (chemical descriptors) to compare the bioactivity profiles and chemical diversity of natural products versus synthetic compounds across different therapeutic areas."

Response:

The system coordinated queries across three major chemical databases, retrieving schemas, executing cross-database SPARQL queries, and obtaining detailed molecular descriptors to perform a comprehensive comparative analysis.

Tools Used:
  • get_MIE_file - Retrieved schemas for ChEBI, ChEMBL, and PubChem
  • run_sparql - Executed cross-database queries
  • get_compound_attributes_from_pubchem - Retrieved molecular descriptors

Key Results:
  • Natural Products: Ultra-high potency (70% with IC50 < 100 nM), average 8 chiral centers, MW 400-900 Da, polypharmacology (8.5 targets average), ~60% of anticancer drugs are natural product-derived
  • Synthetic Compounds: Selective potency (40% with IC50 < 100 nM), simpler structures (average 1 chiral center), MW 300-600 Da, monopharmacology (2-3 targets), dominant in cardiovascular, CNS, and metabolic diseases
  • Natural products excel in oncology and occupy unique chemical space
  • Synthetic drugs excel in target selectivity for precision medicine
  • Complementary strengths suggest that the optimal strategy combines both approaches

Setup Guide

TogoMCP can be connected to various AI platforms. Choose your platform below:

Claude Desktop

Method 1: Custom Connectors (Recommended for Pro/Team/Enterprise)

  1. Navigate to Settings → Connectors
  2. Click "Add custom connector" at the bottom
  3. Enter the MCP server URL: https://togomcp.rdfportal.org/mcp
  4. Click "Add" to complete the setup
  5. Enable the connector via the "Search and tools" button in your chat interface

Method 2: JSON Configuration (Alternative)

  1. Navigate to Settings → Developer → Edit Config
  2. This opens claude_desktop_config.json
  3. Add the following configuration:
{
  "mcpServers": {
    "togomcp": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"]
    }
  }
}
  4. Save the file and restart Claude Desktop
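
If claude_desktop_config.json already lists other MCP servers, the togomcp entry has to be merged in rather than pasted over them. The following is a minimal Python sketch of that merge, assuming the caller supplies the path that the "Edit Config" button opens; the function name add_togomcp is hypothetical.

```python
import json
from pathlib import Path

# Server entry copied from the configuration shown above.
TOGOMCP_ENTRY = {
    "command": "npx",
    "args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"],
}


def add_togomcp(config_path):
    """Add the togomcp entry to a config file while keeping all other keys."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    # Start from the existing config if there is one, else from an empty object.
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})["togomcp"] = TOGOMCP_ENTRY
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Existing servers and unrelated settings are left untouched; only the "togomcp" key under "mcpServers" is created or overwritten. Claude Desktop still needs a restart afterwards.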

ChatGPT (Business/Enterprise Plans)

Setup Instructions

  1. Navigate to Workspace Settings → Permissions & Roles
  2. Enable "Developer mode / Create custom MCP connectors"
  3. Go to Settings → Connectors → Create
  4. Enter the following details:
    • Connector name: TogoMCP
    • MCP Server URL: https://togomcp.rdfportal.org/mcp
    • Description: Access to life sciences databases via TogoMCP
  5. Click "Create" and authorize the connector
  6. In your chat, use the "+" menu to select "Developer Mode" and enable TogoMCP

Note: Developer mode requires confirmation for write actions and shows tool usage explicitly for safety.

Gemini CLI

TogoMCP uses Streamable HTTP transport (not SSE). Configure your settings.json as follows:

{
  "mcpServers": {
    "togomcp": {
      "httpUrl": "https://togomcp.rdfportal.org/mcp"
    }
  }
}

For more information, see the Gemini CLI documentation.
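
For clients other than those listed here, it may help to see what a Streamable HTTP request looks like on the wire. The Python sketch below builds, but does not send, the MCP initialize request as a single HTTP POST. The protocol version string and the client name are assumptions; which versions the TogoMCP server negotiates is not stated in this document.

```python
import json
import urllib.request

MCP_URL = "https://togomcp.rdfportal.org/mcp"  # endpoint given in this document


def build_initialize_request(url=MCP_URL, protocol_version="2025-03-26"):
    """Build (but do not send) an MCP `initialize` POST for Streamable HTTP."""
    message = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: the server may negotiate a different version.
            "protocolVersion": protocol_version,
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or with an event stream,
            # so the client declares that it accepts both.
            "Accept": "application/json, text/event-stream",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. A full client would also handle the session header and the follow-up initialized notification, which an MCP client library normally takes care of.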

Available Databases

TogoMCP provides access to over 20 major life sciences databases covering proteins, genes, chemicals, diseases, pathways, and more:

UniProt

Comprehensive protein sequence and functional information. Contains 444M proteins with Swiss-Prot (923K curated) and TrEMBL annotations, including sequences, functions, domains, structures, and cross-references to 200+ databases.

PubChem

Public database of chemical molecules with 119M compounds, 339M substances, 1.7M bioassays, molecular descriptors, bioactivity data, and extensive cross-references to genes, proteins, pathways, and diseases.

ChEMBL

Manually curated database of bioactive molecules containing 2.4M+ compounds, 1.6M assays, 20M bioactivity measurements. Essential for drug discovery with compound-target-activity relationships and mechanism of action data.

PDB (Protein Data Bank)

3D structural data for proteins, nucleic acids, and complexes from X-ray, NMR, and cryo-EM with 204K+ entries. Includes experimental methods, resolution data, and cross-references to UniProt and EMDB.

ChEBI

Chemical Entities of Biological Interest ontology with 217,000+ entities including small molecules, atoms, ions, and functional groups. Provides hierarchical classification, molecular data, and biological roles.

Reactome

Open-source curated knowledgebase of biological pathways with 22,000+ pathways across 30+ species, molecular interactions, biochemical reactions, protein complexes, and disease associations.

Rhea

Expert-curated database of 17,078 biochemical reactions, all atom-balanced and chemically annotated. Linked to ChEBI compounds, enzyme classifications (EC numbers), and metabolic pathway databases.

Gene Ontology (GO)

Controlled vocabulary for gene and gene product attributes across all organisms. Covers biological processes (30,804 terms), molecular functions (12,793 terms), and cellular components (4,568 terms).

Ensembl

Comprehensive genomics database providing genome annotations for 100+ species. Contains genes, transcripts, proteins, and exons with genomic locations and cross-references to UniProt, HGNC, and OMIM.

NCBI Gene

Gene database with 57M+ entries covering protein-coding genes, ncRNAs, and pseudogenes across all organisms. Includes gene symbols, chromosomal locations, and orthology relationships.

ClinVar

Aggregates information on relationships between genomic variation and human health, with 3.5M+ variant records, clinical interpretations, gene associations, and disease conditions. Cross-referenced to MedGen and OMIM.

MedGen

NCBI's portal for medical genetics with over 233,000 clinical concepts covering diseases, phenotypes, and clinical findings. Integrates data from OMIM, Orphanet, HPO, and MONDO.

MONDO

Monarch Disease Ontology integrating multiple disease databases into a unified classification. Provides cross-references to OMIM, Orphanet, DOID, MeSH, ICD, and 35+ databases.

MeSH

Medical Subject Headings - NLM's controlled vocabulary thesaurus for biomedical literature indexing. Hierarchical structure with descriptors, qualifiers, and supplementary records across 16 main categories.

NCBI Taxonomy

Comprehensive biological taxonomic classification covering 3M+ organisms from bacteria to mammals with hierarchical relationships, scientific/common names, and genetic code assignments.

PubMed

Bibliographic information for biomedical literature from MEDLINE with publication metadata, abstracts, authors, MeSH annotations, and cross-references to external databases.

PubTator

Biomedical entity annotations extracted from PubMed using text mining. Contains Disease and Gene annotations linked to articles, enabling literature-based knowledge discovery.

BacDive

Bacterial Diversity Metadatabase with 97,000+ strain records covering taxonomy, morphology, physiology, cultivation conditions, and molecular data for bacteria and archaea.

MediaDive

Comprehensive culture media database from DSMZ with 3,289 standardized recipes for bacteria, archaea, fungi, and microalgae, including ingredients and growth conditions.

NANDO

Japanese intractable (rare) diseases ontology with 2,777 disease classes, multilingual labels, and cross-references to international disease ontologies and medical documentation.

DDBJ

DNA Data Bank of Japan providing nucleotide sequence data with genomic annotations, organism metadata, taxonomic classification, and functional annotations.

GlyCosmos

Comprehensive glycoscience portal integrating glycan structures, glycoproteins, glycosylation sites, glycogenes, and lectin-glycan interactions for multi-species glycobiology research.

AMR Portal

Integrates antimicrobial resistance (AMR) surveillance data with 1.7M+ phenotypic susceptibility tests and 1.1M+ genotypic AMR features from bacterial isolates worldwide.

Available Tools

TogoMCP provides a comprehensive set of tools for accessing and querying life sciences data:

Database & Information Tools

list_databases

List all available databases with descriptions

get_MIE_file

Get metadata file with ShEx schema, RDF and SPARQL examples for a database

get_sparql_endpoints

Get available SPARQL endpoints for RDF Portal

get_sparql_example

Get example SPARQL query for a specific database

get_graph_list

Get list of named graphs in a database

Search Tools (Keyword-based)

search_uniprot_entity

Search UniProt proteins by name, description, or disease associations

search_chembl_molecule

Search ChEMBL molecules and compounds

search_chembl_target

Search ChEMBL protein targets

search_pdb_entity

Search PDBj entries (structures, chemical components, peptides)

search_reactome_entity

Search Reactome pathways and reactions

search_rhea_entity

Search Rhea biochemical reactions

search_mesh_entity

Search MeSH medical concepts and terms

SPARQL Query Tool

run_sparql

Execute custom SPARQL queries on any RDF database. Use get_MIE_file first to understand the database schema.
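
The schema-first workflow can be sketched as two MCP tools/call messages: one to fetch the MIE file, then one to run a query written against it. In the Python sketch below, the tool names come from this document, but the argument names "dbname" and "sparql_query" are hypothetical placeholders; the real input schema should be read from the server's tools/list response. The SPARQL predicates are likewise assumptions to check against the MIE file.

```python
import json
from itertools import count

_ids = count(1)  # source of JSON-RPC request ids


def tool_call(name, arguments):
    """Build an MCP `tools/call` JSON-RPC message for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Sketch query: count reviewed human proteins per top-level EC class.
# The predicates used here should be verified against the UniProt MIE file.
EC_CLASS_QUERY = """\
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
SELECT ?ecClass (COUNT(DISTINCT ?protein) AS ?proteins)
WHERE {
  ?protein a up:Protein ;
           up:reviewed true ;
           up:organism taxon:9606 ;
           up:enzyme ?ec .
  BIND(STRBEFORE(STRAFTER(STR(?ec), "/enzyme/"), ".") AS ?ecClass)
}
GROUP BY ?ecClass
ORDER BY DESC(?proteins)
"""

# Step 1: read the schema. Step 2: run the query written against it.
# "dbname" and "sparql_query" are hypothetical argument names.
schema_step = tool_call("get_MIE_file", {"dbname": "uniprot"})
query_step = tool_call(
    "run_sparql", {"dbname": "uniprot", "sparql_query": EC_CLASS_QUERY}
)

if __name__ == "__main__":
    print(json.dumps(schema_step, indent=2))
    print(json.dumps(query_step, indent=2))
```

In practice an AI assistant issues these calls itself; the sketch only makes the order of operations explicit.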

ID Conversion Tools (TogoID)

togoid_convertId

Convert IDs between databases (e.g., UniProt to ChEMBL, PubChem to ChEBI)

togoid_countId

Count how many IDs can be converted between databases

togoid_getAllDataset

Get all dataset configurations

togoid_getDataset

Get specific dataset configuration

togoid_getAllRelation

Get all conversion relationships

togoid_getRelation

Get relationship details between two databases

NCBI E-utilities Tools

ncbi_esearch

Search NCBI databases (Gene, Taxonomy, ClinVar, MedGen, PubMed, PubChem)

ncbi_esummary

Get summary information for NCBI IDs

ncbi_efetch

Fetch full records (sequences, data, etc.)

ncbi_list_databases

List all supported NCBI databases
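
The tool names above mirror the public NCBI E-utilities operations ESearch, ESummary, and EFetch. How TogoMCP wraps them is not documented here, so the Python sketch below only shows the shape of the underlying E-utilities requests that such tools presumably issue; the helper names are hypothetical.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL that returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def esummary_url(db, ids):
    """Build an ESummary URL for a list of record IDs."""
    params = {"db": db, "id": ",".join(str(i) for i in ids), "retmode": "json"}
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Search first, then summarize the IDs that the search returned.
    print(esearch_url("gene", "BRCA1[sym] AND human[orgn]"))
    print(esummary_url("gene", [672]))
```

The usual pattern is the same through the MCP tools: search for IDs with ncbi_esearch, then pass those IDs to ncbi_esummary or ncbi_efetch.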

PubChem-specific Tools

get_pubchem_compound_id

Get PubChem compound ID from compound name

get_compound_attributes_from_pubchem

Get detailed compound attributes and molecular descriptors

Other MCP Servers

TogoMCP works well in combination with these complementary MCP servers:

PubDictionaries MCP Server

https://pubdictionaries.org/mcp

PubDictionaries provides text annotation services for biomedical literature. It helps identify and map biological entities (genes, proteins, diseases, chemicals) in text. When combined with TogoMCP, you can annotate literature and then retrieve detailed information about the identified entities from life sciences databases.

Use Case: Analyze research papers to extract entity mentions, then use TogoMCP to retrieve comprehensive data about those entities from UniProt, ChEMBL, or other databases.

PubMed MCP Server

Information: Claude Support Article

The PubMed MCP server provides access to the world's largest biomedical literature database. It enables searching for articles, retrieving metadata, finding related papers, and accessing full-text content from PubMed Central.

Use Case: Search PubMed for relevant literature, then use TogoMCP to retrieve detailed molecular and pathway information about entities mentioned in those papers.

OLS4 MCP Server

https://www.ebi.ac.uk/ols4/mcp

The Ontology Lookup Service (OLS4) provides access to biomedical ontologies from EMBL-EBI. It helps standardize terminology and understand hierarchical relationships between biological concepts.

Use Case: Use OLS4 to explore ontology terms and their relationships, then query TogoMCP databases using standardized ontology identifiers for precise data retrieval.

Related Resources

RDF Portal

Website: https://rdfportal.org

The RDF Portal, formerly known as the NBDC RDF Portal, is a comprehensive repository for semantic data in life sciences developed by DBCLS and the National Bioscience Database Center (NBDC) in Japan. It hosts over 21 RDF datasets comprising more than 45.5 billion triples, all reviewed by NBDC to ensure interoperability and queryability.

The portal is built on Semantic Web technologies using the Resource Description Framework (RDF) and SPARQL query language. It provides a unified interface for querying diverse life science databases that have been converted to RDF format following the DBCLS guidelines for RDFizing databases.

Key Features: SPARQL endpoint for querying multiple databases simultaneously, quality-reviewed datasets ensuring data interoperability, regular updates to maintain currency with source databases, and integration of fundamental databases like UniProt, PDB, PubChem, and Ensembl in RDF format.
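
SPARQL endpoints commonly return SELECT results in the W3C SPARQL 1.1 JSON results format. The Python sketch below flattens such a document into plain rows. It assumes an RDF Portal endpoint returns this standard format when asked for application/sparql-results+json; the sample document uses made-up example.org values and is not real portal output.

```python
import json


def rows_from_sparql_json(payload):
    """Flatten a SPARQL 1.1 JSON results document into a list of plain dicts."""
    doc = json.loads(payload) if isinstance(payload, (str, bytes)) else payload
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Unbound variables are absent from a binding; map them to None.
        rows.append(
            {v: binding[v]["value"] if v in binding else None for v in variables}
        )
    return rows


# Illustrative sample with placeholder values (not real query output).
SAMPLE = {
    "head": {"vars": ["protein", "label"]},
    "results": {
        "bindings": [
            {
                "protein": {"type": "uri", "value": "http://example.org/protein/1"},
                "label": {"type": "literal", "value": "Example protein"},
            },
            {"protein": {"type": "uri", "value": "http://example.org/protein/2"}},
        ]
    },
}

if __name__ == "__main__":
    for row in rows_from_sparql_json(SAMPLE):
        print(row)
```

Only the "value" field of each binding is kept; a fuller version might also carry the datatype or language tag.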

TogoID

Website: https://togoid.dbcls.jp

TogoID is an identifier (ID) conversion service developed by DBCLS that bridges biological datasets by linking IDs across 65+ diverse life science databases. Unlike traditional ID converters, TogoID expands the concept of "ID conversion" to include not only same-entity mappings but also cross-category relationships with semantic annotations.

For example, TogoID can convert disease IDs to related gene IDs, or glycan IDs to protein IDs, distinguishing relationships like "glycans bind to proteins" versus "glycans are processed by proteins." The service features a user-friendly web interface for exploratory multi-step conversions, showing biological meanings and ID count changes at each step.
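
The idea of a multi-step conversion with an ID count at each step can be illustrated with a toy model. The Python sketch below does not call TogoID: the mapping tables, category names, and IDs are made-up placeholders, and convert_route is a hypothetical helper.

```python
def convert_route(ids, route, mappings):
    """Follow a multi-step route, returning the ID list reached at every step."""
    current = list(dict.fromkeys(ids))  # de-duplicate while keeping input order
    steps = [(route[0], current)]
    for source, target in zip(route, route[1:]):
        table = mappings[(source, target)]
        reached = []
        for identifier in current:
            # IDs without a mapping simply drop out at this step.
            reached.extend(table.get(identifier, []))
        current = list(dict.fromkeys(reached))
        steps.append((target, current))
    return steps


# Made-up placeholder IDs and relations, for illustration only.
TOY_MAPPINGS = {
    ("disease", "gene"): {"D1": ["G1", "G2"], "D2": ["G2"]},
    ("gene", "protein"): {"G1": ["P1"], "G2": ["P2", "P3"]},
}

if __name__ == "__main__":
    route = ["disease", "gene", "protein"]
    for category, found in convert_route(["D1", "D2"], route, TOY_MAPPINGS):
        print(f"{category}: {len(found)} IDs {found}")
```

Each step can grow or shrink the ID set, which is why seeing the count at every step is useful when exploring a route.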

Key Features: Multi-step ID conversion across different biological categories, semantic representation of biological relationships between IDs, weekly automatic data updates, API for programmatic access, label-to-ID and ID-to-label conversion capabilities, and open development model accepting ID pair addition requests.

DBCLS (Database Center for Life Science)

Website: https://dbcls.rois.jp

The Database Center for Life Science (DBCLS) is a Japanese research institute and part of the Research Organization of Information and Systems (ROIS), founded in 2007. DBCLS conducts fundamental research and development for database integration technologies in life sciences, with a focus on making diverse biological data more accessible and usable.

DBCLS has been organizing the annual BioHackathon event since 2008 (co-organizing with NBDC since 2011), fostering collaboration among developers to improve the integration, preservation, and utilization of life science databases. The center also hosts monthly SPARQLthon events to promote Semantic Web applications and share technical knowledge.

Research Areas: Development of RDF integration technologies using Semantic Web standards, construction of integrated database environments and distributed database systems, creation of user-friendly web services and tools for database access, development of resources like TogoTV (video-based bioinformatics tutorials), and support for database development and data-driven life sciences research.