Model Context Protocol Server for Life Sciences Research
TogoMCP is a comprehensive Model Context Protocol (MCP) server that provides LLM agents with seamless access to a vast ecosystem of life sciences databases. Developed by the Database Center for Life Science (DBCLS), TogoMCP integrates over 20 major biological and biomedical databases, offering researchers a powerful toolkit for cross-database queries, data integration, and knowledge discovery.
Through SPARQL queries, RDF data exploration, and ID conversion services, TogoMCP enables AI assistants like Claude to help researchers:
Whether you're a biologist exploring disease-protein associations, a chemist searching for drug candidates, or a researcher integrating data across multiple domains, TogoMCP bridges the gap between AI assistants and the rich data landscape of life sciences.
"Find proteins that are associated with both cardiovascular diseases and have known small molecule inhibitors in ChEMBL, and classify them according to disease and drug availability."
The system executed a comprehensive multi-database search strategy, querying MeSH for cardiovascular disease terms, then searching ChEMBL for related drug targets, and executing complex SPARQL queries to retrieve detailed bioactivity data.
search_mesh_entity - Found cardiovascular disease MeSH terms
search_chembl_target - Searched for cardiovascular drug targets
get_MIE_file - Retrieved ChEMBL database schema
run_sparql - Executed complex queries across ChEMBL database
"Determine the distribution of human enzymes by their EC number classes."
The system retrieved the UniProt database schema and executed targeted SPARQL queries to extract and analyze all human enzymes with EC number classifications from Swiss-Prot.
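The tallying step of this analysis can be sketched in a few lines of Python. This is a minimal illustration, not the code TogoMCP runs: it assumes the EC numbers have already been extracted from a run_sparql result as plain strings, and the function name `tally_ec_classes` is hypothetical. The seven top-level class names follow the standard Enzyme Commission nomenclature.

```python
from collections import Counter

# Top-level EC classes from the Enzyme Commission nomenclature.
EC_CLASSES = {
    "1": "Oxidoreductases",
    "2": "Transferases",
    "3": "Hydrolases",
    "4": "Lyases",
    "5": "Isomerases",
    "6": "Ligases",
    "7": "Translocases",
}


def tally_ec_classes(ec_numbers):
    """Count EC numbers by their top-level class (the first dot-separated field)."""
    counts = Counter()
    for ec in ec_numbers:
        top = ec.strip().split(".")[0]
        counts[EC_CLASSES.get(top, "Unknown")] += 1
    return dict(counts)


# Illustrative input only; real values would come from the query result.
example = ["1.1.1.1", "2.7.11.1", "2.7.10.2", "3.4.21.4"]
print(tally_ec_classes(example))
```

Partial EC numbers such as `3.4.-.-` are still counted under their top-level class, because only the first field is inspected.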
get_MIE_file - Retrieved UniProt database schema
run_sparql - Queried UniProt for human enzymes with EC numbers
"Use ChEBI (chemical classification) + ChEMBL (bioactivity) + PubChem (chemical descriptors) to compare the bioactivity profiles and chemical diversity of natural products versus synthetic compounds across different therapeutic areas."
The system coordinated queries across three major chemical databases, retrieving schemas, executing cross-database SPARQL queries, and obtaining detailed molecular descriptors to perform a comprehensive comparative analysis.
get_MIE_file - Retrieved schemas for ChEBI, ChEMBL, and PubChem
run_sparql - Executed cross-database queries
get_compound_attributes_from_pubchem - Retrieved molecular descriptors
TogoMCP can be connected to various AI platforms. Choose your platform below:
https://togomcp.rdfportal.org/mcp
claude_desktop_config.json
{
"mcpServers": {
"togomcp": {
"command": "npx",
"args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"]
}
}
}
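If claude_desktop_config.json already lists other servers, the togomcp entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, assuming the file has been loaded into a Python dict. The helper name `add_togomcp` and the `other` server entry are hypothetical; only the togomcp entry itself comes from the configuration above.

```python
import json

# The server entry exactly as given in the configuration above.
TOGOMCP_ENTRY = {
    "command": "npx",
    "args": ["-y", "mcp-remote", "https://togomcp.rdfportal.org/mcp"],
}


def add_togomcp(config):
    """Return a copy of the config dict with the togomcp server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["togomcp"] = TOGOMCP_ENTRY
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_togomcp(existing), indent=2))
```

The function copies the input instead of mutating it, so the original configuration can be kept as a backup before the result is written to disk.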
https://togomcp.rdfportal.org/mcp
Note: Developer mode requires confirmation for write actions and shows tool usage explicitly for safety.
TogoMCP uses Streamable HTTP transport (not SSE). Configure your settings.json as follows:
{
"mcpServers": {
"togomcp": {
"httpUrl": "https://togomcp.rdfportal.org/mcp"
}
}
}
For more information, see the Gemini CLI documentation.
TogoMCP provides access to over 20 major life sciences databases covering proteins, genes, chemicals, diseases, pathways, and more:
Comprehensive protein sequence and functional information. Contains 444M proteins with Swiss-Prot (923K curated) and TrEMBL annotations, including sequences, functions, domains, structures, and cross-references to 200+ databases.
Public database of chemical molecules with 119M compounds, 339M substances, 1.7M bioassays, molecular descriptors, bioactivity data, and extensive cross-references to genes, proteins, pathways, and diseases.
Manually curated database of bioactive molecules containing 2.4M+ compounds, 1.6M assays, 20M bioactivity measurements. Essential for drug discovery with compound-target-activity relationships and mechanism of action data.
3D structural data for proteins, nucleic acids, and complexes from X-ray, NMR, and cryo-EM with 204K+ entries. Includes experimental methods, resolution data, and cross-references to UniProt and EMDB.
Chemical Entities of Biological Interest ontology with 217,000+ entities including small molecules, atoms, ions, and functional groups. Provides hierarchical classification, molecular data, and biological roles.
Open-source curated knowledgebase of biological pathways with 22,000+ pathways across 30+ species, molecular interactions, biochemical reactions, protein complexes, and disease associations.
Expert-curated database of 17,078 biochemical reactions, all atom-balanced and chemically annotated. Linked to ChEBI compounds, enzyme classifications (EC numbers), and metabolic pathway databases.
Controlled vocabulary for gene and gene product attributes across all organisms. Covers biological processes (30,804 terms), molecular functions (12,793 terms), and cellular components (4,568 terms).
Comprehensive genomics database providing genome annotations for 100+ species. Contains genes, transcripts, proteins, and exons with genomic locations and cross-references to UniProt, HGNC, and OMIM.
Gene database with 57M+ entries covering protein-coding genes, ncRNAs, and pseudogenes across all organisms. Includes gene symbols, chromosomal locations, and orthology relationships.
Aggregates genomic variation and human health relationships with 3.5M+ variant records, clinical interpretations, gene associations, and disease conditions. Cross-referenced to MedGen and OMIM.
NCBI's portal for medical genetics with over 233,000 clinical concepts covering diseases, phenotypes, and clinical findings. Integrates data from OMIM, Orphanet, HPO, and MONDO.
Monarch Disease Ontology integrating multiple disease databases into a unified classification. Provides cross-references to OMIM, Orphanet, DOID, MESH, ICD, and 35+ databases.
Medical Subject Headings - NLM's controlled vocabulary thesaurus for biomedical literature indexing. Hierarchical structure with descriptors, qualifiers, and supplementary records across 16 main categories.
Comprehensive biological taxonomic classification covering 3M+ organisms from bacteria to mammals with hierarchical relationships, scientific/common names, and genetic code assignments.
Bibliographic information for biomedical literature from MEDLINE with publication metadata, abstracts, authors, MeSH annotations, and cross-references to external databases.
Biomedical entity annotations extracted from PubMed using text mining. Contains Disease and Gene annotations linked to articles, enabling literature-based knowledge discovery.
Bacterial Diversity Metadatabase with 97,000+ strain records covering taxonomy, morphology, physiology, cultivation conditions, and molecular data for bacteria and archaea.
Comprehensive culture media database from DSMZ with 3,289 standardized recipes for bacteria, archaea, fungi, and microalgae, including ingredients and growth conditions.
Japanese intractable (rare) diseases ontology with 2,777 disease classes, multilingual labels, and cross-references to international disease ontologies and medical documentation.
DNA Data Bank of Japan providing nucleotide sequence data with genomic annotations, organism metadata, taxonomic classification, and functional annotations.
Comprehensive glycoscience portal integrating glycan structures, glycoproteins, glycosylation sites, glycogenes, and lectin-glycan interactions for multi-species glycobiology research.
Integrates antimicrobial resistance (AMR) surveillance data with 1.7M+ phenotypic susceptibility tests and 1.1M+ genotypic AMR features from bacterial isolates worldwide.
TogoMCP provides a comprehensive set of tools for accessing and querying life sciences data:
List all available databases with descriptions
Get metadata file with ShEx schema, RDF and SPARQL examples for a database
Get available SPARQL endpoints for RDF Portal
Get example SPARQL query for a specific database
Get list of named graphs in a database
Search UniProt proteins by name, description, or disease associations
Search ChEMBL molecules and compounds
Search ChEMBL protein targets
Search PDBj entries (structures, chemical components, peptides)
Search Reactome pathways and reactions
Search Rhea biochemical reactions
Search MeSH medical concepts and terms
Execute custom SPARQL queries on any RDF database. Use get_MIE_file first to understand the database schema.
Convert IDs between databases (e.g., UniProt to ChEMBL, PubChem to ChEBI)
Count how many IDs can be converted between databases
Get all dataset configurations
Get specific dataset configuration
Get all conversion relationships
Get relationship details between two databases
Search NCBI databases (Gene, Taxonomy, ClinVar, MedGen, PubMed, PubChem)
Get summary information for NCBI IDs
Fetch full records (sequences, data, etc.)
List all supported NCBI databases
Get PubChem compound ID from compound name
Get detailed compound attributes and molecular descriptors
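MCP clients invoke tools like these by sending JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds the two requests behind the recommended "get_MIE_file first, then run_sparql" sequence. The tool names come from this page, but the argument names (`dbname`, `sparql_query`) are assumptions made for illustration; the real parameter names should be read from the schema the server advertises for each tool.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 'tools/call' request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are hypothetical placeholders, not confirmed parameters.
schema_req = tool_call("get_MIE_file", {"dbname": "uniprot"})
query_req = tool_call(
    "run_sparql",
    {"dbname": "uniprot", "sparql_query": "SELECT * WHERE { ?s ?p ?o } LIMIT 5"},
)

print(json.dumps(schema_req, indent=2))
print(json.dumps(query_req, indent=2))
```

In practice an MCP client library or the AI platform builds and sends these messages, so the sketch is mainly useful for reading logs or debugging a connection.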
TogoMCP works well in combination with these complementary MCP servers:
PubDictionaries provides text annotation services for biomedical literature. It helps identify and map biological entities (genes, proteins, diseases, chemicals) in text. When combined with TogoMCP, you can annotate literature and then retrieve detailed information about the identified entities from life sciences databases.
Use Case: Analyze research papers to extract entity mentions, then use TogoMCP to retrieve comprehensive data about those entities from UniProt, ChEMBL, or other databases.
Information: Claude Support Article
The PubMed MCP server provides access to the world's largest biomedical literature database. It enables searching for articles, retrieving metadata, finding related papers, and accessing full-text content from PubMed Central.
Use Case: Search PubMed for relevant literature, then use TogoMCP to retrieve detailed molecular and pathway information about entities mentioned in those papers.
The Ontology Lookup Service (OLS4) provides access to biomedical ontologies from EMBL-EBI. It helps standardize terminology and understand hierarchical relationships between biological concepts.
Use Case: Use OLS4 to explore ontology terms and their relationships, then query TogoMCP databases using standardized ontology identifiers for precise data retrieval.
Website: https://rdfportal.org
The RDF Portal, formally known as the NBDC RDF Portal, is a comprehensive repository for semantic data in life sciences developed by DBCLS and the National Bioscience Database Center (NBDC) in Japan. It hosts over 21 RDF datasets comprising more than 45.5 billion triples, all reviewed by NBDC to ensure interoperability and queryability.
The portal is built on Semantic Web technologies using the Resource Description Framework (RDF) and SPARQL query language. It provides a unified interface for querying diverse life science databases that have been converted to RDF format following the DBCLS guidelines for RDFizing databases.
Key Features: SPARQL endpoint for querying multiple databases simultaneously, quality-reviewed datasets ensuring data interoperability, regular updates to maintain currency with source databases, and integration of fundamental databases like UniProt, PDB, PubChem, and Ensembl in RDF format.
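SPARQL endpoints commonly return SELECT results in the W3C SPARQL 1.1 Query Results JSON Format, where each row is a "binding" of variables to typed values. As a minimal sketch, assuming a response in that standard format has already been fetched, the helper below (the name `bindings_to_rows` is hypothetical) flattens it into plain dictionaries. The sample document uses example.org placeholders, not real portal data.

```python
import json


def bindings_to_rows(sparql_json):
    """Flatten a SPARQL 1.1 JSON result document into a list of plain dicts."""
    variables = sparql_json["head"]["vars"]
    rows = []
    for binding in sparql_json["results"]["bindings"]:
        # Variables left unbound in a row (e.g. by OPTIONAL) are absent
        # from the binding, so they are reported as None.
        rows.append(
            {v: (binding[v]["value"] if v in binding else None) for v in variables}
        )
    return rows


# Placeholder result document in the standard format.
sample = json.loads("""
{
  "head": {"vars": ["s", "label"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/a"},
     "label": {"type": "literal", "value": "A"}},
    {"s": {"type": "uri", "value": "http://example.org/b"}}
  ]}
}
""")
print(bindings_to_rows(sample))
```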
Website: https://togoid.dbcls.jp
TogoID is an identifier (ID) conversion service developed by DBCLS that bridges biological datasets by linking IDs across 65+ diverse life science databases. Unlike traditional ID converters, TogoID expands the concept of "ID conversion" to include not only same-entity mappings but also cross-category relationships with semantic annotations.
For example, TogoID can convert disease IDs to related gene IDs, or glycan IDs to protein IDs, distinguishing relationships like "glycans bind to proteins" versus "glycans are processed by proteins." The service features a user-friendly web interface for exploratory multi-step conversions, showing biological meanings and ID count changes at each step.
Key Features: Multi-step ID conversion across different biological categories, semantic representation of biological relationships between IDs, weekly automatic data updates, API for programmatic access, label-to-ID and ID-to-label conversion capabilities, and open development model accepting ID pair addition requests.
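The idea of a multi-step conversion with ID counts reported at each step can be illustrated with a small, self-contained sketch. It does not call the TogoID API: the mapping tables are made-up placeholders (`D1`, `G1`, `P1`, and so on) and the function names are hypothetical. It only shows how chaining one-to-many mappings changes the number of IDs along a route.

```python
def convert(ids, mapping):
    """Map each source ID to its targets, de-duplicating while keeping order."""
    seen, out = set(), []
    for source_id in ids:
        for target in mapping.get(source_id, []):
            if target not in seen:
                seen.add(target)
                out.append(target)
    return out


def convert_route(ids, steps):
    """Apply conversions in sequence and record the ID count after each step."""
    counts = [len(ids)]
    for mapping in steps:
        ids = convert(ids, mapping)
        counts.append(len(ids))
    return ids, counts


# Placeholder mapping tables standing in for disease -> gene -> protein links.
disease_to_gene = {"D1": ["G1", "G2"], "D2": ["G2"]}
gene_to_protein = {"G1": ["P1"], "G2": ["P2", "P3"]}

final_ids, counts = convert_route(["D1", "D2"], [disease_to_gene, gene_to_protein])
print(final_ids, counts)
```

Source IDs with no entry in a mapping simply drop out of the route, which is one reason the counts can shrink as well as grow between steps.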
Website: https://dbcls.rois.jp
The Database Center for Life Science (DBCLS) is a Japanese research institute and part of the Research Organization of Information and Systems (ROIS), founded in 2007. DBCLS conducts fundamental research and development for database integration technologies in life sciences, with a focus on making diverse biological data more accessible and usable.
DBCLS has been organizing the annual BioHackathon event since 2008 (co-organizing with NBDC since 2011), fostering collaboration among developers to improve the integration, preservation, and utilization of life science databases. The center also hosts monthly SPARQLthon events to promote Semantic Web applications and share technical knowledge.
Research Areas: Development of RDF integration technologies using Semantic Web standards, construction of integrated database environments and distributed database systems, creation of user-friendly web services and tools for database access, development of resources like TogoTV (video-based bioinformatics tutorials), and support for database development and data-driven life sciences research.