Web Service API:

You can programmatically access pathway data via the Web Service API. This page provides a reference guide to help you get started.

[1] Command: search

Summary:

Searches all physical entity records (e.g. proteins and small molecules) by keyword, name or external identifier. For example, retrieve a list of all physical entity records that contain the word "BRCA2". The response contains a summary list of the top 10 physical entity matches. For each match, detailed information is provided, including name, synonyms, external references and participating pathways. This command enables third-party software to implement direct query interfaces to Pathway Commons, and to link physical entities to known pathways.

Parameters:

  • [Required] cmd=search
  • [Required] version=2.0
  • [Required] q= a keyword, name or external identifier.
  • [Required] output = xml
  • [Optional] organism = organism filter. Must be specified as an NCBI Taxonomy identifier, e.g. 9606 for human.

Output:

An XML file which follows the SearchResponse.xsd XML Schema [Full Documentation].

Example Query:

Below is an example query. Note: this query is not guaranteed to return results.
webservice.do?version=2.0&q=BRCA2&output=xml&cmd=search
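
As a rough illustration, the search query above could be assembled in Python as follows. This is a minimal sketch: BASE_URL is a hypothetical placeholder (this page only gives the relative path webservice.do), and build_search_url is a helper name invented for the example.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace with the real location of webservice.do.
BASE_URL = "http://example.org/pc/webservice.do"


def build_search_url(keyword, organism=None, base_url=BASE_URL):
    """Return a search URL built from the parameters documented above."""
    params = {"cmd": "search", "version": "2.0", "q": keyword, "output": "xml"}
    if organism is not None:
        # Optional organism filter, given as an NCBI Taxonomy ID (9606 = human).
        params["organism"] = str(organism)
    return base_url + "?" + urlencode(params)


print(build_search_url("BRCA2", organism=9606))
```

urlencode takes care of escaping keywords that contain spaces or other reserved characters.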

[2] Command: get_pathways

Summary:

Retrieves all pathways involving a specified physical entity (e.g. protein or small molecule). For example, get all pathways involving BRCA2. Output is a tab-delimited text file, designed for easy parsing in your favorite scripting language, such as Perl or Python.

Parameters:

  • [Required] cmd=get_pathways
  • [Required] version=2.0
  • [Required] q= a comma separated list of internal or external identifiers (IDs), used to identify the physical entities of interest. For example, to look up two distinct proteins by their UniProt IDs, use the following query: O14763, P55957. To prevent system overload, clients are currently restricted to a maximum of 25 IDs.
  • [Optional] input_id_type= internal or external database. For example, to use UniProt IDs, set input_id_type=UNIPROT. See the valid values for input_id_type parameter below. If not specified, the internal CPATH_ID is assumed.
  • [Optional] data_source = a comma separated list of pathway data sources to search. For example, the following restricts your results to Reactome pathways only: data_source=REACTOME. See the valid values for data_source parameter below. If not specified, all pathway data sources will be searched.
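
Because q accepts at most 25 IDs, a client with a longer list has to split it into several requests. The sketch below shows one way to do that. BASE_URL and get_pathways_urls are hypothetical names chosen for this example, not part of the service.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace with the real location of webservice.do.
BASE_URL = "http://example.org/pc/webservice.do"
MAX_IDS_PER_REQUEST = 25  # limit stated in the description of the q parameter


def get_pathways_urls(ids, input_id_type=None, data_sources=None, base_url=BASE_URL):
    """Return one get_pathways URL per batch of at most 25 identifiers."""
    urls = []
    for start in range(0, len(ids), MAX_IDS_PER_REQUEST):
        batch = ids[start:start + MAX_IDS_PER_REQUEST]
        params = {"cmd": "get_pathways", "version": "2.0", "q": ",".join(batch)}
        if input_id_type:
            params["input_id_type"] = input_id_type
        if data_sources:
            params["data_source"] = ",".join(data_sources)
        urls.append(base_url + "?" + urlencode(params))
    return urls


for url in get_pathways_urls(["O14763", "P55957"], input_id_type="UNIPROT",
                             data_sources=["REACTOME"]):
    print(url)
```

Note that urlencode escapes the commas in q as %2C, which is an equivalent encoding of the same comma separated list.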

Output:

Output is a tab-delimited text file with four data columns:
  • Database:ID: External database identifier. For example, UNIPROT:O14763.
  • Pathway_Name: Pathway name.
  • Pathway_Database_Name: Pathway database name. For example, REACTOME.
  • CPATH_ID: Internal ID, used to uniquely identify the pathway. These IDs can be used to create links to each pathway. For example: record2.do?id=1. However, please note that these internal IDs (and any links created with them) are not stable, and may change with each new data release.

Detecting matches:

  • If a specified identifier can't be found, the second column of the tab-delimited text file will contain the keyword: PHYSICAL_ENTITY_ID_NOT_FOUND.
  • On the other hand, if a match is found for a specified identifier, but the corresponding physical entity is not involved in any known pathways or in any of the requested pathway databases, the second column of the tab-delimited text file will contain the keyword: NO_PATHWAY_DATA.

Example Query:

Below is an example query. Note: this query is not guaranteed to return results.
webservice.do?cmd=get_pathways&version=2.0&q=O14763&input_id_type=UNIPROT
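
The four-column output and the two keywords described above can be handled with a few lines of Python. This is a sketch under stated assumptions: the sample rows are invented for illustration (they are not real service output), and the check for a header row is a guess, since this page does not say whether one is sent.

```python
NOT_FOUND = "PHYSICAL_ENTITY_ID_NOT_FOUND"
NO_DATA = "NO_PATHWAY_DATA"


def parse_get_pathways(text):
    """Split get_pathways output into pathway hits, unknown IDs, and IDs without pathways."""
    hits, not_found, no_data = [], [], []
    for line in text.splitlines():
        fields = line.split("\t")
        # Skip blank lines and an assumed (undocumented) header row.
        if not fields[0] or fields[0] == "Database:ID":
            continue
        entity_id = fields[0]
        second = fields[1] if len(fields) > 1 else ""
        if second == NOT_FOUND:
            not_found.append(entity_id)
        elif second == NO_DATA:
            no_data.append(entity_id)
        else:
            # Pad short rows so the three remaining columns always exist.
            name, database, cpath_id = (fields[1:4] + ["", "", ""])[:3]
            hits.append({"id": entity_id, "pathway": name,
                         "database": database, "cpath_id": cpath_id})
    return hits, not_found, no_data


# Invented sample rows, for illustration only.
SAMPLE = (
    "UNIPROT:O14763\tExample pathway name\tREACTOME\t1\n"
    "UNIPROT:FAKE01\tPHYSICAL_ENTITY_ID_NOT_FOUND\n"
    "UNIPROT:FAKE02\tNO_PATHWAY_DATA\n"
)

hits, not_found, no_data = parse_get_pathways(SAMPLE)
print(hits, not_found, no_data)
```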

[3] Command: get_neighbors

Summary:

Retrieves the nearest neighbors of a given physical entity (e.g. gene, protein or small molecule). For example, get all the neighbors of BRCA2. The following rules govern the construction of the neighborhood:

  • if A is part of a [complex] (A:B), (A:B) is included in the neighborhood, but none of the interactions involving (A:B) are included.
  • if A is a [CONTROLLER] for a [control] interaction, the reaction that is [CONTROLLED] (and all the participants in that reaction) are included in the neighborhood.
  • if A participates in a [conversion] reaction, and this reaction is [CONTROLLED] by another interaction, the [control] interaction (plus its [CONTROLLER]) are included in the neighborhood.

Parameters:

    • [Required] cmd = get_neighbors
    • [Required] version = 3.0
    • [Required] q = an internal or external identifier (ID), corresponding to the physical entity of interest. For example, the following query uses a UniProt identifier: O14763.
    • [Optional] input_id_type= internal or external database. For example, to use UniProt IDs, set input_id_type=UNIPROT. See the valid values for input_id_type parameter below. If not specified, the internal CPATH_ID is assumed.
    • [Optional] output = biopax (default), id_list, binary_sif, image_map, image_map_thumbnail, or image_map_frameset.
      • biopax: client will receive a complete BioPAX representation of the neighborhood.
      • id_list: client will receive a list of all physical entities in the neighborhood (see below).
      • binary_sif: client will receive a text file in the Simple Interaction Format (SIF).
      • image_map: client will receive a PNG image representing the neighborhood.
      • image_map_thumbnail: client will receive a thumbnail-size version of the PNG image.
      • image_map_frameset: client will receive an HTML frameset that contains both a neighborhood image and a legend.
    • [Optional] output_id_type = internal or external database. This option is only valid when the output parameter has been set to id_list or binary_sif and is used to specify which external identifiers should be used to identify the physical entities in the neighborhood. For example, to output UniProt IDs, use: UNIPROT. See the valid values for output_id_type parameter below. If not specified, the internal CPATH_ID is assumed.
    • [Optional] data_source = a comma separated list of pathway data sources that you want to search. For example, the following restricts your results to Reactome pathways only: data_source=REACTOME. See the valid values for data_source parameter below. If not specified, all pathway data sources will be searched.
    • [Optional] binary_interaction_rule = a comma separated list of binary interaction rules that are applied when binary interactions are requested. This parameter is only relevant when the output parameter is set to binary_sif. See Exporting to the Simple Interaction Format (SIF) for details.
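
The parameter rules above can be checked on the client before a request is sent. The following sketch validates the output value and enforces that output_id_type is only combined with id_list or binary_sif. BASE_URL and build_get_neighbors_url are hypothetical names used for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL: replace with the real location of webservice.do.
BASE_URL = "http://example.org/pc/webservice.do"

VALID_OUTPUTS = {"biopax", "id_list", "binary_sif",
                 "image_map", "image_map_thumbnail", "image_map_frameset"}
OUTPUTS_WITH_ID_TYPE = {"id_list", "binary_sif"}


def build_get_neighbors_url(q, output=None, input_id_type=None,
                            output_id_type=None, data_sources=None,
                            base_url=BASE_URL):
    """Return a get_neighbors URL, rejecting combinations the page describes as invalid."""
    if output is not None and output not in VALID_OUTPUTS:
        raise ValueError("unknown output value: %r" % output)
    if output_id_type and output not in OUTPUTS_WITH_ID_TYPE:
        raise ValueError("output_id_type requires output=id_list or output=binary_sif")
    params = {"cmd": "get_neighbors", "version": "3.0", "q": q}
    if output is not None:
        params["output"] = output  # omitted means the server default (biopax)
    if input_id_type:
        params["input_id_type"] = input_id_type
    if output_id_type:
        params["output_id_type"] = output_id_type
    if data_sources:
        params["data_source"] = ",".join(data_sources)
    return base_url + "?" + urlencode(params)


print(build_get_neighbors_url("O14763", output="id_list",
                              input_id_type="UNIPROT", output_id_type="UNIPROT"))
```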

    Output:

    A complete BioPAX representation of the network neighborhood for the given physical entity (default) or a simple text file that lists all the physical entities in the neighborhood. The output format is selected by setting the output parameter. See the get_neighbors command parameter list for more information. The simple text file (output=id_list) contains three tab-delimited columns of data:
    • Record Name: Physical Entity name.
    • CPATH_ID: Internal cPath ID, used to uniquely identify the physical entity. These IDs can be used to create links to each record. For example: record2.do?id=1. However, please note that these internal IDs (and any links created with them) are not stable, and may change with each new data release.
    • Database:ID: External database identifier. For example, UNIPROT:O14763.

    Detecting matches:

    • In the case of id_list output, if we are unable to find an external database identifier for a specified record, the third column of the tab-delimited text file will contain the keyword: NOT_SPECIFIED. In the case of binary_sif output, if we are unable to find an external database identifier for a specified record, the external identifier will be set to: NOT_SPECIFIED.
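
For id_list output, the three columns and the NOT_SPECIFIED keyword could be read as sketched below. The sample rows are invented for illustration, and skipping a header row is an assumption, since this page does not say whether one is included.

```python
NOT_SPECIFIED = "NOT_SPECIFIED"


def parse_id_list(text):
    """Parse id_list output into dicts; external IDs reported as NOT_SPECIFIED become None."""
    records = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # blank or malformed line
        name, cpath_id, external = fields[0], fields[1], fields[2]
        if cpath_id == "CPATH_ID":
            continue  # assumed (undocumented) header row
        if external == NOT_SPECIFIED:
            database, identifier = None, None
        else:
            # "UNIPROT:O14763" -> ("UNIPROT", "O14763")
            database, _, identifier = external.partition(":")
        records.append({"name": name, "cpath_id": cpath_id,
                        "database": database, "external_id": identifier})
    return records


# Invented sample rows, for illustration only.
SAMPLE = (
    "Example protein A\t101\tUNIPROT:O14763\n"
    "Example protein B\t102\tNOT_SPECIFIED\n"
)

for record in parse_id_list(SAMPLE):
    print(record)
```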

    Example Query:

    Below is an example query. Note: this query is not guaranteed to return results.
    webservice.do?version=3.0&cmd=get_neighbors&q=9854

    [4] Command: get_parents

    Summary:

    Retrieves a summary of all records which contain or reference the specified record. For example, assume that internal ID 145 refers to the BRCA2 gene. If you request get_parents for this ID, you will receive a summary list of all interactions and complexes that include BRCA2.

    Parameters:

    • [Required] cmd=get_parents
    • [Required] version=2.0
    • [Required] q= an internal identifier, used to identify the physical entity or interaction of interest.
    • [Required] output = xml

    Output:

    An XML file which follows the SummaryResponse.xsd XML Schema [Full Documentation].

    Example Query:

    Below is an example query. Note: this query is not guaranteed to return results.
    webservice.do?version=2.0&q=45202&output=xml&cmd=get_parents

    [5] Command: get_record_by_cpath_id

    Summary:

    Retrieves details regarding one or more records, such as a pathway, interaction or physical entity. For example, get the complete Apoptosis pathway from Reactome.

    Parameters:

    • [Required] cmd=get_record_by_cpath_id
    • [Required] version=2.0
    • [Required] q= a comma delimited list of internal identifiers, used to identify the pathways, interactions or physical entities of interest.
    • [Required] output = biopax, binary_sif, gsea, or pc_gene_set.
      • biopax: client will receive a complete BioPAX representation of the desired record.
      • binary_sif: client will receive a text file in the Simple Interaction Format (SIF).
      • gsea: client will receive a tab-delimited text file in the file format specified by the Broad Molecular Signature Database. By default, this will output the cPath IDs of all genes within a pathway. To output different identifiers, such as gene symbols or UniProt identifiers, you must specify an output_id_type parameter (see below).
      • pc_gene_set: client will receive a tab-delimited text file similar to the gsea format (see above), except that all participants are micro-encoded with multiple identifiers. Each participant is specified as: CPATH_ID:RECORD_TYPE:NAME:UNIPROT_ACCESSION:GENE_SYMBOL:ENTREZ_GENE_ID.
    • [Optional] output_id_type = internal or external database. This option is only valid when the output parameter has been set to binary_sif or gsea. It specifies which external identifiers to use for physical entities. For example, to output UniProt IDs, use: UNIPROT. See the valid values for output_id_type parameter below. If not specified, the internal CPATH_ID is assumed.
    • [Optional] binary_interaction_rule = a comma separated list of binary interaction rules that are applied when binary interactions are requested. This parameter is only relevant when the output parameter is set to binary_sif. See Exporting to the Simple Interaction Format (SIF) for details.

    Detecting matches:

    • In the case of binary_sif, gsea, and pc_gene_set output, if we are unable to find an attribute for a specified record, such as a gene symbol or UNIPROT identifier, we will output the field as: NOT_SPECIFIED.
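
A participant in pc_gene_set output packs six colon-separated fields into one token. The sketch below decodes a single token and maps NOT_SPECIFIED to None. It assumes that only the NAME field may itself contain a colon, which this page does not state. The sample token and the record type value PROTEIN are invented for illustration.

```python
from typing import NamedTuple, Optional


class Participant(NamedTuple):
    cpath_id: str
    record_type: str
    name: str
    uniprot_accession: Optional[str]
    gene_symbol: Optional[str]
    entrez_gene_id: Optional[str]


def _clean(value):
    """Map the NOT_SPECIFIED keyword to None."""
    return None if value == "NOT_SPECIFIED" else value


def parse_participant(token):
    """Decode one micro-encoded participant; raises ValueError if fields are missing."""
    # Take the first two fields from the left and the last three from the
    # right, so that a colon inside the name (an assumption) is tolerated.
    cpath_id, record_type, rest = token.split(":", 2)
    name, uniprot, symbol, entrez = rest.rsplit(":", 3)
    return Participant(cpath_id, record_type, name,
                       _clean(uniprot), _clean(symbol), _clean(entrez))


# Invented token, for illustration only.
print(parse_participant("101:PROTEIN:Example protein:P00000:EXMPL:NOT_SPECIFIED"))
```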

    Example Query:

    Below is an example query. Note: this query is not guaranteed to return results.
    webservice.do?cmd=get_record_by_cpath_id&version=2.0&q=1&output=biopax

    [6] Additional Parameter Details:

    Valid values for the input_id_type parameter:

    • UNIPROT
    • CPATH_ID
    • ENTREZ_GENE
    • GENE_SYMBOL

    Valid values for the output_id_type parameter:

    • UNIPROT
    • CPATH_ID
    • ENTREZ_GENE
    • GENE_SYMBOL

    Valid values for the data_source parameter:

    • BIOGRID
    • CELL_MAP
    • HPRD
    • HUMANCYC
    • IMID
    • INTACT
    • MINT
    • NCI_NATURE
    • REACTOME

    [7] Error Codes:

    An error while processing a request is reported as an XML document with information about the error cause in the following format:

    <error>
        <error_code>[ERROR_CODE]</error_code>
        <error_msg>[ERROR_DESCRIPTION]</error_msg>
        <error_details>[ADDITIONAL_ERROR_DETAILS]</error_details>
    </error>
    

    Only the first error encountered is reported. The table below provides a list of error codes, with their descriptions.

    Error Code    Error Description
    450           Bad Command (command not recognized)
    452           Bad Request (missing arguments)
    453           Bad Request (invalid arguments)
    460           No Results Found
    470           Version not supported
    500           Internal Server Error
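
Since several commands return non-XML output on success, a client may want to test each response for the error document first. The sketch below uses Python's standard xml.etree.ElementTree and assumes the error document has no XML namespace, which this page does not specify. parse_error is a hypothetical helper name, and the sample document was written by hand to follow the format shown above.

```python
import xml.etree.ElementTree as ET

# Error codes and descriptions as listed in the table above.
ERROR_CODES = {
    450: "Bad Command (command not recognized)",
    452: "Bad Request (missing arguments)",
    453: "Bad Request (invalid arguments)",
    460: "No Results Found",
    470: "Version not supported",
    500: "Internal Server Error",
}


def parse_error(response_text):
    """Return a dict describing the error, or None if the response is not an error document."""
    try:
        root = ET.fromstring(response_text)
    except ET.ParseError:
        return None  # not XML at all, e.g. tab-delimited output
    if root.tag != "error":
        return None  # some other XML document, e.g. BioPAX
    code_text = root.findtext("error_code", default="").strip()
    code = int(code_text) if code_text.isdigit() else None
    return {
        "code": code,
        "description": ERROR_CODES.get(code),
        "message": root.findtext("error_msg", default="").strip(),
        "details": root.findtext("error_details", default="").strip(),
    }


# Hand-written sample that follows the documented format.
SAMPLE = """<error>
    <error_code>460</error_code>
    <error_msg>No Results Found</error_msg>
    <error_details>Example details</error_details>
</error>"""

print(parse_error(SAMPLE))
```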