"Pathway Commons" Web Service 12

"Pathway Commons"

Pathway Commons integrates a number of pathway and molecular interaction databases that support the BioPAX and PSI-MI formats into one large BioPAX model, which can be queried using our web API (documented below). Computational biologists can use this API to download custom subsets of pathway data for analysis, and developers can use it to add biological pathway and network information retrieval and query functionality to websites and software. For those looking for comprehensive biological pathway data, we also make data archives available in several formats. To avoid being banned, try not to exceed ten concurrent connections, or several hits per second, from one IP address. We can add capacity based on demand. For more information and help, please visit our homepage at www.pathwaycommons.org. Feel free to tell us more about yourself and your project.
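A client can enforce the usage limits above on its own side. The following is a minimal sketch, not part of the service: the Throttle class is hypothetical, and the 0.5 second spacing is an assumed reading of "several hits per second", which the text does not quantify.

```python
import threading
import time


class Throttle:
    """Caps concurrent requests and spaces out request starts (client side)."""

    def __init__(self, max_concurrent=10, min_interval=0.5,
                 clock=time.monotonic, sleep=time.sleep):
        # max_concurrent=10 follows the documented limit; min_interval is an assumption.
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_start = 0.0

    def __enter__(self):
        self._slots.acquire()  # blocks while max_concurrent requests are in flight
        with self._lock:
            now = self._clock()
            wait = max(0.0, self._next_start - now)
            # Reserve the next start time so that starts are min_interval apart.
            self._next_start = max(now, self._next_start) + self._min_interval
        if wait > 0:
            self._sleep(wait)
        return self

    def __exit__(self, exc_type, exc, tb):
        self._slots.release()
        return False
```

Wrapping each HTTP call in `with throttle:` (one shared instance per process) keeps both limits in one place.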

RESTful Web Service API

Core Commands

To query the integrated biological pathway database, application developers can use the following commands:

  • SEARCH
  • GET
  • GRAPH
  • TRAVERSE
  • TOP_PATHWAYS

Please check the availability terms of contributing databases.
Please see also the Swagger auto-generated documentation.

Notes

About URIs and IDs

The 'source', 'uri', and 'target' parameters require URIs of existing BioPAX elements, which are either standard Identifiers.org URLs (for most canonical biological entities and controlled vocabularies) or "Pathway Commons" generated http://pathwaycommons.org/pc12/<localID> URLs (for most BioPAX Entities and Xrefs). BioPAX object URIs used by this service are not easy to guess; they should be discovered using web service commands, such as search or top_pathways, or from our archive files. For example, even if you know the PC URI namespace http://pathwaycommons.org/pc12/ or an Identifiers.org prefix, do not build a query by guessing a URI such as '/get?uri=http://pathwaycommons.org/pc12/foo' or '/get?uri=http://identifiers.org/something/foo', respectively, unless that BioPAX individual exists; search for existing objects of interest first. However, the following identifiers are also accepted in place of full URIs in get and graph queries: HUGO gene symbols and SwissProt, RefSeq, Ensembl, and NCBI Gene (positive integer) IDs; and ChEBI, ChEMBL, KEGG Compound, DrugBank, PharmGKB Drug, and PubChem Compound or Substance IDs (PubChem IDs must be prefixed with 'CID:' or 'SID:' to distinguish them from each other and from NCBI Gene IDs). As a rule of thumb, a full URI makes a precise query, whereas an identifier makes a more exploratory one that depends on full-text search (index) and id-mapping.
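The distinction between full URIs and bare identifiers can be checked on the client before a query is sent. This is a small sketch under stated assumptions: the helper names are hypothetical, and the number 2244 is only a placeholder ID.

```python
from urllib.parse import urlparse


def is_absolute_uri(value):
    """True for http(s) URIs, which the service treats as precise references."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def pubchem_id(kind, number):
    """Prefix a PubChem number so it cannot be mistaken for an NCBI Gene ID."""
    prefixes = {"compound": "CID:", "substance": "SID:"}
    return prefixes[kind] + str(int(number))


def query_style(value):
    """Describe how the service is expected to resolve the value."""
    return "precise" if is_absolute_uri(value) else "exploratory"
```

For instance, `query_style("COL5A1")` gives "exploratory", because a gene symbol is resolved through full-text search and id-mapping rather than matched as a URI.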


SEARCH:

A full-text search in this BioPAX database using the Lucene query syntax. The index fields (case-sensitive) uri, keyword, name, pathway, xrefid, datasource, and organism can optionally be used in a query string; some of these are BioPAX properties, while others are composite relationships. For example, the pathway index field helps find pathway participants by keywords that match their parent pathway names or identifiers; xrefid finds objects by matching their direct or 'attached to a child element' Xrefs; keyword, the default search field, is a large aggregate that includes all BioPAX properties of an element and its nested elements' properties (e.g. a Complex can be found by the name or EC Number of one of its members). Search results can be filtered by data provider (datasource parameter), organism, and instantiable BioPAX class (type). Search can be used to select starting points for graph traversal queries (with the '/graph', '/traverse', and '/get' commands). Search strings are case insensitive unless put inside quotes.

Returns:

An ordered list of BioPAX individuals that match the search criteria (the page size, 'maxHitsPerPage', is configured on the server). The results (hits) are returned as either a JSON or an XML (Search Response XML Schema) document, which can be requested by using the '.json' (e.g. '/search.json') or '.xml' extension/suffix, or via the HTTP request header 'Accept: application/json' (or 'Accept: application/xml').
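The two ways of choosing the response format might look like this in a client. This is a sketch only: the BASE endpoint is an assumption (the page itself gives only the homepage and the URI namespace), and search_request is a hypothetical helper that builds a request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed service root; adjust to the actual deployment.
BASE = "https://www.pathwaycommons.org/pc12"


def search_request(query, fmt="json", use_extension=True):
    """Build (but do not send) a search request for JSON or XML output."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    params = urlencode({"q": query})
    if use_extension:
        # Format chosen by URL suffix, e.g. /search.json
        return Request(f"{BASE}/search.{fmt}?{params}")
    # Format chosen by content negotiation via the Accept header
    return Request(f"{BASE}/search?{params}",
                   headers={"Accept": f"application/{fmt}"})
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call.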

Parameters:

  • q= [Required] a keyword, name, external identifier, or a Lucene query string.
  • page=N [Optional] (N>=0, default is 0). Search results are paginated to avoid overloading the search response. This sets the search result page number.
  • datasource= [Optional] filter by data source (use names or URIs of pathway data sources or of any existing Provenance object). If multiple data source values are specified, a union of hits from specified sources is returned. For example, datasource=reactome&datasource=pid returns hits associated with Reactome or PID.
  • organism= [Optional] organism filter. The organism can be specified either by official name, e.g. "homo sapiens", or by NCBI taxonomy identifier, e.g. "9606". Similar to data sources, if multiple organisms are declared, a union of all hits from specified organisms is returned. For example, 'organism=9606&organism=10090' returns results for both human and mouse. Note the officially supported species.
  • type= [Optional] BioPAX class filter (values). NOTE: queries using the &type=biosource filter (or any BioPAX UtilityClass, such as Score or Evidence) won't return any hits; use an Entity type (e.g., Pathway, Control, Protein) or an EntityReference type (e.g., ProteinReference) instead.
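The parameters above can be assembled into a query URL as follows. This is a minimal sketch: the BASE endpoint and the search_url helper are assumptions, and repeated filters are written as repeated query parameters, as in the datasource example.

```python
from urllib.parse import urlencode

# Assumed service root; adjust to the actual deployment.
BASE = "https://www.pathwaycommons.org/pc12"


def search_url(q, page=0, datasource=(), organism=(), biopax_type=None):
    """Compose a /search URL; multiple filters of one kind form a union."""
    if page < 0:
        raise ValueError("page must be >= 0")
    params = [("q", q), ("page", page)]
    params += [("datasource", d) for d in datasource]
    params += [("organism", o) for o in organism]
    if biopax_type:
        params.append(("type", biopax_type))
    return f"{BASE}/search?{urlencode(params)}"
```

For example, `search_url("FGFR2", datasource=["reactome", "pid"])` asks for hits from Reactome or PID.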

Examples:


  1. Find things that contain "FGFR2" keyword; request XML format
  2. Find pathways by FGFR2 keyword in any index field
  3. Search in 'xrefid' index, filter by protein reference type
  4. Pagination example: get the third page of all indexed elements
  5. Finds Control interactions that contain the word "binding" but not "transcription" in their indexed fields
  6. Find all interactions that directly or indirectly participate in a pathway that has a keyword match for "immune"
  7. All Reactome pathways
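Some of the examples above can be approximated as parameter sets. The values below are inferred from the parameter descriptions, not copied from the service's own example links, so treat them as assumptions. Page numbers start at 0, so the third page is page=2.

```python
from urllib.parse import urlencode

# Hypothetical parameter sets; the keys of this dict are illustrative names only.
EXAMPLES = {
    "xrefid_protein_refs": {"q": "xrefid:FGFR2", "type": "ProteinReference"},
    "third_page_of_everything": {"q": "*", "page": 2},
    "binding_not_transcription": {"q": "binding NOT transcription", "type": "Control"},
    "interactions_in_immune_pathways": {"q": "pathway:immune", "type": "Interaction"},
    "all_reactome_pathways": {"q": "*", "type": "Pathway", "datasource": "reactome"},
}


def query_string(name):
    """URL-encode one of the example parameter sets."""
    return urlencode(EXAMPLES[name])
```

Note that urlencode percent-encodes ':' and '*' and turns spaces into '+', so Lucene syntax survives transport without manual escaping.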

GET:

Retrieves an object model for one or several BioPAX elements, such as a pathway, interaction, or physical entity, given their URIs. The get command retrieves only the specified elements and all their child BioPAX elements (use the traverse query to obtain parent elements).

Parameters:

  • uri= [Required] valid/existing BioPAX element's absolute URI (for utility classes that were "normalized", such as entity references and controlled vocabularies, it is usually an Identifiers.org URL). Multiple identifiers are allowed per query, for example, 'uri=http://identifiers.org/uniprot/Q06609&uri=http://identifiers.org/uniprot/Q549Z0'. See also the note about URIs and IDs.
  • format= [Optional] output format (values)
  • pattern= [Optional] array of built-in BioPAX patterns to apply (SIF types - inference rule names; see output format description) when format=SIF or TXT is used; by default, all the pre-defined patterns but neighbor-of apply.
  • subpw= [Optional] 'true' or 'false' (default) - whether to include or skip sub-pathways when we auto-complete and clone the requested BioPAX element(s) into a reasonable sub-model.
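A get query with several URIs and an output format could be composed like this. This is a sketch: the BASE endpoint and the get_url helper are assumptions, as is passing the pattern array as a repeated parameter. The value 'SIF' comes from the pattern description above; other format values should be taken from the values list.

```python
from urllib.parse import urlencode

# Assumed service root; adjust to the actual deployment.
BASE = "https://www.pathwaycommons.org/pc12"


def get_url(uris, fmt=None, patterns=(), subpw=False):
    """Compose a /get URL for one or more BioPAX element URIs or IDs."""
    if not uris:
        raise ValueError("at least one uri is required")
    params = [("uri", u) for u in uris]
    if fmt:
        params.append(("format", fmt))
    # Assumption: an "array" of patterns is sent as repeated pattern= parameters.
    params += [("pattern", p) for p in patterns]
    if subpw:
        params.append(("subpw", "true"))
    return f"{BASE}/get?{urlencode(params)}"
```

Leaving fmt unset relies on the server default, which is BioPAX.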

Output:

BioPAX (default) representation for the record(s) pointed to by the given URI(s) is returned. Other output formats are produced on demand by converting from the BioPAX and can be specified using the optional format parameter. With some output formats, it might return no data (empty result) if the conversion is not applicable to the BioPAX model. For example, SIF output is only possible if there are some interactions, complexes, or pathways in the retrieved set.

Examples:

  1. Gets the JSON-LD representation of Q06609 ProteinReference.
  2. This /get query, unlike previous one, first performs full-text search by 'xrefid:FGFR2', and then converts result physical entities and genes (BioPAX sub-model) to GSEA GMT format.
  3. Get the 'Signaling by BMP' Pathway (R-HSA-201451, format: BioPAX, source: Reactome, human).

GRAPH:

Graph searches are useful for finding connections and neighborhoods of elements, such as the shortest path between two proteins or the neighborhood for a particular protein state or all states. Graph searches consider detailed BioPAX semantics, such as generics or nested complexes, and traverse the graph accordingly. Note that we integrate data from multiple databases and consistently normalize UnificationXref, EntityReference, Provenance, BioSource, and ControlledVocabulary objects when we are absolutely sure that two objects of the same type are equivalent. We, however, do not merge physical entities and processes from different sources, as accurately matching and aligning pathways at that level is still an open research problem.

Parameters:

  • kind= [Required] graph query (values)
  • source= [Required] source object's URI/ID. Multiple source URIs/IDs are allowed per query, for example 'source=http://identifiers.org/uniprot/Q06609&source=http://identifiers.org/uniprot/Q549Z0'. See note about URIs and IDs.
  • target= [Required for PATHSFROMTO graph query] target URI/ID. Multiple target URIs are allowed per query; for example 'target=http://identifiers.org/uniprot/Q06609&target=http://identifiers.org/uniprot/Q549Z0'. See note about URIs and IDs.
  • direction= [Optional, for NEIGHBORHOOD and COMMONSTREAM algorithms] - graph search direction (values).
  • limit= [Optional] graph query search distance limit (default = 1).
  • format= [Optional] output format (values)
  • pattern= [Optional] array of built-in BioPAX patterns to apply (SIF types - inference rule names; see output format description) when format=SIF or TXT is used; by default, all the pre-defined patterns but neighbor-of apply.
  • datasource= [Optional] datasource filter (same as for 'search').
  • organism= [Optional] organism filter (same as for 'search').
  • subpw= [Optional] 'true' or 'false' (default) - whether to include or skip sub-pathways; it does not affect the graph search algorithm, but - only how we auto-complete and clone BioPAX elements to make a reasonable sub-model from the result set.
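The rule that 'target' is required only for PATHSFROMTO can be enforced before the request is sent. This is a minimal sketch: the BASE endpoint and the graph_url helper are assumptions, and only the kind names mentioned in the text (NEIGHBORHOOD, COMMONSTREAM, PATHSFROMTO) are taken as given.

```python
from urllib.parse import urlencode

# Assumed service root; adjust to the actual deployment.
BASE = "https://www.pathwaycommons.org/pc12"


def graph_url(kind, sources, targets=(), direction=None, limit=1, fmt=None):
    """Compose a /graph URL and check the documented parameter requirements."""
    kind = kind.upper()
    if not sources:
        raise ValueError("at least one source is required")
    if kind == "PATHSFROMTO" and not targets:
        raise ValueError("PATHSFROMTO requires at least one target")
    params = [("kind", kind)]
    params += [("source", s) for s in sources]
    params += [("target", t) for t in targets]
    if direction:
        params.append(("direction", direction))
    params.append(("limit", limit))  # search distance; the server default is 1
    if fmt:
        params.append(("format", fmt))
    return f"{BASE}/graph?{urlencode(params)}"
```

For example, `graph_url("neighborhood", ["P20908"], fmt="SIF")` requests the nearest neighborhood of COL5A1 by its UniProt ID.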

Output:

By default, it returns a BioPAX representation of the sub-network matched by the algorithm. Other output formats are available as specified by the optional format parameter. Some output format choices result in no data if the conversion is not applicable to the result BioPAX model (e.g., BINARY_SIF output fails if there are no interactions, complexes, or pathways in the retrieved set).

Examples:

Neighborhood of COL5A1 (P20908):
  1. BioPAX nearest neighborhood of the protein reference http://identifiers.org/uniprot/P20908, i.e., all reactions where the corresponding protein forms participate; returned as SIF
  2. Nearest neighborhood of P20908 - starting from the corresponding Xref, finds all reactions that its owners (e.g., a protein reference) and their states (protein forms) participate in, and returns the BioPAX model.
  3. A similar query using the gene symbol COL5A1 instead of URI or UniProt ID (performs full-text search and id-mapping internally). Compared with other examples, a query like this potentially returns a larger sub-network, as it possibly starts graph traversing from multiple matching entities (seeds) rather than from a single ProteinReference. One can mix URIs along with UniProt, NCBI Gene, ChEBI IDs in a single /graph or /get query; other identifier types may also work. See: about URIs and IDs.

TRAVERSE:

XPath-like access to our BioPAX database. With '/traverse', users can explicitly state the paths they would like to access. The format of the path parameter value is: [Initial Class]/[property1]:[classRestriction(optional)]/[property2]... A "*" sign after the property instructs the path accessor to traverse that property transitively. For example, the following path accessor will traverse all physical entity components of a complex, including components of nested complexes, if any: Complex/component*/entityReference/xref:UnificationXref. The following will list the display names of all participants of interactions that are pathway components of a pathway: Pathway/pathwayComponent:Interaction/participant*/displayName. The optional classRestriction limits the returned property values to a certain subclass of the property's range. In the first example above, this is used to get only the unification xrefs. Path accessors can use all the official BioPAX properties as well as additional derived classes and parameters, such as inverse parameters and interfaces that represent anonymous union classes in BioPAX OWL. (See Paxtools documentation for more details).
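The path syntax can be assembled from small pieces, which avoids typos in long paths. The helpers below are hypothetical and cover only the two forms shown in the text: a transitive property ("*") and a class-restricted property (":Class"). Combining both on one step is not shown in the text, so it is left out.

```python
def transitive(prop):
    """Mark a property for transitive traversal, e.g. component*."""
    return prop + "*"


def restricted(prop, cls):
    """Limit returned values to a subclass of the property's range."""
    return f"{prop}:{cls}"


def property_path(initial_class, *steps):
    """Join the initial class and property steps into a path string."""
    return "/".join([initial_class, *steps])


# The two paths quoted in the text, rebuilt from pieces.
xrefs_of_complex = property_path(
    "Complex", transitive("component"), "entityReference",
    restricted("xref", "UnificationXref"))
participant_names = property_path(
    "Pathway", restricted("pathwayComponent", "Interaction"),
    transitive("participant"), "displayName")
```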

Parameters:

  • path= [Required] a BioPAX property path in the form of type0/property1[:type1]/property2[:type2]; see properties, inverse properties, Paxtools, org.biopax.paxtools.controller.PathAccessor.
  • uri= [Required] a BioPAX element URI (specified similarly to the 'GET' command above). Multiple URIs are allowed (uri=...&uri=...&uri=...). Standard gene/chemical IDs can now be used along with absolute URIs, which makes such a request equivalent to two queries combined: 1) search for objects of the specified BioPAX type by IDs in the 'xrefid' index field; 2) traverse, using the URIs of the objects found in the first step and the path.
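A traverse request combines one path with one or more URIs. This is a sketch in which the BASE endpoint and the traverse_url helper are assumptions; the path used in the usage note is the one quoted in example 3 below.

```python
from urllib.parse import urlencode

# Assumed service root; adjust to the actual deployment.
BASE = "https://www.pathwaycommons.org/pc12"


def traverse_url(path, uris):
    """Compose a /traverse URL for a property path and starting elements."""
    if not path or not uris:
        raise ValueError("both path and at least one uri are required")
    params = [("path", path)] + [("uri", u) for u in uris]
    return f"{BASE}/traverse?{urlencode(params)}"
```

For example, `traverse_url("ProteinReference/entityReferenceOf:Protein/name", ["http://identifiers.org/uniprot/Q06609"])` asks for the names of all states of that protein reference.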

Output:

XML result according to the Search Response XML Schema (TraverseResponse type; pagination is disabled to return all values at once)

Examples (using human data):

  1. This query returns the display name of the organism of the ProteinReference specified by the URI.
  2. This query returns the URI of the organism for each of the Protein References
  3. This query returns the names of all states of RAD51 protein (by its ProteinReference URI, using property path="ProteinReference/entityReferenceOf:Protein/name")
  4. This query returns the URIs of states of BRCA1_HUMAN
  5. This query returns the names of several different objects (using abstract type 'Named' from Paxtools API)

TOP_PATHWAYS:

Finds root pathways, i.e. pathways that are neither 'controlled' nor a 'pathwayComponent' of another biological process, excluding trivial ones.

Parameters:

  • q= [Required] a keyword, name, external identifier, or a Lucene query string, like in 'search', but the default is '*' (match all).
  • datasource= [Optional] filter by data source (same as for 'search').
  • organism= [Optional] organism filter (same as for 'search').
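Because q defaults to '*', a client helper can make the keyword optional too. This is a minimal sketch: the BASE endpoint, the '/top_pathways' path (derived from the command name), and the top_pathways_url helper are all assumptions.

```python
from urllib.parse import urlencode

# Assumed service root; adjust to the actual deployment.
BASE = "https://www.pathwaycommons.org/pc12"


def top_pathways_url(q="*", datasource=(), organism=()):
    """Compose a /top_pathways URL; q='*' matches all root pathways."""
    params = [("q", q)]
    params += [("datasource", d) for d in datasource]
    params += [("organism", o) for o in organism]
    return f"{BASE}/top_pathways?{urlencode(params)}"
```

For example, `top_pathways_url("insulin", datasource=["reactome"])` corresponds to the second example below, apart from the choice of response format.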

Output:

XML document described by Search Response XML Schema (SearchResponse type; pagination is disabled to return all top pathways at once)

Examples:

  1. get top pathways related to 'TP53'
  2. get top pathways from Reactome, matching 'insulin'; request JSON format

Parameter Values

Organisms

We intend to integrate pathway data only for the following species:

    Homo sapiens (9606)

Additional organisms may be pulled in due to interactions with entities from any of the above organisms, but are not otherwise supported. This means that we don’t comprehensively collect information for unsupported organisms and we have not cleaned or converted such data due to the high risk of introducing errors and artifacts. All BioSource objects can be found by using this search query.

Output Format ('format'):

For detailed descriptions of these formats, see output format description.

Graph Type ('kind'):

Graph Directions ('direction'):

BioPAX class ('type'):

See also: BioPAX Classes.


BioPAX Properties and Restrictions:

Listed below is a summary of BioPAX properties as defined in the Paxtools model: domain, property name, range, and restrictions (if any). For example, XReferrable xref Xref D:ControlledVocabulary=UnificationXref D:Provenance=UnificationXref,PublicationXref means that values of ControlledVocabulary.xref can only be of UnificationXref type.
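A summary line of this kind can be split into its parts mechanically. The parser below is a sketch that assumes whitespace-separated fields and a 'D:' prefix on every restriction, as in the example line; parse_property_summary is a hypothetical helper, not part of Paxtools.

```python
def parse_property_summary(line):
    """Split 'Domain property Range D:Sub=Type1,Type2 ...' into a dict."""
    domain, name, value_range, *rest = line.split()
    restrictions = {}
    for token in rest:
        if not token.startswith("D:") or "=" not in token:
            raise ValueError(f"unexpected restriction token: {token}")
        subclass, allowed = token[2:].split("=", 1)
        # Each domain subclass maps to the value types it may hold.
        restrictions[subclass] = allowed.split(",")
    return {
        "domain": domain,
        "property": name,
        "range": value_range,
        "restrictions": restrictions,
    }


summary = parse_property_summary(
    "XReferrable xref Xref D:ControlledVocabulary=UnificationXref "
    "D:Provenance=UnificationXref,PublicationXref")
```

Here `summary["restrictions"]["ControlledVocabulary"]` holds the single allowed type, matching the reading given in the text.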


Inverse BioPAX Object Properties (a feature of the Paxtools library):

Some of the BioPAX object properties can be traversed in the inverse direction, e.g., 'xref' - 'xrefOf'. Unlike for the standard xref property, a restriction such as XReferrable xref Xref D:ControlledVocabulary=UnificationXref D:Provenance=UnificationXref,PublicationXref below must be read right-to-left, as it is actually about Xref.xrefOf: RelationshipXref.xrefOf can contain neither ControlledVocabulary (any sub-class) nor Provenance objects (in other words, vocabularies and provenance may not have any relationship xrefs).


Powered by cPath2 © 2006-2017 Bader Lab (UofT), cBio (MSKCC; DFCI, HMS) and Demir Lab (OHSU).