Introduction
The disgenet2r package contains a set of functions to retrieve, visualize and expand DisGeNET data. DisGeNET is a discovery platform that contains information about the genetic basis of human diseases (J. Piñero et al. 2015, 2017; Janet Piñero et al. 2019). DisGeNET integrates data from several expert curated databases and from text-mining the biomedical literature.
The current version of DisGeNET (v7.0) contains 1134942 gene-disease associations (GDAs), between 21671 genes and 30170 diseases, disorders, traits, and clinical or abnormal human phenotypes, and 369554 variant-disease associations (VDAs), between 194515 variants and 14155 diseases, traits, and phenotypes.
The information in DisGeNET is organized according to the original data source (Table 1). Diseases are identified using the UMLS concept unique identifier (CUI), but mappings to commonly employed biomedical vocabularies such as MeSH, OMIM, DO, HPO, and ICD-9 are also provided. The genes are identified using the NCBI Entrez Identifier, but annotations to the official gene symbol, the UniProt identifier, and the Panther Protein Class are also supplied. Finally, the GDAs and VDAs can be ranked using the DisGeNET score. The DisGeNET score ranges from 0 to 1, and takes into account the evidence supporting the association (See more information at http://disgenet.org/dbinfo/)
DisGeNET data is also represented as a Resource Description Framework (RDF), which provides new opportunities for data integration, making possible to link DisGeNET data to other external RDF datasets (Queralt-Rosinach et al. 2016).
Table: Sources of DisGeNET data
Source_Name | Type_of_data | Description |
---|---|---|
CTD_human | GDAs | The Comparative Toxicogenomics Database, human data |
CGI | GDAs | The Cancer Genome Interpreter |
CLINGEN | GDAs | The Clinical Genome Resource |
GENOMICS_ENGLAND | GDAs | The Genomics England PanelApp |
ORPHANET | GDAs | The portal for rare diseases and orphan drugs |
PSYGENET | GDAs | Psychiatric disorders Gene association NETwork |
HPO | GDAs | Human Phenotype Ontology |
UNIPROT | GDAs/VDAs | The Universal Protein Resource |
CLINVAR | GDAs/VDAs | ClinVar, public archive of relationships among sequence variation and human phenotype |
GWASCAT | GDAs/VDAs | The NHGRI-EBI GWAS Catalog |
GWASDB | GDAs/VDAs | The GWas Database |
CTD_mouse | GDAs | The Comparative Toxicogenomics Database, Mus musculus data |
MGD | GDAs | The Mouse Genome Database |
CTD_rat | GDAs | The Comparative Toxicogenomics Database, Rattus norvergicus data |
RGD | GDAs | The Rat Genome Database |
BEFREE | GDAs/VDAs | Data from text mining medline abstracts using the BeFree System (Bravo et al. 2015) |
LHGDN | GDAs | Literature-derived human gene-disease network generated by text mining NCBI GeneRIFs (Bundschus et al. 2008) |
CURATED | GDAs/VDAs | Human curated sources: CTD, ClinGen, CGI, UniProt, Orphanet, PsyGeNET, Genomics England PanelApp |
INFERRED | GDAs | Inferred data from: HPO,ClinVar, GWASCat, GwasDB |
ANIMAL_MODELS | GDAs | Data from animal models: CTD_rat, RGD, CTD_mouse, MGD |
ALL | GDAs/VDAs | All data sources |
Contact
For questions regarding disgenet2r, contact our support account at support@disgenet.org.
Installation and first run
The package disgenet2r is available through Bitbucket. The package requires an R version > 3.5. Additionally, the following packages are needed: VennDiagram, stringr, tidyr, SPARQL, RCurl, igraph, ggplot2, and reshape2.
Install disgenet2r by typing in R:
library(devtools)
install_bitbucket("ibi_group/disgenet2r")
To load the package:
library(disgenet2r)
To be able to use the disgenet2r package, you need to register for a free account here. Once you have completed the registration process, use the get_disgenet_api_key to retrieve your API key.
<- get_disgenet_api_key(
disgenet_api_key email = "user@email.com",
password = "myspwd" )
After retrieving the API key, run the line below so the key is available for all the disgenet2r functions.
Sys.setenv(DISGENET_API_KEY= disgenet_api_key)
In the following document, we illustrate how to use the disgenet2r package through a series of examples.
Retrieving Gene-Disease Associations from DisGeNET
Searching by gene
The gene2disease function retrieves the GDAs in DisGeNET for a given gene, or a for a list of genes. The gene(s) can be identified by either the NCBI gene identifier, or the official Gene Symbol, and the type of identifier used must be specified using the parameter vocabulary
. By default, vocabulary = "HGNC"
, to switch to Entrez NCBI Gene identifiers, set vocabulary to ENTREZ.
The function also requires the user to specify the source database using the argument database
. By default, all the functions in the disgenet2r package use as source database CURATED, which includes GDAs from CTD (human data), PsyGeNET, the HPO, Genomics England PanelApp, ClinGen, CGI, UniProt, and Orphanet.
The information can be filtered using the DisGeNET score. The argument score
is filled with a range of score to perform the search. The score is entered as a vector which first position is the initial value of score, and the second argument is the final value of score. Both values will always be included. By default, score=c(0,1)
.
In the example, the query for the Leptin Receptor (Gene Symbol LEPR
, and Entrez NCBI Identifier 3953
) is performed in all databases in DisGeNET (database = "ALL"
).
<- gene2disease( gene = 3953, vocabulary = "ENTREZ",
data1 database = "CURATED")
The function gene2disease produces an object DataGeNET.DGN
that contains the results of the query.
class(data1)
## [1] "DataGeNET.DGN"
## attr(,"package")
## [1] "disgenet2r"
Type the name of the object to display its attributes: the input parameters such as whether a single entity, or a list were searched (single
or list
), the type of entity (gene-disease
), the selected database (ALL
), the score range used in the search (0-1
), and the gene NCBI identifier (3953
).
data1
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: gene-disease
## . Database: CURATED
## . Score: 0-1
## . Term: 3953
## . Results: 79
To obtain the data frame with the results of the query, apply the extract function:
<- extract(data1)
results head( results, 3 )
## gene_dpi geneid gene_symbol protein_class_name gene_pli source score protein_class disease_type diseaseid year_final uniprotid
## 1 0.808 3953 LEPR Signaling 0.99475 CURATED 0.72 DTO_05007599 disease C3554225 2019 P48357
## 2 0.808 3953 LEPR Signaling 0.99475 CURATED 0.70 DTO_05007599 disease C0028754 2019 P48357
## 3 0.808 3953 LEPR Signaling 0.99475 CURATED 0.60 DTO_05007599 disease C0011860 2019 P48357
## disease_name year_initial disease_semantic_type el ei
## 1 LEPTIN RECEPTOR DEFICIENCY 1998 Disease or Syndrome strong 1.000
## 2 Obesity 1966 Disease or Syndrome 0.930
## 3 Diabetes Mellitus, Non-Insulin-Dependent 1966 Disease or Syndrome 0.975
## disease_class_name disease_class gene_dsi
## 1 0.433
## 2 Pathological Conditions, Signs and Symptoms; Nutritional and Metabolic Diseases C23;C18 0.433
## 3 Nutritional and Metabolic Diseases; Endocrine System Diseases C18;C19 0.433
The same query can be performed using the Gene Symbol (LEPR
). Additionally, a minimum threshold for the score can be defined. In the example, a cutoff of score=c(0.2,1)
is used. Notice how the number of diseases associated to the Leptin Receptor drops from 264 to 68 when the score is restricted.
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: gene-disease
## . Database: ALL
## . Score: 0.3-1
## . Term: LEPR
## . Results: 79
Visualizing the diseases associated to a single gene
The disgenet2r package offers two options to visualize the results of querying DisGeNET for a single gene: a network showing the diseases associated to the gene of interest (Gene-Disease Network
), and a network showing the MeSH Disease Classes of the diseases associated to the gene (Gene-Disease Class Network
). These graphics can be obtained by changing the class
argument in the plot function.
By default, the plot function produces a Gene-Disease Network
on a DataGeNET.DGN
object (Figure 1). In the Gene-Disease Network
the blue nodes are diseases, the pink nodes are genes, and the width of the edges is proportional to the score of the association. The prop
parameter allows to adjust the width of the edges while keeping the proportionality to the score.
plot( data1,
class = "Network",
prop = 20)
Figure 1: The Gene-Disease Network for the Leptin Receptor gene
The results can also be visualized in a network in which diseases are grouped by the MeSH Disease Class if the class
argument is set to “DiseaseClass” (Gene-Disease Class Network
, Figure 2). In the Gene-Disease Class Network
, the node size of is proportional to the fraction of diseases in the disease class, with respect to the total number of diseases with disease classes associated to the gene. In the example, the Lepin Receptor is associated mainly to Nutritional and Metabolic Diseases. There is 1 disease in the example that does not have annotations to MeSH disease class (Shown as a warning).
plot( data1,
class = "DiseaseClass",
prop = 3)
## [1] "warning: 1 disease(s) not shown in the plot"
Figure 2: The Disease Class Network for the Leptin Receptor Gene
Exploring the evidences associated to a gene
You can extract the evidences associated to a particular gene using the function gene2evidence. Additionally, you can explore the evidences for a specific gene-disease pair by specifying the disease identifier using the argument disease
.
<- gene2evidence( gene = "LEPR",
data1 vocabulary = "HGNC",
disease ="C3554225",
database = "ALL",
score =c(0.3,1))
data1
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: evidence
## . Database: ALL
## . Score: 0.3-1
## . Term: LEPR
<- extract(data1) results
pmid | year | sentence |
---|---|---|
29568105 | 2018 | A cohort of n = 21 children living in Germany or Austria with monogenic obesity due to congenital leptin deficiency (group LEP, n = 6), leptin receptor deficiency (group LEPR, n = 6) and primarily heterozygous MC4 receptor deficiency (group MC4R, n = 9) was analyzed. |
31702041 | 2019 | In conclusion, activation of the leptin receptor ob‑R is an important pathogenic mechanism of UC, and leptin receptor deficiency may provide resistance against TNBS‑induced colitis by inhibiting the NF‑κB and RhoA signaling pathways. |
23275530 | 2013 | Homozygous leptin receptor mutation due to uniparental disomy of chromosome 1: response to bariatric surgery. |
17229951 | 2007 | Clinical and molecular genetic spectrum of congenital deficiency of the leptin receptor. |
9537324 | 1998 | A mutation in the human leptin receptor gene causes obesity and pituitary dysfunction. |
25751111 | 2015 | Seven novel deleterious LEPR mutations found in early-onset obesity: a ΔExon6-8 shared by subjects from Reunion Island, France, suggests a founder effect. |
9537324 | 1998 | A mutation in the human leptin receptor gene causes obesity and pituitary dysfunction. |
Searching multiple genes
The gene2disease function can also receive a list of genes as input, either as NCBI Gene Identifiers or Gene Symbols. In the example, we show how to create a vector with the Gene Symbols of several genes belonging to the family of voltage-gated potassium channels (Table 2) and then, we apply the function gene2disease.
Table 2: Example of voltage-gated potassium channel family members
Name | Description |
---|---|
KCNE1 | potassium channel, voltage gated subfamily E regulatory beta subunit 1 |
KCNE2 | potassium channel, voltage gated subfamily E regulatory beta subunit 2 |
KCNH1 | potassium channel, voltage gated eag related subfamily H, member 1 |
KCNH2 | potassium channel, voltage gated eag related subfamily H, member 2 |
KCNG1 | potassium voltage-gated channel modifier subfamily G member 1 |
Creating the vector with the list of genes belonging to the voltage-gated potassium channel family.
<- c( "KCNE1", "KCNE2", "KCNH1", "KCNH2", "KCNG1") myListOfGenes
The gene2disease function also requires the user to specify the source database using the argument database
, and optionally, the DisGeNET score
can also be applied to filter the results.
<- gene2disease(
data2 gene = myListOfGenes,
score =c(0.2, 1),
verbose = TRUE
)
## Warning in gene2disease(gene = myListOfGenes, score = c(0.2, 1), verbose = TRUE):
## One or more of the genes in the list is not in DisGeNET ( 'CURATED' ):
## - KCNG1
data2
## Object of class 'DataGeNET.DGN'
## . Search: list
## . Type: gene-disease
## . Database: CURATED
## . Score: 0.2-1
## . Term: KCNE1 ... KCNH2
## . Results: 51
Visualizing the diseases associated to multiple genes
By default, plotting a DataGeNET.DGN
resulting of the query with a list of genes produces a Gene-Disease Network
where the blue nodes are diseases, the pink nodes are genes, and the width of the edges is proportional to the score of the association (Figure 3).
plot( data2,
class = "Network",
prop = 10)
Figure 3: The Gene-Disease Network for a list of genes belonging to the voltage-gated potassium channel family
Setting the argument class
to “Heatmap” produces a Gene-Disease Heatmap
(Figure 4), where the scale of colors is proportional to the score of the GDA. The argument limit
can be used to limit the number of rows to the top scoring GDAs. The argument nchars
can be used to limit the length of the name of the disease. By default, the plot shows the 50 highest scoring GDAs.
plot( data2,
class ="Heatmap",
limit = 100, nchars = 50 )
Figure 4: The Gene-Disease Heatmap for a list of genes belonging to the voltage-gated potassium channel family
These results can also be visualized as a Gene-Disease Class Heatmap
by setting the argument class
to “DiseaseClass” (Figure 5). In this case, diseases are grouped by the their MeSH disease classes, and the color scale is proportional to the percentage of diseases in each MeSH disease class. In the example, genes are associated mainly to Cardiovascular Diseases, and to Congenital, Hereditary, and Neonatal Diseases and Abnormalities.
plot( data2,
class="DiseaseClass", nchars=60)
## [1] "warning: 3 disease(s) not shown in the plot"
Figure 5: The Gene-Disease Class Heatmap for a list of genes belonging to the voltage-gated potassium channel family
Retrieving the identifier for a disease of interest
To retrieve the concept unique identifier for the disease of interest, you can use the get_umls_from_vocabulary function with the parameter vocabulary set to “NAME.” This will retrieve all diseases in DisGeNET that contain the keyword used as an input.
<- get_umls_from_vocabulary( "Schizophrenia", vocabulary = "NAME" )
diseases kable(diseases, caption = "CUIs associated to Schizophrenia") %>% kable_styling(bootstrap_options = "striped", full_width = F)
diseaseid | name |
---|---|
C0021151 | Incipient Schizophrenia |
C0036341 | Schizophrenia |
C0036344 | Schizophrenia, Catatonic |
C0036346 | Schizophrenia, Childhood |
C0036347 | Schizophrenia, Disorganized |
C0036349 | Paranoid Schizophrenia |
C0036351 | Residual schizophrenia |
C0220702 | SCHIZOPHRENIA 1 (disorder) |
C0221765 | Chronic schizophrenia |
C0392322 | Undifferentiated schizophrenia |
C0525046 | Schizophrenia Spectrum and Other Psychotic Disorders |
C0543918 | SCHIZOPHRENIA 10 |
C0809983 | Schizophrenia and related disorders |
C0857501 | Acute schizophrenia |
C1656427 | Early onset schizophrenia |
C1833247 | SCHIZOPHRENIA 4 (disorder) |
C1837893 | SCHIZOPHRENIA 12 |
C1838069 | SCHIZOPHRENIA 3 (disorder) |
C1842605 | SCHIZOPHRENIA 11 |
C1864124 | SCHIZOPHRENIA 8 (disorder) |
C1864153 | SCHIZOPHRENIA 5 |
C1864275 | SCHIZOPHRENIA 6 (disorder) |
C2673482 | Increased susceptibility to schizophrenia |
C2675945 | SCHIZOPHRENIA, SUSCEPTIBILITY TO |
C2678117 | Schizophrenia-like symptoms (uncommon) |
C3151380 | SCHIZOPHRENIA 15 |
C3544321 | Treatment-resistant schizophrenia |
C3808913 | SCHIZOPHRENIA 18 |
C4049474 | Very late onset schizophrenia |
C4539944 | SCHIZOPHRENIA 19 |
Searching by disease
The disease2gene function allows to retrieve the genes associated to a disease, or a list of diseases. The function uses as input the disease, or list of diseases of interest (as UMLS CUI, MeSH, OMIM, Disease Ontology, ICD9CM, NCIt, EFO, or Orphanet Identifiers), the disease vocabulary employed (OMIM
(OMIM), MESH
(MeSH),ICD9CM
(ICD9-CM), DO
(Disease Ontology), NCI
(NCI thesaurus), ORDO
(Orphanet), or EFO
(EFO) and the database (by default, CURATED
). A threshold value for the score can be set, like in the gene2disease function.
In the example, we will use the disease2gene function to retrieve the genes associated to the UMLS CUI C0036341. This function also receives as input the database, in the example, CURATED, and a score range, in the example, from 0.4 to 1.
<- disease2gene( disease = "C0036341",
data3 database = "CURATED",
score = c( 0.4,1 ) )
data3
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-gene
## . Database: CURATED
## . Score: 0.4-1
## . Term: C0036341
## . Results: 227
The same results are obtained when querying DisGeNET with the MeSH identifier for Schizophrenia (D012559).
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-gene
## . Database: CURATED
## . Score: 0.4-1
## . Term: D012559
## . Results: 227
The same results are obtained when querying DisGeNET with the OMIM identifier for Schizophrenia (181500).
<- disease2gene( disease = "181500", vocabulary = "OMIM",
data4 database = "CURATED",
score = c(0.4,1 ) )
data4
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-gene
## . Database: CURATED
## . Score: 0.4-1
## . Term: 181500
## . Results: 227
The same results are obtained when querying DisGeNET with the ICD9-CM identifier for Schizophrenia (295).
<- disease2gene( disease = "295", vocabulary = "ICD9CM",
data4 database = "CURATED",
score = c(0.4,1 ) )
data4
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-gene
## . Database: CURATED
## . Score: 0.4-1
## . Term: 295
## . Results: 227
The same results are obtained when querying DisGeNET with the NCI identifier for Schizophrenia (C3362).
<- disease2gene( disease = "C3362", vocabulary = "NCI",
data4 database = "CURATED",
score = c(0.4,1 ) )
data4
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-gene
## . Database: CURATED
## . Score: 0.4-1
## . Term: C3362
## . Results: 227
The same results are obtained when querying DisGeNET with the DO identifier for Schizophrenia (5419).
<- disease2gene( disease = "HP:0100753", vocabulary = "HPO",
data4 database = "CURATED",
score = c(0.4,1 ) )
data4
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-gene
## . Database: CURATED
## . Score: 0.4-1
## . Term: HP:0100753
## . Results: 227
Visualizing the genes associated to a single disease
There are two options to visualize the results from searching a single disease: a Gene-Disease Network
showing the genes related to the disease of interest (Figure 6), and a Disease-Protein Class Network
with the genes grouped by Panther Protein Class (Figure 7).
Figure 6 shows the default Gene-Disease Network
for Schizophrenia. As in the case of the gene2disease function, the blue nodes is the disease, the pink nodes are genes, and the width of the edges is proportional to the score of the association.
plot ( data3,
prop = 10)
Figure 6: The Gene-Disease Network for genes associated to Schizophrenia
Alternatively, in the Disease-Protein Class Network
, genes are grouped by the Panther Protein Class (Figure 7). This is a better choice when there is a large number of genes associated to the disease. This plot uses as class
argument “ProteinClass.” The resulting network will show in blue the disease, and in green the Protein Classes of the genes associated to the disease. The node size is proportional to the number of genes in the Panther Protein Class. In the example, the largest proportion of the genes associated to Schizophrenia are receptors. Notice again that not all genes have annotations to Panther Protein classes (69 genes in Figure 7)
plot( data3,
class="ProteinClass"
)
## [1] "warning: 89 gene(s) not shown in the plot"
Figure 7: The Protein Class-Disease Network for genes associated to Schizophrenia
Exploring the evidences associated to a disease
To explore the evidences supporting the associations for Schizophrenia using the function disease2evidence.
<- disease2evidence( disease = "C0036341",
data3 type = "GDA",
database = "CURATED",
score = c( 0.4,1 ) )
data3
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: evidence
## . Database: CURATED
## . Score: 0.4-1
## . Term: C0036341
Additionally, you can explore the evidences for a specific gene-disease pair by specifying the gene identifier using the argument gene
.
<- disease2evidence( disease = "C0036341",
data3 gene = c("DRD2", "DRD3"),
type = "GDA",
database = "CURATED",
score = c( 0.4,1 ) )
data3
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: evidence
## . Database: CURATED
## . Score: 0.4-1
## . Term: C0036341
<- extract(data3) results
The results are shown in the table below.
pmid | year | sentence |
---|---|---|
11245917 | 2001 | Because dopamine D2 receptors are the primary targets for antipsychotic drugs, including clozapine and quetiapine, and because some studies have found D2 receptors to be elevated in schizophrenia, we examined the mRNA of three forms of the D2 receptor, particularly the new form of the dopamine D2 receptor, D2(Longer), in post-mortem brains from patients who died with schizophrenia. |
21187413 | 2011 | DRD2/AKT1 interaction on D2 c-AMP independent signaling, attentional processing, and response to olanzapine treatment in schizophrenia. |
10523822 | 1999 | Homozygosity at the dopamine DRD3 receptor gene in cocaine dependence. |
18583979 | 2008 | Systematic meta-analyses and field synopsis of genetic association studies in schizophrenia: the SzGene database. |
10831489 | 2000 | Subjective experience and striatal dopamine D(2) receptor occupancy in patients with schizophrenia stabilized by olanzapine or risperidone. |
Searching multiple diseases
The disease2gene function also accepts as input a list of diseases (as UMLS CUI, MeSH, OMIM, Disease Ontology, Orphanet, Dechipher, or ICD9CM Identifiers), the database (by default, CURATED), and optionally, a value range for the score. In the example, we have selected a list of 10 diseases. Table 3 shows the UMLS CUIs and the corresponding disease names.
Table 3: Disease list selected for illustrating the disease2gene multiple search
UMLS_CUI | Disease_Name |
---|---|
C0036341 | Schizophrenia |
C0036341 | Alzheimer’s Disease |
C0030567 | Parkinson Disease |
C0005586 | Bipolar Disorder |
Creating the vector with the list of diseases.
<- c("C0036341", "C0002395", "C0030567","C0005586") diseasesOfInterest
In the example, we will search in CURATED data, using a score range of 0.4-1.
<- disease2gene(
data5 disease = diseasesOfInterest,
database = "CURATED",
score =c(0.4,1),
verbose = TRUE )
Visualizing the genes associated to multiple diseases
The default plot of the results of querying DisGeNET with a list of diseases produces a Gene-Disease Network
where the blue nodes are diseases, the pink nodes are genes, and the width of the edges is proportional to the score of the association (Figure 8).
plot( data5,
class = "Network",
prop = 10)
Figure 8: The Gene-Disease Network for genes associated to a list of diseases
To visualize the results as a Gene-Disease Heatmap
(Figure 9) change the argument class
to “Heatmap.” In this plot, the scale of colors is proportional to the score of the GDA. The argument limit
can be used to limit the number of rows to the top scoring GDAs when the results are large. By default, the plot shows the 50 highest scoring GDAs.
plot( data5,
class="Heatmap",
limit =30,
cutoff=0.2)
## [1] "Dataframe of 421 rows has been reduced to 30 rows."
Figure 9: The Gene-Disease Heatmap for genes associated to a list of diseases
A third visualization option is a Protein Class-Disease Heatmap
(Figure 10), in which genes are grouped by protein class. This plot is obtained by setting the class
argument to “ProteinClass.” In this case, the color of the heatmap is proportional to the percentage of genes for each disease in each protein class. This heatmap displays the protein classes associated to each disease.
plot( data5,
class="ProteinClass" )
## [1] "warning: 142 gene(s) not shown in the plot"
Figure 10: The Protein Class-Disease Heatmap for genes associated to a list of diseases
Retrieving Variant-Disease Associations from DisGeNET
Searching by variant
The variant2disease function receives a variant, or a list of variants as input, identified by the dbSNP identifier. It produces an object DataGeNET.DGN
, with Type = "variant-disease"
.
<- variant2disease( variant= "rs121913279",
data6 database = "CURATED")
data6
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: variant-disease
## . Database: CURATED
## . Score: 0-1
## . Term: rs121913279
## . Results: 48
Visualizing the diseases associated to a single variant
The disgenet2r package offers several options to visualize the results of querying DisGeNET for a single variant: a Variant-Disease Network
showing the diseases associated to the variant of interest, a Variant-Gene-Disease Network
showing the genes, diseases, and variant of interest, and a network showing the MeSH Disease Classes of the diseases associated to the variant (Variant-Disease Class Network
). These graphics can be obtained by changing the class
argument in the plot function.
By default, the plot function produces a Variant-Disease Network
on a DataGeNET.DGN
object (Figure 11). In the Variant-Disease Network
the blue nodes are diseases, the yellow nodes are variants, the blue nodes are diseases, and the width of the edges is proportional to the score of the association.
plot( data6,
class = "Network",
prop = 10)
Figure 11: The Variant-Disease Network for the variant rs121913279
The results can also be visualized in a network in which diseases are grouped by the MeSH Disease Class if the class
argument is set to “DiseaseClass” (Figure 12). In the Variant-Disease Class Network
, the node size of is proportional to the fraction of diseases in the disease class, with respect to the total number of diseases with disease class associated to the gene. In the example, the variant is associated mainly to Neoplasms.
plot( data6,
class = "DiseaseClass",
prop = 3)
## [1] "warning: 4 disease(s) not shown in the plot"
## [1] "warning: 4 disease(s) not shown in the plot"
Figure 12: The Variant- Disease Class Network for the variant rs121913279
Exploring the evidences associated to a variant
You can extract the evidences associated to a particular variant using the function variant2evidence. Additionally, you can explore the evidences for a specific variant-disease pair by specifying the argument disease
.
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: evidence
## . Database: ALL
## . Score: 0-1
## . Term: rs121913279
The results are shown in the table below.
pmid | year | sentence |
---|---|---|
15994075 | 2005 | The prevalence of PIK3CA mutations in gastric and colon cancer. |
15930273 | 2005 | Functional analysis of PIK3CA gene mutations in human colorectal cancer. |
23826249 | 2013 | Collectively, our data suggest that PF-04691502 exhibits potent anticancer activity in colorectal cancer by targeting both PIK3CA (H1047R) mutant CSCs and their derivatives. |
27405731 | 2016 | We developed a sensitive, simple and rapid approach to detect the low-abundance PIK3CA (H1047R) mutation in real CRC specimens, providing an effective tool for guiding cancer targeted therapy. |
Searching multiple variants
The variant2disease function retrieves the information in DisGeNET for a list of variants identified by the dbSNP identifier. The function also requires the user to specify the source database using the argument database
. By default, variant2disease function uses as source database CURATED.
<- variant2disease(
data7 variant = c("rs121913013", "rs1060500621",
"rs199472709", "rs72552293",
"rs74315445", "rs199472795"),
database = "ALL")
Visualizing the diseases associated to multiple variants
The results of querying DisGeNET with a list of variants can be visualized as a Variant-Disease Network
(Figure 13), as a Variant-Gene-Disease Network
(Figure 15) and as heatmap (Variant-Disease Class Heatmap
, Figure 16). Below, the Variant-Disease Network
.
plot( data7,
class = "Network",
prop = 10)
Figure 13: The Variant-Disease Network for a list of variants
To obtain the Variant-Gene-Disease Network
, Figure 14), change the showGenes
argument to “TRUE.”
plot( data7,
class = "Network",
showGenes = T)
Figure 14: The Variant-Gene-Disease Network for a list of variants
The results of querying DisGeNET variant information with a list of diseases can be visualized as a Variant-Disease Heatmap
by changing the class
argument to Heatmap
(Figure 15).
plot( data7,
class = "Heatmap",
prop = 10)
Figure 15: The Variant-Disease Heatmap for a list of variants
The results of querying DisGeNET variant information with a list of diseases can also be visualized as a Variant-Disease Class Network
.
plot( data7,
class = "DiseaseClass")
Figure 16: The Variant-Disease Class Heatmap for a list of variants
Searching by disease
The disease2variant function allows to retrieve the variants associated to a disease, or a list of diseases. The function uses as input the disease, or list of diseases of interest (as UMLS CUI, MeSH, OMIM, Disease Ontology, ICD9CM, NCIt, EFO, or Orphanet Identifiers), the disease vocabulary employed (OMIM
(OMIM), MESH
(MeSH),ICD9CM
(ICD9-CM), DO
(Disease Ontology), NCI
(NCI thesaurus), ORDO
(Orphanet), or EFO
(EFO) and the database (by default, CURATED
). A threshold value for the score can be set, like in the gene2disease function.
<- disease2variant(disease = c("C0752166"),
data8 database = "CLINVAR",
score = c(0, 1) )
data8
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-variant
## . Database: CLINVAR
## . Score: 0-1
## . Term: C0752166
## . Results: 157
Visualizing the variants associated to a single disease
The results of querying DisGeNET variant information with a list of diseases can be visualized as a Variant-Disease Network
(Figure 17).
plot( data8,
class = "Network")
Figure 17: The Variant-Disease Network for a single disease
Explore the evidences associated to a single disease
To explore the evidences supporting the VDAs for Schizophrenia, run the disease2evidence function. You can use the argument variant to inspect the evidences for a particular variant and Schizophrenia. You can also specify a year range for the publications
<- disease2evidence( disease = "C0036341",
data8 variant = "rs1344706",
type = "VDA",
year = c(2019, 2020),
database = "ALL",
score = c( 0.1,1 ) )
data8
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: evidence
## . Database: ALL
## . Score: 0.1-1
## . Term: C0036341
pmid | year | sentence |
---|---|---|
30852803 | 2019 | ZNF804A rs1344706 has been identified as one of the risk genes for schizophrenia. |
30962687 | 2019 | rs1344706 risk allele carriers of schizophrenia had increased gray matter in the brain regions including frontal lobe, temporal lobe, and other brain regions, but the carriers of healthy individuals had decreased gray matter and white matter integrity in the frontal lobe, central network, and other brain regions. |
30079586 | 2019 | CACNA1C-rs1006737 and ZNF804A-rs1344706 polymorphisms are among the most robustly associated with schizophrenia (SCZ) and bipolar disorder (BD), and recently with brain phenotypes. |
31122238 | 2019 | Previous studies indicated that single-nucleotide polymorphism (SNP) rs1344706 in ZNF804A was strongly associated with schizophrenia and might influence social interaction. |
If you want to inspect the evidences for Schizophrenia, and all the variants in a particular gene, use the argument gene
.
<- disease2evidence( disease = "C0036341",
data8 gene = "CACNA1C",
type = "VDA",
database = "BEFREE",
score = c( 0.1,1 ) )
data8
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: evidence
## . Database: BEFREE
## . Score: 0.1-1
## . Term: C0036341
<- extract(data8) results
pmid | year | sentence | |
---|---|---|---|
6 | 30607529 | 2019 | Both CACNA1C (rs1006737) and KCNH2 (rs3800779) were genotyped in 101 controls and 50 schizophrenia patients. |
8 | 31033230 | 2019 | We performed a comprehensive meta-analysis of all available samples from existing studies under four different genetic models (recessive model, dominant model, additive model and allele model) to further confirm whether CACNA1C rs1006737 is an authentic risk single nucleotide polymorphism (SNP) for schizophrenia. |
25 | 30079586 | 2019 | CACNA1C-rs1006737 and ZNF804A-rs1344706 polymorphisms are among the most robustly associated with schizophrenia (SCZ) and bipolar disorder (BD), and recently with brain phenotypes. |
27 | 30268820 | 2019 | Taken together, results of this study suggested that rs2007044 could affect the activation of the left IFG, which was a possible brain mechanism underlying the association between CACNA1C gene polymorphism and schizophrenia. |
29 | 29599094 | 2018 | Three SNPs (rs1625579, rs1344706 and rs4765905) showed a consistent direction of effect with previous studies and the polygenic risk score constructed using the weighted sum of these three SNPs showed a significant association with Schizophrenia in this population (OR=3.78, p=0.005). |
Searching multiple diseases
<- disease2variant(
data8 disease = c("C3150943", "C1859062", "C2678485", "C4015695"),
database = "CURATED",
score = c(0.75, 1) )
data8
## Object of class 'DataGeNET.DGN'
## . Search: list
## . Type: disease-variant
## . Database: CURATED
## . Score: 0.75-1
## . Term: C3150943 ... C4015695
## . Results: 62
Visualizing the variants associated to multiple diseases
The results of querying DisGeNET variant information with a list of diseases can be visualized as a Variant-Disease Network
, or as a Variant-Disease Heatmap
, Figure 18), by changing the class
argument from “Network” to “Heatmap.”
plot( data8,
class = "Network")
Figure 18: The Variant-Disease Network for a list of diseases
The Variant-Disease Network
can be displayed as a Variant-Disease-Gene Network
, by setting the showGenes
parameter to TRUE
.
plot( data8,
class = "Network",
showGenes = T)
Figure 19: The Variant-Gene-Disease Network for a list of diseases
The results can be visualized as a Heatmap.
plot( data8,
class = "Heatmap",
prop = 10)
## [1] "warning: dataframe of 62 rows has been reduced to 50 rows."
Figure 20: The Variant-Disease Heatmap for a list of diseases
Retrieving Disease-Disease Associations from DisGeNET
The disgenet2r package also allows to obtain a list of diseases that share genes or variants with a particular disease, or disease list (disease-disease associations, or DDAs).
Searching DDAs by genes for a single disease
To obtain disease-disease associations, use the disease2disease_by_gene function. This function uses as input a disease, in the same format that in disease2gene, the database to perform the search (by default, CURATED), and the argument ndiseases
, the number of diseases that share genes with the disease(s) of interest. By default ndiseases = 10
. The output is a DataGeNET.DGN
object that contains the top 10 diseases that share genes with the disease that has been searched.
<- disease2disease_by_gene(
data9 disease = "C0010674",
database = "CURATED", ndiseases = 5 )
data9
## Object of class 'DataGeNET.DGN'
## . Search: single
## . Type: disease-disease-gene
## . Database: CURATED
## . Score:
## . Term: C0010674
## . Results: 5
<- extract(data9)
qr head(qr[c("disease1_name", "disease2_name","jaccard_genes","ngenes", "pvalue")])
## disease1_name disease2_name jaccard_genes ngenes pvalue
## 1 Cystic Fibrosis Pulmonary Cystic Fibrosis 0.818 9 0
## 2 Cystic Fibrosis Fibrocystic Disease of Pancreas 0.818 9 0
## 3 Cystic Fibrosis Idiopathic bronchiectasis 0.364 4 0
## 4 Cystic Fibrosis Liddle Syndrome 0.250 3 0
## 5 Cystic Fibrosis BRONCHIECTASIS WITH OR WITHOUT ELEVATED SWEAT CHLORIDE 1 0.182 2 0
The DataGeNET.DGN
object produced by the disease2disease_by_gene function also contains the Jaccard Index, also known as the Jaccard similarity coefficient for each disease pair. The Jaccard Coefficient is a similarity metric, computed as as the size of the intersection divided by the size of the union of two sample sets, in this case, the genes associates to each disease:
\[\begin{equation*} J(A, B) = \frac{\mid A \cap B \mid}{\mid A \cup B \mid} \end{equation*}\]
The pvalue
column estimates the significance of the Jaccard coefficient for a list of disease pairs. The estimation of the pvalue is estimated by randomizing 1000 times.
Visualizing the diseases associated to a single disease
The plot function applied to the DataGeNET.DGN
object generated by the disease2disease_by_gene function results in a Disease-Disease Network
, where the node in dark blue is the disease of interest and nodes in ligth blue are the diseases that share genes with it (Figure 21). The node size is proportional to the number of genes associated to each disease.
plot( data9,
class = "Network",
layout="layout.lgl" ,
prop = 0.4 )
Figure 21: The Disease-Disease Network by shared genes for MELAS Syndrome
To obtain the fully connected network, set the class
argument of the plot function to Diseasome
(Figure 22). In this network, the nodes are the diseases of interest, and the node size is proportional to the number of genes associated with them, and the edges size is proportional to the number of genes that are shared between the diseases they are connecting. The nodes in dark blue are the query diseases, and in light blue, the diseases that share genes with them.
plot( data9,
class = "Diseasome",
layout="layout.lgl" ,
prop = 5 )
Figure 22: The Diseasome plot for MELAS Syndrome
A second option to visualize the results of disease2disease_by_gene is as a Disease-Disease Barplot
(Figure 23), obtained by setting the class
argument to Barplot
. The Disease-Disease Barplot
shows the top n diseases that share genes with the disease of interest. This barplot also provides the jaccard index for each disease pair.
plot( data9,
class="Barplot")
Figure 23: Disease-Disease Barplot visualizing the top 5 diseases associated to MELAS Syndrome
Searching DDAs via genes for multiple diseases
Visualizing the diseases associated to multiple diseases
The function disease2disease_by_gene can receive a list of diseases as input, in any of the previously described vocabularies. It will retrieve the top 10 diseases that share genes with each of the diseases in the input list. This results can be visualized as a Disease-Disease Network
, a Diseasome plot
, or as a Barplot
.
Table 4: Disease list selected for illustrating the disease2disease_by_gene function
UMLS_CUI | Disease_Name |
---|---|
C0162671 | MELAS Syndrome |
C0023264 | Leigh Disease |
C0917796 | Optic Atrophy, Hereditary, Leber |
<- c("C0013182", "C0013221", "C3658290", "C0860207", "C1274933")
diseasesOfInterest <- disease2disease_by_gene(
data10 disease = diseasesOfInterest,
database = "CURATED", ndiseases = 5)
To obtain the network, set the class
argument of the plot function to Network
(Figure 24). In this network, the nodes are the diseases of interest, and the node size is proportional to the number of genes associated with them. On the other hand, the edges size is proportional to the number of genes that are shared between the diseases they are connecting.
plot( data10,
class = "Network",
prop = 0.1)
Figure 24: The Disease-Disease Network by shared genes for a list of diseases
To obtain the fully connected network, set the class
argument of the plot function to Diseasome
(Figure 25). In this network, the nodes are the diseases of interest, and the node size is proportional to the number of genes associated with them. On the other hand, the edges size is proportional to the number of genes that are shared between the diseases they are connecting.
plot( data10,
class = "Diseasome",
prop = 0.1)
Figure 25: The Diseasome plot for a list of diseases
To further inspect the relationship between a pair of diseases, create a Venn Diagram by setting the class
argument of the plot function to Venn
(Figure 26).
plot( data10,
class="Venn")
Figure 26: The Venn Diagram for 3 diseases
The Venn diagram for 4 diseases
<- disease2disease_by_gene(
data11 disease = c("C0018801", "C0028754", "C0004153", "C0011849"),
database = "CURATED",
ndiseases = 5)
plot( data11,
class="Venn"
)
Figure 27: The Venn Diagram for a list of 4 diseases
The Venn diagram for 5 diseases
<- disease2disease_by_gene(
data11 disease = c("C0018801", "C0028754", "C0004153", "C0011849", "C0021368"),
database = "CURATED",
ndiseases = 5)
plot( data11,
class="Venn"
)
Figure 28: The Venn Diagram for a list of 5 diseases
Performing a disease enrichment
The disease_enrichment function receives a list of genes and performs an enrichment analysis over the diseases in DisGeNET. The input list of genes should be identified with HGNC symbols, or Entrez Gene Identifiers. The vocabulary should be specified using the parameter vocabulary
. By default, vocabulary = "HGNC"
. The function has other optional argument: the source database (by default, database = “CURATED”
). The background for the Fisher test are all the genes in DisGeNET. The p-values resulting from the multiple Fisher tests are corrected for false discovery rate using the Benjamini-Hochberg method.
To perform the enrichment, run:
<- disease2gene(disease = "C0004352", database = "GENOMICS_ENGLAND")
list_of_genes
<- list_of_genes@qresult$gene_symbol
list_of_genes
<-disease_enrichment( entities =list_of_genes, vocabulary = "HGNC",
res_enrich database = "CURATED" )
<- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")] table1
The column Ratio represents the number of genes from the input list that are associated to the disease. The BgRatio are the number of genes associated to the disease from the length of the universe, in the example, all genes in DisGeNET curated (9703).
Description | FDR | Ratio | BgRatio |
---|---|---|---|
Autistic Disorder | 0.0000000 | 20/20 | 261/9703 |
Intellectual Disability | 0.0000000 | 13/20 | 447/9703 |
Global developmental delay | 0.0000003 | 7/20 | 133/9703 |
Neurodevelopmental Disorders | 0.0000009 | 6/20 | 93/9703 |
Autism Spectrum Disorders | 0.0000207 | 5/20 | 85/9703 |
Abnormal behavior | 0.0001107 | 3/20 | 16/9703 |
Seizures | 0.0014839 | 5/20 | 218/9703 |
MENTAL RETARDATION, AUTOSOMAL DOMINANT 40 | 0.0029005 | 2/20 | 9/9703 |
Schizophrenia | 0.0038198 | 8/20 | 883/9703 |
Cardiovascular Abnormalities | 0.0076679 | 2/20 | 16/9703 |
To visualize the results of the enrichment, use the function plot. Use the argument cutoff
to set a minimum p value threshold, and the argument limit
to reduce the number of records shown. By default, limit=50
. Use the argument nchars
to control the lenght of the disease name.
plot( res_enrich, class = "Enrichment", count =3, cutoff= 0.05, nchars=70)
Figure : The Enrichment plot for a list of genes
<-disease_enrichment( entities =list_of_genes, vocabulary = "HGNC",
res_enrich database = "ALL" )
<- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")] table1
Description | FDR | Ratio | BgRatio |
---|---|---|---|
Autistic Disorder | 0.00e+00 | 20/20 | 1112/21666 |
Neurodevelopmental Disorders | 0.00e+00 | 13/20 | 535/21666 |
Intellectual Disability | 0.00e+00 | 18/20 | 2165/21666 |
Pervasive Development Disorder | 0.00e+00 | 11/20 | 328/21666 |
Global developmental delay | 0.00e+00 | 15/20 | 1825/21666 |
Autism Spectrum Disorders | 0.00e+00 | 12/20 | 1071/21666 |
Autistic behavior | 0.00e+00 | 8/20 | 261/21666 |
Developmental delay (disorder) | 1.00e-07 | 9/20 | 584/21666 |
Dysmorphic facies | 1.40e-05 | 6/20 | 271/21666 |
Profound Mental Retardation | 3.03e-05 | 5/20 | 160/21666 |
A disease_enrichment for a list of variants can also be performed. The input list of variants should be identified with dbSNP identifiers and the vocabulary should be specified using the parameter vocabulary = "DBSNP"
. The function has other optional argument: the source database (by default, database = “CURATED”
). The background for the Fisher test are all the genes in DisGeNET. The p-values resulting from the multiple Fisher tests are corrected for false discovery rate using the Benjamini-Hochberg method.
To perform the enrichment, run:
<- disease2variant(disease = "C0004352", database = "CLINVAR")
list_of_variants
<- as.character(list_of_variants@qresult$variantid)
list_of_variants
<-disease_enrichment( entities =list_of_variants, vocabulary = "DBSNP",
res_enrich database = "CURATED" )
<- res_enrich@qresult[1:10, c("Description", "FDR", "Ratio", "BgRatio")] table1
Description | FDR | Ratio | BgRatio |
---|---|---|---|
Autistic Disorder | 0 | 69/69 | 181/168051 |
Seizures | 0 | 22/69 | 417/168051 |
Global developmental delay | 0 | 20/69 | 505/168051 |
Generalized hypotonia | 0 | 14/69 | 164/168051 |
Delayed speech and language development | 0 | 14/69 | 192/168051 |
Poor school performance | 0 | 15/69 | 411/168051 |
Constipation | 0 | 9/69 | 49/168051 |
Mild Mental Retardation | 0 | 8/69 | 48/168051 |
Aggressive behavior | 0 | 7/69 | 22/168051 |
Acid reflux | 0 | 8/69 | 58/168051 |
To visualize the results of the enrichment, use the function plot. Use the argument cutoff
to set a minimum p value threshold, and the argument limit
to reduce the number of records shown. By default, limit=50
.
plot( res_enrich, class = "Enrichment", count =6, cutoff= 0.05)
Figure : The Enrichment plot for a list of genes
Expanding DisGeNET data with other resources available in the LOD cloud
Currently, Wikipathways-RDF has issues and these federated queries do not work!
Retrieving the pathways associated to a disease
To obtain the pathways from WikiPathways associated to a disease, use the disease2pathway function.
library("SPARQL")
<- disease2pathway( disease = "C0018801" ,
dis2path database = "ALL", score = c(0.5, 1))
dis2path# head(dis2path@qresult)
# qr <- extract(dis2path)
# head(qr, 3)
Retrieving the diseases associated to a given pathway
To obtain the pathways from WikiPathways associated to a disease, use the disease2pathway function.
<- pathway2disease( pathway = "WP1591" ,
path2dis database = "CURATED",
score = c(0.9, 1))
path2dishead(path2dis@qresult)
<- extract(path2dis)
qr head(qr, 3)
Retrieve the drug targets for a disease
The function disease2compound allows retrieved the chemicals in ChEMBL that target the genes associated to the given disease, or list of diseases. The function produces a DataGeNET.RDF object with the input disease(s), the associated genes, the DisGeNET score, and the chemicals associated to the disease proteins. If there were no chemicals found, the genes are still included in the output. It is important to higlight that because of ChEMBL restrictions for their SPARQL endpoint if 1000 compounds are retrieved for a single protein it probably means that there are more than 1000. (See more information about ChEMBL SPARQL endpoint at https://www.ebi.ac.uk/rdf/what-are-limitations-sparql-endpoints).
<- disease2compound(
dis2com disease = c("C0020538"),
database = "CURATED" )
dis2com<- extract(dis2com)
qr head(qr, 3)
Retrieving Gene Ontology data for a gene
<- gene2biologicalprocess(gene = "351")
gene2bp head(gene2bp@qresult)
<- gene2cellcomponent(gene = "351")
gene2cc head(gene2cc@qresult)
<- gene2molecularfunction(gene = "351")
gene2mf head(gene2mf@qresult)
<- gene2indication( gene = "1588" )
gene2indi head(gene2indi@qresult)
Retrieving Gene Ontology data for a disease
The functions disease2biologicalprocess, disease2molecularfunction, and disease2cellcomponent can retrieve the Gene Ontology (GO) molecular function, GO biological process and GO cellular component of the genes associated to a disease or a list of diseases.
Retrieving the biological processes associated to a disease
For a disease, the biological process information can be retrieved in GO terms using the function disease2biologicalprocess that receives as input an NCBI gene identfier. The function produces a DataGeNET.RDF object containing the disease, the genes and their GOBP terms.
<- disease2biologicalprocess(
disease2bp disease = "C0036341",
database = "CURATED",
score = c( 0.6,1)
)
disease2bp<- extract(disease2bp)
disease2bp head(disease2bp[c("gene_product_label","go_label")])
Retrieving the molecular functions of the genes associated to a disease}
For a particular disease, the molecular activities of associated genes can be retrieved as GO terms using the function disease2molecularfunction. The function produces a DataGeNET.RDF object containing the disease, the genes and their GOMF terms.
<- disease2molecularfunction(
disease2mf disease = "C0002395",
database = "UNIPROT"
)
disease2mf<- extract(disease2mf)
disease2mf head(disease2mf[c("gene_product_label", "go_label")])
Retrieving the molecular functions of the genes associated to a disease
For a given gene, the cell component information (i.e. where gene products are active) can be retrieved as GO terms using the function gene2cellcomponent that receives as input an NCBI gene identfier. The function produces a DataGeNET.RDF object containing the disease, the genes and their GO CC terms.
<- disease2cellcomponent( disease = "C0002395",
disease2cc database = "MGD")
<- extract(disease2cc)
disease2cc head(disease2cc[c("gene_product_label", "go_label")])
Get DisGeNET data version
get_disgenet_version()
## [1] "{ api_version : 1.2.0 , database_version : DisGeNET Database v7 , lastUpdate : May 2021 , comments : - Added GDA and VDA evidences endpoints.- Added the disease similarity endpoint.- Added the disease mapping by name endpoint. }"
COPYRIGHT
Copyright (C) 2021, IBI group.
disgenet2r is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
disgenet2r is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
How to cite disgenet2r
Piñero, J., Ramírez-Anguita, J. M., Saüch-Pitarch, J., Ronzano, F., Centeno, E., Sanz, F., & Furlong, L. I. (2020). The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Research, 48(D1), D845-D855.