GO analysis: points to note

Ref: Nine quick tips for pathway enrichment analysis, PLOS Computational Biology, 2022, https://doi.org/10.1371/journal.pcbi.1010348

Tip 1: Before starting, clarify which analysis you would like to perform

What analysis type

Pathway enrichment analysis (PEA) does not give clues about the active or inhibited status of the pathways. More appropriately, PEA provides information about how genes help carry out pathways.

overrepresentation analysis (ORA)

  • focus of ORA methods is the gene set

  • ORA outputs all pathways enriched in the query gene list as a whole, and mainly uses a nonranked list (except one option in g:Profiler g:GOSt, which uses a minimum hypergeometric value-based method).
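As a rough illustration of what an ORA test computes, the sketch below evaluates an upper-tail hypergeometric p-value for a single pathway. This is a common choice for ORA, but it is an assumption here and not necessarily what any specific tool named in these notes does. The function name `ora_p_value` and all the counts are hypothetical.

```python
from math import comb


def ora_p_value(n_background, n_in_pathway, n_query, n_overlap):
    """Upper-tail hypergeometric p-value: P(overlap >= n_overlap).

    n_background: number of annotated genes in the background (assumed universe)
    n_in_pathway: number of background genes annotated to the pathway
    n_query:      number of genes in the query list
    n_overlap:    number of query genes that fall in the pathway
    """
    max_overlap = min(n_in_pathway, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 7 query genes, 3 of them in a 100-gene pathway,
# against a background of 20,000 annotated genes.
p = ora_p_value(n_background=20000, n_in_pathway=100, n_query=7, n_overlap=3)
print(f"nominal p-value: {p:.2e}")
```

The result is a nominal p-value for one pathway; it still needs multiple-testing correction across all pathways tested (see Tip 5).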

gene set enrichment analysis (GSEA) approaches

  • indicates that a given pathway is enriched in the transcriptome data
  • focus of GSEA techniques is the ranked pathways list
  • it is advisable to choose GSEA methods when there is uncertainty about the cutoff value.
  • GSEA indicates the pathways that are enriched in genes located at both extreme ends of a ranked gene list, and a higher ranked pathway indicates that more genes are located at the very top or at the very end of this list
    • competitive methods:
      • BioPAX-Parser (BiP) [35]
      • pathDIP [36,37]
      • SPIA [38]
      • CePaORA [39]
      • PathNet [40]
    • self-contained methods:
      • CePa [39]
      • GSEA [25–27]
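To make the idea of "genes located at the extreme ends of a ranked list" concrete, here is a minimal sketch of an unweighted running-sum enrichment score. It is a simplification: the GSEA tool itself weights hits by their ranking metric and assesses significance by permutation, neither of which is shown. The function name `enrichment_score` and the gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: the largest signed deviation from zero.

    Walk down the ranked list, stepping up on a gene-set member and down
    otherwise. A large positive score means the set clusters at the top of
    the list; a large negative score means it clusters at the bottom.
    """
    gene_set = set(gene_set)
    n_hits = sum(1 for g in ranked_genes if g in gene_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for g in ranked_genes:
        running += 1.0 / n_hits if g in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, most up-regulated first.
ranked = ["A", "B", "C", "D", "E", "F"]
print(enrichment_score(ranked, {"A", "B"}))  # set sits at the top of the list
print(enrichment_score(ranked, {"E", "F"}))  # set sits at the bottom of the list
```

Because the score uses the whole ranking, no significance cutoff on individual genes is needed, which is why GSEA-style methods are suggested when the cutoff is uncertain.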

Which data type

  • For unordered lists of genes, researchers can use g:Profiler g:GOSt [8–10], Enrichr [28,29], and BioPAX-Parser [35,47].
  • If the genes are ranked, g:Profiler g:GOSt can take advantage of this information and generate rank-based functional enrichment results.
  • **If the input data are gene expression levels, they can be analyzed through GSEA [27].**
  • Note: with expression-level input, GSEA can be used directly.
  • pathDIP [37], instead, can assist with curated analyses based on scientific literature.

Tip 2: Ensure the quality of your input genes or genomic regions

Make sure that what you put in uses the correct IDs, is a meaningful gene list, and carries meaningful expression values.
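A minimal sketch of a pre-flight check on an input gene list follows. It assumes the input is a list of gene symbols as strings; the helper `clean_gene_list` is hypothetical, and the date-like pattern targets one well-known failure mode (spreadsheets converting some symbols into dates). It does not replace proper ID mapping against a current annotation.

```python
import re

# Symbols that a spreadsheet may have converted to dates, e.g. "1-Mar" or "2-Sep".
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def clean_gene_list(raw_ids):
    """Return (cleaned, suspicious): trimmed unique IDs, and date-like entries."""
    cleaned, suspicious, seen = [], [], set()
    for raw in raw_ids:
        gene = raw.strip()
        if not gene:
            continue  # skip empty entries
        if DATE_LIKE.match(gene):
            suspicious.append(gene)  # needs manual repair before analysis
            continue
        if gene not in seen:
            seen.add(gene)
            cleaned.append(gene)
    return cleaned, suspicious


cleaned, suspicious = clean_gene_list([" PDK1", "PGK1", "PDK1", "", "1-Mar"])
print(cleaned, suspicious)
```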

Tip 3: Use multiple PEA tools, not only one

We suggest that all practitioners performing a functional enrichment analysis employ at least two different PEA tools.

  • If the user had an unranked list of gene symbols, we would suggest applying g:Profiler g:GOSt [10], Enrichr [28], and GeneTrail [52] to it, and then comparing their results.
  • To perform pathway enrichment with more than a single database, users can employ cPEA [48], a software tool able to deal with several pathway databases using the BioPAX language [61] to store and represent pathways.
  • Or they can use BiP [47] by selecting the “Whole PathwayCommon Data” option, which performs cross enrichment using the whole collection of Pathway Commons databases [62], encoded in BioPAX and automatically downloaded locally.
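One simple way to compare the results of two tools is to look at the overlap of their significant terms. The sketch below assumes each tool's output has already been reduced to a list of term identifiers on a shared vocabulary; the function `compare_tools` and the term lists are hypothetical, not real tool output.

```python
def compare_tools(terms_a, terms_b):
    """Summarize the agreement between two sets of significant terms."""
    a, b = set(terms_a), set(terms_b)
    shared, union = a & b, a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: 1.0 means identical term sets, 0.0 means no overlap.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical term identifiers standing in for exported results.
tool_a_terms = ["term_1", "term_2", "term_3"]
tool_b_terms = ["term_2", "term_3", "term_4"]
summary = compare_tools(tool_a_terms, tool_b_terms)
print(summary)
```

Terms reported by both tools are the more robust candidates; terms reported by only one tool deserve a closer look at the database version and statistical test behind them.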

Tip 4: Document all your PEA tests and their details

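Remember to keep good notes: every parameter used and every database version should be written down clearly. This is the dry-lab equivalent of an experimental record.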

Box 1: Example of PEA test details

My test ID: 2022-02-04, h10:02 EST.

My input genes: AK4, ALDOC, EGLN1, FAM162A, MTFP1, PDK1, PGK1.

My input genes’ type: gene symbols.

Source: D. Cangelosi and colleagues [69].

Disease: neuroblastoma.

Tool: g:Profiler g:GOSt.

Access: online via Google Chrome browser.

Version: e104_eg51_p15_3922dba.

URL: https://biit.cs.ut.ee/gprofiler/gost

Organism: Homo sapiens.

Query: unordered genes.

Statistical domain scope: only annotated genes.

Significance threshold: g:SCS threshold.

User threshold: 0.005.

Data sources: default.

All the other parameters: default.

My output file(s) name(s): gprofiler_gost_NB_2022-02-04_h1002_output.csv

My output file(s) folder: /home/davide/PEA_analyses/neuroblastoma/

My output file(s) location: bioinformatics-laptop-2021 (Dell Latitude E5420).
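The same details can also be kept in a machine-readable form next to the output files. The sketch below stores a subset of the Box 1 fields in a dataclass and serializes it to JSON; the class name `PeaTestRecord` and its field names are hypothetical choices, while the values are copied from Box 1.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PeaTestRecord:
    test_id: str
    input_genes: list
    input_type: str
    tool: str
    tool_version: str
    url: str
    organism: str
    significance_threshold: str
    user_threshold: float
    other_parameters: dict = field(default_factory=dict)
    output_file: str = ""


record = PeaTestRecord(
    test_id="2022-02-04, h10:02 EST",
    input_genes=["AK4", "ALDOC", "EGLN1", "FAM162A", "MTFP1", "PDK1", "PGK1"],
    input_type="gene symbols",
    tool="g:Profiler g:GOSt",
    tool_version="e104_eg51_p15_3922dba",
    url="https://biit.cs.ut.ee/gprofiler/gost",
    organism="Homo sapiens",
    significance_threshold="g:SCS threshold",
    user_threshold=0.005,
    other_parameters={"query": "unordered genes", "data_sources": "default"},
    output_file="gprofiler_gost_NB_2022-02-04_h1002_output.csv",
)

# Serialize the record; writing it to disk next to the results is left to the user.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```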

Tip 5: Always use the corrected p-value, and not the nominal one

  • The terms adjusted p-value, corrected p-value, and false discovery rate (FDR) value are often used as synonyms in the scientific literature.
  • We suggest using an adjusted p-value threshold of 0.005 (that is, 5 × 10⁻³), as recommended by Benjamin and colleagues [75].
  • We therefore suggest using the p.adj < 0.005 threshold for a first strict analysis of the results, then repeating the test with a more permissive threshold such as p.adj < 0.01, and then again with an even higher threshold, such as the traditional p.adj < 0.05. Depending on the characteristics of the experiment, results found with one particular threshold might be more suitable than results found with other thresholds.
  • Remember the principle: go from strict to permissive.
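The strict-to-permissive sweep can be sketched as follows. The example uses the Benjamini-Hochberg procedure as one possible correction; that choice is an assumption, since tools apply their own corrections (Box 1, for instance, records the g:SCS threshold of g:Profiler). The function names, pathway labels, and nominal p-values are all hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_at(p_adj_by_term, thresholds=(0.005, 0.01, 0.05)):
    """List the terms passing each threshold, from strict to permissive."""
    return {
        t: sorted(term for term, p in p_adj_by_term.items() if p < t)
        for t in thresholds
    }


# Hypothetical nominal p-values for four pathways.
nominal = {"pathway_A": 0.0001, "pathway_B": 0.004, "pathway_C": 0.02, "pathway_D": 0.5}
p_adj = dict(zip(nominal, benjamini_hochberg(list(nominal.values()))))
sweep = significant_at(p_adj)
for threshold, terms in sweep.items():
    print(f"p.adj < {threshold}: {terms}")
```

Running the sweep in one pass makes it easy to see which pathways survive only at the more permissive thresholds.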

Tip 6: Keep in mind that your PEA results can be strongly affected by the statistical tests and the visualization techniques you use

Statistical tests.

  • Different statistical methods give different results.
  • There is no single best method, only the method that fits your experimental design.

Visualization

  • Enrichment Maps [80] and enrichplot [81] for biological pathways,
  • AutoAnnotate [82] for networks,
  • REVIGO [83] and CirGO [84] for GO annotations are a few examples of different visualization techniques and contents.
  • Network visualization techniques can also be used to detect a lower adjusted p-value threshold (Tip 5).

Tip 7: Consider using subgroups of correlated genes instead of all your input genes

Instead of using all genes as input for the PEA, we therefore suggest that bioinformatics practitioners detect subgroups of correlated genes and perform the PEA on each of these subgroups alone.

Do not run everything in one go; consider working with subgroups.

  • Subgroups of correlated genes can be found, for example, through protein–protein interaction network tools such as IID [85–88], STRING [89–92], GeneMANIA [93–97], or Reactome Functional Interaction Network (Reactome FI) [98,99].
  • Some R packages have recently been released: pathfindR [102] and netGO [103], which exploit protein–protein interaction networks to produce more accurate PEA results.
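As a rough sketch of the subgrouping step, the code below splits an input gene list into connected components of an interaction network. It assumes the interactions have already been exported from a network tool as pairs of gene names. Connected components are only a crude stand-in for the clustering such tools provide, and the function `gene_subgroups` and the gene labels are hypothetical.

```python
from collections import defaultdict


def gene_subgroups(genes, interactions):
    """Group input genes into connected components of the interaction graph."""
    gene_set = set(genes)
    neighbours = defaultdict(set)
    for a, b in interactions:
        # Ignore interactions that involve genes outside the input list.
        if a in gene_set and b in gene_set:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, groups = set(), []
    for start in genes:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:  # depth-first traversal of one component
            gene = stack.pop()
            group.append(gene)
            for other in neighbours[gene]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(group))
    return groups


# Hypothetical genes and interaction pairs.
genes = ["G1", "G2", "G3", "G4", "G5"]
edges = [("G1", "G2"), ("G2", "G3"), ("G4", "G5"), ("G5", "X")]
subgroups = gene_subgroups(genes, edges)
print(subgroups)
```

Each resulting subgroup can then be submitted to the PEA tool as its own query.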

Tip 8: Use the (recent) scientific literature to review your PEA results

We therefore suggest that practitioners manually search the literature for scientific studies about the significant gene–pathway associations found by the functional enrichment analysis, and about the role of the genes inside the enriched pathways.

Look up papers to confirm whether the literature has been updated.

Tip 9: Ask a wet lab biologist or a clinician to review your PEA results

A wet lab biologist or a clinician should review these results and clearly say whether they make sense or whether they contain mistakes or inappropriate information.

Ask an expert to look over the results with their own eyes.