Data Analysis FAQs

  • What software do I need to align Illumina reads to reference sequences in my own lab?

       You can use commercial software such as CLCbio Workbench or open source software such as MAQ, BWA and Bowtie2 (all available on HPC accessible to UCI researchers).

  • What software do I need to de novo assemble Illumina reads in my own lab?

     For genome assembly, you can use commercial software such as CLCbio Workbench or open-source software such as ABySS, Velvet and ALLPATHS (all available on HPC). For de novo transcriptome assembly, Trans-ABySS, Velvet-Oases and Trinity can be used. For reference-based transcriptome assembly, Cufflinks is recommended.

  • What software do I need to analyze Illumina reads from RNA-Seq in my own lab?

      You can use commercial software such as CLCbio Workbench, or open-source software (all available on HPC): TopHat or STAR to align your reads, Cufflinks for transcript assembly, eXpress, HTSeq or RSEM for abundance quantification, and cuffdiff for differential expression. Statistical analysis of differential expression can also be done with R packages such as edgeR and DESeq2.

  • Where can I find the shared genome index on HPC?

      You can find the shared prebuilt genome index files for mouse and human on HPC at /data/apps/commondata/.

  • How do I interpret the columns in the differential expression analysis?

       col2: log2 fold change (MAP): condition treated vs untreated
       col3: standard error: condition treated vs untreated
       col4: Wald statistic: condition treated vs untreated
       col5: Wald test p-value: condition treated vs untreated
       col6: BH (Benjamini-Hochberg) adjusted p-value, which controls the FDR (false discovery rate)

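As a rough illustration of how col5 and col6 relate, the sketch below recomputes Benjamini-Hochberg adjusted p-values from raw p-values and then keeps genes that pass both an adjusted p-value cutoff and a fold-change cutoff. The function names, the example values and the cutoffs (0.05 and a log2 fold change of 1) are assumptions chosen for illustration, not values used by the facility. Analysis packages may also filter genes before adjustment, so numbers recomputed this way need not match a delivered table exactly.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(rows, padj_cutoff=0.05, lfc_cutoff=1.0):
    """rows holds (gene, log2_fold_change, padj) tuples; keep genes passing both cutoffs."""
    return [gene for gene, lfc, padj in rows
            if padj < padj_cutoff and abs(lfc) >= lfc_cutoff]


# Hypothetical raw p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.20]
padj = bh_adjust(pvals)
print([round(p, 4) for p in padj])
```

A positive log2 fold change means higher expression in the treated condition, a negative one means lower, and a value of 1 corresponds to a two-fold change.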
  • How do I choose between different alignment software?

      See here

  • How do I start my own analysis on NGS data?

      See here

  • What free software do you recommend to view my alignment data?

      You can use Integrative Genomics Viewer (IGV), Tablet or EagleView as stand-alone applications. You can also visualize your data online with the UCSC Genome Browser.

  • Can I reanalyze my data using other software programs?

       Yes. Our data are in standard formats (BAM/SAM, FASTA/FASTQ, BED, VCF, WIG etc.) and can be viewed and analyzed with any software that accepts one of these formats as input.
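Because these formats are plain and well documented, they can also be read with a few lines of your own code. The sketch below parses FASTQ records and computes a mean base quality per read. It assumes the simple four-lines-per-record layout and Phred+33 quality encoding; the function names and the two example reads are made up for illustration.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in quality) / len(quality)


# Two hypothetical reads held in memory instead of a file on disk.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n")
records = list(read_fastq(example))
for read_id, seq, qual in records:
    print(read_id, seq, mean_phred(qual))
```

For real files, pass an open file handle (or a `gzip.open(path, "rt")` handle for compressed data) in place of the in-memory example.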

  • How do I get gene annotation information?

       You can download gene annotation from the UCSC Genome Browser or EMBL.
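Downloaded annotation often comes as a GTF file, although this FAQ does not say which format you will get. Assuming GTF (nine tab-separated columns, with the last column holding `key "value";` pairs), a minimal sketch for pulling the gene and transcript identifiers out of one line could look like this. The function name and the example line are hypothetical, and attribute values that themselves contain a semicolon are not handled.

```python
def parse_gtf_line(line):
    """Split one GTF line into its fixed columns plus a dict of attributes."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attributes = fields
    attrs = {}
    # Attributes look like: gene_id "X"; transcript_id "Y";
    for item in attributes.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attrs[key] = value.strip().strip('"')
    return {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),  # GTF coordinates are 1-based and inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attrs,
    }


# A made-up exon record, not taken from any real annotation release.
example_line = 'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "GENE1"; transcript_id "TX1";'
record = parse_gtf_line(example_line)
print(record["attributes"]["gene_id"], record["start"], record["end"])
```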

  • What other software programs can be used for downstream statistical data analysis?

       R, MATLAB (available on HPC).

  • What text editor programs do you recommend?

       Emacs on Unix/Linux, TextdEdit on Windows.

  • Do you have information about NIH guidelines for data sharing and security?

      New Rules for NIH Genomic Data Sharing & Security 2015

  • What free software do you recommend for further analysis of gene expression data?

        GenePattern, developed by the Broad Institute, provides access to tools for gene expression, proteomics, SNP analysis, flow cytometry, RNA-seq analysis and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research. Cyber-T, developed by IGB at UCI, is an easy-to-use tool for microarray statistical analysis. RobiNA (available on HPC) is another easy-to-use open-source pipeline for microarray and RNA-seq analysis.

  • What commercial software do you recommend for further analysis of gene expression data such as pathway analysis?

        GeneSpring, Partek, Ingenuity (IPA) and GeneGo MetaCore. CLCbio is also available on HPC.

Get to Know IPA:  A series of live webinars geared for new users.  This link also provides recorded webinars covering:
  • Discover IPA Introduction
  • Uploading Data in IPA
  • Interpreting the Results of your Core Analysis in IPA
  • Tips and Tricks for doing RNA-Seq Analysis in IPA
IPA Tutorials: Step-by-step instructions for specific tasks within IPA.
IPA Training Videos: 2-5 min training videos on specific features in IPA.
Regulator Effects Analysis: Integrates Upstream Regulator results with Downstream Effects results to create causal hypotheses that explain what may be occurring upstream to cause particular phenotypic or functional outcomes downstream.
Molecule Activity Predictor: Simulate the directional consequences on downstream molecules and the inferred activity upstream in the network or pathway.
BioProfiler: Quickly profile a disease or phenotype by understanding its associated genes and compounds. Identify genes known to be causally relevant as potential targets, or identify targets of toxicity, associated known drugs, biomarkers and pathways.
Causal Network Analysis & Upstream Regulators: Identify upstream molecules that control the expression of the genes in your datasets.
Pathway Activity Analysis: Determine whether Canonical Pathways are increased or decreased based on your data.
Comparison Analysis: Quickly visualize trends and similarities across analyses and datasets.
Licensed IPA users have access to the customer support team (PhD scientists) by phone and email, M-F 6am to 5pm PST.
Customer Support

Phone: 650.381.5111

support@ingenuity.com
  • What open-source software do you recommend for further analysis of gene expression data such as pathway analysis?

        DAVID, GSEA, topGO and WGCNA.