Course Material for Genomics Data Mining and Statistics

Documentation

Please visit https://montilab.github.io/BS831/

Requirements

Installation

Install the package from GitHub.

devtools::install_github("montilab/BS831")
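The command above assumes that devtools is already installed. As a minimal sketch, the check below installs it first when it is missing. The helper `missing_packages` is a hypothetical name used for illustration and is not part of BS831; the installation itself is guarded by `interactive()` so that sourcing the snippet in a script does not trigger a download.

```r
# Hypothetical helper (not part of BS831): return the subset of `pkgs`
# that cannot be loaded in the current R library.
missing_packages <- function(pkgs) {
  is_available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_available]
}

# Only install when run interactively, since both steps need network access.
if (interactive()) {
  to_install <- missing_packages("devtools")
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  devtools::install_github("montilab/BS831")
}
```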

Developer notes on rebuilding documentation.

library(pkgdown)

# Use lazy=FALSE to rebuild entire site
build_site(pkg=".", lazy=TRUE)

# Only rebuild changed articles
build_articles(pkg=".", lazy=TRUE)

# Rebuild one article; `name` is the article's path relative to vignettes/,
# without the file extension
build_article(name, pkg=".", lazy=FALSE)
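To find a valid value for `name`, one option is to list the R Markdown files under `vignettes/` and strip their extensions. The sketch below assumes the articles are `.Rmd` files stored under `vignettes/` in the package root; `list_article_names` is a hypothetical helper and is not part of pkgdown or BS831.

```r
# Hypothetical helper (not part of pkgdown or BS831): list candidate article
# names, i.e. .Rmd paths relative to vignettes/ with the extension removed.
list_article_names <- function(pkg = ".") {
  vignette_dir <- file.path(pkg, "vignettes")
  rmd_files <- list.files(vignette_dir, pattern = "\\.Rmd$", recursive = TRUE)
  tools::file_path_sans_ext(rmd_files)
}

# Example usage from the package root (interactive sessions only).
if (interactive()) {
  article_names <- list_article_names(".")
  print(article_names)
  # Rebuilds the first article found; substitute the article you changed.
  if (length(article_names) > 0) {
    pkgdown::build_article(article_names[1], pkg = ".", lazy = FALSE)
  }
}
```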

Course Description

The goal of this course is for participants to develop a solid understanding of, and hands-on skills in, the design and analysis of ‘omics’ data from microarray and high-throughput sequencing experiments. Topics include data collection and management, statistical techniques for identifying genes that are differentially expressed across biological conditions, the development of prognostic and diagnostic models for molecular classification, and the identification of new disease taxonomies based on molecular profiles. These topics will be covered using real examples, extensively documented hands-on exercises (see the Markdowns menu), class discussion, and critical readings. Principles of reproducible research will be emphasized. Participants will become proficient in the statistical language R (an advanced beginner's knowledge of the language is expected) and associated packages (including Bioconductor), and in the use of R Markdown (and/or electronic notebooks) for writing analysis reports.

The course is organized in seven modules covering: 1) Introduction to Genomics Analysis; 2) Data Preprocessing and Quality Control; 3) Comparative Experiments based on Microarrays and Linear Models (LM); 4) Comparative Experiments based on RNA-sequencing and Generalized Linear Models (GLM); 5) Comparative Experiments based on Differential Enrichment Analysis; 6) Classification; and 7) Clustering and Class Discovery.