Introduction to Biological Data Science
Instructor: Christopher Hemme, PhD, University of Rhode Island
Location: URI
Session 1: June 19 – June 21 (Lab: Room 405 / Lecture Room: Avedisian Rm 205)
Session 2: July 15 – July 17 (Lab: Room 405 / Lecture Room: Avedisian Rm 403)
Course Overview
This WDT Module trains students in basic methods of biological data science. It covers the Unix command line in a high-performance computing environment, an introduction to the R programming language, the design of biomedical experiments, and basic statistical analyses (e.g., t-test, ANOVA, regression). Finally, students get a basic introduction to bioinformatics and omics methods. The module runs for 2.5 days, and participants who complete it receive an RI-INBRE Certificate of Completion.
Learning Outcomes
- Learn how to use the Unix command line on URI’s Unity High-Performance Computing Cluster
- Learn basic concepts in biomedical experimental design, such as selection of controls, determining proper sample sizes, and different experimental setups
- Process different types of biomedical data using common statistical methods such as the t-test, ANOVA, and linear regression
- Learn basic concepts in bioinformatics including omics data analysis
- Analyze bioinformatics data using established methods such as differential expression analysis
Lab Report
Students are expected to maintain detailed laboratory notes that include:
- Data analysis protocols (e.g., workflows and software used)
- Quality control metrics for sequencing and data analysis
- Results of data analysis
Resources Used in This Module
- Jupyter Notebooks for Hands-On Bioinformatics Data Analysis
- Slide Decks for Lectures
Timeline
Time | Day 1 | Day 2 | Day 3 |
---|---|---|---|
9:00 AM-10:00 AM | Introduction to Best Practices in Biomedical Data Science (Lecture) | Basic Statistical Concepts for Biomedical Research | Introduction to Bioinformatics and Omics |
10:00 AM-11:00 AM | Using the Unix Command Line on High-Performance Computing Clusters (Hands-On) | Introduction to Experimental Design | R Programming IV – Bioconductor (Hands-On) |
11:00 AM-12:00 PM | Using Interactive Tools (Jupyter Notebooks and RStudio) and Data Science Modules on Unity (Hands-On) | Using Power Analysis for Determining Sample Size for Biomedical Experiments (Lecture and Hands-On) | Bioinformatics Data Analysis (Hands-On) |
12:00 PM-1:00 PM | Break | Break | WDT Survey and Certificate Distribution |
1:00 PM-2:00 PM | R Programming I – Basic Data Structures and Functions (Hands-On) | Methods for Exploratory Analysis of Biomedical Data (Lecture and Hands-On) | |
2:00 PM-3:00 PM | R Programming II – Tidy Data and Tidyverse (Hands-On) | Statistical Methods for Analysis of Biomedical Data (ANOVA and t-Test) (Lecture and Hands-On) | |
3:00 PM-4:00 PM | R Programming III – Data Visualization (Hands-On) | Statistical Methods for Analysis of Biomedical Data (Linear and Logistic Regression) (Lecture and Hands-On) | |