Introduction to Biological Data Science

Instructor: Christopher Hemme, PhD, University of Rhode Island

Location: URI
Session 1: June 19 – June 21 (Lab: Room 405 / Lecture Room: Avedisian Room 205)
Session 2: July 15 – July 17 (Lab: Room 405 / Lecture Room: Avedisian Room 403)

Course Overview

This WDT Module aims to train students in basic methods of biological data science. It covers use of the Unix command line in a high-performance computing environment, an introduction to the R programming language, the design of biomedical experiments, and basic statistical analyses (e.g., t-test, ANOVA, regression). Finally, students will receive a basic introduction to bioinformatics and omics methods. The module runs over 2.5 days, and participants who complete it will receive an RI-INBRE Certificate of Completion.

Learning Outcomes

  • Learn how to use the Unix command line on URI’s Unity High-Performance Computing Cluster
  • Learn basic concepts in biomedical experimental design, such as selection of controls, determining proper sample sizes, and different experimental setups
  • Process different types of biomedical data using common statistical methods such as the t-test, ANOVA, and linear regression
  • Learn basic concepts in bioinformatics including omics data analysis
  • Analyze bioinformatics data using established methods such as differential expression analysis

Lab Report

Students are expected to maintain detailed laboratory notes that include:

  • Data analysis protocols (e.g., workflows and software used)
  • Quality control metrics for sequencing and data analysis
  • Results of data analysis

Resources Used in This Module

  • Jupyter Notebooks for Hands-On Bioinformatics Data Analysis
  • Slide Decks for Lectures

Timeline

Day 1

  • 9:00 AM-10:00 AM: Introduction to Best Practices in Biomedical Data Science (Lecture)
  • 10:00 AM-11:00 AM: Using the Unix Command Line on High-Performance Computing Clusters (Hands-On)
  • 11:00 AM-12:00 PM: Using Interactive Tools (Jupyter Notebooks and RStudio) and Data Science Modules on Unity (Hands-On)
  • 12:00 PM-1:00 PM: Break
  • 1:00 PM-2:00 PM: R Programming I – Basic Data Structures and Functions (Hands-On)
  • 2:00 PM-3:00 PM: R Programming II – Tidy Data and Tidyverse (Hands-On)
  • 3:00 PM-4:00 PM: R Programming III – Data Visualization (Hands-On)

Day 2

  • 9:00 AM-10:00 AM: Basic Statistical Concepts for Biomedical Research
  • 10:00 AM-11:00 AM: Introduction to Experimental Design
  • 11:00 AM-12:00 PM: Using Power Analysis for Determining Sample Size for Biomedical Experiments (Lecture and Hands-On)
  • 12:00 PM-1:00 PM: Break
  • 1:00 PM-2:00 PM: Methods for Exploratory Analysis of Biomedical Data (Lecture and Hands-On)
  • 2:00 PM-3:00 PM: Statistical Methods for Analysis of Biomedical Data (ANOVA and t-Test) (Lecture and Hands-On)
  • 3:00 PM-4:00 PM: Statistical Methods for Analysis of Biomedical Data (Linear and Logistic Regression) (Lecture and Hands-On)

Day 3

  • 9:00 AM-10:00 AM: Introduction to Bioinformatics and Omics
  • 10:00 AM-11:00 AM: R Programming IV – Bioconductor (Hands-On)
  • 11:00 AM-12:00 PM: Bioinformatics Data Analysis (Hands-On)
  • 12:00 PM-1:00 PM: WDT Survey and Certificate Distribution