1 Misleading bioinformatics: Mistakes, Biases, Mis-interpretations and how to avoid them
  Festival of Genomics 2017
  Course Exercise Material: http://tinyurl.com/FOG-2017
2 Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
  https://creativecommons.org/licenses/by-nc-sa/2.0/uk/legalcode
  @babraham_bioinf @simon_andrews @drrshamilton #GenomicsFest
3 Who we are
  Simon Andrews, Head of Bioinformatics, Babraham Institute, Cambridge, UK
  Russell Hamilton, Bioinformatics Core Facility Manager, Centre for Trophoblast Research, Cambridge University, UK
  Christel Krueger, Bioinformatician, Babraham Institute, Cambridge, UK
  Malwina Prater, Bioinformatician, Centre for Trophoblast Research, Cambridge University, UK
4 And why do we care about misleading bioinformatics?
  We all work in Bioinformatics Core Facilities
  We see lots of different bioinformatics projects and people
  We see the kinds of mistakes people commonly make
  We have collated a collection of common problems, biases and mistakes
5 Why now?
  Automation in analysis increasingly common
  Raw sequences (FASTQ) → List of Differentially Expressed Genes
6 Why now?
  Automation in analysis increasingly common
  Raw sequences (FASTQ) → List of Differentially Expressed Genes
  Steps and checks in between: Duplication Rates, Stranded vs Unstranded, Alignment Stats, Biases, featureCounts, Sample Mix Ups, Read Error Rates, Read Counts, BAM
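Two of the checks named on this slide, read counts and duplication rates, are simple enough to sketch by hand. The snippet below is a minimal illustration only, not the method used by any particular pipeline or QC tool. It assumes unwrapped four-line FASTQ records and defines a duplicate as a read whose exact sequence has already been seen; the function names `read_sequences` and `duplication_rate` and the example reads are hypothetical.

```python
from collections import Counter
from io import StringIO
from typing import Iterable, Iterator, TextIO, Tuple


def read_sequences(handle: TextIO) -> Iterator[str]:
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for line_number, line in enumerate(handle):
        if line_number % 4 == 1:  # second line of every record is the sequence
            yield line.strip()


def duplication_rate(sequences: Iterable[str]) -> Tuple[int, float]:
    """Return (total reads, fraction of reads that repeat an earlier sequence)."""
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    duplicates = total - len(counts)  # every copy beyond the first of each sequence
    return total, duplicates / total


# Made-up example data: four reads, one of which repeats an earlier sequence.
example_fastq = StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read3\nTTGA\n+\nIIII\n"
    "@read4\nGGCC\n+\nIIII\n"
)
total_reads, dup_rate = duplication_rate(read_sequences(example_fastq))
print(f"reads={total_reads} duplication_rate={dup_rate:.2f}")
# prints: reads=4 duplication_rate=0.25
```

Dedicated QC tools estimate duplication differently (for example by subsampling or by using alignment positions), so a number from a sketch like this is only a rough sanity check on what an automated pipeline reports.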
7 Workshop Outline
  Introduction: Russell, 15 mins
  Un-quantitated Data Exercise: Christel, 30 mins
  Expectations: Simon, 30 mins
  Biases: Simon, 30 mins
  Software: Russell, 30 mins
  Summary: Simon, 15 mins
  Two coffee breaks
8 Bioinformatics in a typical NGS project
  Experimental Design and Planning → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
9 Bioinformatics in a typical NGS project
  Experimental Design and Planning → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  NGS pipeline automation is increasingly common
  Many advantages, but errors, biases and bugs need to be checked for
10 Impact on Project Time: “Bowtie of NGS”
  Experimental Design → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Axis: time
11 Impact on Project Time: “Bowtie of NGS”
  Experimental Design → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Bug / Error / Mistake
  Axis: time
12 Impact on Project Time: “Bowtie of NGS”
  Experimental Design → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Lost time
  Time could be spent on other projects
  Follow-on work may be dependent on preliminary results
  Axis: time
13 Impact on Project Time: “Bowtie of NGS”
  Experimental Design → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Issues can lead to significant time delays
  Can sometimes lead to interesting discoveries
  Can also make bioinformaticians cry
  Axis: time
14 Impact on Costs
  Experimental Design and Planning → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Axis labels: $, $ cumulative
15 Impact on Costs
  Experimental Design → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Bug / Error / Mistake
  Axis label: $ cumulative
16 Impact on Costs
  Experimental Design → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Bug / Error / Mistake
  Half the project budget could be lost
  Axis label: $ cumulative
17 Workshop Goals
  Experimental Design → Library Preparation → Sample Tracking → Sequencing → Quantitation → Comparison → Statistics → Functional Analysis → Results
  Un-quantitated data exercise: look at the raw data before you analyse
  Expectations: what assumptions and expectations should you have for your data
  Biases: how to look for and identify them (examples and cases)
  Software: how to select bioinformatics tools
18 Workshop Outline
  Introduction: Russell, 15 mins
  Un-quantitated Data Exercise: Christel, 30 mins
  Expectations: Simon, 30 mins
  Biases: Simon, 30 mins
  Software: Russell, 30 mins
  Summary: Simon, 15 mins
  Two coffee breaks
  Course Exercise Material: http://tinyurl.com/FOG-2017