Opportunities for advancing omics data analysis
When: September 9th, 09:00 to 16:00
Where: University of Basel, Kollegienhaus building, Petersplatz 1, CH-4001 Basel
Room: Regenzzimmer 111
Fotis Psomopoulos, INAB-CERTH, ELIXIR-GR
Machine learning has emerged as a discipline that enables computers to assist humans in making sense of large and complex data sets. With the drop in the cost of sequencing technologies, large amounts of omics data are being generated and made accessible to researchers. Analysing these complex, high-volume data is not trivial, and classical tools cannot exploit their full potential. Machine learning can thus be very useful in mining large omics data sets to uncover new insights that can advance the field of medicine and improve health care.
The aim of this tutorial is to introduce participants to the machine learning (ML) taxonomy and common machine learning algorithms. The tutorial will cover the methods used to analyse different omics data sets, providing a practical context through basic but widely used R and Python libraries. It will comprise a number of hands-on exercises and challenges, through which participants will acquire a first understanding of the standard ML processes as well as practical skills in applying them to familiar problems and publicly available real-world data sets.
This introductory tutorial is aimed at bioinformaticians (graduate students and researchers) who are familiar with different omics data technologies and are interested in applying machine learning to analyse the resulting data.
Maximum participants: 30
Time | Details |
---|---|
09:00 - 09:15 | Tutorial introduction. - Get to know each other. - Setup Link to material |
Part I: Background | |
09:15 - 10:45 | Introduction to ML / DM. - Data Mining. - Machine Learning basic concepts. - Taxonomy of ML and examples of algorithms. - Deep learning overview. Link to material |
11:00 - 12:30 | Applications of ML in Bioinformatics. - Examples of different ML/DM techniques that can be applied to different NGS data analysis pipelines. - How to choose the right ML technique? Link to material |
Part II: Hands-on | |
13:15 - 14:45 | Loading and exploring omics data. - What is Exploratory Data Analysis (EDA) and why is it useful? - Unsupervised Learning. - How could unsupervised learning be used to analyze omics data? Link to material |
15:00 - 16:30 | Supervised Learning. - Classification: How could supervised learning be used to analyze omics data? - Regression: What if the target variable is numerical rather than categorical? Link to material |
16:30 | Closing, discussion and resource sharing |
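
As a rough illustration of the distinction the hands-on sessions draw between unsupervised and supervised learning, the following is a minimal sketch rather than the tutorial's own material. It assumes a purely synthetic "expression matrix" (40 samples, 5 genes) and uses only the Python standard library. The helper names (`make_sample`, `two_means`, `nearest`) are hypothetical and chosen for this example. In practice a dedicated library such as scikit-learn would normally replace the hand-written routines.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the toy data are reproducible

N_GENES = 5  # assumption: a tiny number of features, for illustration only


def make_sample(shift):
    """Simulate one sample: N_GENES expression values centred on `shift`."""
    return [random.gauss(shift, 1.0) for _ in range(N_GENES)]


# Synthetic data: 20 "control" samples (label 0) and 20 "case" samples (label 1).
samples = [make_sample(0.0) for _ in range(20)] + [make_sample(5.0) for _ in range(20)]
labels = [0] * 20 + [1] * 20


def sq_dist(a, b):
    """Squared Euclidean distance between two samples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(rows):
    """Per-gene mean of a non-empty list of samples."""
    return [mean(col) for col in zip(*rows)]


def nearest(row, centers):
    """Index of the centre closest to `row`."""
    return min(range(len(centers)), key=lambda i: sq_dist(row, centers[i]))


def two_means(rows, n_iter=10):
    """Unsupervised: k-means with k=2. The labels are never consulted."""
    first = rows[0]
    second = max(rows, key=lambda r: sq_dist(r, first))  # farthest-point start
    centers = [first, second]
    for _ in range(n_iter):
        assignment = [nearest(r, centers) for r in rows]
        for k in range(2):
            members = [r for r, a in zip(rows, assignment) if a == k]
            if members:  # keep the old centre if a cluster is empty
                centers[k] = centroid(members)
    return [nearest(r, centers) for r in rows]


clusters = two_means(samples)
agree = mean(1 if c == y else 0 for c, y in zip(clusters, labels))
cluster_agreement = max(agree, 1 - agree)  # cluster ids are arbitrary

# Supervised: nearest-centroid classifier fitted on a training split,
# then evaluated on held-out samples.
indices = list(range(len(samples)))
random.shuffle(indices)
train_idx, test_idx = indices[:30], indices[30:]
class_centers = [
    centroid([samples[i] for i in train_idx if labels[i] == k]) for k in (0, 1)
]
predictions = [nearest(samples[i], class_centers) for i in test_idx]
accuracy = mean(1 if p == labels[i] else 0 for p, i in zip(predictions, test_idx))

print(f"cluster/label agreement: {cluster_agreement:.2f}")
print(f"held-out accuracy: {accuracy:.2f}")
```

The clustering step only sees the expression values, whereas the classifier is fitted on labelled training samples and scored on samples it has not seen. Real omics data are far higher-dimensional and noisier than this toy set, so normalisation, feature selection and cross-validation would usually be needed as well.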
If you finish all the exercises and wish to practice on more examples, here are a couple of good examples to help you get more familiar with the different ML techniques and packages.
The material in the workshop has been based on the following resources:
Relevant literature includes:
This material is made available under the Creative Commons Attribution 4.0 International license. Please see LICENSE for more details.
Amel Ghouila, & Fotis E. Psomopoulos. (2019, September 9). Introduction to Machine Learning: Opportunities for advancing omics data analysis (Version v1.0.0). Zenodo. http://doi.org/10.5281/zenodo.3403768