
Educational materials for those who wish to learn bioinformatics.

Welcome to LearnBioinformatics!

/r/LearnBioinformatics is a subreddit for providing you with the most relevant academic papers, textbooks, websites, and tutorials in the field of bioinformatics. If you have any recommended resources, please feel free to post away!

Mondays - New Programming Challenge

Tuesdays - TIL Computer Science

Wednesdays - TIL Biology/Biochemistry/Chemistry (sequencing techniques)

Thursdays - Paper Discussions

Fridays - TIL Data Science / Statistics

List of Resources and Guides

List of tools used for Next-Generation Sequence Analysis

Past weekly coding challenges

Posting Guidelines

  1. Write specific tags when posting, e.g. [Question], [Academic Paper], [Tutorial].
  2. Search for your question before posting. It may have already been asked and answered.
  3. Please do not delete your post. Keeping it up preserves it as a reference for later.
  4. Write specific questions.


  1. No rewards, advertisements or affiliate links.
  2. Provide good, helpful content and comments. Remember that we are all here to learn!
  3. Never. stop. learning.

Related subreddits

Related websites

SEQanswers: A discussion forum and information source for next generation sequencing.

BioStar: A community for biology that provides tutorials, questions/answers and more.

Rosalind: A platform for learning bioinformatics through problem solving.

Bioconductor: A free, open source and open development software project for the analysis and comprehension of genomic data generated by wet lab experiments in molecular biology.

Biopython: Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.

Bioperl: The BioPerl project is an international association of developers of open source Perl tools for bioinformatics, genomics and life science research.

Protein Data Bank: THE database of biological structures, namely proteins and nucleic acids. This is the starting point for any structural studies.

Proteopedia: A comprehensive encyclopedia of proteins (and nucleic acids as well).


4,728 Subscribers


Why are large k-mers more computationally demanding?

1 Comment
10:29 UTC


Can anyone help me find a workshop?

An online workshop would be preferred. I'm looking for workshops on cloud, Linux, Python, bioinformatics, or data science.

14:52 UTC


Study partner?

Hello, everyone. I haven't seen any posts for any recent study groups. I completed my undergrad in medical science and I'm hoping to switch careers from wet lab work to dry lab analysis. I want to study Bioinformatics every day in order to prepare myself for a Master's program. The application deadline for the program is October.

I'm looking for a study/accountability partner in a similar position. I intend to go through the entirety of the Biostar Handbook as well as complete an introductory Bioinformatics course on Coursera.

Is anyone interested?

03:21 UTC


Does it make sense to go for a more expensive MS program?

I'm working at a biotech company as an SE, and my dream is to become a bioinformatician.

I'm trying to compare two programs, one from Johns Hopkins (~$50k) and one from Maryland (~$20k).

I don't know much about them aside from the price. Does it make sense to pay extra to go to a top-ranked university? Does it even help with employment later on?

13:35 UTC


Guide to learn analysis of SNP-array data?

Can anyone recommend a well-designed guide, please?

10:35 UTC


How can I compare two MSAs?

Hi. I've performed multiple sequence alignment on mitochondrial genomes of primates using MAFFT and Kalign. How can I tell which algorithm did a better job?
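One simple way to put a number on each alignment is a sum-of-pairs score: every column is scored by comparing all pairs of residues in it, and the column scores are added up. The sketch below is only an illustration. The function name and the match/mismatch/gap values are assumptions chosen for the example, not settings taken from MAFFT or Kalign, and a higher score under one scoring scheme does not by itself show that an alignment is biologically better.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment given as a list of equal-length gapped strings.

    The scoring values are illustrative assumptions, not tool defaults.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")

    total = 0
    # zip(*alignment) walks the alignment column by column
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # two gaps facing each other carry no information
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


# Toy alignments of the same three sequences (made-up data)
alignment_a = ["ACGT", "ACGT", "AC-T"]
alignment_b = ["ACGT", "ACGT", "A-CT"]
print(sum_of_pairs(alignment_a), sum_of_pairs(alignment_b))
```

Scoring both alignments with the same parameters at least makes the comparison consistent. Inspecting the regions where the two alignments disagree is usually more informative than the total alone.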

22:10 UTC


Doubt regarding machine learning algorithm

I am in the first semester of my master's in bioinformatics, and I'm completely new to this since it has only been one month. I was given a task to download the Yeast dataset from https://archive.ics.uci.edu/dataset/110/yeast, which is used for predicting the cellular localization sites of proteins, and to apply different machine learning algorithms to this data. We were told to do this in the Orange software, which I'm not very familiar with. I tried downloading the file, but it was a zip file and I couldn't import it into Orange. I also didn't really understand how to do this in the software. If anyone knows how Orange works and how to prepare a workflow/pipeline in it, please help. I googled it and searched YouTube too, but couldn't find an answer. Please help.
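One possible route, sketched here under some assumptions: unzip the archive, then convert the whitespace-delimited data file into a CSV with a header row, which Orange's file import widgets can generally read. The column names below follow the attribute list in the dataset description as I understand it, and the file names `yeast.data` and `yeast.csv` are assumptions, so check both against what the archive actually contains.

```python
import csv
import io

# Assumed column order, based on the dataset description; verify before use
YEAST_COLUMNS = [
    "sequence_name", "mcg", "gvh", "alm", "mit",
    "erl", "pox", "vac", "nuc", "class",
]


def yeast_to_csv(data_text):
    """Convert whitespace-delimited rows into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(YEAST_COLUMNS)
    for line in data_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(YEAST_COLUMNS):
            raise ValueError(f"unexpected number of fields: {line!r}")
        writer.writerow(fields)
    return out.getvalue()


# Illustrative row in the expected layout
sample = "ADT1_YEAST  0.58  0.61  0.47  0.13  0.50  0.00  0.48  0.22  MIT\n"
print(yeast_to_csv(sample))

# Hypothetical usage on the unzipped file:
# with open("yeast.data") as src, open("yeast.csv", "w") as dst:
#     dst.write(yeast_to_csv(src.read()))
```

After loading the CSV in Orange, the `class` column would need to be set as the target and `sequence_name` as a meta attribute (or skipped) before connecting the data to a learner.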

11:35 UTC


Complete newbie

Hello everyone! I'm doing a BSc in a healthcare subject but I want to do bioinformatics in the future. Where should I begin? Any sites especially with certificates would be appreciated. Thank you! 😊

16:58 UTC


How can I construct a cloning primer without a stop codon?

Hello, I have a bioinformatics project where I have to design cloning primers. Since the product is a fusion protein, I read that I have to omit the stop codon at the 3' end before taking the reverse complement.

Any tips on how I can achieve this?
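The step described above can be sketched in a few lines of Python: drop the terminal stop codon from the coding sequence, take the last bases of what remains, and reverse-complement them to get the annealing part of the reverse primer. This is a minimal illustration only. The function names and the default length of 20 are assumptions, and restriction sites, overhangs, and melting-temperature checks are left out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def reverse_primer_without_stop(cds, length=20):
    """Annealing region of a reverse primer that excludes the stop codon.

    `cds` is the coding sequence written 5'->3' on the sense strand.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # omit the stop codon so translation reads into the fusion partner
    if len(cds) < length:
        raise ValueError("coding sequence is shorter than the requested primer")
    return reverse_complement(cds[-length:])


# Made-up 30-nt coding sequence ending in TAA
cds = "ATGGCTAGCAAAGGAGAAGAACTTTTCTAA"
print(reverse_primer_without_stop(cds, length=9))
```

With the stop codon removed, the last codon before it stays in frame with whatever is fused downstream, provided the vector's linker is also in frame.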

1 Comment
18:02 UTC


same isoforms

Some isoforms are 100% identical, but their IDs are different. Why is that?

19:52 UTC


Online Courses on Polygenic Risk Scores


I am an early-career researcher in clinical research. My background is in public health, epidemiology, and clinical research. I would now like to learn how to calculate polygenic risk scores for my clinical subjects. Does anyone know of an online course, YouTube videos, etc. (free or paid) that offers focused learning on polygenic risk scores? Courses on GWAS and epigenetics would also be beneficial, though not exactly what I'm focused on at this time.
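For orientation while looking for courses: in its simplest form, a polygenic risk score is a weighted sum, where each variant's effect-allele dosage (0, 1, or 2) is multiplied by its effect size from a GWAS and the products are added up. The sketch below shows only that arithmetic. The variant IDs and weights are made up, the function name is an assumption, and real pipelines also handle quality control, allele matching, and linkage disequilibrium.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages over variants present in both inputs.

    dosages: variant ID -> effect-allele count for one individual (0, 1, or 2)
    weights: variant ID -> per-allele effect size (e.g. a GWAS beta)
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[variant] * dosages[variant] for variant in shared)


# Hypothetical individual and hypothetical effect sizes
dosages = {"rs_example_1": 2, "rs_example_2": 1, "rs_example_3": 0}
weights = {"rs_example_1": 0.5, "rs_example_2": 0.25, "rs_example_4": 0.75}
print(polygenic_risk_score(dosages, weights))
```

Variants missing from either input are simply skipped here, which is one design choice among several; imputing missing genotypes is another.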


20:21 UTC



Enrichment analysis

17:31 UTC


Retroelement curation

Hi all! Pure bioinfo amateur here who's starting his PhD on retroelements innate immunity. What would you say are the fundamentals I should master before delving into sequence curation?

1 Comment
06:02 UTC


Looking for a Scientific Co Founder

Hey everyone, hope you're doing well.

After multiple customer discovery interviews and validating the problem, I'm starting to build BioLearn, a bioinformatics e-learning platform that brings the best of Udacity and DataCamp to the bioinformatics industry, and I'm looking for a scientific co-founder.

I have over 5 years of experience in software engineering and I've been learning the business aspects of launching/running a startup for the last year.

No fluff here, it's an equal equity split based on responsibilities, qualifications and time commitment. This is to build an MVP and test the idea much further.

Who I'm Looking For: Someone passionate about bioinformatics with industry relevant experience, who likes teaching/education, and is willing to contribute with creating online learning material (the idea is to create 3 foundational courses on bioinformatics, data science and biology). You'll help build our courses and shape the future of bioinformatics education.

What's In It For You: Equal equity partnership. We're in this together.

A salary is out of the question; I cannot afford one even for myself. Right now it is all time investment for both of us.

If you're interested, reach out to me on LinkedIn and let's talk about this. This is my profile: https://www.linkedin.com/in/brian-rey/

Thanks :).

21:56 UTC


Do I need protein sequences for every single taxon to compare the phylogenies and rates of evolution of 2 different families?

If I am comparing phylogenies and rates of evolution of 2 bird families across 3 different genes, should I have those 3 genes' protein sequences available for every single taxon in those 2 families, or can the taxon availability differ slightly for each of those 3 genes?

E.g. I have 3 genes and 10 taxa in total (5 in each family for simplicity). Should all 3 genes' protein sequences be available for every single taxon, or can 7/10 taxa have the gene 1 protein sequence available, 9/10 taxa have the gene 2 protein sequence available, and 6/10 have the gene 3 protein sequence available?

So each of the 3 trees for the 3 different genes would differ in the taxa it contains. Is this a valid way to compare phylogenies when some taxa do not have a specific gene's protein sequence available?
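Before deciding, it can help to tabulate exactly which taxa are covered by which gene. The sketch below finds the taxa shared by all genes and the taxa each gene is missing. One option (not the only one) is to restrict the comparison to the shared set so that all three trees have the same tips. The gene names, taxon labels, and function names are hypothetical.

```python
def shared_taxa(gene_to_taxa):
    """Taxa that have a sequence for every gene."""
    taxa_sets = [set(taxa) for taxa in gene_to_taxa.values()]
    return set.intersection(*taxa_sets) if taxa_sets else set()


def missing_taxa(gene_to_taxa):
    """For each gene, the taxa seen in the dataset that lack a sequence for it."""
    all_taxa = set().union(*gene_to_taxa.values())
    return {gene: all_taxa - set(taxa) for gene, taxa in gene_to_taxa.items()}


# Hypothetical availability table
availability = {
    "gene1": ["taxonA", "taxonB", "taxonC"],
    "gene2": ["taxonA", "taxonB", "taxonC", "taxonD"],
    "gene3": ["taxonA", "taxonC", "taxonD"],
}
print(sorted(shared_taxa(availability)))
print({gene: sorted(taxa) for gene, taxa in missing_taxa(availability).items()})
```

If the shared set turns out to be very small, that is a sign the per-gene trees will be hard to compare directly and the missing sequences may matter.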

21:48 UTC


Doubt about usage of MEGAHIT

Hi, is there any problem if I use MEGAHIT to assemble a single plant genome? I am doubtful since it's made for metagenomic data.

06:28 UTC



I have a Samsung Book notebook. To install a micro SSD, I had to open the PC and disconnect a flat cable. The problem is that the guy in the video said to disconnect the battery from the motherboard before disconnecting the flat cable. Now the PC doesn't turn on anymore: I press the power button and it doesn't start, and the LED doesn't light up. When I disconnect it from the charger, the remaining battery level and a date from 2021 appear, whereas before, a specific date and time appeared. This is the only sign of life from the PC. How should I proceed?

18:29 UTC


My Autogrid4 doesn't work

Hello everyone. I am doing biology research on amino acid substitution. Before replacing amino acids in my protein, I need to run a control trial with the wild-type enzyme. I cleaned my protein (6lfz) in BIOVIA Discovery Studio and added a ligand (glucose) in AutoDockTools. I prepared the complex by adding polar hydrogens and Kollman charges to the protein, and polar hydrogens and Gasteiger charges to the ligand, then defined a grid box containing part of the protein and the ligand. After that I ran AutoGrid. Everything seemed fine, but when I went to start AutoDock I realised that AutoGrid had not created a P map. The other maps were created (A, C, d, e, HD, N, NA, OA, SA). What should I do to obtain the P map? This is very important to me, please help.

21:47 UTC


Need help choosing OS

Hi, I am starting a Master's degree in Bioinformatics and Biostatistics and I need to replace my current laptop to finally let it rest after 5 years of everyday use.

The thing is that almost everywhere I search, they recommend using Linux or Mac for this purpose. Is Windows that far behind in this area?

What specs should I be looking for in order to avoid problems in the future? (Recommended specs for both Mac and Windows.) I'm looking for a reasonable price.

Thank you :)

1 Comment
23:06 UTC


Drugs and herbals found using names of genes/proteins, diseases and more

How can we find drugs or traditional Chinese medicine with given info, such as a disease or gene/protein/pathway?

  1. Go to the Coremine database (instructions on how to find Coremine database can be found at the end of the text).
  2. In the Coremine search box, enter your keyword, which can be a disease name, gene/protein, or GO term (BP: Biological Process, MF: Molecular Function, and CC: Cellular Component).
  3. In the results column on the right side, you can find drugs and traditional Chinese medicine associated with your query.

That's it! Coremine database can help you quickly find drugs or traditional Chinese medicine related to a disease or specific genes/proteins/pathways.

How to use CoreMine Medical database?

  1. Landing page: https://coremine.com/medical/
  2. Register for free using email.
  3. In the search bar, enter your keywords for searching. You can add keywords one by one or in batch (for genes).



16:46 UTC


Is college the only way?

Hey everyone,

I'm curious about learning bioinformatics and whether college is the only realistic path to aspire to a serious job.

Can you share your experiences with learning bioinformatics? How did you get started, and what challenges did you encounter? Any advice for someone approaching this field from a non-traditional background? I come from a computer science background (mostly self learned/through work)

Thanks for your input!

16:35 UTC


Ideas for projects

Hello Reddit,

This is my very first post. I am still a student in an MSc in Biotechnologies in France (titre d'ingénieur en biotechnologies at Sup'Biotech), and I would like to dive deeper into bioinformatics on my own. Which kinds of projects should I do to better understand bioinformatics? What would you advise?

19:45 UTC


COREMINE Medical used in study of Multiple Sclerosis

The authors (Dadashkhan et al.) used COREMINE Medical as part of their study that identified six genes "as the most significant for MS pathophysiology" and proposed six drugs that target these genes.

Deciphering crucial genes in multiple sclerosis pathogenesis and drug repurposing: A systems biology approach


22:20 UTC


Coremine Medical explained on YouTube

13:59 UTC


Suggestion for a DEG seq pipeline that uses RStudio entirely

Hello everyone,

I'm a PhD student in Biology, and I must admit that I'm not very experienced in analyzing my data in R. Up until now, I've been relying on Excel for my data analysis. However, my current project requires me to perform a DEG (differentially expressed gene) analysis in R. Initially, my supervisor suggested outsourcing the analysis to a bioinformatician. Unfortunately, that didn't work out as expected, especially regarding how the data should be grouped and the gene enrichment part. Moreover, the bioinformatician was not inclined to teach us the analysis process. I noticed that he used both UNIX tools and R for the analysis. So, I'm curious whether there's an entirely R-based pipeline available that I can use. I believe I can learn from the Bioconductor packages, as they provide comprehensive documentation.

Based on what I observed, here are the tools he used:

Pre-processing of NGS Raw Reads:

  1. Cutadapt (version: 1.18) implemented in trim-galore (version: 0.650)
  2. FastQC (version: 0.11.8)
  3. MultiQC (version: 1.12)

Alignment of Clean Reads:

  1. STAR aligner (version: 2.7.10a)

Quantification and Normalization of Aligned Reads and Differential Expression Testing:

  1. featureCounts (version: 2.0.1)
  2. DESeq2

Functional Profiling and Pathway Analysis:

  1. clusterProfiler (version: 4.2.2)

I would greatly appreciate any recommendations or guidance you can provide. Thank you in advance!

1 Comment
07:22 UTC


Career pathway regarding healthcare and healthcare research

Greetings, I am Anuththara, hailing from an Asian country. I hold a B.Sc. (Hons) degree in Medical Laboratory Science from a public university in my homeland. Back in grade 7 (2006), my family and I visited the USA to see relatives, but our stay lasted less than 5 months before we returned to our home country. As a result, I completed my bachelor's degree in my home country.

My inclination towards scientific exploration led me to undertake research related to drug discovery during my undergraduate years. I successfully presented my findings in two abstracts at international research conferences. My research experience heightened my appreciation for the significance of bioinformatics and computational chemistry. Consequently, I embarked on a self-guided journey into data science, engaging in courses on platforms like Coursera (including Bioinformatics from UC San Diego) to enhance my skills in Python programming (as validated by my IBM data science certification). Presently, I am employed as a researcher, focusing on snake venom, snakebites, and toxicology in a government research institute in my home country, garnering one year of experience.

My longstanding dream involves returning to the US for a master's degree, aimed at securing a healthcare-related job. Nonetheless, my aspirations have been hindered by financial challenges exacerbated by the economic crisis in my country. Despite this, I am in a position to allocate some resources towards pursuing a master's degree in my field. However, the tuition fees in the USA are notably higher than those in countries like Germany. To address these challenges, I have identified several potential career pathways:

  1. Seek enrollment in a Ph.D. program in medicinal chemistry or computational biology in the US with a scholarship. It's important to note that my GPA stands at 2.75 (B), and I am more drawn to practical applications than theoretical memorization.

  2. Consider pursuing a master's degree in bioinformatics or a related field in Germany before transitioning to a Ph.D. program in the US. While this approach could lead to significant cost savings, it would require additional time.

  3. Explore the option of completing a master's program in bioinformatics in the US, while concurrently obtaining the MLS ASCPi certification. While this route may be costly, it offers potential benefits in terms of job prospects and networking opportunities.

  4. Contemplate relocating to the US alongside my family, akin to our 2006 visit. By enrolling in a post-baccalaureate program in Medical Laboratory Science (MLS) and achieving MLS ASCPi certification, I could secure a job, setting the stage for pursuing a master's in bioinformatics at a later juncture.

Given my financial constraints, I am inclined towards pathways that offer practical experience or potential savings. It is a complex decision, influenced by various factors including priorities, timelines, and financial realities. Seeking insights from industry professionals, mentors, and career advisors is invaluable during this pivotal decision-making process. I extend my gratitude for any guidance you can provide, as I navigate this critical juncture towards realizing my career goals. Thank you for your assistance.

05:03 UTC


Material for studying ngs

22:56 UTC
