/r/proteomics
This subreddit is dedicated to the dissemination and discussion of the latest research and news in the field of proteomics.
Proteomics - the large-scale study of proteins. Proteins are vital parts of living organisms, with many functions. The term proteomics was coined in 1997 in analogy with genomics, the study of the genome. The word proteome is a portmanteau of protein and genome, and was coined by Marc Wilkins in 1994 while he was a PhD student at Macquarie University.
The proteome is the entire set of proteins produced or modified by an organism or system. It varies with time and with the distinct requirements, or stresses, that a cell or organism undergoes. Proteomics is an interdisciplinary domain that has benefited greatly from the genetic information of the Human Genome Project; it also covers the emerging study of proteomes at the level of overall intracellular protein composition, structure, and activity patterns. It is an important component of functional genomics.
While proteomics generally refers to the large-scale experimental analysis of proteins, it is often specifically used for protein purification and mass spectrometry. Wikipedia: proteomics
Probably a dumb question, but do other proteomics labs use pure methanol for cleaning things instead of 70% EtOH? Is there a reason for it? It seems unnecessarily dangerous, but that’s how my lab has been doing it since way before I joined.
Hello! We use Rapigest surfactant for MS-based proteomic studies, but unfortunately for a few weeks now we have been seeing some intense hydrophilic contaminant components in all our samples and also in the blank samples (formic acid added to the Rapigest solution and then purified with C18). This chromatographic peak elutes for more than an hour and produces a constant high background, which makes it impossible to measure real samples. The most intense m/z values are 259.0 and 467.1, which in principle do not correspond to Rapigest fragments. We tried replacing both the Rapigest solution and the formic acid, but the problem persists. Any ideas what might be the problem? Thank you in advance.
I have data from fractionated samples of the global proteome, and then a phospho-enriched sample that is unfractionated. What is the best way to compare whether phosphorylation was present or not for specific proteins in my different experimental samples? From processing the samples all together with phosphorylation as a dynamic modification, and using IMP-ptmRS, there are master proteins that are identified with phosphorylation, but there is no indication of whether the phosphorylation was present in every sample or only some. My data used a kinase inhibitor, so I am specifically interested in changes to the phosphoproteome as a result.
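One way to get a per-sample view is to export the peptide or PSM table with the sample/file column included and tabulate phospho evidence yourself. Below is a minimal sketch of that bookkeeping in Python. The row layout (sample, protein, modification string), the protein names, and the "Phospho" text match are all assumptions made for illustration, not the actual export format of any particular software.

```python
from collections import defaultdict

def phospho_presence(rows, samples):
    """Return {protein: {sample: True/False}} for phospho evidence per sample.

    rows: iterable of (sample, protein, modifications) tuples, where
    modifications is the free-text modification string of one peptide/PSM.
    (Hypothetical layout; adapt to whatever columns your export really has.)
    """
    seen = defaultdict(set)
    for sample, protein, modifications in rows:
        if "Phospho" in modifications:
            seen[protein].add(sample)
    return {
        protein: {sample: sample in hit_samples for sample in samples}
        for protein, hit_samples in seen.items()
    }

# Made-up rows standing in for a per-sample peptide export.
toy_rows = [
    ("control", "PROT_A", "1xPhospho [S15]"),
    ("inhibitor", "PROT_A", ""),
    ("control", "PROT_B", "1xPhospho [T180]"),
    ("inhibitor", "PROT_B", "1xPhospho [T180]"),
]
presence = phospho_presence(toy_rows, ["control", "inhibitor"])
for protein, by_sample in sorted(presence.items()):
    print(protein, by_sample)
```

Keep in mind that "not seen" in a sample can simply mean "not detected", so for a kinase inhibitor experiment a comparison of phosphosite intensities is likely to be more informative than presence/absence alone.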
I have shotgun data from a brachyuran species for which I have an assembled, but not annotated, transcriptome. We don't have a genome, so the transcriptome assembly was de-novo, but we've validated the assembly with lots and lots of genes so I trust it. But, without annotation the majority of this data is pretty useless.
SO- I tried using the protein fasta from an annotated (from the NCBI annotation pipeline) genome from a closely related species as the target database to find PSMs and protein IDs and it worked well. The thing is, I want to keep the pseudo-annotation that I get from doing this, but also still have it associated with the contig numbers from my original transcriptome for downstream analysis.
My question is 2 parts:
Some proteins within a protein group only originate from my unannotated transcriptome:
Some proteins within a protein group seem like a pretty straightforward match between both databases:
And other times there are several different proteins within a protein group:
With using the Protein Annotation node in my consensus workflow, I can also select both databases. I usually end up with minimal annotation, maybe 45 out of 1470 protein groups will have some combination of GO/Pfam/Ensembl etc. annotation. Am I missing something with a setting here?
Thanks in advance for any help you can provide!!
Esteemed proteomic wizards - I ran out of high pH spin columns. I've actually got the Affinisep plates, but I've only got 2 samples to fractionate and I don't want to risk it (or deal with the later annoyance of having only 94 unused wells). Any reason you can think of that I can't just take the C-18 "desalting" spin columns, equilibrate those at high pH and knock out 6 fractions? (On the regular kits I generally combine 1, 7 and 8 and have 6 fractions to run.) I know I've done this before with ziptips and that looked okay, but if it comes down to some ziptips in my drawer from 2011 vs a C-18 spin column, I figure the latter is the better move.
We did some molecular docking on an uncharacterized protein found in the nucleus of A. niger cells. While I looked up what it could possibly be, I encountered Flb proteins. I have a small yet probably stupid question...
Are they really called Fluffy little ball proteins??????
And why? 🥲
Hi proteomics people, I'm a PhD student in PharmSci.
I have an idea for utilizing mass spec and proteomics software for the quantification of peptides based on a combinatorial peptide library.
Basically, I would in theory know all the possible peptide sequences, since the library is synthetic. But I don't know the quantities.
Would it be feasible to use LFQ or something similar to compare the relative concentrations of two or more samples? For example, before and after some assay? I just don't fully understand whether proteomics software like MaxQuant would work for a synthetic library rather than a known biological sample/protein, due to the normalization algorithms or something like that.
Overall, just wanted to make a post and see whether there was an obvious issue that a non proteomics person might not see. Thanks :)
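On the normalization worry specifically, here is a toy sketch in Python of why it matters. It is not how MaxQuant or any other package computes LFQ; the peptide names and intensities are made up, and median centering is used only as a stand-in for "a normalization that assumes most peptides do not change".

```python
from math import log2
from statistics import median

def log2_ratios(before, after):
    """log2(after / before) for every peptide quantified in both samples."""
    shared = sorted(set(before) & set(after))
    return {peptide: log2(after[peptide] / before[peptide]) for peptide in shared}

def median_center(ratios):
    """Shift all ratios so that their median is zero."""
    shift = median(ratios.values())
    return {peptide: ratio - shift for peptide, ratio in ratios.items()}

# Made-up intensities: two of three library peptides are halved by the assay.
before = {"AAK": 100.0, "GGK": 200.0, "LLK": 400.0}
after = {"AAK": 50.0, "GGK": 100.0, "LLK": 400.0}

raw = log2_ratios(before, after)
centered = median_center(raw)
print(raw)       # two peptides at -1, one at 0
print(centered)  # two peptides at 0, one at +1
```

In the raw ratios two peptides drop and one stays put; after median centering the same data read as two unchanged and one increased. If an assay is expected to deplete a large share of the library, a "most things are unchanged" normalization can flip the interpretation like this, so it seems worth checking whether normalization can be switched off or anchored to spiked-in standards.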
What is the best way to learn Xcalibur for manually analyzing MS2 data? It seems very overwhelming. Thank you.
Hi. We just got proteomics done for one of our ongoing research projects and I have no idea how to segregate the data and pull something useful out of it. My PI is after my life, though, to get something out of it ASAP. Can someone please help with this? I have the Excel file listing the proteins that are differentially regulated.
Has anyone made the transition from a triple quad LC-MS/MS to an Orbitrap for non-targeted PFAS testing? I plan to open a PFAS testing lab in the next year. Any advice or suggestions?
The number of compounds an Orbitrap can test for makes it a very lucrative investment for PFAS labs. I have multiple Orbitraps and will probably only use 1-2 in my lab. If anyone is in the market for an Orbi, I can supply one for $40k-50k under market price. I hate these companies that rip scientists off with huge markups.
Hi everyone!
I'm new here and have just started my new position. I've been asked to study single-cell proteomics, but I don't have any experience with this technology. I'd be truly grateful if anyone with experience in this field could guide me from the very first steps to the basics of the experiment. I’m hoping to learn as much as I can and could really use some guidance. Thanks in advance!
Does anyone know of an online generator tool, something like Python (I can't figure out how to even use it on Mac), or is anyone interested in helping? I want to take a glycoprotein's amino acid sequence and selectively replace certain amino acids with one another, subject to a ratio requirement. I've narrowed it down to 1024 variants of a 247-letter sequence if every variant following those parameters is generated. There's no rush; I just need mathematical accuracy for a research hypothesis. Also, if you are interested in research or want to help out, please feel free to reach out. This is a leisure pursuit for now; I'm in nursing school.
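This kind of enumeration is a few lines of Python using only the standard library, so it should run in any online Python console. The sketch below is a guess at the task, not the actual rules from the post: the toy sequence, the swap table (N to Q, S to A), and the "exactly two swaps" filter standing in for the ratio requirement are all made up for illustration.

```python
from itertools import product

def enumerate_variants(sequence, swaps, positions):
    """Yield every variant where each chosen position keeps its residue or swaps it."""
    options = []
    for pos in positions:
        original = sequence[pos]
        options.append((original, swaps[original]))
    for combo in product(*options):
        residues = list(sequence)
        for pos, residue in zip(positions, combo):
            residues[pos] = residue
        yield "".join(residues)

def count_swapped(variant, reference):
    """Number of positions at which the variant differs from the reference."""
    return sum(a != b for a, b in zip(variant, reference))

# Hypothetical toy input: a 10-residue sequence and two allowed substitutions.
toy_sequence = "MKTNSTLNGS"
toy_swaps = {"N": "Q", "S": "A"}
toy_positions = [i for i, aa in enumerate(toy_sequence) if aa in toy_swaps]

variants = list(enumerate_variants(toy_sequence, toy_swaps, toy_positions))
# Stand-in for a ratio requirement: keep variants with exactly 2 of 4 sites swapped.
half_swapped = [v for v in variants if count_swapped(v, toy_sequence) == 2]

print(len(variants), len(half_swapped))
```

The toy sequence has four swappable sites, so there are 2^4 = 16 variants, 6 of which have exactly two swaps. Ten two-way sites would give 2^10 = 1024, which may or may not be how the 1024 in the post was arrived at.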
Hello good people of reddit,
I am fairly new to bioinformatics, and am currently studying and helping out some old colleagues with a differential protein analysis of their DIA MS data, which was quantified using Spectronaut; they have given me the resulting output.
I've read a few articles about mass spec proteomic analysis, including a recent one in Nature Communications giving some great indications as to which imputations, methods, packages etc. to use in which instances: https://www.nature.com/articles/s41467-024-47899-w So far I've done some general EDA, including PCAs, looking at removing outliers detected by Mahalanobis distance, boxplots, and distributions.
There are ~82 samples across 2900 initial features. The data has a large number of missing values, with almost 50% of samples having >40% missing values across features. I know some advice on cutoffs is general, like 20% missing, and that it also depends on the type of missingness. Is there any advice you all have for me on handling missing values?
What I've done for missing values so far is to calculate the mean fraction of missing values across the samples and remove samples whose missing fraction is more than 1 SD above the mean, and then filter out the features that have >30% missing. Is this a correct approach? Another question I have: is it bad for some samples to have too much coverage, skewing the data? I.e., if one sample has values for all features, is that 'bad' and does it need to be removed?
Thanks for any advice or help you can give
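To make the two-step filter described above concrete, here is a minimal sketch in plain Python: drop samples whose missing fraction is more than one SD above the mean, then drop features with more than 30% missing among the remaining samples. The dict-of-lists layout, the use of `None` for missing values, and the toy numbers are assumptions; it only shows the bookkeeping and does not settle whether these cutoffs suit a given dataset.

```python
from statistics import mean, stdev

def missing_fraction(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def filter_matrix(matrix, feature_cutoff=0.30):
    """matrix: {sample: [intensity or None, ...]}, all lists the same length.

    Returns (kept_samples, kept_feature_indices).
    """
    sample_missing = {s: missing_fraction(v) for s, v in matrix.items()}
    fractions = list(sample_missing.values())
    threshold = mean(fractions) + stdev(fractions)  # mean + 1 SD
    kept_samples = [s for s, frac in sample_missing.items() if frac <= threshold]

    n_features = len(next(iter(matrix.values())))
    kept_features = [
        j for j in range(n_features)
        if missing_fraction([matrix[s][j] for s in kept_samples]) <= feature_cutoff
    ]
    return kept_samples, kept_features

# Made-up toy matrix: 5 samples x 4 features.
toy = {
    "s1": [1.0, 2.0, 3.0, 4.0],
    "s2": [1.1, None, 3.1, 4.1],
    "s3": [0.9, 2.1, 2.9, 4.2],
    "s4": [None, None, None, 4.0],
    "s5": [1.0, None, 3.0, 3.9],
}
kept_samples, kept_features = filter_matrix(toy)
print(kept_samples, kept_features)
```

In the toy data, s4 (75% missing) is dropped, and feature 1 is then dropped because half of the remaining samples lack it. A variation worth considering is computing the feature cutoff per experimental group rather than over all samples, so that a protein absent in one condition but consistently present in another is not filtered out.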
When fractionating TMT-labeled peptides, how is one supposed to inject the peptides (for the post-fractionation LC-MS part)?
Is it necessary to quantify peptides in each fraction and load equal amounts for LC-MS analysis? My understanding is that this should not be required; it should be handled in the TMT quant analysis.
How much peptide should one load for each fraction vs. an unfractionated sample?
Suppose I normally load 1 µg of unfractionated sample. That 1 µg is spread over the chromatogram. Now if I have 10 fractions, should I load approximately 100 ng per fraction (1 µg/10)? Because if I load 1 µg per fraction too, then those peptides will be concentrated in one region of the chromatogram. It's the same logic as why peptides derived from a pure protein are loaded in much smaller amounts. Am I thinking correctly? What do you do?
These things are not really explained in the publications. Thanks for helping out.
Proteomics scientist in training here. I've conducted a phosphoproteomics experiment to study the effects of different inhibitor treatments on a cancer model. I have my list of differentially expressed proteins, which looks good enough, but I don't know how to move forward now.
One of the conditions combines inhibitor treatments, and I have been comparing its significant phosphosites with those detected in the other conditions to see where the overlap is. I have been thinking about taking the overlapping ones, i.e. the contributions from each treatment, and seeing what pathways they belong to and what this could mean functionally. But I am running dry here (even with 90 shared phosphosites...). The few pathways that I could identify are only based on 2-3 hits, which seems flimsy to me.
I generally struggle with this a bit and my supervisor is no help. How do I draw meaningful conclusions from my results? There must be a better way than checking the connection of every single phosphosite manually?
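One way to judge whether 2-3 hits in a pathway is flimsy is an over-representation test: given the background of everything that was quantified, how likely is it to see at least that many pathway members among the hits by chance? Below is a minimal sketch using the hypergeometric distribution with only the Python standard library. The background size (5000) and pathway size (40) are made-up numbers; only the 90 hits and the 3 pathway members come from the post.

```python
from math import comb

def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement from a
    population of size `population` that contains `successes` pathway members."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 5000 quantified phosphoproteins, 40 of them in the pathway,
# 90 hits of which 3 fall in the pathway.
background, pathway_size, n_hits, n_in_pathway = 5000, 40, 90, 3

expected = n_hits * pathway_size / background
p_value = hypergeom_sf(n_in_pathway, background, pathway_size, n_hits)
print(f"expected by chance: {expected:.2f}, P(X >= {n_in_pathway}) = {p_value:.3f}")
```

Two things matter more than the code: the background should be what was actually detected in the experiment rather than the whole proteome, and testing many pathways calls for a multiple-testing correction. Kinase-substrate or site-centric enrichment tools apply the same idea at the phosphosite level instead of the protein level.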
Is there a proteomics Slack channel available? I saw there is one that exists for computational MS.
Is anyone in one who could share the link? Thanks in advance.
I have 4 conditions in my experiment (A, B, C, D). If I search the conditions in Spectronaut separately and then merge them in R, I get different results than if I search the 4 conditions together.
It is directDIA. I am using data at the protein level, merging the files by “protein groups”.
Why do I have different results and what’s the best method to use?
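One likely contributor is that protein grouping depends on which peptides were identified in a given search, so the same protein can end up in differently composed groups in separate searches, and an exact-match merge on the group ID then silently splits or drops it. Below is a small diagnostic sketch in Python (the same logic carries over to R). It assumes group IDs are semicolon-separated accession lists; the IDs themselves are made up.

```python
def accessions(group_id, sep=";"):
    """Split a protein-group ID into its set of member accessions."""
    return frozenset(part.strip() for part in group_id.split(sep) if part.strip())

def diagnose_merge(groups_a, groups_b):
    """Compare the protein groups of two searches.

    Returns (exact, partial): group IDs that match exactly, and pairs of
    non-matching IDs that nevertheless share at least one accession.
    """
    set_a, set_b = set(groups_a), set(groups_b)
    exact = set_a & set_b
    only_a, only_b = set_a - set_b, set_b - set_a
    partial = sorted(
        (a, b) for a in only_a for b in only_b if accessions(a) & accessions(b)
    )
    return exact, partial

# Made-up group IDs from two hypothetical separate searches.
search_a = ["P1", "P2;P3", "P4"]
search_b = ["P1", "P2", "P5"]

exact, partial = diagnose_merge(search_a, search_b)
print("exact matches:", sorted(exact))
print("same protein, different group:", partial)
```

If the second list is long for real data, the separate searches are not directly comparable at the group level. Searching all conditions together gives one consistent set of groups, and FDR control and cross-run normalization are then also applied over the whole experiment rather than per search.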
Additionally, are you doing discovery/untargeted or more targeted workflows? I’m trying to get a better understanding of the landscape. I worked on the MS side for a long time, but now I’m in the chromatography space in industry and trying to get a better feel for what people are doing.
Is anyone familiar with the Evosep One and how reliable it is? I was talking with a rep for a certain large conglomerate science company, and they said they have seen people return them in exchange for another UHPLC. I thought the whole point of an Evosep, with its single high-pressure pump, was reliability. I understand it could just be the rep trying to pressure me into buying their stuff, so I was hoping I could get some unbiased reviews of the instrument. Thanks!
Our core facility is looking to invest in a high-end mass spectrometer. Our primary applications are bulk DIA proteomics and PTM analysis of tissue and cell proteins, with a strong emphasis on achieving routine high proteome coverage.
After demoing the Bruker timsTOF Ultra 2 and the Thermo Astral, their performance has been comparable so far. Now we're facing a tough decision and would love to hear your insights:
1️⃣ Maintenance & Reliability: What's been your experience with the upkeep, troubleshooting, and service quality of these instruments? Are there any long-term quirks or hidden costs to be aware of?
2️⃣ Timing the Purchase: ASMS 2025 is just around the corner. Do you think it’s worth waiting to see if new models or upgrades are announced, or should we move forward now with the proven options?
Hello all.
It was suggested to me to do a sequential digest instead of a double digest for my protein. I've done a solo in gel digestion with trypsin only.
In this case, I would like to ask how to process an in-gel digested sample with trypsin before doing a chymotrypsin digest.
After in gel digestion, stop the reaction with TFA then dry and proceed with chymotrypsin? After in gel digestion, get the supernatant then proceed with chymotrypsin? Do I just add the chymotrypsin in the tube after incubation without deactivating trypsin?
I have read the double digest protocol, but papers doing sequential digestion after trypsin differ or do not fully describe the protocol.
Some additional questions...
Does reduction and alkylation matter for trypsin/chymotrypsin digestion? I've done trypsin digestion without doing reduction and alkylation. At that time, I just had to identify my protein in MASCOT and not check for PTMs.
Thank you! Much appreciated.
On an exemplar exam question, my professor said to assume that I eluted the peptides from the binding cleft of two HLA proteins and ran them through mass spectrometry, resulting in the table below, and that “the peptides in each group were aligned to emphasize common motifs”. I understand that the letters represent amino acids, but beyond that I am clueless as to how to read this table - like, what would I even google to find info on how to read this? I have a pretty weak background in advanced science stuff (I wandered into this class from a graduate health sciences program). I suspect the highlighted regions are the 1 and 2 regions that give the molecule its “self” character, but past that I’m lost.
Despite what Google's klugey ChatGPT knock-off seems to think, Spectronaut cannot process Agilent DIA data. Neither can FragPipe. I found that Skyline is listed as Agilent-compatible for DIA. I'll try DIA-NN, but I think the scan header issues that make the data incompatible with FragPipe will also apply to DIA-NN. Is there anything else I'm not thinking of and should look into? I've gone down the list of the most common free stuff.
Hi! I want to prepare my stocks of TMTpro (18 plex). I don't have anhydrous acetonitrile. Is it OK to dry it with CaCl2? Is there any way this can interfere with subsequent labelling steps? I will of course desalt afterwards. Thank you in advance
I am currently working in a lab doing research but I really want to get into bioinformatics and proteomics. My company might start doing the proteomics in house.
I'm curious if something like the UCSD online bioinformatics certificate is worth my time or if I should just go back to school.
https://extendedstudies.ucsd.edu/certificates/applied-bioinformatics
Just look at the dates and the success rate
https://web.archive.org/web/20220626103547/https://hupo.org/
https://web.archive.org/web/20241112152157/https://www.hupo.org/
Is it my browser, or my connection?
What do you see if you load hupo.org right now?
After incubating cell lysate (in buffer with SDS and Triton) with streptavidin magnetic beads, can I remove the detergents by multiple washes with detergent-free buffer? Will that make it detergent-free enough for downstream proteomics? Is that a valid approach?
I have a whole cell lysate of a human cell line, where I am expecting 20-50 proteins to be biotinylated (out of the 15-20k proteins in the lysate). These proteins will get immobilized on streptavidin magnetic beads by incubation with the lysate.
Now I want to identify these 20-50 proteins by mass spec. These proteins are biotinylated at very specific residues only. I don't need to identify the residue; identifying the proteins is enough. However, I am unsure how to go about it.
Shall I do on-bead digestion? My beads are not the trypsin-resistant variety, so how do I reduce streptavidin cleavage in this case?
Or shall I denature the beads to release the bound proteins, and then trypsinize? I am afraid a lot of streptavidin will get released by harsh denaturation conditions as well. I read somewhere that GuCl pH 1.5 should specifically release proteins but not streptavidin, but I am not sure.
Any guidance, advice, or published protocols on either of these two approaches would be highly appreciated. I know it's a complicated topic and this sub is my best bet (because I don't have anyone doing proteomics nearby).
Thanks a lot. Please help me out.