A new mass spectrometry-based technique can be used to map the location of thousands of proteins in the cell simultaneously, shedding light on different proteomic cell states in health and disease.
The omics era has made it routine to analyze whole genomes or transcriptomes to answer diverse scientific questions. However, while the proteome is in fact the true functional representation of a cell at any given moment, similar advances on the protein level have lagged behind. To acquire deeper insight into what makes one cell function differently from another, in a heterogeneous tissue or in diseased compared to healthy states, technological advances in the proteomics field are needed.
A recent paper in Molecular Cell describes such a refreshing addition to the proteomics toolbox, dubbed SubCellBarCode, which even includes a searchable online resource where you can look up the location of your favorite protein. The method is based on cellular fractionation followed by mass spectrometry and boasts it can pinpoint the location of more than 12 000 cellular proteins simultaneously with an excellent coverage across the human proteome ranging from 63% of all transcription factors to 96% of all chromatin remodeling factors. Using statistical methods, highly robust proteins can be used as markers for certain localizations in the cell, shown to be stable across five different human cell lines. Based on these marker proteins, 15 distinct clusters were defined within all cellular localizations, which can in turn be grouped into four cellular neighborhoods (secretory, nuclear, cytosol, and mitochondria). The output of the method is a SubCellBarCode or simply bar code for each individual protein based on the probability of its cellular localization. The bar code is a stacked barplot combining the probabilities calculated for each of the 15 compartments or four neighborhoods, always adding up to 1.
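To make the bar code idea concrete, here is a minimal Python sketch of how per-compartment probabilities could be normalized to sum to 1 and then rolled up into the four neighborhoods. It is an illustration only, not the authors' classifier: the compartment labels (S1 to S4, N1 to N4, C1 to C5, M1 and M2), the function names and the example scores are all assumptions made up for this example.

```python
# Illustrative sketch only; labels, names and scores are assumptions,
# not taken from the SubCellBarCode software.

# Hypothetical labels: 15 compartments grouped into 4 neighborhoods.
NEIGHBORHOODS = {
    "secretory": ["S1", "S2", "S3", "S4"],
    "nuclear": ["N1", "N2", "N3", "N4"],
    "cytosol": ["C1", "C2", "C3", "C4", "C5"],
    "mitochondria": ["M1", "M2"],
}


def make_barcode(scores):
    """Normalize non-negative per-compartment scores so they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one compartment score must be positive")
    return {compartment: s / total for compartment, s in scores.items()}


def neighborhood_barcode(barcode):
    """Sum compartment probabilities within each neighborhood."""
    return {
        name: sum(barcode.get(c, 0.0) for c in members)
        for name, members in NEIGHBORHOODS.items()
    }


# Made-up scores for a single, mostly nuclear protein.
example_scores = {c: 0.0 for members in NEIGHBORHOODS.values() for c in members}
example_scores.update({"N1": 6.0, "N2": 2.0, "C1": 2.0})

compartment_barcode = make_barcode(example_scores)
neighborhood_probs = neighborhood_barcode(compartment_barcode)

for name, prob in neighborhood_probs.items():
    print(f"{name}: {prob:.2f}")
```

Each dictionary plays the role of one stacked bar: the 15 compartment values and the four neighborhood values both add up to 1, so a protein with a single dominant segment has a confident localization call.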
The authors admit the protocol is not without shortcomings: proteins with functions in multiple subcellular locations would be present in several fractions, leading to a reduced signal and a risk of dropout. This is estimated to pose a potential problem for around 10% of proteins in the cell. However, this percentage is lower than previously reported, as the analysis indicates that most apparent multi-localization events are actually artefacts of earlier methods or of the uncertain evidence on which database classifications are based.
Initial application of the method on data acquired in different cell lines provided insight into general cellular processes. For example, the authors could show that splicing is generally not used to direct different isoforms of the same protein to different locations and that this is also not an explanation of differential localization in different cell lines. Secondly, it seems that cell types cannot be distinguished based on localization of proteins shuttling between different compartments, but rather based on differential expression of proteins in secretory compartments such as the plasma membrane. Expression of nuclear proteins on the other hand is much less useful in distinguishing cell types from each other.
Further on, we can glimpse potential broad applications of the method in studying protein complexes or drug responses. Comparison of the localization of cellular complexes in the five cell lines showed that some partners of protein complexes have a more flexible localization than the core components. Exploration of this principle in important cellular complexes could lead to an advanced understanding of their regulation and function. Lastly, bar codes of a lung cancer cell line revealed that certain candidate proteins change their localization in response to EGFR inhibition, a common cancer therapy. These data clearly demonstrate the potential of the method to explore the uncharted territory of proteomics in drug sensitivity and response.
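One simple way to picture a localization change between two conditions is to compare a protein's neighborhood probabilities before and after treatment. The Python sketch below uses the total variation distance for this. This metric is an assumption chosen for illustration and is not claimed to be the statistic used in the paper; the function name and the probabilities are likewise made up.

```python
# Illustrative sketch only; the metric, names and numbers are assumptions,
# not the relocalization analysis from the paper.

def relocalization_shift(before, after):
    """Total variation distance between two neighborhood bar codes.

    Returns 0.0 for identical bar codes and 1.0 when the probability
    mass has moved entirely to different neighborhoods.
    """
    keys = set(before) | set(after)
    return 0.5 * sum(abs(before.get(k, 0.0) - after.get(k, 0.0)) for k in keys)


# Made-up neighborhood probabilities for one protein in two conditions.
untreated = {"secretory": 0.1, "nuclear": 0.1, "cytosol": 0.7, "mitochondria": 0.1}
treated = {"secretory": 0.1, "nuclear": 0.6, "cytosol": 0.2, "mitochondria": 0.1}

shift = relocalization_shift(untreated, treated)
print(f"shift: {shift:.2f}")
```

In this invented example half of the probability mass moves from the cytosol to the nucleus, so the shift is large. A real analysis would also need replicates and a significance threshold before calling a relocalization.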
- Orre LM et al. SubCellBarCode: Proteome-wide Mapping of Protein Localization and Relocalization. Mol Cell. 2019 Jan 3;73(1):166-182.e7.