Inferring interaction partners from protein sequences
Specific protein-protein interactions play crucial roles in the stability of multi-protein complexes and in signal transduction. Thus, mapping these interactions is key to a systems-level understanding of cells. However, systematic experimental identification of protein interaction partners is still challenging, while a large and rapidly growing amount of sequence data is now available. Is it possible to identify which proteins interact just from their sequences? We propose an approach based on sequence covariation, building on statistical inference methods that have been used successfully to predict the three-dimensional structures of proteins from sequences alone. Our method identifies specific interaction partners with high accuracy among the members of several ubiquitous prokaryotic protein families, and accurately distinguishes interacting protein families from noninteracting ones, using only sequence data.