Heterogeneity of the GFP fitness landscape and data-driven protein design
Understanding the relationship between genotype and phenotype, known as the fitness landscape, helps elucidate the fundamental laws of heredity and may ultimately lead to novel methods of protein design. In the present work, a team of scientists from the Group of synthetic biology and the Group of molecular tags for optical nanoscopy at IBCh RAS, in collaboration with foreign colleagues, combined several approaches to engineer new variants of naturally occurring green fluorescent proteins (GFPs) by generating tens of thousands of GFP mutant variants and assessing their ability to fluoresce. Machine learning algorithms were then used to predict the performance of other GFP variants and to expand their fitness landscape. The results indicate that, to generate functional protein variants and to predict a protein’s function, the algorithm requires only data on the effects of single-site mutations and on their dependence on each other (low-order epistasis). The results are published in the journal eLife.
June 16, 2022