SPIN-dvEvo: Exploration of vast functional sequence space by directed virtual evolution from a local sequence cluster
Abstract
Both natural and directed evolution are powerful means of improving protein function, but both explore the nearly endless sequence space slowly. Here, we present SPIN-dvEvo, which couples few-shot low-rank adaptation (LoRA) of an ESM-2 protein language model with a genetic algorithm to rapidly evolve functional remote homologs from a local cluster of highly homologous, binary-labeled sequences. We experimentally tested SPIN-dvEvo on an enzyme (TadA, the core deaminase component of adenine base editors) and an intrinsically disordered protein (the antitoxin CcdA). For TadA, virtually evolved sequences with low sequence identity to the starting sequences achieved a 38% success rate (23/60) in the first round. In the second round, for which SPIN-dvEvo was retrained on first-round labels, the success rate rose to 51%, accompanied by a one-order-of-magnitude improvement in enzymatic activity. Virtual evolution of the disordered protein CcdA was also successful, albeit at a low success rate of 2.6%. Thus, SPIN-dvEvo can simulate billions of years of evolution in just minutes, rapidly creating new functional clusters.
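The overall loop described in the abstract (a learned scorer guiding a genetic algorithm away from a small cluster of starting sequences) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the mutation, crossover, or selection scheme, so the operators, parameter values, and function names below (`evolve`, `mutate`, `crossover`, `identity`) are all assumptions chosen for illustration. In particular, `toy_score` is a placeholder that merely counts hydrophobic residues; in SPIN-dvEvo this role would be played by the LoRA-adapted ESM-2 model trained on the binary labels.

```python
import random
from typing import Callable, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Replace each position with a random residue with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else c for c in seq
    )


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover of two equal-length parents (length >= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(
    seeds: List[str],
    score: Callable[[str], float],
    generations: int = 50,
    pop_size: int = 100,
    rate: float = 0.05,
    elite: int = 10,
    seed: int = 0,
) -> List[str]:
    """Elitist genetic algorithm; returns the final population, best first.

    All hyperparameters here are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    population = list(seeds)
    for _ in range(generations):
        # Keep the top-scoring sequences unchanged (elitism).
        parents = sorted(population, key=score, reverse=True)[:elite]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.choice(parents), rng.choice(parents)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        population = parents + children
    return sorted(population, key=score, reverse=True)


def toy_score(seq: str) -> float:
    """Placeholder scorer: fraction of hydrophobic residues.

    Stands in for a learned functional classifier; it has no
    biological meaning.
    """
    return sum(c in "AILMFVW" for c in seq) / len(seq)
```

A call such as `evolve(["MKTAYIAKQR", "MKTAYIAKQK"], toy_score, generations=20, pop_size=30)` returns a ranked population, and `identity(best, seed_seq)` can then be used to check how far the top candidates have drifted from the starting cluster. Because the best parents are carried over unchanged, the top score never decreases from one generation to the next.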
Declaration of Competing Interests
The authors declare no competing interests to disclose.
Copyright
The copyright holder for this preprint is the author/funder.

This work is licensed under a Creative Commons Attribution 4.0 International License.