High-throughput cryo-EM characterization and automated model building of glycofibrils via CryoSeek
Abstract
Using CryoSeek, a structure-first paradigm for discovery, we have determined high-resolution 3D structures of a number of glycofibrils, in which well-ordered glycans either form a thick shell coating various protein cores or constitute the entire fibril. To improve the throughput of CryoSeek, we report two methods here. The recursive bisection clustering (RBC) strategy is designed to enable high-throughput cryo-EM data processing of fibrils. EModelG is an AI-facilitated algorithm for automated model building of glycans. Using the RBC method, we have established a high-throughput workflow for CryoSeek and have reconstructed 3D EM maps for hundreds of fibrils that can be modelled automatically in EModelG. Based on their molecular compositions and structural features, we tentatively propose a unified nomenclature scheme for the fibrils discovered via CryoSeek. These structures will lay the foundation for decoding the principles of glycan folding. Furthermore, to accommodate the high volume of cryo-EM structures rapidly obtained with the CryoSeek strategy, we have established a namesake database for data archiving and sharing.
Declaration of Competing Interests
The authors declare no competing interests.
Copyright
The copyright holder for this preprint is the author/funder.
All rights reserved. This work is protected by copyright. No part of this work may be reproduced, distributed, or transmitted in any form or by any means without the prior written permission of the copyright holder.