Wang, Yiping, et al. “Compression-Based Tokenization Improves Language Modeling of Hierarchical Genomic Structure.” LangTaoSha Preprint Server, 11 Dec. 2025, https://doi.org/10.65215/2qt5jb81.