Preprint / Version 1

LanguageFold: A Bio-inspired Hierarchical Sparse Attention Mechanism for Large Language Models

This article is a preprint and has not been certified by peer review.

Authors
Wu, Y., & Huang, K.

Categories
Keywords
Biology4AI; natural language processing; large language model; attention mechanism

Abstract

Large language models predominantly rely on the Transformer architecture, whose self-attention mechanism incurs a quadratic O(N²) computational cost with respect to input length, leading to significant memory and computation bottlenecks when processing ultra-long contexts. This work proposes LanguageFold, a hierarchical sparse attention mechanism inspired by the Self-Returning Random Walk model of genome folding (Huang et al., 2020). LanguageFold decomposes global attention into dynamically constructed tree attention with a theoretical scaling of O(N log N). Preliminary experiments on prompt-based generation and the DROP reading comprehension benchmark indicate that this tree-structured attention enables efficient language processing while preserving accuracy and enhancing structural interpretability. These results highlight the promise of genome-inspired attention mechanisms for improving the scalability of large language models.
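To make the O(N log N) claim concrete, the following minimal Python/NumPy sketch shows one common way hierarchical (tree-structured) sparse attention can be organized: each token attends to the keys in its own block plus one mean-pooled summary per level of a binary tree built over blocks, so each query sees only O(log N) coarse keys instead of all N. This is not the authors' LanguageFold implementation; the block size, the binary-tree mean-pooling scheme, and all function names below are assumptions made purely for illustration.

```python
# Illustrative sketch of hierarchical (tree-structured) sparse attention.
# NOT the LanguageFold implementation from the preprint: the pooling scheme,
# block size, and names are assumptions used only to show how attending to
# O(log N) pooled summaries per token yields roughly O(N log N) total cost.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def tree_sparse_attention(q, k, v, block=8):
    """Each query attends to (a) the keys in its own block and (b) one
    mean-pooled summary key per level of a binary tree over blocks.
    With B = N / block leaves, every query sees block + O(log B) keys,
    so the overall cost scales as ~O(N log N) rather than O(N^2)."""
    n, d = q.shape
    assert n % block == 0
    nb = n // block
    k_blocks = k.reshape(nb, block, d)
    v_blocks = v.reshape(nb, block, d)

    # Build pooled summaries level by level: level 0 holds per-block means,
    # and each higher level averages pairs of nodes from the level below.
    levels_k, levels_v = [k_blocks.mean(axis=1)], [v_blocks.mean(axis=1)]
    while levels_k[-1].shape[0] > 1:
        pk, pv = levels_k[-1], levels_v[-1]
        m = pk.shape[0] // 2 * 2  # drop an odd trailing node for simplicity
        levels_k.append(pk[:m].reshape(-1, 2, d).mean(axis=1))
        levels_v.append(pv[:m].reshape(-1, 2, d).mean(axis=1))

    out = np.zeros_like(q)
    for b in range(nb):
        # Fine-grained keys/values from the query's own block.
        keys, vals = [k_blocks[b]], [v_blocks[b]]
        idx = b
        # One coarse summary per tree level (the block's ancestor at that level).
        for lvl_k, lvl_v in zip(levels_k[1:], levels_v[1:]):
            idx //= 2
            if idx < lvl_k.shape[0]:
                keys.append(lvl_k[idx:idx + 1])
                vals.append(lvl_v[idx:idx + 1])
        K = np.concatenate(keys, axis=0)
        V = np.concatenate(vals, axis=0)
        Q = q[b * block:(b + 1) * block]
        attn = softmax(Q @ K.T / np.sqrt(d))
        out[b * block:(b + 1) * block] = attn @ V
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 64, 16
    q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
    print(tree_sparse_attention(q, k, v).shape)  # (64, 16)
```

In this toy version each query compares against block + log2(N / block) keys rather than N, which is the source of the near-linear scaling; how LanguageFold actually constructs and updates its tree dynamically is described only at a high level in the abstract.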

Posted

2026-01-29

How to Cite

Wu, Y., & Huang, K. (2026). LanguageFold: A Bio-inspired Hierarchical Sparse Attention Mechanism for Large Language Models. LangTaoSha Preprint Server. https://doi.org/10.65215/LTSpreprints.2026.01.28.000108

Declaration of Competing Interests

The authors declare no competing interests.