Preprint / Version 1

Predicting Higher-Order Chromatin Interactions with PHOCI

This article is a preprint and has not been certified by peer review.

Keywords
chromatin structure; gene regulation; deep learning

Abstract

Understanding how the three-dimensional (3D) organization of chromatin shapes gene regulation requires moving beyond pairwise contacts to capture higher-order interactions among multiple genomic loci. Although emerging experimental techniques such as Pore-C provide snapshots of such multi-way interactions, their limited coverage and inherent sparsity hinder systematic characterization and cross-cell-type generalization. Here we present PHOCI (Predictor of Higher-Order Chromatin Interactions), a computational framework for probabilistic modeling of multi-way chromatin interactions directly from Hi-C and epigenomic data. Across multiple cell lines, PHOCI distinguishes experimentally observed interactions from challenging negative configurations and generalizes to previously uncharacterized cell types. The framework further enables generation of realistic multi-way interaction datasets and identification of recurrent multi-locus regulatory modules through association rule mining. Notably, at the MYB locus in K562 cells, predicted enhancer combinations exhibit non-additive, synergistic effects on gene expression, as validated by CRISPR interference experiments, providing direct evidence that higher-order interactions encode regulatory logic beyond pairwise contacts. By transforming sparse pairwise measurements into a probabilistic representation of higher-order chromatin architecture, PHOCI bridges a key gap in genomic modeling and provides a scalable foundation for studying the multi-body regulatory principles underlying genome function.
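
The abstract does not specify how association rule mining is applied to the predicted interactions. As a minimal sketch only, assuming each multi-way interaction is treated as a "transaction" of genomic bins, the Python example below counts co-occurring loci and reports rules of the form "two loci imply a third" together with their support and confidence. The function name `mine_rules`, the thresholds, and the bin labels (`E1`, `E2`, `E3`, `P`) are hypothetical and are not taken from PHOCI.

```python
from collections import Counter
from itertools import combinations


def mine_rules(interactions, min_support=0.4, min_confidence=0.6):
    """Brute-force rule mining over multi-way interactions (illustrative only).

    Each interaction is an iterable of locus labels. Returns a sorted list of
    (antecedent_pair, consequent, support, confidence) tuples.
    """
    n = len(interactions)
    counts = Counter()
    for interaction in interactions:
        loci = sorted(set(interaction))
        # Count every pair and triple of loci seen together in one interaction.
        for k in (2, 3):
            for combo in combinations(loci, k):
                counts[combo] += 1

    rules = []
    for itemset, count in counts.items():
        support = count / n
        if len(itemset) != 3 or support < min_support:
            continue
        for consequent in itemset:
            antecedent = tuple(x for x in itemset if x != consequent)
            # Confidence: fraction of interactions containing the pair
            # that also contain the third locus.
            confidence = count / counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Hypothetical toy interactions; labels are made up for illustration.
toy_interactions = [
    ("E1", "E2", "P"),
    ("E1", "E2", "P"),
    ("E1", "E2", "P", "E3"),
    ("E1", "E3"),
    ("E2", "P"),
]

rules = mine_rules(toy_interactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Exhaustive enumeration of pairs and triples is only practical for small inputs; an Apriori-style search prunes candidate sets whose subsets already fall below the support threshold.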

Posted

2026-06-11

How to Cite

Wu, Y., Jiang, X., Yang, Z., Wang, Y., Li, L., Xu, J., Deng, P., Hou, C., Yu, C., & Huang, K. (2026). Predicting Higher-Order Chromatin Interactions with PHOCI. LangTaoSha Preprint Server. https://doi.org/10.65215/LTSpreprints.2026.06.10.000272

Declaration of Competing Interests

The authors declare no competing interests.