Beyond Transformers: Evaluating the Robustness and Efficiency of State-Space Models for Next-Generation Natural Language Processing
Keywords: Transformers, state-space models

Abstract
Transformer architectures have dominated advances in natural language processing (NLP) in recent years, yet their growing computational demands and limited robustness motivate the exploration of alternative models. This study qualitatively evaluates State-Space Models (SSMs) as a promising next-generation architecture for NLP tasks. Through a comprehensive literature analysis and comparative examination of current research, the paper investigates SSMs' theoretical foundations, robustness to input perturbations, efficiency in handling long sequences, and applicability to diverse linguistic contexts. The results show that SSMs offer compelling advantages over Transformers in memory efficiency and sequence-modeling capacity, while demonstrating competitive or superior robustness on several NLP benchmarks. These findings highlight SSMs' potential as efficient, scalable, and robust alternatives for future NLP applications.
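The linear recurrence at the core of SSMs can be sketched as follows. This is a minimal, generic illustration of a discrete-time state-space layer (x_{t+1} = A x_t + B u_t, y_t = C x_t), not the implementation of any specific model discussed in the paper; practical SSMs such as S4 or Mamba use structured state matrices and parallel-scan algorithms, but the per-step cost here is already constant in sequence length, which is the source of the efficiency advantage over quadratic self-attention.

```python
import numpy as np

def ssm_scan(A, B, C, inputs):
    """Run a discrete linear state-space recurrence over a 1-D input sequence.

    A: (d, d) state-transition matrix
    B: (d,)   input projection
    C: (d,)   output projection
    inputs: iterable of scalar inputs u_t

    Each step costs O(d^2), independent of sequence length, so the whole
    scan is linear in the number of tokens.
    """
    d = A.shape[0]
    x = np.zeros(d)          # hidden state, fixed size regardless of context length
    outputs = []
    for u_t in inputs:
        x = A @ x + B * u_t  # state update: x_{t+1} = A x_t + B u_t
        outputs.append(C @ x)  # readout: y_t = C x_t
    return np.array(outputs)

# Toy 1-dimensional example: a decaying memory of past inputs.
ys = ssm_scan(np.array([[0.5]]), np.array([1.0]), np.array([1.0]),
              [1.0, 0.0, 0.0])
# An impulse at t=0 decays geometrically: [1.0, 0.5, 0.25]
```

Because the hidden state `x` has fixed size, memory use does not grow with context length, in contrast to the key-value caches of Transformer attention.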