State-of-the-art natural language processing (NLP) relies heavily on machine learning, especially through the use of large language models. Recently, variational quantum algorithms have emerged as a promising tool for obtaining computational speedups in machine learning using a hybrid setting of classical and quantum computers. We apply masked language modeling (MLM) in the context of quantum natural language processing, particularly the DisCoCat framework, a mathematical framework for NLP based on category theory. We reformulate MLM as both a binary and a multiclass classification task and show how masking can be conceptualized inside a quantum computing environment. For both cases, we train quantum machine learning models using quantum simulators on a toy dataset. In addition, we report an extensive hyperparameter search for our experiments, and we present a strategy to reduce the number of qubits, which also lowers task complexity. The evaluation of the models trained with binary classification shows promising results, with an average accuracy of about 79% on test data. The models trained on the multiclass classification task are less successful, reaching about 29% accuracy on unseen data. We discuss the reasons for this outcome and present a potential path forward.
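To make the binary reformulation of MLM concrete, the following minimal sketch trains a generic variational quantum classifier on simulated toy data: the circuit output decides which of two candidate words fills the masked position. This is an illustration only, not the paper's DisCoCat pipeline; the library choice (PennyLane), circuit width, ansatz, and data are all assumptions.

```python
# Minimal sketch: masked-token prediction as binary classification with a
# variational quantum circuit on a simulator. Illustrative assumptions only;
# the paper's actual pipeline is DisCoCat-based.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4                                   # assumed circuit width
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def circuit(weights, features):
    # Encode features for the masked sentence, apply a trainable ansatz,
    # and read the Z expectation of qubit 0 as the binary decision.
    qml.AngleEmbedding(features, wires=range(n_qubits))
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return qml.expval(qml.PauliZ(0))

def binary_loss(weights, X, y):
    # Mean squared error between the +/-1 labels and circuit outputs.
    loss = 0.0
    for x, label in zip(X, y):
        loss = loss + (circuit(weights, x) - label) ** 2
    return loss / len(X)

# Toy random data standing in for the masked-sentence dataset.
np.random.seed(0)
X = np.random.uniform(0, np.pi, (8, n_qubits))
y = np.array([1, -1] * 4, requires_grad=False)
shape = qml.StronglyEntanglingLayers.shape(n_layers=2, n_wires=n_qubits)
weights = np.random.uniform(0, np.pi, shape, requires_grad=True)

opt = qml.GradientDescentOptimizer(stepsize=0.2)
for _ in range(20):
    weights = opt.step(lambda w: binary_loss(w, X, y), weights)
```

The multiclass variant would replace the single-qubit readout with a multi-qubit measurement (or several expectation values) mapped to word candidates, which is one reason it is the harder task.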