Significant advances in language modelling by scientists from IICT-BAS

Georgi Shopov

Georgi Shopov, a PhD student at the Institute of Information and Communication Technologies of the Bulgarian Academy of Sciences (IICT-BAS), took part in the world’s leading conference in the field of natural language processing, “Empirical Methods in Natural Language Processing” (EMNLP), which was held from 12 to 16 November 2024 in Miami, USA. At the conference, he presented new scientific results in language modeling that were achieved at IICT-BAS and form the main part of his dissertation. Of the 6105 papers submitted, 1271 were selected for the main conference. Georgi Shopov’s paper, co-authored with Associate Professor Stefan Gerdjikov from IICT-BAS and FMI of SU “St. Kliment Ohridski”, was among the 168 selected for oral presentation.

In recent years, language models have established themselves as a fundamental approach in Artificial Intelligence. They have demonstrated remarkable abilities in solving problems related to natural language processing, programming, and protein modeling, and in carrying out basic linguistic and mathematical reasoning. However, the widely used modern language models (ChatGPT, Llama, Gemini, Claude) are unidirectional: they process and generate text strictly from left to right. The fixed directionality of such models severely limits their expressiveness.

In their work, the scientists from IICT-BAS present a new theoretical view of language modeling based on well-known formalisms from automata theory. Thanks to this formal connection, they have introduced a new class of bidirectional language models that are strictly more expressive than unidirectional ones and allow significantly more complex problems to be solved. Another advantage of bidirectional language models is their higher efficiency compared to unidirectional ones. In other words, bidirectional language models allow text generation to be performed faster, on lower-performance computing devices, and at lower power consumption, which greatly increases their applicability.

In the future, the researchers at IICT-BAS plan to develop bidirectional language models further in order to enable effective control of the generated text, to determine the depth of inference, and to avoid the so-called hallucinations of language models.

Link to publication:

https://aclanthology.org/2024.emnlp-main.328.pdf