On Context-free Bigram Languages


This article considers languages over the alphabet {a_1, …, a_n} in whose words the proportions of all pairs of consecutive letters a_i a_j are fixed. These proportions are specified by a matrix Θ that generates the language. The author calls such languages bigram languages. Natural languages have a similar property. It turns out that whether such a language is empty, finite, regular, context-free, or context-sensitive can be determined from the matrix Θ. This paper examines in detail the case of infinite context-free languages.
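To make the notion of fixed bigram proportions concrete, here is a minimal Python sketch. It assumes, as a reading of the abstract rather than the paper's formal definition, that a word fits Θ when its matrix of bigram counts is a positive multiple of Θ. The function names `bigram_counts` and `has_bigram_proportions` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def bigram_counts(word, alphabet):
    """Return the n x n matrix whose (i, j) entry counts occurrences
    of the pair alphabet[i] alphabet[j] at consecutive positions in word."""
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0] * n for _ in range(n)]
    for left, right in zip(word, word[1:]):
        counts[index[left]][index[right]] += 1
    return counts


def has_bigram_proportions(word, alphabet, theta):
    """Check whether the word's bigram count matrix equals k * theta
    for some positive rational k (an assumed membership criterion)."""
    counts = bigram_counts(word, alphabet)
    ratio = None
    for count_row, theta_row in zip(counts, theta):
        for count, target in zip(count_row, theta_row):
            if target == 0:
                if count != 0:
                    return False  # a pair that theta forbids occurs in the word
                continue
            if count == 0:
                return False  # a pair that theta requires is missing
            current = Fraction(count, target)
            if ratio is None:
                ratio = current
            elif current != ratio:
                return False  # counts are not proportional to theta
    return ratio is not None  # an all-zero theta matches no word here
```

For example, with the alphabet "ab" and Θ = [[0, 1], [1, 0]], the words "aba" and "ababa" pass the check, while "abab" does not, because it contains the pair ab twice and the pair ba only once.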

Intelligent Systems
Aleksandr Petiushko Александр Петюшко
Director, Head of ML Research / Adjunct Professor / PhD

Principal R&D Researcher (15+ years of experience), R&D Technical Leader (10+ years of experience), and R&D Manager (8+ years of experience). Running and managing industrial research and academic collaboration (35+ publications, 30+ patents). Inspired by theoretical computer science and how it changes the world.