Automata and orders: the Wheeler class. Complexity, bounds, and operations

Martincigh, Davide

In this thesis we discuss Wheeler automata, a subclass of finite states automata (NFAs) ordered by co-lexicographic order on strings, which allows efficient storage and substring query mechanisms. Wheeler automata form an important data structure for languages, as the determinization process via powerset construction is polynomial, making classical problems solvable in polynomial time. We investigate computational problems related to recognizing Wheeler automata starting from NFAs and reduced NFAs, noting that non-determinism generally leads to intractability. We also examine state complexity in Wheeler DFAs and prove that intersection of languages is computationally simpler, and we provide a construction for the minimum Wheeler DFA. Additionally, we explore the Krohn-Rhodes Decomposition Theorem (KRDT) for two compression-oriented classes of automata: path-coherent and Wheeler automata. These classes are efficiently encodable and indexable. We prove that automata in these classes can be decomposed into a cascade with a number of components linear to the original automaton's states. For Wheeler automata, only two-state resets are needed, avoiding full KRDT through a simpler inductive argument. Lastly, we extend the analysis of Wheeler automata to arbitrary NFAs using a parameter called width, which indicates how far an automaton is from being Wheeler. Specifically, we focus on the difference between the width calculated on DFAs and that calculated on NFAs recognizing a given language, showing that their distance can be exponentially large and providing useful lower bounds for the latter.

Automata and orders: the Wheeler class. Complexity, bounds, and operations

MARTINCIGH, Davide

2024

Abstract

In this thesis we discuss Wheeler automata, a subclass of finite states automata (NFAs) ordered by co-lexicographic order on strings, which allows efficient storage and substring query mechanisms. Wheeler automata form an important data structure for languages, as the determinization process via powerset construction is polynomial, making classical problems solvable in polynomial time. We investigate computational problems related to recognizing Wheeler automata starting from NFAs and reduced NFAs, noting that non-determinism generally leads to intractability. We also examine state complexity in Wheeler DFAs and prove that intersection of languages is computationally simpler, and we provide a construction for the minimum Wheeler DFA. Additionally, we explore the Krohn-Rhodes Decomposition Theorem (KRDT) for two compression-oriented classes of automata: path-coherent and Wheeler automata. These classes are efficiently encodable and indexable. We prove that automata in these classes can be decomposed into a cascade with a number of components linear to the original automaton's states. For Wheeler automata, only two-state resets are needed, avoiding full KRDT through a simpler inductive argument. Lastly, we extend the analysis of Wheeler automata to arbitrary NFAs using a parameter called width, which indicates how far an automaton is from being Wheeler. Specifically, we focus on the difference between the width calculated on DFAs and that calculated on NFAs recognizing a given language, showing that their distance can be exponentially large and providing useful lower bounds for the latter.

Scheda breve

Scheda completa

Scheda completa (DC)

	Corso di studio
	
				Dottorato di ricerca in Informatica e scienze matematiche e fisiche
			
	Data di pubblicazione
	
				28-ott-2024
			
	Lingua
	
				Inglese
			
	Parola chiave
	
				Automata; Ordered languages; Wheeler automata; KRDT
			
	Relatore, Supervisor, Advisor o Tutor
	
				MARCONE, Alberto Giulio
POLICRITI, Alberto
D'AGOSTINO, Giovanna
			
	Nome Editore
	
				Università degli Studi di Udine
			
	Collezione di appartenenza
	
				Università degli Studi di Udine

File in questo prodotto:

File	Dimensione	Formato
Automata and orders the Wheeler class.pdf accesso aperto Licenza: Tutti i diritti riservati Dimensione 937.3 kB Formato Adobe PDF Visualizza/Apri	937.3 kB	Adobe PDF	Visualizza/Apri

I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14242/166029

Il codice NBN di questa tesi è URN:NBN:IT:UNIUD-166029