## Automata Theory

Automata theory is the study of abstract computing machines.

Notes and observations on John Hopcroft's *Automata Theory, Languages, and Computation*.

Many systems may be viewed as being in one of a finite number of "states". The purpose of a state is to remember a portion of the system's history. Since there are only a finite number of states, the entire history cannot be remembered, so **a system must be designed carefully to remember what is important and forget what is not**. The advantage of having only a finite number of states is that we can implement the system with a fixed set of resources.

A finite automation has a set of states. Its "control" moves from state to state in response to external inputs. A crucial distinction among automata is whether the control is **deterministic**, meaning the the automata cannot be in more than one state at any one time, or **nondeterministic**, meaning that it may be in several states at once. However, it is possible to convert any nondeterministic automata to a deterministic one that accepts the same language.

### Deterministic Finite Automata

A Deterministic Finite Automaton (DFA) consists of:

- A finite set of states:
**Q** - A finite set of input symbols:
**∑** - A transition function:
**𝛿**that takes as arguments a state and an input symbol and returns a state - A start state, one of the states in
**Q** - A set of final, accepting states:
**F**. (F is subset of Q.)

Five-tuple notation: **A = (Q, ∑, 𝛿, q0, F)**

The "language" of a DFA is the set of all strings that the DFA "accepts": strings that result in a sequence of state transitions from the start state to an accepting (final) state.

**Example:**

An automaton that accepts all and only strings of 0 and 1, that have the sequence "01" somewhere in the string.

**Transition Diagram**

**Set-Former notation**

n
{ x01y | x and y are any strings of 0 and 1 }

**Tuple notation**

A = ({q0,q1,q2}, {0,1}, 𝛿, q0, {q1})

where **𝛿** =

𝛿(q0, 0) = q2

𝛿(q0, 1) = q0

𝛿(q1, 0) = q1

𝛿(q1, 1) = q1

𝛿(q2, 0) = q2

𝛿(q2, 1) = q1

**Transition Table**

0 | 1 | ||

→ | q0 | q2 | q0 |

* | q1 | q1 | q1 |

q2 | q2 | q1 |

Given a DFA: A = (Q, ∑, 𝛿, q0, F), we can define its language as

**L(A) = { w | 𝛿(q0, w) is in F }**

The language of *A* is the set of strings w that take the start state *q0* to one of the accepting states. If *L* is *L(A)* for some DFA *A*, we say *L* is a **regular language**. A regular language can be defined as a language recognized by a finite automaton. A regular language can be expressed with a regular expression. **Regular expression cans define exactly the same languages as automata**.

the set of all strings such that each block of four consecutive symbols contains at least two zeros over the alphabet {0,1}.

the set of strings that either begin or end with 01, over the alphabet {0,1}.

### Nondeterministic Finite Automata

A **Nondeterminstic Finite Automata** (NFA) has the ability to be in several states at once. NFA's accept exactly the regular languages, just as DFA's do. However, NFAs are often more succinct and easier to design than DFAs. **It is always possible to convert an NFA into a DFA**, however, the latter could potentially have exponentially more states.

A nondeterministic finite automata has a finite set of states, a finite set of input symbols, one start state and a set of accepting states. The NFA's transition function **𝛿** takes a state and input symbol as arguments but returns a **set** of zero, one, or more states (rather than returning exactly one, as the DFA does).

( {q0,q1,q2}, {0,1}, 𝛿, q0, {q2} )

NFA accepting all strings that end in 01

**Transition Table**

0 | 1 | ||

→ | q0 | {q0,q1} | {q0} |

q1 | Ø | {q2} | |

* | q2 | Ø | Ø |

An NFA accepts a string *w* if it is possible to make any sequence of choices of next state, while reading the characters of *w*, and go from the start state to any accepting state. The fact that other choices using the input symbols of *w* lead to a nonaccepting state, or do not lead to any state at all (i.e. the sequence of states "dies"), does not prevent *w* from being accepted by the NFA as as a whole.

**L(A) = { w | 𝛿(q0, w) ∩ F ≠ Ø}**

The language L(A) is the set of strings w in ∑ such that 𝛿(q0, w) contains at least one accepting state.

### Equivalence of Deterministic and Nondeterministic Finite Automata

Although there are many languages for which an NFA is easier to construct than a DFA, it is a surprising fact that every language that can be described by some NFA can also be described by some DFA. Moreover, the DFA in practice has about as many states as the NFA, although it often has more transitions. In the worst case, however, a DFA can have 2^{n} states while the NFA for the same language has only n states.

The proof of equivalence involves a subset construction. After reading sequence of input symbols *w*, the constructed DFA is in one state that is the SET of NFA states that the NFA would be in after reading *w*. Since the accepting states of the DFA are those sets that include at least one accepting state of the NFA, and the NFA also accepts if it gets into at least one of its accepting states, we may conclude that the the DFA and NFA accept exactly the same strings, and therefore accept the same language.

equivlent DFA accepting all strings that end in 01

0 | 1 | ||

Ø | Ø | Ø | |

→ | {q0} | {q0,q1} | {q0} |

{q1} | Ø | {q2} | |

* | {q2} | Ø | Ø |

{q0,q1} | {q0,q1} | {q0,q2} | |

* | {q0,q2} | {q0,q1} | {q0} |

* | {q1,q2} | Ø | {q2} |

* | {q0,q1,q2} | {q0,q1} | {q0,q2} |

Since the NFA's set of states is {q0,q1,q2}, the subset construction produces a DFA with 2^{3} = 8 states. However, of the 8 states, we can only reach 3 of them, the other 5 states are inaccessible from the start state and may as well not be there.

0 | 1 | ||

→ | {q0} | {q0,q1} | {q0} |

{q0,q1} | {q0,q1} | {q0,q2} | |

* | {q0,q2} | {q0,q1} | {q0} |