written 7.6 years ago by | modified 2.8 years ago by |
Mumbai University > Informatica Technology > Sem 4 > Automata Theory
Marks: 05
Year:DEC 16
written 7.6 years ago by | modified 2.8 years ago by |
Mumbai University > Informatica Technology > Sem 4 > Automata Theory
Marks: 05
Year:DEC 16
written 7.6 years ago by |
Automata theory is the basis for the theory of formal languages.
A proper treatment of formal language theory begins with some basic definitions:
A symbol is simply a character, an abstraction that is meaningless by itself.
An alphabet is a finite set of symbols.
A word is a finite string of symbols from a given alphabet.
Finally, a language is a set of words formed from a given alphabet.
The set of words that form a language is usually infinite, although it may be finite or empty as well. Formal languages are treated like mathematical sets, so they can undergo standard set theory operations such as union and intersection. Additionally, operating on languages always produces a language.
As sets, they are defined and classified using techniques of automata theory.
Formal languages are normally defined in one of three ways, all of which can be described by automata theory:
regular expressions
standard automata
a formal grammar system
Regular Expressions Example
alphabet A1 = {a, b}
alphabet A2 = {1, 2}
language L1 = the set of all words over A1 = {a, aab, ...}
language L2 = the set of all words over A2 = {2, 11221, ...}
language L3 = L1 ∪ L2
language L4 = {an | n is even} = {aa, aaaa, ...}
language L5 = {anbn | n is natural} = {ab, aabb, ...}
Languages can also be defined by any kind of automaton, like a Turing Machine.
In general, any automata or machine M operating on an alphabet A can produce a perfectly valid language L.
The system could be represented by a bounded Turing Machine tape, for example, with each cell representing a word.
After the instructions halt, any word with value 1 (or ON) is accepted and becomes part of the generated language.
From this idea, one can defne the complexity of a language, which can be classified as P or NP, exponential, or probabilistic, for example.
Noam Chomsky extended the automata theory idea of complexity hierarchy to a formal language hierarchy, which led to the concept of formal grammar.
A formal grammar system is a kind of automata specifically defined for linguistic purposes. The parameters of formal grammar are generally defined as:
a set of non-terminal symbols N
a set of terminal symbols Σ
a set of production rules P
a start symbol S
Grammar Example
start symbol = S
non-terminals = {S}
terminals = {a, b}
production rules: S → aSb, S → ba
S → aSb → abab
S → aSb → aaSbb → aababb
L = {abab, aababb, ...}
As in purely mathematical automata, grammar automata can produce a wide variety of complex languages from only a few symbols and a few production rules.
Chomsky's hierarchy defines four nested classes of languages, where the more precise aclasses have stricter limitations on their grammatical production rules.
The formality of automata theory can be applied to the analysis and manipulation of actual human language as well as the development of human-computer interaction (HCI) and artificial intelligence (AI).