
    Regular Languages

    Finite Automata

    Finite automata are good models for computers with an extremely limited amount of memory. The controller moves from state to state, depending on the input it receives. Finite automata and their probabilistic counterpart, Markov chains, are useful tools for recognizing patterns in data.

    Formal Definition of a finite automaton

    A finite automaton is a list of five objects: a set of states, an input alphabet, rules for moving, a start state, and a set of accept states.

    Definition A finite automaton is a 5-tuple \((Q,\Sigma,\delta,q_0,F)\), where:

    • \(Q\) is a finite set called the states,
    • \(\Sigma\) is a finite set called the alphabet,
    • \(\delta: Q\times\Sigma\rightarrow Q\) is the transition function,
    • \(q_0\in Q\) is the start state, and
    • \(F\subseteq Q\) is the set of accept states.

    Definition If \(A\) is the set of all strings that machine \(M\) accepts, we say that \(A\) is the language of machine \(M\) and write \(L(M)=A\). We say that \(M\) recognizes \(A\) or that \(M\) accepts \(A\). A machine may accept several strings, but it always recognizes only one language. If the machine accepts no strings, it still recognizes one language, namely the empty language \(\emptyset\).

    Formal Definition of Computation

    Let \(M=(Q,\Sigma,\delta,q_0,F)\) be a finite automaton and let \(w=w_1w_2\cdots w_n\) be a string where each \(w_i\) is a member of the alphabet \(\Sigma\). Then \(M\) accepts \(w\) if a sequence of states \(r_0,r_1,\cdots,r_n\) in \(Q\) exists satisfying three conditions:

    1. \(r_0=q_0\),
    2. \(\delta(r_i,w_{i+1})=r_{i+1}\), for \(i=0,1,\cdots,n-1\), and
    3. \(r_n\in F\).

    We say that \(M\) recognizes language \(A\) if \(A=\{w\mid M\textit{ accepts }w\}\).
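
    The three conditions above translate almost line by line into a simulation loop. The following is a minimal sketch, assuming the transition function is stored as a dictionary keyed by (state, symbol) pairs. The function name `dfa_accepts` and the example machine (binary strings with an even number of 1s) are illustrative choices, not taken from the text.

```python
from typing import Dict, Set, Tuple

State = str
Symbol = str


def dfa_accepts(delta: Dict[Tuple[State, Symbol], State],
                start: State,
                accept: Set[State],
                w: str) -> bool:
    """Return True if the DFA accepts w, following conditions 1 to 3."""
    r = start                      # condition 1: r_0 = q_0
    for symbol in w:
        r = delta[(r, symbol)]     # condition 2: r_{i+1} = delta(r_i, w_{i+1})
    return r in accept             # condition 3: r_n is an accept state


# Hypothetical example machine (an assumption for illustration):
# it accepts binary strings containing an even number of 1s.
EVEN_ONES_DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}


def even_ones(w: str) -> bool:
    return dfa_accepts(EVEN_ONES_DELTA, "even", {"even"}, w)
```

    Because a DFA has exactly one next state per symbol, the sequence of states is unique, so the loop only needs to remember the current state.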

    Definition A language is called a regular language if some finite automaton recognizes it.

    The Regular Operations

    Definition Let \(A\) and \(B\) be languages. We define the regular operations union, concatenation, and star as follows:

    • Union: \(A\cup B=\{x\mid x\in A\textit{ or }x\in B\}\).
    • Concatenation: \(A\circ B=\{xy\mid x\in A\textit{ and }y\in B\}\).
    • Star: \(A^*=\{x_1x_2\cdots x_k\mid k\geq 0\textit{ and each }x_i\in A\}\). The empty string \(\epsilon\) is always a member of \(A^*\), no matter what \(A\) is. Informally, \(A^*\) contains every string obtained by concatenating zero or more strings from \(A\).
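
    For finite languages the three operations can be tried out directly on Python sets. This is only a sketch: the star of a language is usually infinite, so the hypothetical helper `star_up_to` returns just the members of the star whose length does not exceed a bound. All function names here are assumptions made for illustration.

```python
from typing import Set


def union(a: Set[str], b: Set[str]) -> Set[str]:
    """All strings that are in a or in b."""
    return a | b


def concat(a: Set[str], b: Set[str]) -> Set[str]:
    """All strings xy with x in a and y in b."""
    return {x + y for x in a for y in b}


def star_up_to(a: Set[str], max_len: int) -> Set[str]:
    """Members of the star of a with length at most max_len."""
    result = {""}        # k = 0 contributes the empty string
    frontier = {""}
    while frontier:
        # Append one more piece from a, keep only new, short-enough strings.
        frontier = {x + y for x in frontier for y in a
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result


A = {"good", "bad"}
B = {"boy", "girl"}
```

    With these sets, `concat(A, B)` yields the four strings `goodboy`, `goodgirl`, `badboy`, and `badgirl`, and `star_up_to(set(), 5)` still contains the empty string.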

    Generally speaking, a collection of objects is closed under some operation if applying that operation to members of the collection returns an object still in the collection.

    Theorem The class of regular languages is closed under the union operation and the concatenation operation.
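
    The notes state this theorem without proof. One standard argument for the union case is the product construction: run both DFAs in lockstep on pairs of states and accept if either component accepts. The sketch below assumes each DFA is a tuple (states, delta, start, accept) over a shared alphabet. The names `union_dfa` and `run` and the two example machines are hypothetical.

```python
from itertools import product


def union_dfa(d1, d2, alphabet):
    """Product construction; each DFA is a tuple (states, delta, start, accept)."""
    states1, delta1, start1, accept1 = d1
    states2, delta2, start2, accept2 = d2
    states = set(product(states1, states2))
    # Each component moves according to its own transition function.
    delta = {
        ((r1, r2), a): (delta1[(r1, a)], delta2[(r2, a)])
        for (r1, r2) in states
        for a in alphabet
    }
    # Accept when at least one component is in an accept state.
    accept = {(r1, r2) for (r1, r2) in states
              if r1 in accept1 or r2 in accept2}
    return states, delta, (start1, start2), accept


def run(dfa, w):
    """Simulate a DFA given as (states, delta, start, accept) on w."""
    _, delta, state, accept = dfa
    for a in w:
        state = delta[(state, a)]
    return state in accept


# Hypothetical example machines over the alphabet {0, 1}.
EVEN_ONES = (
    {"even", "odd"},
    {("even", "0"): "even", ("even", "1"): "odd",
     ("odd", "0"): "odd", ("odd", "1"): "even"},
    "even",
    {"even"},
)
ENDS_IN_ZERO = (
    {"n", "z"},
    {("n", "0"): "z", ("n", "1"): "n",
     ("z", "0"): "z", ("z", "1"): "n"},
    "n",
    {"z"},
)
EITHER = union_dfa(EVEN_ONES, ENDS_IN_ZERO, "01")
```

    Replacing `or` with `and` in the accept set would give a machine for the intersection instead.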

    Nondeterminism

    When the machine is in a given state and reads the next input symbol, we know what the next state will be -- it is determined. We call this deterministic computation. In a nondeterministic machine, several choices may exist for the next state at any point.

    Every state of a deterministic finite automaton (DFA) has exactly one exiting transition arrow for each symbol in the alphabet. In a nondeterministic finite automaton (NFA), a state may have zero, one, or many exiting arrows for each alphabet symbol, and likewise zero, one, or many exiting arrows labeled \(\epsilon\).

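    How does an NFA compute?

    Suppose the NFA is in a state with several ways to proceed on the next input symbol. After reading that symbol, the machine splits into multiple copies of itself and follows all the possibilities in parallel. Each copy of the machine takes one of the possible ways to proceed and continues as before. If the next input symbol does not appear on any arrow exiting the state occupied by a copy, that copy dies. Finally, if any one of these copies of the machine is in an accept state at the end of the input, the NFA accepts the input string.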

    Formal Definition of a Nondeterministic finite automaton

    In a DFA, the transition function takes a state and an input symbol and produces the next state. In an NFA, the transition function takes a state and an input symbol or the empty string and produces the set of possible next states.

    For any set \(Q\) we write \(P(Q)\) to be the collection of all subsets of \(Q\). Here \(P(Q)\) is called the power set of \(Q\). For any alphabet \(\Sigma\) we write \(\Sigma_\epsilon\) to be \(\Sigma\cup\{\epsilon\}\).

    Definition A nondeterministic finite automaton is a 5-tuple \((Q,\Sigma,\delta,q_0,F)\), where:

    • \(Q\) is a finite set of states,
    • \(\Sigma\) is a finite set called the alphabet,
    • \(\delta: Q\times\Sigma_\epsilon\rightarrow P(Q)\) is the transition function,
    • \(q_0\in Q\) is the start state, and
    • \(F\subseteq Q\) is the set of accept states.
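
    The "parallel copies" picture can be simulated by tracking the set of states that some copy currently occupies. The sketch below assumes the transition function is a dictionary from (state, label) pairs to sets of states, with missing keys meaning "no arrow", and uses the empty string as the label for epsilon arrows. The names `eps_closure` and `nfa_accepts` and the example NFA (strings containing 11 or 101) are illustrative assumptions.

```python
EPS = ""  # assumption: the empty string stands for the epsilon label


def eps_closure(delta, states):
    """States reachable from `states` using only epsilon arrows."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_accepts(delta, start, accept, w):
    """Return True if some branch of the NFA ends in an accept state."""
    current = eps_closure(delta, {start})
    for symbol in w:
        moved = set()
        for q in current:
            # A copy with no arrow for this symbol contributes nothing (it dies).
            moved |= delta.get((q, symbol), set())
        current = eps_closure(delta, moved)
    return bool(current & accept)


# Hypothetical example NFA: accepts strings containing 11 or 101.
N_DELTA = {
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"}, ("q2", EPS): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}


def contains_11_or_101(w):
    return nfa_accepts(N_DELTA, "q1", {"q4"}, w)
```

    The set `current` never holds more than \(|Q|\) states, which is the observation behind the subset construction used to convert an NFA into a DFA.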

    Equivalence of NFAs and DFAs

    DFAs and NFAs recognize the same class of languages. We say that two machines are equivalent if they recognize the same language.

    Theorem Every NFA has an equivalent deterministic finite automaton.

    Proof

    Let \(N=(Q,\Sigma,\delta,q_0,F)\) be the NFA recognizing some language \(A\). We construct a DFA \(M=(Q',\Sigma,\delta',q_0',F')\) recognizing \(A\). Let's first consider the easier case wherein \(N\) has no \(\epsilon\) arrows.

    1. \(Q'=P(Q)\).

    2. For \(R\in Q'\) and \(a\in\Sigma\), let \(\delta'(R,a)=\{q\in Q\mid q\in\delta(r,a)\textit{ for some }r\in R\}\), or simply,

      \[\delta'(R,a)=\bigcup_{r\in R}\delta(r,a)\]

    3. \(q_0'=\{q_0\}\).

    4. \(F'=\{R\in Q'\mid R\textit{ contains an accept state of }N\}\).

    To handle the \(\epsilon\) arrows, for any \(R\subseteq Q\) we define \(E(R)\) to be the collection of states that can be reached from members of \(R\) by going only along \(\epsilon\) arrows, including the members of \(R\) themselves.

    • The new transition function can be written as:

    \[\delta'(R,a)=\{q\in Q\mid q\in E(\delta(r,a))\textit{ for some }r\in R\}\]

    • The start state \(q_0'\) becomes \(E(\{q_0\})\).
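
    The construction can be sketched in code as follows. One deliberate difference from the proof: instead of materialising all of \(P(Q)\), this version only builds the subsets reachable from the start state, which recognizes the same language with fewer states. The empty string is assumed to label epsilon arrows, and the names `nfa_to_dfa` and `run_dfa` and the three-state example NFA are hypothetical.

```python
from collections import deque

EPS = ""  # assumption: the empty string stands for the epsilon label


def eps_closure(delta, states):
    """E(R): states reachable from `states` along epsilon arrows only."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_to_dfa(alphabet, delta, start, accept):
    """Subset construction restricted to reachable subsets."""
    start_set = frozenset(eps_closure(delta, {start}))   # q_0' = E({q_0})
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        R = queue.popleft()
        for a in alphabet:
            moved = set()
            for r in R:
                moved |= delta.get((r, a), set())
            target = frozenset(eps_closure(delta, moved))  # delta'(R, a)
            dfa_delta[(R, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accept = {R for R in seen if R & accept}           # F'
    return seen, dfa_delta, start_set, dfa_accept


def run_dfa(dfa, w):
    _, delta, state, accept = dfa
    for a in w:
        state = delta[(state, a)]
    return state in accept


# Hypothetical three-state NFA over {a, b} with start state 1 and accept state 1.
NFA_DELTA = {
    (1, "b"): {2}, (1, EPS): {3},
    (2, "a"): {2, 3}, (2, "b"): {3},
    (3, "a"): {1},
}
DFA = nfa_to_dfa("ab", NFA_DELTA, 1, {1})
```

    For this example only 6 of the 8 possible subsets are reachable, including the empty set, which acts as a dead state.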

    Corollary A language is regular if and only if some NFA recognizes it.

    Theorem The class of regular languages is closed under the union, concatenation, and star operations.

    Regular Expression

    Definition Say that \(R\) is a regular expression if \(R\) is:

    • \(a\) for some \(a\) in the alphabet \(\Sigma\),
    • \(\epsilon\),
    • \(\emptyset\), the empty language,
    • \((R_1\cup R_2)\), where \(R_1\) and \(R_2\) are regular expressions,
    • \((R_1\circ R_2)\), where \(R_1\) and \(R_2\) are regular expressions, or
    • \((R_1^*)\), where \(R_1\) is a regular expression.

    The star operation is done first, followed by concatenation and finally union, unless parentheses change the usual order. For convenience, we let \(R^+=RR^*\) and we write \(L(R)\) to be the language of \(R\).
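
    The inductive definition maps directly onto a recursive evaluator. The sketch below represents a regular expression as a nested tuple (an encoding chosen here for illustration, not part of the source) and computes the strings of \(L(R)\) up to a length bound, since the full language may be infinite. The names `language` and `plus` are hypothetical.

```python
def language(r, n):
    """Strings in L(r) of length at most n, for a tuple-encoded expression r."""
    kind = r[0]
    if kind == "sym":                 # a single alphabet symbol
        return {r[1]} if n >= 1 else set()
    if kind == "eps":                 # the empty string
        return {""}
    if kind == "empty":               # the empty language
        return set()
    if kind == "union":
        return language(r[1], n) | language(r[2], n)
    if kind == "concat":
        left, right = language(r[1], n), language(r[2], n)
        return {x + y for x in left for y in right if len(x + y) <= n}
    if kind == "star":
        base = language(r[1], n)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind!r}")


def plus(r):
    """R+ written out as R R*."""
    return ("concat", r, ("star", r))


# a U bc*, grouped by precedence as a U (b (c*)).
EXAMPLE = ("union", ("sym", "a"),
           ("concat", ("sym", "b"), ("star", ("sym", "c"))))
```

    Evaluating `language(EXAMPLE, 3)` gives `a`, `b`, `bc`, and `bcc`, which shows star binding tighter than concatenation and concatenation tighter than union.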

    Theorem A language is regular (recognized by DFA/NFA) if and only if some regular expression describes it.

    The proof has two directions: if a language is described by a regular expression, then it is regular; and if a language is regular, then it is described by a regular expression.

    Nonregular Languages

    Theorem Pumping lemma: If \(A\) is a regular language, then there is a number \(p\) (the pumping length) where if \(s\) is any string in \(A\) of length at least \(p\), then \(s\) may be divided into three pieces, \(s=xyz\), satisfying the following conditions:

    1. for each \(i\geq 0\), \(xy^iz\in A\),
    2. \(|y|>0\), and
    3. \(|xy|\leq p\).

    This theorem states that all regular languages have a special property. If we can show that a language does not have this property, we are guaranteed that it is not regular. The property states that all strings in the language can be pumped if they are at least as long as a certain special value, called the pumping length. That means each such string contains a section that can be repeated any number of times with the resulting string remaining in the language.
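
    A small brute-force experiment can make the lemma concrete. The sketch below enumerates every split \(s=xyz\) that satisfies conditions 2 and 3 and keeps those whose pumped strings stay in the language for the first few values of \(i\). It is an illustration rather than a proof: surviving a finite check does not establish condition 1, but a language for which no split survives on some long enough string fails the lemma for that \(p\). The example language of strings \(0^n1^n\), the helper names, and the cutoff `max_i` are assumptions made here.

```python
def in_B(s):
    """Membership in the example language of strings 0^n 1^n."""
    n = len(s) // 2
    return s == "0" * n + "1" * n


def in_zeros_then_ones(s):
    """Membership in the regular language 0*1*."""
    return "10" not in s


def surviving_splits(s, p, member, max_i=3):
    """Splits (x, y, z) with |y| > 0 and |xy| <= p whose pumped strings
    x y^i z stay in the language for i = 0, ..., max_i."""
    good = []
    for end_xy in range(1, min(p, len(s)) + 1):      # condition 3
        for start_y in range(end_xy):                # condition 2: y nonempty
            x, y, z = s[:start_y], s[start_y:end_xy], s[end_xy:]
            if all(member(x + y * i + z) for i in range(max_i + 1)):
                good.append((x, y, z))
    return good
```

    For \(s=0^p1^p\) every admissible \(y\) consists only of 0s, so pumping changes the number of 0s but not the number of 1s, and `surviving_splits` returns an empty list. For the regular language 0*1*, a split such as \(x=\epsilon\), \(y=0\) survives.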

    Reference

    Michael Sipser, Introduction to the Theory of Computation, 3rd Edition.

  • Original article: https://www.cnblogs.com/romaLzhih/p/14410469.html