  • Building Nondeterministic Finite Automata (Part 1): Definition and Implementation of an NFA

    All rights reserved. Please credit the source (http://blog.csdn.net/panjunbiao) when reposting.

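    A nondeterministic finite automaton (NFA) consists of the following elements:

    1. A finite set of states S
    2. A set of input symbols Sigma, where the empty string epsilon is assumed not to belong to Sigma
    3. A state transition function that, for each state and each symbol in Sigma or {epsilon}, gives the set of next states
    4. A state s0 in S that serves as the start state (initial state)
    5. A subset F of S whose members are the accepting states (final states)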
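    For example, suppose we are given:
    1. S = {s0, s1, s2, s3, s4}
    2. Sigma = {a, b}
    3. A state transition function T with T(s0, a) = {s1}, T(s1, a) = {s2}, T(s2, b) = {s3}, T(s3, b) = {s4}
    4. s0 as the start state
    5. {s4} as the set of accepting states
    This gives us a very simple NFA, which can be drawn as a graph, as shown in Figure 1: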


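    An NFA is a recognizer. Take the NFA in Figure 1: starting in state s0, we feed it the string aabb one symbol at a time. After the first symbol a the state moves from s0 to s1, after the second symbol a it moves to s2, after the third symbol b it moves to s3, and after the fourth symbol b it moves to s4. Since s4 is an accepting state, the NFA answers yes to the input aabb, which means that it recognizes the string.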
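    The walk described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name Figure1Sketch and its helper methods are invented for this example and are not part of the author's code. It writes the five elements of the Figure 1 example down as plain data and tracks the set of current states, which for Figure 1 never holds more than one state.

```java
import java.util.*;

// Sketch: the example NFA of Figure 1 as plain data plus a tiny runner.
// All names in this class are illustrative, not taken from the article's code.
class Figure1Sketch {
    // T(state, symbol) -> set of next states; a missing entry means "no transition"
    static final Map<String, Map<Character, Set<String>>> T = new HashMap<>();
    static final String START = "s0";
    static final Set<String> ACCEPTING = Collections.singleton("s4");

    static void add(String from, char symbol, String to) {
        T.computeIfAbsent(from, k -> new HashMap<>())
         .computeIfAbsent(symbol, k -> new HashSet<>())
         .add(to);
    }

    static {
        add("s0", 'a', "s1");
        add("s1", 'a', "s2");
        add("s2", 'b', "s3");
        add("s3", 'b', "s4");
    }

    // Feed the input one symbol at a time, keeping the set of states we may be in
    static boolean accepts(String input) {
        Set<String> current = new HashSet<>(Collections.singleton(START));
        for (char c : input.toCharArray()) {
            Set<String> next = new HashSet<>();
            for (String s : current) {
                Map<Character, Set<String>> row = T.get(s);
                if (row == null) continue;
                Set<String> targets = row.get(c);
                if (targets != null) next.addAll(targets);
            }
            current = next;
        }
        // Accept if at least one reachable state is an accepting state
        for (String s : current) {
            if (ACCEPTING.contains(s)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts("aabb")); // prints true
        System.out.println(accepts("aab"));  // prints false
    }
}
```

    Keeping a set of current states, rather than a single state, costs little here and carries over unchanged to automata where one symbol leads to several next states.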
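    Nondeterministic means that, in a given state, the same input symbol may lead to more than one next state. In Figure 2, for example, reading the symbol a in s0 can lead to s1 or to s3; more precisely, the automaton moves to the set {s1, s3}. The strings accepted by the NFA in Figure 2 therefore include aa and ab.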
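    Another feature of an NFA is that a transition can also be taken on the empty string. In Figure 3, for example, s0 can move to s1 without consuming any input symbol, so the language recognized by the NFA in Figure 3 is a*b*, that is, zero or more a's followed by zero or more b's.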
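    To make the effect of epsilon transitions concrete, here is a minimal sketch of the usual epsilon-closure computation. Figure 3 itself is not reproduced in this text, so the transitions below are an assumed reconstruction that matches the description: s0 loops on a, s0 moves to s1 on epsilon, s1 loops on b, and s1 is the accepting state. The class name EpsilonNfaSketch and the integer state numbers are likewise made up for the example.

```java
import java.util.*;

// Sketch of epsilon-closure on an ASSUMED reconstruction of Figure 3:
// state 0 --a--> 0, state 0 --epsilon--> 1, state 1 --b--> 1, accepting {1}.
class EpsilonNfaSketch {
    static final Map<Integer, Map<Character, Set<Integer>>> T = new HashMap<>();
    static final Map<Integer, Set<Integer>> EPS = new HashMap<>();
    static final Set<Integer> ACCEPTING = Collections.singleton(1);

    static {
        T.computeIfAbsent(0, k -> new HashMap<>()).computeIfAbsent('a', k -> new HashSet<>()).add(0);
        T.computeIfAbsent(1, k -> new HashMap<>()).computeIfAbsent('b', k -> new HashSet<>()).add(1);
        EPS.computeIfAbsent(0, k -> new HashSet<>()).add(1);
    }

    // All states reachable from 'states' using epsilon transitions only
    static Set<Integer> closure(Set<Integer> states) {
        Set<Integer> result = new HashSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            Integer s = work.pop();
            Set<Integer> targets = EPS.get(s);
            if (targets == null) continue;
            for (Integer t : targets) {
                if (result.add(t)) work.push(t); // visit each state once
            }
        }
        return result;
    }

    static boolean accepts(String input) {
        Set<Integer> current = closure(Collections.singleton(0));
        for (char c : input.toCharArray()) {
            Set<Integer> moved = new HashSet<>();
            for (Integer s : current) {
                Map<Character, Set<Integer>> row = T.get(s);
                if (row != null && row.containsKey(c)) moved.addAll(row.get(c));
            }
            current = closure(moved); // take epsilon moves after every symbol
        }
        for (Integer s : current) {
            if (ACCEPTING.contains(s)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts(""));     // prints true
        System.out.println(accepts("aabb")); // prints true
        System.out.println(accepts("ba"));   // prints false
    }
}
```

    The empty string is accepted because the closure of the start state already contains the accepting state, which is exactly the "no input needed" behaviour described above.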

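    The languages that NFAs can recognize are exactly the languages that regular expressions can describe; see http://en.wikipedia.org/wiki/Nondeterministic_finite_automaton
    So how can an NFA be implemented? Let us first look at one implementation of an NFA state node: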
    /*
        This file is one of the components of a Context-free Grammar Parser Generator,
        which accepts a piece of text as the input and generates a parser
        for the input context-free grammar.
        Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)
    
        This program is free software: you can redistribute it and/or modify
        it under the terms of the GNU General Public License as published by
        the Free Software Foundation, either version 3 of the License, or
        any later version.
    
        This program is distributed in the hope that it will be useful,
        but WITHOUT ANY WARRANTY; without even the implied warranty of
        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
        GNU General Public License for more details.
    
        You should have received a copy of the GNU General Public License
        along with this program.  If not, see <http://www.gnu.org/licenses/>.
     */
    
    package automata;
    
    import java.util.*;
    
    public class NFAState implements Comparable<NFAState> {
        private static int COUNT = 0;
    
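        // State identifier: every NFA state node has a unique numeric id
        private int id;
    
        public int getId() { return this.id; }
    
        // The unique id is generated from a static counter when the state object is created
        public NFAState() {
            this.id = COUNT ++;
        }
    
        // Required by Comparable<NFAState>: order states by their unique id
        @Override
        public int compareTo(NFAState other) {
            return Integer.compare(this.id, other.id);
        }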
    
        // Transition function. It takes two inputs, the current state and the input symbol, but inside
        // a state object the current state is always this object, so only the input symbol is needed.
        // The function is implemented here with a Map.
        protected Map<Integer, Set<NFAState>> transition = new HashMap<Integer, Set<NFAState>>();
        public Map<Integer, Set<NFAState>> getTransition() { return this.transition; }
    
        // Epsilon transitions: the states reachable from this node on the empty string
        protected Set<NFAState> epsilonTransition = new HashSet<NFAState>();
        public Set<NFAState> getEpsilonTransition() { return this.epsilonTransition; }
    
        // Add a mapping to the transition function; no next state is given, so a new one is created
        public NFAState addTransit(int input) {
            return addTransit(input, new NFAState());
        }
    
        // Add a mapping to the transition function for a given next state
        public NFAState addTransit(int input, NFAState next) {
            Set<NFAState> states = this.transition.get(input);
            if (states == null) {
                states = new HashSet<NFAState>();
                this.transition.put(input, states);
            }
            states.add(next);
            return next;
        }
    
        // Add a mapping to the transition function; no next state is given, so a new one is created
        public NFAState addTransit(char input) {
            return addTransit(input, new NFAState());
        }
    
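        // Add a mapping to the transition function for a given next state.
        // We assume the context-free grammar is case-insensitive: when the input is a char and is
        // a letter, mappings are created for both its uppercase and lowercase forms.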
        public NFAState addTransit(char input, NFAState next) {
            if (Character.isLetter(input)) {
                this.addTransit((int) (Character.toUpperCase(input)), next);
                this.addTransit((int)(Character.toLowerCase(input)), next);
                return next;
            }
            this.addTransit((int)input, next);
            return next;
        }
    
        // Add an epsilon transition
        public NFAState addTransit(NFAState next) {
            this.epsilonTransition.add(next);
            return next;
        }
    
        // Return the set of next states for the given input symbol (null if no mapping exists)
        public Set<NFAState> getTransition(int input) {
            return this.transition.get(input);
        }
    
    }
    
    Next, let us look at the implementation of the NFA itself:
    /*
        This file is one of the components of a Context-free Grammar Parser Generator,
        which accepts a piece of text as the input and generates a parser
        for the input context-free grammar.
        Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)
    
        This program is free software: you can redistribute it and/or modify
        it under the terms of the GNU General Public License as published by
        the Free Software Foundation, either version 3 of the License, or
        any later version.
    
        This program is distributed in the hope that it will be useful,
        but WITHOUT ANY WARRANTY; without even the implied warranty of
        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
        GNU General Public License for more details.
    
        You should have received a copy of the GNU General Public License
        along with this program.  If not, see <http://www.gnu.org/licenses/>.
     */
    
    package automata;
    
    import java.util.*;
    
    // The imports from the abnf package (CharVal, NumVal, AbnfParser, RangedNumVal, Repeat,
    // Repetition, Rule, RuleName) are left out here because this excerpt does not use them.
    
    public class NFA {
        // Start state startState
        private NFAState startState = null;
        public NFAState getStartState() { return startState; }
    
        // Accepting states acceptingStates
        private Set<NFAState> acceptingStates = new HashSet<NFAState>();
        public Set<NFAState> getAcceptingStates() { return acceptingStates; }
        public boolean accept(NFAState state) {
            return this.acceptingStates.contains(state);
        }
        public void addAcceptingState(NFAState state) {
            this.acceptingStates.add(state);
        }
    
        public NFA() {
            this(new NFAState(), new NFAState());
        }
    
        public NFA(NFAState startState) {
            this(startState, new NFAState());
        }
    
        public NFA(NFAState startState, NFAState acceptingState) {
            this.startState = startState;
            this.addAcceptingState(acceptingState);
        }
    
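        // In the NFAState class above, new state nodes are created while transition mappings are added.
        // The NFA is not involved in that process, so the NFA class cannot read the members of the
        // state set S directly. Instead it starts from startState and keeps following transitions
        // until every state node has been found.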
        protected void getStateSet(NFAState current, Set<NFAState> states) {
            if (states.contains(current)) return;
            states.add(current);
    
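            // Follow the transitions on input symbols. NFAState as listed above has no
            // getNextStates() method, so walk the target sets of its transition map instead.
            for (Set<NFAState> targets : current.getTransition().values()) {
                for (NFAState next : targets) {
                    this.getStateSet(next, states);
                }
            }
    
            // Follow the epsilon transitions
            for (NFAState next : current.getEpsilonTransition()) {
                this.getStateSet(next, states);
            }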
    
        }
    
        public Set<NFAState> getStateSet() {
            Set<NFAState> states = new HashSet<NFAState>();
            this.getStateSet(this.getStartState(), states);
            return states;
        }
    
    }
    

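    In this way, the NFA class gives us an NFA's start state startState and its set of accepting states acceptingStates, and each state node NFAState gives us the state transition function, so every element in the definition of an NFA has been implemented.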
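    As a usage illustration, the following sketch builds the Figure 1 automaton by chaining addTransit calls, which works because each call returns the target state, and then collects the reachable states in the same way as getStateSet. To stay self-contained it does not use the author's classes: SketchState is a hypothetical, trimmed stand-in for NFAState that keeps only the int overload of addTransit and the epsilon overload, and BuildFigure1Sketch is an invented name.

```java
import java.util.*;

// Trimmed stand-in for the article's NFAState (the name SketchState is hypothetical)
class SketchState {
    final Map<Integer, Set<SketchState>> transition = new HashMap<>();
    final Set<SketchState> epsilonTransition = new HashSet<>();

    // Same contract as addTransit(int, NFAState): register the mapping, return the target
    SketchState addTransit(int input, SketchState next) {
        transition.computeIfAbsent(input, k -> new HashSet<>()).add(next);
        return next;
    }

    // Epsilon transition, mirroring addTransit(NFAState)
    SketchState addTransit(SketchState next) {
        epsilonTransition.add(next);
        return next;
    }
}

class BuildFigure1Sketch {
    // Depth-first walk over symbol and epsilon transitions, visiting each state once
    static void collect(SketchState current, Set<SketchState> states) {
        if (!states.add(current)) return;
        for (Set<SketchState> targets : current.transition.values()) {
            for (SketchState next : targets) collect(next, states);
        }
        for (SketchState next : current.epsilonTransition) collect(next, states);
    }

    static Set<SketchState> reachable(SketchState start) {
        Set<SketchState> states = new HashSet<>();
        collect(start, states);
        return states;
    }

    public static void main(String[] args) {
        // Figure 1: s0 --a--> s1 --a--> s2 --b--> s3 --b--> s4
        SketchState s0 = new SketchState();
        SketchState s4 = s0.addTransit('a', new SketchState())
                           .addTransit('a', new SketchState())
                           .addTransit('b', new SketchState())
                           .addTransit('b', new SketchState());
        Set<SketchState> all = reachable(s0);
        System.out.println(all.size());       // prints 5
        System.out.println(all.contains(s4)); // prints true
    }
}
```

    With the author's NFAState, the char overload would also register the other letter case for each symbol, but both mappings point at the same target, so the number of reachable states would be the same.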


  • Original article: https://www.cnblogs.com/dyllove98/p/3196867.html