  • An ABNF Parser Based on Predictive Parsing (10): Numeric Values (num-val) in the AbnfParser Grammar Parser

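    ABNF has three kinds of numeric values: binary, decimal, and hexadecimal. A numeric value can be a dot-separated series of numbers or a range. For example, %d11.22.33.44.55 denotes the five decimal numbers 11, 22, 33, 44, 55 in that order, while %x80-ff denotes a single value (one byte here) that may be anything from 0x80 to 0xff.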
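
    To make the two notations concrete, here is a minimal sketch that decodes them with plain string handling. NumValExamples and its methods are hypothetical names made up for this illustration; they are not part of AbnfParser, which works character by character as shown later in this post.

```java
// Hypothetical helper for illustration only; not part of AbnfParser.
final class NumValExamples {
    // Radix for the base letter that follows '%'.
    static int radixOf(char base) {
        switch (Character.toLowerCase(base)) {
            case 'b': return 2;
            case 'd': return 10;
            case 'x': return 16;
            default: throw new IllegalArgumentException("unknown base: " + base);
        }
    }

    // Decodes a dot-separated series, e.g. "%d11.22.33" becomes {11, 22, 33}.
    static int[] decodeSeries(String literal) {
        int radix = radixOf(literal.charAt(1));
        String[] parts = literal.substring(2).split("\\.");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i], radix);
        }
        return values;
    }

    // Tells whether value lies inside a ranged literal such as "%x80-ff" (bounds inclusive).
    static boolean inRange(String literal, int value) {
        int radix = radixOf(literal.charAt(1));
        String[] bounds = literal.substring(2).split("-");
        int low = Integer.parseInt(bounds[0], radix);
        int high = Integer.parseInt(bounds[1], radix);
        return value >= low && value <= high;
    }
}
```

    With this sketch, decodeSeries("%d11.22.33.44.55") yields five integers, and inRange("%x80-ff", 0x7f) is false while inRange("%x80-ff", 0x80) is true.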

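    I defined the dot-separated series as NumVal and the range form as RangedNumVal. Both classes implement Element. In hindsight it would probably be cleaner to define an interface NumVal (extending Element) with two implementations, SerialNumVal and RangedNumVal. As a perfectionist I find the current definition painful to look at; I will reconsider it when I have time.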
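
    The alternative hierarchy described above could look roughly like the following. This is only a sketch under assumptions: the real Element interface is not shown in this post, so an empty marker interface stands in for it; getBase() is an invented accessor; and the toString() format is inferred from the unit tests in this post, which expect "%" followed by the original text.

```java
// Stand-in for the project's Element interface (its real members are not shown in this post).
interface Element { }

// Common supertype for both numeric forms.
interface NumVal extends Element {
    String getBase(); // hypothetical accessor: the base character as written ("b", "d" or "x")
}

// Dot-separated series such as %d11.22.33
final class SerialNumVal implements NumVal {
    private final String base;
    private final java.util.List<String> values = new java.util.ArrayList<String>();

    SerialNumVal(String base) { this.base = base; }

    void addValue(String value) { values.add(value); }

    @Override public String getBase() { return base; }

    @Override public String toString() {
        StringBuilder sb = new StringBuilder("%").append(base);
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) sb.append('.');
            sb.append(values.get(i));
        }
        return sb.toString();
    }
}

// Range such as %x80-ff
final class RangedNumVal implements NumVal {
    private final String base;
    private final String from;
    private final String to;

    RangedNumVal(String base, String from, String to) {
        this.base = base;
        this.from = from;
        this.to = to;
    }

    @Override public String getBase() { return base; }

    @Override public String toString() { return "%" + base + from + "-" + to; }
}
```

    The practical gain is that callers can accept a NumVal without caring which of the two forms they received.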

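    Binary, decimal, and hexadecimal values are built almost identically; only the base character (b, d, x) and the digit symbols (01, 0123456789, 0123456789abcdef) differ. To avoid writing three nearly identical methods, I took a shortcut and defined a Matcher interface whose only job is to decide whether a character belongs to a preset symbol set. There is nothing sophisticated about it; the code makes it clear.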
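
    The Matcher interface itself is not listed in this post. Judging from how it is used, it has a match(int) method and an expected() method, so a minimal sketch might look like this. The DigitSets class and its of() factory are hypothetical additions for illustration; the original code uses anonymous classes instead.

```java
// Inferred from usage in this post; the real declaration may differ.
interface Matcher {
    boolean match(int value);   // is this character in the symbol set?
    String expected();          // description of the set, used in error messages
}

// Hypothetical convenience class: builds a Matcher from a string of allowed symbols.
final class DigitSets {
    static Matcher of(final String symbols) {
        return new Matcher() {
            @Override public boolean match(int value) {
                // value may be -1 at end of input, which never matches
                return value >= 0 && symbols.indexOf(value) >= 0;
            }
            @Override public String expected() {
                return "one of \"" + symbols + "\"";
            }
        };
    }

    static final Matcher BIN = of("01");
    static final Matcher DEC = of("0123456789");
    static final Matcher HEX = of("0123456789abcdefABCDEF");
}
```

    Listing the symbols as a string keeps each base's definition to one line, at the cost of a linear scan per character, which is negligible for sets this small.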

    Here is the parsing code:

    /*
        This file is one of the components of a Context-free Grammar Parser Generator,
        which accepts a piece of text as the input, and generates a parser
        for the input context-free grammar.
        Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)
    
        This program is free software: you can redistribute it and/or modify
        it under the terms of the GNU General Public License as published by
        the Free Software Foundation, either version 3 of the License, or
        any later version.
    
        This program is distributed in the hope that it will be useful,
        but WITHOUT ANY WARRANTY; without even the implied warranty of
        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
        GNU General Public License for more details.
    
        You should have received a copy of the GNU General Public License
        along with this program.  If not, see <http://www.gnu.org/licenses/>.
     */
    
        //  bin-val        =  "b" 1*BIT
        //                    [ 1*("." 1*BIT) / ("-" 1*BIT) ]
        //  BIT            =  "0" / "1"
        //  Binary value parser.
        protected Element bin_val() throws IOException, MatchException {
            // The real work is done by val(); we only pass it the binary symbol set {0, 1} through a Matcher instance.
            return val('b', new Matcher() {
                @Override
                public boolean match(int value) {
                    // Match if the symbol is 0 or 1.
                    return value == '0' || value == '1';
                }

                @Override
                public String expected() {
                    // Describes the expected symbol set (used only when reporting an error).
                    return "['0', '1']";
                }
            });
        }
    
        //  dec-val        =  "d" 1*DIGIT
        //                    [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]
        protected Element dec_val() throws IOException, MatchException {
            // As above: pass the decimal symbol set 0-9 to val().
            return val('d', new Matcher() {
                @Override
                public boolean match(int value) {
                    // BUG: I only noticed while writing this post that this code is wrong. The decimal symbol set
                    // must not include A-F; the check should be just (value >= 0x30 && value <= 0x39).
                    // The unit tests passed anyway. So much for test quality!
                    // PS: I wrote the unit tests myself...
                    return (value >= 0x30 && value <= 0x39) || (value >= 'A' && value <= 'F') || (value >= 'a' && value <= 'f');
                }

                @Override
                public String expected() {
                    // Wrong as well: this should read "['0'-'9']".
                    return "['0'-'9', 'A'-'F', 'a'-'f']";
                }
            });
        }
    
        //  hex-val        =  "x" 1*HEXDIG
        //                    [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ]
        protected Element hex_val() throws IOException, MatchException {
            // Pass the hexadecimal symbol set to val() through a Matcher instance.
            return val('x', new Matcher() {
                @Override
                public boolean match(int value) {
                    return (value >= 0x30 && value <= 0x39) || (value >= 'A' && value <= 'F') || (value >= 'a' && value <= 'f');
                }

                @Override
                public String expected() {
                    return "['0'-'9', 'A'-'F', 'a'-'f']";
                }
            });
        }
    
        //  Shared parsing routine for all three bases.
        protected Element val(char base, Matcher matcher) throws IOException, MatchException {
            // Check the base character.
            assertMatch(is.peek(), base);
            int baseValue = is.read();
            String from = "";
            String val = "";

            // The first character after the base character must be in the Matcher's symbol set; otherwise throw.
            if (matcher.match(is.peek())) {
                // Keep reading matching characters; together they form the first number.
                while (matcher.match(is.peek())) {
                    from += (char)is.read();
                }
                // If the first number is followed by a dot, this is a series (NumVal).
                // If it is followed by a dash, this is a range (RangedNumVal).
                // Otherwise it is a single number.
                if (match(is.peek(), '.')) {
                    NumVal numval = new NumVal(String.valueOf((char)baseValue));
                    // The number matched so far becomes the first entry of the NumVal to be returned.
                    numval.addValue(from);
                    // As long as a dot follows, keep appending numbers to the NumVal.
                    while (match(is.peek(), '.')) {
                        // Look one character past the dot; if it is not a digit, leave the dot unread and stop.
                        int next = is.peek(1);
                        if (!(matcher.match(next))) {
                            break;
                        }
                        is.read();
                        val = "";
                        while (matcher.match(is.peek())) {
                            val += (char)is.read();
                        }
                        numval.addValue(val);
                    }
                    // No more dots can be matched: the series ends here.
                    return numval;
                } else if (match(is.peek(), '-')) {
                    // Look two characters ahead here, so that even if the dash is not followed by a digit
                    // we can still return a single number and leave the dash for the code that parses what follows.
                    // This is one of the few places in this program that can effectively backtrack, heh.
                    int next = is.peek(1);
                    if (!(matcher.match(next))) {
                        // The dash is not followed by a digit: do not consume it, return a single number.
                        NumVal numval = new NumVal(String.valueOf((char)baseValue));
                        numval.addValue(from);
                        return numval;
                    }
                    // Otherwise a number follows the dash: read it and return a RangedNumVal.
                    is.read();
                    val ="";
                    val += (char)is.read();
                    while (matcher.match(is.peek())) {
                        val += (char)is.read();
                    }
                    return new RangedNumVal(String.valueOf((char)baseValue), from, val);
                } else {
                    // The first number is followed by neither a dot nor a dash: return a single number.
                    NumVal numval = new NumVal(String.valueOf((char)baseValue));
                    numval.addValue(from);
                    return numval;
                }
            } else {
                throw new MatchException(matcher.expected(), is.peek(), is.getPos(), is.getLine());
            }

        }
    
        //  num-val        =  "%" (bin-val / dec-val / hex-val)
        //  Parses num-val.
        protected Element num_val() throws IOException, MatchException {
            // A num-val starts with a percent sign.
            assertMatch(is.peek(), '%');
            is.read();
            // Choose the parsing method according to the base character.
            switch ((char)is.peek()) {
                case 'b': case 'B': return bin_val();
                case 'd': case 'D': return dec_val();
                case 'x': case 'X': return hex_val();
                default: throw new MatchException("['b', 'd', 'x']", is.peek(), is.getPos(), is.getLine());
            }
        }
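
    The val() method depends on the input stream's peek(1) call to decide whether a dot or dash really starts another number. The stream class is not shown in this post, so the following is only a sketch of the idea: LookaheadInput is a hypothetical stand-in backed by a String, and readValueOrRange() reproduces just the dash handling for decimal digits.

```java
// Hypothetical stand-in for the parser's input stream; the real class is not shown in this post.
final class LookaheadInput {
    private final String text;
    private int pos = 0;

    LookaheadInput(String text) { this.text = text; }

    // Returns the character `offset` positions ahead without consuming it, or -1 at end of input.
    int peek(int offset) {
        int index = pos + offset;
        return index < text.length() ? text.charAt(index) : -1;
    }

    // Returns the next unread character without consuming it.
    int peek() { return peek(0); }

    // Consumes and returns the next character, or -1 at end of input.
    int read() {
        int c = peek();
        if (c != -1) pos++;
        return c;
    }
}

final class RangeLookahead {
    static boolean isDecDigit(int c) { return c >= '0' && c <= '9'; }

    // Reads a run of digits, then consumes "-digits" only if a digit follows the dash.
    // Returns "from-to" for a range, otherwise just "from"; an unused dash stays unread.
    static String readValueOrRange(LookaheadInput in) {
        StringBuilder from = new StringBuilder();
        while (isDecDigit(in.peek())) {
            from.append((char) in.read());
        }
        if (in.peek() == '-' && isDecDigit(in.peek(1))) {
            in.read(); // consume the dash
            StringBuilder to = new StringBuilder();
            while (isDecDigit(in.peek())) {
                to.append((char) in.read());
            }
            return from + "-" + to;
        }
        return from.toString();
    }
}
```

    For the input "12-x" this returns "12" and leaves the dash as the next unread character, which is the behavior the comments in val() describe.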
    

    Now for the unit tests. I won't go through them in detail; one of the comments explains why the faulty code above went undetected:

    /*
        This file is one of the components of a Context-free Grammar Parser Generator,
        which accepts a piece of text as the input, and generates a parser
        for the input context-free grammar.
        Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)
    
        This program is free software: you can redistribute it and/or modify
        it under the terms of the GNU General Public License as published by
        the Free Software Foundation, either version 3 of the License, or
        any later version.
    
        This program is distributed in the hope that it will be useful,
        but WITHOUT ANY WARRANTY; without even the implied warranty of
        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
        GNU General Public License for more details.
    
        You should have received a copy of the GNU General Public License
        along with this program.  If not, see <http://www.gnu.org/licenses/>.
     */
    
        //  bin-val        =  "b" 1*BIT
        //                    [ 1*("." 1*BIT) / ("-" 1*BIT) ]
        //  BIT            =  "0" / "1"
        //  Tests parsing of binary values.
        @Test
        public void testBin_val() throws Exception {
            Tester<String> tester = new Tester<String>() {
                @Override
                public String test(AbnfParser parser) throws MatchException, IOException {
                    return parser.bin_val().toString();
                }
            };
            String input;
            input = "b1";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).bin_val().toString());
            input = "b1010";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).bin_val().toString());
            input = "B1";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).bin_val().toString());
            input = "b1.1";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).bin_val().toString());
            input = "b0101.1111";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).bin_val().toString());
            input = "b0000-1111";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).bin_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+".00").bin_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+"-1234").bin_val().toString());
            input = "b00.11.00.01.10.00.11.00.11";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).bin_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+".").bin_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+"..").bin_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+".bb").bin_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+"-00").bin_val().toString());
    
            Assertion.assertMatchException("", tester, 1, 1);
            Assertion.assertMatchException("b", tester, 2,1);
            Assertion.assertMatchException("bg", tester, 2, 1);
            Assertion.assertMatchException("b.", tester, 2, 1);
            Assertion.assertMatchException("b-", tester, 2, 1);
        }
    
        //  dec-val        =  "d" 1*DIGIT
        //                    [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]
        //  Tests parsing of decimal values.
        @Test
        public void testDec_val() throws Exception {
            Tester<String> tester = new Tester<String>() {
                @Override
                public String test(AbnfParser parser) throws MatchException, IOException {
                    return parser.dec_val().toString();
                }
            };
    
            String input;
            input = "d1";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).dec_val().toString());
            input = "d1234";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).dec_val().toString());
            input = "D1";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).dec_val().toString());
            input = "d1.2";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).dec_val().toString());
            input = "d1234.5678";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).dec_val().toString());
            input = "d1234-5678";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).dec_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+".00").dec_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+"-1234").dec_val().toString());
            input = "d12.34.56.78.9a.bc.de.f0";
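            // Look here to see why the unit tests never caught decimal values containing a-f:
            // the test case itself is wrong, since 9a, bc, de and f0 are not decimal numbers!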
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).dec_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+".").dec_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+"..").dec_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+".##").dec_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+"-00").dec_val().toString());
    
            Assertion.assertMatchException("", tester, 1, 1);
            Assertion.assertMatchException("d", tester, 2, 1);
            Assertion.assertMatchException("dg", tester, 2, 1);
            Assertion.assertMatchException("d.", tester, 2, 1);
            Assertion.assertMatchException("d-", tester, 2, 1);
        }
    
        //  hex-val        =  "x" 1*HEXDIG
        //                    [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ]
        //  Tests parsing of hexadecimal values.
        @Test
        public void testHex_val() throws Exception {
            Tester<String> tester = new Tester<String>() {
                @Override
                public String test(AbnfParser parser) throws MatchException, IOException {
                    return parser.hex_val().toString();
                }
            };
    
            String input;
            input = "x1";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            input = "x1234";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            input = "X1";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            input = "x1.2";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            input = "x1234.5678";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            input = "xabcd.ef";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            input = "xA1.2B";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            input = "x1234-abCD";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input+"-").hex_val().toString());
            input = "x12.34.56.78.9a.bc.de.f0";
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input).hex_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input + ".").hex_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input + ".g0").hex_val().toString());
            Assert.assertEquals("%" + input, AbnfParserFactory.newInstance(input + "-00").hex_val().toString());
    
            Assertion.assertMatchException("", tester, 1, 1);
            Assertion.assertMatchException("x", tester, 2, 1);
            Assertion.assertMatchException("xg", tester, 2, 1);
            Assertion.assertMatchException("x.", tester, 2, 1);
            Assertion.assertMatchException("x-", tester, 2, 1);
    
        }
    
        //  num-val        =  "%" (bin-val / dec-val / hex-val)
        //  Tests all three forms through num_val().
        @Test
        public void testNum_val() throws Exception {
            String input;
            input = "%b0101";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%b0101.1010.1111";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%b0101-1111";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%d1234";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%d0123.4567.8901";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%d12345-67890";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%x0123";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%x0123.4567.89ab.CDEF";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
            input = "%x0123456789-ABCDEFabcdef09";
            Assert.assertEquals(input, AbnfParserFactory.newInstance(input).num_val().toString());
        }
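
    To show what a test that exposes the decimal bug might check, here is a small self-contained sketch. It redeclares a Matcher interface with the two methods used in this post; DecimalMatchers and digitPrefix() are hypothetical names for illustration, and digitPrefix() only imitates the first digit-reading loop of val(). The corrected matcher follows RFC 5234, where DIGIT is %x30-39.

```java
// Same shape as the Matcher used by the parser (inferred from its usage in this post).
interface Matcher {
    boolean match(int value);
    String expected();
}

final class DecimalMatchers {
    // The matcher as published in dec_val(): it wrongly accepts hex letters too.
    static final Matcher AS_PUBLISHED = new Matcher() {
        @Override public boolean match(int value) {
            return (value >= 0x30 && value <= 0x39) || (value >= 'A' && value <= 'F') || (value >= 'a' && value <= 'f');
        }
        @Override public String expected() { return "['0'-'9', 'A'-'F', 'a'-'f']"; }
    };

    // Corrected matcher: decimal digits only.
    static final Matcher CORRECTED = new Matcher() {
        @Override public boolean match(int value) {
            return value >= '0' && value <= '9';
        }
        @Override public String expected() { return "['0'-'9']"; }
    };

    // Longest prefix of text accepted by the matcher, like the first while loop in val().
    static String digitPrefix(String text, Matcher matcher) {
        int end = 0;
        while (end < text.length() && matcher.match(text.charAt(end))) {
            end++;
        }
        return text.substring(0, end);
    }
}
```

    With the published matcher, digitPrefix("9a", ...) consumes both characters; with the corrected one it stops after "9". Assuming val() behaves as listed, a dec_val test fed "d12.34.56.78.9a" should therefore expect "%d12.34.56.78.9" rather than the whole input.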
    

    Index of this series: A Predictive ABNF Grammar Parser

  • Original article: https://www.cnblogs.com/snake-hand/p/3141231.html