  • Erlang XML Processing Solutions

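    XML and its related tools (XSLT, XPath, XSD) give us a great deal of flexibility and convenience at the data level. The code generation for our game protocol works this way: we first design the protocol's Schema with an XSD tool, then use the .NET xsd tool to generate entity classes directly, and from then on our tools simply manipulate objects. The protocol's XML files can also be validated up front against the Schema. Erlang's libraries do support XML, although you may not have found that support in STDLIB; this is because it lives in a separate application, xmerl: http://www.erlang.org/doc/apps/xmerl/index.html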

      

      If you have forgotten the common XML concepts, it is worth reviewing them on Wikipedia first.

      The header file "\erl5.9.1\lib\xmerl-1.3.1\include\xmerl.hrl" shows how these XML concepts are represented in Erlang:

    %% XML Element
    %% content = [#xmlElement()|#xmlText()|#xmlPI()|#xmlComment()|#xmlDecl()]
    -record(xmlElement,{
           name,               % atom()
           expanded_name = [],     % string() | {URI,Local} | {"xmlns",Local}
           nsinfo = [],             % {Prefix, Local} | []
           namespace=#xmlNamespace{},
           parents = [],          % [{atom(),integer()}]
           pos,               % integer()
           attributes = [],     % [#xmlAttribute()]
           content = [],
           language = "",     % string()
           xmlbase="",           % string() XML Base path, for relative URI:s
           elementdef=undeclared % atom(), one of [undeclared | prolog | external | element]
         }).
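    As a rough illustration of how these record fields are used, the sketch below parses an XML string with xmerl_scan:string/1 and reads the name, attributes and content fields of the root #xmlElement{}. The module name xml_element_demo and the function describe/1 are made up for this example.

```erlang
-module(xml_element_demo).
-include_lib("xmerl/include/xmerl.hrl").
-export([describe/1]).

%% Parse an XML string and return {RootName, AttributePairs, ChildNames}.
%% describe/1 is a hypothetical helper, not part of xmerl.
describe(XmlString) ->
    {Root, _Rest} = xmerl_scan:string(XmlString),
    #xmlElement{name = Name, attributes = Attrs, content = Content} = Root,
    %% Each attribute is an #xmlAttribute{} record with an atom name
    %% and a string value.
    AttrPairs = [{A#xmlAttribute.name, A#xmlAttribute.value} || A <- Attrs],
    %% content mixes elements, text, comments and so on; the record pattern
    %% in the generator keeps only the child elements.
    Children = [N || #xmlElement{name = N} <- Content],
    {Name, AttrPairs, Children}.
```

    For instance, describe("<shopping id=\"1\"><item/><item/></shopping>") should give the root name shopping, one attribute pair for id, and two item children.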

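      Judging by its module layout, the official Erlang solution has everything it needs: xmerl_scan, xmerl, xmerl_xs, xmerl_eventp, xmerl_xpath, xmerl_xsd, xmerl_sax_parser. The official documentation, however, does not offer demo code with a low enough barrier to entry, and the only two pieces of sample code are not easy to find, possibly because of how search engines index them. They are actually located at: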

        http://erlang.org/doc/apps/xmerl/xmerl_xs_examples.html 

        http://www.erlang.org/doc/apps/xmerl/xmerl_xs_examples.html  

     If you already have Erlang installed, you can find them under erl5.9.1\lib\xmerl-1.3.1\doc\html. Let's see how xmerl is used through two very simple pieces of code.
     

    Parsing & Creating XML

    Parsing XML
     First we design a simple XML file, test.xml, for this demo:
    <shopping> 
      <item name="bread" quantity="3" price="2.50"/> 
      <item name="milk" quantity="2" price="3.50"/> 
    </shopping>
    We want to parse the XML above and compute the total amount of the shopping list. With xmerl this can be done as follows:
    -module(shopping).   % module name chosen for illustration; save as shopping.erl
    -include_lib("xmerl/include/xmerl.hrl").
    -export([get_total/1]).
    
    get_total(ShoppingList) ->
            {XmlElt, _} = xmerl_scan:string(ShoppingList),
            Items = xmerl_xpath:string("/shopping/item", XmlElt),
            Total = lists:foldl(fun(Item, Tot) ->
                                    [#xmlAttribute{value = PriceString}] = xmerl_xpath:string("/item/@price", Item),
                                    {Price, _} = string:to_float(PriceString),
                                    [#xmlAttribute{value = QuantityString}] = xmerl_xpath:string("/item/@quantity", Item),
                                    {Quantity, _} = string:to_integer(QuantityString),
                                    Tot + Price*Quantity
                            end,
                    0.0, Items),   % float seed, so ~.2f also works for an empty list
            io:format("$~.2f~n", [Total]).

    Running the code above prints: $14.50
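    A variation on the same idea, sketched under a few assumptions: it returns the total instead of printing it, and it reads the attributes straight from each #xmlElement{} record rather than running a second XPath query per item. The module name shopping_total and the helpers item_cost/1 and attr/2 are invented for this example.

```erlang
-module(shopping_total).
-include_lib("xmerl/include/xmerl.hrl").
-export([total/1]).

%% Return the total of a shopping list given as an XML string.
total(XmlString) ->
    {Doc, _Rest} = xmerl_scan:string(XmlString),
    Items = xmerl_xpath:string("/shopping/item", Doc),
    lists:sum([item_cost(Item) || Item <- Items]).

%% price * quantity for one <item> element
item_cost(#xmlElement{attributes = Attrs}) ->
    Price = list_to_float(attr(price, Attrs)),
    Quantity = list_to_integer(attr(quantity, Attrs)),
    Price * Quantity.

%% Look up an attribute value by name; crashes if the attribute is missing.
attr(Name, Attrs) ->
    #xmlAttribute{value = Value} =
        lists:keyfind(Name, #xmlAttribute.name, Attrs),
    Value.
```

    Note that list_to_float/1 requires a decimal point, so a price written as "3" would need extra handling.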

     

    Creating XML Dynamically

     Next we create an XML document dynamically from a CSV data source. The CSV content is:

    bread,3,2.50 
    milk,2,3.50 

     

     The XML to create is shown below; it is simply the shopping list from above:

    <shopping> <item name="bread" quantity="3" price="2.50"/> <item name="milk" quantity="2" price="3.50"/> </shopping>

    Implementation:

    to_xml(ShoppingList) ->
            Items = lists:map(fun(L) ->
                                    [Name, Quantity, Price] = string:tokens(L, ","),
                                    {item, [{name, Name}, {quantity, Quantity}, {price, Price}], []}
                    end, string:tokens(ShoppingList, "\n")),
            xmerl:export_simple([{shopping, [], Items}], xmerl_xml).
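    xmerl:export_simple/2 returns a deep list (iolist) rather than a flat string, which is easy to miss when printing the result. The sketch below wraps the same idea in a module, flattens the output, and splits lines on both "\r" and "\n" so that CSV files with Windows line endings also work. The module name csv_to_xml and the helper item/1 are assumptions made for this example.

```erlang
-module(csv_to_xml).
-export([to_xml/1]).

%% Convert CSV text with one "name,quantity,price" row per line
%% into a flat XML string.
to_xml(Csv) ->
    %% string:tokens/2 treats every character of the separator list as a
    %% delimiter, so "\r\n" handles both Unix and Windows line endings.
    Lines = string:tokens(Csv, "\r\n"),
    Items = [item(string:tokens(Line, ",")) || Line <- Lines],
    %% export_simple/2 returns an iolist; flatten it into a plain string.
    lists:flatten(xmerl:export_simple([{shopping, [], Items}], xmerl_xml)).

%% Build one {Tag, Attributes, Content} tuple in xmerl's "simple" form.
item([Name, Quantity, Price]) ->
    {item, [{name, Name}, {quantity, Quantity}, {price, Price}], []}.
```

    Calling io:format("~s~n", [csv_to_xml:to_xml("bread,3,2.50\nmilk,2,3.50\n")]) prints the generated document, including the XML declaration that xmerl_xml adds.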
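      The official solution is frankly not very satisfying, and some people have even been infuriated by it; see for example [erlang-questions] Rant: I hate parsing XML with Erlang. Fortunately there are other options, such as erlsom.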
     

    erlsom

      erlsom project page: http://sourceforge.net/projects/erlsom/
      erlsom supports three usage modes:

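    1. As a SAX parser. Note: SAX (Simple API for XML) is a parser API that reads XML sequentially.
    2. As a simple sort of DOM parser. Note: DOM (Document Object Model) is the standard programming interface recommended by the W3C for processing XML.
    3. As a "data binder": the XML is parsed directly into Erlang records, similar in concept to a strongly typed DataSet.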


    Let's try out these three modes in practice. We use the XML below, saved as test2.xml; the goal is still to compute the total amount of the shopping list.

    <?xml version="1.0"?>
    <shopping> 
      <item name="bread" quantity="3" price="2.50"/> 
      <item name="milk" quantity="2" price="3.50"/> 
    </shopping>
     
    SAX parser
    2>  {ok, Xml} = file:read_file("test.xml").
    {ok,<<"<shopping> \r\n  <item name=\"bread\" quantity=\"3\" price=\"2.50\"/> \r\n  <item name=\"milk\" quantity=\"2\" price=\"3.50"...>>}
    3> erlsom:parse_sax(Xml, [], fun(Event, Acc) -> io:format("~p~n", [Event]), Acc end).
    startDocument
    {startElement,[],"shopping",[],[]}
    {ignorableWhitespace," \r\n  "}
    {startElement,[],"item",[],
                  [{attribute,"price",[],[],"2.50"},
                   {attribute,"quantity",[],[],"3"},
                   {attribute,"name",[],[],"bread"}]}
    {endElement,[],"item",[]}
    {ignorableWhitespace," \r\n  "}
    {startElement,[],"item",[],
                  [{attribute,"price",[],[],"3.50"},
                   {attribute,"quantity",[],[],"2"},
                   {attribute,"name",[],[],"milk"}]}
    {endElement,[],"item",[]}
    {ignorableWhitespace," \r\n"}
    {endElement,[],"shopping",[]}
    endDocument
    {ok,[]," "}
    4> Sum = fun(Event, Acc) -> case Event of {startElement, _, "item", _, [{_,_,_,_,P},{_,_,_,_,C},_]} -> Acc + list_to_float(P)*list_to_integer(C); _ -> Acc end end.
    #Fun<erl_eval.12.82930912>
    5> erlsom:parse_sax(Xml, 0, Sum).
    {ok,14.5," "}
    6>
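    The Sum fun above matches the attribute list by position, so it only works while the attributes arrive in exactly the order price, quantity, name. A more defensive callback can look attributes up by name instead. The sketch below assumes the event and attribute tuple shapes printed in the session above; the module name sax_total and the helpers attr/2 and total/1 are invented for this example.

```erlang
-module(sax_total).
-export([handle_event/2, total/1]).

%% SAX callback: add price * quantity for every <item> start tag and
%% pass the accumulator through unchanged for all other events.
handle_event({startElement, _Uri, "item", _Prefix, Attrs}, Acc) ->
    Price = list_to_float(attr("price", Attrs)),
    Quantity = list_to_integer(attr("quantity", Attrs)),
    Acc + Price * Quantity;
handle_event(_Event, Acc) ->
    Acc.

%% Attributes look like {attribute, Name, Prefix, Uri, Value} in the session
%% above, so the name is element 2. Crashes if the attribute is missing.
attr(Name, Attrs) ->
    {attribute, Name, _Prefix, _Uri, Value} = lists:keyfind(Name, 2, Attrs),
    Value.

%% Fold the callback over an already collected list of events, which makes
%% it possible to try the callback without a parser.
total(Events) ->
    lists:foldl(fun handle_event/2, 0, Events).
```

    With erlsom loaded, the same callback can be passed to the parser in place of Sum: erlsom:parse_sax(Xml, 0, fun sax_total:handle_event/2).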
     
    DOM parser
     The result produced by the code below drops the XML's structural metadata, so it is much cleaner and simpler. The follow-up calculation is omitted.
    9> erlsom:simple_form(Xml).
    {ok,{"shopping",[],
         [{"item",
           [{"price","2.50"},{"quantity","3"},{"name","bread"}],
           []},
          {"item",
           [{"price","3.50"},{"quantity","2"},{"name","milk"}],
           []}]},
        " "}
    10>
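    For completeness, here is one way the omitted calculation could look. It assumes the {Tag, Attributes, Content} tuple shape shown in the output above, with attributes as {Name, Value} string pairs; the module name simple_form_total and the helper item_cost/1 are made up for this sketch.

```erlang
-module(simple_form_total).
-export([total/1]).

%% Sum price * quantity over the <item> children of a simple_form tree.
total({"shopping", _Attrs, Children}) ->
    %% The tuple pattern in the generator skips anything that is not an
    %% "item" element, such as text nodes.
    lists:sum([item_cost(Attrs) || {"item", Attrs, _Content} <- Children]).

%% Attributes are {Name, Value} pairs, so proplists can look them up by name.
item_cost(Attrs) ->
    Price = list_to_float(proplists:get_value("price", Attrs)),
    Quantity = list_to_integer(proplists:get_value("quantity", Attrs)),
    Price * Quantity.
```

    In the shell this would be used as {ok, Tree, _} = erlsom:simple_form(Xml), simple_form_total:total(Tree).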
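    Data Binder

       The idea is to design an XSD for the XML first and then use that XSD to connect every stage where the data model is used, for example by generating C# code and obtaining strongly typed objects directly. This approach is very common in .NET, and erlsom's data binder mode implements exactly this design. The starting point is still the XSD file, so let's design one for test2.xml above:
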
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="shopping" type="shoppingType"/>
      <xsd:complexType name="shoppingType">
        <xsd:sequence>
          <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
            <xsd:complexType>
              <xsd:attribute name="name" type="xsd:string" use="required"/>
              <xsd:attribute name="quantity" type="xsd:positiveInteger" use="required"/>
              <xsd:attribute name="price" type="xsd:decimal" use="required"/>
            </xsd:complexType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>

    Next we generate the corresponding records from the XSD; erlsom already provides a tool for this:
     28> erlsom:write_xsd_hrl_file("test.xsd","test.hrl").
    ok
     
    Open test.hrl and you will see that the corresponding records have been generated:
    %% HRL file generated by ERLSOM
    %%
    %% It is possible to change the name of the record fields.
    %%
    %% It is possible to add default values, but be aware that these will
    %% only be used when *writing* an xml document.
    
    -record('shoppingType', {anyAttribs, 'item'}).
    -record('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).
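
    So that all of the tests can be done inside the Erlang shell, we use the rd() command to create the record definitions in the shell whenever records are needed later.

    Now we parse the XML and map it to records:
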
    Eshell V5.9.1  (abort with ^G)
    1>  {ok, X} = erlsom:compile_xsd_file("test.xsd").
    
    =ERROR REPORT==== 20-Jul-2012::06:53:09 ===
    Call to tuple fun {erlsom_parse,xml2StructCallback}.
    
    Tuple funs are deprecated and will be removed in R16. Use "fun M:F/A" instead, for example "fun erlsom_parse:xml2StructCallback/2".
    
    (This warning will only be shown the first time a tuple fun is called.)
    
    {ok,{model,[{type,'_document',sequence,
                      [{el,[{alt,shopping,shoppingType,[],1,1,true,undefined}],
                           1,1,1}],
                      [],undefined,undefined,1,1,1,false,undefined},
                {type,shoppingType,sequence,
                      [{el,[{alt,item,'shoppingType/item',[],1,1,true,undefined}],
                           0,unbound,1}],
                      [],undefined,undefined,2,1,1,undefined,undefined},
                {type,'shoppingType/item',sequence,[],
                      [{att,name,1,false,char},
                       {att,quantity,2,false,char},
                       {att,price,3,false,char}],
                      undefined,undefined,4,1,1,undefined,undefined}],
               [{ns,"http://www.w3.org/2001/XMLSchema","xsd"}],
               undefined,[]}}
    2> {ok, Xml} = file:read_file("test2.xml").
    {ok,<<"<?xml version=\"1.0\"?>\r\n<shopping> \r\n  <item name=\"bread\" quantity=\"3\" price=\"2.50\"/> \r\n  <item name=\"milk"...>>}
    3> {ok, Result, _} = erlsom:scan(Xml, X).
    {ok,{shoppingType,[],
                      [{'shoppingType/item',[],"bread","3","2.50"},
                       {'shoppingType/item',[],"milk","2","3.50"}]},
        " "}
    4>
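        For XML that is not too complex, a result in this form is already very convenient to work with, and you could stop here and do the final calculation. For particularly complex XML, however, working with records is more flexible and readable, so let's finish the walkthrough:
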
    5> rd('shoppingType', {anyAttribs, 'item'}).
    shoppingType
    6> rd('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).
    'shoppingType/item'
    7> Result#shoppingType.'item'.
    [#'shoppingType/item'{anyAttribs = [],name = "bread",
                          quantity = "3",price = "2.50"},
     #'shoppingType/item'{anyAttribs = [],name = "milk",
                          quantity = "2",price = "3.50"}]

    8> hd(Result#shoppingType.'item').
    #'shoppingType/item'{anyAttribs = [],name = "bread",
                         quantity = "3",price = "2.50"}
    9> #'shoppingType/item'.quantity.
    4
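    Once the records are available, the total can be computed with ordinary record pattern matching. The sketch below repeats the two record definitions from the generated test.hrl and assumes, as in the session above, that the attribute values arrive as strings; the module name binder_total is invented for this example.

```erlang
-module(binder_total).
-export([total/1]).

%% Same definitions as in the generated test.hrl shown earlier.
-record('shoppingType', {anyAttribs, 'item'}).
-record('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).

%% Sum price * quantity over all items of a parsed shopping list.
%% Assumes the item field holds a list of item records.
total(#'shoppingType'{item = Items}) ->
    lists:sum([list_to_float(Price) * list_to_integer(Quantity)
               || #'shoppingType/item'{quantity = Quantity,
                                       price = Price} <- Items]).
```

    In a real project the records would normally come from -include("test.hrl") rather than being copied into the module.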

    Other Options

    [1] JSON: as a lightweight data interchange format JSON has huge advantages, and there are many Erlang solutions for it, such as ejson; mochiweb also has a module for it.

    [2] Solutions typified by Google's Protocol Buffers and Facebook's Thrift.

    [3] Piqi includes a data serialization system for Erlang. It can be used for serializing Erlang values in 4 different formats: Google Protocol Buffers, JSON, XML and Piq.

         

     

    Good night!

     


  • Original article: https://www.cnblogs.com/Leo_wl/p/2600530.html