  • How to process XML file data quickly

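    I wrote this example to help a friend. His code processed 1 record per 0.3 seconds; my rewritten code processes 1,000 records in 0.3 seconds, a 1000-fold speedup. The code follows.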

    --------------------------------------------------------------------
    
    1. The XML string data to process
    
    --------------------------------------------------------------------
    <Positions from="51Job">
      <Company>
        <CorpName><![CDATA[北京XXXX有限公司]]></CorpName>
        <OfficeAddr1><![CDATA[北京市海淀区XXX大厦XXX座XXX层]]></OfficeAddr1>
        <OffPostCode1><![CDATA[100086]]></OffPostCode1>
        <CompanySize><![CDATA[XXX人以上]]></CompanySize>
        <TradeCodes>
          <TradeCode><![CDATA[办公用品及设备]]></TradeCode>
          <TradeCode><![CDATA[计算机硬件]]></TradeCode>
        </TradeCodes>
        <AboutUs><![CDATA[XXX数码办公企业创建于XXXX年XX月,历经XXXX年的发展,目前已成为以XXXX为中心的全国性股份制公司,旗下拥有XX家专业公司,分布于北京、上海、广州、武汉、成都、西安、长沙和南昌等核心城市,注册资金5000万元,在职员工近500名。  秉承“先进的技术、优质的产品、完善的服务、合理的价格”的经营理念,自公司创立以来,本公司的销售、技术服务、研发等高素质人才为各行各业的客户提供着从需求分析、产品咨询、专业培训、网络建设、软件开发、软件应用培训及维护、硬件销售、保养及维修、各种消耗品供应等一条龙的量身定做的专业化、规范化、人性化服务。  泰和数码办公企业重视培养人才,实行任人唯贤的人力资源政策,积极培养有理想、有道德、有责任感、有协作精神的现代化员工队伍,同时积极引进先进的管理理念,采用规范化、科学化的管理方法不断壮大自身实力。  公司始终坚持人才与企业共同发展的理念,关注员工的技能培训及发展,并不断加强企业文化的营建。公司业务的持续高速发展,也为员工担供了极大的发展空间。多元化的企业文化、以人为本的经营管理理念证明了泰和是一个良好的发展平台。     公司网站 http://www.XXXX.com.cn]]></AboutUs>
        <Websites>
          <Website><![CDATA[http://www.techhero.com.cn]]></Website>
        </Websites>
        <Informations>
          <Information>
            <Contact><![CDATA[人力资源部]]></Contact>
            <Addr><![CDATA[XXX市XXX区XX路XXX号XXX大厦A座10层]]></Addr>
            <OffPostCode><![CDATA[100086]]></OffPostCode>
          </Information>
        </Informations>
      </Company>
      <Position>
        <ReleaseDate><![CDATA[2011-04-14]]></ReleaseDate>
        <Description><![CDATA[岗位职责:1,主要负责杭州市内复印机等办公设备的日常保养和故障处理,维系客情关系;2,在为所负责区域的客户做保养或维修的同时进行硒鼓、纸张等办公耗材的销售工作。岗位要求:1,中专以上学历,计算机或电子类相关专业;2,熟悉杭州地形,能吃苦,有上进心;3,喜欢动手,有相关工作经验者优先。薪资福利:底薪+提成+工作餐+五险一金+高温补贴+带薪培训+免费体检+节日费等]]></Description>
        <PositionName><![CDATA[复印机维修工程师]]></PositionName>
        <Number><![CDATA[5]]></Number>
        <Education><![CDATA[中专]]></Education>
        <Salary><![CDATA[2000-2999]]></Salary>
        <Citys>
          <City><![CDATA[杭州]]></City>
        </Citys>
        <Emails>
          <Email><![CDATA[zhouqiaoyan@techhero.com.cn]]></Email>
        </Emails>
        <URLFrom>http://search.51job.com/job/45205821,c.html</URLFrom>
        <SnapshoAddr>{117BD768-CA9A-4D75-9C47-BB30E638FCE2}.html</SnapshoAddr>
      </Position>
    </Positions>
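
    Every value in the sample sits inside a CDATA section, so it helps to see how such a value is read back. The following is a minimal sketch, not the author's code: the class name XmlStringDemo, the helper ReadText, and the shortened XML string are all hypothetical. It parses an XML string with XmlDocument.LoadXml and reads element text through InnerText, which returns the CDATA content without the surrounding markers.

    ```csharp
    using System;
    using System.Xml;

    public static class XmlStringDemo
    {
        // Hypothetical helper: returns the text of the first descendant element
        // named `name`, or an empty string when no such element exists.
        public static string ReadText(string xml, string name)
        {
            var doc = new XmlDocument();
            doc.LoadXml(xml); // LoadXml parses a string; Load expects a path, stream, or reader
            XmlNode node = doc.DocumentElement.SelectSingleNode("descendant::" + name);
            return node != null ? node.InnerText : "";
        }

        public static void Main()
        {
            // Shortened, made-up stand-in for the sample data above.
            string xml =
                "<Positions from=\"51Job\">" +
                "<Position>" +
                "<PositionName><![CDATA[Copier repair engineer]]></PositionName>" +
                "<Salary><![CDATA[2000-2999]]></Salary>" +
                "</Position>" +
                "</Positions>";

            Console.WriteLine(ReadText(xml, "PositionName")); // Copier repair engineer
            Console.WriteLine(ReadText(xml, "Salary"));       // 2000-2999
            Console.WriteLine(ReadText(xml, "Missing") == ""); // True
        }
    }
    ```

    Returning an empty string for a missing element is one possible design choice; it keeps a single absent field from aborting the whole record.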
    
    --------------------------------------------------------------------
    
    2. The code in test01.aspx.cs
    
    --------------------------------------------------------------------
    
    using System;
    using System.Collections;
    using System.Configuration;
    using System.Data;
    using System.Linq;
    using System.Web;
    using System.Web.Security;
    using System.Web.UI;
    using System.Web.UI.HtmlControls;
    using System.Web.UI.WebControls;
    using System.Web.UI.WebControls.WebParts;
    using System.Xml.Linq; 
    using System.Collections.Generic;
    using System.Xml;
    
    
    
    
    namespace Test.xml
    {
        public partial class test01 : System.Web.UI.Page
        {
            protected void Page_Load(object sender, EventArgs e)
            {
                if (!IsPostBack)
                {
                    DateTime begin = DateTime.Now;
                    TimeSpan span = TimeSpan.Zero;
                    Positions obj = new Positions();
                    int num = 10000;
                    for (int i = 0; i < num; i++)
                    {
                        this.MakeData(Server.MapPath("~/Xml/work.xml"), ref obj);
                        obj.Dispose();
                    }
                    DateTime end = DateTime.Now;
                    span = end - begin;
                    this.Page.Response.Write(string.Format("span = {0}<br/> begin = {1}<br/> end = {2}<br/>"
                        , span.ToString()
                        , begin.ToString("yyyy/MM/dd HH:mm:ss:fff")
                        , end.ToString("yyyy/MM/dd HH:mm:ss:fff")
                        )); 
                }
            }
    
            /// <summary>
            /// Main routine for processing the XML data.
            /// 1. To process an XML file, use xdoc.Load(string xmlPath).
            /// 2. To process an XML string, use xdoc.LoadXml(string xmlInfo).
            /// Adjust the parameters and the commented-out code to suit your needs.
            /// </summary>
            private void MakeData(string fpath, ref Positions obj)
            {
                // To parse an XML string instead of a file: xdoc.LoadXml(strXml);
                System.Xml.XmlDocument xdoc = null;
                try
                {
                    xdoc = new System.Xml.XmlDocument();
                    xdoc.Load(fpath);
                    // DocumentElement is the root <Positions> element, even when
                    // the file starts with an XML declaration or a comment.
                    if (xdoc.DocumentElement != null)
                    {
                        obj = new Positions();
                        foreach (XmlNode node in xdoc.DocumentElement.ChildNodes)
                        {
                            this.MakePoints(node, obj);
                        }
                    }
                }
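                catch (Exception)
                {
                    // Errors (missing file, malformed XML) are swallowed here;
                    // obj keeps whatever was assigned before the failure.
                }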
                finally
                { 
                    xdoc = null;
                }
            }
    
            /// <summary>
            /// Given an XML node, processes its child nodes and attributes and saves the data into the custom object.
            /// </summary>
            /// <param name="node"></param>
            /// <param name="obj"></param>
            private void MakePoints(XmlNode node, Positions obj)
            {
                string name = node.Name.Trim();
                switch (name)
                {
                    case "Company":
                        {
                            this.Fun_Company(node, obj);
                            break;
                        }
                    case "Position":
                        {
                            this.Fun_Position(node, obj);
                            break;
                        }
    
                }
            }
    
            /// <summary>
            /// Parses the top-level Company node and saves its data into the custom Company object.
            /// </summary>
            /// <param name="node"></param>
            /// <param name="obj"></param>
            protected void Fun_Company(XmlNode node, Positions obj)
            {
                obj.MyCompany = new Company
                {
                    CorpName = this.Fun_Single(node, "CorpName", obj),
                    OfficeAddr1 = this.Fun_Single(node, "OfficeAddr1", obj),
                    OffPostCode1 = this.Fun_Single(node, "OffPostCode1", obj),
                    CompanySize = this.Fun_Single(node, "CompanySize", obj),
                    AboutUs = this.Fun_Single(node, "AboutUs", obj),
                };
                this.Fun_Multi(node.SelectSingleNode("descendant::TradeCodes"), obj);
                this.Fun_Multi(node.SelectSingleNode("descendant::Websites"), obj);
                this.Fun_Multi(node.SelectSingleNode("descendant::Informations"), obj);
            }
    
            /// <summary>
            /// Parses the top-level Position node and saves its data into the custom Position object.
            /// </summary>
            /// <param name="node"></param>
            /// <param name="obj"></param>
            private void Fun_Position(XmlNode node, Positions obj)
            {
                obj.MyPosition = new Position
                {
                    ReleaseDate = this.Fun_Single(node, "ReleaseDate", obj),
                    Description = this.Fun_Single(node, "Description", obj),
                    PositionName = this.Fun_Single(node, "PositionName", obj),
                    Number = this.Fun_Single(node, "Number", obj),
                    Education = this.Fun_Single(node, "Education", obj),
                    Salary = this.Fun_Single(node, "Salary", obj),
                    URLFrom = this.Fun_Single(node, "URLFrom", obj),
                    SnapshoAddr = this.Fun_Single(node, "SnapshoAddr", obj),
                };
                this.Fun_Multi(node.SelectSingleNode("descendant::Citys"), obj);
                this.Fun_Multi(node.SelectSingleNode("descendant::Emails"), obj);
            }
    
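            /// <summary>
            /// Reads an XML node that has no child elements and returns its text
            /// so that the caller can store it in the custom object.
            /// </summary>
            /// <param name="node"></param>
            /// <param name="child"></param>
            /// <param name="obj"></param>
            /// <returns>The child's InnerText, or an empty string if the child is missing.</returns>
            private string Fun_Single(XmlNode node, string child, Positions obj)
            {
                // SelectSingleNode returns null when no matching element exists.
                XmlNode found = node.SelectSingleNode(string.Format("descendant::{0}", child));
                return found != null ? found.InnerText : "";
            }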
    
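            /// <summary>
            /// Processes an XML node that has several child nodes and saves all of their data into the custom object.
            /// </summary>
            /// <param name="node"></param>
            /// <param name="obj"></param>
            private void Fun_Multi(XmlNode node, Positions obj)
            {
                // SelectSingleNode returns null when the container element is absent.
                if (node == null)
                {
                    return;
                }
                switch (node.Name)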
                {
                    case "TradeCodes":
                        {
                            foreach (XmlNode i in node.ChildNodes)
                            {
                                obj.MyCompany.TradeCodes.Add(i.InnerText);
                            }
                            break;
                        }
    
                    case "Informations":
                        {
                            foreach (XmlNode i in node.ChildNodes)
                            {
                                Dictionary<string, string> item = new Dictionary<string, string>();
                                obj.MyCompany.Informations.Add(item);
                                foreach (XmlNode j in i.ChildNodes)
                                {
                                    item.Add(j.Name, j.InnerText);
                                }
                            }
                            break;
                        }
    
                    case "Websites":
                        {
                            foreach (XmlNode i in node.ChildNodes)
                            {
                                obj.MyCompany.Websites.Add(i.InnerText);
                            }
                            break;
                        }
    
    
                    case "Citys":
                        {
                            foreach (XmlNode i in node.ChildNodes)
                            {
                                obj.MyPosition.Citys.Add(i.InnerText);
                            }
                            break;
                        }
    
                    case "Emails":
                        {
                            foreach (XmlNode i in node.ChildNodes)
                            {
                                obj.MyPosition.Emails.Add(i.InnerText);
                            }
                            break;
                        }
                }
            }
        }
    
        /// <summary>
        /// Classes defined to mirror the structure of the XML.
        /// </summary>
        public class Positions : BaseClass
        {
            public Positions()
            {
                this.MyCompany = new Company();
                this.MyPosition = new Position();
            }
    
            public Company MyCompany
            {
                get;
                set;
            }
    
            public Position MyPosition
            {
                get;
                set;
            }
        }
    
        public class Company : BaseClass
        {
            public Company()
            {
                this.TradeCodes = new List<string>();
                this.Websites = new List<string>();
                this.Informations = new List<Dictionary<string, string>>();
            }
    
            public string CorpName
            {
                get;
                set;
            }
    
            public string OfficeAddr1
            {
                get;
                set;
            }
    
            public string OffPostCode1
            {
                get;
                set;
            }
    
            public string CompanySize
            {
                get;
                set;
            }
    
            public List<string> TradeCodes
            {
                get;
                set;
            }
    
            public string AboutUs
            {
                get;
                set;
            }
    
            public List<string> Websites
            {
                get;
                set;
            }
    
            public List<Dictionary<string, string>> Informations
            {
                get;
                set;
            }
        }
    
        public class Position : BaseClass
        {
            public Position()
            {
                this.Citys = new List<string>();
                this.Emails = new List<string>();
            }
    
            public string ReleaseDate
            {
                get;
                set;
            }
    
            public string Description
            {
                get;
                set;
            }
    
            public string PositionName
            {
                get;
                set;
            }
    
            public string Number
            {
                get;
                set;
            }
    
            public string Education
            {
                get;
                set;
            }
    
            public string Salary
            {
                get;
                set;
            }
    
            public List<string> Citys
            {
                get;
                set;
            }
    
            public List<string> Emails
            {
                get;
                set;
            }
    
            public string URLFrom
            {
                get;
                set;
            }
    
            public string SnapshoAddr
            {
                get;
                set;
            }
        }
    
        public class BaseClass:IDisposable
        {
            public void Dispose()
            {
            }
        }
    }
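
    The timing in Page_Load relies on DateTime.Now, whose resolution is coarse for sub-second measurements. As an alternative, here is a minimal sketch using System.Diagnostics.Stopwatch. It is not part of the original code: the class name ParseTiming, the method Measure, and the tiny XML string are hypothetical, and it times only XmlDocument.LoadXml rather than the full MakeData routine.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Xml;

    public static class ParseTiming
    {
        // Hypothetical helper: parses `xml` `iterations` times and
        // returns the elapsed wall-clock time.
        public static TimeSpan Measure(string xml, int iterations)
        {
            Stopwatch watch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                var doc = new XmlDocument();
                doc.LoadXml(xml);
            }
            watch.Stop();
            return watch.Elapsed;
        }

        public static void Main()
        {
            // Made-up, minimal document; substitute the real record to get meaningful numbers.
            string xml = "<Positions><Position><Number><![CDATA[5]]></Number></Position></Positions>";
            int iterations = 1000;
            TimeSpan elapsed = Measure(xml, iterations);
            Console.WriteLine("{0} documents in {1:F1} ms ({2:F0} per second)",
                iterations, elapsed.TotalMilliseconds, iterations / elapsed.TotalSeconds);
        }
    }
    ```

    The numbers depend on the machine and on document size, so they should be compared only between runs on the same setup.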
    

    3. Test result: 1,000 records in 0.3 seconds


  • Original article: https://www.cnblogs.com/itshare/p/2040194.html