  • Graduation Project 6-3

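    I have been running experiments for several days in a row, and it feels like a new problem pops up before the last one is settled. Today was no different, so I figured I should write things down.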

    1.

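    First, while running GPS (Graph Processing System), a larger input file broke a workflow that had run fine yesterday, with an error about heap size. So the problem comes down to input scale, the so-called scalability issue. There is very little documentation for GPS, practically none. I searched around a bit; some people suggested increasing the heap space and so on, but none of it worked (http://stackoverflow.com/questions/1596009/java-lang-outofmemoryerror-java-heap-space). Then I went to the GPS discussion group, where the creator of GPS explained what to do: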

    See: https://groups.google.com/forum/#!topic/stanfordgpsusers/62FeHpZijU0

    Hi, Semih.

    I am using GPS to process twitter graph. I got a critical problem. Twitter has a very skewed graph. Some users may have more than 100k followers. Such vertex will trigger huge number of messages.
    GPS generates "java heap overflow" exceptions. I thought that was because the messages are buffered in the memory before sending out. 

    I don't think this is really due to the skew in the data. 100k is still a very small amount of messages. It just means that that vertex will likely generate or receive 100k * 8 = 800k bytes = 0.8 MB more data. How much memory are you giving to your java virtual machine? How much memory do your machines have? I have about 4GB on each of my machines so I give the following flags to my java scripts: -Xmx3000M. You should change the script file here: https://www.assembla.com/code/phd-projects/subversion/nodes/gps/trunk/scripts/start_gps_node.sh?rev=95 There are two jvm flags XMX_SIZE and XMS_SIZE, which you should adjust.
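      So I went and edited the script, increasing both XMX_SIZE and XMS_SIZE. Then a new problem appeared: the port could not be connected to, so I changed the port number and uploaded everything to HDFS again.
      Fixed!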
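    The arithmetic in the reply above can be checked with a small sketch. The message count, the 8 bytes per message, and the 3000 MB heap all come from the forum reply; the function name is only an illustration and is not part of GPS.

```python
# Back-of-the-envelope check of the estimate in the forum reply.
# The numbers are taken from the reply; the helper name is hypothetical.
MESSAGES = 100000        # messages generated or received by one busy vertex
BYTES_PER_MESSAGE = 8    # per-message size assumed in the reply
HEAP_MB = 3000           # heap size given by -Xmx3000M


def extra_message_megabytes(messages, bytes_per_message):
    """Return the message volume in megabytes (1 MB = 10**6 bytes)."""
    return messages * bytes_per_message / 10**6


extra_mb = extra_message_megabytes(MESSAGES, BYTES_PER_MESSAGE)
heap_fraction = extra_mb / HEAP_MB

print("extra data for the vertex: %.1f MB" % extra_mb)
print("fraction of a 3000 MB heap: %.5f" % heap_fraction)
```

    Under these assumptions one high-degree vertex accounts for well under a thousandth of the heap, which is why the reply points at the JVM heap settings rather than at the skew of the graph.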
     
     
     

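    2.

    The second problem came up in the afternoon while running RDFlib (I mainly use it to parse RDF files and turn them into graph data). At first, installing it with easy_install rdflib kept failing with errors (probably network trouble these past few days; my proxy was not up to it, heh). Eventually I lost patience, downloaded the source, and decided to install it by hand. But after searching for ages I still could not find out how to do a manual install! It turns out that all you need to do is go into the source directory and run python setup.py install.
    Once it was installed I tried it on data. Small inputs ran smoothly, but when I moved on to the real data, about 300 MB, it choked: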
    No handlers could be found for logger "rdflib.term"
    A quick search turned up a solution at this URL: http://stackoverflow.com/questions/17393664/no-handlers-could-be-found-for-logger-rdflib-term
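    import logging
    import rdflib

    # Configure a default handler so rdflib's log messages have somewhere to go
    logging.basicConfig()

    # now load your graph
    g = rdflib.Graph()
    # parse() is used here because newer rdflib releases no longer provide g.load()
    g.parse("life_the_universe_everything.rdf")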
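    The "No handlers could be found" message is printed by Python 2's logging module when a logger emits a record and no handler has been configured for it; it is not itself the rdflib warning. As a minimal sketch using only the standard library, a handler can also be attached to the "rdflib.term" logger directly, so that the warnings are collected for inspection. The ListHandler class and the sample message are hypothetical stand-ins; rdflib is not imported here.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical helper: keep formatted log records in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


collector = ListHandler()
collector.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

# "rdflib.term" is the logger name shown in the error message above.
term_logger = logging.getLogger("rdflib.term")
term_logger.setLevel(logging.WARNING)
term_logger.addHandler(collector)

# Stand-in for a warning that rdflib would emit while parsing.
term_logger.warning("%s does not look like a valid URI", "http://example.org/a b")

for message in collector.messages:
    print(message)
```

    Collecting the messages this way makes it possible to count or review the problematic URIs after a long parse instead of scrolling through console output.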

    3.

    While running rdflib I hit this problem:
    WARNING:rdflib.term:http://www.w3.org/1999/02/22-rdf-syntax-ns# first  does not look like a valid URI, trying to serialize this will break.

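    This stopped the program from running at all, which was frustrating! Besides, how could my code be allowed to produce a warning! So I set about fixing it. After some searching, I found someone suggesting that simply replacing the spaces in the URL with some other character would do. I was all smiles right away: it really works!
    Of course, I can do this only because I do not care what the URL actually is; I treat it as just a string of characters.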
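    A minimal sketch of that workaround, assuming the URIs are available as plain strings before they are handed to rdflib. The function name and the underscore replacement are illustrative choices, not something taken from the answer I found.

```python
import re


def sanitize_uri(uri, replacement="_"):
    """Replace each run of whitespace inside a URI string with `replacement`.

    This changes the identity of the URI, so it is only acceptable when the
    URI is used as an opaque label, as in this post.
    """
    return re.sub(r"\s+", replacement, uri.strip())


# The URI from the warning above, with the stray space before "first".
bad = "http://www.w3.org/1999/02/22-rdf-syntax-ns# first"
clean = sanitize_uri(bad)
print(clean)
```

    Any character that is legal in a URI would serve as the replacement; the only requirement here is that the same substitution is applied consistently, so that identical URIs still map to the same vertex.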
  • Original post: https://www.cnblogs.com/xubenben/p/3766343.html