  • Graduation Project 6-3

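    After several days of running experiments, it feels like a new problem pops up as soon as the last one is solved. Today was no different, so I decided to write things down.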

    1.

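    First, when running GPS (Graph Processing System), a workflow that ran fine yesterday failed once the input file grew, reporting a heap size error. That pins the problem on the input size, the so-called scalability issue. There is very little material on GPS, close to none. A quick search turned up advice to increase the heap and stack space and so on, none of which worked (http://stackoverflow.com/questions/1596009/java-lang-outofmemoryerror-java-heap-space). I then went to the GPS discussion group, where the father of GPS explains what to do: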

    See: https://groups.google.com/forum/#!topic/stanfordgpsusers/62FeHpZijU0

    Hi, Semih.

    I am using GPS to process twitter graph. I got a critical problem. Twitter has a very skewed graph. Some users may have more than 100k followers. Such vertex will trigger huge number of messages.
    GPS generates "java heap overflow" exceptions. I thought that was because the messages are buffered in the memory before sending out. 

    I don't think this is really do to the skew in the data. 100k is still a very small amount of messages. It just means that, that vertex will likely generate or receive 100k * 8 = 800k bytes = 0.8 MB more data. How much memory are you giving to your java virtual machine? How much memory do your machines have. I have about 4GB on each of my machines so I give the following flags to my java scripts: -Xmx3000M. You should change the script file here: https://www.assembla.com/code/phd-projects/subversion/nodes/gps/trunk/scripts/start_gps_node.sh?rev=95 There are two jvm flags XMX_SIZE and XMS_SIZE, which you should adjust.
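      So I went and edited the script, increasing both XMX_SIZE and XMS_SIZE. Then a new problem appeared: the port could not be connected to. I changed the port number and uploaded everything to HDFS again.
      That fixed it!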
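
The arithmetic in the reply above, and the kind of heap flags it mentions, can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this note, the 8 bytes per message comes from the reply, and the 1000 MB initial heap is an assumed value (the reply only gives -Xmx3000M).

```python
def message_buffer_bytes(num_messages, bytes_per_message=8):
    """Rough size of the messages one vertex sends or receives."""
    return num_messages * bytes_per_message


def jvm_heap_flags(xmx_mb, xms_mb):
    """Build the -Xms/-Xmx JVM flags for the given heap sizes in megabytes."""
    if xms_mb > xmx_mb:
        raise ValueError("initial heap (Xms) must not exceed maximum heap (Xmx)")
    return ["-Xms%dM" % xms_mb, "-Xmx%dM" % xmx_mb]


# A vertex with 100k followers: 100k * 8 bytes = 800k bytes = 0.8 MB.
extra = message_buffer_bytes(100000)
print(extra / 1e6, "MB")

# 3000 MB maximum as in the reply; the 1000 MB initial size is an assumption.
print(" ".join(jvm_heap_flags(3000, 1000)))
```

The point of the estimate is that 0.8 MB is tiny next to a heap of a few gigabytes, so the skewed vertex alone does not explain the error; the heap limit itself was too low for the larger input.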
     
     
     

    2.

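    The second problem came up in the afternoon while running RDFlib (I mainly use it to parse RDF files and turn them into graph data). Installing it with easy_install rdflib kept failing at first (probably network trouble these past few days; my proxy was not up to it, heh). Losing patience, I downloaded the source to install it manually, but it took me ages to figure out how! It turns out that, from inside the source directory, python setup.py install is all it takes.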
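
After a manual install like this, a quick way to confirm that the package is visible to the interpreter is to ask Python whether the module can be found. A minimal sketch using only the standard library (the helper name is made up for this note):

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


print("rdflib importable:", is_importable("rdflib"))
```

If this prints False, the install most likely went into a different Python installation than the one running the script.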
    With it installed, I tried some data. Small inputs ran smoothly, but the real data set, around 300 MB, failed with:
    No handlers could be found for logger "rdflib.term"
    A search turned up a solution at http://stackoverflow.com/questions/17393664/no-handlers-could-be-found-for-logger-rdflib-term :
    import logging
    import rdflib
    
    logging.basicConfig()
    # now load your graph
    g = rdflib.Graph()
    g.load("life_the_universe_everything.rdf")
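
As an alternative to logging.basicConfig(), a handler can be attached to just the "rdflib.term" logger, which makes it easy to see what that logger actually emits. The sketch below uses only the standard logging module and does not import rdflib; the helper name and the example message are made up for illustration.

```python
import io
import logging


def attach_handler(logger_name, stream):
    """Attach a stream handler to the named logger and return the handler."""
    logger = logging.getLogger(logger_name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    return handler


# Capture messages in memory here; sys.stderr or an open file would also work.
buffer = io.StringIO()
attach_handler("rdflib.term", buffer)

# Simulate a warning from that logger (rdflib itself is not needed for this).
logging.getLogger("rdflib.term").warning("example warning")
print(buffer.getvalue())
```

Once the logger has a handler, the "No handlers could be found" message goes away and the real warnings become visible.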

    3.

    While running rdflib, I hit this problem:
    WARNING:rdflib.term:http://www.w3.org/1999/02/22-rdf-syntax-ns# first  does not look like a valid URI, trying to serialize this will break.

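    This stops the run outright, which was frustrating. Besides, how could I let my code have warnings! Searching around, I found a suggestion to replace the spaces in the URL with some other character, and it worked right away.
    Of course, I can only do this because I don't care what the actual URL is; I just treat it as a string of characters.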
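
A minimal sketch of that replacement, assuming the URIs are available as plain strings before they are handed to rdflib. The function name and the choice of "_" as the replacement character are my own; any character that is legal in a URI would do. Note that this changes the identifier, so it is only acceptable when the URI is treated as an opaque string, as it is here.

```python
import re


def replace_spaces_in_uri(uri, replacement="_"):
    """Trim the URI and replace each run of inner whitespace with `replacement`."""
    return re.sub(r"\s+", replacement, uri.strip())


bad = "http://www.w3.org/1999/02/22-rdf-syntax-ns# first"
print(replace_spaces_in_uri(bad))
# prints http://www.w3.org/1999/02/22-rdf-syntax-ns#_first
```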
  • Original post: https://www.cnblogs.com/xubenben/p/3766343.html