  • System Design 3: Web Crawlers and Short URLs

    Supplementary material:

    Web-related:

    https://www.zhihu.com/question/22689579

    Crawlers:

    https://www.zhihu.com/question/20899988

    http://www-rohan.sdsu.edu/~gawron/python_for_ss/course_core/book_draft/web/web_intro.html

    https://www.zhihu.com/question/27621722

    http://blog.csdn.net/yiliumu/article/details/21335245

    https://scrapy.org/

    Socket:

    http://www.cnblogs.com/thinksasa/archive/2013/02/26/2934206.html

    http://siddontang.com/2012/09/02/step-by-step-network/

    http://blog.csdn.net/rock_ray/article/details/22046449

    http://coolshell.cn/articles/11564.html

    Regular expressions:

    https://regex101.com/

    https://docs.python.org/2/howto/regex.html

    https://docs.python.org/2/library/re.html

    Condition variables:

    http://www.wuzesheng.com/?p=1668

    http://blog.csdn.net/jnu_simba/article/details/9129939

    http://stackoverflow.com/questions/11000725/implementation-of-condition-variables

    https://en.wikipedia.org/wiki/Monitor_(synchronization)

    http://blog.csdn.net/anonymalias/article/details/9174481

    Semaphores:

    http://c.biancheng.net/cpp/html/2598.html

    http://www.cnblogs.com/lcw/p/3236602.html

    http://www.blogjava.net/fhtdy2004/archive/2009/07/05/285519.html

    http://blog.csdn.net/nhn_devlab/article/details/6117239

    Lock-free queues:

    https://zh.wikipedia.org/wiki/%E7%94%9F%E4%BA%A7%E8%80%85%E6%B6%88%E8%B4%B9%E8%80%85%E9%97%AE%E9%A2%98

    http://www.cnblogs.com/clover-toeic/p/4029269.html

    http://ifeve.com/locks-are-bad/

    http://coolshell.cn/articles/8239.html

    http://coolshell.cn/articles/9169.html

    https://www.infoq.com/articles/High-Performance-Java-Inter-Thread-Communications

    http://blog.csdn.net/ns_code/article/details/17487337

    TinyURL:

    https://goo.gl

    https://www.zhihu.com/topic/19564386/hot

    https://www.hiredintech.com/system-design/the-system-design-process/

    https://developers.google.com/url-shortener/

    Multithreading is meant to improve performance, but the best-performing designs are often single-threaded and lock-free.

    Communication process:

    Internet layering:

    A classic task execution scheduler:

    High-priority tasks can have their execution postponed via Time.
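
    The scheduler note above can be sketched roughly as follows. This is only a minimal illustration, not the design from the original figure: the class name TaskScheduler, the methods submit and pop_ready, and the convention that a smaller priority number means "more urgent" are all assumptions made for this example. Tasks wait in a heap ordered by their run time; once due, they are handed out in priority order, so even a high-priority task is held back until its time arrives.

```python
import heapq
import itertools


class TaskScheduler:
    """Hypothetical sketch: tasks become runnable at run_at, then run by priority."""

    def __init__(self):
        self._heap = []                    # min-heap ordered by run_at
        self._counter = itertools.count()  # tie-breaker that preserves submit order

    def submit(self, name, priority, run_at):
        # Assumption of this sketch: lower priority number = more urgent.
        heapq.heappush(self._heap, (run_at, priority, next(self._counter), name))

    def pop_ready(self, now):
        """Remove and return the names of all tasks due at `now`, most urgent first."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _run_at, priority, seq, name = heapq.heappop(self._heap)
            ready.append((priority, seq, name))
        ready.sort()  # order due tasks by priority, then by submit order
        return [name for _priority, _seq, name in ready]
```

    For instance, a task submitted with priority=1 and run_at=10 is not returned by pop_ready(now=0), even if lower-priority tasks due at time 0 are.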

    Handling 1 million requests per second:

    • queue
    • rate limit
    • more servers + load balancing
    • fully in-memory: redis, memcache
    • asynchronous processing
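
    Of the items above, "rate limit" is the easiest to illustrate in a few lines. The sketch below uses a token bucket, which is one common way to rate-limit and not necessarily the one the original notes had in mind. The class name TokenBucket and the parameters rate, capacity and clock are assumptions for this example, and the sketch is not thread-safe.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter: `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock              # injectable so the limiter can be tested
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self, cost=1):
        """Return True and consume `cost` tokens if the request may proceed."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

    A request for which allow() returns False can be rejected outright or placed on a queue for later, which ties back to the "queue" and "asynchronous" items in the list.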
  • Original post: https://www.cnblogs.com/zcy-backend/p/6699657.html