  • System Design 3: Web Crawlers and Short URLs

    Supplementary material:

    Web-related:

    https://www.zhihu.com/question/22689579

    Crawlers:

    https://www.zhihu.com/question/20899988

    http://www-rohan.sdsu.edu/~gawron/python_for_ss/course_core/book_draft/web/web_intro.html

    https://www.zhihu.com/question/27621722

    http://blog.csdn.net/yiliumu/article/details/21335245

    https://scrapy.org/

    Socket:

    http://www.cnblogs.com/thinksasa/archive/2013/02/26/2934206.html

    http://siddontang.com/2012/09/02/step-by-step-network/

    http://blog.csdn.net/rock_ray/article/details/22046449

    http://coolshell.cn/articles/11564.html

    Regular expressions:

    https://regex101.com/

    https://docs.python.org/2/howto/regex.html

    https://docs.python.org/2/library/re.html

    Condition variables:

    http://www.wuzesheng.com/?p=1668

    http://blog.csdn.net/jnu_simba/article/details/9129939

    http://stackoverflow.com/questions/11000725/implementation-of-condition-variables

    https://en.wikipedia.org/wiki/Monitor_(synchronization)

    http://blog.csdn.net/anonymalias/article/details/9174481

    Semaphores:

    http://c.biancheng.net/cpp/html/2598.html

    http://www.cnblogs.com/lcw/p/3236602.html

    http://www.blogjava.net/fhtdy2004/archive/2009/07/05/285519.html

    http://blog.csdn.net/nhn_devlab/article/details/6117239

    Lock-free queues:

    https://zh.wikipedia.org/wiki/%E7%94%9F%E4%BA%A7%E8%80%85%E6%B6%88%E8%B4%B9%E8%80%85%E9%97%AE%E9%A2%98

    http://www.cnblogs.com/clover-toeic/p/4029269.html

    http://ifeve.com/locks-are-bad/

    http://coolshell.cn/articles/8239.html

    http://coolshell.cn/articles/9169.html

    https://www.infoq.com/articles/High-Performance-Java-Inter-Thread-Communications

    http://blog.csdn.net/ns_code/article/details/17487337

    TinyURL:

    https://goo.gl

    https://www.zhihu.com/topic/19564386/hot

    https://www.hiredintech.com/system-design/the-system-design-process/

    https://developers.google.com/url-shortener/

    Multithreading is meant to improve performance, but the best-performing designs are often single-threaded and lock-free.

    Communication process:

    Internet layering:

    A classic task execution scheduler:

    A high-priority task can still be deferred by giving it a later Time.
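
As a rough illustration of the scheduler idea above, here is a minimal single-threaded sketch. It assumes each task carries a priority and an earliest run time; the names `Scheduler`, `submit`, `poll`, and `run_at` are hypothetical, and the convention that a lower number means higher priority is an assumption of this sketch, not something stated in the notes.

```python
import heapq
import itertools


class Scheduler:
    """Toy scheduler: tasks wait until their run time, then run by priority."""

    def __init__(self):
        self._delayed = []  # heap of (run_at, seq, priority, name)
        self._ready = []    # heap of (priority, seq, name)
        self._seq = itertools.count()  # tie-breaker, keeps insertion order

    def submit(self, name, priority, run_at=0.0):
        # Lower priority number = more urgent (assumption of this sketch).
        heapq.heappush(self._delayed, (run_at, next(self._seq), priority, name))

    def poll(self, now):
        """Return the most urgent task whose run time has arrived, or None."""
        # Move every task that has become due into the ready heap.
        while self._delayed and self._delayed[0][0] <= now:
            _, seq, priority, name = heapq.heappop(self._delayed)
            heapq.heappush(self._ready, (priority, seq, name))
        if not self._ready:
            return None
        return heapq.heappop(self._ready)[2]


scheduler = Scheduler()
scheduler.submit("low-priority-now", priority=5, run_at=0)
scheduler.submit("high-priority-later", priority=0, run_at=10)
first = scheduler.poll(now=1)    # only the low-priority task is due
second = scheduler.poll(now=10)  # the deferred high-priority task is now due
```

The two-heap split keeps the "is it due yet" question separate from the "which is most urgent" question, so a high-priority task with a future run time never blocks tasks that are already runnable.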

    Handling 1,000,000 requests per second:

    • queue
    • rate limit
    • more servers + load balancing
    • fully in-memory storage: redis, memcache
    • asynchronous processing
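
The "rate limit" item in the list above could be realized in many ways. One common option is a token bucket, sketched below. This is only an illustration under stated assumptions: the notes do not say which algorithm is intended, and the class name `TokenBucket` and its parameters `rate` and `capacity` are hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Token-bucket rate limiter: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.last = now           # time of the last refill

    def allow(self, now):
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=2, capacity=2)
burst = [bucket.allow(0.0) for _ in range(3)]  # third request exceeds the burst size
later = bucket.allow(0.5)                      # half a second refills one token
```

Requests that are rejected here would typically be queued or dropped, which is where the "queue" and "asynchronous processing" items in the same list come in.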
  • Original post: https://www.cnblogs.com/zcy-backend/p/6699657.html