  • Cracking the Coding Interview, Chapter 10: Scalability and Memory Limits, Problem 6

    2014-04-24 22:01

    Problem: You have 10 billion URLs. How would you detect whether any of them are duplicates?

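    Solution: Hash each URL to compute a signature, then store the signatures in a key-value database and use it to check for duplicates.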

    Code:

    // 10.6 You have 10 billion URLs. How would you detect duplicates among them?
    // Answer:
    //    1. Use a digest (signature) algorithm to convert each URL string into a numeric checksum.
    //    2. Use that signature as the hash key. If memory allows, use an in-memory hash table to detect duplicates.
    //    3. If the data won't fit in memory, use a key-value database instead. Data on the order of 10 billion keys (about 80 GB at 8 bytes per signature) should be manageable on one machine, so I won't distribute the work across other computers.
    int main()
    {
        return 0;
    }
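
The listing above only describes the idea in comments. As a rough illustration, the in-memory case from step 2 might be sketched as follows. This is not the original author's code: the function names `signature` and `findDuplicates` are hypothetical, and 64-bit FNV-1a is used only as a stand-in because the answer does not name a specific signature algorithm. Since two different URLs can share a signature, the sketch compares the full strings before reporting a duplicate.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// 64-bit FNV-1a hash, used here as the URL "signature".
// Assumption: any reasonably uniform hash could be substituted.
std::uint64_t signature(const std::string& url) {
    std::uint64_t h = 0xcbf29ce484222325ULL;   // FNV-1a 64-bit offset basis
    for (unsigned char c : url) {
        h ^= c;
        h *= 0x100000001b3ULL;                 // FNV-1a 64-bit prime
    }
    return h;
}

// Returns every URL that repeats an earlier URL in the input, in input order.
std::vector<std::string> findDuplicates(const std::vector<std::string>& urls) {
    // signature -> indices of the distinct URLs already seen with that signature
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> seen;
    std::vector<std::string> duplicates;

    for (std::size_t i = 0; i < urls.size(); ++i) {
        std::vector<std::size_t>& bucket = seen[signature(urls[i])];

        // Compare full strings so that a hash collision is not
        // mistaken for a duplicate.
        bool isDuplicate = false;
        for (std::size_t j : bucket) {
            if (urls[j] == urls[i]) {
                isDuplicate = true;
                break;
            }
        }

        if (isDuplicate) {
            duplicates.push_back(urls[i]);
        } else {
            bucket.push_back(i);
        }
    }
    return duplicates;
}
```

For the case in step 3, where the data does not fit in memory, the `std::unordered_map` would presumably be replaced by lookups and inserts against a key-value store keyed on the same signature; the control flow stays the same.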
  • Original article: https://www.cnblogs.com/zhuli19901106/p/3687456.html