  • LeetCode #274 H-Index

    Question

    Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.

    According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."

    Example:

    Input: citations = [3,0,6,1,5]
    Output: 3 
    Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had 
                 received 3, 0, 6, 1, 5 citations respectively. 
                 Since the researcher has 3 papers with at least 3 citations each and the remaining 
                 two with no more than 3 citations each, her h-index is 3.
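    The definition can also be checked directly, which is handy as a reference when testing faster versions. The sketch below is not from the original post; the function name h_index_by_definition is hypothetical, and it assumes the usual reading of the problem, namely that the answer is the largest h for which at least h papers have h or more citations. It runs in O(n^2) time.

```python
from typing import List


def h_index_by_definition(citations: List[int]) -> int:
    """Brute-force h-index: try every candidate h from n down to 0."""
    n = len(citations)
    for h in range(n, -1, -1):
        # Count the papers that have at least h citations.
        at_least_h = sum(1 for c in citations if c >= h)
        if at_least_h >= h:
            return h
    return 0  # not reached: h = 0 always passes the check


print(h_index_by_definition([3, 0, 6, 1, 5]))
```

    For the example input, h = 5 and h = 4 fail (only two papers have that many citations), while h = 3 succeeds.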

    Sort + scan: O(n log n)

    The wiki (see the reference links) gives the following method of calculation:

    First we order the values of f from the largest to the lowest value. Then, we look for the last position in which f is greater than or equal to the position (we call h this position).

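    In other words, sort in descending order and scan for the last position i with citations[i] >= i; then h = i. (Positions here are 1-based. With 0-based array indices the test is citations[i] >= i + 1 and h = i + 1.)

    Why does this work? Take an example: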

    citations (sorted):  6 5 3 1 0

    position i:          1 2 3 4 5

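    The first three papers satisfy citations[i] >= i, and because the list is in descending order, each of them has at least 3 citations. The last two satisfy citations[i] < i where i is at least 4, so each of them has citations[i] <= 3.

    Of course, comparing directly against the definition in the problem statement also works and does not increase the time complexity, but it makes the later improvement harder to reach.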
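    A minimal sketch of the sort-and-scan approach described above. The original post gives no code for it, so the function name h_index_sorted is an assumption; positions are 1-based to match the wiki wording.

```python
from typing import List


def h_index_sorted(citations: List[int]) -> int:
    """Sort descending, then find the last 1-based position i with value >= i."""
    ordered = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ordered, start=1):
        if count >= position:
            h = position
        else:
            # Values only get smaller and positions only get larger,
            # so no later position can satisfy the test.
            break
    return h


print(h_index_sorted([3, 0, 6, 1, 5]))
```

    The sort dominates the cost, which gives the O(n log n) bound.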

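    Bucket sort: O(n)

    Bucket sort brings this down to O(n). The feature of this problem worth noticing is that h always lies in [0, n], so bucket sort applies: any paper with more than n citations can simply be counted in bucket n.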

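    from typing import List


    class Solution:
        def hIndex(self, citations: List[int]) -> int:
            length = len(citations)
            # freq_list[i] starts as the number of papers cited exactly i times;
            # papers cited more than `length` times go into the last bucket.
            freq_list = [0] * (length + 1)

            # First pass: fill the buckets.
            for c in citations:
                freq_list[min(c, length)] += 1

            # Second pass: walk from high to low, turning each bucket into the
            # number of papers cited at least i times, and stop at the first
            # (that is, the largest) i that has at least i such papers.
            last = 0
            for i in range(length, -1, -1):
                freq_list[i] += last
                last = freq_list[i]
                if freq_list[i] >= i:
                    return i
            return 0  # not reached: i = 0 always satisfies the check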

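    The key to bucket sort is setting up a mapping; radix sort with base 10, for example, uses the mapping f(x) = x mod 10. We first define the bucket:

    freq_list[i]: the number of papers cited at least i times

    Computing freq_list takes two passes: the first counts how many papers were cited exactly i times, and the second counts how many were cited at least i times.

    Note that if x papers have at least 3 citations, then the number y of papers with at least 2 citations equals x plus the number of papers with exactly 2 citations, i.e. y = x + freq_list[i], where freq_list[i] still holds the exact count from the first pass. The second pass can therefore accumulate the counts and check for the answer in a single loop.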
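    To make the two passes visible, here is a small sketch that builds the "at least i" counts in full instead of returning early. The helper name papers_with_at_least is hypothetical and not part of the original solution; it uses the same capping of large counts into bucket n.

```python
from typing import List


def papers_with_at_least(citations: List[int]) -> List[int]:
    """Return a list where entry i is the number of papers cited at least i times."""
    n = len(citations)

    # First pass: exact[i] = papers cited exactly i times (capped at n).
    exact = [0] * (n + 1)
    for c in citations:
        exact[min(c, n)] += 1

    # Second pass: suffix sums, using y = x + (papers cited exactly i times).
    at_least = exact[:]
    for i in range(n - 1, -1, -1):
        at_least[i] += at_least[i + 1]
    return at_least


counts = papers_with_at_least([3, 0, 6, 1, 5])
print(counts)
# The h-index is the largest i with at least i papers cited i or more times.
print(max(i for i, k in enumerate(counts) if k >= i))
```

    For the example input the counts are [5, 4, 3, 3, 2, 2], and the largest index i with counts[i] >= i is 3.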

    References:

    https://en.wikipedia.org/wiki/H-index 

    https://www.cnblogs.com/zmyvszk/p/5619051.html

     
  • Original post: https://www.cnblogs.com/sbj123456789/p/12174052.html