Question
Given an array of citations sorted in ascending order (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.
According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."
Example:
Input: citations = [0,1,3,5,6]
Output: 3
Explanation: [0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, her h-index is 3.
Note:
If there are several possible values for h, the maximum one is taken as the h-index.
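Binary search, O(log n)
On the surface this problem is a follow-up to H-Index, but it is really a classic binary search problem.
Once citations is sorted in descending order, it is enough to find the position where citations[i] < i. Here the array is sorted in ascending order, so the actual code differs slightly: with n papers, we look for the first index i such that citations[i] >= n - i, and the h-index is n - i.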
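To make the searched-for position concrete before turning to binary search, here is a minimal linear-scan sketch of the same condition. The function name h_index_linear is hypothetical and chosen only for illustration; it assumes the input is already sorted in ascending order, as the problem states.

```python
from typing import List


def h_index_linear(citations: List[int]) -> int:
    """O(n) reference version; assumes citations is sorted ascending."""
    n = len(citations)
    for i, c in enumerate(citations):
        # n - i papers sit at index i or later; all of them have >= c citations.
        if c >= n - i:
            return n - i
    # No paper satisfies the condition (empty input or all zeros).
    return 0
```

The predicate c >= n - i is false for a prefix of the array and true for the rest, which is exactly the shape binary search needs.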
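The key to this problem, however, is not thinking of binary search but getting its fiddly details right. Here is the very concise, general form used in the C++ standard library's <algorithm>, which avoids the usual bugs: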
def lower_bound(array, first, last, value):
    # Return the position of the first element in [first, last) that is not less than value
    while first < last:  # the search range [first, last) is not empty
        mid = first + (last - first) // 2  # avoids overflow
        if array[mid] < value:
            first = mid + 1
        else:
            last = mid
    return first  # last works too, since they coincide once [first, last) is empty
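Writing it this way has several advantages:
1. It is concise and easy to remember: first = mid + 1 is the only place where an index is adjusted by one.
2. It is robust: it still works when the range is empty, when the answer does not exist, when there are duplicate elements, and when searching for open or closed upper or lower bounds.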
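The robustness claims above can be checked with a small sketch. It repeats lower_bound so the snippet runs on its own; upper_bound is an assumed companion (not in the original text) that differs only in using <= instead of <, and the sample list data is made up for illustration.

```python
def lower_bound(array, first, last, value):
    """First index in [first, last) whose element is >= value, else last."""
    while first < last:
        mid = first + (last - first) // 2
        if array[mid] < value:
            first = mid + 1
        else:
            last = mid
    return first


def upper_bound(array, first, last, value):
    """First index in [first, last) whose element is > value, else last."""
    while first < last:
        mid = first + (last - first) // 2
        if array[mid] <= value:  # the only change from lower_bound
            first = mid + 1
        else:
            last = mid
    return first


data = [1, 2, 2, 2, 5]
empty_result = lower_bound([], 0, 0, 3)      # empty range: returns first (0)
missing_result = lower_bound(data, 0, 5, 9)  # nothing >= 9: returns last (5)
dup_first = lower_bound(data, 0, 5, 2)       # index of the first 2
dup_past = upper_bound(data, 0, 5, 2)        # one past the last 2
```

With these two helpers, the half-open range [dup_first, dup_past) covers every occurrence of the value, so its length is the number of duplicates.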
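Key points:
1. The search range is half-open: closed on the left and open on the right, [first, last).
2. Only first is set to mid + 1, while last is set to mid (this avoids an infinite loop; see the references for details).
3. mid = first + (last - first) // 2 (this avoids overflow and also works for pointers and iterators; Python integers are arbitrary-precision, so overflow is not a concern there).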
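One way to see why these update rules terminate is to record the range after every step. The sketch below is an instrumented variant written for illustration only; lower_bound_trace and the trace list are hypothetical names, not part of any library.

```python
def lower_bound_trace(array, value):
    """lower_bound over the whole array, also returning each (first, last) pair."""
    first, last = 0, len(array)
    trace = [(first, last)]
    while first < last:
        mid = first + (last - first) // 2  # first <= mid < last whenever first < last
        if array[mid] < value:
            first = mid + 1                # strictly raises first
        else:
            last = mid                     # strictly lowers last, since mid < last
        trace.append((first, last))
    return first, trace


pos, trace = lower_bound_trace([0, 1, 3, 5, 6], 3)
widths = [hi - lo for lo, hi in trace]
```

Because mid is rounded down, it is always strictly less than last, so either branch shrinks the range by at least one element. Setting first = mid instead would leave a range of length one unchanged and loop forever.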
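Finally, apply this method to the problem:
from typing import List

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        length = len(citations)
        left = 0
        right = length
        # Find the first index whose citation count is >= the number of papers from it onward
        while left < right:
            mid = left + (right - left) // 2
            if citations[mid] < length - mid:
                left = mid + 1
            else:
                right = mid
        return length - left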
Very concise.
Python also ships its own binary search implementation, the bisect module; see the references.
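As a rough sketch of how bisect could be applied to this problem: the condition citations[i] >= n - i is the same as citations[i] + i >= n, and citations[i] + i is strictly increasing for an ascending input, so bisect.bisect_left can search it directly. The class _ShiftedView and the function h_index_bisect are hypothetical helpers introduced here for illustration; the only library call used is bisect.bisect_left.

```python
import bisect
from typing import List


class _ShiftedView:
    """Lazy read-only sequence whose i-th item is citations[i] + i (illustrative helper)."""

    def __init__(self, citations: List[int]) -> None:
        self._citations = citations

    def __len__(self) -> int:
        return len(self._citations)

    def __getitem__(self, i: int) -> int:
        return self._citations[i] + i


def h_index_bisect(citations: List[int]) -> int:
    """h-index of an ascending citations list via bisect_left; O(log n) item accesses."""
    n = len(citations)
    # First index i with citations[i] + i >= n, or n if there is none.
    i = bisect.bisect_left(_ShiftedView(citations), n)
    return n - i
```

The view avoids building a shifted copy of the list, which would cost O(n) and defeat the purpose of the binary search.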
References: