You are given a large sample of integers in the range [0, 255]
. Since the sample is so large, it is represented by an array count
where count[k]
is the number of times that k
appears in the sample.
Calculate the following statistics:
minimum
: The minimum element in the sample.maximum
: The maximum element in the sample.mean
: The average of the sample, calculated as the total sum of all elements divided by the total number of elements.median
:- If the sample has an odd number of elements, then the
median
is the middle element once the sample is sorted. - If the sample has an even number of elements, then the
median
is the average of the two middle elements once the sample is sorted.
- If the sample has an odd number of elements, then the
mode
: The number that appears the most in the sample. It is guaranteed to be unique.
Return the statistics of the sample as an array of floating-point numbers [minimum, maximum, mean, median, mode]
. Answers within 10-5
of the actual answer will be accepted.
Example 1:
Input: count = [0,1,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Output: [1.00000,3.00000,2.37500,2.50000,3.00000]
Explanation: The sample represented by count is [1,2,2,2,3,3,3,3].
The minimum and maximum are 1 and 3 respectively.
The mean is (1+2+2+2+3+3+3+3) / 8 = 19 / 8 = 2.375.
Since the size of the sample is even, the median is the average of the two middle elements 2 and 3, which is 2.5.
The mode is 3 as it appears the most in the sample.
Example 2:
Input: count = [0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Output: [1.00000,4.00000,2.18182,2.00000,1.00000]
Explanation: The sample represented by count is [1,1,1,1,2,2,2,3,3,4,4].
The minimum and maximum are 1 and 4 respectively.
The mean is (1+1+1+1+2+2+2+3+3+4+4) / 11 = 24 / 11 = 2.18181818... (for display purposes, the output shows the rounded number 2.18182).
Since the size of the sample is odd, the median is the middle element 2.
The mode is 1 as it appears the most in the sample.
Constraints:
count.length == 256
0 <= count[i] <= 109
1 <= sum(count) <= 109
- The mode of the sample that
count
represents is unique.
这道题说是有很多在0到 255 中的整数,由于重复的数字太多了,所以这里采用的是统计每个数字出现的个数的方式,用数组 count 来表示,其中 count[i] 表示数字i出现的次数。现在让统计原始数组中的最大值,最小值,平均值,中位数,和众数。这里面的最大最小值很好求,最小值就是 count 数组中第一个不为0的位置,最大值就是 count 数组中最后一个不为0的位置。最小值 mn 初始化为 256,在遍历 count 数组的过程中,遇到不为0的数字时,若此时 mn 为 256,则更新为坐标i。最大值 mx 直接每次更新为值不为0的坐标i即可。平均值也好求,只要求出所有的数字之和,跟数字的个数相除就行了,注意由于数字之和可能很大,需要用 double 来表示。众数也不难求,只要找出 count 数组中的最大值,则其坐标就是众数。比较难就是中位数了,由于数组的个数可奇可偶,中位数的求法不同,这里为了统一,采用一个小 trick,比如数组 1,2,3 和 1,2,3,4,可以用坐标为 (n-1)/2 和 n/2 的两个数字求平均值得到,对于长度为奇数的数组,这两个坐标表示的是相同的数字。这里由于是统计数组,所以要找的两个位置是 (cnt+1)/2 和 (cnt+2)/2,其中 cnt 是所有数字的个数。再次遍历 count 数组,使用 cur 来累计当前经过的数字个数,若 cur 小于 first,且 cur 加上 count[i] 大于等于 first,说明当前数字i即为所求,加上其的一半到 median。同理,若 cur 小于 second,cur 加上 count[i] 大于等于 second,说明当前数字i即为所求,加上其的一半到 median 即可,参见代码如下:
class Solution {
public:
vector<double> sampleStats(vector<int>& count) {
double mn = 256, mx = 0, mean = 0, median = 0, sum = 0;
int cnt = 0, mode = 0;
for (int i = 0; i < count.size(); ++i) {
if (count[i] == 0) continue;
if (mn == 256) mn = i;
mx = i;
sum += (double)i * count[i];
cnt += count[i];
if (count[i] > count[mode]) mode = i;
}
mean = sum / cnt;
int first = (cnt + 1) / 2, second = (cnt + 2) / 2, cur = 0;
for (int i = 0; i < count.size(); ++i) {
if (cur < first && cur + count[i] >= first) median += i / 2.0;
if (cur < second && cur + count[i] >= second) median += i / 2.0;
cur += count[i];
}
return {mn, mx, sum / cnt, median, (double)mode};
}
};
Github 同步地址:
https://github.com/grandyang/leetcode/issues/1093
参考资料:
https://leetcode.com/problems/statistics-from-a-large-sample/
https://leetcode.com/problems/statistics-from-a-large-sample/discuss/317626/Python-Solution