  • Learning Python, Part 4 (Processing Data)

    A data-processing example from Head First Python.

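    Four U10 athletes have recorded times for the 600 m. The task is to extract each athlete's three fastest times. Below are the nine recorded times for each of the four athletes.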

    James

    2-34,3:21,2.34,2.45,3.01,2:01,2:01,3:10,2-22

    Julie

    2.59,2.11,2:11,2:23,3-10,2-23,3:10,3.21,3-21

    Mikey

    2:22,3.01,3:01,3.02,3:02,3.02,3:22,2.49,2:38

    Sarah

    2:58,2.58,2:39,2-25,2-55,2:54,2.18,2:55,2:55
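
    To try the example, the four lines above first have to exist as files. The following is a minimal sketch for creating them, assuming each file holds a single comma-separated line and is named james.txt, julie.txt, mikey.txt and sarah.txt (the names the script opens). The helper write_sample_files and the SAMPLE_DATA dictionary are names made up for this illustration.

```python
from pathlib import Path

# Raw times exactly as listed above, keyed by the file names the script opens.
SAMPLE_DATA = {
    'james.txt': '2-34,3:21,2.34,2.45,3.01,2:01,2:01,3:10,2-22',
    'julie.txt': '2.59,2.11,2:11,2:23,3-10,2-23,3:10,3.21,3-21',
    'mikey.txt': '2:22,3.01,3:01,3.02,3:02,3.02,3:22,2.49,2:38',
    'sarah.txt': '2:58,2.58,2:39,2-25,2-55,2:54,2.18,2:55,2:55',
}

def write_sample_files(directory='.'):
    """Write one single-line, comma-separated file per athlete (assumed layout)."""
    target = Path(directory)
    for filename, line in SAMPLE_DATA.items():
        (target / filename).write_text(line + '\n')
```

    Calling write_sample_files() with no argument writes the four files into the current directory, which is where the script looks for them.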

    The code is as follows:

    def sanitize(time_string):
        if '-' in time_string:
            splitter = '-'
        elif ':' in time_string:
            splitter = ':'
        else:
            return(time_string)
        (mins, secs) = time_string.split(splitter)
        return(mins + '.' + secs)
    
    def get_coach_data(filename): 
        try:
            with open(filename) as f:
                data = f.readline() 
            return(data.strip().split(','))
        except IOError as ioerr:
            print('File error: ' + str(ioerr))
            return(None)
    
    james = get_coach_data('james.txt')
    julie = get_coach_data('julie.txt')
    mikey = get_coach_data('mikey.txt')
    sarah = get_coach_data('sarah.txt')
    
    print(sorted(set([sanitize(t) for t in james]))[0:3])
    print(sorted(set([sanitize(t) for t in julie]))[0:3])
    print(sorted(set([sanitize(t) for t in mikey]))[0:3])
    print(sorted(set([sanitize(t) for t in sarah]))[0:3])

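    First, a function named sanitize is defined to clean the data by normalizing the '-' and ':' separators to '.'. Note that a set cannot contain duplicate entries, and that sorted returns a new sorted list without modifying the original.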
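
    The two behaviours mentioned above can be checked with a short sketch. The list of times here is a small sample picked for illustration, and list.sort() is shown only for contrast with sorted().

```python
times = ['2.34', '3.21', '2.34', '2.01', '2.01']

unique = set(times)          # duplicates collapse: 3 distinct values remain
ordered = sorted(unique)     # a new list; `unique` and `times` are untouched

print(ordered)               # ['2.01', '2.34', '3.21']
print(times)                 # ['2.34', '3.21', '2.34', '2.01', '2.01']

in_place = list(times)       # work on a copy so `times` stays as it was
result = in_place.sort()     # list.sort() sorts in place and returns None
print(result)                # None
print(in_place)              # ['2.01', '2.01', '2.34', '2.34', '3.21']
```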

    Output:

    ['2.01', '2.22', '2.34']
    ['2.11', '2.23', '2.59']
    ['2.22', '2.38', '2.49']
    ['2.18', '2.25', '2.39']
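
    One caveat worth noting: the times are sorted as strings, so the order is lexicographic. That gives the right result here because every time has a single-digit minute and two-digit seconds. The sketch below shows what could go wrong otherwise and one possible remedy. The value '10:05' is invented purely for illustration and to_seconds is a hypothetical helper, not part of the original code.

```python
def sanitize(time_string):
    """Same normalisation as above: turn '2-34' or '2:34' into '2.34'."""
    for splitter in ('-', ':'):
        if splitter in time_string:
            mins, secs = time_string.split(splitter)
            return mins + '.' + secs
    return time_string

def to_seconds(time_string):
    """Hypothetical helper: convert 'M.SS' to a whole number of seconds."""
    mins, secs = sanitize(time_string).split('.')
    return int(mins) * 60 + int(secs)

raw = ['10:05', '2-34', '2.01', '2:01']   # '10:05' is a made-up value
clean = set(sanitize(t) for t in raw)

print(sorted(clean))                  # ['10.05', '2.01', '2.34'] (string order)
print(sorted(clean, key=to_seconds))  # ['2.01', '2.34', '10.05'] (numeric order)
```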

    Example 2:

    A second set of results is provided. Each record now contains the athlete's name and date of birth, followed by the times.

    Print each athlete's name and three fastest times.

    def sanitize(time_string):
        if '-' in time_string:
            splitter = '-'
        elif ':' in time_string:
            splitter = ':'
        else:
            return(time_string)
        (mins, secs) = time_string.split(splitter)
        return(mins + '.' + secs)
    
    def get_coach_data(filename):
        try:
            with open(filename) as f:
                data = f.readline()
            templ = data.strip().split(',')
            return({'Name' : templ.pop(0),
                    'DOB'  : templ.pop(0),
                    'Times': str(sorted(set([sanitize(t) for t in templ]))[0:3])})
        except IOError as ioerr:
            print('File error: ' + str(ioerr))
            return(None)
        
    james = get_coach_data('james2.txt')
    julie = get_coach_data('julie2.txt')
    mikey = get_coach_data('mikey2.txt')
    sarah = get_coach_data('sarah2.txt')
    
    print(james['Name'] + "'s fastest times are: " + james['Times'])
    print(julie['Name'] + "'s fastest times are: " + julie['Times'])
    print(mikey['Name'] + "'s fastest times are: " + mikey['Times'])
    print(sarah['Name'] + "'s fastest times are: " + sarah['Times'])

    The code above uses {} to define a dictionary (a map-like data structure) whose keys are Name, DOB, and Times.
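
    As a small sketch of how such a dictionary is built and read, assuming a record laid out as 'name,dob,time,time,...': the record below is made up for illustration, and unlike the code above it keeps Times as a list rather than a string.

```python
# A made-up record in the assumed 'name,dob,time,time,...' layout.
line = 'Example Athlete,2000-01-01,2-34,3:21,2.34'
fields = line.strip().split(',')

athlete = {
    'Name': fields.pop(0),   # pop(0) removes and returns the first item
    'DOB': fields.pop(0),    # the former second item is now first
    'Times': fields,         # whatever remains is the list of times
}

print(athlete['Name'])       # Example Athlete
print(athlete['Times'])      # ['2-34', '3:21', '2.34']
print(athlete.get('Club'))   # None: .get() avoids a KeyError for a missing key
print('name' in athlete)     # False: keys are case-sensitive
```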

    The same thing can also be implemented with a class, similar to a JavaBean in Java:

    def sanitize(time_string):
        if '-' in time_string:
            splitter = '-'
        elif ':' in time_string:
            splitter = ':'
        else:
            return(time_string)
        (mins, secs) = time_string.split(splitter)
        return(mins + '.' + secs)
    
    class AthleteList(list):
    
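        def __init__(self, a_name, a_dob=None, a_times=None):
            # Initialise this instance as an empty list; list.__init__([]) would
            # only initialise a throwaway list, not self.
            list.__init__(self)
            self.name = a_name
            self.dob = a_dob
            # A None default avoids sharing one mutable [] across all calls.
            if a_times is not None:
                self.extend(a_times)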
    
        def top3(self):
            return(sorted(set([sanitize(t) for t in self]))[0:3])
            
    def get_coach_data(filename):
        try:
            with open(filename) as f:
                data = f.readline()
            templ = data.strip().split(',')
            return(AthleteList(templ.pop(0), templ.pop(0), templ))
        except IOError as ioerr:
            print('File error: ' + str(ioerr))
            return(None)
        
    james = get_coach_data('james2.txt')
    julie = get_coach_data('julie2.txt')
    mikey = get_coach_data('mikey2.txt')
    sarah = get_coach_data('sarah2.txt')
    
    print(james.name + "'s fastest times are: " + str(james.top3()))
    print(julie.name + "'s fastest times are: " + str(julie.top3()))
    print(mikey.name + "'s fastest times are: " + str(mikey.top3()))
    print(sarah.name + "'s fastest times are: " + str(sarah.top3()))

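    Note that the first parameter of every instance method in a class must be the instance itself, which by convention is named self.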
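
    A minimal sketch of why that first parameter is needed: calling a method on an instance is equivalent to calling the function on the class and passing the instance explicitly. The Athlete class and its greet method are invented for this illustration.

```python
class Athlete:
    def __init__(self, name):
        self.name = name          # 'self' is the instance being initialised

    def greet(self):
        return 'Hello, ' + self.name

a = Athlete('James')
print(a.greet())            # Hello, James
print(Athlete.greet(a))     # Hello, James (same call, instance passed explicitly)
```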

  • Original article: https://www.cnblogs.com/pingh/p/3439601.html