  • Summarizing a university's graduate admission data

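    Goal: for each college (学院) of the university, compute the share of admitted students whose undergraduate school is a first-tier ("一本") school, and the share admitted on their first choice ("第一志愿").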

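    An undergraduate school named "XX学院" counts as second-tier (二本), and one named "XX学校" counts as first-tier (一本); in the code, any name that does not contain "学院" is treated as first-tier. In the remarks column, "一志愿" means the student was admitted on their first choice, and "调剂" means they were admitted after adjustment (transfer).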
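
    The two labelling rules above can also be written without explicit loops. This is only a sketch on made-up rows (the school names are placeholders, not real data); it assumes the same column names as the original CSV, "毕业学校" and "备注".

```python
import pandas as pd

# Toy rows; the school names are hypothetical placeholders.
df = pd.DataFrame({
    "毕业学校": ["甲大学", "乙学院", "丙大学"],
    "备注": ["一志愿", "调剂", "一志愿"],
})

# 0 if the school name contains "学院" (second-tier), otherwise 1 (first-tier).
is_second_tier = df["毕业学校"].str.contains("学院", regex=False)
df["毕业school"] = (~is_second_tier).astype(int)

# 1 for first-choice admission, 0 for adjustment; any other remark becomes NaN.
df["volunteer"] = df["备注"].map({"一志愿": 1, "调剂": 0})

print(df)
```

    Using `map` with a dictionary keeps the new column the same length as the table even if a remark is neither "一志愿" nor "调剂".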

    Raw data:

    Code:

    import pandas as pd

    if __name__ == '__main__':
        df = pd.read_csv("2020.csv", encoding='gbk')

        # Label each undergraduate school: 0 if its name contains "学院"
        # (second-tier), otherwise 1 (first-tier).
        list_school = []
        for school in df["毕业学校"].tolist():
            if "学院" in school:
                list_school.append(0)
            else:
                list_school.append(1)
        df['毕业school'] = list_school

        # Label each admission: 1 for first choice ("一志愿"), 0 for adjustment ("调剂").
        list_volunteer = []
        for volunteer in df["备注"].tolist():
            if volunteer == '一志愿':
                list_volunteer.append(1)
            elif volunteer == '调剂':
                list_volunteer.append(0)
            else:
                # Any other remark is left missing so the list stays the same
                # length as df; sum() and count() skip missing values.
                list_volunteer.append(None)
        df["volunteer"] = list_volunteer
        # print(df.head(10))

        college_number = list(range(1, 25))  # college codes 1..24

        value1 = []  # share of admitted students from first-tier schools
        value2 = []  # share of students admitted on first choice
        sum_school_sum_list = []
        sum_school_len_list = []
        sum_volunteer_sum_list = []
        for i in college_number:
            rows = df[df['学院代码'] == i]
            sum_school_sum = rows["毕业school"].sum()
            sum_school_len = rows["毕业school"].count()
            sum_volunteer_sum = rows["volunteer"].sum()
            sum_volunteer_len = rows["volunteer"].count()
            sum_school_sum_list.append(sum_school_sum)
            sum_school_len_list.append(sum_school_len)
            sum_volunteer_sum_list.append(sum_volunteer_sum)
            value1.append(sum_school_sum / sum_school_len)
            value2.append(sum_volunteer_sum / sum_volunteer_len)
        # print(value1)

        # Look up the name of each college from its first matching row.
        college_name = []
        for k in college_number:
            for i, j in zip(df["学院代码"], df["学院名称"]):
                if i == k:
                    college_name.append(j)
                    break

        # The original dict listed "总录取人数" twice; a dict keeps only one
        # entry per key, so the duplicate is dropped here.
        data_dic = {
            "学院代码": college_number,
            "学院名称": college_name,
            "一本学校录取人数": sum_school_sum_list,
            "总录取人数": sum_school_len_list,
            "一本录取率": value1,
            "第一志愿录取人数": sum_volunteer_sum_list,
            "一志愿录取率": value2}
        pd_value = pd.DataFrame(data_dic)
        pd_value.to_csv("2020年录取率详情.csv", encoding='gbk')
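
    The same per-college statistics can probably be obtained more compactly with `groupby`, which also avoids hard-coding the range of college codes. The sketch below is an alternative, not the approach used in the code above; the helper name `summarize` and the toy rows are hypothetical, and it assumes the columns "毕业school" and "volunteer" have already been added to the table.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-college counts and rates (hypothetical helper)."""
    grouped = df.groupby(["学院代码", "学院名称"])
    # Named aggregation: output column -> (input column, aggregation).
    summary = grouped.agg(**{
        "一本学校录取人数": ("毕业school", "sum"),
        "总录取人数": ("毕业school", "count"),
        "一本录取率": ("毕业school", "mean"),
        "第一志愿录取人数": ("volunteer", "sum"),
        "一志愿录取率": ("volunteer", "mean"),
    })
    return summary.reset_index()


# Toy rows standing in for the real data, which is not public.
toy = pd.DataFrame({
    "学院代码": [1, 1, 2, 2, 2],
    "学院名称": ["A学院", "A学院", "B学院", "B学院", "B学院"],
    "毕业school": [1, 0, 1, 1, 0],
    "volunteer": [1, 1, 0, 1, 0],
})
summary = summarize(toy)
print(summary)
```

    Because both labels are 0/1, the mean of each column within a group is the same as the sum divided by the count, which is how the rates are computed in the loop version.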

    Result:

    The data source is not publicly available.

  • Original URL: https://www.cnblogs.com/lgwdx/p/14240895.html