
Study Assistant Development (2): Sorting the Table

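The previous installment only produced the list of review plans; the plans were not sorted by date. The goal this time is to sort them by date.
The column sorting is done with pandas. Preliminary result: https://github.com/Zeraka/mytodolisthelper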

pandas can read a file directly and turn its contents into a DataFrame; the table data is then processed by manipulating that DataFrame.
Official pandas documentation:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

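pandas reads the data in a txt file with the read_csv() method. The simplest call looks like this:

pandas.read_csv(fileName, sep=r"\s+") # if sep is not given, the default separator is a comma

With the regular expression \s+ as the separator, each line is split at every run of whitespace.
After reading the original todolist with pandas, printing it gives the following: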

 pythonshell执行和进程管理  2020-09-17  00:59:09  notdone
0   pythonshell执行和进程管理  2020-09-17  01:29:09  notdone
1   pythonshell执行和进程管理  2020-09-18  00:29:09     done
2   pythonshell执行和进程管理  2020-09-21  00:29:09  notdone
3   pythonshell执行和进程管理  2020-09-24  00:29:09  notdone
4   pythonshell执行和进程管理  2020-10-17  00:29:09  notdone
5   pythonshell执行和进程管理  2020-11-16  00:29:09  notdone
6   pythonshell执行和进程管理  2020-12-16  00:29:09  notdone
7     16进制转换10进制和字符串相乘  2020-09-18  19:07:17  notdone
8     16进制转换10进制和字符串相乘  2020-09-18  19:37:17  notdone
9     16进制转换10进制和字符串相乘  2020-09-19  18:37:17  notdone
10    16进制转换10进制和字符串相乘  2020-09-22  18:37:17  notdone
11    16进制转换10进制和字符串相乘  2020-09-25  18:37:17  notdone
12    16进制转换10进制和字符串相乘  2020-10-18  18:37:17  notdone
13    16进制转换10进制和字符串相乘  2020-11-17  18:37:17  notdone
14    16进制转换10进制和字符串相乘  2020-12-17  18:37:17  notdone

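The date has been split into two parts. This is caused by the separator '\s+', which also matches the space between the date and the time; the separator here should be '\t+'. The 'python' engine is also used instead of the C engine:
add engine='python' to the read_csv call, because the C engine does not support regular-expression separators (apart from the special case '\s+').

df = pd.read_csv(fileName, sep=r"\t+", engine="python")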
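The difference between the two separators can be checked on a small in-memory sample. This is only a sketch: the two rows below are made-up data in the same layout as the todo list, and io.StringIO stands in for the real file.

```python
import io
import pandas as pd

# Two tab-separated sample rows (made-up content, same layout as the todo list).
sample = (
    "taskA\t2020-09-17 00:59:09\tnotdone\n"
    "taskA\t2020-09-18 00:29:09\tdone\n"
)

# r"\s+" splits on every run of whitespace, including the space inside the timestamp.
by_whitespace = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None)

# r"\t+" splits on tabs only, so the timestamp stays in one column.
by_tab = pd.read_csv(io.StringIO(sample), sep=r"\t+", header=None, engine="python")

print(by_whitespace.shape)  # (2, 4): date and time are separate columns
print(by_tab.shape)         # (2, 3): date and time share one column
```

header=None is passed here only so that both sample rows count as data.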

This gives:

pythonshell执行和进程管理  2020-09-17 00:59:09  notdone
0   pythonshell执行和进程管理  2020-09-17 01:29:09  notdone
1   pythonshell执行和进程管理  2020-09-18 00:29:09     done
2   pythonshell执行和进程管理  2020-09-21 00:29:09  notdone
3   pythonshell执行和进程管理  2020-09-24 00:29:09  notdone
4   pythonshell执行和进程管理  2020-10-17 00:29:09  notdone
5   pythonshell执行和进程管理  2020-11-16 00:29:09  notdone
6   pythonshell执行和进程管理  2020-12-16 00:29:09  notdone
7     16进制转换10进制和字符串相乘  2020-09-18 19:07:17  notdone
8     16进制转换10进制和字符串相乘  2020-09-18 19:37:17  notdone
9     16进制转换10进制和字符串相乘  2020-09-19 18:37:17  notdone
10    16进制转换10进制和字符串相乘  2020-09-22 18:37:17  notdone
11    16进制转换10进制和字符串相乘  2020-09-25 18:37:17  notdone
12    16进制转换10进制和字符串相乘  2020-10-18 18:37:17  notdone
13    16进制转换10进制和字符串相乘  2020-11-17 18:37:17  notdone
14    16进制转换10进制和字符串相乘  2020-12-17 18:37:17  notdone

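Another problem appears here: the first row has no index 0 on its left, so it has evidently been treated as the header. The header therefore has to be specified when the file is read.
For adding a header, see https://blog.csdn.net/weixin_44346972/article/details/104415553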

df = pd.read_csv('xxx',header=None,names=['xx','yy','cc',...])

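With header=None, no header row is recognized when the file is read and the first line is treated as data; the names list that follows supplies the header for each column, in order.
PS: what other important parameters does read_csv have?
The resulting table: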

                  todo                 date haveDone
0   pythonshell执行和进程管理  2020-09-17 00:59:09  notdone
1   pythonshell执行和进程管理  2020-09-17 01:29:09  notdone
2   pythonshell执行和进程管理  2020-09-18 00:29:09     done
3   pythonshell执行和进程管理  2020-09-21 00:29:09  notdone
4   pythonshell执行和进程管理  2020-09-24 00:29:09  notdone
5   pythonshell执行和进程管理  2020-10-17 00:29:09  notdone
6   pythonshell执行和进程管理  2020-11-16 00:29:09  notdone
7   pythonshell执行和进程管理  2020-12-16 00:29:09  notdone
8     16进制转换10进制和字符串相乘  2020-09-18 19:07:17  notdone
9     16进制转换10进制和字符串相乘  2020-09-18 19:37:17  notdone
10    16进制转换10进制和字符串相乘  2020-09-19 18:37:17  notdone
11    16进制转换10进制和字符串相乘  2020-09-22 18:37:17  notdone
12    16进制转换10进制和字符串相乘  2020-09-25 18:37:17  notdone
13    16进制转换10进制和字符串相乘  2020-10-18 18:37:17  notdone
14    16进制转换10进制和字符串相乘  2020-11-17 18:37:17  notdone
15    16进制转换10进制和字符串相乘  2020-12-17 18:37:17  notdone
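The effect of header=None together with names can be reproduced on a small sample. A minimal sketch, assuming tab-separated input; the two rows are made-up data and io.StringIO replaces the real file.

```python
import io
import pandas as pd

# Made-up sample without a header line.
sample = (
    "taskA\t2020-09-17 00:59:09\tnotdone\n"
    "taskA\t2020-09-18 00:29:09\tdone\n"
)

# Default header handling: the first line is consumed as column names.
no_names = pd.read_csv(io.StringIO(sample), sep="\t")

# header=None keeps the first line as data; names supplies the column labels.
named = pd.read_csv(io.StringIO(sample), sep="\t", header=None,
                    names=["todo", "date", "haveDone"])

print(len(no_names))        # 1: the first row was swallowed as the header
print(len(named))           # 2: both rows are data
print(list(named.columns))  # ['todo', 'date', 'haveDone']
```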

Sorting by the date column

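The sort_values method has four core parameters: by, axis, ascending and inplace.
axis takes the value 0 or 1. With 0 the rows are the unit and are reordered; with 1 the columns are the unit and are reordered. axis=0 is the usual choice.
ascending=True sorts in ascending order, ascending=False in descending order.
inplace says whether the original DataFrame itself is modified.
by names the column to sort on. It can also be a list, in which case priority decreases from left to right.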
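A small sketch of a list passed to by, using made-up rows with the same column names as the todo table. The date column is kept as strings here; that still sorts chronologically because the 'YYYY-MM-DD HH:MM:SS' layout orders the same way as text.

```python
import pandas as pd

# Made-up rows using the same column names as the todo table.
df = pd.DataFrame({
    "todo": ["taskB", "taskA", "taskA"],
    "date": ["2020-09-18 19:07:17", "2020-09-18 00:29:09", "2020-09-17 00:59:09"],
    "haveDone": ["notdone", "done", "notdone"],
})

# Sort rows (axis=0) by haveDone first, then by date within each haveDone group.
ndf = df.sort_values(by=["haveDone", "date"], axis=0, ascending=True, inplace=False)

print(ndf.index.tolist())  # [1, 2, 0]
print(df.index.tolist())   # [0, 1, 2]: inplace=False leaves df untouched
```

The sorted rows keep their original index labels; calling reset_index(drop=True) on the result renumbers them from 0 if that is preferred.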

ndf = df.sort_values(axis=0,by='date',ascending=True,inplace=False)

This step re-sorts the table by the date column and produces the output below.
A problem shows up:

                  todo                 date  haveDone
1   pythonshell执行和进程管理  2020-09-17 00:59:09   notdone
2   pythonshell执行和进程管理  2020-09-17 01:29:09   notdone
3   pythonshell执行和进程管理  2020-09-18 00:29:09      done
4     16进制转换10进制和字符串相乘  2020-09-18 19:07:17   notdone
5     16进制转换10进制和字符串相乘  2020-09-18 19:37:17   notdone
6     16进制转换10进制和字符串相乘  2020-09-19 18:37:17   notdone
7   pythonshell执行和进程管理  2020-09-21 00:29:09   notdone
8     16进制转换10进制和字符串相乘  2020-09-22 18:37:17   notdone
9   pythonshell执行和进程管理  2020-09-24 00:29:09   notdone
10    16进制转换10进制和字符串相乘  2020-09-25 18:37:17   notdone
11  pythonshell执行和进程管理  2020-10-17 00:29:09   notdone
12    16进制转换10进制和字符串相乘  2020-10-18 18:37:17   notdone
13  pythonshell执行和进程管理  2020-11-16 00:29:09   notdone
14    16进制转换10进制和字符串相乘  2020-11-17 18:37:17   notdone
15  pythonshell执行和进程管理  2020-12-16 00:29:09   notdone
16    16进制转换10进制和字符串相乘  2020-12-17 18:37:17   notdone
0                 todo                 date  haveDone

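There is one extra row at the end. The cause: the first time, the file had no header row, so it was read with header=None and the header was added afterwards. When the file, which now contains the header, is read again with header=None, the header line is treated as an ordinary data row.
The fix is to write the header into the txt file from the start and then read the file with header=0.
That solved the problem and gave the expected output.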

                  todo                 date  haveDone
1   pythonshell执行和进程管理  2020-09-17 00:59:09   notdone
2   pythonshell执行和进程管理  2020-09-17 01:29:09   notdone
3   pythonshell执行和进程管理  2020-09-18 00:29:09      done
4     16进制转换10进制和字符串相乘  2020-09-18 19:07:17   notdone
5     16进制转换10进制和字符串相乘  2020-09-18 19:37:17   notdone
6     16进制转换10进制和字符串相乘  2020-09-19 18:37:17   notdone
7   pythonshell执行和进程管理  2020-09-21 00:29:09   notdone
8     16进制转换10进制和字符串相乘  2020-09-22 18:37:17   notdone
9   pythonshell执行和进程管理  2020-09-24 00:29:09   notdone
10    16进制转换10进制和字符串相乘  2020-09-25 18:37:17   notdone
11  pythonshell执行和进程管理  2020-10-17 00:29:09   notdone
12    16进制转换10进制和字符串相乘  2020-10-18 18:37:17   notdone
13  pythonshell执行和进程管理  2020-11-16 00:29:09   notdone
14    16进制转换10进制和字符串相乘  2020-11-17 18:37:17   notdone
15  pythonshell执行和进程管理  2020-12-16 00:29:09   notdone
16    16进制转换10进制和字符串相乘  2020-12-17 18:37:17   notdone
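The stray header row described above can be reproduced on a small sample. A sketch under assumptions: the rows are made-up, the file is tab-separated, and io.StringIO stands in for a txt file that already has the header on its first line.

```python
import io
import pandas as pd

# Made-up file content with the header row already written on the first line.
sample = (
    "todo\tdate\thaveDone\n"
    "taskA\t2020-09-18 00:29:09\tdone\n"
    "taskA\t2020-09-17 00:59:09\tnotdone\n"
)
cols = ["todo", "date", "haveDone"]

# header=None treats the header line as data, so it shows up as an extra row.
wrong = pd.read_csv(io.StringIO(sample), sep="\t", header=None, names=cols)

# header=0 consumes the first line as the column names.
right = pd.read_csv(io.StringIO(sample), sep="\t", header=0)

print(len(wrong))  # 3
print(len(right))  # 2
print(wrong.sort_values(by="date")["date"].tolist()[-1])  # date
```

As strings, the word "date" sorts after any value starting with a digit, which is why the stray header row lands at the bottom of the sorted table.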

Articles on sorting in pandas: https://blog.csdn.net/weixin_41261833/article/details/104167592
https://www.cnblogs.com/cymwill/p/10555148.html

df.sort_values(by='date', axis=0, ascending=True, inplace=False) # inplace=False sorts a copy and leaves the original data untouched; axis=0 sorts with each row as the unit

https://blog.csdn.net/Jarry_cm/article/details/99633562

A blog post about pandas: https://blog.csdn.net/weixin_44946523/article/details/106739593

Miscellaneous

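Turning an already opened file into a DataFrame object

The first step is to use the pandas library to turn the table in the todolist into an in-memory DataFrame, i.e. a data table.
df = pd.DataFrame(data) # data is a placeholder for the rows already read from the file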

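Checking the command-line arguments

The length of the command-line argument list is used to tell apart runs of the program with arguments from runs without.
That is, first check the length of the argv list with len(), for example len(sys.argv).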
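A minimal sketch of that check. The function name describe_args and the returned strings are hypothetical, chosen only for illustration; the real program's branches are not shown in this post.

```python
import sys

def describe_args(argv):
    """Return which branch the program would take for a given argument list."""
    # argv[0] is the script name, so a length of 1 means no user arguments.
    if len(argv) > 1:
        return "with arguments: " + " ".join(argv[1:])
    return "no arguments"

if __name__ == "__main__":
    print(describe_args(sys.argv))
```

Passing argv in as a parameter, rather than reading sys.argv inside the function, makes both branches easy to try out without launching the script from a shell.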

Parsing date strings

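The datetime class in Python's datetime module has a strptime method:
time = datetime.datetime.strptime(timestr, timeformatstr). The first argument is the string that represents the date; the second is the format string used to parse it.
For example, if the first argument is '2020-09-21 00:29:09', the second is '%Y-%m-%d %H:%M:%S'. The format codes are easy to remember: a % sign followed by a letter for year, month, day, hour, minute and second. Case matters: %m is the month, while %M is the minute.

The result is a parsed datetime object; its individual values are available through time.date().year or time.year.
See also https://blog.csdn.net/xidianbaby/article/details/87983157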
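The strptime call described above, as a short runnable sketch. The variable is named parsed rather than time here, purely to avoid confusion with the standard time module.

```python
import datetime

timestr = "2020-09-21 00:29:09"
parsed = datetime.datetime.strptime(timestr, "%Y-%m-%d %H:%M:%S")

print(parsed.year, parsed.month, parsed.day)      # 2020 9 21
print(parsed.date().year)                         # 2020
print(parsed.hour, parsed.minute, parsed.second)  # 0 29 9
```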

Next steps

If more than 3 reviews have been missed, draw up a new review plan.
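One possible way to detect that condition is sketched below. Everything in it is an assumption, not the project's implementation: "missed" is taken to mean a row whose date is already past and whose haveDone is still 'notdone', and the function name todos_needing_replan, the max_missed parameter and the sample rows are all hypothetical.

```python
import pandas as pd

def todos_needing_replan(df, now, max_missed=3):
    """Return todo names with more than max_missed overdue 'notdone' rows."""
    dates = pd.to_datetime(df["date"])
    # Assumption: a review counts as missed if it is past due and still not done.
    overdue = df[(dates < now) & (df["haveDone"] == "notdone")]
    missed = overdue.groupby("todo").size()
    return missed[missed > max_missed].index.tolist()

# Made-up rows: taskA has four overdue reviews, taskB has two.
df = pd.DataFrame({
    "todo": ["taskA"] * 4 + ["taskB"] * 2,
    "date": ["2020-09-17 00:59:09", "2020-09-17 01:29:09",
             "2020-09-18 00:29:09", "2020-09-21 00:29:09",
             "2020-09-18 19:07:17", "2020-09-18 19:37:17"],
    "haveDone": ["notdone"] * 6,
})

print(todos_needing_replan(df, pd.Timestamp("2020-09-22")))  # ['taskA']
```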


Original article: https://www.cnblogs.com/goto2091/p/13701242.html