  • File handling

    1. Files

    f = open(r"file path", mode="rt", encoding="utf-8")
    data = f.read()  # in a write mode, call f.write(content) instead
    f.close()  # release the operating-system resource
    
    
    with open('今日内容.txt',mode='rt',encoding='utf-8') as f1:
        data = f1.read()
        print(data)
    

    When the with block ends, f1.close() is called automatically and the operating-system resource is released.
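A minimal sketch of that automatic close, assuming a scratch file in a temporary directory (the path and the variable names are made up for illustration). It checks the file object's `closed` attribute inside the block, after the block, and after a block that exits with an exception.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, mode="wt", encoding="utf-8") as f1:
    f1.write("hello")
    closed_inside = f1.closed   # still open inside the block

closed_after = f1.closed        # closed as soon as the block exits

# The file is also closed when the block exits because of an exception.
try:
    with open(path, mode="rt", encoding="utf-8") as f2:
        raise ValueError("simulated failure")
except ValueError:
    pass
closed_after_error = f2.closed

print(closed_inside, closed_after, closed_after_error)
```

This is why the with form is usually preferred over a manual open()/close() pair: the close happens even if the body fails halfway through.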

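    with open('今日内容.txt', mode='rt', encoding='utf-8') as f1, \
            open('a.txt', mode='rt', encoding='utf-8') as f2:
        print('Contents of file 1'.center(50, '#'))
        data = f1.read()
        print(data)

        print('Contents of file 2'.center(50, '#'))
        data = f2.read()
        print(data)

    f1.close() and f2.close() are both called automatically, releasing the operating-system resources.

    In t mode, read() returns a str rather than bytes:

    with open('a.txt', mode='rt') as f:  # no encoding given, so the platform default is used
        data = f.read()
        print(data)
        print(type(data))  # <class 'str'>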
    

    2. t mode can only be used to read text files

    with open('a.jpg', mode='rt', encoding='utf-8') as f:
        data = f.read()  # raises UnicodeDecodeError: JPEG bytes are not valid UTF-8 text
        print(data)
        print(type(data))
    

    image     <--- jpg ----- binary data

    character <--- utf-8 --- binary data
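To make the two arrows above concrete, here is a small sketch that reads the same text file in b mode and in t mode, then tries to read non-text bytes in t mode. The scratch directory, the file names, and the four sample bytes (chosen to look like the start of a JPEG header) are assumptions for illustration only.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()                   # hypothetical scratch directory
text_path = os.path.join(workdir, "a.txt")
blob_path = os.path.join(workdir, "a.bin")

with open(text_path, mode="wt", encoding="utf-8") as f:
    f.write("hello你好")
with open(blob_path, mode="wb") as f:
    f.write(bytes([0xFF, 0xD8, 0xFF, 0xE0]))   # bytes that are not valid UTF-8

# b mode hands back the raw bytes; t mode decodes them for us.
with open(text_path, mode="rb") as f:
    raw = f.read()
with open(text_path, mode="rt", encoding="utf-8") as f:
    text = f.read()

print(type(raw), len(raw))     # bytes, 11 bytes (5 ASCII + 2 * 3)
print(type(text), len(text))   # str, 7 characters
same = raw.decode("utf-8") == text

# Non-text bytes cannot be decoded, so t mode fails on them.
try:
    with open(blob_path, mode="rt", encoding="utf-8") as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(same, decode_failed)
```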

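    3. b mode can be used to read any kind of file

    with open('a.jpg', mode='rb') as f:
        data = f.read()
        print(data)
        print(type(data))  # <class 'bytes'>

    with open('a.jpg', mode='rb') as f:
        data = f.read()
        print(data.decode("utf-8"))  # only succeeds if the bytes are UTF-8 text; it fails for a real JPEG
        print(type(data))

    What b mode returns is the raw binary data.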

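    4. t mode does the decoding for us

    character <--- utf-8 --- binary data

    A supplement on encoding and decoding strings:

    user = input('>>: ')  # e.g. user = "林海峰"

    user = "林海峰"

    Encoding:

    str ==utf-8==> bytes

    res = user.encode("utf-8")
    print(res)
    print(type(res))  # <class 'bytes'>

    The encoded bytes (res) are what gets sent over a network.

    5. Decoding:

    bytes ==utf-8==> str

    print(res.decode("utf-8"))

    Copying a file in b mode:

    with open('a.jpg', mode='rb') as src_f, \
            open('b.jpg', mode='wb') as dst_f:
        # Option 1: read everything at once (memory use grows with the file size)
        # data = src_f.read()
        # dst_f.write(data)

        # Option 2: copy one line of bytes at a time
        for line in src_f:  # line is one line of the file's content
            dst_f.write(line)

    with open('b.txt', mode='wb') as f:
        user = "林海峰"
        res = user.encode('utf-8')  # b mode: encode to bytes yourself
        f.write(res)

    with open('b.txt', mode='wt', encoding="utf-8") as f:
        user = "林海峰"
        f.write(user)  # t mode: the string is encoded for us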
    

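    6. Read-write modes. The t can be omitted because t is the default; reads and writes work on strings.

    r+t
    w+t
    a+t

    7. Read-write modes in b mode: reads and writes work on bytes.

    r+b
    w+b
    a+b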

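    with open('b.txt', mode='r+t', encoding='utf-8') as f:
        print(f.read())
        f.write("abcdefg")  # the pointer is at the end after read(), so this appends

    with open('b.txt', mode='w+t', encoding='utf-8') as f:
        f.write("我爱你中国")  # w+ empties the file before writing
        print(f.read())  # prints an empty string: the pointer is already at the end

    with open('b.txt', mode='a+t', encoding='utf-8') as f:
        f.write("我爱你中国")  # a+ always writes at the end of the file
        print(f.read())  # also empty, for the same reason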
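The sketch below illustrates why the read() calls after a write come back empty and how the three read-write modes differ. It assumes a scratch file and ASCII sample text (both made up here), and uses seek(0) to move the pointer back to the start before reading.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "b.txt")   # hypothetical scratch file

with open(path, mode="w+t", encoding="utf-8") as f:
    f.write("hello")
    at_end = f.read()        # pointer is at the end, nothing left to read
    f.seek(0)                # move back to the start of the file
    from_start = f.read()

with open(path, mode="a+t", encoding="utf-8") as f:
    f.write(" world")        # a+ appends to the existing content
    f.seek(0)
    appended = f.read()

with open(path, mode="r+t", encoding="utf-8") as f:
    f.write("HE")            # r+ starts at offset 0 and overwrites in place
    f.seek(0)
    overwritten = f.read()

print(repr(at_end), from_start, appended, overwritten)
```

ASCII text is used on purpose: overwriting part of a multi-byte character through r+ would leave bytes that no longer decode as UTF-8.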
    
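    with open('b.txt', mode='rt', encoding='utf-8') as f:
        # readline() reads one line per call
        line1 = f.readline()
        line2 = f.readline()
        line3 = f.readline()
        line4 = f.readline()
        print(line1, end="")
        print(line2, end="")
        print(line3, end="")
        print(line4, end="")

        # The alternatives below each continue from the current pointer position,
        # so try them one at a time.

        # Iterating over the file object also yields one line at a time
        for line in f:
            print(line)

        # Collect the lines into a list with a loop ...
        l = []
        for line in f:
            l.append(line)

        # ... or do the same in one call with readlines()
        l = f.readlines()
        print(l)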
    

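    with open('b.txt', mode='wt', encoding='utf-8') as f:
        f.write("1111\n2222\n333\n")

        lines = ["1111\n", "222\n", "333\n"]

        for line in lines:
            f.write(line)

        f.writelines(lines)  # same effect as the loop above

        f.writelines({'k1': 111, 'k2': 222, "k3": 3333})  # iterating a dict yields its keys, so k1k2k3 is written
        f.writelines({'k1': 111, 1: 44444, 'k2': 222, "k3": 3333})  # raises TypeError: the key 1 is not a str

        f.writelines("hello")  # a str is iterable, so each character is written in turn
        f.write("hello")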

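    with open(r'b.txt', mode='wt', encoding='utf-8') as f:
        print(f.name)  # the file path that was passed to open()
        f.write('哈哈哈\n')
        f.flush()  # push buffered data to the operating system now instead of waiting for close()

    Note for Python 2: the source file needs a "# coding:utf-8" declaration at the top.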
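As a rough illustration of what f.flush() changes, the sketch below compares the file size on disk before and after the call. The scratch path is an assumption, and the size before flushing depends on Python's internal buffering, so only the size afterwards is stated as a fact.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "b.txt")   # hypothetical scratch file

with open(path, mode="wt", encoding="utf-8") as f:
    f.write("哈哈哈")                       # 3 characters, 9 bytes in UTF-8
    size_before = os.path.getsize(path)    # usually 0: the text is still buffered
    f.flush()
    size_after = os.path.getsize(path)     # the encoded bytes have reached the file

print(size_before, size_after)
```

Flushing matters when another process needs to see the data before the file is closed, for example a log that is being followed while it is written.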

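    I. In what unit does the file pointer move?

    String obtained by decoding the bytes that were read: hello你好
    On disk: 0101010101101010101011010101010

    1. Only for read(n) in t mode does n mean a number of characters

    with open('a.txt', mode='rt', encoding='utf-8') as f:
        data = f.read(6)  # 6 characters: hello你
        print(f.tell())   # typically 8: the position is counted in bytes, and 你 takes 3 bytes in UTF-8
        print(data)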
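A short sketch of the character-versus-byte distinction, assuming a scratch file that holds the same sample string hello你好 (the path is made up). The same read(6) call is made once in t mode and once in b mode.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "a.txt")   # hypothetical scratch file
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("hello你好")

with open(path, mode="rt", encoding="utf-8") as f:
    chars = f.read(6)        # t mode: 6 characters

with open(path, mode="rb") as f:
    raw = f.read(6)          # b mode: 6 bytes
    offset = f.tell()        # byte offset after the read

print(repr(chars))   # 'hello你'
print(raw)           # b'hello\xe4' (only the first byte of 你)
print(offset)        # 6
```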
    

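    2. For reference: disk capacity is simply the number of binary digits (bits) that can be stored

    8 bit = 1 Byte
    1024 Byte = 1 KB
    1024 KB = 1 MB
    1024 MB = 1 GB
    1024 GB = 1 TB
    1 GB = 1024 * 1024 * 1024 * 8 bit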
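The unit conversions can be checked with a few lines of arithmetic. The constant names below are illustrative, not part of any library.

```python
BITS_PER_BYTE = 8
KB = 1024          # bytes in 1 KB
MB = 1024 * KB     # bytes in 1 MB
GB = 1024 * MB     # bytes in 1 GB
TB = 1024 * GB     # bytes in 1 TB

bits_in_gb = GB * BITS_PER_BYTE

print(GB)          # 1073741824 bytes
print(bits_in_gb)  # 8589934592 bits
```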

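    with open('a.txt', mode='rb') as f:
        data = f.read(8)  # in b mode n is a number of bytes
        print(type(data))
        print(len(data))

        print(data.decode("utf-8"))  # hello你 (5 bytes + 3 bytes)

    with open('b.txt', mode='rb') as f:
        data = f.read(7)
        print(type(data))
        print(len(data))

        print(data.decode("gbk"))  # assumes b.txt is GBK-encoded, where 你 takes 2 bytes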
    
    
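    truncate(n) needs a writable mode such as r+ or a, and n is a number of bytes:

    with open('a.txt', mode='r+t', encoding='utf-8') as f:
        f.truncate(7)  # keep only the first 7 bytes of the file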
    

    Apart from read(n) in t mode, every movement of the file pointer, whether implicit or explicit, is counted in bytes.
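A small sketch of byte-based movement, assuming a scratch file that contains hello你好 (11 bytes in UTF-8). truncate(8) keeps exactly hello plus the 3 bytes of 你, and seek(5) lands on the first byte of 你 even when the file is opened in t mode.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "a.txt")   # hypothetical scratch file
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("hello你好")                # 5 + 3 + 3 = 11 bytes

with open(path, mode="r+b") as f:
    f.truncate(8)                      # keep the first 8 bytes

size = os.path.getsize(path)

with open(path, mode="rt", encoding="utf-8") as f:
    text = f.read()
    f.seek(5)                          # offset in bytes from the start
    ch = f.read(1)                     # one character

print(size, text, ch)
```

Cutting at 7 bytes instead would split 你 in the middle and leave a file that no longer decodes as UTF-8.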

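    II. Moving the file pointer explicitly

    f.seek(x, y)
    x is the number of bytes to move
    y is the mode:

    0: the reference point is the start of the file; usable in both t mode and b mode

    Example:

    with open('d.txt', mode='rt', encoding='utf-8') as f:
        f.read(3)
        print(f.tell())  # 5

        f.seek(3, 0)
        print(f.tell())  # 3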
    

    1: the reference point is the current position; usable only in b mode

    with open('d.txt', mode='rb') as f:
        f.read(1)
        print(f.tell())  # 1
        f.seek(2, 1)
        print(f.tell())  # 3

        print(f.read().decode("utf-8"))
    

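    2: the reference point is the end of the file; usable only in b mode

    with open('d.txt', mode='rb') as f:
        f.seek(3333, 2)  # seeking past the end is allowed
        print(f.tell())  # 14 + 3333 = 3347 for a 14-byte file

        f.seek(-3, 2)  # 3 bytes before the end
        print(f.tell())

        f.seek(0, 2)  # jump straight to the end of the file
        print(f.tell())

    with open('d.txt', mode='a') as f:
        print(f.tell())  # in a mode the pointer starts at the end of the file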
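The three seek modes can be compared side by side on a file whose size is known. The sketch assumes a 10-byte scratch file (the path and contents are made up) and also shows that a non-zero relative seek in t mode is rejected with io.UnsupportedOperation.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "d.txt")   # hypothetical scratch file
with open(path, mode="wb") as f:
    f.write(b"0123456789")                         # 10 bytes

with open(path, mode="rb") as f:
    f.seek(4, 0)
    from_start = f.tell()      # 4 bytes from the start
    f.seek(2, 1)
    from_current = f.tell()    # 2 bytes further on
    f.seek(-3, 2)
    from_end = f.tell()        # 3 bytes before the end
    tail = f.read()

# In t mode only mode 0 supports a non-zero offset.
with open(path, mode="rt", encoding="utf-8") as f:
    try:
        f.seek(2, 1)
        text_relative_ok = True
    except io.UnsupportedOperation:
        text_relative_ok = False

print(from_start, from_current, from_end, tail, text_relative_ok)
```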
    

    Exercise: implement the following command:

    tail -f access.log

    import time

    with open(r"/day10/代码/access.log", mode="rb") as f:
        f.seek(0, 2)  # jump straight to the end of the file

        while True:
            line = f.readline()
            if len(line) == 0:
                time.sleep(0.1)  # nothing new yet, so wait before polling again
            else:
                print(line.decode('utf-8'), end='')
    

    Background: data on disk is never really "edited"; new content simply overwrites old content

    with open('e.txt', mode="r+t", encoding='utf-8') as f:
        f.seek(9, 0)
        f.write("你好")
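A minimal sketch of that overwrite behaviour, assuming a scratch file with 12 ASCII letters (path and contents invented for the example). Writing at offset 9 replaces what was there; nothing is shifted to make room.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "e.txt")   # hypothetical scratch file
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("abcdefghijkl")

with open(path, mode="r+t", encoding="utf-8") as f:
    f.seek(9, 0)
    f.write("XY")          # overwrites "jk"; the file does not grow

with open(path, mode="rt", encoding="utf-8") as f:
    result = f.read()

print(result, len(result))
```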

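    Files can still be "modified", but the modification is simulated with the help of memory.
    There are two ways to do it.

    How approach one works:

    1. Read the entire file from disk into memory.

    2. Make all the changes in memory in one go.

    3. Write the modified result back over the original file.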

    with open('a.txt', mode='rt', encoding='utf-8') as read_f:
        data = read_f.read()
    with open('a.txt', mode='wt', encoding='utf-8') as write_f:
        write_f.write(data.replace('LIUGUIHAI','liuguihai'))
    

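    Summary of approach one:

    Pro: light on disk; only one copy of the data exists on disk
    Con: heavy on memory; a large file takes up a lot of memory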

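    How approach two works:

    1. Read the file from disk into memory a little at a time.
    2. Modify each piece in memory and write it to a temporary file.
    3. Replace the original file with the temporary file.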

    import os

    with open('f.txt', mode='rt', encoding='utf-8') as read_f, \
            open(".f.txt.swap", mode='wt', encoding='utf-8') as write_f:
        for line in read_f:
            write_f.write(line.replace("egon", '===>EGON<==='))

    os.remove('f.txt')
    os.rename('.f.txt.swap', 'f.txt')
    

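    Summary of approach two:

    Pro: light on memory; only one line of the file is in memory at any moment
    Con: heavy on disk; two copies of the data exist on disk while the change is in progress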
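Approach two can be wrapped in a small reusable function. The sketch below is one possible shape, not the author's code: the helper name replace_in_file, the ".swap" suffix, and the demo file are all assumptions. It also swaps os.remove plus os.rename for a single os.replace call, which overwrites the destination in one step.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite path line by line, replacing old with new (hypothetical helper)."""
    swap_path = path + ".swap"   # assumed naming scheme for the temporary file
    with open(path, mode="rt", encoding=encoding) as read_f, \
            open(swap_path, mode="wt", encoding=encoding) as write_f:
        for line in read_f:      # only one line is held in memory at a time
            write_f.write(line.replace(old, new))
    os.replace(swap_path, path)  # put the new file in place of the original


# Usage on a scratch file.
demo_path = os.path.join(tempfile.mkdtemp(), "f.txt")
with open(demo_path, mode="wt", encoding="utf-8") as f:
    f.write("egon says hi\nhello egon\n")

replace_in_file(demo_path, "egon", "===>EGON<===")

with open(demo_path, mode="rt", encoding="utf-8") as f:
    result = f.read()
print(result)
```

Both files are closed before os.replace runs, which matters on Windows, where an open file cannot be replaced.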

  • Original article: https://www.cnblogs.com/lgh8023/p/13092990.html