  • Notes on Learning File Handling in Python

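    I. First, understanding files

    Differences between text files and binary files

    To summarize:

    Broadly speaking, every file on a computer is a binary file.

    In the narrower sense, files fall into two categories: binary files and text files.

    First, how the data is produced (the write operation):

    In a text file, everything is stored as characters. On write, the text file's encoder converts each character into bytes according to a character encoding (ASCII, or a Unicode encoding such as UTF-8), and those bytes are stored on disk. With ASCII every character takes exactly 1 byte; with UTF-8 a character takes 1 to 4 bytes.

    In a binary file, the size of each item depends on its type: for example, a short might take 2 bytes, an int 4 bytes, and a double-precision float 8 bytes (the exact sizes depend on the platform; these are only examples). Writing a binary file copies the in-memory representation of the data directly to the file.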
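
    The contrast between the two write paths can be sketched with the standard library. This is only an illustration: the sample value and the 4-byte little-endian layout ("<i") are assumptions chosen for the example, not something the original post specifies.

```python
import struct

value = 123456789  # hypothetical sample value

# Text representation: each decimal digit becomes one ASCII character.
as_text = str(value).encode("ascii")

# Binary representation: a fixed 4-byte little-endian signed integer,
# similar to how the value would sit in memory on a little-endian machine.
as_binary = struct.pack("<i", value)

print(len(as_text), as_text)      # 9 b'123456789'
print(len(as_binary), as_binary)  # 4 bytes: 0x15 0xCD 0x5B 0x07

# Reading the binary form back requires knowing the layout in advance.
restored = struct.unpack("<i", as_binary)[0]
```

    The text form grows with the number of digits, while the binary form stays at 4 bytes regardless of the value.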

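    Next, how the data is read:

    A read goes through these stages: disk >> file buffer >> application memory.

    When we say "there is no difference between text files and binary files", we are really talking about the first stage (disk to file buffer): in both cases raw bytes are transferred. If there is no difference, why does the content look different depending on how the file is opened? That difference arises in the second stage (file buffer to application memory), where the bytes are either decoded as characters or passed through unchanged.

    A file really consists of two parts: control information and content. A plain text file is simply one that carries no control or formatting information.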
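
    A small sketch of the two-stage read described above: the same bytes on disk are read once in binary mode and once in text mode. The temporary file and its contents are assumptions made for the example.

```python
import os
import tempfile

# Write a few known bytes to a temporary file (hypothetical sample data).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1,1,1")

# Binary mode: the bytes reach the program unchanged.
with open(path, "rb") as f:
    raw = f.read()

# Text mode: the same bytes are decoded into characters on the way in.
with open(path, "r", encoding="ascii") as f:
    text = f.read()

os.remove(path)

print(list(raw))  # [49, 44, 49, 44, 49]
print(text)       # 1,1,1
```

    The data on disk is identical in both cases; only the interpretation applied while moving it into the program differs.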

    1. Example: NumPy's multiarray.fromfile

    numpy.fromfile()

    def fromfile(file, dtype=None, count=-1, sep=''): # real signature unknown; restored from __doc__
        """
        fromfile(file, dtype=float, count=-1, sep='')
        
            Construct an array from data in a text or binary file.
        
            A highly efficient way of reading binary data with a known data-type,
            as well as parsing simply formatted text files.  Data written using the
            `tofile` method can be read using this function.
        
            Parameters
            ----------
            file : file or str
                Open file object or filename.
            dtype : data-type
                Data type of the returned array.
                For binary files, it is used to determine the size and byte-order
                of the items in the file.
            count : int
                Number of items to read. ``-1`` means all items (i.e., the complete
                file).
            sep : str
                Separator between items if file is a text file.
                Empty ("") separator means the file should be treated as binary.
                Spaces (" ") in the separator match zero or more whitespace characters.
                A separator consisting only of spaces must match at least one
                whitespace.
        
            See also
            --------
            load, save
            ndarray.tofile
            loadtxt : More flexible way of loading data from a text file.
        
            Notes
            -----
            Do not rely on the combination of `tofile` and `fromfile` for
            data storage, as the binary files generated are not platform
            independent.  In particular, no byte-order or data-type information is
            saved.  Data can be stored in the platform independent ``.npy`` format
            using `save` and `load` instead.
        
            Examples
            --------
            Construct an ndarray:
        
            >>> dt = np.dtype([('time', [('min', int), ('sec', int)]),
            ...                ('temp', float)])
            >>> x = np.zeros((1,), dtype=dt)
            >>> x['time']['min'] = 10; x['temp'] = 98.25
            >>> x
            array([((10, 0), 98.25)],
                  dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])
        
            Save the raw data to disk:
        
            >>> import tempfile
            >>> fname = tempfile.mkstemp()[1]
            >>> x.tofile(fname)
        
            Read the raw data from disk:
        
            >>> np.fromfile(fname, dtype=dt)
            array([((10, 0), 98.25)],
                  dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])
        
            The recommended way to store and load data:
        
            >>> np.save(fname, x)
            >>> np.load(fname + '.npy')
            array([((10, 0), 98.25)],
                  dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])
        """
        pass
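
    The warning in the Notes section of the docstring can be seen in a minimal sketch: tofile writes only the raw item bytes, so fromfile has to be told the dtype, and a wrong guess silently yields different numbers. The array values and the two dtypes are assumptions chosen for illustration.

```python
import os
import tempfile

import numpy as np

# Three 4-byte little-endian integers: 12 bytes of raw data.
original = np.array([1, 2, 3], dtype="<i4")

fd, path = tempfile.mkstemp()
os.close(fd)
original.tofile(path)  # raw bytes only: no dtype, shape, or byte-order header

same = np.fromfile(path, dtype="<i4")   # matching dtype: values recovered
wrong = np.fromfile(path, dtype="<i2")  # wrong dtype: bytes are split differently

os.remove(path)

print(same)   # [1 2 3]
print(wrong)  # [1 0 2 0 3 0]
```

    No error is raised in the second read, which is why the docstring recommends save and load for storage.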

    It is worth noting this line from the docstring:

    Empty ("") separator means the file should be treated as binary.

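    In other words, by default the file is read as binary: the raw bytes are interpreted directly as items of the given dtype. When a non-empty sep is passed, the file is parsed as text instead: its contents are treated as characters, split on the separator, and each field is converted to a number.

    Take test.txt as an example (the character "1" has ASCII code 49 in decimal, and "," has 44).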

    test.txt

    1,1,1,1,1

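    (1) Reading with the default sep argument:

    import numpy as np

    # Path to the file holding the test.txt contents shown above.
    filepath = "D://Documents/temp/testForPyStruct.txt"
    # sep="" (the default) reads raw bytes, one uint8 item per byte.
    data = np.fromfile(filepath, dtype=np.uint8, sep="")
    print(data)

    Output:

    [49 44 49 44 49 44 49 44 49]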

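    (2) Reading with sep=",":

    import numpy as np

    # Path to the file holding the test.txt contents shown above.
    filepath = "D://Documents/temp/testForPyStruct.txt"
    # A non-empty sep parses the file as text, splitting fields on ",".
    data = np.fromfile(filepath, dtype=np.uint8, sep=",")
    print(data)

    Output:

    [1 1 1 1 1]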

    2. See also

    load, save
    ndarray.tofile
    loadtxt : More flexible way of loading data from a text file.
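
    As a rough sketch of the alternatives listed here, the same comma-separated data can be parsed with loadtxt and then stored in the .npy format with save and load, which records the dtype and shape alongside the data. The temporary directory and the file names are hypothetical.

```python
import os
import shutil
import tempfile

import numpy as np

tmpdir = tempfile.mkdtemp()
txt_path = os.path.join(tmpdir, "sample.txt")  # hypothetical file names
npy_path = os.path.join(tmpdir, "sample.npy")

# Same contents as the test.txt example, with a trailing newline.
with open(txt_path, "w") as f:
    f.write("1,1,1,1,1\n")

# loadtxt parses delimited text and handles line endings for us.
from_text = np.loadtxt(txt_path, delimiter=",", dtype=np.uint8)

# save/load round-trip: dtype and shape are stored in the file header.
np.save(npy_path, from_text)
restored = np.load(npy_path)

shutil.rmtree(tmpdir)

print(from_text)       # [1 1 1 1 1]
print(restored.dtype)  # uint8
```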
  • Original article: https://www.cnblogs.com/peanutk/p/10043463.html