  • Reading text files in Python

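    Whether you are doing machine learning or any other task, being able to read text and convert it into NumPy arrays is an essential skill. Python provides concise yet powerful tools for text processing: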

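    Suppose the file traindata.csv holds 1000 rows of data, with three feature columns followed by a fourth (last) column containing the class label. Every method shown here also skips the first line of the file, treating it as a header row.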
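
    To try the methods out without real data, a small synthetic file in the described layout can be generated first. This is only a sketch: the helper name write_sample_csv, the value ranges, and the 0/1 labels are assumptions made for illustration, and the column names are borrowed from the pandas example in this post.

```python
import csv
import random

def write_sample_csv(path, n_rows=1000, seed=0):
    """Write a header row plus n_rows of three random features and a 0/1 label.

    Hypothetical helper: the value ranges and binary labels are assumptions.
    """
    rng = random.Random(seed)
    # newline='' lets the csv module control line endings itself
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['feature1', 'feature2', 'feature3', 'label'])
        for _ in range(n_rows):
            features = [round(rng.uniform(0.0, 10.0), 3) for _ in range(3)]
            writer.writerow(features + [rng.randint(0, 1)])

if __name__ == '__main__':
    write_sample_csv('./traindata.csv')
```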

    1. Basic approach:

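    def file2matrix():
        dataMat = []
        labelMat = []
        # Open in text mode so each line is a str; 'with' closes the file for us
        with open('./traindata.csv', 'r') as fr:
            fr.readline()  # skip the first (header) line
            for line in fr:  # read each row
                curLine = line.strip().split(',')  # fields in a CSV file are comma-separated
                lineArr = []
                for i in range(3):
                    lineArr.append(float(curLine[i]))  # read each feature
                dataMat.append(lineArr)
                labelMat.append(float(curLine[3]))  # the fourth column is the class label
        return dataMat, labelMat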
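
    The function above returns plain Python lists, while the stated goal is NumPy arrays. A minimal sketch of that last step follows; the two small lists are hypothetical stand-ins for the values file2matrix() would return, and the names X and y are assumptions.

```python
import numpy as np

# Hypothetical stand-ins for the lists returned by file2matrix()
dataMat = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
labelMat = [0.0, 1.0]

X = np.array(dataMat)   # 2-D array, one row per sample
y = np.array(labelMat)  # 1-D array, one label per sample

print(X.shape, y.shape)  # (2, 3) (2,)
```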

    2. Using the csv module

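    import csv

    def file2Matrix():
        # In Python 3 the csv module expects a text-mode file opened with newline=''
        with open('./traindata.csv', 'r', newline='') as fr:
            lines = csv.reader(fr)
            next(lines)  # skip the first line
            for line in lines:
                ...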
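
    The loop body is left out above. One way to fill it in, assuming it mirrors the basic approach (three float features, then a float label), might look like the sketch below. The function name csv_to_lists and its parameters are assumptions, not part of the original post.

```python
import csv

def csv_to_lists(path, n_features=3):
    """Read a CSV file with a header row into feature rows and labels.

    Hypothetical completion: assumes n_features numeric columns
    followed by one numeric label column.
    """
    data, labels = [], []
    with open(path, 'r', newline='') as fr:
        reader = csv.reader(fr)
        next(reader, None)  # skip the header row (no error on an empty file)
        for row in reader:
            if not row:  # csv.reader yields [] for blank lines
                continue
            data.append([float(v) for v in row[:n_features]])
            labels.append(float(row[n_features]))
    return data, labels
```

    Compared with splitting each line by hand, csv.reader also copes with quoted fields.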

    3. Using pandas

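    import pandas as pd

    def file2Matrix():
        # read_csv accepts a path directly; header=0 takes column names from the first row
        df = pd.read_csv('./traindata.csv', header=0)
        dataMat = df[['feature1','feature2','feature3']]  # DataFrame with the three feature columns
        labelMat = df['label']  # Series with the class labels
        return dataMat, labelMat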
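
    The pandas version returns a DataFrame and a Series. If NumPy arrays are what you need, to_numpy() converts them. The sketch below reads a small in-memory sample instead of traindata.csv so that it runs on its own; the sample values and the names X and y are assumptions.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the same layout as traindata.csv
sample = io.StringIO(
    "feature1,feature2,feature3,label\n"
    "1.0,2.0,3.0,0\n"
    "4.0,5.0,6.0,1\n"
)

df = pd.read_csv(sample, header=0)
X = df[['feature1', 'feature2', 'feature3']].to_numpy()  # 2-D feature array
y = df['label'].to_numpy()                               # 1-D label array

print(X.shape, y.shape)  # (2, 3) (2,)
```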

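    Clearly, once you are comfortable with pandas, this becomes very simple.

    What you learn from books always feels shallow; to truly understand something, you have to practice it yourself.

    Just do it!

    A small step every day adds up to a giant step in life. Good luck~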
  • Original article: https://www.cnblogs.com/jkmiao/p/4431397.html