  • Loading CSV data in Python

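    When getting started with machine learning, some test datasets are CSV files hosted online. This post summarizes two ways to load such a CSV file: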

    1 Loading with numpy and urllib2 (urllib.request on Python 3)

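    import numpy as np
    from urllib.request import urlopen  # Python 3 replacement for Python 2's urllib2

    url = "http://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
    raw_data = urlopen(url)  # file-like object over the HTTP response
    dataset = np.loadtxt(raw_data, delimiter=",")  # parse the comma-separated numbers into a 2-D array
    X = dataset[:, 0:8]  # columns 0-7: the eight features (the slice end is exclusive)
    y = dataset[:, 8]    # column 8: the class label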
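
    The slicing step can be checked offline. The sketch below is only an illustration: the three rows are made up, and the only assumption carried over from the dataset is its layout of eight feature columns followed by one label column. NumPy slices are half-open, so `0:8` selects columns 0 through 7, while `0:7` stops at column 6.

```python
import io
import numpy as np

# Three made-up rows: eight feature columns followed by one label column.
sample = io.StringIO(
    "1,100,70,30,0,25.5,0.5,40,0\n"
    "2,150,80,35,90,31.2,0.7,52,1\n"
    "0,120,60,20,0,22.0,0.3,28,0\n"
)
dataset = np.loadtxt(sample, delimiter=",")

features_all = dataset[:, 0:8]    # columns 0..7
features_short = dataset[:, 0:7]  # columns 0..6, the eighth feature is left out
labels = dataset[:, 8]            # column 8 only, returned as a 1-D array

print(dataset.shape, features_all.shape, features_short.shape, labels.shape)
```

    Printing the shapes before training is a cheap way to confirm that no feature column was dropped by an off-by-one slice.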

    2 Loading with pandas

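    import pandas as pd

    url = "http://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
    dataFrame = pd.read_csv(url, header=None)  # the file has no header row
    dataset = dataFrame.values  # the underlying numpy.ndarray
    X = dataset[:, 0:8]  # columns 0-7: the eight features (the slice end is exclusive)
    y = dataset[:, 8]    # column 8: the class label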
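
    The `header=None` argument matters for files like this one that start directly with data. A minimal sketch with made-up rows (the variable names `with_header` and `no_header` are illustrative only) shows what happens with and without it:

```python
import io
import pandas as pd

# Made-up data with no header line, in the same nine-column layout.
text = (
    "1,100,70,30,0,25.5,0.5,40,0\n"
    "2,150,80,35,90,31.2,0.7,52,1\n"
    "0,120,60,20,0,22.0,0.3,28,0\n"
)

# Default behaviour: the first line is consumed as column names.
with_header = pd.read_csv(io.StringIO(text))

# header=None: every line is data and the columns are numbered 0..8.
no_header = pd.read_csv(io.StringIO(text), header=None)

print(len(with_header), len(no_header))
print(list(no_header.columns))
```

    Without `header=None` the first sample would silently disappear from the training data, so it is worth comparing the row count against the expected number of samples.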

    3 Summary

    • np.loadtxt returns a numpy.ndarray
    • pd.read_csv returns a pandas.core.frame.DataFrame
    • DataFrame.values is a numpy.ndarray
    • So, in essence, both methods end up with the same kind of array
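
    These points can be verified on a small in-memory CSV. This is a sketch under the assumption that the file is all-numeric, as the dataset above is; the two-row sample and the names `via_numpy` and `via_pandas` are made up for illustration.

```python
import io
import numpy as np
import pandas as pd

text = "1,2.5,0\n3,4.5,1\n"  # made-up, all-numeric sample

via_numpy = np.loadtxt(io.StringIO(text), delimiter=",")
frame = pd.read_csv(io.StringIO(text), header=None)
via_pandas = frame.values

print(type(via_numpy))   # type returned by np.loadtxt
print(type(frame))       # type returned by pd.read_csv
print(type(via_pandas))  # type of DataFrame.values
print(np.array_equal(via_numpy, via_pandas))  # same numbers from both routes
```

    pandas infers a type per column, so for files that mix text and numbers the two routes would no longer give the same result; the comparison here holds for purely numeric data.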
  • Original article: https://www.cnblogs.com/zc9527/p/6286621.html