  • Python csr_matrix: saving/loading a scipy sparse csr_matrix in a portable data format

    [Repost link]: python 的csr_python - Saving/loading a scipy sparse csr_matrix in a portable data format (weixin_39974223's blog, CSDN)

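    Below is a performance comparison of the three most upvoted answers, run in a Jupyter notebook. The input is a 1M x 100K random sparse matrix with density 0.001, containing 100M non-zero values: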

    from scipy.sparse import random

    matrix = random(1000000, 100000, density=0.001, format='csr')

    matrix

    <1000000x100000 sparse matrix of type '<type 'numpy.float64'>'
        with 100000000 stored elements in Compressed Sparse Row format>

    io.mmwrite / io.mmread

    from scipy import io  # mmwrite/mmread live in scipy.io

    %time io.mmwrite('test_io.mtx', matrix)

    CPU times: user 4min 37s, sys: 2.37 s, total: 4min 39s

    Wall time: 4min 39s

    %time matrix = io.mmread('test_io.mtx')

    CPU times: user 2min 41s, sys: 1.63 s, total: 2min 43s

    Wall time: 2min 43s

    matrix

    <1000000x100000 sparse matrix of type '<type 'numpy.float64'>'
        with 100000000 stored elements in COOrdinate format>

    Filesize: 3.0G.

    (Note that the format has changed from csr to coo.)
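    The format change is easy to undo after loading. The following is a minimal sketch on a small stand-in matrix (the sizes, the `random_state` seed and the file name `small.mtx` are assumptions chosen for illustration, not part of the benchmark): it writes a CSR matrix with `io.mmwrite`, reads it back, and converts the result to CSR again with `.tocsr()`.

```python
import os
import tempfile

from scipy import io
from scipy.sparse import random as sparse_random

# Small stand-in for the 1M x 100K benchmark matrix (illustrative sizes).
small = sparse_random(200, 50, density=0.05, format='csr', random_state=0)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'small.mtx')
    io.mmwrite(path, small)
    loaded = io.mmread(path)     # Matrix Market data comes back in COO format
    restored = loaded.tocsr()    # explicit conversion back to CSR

print(loaded.format, restored.format)
```

    The conversion costs extra time and memory on top of the already slow read, so it does not change the ranking reported here.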

    np.savez / np.load

    import numpy as np
    from scipy.sparse import csr_matrix

    def save_sparse_csr(filename, array):
        # note that .npz extension is added automatically
        np.savez(filename, data=array.data, indices=array.indices,
                 indptr=array.indptr, shape=array.shape)

    def load_sparse_csr(filename):
        # here we need to add .npz extension manually
        loader = np.load(filename + '.npz')
        return csr_matrix((loader['data'], loader['indices'], loader['indptr']),
                          shape=loader['shape'])

    %time save_sparse_csr('test_savez', matrix)

    CPU times: user 1.26 s, sys: 1.48 s, total: 2.74 s

    Wall time: 2.74 s

    %time matrix = load_sparse_csr('test_savez')

    CPU times: user 1.18 s, sys: 548 ms, total: 1.73 s

    Wall time: 1.73 s

    matrix

    <1000000x100000 sparse matrix of type '<type 'numpy.float64'>'
        with 100000000 stored elements in Compressed Sparse Row format>

    Filesize: 1.1G.
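    SciPy 0.19 and later ship `scipy.sparse.save_npz` and `scipy.sparse.load_npz`, which store the same kind of arrays in an `.npz` file without hand-written helpers. The sketch below was not part of the benchmark, so no timing is claimed for it; the small matrix and the file name `small.npz` are assumptions for illustration. Passing `compressed=False` skips compression, as plain `np.savez` does.

```python
import os
import tempfile

from scipy import sparse

# Small stand-in matrix (illustrative sizes, not the benchmark input).
small_csr = sparse.random(200, 50, density=0.05, format='csr', random_state=0)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'small.npz')
    # compressed=False stores the arrays uncompressed, like np.savez.
    sparse.save_npz(path, small_csr, compressed=False)
    loaded_npz = sparse.load_npz(path)

print(loaded_npz.format)
```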

    cPickle

    import cPickle as pickle  # Python 2 only; on Python 3 use `import pickle`

    def save_pickle(matrix, filename):
        with open(filename, 'wb') as outfile:
            pickle.dump(matrix, outfile, pickle.HIGHEST_PROTOCOL)

    def load_pickle(filename):
        with open(filename, 'rb') as infile:
            matrix = pickle.load(infile)
        return matrix

    %time save_pickle(matrix, 'test_pickle.mtx')

    CPU times: user 260 ms, sys: 888 ms, total: 1.15 s

    Wall time: 1.15 s

    %time matrix = load_pickle('test_pickle.mtx')

    CPU times: user 376 ms, sys: 988 ms, total: 1.36 s

    Wall time: 1.37 s

    matrix

    <1000000x100000 sparse matrix of type '<type 'numpy.float64'>'
        with 100000000 stored elements in Compressed Sparse Row format>

    Filesize: 1.1G.

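    Note: cPickle does not work with very large objects (see this answer). In my experience, it did not work for a 2.7M x 50k matrix with 270M non-zero values. The np.savez solution worked well.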
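    On Python 3 there is no separate cPickle module; the standard `pickle` module uses its C implementation automatically. Pickle protocol 4 (Python 3.4+) was introduced with support for very large objects, so the size limit described above may not apply there, but that has not been verified against the 270M-non-zero matrix from this note. A minimal sketch of the Python 3 equivalent, on a small illustrative matrix:

```python
import pickle

from scipy.sparse import random as sparse_random

# Small stand-in matrix (illustrative sizes, not the 2.7M x 50k case).
small_m = sparse_random(200, 50, density=0.05, format='csr', random_state=0)

# Protocol 4 is requested explicitly; it is the one documented as
# supporting very large objects.
payload = pickle.dumps(small_m, protocol=4)
unpickled = pickle.loads(payload)

print(unpickled.format, unpickled.shape)
```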

    Conclusion

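    (Based on this simple test with CSR matrices) cPickle is the fastest method, but it does not work with very large matrices; np.savez is only slightly slower, while io.mmwrite is much slower, produces a larger file, and restores the matrix in the wrong format. So np.savez is the winner.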
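    To repeat the comparison on your own data, something like the following harness could be used. It is a sketch under assumptions: it targets Python 3 (so `pickle` instead of cPickle), uses a much smaller matrix than the benchmark, and the helper names `timed`, `mm_roundtrip`, `savez_roundtrip` and `pickle_roundtrip` are hypothetical. Timings at this scale will not necessarily reproduce the ranking above.

```python
import os
import pickle
import tempfile
import time

import numpy as np
from scipy import io
from scipy.sparse import csr_matrix, random as sparse_random


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def mm_roundtrip(path, m):
    mtx_path = path + '.mtx'
    io.mmwrite(mtx_path, m)
    return io.mmread(mtx_path).tocsr()  # convert COO back to CSR


def savez_roundtrip(path, m):
    # np.savez appends the .npz extension itself.
    np.savez(path, data=m.data, indices=m.indices,
             indptr=m.indptr, shape=m.shape)
    with np.load(path + '.npz') as loader:
        return csr_matrix((loader['data'], loader['indices'], loader['indptr']),
                          shape=tuple(loader['shape']))


def pickle_roundtrip(path, m):
    with open(path, 'wb') as outfile:
        pickle.dump(m, outfile, pickle.HIGHEST_PROTOCOL)
    with open(path, 'rb') as infile:
        return pickle.load(infile)


# Illustrative input; scale the shape and density up to match your data.
bench = sparse_random(2000, 500, density=0.01, format='csr', random_state=0)

results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, fn in [('mmwrite', mm_roundtrip),
                     ('savez', savez_roundtrip),
                     ('pickle', pickle_roundtrip)]:
        loaded_m, seconds = timed(fn, os.path.join(tmp, name), bench)
        # Largest absolute difference between original and reloaded values.
        max_diff = abs(loaded_m - bench).max()
        results[name] = (seconds, max_diff)

for name, (seconds, max_diff) in results.items():
    print(f'{name}: {seconds:.4f} s, max abs diff {max_diff:.1e}')
```

    Each helper measures a full save-plus-load round trip rather than the separate save and load times reported above.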
    ————————————————
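    Copyright notice: This is an original article by CSDN blogger 「weixin_39974223」, licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
    Original link: https://blog.csdn.net/weixin_39974223/article/details/111766769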

  • Original address: https://www.cnblogs.com/huixinquan/p/15221996.html