  • PyTorch: parallel computation with multiple GPUs

    References:

    https://pytorch.org/docs/stable/nn.html

    https://github.com/apachecn/pytorch-doczh/blob/master/docs/1.0/blitz_data_parallel_tutorial.md

    https://blog.csdn.net/Answer3664/article/details/98992409

    1. torch.nn.DataParallel

    torch.nn.DataParallel(module, device_ids=None, output_device=None, dim=0)
    

    In the forward pass, the module is replicated onto each device, and each replica handles a slice of the input. During the backward pass, gradients from every replica are summed into the original module.

    • module: the model to be parallelized

    • device_ids: the devices to run on in parallel; defaults to all available CUDA devices

    • output_device: the device on which outputs are gathered; defaults to cuda:0

    • dim: the dimension along which the input is scattered; defaults to 0 (the batch dimension)

    Example:

    net = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
    output = net(input_var)  # input_var can be on any device, including CPU
    

    torch.nn.DataParallel() returns a new, wrapped model that automatically scatters the input data across the GPUs in use, so the number of input samples should be greater than the number of devices used.
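
    Because nn.DataParallel returns a wrapper, the original model's parameters live in the wrapper's .module attribute; that is what you save or load. A minimal sketch, assuming the same three device ids as above and an illustrative checkpoint name:

    # device_ids/output_device and the file name are illustrative
    net = torch.nn.DataParallel(model, device_ids=[0, 1, 2], output_device=0)
    output = net(input_var)                             # input is scattered along dim 0 across the GPUs
    torch.save(net.module.state_dict(), 'model.pt')     # save the underlying model, not the wrapper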

    2. A complete example

    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset, DataLoader
     
    # parameters and DataLoaders
    input_size = 5
    output_size = 2
     
    batch_size = 30
    data_size = 100
     
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
     
     
    # random dummy dataset
    class RandomDataset(Dataset):
     
        def __init__(self, size, length):
            self.len = length
            self.data = torch.randn(length, size)
     
        def __getitem__(self, index):
            return self.data[index]
     
        def __len__(self):
            return self.len
     
     
    rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                             batch_size=batch_size, shuffle=True)
     
     
    # a toy model for demonstration; the same applies to more complex models such as CNNs and RNNs
    class Model(nn.Module):
        def __init__(self, input_size, output_size):
            super(Model, self).__init__()
            self.fc = nn.Linear(input_size, output_size)
     
        def forward(self, input):
            output = self.fc(input)
            print('In model: input size', input.size(), 'output size:', output.size())
            return output
     
     
    # instantiate the model
    model = Model(input_size, output_size)
     
    if torch.cuda.device_count() > 1:
        print("Use", torch.cuda.device_count(), 'gpus')
        model = nn.DataParallel(model)
     
    model.to(device)
     
    for data in rand_loader:
        input = data.to(device)
        output = model(input)
        print('Outside: input size ', input.size(), 'output size: ', output.size())
    

    Output (CPU or a single GPU):

    In model: input size torch.Size([30, 5]) output size: torch.Size([30, 2])
    Outside: input size  torch.Size([30, 5]) output size:  torch.Size([30, 2])
    In model: input size torch.Size([30, 5]) output size: torch.Size([30, 2])
    Outside: input size  torch.Size([30, 5]) output size:  torch.Size([30, 2])
    In model: input size torch.Size([30, 5]) output size: torch.Size([30, 2])
    Outside: input size  torch.Size([30, 5]) output size:  torch.Size([30, 2])
    In model: input size torch.Size([10, 5]) output size: torch.Size([10, 2])
    Outside: input size  torch.Size([10, 5]) output size:  torch.Size([10, 2])
    

    With 2 GPUs:

    Use 2 GPUs!
        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
        In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
        In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])
    

    With 3 GPUs:

    Use 3 GPUs!
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
        In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
    Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
        In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
        In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
    Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])
    

    Summary:

    • DataParallel automatically splits the data and dispatches the work to the model replicas on multiple GPUs.

    • After each replica finishes its work, DataParallel gathers and merges the results and returns them to you.
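
    The example above only runs the forward pass. A minimal training-step sketch under the same setup (the loss function, optimizer, and dummy targets are illustrative, not part of the original example); because gradients from every replica are summed into the original module, the optimizer step needs no changes:

    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for data in rand_loader:
        input = data.to(device)
        target = torch.randn(input.size(0), output_size).to(device)  # dummy targets, for illustration only

        output = model(input)             # forward: scatter input, replicate module, gather output
        loss = criterion(output, target)

        optimizer.zero_grad()
        loss.backward()                   # gradients from all replicas are summed into the original module
        optimizer.step()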
