linux dma

A Simple Analysis of DMA
- Kernel: Linux 4.1
1. Introduction to DMA

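DMA stands for Direct Memory Access. "Direct" means that, once a transfer has been set up, the DMA controller reads from or writes to memory without the CPU moving the data itself.

Why do we need DMA? To lighten the CPU's load: the grunt work of moving data is handed to the DMA controller.

1.1 DMA channels

A DMA controller can have multiple channels. Having multiple channels does not mean they all transfer at the same time: every channel's requests go through the same DMA controller, so transfers on different channels are effectively serialized.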

1.2 DMA request lines

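However independent DMA is, it still needs the CPU to tell it when to start a transfer.

When the CPU initiates a DMA transfer, it does not know whether the conditions for transferring are currently met, for example whether the source device has data or whether the destination device's FIFO has room. Who does know? The device. So between each device that needs DMA and the DMA controller there are a few physical lines (called DMA request, DRQ) that the device uses to tell the controller that a transfer can proceed.

That is where DMA request lines come from. Typically, each node that sends or receives data (called an endpoint) has one DMA request line to the DMA controller (memory excepted).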
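The interplay of channels and request lines can be sketched with a small userspace model. This is not kernel code: `toy_channel`, `toy_dma_step` and the round-robin policy are assumptions made for illustration, and real controllers arbitrate in hardware with their own priority schemes. The model only shows that a channel is served when its request line is asserted, and that one controller serves one channel at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CHANNELS 3

/* Hypothetical model of one DMA channel; not a kernel structure. */
struct toy_channel {
	bool drq;          /* request line asserted by the device */
	size_t remaining;  /* bytes still to transfer */
	size_t moved;      /* bytes transferred so far */
};

/*
 * Serve at most one channel per step, round-robin, and only a channel
 * whose request line is asserted. Returns the index of the channel that
 * was served, or -1 if no channel was ready.
 */
static int toy_dma_step(struct toy_channel ch[], int *next, size_t burst)
{
	for (int i = 0; i < NUM_CHANNELS; i++) {
		int idx = (*next + i) % NUM_CHANNELS;
		struct toy_channel *c = &ch[idx];

		if (!c->drq || c->remaining == 0)
			continue;

		/* Move up to one burst, then let the next channel have a turn. */
		size_t n = c->remaining < burst ? c->remaining : burst;
		c->remaining -= n;
		c->moved += n;
		*next = (idx + 1) % NUM_CHANNELS;
		return idx;
	}
	return -1;
}
```

A channel with data queued but no request asserted is simply skipped until its device raises the line.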

1.3 DMA transfer parameters

1.3.1 transfer size

The amount of data to transfer, in bytes. The transfer stops once this many bytes have been moved.

1.3.2 transfer width

The transfer width, in bits. Some devices require a fixed number of bits to be moved in each bus cycle.

1.3.3 burst size

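The amount of data the DMA controller can buffer internally. When the source or destination is memory, accessing memory for every single transfer would be inefficient, so the controller keeps an internal buffer and moves data through it in bursts.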
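How the three parameters relate can be shown with a little arithmetic. The sketch below assumes the burst size is expressed as a number of beats of the transfer width, which is how dmaengine's `src_maxburst`/`dst_maxburst` are expressed (in units of the address width, not bytes). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one transfer; field names are illustrative. */
struct toy_xfer {
	size_t size_bytes;    /* transfer size */
	unsigned width_bits;  /* transfer width: bits moved per beat */
	unsigned burst_beats; /* beats per burst (assumed unit, see above) */
};

/* Number of bus beats needed, rounding up a partial final beat. */
static size_t toy_beats(const struct toy_xfer *x)
{
	size_t beat_bytes = x->width_bits / 8;

	assert(beat_bytes > 0);
	return (x->size_bytes + beat_bytes - 1) / beat_bytes;
}

/* Number of bursts needed, rounding up a partial final burst. */
static size_t toy_bursts(const struct toy_xfer *x)
{
	assert(x->burst_beats > 0);
	return (toy_beats(x) + x->burst_beats - 1) / x->burst_beats;
}
```

For example, a 512-byte transfer at 32 bits per beat takes 128 beats; with 8 beats per burst the controller issues 16 bursts.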

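2. Using DMA

2.1 The DMA coherency problem

The coherency problem here is between main memory and the cache: before accessing memory, the CPU first checks whether the access hits in the cache.

Suppose the CPU writes to memory address 0x100 and the access hits in the cache. With a write-back cache, the data goes into the cache first, so the cache holds the newest data while memory at 0x100 still holds the old data. If a DMA transfer starts at this point and reads address 0x100, it gets the stale data.

How is this solved?

The kernel provides functions for allocating DMA memory that take care of coherency.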
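The stale-read scenario can be reproduced with a toy model of a one-line write-back cache. Everything here (`toy_system`, the `toy_*` functions, and writing straight to memory on a miss) is an assumption made for illustration; it is a userspace sketch, not how the kernel or any particular CPU implements caching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-line write-back cache in front of one memory word. */
struct toy_system {
	uint32_t memory;     /* the word at address 0x100 in main memory */
	uint32_t cache_data; /* cached copy of that word */
	bool cache_valid;
	bool cache_dirty;
};

/* CPU write: on a hit, only the cache line is updated (write-back). */
static void toy_cpu_write(struct toy_system *s, uint32_t value)
{
	if (s->cache_valid) {
		s->cache_data = value;
		s->cache_dirty = true;
	} else {
		s->memory = value; /* assumed: a miss writes straight to memory */
	}
}

/* Write a dirty line back to memory (a cache "clean"). */
static void toy_cache_clean(struct toy_system *s)
{
	if (s->cache_valid && s->cache_dirty) {
		s->memory = s->cache_data;
		s->cache_dirty = false;
	}
}

/* The DMA controller bypasses the cache and sees main memory only. */
static uint32_t toy_dma_read(const struct toy_system *s)
{
	return s->memory;
}
```

Calling `toy_dma_read` right after `toy_cpu_write` on a cached line returns the old value; calling `toy_cache_clean` first makes the new value visible to the DMA side. The kernel's DMA API exists to make sure one of the two remedies is in place: either the buffer is not cached at all, or the cache is synchronized before the device touches the buffer.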

2.1.1 Coherent DMA buffers

Start with the LCD driver mxsfb.c. The call chain is:

mxsfb_probe

mxsfb_init_fbinfo

mxsfb_map_videomem
```
static int mxsfb_map_videomem(struct fb_info *fbi)
{
	/* ... */
	fbi->screen_base = dma_alloc_writecombine(fbi->device,
				fbi->fix.smem_len,
				(dma_addr_t *)&fbi->fix.smem_start,
				GFP_DMA | GFP_KERNEL);
	/* ... */
}
```
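This LCD driver uses dma_alloc_writecombine to allocate the LCD memory. What does the kernel do with it? It may need to remap this memory, marking the pages as uncached; that attribute is stored in the page tables. In short, the region is marked as memory that does not go through the cache.

A closely related function is dma_alloc_coherent. On ARM, the difference as usually described is that dma_alloc_writecombine disables the cache but keeps the write buffer, while dma_alloc_coherent disables the cache and does not use the write buffer either. What is the write buffer? It is a small buffer between the CPU and memory that is used only for writes: stores are queued there, and may be merged, before being written out to memory, so the CPU does not have to wait for each one.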

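What if there is no memory left that can be mapped uncached? Then the cache has to be synchronized explicitly. The kernel offers dma_cache_sync for this (in this kernel generation it is documented for memory obtained from dma_alloc_noncoherent); it writes the cached data back to main memory.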

2.1.2 Streaming DMA mappings

Now look at the MMC driver mxs-mmc.c:

mxs_mmc_prep_dma
```
static struct dma_async_tx_descriptor *mxs_mmc_prep_dma(
	struct mxs_mmc_host *host, unsigned long flags)
{
	/* ... */
	if (data) {
		/* data */
		dma_map_sg(mmc_dev(host->mmc), data->sg,
			   data->sg_len, ssp->dma_dir);
		sgl = data->sg;
		sg_len = data->sg_len;
	} else {
		/* pio */
		sgl = (struct scatterlist *) ssp->ssp_pio_words;
		sg_len = SSP_PIO_NUM;
	}

	desc = dmaengine_prep_slave_sg(ssp->dmach,
				sgl, sg_len, ssp->slave_dirn, flags);
	if (desc) {
		desc->callback = mxs_mmc_dma_irq_callback;
		desc->callback_param = host;
	} else {
		/* ... */
	}
	/* ... */
}

static void mxs_mmc_bc(struct mxs_mmc_host *host)
{
	/* ... */
	desc = mxs_mmc_prep_dma(host, DMA_CTRL_ACK);
	if (!desc)
		goto out;

	dmaengine_submit(desc);
	dma_async_issue_pending(ssp->dmach);
	/* ... */
}
```
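The MMC driver uses dma_map_sg. Note that this call does not allocate memory; it maps an already existing scatterlist for DMA. Unlike a coherent mapping, a streaming mapping is set up only when a DMA transfer is about to start and is torn down as soon as the data has been transferred: the completion path that starts at mxs_mmc_dma_irq_callback calls dma_unmap_sg to remove the mapping.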
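The map/transfer/unmap life cycle of a streaming mapping can be modelled as a hand-over of buffer ownership. The following is a userspace sketch under that assumption; `toy_map` and `toy_unmap` are hypothetical stand-ins and do not call or reproduce the kernel's `dma_map_sg`/`dma_unmap_sg`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum toy_owner { TOY_OWNER_CPU, TOY_OWNER_DEVICE };

/* Hypothetical buffer tracked by the model; not a kernel type. */
struct toy_buf {
	char data[16];
	enum toy_owner owner;
};

static int toy_active_mappings; /* mappings not yet unmapped */

/* Model of "map": hand the buffer to the device just before a transfer. */
static void toy_map(struct toy_buf *b)
{
	assert(b->owner == TOY_OWNER_CPU);
	b->owner = TOY_OWNER_DEVICE;
	toy_active_mappings++;
}

/* Model of "unmap": give the buffer back to the CPU after the transfer. */
static void toy_unmap(struct toy_buf *b)
{
	assert(b->owner == TOY_OWNER_DEVICE);
	b->owner = TOY_OWNER_CPU;
	toy_active_mappings--;
}

/* The CPU may only touch the buffer while it owns it. */
static bool toy_cpu_fill(struct toy_buf *b, const char *s)
{
	if (b->owner != TOY_OWNER_CPU)
		return false;
	strncpy(b->data, s, sizeof(b->data) - 1);
	b->data[sizeof(b->data) - 1] = '\0';
	return true;
}
```

Each `toy_map` is paired with exactly one `toy_unmap`, so `toy_active_mappings` returns to zero after every request. A driver that maps on every request but never unmaps would let that count grow without bound.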

2.1.3 Cache Coherent interconnect

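The above describes conventional DMA. Some SoCs keep the CPU and peripherals cache coherent in hardware, for example by integrating a "cache coherent interconnect", which lets DMA snoop the CPU's caches or performs the cache maintenance on its behalf. In that case the memory returned by dma_alloc_coherent() no longer needs to be uncached.

2.2 Using DMA

Let's see how an existing driver does it. Open s3cmci.c and look at s3cmci_probe: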
```
static int s3cmci_probe(struct platform_device *pdev)
{
	if (s3cmci_host_usedma(host)) {
		dma_cap_mask_t mask;

		dma_cap_zero(mask);
		dma_cap_set(DMA_SLAVE, mask);

		host->dma = dma_request_slave_channel_compat(mask,
			s3c24xx_dma_filter, (void *)DMACH_SDI, &pdev->dev, "rx-tx");
		if (!host->dma) {
			dev_err(&pdev->dev, "cannot get DMA channel.\n");
			ret = -EBUSY;
			goto probe_free_gpio_wp;
		}
	}
}
```
The DMA channel is requested with dma_request_slave_channel_compat. Next, look at s3cmci_prepare_dma:
```
static int s3cmci_prepare_dma(struct s3cmci_host *host, struct mmc_data *data)
{
	int rw = data->flags & MMC_DATA_WRITE;
	struct dma_async_tx_descriptor *desc;
	struct dma_slave_config conf = {
		.src_addr = host->mem->start + host->sdidata,
		.dst_addr = host->mem->start + host->sdidata,
		.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
		.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
	};

	BUG_ON((data->flags & BOTH_DIR) == BOTH_DIR);

	/* Restore prescaler value */
	writel(host->prescaler, host->base + S3C2410_SDIPRE);

	if (!rw)
		conf.direction = DMA_DEV_TO_MEM;
	else
		conf.direction = DMA_MEM_TO_DEV;

	dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
			     rw ? DMA_TO_DEVICE : DMA_FROM_DEVICE);

	dmaengine_slave_config(host->dma, &conf);
	desc = dmaengine_prep_slave_sg(host->dma, data->sg, data->sg_len,
		conf.direction,
		DMA_CTRL_ACK | DMA_PREP_INTERRUPT);
	if (!desc)
		goto unmap_exit;
	desc->callback = s3cmci_dma_done_callback;
	desc->callback_param = host;
	dmaengine_submit(desc);
	dma_async_issue_pending(host->dma);

	return 0;

unmap_exit:
	dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
			     rw ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
	return -ENOMEM;
}
```
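The first thing to note is that the s3c MMC controller driver uses a streaming mapping, via dma_map_sg. It then calls dmaengine_slave_config to configure the channel.

Some members of struct dma_slave_config:

src_addr: the source address, i.e. where data is read from when the direction is dev2mem or dev2dev (usually a fixed FIFO address). For a mem2dev channel it need not be set (the memory address is given with each transfer);

dst_addr: the destination address, i.e. where data is written when the direction is mem2dev or dev2dev (usually a fixed FIFO address). For a dev2mem channel it need not be set (the memory address is given with each transfer);

src_addr_width: width of the source data

dst_addr_width: width of the destination data

direction: transfer direction

The transfer itself is prepared with dmaengine_prep_slave_sg, which takes the mapped scatterlist. A note on its flags parameter: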
/**
 * enum dma_ctrl_flags - DMA flags to augment operation preparation,
 *  control completion, and communicate status.
 * @DMA_PREP_INTERRUPT - trigger an interrupt (callback) upon completion of
 *  this transaction (one interrupt is raised when the transfer finishes)
 * @DMA_CTRL_ACK - if clear, the descriptor cannot be reused until the client
 *  acknowledges receipt, i.e. has had a chance to establish any dependency
 *  chains
 * @DMA_PREP_PQ_DISABLE_P - prevent generation of P while generating Q
 * @DMA_PREP_PQ_DISABLE_Q - prevent generation of Q while generating P
 * @DMA_PREP_CONTINUE - indicate to a driver that it is reusing buffers as
 *  sources that were the result of a previous operation, in the case of a PQ
 *  operation it continues the calculation with new sources
 * @DMA_PREP_FENCE - tell the driver that subsequent operations depend
 *  on the result of this operation
 * @DMA_CTRL_REUSE: client can reuse the descriptor and submit again till
 *  cleared or freed
 */
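On success, dmaengine_prep_slave_sg returns a descriptor (struct dma_async_tx_descriptor). If DMA_PREP_INTERRUPT was passed, a callback should be provided; it is invoked when the transfer completes. dmaengine_submit then places the transfer on the channel's pending queue, and finally dma_async_issue_pending starts the transfer.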
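The split between submitting and actually starting a transfer is easy to miss, so here is a userspace model of it. The `toy_*` names are hypothetical, and the model runs everything synchronously, whereas the real dmaengine completes transfers asynchronously and calls back later. It only illustrates that "submit" queues work and nothing moves until "issue pending".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 4

/* Hypothetical descriptor; loosely mirrors callback/callback_param. */
struct toy_desc {
	void (*callback)(void *param);
	void *callback_param;
};

/* Hypothetical channel with a queue of submitted-but-not-started work. */
struct toy_chan {
	struct toy_desc *pending[TOY_QUEUE_LEN];
	size_t npending;
	unsigned completed;
};

/* Model of "submit": only queues the descriptor. Returns 0, or -1 if full. */
static int toy_submit(struct toy_chan *c, struct toy_desc *d)
{
	if (c->npending == TOY_QUEUE_LEN)
		return -1;
	c->pending[c->npending++] = d;
	return 0;
}

/* Model of "issue pending": run queued work in order, then call back. */
static void toy_issue_pending(struct toy_chan *c)
{
	for (size_t i = 0; i < c->npending; i++) {
		struct toy_desc *d = c->pending[i];

		c->completed++; /* the "transfer" happens here */
		if (d->callback)
			d->callback(d->callback_param);
	}
	c->npending = 0;
}

/* Example completion callback: records completion in a caller-owned flag. */
static void toy_done(void *param)
{
	*(int *)param = 1;
}
```

After `toy_submit` alone, the flag passed to `toy_done` is still 0; it becomes 1 only once `toy_issue_pending` has run.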

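One thing I am unsure about:

When the transfer completes, the callback clears the interrupt flag, but why do I not see the streaming mapping being released in the callback? In the whole file, the only unmap I can find is on the path where preparing the sg descriptor fails. I really do not understand this. As s3cmci_prepare_dma runs again and again, won't this cause a leak?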
```
static void s3cmci_dma_done_callback(void *arg)
{
	struct s3cmci_host *host = arg;
	unsigned long iflags;

	BUG_ON(!host->mrq);
	BUG_ON(!host->mrq->data);

	spin_lock_irqsave(&host->complete_lock, iflags);

	dbg(host, dbg_dma, "DMA FINISHED\n");

	host->dma_complete = 1;
	host->complete_what = COMPLETION_FINALIZE;

	tasklet_schedule(&host->pio_tasklet);
	spin_unlock_irqrestore(&host->complete_lock, iflags);

}
```
Looking at mxs-mmc.c, I found that its DMA completion path does release the mapping:
```
static void mxs_mmc_request_done(struct mxs_mmc_host *host)
{
	/* ... */
	if (data) {
		dma_unmap_sg(mmc_dev(host->mmc), data->sg,
			     data->sg_len, ssp->dma_dir);
		/* ... */
	}
	/* ... */
}
```
2.2.1 Using DMA for the framebuffer

Look at mxsfb.c:
```
static int mxsfb_init_fbinfo(struct mxsfb_info *host,
			struct fb_videomode *vmode)
{
    /* Memory allocation for framebuffer */
	fb_size = SZ_2M;
	fb_virt = dma_alloc_wc(dev, PAGE_ALIGN(fb_size), &fb_phys, GFP_KERNEL);
	if (!fb_virt)
		return -ENOMEM;

	fb_info->fix.smem_start = fb_phys;
	fb_info->screen_base = fb_virt;
	fb_info->screen_size = fb_info->fix.smem_len = fb_size;
}
```
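This is a coherent mapping allocated with dma_alloc_wc (the newer name of dma_alloc_writecombine), which disables the cache but uses the write buffer. Memory allocated this way is contiguous from the device's point of view, which without an IOMMU means physically contiguous. Why must it be contiguous? Because the LCD controller simply fetches consecutive addresses from memory.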
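Why a scan-out engine wants one contiguous block follows from how it computes addresses. The sketch below assumes a simple packed layout with no line padding; `toy_fb` and its fields are hypothetical and are not taken from mxsfb.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scan-out geometry; names are illustrative. */
struct toy_fb {
	uint32_t base;            /* bus address of the first pixel */
	uint32_t xres;            /* pixels per line */
	uint32_t yres;            /* number of lines */
	uint32_t bytes_per_pixel;
};

/* Address the controller fetches for pixel (x, y): one linear formula. */
static uint32_t toy_pixel_addr(const struct toy_fb *fb, uint32_t x, uint32_t y)
{
	assert(x < fb->xres && y < fb->yres);
	return fb->base + (y * fb->xres + x) * fb->bytes_per_pixel;
}

/* Total bytes the controller walks through for one frame. */
static size_t toy_fb_bytes(const struct toy_fb *fb)
{
	return (size_t)fb->xres * fb->yres * fb->bytes_per_pixel;
}
```

With an 800x480 panel at 2 bytes per pixel, a frame spans 768000 consecutive bytes starting at `base`, which fits in the 2 MiB the driver allocates. The controller has only a start address and a counter, so it has no way to follow scattered pages.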

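Reading the driver closely, you will notice that, unlike the MMC driver, it never requests or configures a DMA channel. The reason is that this memory is not moved by the general-purpose DMA controller; the LCD controller fetches it itself, which is why no DMA channel is registered.

Reference: Linux DMA Engine framework (1): Overview (wowotech.net)

Reference: Linux Memory Management: DMA and Coherent Caches, 落尘纷扰's column, CSDN blog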

Original article: https://www.cnblogs.com/r1chie/p/14809873.html
