  • Image processing Vitis -> (V)DMA -> HLS IP block -> (V)DMA on a bare metal application.

    https://forums.xilinx.com/t5/High-Level-Synthesis-HLS/Image-processing-Vitis-gt-V-DMA-gt-HLS-IP-block-gt-V-DMA-on-a/td-p/1104436

    Hello

    I am trying to read an image in a bare-metal application (later on PetaLinux) from SDK/Vitis and process it through a custom IP core created in HLS. The HLS IP core consists of a blur and a threshold:

    filter.hpp contains:

    #include <hls_video.h> // HLS video library (hls::Mat, AXIvideo2Mat, ...)

    #define WIDTH 640
    #define HEIGHT 480

    typedef hls::stream<ap_axiu<16,1,1,1> > AXI_STREAM;
    typedef hls::Mat<HEIGHT, WIDTH, HLS_8UC3> RGB_IMAGE; //3Channel
    typedef hls::Mat<HEIGHT, WIDTH, HLS_8UC1> GRAY_IMAGE; //1Channel
    typedef hls::Mat<HEIGHT, WIDTH, HLS_8UC2> YUV_IMAGE; //2Channel

    void filter(AXI_STREAM& video_in, AXI_STREAM& video_out);

    --------------------------------------------------------------------------------------------

    filter.cpp contains :

    #include "filter.hpp"

    void filter(AXI_STREAM& video_in, AXI_STREAM& video_out)
    {

    //Create axi streaming interfaces for core.
    #pragma HLS INTERFACE axis port=video_in
    #pragma HLS INTERFACE axis port=video_out
    #pragma HLS INTERFACE ap_ctrl_none port=return

    GRAY_IMAGE img_1(HEIGHT, WIDTH);
    GRAY_IMAGE img_2(HEIGHT, WIDTH);
    GRAY_IMAGE img_3(HEIGHT, WIDTH);

    #pragma HLS dataflow
    hls::AXIvideo2Mat(video_in, img_1);

    hls::GaussianBlur<5,5>(img_1, img_2, 0, 0);
    hls::Threshold(img_2 ,img_3 , 200,255,HLS_THRESH_BINARY);

    hls::Mat2AXIvideo(img_3, video_out);

    }
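
    When comparing the core's output against expected values in a C test bench, a plain C++ reference for the threshold stage can be handy. The sketch below assumes HLS_THRESH_BINARY follows the usual OpenCV rule (a pixel becomes maxval when it is strictly greater than the threshold, otherwise 0). The function name threshold_binary_ref is hypothetical and not part of the HLS video library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference for a binary threshold (assumed OpenCV-style rule):
// out = maxval if in > thresh, else 0.
std::vector<uint8_t> threshold_binary_ref(const std::vector<uint8_t>& src,
                                          uint8_t thresh, uint8_t maxval) {
    std::vector<uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = (src[i] > thresh) ? maxval : 0;
    }
    return dst;
}
```

    With the values used in filter.cpp (threshold 200, maxval 255), a pixel of 200 would map to 0 and a pixel of 201 to 255 under this assumption.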

    --------------------------------------------------------------------------------------------

    Converting an image into an AXI stream in the test bench is fairly easy, and everything works as intended in simulation. My question is how to take an image in Vitis, using a DMA or VDMA and with or without the OpenCV libraries, and convert it into an AXI4-Stream. I have tried with a DMA, but the DEVICE_TO_DMA transfer gets stuck, and I think this is related to the AXI4-Stream protocol, where the DMA requires 'data.last=1'?

    I have searched most of the forums already and would like any suggestions on good examples with a full workflow from HLS-Vivado-Vitis-PetaLinux.

    Thank you

    Gilles

    ===============================================

    I would recommend using the Video DMA instead of the DMA in this instance.  The Video DMA will format the associated data signals (tlast and tuser) in the way that the hls::AXIvideo2Mat() function is expecting.  When video is transported over an AXI4-Stream, there is a protocol that should be followed.  This is discussed in more detail in UG934:

    https://www.xilinx.com/support/documentation/ip_documentation/axi_videoip/v1_0/ug934_axi_videoIP.pdf

    Basically, there is a "tuser" signal that is used to denote the Start of Frame.  It goes high on the first pixel of an image.  Many video IP will use tuser to sync to the start of a frame.  The "tlast" signal goes high on the last pixel of each line.  Many video IP will use tlast to make sure the image is the same size as they are expecting.  The Video DMA will format video in that format.  The DMA will not.
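
    The tuser/tlast framing described above can be sketched in plain C++. This is only a software model, assuming one 8-bit pixel per beat. The VideoBeat struct and frame_to_beats function are hypothetical names for illustration, not part of any Xilinx library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One AXI4-Stream video beat, reduced to the signals discussed above.
struct VideoBeat {
    uint8_t data;  // pixel value (one 8-bit pixel per beat assumed)
    bool user;     // tuser: high only on the first pixel of the frame
    bool last;     // tlast: high on the last pixel of every line
};

// Turn a row-major frame into the sequence of beats a video source would send.
std::vector<VideoBeat> frame_to_beats(const std::vector<uint8_t>& pixels,
                                      std::size_t width, std::size_t height) {
    assert(pixels.size() == width * height);
    std::vector<VideoBeat> beats;
    beats.reserve(pixels.size());
    for (std::size_t row = 0; row < height; ++row) {
        for (std::size_t col = 0; col < width; ++col) {
            VideoBeat b;
            b.data = pixels[row * width + col];
            b.user = (row == 0 && col == 0);  // start of frame
            b.last = (col == width - 1);      // end of line
            beats.push_back(b);
        }
    }
    return beats;
}
```

    For a 640x480 frame this model produces a single tuser pulse and 480 tlast pulses. That differs from a stream carrying one tlast at the very end of the buffer.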

    Also, check out the Xilinx Video Series for some good background on doing video in a Xilinx device.

    https://forums.xilinx.com/t5/Video-and-Audio/Xilinx-Video-Series/td-p/849583

    Ted Booth | Tech. Lead FPGA Design Engineer | DesignLinx Solutions
    https://www.designlinxhs.com

    ===============================================

    Hi, thank you for the answer.

    I now have more insight into the protocol, but I don't see how you can set up a transfer between the VDMA and the designed IP core without a video input. Is there a function that acts the same as XAxiDma_SimpleTransfer()?

    To understand the protocol transfers, I created a simple gain core that takes an array of 1000 integers as input and debugged it with an ILA. Transferring this array to the gain IP core is done with a DMA (because I don't know how a VDMA transfer works without a video input) and works as intended. I found that as soon as I change the input array size in the bare-metal application code from 1000 to one higher or one lower, the DMA won't do the transfer at all. This might explain the problems in the other image processing designs I have tried.

    My new question then is: how would you do this with a char array of image data (using a MATLAB script to generate an input array of chars from an image file)? The image would be 640x480, so the array would be defined as 'unsigned char [640x480] = {...image data...}' and then transferred to the DMA as I did with the integer array. Is this the wrong way to approach the image processing?
    I would love to hear your opinion on this.

    Thank you for your help and answer!

    Gilles

    ===============================================

    The VDMA has drivers similar to the DMA.  There is a little more to set up because it is transferring a rectangular region of memory instead of one long line of memory.  The VDMA knows nothing about video inputs.  It operates essentially the same way as the DMA.  It either reads from a memory buffer and outputs an AXI4-Stream or it inputs an AXI4-Stream and writes it to a memory buffer.  Its value is that it makes transferring image frames a little easier.
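
    As a rough software model of what "a rectangular region of memory" means here, the sketch below reads a fixed number of bytes per line, for a fixed number of lines, and jumps by a stride between lines. The names FrameRegion and read_region are hypothetical. In the Xilinx xaxivdma driver the corresponding values are, to my knowledge, the HoriSizeInput, VertSizeInput and Stride fields of XAxiVdma_DmaSetup, but treat that as an assumption and check the driver header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of the rectangle a VDMA read channel walks through.
struct FrameRegion {
    std::size_t hsize_bytes;   // bytes read per line
    std::size_t vsize_lines;   // number of lines
    std::size_t stride_bytes;  // distance between line starts in memory
};

// Model of the read side: gather the rectangle into one linear byte stream.
std::vector<uint8_t> read_region(const std::vector<uint8_t>& mem,
                                 std::size_t base, const FrameRegion& r) {
    assert(r.hsize_bytes <= r.stride_bytes);
    std::vector<uint8_t> out;
    out.reserve(r.hsize_bytes * r.vsize_lines);
    for (std::size_t line = 0; line < r.vsize_lines; ++line) {
        const std::size_t start = base + line * r.stride_bytes;
        assert(start + r.hsize_bytes <= mem.size());
        for (std::size_t i = 0; i < r.hsize_bytes; ++i) {
            out.push_back(mem[start + i]);
        }
    }
    return out;
}
```

    For a tightly packed 640x480 8-bit image, hsize_bytes and stride_bytes would both be 640 and vsize_lines would be 480, so the rectangle degenerates into one contiguous buffer.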

    When you set up your DMA in Vivado, did you select "Allow Unaligned Transfers"?  If not, then you are likely running into a known data transfer limitation of the DMA.  The DMA has a "Memory Map Data Width" and a "Stream Data Width".  The transfers on the Stream side have to be a multiple of the Memory Map side.  For instance, if the MM Data Width is 32 and the S Data Width is 8, then the transfers have to be a multiple of 4 (32/8).  If your application does not meet this criterion, then you need to set up your DMA with "Allow Unaligned Transfers" turned on.
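
    Taking the rule exactly as stated above, a small check like the following could be run on a transfer length before starting the DMA. It is a sketch only: the function names are hypothetical, and measuring the length in stream-width words is an assumption about how the rule is meant to be applied.

```cpp
// Number of stream words that make up one memory-map word, e.g. 32/8 = 4.
// Returns 1 when the stream is at least as wide as the memory-map side.
constexpr unsigned stream_words_per_mm_word(unsigned mm_width_bits,
                                            unsigned stream_width_bits) {
    return (stream_width_bits >= mm_width_bits)
               ? 1u
               : mm_width_bits / stream_width_bits;
}

// True when a transfer of n stream words satisfies the multiple-of rule
// without "Allow Unaligned Transfers" enabled.
constexpr bool transfer_length_ok(unsigned n_stream_words,
                                  unsigned mm_width_bits,
                                  unsigned stream_width_bits) {
    return n_stream_words %
               stream_words_per_mm_word(mm_width_bits, stream_width_bits) == 0;
}
```

    With an MM width of 32 and a stream width of 8, a length of 1000 passes this check while 999 and 1001 do not.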

    There are a number of ways to get an image from MATLAB into the DDR memory of your FPGA board.  The simplest is to store the data to a text file as an array that you can add to your software and compile into the executable.  Once the program is loaded and running, you can point your DMA (or VDMA) to this buffer to read it out and stream it to your IP.
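
    The "store the data as an array" step can be done with any script. As a minimal sketch in C++ rather than MATLAB, the hypothetical helper below formats a pixel buffer as the text of a C array definition that could be pasted into a source file. The function name to_c_array and the array name used in the test are illustrative only.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Format pixel data as the text of a C array definition,
// e.g. "const unsigned char img[3] = {1, 2, 255};".
std::string to_c_array(const std::vector<uint8_t>& pixels,
                       const std::string& name) {
    std::ostringstream os;
    os << "const unsigned char " << name << "[" << pixels.size() << "] = {";
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        if (i != 0) {
            os << ", ";
        }
        os << static_cast<unsigned>(pixels[i]);  // print as a number, not a char
    }
    os << "};";
    return os.str();
}
```

    Once such an array is compiled into the application, its address is what gets passed to the DMA or VDMA driver. On a processor with data caches the buffer typically has to be flushed to DDR before the DMA reads it.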

    If you haven't done so already, I suggest checking out the Xilinx Video Series

    https://forums.xilinx.com/t5/Video-and-Audio/Xilinx-Video-Series/td-p/849583

    There are tutorials on setting up a VDMA as well as creating and using HLS IP.

    Ted Booth | Tech. Lead FPGA Design Engineer | DesignLinx Solutions
    https://www.designlinxhs.com
  • Original article: https://www.cnblogs.com/ztguang/p/15228722.html