  • OpenMPI Source Code Analysis 4: The Documentation in the rte.h Header File

    In the previous article we mentioned that rte.h contains some valuable documentation:

    Let's go through it block by block. The first block is about the process name object:

     * (a) Process name objects and operations                // the process name object
     *     1. Definitions for integral types ompi_jobid_t and ompi_vpid_t.
     *        The jobid must be unique for a given MPI_COMM_WORLD capable of
     *        connecting to another OMPI_COMM_WORLD and the vpid will be the
     *        process's rank in MPI_COMM_WORLD.
     *     2. ompi_process_name_t - a struct that must contain at least two integer-typed fields:
     *           a. ompi_jobid_t jobid      // job ID; presumably this identifies an MPI_COMM_WORLD: each world is tagged with its job ID so that different worlds can connect to each other
     *           b. ompi_vpid_t vpid        // the process's rank in MPI_COMM_WORLD
     *        Note that the structure can contain any number of fields beyond these
     *        two, so the process name struct for any particular RTE can be whatever
     *        is desired.
     *     3. OMPI_NAME_PRINT - a macro that prints a process name when given      // macro that prints a process name
     *        a pointer to ompi_process_name_t. The output format is to be
     *        a single string representing the name.  This function should
     *        be thread-safe for multiple threads to call simultaneously.
     *     4. OMPI_PROC_MY_NAME - a pointer to a global variable containing
     *        the ompi_process_name_t for this process. Typically, this is
     *        stored as a field in the ompi_process_info_t struct, but that
     *        is not a requirement.
     *     5. OMPI_NAME_WIlDCARD - a wildcard name.
     *     6. ompi_rte_compare_name_fields - a function used to compare fields
     *        in the ompi_process_name_t struct. The function prototype must be
     *        of the form:
     *        int ompi_rte_compare_name_fields(ompi_rte_cmp_bitmask_t mask,
     *                                         ompi_process_name_t *name1,
     *                                         ompi_process_name_t *name2);
     *        The bitmask must be defined to indicate the fields to be used
     *        in the comparison. Fields not included in the mask must be ignored.
     *        Supported bitmask values must include:
     *           b. OMPI_RTE_CMP_JOBID
     *           c. OMPI_RTE_CMP_VPID
     *           d. OMPI_RTE_CMP_ALL
     *      7. uint64_t ompi_rte_hash_name(name) - return a string hash uniquely
     *         representing the ompi_process_name passed in.
     *      8. OMPI_NAME - an Opal DSS constant for a handler already registered      // serialization and deserialization? What is DSS?
     *         to serialize/deserialize an ompi_process_name_t structure.
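To make the requirements in block (a) concrete, here is a minimal standalone sketch of a process name, a mask-driven comparison, a hash, and a print helper. Every `demo_` / `DEMO_` name is hypothetical and is not Open MPI's API. The 32-bit field widths, the -1/0/1 return convention, and the `[jobid,vpid]` output format are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for ompi_jobid_t, ompi_vpid_t and ompi_process_name_t. */
typedef uint32_t demo_jobid_t;
typedef uint32_t demo_vpid_t;

typedef struct {
    demo_jobid_t jobid;  /* identifies the job, i.e. one MPI_COMM_WORLD */
    demo_vpid_t  vpid;   /* rank of the process inside that job */
} demo_process_name_t;

/* Bitmask selecting which fields take part in a comparison. */
typedef uint8_t demo_cmp_bitmask_t;
#define DEMO_CMP_JOBID 0x01
#define DEMO_CMP_VPID  0x02
#define DEMO_CMP_ALL   (DEMO_CMP_JOBID | DEMO_CMP_VPID)

/* Three-way comparison restricted to the fields named in mask.
 * Returns -1, 0 or 1 (assumed convention); unmasked fields are ignored. */
static int demo_compare_name_fields(demo_cmp_bitmask_t mask,
                                    const demo_process_name_t *a,
                                    const demo_process_name_t *b)
{
    assert(a != NULL && b != NULL);
    if ((mask & DEMO_CMP_JOBID) && a->jobid != b->jobid) {
        return a->jobid < b->jobid ? -1 : 1;
    }
    if ((mask & DEMO_CMP_VPID) && a->vpid != b->vpid) {
        return a->vpid < b->vpid ? -1 : 1;
    }
    return 0;
}

/* Pack both 32-bit fields into one 64-bit value, so distinct names
 * always produce distinct hashes. */
static uint64_t demo_hash_name(const demo_process_name_t *name)
{
    assert(name != NULL);
    return ((uint64_t)name->jobid << 32) | (uint64_t)name->vpid;
}

/* Printing into a caller-supplied buffer keeps the helper free of shared
 * state, which is one simple way to make it safe to call from many threads. */
static int demo_name_print(char *buf, size_t len, const demo_process_name_t *name)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "[%u,%u]", (unsigned)name->jobid, (unsigned)name->vpid);
}
```

With names {7, 0} and {7, 3}, a comparison under `DEMO_CMP_JOBID` reports equality because only the job ID is examined, while `DEMO_CMP_VPID` and `DEMO_CMP_ALL` report a difference.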
    

      

    The second block is about collective information exchange:

      * (b) Collective objects and operations                 // collective objects
     *     1. ompi_rte_collective_t - an OPAL object used during RTE collective operations      // modex: the step in which every process "discovers" the interconnect contact information of all other processes in the job
     *        such as modex and barrier. It must be an opal_list_item_t and contain the
     *        following fields:
     *           a. id (ORTE type: int32_t)
     *           b. bool active
     *              flag that user can poll on to know when collective
     *              has completed - set to false just prior to
     *              calling user callback function, if provided
     *     2. ompi_rte_modex - a function that performs an exchange of endpoint information      // message exchange between the processes on the various nodes
     *        to wireup the MPI transports. The function prototype must be of the form:
     *        int ompi_rte_modex(ompi_rte_collective_t *coll);
     *        At the completion of the modex operation, the coll->active flag must be set
     *        to false, and the endpoint information must be stored in the modex database.      // the modex information is stored in a database
     *        This function must have barrier semantics across the MPI_COMM_WORLD of the
     *        calling process.
     *     3. ompi_rte_barrier - a function that performs a barrier operation within the
     *        RTE. The function prototype must be of the form:
     *        int ompi_rte_barrier(ompi_rte_collective_t *coll);
     *        At the completion of the barrier operation, the coll->active flag must be set
     *        to false
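The `active` flag described in block (b) can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not Open MPI's implementation: all `demo_` names, the `pending` counter, and the callback signature are hypothetical. It only demonstrates the documented ordering, where the flag is cleared just before the user callback runs and the caller polls the flag to detect completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct demo_collective demo_collective_t;
typedef void (*demo_coll_cb_t)(demo_collective_t *coll, void *cbdata);

struct demo_collective {
    int32_t        id;
    bool           active;   /* true while the collective is in flight */
    demo_coll_cb_t cb;       /* optional user callback */
    void          *cbdata;
    int            pending;  /* hypothetical: peers that have not arrived yet */
};

/* Start a barrier among npeers simulated processes. */
static int demo_barrier(demo_collective_t *coll, int npeers)
{
    assert(coll != NULL && npeers > 0);
    coll->pending = npeers;
    coll->active = true;
    return 0;
}

/* Simulate one peer reaching the barrier. When the last one arrives,
 * clear the flag first and only then invoke the callback. */
static void demo_peer_arrived(demo_collective_t *coll)
{
    assert(coll != NULL && coll->active && coll->pending > 0);
    if (--coll->pending == 0) {
        coll->active = false;
        if (coll->cb != NULL) {
            coll->cb(coll, coll->cbdata);
        }
    }
}

/* Poll the flag until the collective completes. The call to
 * demo_peer_arrived stands in for driving a progress engine. */
static int demo_wait(demo_collective_t *coll)
{
    int polls = 0;
    assert(coll != NULL);
    while (coll->active) {
        demo_peer_arrived(coll);
        ++polls;
    }
    return polls;
}

/* Example callback: records whether the flag was already cleared when it ran. */
static void demo_record_cb(demo_collective_t *coll, void *cbdata)
{
    *(bool *)cbdata = !coll->active;
}
```

Starting a barrier with three peers and calling `demo_wait` takes three polls, and `demo_record_cb` observes that `active` was already false when it was invoked.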
    

      For more about the modex operation, the only reference I could find is:  https://github.com/open-mpi/ompi/wiki/ModexlessLaunch

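    Launching an MPI job usually requires not only that the individual processes be spawned on their nodes, but also that each process "discover" the relevant interconnect contact information for every other process in the job. The default launch mechanism for this latter step is called the "modex", and it consists of several steps: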

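    1. At startup, each process opens each interface driver to query the local node for available interfaces.
    2. Each driver that has an interface registers a modex entry containing the contact information for that interface.
    3. The processes in the job perform a collective operation to exchange their individual contact information. This is a blocking operation that occurs during MPI_Init.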

    In other words, MPI_Init performs an exchange of information among all the processes, and that exchange is synchronized.
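The three modex steps can be sketched as a single-process simulation in which an array of structs plays the role of the processes in a job. Everything here is hypothetical: the `demo_` names, the fixed job size, and the fake `tcp://10.0.0.N:5000` addresses are made up for illustration. A real modex blocks until every process has contributed; this sketch returns -1 instead, to keep the example runnable in one process.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NPROCS   4
#define DEMO_ADDR_LEN 32

/* One simulated process: its own modex entry plus the table it learns. */
typedef struct {
    char my_entry[DEMO_ADDR_LEN];
    char peers[DEMO_NPROCS][DEMO_ADDR_LEN];
    int  have_entry;
} demo_proc_t;

/* Steps 1 and 2: pretend to query a local interface and register its
 * contact information as this process's modex entry. */
static void demo_register_entry(demo_proc_t *p, int rank)
{
    assert(p != NULL && rank >= 0 && rank < DEMO_NPROCS);
    snprintf(p->my_entry, sizeof p->my_entry, "tcp://10.0.0.%d:5000", rank + 1);
    p->have_entry = 1;
}

/* Step 3: an all-gather. Every process ends up with every entry.
 * Returns -1 if some process has not registered yet (a real modex
 * would block at this point instead of failing). */
static int demo_modex(demo_proc_t procs[], int n)
{
    assert(procs != NULL && n > 0 && n <= DEMO_NPROCS);
    for (int i = 0; i < n; ++i) {
        if (!procs[i].have_entry) {
            return -1;
        }
    }
    for (int dst = 0; dst < n; ++dst) {
        for (int src = 0; src < n; ++src) {
            memcpy(procs[dst].peers[src], procs[src].my_entry, DEMO_ADDR_LEN);
        }
    }
    return 0;
}
```

After all four simulated processes register and `demo_modex` succeeds, each `peers` table holds the contact string of every rank, which is the information the transports would need to wire up connections.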

    Block 3: the process info struct

    Every process has one of these structs. Among other things, it records the process's node rank, that is, its rank within its own node.

     * (c) Process info struct
     *     1. ompi_process_info_t - a struct containing info about the current process.      // information about the current process
     *        The struct must contain at least the following fields:
     *           a. app_num -
     *           b. pid - this process's pid.  Should be same as getpid().
     *           c. num_procs - Number of processes in this job (ie, MCW)      // number of processes in the job; a job may span several nodes, and each node may run several processes
     *           d. my_node_rank - relative rank on local node to other peers this run-time             // node rank
     *                    instance knows about.  If doing dynamics, this may be something
     *                    different than my_local_rank, but will be my_local_rank in a
     *                    static job.
     *           d. my_local_rank - relative rank on local node with other peers in this job (ie, MCW)      // process rank on the local node
     *           e. num_local_peers - Number of local peers (peers in MCW on your node)                 // number of processes on the local node
     *           f. my_hnp_uri -
     *           g. peer_modex - a collective id for the modex operation      // not sure what modex means here
     *           h. peer_init_barrier - a collective id for the barrier during MPI_Init
     *           i. peer_fini_barrier - a collective id for the barrier during MPI_Finalize
     *           j. job_session_dir -
     *           k. proc_session_dir -
     *           l. nodename - a string representation for the name of the node this
     *              process is on
     *           m. cpuset -
     *     2. ompi_process_info - a global instance of the ompi_process_t structure.
     *     3. ompi_rte_proc_is_bound - global boolean that will be true if the runtime bound
     *        the process to a particular core or set of cores and is false otherwise.
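As an illustration of how `my_local_rank`, `my_node_rank` and `num_local_peers` relate, the sketch below derives them from a rank-to-node mapping. The `demo_` names and the `node_of` array are hypothetical. Two assumptions are built in: the job is static, so the node rank equals the local rank as the header comment states, and `num_local_peers` counts the other processes on the node and excludes the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the fields listed for ompi_process_info_t. */
typedef struct {
    int num_procs;        /* processes in the job */
    int my_local_rank;    /* rank among this job's processes on my node */
    int my_node_rank;     /* equals my_local_rank in a static job */
    int num_local_peers;  /* other processes of this job on my node */
} demo_process_info_t;

/* node_of[r] is the node index hosting rank r (hypothetical input). */
static demo_process_info_t demo_fill_info(const int node_of[], int num_procs,
                                          int my_rank)
{
    demo_process_info_t info = { 0, 0, 0, 0 };
    assert(node_of != NULL && my_rank >= 0 && my_rank < num_procs);
    info.num_procs = num_procs;
    for (int r = 0; r < num_procs; ++r) {
        if (r == my_rank || node_of[r] != node_of[my_rank]) {
            continue;  /* skip myself and processes on other nodes */
        }
        ++info.num_local_peers;
        if (r < my_rank) {
            ++info.my_local_rank;  /* lower ranks on my node come first */
        }
    }
    info.my_node_rank = info.my_local_rank;  /* static job assumption */
    return info;
}
```

For the mapping {0, 0, 1, 1, 0}, rank 4 shares node 0 with ranks 0 and 1, so it has two local peers and local rank 2, while rank 2 has one local peer and local rank 0.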
    

      

    Block 4: init and finalize operations:

      * (e) Init and finalize objects and operations
     *     1. ompi_rte_init - a function to initialize the RTE. The function
     *        prototype must be of the form:
     *        int ompi_rte_init(int *argc, char ***argv);
     *     2. ompi_rte_finalize - a function to finalize the RTE. The function
     *        prototype must be of the form:
     *        int ompi_rte_finalize(void);
     *     3. void ompi_rte_wait_for_debugger(void) - Called during MPI_Init, this
     *        function is used to wait for debuggers to do their pre-MPI attach.
     *        If there is no attached debugger, this function will not block.
    

      

    Block 5: database operations:

      * (f) Database operations
     *     1. ompi_rte_db_store - a function to store modex and other data in      // inserts modex records into the database
     *        a local database. The function is primarily used for storing modex
     *        data, but can be used for general purposes. The prototype must be
     *        of the form:
     *        int ompi_rte_db_store(const ompi_process_name_t *proc,
     *                              const char *key, const void *data,
     *                              opal_data_type_t type);
     *        The implementation of this function must store a COPY of the data
     *        provided - the data is NOT guaranteed to be valid after return
     *        from the call.
     *     3. ompi_rte_db_fetch -
     *        NOTE: Fetch accepts an 'ompi_proc_t'.
     *        int ompi_rte_db_fetch(const struct ompi_proc_t *proc,
     *                              const char *key,
     *                              void **data,
     *                              opal_data_type_t type);
     *     4. ompi_rte_db_fetch_pointer -
     *        NOTE: Fetch accepts an 'ompi_proc_t'.
     *        int ompi_rte_db_fetch_pointer(const struct ompi_proc_t *proc,
     *                                      const char *key,
     *                                      void **data,
     *                                      opal_data_type_t type);
     *     5. Pre-defined db keys (with associated values after rte_init)
     *        a. OMPI_DB_HOSTNAME
     *        b. OMPI_DB_LOCALITY
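The key requirement in block (f) is that the store operation keeps a COPY of the caller's data. A minimal sketch of that contract follows. All `demo_` names are hypothetical, and the table is a fixed-size array rather than whatever Open MPI uses internally. The header comment does not explain how fetch and fetch_pointer differ, so the distinction shown here (fetch returns a fresh copy that the caller frees, fetch_pointer returns a pointer into database-owned storage) is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_DB_MAX  16
#define DEMO_KEY_LEN 32

typedef struct {
    unsigned jobid, vpid;        /* which process the entry describes */
    char     key[DEMO_KEY_LEN];
    void    *data;               /* heap copy owned by the database */
    size_t   size;
} demo_db_entry_t;

typedef struct {
    demo_db_entry_t entries[DEMO_DB_MAX];
    int count;
} demo_db_t;

static const demo_db_entry_t *demo_db_find(const demo_db_t *db, unsigned jobid,
                                           unsigned vpid, const char *key)
{
    assert(db != NULL && key != NULL);
    for (int i = 0; i < db->count; ++i) {
        const demo_db_entry_t *e = &db->entries[i];
        if (e->jobid == jobid && e->vpid == vpid && strcmp(e->key, key) == 0) {
            return e;
        }
    }
    return NULL;
}

/* Store a private copy, so the caller may reuse or free its buffer at once. */
static int demo_db_store(demo_db_t *db, unsigned jobid, unsigned vpid,
                         const char *key, const void *data, size_t size)
{
    assert(db != NULL && key != NULL && data != NULL && size > 0);
    if (db->count == DEMO_DB_MAX || strlen(key) >= DEMO_KEY_LEN) return -1;
    void *copy = malloc(size);
    if (copy == NULL) return -1;
    memcpy(copy, data, size);
    demo_db_entry_t *e = &db->entries[db->count++];
    e->jobid = jobid;
    e->vpid = vpid;
    strcpy(e->key, key);
    e->data = copy;
    e->size = size;
    return 0;
}

/* Assumed semantics: hand back a fresh copy that the caller must free(). */
static int demo_db_fetch(const demo_db_t *db, unsigned jobid, unsigned vpid,
                         const char *key, void **data)
{
    const demo_db_entry_t *e = demo_db_find(db, jobid, vpid, key);
    if (e == NULL) return -1;
    void *out = malloc(e->size);
    if (out == NULL) return -1;
    memcpy(out, e->data, e->size);
    *data = out;
    return 0;
}

/* Assumed semantics: expose the database-owned buffer without copying. */
static int demo_db_fetch_pointer(const demo_db_t *db, unsigned jobid,
                                 unsigned vpid, const char *key,
                                 const void **data)
{
    const demo_db_entry_t *e = demo_db_find(db, jobid, vpid, key);
    if (e == NULL) return -1;
    *data = e->data;
    return 0;
}
```

Because the store copies, overwriting the caller's buffer after `demo_db_store` returns does not change what a later fetch sees.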
    

      

    In truth, this documentation does not give us much of substance; it only points in some general directions.

    Recall that when we analyzed MPI_Init earlier, we did not step into the actual ompi_mpi_init function. Next time, we will try to go into that function.

    The simplest MPI program consists of just two calls:  MPI_Init(&argc, &argv);   MPI_Finalize();

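    Initialization must do a great deal of work, including setting up a lot of process information and exchanging information between processes. We will explore it step by step.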

  • Original article: https://www.cnblogs.com/HelloGreen/p/8758450.html