  [Erlang 0119] A Guide to Reading the Erlang OTP Source Code

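      Last week the implementation of ++ on lists came up in an Erlang discussion group. Most of the argument rested on guesswork, yet a look at the code settles everything. After I posted a screenshot of the code, someone asked: where did you find it?
      "Where do I find the code?" As for a road map to reading the Erlang source, only one tattered fragment of a manual still circulates. I think questions like "where is the code?" come from information asymmetry and are not hard in themselves, much like a scene in Slumdog Millionaire: Jamal knows all the odds and ends of street life but cannot say what is written on the national emblem. Let's start this exploration from that scene.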
     
     
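    INT. STUDIO, NIGHT
    Prem: This question is worth four thousand rupees... India's national emblem shows three lions. What is written underneath them? Is it...
                       A. Truth alone triumphs  B. Lies alone triumph  C. Fashion alone triumphs  D. Money alone triumphs
    [Prem looks at the audience with feigned puzzlement, drawing a laugh.]
    Prem: Which one do you think it is, Jamal? It is the most famous phrase in our country's history. Perhaps you would like to phone a friend?
    [The audience roars with laughter. A bead of sweat runs down Jamal's forehead. Prem enjoys Jamal's unease.]
    Prem: Or ask the audience? My instinct says they might know the answer. What do you want to do?
    Jamal: Yes.
    Prem (surprised): Yes what?
    Jamal: Ask the audience.
    [Prem whistles and looks up at the audience.]
    Prem: Well then, ladies and gentlemen, please help him out. Press your keypads now.
    [The lights dim. Tense music starts.]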

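    INT. INSPECTOR'S OFFICE, DAY
    [The inspector presses pause and sighs.]
    Inspector: Jamal, my five-year-old daughter knows the answer, and you don't. Isn't that strange for a genius millionaire? What happened? Did your accomplice step out for a pee? Or did he not cough loudly enough?
    [Silence. Constable Srinivas kicks Jamal's chair.]
    Constable Srinivas: The inspector asked you a question.
    Jamal: At Jeevan's snack stall on Chowpatty beach, how much is bhel puri?
    Inspector: What?
    Jamal: One plate of bhel puri, how much?
    Constable Srinivas (unable to resist): Ten rupees.
    Jamal: Wrong. Fifteen rupees since Diwali. Last Thursday, who stole Constable Varma's bicycle outside Dadar station?
    Inspector (amused): You know who stole it?
    Jamal: Everyone in Juhu knows. Even five-year-olds know.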
     
     Back to the topic: let's start with downloading the code...
     

    Downloading the source

     
     
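      A special note for those who chose the Windows installer: the lib directory contains both the source and the ebin of each library, such as kernel and stdlib, but the ERTS directory does not include the corresponding source. Download a copy yourself, or browse it online at https://github.com/erlang/otp/tree/maint/erts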
     
     

    Tools for reading the source

     
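          Erlang OTP is a sizeable code base, and good tools save a lot of trouble: folder-wide or project-wide search and jumping between definitions remove much of the friction. On Windows, a tool like Everything is excellent for locating files, and Visual Studio makes reading C code a genuinely pleasant experience. Of course, if you prefer a plain text editor and regular expressions, that works too.
          [Screenshot: the code open in Visual Studio]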
     
     

    Overview

     
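        Broadly, otp_src is organized the way you see it when you open the folder (this hardly counts as a Thirty Thousand Feet view). The parts most relevant to the code we write every day are ERTS and lib. ERTS (Erlang Run-Time System) contains the code of the runtime system and is Erlang's infrastructure. lib contains all the surrounding library implementations. Some placements are counterintuitive, though you get used to them: file.erl lives in kernel rather than stdlib; you might expect gen_server and gen_fsm to be implemented in kernel, but no, their code is under stdlib; application.erl, however, is in kernel.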
     
    Kernel
       
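       Look at the kernel directory: a bit bewildering? A running Erlang system always has a kernel application, and by starting appmon we can watch, live, which modules kernel involves. From that we can roughly infer the designers' organizing principle: kernel covers application management, code life-cycle management, IO (file IO, network IO, io_request), HIPE, the distribution infrastructure and so on.
     
      This grouping is only my personal view. For easy reference, my mind map of it is reproduced as text below: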
     
    Kernel
    
    Kernel APP
    	kernel.erl
    	kernel_config.erl
    	kernel.appup.src
    	kernel.app.src
    
    Application management
    	application_controller.erl
    	application_starter.erl
    	application_master.hrl
    	application_master.erl
    	application.erl
    	heart.erl
    
    HIPE
    	hipe_ext_format.hrl
    	hipe_unified_loader.erl
    
    Debugging & logging
    	Logging
    		disk_log.erl
    		disk_log_sup.erl
    		disk_log_server.erl
    		disk_log_1.erl
    		disk_log.hrl
    		error_logger.erl
    		wrap_log_reader.erl
    	Debugging
    		error_handler.erl
    		erts_debug.erl
    		standard_error.erl
    		seq_trace.erl
    IO
    	File IO
    		file.erl
    		file_server.erl
    		file_io_server.erl
    		ram_file.erl
    	Network IO
    		gen_sctp.erl
    		gen_udp.erl
    		gen_tcp.erl
    		inet.erl
    		inet_config.hrl
    		inet_config.erl
    		inet_boot.hrl
    		inet6_udp.erl
    		inet6_tcp_dist.erl
    		inet6_tcp.erl
    		inet6_sctp.erl
    		inet_db.erl
    		inet_dns.hrl
    		inet_dns.erl
    		inet_gethost_native.erl
    		inet_udp.erl
    		inet_tcp_dist.erl
    		inet_tcp.erl
    		inet_sctp.erl
    		inet_res.hrl
    		inet_res.erl
    		inet_parse.erl
    		inet_int.hrl
    		inet_hosts.erl
    		inet_dns_record_adts.pl
    		erl_reply.erl
    		net_kernel.erl
    		net_adm.erl
    		net.erl
    	IO Request
    		user_drv.erl
    		user.erl
    		user_sup.erl
    		group.erl
    
    Code life-cycle management
    	code.erl
    	code_server.erl
    	erl_boot_server.erl
    	erl_ddll.erl
    	
    Distribution management
    	dist_util.erl
    	dist_ac.erl (Distributed Applications Controller)
    	erl_distribution.erl
    	erl_epmd.erl
    	rpc.erl
    	pg2.erl
    	global_search.erl
    	global_group.erl
    	global.erl
    	auth.erl
    OS
    	os.erl
    

      

    stdlib 
     
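       Compared with kernel, stdlib lives up to its name and contains the vast majority of the functional modules, such as lists, ets and the various data structure implementations. Most importantly, it contains OTP's gen_server, gen_fsm, gen_event and supervisor, along with the unsung heroes proc_lib and sys. If you don't mind, here is a slightly dated document, the notes I made on the documentation when I was first learning Erlang: [Erlang STDLIB, annotated Chinese edition]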
     
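     Worth a special mention are shell and shell_default: anyone curious about the Erlang shell will find answers here, and the so-called "spooky problems in EShell" get a reasonable explanation too.
     The other modules have such clear-cut purposes that they are easy to locate, for example xmerl for XML processing and the mnesia database. With a little help from Google there is hardly any obstacle.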
     
     

    Dive into ERTS

     
    Atom and bifs
     
     
    atom.names  enumerates the atoms used by ERTS; it is well worth studying for the idioms
    bif.tab           the list of BIFs. Note: Use "ubif" for guard BIFs and operators; use "bif" for ordinary BIFs.
     
     
    Basic Type
     

    /*
    ** Data types:
    **
    ** Eterm: A tagged erlang term (possibly 64 bits)
    ** BeamInstr: A beam code instruction unit, possibly larger than Eterm, not smaller.
    ** UInt:  An unsigned integer exactly as large as an Eterm.
    ** SInt:  A signed integer exactly as large as an eterm and therefor large
    **        enough to hold the return value of the signed_val() macro.
    ** UWord: An unsigned integer at least as large as a void * and also as large
    **          or larger than an Eterm
    ** SWord: A signed integer at least as large as a void * and also as large
    **          or larger than an Eterm
    ** Uint32: An unsigned integer of 32 bits exactly
    ** Sint32: A signed integer of 32 bits exactly
    ** Uint16: An unsigned integer of 16 bits exactly
    ** Sint16: A signed integer of 16 bits exactly.
    */
    

     

     
     
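    Here we can also see the internal representation of some more complex data structures.
    [Screenshot: internal representation of a data structure]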
     
     
    Two examples
     
    Let's look at two examples. The first: how is append on lists implemented? lists.erl is easy to find:
     
     
    append(L1, L2) -> L1 ++ L2.
     
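    We find that append simply uses ++. So where is ++ implemented?
    In the directory https://github.com/erlang/otp/tree/maint/erts/emulator/beam you can see a series of erl_bif_*.c files, which hold the BIF implementations for the corresponding modules. Open
    https://github.com/erlang/otp/blob/maint/erts/emulator/beam/erl_bif_lists.c and the code we want turns up quickly, doesn't it? Yes, it is the code from the screenshot I mentioned at the start, so I won't repeat it here.
    One interesting spot is these two statements: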
     
         copy = last = CONS(hp, CAR(list_val(list)), make_list(hp + 2));
         list = CDR(list_val(list));
    
     
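    Some readers will say that CAR, CDR and CONS look awfully familiar. Right: these are the three basic primitives of Lisp list manipulation. They take the head of a list, take the rest of the list after the head, and construct a list cell (cons), respectively. Jump to their definitions in erl_term.h: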

    #define CONS(hp, car, cdr) \
            (CAR(hp)=(car), CDR(hp)=(cdr), make_list(hp))
    
    #define CAR(x)  ((x)[0])
    #define CDR(x)  ((x)[1])
    

      

     
    The second example: what does the definition of a process look like?
     
    First, find the definition of Process in erl_process.h
     
    typedef struct process Process;
     
    Then go to the definition of struct process:
    struct process {
        ErtsPTabElementCommon common; /* *Need* to be first in struct */
    
        /* All fields in the PCB that differs between different heap
         * architectures, have been moved to the end of this struct to
         * make sure that as few offsets as possible differ. Different
         * offsets between memory architectures in this struct, means that
         * native code have to use functions instead of constants.
         */
    
        Eterm* htop;		/* Heap top */
        Eterm* stop;		/* Stack top */
        Eterm* heap;		/* Heap start */
        Eterm* hend;		/* Heap end */
        Uint heap_sz;		/* Size of heap in words */
        Uint min_heap_size;         /* Minimum size of heap (in words). */
        Uint min_vheap_size;        /* Minimum size of virtual heap (in words). */
    
    #if !defined(NO_FPE_SIGNALS) || defined(HIPE)
        volatile unsigned long fp_exception;
    #endif
    
    #ifdef HIPE
        /* HiPE-specific process fields. Put it early in struct process,
           to enable smaller & faster addressing modes on the x86. */
        struct hipe_process_state hipe;
    #endif
    
        /*
         * Saved x registers.
         */
        Uint arity;			/* Number of live argument registers (only valid
    				 * when process is *not* running).
    				 */
        Eterm* arg_reg;		/* Pointer to argument registers. */
        unsigned max_arg_reg;	/* Maximum number of argument registers available. */
        Eterm def_arg_reg[6];	/* Default array for argument registers. */
    
        BeamInstr* cp;		/* (untagged) Continuation pointer (for threaded code). */
        BeamInstr* i;		/* Program counter for threaded code. */
        Sint catches;		/* Number of catches on stack */
        Sint fcalls;		/* 
    				 * Number of reductions left to execute.
    				 * Only valid for the current process.
    				 */
        Uint32 rcount;		/* suspend count */
        int  schedule_count;	/* Times left to reschedule a low prio process */
        Uint reds;			/* No of reductions for this process  */
        Eterm group_leader;		/* Pid in charge
    				   (can be boxed) */
        Uint flags;			/* Trap exit, etc (no trace flags anymore) */
        Eterm fvalue;		/* Exit & Throw value (failure reason) */
        Uint freason;		/* Reason for detected failure */
        Eterm ftrace;		/* Latest exception stack trace dump */
    
        Process *next;		/* Pointer to next process in run queue */
    
        struct ErtsNodesMonitor_ *nodes_monitors;
    
        ErtsSuspendMonitor *suspend_monitors; /* Processes suspended by
    					     this process via
    					     erlang:suspend_process/1 */
    
        ErlMessageQueue msg;	/* Message queue */
    
        union {
    	ErtsBifTimer *bif_timers;	/* Bif timers aiming at this process */
    	void *terminate;
        } u;
    
        ProcDict  *dictionary;       /* Process dictionary, may be NULL */
    
        Uint seq_trace_clock;
        Uint seq_trace_lastcnt;
        Eterm seq_trace_token;	/* Sequential trace token (tuple size 5 see below) */
    
    #ifdef USE_VM_PROBES
        Eterm dt_utag;              /* Place to store the dynamc trace user tag */
        Uint dt_utag_flags;         /* flag field for the dt_utag */
    #endif       
        BeamInstr initial[3];	/* Initial module(0), function(1), arity(2), often used instead
    				   of pointer to funcinfo instruction, hence the BeamInstr datatype */
        BeamInstr* current;		/* Current Erlang function, part of the funcinfo:
    				 * module(0), function(1), arity(2)
    				 * (module and functions are tagged atoms;
    				 * arity an untagged integer). BeamInstr * because it references code
    				 */
        
        /*
         * Information mainly for post-mortem use (erl crash dump).
         */
        Eterm parent;		/* Pid of process that created this process. */
        erts_approx_time_t approx_started; /* Time when started. */
    
        /* This is the place, where all fields that differs between memory
         * architectures, have gone to.
         */
    
        Eterm *high_water;
        Eterm *old_hend;            /* Heap pointers for generational GC. */
        Eterm *old_htop;
        Eterm *old_heap;
        Uint16 gen_gcs;		/* Number of (minor) generational GCs. */
        Uint16 max_gen_gcs;		/* Max minor gen GCs before fullsweep. */
        ErlOffHeap off_heap;	/* Off-heap data updated by copy_struct(). */
        ErlHeapFragment* mbuf;	/* Pointer to message buffer list */
        Uint mbuf_sz;		/* Size of all message buffers */
        ErtsPSD *psd;		/* Rarely used process specific data */
    
        Uint64 bin_vheap_sz;	/* Virtual heap block size for binaries */
        Uint64 bin_vheap_mature;	/* Virtual heap block size for binaries */
        Uint64 bin_old_vheap_sz;	/* Virtual old heap block size for binaries */
        Uint64 bin_old_vheap;	/* Virtual old heap size for binaries */
    
        ErtsProcSysTaskQs *sys_task_qs;
    
        erts_smp_atomic32_t state;  /* Process state flags (see ERTS_PSFLG_*) */
    
    #ifdef ERTS_SMP
        ErlMessageInQueue msg_inq;
        ErtsPendExit pending_exit;
        erts_proc_lock_t lock;
        ErtsSchedulerData *scheduler_data;
        Eterm suspendee;
        ErtsPendingSuspend *pending_suspenders;
        erts_smp_atomic_t run_queue;
    #ifdef HIPE
        struct hipe_process_state_smp hipe_smp;
    #endif
    #endif
    
    #ifdef CHECK_FOR_HOLES
        Eterm* last_htop;		/* No need to scan the heap below this point. */
        ErlHeapFragment* last_mbuf;	/* No need to scan beyond this mbuf. */
    #endif
    
    #ifdef DEBUG
        Eterm* last_old_htop;	/*
    				 * No need to scan the old heap below this point
    				 * when looking for invalid pointers into the new heap or
    				 * heap fragments.
    				 */
    #endif
    
    #ifdef FORCE_HEAP_FRAGS
        Uint space_verified;        /* Avoid HAlloc forcing heap fragments when */ 
        Eterm* space_verified_from; /* we rely on available heap space (TestHeap) */
    #endif
    };
    

      

     
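    Zhuangzi said: "My life has a limit, but knowledge has none. To chase the limitless with the limited is perilous!" So take what you need. That's all for today; cherish the journey.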
     
     
  • Original post: https://www.cnblogs.com/me-sa/p/erlang_source_code_guide.html