    ECMA-262-3 in detail. Chapter 5. Functions.

    Reposted from: http://dmitrysoshnikov.com/ecmascript/chapter-5-functions/

    Introduction

    In this article we will talk about one of the general ECMAScript objects: functions. In particular, we will go through the various types of functions, define how each type influences the variable object of a context, and see what is contained in the scope chain of each function. We will answer frequently asked questions such as: “is there any difference (and if so, what is it?) between functions created as follows:

    var foo = function () {
      ...
    };

    and functions defined in the “habitual” way?”:

    function foo() {
      ...
    }

    Or, “why does the function in the following call have to be surrounded with parentheses?”:

    (function () {
      ...
    })();

    Since these articles rely on earlier chapters, it is desirable to read Chapter 2. Variable object and Chapter 4. Scope chain first, since we will actively use terminology from those chapters.

    But let us take things one at a time. We begin with the types of functions.

    Types of functions

    In ECMAScript there are three function types and each of them has its own features.

    Function Declaration

    Function Declaration (abbreviated form is FD) is a function which:

    • has an obligatory name;
    • in the source code is positioned either at the Program level or directly in the body of another function (FunctionBody);
    • is created on entering the context stage;
    • influences variable object;
    • and is declared in the following way:
    function exampleFunc() {
      ...
    }

    The main feature of this type of function is that only these functions influence the variable object (they are stored in the VO of the context). This feature leads to the second important point (a consequence of the nature of the variable object): at the code execution stage they are already available, since FDs are stored in the VO on entering the context, before execution begins.

    Example (the function is called before its declaration in the source code):

    foo();
     
    function foo() {
      alert('foo');
    }

    What is also important is the position at which the function is defined in the source code (see the second bullet in the Function Declaration definition above):

    // function can be declared:
    // 1) directly in the global context
    function globalFD() {
      // 2) or inside the body
      // of another function
      function innerFD() {}
    }

    These are the only two positions in code where a function may be declared (i.e. it is impossible to declare it in an expression position or inside a code block).

    The alternative to function declarations is function expressions, which we cover next.

    Function Expression

    Function Expression (abbreviated form is FE) is a function which:

    • in the source code can only be defined at the expression position;
    • can have an optional name;
    • its definition has no effect on the variable object;
    • and is created at the code execution stage.

    The main feature of this type of function is that in the source code it is always in the expression position. Here is a simple example, an assignment expression:

    var foo = function () {
      ...
    };

    This example shows an anonymous FE being assigned to the variable foo. After that the function is available via the name foo: foo().

    The definition states that this type of functions can have an optional name:

    var foo = function _foo() {
      ...
    };

    What is important to note here is that from the outside the FE is accessible via the variable foo, as foo(), while from inside the function (for example, in a recursive call) it is also possible to use the name _foo.
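
    A minimal sketch of why the inner name is useful, under the assumption of a standards-conforming engine. The names factorial, _factorial and alias are illustrative only and do not come from the original text. The recursion goes through the FE's own name, so it keeps working even after the outer variable is rebound:

```javascript
// "factorial" is the outer variable; "_factorial" is the FE's own optional name.
var factorial = function _factorial(n) {
  // The recursive call goes through the inner name, not the outer variable.
  return n <= 1 ? 1 : n * _factorial(n - 1);
};

// Rebinding the outer variable does not break the recursion.
var alias = factorial;
factorial = null;

var result = alias(5);
var innerNameOutside = typeof _factorial;

console.log(result);           // 120
console.log(innerNameOutside); // "undefined"
```

    Had the body called factorial(n - 1) instead, the call through alias would have failed once factorial was set to null.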

    When a FE is assigned a name it can be difficult to distinguish it from a FD. However, if you know the definition, it is easy to tell them apart: FE is always in the expression position. In the following example we can see various ECMAScript expressions in which all the functions are FE:

    // in parentheses (grouping operator) can be only an expression
    (function foo() {});
     
    // in the array initialiser – also only expressions
    [function bar() {}];
     
    // comma also operates with expressions
    1, function baz() {};

    The definition also states that FE is created at the code execution stage and is not stored in the variable object. Let’s see an example of this behavior:

    // FE is available neither before the definition
    // (because it is created at code execution phase),
     
    alert(foo); // "foo" is not defined
     
    (function foo() {});
     
    // nor after, because it is not in the VO
     
    alert(foo);  // "foo" is not defined
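
    The same check can be sketched without alert, so that it runs to completion instead of stopping at the first error. The helper probe and the names namedFE and declaredFD are hypothetical, introduced only for this illustration; a conforming engine is assumed:

```javascript
// Hypothetical helper: runs a lookup and reports how it ended.
function probe(lookup) {
  try {
    lookup();
    return 'resolved';
  } catch (e) {
    return e.name;
  }
}

// A named FE in the grouping operator: its name never reaches the enclosing scope.
var feBefore = probe(function () { return namedFE; });
(function namedFE() {});
var feAfter = probe(function () { return namedFE; });

// For contrast, an FD is resolvable even before its position in the source.
var fdBefore = probe(function () { return declaredFD; });
function declaredFD() {}

console.log(feBefore, feAfter, fdBefore);
// "ReferenceError" "ReferenceError" "resolved"
```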

    The logical question now is: why do we need this type of function at all? The answer is obvious: to use them in expressions and “not pollute” the variable object. This can be demonstrated by passing a function as an argument to another function:

    function foo(callback) {
      callback();
    }
     
    foo(function bar() {
      alert('foo.bar');
    });
     
    foo(function baz() {
      alert('foo.baz');
    });

    When an FE is assigned to a variable, the function remains stored in memory and can later be accessed via the variable name (because variables, as we know, influence the VO):

    var foo = function () {
      alert('foo');
    };
     
    foo();

    Another example is the creation of an encapsulated scope to hide auxiliary helper data from the external context (in the following example we use an FE which is called right after creation):

    var foo = {};
     
    (function initialize() {
     
      var x = 10;
     
      foo.bar = function () {
        alert(x);
      };
     
    })();
     
    foo.bar(); // 10
     
    alert(x); // "x" is not defined

    We see that function foo.bar (via its [[Scope]] property) has access to the internal variable x of function initialize. And at the same time x is not accessible directly from the outside. This strategy is used in many libraries to create “private” data and hide auxiliary entities. Often in this pattern the name of initializing FE is omitted:

    (function () {
     
      // initializing scope
     
    })();
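
    As a slightly larger sketch of the same pattern, the hypothetical counter object below (not part of the original text) keeps its state in the initializing scope and exposes only two functions that close over it:

```javascript
var counter = {};

(function () {
  // "count" lives only in this function's scope.
  var count = 0;

  counter.increment = function () {
    count += 1;
    return count;
  };

  counter.current = function () {
    return count;
  };
})();

counter.increment();
counter.increment();

console.log(counter.current()); // 2
console.log(typeof count);      // "undefined"
```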

    Here is another example of FEs, which are created conditionally at runtime and do not pollute the VO:

    var foo = 10;
     
    var bar = (foo % 2 == 0
      ? function () { alert(0); }
      : function () { alert(1); }
    );
     
    bar(); // 0

    Question “about surrounding parentheses”

    Let’s go back and answer the question from the beginning of the article: “why is it necessary to surround a function with parentheses if we want to call it right from its definition?” The answer lies in the restrictions of the expression statement.

    According to the standard, an expression statement (ExpressionStatement) cannot begin with an opening curly brace, {, since it would be indistinguishable from a block, and it cannot begin with the function keyword either, since it would then be indistinguishable from a function declaration. That is, if we try to define an immediately invoked function in the following way (starting with the function keyword):

    function () {
      ...
    }();
     
    // or even with a name
     
    function foo() {
      ...
    }();

    we deal with function declarations, and in both cases a parser will produce a parse error. However, the reasons for these parse errors differ.

    If we put such a definition in global code (i.e. at the Program level), the parser should treat the function as a declaration, since it starts with the function keyword. In the first case we get a SyntaxError because the function’s name is absent (a function declaration, as we said, must always have a name).

    In the second case we do have a name (foo), and the function declaration should be created normally. But it is not, since there is another syntax error: a grouping operator without an expression inside it. Notice that in this case it is exactly a grouping operator that follows the function declaration, not the parentheses of a function call! So if we had the following source:

    // "foo" is a function declaration
    // and is created on entering the context
     
    alert(foo); // function
     
    function foo(x) {
      alert(x);
    }(1); // and this is just a grouping operator, not a call!
     
    foo(10); // and this is already a call, 10

    everything is fine, since here we have two syntactic productions: a function declaration and a grouping operator with an expression (1) inside it. The example above is the same as:

    // function declaration
    function foo(x) {
      alert(x);
    }
     
    // a grouping operator
    // with the expression
    (1);
     
    // another grouping operator with
    // another (function) expression
    (function () {});
     
    // also - the expression inside
    ("foo");
     
    // etc
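
    A runnable sketch of the same point, with the hypothetical names record and received standing in for foo and alert. If the trailing (1) were a call, the array would contain two entries; since it is only a grouping operator, it contains one:

```javascript
var received = [];

function record(x) {
  received.push(x);
}(1); // a separate expression statement: the grouping operator around 1

record(10); // the only actual call

console.log(received); // [ 10 ]
```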

    If we had such a definition inside a statement then, as we said, the ambiguity would give us a syntax error:

    if (true) function foo() {alert(1)}

    By the specification the construction above is syntactically incorrect (an expression statement cannot begin with the function keyword), but, as we will see below, no implementation raises the syntax error; each handles this case in its own manner.

    Given all this, how do we tell the parser that what we really want is to call a function immediately after its creation? The answer is obvious: it should be a function expression, not a function declaration. The simplest way to create an expression is to use the grouping operator mentioned above, since there is always an expression inside it. The parser then recognizes the code as a function expression (FE) and there is no ambiguity. Such a function is created during the execution stage, then executed, and then removed (if there are no references to it).

    (function foo(x) {
      alert(x);
    })(1); // OK, it's a call, not a grouping operator, 1

    In the example above the parentheses at the end (the Arguments production) are a call of the function, and not a grouping operator as in the case of an FD.

    Notice, in the following example of the immediate invocation of a function, the surrounding parentheses are not required, since the function is already in the expression position and the parser knows that it deals with a FE which should be created at code execution stage:

    var foo = {
     
      bar: function (x) {
        return x % 2 != 0 ? 'yes' : 'no';
      }(1)
     
    };
     
    alert(foo.bar); // 'yes'

    As we see, foo.bar is a string and not a function, as it may seem at first glance. The function here is used only to initialize the property, depending on the conditional parameter: it is created and called right away.

    Therefore, the complete answer to the question “about parentheses” is the following:

    Grouping parentheses are needed when a function is not in the expression position and we want to call it immediately after its creation; in this case we manually transform the function into an FE.

    When the parser knows that it deals with an FE, i.e. the function is already in the expression position, the parentheses are not required.

    Apart from surrounding parentheses, any other way of transforming a function into an FE can be used. For example:

    1, function () {
      alert('anonymous function is called');
    }();
     
    // or this one
    !function () {
      alert('ECMAScript');
    }();
     
    // and any other manual
    // transformation
     
    ...

    However, grouping parentheses are simply the most widespread and elegant way to do it.

    By the way, the grouping operator can surround either the function description alone, without the call parentheses, or the function description together with the call parentheses. That is, both expressions below are correct FEs:

    (function () {})();
    (function () {}());
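
    The forms discussed above can be compared side by side in a short sketch (the names order, viaParens and viaNot are illustrative only). All five forms invoke the function, but a prefix operator also transforms the value of the whole expression, which is one practical reason to prefer the grouping operator when the result matters:

```javascript
var order = [];

(function () { order.push('parens outside'); })();
(function () { order.push('parens inside'); }());
0, function () { order.push('comma'); }();
!function () { order.push('logical not'); }();
void function () { order.push('void'); }();

// Only the grouping operator leaves the call's result untouched.
var viaParens = (function () { return 7; })();
var viaNot = !function () { return 7; }();

console.log(order.length); // 5
console.log(viaParens);    // 7
console.log(viaNot);       // false
```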

    Implementations extension: Function Statement

    The following example shows code which no implementation processes according to the specification:

    if (true) {
     
      function foo() {
        alert(0);
      }
     
    } else {
     
      function foo() {
        alert(1);
      }
     
    }
     
    foo(); // 1 or 0 ? test in different implementations

    It must be said here that, according to the standard, this syntactic construction is in general incorrect because, as we remember, a function declaration (FD) cannot appear inside a code block (here if and else contain code blocks). As has been said, an FD can appear in only two places: at the Program level or directly inside the body of another function.

    The above example is incorrect because the code block can contain only statements. And the only place in which function can appear within a block is one of such statements — the expression statement. But by definition it cannot begin with an opening curly brace (since it is indistinguishable from the code block) or a function keyword (since it is indistinguishable from FD).

    However, in its section on error handling the standard allows implementations to extend the program syntax. One such extension can be seen in the case of functions which appear in blocks. No implementation existing today throws an exception in this case; each processes it, but in its own way.

    The presence of if-else branches assumes a choice is being made as to which of the two functions will be defined. Since this decision is made at runtime, a function expression (FE) should be used. However, the majority of implementations simply create both function declarations (FD) on entering the context, and since both functions use the same name, only the last declared function gets called. In this example the function foo shows 1, although the else branch never executes.

    However, the SpiderMonkey implementation treats this case in two ways: on the one hand it does not consider such functions to be declarations (i.e. the function is created conditionally at the code execution stage), but on the other hand they are not real function expressions, since they cannot be called without surrounding parentheses (again the parse error: “indistinguishable from FD”) and they are stored in the variable object.

    My opinion is that SpiderMonkey handles this case correctly, singling out its own middle type of function (FE + FD). Such functions are correctly created at the right time and according to conditions, but unlike FE, and more like FD, they can be called from the outside. SpiderMonkey calls this syntactic extension a Function Statement (abbreviated FS); this terminology is mentioned in MDC. JavaScript inventor Brendan Eich has also noted this type of function provided by the SpiderMonkey implementation.
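
    Since the behavior of declarations inside blocks depends on the implementation, a portable way to express the same intent is the one the text recommends: choose between function expressions at runtime. A minimal sketch, with the hypothetical helper pick introduced only for illustration:

```javascript
// Hypothetical helper: returns one of two FEs depending on a runtime condition.
function pick(condition) {
  var chosen;

  if (condition) {
    chosen = function () { return 0; };
  } else {
    chosen = function () { return 1; };
  }

  return chosen;
}

console.log(pick(true)());  // 0
console.log(pick(false)()); // 1
```

    Here only the FE on the executed branch is ever created, so the result does not depend on how an engine treats declarations in blocks.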

    Feature of Named Function Expression (NFE)

    When an FE has a name (named function expression, abbreviated NFE), one important feature arises. As we know from the definition (and as we saw in the examples above), function expressions do not influence the variable object of a context (which means that it is impossible to call them by name before or after their definition). However, an FE can call itself by name in a recursive call:

    (function foo(bar) {
     
      if (bar) {
        return;
      }
     
      foo(true); // "foo" name is available
     
    })();
     
    // but from the outside, correctly, is not
     
    foo(); // "foo" is not defined

    Where is the name “foo” stored? In the activation object of foo? No, since nobody has defined any “foo” name inside the foo function. In the parent variable object of the context which creates foo? Also no; remember the definition: an FE does not influence the VO, which is exactly what we see when calling foo from the outside. Where then?

    Here is how it works: when the interpreter meets a named FE at the code execution stage, before creating the FE it creates an auxiliary special object and adds it to the front of the current scope chain. Then it creates the FE itself, at which stage the function gets its [[Scope]] property (as we know from Chapter 4. Scope chain): the scope chain of the context which created the function (i.e. [[Scope]] contains that special object). After that, the name of the FE is added to the special object as its only property; the value of this property is a reference to the FE. The last action is removing the special object from the parent scope chain. Let’s see this algorithm in pseudo-code:

    specialObject = {};
     
    Scope = specialObject + Scope;
     
    foo = new FunctionExpression;
    foo.[[Scope]] = Scope;
    specialObject.foo = foo; // {DontDelete}, {ReadOnly}
     
    delete Scope[0]; // remove specialObject from the front of scope chain

    Thus, from the outside this function name is not available (since it is not present in the parent scope), but the special object has been saved in the [[Scope]] of the function, and there the name is available.
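
    The algorithm can be imitated with ordinary objects. The following is only a toy model, not how any engine is implemented: a scope chain is represented as an array of plain objects, and resolve, createNamedFE and savedScope are hypothetical names standing in for identifier resolution, FE creation and [[Scope]]:

```javascript
// Toy model: a scope chain is an array of plain objects, searched front to back.
function resolve(scopeChain, name) {
  for (var i = 0; i < scopeChain.length; i++) {
    if (Object.prototype.hasOwnProperty.call(scopeChain[i], name)) {
      return scopeChain[i][name];
    }
  }
  throw new ReferenceError(name + ' is not defined');
}

// Hypothetical stand-in for creating a named FE in a context with "parentScope".
function createNamedFE(parentScope, name) {
  var specialObject = {};
  var scope = [specialObject].concat(parentScope); // specialObject + Scope

  var fe = { savedScope: scope }; // "savedScope" plays the role of [[Scope]]

  // {DontDelete}, {ReadOnly} approximated with a non-configurable, non-writable property.
  Object.defineProperty(specialObject, name, {
    value: fe,
    writable: false,
    configurable: false,
    enumerable: true
  });

  // parentScope was copied, not modified, so nothing has to be removed from it.
  return fe;
}

var globalVO = {};
var parentScope = [globalVO];
var foo = createNamedFE(parentScope, 'foo');

var insideLookup = resolve(foo.savedScope, 'foo') === foo;

var outsideLookup;
try {
  resolve(parentScope, 'foo');
  outsideLookup = 'resolved';
} catch (e) {
  outsideLookup = e.name;
}

console.log(insideLookup, outsideLookup); // true "ReferenceError"
```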

    It is necessary to note, however, that some implementations, for example Rhino, save this optional name not in the special object but in the activation object of the FE. The implementation from Microsoft, JScript, completely breaks the FE rules: it keeps this name in the parent variable object, and the function becomes available outside.

    NFE and SpiderMonkey

    Let’s have a look at how different implementations handle this problem. Some versions of SpiderMonkey have one feature related to the special object which can be treated as a bug (although everything was implemented according to the standard, so it is more of an editorial defect of the specification). It is related to the mechanism of identifier resolution: the scope chain analysis is two-dimensional, and when resolving an identifier it considers the prototype chain of every object in the scope chain as well.

    We can see this mechanism in action if we define a property in Object.prototype and use a “nonexistent” variable from the code. In the following example when resolving the name x the global object is reached without finding x. However since in SpiderMonkey the global object inherits from Object.prototype the name x is resolved there:

    Object.prototype.x = 10;
     
    (function () {
      alert(x); // 10
    })();

    Activation objects do not have prototypes. Under the same starting conditions, this can be seen in an example with an inner function. If we define a local variable x, declare an inner function (FD or anonymous FE) and then reference x from the inner function, the variable is resolved normally in the parent function context (i.e. where it should be and is), instead of in Object.prototype:

    Object.prototype.x = 10;
     
    function foo() {
     
      var x = 20;
     
      // function declaration 
     
      function bar() {
        alert(x);
      }
     
      bar(); // 20, from AO(foo)
     
      // the same with anonymous FE
     
      (function () {
        alert(x); // 20, also from AO(foo)
      })();
     
    }
     
    foo();
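
    Both lookups can be combined into one sketch that cleans up after itself. It assumes a host whose global object inherits from Object.prototype (as the text describes for SpiderMonkey; this is also the case in common browsers and Node.js as far as I know). The property name inheritedDemo is made up for the illustration:

```javascript
// Assumption: the host's global object inherits from Object.prototype.
Object.prototype.inheritedDemo = 10;

// No variable of this name exists, so resolution ends at the global object.
var fromGlobalPrototype = (function () {
  return inheritedDemo;
})();

// A real local variable in an enclosing function is found first.
var fromEnclosingFunction = (function () {
  var inheritedDemo = 20;
  return (function () {
    return inheritedDemo;
  })();
})();

// Clean up so the demo property does not leak into other code.
delete Object.prototype.inheritedDemo;

console.log(fromGlobalPrototype);   // 10 where the assumption holds
console.log(fromEnclosingFunction); // 20
```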

    Some implementations do set a prototype for activation objects, which is an exception compared to most other implementations. So, in the Blackberry implementation the value x from the above example is resolved to 10. That is, the activation object of foo is not reached, since the value is found in Object.prototype:

    AO(bar FD or anonymous FE) -> no ->
    AO(bar FD or anonymous FE).[[Prototype]] -> yes - 10

    And we can see exactly the same situation in SpiderMonkey in the case of the special object of a named FE. This special object (by the standard) is a normal object, “as if by expression new Object()”, and accordingly it should inherit from Object.prototype, which is exactly what can be seen in the SpiderMonkey implementation (but only up to version 1.7). Other implementations (including newer versions of SpiderMonkey) do not set a prototype for that special object:

    function foo() {
     
      var x = 10;
     
      (function bar() {
     
        alert(x); // 20, not 10, since AO(foo) is never reached
     
        // "x" is resolved by the chain:
        // AO(bar) - no -> __specialObject(bar) -> no
        // __specialObject(bar).[[Prototype]] - yes: 20
     
      })();
    }
     
    Object.prototype.x = 20;
     
    foo();

    NFE and JScript

    The ECMAScript implementation from Microsoft, JScript, which is currently built into Internet Explorer (up to JScript 5.8 in IE8), has a number of bugs related to named function expressions (NFE). Each of these bugs completely contradicts the ECMA-262-3 standard; some of them may cause serious errors.

    First, JScript in this case breaks the main rule of FEs: that they should not be stored in the variable object under the function name. The optional FE name, which should be stored in the special object and be accessible only inside the function itself (and nowhere else), is here stored directly in the parent variable object. Moreover, a named FE is treated in JScript as a function declaration (FD), i.e. it is created on entering the context and is available before its definition in the source code:

    // FE is available in the variable object
    // via optional name before the
    // definition like a FD
    testNFE();
     
    (function testNFE() {
      alert('testNFE');
    });
     
    // and also after the definition
    // like FD; optional name is
    // in the variable object
    testNFE();

    As we see, a complete violation of the rules.

    Secondly, when a named FE is assigned to a variable at declaration, JScript creates two different function objects. It is difficult to call such behavior logical (especially considering that outside of the NFE its name should not be accessible at all):

    var foo = function bar() {
      alert('foo');
    };
     
    alert(typeof bar); // "function", the NFE is again in the VO, which is already a mistake
     
    // but what follows is more interesting
    alert(foo === bar); // false!
     
    foo.x = 10;
    alert(bar.x); // undefined
     
    // but both functions perform
    // the same action
     
    foo(); // "foo"
    bar(); // "foo"

    Again we see complete disorder.

    However, note that if the NFE is described separately from the assignment to a variable (for example via the grouping operator), and only afterwards assigned to a variable, then the equality check returns true, as if it were one object:

    (function bar() {});
     
    var foo = bar;
     
    alert(foo === bar); // true
     
    foo.x = 10;
    alert(bar.x); // 10

    This can be explained. Actually, two objects are again created, but afterwards only one really remains. If we again consider that the NFE here is treated as a function declaration (FD), then the FD bar is created on entering the context. After that, at the code execution stage, the second object, the function expression (FE) bar, is created and is not saved anywhere. Accordingly, as there is no reference to the FE bar, it is removed. Thus there is only one object, the FD bar, a reference to which is assigned to the variable foo.

    Thirdly, regarding the indirect reference to a function via arguments.callee: it references the object under whose name the function is activated (to be exact, functions, since there are two objects):

    var foo = function bar() {
     
      alert([
        arguments.callee === foo,
        arguments.callee === bar
      ]);
     
    };
     
    foo(); // [true, false]
    bar(); // [false, true]

    Fourthly, as JScript treats an NFE as a usual FD, it is not subject to the rules of conditional statements, i.e. just like an FD, an NFE is created on entering the context and the last definition in the code is used:

    var foo = function bar() {
      alert(1);
    };
     
    if (false) {
     
      foo = function bar() {
        alert(2);
      };
     
    }
    bar(); // 2
    foo(); // 1

    This behavior can also be “logically” explained. On entering the context the last FD met with the name bar is created, i.e. the function with alert(2). After that, at the code execution stage, a new function, the FE bar, is created, and a reference to it is assigned to the variable foo. Thus (as the if-block with the false condition further in the code is unreachable), the activation of foo produces alert(1). The logic is clear, but taking IE bugs into account, I have put the word “logically” in quotes, since such an implementation is obviously broken and depends on JScript bugs.

    The fifth NFE bug in JScript is related to the creation of properties of the global object by assigning a value to an unqualified identifier (i.e. without the var keyword). Since an NFE is treated here as an FD and is accordingly stored in the variable object, an assignment to an unqualified identifier (i.e. not to a variable but to a usual property of the global object) whose name matches the function name does not create a global property.

    (function () {
     
      // without var not a variable in the local
      // context, but a property of global object
     
      foo = function foo() {};
     
    })();
     
    // however from the outside of
    // anonymous function, name foo
    // is not available
     
    alert(typeof foo); // undefined

    Again, the “logic” is clear: the function declaration foo gets into the activation object of the local context of the anonymous function on entering the context. By the code execution stage the name foo already exists in the AO, i.e. it is treated as local. Accordingly, the assignment simply updates the property foo that already exists in the AO, instead of creating a new property of the global object as it should according to the logic of ECMA-262-3.
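
    Code that has to run in such an environment could probe for the first two bugs at runtime. The sketch below is derived only from the behavior described in this section and has not been verified against JScript itself; leaksNFEName, duplicatesNFE, leakProbe and selfProbe are hypothetical names. In a conforming engine both checks are expected to return false:

```javascript
// Hypothetical helper: true if a named FE's name leaks into the enclosing scope.
function leaksNFEName() {
  var f = function leakProbe() {};
  return typeof leakProbe !== 'undefined';
}

// Hypothetical helper: true if the object seen under the NFE's own name
// differs from the object that was assigned to the variable.
function duplicatesNFE() {
  var g = function selfProbe() {
    return selfProbe;
  };
  return g() !== g;
}

console.log(leaksNFEName());  // false in a conforming engine
console.log(duplicatesNFE()); // false in a conforming engine
```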

    Functions created via Function constructor

    This type of function object is discussed separately from FD and FE since it also has its own features. The main feature is that the [[Scope]] property of such functions contains only the global object:

    var x = 10;
     
    function foo() {
     
      var x = 20;
      var y = 30;
     
      var bar = new Function('alert(x); alert(y);');
     
      bar(); // 10, "y" is not defined
     
    }

    We see that the [[Scope]] of the bar function does not contain the AO of the foo context: the variable “y” is not accessible and the variable “x” is taken from the global context. By the way, note that the Function constructor can be used both with the new keyword and without it; in this case these variants are equivalent.
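
    A sketch of the same contrast that also runs outside a browser. One assumption to flag: Node.js wraps each file in a function, so a top-level var there is not a property of the global object; the example therefore sets the global value explicitly through globalThis. The names scopeDemo and onlyLocal are illustrative:

```javascript
// Node.js wraps each file in a function, so the "global" value is set explicitly.
globalThis.scopeDemo = 'global';

function outer() {
  var scopeDemo = 'local';
  var onlyLocal = 'local only';

  // Created via the Function constructor: sees only the global scope.
  var viaConstructor = new Function('return [scopeDemo, typeof onlyLocal];');

  // An ordinary FE: sees the variables of outer().
  var viaExpression = function () {
    return [scopeDemo, typeof onlyLocal];
  };

  return { viaConstructor: viaConstructor(), viaExpression: viaExpression() };
}

var seen = outer();
delete globalThis.scopeDemo;

console.log(seen.viaConstructor); // [ 'global', 'undefined' ]
console.log(seen.viaExpression);  // [ 'local', 'string' ]
```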

    The other feature of such functions is related to Equated Grammar Productions and Joined Objects. This mechanism is provided by the specification as a suggestion for optimization (however, implementations have the right not to use it). For example, if we have an array of 100 elements which is filled in a loop with functions, then an implementation can use this mechanism of joined objects. As a result, only one function object may be used for all elements of the array:

    var a = [];
     
    for (var k = 0; k < 100; k++) {
      a[k] = function () {}; // possibly, joined objects are used
    }

    But functions created via Function constructor are never joined:

    var a = [];
     
    for (var k = 0; k < 100; k++) {
      a[k] = Function(''); // always 100 different functions
    }

    Another example related to joined objects:

    function foo() {
     
      function bar(z) {
        return z * z;
      }
     
      return bar;
    }
     
    var x = foo();
    var y = foo();

    Here an implementation also has the right to join objects x and y (and to use one object), because the functions physically (including their internal [[Scope]] property) are not distinguishable. Therefore, functions created via the Function constructor always require more memory resources.
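
    Whether an engine shares work between such functions internally cannot be seen from program text, but object identity can. The sketch below (variable names are illustrative) checks identity for the three cases above. As far as I am aware, editions of the standard after ECMA-262-3 no longer describe joined objects, so a current engine is expected to report distinct objects in every case:

```javascript
var viaExpression = [];
var viaConstructor = [];

for (var k = 0; k < 3; k++) {
  viaExpression[k] = function () {};
  viaConstructor[k] = Function('');
}

var constructorDistinct = viaConstructor[0] !== viaConstructor[1];
var expressionDistinct = viaExpression[0] !== viaExpression[1];

// The same identity check for functions returned from repeated calls.
function makeSquare() {
  function square(z) {
    return z * z;
  }
  return square;
}

var returnedDistinct = makeSquare() !== makeSquare();

console.log(constructorDistinct, expressionDistinct, returnedDistinct);
```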

    Algorithm of function creation

    The pseudo-code of function creation algorithm (except steps with joined objects) is described below. This description helps to understand in more detail which function objects exist in ECMAScript. The algorithm is identical for all function types.

    F = new NativeObject();
     
    // property [[Class]] is "Function"
    F.[[Class]] = "Function"
     
    // a prototype of a function object
    F.[[Prototype]] = Function.prototype
     
    // reference to function itself
    // [[Call]] is activated by call expression F()
    // and creates a new execution context
    F.[[Call]] = <reference to function>
     
    // built in general constructor of objects
    // [[Construct]] is activated via "new" keyword
    // and it is the one who allocates memory for new
    // objects; then it calls F.[[Call]]
    // to initialize created objects passing as
    // "this" value newly created object
    F.[[Construct]] = internalConstructor
     
    // scope chain of the current context
    // i.e. context which creates function F
    F.[[Scope]] = activeContext.Scope
    // if this function is created
    // via new Function(...), then
    F.[[Scope]] = globalContext.Scope
     
    // number of formal parameters
    F.length = countParameters
     
    // a prototype of created by F objects
    __objectPrototype = new Object();
    __objectPrototype.constructor = F // {DontEnum}, is not enumerable in loops
    F.prototype = __objectPrototype
     
    return F

    Note that F.[[Prototype]] is the prototype of the function (constructor), while F.prototype is the prototype of objects created by this function (the terminology is often confused, and in some articles F.prototype is called a “prototype of the constructor”, which is incorrect).
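
    Most of the properties set by the algorithm can be observed from ordinary code. A small sketch, using a hypothetical constructor Point and standard reflection methods (Object.getPrototypeOf and Object.keys come from later editions of the standard than the one discussed here):

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
}

var checks = {
  // F.[[Class]] is "Function"
  classIsFunction: Object.prototype.toString.call(Point) === '[object Function]',
  // F.[[Prototype]] is Function.prototype
  internalPrototype: Object.getPrototypeOf(Point) === Function.prototype,
  // F.length is the number of formal parameters
  lengthIsParamCount: Point.length === 2,
  // F.prototype.constructor refers back to F and is not enumerable
  constructorBackReference: Point.prototype.constructor === Point,
  constructorNotEnumerable: Object.keys(Point.prototype).length === 0
};

// [[Construct]] creates the object, then [[Call]] initializes it as "this".
var p = new Point(1, 2);
checks.instancePrototype = Object.getPrototypeOf(p) === Point.prototype;
checks.initializedByCall = p.x === 1 && p.y === 2;

console.log(checks); // every entry is true
```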

    Conclusion

    This article has turned out rather big; however, we will mention functions again when we discuss their work as constructors in one of the following chapters about objects and prototypes. As always, I am glad to answer your questions in comments.
