Juriy Zaytsev | May 14, 2010
I assume you already know what a closure is. But just as a reminder, let's take a look at one of the more popular definitions.
What's a Closure
A "closure" is an expression (typically a function) that can have free variables together with an environment that binds those variables (that "closes" the expression).
Note that a closure is formed when a function is instantiated, not when it is invoked. The form in which the function is defined makes no difference either: it can be a function declaration or a function expression.
A simple form of closure could be exemplified like this:
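A minimal sketch of such a closure follows. The names `outer` and `x` are the ones used in the surrounding text; the name `inner` for the returned function and the value `1` are assumptions made for illustration.

```javascript
function outer() {
  var x = 1; // local to outer; the value itself is arbitrary

  // The inner function is instantiated each time outer runs
  // and closes over x at that moment.
  return function () {
    return x;
  };
}

var inner = outer(); // outer has finished executing here
inner(); // 1, because x is still reachable through the closure
```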
When the function `outer` is called, an inner function is instantiated. At the moment of instantiation, the inner function has access to `x` variable, and so the function becomes closed over that variable. Even after execution of the function `outer`, the inner function still has access to `x` variable.
But rather than looking at artificial examples, let's examine some real-life ones.
Classic example: event handlers in a loop
One of the most natural uses for closures comes up whenever we're dealing with loops. This is the classic "last value" problem, in which a function declared within a loop ends up using the last value of an incrementing variable:
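The problem can be reproduced with a sketch like the one below. To keep it runnable outside a browser, plain objects stand in for the ten DOM elements and the handlers return `i` instead of displaying it; both choices are assumptions, not part of the original listing.

```javascript
// Plain objects stand in for ten DOM elements (an assumption,
// so that the sketch also runs outside a browser).
var elements = [];
for (var n = 0; n < 10; n++) {
  elements.push({});
}

for (var i = 0; i < 10; i++) {
  elements[i].onclick = function () {
    return i; // every handler reads the same, shared i
  };
}

elements[0].onclick(); // 10, not 0
elements[7].onclick(); // 10, not 7
```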
As you can see, this snippet assigns 10 functions as event handlers to corresponding elements. The author's intention here is to have a different `i` value in each of the event handlers. But this is not at all what happens in reality. If you look closer, it's easy to see what's really going on. Each of the functions assigned to `onclick` has access to the `i` variable. Since `i` is bound to the entire enclosing scope (just like any other variable declared with `var` or as a function declaration), each of the functions has access to the same `i` variable. Once this snippet is executed, `i` keeps its last value, the one that was set when the loop ended.
So how do we fix it? By taking advantage of closures, of course. The solution is simple: scope each of the `i` values to the level of event handler. Let's see how this is done:
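One way to write this fix is sketched below. As before, plain objects stand in for DOM elements and the handlers return their value rather than displaying it (both are assumptions); the parameter name `index` is the one used in the explanation that follows.

```javascript
// Plain objects stand in for ten DOM elements (an assumption).
var elements = [];
for (var n = 0; n < 10; n++) {
  elements.push({});
}

for (var i = 0; i < 10; i++) {
  // The anonymous wrapper runs immediately; each call gets its own index.
  elements[i].onclick = (function (index) {
    return function () {
      return index; // closed over this call's index, not the shared i
    };
  })(i);
}

elements[0].onclick(); // 0
elements[7].onclick(); // 7
```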
Instead of assigning a function to `onclick` directly, we first create an additional scope by executing an anonymous function. We pass the `i` value to that function and return the inner, event-handling function. Notice how the anonymous function has one `index` argument. When the function is executed and is passed the `i` value, it is this `index` argument that results in the creation of a variable [1] scoped to the anonymous wrapper function. And since the inner, event-handling function is declared within the wrapping one, it also has access to this `index` variable.
We have just created a closure. Each of the event handlers now has access to its own `index`, holding the proper value of `i`: 0, 1, 2, 3, and so on.
If all this anonymous function wrapping looks cryptic, you can always factor it out into a separate function:
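A sketch of the factored-out version might look like this. The helper name `makeHandler` is hypothetical, and plain objects again stand in for DOM elements.

```javascript
// Hypothetical helper: each call creates a new scope holding its own index.
function makeHandler(index) {
  return function () {
    return index;
  };
}

// Plain objects stand in for ten DOM elements (an assumption).
var elements = [];
for (var n = 0; n < 10; n++) {
  elements.push({});
}

for (var i = 0; i < 10; i++) {
  elements[i].onclick = makeHandler(i);
}

elements[3].onclick(); // 3
```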
It's also worth pointing out that names of variable passed to wrapping function and corresponding argument don't have to be different. I used `i` and `index` only to be able to refer to them unambiguously. If you don't need access to outer `i` variable from within event handler, it's absolutely fine to name them identically:
As a result, `this` value of function is not "preserved" when the function is "transferred" from property of one object to another. Moreover, the function can also be called without any object as its "base", in which case `this` references global object. And this is when `bind` comes to the rescue. Let's take a look at a simple implementation of it.
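The following is a minimal sketch of such an implementation, not the native one. It is installed under the hypothetical name `simpleBind` so that the built-in `Function.prototype.bind` stays untouched; the `john` object and its `speak` method are modelled on the names used later in the text, and the `name` property is an assumption.

```javascript
// Hypothetical name, chosen so the native bind is not overwritten.
Function.prototype.simpleBind = function () {
  var fn = this;                                    // original function
  var args = Array.prototype.slice.call(arguments); // everything passed in
  var thisArg = args.shift();                       // first argument is the this value

  return function () {
    var callArgs = Array.prototype.slice.call(arguments);
    // Stored arguments come first, call-time arguments after them.
    return fn.apply(thisArg, args.concat(callArgs));
  };
};

var john = {
  name: 'John',
  speak: function () {
    return 'My name is ' + this.name;
  }
};

var bound = john.speak.simpleBind(john);
bound(); // 'My name is John', regardless of how bound is called
```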
The inner returned function has access to variables declared within `bind`: `fn` — original function, `thisArg` — specific `this` value to invoke a function with, and `args` — arguments passed to `bind` after `thisArg`. Long after `bind` is executed, the inner function can use these variables to perform its magic.
The reason the `onclick` handler doesn't work as expected is that the function is invoked with `this` referencing `document.body` (that's how `onclick` works). Binding, however, ensures that the function is always called with `this` referencing a specific object, `john` in this case.
`document.body.onclick = john.speak.bind(john); // works as expected`
Note how `Function.prototype.bind` is designed to capture not only the `this` value but also the arguments passed to it. Those arguments are then passed to the bound function before any arguments of the bound function itself. This process is known as partial application (it is often loosely called currying), and it also takes advantage of closures.
Let's take a look at example:
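A small sketch using the native `Function.prototype.bind` is shown below. The function name `add` and the variable name `addTen` are assumptions; the parameters `x` and `y` and the values `10` and `20` follow the surrounding text.

```javascript
function add(x, y) {
  return x + y;
}

// 10 is captured by bind and will be passed as x.
var addTen = add.bind(null, 10);

addTen(20); // 30, because 20 arrives as y
```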
The first value, the one passed to bind, is `10` and corresponds to `x` in this case. The second value, the one passed to bound function, is `20` and corresponds to `y`. We can juggle with these back and forth, and still have identical result (note the placement of parens):
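The three placements can be sketched as follows, again assuming a two-argument function named `add` (a hypothetical name) and the native `bind`.

```javascript
function add(x, y) {
  return x + y;
}

var bothStored = add.bind(null, 10, 20)(); // 10 and 20 are both stored by bind
var oneStored = add.bind(null, 10)(20);    // only 10 is stored; 20 is passed at call time
var noneStored = add.bind(null)(10, 20);   // nothing is stored; both are passed at call time
```

All three variables end up holding `30`.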
All of these are functionally identical, since the bound function is always invoked with the same list of arguments: `10` and `20`. The difference is that in the first case both `10` and `20` are stored in a closure, while in the second case only `10` is. In the last scenario, neither `10` nor `20` is stored in a closure; both are passed to the bound function directly.
This might not seem obvious, but closures are often created implicitly, for example when working with methods like `setTimeout` or `setInterval`. Instantiating and passing a function as the first argument results in the function capturing any free variables that it has access to at the moment of creation.
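A minimal sketch of such an implicit closure is shown below. The variable name `x` and the 100ms delay follow the surrounding text; the value and the `seen` variable, which only records what the callback observed, are assumptions.

```javascript
var x = 'captured';
var seen; // filled in by the callback, for illustration only

setTimeout(function () {
  // This function was instantiated above and closed over x,
  // so x is still available when the timer fires.
  seen = x;
}, 100);
```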
In this case, function passed to `setTimeout` has access to `x` long after this code is executed—in about 100ms.
The Module Pattern, popularized by Douglas Crockford, is a perfect example of closures in practice. The idea is to encapsulate private logic and expose only certain, "public" methods. Let's take a variation of that same `bind` implementation, but make it a property of `functionUtils` object this time, not `Function.prototype`:
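One possible shape for such a module is sketched here. The names `functionUtils`, `bind`, and `slice` come from the surrounding text; the exact signatures (`slice(arrayLike, index)` and `bind(fn, thisArg, ...)`) are assumptions.

```javascript
var functionUtils = (function () {

  // Private helper: not reachable from outside the wrapper.
  function slice(arrayLike, index) {
    return Array.prototype.slice.call(arrayLike, index || 0);
  }

  // Public interface.
  return {
    bind: function (fn, thisArg) {
      var args = slice(arguments, 2); // arguments after fn and thisArg
      return function () {
        return fn.apply(thisArg, args.concat(slice(arguments)));
      };
    }
  };
})();

var greet = functionUtils.bind(function (greeting) {
  return greeting + ', ' + this.name;
}, { name: 'John' });

greet('Hello'); // 'Hello, John'
```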
Note the familiar self-executing anonymous function. Inside that function, we have a private `slice` method, and public `bind` method. The way this works is again due to a closure. The self-executing function returns an object with "bind" property referencing second function. Since that function is declared in the same scope as `slice` one, it has access to `slice` even after wrapping function is executed and `functionUtils` is assigned an object.
The placement of `slice` doesn't really matter here, as being function declaration, it is "hoisted" to the top of the enclosing scope. Positioning it after return statement would have functionally identical effect; although doing so is sometimes considered a bad practice (it looks confusing).
A slightly less popular variation of the module pattern (or rather of its wrapping) relies on instantiating an anonymous function. Don't be confused; the following snippet also takes advantage of a closure to capture a private method in a public one.
The ubiquitous private methods implementation is really nothing more than the module pattern applied to constructors. And just like the module pattern, it relies on closures to provide privacy and to give public methods internal access to private data. Here's a simple example of a `Circle` constructor, in which only the public `getRadius` and `setRadius` methods have access to the "internal" radius value:
During `Circle` instantiation, the `getRadius` and `setRadius` functions are instantiated right from within constructor. As a result, they form closures over an argument passed to a `Circle` function — `radius`. This `radius` value only exists in scope of these public methods of a circle, and is never exposed to the "outside":
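A sketch of such a constructor, together with a short demonstration that `radius` stays hidden, could look like this. The names `Circle`, `getRadius`, `setRadius`, and `radius` come from the text; the parameter name `value` and the numbers are assumptions.

```javascript
function Circle(radius) {
  // Both methods are created per instance and close over radius.
  this.getRadius = function () {
    return radius;
  };
  this.setRadius = function (value) {
    radius = value;
  };
}

var circle = new Circle(5);
circle.getRadius(); // 5
circle.setRadius(8);
circle.getRadius(); // 8
circle.radius;      // undefined, the value lives only in the closure
```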
Note that when implementing private methods via this pattern, it's good to understand the performance implications involved. First of all, creating the `getRadius` and `setRadius` functions in `Circle` is a slight hit in runtime execution. Second, and more importantly, there are now two function objects created for every instance of `Circle`. If we were instead to create `getRadius` and `setRadius` as methods of `Circle.prototype`, the number of function objects would be constant:
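The prototype-based alternative can be sketched as follows. The underscored `_radius` property name is an assumption standing in for "private by convention".

```javascript
function Circle(radius) {
  this._radius = radius; // "private" by naming convention only
}

// Created once and shared by every instance.
Circle.prototype.getRadius = function () {
  return this._radius;
};
Circle.prototype.setRadius = function (value) {
  this._radius = value;
};

var small = new Circle(1);
var large = new Circle(100);

small.getRadius === large.getRadius; // true, the same function object
small._radius;                       // 1, readable from the outside
```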
However, in such case, we lose the luxury of having truly private members, and have to resort to other means such as denoting privacy through convention (underscored property names). What it usually comes down to is making a choice between having truly private members or having more efficient implementation.
Giving element a unique id
An implementation of such helper could look like this:
The purpose of the closure here is to store a number counter. To guarantee uniqueness, every time element doesn't have an id, a counter is incremented and a new, unique id is assigned to an element. `getUniqueId` could now be used like this:
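A sketch of both the helper and its use follows. The id prefix `generated-uid-` and the counter name `id` are assumptions, and plain objects stand in for DOM elements so the example runs outside a browser.

```javascript
var getUniqueId = (function () {
  var id = 0; // counter kept alive by the closure

  return function (element) {
    if (!element.id) {
      element.id = 'generated-uid-' + id++; // hypothetical prefix
    }
    return element.id;
  };
})();

// Plain objects stand in for DOM elements (an assumption).
var first = {};
var second = { id: 'nav' };

getUniqueId(first);  // 'generated-uid-0'
getUniqueId(second); // 'nav', an existing id is left alone
getUniqueId(first);  // 'generated-uid-0', the counter is not touched again
```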
Caching (or memoization)
One of the well-known ways to improve the performance of an application is through caching. And again, closures allow for an elegant implementation of this optimization technique. Take, for example, `hasClassName`, the irreplaceable helper when scripting for browsers. One possible implementation of `hasClassName` relies on matching a regular expression against an element's `className`. However, this means that a regular expression needs to be created dynamically, based on the class name given to the function.
The regular expression has to be compiled every time `hasClassName` is executed. This is generally slow, so let's employ caching.
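A cached variant might be sketched like this. The regular expression pattern and the use of a plain object as the cache are assumptions; a plain object is used as a stand-in for a DOM element. The class name is inserted into the pattern unescaped, so the sketch assumes names without regex metacharacters.

```javascript
var hasClassName = (function () {
  var cache = {}; // maps a class name to its compiled RegExp

  return function (element, className) {
    // Compile the expression only the first time a class name is seen.
    var re = cache[className] ||
      (cache[className] = new RegExp('(?:^|\\s)' + className + '(?:\\s|$)'));
    return re.test(element.className);
  };
})();

// A plain object stands in for a DOM element (an assumption).
var element = { className: 'nav active' };

hasClassName(element, 'active'); // true
hasClassName(element, 'act');    // false, partial names do not match
```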
As usual, the self-executing function results in the inner, returned function being closed over `cache`. That inner function is then assigned to `hasClassName`. Since both the returned function and `cache` are declared within one scope, the returned function has access to `cache` even after it has been assigned to `hasClassName`.
Note that this implementation of `hasClassName` doesn't work reliably with names that collide with `Object.prototype.*` members: "toString", "valueOf", "propertyIsEnumerable", etc. This is more of an edge case, but it's good to be aware of this limitation.
"Shorter" variable/property resolution
Continuing with performance optimizations, let's take a look at identifier resolution.
Identifier resolution is the process of evaluating identifier against scope chain. Identifiers are lexical units in ECMAScript. They are what constitute names of variables, function declarations/expressions, formal parameters of a function, etc. When a program evaluates identifier, it has to follow scope chain, looking for property with the same name. The further in scope chain identifier is, the slower this resolution process is. Let's take a look at example:
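A tiny sketch with one local and one outer variable is given below. The variable names `inner` and `outer` match the discussion; the function name `lookup` and the string values are assumptions.

```javascript
var outer = 'outer value'; // declared in the containing scope

function lookup() {
  var inner = 'inner value'; // declared locally

  // inner is found on the nearest object in the scope chain;
  // outer is found only after that object has been checked.
  return inner + ' / ' + outer;
}

lookup(); // 'inner value / outer value'
```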
During identifier resolution, the `inner` variable (the one that's defined locally) is found immediately on the closest object in the scope chain. The `outer` variable, on the other hand, is declared in the containing scope. Resolving it requires first checking the nearest object in the scope chain (on which `inner` is defined), and only then proceeding to the outer one. The more objects in the scope chain, the longer this resolution process takes.
So how can closures help? By creating local "aliases" (in essence—local variables), we can speed up this resolution process. And to "hide" these local variables, we can simply store them in a closure.
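The aliasing idea can be sketched with a small `keys` function. It is a simplified stand-in rather than a full `Object.keys` replacement, and the names `object`, `result`, and `name` are assumptions.

```javascript
var keys = (function () {
  // Local alias, stored in the closure of the returned function.
  var hasOwnProperty = Object.prototype.hasOwnProperty;

  return function (object) {
    var result = [];
    for (var name in object) {
      // Resolved one step up the scope chain instead of via the global Object.
      if (hasOwnProperty.call(object, name)) {
        result.push(name);
      }
    }
    return result;
  };
})();

keys({ x: 1, y: 2 }); // ['x', 'y']
```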
This is a simple implementation of `keys` property, similar to `Object.keys` method from 5th edition of ECMAScript. This method returns an array of names, corresponding to own properties of an object.
Looking at implementation, you can see an aliasing of `Object.prototype.hasOwnProperty` to the local `hasOwnProperty` variable. Since that variable is stored in a closure, resolution of `hasOwnProperty` identifier in a loop of an actual `keys` function should now be faster. Instead of travelling all the way to the global scope, in which `Object` is defined, `hasOwnProperty` is now resolved on the next object in scope chain — the one corresponding to outer, wrapping function.
Another advantage of such aliasing is avoiding multiple property access. `Object.prototype.hasOwnProperty` has to first resolve "prototype" property on `Object`, and then `hasOwnProperty` on `Object.prototype`. There's no need to do this with local `hasOwnProperty` variable.
So with `Object.prototype.hasOwnProperty`, an interpreter first needs to resolve `Object` by following the scope chain all the way to the last object, the global object. Then it performs two property resolutions: "prototype" on `Object` and "hasOwnProperty" on `Object.prototype`. With the local `hasOwnProperty`, there's only an identifier resolution, and a short one at that. The advantage of the latter approach is clear.
Note that this implementation of `keys` doesn't take care of a rather nasty JScript DontEnum bug. I omitted workaround for clarity and it shouldn't be hard to add it.
Speaking of identifier resolution, it's worth mentioning that additional closures can themselves hinder performance. If a method definition is already contained within an anonymous, wrapping function, there's often no need to wrap it in another function:
In this case, we can avoid an extra closure by declaring `hasOwnProperty` right in the scope of wrapping function. This avoids a name leak into the global scope anyway:
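A sketch of this flatter arrangement is shown below. The container name `utils` is hypothetical; the point is only that `hasOwnProperty` sits directly in the existing wrapper's scope, next to `keys`, with no second wrapper around it.

```javascript
var utils = (function () {
  // Declared directly in the wrapping scope: no additional closure needed,
  // and still invisible to the global scope.
  var hasOwnProperty = Object.prototype.hasOwnProperty;

  function keys(object) {
    var result = [];
    for (var name in object) {
      if (hasOwnProperty.call(object, name)) {
        result.push(name);
      }
    }
    return result;
  }

  return { keys: keys };
})();

utils.keys({ a: 1 }); // ['a']
```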
The advantage is one fewer closure, and therefore one fewer object in the scope chain of the `keys` function. The downside is that the `hasOwnProperty` identifier can now conflict with other code in the "wrapping" scope, especially if other functions follow a similar pattern and "hoist" variables to the outer scope.
As always, you should choose what makes more sense depending on a context.
Another interesting optimization that sometimes involves closures is object reuse. The idea is similar to caching (i.e., avoiding the creation of objects), only this time objects are created once at "load" time instead of multiple times at runtime.
Let's take a look at a well-known `clone`/`beget` method as an example of this enhancement. Introduced first by Lasse Reichstein Nielsen in 2003, and later popularized by Douglas Crockford, `clone` provides a way to create an object that inherits from another object. In its simplest and most popular form, `clone` looks like this:
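The simple form is commonly written along these lines. The parameter name `parent` and the sample objects are assumptions.

```javascript
function clone(parent) {
  function F() {}       // a new function object on every call
  F.prototype = parent;
  return new F();       // the new object inherits from parent
}

var animal = { legs: 4 };
var dog = clone(animal);

dog.legs; // 4, inherited from animal
```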
Notice how function `F` is created every time `clone` is invoked. Not only does this involve a runtime performance hit, it also results in higher memory consumption. There's really no need to create the function on every call when it can be created just once. To my knowledge, the following pattern was first proposed by Richard Cornford.
This time, function `F` is created within an anonymous "wrapping" function and is then reused within the returned `clone` function. It is created only once, when the wrapper executes, rather than on every call, which is how we save on memory and execution time.
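A sketch of the reuse pattern is given below; it illustrates the idea and is not a verbatim copy of the original proposal. The parameter name `parent` and the sample objects are assumptions.

```javascript
var clone = (function () {
  function F() {} // created once and kept in the closure

  return function (parent) {
    F.prototype = parent; // re-point the shared constructor
    return new F();
  };
})();

var animal = { legs: 4 };
var bird = { legs: 2 };

var dog = clone(animal);
var crow = clone(bird);

dog.legs;  // 4, objects created earlier keep their own prototype
crow.legs; // 2
```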
A similar example of object reuse can be seen with the `escapeHTML` method. The `escapeHTML` method, as its name suggests, provides a way to escape an html string. The most straight-forward way to implement it is using regular expressions:
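A regex-based version might be sketched as follows, handling only the three characters discussed in this section. The parameter name `string` is an assumption.

```javascript
function escapeHTML(string) {
  // The ampersand goes first, so entities produced by the
  // later replacements are not escaped a second time.
  return string
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

escapeHTML('<b>Tom & Jerry</b>'); // '&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;'
```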
However, there's an interesting shortcut that takes advantage of the non-standard (but de facto, and now codified by HTML5) `innerHTML` property. This approach was actually used by the Prototype.js library for quite a while.
There's no manual replacement of each of "<", ">", and "&" with the corresponding character entities ("&lt;", "&gt;", and "&amp;"). Instead, a text node is created and its value is set to the string in question, using the `data` property. The text node itself is not enough, though: we need the `innerHTML` property to retrieve the escaped representation, and text nodes don't have it. That's why an HTML element is created and the text node is appended to it. The text node and the HTML element are reusable objects, and the `escapeHTML` function has access to both of them. During invocation, it sets the `data` property on the text node and retrieves the escaped string from the `innerHTML` property of the element that contains that text node.
It's a nice trick, but not without issues. Since `innerHTML` is a proprietary API, there's really no guarantee on the consistency of the returned representation. In Internet Explorer, for example, handling of new lines by `innerHTML` differs from that in other browsers. If you're planning to use this method, it's a good idea to feature test for any deviations. Another concern is increased memory consumption in some versions of Internet Explorer. There are no memory leaks per se, but because DOM objects are stored in a closure, IE has been known to release less memory when closing a tab or window with the application.
Employ with care.
If you'd like to understand closures in all the gory details, I would recommend reading an in-depth article on the subject by Richard Cornford. It dives into some of the interesting aspects of what happens behind the scenes: explanation of activation and variable objects, scope chains and the process of identifier resolution, memory leaks, etc. Even apart from closures, understanding the underlying mechanisms of the language, as explained in that article, can open your eyes to many other things.
[1] Technically, an argument is not a variable, but we won't go into the finer details of the distinction at this point.