The Problem

Suppose we want to count the words in a document, or the identifiers in a computer program. We’ll need to keep a dictionary (aka hash or mapping) whose keys are the words and whose values are their frequencies. When we come across a new word (or identifier) we add it to the dictionary as a key, with value equal to one. When we meet a word already in the dictionary, we increase the value by one.

The starting point, of course, is an empty dictionary (or mapping or hash or whatever you want to call it). In JavaScript creating one of these requires some thought, and as far as I can tell the solution given below has not been described before. Here’s the problem. In JavaScript we have to store the data (key-value pairs) in an object, and every object comes with some builtin methods. Here’s some command-line JavaScript:

js> data = {};  // Create empty data object, we hope.
js> data.foo === undefined; // true
js> data['foo'] === undefined;  // true
js> data['constructor'] === undefined; // false - not what we want.
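To see how this affects the word counter, here is a minimal sketch. The function name countWordsNaive and the sample text are made up for illustration, and the `|| 0` idiom is just one common way of writing the increment.

```javascript
// Hypothetical helper: count words using a plain {} as the dictionary.
function countWordsNaive(text) {
    var counts = {};
    var words = text.split(/\s+/);
    for (var i = 0; i < words.length; i++) {
        var word = words[i];
        if (word === '') { continue; }  // Skip blanks from leading/trailing spaces.
        // Treats a missing key as zero, but inherited properties are not missing.
        counts[word] = (counts[word] || 0) + 1;
    }
    return counts;
}

var naive = countWordsNaive('the constructor of the object');
// naive['the'] is 2, as expected.
// naive['constructor'] is not 1: the inherited function was found,
// and adding 1 to it produced a string.
```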

Explanation

JavaScript has what is called prototypal inheritance. This means that every JavaScript object has a prototype object, and that the original object inherits from the prototype object. Let’s call them obj and prot respectively. If we look up ‘obj.foo’ and this fails, then we try ‘prot.foo’ and if found use that instead. And if ‘prot.foo’ fails we look up ‘protprot.foo’ (where ‘protprot’ is the prototype object of ‘prot’) and so on, until we get to the end of the prototype chain.
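The lookup rule can be written out by hand. This is only a sketch of the idea, not how an engine implements it. The helper lookup is hypothetical, and it assumes Object.getPrototypeOf is available (ES5 and later).

```javascript
// Hypothetical helper that mimics the property lookup described above.
function lookup(obj, key) {
    var current = obj;
    while (current !== null) {
        // Only look at properties stored directly on this object.
        if (Object.prototype.hasOwnProperty.call(current, key)) {
            return current[key];
        }
        // Not found here, so move one step along the prototype chain.
        current = Object.getPrototypeOf(current);
    }
    return undefined;  // Reached the end of the chain.
}

var prot = { foo: 'from prot' };
var Obj = function(){};
Obj.prototype = prot;
var obj = new Obj();

lookup(obj, 'foo');       // Found on prot.
lookup(obj, 'toString');  // Found further along, on Object.prototype.
lookup(obj, 'bar');       // undefined: not found anywhere on the chain.
```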

As it happens, Object.prototype is the prototype object for data, and Object.prototype has a ‘constructor’ attribute. This is why data has a ‘constructor’ entry. It has some others, such as a ‘toString’ method.

Solution

One solution is to change Object.prototype, but this is probably dangerous: it would change the behaviour of all objects, and would probably break a lot of scripts. Perhaps better is to create a new sort of object, which masks the unwelcome builtin attributes. Here’s the code to do this.

var Data = function(){};

Data.prototype = {
    // Note: 'for in' does not pick up the builtin Object.prototype
    // properties, but modern engines do list the masking entries below.
    'constructor': undefined,
    'hasOwnProperty': undefined,
    'isPrototypeOf': undefined,
    'propertyIsEnumerable': undefined,
    'toLocaleString': undefined,
    'toString': undefined,
    'valueOf': undefined
};

var data = new Data();
data.constructor === undefined;  // Is now true.
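As a usage sketch, here is the word counter from the start of the article built on a masking prototype. The Data definition is repeated (with the standard Object.prototype property names) so that the block runs on its own; countWords is a hypothetical name. It masks only the seven names listed, so any other inherited names an engine provides would need the same treatment.

```javascript
var Data = function(){};
Data.prototype = {
    'constructor': undefined,
    'hasOwnProperty': undefined,
    'isPrototypeOf': undefined,
    'propertyIsEnumerable': undefined,
    'toLocaleString': undefined,
    'toString': undefined,
    'valueOf': undefined
};

// Hypothetical helper: count words using a Data object as the dictionary.
function countWords(text) {
    var counts = new Data();
    var words = text.split(/\s+/);
    for (var i = 0; i < words.length; i++) {
        var word = words[i];
        if (word === '') { continue; }  // Skip blanks from leading/trailing spaces.
        if (counts[word] === undefined) {
            counts[word] = 1;       // New word.
        } else {
            counts[word] += 1;      // Word seen before.
        }
    }
    return counts;
}

var counts = countWords('toString and constructor and valueOf');
// counts['and'] is 2; counts['toString'], counts['constructor']
// and counts['valueOf'] are each 1.
```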

Discussion

First, here’s why I said earlier that the solution appears to be new. Douglas Crockford, in his new book JavaScript: The Good Parts, writes (p23) that there are two solutions to dealing with these undesired properties: use the typeof operator, or use the hasOwnProperty method. He doesn’t mention the masking solution above. Douglas is very knowledgeable about JavaScript, which is why I think this method is new. If you know better, please let me know.
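For comparison, those two checks might look like this when applied to a plain object. This is a sketch of the general idea rather than the book’s own code; isCountByTypeof and hasOwnKey are hypothetical names.

```javascript
var plain = {};
plain['foo'] = 3;

// Approach 1: use typeof to reject values that cannot be counts.
function isCountByTypeof(dict, key) {
    return typeof dict[key] === 'number';
}

// Approach 2: use hasOwnProperty to ignore inherited properties.
// Borrowing it from Object.prototype still works if the dictionary
// itself holds a key called 'hasOwnProperty'.
function hasOwnKey(dict, key) {
    return Object.prototype.hasOwnProperty.call(dict, key);
}

isCountByTypeof(plain, 'foo');          // true
isCountByTypeof(plain, 'constructor');  // false: the value is a function
hasOwnKey(plain, 'foo');                // true
hasOwnKey(plain, 'constructor');        // false: inherited, not own
```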

Now for an unwelcome but unavoidable side effect. If your program tries to convert data into a string, it will fail, because data does not have a toString method. This is unavoidable: if data did have a toString method, then it would show up as a key, and data would not be initially empty. The same goes, of course, for the other Object methods we’ve masked, such as constructor.

You might think not being able to turn data into a string is a problem. It shouldn’t be. The keys and values of data can still be turned into strings. If you must turn data into a string, the code

Object.prototype.toString.apply(data)

will do this, using the usual method.
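Both halves of this can be checked in a short sketch: the ordinary conversion fails, and borrowing the standard method works. It assumes the method meant is the one stored on Object.prototype, and it repeats the Data definition (with the standard property names) so that it runs on its own.

```javascript
var Data = function(){};
Data.prototype = {
    'constructor': undefined,
    'hasOwnProperty': undefined,
    'isPrototypeOf': undefined,
    'propertyIsEnumerable': undefined,
    'toLocaleString': undefined,
    'toString': undefined,
    'valueOf': undefined
};
var data = new Data();

var failed = false;
try {
    String(data);  // Needs a callable toString or valueOf; both are masked.
} catch (e) {
    failed = (e instanceof TypeError);
}
// failed is now true.

// Borrow the standard method directly from Object.prototype.
var text = Object.prototype.toString.apply(data);
// text is '[object Object]'
```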

Finally, even if your code relies on turning data into a string, you may be better off using the Data approach. This is because when you add the key ‘toString’ to an ordinary data object you’ll immediately lose the ability to turn it into a string (unless toString has exactly the right value). If you use the Data approach, your data will start with the Object methods being masked, and so you’ll have to deal with it up front. If you use one of the other two approaches, the Object methods will be masked later, when the program is given unusual input.
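Here is a small sketch of that late failure with a plain object. The value 1 stands in for a word count and is only an example.

```javascript
var plain = {};
var before = String(plain);  // Works while toString is still inherited.

plain['toString'] = 1;       // e.g. the word 'toString' has been counted once.

var broke = false;
try {
    String(plain);           // toString is now a number, not a function.
} catch (e) {
    broke = (e instanceof TypeError);
}
// broke is now true.
```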
