February 24, 2013
What are JavaScript Arrays (a quick overview)?

Objects and Arrays in JavaScript look suspiciously similar, but they are actually very different creatures. To see how they look similar, let's run through some code examples:

// object
console.log(typeof {});

// object
console.log(typeof []);

// [Function: Object]
console.log({}.constructor);

// [Function: Array]
console.log([].constructor);

Just like we can assign fields to objects we can also do the same to arrays:

var obj = {};
obj.foo = 'foo';
obj['bar'] = 'bar';
obj[10] = 'ten';

// foo
console.log(obj.foo);
// bar
console.log(obj['bar']);
// ten
console.log(obj[10]);

var arr = [];
arr.foo = 'foo';
arr['bar'] = 'bar';
arr[10] = 'ten';

// foo
console.log(arr.foo);
// bar
console.log(arr['bar']);
// ten
console.log(arr[10]);

At a high level they look nearly identical; however, there are some differences between the two. If we look at the ECMA specification, we see the following passage:

"Array objects give special treatment to a certain class of property names. A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32-1. A property whose property name is an array index is also called an element. Every Array object has a length property whose value is always a nonnegative integer less than 2^32. The value of the length property is numerically greater than the name of every property whose name is an array index; whenever a property of an Array object is created or changed, other properties are adjusted as necessary to maintain this invariant. Specifically, whenever a property is added whose name is an array index, the length property is changed, if necessary, to be one more than the numeric value of that array index; and whenever the length property is changed, every property whose name is an array index whose value is not smaller than the new length is automatically deleted."

So arrays have elements and a length property, and the length property is only affected by elements. It will always be equal to the largest array index plus one, and if you change the length of an array, all elements whose index is not smaller than the new length will be deleted. Some examples:

var arr = [];
// 0
console.log(arr.length);

arr[0] = 'foo';
// 1
console.log(arr.length);

arr['foo'] = 'foo';
// 1
console.log(arr.length);

arr[100] = 'bar';
// 101
console.log(arr.length);

// bar
console.log(arr[100]);
arr.length = 20;

// undefined
console.log(arr[100]);

For a quick overview of why you shouldn’t use Arrays like an Object, read this article.

Sparse Arrays vs. Dense Arrays
The V8 JavaScript engine used in Chrome (other runtimes may vary) has two different underlying memory stores for an array: dictionary mode and a C-style array. Dictionary mode means the array is backed by a V8 hashtable, and it is considerably less efficient than the C-style array, which is a contiguous allocation of memory, like a traditional memory model for an array.

You get kicked into dictionary mode for a number of reasons. One (as outlined in the perf talk below) is that if dictionary mode would be three times or more space-efficient than the C-style array, dictionary mode is used. Also, if you create an array and then just index into it at a large offset, you will end up in dictionary mode, e.g.

var a = new Array();
a[1000] = 'foo';

This obviously doesn't make any sense in C. Instead, we can give V8 a signal to indicate how many elements we want to allocate upfront:

var a = new Array(1000);
a[10] = 'foo';
a[100] = 'bar';

This will allocate 1000 elements contiguously. For more information and more excellent V8 performance considerations, watch the following talk: https://www.youtube.com/watch?feature=player_detailpage&v=XAqIpGU8ZZk#t=994s

To see what other methods are available on JavaScript arrays, see the excellent MDN documentation.

February 22, 2013
How do you new?

function Foo() {
}
var f = new (Foo);
var f = new Foo;
var f = new Foo();
var f = new (Foo)();

All the above are valid syntax :)

February 21, 2013
Honour thy async signature

When writing asynchronous JavaScript code in node.js there is a common pattern that nearly everyone follows: you provide a callback to the async function, which is called once the function completes; the callback takes an error as its first argument and the result as its second. It is common to see code like:

function doWork() {
    getUserAsync('bob', function(error, user) {
        if (error) {
            console.log('Whoops');
            return;
        }
        console.log('fetched user');
    });
}
doWork();

Callbacks are how node currently handles asynchronous operations, but it is important to note that callbacks != async; you can also have synchronous callbacks, e.g.

function printUsers(users, print) {
    for(var i=0; i<users.length; ++i) {
        print(users[i]);
    }
}
printUsers(['bob', 'joe'], function(username) { console.log(username); });

Here you can see we have a callback function but it is called synchronously by the printUsers function.

So where am I going with this? Well, in node, if your function has an async signature you should always make sure it really is async; you should always honour the asynchronous expectation that a caller will have implicitly assumed.

Let's pretend that you are writing some code that connects to Amazon S3 and then uploads a file. The API connects, and the callback is fired when the connection succeeds or fails:

var s3Client = S3.connect(function(error) {
    if (error) {
        console.log('Failed to connect');
        return;
    }
    console.log('Successfully connected');

    // do some work
    s3Client.upload('foo.txt', 'this is the contents of the file');
});

This looks innocuous enough: if the client connects successfully we upload some content. However, if the author of the S3 library did something like the following:

var S3 = {
    connectionOpen: false,

    connect: function(callback) {
        if (this.connectionOpen) {
            // If the connection is already open we can callback immediately
            callback();
        }
        else {
            // open a connection; for the purposes of the example,
            // simulate an async network call with a timeout
            var self = this;
            setTimeout(function() {
                self.connectionOpen = true;
                callback();
            }, 1000);
        }
        return this;
    }
};

They have written an optimization: if the underlying connection is already open (maybe they are using connection pooling etc.), they simply call the callback synchronously; otherwise they do the full async connection. The problem is that in our example code, if the connect callback is called synchronously, the s3Client variable hasn't been assigned yet, so the code will crash with a TypeError when it tries to call upload on undefined!

Obviously, as an SDK writer you wouldn't want this kind of signature in the first place; you would want a constructor that you call to create the client and then probably a connect method, so you don't run into this pattern, but I hope this example gives you the idea.

So, to work around this, always make sure you call any callback passed to your methods in an async manner. This is simple to do using process.nextTick; for example, we could have written the above connect method as:

if (this.connectionOpen) {
    process.nextTick(callback);
}

This ensures the callback is not called immediately and is added to the event queue to be processed after the calling code has finished executing.
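
Putting that together, here is a minimal sketch (not the original library code) of the earlier connect method rewritten so that the callback is always called asynchronously:

var S3 = {
    connectionOpen: false,

    connect: function(callback) {
        var self = this;
        if (this.connectionOpen) {
            // the connection is already open, but defer the callback so the
            // caller's code finishes executing before it runs
            process.nextTick(callback);
        }
        else {
            // open a connection; simulate an async network call with a timeout
            setTimeout(function() {
                self.connectionOpen = true;
                callback();
            }, 1000);
        }
        return this;
    }
};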

February 19, 2013
What is a self executing function in JavaScript and why you should care.

It is very common to see code in JavaScript wrapped in a function like:

(function() {
    var x = 3;
    console.log(x);
})();

The above syntax is known as a self executing function. Before getting into why you
would ever want to use a SEF, let's dive into how the syntax works.

To declare and define a function in JavaScript you would do something like:

function foo() {
    console.log('foo');
}
foo();

You can also write this as:

var foo = function() {
    console.log('foo');
};
foo();

Next we can use the little-known grouping operator. It is defined as:
(EXPR) -> Return the result of evaluating EXPR

So in our case if our expression is simply a variable that refers to a function
the function will be returned. For example we can do:

var foo = function() {
    console.log('foo');
};
var x = (foo);
x();

Then this is the same as saying:

(foo)();

At this point you can see we can just use the anonymous function without naming it:

(function() {
    console.log('foo');
})();

And there we have it, the strange self executing function syntax is explained. Now, the reason why you would want to do this is mainly to avoid polluting the global namespace when you are declaring your code. For example, let's say I am writing a library in the file mylib.js; if I just wrote:

//mylib.js
var y = 10;

var mylib = {
    foo: function() {
        console.log(y);
    }
};

Then when a user includes mylib.js in their webpage I have polluted the global namespace with the 'y' variable, potentially overwriting any variable called 'y' they have declared. Obviously this is a contrived example, but you get the point. Now, since variables have function scope, if we declared all our code inside a function we would not pollute the global namespace, so we can instead say:

//mylib.js
(function(global) {
    var y = 10;
    var mylib = {
        foo: function() {
            console.log(y);
        }
    };

    global.mylib = mylib;
})(window);

Now when a user includes mylib.js in their page only the mylib variable is exported and the global
namespace is not polluted by any of our other variables.

February 12, 2013
JavaScript, where the most obvious code is ILLEGAL

JavaScript is a small, beautiful and extremely powerful language, but it has its fair share of warts and quirks, as shown in this "the good parts vs. definitive guide" photo :)


Here is one example of a quirk, with an innocuous-looking statement containing a number literal:

10.toString()

Before we get into what happens with this simple looking line of code, here are some lines of JavaScript to help illustrate the point:

// true
console.log(true.toString());

// NaN
console.log(NaN.toString());

// foo
console.log('foo'.toString());

// 10
var n = 10;
console.log(n.toString());

// number
console.log(typeof n);

// number
console.log(typeof 10);

// 10
console.log(10);

// 10
console.log(10.);

// 10
console.log((10.).toString());

// 10
console.log(10.0.toString());

// 10 - hmmm, weird
console.log(10..toString());

// 10
console.log(10       .toString());

// 10
console.log(Number.prototype.toString.call(10));

// SyntaxError: Unexpected token ILLEGAL - say what!
console.log(10.toString());

Weird that the most logical-looking line of code, 10.toString(), doesn't run, while 10..toString() and (10.).toString() both do. 10.toString() initially looks like it could be ambiguous: the . could be interpreted as a decimal point, making 10. (which is 10.0), or it could be a method invocation. The grammar resolves this case by treating 10. as a number literal, so there is no . left over to invoke the method; it's effectively the same as writing (10.)toString(), which is a syntax error. Note that 10.0.toString() is not ambiguous, the grouping operator makes (10.).toString() unambiguous, and 10..toString(), which looks like a syntax error in anyone's book, also works because the first period is interpreted as the decimal point and the second period as the method invocation, i.e. (10.).toString().

February 26, 2012
Asynchronous file uploading using Express and Node.js

There are times on your website when you want to allow users to upload content from their local drive to your server, e.g. user profile pictures. In this post I will show you how to do this using node.js, Express and jquery.

A fully functional example is available on my github account. Note: You will have to follow the instructions below after fetching the code to install Express and all of its dependencies.

Creating your webserver - Express

Express is a web framework built on top of node. It provides functionality such as creating a webserver to handle network requests to your app, view rendering, static file serving, etc. In this post we will use it as our webserver to host our client HTML file and also handle the file uploads to the server.

To install express, we will use NPM, a package manager for node. From the command line (after installing npm), run the following command:

npm install express

After this completes you will have a node_modules folder and inside that the express node module folder. To create your new express app, run:

node_modules/express/bin/express [the_name_of_your_app] e.g. upload-example-app

Express will then create the application skeleton: an app file, folders for static content, etc. The final step is to make sure you have all of the dependencies Express relies on. Change to your application's folder and run:

npm install -d

Now we should be ready to run the server. Simply type (I am assuming you have node installed :) ): node app.js. You then have a webserver listening on port 3000, which you can view in your browser at http://localhost:3000

The webpage

In HTML, we can specify an input element with a type of "file"; this causes the browser to show an upload button that the user can use to choose a file from their computer. The upload button's visual appearance depends on the browser you are using. I'm not going to talk about how you can style this button, but it can be done: http://www.clipboard.com/clip/LQtdg6YHzcYKLyZbAkF5hASUu7kvuDya_ZLe

Our example HTML page will have the basic structure:

Go ahead and create this file under the public folder in your express app, e.g. public/index.html

Limitations

The title of this post has the word "asynchronous" in it, but the default implementation of the file input element causes a page reload when the form is submitted; it is not possible to use an AJAX request, i.e. XmlHttpRequest, to upload the file to the server. In order to upload a file to the server without causing a page reload you need to perform some browser trickery using hidden iframes, textareas and a sprinkling of magic. Luckily for us there is a jquery plugin we can use that will hide all of this trickery from us, called jquery.form. Download the JavaScript file to your /public/javascripts directory and include the jquery library in your page (good developers will make sure they serve a minified and zipped version of these files). We will also add a JavaScript file that will contain all of our file upload specific code (upload.js).

We also need a way to initiate the upload of the file to the server. We could add another button to our page, so the user would first choose the file and then click the upload button, but that seems kind of lazy, so what I do here is add a setInterval call that continually checks the value of the input element; once it has a value we can assume the user has selected a file and we kick off the upload.

jquery.form

Using the jquery.form plugin is pretty simple: we'll add a submit handler to our form using jquery, then on the submit event we use the jquery.form plugin to submit the form asynchronously.

In order for this to work correctly, we need to make sure the form enctype attribute is set (multipart/form-data for file uploads) and the action attribute points to our API endpoint on the server, so our final HTML looks like:

The code in upload.js is shown below. I haven't filled in the success handler; we will do that once we have implemented the server side part of the code:
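
The upload.js snippet was originally embedded from clipboard.com and is no longer visible here, so the following is a minimal sketch of what it might contain; the #uploadForm and #userPhoto ids are assumptions, and it relies on jquery and the jquery.form plugin being loaded:

// upload.js - a sketch, not the original embedded code
$(function() {
    // poll the file input; once it has a value the user has chosen a file
    var pollId = setInterval(function() {
        if ($('#userPhoto').val()) {
            clearInterval(pollId);
            $('#uploadForm').submit();
        }
    }, 500);

    $('#uploadForm').submit(function() {
        // let jquery.form submit the form asynchronously
        $(this).ajaxSubmit({
            success: function(response) {
                // filled in once the server side is implemented
            }
        });
        // prevent the normal, page-reloading form submission
        return false;
    });
});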

Server Side

Now that we have the client side code, we need to implement the server code which will handle uploads to the /api/photos API. If we open app.js in your express app folder, we need to define a route for the API call and a handler. Express makes file uploads amazingly easy: when you define the handler function, it has two parameters, req and res, which are the HTTP request and response objects. In our HTML we defined an input element named userPhoto; on the req object there is a files field that contains an object with information about all of the files that were uploaded. If we look at req.files we will see that Express has retrieved all of the uploaded data and saved it to a file in the /tmp directory. Using:

app.post('/api/photos', function(req, res) {
    console.log(JSON.stringify(req.files));
});

We see the following information printed to the console where we started our server:

{
    "userPhoto": {
        "size": 4668071,
        "path": "/tmp/f0c99fe2d93f1d0268acbe90c5d16c8a",
        "name": "_DSC7471.jpg",
        "type": "image/jpeg",
        "lastModifiedDate": "2012-02-27T01:52:15.763Z",
        "_writeStream": {
            "path": "/tmp/f0c99fe2d93f1d0268acbe90c5d16c8a",
            "fd": 11,
            "writable": false,
            "flags": "w",
            "encoding": "binary",
            "mode": 438,
            "bytesWritten": 4668071,
            "busy": false,
            "_queue": [],
            "drainable": true
        },
        "length": 4668071,
        "filename": "_DSC7471.jpg",
        "mime": "image/jpeg"
    }
}

For this sample app, I'm just going to move the image from the /tmp directory to the /public/images folder so that we can access the file via a URL. The response we send back to the client will be the path to the image. Note: It is your responsibility to delete the files in the /tmp directory in the request handler; Express will not delete these files once they have been created. Don't forget, otherwise your server will mysteriously run out of hard disk space eventually.

Our final server side handler code is shown below; it just moves the file and returns the path to the client:
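
The embedded handler code is missing here, but based on the description above a sketch of it might look like this (the exact paths and error handling are assumptions):

var fs = require('fs');

app.post('/api/photos', function(req, res) {
    var file = req.files.userPhoto;
    var targetPath = __dirname + '/public/images/' + file.filename;

    // move the uploaded file out of /tmp into our public images folder
    fs.rename(file.path, targetPath, function(error) {
        if (error) {
            res.send({ error: 'Upload failed' });
            return;
        }
        // return the path the client can use to display the image
        res.send({ path: '/images/' + file.filename });
    });
});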

Back on the client side, we now add the code to our success handler to show the image we uploaded on the page. Note: even though the function is called success, we still need to check for the error parameter that might exist if there was a problem on the server:
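
Again, the original client snippet is not shown here; a sketch of the success handler might be (the #uploadedImage element is an assumed img tag on the page):

$('#uploadForm').submit(function() {
    $(this).ajaxSubmit({
        success: function(response) {
            if (response.error) {
                console.log('Something went wrong on the server');
                return;
            }
            // point the img element at the uploaded file
            $('#uploadedImage').attr('src', response.path);
        }
    });
    return false;
});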

IE8 - sheeeeeeeeeeeeeeeeeeeeeeeeeeeeeet

Awesome, we're all done, time to do a git push, high fives around the office and go home like a hero … but wait, there is that heavy feeling in your stomach: you should really test this in IE8. You don't want to, but the good dev in you makes you fire up IE8 and test the code. Nearly everything works, but after you upload the data IE8 will open a save file dialog to save the response from the server, sigh (at least it did for me; your mileage may vary).

To work around this annoyance, I had to change the response datatype to 'text' instead of JSON. You can do this by passing a 'dataType': 'text' field in the options object we pass to the ajaxSubmit method in the client code. Then on the server side, when calling res.send(), stringify the response, e.g. res.send(JSON.stringify({ path: 'foo/img.jpg' })); and finally, before sending back the data using res.send, set the content type to text/plain, i.e. res.contentType('text/plain').

Obviously now in the client code, instead of the response parameter being a JSON object it is a string, so we have to turn it back into a JSON object using something like jQuery's parseJSON method.
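
A sketch of the IE8-friendly changes described above (using the same assumed names as the earlier sketches):

// client: inside the form's submit handler, ask for a plain text response
$(this).ajaxSubmit({
    'dataType': 'text',
    success: function(responseText) {
        var response = $.parseJSON(responseText);
        // ... use response.path as before
    }
});

// server: send the JSON back as text/plain
res.contentType('text/plain');
res.send(JSON.stringify({ path: '/images/img.jpg' }));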

Further improvements

For production level code, make sure you:

  • Limit the size of the uploads you permit (if applicable). In production you are probably using something like nginx as a frontend webserver proxying to node, so you can set limits in your nginx.conf file
  • Never trust content users upload, make sure files that are uploaded are an expected type and don’t give the files more permissions than absolutely necessary

February 12, 2012
Node.js connection pooling

#nodejs

node.js is a great platform for building network-based applications (I'll jump into the details of why in another post). For the rest of this post I'll assume you know what node is and have some experience using it.

If you have ever used node to make a number of simultaneous requests, what do you think the following code does:
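
The code was originally embedded from clipboard.com; here is a minimal sketch of the kind of client code being described, firing 100 requests at a local server:

var http = require('http');

for (var i = 0; i < 100; i++) {
    http.request({ host: '127.0.0.1', port: 1337, path: '/' }, function(res) {
        // consume the response so the socket is released back to the pool
        res.on('data', function() {});
    }).end();
}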

If you are thinking that the code will kick off 100 simultaneous requests to your server, you would be wrong. I see code where people need to do a number of network operations and write code to actually batch and limit the number of requests they are making at once, but node.js networking is already doing this.

In the example above, 100 requests are created and added to an internal network queue, but the default node behaviour when using http.request is to have only 5 requests happening simultaneously to a given socket (a socket is the combination of an IP address and port number, so 127.0.0.1:8080 is a different socket to 127.0.0.1:9090). Once a request completes, node then processes the next waiting request in the queue.

We can control this behaviour by adjusting the Agent used (or not used) when making a network request (for some info see: http://nodejs.org/docs/latest/api/http.html#http.Agent)

To look at this in action, let's create a server on our local machine (127.0.0.1) that is listening on port 1337, then run the above code. The server keeps track of the number of concurrent requests made from the client, stored in concurrentServerRequests, and the total number of requests made by the client, in totalServerRequests.
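
The server code was also an embedded snippet; a sketch under the same assumptions (the handler delays its response so that requests overlap) could look like:

var http = require('http');

var concurrentServerRequests = 0;
var totalServerRequests = 0;

http.createServer(function(req, res) {
    ++concurrentServerRequests;
    ++totalServerRequests;
    console.log('Concurrent active server requests:' + concurrentServerRequests +
                ', total received:' + totalServerRequests);

    // respond after a short delay so requests pile up
    setTimeout(function() {
        --concurrentServerRequests;
        res.end('ok');
    }, 100);
}).listen(1337, '127.0.0.1');

console.log('Server running at http://127.0.0.1:1337');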

When we run this code we see:

$ node maxsockets.js

Server running at http://127.0.0.1:1337
Concurrent active server requests:1, total received:1
Concurrent active server requests:2, total received:2
Concurrent active server requests:3, total received:3
Concurrent active server requests:4, total received:4
Concurrent active server requests:5, total received:5
Concurrent active server requests:5, total received:6
Concurrent active server requests:5, total received:7
Concurrent active server requests:5, total received:8
Concurrent active server requests:5, total received:9
Concurrent active server requests:5, total received:10

...
Concurrent active server requests:5, total received:98
Concurrent active server requests:5, total received:99
Concurrent active server requests:5, total received:100

As you can see, the server never receives more than 5 simultaneous requests from the client. To change this behaviour globally, we can modify the http.globalAgent.maxSockets value; this allows us to specify how many open sockets we can have at one time, and the change will then apply to all http.requests made:
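
The snippet was originally embedded; the change (a sketch) is just:

var http = require('http');

// allow up to 10 concurrent sockets per host:port instead of the default 5
http.globalAgent.maxSockets = 10;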

Running the code above, our output now shows 10 simultaneous requests:

$ node maxsockets.js
Server running at http://127.0.0.1:1337
Concurrent active server requests:1, total received:1
Concurrent active server requests:2, total received:2
Concurrent active server requests:3, total received:3
Concurrent active server requests:4, total received:4
Concurrent active server requests:5, total received:5
Concurrent active server requests:6, total received:6
Concurrent active server requests:7, total received:7
Concurrent active server requests:8, total received:8
Concurrent active server requests:9, total received:9
Concurrent active server requests:10, total received:10
Concurrent active server requests:10, total received:11
Concurrent active server requests:10, total received:12
Concurrent active server requests:10, total received:13
Concurrent active server requests:10, total received:14
...
Concurrent active server requests:10, total received:98
Concurrent active server requests:10, total received:99
Concurrent active server requests:10, total received:100

To demonstrate how the connections are pooled per socket (IP address + port combination), let's create 2 servers, listening on ports 1337 and 1338. In this case we can see that the client code is sending 20 simultaneous requests, 10 to each server (I'm leaving in the maxSockets = 10 line):
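
The embedded code isn't shown; a compact sketch of the two-server variant (the counting logic from the earlier server sketch is elided here for brevity):

var http = require('http');

http.globalAgent.maxSockets = 10;

function handler(req, res) {
    // same counting handler as in the server sketch above
    setTimeout(function() { res.end('ok'); }, 100);
}

// two different ports means two different sockets, so two separate pools
http.createServer(handler).listen(1337, '127.0.0.1');
http.createServer(handler).listen(1338, '127.0.0.1');

[1337, 1338].forEach(function(port) {
    for (var i = 0; i < 100; i++) {
        http.request({ host: '127.0.0.1', port: port, path: '/' }, function(res) {
            res.on('data', function() {});
        }).end();
    }
});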

$ node maxsockets.js
Server running at http://127.0.0.1:1337
Server running at http://127.0.0.1:1338
Concurrent active server requests:1, total received:1
Concurrent active server requests:2, total received:2
Concurrent active server requests:3, total received:3
Concurrent active server requests:4, total received:4
Concurrent active server requests:5, total received:5
Concurrent active server requests:6, total received:6
Concurrent active server requests:7, total received:7
Concurrent active server requests:8, total received:8
Concurrent active server requests:9, total received:9
Concurrent active server requests:10, total received:10
Concurrent active server requests:11, total received:11
Concurrent active server requests:12, total received:12
Concurrent active server requests:13, total received:13
Concurrent active server requests:14, total received:14
Concurrent active server requests:15, total received:15
Concurrent active server requests:16, total received:16
Concurrent active server requests:17, total received:17
Concurrent active server requests:18, total received:18
Concurrent active server requests:19, total received:19
Concurrent active server requests:20, total received:20
Concurrent active server requests:20, total received:21
Concurrent active server requests:20, total received:22
Concurrent active server requests:20, total received:23
...
Concurrent active server requests:20, total received:198
Concurrent active server requests:20, total received:199
Concurrent active server requests:20, total received:200

As well as defining a global maxSockets value, http.request takes an optional Agent instance in its options parameter; if you pass in false, connection pooling is turned off:
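
A sketch of a single request that opts out of the pool:

var http = require('http');

http.request({
    host: '127.0.0.1',
    port: 1337,
    path: '/',
    // agent: false disables connection pooling for this request
    agent: false
}, function(res) {
    res.on('data', function() {});
}).end();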

Did you notice what I did above? No? It's subtle, but I just embedded an arbitrary part of another webpage in my webpage - freaking awesome. No, it's not magic, it's clipboard.com. If you want to try it out, just send me an email at mark@clipboard.com and I'll be happy to send you an invite :)

February 9, 2012
Specifying the image to use when your page is shared on Facebook

If you want to specify the image that is shown when your webpage is shared on Facebook, like in the example below:

You need to add a bit of Open Graph metadata in your page:

Gotchas

I ran into two gotchas. The first was that my image was not showing up; it turns out Facebook doesn't like it when you leave the protocol off the image URL. For example, I had something like //www.clipboard.com/foo.png. Having the URL start with // is a neat trick that makes the browser load the image over http if the page is http, or https if the page is https; it stops you having to litter your code with checks for the protocol your page is on. For Facebook, though, you have to explicitly list http:// or https:// in your URLs. I found this was causing the problem by using the Facebook Linter: you specify your URL and it shows you if there are any problems with your page - pretty handy!

The second problem was that the image I specified in the og:image meta value was not being picked up by Facebook; it had cached the previous share image. The way I found to refresh the cache was to run the Facebook Linter with the URL of the page, and that seemed to refresh the cache Facebook was using for the page.

February 7, 2012
Sharing JavaScript code between the browser and node.js

One of the benefits of writing your server code in JavaScript is the ability to easily share code between your frontend and backend. In the rest of this post I will describe how you can share JavaScript code, using a technique that I use at clipboard.com.

First, a quick intro into how you include JavaScript in your node code and client code. In a browser you can inline JavaScript in a page or include a reference to a JavaScript file that should be loaded. Typically it is best practice not to litter the global namespace with functions and variables (which could cause problems if your file were included in a page that had other JavaScript using the same variable or function names), so you will normally see a top-level object hanging off the window object that contains all of a library's code, e.g.
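
The embedded example isn't visible here; a sketch of the browser-side pattern being described, using a hypothetical window.common object and add helper:

// in the browser: everything hangs off a single global object
window.common = {
    add: function(a, b) {
        return a + b;
    }
};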

In node there is a module loading system: when you define a module you export functions from it, which can then be accessed from other parts of the code that load the module. For example, if I have a module that contains common code, I would create a file called common.js, then inside common.js I would export my functions by attaching them to the exports object, like so:
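
Again the embedded snippet is missing; a sketch of the node version of the same hypothetical helper:

// common.js (node) - attach functions to the exports object
exports.add = function(a, b) {
    return a + b;
};

// in another file:
// var common = require('./common');
// common.add(1, 2);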

For more detailed documentation on the node module system, see http://nodejs.org/docs/latest/api/modules.html.

With the above in mind, it is obvious that in common code we can't use objects that are only available in node, e.g. the process object, or functionality only available in the client code, e.g. the window object. Shared code is pure JavaScript, just logic; it must be agnostic to the hosting environment under which the JavaScript is being run.

To share code, we will make the client code look like node code with respect to the export process, so we will put all of our functions onto an export object. If the code is loaded from node, nothing changes; if the code is loaded from a browser, the export object will actually just be an object we define, like the window.common object in the example above. To know if the code has been loaded in node or in a browser, you do a simple check to see if the process object exists (it only exists in node). I do a double check to see if the process object has a version property, just in case a "process" object exists in the client code; it is less likely to also have a version field like the node process object. Below is an example showing how this all comes together:

Imagine you have a file called common.js that contains some helper functions, to use this file in both your client side JavaScript and node.js you would …
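
The combined example was also an embedded snippet; here is a sketch of how common.js might detect its environment and export accordingly (the add helper is just an illustration):

// common.js - shared between node and the browser (a sketch)
(function() {
    // node has a process object with a version property; the browser does not
    var isNode = (typeof process !== 'undefined') && !!process.version;
    var exportTarget = isNode ? exports : (window.common = {});

    exportTarget.add = function(a, b) {
        return a + b;
    };
})();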

February 5, 2012
It’s been a while

Wow - it's been a while since I last posted to this blog. I've been meaning for quite a while to get back to blogging and writing. It's interesting to see how my life has changed since I last blogged: back in December 2010 I was working at Microsoft, living a life of C# and Silverlight, working with the Bing Maps team. Fast forward to 2012: I quit Microsoft and joined a great startup called www.clipboard.com and now work with a completely different technology stack - JavaScript, Node.js, redis, riak - not a hint of Microsoft.

A big change of pace and an exciting challenge, I plan to write some posts about working at a startup, so stay posted.

December 18, 2010
Awesome helmet cam video

Makes me want to go to the mountains right now!

Superior, Speed Fly from Marshall Miller on Vimeo.

October 30, 2010
John Carmack talking about Rage on the iPhone

John Carmack talks about some of the technical challenges of getting Rage running on the iPhone, at an impressive 60fps!

October 26, 2010
Electroadhesion - a wall climbing robot

A very cool invention: a robot that can climb walls using a technique that produces electrical adhesion. Think of it as temporary sticky tape that doesn't leave any residue.

Original news.com article

October 14, 2010
Radiohead - house of cards

The music video was shot without a traditional camera; instead, the whole video was captured using two technologies, Geometric Informatics and Velodyne LIDAR, to capture 3D points. Watch the video and see the "how it's made" video for more info. You can also play with the actual data in a 3D viewer: http://code.google.com/creative/radiohead/viewer.html

October 6, 2010
JavaScript programming patterns

http://www.klauskomenda.com/code/javascript-programming-patterns/
