
Nodeexpress 2 9 2015 1
Middleware in Node.js

Middleware

We have used existing middleware (body-parser, cookie-parser, static, and connect-session, to name a few), and we’ve even written some of our own (when we check for the presence of &test=1 in the querystring, and our 404 handler). But what is middleware, exactly?

Conceptually, middleware is a way to encapsulate functionality; specifically, functionality that operates on an HTTP request to your application. Practically, a middleware is simply a function that takes three arguments: a request object, a response object, and a “next” function.
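As a minimal sketch of that three-argument shape, the function below can be exercised without Express at all by handing it stand-in objects. The name logRequest and the stub req and res objects are assumptions made for illustration; in a real app, Express supplies its own request and response objects.

```javascript
// A middleware is just a function of (req, res, next).
// logRequest is a hypothetical name used only for this sketch.
function logRequest(req, res, next) {
    req.seenBy = (req.seenBy || []).concat('logRequest'); // annotate the request
    next();                                               // hand off to whatever comes next
}

// Stand-in objects; Express would supply much richer ones.
var req = { url: '/about' };
var res = {};
var nextCalls = 0;

logRequest(req, res, function next() { nextCalls++; });

console.log(req.seenBy);   // [ 'logRequest' ]
console.log(nextCalls);    // 1
```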

Middleware is executed in what’s known as a pipeline. You can imagine a physical pipe that carries water. The water gets pumped in at one end, and then there are gauges and valves before the water gets where it’s going. The important part about this analogy is that order matters: if you put a pressure gauge before a valve, it has a different effect than if you put the pressure gauge after the valve. Similarly, if you have a valve that injects something into the water, everything “downstream” from that valve will contain the added ingredient. In an Express app, you insert middleware into the pipeline by calling app.use.
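To make the “order matters” point concrete, here is a toy model of a pipeline. It is not Express’s implementation: runPipeline is a hypothetical helper that simply calls each function in an array in turn, much as the order of app.use calls determines the order of execution. Anything a function adds to req is visible only downstream of it.

```javascript
// Toy stand-in for the pipeline: call each middleware in array order.
// runPipeline is a hypothetical helper, not part of Express.
function runPipeline(middlewares, req, res) {
    var i = 0;
    function next() {
        var fn = middlewares[i++];
        if (fn) fn(req, res, next);
    }
    next();
}

var seen = [];

// Returns a middleware that records what the "water" contains at this point.
function gauge(label) {
    return function(req, res, next) {
        seen.push(label + ': ' + (req.additive || 'nothing added'));
        next();
    };
}

// Injects something into the request; everything downstream sees it.
function valve(req, res, next) {
    req.additive = 'fluoride';
    next();
}

runPipeline([gauge('before valve'), valve, gauge('after valve')], {}, {});

console.log(seen);
// seen is ['before valve: nothing added', 'after valve: fluoride']
```

Swapping the positions of valve and the first gauge would make both gauges report the additive, which is the whole point of the analogy.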

Prior to Express 4.0, the pipeline was complicated by having to link the router in explicitly. Depending on where you linked in the router, routes could be linked in out of order, making the pipeline sequence less clear when you mixed middleware and route handlers. In Express 4.0, middleware and route handlers are invoked in the order in which they were linked in, making it much clearer what the sequence is.

It’s common practice to have the very last middleware in your pipeline be a “catch all” handler for any request that doesn’t match any other routes. This middleware usually returns a status code of 404 (Not Found).

So how is a request “terminated” in the pipeline? That’s what the next function passed to each middleware does: if you don’t call next(), the request terminates with that middleware.

Learning how to think flexibly about middleware and route handlers is key to understanding how Express works. Here are the things you should keep in mind:

Route handlers (app.get, app.post, etc.—often referred to collectively as app.VERB) can be thought of as middleware that handle only a specific HTTP verb (GET, POST, etc.).

Conversely, middleware can be thought of as a route handler that handles all HTTP verbs (this is essentially equivalent to app.all, which handles any HTTP verb; there are some minor differences with exotic verbs such as PURGE, but for the common verbs, the effect is the same).

Route handlers require a path as their first parameter. If you want that path to match any route, simply use /*. Middleware can also take a path as its first parameter, but it is optional (if it is omitted, it will match any path, as if you had specified /*).

Route handlers and middleware take a callback function that takes two, three, or four parameters (technically, you could also have zero or one parameters, but there is no sensible use for these forms). If there are two or three parameters, the first two parameters are the request and response objects, and the third parameter is the next function. If there are four parameters, it becomes an error-handling middleware, and the first parameter becomes an error object, followed by the request object, the response object, and the next function.

If you don’t call next(), the pipeline will be terminated, and no more route handlers or middleware will be processed. If you don’t call next(), you should send a response to the client (res.send, res.json, res.render, etc.); if you fail to do so, the client will hang and eventually time out.

If you call next(), it’s generally inadvisable to send a response to the client. If you do, middleware or route handlers that are further down the pipeline will still be executed, but any client responses they send will be ignored.
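The parameter-count rule in the list above works because a JavaScript function exposes its declared parameter count as its length property, and Express checks that count to decide whether a function is an error handler. The sketch below only inspects that property; the handler names and the looksLikeErrorHandler helper are illustrative, not part of Express.

```javascript
// Handler names here are hypothetical; only the parameter counts matter.
function routeHandler(req, res) {}
function middleware(req, res, next) {}
function errorHandler(err, req, res, next) {}

// Function.prototype.length reports the number of declared parameters.
console.log(routeHandler.length);   // 2
console.log(middleware.length);     // 3
console.log(errorHandler.length);   // 4

// Illustrative check: only a four-parameter function counts as an error handler.
function looksLikeErrorHandler(fn) {
    return fn.length === 4;
}

console.log(looksLikeErrorHandler(errorHandler));                // true
console.log(looksLikeErrorHandler(function(err, req, res) {}));  // false
```

One practical consequence: an error handler has to declare all four parameters even if it never uses next, or it will be treated as ordinary middleware.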

Let’s try some really simple middlewares:

app.use(function(req, res, next){
    console.log('processing request for "' + req.url + '"....');
    next();
});

app.use(function(req, res, next){
    console.log('terminating request');
    res.send('thanks for playing!');
    // note that we do NOT call next() here...this terminates the request
});

app.use(function(req, res, next){
    console.log('whoops, i\'ll never get called!');
});

Here we have three middlewares. The first one simply logs a message to the console before passing on the request to the next middleware in the pipeline by calling next(). Then the next middleware actually handles the request. Note that if we omitted the res.send here, no response would ever be returned to the client. Eventually the client would time out. The last middleware will never execute, because all requests are terminated in the prior middleware.

Now let’s consider a more complicated scenario:

var app = require('express')();

// runs for every request, whatever the path or verb
app.use(function(req, res, next){
    console.log('\n\nALWAYS');
    next();
});

app.get('/a', function(req, res){
    console.log('/a: route terminated');
    res.send('a');
});

app.get('/a', function(req, res){
    console.log('/a: never called');
});

app.get('/b', function(req, res, next){
    console.log('/b: route not terminated');
    next();
});

// only reached by requests that were not terminated above
app.use(function(req, res, next){
    console.log('SOMETIMES');
    next();
});

app.get('/b', function(req, res, next){
    console.log('/b (part 2): error thrown' );
    throw new Error('b failed');
});

// four parameters: error-handling middleware for /b
app.use('/b', function(err, req, res, next){
    console.log('/b error detected and passed on');
    next(err);
});

app.get('/c', function(req, res){
    console.log('/c: error thrown');
    throw new Error('c failed');
});

// calling next() without the error resumes normal processing
app.use('/c', function(err, req, res, next){
    console.log('/c: error detected but not passed on');
    next();
});

app.use(function(err, req, res, next){
    console.log('unhandled error detected: ' + err.message);
    res.status(500).send('500 - server error');
});

app.use(function(req, res){
    console.log('route not handled');
    res.status(404).send('404 - not found');
});

app.listen(3000, function(){
    console.log('listening on 3000');
});

Before going over this example, try to guess what the result might turn out to be. What are the different routes? What will the client see? What will be printed on the console? If you can correctly answer all of those questions, then you’ve got the hang of routes in Express! Pay particular attention to the difference between a request to /b and a request to /c; in both instances, there was an error, but one results in a 404 and the other results in a 500.
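If you want to check your answer for /b and /c without running a server, the following is a deliberately simplified model of how an error travels through the pipeline. It is not Express: dispatch and trace are hypothetical helpers, paths are matched exactly, and everything is synchronous. It only encodes the rules described earlier: a thrown error skips ahead to four-parameter handlers, and an error handler that calls next() with no argument returns the request to normal processing.

```javascript
// Toy dispatcher: NOT Express, just enough to show how errors travel.
// Each layer is { path, method, fn }; a missing path or method matches anything.
function dispatch(layers, req, res) {
    var i = 0;
    function next(err) {
        var layer = layers[i++];
        if (!layer) return;
        var pathOk = !layer.path || req.url === layer.path;
        var methodOk = !layer.method || req.method === layer.method;
        var isErrorHandler = layer.fn.length === 4;
        // With a pending error only error handlers run; otherwise only regular ones.
        if (!pathOk || !methodOk || isErrorHandler !== Boolean(err)) return next(err);
        try {
            if (err) layer.fn(err, req, res, next);
            else layer.fn(req, res, next);
        } catch (e) {
            next(e);   // a thrown error is treated like next(e)
        }
    }
    next();
}

// Runs a condensed version of the example above and returns what happened.
function trace(url) {
    var log = [];
    var res = { send: function(body) { log.push('sent: ' + body); } };
    var layers = [
        { fn: function(req, res, next) { log.push('ALWAYS'); next(); } },
        { path: '/b', method: 'GET', fn: function(req, res, next) { throw new Error('b failed'); } },
        { path: '/b', fn: function(err, req, res, next) { log.push('/b error passed on'); next(err); } },
        { path: '/c', method: 'GET', fn: function(req, res) { throw new Error('c failed'); } },
        { path: '/c', fn: function(err, req, res, next) { log.push('/c error not passed on'); next(); } },
        { fn: function(err, req, res, next) { res.send('500 - server error'); } },
        { fn: function(req, res) { res.send('404 - not found'); } }
    ];
    dispatch(layers, { url: url, method: 'GET' }, res);
    return log;
}

console.log(trace('/b'));   // ends with 'sent: 500 - server error'
console.log(trace('/c'));   // ends with 'sent: 404 - not found'
```

The only difference between the two traces is whether the path-specific error handler forwards the error with next(err) or drops it with next().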

Note that middleware must be a function. Keep in mind that in JavaScript, it’s quite easy (and common) to return a function from a function. For example, you’ll note that express.static is a function, but we actually invoke it, so it must return another function. Consider:

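app.use(express.static);        // this will NOT work as expected

// express.static must be invoked with a root directory to get the actual
// middleware; the '/public' directory here is only an example
console.log(typeof express.static(__dirname + '/public'));
                                // will log "function", indicating
                                // that express.static is a function
                                // that itself returns a function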
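The same function-returning-a-function pattern is how you might write configurable middleware of your own. As a sketch, requireFlag below is a hypothetical factory (not part of Express) that takes a querystring parameter name and returns the actual middleware; the stub objects stand in for Express’s request and response.

```javascript
// Hypothetical factory: returns a middleware configured with `name`.
function requireFlag(name) {
    return function(req, res, next) {
        if (req.query && req.query[name] === '1') return next();
        res.send('missing ' + name);   // terminate: do not call next()
    };
}

var middleware = requireFlag('test');   // invoke the factory to get the middleware
console.log(typeof requireFlag);        // function (the factory)
console.log(typeof middleware);         // function (what app.use expects)

// Exercise it with stand-in objects instead of a running Express app.
var sent = [];
var passed = 0;
var res = { send: function(body) { sent.push(body); } };

middleware({ query: { test: '1' } }, res, function() { passed++; });
middleware({ query: {} }, res, function() { passed++; });

console.log(passed);   // 1
console.log(sent);     // [ 'missing test' ]
```

In an Express app this would be linked in as app.use(requireFlag('test')), with the factory invoked, just as express.static is.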