In this tutorial I will show how to stream a file uploaded by a user to a file storage server (such as Amazon S3 or any other custom server) without storing the file in a temporary directory. The advantage is that when there are many parallel uploads of large files, we neither fill the server's filesystem with temporary files nor buffer whole files in memory, so neither disk nor memory gets exhausted.
We route uploads through our own server, instead of letting users upload directly to the storage server, when we need to verify user authentication or to modify or scan the files.
At the time of writing this tutorial, the latest stable version of the Express framework is 4.13.3. Since version 4, Express no longer inherits from the connect module. In earlier versions, Express used connect for its middleware and route-definition functionality and was itself responsible mainly for rendering views and a few other features.
Express 4 has its own middleware and route-definition functionality and inherits directly from the built-in HTTP module of Node.js. Earlier, Express inherited from connect and connect inherited from the HTTP module.
The good news is that you can still use middleware and callback functions written for connect, because the signature of these functions remains the same in Express.
Parsing “multipart/form-data” Content-Type
Express doesn't parse the body of POST requests. Earlier versions of Express used the body-parser module, which came built in with connect, to parse POST request bodies. Now body-parser must be installed separately because Express no longer uses connect.
The problem is that body-parser doesn't parse an HTTP POST request body whose type is multipart/form-data, so it is of no use when you are uploading files. Instead we need the multiparty or connect-multiparty module to parse the POST request body when the Content-Type is multipart/form-data.
connect-multiparty stores each uploaded file in a temporary directory, whereas multiparty also exposes a readable stream for every uploaded file so you can consume its content directly. connect-multiparty is actually built on top of multiparty. For our use case we need the multiparty module.
Streaming Uploaded File
Let’s start building the server to stream uploaded files to a storage server.
Here is the package.json file with the listed dependencies:
"name": "custom-server",
"dependencies": {
"express": "4.13.3",
"request": "2.64.0",
"multiparty": "4.1.2",
"form-data": "1.0.0-rc3"
}
}
Here is the Node.js code to create the server:
var multiparty = require("multiparty");
app.post("/submit", function(httpRequest, httpResponse, next){
var form = new multiparty.Form();
form.on("part", function(part){
if(part.filename)
{
var FormData = require("form-data");
var request = require("request")
var form = new FormData();
form.append("thumbnail", part, {filename: part.filename,contentType: part["content-type"]});
var r = request.post("http://localhost:7070/store", { "headers": {"transfer-encoding": "chunked"} }, function(err, res, body){
httpResponse.send(res);
});
r._form = form
}
})
form.on("error", function(error){
console.log(error);
})
form.parse(httpRequest);
});
app.get("/", function(httpRequest, httpResponse, next){
httpResponse.send("<form action='http://localhost:9090/submit' method='post' enctype='multipart/form-data'><input type='file' name='thumbnail' /><input type='submit' value='Submit' /></form>");
});
app.listen(9090);
In the above code the part event is triggered for each part encountered in the body of the request. When the part represents a file (it has a filename), the part parameter of the callback is a readable stream that yields the content of that particular file.
We then construct an HTTP POST request using the request module to forward the file to the storage server. The second parameter of the form.append method takes a readable stream; form-data pipes its data into the outgoing writable stream, which continuously sends data to the storage server without saving it anywhere. Whenever a chunk of data arrives from the client for the file, the data event of the readable stream fires, and the buffer holding the chunk is released automatically once the data event's callback finishes executing.
As we don't know the size of the files in advance, we set the transfer-encoding: chunked header instead of a content-length header.
To run this code make sure you are running a storage server at URL http://localhost:7070/store.
If you are streaming the file to Amazon S3, you will also need to set the `x-amz-decoded-content-length` header.