Introduction

The Nginx Ticketed Upload module is based on the Upload Progress module written by Brice Figureau. We’ve mostly added the code responsible for handling upload tickets (plus some necessary modifications to Brice’s part).

Use case #1 – validating an upload before the whole file is sent

I’ll just give a few pointers here (I hope to write something more detailed in the future).

There are many ways of solving this problem. One is an Nginx module that does the whole job itself. The huge shortcoming is that all the validation logic has to be written in C, which is a bad idea for most companies (web developers probably won’t be able to do that, and it would be rather tedious work anyway). Another solution is a module that asks some application about the upload through FastCGI. The downside is that this might take a little too long, and the incoming bytes would have to be buffered somewhere on the socket while the FastCGI script is processing the request.

With my module you can write the validation logic in whatever language you like, and it avoids the network problems described above. Here’s how it would work:

  • the ticket-generating location is configured as an internal location, so it’s not possible to get a ticket by just requesting it
  • the upper-level application validates the client and does an internal redirect to that internal, ticket-generating location by setting the special Nginx header X-Accel-Redirect
  • the ticket is generated and sent to the client
  • the client can now upload the file using the ticket (tickets ensure the upload session is valid; and, of course, the validation is fast)
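The application side of the steps above can be sketched as a tiny WSGI app. This is only an illustration under assumptions: the internal path `/internal/ticket`, the `X-Api-Key` check and the key value are all hypothetical, and the real ticket-generating location depends on how the module is configured. Only the `X-Accel-Redirect` header itself is standard Nginx behaviour.

```python
# Hypothetical internal location that generates tickets; the real path
# depends on your Nginx configuration.
TICKET_LOCATION = "/internal/ticket"


def is_allowed(environ):
    """Placeholder validation: accept only requests carrying a known key.

    A real application would check a session, quota, file type, etc.
    """
    return environ.get("HTTP_X_API_KEY") == "secret-demo-key"


def app(environ, start_response):
    """WSGI entry point: validate the client, then hand over to Nginx."""
    if not is_allowed(environ):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"upload not allowed\n"]
    # Nginx intercepts this header and serves the internal location itself,
    # so the client receives the ticket instead of this empty body.
    start_response("200 OK", [("X-Accel-Redirect", TICKET_LOCATION)])
    return [b""]
```

Any WSGI server placed behind Nginx can run `app`. The point is that the validation logic lives in ordinary application code, while the ticket itself is still produced by Nginx.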

Use case #2 – distributed file upload site

Why?

Large-scale websites must deal with heavy traffic. Usually, this is done by dividing the job of handling requests into fair parts, each handled by a different server. The division is done by a server called a load balancer. A load balancer might be implemented in many ways, most notably as a reverse proxy or as an LVS. In the former, all the traffic goes through the load balancer, which is not appropriate in some situations. The latter is a much more complicated solution, but the important thing is that only the incoming traffic must go through the central server. It’s a perfect solution for most websites: usually, the incoming traffic is not very heavy, as it mostly consists of just the HTTP request headers.

This situation changes when we’re talking about websites that accept file uploads. The incoming traffic is much bigger than the outgoing. Here, plain LVS is not as good a solution as it was for typical websites.

What can we do?

There are probably many good solutions. One of these is my (well, to be accurate, the copyright actually belongs to DreamLab) Nginx module.

Imagine a simple model (I hope your eyes won’t start to bleed):

The above diagram doesn’t show everything it should. So I’ll just add some facts:

  • servers tagged as “C” have their own external addresses
  • the “B” server has a reverse proxy/LVS installed and forwards the requests to the “C” servers
  • the “C” servers have Nginx with my module installed

The algorithm of uploading a file works this way:

  1. A client application within the “A” cloud (WAN) wants to send a file. It only knows the address of the “B” server.
  2. It connects to the “B” server. The “B” server forwards the connection to one of the “C” servers.
  3. The “C” server responds to the client. The response contains its external address (so the client can later upload the file directly to the server, and the traffic won’t go through “B”) and a special sequence of bytes called a ticket, which will be used to identify the session (the ticket contains information that makes it possible to validate the connection details in the future).
  4. The client starts uploading the file directly to the server, attaching the ticket (in the POST request). The server validates the ticket and, depending on the result of the validation, accepts the upload or sends an error code and breaks the connection.
  5. During the upload process, the client can connect to the server and get the status of the upload. (Yeah, it’s possible to make fancy upload progress bars with this using AJAX!)
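To make steps 3 and 4 more concrete, here is one way a self-validating ticket could work. This is a generic sketch, not the module's actual ticket format: the HMAC signing scheme, the `SECRET` key, the ticket fields and the function names `make_ticket` and `check_ticket` are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import time

SECRET = b"demo-secret"  # hypothetical key known only to the "C" servers


def make_ticket(client_ip, ttl=300, now=None):
    """Issue a ticket binding the client's address to an expiry time."""
    issued = now if now is not None else time.time()
    payload = f"{client_ip}|{int(issued + ttl)}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    return base64.urlsafe_b64encode(payload + b"|" + sig).decode()


def check_ticket(ticket, client_ip, now=None):
    """Return True only for an untampered, unexpired ticket from client_ip."""
    try:
        raw = base64.urlsafe_b64decode(ticket.encode())
        payload, sig = raw.rsplit(b"|", 1)
        ip, expires = payload.decode().rsplit("|", 1)
        expires = int(expires)
    except (ValueError, UnicodeDecodeError):
        return False  # malformed ticket
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        return False  # signature mismatch: forged or altered ticket
    current = now if now is not None else time.time()
    return ip == client_ip and current < expires
```

Because everything needed for the check travels inside the ticket, the server receiving the upload can accept or reject it locally, without asking another application. That is what keeps the validation fast.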

For more details, check the low-level documentation of the module.