This guide describes Fastly's Request Collapsing feature, frequently used when creating advanced service configurations.
NOTE: This guide requires advanced knowledge of Varnish and the VCL language.
Request Collapsing causes simultaneous cache misses within a single Fastly datacenter to be "collapsed" into a single request to an origin server. While the single request is being processed by the origin, the other requests wait in a queue for it to complete. Two types of Request Collapsing exist:
- Collapsing on a single cache server
- Collapsing within the datacenter between cache servers
Each cache server will automatically queue duplicate requests for the same hash and only allow one request to origin. You can disable this behavior by setting
Within a datacenter, not every cache stores every object. Only two servers in each datacenter will store an object: one as a primary and one as a backup. Only those two servers will fetch the object from origin.
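The queueing behavior on a single cache server can be modeled with a small Python sketch. This is an illustration of the general "collapse concurrent misses per hash" idea, not Fastly's implementation: the `CollapsingCache` class, the `slow_origin` function, and the `/index.html` key are all hypothetical names invented for this example.

```python
import threading
import time
from concurrent.futures import Future


class CollapsingCache:
    """Toy model of one cache server that collapses concurrent misses per hash."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical origin callable
        self._lock = threading.Lock()
        self._objects = {}               # hash -> cached body
        self._in_flight = {}             # hash -> Future shared by queued requests

    def get(self, key):
        with self._lock:
            if key in self._objects:     # cache hit: no origin traffic
                return self._objects[key]
            pending = self._in_flight.get(key)
            leader = pending is None
            if leader:                   # first miss: this request goes to origin
                pending = Future()
                self._in_flight[key] = pending
        if not leader:
            return pending.result()      # queued request waits for the leader
        try:
            body = self._fetch(key)
        except Exception as exc:
            with self._lock:
                del self._in_flight[key]
            pending.set_exception(exc)   # every queued request sees the failure
            raise
        with self._lock:
            self._objects[key] = body
            del self._in_flight[key]
        pending.set_result(body)
        return body


origin_calls = []


def slow_origin(key):
    """Stand-in for an origin server that takes a while to respond."""
    origin_calls.append(key)
    time.sleep(0.1)
    return f"body for {key}"


cache = CollapsingCache(slow_origin)
results = []
threads = [
    threading.Thread(target=lambda: results.append(cache.get("/index.html")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this model, eight simultaneous requests for the same key produce one call to `slow_origin`; the other seven either wait on the shared `Future` or are served from the cache once the object is stored.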
How it works
In Fastly's version of Varnish, VCL subroutines often run on different caches during a request. For a particular request, both an edge node and a cluster node will exist (though a single cache can, in some cases, fulfill both of these roles). The edge node receives the HTTP request from the client and determines via a hash which server in the datacenter is the cluster node. If the edge node determines that it is itself the cluster node and has the object in cache, it fulfills both roles.
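As a rough illustration of how an edge node might map a request hash to a cluster node, the following Python sketch ranks servers with rendezvous hashing. The guide does not say which hashing scheme Fastly uses, so the scheme here is an assumption, and the server names and the `rank_servers` and `route` functions are hypothetical.

```python
import hashlib


def rank_servers(object_hash, servers):
    """Order servers by a per-object score (rendezvous hashing, an assumed scheme)."""
    def score(server):
        digest = hashlib.sha256(f"{server}|{object_hash}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(servers, key=score, reverse=True)


def route(edge, object_hash, servers):
    """Decide which servers hold the object and whether the edge is also the cluster node."""
    primary, backup = rank_servers(object_hash, servers)[:2]
    return {
        "edge": edge,
        "cluster": primary,            # server expected to store and fetch the object
        "backup": backup,              # second server that may store the object
        "same_node": edge == primary,  # one cache fulfills both roles
    }


servers = ["cache-1", "cache-2", "cache-3", "cache-4"]  # hypothetical datacenter
plan = route("cache-1", "/index.html", servers)
```

Because the ranking depends only on the object hash and the server list, every edge node in this model picks the same primary and backup for a given object, which is what lets misses from different edge nodes converge on the same cluster node.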
Certain VCL subroutines run on the edge node and some on the cluster node:
- Edge Node:
- Cluster Node:
Determining if a cache is an edge or a cluster node
The bereq.is_clustering VCL variable will be true if the request was forwarded to another machine in the cluster.
Keep in mind the following limitations when using the Request Collapsing feature:
- req.http.* headers are not transferred from the cluster node back to the edge node. Remember this when writing advanced configurations that use headers to keep track of state. If you set a req.http.* header in any of the subroutines that run on the cluster node, expect that the change will not persist on the edge node.
- A single, slow request to origin can sometimes cause a great many other requests for the same object to hang and fail. Because many requests for a single object are being collapsed down to one, they all succeed or fail based on the request that reaches the origin.
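The shared-fate limitation can be seen in a minimal Python sketch. It is a model of the general behavior under stated assumptions rather than Fastly's code: `collapsed_fetch`, `failing_origin`, and the `/report` key are hypothetical, and the origin failure is simulated with a short sleep followed by a timeout error.

```python
import asyncio

origin_calls = []


async def collapsed_fetch(key, in_flight, fetch):
    """Share one in-flight origin fetch among all requests for the same key."""
    task = in_flight.get(key)
    if task is None:
        task = asyncio.ensure_future(fetch(key))
        in_flight[key] = task
        # Forget the fetch once it finishes so a later request can retry.
        task.add_done_callback(lambda _t: in_flight.pop(key, None))
    return await task


async def failing_origin(key):
    """Stand-in for a slow origin request that eventually fails."""
    origin_calls.append(key)
    await asyncio.sleep(0.05)
    raise TimeoutError(f"origin timed out for {key}")


async def main():
    in_flight = {}
    waiters = [collapsed_fetch("/report", in_flight, failing_origin) for _ in range(5)]
    # return_exceptions=True collects each request's outcome instead of raising.
    return await asyncio.gather(*waiters, return_exceptions=True)


outcomes = asyncio.run(main())
```

Here five requests are collapsed into one origin call, and all five end with the same `TimeoutError`, because each of them was waiting on the single request that reached the origin.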