Holding/Waiting Page
Hello!
I’d like to display a holding page when the number of concurrent users reaches a set threshold. I’ve looked high and low and I can’t find any documentation on how to get started with it. I looked in the VCL book too. Now, I am new to VCL, so I may be searching for the wrong thing. Could anyone point me in the right direction?
Thanks
BB
-
I'd actually want to try both: first would be reqs/s, then moving to API calls from the origin hosts later, rather than response codes.
The idea would be to rate limit incoming connections and show those that are waiting a nice little countdown that lets them in after a set number of minutes. In essence, a virtual waiting room for when things get busy at the back end. I can code; I just need to know what to code!
I'm guessing I need to count reqs/s or user sessions from Varnish (call it "Y"). Once the count reaches X, new sessions over Y get put into a holding queue and redirected to a page with a countdown; once the countdown has expired, they are allowed into the site. I'm thinking I could do the holding by setting a cookie and checking the expiry.
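Something like this is what I'm picturing for the cookie part (a very rough sketch only; the X-Over-Threshold header stands in for whatever counting mechanism I still need, and the 120-second hold is arbitrary):

```
sub vcl_recv {
  declare local var.waited INTEGER;

  # X-Over-Threshold stands in for the rate/session counting logic
  # I'm asking about -- not shown here.
  if (req.http.X-Over-Threshold) {
    if (!req.http.Cookie:waiting_since) {
      # New visitor while over the threshold: start their timer and
      # send them to the holding page.
      set req.http.tmp-New-Waiter = "1";
      error 751 "Waiting room";
    }
    # Returning waiter: let them in once 120 seconds have passed.
    set var.waited = std.atoi(now.sec);
    set var.waited -= std.atoi(req.http.Cookie:waiting_since);
    if (var.waited < 120) {
      error 751 "Waiting room";
    }
  }
}

sub vcl_error {
  if (obj.status == 751) {
    set obj.status = 200;
    if (req.http.tmp-New-Waiter) {
      # Start the countdown; don't reset it when they refresh.
      add obj.http.Set-Cookie = "waiting_since=" now.sec "; path=/";
    }
    set obj.http.Refresh = "30"; # have the browser re-check every 30s
    synthetic {"<html><body><p>The site is busy right now; you'll be let in shortly.</p></body></html>"};
    return(deliver);
  }
}
```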
Not sure if this is the correct path to go down.
-
OK, there are a few things you ought to factor into your thinking here:
- There's no way to read the number of currently open connections to your origin in Varnish, and it wouldn't tell you much even if there were, because of connection reuse, or, in the case of HTTP/2, because all requests are multiplexed on one connection.
- A single user will be making lots of requests, so you would need to make sure that if you serve a user's HTML request, you also serve all of their subresource requests.
- If you provide a countdown and your servers are even busier when it expires, you may not be able to handle the request.
This is generally why it makes sense for your origin server to provide a signal to Fastly to let us know whether it's OK to send traffic to you, which we call a health check. If the health check were failing, we could generate a synthetic response.
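For example, a minimal sketch of that pattern (the 750 status, page content, and Retry-After value are all arbitrary choices):

```
sub vcl_recv {
  # If the origin's health check is failing, short-circuit to a
  # synthetic holding page instead of sending the request through.
  if (!req.backend.healthy) {
    error 750 "Origin busy";
  }
}

sub vcl_error {
  if (obj.status == 750) {
    set obj.status = 503;
    set obj.http.Retry-After = "60";
    synthetic {"<html><body><h1>We're a bit busy</h1><p>Please try again shortly.</p></body></html>"};
    return(deliver);
  }
}
```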
That said, if you want the edge solution, we can offer some options. These usually revolve around generating an object and hitting it on every request that you want to limit, then checking its hit count (obj.hits) and allowing the request to restart and go to origin if you have not reached some limit. Here is a demo of that. Since that solution operates in buckets, the result might be quite spiky: you'd get all requests until you hit the limit, and then nothing until you tick into the next bucket. So in theory you could do this to figure out your request rate in a rudimentary way:
- In vcl_miss (and/or vcl_pass, depending on what requests you want to rate-limit), if restarts == 0, rewrite req.url to a rate-limit URL such as "/__rate-limit/" time.sub(now, 1s), and set the backend to the service itself, or another service that can generate a small static response. Save the original URL in a header like req.http.tmp-Orig-URL.
- In deliver, if restarts == 0 and req.url is a rate-limit URL, read obj.hits and stash it in a header, like req.http.prev-request-count.
- Set req.url to "/__rate-limit/" now.
- Restart.
- In deliver, if restarts == 1 and req.url is a rate-limit URL, read obj.hits. We'll call this current-request-count.
- Work out what proportion of requests should go to origin. If you are willing to accept max-rate reqs/second, then the calculation ((prev-request-count - current-request-count) / max-rate) would tell you what factor over your maximum rate you are (let's call this result load-factor). E.g. if the load-factor is 2, we want to make twice as many requests to origin as you can actually take, so only half of the requests should actually make it to origin. The probability that we should send this request to origin is therefore (1 / load-factor) (for an example load-factor of 2, the probability is 0.5). We'll call this fetch-probability.
- Generate a random number between 0 and 1. If it's lower than fetch-probability, set req.url back to the original URL and restart. If it's higher, generate a synthetic response containing your queue page (and if you want to include a countdown, maybe use the load-factor to decide how long to wait).
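Putting those steps together, a very rough and untested sketch might look like the following. The /__rate-limit/ path, the assumed max-rate of 50 reqs/second, and the 752 status are all placeholders, and it assumes a backend that returns a small cacheable response for the rate-limit URLs, as described above:

```
sub vcl_recv {
  if (req.http.tmp-Queue) {
    # Rejected on the previous restart: serve the queue page.
    error 752 "Queue";
  }
  if (req.restarts == 0) {
    # Save the original URL and look up last second's counter bucket.
    set req.http.tmp-Orig-URL = req.url;
    set req.url = "/__rate-limit/" strftime({"%Y%m%d%H%M%S"}, time.sub(now, 1s));
  }
  return(lookup);
}

sub vcl_fetch {
  if (req.url ~ "^/__rate-limit/") {
    # Keep each one-second bucket around long enough to count hits on it.
    set beresp.ttl = 60s;
    set beresp.cacheable = true;
  }
}

sub vcl_deliver {
  declare local var.load_factor INTEGER;

  if (req.restarts == 0 && req.url ~ "^/__rate-limit/") {
    # Remember last second's hit count, then re-run against the
    # current second's bucket.
    set req.http.prev-request-count = obj.hits;
    set req.url = "/__rate-limit/" strftime({"%Y%m%d%H%M%S"}, now);
    restart;
  }

  if (req.restarts == 1 && req.url ~ "^/__rate-limit/") {
    # load-factor = (prev-request-count - current-request-count) / max-rate,
    # following the arithmetic in the steps above, with max-rate = 50.
    set var.load_factor = std.atoi(req.http.prev-request-count);
    set var.load_factor -= obj.hits;
    set var.load_factor /= 50;

    # Admit with probability 1/load-factor; randombool(1, N) is true
    # roughly one time in N.
    if (var.load_factor <= 1 || randombool(1, var.load_factor)) {
      set req.url = req.http.tmp-Orig-URL;
    } else {
      # Flag the rejection; the synthetic queue page is generated
      # from vcl_recv on the next restart.
      set req.http.tmp-Queue = "1";
    }
    restart;
  }
}

sub vcl_error {
  if (obj.status == 752) {
    set obj.status = 503;
    synthetic {"<html><body><h1>You're in the queue</h1><p>Please retry shortly.</p></body></html>"};
    return(deliver);
  }
}
```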
This is tough to implement purely in VCL, but I think it's probably possible. If you do decide to give it a go, let us know how you get on!
-
How does this relate to what's on the Fastly pages here: https://www.fastly.com/blog/fastlys-edge-modules-that-will-power-your-ecommerce-site/
Visitor Prioritization: When an ecommerce site starts to get overloaded, you can use visitor prioritization logic to determine who is just browsing your site and who is actively trying to buy something. With this logic, you can redirect casual shoppers to a waiting room, while active buyers can access the site freely and complete their transactions. Otherwise, during times of high traffic such as Black Friday and the holiday shopping season, you run the risk of both groups of users getting a "server too busy" error. And with 79% of users choosing not to buy from ecommerce websites that perform poorly, that's potentially a lot of money lost from otherwise paying customers.
I've been asking about this for quite a while, and ended up writing a custom one. What's the Fastly offering referred to on the blog?