Throttling web crawlers in VCL

Comments


  • Rex Osafo-Asare

    Hi Jeff

    Unfortunately we don’t have an out-of-the-box solution that handles rate limiting of this nature, but we can work around it by setting some conditions within the app.

    If you are aware of which bots are crawling your site, you can set up rate limiting by user-agent through the app. The VCL will allow 1 in 4 requests through to origin, and the other 3 will receive a synthetic response. A random number generator determines which requests get through. You will need to follow these steps to implement this solution:

    1. Create a synthetic 503 Service Unavailable response for the bot requests that don’t get through. See our doc page on how to create synthetic responses.

    2. After the response is created, set up a Request Condition that satisfies the following:

    req.http.User-Agent ~ "(Googlebot|Baiduspider)" && !randombool(1, 4)

    You should be able to extend the list of misbehaving bots as you see fit; a sketch of the same logic in custom VCL follows below. This should do the trick!
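
    For reference, here is a minimal sketch of the equivalent logic written as custom VCL rather than UI-configured conditions. The user-agent list, the internal status code 600, and the synthetic body text are illustrative assumptions; merge the subroutines into your service’s generated VCL boilerplate rather than copying them verbatim.

        sub vcl_recv {
          # Throttle matching crawlers: randombool(1, 4) is true for roughly 1 in 4
          # requests, so the other ~3 in 4 are diverted to vcl_error.
          if (req.http.User-Agent ~ "(Googlebot|Baiduspider)" && !randombool(1, 4)) {
            error 600 "Crawler throttled";
          }
        }

        sub vcl_error {
          if (obj.status == 600) {
            # Serve the synthetic 503 for throttled crawler requests.
            set obj.status = 503;
            set obj.response = "Service Unavailable";
            set obj.http.Retry-After = "60";
            synthetic {"Crawl rate exceeded. Please retry later."};
            return(deliver);
          }
        }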

  • Richard Alpagot

    We also have a partnership with PerimeterX for bot detection, which would be worth checking out as another way to solve this problem.

    Fastly Docs: https://docs.fastly.com/guides/integrations/perimeterx-bot-defender

    Direct link to their product: https://www.perimeterx.com/products/bot-defender/?utm_source=partner&utm_medium=fastly&utm_campaign=botdefender

