Allow web crawling for compliance archiving

  • Allow web crawling for compliance archiving

    I need to allow a third-party company to crawl my site every day for legal compliance archiving.

    The crawl from this US-based vendor appears to be blocked by some security mechanism in Webflow. We just migrated the site from WordPress to Webflow, and now the compliance crawls are blocked (they worked fine on WordPress).

    How can I whitelist this vendor's IP address or user agent so it can crawl my site?

  • #2
    Only Webflow support can actually adjust access to your site, if that is even possible on a per-site basis. Contact them to see what’s possible.

    • #3
      I've run into the same thing and found that Webflow’s automatic robots.txt settings can block crawlers you actually need. What worked for me was turning off the “Disable Webflow subdomain indexing” option in the SEO settings under Site Settings. That way, crawlers can reach the published content for archiving without hitting a block. You'll still need to check whether the settings for your main domain allow crawling too.
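
If robots.txt is the suspect, one way to confirm it before changing any settings is to test the rules against the vendor's user agent. The following is only a rough sketch using Python's standard `urllib.robotparser`; the user agent `ComplianceArchiver`, the `example.com` URLs, and both sample rule sets are made up for illustration and are not what Webflow actually generates.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules that block every crawler.
SAMPLE_BLOCKING = """\
User-agent: *
Disallow: /
"""

# Hypothetical rules that block everyone except one named crawler.
SAMPLE_ALLOWING = """\
User-agent: ComplianceArchiver
Allow: /

User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    page = "https://example.com/some-page"  # placeholder URL
    print(crawler_allowed(SAMPLE_BLOCKING, "ComplianceArchiver/1.0", page))
    print(crawler_allowed(SAMPLE_ALLOWING, "ComplianceArchiver/1.0", page))
    print(crawler_allowed(SAMPLE_ALLOWING, "OtherBot/2.0", page))
```

To check a live site, paste the contents of its `/robots.txt` into the first argument and use the user agent string your vendor reports. Keep in mind that robots.txt only tells well-behaved crawlers what they may fetch; if the vendor is being rejected at the network or firewall level, this check will pass and the crawl will still fail.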
