Advanced Integration

Set Up Site Crawling

Regularly capturing and translating new content is essential to maintaining a fully translated website. Content capture is triggered whenever a page is loaded on a translated site running through the Global Delivery Network (GDN). This can happen through organic traffic from end users or through bots indexing the translated site.

Rather than relying on organic traffic to capture new content in a timely manner, set up a web crawler (spider) to browse each page automatically. This is especially helpful if you use a staging environment, where organic traffic is low, to capture and translate content before it is pushed to production.

If you do not already use a web crawling tool, you can choose between cloud-based solutions (like Apify) and browser-based extensions, depending on your preference. Each crawler may have its own features, but the core functionality is the same: you specify a domain, and a bot identifies all the hyperlinks on a page, stores them in a queue, and systematically opens or downloads each page while queueing any additional hyperlinks it finds.
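The queue-based process described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a replacement for a dedicated crawling tool: the names `LinkParser`, `fetch_html`, and `crawl`, the `max_pages` limit, and the start URL `https://es.example.com/` are all hypothetical, and the sketch assumes that a plain HTTP request for a page is enough to count as loading it.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download one page and return its body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=500):
    """Breadth-first crawl restricted to the host of start_url.

    Returns the list of URLs that were loaded, in visit order.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # skip pages that fail to load
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            link, _fragment = urldefrag(urljoin(url, href))
            # Stay on the same host and never queue a URL twice.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Calling `crawl("https://es.example.com/")` with a placeholder translated domain replaced by your own would load the start page, then every same-host page reachable from it, up to `max_pages`. The `fetch` parameter is injectable so the traversal logic can be exercised without network access.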

Each page needs to be loaded in only one language version: loading one translated page triggers content capture for all languages tied to the same source domain.

Depending on your web crawler, you may have to deselect the Protected checkbox in your translated site configuration.

Web crawlers cannot capture content that appears only after user interaction, such as submitting a form.
