Why is the number of GDN requests higher than the number of page views?
Page views do not map 1:1 to GDN requests:
- Page views represent a single page loading, regardless of how many components or assets are needed to build the page.
- Requests are a count of each component the GDN proxy translates while the page is built. In many cases, multiple requests are needed to assemble a single page.
A site's or page's technical design and rendering method directly relate to the number of GDN requests. Generally speaking, the more complex the structure of a page, the higher the potential number of GDN requests required to translate it.
The number of registered page views may also depend on the exact tools used for traffic analysis and measurement. Tools like Google Analytics often exclude known bots and crawlers from the page view count, automatically resulting in fewer registered page views. The GDN does not exclude any traffic by default, so the metrics for requests include all traffic handled by our proxies.
What are some common reasons for spikes in the number of GDN requests?
The following events may lead to a spike in GDN requests:
- Any event that will increase the amount of bot traffic to your website, such as:
  - Adding a new target locale
  - Adding large amounts of new content to your website
  - Deploying SEO crawlers
- Changes in the site structure and site redesigns
- Marketing activities directing traffic to particular pages
How can I identify what caused a spike in GDN requests?
If you notice a spike in GDN requests, we recommend conducting a traffic analysis to identify possible causes:
- Check if any of the events listed above have recently occurred.
- Check if there are specific User-Agents causing the majority of the traffic.
  - If needed, take steps to filter out this traffic.
- Check for spikes in unsuccessful HTTP responses (e.g. HTTP 404 status - Page Not Found).
  - To help avoid unsuccessful responses, remove or fix any broken or archived links and references. These may be present in outdated robots.txt or sitemap.xml files that are part of your site.
  - Consider caching translated error pages, so that they are not served by the GDN.
Remember that the GDN proxies all requests to your source site(s). This means that your source sites will have a record of all the traffic that passes through the GDN. In addition to any logs provided by Smartling, your web servers will also be able to provide a complete picture of the nature and composition of received traffic.
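As a rough illustration of the traffic analysis described above, the following Python sketch counts requests per User-Agent and per HTTP status code from a source web server's access log. It assumes the log uses the Apache/Nginx "combined" format, which may not match your server's configuration. The function name `summarize` and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: access logs follow the Apache/Nginx "combined" log format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count requests per User-Agent and per HTTP status code."""
    agents = Counter()
    statuses = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        agents[match.group("agent")] += 1
        statuses[match.group("status")] += 1
    return agents, statuses


if __name__ == "__main__":
    # Hypothetical sample lines; in practice, iterate over your log file.
    sample = [
        '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /fr/ HTTP/1.1" 200 2326 "-" "ExampleBot/1.0"',
        '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /fr/old HTTP/1.1" 404 512 "-" "ExampleBot/1.0"',
        '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /fr/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    ]
    agents, statuses = summarize(sample)
    for agent, count in agents.most_common(5):
        print(f"{count:6d}  {agent}")
    for status, count in sorted(statuses.items()):
        print(f"{status}: {count}")
```

A User-Agent that dominates the first list, or a sudden rise in 404 counts in the second, points to the kind of traffic worth investigating further.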
How can I filter out bot traffic?
Smartling's GDN does not filter out bot traffic by default. Some bot traffic can create exceptional value for your company, e.g. bots from search providers like Google, which ensure good SEO performance by indexing your site.
Other bots can lead to a spike in GDN requests without providing any benefit to your company. There are steps that can be taken on your end to filter out unwanted bot traffic.
To filter out unwanted bot traffic from bots providing their User-Agent ID:
- Identify the User-Agent ID of unwanted bots (if available).
- Add unwanted bots and crawlers to your robots.txt file: This file establishes which URLs a bot or crawler can and cannot access on your site.
- Locale-specific sitemaps: If the robots.txt file is not sufficient, a custom sitemap can be implemented for your localized sites. Similar to robots.txt, the sitemap helps direct bots to the appropriate sections on your website.
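To make the robots.txt step more concrete, here is a minimal sketch that defines a small rule set and checks it with Python's standard `urllib.robotparser` module before publishing. The bot name `ExampleBot`, the `/search` path, and the `www.example.com` URLs are placeholders, not recommendations. Keep in mind that robots.txt is advisory: only bots that choose to honor it are affected.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one unwanted bot entirely, and keep
# all other bots out of a (hypothetical) /search section.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /search
"""


def build_parser(robots_txt):
    """Parse robots.txt content held in a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The unwanted bot is blocked everywhere.
print(parser.can_fetch("ExampleBot", "https://www.example.com/fr/products"))
# Other bots may still crawl regular pages...
print(parser.can_fetch("Googlebot", "https://www.example.com/fr/products"))
# ...but not the disallowed section.
print(parser.can_fetch("Googlebot", "https://www.example.com/search?q=x"))
```

Checking rules this way before deployment helps avoid accidentally blocking search engine bots that provide SEO value.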
To also filter out bots that don't provide a User-Agent ID or that pretend to be humans:
- Content Delivery Network (CDN): CDNs can be configured to direct or block traffic using more advanced tools, such as real-time behavior analysis of traffic sources. For example, no human can browse 1000 pages in a few minutes, so even if that "user" self-identifies with a non-bot User-Agent, it is likely automated crawling software.
- Firewall (WAF) rules: As an alternative or complement to a CDN, a Web Application Firewall (WAF) can block traffic using a wide range of traffic filtering, analysis, and shaping capabilities. Many WAFs already include a filter for known bots.
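The rate-based reasoning above (no human browses 1000 pages in a few minutes) can be sketched as a simple sliding-window check. This is only an offline illustration of the idea, not how any particular CDN or WAF implements behavior analysis. The function name `find_fast_clients`, the default thresholds, and the use of a client identifier such as an IP address are all assumptions.

```python
from collections import defaultdict, deque


def find_fast_clients(events, max_requests=1000, window_seconds=300):
    """Return clients that exceed max_requests within any window_seconds span.

    events: iterable of (timestamp_in_seconds, client_id) pairs,
    ordered by timestamp.
    """
    recent = defaultdict(deque)  # client_id -> timestamps inside the window
    flagged = set()
    for timestamp, client in events:
        window = recent[client]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and timestamp - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(client)
    return flagged


if __name__ == "__main__":
    # Hypothetical data: one client sends 1200 requests in 2 minutes,
    # another sends 20 requests over the same period.
    events = [(i * 0.1, "203.0.113.5") for i in range(1200)]
    events += [(i * 6.0, "198.51.100.7") for i in range(20)]
    events.sort()
    print(find_fast_clients(events))
```

Clients flagged this way are candidates for a CDN or WAF rule, regardless of the User-Agent they report.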