E-commerce platforms that cause crawl budget issues:

Generate more than one URL for the same page content. For example, the same product may sit in multiple product categories, so you end up with a separate URL for each instance of the product page instead of just one. For this reason, you will need to noindex these additional URLs or canonicalize them to the "original" product URL version to avoid duplicate content issues.

Create crawlable URLs by default for each existing filter in listing pages.
For example, in category listings, visitors may have the option to filter or sort the existing list of products based on criteria such as size, color, popularity, or price. This generates specific, crawlable URLs for each combination, most of them displaying the same or very similar content, so they should not end up getting indexed. For example, the following filtered list URL is canonicalized to the main URL, with no parameters:

[Image: canonicalized filtered page]

While this is useful to avoid content duplication or cannibalization issues, these URLs generally remain crawlable,
which doesn't help with crawl budget issues. It is also important to prevent crawling of pages that are not intended to be indexed (or ranked); otherwise, we'll end up with a scenario like this:

[Image: crawl budget]

This e-commerce site has only 1.6% active URLs (URLs driving organic search traffic to the site) out of a total of 200,000 crawlable URLs. A sizable number of crawled URLs are indexable (compliant) but bring no organic search traffic. Even worse, 89% of crawled URLs are unindexable and without organic