This option automatically fetches the robots.txt file based on the current request and adheres to the `disallow` directives.

The JS version was implemented via the following PRs:

* https://github.com/apify/crawlee/pull/2910 (`respectRobotsTxtFile` crawler option)
* https://github.com/apify/crawlee/pull/2916 (`onSkippedRequest` option)
* https://github.com/apify/crawlee/pull/2913 (`RobotsFile` to `RobotsTxtFile`)

We will first need to implement the `RobotsTxtFile` and `Sitemap` classes:

* https://github.com/apify/crawlee/blob/master/packages/utils/src/internals/robots.ts
* https://github.com/apify/crawlee/blob/master/packages/utils/src/internals/sitemap.ts
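For reference, the intended behaviour (derive the robots.txt location from the current request, then honour its `disallow` directives) can be sketched roughly as below. This is only an illustration under stated assumptions: it uses the standard library's `urllib.robotparser` as a stand-in for the future `RobotsTxtFile` class, and the helper names `robots_txt_url` and `is_allowed` are hypothetical, not part of any existing Crawlee API. The robots.txt content is passed in as a string so the sketch does not perform network requests.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_txt_url(request_url: str) -> str:
    """Build the robots.txt URL from the scheme and host of a request URL.

    Hypothetical helper, shown only to illustrate "based on the current request".
    """
    parts = urlsplit(request_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt content permits fetching `url`.

    Hypothetical helper; the stdlib parser stands in for `RobotsTxtFile`.
    """
    parser = RobotFileParser()
    # Parse already-downloaded content instead of fetching it here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample robots.txt content, made up for this illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in (
        "https://example.com/public/page",
        "https://example.com/private/page",
    ):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, url) else "skipped"
        print(robots_txt_url(url), url, verdict)
```

In a crawler, a request for which `is_allowed` returns `False` would be skipped rather than enqueued. That is presumably the point where a callback such as `onSkippedRequest` from the JS version would be invoked.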