Currently, when a user runs their crawler, no log output is printed, leaving them unaware of the crawler's progress and actions.
A log formatter already exists, but users have to configure it manually, like this:
import asyncio
import logging

from crawlee.beautifulsoup_crawler import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
from crawlee.log_config import CrawleeLogFormatter

# Configure logging: attach the Crawlee formatter to the root logger
handler = logging.StreamHandler()
handler.setFormatter(CrawleeLogFormatter(include_logger_name=True))
root_logger = logging.getLogger()
root_logger.setLevel(logging.INFO)
root_logger.addHandler(handler)


async def main() -> None:
    crawler = BeautifulSoupCrawler()

    @crawler.router.default_handler
    async def request_handler(context: BeautifulSoupCrawlingContext) -> None:
        # Follow links found on the page and store the page URL and title
        await context.enqueue_links()
        data = {
            'url': context.request.url,
            'title': context.soup.title.string,
        }
        await context.push_data(data)

    await crawler.run(['https://crawlee.dev'])


if __name__ == '__main__':
    asyncio.run(main())
Can we make this setup a default? Possible solution: importing any module from Crawlee should configure the root logger automatically.
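To make the proposal a bit more concrete, here is a rough sketch of what such a default setup could look like. The function name `configure_default_logging` is hypothetical and not an existing Crawlee API, and a plain `logging.Formatter` stands in for `CrawleeLogFormatter` so the snippet runs without Crawlee installed. It leaves the logger alone if handlers are already attached, so a user's own configuration would not be overridden.

```python
import logging
from typing import Optional


def configure_default_logging(
    logger: Optional[logging.Logger] = None,
    level: int = logging.INFO,
) -> bool:
    """Attach a stream handler to `logger` (root logger by default).

    Hypothetical helper, not part of Crawlee. Returns True if a handler
    was added, False if the logger was already configured.
    """
    target = logger if logger is not None else logging.getLogger()

    # Respect existing configuration: do nothing if handlers are present.
    if target.handlers:
        return False

    handler = logging.StreamHandler()
    # Stand-in formatter; the real setup would use CrawleeLogFormatter here.
    handler.setFormatter(logging.Formatter('[%(name)s] %(levelname)s %(message)s'))
    target.addHandler(handler)
    target.setLevel(level)
    return True
```

If something along these lines were called from the package's `__init__.py`, importing any Crawlee module would trigger it once. Whether a library should touch the root logger at import time is a design question worth discussing, which is why the sketch only acts when no handlers exist yet.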