Setup
Install
pip install scrapy-poet
Enable
Add the following to your Scrapy configuration to enable scrapy-poet:
For Scrapy ≥ 2.10, configure the add-on:
settings.py

    ADDONS = {
        "scrapy_poet.Addon": 300,
    }
This is what the add-on changes:
- In DOWNLOADER_MIDDLEWARES:
  - Sets InjectionMiddleware with value 543.
  - Replaces scrapy.downloadermiddlewares.stats.DownloaderStats with scrapy_poet.DownloaderStatsMiddleware.
- Sets REQUEST_FINGERPRINTER_CLASS to ScrapyPoetRequestFingerprinter.
- In SPIDER_MIDDLEWARES, sets RetryMiddleware with value 275.
For Scrapy < 2.10, manually apply the add-on changes. For example:
settings.py

    DOWNLOADER_MIDDLEWARES = {
        "scrapy_poet.InjectionMiddleware": 543,
        "scrapy.downloadermiddlewares.stats.DownloaderStats": None,
        "scrapy_poet.DownloaderStatsMiddleware": 850,
    }
    REQUEST_FINGERPRINTER_CLASS = "scrapy_poet.ScrapyPoetRequestFingerprinter"
    SPIDER_MIDDLEWARES = {
        "scrapy_poet.RetryMiddleware": 275,
    }
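If the same project has to run on Scrapy versions on both sides of 2.10, the two configurations above could be selected by version. The following is a minimal sketch, not part of scrapy-poet: the helper name poet_settings is hypothetical, and the version parsing assumes a plain "major.minor[.patch]" string such as the one exposed by scrapy.__version__. The setting names and values are copied from the configurations above.

```python
def poet_settings(scrapy_version: str) -> dict:
    """Return the scrapy-poet settings for the given Scrapy version.

    Hypothetical helper for illustration. Only the first two numeric
    components of the version string are inspected, so pre-release
    strings such as "2.10rc1" are not handled.
    """
    major, minor = (int(part) for part in scrapy_version.split(".")[:2])

    if (major, minor) >= (2, 10):
        # Scrapy >= 2.10: the add-on applies all changes itself.
        return {"ADDONS": {"scrapy_poet.Addon": 300}}

    # Scrapy < 2.10: apply the add-on changes manually.
    return {
        "DOWNLOADER_MIDDLEWARES": {
            "scrapy_poet.InjectionMiddleware": 543,
            "scrapy.downloadermiddlewares.stats.DownloaderStats": None,
            "scrapy_poet.DownloaderStatsMiddleware": 850,
        },
        "REQUEST_FINGERPRINTER_CLASS": "scrapy_poet.ScrapyPoetRequestFingerprinter",
        "SPIDER_MIDDLEWARES": {
            "scrapy_poet.RetryMiddleware": 275,
        },
    }
```

In settings.py, one possible use is globals().update(poet_settings(scrapy.__version__)) after importing scrapy. Note that this replaces any DOWNLOADER_MIDDLEWARES or SPIDER_MIDDLEWARES defined earlier in the file, so existing entries would need to be merged by hand.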
Configure
Declare the SCRAPY_POET_DISCOVER setting with a list of modules that
define page objects, so that they can be loaded at run-time.
A best practice is to create a pages/ folder in your Scrapy project, as a
sibling of your spiders/ folder, add an empty __init__.py file to it
to make it a Python package, and declare its import path in the setting:
SCRAPY_POET_DISCOVER = ["myproject.pages"]
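The folder layout described above can also be created and checked from Python. This is a rough sketch using only the standard library; the helper names create_pages_package and is_importable are hypothetical, and myproject stands in for your own project package.

```python
import importlib
from pathlib import Path


def create_pages_package(project_package: Path) -> Path:
    """Create <project_package>/pages/ with an empty __init__.py.

    Existing folders and files are left untouched.
    """
    pages = project_package / "pages"
    pages.mkdir(parents=True, exist_ok=True)
    (pages / "__init__.py").touch()
    return pages


def is_importable(import_path: str) -> bool:
    """Return True if import_path (e.g. "myproject.pages") can be imported."""
    # Make sure files created a moment ago are visible to the import system.
    importlib.invalidate_caches()
    try:
        importlib.import_module(import_path)
    except ImportError:
        return False
    return True
```

Running create_pages_package(Path("myproject")) from the project root and then checking is_importable("myproject.pages") is one way to confirm that the path listed in SCRAPY_POET_DISCOVER resolves, assuming the project root is on sys.path.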