scrapy-poet documentation

scrapy-poet lets you use web-poet Page Objects with Scrapy.

web-poet defines a standard for writing reusable and portable extraction and crawling code; please check its docs to learn more.

With scrapy-poet you organize spider code differently: extraction and crawling logic are separated from the I/O and from Scrapy implementation details. This makes the code more testable and reusable. It also opens the door to generic spider code that works across sites; integrating a new site into the spider is then just a matter of writing a few Page Objects for it.
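The separation described above can be illustrated with a small, library-free sketch. It does not use the real web-poet or scrapy-poet APIs: BookPage, to_item and parse_book are hypothetical names chosen for illustration, and a regular expression stands in for a proper HTML parser. The point is only that the extraction code receives already-downloaded data and performs no I/O itself.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class BookPage:
    """Hypothetical page object: holds downloaded HTML and performs no I/O."""

    url: str
    html: str

    def to_item(self) -> dict:
        # Extraction logic only. A regex keeps this sketch dependency-free;
        # it is an assumption of the example, not how web-poet parses pages.
        match = re.search(r"<h1>(.*?)</h1>", self.html, re.S)
        title: Optional[str] = match.group(1).strip() if match else None
        return {"url": self.url, "title": title}


def parse_book(page: BookPage) -> dict:
    """Stand-in for a spider callback that is handed a ready-made page object."""
    return page.to_item()


# No network access is needed, so a fixture string is enough to exercise it.
page = BookPage(
    url="http://example.com/book/1",
    html="<html><body><h1> A Light in the Attic </h1></body></html>",
)
item = parse_book(page)
print(item)
```

Because the callback depends only on the page object, supporting another site in this sketch would mean writing another class with the same to_item method, leaving the crawling code unchanged.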

scrapy-poet also provides a way to integrate third-party APIs (like Splash and AutoExtract) with the spider without losing testability and reusability. Concrete integrations are not provided by web-poet, but scrapy-poet makes them possible.

To get started, see Installation and Scrapy Tutorial.

License is BSD 3-clause.