Challenge 2: Split on Multiple Pages
Goal
Each page lists only a few persons. Find all page URLs and write a scraper that scrapes every one of them.
Start
git checkout pagination
Instructions
  1. Open the file myscraper/spiders/myscraper.py
  2. Use a loop in the start_requests method
  3. Start the scraper (see Start instructions) and check item_scraped_count in the log.
Solution
git checkout .
git checkout pagination-soluce

Persons 0 - 2 / 100

Name: Mr Aaron Willer
Birth year:
Death year: 1912
Gender: M
Marital status:
Spouse:
Ticket class: 3
Ticket number: 3410
Ticket price: 8.14
Residence:
Job:
Companions count: 0
Cabin:
Embarked in: Cherbourg
Destination: Chicago Illinois United States
Died in the Titanic: Yes
Body recovered: No
Rescue boat number:

Name: Mr Albert Augustsson
Birth year: 1889
Death year: 1912
Gender: M
Marital status:
Spouse:
Ticket class: 3
Ticket number: 347468
Ticket price: 7.17
Residence: Krakoryd Småland Sweden
Job: General Labourer
Companions count: 0
Cabin:
Embarked in: Southampton
Destination: Bloomington Indiana United States
Died in the Titanic: Yes
Body recovered: No
Rescue boat number: