Deals of the Day
This is a web scraper, built with the Python framework Scrapy, that extracts data such as product name and price from the Deals of the Day section of the Mercado Livre website.
What Data Do We Want to Scrape?
- Product Name
- Original Price
- Current Price
- Product URL
- Data Extraction Date
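As a rough illustration of the fields listed above, one scraped deal could be modelled as a small record. This is only a sketch using the standard library: the class name `Deal`, the helper `make_deal`, and the field names are assumptions for illustration, and the repository's actual item definition may differ.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional


@dataclass
class Deal:
    """One scraped deal (hypothetical field names, not taken from the repository)."""
    product_name: str
    original_price: Optional[float]  # None when no original price is available
    current_price: float
    product_url: str
    extraction_date: str  # ISO 8601 date, e.g. "2024-01-02"


def make_deal(name, original_price, current_price, url, today=None):
    """Build a Deal, trimming the name and stamping the extraction date."""
    today = today or date.today()
    return Deal(
        product_name=name.strip(),
        original_price=original_price,
        current_price=current_price,
        product_url=url,
        extraction_date=today.isoformat(),
    )
```

Calling `asdict(make_deal(...))` gives a plain dictionary, which is a convenient shape for exporting records to JSON or CSV.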
Note: The scraper handles pagination and extracts the aforementioned data throughout the entire Deals of the Day section.
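The general idea behind pagination can be sketched without Scrapy: keep following the "next page" link until there is none. The snippet below is a standard-library illustration only, not the spider's actual code. The function `iter_page_urls`, the callback `get_next_href`, and the `example.com` URLs are hypothetical; inside a Scrapy spider, `response.urljoin` performs a similar resolution of relative links.

```python
from urllib.parse import urljoin


def iter_page_urls(start_url, get_next_href, max_pages=100):
    """Yield each page URL in turn, following 'next' links until none remains.

    get_next_href is a hypothetical callback that returns the (possibly
    relative) href of the next page for a given URL, or None on the last page.
    """
    url = start_url
    seen = set()
    # Stop on the last page, on a loop back to a visited page, or at max_pages.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        yield url
        href = get_next_href(url)
        # Resolve relative links against the page they were found on.
        url = urljoin(url, href) if href else None
```

The `seen` set and the `max_pages` cap are defensive choices so that a site whose last page links back to an earlier one cannot cause an endless crawl.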
💻
Requirements
Before you start, check that you meet the following requirements:
- Installed a recent stable Python version (Python 3.7 or later).
- Created a virtual environment in which to run the Scrapy framework on your machine.
- Installed Scrapy 1.6 or a later stable version.
Note: It is strongly recommended that you install Scrapy in a dedicated virtualenv, to avoid conflicting with your system packages.
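If you want to verify the first two requirements from Python itself, a small check along these lines may help. It is a sketch under assumptions: the helper names `python_ok` and `in_virtualenv` are made up for this example, and the virtual environment test relies on `sys.prefix` differing from `sys.base_prefix`, which holds for environments created with `python3 -m venv`.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the requirements above


def python_ok(version_info=sys.version_info):
    """Return True if the given interpreter version is 3.7 or later."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```

Running `print(python_ok(), in_virtualenv())` with the environment activated should report `True` for both values before you install Scrapy.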
Getting Started
From the terminal:
- Create an environment:
$ mkdir virtual-enviroments
$ cd virtual-enviroments
$ python3 -m venv venv
- Activate it:
Linux/macOS
$ source venv/bin/activate
- Install the Scrapy framework:
$ pip install Scrapy
🚀
How to Use:
Clone this repository into your workspace:
$ git clone https://github.com/david-adds/mercadolivre-scraper.git
Once you have cloned the repository, change into its directory so you can run the scraper.
$ cd mercadolivre-scraper
Then, run the spider to scrape the data:
$ scrapy crawl deals
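The crawl can also be launched from a Python script, for example to save the results to a file. The sketch below simply wraps the command shown above with `subprocess`. The helper names `build_crawl_command` and `run_crawl` and the file name `deals.json` are assumptions for illustration; `-o` is Scrapy's standard option for writing scraped items to a feed file.

```python
import subprocess


def build_crawl_command(spider="deals", output_path=None):
    """Return the argument list for 'scrapy crawl', optionally exporting items."""
    cmd = ["scrapy", "crawl", spider]
    if output_path:
        # -o writes the scraped items to the given file; the export format
        # is inferred from the file extension (e.g. .json or .csv).
        cmd += ["-o", output_path]
    return cmd


def run_crawl(spider="deals", output_path=None, project_dir="."):
    """Run the spider from the project directory; raises if the crawl fails."""
    return subprocess.run(
        build_crawl_command(spider, output_path),
        cwd=project_dir,
        check=True,
    )
```

For instance, `run_crawl(output_path="deals.json", project_dir="mercadolivre-scraper")` would run the same spider and write the scraped items to `deals.json`, assuming the virtual environment with Scrapy is active.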