Simple Craigslist wrapper

Overview

python-craigslist

A simple Craigslist wrapper.

License: MIT-Zero.

Disclaimer

  • I don't work for or have any affiliation with Craigslist.
  • This module was implemented for educational purposes. It should not be used for crawling or downloading data from Craigslist.

Installation

pip install python-craigslist

Classes

Base class:

  • CraigslistBase

Subclasses:

  • CraigslistCommunity (craigslist.org > community)
  • CraigslistHousing (craigslist.org > housing)
  • CraigslistJobs (craigslist.org > jobs)
  • CraigslistForSale (craigslist.org > for sale)
  • CraigslistEvents (craigslist.org > event calendar)
  • CraigslistServices (craigslist.org > services)
  • CraigslistGigs (craigslist.org > gigs)
  • CraigslistResumes (craigslist.org > resumes)

Examples

Looking for a room in San Francisco?

from craigslist import CraigslistHousing
cl_h = CraigslistHousing(site='sfbay', area='sfc', category='roo',
                         filters={'max_price': 1200, 'private_room': True})

# You can get an approximate number of results with the following call:
print(cl_h.get_results_approx_count())

992

for result in cl_h.get_results(sort_by='newest', geotagged=True):
    print(result)

{
    'id': u'4851150747',
    'name': u'Near SFSU, UCSF and NEWLY FURNISHED - CLEAN, CONVENIENT and CLEAN!',
    'url': u'http://sfbay.craigslist.org/sfc/roo/4851150747.html',
    'datetime': u'2015-01-27 23:44',
    'price': u'$1100',
    'where': u'inner sunset / UCSF',
    'has_image': False,
    'has_map': True,
    'geotag': (37.738473, -122.494721)
}
# ...

Maybe a software engineering internship in Silicon Valley?

from craigslist import CraigslistJobs
cl_j = CraigslistJobs(site='sfbay', area='sby', category='sof',
                      filters={'is_internship': True, 'employment_type': ['full-time', 'part-time']})

for result in cl_j.get_results():
    print(result)

{
    'id': u'5708651182',
    'name': u'GAME DEVELOPER INTERNSHIP AT TYNKER - AVAILABLE NOW!',
    'url': u'http://sfbay.craigslist.org/pen/eng/5708651182.html',
    'datetime': u'2016-07-30 13:30',
    'price': None,
    'where': u'mountain view',
    'has_image': True,
    'has_map': True,
    'geotag': None
}
# ...

Events with free food in New York?

from craigslist import CraigslistEvents
cl_e = CraigslistEvents(site='newyork', filters={'free': True, 'food': True})

for result in cl_e.get_results(sort_by='newest', limit=5):
    print(result)

{
    'id': u'4866178242',
    'name': u'Lituation Thursdays @ Le Reve',
    'url': u'http://newyork.craigslist.org/mnh/eve/4866178242.html',
    'datetime': u'1/29',
    'price': None,
    'where': u'Midtown East',
    'has_image': True,
    'has_map': True,
    'geotag': None
}
# ...
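
The result dictionaries above keep price as a string (or None) and geotag as a (latitude, longitude) tuple (or None). Below is a minimal sketch of post-processing them, assuming that shape. The names parse_price and distance_km, the sample data, and the reference point are hypothetical and not part of python-craigslist:

```python
import math


def parse_price(price):
    """Convert a price string such as '$1100' to an int; return None if absent or unparsable."""
    if price is None:
        return None
    digits = price.replace('$', '').replace(',', '').strip()
    return int(digits) if digits.isdigit() else None


def distance_km(a, b):
    """Great-circle (haversine) distance in km between two (lat, lon) tuples."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


# Sample data shaped like the results shown above (values are illustrative).
results = [
    {'name': 'room A', 'price': '$1100', 'geotag': (37.738473, -122.494721)},
    {'name': 'room B', 'price': None, 'geotag': None},
]
center = (37.7749, -122.4194)  # assumed reference point

for r in results:
    price = parse_price(r['price'])
    dist = distance_km(r['geotag'], center) if r['geotag'] else None
    print(r['name'], price, dist)
```

Both helpers pass None through, since price and geotag are None in several of the example results.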

Where to get filters from?

Every subclass has its own set of filters. To list all the filters supported by a specific subclass, use the .show_filters() class method:

>>> from craigslist import CraigslistJobs, CraigslistForSale
>>> CraigslistJobs.show_filters()

Base filters:
* posted_today = True/False
* query = ...
* search_titles = True/False
* has_image = True/False
Section specific filters:
* is_internship = True/False
* is_telecommuting = True/False
* is_contract = True/False
* is_parttime = True/False
* is_nonprofit = True/False
* employment_type = u'full-time', u'part-time', u'contract', u"employee's choice"

>>> CraigslistForSale.show_filters(category='cta')

Base filters:
* posted_today = True/False
* query = ...
* search_titles = True/False
* has_image = True/False
Section specific filters:
* min_year = ...
* model = ...
* min_price = ...
* max_miles = ...
* make = ...
* max_price = ...
* min_miles = ...
* max_year = ...
* auto_title_status = u'clean', u'salvage', u'rebuilt', u'parts only', u'lien', u'missing'
* auto_transmission = u'manual', u'automatic', u'other'
* auto_fuel_type = u'gas', u'diesel', u'hybrid', u'electric', u'other'
* auto_paint = u'black', u'blue', u'brown', u'green', u'grey', u'orange', u'purple', u'red', u'silver', u'white', u'yellow', u'custom'
* auto_bodytype = u'bus', u'convertible', u'coupe', u'hatchback', u'mini-van', u'offroad', u'pickup', u'sedan', u'truck', u'SUV', u'wagon', u'van', u'other'
* auto_drivetrain = u'fwd', u'rwd', u'4wd'
* auto_size = u'compact', u'full-size', u'mid-size', u'sub-compact'
* auto_cylinders = u'3 cylinders', u'4 cylinders', u'5 cylinders', u'6 cylinders', u'8 cylinders', u'10 cylinders', u'12 cylinders', u'other'
* condition = u'new', u'like new', u'excellent', u'good', u'fair', u'salvage'
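
A filters dictionary can be checked against the values listed above before it is passed to a subclass. This is only a minimal sketch: check_filters and its lookup tables are hypothetical, hand-copied from the show_filters(category='cta') output, and cover just a few filters. It is not how the library itself validates filters.

```python
# Allowed values copied by hand from the 'cta' listing above (subset only).
ALLOWED_VALUES = {
    'auto_transmission': {'manual', 'automatic', 'other'},
    'auto_drivetrain': {'fwd', 'rwd', '4wd'},
    'auto_fuel_type': {'gas', 'diesel', 'hybrid', 'electric', 'other'},
}
BOOLEAN_FILTERS = {'posted_today', 'search_titles', 'has_image'}
FREE_FORM_FILTERS = {'query', 'make', 'model', 'min_price', 'max_price',
                     'min_year', 'max_year', 'min_miles', 'max_miles'}


def check_filters(filters):
    """Return a list of problems found in `filters`; an empty list means none were found."""
    problems = []
    for name, value in filters.items():
        if name in BOOLEAN_FILTERS:
            if not isinstance(value, bool):
                problems.append(f"{name!r} expects True/False")
        elif name in ALLOWED_VALUES:
            # A filter may be given a single value or a list of values.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                if v not in ALLOWED_VALUES[name]:
                    problems.append(f"{v!r} is not an allowed value for {name!r}")
        elif name not in FREE_FORM_FILTERS:
            problems.append(f"{name!r} is not a known filter")
    return problems


print(check_filters({'auto_transmission': 'manual', 'max_price': 5000}))
print(check_filters({'auto_transmission': 'cvt', 'has_image': 'yes'}))
```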

Where to get site and area from?

When initializing any of the subclasses, you'll need to provide the site, and optionally the area, from which you want to query data.

To get the correct site, follow these steps:

  1. Go to craigslist.org/about/sites.
  2. Find the country or city you're interested in, and click on it.
  3. You'll be directed to <site>.craigslist.org. The value of <site> in the URL is the one you should use.

Not all sites have areas. To check if your site has areas, check for links next to the title of the Craigslist page, on the top center. For example, for New York you'll see:

https://user-images.githubusercontent.com/1008637/45307206-bb404d80-b51e-11e8-8e6d-edfbdbd0a6fa.png

Click on the one you're interested in, and you'll be redirected to <site>.craigslist.org/<area>. The value of <area> in the URL is the one you should use. If there are no areas next to the title, it means your site has no areas, and you can leave that argument unset.
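
The two values can also be read off the URL programmatically. Here is a minimal sketch using only the standard library. The helper site_and_area is hypothetical, and it assumes the URL has the <site>.craigslist.org/<area> form described above (on other pages the first path segment is not an area):

```python
from urllib.parse import urlparse


def site_and_area(url):
    """Split a <site>.craigslist.org/<area> URL into (site, area); area is None if absent."""
    parsed = urlparse(url)
    host = parsed.hostname or ''
    suffix = '.craigslist.org'
    if not host.endswith(suffix):
        raise ValueError(f"not a <site>.craigslist.org URL: {url!r}")
    site = host[:-len(suffix)]
    segments = [s for s in parsed.path.split('/') if s]
    area = segments[0] if segments else None
    return site, area


print(site_and_area('https://newyork.craigslist.org/mnh'))
print(site_and_area('https://york.craigslist.org/'))
```

The returned pair maps directly onto the site and area arguments of the subclasses.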

Where to get category from?

You can additionally provide a category when initializing any of the subclasses. To obtain the code of this category, follow these steps:

  1. Go to <site>.craigslist.org or just craigslist.org (you'll be directed to the last used site).
  2. You'll see a list of categories and subcategories (see image below).
  3. Click on the subcategory you're interested in. You'll be redirected to the search view for that subcategory. The URL you are redirected to will end with /search/<category>. The value of <category> is the code for your category.

https://user-images.githubusercontent.com/14173022/46252889-3614ce00-c424-11e8-9bac-060c236b8b58.png
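
The steps above can be sketched as a small helper that pulls the category code out of a search URL. The name category_from_search_url is hypothetical, and it assumes the category is the last path segment after /search/ (which also holds when an area precedes it, as in /search/sfc/roo):

```python
from urllib.parse import urlparse


def category_from_search_url(url):
    """Return the category code from a Craigslist search URL (the last path segment)."""
    segments = [s for s in urlparse(url).path.split('/') if s]
    if 'search' not in segments or segments[-1] == 'search':
        raise ValueError(f"no category found in {url!r}")
    return segments[-1]


print(category_from_search_url('https://sfbay.craigslist.org/search/roo'))
print(category_from_search_url(
    'https://sfbay.craigslist.org/search/sfc/roo?searchNearby=1&sort=date'))
```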

Is there a limit for the number of results?

Yes, Craigslist caps the results for any search at 3000.
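
get_results already accepts a limit argument, as in the events example. If you want to enforce a smaller limit of your own on any iterable of results, one option is itertools.islice. A minimal sketch follows; take and RESULT_CAP are hypothetical names, and 3000 is simply the cap stated above:

```python
from itertools import islice

RESULT_CAP = 3000  # cap stated in this README


def take(results, limit=None):
    """Return an iterator over at most `limit` items, never more than RESULT_CAP."""
    effective = RESULT_CAP if limit is None else min(limit, RESULT_CAP)
    return islice(results, effective)


# Works on any iterable, including a generator of result dictionaries.
print(list(take(range(10), limit=3)))
print(len(list(take(range(5000)))))
```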

Support

If you find a bug or want to propose a new feature, please use the issue tracker. I'll be happy to help you! :-)

Comments
  • Include mapaddress text

    Geocodes appear to be approximates in a lot of cases. In certain instances, the listing will have the address in the mapaddress div tag and it doesn't seem like this reads that. This is a feature request to include that information.

    Example page - https://york.craigslist.org/apa/d/york-room-for-rent/7104679926.html Example return:

    {
      'id': '7104679926',
      'repost_of': None,
      'name': 'Room for Rent',
      'url': 'https://york.craigslist.org/apa/d/york-room-for-rent/7104679926.html',
      'datetime': '2020-05-07 08:47',
      'last_updated': '2020-05-07 08:47',
      'price': '$669',
      'where': None,
      'has_image': True,
      'geotag': (39.955682, -76.718121)
    }
    

    opened by BigFav 11
  • Include body and images

    Add another option to get_results called include_details which also fetches the body of the listing as well as the urls of all the images. The listing is only fetched once and reused for both the geotag logic and the include_details logic.

    opened by bschlenk 8
  • search_titles option

    Hi and thanks again for this great tool.

    I am using the search_titles option for a keyword and I have come to realize that it may not be working properly.

    As an example, I am using the keyword 'Softail' to search for Harley bikes and it does flag the ones with Softail in the title.

    Here is a sample of the code I use.

    image

    Could you check or tell me what I am doing wrong?

    opened by usctzen 6
  • Add min bedroom/bathroom flags to CraigslistHousing

    This code allows the following example to work.

    cl_h = CraigslistHousing(site='sfbay', area='sfc', category='apa',
                             filters={'min_price': 6000, 'min_bedrooms': 5, 'min_bathrooms': 2})
    
    for result in cl_h.get_results(sort_by='newest', geotagged=True):
        print(result)
    
    opened by vpontis 6
  • An idea: Add filter for owner or dealer listing for certain for sale categories

    In CraigslistForSale, there are several categories (e.g. 'CTA') that have the option to filter listings by dealer or owner on the website.

    Adding that as a filter in this would be very useful.

    opened by Quresh95 5
  • Bug in example code

    Here is the housing example:

    from craigslist import CraigslistHousing
    
    cl_h = CraigslistHousing(site='sfbay', area='sfc', category='roo',
                             filters={'max_price': 1200, 'private_room': True})
    
    for result in cl_h.get_results(sort_by='newest', geotagged=True):
        print result
    

    This was working about a week ago, but as of recently it throws an error (on the line with the for loop). The error is:

      File "/Library/Python/2.7/site-packages/craigslist/__init__.py", line 160, in get_results
        p_text = row.find('span', {'class': 'p'}).text
    AttributeError: 'NoneType' object has no attribute 'text'

    Thanks

    opened by Adama94 5
  • ImportError: No module named 'Queue'

      File "C:\Python34\lib\site-packages\craigslist\__init__.py", line 3, in <module>
        from Queue import Queue
    ImportError: No module named 'Queue'

    Looks like it's not Python3 compatible after all.

    opened by Superbest 5
  • Added Python 3 compatibility

    When I tried to use your library with Python 3 it threw several errors, but I changed the import statements to work with both Python 2 and 3. I'd really appreciate it if you could add this functionality! :smile:

    opened by tweakdeveloper 5
  • Alerts are 10-15 minutes late

    There seems to be a 10-15 minute delay from when something is posted on craigslist to when python-craigslist finds it. Is there any way to minimize this time?

    opened by damanc7 4
  • 403 error

    When following the basic lines in the README, I get requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://sfbay.craigslist.org/search/sfc/roo?searchNearby=1&sort=date&max_price=1200&private_room=1&s=0.

    Is there functionality that allows a user to work around this? A quick look at the code makes it seem like there isn't.

    A potential lead - https://stackoverflow.com/questions/16627227/http-error-403-in-python-3-web-scraping

    opened by BigFav 4
  • geotags

    Hello, I set geotagged = True for housing, but I receive all 'geotag': None. These websites usually have maps and an address underneath. Is this a CL issue?

    opened by irahorecka 4
  • Wrapper Still Maintained?

    Hi - I was just wondering if this wrapper is still maintained? I tried a few different endpoints but received a couple of different types of errors and wasn't sure if they were caused by a bad local installation on my end or if the wrapper was just no longer supported and there had since been updates to CL. Thanks!

    opened by tangunner 5
  • Use requests.Session and urllib3.Retry support

    Using requests.Session() allows for keep-alive connections and prevents repeated DNS queries that can cause some caching resolvers to fail transiently. Using urllib3.Retry allows for better, more dynamic handling of request retries.

    opened by incontestableness 0
  • Don't log warnings for listings that have been removed

    Sometimes, listings that violate Craigslist's TOS will be removed, but remain searchable. This change stops logging warnings for these non-existent listings. Here's an example from the cars & trucks section which consistently displays in searches but whose listing page returns an HTTP 404. image

    opened by incontestableness 1
  • Support for searching using `lat` `lon` instead of `zipcode`?

    For whatever reason, CG does not deal with my [large TX] zipcode well. So on CG I always use the "use map" option in my searches. That replaces zipcode with lat and lon params.

    But I'm getting:

    'lat' is not a valid filter
    'lon' is not a valid filter
    
    opened by mattalexx 0
Owner
Julio M. Alegria