:cloud: Python API for ThePirateBay.

Overview

TPB

Unofficial Python API for ThePirateBay.

Installation

$ pip install ThePirateBay

Note that ThePirateBay depends on lxml. If pip fails while compiling lxml, install the libxml2-dev and libxslt-dev packages on your system and try again.
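On apt-based systems (a Debian/Ubuntu assumption; package names differ on other distributions), that looks roughly like:

```shell
$ sudo apt-get install libxml2-dev libxslt-dev   # lxml's C build dependencies
$ pip install ThePirateBay
```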

Usage

from tpb import TPB
from tpb import CATEGORIES, ORDERS

t = TPB('https://thepiratebay.org') # create a TPB object with default domain

# search for 'public domain' in 'movies' category
search = t.search('public domain', category=CATEGORIES.VIDEO.MOVIES)

# return listings from page 2 of this search
search.page(2)

# sort this search by count of seeders, and return a multipage result
search.order(ORDERS.SEEDERS.ASC).multipage()

# search, order by seeders and return page 3 results
t.search('python').order(ORDERS.SEEDERS.ASC).page(3)

# multipage beginning on page 4
t.search('recipe book').page(4).multipage()

# search, in a category and return multipage results
t.search('something').category(CATEGORIES.OTHER.OTHER).multipage()

# get page 3 of recent torrents
t.recent().page(3)

# get top torrents in Movies category
t.top().category(CATEGORIES.VIDEO.MOVIES)

# print all torrent descriptions
for torrent in t.search('public domain'):
    print(torrent.info)

# print all torrent files and their sizes
for torrent in t.search('public domain'):
    print(torrent.files)
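The chained search()/order()/page() calls above follow a fluent-builder style: each call tweaks the query and returns the query object itself. A minimal standalone sketch of that idea (a toy class, not the library's actual implementation):

```python
class Query:
    """Toy fluent query builder mimicking TPB's chained-call style."""

    def __init__(self, terms):
        self.terms = terms
        self.page_num = 0
        self.order_by = None

    def page(self, n):
        self.page_num = n
        return self  # returning self is what enables chaining

    def order(self, field):
        self.order_by = field
        return self

    def url(self):
        # A hypothetical URL layout, purely for illustration.
        return '/search/{0}/{1}/{2}'.format(self.terms, self.page_num, self.order_by)

q = Query('public domain').order('seeders').page(3)
print(q.url())  # /search/public domain/3/seeders
```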

Torrent details available

Attributes

  • title # the title of the torrent
  • url # TPB url for the torrent
  • category # the main category
  • sub_category # the sub category
  • magnet_link # magnet download link
  • torrent_link # .torrent download link
  • created # uploaded date time
  • size # size of torrent
  • user # username of uploader
  • seeders # number of seeders
  • leechers # number of leechers

Properties

  • created # creation date -- parsed when accessed
  • info # detailed torrent description -- needs separate request
  • files # dictionary of files and their size -- needs separate request
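The properties above are evaluated lazily: info and files only trigger their extra HTTP request the first time they are accessed. A minimal sketch of that lazy-loading pattern (a hypothetical stand-in class with no real network call):

```python
class Torrent:
    """Illustrates lazy, cached properties; not the library's real class."""

    def __init__(self, url):
        self.url = url
        self._info = None

    @property
    def info(self):
        # Expensive work (in the real library, a separate HTTP request)
        # runs only on first access; the result is cached afterwards.
        if self._info is None:
            self._info = 'details fetched from %s' % self.url
        return self._info

t = Torrent('https://example.org/torrent/1')
print(t.info)  # details fetched from https://example.org/torrent/1
```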

Tests

Tests can be run using tox.

$ pip install tox
$ tox

Alternatively, install the test dependencies manually:

$ pip install -r tests/requirements.txt

Then, to execute the tests simply run:

$ python -m unittest discover

By default, the tests run against a local test server that serves pre-downloaded original responses. To run them against the remote site instead, set REMOTE=true:

$ REMOTE=true python -m unittest discover
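A test suite typically reads such a flag from the environment; a sketch of that pattern (not the project's exact test code):

```python
import os

def remote_enabled(environ=os.environ):
    # REMOTE=true (any case) switches the suite to the live site;
    # unset or any other value means the local test server.
    return environ.get('REMOTE', 'false').lower() == 'true'

print(remote_enabled({'REMOTE': 'true'}))  # True
print(remote_enabled({}))                  # False
```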

Donations

If TPB has helped you in any way, and you'd like to help the developer, please consider donating.

- BTC: 19dLDL4ax7xRmMiGDAbkizh6WA6Yei2zP5

- Gratipay: https://www.gratipay.com/karan/

- Flattr: https://flattr.com/profile/thekarangoel

Contribute

If you want to add any new features, or improve existing ones, feel free to send a pull request!

Comments
  • Total rewrite allowing request modification, advanced pagination and chained methods.

    from tpb import TPB
    from tpb import CATEGORIES, ORDERING
    
    t = TPB('http://thepiratebay.sx/')
    search = t.search('breaking bad', category=CATEGORIES.MOVIES)
    search.page(2)
    search.ordering(ORDERING.SEEDERS).multipage()
    t.search('breaking bad').ordering(ORDERING.SEEDERS).page(3)
    t.search('babylon 5').page(4).multipage() # multipage beginning on page 4
    t.search('something').category(CATEGORIES.OTHERS).multipage()
    t.recent().page(3)
    t.top().category(CATEGORIES.MOVIES)
    
    opened by umazalakain 9
  • urllib.error.HTTPError: HTTP Error 403: Forbidden

    When I run it with any request, I get this HTTP error:

    Traceback (most recent call last):
      File "main.py", line 7, in <module>
        for torrent in t.search('neighbors'):
      File "/usr/lib/python3.5/site-packages/tpb/tpb.py", line 149, in items
        for item in super(Paginated, self).items():
      File "/usr/lib/python3.5/site-packages/tpb/tpb.py", line 59, in items
        request = urlopen(str(self.url))
      File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python3.5/urllib/request.py", line 510, in error
        return self._call_chain(*args)
      File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 403: Forbidden
    

    This is the code I tried:

    from tpb import TPB
    from tpb import CATEGORIES, ORDERS
    
    t = TPB('https://thepiratebay.org') # create a TPB object with default domain
    
    # search for 'public domain' in 'movies' category
    search = t.search('public domain', category=CATEGORIES.VIDEO.MOVIES)
    
    # return listings from page 2 of this search
    search.page(2)
    
    # sort this search by count of seeders, and return a multipage result
    search.order(ORDERS.SEEDERS.ASC).multipage()
    
    # search, order by seeders and return page 3 results
    t.search('python').order(ORDERS.SEEDERS.ASC).page(3)
    
    # multipage beginning on page 4
    t.search('recipe book').page(4).multipage()
    
    # search, in a category and return multipage results
    t.search('something').category(CATEGORIES.OTHER.OTHER).multipage()
    
    # get page 3 of recent torrents
    t.recent().page(3)
    
    # get top torrents in Movies category
    t.top().category(CATEGORIES.VIDEO.MOVIES)
    
    # print all torrent descriptions
    for torrent in t.search('public domain'):
        print(torrent.info)
    
    # print all torrent files and their sizes
    for torrent in t.search('public domain'):
        print(torrent.files)
    
    opened by Almoullim 6
  • Missing Category

    Please consider adding 'PORN' and its sub-categories to the code ;) Currently, specifying category(CATEGORIES.PORN) raises an AttributeError.

    Cheers.

    opened by sbjaved 6
  • cli utility

    A command line interface would be a great tool to make use of TPB. What do you think? It would also be great to automatically detect which BitTorrent client is installed on the system and launch the selected torrent downloads with it!

    An ncurses interface would also be great, though that is a bit more complex. Any thoughts?

    wontfix 
    opened by umazalakain 6
  • Adding the classmethod from_string in the Torrent class

    Adds the classmethod from_string to the Torrent class, which allows building a Torrent object from its url. Fixes issue #65.

    I added beautifulsoup as a new dependency. I'm sorry, I just can't stand lxml. It might also not be the smartest way to do the job, but the job is done.

    opened by JPFrancoia 5
  • Element not found when parsing dates

    Hi, it's my first time using Python so my debugging skills are very rough, but upon iterating over some search results I'm getting the following error: TypeError: 'NoneType' object is not iterable

    Stack trace ends with the following:

      File "/Users/tomasz/Server/couch/venv/lib/python2.7/site-packages/tpb/tpb.py", line 144, in items
        for item in super(Paginated, self).items():
      File "/Users/tomasz/Server/couch/venv/lib/python2.7/site-packages/tpb/tpb.py", line 60, in items
        yield self._build_torrent(row)
      File "/Users/tomasz/Server/couch/venv/lib/python2.7/site-packages/tpb/tpb.py", line 102, in _build_torrent
        created = dateutil.parser.parse(match.groups()[0].replace('\xa0', ' '))
      File "/Users/tomasz/Server/couch/venv/lib/python2.7/site-packages/dateutil/parser.py", line 748, in parse
        return DEFAULTPARSER.parse(timestr, **kwargs)
      File "/Users/tomasz/Server/couch/venv/lib/python2.7/site-packages/dateutil/parser.py", line 310, in parse
        res, skipped_tokens = self._parse(timestr, **kwargs)
    

    After digging into the code, line 102 in tpb.py seems to be the one causing issues: created = dateutil.parser.parse(match.groups()[0].replace('\xa0', ' '))

    When replaced with datetime.now() it works. That said, it might be TPB itself causing problems, as I can't reliably reproduce the issue 100% of the time... sometimes it just works. Any thoughts?

    opened by tomasz-tomczyk 5
  • Test server and base case

    Base test case that executes every test on remote and local URLs:

    • executes on remote only if remote is available.
    • executes always on a local bottle server that returns predownloaded original responses.
    opened by umazalakain 5
  • The multipage date does not work

    Hi,

    To begin, TPB changed their domain again, just to let you know. Second, it seems the torrent.created gives a wrong date when used with multipage.

    Example:

    search = t.search('*', category=CATEGORIES.VIDEO.MOVIES)
    search.order(ORDERS.UPLOADED.DES).multipage()

    for torrent in search:
        print(torrent, torrent.created)

    OUTPUT:

    ...
    Gravity (2013) DVDScr x264 AC3 by GFGTORRZ 2013-12-14 01:47:00
    Resident Evil 2007 Special Edition by GFGTORRZ 2013-12-14 01:43:00
    Romeo.And.Juliet.A.Love.Song.2013.DVDRip.XviD.AC3-EVO by UltraTorrents 2013-12-14 01:37:00
    Vikingdom 2013 BRRip x264 AC3-MiLLENiUM by TvTeam 2013-12-14 01:29:00
    Open Grave [2013] BRRip XViD-ETRG by UltraTorrents 2013-12-14 01:10:00
    Gilda (1946) Español Latino by Rob.Merc. 2013-12-14 01:04:00
    Turbo.2013.DVDRip.H264.AAC by Xanthippus 2013-12-14 00:11:00
    A Fronteira.DVDRip.XviD-DualAudio by Sapop 2013-12-14 14:42:10.814431
    Lisa.1990.DVDRip.XviD-EBX by TvTeam 2013-12-14 14:42:10.815012
    ...

    After midnight, when the dates are supposed to roll over to the previous day, the reported date is the current date and time, i.e. if I start the script at 14:42, the torrent date will be 14:42 on the same day.

    BUT, if I instead loop page by page with search.order(ORDERS.UPLOADED.DES).page(i), it's ok: the date is right.

    opened by JPFrancoia 4
  • Obtain the nfo infos

    Hi,

    I wonder if you plan to implement a way to get the info contained in the nfo element on the full page of a torrent.

    e.g. http://thepiratebay.sx/torrent/9236862/Ubuntu_Gnome_13.10_64-bit

    On this page, the element is: div class="nfo"

    It contains the comments of the uploader, and sometimes some links.

    Thanks.

    enhancement 
    opened by JPFrancoia 4
  • Index out of range for CSS Select

    I am getting the following error when running the example script from the project site (here: https://github.com/karan/TPB#usage):

    File "example.py", line 34, in <module>
      print(torrent.info)
    File "/usr/local/lib/python2.7/dist-packages/ThePirateBay-v1.3.5-py2.7.egg/tpb/tpb.py", line 347, in info
      info = root.cssselect('#details > .nfo > pre')[0].text_content()
    IndexError: list index out of range

    The only change I made was replacing the domain "https://thepiratebay.org" with "https://thepiratebay.la".

    opened by sebastian9er 3
  • Handling the decompression of the data from tpb

    Now using the requests module instead of urllib. requests.get() handles the decompression of the data returned by the website. Fixes issue #68, but introduces a dependency on the requests module.

    opened by JPFrancoia 3
  • Fixed issue relating to row parsing on other mirrors than piratebay.org

    The previous version didn't work on anything other than piratebay.org. I made a fix so that it works on all functioning mirrors of ThePirateBay, by changing which rows are allowed to be parsed: the previous version included the last row, which has a column count of 1 instead of the typical 4.

    opened by brandongallagher1999 0
  • fixed URL lib calls, fixed HTTP error, working on python 3.8

    This update allows the API to function in the latest version of python 3.8 / 3.9.

    I was getting an HTTP error with urllib, so I replaced it with requests.

    opened by brandongallagher1999 0
  • I've forked the project

    It's here: https://github.com/pawamoy/TPB

    Due to @karan not responding to issues and PRs, I've taken the liberty to fork the project and merge some PRs in it.

    Still not pushed to PyPI. What distribution name should I use since ThePirateBay is taken? tpb maybe?

    opened by pawamoy 1
  • Typo dependency on "dateutils" library

    It seems that you have dateutils in your setup.py. That refers to a different PyPI package, which is almost certainly not what you wanted. I'm guessing you wanted python-dateutil and made a mistake about the package name. Since dateutils depends on python-dateutil, you probably didn't notice the mistake.

    You seem to have correctly listed python-dateutil in the requirements.txt file (though pinning to a quite old version).

    opened by pganssle 0
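    The fix the issue describes is a one-line rename in setup.py (a sketch; the actual file may list other dependencies):

    # setup.py (sketch): the dependency name should be python-dateutil,
    # not dateutils, which is a different PyPI package.
    setup(
        ...
        install_requires=[
            'lxml',
            'python-dateutil',  # was: 'dateutils'
        ],
    )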
  • Switched broken pypip.in badges to shields.io

    Hello, this is an auto-generated Pull Request. (Feedback?)

    Some time ago, pypip.in shut down. This broke the badges for a bunch of repositories, including thepiratebay. Thankfully, an equivalent service is run by shields.io. This pull request changes the badges to use shields.io instead.

    Unfortunately, PyPI has removed download statistics from their API, which means that even the shields.io "download count" badges are broken (they display "no longer available"). So those badges should really be removed entirely. Since this is an automated process (and trying to automatically remove the badges from READMEs can be tricky), this pull request just replaces the URL with the shields.io syntax.

    opened by movermeyer 0
Owner

Karan Goel
Little brown guy with big dreams. https://goel.io