Google Search Results via SERP API pip Python Package

Overview

Google Search Results in Python

Package Build

This Python package is meant to scrape and parse search results from Google, Bing, Baidu, Yandex, Yahoo, Home Depot, eBay and more, using SerpApi.

The following services are provided:

  • Search API
  • Search Archive API
  • Account API
  • Location API

SerpApi provides a script builder to get you started quickly.

Installation

Python 3.7+

pip install google-search-results

Link to the python package page

Quick start

from serpapi import GoogleSearch
search = GoogleSearch({
    "q": "coffee", 
    "location": "Austin,Texas",
    "api_key": "<your secret api key>"
  })
result = search.get_dict()

This example runs a search for "coffee" using your secret API key.

The SerpApi service (backend):

  • searches on Google using the query: q = "coffee"
  • parses the messy HTML responses
  • returns a standardized JSON response

The GoogleSearch class:

  • formats the request
  • executes a GET HTTP request against the SerpApi service
  • parses the JSON response into a dictionary

Et voila..
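
Under the hood, the GoogleSearch class is essentially a thin wrapper around a plain HTTP GET. As a rough, illustrative sketch (not the class implementation itself), the equivalent call with the requests library against the https://serpapi.com/search endpoint could look like this:

import requests

# rough equivalent of GoogleSearch({...}).get_dict(), for illustration only
params = {
    "q": "coffee",
    "location": "Austin,Texas",
    "api_key": "<your secret api key>"
}
response = requests.get("https://serpapi.com/search", params=params)
result = response.json()  # the standardized JSON response, parsed into a dict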

Alternatively, you can search:

  • Bing using the BingSearch class
  • Baidu using the BaiduSearch class
  • Yahoo using the YahooSearch class
  • DuckDuckGo using the DuckDuckGoSearch class
  • eBay using the EbaySearch class
  • Yandex using the YandexSearch class
  • Home Depot using the HomeDepotSearch class
  • Google Scholar using the GoogleScholarSearch class
  • YouTube using the YoutubeSearch class
  • Walmart using the WalmartSearch class
  • Apple App Store using the AppleAppStoreSearch class
  • Naver using the NaverSearch class

See the playground to generate your code.

Summary

Google Search API capability

Source code.

params = {
  "q": "coffee",
  "location": "Location Requested", 
  "device": "desktop|mobile|tablet",
  "hl": "Google UI Language",
  "gl": "Google Country",
  "safe": "Safe Search Flag",
  "num": "Number of Results",
  "start": "Pagination Offset",
  "api_key": "Your SERP API Key", 
  # to be matched
  "tbm": "nws|isch|shop", 
  # to be searched
  "tbs": "custom to-be-searched criteria",
  # allow async request
  "async": "true|false",
  # output format
  "output": "json|html"
}

# define the search
search = GoogleSearch(params)
# override an existing parameter
search.params_dict["location"] = "Portland"
# fetch the search result as raw HTML
html_results = search.get_html()
# parse results
#  as python Dictionary
dict_results = search.get_dict()
#  as JSON using json package
json_results = search.get_json()
#  as dynamic Python object
object_result = search.get_object()

Link to the full documentation

See below for more hands-on examples.

How to set SERP API key

You can get an API key here if you don't already have one: https://serpapi.com/users/sign_up

The SerpApi api_key can be set globally:

GoogleSearch.SERP_API_KEY = "Your Private Key"

The SerpApi api_key can be provided for each search:

query = GoogleSearch({"q": "coffee", "serp_api_key": "Your Private Key"})
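
A common pattern (used by the test examples below) is to keep the key out of the source code and read it from an environment variable:

import os
from serpapi import GoogleSearch

# assumes the API_KEY environment variable is set, as in the tests below
search = GoogleSearch({"q": "coffee", "api_key": os.getenv("API_KEY")})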

Example by specification

We love true open source, continuous integration and Test Driven Development (TDD). We are using RSpec to test our infrastructure around the clock to achieve the best QoS (Quality of Service).

The directory test/ includes specification/examples.

Set your API key.

export API_KEY="your secret key"

Run test

make test

Location API

from serpapi import GoogleSearch
search = GoogleSearch({})
location_list = search.get_location("Austin", 3)
print(location_list)

it prints the first 3 locations matching Austin (Texas, Texas, Rochester):

[   {   'canonical_name': 'Austin,TX,Texas,United States',
        'country_code': 'US',
        'google_id': 200635,
        'google_parent_id': 21176,
        'gps': [-97.7430608, 30.267153],
        'id': '585069bdee19ad271e9bc072',
        'keys': ['austin', 'tx', 'texas', 'united', 'states'],
        'name': 'Austin, TX',
        'reach': 5560000,
        'target_type': 'DMA Region'},
        ...]

Search Archive API

Search results are stored in a temporary cache. A previous search can be retrieved from the cache for free.

from serpapi import GoogleSearch
search = GoogleSearch({"q": "Coffee", "location": "Austin,Texas"})
search_result = search.get_dictionary()
assert search_result.get("error") == None
search_id = search_result.get("search_metadata").get("id")
print(search_id)

Now let's retrieve the previous search from the archive.

archived_search_result = GoogleSearch({}).get_search_archive(search_id, 'json')
print(archived_search_result.get("search_metadata").get("id"))

it prints the search result from the archive.

Account API

from serpapi import GoogleSearch
search = GoogleSearch({})
account = search.get_account()

this code retrieves your account information.
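
For example, a quick way to inspect what the Account API returns (the exact field names depend on the SerpApi backend):

from serpapi import GoogleSearch

account = GoogleSearch({"api_key": "<your secret api key>"}).get_account()
# print every field returned by the account endpoint
for key, value in account.items():
    print(key, "=", value)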

Search Bing

from serpapi import BingSearch
search = BingSearch({"q": "Coffee", "location": "Austin,Texas"})
data = search.get_dict()

this code prints Bing search results for coffee as a Dictionary.

https://serpapi.com/bing-search-api

Search Baidu

from serpapi import BaiduSearch
search = BaiduSearch({"q": "Coffee"})
data = search.get_dict()

this code prints Baidu search results for coffee as a Dictionary. https://serpapi.com/baidu-search-api

Search Yandex

from serpapi import YandexSearch
search = YandexSearch({"text": "Coffee"})
data = search.get_dict()

this code prints Yandex search results for coffee as a Dictionary.

https://serpapi.com/yandex-search-api

Search Yahoo

from serpapi import YahooSearch
search = YahooSearch({"p": "Coffee"})
data = search.get_dict()

this code prints Yahoo search results for coffee as a Dictionary.

https://serpapi.com/yahoo-search-api

Search Ebay

from serpapi import EbaySearch
search = EbaySearch({"_nkw": "Coffee"})
data = search.get_dict()

this code prints eBay search results for coffee as a Dictionary.

https://serpapi.com/ebay-search-api

Search Home Depot

from serpapi import HomeDepotSearch
search = HomeDepotSearch({"q": "chair"})
data = search.get_dict()

this code prints Home Depot search results for chair as a Dictionary.

https://serpapi.com/home-depot-search-api

Search Youtube

from serpapi import YoutubeSearch
search = YoutubeSearch({"search_query": "chair"})
data = search.get_dict()

this code prints YouTube search results for chair as a Dictionary.

https://serpapi.com/youtube-search-api

Search Google Scholar

from serpapi import GoogleScholarSearch
search = GoogleScholarSearch({"q": "Coffee"})
data = search.get_dict()

this code prints Google Scholar search results.

Search Walmart

from serpapi import WalmartSearch
search = WalmartSearch({"query": "chair"})
data = search.get_dict()

this code prints Walmart search results for chair as a Dictionary.

Search Apple Store

from serpapi import AppleAppStoreSearch
search = AppleAppStoreSearch({"term": "Coffee"})
data = search.get_dict()

this code prints Apple App Store search results for coffee as a Dictionary.

Search Naver

from serpapi import NaverSearch
search = NaverSearch({"query": "chair"})
data = search.get_dict()

this code prints Naver search results for chair as a Dictionary.

Generic search with SerpApiClient

from serpapi import SerpApiClient
query = {"q": "Coffee", "location": "Austin,Texas", "engine": "google"}
search = SerpApiClient(query)
data = search.get_dict()

This class lets you interact with any search engine supported by SerpApi.com.
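
For instance, the same client can target another engine just by changing the engine value and the engine-specific query parameter. This is a sketch mirroring the BingSearch example above; the "bing" engine value is assumed to follow the naming of the API docs linked earlier:

from serpapi import SerpApiClient

search = SerpApiClient({"q": "Coffee", "engine": "bing"})
data = search.get_dict()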

Search Google Images

from serpapi import GoogleSearch
search = GoogleSearch({"q": "coffee", "tbm": "isch"})
for image_result in search.get_dict()['images_results']:
    link = image_result["original"]
    try:
        print("link: " + link)
        # wget.download(link, '.')
    except Exception:
        pass

this code prints all the image links, and downloads the images if you uncomment the line with wget (a Linux/macOS tool to download files).
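
If you prefer not to depend on wget, a rough standard-library alternative (file names here are illustrative) could download the images directly:

import urllib.request
from serpapi import GoogleSearch

search = GoogleSearch({"q": "coffee", "tbm": "isch", "api_key": "<your secret api key>"})
for index, image_result in enumerate(search.get_dict().get("images_results", [])):
    link = image_result["original"]
    try:
        # save each image next to the script; the naming scheme is arbitrary
        urllib.request.urlretrieve(link, "image_%d.jpg" % index)
    except Exception as error:
        print("could not download " + link + ": " + str(error))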

This tutorial covers more ground on this topic. https://github.com/serpapi/showcase-serpapi-tensorflow-keras-image-training

Search Google News

from serpapi import GoogleSearch
search = GoogleSearch({
    "q": "coffee",  # search query
    "tbm": "nws",  # news
    "tbs": "qdr:d", # last 24h
    "num": 10
})
for offset in [0,1,2]:
    search.params_dict["start"] = offset * 10
    data = search.get_dict()
    for news_result in data['news_results']:
        print(str(news_result['position'] + offset * 10) + " - " + news_result['title'])

this script prints the news titles from the first 3 pages of results for the last 24 hours.

Search Google Shopping

from serpapi import GoogleSearch
search = GoogleSearch({
    "q": "coffee",     # search query
    "tbm": "shop",     # shopping results
    "tbs": "p_ord:rv", # order by review
    "num": 100
})
data = search.get_dict()
for shopping_result in data['shopping_results']:
    print(str(shopping_result['position']) + " - " + shopping_result['title'])

this script prints all the shopping results, ordered by review.

Google Search By Location

With SerpApi, we can run a Google search from anywhere in the world. This code looks for the best coffee shop in each city.

from serpapi import GoogleSearch
for city in ["new york", "paris", "berlin"]:
  location = GoogleSearch({}).get_location(city, 1)[0]["canonical_name"]
  search = GoogleSearch({
      "q": "best coffee shop",   # search query
      "location": location,
      "num": 1,
      "start": 0
  })
  data = search.get_dict()
  top_result = data["organic_results"][0]["title"]
  print(city + ": " + top_result)

Batch Asynchronous Searches

We offer two ways to boost your searches thanks to the async parameter.

  • Blocking - async=false - more compute intensive, because the client must hold many connections open. (default)
  • Non-blocking - async=true - the way to go for a large number of queries submitted in batches. (recommended)

# Operating system
import os

# regular expression library
import re

# safe queue (named Queue in python2)
from queue import Queue

# Time utility
import time

# SerpApi search
from serpapi import GoogleSearch

# store searches
search_queue = Queue()

# SerpApi search
search = GoogleSearch({
    "location": "Austin,Texas",
    "async": True,
    "api_key": os.getenv("API_KEY")
})

# loop through a list of companies
for company in ['amd', 'nvidia', 'intel']:
    print("execute async search: q = " + company)
    search.params_dict["q"] = company
    result = search.get_dict()
    if "error" in result:
        print("oops error: ", result["error"])
        continue
    print("add search to the queue where id: ", result['search_metadata'])
    # add search to the search_queue
    search_queue.put(result)

print("wait until all search statuses are cached or success")

# process the queue of pending searches
while not search_queue.empty():
    result = search_queue.get()
    search_id = result['search_metadata']['id']

    # retrieve search from the archive - blocker
    print(search_id + ": get search from archive")
    search_archived = search.get_search_archive(search_id)
    print(search_id + ": status = " +
          search_archived['search_metadata']['status'])

    # check status
    if re.search('Cached|Success',
                 search_archived['search_metadata']['status']):
        print(search_id + ": search done with q = " +
              search_archived['search_parameters']['q'])
    else:
        # requeue search_queue
        print(search_id + ": requeue search")
        search_queue.put(result)

        # wait 1s
        time.sleep(1)

print('all searches completed')

This code shows how to run searches asynchronously. The search parameters must include {async: True}. This indicates that the client shouldn't wait for the search to be completed. The call that submits the search is non-blocking, which allows thousands of searches to be executed in seconds; the SerpApi backend does the processing work. The actual search result is deferred to a later call to the search archive using get_search_archive(search_id). In this example, the non-blocking searches are persisted in a queue: search_queue. Looping through search_queue fetches each individual search result. This process can easily be multithreaded to allow a large number of concurrent search requests. To keep things simple, this example processes search results one at a time (single threaded).
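
As a rough sketch of that multithreading suggestion, the archive look-ups from the example above could be spread over a thread pool (reusing the search object and search_queue; retries and status checks are left out for brevity):

from concurrent.futures import ThreadPoolExecutor

def fetch_archived(result):
    # blocking call: retrieve one archived search by its id
    return search.get_search_archive(result['search_metadata']['id'])

# drain the queue, then fetch all archived results concurrently
pending = []
while not search_queue.empty():
    pending.append(search_queue.get())

with ThreadPoolExecutor(max_workers=4) as executor:
    for archived in executor.map(fetch_archived, pending):
        print(archived['search_metadata']['id'] + ": " + archived['search_metadata']['status'])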

See example.

Python object as a result

The search results can be automatically wrapped in a dynamically generated Python object. This offers a more dynamic, fully object-oriented approach compared with the regular Dictionary / JSON data structure.

from serpapi import GoogleSearch
search = GoogleSearch({"q": "Coffee", "location": "Austin,Texas"})
r = search.get_object()
assert type(r.organic_results), list
assert r.organic_results[0].title
assert r.search_metadata.id
assert r.search_metadata.google_url
assert r.search_parameters.q, "Coffee"
assert r.search_parameters.engine, "google"

Pagination using iterator

Let's collect links across multiple search result pages.

import os
from serpapi import GoogleSearch

# to get 2 pages
start = 0
end = 40
page_size = 10

# basic search parameters
parameter = {
  "q": "coca cola",
  "tbm": "nws",
  "api_key": os.getenv("API_KEY"),
  # optional pagination parameter
  #  the pagination method can take argument directly
  "start": start,
  "end": end,
  "num": page_size
}

# as a proof of concept,
# collect the result urls
urls = []

# initialize a search
search = GoogleSearch(parameter)

# create a python generator using parameter
pages = search.pagination()
# or set custom parameter
pages = search.pagination(start, end, page_size)

# fetch one search result per iteration 
# using a basic python for loop 
# which invokes python iterator under the hood.
for page in pages:
  print(f"Current page: {page['serpapi_pagination']['current']}")
  for news_result in page["news_results"]:
    print(f"Title: {news_result['title']}\nLink: {news_result['link']}\n")
    urls.append(news_result['link'])
  
# check if the total number of results is as expected
# note: the exact number varies depending on the search engine backend
if len(urls) == (end - start):
  print("all search results count match!")
if len(urls) == len(set(urls)):
  print("all search results are unique!")

Examples to fetch links with pagination: test file, online IDE

Error management

SerpApi keeps error management very basic.

  • backend service error or search failure
  • client error

If it's a backend error, a simple error message is returned as a string in the server response.

from serpapi import GoogleSearch
search = GoogleSearch({"q": "Coffee", "location": "Austin,Texas", "api_key": "<secret_key>"})
data = search.get_json()
assert data["error"] == None

In some cases, more details are available in the data object.

If it's a client error, a SerpApiClientException is raised.
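
A minimal sketch of handling both cases might look like this (the exception import path is an assumption based on the package layout; adjust it if your version exposes the exception differently):

from serpapi import GoogleSearch
from serpapi.serp_api_client_exception import SerpApiClientException  # assumed import path

try:
    search = GoogleSearch({"q": "Coffee", "location": "Austin,Texas", "api_key": "<secret_key>"})
    data = search.get_json()
    if data.get("error"):
        # backend error: reported as a message string in the response
        print("backend error: " + data["error"])
except SerpApiClientException as error:
    # client error: raised as an exception (see the 2.4.1 change log entry below)
    print("client error: " + str(error))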

Change log

2021-12-22 @ 2.4.1

  • add more search engines
    • youtube
    • walmart
    • apple_app_store
    • naver
  • raise SerpApiClientException instead of a raw string, in order to follow Python 3.5+ guidelines
  • add more unit error tests for serp_api_client

2021-07-26 @ 2.4.0

  • add page size support using num parameter
  • add youtube search engine

2021-06-05 @ 2.3.0

  • add pagination support

2021-04-28 @ 2.2.0

  • add get_response method to provide raw requests.Response object

2021-04-04 @ 2.1.0

  • Add home depot search engine
  • get_object() returns dynamic Python object

2020-10-26 @ 2.0.0

  • Reduce class name to Search
  • Add get_raw_json

2020-06-30 @ 1.8.3

  • simplify import
  • improve package for python 3.5+
  • add support for python 3.5 and 3.6

2020-03-25 @ 1.8

  • add support for Yandex, Yahoo, Ebay
  • clean-up test

2019-11-10 @ 1.7.1

  • increase engine parameter priority over engine value set in the class

2019-09-12 @ 1.7

  • Change namespace from "from lib." to "from serpapi import GoogleSearch"
  • Support for Bing and Baidu

2019-06-25 @ 1.6

  • New search engine supported: Baidu and Bing

Conclusion

SerpApi supports all the major search engines. Google has the most advanced support, with all the major services available: Images, News, Shopping and more. To enable a type of search, the field tbm (to be matched) must be set to:

  • isch: Google Images API.
  • nws: Google News API.
  • shop: Google Shopping API.
  • any other Google service should work out of the box.
  • (no tbm parameter): regular Google search.

The field tbs allows the search to be customized even further.
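
For instance, combining the two fields to get shopping results sorted by review (values taken from the Google Shopping example above):

from serpapi import GoogleSearch

search = GoogleSearch({
    "q": "coffee",
    "tbm": "shop",      # Google Shopping results
    "tbs": "p_ord:rv",  # custom criteria: order by review
    "api_key": "<your secret api key>"
})
data = search.get_dict()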

The full documentation is available here.

Comments
  • Provide a more convenient way to paginate via the Python package

    Provide a more convenient way to paginate via the Python package

    Currently, the way to paginate searches is to get serpapi_pagination.current and increase the offset or start parameters in the loop, just like with regular HTTP requests to serpapi.com/search without an API wrapper.

    import os
    from serpapi import GoogleSearch
    
    params = {
        "engine": "google",
        "q": "coffee",
        "tbm": "nws",
        "api_key": os.getenv("API_KEY"),
    }
    
    search = GoogleSearch(params)
    results = search.get_dict()
    
    print(f"Current page: {results['serpapi_pagination']['current']}")
    
    for news_result in results["news_results"]:
        print(f"Title: {news_result['title']}\nLink: {news_result['link']}\n")
    
    while 'next' in results['serpapi_pagination']:
        search.params_dict[
            "start"] = results['serpapi_pagination']['current'] * 10
        results = search.get_dict()
    
        print(f"Current page: {results['serpapi_pagination']['current']}")
    
        for news_result in results["news_results"]:
            print(
                f"Title: {news_result['title']}\nLink: {news_result['link']}\n"
            )
    

    A more convenient way for an official API wrapper would be to provide some function like search.paginate(callback: Callable) which will properly calculate offset for the specific search engine and loop through pages until the end.

    import os
    from serpapi import GoogleSearch
    
    def print_results(results):
      print(f"Current page: {results['serpapi_pagination']['current']}")
    
      for news_result in results["news_results"]:
        print(f"Title: {news_result['title']}\nLink: {news_result['link']}\n")
    
    params = {
        "engine": "google",
        "q": "coffee",
        "tbm": "nws",
        "api_key": os.getenv("API_KEY"),
    }
    
    search = GoogleSearch(params)
    search.paginate(print_results)
    

    @jvmvik @hartator What do you think?

    enhancement question 
    opened by ilyazub 6
  • How to get

    How to get "related articles" links from google scholar via serpapi?

    I am using SerpApi to fetch Google Scholar papers. There is always a link called "related articles" under each article, but SerpApi doesn't seem to expose a URL to fetch the data behind those links.

    (Screenshots of the Google Scholar page and of the SerpApi result omitted.)

    Can I directly call this URL https://scholar.google.com/scholar?q=related:gemrYG-1WnEJ:scholar.google.com/&scioq=Multi-label+text+classification+with+latent+word-wise+label+information&hl=en&as_sdt=0,21 using serp API?

    opened by monk1337 3
  • Cannot increase the offset between returned results using pagination

    Cannot increase the offset between returned results using pagination

    I am trying to use the pagination feature based on the code at https://github.com/serpapi/google-search-results-python#pagination-using-iterator. I want to request 20 results per API call, but pagination by default iterates by 10 results only instead of 20, meaning my requests end up overlapping.

    I think I have found a solution to this. Looking in the package, the pagination.py file has a Pagination class which takes a page_size variable that changes the size of the offset between returned results. The Pagination class is imported in the serp_api_client.py file within the pagination method starting on line 170 but here the page_size variable wasn't included. I just added page_size = 10 on lines 170 and 174 and now I can use the page_size variable if I call search.pagination(page_size = 20). Can this change be made in the code?

    opened by samuelhaysom 2
  • {'error':'We couldn't find your API Key.'}

    {'error':'We couldn't find your API Key.'}

    from serpapi.google_search_results import GoogleSearchResults

    client = GoogleSearchResults({"q": "coffee", "serp_api_key": "************************"})

    result = client.get_dict()

    I tried giving my API key from serpstack. Yet I am left with this error. Any help would be much appreciated.

    opened by rokintech 2
  • macOS installation issue

    macOS installation issue

    When installing the package via pip it fails.

    Collecting google-search-results
      Using cached https://files.pythonhosted.org/packages/08/eb/38646304d98db83d85f57599d2ccc8caf325961e8792100a1014950197a6/google_search_results-1.5.2.tar.gz
        Complete output from command python setup.py egg_info:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/private/var/folders/3m/91gj9l890y71886_7sfndl3r0000gn/T/pip-install-YVqFKL/google-search-results/setup.py", line 7, in <module>
            with open(path.join(here, 'SHORT_README.rst'), encoding='utf-8') as f:
          File "/usr/local/Cellar/python@2/2.7.15_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 898, in open
            file = __builtin__.open(filename, mode, buffering)
        IOError: [Errno 2] No such file or directory: '/private/var/folders/3m/91gj9l890y71886_7sfndl3r0000gn/T/pip-install-YVqFKL/google-search-results/SHORT_README.rst'
    
        ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/3m/91gj9l890y71886_7sfndl3r0000gn/T/pip-install-YVqFKL/google-search-results/
    

    Running macOS Catalina and Python 2.7

    ~ ❯❯❯ pip --version
    pip 19.0.2 from /usr/local/lib/python2.7/site-packages/pip (python 2.7)
    ~ ❯❯❯ python --version
    Python 2.7.15
    
    opened by igorshtopor 2
  • SerpApiClient.get_search_archive fails with format='html'

    SerpApiClient.get_search_archive fails with format='html'

    SerpApiClient.get_search_archive assumes all results must be loaded as a JSON, so it fails when using format='html'

    GoogleSearchResults({}).get_search_archive(search_id='5df0db57ab3f5837994cd5a1', format='html')
    
    ---------------------------------------------------------------------------
    JSONDecodeError                           Traceback (most recent call last)
    <ipython-input-8-b6d24cb47bf7> in <module>
    ----> 1 GoogleSearchResults({}).get_search_archive(search_id='5df0db57ab3f5837994cd5a1', format='html')
    
    C:\ProgramData\Anaconda3\lib\site-packages\serpapi\serp_api_client.py in get_search_archive(self, search_id, format)
    78             dict|string: search result from the archive
    79         """
    ---> 80         return json.loads(self.get_results("/searches/{0}.{1}".format(search_id, format)))
    81
    82     def get_account(self):
    
    C:\ProgramData\Anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    352             parse_int is None and parse_float is None and
    353             parse_constant is None and object_pairs_hook is None and not kw):
    --> 354         return _default_decoder.decode(s)
    355     if cls is None:
    356         cls = JSONDecoder
    
    C:\ProgramData\Anaconda3\lib\json\decoder.py in decode(self, s, _w)
    337
    338         """
    --> 339         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    340         end = _w(s, end).end()
    341         if end != len(s):
    
    C:\ProgramData\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
    355             obj, end = self.scan_once(s, idx)
    356         except StopIteration as err:
    --> 357             raise JSONDecodeError("Expecting value", s, err.value) from None
    358         return obj, end
    
    JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    
    
    opened by gblazq 2
  • KeyError when Calling Answer Box

    KeyError when Calling Answer Box

    I've attempted to get the results from the answer box using the documentation here.

    I noticed the Playground does not return these results either.

    Is there any way to get this URL also?

    Output Returned when Attempting to Run the Sample Provided:

    from serpapi import GoogleSearch
    
    params = {
      "q": "What's the definition of transparent?",
      "hl": "en",
      "gl": "us",
      "api_key": ""
    }
    
    search = GoogleSearch(params)
    results = search.get_dict()
    answer_box = results['answer_box']
    
    opened by beingskyler 1
  • ImportError: cannot import name 'GoogleSearch' from 'serpapi'

    ImportError: cannot import name 'GoogleSearch' from 'serpapi'

    Hello,

    I am trying to run the example from your GitHub page https://github.com/serpapi/google-search-results-python#batch-asynchronous-searches but I constantly get this error.

    Thanks

    opened by mishra-avinash 1
  • You need a valid browser to continue exploring our API

    You need a valid browser to continue exploring our API

    This is the error message you get when you don't supply a private key. I think information on this site should be provided regarding:

    1. How to get an API key
    2. Is a key free or how much does it cost
    3. Are there limits to using the key (hits/hour or whatever)

    The service provided by the repo is very valuable, but whether I can use it or not depends on the answers to these questions.

    opened by demongolem 1
  • Knowledge Graph object not being sent in response.

    Knowledge Graph object not being sent in response.

    Some queries which return a knowledge graph in both my own google search and when tested in the SerpApi Playground are not returning the 'knowledge_graph' key in my own application.

    Code:

    params = {
        'q': 'Aspen Pumps Ltd',
        'engine': 'google',
        'api_key': <api_key>,
        'num': 100
      }
    
    result_set = GoogleSearchResults(params).get_dict()
    
    print(result_set.keys())
    

    Evaluation:

    dict_keys(['search_metadata', 'search_parameters', 'search_information', 'ads', 'shopping_results', 'organic_results', 'related_searches', 'pagination', 'serpapi_pagination'])
    

    Manual Results (screenshots omitted):

    https://www.google.com

    https://serpapi.com/playground

    opened by lewismazzei 1
  • Connexion issue

    Connexion issue

    Hi, one of the users of my code gets the following error when creating a client (screenshot omitted). I suppose it is machine-settings related, as it doesn't happen to other users. Thanks for helping. P.S. I am fairly new to coding.

    opened by redg25 1
  • [Pagination] Pagination isn't correct and it skips index by one

    [Pagination] Pagination isn't correct and it skips index by one


    Since the start value starts from 0, the correct second page should be 10 and not 11.

    This behaviour is also causing pages to be skipped. Customers are getting confusing results (screenshot omitted).

    Intercom Link. First recognized by @marm123.

    I think this part (screenshot omitted) needs to be replaced by:

    self.client.params_dict['start'] += 0
    

    Whether it would cause any error on other engines is something I don't know. But it may also fix it for every other engine.

    invalid 
    opened by kagermanov27 0
  • Update readme docs

    Update readme docs

    This PR focuses on:

    • fix typos.

    • remove duplicates (screenshot example omitted).

    • text formatting, clarification.

    • remove unnecessary examples, or possibly keep them behind a details disclosure element.

    • add backlinks to some terms.

    • add pagination examples for all available APIs.

    Pagination examples could be placed in the oobt/ folder instead, to serve as "out of the box" tests for all APIs. However, I'm not familiar with it and how things currently work, so I need a little guidance before working on it 🙂

    opened by dimitryzub 1
  • [Google Jobs API] Support for Pagination

    [Google Jobs API] Support for Pagination

    As Google Jobs does not return the serpapi_pagination key but expects the start param for pagination, this iteration of the library does not support pagination for Google Jobs. Pagination support should be added for Google Jobs.

    # stop if backend miss to return serpapi_pagination
    if not 'serpapi_pagination' in result:
      raise StopIteration
    
    # stop if no next page
    if not 'next' in result['serpapi_pagination']:
        raise StopIteration
    


    enhancement 
    opened by aliayar 1
  • Use pagination parameters from SerpApi instead of calculating on the client

    Use pagination parameters from SerpApi instead of calculating on the client

    start and num parameters are not suitable for token-based pagination. Such pagination is used on Google Maps, YouTube, Google Scholar Authors, and other search engines.

    This PR consumes URL query parameters for the next page. It stops paginating when parameters do not change.

    Details: https://github.com/serpapi/google-search-results-python/issues/22

    Some tests are failing because start and num parameters are not supported anymore. These tests will be fixed in the following commits.

    Related: #25, #26

    opened by ilyazub 0
  • how to resolve the Connection aborted error when calling the serpapi

    how to resolve the Connection aborted error when calling the serpapi

    Hi, a new scraper here. In my API call, I get the following error. Would you please let me know if I am doing anything wrong here? Thanks a lot.

    https://serpapi.com/search
    ---------------------------------------------------------------------------
    ConnectionResetError                      Traceback (most recent call last)
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
        676                 headers=headers,
    --> 677                 chunked=chunked,
        678             )
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
        380         try:
    --> 381             self._validate_conn(conn)
        382         except (SocketTimeout, BaseSSLError) as e:
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connectionpool.py in _validate_conn(self, conn)
        977         if not getattr(conn, "sock", None):  # AppEngine might not have  `.sock`
    --> 978             conn.connect()
        979 
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connection.py in connect(self)
        370             server_hostname=server_hostname,
    --> 371             ssl_context=context,
        372         )
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/util/ssl_.py in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data)
        385         if HAS_SNI and server_hostname is not None:
    --> 386             return context.wrap_socket(sock, server_hostname=server_hostname)
        387 
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
        406                          server_hostname=server_hostname,
    --> 407                          _context=self, _session=session)
        408 
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in __init__(self, sock, keyfile, certfile, server_side, cert_reqs, ssl_version, ca_certs, do_handshake_on_connect, family, type, proto, fileno, suppress_ragged_eofs, npn_protocols, ciphers, server_hostname, _context, _session)
        816                         raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
    --> 817                     self.do_handshake()
        818 
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in do_handshake(self, block)
       1076                 self.settimeout(None)
    -> 1077             self._sslobj.do_handshake()
       1078         finally:
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in do_handshake(self)
        688         """Start the SSL/TLS handshake."""
    --> 689         self._sslobj.do_handshake()
        690         if self.context.check_hostname:
    
    ConnectionResetError: [Errno 104] Connection reset by peer
    
    During handling of the above exception, another exception occurred:
    
    ProtocolError                             Traceback (most recent call last)
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
        448                     retries=self.max_retries,
    --> 449                     timeout=timeout
        450                 )
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
        726             retries = retries.increment(
    --> 727                 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
        728             )
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
        409             if read is False or not self._is_method_retryable(method):
    --> 410                 raise six.reraise(type(error), error, _stacktrace)
        411             elif read is not None:
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/packages/six.py in reraise(tp, value, tb)
        733             if value.__traceback__ is not tb:
    --> 734                 raise value.with_traceback(tb)
        735             raise value
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
        676                 headers=headers,
    --> 677                 chunked=chunked,
        678             )
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
        380         try:
    --> 381             self._validate_conn(conn)
        382         except (SocketTimeout, BaseSSLError) as e:
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connectionpool.py in _validate_conn(self, conn)
        977         if not getattr(conn, "sock", None):  # AppEngine might not have  `.sock`
    --> 978             conn.connect()
        979 
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/connection.py in connect(self)
        370             server_hostname=server_hostname,
    --> 371             ssl_context=context,
        372         )
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/urllib3/util/ssl_.py in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data)
        385         if HAS_SNI and server_hostname is not None:
    --> 386             return context.wrap_socket(sock, server_hostname=server_hostname)
        387 
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
        406                          server_hostname=server_hostname,
    --> 407                          _context=self, _session=session)
        408 
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in __init__(self, sock, keyfile, certfile, server_side, cert_reqs, ssl_version, ca_certs, do_handshake_on_connect, family, type, proto, fileno, suppress_ragged_eofs, npn_protocols, ciphers, server_hostname, _context, _session)
        816                         raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
    --> 817                     self.do_handshake()
        818 
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in do_handshake(self, block)
       1076                 self.settimeout(None)
    -> 1077             self._sslobj.do_handshake()
       1078         finally:
    
    /anaconda/envs/azureml_py36/lib/python3.6/ssl.py in do_handshake(self)
        688         """Start the SSL/TLS handshake."""
    --> 689         self._sslobj.do_handshake()
        690         if self.context.check_hostname:
    
    ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
    
    During handling of the above exception, another exception occurred:
    
    ConnectionError                           Traceback (most recent call last)
    <ipython-input-26-45ac328ca8f8> in <module>
          1 question = 'where to get best coffee'
    ----> 2 results = performSearch(question)
    
    <ipython-input-25-5bc778bad4e2> in performSearch(question)
         12 
         13     search = GoogleSearch(params)
    ---> 14     results = search.get_dict()
         15     return results
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/serpapi/serp_api_client.py in get_dict(self)
        101             (alias for get_dictionary)
        102         """
    --> 103         return self.get_dictionary()
        104 
        105     def get_object(self):
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/serpapi/serp_api_client.py in get_dictionary(self)
         94             Dict with the formatted response content
         95         """
    ---> 96         return dict(self.get_json())
         97 
         98     def get_dict(self):
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/serpapi/serp_api_client.py in get_json(self)
         81         """
         82         self.params_dict["output"] = "json"
    ---> 83         return json.loads(self.get_results())
         84 
         85     def get_raw_json(self):
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/serpapi/serp_api_client.py in get_results(self, path)
         68             Response text field
         69         """
    ---> 70         return self.get_response(path).text
         71 
         72     def get_html(self):
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/serpapi/serp_api_client.py in get_response(self, path)
         57             url, parameter = self.construct_url(path)
         58             print(url)
    ---> 59             response = requests.get(url, parameter, timeout=self.timeout)
         60             return response
         61         except requests.HTTPError as e:
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/requests/api.py in get(url, params, **kwargs)
         73     """
         74 
    ---> 75     return request('get', url, params=params, **kwargs)
         76 
         77 
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/requests/api.py in request(method, url, **kwargs)
         59     # cases, and look like a memory leak in others.
         60     with sessions.Session() as session:
    ---> 61         return session.request(method=method, url=url, **kwargs)
         62 
         63 
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
        540         }
        541         send_kwargs.update(settings)
    --> 542         resp = self.send(prep, **send_kwargs)
        543 
        544         return resp
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/requests/sessions.py in send(self, request, **kwargs)
        653 
        654         # Send the request
    --> 655         r = adapter.send(request, **kwargs)
        656 
        657         # Total elapsed time of the request (approximately)
    
    /anaconda/envs/azureml_py36/lib/python3.6/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
        496 
        497         except (ProtocolError, socket.error) as err:
    --> 498             raise ConnectionError(err, request=request)
        499 
        500         except MaxRetryError as e:
    
    ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
    
    opened by redBirdTx 2
  • google scholar pagination skips result 20

    google scholar pagination skips result 20

    When retrieving results from Google Scholar using the pagination() method, the first article on the second page of Google Scholar is always missing.

    I think this is caused by the following snippet in the update() method of google-search-results-python/serpapi/pagination.py:

    def update(self):
            self.client.params_dict["start"] = self.start
            self.client.params_dict["num"] = self.num
            if self.start > 0:
                self.client.params_dict["start"] += 1
    

    This seems to mean that for all pages except the first, paginate increases start by 1. So while for the first page it requests results starting at 0 and ending at 19 (if page_size=20), for the second page it requests results starting at 21 and ending at 40, skipping result 20.

    If I delete the if statement, the code seems to work as intended and I get result 19 back.
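
    For reference, a sketch of that change (based on the snippet above; not an official fix) would simply pass the offset through unchanged:

    def update(self):
        # keep the raw offset so the second page starts exactly at start + num
        self.client.params_dict["start"] = self.start
        self.client.params_dict["num"] = self.num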

    opened by samuelhaysom 2