Pythonic search engine based on PyLucene.

A. Coady

Last update: Jan 2, 2023

Related tags

Search search-engine lucene pylucene fastapi strawberry-graphql

Overview

Lupyne is a search engine based on PyLucene, the Python extension for accessing Java Lucene. Lucene is a relatively low-level toolkit, and PyLucene wraps it through automatic code generation. So although Java idioms are translated to Python idioms where possible, the resulting interface is far from Pythonic. See ./docs/examples.ipynb for comparisons with the Lucene API.

Lupyne also provides GraphQL and RESTful search services, based on Starlette. Note that Solr and Elasticsearch are popular options for Lucene-based search if no further (Python) customization is needed. So while the services are suitable for production usage, their primary motivation is to be an extensible example.

Not having to choose up front between an embedded library and a server not only provides greater flexibility, it can also provide better performance, e.g., batch indexing offline and remote searching live. Additionally, only lightweight wrappers with extended behavior are used wherever possible, so falling back to using PyLucene directly is always an option, but should never be necessary for performance.

Usage

PyLucene requires initializing the VM.

import lucene

lucene.initVM()

Indexes are accessed through an IndexSearcher (read-only), IndexWriter, or the combined Indexer.

from lupyne import engine

searcher = engine.IndexSearcher('index/path')
hits = searcher.search('text:query')

See ./lupyne/services/README.md for services usage.

Installation

% pip install lupyne[graphql,rest]

PyLucene is not pip installable.

Install instructions
Docker image: docker pull coady/pylucene
Homebrew formula: brew install coady/tap/pylucene

Dependencies

PyLucene >=8
strawberry-graphql >=0.84.4 (if graphql option)
fastapi (if rest option)

Tests

100% branch coverage.

% pytest [--cov]

Changes

dev

PyLucene >=8.6 required
PyLucene 8.11 supported
CherryPy server removed

2.5

Python >=3.7 required
PyLucene 8.6 supported
CherryPy server deprecated

2.4

PyLucene >=8 required
Hit.keys renamed to Hit.sortkeys

2.3

PyLucene >=7.7 required
PyLucene 8 supported

2.2

PyLucene 7.6 supported

2.1

PyLucene >=7 required

2.0

PyLucene >=6 required
Python 3 support
client moved to external package

1.9

Python 2.6 dropped
PyLucene 4.8 and 4.9 dropped
IndexWriter implements context manager
Server DocValues updated via patch method
Spatial tile search optimized

1.8

PyLucene 4.10 supported
PyLucene 4.6 and 4.7 dropped
Comparator iteration optimized
Support for string based FieldCacheRangeFilters

Comments

Combining Querys with BooleanQuerys
Hi @coady, thanks for all your hard work on lupyne, it's been super helpful for me! I used your Dockerfile as a basis for compiling JCC & PyLucene to wheel files in my own non-Docker environment, and now I've been able to successfully run some of the examples, set up my own 14 GB corpus, index it to a directory, and do some basic searches based on the examples you provided in the docs.

Right now I'm trying to write a slightly more complex query, but was having some trouble and hoping you might be able to point me in the right direction.

I have a fairly simple index that has 4 stored fields: a text field containing the article text, a text field containing the name of the company (the list of company names is finite and each document is associated with exactly one company), a datetime field that contains the date the article was published, and an article id.

I'm trying to write a query that does the following: find all documents that contain the phrase "lupyne is great", were published within some arbitrary date range, and have a company_name field value of 'company a', 'company b', or 'company c'.

I've tried the following:

import lucene
from lupyne import engine
from datetime import date

assert lucene.getVMEnv() or lucene.initVM()

index_path: str = r'myindexdir'
query_str: str = 'lupyne is great'
start_date: date = date(year=2020, month=2, day=14)
companies: [str] = ['company a', 'company b', 'company c']

indexer = engine.Indexer(index_path, mode='r', nrt=True)
indexer.set('article_id', stored=True)
indexer.set('company_name', stored=True)
indexer.set('date', engine.DateTimeField, stored=True)
indexer.set('text', engine.Field.Text, stored=True)
query_engine = engine.Query

# The following works with the query string 'lupyne'
query_str: str = 'lupyne'
query = indexer.fields['date'].range(start_date, None) & query_engine.term('text', query_str)

# This does not with the query_string 'lupyne is great',
query_str: str = 'lupyne is great'
query = indexer.fields['date'].range(start_date, None) & query_engine.phrase('text', query_str)
# TypeError: unsupported operand type(s) for &: 'Query' and 'MultiPhraseQuery'

# This also does not work
range_query = query_engine.range('date', date_field.timestamp(start_date), None)
# java.lang.IncompatibleClassChangeError
#   at org.apache.lucene.util.BytesRef.<init>(BytesRef.java:84)

# This will also break
range_query = query_engine.range('date', start_date, None)
# lucene.InvalidArgsError: (<class 'org.apache.lucene.util.BytesRef'>, '__init__', (datetime.date(2021, 2, 2),))

Any suggestions on how I might go about this? Thanks again for all the hard work!

EDIT: So, it looks like this might be because Query.ranges() doesn't return a lupyne Query object as seen here, but instead directly returns a pylucene query object. Any good way to get around this?
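
If the hypothesis in the edit above is right, the TypeError is plain Python operator dispatch: `&` only works when at least one operand defines `__and__` or `__rand__` for the other. The sketch below is a toy model of that behavior. `RawQuery` and `WrappedQuery` are hypothetical stand-ins invented for illustration; they are not lupyne or PyLucene classes, and the query strings are only labels.

```python
class RawQuery:
    """Hypothetical stand-in for a query object that defines no Python operators."""

    def __init__(self, text):
        self.text = text


class WrappedQuery(RawQuery):
    """Hypothetical stand-in for a wrapper that adds boolean operators."""

    def __and__(self, other):
        # wrapper on the left: require both clauses
        return WrappedQuery(f"+({self.text}) +({other.text})")

    def __rand__(self, other):
        # wrapper on the right: Python falls back to the reflected method
        return WrappedQuery(f"+({other.text}) +({self.text})")


# Two raw objects: neither side knows how to handle '&'.
try:
    RawQuery("a") & RawQuery("b")
except TypeError as exc:
    error = str(exc)

# One wrapped operand is enough for the expression to resolve.
raw = RawQuery("date:[2020-02-14 TO *]")
wrapped = WrappedQuery('text:"lupyne is great"')
combined = raw & wrapped
print(error)
print(combined.text)
```

Under that assumption, the practical workaround is to make sure at least one operand is the wrapper type before combining, or to build the boolean query through whatever explicit constructor the library provides.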
opened by ZeroCool2u 7
Bump actions/setup-python from 3 to 4
Bumps actions/setup-python from 3 to 4.

Release notes

Sourced from actions/setup-python's releases.

v4.0.0

What's Changed

Support for python-version-file input: #336

Example of usage:

- uses: actions/setup-python@v4
  with:
    python-version-file: '.python-version' # Read python version from a file
- run: python my_script.py

There is no default python version for this setup-python major version, the action requires to specify either python-version input or python-version-file input. If the python-version input is not specified the action will try to read required version from file from python-version-file input.

Use pypyX.Y for PyPy python-version input: #349

Example of usage:

- uses: actions/setup-python@v4
  with:
    python-version: 'pypy3.9' # pypy-X.Y kept for backward compatibility
- run: python my_script.py

RUNNER_TOOL_CACHE environment variable is equal AGENT_TOOLSDIRECTORY: #338

Bugfix: create missing pypyX.Y symlinks: #347

PKG_CONFIG_PATH environment variable: #400

Added python-path output: #405 python-path output contains Python executable path.

Updated zeit/ncc to vercel/ncc package: #393

Bugfix: fixed output for prerelease version of poetry: #409

Made pythonLocation environment variable consistent for Python and PyPy: #418

Bugfix for 3.x-dev syntax: #417

Other improvements: #318 #396 #384 #387 #388

Update actions/cache version to 2.0.2

In scope of this release we updated actions/cache package as the new version contains fixes related to GHES 3.5 (actions/setup-python#382)

Add "cache-hit" output and fix "python-version" output for PyPy

This release introduces new output cache-hit (actions/setup-python#373) and fix python-version output for PyPy (actions/setup-python#365)

The cache-hit output contains boolean value indicating that an exact match was found for the key. It shows that the action uses already existing cache or not. The output is available only if cache is enabled.

... (truncated)

Commits

d09bd5e fix: 3.x-dev can install a 3.y version (#417)

f72db17 Made env.var pythonLocation consistent for Python and PyPy (#418)

53e1529 add support for python-version-file (#336)

3f82819 Fix output for prerelease version of poetry (#409)

397252c Update zeit/ncc to vercel/ncc (#393)

de977ad Merge pull request #412 from vsafonkin/v-vsafonkin/fix-poetry-cache-test

22c6af9 Change PyPy version to rebuild cache

081a3cf Merge pull request #405 from mayeut/interpreter-path

ff70656 feature: add a python-path output

fff15a2 Use pypyX.Y for PyPy python-version input (#349)

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

dependencies github_actions
opened by dependabot[bot] 2
Bump codecov/codecov-action from 2 to 3
Bumps codecov/codecov-action from 2 to 3.

Release notes

Sourced from codecov/codecov-action's releases.

v3.0.0

Breaking Changes

#689 Bump to node16 and small fixes

Features

#688 Incorporate gcov arguments for the Codecov uploader

Dependencies

#548 build(deps-dev): bump jest-junit from 12.2.0 to 13.0.0

#603 [Snyk] Upgrade @actions/core from 1.5.0 to 1.6.0

#628 build(deps): bump node-fetch from 2.6.1 to 3.1.1

#634 build(deps): bump node-fetch from 3.1.1 to 3.2.0

#636 build(deps): bump openpgp from 5.0.1 to 5.1.0

#652 build(deps-dev): bump @vercel/ncc from 0.30.0 to 0.33.3

#653 build(deps-dev): bump @types/node from 16.11.21 to 17.0.18

#659 build(deps-dev): bump @types/jest from 27.4.0 to 27.4.1

#667 build(deps): bump actions/checkout from 2 to 3

#673 build(deps): bump node-fetch from 3.2.0 to 3.2.3

#683 build(deps): bump minimist from 1.2.5 to 1.2.6

#685 build(deps): bump @actions/github from 5.0.0 to 5.0.1

#681 build(deps-dev): bump @types/node from 17.0.18 to 17.0.23

#682 build(deps-dev): bump typescript from 4.5.5 to 4.6.3

#676 build(deps): bump @actions/exec from 1.1.0 to 1.1.1

#675 build(deps): bump openpgp from 5.1.0 to 5.2.1

v2.1.0

2.1.0

Features

#515 Allow specifying version of Codecov uploader

Dependencies

#499 build(deps-dev): bump @vercel/ncc from 0.29.0 to 0.30.0

#508 build(deps): bump openpgp from 5.0.0-5 to 5.0.0

#514 build(deps-dev): bump @types/node from 16.6.0 to 16.9.0

v2.0.3

2.0.3

Fixes

#464 Fix wrong link in the readme

#485 fix: Add override OS and linux default to platform

Dependencies

#447 build(deps): bump openpgp from 5.0.0-4 to 5.0.0-5

#458 build(deps-dev): bump eslint from 7.31.0 to 7.32.0

#465 build(deps-dev): bump @typescript-eslint/eslint-plugin from 4.28.4 to 4.29.1

#466 build(deps-dev): bump @typescript-eslint/parser from 4.28.4 to 4.29.1

#468 build(deps-dev): bump @types/jest from 26.0.24 to 27.0.0

#470 build(deps-dev): bump @types/node from 16.4.0 to 16.6.0

#472 build(deps): bump path-parse from 1.0.6 to 1.0.7

#473 build(deps-dev): bump @types/jest from 27.0.0 to 27.0.1

... (truncated)

Changelog

Sourced from codecov/codecov-action's changelog.

3.0.0

Breaking Changes

#689 Bump to node16 and small fixes

Features

#688 Incorporate gcov arguments for the Codecov uploader

Dependencies

#548 build(deps-dev): bump jest-junit from 12.2.0 to 13.0.0

#603 [Snyk] Upgrade @actions/core from 1.5.0 to 1.6.0

#628 build(deps): bump node-fetch from 2.6.1 to 3.1.1

#634 build(deps): bump node-fetch from 3.1.1 to 3.2.0

#636 build(deps): bump openpgp from 5.0.1 to 5.1.0

#652 build(deps-dev): bump @vercel/ncc from 0.30.0 to 0.33.3

#653 build(deps-dev): bump @types/node from 16.11.21 to 17.0.18

#659 build(deps-dev): bump @types/jest from 27.4.0 to 27.4.1

#667 build(deps): bump actions/checkout from 2 to 3

#673 build(deps): bump node-fetch from 3.2.0 to 3.2.3

#683 build(deps): bump minimist from 1.2.5 to 1.2.6

#685 build(deps): bump @actions/github from 5.0.0 to 5.0.1

#681 build(deps-dev): bump @types/node from 17.0.18 to 17.0.23

#682 build(deps-dev): bump typescript from 4.5.5 to 4.6.3

#676 build(deps): bump @actions/exec from 1.1.0 to 1.1.1

#675 build(deps): bump openpgp from 5.1.0 to 5.2.1

2.1.0

Features

#515 Allow specifying version of Codecov uploader

Dependencies

#499 build(deps-dev): bump @vercel/ncc from 0.29.0 to 0.30.0

#508 build(deps): bump openpgp from 5.0.0-5 to 5.0.0

#514 build(deps-dev): bump @types/node from 16.6.0 to 16.9.0

2.0.3

Fixes

#464 Fix wrong link in the readme

#485 fix: Add override OS and linux default to platform

Dependencies

#447 build(deps): bump openpgp from 5.0.0-4 to 5.0.0-5

#458 build(deps-dev): bump eslint from 7.31.0 to 7.32.0

#465 build(deps-dev): bump @typescript-eslint/eslint-plugin from 4.28.4 to 4.29.1

#466 build(deps-dev): bump @typescript-eslint/parser from 4.28.4 to 4.29.1

#468 build(deps-dev): bump @types/jest from 26.0.24 to 27.0.0

#470 build(deps-dev): bump @types/node from 16.4.0 to 16.6.0

#472 build(deps): bump path-parse from 1.0.6 to 1.0.7

#473 build(deps-dev): bump @types/jest from 27.0.0 to 27.0.1

#478 build(deps-dev): bump @typescript-eslint/parser from 4.29.1 to 4.29.2

#479 build(deps-dev): bump @typescript-eslint/eslint-plugin from 4.29.1 to 4.29.2

... (truncated)

Commits

e3c5604 Merge pull request #689 from codecov/feat/gcov

174efc5 Update package-lock.json

6243a75 bump to 3.0.0

0d6466f Bump to node16

d4729ee fetch.default

351baf6 fix: bash

d8cf680 Merge pull request #675 from codecov/dependabot/npm_and_yarn/openpgp-5.2.1

b775e90 Merge pull request #676 from codecov/dependabot/npm_and_yarn/actions/exec-1.1.1

2ebc2f0 Merge pull request #682 from codecov/dependabot/npm_and_yarn/typescript-4.6.3

8e2ef2b Merge pull request #681 from codecov/dependabot/npm_and_yarn/types/node-17.0.23

Additional commits viewable in compare view

dependencies github_actions
opened by dependabot[bot] 2

Example Failed

It seems that the example fails:

lucene.initVM()
indexer = engine.Indexer()
indexer.set('name', stored=True)
indexer.set('text')
indexer.add(name='sample', text='hello world')
indexer.commit()

Will raise:

Traceback (most recent call last):
  File "index.py", line 78, in <module>
    indexer.set('text')
  File "/Users/Nasy/.pyenv/versions/3.8.0/lib/python3.8/site-packages/lupyne/engine/indexers.py", line 546, in set
    field = self.fields[name] = cls(name, **settings)
  File "/Users/Nasy/.pyenv/versions/3.8.0/lib/python3.8/site-packages/lupyne/engine/documents.py", line 61, in __init__
    assert self.stored or self.indexed or self.docvalues or self.dimensions
AssertionError

opened by nasyxx 2

Bump github/codeql-action from 1 to 2
Bumps github/codeql-action from 1 to 2.

Changelog

Sourced from github/codeql-action's changelog.

2.1.8 - 08 Apr 2022

Update default CodeQL bundle version to 2.8.5. #1014

Fix error where the init action would fail due to a GitHub API request that was taking too long to complete #1025

2.1.7 - 05 Apr 2022

A bug where additional queries specified in the workflow file would sometimes not be respected has been fixed. #1018

2.1.6 - 30 Mar 2022

[v2+ only] The CodeQL Action now runs on Node.js v16. #1000

Update default CodeQL bundle version to 2.8.4. #990

Fix a bug where an invalid commit_oid was being sent to code scanning when a custom checkout path was being used. #956

Commits

2c03704 Allow the version of the ML-powered pack to depend on the CLI version

dd6b592 Simplify ML-powered query status report definition

a90d8bf Merge pull request #1011 from github/henrymercer/ml-powered-queries-pr-check

dc0338e Use latest major version of actions/upload-artifact

57096fe Add a PR check to validate that ML-powered queries are run correctly

b0ddf36 Merge pull request #1012 from github/henrymercer/update-actions-major-versions

1ea2f2d Merge branch 'main' into henrymercer/update-actions-major-versions

9dcc141 Merge pull request #1010 from github/henrymercer/stop-running-ml-powered-quer...

ea751a9 Update other Actions from v2 to v3

a2949f4 Update actions/checkout from v2 to v3

Additional commits viewable in compare view

dependencies github_actions
opened by dependabot[bot] 1
Bump actions/checkout from 2 to 3
Bumps actions/checkout from 2 to 3.

Release notes

Sourced from actions/checkout's releases.

v3.0.0

Update default runtime to node16

v2.4.0

Convert SSH URLs like org-<ORG_ID>@github.com: to https://github.com/ - pr

v2.3.5

Update dependencies

v2.3.4

Add missing awaits

Swap to Environment Files

v2.3.3

Remove Unneeded commit information from build logs

Add Licensed to verify third party dependencies

v2.3.2

Add Third Party License Information to Dist Files

v2.3.1

Fix default branch resolution for .wiki and when using SSH

v2.3.0

Fallback to the default branch

v2.2.0

Fetch all history for all tags and branches when fetch-depth=0

v2.1.1

Changes to support GHES (here and here)

v2.1.0

Group output

Changes to support GHES alpha release

Persist core.sshCommand for submodules

Add support ssh

Convert submodule SSH URL to HTTPS, when not using SSH

Add submodule support

Follow proxy settings

Fix ref for pr closed event when a pr is merged

Fix issue checking detached when git less than 2.22

Changelog

Sourced from actions/checkout's changelog.

Changelog

v2.3.1

Fix default branch resolution for .wiki and when using SSH

v2.3.0

Fallback to the default branch

v2.2.0

Fetch all history for all tags and branches when fetch-depth=0

v2.1.1

Changes to support GHES (here and here)

v2.1.0

Group output

Changes to support GHES alpha release

Persist core.sshCommand for submodules

Add support ssh

Convert submodule SSH URL to HTTPS, when not using SSH

Add submodule support

Follow proxy settings

Fix ref for pr closed event when a pr is merged

Fix issue checking detached when git less than 2.22

v2.0.0

Do not pass cred on command line

Add input persist-credentials

Fallback to REST API to download repo

Commits

a12a394 update readme for v3 (#708)

8f9e05e Update to node 16 (#689)

230611d Change secret name for PAT to not start with GITHUB_ (#623)

See full diff in compare view

dependencies github_actions
opened by dependabot[bot] 1
Could anyone help with a simple example?

Hello, I am new to search engines and lupyne. I want to use a search engine for a simple task: given a query, search through all documents and return the relevant ones ranked by BM25 score. How can I do that? I tried the examples in the docs, but how can I assign BM25 as the scoring function? And should I use different settings, such as tokenization, when searching different languages? Sorry for taking your time! Thanks!
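
For orientation, here is a minimal sketch of what a BM25 score computes, written in plain Python and independent of Lupyne and Lucene. The function name `bm25_scores`, the whitespace tokenization, and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for illustration. A real engine uses a language-aware analyzer instead of `split()`, which is where per-language settings would come in.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with the classic Okapi BM25 formula."""
    # naive tokenization: lowercase and split on whitespace (an assumption)
    tokenized = [doc.lower().split() for doc in docs]
    avg_len = sum(len(tokens) for tokens in tokenized) / len(tokenized)
    # document frequency: in how many documents each term occurs
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = counts[term]
            if tf == 0:
                continue  # a term absent from the document contributes nothing
            n = doc_freq[term]
            idf = math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))
            # length normalization: longer documents need more matches
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["hello world", "hello lucene search engine", "unrelated text"]
scores = bm25_scores("lucene search", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the second document contains the query terms, so it is the only one with a non-zero score.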

opened by Hannibal046 1
indexer.commit gets stuck when using multiprocess

When indexer.commit() is run in a separate process (multiprocessing), the commit tends to get stuck. I've tried attachCurrentThread() as well, but it doesn't seem to work.

Is there any way to use multiprocessing along with lupyne?

Following is the code:

import lucene
from lupyne import engine
lucene.initVM()
#assert lucene.getVMEnv() or lucene.initVM()
from multiprocessing import Process

#vm_env = lucene.initVM(vmargs=['-Djava.awt.headless=true'])
#from org.apache.lucene import analysis, document, index, queryparser, search, store, util
class testd:
    def idx(self):
        #lucene.getVMEnv().attachCurrentThread()
        print("init")
        indexer = engine.Indexer()
        indexer.set('fieldname', stored=True)  # settings for all documents of indexer; indexed and tokenized is the default
        indexer.add(fieldname="sample_test")
        print("Trying to commit")
        indexer.commit()
        print("done")

if __name__ == '__main__':
    #testd().idx()
    p = Process(target=testd().idx)
    p.start()
    p.join()

opened by khasa3 1

Overriding dict.keys() with Hit.keys breaks Hit object displaying in IPython

First of all, thanks for your efforts in providing a high-level Lucene Python library! I really appreciate that I can almost completely omit Java-related code in my library.

I'm experimenting with the library in IPython and have problems displaying a Hit object:

In [101]: print(type(h))
<class 'lupyne.engine.documents.Hit'>

In [102]: print(repr(h))                      
{'LEMMA': ['кошка'], 'LEMMA_LANGUAGE': ['RU'], 'POS': ['n']}

In [103]: h                                   
Out[103]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/.pyenv/versions/3.6.9/envs/babelnet-lite-3.6.9/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)                                                                                                                                                   
    703             printer.flush()                                                                                                                                                       
    704             return stream.getvalue()                                                                                                                                              
                                                                                                                                                                                          
~/.pyenv/versions/3.6.9/envs/babelnet-lite-3.6.9/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)                                                                   
    383                 if cls in self.type_pprinters:                                                                                                                                    
    384                     # printer registered in self.type_pprinters                                                                                                                   
--> 385                     return self.type_pprinters[cls](obj, self, cycle)                                                                                                             
    386                 else:                                                                                                                                                             
    387                     # deferred printer                                                                                                                                            
                                                                                                                                                                                          
~/.pyenv/versions/3.6.9/envs/babelnet-lite-3.6.9/lib/python3.6/site-packages/IPython/lib/pretty.py in inner(obj, p, cycle)                                                                
    606         step = len(start)                                                                                                                                                         
    607         p.begin_group(step, start)                                                                                                                                                
--> 608         keys = obj.keys()                                                                                                                                                         
    609         # if dict isn't large enough to be truncated, sort keys before displaying                                                                                                 
    610         # From Python 3.7, dicts preserve order by definition, so we don't sort.                                                                                                  
                                                                                                                                                                                          
TypeError: 'tuple' object is not callable

I think the problem is that the Hit object is an instance of dict, so IPython tries to pretty-print it as a dict, but when it calls Hit.keys() a TypeError occurs because you've overridden keys with a tuple. I suggest renaming Hit.keys to Hit.keys_ to fix that and to follow the principle of least astonishment.
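
The failure can be reproduced without Lupyne or IPython. The sketch below uses two hypothetical dict subclasses, `BrokenHit` and `FixedHit`, invented for illustration; they are not the library's classes. The second stores the tuple under the name `sortkeys`, which is the name the 2.4 changelog entry reports for the rename.

```python
class BrokenHit(dict):
    """Hypothetical dict subclass whose instance attribute shadows dict.keys()."""

    def __init__(self, fields, sortkeys=()):
        super().__init__(fields)
        self.keys = tuple(sortkeys)  # hides the inherited keys() method


class FixedHit(dict):
    """Same data, but the tuple lives under a name that dict does not use."""

    def __init__(self, fields, sortkeys=()):
        super().__init__(fields)
        self.sortkeys = tuple(sortkeys)


broken = BrokenHit({"POS": ["n"]}, sortkeys=(1.0,))
try:
    broken.keys()  # what any dict-aware pretty printer would call
except TypeError as exc:
    message = str(exc)

fixed = FixedHit({"POS": ["n"]}, sortkeys=(1.0,))
fixed_keys = list(fixed.keys())
print(message)
print(fixed_keys, fixed.sortkeys)
```

Note that `repr(broken)` still works, which matches the report: only code that calls `keys()` on the object breaks.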

opened by rominf 1

Why Lupyne?

Hello,

While looking around for how to run PyLucene, I stumbled upon your Docker image for PyLucene and eventually ended up here. I'm curious why you wrote Lupyne. Is it to provide a more Pythonic interface to Lucene? Why should one use Lupyne over PyLucene? The lack of documentation on PyLucene makes me feel like only a handful of people are actually using PyLucene...

Thanks, Sep

opened by seperman 1
Bump actions/setup-python from 2 to 3
Bumps actions/setup-python from 2 to 3.

Release notes

Sourced from actions/setup-python's releases.

v3.0.0

What's Changed

Update default runtime to node16 (actions/setup-python#340)

Update package-lock.json file version to 2, @types/node to 16.11.25 and typescript to 4.2.3 (actions/setup-python#341)

Remove legacy pypy2 and pypy3 keywords (actions/setup-python#342)

Breaking Changes

With the update to Node 16, all scripts will now be run with Node 16 rather than Node 12.

This new major release removes support of legacy pypy2 and pypy3 keywords. Please use more specific and flexible syntax to specify a PyPy version:

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version:
        - 'pypy-2.7' # the latest available version of PyPy that supports Python 2.7
        - 'pypy-3.8' # the latest available version of PyPy that supports Python 3.8
        - 'pypy-3.8-v7.3.8' # Python 3.8 and PyPy 7.3.8
    steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v3
      with:
        python-version: ${{ matrix.python-version }}

See more usage examples in the documentation

Update primary and restore keys for pip

In scope of this release we include a version of python in restore and primary cache keys for pip. Besides, we add temporary fix for Windows caching issue, that the pip cache dir command returns non zero exit code or writes to stderr. Moreover we updated node-fetch dependency.

Update actions/cache version to 1.0.8

We have updated actions/cache dependency version to 1.0.8 to support 10GB cache upload

Support caching dependencies

This release introduces dependency caching support (actions/setup-python#266)

Caching dependencies.

The action has a built-in functionality for caching and restoring pip/pipenv dependencies. The cache input is optional, and caching is turned off by default.

Besides, this release introduces dependency caching support for mono repos and repositories with complex structure.

By default, the action searches for the dependency file (requirements.txt for pip or Pipfile.lock for pipenv) in the whole repository. Use the cache-dependency-path input for cases when you want to override current behaviour and use different file for hash generation (for example requirements-dev.txt). This input supports wildcards or a list of file names for caching multiple dependencies.

Caching pip dependencies:

steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
  with:
    python-version: '3.9'

... (truncated)

Commits

0ebf233 Remove legacy PyPy input (#342)

665cd78 Update lockfileversion (#341)

93cb78f Update to node16 (#340)

See full diff in compare view

Owner

A. Coady

GitHub https://coady.github.io/lupyne/

Senginta is All in one Search Engine Scrapper for used by API or Python Module. It's Free!

Senginta is All in one Search Engine Scrapper. With traditional scrapping, Senginta can be powerful to get result from any Search Engine, and convert to Json. Now support only for Google Product Search Engine (GShop, GVideo and many too) and Baidu Search Engine.

33 Nov 21, 2022

Google Search Engine Results Pages (SERP) in locally, no API key, no signup required

Local SERP Google Search Engine Results Pages (SERP) in locally, no API key, no signup required Make sure the chromedriver and required package are in

4 Jun 29, 2021

Simple algorithm search engine like google in python using function

Mini-Search-Engine-Like-Google I have created the simple algorithm search engine like google in python using function. I am matching every word with w

5 Sep 24, 2021

A sentence search engine that fetches examples from trusted news/media organisations. Great for writing better English.

A sentence search engine that fetches examples from trusted news/media websites. Great for improving writing & speaking better English.

1 Apr 4, 2022

A simple search engine that allow searching for chess games

A simple search engine that allow searching for chess games based on queries about opening names & opening moves. Built with Python 3.10 and python-chess.

1 Jun 17, 2022

Pythonic Lucene - A simplified Python implementation of Apache Lucene

A simplified Python implementation of Apache Lucene, which may help in understanding how an enterprise search engine really works.

2 Sep 12, 2022

Search emails from a domain through search engines

EmailFinder - search emails through Search Engines

155 Dec 30, 2022

GitScanner is a script to make it easy to search for Exposed Git through an advanced Google search.

GitScanner Legal disclaimer Usage of GitScanner for attacking targets without prior mutual consent is illegal. It is the end user's responsibility to

3 Oct 28, 2022

A fast, efficient Python package for searching and getting search results with many different search engines

search A fast, efficient Python package for searching and getting search results with many different search engines. Installation To install the pack

0 Oct 6, 2022

Reverse-ikea-image-search - A simple image of ikea search using jina.ai

IKEA Reverse Image Search This is a demo project to fetch ikea product images(IK

4 Mar 8, 2022

Modular search for Django

Haystack Author: Daniel Lindsley Date: 2013/07/28 Haystack provides modular search for Django. It features a unified, familiar API that allows you to

3.4k Jan 4, 2023

Full text search for flask.

flask-msearch Installation To install flask-msearch: pip install flask-msearch # when MSEARCH_BACKEND = "whoosh" pip install whoosh blinker # when MSE

197 Dec 29, 2022

Jina allows you to build deep learning-powered search-as-a-service in just minutes

Cloud-native neural search framework for any kind of data

17k Dec 31, 2022

document organizer with tags and full-text-search, in a simple and clean sqlite3 schema

152 Oct 29, 2022

A web search server for ParlAI, including Blenderbot2.

Description A web search server for ParlAI, including Blenderbot2. Querying the server: The server reacting correctly: Uses html2text to strip the mar

119 Jan 6, 2023

This project is a sample demo of Arxiv search related to AI/ML Papers built using Streamlit, sentence-transformers and Faiss.

49 Oct 30, 2022

Google Project: Search and auto-complete sentences within given input text files, manipulating data with complex data-structures.

Auto-Complete Google Project In this project there is an implementation for one feature of Google's search engines - AutoComplete. Autocomplete, or wo

10 Jun 20, 2022

Full-text multi-table search application for Django. Easy to install and use, with good performance.

django-watson django-watson is a fast multi-model full-text search plugin for Django. It is easy to install and use, and provides high quality search

1.1k Jan 3, 2023

rclip - AI-Powered Command-Line Photo Search Tool

rclip is a command-line photo search tool based on the awesome OpenAI's CLIP neural network.

394 Dec 12, 2022
