Wagtail CLIP
Wagtail CLIP allows you to search your Wagtail images using natural language queries.
This project was inspired by, and draws heavily from, memery by deepfates et al. It makes use of OpenAI's CLIP model (it was very nice of them to open source it, cheers).
An example project is available here.
Installation
You can install this package as follows (requires Git):
pip install \
wagtailclip@git+https://github.com/MattSegal/wagtail-clip.git \
-f https://download.pytorch.org/whl/torch_stable.html \
torch==1.7.1+cpu \
torchvision==0.8.2+cpu
You will find that this installs ~200MB of deep learning libraries (PyTorch). You will also need to update your Django project's settings:
INSTALLED_APPS = [
# ... whatever ...
"wagtailclip",
]
# A place to store ~330MB of downloaded model parameters
WAGTAIL_CLIP_DOWNLOAD_PATH = "/clip"
# Maximum number of search results
WAGTAIL_CLIP_MAX_IMAGE_SEARCH_RESULTS = 256
# A unique name for the search backend.
WAGTAIL_CLIP_SEARCH_BACKEND_NAME = "clip"
# Recommended model, or you can roll your own (read the source).
WAGTAILIMAGES_IMAGE_MODEL = "wagtailclip.NaturalSearchImage"
# Add the search backend.
WAGTAILSEARCH_BACKENDS = {
# ... whatever ...
WAGTAIL_CLIP_SEARCH_BACKEND_NAME: {
"BACKEND": "wagtailclip.search.CLIPSearchBackend",
},
}
That's enough to get started. However, if you want to pre-download the ~330MB of model parameters, you can run this management command:
./manage.py download_clip
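Once the backend is configured, you would typically query it through Wagtail's standard search backend API. The sketch below is illustrative only, assuming the settings above; the query text and the exact call are assumptions rather than documented usage:

from wagtail.images import get_image_model
from wagtail.search.backends import get_search_backend

# Look up the backend registered under WAGTAIL_CLIP_SEARCH_BACKEND_NAME ("clip" above)
# and the custom image model.
backend = get_search_backend("clip")
images = get_image_model().objects.all()

# Run a natural language query (the query string is just an example).
results = backend.search("a dog playing in the snow", images)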
How it works
This package wraps the CLIP model, which can be used for:
- encoding text into 1x512 float vectors
- encoding images into 1x512 float vectors
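To make that concrete, here is a minimal sketch of the encoding step using OpenAI's clip package directly. It is illustrative only; the model name, device and image path are assumptions, not necessarily what this package uses internally:

import clip
import torch
from PIL import Image

# Load the CLIP model and its matching image preprocessing pipeline.
model, preprocess = clip.load("ViT-B/32", device="cpu")

with torch.no_grad():
    # Encode some text into a 1x512 float vector.
    tokens = clip.tokenize(["a dog playing in the snow"])
    text_vector = model.encode_text(tokens)  # shape: (1, 512)

    # Encode an image into a 1x512 float vector ("dog.jpg" is a placeholder path).
    image = preprocess(Image.open("dog.jpg")).unsqueeze(0)
    image_vector = model.encode_image(image)  # shape: (1, 512)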
These vectors can be thought of as points in a 512-dimensional space, where the closer two points are to each other, the more "related" they are. Importantly, CLIP encodes both text and images into the same space, meaning that we can:
- encode all Wagtail images into vectors and store them in the database
- encode a user's search query text into a vector; and then
- compare the search query vector with all the image vectors
This comparison is done with a dot product, which gives a similarity score for each image; the operation is performed in plain Python. Once we have the similarity scores, we pick the top N (say, 256) most similar images and return those as the results.
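As a rough sketch of that scoring step (using NumPy here purely for illustration; the variable names are assumptions, not this package's actual implementation):

import numpy as np

# query_vector: the encoded search text; image_vectors: one row per stored image encoding.
query_vector = np.random.rand(512).astype("float32")
image_vectors = np.random.rand(2048, 512).astype("float32")

# A dot product of the query against every image vector gives one similarity score per image.
scores = image_vectors @ query_vector  # shape: (2048,)

# Keep the indices of the top N most similar images, best first.
top_n = 256
best_indices = np.argsort(scores)[::-1][:top_n]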
Will this scale?
Haha probably not. I've tested my naive implementation on up to 2048 images and it runs OK (~3s / query). There are specialized Postgres extensions and vector similarity databases that you can use if you want to do this for tens of thousands of images.
Contributing
If you want to help out, make a pull request and/or email me at [email protected], or DM me on Twitter. It's probably better to talk to me first before writing a bunch of code.