Backend for the Autocomplete platform, an AI-assisted coding platform.

Overview

Introduction

A custom predictor allows you to deploy your own prediction implementation, useful when the existing serving implementations don't fit your needs. If you are migrating from Cortex, the custom predictor works exactly the same way as the PythonPredictor does in Cortex. Most PythonPredictors can be converted to a custom predictor by copying the code and renaming some variables.

The custom predictor is packaged as a Docker container. It is recommended, but not required, to keep large model files outside of the container image itself and to load them from a storage volume; this example follows that pattern. You will need somewhere to publish your Docker image once it is built. This example uses Docker Hub, where storing public images is free and private images are cheap. Google Container Registry and other registries can also be used.

Make sure you use a GPU-enabled Docker image as a base, and that you enable GPU support when loading the model.
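
To make the conversion concrete, below is a minimal sketch of what such a predictor could look like. It is written against the kfserving Python SDK; the class name, checkpoint path, and PyTorch usage are illustrative assumptions, not code from this repository:

    # predictor.py -- a minimal sketch, not the actual implementation.
    from typing import Dict

    import kfserving
    import torch

    class SentimentPredictor(kfserving.KFModel):
        def __init__(self, name: str):
            super().__init__(name)
            self.device = None
            self.model = None

        def load(self):
            # Use the GPU when available; fall back to CPU otherwise.
            self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
            # Hypothetical checkpoint path; the storage volume is mounted at /models.
            self.model = torch.load("/models/sentiment/model.pt", map_location=self.device)
            self.model.eval()
            self.ready = True

        def predict(self, request: Dict) -> Dict:
            # TensorFlow V1 HTTP API shape: {"instances": [...]} in,
            # {"predictions": [...]} out.
            instances = request["instances"]
            with torch.no_grad():
                predictions = [
                    self.model(torch.as_tensor(x).to(self.device)).tolist()
                    for x in instances
                ]
            return {"predictions": predictions}

    if __name__ == "__main__":
        model = SentimentPredictor("sentiment")
        model.load()
        kfserving.KFServer(workers=1).start([model])

When the container runs this file, KFServer exposes the model under /v1/models/sentiment:predict, the same endpoint format you'll see in the deployment steps below.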

Getting Started

After installing kubectl and adding your CoreWeave Cloud access credentials, the following steps will deploy the Inference Service. Clone this repository, enter this example's folder, and execute all commands from there; we'll be using all of the files.

Sign up for a Docker Hub account, or use a different container registry if you already have one. The free plan works fine, but your container images will be accessible to anyone. This guide assumes a private registry that requires authentication. Once signed up, create a new repository; for the rest of the guide, we'll assume the repository is named gpt-6b.

Build the Docker image

  1. Enter the custom-predictor directory, then build and push the Docker image. No modifications to any of the files are needed to follow along, but replace thotailtd with your own Docker Hub username. The default Docker tag is latest; we strongly discourage using it, as containers are cached on the nodes and in other parts of the CoreWeave stack. Once you have pushed to a tag, do not push to that tag again. Below, we use a simple version tag, v1alpha1, for the first iteration of the image.
    export DOCKER_USER=thotailtd
    docker build -t $DOCKER_USER/gpt-6b:v1alpha1 .
    docker push $DOCKER_USER/gpt-6b:v1alpha1

Set up repository access

  1. Create a Secret containing your Docker Hub credentials. The Secret will be named docker-hub and is used by the nodes to pull your private image. Refer to the Kubernetes documentation for more details.

    kubectl create secret docker-registry docker-hub --docker-server=https://index.docker.io/v1/ --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>
  2. Tell Kubernetes to use the newly created Secret by patching the ServiceAccount for your namespace to reference it (the patch file adds the Secret under the ServiceAccount's imagePullSecrets).

    kubectl patch serviceaccounts default --patch "$(cat image-secrets-serviceaccount.patch.yaml)"

Download the model

As we don't want to bundle the model in the Docker image for performance reasons, a storage volume needs to be set up and the pre-trained model downloaded to it. Storage volumes are allocated using a Kubernetes PersistentVolumeClaim. We'll also deploy a simple container that we can use to copy files to our newly created volume.

  1. Apply the PersistentVolumeClaim and the manifest for the sleep container.

    $ kubectl apply -f model-storage-pvc.yaml
    persistentvolumeclaim/model-storage created
    $ kubectl apply -f sleep-deployment.yaml
    deployment.apps/sleep created
  2. The volume is mounted at /models inside the sleep container. Download the pre-trained model locally, create a directory for it in the shared volume, and upload it there. The name of the sleep Pod is assigned to a variable using kubectl; you can also get the name with kubectl get pods.

    Note: the model will be hosted on Amazon S3 soon; for now, it is uploaded directly to the CoreWeave volume.
    
    export SLEEP_POD=$(kubectl get pod -l "app.kubernetes.io/name=sleep" -o jsonpath='{.items[0].metadata.name}')
    kubectl exec -it $SLEEP_POD -- sh -c 'mkdir /models/sentiment'
    kubectl cp ./sleep_383500 $SLEEP_POD:/models/sentiment/
  3. (Optional) Instead of copying the model from the local filesystem, the model can be downloaded from Amazon S3. The AWS CLI utilities already exist in the sleep container. A Python alternative using boto3 is sketched after this step.

    $ export SLEEP_POD=$(kubectl get pod -l "app.kubernetes.io/name=sleep" -o jsonpath='{.items[0].metadata.name}')
    $ kubectl exec -it $SLEEP_POD -- sh
    # aws configure
    # mkdir /models/sentiment
    # aws s3 sync s3://thot-ai-models /models/sentiment/
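
If you'd rather script the transfer than work in the interactive shell above, the same download can be driven from Python with boto3. This is a hedged sketch, assuming the bucket name used above and already-configured credentials; boto3 has no one-call equivalent of aws s3 sync, so it walks the bucket listing:

    # s3_download.py -- a boto3 alternative to the aws CLI commands above
    # (an assumption, not part of this repository).
    import os

    import boto3

    BUCKET = "thot-ai-models"
    DEST = "/models/sentiment"

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):  # skip folder placeholder objects
                continue
            target = os.path.join(DEST, key)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(BUCKET, key, target)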

Deploy the model

  1. Modify sentiment-inferenceservice.yaml to reference your Docker image.

  2. Apply the resources. kubectl apply can be used both to create and to update existing manifests.

     $ kubectl apply -f sentiment-inferenceservice.yaml
     inferenceservice.serving.kubeflow.org/sentiment configured
  3. List the Pods to verify that the Predictor has launched successfully. This can take a minute; wait for the READY column to show 2/2.

    $ kubectl get pods
    NAME                                                           READY   STATUS    RESTARTS   AGE
    sentiment-predictor-default-px8xk-deployment-85bb6787d7-h42xk  2/2     Running   0          34s

    If the predictor fails to initialize, look in the logs for clues: kubectl logs sentiment-predictor-default-px8xk-deployment-85bb6787d7-h42xk kfserving-container

  4. Once all the Pods are running, we can get the API endpoint for our model. The API endpoints follow the TensorFlow V1 HTTP API.

    $ kubectl get inferenceservices
    NAME        URL                                                                          READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
    sentiment   http://sentiment.tenant-test.knative.chi.coreweave.com/v1/models/sentiment   True    100                                23h

    The URL in the output is the public API URL for your newly deployed model. An HTTPS endpoint is also available; however, it bypasses any canary deployments. Retrieve it with kubectl get ksvc.

  5. Run a test prediction against the URL from above. Remember to append the :predict suffix.

     $ curl -d @sample.json http://sentiment.tenant-test.knative.chi.coreweave.com/v1/models/sentiment:predict
    {"predictions": ["positive"]}
  6. Remove the InferenceService. This deletes all the associated resources except your model storage and the sleep Deployment.

    $ kubectl delete inferenceservices sentiment
    inferenceservice.serving.kubeflow.org "sentiment" deleted