Python SDK for building, training, and deploying ML models

Overview

Overview of Kubeflow Fairing

Kubeflow Fairing is a Python package that streamlines the process of building, training, and deploying machine learning (ML) models in a hybrid cloud environment. By using Kubeflow Fairing and adding a few lines of code, you can run your ML training job locally or in the cloud, directly from Python code or a Jupyter notebook. After your training job is complete, you can use Kubeflow Fairing to deploy your trained model as a prediction endpoint.

Use Kubeflow Fairing SDK

To install the SDK:

pip install kubeflow-fairing

To get started quickly, run the end-to-end (E2E) MNIST sample.

Documentation

To learn how Kubeflow Fairing streamlines the process of training and deploying ML models in the cloud, read the Kubeflow Fairing documentation.

For the Kubeflow Fairing SDK API reference, read the HTML documentation.

Comments
  • Azure backend support

    What this PR does / why we need it:

    This PR implements fairing.backends.KubeflowAzureBackend, which stores the build context in Azure Storage and uses Azure Container Registry as the Docker registry.

    Special notes for your reviewer:

    1. An experimental CI is being worked on at https://dev.azure.com/kubeflow/kubeflow/_build with the intent of running Azure-specific integration tests there.

    lgtm approved ok-to-test size/XL 
    opened by vjrantal 36
  • Enhance the running time info in the debug model

    The current debug output is too simple: the timestamp format is confusing, and the location of the file that emitted the message is not shown. For example:

    [I 200108 13:36:53 config:131] Using preprocessor: <kubeflow.fairing.preprocessors.base.BasePreProcessor object at 0x1017af9b0>

    After this change the output is more meaningful, for example:

    INFO|2020-01-08 13:18:33|/Users/llhu/Library/Python/3.7/lib/python/site-packages/werkzeug/_internal.py|122| * Running on http://127.0.0.1:8080/


    size/M 
    opened by xauthulei 29
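
The target format in the PR above (level, timestamp, file path, line number, and message, separated by "|") can be approximated with the standard library alone. This is a minimal sketch, not the PR's actual code: the format string is inferred from the sample line, and the helper make_logger and the logger name fairing-demo are hypothetical.

```python
import io
import logging

# Assumed format, inferred from the sample line in the PR description:
# LEVEL|YYYY-MM-DD HH:MM:SS|/path/to/file.py|lineno| message
LOG_FORMAT = "%(levelname)s|%(asctime)s|%(pathname)s|%(lineno)d| %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def make_logger(name, stream):
    """Return a logger that writes pipe-delimited records to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger("fairing-demo", buffer)
log.info("Using preprocessor: %s", "BasePreProcessor")
print(buffer.getvalue(), end="")
```

Using %(pathname)s rather than %(module)s is what makes the emitting file's full location visible in each record.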
  • Add annotations param to TfJob

    What this PR does / why we need it: Adds an annotations parameter to TfJob so that annotations can be passed through to the Job.

    Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #

    Special notes for your reviewer:

    1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

    Release note:

    NONE
    

    lgtm approved size/XS ok-to-test 
    opened by dalfos 29
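
The PR above does not show how the annotations are applied, so the following is only a sketch of one plausible shape: merging a dict of annotations into the pod template of every replica in a TFJob manifest. The function add_pod_annotations is hypothetical and not part of Fairing, and the placement under template.metadata.annotations is an assumption.

```python
import copy


def add_pod_annotations(tfjob, annotations):
    """Return a copy of a TFJob dict with annotations merged into each replica's pod template."""
    patched = copy.deepcopy(tfjob)  # leave the caller's manifest untouched
    for replica in patched["spec"]["tfReplicaSpecs"].values():
        metadata = replica["template"].setdefault("metadata", {})
        metadata.setdefault("annotations", {}).update(annotations)
    return patched


tfjob = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "TFJob",
    "metadata": {"name": "mnist-training"},
    "spec": {
        "tfReplicaSpecs": {
            "Worker": {"replicas": 2, "template": {"spec": {"containers": []}}},
        }
    },
}
patched = add_pod_annotations(tfjob, {"sidecar.istio.io/inject": "false"})
print(patched["spec"]["tfReplicaSpecs"]["Worker"]["template"]["metadata"])
```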
  • Add document for Fairing APIs

    /kind feature

    Describe the solution you'd like: Kubeflow Fairing should have basic documentation for the Fairing APIs, so that users can read the documentation when calling the APIs instead of reading the code.

    Anything else you would like to add:

    kind/feature 
    opened by jinchihe 21
  • Adding a configMap builder and architecture

    This adds a configMap builder and architecture that allow ConfigMaps to carry the notebook code during distributed runs. This avoids having to build images and can therefore help distributed training during early exploration.


    size/L needs-ok-to-test 
    opened by ashahab 19
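
To illustrate the idea of shipping code in a ConfigMap instead of an image, here is a minimal sketch that packs source files into a ConfigMap manifest. It is not the PR's implementation: the function build_code_configmap and the ConfigMap name notebook-code are hypothetical. The size check reflects the 1 MiB limit Kubernetes places on ConfigMap data, which is why this approach suits small exploratory code rather than large projects.

```python
import json
import tempfile
from pathlib import Path

# Kubernetes rejects ConfigMaps whose data exceeds 1 MiB.
MAX_CONFIGMAP_BYTES = 1024 * 1024


def build_code_configmap(name, namespace, files):
    """Build a ConfigMap manifest (as a dict) holding the given source files."""
    data = {Path(f).name: Path(f).read_text(encoding="utf-8") for f in files}
    total = sum(len(v.encode("utf-8")) for v in data.values())
    if total > MAX_CONFIGMAP_BYTES:
        raise ValueError(f"{total} bytes of code will not fit in a ConfigMap")
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": data,
    }


# Demo: write a tiny script to a temporary directory and pack it.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "train.py"
    script.write_text("print('training')\n", encoding="utf-8")
    manifest = build_code_configmap("notebook-code", "default", [script])

print(json.dumps(manifest, indent=2))
```

A manifest like this could then be mounted into each worker pod as a volume, so the base image stays unchanged between code edits.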
  • Implement AWSBackend and S3Context

    This patch enables kubeflow-fairing to run in an AWS environment (EKS, S3, ECR, ...), for example:

    import sys

    import fairing
    from fairing import TrainJob
    from fairing.backends import KubeflowAWSBackend

    # Replace {account_id} and {region} with the values of your ECR registry.
    DOCKER_REGISTRY = '{account_id}.dkr.ecr.{region}.amazonaws.com'
    # Use a base image that matches the local Python version.
    PY_VERSION = ".".join(str(x) for x in sys.version_info[0:3])
    BASE_IMAGE = 'python:{}'.format(PY_VERSION)

    # HousingServe is the user's model class; it is not defined in this snippet.
    train_job = TrainJob(HousingServe, BASE_IMAGE,
                         input_files=['ames_dataset/train.csv', 'requirements.txt'],
                         docker_registry=DOCKER_REGISTRY, backend=KubeflowAWSBackend())
    train_job.submit()
    

    Fixes #282


    lgtm approved size/L ok-to-test 
    opened by takmatsu 18
  • Fix bug in predict in examples.

    What this PR does / why we need it: Fix bugs in the function predict in examples

    Which issue(s) this PR fixes: Fixes #258

    /assign @jlewi /assign @zhenghuiwang


    lgtm approved size/M 
    opened by abcdefgs0324 17
  • update kfserving version

    What this PR does / why we need it:

    The kfserving SDK version was changed to be consistent with the kfserving release (0.2.1). To avoid confusing users, the kfserving community decided to drop the old version and publish a new one aligned with the current kfserving release. The new kfserving SDK release is therefore 0.2.1.0, and this PR updates fairing accordingly.


    lgtm approved size/XS 
    opened by jinchihe 16
  • Fix for failing Training Job on py 3.8+

    I've encountered an issue similar to the one described in https://github.com/kubeflow/fairing/issues/452. It occurs when the user combines the function preprocessor, a cloudpickle version that contains cloudpickle_fast.py, and Python 3.8+.

    The root cause is that newer versions of cloudpickle contain cloudpickle_fast.py, and the preprocessor does not copy that file during image preparation.

    This PR therefore adds that file alongside the rest of the cloudpickle package.

    By the way, what is the reason for copying that package into the image? Wouldn't it be better to copy an egg there, for example, and install it on startup?

    lgtm approved size/S ok-to-test 
    opened by izapolsk 15
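
The failure mode in the PR above comes from copying a fixed set of files from a package that later gained a new module. One way to make such a copy step robust, sketched here as an assumption and not as the PR's actual change, is to discover every source file in the installed package. The helper package_source_files is hypothetical; the standard-library json package stands in for cloudpickle so the sketch runs without extra installs.

```python
import importlib.util
from pathlib import Path


def package_source_files(package_name):
    """Return every .py file that belongs to an installed package, sorted by path."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ValueError(f"{package_name!r} is not an importable package")
    files = []
    for location in spec.submodule_search_locations:
        files.extend(Path(location).rglob("*.py"))
    return sorted(files)


# `json` stands in for cloudpickle here.
files = package_source_files("json")
names = {f.name for f in files}
print(sorted(names))
```

Because the file list is computed from the package directory, a module added in a newer release is picked up without any change to the copy logic.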
  • Enhance kfserving integration

    What this PR does / why we need it: Enhances the kfserving integration by moving to the new version of kfserving. The CRD has been changed to InferenceService, and Fairing is updated accordingly.

    Which issue(s) this PR fixes: Fixes #

    Special notes for your reviewer:

    1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

    Release note:

    
    

    lgtm approved size/L 
    opened by jinchihe 15
  • support kfserving in fairing

    What this PR does / why we need it: Support KFServing in Fairing.

    Which issue(s) this PR fixes : Fixes #290

    Special notes for your reviewer:

    1. The integration tests for kfserving are not included, because the automation platform presumably does not have kfserving installed. I tested most cases on an on-prem cluster, and tests can be added later once kfserving is installed on the test platform.

    2. I updated the log() function in fairing/kubernetes/manager.py to fit the implementation: since there are two containers in the pod, we need to specify which container's logs to fetch.

    Release note:

    Support KFServing to deploy model service
    

    lgtm approved size/L 
    opened by jinchihe 15
  • run the mnist example on AWS

    I'm trying to run mnist_e2e_on_prem.ipynb to train a model on AWS, but the step that creates the PV/PVC requires msrestazure to be installed. Does this example only support Azure? How should I configure it to run on AWS or in a local environment?

    kind/bug 
    opened by guanghui0607 0
  • OverflowError: string longer than 2147483647 bytes

    [W 220617 06:49:11 append:81] Uploading registry.cn-hangzhou.aliyuncs.com/rory602/fairing-job:37DF66B9
    [I 220617 06:49:12 docker_session_:284] Layer sha256:bcfb845ebce120ef8d7b6c3a1600b7b18de2cff8a66ddc57004034bc7f74ce80 pushed.
    [I 220617 06:49:39 docker_session_:284] Layer sha256:793b1b0c3ddfb6c3de731b9a027b6997bd786ff769648045a8f5a428a1406f5e pushed.
    [I 220617 06:49:40 docker_session_:284] Layer sha256:b62b8281eb2dcbcf3f4ac5199f7d729747bb256dd5104f44fd9d0fe5fe939353 pushed.
    [I 220617 06:49:40 docker_session_:284] Layer sha256:9b829c73b52b92b97d5c07a54fb0f3e921995a296c714b53a32ae67d19231fcd pushed.
    [I 220617 06:49:41 docker_session_:284] Layer sha256:700a07047b6b1612c1dc2a9ce93ec19913d129206585879749e8fe177f567fa9 pushed.
    [I 220617 06:49:46 docker_session_:284] Layer sha256:fcb6d5f7c98604476fda91fe5f61be5b56fdc398814fb15f7ea998f53023e774 pushed.
    [I 220617 06:50:25 docker_session_:284] Layer sha256:cb5b7ae361722f070eca53f35823ed21baa85d61d5d95cd5a95ab53d740cdd56 pushed.
    [I 220617 06:50:39 docker_session_:284] Layer sha256:35b0d149a82cb315c7d490f54f67bb122de76a15124c6524128fab47cba63bbd pushed.
    [I 220617 06:53:43 docker_session_:284] Layer sha256:6494e4811622b31c027ccac322ca463937fd805f569a93e6f15c01aade718793 pushed.
    [I 220617 06:53:57 docker_session_:284] Layer sha256:0e29546d541cdbd309281d21a73a9d1db78665c1b95b74f32b009e0b77a6e1e3 pushed.
    [I 220617 07:14:21 docker_session_:284] Layer sha256:6f9f74896dfa93fe0172f594faba85e0b4e8a0481a0fefd9112efc7e4d3c78f7 pushed.
    [E 220617 07:15:07 docker_session_:332] Error during upload of: registry.cn-hangzhou.aliyuncs.com/rory602/fairing-job:37DF66B9
    ^CTraceback (most recent call last):
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/runpy.py", line 192, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/jovyan/.local/share/code-server/extensions/ms-python.python-2021.5.926500501/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
        cli.main()
      File "/home/jovyan/.local/share/code-server/extensions/ms-python.python-2021.5.926500501/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
        run()
      File "/home/jovyan/.local/share/code-server/extensions/ms-python.python-2021.5.926500501/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
        runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/runpy.py", line 262, in run_path
        return _run_module_code(code, init_globals, run_name,
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/runpy.py", line 95, in _run_module_code
        _run_code(code, mod_globals, init_globals,
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/jovyan/vol-1/fairing/helloworld.py", line 19, in <module>
        remote_train()
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/kubeflow/fairing/config.py", line 163, in ret_fn
        self.run()
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/kubeflow/fairing/config.py", line 140, in run
        builder.build()
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/kubeflow/fairing/builders/append/append.py", line 58, in build
        self.timed_push(transport, src, new_img, dst)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/kubeflow/fairing/builders/append/append.py", line 96, in timed_push
        self._push(transport, src, img, dst)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/kubeflow/fairing/builders/append/append.py", line 82, in _push
        session.upload(img)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/containerregistry/client/v2_2/docker_session.py", line 321, in upload
        future.result()
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/concurrent/futures/_base.py", line 432, in result
        return self.__get_result()
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
        raise self.exception
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/containerregistry/client/v2_2/docker_session.py", line 283, in _upload_one
        self._put_blob(image, digest)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/containerregistry/client/v2_2/docker_session.py", line 206, in _put_blob
        self._patch_upload(image, digest)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/containerregistry/client/v2_2/docker_session.py", line 165, in _patch_upload
        resp, unused_content = self.transport.Request(
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/containerregistry/client/v2_2/docker_http.py", line 385, in Request
        resp, content = self.transport.request(
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/containerregistry/transport/transport_pool.py", line 61, in request
        return transport.request(*args, **kwargs)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/httplib2/__init__.py", line 1701, in request
        (response, content) = self._request(
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/httplib2/__init__.py", line 1421, in _request
        (response, content) = self._conn_request(conn, request_uri, method, body, headers)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/site-packages/httplib2/__init__.py", line 1344, in _conn_request
        conn.request(method, request_uri, body, headers)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/http/client.py", line 1230, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/http/client.py", line 1276, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/http/client.py", line 1225, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/http/client.py", line 1043, in _send_output
        self.send(chunk)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/http/client.py", line 965, in send
        self.sock.sendall(data)
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/ssl.py", line 1204, in sendall
        v = self.send(byte_view[count:])
      File "/home/jovyan/.conda/envs/py38/lib/python3.8/ssl.py", line 1173, in send
        return self._sslobj.write(data)
    OverflowError: string longer than 2147483647 bytes

    opened by Rory602 0
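
The traceback above ends in ssl.py, where a single write of more than 2147483647 bytes (2**31 - 1) raises OverflowError on the Python 3.8 interpreter shown. One possible workaround, sketched here under the assumption that the caller controls how the body is sent, is to hand the socket smaller slices. The names iter_chunks and CHUNK_SIZE are hypothetical and are not part of fairing or containerregistry.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB, far below the 2**31 - 1 byte limit


def iter_chunks(data, chunk_size=CHUNK_SIZE):
    """Yield successive zero-copy slices of `data`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(data)  # slicing a memoryview does not copy the payload
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


# Demo with a tiny payload and a tiny chunk size.
payload = bytes(range(10))
chunks = [bytes(c) for c in iter_chunks(payload, chunk_size=4)]
print(chunks)
```

In a real upload, each slice would be passed to the socket's sendall() in a loop so that no single TLS write exceeds the limit.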
  • add config size information in the append builder to support image pull in a podman environment

    What this PR does / why we need it: When an image built by the append builder is pulled with podman, a blob size mismatch error occurs. This is caused by a config size mismatch in the manifest.

    The current append builder only appends a new layer and updates the config digest in the manifest, so the config size in the newly built image's manifest stays the same as that of the base image. This does not seem to be a problem when pulling the image with docker, but pulling it with podman fails with a size mismatch error.

    This PR adds code that writes the config size into the manifest.

    Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #568

    Special notes for your reviewer:

    Release note:

    
    
    size/XS 
    opened by choismn00 1
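
The fix described in the PR above amounts to keeping two manifest fields in sync with the new config blob: its digest and its size. A minimal sketch of that bookkeeping follows; it is not the PR's code. The function update_manifest_config and the sample values are hypothetical, and the manifest layout assumes the Docker image manifest v2, schema 2 format.

```python
import hashlib
import json


def update_manifest_config(manifest, config_bytes):
    """Return a copy of `manifest` whose config digest and size match `config_bytes`."""
    updated = json.loads(json.dumps(manifest))  # deep copy of plain JSON data
    updated["config"]["digest"] = "sha256:" + hashlib.sha256(config_bytes).hexdigest()
    updated["config"]["size"] = len(config_bytes)
    return updated


base_manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "size": 7023,  # stale value inherited from the base image
        "digest": "sha256:" + "0" * 64,
    },
    "layers": [],
}
# Stand-in for the config blob written after a new layer is appended.
new_config = json.dumps({"architecture": "amd64", "os": "linux"}).encode("utf-8")
manifest = update_manifest_config(base_manifest, new_config)
print(manifest["config"])
```

Updating the digest without the size reproduces the reported situation: the manifest points at the right blob but advertises the wrong length, which a strict client can reject.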
  • Image built using append builder cannot be pulled using podman

    /kind bug

    What steps did you take and what happened:

    I built a simple MNIST ML model image using the append builder. When I try to pull this image with podman, I get an "Error: writing blob: blob size mismatch" error.

    Pulling the same image with docker pull works fine.

    What did you expect to happen: For the image to be pulled with podman without any error.

    Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

    Environment:

    • Fairing version: (use python -c "import kubeflow.fairing; print(kubeflow.fairing.__version__)"): 0.5.3
    • Kubeflow version: (version number can be found at the bottom left corner of the Kubeflow dashboard):
    • Minikube version:
    • Kubernetes version: (use kubectl version): 1.19.4
    • OS (e.g. from /etc/os-release):

    NOTE: If you are using fair from master, please provide us the git commit hash.

    kind/bug 
    opened by choismn00 0
  • Adapt latest kfserving and training operator

    What this PR does / why we need it:

    Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #563 ,fixes #565

    Fixes the problem that the dependencies are too old.

    approved size/L 
    opened by 631068264 2
  • pip install does not finish, and ImportError: cannot import name 'ServeRequest' from 'ray.serve.utils' in mnist e2e

    /kind bug

    What steps did you take and what happened: I ran pip install kubeflow-fairing.

    The installation kept running for a long time, repeatedly trying to install the same packages at different versions.

    I finally worked around this with pip install kubeflow-fairing --use-deprecated=legacy-resolver.

    Then I ran the MNIST E2E example script:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import uuid
    
    import yaml
    from kubeflow import fairing
    from kubeflow.fairing.kubernetes.utils import mounting_pvc
    from kubernetes import client as k8s_client
    from kubernetes import config as k8s_config
    
    DOCKER_REGISTRY = '10.19.64.203:8080'
    my_namespace = 'kserve-test'
    
    num_chief = 1  # number of Chief in TFJob
    num_ps = 1  # number of PS in TFJob
    num_workers = 2  # number of Worker in TFJob
    model_dir = "/mnt"
    export_path = "/mnt/export"
    train_steps = "1000"
    batch_size = "100"
    learning_rate = "0.01"
    
    pvc_name = 'mnist-pvc'
    pvc_yaml = f'''
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: {pvc_name}
      namespace: {my_namespace}
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: ""
      resources:
        requests:
          storage: 10Gi
    '''
    
    k8s_config.load_kube_config()
    
    k8s_core_api = k8s_client.CoreV1Api()
    # k8s_core_api.create_persistent_volume(yaml.safe_load(pv_yaml))
    k8s_core_api.create_namespaced_persistent_volume_claim(my_namespace, yaml.safe_load(pvc_yaml))
    
    tfjob_name = f'mnist-training-{uuid.uuid4().hex[:4]}'
    
    output_map = {
        "Dockerfile": "Dockerfile",
        "mnist.py": "mnist.py"
    }
    
    command = ["python",
               "/opt/mnist.py",
               "--tf-model-dir=" + model_dir,
               "--tf-export-dir=" + export_path,
               "--tf-train-steps=" + train_steps,
               "--tf-batch-size=" + batch_size,
               "--tf-learning-rate=" + learning_rate]
    
    fairing.config.set_preprocessor('python', command=command, path_prefix="/app", output_map=output_map)
    fairing.config.set_builder(name='docker', registry=DOCKER_REGISTRY,
                               image_name="mnist", dockerfile_path="Dockerfile")
    
    fairing.config.set_deployer(name='tfjob', namespace=my_namespace, stream_log=False, job_name=tfjob_name,
                                chief_count=num_chief, worker_count=num_workers, ps_count=num_ps,
                                pod_spec_mutators=[mounting_pvc(pvc_name=pvc_name, pvc_mount_path=model_dir)])
    fairing.config.run()
    
    

    What did you expect to happen:

    Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

    Environment:

    • Fairing version: (use python -c "import kubeflow.fairing; print(kubeflow.fairing.__version__)"):
    (.env) ➜  kubeflow git:(master) ✗ python -c "import kubeflow.fairing; print(kubeflow.fairing.__version__)"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kubeflow/fairing/__init__.py", line 2, in <module>
        from kubeflow.fairing.ml_tasks.tasks import TrainJob, PredictionEndpoint
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kubeflow/fairing/ml_tasks/tasks.py", line 4, in <module>
        from kubeflow.fairing.backends import KubernetesBackend
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kubeflow/fairing/backends/__init__.py", line 1, in <module>
        from kubeflow.fairing.backends.backends import *
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kubeflow/fairing/backends/backends.py", line 8, in <module>
        from kubeflow.fairing.builders.cluster import gcs_context
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kubeflow/fairing/builders/cluster/gcs_context.py", line 6, in <module>
        from kubeflow.fairing.kubernetes.manager import client, KubeManager
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kubeflow/fairing/kubernetes/manager.py", line 6, in <module>
        from kfserving import KFServingClient
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kfserving/__init__.py", line 16, in <module>
        from kfserving.kfmodel import KFModel
      File "/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/kfserving/kfmodel.py", line 24, in <module>
        from ray.serve.utils import ServeRequest
    ImportError: cannot import name 'ServeRequest' from 'ray.serve.utils' (/Users/wyx/union_workspce/kubeflow/.env/lib/python3.9/site-packages/ray/serve/utils.py)
    
    (.env) ➜  kubeflow git:(master) ✗ pip list |grep kube                                                     
    kubeflow-fairing               1.0.2
    kubeflow-pytorchjob            0.1.3
    kubeflow-tfjob                 0.1.3
    kubernetes                     10.0.1
    
    
    • Kubeflow version: (version number can be found at the bottom left corner of the Kubeflow dashboard):
    dev_local
    
    • Kubernetes version: (use kubectl version): k3s Kubernetes 1.19
    • OS (e.g. from /etc/os-release): core run on osx
      k3s on centos7

    NOTE: If you are using fair from master, please provide us the git commit hash.

    kind/bug 
    opened by 631068264 0
Releases(v1.0.2)
Owner
Kubeflow
Kubeflow is an open, community driven project to make it easy to deploy and manage an ML stack on Kubernetes