# CLIP (Contrastive Language–Image Pre-training)
## Experiments (Evaluation)

| Model | Dataset | Acc (%) |
|---|---|---|
| ViT-B/32 (paper) | CIFAR100 | 65.1 |
| ViT-B/32 (ours) | CIFAR100 | 61.71 |
| ViT-B/32 (paper) | CIFAR10 | 91.3 |
| ViT-B/32 (ours) | CIFAR10 | 88.8 |
## Overview

A reimplementation of CLIP, with zero-shot evaluation on CIFAR10/CIFAR100 and training on the Visual Genome dataset.
## Training

- Work in progress
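
While the training code is a work in progress, the objective from the CLIP paper — a symmetric contrastive (InfoNCE) loss over matched image/text pairs — can be sketched as below. This is an illustrative NumPy sketch, not this repository's actual implementation:

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def clip_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched image/text pairs share a row index."""
    # L2-normalize so the dot product is cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = image_emb @ text_emb.T / temperature  # (N, N) similarity matrix
    # Correct pairs lie on the diagonal; average cross-entropy in both directions
    loss_i = -np.diag(log_softmax(logits, axis=1)).mean()  # image -> text
    loss_t = -np.diag(log_softmax(logits, axis=0)).mean()  # text -> image
    return (loss_i + loss_t) / 2
```

Pulling correct pairs together and pushing mismatched pairs apart in both directions is what lets the same encoders be reused for zero-shot classification at evaluation time.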
## Usage

- Evaluation

  ```shell
  python evaluation.py --dataset CIFAR100 --cuda True
  ```

  - args
    - dataset (str): CIFAR10 or CIFAR100 (default: CIFAR100)
    - num_workers (int): default: 0
    - batch_size (int): default: 128
    - cuda (bool): default: False
- Training
  - Prepare data
    - Visual Genome Dataset link
    - Download the images and region descriptions
  - Run training

  ```shell
  python main.py --base_dir ./ --cuda True
  ```
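
The accuracy numbers in the table above come from CLIP-style zero-shot evaluation: embed one text prompt per class, then assign each image to the class whose prompt embedding is most similar. A minimal sketch of that final step, assuming features are already encoded (the function name and shapes are illustrative, not `evaluation.py`'s actual API):

```python
import numpy as np

def zero_shot_accuracy(image_features, text_features, labels):
    """image_features: (N, D); text_features: (C, D), one row per class
    prompt; labels: (N,) ground-truth class indices."""
    # Normalize so the dot product ranks classes by cosine similarity
    image_features = image_features / np.linalg.norm(image_features, axis=1, keepdims=True)
    text_features = text_features / np.linalg.norm(text_features, axis=1, keepdims=True)
    preds = (image_features @ text_features.T).argmax(axis=1)
    return float((preds == labels).mean())
```

No classifier head is trained: changing the dataset only changes the class prompts fed to the text encoder.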
## Reference

- paper link
- Authors: Alec Radford, Jong Wook Kim, Chris Hallacy, Girish Sastry, Amanda Askell, Pamela Mishkin, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Jack Clark, Gretchen Krueger, Ilya Sutskever
- OpenAI