Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI2021]

| Datasets | ImDBs | Object Faster R-CNN Features | OCR Faster R-CNN Features | OCR Recog-CNN Features |
| --- | --- | --- | --- | --- |
| TextVQA | TextVQA ImDB | Open Images | TextVQA SBD-Trans OCRs | TextVQA SBD-Trans OCRs |
| ST-VQA | ST-VQA ImDB | ST-VQA Objects | ST-VQA SBD-Trans OCRs | ST-VQA SBD-Trans OCRs |

| Datasets | Config Files (under `configs/vqa/`) | Pretrained Models | Metrics | Notes |
| --- | --- | --- | --- | --- |
| TextVQA (`m4c_textvqa`) | `m4c_textvqa/m4c_with_stvqa.yml` | `ssbaseline_with_stvqa` | val accuracy - 45.53%; test accuracy - 45.66% | SBD-Trans OCRs; ST-VQA as additional data |

❓ Questions and Help

I am trying to generate the EvalAI prediction files for the TextVQA test set using the SSBaseline model, but I am getting the error below.

Command used to run the predictions:

`python tools/run.py --tasks vqa --datasets m4c_textvqa --model m4c --config configs/vqa/m4c_textvqa/m4c_with_stvqa.yml --save_dir save/m4c --run_type inference --evalai_inference 1 --resume_file data/models/best_textvqa_withStvqa.ckpt`

Error

2021-03-30T13:13:19 ERROR: Key image_feature_2 not found in the SampleList. Valid choices are ['question_id', 'image_id', 'image_feature_0', 'image_info_0', 'image_feature_1', 'image_info_1', 'text', 'text_len', 'obj_bbox_coordinates', 'context', 'context_tokens', 'context_tokens_enc', 'context_feature_0', 'context_info_0', 'context_feature_1', 'context_info_1', 'order_vectors', 'ocr_bbox_coordinates', 'sampled_idx_seq', 'train_prev_inds', 'dataset_type', 'dataset_name', 'dataset_type_', 'dataset_name_']
Traceback (most recent call last):
  File "tools/run.py", line 86, in <module>
    run()
  File "tools/run.py", line 75, in run
    trainer.train()
  File "/home/pratyush/ssbaseline/pythia/trainers/base_trainer.py", line 198, in train
    self.inference()
  File "/home/pratyush/ssbaseline/pythia/trainers/base_trainer.py", line 427, in inference
    self._inference_run("test")
  File "/home/pratyush/ssbaseline/pythia/trainers/base_trainer.py", line 431, in _inference_run
    self.predict_for_evalai(dataset_type)
  File "/home/pratyush/ssbaseline/pythia/trainers/base_trainer.py", line 472, in predict_for_evalai
    model_output = self.model(prepared_batch)
  File "/home/pratyush/ssbaseline/pythia/models/base_model.py", line 120, in __call__
    model_output = super().__call__(sample_list, *args, **kwargs)
  File "/home/pratyush/.virtualenvs/ssbase/lib/python3.8/site-packages/torch-1.4.0-py3.8-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/pratyush/ssbaseline/pythia/models/m4c.py", line 209, in forward
    self._forward_ocr_encoding(sample_list, fwd_results)
  File "/home/pratyush/ssbaseline/pythia/models/m4c.py", line 263, in _forward_ocr_encoding
    ocr_recogcnn = sample_list.image_feature_2[:, :ocr_fasttext.size(1), :]
  File "/home/pratyush/ssbaseline/pythia/common/sample.py", line 145, in __getattr__
    raise AttributeError(
AttributeError: Key image_feature_2 not found in the SampleList. Valid choices are ['question_id', 'image_id', 'image_feature_0', 'image_info_0', 'image_feature_1', 'image_info_1', 'text', 'text_len', 'obj_bbox_coordinates', 'context', 'context_tokens', 'context_tokens_enc', 'context_feature_0', 'context_info_0', 'context_feature_1', 'context_info_1', 'order_vectors', 'ocr_bbox_coordinates', 'sampled_idx_seq', 'train_prev_inds', 'dataset_type', 'dataset_name', 'dataset_type_', 'dataset_name_']

Kindly help me with this error, or point me in the right direction to resolve it.

Thanks in advance.
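
A possible starting point for diagnosing this: the traceback shows `_forward_ocr_encoding` reading `sample_list.image_feature_2`, while the valid keys in the message stop at `image_feature_1`. The feature table lists three sources per dataset (Object Faster R-CNN, OCR Faster R-CNN, OCR Recog-CNN), so the test split may be loading only two of them, for example because one feature path is missing from the config for that split. That cause is an assumption, not something the log confirms. The sketch below is not part of the repository; `missing_image_features` and its arguments are hypothetical names used only to show the check.

```python
import re


def missing_image_features(sample_keys, expected_count):
    """Return the image_feature_<i> keys a model would need but the batch lacks.

    sample_keys: iterable of key names present in the batch.
    expected_count: number of feature sources the model indexes (3 here,
    because the traceback reads image_feature_2). Hypothetical helper.
    """
    present = set()
    for key in sample_keys:
        match = re.fullmatch(r"image_feature_(\d+)", key)
        if match:
            present.add(int(match.group(1)))
    return [f"image_feature_{i}" for i in range(expected_count) if i not in present]


# A subset of the keys copied from the "Valid choices" list in the error.
keys_from_error = [
    "question_id", "image_id",
    "image_feature_0", "image_info_0",
    "image_feature_1", "image_info_1",
    "context_feature_0", "context_info_0",
    "context_feature_1", "context_info_1",
]

print(missing_image_features(keys_from_error, expected_count=3))
```

If the list printed is non-empty, comparing the image feature entries for the train, val, and test splits in the dataset config is a reasonable next step.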

❓ Questions and Help

Hi, I am facing an error when running `python setup.py build develop`.

Environment: Conda, Python 3.7.11, PyTorch 1.9.1, CUDA 10.2

Could you share the versions of Python, PyTorch, and other relevant libraries that you used? Thanks.

The error message:

running build
running build_py
running build_ext
running develop
running egg_info
writing pythia.egg-info/PKG-INFO
writing dependency_links to pythia.egg-info/dependency_links.txt
writing requirements to pythia.egg-info/requires.txt
writing top-level names to pythia.egg-info/top_level.txt
reading manifest file 'pythia.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'pythia.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-3.7/cphoc.cpython-37m-x86_64-linux-gnu.so -> 
Creating /home/cybertron/anaconda3/envs/ss/lib/python3.7/site-packages/pythia.egg-link (link to .)
Removing pythia 0.3 from easy-install.pth file
Adding pythia 0.3 to easy-install.pth file

Installed /home/cybertron/ssbaseline
Processing dependencies for pythia==0.3
Searching for fasttext==0.9.1
Reading https://pypi.org/simple/fasttext/
Downloading https://files.pythonhosted.org/packages/10/61/2e01f1397ec533756c1d893c22d9d5ed3fce3a6e4af1976e0d86bb13ea97/fasttext-0.9.1.tar.gz#sha256=6ead9c6aafe985472066e27c43e33f581b192befd136a84c3c2e8197e7e05be6
Best match: fasttext 0.9.1
Processing fasttext-0.9.1.tar.gz
Writing /tmp/easy_install-u9gat7ds/fasttext-0.9.1/setup.cfg
Running fasttext-0.9.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-u9gat7ds/fasttext-0.9.1/egg-dist-tmp-l97kscpt
/home/cybertron/anaconda3/envs/ss/lib/python3.7/site-packages/setuptools/dist.py:720: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
  % (opt, underscore_opt)
warning: no files found matching 'PATENTS'
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc: In lambda function:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:227:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
             for (int32_t i = 0; i < vocab_freq.size(); i++) {
                                 ~~^~~~~~~~~~~~~~~~~~~
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc: In lambda function:
python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:241:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
             for (int32_t i = 0; i < labels_freq.size(); i++) {
                                 ~~^~~~~~~~~~~~~~~~~~~~
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/fasttext.cc: In member function ‘void fasttext::FastText::getWordVector(fasttext::Vector&, const string&) const’:
src/fasttext.cc:92:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int i = 0; i < ngrams.size(); i++) {
                   ~~^~~~~~~~~~~~~~~
src/fasttext.cc: In lambda function:
src/fasttext.cc:302:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     return eosid == i1 || (eosid != i2 && norms[i1] > norms[i2]);
            ~~~~~~^~~~~
src/fasttext.cc:302:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     return eosid == i1 || (eosid != i2 && norms[i1] > norms[i2]);
                            ~~~~~~^~~~~
src/fasttext.cc: In member function ‘void fasttext::FastText::quantize(const fasttext::Args&)’:
src/fasttext.cc:322:40: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   if (qargs.cutoff > 0 && qargs.cutoff < input->size(0)) {
                           ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
src/fasttext.cc:323:45: warning: ‘std::vector<int> fasttext::FastText::selectEmbeddings(int32_t) const’ is deprecated: selectEmbeddings is being deprecated. [-Wdeprecated-declarations]
     auto idx = selectEmbeddings(qargs.cutoff);
                                             ^
src/fasttext.cc:293:22: note: declared here
 std::vector<int32_t> FastText::selectEmbeddings(int32_t cutoff) const {
                      ^~~~~~~~
src/fasttext.cc:327:24: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (auto i = 0; i < idx.size(); i++) {
                      ~~^~~~~~~~~~~~
src/fasttext.cc: In member function ‘void fasttext::FastText::cbow(fasttext::Model::State&, fasttext::real, const std::vector<int>&)’:
src/fasttext.cc:380:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int32_t w = 0; w < line.size(); w++) {
                       ~~^~~~~~~~~~~~~
src/fasttext.cc:384:41: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       if (c != 0 && w + c >= 0 && w + c < line.size()) {
                                   ~~~~~~^~~~~~~~~~~~~
src/fasttext.cc: In member function ‘void fasttext::FastText::skipgram(fasttext::Model::State&, fasttext::real, const std::vector<int>&)’:
src/fasttext.cc:398:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int32_t w = 0; w < line.size(); w++) {
                       ~~^~~~~~~~~~~~~
src/fasttext.cc:402:41: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       if (c != 0 && w + c >= 0 && w + c < line.size()) {
                                   ~~~~~~^~~~~~~~~~~~~
src/fasttext.cc: In member function ‘void fasttext::FastText::getSentenceVector(std::istream&, fasttext::Vector&)’:
src/fasttext.cc:479:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (int32_t i = 0; i < line.size(); i++) {
                         ~~^~~~~~~~~~~~~
src/fasttext.cc: In member function ‘std::vector<std::pair<std::__cxx11::basic_string<char>, fasttext::Vector> > fasttext::FastText::getNgramVectors(const string&) const’:
src/fasttext.cc:514:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int32_t i = 0; i < ngrams.size(); i++) {
                       ~~^~~~~~~~~~~~~~~
src/fasttext.cc: In member function ‘void fasttext::FastText::lazyComputeWordVectors()’:
src/fasttext.cc:551:40: warning: ‘void fasttext::FastText::precomputeWordVectors(fasttext::DenseMatrix&)’ is deprecated: precomputeWordVectors is being deprecated. [-Wdeprecated-declarations]
     precomputeWordVectors(*wordVectors_);
                                        ^
src/fasttext.cc:534:6: note: declared here
 void FastText::precomputeWordVectors(DenseMatrix& wordVectors) {
      ^~~~~~~~
src/fasttext.cc: In member function ‘std::vector<std::pair<float, std::__cxx11::basic_string<char> > > fasttext::FastText::getNN(const fasttext::DenseMatrix&, const fasttext::Vector&, int32_t, const std::set<std::__cxx11::basic_string<char> >&)’:
src/fasttext.cc:585:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       if (heap.size() == k && similarity < heap.front().first) {
           ~~~~~~~~~~~~^~~~
src/fasttext.cc:590:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       if (heap.size() > k) {
           ~~~~~~~~~~~~^~~
src/fasttext.cc: In member function ‘std::shared_ptr<fasttext::Matrix> fasttext::FastText::getInputMatrixFromFile(const string&) const’:
src/fasttext.cc:701:24: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (size_t i = 0; i < n; i++) {
                      ~~^~~
src/fasttext.cc:706:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (size_t j = 0; j < dim; j++) {
                        ~~^~~~~
src/fasttext.cc:718:24: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (size_t i = 0; i < n; i++) {
                      ~~^~~
src/fasttext.cc:723:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (size_t j = 0; j < dim; j++) {
                        ~~^~~~~
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/loss.cc: In member function ‘void fasttext::Loss::findKBest(int32_t, fasttext::real, fasttext::Predictions&, const fasttext::Vector&) const’:
src/loss.cc:83:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if (heap.size() == k && std_log(output[i]) < heap.front().first) {
         ~~~~~~~~~~~~^~~~
src/loss.cc:88:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if (heap.size() > k) {
         ~~~~~~~~~~~~^~~
src/loss.cc: In member function ‘virtual fasttext::real fasttext::HierarchicalSoftmaxLoss::forward(const std::vector<int>&, int32_t, fasttext::Model::State&, fasttext::real, bool)’:
src/loss.cc:257:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int32_t i = 0; i < pathToRoot.size(); i++) {
                       ~~^~~~~~~~~~~~~~~~~~~
src/loss.cc: In member function ‘void fasttext::HierarchicalSoftmaxLoss::dfs(int32_t, fasttext::real, int32_t, fasttext::real, fasttext::Predictions&, const fasttext::Vector&) const’:
src/loss.cc:282:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   if (heap.size() == k && score < heap.front().first) {
       ~~~~~~~~~~~~^~~~
src/loss.cc:289:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if (heap.size() > k) {
         ~~~~~~~~~~~~^~~
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/productquantizer.cc: In member function ‘void fasttext::ProductQuantizer::load(std::istream&)’:
src/productquantizer.cc:246:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (auto i = 0; i < centroids_.size(); i++) {
                    ~~^~~~~~~~~~~~~~~~~~~
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/args.cc: In member function ‘void fasttext::Args::parseArgs(const std::vector<std::__cxx11::basic_string<char> >&)’:
src/args.cc:93:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int ai = 2; ai < args.size(); ai += 2) {
                    ~~~^~~~~~~~~~~~~
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/dictionary.cc: In member function ‘void fasttext::Dictionary::computeSubwords(const string&, std::vector<int>&, std::vector<std::__cxx11::basic_string<char> >*) const’:
src/dictionary.cc:181:52: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (size_t j = i, n = 1; j < word.size() && n <= args_->maxn; n++) {
                                                  ~~^~~~~~~~~~~~~~
src/dictionary.cc:186:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       if (n >= args_->minn && !(n == 1 && (i == 0 || j == word.size()))) {
           ~~^~~~~~~~~~~~~~
src/dictionary.cc: In member function ‘void fasttext::Dictionary::initNgrams()’:
src/dictionary.cc:198:24: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (size_t i = 0; i < size_; i++) {
                      ~~^~~~~~~
src/dictionary.cc: In member function ‘void fasttext::Dictionary::initTableDiscard()’:
src/dictionary.cc:296:24: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (size_t i = 0; i < size_; i++) {
                      ~~^~~~~~~
src/dictionary.cc: In member function ‘void fasttext::Dictionary::addWordNgrams(std::vector<int>&, const std::vector<int>&, int32_t) const’:
src/dictionary.cc:316:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int32_t i = 0; i < hashes.size(); i++) {
                       ~~^~~~~~~~~~~~~~~
src/dictionary.cc:318:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (int32_t j = i + 1; j < hashes.size() && j < i + n; j++) {
                             ~~^~~~~~~~~~~~~~~
src/dictionary.cc: In member function ‘void fasttext::Dictionary::prune(std::vector<int>&)’:
src/dictionary.cc:515:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int32_t i = 0; i < words_.size(); i++) {
                       ~~^~~~~~~~~~~~~~~
src/dictionary.cc:517:12: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
         (j < words.size() && words[j] == i)) {
          ~~^~~~~~~~~~~~~~
creating /home/cybertron/anaconda3/envs/ss/lib/python3.7/site-packages/fasttext-0.9.1-py3.7-linux-x86_64.egg
Extracting fasttext-0.9.1-py3.7-linux-x86_64.egg to /home/cybertron/anaconda3/envs/ss/lib/python3.7/site-packages
Removing fasttext 0.9.2 from easy-install.pth file
Adding fasttext 0.9.1 to easy-install.pth file

Installed /home/cybertron/anaconda3/envs/ss/lib/python3.7/site-packages/fasttext-0.9.1-py3.7-linux-x86_64.egg
Searching for demjson>=2.2
Reading https://pypi.org/simple/demjson/
Downloading https://files.pythonhosted.org/packages/96/67/6db789e2533158963d4af689f961b644ddd9200615b8ce92d6cad695c65a/demjson-2.2.4.tar.gz#sha256=31de2038a0fdd9c4c11f8bf3b13fe77bc2a128307f965c8d5fb4dc6d6f6beb79
Best match: demjson 2.2.4
Processing demjson-2.2.4.tar.gz
Writing /tmp/easy_install-zh3undwn/demjson-2.2.4/setup.cfg
Running demjson-2.2.4/setup.py -q bdist_egg --dist-dir /tmp/easy_install-zh3undwn/demjson-2.2.4/egg-dist-tmp-5b8hbd5l
error: Setup script exited with error in demjson setup command: use_2to3 is invalid.
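
The last line of the log points at a likely cause: the demjson 2.2.4 setup script passes `use_2to3`, and setuptools removed 2to3 support in release 58.0.0. A commonly used workaround is to install an older setuptools first (for example `pip install "setuptools<58"`) and then rerun the build; whether that is the right fix for this particular environment is an assumption. The sketch below only inspects a version string. `supports_use_2to3` and `installed_setuptools_version` are hypothetical helpers, not part of setuptools or of this repository.

```python
from importlib import metadata


def supports_use_2to3(setuptools_version):
    """Return True if this setuptools release still accepts use_2to3.

    Support was removed in setuptools 58.0.0, so only the major version
    number needs to be compared.
    """
    major = int(setuptools_version.split(".")[0])
    return major < 58


def installed_setuptools_version():
    """Return the installed setuptools version string, or None if absent."""
    try:
        return metadata.version("setuptools")
    except metadata.PackageNotFoundError:
        return None


version = installed_setuptools_version()
if version is None:
    print("setuptools is not installed in this environment")
elif supports_use_2to3(version):
    print(f"setuptools {version} still accepts use_2to3")
else:
    print(f"setuptools {version} rejects use_2to3; demjson 2.2.4 will not build")
```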


ERROR: Key image_feature_2 not found in the SampleList

opened by GoelPratyush 1

faced error installing demjson when running setup.py

opened by soonchangAI 0

Textcaps task

❓ Questions and Help

Hi, thank you for sharing your code. Which files do I need to modify if I want to use this model for the TextCaps task? Can the dataset reuse the TextVQA parts directly, and is the configuration file also an M4C `.yml` file? Looking forward to your reply!

opened by Caroline0728 0

size mismatch for linear_ocr_feat_to_mmt_in.weight: copying a param with shape torch.Size([768, 3002]) from checkpoint, the shape in current model is torch.Size([768, 3464]).

❓ Questions and Help

I am getting a size mismatch error when loading the checkpoint, and I do not understand how to change the dimension so that the checkpoint and the model agree. Please help!

opened by sarthak-sg 1
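
A size mismatch of this kind generally means the checkpoint and the current config describe different input widths for the same layer: the checkpoint stores a `[768, 3002]` weight, while the model built from the config expects `[768, 3464]`. The usual remedy is to pair the checkpoint with the config it was trained with rather than to edit the checkpoint, though which config matches here is not stated in the issue. The sketch below, in plain Python, lists every parameter whose shape differs. `find_shape_mismatches` is a hypothetical helper; with PyTorch, each dictionary could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Map each shared parameter name to (checkpoint_shape, model_shape)
    when the two shapes differ. Names present on only one side are ignored.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes for the weight are taken from the error message; the second entry
# is a made-up parameter included only to show a matching case.
checkpoint_shapes = {
    "linear_ocr_feat_to_mmt_in.weight": (768, 3002),
    "example_layer.bias": (768,),
}
model_shapes = {
    "linear_ocr_feat_to_mmt_in.weight": (768, 3464),
    "example_layer.bias": (768,),
}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

Running the comparison before calling the loader shows all mismatched layers at once, which helps tell a single differing OCR feature width apart from a wholly different architecture.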


Owner

ZephyrZhuQi
