2nd place solution for the ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

Jianquan Ye

Last update: Dec 21, 2022

Overview

TableMASTER-mmocr

About The Project
- Method Description
- Dependency
Getting Started
- Prerequisites
- Installation
Usage
Result
License
Acknowledgements

About The Project

This project presents our 2nd place solution for the ICDAR 2021 Competition on Scientific Literature Parsing, Task B. We reimplemented our solution with MMOCR, an open-source toolbox based on PyTorch. See the competition page for more details about this competition. Our original implementation is based on FastOCR (one of our internal toolboxes, similar to MMOCR).

Method Description

In our solution, we divide the table content recognition task into four sub-tasks: table structure recognition, text line detection, text line recognition, and box assignment. Based on MASTER, we propose a novel table structure recognition architecture, which we call TableMASTER. The differences between MASTER and TableMASTER are shown below. More details about this solution are given in the paper listed under Citations.
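As a rough illustration of how the four sub-tasks fit together, the sketch below composes them into a single call. All four callables are hypothetical placeholders introduced for this example; they are not functions from this repository, and the real scripts run the models separately and merge the results afterwards.

```python
from typing import Any, Callable, Dict, List


def recognize_table(
    image: Any,
    recognize_structure: Callable[[Any], Dict[str, Any]],
    detect_text_lines: Callable[[Any], List[Any]],
    recognize_text_line: Callable[[Any, Any], str],
    assign_boxes: Callable[[Dict[str, Any], List[Any], List[str]], str],
) -> str:
    """Compose the four sub-tasks into one table-to-HTML call.

    Every callable here is a hypothetical stand-in for one sub-task.
    """
    # 1. Table structure recognition (TableMASTER in this project).
    structure = recognize_structure(image)
    # 2. Text line detection (PSENet in this project).
    line_boxes = detect_text_lines(image)
    # 3. Text line recognition (MASTER in this project).
    line_texts = [recognize_text_line(image, box) for box in line_boxes]
    # 4. Box assignment: attach each recognized line to a structure cell.
    return assign_boxes(structure, line_boxes, line_texts)


# Usage with trivial stubs, only to show the data flow.
html = recognize_table(
    image=None,
    recognize_structure=lambda img: {"tokens": ["<td>", "</td>"]},
    detect_text_lines=lambda img: [(0, 0, 10, 10)],
    recognize_text_line=lambda img, box: "42",
    assign_boxes=lambda s, boxes, texts: s["tokens"][0] + texts[0] + s["tokens"][1],
)
print(html)  # prints <td>42</td>
```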

Dependency

Getting Started

Prerequisites

Download the competition dataset, PubTabNet.
For details about PubTabNet, see its GitHub repository and paper.

For the TEDS metric, see the GitHub repository.
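For orientation, TEDS (Tree-Edit-Distance-based Similarity) as defined by the PubTabNet authors normalizes the tree edit distance between the predicted and ground-truth HTML trees: TEDS(Ta, Tb) = 1 - EditDist(Ta, Tb) / max(|Ta|, |Tb|), where |T| is the number of nodes in T. The sketch below only applies that normalization; it assumes the edit distance has already been computed elsewhere, and the function name is made up for this example.

```python
def teds_score(edit_distance: float, n_nodes_pred: int, n_nodes_true: int) -> float:
    """Apply the TEDS normalization to a precomputed tree edit distance.

    n_nodes_pred and n_nodes_true are the node counts of the predicted
    and ground-truth HTML trees.
    """
    largest = max(n_nodes_pred, n_nodes_true)
    if largest <= 0:
        raise ValueError("at least one tree must contain nodes")
    return 1.0 - edit_distance / largest


# A prediction that is 5 edits away from a 20-node ground-truth tree.
print(teds_score(5, 18, 20))  # prints 0.75
```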

Installation

Install mmdetection.

# We embed the mmdetection-2.11.0 source code into this project.
# You can cd into it and install it (recommended).
cd ./mmdetection-2.11.0
pip install -v -e .

Install mmocr.

# install mmocr
cd ./MASTER_mmocr
pip install -v -e .

Install mmcv-full-1.3.4.

pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html

# install mmcv-full-1.3.4 with torch version 1.8.0 cuda_version 10.2
pip install mmcv-full==1.3.4 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
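The command template above can also be filled in programmatically. The helper below is a hypothetical convenience, not part of this repository; it only formats the same string, so the CUDA and PyTorch tags you pass must correspond to a wheel index that actually exists.

```python
def mmcv_install_command(mmcv_version: str, cu_version: str, torch_version: str) -> str:
    """Build the pip command for a prebuilt mmcv-full wheel.

    cu_version looks like "cu102" and torch_version like "torch1.8.0",
    matching the placeholders in the template above.
    """
    index_url = (
        "https://download.openmmlab.com/mmcv/dist/"
        f"{cu_version}/{torch_version}/index.html"
    )
    return f"pip install mmcv-full=={mmcv_version} -f {index_url}"


# Reproduces the torch 1.8.0 / CUDA 10.2 command shown above.
print(mmcv_install_command("1.3.4", "cu102", "torch1.8.0"))
```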

Usage

Data preprocess

Run data_preprocess.py to get valid training data. Remember to change the 'raw_img_root' and 'save_root' properties of PubtabnetParser to your own paths.

python ./table_recognition/data_preprocess.py

It takes about 8 hours to parse the 500,777 training files. After the train set has been parsed, change the 'split' property of PubtabnetParser to 'val' to get the formatted val data.

The directory structure of the parsed train data is:

.
├── StructureLabelAddEmptyBbox_train
│   ├── PMC1064074_007_00.txt
│   ├── PMC1064076_003_00.txt
│   ├── PMC1064076_004_00.txt
│   └── ...
├── recognition_train_img
│   ├── 0
│   │   ├── PMC1064100_007_00_0.png
│   │   ├── PMC1064100_007_00_10.png
│   │   ├── ...
│   │   └── PMC1064100_007_00_108.png
│   ├── 1
│   ├── ...
│   └── 15
├── recognition_train_txt
│   ├── 0.txt
│   ├── 1.txt
│   ├── ...
│   └── 15.txt
├── structure_alphabet.txt
└── textline_recognition_alphabet.txt
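Since parsing takes hours, it can be worth checking the output layout before training. The sketch below is not part of the repository: it only tests that the top-level entries from the tree above exist under your 'save_root'. It assumes that the val split uses the same names with "val" in place of "train", which the README does not state.

```python
from pathlib import Path
from typing import List

# Top-level entries taken from the directory tree above.
# "{split}" is substituted with "train" (or, by assumption, "val").
EXPECTED_ENTRIES = [
    "StructureLabelAddEmptyBbox_{split}",
    "recognition_{split}_img",
    "recognition_{split}_txt",
    "structure_alphabet.txt",
    "textline_recognition_alphabet.txt",
]


def missing_entries(save_root: str, split: str = "train") -> List[str]:
    """Return the expected top-level entries that are absent from save_root."""
    root = Path(save_root)
    names = [name.format(split=split) for name in EXPECTED_ENTRIES]
    return [name for name in names if not (root / name).exists()]
```

An empty list means every expected entry was found; anything else names what is still missing.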

Train

Train text line detection model with PSENet.
```
sh ./table_recognition/table_text_line_detection_dist_train.sh
```
We do not provide PSENet training data here; you can create the text line annotations with open-source labeling software. In our experiments, we used only 2,500 table images to train the model, which gave perfect text line detection results on the validation set.
Train text-line recognition model with MASTER.
```
sh ./table_recognition/table_text_line_recognition_dist_train.sh
```
We can get about 30,000,000 text line images from the 500,777 training images and about 550,000 text line images from the 9,115 validation images. However, we select only 20,000 of the 550,000 validation text line images for evaluation after each training epoch, to pick the best text line recognition model.
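How the 20,000 evaluation images were chosen is not described here. One simple option, shown purely as an assumption, is a seeded uniform random sample, so that every epoch is evaluated on the same subset. The function name and the seed are made up for this sketch.

```python
import random
from typing import List, Sequence


def sample_eval_subset(items: Sequence[str], k: int = 20000, seed: int = 0) -> List[str]:
    """Pick a reproducible random subset of k items for per-epoch evaluation.

    Returns all items unchanged when fewer than k are available.
    """
    if k >= len(items):
        return list(items)
    rng = random.Random(seed)  # fixed seed keeps the subset stable across runs
    return rng.sample(list(items), k)
```

Calling it twice with the same seed returns the same subset, which keeps accuracy numbers comparable from epoch to epoch.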

Note that our MASTER OCR is directly trained on samples mixed with single-line texts and multiple-line texts.
Train table structure recognition model, with TableMASTER.
```
sh ./table_recognition/table_recognition_dist_train.sh
```

Inference

To get the final results, we first run inference with the three models mentioned above. We then merge their outputs with our matching algorithm to generate the final HTML code.

Model inference. The script below runs on multiple GPUs to speed up inference.

python ./table_recognition/run_table_inference.py

run_table_inference.py will call table_inference.py and use multiple GPU devices for model inference. Before running this script, change the value of cfg in table_inference.py.

The directory structure of the inference results is:

# If you use 8 GPU devices for inference, you will get 8 detection result pickle files, one end2end_results pickle file, and 8 structure recognition result pickle files.
.
├── end2end_caches
│   ├── end2end_results.pkl
│   ├── detection_results_0.pkl
│   ├── detection_results_1.pkl
│   ├── ...
│   └── detection_results_7.pkl
├── structure_master_caches
│   ├── structure_master_results_0.pkl
│   ├── structure_master_results_1.pkl
│   ├── ...
│   └── structure_master_results_7.pkl
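To inspect the per-GPU caches by hand, the shards can be loaded and combined. The sketch below is not taken from the repository. It assumes each shard is a pickled dict (for example, keyed by image file name), which the README does not confirm, so check one file before relying on it.

```python
import glob
import os
import pickle
from typing import Any, Dict


def merge_pickle_shards(cache_dir: str, prefix: str) -> Dict[str, Any]:
    """Merge files named <prefix>_<n>.pkl in cache_dir into a single dict.

    Assumption: every shard unpickles to a dict. Later shards overwrite
    earlier ones if a key repeats.
    """
    merged: Dict[str, Any] = {}
    pattern = os.path.join(cache_dir, f"{prefix}_*.pkl")
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as f:
            shard = pickle.load(f)
        if not isinstance(shard, dict):
            raise TypeError(f"{path} does not contain a dict")
        merged.update(shard)
    return merged
```

For the tree above, a call such as merge_pickle_shards("structure_master_caches", "structure_master_results") would combine the eight structure shards. Only unpickle files you produced yourself, since loading a pickle can execute arbitrary code.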

Merge results.

python ./table_recognition/match.py

After matching, you will get the final result pickle file.

Get TEDS score

Installation.

pip install -r ./table_recognition/PubTabNet-master/src/requirements.txt

Get gtVal.json.

python ./table_recognition/get_val_gt.py

Calculate the TEDS score. Before running this script, modify the pred file path and gt file path in mmocr_teds_acc_mp.py.
```
python ./table_recognition/PubTabNet-master/src/mmocr_teds_acc_mp.py
```

Result

Text line end2end recognition accuracy

| Models | Accuracy |
| --- | --- |
| PSENet + MASTER | 0.9885 |

Structure recognition accuracy

| Model architecture | Accuracy |
| --- | --- |
| TableMASTER_maxlength_500 | 0.7808 |
| TableMASTER_ConcatLayer_maxlength_500 | 0.7821 |
| TableMASTER_ConcatLayer_maxlength_600 | 0.7799 |

TEDS score

| Models | TEDS |
| --- | --- |
| PSENet + MASTER + TableMASTER_maxlength_500 | 0.9658 |
| PSENet + MASTER + TableMASTER_ConcatLayer_maxlength_500 | 0.9669 |
| PSENet + MASTER + ensemble_TableMASTER | 0.9676 |

In our paper, we reported a TEDS score of 0.9684 on the validation set (9,115 samples). The gap between 0.9676 and 0.9684 arises because we ensembled three text line models in the competition, whereas here we use only one. Hyperparameter tuning will also affect the TEDS score.

License

This project is licensed under the MIT License. See LICENSE for more details.

Citations

@article{ye2021pingan,
  title={PingAn-VCGroup's Solution for ICDAR 2021 Competition on Scientific Literature Parsing Task B: Table Recognition to HTML},
  author={Ye, Jiaquan and Qi, Xianbiao and He, Yelin and Chen, Yihao and Gu, Dengyi and Gao, Peng and Xiao, Rong},
  journal={arXiv preprint arXiv:2105.01848},
  year={2021}
}
@article{He2021PingAnVCGroupsSF,
  title={PingAn-VCGroup's Solution for ICDAR 2021 Competition on Scientific Table Image Recognition to Latex},
  author={Yelin He and Xianbiao Qi and Jiaquan Ye and Peng Gao and Yihao Chen and Bingcong Li and Xin Tang and Rong Xiao},
  journal={ArXiv},
  year={2021},
  volume={abs/2105.01846}
}
@article{Lu2021MASTER,
  title={{MASTER}: Multi-Aspect Non-local Network for Scene Text Recognition},
  author={Ning Lu and Wenwen Yu and Xianbiao Qi and Yihao Chen and Ping Gong and Rong Xiao and Xiang Bai},
  journal={Pattern Recognition},
  year={2021}
}
@article{li2018shape,
  title={Shape robust text detection with progressive scale expansion network},
  author={Li, Xiang and Wang, Wenhai and Hou, Wenbo and Liu, Ruo-Ze and Lu, Tong and Yang, Jian},
  journal={arXiv preprint arXiv:1806.02559},
  year={2018}
}

Acknowledgements

Comments

Error when training text-line detection model
Hi, thanks for the great repo!

I followed the guide in the README file and want to train the text-line detection model.

I also prepared a dataset in COCO format, the same as in the MMOCR repo, and used psenet_r50_fpnf_600e_pubtabnet.py, but got the following error. It seems to occur with an empty object, but I am not sure where the error comes from. Did you face this issue when training the text-line detection model? Or do you have any idea how to fix it?

Thanks in advance!
opened by huyhoang17 6
KeyError: 'TABLEMASTER is not in the {registry.name} registry'

Hello, I installed mmdetection, mmcv-full, and mmocr following Install.md, but running sh ./table_recognition/expr/table_recognition_dist_train.sh fails with: KeyError: 'TABLEMASTER is not in the {registry.name} registry'. Do you have any suggestions on how to resolve this?

opened by BrandnewA 5
missing text line recognition alphabet

Hi, thanks for publishing the code. When I try to use the text recognition checkpoint, I find that the alphabet I generated is different from yours. This could be because I only used part of the data to train the MASTER text recognition model. Could you provide the ./tools/data/alphabet/textline_recognition_alphabet.txt file?

opened by zezeze97 2
MASTER is not in the registry. Does MASTER need to be trained separately?
return build_from_cfg(cfg, registry, default_args)

File "/python3.6/site-packages/mmcv/utils/registry.py", line 44, in build_from_cfg
f'{obj_type} is not in the {registry.name} registry')
KeyError: 'MASTER is not in the detector registry'
opened by cqray1990 2

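Error when testing after training: 'TableResize is not in the pipeline registry'

After training the table structure recognition model for one epoch, I wanted to use the test_imgs.py script to check that the model runs end to end, but it raised an error. Is this caused by the mmcv version, or by something else? Thanks! (I installed mmcv with the command pip install mmcv-full==1.3.4 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.7.0/index.html, and installed mmocr by running pip install -v -e . in the root of this repo; the installed mmocr version is 0.2.0.)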

python tools/test_imgs.py configs/textrecog/master/table_master_ResnetExtract_Ranger_0705.py experiments/mmocr_table_recognition/0705/latest.pth /home/datasets/ocr/table/pubtabnet/test /home/datasets/ocr/table/pubtabnet/test.list
Use load_from_local loader
[>>>>>>>>>>>>                                      ] 1/4, 22075.3 task/s, elapsed: 0s, ETA:     0s
Traceback (most recent call last):
  File "tools/test_imgs.py", line 165, in <module>
    main()
  File "tools/test_imgs.py", line 151, in main
    result = inference_detector(model, img_path)
  File "/home/xray/homework/table/TableMASTER-mmocr/mmdetection-2.11.0/mmdet/apis/inference.py", line 117, in inference_detector
    test_pipeline = Compose(cfg.data.test.pipeline)
  File "/home/xray/homework/table/TableMASTER-mmocr/mmdetection-2.11.0/mmdet/datasets/pipelines/compose.py", line 22, in __init__
    transform = build_from_cfg(transform, PIPELINES)
  File "/home/anaconda3/envs/pytorch1.7/lib/python3.6/site-packages/mmcv/utils/registry.py", line 44, in build_from_cfg
    f'{obj_type} is not in the {registry.name} registry')
KeyError: 'TableResize is not in the pipeline registry'

opened by xray1111 2

Convert model to ONNX / TensorRT?

Have you tried to convert the MASTER-based models (including text-line recognition and table recognition) to ONNX or TensorRT format?

I followed the MMOCR and MMDet tutorials to convert those models to ONNX format; here is the output error. Do you have any idea how to handle this issue? Thanks in advance.

opened by huyhoang17 2
Performance issues

Hi,

I tried to use table-master (alone, not the end2end model), but it takes 18 s to run inference on a single PubTabNet image (~500 pixels wide) and uses about 2 GB of GPU memory (one Tesla K80).

I therefore wanted to know whether this is the normal memory consumption and processing time for table-master, and whether there is any way to reduce the memory consumption and increase the processing speed, because a model that takes 18 s per image is not practical to use.

Thank you in advance for your answer

opened by GTimothee 1
'TABLEMASTER is not in the models registry'

After running python data_preprocess.py, I want to train the table structure recognition model TableMASTER.

But when I run ./table_recognition/expr/table_recognition_dist_train.sh, it raises KeyError: 'TABLEMASTER is not in the models registry'

opened by baoyuxu 1

distance_rule_match

def distance_rule_match(end2end_indexes, end2end_bboxes, master_indexes, master_bboxes):
    """
    Get matching between no-match end2end bboxes and no-match master bboxes.
    Use min distance to match.
    This rule will only run (no-match end2end nums > 0) and (no-match master nums > 0)
    It will Return master_bboxes_nums match-pairs.
    :param end2end_indexes:
    :param end2end_bboxes:
    :param master_indexes:
    :param master_bboxes:
    :return: match_pairs list, e.g. [[0,1], [1,2], ...]
    """
    min_match_list = []
    for j, master_bbox in zip(master_indexes, master_bboxes):
        min_distance = np.inf
        min_match = [0, 0]  # i, j
        for i, end2end_bbox in zip(end2end_indexes, end2end_bboxes):
            x_end2end, y_end2end = end2end_bbox[0], end2end_bbox[1]
            x_master, y_master = master_bbox[0], master_bbox[1]
            end2end_point = (x_end2end, y_end2end)
            master_point = (x_master, y_master)
            dist = cal_distance(master_point, end2end_point)
            if dist < min_distance:
                min_match[0], min_match[1] = i, j
                min_distance = dist
        min_match_list.append(min_match)
    return min_match_list

In this function, the output may contain several matches [i, *] for a single i. If the goal is to find only one match for each i, should the order of the two loops be swapped?
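To make the question concrete, here is a sketch of the swapped loop order. It is not the repository's code: the function name is invented, and it assumes cal_distance is the Euclidean distance between the top-left corners of two boxes, which is not shown in the snippet above.

```python
import math
from typing import List, Sequence


def nearest_master_per_end2end(
    end2end_indexes: Sequence[int],
    end2end_bboxes: Sequence[Sequence[float]],
    master_indexes: Sequence[int],
    master_bboxes: Sequence[Sequence[float]],
) -> List[List[int]]:
    """Return exactly one pair [i, j] per unmatched end2end box i."""
    match_pairs = []
    for i, end2end_bbox in zip(end2end_indexes, end2end_bboxes):
        best_j, best_dist = None, math.inf
        for j, master_bbox in zip(master_indexes, master_bboxes):
            # Assumed metric: Euclidean distance between top-left corners.
            dist = math.hypot(
                end2end_bbox[0] - master_bbox[0],
                end2end_bbox[1] - master_bbox[1],
            )
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None:
            match_pairs.append([i, best_j])
    return match_pairs


# Two end2end boxes and three master boxes; two master boxes sit near box 0.
pairs = nearest_master_per_end2end(
    [0, 1], [[0, 0, 5, 5], [100, 100, 110, 110]],
    [3, 4, 5], [[1, 1, 6, 6], [2, 2, 7, 7], [99, 99, 105, 105]],
)
print(pairs)  # prints [[0, 3], [1, 5]]
```

On the same input, the original loop order would yield one pair per master box, so i = 0 would appear twice. With the swap, each i appears once, but several i may now share the same j. Which behavior is wanted depends on how the later steps consume the pairs.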

opened by shaonanqinghuaizongshishi 0

CUDA out of memory when running table_inference.py on one 2080 Ti GPU with the epoch_16_0.7767.pth model
I run table_inference.py on one 2080 Ti GPU, loading the epoch_16_0.7767.pth model with the following config file:

_base_ = ['../../_base_/default_runtime.py']

alphabet_file = '/tools/data/alphabet/structure_alphabet.txt'
alphabet_len = len(open(alphabet_file, 'r').readlines())
max_seq_len = 500

start_end_same = False
label_convertor = dict(
    type='TableMasterConvertor',
    dict_file=alphabet_file,
    max_seq_len=max_seq_len,
    start_end_same=start_end_same,
    with_unknown=True)

if start_end_same:
    PAD = alphabet_len + 2
else:
    PAD = alphabet_len + 3

model = dict(
    type='TABLEMASTER',
    backbone=dict(
        type='TableResNetExtra',
        input_dim=3,
        gcb_config=dict(ratio=0.0625, headers=1, att_scale=False, fusion_type="channel_add", layers=[False, True, True, True]),
        layers=[1, 2, 5, 3]),
    encoder=dict(type='PositionalEncoding', d_model=512, dropout=0.2, max_len=5000),
    decoder=dict(
        type='TableMasterDecoder',
        N=3,
        decoder=dict(
            self_attn=dict(headers=8, d_model=512, dropout=0.),
            src_attn=dict(headers=8, d_model=512, dropout=0.),
            feed_forward=dict(d_model=512, d_ff=2024, dropout=0.),
            size=512,
            dropout=0.),
        d_model=512),
    loss=dict(type='MASTERTFLoss', ignore_index=PAD, reduction='mean'),
    bbox_loss=dict(type='TableL1Loss', reduction='sum'),
    label_convertor=label_convertor,
    max_seq_len=max_seq_len)

TRAIN_STATE = True
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
    dict(type='LoadImageFromNdarrayV2'),
    dict(type='TableResize', keep_ratio=True, long_size=480),
    dict(type='TablePad', size=(480, 480), pad_val=0, return_mask=True, mask_ratio=(8, 8), train_state=TRAIN_STATE),
    dict(type='TableBboxEncode'),
    dict(type='ToTensorOCR'),
    dict(type='NormalizeOCR', **img_norm_cfg),
    dict(type='Collect', keys=['img'], meta_keys=['filename', 'ori_shape', 'img_shape', 'text', 'scale_factor', 'bbox', 'bbox_masks', 'pad_shape']),
]

valid_pipeline = [
    dict(type='LoadImageFromNdarrayV2'),
    dict(type='TableResize', keep_ratio=True, long_size=480),
    dict(type='TablePad', size=(480, 480), pad_val=0, return_mask=True, mask_ratio=(8, 8), train_state=TRAIN_STATE),
    dict(type='TableBboxEncode'),
    dict(type='ToTensorOCR'),
    dict(type='NormalizeOCR', **img_norm_cfg),
    dict(type='Collect', keys=['img'], meta_keys=['filename', 'ori_shape', 'img_shape', 'scale_factor', 'img_norm_cfg', 'ori_filename', 'bbox', 'bbox_masks', 'pad_shape']),
]

test_pipeline = [
    dict(type='LoadImageFromNdarrayV2'),
    dict(type='TableResize', keep_ratio=True, long_size=480),
    dict(type='TablePad', size=(480, 480), pad_val=0, return_mask=True, mask_ratio=(8, 8), train_state=TRAIN_STATE),
    #dict(type='TableBboxEncode'),
    dict(type='ToTensorOCR'),
    dict(type='NormalizeOCR', **img_norm_cfg),
    dict(type='Collect', keys=['img'], meta_keys=['filename', 'ori_shape', 'img_shape', 'scale_factor', 'img_norm_cfg', 'ori_filename', 'pad_shape']),
]

dataset_type = 'OCRDataset'
#train_img_prefix = '/pubtabnet/pubtabnet/train'
#train_anno_file1 = /StructureLabel_train'
train_img_prefix = "pubtabnet/pubtabnet/train"
train_anno_file1 = "StructureLabel_train"
# train_img_prefix = ''
# train_anno_file1 = ''
train1 = dict(
    type=dataset_type,
    img_prefix=train_img_prefix,
    ann_file=train_anno_file1,
    loader=dict(
        type='TableMASTERLmdbLoader',
        repeat=1,
        max_seq_len=max_seq_len,
        parser=dict(type='TableMASTERLmdbParser', keys=['filename', 'text'], keys_idx=[0, 1], separator=' ')),
    pipeline=train_pipeline,
    test_mode=False)

# valid_img_prefix = /pubtabnet/pubtabnet/val'
# valid_anno_file1 = /StructureLabel_val'
valid_img_prefix = '/pubtabnet/pubtabnet/val'
valid_anno_file1 = '/StructureLabel_val'
valid = dict(
    type=dataset_type,
    img_prefix=valid_img_prefix,
    ann_file=valid_anno_file1,
    loader=dict(
        type='TableMASTERLmdbLoader',
        repeat=1,
        max_seq_len=max_seq_len,
        parser=dict(type='TableMASTERLmdbParser', keys=['filename', 'text'], keys_idx=[0, 1], separator=' ')),
    pipeline=valid_pipeline,
    dataset_info='table_master_dataset',
    test_mode=True)

# test_img_prefix = /pubtabnet/pubtabnet/val'
# test_anno_file1 = '/StructureLabel_val'
test_img_prefix = '/pubtabnet/pubtabnet/val'
test_anno_file1 = '/StructureLabel_val'
test = dict(
    type=dataset_type,
    img_prefix=test_img_prefix,
    ann_file=test_anno_file1,
    loader=dict(
        type='TableMASTERLmdbLoader',
        repeat=1,
        max_seq_len=max_seq_len,
        parser=dict(type='TableMASTERLmdbParser', keys=['filename', 'text'], keys_idx=[0, 1], separator=' ')),
    pipeline=test_pipeline,
    dataset_info='table_master_dataset',
    test_mode=True)

data = dict(
    samples_per_gpu=4,
    workers_per_gpu=2,
    train=dict(type='ConcatDataset', datasets=[train1]),
    val=dict(type='ConcatDataset', datasets=[valid]),
    test=dict(type='ConcatDataset', datasets=[test]))

# optimizer
optimizer = dict(type='Ranger', lr=1e-3)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# optimizer_config = dict(grad_clip=None)

# learning policy
lr_config = dict(policy='step', warmup='linear', warmup_iters=50, warmup_ratio=1.0 / 3, step=[12, 15])
total_epochs = 17

# evaluation
evaluation = dict(interval=1, metric='acc')

# fp16
fp16 = dict(loss_scale='dynamic')

# checkpoint setting
checkpoint_config = dict(interval=1)

# log_config
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])

# yapf:enable
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]

# if raise find unused_parameters, use this.
# find_unused_parameters = True

ret = input.softmax(dim)
RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 10.76 GiB total capacity; 444.10 MiB already allocated; 24.56 MiB free; 604.00 MiB reserved in total by PyTorch)

It seems that more GPU memory is needed.
opened by cqray1990 0
File not found error

I am trying to run the table recognition demo script in Google Colab, with Google Drive mounted and the repo located inside the drive. While running the table recognition demo script, I get the error below:

Use load_from_local loader
Traceback (most recent call last):
  File "./table_recognition/demo/demo_cp.py", line 65, in <module>
    master_inference = Recognition_Inference(args.master_config, args.master_checkpoint)
  File "/content/drive/MyDrive/work/TableMASTER-mmocr/table_recognition/table_inference.py", line 99, in __init__
    super().__init__(config_file, checkpoint_file)
  File "/content/drive/MyDrive/work/TableMASTER-mmocr/table_recognition/table_inference.py", line 38, in __init__
    self.model = build_model(config_file, checkpoint_file)
  File "/content/drive/MyDrive/work/TableMASTER-mmocr/table_recognition/table_inference.py", line 25, in build_model
    model = init_detector(config_file, checkpoint=checkpoint_file, device=device)
  File "/content/drive/MyDrive/work/TableMASTER-mmocr/mmdetection-2.11.0/mmdet/apis/inference.py", line 31, in init_detector
    config = mmcv.Config.fromfile(config)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/utils/config.py", line 254, in fromfile
    cfg_dict, cfg_text = Config._file2dict(filename,
  File "/usr/local/lib/python3.8/dist-packages/mmcv/utils/config.py", line 148, in _file2dict
    mod = import_module(temp_module_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1014, in _gcd_import
  File "", line 991, in _find_and_load
  File "", line 975, in _find_and_load_unlocked
  File "", line 671, in _load_unlocked
  File "", line 843, in exec_module
  File "", line 219, in _call_with_frames_removed
  File "/tmp/tmp8emb1yyv/tmptcqvifrk.py", line 6, in <module>
FileNotFoundError: [Errno 2] No such file or directory: './tools/data/alphabet/textline_recognition_alphabet.txt'
Exception ignored in: <function _TemporaryFileCloser.__del__ at 0x7fe26a3dc940>
Traceback (most recent call last):
  File "/usr/lib/python3.8/tempfile.py", line 579, in __del__
    self.close()
  File "/usr/lib/python3.8/tempfile.py", line 575, in close
    unlink(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp8emb1yyv/tmptcqvifrk.py'

opened by VipulAlgoSoul 0
