SimBiber - A tool for simplifying bibtex with official info

Overview

Because of page limits, we often need to shorten an official bib entry, which carries a lot of information, into a version that keeps only the necessary fields (e.g., author, title, and conference/journal name).

We introduce SimBiber, a simple Python tool that simplifies such entries automatically. We hope you find it helpful.

We also highly recommend another wonderful tool, Rebiber, which normalizes bibtex with official info.

Changelog

  • 2021.12.31 We built and released the first version.

Installation

git clone https://github.com/MLNLP-World/Simbiber.git
pip install bibtexparser

Usage (v0.1.0)

python SimBiberParser.py --input_path data/bibtex.bib --output_path out/bibtex.bib --config_path parserConfig.json --if_append_output False --cache_num 100
Arguments:
--input_path The path to the input bib file that you want to simplify.
--output_path The path where the output bib file will be saved.
--config_path The path to the mapper config file.
--if_append_output Whether to append the simplified data to the output bib file.
--cache_num The number of bib items to simplify at once.
Note: If you want to simplify a huge bib file, you should adjust this value to achieve satisfactory speed.
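The usage line passes `--if_append_output False` as text, and in Python `bool("False")` evaluates to `True`, so a command-line parser has to convert such values explicitly. The following is a minimal sketch of how the documented options could be declared with `argparse`. It is not taken from `SimBiberParser.py`: the helper `str2bool`, the function `build_parser`, and the default values are assumptions made for illustration.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert command-line text such as 'False' or 'true' into a bool."""
    lowered = value.strip().lower()
    if lowered in {"true", "t", "yes", "y", "1"}:
        return True
    if lowered in {"false", "f", "no", "n", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Declare the five documented options (defaults here are assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    parser.add_argument("--input_path", help="bib file to simplify")
    parser.add_argument("--output_path", help="where to save the simplified bib file")
    parser.add_argument("--config_path", help="mapper config file")
    parser.add_argument("--if_append_output", type=str2bool, default=False,
                        help="append to the output file instead of overwriting it")
    parser.add_argument("--cache_num", type=int, default=100,
                        help="number of bib items to simplify at once")
    return parser


# Parse the same arguments as in the usage line, given as an explicit list.
args = build_parser().parse_args([
    "--input_path", "data/bibtex.bib",
    "--output_path", "out/bibtex.bib",
    "--config_path", "parserConfig.json",
    "--if_append_output", "False",
    "--cache_num", "100",
])
print(args.if_append_output, args.cache_num)
```

Without the explicit converter, any non-empty string would be treated as true, which is an easy mistake to make with boolean flags that take a value.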

Example Input and Output

An example input entry with the official information:

@inproceedings{qin-etal-2019-stack,
    title = "A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding",
    author = "Qin, Libo  and
      Che, Wanxiang  and
      Li, Yangming  and
      Wen, Haoyang  and
      Liu, Ting",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1214",
    doi = "10.18653/v1/D19-1214",
    pages = "2078--2087",
    abstract = "Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. The two tasks are closely tied and the slots often highly depend on the intent. In this paper, we propose a novel framework for SLU to better incorporate the intent information, which further guiding the slot filling. In our framework, we adopt a joint model with Stack-Propagation which can directly use the intent information as input for slot filling, thus to capture the intent semantic knowledge. In addition, to further alleviate the error propagation, we perform the token-level intent detection for the Stack-Propagation framework. Experiments on two publicly datasets show that our model achieves the state-of-the-art performance and outperforms other previous methods by a large margin. Finally, we use the Bidirectional Encoder Representation from Transformer (BERT) model in our framework, which further boost our performance in SLU task.",
}

An example simplified output entry from the official information:

@inproceedings{qin-etal-2019-stack,
    author = {Qin, Libo  and
     Che, Wanxiang  and
     Li, Yangming  and
     Wen, Haoyang  and
     Liu, Ting},
    booktitle = {Proc. of EMNLP},
    title = {A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding},
    year = {2019}
}
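The transformation shown by the two entries above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SimBiber's actual code: entries are represented as plain dictionaries, and `KEEP_FIELDS`, `NAME_MAP`, and `simplify_entry` are hypothetical names. The entry key and type are left out for brevity.

```python
# Assumed minimal set of fields to keep, based on the example output above.
KEEP_FIELDS = ("author", "booktitle", "title", "year")

# Assumed mapping from an official full name to its short name.
NAME_MAP = {"Empirical Methods in Natural Language Processing": "EMNLP"}


def simplify_entry(entry: dict) -> dict:
    """Return a copy with only the kept fields and a shortened venue name."""
    simplified = {key: value for key, value in entry.items() if key in KEEP_FIELDS}
    venue = simplified.get("booktitle", "")
    for full_name, short_name in NAME_MAP.items():
        if full_name in venue:
            simplified["booktitle"] = f"Proc. of {short_name}"
            break
    return simplified


entry = {
    "title": "A Stack-Propagation Framework with Token-Level Intent Detection "
             "for Spoken Language Understanding",
    "author": "Qin, Libo and Che, Wanxiang and Li, Yangming and Wen, Haoyang and Liu, Ting",
    "booktitle": "Proceedings of the 2019 Conference on Empirical Methods in Natural "
                 "Language Processing and the 9th International Joint Conference on "
                 "Natural Language Processing (EMNLP-IJCNLP)",
    "month": "nov",
    "year": "2019",
    "address": "Hong Kong, China",
    "publisher": "Association for Computational Linguistics",
    "url": "https://aclanthology.org/D19-1214",
    "doi": "10.18653/v1/D19-1214",
    "pages": "2078--2087",
}

simplified = simplify_entry(entry)
for field in sorted(simplified):
    print(f"{field} = {{{simplified[field]}}}")
```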

Supported Conferences

parserConfig.json lists the converted JSON files that map each official full name to its simplified name.

Name     Full Name
AAAI     Association for the Advancement of Artificial Intelligence
ACL      Association for Computational Linguistics
CCL      Chinese Computational Linguistics
COLING   International Conference on Computational Linguistics
EMNLP    Empirical Methods in Natural Language Processing
ICASSP   International Conference on Acoustics, Speech and Signal Processing
ICLR     International Conference on Learning Representations
ICML     International Conference on Machine Learning
LREC     Language Resources and Evaluation Conference
NeurIPS  Neural Information Processing Systems
NLPCC    Natural Language Processing and Chinese Computing
SemEval  International Workshop on Semantic Evaluation
SIGDIAL  SIGdial Meeting on Discourse and Dialogue

Adding a new conference

You can manually add any conference from DBLP to the config map.

Take ICLR as an example:

  • Step 1: Go to DBLP
  • Step 2: Find the full name of the conference
  • Step 3: Add the mapping to parserConfig.json
{"International Conference on Learning Representations": "ICLR"}
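Step 3 can also be scripted. The sketch below assumes the mapping lives in a JSON file holding a single flat object of the form `{"full name": "short name"}`, as in the ICLR line above. The function `add_conference` and the file name `my_mapping.json` are hypothetical, and the actual layout of parserConfig.json may differ, so check the file before adapting this.

```python
import json
import tempfile
from pathlib import Path


def add_conference(mapping_path: Path, full_name: str, short_name: str) -> dict:
    """Add one full-name -> short-name pair to a flat JSON mapping file."""
    mapping = {}
    if mapping_path.exists():
        mapping = json.loads(mapping_path.read_text(encoding="utf-8"))
    mapping[full_name] = short_name
    mapping_path.write_text(
        json.dumps(mapping, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return mapping


# Demonstrate on a throwaway file so no real config is modified.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "my_mapping.json"  # hypothetical file name
    result = add_conference(
        demo_path, "International Conference on Learning Representations", "ICLR"
    )
    print(demo_path.read_text(encoding="utf-8"))
```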

Contact

If you have any questions or suggestions, please email [email protected] or [email protected], or create a GitHub issue.

Contributor

Thanks to the contributors:

Libo Qin; Qiguang Chen; Qian Liu


Comments
  • A sample error

    The SimBiber tool seems to have run into a problem when processing a CVPR paper: after processing, only the author and title fields remain.

    Before processing:

    @inproceedings{zhang_open-book_2021,
     author = {Zhang, Ziqi and Qi, Zhongang and Yuan, Chunfeng and Shan, Ying and Li, Bing and Deng, Ying and Hu, Weiming},
     date = {2021},
     eventtitle = {Proceedings of the {IEEE}/{CVF} Conference on Computer Vision and Pattern Recognition},
     file = {Full Text PDF:D\:\\Zotero_Data\\storage\\4SQ2DJ8E\\Zhang 等。 - 2021 - Open-Book Video Captioning With Retrieve-Copy-Gene.pdf:application/pdf;Snapshot:D\:\\Zotero_Data\\storage\\LXXMRANI\\Zhang_Open-Book_Video_Captioning_With_Retrieve-Copy-Generate_Network_CVPR_2021_paper.html:text/html},
     langid = {english},
     pages = {9837--9846},
     title = {Open-Book Video Captioning With Retrieve-Copy-Generate Network},
     url = {https://openaccess.thecvf.com/content/CVPR2021/html/Zhang_Open-Book_Video_Captioning_With_Retrieve-Copy-Generate_Network_CVPR_2021_paper.html},
     urldate = {2022-04-26}
    }
    

    After processing:

    @inproceedings{zhang_open-book_2021,
     author = {Zhang, Ziqi and Qi, Zhongang and Yuan, Chunfeng and Shan, Ying and Li, Bing and Deng, Ying and Hu, Weiming},
     title = {Open-Book Video Captioning With Retrieve-Copy-Generate Network}
    }
    
    opened by yuezih 1
  • All *ACL venues seem to be simplified to proc.ACL

    I have not read the code closely; this seems to happen because the full name of ACL is a substring of the full names of EACL and similar venues, so it is matched first.

    Moving the ACL line to the bottom temporarily mitigates the problem.

    Example:

    @inproceedings{DBLP:conf/eacl/BunescuP06,
     author = {Razvan C. Bunescu and Marius Pasca},
     editor = {Diana McCarthy and Shuly Wintner},
     title = {Using Encyclopedic Knowledge for Named entity Disambiguation},
     booktitle = {{EACL} 2006, 11st Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, April 3-7, 2006, Trento, Italy},
     publisher = {The Association for Computer Linguistics},
     year = {2006},
     url = {https://aclanthology.org/E06-1002/},
     timestamp = {Fri, 06 Aug 2021 00:40:45 +0200},
     biburl = {https://dblp.org/rec/conf/eacl/BunescuP06.bib},
     bibsource = {dblp computer science bibliography, https://dblp.org}
    }

    opened by Los-Phoenix 1
  • feature "-keep" is not usable

    $ simbiber -i rebibered.bib -o simbibered.bib -keep "doi"
    usage: simbiber [-h] [-i INPUT_PATH] [-o OUTPUT_PATH] [-c CONFIG_PATH] [-a IF_APPEND_OUTPUT]
                    [-cch CACHE_NUM] [-r REMOVE_DUPLICATE]
    simbiber: error: unrecognized arguments: -keep doi
    
    opened by yuezih 0
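
The substring problem described in the *ACL report above can be illustrated with a small sketch. One possible mitigation, shown here only as an assumption and not as SimBiber's behavior, is to prefer the longest full name that occurs in the booktitle. The EACL mapping entry and the function `match_venue` are hypothetical and exist only for this illustration.

```python
from typing import Optional

NAME_MAP = {
    "Association for Computational Linguistics": "ACL",
    # Hypothetical extra entry, added only for this illustration.
    "European Chapter of the Association for Computational Linguistics": "EACL",
}


def match_venue(booktitle: str, name_map: dict) -> Optional[str]:
    """Return the short name of the longest full name found in booktitle."""
    matches = [full_name for full_name in name_map if full_name in booktitle]
    if not matches:
        return None
    # The longest match is the most specific one, so EACL wins over ACL.
    return name_map[max(matches, key=len)]


booktitle = (
    "{EACL} 2006, 11st Conference of the European Chapter of the Association "
    "for Computational Linguistics, Proceedings of the Conference, "
    "April 3-7, 2006, Trento, Italy"
)
print(match_venue(booktitle, NAME_MAP))
```

With longest-match selection the result no longer depends on the order of the lines in the mapping, which is what the workaround of moving the ACL line to the bottom relies on.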