Fully-automated scripts for collecting AI-related papers

Overview

AI-Paper-collector

List of conferences to crawl

  • ACL: 21-19 (including findings)
  • EMNLP: 21-19 (including findings)
  • NAACL: 21-19
  • COLING: 20
  • NIPS: 20-19
  • ICLR: 21-19
  • ICML: 21-19
  • AAAI: 21-19
  • IJCAI: 21-19
  • CVPR: 21-19
  • ICCV: 19
  • MM: 21-19
  • KDD: 21-19
  • SIGIR: 21-19
  • CIKM: 21-19

Usage

python paper_crawler.py --keyword {keyword}

TODO

  • crawl abstracts and translate them
  • add github action
Comments
  • [Add] AISTATS, VLDB, COLT

    [
      { 'name':'JMLR2022', 'url':'https://dblp.org/db/journals/jmlr/jmlr22.html', 'source': 'dblp', },
      { 'name':'JMLR2021', 'url':'https://dblp.org/db/journals/jmlr/jmlr21.html', 'source': 'dblp', },
      { 'name':'JMLR2020', 'url':'https://dblp.org/db/journals/jmlr/jmlr20.html', 'source': 'dblp', },
      { 'name':'JMLR2019', 'url':'https://dblp.org/db/journals/jmlr/jmlr19.html', 'source': 'dblp' },
      { 'name':'VLDB2019', 'url':'https://dblp.org/db/journals/pvldb/pvldb13.html', 'source': 'dblp', },
      { 'name':'VLDB2020', 'url':'https://dblp.org/db/journals/pvldb/pvldb14.html', 'source': 'dblp', },
      { 'name':'VLDB2021', 'url':'https://dblp.org/db/journals/pvldb/pvldb15.html', 'source': 'dblp', },
    ]

    require to update cache 
    opened by yinxiangshi 10
  • The schedule of required features

    • [x] Add code links and paper links @LightChen233
    • [x] Optimize the front end (needs more details) @beiyuouo
    • [x] Search by sessions @SivilTaram
    • [x] Search within the previous search results @beiyuouo
    • [x] Refactor the current workflow @Doragd
    • [ ] Fix: searching for ACL returns both NAACL and ACL results
    • [x] Format: year, conf, ...
    • [x] Provide the list of confs before searching
    • [ ] Multi-process crawling
    Official Development 
    opened by Doragd 2
  • The app mode doesn't work at all. When will it be fixed?

    Question: The app mode doesn't work at all. When will it be fixed?

    documentation 
    opened by wgfrhebnr 1
  • [BUG]

    This error occurs when you enter a list of conferences.

    [+] Enter your list of conferences (default: All Confs): ACL,EMNLP,NAACL,COLING,ICLR,AAAI
    Traceback (most recent call last):
      File "main.py", line 60, in <module>
        main()
      File "main.py", line 43, in main
        results = exec_search(indexes, candidates, query, mode, threshold, confs, limit)
      File "/cephfs/zhoutong/work/AI-Paper-Collector/searcher.py", line 179, in exec_search
        results = exact_search(indexes, query, confs)
      File "/cephfs/zhoutong/work/AI-Paper-Collector/searcher.py", line 86, in exact_search
        results = [item for item in indexes if query in item[1].lower()]
      File "/cephfs/zhoutong/work/AI-Paper-Collector/searcher.py", line 86, in <listcomp>
        results = [item for item in indexes if query in item[1].lower()]
    AttributeError: 'dict' object has no attribute 'lower'
    
    bug 
    opened by zhou6140919 1
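
The traceback above suggests that `item[1]` in `exact_search` is sometimes a dict rather than a title string. A minimal defensive sketch follows. It assumes, hypothetically, that a dict entry stores its title under a `paper_name` key; the real cache layout is not shown in this report, so that key and the helper `title_of` are illustrative only, and the `confs` argument is left out for brevity.

```python
def title_of(entry):
    """Return a lowercase title for an index entry (str or dict)."""
    if isinstance(entry, dict):
        # Hypothetical key; adjust it to the real cache schema.
        return str(entry.get("paper_name", "")).lower()
    return str(entry).lower()


def exact_search(indexes, query):
    """Keep items whose title contains the query, case-insensitively."""
    query = query.lower()
    return [item for item in indexes if query in title_of(item[1])]


# Mixed index: one plain-string entry and two dict entries.
indexes = [
    ("ACL2021", "Dialogue Generation with Transformers"),
    ("EMNLP2021", {"paper_name": "Toxic Dialogue Detection"}),
    ("ICLR2021", {"paper_name": "Graph Neural Networks"}),
]
matches = exact_search(indexes, "Dialogue")
```

Normalizing each entry in one helper keeps the list comprehension unchanged and tolerates both old and new cache formats.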
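  • [BUG] Search papers by author

    Describe the bug: Searching for an author under Advanced Setting - Specific Author returns some incorrect results. For example, searching for "Ming Li" returns, besides the correct results, incorrect ones such as "Xiaoming Li" and "Xiaoming Liu" (the names are made up, purely to illustrate the error).

    Expected behavior: Only correct results should be included. It would be even better if the searched author could be highlighted~💕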

    bug 
    opened by LingyvKong 1
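
The behavior reported above is typical of substring matching: "Ming Li" is a substring of "Xiaoming Li". One possible fix, sketched here under the assumption that each paper carries a list of author names (the `authors` field and both helper names are hypothetical), is to compare whole normalized names instead.

```python
def normalize(name):
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def has_author(paper_authors, wanted):
    """True only if one author's full name equals the wanted name."""
    target = normalize(wanted)
    return any(normalize(author) == target for author in paper_authors)


papers = [
    {"title": "Paper A", "authors": ["Ming Li", "Wei Zhang"]},
    {"title": "Paper B", "authors": ["Xiaoming Li"]},
    {"title": "Paper C", "authors": ["Xiaoming Liu", "Ming  li"]},
]
hits = [p["title"] for p in papers if has_author(p["authors"], "Ming Li")]
```

Whole-name equality is stricter than substring search, so it drops "Xiaoming Li" while still tolerating differences in case and spacing.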
  • [Add] MICCAI

    [
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-1.html', 'source':'dblp', },
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-2.html', 'source':'dblp', },
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-3.html', 'source':'dblp', },
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-4.html', 'source':'dblp', },
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-5.html', 'source':'dblp', },
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-6.html', 'source':'dblp', },
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-7.html', 'source':'dblp', },
      { 'name':'MICCAI2022', 'url':'https://dblp.org/db/conf/miccai/miccai2022-8.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-1.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-2.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-3.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-4.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-5.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-6.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-7.html', 'source':'dblp', },
      { 'name':'MICCAI2021', 'url':'https://dblp.org/db/conf/miccai/miccai2021-8.html', 'source':'dblp', },
      { 'name':'MICCAI2020', 'url':'https://dblp.org/db/conf/miccai/miccai2020-1.html', 'source':'dblp', },
      { 'name':'MICCAI2020', 'url':'https://dblp.org/db/conf/miccai/miccai2020-2.html', 'source':'dblp', },
      { 'name':'MICCAI2020', 'url':'https://dblp.org/db/conf/miccai/miccai2020-3.html', 'source':'dblp', },
      { 'name':'MICCAI2020', 'url':'https://dblp.org/db/conf/miccai/miccai2020-4.html', 'source':'dblp', },
      { 'name':'MICCAI2020', 'url':'https://dblp.org/db/conf/miccai/miccai2020-5.html', 'source':'dblp', },
      { 'name':'MICCAI2020', 'url':'https://dblp.org/db/conf/miccai/miccai2020-6.html', 'source':'dblp', },
      { 'name':'MICCAI2020', 'url':'https://dblp.org/db/conf/miccai/miccai2020-7.html', 'source':'dblp', },
      { 'name':'MICCAI2019', 'url':'https://dblp.org/db/conf/miccai/miccai2019-1.html', 'source':'dblp', },
      { 'name':'MICCAI2019', 'url':'https://dblp.org/db/conf/miccai/miccai2019-2.html', 'source':'dblp', },
      { 'name':'MICCAI2019', 'url':'https://dblp.org/db/conf/miccai/miccai2019-3.html', 'source':'dblp', },
      { 'name':'MICCAI2019', 'url':'https://dblp.org/db/conf/miccai/miccai2019-4.html', 'source':'dblp', },
      { 'name':'MICCAI2019', 'url':'https://dblp.org/db/conf/miccai/miccai2019-5.html', 'source':'dblp', },
      { 'name':'MICCAI2019', 'url':'https://dblp.org/db/conf/miccai/miccai2019-6.html', 'source':'dblp', },
    ]

    require to update cache 
    opened by Doragd 1
  • feat: add boolean search

    A Boolean query lets you search for exactly the keywords you are interested in. It also lets you include near-synonyms (such as dialog, dialogue and conversation) and exclude words you are not interested in (like the second example).

    opened by EricLee8 1
  • [Add] new confs

    [
      { 'name':'TIP2022', 'url':'https://dblp.org/db/journals/tip/tip31.html', 'source':'dblp', },
      { 'name':'TIP2021', 'url':'https://dblp.org/db/journals/tip/tip30.html', 'source':'dblp', },
      { 'name':'TIP2020', 'url':'https://dblp.org/db/journals/tip/tip29.html', 'source':'dblp', },
      { 'name':'TPAMI2022', 'url':'https://dblp.org/db/journals/pami/pami44.html', 'source':'dblp', },
      { 'name':'TPAMI2021', 'url':'https://dblp.org/db/journals/pami/pami43.html', 'source':'dblp', },
      { 'name':'TPAMI2020', 'url':'https://dblp.org/db/journals/pami/pami42.html', 'source':'dblp', },
      { 'name':'RecSys2021', 'url':'https://dblp.org/db/conf/recsys/recsys2021.html', 'source':'dblp', },
      { 'name':'RecSys2020', 'url':'https://dblp.org/db/conf/recsys/recsys2020.html', 'source':'dblp', },
      { 'name':'RecSys2019', 'url':'https://dblp.org/db/conf/recsys/recsys2019.html', 'source':'dblp', },
      { 'name':'TKDE2022', 'url':'https://dblp.org/db/journals/tkde/tkde34.html', 'source':'dblp', },
      { 'name':'TKDE2021', 'url':'https://dblp.org/db/journals/tkde/tkde33.html', 'source':'dblp', },
      { 'name':'TKDE2020', 'url':'https://dblp.org/db/journals/tkde/tkde32.html', 'source':'dblp', },
      { 'name':'TOIS2022', 'url':'https://dblp.org/db/journals/tois/tois40.html', 'source':'dblp', },
      { 'name':'TOIS2021', 'url':'https://dblp.org/db/journals/tois/tois39.html', 'source':'dblp', },
      { 'name':'TOIS2020', 'url':'https://dblp.org/db/journals/tois/tois38.html', 'source':'dblp', },
      { 'name':'ICDM2021', 'url':'https://dblp.org/db/conf/icdm/icdm2021.html', 'source':'dblp', },
      { 'name':'ICDM2020', 'url':'https://dblp.org/db/conf/icdm/icdm2020.html', 'source':'dblp', },
      { 'name':'ICDM2019', 'url':'https://dblp.org/db/conf/icdm/icdm2019.html', 'source':'dblp', },
      { 'name':'TASLP2022', 'url':'https://dblp.org/db/journals/taslp/taslp30.html', 'source':'dblp', },
      { 'name':'TASLP2021', 'url':'https://dblp.org/db/journals/taslp/taslp29.html', 'source':'dblp', },
      { 'name':'TASLP2020', 'url':'https://dblp.org/db/journals/taslp/taslp28.html', 'source':'dblp', },
    ]

    require to update cache 
    opened by Doragd 1
  • Boolean Search

    Could you add a Boolean search function, especially on the web demo page?

    Example Boolean queries:

    • language AND generation AND pre-train
    • dialogue AND generation AND NOT (response AND selection)
    • toxic AND (dialogue OR conversation OR dialog)
    enhancement 
    opened by EricLee8 1
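
Queries like the three listed above can be evaluated with a small recursive-descent parser. The sketch below is one possible approach, not the project's implementation: it assumes uppercase `AND`, `OR`, `NOT` operators, treats every other token as a keyword, and matches keywords as case-insensitive substrings of the title. The function names are illustrative.

```python
import re


def tokenize(query):
    """Split a query into parentheses, operators, and keywords."""
    return re.findall(r"\(|\)|[^\s()]+", query)


def matches(query, title):
    """Evaluate a Boolean query (AND, OR, NOT, parentheses) on a title."""
    tokens = tokenize(query)
    text = title.lower()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():  # lowest precedence
        value = parse_and()
        while peek() == "OR":
            advance()
            right = parse_and()  # always parse, so tokens are consumed
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            advance()
            right = parse_not()
            value = value and right
        return value

    def parse_not():  # NOT, parenthesized group, or a single keyword
        if peek() == "NOT":
            advance()
            return not parse_not()
        if peek() == "(":
            advance()
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return value
        if peek() is None or peek() in {")", "AND", "OR"}:
            raise ValueError("expected a keyword")
        return advance().lower() in text

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


ok = matches("toxic AND (dialogue OR conversation OR dialog)",
             "Detecting Toxic Conversation Online")
```

Because keywords are matched as substrings, `pre-train` also matches "Pre-training"; a stricter variant could compare whole words instead.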
  • [Add] Interspeech

    [
      { 'name':'Interspeech2020', 'url':'https://dblp.org/db/conf/interspeech/interspeech2020.html', 'source':'dblp', },
      { 'name':'Interspeech2019', 'url':'https://dblp.org/db/conf/interspeech/interspeech2019.html', 'source':'dblp', },
    ]

    require to update cache 
    opened by ddlBoJack 1
  • [Add] AISTATS, COLT

    [
      {"name":"COLT2021","url":"https://dblp.org/db/conf/colt/colt2021.html","source":"dblp"},
      {"name":"COLT2020","url":"https://dblp.org/db/conf/colt/colt2020.html","source":"dblp"},
      {"name":"COLT2019","url":"https://dblp.org/db/conf/colt/colt2019.html","source":"dblp"},
      {"name":"AISTATS2021","url":"https://dblp.org/db/conf/aistats/aistats2021.html","source":"dblp"},
      {"name":"AISTATS2020","url":"https://dblp.org/db/conf/aistats/aistats2020.html","source":"dblp"},
      {"name":"AISTATS2019","url":"https://dblp.org/db/conf/aistats/aistats2019.html","source":"dblp"}
    ]

    require to update cache 
    opened by yinxiangshi 1
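
The COLT and AISTATS entries above all follow the same URL shape, so they could be generated rather than typed by hand. The helper below is a hypothetical convenience, not part of the project. It only covers venues whose dblp page is `conf/<key>/<key><year>.html`; journals and multi-volume proceedings such as MICCAI use different URLs, so each generated link should still be checked.

```python
def dblp_conf_entries(name, key, years):
    """Build one config entry per year for a single-volume dblp venue."""
    return [
        {
            "name": f"{name}{year}",
            "url": f"https://dblp.org/db/conf/{key}/{key}{year}.html",
            "source": "dblp",
        }
        for year in years
    ]


# Reproduces the three COLT entries from the issue above.
entries = dblp_conf_entries("COLT", "colt", [2021, 2020, 2019])
```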
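  • [Feature] Can pagination be added to the web version? The current version is a bit laggy

    Is your feature request related to a problem? Please describe.

    1. All search results are shown on a single page, so the page holds too much data, becomes very laggy, and cannot be clicked.
    2. The selected conferences are reset to "select all" after a refresh and have to be set again.

    Describe the solution you'd like

    1. Add pagination, showing 10 results by default.
    2. Allow the settings to control how many results are shown per page, or to show all of them (i.e. the original behavior).
    3. Remember the selected settings in local storage so that they persist after a refresh; only a hard refresh should clear the cache.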
    enhancement 
    opened by shliujing 3
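
The paging logic requested above is small enough to sketch. This is an illustration only: the web front end may well implement it in another language, and the function name and the `per_page=None` convention for "show all" are assumptions made for the example.

```python
import math


def paginate(results, page=1, per_page=10):
    """Return one page of results plus the total page count."""
    if per_page is None:  # "show all" mode, i.e. the original behavior
        return list(results), 1
    if per_page < 1:
        raise ValueError("per_page must be positive")
    total_pages = max(1, math.ceil(len(results) / per_page))
    page = min(max(page, 1), total_pages)  # clamp out-of-range pages
    start = (page - 1) * per_page
    return list(results[start:start + per_page]), total_pages


results = [f"paper {i}" for i in range(1, 26)]
page_items, total_pages = paginate(results, page=3)
```

Clamping the page number means a stale page index, for example one kept after a narrower search, still yields a valid page instead of an empty one.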
  • [BUG] AttributeError: 'dict' object has no attribute 'lower'

    Traceback (most recent call last):
      File "main.py", line 60, in <module>
        main()
      File "main.py", line 43, in main
        results = exec_search(indexes, candidates, query, mode, threshold, confs, limit)
      File "/home/zhuzihao/AI-Paper-Collector/searcher.py", line 179, in exec_search
        results = exact_search(indexes, query, confs)
      File "/home/zhuzihao/AI-Paper-Collector/searcher.py", line 86, in exact_search
        results = [item for item in indexes if query in item[1].lower()]
      File "/home/zhuzihao/AI-Paper-Collector/searcher.py", line 86, in <listcomp>
        results = [item for item in indexes if query in item[1].lower()]
    AttributeError: 'dict' object has no attribute 'lower'

    bug 
    opened by zihao-ai 1
  • [Feature] Suggestion: Besides the normal "search" tab, adding an "advanced search" tab

    The normal search sometimes returns too many results. I suggest adding an "advanced search" tab that accepts a regular expression as input, so the search scope can be narrowed down. Thanks!

    enhancement 
    opened by canyuchen 3
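
A regular-expression search of the kind suggested above could look like the following sketch, which uses Python's standard `re` module. The function name and the sample titles are made up for illustration; invalid patterns are reported as a `ValueError` so a user interface can show a readable message.

```python
import re


def regex_search(titles, pattern):
    """Return titles matching a case-insensitive regular expression."""
    try:
        compiled = re.compile(pattern, re.IGNORECASE)
    except re.error as exc:
        raise ValueError(f"invalid regular expression: {exc}") from exc
    return [title for title in titles if compiled.search(title)]


titles = [
    "Dialogue Generation with Pre-trained Models",
    "Dialog State Tracking",
    "Image Captioning",
]
# Match "dialog" or "dialogue" followed later by one of two task words.
found = regex_search(titles, r"\bdialog(ue)?\b.*\b(generation|tracking)\b")
```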
  • [Feature] Suggestion: grouping the conferences into different fields

    Hi, I suggest adding a feature that groups the conferences into fields, as the csrankings.org website does. This would make it more convenient to search for papers in a specific field, because we would not need to select every conference in that field for each search. I hope you will take this suggestion into consideration. Love this tool. Thanks!

    https://csrankings.org/#/index?all&us

    enhancement 
    opened by canyuchen 1
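
Field grouping as suggested above can be modeled as a mapping from a field name to its conferences. The grouping below is only an illustrative assumption built from conferences in this project's list; it is not the taxonomy used by CSRankings, and the names `FIELD_GROUPS` and `expand_fields` are hypothetical.

```python
# Illustrative grouping only; a real one would need to be curated.
FIELD_GROUPS = {
    "NLP": ["ACL", "EMNLP", "NAACL", "COLING"],
    "Machine learning": ["NIPS", "ICLR", "ICML"],
    "Computer vision": ["CVPR", "ICCV"],
    "Information retrieval": ["SIGIR", "CIKM"],
}


def expand_fields(selected_fields, groups=FIELD_GROUPS):
    """Expand field names into a de-duplicated list of conferences."""
    confs = []
    for field in selected_fields:
        for conf in groups.get(field, []):
            if conf not in confs:
                confs.append(conf)
    return confs


confs = expand_fields(["NLP", "Computer vision"])
```

The expanded list can then be passed wherever a list of conference names is already accepted, so selecting a field becomes a shortcut rather than a new search mode.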
  • [Question] What should I do when searching raises an error?

    Question: Running it fails both locally and on Colab. How can this be solved?

    Screenshots

    Traceback (most recent call last):
      File "main.py", line 60, in <module>
        main()
      File "main.py", line 43, in main
        results = exec_search(indexes, candidates, query, mode, threshold, confs, limit)
      File "/content/AI-Paper-Collector/searcher.py", line 179, in exec_search
        results = exact_search(indexes, query, confs)
      File "/content/AI-Paper-Collector/searcher.py", line 86, in exact_search
        results = [item for item in indexes if query in item[1].lower()]
      File "/content/AI-Paper-Collector/searcher.py", line 86, in <listcomp>
        results = [item for item in indexes if query in item[1].lower()]
    AttributeError: 'dict' object has no attribute 'lower'

    documentation 
    opened by cntommy 1
Owner
Gordon Lee
Go ahead.
arxiv-sanity, but very lite, simply providing the core value proposition of the ability to tag arxiv papers of interest and have the program recommend similar papers.

Andrej 671 Dec 31, 2022
Collect super-resolution related papers, data, repositories

WangChaofeng 1.7k Jan 3, 2023
Code for How To Create A Fully Automated AI Based Trading System With Python

AI Based Trading System This code works as a boilerplate for an AI based trading system with yfinance as data source and RobinHood or Alpaca as broker

Rubén 196 Jan 5, 2023
This package contains deep learning models and related scripts for RoseTTAFold

RoseTTAFold This package contains deep learning models and related scripts to run RoseTTAFold This repository is the official implementation of RoseTT

null 1.6k Jan 3, 2023
3ds-Ghidra-Scripts - Ghidra scripts to help with 3ds reverse engineering

3ds Ghidra Scripts These are ghidra scripts to help with 3ds reverse engineering

Zak 7 May 23, 2022
Omniverse sample scripts - A guide for developing with Python scripts on NVIDIA Omniverse

Omniverse sample scripts Here, NVIDIA Omniverse ( https://www.nvidia.com/ja-jp/om

ft-lab (Yutaka Yoshisaka) 37 Nov 17, 2022
text_recognition_toolbox: The reimplementation of a series of classical scene text recognition papers with Pytorch in a uniform way.

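text recognition toolbox 1. Project introduction: This project is based on the PyTorch deep learning framework and reimplements the following 6 classic text recognition papers in a uniform way; details of the papers are given below. The project will be updated continuously, and questions and code contributions are welcome. Model, Paper title, Year published, Method category: CRNN 《An End-t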

null 168 Dec 24, 2022
Classic Papers for Beginners and Impact Scope for Authors.

There are billions of academic papers around the world. However, maybe only 0.0...01% of them are valuable or worth reading. Since our time is limited, TopPaper provides a Top Academic Paper Chart to help beginners and researchers move one step faster.

Qiulin Zhang 228 Dec 18, 2022
A list of multi-task learning papers and projects.

This page contains a list of papers on multi-task learning for computer vision. Please create a pull request if you wish to add anything. If you are interested, consider reading our recent survey paper.

svandenh 297 Dec 17, 2022
A list of multi-task learning papers and projects.

svandenh 84 Apr 27, 2021
A list of papers regarding generalization in (deep) reinforcement learning

Kaixin WANG 13 Apr 26, 2021
A curated list of programmatic weak supervision papers and resources

Jieyu Zhang 118 Jan 2, 2023
A list of papers about point cloud based place recognition, also known as loop closure detection in SLAM (processing)

Xin Kong 17 May 16, 2021
The code for two papers: Feedback Transformer and Expire-Span.

transformer-sequential This repo contains the code for two papers: Feedback Transformer Expire-Span The training code is structured for long sequentia

Facebook Research 125 Dec 25, 2022
Automatic voice-synthetised summaries of latest research papers on arXiv

PaperWhisperer PaperWhisperer is a Python application that keeps you up-to-date with research papers. How? It retrieves the latest articles from arXiv

Valerio Velardo 124 Dec 20, 2022
A selection of State Of The Art research papers (and code) on human locomotion (pose + trajectory) prediction (forecasting)

A selection of State Of The Art research papers (and code) on human trajectory prediction (forecasting). Papers marked with [W] are workshop papers.

Karttikeya Manglam 40 Nov 18, 2022
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

This is a curated list of tutorials, projects, libraries, videos, papers, books and anything related to the incredible PyTorch. Feel free to make a pu

Ritchie Ng 9.2k Jan 2, 2023
🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

xmu-xiaoma66 7.7k Jan 5, 2023
Must-read Papers on Physics-Informed Neural Networks.

PINNpapers Contributed by IDRL lab. Introduction Physics-Informed Neural Network (PINN) has achieved great success in scientific computing since 2017.

IDRL 330 Jan 7, 2023