Automatic voice-synthesised summaries of the latest research papers on arXiv

Valerio Velardo

Last update: Dec 20, 2022


Overview

PaperWhisperer

PaperWhisperer is a Python application that keeps you up to date with research papers. How? It retrieves the latest arXiv articles on a topic through a keyword-based search. Then, it creates vocal summaries of the articles using Text-To-Speech and saves them to disk.

Installation

To install the package, move to the root of the repo and type in the console:

$ pip install .

If you plan to develop the package further, install it in editable mode together with the extra packages needed to run the unit tests:

$ pip install -e .[test]

Testing

To run the unit tests, issue the following command from the root of the repo:

$ pytest

Package structure

The package is divided into 2 sub-packages:

retrieval
tts

retrieval contains the data structures and facilities necessary to retrieve articles from arXiv. Under the hood, the app uses arxiv, a Python wrapper around the free arXiv API.
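As a rough illustration of what retrieval through the arxiv package can look like, here is a minimal sketch. It is not the code of the retrieval sub-package: the function names, the returned dictionary layout, and the choice to sort by submission date are assumptions made for this example. It also assumes that publication dates without timezone information are in UTC.

```python
from datetime import datetime, timedelta, timezone


def is_recent(published, max_age_days, now=None):
    """Return True if `published` is at most `max_age_days` days before `now`."""
    now = now or datetime.now(timezone.utc)
    # Assumption: a naive datetime is interpreted as UTC.
    if published.tzinfo is None:
        published = published.replace(tzinfo=timezone.utc)
    return now - published <= timedelta(days=max_age_days)


def fetch_recent_articles(keywords, max_results, max_age_days):
    """Search arXiv by keyword and keep only recent results (hypothetical helper)."""
    import arxiv  # imported lazily so is_recent() works without the package

    search = arxiv.Search(
        query=keywords,
        max_results=max_results,
        sort_by=arxiv.SortCriterion.SubmittedDate,  # newest submissions first
    )
    return [
        {
            "title": result.title,
            "authors": [author.name for author in result.authors],
            "abstract": result.summary,
        }
        for result in search.results()
        if is_recent(result.published, max_age_days)
    ]
```

Calling fetch_recent_articles("generate chord progressions", 100, 40) needs network access and the arxiv package; is_recent can be tried on its own.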

tts has facilities to generate speech renditions of text-based article summaries. The summary of an article consists of its title, authors, and abstract. Speech synthesis is performed using Google Cloud Text-To-Speech.
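The sketch below shows one way a summary could be assembled and sent to Google Cloud Text-To-Speech. It is an illustration under assumptions, not the code of the tts sub-package: the function names, the sentence template, the default language code, and the MP3 output format are all choices made for this example.

```python
def build_summary_text(title, authors, abstract):
    """Join title, authors, and abstract into one passage to be read aloud."""
    # Abstracts often contain hard line breaks; collapse all whitespace runs.
    clean_abstract = " ".join(abstract.split())
    return f"{title}. By {', '.join(authors)}. {clean_abstract}"


def synthesise_to_file(text, output_path, language_code="en-US"):
    """Synthesise `text` with Google Cloud Text-To-Speech and save it as MP3."""
    # Imported lazily so build_summary_text() works without the package.
    from google.cloud import texttospeech

    # Creating the client requires valid Google Cloud credentials.
    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(language_code=language_code),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    with open(output_path, "wb") as audio_file:
        audio_file.write(response.audio_content)
```

Keeping text assembly separate from the network call makes the former easy to check without a Google Cloud account.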

Setting up Google Cloud Text-To-Speech

PaperWhisperer uses Google Cloud Text-To-Speech to synthesise speech.

To use this service, you need to:

create an account on Google Cloud,
create a Cloud Platform project,
enable the Text-To-Speech API in the project,
set up authentication,
download a JSON private key.

For more information, see the Google Cloud documentation on setting up Text-To-Speech.

Environment variables

The app uses an environment variable called GOOGLE_APPLICATION_CREDENTIALS to connect to Google Cloud Text-To-Speech safely.

In config.yml, set GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON private key you downloaded while setting up the Google service.

Without this step, you won't be able to connect to Google Cloud Text-To-Speech, and the app will raise an error.
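If you want to fail early with a clearer message, a small check like the following can be run before any speech is synthesised. This is a hypothetical helper, not part of PaperWhisperer; it only inspects the environment variable and does not read config.yml.

```python
import os
from pathlib import Path


def check_google_credentials(environ=None):
    """Return the key path, or raise a descriptive error if it is unusable."""
    environ = os.environ if environ is None else environ
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set.")
    if not Path(key_path).is_file():
        raise RuntimeError(f"No key file found at {key_path!r}.")
    return key_path
```

The environ parameter exists only to make the function easy to test with a plain dictionary.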

How to create summaries

To create summaries for a keyword search, use the create_summaries entry point. This is the only console script of the package and the main entry point of the application.

Below is an example of how you can run the script:

$ create_summaries "generate chord progressions" 100 /save/dir 40

The script takes 4 positional arguments:

keywords used for searching articles (several keywords can be passed as one quoted string, as in the example above)
maximum number of articles to retrieve
directory in which to store the vocal summaries
maximum age of the articles to retrieve, in days (integer)
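To make the argument order concrete, the four positional arguments can be modelled with argparse as in the sketch below. The argument names (keywords, max_articles, save_dir, max_age_days) are hypothetical, and the actual create_summaries script may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the four positional arguments described above."""
    parser = argparse.ArgumentParser(prog="create_summaries")
    parser.add_argument("keywords", help="keywords used for searching articles")
    parser.add_argument("max_articles", type=int,
                        help="maximum number of articles to retrieve")
    parser.add_argument("save_dir",
                        help="directory in which to store the vocal summaries")
    parser.add_argument("max_age_days", type=int,
                        help="maximum age of the articles, in days")
    return parser


# Same values as in the example command above.
args = build_parser().parse_args(
    ["generate chord progressions", "100", "/save/dir", "40"]
)
```

Note that the keywords are quoted on the command line so that the shell passes them as a single argument.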

Dependencies

PaperWhisperer depends on the following packages:

arxiv==1.2.0
google-cloud-texttospeech
python-dotenv

YouTube video

Learn more about PaperWhisperer in this project presentation video on The Sound of AI YouTube channel.


Comments

Perform Actual Summarization

Instead of simply turning the abstract of the paper into speech, it would be so cool to turn an actual summary of the research paper into speech. The paper "TLDR: Extreme Summarization of Scientific Documents" does an amazing job of summarizing research papers in one to three sentences, and its code is available on GitHub. The model is good enough that Semantic Scholar, a research paper database, uses TLDR.

If I have time this winter break, I would love to take on the project of incorporating TLDR into paperwhisperer. My goal would be to give users the option of listening to either the entire abstract or just a one-sentence summary of the papers they are interested in. Just thought I'd share this idea in case anyone else is interested.

opened by ez2rok

Pubmed module

Added paper retrieval from PubMed through the pymed package. To use the PubMed APIs, a tool name and a valid email address must be specified in config.env.

opened by rishabkoul


Owner

Valerio Velardo
