If you pass only a source article, basic summarization is performed.
>>> contents = """Tunip is the Octonauts' head cook and gardener. He is a Vegimal, a half-animal, half-vegetable creature capable of breathing on land as well as underwater. Tunip is very childish and innocent, always wanting to help the Octonauts in any way he can. He is the smallest main character in the Octonauts crew."""
>>> summ(contents)
'Tunip is a Vegimal, a half-animal, half-vegetable creature'
3. Query-focused Summarization
If you pass a query together with the article, query-focused summarization is performed.
>>> summ(contents, query="main character of Octonauts")
'Tunip is the smallest main character in the Octonauts crew.'
4. Abstractive QA (Auto Question Detection)
If you pass a question as the query, abstractive QA is performed.
>>> summ(contents, query="What is Vegimal?")
'Half-animal, half-vegetable'
You can turn off this feature by setting the parameter question_detection=False.
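The README does not say how a query is recognized as a question. The sketch below is one possible heuristic for illustration only, not the library's actual detection logic. The name `looks_like_question` and both word lists are hypothetical.

```python
import re

# Hypothetical heuristic, NOT the library's implementation.
_QUESTION_WORDS = ("what", "who", "whom", "whose", "when", "where", "why", "which", "how")
_AUX_VERBS = ("is", "are", "was", "were", "do", "does", "did", "can", "could", "will", "would", "should")


def looks_like_question(query: str) -> bool:
    """Return True if the query ends with '?' or starts with a question word."""
    text = query.strip().lower()
    if not text:
        return False
    if text.endswith("?"):
        return True
    # Take the first word, ignoring any punctuation that follows it.
    first_word = re.split(r"\W+", text, maxsplit=1)[0]
    return first_word in _QUESTION_WORDS or first_word in _AUX_VERBS


for query in ("What is Vegimal?", "main character of Octonauts"):
    print(query, "->", looks_like_question(query))
```

A check like this could decide whether a query is routed to abstractive QA or to query-focused summarization. Setting question_detection=False skips that routing.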
Copyright 2021 Hyunwoong Ko.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
As far as I know, CTRLsum is a model that supports abstractive summarization. The README example for "2. Basic Summarization" also appears abstractive, because the subject of the summary sentence differs from the subject of the corresponding sentence in the source.
Original: He is a ~
Summary: Tunip is a ~
However, on my custom dataset, every sample I tried returned a sentence copied verbatim from the source, which makes me suspect that it is performing extractive summarization.
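One way to quantify this on a dataset is to test whether each summary is a verbatim span of its source and to measure what fraction of its n-grams are new. The following is a minimal sketch using only the standard library. The helper names `is_verbatim_copy` and `novel_ngram_ratio` are made up for this illustration and are not part of the summarizers package.

```python
import re


def tokenize(text):
    """Lowercase the text and keep only word tokens (punctuation is dropped)."""
    return re.findall(r"\w+", text.lower())


def is_verbatim_copy(summary, source):
    """True if the summary's tokens appear as one contiguous run in the source."""
    summary_text = " ".join(tokenize(summary))
    source_text = " ".join(tokenize(source))
    # Pad with spaces so matches start and end on token boundaries.
    return bool(summary_text) and f" {summary_text} " in f" {source_text} "


def novel_ngram_ratio(summary, source, n=3):
    """Fraction of the summary's n-grams that never occur in the source."""
    def ngrams(tokens):
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    summary_ngrams = ngrams(tokenize(summary))
    if not summary_ngrams:
        return 0.0
    source_ngrams = set(ngrams(tokenize(source)))
    novel = [gram for gram in summary_ngrams if gram not in source_ngrams]
    return len(novel) / len(summary_ngrams)


source = "Tunip is the Octonauts' head cook and gardener. He is a Vegimal, a half-animal, half-vegetable creature."
summary = "Tunip is a Vegimal, a half-animal, half-vegetable creature"
print(is_verbatim_copy(summary, source))
print(novel_ngram_ratio(summary, source))
```

A summary that is a verbatim copy has a novel n-gram ratio of 0. A ratio above 0 means at least some wording was generated rather than copied, even if most of the sentence is reused.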
I am getting this odd message when I make a call from this module in a script:
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BartTokenizer'.
The class this function is called from is 'PreTrainedTokenizerFast'.
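The message is a warning, not an error: it reports that the tokenizer class stored in the checkpoint differs from the class used to load it. If you have confirmed that tokenization is correct and only want to hide this one message, a standard-library logging filter is one option. This is a sketch under an assumption: that the warning is emitted through the Python logger named `transformers.tokenization_utils_base`. The class and function names below are hypothetical.

```python
import logging

# Assumption: the warning is logged through this standard-library logger name.
TOKENIZER_LOGGER_NAME = "transformers.tokenization_utils_base"
MISMATCH_PREFIX = "The tokenizer class you load from this checkpoint"


class TokenizerMismatchFilter(logging.Filter):
    """Drop only the tokenizer-class mismatch warning; pass everything else."""

    def filter(self, record):
        return not record.getMessage().startswith(MISMATCH_PREFIX)


def silence_tokenizer_mismatch_warning(logger_name=TOKENIZER_LOGGER_NAME):
    """Attach the filter to the named logger and return that logger."""
    logger = logging.getLogger(logger_name)
    logger.addFilter(TokenizerMismatchFilter())
    return logger


# Call this once, before the code that loads the tokenizer.
silence_tokenizer_mismatch_warning()
```

Because the filter matches on the message prefix, other warnings from the same logger still get through.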
Owner
Hyunwoong Ko
Research Engineer at @tunib-ai, previously at @kakaobrain.