An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.)

Facebook Research

Last update: Oct 28, 2022

Overview

VizSeq is a Python toolkit for visual analysis on text generation tasks like machine translation, summarization, image captioning, speech translation and video description. It takes multi-modal sources, text references as well as text predictions as inputs, and analyzes them visually in Jupyter Notebook or a built-in Web App (the former has Fairseq integration). VizSeq also provides a collection of multi-process scorers as a normal Python package.

[Paper] [Documentation] [Blog]

Task Coverage

Source	Example Tasks
Text	Machine translation, text summarization, dialog generation, grammatical error correction, open-domain question answering
Image	Image captioning, image question answering, optical character recognition
Audio	Speech recognition, speech translation
Video	Video description
Multimodal	Multimodal machine translation

Metric Coverage

Accelerated with multi-processing/multi-threading.

Type	Metrics
N-gram-based	BLEU (Papineni et al., 2002), NIST (Doddington, 2002), METEOR (Banerjee et al., 2005), TER (Snover et al., 2006), RIBES (Isozaki et al., 2010), chrF (Popović et al., 2015), GLEU (Wu et al., 2016), ROUGE (Lin, 2004), CIDEr (Vedantam et al., 2015), WER
Embedding-based	LASER (Artetxe and Schwenk, 2018), BERTScore (Zhang et al., 2019)

Getting Started

Installation

VizSeq requires Python 3.6+ and currently runs on Unix/Linux and macOS/OS X. It will support Windows as well in the future.

You can install VizSeq from PyPI:

$ pip install vizseq

Or install it from source:

$ git clone https://github.com/facebookresearch/vizseq
$ cd vizseq
$ pip install -e .

Documentation

Jupyter Notebook Examples

Fairseq integration

Web App Example

Download example data:

$ git clone https://github.com/facebookresearch/vizseq
$ cd vizseq
$ bash get_example_data.sh

Launch the web server:

$ python -m vizseq.server --port 9001 --data-root ./examples/data

And then, navigate to the following URL in your web browser:

http://localhost:9001

License

VizSeq is licensed under MIT. See the LICENSE file for details.

Citation

Please cite as

@inproceedings{wang2019vizseq,
  title = {VizSeq: A Visual Analysis Toolkit for Text Generation Tasks},
  author = {Changhan Wang and Anirudh Jain and Danlu Chen and Jiatao Gu},
  booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
  year = {2019},
}

Contact

Changhan Wang, Jiatao Gu

Comments

[Bug] - cannot import name 'tokenize_13a' from 'sacrebleu'

🐛 Bug

I just followed the installation steps and got this error.

To reproduce

Stack trace/error message

(base) diegomoussallem@Diegos-MBP examples % python -m vizseq.server --port 9001 --data-root examples/data
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/opt/anaconda3/lib/python3.7/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/__init__.py", line 15, in <module>
    from vizseq.ipynb import *
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/ipynb/__init__.py", line 8, in <module>
    from .core import (view_examples, view_n_grams, view_stats, view_scores,
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/ipynb/core.py", line 15, in <module>
    from vizseq._data import (VizSeqDataSources, PathOrPathsOrDictOfStrList,
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/_data/__init__.py", line 14, in <module>
    from .config_manager import VizSeqTaskConfigManager, VizSeqGlobalConfigManager
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/_data/config_manager.py", line 13, in <module>
    from .tokenizers import VizSeqTokenization
  File "/Users/diegomoussallem/Desktop/vizseq/vizseq/_data/tokenizers.py", line 10, in <module>
    from sacrebleu import tokenize_13a, tokenize_v14_international, tokenize_zh
ImportError: cannot import name 'tokenize_13a' from 'sacrebleu' (/opt/anaconda3/lib/python3.7/site-packages/sacrebleu/__init__.py)
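
The traceback shows that the import fails because the installed sacrebleu package does not expose `tokenize_13a` at its top level. A minimal sketch for checking whether a module exports a given name before relying on it follows. `exports_name` is a hypothetical helper (not part of VizSeq or sacrebleu), and the standard-library `json` module stands in for the package under test so that the snippet runs anywhere.

```python
import importlib


def exports_name(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr` at top level."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attr)


# Stand-in checks using the standard library. In the affected environment,
# the call of interest would be exports_name("sacrebleu", "tokenize_13a").
print(exports_name("json", "loads"))
print(exports_name("json", "tokenize_13a"))
```

If the check fails for sacrebleu, installing a sacrebleu release that still exposes the function is one possible workaround; which release that is would need to be confirmed separately.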

bug

opened by DiegoMoussallem 4

[Question] Calculated BLEU score

Hi :)

The tool returns a BLEU score for the machine translation and runs great in general, but I am not sure if the BLEU score represents the sentence level or corpus level? I haven't been able to gather anything conclusive from the sacreBLEU implementation, so I am hoping you can help me with this :)

Best regards, Tobias
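
The distinction asked about here can be illustrated with a deliberately simplified metric. The sketch below uses clipped unigram precision rather than BLEU, and all function names are hypothetical; it only shows why a corpus-level score (statistics pooled over all sentences) generally differs from the average of sentence-level scores. It makes no claim about which of the two VizSeq reports.

```python
from collections import Counter
from typing import List, Tuple


def unigram_matches(hypo: str, ref: str) -> Tuple[int, int]:
    """Return (clipped matching unigrams, hypothesis length) for one sentence pair."""
    hypo_counts = Counter(hypo.split())
    ref_counts = Counter(ref.split())
    matched = sum(min(count, ref_counts[tok]) for tok, count in hypo_counts.items())
    return matched, sum(hypo_counts.values())


def sentence_level(hypos: List[str], refs: List[str]) -> List[float]:
    """One precision value per sentence pair."""
    scores = []
    for hypo, ref in zip(hypos, refs):
        matched, length = unigram_matches(hypo, ref)
        scores.append(matched / length if length else 0.0)
    return scores


def corpus_level(hypos: List[str], refs: List[str]) -> float:
    """Pool match counts and lengths over all sentences, then divide once."""
    stats = [unigram_matches(h, r) for h, r in zip(hypos, refs)]
    total_matched = sum(m for m, _ in stats)
    total_length = sum(n for _, n in stats)
    return total_matched / total_length if total_length else 0.0


hypos = ["a b", "a b c d e f"]
refs = ["a b", "a x x x x x"]
sent = sentence_level(hypos, refs)
print("sentence-level:", sent)
print("mean of sentence-level:", sum(sent) / len(sent))
print("corpus-level:", corpus_level(hypos, refs))
```

In this toy input the short sentence is perfect and the long one is mostly wrong, so the mean of the sentence scores (about 0.583) is higher than the pooled value (3 matches out of 8 tokens, 0.375): longer sentences weigh more at corpus level.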

opened by Tojens 3
[Bug] Multiple references

When having multiple references (attached muti_refs.zip), I cannot configure metric BLEU (but I can configure GLEU). I get a 500 internal server error.

This does not happen with a single reference (see attached single_ref.zip).

opened by nadjet 3
Question about `tag` and `group` in official example
In the official scorer example from https://facebookresearch.github.io/vizseq/docs/getting_started/scorer_example/, the second block confuses me.

Corpus-level BLEU: 67.945
Sentence-level BLEU: [75.984, 61.479]
Group BLEU: {'Test Group 2': 75.984, 'Test Group 1': 75.984}

I can see two generated sentences with corresponding reference sentences in the first block.

ref = [['This is a sample #1 reference.', 'This is a sample #2 reference.']]
hypo = ['This is a sample #1 prediction.', 'This is a sample #2 model prediction.']
tags = [['Test Group 1', 'Test Group 2']]
scores = scorer.score(hypo, ref, tags=tags)
print(f'Corpus-level BLEU: {scores.corpus_score}')
print(f'Sentence-level BLEU: {scores.sent_scores}')
print(f'Group BLEU: {scores.group_scores}')

The first sample belongs to Test Group 1 and the second sample belongs to Test Group 2. If I am not misunderstanding the use of the tags, then according to the sentence-level BLEU, the Group BLEU should be {'Test Group 2': 61.479, 'Test Group 1': 75.984}.

But the execution result is Group BLEU: {'Test Group 2': 75.984, 'Test Group 1': 75.984}.
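
The expectation stated above can be written down as a small sketch. It assumes that a group score is the plain mean of the sentence-level scores carrying that tag, which is how the `_score_multiprocess_averaged` traceback in another issue on this page computes it; the BLEU scorer may aggregate differently. `group_means` and the per-sentence tag layout are hypothetical, not VizSeq's API.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def group_means(sent_scores: List[float], sent_tags: List[List[str]]) -> Dict[str, float]:
    """Average sentence-level scores per tag; sent_tags[i] lists the tags of sentence i."""
    if len(sent_scores) != len(sent_tags):
        raise ValueError("need exactly one tag list per sentence")
    buckets = defaultdict(list)
    for score, tags in zip(sent_scores, sent_tags):
        for tag in tags:
            buckets[tag].append(score)
    return {tag: mean(scores) for tag, scores in buckets.items()}


# Sentence-level scores quoted in the issue, one tag per sentence.
sent_scores = [75.984, 61.479]
sent_tags = [["Test Group 1"], ["Test Group 2"]]
print(group_means(sent_scores, sent_tags))
```

Under these assumptions each group contains a single sentence, so each group score equals that sentence's score, which is the result the issue expects.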
opened by YKX-A 2
Support for non-ascii chars

Motivation

The current version cannot open non-ASCII files.

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

Opening some non-ASCII files may help to test this.
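
A minimal sketch of such a test follows. It assumes the failure comes from reading files with a platform-dependent default encoding, which the PR description does not confirm; the file name `ref_0.txt` is purely illustrative. The snippet writes non-ASCII text to a temporary file and reads it back with an explicit UTF-8 encoding.

```python
import os
import tempfile

# Sample text mixing several scripts.
text = "Übersetzung: 翻訳, traducción\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "ref_0.txt")
    # Passing encoding explicitly avoids depending on the locale's default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        loaded = f.read()

print(loaded == text)
```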

CLA Signed

opened by fuzihaofzh 2
🐛 Uncaught exception GET
🐛 Bug

When trying to run the web app with the example data, I get this error:

Uncaught exception GET

To reproduce

Follow the README instructions: download the example data and run:

python -m vizseq.server --port 9001 --data-root ./examples/data

The server starts fine, but when I access the web app at localhost:9001, I only see 500: Internal Server Error.

Stack trace/error message

INFO - 11/04/19 10:36:39 - 0:00:00 - Application Started
You can navigate to http://localhost:9001
ERROR - 11/04/19 10:36:42 - 0:00:03 - Uncaught exception GET / (192.168.0.30)
HTTPServerRequest(protocol='http', host='192.168.0.231:9001', method='GET', uri='/', version='HTTP/1.1', remote_ip='192.168.0.30')
ERROR - 11/04/19 10:36:42 - 0:00:03 - 500 GET / (192.168.0.30) 1.18ms

Expected Behavior

The web app runs normally.

System information

VizSeq Version : 0.1.2

Python version : 3.6.8

Operating system : Ubuntu 16.04

bug
opened by astariul 2

View Scores NoneType not subscriptable

When running the following for scores like Rouge and others like:

vizseq.view_scores(ref, hypo, ['metric that's not bleu'], tags=tag)

I am getting:

~/.local/lib/python3.6/site-packages/vizseq/scorers/__init__.py in _score_multiprocess_averaged(self, hypothesis, references, tags, sent_score_func)
    170             for t in tag_set:
    171                 indices = [i for i, cur in enumerate(tags) if t in cur]
--> 172                 group_scores[t] = np.mean([sent_scores[i] for i in indices])
    173 
    174         return VizSeqScore.make(

~/.local/lib/python3.6/site-packages/vizseq/scorers/__init__.py in <listcomp>(.0)
    170             for t in tag_set:
    171                 indices = [i for i, cur in enumerate(tags) if t in cur]
--> 172                 group_scores[t] = np.mean([sent_scores[i] for i in indices])
    173 
    174         return VizSeqScore.make(

TypeError: 'NoneType' object is not subscriptable

Any idea why that's going on? I am using text data in which every sentence has at least 3 tokens.

This works fine when running view_examples.
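
The traceback suggests that `sent_scores` is `None` when the grouping loop indexes it; why no sentence-level scores were produced is not visible from this report. The sketch below mirrors the two lines shown in the traceback and adds an explicit check, so that a missing score list fails with a descriptive message instead of `'NoneType' object is not subscriptable`. `group_means_checked` is a hypothetical helper, not VizSeq code.

```python
from typing import Dict, List, Optional


def group_means_checked(
    sent_scores: Optional[List[float]], tags: List[List[str]]
) -> Dict[str, float]:
    """Per-tag mean of sentence scores, failing early when scores are missing."""
    if sent_scores is None:
        raise ValueError("sentence-level scores are missing; cannot compute group scores")
    tag_set = sorted({t for cur in tags for t in cur})
    group_scores = {}
    for t in tag_set:
        # Same selection logic as in the traceback: sentences whose tag list contains t.
        indices = [i for i, cur in enumerate(tags) if t in cur]
        group_scores[t] = sum(sent_scores[i] for i in indices) / len(indices)
    return group_scores


try:
    group_means_checked(None, [["a"], ["b"]])
except ValueError as err:
    message = str(err)

print(message)
print(group_means_checked([1.0, 3.0], [["a"], ["a", "b"]]))
```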

opened by smart-patrol 1

🐛 TypeError: score() got an unexpected keyword argument 'bert'

🐛 Bug

I tried to apply BertScore to my data, but received this error:

TypeError: score() got an unexpected keyword argument 'bert'

To reproduce

In configuration, select BertScore as metric.
Refresh the page

Stack trace/error message

Traceback (most recent call last):
  File "/home/me/.venv/presum/lib/python3.6/site-packages/tornado/web.py", line 1590, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/home/me/workspace/vizseq/vizseq/server.py", line 103, in get
    pd = wv.get_page_data()
  File "/home/me/workspace/vizseq/vizseq/_view/web_view.py", line 158, in get_page_data
    sorting_metric=self.sorting_metric, need_lang_tags=True
  File "/home/me/workspace/vizseq/vizseq/_view/data_view.py", line 132, in get
    for s in metrics
  File "/home/me/workspace/vizseq/vizseq/_view/data_view.py", line 132, in <dictcomp>
    for s in metrics
  File "/home/me/workspace/vizseq/vizseq/_view/data_view.py", line 130, in <dictcomp>
    for m, hh in cur_hypo.items()
  File "/home/me/workspace/vizseq/vizseq/scorers/bert_score.py", line 28, in score
    no_idf=True, verbose=self.verbose
TypeError: score() got an unexpected keyword argument 'bert'

Expected Behavior

Able to see BertScore.

System information

VizSeq Version : 0.1.2
Python version : 3.6.8
Operating system : Ubuntu 16.04

bug

opened by astariul 1

🐛 AttributeError: 'VizSeqLogger' object has no attribute 'set_console_mode'
🐛 Bug

When trying to run the web app with the example data, I get this error:

AttributeError: 'VizSeqLogger' object has no attribute 'set_console_mode'

To reproduce

Follow the README instructions: download the example data and run:

python -m vizseq.server --port 9001 --data-root ./examples/data

Stack trace/error message

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/me/workspace/vizseq/vizseq/server.py", line 14, in <module>
    logger.set_console_mode(enable=True)
AttributeError: 'VizSeqLogger' object has no attribute 'set_console_mode'

Expected Behavior

The code runs normally.

System information

VizSeq Version : 0.1.2

Python version : 3.6.8

Operating system : Ubuntu 16.04

bug
opened by astariul 1
Bump qs from 6.5.2 to 6.5.3 in /website
Bumps qs from 6.5.2 to 6.5.3.

Changelog

Sourced from qs's changelog.

6.5.3

[Fix] parse: ignore __proto__ keys (#428)

[Fix] utils.merge: avoid a crash with a null target and a truthy non-array source

[Fix] correctly parse nested arrays

[Fix] stringify: fix a crash with strictNullHandling and a custom filter/serializeDate (#279)

[Fix] utils: merge: fix crash when source is a truthy primitive & no options are provided

[Fix] when parseArrays is false, properly handle keys ending in []

[Fix] fix for an impossible situation: when the formatter is called with a non-string value

[Fix] utils.merge: avoid a crash with a null target and an array source

[Refactor] utils: reduce observable [[Get]]s

[Refactor] use cached Array.isArray

[Refactor] stringify: Avoid arr = arr.concat(...), push to the existing instance (#269)

[Refactor] parse: only need to reassign the var once

[Robustness] stringify: avoid relying on a global undefined (#427)

[readme] remove travis badge; add github actions/codecov badges; update URLs

[Docs] Clean up license text so it’s properly detected as BSD-3-Clause

[Docs] Clarify the need for "arrayLimit" option

[meta] fix README.md (#399)

[meta] add FUNDING.yml

[actions] backport actions from main

[Tests] always use String(x) over x.toString()

[Tests] remove nonexistent tape option

[Dev Deps] backport from main

Commits

298bfa5 v6.5.3

ed0f5dc [Fix] parse: ignore __proto__ keys (#428)

691e739 [Robustness] stringify: avoid relying on a global undefined (#427)

1072d57 [readme] remove travis badge; add github actions/codecov badges; update URLs

12ac1c4 [meta] fix README.md (#399)

0338716 [actions] backport actions from main

5639c20 Clean up license text so it’s properly detected as BSD-3-Clause

51b8a0b add FUNDING.yml

45f6759 [Fix] fix for an impossible situation: when the formatter is called with a no...

f814a7f [Dev Deps] backport from main

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot use these labels will set the current labels as the default for future PRs for this repo and language

@dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language

@dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language

@dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

CLA Signed dependencies
opened by dependabot[bot] 0
Bump decode-uri-component from 0.2.0 to 0.2.2 in /website
Bumps decode-uri-component from 0.2.0 to 0.2.2.

Release notes

Sourced from decode-uri-component's releases.

v0.2.2

Prevent overwriting previously decoded tokens 980e0bf

https://github.com/SamVerschueren/decode-uri-component/compare/v0.2.1...v0.2.2

v0.2.1

Switch to GitHub workflows 76abc93

Fix issue where decode throws - fixes #6 746ca5d

Update license (#1) 486d7e2

Tidelift tasks a650457

Meta tweaks 66e1c28

https://github.com/SamVerschueren/decode-uri-component/compare/v0.2.0...v0.2.1

Commits

a0eea46 0.2.2

980e0bf Prevent overwriting previously decoded tokens

3c8a373 0.2.1

76abc93 Switch to GitHub workflows

746ca5d Fix issue where decode throws - fixes #6

486d7e2 Update license (#1)

a650457 Tidelift tasks

66e1c28 Meta tweaks

See full diff in compare view

CLA Signed dependencies
opened by dependabot[bot] 0
Bump express from 4.17.1 to 4.18.2 in /website
Bumps express from 4.17.1 to 4.18.2.

Release notes

Sourced from express's releases.

4.18.2

Fix regression routing a large stack in a single route

deps: [email protected]

deps: [email protected]

perf: remove unnecessary object clone

deps: [email protected]

4.18.1

Fix hanging on large stack of sync routes

4.18.0

Add "root" option to res.download

Allow options without filename in res.download

Deprecate string and non-integer arguments to res.status

Fix behavior of null/undefined as maxAge in res.cookie

Fix handling very large stacks of sync middleware

Ignore Object.prototype values in settings through app.set/app.get

Invoke default with same arguments as types in res.format

Support proper 205 responses using res.send

Use http-errors for res.format error

deps: [email protected]

Fix error message for json parse whitespace in strict

Fix internal error when inflated body exceeds limit

Prevent loss of async hooks context

Prevent hanging when request already read

deps: [email protected]

deps: [email protected]

deps: [email protected]

deps: [email protected]

deps: [email protected]

deps: [email protected]

Add priority option

Fix expires option to reject invalid dates

deps: [email protected]

Replace internal eval usage with Function constructor

Use instance methods on process to check for listeners

deps: [email protected]

Remove set content headers that break response

deps: [email protected]

deps: [email protected]

deps: [email protected]

Prevent loss of async hooks context

deps: [email protected]

deps: [email protected]

Fix emitted 416 error missing headers property

Limit the headers removed for 304 response

deps: [email protected]

deps: [email protected]

deps: [email protected]

deps: [email protected]

... (truncated)

Changelog

Sourced from express's changelog.

4.18.2 / 2022-10-08

Fix regression routing a large stack in a single route

deps: [email protected]

deps: [email protected]

perf: remove unnecessary object clone

deps: [email protected]

4.18.1 / 2022-04-29

Fix hanging on large stack of sync routes

4.18.0 / 2022-04-25

Add "root" option to res.download

Allow options without filename in res.download

Deprecate string and non-integer arguments to res.status

Fix behavior of null/undefined as maxAge in res.cookie

Fix handling very large stacks of sync middleware

Ignore Object.prototype values in settings through app.set/app.get

Invoke default with same arguments as types in res.format

Support proper 205 responses using res.send

Use http-errors for res.format error

deps: [email protected]

Fix error message for json parse whitespace in strict

Fix internal error when inflated body exceeds limit

Prevent loss of async hooks context

Prevent hanging when request already read

deps: [email protected]

deps: [email protected]

deps: [email protected]

deps: [email protected]

deps: [email protected]

deps: [email protected]

Add priority option

Fix expires option to reject invalid dates

deps: [email protected]

Replace internal eval usage with Function constructor

Use instance methods on process to check for listeners

deps: [email protected]

Remove set content headers that break response

deps: [email protected]

deps: [email protected]

deps: [email protected]

Prevent loss of async hooks context

deps: [email protected]

deps: [email protected]

... (truncated)

Commits

8368dc1 4.18.2

61f4049 docs: replace Freenode with Libera Chat

bb7907b build: [email protected]

f56ce73 build: [email protected]

24b3dc5 deps: [email protected]

689d175 deps: [email protected]

340be0f build: [email protected]

33e8dc3 docs: use Node.js name style

644f646 build: [email protected]

ecd7572 build: [email protected]

Additional commits viewable in compare view

CLA Signed dependencies
opened by dependabot[bot] 0
[Bug] BLEUScorer uses wrong default tokenizer.
🐛 Bug

vizseq.scorers.bleu.BLEUScorer does not use Tokenizer13a by default, although the code looks as if it should. The sacrebleu library also uses Tokenizer13a by default.

To reproduce

Minimal Code/Config snippet to reproduce

import vizseq

scorer = vizseq.scorers.bleu.BLEUScorer()
print(scorer.score(["This is really nice."], [["That's really nice."]]))
# corpus_score = 31.947

scorer = vizseq.scorers.bleu.BLEUScorer(extra_args={'tokenizer': '13a'})
print(scorer.score(["This is really nice."], [["That's really nice."]]))
# corpus_score = 39.764

Stack trace/error message

The problem is here. The variable tokenizer is set to the string none. When the method get_default_args is called (here), the default value 13a of the parameter tokenize is not used, because the string none is passed explicitly.

Expected Behavior

vizseq.scorers.bleu.BLEUScorer should use Tokenizer13a by default.
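
The mechanism described in this report can be sketched independently of VizSeq and sacrebleu. In the snippet below, `get_default_args` is only a hypothetical stand-in with an assumed signature, and both wrapper functions are illustrative: the first always forwards a fallback string and so masks the callee's default, while the second forwards the argument only when the caller set it.

```python
from typing import Optional


def get_default_args(tokenize: str = "13a") -> dict:
    """Hypothetical stand-in for a function whose default tokenizer is '13a'."""
    return {"tokenize": tokenize}


def build_args_buggy(extra_args: Optional[dict] = None) -> dict:
    extra_args = extra_args or {}
    # Falls back to the string 'none' and passes it explicitly,
    # so the callee's own default ('13a') is never used.
    tokenizer = extra_args.get("tokenizer", "none")
    return get_default_args(tokenize=tokenizer)


def build_args_fixed(extra_args: Optional[dict] = None) -> dict:
    extra_args = extra_args or {}
    tokenizer = extra_args.get("tokenizer")
    # Forward the argument only when the caller actually provided one.
    if tokenizer is None:
        return get_default_args()
    return get_default_args(tokenize=tokenizer)


print(build_args_buggy())
print(build_args_fixed())
print(build_args_fixed({"tokenizer": "intl"}))
```

Whether the real fix should change the fallback value or omit the argument is a design decision for the maintainers; the sketch only shows that either approach restores the intended default.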

System information

vizseq==0.1.15

python==3.7.3

macOS

bug
opened by landert 0
pip3 install vizseq failed on AArch64, Fedora 33
[jw@cn05 ~]$ pip3 install vizseq
Defaulting to user installation because normal site-packages is not writeable
Collecting vizseq
  Using cached vizseq-0.1.15-py3-none-any.whl (81 kB)
Collecting nltk>=3.5
  Using cached nltk-3.5-py3-none-any.whl
Collecting sacrebleu>=1.4.13
  Using cached sacrebleu-1.5.0-py3-none-any.whl (65 kB)
Collecting langid
  Using cached langid-1.1.6.tar.gz (1.9 MB)
Requirement already satisfied: tqdm in ./.local/lib/python3.9/site-packages (from vizseq) (4.31.1)
Collecting google-cloud-translate
  Using cached google_cloud_translate-3.0.2-py2.py3-none-any.whl (93 kB)
Collecting torch
  Using cached torch-0.1.2.post2.tar.gz (128 kB)
Requirement already satisfied: numpy in ./.local/lib/python3.9/site-packages (from vizseq) (1.19.5)
Requirement already satisfied: jinja2 in ./.local/lib/python3.9/site-packages (from vizseq) (2.10.3)
Collecting soundfile
  Using cached SoundFile-0.10.3.post1-py2.py3-none-any.whl (21 kB)
Requirement already satisfied: py-rouge in ./.local/lib/python3.9/site-packages (from vizseq) (1.1)
Requirement already satisfied: matplotlib in ./.local/lib/python3.9/site-packages (from vizseq) (3.3.2)
Requirement already satisfied: tornado in ./.local/lib/python3.9/site-packages (from vizseq) (6.1)
Requirement already satisfied: IPython in ./.local/lib/python3.9/site-packages (from vizseq) (7.18.1)
Collecting bert-score
  Using cached bert_score-0.3.7-py3-none-any.whl (53 kB)
Requirement already satisfied: pandas in ./.local/lib/python3.9/site-packages (from vizseq) (1.1.4)
Collecting laserembeddings
  Using cached laserembeddings-1.1.1-py3-none-any.whl (13 kB)
Requirement already satisfied: click in ./.local/lib/python3.9/site-packages (from nltk>=3.5->vizseq) (7.1.2)
Requirement already satisfied: regex in ./.local/lib/python3.9/site-packages (from nltk>=3.5->vizseq) (2020.11.13)
Requirement already satisfied: joblib in ./.local/lib/python3.9/site-packages (from nltk>=3.5->vizseq) (0.17.0)
Collecting portalocker
  Using cached portalocker-2.2.1-py2.py3-none-any.whl (15 kB)
Collecting transformers>=3.0.0
  Using cached transformers-4.3.3-py3-none-any.whl (1.9 MB)
Collecting bert-score
  Using cached bert_score-0.3.6-py3-none-any.whl (53 kB)
  Using cached bert_score-0.3.5-py3-none-any.whl (52 kB)
  Using cached bert_score-0.3.4-py3-none-any.whl (52 kB)
  Using cached bert_score-0.3.3-py3-none-any.whl (52 kB)
  Using cached bert_score-0.3.2-py3-none-any.whl (52 kB)
  Using cached bert_score-0.3.1-py3-none-any.whl (51 kB)
  Using cached bert_score-0.3.0-py3-none-any.whl (48 kB)
  Using cached bert_score-0.2.3-py3-none-any.whl (15 kB)
  Using cached bert_score-0.2.2-py3-none-any.whl (14 kB)
  Using cached bert_score-0.1.2-py3-none-any.whl (9.4 kB)
  Using cached bert_score-0.1.1-py3-none-any.whl (9.4 kB)
  Using cached bert_score-0.1.0-py3-none-any.whl (7.3 kB)
INFO: pip is looking at multiple versions of to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of sacrebleu to determine which version is compatible with other requirements. This could take a while.
Collecting sacrebleu>=1.4.13
  Using cached sacrebleu-1.4.14-py3-none-any.whl (64 kB)
  Using cached sacrebleu-1.4.13-py3-none-any.whl (43 kB)
INFO: pip is looking at multiple versions of nltk to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of vizseq to determine which version is compatible with other requirements. This could take a while.
Collecting vizseq
  Using cached vizseq-0.1.14-py3-none-any.whl (81 kB)
  Using cached vizseq-0.1.13-py3-none-any.whl (81 kB)
  Using cached vizseq-0.1.12-py3-none-any.whl (81 kB)
  Using cached vizseq-0.1.11-py3-none-any.whl (81 kB)
  Using cached vizseq-0.1.10-py3-none-any.whl (80 kB)
  Using cached vizseq-0.1.9-py3-none-any.whl (78 kB)
Requirement already satisfied: nltk in ./.local/lib/python3.9/site-packages (from vizseq) (3.4.5)
Collecting sacrebleu==1.4.7
  Using cached sacrebleu-1.4.7-py3-none-any.whl (59 kB)
Requirement already satisfied: typing in ./.local/lib/python3.9/site-packages (from sacrebleu==1.4.7->vizseq) (3.7.4.3)
Collecting mecab-python3
  Using cached mecab-python3-1.0.3.tar.gz (77 kB)
INFO: pip is looking at multiple versions of to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of sacrebleu to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install vizseq and vizseq==0.1.9 because these package versions have conflicting dependencies.

The conflict is caused by:
    vizseq 0.1.9 depends on torch
    bert-score 0.3.7 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.3.6 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.3.5 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.3.4 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.3.3 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.3.2 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.3.1 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.3.0 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.2.3 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.2.2 depends on torch>=1.0.0
    vizseq 0.1.9 depends on torch
    bert-score 0.1.2 depends on torch>=0.4.1
    vizseq 0.1.9 depends on torch
    bert-score 0.1.1 depends on torch>=0.4.1
    vizseq 0.1.9 depends on torch
    bert-score 0.1.0 depends on torch>=0.4.1

To fix this you could try to:

loosen the range of package versions you've specified

remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
[jw@cn05 ~]$
bug
opened by LutzWeischerFujitsu 0
Example speech task (IWSLT17 dev) not pairing correct audio source with reference [Bug]
🐛 Bug

Audio segments from the speech data in the example speech translation task (IWSLT17 dev) are not correctly associated with the reference data.

Only the audio segments of the first TED talk are correctly aligned to the reference. Playing the audio segments of any other talk (from # 3 / 10 / 887 ( 153 / 887 ) onwards on page 16 of the task, using the defaults) plays segments of the first TED talk instead of the segments specified in the task directory speech_translation_iwslt17_dev/src_0.zip/source.txt

To reproduce

Get the example speech task data (IWSLT17 dev)

$ bash get_example_data.sh speech_translation_iwslt17_dev

Start the server and navigate to: http://127.0.0.1:5000/view?t=speech_translation_iwslt17_dev&m=&q=&p_sz=10&p_no=16&s=0&s_metric=

Play the audio segments: the first two on this page are correctly associated with the reference text; from # 3 / 10 / 887 ( 153 / 887 ) onwards they are not.
bug
opened by jb101 0
[Feature Request] Update BertScorer with oop implementation

🚀 Feature Request

The new version (0.3.1) of Bert Score (https://github.com/Tiiiger/bert_score) supports an object-oriented (OOP) implementation. VizSeq currently uses the functional implementation, which could be updated to the OOP one.

Motivation

Currently, using Bert Score in a validation loop reloads the model on every call. This can be avoided with the OOP implementation.

Pitch

I see two solutions: (i) create a separate scorer named bert_score_oop, or (ii) add an argument to the current bert_score implementation that selects whether to use the OOP implementation.
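
The motivation can be illustrated without the real libraries. In the sketch below, `load_model` is a hypothetical, deliberately trivial stand-in for an expensive model load, and the exact-match scoring is a placeholder rather than BERTScore; the point is only that a functional interface reloads on every call, while an object that holds the model loads it once.

```python
load_count = 0


def load_model(name: str) -> dict:
    """Hypothetical stand-in for an expensive model load; counts how often it runs."""
    global load_count
    load_count += 1
    return {"name": name}


def score_functional(hypo, ref, model_name="dummy"):
    """Functional style: the model is loaded again on every call."""
    model = load_model(model_name)  # unused by the placeholder scoring below
    return [float(h == r) for h, r in zip(hypo, ref)]


class CachedScorer:
    """Object-oriented style: the model is loaded once and reused."""

    def __init__(self, model_name="dummy"):
        self.model = load_model(model_name)

    def score(self, hypo, ref):
        # Placeholder scoring (exact match), not an embedding-based metric.
        return [float(h == r) for h, r in zip(hypo, ref)]


# Simulate a validation loop with three scoring calls per style.
for _ in range(3):
    score_functional(["a"], ["a"])
functional_loads = load_count

scorer = CachedScorer()
for _ in range(3):
    scorer.score(["a"], ["a"])
cached_loads = load_count - functional_loads

print(functional_loads, cached_loads)
```

Either proposed solution would move the real scorer from the first pattern to the second; the choice between a separate scorer and a flag mainly affects backward compatibility.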

Are you willing to open a pull request? Yes, I can send a pull request
enhancement

opened by TheShadow29 1
[Bug] Vizseq CSS breaks Jupyter layout
🐛 Bug

Executing vizseq.view_stats breaks the layout of the Jupyter notebook. The menu at the top obscures most of the screen, and a blank area of about 60px appears at the top of the page.

To reproduce

Minimal Code/Config snippet to reproduce

Start Jupyter: jupyter notebook

View any of the example notebooks, e.g. speech_translation.

Execute the cells one by one.

When the first cell containing vizseq.view_stats finishes, the layout changes and appears broken.

Expected Behavior

The display of tables and graphs by vizseq does not affect the layout of the Jupyter notebook.

System information

VizSeq Version: '0.1.11' (clone from master yesterday)

Python version: Python 3.8.1 (default, Jan 8 2020, 23:09:20) [GCC 9.2.0] on linux

Operating system: Manjaro Linux

Additional context

Cause: The bootstrap.min.css and an inline stylesheet loaded by vizseq break the layout. The inline stylesheet is:

body {
  padding-top: 60px; /* 60px to make the container go all the way to the bottom of the topbar */
}

The inline stylesheet is responsible for the blank bar at the top while Bootstrap breaks the menu's formatting.

To test this, disable both stylesheets in the stylesheet editor of the browser's developer tools.
bug
opened by pyfisch 0