Python interface to Stanford Core NLP tools v3.4.1
This is a Python wrapper for the Java-based CoreNLP tools from Stanford University's NLP group. It can either be imported as a module or run as a JSON-RPC server. Because it uses many large trained models (requiring about 3GB of RAM on 64-bit machines and usually a few minutes of loading time), most applications will probably want to run it as a server.
- Python interface to Stanford CoreNLP tools: tagging, phrase-structure parsing, dependency parsing, named-entity recognition, and coreference resolution.
- Runs a JSON-RPC server that wraps the Java process and outputs JSON.
- Outputs parse trees which can be used by nltk.
It depends on pexpect and includes code from jsonrpc and python-progressbar.
It runs the Stanford CoreNLP jar in a separate process, communicates with the Java process through its command-line interface, and makes assumptions about the parser's output in order to convert it into a Python dict and transfer it as JSON. The wrapper will break if the output changes significantly, but it has been tested against CoreNLP tools version 3.4.1, released 2014-08-27.
Download and Usage
To use this program you must download and unpack the compressed file containing Stanford's CoreNLP package. By default, corenlp.py looks for the Stanford CoreNLP folder as a subdirectory of where the script is being run. In other words:
sudo pip install pexpect unidecode
git clone git://github.com/dasmith/stanford-corenlp-python.git
cd stanford-corenlp-python
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2014-08-27.zip
unzip stanford-corenlp-full-2014-08-27.zip
Then launch the server:
python corenlp.py
Optionally, you can specify a host or port:
python corenlp.py -H 0.0.0.0 -p 3456
That will run a public JSON-RPC server on port 3456.
Assuming you are running on port 8080, the code in client.py shows an example parse:
import jsonrpc
from simplejson import loads
server = jsonrpc.ServerProxy(jsonrpc.JsonRpc20(),
jsonrpc.TransportTcpIp(addr=("127.0.0.1", 8080)))
result = loads(server.parse("Hello world!  It is so beautiful."))
print "Result", result
That returns a dictionary containing the keys sentences and coref. The key sentences contains a list of dictionaries, one per sentence, each of which contains parsetree, text, tuples holding the dependencies, and words, which carries information about parts of speech, recognized named entities, etc.:
{u'sentences': [{u'parsetree': u'(ROOT (S (VP (NP (INTJ (UH Hello)) (NP (NN world)))) (. !)))',
u'text': u'Hello world!',
u'tuples': [[u'dep', u'world', u'Hello'],
[u'root', u'ROOT', u'world']],
u'words': [[u'Hello',
{u'CharacterOffsetBegin': u'0',
u'CharacterOffsetEnd': u'5',
u'Lemma': u'hello',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'UH'}],
[u'world',
{u'CharacterOffsetBegin': u'6',
u'CharacterOffsetEnd': u'11',
u'Lemma': u'world',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'NN'}],
[u'!',
{u'CharacterOffsetBegin': u'11',
u'CharacterOffsetEnd': u'12',
u'Lemma': u'!',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'.'}]]},
{u'parsetree': u'(ROOT (S (NP (PRP It)) (VP (VBZ is) (ADJP (RB so) (JJ beautiful))) (. .)))',
u'text': u'It is so beautiful.',
u'tuples': [[u'nsubj', u'beautiful', u'It'],
[u'cop', u'beautiful', u'is'],
[u'advmod', u'beautiful', u'so'],
[u'root', u'ROOT', u'beautiful']],
u'words': [[u'It',
{u'CharacterOffsetBegin': u'14',
u'CharacterOffsetEnd': u'16',
u'Lemma': u'it',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'PRP'}],
[u'is',
{u'CharacterOffsetBegin': u'17',
u'CharacterOffsetEnd': u'19',
u'Lemma': u'be',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'VBZ'}],
[u'so',
{u'CharacterOffsetBegin': u'20',
u'CharacterOffsetEnd': u'22',
u'Lemma': u'so',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'RB'}],
[u'beautiful',
{u'CharacterOffsetBegin': u'23',
u'CharacterOffsetEnd': u'32',
u'Lemma': u'beautiful',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'JJ'}],
[u'.',
{u'CharacterOffsetBegin': u'32',
u'CharacterOffsetEnd': u'33',
u'Lemma': u'.',
u'NamedEntityTag': u'O',
u'PartOfSpeech': u'.'}]]}],
u'coref': [[[[u'It', 1, 0, 0, 1], [u'Hello world', 0, 1, 0, 2]]]]}
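For programmatic access, the dictionary above can be walked directly. The following minimal sketch reuses the result variable from the client.py example and prints each token with its part of speech and named-entity tag, followed by the dependency triples:

# Walk the dictionary returned by loads(server.parse(...)) above.
for sentence in result["sentences"]:
    print "Sentence:", sentence["text"]
    # Each entry in "words" is [token, {attribute: value, ...}].
    for token, attrs in sentence["words"]:
        print "  %-10s POS=%-4s NER=%-4s lemma=%s" % (
            token, attrs["PartOfSpeech"], attrs["NamedEntityTag"], attrs["Lemma"])
    # Each dependency triple is [relation, governor, dependent].
    for relation, governor, dependent in sentence["tuples"]:
        print "  %s(%s, %s)" % (relation, governor, dependent)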
To use it in a regular script (useful for debugging), load the module instead:
from corenlp import *
corenlp = StanfordCoreNLP() # wait a few minutes...
corenlp.parse("Parse this sentence.")
The server, StanfordCoreNLP(), takes an optional argument corenlp_path which specifies the path to the jar files. The default value is StanfordCoreNLP(corenlp_path="./stanford-corenlp-full-2014-08-27/").
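A slightly fuller sketch of in-process use, assuming parse() returns a JSON string just as the RPC call does (if your copy returns a dict directly, drop the loads() call):

from simplejson import loads
from corenlp import StanfordCoreNLP

# Point the wrapper at the unpacked CoreNLP distribution; adjust the path if
# you unzipped it somewhere else. Loading the models can take a few minutes.
corenlp = StanfordCoreNLP(corenlp_path="./stanford-corenlp-full-2014-08-27/")

result = loads(corenlp.parse("Parse this sentence."))
for sentence in result["sentences"]:
    print sentence["parsetree"]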
Coreference Resolution
The library supports coreference resolution, which means pronouns can be "dereferenced." If an entry in the coref list is [u'Hello world', 0, 1, 0, 2], the numbers mean (see the sketch after this list):
- 0 = The reference appears in the 0th sentence (e.g. "Hello world")
- 1 = The token at index 1, "world", is the head word of the mention
- 0 = 'Hello world' begins at the 0th token in the sentence
- 2 = 'Hello world' ends before the 2nd token in the sentence.
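To make those indices concrete, the hypothetical helper below (not part of the library) resolves a mention back to the tokens of its sentence, using the result dictionary from the example output above:

def mention_tokens(result, mention):
    # A mention is [text, sentence_index, head_index, start_token, end_token],
    # where end_token is exclusive, as described in the list above.
    text, sent_idx, head_idx, start, end = mention
    words = result["sentences"][sent_idx]["words"]
    return [token for token, attrs in words[start:end]]

# Each coref chain pairs a mention with its antecedent, e.g.
# [[u'It', 1, 0, 0, 1], [u'Hello world', 0, 1, 0, 2]].
for chain in result["coref"]:
    for mention, antecedent in chain:
        print mention[0], "->", antecedent[0]
        print "  tokens:", mention_tokens(result, mention), \
            "refer to", mention_tokens(result, antecedent)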
Questions
Stanford CoreNLP tools require a large amount of free memory. Java 5+ uses about 50% more RAM on 64-bit machines than on 32-bit machines. Users on 32-bit machines can lower the memory requirements by changing -Xmx3g to -Xmx2g or even less. If pexpect times out while loading the models, check that you have enough memory and can run the server alone, without your kernel killing the Java process:
java -cp stanford-corenlp-3.4.1.jar:stanford-corenlp-3.4.1-models.jar:xom.jar:joda-time.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -props default.properties
You can reach me, Dustin Smith, by sending a message on GitHub or through email (contact information is available on my webpage).
License & Contributors
This is free and open source software and has benefited from the contribution and feedback of others. Like Stanford's CoreNLP tools, it is covered under the GNU General Public License v2+, which in short means that modifications to this program must maintain the same free and open source distribution policy.
I gratefully welcome bug fixes and new features. If you have forked this repository, please submit a pull request so others can benefit from your contributions. This project has already benefited from contributions from these members of the open source community:
- Emilio Monti
- Justin Cheng
- Abhaya Agarwal
Thank you!
Related Projects
Maintainers of the Core NLP library at Stanford keep an updated list of wrappers and extensions. See Brendan O'Connor's stanford_corenlp_pywrapper for a different approach more suited to batch processing.