Correctly generate plurals, ordinals, indefinite articles; convert numbers to words

Jason R. Coombs

Last update: Dec 29, 2022

Related tags

Text Data & NLP inflect

Overview

https://coveralls.io/repos/github/jaraco/inflect/badge.svg?branch=master

https://tidelift.com/badges/package/pypi/inflect

NAME

inflect.py - Correctly generate plurals, singular nouns, ordinals, indefinite articles; convert numbers to words.

SYNOPSIS

import inflect

p = inflect.engine()

# METHODS:

# plural plural_noun plural_verb plural_adj singular_noun no num
# compare compare_nouns compare_nouns compare_adjs
# a an
# present_participle
# ordinal number_to_words
# join
# inflect classical gender
# defnoun defverb defadj defa defan


# UNCONDITIONALLY FORM THE PLURAL

print("The plural of ", word, " is ", p.plural(word))


# CONDITIONALLY FORM THE PLURAL

print("I saw", cat_count, p.plural("cat", cat_count))


# FORM PLURALS FOR SPECIFIC PARTS OF SPEECH

print(
    p.plural_noun("I", N1),
    p.plural_verb("saw", N1),
    p.plural_adj("my", N2),
    p.plural_noun("saw", N2),
)


# FORM THE SINGULAR OF PLURAL NOUNS

print("The singular of ", word, " is ", p.singular_noun(word))

# SELECT THE GENDER OF SINGULAR PRONOUNS

print(p.singular_noun("they"))  # 'it'
p.gender("f")
print(p.singular_noun("they"))  # 'she'


# DEAL WITH "0/1/N" -> "no/1/N" TRANSLATION:

print("There ", p.plural_verb("was", errors), p.no(" error", errors))


# USE DEFAULT COUNTS:

print(
    p.num(N1, ""),
    p.plural("I"),
    p.plural_verb(" saw"),
    p.num(N2),
    p.plural_noun(" saw"),
)
print("There ", p.num(errors, ""), p.plural_verb("was"), p.no(" error"))


# COMPARE TWO WORDS "NUMBER-INSENSITIVELY":

if p.compare(word1, word2):
    print("same")
if p.compare_nouns(word1, word2):
    print("same noun")
if p.compare_verbs(word1, word2):
    print("same verb")
if p.compare_adjs(word1, word2):
    print("same adj.")


# ADD CORRECT "a" OR "an" FOR A GIVEN WORD:

print("Did you want ", p.a(thing), " or ", p.an(idea))


# CONVERT NUMERALS INTO ORDINALS (i.e. 1->1st, 2->2nd, 3->3rd, etc.)

print("It was", p.ordinal(position), " from the left\n")

# CONVERT NUMERALS TO WORDS (i.e. 1->"one", 101->"one hundred and one", etc.)
# RETURNS A SINGLE STRING...

words = p.number_to_words(1234)
# "one thousand, two hundred and thirty-four"
words = p.number_to_words(p.ordinal(1234))
# "one thousand, two hundred and thirty-fourth"


# GET BACK A LIST OF STRINGS, ONE FOR EACH "CHUNK"...

words = p.number_to_words(1234, wantlist=True)
# ("one thousand","two hundred and thirty-four")


# OPTIONAL PARAMETERS CHANGE TRANSLATION:

words = p.number_to_words(12345, group=1)
# "one, two, three, four, five"

words = p.number_to_words(12345, group=2)
# "twelve, thirty-four, five"

words = p.number_to_words(12345, group=3)
# "one twenty-three, forty-five"

words = p.number_to_words(1234, andword="")
# "one thousand, two hundred thirty-four"

words = p.number_to_words(1234, andword=", plus")
# "one thousand, two hundred, plus thirty-four"
# TODO: I get no comma before plus: check perl

words = p.number_to_words(555_1202, group=1, zero="oh")
# "five, five, five, one, two, oh, two"

words = p.number_to_words(555_1202, group=1, one="unity")
# "five, five, five, unity, two, oh, two"

words = p.number_to_words(123.456, group=1, decimal="mark")
# "one two three mark four five six"
# TODO: DOCBUG: perl gives commas here as do I

# LITERAL STYLE ONLY NAMES NUMBERS LESS THAN A CERTAIN THRESHOLD...

words = p.number_to_words(9, threshold=10)  # "nine"
words = p.number_to_words(10, threshold=10)  # "ten"
words = p.number_to_words(11, threshold=10)  # "11"
words = p.number_to_words(1000, threshold=10)  # "1,000"

# JOIN WORDS INTO A LIST:

mylist = join(("apple", "banana", "carrot"))
# "apple, banana, and carrot"

mylist = join(("apple", "banana"))
# "apple and banana"

mylist = join(("apple", "banana", "carrot"), final_sep="")
# "apple, banana and carrot"


# REQUIRE "CLASSICAL" PLURALS (EG: "focus"->"foci", "cherub"->"cherubim")

p.classical()  # USE ALL CLASSICAL PLURALS

p.classical(all=True)  # USE ALL CLASSICAL PLURALS
p.classical(all=False)  # SWITCH OFF CLASSICAL MODE

p.classical(zero=True)  #  "no error" INSTEAD OF "no errors"
p.classical(zero=False)  #  "no errors" INSTEAD OF "no error"

p.classical(herd=True)  #  "2 buffalo" INSTEAD OF "2 buffalos"
p.classical(herd=False)  #  "2 buffalos" INSTEAD OF "2 buffalo"

p.classical(persons=True)  # "2 chairpersons" INSTEAD OF "2 chairpeople"
p.classical(persons=False)  # "2 chairpeople" INSTEAD OF "2 chairpersons"

p.classical(ancient=True)  # "2 formulae" INSTEAD OF "2 formulas"
p.classical(ancient=False)  # "2 formulas" INSTEAD OF "2 formulae"


# INTERPOLATE "plural()", "plural_noun()", "plural_verb()", "plural_adj()", "singular_noun()",
# a()", "an()", "num()" AND "ordinal()" WITHIN STRINGS:

print(p.inflect("The plural of {0} is plural('{0}')".format(word)))
print(p.inflect("The singular of {0} is singular_noun('{0}')".format(word)))
print(p.inflect("I saw {0} plural('cat',{0})".format(cat_count)))
print(
    p.inflect(
        "plural('I',{0}) "
        "plural_verb('saw',{0}) "
        "plural('a',{1}) "
        "plural_noun('saw',{1})".format(N1, N2)
    )
)
print(
    p.inflect(
        "num({0}, False)plural('I') "
        "plural_verb('saw') "
        "num({1}, False)plural('a') "
        "plural_noun('saw')".format(N1, N2)
    )
)
print(p.inflect("I saw num({0}) plural('cat')\nnum()".format(cat_count)))
print(p.inflect("There plural_verb('was',{0}) no('error',{0})".format(errors)))
print(p.inflect("There num({0}, False)plural_verb('was') no('error')".format(errors)))
print(p.inflect("Did you want a('{0}') or an('{1}')".format(thing, idea)))
print(p.inflect("It was ordinal('{0}') from the left".format(position)))


# ADD USER-DEFINED INFLECTIONS (OVERRIDING INBUILT RULES):

p.defnoun("VAX", "VAXen")  # SINGULAR => PLURAL

p.defverb(
    "will",  # 1ST PERSON SINGULAR
    "shall",  # 1ST PERSON PLURAL
    "will",  # 2ND PERSON SINGULAR
    "will",  # 2ND PERSON PLURAL
    "will",  # 3RD PERSON SINGULAR
    "will",  # 3RD PERSON PLURAL
)

p.defadj("hir", "their")  # SINGULAR => PLURAL

p.defa("h")  # "AY HALWAYS SEZ 'HAITCH'!"

p.defan("horrendous.*")  # "AN HORRENDOUS AFFECTATION"

DESCRIPTION

The methods of the class engine in module inflect.py provide plural inflections, singular noun inflections, "a"/"an" selection for English words, and manipulation of numbers as words.

Plural forms of all nouns, most verbs, and some adjectives are provided. Where appropriate, "classical" variants (for example: "brother" -> "brethren", "dogma" -> "dogmata", etc.) are also provided.

Single forms of nouns are also provided. The gender of singular pronouns can be chosen (for example "they" -> "it" or "she" or "he" or "they").

Pronunciation-based "a"/"an" selection is provided for all English words, and most initialisms.

It is also possible to inflect numerals (1,2,3) to ordinals (1st, 2nd, 3rd) and to English words ("one", "two", "three").

In generating these inflections, inflect.py follows the Oxford English Dictionary and the guidelines in Fowler's Modern English Usage, preferring the former where the two disagree.

The module is built around standard British spelling, but is designed to cope with common American variants as well. Slang, jargon, and other English dialects are not explicitly catered for.

Where two or more inflected forms exist for a single word (typically a "classical" form and a "modern" form), inflect.py prefers the more common form (typically the "modern" one), unless "classical" processing has been specified (see MODERN VS CLASSICAL INFLECTIONS).

FORMING PLURALS AND SINGULARS

Inflecting Plurals and Singulars

All of the plural... plural inflection methods take the word to be inflected as their first argument and return the corresponding inflection. Note that all such methods expect the singular form of the word. The results of passing a plural form are undefined (and unlikely to be correct). Similarly, the si... singular inflection method expects the plural form of the word.

The plural... methods also take an optional second argument, which indicates the grammatical "number" of the word (or of another word with which the word being inflected must agree). If the "number" argument is supplied and is not 1 (or "one" or "a", or some other adjective that implies the singular), the plural form of the word is returned. If the "number" argument does indicate singularity, the (uninflected) word itself is returned. If the number argument is omitted, the plural form is returned unconditionally.

The si... method takes a second argument in a similar fashion. If it is some form of the number 1, or is omitted, the singular form is returned. Otherwise the plural is returned unaltered.

The various methods of inflect.engine are:

plural_noun(word, count=None)

The method plural_noun() takes a singular English noun or pronoun and returns its plural. Pronouns in the nominative ("I" -> "we") and accusative ("me" -> "us") cases are handled, as are possessive pronouns ("mine" -> "ours").

plural_verb(word, count=None)

The method plural_verb() takes the singular form of a conjugated verb (that is, one which is already in the correct "person" and "mood") and returns the corresponding plural conjugation.

plural_adj(word, count=None)

The method plural_adj() takes the singular form of certain types of adjectives and returns the corresponding plural form. Adjectives that are correctly handled include: "numerical" adjectives ("a" -> "some"), demonstrative adjectives ("this" -> "these", "that" -> "those"), and possessives ("my" -> "our", "cat's" -> "cats'", "child's" -> "childrens'", etc.)

plural(word, count=None)

The method plural() takes a singular English noun, pronoun, verb, or adjective and returns its plural form. Where a word has more than one inflection depending on its part of speech (for example, the noun "thought" inflects to "thoughts", the verb "thought" to "thought"), the (singular) noun sense is preferred to the (singular) verb sense.

Hence plural("knife") will return "knives" ("knife" having been treated as a singular noun), whereas plural("knifes") will return "knife" ("knifes" having been treated as a 3rd person singular verb).

The inherent ambiguity of such cases suggests that, where the part of speech is known, plural_noun, plural_verb, and plural_adj should be used in preference to plural.

singular_noun(word, count=None)

The method singular_noun() takes a plural English noun or pronoun and returns its singular. Pronouns in the nominative ("we" -> "I") and accusative ("us" -> "me") cases are handled, as are possessive pronouns ("ours" -> "mine"). When third person singular pronouns are returned they take the neuter gender by default ("they" -> "it"), not ("they"-> "she") nor ("they" -> "he"). This can be changed with gender().

Note that all these methods ignore any whitespace surrounding the word being inflected, but preserve that whitespace when the result is returned. For example, plural(" cat ") returns " cats ".

gender(genderletter)

The third person plural pronoun takes the same form for the female, male and neuter (e.g. "they"). The singular however, depends upon gender (e.g. "she", "he", "it" and "they" -- "they" being the gender neutral form.) By default singular_noun returns the neuter form, however, the gender can be selected with the gender method. Pass the first letter of the gender to gender to return the f(eminine), m(asculine), n(euter) or t(hey) form of the singular. e.g. gender('f') followed by singular_noun('themselves') returns 'herself'.

Numbered plurals

The plural... methods return only the inflected word, not the count that was used to inflect it. Thus, in order to produce "I saw 3 ducks", it is necessary to use:

print("I saw", N, p.plural_noun(animal, N))

Since the usual purpose of producing a plural is to make it agree with a preceding count, inflect.py provides a method (no(word, count)) which, given a word and a(n optional) count, returns the count followed by the correctly inflected word. Hence the previous example can be rewritten:

print("I saw ", p.no(animal, N))

In addition, if the count is zero (or some other term which implies zero, such as "zero", "nil", etc.) the count is replaced by the word "no". Hence, if N had the value zero, the previous example would print (the somewhat more elegant):

I saw no animals

rather than:

I saw 0 animals

Note that the name of the method is a pun: the method returns either a number (a No.) or a "no", in front of the inflected word.

Reducing the number of counts required

In some contexts, the need to supply an explicit count to the various plural... methods makes for tiresome repetition. For example:

print(
    plural_adj("This", errors),
    plural_noun(" error", errors),
    plural_verb(" was", errors),
    " fatal.",
)

inflect.py therefore provides a method (num(count=None, show=None)) which may be used to set a persistent "default number" value. If such a value is set, it is subsequently used whenever an optional second "number" argument is omitted. The default value thus set can subsequently be removed by calling num() with no arguments. Hence we could rewrite the previous example:

p.num(errors)
print(p.plural_adj("This"), p.plural_noun(" error"), p.plural_verb(" was"), "fatal.")
p.num()

Normally, num() returns its first argument, so that it may also be "inlined" in contexts like:

print(p.num(errors), p.plural_noun(" error"), p.plural_verb(" was"), " detected.")
if severity > 1:
    print(
        p.plural_adj("This"), p.plural_noun(" error"), p.plural_verb(" was"), "fatal."
    )

However, in certain contexts (see INTERPOLATING INFLECTIONS IN STRINGS) it is preferable that num() return an empty string. Hence num() provides an optional second argument. If that argument is supplied (that is, if it is defined) and evaluates to false, num returns an empty string instead of its first argument. For example:

print(p.num(errors, 0), p.no("error"), p.plural_verb(" was"), " detected.")
if severity > 1:
    print(
        p.plural_adj("This"), p.plural_noun(" error"), p.plural_verb(" was"), "fatal."
    )

Number-insensitive equality

inflect.py also provides a solution to the problem of comparing words of differing plurality through the methods compare(word1, word2), compare_nouns(word1, word2), compare_verbs(word1, word2), and compare_adjs(word1, word2). Each of these methods takes two strings, and compares them using the corresponding plural-inflection method (plural(), plural_noun(), plural_verb(), and plural_adj() respectively).

The comparison returns true if:

the strings are equal, or
one string is equal to a plural form of the other, or
the strings are two different plural forms of the one word.

Hence all of the following return true:

p.compare("index", "index")  # RETURNS "eq"
p.compare("index", "indexes")  # RETURNS "s:p"
p.compare("index", "indices")  # RETURNS "s:p"
p.compare("indexes", "index")  # RETURNS "p:s"
p.compare("indices", "index")  # RETURNS "p:s"
p.compare("indices", "indexes")  # RETURNS "p:p"
p.compare("indexes", "indices")  # RETURNS "p:p"
p.compare("indices", "indices")  # RETURNS "eq"

As indicated by the comments in the previous example, the actual value returned by the various compare methods encodes which of the three equality rules succeeded: "eq" is returned if the strings were identical, "s:p" if the strings were singular and plural respectively, "p:s" for plural and singular, and "p:p" for two distinct plurals. Inequality is indicated by returning an empty string.

It should be noted that two distinct singular words which happen to take the same plural form are not considered equal, nor are cases where one (singular) word's plural is the other (plural) word's singular. Hence all of the following return false:

p.compare("base", "basis")  # ALTHOUGH BOTH -> "bases"
p.compare("syrinx", "syringe")  # ALTHOUGH BOTH -> "syringes"
p.compare("she", "he")  # ALTHOUGH BOTH -> "they"

p.compare("opus", "operas")  # ALTHOUGH "opus" -> "opera" -> "operas"
p.compare("taxi", "taxes")  # ALTHOUGH "taxi" -> "taxis" -> "taxes"

Note too that, although the comparison is "number-insensitive" it is not case-insensitive (that is, plural("time","Times") returns false. To obtain both number and case insensitivity, use the lower() method on both strings (that is, plural("time".lower(), "Times".lower()) returns true).

Related Functionality

Shout out to these libraries that provide related functionality:

WordSet parses identifiers like variable names into sets of words suitable for re-assembling in another form.
word2number converts words to a number.

For Enterprise

Available as part of the Tidelift Subscription.

This project and the maintainers of thousands of other packages are working with Tidelift to deliver one enterprise subscription that covers all of the open source you use.

Learn more.

Security Contact

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.

Comments

Add Travis CI and Coveralls
I've added a configuration file so Travis CI will build each commit and pull request. It runs against each Python version supported by Travis CI: 2.6, 2.7, 3.2, 3.3, 3.4, PyPy and PyPy3, and all of them pass with no changes needed. This will help maintain the codebase by automatically running the tests against all Python versions for each PR -- it's very handy for picking up those Py2/Py3 inconsistencies..

It also runs the tests with coverage enabled, and sends the coverage reports to Coveralls, so you can see how coverage changes, or if a PR doesn't include tests to cover its new code. Right now, inflect.py has an amazing 98% coverage!

You get reports like this:

https://travis-ci.org/hugovk/inflect.py

https://coveralls.io/r/hugovk/inflect.py

I've added badges to the README to show off how good stats this project has. Both Travis CI and Coveralls are free for open source project, you just need to enable them for the repo. Please can you do so at:

https://travis-ci.org/profile

https://coveralls.io/repos/new

Thanks!
opened by hugovk 18
Allow special case dictionaries to match on lowercase final word
While inflecting some ingredient-related content, I'd noticed that although the word olives is currently handled correctly by singular_noun, black olives is not:

>>> import inflect >>> inflect.__version__ '4.0.0' >>> e = inflect.engine() >>> e.singular_noun('olives') 'olive' >>> e.singular_noun('black olives') 'black olife'

In the inflect code, the handling of single-word olives is performed by a special case which overrides the default logic for items ending ...ves.

This is one of a number of special cases gated by boolean conditions of the form if lowerword in <map>:, which fail if lowerword doesn't exactly match with a key in the <map>.

Switching these conditions to use lowerwordlast instead of lowerword results in additional matches for multi-word phrases without negatively impacting any existing test cases.

It's arguably not 'perfect' since multi-word phrases may have challenging plurality rules (like the situations handled by the compound logic, but hopefully it's a reasonably safe improvement.

This also resolves https://github.com/jazzband/inflect/issues/30 which falls into a similar special case relating to ...ies tokens.
opened by jayaddison 14

Implement Jazzband guidelines for project inflect.py

This issue tracks the implementation of the Jazzband guidelines for the project inflect.py

It was initiated by @jaraco who was automatically assigned in addition to the Jazzband roadies.

See the TODO list below for the generally required tasks, but feel free to update it in case the project requires it.

Feel free to ping a Jazzband roadie if you have any question.

TODOs

[x] Fix all links in the docs (and README file etc) from old to new repo
[x] Add the Jazzband badge to the README file
[x] Add the Jazzband contributing guideline to the CONTRIBUTING.md file
[x] Check if continuous testing works (e.g. Travis-CI, CircleCI, AppVeyor, etc)
[x] Check if test coverage services work (e.g. Coveralls, Codecov, etc)
[x] Add jazzband account to PyPI project as maintainer role (URL: https://pypi.python.org/pypi?:action=role_form&package_name=<PROJECTNAME>)
[x] Add jazzband-bot as maintainer to the Read the Docs project (URL: https://readthedocs.org/dashboard/<PROJECTNAME>/users/)
[x] Fix project URL in GitHub project description
[x] Review project if other services are used and port them to Jazzband

Project details


Description	Correctly generate plurals, ordinals, indefinite articles; convert numbers to words
Homepage	http://pypi.python.org/pypi/inflect
Stargazers	0
Open issues	0
Forks	0
Default branch	master
Is a fork	True
Has Wiki	True
Has Pages	False

opened by jazzband-bot 13

Transfer project to jaraco

Given the ~~lack of support for Azure pipelines (#101)~~, the ~~less streamlined process for releases (#80)~~, and the exclusion of sponsorship for these projects (#102), the inclusion of this project in Jazzband is now more of a hindrance than a benefit.

@jezdez Would you please transfer the project back to /jaraco?

opened by jaraco 11

Plural issues?

Some are wrong, I think:

>>> p.plural('corpus')  # corpora
'corpuses'
>>> p.plural('means')  # means
'mean'
>>> p.plural('vita')   # vitae
'vitas'
>>> p.plural('backhoe')
'backhoes'
>>> p.plural('hoe')  # hoes
'ho'
>>> p.plural('ho')   # hos
'h'

These are technically correct, but unusual (and not a mistake like the octupus-plural):

>>> p.plural('radius')  # radii, radiuses
'radiuses'
>>> p.plural('curriculum')  # curricula, curriculums
'curriculums'
>>> p.plural('medium')  # mediums, media
'mediums'
>>> p.plural('appendix')  # appendixes, appendices
'appendixes'
# also index

Not sure how it should work for uncountables:

>>> p.plural('cattle')  # cattle        
'cattles'

help wanted

opened by mbr 11

query of death

On version 2.1.0 This string: "lens with a lens ()." yields an exception.

> /lib/python3.6/site-packages/inflect.py in plural(self, text, count)
>    2239             self._pl_special_adjective(word, count)
>    2240             or self._pl_special_verb(word, count)
> -> 2241             or self._plnoun(word, count),
>    2242         )
>    2243         return "{}{}{}".format(pre, plural, post)
> 
> /lib/python3.6/site-packages/inflect.py in postprocess(self, orig, inflected)
>    2209                 continue
>    2210             if word.capitalize() == word:
> -> 2211                 result[index] = result[index].capitalize()
>    2212             if word == word.upper():
>    2213                 result[index] = result[index].upper()

IndexError: list index out of range

stale invalid

opened by uriva 9

Cannot install with old setuptools

Windows 7 Python 2.7.8

C:\stufftodelete>pip install -e git+https://github.com/pwdyson/inflect.py#egg=inflect
Obtaining inflect from git+https://github.com/pwdyson/inflect.py#egg=inflect
  Cloning https://github.com/pwdyson/inflect.py to c:\stufftodelete\src\inflect
  Running setup.py (path:C:\stufftodelete\src\inflect\setup.py) egg_info for package inflect
    usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
       or: -c --help [cmd1 cmd2 ...]
       or: -c --help-commands
       or: -c cmd --help

    error: invalid command 'egg_info'
    Complete output from command python setup.py egg_info:
    usage: -c [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]

   or: -c --help [cmd1 cmd2 ...]

   or: -c --help-commands

   or: -c cmd --help



error: invalid command 'egg_info'

----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in C:\stufftodelete\src\inflect

But then:

C:\stufftodelete>pip install --upgrade setuptools
Downloading/unpacking setuptools from https://pypi.python.org/packages/3.4/s/setuptools/setuptools-7.0-py2.py3-none-any.
whl#md5=918e7e5ea108507e1ffbbdfccc3496b1
Installing collected packages: setuptools
  Found existing installation: setuptools 0.6c11
    Uninstalling setuptools:
      Successfully uninstalled setuptools
Successfully installed setuptools
Cleaning up...

And so:

C:\stufftodelete>pip install -e git+https://github.com/pwdyson/inflect.py#egg=inflect
Obtaining inflect from git+https://github.com/pwdyson/inflect.py#egg=inflect
  Updating c:\stufftodelete\src\inflect clone
  Running setup.py (path:C:\stufftodelete\src\inflect\setup.py) egg_info for package inflect

  Installing extra requirements: 'egg'
Installing collected packages: inflect
  Running setup.py develop for inflect

    Creating c:\python27\lib\site-packages\inflect.egg-link (link to .)
    Adding inflect 0.2.5pre1 to easy-install.pth file

    Installed c:\stufftodelete\src\inflect
Successfully installed inflect
Cleaning up...

Can the dependency to certain setuptools version be included in the setup, or must it already be in place? If not, can it be mentioned in the README installation instructions?

opened by hugovk 9

$Failed to import: `Field` default cannot be set in `Annotated` for 'num_Annotated[str, FieldInfo(min_length=1, extra={})]'$

Failed to import: `Field` default cannot be set in `Annotated` for 'num_Annotated[str, FieldInfo(min_length=1, extra={})]'

Error message:

Traceback (most recent call last):
  File "/data/cuih7/[REDACTED]", line 7, in <module>
    import inflect
  File "/data/cuih7/miniconda3/envs/nlp20220817/lib/python3.10/site-packages/inflect/__init__.py", line 2046, in <module>
    class engine:
  File "/data/cuih7/miniconda3/envs/nlp20220817/lib/python3.10/site-packages/inflect/__init__.py", line 3781, in engine
    def number_to_words(  # noqa: C901
  File "pydantic/decorator.py", line 36, in pydantic.decorator.validate_arguments.validate
    import sys
  File "pydantic/decorator.py", line 126, in pydantic.decorator.ValidatedFunction.__init__
    try:
  File "pydantic/decorator.py", line 259, in pydantic.decorator.ValidatedFunction.create_model
    return fun
  File "pydantic/main.py", line 972, in pydantic.main.create_model
  File "pydantic/main.py", line 204, in pydantic.main.ModelMetaclass.__new__
  File "pydantic/fields.py", line 488, in pydantic.fields.ModelField.infer
  File "pydantic/fields.py", line 419, in pydantic.fields.ModelField.__init__
  File "pydantic/fields.py", line 534, in pydantic.fields.ModelField.prepare
  File "pydantic/fields.py", line 633, in pydantic.fields.ModelField._type_analysis
  File "pydantic/fields.py", line 776, in pydantic.fields.ModelField._create_sub_type
  File "pydantic/fields.py", line 451, in pydantic.fields.ModelField._get_field_info
ValueError: `Field` default cannot be set in `Annotated` for 'num_Annotated[str, FieldInfo(min_length=1, extra={})]'

I'm using the library from conda-forge. Downgrading from 6.0.0-pyhd8ed1ab_0 to 5.6.2-pyhd8ed1ab_0 fixes the issue.

bug

opened by cuihaoleo 7

dash and lowered_split fix for inflect.py _plnoun()

In function _plnoun(), the variable lowered_split is split on dashes, but when reusing this variable, code required whitespace split lowerd words, not dash split.

opened by picobyte 7
Import unicode_literals from __futures__ for unicode handling in 2.7

Unicode strings would cause errors in python 2.7 when using inflect. With this one additional line, no error is thrown.

Fixes #52

Includes test file, testing for words containing unicode characters.

opened by nielstron 7
Drop unsupported Python 3.3

EOL since 2017-09-29: https://en.wikipedia.org/wiki/CPython#Version_history

See also https://github.com/pwdyson/inflect.py/pull/44 which fixes the flake8 error for Python 3.5 on the CI.

opened by hugovk 7

"Entity" gets pluralized to "Entitys"

docker run -it python:3.11 bash
root@4966b54eda43:/# pip install inflect==6.0.2
Collecting inflect==6.0.2
  Downloading inflect-6.0.2-py3-none-any.whl (34 kB)
Collecting pydantic>=1.9.1
  Downloading pydantic-1.10.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 9.7 MB/s eta 0:00:00
Collecting typing-extensions>=4.1.0
  Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Installing collected packages: typing-extensions, pydantic, inflect
Successfully installed inflect-6.0.2 pydantic-1.10.2 typing-extensions-4.4.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
root@4966b54eda43:/# python
Python 3.11.1 (main, Dec 21 2022, 18:32:57) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import inflect
>>> word="Entity"
>>> print(f'The plural of "{word}" is "{inflect.engine().plural(word)}".')
The plural of "Entity" is "Entitys".
>>> print(f'The plural of "{word}" is "{inflect.engine().plural_noun(word)}".')
The plural of "Entity" is "Entitys".

opened by fezzik1620 2

'defnoun' doesn't seem to work when the word is a part of a string passed

Originally reported by @aMiss-aWry in #161:

I can fix the specific words with defnoun('jeans', 'jeans'), which corrects the result for an isolated string 'jeans' -> 'jeans'.

However, if I try to run plural_noun on 'blue jeans' instead of 'jeans' it then breaks again, returning 'blue jeanss'. I understand this is because 'blue jeans' appears to be a different string from 'jeans', but perhaps it should check the rules for the part of the string it is altering?

feature help wanted

opened by jaraco 1
Words ending in 's' should never be pluralized by adding another 's'
stockings -> stockingss jeans -> jeanss sandals -> sandalss

How do I avoid this? I can fix the specific words with defnoun('jeans', 'jeans'), which corrects the result for an isolated string 'jeans' -> 'jeans'.

However, if I try to run plural_noun on 'blue jeans' instead of 'jeans' it then breaks again, returning 'blue jeanss'. I understand this is because 'blue jeans' appears to be a different string from 'jeans', but perhaps it should check the rules for the part of the string it is altering?

I've gotten around this so far by this incredibly hacky 'solution', so any improvement would obviously be immensely appreciated:

plural = _INFLECT.plural_noun(key, count) if plural.endswith('ss'): plural = plural[0:-1]
feature help wanted
opened by aMiss-aWry 4
singular_noun on word ending in "s" removes s.
import inflect as Inflect inflect = Inflect.engine() inflect.singular_noun("car") # get False, as expected inflect.singular_noun("mass") # get "mas", expected False

Is this behavior a bug? singular_noun() is supposed to return False when the word is already singular, and I can find no reference to suggest "mass" is the plural of "mas", which is itself the plural of "ma".
help wanted
opened by eykamp 3

Releases(v6.0.2)

v6.0.2(Oct 20, 2022)

null
Source code(tar.gz)
Source code(zip)
v6.0.1(Oct 20, 2022)

null
Source code(tar.gz)
Source code(zip)
v6.0.0(Jul 30, 2022)

null
Source code(tar.gz)
Source code(zip)
v5.6.2(Jul 15, 2022)

null
Source code(tar.gz)
Source code(zip)
v5.6.1(Jul 9, 2022)

null
Source code(tar.gz)
Source code(zip)
v5.6.0(May 1, 2022)

null
Source code(tar.gz)
Source code(zip)
v5.5.2(Apr 6, 2022)

null
Source code(tar.gz)
Source code(zip)
v5.5.1(Apr 6, 2022)

null
Source code(tar.gz)
Source code(zip)
v5.4.0(Feb 1, 2022)

null
Source code(tar.gz)
Source code(zip)
v5.3.0(Mar 3, 2021)

Source code(tar.gz)
Source code(zip)
v5.2.0(Feb 24, 2021)

null
Source code(tar.gz)
Source code(zip)
v5.0.3(Feb 23, 2021)

null
Source code(tar.gz)
Source code(zip)
v5.0.2(Nov 15, 2020)

null
Source code(tar.gz)
Source code(zip)
v5.0.1(Nov 15, 2020)

null
Source code(tar.gz)
Source code(zip)
v5.0.0(Nov 15, 2020)

null
Source code(tar.gz)
Source code(zip)

Owner

Jason R. Coombs

GitHub https://pypi.org/project/inflect

Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and convert them into audio. Here I have used Google-text-to-speech library popularly known as gTTS library to convert text file to .mp3 file. Hope you like my project!

Text to speech (using Python) Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and co

19 Jun 30, 2022

Random-Word-Generator - Generates meaningful words from dictionary with given no. of letters and words.

Random Word Generator Generates meaningful words from dictionary with given no. of letters and words. This might be useful for generating short links

1 Jan 1, 2022

The ability of computer software to identify words and phrases in spoken language and convert them to human-readable text

speech-recognition-py Speech recognition is the ability of computer software to identify words and phrases in spoken language and convert them to huma

1 Apr 3, 2022

topic modeling on unstructured data in Space news articles retrieved from the Guardian (UK) newspaper using API

NLP Space News Topic Modeling Photos by nasa.gov (1, 2, 3, 4, 5) and extremetech.com Table of Contents Project Idea Data acquisition Primary data sour

1 Jan 3, 2022

Twitter bot that uses NLP models to summarize news articles referenced in a user's twitter timeline

Twitter-News-Summarizer Twitter bot that uses NLP models to summarize news articles referenced in a user's twitter timeline 1.) Extracts all tweets fr

1 Jan 27, 2022

Phomber is infomation grathering tool that reverse search phone numbers and get their details, written in python3.

A Infomation Grathering tool that reverse search phone numbers and get their details ! What is phomber? Phomber is one of the best tools available fo

121 Dec 27, 2022

Get list of common stop words in various languages in Python

Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words

142 Dec 21, 2022

Get list of common stop words in various languages in Python

Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words

121 Jan 6, 2021

Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Neural G2P to portuguese language Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written for

11 Nov 16, 2022

Learn meanings behind words is a key element in NLP. This project concentrates on the disambiguation of preposition senses. Therefore, we train a bert-transformer model and surpass the state-of-the-art.

New State-of-the-Art in Preposition Sense Disambiguation Supervisor: Prof. Dr. Alexander Mehler Alexander Henlein Institutions: Goethe University TTLa

4 Apr 6, 2022

DomainWordsDict, Chinese words dict that contains more than 68 domains, which can be used as text classification、knowledge enhance task

DomainWordsDict, Chinese words dict that contains more than 68 domains, which can be used as text classification、knowledge enhance task。涵盖68个领域、共计916万词的专业词典知识库，可用于文本分类、知识增强、领域词汇库扩充等自然语言处理应用。

357 Dec 24, 2022

A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python.

Multilingual Latent Dirichlet Allocation (LDA) Pipeline This project is for text clustering using the Latent Dirichlet Allocation (LDA) algorithm. It

74 Oct 7, 2022

Count the frequency of letters or words in a text file and show a graph.

Word Counter By EBUS Coding Club Count the frequency of letters or words in a text file and show a graph. Requirements Python 3.9 or higher matplotlib

0 Apr 9, 2022

This program do translate english words to portuguese

Python-Dictionary This program is used to translate english words to portuguese. Web-Scraping This program use BeautifulSoap to make web scraping, so

1 Oct 10, 2022

Python powered crossword generator with database with 20k+ polish words

crossword_generator Generate simple crossword puzzle from words and definitions fetched from krzyżowki.edu.pl endpoints -/< string:word > - returns js

0 Jan 4, 2022

This Project is based on NLTK It generates a RANDOM WORD from a predefined list of words, From that random word it read out the word, its meaning with parts of speech , its antonyms, its synonyms

This Project is based on NLTK(Natural Language Toolkit) It generates a RANDOM WORD from a predefined list of words, From that random word it read out the word, its meaning with parts of speech , its antonyms, its synonyms

2 Nov 17, 2021

Russian words synonyms and antonyms

ru_synonyms Russian words synonyms and antonyms. Install pip install git+https://github.com/ahmados/rusynonyms.git Usage from ru_synonyms import Anto

7 Dec 14, 2022

The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

Unsupervised technique to Glossary and Definition Extraction Code Files GPT2-DefinitionModel.ipynb - GPT-2 model for definition generation. Data_Gener

28 May 25, 2021

🏆 • 5050 most frequent words in 109 languages

?? Most Common Words Multilingual 5000 most frequent words in 109 languages. Uses wordfrequency.info as a source. ?? License source code license data

14 Nov 24, 2022

Correctly generate plurals, ordinals, indefinite articles; convert numbers to words

Related tags

Overview

NAME

SYNOPSIS

DESCRIPTION

FORMING PLURALS AND SINGULARS

Inflecting Plurals and Singulars

Numbered plurals

Reducing the number of counts required

Number-insensitive equality

Related Functionality

For Enterprise

Security Contact

Comments

TODOs

Project details

Releases(v6.0.2)

v6.0.2(Oct 20, 2022)

v6.0.1(Oct 20, 2022)

v6.0.0(Jul 30, 2022)

v5.6.2(Jul 15, 2022)

v5.6.1(Jul 9, 2022)

v5.6.0(May 1, 2022)

v5.5.2(Apr 6, 2022)

v5.5.1(Apr 6, 2022)

v5.4.0(Feb 1, 2022)

v5.3.0(Mar 3, 2021)

v5.2.0(Feb 24, 2021)

v5.0.3(Feb 23, 2021)

v5.0.2(Nov 15, 2020)

v5.0.1(Nov 15, 2020)

v5.0.0(Nov 15, 2020)

Owner

Jason R. Coombs

Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and convert them into audio. Here I have used Google-text-to-speech library popularly known as gTTS library to convert text file to .mp3 file. Hope you like my project!

Random-Word-Generator - Generates meaningful words from dictionary with given no. of letters and words.

The ability of computer software to identify words and phrases in spoken language and convert them to human-readable text

topic modeling on unstructured data in Space news articles retrieved from the Guardian (UK) newspaper using API

Twitter bot that uses NLP models to summarize news articles referenced in a user's twitter timeline

Phomber is infomation grathering tool that reverse search phone numbers and get their details, written in python3.

Get list of common stop words in various languages in Python

Get list of common stop words in various languages in Python

Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Learn meanings behind words is a key element in NLP. This project concentrates on the disambiguation of preposition senses. Therefore, we train a bert-transformer model and surpass the state-of-the-art.

DomainWordsDict, Chinese words dict that contains more than 68 domains, which can be used as text classification、knowledge enhance task

A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python.

Count the frequency of letters or words in a text file and show a graph.

This program do translate english words to portuguese

Python powered crossword generator with database with 20k+ polish words

This Project is based on NLTK It generates a RANDOM WORD from a predefined list of words, From that random word it read out the word, its meaning with parts of speech , its antonyms, its synonyms

Russian words synonyms and antonyms

The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

🏆 • 5050 most frequent words in 109 languages