🏆 • 5050 most frequent words in 109 languages

Overview

🏆 Most Common Words Multilingual

5000 most frequent words in 109 languages. Uses wordfrequency.info as a source.

🔗 License

  • source code license
  • data is released under different license(s), as they're taken from online sources. Feel free to contribute with your own data!
🌐 Language 📁 File
Afrikaans (af) .txt
Albanian (sq) .txt
Amharic (am) .txt
Arabic (ar) .txt
Armenian (hy) .txt
Azerbaijani (az) .txt
Basque (eu) .txt
Belarusian (be) .txt
Bengali (bn) .txt
Bosnian (bs) .txt
Bulgarian (bg) .txt
Catalan (ca) .txt
Cebuano (ceb) .txt
Chichewa (ny) .txt
Chinese (simplified) (zh-CN) .txt
Chinese (traditional) (zh-TW) .txt
Corsican (co) .txt
Croatian (hr) .txt
Czech (cs) .txt
Danish (da) .txt
Dutch (nl) .txt
English (en) .txt
Esperanto (eo) .txt
Estonian (et) .txt
Filipino (tl) .txt
Finnish (fi) .txt
French (fr) .txt
Frisian (fy) .txt
Galician (gl) .txt
Georgian (ka) .txt
German (de) .txt
Greek (el) .txt
Gujarati (gu) .txt
Haitian creole (ht) .txt
Hausa (ha) .txt
Hawaiian (haw) .txt
Hebrew (iw) .txt
Hindi (hi) .txt
Hmong (hmn) .txt
Hungarian (hu) .txt
Icelandic (is) .txt
Igbo (ig) .txt
Indonesian (id) .txt
Irish (ga) .txt
Italian (it) .txt
Japanese (ja) .txt
Javanese (jw) .txt
Kannada (kn) .txt
Kazakh (kk) .txt
Khmer (km) .txt
Kinyarwanda (rw) .txt
Korean (ko) .txt
Kurdish (ku) .txt
Kyrgyz (ky) .txt
Lao (lo) .txt
Latin (la) .txt
Latvian (lv) .txt
Lithuanian (lt) .txt
Luxembourgish (lb) .txt
Macedonian (mk) .txt
Malagasy (mg) .txt
Malay (ms) .txt
Malayalam (ml) .txt
Maltese (mt) .txt
Maori (mi) .txt
Marathi (mr) .txt
Mongolian (mn) .txt
Myanmar (my) .txt
Nepali (ne) .txt
Norwegian (no) .txt
Odia (or) .txt
Pashto (ps) .txt
Persian (fa) .txt
Polish (pl) .txt
Portuguese (pt) .txt
Punjabi (pa) .txt
Romanian (ro) .txt
Russian (ru) .txt
Samoan (sm) .txt
Scots gaelic (gd) .txt
Serbian (sr) .txt
Sesotho (st) .txt
Shona (sn) .txt
Sindhi (sd) .txt
Sinhala (si) .txt
Slovak (sk) .txt
Slovenian (sl) .txt
Somali (so) .txt
Spanish (es) .txt
Sundanese (su) .txt
Swahili (sw) .txt
Swedish (sv) .txt
Tajik (tg) .txt
Tamil (ta) .txt
Tatar (tt) .txt
Telugu (te) .txt
Thai (th) .txt
Turkish (tr) .txt
Turkmen (tk) .txt
Ukrainian (uk) .txt
Urdu (ur) .txt
Uyghur (ug) .txt
Uzbek (uz) .txt
Vietnamese (vi) .txt
Welsh (cy) .txt
Xhosa (xh) .txt
Yiddish (yi) .txt
Yoruba (yo) .txt
Zulu (zu) .txt
You might also like...
Count the frequency of letters or words in a text file and show a graph.

Word Counter By EBUS Coding Club Count the frequency of letters or words in a text file and show a graph. Requirements Python 3.9 or higher matplotlib

This program do translate english words to portuguese

Python-Dictionary This program is used to translate english words to portuguese. Web-Scraping This program use BeautifulSoap to make web scraping, so

Python powered crossword generator with database with 20k+ polish words

crossword_generator Generate simple crossword puzzle from words and definitions fetched from krzyżowki.edu.pl endpoints -/ string:word - returns js

This Project is based on NLTK It generates a RANDOM WORD from a predefined list of words, From that random word it read out the word, its meaning with parts of speech , its antonyms, its synonyms

This Project is based on NLTK(Natural Language Toolkit) It generates a RANDOM WORD from a predefined list of words, From that random word it read out the word, its meaning with parts of speech , its antonyms, its synonyms

Russian words synonyms and antonyms

ru_synonyms Russian words synonyms and antonyms. Install pip install git+https://github.com/ahmados/rusynonyms.git Usage from ru_synonyms import Anto

The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques
The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

Unsupervised technique to Glossary and Definition Extraction Code Files GPT2-DefinitionModel.ipynb - GPT-2 model for definition generation. Data_Gener

Turkish Stop Words Türkçe Dolgu Sözcükleri

trstop Turkish Stop Words Türkçe Dolgu Sözcükleri In this repository I put Turkish stop words that is contained in the first 10 thousand words with th

The ability of computer software to identify words and phrases in spoken language and convert them to human-readable text

speech-recognition-py Speech recognition is the ability of computer software to identify words and phrases in spoken language and convert them to huma

Words_And_Phrases - Just a repo for useful words and phrases that might come handy in some scenarios. Feel free to add yours

Words_And_Phrases Just a repo for useful words and phrases that might come handy in some scenarios. Feel free to add yours Abbreviations Abbreviation

Comments
  • build(deps): bump certifi from 2021.10.8 to 2022.12.7

    build(deps): bump certifi from 2021.10.8 to 2022.12.7

    Bumps certifi from 2021.10.8 to 2022.12.7.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • build(deps): bump numpy from 1.21.4 to 1.22.0

    build(deps): bump numpy from 1.21.4 to 1.22.0

    Bumps numpy from 1.21.4 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Owner
🃏 effectively learn new languages by using cool methods, such as flashcards and most common words!
null
Get list of common stop words in various languages in Python

Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words

Alireza Savand 142 Dec 21, 2022
Get list of common stop words in various languages in Python

Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words

Alireza Savand 121 Jan 6, 2021
This script just scrapes the most recent Nepali news from Kathmandu Post and notifies the user about current events at regular intervals.It sends out the most recent news at random!

Nepali-news-notifier This script just scrapes the most recent Nepali news from Kathmandu Post and notifies the user about current events at regular in

Sachit Yadav 1 Feb 11, 2022
Correctly generate plurals, ordinals, indefinite articles; convert numbers to words

NAME inflect.py - Correctly generate plurals, singular nouns, ordinals, indefinite articles; convert numbers to words. SYNOPSIS import inflect p = in

Jason R. Coombs 762 Dec 29, 2022
Correctly generate plurals, ordinals, indefinite articles; convert numbers to words

NAME inflect.py - Correctly generate plurals, singular nouns, ordinals, indefinite articles; convert numbers to words. SYNOPSIS import inflect p = in

Jason R. Coombs 478 Feb 16, 2021
Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Neural G2P to portuguese language Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written for

fluz 11 Nov 16, 2022
Learn meanings behind words is a key element in NLP. This project concentrates on the disambiguation of preposition senses. Therefore, we train a bert-transformer model and surpass the state-of-the-art.

New State-of-the-Art in Preposition Sense Disambiguation Supervisor: Prof. Dr. Alexander Mehler Alexander Henlein Institutions: Goethe University TTLa

Dirk Neuhäuser 4 Apr 6, 2022
DomainWordsDict, Chinese words dict that contains more than 68 domains, which can be used as text classification、knowledge enhance task

DomainWordsDict, Chinese words dict that contains more than 68 domains, which can be used as text classification、knowledge enhance task。涵盖68个领域、共计916万词的专业词典知识库,可用于文本分类、知识增强、领域词汇库扩充等自然语言处理应用。

liuhuanyong 357 Dec 24, 2022
A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python.

Multilingual Latent Dirichlet Allocation (LDA) Pipeline This project is for text clustering using the Latent Dirichlet Allocation (LDA) algorithm. It

Artifici Online Services inc. 74 Oct 7, 2022