Just Want a List of Words?
If you only care about the list of words in this repo, that's great; use them and have an awesome day!
Want More?
For the minuscule minority of people who want more, this issue is for you!
Brief History / Context
A few years ago I needed a list of English words for a work project.
Went searching and didn't find a ready-made list of English words ...
But found this StackOverflow Question and Answer:
https://stackoverflow.com/questions/2213607/how-to-get-english-language-word-database
Extracted the words from the Excel file that was on InfoChimps (now 404) and dumped them in a .txt file.
Put it on GitHub, linked to it in a comment on SO and didn't give it any more thought.
Sadly, the work project that used the words was closed source, for a company that got acquired, and the App was shut down. The folly of working on closed source things is that you often have nothing to show for years of your life!
Meanwhile many thousands of people have downloaded the word list and the repo has 8.3k ⭐
The mini [Open Source] demo project I created, nelsonic/autocomplete (wordsy.herokuapp.com), will soon be taken offline by Heroku's bean-counters.
I outlined what I wanted to do in autocomplete#tasks but it's very incomplete ...
so this issue will give a muuuuch better roadmap of what we're doing.
What challenge are we solving?
The original purpose of this repo will 100% be maintained.
What we are doing is enhancing the repo with a showcase App that allows people to look up words and suggest changes to the list.
With that in mind, this is the plan:
- High quality list of English words in an easy-to-extract file/format, e.g. .txt, .json and .zip
- Instructions for how to use the words in various programming languages; code examples.
  - [ ] JavaScript/TypeScript
  - [ ] Python
  - [ ] Elixir
  - [ ] Dart
  - [ ] Rust
  - [ ] Invite contributions from the community for code examples from more programming languages [but NOT frameworks]
    Make it clear that we really don't want a React sample because we don't want to encourage anyone to use it.
- Clarity on the process for updating the word list: adding, correcting and removing [invalid] words.
- Automate the creation of the .zip file so that we don't have people attempting to submit Pull Requests with Zip Files.
  We're never going to accept a PR with a zip file. It's an easy attack vector for a malicious auto-executable.
  Read more: https://github.com/snyk/zip-slip-vulnerability
  It's not that we don't "trust" people ... but we know that not everyone on GitHub has good intentions.
  Crime pays, otherwise there wouldn't be any crims ... And cyber-crime pays big BTCs! So let's just avoid it.
- Allow anyone to look up words with auto-completion and to make suggestions via a Web App/UI. That will invite way more people, including non-technical people who don't know how to use GitHub, to help maintain and improve the list of words.
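
To make the plan a bit more concrete, here is a minimal Python sketch of three of the items above: reading the words from a text file, prefix look-up for auto-completion, and generating the .zip from the text file instead of accepting one in a PR. It assumes a file with one word per line; the file names and the function names `load_words`, `complete` and `build_zip` are made up for illustration and are not existing code in this repo.

```python
import bisect
import zipfile
from pathlib import Path


def load_words(path):
    """Read one word per line, drop blanks and duplicates, return a sorted list."""
    text = Path(path).read_text(encoding="utf-8")
    return sorted({line.strip() for line in text.splitlines() if line.strip()})


def complete(words, prefix, limit=10):
    """Return up to `limit` words from the sorted list that start with `prefix`."""
    # In a sorted list, all words sharing a prefix sit next to each other,
    # so a binary search finds where that run begins.
    start = bisect.bisect_left(words, prefix)
    matches = []
    for word in words[start:start + limit]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches


def build_zip(txt_path, zip_path):
    """Compress the word list into a zip archive containing that single file."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps only the file name, so no directory paths end up in the archive.
        archive.write(txt_path, arcname=Path(txt_path).name)
```

Typical use would be something like `words = load_words("words.txt")`, then `complete(words, "auto")` for suggestions and `build_zip("words.txt", "words.zip")` in an automated job, with "words.txt" standing in for whatever the list file is actually called.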
Todo
- [ ] Review the existing/open PRs and try to merge them: https://github.com/dwyl/english-words/issues/155
- [ ] Create Phoenix App ... Note: waiting for Phoenix v1.7 to do this to minimise time wasted with updates ...
- [ ] Re-create basic features from nelsonic/autocomplete:
  - [ ] Use PostgreSQL for simplicity.
  - [ ] If we notice too much query latency, we can switch to SQLite or ETS for speed.
  - [ ] Load the current English Words List into the DB
- [ ] Determine/decide what other metadata we want to store for each word.
- [ ] Discuss any other features we want to have. (please comment!)
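
The "load the words into the DB" and look-up steps could look roughly like the sketch below. To be clear about the assumptions: the planned app is Phoenix with PostgreSQL, whereas this is plain Python with the standard-library `sqlite3` module, chosen only because it runs without any setup. The table name `words` and the functions `create_db` and `lookup` are hypothetical.

```python
import sqlite3


def create_db(words, db_path=":memory:"):
    """Create a one-column table of words and load the given words into it."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY)")
    # INSERT OR IGNORE skips duplicates instead of raising on the primary key.
    conn.executemany(
        "INSERT OR IGNORE INTO words (word) VALUES (?)",
        ((word,) for word in words),
    )
    conn.commit()
    return conn


def lookup(conn, prefix, limit=10):
    """Return up to `limit` words starting with `prefix`, in sorted order."""
    # A range query (prefix <= word < prefix + highest code point) can use the
    # primary-key index and is case-sensitive, unlike SQLite's default LIKE.
    upper = prefix + "\U0010FFFF"
    rows = conn.execute(
        "SELECT word FROM words WHERE word >= ? AND word < ? ORDER BY word LIMIT ?",
        (prefix, upper, limit),
    )
    return [row[0] for row in rows]
```

Any extra metadata we decide to store per word would become additional columns on the same table.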