Passphrase-wordlist - Shameless clone of passphrase wordlist

This repository is NOT official -- the original repository is located on GitLab at https://gitlab.com/initstring/passphrase-wordlist

This repository is only a tribute

Overview

People think they are getting smarter by using passphrases. Let's prove them wrong!

This project includes a massive wordlist of phrases (over 20 million) and two hashcat rule files for GPU-based cracking. The rules will create over 1,000 permutations of each phrase.

To use this project, you need:

The wordlist hosted here (right-click, save-as).
Both hashcat rules here.

WORDLIST LAST UPDATED: 2021-10-04

Usage

Generally, you will use this project with hashcat's -a 0 mode, which takes a wordlist and allows rule files. It is important to use the rule files in the correct order, as rule #1 mostly handles capital letters and spaces, and rule #2 deals with permutations.

If you use the -O option, check what the maximum password length is set to, since it may be too short for long passphrases. Here is an example for NTLMv2 hashes:

hashcat -a 0 -m 5600 hashes.txt passphrases.txt -r passphrase-rule1.rule -r passphrase-rule2.rule -O -w 3
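
When hashcat is launched from a script, the rule order described above can be kept explicit by building the argument list in code. The following is a minimal Python sketch and not part of the original project: the helper name build_hashcat_command and its parameters are hypothetical, and it only assembles the same flags that appear in the example command.

```python
from typing import List, Sequence


def build_hashcat_command(
    hash_file: str,
    wordlist: str,
    rules: Sequence[str],
    hash_mode: int = 5600,   # 5600 is the NTLMv2 mode used in the example
    optimized: bool = True,  # adds -O, which may cap the password length
    workload: int = 3,
) -> List[str]:
    """Assemble a hashcat argument list (hypothetical helper)."""
    cmd = ["hashcat", "-a", "0", "-m", str(hash_mode), hash_file, wordlist]
    # Rules are applied in the order given, so rule #1 must come first.
    for rule in rules:
        cmd += ["-r", rule]
    if optimized:
        cmd.append("-O")
    cmd += ["-w", str(workload)]
    return cmd


if __name__ == "__main__":
    command = build_hashcat_command(
        "hashes.txt",
        "passphrases.txt",
        ["passphrase-rule1.rule", "passphrase-rule2.rule"],
    )
    print(" ".join(command))
```

The resulting list can be passed to subprocess.run if you want the script to start hashcat itself; passing optimized=False drops -O when the length limit is a concern.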

Sources Used

Some sources are pulled from a static dataset, like a Kaggle upload. Others I generate myself using various scripts and APIs. I might one day automate that via CI, but for now you can see how I update the dynamic sources here.

source file name	source type	description
wiktionary-2021-09-29.txt	dynamic	Article titles scraped from Wiktionary's index dump here.
wikipedia-2021-09-29.txt	dynamic	Article titles scraped from the Wikipedia `pages-articles-multistream-index` dump generated 29-Sept-2021 here.
urban-dictionary-2021-09-29.txt	dynamic	Urban Dictionary dataset pulled using this script.
know-your-meme-2021-09-29.txt	dynamic	Meme titles from Know Your Meme scraped using my tool here.
imdb-titles-2021-09-29.txt	dynamic	IMDB dataset using the "primaryTitle" column from the `title.basics.tsv.gz` file available here.
global-poi-2021-09-29.txt	dynamic	Global POI dataset using the 'allCountries' file from 29-Sept-2021.
billboard-titles-2021-10-04.txt	dynamic	Album and track names using Ultimate Music Database, scraped with a fork of mwkling's tool, modified to grab Billboard Singles (1940-2021) and Billboard Albums (1970-2021) charts.
billboard-artists-2021-10-04.txt	dynamic	Artist names using Ultimate Music Database, scraped with a fork of mwkling's tool, modified to grab Billboard Singles (1940-2021) and Billboard Albums (1970-2021) charts.
book.txt	static	Kaggle dataset with titles from over 300,000 books.
rstone-top-100.txt	static (could be dynamic in future)	Song lyrics for Rolling Stone's "top 100" artists using my lyric scraping tool.
cornell-movie-titles-raw.txt	static	Movie titles from this Cornell project.
cornell-movie-lines.txt	static	Movie lines from this Cornell project.
author-quotes-raw.txt	static	Quotables dataset on Kaggle.
1800-phrases-raw.txt	static	1,800 English Phrases.
15k-phrases-raw.txt	static	15,000 Useful Phrases.

Hashcat Rules

The rule files are designed to both "shape" the password and to mutate it. Shaping is based on the idea that human beings follow fairly predictable patterns when choosing a password, such as capitalising the first letter of each word and following the phrase with a number or special character. Mutations are also fairly predictable, such as replacing letters with visually-similar special characters.

Given the phrase "take the red pill", the first hashcat rule will output the following:

take the red pill
take-the-red-pill
take.the.red.pill
take_the_red_pill
taketheredpill
Take the red pill
TAKE THE RED PILL
tAKE THE RED PILL
Taketheredpill
tAKETHEREDPILL
TAKETHEREDPILL
Take The Red Pill
TakeTheRedPill
Take-The-Red-Pill
Take.The.Red.Pill
Take_The_Red_Pill
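
To make the shaping step concrete, here is a rough Python sketch that reproduces the sixteen candidates listed above. It is only an illustration under stated assumptions: the function name shape_phrase is hypothetical, and the actual rule file is written in hashcat's rule syntax and is not derived from this code.

```python
from typing import List


def shape_phrase(phrase: str) -> List[str]:
    """Approximate the 'shaping' output shown above (illustrative only)."""
    words = phrase.lower().split()
    titled = [w.capitalize() for w in words]
    candidates = []

    # Lowercase and Title Case words joined with each separator.
    for sep in (" ", "-", ".", "_", ""):
        candidates.append(sep.join(words))
        candidates.append(sep.join(titled))

    # Whole-phrase case changes, with and without spaces.
    for sep in (" ", ""):
        joined = sep.join(words)
        candidates.append(joined[0].upper() + joined[1:])  # Take the red pill
        candidates.append(joined.upper())                  # TAKE THE RED PILL
        candidates.append(joined[0] + joined[1:].upper())  # tAKE THE RED PILL

    return candidates


if __name__ == "__main__":
    for candidate in shape_phrase("take the red pill"):
        print(candidate)
```

The order of the printed candidates differs from the list above; only the set of results is meant to match.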

Adding the second hashcat rule makes things a bit more interesting. It returns a huge list per candidate. Here are a few examples:

T@k3Th3R3dPill!
T@ke-The-Red-Pill
taketheredpill2020!
T0KE THE RED PILL
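
The mutation step can be pictured as character substitution plus an appended suffix. The Python sketch below is an assumption-laden illustration, not the contents of the second rule file: the function name mutate, the substitution map, and the suffixes are hypothetical choices picked to reproduce three of the examples above.

```python
from typing import Dict


def mutate(candidate: str, substitutions: Dict[str, str], suffix: str = "") -> str:
    """Replace single characters, then append a suffix (illustrative only)."""
    table = str.maketrans(substitutions)
    return candidate.translate(table) + suffix


if __name__ == "__main__":
    # Hypothetical substitution maps and suffixes, chosen for illustration.
    print(mutate("TakeTheRedPill", {"a": "@", "e": "3"}, "!"))
    print(mutate("Take-The-Red-Pill", {"a": "@"}))
    print(mutate("taketheredpill", {}, "2020!"))
```

Combining a handful of substitution maps with a handful of suffixes is enough to explain why each shaped candidate fans out into a long list.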

Additional Info

Optionally, some researchers might be interested in:

The raw source files mentioned in the table above. You can download them by appending the file name to https://f002.backblazeb2.com/file/passphrase-wordlist/.
The script I use to clean the raw sources into the wordlist here.
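
As a small illustration of the download instruction above, the following Python sketch builds raw-source URLs by appending a file name to the base URL. The helper name raw_source_url is hypothetical, the file names are taken from the table above, and the sketch does not check whether the files are still hosted there.

```python
from urllib.parse import quote

BASE_URL = "https://f002.backblazeb2.com/file/passphrase-wordlist/"


def raw_source_url(file_name: str) -> str:
    """Append a source file name to the base URL (hypothetical helper)."""
    return BASE_URL + quote(file_name)


if __name__ == "__main__":
    for name in ("book.txt", "cornell-movie-lines.txt", "15k-phrases-raw.txt"):
        print(raw_source_url(name))
```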

The cleanup script works like this:

$ python3.6 cleanup.py infile.txt outfile.txt
Reading from ./infile.txt: 505 MB
Wrote to ./outfile.txt: 250 MB
Elapsed time: 0:02:53.062531

Enjoy!

Owner

Jeff McJunkin
