Tutela: an Ethereum and Tornado Cash Anonymity Tool

Overview

The repo contains open-source code for Tutela, an anonymity tool for Ethereum and Tornado Cash users.

About Tutela

In response to the Tornado Cash (TC) Anonymity Research Tools Grant, we have built Tutela v1, an Ethereum wallet anonymity detection tool, to tell you if your blockchain transactions have revealed anything about your identity. What does this mean? Well, for example, if you have used multiple Ethereum wallets to send tokens to a single centralized exchange deposit address, you may have revealed that your wallets are owned by the same entity.

We'd love to get user feedback! Tell us what you like, what you don’t and what you think is missing! Please leave your feedback in the Tutela-Product-Feedback channel of the Tornado Cash Discord.

The Tornado Cash User's Dilemma

Tornado Cash users have multiple addresses and use Tornado Cash to hide this fact. We believe the most important need for this user base is to know whether their addresses can already be connected by third parties.

Tutela, an Anonymity Detection Tool

In response, our initial MVP focuses on informing users which of their Ethereum addresses are "affiliated" (a non-blockchain analogy would be haveibeenpwned.com). This involves a clustering algorithm and, so far, two heuristics (i.e. reveals): the Ethereum deposit address reuse heuristic and the Tornado Cash unique gas price heuristic. We plan to refine these and add further heuristics over time.

Current Heuristics

Ethereum Deposit Address Reuse Heuristic

When you send tokens from an Ethereum wallet to your account at a centralized exchange, the exchange creates a unique deposit address for each customer. If you reuse the same deposit address by sending tokens to it from multiple Ethereum wallets, those wallets can be linked. Even if you send tokens from multiple wallets to multiple deposit addresses, all of these addresses can be linked. In this way, it is possible to build a complex graph of address relationships.
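
As a rough illustration of how such links can be merged into clusters, here is a minimal union-find sketch. It is not Tutela's implementation (which, per the Technical Summary, is adapted from etherclust); the function name, the `(wallet, deposit_address)` input format, and the sample addresses are all assumptions made for this example.

```python
from collections import defaultdict


def cluster_by_deposit_reuse(transfers):
    """Group addresses linked through shared exchange deposit addresses.

    `transfers` is an iterable of (wallet, deposit_address) pairs. This
    input format is an assumption made for the sketch.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # A wallet and the deposit address it paid into belong to one cluster.
    for wallet, deposit in transfers:
        union(wallet, deposit)

    clusters = defaultdict(set)
    for address in parent:
        clusters[find(address)].add(address)
    return list(clusters.values())


# Hypothetical sample data: wallets a, b, c are chained together through
# two reused deposit addresses, while wallet d stays separate.
clusters = cluster_by_deposit_reuse([
    ("wallet_a", "deposit_1"),
    ("wallet_b", "deposit_1"),
    ("wallet_b", "deposit_2"),
    ("wallet_c", "deposit_2"),
    ("wallet_d", "deposit_3"),
])
```

Union-find is used here because each transfer only needs to merge two sets, which keeps the pass over the transfer list close to linear. A real pipeline would also need to decide which addresses count as exchange deposit addresses in the first place, which this sketch takes as given.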

Tornado Cash Pools Unique Gas Price Heuristic

Pre-EIP-1559 Ethereum transactions contained a gas price. Users can set a very specific gas price in their wallet (e.g. 147.4535436 Gwei) and pay it when they deposit into a Tornado Cash pool. If they later withdraw from that same Tornado Cash pool using the same wallet application (e.g. MetaMask) but a different wallet address, and have not changed the gas price, it could reveal that the two addresses are connected.
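
The idea can be sketched as below. This is an illustrative simplification, not the code in scripts/tornadocash: the dictionary keys (`address`, `pool`, `gas_price`), the function name, and the definition of "unique" (exactly one deposit and one withdrawal per pool with that gas price) are assumptions, and relayer transactions are not filtered out.

```python
from collections import defaultdict


def link_by_unique_gas_price(deposits, withdrawals):
    """Pair a deposit and a withdrawal in one pool that share a rare gas price.

    Each transaction is a dict with the hypothetical keys "address", "pool",
    and "gas_price" (in wei). A gas price counts as unique here when exactly
    one deposit and exactly one withdrawal in the pool used it.
    """
    deposits_by_key = defaultdict(list)
    withdrawals_by_key = defaultdict(list)
    for tx in deposits:
        deposits_by_key[(tx["pool"], tx["gas_price"])].append(tx)
    for tx in withdrawals:
        withdrawals_by_key[(tx["pool"], tx["gas_price"])].append(tx)

    links = []
    for key, pool_deposits in deposits_by_key.items():
        pool_withdrawals = withdrawals_by_key.get(key, [])
        if len(pool_deposits) == 1 and len(pool_withdrawals) == 1:
            deposit, withdrawal = pool_deposits[0], pool_withdrawals[0]
            # Same address on both sides is a different reveal, so skip it.
            if deposit["address"] != withdrawal["address"]:
                links.append((deposit["address"], withdrawal["address"]))
    return links


# Hypothetical data: 147.4535436 Gwei expressed in wei is used once on each
# side, while the round 100 Gwei price is too common to link anything.
links = link_by_unique_gas_price(
    deposits=[
        {"address": "0xA", "pool": "1ETH", "gas_price": 147_453_543_600},
        {"address": "0xC", "pool": "1ETH", "gas_price": 100_000_000_000},
        {"address": "0xD", "pool": "1ETH", "gas_price": 100_000_000_000},
    ],
    withdrawals=[
        {"address": "0xB", "pool": "1ETH", "gas_price": 147_453_543_600},
        {"address": "0xE", "pool": "1ETH", "gas_price": 100_000_000_000},
    ],
)
```

Gas prices are kept as integers in wei so that equality comparison is exact; comparing floating-point Gwei values could miss or invent matches.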

Tornado Cash Pools Synchronous Tx Heuristic

If a deposit transaction and a withdrawal transaction to a specific Tornado Cash pool share the same wallet address, then this address is now compromised, and should not add to the anonymity of future Tornado Cash transactions for that pool.
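
A minimal sketch of this check, assuming deposits and withdrawals are available as `(address, pool)` pairs (a hypothetical format chosen for the example, as is the function name):

```python
from collections import defaultdict


def find_compromised_addresses(deposits, withdrawals):
    """Return {pool: addresses that both deposited to and withdrew from it}.

    `deposits` and `withdrawals` are iterables of (address, pool) pairs.
    """
    depositors = defaultdict(set)
    withdrawers = defaultdict(set)
    for address, pool in deposits:
        depositors[pool].add(address)
    for address, pool in withdrawals:
        withdrawers[pool].add(address)

    compromised = {}
    for pool, addresses in depositors.items():
        # The match is per pool: depositing to one pool and withdrawing
        # from a different one does not count.
        overlap = addresses & withdrawers.get(pool, set())
        if overlap:
            compromised[pool] = overlap
    return compromised


# Hypothetical data: 0xA deposits to and withdraws from pool_1.
compromised = find_compromised_addresses(
    deposits=[("0xA", "pool_1"), ("0xB", "pool_1"), ("0xA", "pool_2")],
    withdrawals=[("0xA", "pool_1"), ("0xC", "pool_1"), ("0xB", "pool_2")],
)
```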

We Need Your Help!

Tutela is still in its very early stages and we are looking for feedback at all levels. Let us know your thoughts, critiques, and suggestions in the Tutela-Product-Feedback channel of the Tornado Cash Discord. How can we make Tutela something useful for you? What features or heuristics are we missing?

Next Steps

Our plan for the next two months is to refine and develop Tutela v1 by:

  1. Getting your feedback!
  2. Refining the deposit reuse heuristic
  3. Adding anonymity set scoring for Tornado Cash pools
  4. Providing transaction by transaction reveal data (studying anonymity over time)
  5. Identifying, testing and implementing Tornado Cash Specific Heuristics:
    1. Transactions between deposit and withdrawal addresses from a specific TC pool
    2. Linking equal value deposits and withdrawals to specific deposit and withdrawal addresses - if there are multiple (say 12) deposit transactions coming from a deposit address and later there are 12 withdraw transactions to the same withdraw address, then we could link all these deposit transactions to the withdraw transactions
    3. Careless TC anonymity mining - anonymity mining is a clever way to incentivize users to participate in mixing. However, if users carelessly claim their Anonymity Points (AP) or Tornado tokens, then they can reduce their anonymity set. For instance, if a user withdraws their earned AP tokens to a deposit address, then we can approximate the maximum time a user has left their funds in the mixing pool. This is because users can only claim AP and TORN tokens after deposit transactions that were already withdrawn.
    4. Profiling deposit and withdrawal addresses - collect and analyze the behaviour of all addresses that have interacted with Tornado Cash pools
    5. Wallet fingerprinting - different wallets work in different ways. We have several ideas on how we can distinguish between them. It will allow us to further fragment the anonymity sets of withdraw transactions.
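
To make planned heuristic 2 in the list above more concrete (linking a deposit address and a withdrawal address that have the same number of transactions in a pool), here is a hedged sketch. The `(address, pool)` input format, the function name, and the `min_count` threshold are hypothetical, and the sketch ignores transaction ordering, although the description requires the withdrawals to come later.

```python
from collections import Counter, defaultdict


def link_by_transaction_count(deposits, withdrawals, min_count=2):
    """Link addresses whose deposit and withdrawal counts match in a pool.

    `deposits` and `withdrawals` are iterables of (address, pool) pairs,
    one pair per transaction. Only unambiguous matches are returned: exactly
    one depositor and one withdrawer sharing a given count in a given pool.
    """
    def group_by_count(transactions):
        groups = defaultdict(list)
        for (address, pool), count in Counter(transactions).items():
            if count >= min_count:
                groups[(pool, count)].append(address)
        return groups

    deposit_groups = group_by_count(deposits)
    withdrawal_groups = group_by_count(withdrawals)

    links = []
    for (pool, count), depositors in deposit_groups.items():
        withdrawers = withdrawal_groups.get((pool, count), [])
        if len(depositors) == 1 and len(withdrawers) == 1:
            if depositors[0] != withdrawers[0]:
                links.append((depositors[0], withdrawers[0], pool, count))
    return links


# Hypothetical data: 0xA deposits 12 times and 0xB withdraws 12 times.
# The count of 3 is shared by two depositors, so it is left unlinked.
links = link_by_transaction_count(
    deposits=[("0xA", "pool")] * 12 + [("0xC", "pool")] * 3 + [("0xD", "pool")] * 3,
    withdrawals=[("0xB", "pool")] * 12 + [("0xE", "pool")] * 3,
)
```

Requiring a one-to-one match is a conservative choice for the sketch; a production heuristic would likely weigh how rare a given count is within the pool rather than apply a fixed threshold.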

Technical Summary

Ethereum and Tornado Cash transactions are downloaded using BigQuery. The deposit address reuse algorithm was adapted from the existing implementation in etherclust. Our Python implementation can be found in src/; it is written to operate scalably over the >1 TB of Ethereum data. The Tornado-specific heuristics can be found in scripts/tornadocash, again written in Python. The Tutela web application lives in webapp/ and is written in Flask with a PostgreSQL database for storing clusters. The frontend is written in JavaScript, HTML, and CSS.

Updates

We aim to provide consistent updates over time as we improve Tutela.

  • (11/17) We posted a pre-beta version of Tutela to the Tornado Cash community for feedback.
  • (11/23) We open-sourced the Tutela implementation and will make all future improvements public through pull requests. Since 11/17, we increased the number of CEXs for clustering from 171 to 332, and added a list of common addresses that we omit from consideration when classifying deposits. Improvements were made to the gas price and synchronous TCash reveals: searching by address will now return TCash-specific information in the backend. Several bugfixes were implemented, covering address casing, incorrect deposit names, and deposit reuse hyperparameters.

Contributors

Development of the web application and clustering was done by mhw32, kkailiwang, Tiggy560, and nickbax, with support from Convex Labs. Development of TCash heuristics was done by seresistvanandras, unbalancedparentheses, tomasdema, entropidelic, HermanObst, and pefontana.

Comments
  • Add Time Window to Multi-Denomination Heuristic

    For each withdrawal transaction, look for all transactions in the 24 hours before that point in time. Treat them as a group if there are >2 transactions; otherwise abort.

    If there are >2 withdrawal transactions, consider each deposit transaction and check whether, in the preceding 24 hours, a unique address deposited exactly the amount that was withdrawn. If so, consider it matched; otherwise, move on to the next deposit transaction.

    • Care was taken to vectorize all operations for speed.
    • Small update to Gas price heuristic to fix a by_pool issue.
    opened by mhw32 4
  • add ihavebeencompromised feature for tcash addr page

    If you pull up the page info for a tcash pool address, you can now input a deposit address to see if any of the compromised transactions involved that address.

    opened by kkailiwang 1
  • Updating Heuristics 1, 2, and 4

    • Add pool restrictions for the AddressMatch heuristic.
    • Remove relayers from the UniqueGasPrice heuristic.
    • Add option for exact match in the MultiDenomination heuristic.
    • TODO: Add time constraints to the MultiDenomination heuristic.
    opened by mhw32 1
  • Edits to Tornado

    This is for the "have i been compromised" functionality.

    • Add a new path to check if address is tornado with some small features.
    • Okay... done, added a new path /search/compromised. It expects two params: an address and a pool address.
    opened by mhw32 1
  • Live updating pipeline

    Includes pipelines for (1) Tornado Cash heuristics (from Google BigQuery -> database), (2) deposit address reuse, and (3) transaction data needed for rankings.

    NOTE: This does not cover Diff2Vec and currently there are no plans to live update this. It is just far too expensive, and any greedy compromises sacrifice too much.

    Remaining TODOs:

    • Test db writing in dev server.
    • Code up deposit reuse pipeline.
    • Test end to end.
    opened by mhw32 0
  • Feature: Relative rank of Reveal Statistics

    Structure:

    'ranks': {
        'overall': 0,
        'ethereum': {
            DEPO_REUSE_HEUR: 0,
        },
        'tcash': {
            SAME_ADDR_HEUR: 0,
            GAS_PRICE_HEUR: 0,
            SAME_NUM_TX_HEUR: 0,
            LINKED_TX_HEUR: 0,
            TORN_MINE_HEUR: 0,
        },
    },
    
    opened by mhw32 0
  • [WIP] Bugfixes: Transaction Page Issues

    • Remove extra call to heuristic_to_int that was returning integers.
    • Add multiple options (1mth, 3mth, 6mth, 1yr, 3yr, 5yr). @kkailiwang You can default frontend to 6mth, 1yr, 3yr.
    opened by mhw32 0
  • Bugfix: Non-DAR addresses should be shown for Diff2Vec.

    The whole point of adding Diff2Vec was to find clusters for addresses that do not show up in DAR. Unfortunately, the current design does not query Diff2Vec when an address has no DAR clusters.

    opened by mhw32 0
Owner
TutelaLabs
Privacy tools for Blockchain