Türkçe küfürlü içerikleri bulan bir yapay zeka kütüphanesi / An ML library for profanity detection in Turkish sentences

KaraGoz

Last update: Feb 18, 2022

Related tags

Text Data & NLP nlp profanity-detection turkish-language turkce kufur turkish-nlp acikhack2021

Overview

"Kötü söz sahibine aittir."

-Anonim

Nedir?

sinkaf uygunsuz yorumların bulunmasını sağlayan bir python kütüphanesidir.

Farkı nedir?

Diğer algoritmalardan en büyük farkı, önceden belirlenmiş bir kelime listesinden cümlerlerdeki sözcükleri tek tek kontrol etmek yerine, makine öğrenmesi metodları kullanarak cümlenin genel anlamına bakabilmesidir. Aynı zamanda sinkaf baya bi hızlı!

Nasıl çalışıyor?

Arka planda modelimizi eğitmek için A corpus of Turkish offensive language verisetini kullanıyoruz. Bu veriseti 36,000+ twitter yorumunun hakaret içerip içermediğini gösteren, Türkçe ile makine öğrenmesi denemeleri yapmak isteyenler için fevkaledenin fevkinde bir kaynak! Kendilerine teşekkür ediyoruz. Velhasıl...

Nasıl yüklerim?

pip3 install sinkaf

Gerekli paketler

joblib
transformers
numpy
scikit_learn

Nasıl kullanırım?

from sinkaf import Sinkaf
  
snf = Sinkaf()

snf.tahmin(["çok tatlı çocuk", "çok şerefsiz çocuk"])
# array([False,  True])

snf.tahminlik(["çok tatlı çocuk", "çok şerefsiz çocuk"])
# array([0.09811712, 0.86237484])

Alternatif model

BERT kullanılarak vektörize edilmiş veri üzerinde eğitilmiş modeller:

bert_pre: Küfürlü cümlelerin saptanmasında düşük duyarlılık yüksek kesinlik
bert_rec: Küfürlü cümlelerin saptanmasında yüksek duyarlılık az kesinlik

snf = Sinkaf(model = "bert_pre")

snf.tahmin(["çok tatlı çocuk", "çok şerefsiz çocuk"])
# array([False,  True])

snf.tahminlik(["çok tatlı çocuk", "çok şerefsiz çocuk"])
# array([0.26865139 0.85412345])

İyi çalışıyor mu?

Fena değil gibi ama tabi daha iyi kesinlikle olabilir.

Detaylar için:

sinkaf, Açık Hack 2021^*'e katılmak amacıyla Kara Göz ekibi tarafından geliştirilmiştir.

^{* sunum linki}

Write Alphabet, Words and Sentences with your eyes.

The-Next-Gen-AI-Eye-Writer The Eye tracking Technique has become one of the most popular techniques within the human and computer interaction era, thi

2 Apr 5, 2022

Finally, some decent sample sentences

tts-dataset-prompts This repository aims to be a decent set of sentences for people looking to clone their own voices (e.g. using Tacotron 2). Each se

19 Dec 13, 2022

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

⚠️ Checkout develop branch to see what is coming in pyannote.audio 2.0: a much smaller and cleaner codebase Python-first API (the good old pyannote-au

2.2k Jan 9, 2023

Visual Automata is a Python 3 library built as a wrapper for Caleb Evans' Automata library to add more visualization features.

55 Nov 17, 2022

Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and convert them into audio. Here I have used Google-text-to-speech library popularly known as gTTS library to convert text file to .mp3 file. Hope you like my project!

Text to speech (using Python) Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and co

19 Jun 30, 2022

🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.

pySBD: Python Sentence Boundary Disambiguation (SBD) pySBD - python Sentence Boundary Disambiguation (SBD) - is a rule-based sentence boundary detecti

549 Jan 6, 2023

277 Feb 18, 2021

Scene Text Retrieval via Joint Text Detection and Similarity Learning

This is the code of "Scene Text Retrieval via Joint Text Detection and Similarity Learning". For more details, please refer to our CVPR2021 paper.

79 Nov 29, 2022

One Stop Anomaly Shop: Anomaly detection using two-phase approach: (a) pre-labeling using statistics, Natural Language Processing and static rules; (b) anomaly scoring using supervised and unsupervised machine learning.

One Stop Anomaly Shop (OSAS) Quick start guide Step 1: Get/build the docker image Option 1: Use precompiled image (might not reflect latest changes):

148 Dec 26, 2022

Türkçe küfürlü içerikleri bulan bir yapay zeka kütüphanesi / An ML library for profanity detection in Turkish sentences

Related tags

Overview

Nedir?

Farkı nedir?

Nasıl çalışıyor?

Nasıl yüklerim?

Gerekli paketler

Nasıl kullanırım?

Alternatif model

İyi çalışıyor mu?

You might also like...

Write Alphabet, Words and Sentences with your eyes.

Finally, some decent sample sentences

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Visual Automata is a Python 3 library built as a wrapper for Caleb Evans' Automata library to add more visualization features.

Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and convert them into audio. Here I have used Google-text-to-speech library popularly known as gTTS library to convert text file to .mp3 file. Hope you like my project!

🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.

🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.

Scene Text Retrieval via Joint Text Detection and Similarity Learning

One Stop Anomaly Shop: Anomaly detection using two-phase approach: (a) pre-labeling using statistics, Natural Language Processing and static rules; (b) anomaly scoring using supervised and unsupervised machine learning.

Owner

KaraGoz

Python package for Turkish Language.

Question and answer retrieval in Turkish with BERT

Text Classification in Turkish Texts with Bert

Bu Chatbot, Konya Bilim Merkezi Yen için tasarlanmış olan bir projedir.

Extract Keywords from sentence or Replace keywords in sentences.

Extract Keywords from sentence or Replace keywords in sentences.

Using context-free grammar formalism to parse English sentences to determine their structure to help computer to better understand the meaning of the sentence.

基于GRU网络的句子判断程序/A program based on GRU network for judging sentences

Idea is to build a model which will take keywords as inputs and generate sentences as outputs.

Gpt2-WebAPI - The objective of this API is to provide the 3 best possible responses to sentences that the user would input via http GET request as a parameter