First, I'm sorry for my poor English.
Thanks to Yoon Kim and dennybritz.
I edited your scripts to add multi-label support (one label per example, drawn from any label set).
These scripts support multi-label data in the following format (tab-separated text and label):
text1....\t label1
text2....\t label2
text3....\t label1
text4....\t label3
Between training and evaluation there is
no vocabulary difference problem,
no label set difference problem,
no max sentence length difference problem,
and no unseen word problem (see the sketch below).
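The reason is that train.py saves the max sentence length, vocabulary and label set next to the checkpoints, eval.py reloads them through data_helpers.load_data, and any word that was not seen at training time is mapped to the <UNK/> mark. A minimal sketch of that lookup, with a made-up example vocabulary:

# Hypothetical vocabulary, shaped like the one build_input_data receives at eval time
vocabulary = {"<PAD/>": 0, "<UNK/>": 1, "hello": 2, "world": 3}
UNK_MARK = "<UNK/>"
sentence = ["hello", "never_seen_word"]
ids = [vocabulary[w] if w in vocabulary else vocabulary[UNK_MARK] for w in sentence]
print(ids)  # [2, 1] -- the unseen word falls back to the <UNK/> index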
Just use:
python train.py --train_data_path="./data/train.txt"
python eval.py --checkpoint_dir="./runs/1463968251/checkpoints/" --test_data_path="./data/test.txt"
Here are my scripts:
data_helpers.py
import codecs
import os.path
import numpy as np
import re
import itertools
from collections import Counter
PAD_MARK = "<PAD/>"
UNK_MARK = "<UNK/>"
def clean_str(string):
    """
    Tokenization/string cleaning for all datasets except for SST.
    Original taken from https://github.com/yoonkim/CNN_sentence/blob/master/process_data.py
    """
    # string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)  # blocked to allow non-English character sets
    string = re.sub(r"\'s", " \'s", string)
    string = re.sub(r"\'ve", " \'ve", string)
    string = re.sub(r"n\'t", " n\'t", string)
    string = re.sub(r"\'re", " \'re", string)
    string = re.sub(r"\'d", " \'d", string)
    string = re.sub(r"\'ll", " \'ll", string)
    string = re.sub(r",", " , ", string)
    string = re.sub(r"!", " ! ", string)
    string = re.sub(r"\(", " \( ", string)
    string = re.sub(r"\)", " \) ", string)
    string = re.sub(r"\?", " \? ", string)
    string = re.sub(r"\s{2,}", " ", string)
    return string.strip().lower()
def load_data_and_labels(train_data_path):
    """
    Loads tab-separated "text<TAB>label" data, splits the text into words
    and collects the labels.
    Returns split sentences and labels.
    """
    # Load data from file
    data = list()
    labels = list()
    for line in codecs.open(train_data_path, 'r', encoding='utf8').readlines():
        if not line.strip():
            continue
        t = line.split(u"\t")
        if len(t) != 2:
            print("data format error: " + line)
            continue
        data.append(t[0])
        labels.append(t[1])
    data = [s.strip() for s in data]
    labels = [s.strip() for s in labels]
    # Split by words
    x_text = [clean_str(sent) for sent in data]
    x_text = [s.split(u" ") for s in x_text]
    return [x_text, labels]
def pad_sentences(sentences, max_sent_len_path):
    """
    Pads all sentences to the same length. The length is defined by the longest sentence,
    or by a previously saved maximum length when max_sent_len_path is given.
    Returns the padded sentences and the maximum sequence length.
    """
    max_sequence_length = 0
    # Load the saved maximum sentence length if available
    if len(max_sent_len_path) > 0:
        max_sequence_length = int(open(max_sent_len_path, 'r').readlines()[0])
    else:
        max_sequence_length = max(len(x) for x in sentences)
    padded_sentences = []
    for i in range(len(sentences)):
        sentence = sentences[i]
        if max_sequence_length <= len(sentence):
            # Truncate sentences longer than the maximum length
            padded_sentences.append(sentence[:max_sequence_length])
            continue
        num_padding = max_sequence_length - len(sentence)
        new_sentence = sentence + [PAD_MARK] * num_padding
        padded_sentences.append(new_sentence)
    return padded_sentences, max_sequence_length
def build_vocab(sentences, base_vocab_path):
    """
    Builds a vocabulary mapping from word to index based on the sentences,
    or on a previously saved vocabulary when base_vocab_path is given.
    Returns vocabulary mapping and inverse vocabulary mapping.
    """
    vocabulary_inv = []
    # Load base vocabulary
    if len(base_vocab_path) > 0:
        vL = [[w.strip()] for w in codecs.open(base_vocab_path, 'r', encoding='utf8').readlines()]
        c = Counter(itertools.chain(*vL))
        vocabulary_inv = [x[0] for x in c.most_common()]
    else:
        # Build vocabulary from the training data
        word_counts = Counter(itertools.chain(*sentences))
        # Mapping from index to word
        vocabulary_inv = vocabulary_inv + [x[0] for x in word_counts.most_common()]
    if UNK_MARK not in vocabulary_inv:
        vocabulary_inv.append(UNK_MARK)
    vocabulary_inv = list(set(vocabulary_inv))
    vocabulary_inv.sort()
    # Mapping from word to index
    vocabulary = {x: i for i, x in enumerate(vocabulary_inv)}
    if UNK_MARK not in vocabulary:
        vocabulary[UNK_MARK] = vocabulary[PAD_MARK]
    return [vocabulary, vocabulary_inv]
def make_onehot(idx, size):
    """Builds a one-hot list of the given size with a 1 at position idx."""
    onehot = []
    for i in range(size):
        if idx == i:
            onehot.append(1)
        else:
            onehot.append(0)
    return onehot
# end def
def make_label_dic(labels):
    """
    creator: [email protected]
    create date: 2016.05.22
    make 'label : one hot' dic
    """
    label_onehot = dict()
    onehot_label = dict()
    for i, label in enumerate(labels):
        onehot = make_onehot(i, len(labels))
        label_onehot[label] = onehot
        onehot_label[str(onehot)] = label
    return label_onehot, onehot_label
# end def
def build_onehot(labels, base_label_path):
    """
    Builds a mapping from label to one-hot vector based on the labels,
    or on a previously saved label set when base_label_path is given.
    Returns the unique labels and both mappings.
    """
    uniq_labels = []
    # Load base label set
    if len(base_label_path) > 0:
        vL = [[w.strip()] for w in codecs.open(base_label_path, 'r', encoding='utf8').readlines()]
        c = Counter(itertools.chain(*vL))
        uniq_labels = [x[0] for x in c.most_common()]
    else:
        # Build label set from the training data
        label_counts = Counter(labels)
        # Mapping from index to label
        uniq_labels = uniq_labels + [x[0] for x in label_counts.most_common()]
    uniq_labels = list(set(uniq_labels))
    uniq_labels.sort()
    label_onehot, onehot_label = make_label_dic(uniq_labels)
    return [uniq_labels, label_onehot, onehot_label]
def build_input_data(sentences, vocabulary, labels, label_onehot):
    """
    Maps sentences and labels to vectors based on a vocabulary.
    Words that are not in the vocabulary are mapped to UNK_MARK.
    """
    vL = []
    for sentence in sentences:
        wL = []
        for word in sentence:
            if word in vocabulary:
                wL.append(vocabulary[word])
            else:
                wL.append(vocabulary[UNK_MARK])
        vL.append(wL)
    x = np.array(vL)
    y = np.array([label_onehot[label] for label in labels])
    return [x, y]
def load_data(train_data_path, checkpoint_dir=""):
    """
    Loads and preprocesses the tab-separated data.
    When checkpoint_dir is given, the max sentence length, vocabulary and label set
    saved there are reused so that training and evaluation stay consistent.
    Returns input vectors, labels, vocabulary, inverse vocabulary,
    the one-hot-to-label mapping and the maximum sequence length.
    """
    # Paths of the artifacts saved next to the checkpoints (empty means "build from data")
    max_sent_len_path = "" if len(checkpoint_dir) < 1 else checkpoint_dir + "/max_sent_len"
    vocab_path = "" if len(checkpoint_dir) < 1 else checkpoint_dir + "/vocab"
    label_path = "" if len(checkpoint_dir) < 1 else checkpoint_dir + "/label"
    sentences, labels = load_data_and_labels(train_data_path)
    sentences_padded, max_sequence_length = pad_sentences(sentences, max_sent_len_path)
    vocabulary, vocabulary_inv = build_vocab(sentences_padded, vocab_path)
    uniq_labels, label_onehot, onehot_label = build_onehot(labels, label_path)
    x, y = build_input_data(sentences_padded, vocabulary, labels, label_onehot)
    return [x, y, vocabulary, vocabulary_inv, onehot_label, max_sequence_length]
def batch_iter(data, batch_size, num_epochs, shuffle=True):
    """
    Generates a batch iterator for a dataset.
    """
    data = np.array(data)
    data_size = len(data)
    # Avoid yielding an empty final batch when data_size is an exact multiple of batch_size
    num_batches_per_epoch = int((data_size - 1) / batch_size) + 1
    for epoch in range(num_epochs):
        # Shuffle the data at each epoch
        if shuffle:
            shuffle_indices = np.random.permutation(np.arange(data_size))
            shuffled_data = data[shuffle_indices]
        else:
            shuffled_data = data
        for batch_num in range(num_batches_per_epoch):
            start_index = batch_num * batch_size
            end_index = min((batch_num + 1) * batch_size, data_size)
            yield shuffled_data[start_index:end_index]
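For reference, a minimal sketch of how train.py uses these helpers (assuming the default ./data/train.txt path):

import data_helpers

# Build padded inputs, one-hot labels, vocabulary and label mapping from the training file
x, y, vocabulary, vocabulary_inv, onehot_label, max_len = data_helpers.load_data("./data/train.txt")

# Iterate over shuffled (x, y) batches for one epoch
for batch in data_helpers.batch_iter(list(zip(x, y)), batch_size=64, num_epochs=1):
    x_batch, y_batch = zip(*batch)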
train.py
#! /usr/bin/env python
import codecs
import tensorflow as tf
import numpy as np
import os
import time
import datetime
import data_helpers
from text_cnn import TextCNN
# Parameters
# ==================================================
# Model Hyperparameters
tf.flags.DEFINE_string("train_data_path", "./data/train.txt", "Data path to training")
tf.flags.DEFINE_integer("embedding_dim", 128, "Dimensionality of character embedding (default: 128)")
tf.flags.DEFINE_string("filter_sizes", "3,4,5", "Comma-separated filter sizes (default: '3,4,5')")
tf.flags.DEFINE_integer("num_filters", 128, "Number of filters per filter size (default: 128)")
tf.flags.DEFINE_float("dropout_keep_prob", 0.5, "Dropout keep probability (default: 0.5)")
tf.flags.DEFINE_float("l2_reg_lambda", 0.0, "L2 regularizaion lambda (default: 0.0)")
# Training parameters
tf.flags.DEFINE_integer("batch_size", 64, "Batch Size (default: 64)")
tf.flags.DEFINE_integer("num_epochs", 200, "Number of training epochs (default: 200)")
tf.flags.DEFINE_integer("evaluate_every", 100, "Evaluate model on dev set after this many steps (default: 100)")
tf.flags.DEFINE_integer("checkpoint_every", 100, "Save model after this many steps (default: 100)")
# Misc Parameters
tf.flags.DEFINE_boolean("allow_soft_placement", True, "Allow device soft device placement")
tf.flags.DEFINE_boolean("log_device_placement", False, "Log placement of ops on devices")
FLAGS = tf.flags.FLAGS
FLAGS._parse_flags()
print("\nParameters:")
for attr, value in sorted(FLAGS.__flags.items()):
    print("{}={}".format(attr.upper(), value))
print("")
# Data Preparation
# ==================================================
# Load data
print("Loading data...")
x, y, vocabulary, vocabulary_inv, onehot_label, max_sequence_length = data_helpers.load_data( FLAGS.train_data_path )
# Randomly shuffle data
np.random.seed(10)
shuffle_indices = np.random.permutation(np.arange(len(y)))
x_shuffled = x[shuffle_indices]
y_shuffled = y[shuffle_indices]
# Split train/test set
# TODO: This is very crude, should use cross-validation
x_train, x_dev = x_shuffled[:-1000], x_shuffled[-1000:]
y_train, y_dev = y_shuffled[:-1000], y_shuffled[-1000:]
print("Labels: %d: %s" % ( len(onehot_label), ','.join( onehot_label.values() ) ) )
print("Vocabulary Size: {:d}".format(len(vocabulary)))
print("Train/Dev split: {:d}/{:d}".format(len(y_train), len(y_dev)))
# Training
# ==================================================
with tf.Graph().as_default():
    session_conf = tf.ConfigProto(
        allow_soft_placement=FLAGS.allow_soft_placement,
        log_device_placement=FLAGS.log_device_placement)
    sess = tf.Session(config=session_conf)
    with sess.as_default():
        cnn = TextCNN(
            sequence_length=x_train.shape[1],
            num_classes=len(onehot_label),
            vocab_size=len(vocabulary),
            embedding_size=FLAGS.embedding_dim,
            filter_sizes=list(map(int, FLAGS.filter_sizes.split(","))),
            num_filters=FLAGS.num_filters,
            l2_reg_lambda=FLAGS.l2_reg_lambda)
        # Define Training procedure
        global_step = tf.Variable(0, name="global_step", trainable=False)
        optimizer = tf.train.AdamOptimizer(1e-3)
        grads_and_vars = optimizer.compute_gradients(cnn.loss)
        train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)
        # Keep track of gradient values and sparsity (optional)
        grad_summaries = []
        for g, v in grads_and_vars:
            if g is not None:
                grad_hist_summary = tf.histogram_summary("{}/grad/hist".format(v.name), g)
                sparsity_summary = tf.scalar_summary("{}/grad/sparsity".format(v.name), tf.nn.zero_fraction(g))
                grad_summaries.append(grad_hist_summary)
                grad_summaries.append(sparsity_summary)
        grad_summaries_merged = tf.merge_summary(grad_summaries)
        # Output directory for models and summaries
        timestamp = str(int(time.time()))
        out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
        print("Writing to {}\n".format(out_dir))
        # Summaries for loss and accuracy
        loss_summary = tf.scalar_summary("loss", cnn.loss)
        acc_summary = tf.scalar_summary("accuracy", cnn.accuracy)
        # Train Summaries
        train_summary_op = tf.merge_summary([loss_summary, acc_summary, grad_summaries_merged])
        train_summary_dir = os.path.join(out_dir, "summaries", "train")
        train_summary_writer = tf.train.SummaryWriter(train_summary_dir, sess.graph_def)
        # Dev summaries
        dev_summary_op = tf.merge_summary([loss_summary, acc_summary])
        dev_summary_dir = os.path.join(out_dir, "summaries", "dev")
        dev_summary_writer = tf.train.SummaryWriter(dev_summary_dir, sess.graph_def)
        # Checkpoint directory. Tensorflow assumes this directory already exists so we need to create it
        checkpoint_dir = os.path.abspath(os.path.join(out_dir, "checkpoints"))
        checkpoint_prefix = os.path.join(checkpoint_dir, "model")
        if not os.path.exists(checkpoint_dir):
            os.makedirs(checkpoint_dir)
        # Save additional model info so eval.py can rebuild the same mappings
        codecs.open(os.path.join(checkpoint_dir, "max_sent_len"), "w", encoding='utf8').write(str(max_sequence_length))
        codecs.open(os.path.join(checkpoint_dir, "vocab"), "w", encoding='utf8').write('\n'.join(vocabulary_inv))
        codecs.open(os.path.join(checkpoint_dir, "label"), "w", encoding='utf8').write('\n'.join(onehot_label.values()))
        saver = tf.train.Saver(tf.all_variables())
        # Initialize all variables
        sess.run(tf.initialize_all_variables())
        def train_step(x_batch, y_batch):
            """
            A single training step
            """
            feed_dict = {
                cnn.input_x: x_batch,
                cnn.input_y: y_batch,
                cnn.dropout_keep_prob: FLAGS.dropout_keep_prob
            }
            _, step, summaries, loss, accuracy = sess.run(
                [train_op, global_step, train_summary_op, cnn.loss, cnn.accuracy],
                feed_dict)
            time_str = datetime.datetime.now().isoformat()
            print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
            train_summary_writer.add_summary(summaries, step)

        def dev_step(x_batch, y_batch, writer=None):
            """
            Evaluates model on a dev set
            """
            feed_dict = {
                cnn.input_x: x_batch,
                cnn.input_y: y_batch,
                cnn.dropout_keep_prob: 1.0
            }
            step, summaries, loss, accuracy = sess.run(
                [global_step, dev_summary_op, cnn.loss, cnn.accuracy],
                feed_dict)
            time_str = datetime.datetime.now().isoformat()
            print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
            if writer:
                writer.add_summary(summaries, step)
        # Generate batches
        batches = data_helpers.batch_iter(
            list(zip(x_train, y_train)), FLAGS.batch_size, FLAGS.num_epochs)
        # Training loop. For each batch...
        for batch in batches:
            x_batch, y_batch = zip(*batch)
            train_step(x_batch, y_batch)
            current_step = tf.train.global_step(sess, global_step)
            if current_step % FLAGS.evaluate_every == 0:
                print("\nEvaluation:")
                dev_step(x_dev, y_dev, writer=dev_summary_writer)
                print("")
            if current_step % FLAGS.checkpoint_every == 0:
                path = saver.save(sess, checkpoint_prefix, global_step=current_step)
                print("Saved model checkpoint to {}\n".format(path))
eval.py
#! /usr/bin/env python
import tensorflow as tf
import numpy as np
import os
import time
import datetime
import data_helpers
from text_cnn import TextCNN
# Parameters
# ==================================================
# Eval Parameters
tf.flags.DEFINE_string("test_data_path", "./data/test.txt", "Data path to evaluation")
tf.flags.DEFINE_integer("batch_size", 64, "Batch Size (default: 64)")
tf.flags.DEFINE_string("checkpoint_dir", "", "Checkpoint directory from training run")
# Misc Parameters
tf.flags.DEFINE_boolean("allow_soft_placement", True, "Allow device soft device placement")
tf.flags.DEFINE_boolean("log_device_placement", False, "Log placement of ops on devices")
FLAGS = tf.flags.FLAGS
FLAGS._parse_flags()
print("\nParameters:")
for attr, value in sorted(FLAGS.__flags.items()):
    print("{}={}".format(attr.upper(), value))
print("")
# Load data. Load your own data here
print("Loading data...")
x_test, y_test, vocabulary, vocabulary_inv, onehot_label, max_sequence_length = data_helpers.load_data( FLAGS.test_data_path, FLAGS.checkpoint_dir )
y_test = np.argmax(y_test, axis=1)
print("Labels: %d: %s" % ( len(onehot_label), ','.join( sorted(onehot_label.values()) ) ) )
print("Vocabulary size: {:d}".format(len(vocabulary)))
print("Test set size {:d}".format(len(y_test)))
print("\nEvaluating...\n")
# Evaluation
# ==================================================
checkpoint_file = tf.train.latest_checkpoint(FLAGS.checkpoint_dir)
graph = tf.Graph()
with graph.as_default():
    session_conf = tf.ConfigProto(
        allow_soft_placement=FLAGS.allow_soft_placement,
        log_device_placement=FLAGS.log_device_placement)
    sess = tf.Session(config=session_conf)
    with sess.as_default():
        # Load the saved meta graph and restore variables
        print("FLAGS.checkpoint_dir %s" % FLAGS.checkpoint_dir)
        print("checkpoint_file %s" % checkpoint_file)
        saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
        saver.restore(sess, checkpoint_file)
        # Get the placeholders from the graph by name
        input_x = graph.get_operation_by_name("input_x").outputs[0]
        # input_y = graph.get_operation_by_name("input_y").outputs[0]
        dropout_keep_prob = graph.get_operation_by_name("dropout_keep_prob").outputs[0]
        # Tensors we want to evaluate
        predictions = graph.get_operation_by_name("output/predictions").outputs[0]
        # Generate batches for one epoch
        batches = data_helpers.batch_iter(x_test, FLAGS.batch_size, 1, shuffle=False)
        # Collect the predictions here
        all_predictions = []
        for x_test_batch in batches:
            batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})
            all_predictions = np.concatenate([all_predictions, batch_predictions])
# Print accuracy
print "y_test: " + str(y_test)
print "all_predictions: " + str(all_predictions)
correct_predictions = float(sum(all_predictions == y_test))
print("Total number of test examples: {}".format(len(y_test)))
print("Accuracy: {:g}".format(correct_predictions/float(len(y_test))))