Convolutional Neural Network for Text Classification in Tensorflow

Overview

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post.

It is a slightly simplified implementation of Kim's Convolutional Neural Networks for Sentence Classification paper in Tensorflow.

Requirements

  • Python 3
  • Tensorflow > 0.12
  • Numpy

Training

Print parameters:

./train.py --help
optional arguments:
  -h, --help            show this help message and exit
  --embedding_dim EMBEDDING_DIM
                        Dimensionality of character embedding (default: 128)
  --filter_sizes FILTER_SIZES
                        Comma-separated filter sizes (default: '3,4,5')
  --num_filters NUM_FILTERS
                        Number of filters per filter size (default: 128)
  --l2_reg_lambda L2_REG_LAMBDA
                        L2 regularization lambda (default: 0.0)
  --dropout_keep_prob DROPOUT_KEEP_PROB
                        Dropout keep probability (default: 0.5)
  --batch_size BATCH_SIZE
                        Batch Size (default: 64)
  --num_epochs NUM_EPOCHS
                        Number of training epochs (default: 100)
  --evaluate_every EVALUATE_EVERY
                        Evaluate model on dev set after this many steps
                        (default: 100)
  --checkpoint_every CHECKPOINT_EVERY
                        Save model after this many steps (default: 100)
  --allow_soft_placement ALLOW_SOFT_PLACEMENT
                        Allow soft device placement
  --noallow_soft_placement
  --log_device_placement LOG_DEVICE_PLACEMENT
                        Log placement of ops on devices
  --nolog_device_placement

Train:

./train.py
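
You can override any of the hyperparameters listed above on the command line, for example (illustrative values, not tuned recommendations):

./train.py --embedding_dim=200 --num_filters=64 --dropout_keep_prob=0.5 --num_epochs=50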

Evaluating

./eval.py --eval_train --checkpoint_dir="./runs/1459637919/checkpoints/"

Replace the checkpoint dir with the output from the training. To use your own data, change the eval.py script to load your data.
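
A rough sketch of the kind of change eval.py needs for custom data is shown below. The variable names x_raw and y_test and the CSV layout are assumptions for illustration, not the script's fixed API; the raw sentences still have to be mapped through the vocabulary saved during training before they are fed to the network.

    # Sketch: replace eval.py's data-loading block with your own loader.
    # Assumes a CSV of "text,label" rows; x_raw / y_test are illustrative names.
    import csv

    x_raw = []   # raw, untokenized sentences
    y_test = []  # integer class labels

    with open("my_eval_data.csv", encoding="utf-8") as f:
        for text, label in csv.reader(f):
            x_raw.append(text)
            y_test.append(int(label))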

References

  • "Implementing a CNN for Text Classification in Tensorflow" blog post
  • Yoon Kim, Convolutional Neural Networks for Sentence Classification (EMNLP 2014)

Comments
  • how to test the trained module?

    I trained on the movie review training set using this code and got the trained files in the path "runs/1458022294/summaries/train". How can I test the model? Is there any API in Python to test it?

    opened by haridatascientist 78
  • multi_class input

    I see that in your code, [0,1] represents the label pos and [1,0] represents the label neg. But what if I have multi-class input, such as 20 labels? How can I represent these labels?

    opened by tanghuiyu 16
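
    For multi-class data the label vectors simply become one-hot rows of length num_classes instead of length 2, and num_classes passed to TextCNN has to match. A minimal NumPy sketch (the label names are illustrative):

    import numpy as np

    labels = ["sports", "politics", "tech", "sports"]   # raw string labels
    classes = sorted(set(labels))                       # e.g. 20 distinct classes
    class_index = {c: i for i, c in enumerate(classes)}

    # One-hot label matrix of shape [num_examples, num_classes]
    y = np.zeros((len(labels), len(classes)), dtype=np.float32)
    for row, label in enumerate(labels):
        y[row, class_index[label]] = 1.0
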
  • Testing

    Hey, I have successfully trained the model but am unable to test it. I tried changing the files in data_helpers.py (replaced the rt-polarity files), but I still don't get any new results. I also tried printing all_predictions in eval.py. Kindly help.

    opened by rajmiglani 13
  • accuracy vs CNN_sentence

    Thank you for this helpful reference implementation in tensorflow.

    I'm running your code out-of-the-box but with hyperparameters chosen to match Yoon Kim's theano implementation https://github.com/yoonkim/CNN_sentence

    I'm finding accuracy is much lower on the same data set over the same number of mini-batches / epochs...(not using pre-trained word2vec)

    I am going to look at model topology, learning rate, dropout, L2, optimizer, loss function, etc. to get to the bottom of this and make sure it is an apples-to-apples comparison, but if you know where to focus efforts any help is appreciated.

    opened by j314erre 11
  • what is needed for upgrading this code into word2vec version

    As is stated in @dennybritz 's blog, this code can also be used together with a pre-trained word2vec.

    I am thinking about how to make the modification for that.

    The obvious steps that I came across are:

    1. instead of padding and indexing the vocabulary, use fixed-size w2v vectors to represent the sentence.
    2. remove the embedding layer in the TextCNN class.

    Is that all? Did I misunderstand something?

    Thx

    opened by sjhddh 7
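
    One common approach keeps the embedding layer but overwrites its initial values with pre-trained vectors for the words you have, falling back to random vectors otherwise. The sketch below assumes a dict vocabulary mapping word -> index from your preprocessing, assumes the embedding variable is reachable as cnn.W (as in text_cnn.py), and uses gensim purely as an example loader; embedding_dim must match the vector size (300 for the GoogleNews vectors).

    import numpy as np
    from gensim.models import KeyedVectors  # example loader, not required by the repo

    w2v = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

    # Random init for every word, then overwrite the rows we have vectors for.
    init_w = np.random.uniform(-0.25, 0.25,
                               (len(vocabulary), FLAGS.embedding_dim)).astype(np.float32)
    for word, idx in vocabulary.items():
        if word in w2v:
            init_w[idx] = w2v[word]

    # After sess.run(tf.initialize_all_variables()), assign the matrix once:
    sess.run(cnn.W.assign(init_w))
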
  • AttributeError: type object 'NewBase' has no attribute 'is_abstract' using 0.8.0

    Using tensorflow 0.8.0 (installed via pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl) I have the following error:

    python3 train.py 
    Traceback (most recent call last):
      File "train.py", line 3, in <module>
        import tensorflow as tf
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/__init__.py", line 23, in <module>
        from tensorflow.python import *
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/__init__.py", line 94, in <module>
        from tensorflow.python.platform import test
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/test.py", line 62, in <module>
        from tensorflow.python.framework import test_util
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/test_util.py", line 41, in <module>
        from tensorflow.python.platform import googletest
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/googletest.py", line 32, in <module>
        from tensorflow.python.platform import benchmark  # pylint: disable=unused-import
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/benchmark.py", line 112, in <module>
        class Benchmark(six.with_metaclass(_BenchmarkRegistrar, object)):
      File "/usr/lib/python3/dist-packages/six.py", line 617, in with_metaclass
        return meta("NewBase", bases, {})
      File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/platform/benchmark.py", line 107, in __new__
        if not newclass.is_abstract():
    AttributeError: type object 'NewBase' has no attribute 'is_abstract'
    
    opened by servomac 6
  • I want to use cross-validation to train and predict all my own data, how to use eval.py?

    My data has no test set. I want to use cross-validation to train on and predict all of my data. Should I train on my data and then use eval.py to predict? If not, how can I get predicted labels for all of the data? Does anyone know? Thanks.

    opened by jmfu95 5
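
    One option when there is no separate test set: split the data into K folds, train on K-1 folds, and predict the held-out fold each round, so every example receives a prediction. A minimal sketch of the index bookkeeping (plain NumPy; the actual training and prediction runs would still use train.py and eval.py as above):

    import numpy as np

    def kfold_indices(num_examples, k=10, seed=10):
        """Yield (train_idx, test_idx) pairs that cover every example exactly once."""
        rng = np.random.RandomState(seed)
        order = rng.permutation(num_examples)
        folds = np.array_split(order, k)
        for i in range(k):
            test_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            yield train_idx, test_idx

    # for train_idx, test_idx in kfold_indices(len(y), k=10):
    #     write x[train_idx], y[train_idx] to a training file, train a model,
    #     then run eval.py with that fold's checkpoint on x[test_idx].
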
  • Transfer Learning?

    Hi,

    Is there any way I can utilise this for transfer learning? I mean, if I train the model on the rt data, can I then use the trained model to perform text classification on a different dataset?

    PS: Thanks for this awesome repo and the blog ! :)

    opened by pfrcks 5
  • multi-labels support + auto applicable vocabulary, label list, max sentence size and unseen word

    First, I'm sorry. I have poor English.

    Thanks to Yoon Kim and dennybritz.

    I edited your script to add multi-label support. This script supports multi-label data like the following:

    text1....\t label1
    text2....\t label2
    text3....\t label1
    text4....\t label3
    

    No vocabulary difference problem, no label set difference problem, no max sentence length difference problem, no unseen word problem.

    Just simply use

    python train.py --train_data_path="./data/train.txt"
    python eval.py --checkpoint_dir="./runs/1463968251/checkpoints/" --test_data_path="./data/test.txt"
    

    Here I attached my script:

    data_helpers.py

    import codecs
    import os.path
    import numpy as np
    import re
    import itertools
    from collections import Counter
    
    PAD_MARK = "<PAD/>"
    UNK_MARK = "<UNK/>"
    
    def clean_str(string):
        """
        Tokenization/string cleaning for all datasets except for SST.
        Original taken from https://github.com/yoonkim/CNN_sentence/blob/master/process_data.py
        """
    #    string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)  # blocked to allow non-english char-set
        string = re.sub(r"\'s", " \'s", string)
        string = re.sub(r"\'ve", " \'ve", string)
        string = re.sub(r"n\'t", " n\'t", string)
        string = re.sub(r"\'re", " \'re", string)
        string = re.sub(r"\'d", " \'d", string)
        string = re.sub(r"\'ll", " \'ll", string)
        string = re.sub(r",", " , ", string)
        string = re.sub(r"!", " ! ", string)
        string = re.sub(r"\(", " \( ", string)
        string = re.sub(r"\)", " \) ", string)
        string = re.sub(r"\?", " \? ", string)
        string = re.sub(r"\s{2,}", " ", string)
        return string.strip().lower()
    
    
    def load_data_and_labels( train_data_path ):
        """
        Loads MR polarity data from files, splits the data into words and generates labels.
        Returns split sentences and labels.
        """
        # Load data from files
        data = list()
        labels = list()
        for line in codecs.open( train_data_path, 'r', encoding='utf8' ).readlines() :
            if 1 > len( line.strip() ) : continue;
            t = line.split(u"\t");
            if 2 != len(t) :
                print "data format error" + line
                continue;
            data.append(t[0])
            labels.append(t[1])
        data   = [s.strip() for s in data]
        labels = [s.strip() for s in labels]
        # Split by words
        x_text = [clean_str(sent) for sent in data]
        x_text = [s.split(u" ") for s in x_text]
        return [x_text, labels]
    
    
    def pad_sentences(sentences, max_sent_len_path):
        """
        Pads all sentences to the same length. The length is defined by the longest sentence.
        Returns padded sentences.
        """
        max_sequence_length = 0
        # Load base max sent length
        if len(max_sent_len_path) > 0 :
            max_sequence_length = int( open( max_sent_len_path, 'r' ).readlines()[0] )
        else : 
            max_sequence_length = max(len(x) for x in sentences)
        padded_sentences = []
        for i in range(len(sentences)):
            sentence = sentences[i]
            if max_sequence_length <= len(sentence) :
                padded_sentences.append(sentence[:max_sequence_length])
                continue
            num_padding = max_sequence_length - len(sentence)
            new_sentence = sentence + [PAD_MARK] * num_padding
            padded_sentences.append(new_sentence)
        return padded_sentences, max_sequence_length
    
    
    def build_vocab(sentences, base_vocab_path):
        """
        Builds a vocabulary mapping from word to index based on the sentences.
        Returns vocabulary mapping and inverse vocabulary mapping.
        """
        vocabulary_inv = []
        # Load base vocabulary
        if len(base_vocab_path) > 0 :
            vL = [ [w.strip()] for w in codecs.open( base_vocab_path, 'r', encoding='utf8' ).readlines() ]
            c = Counter(itertools.chain(*vL))
            vocabulary_inv = [x[0] for x in c.most_common()]
        else :
            # Build vocabulary
            word_counts = Counter(itertools.chain(*sentences))
            # Mapping from index to word
            vocabulary_inv = vocabulary_inv + [x[0] for x in word_counts.most_common()] 
            if not UNK_MARK in vocabulary_inv :
                vocabulary_inv.append(UNK_MARK)
        vocabulary_inv = list(set(vocabulary_inv))
        vocabulary_inv.sort()
        # Mapping from word to index
        vocabulary = {x: i for i, x in enumerate(vocabulary_inv)}
        if not UNK_MARK in vocabulary :
            vocabulary[UNK_MARK] = vocabulary[PAD_MARK]
    
        return [vocabulary, vocabulary_inv]
    
    
    def make_onehot(idx, size) :
        onehot = []
        for i in range(size) :
            if idx==i : onehot.append(1);
            else      : onehot.append(0);
        return onehot
    # end def
    
    def make_label_dic(labels) :
        """
        creator: [email protected]
        create date: 2016.05.22
        make 'label : one hot' dic
        """
        label_onehot = dict()
        onehot_label = dict()
        for i, label in enumerate(labels) :
            onehot =  make_onehot(i,len(labels))
            label_onehot[label] = onehot
            onehot_label[str(onehot)] = label
        return label_onehot, onehot_label
    # end def
    
    def build_onehot(labels, base_label_path):
        """
        Builds a vocabulary mapping from label to onehot based on the sentences.
        Returns vocabulary mapping and inverse vocabulary mapping.
        """
        uniq_labels = []
        # Load base vocabulary
        if len(base_label_path) > 0 :
            vL = [ [w.strip()] for w in codecs.open( base_label_path, 'r', encoding='utf8' ).readlines() ]
            c = Counter(itertools.chain(*vL))
            uniq_labels = [x[0] for x in c.most_common()]
        else :
            # Build vocabulary
            label_counts = Counter(labels)
            # Mapping from index to word
            uniq_labels = uniq_labels + [x[0] for x in label_counts.most_common()]
        uniq_labels = list(set(uniq_labels))
        uniq_labels.sort()
        label_onehot, onehot_label = make_label_dic( uniq_labels )
        return [uniq_labels, label_onehot, onehot_label]
    
    
    def build_input_data(sentences, vocabulary, labels, label_onehot):
        """
        Maps sentences and labels to vectors based on a vocabulary.
        """
        vL = []
        for sentence in sentences :
            wL = []
            for word in sentence :
                if word in vocabulary :
                    wL.append( vocabulary[word] )
                else :
                    wL.append( vocabulary[UNK_MARK] )
            vL.append(wL)
        x = np.array(vL)
        y = np.array([ label_onehot[label] for label in labels ])
        return [x, y]
    
    
    def load_data( train_data_path, checkpoint_dir="" ):
        """
        Loads and preprocesses the data.
        Returns input vectors, labels, vocabulary, and inverse vocabulary.
        """
        # Load and preprocess data
        max_sent_len_path = "" if len(checkpoint_dir)<1 else checkpoint_dir+"/max_sent_len" 
        vocab_path        = "" if len(checkpoint_dir)<1 else checkpoint_dir+"/vocab" 
        label_path        = "" if len(checkpoint_dir)<1 else checkpoint_dir+"/label" 
        sentences, labels          = load_data_and_labels( train_data_path )
        sentences_padded, max_sequence_length = pad_sentences(sentences, max_sent_len_path)
        vocabulary, vocabulary_inv = build_vocab(sentences_padded, vocab_path)
        uniq_labels, label_onehot, onehot_label = build_onehot(labels, label_path) 
        x, y = build_input_data(sentences_padded, vocabulary, labels, label_onehot)
        return [x, y, vocabulary, vocabulary_inv, onehot_label, max_sequence_length]
    
    
    def batch_iter(data, batch_size, num_epochs, shuffle=True):
        """
        Generates a batch iterator for a dataset.
        """
        data = np.array(data)
        data_size = len(data)
        num_batches_per_epoch = int(len(data)/batch_size) + 1
        for epoch in range(num_epochs):
            # Shuffle the data at each epoch
            if shuffle:
                shuffle_indices = np.random.permutation(np.arange(data_size))
                shuffled_data = data[shuffle_indices]
            else:
                shuffled_data = data
            for batch_num in range(num_batches_per_epoch):
                start_index = batch_num * batch_size
                end_index = min((batch_num + 1) * batch_size, data_size)
                yield shuffled_data[start_index:end_index]
    

    train.py

    #! /usr/bin/env python
    
    import codecs
    import tensorflow as tf
    import numpy as np
    import os
    import time
    import datetime
    import data_helpers
    from text_cnn import TextCNN
    
    # Parameters
    # ==================================================
    
    # Model Hyperparameters
    tf.flags.DEFINE_string("train_data_path", "./data/train.txt", "Data path to training")
    tf.flags.DEFINE_integer("embedding_dim", 128, "Dimensionality of character embedding (default: 128)")
    tf.flags.DEFINE_string("filter_sizes", "3,4,5", "Comma-separated filter sizes (default: '3,4,5')")
    tf.flags.DEFINE_integer("num_filters", 128, "Number of filters per filter size (default: 128)")
    tf.flags.DEFINE_float("dropout_keep_prob", 0.5, "Dropout keep probability (default: 0.5)")
    tf.flags.DEFINE_float("l2_reg_lambda", 0.0, "L2 regularizaion lambda (default: 0.0)")
    
    # Training parameters
    tf.flags.DEFINE_integer("batch_size", 64, "Batch Size (default: 64)")
    tf.flags.DEFINE_integer("num_epochs", 200, "Number of training epochs (default: 200)")
    tf.flags.DEFINE_integer("evaluate_every", 100, "Evaluate model on dev set after this many steps (default: 100)")
    tf.flags.DEFINE_integer("checkpoint_every", 100, "Save model after this many steps (default: 100)")
    # Misc Parameters
    tf.flags.DEFINE_boolean("allow_soft_placement", True, "Allow device soft device placement")
    tf.flags.DEFINE_boolean("log_device_placement", False, "Log placement of ops on devices")
    
    FLAGS = tf.flags.FLAGS
    FLAGS._parse_flags()
    print("\nParameters:")
    for attr, value in sorted(FLAGS.__flags.items()):
        print("{}={}".format(attr.upper(), value))
    print("")
    
    
    # Data Preparation
    # ==================================================
    
    # Load data
    print("Loading data...")
    x, y, vocabulary, vocabulary_inv, onehot_label, max_sequence_length = data_helpers.load_data( FLAGS.train_data_path )
    # Randomly shuffle data
    np.random.seed(10)
    shuffle_indices = np.random.permutation(np.arange(len(y)))
    x_shuffled = x[shuffle_indices]
    y_shuffled = y[shuffle_indices]
    # Split train/test set
    # TODO: This is very crude, should use cross-validation
    x_train, x_dev = x_shuffled[:-1000], x_shuffled[-1000:]
    y_train, y_dev = y_shuffled[:-1000], y_shuffled[-1000:]
    print("Labels: %d: %s" % ( len(onehot_label), ','.join( onehot_label.values() ) ) )
    print("Vocabulary Size: {:d}".format(len(vocabulary)))
    print("Train/Dev split: {:d}/{:d}".format(len(y_train), len(y_dev)))
    
    
    # Training
    # ==================================================
    
    with tf.Graph().as_default():
        session_conf = tf.ConfigProto(
          allow_soft_placement=FLAGS.allow_soft_placement,
          log_device_placement=FLAGS.log_device_placement)
        sess = tf.Session(config=session_conf)
        with sess.as_default():
            cnn = TextCNN(
                sequence_length=x_train.shape[1],
                num_classes=len(onehot_label),
                vocab_size=len(vocabulary),
                embedding_size=FLAGS.embedding_dim,
                filter_sizes=list(map(int, FLAGS.filter_sizes.split(","))),
                num_filters=FLAGS.num_filters,
                l2_reg_lambda=FLAGS.l2_reg_lambda)
    
            # Define Training procedure
            global_step = tf.Variable(0, name="global_step", trainable=False)
            optimizer = tf.train.AdamOptimizer(1e-3)
            grads_and_vars = optimizer.compute_gradients(cnn.loss)
            train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)
    
            # Keep track of gradient values and sparsity (optional)
            grad_summaries = []
            for g, v in grads_and_vars:
                if g is not None:
                    grad_hist_summary = tf.histogram_summary("{}/grad/hist".format(v.name), g)
                    sparsity_summary = tf.scalar_summary("{}/grad/sparsity".format(v.name), tf.nn.zero_fraction(g))
                    grad_summaries.append(grad_hist_summary)
                    grad_summaries.append(sparsity_summary)
            grad_summaries_merged = tf.merge_summary(grad_summaries)
    
            # Output directory for models and summaries
            timestamp = str(int(time.time()))
            out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
            print("Writing to {}\n".format(out_dir))
    
            # Summaries for loss and accuracy
            loss_summary = tf.scalar_summary("loss", cnn.loss)
            acc_summary = tf.scalar_summary("accuracy", cnn.accuracy)
    
            # Train Summaries
            train_summary_op = tf.merge_summary([loss_summary, acc_summary, grad_summaries_merged])
            train_summary_dir = os.path.join(out_dir, "summaries", "train")
            train_summary_writer = tf.train.SummaryWriter(train_summary_dir, sess.graph_def)
    
            # Dev summaries
            dev_summary_op = tf.merge_summary([loss_summary, acc_summary])
            dev_summary_dir = os.path.join(out_dir, "summaries", "dev")
            dev_summary_writer = tf.train.SummaryWriter(dev_summary_dir, sess.graph_def)
    
            # Checkpoint directory. Tensorflow assumes this directory already exists so we need to create it
            checkpoint_dir = os.path.abspath(os.path.join(out_dir, "checkpoints"))
            checkpoint_prefix = os.path.join(checkpoint_dir, "model")
            if not os.path.exists(checkpoint_dir):
                os.makedirs(checkpoint_dir)
    
            # Save additional model info
            codecs.open( os.path.join(checkpoint_dir, "max_sent_len"), "w", encoding='utf8').write( str(max_sequence_length) )
            codecs.open( os.path.join(checkpoint_dir, "vocab"),        "w", encoding='utf8').write( '\n'.join(vocabulary_inv) )
            codecs.open( os.path.join(checkpoint_dir, "label"),        "w", encoding='utf8').write( '\n'.join(onehot_label.values()) )
    
            saver = tf.train.Saver(tf.all_variables())
    
            # Initialize all variables
            sess.run(tf.initialize_all_variables())
    
            def train_step(x_batch, y_batch):
                """
                A single training step
                """
                feed_dict = {
                  cnn.input_x: x_batch,
                  cnn.input_y: y_batch,
                  cnn.dropout_keep_prob: FLAGS.dropout_keep_prob
                }
                _, step, summaries, loss, accuracy = sess.run(
                    [train_op, global_step, train_summary_op, cnn.loss, cnn.accuracy],
                    feed_dict)
                time_str = datetime.datetime.now().isoformat()
                print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
                train_summary_writer.add_summary(summaries, step)
    
            def dev_step(x_batch, y_batch, writer=None):
                """
                Evaluates model on a dev set
                """
                feed_dict = {
                  cnn.input_x: x_batch,
                  cnn.input_y: y_batch,
                  cnn.dropout_keep_prob: 1.0
                }
                step, summaries, loss, accuracy = sess.run(
                    [global_step, dev_summary_op, cnn.loss, cnn.accuracy],
                    feed_dict)
                time_str = datetime.datetime.now().isoformat()
                print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
                if writer:
                    writer.add_summary(summaries, step)
    
            # Generate batches
            batches = data_helpers.batch_iter(
                list(zip(x_train, y_train)), FLAGS.batch_size, FLAGS.num_epochs)
            # Training loop. For each batch...
            for batch in batches:
                x_batch, y_batch = zip(*batch)
                train_step(x_batch, y_batch)
                current_step = tf.train.global_step(sess, global_step)
                if current_step % FLAGS.evaluate_every == 0:
                    print("\nEvaluation:")
                    dev_step(x_dev, y_dev, writer=dev_summary_writer)
                    print("")
                if current_step % FLAGS.checkpoint_every == 0:
                    path = saver.save(sess, checkpoint_prefix, global_step=current_step)
                    print("Saved model checkpoint to {}\n".format(path))
    

    eval.py

    #! /usr/bin/env python
    
    import tensorflow as tf
    import numpy as np
    import os
    import time
    import datetime
    import data_helpers
    from text_cnn import TextCNN
    
    # Parameters
    # ==================================================
    
    # Eval Parameters
    tf.flags.DEFINE_string("test_data_path", "./data/test.txt", "Data path to evaluation")
    tf.flags.DEFINE_integer("batch_size", 64, "Batch Size (default: 64)")
    tf.flags.DEFINE_string("checkpoint_dir", "", "Checkpoint directory from training run")
    
    # Misc Parameters
    tf.flags.DEFINE_boolean("allow_soft_placement", True, "Allow device soft device placement")
    tf.flags.DEFINE_boolean("log_device_placement", False, "Log placement of ops on devices")
    
    
    FLAGS = tf.flags.FLAGS
    FLAGS._parse_flags()
    print("\nParameters:")
    for attr, value in sorted(FLAGS.__flags.items()):
        print("{}={}".format(attr.upper(), value))
    print("")
    
    # Load data. Load your own data here
    print("Loading data...")
    x_test, y_test, vocabulary, vocabulary_inv, onehot_label, max_sequence_length = data_helpers.load_data( FLAGS.test_data_path, FLAGS.checkpoint_dir )
    y_test = np.argmax(y_test, axis=1)
    print("Labels: %d: %s" % ( len(onehot_label), ','.join( sorted(onehot_label.values()) ) ) )
    print("Vocabulary size: {:d}".format(len(vocabulary)))
    print("Test set size {:d}".format(len(y_test)))
    
    print("\nEvaluating...\n")
    
    # Evaluation
    # ==================================================
    checkpoint_file = tf.train.latest_checkpoint(FLAGS.checkpoint_dir)
    graph = tf.Graph()
    with graph.as_default():
        session_conf = tf.ConfigProto(
          allow_soft_placement=FLAGS.allow_soft_placement,
          log_device_placement=FLAGS.log_device_placement)
        sess = tf.Session(config=session_conf)
        with sess.as_default():
            # Load the saved meta graph and restore variables
            print "FLAGS.checkpoint_dir %s" % FLAGS.checkpoint_dir
            print "checkpoint_file %s" % checkpoint_file
            saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
            saver.restore(sess, checkpoint_file)
    
            # Get the placeholders from the graph by name
            input_x = graph.get_operation_by_name("input_x").outputs[0]
            # input_y = graph.get_operation_by_name("input_y").outputs[0]
            dropout_keep_prob = graph.get_operation_by_name("dropout_keep_prob").outputs[0]
    
            # Tensors we want to evaluate
            predictions = graph.get_operation_by_name("output/predictions").outputs[0]
    
            # Generate batches for one epoch
            batches = data_helpers.batch_iter(x_test, FLAGS.batch_size, 1, shuffle=False)
    
            # Collect the predictions here
            all_predictions = []
    
            for x_test_batch in batches:
                batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})
                all_predictions = np.concatenate([all_predictions, batch_predictions])
    
    # Print accuracy
    print "y_test: " + str(y_test)
    print "all_predictions: " + str(all_predictions)
    correct_predictions = float(sum(all_predictions == y_test))
    print("Total number of test examples: {}".format(len(y_test)))
    print("Accuracy: {:g}".format(correct_predictions/float(len(y_test))))
    
    opened by MyeongjinHwang0 5
  • Process killed by the system

    Hello, I've tested your code on my own data of 20,000 examples, and the result is quite good. I have another data of 300,000 examples. Each example is a short sentence of approximate 20 words. When tested on this new dataset, the first 100 steps are fine. However, it stops on the evaluation step. The dev set has more than 60,000 examples, and the message when it stops is "Killed". I guess the reason is that the number of examples of the dev set is very large, so it consumes a lot of memory. Is that true? And how can I fix that? Thank you very much.

    opened by lenhhoxung86 5
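
    That explanation is plausible: dev_step in train.py feeds the entire dev set in one feed_dict (see the training script quoted in the multi-label issue above), so memory use grows with dev-set size. One workaround is to evaluate the dev set in mini-batches and aggregate the results; a sketch against that training code (sess and cnn come from the training script):

    def dev_eval(x_dev, y_dev, batch_size=64):
        """Evaluate the dev set in chunks instead of one giant feed_dict."""
        total_loss, correct = 0.0, 0.0
        for start in range(0, len(x_dev), batch_size):
            x_batch = x_dev[start:start + batch_size]
            y_batch = y_dev[start:start + batch_size]
            loss, accuracy = sess.run(
                [cnn.loss, cnn.accuracy],
                {cnn.input_x: x_batch, cnn.input_y: y_batch, cnn.dropout_keep_prob: 1.0})
            total_loss += loss * len(x_batch)
            correct += accuracy * len(x_batch)
        print("dev loss {:g}, dev acc {:g}".format(
            total_loss / len(x_dev), correct / len(x_dev)))
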
  • AttributeError: _parse_flags

    I've finished running train.py, but when I run eval.py I get this error. Can you help me with this?

    Traceback (most recent call last):
      File "./eval.py", line 31, in <module>
        FLAGS._parse_flags()
      File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\python\platform\flags.py", line 85, in __getattr__
        return wrapped.__getattr__(name)
      File "C:\Users\Admin\Anaconda3\lib\site-packages\absl\flags\_flagvalues.py", line 470, in __getattr__
        raise AttributeError(name)
    AttributeError: _parse_flags

    opened by CarbuncleOrigin 4
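
    This happens when the old tf.flags-based scripts are run against a newer TensorFlow, where FLAGS._parse_flags() was removed. A commonly reported adjustment for TF 1.5+ (treat the exact attribute as version-dependent) replaces the parameter-printing block in train.py/eval.py with:

    FLAGS = tf.flags.FLAGS
    print("\nParameters:")
    for attr, value in sorted(FLAGS.flag_values_dict().items()):
        print("{}={}".format(attr.upper(), value))
    print("")
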
  • AttributeError: module 'tensorflow_estimator.python.estimator.api._v1.estimator' has no attribute 'preprocessing'

    max_document_length = max([len(x.split(" ")) for x in x_text])
    vocab_processor = estimator.preprocessing.VocabularyProcessor(max_document_length)
    x = np.array(list(vocab_processor.fit_transform(x_text)))

    opened by xingyangfeng 0
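
    VocabularyProcessor came from tf.contrib.learn, which was removed in recent TensorFlow releases; it was never part of the estimator API, hence this error. One possible replacement (an illustration, not the repository's code) builds the word-id matrix with Keras preprocessing utilities; index 0 is reserved for padding, so the vocabulary size is len(word_index) + 1:

    import numpy as np
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    max_document_length = max(len(x.split(" ")) for x in x_text)

    tokenizer = Tokenizer(oov_token="<UNK>")
    tokenizer.fit_on_texts(x_text)
    x = np.array(pad_sequences(tokenizer.texts_to_sequences(x_text),
                               maxlen=max_document_length, padding="post"))
    vocab_size = len(tokenizer.word_index) + 1   # +1 for the padding index 0
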
  • tf.get_variable and tf.Variable

    https://github.com/dennybritz/cnn-text-classification-tf/blob/18762b459e21d9c70e5c242f8d43fc4e6db37a0d/text_cnn.py#L66

    why use tf.get_variable but not tf.Variable?

    opened by sunpenglv 0
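
    The short version: tf.Variable always creates a new variable (uniquifying the name if needed), while tf.get_variable looks the name up in the current variable scope, so it can reuse an existing variable and pick up scope-level initializers and regularizers. A tiny TF 1.x illustration:

    import tensorflow as tf

    with tf.variable_scope("layer"):
        w1 = tf.get_variable("W", shape=[3, 2])
    with tf.variable_scope("layer", reuse=True):
        w2 = tf.get_variable("W")                 # looked up by name: same variable as w1

    v1 = tf.Variable(tf.zeros([3, 2]), name="V")
    v2 = tf.Variable(tf.zeros([3, 2]), name="V")  # always a new variable; name becomes V_1

    print(w1 is w2)          # True
    print(v1.name, v2.name)  # V:0 V_1:0
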
  • What is the size of embedded_chars_expanded ?

    I'm trying to perform text classification using a CNN on my dataset, where the embedding has shape (?, 768). I'm following text_cnn.py from the conv2d function onwards. But before that, I need to convert my embedding into a 4D tensor. How should I change (?, 768) into a 4D tensor?

    In text_cnn.py:

    self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)
    self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

    What are the dimensions of embedded_chars_expanded?

    opened by arushi-08 0
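
    For reference, in text_cnn.py embedded_chars has shape [batch_size, sequence_length, embedding_size], and tf.expand_dims(..., -1) adds a channel axis, so embedded_chars_expanded has shape [batch_size, sequence_length, embedding_size, 1], which is what tf.nn.conv2d expects. A (?, 768) tensor has no sequence axis, so you have to decide what the convolution should slide over before adding the channel axis. A mechanical sketch (the 96 x 8 factorization is purely illustrative):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 768])             # flat (?, 768) embeddings

    # If 768 is really sequence_length * embedding_size (say 96 tokens of 8 dims),
    # recover those axes first, then append the channel axis for conv2d:
    x_4d = tf.expand_dims(tf.reshape(x, [-1, 96, 8]), -1)   # -> (?, 96, 8, 1)
    print(x_4d.shape)
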
  • Vocabulary size

    Hi, while I was working with your code (I've rewritten it in Keras) I noticed one small detail about vocab_size:

    vocab_size=len(vocab_processor.vocabulary_),
    

    https://github.com/dennybritz/cnn-text-classification-tf/blob/18762b459e21d9c70e5c242f8d43fc4e6db37a0d/train.py#L88

    Since we are padding sentences with zeros, aren't we supposed to add the padding word (index 0) to the vocabulary size? I think the original code from Yoon Kim does that:

    W = np.zeros(shape=(vocab_size+1, k), dtype='float32')     
    

    https://github.com/yoonkim/CNN_sentence/blob/23e0e1f7355705bb083043fda05c031b15acb38c/process_data.py#L55

    I know, it's probably a minor thing but wanted to ask to be 100% sure whether we should or should not add one to the vocabulary size.

    PS. You can find my code here, I tried to follow your solution as closely as possible. But despite reaching similar accuracy as your code, the training doesn't behave so nicely like yours does (yet...?). https://github.com/lubiluk/cnn-sentence/blob/master/cnn_sentence.ipynb

    opened by lubiluk 0
Owner
Denny Britz
High-school dropout. Ex Google Brain, Stanford, Berkeley. Into Startups, Deep Learning. Writing at wildml.com and dennybritz.com. Lived in Japan and Korea