TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform

Overview

TensorFlow Ranking

TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform. It contains the following components:

We envision that this library will provide a convenient open platform for hosting and advancing state-of-the-art ranking models based on deep learning techniques, and thus facilitate both academic research and industrial applications.

Tutorial Slides

TF-Ranking was presented at premier conferences in Information Retrieval, SIGIR 2019 and ICTIR 2019! The slides are available here.

Demos

We provide a demo, with no installation required, to get started with TF-Ranking. The demo, "Using sparse features and embeddings in TF-Ranking", runs in a Colaboratory notebook, an interactive Python environment, which you can run in Google Colab. It demonstrates how to:

  • Use sparse/embedding features
  • Process data in TFRecord format
  • Integrate TensorBoard in a Colab notebook, for the Estimator API

Also see Running Scripts for executable scripts.

Linux Installation

Stable Builds

To install the latest version from PyPI, run the following:

# Installing with the `--upgrade` flag ensures you'll get the latest version.
pip install --user --upgrade tensorflow_ranking

To force a Python 3-specific install, replace pip with pip3 in the above commands. For additional installation help, guidance installing prerequisites, and (optionally) setting up virtual environments, see the TensorFlow installation guide.

Note: TensorFlow is now included as a dependency of the TensorFlow Ranking package (in setup.py). If you wish to use a different version of TensorFlow (e.g., tensorflow-gpu), you may need to uninstall the existing version and then install your desired version:

$ pip uninstall tensorflow
$ pip install tensorflow-gpu

Installing from Source

  1. To build TensorFlow Ranking locally, you will need to install:

    • Bazel, an open source build tool.

      $ sudo apt-get update && sudo apt-get install bazel
    • Pip, a Python package manager.

      $ sudo apt-get install python-pip
    • VirtualEnv, a tool to create isolated Python environments.

      $ pip install --user virtualenv
  2. Clone the TensorFlow Ranking repository.

    $ git clone https://github.com/tensorflow/ranking.git
  3. Build the TensorFlow Ranking wheel file and store it in the /tmp/ranking_pip folder.

    $ cd ranking  # The folder which was cloned in Step 2.
    $ bazel build //tensorflow_ranking/tools/pip_package:build_pip_package
    $ bazel-bin/tensorflow_ranking/tools/pip_package/build_pip_package /tmp/ranking_pip
  4. Install the wheel package using pip. Test in a virtualenv to avoid clashes with any system dependencies.

    $ ~/.local/bin/virtualenv -p python3 /tmp/tfr
    $ source /tmp/tfr/bin/activate
    (tfr) $ pip install /tmp/ranking_pip/tensorflow_ranking*.whl

    In some cases, you may want to install a specific version of tensorflow, e.g., tensorflow-gpu or tensorflow==2.0.0. To do so, run either

    (tfr) $ pip uninstall tensorflow
    (tfr) $ pip install tensorflow==2.0.0

    or

    (tfr) $ pip uninstall tensorflow
    (tfr) $ pip install tensorflow-gpu
  5. Run all TensorFlow Ranking tests.

    (tfr) $ bazel test //tensorflow_ranking/...
  6. Import the TensorFlow Ranking package in Python (within the virtualenv).

    (tfr) $ python -c "import tensorflow_ranking"
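
As a complement to the import check in step 6, the following is a minimal sketch of a script that reports whether the packages can be located and which versions are installed. The helper names `is_importable` and `installed_version` are hypothetical and not part of TensorFlow Ranking; the sketch relies only on the Python standard library (Python 3.8+).

```python
import importlib.util
from importlib import metadata
from typing import Optional


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # (module name, PyPI distribution name) pairs to check.
    for module, dist in [("tensorflow", "tensorflow"),
                         ("tensorflow_ranking", "tensorflow-ranking")]:
        print(module, "importable:", is_importable(module),
              "version:", installed_version(dist))
```

If you replaced tensorflow with tensorflow-gpu as described above, the tensorflow module can still be importable while the "tensorflow" distribution reports no version, because the version is recorded under the distribution name that was installed.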

Running Scripts

For ease of experimentation, we also provide a TFRecord example and a LIBSVM example in the form of executable scripts. This is particularly useful for hyperparameter tuning, where the hyperparameters are supplied as flags to the script.

TFRecord Example

  1. Set up the data and directory.

    MODEL_DIR=/tmp/tf_record_model && \
    TRAIN=tensorflow_ranking/examples/data/train_elwc.tfrecord && \
    EVAL=tensorflow_ranking/examples/data/eval_elwc.tfrecord && \
    VOCAB=tensorflow_ranking/examples/data/vocab.txt
  2. Build and run.

    rm -rf $MODEL_DIR && \
    bazel build -c opt \
    tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary && \
    ./bazel-bin/tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary \
    --train_path=$TRAIN \
    --eval_path=$EVAL \
    --vocab_path=$VOCAB \
    --model_dir=$MODEL_DIR \
    --data_format=example_list_with_context

LIBSVM Example

  1. Set up the data and directory.

    OUTPUT_DIR=/tmp/libsvm && \
    TRAIN=tensorflow_ranking/examples/data/train.txt && \
    VALI=tensorflow_ranking/examples/data/vali.txt && \
    TEST=tensorflow_ranking/examples/data/test.txt
  2. Build and run.

    rm -rf $OUTPUT_DIR && \
    bazel build -c opt \
    tensorflow_ranking/examples/tf_ranking_libsvm_py_binary && \
    ./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary \
    --train_path=$TRAIN \
    --vali_path=$VALI \
    --test_path=$TEST \
    --output_dir=$OUTPUT_DIR \
    --num_features=136 \
    --num_train_steps=100
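
Because the hyperparameters are supplied as flags, a small driver script can generate one command line per configuration. The sketch below only builds and prints the commands; it reuses the flags shown in the LIBSVM example, while the grid values, the per-run output directory naming, and the helper `build_commands` are assumptions made for illustration.

```python
import itertools
import shlex

# Path of the binary produced by the "Build and run" step above.
BINARY = "./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary"

# Flags that stay fixed across runs; values mirror the LIBSVM example.
FIXED_FLAGS = {
    "train_path": "tensorflow_ranking/examples/data/train.txt",
    "vali_path": "tensorflow_ranking/examples/data/vali.txt",
    "test_path": "tensorflow_ranking/examples/data/test.txt",
    "num_features": 136,
}

# Hypothetical sweep grid; only a flag shown in the example above is varied.
GRID = {"num_train_steps": [100, 500, 1000]}


def build_commands(binary, fixed_flags, grid, output_root="/tmp/libsvm"):
    """Yield one argv list per grid point, each with its own output_dir."""
    names = sorted(grid)
    for values in itertools.product(*(grid[n] for n in names)):
        point = dict(zip(names, values))
        tag = "_".join(f"{n}-{v}" for n, v in point.items())
        flags = {**fixed_flags, **point, "output_dir": f"{output_root}/{tag}"}
        yield [binary] + [f"--{k}={v}" for k, v in flags.items()]


if __name__ == "__main__":
    for argv in build_commands(BINARY, FIXED_FLAGS, GRID):
        # Printed rather than executed; pass argv to subprocess.run to launch.
        print(shlex.join(argv))
```

Giving each run its own output directory keeps checkpoints and TensorBoard logs from different configurations apart.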

TensorBoard

The training results, such as loss and metrics, can be visualized using TensorBoard.

  1. (Optional) If you are working on a remote server, set up port forwarding with this command.

    $ ssh <remote-server> -L 8888:127.0.0.1:8888
  2. Install TensorBoard and invoke it with the following commands.

    (tfr) $ pip install tensorboard
    (tfr) $ tensorboard --logdir $OUTPUT_DIR

Jupyter Notebook

An example Jupyter notebook is available in tensorflow_ranking/examples/handling_sparse_features.ipynb.

  1. To run this notebook, first follow the installation steps to set up a virtualenv environment with the tensorflow_ranking package installed.

  2. Install Jupyter within the virtualenv.

    (tfr) $ pip install jupyter
  3. Start a Jupyter notebook instance on the remote server.

    (tfr) $ jupyter notebook tensorflow_ranking/examples/handling_sparse_features.ipynb \
            --NotebookApp.allow_origin='https://colab.research.google.com' \
            --port=8888
  4. (Optional) If you are working on a remote server, set up port forwarding with this command.

    $ ssh <remote-server> -L 8888:127.0.0.1:8888
  5. Run the notebook.

    • Open the Jupyter notebook on your local machine at http://localhost:8888/ and browse to the IPython notebook.

    • Alternatively, use a Colaboratory notebook via colab.research.google.com and open the notebook in the browser. Choose a local runtime and link to port 8888.

References

  • Rama Kumar Pasumarthi, Sebastian Bruch, Xuanhui Wang, Cheng Li, Michael Bendersky, Marc Najork, Jan Pfeifer, Nadav Golbandi, Rohan Anil, Stephan Wolf. TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank. KDD 2019.

  • Qingyao Ai, Xuanhui Wang, Sebastian Bruch, Nadav Golbandi, Michael Bendersky, Marc Najork. Learning Groupwise Scoring Functions Using Deep Neural Networks. ICTIR 2019.

  • Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. Learning to Rank with Selection Bias in Personal Search. SIGIR 2016.

  • Xuanhui Wang, Cheng Li, Nadav Golbandi, Mike Bendersky, Marc Najork. The LambdaLoss Framework for Ranking Metric Optimization. CIKM 2018.

Citation

If you use TensorFlow Ranking in your research and would like to cite it, we suggest you use the following citation:

@inproceedings{TensorflowRankingKDD2019,
   author = {Rama Kumar Pasumarthi and Sebastian Bruch and Xuanhui Wang and Cheng Li and Michael Bendersky and Marc Najork and Jan Pfeifer and Nadav Golbandi and Rohan Anil and Stephan Wolf},
   title = {TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank},
   booktitle = {Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
   year = {2019},
   pages = {2970--2978},
   location = {Anchorage, AK}
}
Comments
  • Parsing ELWC example to tensors consumable by a non-EIE model


    • versions:
      • tensorflow==2.3.2
      • tensorflow-ranking==0.3.0
    • description: When the example list contains only one item, the model graph can consume the tensors parsed by tfr's parser. However, when the number of items is >= 2, we get errors like the following from tf-serving (this example has an example list of 2 items):
    ERROR:
      Code: InvalidArgument
      Message: Input to reshape is a tensor with 4 values, but the requested shape has 2
             [[{{node dnn/input_from_feature_columns/input_layer/feature_A_indicator_1/Reshape}}]]
    

    Here feature_A is an indicator column with bucket size 2 and a default of 0. The model is DNNLinearCombinedEstimator.

    Looking at the source code, I noticed that the ELWC parser is actually the EIE parser. Is the error caused by the parser constructing the tensors in an EIE fashion, which our model cannot consume because it expects plain batched examples?

    I've tried changing the parser source code to alter the output tensors, but no luck. Any idea? Thank you!

    opened by edwardchu-studio 39
  • When using ELWC - OP_REQUIRES failed at example_parsing_ops.cc:91 : Invalid argument: Could not parse example input


    Hello Team,

    I trained a TF ranking model (basing my training on the following example: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/tf_ranking_tfrecord.py) and saved it using estimator.export_saved_model('my_model', serving_input_receiver_fn), the model was trained successfully & saved without any warnings/errors.

    I deployed the model to a local TensorFlow ModelServer and made a call to it over HTTP using cURL as described on https://www.tensorflow.org/tfx/serving/api_rest#request_format. Unfortunately I see the following error after making the request:

    W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at example_parsing_ops.cc:91 : Invalid argument: Could not parse example input, value: '
    
    ctx_f0
    
    { "error": "Could not parse example input, value: \'\n\035\n\021ctx_f0\022\010\022\006\n\004\000\000\340@\'\n\t [[{{node ParseExample/ParseExample}}]]" }
    

    I understand that this is a problem that may be related to serialization where my input was not properly serialized, but saving the model by generating serving_input_receiver_fn & using it produced no errors/warnings, so I am not sure where to start looking to resolve this.

    I am providing some details below, please let me know if you need more information.

    Details

    TF framework module versions
    • tensorflow-serving-api==2.0.0
    • tensorflow==2.0.0
    • tensorflow-ranking==0.2.0
    Some training parameters and functions
    • _CONTEXT_FEATURES = {'ctx_f0'}
    • _DOCUMENT_FEATURES = {'f0', 'f1', 'f2'}
    • _DATA_FORMAT = tfr.data.ELWC
    • _PADDING_LABEL = -1
    def example_feature_columns():
        spec = {}
        for f in _DOCUMENT_FEATURES:
            spec[f] = tf.feature_column.numeric_column(f, shape=(1,), default_value=_PADDING_LABEL, dtype=tf.float32)
        return spec
    
    def context_feature_columns():
        spec = {}
        for f in _CONTEXT_FEATURES:
            spec[f] = tf.feature_column.numeric_column(f, shape=(1,), default_value=_PADDING_LABEL, dtype=tf.float32)
        return spec
    
    Creating the serving_input_receiver_fn
    context_feature_spec = tf.feature_column.make_parse_example_spec(context_feature_columns().values())
    example_feature_spec = tf.feature_column.make_parse_example_spec(example_feature_columns().values())
    
    serving_input_receiver_fn = tfr.data.build_ranking_serving_input_receiver_fn(
            data_format=_DATA_FORMAT,
            list_size=20,
            default_batch_size=None,
            receiver_name="input_ranking_data",
            context_feature_spec=context_feature_spec,
            example_feature_spec=example_feature_spec)
    

    When making a REST API call to a local TensorFlow ModelServer using the following cURL request

    curl -H "Content-Type: application/json" \
    -X POST http://192.168.99.100:8501/v1/models/my_model/versions/1587842143:regress \
    -d '{"context": {"ctx_f0": 7.2}, "examples":[{"f0":[35.92],"f1":[5.258],"f2":[5.261]},{"f0":[82.337],"f1":[2.06],"f2":[2.068]}]}'
    

    The error is as follows:

    W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at example_parsing_ops.cc:91 : Invalid argument: Could not parse example input, value: '
    
    ctx_f0
    
    { "error": "Could not parse example input, value: \'\n\035\n\021ctx_f0\022\010\022\006\n\004\000\000\340@\'\n\t [[{{node ParseExample/ParseExample}}]]" }
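
One thing that may be worth checking for this kind of parse error: the TensorFlow Serving REST API expects binary string inputs, such as serialized protos, to be wrapped as {"b64": ...} objects. The following is a minimal sketch of that wrapping, assuming the exported signature takes a batch of serialized ELWC protos through the predict endpoint. The helper name `build_predict_body` and the placeholder bytes are hypothetical, and this is not confirmed to resolve the report above.

```python
import base64
import json


def build_predict_body(serialized_protos):
    """Wrap raw proto bytes as base64 objects for a REST predict request body."""
    instances = [
        {"b64": base64.b64encode(raw).decode("ascii")}
        for raw in serialized_protos
    ]
    return json.dumps({"instances": instances})


if __name__ == "__main__":
    # Placeholder bytes standing in for one serialized ELWC proto; real input
    # would come from serializing an actual ExampleListWithContext message.
    placeholder = b"\x0a\x03abc"
    print(build_predict_body([placeholder]))
```

The resulting string would be sent as the request body to the model's `:predict` URL rather than `:regress`.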
    
    opened by azagniotov 28
  • Model export


    Could you please give a code example of how to export a model for TensorFlow Serving? No luck with estimator.export_saved_model or tf.estimator.BestExporter. I must be doing something wrong with feature_spec.

    opened by nzhiltsov 21
  • Tensorflow_ranking with keras


    Hi there,

    I want to try Tensorflow_ranking with Keras. (https://github.com/tensorflow/ranking/tree/master/tensorflow_ranking/examples/keras)

    As a first step I have made a copy/paste from your Github repo, but I didn't find the TEST/TRAIN/VAL dataset. I tried to use this: tensorflow_ranking/examples/data but it does not work.

    My real goal is to build a model that can rank students in different classes based on their achievements (numeric and categorical features).

    My questions are:

    • where could I find data to test code from tensorflow_ranking/examples/keras repo?
    • how can I use tensorflow_ranking (with Keras) when the dataset is grouped? (Class by Class)

    I want to test your code in a Python IDE (Spyder), so I didn't install Bazel and other things.

    opened by korosig 20
  • How long does sparse-model run for ?


    Hi guys,

    I have been running the sparse-model for the last 5 days on a GPU server and I can't see anything in my models directory, meaning it did not even get to the first checkpoint. Has anyone had experience with this?

    I have many features (about 900K), but I still expected to be past the first checkpoint.

    Any hints would help here.

    Thanks.

    opened by mulangonando 17
  • how to predict?


    Urgent! I'm unable to obtain predictions using ranker.predict(test). Printing the predictions shows <generator object EstimatorV2.predict at 0x7ff0c8de5c50>.
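
The printed value is a Python generator: Estimator-style predict() yields results lazily, so nothing is computed until the generator is iterated. The sketch below shows the general pattern with a stand-in function, `fake_predict`, which is hypothetical and only mimics the lazy behaviour; it is not the TF-Ranking or Estimator API.

```python
import itertools


def fake_predict(inputs):
    """Stand-in for an Estimator-style predict(): yields one result per input."""
    for row in inputs:
        yield {"score": sum(row)}


predictions = fake_predict([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
print(predictions)  # Shows a generator object, not the predicted values.

# Take only the first two results, which is useful when the input is large.
first_two = list(itertools.islice(predictions, 2))

# Iterating again continues where the previous step stopped.
remaining = list(predictions)

print(first_two)
print(remaining)
```

With a real estimator, the same approach applies: iterate over the returned generator (or wrap it in list() for small inputs) to materialize the predictions.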

    opened by prakhar2811 17
  • Minimal example of prediction using TFR-BERT?


    Would it be possible to have a minimal example that performs prediction (ranking) on an unseen set using TFR-BERT?

    The current example ( tfrbert_example.py ) trains the model and evaluates performance on a development set, but it would be helpful to see a simple example of ranking on an unseen test set, and exporting these {query, document, rank} tuples to (for example) a plain text file for debugging.

    opened by PeterAJansen 15
  • Non-deterministic results in tf_ranking_tfrecord.py


    Hello,

    When I run tf_ranking_tfrecord.py, I get different nDCG metrics each time.

    I have already tried the following:

    • tf.random.set_seed(1)
    • tf.compat.v1.random.set_seed(1)
    • shuffle=False in _input_fn()
    • and I have not modified group_size=1

    Is it possible to make the results deterministic? And, if so, how?

    Thanks.

    opened by davidmosca 14
  • Keras model couldn't save


    platform: (CoLab) uname_result(system='Linux', node='1acc1ece6828', release='4.19.104+', version='#1 SMP Wed Feb 19 05:26:34 PST 2020', machine='x86_64', processor='x86_64')

    python==3.6.9 tensorflow==2.1.0 tensorflow-ranking==0.3.0

    TFRanking custom Keras layers can serialize and deserialize but the custom model failed saving to either saved_model or h5 format. Could there be any issues with get_config implementations?

    _LABEL_FEATURE = "relevance"
    _PADDING_LABEL = -1
    _SIZE="example_list_size"
    
    def create_feature_columns():
      sparse_column = tf.feature_column.categorical_column_with_hash_bucket(
          key="user_id", hash_bucket_size=100, dtype=tf.int64)
      query_embedding = tf.feature_column.embedding_column(
          categorical_column=sparse_column, dimension=20)
      context_feature_columns = {"user_id": query_embedding}
    
      sparse_column = tf.feature_column.categorical_column_with_hash_bucket(
          key="document_id", hash_bucket_size=100, dtype=tf.int64)
      document_embedding = tf.feature_column.embedding_column(
          categorical_column=sparse_column, dimension=20)
      example_feature_columns = {"document_id": document_embedding}
    
      return context_feature_columns, example_feature_columns
    
    context_feature_columns, example_feature_columns = create_feature_columns()
    
    # instantiate keras model
    network = tfr.keras.canned.DNNRankingNetwork(
        context_feature_columns=context_feature_columns,
        example_feature_columns=example_feature_columns,
        hidden_layer_dims=[1024, 512, 256],
        activation=tf.nn.relu,
        dropout=0.5)
    ranker = tfr.keras.model.create_keras_model(
        network=network,
        loss=tfr.keras.losses.get(tfr.losses.RankingLossKey.SOFTMAX_LOSS),
        metrics=tfr.keras.metrics.default_keras_metrics(),
        optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.05),
        size_feature_name=_SIZE)
    
    # save keras model to saved_model format
    ranker.save('tmp')
    
    INFO:tensorflow:Assets written to: tmp/assets
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-90-c6e3a44bcc2b> in <module>()
    ----> 1 ranker.save('tmp')
    
    11 frames
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/serialization.py in get_json_type(obj)
         70     return obj.__wrapped__
         71 
    ---> 72   raise TypeError('Not JSON Serializable:', obj)
    
    TypeError: ('Not JSON Serializable:', tf.int64)
    
    opened by yzhangswingman 14
  • ANTIQUE Dataset Tokenisation


    Hi TFR Team,

    I tried creating TFRecords from the raw ANTIQUE dataset, but I couldn't reproduce similar results. Accuracies are pretty low.

    Could you please share what kind of tokenisation you used to create the document and query tokens?

    Thanks.

    opened by divyakyatam 13
  • Error reading the tfrecords EIE


    Can someone help troubleshoot this error (I used the exact EIE converter provided in one of the issues here) :

    InvalidArgumentError: 2 root error(s) found.
      (0) Invalid argument: Name: , Feature: serialized_context (data type: string) is required but could not be found.
        [[{{node ParseExample/ParseExample}}]]
        [[IteratorGetNext]]
        [[transform/encoding_layer/qery_tokens_embedding/hash_table_Lookup/LookupTableFindV2/_275]]
      (1) Invalid argument: Name: , Feature: serialized_context (data type: string) is required but could not be found.
        [[{{node ParseExample/ParseExample}}]]
        [[IteratorGetNext]]
    0 successful operations.
    0 derived errors ignored.

    opened by mulangonando 13
  • is_label_valid in utils.py lacks support for integer targets


    Hello. In the function is_label_valid (line 76 of utils.py), the following code is given:

    def is_label_valid(labels):
      """Returns a boolean `Tensor` for label validity."""
      labels = tf.convert_to_tensor(value=labels)
      return tf.greater_equal(labels, 0.)
    

    This results in an error if the target is an integer, because 0. is a float and tf.greater_equal expects both arguments to be of the same type. This prevents support for integer targets/labels.

    opened by nmonette 0
  • Fix passing of keyword args to Dense layers in create_tower


    Current behavior: kwargs are passed to tf.keras.Sequential.add, so they are not passed on to tf.keras.layers.Dense as intended. For example, passing use_bias=False to create_tower throws an exception (the same applies to other Dense keyword arguments such as kernel_regularizer):

    Traceback (most recent call last):
      File "/Users/brussell/development/ranking/tensorflow_ranking/python/keras/layers_test.py", line 33, in test_create_tower_with_kwargs
        tower = layers.create_tower([3, 2, 1], 1, activation='relu', use_bias=False)
      File "/Users/brussell/development/ranking/tensorflow_ranking/python/keras/layers.py", line 70, in create_tower
        model.add(tf.keras.layers.Dense(units=layer_width), **kwargs)
      File "/usr/local/anaconda3/lib/python3.9/site-packages/tensorflow/python/trackable/base.py", line 205, in _method_wrapper
        result = method(self, *args, **kwargs)
      File "/usr/local/anaconda3/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 61, in error_handler
        return fn(*args, **kwargs)
    TypeError: add() got an unexpected keyword argument 'use_bias' test_create_tower_with_kwargs
    

    Fix: This PR fixes the behavior by shifting the closing paren of tf.keras.layers.Dense to the correct location.
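
The effect of the misplaced paren can be reproduced without Keras. The sketch below uses two stand-in classes, `FakeDense` and `FakeSequential`, which are hypothetical and only mimic the relevant call signatures; it is an illustration of the bug pattern, not the code from the PR.

```python
class FakeDense:
    """Stand-in for a layer class that accepts extra keyword arguments."""

    def __init__(self, units, **kwargs):
        self.units = units
        self.options = kwargs


class FakeSequential:
    """Stand-in for a container whose add() accepts only the layer itself."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)


def create_tower_buggy(widths, **kwargs):
    model = FakeSequential()
    for width in widths:
        # The closing paren comes too early: kwargs go to add(), not the layer.
        model.add(FakeDense(units=width), **kwargs)
    return model


def create_tower_fixed(widths, **kwargs):
    model = FakeSequential()
    for width in widths:
        # kwargs are forwarded to the layer constructor, as intended.
        model.add(FakeDense(units=width, **kwargs))
    return model


if __name__ == "__main__":
    try:
        create_tower_buggy([3, 2, 1], use_bias=False)
    except TypeError as err:
        print("buggy:", err)
    tower = create_tower_fixed([3, 2, 1], use_bias=False)
    print("fixed:", [layer.options for layer in tower.layers])
```

Note that the buggy variant works when no extra keyword arguments are given, which is likely why the problem only surfaces once a caller passes one.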

    opened by b4russell 1
  • package Bert


    How do I pack Bert into my textual data? I have query and document pairs, should I package only documents? I ask because of this definition:

    SEQ_LENGTH = 64
    context_feature_spec = {}
    example_feature_spec = {
        'input_word_ids': tf.io.FixedLenFeature(
            shape=(SEQ_LENGTH,), dtype=tf.int64,
            default_value=[7] * SEQ_LENGTH),
        'input_mask': tf.io.FixedLenFeature(
            shape=(SEQ_LENGTH,), dtype=tf.int64,
            default_value=[7] * SEQ_LENGTH),
        'input_type_ids': tf.io.FixedLenFeature(
            shape=(SEQ_LENGTH,), dtype=tf.int64,
            default_value=[7] * SEQ_LENGTH)}
    label_spec = (
        "relevance",
        tf.io.FixedLenFeature(shape=(1,), dtype=tf.int64, default_value=-1)
    )
    

    Where context_feature_spec = { }

    The ANTIQUE dataset already has the keys input_ids, input_mask, relevance, and segment_ids. How do I do this for my texts?

    In the ranking model there is a 'feature_name_mapping' that shows what I should provide and what the model expects.
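
For BERT-style rankers, a common convention is to pack each (query, document) pair into one sequence of the form [CLS] query [SEP] document [SEP], so that both texts are encoded together. The sketch below builds the three features named in the spec above from token ids that are assumed to come from a tokenizer. The special-token ids and the helper `pack_query_document` are assumptions for illustration, and whether the TFR-BERT example preprocesses ANTIQUE in exactly this way is not stated here.

```python
SEQ_LENGTH = 64
# Assumed special-token ids (the values used by the standard BERT uncased
# vocabulary); check your own vocab file before relying on them.
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0


def pack_query_document(query_ids, doc_ids, seq_length=SEQ_LENGTH):
    """Pack one (query, document) pair as [CLS] query [SEP] document [SEP]."""
    # Reserve three positions for [CLS] and the two [SEP] tokens; if the pair
    # does not fit, the document is truncated before the query.
    budget = seq_length - 3
    query_ids = list(query_ids)[:budget]
    doc_ids = list(doc_ids)[:budget - len(query_ids)]

    word_ids = [CLS_ID] + query_ids + [SEP_ID] + doc_ids + [SEP_ID]
    # Segment 0 covers [CLS] + query + [SEP]; segment 1 covers document + [SEP].
    type_ids = [0] * (len(query_ids) + 2) + [1] * (len(doc_ids) + 1)
    mask = [1] * len(word_ids)

    padding = seq_length - len(word_ids)
    return {
        "input_word_ids": word_ids + [PAD_ID] * padding,
        "input_mask": mask + [0] * padding,
        "input_type_ids": type_ids + [0] * padding,
    }


if __name__ == "__main__":
    # Hypothetical token ids for a short query and document.
    features = pack_query_document([5, 6], [7, 8, 9], seq_length=10)
    for name, values in features.items():
        print(name, values)
```

Each document in a list would get its own packed sequence with the same query, which is why context_feature_spec can stay empty in this setup.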

    opened by Tavares8 0
  • How to create tf records for a custom dataset and run TFR Ranking on it?


    The example at https://www.tensorflow.org/ranking/tutorials/ranking_dnn_distributed uses the TFRecords directly and doesn't show how the data was converted into them. As far as I can tell from the format, the TFRecord should contain just the query, document, and relevance, but the code crashes every time it runs. Please provide the features/format used to create the TFRecords for the ANTIQUE dataset.

    opened by SubhayanDas08 0
  • Multi-MLP document representation


    This is an RFC on setting up a multi-tower or multi-MLP document representation for TF-Ranking.

    If there are multiple logical groups of features, it would be good to be able to create a multi-MLP document representation.

    Example groups: QU, CTR, Distances, Popularities, etc. If each of those groups contains a few dozen features, it might be useful to have a multi-MLP doc representation such as:

    MLP(g1.[f1, f2, fn]) -> g1_h
    MLP(g2.[f1, f2, fn]) -> g2_h
    MLP(gN.[f1, f2, fn]) -> g3_h
    MLP(g1_h + g2_h + g3_h) -> doc_h

    Has anyone attempted this with TF ranking?

    CC: @ramakumar1729 @bendersky

    opened by vitalyli 0
Releases(v0.5.1)
  • v0.5.1(Oct 26, 2022)

    This is the 0.5.1 release of TensorFlow Ranking. We provide new ranking losses, metrics, layers, and a pipeline based on the latest research progress in Learning to Rank and Unbiased Ranking. We also update the API reference on www.tensorflow.org/ranking and on Github docs. The new changes include:

    Dependencies: The following packages will be installed as required when installing tensorflow-ranking.

    • tensorflow-serving-api>= 2.0.0, < 3.0.0
    • tensorflow>=2.7.0.

  • v0.5.0(Nov 16, 2021)

    This is the 0.5.0 release of TensorFlow Ranking. We provide a detailed overview, tutorial notebooks and API reference on www.tensorflow.org/ranking. The new changes are:

    • Move task.py and premade tfrbert_task.py to extension.
    • Remove RankingNetwork based tfr-bert example. The latest tfr-bert example using native Keras is available at tfrbert_antique_train.py.
    • Remove dependency on tf-models-official package to reduce install time. Users of tfr.ext.task or modules that depend on the above package will need to manually install it.
    • Updated all docstrings to be more detailed. Made several docstrings testable.
    • Add colab notebooks for quickstart tutorial and distributed ranking tutorial, also available on www.tensorflow.org/ranking.
    • Update strategy_utils to support parameter server strategy.
    • Add symmetric log1p to tfr.utils.
    • Remove references to Estimator/Feature Column related APIs in API reference.
  • v0.4.2(Jul 22, 2021)

    This is the 0.4.2 release of TensorFlow Ranking. The main change is the TFR-BERT module based on the Orbit framework in tf-models, which helps users write customized training loops. The new components are:

    TFR-BERT in Orbit

    • tfr.keras.task: This module contains the general boilerplate code to train TF-Ranking models in the Orbit framework. Particularly, there are:
      • RankingDataLoader, which parses an ELWC formatted data record into tensors
      • RankingTask, which specifies the behaviors of each training and evaluation step, as well as the training losses and evaluation metrics.
      • In addition, there are config data classes like RankingDataConfig and RankingTaskConfig to store configurations for above classes.
    • tfr.keras.premade.tfrbert_task: This module contains the TFR-BERT specification of the TF-Ranking Orbit task.
      • TFRBertDataLoader, which subclasses the RankingDataLoader and further specifies the feature specs of a TFR-BERT model.
      • TFRBertScorer and TFRBertModelBuilder, which define a model builder that can create a TFR-BERT ranking model as a Keras model, based on tf-models’ implementation of the BERT encoder.
      • TFRBertTask, which is a subclass of RankingTask. It defines the build_model behavior. It also defines the initialization method, which loads a pretrained BERT checkpoint to initialize the encoder, and provides a function to output the prediction results along with query ids and document ids.
      • In addition, there are config data classes like TFRBertDataConfig, TFRBertModelConfig and TFRBertConfig, which store configurations for the above classes.
    • examples/keras/tfrbert_antique_train.py: This file provides an example of training a TFR-BERT model on the Antique data set. There is also a .yaml file where users can specify parameter configurations.

    Dependencies: The following packages will be installed as required when installing tensorflow-ranking.

    • tf-models-official >= 2.5.0
    • tensorflow-serving-api>= 2.0.0, < 3.0.0
    • tensorflow==2.5.0.
  • v0.4.0(May 25, 2021)

    This release is one of the major releases for TF-Ranking. It provides full support to build and train a native Keras model for ranking problems. It includes necessary Keras layers for a ranking model, a module to construct a model in a flexible manner, and a pipeline to train a model with minimal boilerplate. To get started, please follow the example here. In addition, the new release adds RaggedTensor support in losses and metrics and we provide a handy example to show how to use it in a ranking model.

    The new components are listed below:

    • Keras Layers:

      • Use input packing for layer signatures for SavedModel compatibility.
      • create_tower function to create a feedforward neural network with batch normalization and dropout.
      • GAMLayer, a Keras layer which implements the neural generalized additive ranking model.
      • Update build method of DocumentInteractionAttention layer to ensure SavedModel is restored correctly.
    • ModelBuilder to build tf.keras.Model using Functional API:

      • AbstractModelBuilder class for users to inherit.
      • ModelBuilder class that wraps the boilerplate code to build tf.keras.Model for a ranking model.
      • InputCreator abstract class to implement create_inputs in ModelBuilder.
        • FeatureSpecInputCreator class to create inputs from feature_specs.
        • TypeSpecInputCreator class to create inputs from type_specs.
      • Preprocessor abstract class to implement preprocess in ModelBuilder.
        • PreprocessorWithSpec class to do Keras preprocessing or feature transformations with functions specified in Specs.
      • Scorer abstract class to implement score in ModelBuilder.
        • UnivariateScorer class to implement univariate scoring functions.
          • DNNScorer class to implement fully connected DNN univariate scoring.
          • GAMScorer class to implement feature based GAM univariate scoring.
    • Pipeline to wrap the boilerplate code for training:

      • AbstractDatasetBuilder abstract class to build and serve the dataset for training.
      • BaseDatasetBuilder class to build training and validation datasets and signatures for SavedModel from feature_specs.
        • SimpleDatasetBuilder class to build datasets with a single label feature spec.
        • MultiLabelDatasetBuilder class to build datasets for multi-task learning.
      • DatasetHparams dataclass to specify all hyper-parameters used in BaseDatasetBuilder class.
      • AbstractPipeline abstract class to train and validate the ranking tf.keras.Model.
      • ModelFitPipeline class to train the ranking models using model.fit() compatible with distribution strategies.
        • SimplePipeline class for single-task training.
        • MultiTaskPipeline class for multi-task training.
        • An example client to showcase training a deep neural network model with a distribution strategy using SimplePipeline.
      • PipelineHparams dataclass to specify all hyper-parameters used in ModelFitPipeline class.
      • strategy_utils helper module to support tf.distribute strategies.
    • RaggedTensor support in losses and metrics:

      • Losses in tfr.keras.losses and metrics in tfr.keras.metrics can act on tf.RaggedTensor inputs. To enable this, set the argument ragged=True when defining the loss and metric objects:
        • E.g.: loss = tfr.keras.losses.SoftmaxLoss(name='softmax_loss', ragged=True)
        • Pass the same argument to get to obtain losses and metrics that support ragged tensors: loss = tfr.keras.losses.get('softmax_loss', ragged=True)
        • An example client to showcase training a deep neural network model using model.fit() with ragged inputs and outputs.

    Dependencies: The following packages will be installed as required when installing tensorflow-ranking.

    • tf-models-official >= 2.5.0
    • tensorflow-serving-api>= 2.0.0, < 3.0.0
    • tensorflow==2.5.0.

  • v0.3.3(Feb 2, 2021)

    This is the 0.3.3 release of TensorFlow Ranking. It depends on tf-models-official >= 2.4.0 and tensorflow-serving-api >= 2.0.0, < 3.0.0. It is compatible with tensorflow==2.4.1. All of these packages will be installed as required packages when installing tensorflow-ranking.

    The main changes in this release are the Document Interaction Network (DIN) layer and layers for training Keras models using the Functional API. The new components are listed below:

    • Document Interaction Network: See paper.

      • Building Keras ranking models for DIN using Keras Preprocessing Layers.
        • Native Keras training: An example client to showcase such a model using model.fit().
        • Estimator based training: Another example client to showcase training a DIN model as an Estimator.
      • tfr.keras.layers.DocumentInteractionAttention: A keras layer to model cross-document interactions. Applies cross-document attention across valid examples identified using a mask.
    • Keras Layers: for easy transformation of context and example features and related utilities.

    • Others

      • tfr.keras.metrics.get(metric_key): Add a get metric factory for keras metrics.
      • Masking support in tfr.data: Added support for parsing a boolean mask tensor that marks the valid examples in each list, via the mask_feature_name argument in tfr.data._RankingDataParser and all associated input data parsing and serving_input_fn builders.
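
As a rough illustration of what such a mask encodes, the following plain-Python sketch pads variable-length example lists to a fixed list size and builds a boolean mask that is True for real examples and False for padding. The helper `pad_with_mask` and its arguments are hypothetical and are not part of tfr.data.

```python
def pad_with_mask(example_lists, list_size, pad_value=0.0):
    """Pad (or truncate) each list to list_size and mark the valid entries."""
    padded, mask = [], []
    for examples in example_lists:
        kept = examples[:list_size]
        n_pad = list_size - len(kept)
        padded.append(kept + [pad_value] * n_pad)
        mask.append([True] * len(kept) + [False] * n_pad)
    return padded, mask


# Two queries with 2 and 3 examples, padded to a common list size of 3.
features, mask = pad_with_mask([[0.3, 0.9], [0.1, 0.4, 0.7]], list_size=3)
# The number of valid examples per list can be recovered from the mask.
num_valid = [sum(row) for row in mask]
```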
  • v0.3.2(Aug 19, 2020)

    In the latest release of TensorFlow Ranking, v0.3.2, we introduce the TFR-BERT extension to better support BERT-based ranking models for text data. BERT is a pre-trained language representation model that has achieved substantial improvements on numerous NLP tasks. We find that fine-tuning BERT with ranking losses further improves ranking performance (arXiv). You can read detailed information about what is included in the TFR-BERT extension here. There is also an example showing how to use TFR-BERT here.

  • v0.3.1(Jun 1, 2020)

    This is the 0.3.1 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.2.0. Both will be installed as required packages when installing tensorflow-ranking.

    The main changes in this release are a canned Neural RankGAM estimator, canned DNN estimators, canned Neural RankGAM Keras models, and their examples. The new components are:

  • v0.3.0(Mar 24, 2020)

    This is the 0.3.0 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.1.0. Both will be installed as required packages when installing tensorflow-ranking.

    The main changes in this release are related to the DNN Estimator Builder and Keras APIs.

    A DNN Estimator Builder is available at tfr.estimator.make_dnn_ranking_estimator().

    For Keras, we provide an example that showcases the use of Keras APIs to build ranking models, and documentation with step-by-step instructions outlining the Keras user journey.

    The new Keras components are:

  • v0.2.3(Mar 6, 2020)

    This is the 0.2.3 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.1.0. Both will be installed as required packages when installing tensorflow-ranking.

    The main changes in this release are:

    • Added an EstimatorBuilder class to encapsulate boilerplate code when constructing a TF-Ranking model Estimator. Clients can access it via tfr.estimator.EstimatorBuilder.
    • Added a RankingPipeline class to hide the boilerplate code for reading train and eval data, defining train and eval specs, building datasets, and setting up exporting strategies. With this, clients can construct a RankingPipeline object using tfr.ext.pipeline.RankingPipeline and then call train_and_eval() to run the pipeline.
    • Provided an example to demo the use of tfr.ext.pipeline.RankingPipeline.
  • v0.2.2(Jan 17, 2020)

    This is the 0.2.2 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.1.0. Both will be installed as required packages when installing tensorflow-ranking. The main changes in this release are:

    • Fixed metric computation to include lists without any relevant examples.
    • Updated demo code to be TF 2.1.0 compatible.
    • Replaced deprecated dataset.output_types with tf.compat.v1.data.get_output_types(dataset).
  • v0.2.1(Dec 18, 2019)

    This is the 0.2.1 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.0.0 and is fully compatible with tensorflow==2.0.0. Both will be installed as required packages when installing tensorflow-ranking.

    The main changes in this release are:

    • Updated demo code to use Antique data in ELWC format.
    • Updated tutorial script to demonstrate using weights in metrics and losses.
    • Removed LIBSVM generator from tfr.data and updated the docs.
    • Made the gain and discount parameters in the definition of NDCG configurable.
    • Added MAP as a ranking metric.
    • Added a topn parameter to MRR metric.
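
The metric changes above can be sketched in plain Python using textbook definitions. This is not TF-Ranking's code: the function names, the default gain (2^y - 1) and discount (1 / log2(rank + 1)), and the choice to normalize average precision by the total number of relevant documents are assumptions for this example.

```python
import math


def _ranked_labels(labels, scores, topn=None):
    """Labels reordered by descending score, optionally truncated to topn."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked = [labels[i] for i in order]
    return ranked if topn is None else ranked[:topn]


def dcg(ranked_labels, gain_fn, discount_fn):
    """Discounted cumulative gain with pluggable gain and discount."""
    return sum(gain_fn(y) * discount_fn(rank)
               for rank, y in enumerate(ranked_labels, start=1))


def ndcg(labels, scores, topn=None,
         gain_fn=lambda y: 2.0 ** y - 1.0,
         discount_fn=lambda rank: 1.0 / math.log2(rank + 1.0)):
    """NDCG whose gain and discount functions are configurable."""
    ideal = sorted(labels, reverse=True)
    ideal = ideal if topn is None else ideal[:topn]
    ideal_dcg = dcg(ideal, gain_fn, discount_fn)
    if ideal_dcg == 0.0:
        return 0.0  # no relevant documents in this list
    return dcg(_ranked_labels(labels, scores, topn), gain_fn, discount_fn) / ideal_dcg


def mrr(labels, scores, topn=None):
    """Reciprocal rank of the first relevant document within the top n."""
    for rank, y in enumerate(_ranked_labels(labels, scores, topn), start=1):
        if y > 0:
            return 1.0 / rank
    return 0.0


def average_precision(labels, scores, topn=None):
    """Average of precision@k over the ranks k that hold a relevant document."""
    hits, precision_sum = 0, 0.0
    for rank, y in enumerate(_ranked_labels(labels, scores, topn), start=1):
        if y > 0:
            hits += 1
            precision_sum += hits / rank
    total_relevant = sum(1 for y in labels if y > 0)
    return precision_sum / total_relevant if total_relevant else 0.0


labels = [0.0, 1.0, 2.0]
scores = [3.0, 2.0, 1.0]  # the model ranks the irrelevant document first
linear_ndcg = ndcg(labels, scores,
                   gain_fn=lambda y: y, discount_fn=lambda rank: 1.0 / rank)
```

Averaging `average_precision` over queries gives MAP; passing `topn=1` to `mrr` on the data above returns 0.0 because the first relevant document sits at rank 2.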
  • v0.2.0(Oct 22, 2019)

    This is the 0.2.0 release of TensorFlow Ranking. It depends on tensorflow-serving-api>=2.0.0 and is fully compatible with tensorflow==2.0.0. Both will be installed as required packages when installing tensorflow-ranking.

    No new functionality has been added compared with v0.1.6. This release marks a milestone: from here on, development will be based on TensorFlow 2.0.

  • v0.1.6(Oct 22, 2019)

    This is the 0.1.6 release of TensorFlow Ranking. We add the dependency to tensorflow-serving-api to use tensorflow.serving.ExampleListWithContext as our input data format. It is tested and stable against TensorFlow 1.15.0 and TensorFlow 2.0.0. The main changes in this release are:

    • Support tensorflow.serving.ExampleListWithContext as our input data format (commit). This is a more user-friendly format than the ExampleInExample one.
    • Add a demo script for data stored in TFRecord. The stored format can be ExampleListWithContext or another format defined in data.py.
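
For orientation, an ExampleListWithContext record groups one context example (query-level features) with a list of per-document examples. The dataclass below is only a plain-Python stand-in for that shape, not the protobuf class itself; the class name and the feature names (`query_tokens`, `document_tokens`, `relevance`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Stand-in for the feature map of a tf.train.Example: feature name -> values.
Features = Dict[str, list]


@dataclass
class ExampleListWithContextSketch:
    """One query: shared context features plus one feature map per document."""
    context: Features = field(default_factory=dict)
    examples: List[Features] = field(default_factory=list)


elwc = ExampleListWithContextSketch(
    context={"query_tokens": ["learning", "to", "rank"]},
    examples=[
        {"document_tokens": ["tf", "ranking"], "relevance": [1]},
        {"document_tokens": ["unrelated"], "relevance": [0]},
    ],
)

# Per-document labels are read from the examples; the context is shared.
labels = [example["relevance"][0] for example in elwc.examples]
```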
  • v0.1.5(Sep 24, 2019)

    This is the 0.1.5 release of TensorFlow Ranking. It is tested and stable against TensorFlow version 1.14.0 and TensorFlow version 2.0 RC0. The main changes in this release are:

    • Support for Multi-Task Learning and Multi-Objective Learning (Issue #85).
    • Deprecate the input_size argument for tfr.feature.encode_listwise_features and infer it automatically in the function.
    • Fix the weighted MRR computation for doc-level weights.
  • v0.1.4(Sep 5, 2019)

    This is the 0.1.4 release of TensorFlow Ranking. It is tested and stable against TensorFlow version 1.14.0 and TensorFlow version 2.0 RC0. The main changes in this release are:

    • Documentation for APIs. A list of symbols/operations is available here.
    • Demo for using sparse and embedded features on ANTIQUE dataset.
    • Example for prediction using ranking estimator in demo code.
    • Code and test cases are fully TF2.0 RC0 compatible.
    • Updated tfr.utils.sort_by_scores to break ties.
    • Added ApproxMRR loss function.
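
To show why tie-breaking matters when sorting by score, here is a small plain-Python sketch that orders items by descending score and resolves equal scores randomly, so that no item is systematically favored by its input position. It is an illustration under stated assumptions, not the code of tfr.utils.sort_by_scores; the function name and the `seed` parameter are hypothetical.

```python
import random


def sort_by_scores_sketch(scores, items, seed=None):
    """Sort items by descending score; equal scores are ordered randomly."""
    rng = random.Random(seed)
    # The random second key only matters when two scores are exactly equal.
    keyed = [(-score, rng.random(), item) for score, item in zip(scores, items)]
    keyed.sort(key=lambda entry: (entry[0], entry[1]))
    return [item for _, _, item in keyed]


# "b" and "c" tie for the top score; their relative order depends on the seed.
ranking = sort_by_scores_sketch([1.0, 3.0, 3.0, 2.0], ["a", "b", "c", "d"], seed=0)
```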

    Announcement: A hands-on tutorial for TF-Ranking, with relevant theoretical background will be presented on Oct 2 at ICTIR 2019, hosted in Santa Clara, CA. Please consider attending!

  • v0.1.3(Jun 20, 2019)

    This is the 0.1.3 release of TensorFlow Ranking. It is tested and stable against TensorFlow version 1.14.0. The main changes in this release are:

    • Introduced an ExampleInExample data format.
    • Introduced a factory method to build a tf.data.Dataset from different data formats.
    • Introduced a factory method to build serving input receiver functions for different data formats.
    • Refactored the main modules to be object-oriented to increase the code extensibility.