A TensorFlow implementation of DeepMind's WaveNet paper

Igor Babuschkin

Last update: Dec 28, 2022

Overview

This is a TensorFlow implementation of the WaveNet generative neural network architecture for audio generation.

The WaveNet neural network architecture directly generates a raw audio waveform, showing excellent results in text-to-speech and general audio generation (see the DeepMind blog post and paper for details).

The network models the conditional probability of the next sample in the audio waveform, given all previous samples and possibly additional parameters.

After an audio preprocessing step, the input waveform is quantized to a fixed integer range. The integer amplitudes are then one-hot encoded to produce a tensor of shape (num_samples, num_channels).
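
The quantization and one-hot steps can be sketched in NumPy as follows. This is a minimal illustration, not the code from model.py: it assumes mu-law companding with 256 channels (the scheme described in the WaveNet paper), and the function names mu_law_quantize and one_hot are hypothetical.

```python
import numpy as np

def mu_law_quantize(audio, num_channels=256):
    """Map float samples in [-1, 1] to integers in [0, num_channels - 1]."""
    mu = num_channels - 1
    audio = np.clip(audio, -1.0, 1.0)
    # Compand: compress large amplitudes, keep resolution near zero.
    companded = np.sign(audio) * np.log1p(mu * np.abs(audio)) / np.log1p(mu)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer.
    return ((companded + 1.0) / 2.0 * mu + 0.5).astype(np.int32)

def one_hot(indices, num_channels=256):
    """Return an array of shape (num_samples, num_channels)."""
    return np.eye(num_channels, dtype=np.float32)[indices]

waveform = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))
quantized = mu_law_quantize(waveform)
encoded = one_hot(quantized)
print(encoded.shape)  # (100, 256)
```

Each row of encoded contains a single 1 at the index of the quantized amplitude.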

A convolutional layer that only accesses the current and previous inputs then reduces the channel dimension.

The core of the network is constructed as a stack of causal dilated layers, each of which is a dilated convolution (convolution with holes), which only accesses the current and past audio samples.
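
To make "only accesses the current and past audio samples" concrete, here is a small NumPy sketch of a causal dilated convolution. It is an illustration under assumptions (zero padding on the left, a hypothetical causal_dilated_conv helper), not the TensorFlow implementation used in this repository.

```python
import numpy as np

def causal_dilated_conv(x, weights, dilation):
    """x: (num_samples, in_channels); weights: (filter_width, in_channels, out_channels).

    Output sample t depends only on x[t], x[t - dilation], x[t - 2 * dilation], ...
    """
    filter_width = weights.shape[0]
    pad = (filter_width - 1) * dilation
    # Left-pad with zeros so no output sample can see the future.
    padded = np.concatenate([np.zeros((pad, x.shape[1])), x], axis=0)
    out = np.zeros((x.shape[0], weights.shape[2]))
    for k in range(filter_width):
        # Tap k looks (filter_width - 1 - k) * dilation steps into the past.
        start = k * dilation
        out += padded[start:start + x.shape[0]] @ weights[k]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4))
w = rng.normal(size=(2, 4, 4))
y = causal_dilated_conv(x, w, dilation=4)

# Changing samples from t = 10 onward must not affect outputs before t = 10.
x_changed = x.copy()
x_changed[10:] += 1.0
y_changed = causal_dilated_conv(x_changed, w, dilation=4)
print(np.allclose(y[:10], y_changed[:10]))  # True
```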

The outputs of all layers are combined and extended back to the original number of channels by a series of dense postprocessing layers, followed by a softmax function to transform the outputs into a categorical distribution.

The loss function is the cross-entropy between the output for each timestep and the input at the next timestep.
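
The one-step shift between outputs and targets is easy to get wrong, so here is a NumPy sketch of the loss. It assumes the network returns unnormalized logits of shape (num_samples, num_channels); the function name next_sample_cross_entropy is hypothetical and the actual computation lives in model.py.

```python
import numpy as np

def next_sample_cross_entropy(logits, targets):
    """logits: (num_samples, num_channels) network outputs before softmax.
    targets: (num_samples,) quantized input samples.

    The output at timestep t is scored against the input at timestep t + 1.
    """
    predictions = logits[:-1]   # outputs for t = 0 .. T-2
    labels = targets[1:]        # inputs at t = 1 .. T-1
    # Numerically stable log-softmax.
    shifted = predictions - predictions.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# With uniform logits, the loss equals log(num_channels).
uniform_loss = next_sample_cross_entropy(np.zeros((10, 256)), np.arange(10))
print(np.isclose(uniform_loss, np.log(256)))  # True
```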

In this repository, the network implementation can be found in model.py.

Requirements

TensorFlow needs to be installed before running the training script. Code is tested on TensorFlow version 1.0.1 for Python 2.7 and Python 3.5.

In addition, librosa must be installed for reading and writing audio.

To install the required python packages, run

pip install -r requirements.txt

For GPU support, use

pip install -r requirements_gpu.txt

Training the network

You can use any corpus containing .wav files. We've mainly used the VCTK corpus (around 10.4GB, Alternative host) so far.

In order to train the network, execute

python train.py --data_dir=corpus

where corpus is a directory containing .wav files. The script will recursively collect all .wav files in the directory.

You can see documentation on each of the training settings by running

python train.py --help

You can find the configuration of the model parameters in wavenet_params.json. These need to stay the same between training and generation.

Global Conditioning

Global conditioning refers to modifying the model such that the id of one of a set of mutually exclusive categories is specified during training and during generation of a .wav file. In the case of the VCTK corpus, this id is the integer id of the speaker, of which there are over a hundred. This allows (indeed requires) that a speaker id be specified at generation time to select which of the speakers the model should mimic. For more details see the paper or source code.

Training with Global Conditioning

The instructions above for training refer to training without global conditioning. To train with global conditioning, specify command-line arguments as follows:

python train.py --data_dir=corpus --gc_channels=32

The --gc_channels argument does two things:

It tells the train.py script that it should build a model that includes global conditioning.
It specifies the size of the embedding vector that is looked up based on the id of the speaker.

The global conditioning logic in train.py and audio_reader.py is "hard-wired" to the VCTK corpus at the moment, in that it expects to be able to determine the speaker id from the file naming pattern used in VCTK, but it can easily be modified.
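
As an illustration of what determining the speaker id from the file name can look like, here is a small sketch. It assumes VCTK files are named like p280_001.wav (speaker id, then recording id); the pattern and the helper speaker_id_from_filename are hypothetical and may differ from what audio_reader.py does.

```python
import re

# Assumed VCTK-style naming: p<speaker id>_<recording id>.wav
VCTK_PATTERN = re.compile(r'p([0-9]+)_([0-9]+)\.wav')

def speaker_id_from_filename(filename):
    """Return the integer speaker id, or None if the name does not match."""
    match = VCTK_PATTERN.search(filename)
    if match is None:
        return None
    return int(match.group(1))

print(speaker_id_from_filename('corpus/p280/p280_001.wav'))  # 280
print(speaker_id_from_filename('other_corpus/clip_17.wav'))  # None
```

Adapting global conditioning to another corpus would mostly mean swapping this pattern for one that matches that corpus's naming scheme.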

Generating audio

Example output generated by @jyegerlehner based on speaker 280 from the VCTK corpus.

You can use the generate.py script to generate audio using a previously trained model.

Generating without Global Conditioning

Run

python generate.py --samples 16000 logdir/train/2017-02-13T16-45-34/model.ckpt-80000

where logdir/train/2017-02-13T16-45-34/model.ckpt-80000 needs to be a path to a previously saved model (without extension). The --samples parameter specifies how many audio samples you would like to generate (16000 corresponds to 1 second by default).

The generated waveform can be played back using TensorBoard, or stored as a .wav file by using the --wav_out_path parameter:

python generate.py --wav_out_path=generated.wav --samples 16000 logdir/train/2017-02-13T16-45-34/model.ckpt-80000

Passing --save_every in addition to --wav_out_path will save the in-progress wav file every n samples.

python generate.py --wav_out_path=generated.wav --save_every 2000 --samples 16000 logdir/train/2017-02-13T16-45-34/model.ckpt-80000

Fast generation is enabled by default. It uses the implementation from the Fast Wavenet repository. You can follow the link for an explanation of how it works. This reduces the time needed to generate samples to a few minutes.
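
As a rough illustration of why caching helps, the sketch below shows a single scalar dilated layer with filter width 2. This is not the Fast Wavenet code; the names naive_layer and CachedLayer are hypothetical. The cached version keeps a queue of the last `dilation` inputs, so each new sample needs one step instead of recomputing over the whole history.

```python
import numpy as np
from collections import deque

def naive_layer(x, w_past, w_now, dilation):
    """Recompute y[t] = w_past * x[t - dilation] + w_now * x[t] for all t."""
    y = np.zeros_like(x)
    for t in range(len(x)):
        past = x[t - dilation] if t >= dilation else 0.0
        y[t] = w_past * past + w_now * x[t]
    return y

class CachedLayer:
    """Produce one output per step, remembering the last `dilation` inputs."""

    def __init__(self, w_past, w_now, dilation):
        self.w_past, self.w_now = w_past, w_now
        # The queue starts filled with zeros, matching zero padding.
        self.queue = deque([0.0] * dilation, maxlen=dilation)

    def step(self, x_t):
        past = self.queue[0]      # the input from `dilation` steps ago
        self.queue.append(x_t)    # maxlen drops the oldest entry automatically
        return self.w_past * past + self.w_now * x_t

x = np.arange(1.0, 9.0)
layer = CachedLayer(w_past=0.5, w_now=2.0, dilation=4)
incremental = np.array([layer.step(v) for v in x])
print(np.allclose(incremental, naive_layer(x, 0.5, 2.0, dilation=4)))  # True
```

In a full stack, each layer would hold its own queue sized by its dilation, so generating one sample costs one step per layer.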

To disable fast generation:

python generate.py --samples 16000 logdir/train/2017-02-13T16-45-34/model.ckpt-80000 --fast_generation=false

Generating with Global Conditioning

Generate from a model incorporating global conditioning as follows:

python generate.py --samples 16000  --wav_out_path speaker311.wav --gc_channels=32 --gc_cardinality=377 --gc_id=311 logdir/train/2017-02-13T16-45-34/model.ckpt-80000

Where:

--gc_channels=32 specifies that 32 is the size of the embedding vector, and must match what was specified when training.

--gc_cardinality=377 is required as 376 is the largest id of a speaker in the VCTK corpus. If some other corpus is used, then this number should match what is automatically determined and printed out by the train.py script at training time.

--gc_id=311 specifies the id of the speaker (here, speaker 311) for which a sample is to be generated.
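
The relationship between the three flags can be sketched as an embedding-table lookup. This is a NumPy illustration under assumptions, not the repository's TensorFlow code: the table here is random, whereas a trained model would hold learned values, and embed_speaker is a hypothetical helper.

```python
import numpy as np

gc_cardinality = 377   # number of rows: ids 0 .. 376
gc_channels = 32       # size of each embedding vector

rng = np.random.default_rng(0)
# In a trained model this table holds learned values; random here.
embedding_table = rng.normal(size=(gc_cardinality, gc_channels))

def embed_speaker(gc_id):
    """Look up the embedding vector for one speaker id."""
    if not 0 <= gc_id < gc_cardinality:
        raise ValueError('gc_id must be in [0, %d)' % gc_cardinality)
    return embedding_table[gc_id]

print(embed_speaker(311).shape)  # (32,)
```

Because ids index rows directly, the table needs one more row than the largest id, which is why 376 as the largest speaker id leads to a cardinality of 377.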

Running tests

Install the test requirements

pip install -r requirements_test.txt

Run the test suite

./ci/test.sh

Missing features

Currently there is no local conditioning on extra information which would allow context stacks or controlling what speech is generated.

Related projects

tex-wavenet, a WaveNet for text generation.
image-wavenet, a WaveNet for image generation.

Comments

Global conditioning

There is a unit test in test_model.py.

The AudioReader in this code does not send back the speaker id, so it's not ready for use quite yet.

There are other global conditioning implementations in flight. Related discussions here. Let's try to find the most expeditious way of getting something merged from the various implementations. We're past-due for it IMO.

opened by jyegerlehner 30

Training error in main.py

Getting the following error when I try to train the network - any idea what this is?

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:01:00.0
Total memory: 12.00GiB
Free memory: 11.53GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
  File "main.py", line 129, in <module>
    main()
  File "main.py", line 83, in main
    loss = net.loss(audio_batch)
  File "/home/seth/Development/tensorflow-wavenet/wavenet.py", line 97, in loss
    raw_output = self._create_network(encoded)
  File "/home/seth/Development/tensorflow-wavenet/wavenet.py", line 67, in _create_network
    dilation=dilation)
  File "/home/seth/Development/tensorflow-wavenet/wavenet.py", line 23, in _create_dilation_layer
    name="conv_f")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 168, in atrous_conv2d
    in_height = int(value_shape[1])
TypeError: __int__ returned non-int (type NoneType)

opened by polyrhythmatic 29

librosa.util.exceptions.ParameterError: Buffer is too short (n=1427) for frame_length=2048

Exception in thread Thread-13:
Traceback (most recent call last):
  File "//anaconda/envs/py35/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "//anaconda/envs/py35/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/fs8b/Documents/tech/tensorflow-wavenet-master/wavenet/audio_reader.py", line 162, in thread_main
    audio = trim_silence(audio[:, 0], self.silence_threshold)
  File "/Users/fs8b/Documents/tech/tensorflow-wavenet-master/wavenet/audio_reader.py", line 66, in trim_silence
    energy = librosa.feature.rmse(audio)
  File "//anaconda/envs/py35/lib/python3.5/site-packages/librosa/feature/spectral.py", line 575, in rmse
    hop_length=hop_length)
  File "//anaconda/envs/py35/lib/python3.5/site-packages/librosa/util/utils.py", line 82, in frame
    ' for frame_length={:d}'.format(len(y), frame_length))
librosa.util.exceptions.ParameterError: Buffer is too short (n=1427) for frame_length=2048

After training on the entire VCTK-Corpus, this is the error that kills execution. Has anybody else encountered this?

Still cannot even train successfully, let alone generate audio

opened by Jovonni 23

Can't generate samples from checkpoint file

When I try to run generate.py per the readme, I get this:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
Restoring model from model.ckpt-250
Traceback (most recent call last):
  File "generate.py", line 86, in <module>
    main()
  File "generate.py", line 66, in main
    feed_dict={samples: window})
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 710, in run
    run_metadata_ptr)
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 908, in _run
    feed_dict_string, options, run_metadata)
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 958, in _do_run
    target_list, options, run_metadata)
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 978, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Output dimensions must be positive
         [[Node: wavenet/dilated_stack/layer1/conv_filter/BatchToSpace = BatchToSpace[T=DT_FLOAT, block_size=2, _device="/job:localhost/replica:0/task:0/gpu:0"](wavenet/dilated_stack/layer1/conv_filter, wavenet/dilated_stack/layer1/conv_filter/BatchToSpace/crops)]]
Caused by op u'wavenet/dilated_stack/layer1/conv_filter/BatchToSpace', defined at:
  File "generate.py", line 86, in <module>
    main()
  File "generate.py", line 51, in main
    next_sample = net.predict_proba(samples)
  File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 154, in predict_proba
    raw_output = self._create_network(encoded)
  File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 112, in _create_network
    self.dilation_channels)
  File "/home/ubuntu/jupyter_base/project/tensorflow-wavenet/wavenet.py", line 51, in _create_dilation_layer
    name="conv_filter")
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 228, in atrous_conv2d
    block_size=rate)
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 308, in batch_to_space
    block_size=block_size, name=name)
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
    op_def=op_def)
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/ubuntu/jupyter_base/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1239, in __init__
    self._traceback = _extract_stack()

bug

opened by maxhodak 22

My implementation of WaveNet for text generation (based on this repository)

Hi, friends.

As I don't have a good GPU to help you directly with this, I have used the work in this repository as a baseline to develop a WaveNet text generator (self-generator): https://github.com/Zeta36/tensorflow-tex-wavenet.

In summary: I use the WaveNet model as a text generator. I feed the model raw text data (characters) instead of raw audio files, and once the network is trained, I use the learned conditional probability to generate samples (characters) in a self-generating process.

Only printable ASCII characters (Dec. 0 up to 255) are supported right now.

Results

And pretty interesting results are reached!! Given enough text and training, the model is able to memorize the probability distribution of characters (in a language), and later generate even very similar text!!

For example, using the Penn Tree Bank (PTB) dataset, and after only 15000 steps of training (with a small parameter configuration), this was the self-generated output (the final loss was around 1.1):

"Prediction is: 300-servenns on the divide mushin attore and operations losers nis called him for investment it was with as pursicularly federal and sotheby d. reported firsts truckhe of the guarantees as paining at the available ransions i 'm new york for basicane as a facerement of its a set to the u.s. spected on install death about in the little there have a $ N million or N N bilot in closing is of a trading a congress of society or N cents for policy half feeling the does n't people of general and the crafted ended yesterday still also arjas trading an effectors that a can singaes about N bound who that mestituty was below for which unrecontimer 's have day simple d. frisons already earnings on the annual says had minority four-$ N sance for an advised in reclution by from $ N million morris selpiculations the not year break government these up why thief east down for his hobses weakness as equiped also plan amr. him loss appealle they operation after and the monthly spendings soa $ N million from cansident third-quarter loan was N pressure of new and the intended up he header because in luly of tept. N million crowd up lowers were to passed N while provision according to and canada said the 1980s defense reporters who west scheduled is a volume at broke also and national leader than N years on the sharing N million pro-m was our american piconmentalist profited himses but the measures from N in N N of social only announcistoner corp. say to average u.j. dey said he crew is vice phick-bar creating the drives will shares of customer with welm reporters involved in the continues after power good operationed retain medhay as the end consumer whitecs of the national inc. closed N million advanc"

This is really wonderful!! We can see that the original WaveNet model has a great capacity to learn and save long codified text information inside its nodes (and not only audio or image information). This "text generator" WaveNet was able to learn how to write English words and phrases just by predicting characters one by one, and sometimes was even able to learn what word to use based on context.

This output is far from perfect, but it was trained on a CPU-only machine (without GPU) using a small parameter configuration in just two hours!! I hope somebody with a better computer can explore the potential of this implementation.

You can download the new development here: https://github.com/Zeta36/tensorflow-tex-wavenet.

Technically:

I made a TextReader to replace the AudioReader for feeding data.

I used the ASCII decimal value of each character (0-255) as the 8-bit sample (and I removed the mu_law function from everywhere).

Removed all TensorBoard summaries (I have no memory to waste :P).

Removed write_wav() and developed a write_text().

Some other minor changes: I start the "waveform" always with a space (char 32) and not with a random int, changed some terminal arguments, etc.

And that's all!!

I hope this can help you in any way.

Best regards, Samu.

opened by Zeta36 21

[WIP] Compute loss for outputs only where receptive field is filled

This is my attempt at a fix for issue 98, using nakosung's suggested solution.

I've merged it into a branch I'm training on, and there is not an immediate drop to lower reported loss. Though it might be learning a bit faster. Hard to say. At least it doesn't appear to have broken anything.
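
For readers unfamiliar with the idea, the sketch below shows one way to drop the loss terms for outputs whose receptive field is not yet filled. It is an illustration under assumptions, not the code in this PR: the receptive field formula covers only the dilated stack, and the helper names are hypothetical.

```python
import numpy as np

def receptive_field(filter_width, dilations):
    """Number of input samples that influence one output of the dilated stack."""
    return (filter_width - 1) * sum(dilations) + 1

def mask_unfilled(per_sample_loss, filter_width, dilations):
    """Drop losses for outputs that saw zero padding instead of real samples."""
    first_valid = receptive_field(filter_width, dilations) - 1
    return per_sample_loss[first_valid:]

dilations = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(receptive_field(2, dilations))  # 1024

per_sample_loss = np.ones(2000)
print(len(mask_unfilled(per_sample_loss, 2, dilations)))  # 977
```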

opened by jyegerlehner 20

Excessive memory consumption

The network currently runs into out of memory issues at a low number of layers. This seems to be a problem with TensorFlow's atrous_conv2d operation. If I set the dilation factor to 1, which means atrous_conv2d simply calls conv2d, I can easily run with 10s of layers. It could just be the additional batch_to_space and space_to_batch operations, in which case I can write a single C++ op for atrous_conv2d.

opened by ibab 20

The u-law encoding is badly mapping values from the [-1,1] range to [0,255]

The u-law encoding was badly mapping values from the [-1,1] range to [0,255].

The correct equation to do this is (tested): return tf.cast(((signal + 1) * mu) / 2, tf.int32)

opened by Zeta36 19

Fast generation

We added fast wavenet generation. Addresses issue #26.

Verification

We compared the output of our fast generation with slow generation, and ensured it exactly matches.

We also did some speed tests, and verified that it is substantially faster 😸

Any comments on style/formatting are welcome.

opened by tomlepaine 19

Added temperature flag to generation script

It's nice to be able to specify sampling "temperature" when generating output, usually for aesthetic reasons, so I added some code to scale the sampling probabilities if a temperature other than 1.0 is provided.
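
For context, temperature scaling of a categorical distribution can be sketched as follows. This is a generic NumPy illustration, not necessarily the exact code added in this PR; apply_temperature is a hypothetical name.

```python
import numpy as np

def apply_temperature(probabilities, temperature):
    """Rescale a categorical distribution; temperature 1.0 leaves it unchanged."""
    if temperature <= 0:
        raise ValueError('temperature must be positive')
    logits = np.log(probabilities) / temperature
    logits -= logits.max()            # for numerical stability
    scaled = np.exp(logits)
    return scaled / scaled.sum()

probs = np.array([0.7, 0.2, 0.1])
print(apply_temperature(probs, 1.0))   # unchanged, up to rounding
print(apply_temperature(probs, 0.5))   # sharper: more mass on the top entry
print(apply_temperature(probs, 2.0))   # flatter: closer to uniform
```

Temperatures below 1.0 make sampling more conservative, while temperatures above 1.0 make it more random.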

Demo: https://soundcloud.com/robinsloan/sets/tensorflow-wavenet-temperature-demo

opened by robinsloan 16

What should output wave file sound like?

My model, trained for 1999 steps (which might be too few steps to sound normal), produces output that sounds just like noise.

It would be better to provide a well-trained example output so that the desired output is clear.

opened by chanil1218 16

Project dependencies may have API risk issues

Hi, in tensorflow-wavenet, inappropriate dependency version constraints can cause risks.

Below are the dependencies and version constraints that the project is using

librosa>=0.5
tensorflow>=1.0.0

A == version constraint introduces the risk of dependency conflicts because the allowed range is too strict. A constraint with no upper bound, or *, introduces the risk of missing-API errors because the latest version of a dependency may remove some APIs.

After further analysis, in this project, the version constraint of the librosa dependency can be changed to >=0.2.0,<=0.7.2.

These suggested changes reduce dependency conflicts as much as possible while allowing the latest versions that do not cause call errors in the project.

The invocation of the current project includes all the following methods.

Methods called from librosa

librosa.output.write_wav

All called methods

sum
tf.trainable_variables
tf.Variable
tf.nn.embedding_lookup
np.logaddexp.reduce
coord.join
enumerate
np.random.randint
self._generator_conv
net.predict_proba
self._generator_causal_layer
self.coord.should_stop
tf.RunOptions
os.makedirs
q.enqueue_many
tf.train.AdamOptimizer
audio_reader.trim_silence
args.optimizer.optimizer_factory
tf.summary.audio
q.dequeue
tf.train.MomentumOptimizer
argparse.ArgumentTypeError
create_variable
self._one_hot
tf.histogram_summary
get_arguments
tf.div
tf.zeros
tf.sigmoid
sys.stdout.flush
initializer
tf.train.Saver
time_to_batch
writer.add_graph
librosa.load
tf.summary.merge_all
reader.dequeue
np.nonzero
WaveNetModel.calculate_receptive_field
get_default_logdir
load_generic_audio
tf.train.get_checkpoint_state
np.seterr
self._create_generator
tf.size
open
librosa.output.write_wav
var.append
dict
audio.reshape
f.write
create_bias_variable
np.arange
self._generator_dilation_layer
find_files
tf.pad
os.path.join
optimizer.minimize
np.array
tf.constant
trim_silence
write_wav
net.loss
tf.RunMetadata
abs
tf.constant_initializer
saver.restore
list
self._embed_gc
randomize_files
tf.PaddingFIFOQueue
id_reg_expression.findall
self._create_variables
waveform.append
tf.global_variables
np.pad
parser.parse_args
self._create_causal_layer
tf.cond
tf.shape
tf.transpose
np.testing.assert_allclose
float
batch_to_time
np.reshape
sess.run
tf.placeholder
tf.add_n
self._create_dilation_layer
int
len
tf.nn.conv1d
WaveNetModel
librosa.core.frames_to_samples
tf.slice
ckpt.model_checkpoint_path.split
thread.start
create_seed
threading.Thread
tf.nn.l2_loss
fnmatch.filter
ckpt.model_checkpoint_path.split.split
self.threads.append
tf.get_default_graph
tf.Session
tf.summary.FileWriter
tf.name_scope
tf.nn.softmax
tf.to_float
tf.nn.softmax_cross_entropy_with_logits
not_all_have_id
tf.cast
tf.add
datetime.now
time_since_print.total_seconds
mu_law_decode
os.walk
np.identity
tf.one_hot
parser.add_argument
tf.train.RMSPropOptimizer
self.queue.dequeue_many
json.load
format
s.lower
tf.ConfigProto
tf.to_int32
librosa.feature.rmse
create_embedding_table
causal_conv
tf.global_variables_initializer
datetime.now.str.replace
main
push_ops.append
save
global_condition.get_shape
str
self.gc_queue.dequeue_many
tf.train.start_queue_runners
load
outputs.extend
q.enqueue
tf.sign
tf.FIFOQueue
mu_law_encode
net.predict_proba_incremental
tf.train.Coordinator
id_reg_exp.findall
time.time
get_category_cardinality
print
tl.generate_chrome_trace_format
np.exp
argparse.ArgumentParser
writer.add_run_metadata
self._create_network
seed.sess.run.tolist
NotImplementedError
ValueError
tf.reshape
AudioReader
range
files.append
os.path.exists
tf.tanh
np.random.choice
tf.summary.scalar
saver.save
re.compile
tf.nn.relu
tf.reduce_mean
random.randint
tf.minimum
reader.dequeue_gc
timeline.Timeline
np.log
tf.abs
init_ops.append
reader.start_threads
optimizer_factory.keys
self.queue.enqueue
tf.variable_scope
self.gc_queue.enqueue
validate_directories
tf.matmul
outputs.append
tf.log1p
tf.contrib.layers.xavier_initializer_conv2d
writer.add_summary
coord.request_stop

@developer Could you please help me check this issue? May I open a pull request to fix it? Thank you very much.

opened by PyDeps 0

About loading the VCTK_Corpus dataset?

When I used librosa to load an audio file from VCTK_Corpus, the following error occurred. Has anyone encountered the same situation?

  File "/anaconda3/envs/tf/lib/python3.6/site-packages/librosa/core/audio.py", line 112, in load
    with audioread.audio_open(os.path.realpath(path)) as input_file:
  File "/anaconda3/envs/tf/lib/python3.6/site-packages/audioread/__init__.py", line 116, in audio_open
    raise NoBackendError()
audioread.exceptions.NoBackendError

opened by Joll123 0

ModuleNotFoundError: No module named 'tensorflow.contrib'

Hi,

I believe there isn't a contrib module in Tensorflow 2.0 - does this mean we need an earlier version of TF to run wavenet?

`import wavenet

2021-10-25 12:32:15.774545: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-10-25 12:32:15.774629: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "", line 1, in
    import wavenet
  File "C:\Users**\Environments\project1\lib\site-packages\wavenet\__init__.py", line 5, in
    from .network import Conv, Model, Network
  File "C:\Users**\Environments\project1\lib\site-packages\wavenet\network.py", line 9, in
    from .cell import ConvCell
  File "C:\Users**\Environments\project1\lib\site-packages\wavenet\cell.py", line 6, in
    from tensorflow.contrib.rnn import RNNCell # pylint: disable=E0611
ModuleNotFoundError: No module named 'tensorflow.contrib'`

Cheers, Tristan

opened by tristankleyn 0

Why is there no activation function applied to the 1x1 conv that produces the dense output?

I have been trying to understand why there is no activation function applied to the 1x1 conv that is used between the residual connections. From what I understand having a linear layer with no activation function does not really add to the expressive power of the model. The skip connections eventually have a relu applied so that does make sense to me. However, the linear output of the residual connections has no activation applied as far as I can tell. It is just added to the residual bus and fed into the next layer. What is the point of having the 1x1 convolution in this case? Why not just skip the 1x1 convolution and add the filter * gate directly to the inputs to create the dense output?
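
To make the structure under discussion concrete, here is a NumPy sketch of one residual block as described above. It is an illustration under assumptions, not the code in model.py: the 1x1 convolutions are written as per-sample matrix multiplies, biases are omitted, and the names residual_channels and skip_channels are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def residual_block(x, filter_out, gate_out, w_dense, w_skip):
    """x: (num_samples, residual_channels) block input.
    filter_out, gate_out: (num_samples, dilation_channels) outputs of the
    dilated filter and gate convolutions (computed elsewhere).
    w_dense: (dilation_channels, residual_channels) 1x1 convolution weights.
    w_skip: (dilation_channels, skip_channels) 1x1 convolution weights.
    """
    gated = np.tanh(filter_out) * sigmoid(gate_out)
    dense = gated @ w_dense          # 1x1 conv: a per-sample matrix multiply
    skip = gated @ w_skip
    return x + dense, skip           # no activation on the residual path

rng = np.random.default_rng(0)
num_samples, residual_channels, dilation_channels, skip_channels = 8, 4, 6, 5
x = rng.normal(size=(num_samples, residual_channels))
f = rng.normal(size=(num_samples, dilation_channels))
g = rng.normal(size=(num_samples, dilation_channels))
out, skip = residual_block(
    x, f, g,
    rng.normal(size=(dilation_channels, residual_channels)),
    rng.normal(size=(dilation_channels, skip_channels)))
print(out.shape, skip.shape)  # (8, 4) (8, 5)
```

One thing the shapes make visible: the 1x1 convolution maps dilation_channels to residual_channels, so adding filter * gate directly to the inputs is only shape-compatible when the two channel counts are equal.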

opened by chasep255 0

Module 'tensorflow' has no attribute 'placeholder'

I'm using Anaconda3 and the latest version of this repository. I have manually installed librosa and TensorFlow (following Anaconda tutorial). Environment: Anaconda Prompt (Windows 10), TensorFlow set up as "tf" using conda create -n tf tensorflow

See attached picture for the error. The same happens when using tf-gpu.

I'm sure I did something wrong but I don't know what it is.

opened by UnforeseenOcean 8
