Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow

Overview


Do you want an RL agent that plays Atari nicely?

Rainbow is all you need!

This is a step-by-step tutorial from DQN to Rainbow. Every chapter contains both the theoretical background and an object-oriented implementation. Just pick any topic you are interested in and start learning. You can run everything right away in Colab, even on your smartphone.

Please feel free to open an issue or a pull-request if you have any idea to make it better. :)

If you want a tutorial for policy gradient methods, please see PG is All You Need.

Contents

  1. DQN [NBViewer] [Colab]
  2. Double DQN [NBViewer] [Colab]
  3. Prioritized Experience Replay [NBViewer] [Colab]
  4. Dueling Net [NBViewer] [Colab]
  5. Noisy Net [NBViewer] [Colab]
  6. Categorical DQN [NBViewer] [Colab]
  7. N-step Learning [NBViewer] [Colab]
  8. Rainbow [NBViewer] [Colab]

Prerequisites

This repository is tested in an Anaconda virtual environment with Python 3.7+.

$ conda create -n rainbow-is-all-you-need python=3.7
$ conda activate rainbow-is-all-you-need

Installation

First, clone the repository.

git clone https://github.com/Curt-Park/rainbow-is-all-you-need.git
cd rainbow-is-all-you-need

Second, install the packages required to run the code. Just type:

make setup

Related Papers

  1. V. Mnih et al., "Human-level control through deep reinforcement learning." Nature, 518 (7540):529–533, 2015.
  2. H. van Hasselt et al., "Deep Reinforcement Learning with Double Q-learning." arXiv preprint arXiv:1509.06461, 2015.
  3. T. Schaul et al., "Prioritized Experience Replay." arXiv preprint arXiv:1511.05952, 2015.
  4. Z. Wang et al., "Dueling Network Architectures for Deep Reinforcement Learning." arXiv preprint arXiv:1511.06581, 2015.
  5. M. Fortunato et al., "Noisy Networks for Exploration." arXiv preprint arXiv:1706.10295, 2017.
  6. M. G. Bellemare et al., "A Distributional Perspective on Reinforcement Learning." arXiv preprint arXiv:1707.06887, 2017.
  7. R. S. Sutton, "Learning to predict by the methods of temporal differences." Machine learning, 3(1):9–44, 1988.
  8. M. Hessel et al., "Rainbow: Combining Improvements in Deep Reinforcement Learning." arXiv preprint arXiv:1710.02298, 2017.

Contributors

Thanks goes to these wonderful people (emoji key):


Jinwoo Park (Curt)

💻 📖

Kyunghwan Kim

💻

Wei Chen

🚧

WANG Lei

🚧

leeyaf

💻

ahmadF

📖

This project follows the all-contributors specification. Contributions of any kind welcome!

Comments
  • Assertion error when calculating loss

    Assertion error when calculating loss

    tl;dr: an error occurs in 08.rainbow.ipynb while training to 100,000 steps (sometimes after fewer than 20k steps, sometimes more); here's my copy of the Colab.

    Hi and thanks for sharing this wonderful learning resource, I really like your implementation!

    Testing with CartPole and LunarLander inside Colab, the network produces nan values inside update_priorities(self, indices, priorities), which is called by loss = self.update_model() inside the train() function. For CartPole, the error appears just after 17,500 steps.


    <ipython-input-20-92e9bbd87b9f> in train(self, num_frames, plotting_interval)
        222             # if training is ready
        223             if len(self.memory) >= self.batch_size:
    --> 224                 loss = self.update_model()
        225                 losses.append(loss)
        226                 update_cnt += 1
    
    <ipython-input-20-92e9bbd87b9f> in update_model(self)
        183         loss_for_prior = elementwise_loss.detach().cpu().numpy()
        184         new_priorities = loss_for_prior + self.prior_eps
    --> 185         self.memory.update_priorities(indices, new_priorities)
        186 
        187         # NoisyNet: reset noise
    
    <ipython-input-17-06b7969c0015> in update_priorities(self, indices, priorities)
         84 
         85         for idx, priority in zip(indices, priorities):
    ---> 86             assert priority > 0
         87             assert 0 <= idx < len(self)
         88 
    
    AssertionError: 
    

    You can trace this up the stack from loss_for_prior = elementwise_loss.detach().cpu().numpy() back to elementwise_loss = self._compute_dqn_loss(samples, self.gamma).

    So _compute_dqn_loss holds the clue, because neither samples nor self.gamma contains nan values.

    I've left a copy of this inside Colab, which is a copy of 08.rainbow.ipynb configured for LunarLander-v2. In other experiments I've played with the memory_size and set n_step to 1, but eventually the priorities variable receives a nan. I'd love to hear any ideas about why it might be happening.
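
    One way to narrow this down while debugging (a sketch, not part of the tutorial) is to fail fast with some context before the priorities ever reach the buffer's assert, e.g. with a small helper used in update_model() in place of the inline computation:

        import numpy as np

        def checked_priorities(elementwise_loss, prior_eps: float) -> np.ndarray:
            """Convert per-sample losses to PER priorities, failing fast on NaN/Inf.

            A debugging helper (not the tutorial's code): call it as
            `new_priorities = checked_priorities(elementwise_loss, self.prior_eps)`.
            """
            loss_for_prior = elementwise_loss.detach().cpu().numpy()
            new_priorities = loss_for_prior + prior_eps
            if not np.isfinite(new_priorities).all():
                bad = np.where(~np.isfinite(new_priorities))[0]
                raise RuntimeError(
                    f"non-finite priorities at batch positions {bad.tolist()}: "
                    f"{loss_for_prior[bad]}"
                )
            return new_priorities

    If the error fires here, the categorical loss itself already produced a nan (for example through the log of a zero probability), which at least narrows the search to _compute_dqn_loss.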

    For completeness, here's a copy of the current definition of that function _compute_dqn_loss:

        def _compute_dqn_loss(self, samples: Dict[str, np.ndarray], gamma: float) -> torch.Tensor:
            """Return categorical dqn loss."""
            device = self.device  # for shortening the following lines
            state = torch.FloatTensor(samples["obs"]).to(device)
            next_state = torch.FloatTensor(samples["next_obs"]).to(device)
            action = torch.LongTensor(samples["acts"]).to(device)
            reward = torch.FloatTensor(samples["rews"].reshape(-1, 1)).to(device)
            done = torch.FloatTensor(samples["done"].reshape(-1, 1)).to(device)
            
            # Categorical DQN algorithm
            delta_z = float(self.v_max - self.v_min) / (self.atom_size - 1)
    
            with torch.no_grad():
                # Double DQN
                next_action = self.dqn(next_state).argmax(1)
                next_dist = self.dqn_target.dist(next_state)
                next_dist = next_dist[range(self.batch_size), next_action]
    
                t_z = reward + (1 - done) * gamma * self.support
                t_z = t_z.clamp(min=self.v_min, max=self.v_max)
                b = (t_z - self.v_min) / delta_z
                l = b.floor().long()
                u = b.ceil().long()
    
                offset = (
                    torch.linspace(
                        0, (self.batch_size - 1) * self.atom_size, self.batch_size
                    ).long()
                    .unsqueeze(1)
                    .expand(self.batch_size, self.atom_size)
                    .to(self.device)
                )
    
                proj_dist = torch.zeros(next_dist.size(), device=self.device)
                proj_dist.view(-1).index_add_(
                    0, (l + offset).view(-1), (next_dist * (u.float() - b)).view(-1)
                )
                proj_dist.view(-1).index_add_(
                    0, (u + offset).view(-1), (next_dist * (b - l.float())).view(-1)
                )
    
            dist = self.dqn.dist(state)
            log_p = torch.log(dist[range(self.batch_size), action])
            elementwise_loss = -(proj_dist * log_p).sum(1)
    
            return elementwise_loss
    
    

    Thanks, Greg

    bug 
    opened by signalprime 7
  • Integrate video rendering logic (Jupyter & Colab)

    Integrate video rendering logic (Jupyter & Colab)

    Done

    • Fixed packages for Apple M1 chip.
    • Integration of the video rendering logic for Jupyter and Colab.
    • gym.wrappers.Monitor is replaced with gym.wrappers.RecordVideo (gym.wrappers.Monitor will be deprecated in gym).

    Colab

    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/01.dqn.ipynb
    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/02.double_q.ipynb
    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/03.per.ipynb
    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/04.dueling.ipynb
    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/05.noisy_net.ipynb
    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/06.categorical_dqn.ipynb
    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/07.n_step_learning.ipynb
    • https://colab.research.google.com/github/Curt-Park/rainbow-is-all-you-need/blob/feature%2Frendering-integration/08.rainbow.ipynb
    enhancement 
    opened by Curt-Park 5
  • Save memory checkpoints

    Save memory checkpoints

    Hello,

    I read the Rainbow tutorial (08.rainbow.ipynb) and I really liked it. I need some help coding a method for efficiently saving the memory of the DQN (the ReplayBuffer object and the PrioritizedReplayBuffer object). I used numpy's savez_compressed method to save the ReplayBuffer like this:

    import os
    from collections import deque
    from pathlib import Path
    from typing import Deque, Dict, Tuple

    import numpy


    class ReplayBuffer:
        """A simple numpy replay buffer."""
    
        def __init__(
            self, 
            obs_dim: tuple, 
            size: int, 
            save_dir: Path,
            batch_size: int = 32, 
            n_step: int = 1, 
            gamma: float = 0.99
        ):
            self.obs_buf = numpy.zeros((size, *obs_dim), dtype=numpy.float32)
            self.next_obs_buf = numpy.zeros((size, *obs_dim), dtype=numpy.float32)
            self.acts_buf = numpy.zeros(size, dtype=numpy.int64)
            self.rews_buf = numpy.zeros(size, dtype=numpy.float32)
            self.done_buf = numpy.zeros(size, dtype=numpy.uint8)
            self.max_size, self.batch_size = size, batch_size
            self.ptr, self.size, = 0, 0
    
            self.save_dir = save_dir
    
            # for N-step Learning
            self.n_step_buffer = deque(maxlen=n_step)
            self.n_step = n_step
            self.gamma = gamma
    
        def store(
            self, 
            obs: numpy.ndarray, 
            act: numpy.ndarray, 
            rew: float, 
            next_obs: numpy.ndarray, 
            done: bool,
        ) -> Tuple[numpy.ndarray, numpy.ndarray, float, numpy.ndarray, bool]:
            transition = (obs, act, rew, next_obs, done)
            self.n_step_buffer.append(transition)
    
            # single step transition is not ready
            if len(self.n_step_buffer) < self.n_step:
                return ()
            
            # make a n-step transition
            rew, next_obs, done = self._get_n_step_info(self.n_step_buffer, self.gamma)
            obs, act = self.n_step_buffer[0][:2]
            
            self.obs_buf[self.ptr] = obs
            self.next_obs_buf[self.ptr] = next_obs
            self.acts_buf[self.ptr] = act
            self.rews_buf[self.ptr] = rew
            self.done_buf[self.ptr] = done
            self.ptr = (self.ptr + 1) % self.max_size
            self.size = min(self.size + 1, self.max_size)
            
            return self.n_step_buffer[0]
    
        def sample_batch(self) -> Dict[str, numpy.ndarray]:
            idxs = numpy.random.choice(self.size, size=self.batch_size, replace=False)
    
            return dict(
                obs=self.obs_buf[idxs],
                next_obs=self.next_obs_buf[idxs],
                acts=self.acts_buf[idxs],
                rews=self.rews_buf[idxs],
                done=self.done_buf[idxs],
                # for N-step Learning
                indices=idxs,
            )
        
        def sample_batch_from_idxs(
            self, idxs: numpy.ndarray
        ) -> Dict[str, numpy.ndarray]:
            # for N-step Learning
            return dict(
                obs=self.obs_buf[idxs],
                next_obs=self.next_obs_buf[idxs],
                acts=self.acts_buf[idxs],
                rews=self.rews_buf[idxs],
                done=self.done_buf[idxs],
            )
        
        def _get_n_step_info(
            self, n_step_buffer: Deque, gamma: float
        ) -> Tuple[numpy.int64, numpy.ndarray, bool]:
            """Return n step rew, next_obs, and done."""
            # info of the last transition
            rew, next_obs, done = n_step_buffer[-1][-3:]
    
            for transition in reversed(list(n_step_buffer)[:-1]):
                r, n_o, d = transition[-3:]
    
                rew = r + gamma * rew * (1 - d)
                next_obs, done = (n_o, d) if d else (next_obs, done)
    
            return rew, next_obs, done
        
        def save_buffer(self):
            save_path = self.save_dir / f"xp_buffer-{self.ptr}-{self.max_size}.npz"
            numpy.savez_compressed(save_path, state_memory=self.obs_buf, next_state_memory=self.next_obs_buf,
                                   action_memory=self.acts_buf, reward_memory=self.rews_buf, terminal_memory=self.done_buf)
            print(f"Memory Buffer saved to {save_path} of size {self.ptr} and total capacity {self.max_size}")
    
        def load_buffer(self, chkpt_path: Path):
            if not chkpt_path.exists():
                raise ValueError(f"{chkpt_path} does not exist")
    
            path = os.path.normpath(chkpt_path)
            chkpt = numpy.load(chkpt_path)
            tokens = path.split('-')
    
            self.obs_buf = chkpt['state_memory']
            self.acts_buf = chkpt['action_memory']
            self.rews_buf = chkpt['reward_memory']
            self.done_buf = chkpt['terminal_memory']
            self.next_obs_buf = chkpt['next_state_memory']
            self.ptr = int(tokens[1])
            self.max_size = int(os.path.splitext(tokens[2])[0])
            
            print(f"Loading buffer at {chkpt_path} of size {self.ptr} and total capacity {self.max_size}")
    
        def __len__(self) -> int:
            return self.size
    

    I don't know how to save the object created from the PrioritizedReplayBuffer class. Any help is appreciated :)
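
    A rough sketch of one way to extend this to the PrioritizedReplayBuffer (assuming the attributes from 08.rainbow.ipynb: sum_tree / min_tree segment trees that support item access, tree_ptr and max_priority; the file name is made up here) is to persist the raw per-index priorities next to the base arrays and write them back through both trees on load:

        from pathlib import Path

        import numpy


        class PrioritizedReplayBuffer(ReplayBuffer):
            # ... __init__ / store / sample_batch as in the tutorial ...

            def save_buffer(self):
                """Sketch: also persist per-index priorities and PER bookkeeping."""
                super().save_buffer()
                priorities = numpy.array(
                    [self.sum_tree[i] for i in range(len(self))], dtype=numpy.float32
                )
                numpy.savez_compressed(
                    self.save_dir / f"per_state-{self.ptr}-{self.max_size}.npz",
                    priorities=priorities,
                    max_priority=self.max_priority,
                    tree_ptr=self.tree_ptr,
                    size=len(self),  # the base load_buffer does not restore self.size itself
                )

            def load_buffer(self, chkpt_path: Path, per_chkpt_path: Path):
                """Sketch: restore the base arrays, then rebuild both segment trees."""
                super().load_buffer(chkpt_path)
                per_state = numpy.load(per_chkpt_path)
                self.size = int(per_state["size"])
                self.max_priority = float(per_state["max_priority"])
                self.tree_ptr = int(per_state["tree_ptr"])
                for idx, priority in enumerate(per_state["priorities"]):
                    # write through both trees so sampling and IS weights stay consistent
                    self.sum_tree[idx] = float(priority)
                    self.min_tree[idx] = float(priority)

    Because the values read out of sum_tree are written straight back, whatever transformation was applied when the priorities were first stored (e.g. the alpha exponent) is preserved as-is.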

    opened by AndreasKaratzas 5
  • V_min and V_max - Rainbow DQN

    V_min and V_max - Rainbow DQN

    Hello,

    I'm studying your Rainbow DQN tutorial. I'm trying to understand why V_min is set to 0 and V_max to 200. Most Rainbow DQN implementations set those variables to -10 and 10, respectively.
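
    For context, the categorical head in the tutorial places the value distribution on a fixed grid of atoms between V_min and V_max, roughly like this (a sketch; names follow the notebooks):

        import torch

        v_min, v_max, atom_size = 0.0, 200.0, 51           # CartPole-ish settings (51 atoms as in C51)
        support = torch.linspace(v_min, v_max, atom_size)   # atom locations
        delta_z = (v_max - v_min) / (atom_size - 1)         # spacing between atoms

    Since the projected targets are clamped into [V_min, V_max], the bounds are usually chosen to roughly cover the returns the agent can actually observe: CartPole-v0 yields at most 200 reward per episode (hence 0 and 200), whereas Atari with clipped rewards is the setting where -10 and 10 are standard.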

    opened by AndreasKaratzas 4
  • Fix rendering issue

    Fix rendering issue

    • issue from: https://www.facebook.com/groups/ReinforcementLearningKR/permalink/2397295550509671/

    Changed from playing a GIF animation to saving and playing an MP4 file.

    Changes

    # Colab configurations
    import sys
    IN_COLAB = "google.colab" in sys.modules
    
    if IN_COLAB:
        !apt install python-opengl
        !apt install ffmpeg
        !apt install xvfb
        !pip install pyvirtualdisplay
        from pyvirtualdisplay import Display
        
        # Start virtual display
        dis = Display(visible=0, size=(400, 400))
        dis.start()
    
    # test method in Agent
    def test(self) -> None:
        """Test the agent."""
        self.is_test = True

        state = self.env.reset()
        done = False
        score = 0

        while not done:
            self.env.render()
            action = self.select_action(state)
            next_state, reward, done = self.step(action)

            state = next_state
            score += reward

        print("score: ", score)
        self.env.close()
    
    # test
    agent.env = gym.wrappers.Monitor(env, "videos", force=True)
    agent.test()
    
    # render
    import base64
    import glob
    import io
    import os
    
    from IPython.display import HTML, display
    
    
    def ipython_show_video(path: str) -> None:
        """Show a video at `path` within IPython Notebook."""
        if not os.path.isfile(path):
            raise NameError("Cannot access: {}".format(path))
    
        video = io.open(path, "r+b").read()
        encoded = base64.b64encode(video)
    
        display(HTML(
            data="""
            <video alt="test" controls>
            <source src="data:video/mp4;base64,{0}" type="video/mp4"/>
            </video>
            """.format(encoded.decode("ascii"))
        ))
    
    list_of_files = glob.glob("videos/*.mp4")
    latest_file = max(list_of_files, key=os.path.getctime)
    print(latest_file)
    ipython_show_video(latest_file)
    
    bug enhancement 
    opened by Curt-Park 4
  • redundant max in double dqn

    redundant max in double dqn

    In Double DQN, I found that the target is written as max Q(~~~, argmax Q(~~~)).

    Do we need the max even though we already have the argmax inside Q?

    I think the max is redundant.
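
    For reference, the Double DQN target can be written as

        y_t = r_{t+1} + \gamma \, Q_{\theta^-}\!\left(s_{t+1}, \operatorname{arg\,max}_{a'} Q_{\theta}(s_{t+1}, a')\right)

    where the online network Q_\theta selects the action and the target network Q_{\theta^-} only evaluates it; once the argmax has picked a single action, an outer max over actions has nothing left to choose from.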

    Would you kindly check this to reduce confusion?

    documentation 
    opened by DongukJu 4
  • Some questions on the N-step ReplayBuffer

    Some questions on the N-step ReplayBuffer

    Maybe some dumb questions about the N-step ReplayBuffer

    1. In update_model()
            if self.use_n_step:
                samples = self.memory_n.sample_batch_from_idxs(indices)
                gamma = self.gamma ** self.n_step
                n_loss = self._compute_dqn_loss(samples, gamma)
    

    Here the assumption is that, for each transition in samples, next_obs is n_step steps away from obs, so self.gamma ** self.n_step makes sense. However, the way _get_n_step_info() is implemented, there is no guarantee that this holds: when there is a terminal state right after obs, for example, the next_obs of that transition is just one step away from obs, right?

    2. In the same code snippet above, samples = self.memory_n.sample_batch_from_idxs(indices): here self.memory_n is sampled with the same indices that were sampled from self.memory. However, there seem to be two issues here:
       2.1 For the same index, the two transitions sampled from self.memory and self.memory_n do not have the same obs (they do have the same next_obs if there is no terminal-state issue, as described above).
       2.2 If there is a terminal state after obs but before the n-th step, then the transitions sampled from self.memory and self.memory_n have neither the same obs nor the same next_obs.

    For example, the following is from a trace (the sample from the N-step buffer is renamed to samples2 for clarity):

        (Pdb) l
        138             # prevent high-variance.
        139             if self.use_n_step:
        140                 samples2 = self.memory_n.sample_batch_from_idxs(indices)
        141                 if (samples2['next_obs'][0][0] != samples['next_obs'][0][0]):
        142                     pdb.set_trace()
        143  ->             gamma = self.gamma ** self.n_step
        144                 n_loss = self._compute_dqn_loss(samples2, gamma)
        145                 loss += n_loss
        146
        147             self.optimizer.zero_grad()
        148             loss.backward()
        (Pdb) p samples
        {'obs': array([[-0.03751984,  0.00060021,  0.00848503,  0.02023765]], dtype=float32),
         'next_obs': array([[-0.03750784, -0.1946424 ,  0.00888978,  0.31558558]], dtype=float32),
         'acts': array([0.], dtype=float32), 'rews': array([1.], dtype=float32),
         'done': array([0.], dtype=float32), 'indices': array([17])}
        (Pdb) samples2
        {'obs': array([[-0.04211658, -0.7778139 ,  0.17969099,  1.5349392 ]], dtype=float32),
         'next_obs': array([[-0.05767286, -0.974587  ,  0.21038976,  1.8778918 ]], dtype=float32),
         'acts': array([0.], dtype=float32), 'rews': array([1.], dtype=float32),
         'done': array([1.], dtype=float32)}

    So my question is: if the samples from 1-step and N-step are not really synchronized, what is the benefit of trying? I guess even if we simply used self.memory_n.sample_batch the result would probably be fine, since the 1-step loss is just used to lower the variance, and that does not necessarily require the two samples to have the same indices?

    Thanks!

    opened by ty2000 4
  • clear memory during n_step_learning

    clear memory during n_step_learning

    In the training loop, it might make sense to call self.memory_n_step.n_step_buffer.clear() when an episode is done to avoid (final->initial) transitions.
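
    A minimal sketch of a helper for that (names follow 08.rainbow.ipynb; the use_n_step and memory_n attributes are assumed):

        def end_of_episode_cleanup(agent) -> None:
            """Call when `done` is True in the train loop, before the env is reset.

            Clears the pending n-step window so that no transition spanning the
            episode boundary (final state -> next episode's initial state) is built.
            """
            if getattr(agent, "use_n_step", False):
                agent.memory_n.n_step_buffer.clear()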

    opened by PigUnderRoof 2
  • Update frequency/method and warm-up period

    Update frequency/method and warm-up period

    Hey!

    First: Thanks a lot for the awesome tutorial! It is really great :)

    I am currently building a Rainbow RL agent for the board game of Abalone based on your tutorial/implementation and gym-abalone. You can have a look at it here.

    While reading the Rainbow paper, I found some discrepancies regarding when a learning step takes place and when the target network is updated:

    | Issue | Paper | This repo |
    | --- | --- | --- |
    | 1 | training starts after a warm-up period | training starts as soon as the first batch is available |
    | 2 | model update is performed every 4th agent step | update is performed at every agent step |
    | 3 | target net is soft-updated at every training step? | target net is hard-updated every target_update steps |

    With issue 2, I understand that in the paper each action selected by the agent is repeated four times and that every state is a concatenation of four frames. So from that perspective updating at every agent step might make sense.

    With issue 3, I am not quite sure how it is done in the paper. It seems to be done like that in this implementation, which you build upon (right?).
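
    For concreteness, the paper's schedule might look roughly like this in the train loop (a sketch only; update_model() and the hard target update exist in the tutorial, while the constants here are placeholders, not the paper's values):

        def train_step(agent, frame_idx: int,
                       warmup: int = 1_000,        # placeholder warm-up length
                       update_every: int = 4,      # learn on every 4th agent step
                       target_update: int = 100):  # hard-update period (placeholder)
            """Gate learning on a warm-up period and an update interval (a sketch).

            `agent` is assumed to expose memory, batch_size, update_model() and
            _target_hard_update() as in 08.rainbow.ipynb.
            """
            if len(agent.memory) < max(agent.batch_size, warmup):
                return None
            loss = None
            if frame_idx % update_every == 0:
                loss = agent.update_model()
            if frame_idx % target_update == 0:
                agent._target_hard_update()
            return loss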

    Is there a reason for these differences? I would be very happy to hear from you! :)

    Best, Max

    opened by wuxmax 2
  • does this work with mountaincar and other gym environments

    does this work with mountaincar and other gym environments

    If I just change CartPole to MountainCar in chapter 8, it pretty much does not work. Is there a way to make it work, though? Do you consider making it run on all sorts of environments, like Stable Baselines?
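
    For what it's worth, switching the environment itself is a one-line change (a sketch below), but MountainCar-v0 has a sparse reward of -1 per step until the goal, so returns are negative and the CartPole hyperparameters (v_min/v_max for the categorical head, memory size, exploration) generally need retuning before it can work:

        import gym

        # hypothetical swap for chapter 8; the rest of the agent code stays the same
        env = gym.make("MountainCar-v0")

        obs_dim = env.observation_space.shape[0]  # 2 for MountainCar
        action_dim = env.action_space.n           # 3 discrete actions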

    thanks.

    question 
    opened by simin75simin 2
  • bias_sigma initialization in noisy net

    bias_sigma initialization in noisy net

    According to Sec. 3.2 of the paper "Noisy Networks for Exploration", the sigmas are initialized to a constant sigma_0 / sqrt(p). However, in this implementation, self.bias_sigma.data.fill_(self.std_init / math.sqrt(self.out_features)) is used. Is this a typo?

    question 
    opened by kentropy 2
  • curiosity

    curiosity

    Experiment with curiosity: use ICM to add intrinsic rewards. (Deepak Pathak, Pulkit Agrawal, Alexei A. Efros and Trevor Darrell. "Curiosity-driven Exploration by Self-supervised Prediction." In ICML 2017.)
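
    A minimal sketch of the idea (not part of this repo; the network sizes and the eta scale are assumptions): a learned forward model predicts the next state's feature embedding, and its prediction error is added to the environment reward as an intrinsic bonus.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F


        class ForwardICM(nn.Module):
            """Minimal curiosity module: intrinsic reward = forward-model error in feature space."""

            def __init__(self, obs_dim: int, action_dim: int, feat_dim: int = 32, eta: float = 0.01):
                super().__init__()
                self.action_dim = action_dim
                self.eta = eta
                self.encoder = nn.Sequential(
                    nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim)
                )
                self.forward_model = nn.Sequential(
                    nn.Linear(feat_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim)
                )

            def forward(self, obs: torch.Tensor, action: torch.Tensor, next_obs: torch.Tensor):
                """Return (intrinsic_reward, forward_loss) for a batch of transitions."""
                phi = self.encoder(obs)
                phi_next = self.encoder(next_obs)
                a_onehot = F.one_hot(action.long(), self.action_dim).float()
                phi_next_pred = self.forward_model(torch.cat([phi, a_onehot], dim=1))
                # per-sample squared error; detach the bonus so it does not backprop into the agent
                error = 0.5 * (phi_next_pred - phi_next.detach()).pow(2).sum(dim=1)
                intrinsic_reward = self.eta * error.detach()
                return intrinsic_reward, error.mean()

    During training one would add the intrinsic reward to the sampled batch's rewards and minimize the returned forward loss alongside the DQN loss; the full ICM additionally trains an inverse model so the features only capture what the agent can influence.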

    opened by zhchaoo 4