This repository contains the source code for the paper First Order Motion Model for Image Animation

Overview

Note: check out our new paper and framework, improved for articulated objects.

First Order Motion Model for Image Animation

This repository contains the source code for the paper First Order Motion Model for Image Animation by Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci and Nicu Sebe.

Example animations

The videos on the left show the driving videos. The first row on the right for each dataset shows the source videos. The bottom row contains the animated sequences with motion transferred from the driving video and object taken from the source image. We trained a separate network for each task.

VoxCeleb Dataset

Screenshot

Fashion Dataset

Screenshot

MGIF Dataset

Screenshot

Installation

We support Python 3. To install the dependencies, run:

pip install -r requirements.txt

YAML configs

There are several configuration files (config/dataset_name.yaml), one for each dataset. See config/taichi-256.yaml for a description of each parameter.
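
For reference, the training and demo scripts read these configs with PyYAML. The following minimal sketch (file name and keys taken from the shipped configs; it is an illustration, not part of the codebase) shows how to inspect one:

import yaml

# Minimal sketch: load a dataset config and look at its top-level sections,
# which are the ones referenced elsewhere in this README.
with open('config/taichi-256.yaml') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)

print(config['dataset_params'])        # data location and preprocessing options
print(config['model_params'].keys())   # generator / keypoint detector / discriminator settings
print(config['train_params'])          # epochs, batch size, learning rates, ...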

Pre-trained checkpoint

Checkpoints can be found at the following links: google-drive or yandex-disk.

Animation Demo

To run a demo, download a checkpoint and run the following command:

python demo.py  --config config/dataset_name.yaml --driving_video path/to/driving --source_image path/to/source --checkpoint path/to/checkpoint --relative --adapt_scale

The result will be stored in result.mp4.
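
The demo can also be driven from Python, which is what the Colab notebook does. The sketch below is illustrative only: load_checkpoints comes from demo.py (as used in the notebook), while the exact signature of make_animation, the frame rate, and the input file names are assumptions.

import imageio
import numpy as np
from skimage.transform import resize
from demo import load_checkpoints, make_animation  # make_animation signature assumed

# Load the pre-trained generator and keypoint detector (checkpoint path is a placeholder).
generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml',
                                          checkpoint_path='vox-cpk.pth.tar')

# Read and resize the inputs to the model resolution (256x256 for the vox config).
source_image = resize(imageio.imread('source.png'), (256, 256))[..., :3]
driving_video = [resize(frame, (256, 256))[..., :3]
                 for frame in imageio.mimread('driving.mp4', memtest=False)]

# Animate with relative keypoints and adaptive movement scale, mirroring the CLI flags above.
predictions = make_animation(source_image, driving_video, generator, kp_detector,
                             relative=True, adapt_movement_scale=True)
imageio.mimsave('result.mp4', [(255 * frame).astype(np.uint8) for frame in predictions], fps=25)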

The driving videos and source images should be cropped before they can be used in our method. To obtain some semi-automatic crop suggestions you can use python crop-video.py --inp some_youtube_video.mp4. It will generate crop commands using ffmpeg. To use the script, the face-alignment library is needed:

git clone https://github.com/1adrianb/face-alignment
cd face-alignment
pip install -r requirements.txt
python setup.py install

Animation demo with Docker

If you are having trouble getting the demo to work because of library compatibility issues, and you're running Linux, you might try running it inside a Docker container, which would give you better control over the execution environment.

Requirements: Docker 19.03+ and nvidia-docker installed and able to successfully run the nvidia-docker usage tests.

We'll first build the container.

docker build -t first-order-model .

And now that we have the container available locally, we can use it to run the demo.

docker run -it --rm --gpus all \
       -v $HOME/first-order-model:/app first-order-model \
       python3 demo.py --config config/vox-256.yaml \
           --driving_video driving.mp4 \
           --source_image source.png \
           --checkpoint vox-cpk.pth.tar \
           --result_video result.mp4 \
           --relative --adapt_scale

Colab Demo

@graphemecluster prepared a GUI demo for Google Colab; see demo.ipynb. To run it, press the Open In Colab button.

For the old demo, see old-demo.ipynb.

Face-swap

It is possible to modify the method to perform face-swap using supervised segmentation masks. For both unsupervised and supervised video editing, such as face-swap, please refer to Motion Co-Segmentation.

Training

To train a model on a specific dataset, run:

CUDA_VISIBLE_DEVICES=0,1,2,3 python run.py --config config/dataset_name.yaml --device_ids 0,1,2,3

The code will create a folder in the log directory (each run creates a new time-stamped directory). Checkpoints will be saved to this folder. To check the loss values during training, see log.txt. You can also check training data reconstructions in the train-vis subfolder. By default the batch size is tuned to run on 2 or 4 Titan-X GPUs (apart from speed, it does not make much difference). You can change the batch size in the train_params section of the corresponding .yaml file.
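
If your GPUs have less memory, the batch size can also be adjusted programmatically before launching a run. A small sketch, assuming the train_params/batch_size keys present in the shipped configs:

import yaml

# Sketch: lower the batch size of an existing config and save a copy.
with open('config/taichi-256.yaml') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)

print('default batch size:', config['train_params']['batch_size'])
config['train_params']['batch_size'] = 8   # e.g. for a single smaller GPU

with open('config/taichi-256-small.yaml', 'w') as f:
    yaml.dump(config, f)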

Evaluation on video reconstruction

To evaluate the reconstruction performance run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode reconstruction --checkpoint path/to/checkpoint

You will need to specify the path to the checkpoint; the reconstruction subfolder will be created in the checkpoint folder. The generated videos will be stored in this folder, and the individual frames will also be stored in the png subfolder in lossless '.png' format for evaluation. Instructions for computing the metrics from the paper can be found at https://github.com/AliaksandrSiarohin/pose-evaluation.
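
Because the frames are written as lossless PNGs, simple reconstruction errors can be computed directly on them. The sketch below computes a mean absolute error between two frame folders; the folder layout is an assumption, and the official metrics are in the pose-evaluation repository linked above.

import os
import imageio
import numpy as np

def mean_l1(gen_dir, gt_dir):
    """Sketch: mean absolute pixel error between two folders of .png frames."""
    errors = []
    for name in sorted(os.listdir(gen_dir)):
        gen = imageio.imread(os.path.join(gen_dir, name)).astype(np.float64) / 255
        gt = imageio.imread(os.path.join(gt_dir, name)).astype(np.float64) / 255
        errors.append(np.abs(gen - gt).mean())
    return float(np.mean(errors))

# Hypothetical paths: generated frames vs. ground-truth frames of the same video.
print(mean_l1('log/run/reconstruction/png/video1', 'data/dataset_name/test/video1'))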

Image animation

In order to animate videos, run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode animate --checkpoint path/to/checkpoint

You will need to specify the path to the checkpoint; the animation subfolder will be created in the same folder as the checkpoint. You can find the generated videos there, and their lossless versions in the png subfolder. By default, videos from the test set are randomly paired, but you can specify the "source,driving" pairs in the corresponding .csv files. The path to this file should be specified in the corresponding .yaml file in the pairs_list setting.

There are 2 different ways of performing animation: by using absolute keypoint locations or by using relative keypoint locations.

  1. Animation using absolute coordinates: the animation is performed using the absolute positions of the driving video and the appearance of the source image. In this way there are no specific requirements for the driving video and source appearance. However, this usually leads to poor performance, since irrelevant details such as shape are transferred. Check the animate parameters in taichi-256.yaml to enable this mode.

  2. Animation using relative coordinates: from the driving video we first estimate the relative movement of each keypoint, then we add this movement to the absolute positions of the keypoints in the source image. These keypoints, along with the source image, are used for animation. This usually leads to better performance, but it requires that the object in the first frame of the driving video and in the source image have a similar pose (a toy sketch of this keypoint update follows below).
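
A toy sketch of the relative-coordinate update described above. The helper name and array shapes are illustrative; in the repository this logic lives in the keypoint normalization used by the demo.

import numpy as np

def relative_kp(kp_source, kp_driving, kp_driving_initial):
    """Sketch: add the driving keypoints' displacement (relative to the first
    driving frame) to the source keypoints, as described in option 2 above."""
    movement = kp_driving - kp_driving_initial   # relative motion of each keypoint
    return kp_source + movement                  # keypoints used to animate the source image

# Toy example: 10 keypoints in normalized [-1, 1] coordinates.
kp_source = np.random.uniform(-1, 1, (10, 2))
kp_driving_initial = np.random.uniform(-1, 1, (10, 2))
kp_driving = kp_driving_initial + 0.05           # driving object shifted slightly
print(relative_kp(kp_source, kp_driving, kp_driving_initial))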

Datasets

  1. Bair. This dataset can be directly downloaded.

  2. Mgif. This dataset can be directly downloaded.

  3. Fashion. Follow the instructions on dataset downloading from the original source.

  4. Taichi. Follow the instructions in data/taichi-loading or instructions from https://github.com/AliaksandrSiarohin/video-preprocessing.

  5. Nemo. Please follow the instructions on how to download the dataset. The dataset should then be preprocessed using the scripts from https://github.com/AliaksandrSiarohin/video-preprocessing.

  6. VoxCeleb. Please follow the instructions from https://github.com/AliaksandrSiarohin/video-preprocessing.

Training on your own dataset

  1. Resize all the videos to the same size, e.g. 256x256; the videos can be in '.gif' or '.mp4' format, or a folder with images. We recommend the latter: for each video, make a separate folder with all the frames in '.png' format. This format is lossless and has better I/O performance (a small conversion sketch follows this list).

  2. Create a folder data/dataset_name with two subfolders, train and test; put the training videos in train and the testing videos in test.

  3. Create a config config/dataset_name.yaml; in dataset_params, specify the root directory (root_dir: data/dataset_name). Also adjust the number of epochs in train_params.
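
A minimal sketch for step 1, dumping a video into a folder of resized PNG frames. The function name and paths are placeholders; imageio and scikit-image are assumed available, as the codebase already uses them.

import os
import imageio
import numpy as np
from skimage.transform import resize

def video_to_frames(video_path, out_dir, size=(256, 256)):
    """Sketch: save a video as a folder of resized lossless .png frames."""
    os.makedirs(out_dir, exist_ok=True)
    reader = imageio.get_reader(video_path)
    for i, frame in enumerate(reader):
        frame = resize(frame, size, anti_aliasing=True)          # float array in [0, 1]
        imageio.imsave(os.path.join(out_dir, '%07d.png' % i),
                       (255 * frame).astype(np.uint8))

video_to_frames('my_video.mp4', 'data/dataset_name/train/my_video')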

Additional notes

Citation:

@InProceedings{Siarohin_2019_NeurIPS,
  author={Siarohin, Aliaksandr and Lathuilière, Stéphane and Tulyakov, Sergey and Ricci, Elisa and Sebe, Nicu},
  title={First Order Motion Model for Image Animation},
  booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
  month = {December},
  year = {2019}
}
Comments
  • AttributeError: module 'yaml' has no attribute 'FullLoader'

    Recent commit causes error here.

    generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml', 
                                checkpoint_path='/content/gdrive/My Drive/first-order-motion-model/vox-cpk.pth.tar')
    
    ----------------------------------------------------
    
    AttributeError                            Traceback (most recent call last)
    <ipython-input-6-dbd18151b569> in <module>()
          1 from demo import load_checkpoints
          2 generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml', 
    ----> 3                             checkpoint_path='/content/gdrive/My Drive/first-order-motion-model/vox-cpk.pth.tar')
    
    /content/first-order-model/demo.py in load_checkpoints(config_path, checkpoint_path)
         24 
         25     with open(config_path) as f:
    ---> 26         config = yaml.load(f, Loader=yaml.FullLoader)
         27 
         28     generator = OcclusionAwareGenerator(**config['model_params']['generator_params'],
    
    AttributeError: module 'yaml' has no attribute 'FullLoader'
    
    
    opened by lookbothways 13
  • Graphic card compatibility

    Hello there!

    My question is whether the program only supports PCs with Nvidia graphics cards, or whether AMD cards are supported as well. Yesterday I encountered an error about not having an Nvidia graphics card; I could not get it to run either natively or in Docker.

    opened by aleksy 11
  • Error when trying to run

    When I tried to run run.py, it gave the following error:

    (base) PS C:\Users\jefer\Desktop\first-order-model-master> python run.py --config config/vox-adv-256_clone.yaml --device_ids 0
    Use predefined train-test split.
    Training...
      0%|                                                                                                                                                                       | 0/150 [00:03<?, ?it/s]
    Traceback (most recent call last):
      File "run.py", line 77, in <module>
        train(config, generator, discriminator, kp_detector, opt.checkpoint, log_dir, dataset, opt.device_ids)
      File "C:\Users\jefer\Desktop\first-order-model-master\train.py", line 50, in train
        for x in dataloader:
      File "C:\Users\jefer\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 345, in __next__
        data = self._next_data()
      File "C:\Users\jefer\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 856, in _next_data
        return self._process_data(data)
      File "C:\Users\jefer\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 881, in _process_data
        data.reraise()
      File "C:\Users\jefer\anaconda3\lib\site-packages\torch\_utils.py", line 394, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "C:\Users\jefer\anaconda3\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
        data = fetcher.fetch(index)
      File "C:\Users\jefer\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\jefer\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\jefer\Desktop\first-order-model-master\frames_dataset.py", line 154, in __getitem__
        return self.dataset[idx % self.dataset.__len__()]
      File "C:\Users\jefer\Desktop\first-order-model-master\frames_dataset.py", line 103, in __getitem__
        path = np.random.choice(glob.glob(os.path.join(self.root_dir, name + '*.mp4')))
      File "mtrand.pyx", line 902, in numpy.random.mtrand.RandomState.choice
    ValueError: 'a' cannot be empty unless no samples are taken
    
    opened by Jefersonwulf 11
  • A size mismatch error on fashion dataset after fixing bug of Imageio using solutions in #197

    Hi, I followed issue #197 to fix the imageio bug by modifying line 44 of the snippet, and ran:

    CUDA_VISIBLE_DEVICES=0,1 python run.py --config config/fashion-256.yaml --device_ids 0,1

    And I got this size-mismatch error on the fashion dataset. Do you have any suggestions?


    opened by JialeTao 9
  • CannotReadFrameError: Could not read frame x: Frame is 0 bytes, but expected x.

    After I load a driving video and a source image I get this error:

    CannotReadFrameError: Could not read frame 860:
    Frame is 0 bytes, but expected 308160.
    === stderr ===
    ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
      built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
      configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
      libavutil      55. 78.100 / 55. 78.100
      libavcodec     57.107.100 / 57.107.100
      libavformat    57. 83.100 / 57. 83.100
      libavdevice    57. 10.100 / 57. 10.100
      libavfilter     6.107.100 /  6.107.100
      libavresample   3.  7.  0 /  3.  7.  0
      libswscale      4.  8.100 /  4.  8.100
      libswresample   2.  9.100 /  2.  9.100
      libpostproc    54.  7.100 / 54.  7.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/content/gdrive/My Drive/first-order-motion-model/hinton.mp4':
      Metadata:
        major_brand     : isom
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        encoder         : Lavf57.83.100
      Duration: 00:00:30.05, start: 0.000000, bitrate: 230 kb/s
        Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 428x240, 112 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
        Metadata:
          handler_name    : VideoHandler
        Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 114 kb/s (default)
        Metadata:
          handler_name    : SoundHandler
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (native) -> rawvideo (native))
    Press [q] to stop, [?] for help
    Output #0, image2pipe, to 'pipe:':
      Metadata:
        major_brand     : isom
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        encoder         : Lavf57.83.100
        Stream #0:0(und): Video: rawvideo (RGB[24] / 0x18424752), rgb24, 428x240, q=2-31, 73958 kb/s, 30 fps, 30 tbn, 30 tbc (default)
    ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
      built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
      configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
      libavutil      55. 78.100 / 55. 78.100
      libavcodec     57.107.100 / 57.107.100
      libavformat    57. 83.100 / 57. 83.100
      libavdevice    57. 10.100 / 57. 10.100
      libavfilter     6.107.100 /  6.107.100
      libavresample   3.  7.  0 /  3.  7.  0
      libswscale      4.  8.100 /  4.  8.100
      libswresample   2.  9.100 /  2.  9.100
      libpostproc    54.  7.100 / 54.  7.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/content/gdrive/My Drive/first-order-motion-model/hinton.mp4':
      Metadata:
        major_brand     : isom
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        encoder         : Lavf57.83.100
      Duration: 00:00:30.05, start: 0.000000, bitrate: 230 kb/s
        Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 428x240, 112 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
        Metadata:
          handler_name    : VideoHandler
        Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 114 kb/s (default)
        Metadata:
          handler_name    : SoundHandler
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (native) -> rawvideo (native))
    Press [q] to stop, [?] for help
    Output #0, image2pipe, to 'pipe:':
      Metadata:
        major_brand     : isom
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        encoder         : Lavf57.83.100
        Stream #0:0(und): Video: rawvideo (RGB[24] / 0x18424752), rgb24, 428x240, q=2-31, 73958 kb/s, 30 fps, 30 tbn, 30 tbc (default)
        Metadata:
          handler_name    : VideoHandler
          encoder         : Lavc57.107.100 rawvideo
    frame=  400 fps=0.0 q=-0.0 size=  120375kB time=00:00:13.33 bitrate=73958.4kbits/s speed=26.7x
    

    This error persists even if I convert the video with the following command and use the result as the driving video:

    !ffmpeg -i /content/gdrive/My\ Drive/first-order-motion-model/anitta.mp4 /content/gdrive/My\ Drive/first-order-motion-model/anitta2.mp4

    I'm stuck now without a clue how to solve this issue; I thought converting the video with ffmpeg would work, but it doesn't.

    opened by Almototo 9
  • Questions about the pre-trained models of vox

    Hi~ First of all, thanks a lot for your awesome work, and may I ask two questions about the pre-trained models of vox?

    1. What are the major differences between vox-cpk.pth.tar and vox-adv-cpk.pth.tar? I see some differences between vox-256.yaml and vox-adv-256.yaml, but am still a little confused.

    2. Which version of the vox dataset was used to train these two models? There are 20,076 video clips in video-preprocessing/vox-metadata.csv, so maybe the two models were trained on these 20,076 video clips?

    Thanks again. Looking forward to your replies.

    opened by Honlan 9
  • Face swap?

    Hello,

    I have seen that on https://aliaksandrsiarohin.github.io/first-order-model-website/ you have published an example of face-swap, but I can't find a description of the method used, either in the paper or here.

    Thanks for reading.

    opened by lorrp1 9
  • Invalid URL 'None': No schema supplied. Perhaps you meant http://None?

    I have a problem: after I run the code and upload the image and the video, there is an error: "MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://none/?" What should I do?

    opened by AbdullahRamadan07 8
  • Regarding Tai-chi dataset license

    Firstly, I'd like to thank the authors for sharing the code and data publicly.

    I had a question regarding the Tai-Chi HD dataset: I noticed that while the paper mentions the dataset is shared publicly, I could not find a mention of the license it is shared under in the paper or the repositories. Could you please help me with this?

    opened by Adithya-MN 8
  • RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 940 and 1254 in dimension 2

    Hi, good job! @AliaksandrSiarohin My command is "CUDA_VISIBLE_DEVICES=0,1 python run.py --config config/fashion-256.yaml --device_ids 0,1". My fashion dataset folder contains images (I don't process the data, e.g. no crop operation).

    And I get the following error:

    Traceback (most recent call last):
      File "run.py", line 81, in <module>
        train(config, generator, discriminator, kp_detector, opt.checkpoint, log_dir, dataset, opt.device_ids)
      File "/remote-home/my/pycharmprojects/first-order-model/train.py", line 50, in train
        for x in dataloader:
      File "/usr/local/miniconda3/envs/animation1/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
        return self._process_next_batch(batch)
      File "/usr/local/miniconda3/envs/animation1/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
        raise batch.exc_type(batch.exc_msg)
    RuntimeError: Traceback (most recent call last):
      File "/usr/local/miniconda3/envs/animation1/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/usr/local/miniconda3/envs/animation1/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 229, in default_collate
        return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
      File "/usr/local/miniconda3/envs/animation1/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 229, in <dictcomp>
        return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
      File "/usr/local/miniconda3/envs/animation1/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 218, in default_collate
        return torch.stack([torch.from_numpy(b) for b in batch], 0)
    RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 940 and 1254 in dimension 2 at /pytorch/aten/src/TH/generic/THTensorMoreMath.cpp:1333

    opened by Jessicall 8
  • Model diverging when training in VoxCeleb

    Hi, thanks for your great work!

    I am trying to retrain the method on VoxCeleb; I don't use the GAN loss. Has anyone tried this and faced the same issue?
    Is there any problem with my training strategy or my dataset? Any advice is appreciated. Below are my results during training.

    [training result screenshots]

    Your reply will be appreciated.

    opened by BaldrLector 8
  • will it work on AMD GPU

    Hey, I have a production company and we have a bunch of Macs with different types of AMD GPUs. I want to know if this would run on them if we, let's say, installed Windows on our Macs.

    opened by yashatishay 0
  • JSONDecodeError

    Couldn't find a solution. [screenshot]

    Here is the text version

    JSONDecodeError                           Traceback (most recent call last)

    <ipython-input> in generate(button)
        410     filename = model.value + ('' if model.value == 'fashion' else '-cpk') + '.pth.tar'
        411     if not os.path.isfile(filename):
    --> 412         download = requests.get(requests.get('https://drive.google.com/drive/folders/1PyQJmkdCsAkOYwUyaj_l-l0as-iLDgeH' + filename).json().get('href'))
        413         with open(filename, 'wb') as checkpoint:
        414             checkpoint.write(download.content)

    3 frames

    /usr/local/lib/python3.8/dist-packages/requests/models.py in json(self, **kwargs)
        896             # used.
        897             pass
    --> 898         return complexjson.loads(self.text, **kwargs)
        899
        900     @property

    /usr/lib/python3.8/json/__init__.py in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
        355             parse_int is None and parse_float is None and
        356             parse_constant is None and object_pairs_hook is None and not kw):
    --> 357         return _default_decoder.decode(s)
        358     if cls is None:
        359         cls = JSONDecoder

    /usr/lib/python3.8/json/decoder.py in decode(self, s, _w)
        335
        336         """
    --> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
        338         end = _w(s, end).end()
        339         if end != len(s):

    /usr/lib/python3.8/json/decoder.py in raw_decode(self, s, idx)
        353             obj, end = self.scan_once(s, idx)
        354         except StopIteration as err:
    --> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
        356         return obj, end

    JSONDecodeError: Expecting value: line 1 column 1 (char 0)

    opened by MoBro23 0
  • Does License preclude using the model outputs?

    Do I own the outputs of the model?

    Basically, are the outputs of the model "adapted material"? I'd assume not, given that they are not derived from the software in the sense usually meant by such derivations (like an improvement or change to the original software).

    a. Adapted Material means material subject to Copyright and Similar Rights that is derived from or based upon the Licensed Material and in which the Licensed Material is translated, altered, arranged, transformed, or otherwise modified in a manner requiring permission under the Copyright and Similar Rights held by the Licensor. For purposes of this Public License, where the Licensed Material is a musical work, performance, or sound recording, Adapted Material is always produced where the Licensed Material is synched in timed relation with a moving image.

    opened by stevesmit 0
  • cannot import name 'pad' from 'skimage.util'

    System: Running Mac M1 13.0.1

    Command being used:

    first-order-model % python3 demo.py --config config/vox-256.yaml --checkpoint checkpoints/vox-cpk.pth.tar --source_image ./assets/source.png --driving_video ./assets/driving.mp4 --cpu

    Result:

    Traceback (most recent call last):
      File "demo.py", line 15, in <module>
        from animate import normalize_kp
      File "/Users/user_name/Documents/GitHub/first-order-model/animate.py", line 7, in <module>
        from frames_dataset import PairedDataset
      File "/Users/user_name/Documents/GitHub/first-order-model/frames_dataset.py", line 10, in <module>
        from augmentation import AllAugmentationTransform
      File "/Users/user_name/Documents/GitHub/first-order-model/augmentation.py", line 12, in <module>
        from skimage.util import pad
    ImportError: cannot import name 'pad' from 'skimage.util' (/opt/miniconda3/envs/firstm/lib/python3.7/site-packages/skimage/util/__init__.py)

    opened by nightshining 4
  • Running on remote CentOS 7 Server

    I'm trying to build a closed web API where users can upload a pic of themselves and I'll make them sing a song like WOMBO does. Anyways, when I try to run it, it fails at the driver:

    When I run:

    docker run -it --rm --gpus all \
      -v $HOME/first-order-model:/app first-order-model \
      python3 demo.py \
      --config config/vox-256.yaml \
      --driving_video input/iko-iko.mp4 \
      --source_image input/iko-iko.jpg  \
      --checkpoint checkpoints/vox-cpk.pth.tar \
      --result_video output/iko-iko.mp4 \
      --relative --adapt_scale
    

    It gives me this error:

    docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
    

    I'm assuming it has to do with nvidia-docker, but I have no clue. Will this project only work on desktop?

    Thanks! Joe

    opened by BOXNYC 0