AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations

Overview



AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity.
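
For example, here is a minimal sketch of both styles, assuming a local image file (the per-modality READMEs & Colab notebooks have fuller examples):

from PIL import Image
import augly.image as imaugs

img = Image.open("your_image.jpg")  # any local image

# Function-based transform:
memed = imaugs.meme_format(img, text="LOL")

# Class-based transforms with the Compose operator; each applied
# transform appends a dict (including its intensity) to `metadata`:
metadata = []
transform = imaugs.Compose([imaugs.Scale(factor=0.5), imaugs.MemeFormat()])
out = transform(img, metadata=metadata)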

AugLy is a great library to utilize for augmenting your data in model training, or to evaluate the robustness gaps of your model! We designed AugLy to include many specific data augmentations that users perform in real life on internet platforms like Facebook's -- for example, making an image into a meme, overlaying text/emojis on images/videos, or reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful to you if you're working on a problem like copy detection, hate speech detection, or copyright infringement, where these "internet user" types of data augmentations are prevalent.

Visual

To see more examples of augmentations, open the Colab notebooks in the README for each modality! (e.g. image README & Colab)

The library is Python-based and requires at least Python 3.6, as we use dataclasses.

Authors

Joanna Bitton — Software Engineer at Facebook AI

Zoe Papakipos — Research Engineer at FAIR

Installation

AugLy is a Python 3.6+ library. It can be installed with:

pip install augly

Or clone AugLy if you want to be able to run our unit tests, contribute a pull request, etc:

git clone [email protected]:facebookresearch/AugLy.git
[Optional, but recommended] conda create -n augly && conda activate augly && conda install pip
pip install -e AugLy/

NOTE: In some environments, pip doesn't install python-magic as expected. In that case, you will need to additionally run:

conda install -c conda-forge python-magic

Or if you aren't using conda:

sudo apt-get install python3-magic

Documentation

To find documentation about each sub-library, please see the READMEs in the respective directories.

Assets

We provide various media assets to use with some of our augmentations. These assets include:

  1. Emojis (Twemoji) - Copyright 2020 Twitter, Inc and other contributors. Code licensed under the MIT License. Graphics licensed under CC-BY 4.0.
  2. Fonts (Noto fonts) - Noto is a trademark of Google Inc. Noto fonts are open source. All Noto fonts are published under the SIL Open Font License, Version 1.1.
  3. Screenshot Templates - Images created by a designer at Facebook specifically to use with AugLy. You can use these with the overlay_onto_screenshot augmentation in both the image and video libraries to make it look like your source image/video was screenshotted in a social media feed similar to Facebook or Instagram (see the sketch below).
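
For instance, a minimal sketch (assuming a local image file; the default bundled template is used when none is specified):

import augly.image as imaugs

# Wrap an image in a social-media-style screenshot frame:
screenshotted = imaugs.overlay_onto_screenshot("your_image.jpg")
screenshotted.show()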

Citation

If you use AugLy in your work, please cite:

@misc{bitton2021augly,
  author =       {Bitton, Joanna and Papakipos, Zoe},
  title =        {AugLy: A data augmentations library for audio, image, text, and video.},
  howpublished = {\url{https://github.com/facebookresearch/AugLy}},
  year =         {2021}
}

License

AugLy is MIT licensed, as found in the LICENSE file. Please note that some of the dependencies AugLy uses may be licensed under different terms.

Comments
  • Final Sphinx documentation w/ ReadTheDocs

    Final Sphinx documentation w/ ReadTheDocs

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)


    What This Is

    This is comprehensive documentation for AugLy, using Sphinx docstring formatting under the hood with a Read the Docs theme, to make augmentation parameters, return types, etc. more readable and user-friendly.

    How It Works

    Sphinx is a documentation generator commonly used by the Python community. It also has its own docstring format.

    Internal AugLy docstrings utilize tags such as @param or @returns for labeling due to internal Facebook convention. However, Sphinx does not recognize these tags, using : instead of @, among other differences.

    Luckily for us, the Python docstring format epytext is very similar to AugLy's (credit @Cloud9c), meaning that we can declare epytext as our docstring format and then convert it to Sphinx format when necessary.

    Another problem: Sphinx requires explicit types labeled in the form of :type and :rtype to display types once documentation is rendered. However, Python type hints (which AugLy uses generously) are not natively supported. Therefore we use a Sphinx extension that autodetects type hints and adds them on the fly.

    In the end, Sphinx uses the module structure specified in docs/source with rST (a markup format similar to Markdown) to generate a table of contents for our final documentation.
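
    For reference, a minimal docs/source/conf.py sketch along these lines (assuming the sphinx-autodoc-typehints extension; the repo's actual config may differ):

    # docs/source/conf.py
    extensions = [
        "sphinx.ext.autodoc",        # pull API docs out of the augly docstrings
        "sphinx_autodoc_typehints",  # render Python type hints as :type:/:rtype:
    ]
    html_theme = "sphinx_rtd_theme"  # the Read the Docs theme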

    How to Build Documentation Locally

    1. Clone the repository: git clone https://github.com/facebookresearch/AugLy
    2. Install all requirements, both the library's and the documentation-specific dependencies: cd docs && pip install -r requirements.txt
    3. Make sure you are in the docs subdirectory. Then run make html to generate documentation. If you want to delete all these later, you can run make clean.
    4. Navigate to docs/build and open index.html

    Generating new documentation later

    1. Sphinx can detect new files added to the augly subdirectory and generate their .rst files accordingly, but detection needs to be triggered manually. Run sphinx-apidoc -o ./source ../augly from the docs directory to do so, and update the toctree in index.rst if necessary.
    2. Edited a docstring and want to see the changes reflected in the published documentation? No worries, this happens automatically; an overview is provided below.

    Integration with ReadTheDocs

    1. This documentation uses Sphinx's ReadTheDocs theme to make it easy to publish the documentation on RTD's site.
    2. A GitHub webhook detects pushes to the repository so the documentation can be rebuilt.
    3. The .readthedocs.yml file specifies the configuration for these builds. ffmpeg and libsndfile1 are C-based system packages required as prerequisites before the requirements in docs/requirements.txt are installed (RTD uses Ubuntu behind the scenes).
    4. Docstrings aren't stored in the docs subdirectory at all; they are read from the source folder, so updating the docstrings in augly/<modality> is sufficient.
    CLA Signed 
    opened by iAdityaEmpire 59
  • Enable to set random ranges to RandomEmojiOverlay parameters

    Enable to set random ranges to RandomEmojiOverlay parameters

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Added randomness to RandomEmojiOverlay parameters such as emoji_size. The motivation is that I wanted to change the parameters randomly on the fly, rather than keep them fixed as in the current implementation. This modified augmentation was actually used in the ISC21 Descriptor Track 1st-place solution.
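
    For illustration, a sketch of the sampling idea (parameter names here are hypothetical, not necessarily the PR's actual API):

    import random

    emoji_size_range = (0.1, 0.3)
    opacity_range = (0.4, 1.0)

    def sample_overlay_params():
        # Re-sampled on every call, so each image gets a different
        # emoji size and opacity instead of a fixed value.
        return {
            "emoji_size": random.uniform(*emoji_size_range),
            "opacity": random.uniform(*opacity_range),
        }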

    Test Results

    All tests pass except for the following error.

    Traceback (most recent call last):
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/transforms_unit_test.py", line 177, in test_RandomEmojiOverlay
        self.evaluate_class(
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/base_unit_test.py", line 129, in evaluate_class
        are_equal_images(dst, ref), "Expected and outputted images do not match"
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/base_unit_test.py", line 20, in are_equal_images
        return a.size == b.size and np.allclose(np.array(a), np.array(b))
      File "<__array_function__ internals>", line 5, in allclose
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2249, in allclose
        res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
      File "<__array_function__ internals>", line 5, in isclose
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2358, in isclose
        return within_tol(x, y, atol, rtol)
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2339, in within_tol
        return less_equal(abs(x-y), atol + rtol * abs(y))
    ValueError: operands could not be broadcast together with shapes (1080,1920,3) (1080,1920,4) 
    
    ----------------------------------------------------------------------
    Ran 71 tests in 35.609s
    
    FAILED (errors=1, skipped=4)
    sys:1: ResourceWarning: unclosed file <_io.BufferedReader name='/Users/shuhei.yokoo/Documents/AugLy/augly/assets/tests/image/inputs/dfdc_1.jpg'>
    

    It seems that the expected image has an alpha channel whereas the output image doesn't. I'm not sure why this happened. Should the expected image be replaced?
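
    One possible way to make the comparison robust to the mode mismatch, as a sketch (not necessarily the fix the maintainers would prefer):

    import numpy as np
    from PIL import Image

    def are_equal_images(a: Image.Image, b: Image.Image) -> bool:
        # Converting both images to a common mode avoids the broadcast
        # error when one image is RGB (3 channels) and the other RGBA (4).
        a, b = a.convert("RGBA"), b.convert("RGBA")
        return a.size == b.size and np.allclose(np.array(a), np.array(b))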

    CLA Signed 
    opened by lyakaap 27
  • Optimize hflip

    Optimize hflip

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    I am trying to speed up horizontal flipping. What I did is use VidGear to execute an ffmpeg command with the ultrafast preset.

    • With my changes, the HFlip tests run in 4.724 seconds (the tests themselves complete in around 2.5 seconds).
    • Without my changes, they run in 12.534 seconds (around 10.5 seconds for the tests themselves).

    I've run each test with and without my changes five times in fish shell with this command: time for i in (seq 5); python -m unittest augly.tests.video_tests.transforms.ffmpeg_test.TransformsVideoUnitTest.test_HFlip; end

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Audio

    python -m unittest discover -s augly/tests/audio_tests/ -p "*"
    

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    Text

    python -m unittest discover -s augly/tests/text_tests/ -p "*"
    

    Video

    python -m unittest discover -s augly/tests/video_tests/ -p "*"
    

    All

    python -m unittest discover -s augly/tests/ -p "*"
    

    Other testing

    If applicable, test your changes and paste the output here. For example, if your changes affect the requirements/installation, then test installing augly in a fresh conda env, then make sure you are able to import augly & run the unit test

    CLA Signed 
    opened by Adib234 27
  • Problem using audio augmentations with tensorflow

    Problem using audio augmentations with tensorflow

    I tried to use the audio augmentations in a TensorFlow project, but I got a "Segmentation fault" error at runtime while importing the modules. In my case it can be reproduced by running just these two lines:

    import tensorflow
    import augly.audio as audaugs
    

    Versions:

    tensorflow-gpu==2.4.1
    augly==0.1.1
    Python 3.8.5
    

    Thank you

    bug 
    opened by mcanan 20
  • Return images same mode (initial commit)

    Return images same mode (initial commit)

    Related Issue

    Fixes #128

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)
    • [x] Add src_mode arg to ret_and_save_image() in image/utils/utils.py (see the sketch after this list)
    • [x] Pass src_mode arg into ret_and_save_image() from every augmentation except convert_color in image/functional.py (e.g. here for apply_lambda)
    • [x] In image test evaluate_class(), assert that the mode of self.img & dst are equal.
    • [x] Run image tests, make sure they all pass: python -m unittest discover -s augly/tests/image_tests/ -p "*"
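
    As a sketch, the first checklist item might look roughly like this (the actual helper in image/utils/utils.py may differ):

    from typing import Optional
    from PIL import Image

    def ret_and_save_image(
        image: Image.Image,
        output_path: Optional[str] = None,
        src_mode: Optional[str] = None,
    ) -> Image.Image:
        # Convert the augmented image back to the source image's mode
        # (e.g. "L", "RGBA") so callers get back the mode they passed in.
        if src_mode is not None:
            image = image.convert(src_mode)
        if output_path is not None:
            image.save(output_path)
        return image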

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    Test Output: n/a

    Other testing

    N/A

    CLA Signed Merged 
    opened by membriux 17
  • Added spatial bbox helper

    Added spatial bbox helper

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Computes the bbox that encloses a white box on a black background for any augmentation.
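
    A sketch of the idea (helper and parameter names are illustrative, not the PR's actual code):

    import numpy as np
    from PIL import Image, ImageDraw

    def spatial_bbox(aug, src_bbox, w=256, h=256):
        # Paint the source bbox white on a black canvas, apply the
        # augmentation, then take the bounds of the surviving white pixels.
        canvas = Image.new("RGB", (w, h), (0, 0, 0))
        ImageDraw.Draw(canvas).rectangle(src_bbox, fill=(255, 255, 255))
        out = np.array(aug(canvas).convert("L"))
        ys, xs = np.nonzero(out > 0)
        return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())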

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    
    Ran 82 tests in 53.014s
    
    OK (skipped=5)
    

    Other testing

    Colab notebook testing the bbox helper → https://colab.research.google.com/drive/1g_0I6f_bv4Wsna6l9jjZrOJ62a4dpu8U#scrollTo=yUczCe6FU9Bs

    CLA Signed 
    opened by membriux 15
  • `black` formatting

    `black` formatting

    Summary: Now our files are all correctly formatted for black, and thus we can run black on our files during code review & not have a million irrelevant changes!

    Followed the steps here: https://fb.prod.workplace.com/groups/pythonfoundation/posts/2990917737888352/

    • Removed aml/augly/ from list of dirs excluded from black formatting in fbsource/tools/arcanist/lint/fbsource-lint-engine.toml
    • Ran arc lint --take BLACK --apply-patches --paths-cmd 'hg files aml/augly/'
    • Fixed type issues in aml/augly/text/fb/augmenters/back_translate/ which were causing linter errors (there were pyre-ignore comments but they were no longer on the correct line post-black, so just added an assert & initialized a var so we no longer need them)
    • Made changes to TARGETS files suggested by linter (moving imports/changing from where some libraries were imported)

    Differential Revision: D31526814

    fb-exported 
    opened by zpapakipos 12
  • Fix for rotating bounding box the wrong way

    Fix for rotating bounding box the wrong way

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    The current method for rotating bounding boxes was bugged because it rotated the bounding box in the wrong direction. This simple sign change fixes it.

    Test case: https://github.com/juliusfrost/AugLy/blob/rotate-bounding-box-jupyter/examples/test_rotate.ipynb
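
    For intuition, a generic sketch of the geometry (not the repo's exact code). In image coordinates the y-axis points down, so the sine terms flip sign relative to standard math coordinates -- exactly the kind of sign this fix corrects:

    import math

    def rotate_point(x, y, cx, cy, degrees):
        # Rotate (x, y) about center (cx, cy) in y-down image coordinates.
        # Rotating all four bbox corners this way and taking the min/max
        # gives the new axis-aligned bounding box.
        theta = math.radians(degrees)
        dx, dy = x - cx, y - cy
        return (
            cx + dx * math.cos(theta) + dy * math.sin(theta),
            cy - dx * math.sin(theta) + dy * math.cos(theta),
        )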

    Unit Tests

    Passes unit tests.

    CLA Signed 
    opened by juliusfrost 11
  • Update resize() to be on par with torchvision speed

    Update resize() to be on par with torchvision speed

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Summary: Refactored resize() from image/functional.py to be on par with torchvision. However, I just have one minor failure in my code. Please advise on where I should look :)

    Test Results

    The following results were acquired on my machine:

    • Augly original (without interpolation)= 0.04475s
    • Augly revised (with interpolation) = 0.02873s
    • torchvision (uses interpolation) = 0.02696s

    Test code → https://colab.research.google.com/drive/14-KZdSGaOaz73OgIS0DZZY4RsS3cJ0rg#scrollTo=xVI_h-1v49lC
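
    The gist of the interpolation change, sketched with PIL (assuming a local input.jpg; the PR's actual diff may differ):

    from PIL import Image

    img = Image.open("input.jpg")
    # Bilinear resampling is typically much faster than higher-quality
    # filters (e.g. LANCZOS) while staying visually close for most inputs.
    resized = img.resize((224, 224), resample=Image.BILINEAR)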

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    
    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    TEST OUTPUT

    ======================================================================
    FAIL: test_Resize (transforms_unit_test.TransformsImageUnitTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Users/macbookpro/Desktop/Github/AugLy/augly/tests/image_tests/transforms_unit_test.py", line 187, in test_Resize
        self.evaluate_class(imaugs.Resize(), fname="resize")
      File "/Users/macbookpro/Desktop/Github/AugLy/augly/tests/image_tests/base_unit_test.py", line 111, in evaluate_class
        self.assertTrue(
    AssertionError: False is not true
    
    ----------------------------------------------------------------------
    Ran 82 tests in 52.735s
    
    FAILED (failures=1, skipped=5)
    
    CLA Signed 
    opened by membriux 10
  • `OSError: unknown freetype error` when using `OverlayText`

    `OSError: unknown freetype error` when using `OverlayText`

    🐛 Bug

    To Reproduce

    Steps to reproduce the behavior:

    1. OverlayText with a combination of specific text [166, 287] and a specific font (NotoSansBengaliUI-Regular.ttf) results in an error.
    2. It appears that even if these two characters (with indices 166 and 287) are separated by any number of other characters, it results in the same error.
    # imports
    import os
    import augly.image as imaugs
    import augly.utils as utils
    from IPython.display import display
    
    # import paths
    from augly.utils.base_paths import (
        EMOJI_DIR,
        FONTS_DIR,
        SCREENSHOT_TEMPLATES_DIR,
    )
    
    # read sample input_img
    input_img_path = os.path.join(
        utils.TEST_URI, "image", "inputs", "dfdc_1.jpg"
    )
    input_img = imaugs.scale(input_img_path, factor=0.2)
    
    # This results in an error
    overlay_text = imaugs.OverlayText(
        text=[166, 287],
        font_file=os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_img)
    
    # This does not result in an error
    overlay_text = imaugs.OverlayText(
        text=[166],
        font_file=os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_img)
    
    # This does not result in an error
    overlay_text = imaugs.OverlayText(
        text=[287],
        font_file=os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_img)
    
    # This does not result in an error
    overlay_text = imaugs.OverlayText(
        text=[166, 287],
        # font_file not specified
    )
    overlay_text(input_img)
    

    Stack trace:

    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    <ipython-input-74-a3f6c232cffd> in <module>
          3     font_file = os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
          4 )
    ----> 5 overlay_text(input_image)
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/transforms.py in __call__(self, image, force, metadata)
         48             return image
         49 
    ---> 50         return self.apply_transform(image, metadata)
         51 
         52     def apply_transform(
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/transforms.py in apply_transform(self, image, metadata)
        920             x_pos=self.x_pos,
        921             y_pos=self.y_pos,
    --> 922             metadata=metadata,
        923         )
        924 
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/functional.py in overlay_text(image, output_path, text, font_file, font_size, opacity, color, x_pos, y_pos, metadata)
       1121         text=text_str,
       1122         fill=(color[0], color[1], color[2], round(opacity * 255)),
    -> 1123         font=font,
       1124     )
       1125 
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageDraw.py in text(self, xy, text, fill, font, anchor, spacing, align, direction, features, language, stroke_width, stroke_fill, embedded_color, *args, **kwargs)
        461             else:
        462                 # Only draw normal text
    --> 463                 draw_text(ink)
        464 
        465     def multiline_text(
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageDraw.py in draw_text(ink, stroke_width, stroke_offset)
        416                     ink=ink,
        417                     *args,
    --> 418                     **kwargs,
        419                 )
        420                 coord = coord[0] + offset[0], coord[1] + offset[1]
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageFont.py in getmask2(self, text, mode, fill, direction, features, language, stroke_width, anchor, ink, *args, **kwargs)
        668         """
        669         size, offset = self.font.getsize(
    --> 670             text, mode, direction, features, language, anchor
        671         )
        672         size = size[0] + stroke_width * 2, size[1] + stroke_width * 2
    
    OSError: unknown freetype error
    

    Expected behavior

    I don't know why this combination of text and font does not work. I would expect there to be no error, since overlaying the same text using another font file does not result in error.

    Environment

    • AugLy Version (e.g., 0.1.2): 0.1.5
    • OS (e.g., Linux): Ubuntu 20.04
    • How you installed AugLy (pip install augly, clone & pip install -e AugLy): pip install augly
    • Python version: 3.7.9
    • Other relevant packages (Tensorflow, etc):

    Additional context

    The same error message can be reproduced using the following combinations as well:

    • text [449, 262] with font file NotoSansThaana-Regular.ttf
    • text [295, 481] with font file NotoSansBengali-Regular.ttf

    dependency bug 
    opened by jkim222383 10
  • increasing highpass efficiency

    increasing highpass efficiency

    Related Issue

    Fixes N/A

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Speeding up high_pass_filter by taking advantage of existing dependencies.

    Previous runtime: 1.686479s
    New runtime: 0.045012s
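
    The PR doesn't show the implementation here, but the v0.2.1 release notes credit torchaudio for this speedup; a plausible sketch:

    import torch
    import torchaudio.functional as F

    waveform = torch.randn(1, 16000)  # 1 second of mono audio at 16 kHz
    # A single vectorized biquad high-pass filter -- one way to get this
    # kind of speedup from a dependency that is already installed.
    filtered = F.highpass_biquad(waveform, sample_rate=16000, cutoff_freq=3000.0)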

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Audio

    python -m unittest discover -s augly/tests/audio_tests/ -p "*"
    ......./home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      mag = np.abs(S).astype(np.float)
    ........../home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      mag = np.abs(S).astype(np.float)
    ..................................................
    ----------------------------------------------------------------------
    Ran 67 tests in 5.665s
    
    OK
    
    

    Other testing

    If applicable, test your changes and paste the output here. For example, if your changes affect the requirements/installation, then test installing augly in a fresh conda env, then make sure you are able to import augly & run the unit test

    CLA Signed 
    opened by iAdityaEmpire 9
  • Paraphrasing using AugLy

    Paraphrasing using AugLy

    🚀 Feature

    As I was going through AugLy, I didn't find anything that can paraphrase a sentence and create 2-3 sentences from one, as XLNet does in the nlpaug library. If this is already available in AugLy, could you please point it out?

    Motivation

    I have little data to train on, so augmentation with paraphrasing will help me create more data and let me train the model.

    Pitch

    I want one sentence to be paraphrased into however many sentences I specify. For example, if I give n=3, the function should produce 3 sentences from 1 sentence that have similar meaning (paraphrased, basically).

    opened by ChiragM-Hexaware 0
  • Missing comparison in paper

    Missing comparison in paper

    Hi AugLy team, Thanks for the great package!

    I have seen your comparison with other libs in the paper and have to highlight a missing point: in the comparison table (screenshot attached), pixelization is not compared with an alternative from albumentations.

    My guess is you didn't find one, so may I suggest looking at https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Downscale? It does pixelization using the default params.

    opened by arsenyinfo 0
  • Disabled transitions if the transition duration is too short.

    Disabled transitions if the transition duration is too short.

    Summary: Fixing a pattern of failures where ffmpeg fails when the transition duration is too short.

    Refactored the concatenation (no transition effect) into a separate function. We now fall back to concatenation when the transition duration is below 0.5 seconds (threshold chosen empirically).
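
    A sketch of the fallback logic described above (helper names are illustrative, not AugLy's actual internals):

    MIN_TRANSITION_DURATION = 0.5  # seconds; chosen empirically per the summary

    def concat_clips(clips):
        ...  # plain concatenation, no transition effect

    def concat_clips_with_transition(clips, transition_duration):
        ...  # concatenation with the transition effect

    def combine_clips(clips, transition_duration):
        # Fall back to plain concatenation when the transition would be
        # too short for ffmpeg to handle reliably.
        if transition_duration < MIN_TRANSITION_DURATION:
            return concat_clips(clips)
        return concat_clips_with_transition(clips, transition_duration)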

    Differential Revision: D37737241

    CLA Signed fb-exported 
    opened by gpostelnicu 1
  • Issue about not specifying the path to the ffmpeg package

    Issue about not specifying the path to the ffmpeg package

    1. Issue

    When I use AugLy to perform data augmentation operations on videos, I encounter this problem:

    Compression Mode is disabled, Kindly enable it to access this function.
    

    But I have installed ffmpeg and configured the environment. Then I tried to debug the code and found a bug.

    When the function add_augmenter() calls WriteGear(), the value of the custom_ffmpeg parameter is not specified. But later in the code, its real value is needed:

    Line 215 in writegear.py:
        
                self.__ffmpeg = get_valid_ffmpeg_path(
                    custom_ffmpeg,
                    self.__os_windows,
                    ffmpeg_download_path=__ffmpeg_download_path,
                    logging=self.__logging,
                )
    

    As a result, in the function get_valid_ffmpeg_path(), the returned value is always False, which causes the program to fail to find the locally downloaded ffmpeg package and forces it to download it again.

    Line 885 in helper.py:
    
    def get_valid_ffmpeg_path(
        custom_ffmpeg="", is_windows=False, ffmpeg_download_path="", logging=False
    ):
        """
        ## get_valid_ffmpeg_path
    
        Validate the given FFmpeg path/binaries, and returns a valid FFmpeg executable path.
    
        Parameters:
            custom_ffmpeg (string): path to custom FFmpeg executables
            is_windows (boolean): is running on Windows OS?
            ffmpeg_download_path (string): FFmpeg static binaries download location _(Windows only)_
            logging (bool): enables logging for its operations
    
        **Returns:** A valid FFmpeg executable path string.
        """
        final_path = ""
        if is_windows:
            # checks if current os is windows
            if custom_ffmpeg:
                # if custom FFmpeg path is given assign to local variable
                final_path += custom_ffmpeg
            else:
                # otherwise auto-download them
                try:
                    if not (ffmpeg_download_path):
                        # otherwise save to Temp Directory
                        import tempfile
    
                        ffmpeg_download_path = tempfile.gettempdir()
    
                    logging and logger.debug(
                        "FFmpeg Windows Download Path: {}".format(ffmpeg_download_path)
                    )
    
                    # download Binaries
                    os_bit = (
                        ("win64" if platform.machine().endswith("64") else "win32")
                        if is_windows
                        else ""
                    )
                    _path = download_ffmpeg_binaries(
                        path=ffmpeg_download_path, os_windows=is_windows, os_bit=os_bit
                    )
                    # assign to local variable
                    final_path += _path
    

    2. Solution

    Give the path to the local ffmpeg package when calling get_valid_ffmpeg_path(), like this:

     self.__ffmpeg = get_valid_ffmpeg_path(
                    "D:/workSoftware/anaconda/envs/augly/Library/bin/",
                    self.__os_windows,
                    ffmpeg_download_path=__ffmpeg_download_path,
                    logging=self.__logging,
                )
    

    Its path can be obtained using the following method:

    import distutils.spawn  # the spawn submodule must be imported explicitly
    distutils.spawn.find_executable('ffmpeg')
    

    This way you won't need to reinstall the ffmpeg package every time you use it.

    opened by ZOMIN28 0
  • mosaic

    mosaic

    Hi, is there any augmentation that produces something like the following image, which looks like a mosaic? I used pixelization, but it doesn't produce this kind of corruption.

    [attached example image]

    Thank you !

    opened by WEIZHIHONG720 0
  • Support For Keypoints?

    Support For Keypoints?

    🚀 Feature

    Can the augmentations track keypoints through transformations? I see you have support for bounding boxes, so is it possible to set the bounding box as a single coordinate and track it?

    Motivation

    Landmark Localization.

    Pitch

    Track x,y coordinates through transformations.
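
    One sketch of the idea, modeling a keypoint as a zero-area bounding box (whether AugLy's bbox handling tolerates degenerate boxes is untested here):

    def keypoint_as_bbox(x, y):
        # A keypoint can be modeled as a zero-area box; any bbox-aware
        # augmentation that returns the transformed box then yields the
        # transformed keypoint as the collapsed box's corners.
        return (x, y, x, y)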

    opened by Schobs 0
Releases(v1.0.0)
  • v1.0.0(Mar 29, 2022)

    Changes

    Text:

    • Fixed return types in the doc strings so all text augmentations are consistent.
    • Preserved whitespace through tokenization/detokenization for all text augmentations, so they are now consistent.

    Image:

    • Fixed bug with bounding boxes in rotate augmentation.

    Overall:

    • Split dependencies by modality so installation will be lighter-weight for most users. See issue https://github.com/facebookresearch/AugLy/issues/208 as well as the README of each modality for more details.
    • Moved the test input/output data out of the main augly folder so it isn't packaged with the pypi package, to make installation lighter-weight.
    Source code(tar.gz)
    Source code(zip)
  • v0.2.1(Dec 17, 2021)

    Changes

    Audio:

    • New augmentations: loop
    • Efficiency improvements: made high_pass_filter & low_pass_filter ~97% faster by using torchaudio

    Image:

    • New augmentations: skew
    • Added bbox computation helper spatial_bbox_helper to make it easier to add new image augmentations & automatically compute the bounding box transformations (e.g. see how we used this for skew here)
    • Efficiency improvements: made resize ~35% faster by defaulting to bilinear interpolation

    Text:

    • Allow multi-word typo replacement
    • Efficiency improvements: made contractions, replace_similar_chars, replace_similar_unicode_chars, replace_upside_down ~40-60% faster using algorithmic improvements

    Video:

    • Efficiency improvements: made 30 of the video augmentations faster using vidgear (a new dependency we added in this release) to execute ffmpeg commands using higher compression rates (e.g. hflip 75% faster, loop 85% faster, remove_audio 96% faster, pixelization 71% faster)

    Overall:

    • Modified internal imports to be Python 3.6-compatible
    • Added error messages to unit tests for easier debugging
    • If you want to see a full report benchmarking the runtimes of all AugLy augmentations versus other libraries, keep an eye out for the AugLy paper, which will be up on Arxiv in January!
    Source code(tar.gz)
    Source code(zip)
  • v0.1.10(Oct 18, 2021)

    Changes

    Image

    • Added bounding box support to all augmentations
    • Images are now returned in the same format they were passed into all augmentations (except convert_color)

    Text

    • New augmentations: swap_gendered_words, merge_words, change_case, contractions
    • Allow for kwarg overriding in __call__() for all augmentations
    • Exposed typo_type param in simulate_typos aug
    • Added ignore_words param to replace_words & swap_gendered_words

    Video

    • New augmentation: augment_audio

    Other

    • Enforce black formatting
    Source code(tar.gz)
    Source code(zip)
  • v0.1.7(Sep 13, 2021)

    Changes

    Image

    • New augmentations: apply_pil_filter, clip_image_size, overlay_onto_background_image, overlay_onto_background_image_with_blurred_mask
    • New unit tests: Compose, overlay_image
    • Fixed color_jitter_intensity
    • Don't modify input image in overlay_stripes
    • Added metadata arg to Compose operator
    • Added support to overlay_text for multi-line text
    • Added resize_src_to_match_template option to overlay_onto_screenshot
    • Improved meme_format error message

    Text

    • New augmentation: insert_whitespace_chars
    • Add metadata arg to Compose operator
    • Added more font options to replace_fun_fonts

    Video

    • Added metadata arg to Compose operator, added unit test
    Source code(tar.gz)
    Source code(zip)
  • v0.1.5(Jul 9, 2021)

  • v0.1.3(Jun 28, 2021)

  • v0.1.2(Jun 22, 2021)
