AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations

Overview



AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity.
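
For example, here is a minimal sketch of both styles, assuming a local image file (the per-modality READMEs & Colab notebooks have fuller examples):

from PIL import Image
import augly.image as imaugs

img = Image.open("your_image.jpg")  # any local image

# Function-based transform:
memed = imaugs.meme_format(img, text="LOL")

# Class-based transforms with the Compose operator; each applied
# transform appends a dict (including its intensity) to `metadata`:
metadata = []
transform = imaugs.Compose([imaugs.Scale(factor=0.5), imaugs.MemeFormat()])
out = transform(img, metadata=metadata)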

AugLy is a great library to utilize for augmenting your data in model training, or to evaluate the robustness gaps of your model! We designed AugLy to include many specific data augmentations that users perform in real life on internet platforms like Facebook's -- for example, making an image into a meme, overlaying text/emojis on images/videos, or reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful to you if you're working on a problem like copy detection, hate speech detection, or copyright infringement, where these "internet user" types of data augmentations are prevalent.

Visual

To see more examples of augmentations, open the Colab notebooks in the README for each modality! (e.g. image README & Colab)

The library is Python-based and requires at least Python 3.6, as we use dataclasses.

Authors

Joanna Bitton — Software Engineer at Facebook AI

Zoe Papakipos — Research Engineer at FAIR

Installation

AugLy is a Python 3.6+ library. It can be installed with:

pip install augly

Or clone AugLy if you want to be able to run our unit tests, contribute a pull request, etc:

git clone [email protected]:facebookresearch/AugLy.git
[Optional, but recommended] conda create -n augly && conda activate augly && conda install pip
pip install -e AugLy/

NOTE: In some environments, pip doesn't install python-magic as expected. In that case, you will need to additionally run:

conda install -c conda-forge python-magic

Or if you aren't using conda:

sudo apt-get install python3-magic

Documentation

To find documentation about each sub-library, please see the READMEs in the respective directories.

Assets

We provide various media assets to use with some of our augmentations. These assets include:

  1. Emojis (Twemoji) - Copyright 2020 Twitter, Inc and other contributors. Code licensed under the MIT License. Graphics licensed under CC-BY 4.0.
  2. Fonts (Noto fonts) - Noto is a trademark of Google Inc. Noto fonts are open source. All Noto fonts are published under the SIL Open Font License, Version 1.1.
  3. Screenshot Templates - Images created by a designer at Facebook specifically to use with AugLy. You can use these with the overlay_onto_screenshot augmentation in both the image and video libraries to make it look like your source image/video was screenshotted in a social media feed similar to Facebook or Instagram (see the sketch below).
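
For instance, a minimal sketch (assuming a local image file; the default bundled template is used when none is specified):

import augly.image as imaugs

# Wrap an image in a social-media-style screenshot frame:
screenshotted = imaugs.overlay_onto_screenshot("your_image.jpg")
screenshotted.show()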

Citation

If you use AugLy in your work, please cite:

@misc{bitton2021augly,
  author =       {Bitton, Joanna and Papakipos, Zoe},
  title =        {AugLy: A data augmentations library for audio, image, text, and video.},
  howpublished = {\url{https://github.com/facebookresearch/AugLy}},
  year =         {2021}
}

License

AugLy is MIT licensed, as found in the LICENSE file. Please note that some of the dependencies AugLy uses may be licensed under different terms.

Comments
  • Final Sphinx documentation w/ ReadTheDocs

    Final Sphinx documentation w/ ReadTheDocs

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)


    What This Is

    This is comprehensive documentation for AugLy, using Sphinx docstring formatting under the hood with a Read the Docs theme, to make augmentation parameters, return types, etc. more readable and user-friendly.

    How It Works

    Sphinx is a documentation generator commonly used by the Python community. It also has its own docstring format.

    Internal AugLy docstrings utilize tags such as @param or @returns for labeling due to internal Facebook convention. However, Sphinx does not recognize these tags, using : instead of @, among other differences.

    Luckily for us, the Python docstring format epytext is very similar to AugLy's (credit @Cloud9c), meaning that we can declare epytext as our docstring format and then convert it to Sphinx format when necessary.

    Another problem: Sphinx requires explicit types labeled in the form of :type and :rtype to display types once documentation is rendered. However, Python type hints (which AugLy uses generously) are not natively supported. Therefore we use a Sphinx extension that autodetects type hints and adds them on the fly.

    In the end, Sphinx uses the module structure specified in docs/source with rST (a markup format similar to Markdown) to generate a table of contents for our final documentation.
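
    For reference, a minimal docs/source/conf.py sketch along these lines (assuming the sphinx-autodoc-typehints extension; the repo's actual config may differ):

    # docs/source/conf.py
    extensions = [
        "sphinx.ext.autodoc",        # pull API docs out of the augly docstrings
        "sphinx_autodoc_typehints",  # render Python type hints as :type:/:rtype:
    ]
    html_theme = "sphinx_rtd_theme"  # the Read the Docs theme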

    How to Build Documentation Locally

    1. Clone the repository: git clone https://github.com/facebookresearch/AugLy
    2. Install all requirements, both the library's and the documentation-specific dependencies: cd docs && pip install -r requirements.txt
    3. Make sure you are in the docs subdirectory. Then run make html to generate documentation. If you want to delete all these later, you can run make clean.
    4. Navigate to docs/build and open index.html

    Generating new documentation later

    1. Sphinx can detect new files added to the augly subdirectory and generate their .rst files accordingly, but detection needs to be triggered manually. Run sphinx-apidoc -o ./source ../augly from the docs directory to do so, and update the toctree in index.rst if necessary.
    2. Edited a docstring and want to see the changes reflected in the published documentation? No worries, this happens automatically; an overview is provided below.

    Integration with ReadTheDocs

    1. This documentation uses Sphinx's ReadTheDocs theme to make it easy to publish the documentation on RTD's site.
    2. A GitHub webhook detects pushes to the repository so the documentation can be rebuilt.
    3. The .readthedocs.yml file specifies the configuration for these builds. ffmpeg and libsndfile1 are C-based system packages required as prerequisites before the requirements in docs/requirements.txt are installed (RTD uses Ubuntu behind the scenes).
    4. Docstrings aren't stored in the docs subdirectory at all; they are read from the source folder, so updating the docstrings in augly/<modality> is sufficient.
    CLA Signed 
    opened by iAdityaEmpire 59
  • Enable to set random ranges to RandomEmojiOverlay parameters

    Enable to set random ranges to RandomEmojiOverlay parameters

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Added randomness to RandomEmojiOverlay parameters such as emoji_size. The motivation is that I wanted to change the parameters randomly on the fly, rather than keep them fixed as in the current implementation. This modified augmentation was actually used in the ISC21 Descriptor Track 1st-place solution.
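
    For illustration, a sketch of the sampling idea (parameter names here are hypothetical, not necessarily the PR's actual API):

    import random

    emoji_size_range = (0.1, 0.3)
    opacity_range = (0.4, 1.0)

    def sample_overlay_params():
        # Re-sampled on every call, so each image gets a different
        # emoji size and opacity instead of a fixed value.
        return {
            "emoji_size": random.uniform(*emoji_size_range),
            "opacity": random.uniform(*opacity_range),
        }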

    Test Results

    All tests pass except for the following error.

    Traceback (most recent call last):
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/transforms_unit_test.py", line 177, in test_RandomEmojiOverlay
        self.evaluate_class(
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/base_unit_test.py", line 129, in evaluate_class
        are_equal_images(dst, ref), "Expected and outputted images do not match"
      File "/Users/shuhei.yokoo/Documents/AugLy/augly/tests/image_tests/base_unit_test.py", line 20, in are_equal_images
        return a.size == b.size and np.allclose(np.array(a), np.array(b))
      File "<__array_function__ internals>", line 5, in allclose
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2249, in allclose
        res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
      File "<__array_function__ internals>", line 5, in isclose
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2358, in isclose
        return within_tol(x, y, atol, rtol)
      File "/Users/shuhei.yokoo/.pyenv/versions/augly/lib/python3.8/site-packages/numpy/core/numeric.py", line 2339, in within_tol
        return less_equal(abs(x-y), atol + rtol * abs(y))
    ValueError: operands could not be broadcast together with shapes (1080,1920,3) (1080,1920,4) 
    
    ----------------------------------------------------------------------
    Ran 71 tests in 35.609s
    
    FAILED (errors=1, skipped=4)
    sys:1: ResourceWarning: unclosed file <_io.BufferedReader name='/Users/shuhei.yokoo/Documents/AugLy/augly/assets/tests/image/inputs/dfdc_1.jpg'>
    

    It seems that the expected image has an alpha channel whereas the output image doesn't. I'm not sure why this happened. Should the expected image be replaced?
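
    One possible way to make the comparison robust to the mode mismatch, as a sketch (not necessarily the fix the maintainers would prefer):

    import numpy as np
    from PIL import Image

    def are_equal_images(a: Image.Image, b: Image.Image) -> bool:
        # Converting both images to a common mode avoids the broadcast
        # error when one image is RGB (3 channels) and the other RGBA (4).
        a, b = a.convert("RGBA"), b.convert("RGBA")
        return a.size == b.size and np.allclose(np.array(a), np.array(b))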

    CLA Signed 
    opened by lyakaap 27
  • Optimize hflip

    Optimize hflip

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    I am trying to speed up horizontal flipping. What I did is use VidGear to execute an ffmpeg command with the ultrafast preset.

    • With my changes, the HFlip tests run in 4.724 seconds (the tests themselves complete in around 2.5 seconds).
    • Without my changes, they run in 12.534 seconds (around 10.5 seconds for the tests themselves).

    I've run each test with and without my changes five times in fish shell with this command: time for i in (seq 5); python -m unittest augly.tests.video_tests.transforms.ffmpeg_test.TransformsVideoUnitTest.test_HFlip; end

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Audio

    python -m unittest discover -s augly/tests/audio_tests/ -p "*"
    

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    Text

    python -m unittest discover -s augly/tests/text_tests/ -p "*"
    

    Video

    python -m unittest discover -s augly/tests/video_tests/ -p "*"
    

    All

    python -m unittest discover -s augly/tests/ -p "*"
    

    Other testing

    If applicable, test your changes and paste the output here. For example, if your changes affect the requirements/installation, then test installing augly in a fresh conda env, then make sure you are able to import augly & run the unit test

    CLA Signed 
    opened by Adib234 27
  • Problem using audio augmentations with tensorflow

    Problem using audio augmentations with tensorflow

    I tried to use the audio augmentations in a TensorFlow project, but I got a "Segmentation fault" error at runtime while importing the modules. In my case it can be reproduced by running just these two lines:

    import tensorflow
    import augly.audio as audaugs
    

    Versions:

    tensorflow-gpu==2.4.1
    augly==0.1.1
    Python 3.8.5
    

    Thank you

    bug 
    opened by mcanan 20
  • Return images same mode (initial commit)

    Return images same mode (initial commit)

    Related Issue

    Fixes #128

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)
    • [x] Add src_mode arg to ret_and_save_image() in image/utils/utils.py (see the sketch after this list)
    • [x] Pass src_mode arg into ret_and_save_image() from every augmentation except convert_color in image/functional.py (e.g. here for apply_lambda)
    • [x] In image test evaluate_class(), assert that the mode of self.img & dst are equal.
    • [x] Run image tests, make sure they all pass: python -m unittest discover -s augly/tests/image_tests/ -p "*"
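
    As a sketch, the first checklist item might look roughly like this (the actual helper in image/utils/utils.py may differ):

    from typing import Optional
    from PIL import Image

    def ret_and_save_image(
        image: Image.Image,
        output_path: Optional[str] = None,
        src_mode: Optional[str] = None,
    ) -> Image.Image:
        # Convert the augmented image back to the source image's mode
        # (e.g. "L", "RGBA") so callers get back the mode they passed in.
        if src_mode is not None:
            image = image.convert(src_mode)
        if output_path is not None:
            image.save(output_path)
        return image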

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    Test Output: n/a

    Other testing

    N/A

    CLA Signed Merged 
    opened by membriux 17
  • Added spatial bbox helper

    Added spatial bbox helper

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Computes the bbox that encloses a white box on a black background for any augmentation.
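
    A sketch of the idea (helper and parameter names are illustrative, not the PR's actual code):

    import numpy as np
    from PIL import Image, ImageDraw

    def spatial_bbox(aug, src_bbox, w=256, h=256):
        # Paint the source bbox white on a black canvas, apply the
        # augmentation, then take the bounds of the surviving white pixels.
        canvas = Image.new("RGB", (w, h), (0, 0, 0))
        ImageDraw.Draw(canvas).rectangle(src_bbox, fill=(255, 255, 255))
        out = np.array(aug(canvas).convert("L"))
        ys, xs = np.nonzero(out > 0)
        return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())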

    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    
    Ran 82 tests in 53.014s
    
    OK (skipped=5)
    

    Other testing

    Colab notebook testing the bbox helper → https://colab.research.google.com/drive/1g_0I6f_bv4Wsna6l9jjZrOJ62a4dpu8U#scrollTo=yUczCe6FU9Bs

    CLA Signed 
    opened by membriux 15
  • `black` formatting

    `black` formatting

    Summary: Now our files are all correctly formatted for black, and thus we can run black on our files during code review & not have a million irrelevant changes!

    Followed the steps here: https://fb.prod.workplace.com/groups/pythonfoundation/posts/2990917737888352/

    • Removed aml/augly/ from list of dirs excluded from black formatting in fbsource/tools/arcanist/lint/fbsource-lint-engine.toml
    • Ran arc lint --take BLACK --apply-patches --paths-cmd 'hg files aml/augly/'
    • Fixed type issues in aml/augly/text/fb/augmenters/back_translate/ which were causing linter errors (there were pyre-ignore comments but they were no longer on the correct line post-black, so just added an assert & initialized a var so we no longer need them)
    • Made changes to TARGETS files suggested by linter (moving imports/changing from where some libraries were imported)

    Differential Revision: D31526814

    fb-exported 
    opened by zpapakipos 12
  • Fix for rotating bounding box the wrong way

    Fix for rotating bounding box the wrong way

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    The current method for rotating bounding boxes was bugged because it rotated the bounding box in the wrong direction. This simple sign change fixes it.

    Test case: https://github.com/juliusfrost/AugLy/blob/rotate-bounding-box-jupyter/examples/test_rotate.ipynb
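
    For intuition, a generic sketch of the geometry (not the repo's exact code). In image coordinates the y-axis points down, so the sine terms flip sign relative to standard math coordinates -- exactly the kind of sign this fix corrects:

    import math

    def rotate_point(x, y, cx, cy, degrees):
        # Rotate (x, y) about center (cx, cy) in y-down image coordinates.
        # Rotating all four bbox corners this way and taking the min/max
        # gives the new axis-aligned bounding box.
        theta = math.radians(degrees)
        dx, dy = x - cx, y - cy
        return (
            cx + dx * math.cos(theta) + dy * math.sin(theta),
            cy - dx * math.sin(theta) + dy * math.cos(theta),
        )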

    Unit Tests

    Passes unit tests.

    CLA Signed 
    opened by juliusfrost 11
  • Update resize() to be on par with torchvision speed

    Update resize() to be on par with torchvision speed

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Summary: Refactored resize() from image/functional.py to be on par with torchvision. However, I just have one minor failure in my code. Please advise on where I should look :)

    Test Results

    The following results were acquired on my machine:

    • Augly original (without interpolation)= 0.04475s
    • Augly revised (with interpolation) = 0.02873s
    • torchvision (uses interpolation) = 0.02696s

    Test code → https://colab.research.google.com/drive/14-KZdSGaOaz73OgIS0DZZY4RsS3cJ0rg#scrollTo=xVI_h-1v49lC
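
    The gist of the interpolation change, sketched with PIL (assuming a local input.jpg; the PR's actual diff may differ):

    from PIL import Image

    img = Image.open("input.jpg")
    # Bilinear resampling is typically much faster than higher-quality
    # filters (e.g. LANCZOS) while staying visually close for most inputs.
    resized = img.resize((224, 224), resample=Image.BILINEAR)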

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    
    Image

    python -m unittest discover -s augly/tests/image_tests/ -p "*_test.py"
    # Or `python -m unittest discover -s augly/tests/image_tests/ -p "*.py"` to run pytorch test too (must install `torchvision` to run)
    

    TEST OUTPUT

    ======================================================================
    FAIL: test_Resize (transforms_unit_test.TransformsImageUnitTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Users/macbookpro/Desktop/Github/AugLy/augly/tests/image_tests/transforms_unit_test.py", line 187, in test_Resize
        self.evaluate_class(imaugs.Resize(), fname="resize")
      File "/Users/macbookpro/Desktop/Github/AugLy/augly/tests/image_tests/base_unit_test.py", line 111, in evaluate_class
        self.assertTrue(
    AssertionError: False is not true
    
    ----------------------------------------------------------------------
    Ran 82 tests in 52.735s
    
    FAILED (failures=1, skipped=5)
    
    CLA Signed 
    opened by membriux 10
  • `OSError: unknown freetype error` when using `OverlayText`

    `OSError: unknown freetype error` when using `OverlayText`

    🐛 Bug

    To Reproduce

    Steps to reproduce the behavior:

    1. OverlayText with a combination of specific text [166, 287] and a specific font (NotoSansBengaliUI-Regular.ttf) results in an error.
    2. It appears that even if these two characters (with indices 166 and 287) are separated by any number of other characters, it results in the same error.
    # imports
    import os
    import augly.image as imaugs
    import augly.utils as utils
    from IPython.display import display
    
    # import paths
    from augly.utils.base_paths import (
        EMOJI_DIR,
        FONTS_DIR,
        SCREENSHOT_TEMPLATES_DIR,
    )
    
    # read sample input_img
    input_img_path = os.path.join(
        utils.TEST_URI, "image", "inputs", "dfdc_1.jpg"
    )
    input_img = imaugs.scale(input_img_path, factor=0.2)
    
    # This results in an error
    overlay_text = imaugs.OverlayText(
        text=[166, 287],
        font_file=os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_img)
    
    # This does not result in an error
    overlay_text = imaugs.OverlayText(
        text=[166],
        font_file=os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_img)
    
    # This does not result in an error
    overlay_text = imaugs.OverlayText(
        text=[287],
        font_file=os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
    )
    overlay_text(input_img)
    
    # This does not result in an error
    overlay_text = imaugs.OverlayText(
        text=[166, 287],
        # font_file not specified
    )
    overlay_text(input_img)
    

    Stack trace:

    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    <ipython-input-74-a3f6c232cffd> in <module>
          3     font_file = os.path.join(FONTS_DIR, "NotoSansBengaliUI-Regular.ttf"),
          4 )
    ----> 5 overlay_text(input_image)
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/transforms.py in __call__(self, image, force, metadata)
         48             return image
         49 
    ---> 50         return self.apply_transform(image, metadata)
         51 
         52     def apply_transform(
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/transforms.py in apply_transform(self, image, metadata)
        920             x_pos=self.x_pos,
        921             y_pos=self.y_pos,
    --> 922             metadata=metadata,
        923         )
        924 
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/augly/image/functional.py in overlay_text(image, output_path, text, font_file, font_size, opacity, color, x_pos, y_pos, metadata)
       1121         text=text_str,
       1122         fill=(color[0], color[1], color[2], round(opacity * 255)),
    -> 1123         font=font,
       1124     )
       1125 
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageDraw.py in text(self, xy, text, fill, font, anchor, spacing, align, direction, features, language, stroke_width, stroke_fill, embedded_color, *args, **kwargs)
        461             else:
        462                 # Only draw normal text
    --> 463                 draw_text(ink)
        464 
        465     def multiline_text(
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageDraw.py in draw_text(ink, stroke_width, stroke_offset)
        416                     ink=ink,
        417                     *args,
    --> 418                     **kwargs,
        419                 )
        420                 coord = coord[0] + offset[0], coord[1] + offset[1]
    
    ~/miniconda3/envs/mds572/lib/python3.7/site-packages/PIL/ImageFont.py in getmask2(self, text, mode, fill, direction, features, language, stroke_width, anchor, ink, *args, **kwargs)
        668         """
        669         size, offset = self.font.getsize(
    --> 670             text, mode, direction, features, language, anchor
        671         )
        672         size = size[0] + stroke_width * 2, size[1] + stroke_width * 2
    
    OSError: unknown freetype error
    

    Expected behavior

    I don't know why this combination of text and font does not work. I would expect there to be no error, since overlaying the same text using another font file does not result in error.

    Environment

    • AugLy Version (e.g., 0.1.2): 0.1.5
    • OS (e.g., Linux): Ubuntu 20.04
    • How you installed AugLy (pip install augly, clone & pip install -e AugLy): pip install augly
    • Python version: 3.7.9
    • Other relevant packages (Tensorflow, etc):

    Additional context

    The same error message can be reproduced using the following combinations as well:

    • text [449, 262] with font file NotoSansThaana-Regular.ttf
    • text [295, 481] with font file NotoSansBengali-Regular.ttf

    dependency bug 
    opened by jkim222383 10
  • increasing highpass efficiency

    increasing highpass efficiency

    Related Issue

    Fixes N/A

    Summary

    • [x] I have read CONTRIBUTING.md to understand how to contribute to this repository :)

    Speeding up high_pass_filter by taking advantage of existing dependencies.

    Previous runtime: 1.686479s
    New runtime: 0.045012s
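
    The PR doesn't show the implementation here, but the v0.2.1 release notes credit torchaudio for this speedup; a plausible sketch:

    import torch
    import torchaudio.functional as F

    waveform = torch.randn(1, 16000)  # 1 second of mono audio at 16 kHz
    # A single vectorized biquad high-pass filter -- one way to get this
    # kind of speedup from a dependency that is already installed.
    filtered = F.highpass_biquad(waveform, sample_rate=16000, cutoff_freq=3000.0)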

    Unit Tests

    If your changes touch the audio module, please run all of the audio tests and paste the output here. Likewise for image, text, & video. If your changes could affect behavior in multiple modules, please run the tests for all potentially affected modules. If you are unsure of which modules might be affected by your changes, please just run all the unit tests.

    Audio

    python -m unittest discover -s augly/tests/audio_tests/ -p "*"
    ......./home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      mag = np.abs(S).astype(np.float)
    ........../home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:2099: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      np.dtype(np.float): np.complex,
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/core/spectrum.py:1223: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      time_steps = np.arange(0, D.shape[1], rate, dtype=np.float)
    /home/adityaprasad/.local/lib/python3.8/site-packages/librosa/util/utils.py:869: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      mag = np.abs(S).astype(np.float)
    ..................................................
    ----------------------------------------------------------------------
    Ran 67 tests in 5.665s
    
    OK
    
    

    Other testing

    If applicable, test your changes and paste the output here. For example, if your changes affect the requirements/installation, then test installing augly in a fresh conda env, then make sure you are able to import augly & run the unit test

    CLA Signed 
    opened by iAdityaEmpire 9
  • Paraphrasing using AugLy

    Paraphrasing using AugLy

    🚀 Feature

    As I was going through AugLy, I didn't find anything that can paraphrase a sentence and create 2-3 sentences from one, as XLNet does in the nlpaug library. If this is already available in AugLy, could you please point it out?

    Motivation

    I have little data to train on, so augmentation with paraphrasing will help me create more data and let me train the model.

    Pitch

    I want one sentence to be paraphrased into however many sentences I specify. For example, if I give n=3, the function should produce 3 sentences from 1 sentence that have similar meaning (paraphrased, basically).

    opened by ChiragM-Hexaware 0
  • Missing comparison in paper

    Missing comparison in paper

    Hi AugLy team, Thanks for the great package!

    I have seen your comparison with other libs in the paper and have to highlight a missing point: in the comparison table (screenshot attached), pixelization is not compared with an alternative from albumentations.

    My guess is you didn't find one, so may I suggest looking at https://albumentations.ai/docs/api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Downscale? It does pixelization using the default params.

    opened by arsenyinfo 0
  • Disabled transitions if the transition duration is too short.

    Disabled transitions if the transition duration is too short.

    Summary: Fixing a pattern of failures where ffmpeg fails when the transition duration is too short.

    Refactored the concatenation (no transition effect) into a separate function. We now fall back to concatenation when the transition duration is below 0.5 seconds (threshold chosen empirically).
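
    A sketch of the fallback logic described above (helper names are illustrative, not AugLy's actual internals):

    MIN_TRANSITION_DURATION = 0.5  # seconds; chosen empirically per the summary

    def concat_clips(clips):
        ...  # plain concatenation, no transition effect

    def concat_clips_with_transition(clips, transition_duration):
        ...  # concatenation with the transition effect

    def combine_clips(clips, transition_duration):
        # Fall back to plain concatenation when the transition would be
        # too short for ffmpeg to handle reliably.
        if transition_duration < MIN_TRANSITION_DURATION:
            return concat_clips(clips)
        return concat_clips_with_transition(clips, transition_duration)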

    Differential Revision: D37737241

    CLA Signed fb-exported 
    opened by gpostelnicu 1
  • Issue about not specifying the path to the ffmpeg package

    Issue about not specifying the path to the ffmpeg package

    1. Issue

    When I use AugLy to perform data augmentation operations on videos, I encounter this problem:

    Compression Mode is disabled, Kindly enable it to access this function.
    

    But I have installed ffmpeg and configured the environment. Then I tried to debug the code and found a bug.

    When the function add_augmenter() calls WriteGear(), the value of the custom_ffmpeg parameter is not specified. But later in the code, its real value is needed:

    Line 215 in writegear.py:
        
                self.__ffmpeg = get_valid_ffmpeg_path(
                    custom_ffmpeg,
                    self.__os_windows,
                    ffmpeg_download_path=__ffmpeg_download_path,
                    logging=self.__logging,
                )
    

    As a result, in the function get_valid_ffmpeg_path(), the returned value is always False, which causes the program to fail to find the locally downloaded ffmpeg package and forces it to download it again.

    Line 885 in helper.py:
    
    def get_valid_ffmpeg_path(
        custom_ffmpeg="", is_windows=False, ffmpeg_download_path="", logging=False
    ):
        """
        ## get_valid_ffmpeg_path
    
        Validate the given FFmpeg path/binaries, and returns a valid FFmpeg executable path.
    
        Parameters:
            custom_ffmpeg (string): path to custom FFmpeg executables
            is_windows (boolean): is running on Windows OS?
            ffmpeg_download_path (string): FFmpeg static binaries download location _(Windows only)_
            logging (bool): enables logging for its operations
    
        **Returns:** A valid FFmpeg executable path string.
        """
        final_path = ""
        if is_windows:
            # checks if current os is windows
            if custom_ffmpeg:
                # if custom FFmpeg path is given assign to local variable
                final_path += custom_ffmpeg
            else:
                # otherwise auto-download them
                try:
                    if not (ffmpeg_download_path):
                        # otherwise save to Temp Directory
                        import tempfile
    
                        ffmpeg_download_path = tempfile.gettempdir()
    
                    logging and logger.debug(
                        "FFmpeg Windows Download Path: {}".format(ffmpeg_download_path)
                    )
    
                    # download Binaries
                    os_bit = (
                        ("win64" if platform.machine().endswith("64") else "win32")
                        if is_windows
                        else ""
                    )
                    _path = download_ffmpeg_binaries(
                        path=ffmpeg_download_path, os_windows=is_windows, os_bit=os_bit
                    )
                    # assign to local variable
                    final_path += _path
    

    2. Solution

    Give the path to the local ffmpeg package when calling get_valid_ffmpeg_path(), like this:

     self.__ffmpeg = get_valid_ffmpeg_path(
                    "D:/workSoftware/anaconda/envs/augly/Library/bin/",
                    self.__os_windows,
                    ffmpeg_download_path=__ffmpeg_download_path,
                    logging=self.__logging,
                )
    

    Its path can be obtained using the following method:

    import distutils.spawn  # the spawn submodule must be imported explicitly
    distutils.spawn.find_executable('ffmpeg')
    

    This way you won't need to reinstall the ffmpeg package every time you use it.

    opened by ZOMIN28 0
  • mosaic

    mosaic

    Hi, is there any augmentation that produces something like the following image, which looks like a mosaic? I used pixelization, but it doesn't produce this kind of corruption.

    [attached example image]

    Thank you !

    opened by WEIZHIHONG720 0
  • Support For Keypoints?

    Support For Keypoints?

    🚀 Feature

    Can the augmentations track keypoints through transformations? I see you have support for bounding boxes, so is it possible to set the bounding box as a single coordinate and track it?

    Motivation

    Landmark Localization.

    Pitch

    Track x,y coordinates through transformations.
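
    One sketch of the idea, modeling a keypoint as a zero-area bounding box (whether AugLy's bbox handling tolerates degenerate boxes is untested here):

    def keypoint_as_bbox(x, y):
        # A keypoint can be modeled as a zero-area box; any bbox-aware
        # augmentation that returns the transformed box then yields the
        # transformed keypoint as the collapsed box's corners.
        return (x, y, x, y)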

    opened by Schobs 0
Releases(v1.0.0)
  • v1.0.0(Mar 29, 2022)

    Changes

    Text:

    • Fixed return types in the doc strings so all text augmentations are consistent.
    • Preserved whitespace through tokenization/detokenization for all text augmentations, so they are now consistent.

    Image:

    • Fixed bug with bounding boxes in rotate augmentation.

    Overall:

    • Split dependencies by modality so installation will be lighter-weight for most users. See issue https://github.com/facebookresearch/AugLy/issues/208 as well as the README of each modality for more details.
    • Moved the test input/output data out of the main augly folder so it isn't packaged with the pypi package, to make installation lighter-weight.
    Source code(tar.gz)
    Source code(zip)
  • v0.2.1(Dec 17, 2021)

    Changes

    Audio:

    • New augmentations: loop
    • Efficiency improvements: made high_pass_filter & low_pass_filter ~97% faster by using torchaudio

    Image:

    • New augmentations: skew
    • Added bbox computation helper spatial_bbox_helper to make it easier to add new image augmentations & automatically compute the bounding box transformations (e.g. see how we used this for skew here)
    • Efficiency improvements: made resize ~35% faster by defaulting to bilinear interpolation

    Text:

    • Allow multi-word typo replacement
    • Efficiency improvements: made contractions, replace_similar_chars, replace_similar_unicode_chars, replace_upside_down ~40-60% faster using algorithmic improvements

    Video:

    • Efficiency improvements: made 30 of the video augmentations faster using vidgear (a new dependency we added in this release) to execute ffmpeg commands using higher compression rates (e.g. hflip 75% faster, loop 85% faster, remove_audio 96% faster, pixelization 71% faster)

    Overall:

    • Modified internal imports to be Python 3.6-compatible
    • Added error messages to unit tests for easier debugging
    • If you want to see a full report benchmarking the runtimes of all AugLy augmentations versus other libraries, keep an eye out for the AugLy paper, which will be up on Arxiv in January!
    Source code(tar.gz)
    Source code(zip)
  • v0.1.10(Oct 18, 2021)

    Changes

    Image

    • Added bounding box support to all augmentations
    • Images are now returned in the same format they were passed into all augmentations (except convert_color)

    Text

    • New augmentations: swap_gendered_words, merge_words, change_case, contractions
    • Allow for kwarg overriding in __call__() for all augmentations
    • Exposed typo_type param in simulate_typos aug
    • Added ignore_words param to replace_words & swap_gendered_words

    Video

    • New augmentation: augment_audio

    Other

    • Enforce black formatting
    Source code(tar.gz)
    Source code(zip)
  • v0.1.7(Sep 13, 2021)

    Changes

    Image

    • New augmentations: apply_pil_filter, clip_image_size, overlay_onto_background_image, overlay_onto_background_image_with_blurred_mask
    • New unit tests: Compose, overlay_image
    • Fixed color_jitter_intensity
    • Don't modify input image in overlay_stripes
    • Added metadata arg to Compose operator
    • Added support to overlay_text for multi-line text
    • Added resize_src_to_match_template option to overlay_onto_screenshot
    • Improved meme_format error message

    Text

    • New augmentation: insert_whitespace_chars
    • Add metadata arg to Compose operator
    • Added more font options to replace_fun_fonts

    Video

    • Added metadata arg to Compose operator, added unit test
    Source code(tar.gz)
    Source code(zip)
  • v0.1.5(Jul 9, 2021)

  • v0.1.3(Jun 28, 2021)

  • v0.1.2(Jun 22, 2021)
