Python module for handling audio metadata

Quod Libet

Last update: Dec 31, 2022

Related tags

Audio python music mp4 mp3 tagging id3v2 id3v1 flac id3 ogg opus apev2

Overview

Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg Theora, Ogg Vorbis, True Audio, WavPack, OptimFROG, and AIFF audio files. All versions of ID3v2 are supported, and all standard ID3v2.4 frames are parsed. It can read Xing headers to accurately calculate the bitrate and length of MP3s. ID3 and APEv2 tags can be edited regardless of audio format. It can also manipulate Ogg streams on an individual packet/page level.

Mutagen works with Python 3.6+ (CPython and PyPy) on Linux, Windows and macOS, and has no dependencies outside the Python standard library. Mutagen is licensed under the GPL version 2 or later.

For more information visit https://mutagen.readthedocs.org

Comments

Lame VBR Preset

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)

From [email protected] on June 03, 2010 23:04:55

I've gone through all the Python libraries, and even other programs and libraries, but I can't find anything that can figure out the VBR preset used by LAME on an MP3.

The only program I know of that does it is Mr. QuestionMan (http://www.burrrn.net/?page_id=5), and it works very well. However, I want to write renaming scripts and similar tools that need to parse MP3s and find that information. Sadly, I can't find any way of doing this.

It would be really great if you guys could implement it. I'm not sure why no one has made anything similar. If they can do it, I'm guessing it's possible? Unless they guess it from the average bitrate, but I personally doubt that.

And I'm speaking about the V0-V9 settings, of course.

Original issue: http://code.google.com/p/mutagen/issues/detail?id=66

Bitbucket: https://bitbucket.org/lazka/mutagen/issue/66

enhancement

opened by lazka 39

Python 3.0 Support

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)

From [email protected] on September 23, 2009 13:03:19

Not really an issue, but it would be great to see support for Python 3.0. Or is there support and I'm too n00b to figure out how to work it?

Original issue: http://code.google.com/p/mutagen/issues/detail?id=27

Bitbucket: https://bitbucket.org/lazka/mutagen/issue/27

enhancement

opened by lazka 39

FileType, Metadata: File-like object support

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)

From [email protected] on June 15, 2009 07:49:19

FileType and Metadata subclasses should support loading from file-like objects as well as filenames. We can probably restrict this to FLOs that support random access.
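The random-access restriction mentioned above could be checked up front. The following is a minimal sketch using only the standard io protocol; supports_random_access is a hypothetical helper name, not part of mutagen.

```python
import io


def supports_random_access(fileobj):
    """Return True if fileobj is an io-style object that can be read and seeked."""
    try:
        return bool(fileobj.readable() and fileobj.seekable())
    except AttributeError:
        # Not an io-style object at all (e.g. a plain filename or None).
        return False


# An in-memory buffer supports random access and could be handed to a parser.
buffer = io.BytesIO(b"fLaC...")
print(supports_random_access(buffer))
```

Streams such as pipes or sockets report seekable() as False, so a loader could reject them early with a clear error rather than failing halfway through parsing.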

Original issue: http://code.google.com/p/mutagen/issues/detail?id=1

Bitbucket: https://bitbucket.org/lazka/mutagen/issue/1

enhancement

opened by lazka 36

RIFF/WAVE support, using ID3v2

Initial implementation of RIFF/WAVE format.

Currently using ID3v2 tag chunk to store metadata.

Storing metadata in the RIFF/INFO tag is still something which needs to be done.
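As a rough illustration of the container layout involved, the sketch below walks the top-level chunks of a RIFF/WAVE stream using only the standard library. iter_riff_chunks is a hypothetical helper, not mutagen's API, and the chunk ID used for the ID3v2 tag in the example data (b"id3 ") is an assumption.

```python
import io
import struct


def iter_riff_chunks(fileobj):
    """Yield (chunk_id, data_offset, size) for each top-level RIFF/WAVE chunk."""
    header = fileobj.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        chunk_header = fileobj.read(8)
        if len(chunk_header) < 8:
            return  # end of stream
        # Each chunk starts with a 4-byte ID and a little-endian 32-bit size.
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        offset = fileobj.tell()
        yield chunk_id, offset, size
        # Chunk data is padded to an even number of bytes.
        fileobj.seek(offset + size + (size & 1))


# Example with synthetic data; the b"id3 " chunk ID is an assumption.
payload = (b"WAVE" + b"fmt " + struct.pack("<I", 3) + b"abc\x00"
           + b"id3 " + struct.pack("<I", 2) + b"xy")
stream = io.BytesIO(b"RIFF" + struct.pack("<I", len(payload)) + payload)
for chunk_id, offset, size in iter_riff_chunks(stream):
    print(chunk_id, offset, size)
```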

Related issues:

#207: Support RIFF INFO chunk metadata (AVI, WAV, XMA, xWMA, RMI, DLS)

PICARD-1128: Support Microsoft WAVE format (RIFF/WAVE) tagging

Borewit/music-metadata#19

opened by Borewit 23

Picard slowness over Windows network share

Using Picard over a Windows 10 network share I have been getting very poor performance. Windows task manager shows open handles growing.

This issue is to log the possibility of a file handle leak in Mutagen creating these performance problems.

The ongoing discussion about Picard is in ticket PICARD-744. For MP3 files, it looks to me like the code reads the header to determine the ID3 data size and, at ID3/_file.py:164, calls ._util.py:read_full, which reads the rest of the ID3 data (using the size in the header). It does NOT read to the end of the file (which would implicitly close the file), nor does it close the file explicitly.

I haven't looked at other formats, but this may apply to them as well as ID3.

Do we need to explicitly close the file at the end of the load method?
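To make the question concrete, here is a minimal sketch of the read pattern described above (header first, then exactly the number of bytes the header announces) with the handle closed deterministically by a with block. read_id3v2_block is a hypothetical function written for illustration; it is not mutagen's loader.

```python
def read_id3v2_block(path):
    """Return the raw ID3v2 tag body of `path`, or None if no ID3v2 header exists."""
    # The with block closes the handle on exit, even if an exception is raised,
    # so nothing depends on reading to the end of the file.
    with open(path, "rb") as fobj:
        header = fobj.read(10)
        if len(header) < 10 or header[:3] != b"ID3":
            return None
        # The tag size is a "synchsafe" integer: 4 bytes carrying 7 bits each.
        size = 0
        for byte in header[6:10]:
            size = (size << 7) | (byte & 0x7F)
        body = fobj.read(size)
        if len(body) != size:
            raise EOFError("truncated ID3v2 tag")
        return body
```

Whether the handle growth seen in Task Manager actually comes from a missing close is not established here; the sketch only shows what an explicit close at the end of loading would look like.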
needinfo

opened by Sophist-UK 19

Add support for writing ID3v2.3 tags

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)

From lalinsky on March 29, 2011 16:43:44

Mutagen has always supported saving only ID3v2.4 tags. When I considered using it in Picard, I implemented ID3v2.3 support by subclassing the ID3 class. I'd like to get this code included in Mutagen, so that we do not have to watch Mutagen for changes that could possibly make it incompatible. The code has been in use for a couple of years now (and since the majority of Picard users are on Windows, they basically have to use ID3v2.3), so I'm pretty confident it's safe.

Attachment: id3v23.diff

Original issue: http://code.google.com/p/mutagen/issues/detail?id=85

Bitbucket: https://bitbucket.org/lazka/mutagen/issue/85

enhancement

opened by lazka 18

Mutagen is unusable on ZFS

When I moved to ZFS I found that beets became unusably slow, to the point where scrubbing tags in a single FLAC was taking 10 or more minutes. I opened an issue, https://github.com/beetbox/beets/issues/3642, to investigate and eventually tracked it down to this portion of mutagen code:

 --- modulename: flac, funcname: _writeblock
flac.py(124):         data = bytearray()
flac.py(125):         code = (block.code | 128) if is_last else block.code
flac.py(126):         datum = block.write()
 --- modulename: flac, funcname: write
flac.py(653):         try:
flac.py(654):             return b"\x00" * self.length
flac.py(127):         size = len(datum)
flac.py(128):         if size > cls._MAX_SIZE:
flac.py(138):         assert not size > cls._MAX_SIZE
flac.py(139):         length = struct.pack(">I", size)[-3:]
flac.py(140):         data.append(code)
flac.py(141):         data += length
flac.py(142):         data += datum
flac.py(143):         return data
flac.py(168):         return data
flac.py(868):         data_size = len(data)
flac.py(870):         resize_bytes(filething.fileobj, available, data_size, header)
 --- modulename: _util, funcname: resize_bytes
_util.py(909):     if new_size < old_size:
_util.py(910):         delete_size = old_size - new_size
_util.py(911):         delete_at = offset + new_size
_util.py(912):         delete_bytes(fobj, delete_size, delete_at)
 --- modulename: _util, funcname: delete_bytes
_util.py(875):     if size < 0 or offset < 0:
_util.py(878):     fobj.seek(0, 2)
_util.py(879):     filesize = fobj.tell()
_util.py(880):     movesize = filesize - offset - size
_util.py(882):     if movesize < 0:
_util.py(885):     if mmap is not None:
_util.py(886):         try:
_util.py(887):             mmap_move(fobj, offset, offset + size, movesize)
 --- modulename: _util, funcname: mmap_move
_util.py(704):     assert mmap is not None, "no mmap support"
_util.py(706):     if dest < 0 or src < 0 or count < 0:
_util.py(709):     try:
_util.py(710):         fileno = fileobj.fileno()
_util.py(715):     fileobj.seek(0, 2)
_util.py(716):     filesize = fileobj.tell()
_util.py(717):     length = max(dest, src) + count
_util.py(719):     if length > filesize:
_util.py(722):     offset = ((min(dest, src) // mmap.ALLOCATIONGRANULARITY) *
_util.py(723):               mmap.ALLOCATIONGRANULARITY)
_util.py(722):     offset = ((min(dest, src) // mmap.ALLOCATIONGRANULARITY) *
_util.py(724):     assert dest >= offset
_util.py(725):     assert src >= offset
_util.py(726):     assert offset % mmap.ALLOCATIONGRANULARITY == 0
_util.py(729):     if count == 0:
_util.py(733):     if src == dest:
_util.py(736):     fileobj.flush()
_util.py(737):     file_map = mmap.mmap(fileno, length - offset, offset=offset)
_util.py(738):     try:
_util.py(739):         file_map.move(dest - offset, src - offset, count)

On the OpenZFS IRC channel someone mentioned that the issue seems to be the mmap-based file writing mutagen is doing. ZFS isn't integrated with the Linux page cache, so each mmap operation needs to be copied in and out of the cache, making it painfully slow. To test this I applied the following patch to mutagen:

diff --git a/mutagen/_util.py b/mutagen/_util.py
index 1332f9d..5b9a8cd 100644
--- a/mutagen/_util.py
+++ b/mutagen/_util.py
@@ -20,7 +20,7 @@ import decimal
 from io import BytesIO
 
 try:
-    import mmap
+    mmap = None
 except ImportError:
     # Google App Engine has no mmap:
     #   https://github.com/quodlibet/mutagen/issues/286
@@ -701,8 +701,6 @@ def mmap_move(fileobj, dest, src, count):
         ValueError: In case invalid parameters were given
     """
 
-    assert mmap is not None, "no mmap support"
-
     if dest < 0 or src < 0 or count < 0:
         raise ValueError("Invalid parameters")
 
diff --git a/tests/test__util.py b/tests/test__util.py
index 0ed25ed..55d0d7a 100644
--- a/tests/test__util.py
+++ b/tests/test__util.py
@@ -2,7 +2,7 @@
 
 from mutagen._util import DictMixin, cdata, insert_bytes, delete_bytes, \
     decode_terminated, dict_match, enum, get_size, BitReader, BitReaderError, \
-    resize_bytes, seek_end, mmap_move, verify_fileobj, fileobj_name, \
+    resize_bytes, seek_end, verify_fileobj, fileobj_name, \
     read_full, flags, resize_file, fallback_move, encode_endian, loadfile, \
     intround, verify_filename
 from mutagen._compat import text_type, itervalues, iterkeys, iteritems, PY2, \
@@ -376,33 +376,12 @@ class TMoveMixin(object):
             self.MOVE(o, 0, 1, 2)
             self.MOVE(o, 1, 0, 2)
 
-    def test_larger_than_page_size(self):
-        off = mmap.ALLOCATIONGRANULARITY
-        with self.file(b"f" * off * 2) as o:
-            self.MOVE(o, off, off + 1, off - 1)
-            self.MOVE(o, off + 1, off, off - 1)
-
-        with self.file(b"f" * off * 2 + b"x") as o:
-            self.MOVE(o, off * 2 - 1, off * 2, 1)
-            self.assertEqual(self.read(o)[-3:], b"fxx")
-
 
 class Tfallback_move(TestCase, TMoveMixin):
 
     MOVE = staticmethod(fallback_move)
 
 
-class MmapMove(TestCase, TMoveMixin):
-
-    MOVE = staticmethod(mmap_move)
-
-    def test_stringio(self):
-        self.assertRaises(mmap.error, mmap_move, cBytesIO(), 0, 0, 0)
-
-    def test_no_fileno(self):
-        self.assertRaises(mmap.error, mmap_move, object(), 0, 0, 0)
-
-
 class FileHandling(TestCase):
     def file(self, contents):
         temp = tempfile.TemporaryFile()

And the issue immediately went away.

I'd like to request either of the following:

1. For mutagen to detect ZFS and disable mmap. I don't know if that's possible.
2. For consumers of mutagen to be able to specify, somehow, that they don't want to use mmap'd files.
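For reference, the non-mmap path that the patch falls back to boils down to moving bytes with ordinary seek/read/write calls. The sketch below shows that technique in isolation; chunked_move and its chunk_size parameter are hypothetical names chosen for illustration and are not mutagen's fallback_move.

```python
import io


def chunked_move(fileobj, dest, src, count, chunk_size=64 * 1024):
    """Move `count` bytes from offset `src` to offset `dest` without mmap."""
    if dest < 0 or src < 0 or count < 0:
        raise ValueError("Invalid parameters")
    fileobj.seek(0, io.SEEK_END)
    if max(dest, src) + count > fileobj.tell():
        raise ValueError("Area outside of file")
    if dest == src or count == 0:
        return
    if dest < src:
        # Moving towards the start: copy front to back so unread source
        # bytes are never overwritten.
        moved = 0
        while moved < count:
            size = min(chunk_size, count - moved)
            fileobj.seek(src + moved)
            data = fileobj.read(size)
            fileobj.seek(dest + moved)
            fileobj.write(data)
            moved += size
    else:
        # Moving towards the end: copy back to front for the same reason.
        remaining = count
        while remaining > 0:
            size = min(chunk_size, remaining)
            remaining -= size
            fileobj.seek(src + remaining)
            data = fileobj.read(size)
            fileobj.seek(dest + remaining)
            fileobj.write(data)
    fileobj.flush()
```

A consumer-facing switch, if one were added, would only need to select this kind of code path instead of the mmap one; how such a switch should be exposed is left open here.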

opened by lovesegfault 17

Strange bitrates reported, e.g. 320141 instead of 320000

When extracting the bitrate from many MP3 files, I get strange values such as 128111, 192167, 256222, or 320141 instead of the standard 128000, 192000, 256000, and 320000.

As I can only see the standard values in mutagen/mp3/__init__.py, I infer these numbers have to come from the file itself. However, other tools report the bitrate correctly. Take this audio file, for example. With mutagen I get:

>>> f = mutagen.File('Tours_-_01_-_Enthusiast.mp3')
>>> f.info.bitrate, f.info.bitrate_mode
(320141, <BitrateMode.CBR: 1>)

With eyeD3:

>>> f = eyed3.load('Tours_-_01_-_Enthusiast.mp3')
>>> f.info.bit_rate
(False, 320)

Exiftool gives:

$ exiftool Tours_-_01_-_Enthusiast.mp3
Audio Bitrate : 320 kbps
Encoder : LAME3.98r
Lame Method : CBR
Lame Bitrate : 255 kbps

Any idea? (What does "Lame Bitrate" mean BTW?)
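Until the cause is understood, a caller that only needs the nominal value for CBR files could snap the reported number to the nearest standard MPEG-1 Layer III bitrate. This is a workaround sketch, not an explanation of where the odd values come from; nearest_standard_bitrate is a hypothetical helper and it is not meaningful for VBR files.

```python
# Standard MPEG-1 Layer III bitrates, in bits per second.
STANDARD_BITRATES = [
    kbps * 1000
    for kbps in (32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320)
]


def nearest_standard_bitrate(bps):
    """Return the standard bitrate closest to the reported value `bps`."""
    return min(STANDARD_BITRATES, key=lambda standard: abs(standard - bps))


for reported in (128111, 192167, 256222, 320141):
    print(reported, "->", nearest_standard_bitrate(reported))
```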
bug

opened by mdeff 15

docs: Add example for reading/writing vorbiscomment images

Originally reported by: scribbled_pixels (Bitbucket: scribbled_pixels, GitHub: Unknown)

According to this specification there's now a standard for embedding Cover art in Ogg Vorbis files. As far as I understood it, this is exactly the same format that FLAC uses, so they could share the Picture class.

Bitbucket: https://bitbucket.org/lazka/mutagen/issue/200

enhancement
opened by lazka 15

DeprecationWarning:: _util.py:151 :: to_int_be = staticmethod(lambda data: struct.pack('>i', data))

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)

From [email protected] on May 09, 2010 21:12:29

When deleting tags from an Ogg Vorbis file, I am getting a DeprecationWarning in Python 2.5.2.

Here is an example:

Python 2.5.2 (r252:60911, Jan 20 2010, 23:14:04)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from mutagen.oggvorbis import OggVorbis
>>> ogg = OggVorbis('test.ogg')
>>> ogg['title']
[u'Christmas Waltz']
>>> ogg.delete()
/usr/lib/python2.5/site-packages/mutagen/_util.py:151: DeprecationWarning: 
'i' format requires -2147483648 <= number <= 2147483647
  to_int_be = staticmethod(lambda data: struct.pack('>i', data))
>>> ogg.tags
[]

The delete() still works; it is just throwing that warning.
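The warning text itself points at the cause: the '>i' format packs a signed 32-bit integer, so any value above 2147483647 is out of range. Below is a small sketch of the difference between the signed and unsigned formats, using only the standard struct module. pack_i32_be and pack_u32_be are hypothetical names, and on Python 3 the out-of-range case raises struct.error rather than emitting a warning.

```python
import struct


def pack_i32_be(value):
    """Pack as a signed big-endian 32-bit integer (-2**31 .. 2**31 - 1)."""
    return struct.pack(">i", value)


def pack_u32_be(value):
    """Pack as an unsigned big-endian 32-bit integer (0 .. 2**32 - 1)."""
    return struct.pack(">I", value)


value = 2 ** 31  # one past the largest signed 32-bit value
print(pack_u32_be(value))  # fits in the unsigned format

try:
    pack_i32_be(value)
except struct.error as exc:
    print("signed format rejects it:", exc)
```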

Thanks!

Original issue: http://code.google.com/p/mutagen/issues/detail?id=63

Bitbucket: https://bitbucket.org/lazka/mutagen/issue/63

bug

opened by lazka 15

mid3v2: custom delimiters in COMM, TXXX, POPM descriptions

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)

From [email protected] on August 30, 2013 04:06:28

I use mid3v2 as a tool for scripted tagging of my music collection for Quod Libet, and I'm quite happy with it. There's just one limitation that I find a little irksome. A command like the following just won't work as intended:

mid3v2 --TXXX "QuodLibet::albumartist:The Examples" track.mp3

The reason, of course, is the double colon in the prefix used by Quod Libet (and Ex Falso), since it conflicts with the colon delimiter in the description key for TXXX (as well as COMM and POPM) frames.

My first thought was to add some special handling for QuodLibet:: prefixes and/or double colons. A better idea, in my view, is to allow users to specify their own delimiters from the command line as needed, like this:

mid3v2 --delimiter=# --TXXX "QuodLibet::albumartist#The Examples" track.mp3

The colon would remain the default delimiter. There's no risk of breaking existing scripts with the added option. Backward compatibility with id3v2 would be preserved.
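The conflict and the proposed fix can be illustrated with a few lines of string handling. This is a sketch of the idea only: split_description is a hypothetical function, and mid3v2's real argument parsing may differ in detail.

```python
def split_description(arg, delimiter=":"):
    """Split 'description<delimiter>value' at the first delimiter.

    Returns (description, value); the description is empty if the
    delimiter does not occur in `arg`.
    """
    description, found, value = arg.partition(delimiter)
    if not found:
        return "", arg
    return description, value


# With the default colon, the Quod Libet prefix is cut in the wrong place.
print(split_description("QuodLibet::albumartist:The Examples"))

# With a custom delimiter, the description survives intact.
print(split_description("QuodLibet::albumartist#The Examples", delimiter="#"))
```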

My tentative patch shows how mid3v2 (and its documentation) might be adapted for a new --delimiter option. I know very little Python, so my code is certainly far from perfect. Most of my changes are modeled on the --verbose option.

Attachment: mid3v2-deliminator.diff

Original issue: http://code.google.com/p/mutagen/issues/detail?id=159

Bitbucket: https://bitbucket.org/lazka/mutagen/issue/159

enhancement

opened by lazka 14

mutagen-inspect shows tag data but mid3cp shows

I'm having some trouble with tagged .m4a files in the ALAC format from iTunes.

When I run the files through mutagen-inspect it outputs a bunch of data (see below). But when I try to use mid3cp I just get the message No ID3 header found ...

mid3cp ./05\ Levitating.m4a ./05\ Levitating2.m4a 
No ID3 header found in  ./05 Levitating.m4a

mutagen-inspect ./05\ Levitating.m4a 
-- /Users/***/Desktop/music conversion testing/05 Levitating.m4a
- MPEG-4 audio (AAC LC), 203.87 seconds, 320000 bps (audio/mp4)
----:com.apple.iTunes:Encoding Params=MP4FreeForm(b'vers\x00\x00\x00\x01acbf\x00\x00\x00\x02brat\x00\x04\xe2\x00srcq\x00\x00\x00\x7fcdcv\x00\x01\x07\x01', <AtomDataType.IMPLICIT: 0>)
----:com.apple.iTunes:iTunNORM=MP4FreeForm(b' 000026FF 00002688 0000E85B 0000DED6 00025774 000242F4 00007E88 00007E88 00002203 000021EC', <AtomDataType.UTF8: 1>)
----:com.apple.iTunes:iTunSMPB=MP4FreeForm(b' 00000000 00000840 000002A4 000000000089251C 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000', <AtomDataType.UTF8: 1>)
aART=Dua Lipa
covr=[91060 bytes of data]
cpil=False
disk=(1, 1)
pgap=False
tmpo=0
trkn=(5, 13)
©ART=Dua Lipa
©alb=Future Nostalgia
©day=2020
©lyr=If you wanna run away with me
I know a galaxy and I can take you for a ride
I had a premonition that we fell into a rhythm
Where the music don't stop for life
Glitter in the sky, glitter in my eyes
Shining just the way I like
If you feel like you need a little bit of company
You met me at the perfect time
...
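The two tools are looking for different things: mid3cp reports a missing ID3 header, while mutagen-inspect identifies the file as MPEG-4 audio and lists MP4 atoms such as ©ART. A quick way to see which kind of file one is dealing with is to look at its first bytes. The sketch below is a simplified check under that assumption; sniff_container is a hypothetical helper and it ignores many other formats.

```python
def sniff_container(path):
    """Return 'id3v2', 'mp4', or 'unknown' based on the first bytes of the file."""
    with open(path, "rb") as fobj:
        head = fobj.read(12)
    if head[:3] == b"ID3":
        # An ID3v2 tag at the start of the file, as tools like mid3cp expect.
        return "id3v2"
    if head[4:8] == b"ftyp":
        # MP4/M4A files normally begin with an 'ftyp' box.
        return "mp4"
    return "unknown"
```

For a file that reports "mp4" here, the tags live in MP4 atoms rather than an ID3 header, which would be consistent with the output shown above.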

opened by glitch452 2

Performance on NFS-mounted files much helped by specifying buffering

This is probably not a mutagen issue, but it may be of interest anyway. I did not try to reproduce it in other contexts, so it may be quite specific. While doing mass tag extraction from an NFS-mounted file system, specifying buffering=4096 in the open() call in _util.py yields a massive performance improvement (around 5x in my configuration).

Details:

Client system: "Ubuntu 22.04.1 LTS" Linux 5.15.0-56-generic Python 3.10.6

NFS server: Odroid hc4 : ARM running "Ubuntu 22.04.1 LTS" Linux 5.19.17-meson64

The volume is a 4TB spinning disk on the ARM system.

Without the buffering parameter, extracting tags from 3000 FLAC and MP3 files takes around 100 ms per file. With the buffering argument we get down to around 22 ms.

I also did a quick test on a local SSD, on which the buffering does not appear to make a difference one way or another.

These tests were done while trying to determine why recoll was slow at indexing NFS-mounted audio files. The workaround for the application is to open the file with a buffering argument before building the mutagen object.

This actually appears to be a Python bug, given what the Python manual says about open():

Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device’s “block size” and falling back on [io.DEFAULT_BUFFER_SIZE](https://docs.python.org/3/library/io.html#io.DEFAULT_BUFFER_SIZE). On many systems, the buffer will typically be 4096 or 8192 bytes long.

So specifying buffering=4096 should be close to a no-op, and doing it as a precautionary default in mutagen should be innocuous enough.
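The application-side workaround described above can be sketched as a small wrapper that opens the file with an explicit buffer size and hands the file object to a loader. load_with_buffer is a hypothetical helper; the loader is passed in by the caller (for example mutagen.File), so the sketch itself depends only on the standard library.

```python
def load_with_buffer(path, loader, buffering=4096):
    """Open `path` with an explicit buffer size and pass the file object to `loader`."""
    with open(path, "rb", buffering=buffering) as fobj:
        return loader(fobj)


# Hypothetical usage, assuming mutagen is installed:
#   import mutagen
#   audio = load_with_buffer("/mnt/nfs/music/track.flac", mutagen.File)
#   print(audio.tags)
```

Because the file is closed when the wrapper returns, this suits read-only tag extraction; saving changes afterwards would need the target file to be given explicitly.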
opened by medoc92 0

Updating APIC tag leads to empty cover art on Spotify's local files

Hi, I'm programming my first project using mutagen, moviepy, pytube and flask. Essentially the program converts a youtube video into an mp3 with updated metadata (such as updated song name, artist name and cover art).

The problem that I'm having is that my cover art on Spotify is not updating whereas my artist name and song name both update perfectly fine. In order to update the cover art, I'd have to manually upload the art through Apple Music or MP3Tag, or some other third party software.

Here is my code so far, it runs perfectly well but the cover art does not show up on Spotify's local files. Any help would be appreciated!

EDIT: Fixed by using eyed3

opened by shubhhpatel 0

Having problems figuring out how to write to ID3 tags for MP3's

Hey, pretty new'ish to Python, and I had looked at the documentation to try to figure this out. I couldn't understand how to read the TCOP tag (among other ID3 tags) and write it as blank or remove it. I want to do the same thing to comments and other fields that are unnecessary, but I'm only concerned about doing the copyright (TCOP) tag for the time, because once I figure one out, it'll be easily distributable across other tags I want to change.

Here's the snippet of code I'm having problems with.

def scan_files():  # read the files
        for root, subfolders, files, in os.walk(scan_dir):
            for name in files:
                # if name.endswith((".mp3", ".m4a", ".flac", ".alac")):
                if name.endswith((".mp3")):  # only handle mp3's
                    # add track names to a list variable
                    tracks_import_dir.append(name)
                    # try:
                    global track_mp3
                    global track_id3
                    track_mp3 = MP3(root + "\\" + name)
                    track_id3 = ID3(root + "\\" + name)
                    #tags = ID3(track)

                    class format_track():  # change various tags
                        # print("")
                        # https://mutagen.readthedocs.io/en/latest/api/id3.html#mutagen.id3.ID3
                        # https://mutagen.readthedocs.io/en/latest/api/mp3.html
                        def format_copyright():
                            print("Formatting copyright...")
                            copyright = (track_mp3['TCOP'].text[0])
                            print(copyright)  # Copyright
                            
                            # track_id3 = ID3(track)
                            # print(copyright_tag)
                        format_copyright()

                    class save_track:  # save track files
                        track_mp3.save()
                        track_id3.save()
                        # tags.save(track)
                        print(style.color.yellow,
                              "[INFO] Track saved\n", style.reset)

                    if config.debug >= 3:  # print all the tracks to prove they exist
                        print(style.color.yellow,
                              track_mp3.pprint(), style.reset, "\n")

def format_copyright is a function that, when called, should set the TCOP tag for my audio files, and class save_track: should write those tags. I was able to get copyright = (track_mp3['TCOP'].text[0]), which prints the copyright tag, to work, but I was unable to figure out a way to write it. I tried for about 4 hours to find a method that would work (I had like 100 comments of failed attempts before...).

Dumbed down for me, how can I take track_id3 and/or track_mp3 and write to the TCOP tag "" (to be blank) or remove them entirely?
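One possible approach, sketched under the assumption that removing the frames is acceptable: mutagen's ID3 class has a delall() method that drops every frame with a given ID, and save() writes the result back. strip_frames and strip_file are hypothetical helper names; mutagen is imported lazily inside strip_file so the first helper can be tried on any object that offers delall().

```python
def strip_frames(tags, frame_ids):
    """Remove every frame with one of the given IDs from an ID3-like tag object."""
    for frame_id in frame_ids:
        # delall() removes all frames with this ID (e.g. every COMM variant).
        tags.delall(frame_id)


def strip_file(path, frame_ids=("TCOP",)):
    """Delete the given frames from the MP3 at `path` and save the file."""
    from mutagen.id3 import ID3  # requires mutagen to be installed

    tags = ID3(path)  # raises ID3NoHeaderError if the file has no ID3 tag
    strip_frames(tags, frame_ids)
    tags.save()


# Hypothetical usage inside the os.walk() loop:
#   strip_file(os.path.join(root, name), frame_ids=("TCOP", "COMM"))
```

To store an empty value instead of removing the frame, something like tags.add(TCOP(encoding=3, text=[""])) with TCOP imported from mutagen.id3 should work along the same lines. Building paths with os.path.join also avoids the hard-coded "\\" separator.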

opened by Rycia 1

MP4.save() makes phone videos unreadable (Invalid NAL unit size)

If I use Mutagen to write tags to an MP4 video that I took with my phone (Samsung Galaxy S7 with LineageOS 14.1), I can no longer read the file with mplayer, VLC, or any other player I tried.

This seems to be the case for any video taken with my phone and no other video I ever encountered.

Sample file: https://uno.nahoj.eu/nextcloud/s/CTffQgpw8qcqWnb

Steps to reproduce (last tried with Python 3.10 and Mutagen 1.46.0):

>>> f = mutagen.File("VID_20220226_151532a.mp4")
>>> type(f)
<class 'mutagen.mp4.MP4'>
>>> f["title"] = "Hi!"
>>> f.pprint()
'MPEG-4 audio (), 0.00 seconds, 0 bps (audio/mp4)\ntitle=Hi!'
>>> # Up to this point the file is fine
>>> f.save()
>>> # The file is now unreadable

After this, if I try to read the file with VLC, it outputs a bunch of errors of this type:

[h264 @ 0x7f0240c438c0] Invalid NAL unit size (961310570 > 279649).
[h264 @ 0x7f0240c438c0] Error splitting the input into NAL units.

Funnily this doesn't prevent Mutagen from reading and writing the tags multiple times. But undoing any changes doesn't make the video readable again.

I think this is a pretty bad bug. I noticed this when I added a tag on a personal video that I care a lot about and that got corrupted. Thankfully I had a backup; otherwise I would have had a very miserable day.

opened by nahoj 0