The audio-video synchronization of MKV Container Format is exploited to achieve data hiding

Maxim Zaika

Last update: Nov 17, 2021

1.0 Data Hiding in MKV Container Format

1.1 Brief Description

The audio-video synchronization of the MKV container format is exploited to hide data. The hidden data can then be used for various management purposes, including hyperlinking, annotation, and authentication.

1.2 Video Demonstration @ YouTube

Data Hiding (Hidden Watermark) in MKV Container Format

1.3 Requirements

Linux (not tested anywhere else)
Python
An .MKV reader (such as VLC player)
All of the following files are required:
- .MKV video (./VideoForTesting/2mb.mkv)
- ./convert_xml2mkv.py
- ./parse_and_convert_mkv2xml.py
- ./find_data.py
- ./hide_data.py
- ./find
- ./hide
Ensure that you have permission to execute these files. Run the following command: chmod +x convert_xml2mkv.py && chmod +x find_data.py && chmod +x hide_data.py && chmod +x parse_and_convert_mkv2xml.py
If the command above does not work and Linux still denies access, run the following command on each affected file: chmod +x filename.extension
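
As a quick pre-flight check, the requirements above can be verified with a short script. This is only a sketch: the file names are taken from the list above, while the `check_files` helper is hypothetical and not part of this repository.

```python
import os
from pathlib import Path

# File names copied from the requirements list above.
REQUIRED_FILES = [
    "convert_xml2mkv.py",
    "parse_and_convert_mkv2xml.py",
    "find_data.py",
    "hide_data.py",
    "find",
    "hide",
]


def check_files(folder="."):
    """Return (missing, not_executable) lists for the required files."""
    missing, not_executable = [], []
    for name in REQUIRED_FILES:
        path = Path(folder) / name
        if not path.is_file():
            missing.append(name)
        elif not os.access(path, os.X_OK):
            # The file exists but still needs `chmod +x`.
            not_executable.append(name)
    return missing, not_executable


if __name__ == "__main__":
    missing, not_executable = check_files()
    print("missing:", missing)
    print("needs chmod +x:", not_executable)
```

Run it from the folder that contains the files; both lists should be empty before starting ./hide or ./find.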

1.4 How To Run Data Embedding Process

Note: for screenshots, refer to the end of the ./Maxim_Zaika_Data_Hiding_in_MKV_Container.pdf file.

Ensure that the requirements in 1.3 are fulfilled.
Run ./hide from a terminal in the folder where the files are located.
Enter the name of the .MKV container: 2mb.mkv.
Enter the data that needs to be hidden: 'example'. Write it down!
Enter the SECRET KEY that will be used to decrypt your data in the data detecting process: 'encryption key'. Write it down!
Enter the timecode at which the data will be saved: 10.523, or type 'help' to display all the available timecodes. Write it down!
The file modified_mkv.mkv, which stores your hidden data, should now be created.

Note: do not lose the text of the hidden data, the SECRET KEY, or the timecode. Otherwise, you will not be able to verify the data later.

1.5 How To Run Data Detecting Process

Ensure that the requirements in 1.3 are fulfilled.
Run ./find from a terminal in the folder where the files are located.
Enter the file name: modified_mkv.mkv.
Enter the text of your hidden data: 'example'.
Enter the SECRET KEY used: 'encryption key'.
Enter the timecode used: 10.523.
If the data matches, a success message is shown.

2.0 Data Embedding Process

2.1 Software Architecture of Data Embedding

2.2 Data Embedding Design

2.3 Data Embedding Pseudocode

Note: this is an incomplete representation.

Function main {
  Set a_word -> “word that needs to be written in”
  Set encryption_key -> “key used for the encryption”
  If (length of encryption_key) < (length of a_word) {
	  Set encryption_key -> same length as a_word
  }
  Set a_word -> convert to ascii
  Set encryption_key -> convert to ascii
  Set ascii_a_word -> convert to hexadecimal
  Set ascii_encryption_key -> convert to hexadecimal
  If (length of ascii_encryption_key) < (length of ascii_a_word) {
	  Set ascii_encryption_key -> same length as ascii_a_word
  }
  Encrypt a_word(ascii_a_word, ascii_encryption_key, a_word) // encrypt ascii word
                                                             // using original word 
  Convert encrypted word to hexadecimal // because MKV parser accepts hexadecimals
                                        // inside the cluster’s timecode
  Timecodes = [] // read the XML file and identify the timecodes
  Set input_timecode -> “input timecode here”
  Call function embed data (filename, input_timecode, encrypted_word_in_hexadecimal_format)
}
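
The key-stretching and hexadecimal-conversion steps above can be sketched in Python. The pseudocode does not say which cipher is used, so this sketch assumes a simple byte-wise XOR as a stand-in; the function names `stretch_key`, `encrypt_to_hex`, and `decrypt_from_hex` are hypothetical and do not come from hide_data.py.

```python
from itertools import cycle


def stretch_key(key: str, length: int) -> str:
    """Repeat `key` until it is `length` characters long (longer keys are kept as-is)."""
    if not key:
        raise ValueError("key must not be empty")
    if len(key) >= length:
        return key
    return "".join(c for c, _ in zip(cycle(key), range(length)))


def encrypt_to_hex(word: str, key: str) -> str:
    """XOR each ASCII byte of `word` with the matching key byte and return hex.

    XOR is an assumed stand-in; the project's actual cipher may differ.
    """
    word_bytes = word.encode("ascii")
    key_bytes = stretch_key(key, len(word_bytes)).encode("ascii")
    return bytes(w ^ k for w, k in zip(word_bytes, key_bytes)).hex()


def decrypt_from_hex(payload_hex: str, key: str) -> str:
    """Reverse `encrypt_to_hex` using the same key."""
    data = bytes.fromhex(payload_hex)
    key_bytes = stretch_key(key, len(data)).encode("ascii")
    return bytes(d ^ k for d, k in zip(data, key_bytes)).decode("ascii")


if __name__ == "__main__":
    hidden = encrypt_to_hex("example", "encryption key")
    print(hidden)                                      # hexadecimal payload
    print(decrypt_from_hex(hidden, "encryption key"))  # round-trips to the input
```

The payload is hexadecimal because, as the pseudocode notes, the MKV parser accepts hexadecimal values at the embedding location.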

Function embed data {
	Loop through the file {
		Identify the location of the timecode {
			Identify the location of the data inside the cluster’s timecode {
				Write-in the data
			}
		} else timecode not found {
			Try again
		}
	}
}
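
A minimal Python sketch of the embedding loop follows. It assumes a simplified, hypothetical XML layout in which a `<timecode>` line is followed by a `<data>` line holding hexadecimal content; the tag names, the `embed_payload` function, and the choice to write the payload at the start of the data are illustrative assumptions, not the parser's documented output or the exact behaviour of hide_data.py.

```python
def embed_payload(lines, timecode, payload_hex,
                  timecode_tag="<timecode>", data_tag="<data>"):
    """Return a copy of `lines` with `payload_hex` written at the start of the
    first data line that follows the line holding `timecode`.

    Raises ValueError when the timecode, or a data line after it, is not found.
    """
    out = list(lines)
    found_timecode = False
    for i, line in enumerate(out):
        stripped = line.strip()
        if not found_timecode:
            if stripped.startswith(timecode_tag):
                # Text between the opening tag and the next "<" is the value.
                value = stripped[len(timecode_tag):].split("<", 1)[0]
                found_timecode = (value == timecode)
            continue
        if stripped.startswith(data_tag):
            indent = line[: len(line) - len(line.lstrip())]
            rest = stripped[len(data_tag):]
            out[i] = f"{indent}{data_tag}{payload_hex}{rest}"
            return out
    raise ValueError(f"timecode {timecode!r} not found or no data line follows it")


if __name__ == "__main__":
    xml_lines = ["<timecode>10.523</timecode>", "  <data>00ff</data>"]
    print(embed_payload(xml_lines, "10.523", "abcd"))
```

Instead of the "try again" branch in the pseudocode, the sketch raises an error, which a caller could catch in order to prompt for another timecode.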

3.0 Data Detecting Process

3.1 Software Architecture of Data Detecting

3.2 Data Detecting Design

3.3 Data Detecting Pseudocode

Note: this is an incomplete representation.

Function detect data {
	Set hexadecimal_word -> ‘the encrypted word’ // produced by the same process as in the
	                                             // data hiding process
	Loop through the file {
		Loop each line of the file {
			Identify the location of the timecode {
				Identify the data inside the cluster’s timecode {
					Read through the line ignoring first 6 characters // format
				}
				If there is at least 1 mismatch {
					Return error
				} else fully matched {
					Return success
				}
			}
		}
	}
}
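
The detection loop can be sketched in Python in the same spirit. It assumes a simplified, hypothetical layout in which the line right after the matching `<timecode>` line holds the data, and it skips the first 6 characters of that line as the pseudocode describes. The `detect_payload` name and the tag format are illustrative assumptions, not taken from find_data.py.

```python
def detect_payload(lines, timecode, expected_hex,
                   timecode_tag="<timecode>", prefix_len=6):
    """Return True if the line after the matching timecode carries `expected_hex`
    immediately after its first `prefix_len` characters, otherwise False."""
    found_timecode = False
    for line in lines:
        stripped = line.strip()
        if not found_timecode:
            if stripped.startswith(timecode_tag):
                value = stripped[len(timecode_tag):].split("<", 1)[0]
                found_timecode = (value == timecode)
            continue
        # Assumption: the first line after the timecode is the data line.
        candidate = stripped[prefix_len:prefix_len + len(expected_hex)]
        # A single differing character makes the whole comparison fail.
        return candidate == expected_hex
    return False


if __name__ == "__main__":
    xml_lines = ["<timecode>10.523</timecode>", "<data>abcdef0123</data>"]
    print(detect_payload(xml_lines, "10.523", "abcdef"))
```

Here `expected_hex` would be recomputed from the user's text and SECRET KEY in the same way as during embedding, which is why both values must be kept.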

4.0 Results

Description	Explanation
Limited Number of Cluster Timecodes	Modifying more than two cluster timecodes causes slight video distortion; modifying even more timecodes causes both video and audio distortion.
Embedding Capacity	Tested successfully with up to 2,500 characters. The assumption is that 2,500 characters is more than enough for the user.
File Size Increment	Original file: 2.1 MB (2,097,641 bytes) -> Modified file (2,500 characters): 2.1 MB (2,122,058 bytes). Increased by 24,417 bytes (about 1.16%).

5.0 Additional Information

For more information (like testing and background information), refer to the .PDF file attached to this repository: ./Maxim_Zaika_Data_Hiding_in_MKV_Container.pdf

6.0 Credits

This project would not have been possible without the MKV > XML > MKV parser created by Vitaly "_Vi" Shukela: https://github.com/vi/mkvparse.

The parser was rewritten for my own needs (and for better understanding) and is included in this repository to ensure that there is no mismatch with Vitaly's version. If you are interested in the parser, please refer to his repository linked above. I do not take any credit for its creation.
