SLIDE 1

where information lives


Seamless Audio Splicing for ISO/IEC 13818 Transport Streams

A New Framework for Audio Elementary Stream Tailoring and Modeling

Seyfullah Halit Oguz, Ph.D. and Sorin Faibish EMC Corporation Media Solutions Group Engineering

SLIDE 2

EMC Media Solutions Group Profile

The EMC Media Solutions Group is part of EMC Engineering

• EMC Engineering will spend well over $1 billion on research and development in FY 2001.

• Over 300 engineers concentrate on Celerra Server development and Rich Media products.

• The Media Solutions Group lab has in excess of $300 million of hardware for the development of Rich Media solutions.

SLIDE 3

Mission Statement

The EMC Media Solutions Group is tasked with:

• Deploying EMC products into the Rich Media market.

• Developing products and solutions that meet the requirements of Rich Media customers.

• Developing partnerships with key companies to develop and deploy customer solutions.

• Providing Professional Services in the Rich Media environment.

SLIDE 4

Outline

• Splicing in brief
• Objective
• Problem description
• Basic algorithm
• A model for the audio elementary streams
• Enhanced algorithm
• Additional implementation details
• Conclusions

SLIDE 5

Splicing

Splicing is the act of switching from one MPEG-2 program (embedded in a transport stream) to another MPEG-2 program (again embedded in a transport stream).

Commercial insertion, camera or content switching, and content editing all require splicing to be performed on compressed bit-streams.

The structure of the compressed data makes a seamless splicing algorithm far from trivial.

SLIDE 6

Objective

A generic method to process the audio elementary streams during the splicing of ITU-T Rec. H.222.0 | ISO/IEC 13818-1 transport streams (TS) to achieve a seamless audio splice.

• "generic": no constraining assumptions are made about signal formats (e.g. the video frame rate (PAL, NTSC) or the audio sampling frequency) or about encoding parameters (e.g. the audio bit rate, or the layer of the audio encoding algorithm employed).

• "audio elementary streams": the current focus is on the audio elementary streams; ultimately, audio and video splicing should be considered jointly.

• "transport streams": achieve audio elementary stream splicing directly on transport streams with the lowest possible complexity.

SLIDE 7

Definitions and Notation

Encoded data domain / decoded data domain:

• Audio Access Unit, AAU (an encoded audio frame, 576 bytes*), converted by the audio decoder into an
• Audio Presentation Unit, APU (a block of contiguous audio samples, 24 ms*).
• Video Access Unit, VAU (variable size), converted by the video decoder into a
• Video Presentation Unit, VPU (a video frame, 1/29.97 s**).

*(ISO/IEC 11172-3 Layer-II audio coding with 48 kHz sampling frequency and 192 kbit/s audio bit rate, assumed only for illustrative purposes.)

**(NTSC frame rate, assumed only for illustrative purposes.)
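The starred figures can be checked with a short calculation. The constants below (1152 PCM samples per Layer-II frame, hence the 144 = 1152/8 factor in the frame-size formula) come from ISO/IEC 11172-3; the snippet itself is only illustrative:

```python
# Check of the illustrative numbers on this slide: ISO/IEC 11172-3 Layer-II
# audio at 48 kHz sampling and 192 kbit/s. A Layer-II frame carries 1152
# PCM samples per channel.
SAMPLES_PER_FRAME = 1152
FS = 48_000          # sampling frequency [Hz]
BITRATE = 192_000    # audio bit rate [bits/s]

# One APU covers 1152 samples -> 24 ms of audio.
apu_duration = SAMPLES_PER_FRAME / FS

# One AAU holds the bits produced during that interval: 144 * bitrate / fs bytes.
aau_bytes = (SAMPLES_PER_FRAME // 8) * BITRATE // FS

print(apu_duration, aau_bytes)  # 0.024 576
```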

SLIDE 8

VPU and APU alignment

[Figure: timing diagram of start (S) and end (E) boundaries for VPUs (k-2)…(k+2) against APUs (j-2)…(j+3); VPU duration ~= 0.03337 seconds (1/29.97 s), APU duration 0.024 seconds.]

The start of a VPU will be aligned with the start of an APU possibly at the beginning of a stream, and then only at multiples of 5-minute increments in time. This implies that, for all practical purposes, they will not be aligned again later.
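The 5-minute re-alignment figure can be reproduced with exact rational arithmetic, taking the VPU duration as the nominal 1/29.97 s used on this slide (a sketch; `frac_lcm` is a hypothetical helper, not from the presentation):

```python
from fractions import Fraction
from math import gcd, lcm

def frac_lcm(a: Fraction, b: Fraction) -> Fraction:
    # LCM of two positive rationals: lcm of the numerators over gcd of the
    # denominators (both fractions are kept in lowest terms by Fraction).
    return Fraction(lcm(a.numerator, b.numerator),
                    gcd(a.denominator, b.denominator))

apu = Fraction(24, 1000)    # APU duration: 24 ms
vpu = Fraction(100, 2997)   # VPU duration: 1/29.97 s (nominal NTSC figure)

# Smallest time span that is both a whole number of APUs and of VPUs:
print(float(frac_lcm(apu, vpu)))  # 300.0 -> boundaries re-align every 5 minutes
```

Note that with the exact NTSC rate of 30000/1001 frames/s the common period comes out much shorter, so the 5-minute figure is tied to the nominal 29.97 value.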

SLIDE 9

The setting for splicing

[Figure: the ending stream (time base #1) with VPUs (k-2)…(k+2) and APUs (j-2)…(j+3), above the beginning stream (time base #2) with VPUs (n-2)…(n+2) and APUs (m-2)…(m+3).]

The splicing point is naturally defined with respect to VPUs.

SLIDE 10

Audio processing at splicing

APUs are available only through the decoding of their corresponding AAUs. Fractional (i.e. truncated) AAUs in the encoded data domain are useless.

[Figure: the ending stream's VPUs (k-2)…(k+2) and APUs (j-2)…(j+3) against the beginning stream's APUs (m-3)…(m+2) after the splice.]

The time base of the beginning stream is shifted to achieve video presentation continuity.

SLIDE 11

So far…

• Decoding, time-domain editing and re-encoding: high computational complexity.

• Gaps in the audio stream: audio mutes, uncontrolled audio-visual skew.

• Overlaps in the scopes of APUs: uncontrolled audio-visual skew, inconsistent ES structure.

SLIDE 12

Observations

• Audio truncation should always be done at AAU boundaries, i.e. no fractional AAUs!

• Audio truncation for the ending stream should be done with respect to the end of its last VPU's presentation interval.

• Audio truncation for the beginning stream should be done relative to the beginning of its first VPU's presentation interval.

"BEST ALIGNED APUs"

SLIDE 13

Best aligned APUs

[Figure: VPUs (k-1)…(k+2) with the 24 ms interval around a VPU boundary split into two 12 ms halves ("short"/"long"); APU (j+1) of the ending stream and APU m of the beginning stream fall within it.]

The APU of the ending stream whose presentation interval ends within the identified 24 ms interval is called the "best aligned final APU". The APU of the beginning stream whose presentation interval starts within the identified 24 ms interval is called the "best aligned initial APU".

Based on these definitions, a comprehensive list of 8 possible cases can be identified regarding the alignment of the ending and beginning audio streams.
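Locating the best aligned APUs amounts to rounding against the 24 ms APU grid; the function names and arguments below are illustrative, not from the presentation:

```python
APU_DUR = 0.024  # APU duration [seconds]

def best_aligned_final_apu(t_splice, apu_grid_start, apu_dur=APU_DUR):
    """Index j of the ending stream's APU whose presentation END lies
    closest to t_splice (the end of the last VPU): |end_j - t| <= 12 ms."""
    # APU j ends at apu_grid_start + (j + 1) * apu_dur
    return round((t_splice - apu_grid_start) / apu_dur) - 1

def best_aligned_initial_apu(t_splice, apu_grid_start, apu_dur=APU_DUR):
    """Index m of the beginning stream's APU whose presentation START lies
    closest to t_splice (the start of its first VPU), on its own time base."""
    return round((t_splice - apu_grid_start) / apu_dur)

j = best_aligned_final_apu(1.000, 0.0)      # APU 41 ends at 1.008 s ("long")
m = best_aligned_initial_apu(1.000, 0.005)  # APU 41 starts at 0.989 s ("short")
```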

SLIDE 14

How to make use of best aligned APUs

ACTION:

• Truncate the ending audio stream at the end of the best aligned final APU.

• Start the beginning audio stream at the beginning of the best aligned initial APU.

• Re-stamp the audio PTSs of the beginning stream to generate an immediate continuation of the ending audio stream.

REQUIRED PROCESSING AT ELEMENTARY STREAM LEVEL:

• In the audio PES packet carrying the best aligned final APU: truncate after the AAU associated with the best aligned final APU, and modify the PES packet size information accordingly.

• In the audio PES packet carrying the best aligned initial APU: delete the AAU data preceding the AAU associated with the best aligned initial APU, and modify the PES packet size information accordingly.

• Modify the PTS values associated with the first and all subsequent audio PES packets accordingly.
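The PES-size bookkeeping amounts to rewriting the 16-bit PES_packet_length field (bytes 4-5 of the packet, which counts every byte after that field). A minimal sketch, assuming a whole PES packet held in memory with the usual optional header, and with `keep_payload_bytes` already computed from the AAU boundaries:

```python
import struct

def truncate_pes_payload(pes: bytearray, keep_payload_bytes: int) -> bytearray:
    """Keep only the first keep_payload_bytes of PES payload (ending exactly
    on an AAU boundary) and patch PES_packet_length accordingly."""
    assert pes[0:3] == b"\x00\x00\x01"            # packet_start_code_prefix
    header_len = 9 + pes[8]                       # 6 fixed + 3 + PES_header_data_length
    out = bytearray(pes[:header_len + keep_payload_bytes])
    struct.pack_into(">H", out, 4, len(out) - 6)  # bytes after the length field
    return out
```

Deleting leading AAU data from the beginning stream's first PES packet is the symmetric operation: the payload shrinks from the front and the length field is reduced by the number of deleted bytes.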

SLIDE 15

Case 6) Best aligned final APU long, best aligned initial APU short, and 0 msec. < audio overlap < 12 msec.

[Figure: VPUs (k-1)…(k+2) with 12 ms sub-intervals; the best aligned final APU m overlaps the best aligned initial APU (j+1).]

SOLUTION: [Figure: the beginning stream's APUs re-stamped to continue immediately after the best aligned final APU.] A/V skew of at most 12 msec.

SLIDE 16

Minimal Achievable Skew Algorithm

• Immediately applicable to 6 out of the 8 possible best aligned APU relative position classes.

• In the remaining 2 classes of relative position, a slight modification to the proposed algorithm is needed to achieve an A/V skew bounded by half the APU duration.

SLIDE 17

Case 1) Both best aligned APUs are short and 12 msec. < audio gap < 24 msec.

[Figure: SOLUTION (a) and SOLUTION (b), two alternative APU arrangements around VPUs (k-1)…(k+2); each yields an A/V skew of at most 12 msec.]

SLIDE 18

Facts - I

• An audio elementary stream construction with no holes and no audio PTS discontinuity is possible.

• As a consequence, an A/V skew of magnitude at most half the APU duration will be induced in the beginning stream. This is below the sensitivity limits of human perception.

• The proposed algorithm can be repeatedly applied an arbitrary number of times with neither a failure to meet its structural assumptions nor a degradation in its promised A/V skew performance.

SLIDE 19

Facts - II

• The A/V skews induced as a result of the proposed processing do not accumulate, i.e. irrespective of the number of consecutive splices, the worst A/V skew at any point in time will be half of the APU duration.

• At each splice point, at the termination of the PUs of the ending stream, the total audio and video presentation durations up to that point always almost match each other, i.e.

|video_duration - audio_duration| <= (1/2) APU_duration

i.e. the correct amount of audio data is always provided.
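The non-accumulation claim can be sanity-checked with a toy simulation: each splice re-stamps the incoming audio so that it continues the outgoing APU grid from the boundary nearest the video splice time, so the residual skew is bounded per splice and independent of history. (A sketch under the deck's assumptions, not production code.)

```python
import random

APU = 0.024  # APU duration [seconds]

random.seed(1)
video_t = 0.0   # running video presentation time
grid = 0.0      # phase of the continuous audio APU grid (no PTS discontinuity)
worst = 0.0
for _ in range(10_000):                  # many consecutive splices
    video_t += random.uniform(0.5, 3.0)  # arbitrary clip durations
    # truncate at the APU boundary nearest the video splice time
    cut = grid + round((video_t - grid) / APU) * APU
    worst = max(worst, abs(video_t - cut))
    grid = cut                           # spliced-in audio continues the grid

assert worst <= APU / 2 + 1e-9           # never exceeds half an APU
```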

SLIDE 20

Facts - III

The resulting audio stream is error-free and fully ISO/IEC 11172-3 and ISO/IEC 13818-3 compliant.

SLIDE 21

Implementation at TS level

TS level implementation necessitates further considerations:

1) Editing of transport packets
2) Transport packet buffering and re-multiplexing
3) Audio buffer management
4) Metadata generation

SLIDE 22

Audio buffer management

We have to control the dynamic behavior of the audio buffer during the transient induced by the splicing, as well as after the splicing.

We need a simple model to characterize the audio elementary stream within a TS, such that:

1) the model parameters are easy to estimate,
2) the model accurately provides the following desired information:
   a) the mean audio buffer fullness,
   b) the extent of buffer fullness variation around its mean value.

SLIDE 23

Audio elementary stream model

The audio elementary stream will be modeled on the basis of AAUs, which determine its granularity with respect to audio decoder actions.

The proposed framework for the model is an arrival process to a FIFO queue with a deterministic service time of 0.024 sec.

[Figure: arrival times a_j, a_(j+1), a_(j+2), a_(j+3) feeding the queue, with presentation times p_j, p_(j+1), p_(j+2), p_(j+3); a_j is the arrival time of the jth AAU and p_j is its presentation time.]

SLIDE 24

Audio elementary stream model

The presentation times p_j can be easily computed for each AAU.

The arrival times a_j, however, are not uniquely defined, since each AAU arrives in a distributed fashion owing to transport packet encapsulation. Solution: use "weighted-average arrival times".

SLIDE 25

Audio elementary stream model

• The weighted-average arrival time is defined for each AAU as:

a_j = sum, over all TS packets whose payloads carry some of the jth AAU's data, of (TS packet arrival time) * (fraction of the jth AAU in the payload data)

• Note: the TS packet arrival time is defined with respect to the packet's 94th byte through a PCR extrapolation process.
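The definition transcribes directly into code; the data layout below (a list of per-packet pairs) is an assumption for illustration, not from the presentation:

```python
def weighted_average_arrival(pieces):
    """pieces: list of (ts_packet_arrival_time, aau_bytes_in_that_payload).
    Returns a_j, the byte-weighted average of the carrying packets' arrival
    times, with the weights normalized by the AAU's total size."""
    total = sum(n for _, n in pieces)
    return sum(t * (n / total) for t, n in pieces)

# A 576-byte AAU spread over four TS packets (hypothetical arrival times):
a_j = weighted_average_arrival([(0.1000, 120), (0.1004, 184),
                                (0.1008, 184), (0.1012, 88)])
```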

SLIDE 26

Audio elementary stream model

Based on these definitions we let w_j = p_j - a_j, where w_j is the waiting time of the jth AAU in the audio buffer.

The mean and variance of w_j, viewed as a random variable, provide very important information about the audio elementary stream within the TS multiplex, and hence about the audio buffer.

SLIDE 27

Audio elementary stream model

Specifically, with E[.] denoting the expectation operation and σ the standard deviation:

E[w] · (audio bitrate) = mean audio buffer fullness

σ_w · (audio bitrate) = a measure of the variation in the audio buffer fullness around its mean value
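Turning waiting-time statistics into buffer-occupancy figures is then a single multiplication; the waiting times below are made up for illustration, not measured from the streams in the deck:

```python
from statistics import mean, stdev

def buffer_stats(waiting_times, audio_bitrate):
    """E[w] * bitrate -> mean audio buffer fullness [bits];
    sigma_w * bitrate -> spread of the fullness around that mean [bits]."""
    return (mean(waiting_times) * audio_bitrate,
            stdev(waiting_times) * audio_bitrate)

# Hypothetical per-AAU waiting times (seconds) at 192 kbit/s:
fullness, spread = buffer_stats([0.078, 0.080, 0.083, 0.079], 192_000)
```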

SLIDE 28

Audio elementary stream model

Example 1. Encoder B characteristics:

• Long mean waiting time
• Highly irregular, bursty arrival regime
SLIDE 29

Example 1

[Figures: "TS PACKET TYPE VISUALIZATION PLOT" (TS packet types, color coded, vs. TS packet indices) and "INTERLEAVED AUDIO-VIDEO ALIGNMENT IN THE BITSTREAM WRT PTS VALUES" (interpolated play-out times for individual TS packets, in seconds, vs. TS packet indices).]

SLIDE 30

Example 1

[Figure: "AUDIO BUFFER ANALYSIS", audio buffer level (% of max) vs. time (seconds).]

Predicted mean audio buffer fullness: 15585.8 bits = 54.36 %. 2 STD interval around the mean: 4504.2 bits = 15.71 %.

SLIDE 31

An important observation

PTS re-stamping in the beginning audio stream:

• The waiting times of AAUs will be modified in the beginning stream.

• The mean waiting time of AAUs in the beginning stream will change (decrease or increase) by at most half the APU duration.

• A corresponding change in the mean audio buffer fullness level for the beginning stream will be induced.

SLIDE 32

Improvement to the proposed processing

For audio elementary streams structured with a mean audio buffer fullness level bounded away from both underflow and overflow, the already proposed methodology with the minimal achievable A/V skew is the best choice.

For audio elementary streams with a mean audio buffer fullness level close to either the underflow or the overflow state, the requirement of minimal resultant A/V skew should be relaxed; instead of the "best aligned APUs", those APUs introducing the minimal safe A/V skew should be employed.

SLIDE 33

Outline of the improvement - I

Let the type of A/V skew in which the audio signal component is delayed with respect to its associated video signal be referred to as the forward (in time) skew.

Forward skew is the result of a uniform increment in the beginning stream's AAU presentation time stamps and is therefore associated with an increase in the mean audio buffer fullness level.

Hence, for beginning audio elementary streams with mean audio buffer fullness levels close to underflow, the minimal achievable forward skew is the minimal safe skew and should be the one employed.

SLIDE 34

Outline of the improvement - II

Let the type of A/V skew in which the audio signal component moves earlier in time with respect to its associated video signal be referred to as the backward (in time) skew.

Backward skew is the result of a uniform decrement in the beginning stream's AAU presentation time stamps and is therefore associated with a decrease in the mean audio buffer fullness level.

Hence, for beginning audio elementary streams with mean audio buffer fullness levels close to overflow, the minimal achievable backward skew is the minimal safe skew and should be the one employed.

SLIDE 35

Minimal Safe Skew Algorithm Summary

• For all 8 possible relative position classes of best aligned APUs, three solutions are defined:

1. the minimal achievable skew solution
2. the minimal safe forward skew solution
3. the minimal safe backward skew solution

• Minimal achievable skew is upper-bounded by half the APU duration.

• Minimal safe skew (forward or backward) is upper-bounded by one APU duration. (Still below the sensitivity limits of human perception.)
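The selection among the three solutions reduces to a threshold test on the beginning stream's model-predicted mean buffer fullness; the percentage thresholds below are illustrative placeholders, not values from the presentation:

```python
def pick_solution(mean_fullness_pct, low=20.0, high=80.0):
    """Route a splice to one of the three defined solutions using the
    model-predicted mean audio buffer fullness (% of buffer capacity)."""
    if mean_fullness_pct < low:
        return "minimal safe forward skew"   # raises fullness, away from underflow
    if mean_fullness_pct > high:
        return "minimal safe backward skew"  # lowers fullness, away from overflow
    return "minimal achievable skew"         # buffer safely mid-range

# The 54.36 % fullness predicted for Example 1 would take the default branch:
assert pick_solution(54.36) == "minimal achievable skew"
```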

SLIDE 36

Case 1) Both best aligned APUs are short and 12 msec. < audio gap < 24 msec.

[Figure: VPUs (k-1)…(k+2) with 12 ms sub-intervals; best aligned final APU (j+1) of the ending stream and best aligned initial APU m of the beginning stream.]

SLIDE 37

SOLUTION I: A/V skew of at most 12 msec. Minimal achievable skew implementation, for clips with originally moderate audio buffer levels. Note: an alternate implementation omitting APU (j+2) and including APU (m-1) is also possible and achieves exactly the same resultant A/V skew; this latter solution is preferable in terms of ease of implementation.

SOLUTION II: A/V skew of at most 12 msec. Minimal safe (forward) skew implementation, for clips with originally low audio buffer levels. Note: the same alternate implementation (omitting APU (j+2), including APU (m-1)) applies here as well.

SOLUTION III: A/V skew of at most 24 msec. Minimal safe (backward) skew implementation, for clips with originally high audio buffer levels.

SLIDE 38

Case 6) Best aligned final APU long, best aligned initial APU short, and 0 msec. < audio overlap < 12 msec.

[Figure: VPUs (k-1)…(k+2) with 12 ms sub-intervals; best aligned final APU m of the ending stream overlaps best aligned initial APU (j+1) of the beginning stream.]

SLIDE 39

SOLUTION I: A/V skew of at most 12 msec. Minimal achievable skew implementation, for clips with originally moderate audio buffer levels.

SOLUTION II: A/V skew of at most 12 msec. Minimal safe (forward) skew implementation, for clips with originally low audio buffer levels.

SOLUTION III: A/V skew of at most 24 msec. Minimal safe (backward) skew implementation, for clips with originally high audio buffer levels. Note: an alternate implementation omitting APU (j+1) and including APU m is also possible and achieves exactly the same resultant A/V skew; this latter solution is preferable in terms of ease of implementation.

SLIDE 40

Implementation at TS level

TS level implementation necessitates further considerations:

1) Editing of transport packets
2) Transport packet buffering and re-multiplexing
3) Audio buffer management
4) Metadata generation

SLIDE 41

Editing of transport packets

• The truncation of the final PES packet of the ending audio stream will typically necessitate the insertion of some adaptation field padding into its last transport packet.

• The deletion of some AAU data from the front end of the beginning audio stream's first PES packet will typically necessitate the editing of at most two audio transport packets.

• The possible use of a causal bit-reservoir in Layer-III encoding typically dictates a structural constraint on the beginning audio stream.
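Adaptation-field padding works because the adaptation field sits between the 4-byte TS header and the payload, and may be extended with 0xFF stuffing bytes. A minimal sketch that rebuilds a 188-byte packet around a shortened payload (it ignores any pre-existing adaptation field):

```python
TS_SIZE = 188  # fixed MPEG-2 transport packet size

def repack_with_stuffing(header4: bytes, payload: bytes) -> bytes:
    """Place a short payload at the end of a TS packet, filling the gap with
    an adaptation field of 0xFF stuffing (ISO/IEC 13818-1 packet layout)."""
    assert header4[0] == 0x47 and len(payload) <= TS_SIZE - 5
    hdr = bytearray(header4)
    hdr[3] |= 0x30                           # adaptation field + payload present
    af_len = TS_SIZE - 4 - 1 - len(payload)  # bytes after adaptation_field_length
    af = bytes([af_len])
    if af_len > 0:
        af += b"\x00" + b"\xff" * (af_len - 1)  # flags byte, then stuffing
    return bytes(hdr) + af + payload
```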
SLIDE 42

Transport packet buffering and re-multiplexing

In the transport stream, the audio bit rate is constant while the VAUs are of varying sizes; therefore the relative positions of VAUs and AAUs associated with VPUs and APUs almost aligned in time cannot be kept constant. Almost always, the AAUs are significantly delayed with respect to the VAUs whose decoded representations are almost synchronous with them.

SLIDE 43

Legend

• Black: TS video packets initiating I type pictures.
• Blue: TS video packets initiating P or B type pictures.
• Cyan: TS video packets.
• Red: TS audio packets initiating PES packets.
• Magenta: TS audio packets.
• Yellow: Null TS packets.
• Green: TS packets carrying various system information.
SLIDE 44

Example 1. Encoder B.

[Figures: "TS PACKET TYPE VISUALIZATION PLOT" (TS packet types, color coded, vs. TS packet indices) and "INTERLEAVED AUDIO-VIDEO ALIGNMENT IN THE BITSTREAM WRT PTS VALUES" (interpolated play-out times for individual TS packets, in seconds, vs. TS packet indices); TS packet indices 9812 and 11005 are marked.]

SLIDE 45

Transport packet buffering and re-multiplexing

This TS multiplex structure necessitates:

1) locating and temporarily storing (buffering) the delayed audio packets when the ending stream is truncated based on the last VAU,
2) TS re-multiplexing, in the form of:
   a) deletion of some obsolete audio packets in the beginning stream,
   b) insertion of the audio packets buffered in step 1 into the beginning stream.

SLIDE 46

Metadata Generation

• During the ingest of MPEG-2 transport streams to storage systems:

1. estimate the parameters of the proposed audio elementary stream model,
2. record this descriptive information within the metadata associated with the asset.

• In splicing scenarios with live streams, known characteristics and/or settings of the encoder employed can be used.

SLIDE 47

Conclusions

• Audio elementary stream tailoring based on the minimal achievable or minimal safe A/V skew concepts.

• Audio splicing without any artifacts made possible.

• A simple and efficient model to characterize the audio elementary stream structure embedded in the transport stream. (Prediction and possible control of audio buffer behavior are within reach.)

• Audio splicing cannot be considered independently from video splicing.

SLIDE 48