SLIDE 1

Volume normalization after concatenation of audio clips

Laboratorio di produzione prototipi e nuovi format di contenuti digitali

Danilo Abbasciano

danilo.abbasciano@gmail.com

2008 October 29

Volume normalization after concatenation of audio clips – p. 1/2

SLIDE 4

The problem of audio alignment

Listening to heterogeneous audio tracks usually produces differences in perceived volume level. These differences can be very annoying: besides breaking the listener's concentration, they force the listener to keep adjusting the volume manually. The problem concerns all services that require a concatenation of audio/video clips (Simple Media Player, TV, etc.).

SLIDE 5

Goal

To provide a solution that balances sound levels from multiple sources.

SLIDE 7

Project description

The project consists of:

  • a theoretical part, in which we studied the problem of audio alignment, signals and their scale, dynamics, and sound pressure level;

  • a practical part, in which the theory is applied to build a program for manipulating audio signals.

SLIDE 12

Definitions

  • Decibel (dB): a logarithmic unit of measurement that expresses the magnitude of a physical quantity relative to a specified or implied reference level.

  • Sound intensity (W/m²): the ratio between the power of a sound wave and the area of the surface it crosses.

  • Sound intensity level (dB): the sound intensity measured in dB relative to a reference intensity $I_0$: $L = 10 \log_{10}(I / I_0)$

  • Sound pressure (Pa): the local deviation of the pressure from its value at rest.

  • Sound pressure level (dB): a logarithmic measure of sound pressure relative to a reference value $p_0$: $L_p = 10 \log_{10}\left((p / p_0)^2\right) = 20 \log_{10}(p / p_0)$
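As a minimal sketch of the two level formulas above, the functions below convert an intensity and a pressure to decibels. The reference values (1e-12 W/m² and 20 µPa) are the conventional ones for sound in air; the slides do not say which references were used, so they are assumptions here, as are the function names.

```python
import math

# Assumed reference values (conventional for sound in air, not given in the slides).
I0 = 1e-12   # reference intensity, W/m^2
P0 = 20e-6   # reference pressure, Pa


def intensity_level_db(intensity, ref=I0):
    """Sound intensity level: L = 10 * log10(I / I0)."""
    return 10.0 * math.log10(intensity / ref)


def pressure_level_db(pressure, ref=P0):
    """Sound pressure level: Lp = 10 * log10((p / p0)^2) = 20 * log10(p / p0)."""
    return 20.0 * math.log10(pressure / ref)


print(intensity_level_db(1e-6))   # an intensity 10^6 times the reference
print(pressure_level_db(2e-3))    # a pressure 100 times the reference
```

Because intensity is proportional to the square of the pressure, a tenfold increase in pressure and a hundredfold increase in intensity both correspond to the same 20 dB step.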

SLIDE 13

Significant function for the volume of sound

The first problem is to find a function that takes a sound as input and returns a value describing its sound pressure level.

SLIDE 16

Significant function for the volume of sound

  • peak pressure
  • peak-to-peak
  • root mean square

[Figure: pressure plotted against time, with the peak pressure, peak-to-peak and RMS values marked.]
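The three candidate measures can be compared with a short sketch. The function names and the test signal are illustrative assumptions, not taken from the original program. The example adds a single click to a sine wave: the peak measures react strongly to that one sample, while the RMS barely moves, which is one reason RMS is a more stable indicator of overall level.

```python
import math


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(x) for x in samples)


def peak_to_peak(samples):
    """Distance between the highest and the lowest sample."""
    return max(samples) - min(samples)


def rms(samples):
    """Root mean square of the samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


# Hypothetical test signal: one second of a 440 Hz sine at 8 kHz, amplitude 0.5.
rate = 8000
sine = [0.5 * math.sin(2 * math.pi * 440 * i / rate) for i in range(rate)]

# The same signal with one full-scale click replacing a single sample.
clicked = list(sine)
clicked[100] = 1.0

for name, signal in (("sine", sine), ("sine + click", clicked)):
    print(name, peak(signal), peak_to_peak(signal), rms(signal))
```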

SLIDE 18

Root Mean Square (RMS)

For a data set $X = \{x_1, x_2, \ldots, x_n\}$ the root mean square is:

$$X_{rms} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2}$$

For a continuous function $f$ on $[t_0, t_1]$ it is:

$$f_{rms} = \sqrt{\frac{1}{t_1 - t_0} \int_{t_0}^{t_1} [f(x)]^2 \, dx}$$
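The continuous definition can be checked numerically. The sketch below approximates the integral with the midpoint rule (a choice made here for illustration) and applies it to one period of a sine tone, for which the RMS is known to be the amplitude divided by √2. The function name, the amplitude and the frequency are hypothetical.

```python
import math


def rms_continuous(f, t0, t1, steps=10_000):
    """Approximate sqrt(1/(t1 - t0) * integral of f(x)^2 dx) with the midpoint rule."""
    dx = (t1 - t0) / steps
    total = sum(f(t0 + (k + 0.5) * dx) ** 2 for k in range(steps))
    return math.sqrt(total * dx / (t1 - t0))


amplitude = 0.8
frequency = 440.0
period = 1.0 / frequency


def tone(t):
    """A pure sine tone with the amplitude and frequency chosen above."""
    return amplitude * math.sin(2 * math.pi * frequency * t)


estimate = rms_continuous(tone, 0.0, period)
print(estimate, amplitude / math.sqrt(2))
```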

SLIDE 19

Idea

SLIDE 22

How much to change the volume?

Let $v$ be the amplification factor, $M_{rms}$ the desired mean output level, and $M_s$ the measured mean level of the audio stream. Then:

$$M_{rms} \simeq v M_s \quad\Rightarrow\quad v \simeq \frac{M_{rms}}{M_s}$$
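As a sketch of this relation, the first helper below computes the amplification factor from two mean levels, assuming both are linear RMS amplitudes. Levels are often reported in dB instead, so a second helper handles that case, where the ratio becomes a difference. Both helpers and their names are assumptions made for illustration, not the original program's code.

```python
def gain_from_linear(target_rms, stream_rms):
    """v = M_rms / M_s, with both levels given as linear amplitudes."""
    if stream_rms <= 0:
        raise ValueError("stream level must be positive (silent stream?)")
    return target_rms / stream_rms


def db_to_linear(level_db):
    """Convert an amplitude level in dB to a linear amplitude ratio."""
    return 10.0 ** (level_db / 20.0)


def gain_from_db(target_db, stream_db):
    """The same factor when both levels are in dB: v = 10^((target - stream) / 20)."""
    return db_to_linear(target_db - stream_db)


print(gain_from_linear(0.1, 0.05))   # stream at half the target amplitude
print(gain_from_db(-20.0, -26.0))    # stream 6 dB below the target
```

A factor above 1 means the stream is quieter than the target and must be amplified; a factor below 1 means it must be attenuated.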

SLIDE 25

Tools and standards

For the project and its implementation we chose to use Open Source tools only. The main ones are:

  • Platform: GNU/Linux
  • Distribution: Fedora
  • Programming language: Python
  • Library: GStreamer
  • Editor: GNU Emacs
  • File format for audio samples: Ogg Vorbis
  • LaTeX for this presentation
  • ...and many others.

SLIDE 27

Portability

Software written in Python is highly portable. Python runs on many kinds of platforms: UNIX-like systems, Mac OS X, Windows, AIX, AROS, AS/400, BeOS, iPod, OS/2, OS/390, z/OS, PlayStation, PSP, Psion, VxWorks, QNX, Acorn, Sparc Solaris, Windows CE, Pocket PC, VMS, MorphOS, Sharp Zaurus.

SLIDE 30

Integration

The code takes advantage of object-oriented programming. It consists of a library structured into classes and a separate executable script with a command-line interface for testing the library's capabilities. Thanks to this structure, it is very simple to integrate the software into a larger project: simply import the library and use its classes. The interface uses standard GStreamer objects.

SLIDE 31

GStreamer

GStreamer is a library for creating streaming media applications. The framework is based on plugins that provide the various codecs and other functionality. The plugins can be linked and arranged in a pipeline, which defines the flow of the data.

[Diagram: a pipeline of four linked plugins. The first has only a src pad, the two in the middle each have a sink pad and a src pad, and the last has only a sink pad.]

SLIDE 34

GStreamer plugins

  • Level analyses incoming audio buffers and, after each time interval, reports the Root Mean Square (or average power) level in dB for each channel.

  • Fakesink is a black hole for data: it discards everything it receives.
  • Volume sets the volume of an audio stream through an amplification factor that can take values from 0 to 10. Values in [0, 1) attenuate the signal, 1 leaves it unchanged, and values in (1, 10] amplify it.

SLIDE 39

Editing a pipeline for the inclusion of Level

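[Diagram: the analysis pipeline. A plugin with only a src pad feeds a plugin with sink and src pads, which feeds Level (sink and src pads), which ends in Fakesink (sink pad only).]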
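An analysis pipeline of this shape could be sketched as follows. This is not the original program: it assumes the current GStreamer 1.0 Python bindings (PyGObject), which postdate the presentation, and it picks `filesrc`, `decodebin` and `audioconvert` as the upstream elements, which the slides do not name. The helper names and the interval value are hypothetical. The pipeline description is built by a plain function so that it can be inspected without GStreamer installed.

```python
def build_analysis_pipeline(path, interval_ns=100_000_000):
    """Return a gst-launch style description: decode -> level -> fakesink."""
    return (
        f'filesrc location="{path}" ! decodebin ! audioconvert ! '
        f"level name=level interval={interval_ns} post-messages=true ! "
        "fakesink sync=false"
    )


def collect_rms_db(path):
    """Run the pipeline and return the per-channel RMS values (dB), one list per interval."""
    # Imported here so the module loads even where GStreamer is not installed.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(build_analysis_pipeline(path))
    bus = pipeline.get_bus()
    pipeline.set_state(Gst.State.PLAYING)

    readings = []
    wanted = Gst.MessageType.ELEMENT | Gst.MessageType.EOS | Gst.MessageType.ERROR
    while True:
        msg = bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, wanted)
        if msg.type == Gst.MessageType.ELEMENT:
            structure = msg.get_structure()
            if structure is not None and structure.get_name() == "level":
                # "rms" holds one dB value per channel; how the bindings expose
                # this array can differ between versions, so adapt if needed.
                readings.append(list(structure.get_value("rms")))
        else:
            break  # end of stream or error: stop collecting

    pipeline.set_state(Gst.State.NULL)
    return readings


print(build_analysis_pipeline("clip.ogg"))
```

Setting `sync=false` on the sink lets the analysis run as fast as the file can be decoded instead of in real time.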

SLIDE 41

RMS calculation

To get the average level of a stream we can average the RMS values returned by the level plugin. With $m$ consecutive blocks of $n$ samples each:

$$M_s = \frac{1}{m} \sum_{k=0}^{m-1} \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_{i+kn}^2}$$

Once a reference value is set, we need to bring all audio levels to this value.
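The averaging step can be sketched in two ways: directly on samples, following the formula term by term, and on per-block levels already expressed in dB, as the level plugin reports them. Both functions and their names are illustrative assumptions. The dB version converts each value back to a linear amplitude before averaging, so that it computes the same quantity as the formula.

```python
import math


def mean_block_rms(samples, block_size):
    """M_s: mean of the RMS of consecutive blocks of block_size samples.

    Trailing samples that do not fill a whole block are ignored.
    """
    blocks = len(samples) // block_size
    if blocks == 0:
        raise ValueError("need at least one full block")
    total = 0.0
    for k in range(blocks):
        block = samples[k * block_size:(k + 1) * block_size]
        total += math.sqrt(sum(x * x for x in block) / block_size)
    return total / blocks


def mean_rms_from_db(levels_db):
    """The same average when each block RMS is given in dB."""
    if not levels_db:
        raise ValueError("need at least one level reading")
    linear = [10.0 ** (db / 20.0) for db in levels_db]
    return sum(linear) / len(linear)


# Hypothetical data: one loud block followed by one block at half the amplitude.
samples = [1.0] * 4 + [0.5] * 4
print(mean_block_rms(samples, 4))
print(mean_rms_from_db([0.0, -20.0]))
```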

SLIDE 45

Editing a pipeline for the inclusion of Volume

[Diagram: the pipeline of linked plugins with a Volume element (sink and src pads) inserted into it.]
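A playback pipeline with the Volume element could be described as in the sketch below. The upstream and downstream elements (`filesrc`, `decodebin`, `audioconvert`, `autoaudiosink`) and the helper names are assumptions; the slides only show generic plugins. The factor is clamped to the 0 to 10 range accepted by the `volume` property before it is written into the description.

```python
def clamp_volume(factor, low=0.0, high=10.0):
    """Keep the amplification factor inside the range the volume element accepts."""
    return max(low, min(high, factor))


def build_playback_pipeline(path, factor):
    """Return a gst-launch style description: decode -> volume -> audio output."""
    v = clamp_volume(factor)
    return (
        f'filesrc location="{path}" ! decodebin ! audioconvert ! '
        f"volume volume={v:.4f} ! autoaudiosink"
    )


print(build_playback_pipeline("clip.ogg", 0.8))
print(build_playback_pipeline("very_quiet.ogg", 25.0))  # clamped to the maximum
```

The resulting string can be passed to `Gst.parse_launch` from the GStreamer Python bindings, or to the `gst-launch-1.0` command-line tool, to play the clip at the corrected level.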

SLIDE 46

Changing the volume of an audio track

[Figure: "An audio channel with the volume lowered by 0.8". RMS (dB) plotted against sample number for the 'original' and 'modified' tracks. Two values are annotated on the plot: 19.592920654 and 21.5319890141.]
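A quick calculation shows what shift to expect on such a plot. It assumes that 0.8 is the linear amplification factor passed to the Volume plugin, and that the two annotated values are levels in dB for the original and modified tracks; neither is stated explicitly in the slides. Under those assumptions, the expected shift of 20·log10(0.8) dB can be compared with the difference between the two annotated values.

```python
import math

factor = 0.8                           # assumed linear amplification factor
shift_db = 20.0 * math.log10(factor)   # level change caused by that factor

# The two values annotated on the plot, as they appear in the slide text.
annotated_a = 19.592920654
annotated_b = 21.5319890141
difference = annotated_b - annotated_a

print(shift_db)
print(difference)
```

Both magnitudes come out a little above 1.9 dB, which is consistent with the two curves being the same track before and after scaling by 0.8.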

SLIDE 48

Applications

The possible applications of a study of audio alignment concern all systems for watching and listening to multimedia content, for example CreaTiVù, IPTV, TV, music players, radio broadcasts, digital satellite television, and recordings of various media formats on a single medium.

SLIDE 49

Advantages and utility

The advantages of using this filter in a stream could be:

  • Avoiding annoyance in places where people work with audio/video files.
  • Avoiding manual volume adjustments.
  • Avoiding damage to delicate devices.
  • Avoiding the need for a compressor or audio limiter, which reduces the range of the signal at the beginning and end of each audio signal.

SLIDE 50

Advantages and utility

The benefits of using this software instead of an appliance with a dedicated electronic circuit:

  • No additional hardware devices
  • Economic savings
  • More opportunities for integration
  • Increased reliability
  • Energy saving

SLIDE 51

Demo
