Book Information
Detailed Book Description
List of Authors xvii
Preface xxi
Acknowledgment xxiii
Notations xxv
Acronyms xxix
About the Companion Website xxxi
Part I Prerequisites 1
1 Introduction 3
Emmanuel Vincent, Sharon Gannot, and Tuomas Virtanen
1.1 Why are Source Separation and Speech Enhancement Needed? 3
1.2 What are the Goals of Source Separation and Speech Enhancement? 4
1.3 How can Source Separation and Speech Enhancement be Addressed? 9
1.4 Outline 11
Bibliography 12
2 Time-Frequency Processing: Spectral Properties 15
Tuomas Virtanen, Emmanuel Vincent, and Sharon Gannot
2.1 Time-Frequency Analysis and Synthesis 15
2.2 Source Properties in the Time-Frequency Domain 23
2.3 Filtering in the Time-Frequency Domain 25
2.4 Summary 28
Bibliography 28
3 Acoustics: Spatial Properties 31
Emmanuel Vincent, Sharon Gannot, and Tuomas Virtanen
3.1 Formalization of the Mixing Process 31
3.2 Microphone Recordings 32
3.3 Artificial Mixtures 36
3.4 Impulse Response Models 37
3.5 Summary 43
Bibliography 43
4 Multichannel Source Activity Detection, Localization, and Tracking 47
Pasi Pertilä, Alessio Brutti, Piergiorgio Svaizer, and Maurizio Omologo
4.1 Basic Notions in Multichannel Spatial Audio 47
4.2 Multi-Microphone Source Activity Detection 52
4.3 Source Localization 54
4.4 Summary 60
Bibliography 60
Part II Single-Channel Separation and Enhancement 65
5 Spectral Masking and Filtering 67
Timo Gerkmann and Emmanuel Vincent
5.1 Time-Frequency Masking 67
5.2 Mask Estimation Given the Signal Statistics 70
5.3 Perceptual Improvements 81
5.4 Summary 82
Bibliography 83
6 Single-Channel Speech Presence Probability Estimation and Noise Tracking 87
Rainer Martin and Israel Cohen
6.1 Speech Presence Probability and its Estimation 87
6.2 Noise Power Spectrum Tracking 93
6.3 Evaluation Measures 102
6.4 Summary 104
Bibliography 104
7 Single-Channel Classification and Clustering Approaches 107
Felix Weninger, Jun Du, Erik Marchi, and Tian Gao
7.1 Source Separation by Computational Auditory Scene Analysis 108
7.2 Source Separation by Factorial HMMs 111
7.3 Separation Based Training 113
7.4 Summary 125
Bibliography 125
8 Nonnegative Matrix Factorization 131
Roland Badeau and Tuomas Virtanen
8.1 NMF and Source Separation 131
8.2 NMF Theory and Algorithms 137
8.3 NMF Dictionary Learning Methods 145
8.4 Advanced NMF Models 148
8.5 Summary 156
Bibliography 156
9 Temporal Extensions of Nonnegative Matrix Factorization 161
Cédric Févotte, Paris Smaragdis, Nasser Mohammadiha, and Gautham J. Mysore
9.1 Convolutive NMF 161
9.2 Overview of Dynamical Models 169
9.3 Smooth NMF 170
9.4 Nonnegative State-Space Models 174
9.5 Discrete Dynamical Models 178
9.6 The Use of Dynamic Models in Source Separation 182
9.7 Which Model to Use? 183
9.8 Summary 184
9.9 Standard Distributions 184
Bibliography 185
Part III Multichannel Separation and Enhancement 189
10 Spatial Filtering 191
Shmulik Markovich-Golan, Walter Kellermann, and Sharon Gannot
10.1 Fundamentals of Array Processing 192
10.2 Array Topologies 197
10.3 Data-Independent Beamforming 199
10.4 Data-Dependent Spatial Filters: Design Criteria 202
10.5 Generalized Sidelobe Canceler Implementation 209
10.6 Postfilters 210
10.7 Summary 211
Bibliography 212
11 Multichannel Parameter Estimation 219
Shmulik Markovich-Golan, Walter Kellermann, and Sharon Gannot
11.1 Multichannel Speech Presence Probability Estimators 219
11.2 Covariance Matrix Estimators Exploiting SPP 227
11.3 Methods for Weakly Guided and Strongly Guided RTF Estimation 228
11.4 Summary 231
Bibliography 231
12 Multichannel Clustering and Classification Approaches 235
Michael I. Mandel, Shoko Araki, and Tomohiro Nakatani
12.1 Two-Channel Clustering 236
12.2 Multichannel Clustering 244
12.3 Multichannel Classification 251
12.4 Spatial Filtering Based on Masks 255
12.5 Summary 257
Bibliography 258
13 Independent Component and Vector Analysis 263
Hiroshi Sawada and Zbyněk Koldovský
13.1 Convolutive Mixtures and their Time-Frequency Representations 264
13.2 Frequency-Domain Independent Component Analysis 265
13.3 Independent Vector Analysis 279
13.4 Example 280
13.5 Summary 284
Bibliography 284
14 Gaussian Model Based Multichannel Separation 289
Alexey Ozerov and Hirokazu Kameoka
14.1 Gaussian Modeling 289
14.2 Library of Spectral and Spatial Models 295
14.3 Parameter Estimation Criteria and Algorithms 300
14.4 Detailed Presentation of Some Methods 305
14.5 Summary 312
Acknowledgment 312
Bibliography 312
15 Dereverberation 317
Emanuël A.P. Habets and Patrick A. Naylor
15.1 Introduction to Dereverberation 317
15.2 Reverberation Cancellation Approaches 319
15.3 Reverberation Suppression Approaches 329
15.4 Direct Estimation 335
15.5 Evaluation of Dereverberation 336
15.6 Summary 337
Bibliography 337
Part IV Application Scenarios and Perspectives 345
16 Applying Source Separation to Music 347
Bryan Pardo, Antoine Liutkus, Zhiyao Duan, and Gaël Richard
16.1 Challenges and Opportunities 348
16.2 Nonnegative Matrix Factorization in the Case of Music 349
16.3 Taking Advantage of the Harmonic Structure of Music 354
16.4 Nonparametric Local Models: Taking Advantage of Redundancies in Music 358
16.5 Taking Advantage of Multiple Instances 363
16.6 Interactive Source Separation 367
16.7 Crowd-Based Evaluation 367
16.8 Some Examples of Applications 368
16.9 Summary 370
Bibliography 370
17 Application of Source Separation to Robust Speech Analysis and Recognition 377
Shinji Watanabe, Tuomas Virtanen, and Dorothea Kolossa
17.1 Challenges and Opportunities 377
17.2 Applications 380
17.3 Robust Speech Analysis and Recognition 390
17.4 Integration of Front-End and Back-End 397
17.5 Use of Multimodal Information with Source Separation 403
17.6 Summary 404
Bibliography 405
18 Binaural Speech Processing with Application to Hearing Devices 413
Simon Doclo, Sharon Gannot, Daniel Marquardt, and Elior Hadad
18.1 Introduction to Binaural Processing 413
18.2 Binaural Hearing 415
18.3 Binaural Noise Reduction Paradigms 416
18.4 The Binaural Noise Reduction Problem 420
18.5 Extensions for Diffuse Noise 425
18.6 Extensions for Interfering Sources 431
18.7 Summary 437
Bibliography 437
19 Perspectives 443
Emmanuel Vincent, Tuomas Virtanen, and Sharon Gannot
19.1 Advancing Deep Learning 443
19.2 Exploiting Phase Relationships 447
19.3 Advancing Multichannel Processing 450
19.4 Addressing Multiple-Device Scenarios 453
19.5 Towards Widespread Commercial Use 455
Acknowledgment 457
Bibliography 457
Index 465