# Binaural LCMV Beamforming with Partial Noise Estimation
**Authors**: Nico Gößling, Elior Hadad, Sharon Gannot, Simon Doclo
Nico Gößling, Student Member, IEEE, Elior Hadad, Member, IEEE, Sharon Gannot, Senior Member, IEEE, and Simon Doclo, Senior Member, IEEE
Abstract: Besides reducing undesired sources, i.e., interfering sources and background noise, another important objective of a binaural beamforming algorithm is to preserve the spatial impression of the acoustic scene, which can be achieved by preserving the binaural cues of all sound sources. While the binaural minimum variance distortionless response (BMVDR) beamformer provides good noise reduction performance and preserves the binaural cues of the desired source, it does not allow the reduction of the interfering sources to be controlled and it distorts the binaural cues of the interfering sources and the background noise. Hence, several extensions have been proposed. First, the binaural linearly constrained minimum variance (BLCMV) beamformer uses additional constraints, which make it possible to control the reduction of the interfering sources while preserving their binaural cues. Second, the BMVDR with partial noise estimation (BMVDR-N) mixes the output signals of the BMVDR with the noisy reference microphone signals, which makes it possible to control the binaural cues of the background noise. Aiming at merging the advantages of both extensions, in this paper we propose the BLCMV with partial noise estimation (BLCMV-N). We show that the output signals of the BLCMV-N can be interpreted as a mixture between the noisy reference microphone signals and the output signals of a BLCMV using an adjusted interference scaling parameter. We provide a theoretical comparison between the BMVDR, the BLCMV, the BMVDR-N and the proposed BLCMV-N in terms of noise and interference reduction performance and binaural cue preservation. Experimental results using recorded signals as well as the results of a perceptual listening test show that the BLCMV-N is able to preserve the binaural cues of an interfering source (like the BLCMV), while allowing a trade-off between noise reduction performance and binaural cue preservation of the background noise (like the BMVDR-N).
Index Terms: Binaural cues, binaural noise reduction, MVDR beamformer, LCMV beamformer, hearing devices
## I. INTRODUCTION
BEAMFORMING algorithms for head-mounted assistive hearing devices (e.g., hearing aids, earbuds and hearables) are crucial to improve speech quality and speech intelligibility in noisy acoustic environments. Assuming a binaural configuration where both devices exchange their microphone signals, the information captured by all microphones on both sides of the head can be exploited [1]-[3]. Besides reducing interfering sources (e.g., competing speakers) and background noise (e.g., diffuse babble noise), another important objective of a binaural beamforming algorithm is the preservation of the listener's spatial impression of the acoustic scene. This can be achieved by preserving the binaural cues of all sound sources, i.e., the interaural level difference (ILD) and the interaural time difference (ITD) for coherent sources (desired source and interfering sources) and the interaural coherence (IC) for incoherent sound fields (background noise) [4]. Binaural cues play a major role in spatial perception, i.e., in localizing sound sources and determining the spatial width or diffuseness of a sound field [5], and are very important for speech intelligibility due to so-called binaural unmasking [6], [7].
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - Project ID 352015383 (SFB 1330 B2) and Project ID 390895286 (EXC 2177/1) and the Israeli Ministry of Science and Technology, #88962, 2019.
E. Hadad and S. Gannot are with the Faculty of Engineering, Bar-Ilan University, Ramat-Gan, 5290002, Israel (e-mail: elior.hadad@biu.ac.il; sharon.gannot@biu.ac.il).
N. Gößling and S. Doclo are with the Department of Medical Physics and Acoustics and the Cluster of Excellence Hearing4all, University of Oldenburg, 26111 Oldenburg, Germany (e-mail: nico.goessling@uol.de; simon.doclo@uol.de).
Unlike monaural beamforming algorithms, binaural beamforming algorithms need to generate two output signals (i.e., one for each ear), hence typically processing all available microphone signals from both devices by two different spatial filters [8]-[19]. A frequently used binaural beamforming algorithm is the binaural minimum variance distortionless response (BMVDR) beamformer, which aims at minimizing the power spectral density (PSD) of the noise component in the output signals while preserving the desired source component in the reference microphone signals on the left and the right device [2], [3], [11]. While the BMVDR provides good noise reduction performance and preserves the binaural cues of the desired source, it does not allow the reduction of the interfering sources to be controlled and it distorts the binaural cues of the undesired sources (interfering sources and background noise). More specifically, after applying the BMVDR the binaural cues of the undesired sources are equal to the binaural cues of the desired source, such that all sources are perceived as coming from the same direction, which is obviously undesired. Hence, several extensions of the BMVDR have been proposed. On the one hand, the binaural linearly constrained minimum variance (BLCMV) beamformer uses additional interference reduction constraints, which make it possible to control the reduction of the interfering sources by means of interference scaling parameters while preserving the binaural cues of the interfering sources in addition to those of the desired source [12], [14], [17], [20]. However, due to the additional constraints there are fewer degrees of freedom available for noise reduction, such that the noise reduction performance of the BLCMV is lower than that of the BMVDR. Furthermore, it is not possible to explicitly trade off between noise reduction performance and binaural cue preservation of the background noise. On the other hand, the BMVDR with partial noise estimation (BMVDR-N) aims for the noise component in the output signals to be equal to a scaled version of the noise component in the reference microphone signals while preserving the desired source component in the reference microphone signals [3], [10], [11], [16]. It has been shown that the output signals of the BMVDR-N can be interpreted as a mixture between the output signals of the BMVDR and the noisy reference microphone signals, i.e., the BMVDR-N provides a trade-off between noise reduction performance and binaural cue preservation of the background noise. While for (incoherent) background noise the BMVDR-N showed promising results [16], [21], the effect of partial noise estimation on a (coherent) interfering source strongly depends on the position of the interfering source relative to the desired source and is harder to control [11].
Aiming at merging the advantages of the BLCMV and the BMVDR-N, i.e., preserving the binaural cues of the interfering sources and controlling the reduction of the interfering sources as well as the binaural cues of the background noise, in this paper we propose the BLCMV with partial noise estimation (BLCMV-N). First, we derive two decompositions for the BLCMV-N which reveal differences and similarities between the BLCMV-N and the BLCMV. We show that the output signals of the BLCMV-N can be interpreted as a mixture between the noisy reference microphone signals and the output signals of a BLCMV using an adjusted interference scaling parameter. We then analytically derive the performance of the BLCMV-N in terms of noise and interference reduction performance and binaural cue preservation. We show that the output signal-to-noise ratio (SNR) of the BLCMV-N is smaller than or equal to the output SNR of the BLCMV and derive the optimal interference scaling parameter maximizing the output SNR of the BLCMV-N. The derived analytical expressions are first validated using measured anechoic acoustic transfer functions (ATFs). In addition, more realistic experiments are performed using recorded signals for a binaural hearing device in a reverberant cafeteria with one interfering source and multitalker babble noise. Both the objective performance measures and the results of a perceptual listening test with 13 normal-hearing participants show that the proposed BLCMV-N is able to preserve the binaural cues and hence the spatial impression of the interfering source (like the BLCMV), while trading off between noise reduction performance and binaural cue preservation of the background noise (like the BMVDR-N).
The remainder of this paper is organized as follows. In Section II we introduce the considered binaural hearing device configuration and the used objective performance measures. In Section III we briefly review several binaural beamforming algorithms, namely the BMVDR, the BLCMV and the BMVDR-N. In Section IV we present the BLCMV-N and derive two decompositions. In Section V we provide a detailed theoretical analysis of the proposed BLCMV-N in terms of noise and interference reduction performance and binaural cue preservation. In Section VI we first validate the analytical expressions using anechoic ATFs, followed by simulations and a perceptual listening test using realistic recordings in a reverberant room.
Fig. 1. Binaural hearing device configuration with $M_L$ microphones on the left side and $M_R$ microphones on the right side.
<details>
<summary>Image 1 Details</summary>

Block diagram of the binaural configuration: the microphone signals $y_1, \dots, y_{M_L}$ (left side) and $y_{M_L+1}, \dots, y_{M_L+M_R}$ (right side) are all available to both filters $\mathbf{W}_L$ and $\mathbf{W}_R$, which produce the left and the right output signals $z_L$ and $z_R$.
</details>
## II. HEARING DEVICE CONFIGURATION
In Section II-A the considered binaural hearing device configuration and the signal model are introduced. In Sections II-B and II-C the objective performance measures and the binaural cues are defined.
## A. Signal Model
Consider the binaural hearing device configuration depicted in Figure 1 with $M_L$ microphones on the left side and $M_R$ microphones on the right side, i.e., $M = M_L + M_R$ microphones in total. In this paper we consider an acoustic scenario with one desired source (target speaker) and one interfering source (competing speaker) in a noisy and reverberant environment, where the background noise is assumed to be incoherent (e.g., diffuse babble noise, sensor noise).
In the frequency domain, the $m$-th microphone signal $y_m(\omega)$ can be decomposed as
$$y_m(\omega) = x_m(\omega) + u_m(\omega) + n_m(\omega) = x_m(\omega) + v_m(\omega), \tag{1}$$
with $\omega$ the normalized (radian) frequency, $x_m(\omega)$ the desired source component, $u_m(\omega)$ the interfering source component and $n_m(\omega)$ the noise component in the $m$-th microphone signal. The undesired component $v_m(\omega)$ is defined as the sum of the interfering source component $u_m(\omega)$ and the noise component $n_m(\omega)$. For the sake of conciseness, we omit the variable $\omega$ in the remainder of the paper wherever possible. The $M$-dimensional noisy input vector containing all microphone signals is defined as
$$\mathbf{y} = [y_1, \; \dots, \; y_M]^T, \tag{2}$$
where $(\cdot)^T$ denotes the transpose. Using (1), this vector can be written as
$$\mathbf{y} = \mathbf{x} + \mathbf{u} + \mathbf{n} = \mathbf{x} + \mathbf{v}, \tag{3}$$
where $\mathbf{x}$, $\mathbf{u}$, $\mathbf{n}$ and $\mathbf{v}$ are defined similarly as $\mathbf{y}$ in (2).
For the considered acoustic scenario, the desired source component and the interfering source component can be written as
$$\mathbf{x} = \mathbf{a}\, s_x, \quad \mathbf{u} = \mathbf{b}\, s_u, \tag{4}$$
where $s_x$ and $s_u$ denote the desired source signal and the interfering source signal, respectively, and $\mathbf{a}$ and $\mathbf{b}$ denote $M$-dimensional ATF vectors, containing the ATFs between the microphones and the desired source and the interfering source, respectively. It should be noted that the ATFs include reverberation, microphone characteristics and the head-shadow effect.
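Without loss of generality, the first microphone on each side is defined as the so-called reference microphone. To simplify the notation, the reference microphone signals $y_1$ and $y_{M_L+1}$ are denoted as $y_L$ and $y_R$, i.e.,
$$y_L = \mathbf{e}_L^T \mathbf{y}, \quad y_R = \mathbf{e}_R^T \mathbf{y}, \tag{5}$$
where $\mathbf{e}_L$ and $\mathbf{e}_R$ denote $M$-dimensional selection vectors with all elements equal to 0 except one element equal to 1, i.e., $\mathbf{e}_L(1) = 1$ and $\mathbf{e}_R(M_L+1) = 1$. Using (3), (4) and (5), the reference microphone signals can be written as
$$y_L = a_L s_x + b_L s_u + n_L, \quad y_R = a_R s_x + b_R s_u + n_R, \tag{6}$$
where the subscripts $L$ and $R$ denote the elements corresponding to the reference microphones, e.g., $a_L = \mathbf{e}_L^T \mathbf{a}$ and $a_R = \mathbf{e}_R^T \mathbf{a}$.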
The noisy input covariance matrix $\mathbf{R}_y$, the desired source covariance matrix $\mathbf{R}_x$, the interfering source covariance matrix $\mathbf{R}_u$ and the noise covariance matrix $\mathbf{R}_n$ are defined as
$$\mathbf{R}_y = E\{\mathbf{y}\mathbf{y}^H\}, \tag{7}$$
$$\mathbf{R}_x = E\{\mathbf{x}\mathbf{x}^H\}, \tag{8}$$
$$\mathbf{R}_u = E\{\mathbf{u}\mathbf{u}^H\}, \quad \mathbf{R}_n = E\{\mathbf{n}\mathbf{n}^H\}, \tag{9}$$
with $E\{\cdot\}$ the expected value operator and $(\cdot)^H$ the conjugate transpose. Assuming statistical independence between all signal components, $\mathbf{R}_y$ can be written as
$$\mathbf{R}_y = \mathbf{R}_x + \mathbf{R}_u + \mathbf{R}_n = \mathbf{R}_x + \mathbf{R}_v, \tag{10}$$
with $\mathbf{R}_v = \mathbf{R}_u + \mathbf{R}_n$ the undesired covariance matrix. Using (4), (8) and (9), the desired source covariance matrix and the interfering source covariance matrix can be written as rank-1 matrices, i.e.,
$$\mathbf{R}_x = p_x\, \mathbf{a}\mathbf{a}^H, \quad \mathbf{R}_u = p_u\, \mathbf{b}\mathbf{b}^H, \tag{11}$$
with $p_x = E\{|s_x|^2\}$ the PSD of the desired source and $p_u = E\{|s_u|^2\}$ the PSD of the interfering source. The noise covariance matrix $\mathbf{R}_n$ is assumed to be full-rank, i.e., invertible and positive definite.
The left and the right output signals $z_L$ and $z_R$ are obtained by filtering and summing all microphone signals using the $M$-dimensional filter vectors $\mathbf{w}_L$ and $\mathbf{w}_R$ (cf. Figure 1), i.e.,
$$z_L = \mathbf{w}_L^H \mathbf{y}, \quad z_R = \mathbf{w}_R^H \mathbf{y}. \tag{12}$$
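As a rough illustration of the signal model, the following NumPy sketch builds the rank-1 source covariance matrices, the selection vectors and the filter-and-sum outputs for one frequency bin. The microphone counts, the random ATF vectors, the PSD values and the noise covariance matrix are placeholder assumptions chosen for illustration, not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, illustrative only
M_L, M_R = 2, 2                     # assumed microphone counts per side
M = M_L + M_R

def crandn(*shape):
    """Complex Gaussian samples used as stand-ins for measured quantities."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

a, b = crandn(M), crandn(M)         # hypothetical ATF vectors (desired / interfering source)
p_x, p_u = 1.0, 0.5                 # assumed source PSDs
G = crandn(M, M)
R_n = G @ G.conj().T + 0.1 * np.eye(M)   # Hermitian positive definite noise covariance

R_x = p_x * np.outer(a, a.conj())   # rank-1 desired source covariance, p_x a a^H
R_u = p_u * np.outer(b, b.conj())   # rank-1 interfering source covariance, p_u b b^H
R_y = R_x + R_u + R_n               # holds for mutually independent components

e_L = np.zeros(M); e_L[0] = 1.0     # selects the left reference microphone
e_R = np.zeros(M); e_R[M_L] = 1.0   # selects the right reference microphone

y = crandn(M)                       # one noisy snapshot at a single frequency
w_L, w_R = crandn(M), crandn(M)     # arbitrary filters, only to show the operation
z_L = np.vdot(w_L, y)               # w_L^H y (np.vdot conjugates its first argument)
z_R = np.vdot(w_R, y)               # w_R^H y
```

In practice the covariance matrices would be estimated from recorded signals; the closed-form construction here only mirrors the model assumptions.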
## B. Objective Performance Measures
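The PSD and the cross power spectral density (CPSD) of the desired source component in the left and the right reference microphone signal are given by
$$p^{\rm in}_{x,L} = E\{|x_L|^2\} = \mathbf{e}_L^T \mathbf{R}_x \mathbf{e}_L, \tag{13}$$
$$p^{\rm in}_{x,R} = E\{|x_R|^2\} = \mathbf{e}_R^T \mathbf{R}_x \mathbf{e}_R, \tag{14}$$
$$p^{\rm in}_{x,LR} = E\{x_L x_R^*\} = \mathbf{e}_L^T \mathbf{R}_x \mathbf{e}_R. \tag{15}$$
Similarly, the output PSD of the desired source component in the left and the right output signal is given by
$$p^{\rm out}_{x,L} = \mathbf{w}_L^H \mathbf{R}_x \mathbf{w}_L, \quad p^{\rm out}_{x,R} = \mathbf{w}_R^H \mathbf{R}_x \mathbf{w}_R. \tag{16}$$
The same definitions can be applied for the noisy input signal, the interfering source component and the noise component by substituting $\mathbf{R}_x$ with $\mathbf{R}_y$, $\mathbf{R}_u$ or $\mathbf{R}_n$.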
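The narrowband input SNR in the left and the right reference microphone signal is defined as the ratio of the input PSD of the desired source and noise components, i.e.,
$$\mathrm{SNR}^{\rm in}_L = \frac{\mathbf{e}_L^T \mathbf{R}_x \mathbf{e}_L}{\mathbf{e}_L^T \mathbf{R}_n \mathbf{e}_L}, \quad \mathrm{SNR}^{\rm in}_R = \frac{\mathbf{e}_R^T \mathbf{R}_x \mathbf{e}_R}{\mathbf{e}_R^T \mathbf{R}_n \mathbf{e}_R}. \tag{17}$$
Similarly, the narrowband output SNR in the left and the right output signal is defined as the ratio of the output PSD of the desired source and noise components, i.e.,
$$\mathrm{SNR}^{\rm out}_L = \frac{\mathbf{w}_L^H \mathbf{R}_x \mathbf{w}_L}{\mathbf{w}_L^H \mathbf{R}_n \mathbf{w}_L}, \quad \mathrm{SNR}^{\rm out}_R = \frac{\mathbf{w}_R^H \mathbf{R}_x \mathbf{w}_R}{\mathbf{w}_R^H \mathbf{R}_n \mathbf{w}_R}. \tag{18}$$
The SNR improvement (in dB) is defined as $\Delta\mathrm{SNR}_{L/R} = 10 \log_{10} \mathrm{SNR}^{\rm out}_{L/R} - 10 \log_{10} \mathrm{SNR}^{\rm in}_{L/R}$.
The narrowband input signal-to-interference ratio (SIR) in the left and the right reference microphone signal is defined as the ratio of the input PSD of the desired source and interfering source components, i.e.,
$$\mathrm{SIR}^{\rm in}_L = \frac{\mathbf{e}_L^T \mathbf{R}_x \mathbf{e}_L}{\mathbf{e}_L^T \mathbf{R}_u \mathbf{e}_L}, \quad \mathrm{SIR}^{\rm in}_R = \frac{\mathbf{e}_R^T \mathbf{R}_x \mathbf{e}_R}{\mathbf{e}_R^T \mathbf{R}_u \mathbf{e}_R}. \tag{19}$$
Similarly, the narrowband output SIR in the left and the right output signal is defined as the ratio of the output PSD of the desired source and interfering source components, i.e.,
$$\mathrm{SIR}^{\rm out}_L = \frac{\mathbf{w}_L^H \mathbf{R}_x \mathbf{w}_L}{\mathbf{w}_L^H \mathbf{R}_u \mathbf{w}_L}, \quad \mathrm{SIR}^{\rm out}_R = \frac{\mathbf{w}_R^H \mathbf{R}_x \mathbf{w}_R}{\mathbf{w}_R^H \mathbf{R}_u \mathbf{w}_R}. \tag{20}$$
The SIR improvement (in dB) is defined as $\Delta\mathrm{SIR}_{L/R} = 10 \log_{10} \mathrm{SIR}^{\rm out}_{L/R} - 10 \log_{10} \mathrm{SIR}^{\rm in}_{L/R}$.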
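The SNR and SIR improvements can be computed from quadratic forms of the covariance matrices. The sketch below is one possible way to do this; the helper names `quad` and `improvement_db` and the two-microphone toy example are assumptions made for illustration.

```python
import numpy as np

def quad(w, R):
    """Real part of the quadratic form w^H R w, i.e. a PSD for a covariance matrix R."""
    return float(np.real(np.vdot(w, R @ w)))

def improvement_db(w, e_ref, R_x, R_d):
    """Output-minus-input ratio (in dB) of desired PSD to disturbance PSD.

    Pass R_d = R_n for the SNR improvement or R_d = R_u for the SIR improvement.
    """
    ratio_out = quad(w, R_x) / quad(w, R_d)
    ratio_in = quad(e_ref, R_x) / quad(e_ref, R_d)
    return 10 * np.log10(ratio_out) - 10 * np.log10(ratio_in)

# Toy check: two microphones, identical ATFs, spatially white noise.
a = np.array([1.0, 1.0])
R_x = np.outer(a, a.conj())          # desired source covariance with p_x = 1
R_n = np.eye(2)                      # unit-variance uncorrelated noise
e_L = np.array([1.0, 0.0])           # reference microphone selector
w_avg = a / 2                        # plain averaging filter, chosen for illustration
delta_snr = improvement_db(w_avg, e_L, R_x, R_n)
```

For this toy case the averaging filter halves the noise PSD while leaving the desired source untouched, so `delta_snr` equals `10 * log10(2)` dB.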
## C. Binaural Cues
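For coherent sources (desired source and interfering source) the main binaural cues used by the auditory system are the ILD and the ITD [4], which can be computed from the so-called interaural transfer function (ITF). Using (11), the input ITFs of the desired source and the interfering source are given by [11]
$$\mathrm{ITF}^{\rm in}_x = \frac{E\{x_L x_R^*\}}{E\{|x_R|^2\}} = \frac{\mathbf{e}_L^T \mathbf{R}_x \mathbf{e}_R}{\mathbf{e}_R^T \mathbf{R}_x \mathbf{e}_R} = \frac{a_L}{a_R}, \tag{21}$$
$$\mathrm{ITF}^{\rm in}_u = \frac{E\{u_L u_R^*\}}{E\{|u_R|^2\}} = \frac{\mathbf{e}_L^T \mathbf{R}_u \mathbf{e}_R}{\mathbf{e}_R^T \mathbf{R}_u \mathbf{e}_R} = \frac{b_L}{b_R}. \tag{22}$$
Similarly, the output ITFs of the desired source and the interfering source are given by
$$\mathrm{ITF}^{\rm out}_x = \frac{\mathbf{w}_L^H \mathbf{R}_x \mathbf{w}_R}{\mathbf{w}_R^H \mathbf{R}_x \mathbf{w}_R} = \frac{\mathbf{w}_L^H \mathbf{a}}{\mathbf{w}_R^H \mathbf{a}}, \tag{23}$$
$$\mathrm{ITF}^{\rm out}_u = \frac{\mathbf{w}_L^H \mathbf{R}_u \mathbf{w}_R}{\mathbf{w}_R^H \mathbf{R}_u \mathbf{w}_R} = \frac{\mathbf{w}_L^H \mathbf{b}}{\mathbf{w}_R^H \mathbf{b}}. \tag{24}$$
The ILD and the ITD can be calculated from the ITF as [11]
$$\mathrm{ILD} = |\mathrm{ITF}|^2, \quad \mathrm{ITD} = \frac{\angle(\mathrm{ITF})}{\omega}, \tag{25}$$
with $\angle(\cdot)$ denoting the unwrapped phase.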
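For an incoherent sound field (background noise), ILD and ITD cues are not very descriptive, but the IC is known to play a major role for spatial perception (e.g., spatial width or diffuseness) [4]. The input IC of the noise component is defined as
$$\mathrm{IC}^{\rm in}_n = \frac{\mathbf{e}_L^T \mathbf{R}_n \mathbf{e}_R}{\sqrt{(\mathbf{e}_L^T \mathbf{R}_n \mathbf{e}_L)(\mathbf{e}_R^T \mathbf{R}_n \mathbf{e}_R)}},$$
while the output IC of the noise component is defined as
$$\mathrm{IC}^{\rm out}_n = \frac{\mathbf{w}_L^H \mathbf{R}_n \mathbf{w}_R}{\sqrt{(\mathbf{w}_L^H \mathbf{R}_n \mathbf{w}_L)(\mathbf{w}_R^H \mathbf{R}_n \mathbf{w}_R)}}.$$
Because the IC is typically complex-valued, the magnitude-squared coherence (MSC) is often used. The input and the output MSC of the noise component are defined as
$$\mathrm{MSC}^{\rm in}_n = |\mathrm{IC}^{\rm in}_n|^2, \quad \mathrm{MSC}^{\rm out}_n = |\mathrm{IC}^{\rm out}_n|^2.$$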
An MSC of 1 corresponds to a coherent source perceived as a distinct point source, while smaller MSC values correspond to a broader or even diffuse sound field impression [4].
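The binaural cue definitions translate directly into a few small functions. The following sketch is only an illustration: the function names are hypothetical, the ILD is returned in dB, and the ITD is computed from the phase at a single frequency without the unwrapping across frequency that a full implementation would need.

```python
import numpy as np

def itf(w_L, w_R, h):
    """ITF of a coherent source with ATF vector h: (w_L^H h) / (w_R^H h)."""
    return np.vdot(w_L, h) / np.vdot(w_R, h)

def ild_db(itf_value):
    """ILD expressed in dB, i.e. 10 log10 |ITF|^2."""
    return 10 * np.log10(np.abs(itf_value) ** 2)

def itd(itf_value, omega):
    """ITD from the ITF phase at one frequency (no unwrapping across frequency)."""
    return np.angle(itf_value) / omega

def msc(w_L, w_R, R):
    """Magnitude-squared coherence of the two channels for covariance matrix R."""
    cross = np.vdot(w_L, R @ w_R)
    auto = np.real(np.vdot(w_L, R @ w_L)) * np.real(np.vdot(w_R, R @ w_R))
    return float(np.abs(cross) ** 2 / auto)

# Input cues are obtained by using the selection vectors as "filters".
e_L, e_R = np.array([1.0, 0.0]), np.array([0.0, 1.0])
h = np.array([2.0, 1j])                          # hypothetical two-microphone ATF vector
itf_in = itf(e_L, e_R, h)                        # equals h[0] / h[1]
msc_white = msc(e_L, e_R, np.eye(2))             # uncorrelated noise
msc_coherent = msc(e_L, e_R, np.ones((2, 2)))    # fully coherent field
```

The two MSC examples reproduce the limiting cases mentioned above: uncorrelated noise at the two ears gives an MSC of 0, whereas a fully coherent field gives an MSC of 1.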
## III. BINAURAL BEAMFORMING ALGORITHMS
In this section we briefly review three state-of-the-art binaural beamforming algorithms, namely the BMVDR beamformer, the BLCMV beamformer and the BMVDR-N beamformer. We discuss the performance of these beamforming algorithms in terms of noise and interference reduction performance and binaural cue preservation. For the sake of conciseness, we only show expressions for the left hearing device, denoted by the subscript L . It should be noted that all expressions can also be formulated for the right hearing device by changing the subscript to R .
## A. BMVDR Beamformer
The BMVDR aims at minimizing the output PSD of the noise component while preserving the desired source component in the reference microphone signals [2], [3], [11]. The constrained optimization problem for the left filter vector is given by
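$$\min_{\mathbf{w}_L} \; \mathbf{w}_L^H \mathbf{R}_n \mathbf{w}_L \quad \text{subject to} \quad \mathbf{w}_L^H \mathbf{a} = a_L. \tag{28}$$
Using (4), (6) and (9), the solution of (28) is equal to [2], [22], [23]
$$\mathbf{w}_{{\rm BMVDR},L} = \frac{\mathbf{R}_n^{-1} \mathbf{a}}{\gamma_a}\, a_L^*, \tag{29}$$
with
$$\gamma_a = \mathbf{a}^H \mathbf{R}_n^{-1} \mathbf{a}. \tag{30}$$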
It should be noted that the BMVDR can also be defined using the undesired covariance matrix R v instead of the noise covariance matrix R n . However, since R v is considerably more difficult to estimate or model in practice than R n , in this paper we only consider the BMVDR using R n in (29).
By substituting (29) in (18) and (20), it has been shown in [3], [11] that the output SNR and the output SIR of the BMVDR are equal to
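$$\mathrm{SNR}^{\rm out}_{{\rm BMVDR},L} = p_x \gamma_a, \tag{31}$$
$$\mathrm{SIR}^{\rm out}_{{\rm BMVDR},L} = \frac{p_x}{p_u} \frac{\gamma_a^2}{|\gamma_{ab}|^2}, \tag{32}$$
with $\gamma_a$ defined in (30) and
$$\gamma_{ab} = \mathbf{a}^H \mathbf{R}_n^{-1} \mathbf{b}. \tag{33}$$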
Although the BMVDR yields the largest output SNR among all distortionless binaural beamforming algorithms, the output SIR depends on the relative position of the interfering source to the desired source, cf. (33).
As shown in [3], [11], [13], the BMVDR preserves the binaural cues of the desired source, i.e.,
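$$\mathrm{ITF}^{\rm out}_{x} = \frac{a_L}{a_R} = \mathrm{ITF}^{\rm in}_x, \tag{34}$$
but distorts the binaural cues of the undesired sources, i.e., for the interfering source
$$\mathrm{ITF}^{\rm out}_{u} = \frac{a_L}{a_R} = \mathrm{ITF}^{\rm in}_x \neq \mathrm{ITF}^{\rm in}_u, \tag{35}$$
and for the background noise
$$\mathrm{MSC}^{\rm out}_{n} = 1. \tag{36}$$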
Hence, at the output of the BMVDR the interfering source and the (incoherent) background noise are perceived as coming from the direction of the desired source, which is obviously undesired in terms of spatial awareness.
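These properties of the BMVDR can be checked numerically. The sketch below uses random ATF vectors and a random positive definite noise covariance matrix as stand-ins (assumptions, not measured data) and the filter form $\mathbf{R}_n^{-1}\mathbf{a}\,a_{\rm ref}^*/\gamma_a$ with $\gamma_a = \mathbf{a}^H \mathbf{R}_n^{-1} \mathbf{a}$; the function name `bmvdr` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M_L, M = 2, 4                        # assumed: two microphones per side

def crandn(*shape):
    """Complex Gaussian samples used as stand-ins for measured quantities."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

a, b = crandn(M), crandn(M)          # hypothetical ATF vectors
G = crandn(M, M)
R_n = G @ G.conj().T + 0.1 * np.eye(M)   # positive definite noise covariance
p_x = 1.0                            # assumed desired source PSD

def bmvdr(R_n, a, ref):
    """MVDR filter that passes the desired source as received at microphone `ref`."""
    Rinv_a = np.linalg.solve(R_n, a)
    gamma_a = float(np.real(np.vdot(a, Rinv_a)))   # a^H R_n^{-1} a
    return Rinv_a / gamma_a * np.conj(a[ref]), gamma_a

w_L, gamma_a = bmvdr(R_n, a, 0)      # left reference microphone
w_R, _ = bmvdr(R_n, a, M_L)          # right reference microphone

noise_L = np.real(np.vdot(w_L, R_n @ w_L))
noise_R = np.real(np.vdot(w_R, R_n @ w_R))
snr_out_L = p_x * abs(np.vdot(w_L, a)) ** 2 / noise_L
itf_x_in = a[0] / a[M_L]
itf_u_out = np.vdot(w_L, b) / np.vdot(w_R, b)          # ITF of the interferer after filtering
msc_n_out = abs(np.vdot(w_L, R_n @ w_R)) ** 2 / (noise_L * noise_R)
```

Because the left and the right filter are scaled versions of the same vector, the interferer takes on the ITF of the desired source and the output noise becomes fully coherent.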
## B. BLCMV Beamformer
In addition to preserving the desired source component in the reference microphone signals, the BLCMV preserves a scaled version of the interfering source component in the reference microphone signals while minimizing the output PSD of the noise component [12], [14]. The constrained optimization problem for the left filter vector is given by [14]
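$$\min_{\mathbf{w}_L} \; \mathbf{w}_L^H \mathbf{R}_n \mathbf{w}_L \quad \text{subject to} \quad \mathbf{w}_L^H \mathbf{a} = a_L, \quad \mathbf{w}_L^H \mathbf{b} = \delta\, b_L, \tag{37}$$
with $0 < \delta \leq 1$ the (real-valued) interference scaling parameter. Using (4), (6) and (9), the solution of (37) is equal to [14]
$$\mathbf{w}_{{\rm BLCMV},L} = \mathbf{R}_n^{-1} \mathbf{C} \left( \mathbf{C}^H \mathbf{R}_n^{-1} \mathbf{C} \right)^{-1} \mathbf{g}_L, \tag{38}$$
with the constraint matrix $\mathbf{C}$ and the left response vector $\mathbf{g}_L$ defined as
$$\mathbf{C} = [\mathbf{a}, \; \mathbf{b}], \quad \mathbf{g}_L = [a_L^*, \; \delta\, b_L^*]^T. \tag{39}$$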
By substituting (38) in (18), it has been shown in [14] that the output SNR of the BLCMV is equal to
<!-- formula-not-decoded -->
with
<!-- formula-not-decoded -->
where $\Re\{\cdot\}$ denotes the real part of a complex number. The output SNR of the BLCMV in (40) is smaller than or equal to the output SNR of the BMVDR in (31), since fewer degrees of freedom are available for noise reduction. In addition, the output SIR of the BLCMV is equal to [14]
$$\mathrm{SIR}^{\rm out}_{{\rm BLCMV},L} = \frac{1}{\delta^2} \frac{p_x |a_L|^2}{p_u |b_L|^2} = \frac{1}{\delta^2}\, \mathrm{SIR}^{\rm in}_L,$$
which can hence be directly controlled by the interference scaling parameter δ .
As shown in [14], the BLCMV preserves the binaural cues of both the desired source and the interfering source, i.e.,
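$$\mathrm{ITF}^{\rm out}_{x} = \frac{a_L}{a_R} = \mathrm{ITF}^{\rm in}_x,$$
$$\mathrm{ITF}^{\rm out}_{u} = \frac{\delta\, b_L}{\delta\, b_R} = \frac{b_L}{b_R} = \mathrm{ITF}^{\rm in}_u,$$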
and the output MSC of the noise component is equal to
<!-- formula-not-decoded -->
Because $\mathbf{R}_{xu,1}$ in (41) is a rank-2 matrix, it has been shown in [14] that the output MSC of the noise component is smaller than 1 but is not equal to the input MSC of the noise component. Furthermore, it should be noted that the output MSC of the noise component depends on the relative position of the interfering source to the desired source, cf. (41) and (42), such that it is not straightforward to control the binaural cues of the background noise.
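The interference-related properties of the BLCMV can be verified with a short numerical sketch. It assumes the standard LCMV solution $\mathbf{R}_n^{-1}\mathbf{C}(\mathbf{C}^H\mathbf{R}_n^{-1}\mathbf{C})^{-1}\mathbf{g}$ with $\mathbf{C} = [\mathbf{a}, \mathbf{b}]$ and response vector $[a_{\rm ref}^*, \delta b_{\rm ref}^*]^T$; all numerical values and the helper name `lcmv` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
M_L, M = 2, 4                        # assumed: two microphones per side

def crandn(*shape):
    """Complex Gaussian samples used as stand-ins for measured quantities."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

a, b = crandn(M), crandn(M)          # hypothetical ATF vectors
G = crandn(M, M)
R_n = G @ G.conj().T + 0.1 * np.eye(M)
p_x, p_u, delta = 1.0, 0.5, 0.3      # assumed PSDs and interference scaling parameter

def lcmv(R_n, C, g):
    """w = R_n^{-1} C (C^H R_n^{-1} C)^{-1} g, which satisfies C^H w = g."""
    Rinv_C = np.linalg.solve(R_n, C)
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, g)

C = np.column_stack([a, b])          # constraint matrix [a, b]
g_L = np.array([np.conj(a[0]), delta * np.conj(b[0])])
g_R = np.array([np.conj(a[M_L]), delta * np.conj(b[M_L])])
w_L, w_R = lcmv(R_n, C, g_L), lcmv(R_n, C, g_R)

sir_in_L = p_x * abs(a[0]) ** 2 / (p_u * abs(b[0]) ** 2)
sir_out_L = p_x * abs(np.vdot(w_L, a)) ** 2 / (p_u * abs(np.vdot(w_L, b)) ** 2)
itf_u_out = np.vdot(w_L, b) / np.vdot(w_R, b)   # should equal b_L / b_R
```

With `delta = 0.3` the SIR improves by a factor of `1 / delta**2`, i.e. roughly 10.5 dB, independently of the source positions, while the ITF of the interferer is unchanged.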
## C. BMVDR-N Beamformer
In addition to preserving the desired source component in the reference microphone signals, the BMVDR with partial noise estimation (BMVDR-N) aims at preserving a scaled version of the noise component in the reference microphone signals [3], [10], [11]. The constrained optimization problem for the left filter vector is given by
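$$\min_{\mathbf{w}_L} \; E\{ |\mathbf{w}_L^H \mathbf{n} - \eta\, n_L|^2 \} \quad \text{subject to} \quad \mathbf{w}_L^H \mathbf{a} = a_L, \tag{47}$$
with $0 \leq \eta \leq 1$ the (real-valued) mixing parameter. It has been shown in [11] that the solution of (47) is equal to
$$\mathbf{w}_{{\rm BMVDR\text{-}N},L} = (1-\eta)\, \mathbf{w}_{{\rm BMVDR},L} + \eta\, \mathbf{e}_L, \tag{48}$$
with $\mathbf{w}_{{\rm BMVDR},L}$ defined in (29). Hence, the output signals of the BMVDR-N can be interpreted as a mixture between the noisy reference microphone signals (scaled with $\eta$) and the output signals of the BMVDR (scaled with $1-\eta$). For $\eta = 0$, the BMVDR-N is equal to the BMVDR, whereas for $\eta = 1$, no beamforming is applied.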
Since the output signals of the BMVDR are mixed with the noisy reference microphone signals, the output SNR of the BMVDR-N is always smaller than or equal to the output SNR of the BMVDR [11], i.e.,
$$\mathrm{SNR}^{\rm out}_{{\rm BMVDR\text{-}N},L} \leq \mathrm{SNR}^{\rm out}_{{\rm BMVDR},L}, \tag{49}$$
and decreases with increasing η . By substituting (48) in (20), it can be shown that the output SIR of the BMVDR-N is equal to with
<!-- formula-not-decoded -->
As shown in [11], [16], the BMVDR-N preserves the binaural cues of the desired source, i.e.,
$$\mathrm{ITF}^{\rm out}_{x} = \frac{a_L}{a_R} = \mathrm{ITF}^{\rm in}_x. \tag{51}$$
By substituting (48) in (24) and (26), it has been shown in [16] and [20] that the output ITF of the interfering source is equal to
<!-- formula-not-decoded -->
and the output MSC of the noise component is equal to
<!-- formula-not-decoded -->
It can be seen from (52) and (53) that the binaural cues of the undesired sources (interfering source and background noise) are preserved only for $\eta = 1$, whereas for $\eta = 0$ the binaural cues of the undesired sources are equal to the binaural cues of the desired source (as for the BMVDR). The mixing parameter $\eta$ hence allows a trade-off between noise reduction performance and binaural cue preservation of the background noise, or in other words allows the binaural cues of the background noise to be controlled. Furthermore, it should be noted that the interference reduction performance in (50) and the output ITF of the interfering source in (52) depend not only on the mixing parameter $\eta$ but also on the relative position of the interfering source to the desired source, such that it is not straightforward to control both.
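The trade-off controlled by the mixing parameter can be made concrete with a small sketch that mixes a BMVDR filter with the reference selection vector and evaluates the output SNR for several values of $\eta$. The random ATF vector, the noise covariance matrix and the function names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 4                                # assumed total number of microphones

def crandn(*shape):
    """Complex Gaussian samples used as stand-ins for measured quantities."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

a = crandn(M)                        # hypothetical ATF vector of the desired source
G = crandn(M, M)
R_n = G @ G.conj().T + 0.1 * np.eye(M)
p_x = 1.0                            # assumed desired source PSD

e_L = np.zeros(M); e_L[0] = 1.0      # left reference microphone selector
Rinv_a = np.linalg.solve(R_n, a)
w_bmvdr = Rinv_a / np.real(np.vdot(a, Rinv_a)) * np.conj(a[0])

def bmvdr_n(eta):
    """Mixture of the BMVDR filter (weight 1 - eta) and the reference microphone (weight eta)."""
    return (1 - eta) * w_bmvdr + eta * e_L

def snr_out(w):
    """Output SNR: desired source PSD over noise PSD after filtering with w."""
    return p_x * abs(np.vdot(w, a)) ** 2 / np.real(np.vdot(w, R_n @ w))

etas = np.linspace(0.0, 1.0, 5)
snrs = np.array([snr_out(bmvdr_n(eta)) for eta in etas])
snr_in = p_x * abs(a[0]) ** 2 / np.real(R_n[0, 0])
```

For every $\eta$ the desired source is passed undistorted, while the output SNR falls monotonically from the BMVDR value at $\eta = 0$ to the input SNR at $\eta = 1$.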
## IV. BLCMV WITH PARTIAL NOISE ESTIMATION
Aiming at merging the advantages of the BLCMV and the BMVDR-N, i.e., preserving the binaural cues of the interfering source and controlling the binaural cues of the background noise, in Section IV-A we present the BLCMV beamformer with partial noise estimation (BLCMV-N). Similarly as for the BLCMV in [14], in Sections IV-B and IV-C we derive two decompositions for the BLCMV-N which reveal differences and similarities between the BLCMV-N and the BLCMV.
<!-- formula-not-decoded -->
## A. BLCMV-N Beamformer
Compared to the BMVDR in (28), the BLCMV-N uses an additional constraint to preserve a scaled version of the interfering source component in the reference microphone signals, like the BLCMV in (37), and aims at preserving a scaled version of the noise component in the reference microphone signals, like the BMVDR-N in (47). The constrained optimization problem for the left filter vector is given by
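$$\min_{\mathbf{w}_L} \; E\{ |\mathbf{w}_L^H \mathbf{n} - \eta\, n_L|^2 \} \quad \text{subject to} \quad \mathbf{w}_L^H \mathbf{a} = a_L, \quad \mathbf{w}_L^H \mathbf{b} = \delta\, b_L. \tag{54}$$
The solution of (54) is equal to (see Appendix A)
$$\mathbf{w}_{{\rm BLCMV\text{-}N},L} = \eta\, \mathbf{e}_L + (1-\eta)\, \mathbf{R}_n^{-1} \mathbf{C} \left( \mathbf{C}^H \mathbf{R}_n^{-1} \mathbf{C} \right)^{-1} [a_L^*, \; \bar{\delta}\, b_L^*]^T, \tag{55}$$
with $\mathbf{C}$ defined in (39) and the adjusted interference scaling parameter $\bar{\delta}$ equal to
$$\bar{\delta} = \frac{\delta - \eta}{1 - \eta}. \tag{56}$$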
<!-- formula-not-decoded -->
Hence, the output signals of the BLCMV-N can be interpreted as a mixture between the noisy reference microphone signals (scaled with $\eta$) and the output signals of a BLCMV (scaled with $1-\eta$) using the adjusted interference scaling parameter $\bar{\delta}$ in (56) instead of the interference scaling parameter $\delta$. For $\eta = 0$, the BLCMV-N is equal to the BLCMV in (38) with $\bar{\delta} = \delta$, whereas for $\eta = 1$ no beamforming is applied only if $\delta = 1$. Since mixing with the reference microphone signals affects not only the noise component but also the interfering source component, the adjusted interference scaling parameter $\bar{\delta}$ depends on both the interference scaling parameter $\delta$ and the mixing parameter $\eta$ due to the interference reduction constraint in (54). Figure 2 depicts $\bar{\delta}$ as a function of $\eta$ for different values of $\delta$. It can be seen from (56) that $\bar{\delta} \leq \delta$, that $\bar{\delta}$ decreases with increasing $\eta$ for $\delta < 1$, and that $\bar{\delta}$ becomes negative for $\eta > \delta$.
As will be shown in more detail in the following sections, using the parameters $\delta$ and $\eta$ it is possible to control the noise reduction performance, the interference reduction performance and the binaural cues of the background noise for the BLCMV-N.
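A minimal numerical sketch of the BLCMV-N is given below, assuming the mixture form with adjusted interference scaling parameter $(\delta - \eta)/(1 - \eta)$. To keep the expression finite at $\eta = 1$, the factor $1 - \eta$ is folded into the response vector, which is an implementation choice of this sketch rather than something prescribed by the paper; all data and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
M = 4                                # assumed total number of microphones

def crandn(*shape):
    """Complex Gaussian samples used as stand-ins for measured quantities."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

a, b = crandn(M), crandn(M)          # hypothetical ATF vectors
G = crandn(M, M)
R_n = G @ G.conj().T + 0.1 * np.eye(M)
e_L = np.zeros(M); e_L[0] = 1.0      # left reference microphone selector
C = np.column_stack([a, b])          # constraint matrix [a, b]

def lcmv(R_n, C, g):
    """w = R_n^{-1} C (C^H R_n^{-1} C)^{-1} g, which satisfies C^H w = g."""
    Rinv_C = np.linalg.solve(R_n, C)
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, g)

def adjusted_delta(delta, eta):
    """Adjusted interference scaling parameter (delta - eta) / (1 - eta), for eta < 1."""
    return (delta - eta) / (1 - eta)

def blcmv_n(delta, eta):
    """Left BLCMV-N filter; the response vector already includes the factor (1 - eta)."""
    g = np.array([(1 - eta) * np.conj(a[0]), (delta - eta) * np.conj(b[0])])
    return eta * e_L + lcmv(R_n, C, g)

delta, eta = 0.3, 0.6                # assumed parameter values
w = blcmv_n(delta, eta)
# Same filter written explicitly as a mixture with the adjusted parameter:
g_bar = np.array([np.conj(a[0]), adjusted_delta(delta, eta) * np.conj(b[0])])
w_mix = eta * e_L + (1 - eta) * lcmv(R_n, C, g_bar)
```

In this example $\eta > \delta$, so the adjusted parameter is negative (here $-0.75$), yet the overall filter still scales the interfering source by exactly $\delta$ and leaves the desired source undistorted.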
## B. Decomposition into two BLCMVs
In [14] it has been shown that the BLCMV in (38) can be decomposed as the sum of two sub-BLCMVs, i.e.,
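$$\mathbf{w}_{{\rm BLCMV},L} = \mathbf{w}_{x,L} + \delta\, \mathbf{w}_{u,L},$$
with
$$\mathbf{w}_{x,L} = \mathbf{R}_n^{-1} \mathbf{C} \left( \mathbf{C}^H \mathbf{R}_n^{-1} \mathbf{C} \right)^{-1} \mathbf{g}_{x,L}, \tag{59}$$
$$\mathbf{w}_{u,L} = \mathbf{R}_n^{-1} \mathbf{C} \left( \mathbf{C}^H \mathbf{R}_n^{-1} \mathbf{C} \right)^{-1} \mathbf{g}_{u,L}, \tag{60}$$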
Fig. 2. Adjusted interference scaling parameter $\bar{\delta}$ as a function of $\eta$ for different values of $\delta$.
<details>
<summary>Image 2 Details</summary>

Line plot of the adjusted interference scaling parameter $\bar{\delta}$ (vertical axis, from -1 to 1) versus the mixing parameter $\eta$ (horizontal axis, from 0 to 1), with one curve for each of $\delta = 0$, $0.25$, $0.5$ and $0.75$.
</details>
and the respective response vectors
$$\mathbf{g}_{x,L} = [a_L^*, \; 0]^T, \quad \mathbf{g}_{u,L} = [0, \; b_L^*]^T.$$
The sub-BLCMV $\mathbf{w}_{x,L}$ in (59) preserves the desired source component in the reference microphone signals and steers a null towards the interfering source, whereas the sub-BLCMV $\mathbf{w}_{u,L}$ in (60) preserves the interfering source component in the reference microphone signals and steers a null towards the desired source. Using (55), it can be easily seen that the proposed BLCMV-N can be decomposed as
$$\mathbf{w}_{{\rm BLCMV\text{-}N},L} = \eta\, \mathbf{e}_L + (1-\eta)\, \mathbf{w}_{x,L} + (\delta - \eta)\, \mathbf{w}_{u,L}. \tag{62}$$
Hence, the BLCMV-N can be interpreted as a mixture of the reference microphone signals (scaled with $\eta$), a BLCMV that preserves the desired source and rejects the interfering source (scaled with $1-\eta$) and a BLCMV that preserves the interfering source and rejects the desired source (scaled with $\delta - \eta$). Since the scaling of the sub-BLCMV $\mathbf{w}_{x,L}$ controls the desired source component without affecting the interfering source component and the scaling of the sub-BLCMV $\mathbf{w}_{u,L}$ controls the interfering source component without affecting the desired source component [14], it can be directly observed from the scaling factors in (62) that the desired source component is not distorted and the interfering source component is scaled with $\delta$.
## (scaled with η ) and the output signals of a BLCMV (scaled C. Decomposition using Binauralization Postfilters
and the
<!-- formula-not-decoded -->
with 1 -η ) using the adjusted interference scaling parameter ¯ δ in (58) instead of the interference scaling parameter δ . For η = 0 , the BLCMV-N is equal to the BLCMV in (39) with ¯ δ = δ , whereas for η = 1 , no beamforming is applied. Since mixing with the reference microphone signals not only affects the noise component but also the interfering source component In [14] it has also been shown that the sub-BLCMV w x,L in (59) for the left hearing device and the sub-BLCMV w x,R for the right hearing device (defined similarly as w x,L ) can be written using a common spatial filter and two binauralization postfilters as
¯
<!-- formula-not-decoded -->
to be satisfied, the adjusted interference scaling parameter δ depends on both the interference scaling parameter δ as well with the common desired BLCMV (D-BLCMV) given by
¯ δ ( η, δ ) = > 0 , for δ > η < 0 , for δ < η 0 , for δ = η . (59) and the ATFs a L and a R between the desired source and the reference microphones used as binauralization postfilters. Similarly, the sub-BLCMV w u,L in (60) and the sub-BLCMV w u,R (defined similarly as w u,L ) can be written as w u,L = w u b ∗ L , w u,R = w u b ∗ R , (65)
B. Deco
Simila reference
BLCMV
different
BLCMV
w
BLC
with the and their
The sub- and steer
desired s rejection
microph preserves
It can th
Hence, t that
the refer
interferin pres
BLCMV
combinat rejects
t
binaural partially
interferin
BLCMV
Section cues of t
C.
Filter
For a was sho
bines bot be writte
interferin
BLCMV
with the
desir desired s
binaurali
<details>
<summary>Image 3 Details</summary>

### Visual Description
## Block Diagram: Dual-Path Processing System with Parameter Adjustment
### Overview
The diagram illustrates a technical system with two parallel processing paths (Left and Right), each involving weighted inputs, parameter adjustments, and output aggregation. The system uses Greek symbols (η, δ) to represent adjustable parameters influencing the flow and combination of components.
### Components/Axes
- **Left Path (L):**
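- **Inputs:** `e_L` (selects the left reference microphone signal), `w_x` (common spatial filter of the D-BLCMV)
- **Parameters:** `a*_L`, `b*_L` (binauralization postfilters: conjugated ATFs of the desired and the interfering source at the left reference microphone)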
- **Operations:**
- `a*_L × (1 - η)`
- `b*_L × (δ - η)`
- Summation (`+`) to produce `z_L`
- **Right Path (R):**
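- **Inputs:** `e_R` (selects the right reference microphone signal), `w_u` (common spatial filter of the I-BLCMV)
- **Parameters:** `a*_R`, `b*_R` (binauralization postfilters: conjugated ATFs of the desired and the interfering source at the right reference microphone)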
- **Operations:**
- `a*_R × (1 - η)`
- `b*_R × (δ - η)`
- Summation (`+`) to produce `z_R`
- **Shared Elements:**
- Greek symbols: `η` (eta), `δ` (delta), `1 - η`, `δ - η` (parameter adjustments)
- Arrows indicate directional flow between components.
### Detailed Analysis
1. **Left Path Flow:**
- `e_L` and `w_x` feed into a junction.
- `a*_L` is scaled by `(1 - η)` and `b*_L` by `(δ - η)`.
- Results are summed to generate `z_L`.
2. **Right Path Flow:**
- `e_R` and `w_u` feed into a junction.
- `a*_R` is scaled by `(1 - η)` and `b*_R` by `(δ - η)`.
- Results are summed to generate `z_R`.
3. **Parameter Roles:**
- `η` and `δ` act as tunable weights, modulating the influence of `a*` and `b*` parameters in each path.
- `(1 - η)` and `(δ - η)` suggest dynamic balancing between components.
### Key Observations
- **Symmetry:** Both paths share identical structural logic but differ in input labels (`e_L` vs. `e_R`, `w_x` vs. `w_u`).
- **Parameter Dependency:** Outputs `z_L` and `z_R` are directly tied to the values of `η` and `δ`.
- **No Numerical Data:** The diagram lacks explicit numerical values, focusing instead on symbolic relationships.
### Interpretation
This diagram depicts the decomposition of the BLCMV-N into three contributions per side:
- **Reference microphone signals** (selected by `e_L` and `e_R`), scaled with `η`.
- **D-BLCMV output** (`w_x`), binauralized with `a*_L` and `a*_R` and scaled with `1 - η`.
- **I-BLCMV output** (`w_u`), binauralized with `b*_L` and `b*_R` and scaled with `δ - η`.
The outputs `z_L` and `z_R` are the sums of these scaled contributions. Here `η` is the mixing parameter and `δ` is the interference scaling parameter of the beamformer. The diagram is an architectural representation and contains no numerical values.
</details>
Fig. 3. Decomposition of the BLCMV-N into a mixture of the reference microphone signals and two BLCMVs with binauralization postfilters.
with the common interference BLCMV (I-BLCMV) given by
<!-- formula-not-decoded -->
and the ATFs $b_L$ and $b_R$ between the interfering source and the reference microphones used as binauralization postfilters.
Using (63) and (65) in (62), the BLCMV-N can be decomposed as
<!-- formula-not-decoded -->
<!-- formula-not-decoded -->
Figure 3 depicts this decomposition of the BLCMV-N using common spatial filters and binauralization postfilters. The output signals of the BLCMV-N can hence be interpreted as a mixture between the reference microphone signals (scaled with η), the binauralized output signals of the D-BLCMV (scaled with 1 - η) and the binauralized output signals of the I-BLCMV (scaled with δ - η).
Due to the constraints in (54), the BLCMV-N perfectly preserves the desired source component and scales the interfering source component with δ. Using (67) and (68), the noise component in the output signals of the BLCMV-N is equal to
<!-- formula-not-decoded -->
with $n_x = \mathbf{w}_x^H \mathbf{n}$ and $n_u = \mathbf{w}_u^H \mathbf{n}$ the noise components in the output signals of the D-BLCMV and the I-BLCMV, respectively. The noise component in the output signals of the BLCMV-N can hence be interpreted as a mixture between the noise component in the reference microphone signals (scaled with η), a coherent residual noise source ($n_x$) coming from the direction of the desired source (scaled with 1 - η) and a coherent residual noise source ($n_u$) coming from the direction of the interfering source (scaled with δ - η).
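As a numerical sanity check of the closed-form filter in (57) and of its decomposition into the reference microphone signal and the two sub-BLCMVs, the following NumPy sketch builds the left BLCMV-N filter for randomly drawn ATF vectors and a random noise covariance matrix. The random scenario, the function names `blcmv_n_left` and `sub_blcmvs_left`, and the choice of microphone index 0 as left reference are assumptions made for illustration only and are not taken from the experiments in this paper.

```python
import numpy as np

def blcmv_n_left(R_n, a, b, eta, delta, ref=0):
    """Left BLCMV-N filter following the closed form in (57), valid for eta < 1.

    R_n: (M, M) noise covariance matrix, a / b: ATF vectors of the desired and
    interfering source, ref: index of the left reference microphone (assumed 0).
    """
    M = len(a)
    e_L = np.zeros(M, dtype=complex)
    e_L[ref] = 1.0
    C = np.column_stack([a, b])                       # constraint matrix [a, b]
    delta_bar = (delta - eta) / (1.0 - eta)           # adjusted scaling, cf. (58)
    g = np.array([np.conj(a[ref]), delta_bar * np.conj(b[ref])])
    Rinv_C = np.linalg.solve(R_n, C)                  # R_n^{-1} C
    w_blcmv = Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, g)
    return eta * e_L + (1.0 - eta) * w_blcmv

def sub_blcmvs_left(R_n, a, b, ref=0):
    """Sub-BLCMVs: w_xL keeps the desired source and nulls the interferer,
    w_uL keeps the interferer and nulls the desired source."""
    C = np.column_stack([a, b])
    Rinv_C = np.linalg.solve(R_n, C)
    P = Rinv_C @ np.linalg.inv(C.conj().T @ Rinv_C)   # R_n^{-1} C (C^H R_n^{-1} C)^{-1}
    return P[:, 0] * np.conj(a[ref]), P[:, 1] * np.conj(b[ref])

# Hypothetical random scenario with M = 4 microphones.
rng = np.random.default_rng(0)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)
X = rng.standard_normal((M, 50)) + 1j * rng.standard_normal((M, 50))
R_n = X @ X.conj().T / 50 + 1e-3 * np.eye(M)          # Hermitian positive definite

eta, delta = 0.3, 0.5
w = blcmv_n_left(R_n, a, b, eta, delta)
w_xL, w_uL = sub_blcmvs_left(R_n, a, b)
e_L = np.eye(M)[0]

# Constraints: desired source preserved, interfering source scaled with delta.
constraints_ok = (np.allclose(np.vdot(w, a), a[0])
                  and np.allclose(np.vdot(w, b), delta * b[0]))
# Decomposition: eta * e_L + (1 - eta) * w_xL + (delta - eta) * w_uL.
decomposition_ok = np.allclose(
    w, eta * e_L + (1.0 - eta) * w_xL + (delta - eta) * w_uL)
print(constraints_ok, decomposition_ok)
```

Both checks rely only on the algebra of (57) and (58), so they should hold for any positive definite noise covariance matrix and any linearly independent pair of ATF vectors.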
## V. PERFORMANCE OF THE BLCMV-N
In this section we provide a performance analysis of the proposed BLCMV-N. In Section V-A we derive the output PSDs of the signal components. In Sections V-B and V-C we analyze the noise and interference reduction performance and the binaural cue preservation performance. Finally, in Section V-D we discuss the setting of the mixing parameter η and the interference scaling parameter δ.
## A. Output Power Spectral Densities
Due to the constraints in (54), the output PSDs of the desired and interfering source components in the left output signal of the BLCMV-N are equal to, cf. (13),
<!-- formula-not-decoded -->
<!-- formula-not-decoded -->
Furthermore, the output PSD of the noise component in the left output signal of the BLCMV-N is equal to (see Appendix B)
<!-- formula-not-decoded -->
with
<!-- formula-not-decoded -->
where $\gamma_a$ is defined in (30), $\gamma_{ab}$ is defined in (33), and $\gamma_b$ and $\Psi$ are defined in (42). It can be seen that the output PSD of the noise component for the BLCMV-N is a quadratic function in both the mixing parameter η and the interference scaling parameter δ. By comparing (74) to (41), it can be observed that
<!-- formula-not-decoded -->
where $\mathbf{R}_{xu,1}^{\delta=1}$ denotes the expression for the BLCMV in (41) with δ = 1, corresponding to no suppression of the interfering source. Note that for η = 0, $\mathbf{R}_{xu,3} = \mathbf{R}_{xu,1}$, and for η = 1 and δ = 1, $\mathbf{R}_{xu,3} = \mathbf{0}_M$. By using (75) in (73), it follows that
<!-- formula-not-decoded -->
## B. Noise and Interference Reduction Performance
By substituting (71) and (73) in (18), the left output SNR of the BLCMV-N is equal to
<!-- formula-not-decoded -->
which depends on both the mixing parameter η and the interference scaling parameter δ . Using (76) and realizing that the output PSD of the noise component in the left output signal of the BLCMV (for any value of δ ) is smaller than or equal to the PSD of the noise component in the left reference microphone signal, the output SNR of the BLCMV-N in (77) is smaller than or equal to the output SNR of the BLCMV in (40), i.e.,
<!-- formula-not-decoded -->
By substituting (71) and (72) in (20), the left output SIR of the BLCMV-N is equal to
<!-- formula-not-decoded -->
which is equal to the left output SIR of the BLCMV in (43) and solely controlled by the interference scaling parameter δ. For η = 0, the left output SNR of the BLCMV-N is equal to the left output SNR of the BLCMV in (40), while for η = 1 and δ = 1, the left output SNR of the BLCMV-N is equal to the left input SNR because no beamforming is applied.
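Because the constraints in (54) preserve the desired source component and scale the interfering source component with δ, the SIR improvement follows directly from δ: the interfering source PSD is scaled with δ², so the output SIR is the input SIR divided by δ². The short sketch below assumes that the SIR improvement is the ratio of output to input SIR expressed in dB and that δ is real-valued and positive; the function name `sir_improvement_db` is hypothetical.

```python
import math

def sir_improvement_db(delta):
    """SIR improvement in dB when the desired source is preserved and the
    interfering source is scaled with the real-valued factor delta > 0."""
    if delta <= 0:
        raise ValueError("delta must be positive for a finite SIR improvement")
    return 20.0 * math.log10(1.0 / delta)

for delta in (0.25, 0.477, 0.75, 1.0):
    print(f"delta = {delta}: {sir_improvement_db(delta):.2f} dB")
```

For δ = 0.25 this gives about 12 dB, independent of the mixing parameter η, and for δ = 1 the improvement is 0 dB because the interfering source is not suppressed.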
## C. Binaural Cue Preservation
Similarly as for the BLCMV, due to the constraints in (54) the BLCMV-N preserves the binaural cues of both the desired source and the interfering source, i.e.,
<!-- formula-not-decoded -->
<!-- formula-not-decoded -->
Using (26), the output IC of the noise component for the BLCMV-N is equal to (see Appendix B for derivation of components)
<!-- formula-not-decoded -->
with $\mathbf{R}_{xu,3}$ defined in (74). Since $\mathbf{R}_{xu,3}$ depends on both the mixing parameter η and the interference scaling parameter δ, the output IC of the noise component in (82) also depends on both parameters. Using (27), the output MSC of the noise component for the BLCMV-N is equal to
<!-- formula-not-decoded -->
Since for η = 0 the BLCMV-N is equal to the BLCMV, the output MSC of the noise component is smaller than 1, see Section III-B. It should however be realized that in contrast to the BMVDR-N discussed in Section III-C, for η = 1 the BLCMV-N does not always preserve the MSC of the noise component. Only for η = 1 and δ = 1 the binaural cues of all signal components are preserved because no beamforming is applied.
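To make the dependence of the output MSC on the filter pair concrete, the following sketch evaluates the MSC of the noise component at the output of a left and a right filter. It assumes the usual definitions, i.e., an interaural coherence of $\mathbf{w}_L^H \mathbf{R}_n \mathbf{w}_R / \sqrt{(\mathbf{w}_L^H \mathbf{R}_n \mathbf{w}_L)(\mathbf{w}_R^H \mathbf{R}_n \mathbf{w}_R)}$ and an MSC equal to its squared magnitude. The helper name `noise_msc`, the random noise covariance matrix and the reference microphone indices 0 and 2 are illustrative assumptions.

```python
import numpy as np

def noise_msc(R_n, w_L, w_R):
    """MSC of the noise component at the output of the filter pair (w_L, w_R)."""
    cross = np.vdot(w_L, R_n @ w_R)                   # w_L^H R_n w_R
    p_L = np.real(np.vdot(w_L, R_n @ w_L))            # left output noise PSD
    p_R = np.real(np.vdot(w_R, R_n @ w_R))            # right output noise PSD
    ic = cross / np.sqrt(p_L * p_R)                   # interaural coherence
    return np.abs(ic) ** 2

# Hypothetical noise covariance matrix estimated from random samples.
rng = np.random.default_rng(1)
M = 4
X = rng.standard_normal((M, 200)) + 1j * rng.standard_normal((M, 200))
R_n = X @ X.conj().T / 200

# No beamforming (eta = 1 and delta = 1): the filters select the reference
# microphones, so the output MSC equals the input MSC.
e_L, e_R = np.eye(M)[0], np.eye(M)[2]
msc_in = noise_msc(R_n, e_L, e_R)

# A common spatial filter followed by scalar postfilters gives coherent outputs.
w_common = rng.standard_normal(M) + 1j * rng.standard_normal(M)
msc_coherent = noise_msc(R_n, 0.8 * w_common, (0.3 - 0.5j) * w_common)
print(msc_in, msc_coherent)
```

The second case illustrates why a single common spatial filter with binauralization postfilters produces fully coherent residual noise, whereas mixing in the reference microphone signals moves the output MSC towards the input MSC.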
## D. Parameter Settings
Maximizing the left output SNR in (77) corresponds to minimizing the denominator, i.e., using (75),
<!-- formula-not-decoded -->
Setting the derivative of (84) with respect to the mixing parameter η equal to zero yields
<!-- formula-not-decoded -->
as the optimal mixing parameter η in terms of left (and right) output SNR. The derivative of (84) with respect to the interference scaling parameter δ is equal to, using (41),
<!-- formula-not-decoded -->
Setting (86) to zero and solving for δ yields the optimal interference scaling parameter in terms of left output SNR, i.e.,
<!-- formula-not-decoded -->
with
<!-- formula-not-decoded -->
As can be seen from (79), the output SIR is not affected by the mixing parameter η but is solely determined by the interference scaling parameter δ.
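Since the desired source component is preserved, maximizing the left output SNR with respect to δ amounts to minimizing the left output noise PSD, which is quadratic in δ. The sketch below does this numerically using the decomposition into the reference microphone signal and the two sub-BLCMVs rather than the closed-form expression of the paper, which is not reproduced here. It assumes a real-valued δ; the function names, the random scenario and the reference microphone index are hypothetical.

```python
import numpy as np

def sub_blcmvs(R_n, a, b, ref=0):
    """Sub-BLCMVs for reference microphone `ref`: w_x keeps the desired source
    and nulls the interferer, w_u does the opposite."""
    C = np.column_stack([a, b])
    Rinv_C = np.linalg.solve(R_n, C)
    P = Rinv_C @ np.linalg.inv(C.conj().T @ Rinv_C)
    return P[:, 0] * np.conj(a[ref]), P[:, 1] * np.conj(b[ref])

def noise_psd(R_n, a, b, eta, delta, ref=0):
    """Output noise PSD w^H R_n w of the BLCMV-N built from its decomposition."""
    w_x, w_u = sub_blcmvs(R_n, a, b, ref)
    e = np.eye(len(a))[ref]
    w = eta * e + (1.0 - eta) * w_x + (delta - eta) * w_u
    return np.real(np.vdot(w, R_n @ w))

def optimal_delta(R_n, a, b, eta, ref=0):
    """Real-valued delta minimizing the quadratic noise PSD for a fixed eta."""
    w_x, w_u = sub_blcmvs(R_n, a, b, ref)
    e = np.eye(len(a))[ref]
    c = eta * e + (1.0 - eta) * w_x                   # part of w independent of delta
    cross = np.real(np.vdot(w_u, R_n @ c))            # Re{w_u^H R_n c}
    power = np.real(np.vdot(w_u, R_n @ w_u))          # w_u^H R_n w_u
    return eta - cross / power

# Hypothetical random scenario with M = 4 microphones.
rng = np.random.default_rng(3)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)
X = rng.standard_normal((M, 100)) + 1j * rng.standard_normal((M, 100))
R_n = X @ X.conj().T / 100

d_opt_02 = optimal_delta(R_n, a, b, eta=0.2)
d_opt_07 = optimal_delta(R_n, a, b, eta=0.7)

# Brute-force check on a grid around the analytic minimizer (eta = 0.2).
grid = np.linspace(d_opt_02 - 1.0, d_opt_02 + 1.0, 2001)
psd_grid = [noise_psd(R_n, a, b, 0.2, d) for d in grid]
d_grid = grid[int(np.argmin(psd_grid))]
print(d_opt_02, d_opt_07, d_grid)
```

In this sketch the minimizer turns out to be the same for both values of η, which is consistent with a single optimal δ being reported for the considered scenario. For a random scenario the unconstrained minimizer is not guaranteed to lie between 0 and 1.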
## VI. SIMULATIONS
In Section VI-A we first validate the expressions derived in the previous sections using measured anechoic ATFs. In Section VI-B we then experimentally compare the performance of the proposed BLCMV-N with the BMVDR, BLCMV and BMVDR-N using recorded signals in a reverberant environment with a competing speaker and multi-talker babble noise. Finally, in Section VI-C we compare the spatial impression of the considered binaural beamforming algorithms using a perceptual listening test.
## A. Validation Using Measured Anechoic ATFs
To validate the derived expressions for the considered algorithms we used measured anechoic ATFs of two behind-the-ear hearing aids mounted on a head-and-torso-simulator (HATS) [24]. Each hearing aid has two microphones (M = 4) with an inter-microphone distance of about 14 mm. We chose the front microphone on each hearing aid as reference microphone. The ATFs were calculated from anechoic impulse responses using a 512-point FFT at a sampling rate of 16 kHz.
The desired source was placed at 0° (in front) and the interfering source was placed at -35° (to the left), both at a distance of 3 m from the HATS. The desired source covariance matrix $\mathbf{R}_x$ and the interfering source covariance matrix $\mathbf{R}_u$ were constructed using the ATF vector of the desired source $\mathbf{a}$ and the ATF vector of the interfering source $\mathbf{b}$ according to (11), where the PSD of the desired source $p_x$ and the PSD of the interfering source $p_u$ were both set to 1. As background noise we considered a combination of spatially white and cylindrically isotropic noise, i.e., the noise covariance matrix $\mathbf{R}_n$ was constructed as
<!-- formula-not-decoded -->
with $p_{n,\mathrm{w}}$ the PSD of the spatially white noise, $\mathbf{I}_M$ the M × M-dimensional identity matrix, $p_{n,\mathrm{cyl}}$ the PSD of the cylindrically isotropic noise and $\boldsymbol{\Gamma}$ its spatial coherence matrix. The (i, j)-th element of the spatial coherence matrix $\boldsymbol{\Gamma}$ was calculated using all available anechoic ATFs as
<!-- formula-not-decoded -->
with $h(\theta_k)$ the anechoic ATF at angle $\theta_k$ and K the total number of angles in the database (K = 72 for [24]). The PSD of the spatially white noise $p_{n,\mathrm{w}}$ was set to -55 dB, while the PSD of the cylindrically isotropic noise $p_{n,\mathrm{cyl}}$ was set to 1.
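The construction of the noise covariance matrix can be sketched as follows. The exact expressions are not reproduced here, so the sketch assumes that the coherence between microphones i and j is the sum over all angles of $h_i(\theta_k) h_j^*(\theta_k)$, normalized by the square root of the product of the corresponding summed powers, and that the noise covariance matrix is the sum of the scaled coherence matrix and a scaled identity matrix. The random array `H` is only a stand-in for the measured ATFs of [24], and both function names are hypothetical.

```python
import numpy as np

def coherence_from_atfs(H):
    """Spatial coherence matrix from ATFs H of shape (K, M):
    K angles, M microphones, one frequency bin."""
    cross = H.T @ H.conj()                            # sum_k h_i(theta_k) h_j^*(theta_k)
    power = np.real(np.diag(cross))                   # sum_k |h_i(theta_k)|^2
    return cross / np.sqrt(np.outer(power, power))

def noise_covariance(H, p_cyl=1.0, p_white_db=-55.0):
    """Cylindrically isotropic noise plus spatially white noise."""
    M = H.shape[1]
    p_white = 10.0 ** (p_white_db / 10.0)             # -55 dB as a linear PSD
    return p_cyl * coherence_from_atfs(H) + p_white * np.eye(M)

# Stand-in ATFs: K = 72 angles, M = 4 microphones (random, not measured data).
rng = np.random.default_rng(2)
K, M = 72, 4
H = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))

Gamma = coherence_from_atfs(H)
R_n = noise_covariance(H)
print(np.round(np.abs(Gamma), 2))
```

The small spatially white term keeps the noise covariance matrix well conditioned, which matters because the beamformers invert it.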
Fig. 4. SNR improvement for the BLCMV-N and the BMVDR-N at 500 Hz.
<details>
<summary>Image 4 Details</summary>

### Visual Description
## Line Graph: ΔSNR_L vs. η with Varying δ Values and BMVDR-N Baseline
### Overview
The graph depicts the relationship between the change in signal-to-noise ratio (ΔSNR_L, in dB) and the parameter η (ranging from 0 to 1). It compares four curves representing different δ values (0.01, 0.25, 0.477, 0.75) and a dashed reference line labeled "BMVDR-N." All lines exhibit a downward trend, with BMVDR-N showing the steepest decline.
### Components/Axes
- **X-axis (η)**: Labeled as η, scaled from 0 to 1 in increments of 0.25.
- **Y-axis (ΔSNR_L [dB])**: Labeled as ΔSNR_L, scaled from 0 to 10 in increments of 2.5.
- **Legend**: Located in the top-right corner, with color-coded lines:
- Blue: δ = 0.01
- Red: δ = 0.25
- Yellow: δ = 0.477
- Purple: δ = 0.75
- Dashed black: BMVDR-N
### Detailed Analysis
1. **BMVDR-N (Dashed Black Line)**:
- Starts at ~8 dB at η = 0.
- Declines linearly to ~1 dB at η = 1.
- Steepest slope among all lines.
2. **δ = 0.01 (Blue Line)**:
- Starts at ~3 dB at η = 0.
- Gradually decreases to ~0.5 dB at η = 1.
- Flattest slope, indicating minimal sensitivity to η.
3. **δ = 0.25 (Red Line)**:
- Starts at ~5 dB at η = 0.
- Decreases to ~2 dB at η = 1.
- Moderate slope, steeper than δ = 0.01 but less than BMVDR-N.
4. **δ = 0.477 (Yellow Line)**:
- Starts at ~5.5 dB at η = 0.
- Declines to ~1.5 dB at η = 1.
- Slope similar to δ = 0.25 but slightly higher initial value.
5. **δ = 0.75 (Purple Line)**:
- Starts at ~5 dB at η = 0.
- Decreases to ~1 dB at η = 1.
- Slope nearly identical to δ = 0.25 and 0.477.
### Key Observations
- **BMVDR-N Baseline**: Dominates at η = 0 but degrades rapidly, suggesting poor performance at higher η values.
- **δ Sensitivity**: Higher δ values (0.477, 0.75) maintain higher ΔSNR_L initially but converge with lower δ values (0.01, 0.25) as η increases.
- **Crossing Point**: The red (δ = 0.25) and purple (δ = 0.75) lines intersect near η = 0.5, indicating similar performance beyond this point.
- **Stability vs. Performance**: Lower δ values (0.01, 0.25) show greater stability (flatter curves) but lower initial SNR compared to higher δ values.
### Interpretation
The graph illustrates a trade-off between initial SNR and sensitivity to η. BMVDR-N, while optimal at η = 0, becomes less effective as η increases, outperformed by higher δ values. Conversely, lower δ values (e.g., 0.01) exhibit robustness across η but sacrifice initial SNR. This suggests that δ tuning is critical for balancing performance and adaptability in systems where η varies. The convergence of higher δ lines at η = 1 implies diminishing returns for δ > 0.477 in extreme η conditions.
</details>
Fig. 5. SIR improvement for the BLCMV-N and the BMVDR-N at 500 Hz.
<details>
<summary>Image 5 Details</summary>

### Visual Description
## Line Graph: ΔSIR_L vs. η with δ Parameters and BMVDR-N Performance
### Overview
The image is a line graph comparing the left signal-to-interference ratio improvement (ΔSIR_L) in decibels (dB) across different values of η (mixing parameter, 0 to 1). Four data series are plotted: three horizontal lines representing constant δ values (0.25, 0.477, 0.75) and one dashed line labeled "BMVDR-N" showing a declining trend.
### Components/Axes
- **Y-axis**: ΔSIR_L [dB], scaled from 0 to 15 dB in increments of 5.
- **X-axis**: η (normalized parameter), scaled from 0 to 1 in increments of 0.25.
- **Legend**: Located in the top-right corner, with four entries:
- Blue line: δ = 0.25
- Red line: δ = 0.477
- Orange line: δ = 0.75
- Black dashed line: BMVDR-N
### Detailed Analysis
1. **δ = 0.25 (Blue Line)**:
- Constant at ~12 dB across all η values.
- Positioned at the top of the graph, spatially grounded in the upper-middle region.
2. **δ = 0.477 (Red Line)**:
- Constant at ~6 dB across all η values.
- Positioned midway between δ = 0.25 and δ = 0.75.
3. **δ = 0.75 (Orange Line)**:
- Constant at ~3 dB across all η values.
- Positioned near the bottom of the graph.
4. **BMVDR-N (Black Dashed Line)**:
- Starts at ~4 dB when η = 0.
- Declines linearly to ~0.5 dB at η = 1.
- Spatial grounding: Originates near δ = 0.477 and slopes downward, crossing δ = 0.75 near η = 0.75.
### Key Observations
- **Constant δ Lines**: All δ values (0.25, 0.477, 0.75) maintain fixed ΔSIR_L regardless of η, indicating no dependency on η for these parameters.
- **BMVDR-N Decline**: The BMVDR-N line shows a consistent downward trend, suggesting a negative correlation between η and ΔSIR_L for this metric.
- **Spatial Relationships**:
- δ = 0.25 (12 dB) > δ = 0.477 (6 dB) > δ = 0.75 (3 dB) > BMVDR-N (0–4 dB).
- BMVDR-N intersects δ = 0.75 at η ≈ 0.75.
### Interpretation
The graph demonstrates two distinct behaviors:
1. **δ-Controlled ΔSIR_L**: The horizontal lines show that ΔSIR_L does not depend on η for a fixed δ. Smaller δ values yield a larger ΔSIR_L (δ = 0.25 at ~12 dB), while larger δ values yield a smaller ΔSIR_L (δ = 0.75 at ~3 dB).
2. **BMVDR-N Trade-off**: The declining trend of BMVDR-N shows that its ΔSIR_L decreases as η increases.
For the δ series, ΔSIR_L is set by δ alone, whereas for BMVDR-N it depends on η. The BMVDR-N curve crosses the δ = 0.75 line near η ≈ 0.75.
</details>
1) Noise and Interference Reduction Performance: Using (17) and (18), Figure 4 depicts the left SNR improvement at 500 Hz for the BLCMV-N for different values of the mixing parameter η and the interference scaling parameter δ, and for the BMVDR-N for different values of the mixing parameter η. As expected, the BMVDR (i.e., BMVDR-N for η = 0) yields the largest SNR improvement (cf. (78)). Since the BMVDR-N mixes the output signals of the BMVDR with the noisy reference microphone signals, it can be observed that increasing the mixing parameter η reduces the SNR improvement of the BMVDR-N compared to the BMVDR (η = 0). For the BLCMV-N, both η and δ affect the SNR improvement, which is in line with (77). Similarly to the BMVDR-N, the BLCMV-N mixes the output signals of a BLCMV with the noisy reference microphone signals. Hence, it can be observed that for any value of the interference scaling parameter δ, increasing the mixing parameter η reduces the SNR improvement of the BLCMV-N compared to the BLCMV (η = 0), which is in line with (78). Since fewer degrees of freedom are available for noise reduction, the BLCMV (η = 0) yields a smaller SNR improvement than the BMVDR (η = 0), as discussed in Section III-B. Using (87), the interference scaling parameter δ maximizing the output SNR was equal to $\delta_{\mathrm{opt},L} = 0.477$ for the considered acoustic scenario. As expected, it can be observed that using $\delta_{\mathrm{opt},L}$ leads to the largest SNR improvement of all considered values of δ. For large values of the mixing parameter η, the BLCMV-N yields a larger SNR improvement than the BMVDR-N. It should be noted that the exact behaviour depends on the interference scaling parameter δ and the relative position of the interfering source to the desired source.
Using (19) and (20), Figure 5 depicts the left SIR improvement at 500 Hz for the BLCMV-N for different values of the mixing parameter η and the interference scaling parameter δ, and for the BMVDR-N for different values of the mixing parameter η. As expected from (43) and (79), both the BLCMV-N and the BLCMV (η = 0) yield the same SIR improvement, which is solely controlled by the interference scaling parameter δ. Hence, increasing the interference scaling parameter δ reduces the SIR improvement for both the BLCMV-N and the BLCMV. For the BMVDR-N it can be observed that increasing the mixing parameter η reduces the SIR improvement. It should be noted that the exact behaviour depends on the relative position of the interfering source to the desired source, as can be seen from (50) and (51).
Fig. 6. The MSC of the noise component in the reference microphone signals (Input), in the output signals of the BLCMV for different values of the interference scaling parameter δ, the BMVDR-N for different values of the mixing parameter η and the BLCMV-N for different values of the mixing parameter η and the interference scaling parameter δ.
<details>
<summary>Image 6 Details</summary>

### Visual Description
## Line Graphs: Comparative Analysis of MSC Across Parameters
### Overview
The image contains six line graphs arranged in two columns and three rows, comparing the magnitude squared coherence (MSC) of the noise component across frequency (kHz) for different parameter configurations. Each graph includes legends with η values (0, 0.5, 1) and varying δ parameters (0.01, 0.477, 1). The graphs are labeled as follows:
- **Top Row**: "Input" (left), "BLCMV-N, δ = 0.01" (right)
- **Middle Row**: "BLCMV" (left), "BLCMV-N, δ = 0.477" (right)
- **Bottom Row**: "BMVDR-N" (left), "BLCMV-N, δ = 1" (right)
### Components/Axes
- **X-Axis**: Labeled "kHz" with a range of 0–4 kHz.
- **Y-Axis**: Labeled "MSC" with a range of 0–1.
- **Legends**: Positioned in the top-right corner of each graph, showing η values (0 = blue, 0.5 = red, 1 = yellow). All graphs include these three η values, though some graphs (e.g., "Input") only display one line (η = 0).
- **Graph Titles**: Located at the top of each graph, specifying the model and δ parameter (e.g., "BLCMV-N, δ = 0.01").
### Detailed Analysis
#### Input Graph (Top-Left)
- **Lines**: Single blue line (η = 0).
- **Trend**: Sharp drop from MSC = 1 to ~0.1 between 0–1 kHz, then stabilizes near 0.
- **Key Points**:
- At 0 kHz: MSC ≈ 1.0
- At 1 kHz: MSC ≈ 0.1
- At 4 kHz: MSC ≈ 0.0
#### BLCMV-N, δ = 0.01 (Top-Right)
- **Lines**:
- Blue (η = 0): Sharp drop to ~0.1 by 1 kHz, then stabilizes.
- Red (η = 0.5): Peaks at ~0.8 MSC around 1.5 kHz, then declines.
- Yellow (η = 1): Peaks at ~0.6 MSC around 2.5 kHz, then declines.
- **Key Points**:
- η = 0.5 peak: 1.5 kHz, MSC ≈ 0.8
- η = 1 peak: 2.5 kHz, MSC ≈ 0.6
#### BLCMV (Middle-Left)
- **Lines**:
- Blue (η = 0): Sharp drop to ~0.1 by 1 kHz, then stabilizes.
- Red (η = 0.5): Oscillates between 0.3–0.7 MSC up to 3 kHz.
- Yellow (η = 1): Oscillates between 0.1–0.5 MSC up to 4 kHz.
- **Key Points**:
- η = 0.5: Peaks at ~0.7 MSC around 2 kHz.
- η = 1: Peaks at ~0.5 MSC around 3 kHz.
#### BLCMV-N, δ = 0.477 (Middle-Right)
- **Lines**:
- Blue (η = 0): Sharp drop to ~0.1 by 1 kHz, then stabilizes.
- Red (η = 0.5): Oscillates between 0.4–0.8 MSC up to 3 kHz.
- Yellow (η = 1): Oscillates between 0.2–0.6 MSC up to 4 kHz.
- **Key Points**:
- η = 0.5: Peaks at ~0.8 MSC around 2.5 kHz.
- η = 1: Peaks at ~0.6 MSC around 3.5 kHz.
#### BMVDR-N (Bottom-Left)
- **Lines**:
- Blue (η = 0): Sharp drop to ~0.1 by 1 kHz, then stabilizes.
- Red (η = 0.5): Peaks at ~0.7 MSC around 1.5 kHz, then declines.
- Yellow (η = 1): Peaks at ~0.5 MSC around 2.5 kHz, then declines.
- **Key Points**:
- η = 0.5 peak: 1.5 kHz, MSC ≈ 0.7
- η = 1 peak: 2.5 kHz, MSC ≈ 0.5
#### BLCMV-N, δ = 1 (Bottom-Right)
- **Lines**:
- Blue (η = 0): Sharp drop to ~0.1 by 1 kHz, then stabilizes.
- Red (η = 0.5): Sharp drop to ~0.3 by 1.5 kHz, then oscillates between 0.2–0.4.
- Yellow (η = 1): Sharp drop to ~0.2 by 1.5 kHz, then oscillates between 0.1–0.3.
- **Key Points**:
- η = 0.5: Peaks at ~0.3 MSC around 2 kHz.
- η = 1: Peaks at ~0.3 MSC around 3 kHz.
### Key Observations
1. **Input Baseline**: The "Input" graph shows a universal sharp drop in MSC, serving as a reference for unmodified signals.
2. **δ Parameter Impact**: Higher δ values (e.g., δ = 1) result in sharper MSC drops and reduced oscillations compared to lower δ values (e.g., δ = 0.01).
3. **η Parameter Impact**:
- η = 0.5 and η = 1 introduce frequency-dependent oscillations, with peaks shifting to higher frequencies as η increases.
- η = 0.5 consistently shows higher MSC peaks than η = 1 across most graphs.
4. **Model-Specific Behavior**:
- **BLCMV-N**: δ modulates the sharpness of the MSC drop and the amplitude of oscillations.
- **BLCMV**: η introduces sustained oscillations without δ influence.
- **BMVDR-N**: η = 0.5 and η = 1 exhibit similar peak frequencies but lower amplitudes compared to BLCMV-N variants.
### Interpretation
The data demonstrates that:
- **δ** acts as a threshold parameter, controlling the initial MSC drop and the persistence of oscillations. Higher δ values suppress oscillations more aggressively.
- **η** modulates the frequency response, with higher η values shifting peak MSC to higher frequencies and reducing peak amplitudes.
- The "Input" graph establishes a baseline MSC profile, while parameterized models (BLCMV-N, BLCMV, BMVDR-N) show how η and δ jointly shape the MSC across frequencies. This suggests δ fine-tunes the system's sensitivity, while η adjusts the frequency-dependent behavior.
- Outliers (e.g., sharp peaks in BLCMV-N, δ = 0.01) indicate resonant frequencies where η amplifies the MSC, potentially highlighting critical operational thresholds.
</details>
2) Binaural Cue Preservation of Background Noise: For different frequencies, Figure 6 depicts the input MSC in (27) of the noise component (Input) and the output MSC in (27) of the noise component for the BLCMV in (46) for different values of the interference scaling parameter δ, the BMVDR-N in (53) for different values of the mixing parameter η and the BLCMV-N for different values of the mixing parameter η and the interference scaling parameter δ. Although the BLCMV is not designed to preserve the MSC of the noise component, it can be observed that an output MSC smaller than 1 is obtained, especially for large values of δ [14]. However, since the output MSC of the noise component depends on the relative position of the interfering source to the desired source, it cannot be easily controlled. Since the BMVDR-N mixes the output signals of the BMVDR with the noisy reference microphone signals, it can be observed that the output MSC of the noise component is smaller than 1, and for η = 1 the MSC is perfectly preserved (but no beamforming is applied). For the BLCMV-N, it can be observed that both η and δ influence the output MSC of the noise component, as discussed in Section V-C. For η = 0, the output MSC of the noise component for the BLCMV-N is obviously equal to the output MSC of the noise component for the BLCMV. For a fixed value of δ, it can be observed that the output MSC of the noise component approaches the input MSC of the noise component for increasing η, although it should be realized that perfect preservation of the MSC of the noise component is only possible for δ = 1 (cf. Section V-C).
Fig. 7. Frequency-averaged MSC error of the noise component for the BLCMV-N and the BMVDR-N.
<details>
<summary>Image 7 Details</summary>

### Visual Description
## Line Graph: ΔAMSC vs. η with Varying δ Parameters
### Overview
The image is a line graph comparing the frequency-averaged MSC error of the noise component, labeled ΔAMSC (y-axis), across different values of η (x-axis, ranging from 0 to 1). Four solid-colored lines represent different δ parameter values (0.25, 0.477, 0.75, 1), while a dashed black line labeled "BMVDR-N" serves as a reference. All lines exhibit downward trends, but with distinct starting points and slopes.
### Components/Axes
- **X-axis (η)**: Labeled "η", scaled from 0 to 1 in increments of 0.25.
- **Y-axis (ΔAMSC)**: Labeled "ΔAMSC", scaled from 0 to 1 in increments of 0.25.
- **Legend**: Located in the top-right corner, with:
- Solid blue line: δ = 0.25
- Solid red line: δ = 0.477
- Solid orange line: δ = 0.75
- Solid purple line: δ = 1
- Dashed black line: BMVDR-N
### Detailed Analysis
1. **δ = 0.25 (Blue Line)**:
- Starts at ~0.85 ΔAMSC at η = 0.
- Declines gradually, ending near ~0.2 at η = 1.
- Highest ΔAMSC among the δ series.
2. **δ = 0.477 (Red Line)**:
- Starts at ~0.6 ΔAMSC at η = 0.
- Declines steadily, ending near ~0.15 at η = 1.
- Second-highest error among the δ series.
3. **δ = 0.75 (Orange Line)**:
- Starts at ~0.3 ΔAMSC at η = 0.
- Declines moderately, ending near ~0.05 at η = 1.
- Third-highest error among the δ series.
4. **δ = 1 (Purple Line)**:
- Starts at ~0.1 ΔAMSC at η = 0.
- Declines slightly, ending near ~0.02 at η = 1.
- Lowest error among the δ series.
5. **BMVDR-N (Dashed Black Line)**:
- Starts at ~1.0 ΔAMSC at η = 0.
- Declines sharply, crossing below all δ lines by η = 0.5.
- Ends near ~0.05 at η = 1.
### Key Observations
- **BMVDR-N at Low η**: BMVDR-N has the largest error at η = 0 (the BMVDR), and its error decreases rapidly as η increases.
- **δ Parameter Trade-off**: Higher δ values (e.g., δ = 1) start with a lower ΔAMSC and change less with η than lower δ values (e.g., δ = 0.25).
- **Convergence at High η**: All lines converge near ΔAMSC = 0.05 by η = 1.
### Interpretation
Since ΔAMSC is an error measure, lower values indicate better preservation of the MSC of the noise component. Increasing η reduces the error for all configurations. For small δ values the effect of η is large, whereas for large δ values the error is already small at η = 0 and changes little with η. The BMVDR-N starts with the largest error at η = 0 and shows the strongest reduction as η increases.
</details>
For several values of the mixing parameter η, Figure 7 depicts the MSC error of the noise component for the BLCMV-N and the BMVDR-N, averaged over all frequencies, i.e.,
<!-- formula-not-decoded -->
with f the frequency bin index and F the total number of frequency bins. As expected, the BMVDR (η = 0) yields the largest MSC error of the noise component and increasing the mixing parameter η reduces the frequency-averaged MSC error of the noise component for the BMVDR-N [16]. For the considered acoustic scenario, it can be observed for the BLCMV-N that for any value of the interference scaling parameter δ, increasing the mixing parameter η reduces the frequency-averaged MSC error of the noise component compared to the BLCMV (η = 0). Further, it can be observed that for small values of the interference scaling parameter δ, the effect of the mixing parameter η is larger than for large values of the interference scaling parameter δ, for which the frequency-averaged MSC error is relatively small for all values of the mixing parameter η. These results clearly show that the mixing parameter η in the BLCMV-N makes it possible to control the binaural cues of the background noise.
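The frequency-averaged MSC error can be sketched in a few lines. The exact form of the error measure is not reproduced here, so the sketch assumes that it is the mean absolute difference between the input and output MSC of the noise component over all F frequency bins. The function name `msc_error` and the example values are hypothetical.

```python
def msc_error(msc_in, msc_out):
    """Frequency-averaged absolute MSC error between input and output noise.

    msc_in / msc_out: sequences with one MSC value per frequency bin.
    """
    if len(msc_in) != len(msc_out) or len(msc_in) == 0:
        raise ValueError("both inputs need the same, non-zero number of bins")
    total = sum(abs(m_in - m_out) for m_in, m_out in zip(msc_in, msc_out))
    return total / len(msc_in)

# Hypothetical input MSC that decays with frequency.
msc_input = [1.0, 0.5, 0.1, 0.0]
# Fully coherent output (MSC of 1 in every bin) versus perfect preservation.
error_coherent = msc_error(msc_input, [1.0, 1.0, 1.0, 1.0])
error_preserved = msc_error(msc_input, msc_input)
print(error_coherent, error_preserved)
```

Under this assumption a fully coherent output gives a large error whenever the input noise is diffuse, while an output that reproduces the input MSC gives an error of zero.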
## B. Experimental Results Using Reverberant Recordings
For a more realistic evaluation, we compare the performance of the considered binaural beamforming algorithms using reverberant recordings. Similarly to Section VI-A, the experimental setup consists of two hearing aids, each with two microphones, mounted on a HATS in a cafeteria with a reverberation time of approximately 1.25 s [24]. The desired source was again placed at 0° (at a distance of about 102 cm), while the interfering source was again placed at -35° (at a distance of about 118 cm), see [24] for more details. The desired and interfering source components were generated by convolving clean speech signals with the measured reverberant room impulse responses corresponding to the desired source and interfering source positions. The desired source was a male German speaker, speaking eight sentences with a pause of 1 s between the sentences. The interfering source was a male Dutch speaker, speaking seven sentences with a pause of 0.25 s between the sentences. As background noise we used realistic recordings [24], consisting of multi-talker babble noise, clacking plates and temporally dominant competing speakers. The used background noise hence clearly differed from the perfectly diffuse noise in Section VI-A. The entire signal had a length of about 28 s. The desired source and the background noise were active the entire time, whereas the interfering source only became active after about 14 s. The desired source component, the interfering source component and the noise component were mixed at an input SNR of 10 dB and input SIR of 5 dB in the right reference microphone. Again, we chose the front microphone on each hearing aid as reference microphone.
As objective performance measures for noise and interference reduction performance, we used the left and the right SNR improvement (∆SNR L, ∆SNR R) and the left and the right SIR improvement (∆SIR L, ∆SIR R). As objective performance measure for binaural cue preservation of the background noise we used the frequency-averaged MSC error of the noise component (∆MSC) as defined in (91). All objective performance measures were computed using the reference microphone signals and the output signals of all considered algorithms. Table I presents the objective performance measures for all considered algorithms.
The processing was performed at a sampling rate of 16 kHz in the STFT domain with a frame length of 8192 samples and a square-root Hann window with 50% overlap. We used an oracle voice activity detector (i.e., using the desired source and interfering source signals) to estimate the noise covariance matrix R n, the undesired covariance matrix R v (interfering source plus background noise) and R xn = R x + R n (desired source plus background noise) over the entire signal. All binaural beamforming algorithms were implemented using relative transfer function (RTF) vectors [25], relating the ATF vectors in (4) to the reference microphones. Using the covariance whitening method (see [14], [26] for further details), the RTF vectors of the desired source and the interfering source were estimated based on the generalised eigenvalue decomposition of R xn and R n or R v and R n, respectively. The mixing parameter was set to η = 0.3 and the interference scaling parameter was set to δ = 0.3.
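The covariance whitening step mentioned above can be illustrated for a single frequency bin. The following is a minimal sketch under a rank-1 source model, not the authors' implementation: it whitens with a Cholesky factor of the noise covariance matrix, which yields the same principal direction as the generalised eigenvalue decomposition of the two matrices. The function name `estimate_rtf_cw` and the toy data are hypothetical.

```python
import numpy as np

def estimate_rtf_cw(R_mix, R_noise, ref=0):
    """Covariance-whitening RTF estimate for one frequency bin.

    R_mix   : covariance of source plus noise (M x M, Hermitian)
    R_noise : covariance of the noise alone (M x M, Hermitian, positive definite)
    ref     : index of the reference microphone
    """
    L = np.linalg.cholesky(R_noise)                # R_noise = L L^H
    L_inv = np.linalg.inv(L)
    R_white = L_inv @ R_mix @ L_inv.conj().T       # whitened covariance
    R_white = 0.5 * (R_white + R_white.conj().T)   # enforce Hermitian symmetry
    _, eigvecs = np.linalg.eigh(R_white)           # eigenvalues in ascending order
    v_max = eigvecs[:, -1]                         # principal eigenvector
    a = L @ v_max                                  # de-whiten: proportional to the ATF vector
    return a / a[ref]                              # normalise to the reference microphone

# Toy check with a known transfer function vector and exact covariance matrices.
rng = np.random.default_rng(1)
M = 4
a_true = rng.standard_normal(M) + 1j * rng.standard_normal(M)
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_n_demo = B @ B.conj().T + np.eye(M)
R_xn_demo = 2.0 * np.outer(a_true, a_true.conj()) + R_n_demo

rtf_est = estimate_rtf_cw(R_xn_demo, R_n_demo, ref=0)
rtf_true = a_true / a_true[0]
```

With exact covariance matrices the estimate matches the true RTF vector; with covariance matrices estimated from data, as in the experiments, estimation errors are expected.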
1) Noise and Interference Reduction Performance: In terms of noise reduction performance, it can be observed that, as expected, the BMVDR yields the highest SNR improvement (13.0 dB for the left and 12.9 dB for the right side). All other algorithms yield a lower SNR improvement, for the BLCMV due to the additional constraint for the interfering source, for
Fig. 8. Boxplot of the MUSHRA scores for all three evaluations. The plot depicts the median score (red line), the mean score (red dot), the first and third quartiles (blue boxes) and the interquartile ranges (whiskers). Outliers are indicated by red + markers.
<details>
<summary>Image 8 Details</summary>

### Visual Description
## Box Plots: Method Performance Across Acoustic Scenarios
### Overview
The image contains three side-by-side box plots showing the MUSHRA scores for three evaluations: "Interfering source," "Background noise," and "Complete acoustic scene." Each plot covers the conditions labeled "Reference," "Anchor," "BMVDR," "BLCMV," "BMVDR-N," and "BLCMV-N." Scores range from 0 to 100, with red lines representing medians, red dots representing means, blue boxes spanning the first to third quartiles, and red + markers indicating outliers.
### Components/Axes
- **X-axis (Methods)**:
- Reference
- Anchor
- BMVDR
- BLCMV
- BMVDR-N
- BLCMV-N
- **Y-axis (Score)**: 0 to 100 (linear scale).
- **Plot Titles**:
1. "Interfering source"
2. "Background noise"
3. "Complete acoustic scene"
- **Legend Elements**:
- Red lines: Medians
- Red dots: Means
- Blue boxes: First to third quartiles
- Red + markers: Outliers
### Detailed Analysis
#### Interfering Source
- **Reference**: Highest score (~100), no outliers.
- **Anchor**: Lowest score (~0), with 3 outliers (~5, ~10, ~15).
- **BMVDR**: Median ~20, IQR ~10–30, 1 outlier (~5).
- **BLCMV**: Median ~70, IQR ~60–80, 1 outlier (~90).
- **BMVDR-N**: Median ~50, IQR ~40–60, 1 outlier (~70).
- **BLCMV-N**: Median ~75, IQR ~65–85, 1 outlier (~95).
#### Background Noise
- **Reference**: Highest score (~100), no outliers.
- **Anchor**: Lowest score (~0), with 2 outliers (~5, ~10).
- **BMVDR**: Median ~20, IQR ~10–30, 1 outlier (~5).
- **BLCMV**: Median ~30, IQR ~20–40, 1 outlier (~50).
- **BMVDR-N**: Median ~25, IQR ~15–35, 1 outlier (~40).
- **BLCMV-N**: Median ~70, IQR ~60–80, 1 outlier (~85).
#### Complete Acoustic Scene
- **Reference**: Highest score (~100), no outliers.
- **Anchor**: Lowest score (~0), with 3 outliers (~5, ~10, ~15).
- **BMVDR**: Median ~20, IQR ~10–30, 1 outlier (~5).
- **BLCMV**: Median ~50, IQR ~40–60, 1 outlier (~70).
- **BMVDR-N**: Median ~30, IQR ~20–40, 1 outlier (~50).
- **BLCMV-N**: Median ~75, IQR ~65–85, 1 outlier (~90).
### Key Observations
1. **Reference** consistently achieves the highest scores across all scenarios, with no outliers.
2. **Anchor** performs poorly in all scenarios, with frequent outliers.
3. **BMVDR** and **BLCMV** show moderate performance, with BLCMV outperforming BMVDR in "Interfering source" and "Complete acoustic scene."
4. **BMVDR-N** and **BLCMV-N** demonstrate improved scores compared to their non-N variants, but still fall short of Reference.
5. Outliers are most frequent in "Anchor" and "BLCMV-N" (e.g., ~95 in "Interfering source").
### Interpretation
The hidden reference (the unprocessed reference microphone signals) is rated close to 100 and the anchor (a monaural signal without binaural cues) close to 0, as intended in a MUSHRA-like test. These two conditions bound the rating scale and are not beamforming algorithms. The scores rate the perceived spatial difference with respect to the reference, not the noise reduction performance. Among the four algorithms, the approximate values listed above place the BLCMV-N highest in all three evaluations.
</details>
TABLE I OBJECTIVE PERFORMANCE MEASURES FOR ALL CONSIDERED ALGORITHMS IN THE REVERBERANT ENVIRONMENT.
| Measure | BMVDR | BLCMV | BMVDR-N | BLCMV-N |
|-------------|---------|---------|-----------|-----------|
| ∆SNR L [dB] | 13.0 | 10.1 | 8.6 | 7.6 |
| ∆SNR R [dB] | 12.9 | 9.2 | 8.6 | 7.0 |
| ∆SIR L [dB] | -0.1 | 9.7 | 0.82 | 9.8 |
| ∆SIR R [dB] | -4.3 | 8.7 | -2.4 | 8.9 |
| ∆MSC | 0.86 | 0.64 | 0.10 | 0.19 |
the BMVDR-N due to the mixing with the noisy reference microphone signals, and for the BLCMV-N due to both effects. The partial noise estimation for the BLCMV-N seems to result in a smaller drop in noise reduction performance compared to the BLCMV (2.5 dB for the left side, 2.2 dB for the right side) than for the BMVDR-N compared to the BMVDR (4.4 dB for the left side, 4.3 dB for the right side). Please note that both for the BMVDR-N and for the BLCMV-N this drop in noise reduction performance depends on the relative position of the interfering source to the desired source.
In terms of interference reduction performance, it can be observed that both the BLCMV and the BLCMV-N approximately lead to the same SIR improvement (for the left and the right side), which is in line with the theoretical SIR improvement in (43) and (79), i.e., 10 log₁₀(1/δ²) ≈ 10.5 dB. The fact that this theoretical SIR improvement is not reached and the fact that the SIR improvements for the BLCMV and BLCMV-N are not exactly the same is due to estimation errors in the covariance matrices, which was also already noted in [14], [17]. In addition, it can be observed that the BMVDR and BMVDR-N lead to very low (even negative) SIR improvements, which is presumably due to the fact that the interfering source is relatively close to the desired source.

2) Binaural Cue Preservation of Background Noise: As expected, the BMVDR yields the largest MSC error of the noise component ∆MSC. As discussed in Section III-B, the output MSC of the noise component for the BLCMV is typically smaller than 1, hence leading to a smaller MSC error compared to the BMVDR. Due to the mixing with the noisy reference microphone signals, both the BMVDR-N and the BLCMV-N yield a much smaller MSC error of the noise component than the BMVDR and the BLCMV, where the MSC error is slightly smaller for the BMVDR-N than for the BLCMV-N.
In conclusion, the objective performance measures show that the BLCMV-N leads to a very similar interference reduction as the BLCMV, while providing a trade-off between noise reduction performance (slightly worse than the BLCMV) and binaural cue preservation of the background noise (much better than the BLCMV).
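The arithmetic behind the numbers discussed in this section can be retraced with a few lines. The sketch below only reuses values stated in the text and in Table I (δ = 0.3 and the SNR improvements); the helper `snr_drop` and the dictionary layout are hypothetical.

```python
import math

delta = 0.3  # interference scaling parameter used in this experiment

# Theoretical SIR improvement 10*log10(1/delta^2) in dB.
sir_theory_db = 10 * math.log10(1 / delta**2)
print(f"theoretical SIR improvement: {sir_theory_db:.2f} dB")  # prints 10.46 dB

# SNR improvements from Table I in dB, stored as (left, right).
snr_improvement = {
    "BMVDR": (13.0, 12.9),
    "BLCMV": (10.1, 9.2),
    "BMVDR-N": (8.6, 8.6),
    "BLCMV-N": (7.6, 7.0),
}

def snr_drop(base, mixed):
    """Drop in SNR improvement (left, right) when adding partial noise estimation."""
    return tuple(
        round(b - m, 1)
        for b, m in zip(snr_improvement[base], snr_improvement[mixed])
    )

drop_blcmv = snr_drop("BLCMV", "BLCMV-N")   # (2.5, 2.2)
drop_bmvdr = snr_drop("BMVDR", "BMVDR-N")   # (4.4, 4.3)
print(drop_blcmv, drop_bmvdr)
```

The measured SIR improvements of the BLCMV and BLCMV-N (8.7 dB to 9.8 dB in Table I) stay below the theoretical value of about 10.5 dB, which the text attributes to estimation errors in the covariance matrices.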
## C. Perceptual Listening Test
To further investigate the spatial impression of the different output signal components for the four considered algorithms, we conducted a perceptual listening test similarly to [21]. The desired source was now placed at -35° and the interfering source was placed at 90°, in order to enhance the perceived spatial differences between both sources. The desired source component, the interfering source component and the noise component were mixed at an input SNR of 0 dB and input SIR of 0 dB in the right reference microphone. Thirteen self-reported normal-hearing subjects participated in the perceptual listening test; none of the authors participated. All subjects can be considered expert listeners, i.e., they were familiar with similar perceptual listening tests, and gave informed consent. The listening test was conducted in a soundproof listening booth using an RME Fireface UCX sound card with Sennheiser HD 580 headphones.
Using a procedure similar to the MUlti-Stimulus Test with Hidden Reference and Anchor (MUSHRA) [27], the task was to rate the perceived spatial difference with respect to a reference signal. For a coherent source (e.g., interfering source), this corresponds to rating differences in perceived source location, whereas for a diffuse noise field this corresponds to rating differences in perceived diffuseness. A score of 0 is associated with a large perceived spatial difference, whereas a score of 100 is associated with no perceived spatial difference. As reference signal we used the (unprocessed) reference microphone signals, while as anchor signal we used the left reference microphone signal, played back to both ears. The anchor signal was hence a monaural signal with no binaural cues, which is perceived in the center of the head.
We conducted three evaluations, where only some components were active in the output signals, the reference signal and the anchor signal. In the first evaluation, only the desired source component and the interfering source component (i.e., no noise component) were active and the task was to rate the spatial difference for the interfering source. In the second evaluation, only the desired source component and the noise component (i.e., no interfering source component) were active and the task was to rate the spatial difference for the background noise. In the third evaluation, all signal components were active and the task was to rate the spatial difference for the interfering source and the background noise simultaneously. To familiarize the subjects with the tasks and the sound material, a training round was performed. Audio samples for all binaural beamforming algorithms and the unprocessed input signals are available online (see https://uol.de/en/sigproc/research/audiodemos/binaural-noise-reduction/blcmv-n-beamformer).
The MUSHRA scores for the three evaluations are shown in Figure 8. A one-way repeated-measures ANOVA was performed. The analysis revealed a significant within-subjects effect for all three evaluations. Hence, post-hoc comparison t-tests with Bonferroni correction were performed [28].
a) Interfering source: The within-subjects effect was significant [F(2.098, 25.176) = 219.2, p < .001, Greenhouse-Geisser correction]. As expected, the BLCMV and the BLCMV-N preserved the spatial impression of the interfering source significantly better than the BMVDR and the BMVDR-N (p < .001). The BMVDR-N performed significantly better than the BMVDR (p < .001), which is not unexpected since the interfering source component is also mixed with the mixing parameter η. No significant difference was found between the BLCMV and the BLCMV-N (p = 1).
b) Background noise: The within-subjects effect was significant [F(3.072, 36.869) = 332.066, p < .001, Greenhouse-Geisser correction]. As expected, the BMVDR-N and the BLCMV-N, both using partial noise estimation, preserved the spatial impression of the background noise significantly better than the BMVDR and the BLCMV (p < .001). No significant difference was found between the BMVDR-N and the BLCMV-N (p = 1) and between the BMVDR and BLCMV (p = .614).
c) Complete acoustic scene: The within-subjects effect was significant [F(2.905, 34.858) = 171.783, p < .001, Greenhouse-Geisser correction]. In terms of preservation of the spatial impression of the complete acoustic scene, the BMVDR-N scored significantly higher than the BMVDR (p < .001), the BLCMV scored significantly higher than the BMVDR-N (p = .014), and the proposed BLCMV-N scored significantly higher than the BLCMV (p = .025).
In summary, the results of the listening test showed that the BLCMV-N is capable of preserving the spatial impression of an interfering source and background noise in a realistic acoustic scenario, outperforming all other considered binaural beamforming algorithms in terms of spatial impression.
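The fractional degrees of freedom reported for the three ANOVAs follow from the Greenhouse-Geisser correction, which scales both degrees of freedom by a factor ε. The sketch below recovers the implied ε from the reported values and checks the error degrees of freedom. It assumes six conditions per evaluation (hidden reference, anchor and the four algorithms, as shown in Fig. 8); this count is an assumption of the sketch and is not stated explicitly in the text.

```python
n_subjects = 13    # participants reported in the text
n_conditions = 6   # assumption: reference, anchor and four algorithms (Fig. 8)

# Reported Greenhouse-Geisser corrected degrees of freedom (effect, error).
reported_df = {
    "interfering source": (2.098, 25.176),
    "background noise": (3.072, 36.869),
    "complete acoustic scene": (2.905, 34.858),
}

def gg_epsilon(df_effect, k):
    """Greenhouse-Geisser epsilon implied by a corrected effect df for k conditions."""
    return df_effect / (k - 1)

implied = {}
for name, (df_effect, df_error) in reported_df.items():
    eps = gg_epsilon(df_effect, n_conditions)
    # Corrected error df: eps * (k - 1) * (n - 1)
    df_error_expected = eps * (n_conditions - 1) * (n_subjects - 1)
    implied[name] = (eps, df_error_expected)
    print(f"{name}: epsilon = {eps:.3f}, "
          f"expected error df = {df_error_expected:.3f} (reported {df_error})")
```

Under this assumption all three implied ε values lie between the lower bound 1/(k - 1) = 0.2 and 1, and the expected error degrees of freedom agree with the reported ones up to rounding.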
## VII. CONCLUSIONS
In this paper we proposed the BLCMV-N, merging the advantages of the BLCMV and the BMVDR-N, i.e., preserving the binaural cues of the interfering source and controlling the reduction of the interfering source as well as the binaural cues of the background noise. We showed that the output signals of the BLCMV-N can be interpreted as a mixture between the noisy reference microphone signals and the output signals of a BLCMV using an adjusted interference scaling parameter. We provided a theoretical comparison between the BMVDR, the BLCMV, the BMVDR-N and the proposed BLCMV-N in terms of noise and interference reduction performance and binaural cue preservation. The obtained analytical expressions were first validated using measured anechoic acoustic transfer functions. Experimental results using recorded signals in a realistic reverberant environment showed that the BLCMV-N leads to a very similar interference reduction as the BLCMV, while providing a trade-off between noise reduction performance (slightly worse than the BLCMV) and binaural cue preservation of the background noise (much better than the BLCMV). In addition, the results of a perceptual listening test with 13 normal-hearing participants showed that the proposed BLCMV-N is capable of preserving the spatial impression of an interfering source and background noise in a realistic acoustic scenario, outperforming all other considered binaural beamforming algorithms in terms of spatial impression.
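The mixture interpretation summarized above can be checked numerically for a single frequency bin. This is a minimal sketch, not the authors' code. It assumes that the BLCMV-N cost function is the noise power of the difference between the filter output and the reference microphone noise scaled by η, subject to the two BLCMV constraints, and that the adjusted interference scaling parameter equals (δ - η)/(1 - η), which follows from this assumed cost function. The function names and the random toy data are hypothetical.

```python
import numpy as np

def lcmv(R_n, C, g):
    """LCMV filter: minimises w^H R_n w subject to C^H w = g."""
    Rinv_C = np.linalg.solve(R_n, C)
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, g)

def lcmv_partial_noise(R_n, C, g, e_ref, eta):
    """Minimises (w - eta e)^H R_n (w - eta e) subject to C^H w = g (assumed cost)."""
    return eta * e_ref + lcmv(R_n, C, g - eta * (C.conj().T @ e_ref))

rng = np.random.default_rng(0)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # desired source ATF (toy)
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # interfering source ATF (toy)
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_n = B @ B.conj().T + np.eye(M)                           # positive definite noise covariance

C = np.column_stack([a, b])
e_L = np.eye(M)[0]            # selects the left reference microphone (index 0, assumed)
eta, delta = 0.2, 0.3

# Constraints: distortionless desired source, interferer scaled by delta.
g_L = np.array([np.conj(a[0]), delta * np.conj(b[0])])
w_direct = lcmv_partial_noise(R_n, C, g_L, e_L, eta)

# Mixture of the reference microphone and a BLCMV with adjusted scaling.
delta_adj = (delta - eta) / (1 - eta)
g_adj = np.array([np.conj(a[0]), delta_adj * np.conj(b[0])])
w_mixture = eta * e_L + (1 - eta) * lcmv(R_n, C, g_adj)

constraints_ok = np.allclose(C.conj().T @ w_direct, g_L)
same_filter = np.allclose(w_direct, w_mixture)
print(constraints_ok, same_filter)
```

For the parameter setting of Section VI-B (η = δ = 0.3) the adjusted scaling parameter in this sketch becomes zero, so the BLCMV part of the mixture would fully cancel the interfering source and the interfering source in the output would stem from the mixed-in reference microphone signal only.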
## APPENDIX A DERIVATION OF THE BLCMV-N
Using (4), (6) and (39), the constrained optimization problem in (54) can be reformulated as
<!-- formula-not-decoded -->
This constrained optimization problem can be solved using the method of Lagrange multipliers, where the Lagrangian function is given by
<!-- formula-not-decoded -->
with λ_L denoting the 2-dimensional vector of Lagrangian multipliers. Setting the gradient with respect to w_L
<!-- formula-not-decoded -->
equal to 0 yields
<!-- formula-not-decoded -->
Substituting (95) into the constraint C^H w_L = g_L and solving for the Lagrangian multiplier λ_L yields
<!-- formula-not-decoded -->
Substituting (96) into (95), the solution to (54) is given by
<!-- formula-not-decoded -->
where, using (39),
<!-- formula-not-decoded -->
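Since the equations of this appendix are not reproduced here, the following is a sketch of how such a derivation typically proceeds, not a reconstruction of (92)-(98). It assumes that the cost function in (54) is the noise power of the difference between the filter output and the reference microphone noise scaled by η, with e_L denoting the selection vector of the left reference microphone. The scaling convention of the Lagrangian multipliers is also an assumption and may differ from the one used in (93)-(96); the final filter does not depend on it.

```latex
\begin{align}
  &\min_{\mathbf{w}_L}\;
    (\mathbf{w}_L - \eta\,\mathbf{e}_L)^H \mathbf{R}_n (\mathbf{w}_L - \eta\,\mathbf{e}_L)
    \quad \text{s.t.} \quad \mathbf{C}^H \mathbf{w}_L = \mathbf{g}_L \\
  &\mathcal{L}(\mathbf{w}_L, \boldsymbol{\lambda}_L) =
    (\mathbf{w}_L - \eta\,\mathbf{e}_L)^H \mathbf{R}_n (\mathbf{w}_L - \eta\,\mathbf{e}_L)
    + \boldsymbol{\lambda}_L^H (\mathbf{C}^H \mathbf{w}_L - \mathbf{g}_L)
    + (\mathbf{C}^H \mathbf{w}_L - \mathbf{g}_L)^H \boldsymbol{\lambda}_L \\
  &\nabla_{\mathbf{w}_L^{*}} \mathcal{L} =
    \mathbf{R}_n (\mathbf{w}_L - \eta\,\mathbf{e}_L) + \mathbf{C}\,\boldsymbol{\lambda}_L
    = \mathbf{0}
    \;\Rightarrow\;
    \mathbf{w}_L = \eta\,\mathbf{e}_L - \mathbf{R}_n^{-1} \mathbf{C}\,\boldsymbol{\lambda}_L \\
  &\boldsymbol{\lambda}_L =
    (\mathbf{C}^H \mathbf{R}_n^{-1} \mathbf{C})^{-1}
    (\eta\,\mathbf{C}^H \mathbf{e}_L - \mathbf{g}_L) \\
  &\mathbf{w}_L = \eta\,\mathbf{e}_L
    + \mathbf{R}_n^{-1} \mathbf{C}\,(\mathbf{C}^H \mathbf{R}_n^{-1} \mathbf{C})^{-1}
    (\mathbf{g}_L - \eta\,\mathbf{C}^H \mathbf{e}_L)
\end{align}
```

Under these assumptions the filter is the sum of the scaled reference microphone selector and an LCMV-type filter with the modified constraint vector g_L - η C^H e_L, which is consistent with the mixture interpretation stated in the paper.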
## APPENDIX B OUTPUT NOISE PSD FOR THE BLCMV-N
Using (67) in (16) with R n instead of R x , the output PSD of the noise component for the BLCMV-N is given by
<!-- formula-not-decoded -->
Using (64) and (66), the components in (99) are given by [14]
<!-- formula-not-decoded -->
Substituting (100) in (99) yields
<!-- formula-not-decoded -->
with R xu, 3 defined in (74). Similarly, it can be shown that
<!-- formula-not-decoded -->
<!-- formula-not-decoded -->
## REFERENCES
- [1] V. Hamacher, U. Kornagel, T. Lotter, and H. Puder, 'Binaural signal processing in hearing aids: Technologies and algorithms,' in Advances in Digital Speech Transmission. New York, NY, USA: Wiley, 2008, pp. 401-429.
- [2] S. Doclo, W. Kellermann, S. Makino, and S. E. Nordholm, 'Multichannel signal enhancement algorithms for assisted listening devices: Exploiting spatial diversity using multiple microphones,' IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 18-30, Mar. 2015.
- [3] S. Doclo, S. Gannot, D. Marquardt, and E. Hadad, 'Binaural speech processing with application to hearing devices,' in Audio Source Separation and Speech Enhancement. Wiley, 2018, ch. 18, pp. 413-442.
- [4] J. Blauert, Spatial hearing: the psychophysics of human sound localization. Cambridge, Mass.: MIT Press, 1997.
- [5] K. Kurozumi and K. Ohgushi, 'The relationship between the cross-correlation coefficient of two-channel acoustic signals and sound image quality,' The Journal of the Acoustical Society of America, vol. 74, no. 6, pp. 1726-1733, Dec. 1983.
- [6] A. W. Bronkhorst and R. Plomp, 'The effect of head-induced interaural time and level differences on speech intelligibility in noise,' The Journal of the Acoustical Society of America, vol. 83, no. 4, pp. 1508-1516, Apr. 1988.
- [7] M. L. Hawley, R. Y. Litovsky, and J. F. Culling, 'The benefit of binaural hearing in a cocktail party: Effect of location and type of interferer,' The Journal of the Acoustical Society of America, vol. 115, no. 2, pp. 833-843, Feb. 2004.
- [8] D. P. Welker, J. E. Greenberg, J. G. Desloge, and P. M. Zurek, 'Microphone-array hearing aids with binaural output. II. A two-microphone adaptive system,' IEEE Transactions on Speech and Audio Processing, vol. 5, no. 6, pp. 543-551, 1997.
- [9] R. Aichner, H. Buchner, M. Zourub, and W. Kellermann, 'Multichannel source separation preserving spatial information,' in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Honolulu HI, USA, Apr. 2007, pp. 5-8.
- [10] T. Klasen, T. van den Bogaert, M. Moonen, and J. Wouters, 'Binaural noise reduction algorithms for hearing aids that preserve interaural time delay cues,' IEEE Transactions on Signal Processing, vol. 55, no. 4, pp. 1579-1585, Apr. 2007.
- [11] B. Cornelis, S. Doclo, T. van den Bogaert, J. Wouters, and M. Moonen, 'Theoretical analysis of binaural multi-microphone noise reduction techniques,' IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 2, pp. 342-355, Feb. 2010.
- [12] E. Hadad, D. Marquardt, S. Doclo, and S. Gannot, 'Theoretical analysis of binaural transfer function MVDR beamformers with interference cue preservation constraints,' IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2449-2464, Dec. 2015.
- [13] D. Marquardt, V. Hohmann, and S. Doclo, 'Interaural coherence preservation in multi-channel Wiener filtering based noise reduction for binaural hearing aids,' IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2162-2176, Dec. 2015.
- [14] E. Hadad, S. Doclo, and S. Gannot, 'The binaural LCMV beamformer and its performance analysis,' IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 3, pp. 543-558, Mar. 2016.
- [15] A. I. Koutrouvelis, R. C. Hendriks, R. Heusdens, and J. Jensen, 'Relaxed binaural LCMV beamforming,' IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 1, pp. 137-152, Jan. 2017.
- [16] D. Marquardt and S. Doclo, 'Interaural coherence preservation for binaural noise reduction using partial noise estimation and spectral postfiltering,' IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 7, pp. 1257-1270, Jan. 2018.
- [17] N. Gößling, D. Marquardt, I. Merks, T. Zhang, and S. Doclo, 'Optimal binaural LCMV beamforming in complex acoustic scenarios: Theoretical and practical insights,' in Proc. International Workshop on Acoustic Signal Enhancement (IWAENC), Tokyo, Japan, Sep. 2018, pp. 381-385.
- [18] H. As'ad, M. Bouchard, and H. Kamkar-Parsi, 'A robust target linearly constrained minimum variance beamformer with spatial cues preservation for binaural hearing aids,' IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 10, pp. 1549-1563, Oct. 2019.
- [19] R. M. Corey and A. C. Singer, 'Binaural audio source remixing with microphone array listening devices,' in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 561-565.
- [20] E. Hadad, D. Marquardt, S. Doclo, and S. Gannot, 'Comparison of binaural multichannel Wiener filters with binaural cue preservation of the interferer,' in IEEE International Conference on the Science of Electrical Engineering (ICSEE), Eilat, Israel, Nov. 2016, pp. 1-5.
- [21] N. Gößling, D. Marquardt, and S. Doclo, 'Perceptual evaluation of binaural MVDR-based algorithms to preserve the interaural coherence of diffuse noise fields,' Trends in Hearing, vol. 24, pp. 1-18, Apr. 2020.
- [22] B. D. Van Veen and K. M. Buckley, 'Beamforming: A versatile approach to spatial filtering,' IEEE ASSP Magazine, vol. 5, no. 2, pp. 4-24, Apr. 1988.
- [23] S. Gannot, E. Vincent, S. Markovich-Golan, and A. Ozerov, 'A consolidated perspective on multimicrophone speech enhancement and source separation,' IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 4, pp. 692-730, Apr. 2017.
- [24] H. Kayser, S. D. Ewert, J. Anemüller, T. Rohdenburg, V. Hohmann, and B. Kollmeier, 'Database of multichannel In-Ear and Behind-The-Ear head-related and binaural room impulse responses,' EURASIP Journal on Advances in Signal Processing, vol. 2009, p. 10 pages, Jan. 2009.
- [25] S. Gannot, D. Burshtein, and E. Weinstein, 'Signal enhancement using beamforming and non-stationarity with applications to speech,' IEEE Transactions on Signal Processing, vol. 49, no. 8, pp. 1614-1626, Aug. 2001.
- [26] S. Markovich, S. Gannot, and I. Cohen, 'Multichannel eigenspace beamforming in a reverberant noisy environment with multiple interfering speech signals,' IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, pp. 1071-1086, Aug. 2009.
- [27] ITU-R BS.1534-1, Method for the subjective assessment of intermediate quality level of coding systems, International Telecommunication Union (ITU-R) Recommendation, Jan. 2003.
- [28] B. R. Kirkwood and J. A. C. Sterne, Essential medical statistics. John Wiley & Sons, 2010.