Fine-Tuning of a Voice Production Model to Estimate Impact Stress Using a Metaheuristic Method
Ajuste fino de un modelo de producción vocal para estimar el estrés de impacto utilizando un método metaheurístico
Carlos-Alberto Calvache-Mora, Leonardo Soláque, Alexandra Velasco, Lina Peñuela
Abstract
Introduction: In vocal production models employing spring-mass-damper frameworks, precision in determining damping coefficients that align with physiological vocal fold characteristics is crucial, accounting for potential variations in the representation of viscosity-elasticity properties.
Objective: This study aims to conduct a parametric fitting of a vocal production model based on a mass-spring-damper system incorporating subglottic pressure interaction, with the purpose of accurately modeling the collision forces exerted by vocal folds during phonation.
Method: A metaheuristic search algorithm was employed for parametric synthesis. The algorithm was applied to elasticity coefficients c1 and c2, as well as damping coefficients ε1 and ε2, which directly correlate with the mass matrices of the model. This facilitates the adjustment of fold composition to achieve desired physiological behavior.
Results: The vocal system's behavior for each simulation cycle was compared to a predefined standard under normal conditions. The algorithm determined the simulation endpoint by evaluating discrepancies between key features of the obtained signals and the desired ones.
Conclusion: Parametric fitting enabled the approximation of physiological vocal production behavior, providing estimates of the impact forces experienced by vocal folds during phonation.
Keywords
Vocal model; impact stress; metaheuristic methods; fine-tuning.
Resumen
Introducción: En modelos de producción vocal que emplean estructuras de resorte-masa-amortiguador, la precisión en la determinación de coeficientes de amortiguamiento que se asemejen a las características fisiológicas de las cuerdas vocales es crucial, teniendo en cuenta posibles variaciones en la representación de la viscoelasticidad.
Objetivo: Este estudio tiene como objetivo realizar un ajuste paramétrico de un modelo de producción vocal basado en un sistema de resorte-masa-amortiguador que incorpora interacción con la presión subglótica, con el fin de modelar de manera precisa las fuerzas de colisión ejercidas por las cuerdas vocales durante la fonación.
Método: Se utilizó un algoritmo de búsqueda metaheurística para la síntesis paramétrica. El algoritmo se aplicó a los coeficientes de elasticidad c1 y c2, así como a los coeficientes de amortiguamiento ε1 y ε2, que se correlacionan directamente con las matrices de masa del modelo. Esto facilita el ajuste de la composición de las cuerdas para lograr un comportamiento fisiológico deseado.
Resultados: El comportamiento del sistema vocal para cada ciclo de simulación se comparó con un estándar predefinido en condiciones normales. El algoritmo determinó el punto final de la simulación evaluando las discrepancias entre características clave de las señales obtenidas y las deseadas.
Conclusión: El ajuste paramétrico permitió la aproximación del comportamiento fisiológico de la producción vocal, proporcionando estimaciones de las fuerzas de impacto experimentadas por las cuerdas vocales durante la fonación.
Palabras clave
Modelo vocal; estrés de impacto; métodos metaheurísticos; ajuste fino.
Introduction
Voice production involves the integration of three physiological processes: breathing, phonation, and resonance, allowing for the analysis of aerodynamic, mechanical, and acoustic phenomena within this framework [1,2]. In general terms, sound production is a complex nonlinear process involving air-structure-sound interaction: pulmonary pressure (subglottic pressure) generates airflow, which in turn interacts with the glottis, the region where the vocal folds converge to initiate vibration as a result of the interplay between air and the surrounding tissues and mucosa [3,4]. The quality of sound production is significantly influenced by the airflow [5]. Under optimal physiological and geometrical conditions, the interaction between the airflow and the vocal folds leads to a self-oscillation effect [6,7]. Consequently, the airflow affects glottal geometry, and the biomechanical properties of the vocal folds bring about alterations in the conversion of this airflow into sound, ultimately influencing the quality and acoustic attributes of the voice (i.e., the voice's characteristics depend on the interplay between flow and structure) [8]. Finally, the sound generated in the glottis is modified in the supraglottic cavities, so the concept of vocal tract inertance also contributes to this self-oscillation [2] (see Figure 1).
Figure 1. Vocal Physiology.
Given the complexity of this field, research has primarily focused on comprehending the various physical phenomena of vocal production through biomechanical modeling. In the literature, models of varying complexity have been developed to represent vocal fold tissues, incorporating lumped elements to depict acoustic wave signals and one-dimensional airflow [9,10]. More complex, high-fidelity models based on computational fluid dynamics have also been proposed [1].
Reduced-order models have proven effective in representing the self-oscillation and modal response of vocal folds, as demonstrated in recent works [11]. Other authors have worked on models that include nonlinear flow-structure-acoustic coupling in voice production [12,13] and wave propagation within the vocal tract [2]. The work of Espinoza et al. [14] is also worth mentioning.
The ultimate objective of modeling vocal production has been to understand the vocal folds’ kinematic behavior; however, it is equally crucial to establish parameters for the clinical diagnosis of vocal pathologies. This purpose involves recognition of parameters related to vocal fold displacements and geometry concerning aerodynamic behavior. Some parameters that are currently challenging or even impossible to measure clinically can be estimated through numerical models, yielding a series of synchronized signals and data [15]. For instance, this encompasses the estimation of impact forces during the collision of vocal folds upon reaching the midline and calculation of intraglottic pressure [16,17]. Additionally, other lumped-element models presently aid in characterizing hyper-functional behaviors in voice use, allowing for their classification into phonotraumatic and non-phonotraumatic categories [18]. This process considers parameters derived from estimation methods utilizing signals acquired by accelerometers in conjunction with pressure transducers, airflow transducers, microphones, and/or neck surface electrodes [19].
The gap between vocal production modeling and clinical utility has been a challenging aspect addressed by various numerical models. However, achieving this integration remains complex, primarily because characterizing the voice and the structural mechanical properties of the models relies on measurements derived from in vivo or ex vivo experiments [20] or on numerical simulations [12,16,21]. While these approaches provide an approximation, they lack precision in mirroring the clinical reality of a patient with a vocal alteration. Mass-spring-damper models, in particular, face difficulty in accurately determining damping coefficients, owing to potential variations in representing the viscoelasticity of the vocal fold. This paper describes a parametric optimization of the vocal fold model with Hertz impact forces originally introduced by Horáček [16].
Thus, this work aims to conduct a parametric refinement of a mass-spring-damper-based vocal production model incorporating subglottic pressure interaction, enabling the representation of collision forces between vocal folds during phonation [16]. Section 3 describes the features of the model. The parametric refinement is carried out through the application of metaheuristic methods, aiming to reconstruct the physical behavior of the vocal folds and estimate the impact stress during phonation.
The paper is structured as follows: Section 2 provides the background information needed to understand the problem; Section 3 describes the model used to represent the impact stress of the vocal folds; Section 4 presents the parametric synthesis framework; Section 5 demonstrates the proposed parametric synthesis method; finally, Section 6 offers a discussion and outlines potential future research avenues.
Background: Metaheuristic Methods
The model serves as an abstraction of a real-world problem, upon which mathematical considerations are applied to yield results tailored to the desired outcomes. Within the realm of modeling, traditional methods for optimization are employed [22]. Among these methods are metaheuristic algorithms, which encompass approximate optimization and general-purpose search algorithms. These algorithms iteratively guide a subordinate heuristic by intelligently integrating various concepts to effectively explore and exploit the search space towards an appropriate solution [23].
Such methods have found application in diverse problem domains. For instance, they have been employed in assignment problems utilizing piecewise linearization techniques [24], in the design of embedded computer systems through deterministic iteration techniques [25], and in engineering structure design based on signomial discrete programming [26]. Additionally, they have been utilized in deterministic optimization methods within engineering and management [27], as well as in the field of molecular biology to optimize the localization of protein binding sites on DNA strands [28]. Likewise, this methodology has been used in medical contexts for optimizing fractionated protocols in cancer radiotherapy via nonlinear programming [29]. Other applications encompass model updating in the parameter optimization process [30-32].
Furthermore, metaheuristic methods have contributed to the development of novel optimization techniques [33]. These techniques find utility across various domains of knowledge [34]. For example, they have been applied to tackle complex nonlinear problems using music-based metaheuristic search methods, as documented by Altay and Alatas [35]. Heuristic and metaheuristic approaches have also been proposed for genetic algorithm and memetic algorithm optimization [36]. Similarly, variations of methods for constrained optimization have been suggested. For instance, Gokalp and Ugur [37] employed a hybrid approach, integrating three metaheuristic algorithms with different behavioral patterns.
Within the realm of research applied to vocal models, only a limited number of contributions have been identified. Existing literature provides evidence of parametric estimation work, such as that proposed by Yang et al. [38], wherein a mathematical optimization method is introduced to adjust the parameters of a three-dimensional vocal model for reproducing vocal fold dynamics through the evaluation of biomechanical parameters (pressure, tension, and masses). Additionally, Kurniasih et al. [39] present a computational estimation of vocal tract shape parameters employing analysis-by-synthesis with acoustic data as input to iteratively optimize the shape parameters. Similarly, Dognin [40] proposes parametric optimization for vocal tract length normalization. This theme is further explored in Laprie and Mathieu's work [41], which introduces a variational computational approach for estimating vocal tract shapes from speech signals using iterative processes. Another noteworthy study is presented by Ding et al. [42], where a fast and robust joint estimation of the vocal tract and vocal source parameters from speech signals is proposed, based on an autoregressive model with exogenous input (ARX). As of the date of this study, the literature review identified no applications aimed at understanding the aerodynamic processes involved in vocal production. One approach to fine-tuning model outputs involves the development of algorithms that facilitate the convergence of values leading to model stability.
Combinatorial algorithms represent one of the most commonly employed algorithmic approaches for fine-tuning a model's response [43]. These algorithms form an integral part of metaheuristic algorithms [44,45]. They systematically enumerate all potential candidates for solving a given problem, assessing whether each candidate satisfies the solution criteria. This approach is often applied when the pool of candidate solutions is small or when it can be selectively reduced through preceding heuristic methods [46,47].
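The enumeration described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in this study: the function name `exhaustive_search`, the candidate values, and the acceptance rule are all hypothetical and chosen only to make the idea concrete.

```python
from itertools import product

def exhaustive_search(candidates, is_solution):
    """Enumerate every combination of candidate values and keep those that pass the test."""
    names = list(candidates)
    accepted = []
    for values in product(*(candidates[name] for name in names)):
        trial = dict(zip(names, values))
        if is_solution(trial):
            accepted.append(trial)
    return accepted

# Hypothetical candidate pool: two coefficients with three values each (9 combinations).
grid = {"c1": [1.0, 2.0, 3.0], "c2": [0.5, 1.0, 1.5]}

# Hypothetical acceptance rule, used only for illustration.
solutions = exhaustive_search(grid, lambda p: abs(p["c1"] * p["c2"] - 3.0) < 1e-9)
print(solutions)
```

The cost grows with the product of the candidate counts, which is why this kind of enumeration is practical only when the pool is small or has been reduced beforehand.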
Model to Represent Impact Stress of Vocal Fold
The model of vocal fold self-oscillation outlined by Horáček et al. [16,48] serves as the foundational framework. This is a two-dimensional aeroelastic computational model, which incorporates the Hertz model to account for the impact forces governing vocal cord collisions [17]. The primary objective of this model is to investigate the maximum magnitudes of the impact stress (IS) throughout a complete cycle of vocal fold vibration. It is designed to simulate a single vocal fold, assuming glottal symmetry, and is tailored towards emulating the characteristics of a typical, healthy larynx.
Figure 2 illustrates a schematic representation of the model, conceptualized as a dynamic system characterized by two degrees of freedom. It comprises three equivalent masses oscillating on two springs, along with dampers regulating the opening and closing phases of the glottis. The motion is characterized by the rotation and translation of the components arranged in the configuration of a vocal fold. A succinct overview of the model is presented herein.
Figure 2. Closing and opening cycle of vocal folds.
The interaction with the vocal tract is not taken into consideration by the model. Thus, only mechanical settings require adjustment. Self-oscillations arise from the presence of nonlinear aerodynamic forces and collision forces acting upon the static lung pressure load (Plung) when the glottis is closed. The geometry of the vocal cords is approximated by a parabolic shape function \(a\left(x\right)\), which determines the bulge and curvature of the contacting surfaces at the point of interaction. In the Hertz impact model, a Young’s modulus of \(E=8\ \mathrm{kPa}\) and a Poisson’s ratio of \(v=0.4\) were used for the vocal cord tissue.
The equation of motion for the two-degree-of-freedom vocal fold model can be written as,
\(M\ddot{V}+B\dot{V}+KV+F=0 \) Eq.1
where the displacement and excitation force vectors are:
\(V=\left[\begin{matrix}V_1\left(t\right)\\V_2\left(t\right)\\\end{matrix}\right],F=\left[\begin{matrix}F_1\left(t\right)\\F_2\left(t\right)\\\end{matrix}\right]\) Eq.2
and M, B, and K are the mass, damping, and stiffness matrices of the structure:
\( \begin{matrix}M&=\left[\begin{matrix}-lm_1&m_1+\frac{m_3}{2}\\lm_2&m_2+\frac{m_3}{2}\\\end{matrix}\right],\\B&=\epsilon_1M+\epsilon_2K,\\K&=\left[\begin{matrix}-c_1l&c_1\\c_2l&c_2\\\end{matrix}\right]\\\end{matrix}\) Eq.3
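As a hedged illustration of Eq. 3, the sketch below assembles the three matrices with NumPy and reports the damping ratios implied by the proportional damping coefficients. It assumes the standard relation for proportional damping, \(\zeta_i=\frac{1}{2}\left(\epsilon_1/\omega_i+\epsilon_2\omega_i\right)\), where \(\omega_i^2\) are the eigenvalues of \(M^{-1}K\). The function names and every numerical value are hypothetical; they are not the parameters of the model tuned in this study.

```python
import numpy as np

def build_matrices(m1, m2, m3, c1, c2, l, eps1, eps2):
    """Assemble M, B and K as written in Eq. 3."""
    M = np.array([[-l * m1, m1 + m3 / 2.0],
                  [l * m2, m2 + m3 / 2.0]])
    K = np.array([[-c1 * l, c1],
                  [c2 * l, c2]])
    B = eps1 * M + eps2 * K  # proportional structural damping
    return M, B, K

def modal_damping(M, K, eps1, eps2):
    """Natural frequencies (rad/s) and damping ratios implied by B = eps1*M + eps2*K."""
    omega_sq = np.linalg.eigvals(np.linalg.solve(M, K)).real
    omega = np.sqrt(np.sort(omega_sq))
    zeta = 0.5 * (eps1 / omega + eps2 * omega)
    return omega, zeta

# Hypothetical values (SI units), chosen only so the modes fall in the 100 to 200 Hz range.
params = dict(m1=1.0e-4, m2=1.0e-4, m3=5.0e-5, c1=60.0, c2=80.0, l=3.0e-3)
eps1, eps2 = 0.0, 1.0e-4

M, B, K = build_matrices(**params, eps1=eps1, eps2=eps2)
omega, zeta = modal_damping(M, K, eps1, eps2)
print("natural frequencies [Hz]:", omega / (2.0 * np.pi))
print("damping ratios:", zeta)
```

Solving the relation above for two target damping ratios gives two linear equations in \(\epsilon_1\) and \(\epsilon_2\), which is one common way such constants are chosen.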
Damping matrix B represents a proportional structural damping model; \(\epsilon_1 \) and \(\epsilon_2 \) are constants adjusted according to the desired damping ratios for the two natural modes of vibration of the system. The structure of the matrices
Incoming airflow velocity U0 and subglottic pressure (Psub) at the entrance to the glottic region (x = 0) are related to pulmonary pressure (Plung). The factor
\(g\left(t\right) \) is the time-varying glottal width. A non-viscous incompressible fluid with the density of air is considered. The three equivalent masses m1, m2, and m3 were calculated from the properties of the vocal fold tissue (density
The Hertz impact force is expressed as \(F_H=k_H\delta^{3/2}\), where \(k_H\) is the contact stiffness and \(\delta\) is the penetration of the vocal cords through the axis of symmetry during collision (see Figure 2a). The penetration is \(\delta=\left[y_{max}-H_0\right]\), where \(H_0=\max_{x\in\left(0,L\right)}a\left(x\right)+g\), and \(x_{max}\) and \(y_{max}\) are obtained from:
\(x_{max}=min\left(L,max\left(0,-\left[V_1\left(t\right)+a_1\right]/a_2\right)\right) \) Eq.4
\(y_{max}=a\left[x_{max}\left(t\right)\right]+\left[x_{max}\left(t\right)-L_1\right]V_1\left(t\right)+V_2\left(t\right) \) Eq.5
where \(a_1\) and \(a_2\) are dimensionless coefficients describing the geometry of the vocal cords. The impact stress (IS) is calculated as the maximum value during one oscillation period according to the Hertz model of impact proposed in [49,50] for vocal-fold collisions:
\(IS\ =\ \frac{3}{2}\frac{F_{H,max}}{\pi a^2},a=\sqrt[3]{\frac{3}{4}\frac{\left(1-v^2\right)r}{E}F_{H,\ max}} \) Eq.6
where \(F_{H,max}=k_H\delta_{max}^{3/2}\). Using equation 15 of the Hertz model applied by [16], we arrive at the following equation that allows us to calculate the impact forces during the collision of the vocal cords
\(F_H\left(t\right)=k_H\left[y_{max}\left(t\right)-H_0\right]^{3/2}\) Eq.7
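Equations 6 and 7 translate directly into code. The sketch below is illustrative only: \(E\) and \(v\) are the values quoted above, whereas the contact stiffness `k_h`, the radius `r`, and the displacement values in the usage lines are hypothetical. Clamping the penetration to zero when there is no contact is an assumption added here so that the force vanishes while the folds are apart.

```python
import math

E = 8.0e3  # Young's modulus in Pa (8 kPa, as quoted in the text)
NU = 0.4   # Poisson's ratio (as quoted in the text)

def hertz_force(y_max, h0, k_h):
    """Eq. 7: Hertz contact force; zero when there is no penetration (assumed clamp)."""
    delta = max(y_max - h0, 0.0)
    return k_h * delta ** 1.5

def impact_stress(f_max, r, e=E, nu=NU):
    """Eq. 6: impact stress computed from the peak Hertz force."""
    if f_max <= 0.0:
        return 0.0
    a = (0.75 * (1.0 - nu ** 2) * r / e * f_max) ** (1.0 / 3.0)  # contact radius
    return 1.5 * f_max / (math.pi * a ** 2)

# Hypothetical inputs (SI units), used only to exercise the two functions.
f_peak = hertz_force(y_max=1.1e-3, h0=1.0e-3, k_h=50.0)
print("peak force [N]:", f_peak)
print("impact stress [Pa]:", impact_stress(f_peak, r=1.0e-3))
```

Because the contact radius grows with \(F_{H,max}^{1/3}\), the impact stress in Eq. 6 scales with \(F_{H,max}^{1/3}\): an eightfold increase in peak force doubles the stress.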
Method
Parametric tuning procedure and algorithm configuration
The path-based search algorithm is commonly employed in theoretical and practical investigations of search metaheuristics [51]. It involves the utilization of environment structures, which encapsulate the notion of proximity or adjacency among alternative problem solutions. The entirety of solutions falls within the environment surrounding the present solution, demarcated by a solution generation operator. Path-based algorithms conduct a localized examination of the search space, scrutinizing the environment encompassing the current solution to determine the course of the search path [52]. Establishing the environment's structure suffices to formulate a generic search algorithm model [51]. To instantiate the algorithm, encoding for the solutions is specified, and a neighbor generation operator is defined, consequently establishing an environment structure for the solutions. Subsequently, a solution is selected from the environment of the current solution until the termination criterion is satisfied [53].
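The generic scheme described above (a solution encoding, a neighbor generation operator, and a termination criterion) can be sketched as follows. This is a toy example under stated assumptions: solutions are encoded as pairs of numbers, the neighborhood is a fixed-step grid, and the cost function is a hypothetical quadratic. None of these choices is taken from the algorithm used in this study.

```python
def path_based_search(initial, neighbours, cost, tol=1e-9, max_steps=1000):
    """Follow a path of improving neighbours until a termination criterion is met."""
    current, current_cost = initial, cost(initial)
    for _ in range(max_steps):
        if current_cost <= tol:  # termination: the solution is good enough
            break
        candidate = min(neighbours(current), key=cost)
        candidate_cost = cost(candidate)
        if candidate_cost >= current_cost:  # termination: no neighbour improves
            break
        current, current_cost = candidate, candidate_cost
    return current, current_cost

def grid_neighbours(point, step=0.5):
    """Hypothetical neighbour operator: move one coordinate by a fixed step."""
    x, y = point
    return [(x + step, y), (x - step, y), (x, y + step), (x, y - step)]

def toy_cost(point):
    """Hypothetical objective with its minimum at (3, -1)."""
    x, y = point
    return (x - 3.0) ** 2 + (y + 1.0) ** 2

best, best_cost = path_based_search((0.0, 0.0), grid_neighbours, toy_cost)
print(best, best_cost)
```

Only the neighbor operator and the cost function are problem specific; the loop itself stays the same, which is the sense in which the environment structure suffices to define the search.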
The parametric tuning of the biomechanical vocal fold model addressed in this study started from the initial parameters validated by Horáček [17]. The configuration of the tuning algorithm is outlined in Diagram 1, which shows the decision-making process within the algorithm.
Diagram 1. Scheme of algorithm employed in a metaheuristic method.
The initial coefficients of the model were derived from the biomechanical parameters described in the preceding section, which presents the model. These coefficients are directly linked to the matrices of the vocal fold mass-spring model, making it possible to modulate the composition of the folds to achieve the desired behavior.
The performance of the tuning process was evaluated using features of the target signal: the period (T), positive maximum amplitude \(A_{max-p}\), negative maximum amplitude \(A_{max-n}\), positive section area \(A_{r-p}\), and negative section area \(A_{r-n}\). The algorithm determines when to conclude the simulation by assessing the discrepancies between the features of the obtained signals and the desired ones. An error margin of at most 10% was established: once every feature falls within it, the algorithm concludes the metaheuristic search; otherwise it updates the data with new combinations, effectively restarting the simulation.
Figure 3 illustrates the workflow followed during the parametric tuning of the biomechanical vocal fold model.
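One possible way to compute these five features and apply the 10% criterion is sketched below. The definitions are assumptions made for illustration: the period is taken as the mean spacing between upward zero crossings, and the section areas are integrated over the whole analysis window by a rectangle rule. The test signals are synthetic sinusoids, not outputs of the vocal fold model.

```python
import numpy as np

def extract_features(signal, dt):
    """Period, peak amplitudes and signed section areas of a sampled signal."""
    signal = np.asarray(signal, dtype=float)
    # Upward zero crossings mark the start of each cycle.
    rising = np.flatnonzero((signal[:-1] < 0.0) & (signal[1:] >= 0.0))
    period = float(np.mean(np.diff(rising))) * dt if rising.size > 1 else float("nan")
    return {
        "T": period,
        "A_max_p": float(signal.max()),
        "A_max_n": float(signal.min()),
        "A_r_p": float(np.clip(signal, 0.0, None).sum() * dt),
        "A_r_n": float(np.clip(signal, None, 0.0).sum() * dt),
    }

def within_margin(obtained, desired, margin=0.10):
    """True when every feature differs from its target by at most `margin` (relative)."""
    return all(abs(obtained[k] - desired[k]) <= margin * abs(desired[k]) for k in desired)

# Synthetic example: a 100 Hz target and a slightly detuned, slightly larger candidate.
dt = 1.0e-5
t = np.arange(0.0, 0.05, dt)
desired = extract_features(np.sin(2.0 * np.pi * 100.0 * t), dt)
obtained = extract_features(1.05 * np.sin(2.0 * np.pi * 102.0 * t), dt)
print(within_margin(obtained, desired))
```

In this sketch a `True` result corresponds to ending the search, and a `False` result corresponds to drawing a new combination of coefficients and simulating again.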
Figure 3. Proposed parametric tuning system diagram.
Results
Demonstration of the parametric synthesis method
This section summarizes the results obtained with the parametric synthesis method for the vocal model. The fine-tuning adjustments yielded behaviors closer to the desired physiological parameters, which supports the effectiveness of the approach in refining the biomechanical phonation model.
The parametric synthesis method for the vocal model is implemented using a metaheuristic search algorithm. The parameters to which the fitting algorithm is applied are the elasticity coefficients c1 and c2 and the damping coefficients ϵ1 and ϵ2. These parameters directly influence the matrices of the vocal fold mass-spring model, making it possible to modulate the composition of the folds to achieve the desired behavior. The resulting behavior of the vocal system for each simulation cycle is then compared to a pre-established standard behavior under normal conditions. Various signal characteristics are computed, including the period (T), positive peak amplitude \(A_{max-p}\), negative peak amplitude \(A_{max-n}\), positive section area \(A_{r-p}\), and negative section area \(A_{r-n}\). The algorithm terminates the simulation based on an evaluation of the discrepancies between the obtained signal characteristics and the desired ones. A predefined error margin of at most 10% was set: once it is met, the algorithm concludes the metaheuristic search; otherwise it updates the data with new combinations, effectively restarting the simulation (refer to Figure 3 for a visual representation).
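The simulate, compare, and update cycle described above can be sketched with a deliberately simple stand-in. Everything in the example is hypothetical: `toy_features` replaces the vocal fold simulation with a closed-form unit-mass oscillator, only two features are compared, and the update rule (trying every combination of three scale factors and keeping the best) is an illustrative choice rather than the rule used in this study.

```python
import math

def toy_features(c, eps):
    """Hypothetical stand-in for one simulation run (unit-mass oscillator)."""
    omega = math.sqrt(c)
    return {"T": 2.0 * math.pi / omega, "A": 1.0 / (1.0 + eps * omega)}

def max_rel_error(obtained, desired):
    """Largest relative discrepancy across the compared features."""
    return max(abs(obtained[k] - desired[k]) / abs(desired[k]) for k in desired)

def tune(c, eps, desired, margin=0.10, max_iter=10_000):
    """Repeat simulate-compare-update until every feature is within the margin."""
    scales = (0.95, 1.0, 1.05)
    for iteration in range(max_iter):
        error = max_rel_error(toy_features(c, eps), desired)
        if error <= margin:  # stopping criterion: error of 10% or less
            return c, eps, error, iteration
        # Otherwise evaluate new combinations and continue from the best one.
        trials = [(c * sc, eps * se) for sc in scales for se in scales]
        c, eps = min(trials, key=lambda p: max_rel_error(toy_features(*p), desired))
    raise RuntimeError("no combination met the error margin")

target = toy_features(4.0e5, 1.0e-4)  # hypothetical desired behaviour
c_fit, eps_fit, err_fit, n_iter = tune(1.6e6, 1.0e-5, target)
print(c_fit, eps_fit, err_fit, n_iter)
```

Because the loop stops as soon as the margin is met, the returned coefficients are acceptable rather than optimal, which mirrors the threshold-based stopping rule described above.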
The algorithm was halted after approximately 360,000 iterations over possible combinations, once the coefficients of the model had been refined. The fine-tuning of the model is manifested in the vertical displacements of the masses m1 and m2. The obtained signals W1 and W2 closely align with the desired signals W1D and W2D, particularly between samples 340,000 and 360,000, as illustrated in Figure 4.
Figure 4. Procedure for adjusting the vertical displacements of equivalent masses m1 and m2; comparing the obtained signals W1 and W2 vs. the desired signals W1D and W2D.
The iterations of this algorithm were terminated upon reaching an error threshold of 10% or less for the evaluated features in the model signals. Table 1 presents the parametric modifications that the input variables underwent throughout the simulation. It is worth noting that the parameter ϵ1 remained unchanged during tuning, as dictated by the inherent nature of the mechanical model, wherein the dynamics primarily center around the forces generated by the viscous components rather than the mass of the fold.
Table 1. Tuning of parameters of the mechanical model of vocal folds.
Samples | c1 | c2 | ϵ1 | ϵ2 |
---|---|---|---|---|
0 | 1.92E+13 | 4.50E+10 | 0.00 | 1.00E-04 |
100000 | 5.05E+12 | 1.18E+10 | 0.00 | 3.80E-04 |
200000 | 2.21E+12 | 5.18E+09 | 0.00 | 8.70E-04 |
300000 | 1.23E+12 | 2.89E+09 | 0.00 | 1.56E-03 |
310000 | 1.17E+12 | 2.75E+09 | 0.00 | 1.64E-03 |
320000 | 1.11E+12 | 2.62E+09 | 0.00 | 1.72E-03 |
330000 | 1.06E+12 | 2.49E+09 | 0.00 | 1.81E-03 |
340000 | 1.01E+12 | 2.38E+09 | 0.00 | 1.89E-03 |
350000 | 9.70E+11 | 2.27E+09 | 0.00 | 1.98E-03 |
360000 | 1.93E+12 | 4.54E+09 | 0.00 | 3.31E-04 |
Under similar conditions, it is important to observe that the simulation was concluded once errors had reached permissible thresholds (refer to Figure 5).
Figure 5. Error signals of the characteristics extracted from the desired signals WD vs. the obtained signals W.
In terms of processing time, fine-tuning took approximately 3.6 to 5 seconds on an Intel(R) Core(TM) i5-10210U CPU @ 2.11 GHz machine. A fixed simulation step factor of 0.00001 was employed, considering that oscillations typically range between 100 and 200 Hz. The parametric adjustment made it possible to obtain the interaction forces exciting the vocal model. Figure 6a illustrates the excitation force curves of the mechanical vocal model in the final moments of the simulation, where the parameters approach their minimum error. Starting from sample 350,000, the forces exhibit sinusoidal coupling and adapt to the parameter values obtained at that iteration.
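As a quick check of the step size quoted above, and assuming that the step of 0.00001 is expressed in seconds (the unit is not stated), the number of samples per oscillation period at frequency \(f\) is:
\(N=\frac{1}{f\,\Delta t},\quad N_{100\,\mathrm{Hz}}=\frac{1}{100\times10^{-5}}=1000,\quad N_{200\,\mathrm{Hz}}=\frac{1}{200\times10^{-5}}=500\)
Under that assumption, each oscillation cycle in the 100 to 200 Hz range is resolved by 500 to 1000 samples.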
Figure 6. Aerodynamic forces curves for the vocal model.
In Figure 6b, aerodynamic forces (including impact forces and pressure variations between vocal folds) are presented, simulated using coefficient values derived from parametric tuning. The simulation employs a flow velocity of 1.6, yielding intraglottic pressures reaching as high as 0.46 [17]. This behavior exhibits fluctuations contingent on the opening or closing of the vocal folds.
Finally, Figure 7 displays the impact force curves computed using equations 3 and 4, incorporating the fine-tuned coefficients. We also present the curves for \(\delta(t)\) (the vocal fold penetration obtained when applying equation 7). Note that \(y_{max}(t)>H_0\), which indicates impact forces between the two vocal folds. Through parametric tuning of the model, we were able to attain impact forces of up to 2.5 and displacements of up to 1.
Figure 7. Impact force curves and deltas.
Discussion
In this study, a parametric tuning approach using metaheuristic methods was implemented to refine a vocal production model. This method allowed for the reconstruction of an approximate physical behavior of the vocal folds, enabling the estimation of impact stress resulting from the interaction forces between them. The tuning parameters, namely the damping and spring coefficients, were systematically adjusted. Additionally, stopping criteria were established by comparing features of the desired and obtained signals, with an error margin of at most 10%.
As indicated by previous research [40], the effectiveness of tuning methodologies is intricately linked to the complexity of the model. Furthermore, achieving an optimal tuning hinges on the distinct vocal characteristics exhibited by male and female speakers, warranting tailored parameter variations [40]. While our results were validated with input parameters derived from a male speaker, it is crucial to acknowledge that gender-based differences in vocal frequencies may necessitate distinct tuning approaches.
The technique employed in this study addresses the estimation of impact forces through a parametric adjustment of the vocal characteristics within the Hertz model, as examined by previous works [16]. This approach stands out for its organized exploration and exploitation of parametric targets, providing an opportunity to objectively quantify the biomechanical attributes of vocal folds [39], with a specific focus on acquiring data pertaining to impact forces. This aligns with the methodologies of global and local optimization algorithms, similar to those adapted in comparable studies, which have proven effective in yielding suitable results across diverse synthetic datasets [38]. However, further research is warranted to establish a comprehensive correspondence between the biomechanical parameters of vocal folds and their respective vibrational modes.
The sequential exploration of potential values to achieve optimal results, which effectively traverses the search space, proves to be invaluable in addressing complex issues such as tuning a vocal production model. Although the results obtained through this technique are deemed adequate, refinement could be achieved by incorporating clinical measurements as a reference, given that the desired conditions were predominantly derived from existing literature [54].
The performance of the tuning methodology utilizing metaheuristic methods was deemed adequate in attaining the desired behavior, with consistent results. However, it is imperative to underscore that the tuning conditions could be further refined by giving greater weight to clinical measurements as a reference point, enabling the algorithm to emulate a vocal system behavior that more closely mirrors natural or clinically established parameters [55].
Finally, the proposed technique lays the foundation for potential extensions to include a wider array of search algorithms, with the aim of enhancing error conditions and streamlining processing resources.
Limitations and Future Directions
The parametric approach utilized, while robust, may not encapsulate the entirety of complexity and variability in vocal production, particularly in clinical scenarios. Additionally, while extensive efforts were made in tuning, other biomechanical and physiological factors may influence vocal fold behavior that were not accounted for in this model. Furthermore, the results are based on data from a male speaker, which may limit their generalizability to other genders or vocal profiles. Future research should aim to address variability in vocal production among different populations, including female speakers and individuals with diverse vocal characteristics.
Moving forward, exploring multi-objective approaches and integrating machine learning techniques may further enhance the precision and clinical applicability of biomechanical vocal fold models. Additionally, incorporating clinical data and direct measurements into the parametric tuning process could provide a more robust and specific foundation for assessing patients with vocal disorders. Expanding this approach to three-dimensional models and considering more complex interactions between vocal folds and other components of the vocal tract may offer a more comprehensive and accurate representation of vocal production in clinical contexts. Furthermore, experimental validation of the results through clinical trials and comparisons with direct clinical measurements would be crucial to confirm the utility and accuracy of this approach in diagnostic and treatment contexts for vocal disorders. This research avenue holds promise for advancing clinical practice in the field of phonation and voice, enabling a more personalized and effective approach to treating patients with vocal disorders.
Conclusion
This study highlights the effectiveness of metaheuristic methods in the parametric adjustment of vocal production modeling. The fine-tuned model achieves an accurate reproduction of vocal fold behavior, allowing for precise estimation of impact stress during phonation. The results demonstrate a significant improvement in vocal behavior representation, affirming the utility of this methodology. Despite the potential for further improvement through clinical measurements, this approach establishes a solid foundation for future research in vocal biomechanics. Multi-objective optimization, the integration of machine learning techniques, and the investigation of complex vocal fold interactions present exciting avenues for future exploration. Experimental validation and clinical comparison will further enhance the practical utility of the method. This study constitutes a significant step towards more precise vocal modeling and its clinical application.
References
1. Zhang Y, Zheng X, Xue Q. A Deep Neural Network Based Glottal Flow Model for Predicting Fluid-Structure Interactions during Voice Production. Appl Sci [Internet]. 2020 Jan 19;10(2):1-18. doi: https://doi.org/10.3390/app10020705
2. Titze IR. Nonlinear source-filter coupling in phonation: Theory. J Acoust Soc Am [Internet]. 2008;123(5):2733-49. doi: https://doi.org/10.1121/1.2832337
3. Hunter EJ, Titze IR, Alipour F. A three-dimensional model of vocal fold abduction/adduction. J Acoust Soc Am [Internet]. 2004;115(4):1747-59. doi: https://doi.org/10.1121/1.1652033
4. Story BH. An overview of the physiology, physics and modeling of the sound source for vowels. Acoust Sci Technol [Internet]. 2002;23(4):195-206. doi: https://doi.org/10.1250/ast.23.195
5. Alipour F, Vigmostad S. Measurement of vocal folds elastic properties for continuum modeling. J Voice [Internet]. 2012;26(6):816.e21-816.e29. doi: https://doi.org/10.1016/j.jvoice.2012.04.010
6. Berry DA, Zhang Z, Neubauer J. Mechanisms of irregular vibration in a physical model of the vocal folds. J Acoust Soc Am [Internet]. 2006;120(3):EL36-42. doi: https://doi.org/10.1121/1.2234519
7. Delebecque L, Pelorson X, Beautemps D. Modeling of aerodynamic interaction between vocal folds and vocal tract during production of a vowel-voiceless plosive-vowel sequence. J Acoust Soc Am [Internet]. 2016;139(1):350-60. doi: https://doi.org/10.1121/1.4939115
8. Šidlof P, Švec JG, Horáček J, Veselý J, Klepáček I, Havlík R. Geometry of human vocal folds and glottal channel for mathematical and biomechanical modeling of voice production. J Biomech [Internet]. 2008;41(5):985-95. doi: https://doi.org/10.1016/j.jbiomech.2007.12.016
9. Zhang Z. Cause-effect relationship between vocal fold physiology and voice production in a three-dimensional phonation model. J Acoust Soc Am [Internet]. 2016;139(4):1493-507. doi: https://doi.org/10.1121/1.4944754
10. Calvache C, Solaque L, Velasco A, Peñuela L. Biomechanical Models to Represent Vocal Physiology: A Systematic Review. J Voice [Internet]. 2023;37(3):465.e1-465.e18. doi: https://doi.org/10.1016/j.jvoice.2021.02.014
11. Story BH, Titze IR. Voice simulation with a body-cover model of the vocal folds. J Acoust Soc Am [Internet]. 1995;97(2):1249-60. doi: https://doi.org/10.1121/1.412234
12. Sadeghi H, Kniesburges S, Kaltenbacher M, Schützenberger A, Döllinger M. Computational Models of Laryngeal Aerodynamics: Potentials and Numerical Costs. J Voice [Internet]. 2019;33(4):385-400. doi: https://doi.org/10.1016/j.jvoice.2018.01.001
13. Erath BD, Zañartu M, Peterson SD. Modeling viscous dissipation during vocal fold contact: the influence of tissue viscosity and thickness with implications for hydration. Biomech Model Mechanobiol [Internet]. 2017;16(3):947-60. doi: https://doi.org/10.1007/s10237-016-0863-5
14. Espinoza VM, Zañartu M, Van Stan JH, Mehta DD, Hillman RE. Glottal aerodynamic measures in women with phonotraumatic and nonphonotraumatic vocal hyperfunction. J Speech Lang Hear Res [Internet]. 2017;60(8):2159-69. doi: https://doi.org/10.1044/2017_JSLHR-S-16-0337
15. Hadwin PJ, Galindo GE, Daun KJ, Zañartu M, Erath BD, Cataldo E, et al. Non-stationary Bayesian estimation of parameters from a body cover model of the vocal folds. J Acoust Soc Am [Internet]. 2016;139(5):2683-96. doi: https://doi.org/10.1121/1.4948755
16. Horáček J, Laukkanen AM, Šidlof P. Estimation of impact stress using an aeroelastic model of voice production. Logop Phoniatr Vocology [Internet]. 2007;32(4):185-92. doi: https://doi.org/10.1080/14015430600628039
17. Horacek J, Laukkanen A-M, Sidlof P, Murphy P, Svec JG. Comparison of Acceleration and Impact Stress as Possible Loading Factors in Phonation: A Computer Modeling Study. Folia Phoniatr Logop [Internet]. 2009;61(3):137-45. doi: https://doi.org/10.1159/000219949
18. Hillman RE, Stepp CE, Van Stan JH, Zañartu M, Mehta DD. An updated theoretical framework for vocal hyperfunction. Am J Speech-Language Pathol [Internet]. 2020;29(4):2254-60. doi: https://doi.org/10.1044/2020_AJSLP-20-00104
19. Cortés JP, Espinoza VM, Ghassemi M, Mehta DD, Van Stan JH, Hillman RE, et al. Ambulatory assessment of phonotraumatic vocal hyperfunction using glottal airflow measures estimated from neck-surface acceleration. PLoS One [Internet]. 2018;13(12):1-23. doi: https://doi.org/10.1371/journal.pone.0209017
20. Schwarz R, Huttner B, Döllinger M, Luegmair G, Eysholdt U, Schuster M, et al. Substitute Voice Production: Quantification of PE Segment Vibrations Using a Biomechanical Model. IEEE Trans Biomed Eng [Internet]. 2011;58(10):2767-76. doi: https://doi.org/10.1109/tbme.2011.2151860
21. Šidlof P, Zörner S, Hüppe A. A hybrid approach to the computational aeroacoustics of human voice production. Biomech Model Mechanobiol [Internet]. 2015;14(3):473-88. doi: https://doi.org/10.1007/s10237-014-0617-1
22. Neumaier A. Complete search in continuous global optimization and constraint satisfaction. Acta Numer [Internet]. 2004;13:271-369. doi: https://doi.org/10.1017/s0962492904000194
23. Elaziz MA, Elsheikh AH, Oliva D, Abualigah L, Lu S, Ewees AA. Advanced Metaheuristic Techniques for Mechanical Design Problems: Review. Arch Comput Methods Eng [Internet]. 2021;29:695-716. doi: https://doi.org/10.1007/s11831-021-09589-4
24. Li H-L, Chang C-T, Tsai J-F. Approximately global optimization for assortment problems using piecewise linearization techniques. Eur J Oper Res [Internet]. 2002;140(3):584-9. doi: https://doi.org/10.1016/s0377-2217(01)00194-1
25. Pinkevich, V. Oppacher, F. Platunov, A. Model-driven functional testing of cyber-physical systems using deterministic replay techniques. In: 2018 IEEE Industrial Cyber-Physical Systems (ICPS) [Internet]. 2018 May 15-18; Saint Petersburg, Russia: IEEE; 2018. p. 141-6. doi: https://doi.org/10.1109/icphys.2018.8387650
26. Tsai J-F. Global optimization for signomial discrete programming problems in engineering design. Eng Optim [Internet]. 2010;42(9):833-43. doi: https://doi.org/10.1080/03052150903456485
27. Lin M-H, Tsai J-F, Yu C-S. A Review of Deterministic Optimization Methods in Engineering and Management. Math Probl Eng [Internet]. 2012;2012:1-15. doi: https://doi.org/10.1155/2012/756023
28. Ecker JG, Kupferschmid M, Lawrence CE, Reilly AA, Scott ACH. An application of nonlinear optimization in molecular biology. Eur J Oper Res [Internet]. 2002;138(2):452-8. doi: https://doi.org/10.1016/s0377-2217(01)00122-9
29. Bertuzzi A, Conte F, Papa F, Sinisgalli C. Applications of Nonlinear Programming to the Optimization of Fractionated Protocols in Cancer Radiotherapy. Information [Internet]. 2020;11(6):1-24. doi: https://doi.org/10.3390/info11060313
30. Jiang H, Olleta B, Chen D, Geiger R. Parameter optimization of deterministic dynamic element matching DACs for accurate and cost-effective ADC testing. In: Proceedings of 2004 International Symposium on Circuits and Systems [Internet]. 2004 May 23-26; Vancouver, Canada: IEEE; 2004. p. 924-927. doi: https://doi.org/10.1109/iscas.2004.1328347
31. Tameemi AQ. Fusion-Based Deterministic and Stochastic Parameters Estimation for a Lithium-Polymer Battery Model. IEEE Access [Internet]. 2020;8:193005-19. doi: https://doi.org/10.1109/access.2020.3033497
32. Fang J, Lin S, Xu Z. Learning Through Deterministic Assignment of Hidden Parameters. IEEE Trans Cybern [Internet]. 2020 May;50(5):2321-34. doi: https://doi.org/10.1109/tcyb.2018.2885029
33. Bandaru S, Deb K. Metaheuristic Techniques. In: Decision Sciences [Internet]. Boca Raton: CRC Press; 2016. p. 693-750. doi: https://doi.org/10.1201/9781315183176
34. Worch E, Samiappan S, Zhou M, Ball JE. Hyperspectral Band Selection Using Moth-Flame Metaheuristic Optimization. In: IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 [Internet]. 2020 Sep 26-Oct 2; Waikoloa, USA: IEEE; 2020. p. 1271-4 doi: https://doi.org/10.1109/igarss39084.2020.9323754
35. Altay EV, Alatas B. Music based metaheuristic methods for constrained optimization. In: Varol A, Karabatak M, Varol C, editors. 6th International Symposium on Digital Forensic and Security Proceedings [Internet]. 2018 Mar 22-25; Antalya, Turkey: IEEE; 2018. p. 222-7. doi: https://doi.org/10.1109/isdfs.2018.8355355
36. Kurniasih J, Utami E, Raharjo S. Heuristics and Metaheuristics Approach for Query Optimization Using Genetics and Memetics Algorithm. In: 2019 1st International Conference on Cybernetics and Intelligent System (ICORIS) [Internet]. 2019 Aug 22-23; Bali, Indonesia: Institut Teknologi dan Bisnis (ITB); 2019. p. 168-72. doi: https://doi.org/10.1109/icoris.2019.8874909
37. Gokalp O, Ugur A. An order based hybrid metaheuristic algorithm for solving optimization problems. In: 2nd International Conference on Computer Science and Engineering (UBMK) [Internet]. 2017 Oct 5-8; Antalya, Turkey: IEEE; 2017. p. 604-9. doi: https://doi.org/10.1109/ubmk.2017.8093477
38. Yang A, Stingl M, Berry DA, Lohscheller J, Voigt D, Eysholdt U, et al. Computation of physiological human vocal fold parameters by mathematical optimization of a biomechanical model. J Acoust Soc Am [Internet]. 2011;130(2):948-64. doi: https://doi.org/10.1121/1.3605551
39. Prom-on S, Birkholz P, Xu Y. Estimating vocal tract shapes of Thai vowels from contextual vowel variation. In: 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA) [Internet]. 2014 Sep 10-12; Phuket, Thailand: IEEE; 2014. p. 1-6. doi: https://doi.org/10.1109/icsda.2014.7051442
40. Dognin P, El-Jaroudi A, Billa J. Parameter optimization for vocal tract length normalization. In: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings (Cat No00CH37100). Vol 3. [Internet]. 2000 Jun 5-9; Istanbul, Turkey: IEEE; 2000. p. 1767-70. doi: https://doi.org/10.1109/icassp.2000.862095
41. Laprie Y, Mathieu B. A variational approach for estimating vocal tract shapes from the speech signal. In: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP ‘98 (Cat No98CH36181). Vol 2. [Internet]. 1998 May 12-15; Seattle, USA: IEEE; 1998. p. 929-32. doi: https://doi.org/10.1109/icassp.1998.675418
42. Ding W, Campbell N, Higuchi N, Kasuya H. Fast and robust joint estimation of vocal tract and voice source parameters. In: 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol 2. [Internet]. 1997 Apr 21-24; Munich, Germany: IEEE; 1997. p. 1291-4. doi: https://doi.org/10.1109/icassp.1997.596182
43. Villalba Fernández de Castro G, Saldarriaga GJ. Algoritmos de Optimización Combinatoria (AOC) aplicados al diseño de redes de distribución de agua potable. Rev Ing [Internet]. 2005;(22):118-25. Available from: https://ojsrevistaing.uniandes.edu.co/ojs/index.php/revista/article/view/393
44. Bernstein DJ. Understanding brute force. [Internet]. 2005 Available from: https://cr.yp.to/snuffle/bruteforce-20050425.pdf
45. Mohammad A, Saleh O, Abdeen RA. Occurrences Algorithm for String Searching Based on Brute-force Algorithm. J Comput Sci [Internet]. 2006;2(1):82-5. Available from: https://thescipub.com/abstract/jcssp.2006.82.85
46. Rao SS. Metaheuristic Optimization Methods. In: Engineering Optimization Theory and Practice [Internet]. New York: Wiley; 2019. p. 673-95. doi: https://doi.org/10.1002/9781119454816.ch14
47. Radhika S, Chaparala A. Optimization using evolutionary metaheuristic techniques: a brief review. Brazilian J Oper & Prod Manag [Internet]. 2018;15(1):44-53. doi: https://doi.org/10.14488/bjopm.2018.v15.n1.a17
48. Horáček J, Šidlof P, Švec JG. Numerical simulation of self-oscillations of human vocal folds with Hertz model of impact forces. J Fluids Struct. 2005;20(6):853-69. doi: https://doi.org/10.1016/j.jfluidstructs.2005.05.003
49. Stronge WJ. Impact Mechanics [Internet]. Cambridge: Cambridge University Press; 2000. 280 p. doi: https://doi.org/10.1017/cbo9780511626432
50. Půst L, Peterka F. Impact oscillator with Hertz’s model of contact. Meccanica [Internet]. 2003;38(1):99-116. doi: https://doi.org/10.1023/a:1022075519038
51. Suman B, Kumar P. A survey of simulated annealing as a tool for single and multiobjective optimization. J Oper Res Soc [Internet]. 2006;57(10):1143-60. doi: https://doi.org/10.1057/palgrave.jors.2602068
52. Caballero-Villalobos JP, Alvarado-Valencia JA. Greedy Randomized Adaptive Search Procedure (GRASP), una alternativa valiosa en la minimización de la tardanza total ponderada en una máquina. Ing y Univ [Internet]. 2010;14(2):275-95. Available from: http://www.scielo.org.co/scielo.php?script=sci_arttext&pid=S0123-21262010000200004&nrm=iso
53. Hoos H, Sttzle T. Stochastic Local Search: Foundations & Applications. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.; 2004. 658 p.
54. Palaparthi A, Riede T, Titze IR. Combining Multiobjective Optimization and Cluster Analysis to Study Vocal Fold Functional Morphology. IEEE Trans Biomed Eng [Internet]. 2014;61(7):2199-208. doi: https://doi.org/10.1109/TBME.2014.2319194
55. Idrisoglu A, Dallora AL, Anderberg P, Berglund JS. Applied Machine Learning Techniques to Diagnose Voice-Affecting Conditions and Disorders: Systematic Literature Review. J Med Internet Res [Internet]. 2023 Jul;25:e46105. doi: https://doi.org/10.2196/46105