Value of Mel frequency cepstrum coefficient in voice analysis before and after vocal polyp surgery

    Abstract:

    Objective Mel frequency cepstrum coefficients (MFCC) are widely used in speech recognition, but their use in clinical voice analysis has not previously been reported in China or abroad. This study extracted MFCC features from patients' voices to assess their clinical value in voice analysis before and after vocal polyp surgery.

    Methods We retrospectively analyzed 41 patients who underwent vocal polyp surgery at our hospital between January 2018 and August 2019 and had voice evaluations both before and 1 month after surgery (31 males, 10 females; mean age 42.9±11.4 years). A further 21 normal subjects with neither hoarseness nor vocal cord lesions served as baseline controls. MFCC features were extracted with librosa, a Python audio-processing package; for each patient, the MFCC mean, variance, and standard deviation were computed. Paired-sample t-tests were used to compare each feature before and after surgery.

    Results The MFCC mean, variance, and standard deviation all decreased significantly after surgery (preoperative vs postoperative: mean 6.81±2.05 vs 1.25±1.01, t=18.596, P<0.001; variance 1019.66±295.87 vs 561.34±154.98, t=10.338, P<0.001; standard deviation 34.37±6.63 vs 21.74±4.03, t=11.852, P<0.001). One month after surgery, none of the three features differed significantly between the patients and the normal subjects, indicating that voice recovered well in the great majority of patients.

    Conclusions This study is the first to explore the value of MFCC in voice analysis before and after vocal polyp surgery. MFCC features can serve as indicators for evaluating voice recovery after vocal polyp surgery.
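The abstract names librosa and a paired-sample t-test but gives no extraction settings, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. The function names (`extract_mfcc`, `summarize_mfcc`, `paired_t`), the choice of 13 coefficients, the pooling of all coefficients across frames, and the use of population variance are all assumptions made for illustration.

```python
import math
import statistics


def extract_mfcc(path, n_mfcc=13):
    """Load one recording and return its MFCC matrix (n_mfcc x frames).

    n_mfcc=13 is an assumed setting; the abstract does not state one.
    """
    # Imported lazily so the statistics helpers below run without librosa.
    import librosa

    signal, sample_rate = librosa.load(path, sr=None)  # keep native sample rate
    return librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)


def summarize_mfcc(mfcc):
    """Pool every coefficient of every frame; return (mean, variance, std).

    Pooling and population variance are assumptions, not the paper's stated method.
    """
    values = [float(v) for row in mfcc for v in row]
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values, mu=mean)
    return mean, variance, math.sqrt(variance)


def paired_t(before, after):
    """Return the paired-sample t statistic and degrees of freedom."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


if __name__ == "__main__":
    # Toy numbers for illustration only; these are not data from the study.
    pre_means = [6.0, 7.0, 8.0]
    post_means = [1.0, 1.0, 2.0]
    t_value, dof = paired_t(pre_means, post_means)
    print(f"t = {t_value:.3f}, df = {dof}")
```

In practice, each patient's preoperative and postoperative recordings would be passed through `extract_mfcc` and `summarize_mfcc`, and the resulting per-patient values compared with `paired_t`. If SciPy is available, `scipy.stats.ttest_rel` returns the same statistic together with a p-value.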

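Cite this article

刘茉, 葛鑫颖, 赵晓畅, 郝青青, 李祖飞. Value of Mel frequency cepstrum coefficient in voice analysis before and after vocal polyp surgery[J]. Chinese Journal of Otorhinolaryngology-Skull Base Surgery, 2024, 30(2):102-105.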

History
  • Received: 2023-02-26
  • Published online: 2024-05-08