
Please use this identifier to cite or link to this item: http://ir.lib.stu.edu.tw:80/ir/handle/310903100/1308

Title: 利用聽覺小波轉換於強健性語音活動偵測演算法
ROBUST VOICE ACTIVITY DETECTION ALGORITHM BASED ON THE PERCEPTUAL WAVELET PACKET TRANSFORM
Authors: 吳信德
Hsin-Te Wu
Contributors: 陳璽煌
Shi-Huang Chen
Department of Computer Science and Information Engineering
Keywords: 語音活動偵測;適應性多重位元率編碼器;小波轉換;聽覺小波封包轉換
Voice Activity Detection; Adaptive Multi-Rate Codec; Wavelet Transform; Perceptual Wavelet Packet Transform
Date: 2006
Issue Date: 2011-05-24 15:12:15 (UTC+8)
Publisher: Kaohsiung City: [Department of Computer Science and Information Engineering, Shu-Te University]
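Abstract: This thesis improves the voice activity detection (VAD) module of the Adaptive Multi-Rate (AMR) speech codec. Most current speech codecs include a VAD module that distinguishes speech segments from non-speech segments. The benefit of a VAD module is that it supports discontinuous transmission (DTX), which reduces handset battery consumption.
The VAD module of the AMR codec uses several mechanisms to reach its decision, such as background noise estimation and channel energy estimation, but most of these mechanisms rely on preset threshold values: if the computed parameter exceeds the threshold, the segment is judged to be speech; otherwise it is judged to be non-speech.
In low-noise environments, codecs of this kind perform well. In high-noise environments, however, the background noise energy varies widely, and detecting speech and non-speech segments with preset thresholds leads to misclassification and poor performance.
To address these problems, this thesis proposes a robust VAD algorithm based on the perceptual wavelet packet transform to remedy the shortcomings of the VAD module in the AMR codec. The proposed algorithm computes an adaptive VAD threshold for each environment and obtains good results under different noise conditions. The experimental results show that the proposed VAD algorithm outperforms both the VAD module of the AMR codec and G.729B.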
This thesis presents a voice activity detection (VAD) algorithm for the Adaptive Multi-Rate (AMR) codec. VAD refers to the ability to distinguish speech from noise and is required in a variety of speech processing systems. For example, most speech coders, e.g. GSM and AMR, include a VAD module. The VAD module can also improve power efficiency and reduce radiated emissions through discontinuous transmission (DTX).
The VAD module of AMR uses a set of methods to distinguish speech from noise, including background noise estimation, channel energy estimation, and channel SNR estimation. These approaches rely on pre-defined threshold values, which are not appropriate for noisy environments, because it is difficult to derive a fixed threshold value that gives accurate VAD under variable pronunciation conditions. Furthermore, the threshold values used in some traditional VAD algorithms are calculated in silence intervals and are improper for noisy conditions. A robust VAD should therefore use time-varying threshold values to achieve better performance.
This thesis presents a new VAD algorithm that overcomes the above problems and yields threshold values with consistent accuracy. It is shown that the adaptive weighted threshold (AWT) is a robust threshold value for VAD under various noisy environments. One advantage of the new algorithm is that pre-defined threshold values are not necessary. In addition, the proposed algorithm can adapt the VAD threshold value to variable speech conditions. Experimental results show that the proposed VAD algorithm outperforms G.729B and the VAD of AMR.
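The contrast the abstract draws between a pre-defined threshold and a time-varying one can be illustrated with a minimal sketch. It is not the thesis's adaptive weighted threshold (AWT), whose definition is not given in this record, and it does not use the perceptual wavelet packet transform: it assumes plain frame log energy as the feature and a simple tracked noise floor plus a margin as the time-varying threshold. All function names, parameter values, and the synthetic test signal are hypothetical.

```python
import math
import random


def frame_energies_db(samples, frame_len=160):
    """Return the log energy (dB) of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def vad_fixed(energies_db, threshold_db):
    """Fixed-threshold rule: speech if the frame energy exceeds the threshold."""
    return [e > threshold_db for e in energies_db]


def vad_adaptive(energies_db, margin_db=6.0, alpha=0.9):
    """Time-varying threshold: tracked noise floor plus a margin (illustrative)."""
    noise_floor = energies_db[0]  # assumption: the first frame contains noise only
    decisions = []
    for e in energies_db:
        is_speech = e > noise_floor + margin_db
        if not is_speech:
            # Update the noise estimate only on frames judged non-speech.
            noise_floor = alpha * noise_floor + (1.0 - alpha) * e
        decisions.append(is_speech)
    return decisions


def synth(noise_std, seed=0, frame_len=160):
    """Hypothetical signal: 10 noise frames, 10 'speech' frames, 10 noise frames."""
    rng = random.Random(seed)
    samples = []
    for frame in range(30):
        for n in range(frame_len):
            x = rng.gauss(0.0, noise_std)
            if 10 <= frame < 20:
                # Stand-in for speech: a 200 Hz tone sampled at 8 kHz.
                x += 0.5 * math.sin(2 * math.pi * 200 * n / 8000)
            samples.append(x)
    return samples


truth = [10 <= i < 20 for i in range(30)]
quiet = frame_energies_db(synth(noise_std=0.01))  # low-noise condition
noisy = frame_energies_db(synth(noise_std=0.1))   # high-noise condition

# A threshold tuned for the quiet condition is reused unchanged in noise.
fixed_quiet = vad_fixed(quiet, threshold_db=-30.0)
fixed_noisy = vad_fixed(noisy, threshold_db=-30.0)
adaptive_quiet = vad_adaptive(quiet)
adaptive_noisy = vad_adaptive(noisy)
```

With these assumed settings, the fixed threshold chosen for the quiet signal should label every frame of the noisy signal as speech, because the noise energy alone already exceeds it, while the tracked threshold follows the noise level in both conditions. The sketch only shows why a threshold that moves with the noise estimate is less sensitive to the environment; the thesis's own method may differ in every detail.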
Appears in Collections: [Department of Computer Science and Information Engineering] Theses and Dissertations

Files in This Item:

File: 利用聽覺小波轉換於強健性語音活動偵測演算法__臺灣博碩士論文知識加值系統.htm
Description: 國圖 (National Central Library)
Size: 99Kb
Format: HTML


All items in STUAIR are protected by copyright, with all rights reserved.

 


