

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/63019


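    Title: 基於壓縮感測於語音增強及盲訊號源分離之研究; A Study on Speech Enhancement and Blind Source Separation Using Compressive Sensing
    Author: 王家慶
    Contributor: Department of Computer Science and Information Engineering, National Central University
    Keywords: Information science; Software
    Date: 2013-12-01
    Upload time: 2014-03-17 14:16:55 (UTC+8)
    Publisher: National Science Council, Executive Yuan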
    Abstract: Research period: 10208~10307 (ROC calendar: August 2013 to July 2014).

    Compressive sensing is an active research topic internationally. It can efficiently reconstruct the original signal from only a subset of its samples, so the sampling rate can be lower than the Nyquist rate. Because compressive sensing is still a new field, little speech processing research has been based on it. The goal of this project is to propose new speech enhancement and blind source separation methods using compressive sensing. The project proceeds in three steps: developing a speech enhancement technique based on compressive sensing, developing a blind source separation technique based on compressive sensing, and improving both techniques.

    The first step is to develop the speech enhancement technique based on compressive sensing. First, we will establish a sparse representation model for the speech signal. Working with the frame-based power spectrum, we can train an overcomplete dictionary by iterating between sparse coding and dictionary updating. This overcomplete dictionary is then used to perform compressive sensing. For noisy speech, we develop a missing data mask technique, which includes non-stationary noise estimation and SNR estimation at each time-frequency point of the noisy speech. Applying the missing data mask yields the reliable time-frequency points. With the overcomplete dictionary, the enhanced speech spectrum is then reconstructed using compressive sensing and an adjustment procedure.

    The second step is to develop the blind source separation technique based on compressive sensing. In a microphone array, speech sources from different channels are mixed. This project develops suitable spectrum-related features so that the different speech sources can be separated. The extracted features are clustered, and outliers are removed, using the expectation-maximization (EM) algorithm. In each frequency bin, each speech source is extracted with the corresponding mask. The permutation problem in blind source separation is solved by estimating the direction of arrival (DOA) of each speech source. This yields the partial spectrum of each speech source. With the overcomplete dictionary, the whole spectrum of each speech source is then reconstructed by compressive sensing.

    The third step is to improve the previous two techniques. In the speech enhancement step above, the overcomplete dictionary is fixed and does not take the properties of the speech currently being processed into account. We will therefore develop an adaptive dictionary that allows the processed speech to be represented more sparsely. Because the dictionary cannot reconstruct the noise, the unreconstructed part of the noisy speech can be used as a noise estimate. This method overcomes the drawback of noise overestimation. For blind source separation, this project will develop multi-stage compressive sensing. For each time-frequency point generated by compressive sensing, a confidence measure is constructed by checking whether its total power is similar to that of the mixed speech. Time-frequency points with a high confidence measure are also used as reliable measurements for the next stage of compressive sensing. Furthermore, this project will develop a multi-range selection technique for compressive sensing in blind source separation. This technique provides more flexibility and more measurements by training multi-frame and multi-frequency-band dictionaries.

    This project spans a total of three years. The main objectives of the first year are:
    1. To develop the sparse representation model for the speech signal.
    2. To build a proper overcomplete dictionary from a speech database.
    3. To develop a non-stationary noise estimation technique.
    4. To develop a missing data mask technique.
    5. To develop a speech enhancement technique based on compressive sensing.

    The main objectives of the second year are:
    1. To develop a feature set for blind source separation.
    2. To develop a clustering technique and an outlier elimination technique based on EM.
    3. To solve the permutation problem of blind source separation by DOA.
    4. To construct the partial spectrum of each speech source.
    5. To develop a blind source separation technique based on compressive sensing.

    The main objectives of the third year are:
    1. To develop an adaptive dictionary technique for speech enhancement.
    2. To develop a noise estimation technique using compressive sensing for speech enhancement.
    3. To develop a multi-stage compressive sensing technique for blind source separation.
    4. To develop a multi-range compressive sensing technique for blind source separation.
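The enhancement pipeline described in the abstract (mask the unreliable time-frequency points, then reconstruct the full spectrum from the reliable ones over an overcomplete dictionary) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the project's actual method: the function names `reliability_mask`, `omp`, and `reconstruct_frame` are hypothetical, the local SNR is approximated by simple power subtraction, the threshold and sparsity level are arbitrary, and orthogonal matching pursuit stands in for whichever sparse solver the project uses. The adjustment procedure and dictionary training are not covered.

```python
import numpy as np


def reliability_mask(noisy_power, noise_power, snr_threshold_db=0.0):
    """Mark bins whose estimated local SNR reaches a threshold (assumed rule)."""
    eps = 1e-12
    # Crude speech power estimate: subtract the noise estimate, floor at eps.
    speech_power = np.maximum(noisy_power - noise_power, eps)
    local_snr_db = 10.0 * np.log10(speech_power / np.maximum(noise_power, eps))
    return local_snr_db >= snr_threshold_db


def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: approximate y by at most n_nonzero columns of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    x = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        x[support] = coef
    return x


def reconstruct_frame(D, noisy_frame, reliable_mask, n_nonzero=3):
    """Estimate a full spectral frame from its reliable bins only."""
    D_rel = D[reliable_mask]  # dictionary rows at the reliable bins
    norms = np.linalg.norm(D_rel, axis=0)
    norms[norms == 0] = 1.0
    # Sparse-code the reliable bins over the column-normalized sub-dictionary.
    x = omp(D_rel / norms, noisy_frame[reliable_mask], n_nonzero)
    # Undo the normalization and synthesize every bin from the full dictionary.
    return D @ (x / norms)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 128))   # random stand-in for a trained dictionary
    code = np.zeros(128)
    code[[5, 40, 99]] = [1.0, -2.0, 1.5]
    clean = D @ code
    mask = rng.random(64) < 0.7          # pretend about 70% of bins are reliable
    noisy = clean.copy()
    noisy[~mask] += 5.0                  # corrupt only the unreliable bins
    estimate = reconstruct_frame(D, noisy, mask, n_nonzero=3)
    rel_err = np.linalg.norm(estimate - clean) / np.linalg.norm(clean)
    print("relative error:", rel_err)
```

The sparse code is fitted on the reliable rows only and then applied to the full dictionary, which is what lets the masked bins be filled in. A trained dictionary of speech power spectra would replace the random matrix used in the demo.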
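    Relation: Science and Technology Policy Research and Information Center, National Applied Research Laboratories
    Appears in collections: [Department of Computer Science and Information Engineering] Research Projects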

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    337


    All items in NCUIR are protected by original copyright.

