NCU Institutional Repository (中大機構典藏) - Item 987654321/86634


    Please use this persistent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/86634


    Title: 改進自注意力機制於神經機器翻譯之研究 (Improving the Self-Attention Mechanism for Neural Machine Translation)
    Authors: 陳明萱; Chen, Ming-Hsuan
    Contributors: Department of Information Management (資訊管理學系)
    Keywords: Neural Machine Translation; Transformer; Self-Attention Mechanism; Gate Mechanism; Clustering Algorithms
    Date: 2021-08-02
    Upload time: 2021-12-07 13:02:39 (UTC+8)
    Publisher: 國立中央大學 (National Central University)
    Abstract: The purpose of Neural Machine Translation (NMT) is to translate a source sentence into a target sentence with deep learning models while preserving the semantics of the source sentence and producing correct syntax. The Transformer has recently become one of the most commonly used models: its Self-Attention Mechanism captures the global information of a sentence, and it performs well on many Natural Language Processing (NLP) tasks. However, some studies have indicated that the Self-Attention Mechanism learns repetitive information and cannot effectively learn local information in texts. Therefore, we modify the Self-Attention Mechanism in the Transformer and propose Gated Attention and Clustered Attention, which add a gate mechanism and the K-means clustering algorithm respectively; Gated Attention further comprises a Top-k% method and a Threshold method. These approaches centralize the Attention Map so that the model better captures local information and learns more diverse relationships within sentences, allowing the Transformer to produce higher-quality translations.
    In this work, we apply the Top-k% and Threshold methods of Gated Attention, as well as Clustered Attention, to a Chinese-to-English translation task, reaching 25.30, 24.69, and 24.69 BLEU respectively. The best result of a hybrid model that combines both attention mechanisms is 24.88 BLEU, which does not surpass using a single method. Our experiments confirm that the proposed models outperform the vanilla Transformer, and also show that using only one attention mechanism helps the Transformer learn textual information better while achieving the goal of centralizing the Attention Map.
    Appears in Collections: [Graduate Institute of Information Management] Master's and Doctoral Theses

    Files in This Item:

    File          Description    Size    Format    Views
    index.html                   0 Kb    HTML      79        View/Open
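
    The abstract above describes centralizing the Attention Map through Gated Attention (Top-k% and Threshold methods) and Clustered Attention. The item does not include the implementation, so the following is only a minimal sketch of what such gating could look like: the function names, the keep_ratio and threshold parameters, and the renormalization step are assumptions rather than the thesis's actual method, and Clustered Attention (K-means applied to attention) is not shown.

# Hypothetical sketch of "centralized" self-attention; not the thesis implementation.
import torch
import torch.nn.functional as F

def topk_percent_attention(q, k, v, keep_ratio=0.3):
    # Scaled dot-product attention that keeps only the top k% of attention
    # weights for each query position and renormalizes them.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5        # (batch, len_q, len_k)
    weights = F.softmax(scores, dim=-1)
    keep = max(1, int(weights.size(-1) * keep_ratio))    # number of keys kept per query
    topk_vals, _ = weights.topk(keep, dim=-1)
    cutoff = topk_vals[..., -1:]                         # smallest kept weight per query
    gated = weights * (weights >= cutoff)                # zero out the long tail
    gated = gated / gated.sum(dim=-1, keepdim=True)      # renormalize the kept weights
    return gated @ v

def threshold_attention(q, k, v, threshold=0.05):
    # Variant that drops attention weights below a fixed threshold instead.
    d_k = q.size(-1)
    weights = F.softmax(q @ k.transpose(-2, -1) / d_k ** 0.5, dim=-1)
    gated = weights * (weights >= threshold)
    gated = gated / gated.sum(dim=-1, keepdim=True).clamp_min(1e-9)
    return gated @ v

if __name__ == "__main__":
    q = torch.randn(2, 10, 64)    # (batch, sequence length, head dimension)
    k = torch.randn(2, 10, 64)
    v = torch.randn(2, 10, 64)
    print(topk_percent_attention(q, k, v).shape)   # torch.Size([2, 10, 64])
    print(threshold_attention(q, k, v).shape)      # torch.Size([2, 10, 64])

    Both variants zero out the weakest attention weights for each query and renormalize the remainder, which is one straightforward way to concentrate each row of the Attention Map on fewer positions.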


    All items in NCUIR are protected by original copyright.

