NCU Institutional Repository (downloads of theses and dissertations, past exam papers, journal articles, and research projects): Item 987654321/86623


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/86623


    Title: Applying Deep Learning to Post-Market Drug Surveillance: A Twitter Text Classification Task (應用深度學習於藥品後市場監督:Twitter文本分類任務)
    Authors: Huang, Jian-Cia
    Contributors: Department of Information Management
    Keywords: Deep Learning; Natural Language Processing; Pharmacovigilance; Information Classification; Adverse Drug Reaction
    Date: 2021-07-27
    Issue Date: 2021-12-07 13:02:07 (UTC+8)
    Publisher: National Central University
    Abstract: Prescribing medication is a task physicians perform for every patient, every day. When prescribing a drug, a physician needs to understand its potential side effects. An adverse drug reaction (ADR) is a harmful, unintended result that occurs when a drug is used at normal doses for the prevention, diagnosis, or treatment of human disease, or to modify physiological function. According to the U.S. Department of Health and Human Services, ADRs account for one third of all hospitalizations each year. Social media use continues to grow: Twitter users post as many as 65 million tweets per day, and Facebook has more than 500 million users.
    This study collected tweets mentioning 44 branded generic drugs used to treat attention deficit hyperactivity disorder (ADHD) and 81 other drugs, for a total of 5,729 user tweets. After preprocessing and feature extraction, the vector produced for each word by a word embedding model served as the independent variable, and experts' judgments of whether each tweet contained an ADR signal served as the dependent variable. A BiLSTM (Bidirectional Long Short-Term Memory) classifier was then trained on top of pre-trained word embedding models: BERT (Bidirectional Encoder Representations from Transformers) and its variants BioBERT (Biomedical Bidirectional Encoder Representations from Transformers), Bio + Clinical BERT, and RoBERTa.
    After the data were balanced, the BiLSTM model was trained with two methods of merging word vectors, "Average" and "Concat", together with early stopping to avoid overfitting, in order to identify the pre-trained word embedding model best suited to ADR-related tweets.
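The abstract does not say which vectors the "Average" and "Concat" methods merge, nor how early stopping was configured. The sketch below assumes one common reading: each token has several equally sized vectors (for example, from different encoder layers) that are either averaged element-wise or joined end to end, and training stops once the validation loss has not improved for a fixed number of epochs. The function names, the toy vectors, and the `patience` value are illustrative assumptions, not details taken from the thesis.

```python
from typing import List, Sequence

Vector = List[float]


def merge_average(layers: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors; the output keeps the input width."""
    width = len(layers[0])
    if any(len(v) != width for v in layers):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in layers) / len(layers) for i in range(width)]


def merge_concat(layers: Sequence[Vector]) -> Vector:
    """Join the vectors end to end; the output width is the sum of the input widths."""
    return [x for v in layers for x in v]


def stopping_epoch(val_losses: Sequence[float], patience: int) -> int:
    """Return the 0-based epoch at which training stops.

    Training stops once the validation loss has failed to improve on its best
    value for `patience` consecutive epochs; otherwise it runs to the last epoch.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad_epochs = loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical 3-dimensional vectors for one token (assumed: two encoder layers).
token_layers = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
averaged = merge_average(token_layers)      # width stays 3
concatenated = merge_concat(token_layers)   # width grows to 6

# Hypothetical validation losses; patience of 2 is an assumed setting.
stop = stopping_epoch([0.9, 0.7, 0.72, 0.71, 0.8], patience=2)
```

Averaging keeps the input size of the downstream BiLSTM fixed, while concatenation preserves each source vector at the cost of a wider input; the abstract reports results for both.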
    The study found that ADR prediction models built on the BERT, BioBERT, Bio + Clinical BERT, and RoBERTa pre-trained word embedding models reached an accuracy close to 55% or higher even on the imbalanced dataset. The BERT embedding model with word vectors merged by the "Concat" method gave better ADR prediction performance when the model was trained with either random undersampling or random oversampling.
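Random undersampling and random oversampling, the two balancing strategies named in the abstract, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the `(text, label)` sample layout, the function names, the fixed seed, and the toy tweets are all hypothetical, and the thesis may have used a different implementation.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, int]  # (tweet text, label); assumed: 1 = ADR signal, 0 = none


def _group_by_label(samples: Sequence[Sample]) -> Dict[int, List[Sample]]:
    """Collect samples into one list per class label."""
    groups: Dict[int, List[Sample]] = defaultdict(list)
    for sample in samples:
        groups[sample[1]].append(sample)
    return groups


def random_undersample(samples: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Randomly drop samples until every class is as small as the smallest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples)
    target = min(len(g) for g in groups.values())
    balanced: List[Sample] = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))  # sampling without replacement
    rng.shuffle(balanced)
    return balanced


def random_oversample(samples: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Randomly duplicate samples until every class is as large as the largest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples)
    target = max(len(g) for g in groups.values())
    balanced: List[Sample] = []
    for group in groups.values():
        balanced.extend(group)  # keep every original sample
        balanced.extend(rng.choices(group, k=target - len(group)))  # with replacement
    rng.shuffle(balanced)
    return balanced


# Hypothetical imbalanced toy set: 8 tweets without an ADR signal, 2 with one.
toy = [("tweet %d" % i, 0) for i in range(8)] + [("adr tweet %d" % i, 1) for i in range(2)]
under = random_undersample(toy)  # 2 samples per class
over = random_oversample(toy)    # 8 samples per class
```

Resampling is normally applied to the training split only, so that validation and test data keep the original class distribution.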
    Appears in Collections:[Graduate Institute of Information Management] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.
