Abstract: | Following the 2018 US-China trade dispute and the COVID-19 pandemic, the total output value of Taiwan's server manufacturing industry has risen sharply. The CPU is the most important core component of a server and the one most closely tied to application performance. To meet the needs of new kinds of applications, CPU vendors have divided their new-generation products into many specification combinations. This increases the validation burden on server R&D laboratories, which must validate CPUs of different specifications quickly while using test resources efficiently. With server orders continuing to grow, raising validation throughput and lowering cost become the central concerns of thermal testing. This study builds predictive models with several data mining methods to analyze the relationship between CPU thermal test validation results and key data items, and identifies the model best suited to predicting those results. Estimating the result early makes it possible to simplify the test procedures, and reduce the associated costs, that would otherwise be required when a system design changes or is paired with components of different specifications.
The case studied is the thermal validation department of a leading US server manufacturer, using its actual thermal test validation data and results from January 2020 to July 2023. Predictive models were built in SAS Enterprise Miner (EM) with five data mining methods: logistic regression, artificial neural network, decision tree, gradient boosting decision tree, and random forest. The five models were compared on misclassification rate, accuracy, sensitivity, specificity, ROC curve, and AUC. The random forest model showed the most balanced performance of the five and achieved a correct classification rate of 92.7% on the test set.
After the model was built, the scoring function of SAS Enterprise Miner (EM) was used to classify the raw data awaiting prediction, and the random forest model output both a predicted probability and a predicted result for each record. Comparing the classification results of the "model predicted data" with those of the "actual thermal test data" confirmed that the model's predictive ability is relatively reliable, so it can be applied in practice to predict the case company's CPU thermal test validation results. Assuming the case company adopts the model to determine the result of some of its thermal validation experiments, and taking 20 systems over a one-year period as the basis of calculation, the main benefit is a saving of about 1,400 hours of validation time and the secondary benefit is a saving of about NT$22,000 in electricity used for validation. Adoption would also improve the utilization of the company's environmental test chambers and reduce the labor cost of validation. Taken together, these results indicate that the random forest model can effectively help improve the efficiency of server CPU thermal test validation. Some limiting factors may nevertheless not have been fully captured in the modeling and research process. To keep improving prediction accuracy, future researchers are advised to consider more potential influencing factors, including: combining model predictions with actual validation results and continually tuning the model to reduce the misclassification rate; building predictive models for different cooling technologies; exploring and building predictive models for GPUs; analyzing the predictive ability of models built from the databases of different server brands; and evaluating other data mining techniques for analysis and modeling. |
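The evaluation measures named in the abstract (misclassification rate, accuracy, sensitivity, specificity, and AUC) can be illustrated with a minimal Python sketch. It is not the SAS Enterprise Miner workflow used in the study. The function names, the sample labels and scores, the 0.5 decision threshold, and the choice of 1 as the positive class are all hypothetical and are not taken from the thesis data.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    misclassification_rate: float
    sensitivity: float
    specificity: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix metrics for a two-class prediction."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return BinaryMetrics(
        accuracy=(tp + tn) / total,
        misclassification_rate=(fp + fn) / total,
        sensitivity=tp / (tp + fn),   # true positive rate
        specificity=tn / (tn + fp),   # true negative rate
    )


def auc(y_true, scores, positive=1):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical sample: true labels and model-assigned probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
print(metrics)
print("AUC:", auc(y_true, scores))
```

Sensitivity and specificity are reported separately from accuracy because a model can score a high overall correct classification rate while still missing most cases of the rarer class. This is one reason a "most balanced" model may be preferred over the one with the single best accuracy.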