基于改进深层堆叠LSTM网络的玉米产量预测模型研究

Research on Corn Yield Prediction Model based on Improved Deep Stacked LSTM Network

【Author】 Wang Xu (王旭)

【Advisor】 Fang Zhiyi (房至一)

【Author Information】 Jilin University, Computer Technology (professional degree), 2022, Master's

【Abstract】 In recent years, advances in computer technology and in agricultural production have brought agricultural production and data statistics ever closer together. Supported by large volumes of agricultural data and high-speed computers, it has become increasingly feasible to use statistical models and neural networks to analyze crop growth and to predict crop yield. Yield prediction requires data on the factors that affect yield; although there are many such factors, researchers can often obtain only a small subset of them. This thesis takes meteorological information, which strongly influences crop yield, as the main input. Given the limited amount of data, prediction is carried out at the county level, where data from a certain number of years and a sufficient number of counties are enough to begin; the longitude and latitude of each county and the year are therefore added as supplementary inputs to distinguish the samples. Corn is the crop studied. In existing corn yield research that uses meteorological information as the main input, most researchers feed all information into the network indiscriminately for feature extraction, which raises two problems: (1) the intrinsic relationships among the inputs cannot be extracted; (2) the relative importance of different inputs to the final output cannot be distinguished. To address these problems, this thesis collects meteorological features on a monthly basis, orders them according to the corn growth cycle, and treats each month as a time step, forming k time-step vectors from the sowing month to the harvest month. The dimension of each time-step vector equals the number of meteorological factors collected, and k is the number of months in the growing season. On this basis, the thesis first proposes NPAM-LSTM (No-Parameter Attention Mechanism-Long Short-Term Memory), a network that distinguishes the importance of meteorological features. Its NPAM component extracts the importance of each input meteorological factor to the predicted corn yield; NPAM is then applied again, together with the LSTM network, to extract features from the monthly time series before the prediction is output. Building on NPAM-LSTM, the thesis further proposes PFSF-DSN (Primary Factors and Secondary Factors-Deep Stacked Networks) to predict corn yield more accurately. PFSF-DSN builds two deep stacked networks, one forward and one backward, which extract information along the forward and reverse directions of the time series, respectively. Each deep stacked network consists of a PFSF-LSTM (Primary Factors and Secondary Factors-Long Short-Term Memory) and multiple LSTM layers. The PFSF-LSTM takes as input one time series composed of the primary meteorological factors and another composed of the secondary meteorological factors; the division into primary and secondary factors is based on the factor importance extracted by NPAM-LSTM. PFSF-DSN is compared with Ridge, LASSO, RF, LSTM, TCN, NPAM-LSTM, and other models on corn yield prediction under the same conditions, using meteorological information and some supplementary information. The results show that PFSF-DSN outperforms the comparison models in RMSE, MAE, and RE (Relative Error). Ablation experiments are also designed to show that the model improvements are justified.
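The input arrangement and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. The function names, the array layout `(samples, 12, d)`, and the May-to-September growing season in the usage note are assumptions made for illustration, and RE is assumed here to be the mean of |prediction - truth| / |truth|, since the abstract does not define it.

```python
import numpy as np

def build_sequences(monthly, sow_month, harvest_month):
    """Slice (samples, 12, d) monthly weather features into (samples, k, d),
    ordered from the sowing month to the harvest month (1-based, inclusive)."""
    monthly = np.asarray(monthly, dtype=float)
    return monthly[:, sow_month - 1:harvest_month, :]

def flip_sequences(seq):
    """Reverse the time axis to obtain the input for a backward-direction network."""
    return seq[:, ::-1, :]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true)))

def relative_error(y_true, y_pred):
    """Mean relative error (assumed definition): mean(|pred - true| / |true|)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))
```

For example, with a hypothetical growing season from May to September, `build_sequences(x, 5, 9)` turns an array of shape `(samples, 12, d)` into `(samples, 5, d)`, so k = 5 and d is the number of meteorological factors. Supplementary inputs such as longitude, latitude, and year would be kept as a separate per-sample vector.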

  • 【Online Publication Contributor】 Jilin University
  • 【Online Publication Year/Issue】 2023, Issue 01
  • 【Classification Code】 S126; TP183