Crop Yield Prediction Using Deep Neural Networks

Crop yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions. Accurate yield prediction requires a fundamental understanding of the functional relationship between yield and these interactive factors, and revealing such a relationship requires both comprehensive datasets and powerful algorithms. In the 2018 Syngenta Crop Challenge, Syngenta released several large datasets that recorded the genotype and yield performances of 2,267 maize hybrids planted in 2,247 locations between 2008 and 2016 and asked participants to predict the yield performance in 2017. As one of the winning teams, we designed a deep neural network (DNN) approach that took advantage of state-of-the-art modeling and solution techniques. Our model was found to have superior prediction accuracy, with a root-mean-square error (RMSE) of 12% of the average yield and 50% of the standard deviation for the validation dataset using predicted weather data. With perfect weather data, the RMSE would be reduced to 11% of the average yield and 46% of the standard deviation. We also performed feature selection based on the trained DNN model, which successfully decreased the dimension of the input space without a significant drop in prediction accuracy. Our computational results suggested that this model significantly outperformed other popular methods such as Lasso, shallow neural networks (SNN), and regression tree (RT). The results also revealed that environmental factors had a greater effect on crop yield than genotype.
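
To make the reported accuracy figures concrete, the following minimal sketch shows one way an RMSE can be expressed as a fraction of the average yield and of the standard deviation of the observed yields. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `relative_rmse`, the toy yield values, and the use of the population standard deviation are all hypothetical choices made here.

```python
import math
import statistics


def relative_rmse(observed, predicted):
    """Return (rmse, rmse / mean, rmse / std) for paired yield values.

    The normalizers are computed from the observed yields. Using the
    population standard deviation is an assumption of this sketch.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    mean_yield = statistics.fmean(observed)
    std_yield = statistics.pstdev(observed)
    return rmse, rmse / mean_yield, rmse / std_yield


# Hypothetical yields for illustration only (not from the challenge data).
observed = [100.0, 120.0, 80.0, 110.0, 90.0]
predicted = [105.0, 115.0, 85.0, 105.0, 95.0]

rmse, frac_of_mean, frac_of_std = relative_rmse(observed, predicted)
print(f"RMSE = {rmse:.2f}, {frac_of_mean:.1%} of mean, {frac_of_std:.1%} of std")
```

Reporting the error relative to the standard deviation indicates how much of the spread in yields the model accounts for, while the ratio to the mean gives a scale-free sense of the typical error size.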