Abstract: Ear counting is a critical step in wheat yield estimation. With the rapid development of unmanned aerial vehicle (UAV) and computer vision technology, wheat ears can be counted automatically, more quickly, and more efficiently. An automatic counting method for UAV wheat ear images was proposed based on the feature enhance-point to point network (FE-P2Pnet) to address complex backgrounds, dense wheat, small wheat ear targets, and varying wheat ear sizes. Firstly, the brightness and contrast of the UAV images were enhanced to increase the difference between the wheat ear targets and the background, which reduced the influence of complex background factors such as leaves and stems. Secondly, the point-annotation network P2Pnet was introduced as the baseline network to address the problem of dense wheat ears. To counter the limited feature information caused by small wheat ear targets, a Triplet module was added to the VGG16 backbone of P2Pnet; it lets information interact across the C (channel), H (height), and W (width) dimensions, allowing the backbone to extract more target-related feature information. To handle varying wheat ear sizes, a feature enhancement module (FEM) and squeeze-and-excitation (SE) modules were added to the feature pyramid network (FPN), enabling the FPN to better process feature information and fuse multi-scale information. To classify targets better, the Focal Loss function was used in place of the cross-entropy loss function; it assigns different weights to background and target feature information to further highlight target features. The experimental results showed that, on the constructed UAV wheat image dataset (Wheat-ZWF), the mean absolute error (MAE), mean square error (MSE), and accuracy (ACC) of wheat ear counting were 3.77, 5.13, and 90.87%, respectively.
The proposed method performed best among the compared counting regression methods, including MCNN, CSRnet, and WHCNETs. Compared with the baseline network P2Pnet, the MAE and MSE values decreased by 23.2% and 16.6%, respectively, and the ACC value increased by 2.67 percentage points. To further validate the effectiveness of the proposed algorithm, experiments were conducted on data collected from four other wheat varieties (AK1009, AK1401, AK1706, and YKM222). The average MAE and MSE values of wheat ear counting were 5.10 and 6.17, with an ACC of 89.69%, indicating that the proposed model generalizes well. The research can provide support for related studies on wheat ear counting.
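
The reweighting performed by the Focal Loss can be illustrated with a minimal sketch of its standard binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The abstract does not state the hyperparameters or the exact variant used in FE-P2Pnet, so the function name `focal_loss` and the defaults `alpha=0.25` and `gamma=2.0` below are assumptions taken from common practice, not the authors' settings.

```python
import math


def focal_loss(prob_target, is_target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one candidate point (illustrative sketch).

    prob_target: predicted probability that the point is a wheat ear.
    is_target:   True if the point is matched to a ground-truth ear.
    alpha, gamma: assumed defaults; the abstract does not report them.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = prob_target if is_target else 1.0 - prob_target
    # alpha_t balances the target class against the background class.
    alpha_t = alpha if is_target else 1.0 - alpha
    # Clamp to avoid log(0) for fully confident wrong predictions.
    p_t = min(max(p_t, 1e-12), 1.0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, True)  # confident, correct prediction
hard = focal_loss(0.1, True)  # confident, wrong prediction
```

With `gamma=0` the modulating factor disappears and the expression reduces to an alpha-weighted cross-entropy, which is the sense in which the Focal Loss generalizes the loss it replaces.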
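
The three reported indicators can be sketched as follows. The abstract does not define them, so the formulas here are assumptions: MAE is the mean absolute count error per image; the value labelled MSE is computed as the root of the mean squared error, as is common in counting studies (the reported 5.13 is smaller than 3.77 squared, which a plain mean of squared errors could not be); and ACC is assumed to be one minus the mean relative error. The function name `counting_metrics` and the sample counts are hypothetical.

```python
import math


def counting_metrics(predicted, actual):
    """Return (MAE, root-mean-square error, ACC) for per-image ear counts.

    The definitions are assumptions; the abstract does not spell them out.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Assumed ACC: 1 minus the mean relative counting error.
    acc = 1.0 - sum(abs(e) / a for e, a in zip(errors, actual)) / n
    return mae, rmse, acc


# Hypothetical counts for three images, for illustration only.
mae, rmse, acc = counting_metrics([48, 52, 40], [50, 50, 40])
```

Under these definitions, a relative change such as the 23.2% decrease in MAE is measured against the baseline value, whereas the 2.67 percentage point gain in ACC is an absolute difference.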