Abstract: Accurate mapping of farmland operation areas is an important prerequisite for the path planning and navigation of farm machinery. Terraced fields on the Loess Plateau vary in size, have complex shapes, and contain pits, ditches, and many dangerous operation boundaries, so it is difficult to extract the terraced operation area accurately with the commonly used satellite point measurement methods. Using UAV remote sensing images of terraced fields as the data source, an improved TransUNet model based on multi-scale feature extraction and fused up-sampling was proposed. In the encoder, feature extraction and fusion for terraces of different scales were enhanced by introducing the pyramid squeeze attention (PSA) module, which builds on channel attention, and the Transformer layer was optimized with a residual structure. In the decoder, the Dual up-sample module was introduced to combine a sub-pixel convolutional layer with bilinear interpolation up-sampling, improving the accuracy of terraced field boundary segmentation while preventing the checkerboard effect. A concurrent spatial and channel squeeze and excitation (SCSE) attention module was added at the end of the decoder to integrate and enhance information in the spatial and channel dimensions, which helped recover the detailed features of the image step by step. The experimental results showed that, on the test set of three typical terrace types (straight and long stripes, meandering stripes, and irregular shapes), the mean pixel accuracy, F1 value, and mean intersection over union of the improved TransUNet model reached 96.0%, 96.0%, and 92.3% on average, respectively, an average improvement of 1.8 percentage points over the original TransUNet. Compared with the representative PSPNet, HRNet V2, DeepLab V3+, and U-Net models, the average improvement across the three indicators was 8.3, 6.2, 5.0, and 4.2 percentage points, respectively. On the test sets containing a single terrace type, the proposed model performed best, with an intersection over union of 97.0% on average. The method can provide a reference for the construction of terraced field environment maps on the Loess Plateau and for the navigation of agricultural machinery in hilly and mountainous areas.
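For readers unfamiliar with the three indicators reported above, the following minimal sketch shows one common way to compute mean pixel accuracy, F1 value, and mean intersection over union from a pixel-level confusion matrix. It is an illustration only, not the evaluation code used in this study: the function names `confusion_matrix` and `segmentation_metrics` are hypothetical, a two-class labelling (0 = background, 1 = terrace) is assumed, and F1 is assumed to be computed for the terrace class, which the abstract does not specify.

```python
def confusion_matrix(truth, pred, num_classes=2):
    """Count pixels: cm[t][p] is the number of pixels of true class t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(cm, positive=1):
    """Compute mPA, F1 (for the assumed positive class), and mIoU from cm."""
    n = len(cm)
    acc, iou = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        acc.append(_safe_div(tp, tp + fn))          # per-class pixel accuracy
        iou.append(_safe_div(tp, tp + fp + fn))     # per-class intersection over union

    tp = cm[positive][positive]
    fn = sum(cm[positive]) - tp
    fp = sum(cm[r][positive] for r in range(n)) - tp
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)

    return {
        "mpa": sum(acc) / n,    # mean pixel accuracy over classes
        "f1": f1,               # F1 value of the positive (terrace) class
        "miou": sum(iou) / n,   # mean intersection over union over classes
        "iou": iou,             # per-class IoU, e.g. iou[1] for terraces
    }


# Toy example with flattened label masks (assumed: 1 = terrace, 0 = background).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = segmentation_metrics(confusion_matrix(truth, pred))
print(metrics)
```

In this toy example each class has three correctly labelled pixels, one missed pixel, and one false alarm, so the per-class pixel accuracy is 3/4 and the per-class intersection over union is 3/5. Note that intersection over union penalizes both kinds of error in its denominator, which is why it is typically lower than pixel accuracy and F1 on the same predictions, as is also the case for the values reported above.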