Abstract:
Traditional aquaculture relies on either fixed-quantity feeding or human experience, which often results in uneven feeding, feed waste, and environmental pollution. This study designs a model that identifies fish feeding intensity in order to improve feeding efficiency and reduce pollution. The temporal segment network (TSN) serves as the base model to capture long-term changes in fish feeding behavior. Temporal shift operations are introduced to capture the dynamic changes between adjacent video frames more accurately. An axial feature calibration step lets the model adaptively adjust its features so that it focuses more precisely on variations along the different feature axes of fish feeding behavior. Experimental results indicate that, compared with the two-dimensional convolutional network TSN, the proposed model improves average accuracy by 10.0% with only a 5.2% increase in parameters. Compared with the three-dimensional convolutional network C3D, it improves accuracy by 0.9% while reducing parameters by 67.3%. Compared with the Transformer-based Swin Transformer, it improves average accuracy by 4.1% while reducing parameters by 9.2%. These findings show that the proposed model identifies and classifies the feeding intensity of fish schools more accurately than these baselines, providing a scientific basis for formulating precise feeding strategies.
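
The temporal shift idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the shift proportion, the shift direction, or where the operation sits in the network, so the code below assumes the commonly used bidirectional formulation: one fraction of the channels is taken from the next frame, an equal fraction from the previous frame, and the rest is left unchanged. The function name `temporal_shift`, the parameter `fold_div`, and the `(T, C, H, W)` layout are illustrative assumptions, not details from the study.

```python
import numpy as np


def temporal_shift(x: np.ndarray, fold_div: int = 8) -> np.ndarray:
    """Shift a fraction of channels along the time axis (illustrative sketch).

    x has shape (T, C, H, W): T sampled frames, C feature channels.
    fold_div is a hypothetical parameter: C // fold_div channels are shifted
    in each direction, and the remaining channels are copied unchanged.
    """
    num_channels = x.shape[1]
    fold = num_channels // fold_div
    out = np.zeros_like(x)

    # First group of channels: frame t receives the features of frame t + 1.
    # The last frame has no successor, so it stays zero-padded.
    out[:-1, :fold] = x[1:, :fold]

    # Second group of channels: frame t receives the features of frame t - 1.
    # The first frame has no predecessor, so it stays zero-padded.
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]

    # Remaining channels are passed through without any temporal mixing.
    out[:, 2 * fold:] = x[:, 2 * fold:]
    return out


# Toy input: 3 frames, 4 channels, 1x1 spatial size; element value = 4 * t + c.
frames = np.arange(12, dtype=float).reshape(3, 4, 1, 1)
shifted = temporal_shift(frames, fold_div=4)
```

In this formulation the shift itself adds no learnable parameters; adjacent frames exchange information only because a subsequent 2D convolution sees channels that originate from neighbouring time steps.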