Automatic fish recognition is widely used in marine ecology and aquaculture. Fluctuating illumination, overlapping instances, and occlusion make accurate automatic identification of fish extremely challenging. To address these problems, this paper introduces the Multi-stage Feature Extraction Network (MF-Net), a model for automatic fish recognition built on a multi-stage feature extraction paradigm. MF-Net begins with a subtle image-enhancement preprocessing step designed to improve the model's computational efficiency. A multi-stage convolutional feature extraction strategy is then applied to improve the model's sensitivity to the fine-grained features of fish species. To mitigate the effects of data imbalance, the model incorporates a long-tail loss computation strategy. To evaluate MF-Net, the study collects a comprehensive fish dataset comprising 32,768 images across 500 categories. MF-Net achieves an accuracy of 86.8% on this dataset, outperforming existing state-of-the-art target recognition algorithms. The model is also tested on a publicly available butterfly dataset to verify its generalization performance, and multiple ablation experiments further validate the effectiveness of the proposed algorithm.
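The abstract does not specify which long-tail loss MF-Net uses. As an illustration of the general idea only, the following minimal sketch shows one common option, cross-entropy reweighted by inverse class frequency, so that errors on rare categories contribute more to the loss. The function names, class counts, and logits are hypothetical and are not taken from the paper.

```python
import math

def inverse_frequency_weights(class_counts):
    """Per-class weights proportional to 1/count, rescaled so they average to 1.

    Assumption: this weighting is an illustrative choice, not MF-Net's actual loss.
    """
    inv = [1.0 / c for c in class_counts]
    scale = len(inv) / sum(inv)
    return [w * scale for w in inv]

def weighted_cross_entropy(logits, target, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return weights[target] * (log_sum_exp - logits[target])

# Hypothetical image counts for three categories: one head class, two tail classes.
counts = [900, 90, 10]
weights = inverse_frequency_weights(counts)

# Hypothetical logits for a single image.
logits = [2.0, 0.5, 0.1]
head_loss = weighted_cross_entropy(logits, 0, weights)
tail_loss = weighted_cross_entropy(logits, 2, weights)
```

With this weighting, a sample from the rarest category carries a much larger weight than one from the most frequent category, which is the basic mechanism by which reweighted losses counteract imbalance. Other long-tail strategies (for example, logit adjustment or focal-style losses) follow different formulas.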