Resource-Efficient Few-Shot Plant Disease Classification via Quantized Low-Rank Adapters in Vision Transformers
Abstract
Rapid and accurate detection of plant diseases from limited labeled data remains a critical challenge in digital agriculture. In this paper, we present a few-shot learning framework for plant disease classification that integrates a self-supervised Vision Transformer backbone (DINOv2-S) with a Prototypical Network classifier, adapted via Quantized Low-Rank Adaptation (QLoRA). While QLoRA has demonstrated strong efficiency gains in natural language processing since its introduction, its application to vision-domain few-shot learning tasks has not been systematically explored. Our framework fine-tunes only approximately 1% of the model parameters (~221K) using low-rank adapters and 4-bit quantization, reducing trainable-parameter storage to approximately 2.53 MB and achieving an inference latency of 0.20 ms per image on NVIDIA A100 hardware. Comprehensive experiments on the PlantDoc and PlantVillage datasets show competitive or improved performance relative to recent approaches. In the 5-way 5-shot setting at 224×224 resolution, the proposed framework achieves mean accuracies of 86.85 ± 0.42% and 94.51 ± 0.35%, respectively. Furthermore, ablation studies across varying shot numbers, image resolutions, backbone architectures, LoRA rank configurations, and fine-tuning strategies confirm the robustness and efficiency of the proposed approach. These results suggest that parameter-efficient adaptation of vision transformers offers a practical pathway for deploying disease diagnosis systems in resource-constrained agricultural settings.
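The Prototypical Network step named in the abstract can be illustrated with a minimal sketch: each class prototype is the mean of its support-set embeddings, and a query is assigned to the nearest prototype by Euclidean distance. The class labels, embedding values, and function names below are illustrative assumptions, not the paper's actual data or implementation (which operates on DINOv2-S features).

```python
import math

def prototypes(support):
    """Compute one prototype per class as the mean of its support embeddings.
    support: dict mapping class label -> list of embedding vectors (lists of floats).
    """
    protos = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        protos[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return protos

def classify(query, protos):
    """Assign a query embedding to the class of the nearest prototype
    under Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(protos, key=lambda label: dist(query, protos[label]))

# Toy 2-way 2-shot episode with hypothetical 2-D embeddings
support = {
    "healthy": [[0.0, 0.0], [0.2, 0.0]],
    "blight":  [[1.0, 1.0], [1.0, 0.8]],
}
protos = prototypes(support)
print(classify([0.1, 0.1], protos))  # falls nearest the "healthy" prototype
```

In the paper's setting, the embeddings would come from the QLoRA-adapted DINOv2-S backbone, and an episode would be 5-way 5-shot rather than this 2-way 2-shot toy.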