The V-SAM model achieved up to 96 percent F1-score in glomerulus segmentation, outperforming leading architectures and setting a new benchmark for kidney biopsy analysis, according to a recent study published in Frontiers in Medicine.
Segmentation is an important task in medical image analysis that identifies and delineates regions of interest in images of organs, tissues, and lesions. In the pursuit of more accurate segmentation of glomeruli in kidney histopathology, researchers at the College of Computer Science, Chongqing University, developed V-SAM, a novel framework that enhances the Segment Anything Model (SAM) through targeted architectural modifications. Although SAM has demonstrated strong capabilities in natural image segmentation, its application in medical imaging remains limited by inadequate preservation of fine anatomical details, inefficiency in processing gigapixel whole-slide images, and sensitivity to variables such as staining artifacts.
V-SAM addresses these limitations by integrating three key innovations: a V-shaped U-Net adapter with multi-scale skip connections, lightweight trainable adapter layers, and a gradient-aware point-prompt mechanism that enables sub-pixel boundary refinement.
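The article describes these components only at a high level. The minimal PyTorch sketch below is an illustration of the general pattern the first two innovations refer to: a frozen pretrained encoder wrapped with small trainable adapter layers, plus a decoder that fuses multi-scale skip connections. All class names, dimensions, and layer counts here (Adapter, VShapedSegmenter, 256-dimensional tokens) are hypothetical and are not taken from the V-SAM paper.

```python
# Illustrative sketch only: frozen encoder + lightweight trainable adapters
# + a decoder path that fuses skip connections. Not V-SAM's actual code.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Lightweight bottleneck adapter: only these parameters are trained."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, dim // reduction)
        self.up = nn.Linear(dim // reduction, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual form keeps the frozen backbone features intact.
        return x + self.up(self.act(self.down(x)))

class VShapedSegmenter(nn.Module):
    """Frozen encoder blocks, trainable adapters, and a skip-connected decoder."""
    def __init__(self, encoder_blocks: nn.ModuleList, dim: int = 256):
        super().__init__()
        self.encoder_blocks = encoder_blocks
        for p in self.encoder_blocks.parameters():
            p.requires_grad = False            # keep the pretrained backbone untouched
        self.adapters = nn.ModuleList([Adapter(dim) for _ in encoder_blocks])
        # One decoder stage per encoder stage, consuming the matching skip feature.
        self.decoder_stages = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim * 2, dim), nn.GELU()) for _ in encoder_blocks]
        )
        self.head = nn.Linear(dim, 1)          # per-token foreground logit

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        skips, x = [], tokens
        for block, adapter in zip(self.encoder_blocks, self.adapters):
            x = adapter(block(x))              # adapted encoder feature at this scale
            skips.append(x)
        # Decoder path: fuse each skip connection in reverse (coarse-to-fine) order.
        y = x
        for stage, skip in zip(self.decoder_stages, reversed(skips)):
            y = stage(torch.cat([y, skip], dim=-1))
        return self.head(y)

# Toy usage with stand-in encoder blocks: batch of 2, 64 tokens, dim 256.
blocks = nn.ModuleList([nn.Sequential(nn.Linear(256, 256), nn.GELU()) for _ in range(4)])
model = VShapedSegmenter(blocks)
logits = model(torch.randn(2, 64, 256))        # -> shape (2, 64, 1)
```

Only the adapter and decoder parameters are updated during fine-tuning, which is why this style of design can specialize to kidney pathology without retraining the full backbone.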
The V-shaped adapter improves how the model identifies and reconstructs small structures in medical images. To adapt to kidney pathology without losing its general image-recognition ability, V-SAM adds lightweight layers that can be tuned on medical data without retraining the whole system. The framework is also computationally efficient, delivering strong performance with fewer computing resources, which matters for hospitals and labs with limited technical capacity, and it can refine how it pinpoints structures in an image without slowing down processing. When tested on two large kidney image datasets (HuBMAP-1 and HuBMAP-2), V-SAM performed better than other leading AI models. It reached 89 percent accuracy and an 86 percent F1-score (a measure combining precision and sensitivity) on the first dataset. On the second, more complex dataset, V-SAM achieved even higher scores of 98 percent accuracy and 96 percent F1-score.
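For readers unfamiliar with the metric, the F1-score cited above is the harmonic mean of precision and recall (sensitivity), computed from the overlap between predicted and ground-truth glomerulus masks. The short snippet below shows the standard calculation on binary masks; it is a generic illustration, not code or data from the study.

```python
# Standard precision/recall/F1 on binary segmentation masks (generic illustration).
import numpy as np

def f1_score(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()      # correctly segmented pixels
    precision = tp / (pred.sum() + eps)         # fraction of predicted pixels that are correct
    recall = tp / (truth.sum() + eps)           # fraction of true pixels that were found
    return 2 * precision * recall / (precision + recall + eps)

# Toy example: two 4x4 masks that mostly agree.
pred = np.array([[1, 1, 0, 0]] * 4)
truth = np.array([[1, 1, 1, 0]] * 4)
print(round(f1_score(pred, truth), 3))          # ~0.8
```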
Compared to other segmentation models such as UNet++, nnUNet, DRA-Net, and DET-SAM, V-SAM demonstrated superior performance in preserving glomerular boundaries and capturing intricate structural detail. Its performance gains were most notable against DET-SAM, which uses a similar ViT-B backbone but lacks V-SAM's optimized prompting and skip-connection design. These results point to the potential of V-SAM as a clinically relevant tool for automating glomerular segmentation in chronic kidney disease evaluation.
Beyond segmentation, V-SAM's innovations may inform future diagnostic tools by improving tissue-level interpretation of renal pathology, thereby advancing the integration of deep learning into computational pathology workflows. The framework combines accuracy, efficiency, and adaptability, making it suitable for deployment in diverse clinical and research settings.