Segmentation

BiomedParse segments organs, abnormalities and cells, accurately following the user's prompts. Without any image-specific guidance such as bounding boxes or points, BiomedParse, using text prompts alone, outperforms state-of-the-art bounding-box-based methods across 9 biomedical imaging modalities.

Detection

BiomedParse detects the specific object of interest and locates it with pixel-level precision, even for objects with irregular shapes. By effectively identifying text prompts that describe objects absent from the image, BiomedParse is capable of end-to-end object detection.
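
As a rough illustration of how a pixel-level prediction can double as a detection result, the sketch below turns a per-pixel probability map into a bounding box and returns nothing when too few pixels are confident. It is not BiomedParse's actual procedure: the function name `detect_from_mask`, the fixed thresholds, and the pixel-count rejection rule are all assumptions made for this example.

```python
import numpy as np


def detect_from_mask(prob_map, pixel_threshold=0.5, min_pixels=10):
    """Hypothetical helper: derive a bounding box from a probability map.

    prob_map is a 2-D array of per-pixel probabilities for one text prompt.
    Returns (x_min, y_min, x_max, y_max), or None when the prompted object
    is judged absent. The rejection rule (a minimum count of confident
    pixels) is an assumption for illustration only.
    """
    mask = prob_map >= pixel_threshold
    if mask.sum() < min_pixels:
        return None  # treat the prompt as describing an absent object
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Toy probability map: a confident 3x4 region in an otherwise empty image.
prob = np.zeros((8, 8))
prob[2:5, 3:7] = 0.9
box = detect_from_mask(prob)
absent = detect_from_mask(np.zeros((8, 8)))
```

Because the box is computed from the mask, irregularly shaped objects are localized by their true extent rather than by a predicted rectangle.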

Recognition

Tired of typing a prompt for every object? BiomedParse can recognize all objects at once. Having learned 82 object types, BiomedParse can automatically identify all objects in a given image along with their semantic types, and simultaneously segment and label all biomedical objects of interest.
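
One simple way to picture simultaneous segmentation and labeling is to merge per-type probability maps into a single label map. The sketch below is an assumption-laden illustration, not the method BiomedParse uses: the function `label_all_objects`, the argmax-with-threshold rule, and the example type names are hypothetical.

```python
import numpy as np


def label_all_objects(prob_maps, threshold=0.5):
    """Hypothetical helper: merge per-type probability maps into one label map.

    prob_maps maps an object-type name to a 2-D probability array (all the
    same shape). Each pixel gets the index (starting at 1) of its most
    probable type, or 0 (background) when no type reaches the threshold.
    """
    names = list(prob_maps)
    stacked = np.stack([prob_maps[n] for n in names])  # shape (K, H, W)
    best = stacked.argmax(axis=0)
    confident = stacked.max(axis=0) >= threshold
    label_map = np.where(confident, best + 1, 0)
    present = [names[i - 1] for i in np.unique(label_map) if i != 0]
    return label_map, present


# Toy example with three illustrative type names.
liver = np.zeros((4, 4)); liver[0:2, :] = 0.9
kidney = np.zeros((4, 4)); kidney[3, :] = 0.8
tumor = np.full((4, 4), 0.1)
label_map, present = label_all_objects(
    {"liver": liver, "kidney": kidney, "tumor": tumor}
)
```

Here `present` lists the types recognized in the image, and `label_map` assigns each pixel to one of them or to background.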

Abstract

Biomedical image analysis is fundamental for biomedical discovery. Holistic image analysis comprises interdependent subtasks such as segmentation, detection and recognition, which are tackled separately by traditional approaches. Here, we propose BiomedParse, a biomedical foundation model that can jointly conduct segmentation, detection and recognition across nine imaging modalities. This joint learning improves the accuracy for individual tasks and enables new applications such as segmenting all relevant objects in an image through a textual description. To train BiomedParse, we created a large dataset comprising over 6 million triples of image, segmentation mask and textual description by leveraging natural language labels or descriptions accompanying existing datasets. We showed that BiomedParse outperformed existing methods on image segmentation across nine imaging modalities, with larger improvement on objects with irregular shapes. We further showed that BiomedParse can simultaneously segment and label all objects in an image. In summary, BiomedParse is an all-in-one tool for biomedical image analysis on all major image modalities, paving the path for efficient and accurate image-based biomedical discovery.

Related Work

SEEM: Segment Everything Everywhere All at Once by Xueyan Zou*, Jianwei Yang*, Hao Zhang*, Feng Li*, Linjie Li, Jianfeng Wang, Lijuan Wang, Jianfeng Gao^, Yong Jae Lee^
X-Decoder: Generalized Decoding for Pixel, Image, and Language by Xueyan Zou*, Zi-Yi Dou*, Jianwei Yang*, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee^, Jianfeng Gao^
Focal Modulation Networks by Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan and Jianfeng Gao.
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing by Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon.
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs by Sheng Zhang, Yanbo Xu, Naoto Usuyama, Hanwen Xu, Jaspreet Bagga, Robert Tinn, Sam Preston, Rajesh Rao, Mu Wei, Naveen Valluri, Cliff Wong, Andrea Tupini, Yu Wang, Matt Mazzola, Swadheen Shukla, Lars Liden, Jianfeng Gao, Matthew P. Lungren, Tristan Naumann, Sheng Wang, Hoifung Poon.
BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys by Yu Gu, Jianwei Yang, Naoto Usuyama, Chunyuan Li, Sheng Zhang, Matthew P. Lungren, Jianfeng Gao, Hoifung Poon.

BiomedParse

A foundation model for joint segmentation, detection and recognition of biomedical objects across nine modalities

One model, 9 imaging modalities
