Rodney LaLonde

Capsules for Object Segmentation

Updated: Jun 16, 2019

This work was awarded an oral presentation at the MIDL 2018 conference and received the CIFAR Student Travel Grant award.


The code has been made publicly available at https://github.com/lalonderodney/SegCaps. Live code demo: drive.google.com/drive/folders/1MhebBrDsh3N5HSntj2Zl5edx56_IkXkN?usp=sharing.


In this work, we focus on extending the recently introduced "capsule networks" to the task of object segmentation in large images. The original paper by Rodney LaLonde and Ulas Bagci, "Capsules for Object Segmentation," can be found at https://arxiv.org/abs/1804.04241.

Presenting my research at MIDL 2018 (Oral) in Amsterdam.

Video of Oral Presentation

This work appeared as an oral presentation at the International Conference on Medical Imaging with Deep Learning (MIDL) in Amsterdam, 4-6 July 2018.


Brief Description of Work

Convolutional Neural Networks (CNNs) have shown remarkable results over the last several years for a wide range of computer vision tasks. A new architecture recently introduced by Sabour et al., referred to as capsule networks with dynamic routing, has shown great initial results for digit recognition and small-image classification. Our work expands the use of capsule networks to the task of object segmentation for the first time in the literature.
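
For readers unfamiliar with capsule networks, the core of Sabour et al.'s approach is the routing-by-agreement procedure that replaces pooling: lower-level (child) capsules vote for higher-level (parent) capsules, and votes that agree with the emerging parent output are strengthened over a few iterations. Below is a minimal NumPy sketch of that procedure; the shapes, capsule counts, and iteration count are illustrative only and are not taken from our released code.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing non-linearity from Sabour et al.: preserves vector
    orientation while mapping its length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement over prediction vectors.

    u_hat: (num_in, num_out, dim) -- the prediction each child capsule
           makes for each parent capsule.
    Returns the parent capsule vectors, shape (num_out, dim).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # each child distributes its output across the parents (softmax over parents)
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = np.einsum('ij,ijd->jd', c, u_hat)       # weighted sum per parent
        v = squash(s)                                # squashed parent vectors
        b = b + np.einsum('ijd,jd->ij', u_hat, v)    # reward agreeing predictions
    return v

# toy example: 8 child capsules of dimension 16 routed to 3 parent capsules
u_hat = np.random.randn(8, 3, 16)
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 16)
```

The length of each resulting parent vector can be read as the probability that the corresponding entity is present, which is what makes capsules attractive as a drop-in replacement for scalar feature maps.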

SegCaps Baseline Network
A baseline capsule network for segmentation based heavily on the work by Sabour et al.

We extend the idea of convolutional capsules with locally-connected routing and propose the concept of deconvolutional capsules. Further, we extend the masked reconstruction technique to reconstruct only the positive input class. The proposed convolutional-deconvolutional capsule network, called SegCaps, shows strong results for the task of object segmentation with a substantial decrease in parameter space.
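
To make these two ideas more concrete, here is a simplified PyTorch sketch of a convolutional capsule layer and its deconvolutional counterpart (our released implementation is in Keras, so the class and parameter names below are purely illustrative). It shares one learned transformation per child capsule type across all spatial positions and routes at each position over capsule types only, which is a simplification of the full locally-constrained routing described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim, eps=1e-8):
    """Squashing non-linearity applied along the capsule vector dimension."""
    sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq / (1.0 + sq)) * s / torch.sqrt(sq + eps)

class ConvCapsuleLayer(nn.Module):
    """Convolutional (or, with transposed=True, deconvolutional) capsule layer.

    Child capsules:  (B, in_types,  in_dim,  H,  W)
    Parent capsules: (B, out_types, out_dim, H', W')
    One transformation per child capsule type is shared across all spatial
    positions (a conv / transposed conv over a local window); routing
    coefficients are then computed independently at every position.
    """
    def __init__(self, in_types, in_dim, out_types, out_dim,
                 kernel_size=5, stride=1, transposed=False, routing_iters=3):
        super().__init__()
        self.in_types, self.out_types, self.out_dim = in_types, out_types, out_dim
        self.routing_iters = routing_iters
        conv = nn.ConvTranspose2d if transposed else nn.Conv2d
        extra = {'output_padding': stride - 1} if transposed else {}
        self.transforms = nn.ModuleList([
            conv(in_dim, out_types * out_dim, kernel_size,
                 stride=stride, padding=kernel_size // 2, **extra)
            for _ in range(in_types)])

    def forward(self, x):
        B = x.size(0)
        # prediction vectors u_hat: (B, in_types, out_types, out_dim, H', W')
        u_hat = torch.stack([self.transforms[i](x[:, i])
                             for i in range(self.in_types)], dim=1)
        Hp, Wp = u_hat.shape[-2:]
        u_hat = u_hat.view(B, self.in_types, self.out_types, self.out_dim, Hp, Wp)

        # locally-constrained routing: logits kept per spatial position
        b = torch.zeros(B, self.in_types, self.out_types, Hp, Wp, device=x.device)
        for _ in range(self.routing_iters):
            c = F.softmax(b, dim=2)                           # over parent types
            s = (c.unsqueeze(3) * u_hat).sum(dim=1)           # weighted sum of votes
            v = squash(s, dim=2)                              # parent capsule vectors
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=3)       # agreement update
        return v

# toy usage: downsample a small capsule grid, then upsample it back
down = ConvCapsuleLayer(in_types=2, in_dim=16, out_types=4, out_dim=16, stride=2)
up = ConvCapsuleLayer(in_types=4, in_dim=16, out_types=2, out_dim=16,
                      stride=2, transposed=True)
x = torch.randn(1, 2, 16, 64, 64)
y = down(x)   # (1, 4, 16, 32, 32)
z = up(y)     # (1, 2, 16, 64, 64) -- the "deconvolutional capsule" direction
print(y.shape, z.shape)
```

In this sketch, swapping the forward convolution for a transposed convolution is the only change between the downsampling capsule layers in the encoder and the deconvolutional capsule layers in the decoder, which is the spirit of the U-Net-style SegCaps architecture.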

As an example application, we applied the proposed SegCaps to segment pathological lungs from low-dose CT scans and compared its accuracy and efficiency with U-Net-based architectures. SegCaps is able to handle large image sizes (512 x 512), whereas baseline capsule networks typically operate on images smaller than 32 x 32. The proposed SegCaps reduced the number of parameters of the U-Net architecture by 95.4% while still providing better segmentation accuracy.

Segmentation Results on the LUNA16 Dataset.

Feel free to leave any comments or questions about this project by signing in at the bottom of the page.
