Multimodal Models for semantic segmentation of 4-channel RGB-D images

Is it possible to train a multimodal model for semantic segmentation of 4-channel RGB-D images on the web platform?

Hi there - we don't support this kind of training at the moment. Is there a specific model you'd like to see us support? Open to integrating it!