3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model
In this paper, we build a multi-style generative model for stylish image captioning which uses multi-modality image features, ResNeXt features, and text features generated by DenseCap. We propose the 3M model, a Multi-UPDOWN caption model that encodes multi-modality features and decodes them into captions. We demonstrate the ef