Combine Methods
This page explains the methods that are supported by multimodal_transformers.tabular_combiner.TabularFeatCombiner.
See the tables below for details.
If you have rich categorical and numerical features, any of the attention, gating, or weighted sum methods are worth trying.
The following describes each supported method and whether or not it requires both categorical and numerical features.
Combine Feat Method  Description  Requires both cat and num features
text_only  Uses just the text columns as processed by the transformer before the final classifier layer(s). Essentially equivalent to HuggingFace's ForSequenceClassification models.  False
concat  Concatenates transformer output, numerical feats, and categorical feats all at once before the final classifier layer(s).  False
mlp_on_categorical_then_concat  MLP on categorical feats, then concatenation of transformer output, numerical feats, and processed categorical feats before the final classifier layer(s).  False (requires cat feats)
individual_mlps_on_cat_and_numerical_feats_then_concat  Separate MLPs on categorical feats and numerical feats, then concatenation of transformer output, processed numerical feats, and processed categorical feats before the final classifier layer(s).  False
mlp_on_concatenated_cat_and_numerical_feats_then_concat  MLP on concatenated categorical and numerical feats, then concatenation with transformer output before the final classifier layer(s).  True
attention_on_cat_and_numerical_feats  Attention-based summation of transformer outputs, numerical feats, and categorical feats, queried by transformer outputs, before the final classifier layer(s).  False
gating_on_cat_and_num_feats_then_sum  Gated summation of transformer outputs, numerical feats, and categorical feats before the final classifier layer(s). Inspired by Integrating Multimodal Information in Large Pretrained Transformers, which performs the mechanism for each token.  False
weighted_feature_sum_on_transformer_cat_and_numerical_feats  Learnable weighted feature-wise sum of transformer outputs, numerical feats, and categorical feats for each feature dimension before the final classifier layer(s).  False
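The requirements column can be turned into a small lookup for checking a method against the feature types a dataset actually has. The sketch below only transcribes the table; the names `REQUIRES_BOTH_CAT_AND_NUM`, `REQUIRES_CAT`, `is_method_usable`, and `usable_methods` are hypothetical helpers for illustration and are not part of multimodal_transformers.

```python
# Requirements transcribed from the table above.
# These helpers are hypothetical and NOT part of multimodal_transformers.
REQUIRES_BOTH_CAT_AND_NUM = {
    "text_only": False,
    "concat": False,
    "mlp_on_categorical_then_concat": False,
    "individual_mlps_on_cat_and_numerical_feats_then_concat": False,
    "mlp_on_concatenated_cat_and_numerical_feats_then_concat": True,
    "attention_on_cat_and_numerical_feats": False,
    "gating_on_cat_and_num_feats_then_sum": False,
    "weighted_feature_sum_on_transformer_cat_and_numerical_feats": False,
}

# Methods the table marks as needing categorical features specifically.
REQUIRES_CAT = {"mlp_on_categorical_then_concat"}


def is_method_usable(method: str, has_cat: bool, has_num: bool) -> bool:
    """Return True if `method` is compatible with the available feature types."""
    if method not in REQUIRES_BOTH_CAT_AND_NUM:
        raise ValueError(f"Unknown combine method: {method!r}")
    if REQUIRES_BOTH_CAT_AND_NUM[method] and not (has_cat and has_num):
        return False
    if method in REQUIRES_CAT and not has_cat:
        return False
    return True


def usable_methods(has_cat: bool, has_num: bool) -> list:
    """List every combine method compatible with the available feature types."""
    return [
        name
        for name in REQUIRES_BOTH_CAT_AND_NUM
        if is_method_usable(name, has_cat, has_num)
    ]
```

For example, `usable_methods(has_cat=False, has_num=True)` leaves out the two methods that the table marks as needing categorical features.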
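The following table shows the equations involved in each method. First, we define some notation:
$\mathbf{m}$ denotes the combined multimodal features
$\mathbf{x}$ denotes the output text features from the transformer
$\mathbf{c}$ denotes the categorical features
$\mathbf{n}$ denotes the numerical features
$h_{\mathbf{\Theta}}$ denotes an MLP parameterized by $\mathbf{\Theta}$
$\mathbf{W}$ denotes a weight matrix
$b$ denotes a scalar bias
$\Vert$ denotes concatenation and $\odot$ denotes the element-wise product
Combine Feat Method  Equation
text_only  $\mathbf{m} = \mathbf{x}$
concat  $\mathbf{m} = \mathbf{x} \, \Vert \, \mathbf{c} \, \Vert \, \mathbf{n}$
mlp_on_categorical_then_concat  $\mathbf{m} = \mathbf{x} \, \Vert \, h_{\mathbf{\Theta}}(\mathbf{c}) \, \Vert \, \mathbf{n}$
individual_mlps_on_cat_and_numerical_feats_then_concat  $\mathbf{m} = \mathbf{x} \, \Vert \, h_{\mathbf{\Theta_c}}(\mathbf{c}) \, \Vert \, h_{\mathbf{\Theta_n}}(\mathbf{n})$
mlp_on_concatenated_cat_and_numerical_feats_then_concat  $\mathbf{m} = \mathbf{x} \, \Vert \, h_{\mathbf{\Theta}}(\mathbf{c} \, \Vert \, \mathbf{n})$
attention_on_cat_and_numerical_feats  $\mathbf{m} = \alpha_{x,x}\mathbf{W}_x\mathbf{x} + \alpha_{x,c}\mathbf{W}_c\mathbf{c} + \alpha_{x,n}\mathbf{W}_n\mathbf{n}$ where $\alpha_{i,j} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^T [\mathbf{W}_i\mathbf{x}_i \, \Vert \, \mathbf{W}_j\mathbf{x}_j]\right)\right)}{\sum_{k \in \{x, c, n\}} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^T [\mathbf{W}_i\mathbf{x}_i \, \Vert \, \mathbf{W}_k\mathbf{x}_k]\right)\right)}$ and $\mathbf{x}_i$ stands for the feature vector $i \in \{x, c, n\}$
gating_on_cat_and_num_feats_then_sum  $\mathbf{m} = \mathbf{x} + \alpha\mathbf{h}$, $\mathbf{h} = \mathbf{g_c} \odot (\mathbf{W}_c\mathbf{c}) + \mathbf{g_n} \odot (\mathbf{W}_n\mathbf{n}) + b_h$, $\alpha = \min\left(\frac{\|\mathbf{x}\|_2}{\|\mathbf{h}\|_2}\beta, 1\right)$, $\mathbf{g}_i = R(\mathbf{W}_{gi}[\mathbf{i} \, \Vert \, \mathbf{x}] + b_i)$ where $\beta$ is a hyperparameter and $R$ is an activation function
weighted_feature_sum_on_transformer_cat_and_numerical_feats  $\mathbf{m} = \mathbf{x} + \mathbf{W}_{c'} \odot \mathbf{W}_c\mathbf{c} + \mathbf{W}_{n'} \odot \mathbf{W}_n\mathbf{n}$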
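To make the gated summation more concrete, here is a minimal pure-Python sketch for a single example. It assumes the form m = x + alpha * h, with h = g_c * (W_c c) + g_n * (W_n n) + b_h, gates g_i = R(W_gi [i || x] + b_i), and alpha = min(beta * ||x|| / ||h||, 1). The choice of sigmoid for R, the handling of a zero-norm h, and all function names are assumptions made for illustration; this is not the library's implementation.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def l2_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(a * a for a in v))


def sigmoid(z):
    """Activation R; sigmoid is an assumption chosen for illustration."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(W_g, feat, x, b):
    """g = R(W_g [feat || x] + b), where `feat + x` is list concatenation."""
    return [sigmoid(z + b) for z in matvec(W_g, feat + x)]


def gated_sum(x, c, n, W_c, W_n, W_gc, W_gn, b_c, b_n, b_h, beta):
    """Combine text features x with categorical c and numerical n features."""
    g_c = gate(W_gc, c, x, b_c)
    g_n = gate(W_gn, n, x, b_n)
    proj_c = matvec(W_c, c)  # project c into the text feature space
    proj_n = matvec(W_n, n)  # project n into the text feature space
    h = [
        gc * pc + gn * pn + b_h
        for gc, pc, gn, pn in zip(g_c, proj_c, g_n, proj_n)
    ]
    h_norm = l2_norm(h)
    # Cap the tabular contribution relative to the text features.
    # Using alpha = 1 when h is all zeros is an assumption of this sketch.
    alpha = min(l2_norm(x) / h_norm * beta, 1.0) if h_norm > 0 else 1.0
    return [xi + alpha * hi for xi, hi in zip(x, h)]
```

In this sketch the scaling factor alpha keeps the tabular term h from dominating the text features: a smaller beta shrinks h further whenever its norm is large relative to that of x.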