Combine Methods
This page explains the combine methods supported by multimodal_transformers.tabular_combiner.TabularFeatCombiner. See the tables below for details.
If you have rich categorical and numerical features, any of the attention, gating, or weighted sum methods are worth trying.
The following describes each supported method and whether or not it requires both categorical and numerical features.
Combine Feat Method | Description | Requires both cat and num features |
---|---|---|
text_only | Uses just the text columns as processed by the transformer before final classifier layer(s). Essentially equivalent to HuggingFace's ForSequenceClassification models | False |
concat | Concatenate transformer output, numerical feats, and categorical feats all at once before final classifier layer(s) | False |
mlp_on_categorical_then_concat | MLP on categorical feats then concat transformer output, numerical feats, and processed categorical feats before final classifier layer(s) | False (Requires cat feats) |
individual_mlps_on_cat_and_numerical_feats_then_concat | Separate MLPs on categorical feats and numerical feats, then concatenation of transformer output, processed numerical feats, and processed categorical feats before final classifier layer(s) | False |
mlp_on_concatenated_cat_and_numerical_feats_then_concat | MLP on concatenated categorical and numerical feat then concatenated with transformer output before final classifier layer(s) | True |
attention_on_cat_and_numerical_feats | Attention based summation of transformer outputs, numerical feats, and categorical feats queried by transformer outputs before final classifier layer(s). | False |
gating_on_cat_and_num_feats_then_sum | Gated summation of transformer outputs, numerical feats, and categorical feats before final classifier layer(s). Inspired by "Integrating Multimodal Information in Large Pretrained Transformers", which applies the mechanism to each token. | False |
weighted_feature_sum_on_transformer_cat_and_numerical_feats | Learnable weighted feature-wise sum of transformer outputs, numerical feats and categorical feats for each feature dimension before final classifier layer(s) | False |
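The requirements column above can be turned into a small lookup to check which methods are usable for a given dataset. The following is a minimal sketch, not part of the library: the `REQUIREMENTS` dictionary and the `usable_methods` helper are hypothetical names, and the entries are transcribed from the table (including the note that mlp_on_categorical_then_concat requires categorical feats).

```python
# Feature requirements transcribed from the table above.
# REQUIREMENTS and usable_methods are hypothetical names for illustration only.
REQUIREMENTS = {
    "text_only": set(),
    "concat": set(),
    "mlp_on_categorical_then_concat": {"cat"},
    "individual_mlps_on_cat_and_numerical_feats_then_concat": set(),
    "mlp_on_concatenated_cat_and_numerical_feats_then_concat": {"cat", "num"},
    "attention_on_cat_and_numerical_feats": set(),
    "gating_on_cat_and_num_feats_then_sum": set(),
    "weighted_feature_sum_on_transformer_cat_and_numerical_feats": set(),
}


def usable_methods(has_cat, has_num):
    """Return the method names whose feature requirements are met."""
    available = set()
    if has_cat:
        available.add("cat")
    if has_num:
        available.add("num")
    # A method is usable when everything it needs is available.
    return [name for name, needed in REQUIREMENTS.items() if needed <= available]


# Example: a dataset with numerical columns but no categorical columns.
for name in usable_methods(has_cat=False, has_num=True):
    print(name)
```

The resulting name would then be passed as the combine method when configuring the combiner; how that is done depends on the library version in use.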
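This table shows the equations involved in each method. First, we define some notation:
$\mathbf{m}$ denotes the combined multimodal features
$\mathbf{x}$ denotes the output text features from the transformer
$\mathbf{c}$ denotes the categorical features
$\mathbf{n}$ denotes the numerical features
$h_{\mathbf{\Theta}}$ denotes an MLP parameterized by $\mathbf{\Theta}$
$\mathbf{W}$ denotes a weight matrix
$b$ denotes a scalar bias
$\Vert$ denotes concatenation and $\odot$ denotes element-wise multiplication
Combine Feat Method | Equation |
---|---|
text_only | $\mathbf{m} = \mathbf{x}$ |
concat | $\mathbf{m} = \mathbf{x} \,\Vert\, \mathbf{c} \,\Vert\, \mathbf{n}$ |
mlp_on_categorical_then_concat | $\mathbf{m} = \mathbf{x} \,\Vert\, h_{\mathbf{\Theta}}(\mathbf{c}) \,\Vert\, \mathbf{n}$ |
individual_mlps_on_cat_and_numerical_feats_then_concat | $\mathbf{m} = \mathbf{x} \,\Vert\, h_{\mathbf{\Theta_c}}(\mathbf{c}) \,\Vert\, h_{\mathbf{\Theta_n}}(\mathbf{n})$ |
mlp_on_concatenated_cat_and_numerical_feats_then_concat | $\mathbf{m} = \mathbf{x} \,\Vert\, h_{\mathbf{\Theta}}(\mathbf{c} \,\Vert\, \mathbf{n})$ |
attention_on_cat_and_numerical_feats | $\mathbf{m} = \alpha_{x,x}\mathbf{W}_x\mathbf{x} + \alpha_{x,c}\mathbf{W}_c\mathbf{c} + \alpha_{x,n}\mathbf{W}_n\mathbf{n}$ where $\alpha_{i,j} = \dfrac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^T[\mathbf{W}_i\mathbf{x}_i \,\Vert\, \mathbf{W}_j\mathbf{x}_j]\right)\right)}{\sum_{k \in \{x, c, n\}} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^T[\mathbf{W}_i\mathbf{x}_i \,\Vert\, \mathbf{W}_k\mathbf{x}_k]\right)\right)}$ |
gating_on_cat_and_num_feats_then_sum | $\mathbf{m} = \mathbf{x} + \alpha\mathbf{h}$, $\mathbf{h} = \mathbf{g}_c \odot (\mathbf{W}_c\mathbf{c}) + \mathbf{g}_n \odot (\mathbf{W}_n\mathbf{n}) + b_h$, $\alpha = \min\left(\dfrac{\lVert\mathbf{x}\rVert_2}{\lVert\mathbf{h}\rVert_2}\beta,\ 1\right)$, $\mathbf{g}_i = R(\mathbf{W}_{g_i}[\mathbf{i} \,\Vert\, \mathbf{x}] + b_i)$ where $\beta$ is a hyperparameter and $R$ is an activation function |
weighted_feature_sum_on_transformer_cat_and_numerical_feats | $\mathbf{m} = \mathbf{x} + \mathbf{W}_{c'} \odot \mathbf{W}_c\mathbf{c} + \mathbf{W}_{n'} \odot \mathbf{W}_n\mathbf{n}$ |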
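To make three of the methods concrete, here is a minimal pure-Python sketch operating on plain lists. It is illustrative only and is not the library's implementation: the function names, the toy weights, the choice of sigmoid as the activation, and the default `beta=0.2` are all assumptions. The sketch assumes concat joins the three feature vectors, the weighted sum adds element-wise weighted projections of the tabular features to the text features, and the gated sum adds a gated, norm-limited tabular vector `h` to the text features.

```python
import math


def matvec(W, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def norm(v):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def concat(x, c, n):
    """concat: m = x || c || n."""
    return x + c + n


def weighted_feature_sum(x, c, n, W_c, W_n, w_c, w_n):
    """m = x + w_c * (W_c c) + w_n * (W_n n), with element-wise weights."""
    pc, pn = matvec(W_c, c), matvec(W_n, n)
    return [xi + a * p + b * q for xi, a, p, b, q in zip(x, w_c, pc, w_n, pn)]


def gated_sum(x, c, n, W_c, W_n, W_gc, W_gn, b_c, b_n, b_h, beta=0.2):
    """m = x + alpha * h, with alpha = min(beta * ||x|| / ||h||, 1)."""
    # Gates g_i = R(W_gi [i || x] + b_i); R is a sigmoid here (an assumption).
    g_c = [sigmoid(z + b_c) for z in matvec(W_gc, c + x)]
    g_n = [sigmoid(z + b_n) for z in matvec(W_gn, n + x)]
    pc, pn = matvec(W_c, c), matvec(W_n, n)
    h = [gc * p + gn * q + b_h for gc, p, gn, q in zip(g_c, pc, g_n, pn)]
    h_norm = norm(h)
    # Scale h so that it cannot shift x by more than beta * ||x||.
    alpha = min(beta * norm(x) / h_norm, 1.0) if h_norm > 0 else 1.0
    return [xi + alpha * hi for xi, hi in zip(x, h)]


# Toy inputs: 2-d text features, a 3-way one-hot categorical, one numerical.
x, c, n = [1.0, 2.0], [1.0, 0.0, 0.0], [0.5]
W_c = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # projects c to the text dimension
W_n = [[2.0], [4.0]]                      # projects n to the text dimension
w_c, w_n = [0.5, 0.5], [1.0, 0.0]         # element-wise feature weights
W_gc = [[0.0] * 5 for _ in range(2)]      # gate weights over [c || x]
W_gn = [[0.0] * 3 for _ in range(2)]      # gate weights over [n || x]

m_concat = concat(x, c, n)
m_weighted = weighted_feature_sum(x, c, n, W_c, W_n, w_c, w_n)
m_gated = gated_sum(x, c, n, W_c, W_n, W_gc, W_gn, 0.0, 0.0, 0.0)
```

Note the difference in output size: concat grows the feature vector passed to the classifier, while the weighted and gated sums keep the dimension of the text features because the tabular features are first projected to that dimension.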