Model output - What does the model output correspond to?
by tmwalsh
I am trying to extract FAbCon's final sequence embeddings for a set of amino acid sequences. What do the dimensions of the output correspond to?
The model call returns a transformers.modeling_outputs.CausalLMOutputWithCrossAttentions object. Note that this object does not have a last_hidden_state attribute; you need to pass output_hidden_states=True and read the final entry of output.hidden_states instead. So if you want to do the typical thing of using the EOS-token embedding as the sequence embedding, you would do something like:
import torch
from transformers import PreTrainedTokenizerFast, FalconForCausalLM

tokenizer = PreTrainedTokenizerFast.from_pretrained("alchemab/fabcon-large")
model = FalconForCausalLM.from_pretrained("alchemab/fabcon-large")

... ## --> Batching and tokenizing your inputs

# The CausalLM output has no last_hidden_state attribute, so request
# the per-layer hidden states explicitly
with torch.no_grad():
    output = model(**input_batch, output_hidden_states=True)
final_hidden = output.hidden_states[-1]  # final layer, shape B x L x D

# Index of the last non-padding token (the EOS token) in each sequence
last_token_indices = input_batch['attention_mask'].sum(dim=1) - 1
batch_embeddings = final_hidden[torch.arange(final_hidden.size(0)), last_token_indices, :].cpu().numpy()
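In case it helps, here is one way the elided batching/tokenizing step might look. This is a minimal sketch, assuming the FAbCon tokenizer behaves like a standard Hugging Face fast tokenizer with padding enabled and that it appends the EOS token itself; the sequences below are placeholders, so check the model card for the expected input format:

sequences = ["EVQLVESGGGLVQ", "QVQLQESGPGLVK"]  # hypothetical amino acid sequences, not real data

input_batch = tokenizer(
    sequences,
    padding=True,          # pad to the longest sequence in the batch
    return_tensors="pt",   # return PyTorch tensors, including attention_mask
)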
Adding to Justin’s point above, the hidden-state tensor is of shape
B x L x D
where D corresponds to the model’s hidden size (e.g. FAbCon-small has a D of 768), B is the batch size (i.e. the number of sequences), and L is the sequence length, typically the length of the longest antibody sequence you provide, due to padding.
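As a quick sanity check on those dimensions, you can print the shapes directly; the [2, 14, 768] figure here is illustrative only and assumes a two-sequence batch run through fabcon-small:

final_hidden = output.hidden_states[-1]
print(final_hidden.shape)       # torch.Size([B, L, D]), e.g. torch.Size([2, 14, 768])
print(batch_embeddings.shape)   # (B, D): one D-dimensional embedding per sequence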