Model output - What does the model output correspond to?

#1
by tmwalsh - opened

I am trying to extract FAbCon's final sequence embeddings for a set of amino acid sequences. What do the dimensions of the output correspond to?

The model call outputs a transformers.modeling_outputs.CausalLMOutputWithCrossAttentions object.
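
If it helps, you can inspect which fields that output actually carries. A quick sketch (the example sequence is made up); note that hidden states are only populated when you ask for them:

from transformers import PreTrainedTokenizerFast, FalconForCausalLM
import torch

tokenizer = PreTrainedTokenizerFast.from_pretrained("alchemab/fabcon-large")
model = FalconForCausalLM.from_pretrained("alchemab/fabcon-large")

# Hypothetical single sequence, just to look at the output structure
batch = tokenizer(["EVQLVESGGGLVQPGG"], return_tensors="pt")
with torch.no_grad():
    out = model(**batch, output_hidden_states=True)

print(type(out).__name__)   # CausalLMOutputWithCrossAttentions
print(list(out.keys()))     # e.g. ['logits', 'past_key_values', 'hidden_states']
print(out.logits.shape)     # (batch, sequence_length, vocab_size)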

So if you want to do the typical thing of using the EOS token embedding as the sequence embedding, you would do something like:

from transformers import PreTrainedTokenizerFast, FalconForCausalLM
import torch

tokenizer = PreTrainedTokenizerFast.from_pretrained("alchemab/fabcon-large")
model = FalconForCausalLM.from_pretrained("alchemab/fabcon-large")

... ## --> Batching and tokenizing your inputs

# The causal-LM output has no `last_hidden_state` attribute of its own, so
# request the hidden states and take the final layer.
with torch.no_grad():
    output = model(**input_batch, output_hidden_states=True)
last_hidden_state = output.hidden_states[-1]  # shape: (batch, seq_len, hidden_dim)

# With right padding, the last non-padding token (the EOS token) sits at
# index attention_mask.sum() - 1 in each row.
last_token_indices = input_batch['attention_mask'].sum(dim=1) - 1
batch_embeddings = last_hidden_state[
    torch.arange(last_hidden_state.size(0)), last_token_indices, :
].cpu().numpy()
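
For the elided batching step, here is a minimal sketch of one way to do it. The sequences are made up, and it assumes the tokenizer right-pads (its default); check the model card for the exact input format, and append an EOS token yourself if the tokenizer does not add one automatically, since the last-token indexing above relies on it being there:

import torch
from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained("alchemab/fabcon-large")

# Hypothetical antibody sequences; substitute your own
sequences = ["EVQLVESGGGLVQPGGSLRLSCAAS", "QVQLQESGPGLVKPSETLSLTCTVS"]

input_batch = tokenizer(
    sequences,
    padding=True,        # right-pads to the longest sequence in the batch
    return_tensors="pt",
)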

Adding to Justin's point above, the output tensor is of shape

B x L x D

where B is the batch size (i.e. the number of sequences), L is the sequence length (typically the length of the longest antibody sequence in your input, due to padding), and D is the model's hidden dimension (e.g. FAbCon-small has a D of 768).
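
As a concrete illustration (the batch size and padded length here are made up; D = 768 matches fabcon-small):

print(output.hidden_states[-1].shape)
# torch.Size([2, 160, 768]) -> B = 2 sequences, L = 160 tokens after padding, D = 768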
