prokbert.prokbert_tokenizer.ProkBERTTokenizer.decode

ProkBERTTokenizer.decode(ids: Union[List[int], Tensor]) → str

Decodes a list of token IDs back to the original DNA sequence.

This method maps each token ID back to its corresponding token and then concatenates the tokens to form the sequence. It accepts token IDs as either a list of integers or a PyTorch tensor.

Parameters

ids (Union[List[int], torch.Tensor]) – Token IDs to be decoded. Can be a list of integers or a PyTorch tensor.

Returns

The decoded DNA sequence as a string.

Return type

str

Usage Example:
>>> tokenizer = ProkBERTTokenizer(...)
>>> ids = [213, 3343]
>>> sequence = tokenizer.decode(ids)
>>> print(sequence)
...
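
The ID-to-token lookup and concatenation described above can be illustrated with a minimal, standalone sketch. It is not the ProkBERT implementation: the function `decode_sketch` and the vocabulary `TOY_ID_TO_TOKEN` are hypothetical names invented for this example, the IDs are made up, and the sketch assumes non-overlapping tokens, which may not match how the real tokenizer reassembles a sequence. Tensor input is handled by calling `tolist()` when the object provides it, as PyTorch tensors do.

```python
from typing import Dict

# Hypothetical toy vocabulary for illustration only;
# the real ProkBERT vocabulary and its token IDs differ.
TOY_ID_TO_TOKEN: Dict[int, str] = {0: "ACG", 1: "TTA", 2: "GGC"}


def decode_sketch(ids, id_to_token: Dict[int, str] = TOY_ID_TO_TOKEN) -> str:
    """Map token IDs to tokens and join them into one string.

    Assumes non-overlapping tokens, so plain concatenation suffices.
    """
    # Tensor-like inputs (e.g. torch.Tensor) expose .tolist();
    # plain Python lists pass through unchanged.
    if hasattr(ids, "tolist"):
        ids = ids.tolist()
    # Look up each ID; an unknown ID raises KeyError in this sketch.
    tokens = [id_to_token[int(i)] for i in ids]
    return "".join(tokens)


print(decode_sketch([0, 1, 2]))  # prints ACGTTAGGC
```

With PyTorch installed, a call such as `decode_sketch(torch.tensor([0, 1, 2]))` would follow the same path through `tolist()`.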