prokbert.prokbert_tokenizer.ProkBERTTokenizer.decode
- ProkBERTTokenizer.decode(ids: Union[List[int], Tensor]) → str
Decodes a list of token IDs back to the original DNA sequence.
This method maps each token ID to its corresponding token and then concatenates the tokens to reconstruct the sequence. The IDs may be given as a list of integers or as a PyTorch tensor.
- Parameters
ids (Union[List[int], torch.Tensor]) – Token IDs to be decoded. Can be a list of integers or a PyTorch tensor.
- Returns
The decoded DNA sequence as a string.
- Return type
str
- Usage Example:
>>> tokenizer = ProkBERTTokenizer(...)
>>> ids = [213, 3343]
>>> sequence = tokenizer.decode(ids)
>>> print(sequence)
...
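The ID-to-token-to-string flow described for this method can be illustrated with a small standalone sketch. It is not the ProkBERT implementation: the vocabulary `TOY_ID_TO_TOKEN` and the function `toy_decode` are hypothetical names invented for illustration, the tokens are assumed to be non-overlapping k-mers, and special tokens are ignored. Tensor input is handled by duck typing on `.tolist()`, so the sketch runs without PyTorch installed.

```python
from typing import Dict

# Hypothetical toy vocabulary: these IDs and k-mers are illustrative only
# and are not taken from the ProkBERT vocabulary.
TOY_ID_TO_TOKEN: Dict[int, str] = {0: "ATG", 1: "CCA", 2: "GGT"}


def toy_decode(ids) -> str:
    """Map each ID to its token and join the tokens into one string."""
    # Tensor-like inputs (e.g. torch.Tensor) expose .tolist(); lists do not.
    if hasattr(ids, "tolist"):
        ids = ids.tolist()
    # Assumes non-overlapping tokens, so plain concatenation is enough.
    return "".join(TOY_ID_TO_TOKEN[int(i)] for i in ids)


print(toy_decode([0, 1, 2]))  # ATGCCAGGT
```

If a tokenizer produces overlapping k-mers, joining tokens naively would repeat the shared bases, so a real decoder would need to merge the overlaps. How ProkBERTTokenizer.decode handles that case is not stated here.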