prokbert.sequtils.segment_sequence_contiguous
- prokbert.sequtils.segment_sequence_contiguous(sequence, params, sequence_id=nan)
Split a sequence into consecutive, non-overlapping segments that cover it end to end.
Segments shorter than the minimum length are discarded. The function returns a list of segments together with their positions in the original sequence.
- Parameters
sequence (str) – The input nucleotide sequence to be segmented.
params (dict) – Dictionary containing the segmentation parameters. Must contain the ‘min_length’ and ‘max_length’ keys, which specify the minimum and maximum segment lengths, respectively.
sequence_id (numeric, optional) – An identifier for the sequence. Defaults to NaN.
- Returns
A list of segments. Each dictionary in the list represents one segment and contains the segment’s sequence, start position, end position, and sequence ID.
- Return type
list of dict
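The behaviour described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the function name `contiguous_segments_sketch` and the dictionary keys (`segment`, `segment_start`, `segment_end`, `sequence_id`) are hypothetical, and the exclusive end position is an assumption. Check the actual keys and conventions against the output of `segment_sequence_contiguous` itself.

```python
import math


def contiguous_segments_sketch(sequence, params, sequence_id=math.nan):
    """Illustrative re-implementation; key names are assumptions, not the library's."""
    min_length = params["min_length"]
    max_length = params["max_length"]

    segments = []
    # Step through the sequence in strides of max_length so that
    # consecutive pieces touch but never overlap.
    for start in range(0, len(sequence), max_length):
        piece = sequence[start:start + max_length]
        # Only the trailing piece can be shorter than max_length;
        # drop it if it falls below the minimum length.
        if len(piece) < min_length:
            continue
        segments.append(
            {
                "segment": piece,                  # hypothetical key name
                "segment_start": start,            # 0-based, inclusive (assumed)
                "segment_end": start + len(piece),  # exclusive (assumed)
                "sequence_id": sequence_id,
            }
        )
    return segments


example = contiguous_segments_sketch(
    "ACGTACGTAC", {"min_length": 3, "max_length": 4}, sequence_id=7
)
for seg in example:
    print(seg)
```

With a 10-nucleotide input and `max_length` of 4, the sketch yields two full segments; the 2-nucleotide remainder is discarded because it is shorter than `min_length`.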