pyhiv.split package
Submodules
pyhiv.split.split module
- pyhiv.split.split.get_gene_region(test_aligned, ref_aligned, aligned_gene_ranges)
Identify the gene region(s) with the highest alignment score.
- Parameters:
test_aligned (str) – The aligned test sequence (with gaps).
ref_aligned (str) – The aligned reference sequence (with gaps).
aligned_gene_ranges (dict) – Dictionary mapping gene names to (start, end) positions in the alignment coordinates (0-based).
- Returns:
A list of gene names corresponding to the region(s) with the highest alignment score. If multiple genes share the same maximum score, all of them are returned.
- Return type:
list
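The tie-handling behaviour described above can be illustrated with a minimal sketch. This is not the pyhiv implementation: the source does not say how the alignment score is computed, so the function below (the hypothetical `sketch_get_gene_region`) assumes the score is the number of columns where test and reference carry the same non-gap character, and that `end` is exclusive. The gene names and sequences are made up for illustration.

```python
def sketch_get_gene_region(test_aligned, ref_aligned, aligned_gene_ranges):
    """Illustrative stand-in, not pyhiv's code: pick the best-scoring gene(s)."""
    scores = {}
    for gene, (start, end) in aligned_gene_ranges.items():
        # Assumption: `end` is exclusive, and the score counts columns where
        # both sequences hold the same non-gap character.
        scores[gene] = sum(
            1
            for t, r in zip(test_aligned[start:end], ref_aligned[start:end])
            if t == r and t != "-"
        )
    if not scores:
        return []
    best = max(scores.values())
    # Every gene that reaches the maximum score is returned, so ties are kept.
    return [gene for gene, score in scores.items() if score == best]


# Toy alignment: the test sequence covers the middle of the reference only.
ref_aligned = "ATGCATGCATGC"
test_aligned = "----ATGCAT--"
ranges = {"gag": (0, 4), "pol": (4, 8), "env": (8, 12)}

best_genes = sketch_get_gene_region(test_aligned, ref_aligned, ranges)
tied_genes = sketch_get_gene_region(
    test_aligned, ref_aligned, {"a": (4, 6), "b": (6, 8)}
)
```

Under these assumptions `best_genes` contains only `"pol"`, while `tied_genes` contains both `"a"` and `"b"` because they score equally.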
- pyhiv.split.split.get_present_gene_regions(test_aligned, aligned_gene_ranges)
Identify gene regions that contain at least one base (non-gap) in the aligned test sequence.
- Parameters:
test_aligned (str) – The aligned test sequence (with gaps).
aligned_gene_ranges (dict) – Dictionary mapping gene names to (start, end) positions in the alignment coordinates (0-based).
- Returns:
A list of gene names where the test sequence contains non-gap characters within the region.
- Return type:
list
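A minimal sketch of the presence check, written as a standalone function (the hypothetical `sketch_get_present_gene_regions`) rather than a call into pyhiv. It assumes gaps are written as `-` and that `end` is exclusive; neither detail is confirmed by the description above. The gene names and sequence are illustrative.

```python
def sketch_get_present_gene_regions(test_aligned, aligned_gene_ranges):
    """Illustrative stand-in, not pyhiv's code: list genes with any base."""
    present = []
    for gene, (start, end) in aligned_gene_ranges.items():
        # Assumption: `end` is exclusive and '-' is the only gap symbol.
        if any(ch != "-" for ch in test_aligned[start:end]):
            present.append(gene)
    return present


# Toy aligned test sequence: bases only in columns 4 to 9.
test_aligned = "----ATGCAT--"
ranges = {"gag": (0, 4), "pol": (4, 8), "env": (8, 12)}

present_genes = sketch_get_present_gene_regions(test_aligned, ranges)
```

Here `gag` is all gaps and is dropped, while `pol` and `env` each contain at least one base, so `present_genes` is `["pol", "env"]`.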
- pyhiv.split.split.map_ref_coords_to_alignment(ref_aligned)
Build a mapping from ungapped reference coordinates (as used in GenBank annotations) to gapped alignment columns.
- Parameters:
ref_aligned (str) – The aligned reference sequence (may contain ‘-’ characters representing gaps).
- Returns:
A dictionary mapping 1-based reference positions (without gaps) to 0-based alignment positions (with gaps).
- Return type:
dict
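The coordinate mapping can be sketched as follows. This is a standalone illustration (the hypothetical `sketch_map_ref_coords_to_alignment`), assuming `-` is the only gap character; it follows the documented convention of 1-based reference positions and 0-based alignment columns but is not taken from the pyhiv source.

```python
def sketch_map_ref_coords_to_alignment(ref_aligned):
    """Illustrative stand-in, not pyhiv's code: ungapped pos -> alignment col."""
    mapping = {}
    ref_pos = 0  # counts reference bases seen so far (1-based once incremented)
    for col, ch in enumerate(ref_aligned):
        if ch != "-":
            ref_pos += 1
            mapping[ref_pos] = col  # col is the 0-based alignment column
    return mapping


# Toy aligned reference with two gap runs.
coords = sketch_map_ref_coords_to_alignment("AT--GC-A")
# coords == {1: 0, 2: 1, 3: 4, 4: 5, 5: 7}
```

A gene annotated at reference positions 3 to 4 would then start at alignment column `coords[3]` (4) and its last base would sit at column `coords[4]` (5).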