pyhiv.report package
Submodules
pyhiv.report.constants module
Constants and configuration for PyHIV reporting module.
- class pyhiv.report.constants.GenePanelConfig[source]
Bases: object
Gene panel spacing and positioning configuration.
- ALIGNMENT_CLEARANCE = 0.243
- BOTTOM_MARGIN = 0.55
- NON_K03455_X_MAX_DEFAULT = 10000
- REV_CONNECTOR = 0.6075
- TAT_CONNECTOR = 0.09450000000000001
- TOP_MARGIN = 0.3
- X_PAD_MIN = 60
- Y_SCALE = 1.35
- class pyhiv.report.constants.K03455Config[source]
Bases: object
Configuration for K03455 special reference handling.
- DEFAULT_K03455_OFFSETS = (-0.15, 0.35)
- K03455_NUMERIC_OFFSETS = {"3' LTR": (0.25, -0.2), "5' LTR": (-0.15, -0.2), 'env': (-0.15, -0.2), 'gag': (0.3, 0.2), 'nef': (-0.15, -0.2), 'pol': (-0.15, -0.2), 'rev 1': (0.2, 0.25), 'rev 2': (0.2, 0.25), 'tat 1': (-0.15, 0.15), 'tat 2': (-0.15, 0.2), 'vif': (-0.15, -0.2), 'vpr': (0.2, -0.2), 'vpu': (0.35, 0.15)}
- TARGET_REGIONS = ["5' LTR", 'gag', 'pol', 'vif', 'vpr', 'vpu', 'tat 1', 'tat 2', 'rev 1', 'rev 2', 'env', 'nef', "3' LTR"]
- Y_POSITIONS = {"3' LTR": 0.0, "5' LTR": 0.0, 'env': 0.0, 'gag': 0.4, 'nef': 0.8, 'pol': 0.0, 'rev 1': 1.0, 'rev 2': 1.0, 'tat 1': 0.2, 'tat 2': 0.4, 'vif': 0.4, 'vpr': 0.8, 'vpu': 0.6}
- class pyhiv.report.constants.MetadataConfig[source]
Bases: object
Metadata block display configuration.
- FONTSIZE = 9.5
- INFO_TOP_Y = 0.78
- TITLE_Y = 1.06
- WRAP = 75
- class pyhiv.report.constants.NumericOffsets[source]
Bases: object
Vertical offsets for numeric labels (non-K03455 only).
- DEFAULT_OFFSETS = (-0.15, 0.35)
- GENE_OFFSET_MAP = {"3' ltr": (0.25, -0.2), "5' ltr": (-0.15, -0.2), 'env': (0.15, 0.15), 'gag': (-0.15, -0.15), 'gag-pol': (-0.15, -0.2), 'nef': (-0.15, -0.2), 'pol': (-0.15, -0.2), 'rev 1': (0.2, 0.25), 'rev 2': (0.2, 0.25), 'tat 1': (0.15, 0.15), 'tat 2': (0.15, 0.2), 'vif': (-0.15, -0.2), 'vpr': (0.15, -0.2), 'vpu': (0.15, 0.15)}
pyhiv.report.pdf_generator module
PDF report generation for PyHIV results.
- pyhiv.report.pdf_generator.render_sequence_page(pdf, sequence, accession, subtype, mm_region, present_regions, features_aln, ref_seq_aligned, user_seq_aligned, y_positions=None)[source]
Render a single sequence page in the PDF report.
- Parameters:
pdf (PdfPages) – The PdfPages object to save the figure into.
sequence (str) – The name or identifier of the sequence.
accession (str) – The accession number of the sequence.
subtype (str) – The subtype of the sequence.
mm_region (str) – The best-matching region of the sequence.
present_regions (List[str]) – List of present regions in the sequence.
features_aln (Dict[str, Tuple[int, int]]) – Dictionary of gene features with their alignment coordinate ranges.
ref_seq_aligned (str) – The reference sequence aligned (with gaps).
user_seq_aligned (str) – The user’s sequence aligned (with gaps).
y_positions (Optional[Dict[str, float]], optional) – Fixed y-positions for gene lanes, by default None (auto lanes).
pyhiv.report.reporter module
Main reporting class for PyHIV results.
- class pyhiv.report.reporter.PyHIVReporter(output_dir, log_level=20)[source]
Bases: object
Main class for generating PyHIV PDF reports.
- Parameters:
output_dir (Path)
- generate_report(final_table_path, sequences_with_locations_path, output_pdf_name='PyHIV_report_all_sequences.pdf')[source]
Generate PDF report from PyHIV results.
- Parameters:
final_table_path (Path) – Path to final_table.tsv file.
sequences_with_locations_path (Path) – Path to sequences_with_locations.tsv file.
output_pdf_name (str, optional) – Name of the output PDF file, by default “PyHIV_report_all_sequences.pdf”
- Returns:
Path to the generated PDF report.
- Return type:
Path
pyhiv.report.utils module
Utility functions for PyHIV reporting module.
- pyhiv.report.utils.build_alignment_path(sequence, alignments_dir)[source]
Build path to alignment FASTA file.
- Parameters:
sequence (str) – The name or identifier of the sequence.
alignments_dir (Path) – The directory containing alignment FASTA files.
- Returns:
The path to the alignment FASTA file.
- Return type:
Path
- pyhiv.report.utils.build_ref_to_alignment_map(ref_aligned)[source]
Build mapping from reference coordinates to alignment coordinates.
- Parameters:
ref_aligned (str) – The reference sequence with alignment gaps.
- Returns:
A tuple of (dictionary mapping reference positions to alignment indices, length of the aligned reference sequence).
- Return type:
Tuple[Dict[int, int], int]
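The mapping described above can be illustrated with a small standalone sketch. This is not the PyHIV implementation: the function name `build_ref_to_alignment_map_sketch` is hypothetical, and it assumes that gaps are written as `-`, that reference positions are 1-based, that alignment indices are 0-based, and that the returned length counts gap columns.

```python
from typing import Dict, Tuple


def build_ref_to_alignment_map_sketch(
    ref_aligned: str, gap: str = "-"
) -> Tuple[Dict[int, int], int]:
    """Map ungapped reference positions to alignment column indices.

    Assumptions (not confirmed by the PyHIV docs): reference positions
    are 1-based, alignment indices are 0-based, and gaps are '-'.
    """
    ref_map: Dict[int, int] = {}
    ref_pos = 0
    for aln_idx, char in enumerate(ref_aligned):
        if char != gap:
            # Only non-gap columns consume a reference position.
            ref_pos += 1
            ref_map[ref_pos] = aln_idx
    return ref_map, len(ref_aligned)


ref_map, aln_len = build_ref_to_alignment_map_sketch("AC--GT")
print(ref_map)  # {1: 0, 2: 1, 3: 4, 4: 5}
print(aln_len)  # 6
```

Gap columns in the reference never appear as values in the mapping, which is why reference position 3 jumps to alignment index 4 in this example.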
- pyhiv.report.utils.canon_label(label)[source]
Canonicalize gene label for K03455.
- Parameters:
label (str) – The input gene label.
- Returns:
The canonical gene label, or None if not recognized.
- Return type:
Optional[str]
- pyhiv.report.utils.first_last_nongap_idx(seq)[source]
Return the first and last indices of non-gap characters in a sequence.
- Parameters:
seq (str) – The input sequence with gaps.
- Returns:
A tuple containing the first and last indices of non-gap characters.
- Return type:
Tuple[int, int]
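As a rough illustration of what this helper computes, the sketch below finds the span of non-gap characters in an aligned sequence. The name `first_last_nongap_idx_sketch` is hypothetical, the gap character is assumed to be `-`, and raising `ValueError` for an all-gap sequence is a choice made here; the documentation does not say how PyHIV handles that case.

```python
from typing import Tuple


def first_last_nongap_idx_sketch(seq: str, gap: str = "-") -> Tuple[int, int]:
    """Return 0-based indices of the first and last non-gap characters."""
    positions = [i for i, char in enumerate(seq) if char != gap]
    if not positions:
        # Assumption: an all-gap sequence is treated as an error here.
        raise ValueError("sequence contains only gaps")
    return positions[0], positions[-1]


span = first_last_nongap_idx_sketch("--ACG-T--")
print(span)  # (2, 6)
```

Internal gaps (index 5 in the example) do not affect the result; only the leading and trailing gap runs are excluded from the span.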
- pyhiv.report.utils.get_numeric_offsets_non_special(gene)[source]
Get numeric offsets for non-K03455 references using NumericOffsets.
- Parameters:
gene (str) – The gene name.
- Returns:
A tuple containing (start_offset, end_offset).
- Return type:
tuple[float, float]
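A lookup of this kind might look like the sketch below, which copies a few entries from `NumericOffsets.GENE_OFFSET_MAP` and the `DEFAULT_OFFSETS` value listed earlier. The function name `get_numeric_offsets_sketch` is hypothetical, and the normalization step (stripping whitespace and lowercasing) is an assumption suggested by the lowercase keys in the documented map, not confirmed behavior.

```python
from typing import Dict, Tuple

# Subset of the documented NumericOffsets.GENE_OFFSET_MAP values.
GENE_OFFSET_MAP: Dict[str, Tuple[float, float]] = {
    "5' ltr": (-0.15, -0.2),
    "gag": (-0.15, -0.15),
    "env": (0.15, 0.15),
    "3' ltr": (0.25, -0.2),
}
# Documented NumericOffsets.DEFAULT_OFFSETS value.
DEFAULT_OFFSETS: Tuple[float, float] = (-0.15, 0.35)


def get_numeric_offsets_sketch(gene: str) -> Tuple[float, float]:
    """Return (start_offset, end_offset), falling back to the default."""
    key = gene.strip().lower()  # assumption: lookup is case-insensitive
    return GENE_OFFSET_MAP.get(key, DEFAULT_OFFSETS)


print(get_numeric_offsets_sketch("ENV"))      # (0.15, 0.15)
print(get_numeric_offsets_sketch("unknown"))  # (-0.15, 0.35)
```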
- pyhiv.report.utils.is_special_reference(accession, ref_header)[source]
Check if reference is special (K03455).
- Parameters:
accession (str) – The accession number of the reference.
ref_header (str) – The header of the reference sequence.
- Returns:
True if the reference is K03455, False otherwise.
- Return type:
bool
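The documentation does not state how the check is performed. One plausible reading, sketched below, is a case-insensitive search for the string `K03455` in either argument. Both the function name `is_special_reference_sketch` and the matching rule are assumptions made for illustration.

```python
SPECIAL_ACCESSION = "K03455"


def is_special_reference_sketch(accession: str, ref_header: str) -> bool:
    """Return True if either field mentions the K03455 accession.

    Assumption: a case-insensitive substring match; the real helper
    may use a stricter comparison.
    """
    fields = (accession or "", ref_header or "")
    return any(SPECIAL_ACCESSION in field.upper() for field in fields)


print(is_special_reference_sketch("K03455", "HXB2 reference"))  # True
print(is_special_reference_sketch("AB123456", "ref AB123456"))  # False
```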
- pyhiv.report.utils.normalize_features(raw_features, special)[source]
Normalize features based on reference type.
- Parameters:
raw_features (Dict[str, Tuple[int, int]]) – Raw features mapping.
special (bool) – Whether the reference is special (K03455).
- Returns:
Normalized features mapping.
- Return type:
Dict[str, Tuple[int, int]]
- pyhiv.report.utils.normalize_present_regions(regions, special)[source]
Normalize present regions based on reference type.
- Parameters:
regions (List[str]) – List of raw present regions.
special (bool) – Whether the reference is special (K03455).
- Returns:
Normalized list of present regions.
- Return type:
List[str]
- pyhiv.report.utils.parse_features(cell)[source]
Parse features from table cell.
- Parameters:
cell (Any) – The table cell containing features.
- Returns:
A dictionary mapping feature names to (start, end) tuples.
- Return type:
Dict[str, Tuple[int, int]]
- pyhiv.report.utils.parse_present_regions(cell)[source]
Parse present regions from a table cell into a list of region strings.
- Parameters:
cell (Any) – The table cell containing present regions.
- Returns:
A list of present region strings.
- Return type:
List[str]
- pyhiv.report.utils.project_features_to_alignment(features_genomic, ref_map)[source]
Project genomic features to alignment coordinates.
- Parameters:
features_genomic (Dict[str, Tuple[int, int]]) – Genomic features mapping.
ref_map (Dict[int, int]) – Reference to alignment mapping.
- Returns:
Features projected to alignment coordinates.
- Return type:
Dict[str, Tuple[int, int]]
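To make the projection concrete, the sketch below translates each feature's genomic (start, end) pair through a reference-to-alignment mapping. The name `project_features_sketch` is hypothetical, and silently skipping features whose endpoints are missing from the mapping is an assumption; the actual function may clamp or handle such features differently.

```python
from typing import Dict, Tuple

Features = Dict[str, Tuple[int, int]]


def project_features_sketch(
    features_genomic: Features, ref_map: Dict[int, int]
) -> Features:
    """Translate genomic (start, end) pairs into alignment coordinates."""
    projected: Features = {}
    for name, (start, end) in features_genomic.items():
        # Assumption: features with unmapped endpoints are dropped.
        if start in ref_map and end in ref_map:
            projected[name] = (ref_map[start], ref_map[end])
    return projected


# Mapping for the aligned reference "AC--GT" (1-based reference positions).
example_map = {1: 0, 2: 1, 3: 4, 4: 5}
projected = project_features_sketch({"gag": (2, 3), "pol": (4, 9)}, example_map)
print(projected)  # {'gag': (1, 4)}
```

In the example, `gag` spans two reference bases but four alignment columns because the reference contains a two-column gap inside the feature.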
- pyhiv.report.utils.read_alignment_fasta(fpath)[source]
Read alignment FASTA file and return headers and sequences.
- Parameters:
fpath (Path) – Path to the alignment FASTA file.
- Returns:
A tuple of (reference header, aligned reference sequence, user header, aligned user sequence).
- Return type:
Tuple[str, str, str, str]
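The following standalone sketch shows one way to read a two-record alignment FASTA into the four-element tuple described above. It is not the PyHIV code: the name `read_alignment_fasta_sketch` is hypothetical, and it assumes the reference record comes first, the user record second, and that any other record count is an error.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def read_alignment_fasta_sketch(fpath: Path) -> Tuple[str, str, str, str]:
    """Return (ref_header, ref_seq, user_header, user_seq) from a FASTA file."""
    headers: List[str] = []
    chunks: List[List[str]] = []
    for line in fpath.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            headers.append(line[1:])
            chunks.append([])
        elif chunks:
            # Sequence lines may be wrapped; collect them per record.
            chunks[-1].append(line)
    if len(headers) != 2:
        # Assumption: exactly two records (reference, then user sequence).
        raise ValueError(f"expected 2 FASTA records, found {len(headers)}")
    ref_seq, user_seq = ("".join(parts) for parts in chunks)
    return headers[0], ref_seq, headers[1], user_seq


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo_aligned.fasta"
    demo.write_text(">K03455\nAC--\nGT\n>sample_1\nACTT\nGT\n")
    result = read_alignment_fasta_sketch(demo)

print(result)  # ('K03455', 'AC--GT', 'sample_1', 'ACTTGT')
```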
pyhiv.report.visualization module
Gene visualization and plotting functions for PyHIV reporting module.
- pyhiv.report.visualization.plot_gene_axes(ax, genes_ranges, alignment_start, alignment_end, y_positions=None)[source]
Plot gene visualization with alignment information.
- Parameters:
ax (matplotlib.axes.Axes) – The matplotlib Axes to plot on.
genes_ranges (Dict[str, Tuple[int, int]]) – Mapping of gene names to their (start, end) positions.
alignment_start (int) – Start position of the alignment span.
alignment_end (int) – End position of the alignment span.
y_positions (Optional[Dict[str, float]], optional) – Fixed y-positions for gene lanes, by default None (auto lanes).
Module contents
PyHIV reporting module.
This module provides functionality to generate PDF reports from PyHIV analysis results. The reports include sequence metadata and gene visualization plots.
- class pyhiv.report.GenePanelConfig[source]
Bases: object
Gene panel spacing and positioning configuration.
- ALIGNMENT_CLEARANCE = 0.243
- BOTTOM_MARGIN = 0.55
- NON_K03455_X_MAX_DEFAULT = 10000
- REV_CONNECTOR = 0.6075
- TAT_CONNECTOR = 0.09450000000000001
- TOP_MARGIN = 0.3
- X_PAD_MIN = 60
- Y_SCALE = 1.35
- class pyhiv.report.K03455Config[source]
Bases: object
Configuration for K03455 special reference handling.
- DEFAULT_K03455_OFFSETS = (-0.15, 0.35)
- K03455_NUMERIC_OFFSETS = {"3' LTR": (0.25, -0.2), "5' LTR": (-0.15, -0.2), 'env': (-0.15, -0.2), 'gag': (0.3, 0.2), 'nef': (-0.15, -0.2), 'pol': (-0.15, -0.2), 'rev 1': (0.2, 0.25), 'rev 2': (0.2, 0.25), 'tat 1': (-0.15, 0.15), 'tat 2': (-0.15, 0.2), 'vif': (-0.15, -0.2), 'vpr': (0.2, -0.2), 'vpu': (0.35, 0.15)}
- TARGET_REGIONS = ["5' LTR", 'gag', 'pol', 'vif', 'vpr', 'vpu', 'tat 1', 'tat 2', 'rev 1', 'rev 2', 'env', 'nef', "3' LTR"]
- Y_POSITIONS = {"3' LTR": 0.0, "5' LTR": 0.0, 'env': 0.0, 'gag': 0.4, 'nef': 0.8, 'pol': 0.0, 'rev 1': 1.0, 'rev 2': 1.0, 'tat 1': 0.2, 'tat 2': 0.4, 'vif': 0.4, 'vpr': 0.8, 'vpu': 0.6}
- class pyhiv.report.MetadataConfig[source]
Bases: object
Metadata block display configuration.
- FONTSIZE = 9.5
- INFO_TOP_Y = 0.78
- TITLE_Y = 1.06
- WRAP = 75
- class pyhiv.report.NumericOffsets[source]
Bases: object
Vertical offsets for numeric labels (non-K03455 only).
- DEFAULT_OFFSETS = (-0.15, 0.35)
- GENE_OFFSET_MAP = {"3' ltr": (0.25, -0.2), "5' ltr": (-0.15, -0.2), 'env': (0.15, 0.15), 'gag': (-0.15, -0.15), 'gag-pol': (-0.15, -0.2), 'nef': (-0.15, -0.2), 'pol': (-0.15, -0.2), 'rev 1': (0.2, 0.25), 'rev 2': (0.2, 0.25), 'tat 1': (0.15, 0.15), 'tat 2': (0.15, 0.2), 'vif': (-0.15, -0.2), 'vpr': (0.15, -0.2), 'vpu': (0.15, 0.15)}
- class pyhiv.report.PageLayout[source]
Bases: object
Page layout and spacing configuration.
- FIGSIZE = (11.69, 9.2)
- GRID_HEIGHT_RATIOS = [0.9, 1.9]
- HSPACE = 0.42
- class pyhiv.report.PyHIVReporter(output_dir, log_level=20)[source]
Bases: object
Main class for generating PyHIV PDF reports.
- Parameters:
output_dir (Path)
- generate_report(final_table_path, sequences_with_locations_path, output_pdf_name='PyHIV_report_all_sequences.pdf')[source]
Generate PDF report from PyHIV results.
- Parameters:
final_table_path (Path) – Path to final_table.tsv file.
sequences_with_locations_path (Path) – Path to sequences_with_locations.tsv file.
output_pdf_name (str, optional) – Name of the output PDF file, by default “PyHIV_report_all_sequences.pdf”
- Returns:
Path to the generated PDF report.
- Return type:
Path
- pyhiv.report.build_alignment_path(sequence, alignments_dir)[source]
Build path to alignment FASTA file.
- Parameters:
sequence (str) – The name or identifier of the sequence.
alignments_dir (Path) – The directory containing alignment FASTA files.
- Returns:
The path to the alignment FASTA file.
- Return type:
Path
- pyhiv.report.build_ref_to_alignment_map(ref_aligned)[source]
Build mapping from reference coordinates to alignment coordinates.
- Parameters:
ref_aligned (str) – The reference sequence with alignment gaps.
- Returns:
A tuple of (dictionary mapping reference positions to alignment indices, length of the aligned reference sequence).
- Return type:
Tuple[Dict[int, int], int]
- pyhiv.report.canon_label(label)[source]
Canonicalize gene label for K03455.
- Parameters:
label (str) – The input gene label.
- Returns:
The canonical gene label, or None if not recognized.
- Return type:
Optional[str]
- pyhiv.report.first_last_nongap_idx(seq)[source]
Return the first and last indices of non-gap characters in a sequence.
- Parameters:
seq (str) – The input sequence with gaps.
- Returns:
A tuple containing the first and last indices of non-gap characters.
- Return type:
Tuple[int, int]
- pyhiv.report.get_numeric_offsets_non_special(gene)[source]
Get numeric offsets for non-K03455 references using NumericOffsets.
- Parameters:
gene (str) – The gene name.
- Returns:
A tuple containing (start_offset, end_offset).
- Return type:
tuple[float, float]
- pyhiv.report.is_special_reference(accession, ref_header)[source]
Check if reference is special (K03455).
- Parameters:
accession (str) – The accession number of the reference.
ref_header (str) – The header of the reference sequence.
- Returns:
True if the reference is K03455, False otherwise.
- Return type:
bool
- pyhiv.report.normalize_features(raw_features, special)[source]
Normalize features based on reference type.
- Parameters:
raw_features (Dict[str, Tuple[int, int]]) – Raw features mapping.
special (bool) – Whether the reference is special (K03455).
- Returns:
Normalized features mapping.
- Return type:
Dict[str, Tuple[int, int]]
- pyhiv.report.normalize_present_regions(regions, special)[source]
Normalize present regions based on reference type.
- Parameters:
regions (List[str]) – List of raw present regions.
special (bool) – Whether the reference is special (K03455).
- Returns:
Normalized list of present regions.
- Return type:
List[str]
- pyhiv.report.parse_features(cell)[source]
Parse features from table cell.
- Parameters:
cell (Any) – The table cell containing features.
- Returns:
A dictionary mapping feature names to (start, end) tuples.
- Return type:
Dict[str, Tuple[int, int]]
- pyhiv.report.parse_present_regions(cell)[source]
Parse present regions from a table cell into a list of region strings.
- Parameters:
cell (Any) – The table cell containing present regions.
- Returns:
A list of present region strings.
- Return type:
List[str]
- pyhiv.report.plot_gene_axes(ax, genes_ranges, alignment_start, alignment_end, y_positions=None)[source]
Plot gene visualization with alignment information.
- Parameters:
ax (matplotlib.axes.Axes) – The matplotlib Axes to plot on.
genes_ranges (Dict[str, Tuple[int, int]]) – Mapping of gene names to their (start, end) positions.
alignment_start (int) – Start position of the alignment span.
alignment_end (int) – End position of the alignment span.
y_positions (Optional[Dict[str, float]], optional) – Fixed y-positions for gene lanes, by default None (auto lanes).
- pyhiv.report.project_features_to_alignment(features_genomic, ref_map)[source]
Project genomic features to alignment coordinates.
- Parameters:
features_genomic (Dict[str, Tuple[int, int]]) – Genomic features mapping.
ref_map (Dict[int, int]) – Reference to alignment mapping.
- Returns:
Features projected to alignment coordinates.
- Return type:
Dict[str, Tuple[int, int]]
- pyhiv.report.read_alignment_fasta(fpath)[source]
Read alignment FASTA file and return headers and sequences.
- Parameters:
fpath (Path) – Path to the alignment FASTA file.
- Returns:
A tuple of (reference header, aligned reference sequence, user header, aligned user sequence).
- Return type:
Tuple[str, str, str, str]
- pyhiv.report.render_sequence_page(pdf, sequence, accession, subtype, mm_region, present_regions, features_aln, ref_seq_aligned, user_seq_aligned, y_positions=None)[source]
Render a single sequence page in the PDF report.
- Parameters:
pdf (PdfPages) – The PdfPages object to save the figure into.
sequence (str) – The name or identifier of the sequence.
accession (str) – The accession number of the sequence.
subtype (str) – The subtype of the sequence.
mm_region (str) – The best-matching region of the sequence.
present_regions (List[str]) – List of present regions in the sequence.
features_aln (Dict[str, Tuple[int, int]]) – Dictionary of gene features with their alignment coordinate ranges.
ref_seq_aligned (str) – The reference sequence aligned (with gaps).
user_seq_aligned (str) – The user’s sequence aligned (with gaps).
y_positions (Optional[Dict[str, float]], optional) – Fixed y-positions for gene lanes, by default None (auto lanes).