alignment module
This module provides functionality for aligning two sequences (architectures) with optional similarity matrices, minimizing crossings and matching modules. It includes utilities for handling unknown values, rotating sequences, counting inversions, extracting module pairs from similarity matrices, and computing the best alignment layout.
- class plast.alignment.Alignment(transformed, is_reversed, rotation, crossings, mean_len, matches, pairs)
Bases: object
Represents an alignment with transformation and matching information.
- Parameters:
transformed (list) – Transformed architecture (list after rotation/reversal).
is_reversed (bool) – Whether the architecture was reversed.
rotation (int) – Number of positions rotated.
crossings (int) – Number of crossings (inversions) in the alignment.
mean_len (float) – Mean absolute distance between matched indices.
matches (int) – Number of matched pairs.
pairs (list[tuple[int, int]]) – List of matched pairs (i, j).
- plast.alignment.is_unknown(x)
Determines whether a given value should be considered ‘unknown’.
- A value is considered unknown if it is:
None
A float NaN (not-a-number)
A string that is empty, “nan”, or “na” (case-insensitive, with whitespace ignored)
- Parameters:
x (Any) – The value to check.
- Returns:
True if the value is considered unknown, False otherwise.
- Return type:
bool
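The rules above can be sketched as follows. This is a minimal illustration, not the module's actual source: the name is_unknown_sketch is hypothetical, and "whitespace ignored" is assumed to mean that leading and trailing whitespace is stripped before comparison.

```python
import math


def is_unknown_sketch(x):
    """Hypothetical re-implementation of the documented rules."""
    # Rule 1: None is unknown.
    if x is None:
        return True
    # Rule 2: a float NaN is unknown.
    if isinstance(x, float) and math.isnan(x):
        return True
    # Rule 3: empty, "nan" or "na" strings are unknown
    # (case-insensitive; surrounding whitespace assumed to be stripped).
    if isinstance(x, str):
        return x.strip().lower() in {"", "nan", "na"}
    return False
```

Under these assumptions, values such as 0 or a non-empty label like "geneA" are treated as known.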
- plast.alignment.rotate(xs, k)
Rotates the elements of a list by k positions.
- Parameters:
xs (list) – The list to rotate.
k (int) – The number of positions to rotate the list by.
- Returns:
A new list with elements rotated by k positions.
- Return type:
list
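A circular rotation of this kind might look like the sketch below. The documentation does not state the rotation direction, so a left rotation (the element at index k moves to the front) is an assumption here, and rotate_sketch is a hypothetical name.

```python
def rotate_sketch(xs, k):
    """Return a new list rotated left by k positions (direction assumed)."""
    xs = list(xs)
    if not xs:
        return []
    k %= len(xs)  # rotations wrap around the list length
    return xs[k:] + xs[:k]
```

With this convention, rotating [1, 2, 3, 4] by 1 gives [2, 3, 4, 1], and rotating by the list length returns an unchanged copy.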
- plast.alignment.count_inversions(arr)
Counts the number of inversions in the given array.
An inversion is a pair of indices (i, j) such that i < j and arr[i] > arr[j]. Uses a modified merge sort algorithm.
- Parameters:
arr (list[int]) – Iterable of comparable elements (e.g., list of integers).
- Returns:
The number of inversions in the array.
- Return type:
int
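For illustration, a merge-sort-based inversion count can be sketched as below. This shows the general technique named in the description, not necessarily the module's exact code; count_inversions_sketch is a hypothetical name.

```python
def count_inversions_sketch(arr):
    """Count pairs (i, j) with i < j and arr[i] > arr[j] in O(n log n)."""

    def sort_count(xs):
        if len(xs) <= 1:
            return xs, 0
        mid = len(xs) // 2
        left, inv_left = sort_count(xs[:mid])
        right, inv_right = sort_count(xs[mid:])
        merged = []
        inv = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left,
                # so it forms one inversion with each of them.
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(list(arr))[1]
```

For example, [2, 4, 1, 3, 5] contains the inversions (2, 1), (4, 1) and (4, 3), so the count is 3. Equal elements are not counted as inversions.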
- plast.alignment.extract_module_pairs(sim_matrix, threshold=0.99, min_size=5)
Extracts pairs of indices (i, j) belonging to large modules in sim_matrix.
- Parameters:
sim_matrix (np.ndarray) – Similarity matrix.
threshold (float) – Similarity threshold for module inclusion.
min_size (int) – Minimum module size to consider.
- Returns:
List of (i, j) pairs in large modules.
- Return type:
list[tuple[int, int]]
- plast.alignment.mapping_and_crossings(top, bottom)
Maps elements from ‘top’ to ‘bottom’, computes crossings (inversions), and mean distance between mapped indices.
- Parameters:
top (list) – The first sequence to map from.
bottom (list) – The second sequence to map to.
- Returns:
crossings, pairs, mean_len
- Return type:
tuple[int, list[tuple[int, int]], float]
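The sketch below illustrates one way to compute the three returned values. It assumes that labels are unique within each sequence and that labels missing from bottom are simply skipped; how the real function treats duplicates or unknown values is not documented here. The name mapping_and_crossings_sketch is hypothetical, and a quadratic inversion count is used for brevity.

```python
def mapping_and_crossings_sketch(top, bottom):
    """Return (crossings, pairs, mean_len) for two label sequences."""
    # Position of each label in bottom (labels assumed unique).
    pos_in_bottom = {label: j for j, label in enumerate(bottom)}
    # Pair every index in top with the index of the same label in bottom.
    pairs = [
        (i, pos_in_bottom[label])
        for i, label in enumerate(top)
        if label in pos_in_bottom
    ]
    # Two connecting lines cross when their bottom indices are out of order.
    js = [j for _, j in pairs]
    crossings = sum(
        1
        for a in range(len(js))
        for b in range(a + 1, len(js))
        if js[a] > js[b]
    )
    # Mean absolute distance between matched indices (0.0 if no matches).
    mean_len = sum(abs(i - j) for i, j in pairs) / len(pairs) if pairs else 0.0
    return crossings, pairs, mean_len
```

With top = ["a", "b", "c"] and bottom = ["c", "a", "b"], the pairs are (0, 1), (1, 2), (2, 0), which gives 2 crossings and a mean distance of 4/3.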
- plast.alignment.best_layout_min_crossings(arch1, arch2, sim_matrix=None, min_module_size=5, threshold=1.0)
Finds the best transformation (rotation and reversal) of arch2 to align with arch1 such that the number of crossings between matched modules is minimized.
- Supports:
Simple matching when min_module_size == 1 and sim_matrix is None.
Module-based matching using a similarity matrix, threshold, and minimum module size.
- Parameters:
arch1 (list) – Reference architecture (sequence of modules).
arch2 (list) – Architecture to be transformed and aligned.
sim_matrix (np.ndarray, optional) – Precomputed similarity matrix between modules.
min_module_size (int) – Minimum size of modules to consider for matching.
threshold (float) – Similarity threshold for considering a module pair as a match.
- Returns:
Alignment object with best transformation and matching info.
- Return type:
Alignment
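The simple matching case can be illustrated with a brute-force search over every rotation of arch2, in both orientations. This is a sketch under stated assumptions, not the module's implementation: the Layout tuple and both function names are hypothetical, labels are assumed unique, reversal is applied before rotation, rotation is to the left, and ties on crossings are broken by the smaller mean distance. Module-based matching with a similarity matrix is not covered.

```python
from collections import namedtuple

# Hypothetical result type holding a subset of the documented fields.
Layout = namedtuple("Layout", "transformed is_reversed rotation crossings mean_len")


def _crossings_and_mean_len(top, bottom):
    """Crossings and mean index distance between equal labels."""
    pos = {label: j for j, label in enumerate(bottom)}
    pairs = [(i, pos[x]) for i, x in enumerate(top) if x in pos]
    js = [j for _, j in pairs]
    crossings = sum(
        1 for a in range(len(js)) for b in range(a + 1, len(js)) if js[a] > js[b]
    )
    mean_len = sum(abs(i - j) for i, j in pairs) / len(pairs) if pairs else 0.0
    return crossings, mean_len


def best_layout_sketch(arch1, arch2):
    """Try every rotation of arch2, forward and reversed; keep the best."""
    best = None
    for is_reversed in (False, True):
        base = list(reversed(arch2)) if is_reversed else list(arch2)
        for k in range(max(len(base), 1)):
            candidate = base[k:] + base[:k]
            crossings, mean_len = _crossings_and_mean_len(arch1, candidate)
            # Fewer crossings wins; mean distance breaks ties (assumed).
            if best is None or (crossings, mean_len) < (best.crossings, best.mean_len):
                best = Layout(candidate, is_reversed, k, crossings, mean_len)
    return best
```

For arch1 = ["a", "b", "c", "d"] and arch2 = ["c", "d", "a", "b"], this search selects the unreversed layout with a rotation of 2, which has no crossings.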
- plast.alignment.rotate_parsed(pl, shift, reverse=False)
Rotates the parsed CDS records (a DataFrame) by ‘shift’ positions (circularly) and reassigns the coordinates so that the first record in the rotated list starts at 1.
- If reverse=True:
reverse the order of elements first,
- flip coordinates to reversed orientation:
start’ = L - end + 1
end’ = L - start + 1
invert strand (if present): strand’ = -strand
then apply the rotation by ‘shift’
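The reversal step can be illustrated on plain tuples. In this sketch, L is assumed to be the total sequence length, records are hypothetical (start, end, strand) tuples with 1-based inclusive coordinates rather than DataFrame rows, and the subsequent rotation and coordinate reassignment are left out.

```python
def flip_records_sketch(records, total_len):
    """Reverse record order and flip coordinates and strand.

    records: list of (start, end, strand) tuples, 1-based inclusive.
    total_len: assumed meaning of L, the full sequence length.
    """
    flipped = []
    for start, end, strand in reversed(records):
        new_start = total_len - end + 1
        new_end = total_len - start + 1
        flipped.append((new_start, new_end, -strand))
    return flipped
```

For a sequence of length 100, the records (1, 10, +1) and (20, 30, -1) become (71, 81, +1) and (91, 100, -1), in that order. Each flipped record still satisfies start <= end.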
- plast.alignment.rotate_plasmid(plast, shift, reverse=False)
Returns a new PLAST object with the vector rotated by ‘shift’ positions. If reverse=True, the order of elements is reversed (and parsed is flipped accordingly).