alignment module

This module aligns two sequences (architectures), optionally guided by a similarity matrix, so that crossings between matched modules are minimized. It includes utilities for handling unknown values, rotating sequences, counting inversions, extracting module pairs from similarity matrices, and computing the best alignment layout.

class plast.alignment.Alignment(transformed, is_reversed, rotation, crossings, mean_len, matches, pairs)

Bases: object

Represents an alignment with transformation and matching information.

Parameters:
  • transformed (list) – Transformed architecture (list after rotation/reversal).

  • is_reversed (bool) – Whether the architecture was reversed.

  • rotation (int) – Number of positions rotated.

  • crossings (int) – Number of crossings (inversions) in the alignment.

  • mean_len (float) – Mean absolute distance between matched indices.

  • matches (int) – Number of matched pairs.

  • pairs (list[tuple[int, int]]) – List of matched pairs (i, j).

plast.alignment.is_unknown(x)

Determines whether a given value should be considered ‘unknown’.

A value is considered unknown if it is:
  • None

  • A float NaN (not-a-number)

  • A string that is empty, “nan”, or “na” (case-insensitive, with whitespace ignored)

Parameters:

x (Any) – The value to check.

Returns:

True if the value is considered unknown, False otherwise.

Return type:

bool
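
The three rules above can be illustrated with a short stand-alone function. This is only a sketch of the documented behaviour, not the library's actual code; the name is_unknown_sketch is hypothetical, and "whitespace ignored" is assumed to mean that leading and trailing whitespace is stripped.

```python
import math


def is_unknown_sketch(x):
    """Hypothetical re-statement of the documented rules for 'unknown' values."""
    if x is None:
        return True
    # Float NaN is the only float treated as unknown.
    if isinstance(x, float) and math.isnan(x):
        return True
    # Assumption: surrounding whitespace is stripped before comparing.
    if isinstance(x, str):
        return x.strip().lower() in {"", "nan", "na"}
    return False


print(is_unknown_sketch("  NaN "))  # True
print(is_unknown_sketch("repA"))    # False
```

Under these assumptions, values such as 0 or an empty list are treated as known, because only None, NaN, and the listed strings qualify.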

plast.alignment.rotate(xs, k)

Rotates the elements of a list by k positions.

Parameters:
  • xs (list) – The list to rotate.

  • k (int) – The number of positions to rotate the list by.

Returns:

A new list with elements rotated by k positions.

Return type:

list
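
A circular rotation can be sketched with slicing. The documentation does not state the direction of the rotation, so this example assumes a left rotation (the element at index k moves to the front); rotate_sketch is a hypothetical name, not the library function.

```python
def rotate_sketch(xs, k):
    """Return a new list rotated left by k positions (direction is an assumption)."""
    if not xs:
        return []
    k %= len(xs)  # wrap k so values larger than len(xs) or negative values still work
    return list(xs[k:]) + list(xs[:k])


print(rotate_sketch(["a", "b", "c", "d"], 1))  # ['b', 'c', 'd', 'a']
```

The input list is left unmodified, which matches the documented "new list" return value.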

plast.alignment.count_inversions(arr)

Counts the number of inversions in the given array.

An inversion is a pair of indices (i, j) such that i < j and arr[i] > arr[j]. Uses a modified merge sort algorithm.

Parameters:

arr (list[int]) – List of mutually comparable elements, typically integers.

Returns:

The number of inversions in the array.

Return type:

int
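
For readers unfamiliar with the technique, the following is a minimal sketch of inversion counting with merge sort. It illustrates the general algorithm and is not copied from the library; count_inversions_sketch is a hypothetical name.

```python
def count_inversions_sketch(arr):
    """Count pairs (i, j) with i < j and arr[i] > arr[j] in O(n log n) time."""

    def sort_count(items):
        if len(items) <= 1:
            return items, 0
        mid = len(items) // 2
        left, inv_left = sort_count(items[:mid])
        right, inv_right = sort_count(items[mid:])
        merged = []
        inversions = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left,
                # so it forms an inversion with each of them.
                merged.append(right[j])
                j += 1
                inversions += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inversions

    return sort_count(list(arr))[1]


print(count_inversions_sketch([2, 4, 1, 3, 5]))  # 3
```

Equal elements are not counted as inversions here, which follows from the strict inequality arr[i] > arr[j] in the definition.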

plast.alignment.extract_module_pairs(sim_matrix, threshold=0.99, min_size=5)

Extracts pairs of indices (i, j) from sim_matrix that belong to modules of at least min_size elements.

Parameters:
  • sim_matrix (np.ndarray) – Similarity matrix.

  • threshold (float) – Similarity threshold for module inclusion.

  • min_size (int) – Minimum module size to consider.

Returns:

List of (i, j) pairs in large modules.

Return type:

list[tuple[int, int]]

plast.alignment.mapping_and_crossings(top, bottom)

Maps elements from ‘top’ to ‘bottom’, then computes the number of crossings (inversions) and the mean distance between mapped indices.

Parameters:
  • top (list) – The first sequence to map from.

  • bottom (list) – The second sequence to map to.

Returns:

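A tuple (crossings, pairs, mean_len): the number of crossings, the list of matched index pairs (i, j), and the mean absolute distance between matched indices.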

Return type:

tuple[int, list[tuple[int, int]], float]
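
The relationship between the three returned values can be sketched as follows. The matching rule is an assumption made for illustration (each element of top is paired with the first equal element of bottom, and elements must be hashable); the library may handle duplicates and unknown values differently. The name mapping_and_crossings_sketch is hypothetical.

```python
def mapping_and_crossings_sketch(top, bottom):
    """Return (crossings, pairs, mean_len) under a first-occurrence matching rule."""
    # Assumption: a value in top is matched to its first occurrence in bottom.
    first_index = {}
    for j, value in enumerate(bottom):
        first_index.setdefault(value, j)

    pairs = [(i, first_index[v]) for i, v in enumerate(top) if v in first_index]

    # Two links cross when their order in top and in bottom disagrees,
    # i.e. when the bottom indices, read in top order, contain an inversion.
    targets = [j for _, j in pairs]
    crossings = sum(
        1
        for a in range(len(targets))
        for b in range(a + 1, len(targets))
        if targets[a] > targets[b]
    )

    mean_len = sum(abs(i - j) for i, j in pairs) / len(pairs) if pairs else 0.0
    return crossings, pairs, mean_len


print(mapping_and_crossings_sketch(["a", "b", "c"], ["c", "a", "b"]))
```

The quadratic crossing count keeps the sketch short; a merge-sort based counter would give the same number for larger inputs.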

plast.alignment.best_layout_min_crossings(arch1, arch2, sim_matrix=None, min_module_size=5, threshold=1.0)

Finds the best transformation (rotation and reversal) of arch2 to align with arch1 such that the number of crossings between matched modules is minimized.

Supports:
  1. Simple matching when min_module_size == 1 and sim_matrix is None.

  2. Module-based matching using a similarity matrix, threshold, and minimum module size.

Parameters:
  • arch1 (list) – Reference architecture (sequence of modules).

  • arch2 (list) – Architecture to be transformed and aligned.

  • sim_matrix (np.ndarray, optional) – Precomputed similarity matrix between modules.

  • min_module_size (int) – Minimum size of modules to consider for matching.

  • threshold (float) – Similarity threshold for considering a module pair as a match.

Returns:

Alignment object with best transformation and matching info.

Return type:

Alignment
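
The search over transformations can be pictured as a brute-force loop over every reversal and rotation of arch2. The sketch below covers only the simple matching case and rests on several assumptions that the documentation does not confirm: reversal is applied before rotation, rotation is to the left, elements are matched by first occurrence, and ties on crossings are broken by the smaller mean distance. Layout, best_layout_sketch, and the helper are hypothetical names.

```python
from collections import namedtuple

# Hypothetical, reduced stand-in for the documented Alignment class.
Layout = namedtuple("Layout", "transformed is_reversed rotation crossings mean_len")


def _crossings_and_mean_len(top, bottom):
    """Match by first occurrence, then count crossings and mean link length."""
    position = {}
    for j, value in enumerate(bottom):
        position.setdefault(value, j)
    pairs = [(i, position[v]) for i, v in enumerate(top) if v in position]
    targets = [j for _, j in pairs]
    crossings = sum(
        1
        for a in range(len(targets))
        for b in range(a + 1, len(targets))
        if targets[a] > targets[b]
    )
    mean_len = sum(abs(i - j) for i, j in pairs) / len(pairs) if pairs else 0.0
    return crossings, mean_len


def best_layout_sketch(arch1, arch2):
    """Try every reversal/rotation of arch2 and keep the one with fewest crossings."""
    best = None
    for is_reversed in (False, True):
        base = list(reversed(arch2)) if is_reversed else list(arch2)
        for rotation in range(max(len(base), 1)):
            candidate = base[rotation:] + base[:rotation]
            crossings, mean_len = _crossings_and_mean_len(arch1, candidate)
            layout = Layout(candidate, is_reversed, rotation, crossings, mean_len)
            # Assumed tie-break: fewer crossings first, then shorter mean distance.
            if best is None or (layout.crossings, layout.mean_len) < (
                best.crossings,
                best.mean_len,
            ):
                best = layout
    return best


result = best_layout_sketch(["a", "b", "c", "d"], ["c", "d", "a", "b"])
print(result.rotation, result.is_reversed, result.crossings)  # 2 False 0
```

With n elements this evaluates 2n candidate layouts, which is usually acceptable for architectures of modest length.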

plast.alignment.rotate_parsed(pl, shift, reverse=False)

Rotates the parsed CDS records (a DataFrame) by ‘shift’ positions (circularly) and reassigns the coordinates so that the first record in the rotated list starts at 1.

If reverse=True:
  • reverse the order of elements first,

  • flip coordinates to reversed orientation:

    start’ = L - end + 1
    end’ = L - start + 1

  • invert strand (if present): strand’ = -strand

  • then apply the rotation by ‘shift’

Parameters:
  • pl (PLAST) – PLAST object to rotate.

  • shift (int) – Number of positions to rotate.

  • reverse (bool) – Whether to reverse the order and flip coordinates.

Returns:

Rotated PLAST object.

Return type:

PLAST
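
The coordinate flip applied when reverse=True can be checked on a small example. This sketch uses plain (start, end, strand) tuples instead of the parsed DataFrame and covers only the reversal step, not the subsequent rotation; it assumes 1-based inclusive coordinates and that L is the total sequence length. The name flip_records_sketch is hypothetical.

```python
def flip_records_sketch(records, total_length):
    """Reverse record order and flip coordinates and strand for each record."""
    flipped = []
    for start, end, strand in reversed(records):
        new_start = total_length - end + 1   # start' = L - end + 1
        new_end = total_length - start + 1   # end'   = L - start + 1
        flipped.append((new_start, new_end, -strand))
    return flipped


records = [(1, 10, 1), (20, 30, -1)]
print(flip_records_sketch(records, 100))  # [(71, 81, 1), (91, 100, -1)]
```

Each record keeps its length after flipping, and applying the flip twice restores the original records.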

plast.alignment.rotate_plasmid(plast, shift, reverse=False)

Returns a new PLAST object with the vector rotated by ‘shift’ positions. If reverse=True, the order of elements is reversed and the parsed records are flipped accordingly.

Parameters:
  • plast (PLAST) – PLAST object to rotate.

  • shift (int) – Number of positions to rotate.

  • reverse (bool) – Whether to reverse the order and flip coordinates.

Returns:

Rotated PLAST object.

Return type:

PLAST