bridge.pipelines.policies.gh2bt package#

Public Interface#

This section documents the user-facing interface of the bridge.pipelines.policies.gh2bt package (as defined in its __init__.py file).

Functions#

reconcile_gh_over_bt(*, gh_norm, bt_norm, ...)

Apply a generic GitHub-over-bio.tools reconciliation policy.

reconcile_gh_ontop_bt(*, gh_norm, bt_norm, ...)

Apply a generic GitHub-on-top-of-bio.tools policy for additive metadata.

Generic reconciliation policies for GitHub to bio.tools mapping.

bridge.pipelines.policies.gh2bt.reconcile_gh_ontop_bt(*, gh_norm, bt_norm, bt_value, build_bt_from_gh, build_bt_from_norm, log_label)#

Apply a generic GitHub-on-top-of-bio.tools policy for additive metadata.

This function is intended for multi-valued fields where GitHub can contribute additional values on top of existing bio.tools ones (e.g. functions). Both GitHub and bio.tools values are provided as sets and the function computes the subset of GitHub values that are missing from bio.tools.

Policy:

  1. If gh_norm is None or empty, GitHub is treated as silent and no change is made to bio.tools. An “unchanged” log entry is emitted.

  2. If gh_norm contains values, each value is mapped to zero or more bio.tools values via build_bt_from_gh.

  3. If bt_norm is None or empty, all bio.tools values derived from GitHub are added to bio.tools. An “added” log entry is emitted.

  4. If both gh_norm and bt_norm contain values, the union of the existing bio.tools values and the new values derived from GitHub is computed.

  5. If the union is the same as the existing bio.tools values, no change is made. An “exact” log entry is emitted.

  6. If the union contains additional values compared to the existing bio.tools values, the new values are added to bio.tools and an “added” log entry is emitted indicating the number of new values.
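The additive policy above can be illustrated with a minimal sketch. This is not the package's implementation: the function name sketch_gh_ontop_bt, the logger name, the log message wording, and the topic-to-operation table are all hypothetical, and the sketch assumes gh_norm is a set whose elements are passed to build_bt_from_gh one at a time.

```python
import logging
from typing import Callable, Optional, Set, TypeVar

GHV = TypeVar("GHV")  # one normalized GitHub value (assumption of this sketch)
BTN = TypeVar("BTN")
BT = TypeVar("BT")

logger = logging.getLogger("gh2bt.sketch")  # hypothetical logger name


def sketch_gh_ontop_bt(
    *,
    gh_norm: Optional[Set[GHV]],
    bt_norm: Optional[Set[BTN]],
    bt_value: Optional[BT],
    build_bt_from_gh: Callable[[GHV], Optional[Set[BTN]]],
    build_bt_from_norm: Callable[[Set[BTN]], BT],
    log_label: str,
) -> Optional[BT]:
    # Step 1: GitHub is silent, so bio.tools is left untouched.
    if not gh_norm:
        logger.info("%s: unchanged", log_label)
        return bt_value

    # Step 2: map each GitHub value to zero or more bio.tools values.
    derived: Set[BTN] = set()
    for gh_item in gh_norm:
        derived |= build_bt_from_gh(gh_item) or set()

    # Step 3: nothing recorded in bio.tools, so everything derived is added.
    if not bt_norm:
        logger.info("%s: added %d value(s)", log_label, len(derived))
        return build_bt_from_norm(derived)

    # Steps 4-6: compare the union with the existing bio.tools values.
    union = set(bt_norm) | derived
    if union == set(bt_norm):
        logger.info("%s: exact", log_label)
        return bt_value
    logger.info("%s: added %d value(s)", log_label, len(union - set(bt_norm)))
    return build_bt_from_norm(union)


# Hypothetical mapping from GitHub topics to bio.tools operation terms.
TOPIC_TO_OPERATIONS = {"alignment": {"Sequence alignment"}, "qc": {"Validation"}}

merged = sketch_gh_ontop_bt(
    gh_norm={"alignment", "qc", "unmapped-topic"},
    bt_norm={"Sequence alignment"},
    bt_value=["Sequence alignment"],
    build_bt_from_gh=TOPIC_TO_OPERATIONS.get,  # returns None for unknown topics
    build_bt_from_norm=sorted,
    log_label="function",
)
```

In this sketch the existing bio.tools value is never removed or replaced; GitHub can only contribute values that bio.tools does not already list.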

Parameters:
  • gh_norm (GHN | None) – Normalized values derived from GitHub, or None if GitHub provides no usable value.

  • bt_norm (set[BTN] | None) – A set of normalized values derived from the existing bio.tools metadata, or None if no value is recorded.

  • bt_value (BT | None) – The existing bio.tools value corresponding to the field being reconciled, or None if no value is recorded. This is preserved when GitHub is silent or when the normalized sets are equal.

  • build_bt_from_gh (Callable[[GHN], set[BTN] | None]) – A callable that takes a normalized GitHub value and returns a set of normalized bio.tools values derived from it, or None if the GitHub value cannot be mapped to bio.tools.

  • build_bt_from_norm (Callable[[set[BTN]], BT]) – A callable that takes a set of normalized bio.tools values and constructs the corresponding concrete bio.tools value.

  • log_label (str) – A short label used in log messages to identify the reconciled field (e.g., “function”, “topic”, etc.).
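As an illustration of the two callables, the sketch below shows one possible pair matching the documented signatures. The lookup table, the term names, and the shape of the concrete value (a list of {"term": ...} dictionaries) are assumptions made for the example and do not reflect the real bio.tools schema or the package's own builders.

```python
from typing import Dict, List, Optional, Set

# Hypothetical lookup table from GitHub topics to bio.tools operation terms.
GH_TOPIC_TO_OPERATIONS: Dict[str, Set[str]] = {
    "alignment": {"Sequence alignment"},
    "variant-calling": {"Variant calling", "Genotyping"},
}


def build_bt_from_gh(gh_topic: str) -> Optional[Set[str]]:
    """Return the normalized bio.tools terms for one GitHub topic.

    Returns None when the topic cannot be mapped to bio.tools.
    """
    return GH_TOPIC_TO_OPERATIONS.get(gh_topic)


def build_bt_from_norm(terms: Set[str]) -> List[Dict[str, str]]:
    """Build an illustrative concrete value from normalized terms.

    Sorting keeps the output deterministic, since sets are unordered.
    """
    return [{"term": term} for term in sorted(terms)]


mapped = build_bt_from_gh("variant-calling")
unmapped = build_bt_from_gh("machine-learning")
concrete = build_bt_from_norm({"Variant calling", "Genotyping"})
```

Returning None for an unmappable topic, rather than raising, lets the caller treat that topic as contributing no values.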

Returns:

The reconciled bio.tools value, which may be the same as the existing value if GitHub is silent or if the normalized sets are equal, or a new value with GitHub-derived additions.

Return type:

BT | None

bridge.pipelines.policies.gh2bt.reconcile_gh_over_bt(*, gh_norm, bt_norm, bt_value, build_bt_from_gh, log_label, equality_fn=None)#

Apply a generic GitHub-over-bio.tools reconciliation policy.

This function operates on normalized representations of GitHub and bio.tools values (gh_norm and bt_norm), while returning and constructing concrete bio.tools values (bt_value and the output).

Policy:

  1. If gh_norm is None, GitHub is treated as silent and the existing bio.tools value (bt_value) is preserved. An “unchanged” log entry is emitted.

  2. If gh_norm is not None and bt_norm is None, GitHub is treated as the only source. A new bio.tools value is constructed via build_bt_from_gh(gh_norm) and an “added” log entry is emitted.

  3. If both gh_norm and bt_norm are not None and they compare equal (gh_norm == bt_norm, or equality_fn(gh_norm, bt_norm) when equality_fn is provided), the existing bio.tools value (bt_value) is preserved and an exact-match log entry is emitted.

  4. If both gh_norm and bt_norm are not None and differ, the GitHub value is treated as authoritative. A new bio.tools value is constructed via build_bt_from_gh(gh_norm) and a conflict log entry is emitted.
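The four cases above can be sketched as follows. This is a minimal illustration rather than the package's implementation: the function name sketch_gh_over_bt, the logger name, and the log message wording are hypothetical, and the license values are made up for the example.

```python
import logging
from typing import Callable, Optional, TypeVar

GHN = TypeVar("GHN")
BTN = TypeVar("BTN")
BT = TypeVar("BT")

logger = logging.getLogger("gh2bt.sketch")  # hypothetical logger name


def sketch_gh_over_bt(
    *,
    gh_norm: Optional[GHN],
    bt_norm: Optional[BTN],
    bt_value: Optional[BT],
    build_bt_from_gh: Callable[[GHN], BT],
    log_label: str,
    equality_fn: Optional[Callable[[GHN, BTN], bool]] = None,
) -> Optional[BT]:
    # Case 1: GitHub is silent, keep whatever bio.tools already has.
    if gh_norm is None:
        logger.info("%s: unchanged", log_label)
        return bt_value

    # Case 2: GitHub is the only source.
    if bt_norm is None:
        logger.info("%s: added", log_label)
        return build_bt_from_gh(gh_norm)

    # Case 3: both sources agree, keep the existing concrete value.
    if equality_fn is not None:
        same = equality_fn(gh_norm, bt_norm)
    else:
        same = gh_norm == bt_norm
    if same:
        logger.info("%s: exact match", log_label)
        return bt_value

    # Case 4: the sources disagree, GitHub is authoritative.
    logger.info("%s: conflict, GitHub value used", log_label)
    return build_bt_from_gh(gh_norm)


# Illustrative call: normalized licenses are lowercase, concrete ones uppercase.
reconciled = sketch_gh_over_bt(
    gh_norm="mit",
    bt_norm="gpl-3.0",
    bt_value="GPL-3.0",
    build_bt_from_gh=str.upper,
    log_label="license",
)
```

Note that comparison happens on the normalized values, while the value returned in the exact-match case is the untouched bt_value, so cosmetic differences in the stored bio.tools value are not rewritten.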

Parameters:
  • gh_norm (GHN | None) – Normalized representation of the GitHub value (e.g., canonicalized URL, lowercased language set, enum, etc.), or None if GitHub provides no usable value.

  • bt_norm (BTN | None) – Normalized representation of the existing bio.tools value, or None if no value is recorded.

  • bt_value (BT | None) – The current bio.tools value to be preserved when GitHub is silent or when the normalized values match.

  • build_bt_from_gh (Callable[[GHN], BT]) – Callable that constructs a concrete bio.tools value from the normalized GitHub representation.

  • log_label (str) – Short label used in log messages to identify the reconciled field (e.g., "license", "languages", "homepage").

  • equality_fn (Callable[[GHN, BTN], bool] | None, optional) – Optional callable to determine equality between normalized GitHub and bio.tools values. If None, the default equality operator (==) is used. This parameter is useful when the normalized representations require custom comparison logic (e.g., set equality for lists).
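As an example of a custom comparison, the sketch below shows an order-insensitive equality_fn for normalized values stored as lists. The function name same_members and the language values are hypothetical.

```python
from typing import List


def same_members(gh_norm: List[str], bt_norm: List[str]) -> bool:
    """Compare two lists as sets, ignoring order and duplicates."""
    return set(gh_norm) == set(bt_norm)


gh_languages = ["python", "r"]
bt_languages = ["r", "python"]

# The default operator treats differently ordered lists as unequal,
# which would be reported as a conflict.
default_equal = gh_languages == bt_languages

# The custom comparison treats them as an exact match.
custom_equal = same_members(gh_languages, bt_languages)
```

Such a function would be passed as equality_fn=same_members so that a mere reordering of values is not treated as a disagreement between the two sources.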

Returns:

The reconciled bio.tools value according to the policy, or None if both sources effectively provide no usable value.

Return type:

BT | None

Submodules#

reconcile_gh_ontop_bt

Apply a generic GitHub-on-top-of-bio.tools policy for additive metadata.

reconcile_gh_over_bt

Apply a generic GitHub-over-bio.tools reconciliation policy.

Dependencies diagram#

Each architecture diagram below visualizes the internal dependency structure of the bridge.pipelines.policies.gh2bt package. It shows how modules and subpackages within the package depend on each other, based on direct Python imports.

  • Packages are shown as purple rectangles

  • Modules are shown as pink rectangles

  • Arrows (A → B) indicate that A directly imports B

Each subpackage’s diagram focuses only on its own internal structure; it does not include imports to or from higher-level packages (those appear in the parent package’s diagram).

bridge package dependencies