Portal API#

The Portal is the single, authoritative interface between compiled native code and the Python runtime.

All native functionality exposed by this package is accessed exclusively through the portal namespace. The contents of this namespace are not discovered at runtime: they are generated deterministically from compiled native build artifacts and explicit, developer-defined exposure rules.

Exposure generation is a separate, repeatable step and may be re-run at any time after modifying exposure configuration, without recompiling native code.

No scanning. No guessing. No implicit imports.

Exposed tree#

Native modules are exposed under an explicit exposure tree located at:

_native/compiled/exposed/

This tree is materialized by running a dedicated exposure-generation step, e.g.:

python -m hydra_forge.generate_exposed_tree
or
python -m forge.tree

The resulting structure defines the entire public native API. Only modules present in this tree are importable through portal.
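To make the generation step concrete, here is a minimal stdlib sketch of what materializing such an exposure tree might look like. This is an illustration only: the real generator is the dedicated step above, and the category mapping, shim contents, and `__init__.py` layout here are assumptions, not the actual implementation.

```python
import tempfile
from pathlib import Path

# Hypothetical category mapping; the real one comes from
# native_categories.json and the flat-exposure rule.
categories = {"io/csv": ["ascii_csv_reader"], "flat": ["ascii_csv_reader_rs"]}

root = Path(tempfile.mkdtemp()) / "_native" / "compiled" / "exposed"
for category, modules in categories.items():
    pkg_dir = root / category
    pkg_dir.mkdir(parents=True, exist_ok=True)
    # Every package level up to the tree root gets an __init__.py so the
    # tree is importable as a regular package.
    for parent in [pkg_dir, *pkg_dir.parents]:
        if parent == root.parent:
            break
        (parent / "__init__.py").touch()
    for module in modules:
        # Each exposed module is written as a thin shim file; the real
        # shims would re-export the compiled binary's symbols.
        (pkg_dir / f"{module}.py").write_text(
            f"# shim re-exporting compiled module {module}\n"
        )

generated = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))
print(generated)
```

Because the step only writes files derived from the mapping, re-running it after a configuration change regenerates the same tree without touching the compiled binaries.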

Two exposure modes are supported:

Structured exposure#

Structured exposure provides stable, hierarchical import paths defined by native_categories.json. This is the preferred mode for production APIs.

Example:

from HydraForge_Windows11.portal.io.csv import ascii_csv_reader

data = ascii_csv_reader.read_ascii_csv(...)

Flat exposure#

Flat exposure provides a provisional namespace for native modules that have not yet been assigned a stable category. It is applied automatically to every compiled native module without a structured category assignment.

Example:

from HydraForge_Windows11.portal.flat import ascii_csv_reader_rs

data = ascii_csv_reader_rs.read_ascii_csv_rs(...)

Flat exposure is fully functional, but paths may change once the module is assigned a structured category.

Import patterns#

The most common usage pattern is to import functions directly from the portal namespace:

Example:

from HydraForge_Windows11.portal.io.csv.ascii_csv_reader import read_ascii_csv

out = read_ascii_csv(
    filenames=["data.csv"],
    header_lines_to_skip=1,
    save_header_lines=False,
    columns=[["time", False, False], ["value", False, False]],
)

Advanced users may import entire exposed submodules when grouping or namespacing is desired.
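As a generic illustration of the submodule-import pattern, the following uses importlib with a standard-library module standing in for a portal path (the real package name depends on your build, so no portal module is imported here):

```python
import importlib

# "json" stands in for an exposed portal submodule such as
# <pkg>.portal.io.csv.ascii_csv_reader; the pattern is identical.
mod = importlib.import_module("json")

# With the whole module bound to one name, functions are called
# through the module namespace, which keeps call sites grouped.
parsed = mod.loads('{"value": [0.1, 0.2]}')
print(parsed["value"])  # [0.1, 0.2]
```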

Determinism and guarantees#

The Native Bridge provides the following guarantees:

  • Only explicitly exposed native modules are importable

  • The documented API exactly matches runtime availability

  • No hidden imports or side effects occur at documentation time

  • Documentation is generated without importing native binaries

In other words: if it appears here, it exists; and if it exists, it appears here.
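The last guarantee relies on listing modules from the filesystem rather than importing them. A small sketch of that idea, using pkgutil.iter_modules over a throwaway directory that stands in for _native/compiled/exposed/ (the layout here is illustrative, not the package's actual tree):

```python
import pkgutil
import tempfile
from pathlib import Path

# Build a stand-in exposed tree on disk.
root = Path(tempfile.mkdtemp())
for pkg, module in [("flat", "ascii_csv_reader_rs"), ("io", "ascii_csv_reader")]:
    (root / pkg).mkdir()
    (root / pkg / f"{module}.py").write_text("")

# pkgutil.iter_modules enumerates module names from the filesystem only,
# so nothing in the tree (and no native binary) is ever loaded.
exposed = {
    pkg: sorted(m.name for m in pkgutil.iter_modules([str(root / pkg)]))
    for pkg in ("flat", "io")
}
print(exposed)
```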

Native build map (build → exposed API)#

Language  Module               Function            Source files                                             Binary
cpp       ascii_csv_reader     read_ascii_csv      ascii_csv_reader.cpp                                     ascii_csv_reader.pyd
rust      ascii_csv_reader_rs  read_ascii_csv_rs   Cargo.toml, pyproject.toml, lib.rs, host.rs, private.rs  ascii_csv_reader_rs.pyd

Exposed native API tree#

├── flat
│   └── ascii_csv_reader_rs
└── io
    └── ascii_csv_reader

Functions#

HydraForge_Windows11.portal.flat.ascii_csv_reader_rs.read_ascii_csv_rs(*args, **kwargs)#

Read numeric data from one or more ASCII/CSV files.

This function provides a single-threaded, high-performance ASCII/CSV reader implemented in Rust and exposed to Python via Hydra Forge’s portal.

It mirrors the behavior of the C++ ascii_csv_reader exactly, including handling of malformed lines and invalid numeric values, while providing full memory safety guarantees.

Parameters:
  • filenames (list[str]) – One or more file paths to read. Files are processed sequentially and concatenated in order.

  • header_lines_to_skip (int) – Number of header lines to skip at the beginning of each file.

  • save_header_lines (bool) – If True, skipped header lines are collected and returned in the output.

  • columns (list[list]) –

    Column configuration. Each entry must be:

    [name: str, is_string: bool, skip: bool]

    • name: Key used in the output dictionary

    • is_string: True for string columns, False for numeric

    • skip: True to ignore the column entirely

Returns:

Dictionary containing:

  • One entry per non-skipped column:
    • Numeric columns as numpy.ndarray (float64)

    • String columns as Python list[str]

  • Optional header_lines if save_header_lines=True

  • __summary__ with parsing statistics

    • total_lines (int) – Total number of data lines processed

    • malformed_lines (int) – Lines skipped due to column mismatch or parse errors

    • invalid_values (int) – Number of invalid numeric values replaced with NaN

Return type:

dict

Notes

  • Numeric columns are returned as 1D numpy.ndarray of type float64

  • String columns are returned as Python lists

  • Invalid numeric values are replaced with np.nan

  • Malformed lines are skipped

  • This implementation is single-threaded

  • Fully memory-safe (no undefined behavior or segfaults)
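The notes above can be mirrored in a pure-Python behavioral sketch. This is not the implementation (the real reader is compiled Rust); it only illustrates the documented semantics: the column schema, NaN substitution, malformed-line skipping, and the __summary__ counters.

```python
import numpy as np

def sketch_read(lines, columns):
    # Behavioral sketch of read_ascii_csv_rs; mirrors documented
    # semantics only, not the actual Rust implementation.
    out = {name: [] for name, _is_string, skip in columns if not skip}
    summary = {"total_lines": 0, "malformed_lines": 0, "invalid_values": 0}
    for line in lines:
        summary["total_lines"] += 1
        fields = line.split(",")
        if len(fields) != len(columns):
            summary["malformed_lines"] += 1  # column mismatch: skip the line
            continue
        row = {}
        for (name, is_string, skip), field in zip(columns, fields):
            if skip:
                continue
            if is_string:
                row[name] = field.strip()
            else:
                try:
                    row[name] = float(field)
                except ValueError:
                    row[name] = float("nan")  # invalid numeric -> NaN
                    summary["invalid_values"] += 1
        for name, value in row.items():
            out[name].append(value)
    result = {}
    for name, is_string, skip in columns:
        if skip:
            continue
        # Numeric columns become float64 arrays; string columns stay lists.
        result[name] = out[name] if is_string else np.asarray(out[name], dtype=np.float64)
    result["__summary__"] = summary
    return result

data = sketch_read(
    ["0.1,ok", "0.2,bad,extra", "oops,ok"],
    [["value", False, False], ["comment", True, True]],
)
print(data["__summary__"])
# -> {'total_lines': 3, 'malformed_lines': 1, 'invalid_values': 1}
```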

Examples

Basic usage via portal:

>>> from [pkg_name].portal.[exposed_path].ascii_csv_reader_rs import read_ascii_csv_rs
>>> data = read_ascii_csv_rs(
...     filenames=["data.csv"],
...     header_lines_to_skip=1,
...     save_header_lines=False,
...     columns=[
...         ["time", False, False],
...         ["value", False, False],
...         ["comment", True, True],
...     ],
... )
>>> data["value"][:5]
array([0.1, 0.2, 0.3, 0.4, 0.5])

Inspect parsing statistics:

>>> data["__summary__"]
{'total_lines': 1000, 'malformed_lines': 2, 'invalid_values': 3}

See also

[pkg_name].portal.[exposed_path].csv_reader_rs_mt.read_ascii_csv_rs_mt

pandas.read_csv

HydraForge_Windows11.portal.io.ascii_csv_reader.read_ascii_csv(*args, **kwargs)#

read_ascii_csv(filenames: collections.abc.Sequence[str], header_lines_to_skip: typing.SupportsInt, save_header_lines: bool, columns: list) -> dict

Parse ASCII CSV files into structured NumPy arrays.

The reader processes the input files sequentially and parses each data row according to the provided column schema. All values are converted directly into NumPy arrays without intermediate Python objects.

Parameters:
  • filenames (list[str]) – Paths to one or more ASCII CSV files. Files are processed in the order provided, and their data is concatenated row-wise.

  • header_lines_to_skip (int) – Number of initial lines to skip in each file before data parsing begins. Typically used to ignore headers or metadata blocks.

  • save_header_lines (bool) – If True, skipped header lines are preserved and included in the output under a metadata key.

  • columns (list[list]) – Column schema describing the expected structure of each row. Each entry is [name: str, is_string: bool, skip: bool]

Returns:

Dictionary mapping column names to NumPy arrays. If save_header_lines=True, header metadata is included under a reserved key.

Return type:

dict

Raises:

RuntimeError – If any row does not conform to the schema or contains invalid values.

Notes

  • Parsing is strict and fails fast on the first error.

  • No error recovery or row skipping is performed.

  • Suitable for trusted datasets where correctness is critical.
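The strict, fail-fast contract above can be sketched in pure Python. As before, this is a behavioral illustration only (the real reader is the compiled C++ module): the first schema violation or invalid numeric value raises RuntimeError, with no recovery or row skipping.

```python
def strict_parse(lines, columns):
    # Sketch of the strict contract: any deviation from the schema
    # aborts parsing immediately with RuntimeError.
    rows = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split(",")
        if len(fields) != len(columns):
            raise RuntimeError(f"line {lineno}: expected {len(columns)} columns")
        row = []
        for (name, is_string, skip), field in zip(columns, fields):
            if skip:
                continue
            if is_string:
                row.append(field.strip())
            else:
                try:
                    row.append(float(field))
                except ValueError:
                    raise RuntimeError(f"line {lineno}: invalid value {field!r}")
        rows.append(row)
    return rows

schema = [["time", False, False], ["value", False, False]]
print(strict_parse(["0.0,1.0", "0.1,2.0"], schema))
try:
    strict_parse(["0.0,1.0", "0.1,notanumber"], schema)
except RuntimeError as exc:
    print("failed fast:", exc)
```

This contrasts with the Rust reader's lenient mode above, which skips malformed lines and substitutes NaN instead of raising.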