
API Reference

Backend

selfx.backend

selfx.backend.features

Core abstractions for running feature computations and retrieving stored analysis results.

This module provides:

  • Feature: Base Celery task for a single analysis feature. A feature computes a result for a given time interval, optionally generates an LLM summary, and stores the result via selfx.backend.results.

  • AnalysisManager: Helper for splitting time ranges into analysis intervals, discovering missing intervals, and loading stored results.

  • get_analysis_intervals(...): Utility for converting a requested time range into a mapping of stable, storage-safe interval identifiers.

Design notes

  • Results are persisted through store_result(...), get_result(...), and get_results(...) from selfx.backend.results.
  • The special interval None represents online/live analysis and is stored under the "Online" prefix.
  • Feature.run(...) is the Celery entrypoint and should not usually be overridden. Subclasses should implement perform(...) and optionally llm_prompt(...).

Expected subclass contract

A typical feature subclass implements:

  • perform(start, end) -> Any: Compute the feature result for the given interval.
  • llm_prompt(result_dict) -> str | None: Return a prompt for LLM summarization, or a falsey value to skip it.
  • layout(...), register_callbacks(...), etc. for UI integration.
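As a rough sketch of that contract (a hypothetical subclass; in real code it would derive from selfx.backend.features.Feature, and the computation here is invented for illustration):

```python
# Hypothetical sketch of the subclass contract; method names follow the
# docs above, the computation itself is invented for illustration.
class TemperatureSpikes:  # would subclass Feature in the real code
    def perform(self, start, end):
        # Compute the feature result for [start, end); any payload works,
        # since non-dict results are wrapped into {"result": ...} by run(...).
        return {"spike_count": 3}

    def llm_prompt(self, result):
        # Return a falsey value to skip LLM summarization.
        if not result.get("spike_count"):
            return None
        return f"Summarize {result['spike_count']} temperature spikes."
```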

Notes

  • This refactor removes several broken references from the original code such as self.s3 and unfinished in-memory frame management.
  • AnalysisManager now acts purely as a storage/time-range helper.

AnalysisManager

Helper for interval discovery and loading persisted analysis results.

Parameters

freq : str Pandas-compatible frequency string used to split ranges into smaller intervals, e.g. "1h" or "15min".

Notes

This class does not maintain an in-memory frame cache in this refactored version. It uses persisted results as the source of truth.

__init__(freq)

Initialize the analysis manager.

Parameters

freq : str Pandas frequency string.

__str__()

Return a string representation of the manager.

Returns

str Human-readable manager description.

get_analysis(start, finish, feature=None)

Load analysis results for all intervals between start and finish.

Parameters

start : Any Requested analysis start.
finish : Any Requested analysis finish.
feature : str | None, optional If provided, load only that feature for each interval. Otherwise load all stored feature results for each interval.

Returns

list[Any] List of loaded results in interval order.

get_frame(interval_key, feature=None)

Load stored results for a single interval.

Parameters

interval_key : str Storage-safe interval identifier such as "Online" or a sanitized timestamp string.
feature : str | None, optional If provided, load only this feature from the interval.

Returns

Any One of:
  • dict[str, Any] when feature is None
  • a single stored result object when feature is given
  • None if loading fails unexpectedly

get_non_analyzed_intervals(start, finish)

Determine which sub-intervals do not yet have stored results.

Parameters

start : Any Start of the requested range.
finish : Any End of the requested range.

Returns

list[tuple[pandas.Timestamp, pandas.Timestamp]] Intervals of length self.freq whose corresponding storage folder currently does not exist or has no stored results.

Notes

This refactored implementation infers "analyzed" from persisted storage. An interval is considered analyzed if get_results(interval_key) returns a non-empty dictionary.

get_previous_frame(finish)

Return the interval immediately preceding finish.

Parameters

finish : Any End timestamp of the desired interval.

Returns

tuple[pandas.Timestamp, pandas.Timestamp] | None Previous interval boundaries, or None if finish is invalid.

get_today_non_analyzed_frames()

Return missing analysis intervals from local midnight until now.

Returns

list[tuple[pandas.Timestamp, pandas.Timestamp]] Missing sub-intervals between today's midnight and current Berlin time.

Feature

Bases: Task

Base class for analysis features executed as Celery tasks.

Subclasses are expected to implement perform(...) and may optionally implement UI-specific hooks such as layout(...) and register_callbacks(...).

Attributes

required_features : list[str] Names of features that should be available before this feature runs.
plant_name : str | None Optional plant/system identifier for logging or UI display.
data_access : Any Optional backend/data access object injected externally.
config : dict Feature-specific configuration mapping.
tr : Any Optional translation/localization object.
color_mapping : Any Optional UI color mapping metadata.
periodic : bool Whether this feature is intended for periodic execution.
fetching : bool Whether this feature represents a data-fetching task.

__init__(tr=None, periodic=False, fetching=False)

Initialize the feature task.

Parameters

tr : Any, optional Translation/localization helper.
periodic : bool, default=False Whether the feature is intended to run periodically.
fetching : bool, default=False Whether the task is a data-fetching task.

__repr__()

Return the feature's display representation.

Returns

str The class name.

feature_name() classmethod

Return the canonical feature name.

Returns

str The class name.

get_result(sel_date, feature=None)

Load a previously stored result for this feature.

Parameters

sel_date : Any Selected date/interval start. Supported values include None, string, datetime-like values, and "Online".
feature : str | None, optional Feature name to load. Defaults to this feature's own name.

Returns

Any Stored result object, {} if missing, or None if loading fails.

icon()

Return the Material icon name for this feature.

Returns

str Material icon identifier.

is_online(role)

Indicate whether this feature is an online/live feature.

Parameters

role : Any User/application role.

Returns

bool True for live/online-only features.

layout(role, analysis, start, end)

Return the UI layout representation for this feature.

Parameters

role : Any User/application role.
analysis : Any Analysis payload to visualize.
start : Any Interval start.
end : Any Interval end.

Returns

Any UI-specific layout object.

Notes

Subclasses should override this method.

llm_prompt(result)

Build the LLM prompt for a computed result.

Parameters

result : Mapping[str, Any] Computed result payload.

Returns

str | None Prompt text for the LLM, or None / empty string to disable LLM summarization.

perform(start, end)

Execute the actual feature computation.

Parameters

start : pandas.Timestamp | None Interval start.
end : pandas.Timestamp | None Interval end.

Returns

Any Arbitrary result payload. Non-dict results are wrapped into {"result": ...} by run(...).

Notes

Subclasses should override this method.

register_callbacks(dash_app, analysis)

Register UI callbacks for this feature.

Parameters

dash_app : Any Dash application instance.
analysis : Any Analysis manager or analysis context.

Notes

Subclasses may override this method.

run(start_iso, finish_iso=None)

Celery task entrypoint.

This method converts the incoming ISO-like timestamps, executes the feature computation, optionally enriches the result with an LLM summary, and stores the final payload.

Parameters

start_iso : str | None Interval start as a string, or None for online mode.
finish_iso : str | None, optional Interval end as a string.

Returns

bool True if execution reached completion.

Raises

Exception Re-raises any exception from perform(...) after logging.

time_range_selection(role)

Indicate whether this feature supports manual time-range selection.

Parameters

role : Any User/application role.

Returns

bool True if time-range selection is supported.

get_analysis_intervals(start, finish)

Split a requested range into daily analysis intervals.

Parameters

start : Any Analysis start. Supported values include string and datetime-like values.
finish : Any Analysis end. Supported values include string and datetime/date-like values.

Returns

dict[str, tuple[pandas.Timestamp | None, pandas.Timestamp | None]] Mapping from storage-safe interval key to (interval_start, interval_end).

Special case
  • If either start or finish is None, returns {"Online": (None, None)}.

Notes
  • Intervals are generated at daily granularity.
  • Each interval maps a day-start to day-start + 1 day.
  • Keys are sanitized timestamp strings suitable for filesystem storage.
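A minimal sketch of the documented behavior, assuming daily splitting at midnight and underscore-sanitized ISO keys (a reimplementation for illustration; the real key format may differ):

```python
import re
from datetime import datetime, timedelta

def sketch_analysis_intervals(start, finish):
    # Minimal re-implementation of the documented contract (an assumption,
    # not the actual selfx code): daily intervals, storage-safe keys.
    if start is None or finish is None:
        return {"Online": (None, None)}
    intervals = {}
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day < finish:
        # Replace filesystem-unsafe characters (e.g. ':') with underscores.
        key = re.sub(r"[^a-zA-Z0-9_.-]", "_", day.isoformat())
        intervals[key] = (day, day + timedelta(days=1))
        day += timedelta(days=1)
    return intervals
```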

selfx.backend.results


Utilities for storing and retrieving computation results on the local filesystem under the fallback directory Analysis.

This module currently uses only local storage. Results are serialized with joblib and stored in a directory structure grouped by interval.

Storage layout

Results are written to::

Analysis/{interval_prefix}/{feature}.joblib

Where:
  • interval_prefix is either "Online" or a sanitized timestamp-like string
  • feature is the logical result name
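A small illustration of this layout (the helper below is hypothetical, not part of the module's public API):

```python
from pathlib import Path

def result_path(interval_prefix, feature, base="Analysis"):
    # Builds the documented path Analysis/{interval_prefix}/{feature}.joblib.
    # Hypothetical helper for illustration only.
    return Path(base) / interval_prefix / f"{feature}.joblib"
```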

Main functions

  • store_result(...) stores one serialized result
  • get_result(...) retrieves one stored result by relative path
  • is_stored(...) checks whether a result exists
  • get_results(...) loads all results for a given interval
  • delete_files(...) deletes all stored files for one or more intervals

Notes

  • Missing files in get_result(...) return {} for compatibility with the previous behavior.
  • Failed deserialization returns None after printing the traceback.
  • This module assumes that feature is already safe to use as a filename.

delete_files(intervals)

Delete all stored files for one or more intervals.

Parameters

intervals : Iterable[str] Iterable of interval prefixes whose stored results should be removed.

Returns

None

Notes

This function deletes files recursively inside each interval directory, but does not currently remove now-empty directories.

get_result(identifier)

Load a stored result by relative path.

Parameters

identifier : str Relative path under DEFAULT_RESULTS_DIR, for example: "Online/temperature.joblib".

Returns

Any Deserialized Python object.

Special cases
  • returns {} if the file does not exist
  • returns None if deserialization fails

Notes

Returning {} for missing files preserves compatibility with earlier code.

get_results(interval)

Load all stored results for a given interval.

Parameters

interval : str Interval prefix to load from, for example "Online".

Returns

dict[str, Any] Mapping from feature name to deserialized result object.

Notes
  • Only files matching *.joblib are loaded.
  • Files that fail to load are skipped after printing a traceback.
  • If the interval directory does not exist, an empty dictionary is returned.

is_stored(interval, feature)

Check whether a stored result exists.

Parameters

interval : str Interval prefix, such as "Online" or a sanitized timestamp string.
feature : str Feature name without the .joblib extension.

Returns

bool True if the corresponding file exists, otherwise False.

store_result(interval, feature, result)

Store a serialized result for a given interval and feature.

The result is written as a .joblib file under the analysis directory.

Parameters

interval : Any | None Interval identifier. If None, results are stored under "Online".
feature : str Logical name of the result. Used as the filename stem.
result : Any Python object serializable by joblib.

Returns

None

Storage path

Analysis/{interval_prefix}/{feature}.joblib

selfx.backend.perform

get_required_features(selected_feature_name, all_features)

Return all dependencies for a selected feature in topological order.

The output includes the selected feature itself. Each dependency appears before any feature that depends on it.

Parameters

selected_feature_name : str Target feature name.
all_features : dict[str, Feature] Mapping of feature names to feature objects.

Returns

list[str] Dependency-ordered feature names.

Raises

KeyError If the selected feature or any required feature is missing.
ValueError If a cyclic dependency is detected.
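A minimal reimplementation of the documented contract, assuming each feature object exposes a required_features list (a depth-first topological sort with cycle detection; the real code may differ):

```python
def get_required_features_sketch(selected, all_features):
    # all_features: name -> object with a .required_features list[str].
    # Sketch of the documented contract, not the actual implementation.
    order, visiting, done = [], set(), set()

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"cyclic dependency at {name!r}")
        if name not in all_features:
            raise KeyError(name)
        visiting.add(name)
        for dep in getattr(all_features[name], "required_features", []):
            visit(dep)  # dependencies are emitted before their dependents
        visiting.discard(name)
        done.add(name)
        order.append(name)

    visit(selected)
    return order
```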

perform_requested_features(feature_objects, celery_app, feature, system, start, finish)

Perform the requested feature (and its required features) for the given system over the start-finish range, dispatching the corresponding tasks on celery_app.

run_tasks(tasks, celery_app, interv)

Dispatch a chain of celery tasks for a given interval.

Parameters

tasks: List of celery task names present in celery_app.tasks.
celery_app: Celery application instance.
interv: A pair (start, end) of datetime-like objects. They are converted via .isoformat() and passed into each task signature.

Returns

None

Notes
  • celery is imported lazily because celery import can be heavy.
  • This function only enqueues the chain (apply_async()).

selfx.backend.datetime_utils

selfx.utils.time_utils

Datetime/timestamp helpers used across SelfX.

Design goals
  • Be explicit about timezone handling (naive vs aware).
  • Keep parsing/formatting functions predictable.
  • Prefer pandas vectorized operations for Series / DataFrames.

convert_pandas_dt_to_str(series, nano_sec=False)

Convert a datetime-like pandas Series to strings in format '%d.%m.%y %H:%M:%S.%f'.

If nano_sec=True, append 3 extra digits for nanoseconds (so you effectively get 9 digits).

convert_pandas_str_to_dt(series)

Parse strings formatted as '%d.%m.%y %H:%M:%S.%f' into pandas datetimes.

datetime_to_utc(t)

Convert an aware datetime to UTC.

Raises

ValueError if t is naive.

dt_to_str_till_sec(d, short_year=False)

Format as 'YYYY-MM-DD HH:MM:SS' (or 'YY-...' if short_year).

dt_to_str_till_sec_europe(d, omit_date=False, short_year=False, omit_seconds=False)

European formatting:
  • With date: 'DD.MM.YYYY HH:MM(:SS)'
  • Without date: 'HH:MM(:SS)'
  • With short_year: 'DD.MM.YY ...'

ensure_utc_index(df, *, col='timestamp', assume_tz_if_naive='UTC', sort=True, set_as_index=False, drop=False)

Ensure a DataFrame has UTC-aware timestamps, optionally set as index, optionally sorted.

Common use cases
  • normalize data before writing to InfluxDB
  • normalize data right after reading from InfluxDB
  • enforce consistent comparisons across sources

Parameters

df: Input DataFrame.
col: Timestamp column name. If None, uses the current index as the timestamp source.
assume_tz_if_naive: Timezone assumed for naive timestamps (see ensure_utc_series).
sort: If True, sort by timestamp (and preserve stable ordering for ties).
set_as_index: If True, set the normalized timestamp as the DataFrame index.
drop: If set_as_index=True, drop the timestamp column.

Returns

DataFrame with normalized UTC timestamps (column or index).

ensure_utc_series(s, *, assume_tz_if_naive='UTC')

Ensure a timestamp Series is timezone-aware in UTC.

Behavior
  • Parses to datetime (pd.to_datetime).
  • If naive: raise if assume_tz_if_naive is None, else localize to that tz, then convert to UTC.
  • If already tz-aware: convert to UTC.

Parameters

s: Input Series (strings/datetime/ints are accepted by pandas).
assume_tz_if_naive: Timezone to assume for naive timestamps. Common choices:
  • "UTC" (often safest for machine timestamps)
  • "Europe/Berlin" (if your sources are local time)
  • None (force callers to be explicit; raises on naive)

Returns

Series with dtype datetime64[ns, UTC]

interval_difference(new_interval, intervals)

Subtract a list of intervals from a single interval.

Given new_interval = (start, end) and a list intervals (the "blocked" parts), returns a list of remaining (non-overlapping) intervals.

Notes
  • Intervals are treated as half-open in practice: [start, end). The overlap checks follow that convention.
  • Assumes each interval is a tuple of (start, end) with start < end.
  • intervals need not be sorted; order only affects intermediate splitting, not correctness.
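The documented behavior can be sketched as follows (a reimplementation under the half-open convention, not the actual code):

```python
def interval_difference_sketch(new_interval, intervals):
    # Subtract "blocked" intervals from new_interval = (start, end),
    # returning the remaining non-overlapping pieces. Sketch only.
    remaining = [new_interval]
    for b_start, b_end in intervals:
        next_remaining = []
        for start, end in remaining:
            if b_end <= start or b_start >= end:
                next_remaining.append((start, end))  # no overlap, keep whole
                continue
            if start < b_start:
                next_remaining.append((start, b_start))  # piece before block
            if b_end < end:
                next_remaining.append((b_end, end))  # piece after block
        remaining = next_remaining
    return remaining
```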

is_valid_date(string, formats=None)

Check whether string matches any date format in formats.

istoday(ts)

Return True if ts is today (local), False if not, None if ts is None.

now_tz_naive()

Local naive current time.

robust_to_datetime(series)

Robustly parse a pandas Series to datetime.

Strategy:
  • First try pandas' inference (errors="coerce").
  • Then try a few common ISO variants with timezone info.

sort_timestamps(data)

Sort data[k] DataFrames by ['timestamp', 'name'] if timestamps are not monotonic increasing.

Expects each DataFrame to have columns: 'timestamp' and 'name'.

str_to_datetime(s, offset=None, convert_to_utc=False, tz=None, str_till_us=False)

Parse a datetime string into a datetime.datetime.

Supported inputs
  • Default: ISO 8601-like strings parseable by datetime.fromisoformat, e.g. "2024-01-01 12:34:56" or "2024-01-01T12:34:56", with or without tz offset.
  • If str_till_us=True: parses the European formats "%d.%m.%y %H:%M:%S.%f" or "%d.%m.%y %H:%M:%S".

Timezone behavior
  • If tz is provided and parsing produces a naive datetime, it is localized with tz.localize(...). (If the parsed datetime is already aware, it is left as-is.)
  • If convert_to_utc=True, the result is converted to UTC (requires an aware datetime).

Parameters

s: Input string or None.
offset: Optional timedelta added after parsing (and after UTC conversion if enabled).
convert_to_utc: Whether to convert the result to UTC.
tz: pytz timezone used to localize naive datetimes.
str_till_us: Whether to parse using the European "%d.%m.%y %H:%M:%S(.%f)" format.

Returns

datetime or None

time_ago(target_time, time_now=None, translate=None, till_hour=False)

Return a human-friendly relative time string like "3 minutes ago".

Notes
  • Assumes time_now and target_time are comparable (both naive or both aware).

to_aware(d, tz, *, assume_local_if_naive=True)

Ensure d is timezone-aware.

If d is naive and assume_local_if_naive=True, localize it with the provided tz. If d is already aware, it is converted to tz.

Raises

ValueError if d is naive and assume_local_if_naive=False.

to_naive_utc(d, *, assume_tz_if_naive=None)

Convert datetime to naive UTC (tzinfo removed).

If d is naive, you must supply assume_tz_if_naive so we know how to interpret it.
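A minimal sketch of the documented behavior using only the standard library (the logic is assumed, not copied from the module):

```python
from datetime import datetime, timezone, timedelta

def to_naive_utc_sketch(d, *, assume_tz_if_naive=None):
    # Mirrors the documented contract: interpret naive input via
    # assume_tz_if_naive, convert to UTC, strip tzinfo. Sketch only.
    if d.tzinfo is None:
        if assume_tz_if_naive is None:
            raise ValueError("naive datetime: supply assume_tz_if_naive")
        d = d.replace(tzinfo=assume_tz_if_naive)
    return d.astimezone(timezone.utc).replace(tzinfo=None)
```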

today_tz_naive()

Local naive 'today 00:00:00'.

tomorrow_tz_naive()

Local naive 'tomorrow 00:00:00'.

Uses timedelta, avoids invalid dates at month boundaries.

selfx.backend.utils

make_valid_filename(s)

Convert a string into a filesystem-safe filename.

This function replaces characters that are typically invalid in filenames with underscores and optionally truncates the result to a maximum length.

Parameters

s : str The input string that should be converted into a valid filename.

Returns

str A sanitized filename containing only letters, numbers, underscores, hyphens, and dots, with a maximum length of 255 characters.

Notes
  • Invalid characters are replaced using the regex [^a-zA-Z0-9_.-] (the hyphen is placed last so it is matched literally rather than forming a range).
  • The 255-character limit corresponds to common filesystem limits.
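A sketch of the documented sanitization, assuming the hyphen-last character class described above (not the actual implementation):

```python
import re

def make_valid_filename_sketch(s, max_len=255):
    # Replace anything outside letters, digits, '_', '.', '-' with '_',
    # then truncate to the common 255-character filesystem limit.
    return re.sub(r"[^a-zA-Z0-9_.-]", "_", s)[:max_len]
```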

parse_independent_processes_file(file_name)

Parse a text file describing independent process groups.

The file is expected to contain blocks of lines separated by blank lines. Each block represents one independent process group.

Within a block:
  • Lines starting with '% ' define the group name.
  • Other lines represent the contents of the group.

If a group name is not defined, a default name "Gruppe X" is assigned.

Parameters

file_name : str Path to the input file to parse.

Returns

Tuple[List[str], List[List[str]]] A tuple containing:
  • A list of group names.
  • A list of groups, where each group is a list of lines belonging to it.

Example

Input file:

% Process A
step1
step2

% Process B
step1
step2

Output:

(
    ["Process A", "Process B"],
    [
        ["% Process A", "step1", "step2"],
        ["% Process B", "step1", "step2"]
    ]
)
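The parsing rules above can be sketched as follows. This hypothetical helper works on a string rather than a file path, and the exact "Gruppe X" numbering scheme is an assumption:

```python
def parse_blocks_sketch(text):
    # Blocks are separated by blank lines; a '% ' line names the group.
    # Simplified sketch of the documented format, operating on a string.
    names, groups = [], []
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    for i, block in enumerate(blocks):
        lines = block.splitlines()
        # Fall back to a default name if no '% ' line exists (numbering assumed).
        name = next((l[2:] for l in lines if l.startswith("% ")), f"Gruppe {i + 1}")
        names.append(name)
        groups.append(lines)
    return names, groups
```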

try_flatten(list_of_list)

Flatten a list containing nested lists by one level.

If an element is a list, its items are expanded into the result. If it is not a list, the element is kept as-is.

Parameters

list_of_list : List[Any] A list containing elements that may themselves be lists.

Returns

List[Any] A flattened list where nested lists are expanded by one level.

Example

Input: [1, [2, 3], 4, [5]]

Output

[1, 2, 3, 4, 5]
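A minimal sketch matching the documented one-level flattening behavior:

```python
def try_flatten_sketch(list_of_list):
    # Expand list elements by one level; keep non-list elements as-is.
    out = []
    for item in list_of_list:
        if isinstance(item, list):
            out.extend(item)
        else:
            out.append(item)
    return out
```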

Dashboard

selfx.dash

selfx.dash.dashboard

SelfXDash

Main application wrapper for the SelfX Dash dashboard.

Responsibilities
  • Create and configure the Dash app instance.
  • Register plants/systems and their feature classes.
  • Instantiate feature objects and register their callbacks.
  • Provide routing (URL -> page render) and date-range/reevaluate behavior.

add_system(name, features=(), unified=True, settings=False, preferences=False, refresh=False, freq='1h')

Register a system/plant in the dashboard and wire up feature callbacks.

features can be:
  • iterable of Feature classes (treated as role "Default")
  • mapping role -> iterable of Feature classes

Each feature class is expected to provide:
  • feature_name() -> str
  • register_callbacks(app, analysis_manager)
  • config : dict (parameter -> dict containing at least "value")

register_celery_tasks()

Register feature tasks and (optionally) create a locked periodic chain executor.

plants_roles_features is expected like: {plant_name: {role: [(cls, mdl), ...], ...}, ...}

editable_table(df, use_columns=None, conditional_rows=None, hidden_columns=None, limit_numeric_precision=None, numeric_cols=None, editable_columns=None, style_as_list_view=False, **kwargs)

Editable Dash DataTable helper.

  • numeric_cols marks columns as numeric (optionally with limit_numeric_precision)
  • editable_columns marks specific columns editable
  • conditional_rows highlights specific (column, row_index) pairs

error_content(message, path='')

Default error page content.

get_modal(modal_id, title='Notification', button=True, button_text='Acknowledge')

Small modal helper with optional acknowledge button.

table(data, use_columns=None, **kwargs)

Simple read-only table helper that normalizes object columns to strings.

selfx.dash.layouts

selfx.dash.colors

contrast(hex_color, *, threshold=186)

Return black or white depending on the perceived brightness of a background color.

Parameters

hex_color : str Color in '#RRGGBB' format.
threshold : float, default=186 Brightness threshold for switching to black text.

Returns

str '#000000' for light backgrounds, '#FFFFFF' for dark backgrounds.
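A sketch of the contrast logic; the 0.299/0.587/0.114 luma weights are an assumption about the implementation, not taken from the source:

```python
def contrast_sketch(hex_color, *, threshold=186):
    # Perceived brightness via the common 0.299/0.587/0.114 luma weights
    # (an assumed formula); bright backgrounds get black text.
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    brightness = 0.299 * r + 0.587 * g + 0.114 * b
    return "#000000" if brightness > threshold else "#FFFFFF"
```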

hex_to_rgb(hex_color)

Convert '#RRGGBB' into an (R, G, B) tuple.

hex_to_rgba(hex_color, opacity=None)

Convert a hex color string to CSS rgb(...) or rgba(...).

Parameters

hex_color : str Color in '#RRGGBB' format.
opacity : float | None, optional Opacity value in the range 0..1.

Returns

str CSS color string, either 'rgb(r,g,b)' or 'rgba(r,g,b,a)'.

iterate(index)

Return a color from the combined NEUTRAL and OTHER palettes, cycling by index.

opacity(color, alpha)

Apply alpha transparency to a hex color and return '#RRGGBBAA'.

Parameters

color : str Color in '#RRGGBB' format.
alpha : float Opacity in the range 0..1.

Returns

str 8-digit hex color string like '#FF000080'.

rgb_to_hex(rgb, *, with_hash=False)

Convert an RGB triplet into a 6-digit hex string.

Parameters

rgb : Sequence[int] Sequence of three integers in the range 0..255.
with_hash : bool, default=False Whether to prepend '#'.

Returns

str Hex color string like 'ff00aa' or '#ff00aa'.
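Two of these helpers can be sketched as follows (hypothetical reimplementations consistent with the documented examples, not the actual module code):

```python
def hex_to_rgb_sketch(hex_color):
    # '#RRGGBB' -> (R, G, B); mirrors the documented hex_to_rgb contract.
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def opacity_sketch(color, alpha):
    # '#RRGGBB' + 2-digit alpha -> '#RRGGBBAA', as in the opacity(...) docs.
    return f"{color}{round(alpha * 255):02X}"
```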

selfx.dash.plot

selfx.dash.routing_utils