API Reference¶
Backend¶
selfx.backend
¶
selfx.backend.features
¶
Core abstractions for running feature computations and retrieving stored analysis results.
This module provides:
- `Feature`: Base Celery task for a single analysis feature. A feature computes a result for a given time interval, optionally generates an LLM summary, and stores the result via `selfx.backend.results`.
- `AnalysisManager`: Helper for splitting time ranges into analysis intervals, discovering missing intervals, and loading stored results.
- `get_analysis_intervals(...)`: Utility for converting a requested time range into a mapping of stable, storage-safe interval identifiers.
Design notes¶
- Results are persisted through `store_result(...)`, `get_result(...)`, and `get_results(...)` from `selfx.backend.results`.
- The special interval `None` represents online/live analysis and is stored under the `"Online"` prefix.
- `Feature.run(...)` is the Celery entrypoint and should not usually be overridden. Subclasses should implement `perform(...)` and optionally `llm_prompt(...)`.
Expected subclass contract¶
A typical feature subclass implements:
- `perform(start, end) -> Any`: Compute the feature result for the given interval.
- `llm_prompt(result_dict) -> str | None`: Return a prompt for LLM summarization, or a falsey value to skip it.
- `layout(...)`, `register_callbacks(...)`, etc. for UI integration.
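The contract can be sketched with a minimal stand-in base class. The real `Feature` is a Celery `Task`; the `Feature` stub and the `MeanTemperature` subclass below are hypothetical illustrations, not the actual implementation:

```python
# Hypothetical stand-in for selfx.backend.features.Feature (the real
# base class is a Celery Task with storage and logging behavior).
class Feature:
    def perform(self, start, end):
        raise NotImplementedError

    def llm_prompt(self, result):
        return None  # falsey -> LLM summarization is skipped


class MeanTemperature(Feature):
    """Illustrative feature computing a mean over an interval."""

    def perform(self, start, end):
        # A real feature would query data for [start, end) here.
        return {"mean_temp": 21.5, "start": start, "end": end}

    def llm_prompt(self, result):
        return f"Summarize a mean temperature of {result['mean_temp']} °C."
```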
Notes¶
- This refactor removes several broken references from the original code such as `self.s3` and unfinished in-memory frame management. `AnalysisManager` now acts purely as a storage/time-range helper.
AnalysisManager
¶
Helper for interval discovery and loading persisted analysis results.
Parameters¶
freq : str
Pandas-compatible frequency string used to split ranges into smaller
intervals, e.g. "1h" or "15min".
Notes¶
This class does not maintain an in-memory frame cache in this refactored version. It uses persisted results as the source of truth.
__str__()
¶
get_analysis(start, finish, feature=None)
¶
Load analysis results for all intervals between start and finish.
Parameters¶
start : Any
    Requested analysis start.
finish : Any
    Requested analysis finish.
feature : str | None, optional
    If provided, load only that feature for each interval. Otherwise
    load all stored feature results for each interval.
Returns¶
list[Any]
    List of loaded results in interval order.
get_frame(interval_key, feature=None)
¶
Load stored results for a single interval.
Parameters¶
interval_key : str
Storage-safe interval identifier such as "Online" or a sanitized
timestamp string.
feature : str | None, optional
If provided, load only this feature from the interval.
Returns¶
Any
Either:
- dict[str, Any] when feature is None
- a single stored result object when feature is given
- None if loading fails unexpectedly
get_non_analyzed_intervals(start, finish)
¶
Determine which sub-intervals do not yet have stored results.
Parameters¶
start : Any
    Start of the requested range.
finish : Any
    End of the requested range.
Returns¶
list[tuple[pandas.Timestamp, pandas.Timestamp]]
Intervals of length self.freq whose corresponding storage folder
currently does not exist or has no stored results.
Notes¶
This refactored implementation infers "analyzed" from persisted storage.
An interval is considered analyzed if get_results(interval_key)
returns a non-empty dictionary.
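Assuming the storage lookup is reduced to a plain dictionary, the discovery logic might look like the sketch below. The key format and the `freq` handling are illustrative; the real class derives intervals from a pandas frequency string and checks `get_results(interval_key)`:

```python
from datetime import datetime, timedelta

def non_analyzed_intervals(start, finish, freq, stored):
    """Sketch: collect (start, end) sub-intervals of length `freq` whose
    interval key has no stored results. `stored` stands in for the
    persisted-results lookup (get_results)."""
    missing = []
    cur = start
    while cur < finish:
        nxt = min(cur + freq, finish)
        key = cur.strftime("%Y-%m-%d_%H-%M-%S")  # illustrative storage-safe key
        if not stored.get(key):  # "analyzed" == non-empty results dict
            missing.append((cur, nxt))
        cur = nxt
    return missing
```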
get_previous_frame(finish)
¶
Feature
¶
Bases: Task
Base class for analysis features executed as Celery tasks.
Subclasses are expected to implement perform(...) and may optionally
implement UI-specific hooks such as layout(...) and
register_callbacks(...).
Attributes¶
required_features : list[str]
    Names of features that should be available before this feature runs.
plant_name : str | None
    Optional plant/system identifier for logging or UI display.
data_access : Any
    Optional backend/data access object injected externally.
config : dict
    Feature-specific configuration mapping.
tr : Any
    Optional translation/localization object.
color_mapping : Any
    Optional UI color mapping metadata.
periodic : bool
    Whether this feature is intended for periodic execution.
fetching : bool
    Whether this feature represents a data-fetching task.
__init__(tr=None, periodic=False, fetching=False)
¶
Initialize the feature task.
Parameters¶
tr : Any, optional
    Translation/localization helper.
periodic : bool, default=False
    Whether the feature is intended to run periodically.
fetching : bool, default=False
    Whether the task is a data-fetching task.
get_result(sel_date, feature=None)
¶
Load a previously stored result for this feature.
Parameters¶
sel_date : Any
Selected date/interval start. Supported values include None,
string, datetime-like values, and "Online".
feature : str | None, optional
Feature name to load. Defaults to this feature's own name.
Returns¶
Any
Stored result object, {} if missing, or None if loading
fails.
is_online(role)
¶
layout(role, analysis, start, end)
¶
llm_prompt(result)
¶
perform(start, end)
¶
register_callbacks(dash_app, analysis)
¶
run(start_iso, finish_iso=None)
¶
Celery task entrypoint.
This method converts the incoming ISO-like timestamps, executes the feature computation, optionally enriches the result with an LLM summary, and stores the final payload.
Parameters¶
start_iso : str | None
Interval start as a string, or None for online mode.
finish_iso : str | None, optional
Interval end as a string.
Returns¶
bool
True if execution reached completion.
Raises¶
Exception
Re-raises any exception from perform(...) after logging.
get_analysis_intervals(start, finish)
¶
Split a requested range into daily analysis intervals.
Parameters¶
start : Any
    Analysis start. Supported values include string and datetime-like values.
finish : Any
    Analysis end. Supported values include string and datetime/date-like values.
Returns¶
dict[str, tuple[pandas.Timestamp | None, pandas.Timestamp | None]]
Mapping from storage-safe interval key to (interval_start, interval_end).
Special case:
- If either ``start`` or ``finish`` is ``None``, returns
``{"Online": (None, None)}``.
Notes¶
- Intervals are generated at daily granularity.
- Each interval maps a day-start to day-start + 1 day.
- Keys are sanitized timestamp strings suitable for filesystem storage.
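The daily-interval mapping described above could be sketched as follows. The key format shown is an illustrative guess at the sanitized timestamp string, not the exact one produced by the real utility:

```python
from datetime import datetime, timedelta

def analysis_intervals(start, finish):
    """Sketch of the documented behavior: map storage-safe interval keys
    to (day_start, day_start + 1 day) pairs at daily granularity."""
    if start is None or finish is None:
        return {"Online": (None, None)}  # special case for online analysis
    intervals = {}
    day = datetime(start.year, start.month, start.day)  # truncate to day start
    while day < finish:
        key = day.strftime("%Y-%m-%d_%H-%M-%S")  # illustrative sanitized key
        intervals[key] = (day, day + timedelta(days=1))
        day += timedelta(days=1)
    return intervals
```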
selfx.backend.results
¶
results.py
Utilities for storing and retrieving computation results on the local
filesystem under the fallback directory Analysis.
This module currently uses only local storage. Results are serialized with
joblib and stored in a directory structure grouped by interval.
Storage layout¶
Results are written to::
Analysis/{interval_prefix}/{feature}.joblib
Where:
- interval_prefix is either "Online" or a sanitized timestamp-like string
- feature is the logical result name
Main functions¶
- `store_result(...)` stores one serialized result
- `get_result(...)` retrieves one stored result by relative path
- `is_stored(...)` checks whether a result exists
- `get_results(...)` loads all results for a given interval
- `delete_files(...)` deletes all stored files for one or more intervals
Notes¶
- Missing files in `get_result(...)` return `{}` for compatibility with the previous behavior.
- Failed deserialization returns `None` after printing the traceback.
- This module assumes that `feature` is already safe to use as a filename.
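The storage layout and the `{}`/`None` return conventions can be illustrated with a self-contained sketch. It substitutes pickle for joblib so it runs without extra dependencies; the real module uses joblib and `.joblib` filenames:

```python
import pickle
from pathlib import Path

RESULTS_DIR = Path("Analysis")  # fallback directory, as described above

def store_result(interval, feature, result, root=RESULTS_DIR):
    """Sketch of the {root}/{interval_prefix}/{feature} layout."""
    prefix = "Online" if interval is None else str(interval)
    folder = root / prefix
    folder.mkdir(parents=True, exist_ok=True)
    (folder / f"{feature}.pkl").write_bytes(pickle.dumps(result))

def get_result(identifier, root=RESULTS_DIR):
    """Sketch of the documented return conventions."""
    path = root / identifier
    if not path.exists():
        return {}      # missing file -> {} for backward compatibility
    try:
        return pickle.loads(path.read_bytes())
    except Exception:
        return None    # failed deserialization -> None
```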
delete_files(intervals)
¶
Delete all stored files for one or more intervals.
Parameters¶
intervals : Iterable[str]
    Iterable of interval prefixes whose stored results should be removed.
Returns¶
None
Notes¶
This function deletes files recursively inside each interval directory, but does not currently remove now-empty directories.
get_result(identifier)
¶
Load a stored result by relative path.
Parameters¶
identifier : str
Relative path under DEFAULT_RESULTS_DIR, for example:
"Online/temperature.joblib".
Returns¶
Any
    Deserialized Python object.
Special cases:
- returns ``{}`` if the file does not exist
- returns ``None`` if deserialization fails
Notes¶
Returning {} for missing files preserves compatibility with earlier code.
get_results(interval)
¶
Load all stored results for a given interval.
Parameters¶
interval : str
Interval prefix to load from, for example "Online".
Returns¶
dict[str, Any]
    Mapping from feature name to deserialized result object.
Notes¶
- Only files matching `*.joblib` are loaded.
- Files that fail to load are skipped after printing a traceback.
- If the interval directory does not exist, an empty dictionary is returned.
is_stored(interval, feature)
¶
store_result(interval, feature, result)
¶
Store a serialized result for a given interval and feature.
The result is written as a .joblib file under the analysis directory.
Parameters¶
interval : Any | None
Interval identifier. If None, results are stored under "Online".
feature : str
Logical name of the result. Used as the filename stem.
result : Any
Python object serializable by joblib.
Returns¶
None
Storage path¶
Analysis/{interval_prefix}/{feature}.joblib
selfx.backend.perform
¶
get_required_features(selected_feature_name, all_features)
¶
Return all dependencies for a selected feature in topological order.
The output includes the selected feature itself. Each dependency appears before any feature that depends on it.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| selected_feature_name | str | Target feature name. | required |
| all_features | dict[str, Feature] | Mapping of feature names to feature objects. | required |
Returns:

| Type | Description |
|---|---|
| list[str] | Dependency-ordered feature names. |
Raises:

| Type | Description |
|---|---|
| KeyError | If the selected feature or any required feature is missing. |
| ValueError | If a cyclic dependency is detected. |
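A minimal sketch of this dependency resolution using the stdlib `graphlib` module. Here `all_features` is simplified to a mapping from feature name to its list of required feature names, rather than the real `Feature` objects:

```python
from graphlib import TopologicalSorter, CycleError

def required_features(selected, all_features):
    """Sketch: return `selected` plus its transitive dependencies, each
    dependency ordered before any feature that depends on it."""
    graph = {}

    def visit(name):
        if name in graph:
            return
        deps = all_features[name]   # KeyError if a feature is missing
        graph[name] = set(deps)
        for dep in deps:
            visit(dep)

    visit(selected)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("cyclic dependency detected") from exc
```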
perform_requested_features(feature_objects, celery_app, feature, system, start, finish)
¶
Perform the requested feature (and its required features) for the given system and time range.
run_tasks(tasks, celery_app, interv)
¶
Dispatch a chain of celery tasks for a given interval.
Parameters¶
tasks:
    List of celery task names present in celery_app.tasks.
celery_app:
    Celery application instance.
interv:
    A pair (start, end) of datetime-like objects. They are converted via
    .isoformat() and passed into each task signature.
Returns¶
None
Notes¶
- celery is imported lazily because celery import can be heavy.
- This function only enqueues the chain (apply_async()).
selfx.backend.datetime_utils
¶
selfx.utils.time_utils
Datetime/timestamp helpers used across SelfX.
Design goals
- Be explicit about timezone handling (naive vs aware).
- Keep parsing/formatting functions predictable.
- Prefer pandas vectorized operations for Series / DataFrames.
convert_pandas_dt_to_str(series, nano_sec=False)
¶
Convert a datetime-like pandas Series to strings in format '%d.%m.%y %H:%M:%S.%f'.
If nano_sec=True, append 3 extra digits for nanoseconds (so you effectively get 9 digits).
convert_pandas_str_to_dt(series)
¶
Parse strings formatted as '%d.%m.%y %H:%M:%S.%f' into pandas datetimes.
dt_to_str_till_sec(d, short_year=False)
¶
Format as 'YYYY-MM-DD HH:MM:SS' (or 'YY-...' if short_year).
dt_to_str_till_sec_europe(d, omit_date=False, short_year=False, omit_seconds=False)
¶
European formatting:
- With date: 'DD.MM.YYYY HH:MM(:SS)'
- Without date: 'HH:MM(:SS)'
- With short_year: 'DD.MM.YY ...'
ensure_utc_index(df, *, col='timestamp', assume_tz_if_naive='UTC', sort=True, set_as_index=False, drop=False)
¶
Ensure a DataFrame has UTC-aware timestamps, optionally set as index, optionally sorted.
Common use cases
- normalize data before writing to InfluxDB
- normalize data right after reading from InfluxDB
- enforce consistent comparisons across sources
Parameters¶
df:
Input DataFrame.
col:
Timestamp column name. If None, uses the current index as the timestamp source.
assume_tz_if_naive:
Timezone assumed for naive timestamps (see ensure_utc_series).
sort:
If True, sort by timestamp (and preserve stable ordering for ties).
set_as_index:
If True, set the normalized timestamp as the DataFrame index.
drop:
If set_as_index=True, drop the timestamp column.
Returns¶
DataFrame with normalized UTC timestamps (column or index).
ensure_utc_series(s, *, assume_tz_if_naive='UTC')
¶
Ensure a timestamp Series is timezone-aware in UTC.
Behavior
- Parses to datetime (pd.to_datetime).
- If naive:
- if assume_tz_if_naive is None -> raise
- else localize to that tz, then convert to UTC
- If already tz-aware -> convert to UTC
Parameters¶
s:
    Input Series (strings/datetime/ints are accepted by pandas).
assume_tz_if_naive:
    Timezone to assume for naive timestamps. Common choices:
    - "UTC" (often safest for machine timestamps)
    - "Europe/Berlin" (if your sources are local time)
    - None (force callers to be explicit; raises on naive)
Returns¶
Series with dtype datetime64[ns, UTC]
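A scalar sketch of the documented behavior using stdlib `zoneinfo`. The real helper operates on pandas Series via `pd.to_datetime` / `tz_localize` / `tz_convert`; this stand-in only mirrors the naive/aware decision logic:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def ensure_utc(dt, assume_tz_if_naive="UTC"):
    """Sketch: localize naive datetimes, then convert everything to UTC."""
    if dt.tzinfo is None:
        if assume_tz_if_naive is None:
            raise ValueError("naive timestamp and no assumed timezone")
        dt = dt.replace(tzinfo=ZoneInfo(assume_tz_if_naive))
    return dt.astimezone(timezone.utc)
```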
interval_difference(new_interval, intervals)
¶
Subtract a list of intervals from a single interval.
Given new_interval = (start, end) and a list intervals (the "blocked" parts),
returns a list of remaining (non-overlapping) intervals.
Notes
- Intervals are treated as half-open-ish in practice: [start, end).
The overlap checks follow that convention.
- Assumes each interval is a tuple of (start, end) with start < end.
- intervals need not be sorted; order only affects intermediate splitting, not correctness.
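The half-open subtraction described above can be sketched as follows (a simplified re-implementation of the documented behavior, shown with plain numbers; real callers pass datetimes):

```python
def interval_difference(new_interval, intervals):
    """Subtract "blocked" [start, end) intervals from new_interval,
    returning the remaining non-overlapping pieces."""
    remaining = [new_interval]
    for b_start, b_end in intervals:
        next_remaining = []
        for start, end in remaining:
            if b_end <= start or b_start >= end:    # no overlap: keep whole
                next_remaining.append((start, end))
                continue
            if start < b_start:                      # piece left of the block
                next_remaining.append((start, b_start))
            if b_end < end:                          # piece right of the block
                next_remaining.append((b_end, end))
        remaining = next_remaining
    return remaining
```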
is_valid_date(string, formats=None)
¶
Check whether string matches any date format in formats.
istoday(ts)
¶
Return True if ts is today (local), False if not, None if ts is None.
now_tz_naive()
¶
Local naive current time.
robust_to_datetime(series)
¶
Robustly parse a pandas Series to datetime.
Strategy:
- First try pandas' inference (errors="coerce")
- Then try a few common ISO variants with timezone info.
sort_timestamps(data)
¶
Sort data[k] DataFrames by ['timestamp', 'name'] if timestamps are not monotonic increasing.
Expects each DataFrame to have columns: 'timestamp' and 'name'.
str_to_datetime(s, offset=None, convert_to_utc=False, tz=None, str_till_us=False)
¶
Parse a datetime string into a datetime.datetime.
Supported inputs
- Default: ISO 8601-like strings parseable by datetime.fromisoformat,
e.g. "2024-01-01 12:34:56", "2024-01-01T12:34:56", with or without tz offset.
- If str_till_us=True: parses the European format
"%d.%m.%y %H:%M:%S.%f" or "%d.%m.%y %H:%M:%S"
Timezone behavior
- If tz is provided and parsing produces a naive datetime, it is localized with tz.localize(...).
(If the parsed datetime is already aware, it is left as-is.)
- If convert_to_utc=True, the result is converted to UTC (requires an aware datetime).
Parameters¶
s:
    Input string or None.
offset:
    Optional timedelta added after parsing (and after UTC conversion if enabled).
convert_to_utc:
    Whether to convert the result to UTC.
tz:
    pytz timezone used to localize naive datetimes.
str_till_us:
    Whether to parse using the European "%d.%m.%y %H:%M:%S(.%f)" format.
Returns¶
datetime or None
time_ago(target_time, time_now=None, translate=None, till_hour=False)
¶
Return a human-friendly relative time string like "3 minutes ago".
Notes
- Assumes time_now and target_time are comparable (both naive or both aware).
to_aware(d, tz, *, assume_local_if_naive=True)
¶
Ensure d is timezone-aware.
If d is naive and assume_local_if_naive=True, localize it with the provided tz.
If d is already aware, it is converted to tz.
Raises¶
ValueError if d is naive and assume_local_if_naive=False.
to_naive_utc(d, *, assume_tz_if_naive=None)
¶
Convert datetime to naive UTC (tzinfo removed).
If d is naive, you must supply assume_tz_if_naive so we know how to interpret it.
today_tz_naive()
¶
Local naive 'today 00:00:00'.
tomorrow_tz_naive()
¶
Local naive 'tomorrow 00:00:00'.
Uses timedelta, avoids invalid dates at month boundaries.
selfx.backend.utils
¶
make_valid_filename(s)
¶
Convert a string into a filesystem-safe filename.
This function replaces characters that are typically invalid in filenames with underscores and optionally truncates the result to a maximum length.
Parameters¶
s : str
    The input string that should be converted into a valid filename.
Returns¶
str
    A sanitized filename containing only letters, numbers, underscores,
    hyphens, and dots, with a maximum length of 255 characters.
Notes¶
- Invalid characters are replaced using the regex [^a-zA-Z0-9_.-] (the hyphen must come last, or be escaped, so it is not interpreted as a character range).
- The 255-character limit corresponds to common filesystem limits.
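A minimal sketch of the sanitization, assuming underscore replacement and the 255-character cap described above:

```python
import re

def make_valid_filename(s, max_len=255):
    """Replace characters outside [a-zA-Z0-9_.-] with underscores and
    truncate to common filesystem filename limits."""
    # Hyphen placed last in the class so it is a literal, not a range.
    return re.sub(r"[^a-zA-Z0-9_.-]", "_", s)[:max_len]
```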
parse_independent_processes_file(file_name)
¶
Parse a text file describing independent process groups.
The file is expected to contain blocks of lines separated by blank lines. Each block represents one independent process group.
Within a block: - Lines starting with '% ' define the group name. - Other lines represent the contents of the group.
If a group name is not defined, a default name "Gruppe X" is assigned.
Parameters¶
file_name : str
    Path to the input file to parse.
Returns¶
Tuple[List[str], List[List[str]]]
    A tuple containing:
    - A list of group names.
    - A list of groups, where each group is a list of lines belonging to it.
Example¶
Input file:
% Process A
step1
step2
% Process B
step1
step2
Output:
(
["Process A", "Process B"],
[
["% Process A", "step1", "step2"],
["% Process B", "step1", "step2"]
]
)
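The parsing rules above can be sketched against an in-memory string. The real function takes a file path; `parse_groups` and the text-based interface are hypothetical simplifications:

```python
def parse_groups(text):
    """Sketch: split into blank-line-separated blocks, take the '% ' line
    as the group name, and fall back to a default 'Gruppe X' name."""
    names, groups = [], []
    for i, block in enumerate(filter(None, text.split("\n\n")), start=1):
        lines = block.strip().splitlines()
        name = next((ln[2:] for ln in lines if ln.startswith("% ")),
                    f"Gruppe {i}")
        names.append(name)
        groups.append(lines)   # the group keeps its '% ' header line
    return names, groups
```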
try_flatten(list_of_list)
¶
Flatten a list containing nested lists by one level.
If an element is a list, its items are expanded into the result. If it is not a list, the element is kept as-is.
Parameters¶
list_of_list : List[Any]
    A list containing elements that may themselves be lists.
Returns¶
List[Any]
    A flattened list where nested lists are expanded by one level.
Example¶
Input: [1, [2, 3], 4, [5]]
Output: [1, 2, 3, 4, 5]
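A one-level flatten matching the documented behavior:

```python
def try_flatten(list_of_list):
    """Expand list elements by one level; keep non-list items as-is."""
    flat = []
    for item in list_of_list:
        if isinstance(item, list):
            flat.extend(item)   # expand one level only
        else:
            flat.append(item)   # non-list elements pass through
    return flat
```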
Dashboard¶
selfx.dash
¶
selfx.dash.dashboard
¶
SelfXDash
¶
Main application wrapper for the SelfX Dash dashboard.
Responsibilities¶
- Create and configure the Dash app instance.
- Register plants/systems and their feature classes.
- Instantiate feature objects and register their callbacks.
- Provide routing (URL -> page render) and date-range/reevaluate behavior.
add_system(name, features=(), unified=True, settings=False, preferences=False, refresh=False, freq='1h')
¶
Register a system/plant in the dashboard and wire up feature callbacks.
features can be:
- iterable of Feature classes (treated as role "Default")
- mapping role -> iterable of Feature classes
Each feature class is expected to provide:
- feature_name() -> str
- register_callbacks(app, analysis_manager)
- config : dict (parameter -> dict containing at least "value")
register_celery_tasks()
¶
Register feature tasks and (optionally) create a locked periodic chain executor.
plants_roles_features is expected like:
{plant_name: {role: [(cls, mdl), ...], ...}, ...}
editable_table(df, use_columns=None, conditional_rows=None, hidden_columns=None, limit_numeric_precision=None, numeric_cols=None, editable_columns=None, style_as_list_view=False, **kwargs)
¶
Editable Dash DataTable helper.
- `numeric_cols` marks columns as numeric (optionally with `limit_numeric_precision`)
- `editable_columns` marks specific columns editable
- `conditional_rows` highlights specific (column, row_index) pairs
error_content(message, path='')
¶
Default error page content.
get_modal(modal_id, title='Notification', button=True, button_text='Acknowledge')
¶
Small modal helper with optional acknowledge button.
table(data, use_columns=None, **kwargs)
¶
Simple read-only table helper that normalizes object columns to strings.
selfx.dash.layouts
¶
selfx.dash.colors
¶
contrast(hex_color, *, threshold=186)
¶
Return black or white depending on the perceived brightness of a background color.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| hex_color | str | Color in '#RRGGBB' format. | required |
| threshold | float | Brightness threshold for switching to black text. | 186 |
Returns:

| Type | Description |
|---|---|
| str | '#000000' for light backgrounds, '#FFFFFF' for dark backgrounds. |
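A sketch of this decision using the common 0.299/0.587/0.114 perceived-brightness weights; the exact formula used by selfx.dash.colors is an assumption:

```python
def contrast(hex_color, *, threshold=186):
    """Pick black text for light backgrounds, white text for dark ones."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    # Perceived brightness: green contributes most, blue least.
    brightness = 0.299 * r + 0.587 * g + 0.114 * b
    return "#000000" if brightness > threshold else "#FFFFFF"
```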
hex_to_rgb(hex_color)
¶
Convert '#RRGGBB' into an (R, G, B) tuple.
hex_to_rgba(hex_color, opacity=None)
¶
Convert a hex color string to CSS rgb(...) or rgba(...).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| hex_color | str | Color in '#RRGGBB' format. | required |
| opacity | float \| None | Optional opacity value in the range 0..1. | None |
Returns:

| Type | Description |
|---|---|
| str | CSS color string, either 'rgb(r,g,b)' or 'rgba(r,g,b,a)'. |
iterate(index)
¶
Return a color from the combined NEUTRAL and OTHER palettes, cycling by index.
opacity(color, alpha)
¶
Apply alpha transparency to a hex color and return '#RRGGBBAA'.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| color | str | Color in '#RRGGBB' format. | required |
| alpha | float | Opacity in the range 0..1. | required |
Returns:

| Type | Description |
|---|---|
| str | 8-digit hex color string like '#FF000080'. |
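The alpha suffix can be sketched as a two-hex-digit channel appended to the '#RRGGBB' string (a simplified re-implementation of the documented behavior):

```python
def opacity(color, alpha):
    """Append a 2-digit alpha channel to a '#RRGGBB' color."""
    # Scale 0..1 to 0..255 and format as two uppercase hex digits.
    return f"{color}{round(alpha * 255):02X}"
```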
rgb_to_hex(rgb, *, with_hash=False)
¶
Convert an RGB triplet into a 6-digit hex string.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| rgb | Sequence[int] | Sequence of three integers in the range 0..255. | required |
| with_hash | bool | Whether to prepend '#'. | False |
Returns:

| Type | Description |
|---|---|
| str | Hex color string like 'ff00aa' or '#ff00aa'. |
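A round-trip sketch of rgb_to_hex together with hex_to_rgb, matching the documented behavior:

```python
def rgb_to_hex(rgb, *, with_hash=False):
    """Format three 0..255 integers as a 6-digit lowercase hex string."""
    prefix = "#" if with_hash else ""
    return prefix + "".join(f"{c:02x}" for c in rgb)

def hex_to_rgb(hex_color):
    """Parse '#RRGGBB' (hash optional) into an (R, G, B) tuple."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))
```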