Class anira::SessionElement¶
class SessionElement¶
Core session management class for individual inference instances.
The SessionElement class represents a single inference session, managing all resources and state required for neural network inference processing. Each session is independent and can have different configurations, backends, and processing parameters while sharing the global inference thread pool and context.
Key responsibilities:
Managing input/output ring buffers for continuous audio streaming
Coordinating with backend processors (LibTorch, ONNX, TensorFlow Lite)
Handling latency calculation and compensation
Managing thread-safe data structures for multi-threaded processing
Calculating and optimizing buffer sizes for different host configurations
Managing the session lifecycle and resource cleanup
The session uses ring buffers for efficient audio streaming and maintains multiple thread-safe structures to enable concurrent processing without blocking the audio thread. Latency is automatically calculated based on the model characteristics and host audio configuration.
Note
Each session has a unique ID and maintains its own processing state while participating in the global inference scheduling system.
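A minimal sketch of the lifecycle described above, assuming the anira headers are available (the include path below is an assumption and may differ between versions). Sessions are normally created and prepared by the library's scheduling layer; constructing one manually here serves only to illustrate the documented call order.

    #include <anira/anira.h>  // assumed umbrella header; adjust to your installation

    // Drives one session through the documented lifecycle. The caller owns
    // the pre/post processor and inference configuration, which the session
    // stores by reference.
    void run_session(anira::PrePostProcessor &pp_processor,
                     anira::InferenceConfig &inference_config,
                     const anira::HostConfig &spec) {
        anira::SessionElement session(/*newSessionID=*/1, pp_processor, inference_config);
        session.prepare(spec);   // allocates ring buffers and calculates latency
        // session.m_latency now holds the per-tensor latency in samples.
        session.clear();         // resets state before reconfiguration or shutdown
    }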
Public Functions
SessionElement(int newSessionID, PrePostProcessor &pp_processor, InferenceConfig &inference_config)¶
Constructor that initializes a session with specified components.
Creates a new session element with a unique ID and associates it with the provided preprocessing/postprocessing pipeline and inference configuration. The session is not fully initialized until prepare() is called.
- Parameters:
newSessionID – Unique identifier for this session
pp_processor – Reference to the preprocessing/postprocessing pipeline
inference_config – Reference to the inference configuration containing model settings
void clear()¶
Clears all session data and resets to initial state.
Resets ring buffers, clears inference queues, and reinitializes all session state to prepare for reconfiguration or shutdown.
void prepare(const HostConfig &spec, std::vector<long> custom_latency = {})¶
Prepares the session for processing with specified audio configuration.
Initializes all buffers, calculates latencies, and configures the session for processing with the provided host audio configuration. This method must be called before the session can process audio data.
- Parameters:
spec – Host configuration containing sample rate, buffer size, and audio settings
custom_latency – Optional vector of custom latency values for each tensor (empty for automatic calculation)
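A short sketch of the two documented ways to call prepare(): supplying explicit per-tensor latency values, or letting the session calculate them automatically. The latency numbers below are arbitrary placeholders.

    #include <anira/anira.h>  // assumed umbrella header
    #include <vector>

    // Prepare with explicit per-tensor latency values (one entry per tensor);
    // an empty vector, or omitting the argument, selects automatic calculation.
    void prepare_session(anira::SessionElement &session, const anira::HostConfig &spec) {
        std::vector<long> custom_latency = {256, 256};  // placeholder values
        session.prepare(spec, custom_latency);

        // Automatic latency calculation:
        // session.prepare(spec);
    }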
Template method for setting backend processors.
Assigns a specific backend processor to this session. This template method works with any supported backend type (LibTorch, ONNX, TensorFlow Lite).
- Template Parameters:
T – Backend processor type
- Parameters:
processor – Shared pointer to the backend processor to assign
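Since the method's signature is not shown on this page, the name set_processor below is a hypothetical placeholder; only the shared-pointer parameter and the supported backend types are taken from the description above.

    #include <anira/anira.h>  // assumed umbrella header
    #include <memory>

    // Hypothetical illustration: `set_processor` is a placeholder name, as
    // the actual signature is not listed here. The template accepts a shared
    // pointer to any supported backend processor type.
    void attach_backend(anira::SessionElement &session,
                        std::shared_ptr<anira::LibtorchProcessor> processor) {
        session.set_processor<anira::LibtorchProcessor>(processor);
        // The same template would accept OnnxRuntimeProcessor or TFLiteProcessor.
    }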
size_t calculate_num_structs(const HostConfig &spec) const¶
Calculates the number of thread-safe structures needed (public for testing).
Determines the optimal number of concurrent processing structures based on the host configuration and model requirements. This ensures sufficient parallelism without excessive memory usage.
- Parameters:
spec – Host configuration to calculate requirements for
- Returns:
Number of thread-safe structures needed
std::vector<float> calculate_latency(const HostConfig &host_config)¶
Calculates latency values for all tensors (public for testing).
Computes the processing latency for each tensor based on the model characteristics and host audio configuration. Includes buffer delays, processing time, and synchronization overhead.
- Parameters:
host_config – Host configuration to calculate latency for
- Returns:
Vector of latency values in samples for each tensor
std::vector<size_t> calculate_send_buffer_sizes(const HostConfig &host_config) const¶
Calculates send buffer sizes for all tensors (public for testing).
Determines the optimal buffer sizes for input ring buffers based on the model input requirements and host configuration.
- Parameters:
host_config – Host configuration to calculate buffer sizes for
- Returns:
Vector of buffer sizes for each input tensor
std::vector<size_t> calculate_receive_buffer_sizes(const HostConfig &host_config) const¶
Calculates receive buffer sizes for all tensors (public for testing).
Determines the optimal buffer sizes for output ring buffers based on the model output requirements and host configuration.
- Parameters:
host_config – Host configuration to calculate buffer sizes for
- Returns:
Vector of buffer sizes for each output tensor
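Because all four calculation helpers are public for testing, a test can exercise them together against a host configuration without preparing the session. A hedged sketch using the documented signatures:

    #include <anira/anira.h>  // assumed umbrella header
    #include <cstddef>
    #include <vector>

    // Test-style sketch: query the session's resource requirements for a
    // given host configuration.
    void inspect_requirements(anira::SessionElement &session,
                              const anira::HostConfig &host_config) {
        size_t num_structs = session.calculate_num_structs(host_config);
        std::vector<float> latencies = session.calculate_latency(host_config);
        std::vector<size_t> send_sizes = session.calculate_send_buffer_sizes(host_config);
        std::vector<size_t> receive_sizes = session.calculate_receive_buffer_sizes(host_config);
        // num_structs bounds the number of concurrent in-flight inferences;
        // the vectors hold one entry per input and output tensor, respectively.
    }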
Public Members
std::vector<RingBuffer> m_send_buffer¶
Ring buffers for input data streaming to inference.
std::vector<RingBuffer> m_receive_buffer¶
Ring buffers for output data streaming from inference.
std::vector<std::shared_ptr<ThreadSafeStruct>> m_inference_queue¶
Pool of thread-safe structures for concurrent processing.
std::atomic<InferenceBackend> m_current_backend = {CUSTOM}¶
Currently active inference backend for this session.
unsigned long m_current_queue = 0¶
Current position in the inference queue.
std::vector<unsigned long> m_time_stamps¶
Vector of timestamps for performance monitoring.
const int m_session_id¶
Unique identifier for this session (immutable).
std::atomic<bool> m_initialized = {false}¶
Atomic flag indicating if the session is fully initialized.
std::atomic<int> m_active_inferences = {0}¶
Atomic counter of currently active inference operations.
PrePostProcessor &m_pp_processor¶
Reference to the preprocessing/postprocessing pipeline.
InferenceConfig &m_inference_config¶
Reference to the inference configuration.
BackendBase m_default_processor¶
Default backend processor instance.
BackendBase *m_custom_processor¶
Pointer to custom backend processor (if provided).
bool m_is_non_real_time = false¶
Flag indicating non-real-time processing mode.
std::vector<unsigned int> m_latency¶
Calculated latency values for each tensor in samples.
size_t m_num_structs = 0¶
Number of allocated thread-safe structures (for testing access).
std::vector<size_t> m_send_buffer_size¶
Calculated send buffer sizes (for testing access).
std::vector<size_t> m_receive_buffer_size¶
Calculated receive buffer sizes (for testing access).
std::shared_ptr<LibtorchProcessor> m_libtorch_processor = nullptr¶
Shared pointer to LibTorch backend processor (if available).
std::shared_ptr<OnnxRuntimeProcessor> m_onnx_processor = nullptr¶
Shared pointer to ONNX Runtime backend processor (if available).
std::shared_ptr<TFLiteProcessor> m_tflite_processor = nullptr¶
Shared pointer to TensorFlow Lite backend processor (if available).
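The atomic members can be read from outside the audio thread, for example by a monitoring component. A small hedged sketch:

    #include <anira/anira.h>  // assumed umbrella header
    #include <cstdio>

    // Reads the session's atomic state members from a monitoring thread.
    void log_session_state(const anira::SessionElement &session) {
        bool ready = session.m_initialized.load();
        int in_flight = session.m_active_inferences.load();
        std::printf("session %d: initialized=%d, active inferences=%d\n",
                    session.m_session_id, ready, in_flight);
    }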
struct ThreadSafeStruct¶
Thread-safe data structure for concurrent inference processing.
This nested structure provides thread-safe coordination between the audio thread and inference threads. Each structure can hold one inference request and includes synchronization primitives to ensure safe concurrent access.
The structure uses atomic operations and semaphores to coordinate:
Availability checking (m_free)
Completion notification (m_done_semaphore, m_done_atomic)
Data integrity during concurrent access
Timestamping for latency tracking
Public Functions
ThreadSafeStruct(std::vector<size_t> tensor_input_size, std::vector<size_t> tensor_output_size)¶
Constructor that initializes thread-safe structure with tensor dimensions.
Creates buffers for input and output tensors with the specified sizes and initializes synchronization primitives.
- Parameters:
tensor_input_size – Vector of input tensor sizes
tensor_output_size – Vector of output tensor sizes
Public Members
std::atomic<bool> m_free = {true}¶
Atomic flag indicating if this structure is available for use.
std::binary_semaphore m_done_semaphore = {false}¶
Semaphore for blocking wait on inference completion.
std::atomic<bool> m_done_atomic = {false}¶
Atomic flag for non-blocking completion checking.
unsigned long m_time_stamp¶
Timestamp for latency tracking and debugging.
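To illustrate, a hedged sketch of one plausible way the documented members could coordinate a single request; the library's actual scheduling logic may differ, and the tensor sizes are arbitrary placeholders.

    #include <anira/anira.h>  // assumed umbrella header

    // Plausible coordination sequence using only the documented members.
    void handshake_sketch() {
        using TSS = anira::SessionElement::ThreadSafeStruct;
        TSS slot({2048}, {2048});  // one input and one output tensor

        // Audio thread: claim the slot if it is free, without blocking.
        bool expected = true;
        if (slot.m_free.compare_exchange_strong(expected, false)) {
            slot.m_time_stamp = 0;  // e.g. the current host sample position
            // ... write input data and hand the slot to an inference thread ...
        }

        // Inference thread, once output is ready:
        slot.m_done_atomic.store(true);   // for non-blocking polling
        slot.m_done_semaphore.release();  // wakes a blocking waiter

        // Audio thread: poll m_done_atomic, or block on the semaphore:
        slot.m_done_semaphore.acquire();
        slot.m_free.store(true);          // return the slot to the pool
    }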