Struct anira::InferenceConfig

struct InferenceConfig

Collaboration diagram for anira::InferenceConfig: the struct uses anira::ProcessingSpec, anira::ModelData, and anira::TensorShape.

Complete configuration for neural network inference operations.

The InferenceConfig struct serves as the central configuration hub for all aspects of neural network inference in anira. It combines model data, tensor specifications, processing parameters, and performance settings into a single, cohesive configuration.

Usage Examples:
// Simple mono audio effect
InferenceConfig config(
    {ModelData("model.onnx", ONNX)},
    {TensorShape({{1, 512}}, {{1, 512}})},
    ProcessingSpec({1}, {1}, {512}, {512}),
    10.0f  // 10ms max inference time
);

// Multi-input model with control parameters
InferenceConfig config(
    {ModelData("model.onnx", ONNX)},
    {TensorShape({{1, 512}, {1, 4}}, {{1, 512}})}, // Audio + 4 parameters
    ProcessingSpec({1, 1}, {1}, {512, 0}, {512}),  // Second input non-streamable
    15.0f
);

Public Functions

InferenceConfig() = default

Default constructor creating an empty configuration.

Creates an uninitialized InferenceConfig that must be properly configured before use. All member variables are default-initialized.

InferenceConfig(std::vector<ModelData> model_data, std::vector<TensorShape> tensor_shape, ProcessingSpec processing_spec, float max_inference_time, unsigned int warm_up = Defaults::m_warm_up, bool session_exclusive_processor = Defaults::m_session_exclusive_processor, float blocking_ratio = Defaults::m_blocking_ratio, unsigned int num_parallel_processors = Defaults::m_num_parallel_processors)

Constructs a complete InferenceConfig with processing specification.

Creates a fully configured InferenceConfig suitable for immediate use. This constructor includes explicit processing specifications for fine-grained control over the inference pipeline.

Parameters:
  • model_data – Vector of model data for different backends

  • tensor_shape – Vector of tensor shape configurations

  • processing_spec – Processing specification defining channels, sizes, and latencies

  • max_inference_time – Maximum allowed inference time in milliseconds per inference

  • warm_up – Number of warm-up inferences to perform during initialization

  • session_exclusive_processor – Whether to use exclusive processor sessions

  • blocking_ratio – Ratio controlling blocking behavior (0.0-1.0)

  • num_parallel_processors – Number of parallel inference processors to use

inline InferenceConfig(std::vector<ModelData> model_data, std::vector<TensorShape> tensor_shape, float max_inference_time, unsigned int warm_up = Defaults::m_warm_up, bool session_exclusive_processor = Defaults::m_session_exclusive_processor, float blocking_ratio = Defaults::m_blocking_ratio, unsigned int num_parallel_processors = Defaults::m_num_parallel_processors)

Constructs a simplified InferenceConfig with automatic processing specification.

Creates an InferenceConfig with automatic processing specification generation. The ProcessingSpec will be computed automatically from the provided tensor shapes, using default values for buffer sizes and channel counts.

Parameters:
  • model_data – Vector of model data for different backends

  • tensor_shape – Vector of tensor shape configurations

  • max_inference_time – Maximum allowed inference time in milliseconds per inference

  • warm_up – Number of warm-up inferences to perform during initialization

  • session_exclusive_processor – Whether to use exclusive processor sessions

  • blocking_ratio – Ratio controlling blocking behavior (0.0-1.0)

  • num_parallel_processors – Number of parallel inference processors to use
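
For comparison with the examples above, a minimal sketch of this simplified constructor with the optional parameters spelled out (file name and values are placeholders; the ProcessingSpec is derived automatically from the tensor shapes):

// Simplified constructor: no explicit ProcessingSpec
InferenceConfig config(
    {ModelData("model.onnx", ONNX)},
    {TensorShape({{1, 512}}, {{1, 512}})},
    10.0f,  // max_inference_time in ms
    5,      // warm_up: five inferences during initialization
    false,  // session_exclusive_processor
    0.0f,   // blocking_ratio
    2       // num_parallel_processors
);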

std::string get_model_path(InferenceBackend backend)

Gets the model file path for a specific backend.

Parameters:

backend – The target inference backend

Returns:

Model file path as string

const ModelData *get_model_data(InferenceBackend backend) const

Gets the model data structure for a specific backend.

Parameters:

backend – The target inference backend

Returns:

Pointer to ModelData, or nullptr if not found

std::string get_model_function(InferenceBackend backend) const

Gets the model function name for a specific backend.

Parameters:

backend – The target inference backend

Returns:

Model function name (LibTorch specific)

bool is_model_binary(InferenceBackend backend) const

Checks if the model data is binary for a specific backend.

Parameters:

backend – The target inference backend

Returns:

true if model data is binary, false if it’s a file path
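
A sketch of querying the model data accessors above, assuming a configured instance named config as in the usage examples (the ONNX backend is used for illustration):

// get_model_data() returns nullptr when no model is registered for the backend.
if (const ModelData* data = config.get_model_data(ONNX)) {
    if (!config.is_model_binary(ONNX)) {
        std::string path = config.get_model_path(ONNX);
        // load or inspect the model file at 'path' ...
    }
}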

TensorShapeList get_tensor_input_shape() const

Gets universal input tensor shapes.

Returns:

List of input tensor shapes (universal across backends)

TensorShapeList get_tensor_output_shape() const

Gets universal output tensor shapes.

Returns:

List of output tensor shapes (universal across backends)

TensorShapeList get_tensor_input_shape(InferenceBackend backend) const

Gets input tensor shapes for a specific backend.

Parameters:

backend – The target inference backend

Returns:

List of input tensor shapes for the specified backend

TensorShapeList get_tensor_output_shape(InferenceBackend backend) const

Gets output tensor shapes for a specific backend.

Parameters:

backend – The target inference backend

Returns:

List of output tensor shapes for the specified backend

std::vector<size_t> get_tensor_input_size() const

Gets total size (element count) for each input tensor.

Returns:

Vector of input tensor sizes

std::vector<size_t> get_tensor_output_size() const

Gets total size (element count) for each output tensor.

Returns:

Vector of output tensor sizes
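
These element counts can be used to size scratch buffers, as in this sketch (assuming a configured instance named config):

// One float buffer per tensor, sized from the reported element counts.
std::vector<std::vector<float>> inputs, outputs;
for (size_t n : config.get_tensor_input_size())
    inputs.emplace_back(n, 0.0f);
for (size_t n : config.get_tensor_output_size())
    outputs.emplace_back(n, 0.0f);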

std::vector<size_t> get_preprocess_input_channels() const

Gets number of input channels for each input tensor.

Returns:

Vector of input channel counts

std::vector<size_t> get_postprocess_output_channels() const

Gets number of output channels for each output tensor.

Returns:

Vector of output channel counts

std::vector<size_t> get_preprocess_input_size() const

Gets the number of samples required for preprocessing each input tensor.

Returns:

Vector of input buffer sizes (0 = non-streamable)

std::vector<size_t> get_postprocess_output_size() const

Gets the number of samples produced after postprocessing for each output tensor.

Returns:

Vector of output buffer sizes (0 = non-streamable)
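
The 0 = non-streamable convention can be used to separate streamed audio inputs from control inputs, as in this sketch (assuming a configured instance named config):

// Inputs reporting a preprocess size of 0 are non-streamable (e.g. control parameters).
std::vector<size_t> in_sizes = config.get_preprocess_input_size();
for (size_t i = 0; i < in_sizes.size(); ++i) {
    bool streamable = (in_sizes[i] != 0);
    // route audio to streamable inputs, parameter values to non-streamable ones ...
}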

std::vector<size_t> get_internal_model_latency() const

Gets internal model latency for each output tensor.

Returns:

Vector of latency values in samples

void set_tensor_input_shape(const TensorShapeList &input_shape)

Sets universal input tensor shapes.

Parameters:

input_shape – New input tensor shapes for all backends

void set_tensor_output_shape(const TensorShapeList &output_shape)

Sets universal output tensor shapes.

Parameters:

output_shape – New output tensor shapes for all backends

void set_tensor_input_shape(const TensorShapeList &input_shape, InferenceBackend backend)

Sets input tensor shapes for a specific backend.

Parameters:
  • input_shape – New input tensor shapes

  • backend – Target inference backend

void set_tensor_output_shape(const TensorShapeList &output_shape, InferenceBackend backend)

Sets output tensor shapes for a specific backend.

Parameters:
  • output_shape – New output tensor shapes

  • backend – Target inference backend
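
A hypothetical override giving one backend its own shapes while the others keep the universal ones (the LIBTORCH backend and the three-dimensional shapes are placeholders; TensorShapeList is assumed to be brace-initializable, as in the constructor examples above):

// Backends without an override fall back to the universal shapes.
TensorShapeList torch_inputs  = {{1, 1, 512}};
TensorShapeList torch_outputs = {{1, 1, 512}};
config.set_tensor_input_shape(torch_inputs, LIBTORCH);
config.set_tensor_output_shape(torch_outputs, LIBTORCH);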

void set_preprocess_input_channels(const std::vector<size_t> &input_channels)

Sets input channel counts for preprocessing.

Parameters:

input_channels – New input channel counts

void set_preprocess_output_channels(const std::vector<size_t> &output_channels)

Sets output channel counts for postprocessing.

Parameters:

output_channels – New output channel counts

void set_preprocess_input_size(const std::vector<size_t> &preprocess_input_size)

Sets the number of samples required for preprocessing each input tensor.

Parameters:

preprocess_input_size – New samples count required for preprocessing (0 = non-streamable)

void set_postprocess_output_size(const std::vector<size_t> &postprocess_output_size)

Sets the number of samples produced after postprocessing each output tensor.

Parameters:

postprocess_output_size – New samples count after postprocessing (0 = non-streamable)

void set_internal_model_latency(const std::vector<size_t> &internal_model_latency)

Sets internal model latency values.

Parameters:

internal_model_latency – New latency values in samples

void set_model_path(const std::string &model_path, InferenceBackend backend)

Sets model path for a specific backend.

Parameters:
  • model_path – New model file path

  • backend – Target inference backend
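
A minimal sketch (the path is a placeholder):

// Point the ONNX backend at a different model file, e.g. one unpacked at install time.
config.set_model_path("/path/to/updated_model.onnx", ONNX);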

inline bool operator==(const InferenceConfig &other) const

Equality comparison operator.

Compares all configuration parameters using appropriate epsilon values for floating-point comparisons.

Parameters:

other – The InferenceConfig instance to compare with

Returns:

true if all parameters are equal within tolerance, false otherwise

inline bool operator!=(const InferenceConfig &other) const

Inequality comparison operator.

Parameters:

other – The InferenceConfig instance to compare with

Returns:

true if any parameters are not equal, false otherwise

const TensorShape &get_tensor_shape(InferenceBackend backend) const

Gets tensor shape configuration for a specific backend.

Retrieves the TensorShape object that matches the specified backend, or falls back to universal shapes if no backend-specific shape is found.

Parameters:

backend – The target inference backend

Throws:

std::runtime_error – if no matching shape configuration is found

Returns:

Reference to the matching TensorShape configuration
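
Since this accessor throws when no configuration matches, a defensive sketch (assuming a configured instance named config):

try {
    const TensorShape& shape = config.get_tensor_shape(ONNX);  // backend-specific or universal
    // inspect shape ...
} catch (const std::runtime_error&) {
    // no tensor shape configuration available for this backend
}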

void clear_processing_spec()

Clears all processing specification parameters.

Resets all processing specification vectors to empty state. This is useful before reconfiguring the processing pipeline.

void update_processing_spec()

Updates and validates the processing specification.

Automatically computes tensor sizes from shapes, validates consistency between tensor configurations and processing parameters, and fills in default values where needed. This method should be called after any changes to tensor shapes or processing parameters.

Note

This method is automatically called by InferenceConfig constructors

Throws:
  • std::invalid_argument – if tensor shapes and processing parameters are inconsistent

  • std::invalid_argument – if tensor dimensions are invalid (non-positive)
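
A sketch of a reconfiguration cycle using the setters documented above (channel and sample counts are placeholders):

config.clear_processing_spec();
config.set_preprocess_input_channels({2});   // stereo audio input
config.set_preprocess_input_size({512});     // 512 samples gathered per inference
config.set_postprocess_output_size({512});   // 512 samples returned per inference
config.update_processing_spec();             // recomputes derived values; throws
                                             // std::invalid_argument on inconsistency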

Public Members

std::vector<ModelData> m_model_data

Model data for different inference backends.

std::vector<TensorShape> m_tensor_shape

Tensor shape configurations for inputs and outputs.

ProcessingSpec m_processing_spec

Processing specification for preprocessing and postprocessing.

float m_max_inference_time

Maximum allowed inference time in milliseconds.

unsigned int m_warm_up

Number of warm-up inferences to perform.

bool m_session_exclusive_processor

Whether to use exclusive processor sessions.

float m_blocking_ratio

Blocking ratio for real-time control (0.0-1.0)

unsigned int m_num_parallel_processors

Number of parallel inference processors.

struct Defaults

Default values for inference configuration parameters.

This nested struct provides sensible default values for optional InferenceConfig parameters, ensuring consistent behavior across different usage scenarios.

Public Static Attributes

static constexpr unsigned int m_warm_up = 0

Default number of warm-up inferences (0 = no warm-up)

static constexpr bool m_session_exclusive_processor = false

Default session exclusivity (false = shared processors)

static constexpr float m_blocking_ratio = 0.f

Default blocking ratio (0.0 = non-blocking)

static unsigned int m_num_parallel_processors = (std::thread::hardware_concurrency() / 2 > 0) ? std::thread::hardware_concurrency() / 2 : 1

Default number of parallel processors (half of available hardware threads, minimum 1)
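
A short sketch reading the defaults at runtime (Defaults is documented as a nested struct, so it is qualified through InferenceConfig here; that qualification is an assumption):

unsigned int workers   = anira::InferenceConfig::Defaults::m_num_parallel_processors;
bool         exclusive = anira::InferenceConfig::Defaults::m_session_exclusive_processor;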