Class anira::InferenceHandler

class InferenceHandler

Collaboration diagram for anira::InferenceHandler:

digraph {
    graph [bgcolor="#00000000"]
    node [shape=rectangle style=filled fillcolor="#FFFFFF" font=Helvetica padding=2]
    edge [color="#1414CE"]
    "16" [label="anira::Buffer< float >" tooltip="anira::Buffer< float >"]
    "17" [label="anira::MemoryBlock< float >" tooltip="anira::MemoryBlock< float >"]
    "8" [label="anira::MemoryBlock< std::atomic< float > >" tooltip="anira::MemoryBlock< std::atomic< float > >"]
    "13" [label="anira::BackendBase" tooltip="anira::BackendBase"]
    "10" [label="anira::Context" tooltip="anira::Context"]
    "11" [label="anira::ContextConfig" tooltip="anira::ContextConfig"]
    "25" [label="anira::HighPriorityThread" tooltip="anira::HighPriorityThread"]
    "9" [label="anira::HostConfig" tooltip="anira::HostConfig"]
    "2" [label="anira::InferenceConfig" tooltip="anira::InferenceConfig"]
    "26" [label="anira::InferenceData" tooltip="anira::InferenceData"]
    "1" [label="anira::InferenceHandler" tooltip="anira::InferenceHandler" fillcolor="#BFBFBF"]
    "6" [label="anira::InferenceManager" tooltip="anira::InferenceManager"]
    "24" [label="anira::InferenceThread" tooltip="anira::InferenceThread"]
    "18" [label="anira::LibtorchProcessor" tooltip="anira::LibtorchProcessor"]
    "19" [label="anira::LibtorchProcessor::Instance" tooltip="anira::LibtorchProcessor::Instance"]
    "4" [label="anira::ModelData" tooltip="anira::ModelData"]
    "20" [label="anira::OnnxRuntimeProcessor" tooltip="anira::OnnxRuntimeProcessor"]
    "21" [label="anira::OnnxRuntimeProcessor::Instance" tooltip="anira::OnnxRuntimeProcessor::Instance"]
    "7" [label="anira::PrePostProcessor" tooltip="anira::PrePostProcessor"]
    "3" [label="anira::ProcessingSpec" tooltip="anira::ProcessingSpec"]
    "15" [label="anira::RingBuffer" tooltip="anira::RingBuffer"]
    "12" [label="anira::SessionElement" tooltip="anira::SessionElement"]
    "14" [label="anira::SessionElement::ThreadSafeStruct" tooltip="anira::SessionElement::ThreadSafeStruct"]
    "22" [label="anira::TFLiteProcessor" tooltip="anira::TFLiteProcessor"]
    "23" [label="anira::TFLiteProcessor::Instance" tooltip="anira::TFLiteProcessor::Instance"]
    "5" [label="anira::TensorShape" tooltip="anira::TensorShape"]
    "16" -> "17" [dir=forward tooltip="usage"]
    "13" -> "2" [dir=forward tooltip="usage"]
    "10" -> "11" [dir=forward tooltip="usage"]
    "10" -> "12" [dir=forward tooltip="usage"]
    "10" -> "24" [dir=forward tooltip="usage"]
    "10" -> "18" [dir=forward tooltip="usage"]
    "10" -> "20" [dir=forward tooltip="usage"]
    "10" -> "22" [dir=forward tooltip="usage"]
    "10" -> "26" [dir=forward tooltip="usage"]
    "2" -> "3" [dir=forward tooltip="usage"]
    "2" -> "4" [dir=forward tooltip="usage"]
    "2" -> "5" [dir=forward tooltip="usage"]
    "1" -> "2" [dir=forward tooltip="usage"]
    "1" -> "6" [dir=forward tooltip="usage"]
    "6" -> "2" [dir=forward tooltip="usage"]
    "6" -> "7" [dir=forward tooltip="usage"]
    "6" -> "9" [dir=forward tooltip="usage"]
    "6" -> "10" [dir=forward tooltip="usage"]
    "6" -> "12" [dir=forward tooltip="usage"]
    "24" -> "25" [dir=forward tooltip="public-inheritance"]
    "24" -> "26" [dir=forward tooltip="usage"]
    "18" -> "13" [dir=forward tooltip="public-inheritance"]
    "18" -> "19" [dir=forward tooltip="usage"]
    "19" -> "2" [dir=forward tooltip="usage"]
    "19" -> "17" [dir=forward tooltip="usage"]
    "20" -> "13" [dir=forward tooltip="public-inheritance"]
    "20" -> "21" [dir=forward tooltip="usage"]
    "21" -> "2" [dir=forward tooltip="usage"]
    "21" -> "17" [dir=forward tooltip="usage"]
    "7" -> "2" [dir=forward tooltip="usage"]
    "7" -> "8" [dir=forward tooltip="usage"]
    "15" -> "16" [dir=forward tooltip="public-inheritance"]
    "12" -> "7" [dir=forward tooltip="usage"]
    "12" -> "2" [dir=forward tooltip="usage"]
    "12" -> "13" [dir=forward tooltip="usage"]
    "12" -> "9" [dir=forward tooltip="usage"]
    "12" -> "14" [dir=forward tooltip="usage"]
    "12" -> "15" [dir=forward tooltip="usage"]
    "12" -> "18" [dir=forward tooltip="usage"]
    "12" -> "20" [dir=forward tooltip="usage"]
    "12" -> "22" [dir=forward tooltip="usage"]
    "22" -> "13" [dir=forward tooltip="public-inheritance"]
    "22" -> "23" [dir=forward tooltip="usage"]
    "23" -> "2" [dir=forward tooltip="usage"]
    "23" -> "17" [dir=forward tooltip="usage"]
}

Main handler class for neural network inference operations.

The InferenceHandler provides a high-level interface for performing neural network inference in real-time audio processing contexts. It manages the inference backend, data buffering, and processing pipeline while ensuring real-time safety.

This class supports multiple processing modes:

  • Single tensor processing for simple models

  • Multi-tensor processing for complex models with multiple inputs/outputs

  • Push/pop data patterns for decoupled processing

Note

This class is designed for real-time audio processing: its internal buffers are set up during construction and prepare(), so the processing methods can run without allocating memory on the audio thread and thereby avoid audio dropouts.
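
A minimal usage sketch is shown below. It assumes the umbrella header <anira/anira.h>, a backend enumerator named anira::InferenceBackend::ONNX, and an already-configured PrePostProcessor, InferenceConfig, and HostConfig (their setup is covered in their own class documentation):

    #include <anira/anira.h>  // assumed umbrella header of the anira library

    // Sketch: typical lifecycle of an InferenceHandler. The pre/post processor,
    // inference configuration and host configuration are set up elsewhere.
    void run_inference(anira::PrePostProcessor& pp_processor,
                       anira::InferenceConfig& inference_config,
                       anira::HostConfig& host_config,
                       float* const* audio,        // [channel][sample]
                       size_t num_samples)
    {
        anira::InferenceHandler handler(pp_processor, inference_config);

        // Must be called before processing starts or when audio settings change.
        handler.prepare(host_config);

        // Select a backend; the enumerator name is an assumption, use the
        // values provided by your anira build.
        handler.set_inference_backend(anira::InferenceBackend::ONNX);

        // In-place processing for a model whose input and output shapes match.
        size_t processed = handler.process(audio, num_samples);
        (void) processed;  // number of samples actually processed
    }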

Public Functions

InferenceHandler() = delete

The default constructor is deleted to prevent uninitialized instances.

InferenceHandler(PrePostProcessor &pp_processor, InferenceConfig &inference_config, const ContextConfig &context_config = ContextConfig())

Constructs an InferenceHandler with pre/post processor and inference configuration.

Parameters:
  • pp_processor – Reference to the pre/post processor for data transformation

  • inference_config – Reference to the inference configuration containing model settings

  • context_config – Optional context configuration for advanced settings (default: ContextConfig())

InferenceHandler(PrePostProcessor &pp_processor, InferenceConfig &inference_config, BackendBase &custom_processor, const ContextConfig &context_config = ContextConfig())

Constructs an InferenceHandler with custom backend processor.

Parameters:
  • pp_processor – Reference to the pre/post processor for data transformation

  • inference_config – Reference to the inference configuration containing model settings

  • custom_processor – Reference to a custom backend processor implementation

  • context_config – Optional context configuration for advanced settings (default: ContextConfig())
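
A brief sketch of the custom-backend constructor; custom_backend is assumed to be a user-defined subclass of anira::BackendBase created elsewhere, and the CUSTOM enumerator name is an assumption:

    // Sketch: constructing an InferenceHandler around a user-supplied backend.
    void use_custom_backend(anira::PrePostProcessor& pp_processor,
                            anira::InferenceConfig& inference_config,
                            anira::BackendBase& custom_backend)
    {
        anira::InferenceHandler handler(pp_processor, inference_config, custom_backend);

        // Route inference through the custom processor (enumerator name assumed).
        handler.set_inference_backend(anira::InferenceBackend::CUSTOM);
    }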

~InferenceHandler()

Destructor that properly cleans up inference resources.

void set_inference_backend(InferenceBackend inference_backend)

Sets the inference backend to use for neural network processing.

Parameters:

inference_backend – The backend type to use (e.g., ONNX Runtime, LibTorch, TensorFlow Lite, or a custom backend)

InferenceBackend get_inference_backend()

Gets the currently active inference backend.

Returns:

The currently configured inference backend type
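
A short sketch of switching backends at runtime; the enumerator names are assumptions:

    // Sketch: switch to the ONNX Runtime backend if it is not already active.
    void ensure_onnx_backend(anira::InferenceHandler& handler)
    {
        if (handler.get_inference_backend() != anira::InferenceBackend::ONNX)
            handler.set_inference_backend(anira::InferenceBackend::ONNX);
    }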

void prepare(HostConfig new_audio_config)

Prepares the inference handler for processing with new audio configuration.

This method must be called before processing begins or when audio settings change. It initializes internal buffers and prepares the inference pipeline.

Parameters:

new_audio_config – The new audio configuration containing sample rate, buffer size, etc.
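
A minimal sketch of re-preparing the handler when the host changes its audio settings; the HostConfig is assumed to carry the host's current sample rate and buffer size:

    // Sketch: call prepare() whenever the host's audio settings change and
    // before processing resumes; it re-initializes the internal buffers.
    void on_audio_settings_changed(anira::InferenceHandler& handler,
                                   const anira::HostConfig& host_config)
    {
        handler.prepare(host_config);
    }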

void prepare(HostConfig new_audio_config, unsigned int custom_latency, size_t tensor_index = 0)

Prepares the inference handler for processing with new audio configuration and a custom latency.

This method must be called before processing begins or when audio settings change. It initializes internal buffers and prepares the inference pipeline.

Parameters:
  • new_audio_config – The new audio configuration containing sample rate, buffer size, etc.

  • custom_latency – Custom latency value in samples to override the calculated latency

  • tensor_index – Index of the tensor to which the custom latency applies (default: 0)

void prepare(HostConfig new_audio_config, std::vector<unsigned int> custom_latency)

Prepares the inference handler for processing with new audio configuration and custom latencies for each tensor.

This method must be called before processing begins or when audio settings change. It initializes internal buffers and prepares the inference pipeline.

Parameters:
  • new_audio_config – The new audio configuration containing sample rate, buffer size, etc.

  • custom_latency – Vector of custom latency values in samples for each tensor
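
A brief sketch of both custom-latency overloads; the latency values (1024 and 0 samples) and the two-tensor layout are illustrative:

    #include <vector>

    // Sketch: override the calculated latency that the handler reports.
    void prepare_with_custom_latency(anira::InferenceHandler& handler,
                                     const anira::HostConfig& host_config)
    {
        // Override the latency of tensor 0 only...
        handler.prepare(host_config, 1024u);

        // ...or provide one latency value per tensor index.
        handler.prepare(host_config, std::vector<unsigned int>{1024u, 0u});
    }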

size_t process(float *const *data, size_t num_samples, size_t tensor_index = 0)

Processes audio data in-place for models with identical input/output shapes.

This is the simplest processing method for the case where input and output have the same data shape and only one tensor index is streamable (e.g., audio effects with non-streamable parameters).

Note

This method is real-time safe and should not allocate memory

Parameters:
  • data – Audio data buffer organized as data[channel][sample]

  • num_samples – Number of samples to process

  • tensor_index – Index of the tensor to process (default: 0)

Returns:

Number of samples actually processed
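
A minimal sketch of the in-place overload, called from a host audio callback (the callback signature is illustrative, not part of anira):

    // Sketch: in-place processing of tensor 0 inside an audio callback.
    void audio_callback(anira::InferenceHandler& handler,
                        float* const* channel_data,   // [channel][sample]
                        size_t num_samples)
    {
        size_t processed = handler.process(channel_data, num_samples);
        (void) processed;  // number of samples actually processed
    }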

size_t process(const float *const *input_data, size_t num_input_samples, float *const *output_data, size_t num_output_samples, size_t tensor_index = 0)

Processes audio data with separate input and output buffers.

This method allows for different input and output buffer sizes and is suitable for models that have different input and output shapes.

Note

This method is real-time safe and should not allocate memory

Parameters:
  • input_data – Input audio data organized as data[channel][sample]

  • num_input_samples – Number of input samples

  • output_data – Output audio data buffer organized as data[channel][sample]

  • num_output_samples – Maximum number of output samples the buffer can hold

  • tensor_index – Index of the tensor to process (default: 0)

Returns:

Number of output samples actually written
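
A minimal sketch of the split-buffer overload, e.g. for a model whose output shape differs from its input shape:

    // Sketch: separate input and output buffers for tensor 0.
    void process_separate_buffers(anira::InferenceHandler& handler,
                                  const float* const* input, size_t num_input_samples,
                                  float* const* output, size_t max_output_samples)
    {
        size_t written = handler.process(input, num_input_samples,
                                         output, max_output_samples);
        (void) written;  // number of output samples actually written
    }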

size_t *process(const float *const *const *input_data, size_t *num_input_samples, float *const *const *output_data, size_t *num_output_samples)

Processes multiple tensors simultaneously.

This method handles complex models with multiple input and output tensors, processing all tensors in a single call.

Note

This method is real-time safe and should not allocate memory

Parameters:
  • input_data – Input data organized as data[tensor_index][channel][sample]

  • num_input_samples – Array of input sample counts for each tensor

  • output_data – Output data buffers organized as data[tensor_index][channel][sample]

  • num_output_samples – Array of maximum output sample counts for each tensor

Returns:

Array of actual output sample counts for each tensor
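
A minimal sketch of the multi-tensor overload for a hypothetical model with two streamable tensors; the channel layouts and sample counts are illustrative:

    // Sketch: process two tensors in a single call.
    void process_two_tensors(anira::InferenceHandler& handler,
                             float* const* in0,  float* const* in1,   // [channel][sample]
                             float* const* out0, float* const* out1,
                             size_t n_in0, size_t n_in1,
                             size_t n_out0, size_t n_out1)
    {
        const float* const* const inputs[]  = {in0, in1};   // [tensor][channel][sample]
        float* const* const       outputs[] = {out0, out1};
        size_t num_in[]  = {n_in0, n_in1};
        size_t num_out[] = {n_out0, n_out1};

        size_t* written = handler.process(inputs, num_in, outputs, num_out);
        (void) written;  // written[i]: output samples produced for tensor i
    }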

void push_data(const float *const *input_data, size_t num_input_samples, size_t tensor_index = 0)

Pushes input data to the processing pipeline for a specific tensor.

This method enables decoupled input/output processing where data can be pushed and popped independently. Useful for buffered processing scenarios.

Note

This method is real-time safe and should not allocate memory

Parameters:
  • input_data – Input audio data organized as data[channel][sample]

  • num_input_samples – Number of input samples to push

  • tensor_index – Index of the tensor to receive the data (default: 0)

void push_data(const float *const *const *input_data, size_t *num_input_samples)

Pushes input data for multiple tensors simultaneously.

Note

This method is real-time safe and should not allocate memory

Parameters:
  • input_data – Input data organized as data[tensor_index][channel][sample]

  • num_input_samples – Array of input sample counts for each tensor

size_t pop_data(float *const *output_data, size_t num_output_samples, size_t tensor_index = 0)

Pops processed output data from the pipeline for a specific tensor.

This method retrieves processed data from the inference pipeline. Should be used in conjunction with push_data for decoupled processing.

Note

This method is real-time safe and should not allocate memory

Parameters:
  • output_data – Output buffer organized as data[channel][sample]

  • num_output_samples – Maximum number of samples the output buffer can hold

  • tensor_index – Index of the tensor to retrieve data from (default: 0)

Returns:

Number of samples actually written to the output buffer
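
A minimal sketch of the decoupled push/pop pattern for tensor 0:

    // Sketch: push new input, then pop whatever processed output is available.
    void push_then_pop(anira::InferenceHandler& handler,
                       const float* const* input, size_t num_input_samples,
                       float* const* output, size_t max_output_samples)
    {
        handler.push_data(input, num_input_samples);

        // Optional: inspect how many samples have been received for tensor 0,
        // channel 0 (monitoring/debugging only).
        size_t received = handler.get_available_samples(0);
        (void) received;

        size_t written = handler.pop_data(output, max_output_samples);
        (void) written;  // samples actually written to the output buffer
    }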

size_t *pop_data(float *const *const *output_data, size_t *num_output_samples)

Pops processed output data for multiple tensors simultaneously.

Note

This method is real-time safe and should not allocate memory

Parameters:
  • output_data – Output buffers organized as data[tensor_index][channel][sample]

  • num_output_samples – Array of maximum output sample counts for each tensor

Returns:

Array of actual output sample counts for each tensor

unsigned int get_latency(size_t tensor_index = 0) const

Gets the processing latency for a specific tensor.

Returns the latency introduced by the inference processing in samples for a specific tensor. This includes buffering delays and model-specific processing latency.

Parameters:

tensor_index – Index of the tensor to query (default: 0)

Returns:

Latency in samples for the specified tensor

std::vector<unsigned int> get_latency_vector() const

Gets the processing latency for all tensors.

Returns:

Vector containing latency values in samples for each tensor index
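
A minimal sketch of querying and reporting the handler's latency (the host-side call mentioned in the comment is illustrative):

    // Sketch: query the latency for tensor 0 and for all tensors.
    void report_latency(const anira::InferenceHandler& handler)
    {
        unsigned int latency_samples = handler.get_latency();   // tensor 0
        const std::vector<unsigned int> per_tensor = handler.get_latency_vector();

        // Forward latency_samples to the host, e.g. setLatencySamples() in a
        // JUCE audio plugin.
        (void) latency_samples;
        (void) per_tensor;
    }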

size_t get_available_samples(size_t tensor_index, size_t channel = 0) const

Gets the number of samples received for a specific tensor and channel.

This method is useful for monitoring data flow, benchmarking, and debugging.

Parameters:
  • tensor_index – Index of the tensor to query

  • channel – Channel index to query (default: 0)

Returns:

Number of samples received for the specified tensor and channel

void set_non_realtime(bool is_non_realtime)

Configures the handler for non-real-time operation.

When set to true, the handler relaxes real-time constraints and may use different memory allocation strategies or processing algorithms optimized for offline processing.

Parameters:

is_non_realtime – True to enable non-real-time mode, false for real-time mode

void reset()

Resets the inference handler to its initial state.

This method clears all internal buffers, resets the inference pipeline, and prepares the handler for a new processing session. This also resets the latency and available samples for all tensors.

Note

This method waits for all ongoing inferences to complete before resetting.
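
A brief sketch combining set_non_realtime() and reset() for an offline bounce; the workflow shown (relax constraints, render, reset, re-enable real-time mode) is an assumption about typical usage rather than a prescribed sequence:

    // Sketch: offline rendering followed by a reset before live playback resumes.
    void offline_bounce(anira::InferenceHandler& handler,
                        float* const* audio, size_t num_samples)
    {
        handler.set_non_realtime(true);    // relax real-time constraints
        handler.process(audio, num_samples);
        handler.reset();                   // clears buffers, latency and sample counters
        handler.set_non_realtime(false);   // back to real-time mode
    }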