Getting Started

This guide will help you get started with anira for neural network inference in your audio applications.

Prerequisites

Before using anira, ensure you have:

  • A C++ compiler with C++17 support

  • CMake (version 3.14 or higher)

  • One of the supported neural network model formats:
    • ONNX model files (.onnx)

    • PyTorch model files (.pt, .pth or .ts)

    • TensorFlow Lite model files (.tflite)

Installation

Anira integrates easily into your CMake project. You can add anira as a Git submodule (Option 1), download pre-built binaries from the releases page (Option 2), or build from source (Option 3).
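
Option 1: Add as a Git Submodule

Add anira as a submodule and build it as part of your project. The following is a minimal sketch, assuming the submodule is checked out at modules/anira (adjust the path and URL to your layout):

git submodule add https://github.com/anira-project/anira.git modules/anira

In your CMakeLists.txt:

# Setup your project and target
project(your_project)
add_executable(your_target main.cpp ...)

# Add the anira submodule and link your target to the anira library
add_subdirectory(modules/anira)
target_link_libraries(your_target anira::anira)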

Option 2: Use Pre-built Binaries

Download pre-built binaries from the releases page.

In your CMakeLists.txt:

# Setup your project and target
project(your_project)
add_executable(your_target main.cpp ...)

# Add the path to the anira library as cmake prefix path and find the package
list(APPEND CMAKE_PREFIX_PATH "path/to/anira")
find_package(anira REQUIRED)

# Link your target to the anira library
target_link_libraries(your_target anira::anira)

Option 3: Build from Source

git clone https://github.com/anira-project/anira.git
cd anira
cmake . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release --target anira
cmake --install build --prefix /path/to/install/directory

Build Options

By default, all three inference engines are installed. You can disable specific backends as needed:

  • LibTorch: -DANIRA_WITH_LIBTORCH=OFF

  • OnnxRuntime: -DANIRA_WITH_ONNXRUNTIME=OFF

  • TensorFlow Lite: -DANIRA_WITH_TFLITE=OFF

Moreover, the following options are available (an example configure command follows the list):

  • Build anira with benchmark capabilities: -DANIRA_WITH_BENCHMARK=ON

  • Build example applications and plugins, and populate the example neural models: -DANIRA_WITH_EXAMPLES=ON

  • Build anira with tests: -DANIRA_WITH_TESTS=ON

  • Build anira with documentation: -DANIRA_WITH_DOCS=ON

  • Disable the logging system: -DANIRA_WITH_LOGGING=OFF
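
For example, to configure a release build without the TensorFlow Lite backend and with benchmarking enabled (an illustrative combination of the flags above):

cmake . -B build -DCMAKE_BUILD_TYPE=Release -DANIRA_WITH_TFLITE=OFF -DANIRA_WITH_BENCHMARK=ON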

Basic Usage Example

The basic usage of anira is as follows:

#include <anira/anira.h>

anira::InferenceConfig inference_config(
        {{"path/to/your/model.onnx", anira::InferenceBackend::ONNX}}, // Model path and backend
        {{{256, 1, 1}}, {{256, 1}}},  // Input and output tensor shapes
        5.33f // Maximum inference time in ms
);

// Create a pre- and post-processor instance
anira::PrePostProcessor pp_processor(inference_config);

// Create an InferenceHandler instance
anira::InferenceHandler inference_handler(pp_processor, inference_config);

// Pass the host configuration and allocate memory for audio processing
inference_handler.prepare({buffer_size, sample_rate});

// Select the inference backend
inference_handler.set_inference_backend(anira::InferenceBackend::ONNX);

// Optionally get the latency of the inference process in samples
unsigned int latency_in_samples = inference_handler.get_latency();

// Real-time safe audio processing in the process callback of your application
void process(float** audio_data, int num_samples) {
    inference_handler.process(audio_data, num_samples);
    // audio_data now contains the processed audio samples
}
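
In the snippet above, buffer_size and sample_rate are supplied by your host application. As a small illustration with hypothetical values (512-sample buffers at 48 kHz), the prepare call could read:

// Hypothetical host settings: 512-sample buffers at 48 kHz
inference_handler.prepare({512, 48000});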

Using Different Backends

Anira supports multiple backends that can be selected at runtime. Use the anira::InferenceHandler::set_inference_backend() method to switch between them:

// Set the inference backend to ONNX
inference_handler.set_inference_backend(anira::InferenceBackend::ONNX);

// Set the inference backend to PyTorch
inference_handler.set_inference_backend(anira::InferenceBackend::PYTORCH);

// Set the inference backend to TensorFlow Lite
inference_handler.set_inference_backend(anira::InferenceBackend::TFLITE);

// You can also provide and select a custom backend if needed
inference_handler.set_inference_backend(anira::InferenceBackend::CUSTOM);

Multi-Tensor Processing Example

Some neural network models require multiple input tensors or produce multiple output tensors. For example, a model might need both audio data and control parameters as inputs, or output both processed audio and confidence scores. Anira provides flexible methods to handle such models through its multi-tensor processing capabilities.

An important distinction in multi-tensor processing is between streamable and non-streamable tensors:

  • Streamable tensors: Contain data that varies over time (e.g., audio samples, time-series data). They can have multiple channels.

  • Non-streamable tensors: Contain static parameters or metadata (e.g., control parameters, configuration values, global settings). Only one channel is allowed.

Here’s how to configure and process multi-tensor models with anira:

#include <anira/anira.h>

// Configure a model with multiple inputs and outputs
anira::InferenceConfig multi_tensor_config(
        {{"path/to/your/multi_tensor_model.onnx", anira::InferenceBackend::ONNX}},
        {{{1, 1, 2048}, {1, 1, 4}},     // Two inputs: audio (2048 samples) + control params (4 values)
         {{1, 1, 2048}, {1, 1, 1}}},    // Two outputs: processed audio (2048 samples) + confidence (1 value)
        anira::ProcessingSpec(          // Optional processing specification
            {1, 1},        // Input channels per tensor: [audio_channels, control_channels]
            {1, 1},        // Output channels per tensor: [audio_channels, confidence_channels]
            {2048, 0},     // Input sizes: [streamable_audio, non_streamable_params]
            {2048, 0}      // Output sizes: [streamable_audio, non_streamable_confidence]
        ),
        10.0f // Maximum inference time in ms
);

// Create pre- and post-processor and inference handler
anira::PrePostProcessor pp_processor(multi_tensor_config);
anira::InferenceHandler inference_handler(pp_processor, multi_tensor_config);

// Prepare for processing
inference_handler.prepare({buffer_size, sample_rate});
inference_handler.set_inference_backend(anira::InferenceBackend::ONNX);

// Optionally get the latency of the inference process in samples for each tensor
std::vector<unsigned int> all_latencies = inference_handler.get_latency_vector();

The next step is the real-time processing of audio data and control parameters. The following examples demonstrate how to set inputs, process the data, and retrieve outputs. We have the following inputs and outputs (a declaration sketch follows the list):

  • audio_input: A pointer to a pointer of floats (float**) with shape [num_channels][num_samples].

  • audio_output: A pointer to a pointer of floats (float**) with shape [num_channels][num_samples].

  • control_params: A pointer to float (float*) containing 4 control values.

  • confidence_output: A pointer to float (float*) used to receive the confidence score.

  • num_samples: The number of audio samples to process.
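
For illustration, these could be declared as follows. This is only a sketch with hypothetical buffer sizes and names; in a real application the audio buffers come from your host:

// Hypothetical declarations for a mono model with a 2048-sample block
size_t num_samples = 2048;

float audio_in_buffer[2048] = {};                      // audio input samples
float* audio_in_channels[1] = { audio_in_buffer };     // [num_channels][num_samples]
float** audio_input = audio_in_channels;

float audio_out_buffer[2048] = {};                     // audio output samples
float* audio_out_channels[1] = { audio_out_buffer };
float** audio_output = audio_out_channels;

float control_values[4] = { 0.0f, 0.0f, 0.0f, 0.0f };  // 4 control values
float* control_params = control_values;

float confidence = 0.0f;                               // receives the confidence score
float* confidence_output = &confidence;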

Method 1: Individual Tensor Processing

// Step 1: Set non-streamable control parameters (tensor index 1)
for (size_t i = 0; i < 4; ++i) {
    pp_processor.set_input(control_params[i], 1, i);  // tensor_index = 1, sample_index = i
}

// Step 2: Process streamable audio data (tensor index 0)
inference_handler.process(audio_input, num_samples, 0); // Process audio data in tensor 0
// audio_input now contains the processed audio data

// Step 3: Retrieve non-streamable confidence output (tensor index 1)
*confidence_output = pp_processor.get_output(1, 0);  // Get confidence from tensor 1, sample 0

Method 2: Multi-Tensor Processing

// Allocate memory for input and output data structures (not in the real-time callback)
const float* const** input_data = new const float* const*[2];
float* const** output_data = new float* const*[2];

// Prepare input data structure: [tensor_index][channel][sample]
input_data[0] = audio_input;                            // Tensor 0: streamable audio data
input_data[1] = (const float* const*) &control_params;  // Tensor 1: non-streamable control params

// Prepare output data structure: [tensor_index][channel][sample]
output_data[0] = audio_output;                          // Tensor 0: processed audio output
output_data[1] = (float* const*) &confidence_output;    // Tensor 1: confidence score output

// Specify number of samples for each tensor
size_t input_samples[2] = {num_samples, 4};             // Audio: num_samples, Control: 4 values
size_t output_samples[2] = {num_samples, 1};            // Audio: num_samples, Confidence: 1 value

// Process all tensors in one call
size_t* processed_samples = inference_handler.process(
    input_data, input_samples, output_data, output_samples);

// Clean up the allocated memory after processing
delete[] input_data;
delete[] output_data;

Key Points for Multi-Tensor Processing

Tensor Organization and Indexing

  • Tensor indexing: Tensors are indexed starting from 0, following the order specified in the TensorShape configuration

  • Data structure: Multi-tensor data uses a 3D array structure: [tensor_index][channel][sample]

Streamable vs Non-Streamable Tensors

  • Streamable tensors: Time-varying data (audio, time-series) that flows continuously through the processing pipeline

  • Non-streamable tensors: Static parameters or metadata that is updated asynchronously

  • Configuration: Set processing sizes to 0 for non-streamable tensors in the ProcessingSpec

Processing Methods

  • Method 1 (individual tensor processing): Process each streamable tensor with anira::InferenceHandler::process(), and set or read non-streamable tensors through the anira::PrePostProcessor.

  • Method 2 (multi-tensor processing): Pass all input and output tensors to a single anira::InferenceHandler::process() call.

Non-Streamable Data Access

Note

Streamable tensors cannot be accessed with the anira::PrePostProcessor::set_input() and anira::PrePostProcessor::get_output() methods; these accessors are intended for non-streamable tensors.

Note

Non-streamable tensors will always have a single channel and a latency of 0 samples, as they are not time-varying.

Tip

When designing multi-tensor models, consider separating time-varying audio data (streamable) from control parameters (non-streamable).

Next Steps

  • Check the Usage Guide page for more detailed usage instructions

  • See the Examples page for complete example applications

  • Review the Architecture page to understand anira’s design

  • Try the tools in the Benchmarking Guide to evaluate your models’ performance