Getting Started

This guide will help you get started with anira for neural network inference in your audio applications.

Prerequisites

Before using anira, ensure you have:

  • A C++ compiler with C++17 support

  • CMake (version 3.14 or higher)

  • One of the supported neural network model formats:
    • ONNX model files (.onnx)

    • PyTorch model files (.pt, .pth or .ts)

    • TensorFlow Lite model files (.tflite)

Installation

C++ Library

Anira can be easily integrated into your CMake project. You can either add anira as a submodule, download the pre-built binaries from the releases page, or build from source.

Option 2: Use Pre-built Binaries

Download pre-built binaries from the releases page.

In your CMakeLists.txt:

# Setup your project and target
project(your_project)
add_executable(your_target main.cpp ...)

# Add the path to the anira library as cmake prefix path and find the package
list(APPEND CMAKE_PREFIX_PATH "path/to/anira")
find_package(anira REQUIRED)

# Link your target to the anira library
target_link_libraries(your_target anira::anira)

Option 3: Build from Source

git clone https://github.com/anira-project/anira.git
cd anira
cmake . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release --target anira
cmake --install build --prefix /path/to/install/directory

C++ Build Options

By default, LibTorch, ONNXRuntime and LiteRT are enabled. You can disable specific backends as needed:

  • LibTorch: -DANIRA_WITH_LIBTORCH=OFF

  • ONNXRuntime: -DANIRA_WITH_ONNXRUNTIME=OFF

  • LiteRT (LiteRt* C API): -DANIRA_WITH_LITERT=OFF — runs .tflite models through LiteRT’s native CompiledModel runtime. Enabled by default; it is the modern TensorFlow-Lite-family backend.

  • TensorFlow Lite (legacy TfLite* C API): -DANIRA_WITH_TFLITE=ON — the same runtime as LiteRT exposed through the older C API, so the two are mutually exclusive. To use it, disable LiteRT: -DANIRA_WITH_LITERT=OFF -DANIRA_WITH_TFLITE=ON.

Platform / backend support

anira builds on the targets below; the pre-built backends it downloads ship per target as shared and/or static (anira’s linkage follows BUILD_SHARED_LIBS):

Target              LibTorch   ONNXRuntime       LiteRT            TFLite (legacy)
macOS x86_64        shared     shared · static   shared · static   shared · static
macOS arm64         shared     shared · static   shared · static   shared · static
macOS universal     shared     shared · static   shared · static   shared · static
Linux x86_64        shared     shared · static   shared · static   shared · static
Linux aarch64       shared     shared · static   shared · static   shared · static
Windows x86_64      shared     shared · static   shared · static   shared · static
Windows arm64       shared     shared · static   shared · static   shared · static
WASM (Emscripten)   -          static            -                 -

LibTorch is shared-only (auto-disabled for fully static anira builds). LiteRT and TFLite are the same runtime via two C APIs and are mutually exclusive (LiteRT is the default). On WebAssembly only ONNX Runtime is supported. Backends for Android and iOS are also published in the anira-project/backends release for cross-builds. - = not provided.

Pre-built backend binaries are downloaded at configure time from the anira-project/backends release pinned by ANIRA_BACKENDS_VERSION. Integrity is checked live: when GitHub is reachable, anira fetches each asset’s published SHA256 and re-downloads any backend whose archive changed upstream or downloaded incompletely (the download is verified against that hash). Nothing is pinned in-repo. Linkage and source are configurable:

  • Linkage follows BUILD_SHARED_LIBS (shared anira → shared backends, static → static). Decouple a single engine with -DANIRA_<ENGINE>_LINKAGE=shared|static where <ENGINE> is LIBTORCH|ONNXRUNTIME|TFLITE|LITERT. LibTorch is shared-only.

  • Backends release tag: -DANIRA_BACKENDS_VERSION=v2.1.1.

  • Offline / reproducible builds: -DANIRA_BACKENDS_SKIP_REMOTE_CHECK=ON skips the GitHub query and reuses whatever is already in modules/.

  • Bring your own backend (no fork): -DANIRA_<ENGINE>_ROOTDIR=/path/to/prebuilt (a tree with include/ + lib/), or a custom source via -DANIRA_<ENGINE>_URL=... -DANIRA_<ENGINE>_SHA256=....

Moreover, the following options are available:

  • Build anira with benchmark capabilities: -DANIRA_WITH_BENCHMARK=ON

  • Build example applications, plugins and populate example neural models: -DANIRA_WITH_EXAMPLES=ON

  • Build anira with tests: -DANIRA_WITH_TESTS=ON

  • Build anira with documentation: -DANIRA_WITH_DOCS=ON

  • Disable the logging system: -DANIRA_WITH_LOGGING=OFF

Anira Web (Web / JavaScript)

Anira is available as the @anira-project/anira package for use in web applications:

# npm
npm install @anira-project/anira

# pnpm
pnpm add @anira-project/anira

# yarn
yarn add @anira-project/anira

Building @anira-project/anira from source

If you want to build the WASM module and JavaScript bindings yourself, you need to provide your own Emscripten SDK. The CMake presets expect the EMSDK environment variable to be set to the root of your emsdk installation.

git clone https://github.com/anira-project/anira.git
cd anira

export EMSDK=/path/to/your/emsdk

# Configure and build the WASM module (release)
cmake --preset web-prod
cmake --build --preset web-prod

# Build the JavaScript package
cd web
npm install
npm run build

To package it locally, run

npm pack

in the web folder. This creates a .tgz file that can be installed with npm or yarn.

Then install the package in your project:

npm install path/to/anira/web/anira-project-anira-x.x.x.tgz

A debug preset is also available via cmake --preset web / cmake --build --preset web.

Basic Usage Example

The basic usage of anira is as follows:

#include <anira/anira.h>

anira::InferenceConfig inference_config(
        {{"path/to/your/model.onnx", anira::InferenceBackend::ONNX}}, // Model path
        {{{256, 1, 1}}, {{256, 1}}},  // Input, Output shape
        5.33f // Maximum inference time in ms
);

// Create a pre- and post-processor instance
anira::PrePostProcessor pp_processor(inference_config);

// Create an InferenceHandler instance
anira::InferenceHandler inference_handler(pp_processor, inference_config);

// Pass the host configuration and allocate memory for audio processing
inference_handler.prepare({buffer_size, sample_rate});

// Select the inference backend
inference_handler.set_inference_backend(anira::InferenceBackend::ONNX);

// Optionally get the latency of the inference process in samples
unsigned int latency_in_samples = inference_handler.get_latency();

// Real-time safe audio processing in the process callback of your application
void process(float** audio_data, int num_samples) {
    inference_handler.process(audio_data, num_samples);
    // audio_data now contains the processed audio samples
}
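
The maximum inference time and the latency returned by get_latency() are expressed in different units (milliseconds and samples). The sketch below shows the conversion between the two. The helper names samples_to_ms and ms_to_samples are hypothetical and not part of anira. The 5.33 ms in the example above happens to match the duration of a 256-sample buffer at 48 kHz; that the value was chosen this way is an assumption.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper (not part of anira): duration of a block of samples in ms.
inline double samples_to_ms(std::size_t num_samples, double sample_rate_hz) {
    return 1000.0 * static_cast<double>(num_samples) / sample_rate_hz;
}

// Hypothetical helper (not part of anira): number of whole samples needed
// to cover a duration in ms, rounded up.
inline std::size_t ms_to_samples(double ms, double sample_rate_hz) {
    return static_cast<std::size_t>(std::ceil(ms * sample_rate_hz / 1000.0));
}
```

For instance, samples_to_ms(256, 48000.0) is roughly 5.33, and samples_to_ms(latency_in_samples, sample_rate) would report the latency from get_latency() in milliseconds.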

Using Different Backends

Anira supports multiple backends that can be selected at runtime. Use the anira::InferenceHandler::set_inference_backend() method to switch between them:

// Set the inference backend to ONNX
inference_handler.set_inference_backend(anira::InferenceBackend::ONNX);

// Set the inference backend to PyTorch
inference_handler.set_inference_backend(anira::InferenceBackend::PYTORCH);

// Set the inference backend to TensorFlow Lite
inference_handler.set_inference_backend(anira::InferenceBackend::TFLITE);

// You can also provide and select a custom backend if needed
inference_handler.set_inference_backend(anira::InferenceBackend::CUSTOM);

Multi Tensor Processing Example

Some neural network models require multiple input tensors or produce multiple output tensors. For example, a model might need both audio data and control parameters as inputs, or output both processed audio and confidence scores. Anira provides flexible methods to handle such models through its multi-tensor processing capabilities.

An important distinction in multi-tensor processing is between streamable and non-streamable tensors:

  • Streamable tensors: Contain data that varies over time (e.g., audio samples, time-series data). They can have multiple channels.

  • Non-streamable tensors: Contain static parameters or metadata (e.g., control parameters, configuration values, global settings). Only one channel is allowed.

Here’s how to configure and process multi-tensor models with anira:

#include <anira/anira.h>

// Configure a model with multiple inputs and outputs
anira::InferenceConfig multi_tensor_config(
        {{"path/to/your/multi_tensor_model.onnx", anira::InferenceBackend::ONNX}},
        {{{1, 1, 2048}, {1, 1, 4}},     // Two inputs: audio (2048 samples) + control params (4 values)
         {{1, 1, 2048}, {1, 1, 1}}},    // Two outputs: processed audio (2048 samples) + confidence (1 value)
        anira::ProcessingSpec(          // Optional processing specification
            {1, 1},        // Input channels per tensor: [audio_channels, control_channels]
            {1, 1},        // Output channels per tensor: [audio_channels, confidence_channels]
            {2048, 0},     // Input sizes: [streamable_audio, non_streamable_params]
            {2048, 0}      // Output sizes: [streamable_audio, non_streamable_confidence]
        ),
        10.0f // Maximum inference time in ms
);

// Create pre- and post-processor and inference handler
anira::PrePostProcessor pp_processor(multi_tensor_config);
anira::InferenceHandler inference_handler(pp_processor, multi_tensor_config);

// Prepare for processing
inference_handler.prepare({buffer_size, sample_rate});
inference_handler.set_inference_backend(anira::InferenceBackend::ONNX);

// Optionally get the latency of the inference process in samples
std::vector<unsigned int> all_latencies = inference_handler.get_latency_vector();

The next step is the real-time processing of audio data and control parameters. The following examples demonstrate how to set inputs, process the data, and retrieve outputs. They use the following inputs and outputs:

  • audio_input: A pointer to a pointer of floats (float**) with shape [num_channels][num_samples].

  • audio_output: A pointer to a pointer of floats (float**) with shape [num_channels][num_samples].

  • control_params: A pointer to float (float*) containing 4 control values.

  • confidence_output: A pointer to float (float*) used to receive the confidence score.

  • num_samples: The number of audio samples to process.

Method 1: Individual Tensor Processing

// Step 1: Set non-streamable control parameters (tensor index 1)
for (size_t i = 0; i < 4; ++i) {
    pp_processor.set_input(control_params[i], 1, i);  // tensor_index=1, sample_index=i
}

// Step 2: Process streamable audio data (tensor index 0)
inference_handler.process(audio_input, num_samples, 0); // Process audio data in tensor 0
// audio_input now contains processed audio data

// Step 3: Retrieve non-streamable confidence output (tensor index 1)
*confidence_output = pp_processor.get_output(1, 0);  // Get confidence from tensor 1, sample 0

Method 2: Multi-Tensor Processing

// Allocate the pointer tables for input and output data (not in the real-time callback).
// The table entries are left non-const so that they can be assigned below.
const float* const** input_data = new const float* const*[2];
float* const** output_data = new float* const*[2];

// Prepare input data structure: [tensor_index][channel][sample]
input_data[0] = audio_input;                            // Tensor 0: streamable audio data
input_data[1] = (const float* const*) &control_params;  // Tensor 1: non-streamable control params

// Prepare output data structure: [tensor_index][channel][sample]
output_data[0] = audio_output;                          // Tensor 0: processed audio output
output_data[1] = (float* const*) &confidence_output;    // Tensor 1: confidence score output

// Specify number of samples for each tensor
size_t input_samples[2] = {num_samples, 4};             // Audio: num_samples, Control: 4 values
size_t output_samples[2] = {num_samples, 1};            // Audio: num_samples, Confidence: 1 value

// Process all tensors in one call
size_t* processed_samples = inference_handler.process(
    input_data, input_samples, output_data, output_samples);

// Clean up the allocated memory after processing
delete[] input_data;
delete[] output_data;

Key Points for Multi-Tensor Processing

Tensor Organization and Indexing

  • Tensor indexing: Tensors are indexed starting from 0, following the order specified in the TensorShape configuration

  • Data structure: Multi-tensor data uses a 3D array structure: [tensor_index][channel][sample]
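
The [tensor_index][channel][sample] layout can be tried out without anira. The following is a minimal sketch under stated assumptions: TwoTensorInput, make_input and sum_all are hypothetical names introduced only for illustration, and sum_all merely stands in for a consumer that reads a three-level pointer structure. It also shows that a fixed-size pointer table needs no heap allocation.

```cpp
#include <cstddef>

// Hypothetical fixed-size pointer table for two input tensors (not part of anira).
// tensors[t] points to the channel pointers of tensor t.
struct TwoTensorInput {
    const float* const* tensors[2];
};

// Build the table from per-tensor channel pointer arrays.
// float* const* converts implicitly to const float* const*.
inline TwoTensorInput make_input(float* const* audio, float* const* control) {
    return TwoTensorInput{{audio, control}};
}

// Hypothetical consumer: walks data[tensor][channel][sample] and sums every value.
inline float sum_all(const float* const* const* data,
                     const std::size_t* num_channels,
                     const std::size_t* num_samples,
                     std::size_t num_tensors) {
    float total = 0.0f;
    for (std::size_t t = 0; t < num_tensors; ++t) {
        for (std::size_t c = 0; c < num_channels[t]; ++c) {
            for (std::size_t s = 0; s < num_samples[t]; ++s) {
                total += data[t][c][s];
            }
        }
    }
    return total;
}
```

With a two-channel audio tensor and a single-channel control tensor, in.tensors[1][0][3] addresses the fourth control value: tensor 1, channel 0, sample 3.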

Streamable vs Non-Streamable Tensors

  • Streamable tensors: Time-varying data (audio, time-series) that flows continuously through the processing pipeline

  • Non-streamable tensors: Static parameters or metadata that is updated asynchronously

  • Configuration: Set processing sizes to 0 for non-streamable tensors in the ProcessingSpec
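
Because non-streamable values are updated asynchronously, an application needs some way to hand them from a UI or control thread to the audio thread. The sketch below shows one possible application-side approach using std::atomic<float>. It is an assumption about how a host application might stage its parameters, not a description of anira's internals, and ControlParams is a hypothetical class name.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical application-side staging area for N control values (not part of anira).
// A control thread writes single values; the audio thread copies them out.
template <std::size_t N>
class ControlParams {
public:
    // Called from the UI/control thread.
    void set(std::size_t index, float value) {
        values_[index].store(value, std::memory_order_relaxed);
    }

    // Called from the audio thread: copies the current values into a plain array.
    // Each value is read atomically; the set of values as a whole is not a
    // consistent snapshot if the writer is active at the same time.
    void snapshot(float* destination) const {
        for (std::size_t i = 0; i < N; ++i) {
            destination[i] = values_[i].load(std::memory_order_relaxed);
        }
    }

private:
    std::array<std::atomic<float>, N> values_{};  // value-initialized to 0.0f
};
```

The copied values could then be forwarded one by one with pp_processor.set_input(value, 1, i), as in Method 1. Whether std::atomic<float> is lock-free on a given platform can be checked with std::atomic<float>::is_always_lock_free.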

Processing Methods

Non-Streamable Data Access

Note

Streamable tensors cannot be accessed with the anira::PrePostProcessor::set_input() and anira::PrePostProcessor::get_output() methods.

Note

Non-streamable tensors will always have a single channel and a latency of 0 samples, as they are not time-varying.

Tip

When designing multi-tensor models, consider separating time-varying audio data (streamable) from control parameters (non-streamable).

Next Steps

  • Check the Usage Guide page for more detailed usage instructions

  • See the Examples page for complete example applications

  • Review the Architecture to understand anira’s design

  • Try the Benchmarking Guide tools to evaluate your models’ performance