Custom Backend Definition¶
Anira provides the flexibility to implement custom inference backends through the anira::BackendBase
class. This enables integration of additional inference engines, customization of existing engines, or implementation of specialized processing logic such as bypass or roundtrip backends.
Understanding the Backend Interface¶
The anira::BackendBase
class provides the foundation for all inference backends in anira. When creating a custom backend, you need to inherit from this class and implement the required virtual methods.
Core Virtual Methods:
Method | Description
---|---
prepare() | Called once during initialization to set up the backend. Use this method to load models, allocate memory, or configure the inference engine.
process() | Called for each inference operation. This method receives input buffers, performs the actual inference, and writes results to output buffers.
Method Signatures:
class CustomBackend : public anira::BackendBase {
public:
    CustomBackend(anira::InferenceConfig& inference_config)
        : anira::BackendBase(inference_config) {}

    // Initialize the backend (called once)
    void prepare() override {
        // Load models, allocate memory, configure inference engine
    }

    // Process inference (called repeatedly in real-time)
    void process(std::vector<anira::BufferF>& input, std::vector<anira::BufferF>& output,
                 std::shared_ptr<anira::SessionElement> session) override {
        // Perform inference and write results to output buffers
    }
};
Implementing the Backend¶
Constructor Requirements:
Your custom backend must accept an anira::InferenceConfig
reference in its constructor and pass it to the base class. This configuration provides access to model information, tensor shapes, and other inference parameters.
class CustomBackend : public anira::BackendBase {
public:
    CustomBackend(anira::InferenceConfig& inference_config)
        : anira::BackendBase(inference_config) {
        // Optional: Store additional configuration or initialize members
    }
};
Implementing prepare():
The anira::BackendBase::prepare()
method is called once during the initialization phase. Use this method to:
- Load neural network models
- Initialize inference engines or libraries
- Allocate persistent memory buffers
- Configure backend-specific settings
void prepare() override {
    // Example: Load a custom inference engine
    auto& model_data = m_inference_config.get_model_data();
    if (!model_data.empty()) {
        std::string model_path = model_data[0].model_path;
        // Load your model here
        custom_engine.load_model(model_path);
    }

    // Example: Pre-allocate inference buffers
    auto input_shape = m_inference_config.get_tensor_input_shape();
    auto output_shape = m_inference_config.get_tensor_output_shape();
    inference_input_buffer.resize(input_shape[0]);
    inference_output_buffer.resize(output_shape[0]);
}
Implementing process():
The anira::BackendBase::process()
method is called for each inference operation. This method receives:

- input: a vector of anira::BufferF containing the input data from the pre-processor for all input tensors
- output: a vector of anira::BufferF where the results should be written for all output tensors
- session: a shared pointer to the anira::SessionElement for accessing additional session data
The vectors contain one anira::BufferF
for each tensor defined in your model. Most audio processing models have a single input and single output tensor (both at index 0), but some models may have multiple tensors for different purposes (e.g., audio data, control parameters, confidence outputs).
void process(std::vector<anira::BufferF>& input, std::vector<anira::BufferF>& output,
             std::shared_ptr<anira::SessionElement> session) override {
    // Process each tensor - typically there's one input and one output tensor
    for (size_t tensor_idx = 0; tensor_idx < input.size() && tensor_idx < output.size(); ++tensor_idx) {
        auto& input_buffer = input[tensor_idx];
        auto& output_buffer = output[tensor_idx];

        // Copy input data into the inference engine's format
        // (this sketch assumes the engine buffer stores the channels consecutively)
        for (size_t channel = 0; channel < input_buffer.get_num_channels(); ++channel) {
            auto input_ptr = input_buffer.get_read_pointer(channel);
            std::copy(input_ptr, input_ptr + input_buffer.get_num_samples(),
                      inference_input_buffer.begin() + channel * input_buffer.get_num_samples());
        }

        // Perform inference
        custom_engine.infer(inference_input_buffer, inference_output_buffer);

        // Copy the results back into the output buffer, channel by channel
        for (size_t channel = 0; channel < output_buffer.get_num_channels(); ++channel) {
            auto output_ptr = output_buffer.get_write_pointer(channel);
            auto result_begin = inference_output_buffer.begin() + channel * output_buffer.get_num_samples();
            std::copy(result_begin, result_begin + output_buffer.get_num_samples(), output_ptr);
        }
    }
}
Backend Integration¶
Once your custom backend is implemented, integrate it with the anira::InferenceHandler:
// Create your custom backend instance
CustomBackend custom_backend(inference_config);
// Create InferenceHandler with custom backend
anira::InferenceHandler inference_handler(pp_processor, inference_config, custom_backend);
// Select the custom backend
inference_handler.set_inference_backend(anira::InferenceBackend::CUSTOM);
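Once the custom backend is selected, the anira::InferenceHandler is used in the same way as with the built-in backends. The following is a minimal sketch of the surrounding lifecycle; host_config, audio_data, and num_samples are placeholders standing in for the host configuration object and the audio callback arguments described in the general usage documentation:
// Non-real-time setup: let the handler allocate its internal buffers
inference_handler.prepare(host_config);

// Real-time audio callback: process the audio in place through the selected backend
// audio_data is an array of channel pointers, each holding num_samples samples
inference_handler.process(audio_data, num_samples);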
Example: Bypass Backend¶
The following example demonstrates a simple bypass backend that returns the last portion of the input buffer as output, effectively bypassing the inference stage:
#include <anira/anira.h>

class BypassProcessor : public anira::BackendBase {
public:
    BypassProcessor(anira::InferenceConfig& inference_config)
        : anira::BackendBase(inference_config) {}

    void prepare() override {
        // No preparation needed for bypass
    }

    void process(std::vector<anira::BufferF>& input, std::vector<anira::BufferF>& output,
                 [[maybe_unused]] std::shared_ptr<anira::SessionElement> session) override {
        // Process each tensor pair
        for (size_t tensor_idx = 0; tensor_idx < input.size() && tensor_idx < output.size(); ++tensor_idx) {
            auto& input_buffer = input[tensor_idx];
            auto& output_buffer = output[tensor_idx];

            auto equal_channels = input_buffer.get_num_channels() == output_buffer.get_num_channels();
            // Compare before subtracting so the check also works with unsigned sample counts
            auto enough_samples = input_buffer.get_num_samples() >= output_buffer.get_num_samples();

            if (equal_channels && enough_samples) {
                // The input can be longer than the output, e.g. when it carries past context;
                // copy only the last output_buffer.get_num_samples() samples of each channel
                auto sample_diff = input_buffer.get_num_samples() - output_buffer.get_num_samples();
                for (size_t channel = 0; channel < input_buffer.get_num_channels(); ++channel) {
                    auto write_ptr = output_buffer.get_write_pointer(channel);
                    auto read_ptr = input_buffer.get_read_pointer(channel);
                    for (size_t i = 0; i < output_buffer.get_num_samples(); ++i) {
                        write_ptr[i] = read_ptr[i + sample_diff];
                    }
                }
            } else {
                // Clear the output if the dimensions don't match
                output_buffer.clear();
            }
        }
    }
};
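The bypass backend is wired up like any other custom backend, reusing the inference_config and pp_processor instances from the integration example above:
// Create the bypass backend and hand it to the InferenceHandler
BypassProcessor bypass_processor(inference_config);
anira::InferenceHandler inference_handler(pp_processor, inference_config, bypass_processor);

// Select the custom (bypass) backend
inference_handler.set_inference_backend(anira::InferenceBackend::CUSTOM);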
Note
When implementing custom inference backends, refer to the existing backend implementations in the anira source code (src/backends/
) for additional guidance and best practices. Each backend demonstrates different approaches to handling model loading, memory management, and inference execution.