Hello everyone,
When initializing my Arm NN network, I pre-allocate output tensors like this:
// Pre-allocate memory for the outputs
for (std::size_t it = 0; it < outputLayerNamesList.size(); ++it)
{
    // dataType is kept for reference (all outputs here are float)
    const armnn::DataType dataType = outputBindingInfo[it].second.GetDataType();
    const armnn::TensorShape& tensorShape = outputBindingInfo[it].second.GetShape();
    std::vector<float> oneLayerOutResult;
    oneLayerOutResult.resize(tensorShape.GetNumElements(), 0);
    outputBuffer.emplace_back(oneLayerOutResult);
}

// Make Arm NN output tensors that wrap the pre-allocated buffers
outputTensors.reserve(outputBuffer.size());
for (std::size_t it = 0; it < outputBuffer.size(); ++it)
{
    outputTensors.emplace_back(std::make_pair(
        outputBindingInfo[it].first,
        armnn::Tensor(outputBindingInfo[it].second, outputBuffer.at(it).data())));
}
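For context, a minimal sketch of how these tensors are then used at inference time (runtime, networkId, and inputTensors come from my setup code, which is not shown here):

// Run inference; outputTensors wraps the pre-allocated buffers above,
// so the results are written directly into outputBuffer.
armnn::Status ret = runtime->EnqueueWorkload(networkId, inputTensors, outputTensors);
if (ret == armnn::Status::Failure)
{
    std::cerr << "Inference failed" << std::endl;
}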
The question is: What do I need to do to cleanly deallocate these output tensors when I am done with the network? Any suggestions, please?
Hi,
How are you de-allocating outputTensors at the moment, and what concerns do you have about how clean it is? I can't see the lifetime of outputTensors from your snippet, but you should be able to simply de-allocate them once the network has finished running. Are you seeing any bad behaviour when you de-allocate, and if so, what?
If custom allocation is needed, there's a good example here: arm-software.github.io/.../_custom_memory_allocator_sample_8cpp-example.xhtml
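Roughly, that sample boils down to implementing the ICustomAllocator interface and handing it to the runtime. A minimal sketch, assuming the interface from that example (exact method names and the CreationOptions field can differ between Arm NN versions):

#include <armnn/backends/ICustomAllocator.hpp>
#include <cstddef>
#include <cstdlib>

// Sketch: a trivial allocator backed by plain aligned heap memory.
class SampleAllocator : public armnn::ICustomAllocator
{
public:
    void* allocate(size_t size, size_t alignment) override
    {
        if (alignment == 0) { alignment = alignof(std::max_align_t); }
        // aligned_alloc requires size to be a multiple of alignment
        size_t rounded = ((size + alignment - 1) / alignment) * alignment;
        return std::aligned_alloc(alignment, rounded);
    }

    void free(void* ptr) override
    {
        std::free(ptr);
    }

    armnn::MemorySource GetMemorySourceType() override
    {
        return armnn::MemorySource::Malloc;
    }
};

// Registered with the runtime at creation time, e.g. for the GpuAcc backend:
// armnn::IRuntime::CreationOptions options;
// options.m_CustomAllocatorMap = {{"GpuAcc", std::make_shared<SampleAllocator>()}};
// armnn::IRuntimePtr runtime = armnn::IRuntime::Create(options);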
Hi Ben,
Good to hear from you.
What I am striving for is the ability to switch from one network to another at run time. To do that, I need to cleanly deallocate all the dynamically allocated memory left over from the previous network. A network can be initialized, executed numerous times, and then either simply de-initialized (i.e. closed) or the system can switch to some other network. If I don't de-initialize the output tensors and don't deallocate the buffers associated with them, then switching to another network crashes in the initialization snippet I provided earlier.
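To make the switch-over concrete, the teardown order I have in mind looks like this (a sketch; runtime and networkId are the IRuntime handle and network ID from my setup code):

// Tear down the old network before loading the next one.
// 1. Drop the armnn::Tensor wrappers first; they are non-owning views
//    over outputBuffer, so this releases no memory by itself.
outputTensors.clear();
inputTensors.clear();

// 2. Release the application-owned buffers the tensors pointed into.
outputBuffer.clear();

// 3. Ask the runtime to release everything it holds for this network.
runtime->UnloadNetwork(networkId);

// 4. Now it should be safe to load and initialize the next network.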
Here is how I am currently deallocating the output tensors and their associated buffers, but I am not sure if I am doing it the right way:
for (std::size_t i = 0; i < outputLayerNamesList.size(); ++i)
{
    const armnn::TensorShape& tensorShape = outputBindingInfo[i].second.GetShape();
    outputBuffer.get_allocator().deallocate(outputBuffer.data(), tensorShape.GetNumElements());
}
outputBuffer.clear();
outputTensors.clear();
Does it make sense? It seems to be working for me right now, but is this going to work in general?
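For comparison, the alternative I have been considering is to let the containers release their own storage, assuming outputBuffer is a std::vector<std::vector<float>> (swapping with an empty temporary forces the capacity to be freed immediately, and avoids pairing a manual allocator call with memory the vector still owns):

// Clear the non-owning tensor wrappers first.
outputTensors.clear();

// Then let the vectors free their own heap blocks.
std::vector<std::vector<float>>().swap(outputBuffer);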
At first glance it looks sensible, but I'll get an Arm NN person to run their eye over it.
The Arm NN expert says that input/output memory is application-scoped rather than Arm NN-specific, but what you've got looks sensible, given assumptions about the types used, etc.