Hi everyone,
I'm currently implementing a sparse daxpyi operation on an aarch64 platform, using the inspector-executor pattern.
I compared three implementations:
Results:
I'm wondering:
Thanks in advance for any insights or suggestions! If helpful, I can share more details.
void arm_daxpyi2(const int n, const double alpha, const double *x, const int *indx, double *y)
{
    // Early return if alpha is zero (no operation needed)
    if (alpha == 0.0) {
        return;
    }

    // Use std::max_element to find the maximum index in the indx array
    const int full_size = *std::max_element(indx, indx + n) + 1;

    // Create sparse vector descriptor for x
    armpl_spvec_t spvec_x;
    armpl_status_t status = armpl_spvec_create_d(&spvec_x,  // Pointer to sparse vector object to create
                                                 0,         // Index base (0 for C-style indexing)
                                                 full_size, // Dimension of the sparse vector
                                                 n,         // Number of non-zero elements
                                                 indx,      // Array of indices
                                                 x,         // Array of non-zero values
                                                 0          // Flags (currently unused)
    );
    if (status != ARMPL_STATUS_SUCCESS) {
        // Handle error
        return;
    }

    // Execute the sparse vector operation: y = alpha*x + beta*y
    // Use beta = 1.0 to keep the existing values in y
    const double beta = 1.0;
    status = armpl_spaxpby_exec_d(alpha,   // alpha coefficient
                                  spvec_x, // sparse vector x
                                  beta,    // beta coefficient
                                  y        // dense vector y (input/output)
    );
    if (status != ARMPL_STATUS_SUCCESS) {
        // Handle error
        armpl_spvec_destroy(spvec_x);
        return;
    }

    // Clean up
    armpl_spvec_destroy(spvec_x);
}
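For checking results, a plain scalar loop that performs the same update (y[indx[i]] += alpha * x[i]) may be a useful baseline. The sketch below does not use Arm Performance Libraries at all; the names ref_daxpyi and max_abs_diff are hypothetical and only for illustration, and it assumes 0-based indices that are all valid positions in y.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Reference scalar daxpyi: y[indx[i]] += alpha * x[i] for i in [0, n).
// Hypothetical helper for comparison only; assumes 0-based, in-range indices.
void ref_daxpyi(const int n, const double alpha, const double *x,
                const int *indx, double *y)
{
    // Nothing to do for an empty input or a zero scale factor
    if (n <= 0 || alpha == 0.0) {
        return;
    }
    for (int i = 0; i < n; ++i) {
        y[indx[i]] += alpha * x[i];
    }
}

// Largest absolute element-wise difference between two result vectors.
// Hypothetical helper; compares only the common prefix if sizes differ.
double max_abs_diff(const std::vector<double> &a, const std::vector<double> &b)
{
    const std::size_t len = a.size() < b.size() ? a.size() : b.size();
    double worst = 0.0;
    for (std::size_t i = 0; i < len; ++i) {
        const double d = std::fabs(a[i] - b[i]);
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}
```

Running both versions on copies of the same y and comparing them with max_abs_diff gives a quick sanity check. Note that arm_daxpyi2 above creates and destroys the sparse vector handle on every call, so any timing against this loop includes that setup cost.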
Thanks for the suggestion!
I'll try this out and do more testing.