Hi everyone,
I'm currently implementing a sparse daxpyi operation on an aarch64 platform, using the inspector-executor pattern.
I compared three implementations:
Results:
I'm wondering:
Thanks in advance for any insights or suggestions! If helpful, I can share more details.
void arm_daxpyi2(const int n, const double alpha, const double *x, const int *indx, double *y)
{
    // Early return if alpha is zero (no operation needed)
    if (alpha == 0.0) {
        return;
    }

    // use std::max_element to find the maximum index
    const int full_size = *std::max_element(indx, indx + n) + 1; // Find max index in indx array

    // Create sparse vector descriptor for x
    armpl_spvec_t spvec_x;
    armpl_status_t status = armpl_spvec_create_d(&spvec_x,  // Pointer to sparse vector object to create
                                                 0,         // Index base (0 for C-style indexing)
                                                 full_size, // Dimension of the sparse vector
                                                 n,         // Number of non-zero elements
                                                 indx,      // Array of indices
                                                 x,         // Array of non-zero values
                                                 0          // Flags (currently unused)
    );
    if (status != ARMPL_STATUS_SUCCESS) {
        // Handle error
        return;
    }

    // Execute the sparse vector operation: y = alpha*x + beta*y
    // Use beta = 1.0 to keep the existing values in y
    const double beta = 1.0;
    status = armpl_spaxpby_exec_d(alpha,   // alpha coefficient
                                  spvec_x, // sparse vector x
                                  beta,    // beta coefficient
                                  y        // dense vector y (input/output)
    );
    if (status != ARMPL_STATUS_SUCCESS) {
        // Handle error
        armpl_spvec_destroy(spvec_x);
        return;
    }

    // Clean up
    armpl_spvec_destroy(spvec_x);
}
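For checking the output of the ArmPL version above, a plain scalar loop can serve as a reference. This is only a minimal sketch: the name ref_daxpyi is hypothetical (it is not one of the implementations from the comparison), and it assumes 0-based indices that all fall inside y.

```cpp
// Hypothetical reference implementation, for validation only.
// Computes y[indx[i]] += alpha * x[i] for i in [0, n).
// Assumes 0-based indices and that every indx[i] is a valid position in y.
void ref_daxpyi(const int n, const double alpha, const double *x,
                const int *indx, double *y)
{
    // Nothing to do for an empty vector or a zero scaling factor
    if (n <= 0 || alpha == 0.0) {
        return;
    }

    // Scatter-add each scaled non-zero of x into the dense vector y
    for (int i = 0; i < n; ++i) {
        y[indx[i]] += alpha * x[i];
    }
}
```

Running both functions on copies of the same y and comparing element by element is one way to confirm the library path gives the same result before looking at timings.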
Thanks for the suggestion!
I'll try this out and do more testing.