Forums - Point Kernel in Symphony

Point Kernel in Symphony
srijeeta.dona
Join Date: 13 Mar 18
Posts: 12
Posted: Wed, 2018-06-06 02:47

Can anyone please share some resources or a demo example of Point Kernel in Symphony? There is no point kernel example among the examples provided in the SDK, and the SDK documentation says very little about point kernels.

Thank you in advance for your help.

asraghav
Join Date: 29 Nov 17
Posts: 11
Posted: Wed, 2018-06-06 17:57

Point Kernels are a beta feature for now, addressing purely data-parallel algorithms. Unfortunately, we don't have any samples that we could share at this moment. We will update the thread once we have some samples to share.

Aravind.

srijeeta.dona
Join Date: 13 Mar 18
Posts: 12
Posted: Tue, 2018-06-19 21:21

I am getting this error when trying to create a point kernel in Symphony. I couldn't find any demo or detailed documentation for it. I followed the example illustrated in https://www.iwocl.org/wp-content/uploads/iwocl2017-wenjia-ruan-symphony.pdf.

Can anyone help me with this? Thanks in advance.

In file included from /opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pointkernel.hh:10:
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:197:20: error:
      implicit instantiation of undefined template
      'symphony::internal::pointkernel::output_buffer_type_extractor<false,
      float *>'
  typedef typename output_buffer_type_extractor<
                   ^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:197:20: note:
      in instantiation of template class
      'symphony::internal::pointkernel::output_buffer_type_extractor<false,
      const float *, float *>' requested here
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:204:20: note:
      in instantiation of template class
      'symphony::internal::pointkernel::output_buffer_type_extractor<false,
      const float *, const float *, float *>' requested here
  typedef typename output_buffer_type_extractor<
                   ^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pointkernel.hh:55:43: note:
      in instantiation of template class
      'symphony::internal::pointkernel::get_output_buffer_type<symphony::internal::pointkernel::arg_list_gen<const
      float *, const float *, float *> >' requested here
  using gpu_output_buffer_type = typename get_output_buffer_type<gpu_ker...
                                          ^
/home/srijeeta/Project/Symphony/tests/point_kernel/create_kernel1.cc:12:1: note:
      in instantiation of template class
      'symphony::internal::pointkernel::pointkernel<long,
      symphony::internal::range<1>, const float *, const float *, float *>'
      requested here
SYMPHONY_POINT_KERNEL_1D_3(vadd, int, i, first, last, const float*, a, c...
^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/macrodefinitions_1d.hh:98:26: note:
      expanded from macro 'SYMPHONY_POINT_KERNEL_1D_3'
  static fname##_pk_type gen_##fname##_obj(std::string global_string){ \
                         ^
<scratch space>:34:1: note: expanded from here
gen_vadd_obj
^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:188:58: note:
      template is declared here
template<bool is_output_buffer, typename... Args> struct output_buffer_t...
                                                         ^
/home/srijeeta/Project/Symphony/tests/point_kernel/create_kernel1.cc:37:18: error:
      no matching function for call to 'create_point_kernel'
        auto vadd_pk = symphony::beta::create_point_kernel('vadd');             
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/pointkernel.hh:19:40: note:
      candidate template ignored: couldn't infer template argument 'T'
typename std::tuple_element<0,T>::type create_point_kernel(std::string g...
                                       ^
2 errors generated.
 

srijeeta.dona
Join Date: 13 Mar 18
Posts: 12
Posted: Tue, 2018-06-19 21:31

My code for the same:

#include <symphony/symphony.h>
#include <random>

SYMPHONY_POINT_KERNEL_1D_3(vadd, int, i, first, last, const float*, a, const float*, b, float*, c, { c[i] = a[i] + b[i]; });

int main() {
    const size_t size = 16;
    auto a_buf = symphony::create_buffer<float>(size);
    auto b_buf = symphony::create_buffer<float>(size);
    auto c_buf = symphony::create_buffer<float>(size);
    for (size_t i = 0; i < size; ++i) {
        a_buf[i] = float(rand() % 100);
        b_buf[i] = float(rand() % 100);
        c_buf[i] = 0;
    }
    auto r = symphony::range<1>(size);
    auto vadd_pk = symphony::beta::create_point_kernel("vadd");
    auto pfor = symphony::beta::pattern::create_pfor_each(vadd_pk, a_buf, b_buf, c_buf);
    pfor(r, symphony::pattern::tuner().set_cpu_load(20).set_gpu_load(80));
    return 0;
}
 

asraghav
Join Date: 29 Nov 17
Posts: 11
Posted: Thu, 2018-06-21 13:20

You need to specify the template argument type when invoking the create_point_kernel API. Please refer to the API in the header symphony/pointkernel.hh. The types are defined in the autogenerated file macrodefinitions_1d.hh.

For example, in your case, since you are doing a vector add, it would be something like this:

auto vadd_pk = symphony::beta::create_point_kernel<vadd_type>();  
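
The "couldn't infer template argument 'T'" note in the earlier log is ordinary C++ behaviour rather than anything Symphony-specific: the candidate shown in the log uses T only to compute its return type, so the compiler cannot deduce it from a string argument. A minimal sketch in plain C++ follows. Note that `make_first`, `demo_kernel_type` and `good` are hypothetical names used only for illustration; they are not Symphony APIs.

```cpp
#include <string>
#include <tuple>

// Hypothetical stand-in for a generated kernel type list (not a Symphony type).
using demo_kernel_type = std::tuple<int, float>;

// T appears only in the return type, so it can never be deduced from the
// function argument; the caller has to spell it out.
template <typename T>
typename std::tuple_element<0, T>::type make_first(std::string name = "") {
    // Return a value derived from the argument so the call is observable.
    return static_cast<typename std::tuple_element<0, T>::type>(name.size());
}

// auto bad = make_first("vadd");  // error: couldn't infer template argument 'T'

// OK: T is given explicitly.
inline int good() { return make_first<demo_kernel_type>("vadd"); }
```

Calling `make_first<demo_kernel_type>()` with no argument also compiles, which mirrors the `create_point_kernel<vadd_type>()` form suggested above.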

 

srijeeta.dona
Join Date: 13 Mar 18
Posts: 12
Posted: Mon, 2018-06-25 04:12

Even after changing that, I am still getting the error "implicit instantiation of undefined template".

Also, how should I use symphony::pattern::tuner().set_cpu_load(20).set_gpu_load(80) in the code to distribute the computation of the point kernel across CPU and GPU?

My code:

SYMPHONY_POINT_KERNEL_1D_3(vadd, int, i, first, last, const float*, a, const float*, b, float*, c,  {c[i] = a[i] + b[i];});

.....

auto r =  symphony::range<1>(size);
auto vadd_pk = symphony::beta::create_point_kernel<symphony::vadd_type>();

//symphony::internal::pfor_each(r, vadd_pk, a_buf, b_buf, c_buf , symphony::pattern::tuner().set_cpu_load(20).set_gpu_load(80));

Error message:

[arm64-v8a] Compile++      : kernel_test <= create_kernel1.cc
In file included from /home/srijeeta/Project/Symphony/tests/point_kernel/create_kernel1.cc:6:
In file included from /opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/symphony.h:18:
In file included from /opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/patterns.hh:9:
In file included from /opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/pfor_each.hh:11:
In file included from /opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/patterns/pfor_each.hh:11:
In file included from /opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pointkernel.hh:10:
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:197:20: error: implicit instantiation of undefined template
      'symphony::internal::pointkernel::output_buffer_type_extractor<false, float *>'
  typedef typename output_buffer_type_extractor<
                   ^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:197:20: note: in instantiation of template class
      'symphony::internal::pointkernel::output_buffer_type_extractor<false, const float *, float *>' requested here
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:204:20: note: in instantiation of template class
      'symphony::internal::pointkernel::output_buffer_type_extractor<false, const float *, const float *, float *>' requested here
  typedef typename output_buffer_type_extractor<
                   ^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pointkernel.hh:55:43: note: in instantiation of template class
      'symphony::internal::pointkernel::get_output_buffer_type<symphony::internal::pointkernel::arg_list_gen<const float *, const float *, float *> >' requested here
  using gpu_output_buffer_type = typename get_output_buffer_type<gpu_kernel_args>::type;
                                          ^
/home/srijeeta/Project/Symphony/tests/point_kernel/create_kernel1.cc:9:1: note: in instantiation of template class 'symphony::internal::pointkernel::pointkernel<long,
      symphony::internal::range<1>, const float *, const float *, float *>' requested here
SYMPHONY_POINT_KERNEL_1D_3(vadd, int, i, first, last, const float*, a, const float*, b, float*, c,  {c[i] = a[i] + b[i];});
^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/macrodefinitions_1d.hh:3628:26: note: expanded from macro
      'SYMPHONY_POINT_KERNEL_1D_3'
  static fname##_pk_type gen_##fname##_obj(std::string global_string){ \
                         ^
<scratch space>:37:1: note: expanded from here
gen_vadd_obj
^
/opt/Qualcomm/Symphony/1.1.4/aarch64-linux-android/lib//../include/symphony/internal/pointkernel/pk_util.hh:188:58: note: template is declared here
template<bool is_output_buffer, typename... Args> struct output_buffer_type_extractor;
                                                         ^
1 error generated.
make: *** [obj/local/arm64-v8a/objs/kernel_test/create_kernel1.o] Error 1
 

asraghav
Join Date: 29 Nov 17
Posts: 11
Posted: Mon, 2018-06-25 17:43

For vector add, you also need to pass in the length of each vector as a parameter. So I suggest using the 6-parameter macro instead of the 3-parameter one. It would be something like this:

 

SYMPHONY_POINT_KERNEL_1D_6(vadd, int, i, first, last,
const float*, a, int, num_elemsa,
const float*, b, int, num_elemsb,
float*, c, int, num_elemsc,
{ c[i] = a[i] + b[i]; });
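
One plausible reading of the "implicit instantiation of undefined template" error, given the advice above, is that the SDK's type extractor expects each pointer argument to be followed by a length and has no case for a pointer on its own. The sketch below reproduces that failure mode in plain C++. It is an assumption about the mechanism, not the Symphony implementation; `first_output` and `is_complete` are hypothetical names.

```cpp
#include <type_traits>

// Primary template: declared but never defined, like the
// output_buffer_type_extractor declaration named in the compiler note.
template <typename... Args> struct first_output;

// A const pointer plus its length is an input: skip the pair and recurse.
template <typename T, typename... Rest>
struct first_output<const T*, int, Rest...> : first_output<Rest...> {};

// A non-const pointer plus its length is the output: recursion stops here.
template <typename T, typename... Rest>
struct first_output<T*, int, Rest...> { using type = T*; };

// Detects whether a type is complete (sizeof is invalid on incomplete types).
template <typename T, typename = void>
struct is_complete : std::false_type {};
template <typename T>
struct is_complete<T, decltype(void(sizeof(T)))> : std::true_type {};

// Six arguments in (pointer, length) pairs resolve to the output type.
static_assert(std::is_same<
    first_output<const float*, int, const float*, int, float*, int>::type,
    float*>::value, "paired arguments resolve");

// A lone pointer matches no specialization, so only the undefined primary
// template is left. Asking for first_output<float*>::type would fail with
// "implicit instantiation of undefined template".
static_assert(!is_complete<first_output<float*>>::value,
              "lone pointer has no definition");
```

In this sketch the fix is the same as in the reply: give every pointer a length argument so that a specialization matches.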

 

You could then create the Point Kernel and the pfor_each task and launch them. Here's one way of doing that:

auto vadd_pk = ::symphony::beta::create_point_kernel<vadd_type>(); // Create Point Kernel
symphony::range<1> range_1d(size); // I'm assuming size is equivalent of your buffer size
symphony::pattern::tuner tuner = symphony::pattern::tuner(); // Pattern Tuner & setting the parameters
tuner.set_cpu_load(20).set_gpu_load(80);
auto pfor_task = symphony::beta::pattern::create_pfor_each(vadd_pk, arg_list); // arg_list is any other arguments you need to pass in like list of input/output buffers in this case
pfor_task(range_1d, tuner);
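
This thread does not document how the runtime turns the two load values into a partition of the range. As a rough mental model only, assuming the values act as relative weights, a 20/80 setting over a 1D range could be pictured as below. `split_by_load` is a hypothetical helper written for this illustration, not a Symphony function.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper: split `size` iterations between two devices in
// proportion to their load weights. Returns {cpu_count, gpu_count}.
inline std::pair<std::size_t, std::size_t>
split_by_load(std::size_t size, std::size_t cpu_load, std::size_t gpu_load) {
    const std::size_t total = cpu_load + gpu_load;
    if (total == 0) {
        // No weights given: this sketch arbitrarily assigns everything to the GPU.
        return std::make_pair(std::size_t{0}, size);
    }
    const std::size_t cpu_count = size * cpu_load / total;  // rounds down
    return std::make_pair(cpu_count, size - cpu_count);
}
```

Under that assumption, `split_by_load(16, 20, 80)` gives 3 iterations to the CPU and 13 to the GPU. The actual scheduler may chunk or rebalance work differently.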
 

Let us know if this helps.

Aravind.

 

srijeeta.dona
Join Date: 13 Mar 18
Posts: 12
Posted: Tue, 2018-06-26 01:48

Thank you so much for your help. I was stuck on this for a long time. Is there any documentation that explains the standard types for point kernels and the parameters each type takes?

For those who need a demo example of a point kernel, here is the whole code:

#include <symphony/symphony.h>
#include <cstdlib>  // rand()

// 1D point kernel with six arguments: each buffer pointer is followed by its length.
SYMPHONY_POINT_KERNEL_1D_6(vadd, int, i, first, last, const float*, a, int, n_a, const float*, b, int, n_b, float*, c, int, n_c, { c[i] = a[i] + b[i]; });

int main() {
    const size_t size = 16;
    auto a_buf = symphony::create_buffer<float>(size);
    auto b_buf = symphony::create_buffer<float>(size);
    auto C = symphony::create_buffer<float>(size);  // output buffer

    // Fill the two input buffers with pseudo-random values.
    for (size_t i = 0; i < size; ++i) {
        a_buf[i] = float(rand() % 100);
        b_buf[i] = float(rand() % 100);
    }

    // The kernel declares its inputs as const float*, so pass const views of the buffers.
    symphony::buffer_ptr<const float> A;
    symphony::buffer_ptr<const float> B;
    A = a_buf;
    B = b_buf;

    // Create the point kernel, then launch it over the 1D range with a 20/80 CPU/GPU load.
    auto vadd_pk = symphony::beta::create_point_kernel<symphony::vadd_type>();
    symphony::range<1> range_1d(size);
    symphony::pattern::tuner tuner = symphony::pattern::tuner();
    tuner.set_cpu_load(20).set_gpu_load(80);
    auto pfor_task = symphony::beta::pattern::create_pfor_each(vadd_pk, A, B, C);
    pfor_task(range_1d, tuner);
    return 0;
}
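
The demo launches the kernel but never inspects C. One way to check a result like this is to compare it against a plain CPU reference. The sketch below uses only the standard library and assumes the buffer contents have already been copied into `std::vector<float>`; how to read data back from a Symphony buffer is not covered here. `vadd_reference` and `nearly_equal` are hypothetical helper names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Plain CPU reference for c[i] = a[i] + b[i] over the common length of a and b.
inline std::vector<float> vadd_reference(const std::vector<float>& a,
                                         const std::vector<float>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    std::vector<float> c(n);
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}

// True when both vectors have the same length and agree element-wise within `tol`.
inline bool nearly_equal(const std::vector<float>& x,
                         const std::vector<float>& y,
                         float tol = 1e-5f) {
    if (x.size() != y.size()) return false;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (std::fabs(x[i] - y[i]) > tol) return false;
    }
    return true;
}
```

Comparing the kernel output with `nearly_equal(kernel_result, vadd_reference(a, b))` would catch a launch that silently skipped part of the range.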
 

srijeeta.dona
Join Date: 13 Mar 18
Posts: 12
Posted: Fri, 2018-07-06 07:18

Hello, I am trying to launch for CPU, GPU and DSP, but it is not working for the DSP, even though the individual Hexagon examples work fine. Can anyone please give any insight into this? Thank you in advance.
