GPU support error in Relax

I built Relax from source with:
1) ./waf --use-cuda --cuda-dir=/usr/local/cuda --prefix=/home/StaMPS/Software/relax/relax-1.0.7/gpu --gmt-dir=/usr/local/gmt-4.5.18 --proj-dir=/usr/local/proj-5.2.0 --check-fortran-compiler=ifort --check-c-compiler=icc --check-cxx-compiler=icpc --use-papi --use-fftw configure
2) ./waf build
When I run Relax with the GPU enabled, it fails with the following error:

max sampling size (hor.,vert.): 3.99E+0 4.00E+0

----------------------------------------------------------------------------

could not allocate memory
Device 0: “Tesla P40”
copyFilter14 : invalid device symbolcopyFilter14 : invalid device symbolcopyFilter14 : invalid device symbolcuOptimalFilter: Failed in memcpy 1
custressupdate_ : Something went wrong with optimal filter
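
A possible line of investigation, stated as an assumption and not a confirmed diagnosis: "invalid device symbol" is often reported when the CUDA device code was compiled for a different GPU architecture than the one it runs on. The Tesla P40 has compute capability 6.1. The sketch below only reads the device name from the log and prints the matching nvcc -gencode option. The helper names and the small lookup table are hypothetical, and how such an option would be passed to the Relax waf build is not shown here because it is not known from this post.

```python
import re

# Compute capabilities for a few NVIDIA GPUs (illustrative, not exhaustive).
COMPUTE_CAPABILITY = {
    "Tesla K80": (3, 7),
    "Tesla P100": (6, 0),
    "Tesla P40": (6, 1),
    "Tesla V100": (7, 0),
}


def gencode_flag(major, minor):
    """Return the nvcc -gencode option targeting one compute capability."""
    arch = f"{major}{minor}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


def device_name_from_log(log):
    """Extract the GPU name from a log line such as: Device 0: "Tesla P40".

    Accepts both straight and curly double quotes around the name.
    Returns None when no such line is present.
    """
    match = re.search(r'Device \d+: ["“]([^"”]+)["”]', log)
    return match.group(1) if match else None


if __name__ == "__main__":
    log = 'Device 0: "Tesla P40"'
    name = device_name_from_log(log)
    major, minor = COMPUTE_CAPABILITY[name]
    print(name, gencode_flag(major, minor))
```

For the log above this prints the device name followed by -gencode arch=compute_61,code=sm_61, which can be compared with the architecture flags that appear in the nvcc command lines of the build output.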

I also tried building without CUDA support, and that build runs successfully. Only the CUDA-enabled build still fails.
Can you help me? Thank you.

Best wishes,

S. Hong