Question
With CUDA Fortran, I'm trying to allocate arrays inside a structure, but I get a segmentation fault and I don't know why. Here is a short program (stored in a file called struct.cuf) that reproduces the problem. I'm compiling with PGI 16.10, using the following options: -O3 -Mcuda=cc60 -tp=x64 struct.cuf -o struct_out
module structure
  type mytype
    integer :: alpha,beta,gamma
    real,dimension(:),pointer :: a
  end type mytype
  type mytypeDevice
    integer :: alpha,beta,gamma
    real,dimension(:),pointer,device :: a
  end type mytypeDevice
end module structure
program main
  use cudafor
  use structure
  type(mytype) :: T(3)
  type(mytypeDevice),device :: T_Device(3)
  ! For the host
  do i=1,3
    allocate(T(i)%a(10))
  end do
  T(1)%a=1; T(2)%a=2; T(3)%a=3
  ! For the device
  print *, 'Everything from now is ok'
  do i=1,3
    allocate(T_Device(i)%a(10))
  end do
  !do i=1,3
  !  T_Device(i)%a=T(i)%a
  !end do
end program main
The output, including the error:
Everything from now is ok
Segmentation fault
What am I doing wrong here?
The only working solution I have found is to store the values in separate arrays and transfer those to the GPU, but that is very "heavy", especially if I use a lot of structures like mytype.
EDIT: The code has been modified to use Vladimir F's solution. If I remove the device attribute from the T_Device(3) declaration, then the allocation seems to work, and so does assigning values (the commented lines below the allocation). But I need that device attribute on T_Device(3), because I'm going to use it in kernels.
Thanks!
Answer 1:
I think you need a device pointer:
type mytype_device
  ...
  real,dimension(:),pointer, device :: a
end type
Never used CUDA Fortran in my life, but it seems obvious enough to wager.
Answer 2:
The problem here is how you have declared T_Device. To use host-side allocation, you first populate a host-memory copy of the device structure, and then copy it to device memory. This:
type(mytypeDevice) :: T_Device(3)
do i=1,3
  allocate(T_Device(i)%a(10))
end do
will work correctly. This is a very standard design pattern in C++ based CUDA code, and the principle here is identical.
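The staging pattern described above might be sketched as follows. This is a minimal, untested sketch, not the answerer's own code. The name T_Stage is hypothetical, and the sketch assumes the compiler treats a whole-array assignment from a host derived-type variable to a device one (T_Device = T_Stage) as a host-to-device copy of the structures together with the descriptors of their device arrays.

```fortran
module structure
  implicit none
  type mytype
    integer :: alpha, beta, gamma
    real, dimension(:), pointer :: a
  end type mytype
  type mytypeDevice
    integer :: alpha, beta, gamma
    real, dimension(:), pointer, device :: a
  end type mytypeDevice
end module structure

program main
  use cudafor
  use structure
  implicit none
  type(mytype) :: T(3)                       ! host data
  type(mytypeDevice) :: T_Stage(3)           ! hypothetical name: host-resident staging copy
  type(mytypeDevice), device :: T_Device(3)  ! device-resident copy, intended for kernels
  integer :: i

  do i = 1, 3
    allocate(T(i)%a(10))
    T(i)%a = real(i)
    ! T_Stage itself lives in host memory, so host code may touch its
    ! members; only the array a is allocated in device memory.
    allocate(T_Stage(i)%a(10))
    ! Host-to-device copy of the array contents (the assignment the
    ! question reports as working once T_Device is not a device variable).
    T_Stage(i)%a = T(i)%a
  end do

  ! Assumption: this assignment copies the three structures, including
  ! the descriptors that point at the device arrays, to device memory.
  T_Device = T_Stage

  ! Kernels taking T_Device as an argument would be launched here.
end program main
```

The device arrays are owned through T_Stage, so they should stay allocated for as long as any kernel uses T_Device, and any later deallocation would also go through T_Stage from host code.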
Source: https://stackoverflow.com/questions/44680150/how-to-allocate-arrays-of-arrays-in-structure-with-cuda-fortran