  • I have an OpenCV C++ program and I need to use cudaMemcpy in it. So I used cudaMemcpy in the source file and saved it as a .cu file. I am using CMake, so I added the .cu file as a cuda_add_executable target. But when I run "sudo make" I get the following error:

    https://i.stack.imgur.com/fr1wf.png

    This is the makefile: https://i.stack.imgur.com/EcjPU.png

    So on line 27 I changed c++0x to c++11 and got the same error; then I tried gnu++11, but I still get the same error.

    1 Answer


    The problem is that -std=c++11 is not added to the nvcc build command if it is passed via add_definitions(). From the FindCUDA CMake documentation:

    Flags passed into add_definitions with -D or /D are passed along to nvcc.


    Other options like -std=c++11 are not passed to nvcc.


    You can enable C++11 globally by adding it to CUDA_NVCC_FLAGS:

    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -std=c++11" )

    Or at the per-target level with:

    cuda_add_executable(opencv opencv.cu OPTIONS -std=c++11)
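    Putting this together, a minimal CMakeLists.txt using the FindCUDA module might look like the sketch below; the project and file names are placeholders, not taken from the question:

    ```cmake
    cmake_minimum_required(VERSION 2.8)
    project(opencv_cuda)

    find_package(CUDA REQUIRED)

    # This flag reaches nvcc; passing it via add_definitions() would not
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -std=c++11")

    cuda_add_executable(opencv opencv.cu)
    ```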

    As talonmies pointed out in the comments: if you only need functions from the runtime API, like cudaMemcpy(), you can use them in ordinary C++ files (no .cu extension required). You then need to include cuda_runtime_api.h and link against the CUDA runtime library cudart. (Ensure that the CUDA include directory is on your include path and the library is on your library path; this is taken care of automatically if you compile with nvcc.)
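    As a sketch of that approach (the file name and compile flags are illustrative, and assume a default CUDA install under /usr/local/cuda), a plain C++ file can call the runtime API directly:

    ```cpp
    // roundtrip.cpp -- plain C++ (no kernels), so no .cu extension and no nvcc needed
    #include <cstdio>
    #include <cuda_runtime_api.h>

    int main() {
        int host[4] = {1, 2, 3, 4};
        int *device = nullptr;

        // Allocate device memory and copy the buffer there and back
        cudaMalloc(reinterpret_cast<void**>(&device), sizeof(host));
        cudaMemcpy(device, host, sizeof(host), cudaMemcpyHostToDevice);
        cudaMemcpy(host, device, sizeof(host), cudaMemcpyDeviceToHost);
        cudaFree(device);

        std::printf("%d %d %d %d\n", host[0], host[1], host[2], host[3]);
        return 0;
    }
    ```

    Compiled with an ordinary host compiler, e.g. g++ roundtrip.cpp -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart (running it requires a CUDA-capable machine).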

    Update

    The proper way to set the C++ standard with more recent versions of CMake is explained here: Triggering C++11 support in NVCC with CMake
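    For reference, with CMake 3.8 or newer CUDA is a first-class language, so the standard can be set without FindCUDA at all; a minimal sketch (the project name is a placeholder):

    ```cmake
    cmake_minimum_required(VERSION 3.8)
    project(opencv_cuda LANGUAGES CXX CUDA)

    # Applies the C++11 standard to nvcc for all CUDA sources in the project
    set(CMAKE_CUDA_STANDARD 11)
    set(CMAKE_CUDA_STANDARD_REQUIRED ON)

    add_executable(opencv opencv.cu)
    ```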

  • The main function lives in main.cpp and is compiled with clang++ [note: g++ (gcc) does not work here; you must use clang++ (clang)]. The CUDA functions live in KernelWrapper.cu and are compiled with nvcc. main.cpp must also include the header file KernelWrapper.h.

    KernelWrapper.h

    #ifndef _KernelWrapper_h
    #define _KernelWrapper_h

    void RunTest();

    #endif

    KernelWrapper.cu

    #include <stdio.h>
    #include <stdlib.h>
    #include "KernelWrapper.h"

    __global__ void TestDevice(int *deviceArray)
    {
        int idx = blockIdx.x*blockDim.x + threadIdx.x;
        deviceArray[idx] = deviceArray[idx]*deviceArray[idx];
    }

    void RunTest()
    {
        int* hostArray;
        int* deviceArray;
        const int arrayLength = 16;
        const unsigned int memSize = sizeof(int) * arrayLength;

        hostArray = (int*)malloc(memSize);
        cudaMalloc((void**) &deviceArray, memSize);

        printf("Init Data\n");
        for(int i = 0; i < arrayLength; i++)
        {
            hostArray[i] = i+1;
            printf("%d\n", hostArray[i]);
        }

        cudaMemcpy(deviceArray, hostArray, memSize, cudaMemcpyHostToDevice);
        TestDevice<<<4, 4>>>(deviceArray);
        cudaMemcpy(hostArray, deviceArray, memSize, cudaMemcpyDeviceToHost);

        printf("After Kernel Function\n");
        for(int i = 0; i < arrayLength; i++)
        {
            printf("%d\n", hostArray[i]);
        }

        cudaFree(deviceArray);
        free(hostArray);
        printf("done");
    }

    main.cpp

    #include "KernelWrapper.h"

    int main(int argc, char** argv)
    {
        RunTest();
        return 0;
    }

    Makefile

    all: program

    program: KernelWrapper.o main.o
        clang++ -o program -L/usr/local/cuda/lib64 -lcuda -lcudart KernelWrapper.o main.o

    KernelWrapper.o: KernelWrapper.cu
        /usr/local/cuda/bin/nvcc -c KernelWrapper.cu

    main.o: main.cpp
        clang++ -c main.cpp

    clean:
        rm -f *.o program
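    With those four files in one directory (and assuming CUDA is installed under /usr/local/cuda and a CUDA-capable GPU is present), the build and a run look like:

    ```shell
    make         # nvcc compiles KernelWrapper.cu; clang++ compiles main.cpp and links
    ./program    # prints the 16 values 1..16, then their squares 1, 4, 9, ..., 256
    ```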

  • If the CUDA program does not use dynamic parallelism, compile it into a shared library with:

    nvcc -arch=sm_60 -std=c++11 -O3 -rdc=true -Xcompiler -fPIC -c algorithm.cu -o algorithm.o

    g++  algorithm.o -fPIC -shared -o libalgorithm.so

    If the CUDA file does use dynamic parallelism, building it into a shared library takes the following three steps:

    nvcc -arch=sm_60 -std=c++11 -O3 -rdc=true -Xcompiler -fPIC -c algorithm.cu -L/usr/local/cuda-10.1/lib64 -lcudart -lcudadevrt

    nvcc -arch=sm_60 -Xcompiler -fPIC -dlink -o algorithm_link.o algorithm.o -L/usr/local/cuda-10.1/lib64 -lcudart -lcudadevrt

    g++ algorithm_link.o algorithm.o -fPIC -shared -o libalgorithm.so -L/usr/local/cuda-10.1/lib64 -lcudart -lcudadevrt
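    For context, a minimal algorithm.cu that actually requires this three-step build (the function names here are illustrative) launches one kernel from another; that device-side launch is what forces -rdc=true plus the separate device-link (-dlink) step against libcudadevrt:

    ```cuda
    // algorithm.cu -- dynamic parallelism: a kernel launched from another kernel
    __global__ void child(int *out)
    {
        out[threadIdx.x] += 1;
    }

    __global__ void parent(int *out)
    {
        // Device-side kernel launch: needs -rdc=true and -lcudadevrt at link time
        child<<<1, 32>>>(out);
    }

    // Host-callable entry point exported by the shared library
    extern "C" void run(int *deviceOut)
    {
        parent<<<1, 1>>>(deviceOut);
        cudaDeviceSynchronize();
    }
    ```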

  • The main function is in main.cpp, compiled with clang++; the CUDA functions are in KernelWrapper.cu, compiled with nvcc. main.cpp must also include the header KernelWrapper.h

    KernelWrapper.h

    #ifndef _KernelWrapper_h
    #define _KernelWrapper_h
    
    void RunTest();
    #endif


    KernelWrapper.cu

    #include <stdio.h>
    #include "KernelWrapper.h"
    
    __global__ void TestDevice(int *deviceArray)
    {
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    deviceArray[idx] = deviceArray[idx]*deviceArray[idx];
    }
    
    void RunTest()
    {
    int* hostArray;
    int* deviceArray;
    const int arrayLength = 16;
    const unsigned int memSize = sizeof(int) * arrayLength;
    
    hostArray = (int*)malloc(memSize);
    cudaMalloc((void**) &deviceArray, memSize);
    printf("Init Data\n");
    
    for(int i=0;i<arrayLength;i++)
    {
    hostArray[i] = i+1;
    printf("%d\n", hostArray[i]);
    
    }
    
    
    cudaMemcpy(deviceArray, hostArray, memSize, cudaMemcpyHostToDevice);
    TestDevice <<< 4, 4 >>> (deviceArray);
    cudaMemcpy(hostArray, deviceArray, memSize, cudaMemcpyDeviceToHost);
    
    
    printf("After Kernel Function\n");
    for(int i=0;i<arrayLength;i++)
    {
    
    printf("%d\n", hostArray[i]);
    }
    
    cudaFree(deviceArray);
    free(hostArray);
    
    printf("done");
    }
    
    
    
    main.cpp
    #include "KernelWrapper.h"
    
    int main( int argc, char** argv)
    {
        RunTest();
        
        return 0;
    }

    makefile

    all: program
    
    program: KernelWrapper.o main.o
        clang++ -o program -L/usr/local/cuda/lib -lcuda -lcudart KernelWrapper.o main.o
    
    KernelWrapper.o:KernelWrapper.cu
        /usr/local/cuda/bin/nvcc -c -arch=sm_20 KernelWrapper.cu
    
    main.o:main.cpp
        clang++ -c main.cpp
    
    clean: 
        rm -f *.o program
    


  • CUDA compilation (1): using nvcc to compile CUDA
    nvcc is the compiler for CUDA programs; CUDA C is an extension of the C language... In practice, compiling a CUDA program with nvcc is much like compiling a C++ program with g++. My other posts also cover g++ compilation...
  • nvcc compilation: file-suffix reference
    .cu CUDA source file, containing host code and device functions; .c C source file; .cc, .cxx, .cpp C++ source file; .ptx PTX intermediate assem...
  • CUDA and the NVCC compilation pipeline
  • CUDA: nvcc compilation-parameter examples
    nvcc has parameters that control program compatibility; this post covers them (for the underlying principles, see "CUDA: the NVCC compilation process and compatibility explained"). Both the CMake spelling and the command-line spelling are given. Several compilation forms: virtual architecture + real architecture, which compiles to a unique result...
  • I wrote a simple test program and compiled it with nvcc cudaPrintDeviceInfo.cu -o cudaPrintDeviceInfo, expecting the executable to build cleanly, but got a warning: nvcc warning : The 'compute_20', 'sm_20', and '
  • A quick walkthrough of the CUDA nvcc compilation steps
    If you want to know what nvcc actually does, how compute_xy differs from sm_xy, and how ptx and cubin end up embedded in the executable and ultimately run by the driver, this is the section for you. It explains each concrete compilation step...
  • CUDA: the NVCC compilation process and compatibility explained
    https://codeyarns.com/2014/03/03/how-to-specify-architecture-to-compile-cuda-code/ https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#supported-phases ...
  • While studying DSSD, building the source downloaded from https://github.com/chengyangfu/caffe/tree/dssd fails with Unsupported gpu architecture 'compute_20': CXX src/caffe/util/db.cpp CXX src/caffe/util/benchmark.cpp ...
  • nvcc --ptxas-options=-v kernel.cu
  • Fix: nvcc -lglut -lGL main.cu
  • Compiling a CUDA program with NVCC, step by step
    Start from the compilation flow chart in the official NVIDIA documentation... 1. Print the steps without running them: nvcc -O2 -c backprop_cuda.cu -keep -arch sm_30 --dryrun 2. The printed steps: 1. read the environment variables #$ _SPACE_= #$ _CUDART_=cudart #$ _HERE_=/home/gpgpu-s
  • The thrust headers are already in /usr/local/cuda/include and nvcc --version works, so why does compilation still report: kernel.cu:7:34: fatal error: thrust\device_vector.h: No such file or directory
  • (mex) Both a path name and a function name; as a function it compiles and links source files into a shared library called a mex-file, which can run inside MATLAB, and it builds standalone executables for MATLAB engine and MAT-file applications...
