Multicore, Multithread Programing

Image Processing Applications with NVIDIA Carma Dev Kit

I got my Carma board last year but was too busy to touch it. Recently I have been playing with it from time to time, and I have a lot of fun to share with you guys.

NVIDIA released this development kit (a mid-range tablet CPU paired with a quite powerful GPU) targeting HPC applications for the server stack. However, I want to use it to accelerate my image processing applications, and it has done pretty well. I will post a few topics covering development environment setup, OpenCV installation, and some simple image processing applications with performance benchmarks.

Here are the specifications:

Tegra 3 ARM Cortex-A9 CPU (quad core)
Quadro 1000M GPU (96 CUDA Cores)
2GB system RAM, 2GB GPU RAM
4x PCIe Gen1 CPU to GPU link
1000Base-T networking support
HDMI and Display ports
2 USB 2.0 ports
1 SATA port. It's highly recommended that you prepare an SSD, since the default storage may not be enough for your image processing applications.

On the software side, Carma supports:

Ubuntu version 11.04 is pre-installed.
CUDA version 4.2 (so sad that it doesn't support OpenCL)
gcc-4.5-arm-linux-gnueabi

For details, see the SECO website.

Setting up the cross-compile environment

Start with a clean installation of Ubuntu Desktop

An x86_64 Linux development machine running Ubuntu 11.04 with GCC 4.5.x is recommended. This matches the Ubuntu installation on the Carma board, so you won't face much trouble with library version conflicts later on. That said, everything also works well on my Ubuntu 12.04 laptop.
Since Ubuntu 11.04 is quite outdated and no longer supported by Ubuntu, you may need to download it from the old-releases site: http://old-releases.ubuntu.com/releases/11.04/

To install packages on your board, you also need to update the repositories on the Carma board. In a terminal on the Carma board, do the following:

Back up your sources.list file:
sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup
Add the following lines to the sources.list file:
deb http://old-releases.ubuntu.com/ubuntu/ natty main restricted universe multiverse
deb http://old-releases.ubuntu.com/ubuntu/ natty-updates main restricted universe multiverse
deb http://old-releases.ubuntu.com/ubuntu/ natty-security main restricted universe multiverse
Then refresh the package index:
sudo apt-get update
From now on, you can use "sudo apt-get" on the Carma board. (This will be very useful when you want to install OpenCV on the board.)
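The sources.list steps above can be sketched as a small shell script. This is only a minimal sketch: the variable name SOURCES_LIST is a hypothetical one introduced here, and it defaults to a scratch file so the script can be tried without touching the real configuration. The three repository lines are the ones listed above.

```shell
#!/bin/sh
# Sketch: back up a sources.list file and append the old-releases entries.
# SOURCES_LIST is a hypothetical variable; it defaults to a scratch file so
# the script is harmless to try. On the board, set it to
# /etc/apt/sources.list and run the script with sudo.
SOURCES_LIST="${SOURCES_LIST:-$(mktemp)}"

# Make sure the file exists, then keep a backup copy next to it.
touch "$SOURCES_LIST"
cp "$SOURCES_LIST" "$SOURCES_LIST.backup"

# Append one entry per pocket (release, updates, security), skipping
# entries that are already present so the script can be re-run safely.
for pocket in natty natty-updates natty-security; do
    line="deb http://old-releases.ubuntu.com/ubuntu/ $pocket main restricted universe multiverse"
    grep -qxF "$line" "$SOURCES_LIST" || echo "$line" >> "$SOURCES_LIST"
done

# Show the resulting file.
cat "$SOURCES_LIST"
```

Running it a second time against the same file should leave the entries unchanged, since each line is only appended when it is missing.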

Installing ARM compiler

A GCC ARM cross compiler needs to be installed; nvcc invokes it as part of the cross compilation. Currently, only GCC 4.5.x is supported.
Run the following command on the development machine to obtain a GCC 4.5.x cross compiler for ARM:

sudo apt-get install gcc-4.5-arm-linux-gnueabi g++-4.5-arm-linux-gnueabi
By default, the cross compilers are installed as /usr/bin/arm-linux-gnueabi-gcc and /usr/bin/arm-linux-gnueabi-g++.
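Before moving on, it may help to confirm that the cross compilers are actually reachable. The following is a minimal sketch, assuming the command names given above are on your PATH; check_tool is a hypothetical helper introduced only for this example.

```shell
#!/bin/sh
# Sketch: report whether each cross-compiler command can be found on PATH.
# check_tool is a hypothetical helper, not part of any toolchain.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Command names as given in the text above.
for tool in arm-linux-gnueabi-gcc arm-linux-gnueabi-g++; do
    check_tool "$tool"
done
```

If a tool is reported as missing, re-check the apt-get step before installing the CUDA toolkit, since nvcc relies on this compiler.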

Installing the CUDA toolkit

Download the CUDA Toolkit from the SECO website. Once you have downloaded it, simply execute the .run file on your Ubuntu machine and follow the instructions given by the installer.

Adding Libraries on the CARMA machine

If you need to use other libraries for ARM, you will also need to copy the libraries and the corresponding header files from the CARMA board to your development PC. Copy all libraries from /usr/lib/arm-linux-gnueabi/ and /usrarm-linux-gnueabi/ on your Carma board to the corresponding folders on your Ubuntu PC.

That's it. The development environment is ready, and now it's time to try a simple application and have some fun.

Your very first Cuda sample on Carma board.

Let's start with a very simple, silly application: multiply a number on the GPU, copy the result back to the CPU, and print it out.
Create a file called "test_cuda.cu" and put in the following source code:

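#include <stdio.h>

// Multiply the single float stored in device memory by 1.02.
// The first parameter (the iteration index) is unused.
__global__ void kernel(int w, float *d_n)
{
    *d_n *= 1.02f;
}

int main()
{
    float n = 1.0f, *d_n;   // host value and pointer to its device copy
    float n_ref = 1.0f;     // same computation on the CPU, for comparison
    int i;

    if (cudaMalloc((void **)&d_n, sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    for (i = 1; i <= 10; i++)
    {
        cudaMemcpy(d_n, &n, sizeof(float), cudaMemcpyHostToDevice);
        kernel<<<1, 1>>>(i, d_n);   // one block, one thread
        cudaMemcpy(&n, d_n, sizeof(float), cudaMemcpyDeviceToHost);
        n_ref *= 1.02f;
        printf("%d\t\t%42.41f\t%42.41f\n", i, n, n_ref);
    }
    cudaFree(d_n);
    return 0;
}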

Then create a Makefile with the following content:

####################################
# Makefile for Carma cross-compile #
####################################
all : test_cuda

CUDA_HOME=/usr/local/cuda
CC=arm-linux-gnueabi-gcc

NVCC=$(CUDA_HOME)/bin/nvcc -target-cpu-arch ARM --compiler-bindir /usr/bin/arm-linux-gnueabi-gcc-4.5 -m32

# Recipe lines must start with a tab character.
test_cuda : test_cuda.cu
	$(NVCC) -o test_cuda test_cuda.cu

clean:
	rm -f test_cuda

.PHONY: all clean

Well done. Copy your binary file to the Carma board, run it, and enjoy. From now on, you can play with your new toy.
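One possible way to do the copy-and-run step is over SSH. The sketch below assumes an SSH server is running on the board, which the steps above do not set up; BOARD, REMOTE_DIR, DRY_RUN and the run helper are all hypothetical names introduced here, and the login shown is a placeholder. By default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: copy the cross-compiled binary to the board and run it there.
# BOARD and REMOTE_DIR are hypothetical placeholders; replace them with
# the login and directory of your own Carma board.
BOARD="${BOARD:-ubuntu@carma.local}"
REMOTE_DIR="${REMOTE_DIR:-/tmp}"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run scp test_cuda "$BOARD:$REMOTE_DIR/"
run ssh "$BOARD" "$REMOTE_DIR/test_cuda"
```

Setting DRY_RUN=0 performs the actual copy and remote execution; a USB stick or a shared folder would work just as well.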


Wednesday, July 9, 2014
