cuda-ada.git
Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 21:43:16 +0000 (22:43 +0100)]
Add benchmarks section to README

Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 20:24:44 +0000 (21:24 +0100)]
Doc: Extend argument handling section

Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 20:23:53 +0000 (21:23 +0100)]
Doc: Add Ada Reference Manual bibliography entry

Reto Buerki [Mon, 19 Dec 2011 16:03:28 +0000 (17:03 +0100)]
Pass ARCH variable to cuda_perf project

Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 00:13:06 +0000 (01:13 +0100)]
Doc: Add TODO for NVIDIA-LLVM and Dragonegg

Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 23:54:01 +0000 (00:54 +0100)]
Doc: Minor corrections in argument handling section

Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 23:53:36 +0000 (00:53 +0100)]
Doc: Add initial performance analysis section

Reto Buerki [Sun, 18 Dec 2011 18:02:14 +0000 (19:02 +0100)]
Doc: Improve lstlisting style

Reto Buerki [Sun, 18 Dec 2011 17:08:19 +0000 (18:08 +0100)]
Doc: Add Device enumeration section

Reto Buerki [Sun, 18 Dec 2011 16:44:52 +0000 (17:44 +0100)]
Doc: Extend Ada exceptions section

Reto Buerki [Sun, 18 Dec 2011 16:23:04 +0000 (17:23 +0100)]
Add Error_To_String function

This function returns the error string for a given CUDA error code.

Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 16:19:09 +0000 (17:19 +0100)]
Doc: Corrections in CUDA/Ada section

Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 16:02:57 +0000 (17:02 +0100)]
Doc: Corrections in introduction section

Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 15:31:24 +0000 (16:31 +0100)]
Doc: Extend Ada exceptions section

Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 15:30:57 +0000 (16:30 +0100)]
Doc: Reflect changes wrt. arg creator generic

Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 15:30:19 +0000 (16:30 +0100)]
Doc: Write argument handling section

Reto Buerki [Fri, 16 Dec 2011 23:51:30 +0000 (00:51 +0100)]
Make sure obj dir exists before building perf_c

Reto Buerki [Fri, 16 Dec 2011 23:49:39 +0000 (00:49 +0100)]
Minor: Simplify $(OBJDIR)/perf_c target

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:36:12 +0000 (18:36 +0100)]
Drop unneeded argument creation functions

The concrete functions are replaced by the arg creators generic.

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:34:14 +0000 (18:34 +0100)]
Use arg creators generic in tests

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:06:44 +0000 (18:06 +0100)]
Use arg creators generic in performance measurements

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:06:12 +0000 (18:06 +0100)]
Use arg creators generic in add example

Adrian-Ken Rueegsegger [Sun, 11 Dec 2011 23:05:11 +0000 (00:05 +0100)]
Generic argument type creator functions

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 16:39:58 +0000 (17:39 +0100)]
Makefile: Use CUDASDK variable when calling nvcc

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 12:56:54 +0000 (13:56 +0100)]
Add optional COUNT parameter to perf target

One can now specify the iteration count when calling the perf target:
make perf COUNT=10

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 12:54:45 +0000 (13:54 +0100)]
Add CUDA C reference performance measurement code

Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 12:35:06 +0000 (13:35 +0100)]
Shorten performance binary name

Reto Buerki [Wed, 14 Dec 2011 22:19:47 +0000 (23:19 +0100)]
Doc: Extend thick binding section

Reto Buerki [Wed, 14 Dec 2011 21:33:23 +0000 (22:33 +0100)]
Don't use cudaGetErrorString in Check_Result

cudaGetErrorString only works for errors raised inside the runtime API.

Reto Buerki [Wed, 14 Dec 2011 21:09:25 +0000 (22:09 +0100)]
Display performance measurement parameters

Reto Buerki [Wed, 14 Dec 2011 16:24:39 +0000 (17:24 +0100)]
Doc: Corrections and initial thick binding section

Reto Buerki [Tue, 13 Dec 2011 14:09:44 +0000 (15:09 +0100)]
Doc: Minor corrections

Reto Buerki [Sun, 11 Dec 2011 19:02:56 +0000 (20:02 +0100)]
Doc: Write initial CUDA/Ada thin binding section

Reto Buerki [Sun, 11 Dec 2011 18:04:42 +0000 (19:04 +0100)]
Doc: Write design-goals section

Reto Buerki [Sun, 11 Dec 2011 16:56:57 +0000 (17:56 +0100)]
Doc: Add section about GNAT

Reto Buerki [Sun, 11 Dec 2011 16:03:19 +0000 (17:03 +0100)]
Doc: Write initial section about Ada

Reto Buerki [Sat, 10 Dec 2011 15:46:25 +0000 (16:46 +0100)]
Doc: Corrections in intro section

Adrian-Ken Rueegsegger [Fri, 9 Dec 2011 18:32:31 +0000 (19:32 +0100)]
Add Nvidia copyright notice for use of sample code

This is needed to be compliant with Nvidia's 'GPU COMPUTING SDK END USER
LICENSE AGREEMENT'.

Reto Buerki [Fri, 9 Dec 2011 16:38:09 +0000 (17:38 +0100)]
Doc: Write initial introduction section

Reto Buerki [Fri, 9 Dec 2011 16:03:11 +0000 (17:03 +0100)]
Doc: Write initial abstract

Reto Buerki [Thu, 8 Dec 2011 17:59:06 +0000 (18:59 +0100)]
Doc: Update article structure

Adrian-Ken Rueegsegger [Tue, 6 Dec 2011 20:59:16 +0000 (21:59 +0100)]
Add perf measurement for matrix multiplication

The CUDA source is taken from Nvidia's CUDA SDK Matrix Multiplication
code sample.

Adrian-Ken Rueegsegger [Tue, 29 Nov 2011 20:40:12 +0000 (21:40 +0100)]
Makefile: Add perf target

Adrian-Ken Rueegsegger [Tue, 29 Nov 2011 20:17:14 +0000 (21:17 +0100)]
Add skeleton for performance testing

Reto Buerki [Wed, 7 Dec 2011 17:05:15 +0000 (18:05 +0100)]
Add initial README and web page files

Reto Buerki [Wed, 7 Dec 2011 16:38:34 +0000 (17:38 +0100)]
Doc: Add article target

Reto Buerki [Wed, 30 Nov 2011 21:41:57 +0000 (22:41 +0100)]
Autoinit: Add debug output

Adrian-Ken Rueegsegger [Sat, 26 Nov 2011 10:08:52 +0000 (11:08 +0100)]
Add AUTHORS file

Reto Buerki [Fri, 25 Nov 2011 21:23:07 +0000 (22:23 +0100)]
Add licence headers and COPYING (GPLv3+)

Reto Buerki [Fri, 25 Nov 2011 20:47:36 +0000 (21:47 +0100)]
Remove unneeded conversions to stddef_h.size_t

Reto Buerki [Fri, 25 Nov 2011 20:42:33 +0000 (21:42 +0100)]
Compiler: Use GNAT.OS_Lib instead of System.OS_Lib

This fixes the 'System.OS_Lib is an internal GNAT unit' warning with
GNAT GPL 2011.

Reto Buerki [Thu, 24 Nov 2011 10:29:11 +0000 (11:29 +0100)]
Add TODO item

Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:53:09 +0000 (18:53 +0100)]
Change argument data size type

This makes it possible to allocate a lot more device memory since size_t
is larger than Ada's Integer type.
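
As a rough illustration of the size difference, the following Python sketch
compares the largest value of the C size_t type on the current host with the
largest value of a 32-bit signed integer. The 32-bit bound for Ada's Integer
is an assumption that matches GNAT on common targets; the Ada standard itself
only guarantees a smaller minimum range.

```python
import ctypes

# Assumption: Integer is 32 bits wide, as with GNAT on i686 and x86_64.
ADA_INTEGER_LAST = 2**31 - 1

# Width of size_t on the host running this script.
size_t_bits = ctypes.sizeof(ctypes.c_size_t) * 8
size_t_max = 2**size_t_bits - 1

print(f"Integer'Last (assumed): {ADA_INTEGER_LAST}")
print(f"size_t max ({size_t_bits}-bit): {size_t_max}")
```

Even on a 32-bit host the unsigned size_t covers roughly twice the range of a
32-bit signed Integer; on a 64-bit host the gap is far larger.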

Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:15:26 +0000 (18:15 +0100)]
Remove completed TODO item

Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:14:26 +0000 (18:14 +0100)]
Compiler: Grid and block dimensions cannot be zero

Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:12:57 +0000 (18:12 +0100)]
Add test for real matrix argument passing

Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 16:47:45 +0000 (17:47 +0100)]
Minor: Drop use clause

Reto Buerki [Tue, 22 Nov 2011 09:49:58 +0000 (10:49 +0100)]
Use Grid_Dim_X of 1 in Kernel_Float_Args test

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 21:20:28 +0000 (22:20 +0100)]
Add grid/block parameters to Call procedure.

This allows the caller to specify the grid and block dimensions used
for the kernel execution.

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 21:32:07 +0000 (22:32 +0100)]
Update TODO items

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 21:00:43 +0000 (22:00 +0100)]
Add function to create in-out from real matrix

This function returns an in-out-mode argument object for the given real
matrix. Device memory is allocated and initialized with the specified
data. Results are copied back from the device to the given matrix once
the kernel has finished its execution.

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:58:37 +0000 (21:58 +0100)]
Add function to create out from real matrix

This function returns an out-mode argument object for the given real
matrix. Device memory is allocated. Results are copied back from the
device to the given matrix once the kernel has finished its execution.

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:55:44 +0000 (21:55 +0100)]
Add function to create in arg from real matrix

This function returns an in-mode argument object for the given real
matrix. Device memory is allocated and initialized with the specified
data.

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:34:29 +0000 (21:34 +0100)]
Add test for float argument passing

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:21:04 +0000 (21:21 +0100)]
Add function to create in-out from float

This function returns an in-out-mode argument object for the given float
number. Device memory is allocated and initialized with the specified
data. Results are copied back from the device to the given float once
the kernel has finished its execution.

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:16:13 +0000 (21:16 +0100)]
Add function to create out from float

This function returns an out-mode argument object for the given float
number. Device memory is allocated. Results are copied back from the
device to the given float once the kernel has finished its execution.

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 19:57:13 +0000 (20:57 +0100)]
Add function to create in arg from a float

This function returns an in-mode argument object for the given float
number. Device memory is allocated and initialized with the specified
data.

Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 19:53:11 +0000 (20:53 +0100)]
Refactor argument object creation functions.

Reto Buerki [Mon, 21 Nov 2011 14:38:06 +0000 (15:38 +0100)]
Remove unneeded thin binding files

Reto Buerki [Mon, 21 Nov 2011 14:29:21 +0000 (15:29 +0100)]
Auto-detect host architecture

Build the correct thin binding depending on the host architecture
(i686, x86_64).

Reto Buerki [Mon, 21 Nov 2011 13:11:30 +0000 (14:11 +0100)]
Compiler: Convert Data_Size to stddef_h.size_t

Instead of using IC.unsigned_long directly, convert Data_Size to
stddef_h.size_t in cuMem* functions.

This makes the code compile on 32-bit hosts.

Reto Buerki [Mon, 21 Nov 2011 13:51:03 +0000 (14:51 +0100)]
Use types IC.unsigned and IC.unsigned_long

This is needed to make the compiler happy on i686 and x86_64.

Reto Buerki [Mon, 21 Nov 2011 12:22:51 +0000 (13:22 +0100)]
Compiler: Remove superseded Call procedure

Adrian-Ken Rueegsegger [Sun, 20 Nov 2011 21:04:19 +0000 (22:04 +0100)]
Add (second) kernel Call procedure

This procedure calls the specified CUDA function with the given kernel
arguments.

Adrian-Ken Rueegsegger [Sun, 20 Nov 2011 20:48:30 +0000 (21:48 +0100)]
Add function to create in-out from real vector

This function returns an in-out-mode argument object for the given real
vector. Device memory is allocated and initialized with the specified
data. Results are copied back from the device to the given vector once
the kernel has finished its execution.

Adrian-Ken Rueegsegger [Sun, 20 Nov 2011 20:46:25 +0000 (21:46 +0100)]
Add function to create out from real vector

This function returns an out-mode argument object for the given real
vector. Device memory is allocated. Results are copied back from the
device to the given vector once the kernel has finished its execution.

Adrian-Ken Rueegsegger [Sun, 20 Nov 2011 20:44:38 +0000 (21:44 +0100)]
Add function to create in arg from real vector

This function returns an in-mode argument object for the given real
vector. Device memory is allocated and initialized with the specified
data.

Adrian-Ken Rueegsegger [Sun, 20 Nov 2011 20:39:38 +0000 (21:39 +0100)]
Add argument types

The controlled argument type takes care of copying data from the device
back to the host after kernel execution and device memory deallocation
during finalization.

These types will be used for CUDA kernel calls to pass in/out/in-out
parameters.
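
The lifecycle described above can be sketched in Python. This is only an
illustration of the in/out/in-out idea, not the Ada implementation: FakeDevice,
Arg and Mode are hypothetical names, a dictionary stands in for device memory,
and a context manager stands in for Ada's controlled-type finalization. For
brevity the sketch performs the copy-back and the deallocation in the same
step, whereas the commit describes them as separate events.

```python
from enum import Enum


class Mode(Enum):
    IN = "in"
    OUT = "out"
    IN_OUT = "in-out"


class FakeDevice:
    """Stand-in for GPU memory: maps integer handles to lists of floats."""

    def __init__(self):
        self.memory = {}
        self._next_handle = 0

    def alloc(self, length):
        handle = self._next_handle
        self._next_handle += 1
        self.memory[handle] = [0.0] * length
        return handle

    def free(self, handle):
        del self.memory[handle]


class Arg:
    """Hypothetical kernel argument that manages its own device buffer."""

    def __init__(self, device, host, mode):
        self.device, self.host, self.mode = device, host, mode
        self.handle = device.alloc(len(host))
        if mode in (Mode.IN, Mode.IN_OUT):
            device.memory[self.handle][:] = host  # host -> device

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        if self.mode in (Mode.OUT, Mode.IN_OUT):
            self.host[:] = self.device.memory[self.handle]  # device -> host
        self.device.free(self.handle)
        return False


device = FakeDevice()
a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
result = [0.0, 0.0, 0.0]

with Arg(device, a, Mode.IN) as x, \
        Arg(device, b, Mode.IN) as y, \
        Arg(device, result, Mode.OUT) as r:
    # Stand-in for a kernel launch: element-wise addition on "device" buffers.
    mem = device.memory
    mem[r.handle] = [p + q for p, q in zip(mem[x.handle], mem[y.handle])]

print(result)
print(device.memory)
```

After the with block, result holds the sums and the fake device memory is
empty again, which mirrors the copy-back and cleanup behaviour the commit
message describes.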

Reto Buerki [Sun, 20 Nov 2011 18:35:36 +0000 (19:35 +0100)]
Compiler: Provide Source_Module_Type

This type stores CUDA source code. Use the Create function to
instantiate a new source module from a given operation (the actual
kernel) and an optional preamble (e.g. some defines).

The source module can be compiled to cubin using the existing Compile
function.
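
A minimal Python sketch of how such a source module might be assembled from an
operation and an optional preamble. The function name create_source and the
exact layout are assumptions made for illustration; in particular, where the
preamble sits relative to the extern "C" block (which another commit in this
log adds to the source module) is a guess.

```python
def create_source(operation: str, preamble: str = "") -> str:
    """Assemble CUDA source text from a kernel and an optional preamble.

    Hypothetical helper: the real Create function may lay out the
    source differently.
    """
    parts = []
    if preamble:
        parts.append(preamble)
    # extern "C" keeps the kernel name unmangled so it can be looked up
    # by its plain name after compilation.
    parts.append('extern "C" {')
    parts.append(operation)
    parts.append("}")
    return "\n".join(parts) + "\n"


kernel = "__global__ void add(float *a, float *b, float *c) { }"
source = create_source(kernel, preamble="#define N 4")
print(source)
```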

Reto Buerki [Sun, 20 Nov 2011 17:52:30 +0000 (18:52 +0100)]
Makefile: Remove cache dir in clean target

Reto Buerki [Sun, 20 Nov 2011 17:45:03 +0000 (18:45 +0100)]
Compiler: Only compile source if not in cache

The module cache stores CUDA and cubin files using the (SHA1) digest of
the source module. It only compiles code if the resulting cubin file is
not found in the cache.
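
The caching scheme can be sketched in Python as follows. This is an
illustration under stated assumptions rather than the project's code: the
function cached_compile, the file naming scheme and the fake_compile stand-in
are all hypothetical, and no real compiler is invoked.

```python
import hashlib
import tempfile
from pathlib import Path


def cached_compile(source: str, cache_dir: Path, compile_fn) -> Path:
    """Return the cubin path for `source`, compiling only on a cache miss."""
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
    cu_path = cache_dir / f"{digest}.cu"        # hypothetical naming scheme
    cubin_path = cache_dir / f"{digest}.cubin"  # hypothetical naming scheme

    if not cubin_path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        cu_path.write_text(source)
        compile_fn(cu_path, cubin_path)  # stand-in for the compiler call
    return cubin_path


# Stand-in compiler: records each call and writes a placeholder file.
calls = []


def fake_compile(src: Path, dst: Path) -> None:
    calls.append(src.name)
    dst.write_bytes(b"placeholder cubin")


kernel = "__global__ void add(float *a, float *b) { }"
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    first = cached_compile(kernel, cache, fake_compile)
    second = cached_compile(kernel, cache, fake_compile)        # cache hit
    third = cached_compile(kernel + " ", cache, fake_compile)   # new digest
    compile_count = len(calls)
    same_artifact = first == second
    different_artifact = first != third

print(compile_count, same_artifact, different_artifact)
```

Keying the cache on a digest of the source text means any change to the
kernel or its preamble yields a new entry, while unchanged source is never
recompiled.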

Reto Buerki [Sun, 20 Nov 2011 17:02:05 +0000 (18:02 +0100)]
Compiler: Add 'extern C' to source module

Reto Buerki [Sun, 20 Nov 2011 16:50:01 +0000 (17:50 +0100)]
Add CUDA.Logger package

This package provides a simple Log procedure which can be used together
with pragma Debug to display messages on the console.

Reto Buerki [Sun, 20 Nov 2011 16:21:06 +0000 (17:21 +0100)]
Driver: Rename Device_Copy_Overlap to Copy_Overlap

Reto Buerki [Sun, 20 Nov 2011 16:18:13 +0000 (17:18 +0100)]
Make Check_Result procedure private

Reto Buerki [Sun, 20 Nov 2011 16:17:06 +0000 (17:17 +0100)]
Rework Enum_Devices example using the Driver package

Reto Buerki [Sun, 20 Nov 2011 16:01:08 +0000 (17:01 +0100)]
Unify package rename for Interfaces.C

While at it, implement the Device_Count function in the Driver package.

Reto Buerki [Sun, 20 Nov 2011 15:50:17 +0000 (16:50 +0100)]
Makefile: Add target to perform coverage analysis

Reto Buerki [Sun, 20 Nov 2011 15:10:21 +0000 (16:10 +0100)]
Add CUDA.Driver package

This package provides a Device_Type which represents a CUDA device.
The Iterate procedure can be used to iterate over all CUDA devices of a
host.

Reto Buerki [Sun, 20 Nov 2011 15:08:56 +0000 (16:08 +0100)]
Add test project

Implement the 'Add' example as the first Ahven test case to verify the
kernel compilation and function call mechanism.

Reto Buerki [Sun, 20 Nov 2011 09:07:54 +0000 (10:07 +0100)]
Add Cuda_Common project file

Contains common switches for different targets.

Reto Buerki [Tue, 15 Nov 2011 12:02:43 +0000 (13:02 +0100)]
Add TODO items

Reto Buerki [Thu, 10 Nov 2011 22:31:11 +0000 (23:31 +0100)]
Compiler: Fix output variable allocation

Reto Buerki [Thu, 10 Nov 2011 19:42:29 +0000 (20:42 +0100)]
Free allocated device memory after function call

Adrian-Ken Rueegsegger [Wed, 9 Nov 2011 18:13:29 +0000 (19:13 +0100)]
Add TODO file

Reto Buerki [Wed, 9 Nov 2011 18:04:21 +0000 (19:04 +0100)]
Compiler: Handle memory allocation and var copying

Input parameters are now automatically allocated and copied to the
device before the actual kernel launch. Output parameters are copied
back to the host after the call.

Currently, the allocated memory size per variable is hardcoded.

Reto Buerki [Wed, 9 Nov 2011 16:50:02 +0000 (17:50 +0100)]
Introduce Function_Type

This type is used to call a CUDA kernel on the GPU.

Reto Buerki [Wed, 9 Nov 2011 16:21:39 +0000 (17:21 +0100)]
Add Module_Type

Represents a CUDA module. The Compile function can be used to create a
new module from CUDA source code.

Reto Buerki [Wed, 9 Nov 2011 15:38:22 +0000 (16:38 +0100)]
Minor: Fix indentation

Reto Buerki [Thu, 3 Nov 2011 10:02:21 +0000 (11:02 +0100)]
Remove unneeded -lcudart linker option pragma.