Reto Buerki [Mon, 9 Jan 2012 21:49:06 +0000 (22:49 +0100)]
Update download section in README
Adrian-Ken Rueegsegger [Mon, 9 Jan 2012 21:46:40 +0000 (22:46 +0100)]
Doc: Delete html and css on clean
Adrian-Ken Rueegsegger [Mon, 9 Jan 2012 20:24:29 +0000 (21:24 +0100)]
Doc: Corrections
Reto Buerki [Mon, 9 Jan 2012 18:17:48 +0000 (19:17 +0100)]
Doc: Corrections
Adrian-Ken Rueegsegger [Mon, 9 Jan 2012 17:43:45 +0000 (18:43 +0100)]
Doc: Extend conclusion section
Adrian-Ken Rueegsegger [Mon, 9 Jan 2012 16:42:14 +0000 (17:42 +0100)]
Doc: Extend performance measurement section
Reto Buerki [Mon, 9 Jan 2012 16:58:48 +0000 (17:58 +0100)]
Doc: Enable pdf hyperrefs
Reto Buerki [Mon, 9 Jan 2012 16:53:30 +0000 (17:53 +0100)]
Add article.lyx dependency to article.pdf target
Reto Buerki [Mon, 9 Jan 2012 16:41:35 +0000 (17:41 +0100)]
Doc: Use COUNT=20 for performance measurements
Reto Buerki [Mon, 9 Jan 2012 16:10:54 +0000 (17:10 +0100)]
Remove unneeded Ada.Real_Time package prefix
Reto Buerki [Mon, 9 Jan 2012 15:58:07 +0000 (16:58 +0100)]
perf_c_drv: Remove unneeded _v2 suffix
Adrian-Ken Rueegsegger [Mon, 9 Jan 2012 15:46:06 +0000 (16:46 +0100)]
Doc: Add bib entries for Ada 83 reqs and standard
Adrian-Ken Rueegsegger [Mon, 9 Jan 2012 15:25:20 +0000 (16:25 +0100)]
Doc: Extend conclusion section
Adrian-Ken Rueegsegger [Mon, 9 Jan 2012 14:02:00 +0000 (15:02 +0100)]
Perf: Reorder benchmarking output
This makes it easier to grab the multiplication-related results for further
processing (e.g. documentation).
Reto Buerki [Mon, 9 Jan 2012 15:43:27 +0000 (16:43 +0100)]
Avoid forward declarations in C perf code
Reto Buerki [Mon, 9 Jan 2012 15:05:12 +0000 (16:05 +0100)]
Doc: Fix spelling mistakes
Reto Buerki [Mon, 9 Jan 2012 14:51:59 +0000 (15:51 +0100)]
Doc: Remove two TODOs
Reto Buerki [Mon, 9 Jan 2012 14:24:21 +0000 (15:24 +0100)]
Doc: Update performance measurements table
Reto Buerki [Mon, 9 Jan 2012 14:19:26 +0000 (15:19 +0100)]
Doc: Update system information table
Reto Buerki [Mon, 9 Jan 2012 14:09:31 +0000 (15:09 +0100)]
Doc: Update GPU properties
We will use the GeForce GTX 560 Ti for performance analysis.
Adrian-Ken Rueegsegger [Sun, 8 Jan 2012 22:48:43 +0000 (23:48 +0100)]
Doc: Extend performance analysis section
Adrian-Ken Rueegsegger [Sun, 8 Jan 2012 22:41:36 +0000 (23:41 +0100)]
Use Device Driver API call to get device
Using the cuDeviceGet function obsoletes the usage of the CUDA runtime API
package.
Adrian-Ken Rueegsegger [Sun, 8 Jan 2012 22:32:13 +0000 (23:32 +0100)]
Build perf_c_drv with standard C compiler
It is ordinary C code linked against the CUDA library; there is no need to use
NVIDIA's fancy nvcc.
Adrian-Ken Rueegsegger [Sun, 8 Jan 2012 22:28:22 +0000 (23:28 +0100)]
Cleanup CUDA Driver API benchmarking code
Use CUDA Driver API calls consistently.
Adrian-Ken Rueegsegger [Sun, 8 Jan 2012 18:10:39 +0000 (19:10 +0100)]
Doc: Extend performance analysis section
Adrian-Ken Rueegsegger [Tue, 3 Jan 2012 17:42:42 +0000 (18:42 +0100)]
Add CUDA Driver API performance measurement code
Adrian-Ken Rueegsegger [Tue, 3 Jan 2012 17:37:49 +0000 (18:37 +0100)]
Minor cleanup of CUDA Runtime API performance code
Reto Buerki [Wed, 28 Dec 2011 09:50:21 +0000 (10:50 +0100)]
Simplify Cuda_Tests project file
Adrian-Ken Rueegsegger [Wed, 21 Dec 2011 21:27:36 +0000 (22:27 +0100)]
Doc: Some corrections
Reto Buerki [Wed, 21 Dec 2011 18:31:31 +0000 (19:31 +0100)]
Doc: Add initial section about Autoinit package
Reto Buerki [Tue, 20 Dec 2011 22:37:10 +0000 (23:37 +0100)]
Doc: Add initial 'example' section
Reto Buerki [Tue, 20 Dec 2011 21:42:16 +0000 (22:42 +0100)]
Doc: Add src/add.adb to index.html dependency
Reto Buerki [Tue, 20 Dec 2011 21:50:13 +0000 (22:50 +0100)]
Cleanup Add example
Reto Buerki [Tue, 20 Dec 2011 21:36:43 +0000 (22:36 +0100)]
Add: Shorten A, B initialization
Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 21:43:16 +0000 (22:43 +0100)]
Add benchmarks section to README
Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 20:24:44 +0000 (21:24 +0100)]
Doc: Extend argument handling section
Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 20:23:53 +0000 (21:23 +0100)]
Doc: Add Ada Reference Manual bibliography entry
Reto Buerki [Mon, 19 Dec 2011 16:03:28 +0000 (17:03 +0100)]
Pass ARCH variable to cuda_perf project
Adrian-Ken Rueegsegger [Mon, 19 Dec 2011 00:13:06 +0000 (01:13 +0100)]
Doc: Add TODO for NVIDIA-LLVM and Dragonegg
Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 23:54:01 +0000 (00:54 +0100)]
Doc: Minor corrections in argument handling section
Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 23:53:36 +0000 (00:53 +0100)]
Doc: Add initial performance analysis section
Reto Buerki [Sun, 18 Dec 2011 18:02:14 +0000 (19:02 +0100)]
Doc: Improve lstlisting style
Reto Buerki [Sun, 18 Dec 2011 17:08:19 +0000 (18:08 +0100)]
Doc: Add Device enumeration section
Reto Buerki [Sun, 18 Dec 2011 16:44:52 +0000 (17:44 +0100)]
Doc: Extend Ada exceptions section
Reto Buerki [Sun, 18 Dec 2011 16:23:04 +0000 (17:23 +0100)]
Add Error_To_String function
This function returns the error string for a given CUDA error code.
Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 16:19:09 +0000 (17:19 +0100)]
Doc: Corrections in CUDA/Ada section
Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 16:02:57 +0000 (17:02 +0100)]
Doc: Corrections in introduction section
Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 15:31:24 +0000 (16:31 +0100)]
Doc: Extend Ada exceptions section
Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 15:30:57 +0000 (16:30 +0100)]
Doc: Reflect changes wrt. arg creator generic
Adrian-Ken Rueegsegger [Sun, 18 Dec 2011 15:30:19 +0000 (16:30 +0100)]
Doc: Write argument handling section
Reto Buerki [Fri, 16 Dec 2011 23:51:30 +0000 (00:51 +0100)]
Make sure obj dir exists before building perf_c
Reto Buerki [Fri, 16 Dec 2011 23:49:39 +0000 (00:49 +0100)]
Minor: Simplify $(OBJDIR)/perf_c target
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:36:12 +0000 (18:36 +0100)]
Drop unneeded argument creation functions
The concrete functions are replaced by the arg creators generic.
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:34:14 +0000 (18:34 +0100)]
Use arg creators generic in tests
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:06:44 +0000 (18:06 +0100)]
Use arg creators generic in performance measurements
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 17:06:12 +0000 (18:06 +0100)]
Use arg creators generic in add example
Adrian-Ken Rueegsegger [Sun, 11 Dec 2011 23:05:11 +0000 (00:05 +0100)]
Generic argument type creator functions
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 16:39:58 +0000 (17:39 +0100)]
Makefile: Use CUDASDK variable when calling nvcc
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 12:56:54 +0000 (13:56 +0100)]
Add optional COUNT parameter to perf target
One can now specify the iterations count when calling the perf target:
make perf COUNT=10
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 12:54:45 +0000 (13:54 +0100)]
Add CUDA C reference performance measurement code
Adrian-Ken Rueegsegger [Fri, 16 Dec 2011 12:35:06 +0000 (13:35 +0100)]
Shorten performance binary name
Reto Buerki [Wed, 14 Dec 2011 22:19:47 +0000 (23:19 +0100)]
Doc: Extend thick binding section
Reto Buerki [Wed, 14 Dec 2011 21:33:23 +0000 (22:33 +0100)]
Don't use cudaGetErrorString in Check_Result
cudaGetErrorString only works for errors inside the runtime API.
Reto Buerki [Wed, 14 Dec 2011 21:09:25 +0000 (22:09 +0100)]
Display performance measurement parameters
Reto Buerki [Wed, 14 Dec 2011 16:24:39 +0000 (17:24 +0100)]
Doc: Corrections and initial thick binding section
Reto Buerki [Tue, 13 Dec 2011 14:09:44 +0000 (15:09 +0100)]
Doc: Minor corrections
Reto Buerki [Sun, 11 Dec 2011 19:02:56 +0000 (20:02 +0100)]
Doc: Write initial CUDA/Ada thin binding section
Reto Buerki [Sun, 11 Dec 2011 18:04:42 +0000 (19:04 +0100)]
Doc: Write design-goals section
Reto Buerki [Sun, 11 Dec 2011 16:56:57 +0000 (17:56 +0100)]
Doc: Add section about GNAT
Reto Buerki [Sun, 11 Dec 2011 16:03:19 +0000 (17:03 +0100)]
Doc: Write initial section about Ada
Reto Buerki [Sat, 10 Dec 2011 15:46:25 +0000 (16:46 +0100)]
Doc: Corrections in intro section
Adrian-Ken Rueegsegger [Fri, 9 Dec 2011 18:32:31 +0000 (19:32 +0100)]
Add Nvidia copyright notice for use of sample code
This is needed to be compliant with Nvidia's 'GPU COMPUTING SDK END USER
LICENSE AGREEMENT'.
Reto Buerki [Fri, 9 Dec 2011 16:38:09 +0000 (17:38 +0100)]
Doc: Write initial introduction section
Reto Buerki [Fri, 9 Dec 2011 16:03:11 +0000 (17:03 +0100)]
Doc: Write initial abstract
Reto Buerki [Thu, 8 Dec 2011 17:59:06 +0000 (18:59 +0100)]
Doc: Update article structure
Adrian-Ken Rueegsegger [Tue, 6 Dec 2011 20:59:16 +0000 (21:59 +0100)]
Add perf measurement for matrix multiplication
The CUDA source is taken from Nvidia's CUDA SDK Matrix Multiplication
code sample.
Adrian-Ken Rueegsegger [Tue, 29 Nov 2011 20:40:12 +0000 (21:40 +0100)]
Makefile: Add perf target
Adrian-Ken Rueegsegger [Tue, 29 Nov 2011 20:17:14 +0000 (21:17 +0100)]
Add skeleton for performance testing
Reto Buerki [Wed, 7 Dec 2011 17:05:15 +0000 (18:05 +0100)]
Add initial README and web page files
Reto Buerki [Wed, 7 Dec 2011 16:38:34 +0000 (17:38 +0100)]
Doc: Add article target
Reto Buerki [Wed, 30 Nov 2011 21:41:57 +0000 (22:41 +0100)]
Autoinit: Add debug output
Adrian-Ken Rueegsegger [Sat, 26 Nov 2011 10:08:52 +0000 (11:08 +0100)]
Add AUTHORS file
Reto Buerki [Fri, 25 Nov 2011 21:23:07 +0000 (22:23 +0100)]
Add licence headers and COPYING (GPLv3+)
Reto Buerki [Fri, 25 Nov 2011 20:47:36 +0000 (21:47 +0100)]
Remove unneeded conversions to stddef_h.size_t
Reto Buerki [Fri, 25 Nov 2011 20:42:33 +0000 (21:42 +0100)]
Compiler: Use GNAT.OS_Lib instead of System.OS_Lib
This fixes the 'System.OS_Lib is an internal GNAT unit' warning with
GNAT GPL 2011.
Reto Buerki [Thu, 24 Nov 2011 10:29:11 +0000 (11:29 +0100)]
Add TODO item
Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:53:09 +0000 (18:53 +0100)]
Change argument data size type
This makes it possible to allocate a lot more device memory since size_t
is larger than Ada's Integer type.
Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:15:26 +0000 (18:15 +0100)]
Remove completed TODO item
Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:14:26 +0000 (18:14 +0100)]
Compiler: Grid and block dimensions cannot be zero
Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 17:12:57 +0000 (18:12 +0100)]
Add test for real matrix argument passing
Adrian-Ken Rueegsegger [Wed, 23 Nov 2011 16:47:45 +0000 (17:47 +0100)]
Minor: Drop use clause
Reto Buerki [Tue, 22 Nov 2011 09:49:58 +0000 (10:49 +0100)]
Use Grid_Dim_X of 1 in Kernel_Float_Args test
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 21:20:28 +0000 (22:20 +0100)]
Add grid/block parameters to Call procedure.
This allows the caller to specify the grid and block dimensions used
for the kernel execution.
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 21:32:07 +0000 (22:32 +0100)]
Update TODO items
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 21:00:43 +0000 (22:00 +0100)]
Add function to create in-out from real matrix
This function returns an in-out-mode argument object for the given real
matrix. Device memory is allocated and initialized with the specified
data. Results are copied back from the device to the given matrix once
the kernel has finished its execution.
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:58:37 +0000 (21:58 +0100)]
Add function to create out from real matrix
This function returns an out-mode argument object for the given real
matrix. Device memory is allocated. Results are copied back from the
device to the given matrix once the kernel has finished its execution.
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:55:44 +0000 (21:55 +0100)]
Add function to create in arg from real matrix
This function returns an in-mode argument object for the given real
matrix. Device memory is allocated and initialized with the specified
data.
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:34:29 +0000 (21:34 +0100)]
Add test for float argument passing
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:21:04 +0000 (21:21 +0100)]
Add function to create in-out from float
This function returns an in-out-mode argument object for the given float
number. Device memory is allocated and initialized with the specified
data. Results are copied back from the device to the given float once
the kernel has finished its execution.
Adrian-Ken Rueegsegger [Mon, 21 Nov 2011 20:16:13 +0000 (21:16 +0100)]
Add function to create out from float
This function returns an out-mode argument object for the given float
number. Device memory is allocated. Results are copied back from the
device to the given float once the kernel has finished its execution.