C + CUDA.
Tested on 64-bit Windows Server 2008 and 32-bit Ubuntu Linux 9.04, both with CUDA 2.3.
Requires Compute Capability 1.3 or above (double-precision floating point).
Includes the source code, a Makefile for Linux, and a Visual Studio 2008 project for Windows.
Version 2 uses coalesced memory access and showed better performance.
Version 3 combines the best of both worlds.
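
For readers unfamiliar with the term, the sketch below illustrates what coalesced memory access means in general. It is not taken from this project's source: the kernel names (scale_coalesced, scale_chunked), the chunk size, and the scaling operation are all hypothetical and chosen only for illustration. In the first kernel, consecutive threads read consecutive array elements, which the hardware can usually serve with few memory transactions. In the second, each thread walks its own contiguous chunk, so neighbouring threads touch addresses far apart.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Hypothetical kernel: thread i handles element i, so the threads of a
   warp read and write consecutive addresses (coalesced pattern). */
__global__ void scale_coalesced(const double *in, double *out, int n, double k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = k * in[i];
}

/* Hypothetical kernel: each thread handles a contiguous chunk, so
   neighbouring threads access addresses `chunk` elements apart
   (uncoalesced pattern). */
__global__ void scale_chunked(const double *in, double *out, int n, int chunk, double k)
{
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    int start = t * chunk;
    for (int j = 0; j < chunk; ++j) {
        int i = start + j;
        if (i < n)
            out[i] = k * in[i];
    }
}

int main(void)
{
    const int n = 1 << 20;        /* illustrative problem size */
    const int chunk = 16;         /* illustrative chunk size */
    const int threads = 256;
    const size_t bytes = (size_t)n * sizeof(double);

    double *h_in = (double *)malloc(bytes);
    double *h_a  = (double *)malloc(bytes);
    double *h_b  = (double *)malloc(bytes);
    if (!h_in || !h_a || !h_b) return 1;
    for (int i = 0; i < n; ++i)
        h_in[i] = (double)i;

    double *d_in = NULL, *d_out = NULL;
    if (cudaMalloc((void **)&d_in, bytes) != cudaSuccess ||
        cudaMalloc((void **)&d_out, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    /* One thread per element. */
    int blocks = (n + threads - 1) / threads;
    scale_coalesced<<<blocks, threads>>>(d_in, d_out, n, 2.0);
    cudaMemcpy(h_a, d_out, bytes, cudaMemcpyDeviceToHost);

    /* One thread per chunk of elements. */
    cudaMemset(d_out, 0, bytes);
    int work_items = (n + chunk - 1) / chunk;
    int blocks2 = (work_items + threads - 1) / threads;
    scale_chunked<<<blocks2, threads>>>(d_in, d_out, n, chunk, 2.0);
    cudaMemcpy(h_b, d_out, bytes, cudaMemcpyDeviceToHost);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    /* Both kernels compute the same values; only the access pattern differs. */
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (h_a[i] != h_b[i])
            ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_out);
    free(h_in);
    free(h_a);
    free(h_b);
    return mismatches != 0;
}
```

Both kernels produce identical results, so any difference shows up only in run time. Timing each launch (for example with cudaEvent timers) is one way to see the effect on a given card; on hardware of the CUDA 2.3 era, double precision typically also requires building with an architecture flag such as -arch=sm_13.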