Is the array size always 495x652? If so, you would see a large speed increase by precomputing an array that records whether or not each pixel is within the radius. Instead of repeating those expensive computations for every pixel on every pass, the check becomes a table lookup.
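A minimal sketch of the lookup-table idea, in plain Python since I don't know which language you are using. The centre `(cy, cx)`, the radius of 200, the row/column order of 495x652, and the names `build_circle_mask` and `in_radius` are all assumptions made up for illustration:

```python
def build_circle_mask(height, width, cy, cx, radius):
    """Return a height x width table of booleans:
    True where the pixel lies within `radius` of (cy, cx)."""
    r2 = radius * radius  # compare squared distances, so no sqrt is needed
    return [
        [(y - cy) ** 2 + (x - cx) ** 2 <= r2 for x in range(width)]
        for y in range(height)
    ]


# Built once, up front. Assumes 495 rows by 652 columns and an
# arbitrary centre and radius; substitute your real values.
HEIGHT, WIDTH = 495, 652
MASK = build_circle_mask(HEIGHT, WIDTH, cy=247, cx=326, radius=200)


def in_radius(y, x):
    """The per-pixel test is now just a table lookup."""
    return MASK[y][x]
```

The table costs one pass over the image to build, so it only pays off if the same size, centre and radius are reused across many images or many passes.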
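Another, possibly better alternative, depending on your use, would be to precompute an array of zeros and ones (just like the above solution) and combine it with the image element by element to zero out all values outside the radius. Multiplying by the mask works directly; if you use a bitwise AND instead, the "inside" entries of the mask need all bits set (e.g. 255 for 8-bit pixels), since ANDing with 1 would keep only the lowest bit.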
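Roughly, the masking variant could look like the sketch below (plain Python again; the tiny 5x5 image and the names `build_binary_mask` and `apply_mask` are hypothetical). It multiplies by a 0/1 mask rather than using a bitwise AND, because AND with 1 would only keep the lowest bit of each pixel:

```python
def build_binary_mask(height, width, cy, cx, radius):
    """Return a height x width table of 1s (inside the radius) and 0s (outside)."""
    r2 = radius * radius
    return [
        [1 if (y - cy) ** 2 + (x - cx) ** 2 <= r2 else 0 for x in range(width)]
        for y in range(height)
    ]


def apply_mask(image, mask):
    """Zero out every pixel whose mask entry is 0; keep the rest unchanged."""
    return [
        [pixel * keep for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Small made-up example so the effect is easy to see.
image = [[9] * 5 for _ in range(5)]
mask = build_binary_mask(5, 5, cy=2, cx=2, radius=1)
masked = apply_mask(image, mask)
```

With an array library the same thing is a single element-wise multiply, which is where the real speed-up would come from.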
Just some ideas off the top of my head...