Reviewed-by: kvn, twisti
Add scalar reduction optimization to C2 to take advantage of vector instructions in modern x86 CPUs.
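
For illustration only, here is a minimal sketch of the kind of loop a scalar reduction optimization is usually concerned with: every iteration folds a value into a single accumulator. The class name ReductionSketch and the method sumOfProducts are hypothetical and do not come from this change. Whether C2 actually emits vector instructions for this loop depends on the JVM build, the CPU, and the flags in use.

```java
public class ReductionSketch {
    // Hypothetical example of a scalar reduction: each iteration adds one
    // product into the single accumulator `sum`. Assumes a and b have the
    // same length.
    public static int sumOfProducts(int[] a, int[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = new int[1024];
        int[] b = new int[1024];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
            b[i] = 2;
        }
        int result = 0;
        // Call the method many times so the JIT compiler has a chance to
        // compile it; this does not guarantee that it is vectorized.
        for (int iter = 0; iter < 20_000; iter++) {
            result = sumOfProducts(a, b);
        }
        System.out.println(result);
    }
}
```

The result is the same whether or not the loop is vectorized, because int addition wraps on overflow and can be reordered freely. The same is not true in general of float or double reductions, where reordering the additions can change rounding.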