Do you suggest that I connect the hardware AO to an AI channel?
Yes, wire directly to an AI channel, and read it with the rest of the channels.
How can I minimize the time difference between the different AI channels?
I am not sure that what you are seeing is due to the card or the software. It could be a characteristic of your test system: the rise/drop may be so fast that a sampling frequency of 1 kHz is not adequate to capture the sudden changes.
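To put a number on that, here is a rough sketch in plain Python. It assumes the system responds like a first-order step with a 50 µs time constant, which is a made-up value purely for illustration (your system may look nothing like this), and counts how many samples land on the 10% to 90% part of the edge at two sample rates. `samples_on_edge` and `TAU_S` are names I invented for the example.

```python
import math

def samples_on_edge(sample_rate_hz, tau_s, duration_s=0.01):
    """Count samples that fall between 10% and 90% of a normalized
    first-order step response that starts at t = 0."""
    n = int(round(duration_s * sample_rate_hz))
    count = 0
    for i in range(n):
        t = i / sample_rate_hz
        v = 1.0 - math.exp(-t / tau_s)  # normalized response, 0 -> 1
        if 0.1 <= v <= 0.9:
            count += 1
    return count

TAU_S = 50e-6  # hypothetical time constant, NOT measured from any real system

slow = samples_on_edge(1_000, TAU_S)    # 1 kHz: one sample every 1 ms
fast = samples_on_edge(100_000, TAU_S)  # 100 kHz: one sample every 10 us
print("samples on the edge at 1 kHz:  ", slow)
print("samples on the edge at 100 kHz:", fast)
```

Under that assumption, no sample at 1 kHz lands on the transition at all, so the data would show only "before" and "after" and the apparent timing of the edge could be off by up to one sample period (1 ms).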
You know, you can keep reading data continuously regardless of whether AO is set to 0 or some other value. Since you would initialize once and keep the task running (collecting data), you would then see whether the start of the task is the problem or not.
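Once the AO is wired back into an AI channel and everything is read in one continuous task, the delay can be estimated from the data itself. A minimal sketch, assuming you already have the two channels as plain lists of voltages (the data below is synthetic, and `first_crossing`, `lag_seconds` and the 2.5 V threshold are hypothetical choices for the example, not anything from the driver):

```python
def first_crossing(samples, threshold):
    """Return the index of the first sample at or above threshold, or None."""
    for i, v in enumerate(samples):
        if v >= threshold:
            return i
    return None

def lag_seconds(ao_loopback, response, threshold, sample_rate_hz):
    """Time from the AO edge (seen on the loopback AI channel) to the
    edge on the response channel, in seconds. None if either edge is missing."""
    i_ao = first_crossing(ao_loopback, threshold)
    i_resp = first_crossing(response, threshold)
    if i_ao is None or i_resp is None:
        return None
    return (i_resp - i_ao) / sample_rate_hz

# Synthetic example: AO steps at sample 5, the response follows at sample 8.
ao = [0.0] * 5 + [5.0] * 10
resp = [0.0] * 8 + [5.0] * 7

lag = lag_seconds(ao, resp, threshold=2.5, sample_rate_hz=1000.0)
print("lag in seconds:", lag)
```

Because both edges come from the same continuous acquisition, the result does not depend on when the task was started. The resolution is still limited to one sample period.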