I'll take a stab at this.
#1. This example does not use two sample clocks to synchronize. Only the AI sample clock is used, and it drives both the AI and the AO task. Seeing the AO task use it may be what confused you, but sharing the AI sample clock is what keeps the input and output synchronized.
#2. I'm not exactly sure what you are asking, but I'll take a shot. In SW the read and the write happen at different times because of the data flow, but in HW they occur at the same instant because they share the AI sample clock. So, as in most control apps, you need to be careful on the first iteration. The output always lags the input by one sample: what appears on the pin at iteration i was calculated from the read at iteration i-1. At time t0, data is read in and written out (what is written out is garbage). A calculation is then made based on the read from t0, and the result is written to the FIFO (i.e. the write doesn't show up on your data pin yet). Then at time t1 the next sample clock occurs: the result calculated from the t0 data is written out to the data pin, new data is read, calculations are made, and everything starts over again.
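To make the one-sample lag concrete, here is a small pure-Python simulation of the timing described above. It does not touch any DAQ hardware or the DAQmx API. The names simulate_loop, control and initial_pin_value are made up for illustration, and the output FIFO is modeled as a single variable, so treat it as a sketch of the idea rather than how the driver actually works.

```python
def simulate_loop(inputs, control, initial_pin_value=None):
    """Simulate a read/compute/write loop driven by one shared sample clock.

    inputs            -- values seen on the AI pin at t0, t1, t2, ...
    control           -- hypothetical function computing an output from one input
    initial_pin_value -- stands in for the "garbage" written on the first edge
    Returns (reads, pin_values), one entry per sample clock edge.
    """
    fifo = initial_pin_value  # what is waiting to go out on the next clock edge
    reads = []
    pin_values = []
    for sample in inputs:  # each pass is one sample clock edge
        # Hardware: on the same edge, the queued value reaches the AO pin
        # and a new sample is latched on AI.
        pin_values.append(fifo)
        reads.append(sample)
        # Software: compute from the new sample and queue the result.
        # It will not reach the pin until the next edge.
        fifo = control(sample)
    return reads, pin_values


if __name__ == "__main__":
    # Example: a controller that simply doubles its input.
    reads, pins = simulate_loop([1.0, 2.0, 3.0], lambda x: 2 * x)
    for i, (r, p) in enumerate(zip(reads, pins)):
        print(f"t{i}: read={r} pin={p}")
```

With the doubling controller, the pin shows None (the garbage placeholder) at t0, then 2.0 at t1 and 4.0 at t2, i.e. each output is the result computed from the previous read.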
#3. DO cannot do HWTSP (hardware-timed single point).
Hope this helps out
StuartG