I can attest that they are definitely NOT synchronous unless you combine them, either as one data element or in a cluster. As a test, take a single counter and feed it to two FP (front panel) indicators. Then, in your host program, subtract the two and send the result to a graph. I'll bet you'll see occasional glitches where the two numbers were not identical. In my case I was using a 34-bit signed FXP and split it into two U32s (since I was using a DMA FIFO as well). On my host it was most obvious when my readback was cycling about zero: one value was still a signed negative value while the other was already showing a portion that was positive, causing large data variations.
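To make the zero-crossing glitch concrete, here is a minimal Python model of a torn read. It is only a sketch under assumptions: I am assuming the 34-bit value is split into a low 32-bit word and a high word holding the remaining 2 bits, and that the host happens to read the high word before the update and the low word after it. The function names are made up for illustration and are not part of any NI API.

```python
BITS = 34                      # width of the signed value (from the post)
MASK = (1 << BITS) - 1

def split(value):
    """Split a signed 34-bit value into (high word, low 32-bit word).
    The layout (low 32 bits + upper 2 bits) is an assumption."""
    raw = value & MASK         # two's-complement bit pattern, 34 bits wide
    return raw >> 32, raw & 0xFFFFFFFF

def join(hi, lo):
    """Rebuild the signed 34-bit value from the two words."""
    raw = ((hi << 32) | lo) & MASK
    # Bit 33 is the sign bit of a 34-bit two's-complement number.
    return raw - (1 << BITS) if raw & (1 << (BITS - 1)) else raw

def torn_read(old, new):
    """Model a host read that sees the OLD high word and the NEW low word."""
    hi_old, _ = split(old)
    _, lo_new = split(new)
    return join(hi_old, lo_new)

if __name__ == "__main__":
    # The value steps from -1 to 0, but the two halves are read at different times.
    print(torn_read(-1, 0))    # neither -1 nor 0
```

Away from zero the high word rarely changes, so a torn read usually goes unnoticed (in this model `torn_read(5, 6)` just returns 6). Around zero every sign change flips the high word, which is why the glitches show up there as large jumps.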
I agree that if the old value is still around, writing the old value back into the indicator can actually consume fewer resources, since you don't need the logic to gate the update. However, in some cases doing so would mean you need a shift register to hold the old value. I am not sure which would take fewer resources: the shift register (which in my case was for a 34-bit signed FXP) or the gating logic. It seems like the gating logic would be "cheaper", but I'm not sure.
I implement my state machines inside an SCTL (single-cycle timed loop). For my purposes this was required, since I was controlling a digital bus and needed deterministic timing for each 25 ns period. Using a loop outside an SCTL introduces a 2-cycle penalty PER LOOP, which would not allow the timing I needed. Inside the SCTL, each case takes the same amount of time: one clock cycle. If I need variable timing, I use a WAIT case. The previous case pushes the number of waits required into one SR (shift register) and the NEXT STATE to run when the wait is done into another SR. The machine then enters (and keeps re-entering) the WAIT state, decrementing the wait count each time until it reaches zero, and then loads the stored NEXT STATE as the next state to enter.
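Since LabVIEW diagrams don't paste well into a post, here is a rough Python sketch of the WAIT-case idea. Each call to `step` stands for one SCTL iteration (one clock cycle), and the three values passed around stand in for the shift registers. The state names `START` and `FINISH`, the wait count of 3, and the exact point at which the count is treated as expired are all assumptions for illustration, not my actual FPGA code.

```python
def step(state, wait_count, next_state):
    """One clock cycle of the state machine.
    Returns the new (state, wait_count, next_state) register values."""
    if state == "START":
        # Request 3 wait cycles, then continue with FINISH (hypothetical values).
        return "WAIT", 3, "FINISH"
    if state == "WAIT":
        if wait_count > 1:
            # Still waiting: stay in WAIT and decrement the counter.
            return "WAIT", wait_count - 1, next_state
        # Wait expired: load the stored next state.
        return next_state, 0, next_state
    # FINISH (or any other state) simply holds in this sketch.
    return state, wait_count, next_state

def run(cycles):
    """Run the machine for a number of clock cycles and record each state."""
    regs = ("START", 0, "START")
    trace = []
    for _ in range(cycles):
        trace.append(regs[0])
        regs = step(*regs)
    return trace

if __name__ == "__main__":
    print(run(6))
```

With these assumptions the machine spends exactly three cycles in WAIT before moving on, and every case, WAIT included, still costs one clock cycle, so the total delay is set purely by the count loaded into the shift register.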
I'm a little concerned about your experience with the longest-state issue. If that were true, I think it would be a severe bug. I am assuming that this must NOT have been inside an SCTL? Could this easily be proved by examining the tick counter values at the completion of each case?
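For what it's worth, the host-side check could be as simple as the sketch below. It assumes you log the tick counter at the completion of each case, read the values back as a list, and that the counter is a free-running 32-bit value that can wrap. The function name and the sample values are made up.

```python
def tick_deltas(ticks, width=32):
    """Return the tick differences between consecutive samples,
    allowing for wraparound of a free-running counter of the given width."""
    mod = 1 << width
    return [(b - a) % mod for a, b in zip(ticks, ticks[1:])]

if __name__ == "__main__":
    samples = [10, 11, 12, 13]          # hypothetical logged tick values
    deltas = tick_deltas(samples)
    # Inside an SCTL every case should cost the same number of ticks.
    print(deltas, len(set(deltas)) == 1)
```

If every delta is identical, each case is taking the same time. If the deltas vary from case to case, that would point to the code not actually running inside an SCTL.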