If you're trying to acquire 16-bit samples at 1 MS/s over the parallel port, it will not work. In fact, many full-fledged DAQ cards do not support this combination of sampling rate and resolution. Look at what NI has available in the S and E series DAQ boards. This should give you a feel for what's out there and what a professionally manufactured piece of equipment is capable of:
http://sine.ni.com/apps/we/nioc.vp?cid=10955&lang=US
http://sine.ni.com/apps/we/nioc.vp?cid=1038&lang=US
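To put the 16-bit, 1 MS/s target in numbers, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and the 8-bits-per-read figure is an assumption about how the port would be read (one byte per port access); your actual wiring may give you fewer usable bits per read.

```python
def required_bytes_per_second(sample_rate_hz, bits_per_sample):
    """Sustained data rate needed to keep up with the converter."""
    return sample_rate_hz * bits_per_sample / 8


def reads_per_sample(bits_per_sample, bits_per_read=8):
    """Number of port reads needed to fetch one sample (ceiling division).

    bits_per_read=8 is an assumption: one full byte per port access.
    """
    return -(-bits_per_sample // bits_per_read)


def time_budget_per_read_s(sample_rate_hz, bits_per_sample, bits_per_read=8):
    """Maximum time each port read may take if no sample is to be missed."""
    n_reads = reads_per_sample(bits_per_sample, bits_per_read)
    return 1.0 / (sample_rate_hz * n_reads)


if __name__ == "__main__":
    rate_hz = 1_000_000  # 1 MS/s, from the question
    bits = 16            # 16-bit samples, from the question

    print("bytes per second needed:", required_bytes_per_second(rate_hz, bits))
    print("port reads per sample:  ", reads_per_sample(bits))
    print("seconds per read budget:", time_budget_per_read_s(rate_hz, bits))
```

Under those assumptions you need a sustained 2 MB/s, with every single read finishing in about half a microsecond and no gaps at all between reads. That is a lot to ask of software-polled port I/O.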
As you noted already, a parallel port is going to be extremely slow for your needs. It will likely also have a fair amount of timing jitter, since the timing of your read calls will be indeterminate. I would definitely recommend you look into some other hardware acquisition option besides your parallel port. Unfortunately it won't get you very far, because that isn't what it's really designed for.
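If you want to see the jitter problem for yourself, here is a minimal sketch in Python that tries to call a read function at a fixed period from a software loop and records how far the actual intervals drift. It does not touch the parallel port: `read_fn` is a hypothetical stand-in for whatever port-read call you would use, and `measure_read_jitter` and `summarize` are names invented for this example. The numbers you get will depend heavily on your OS and machine load.

```python
import statistics
import time


def measure_read_jitter(n_samples=1000, period_s=0.001, read_fn=None):
    """Call read_fn at a nominal fixed period from a busy-wait loop.

    Returns the measured intervals (in seconds) between successive reads.
    """
    if read_fn is None:
        # Hypothetical placeholder for a real port read.
        read_fn = lambda: 0

    timestamps = []
    next_deadline = time.perf_counter()
    for _ in range(n_samples):
        # Spin until the next nominal sample time.
        while time.perf_counter() < next_deadline:
            pass
        read_fn()
        timestamps.append(time.perf_counter())
        next_deadline += period_s

    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def summarize(intervals, period_s):
    """Basic statistics on how far the intervals stray from the target period."""
    return {
        "mean_s": statistics.mean(intervals),
        "stdev_s": statistics.pstdev(intervals),
        "worst_deviation_s": max(abs(dt - period_s) for dt in intervals),
    }


if __name__ == "__main__":
    period = 0.001  # 1 kHz target, far below the 1 MS/s in the question
    stats = summarize(measure_read_jitter(period_s=period), period)
    for name, value in stats.items():
        print(f"{name}: {value:.9f}")
```

Even at a modest 1 kHz target, a general-purpose OS can preempt the loop at any moment, so the worst-case deviation is usually the number to watch. A DAQ board avoids this by clocking the converter in hardware and buffering the samples.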