2) Lots of multithreading, stream buffering and pipelining
3) For fast iteration, the "read, parse, ask for next" loop is the main bottleneck. So if you know, for example, that your sync source prefix contains UUIDs, the tool creates a file iterator for each known subfolder prefix. With 16 iterators, the bottleneck is mainly the CPU doing the XML parsing :)
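
To illustrate point 3, here is a minimal Python sketch of the idea, not the tool's actual code. It assumes the 16 iterators correspond to the 16 possible first hex digits of a UUID, and that listing is paginated with a continuation token. The names `shard_prefixes`, `parallel_list` and the `list_page` callback are hypothetical and stand in for whatever listing call the storage backend provides.

```python
import queue
import threading
from typing import Callable, Iterator, List, Optional, Tuple

HEX_DIGITS = "0123456789abcdef"

# Hypothetical listing callback (an assumption, not a real library API):
# (prefix, continuation_token) -> (keys_on_this_page, next_token_or_None)
ListPage = Callable[[str, Optional[str]], Tuple[List[str], Optional[str]]]


def shard_prefixes(base: str) -> List[str]:
    """Split one prefix into 16 sub-prefixes, one per leading hex digit."""
    return [base + digit for digit in HEX_DIGITS]


def parallel_list(base: str, list_page: ListPage) -> Iterator[str]:
    """Yield all keys under `base`, running one listing loop per sub-prefix."""
    # Bounded queue so fast listers cannot run arbitrarily far ahead.
    out: "queue.Queue[Optional[str]]" = queue.Queue(maxsize=10_000)

    def worker(prefix: str) -> None:
        token: Optional[str] = None
        try:
            # The sequential "read, parse, ask for next" loop, once per shard.
            while True:
                keys, token = list_page(prefix, token)
                for key in keys:
                    out.put(key)
                if token is None:
                    break
        finally:
            # Sentinel: this shard is finished (also sent if list_page raises,
            # in which case this sketch simply ends that shard early).
            out.put(None)

    prefixes = shard_prefixes(base)
    for prefix in prefixes:
        threading.Thread(target=worker, args=(prefix,), daemon=True).start()

    remaining = len(prefixes)
    while remaining:
        item = out.get()
        if item is None:
            remaining -= 1
        else:
            yield item
```

Each shard still pages sequentially, but the 16 loops overlap their network waits, so total listing time is bounded by the slowest shard rather than the sum of all pages. Keys arrive interleaved across shards, so sort afterwards if order matters.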