Using FTPFileFactory directly is a good idea, but you'll need to preprocess the raw listing first. The result of a LIST -R command is a series of directory listings joined together. They are separated by blank lines, and each is preceded by the path of the directory being listed. You'll need to split out each individual listing (by searching for blank lines) and use its path to work out where it fits in the directory tree. This will take a bit of work, but it doesn't seem to me like a particularly difficult task. Once you've separated the listings, you can pass each one to an instance of FTPFileFactory, which should be able to parse it and create FTPFile objects.
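To give a rough idea of the preprocessing step, here is a minimal sketch in plain Java. It assumes the Unix-style layout described above, where each block starts with a header line ending in a colon (for example "./sub:"), and that the first block may have no header at all. The class name RecursiveListingSplitter and the startDir parameter are made up for illustration and are not part of our API. Servers format this output differently, so treat it as a starting point only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of edtFTPj.
final class RecursiveListingSplitter {

    /**
     * Splits raw LIST -R output into one list of lines per directory.
     * Assumption: blocks are separated by blank lines, and a header is a
     * line ending in ':' that appears at the start of a block.
     *
     * @param rawLines the lines returned by LIST -R
     * @param startDir directory to use if the first block has no header
     */
    static Map<String, List<String>> split(String[] rawLines, String startDir) {
        Map<String, List<String>> listings = new LinkedHashMap<>();
        String currentDir = startDir;
        boolean atBlockStart = true;

        for (String line : rawLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                // A blank line ends the current block.
                atBlockStart = true;
                continue;
            }
            if (atBlockStart && trimmed.endsWith(":")) {
                // Header line: drop the trailing ':' to get the directory path.
                currentDir = trimmed.substring(0, trimmed.length() - 1);
                listings.computeIfAbsent(currentDir, k -> new ArrayList<>());
            } else {
                // Ordinary listing line: keep it unchanged for the parser.
                listings.computeIfAbsent(currentDir, k -> new ArrayList<>()).add(line);
            }
            atBlockStart = false;
        }
        return listings;
    }
}
```

Each value in the returned map is the set of lines for one directory, which you could convert to a String array and hand to FTPFileFactory. The map key tells you where that listing belongs in the tree.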
As for adding this functionality to our product, we have considered doing so, but are worried about reliability. The reason is that some servers seem to simply ignore the -R flag and return a non-recursive listing. It may be impossible to detect whether the server has ignored the flag or whether the subdirectories are simply empty.
For example, the MS FTP site (ftp.microsoft.com) sometimes supports LIST -R and sometimes doesn't. I guess it depends on which particular machine your session gets handled by. Just now I logged in, changed to /bussys/backoffice and executed a LIST -R command. It returned:
02-01-06 03:01PM <DIR> reskit
02-01-06 03:01PM <DIR> SMS
Does this mean that LIST -R doesn't work this time? Or does it mean that the directories reskit and SMS are both empty? I'm not sure.
It may be that there is something we can use to reliably tell whether or not it's working on this server, but the fact is that LIST -R is a non-standard feature and is therefore likely to be implemented differently on different servers. In contrast, the recursive listing approach (listing each directory separately and recursing on the client side) relies on no non-standard features and is therefore much more likely to work reliably.
Your problem is a very rare one because it's a confluence of three conditions:
- A very large directory tree.
- A very short time-out.
- A server policy that causes it to time out unless actual file transfers take place.
In my opinion the time-out policy is a mistake, especially given the size of the directory trees on the server and the task that's required of it. This policy should be changed, but I realize that it may not be possible for you to effect this change.
If you decide against the LIST -R approach then it seems to me that the best solution is to transfer a file after each directory listing. If you wish to do this then we are happy to provide you with advice and even write some of the code for you. Please let us know if you would like to try it.
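To show the shape of that idea, here is a minimal sketch. It is written against a made-up RemoteSession interface rather than our actual client classes, so every name in it (RemoteSession, KeepAliveWalker, keepAliveFile, and the convention that subdirectory names end in '/') is an assumption for illustration. Whether a small transfer is enough to reset the time-out on your server is something you would need to confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction over an FTP session; not part of edtFTPj.
interface RemoteSession {
    // Returns the entry names in dir; in this sketch, names ending in '/' are subdirectories.
    List<String> list(String dir) throws Exception;

    // Downloads a (preferably tiny) file so the server sees a real transfer.
    void fetch(String path) throws Exception;
}

final class KeepAliveWalker {

    // Walks the tree below dir, fetching keepAliveFile after every listing.
    // Returns the full paths of all files found.
    static List<String> walk(RemoteSession session, String dir, String keepAliveFile)
            throws Exception {
        List<String> files = new ArrayList<>();
        List<String> entries = session.list(dir);

        // One transfer per listing, intended to keep the server from timing out.
        session.fetch(keepAliveFile);

        for (String name : entries) {
            String path = dir.endsWith("/") ? dir + name : dir + "/" + name;
            if (name.endsWith("/")) {
                // Subdirectory: strip the trailing '/' and recurse.
                String subDir = path.substring(0, path.length() - 1);
                files.addAll(walk(session, subDir, keepAliveFile));
            } else {
                files.add(path);
            }
        }
        return files;
    }
}
```

In a real implementation the two interface methods would be backed by the client's directory listing and download calls, and keepAliveFile would point at some small file that is known to exist on the server.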
- Hans (EnterpriseDT)