0 votes
8.3k views
in Java FTP by (200 points)
I have written an application that polls an FTP server for the arrival of a new file. The session is cycled on each poll interval: the application calls the quit method, waits for the interval to elapse (currently 10 seconds), and then reconnects. The problem is that open socket file descriptors accumulate over time until a "too many open files" exception is thrown. When I monitor the network connections everything looks fine, but for some reason the socket file descriptors are not being closed. I am testing on a Solaris server running edtFTPj 1.3.3. Has anyone experienced a similar problem?

Thanks,
JR.

4 Answers

0 votes
by (162k points)
edtFTPj closes all fd's as far as we can tell. Have you tried running lsof to find out where the open fd's are coming from? How long does it take to run out of fd's, and how many open fd's are permitted?

Remember too that on Solaris, socket fd's go into TIME_WAIT for a period of time after closing (120 seconds is probably the default). So polling every 10 seconds could leave quite a few fd's open at once. If TIME_WAIT is set to be longer, you could run out of fd's quite quickly.
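
As a complement to lsof, the open and permitted descriptor counts can also be read from inside the JVM. The following is a minimal sketch, not part of edtFTPj: it assumes a reasonably modern Sun/Oracle-derived JVM on a Unix-like system where `com.sun.management.UnixOperatingSystemMXBean` is available, and the class name `FdMonitor` is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdMonitor {
    /**
     * Returns {open, max} file descriptor counts for this JVM process,
     * or null when the platform bean does not expose them
     * (for example on non-Unix systems).
     */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("fd counts not available on this JVM");
        } else {
            System.out.println("open fds: " + counts[0] + " of " + counts[1]);
        }
    }
}
```

Logging these two numbers once per poll would show how quickly the count climbs and how close it gets to the limit.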

0 votes
by (200 points)
We have been using lsof and found that the open fd's are coming from the thread that performs the FTP poll. The fd counts have climbed up to 600-700, with the process upper limit set to 1024. The strange thing is that the fd's do get cleaned up periodically (every several minutes), which we think may be the result of JVM garbage collection. However, depending on system load, we occasionally hit the 1024 limit, at which point we start to throw exceptions. Shown below are a few lines from lsof:

java 7373 sonicdev 378u IPv4 0x300399ab338 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 379u IPv4 0x300468ffd58 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 380u IPv4 0x3004322c968 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 381u IPv4 0x30033c6fbe8 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 382u IPv4 0x30043cf3710 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 383u IPv4 0x3001edd2b90 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 384u IPv4 0x3001edea7c8 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 385u IPv4 0x30043f9dae8 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 386u IPv4 0x300087641f0 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 387u IPv4 0x30042517080 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 388u IPv4 0x3004250d1d0 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 389u IPv4 0x300429c95b8 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 390u IPv4 0x3001cd9ce48 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 391u IPv4 0x300432276b0 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 392u IPv4 0x3004055edf0 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)
java 7373 sonicdev 393u IPv4 0x3004338e970 0t0 TCP sundvl04:*->ficlibm01.fmr.com:* (IDLE)


The socket close method is closing our connections to the FTP hosts as expected, but for some reason the fd's hang around for several minutes.


Thanks,
JR.
0 votes
by (162k points)
Remember that because sockets go into TIME_WAIT after they are closed (to prevent duplicates being received), it will be a minimum of two minutes before they are completely closed. I think netstat will tell you if sockets are in TIME_WAIT.

You may want to increase the open files limit to 2048 if you can't reduce your polling time.

0 votes
by (200 points)
I have taken a look in netstat and have not found an accumulation of TIME_WAIT sockets. The only build-up is in the fd's. We are planning to reduce the polling time and increase the file limits to avoid the exceptions. Perhaps we are mistaken in thinking there was a problem when in fact things are working normally. However, it would be nice to understand what triggers release of the fd's.
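
One explanation consistent with the clean-up coinciding with garbage collection, though not confirmed anywhere in this thread, is that some socket is never closed explicitly and its descriptor is only released when the object is collected. The sketch below shows the general pattern of closing a connection in a `finally` block on every poll so the descriptor is released immediately. It uses plain `java.net.Socket` against a local test listener as a stand-in; it is not edtFTPj code, and the class and method names are hypothetical. With edtFTPj, the equivalent would presumably be making sure the quit call runs in a `finally` block.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PollCloseSketch {
    /**
     * Opens one connection per poll and always closes it before the next
     * poll. Returns the number of polls that completed.
     */
    static int poll(int port, int polls) throws IOException {
        int completed = 0;
        for (int i = 0; i < polls; i++) {
            Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
            try {
                // A real poller would log in and check for the new file here,
                // then sleep for the poll interval.
                completed++;
            } finally {
                // Release the descriptor now, even if the poll threw,
                // instead of leaving it for the garbage collector.
                socket.close();
            }
        }
        return completed;
    }

    public static void main(String[] args) throws Exception {
        // Local listener standing in for the FTP server (port 0 = any free port).
        try (ServerSocket server =
                     new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                try {
                    while (true) {
                        server.accept().close();
                    }
                } catch (IOException e) {
                    // Listener was closed; stop accepting.
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();
            System.out.println("polls completed: " + poll(server.getLocalPort(), 20));
        }
    }
}
```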

Thanks for your assistance on this.

JR.
