Should You Continue Polling Socket For Readiness After An EAGAIN or EWOULDBLOCK Error?
This link shows the meaning of error codes in the GNU C library. EAGAIN/EWOULDBLOCK
means "resource temporarily unavailable": the call might succeed if you try again later. A typical example is a non-blocking I/O operation that would otherwise have to block.
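To make "temporarily unavailable" concrete, here is a minimal sketch, assuming a Linux/POSIX system. The helper names `set_nonblocking` and `eagain_is_temporary` are made up for illustration. It reads from an empty non-blocking pipe (which fails with EAGAIN/EWOULDBLOCK), makes data available, and reads again (which succeeds).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical demo: returns 1 if a read on an empty non-blocking pipe
 * fails with EAGAIN/EWOULDBLOCK and the same read succeeds once a byte
 * has been written; returns 0 otherwise. */
int eagain_is_temporary(void)
{
    int p[2];
    char c = 0;
    int ok = 0;

    if (pipe(p) == -1)
        return 0;

    if (set_nonblocking(p[0]) == 0) {
        /* Nothing has been written yet, so this read cannot complete. */
        ssize_t n = read(p[0], &c, 1);
        int would_block =
            (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

        /* Make data available; the identical read now succeeds. */
        if (write(p[1], "x", 1) == 1) {
            n = read(p[0], &c, 1);
            ok = would_block && n == 1 && c == 'x';
        }
    }

    close(p[0]);
    close(p[1]);
    return ok;
}
```

The error describes only the state of the descriptor at the moment of the call, so it is not a reason to stop watching the descriptor.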
EdNdee
Updated on September 18, 2022
Comments
-
EdNdee over 1 year
I am creating a web crawler with a multiplexed download manager using Linux epoll (Linux 2.6.30.x). I pick links from a database of over 40,000 domains (each domain having between 1 and 2000 URLs), a total of 250,000 URLs. I multiplex the downloads so that on average I have no more than 2 parallel streams per host (as per the HTTP spec recommendation), and so that I loop over a batch of 10 to 50 hosts at a time. I have chosen non-blocking sockets and epoll for speed and scalability (I am low on RAM) and for ease of use compared to poll, select and signal-driven I/O.
I download the first few hundred URLs very smoothly and rapidly. The trouble is, I keep getting an EAGAIN/EWOULDBLOCK error from certain links (sockets) that otherwise seem ready (i.e. I can use my PC's browser to open the links at any point). Even after epolling them repeatedly, expecting their status to change to EPOLLIN, they remain EAGAIN/EWOULDBLOCK. These links build up very quickly, so that I have to stop the whole download.
What does EAGAIN/EWOULDBLOCK really mean? Is EAGAIN/EWOULDBLOCK a permanent status, so that once it is detected I should delist that socket from any further observation?
Kindly help.
-
David Schwartz about 12 years
Can you clarify exactly what's happening? Are you getting an epoll read hit or write hit? What operation is returning EAGAIN/EWOULDBLOCK?
-
EdNdee about 12 years
I have 3 threads. thread1 issues epoll_ctl(epoll_writefd, EPOLL_CTL_ADD,..) and epoll_ctl(epoll_readfd, EPOLL_CTL_ADD,..) for each live host socket (fewer than 50 are active). thread2 issues epoll_wait(epoll_writefd,...,-1) to check write readiness; when a socket is ready it sends the actual HTTP request, then issues epoll_ctl(epoll_writefd, EPOLL_CTL_DEL,..) to remove the socket from further write polling. thread3 issues epoll_wait(epoll_readfd,..,-1) to check read readiness; when a socket is ready it downloads the page repeatedly (until error or completion), then issues epoll_ctl(epoll_readfd, EPOLL_CTL_DEL,..) to remove the socket from further read polling.
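The write-side step described in this comment (add the socket to the write epoll set, wait for a write hit, send the request, then delete the socket from the set) could look roughly like the sketch below. It is not the poster's actual code. The function name `send_request_when_writable` is made up, the socket is assumed to be non-blocking already, and, to keep the sketch short, it is assumed to be the only descriptor registered in `epoll_writefd`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is writable, send the whole request,
 * then stop watching fd for writes. Returns 0 on success, -1 on error.
 * Assumes fd is non-blocking and is the only fd in epoll_writefd. */
int send_request_when_writable(int epoll_writefd, int fd,
                               const char *req, size_t len)
{
    struct epoll_event ev = { .events = EPOLLOUT, .data.fd = fd };
    size_t sent = 0;
    int rc = 0;

    assert(req != NULL);
    if (epoll_ctl(epoll_writefd, EPOLL_CTL_ADD, fd, &ev) == -1)
        return -1;

    while (rc == 0 && sent < len) {
        struct epoll_event hit;
        int n = epoll_wait(epoll_writefd, &hit, 1, -1);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        /* Write hit: send until done or until the kernel buffer is full. */
        while (sent < len) {
            ssize_t w = send(fd, req + sent, len - sent, MSG_NOSIGNAL);
            if (w > 0) {
                sent += (size_t)w;
            } else if (w == -1 && errno == EINTR) {
                continue;
            } else if (w == -1 &&
                       (errno == EAGAIN || errno == EWOULDBLOCK)) {
                break;      /* buffer full: wait for the next write hit */
            } else {
                rc = -1;    /* real error */
                break;
            }
        }
    }

    /* The request is out (or failed): no further write polling needed. */
    epoll_ctl(epoll_writefd, EPOLL_CTL_DEL, fd, NULL);
    return rc;
}
```

EAGAIN/EWOULDBLOCK from `send` only sends the loop back to `epoll_wait`. Nothing is retried until the kernel reports another write hit.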
-
David Schwartz about 12 years
Okay, so what operation returns EWOULDBLOCK? I think what you're missing is this: If a read operation returns EWOULDBLOCK, you don't want to try to read again until you get another epoll read hit.
-
EdNdee about 12 years
Solved! Thanks David. "If a read operation returns EWOULDBLOCK, you don't want to try to read again until you get another epoll read hit" - That's actually quite important because the thread would then block; I hadn't initially figured that out! I appreciate your help.
-
David Schwartz about 12 years
The thread shouldn't block because you should have set the socket non-blocking. (If you want to block, why use epoll? And if you don't want to block, you must set the socket non-blocking.) What will happen, though, is that the thread will spin.
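The advice in these comments can be sketched as follows, assuming Linux epoll in its default level-triggered mode. All function names (`make_nonblocking`, `watch_for_reads`, `drain_socket`, `read_until_eof`) are hypothetical. The sketch handles a single socket and discards the data it reads, to keep the control flow visible: read until EAGAIN/EWOULDBLOCK, then go back to `epoll_wait` instead of retrying the read.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode (0 ok, -1 error). */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: register fd for read hits (level-triggered). */
int watch_for_reads(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Read until the kernel says EAGAIN/EWOULDBLOCK or the peer closes.
 * Returns the number of bytes read, or -1 on a real error.
 * *eof is set to 1 when the peer has closed the connection. */
long drain_socket(int fd, int *eof)
{
    char buf[4096];
    long total = 0;

    assert(eof != NULL);
    *eof = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;         /* a real crawler would store buf here */
        } else if (n == 0) {
            *eof = 1;           /* orderly shutdown by the peer */
            return total;
        } else if (errno == EINTR) {
            continue;
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;       /* not an error: wait for next read hit */
        } else {
            return -1;
        }
    }
}

/* Wait for read hits and drain after each one, until EOF.
 * Returns the total number of bytes read, or -1 on error. */
long read_until_eof(int epfd, int fd)
{
    struct epoll_event hit;
    long total = 0;
    int eof = 0;

    while (!eof) {
        int n = epoll_wait(epfd, &hit, 1, -1);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 1 && hit.data.fd == fd) {
            long got = drain_socket(fd, &eof);
            if (got == -1)
                return -1;
            total += got;
        }
    }
    return total;
}
```

Calling `read` in a tight loop after EAGAIN/EWOULDBLOCK is what makes a thread spin on a non-blocking socket. Returning to `epoll_wait` lets the kernel wake the thread only when there is something to read.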
-