Why do we need to call poll_wait in poll?
Solution 1
poll_wait adds your device (represented by the "struct file") to the list of those that can wake the process up.
The idea is that the process can use poll (or select or epoll etc) to add a bunch of file descriptors to the list on which it wishes to wait. The poll entry for each driver gets called. Each one adds itself (via poll_wait) to the waiter list.
Then the core kernel blocks the process in one place. That way, any one of the devices can wake up the process. If you return non-zero mask bits, that means those "ready" attributes (readable/writable/etc) apply now.
So, in pseudo-code, it's roughly like this:
    foreach fd:
        find device corresponding to fd
        call device poll function to set up wait queues (with poll_wait) and to collect its "ready-now" mask
    while time remaining in timeout and no devices are ready:
        sleep
    return from system call (either due to timeout or to ready devices)
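The pseudo-code above can be modelled in plain user-space C. What follows is a toy sketch, not kernel code: every toy_* name is hypothetical, the wait queue is just an array of flag pointers, and "sleeping" is simulated by letting at most one device receive data. It only aims to show why registering through poll_wait matters: an event on a queue the caller never registered on cannot wake it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POLLIN  0x1
#define MAX_WAITERS 8

/* Toy stand-in for wait_queue_head_t: the "woken" flags of all waiters. */
struct toy_wait_queue {
    int *waiters[MAX_WAITERS];
    size_t count;
};

/* Toy stand-in for poll_table: identifies the caller that wants to sleep. */
struct toy_poll_table {
    int woken;
};

struct toy_device {
    struct toy_wait_queue inq;
    int data_ready;
};

/* Like poll_wait: puts the caller on the queue and returns immediately. */
static void toy_poll_wait(struct toy_wait_queue *q, struct toy_poll_table *pt)
{
    if (pt == NULL)
        return;                          /* caller does not want to register */
    assert(q->count < MAX_WAITERS);
    q->waiters[q->count++] = &pt->woken;
}

/* Like a driver's poll method: register first, then report what is ready now. */
static unsigned int toy_dev_poll(struct toy_device *dev, struct toy_poll_table *pt)
{
    toy_poll_wait(&dev->inq, pt);
    return dev->data_ready ? TOY_POLLIN : 0;
}

/* Like an interrupt handler: data arrives and every queued waiter is woken. */
static void toy_dev_data_arrives(struct toy_device *dev)
{
    dev->data_ready = 1;
    for (size_t i = 0; i < dev->inq.count; i++)
        *dev->inq.waiters[i] = 1;
    dev->inq.count = 0;
}

/* Takes the caller off every queue it may still be sitting on. */
static void toy_poll_freewait(struct toy_device *devs, size_t n, struct toy_poll_table *pt)
{
    for (size_t i = 0; i < n; i++) {
        struct toy_wait_queue *q = &devs[i].inq;
        size_t kept = 0;
        for (size_t j = 0; j < q->count; j++)
            if (q->waiters[j] != &pt->woken)
                q->waiters[kept++] = q->waiters[j];
        q->count = kept;
    }
}

/*
 * Core loop. event_source is the device that receives data while the caller
 * "sleeps" (NULL means nothing happens). Returns the number of ready devices,
 * or 0 for a timeout.
 */
static int toy_do_poll(struct toy_device *devs, size_t n, struct toy_device *event_source)
{
    struct toy_poll_table pt = { 0 };
    struct toy_poll_table *reg = &pt;
    int ready = 0;

    for (;;) {
        for (size_t i = 0; i < n; i++)
            if (toy_dev_poll(&devs[i], reg) & TOY_POLLIN)
                ready++;
        reg = NULL;                      /* register on the first pass only */
        if (ready || event_source == NULL)
            break;                       /* something is ready, or "timeout" */
        toy_dev_data_arrives(event_source);   /* happens while we "sleep" */
        event_source = NULL;
        if (!pt.woken)
            break;                       /* nobody woke us: treat as timeout */
    }
    toy_poll_freewait(devs, n, &pt);
    return ready;
}
```

With two idle devices, toy_do_poll(devs, 2, &devs[1]) reports one ready device, because the caller was queued on devs[1]. Passing a device that was never polled as the event source behaves like a timeout and returns 0, since no queue entry links that device to the caller.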
Solution 2
The process calling poll sleeps if your poll file_operation returns 0. This is what was confusing me.
When you return non-zero, it means that some event was fired, and the process wakes up.
Once you see this, it is clear that something must be tying the process to the wait queue, and that thing is poll_wait.
Also remember that struct file represents "a connection between a process and an open file", not just a filesystem file. The waiting process itself is recorded through the poll_table: when poll_wait queues an entry, the core code in fs/select.c ties that entry to the calling task.
Playing with a minimal runnable example might also help clear things up: https://stackoverflow.com/a/44645336/895245
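The "returns 0, so the caller sleeps" versus "returns a non-zero mask, so the caller continues" behaviour can also be observed from user space without writing a driver. The sketch below uses an ordinary pipe in place of a custom device; the struct and function names are made up for illustration, and the 50 ms timeout is an arbitrary choice.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

struct pipe_poll_result {
    int empty_ret;        /* poll() return value on the empty pipe */
    int ready_ret;        /* poll() return value once a byte is queued */
    short ready_revents;  /* revents reported once a byte is queued */
};

static struct pipe_poll_result pipe_poll_demo(void)
{
    struct pipe_poll_result res = { 0, 0, 0 };
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    (void)rc;

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };

    /* Nothing to read yet: the caller sleeps until the 50 ms timeout expires. */
    res.empty_ret = poll(&pfd, 1, 50);

    /* Queue one byte so that the read end is readable right away. */
    ssize_t written = write(fds[1], "x", 1);
    assert(written == 1);
    (void)written;

    /* Now the ready mask is non-zero and poll() returns without waiting. */
    pfd.revents = 0;
    res.ready_ret = poll(&pfd, 1, 50);
    res.ready_revents = pfd.revents;

    close(fds[0]);
    close(fds[1]);
    return res;
}
```

On the empty pipe, poll() returns 0 after the timeout; once a byte is queued, it returns 1 with POLLIN set in revents.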
demonguy
Updated on September 15, 2022
Comments
-
demonguy over 1 year
In LDD3, I saw this code:
    static unsigned int scull_p_poll(struct file *filp, poll_table *wait)
    {
        struct scull_pipe *dev = filp->private_data;
        unsigned int mask = 0;

        /*
         * The buffer is circular; it is considered full
         * if "wp" is right behind "rp" and empty if the
         * two are equal.
         */
        down(&dev->sem);
        poll_wait(filp, &dev->inq, wait);
        poll_wait(filp, &dev->outq, wait);
        if (dev->rp != dev->wp)
            mask |= POLLIN | POLLRDNORM;    /* readable */
        if (spacefree(dev))
            mask |= POLLOUT | POLLWRNORM;   /* writable */
        up(&dev->sem);
        return mask;
    }
But it says poll_wait won't wait and will return immediately. Then why do we need to call it? Why can't we just return mask?
-
demonguy almost 9 years
Then when does the process sleep?
-
demonguy almost 9 years
You mean, the poll call from user space will block the process, right?
-
Gil Hamilton almost 9 years
Yes. When you call poll(2) in user space, that goes to a function called "sys_poll" inside the kernel (see fs/select.c in kernel source). Likewise, select(2) => sys_select, etc. All those functions follow more or less the pseudo-code I gave above.
-
EML about 8 years
This is completely wrong. poll_wait doesn't 'trigger' at all. It simply adds a wait queue to the poll_table.
-
Kevin Ding about 3 years
I have a question: what does wait_queue_head_t do? void poll_wait (struct file *, wait_queue_head_t *, poll_table *);
-
Gil Hamilton about 3 years
It's a data structure that anchors the head of the queue of "waiting processes" (within this device). That way, if an interrupt comes in that delivers data (for read) or frees up space (for write), the device can notify the core kernel that any process waiting on the queue can be awakened. Each such process is then unblocked (scheduled to run), which causes a return to user space from the select/poll syscall it was blocked in.
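The chain described here (data arrives, the waiters are woken, the blocked poll call returns) can be watched from user space with a pipe and a second process. This is only an illustrative sketch: the function name is made up, the child process stands in for the "interrupt" that delivers data, and the 100 ms delay and 5 s safety timeout are arbitrary.

```c
#include <assert.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the revents seen by the parent, or 0 if poll() timed out or failed. */
static short poll_until_child_writes(void)
{
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    (void)rc;

    pid_t pid = fork();
    assert(pid >= 0);

    if (pid == 0) {
        /* Child: plays the role of the event source that delivers data. */
        close(fds[0]);
        poll(NULL, 0, 100);                 /* sleep for about 100 ms */
        ssize_t w = write(fds[1], "x", 1);
        _exit(w == 1 ? 0 : 1);
    }

    /* Parent: blocks in poll() on the still-empty pipe until it is woken. */
    close(fds[1]);
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int n = poll(&pfd, 1, 5000);            /* generous safety timeout */
    short revents = (n == 1) ? pfd.revents : 0;

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return revents;
}
```

The parent is asleep inside poll() when the child writes, and it returns with POLLIN set well before the 5 s timeout, which is the user-visible effect of the wake-up on the wait queue.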