GRPC Core 10.0.0
Author: Sree Kuchibhotla (@sreecha) - Sep 2018
The polling engine component was created for the following reasons:

- gRPC code has to act when file descriptors become readable or writable. For example:
  - `grpc_endpoint` code calls `recvmsg` when the fd is readable and `sendmsg` when the fd is writable
  - `tcp_client` connect code issues an async `connect` and finishes creating the client once the fd is writable (i.e. when the `connect` has actually finished)

There are multiple polling engine implementations depending on the OS and the OS version. Fortunately, all of them expose the same interface.
- Linux:
  - **`epollex`** (default, but requires kernel version >= 4.5)
  - `epoll1` (if `epollex` is not available and glibc version >= 2.9)
  - `poll` (if the kernel does not have epoll support)
- Mac: **`poll`** (default)
- `libuv` polling engine implementation (requires different compile `#define`s)

The following are the **opaque** structures exposed by the polling engine interface (NOTE: different polling engine implementations have different definitions of these structures):
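- **grpc_fd:** represents a file descriptor
- **grpc_pollset:** a set of `grpc_fd`s that are polled for events. One `grpc_fd` can be in multiple `grpc_pollset`s
- **grpc_pollset_worker:** represents a polling thread, i.e. the thread that calls the `grpc_pollset_work()` API
- **grpc_pollset_set:** a group of `grpc_fd`s, `grpc_pollset`s and `grpc_pollset_set`s (yes, a `grpc_pollset_set` can contain other `grpc_pollset_set`s)

The interface exposes the following functions:

- `grpc_fd_notify_on_(grpc_fd* fd, grpc_closure* closure)`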
- `grpc_fd_shutdown(grpc_fd* fd)`
- `grpc_fd_orphan(grpc_fd* fd, grpc_closure* on_done, int* release_fd, char* reason)`
  - Release the `grpc_fd` structure and call the `on_done` closure when the operation is complete
  - If `release_fd` is set to `nullptr`, then `close()` the underlying fd as well. If not, put the underlying fd in `release_fd` (and do not call `close()`)
    - `release_fd` is set to non-null in cases where the underlying fd is NOT owned by grpc core (for example, the fds used by the C-Ares DNS resolver)
- `grpc_pollset_add_fd(grpc_pollset* ps, grpc_fd *fd)`
  > NOTE: There is no `grpc_pollset_remove_fd`. This is because calling `grpc_fd_orphan()` effectively removes the fd from all the pollsets it is a part of.
- `grpc_pollset_work(grpc_pollset* ps, grpc_pollset_worker** worker, grpc_millis deadline)`
  > NOTE: `grpc_pollset_work()` requires the pollset mutex to be locked before calling it. Shortly after being called, the function populates the `*worker` pointer (among other things) and releases the mutex. Once `grpc_pollset_work()` returns, the `*worker` pointer is invalid and must not be used anymore. See the code in `completion_queue.cc` to see how this is used.
  - One of the conditions under which it returns is that the worker is kicked (see `grpc_pollset_kick` for more details)
- `grpc_pollset_kick(grpc_pollset* ps, grpc_pollset_worker* worker)`
  - If `worker == nullptr`, kick ANY worker active on that pollset
- `grpc_pollset_set_[add|del]_fd(grpc_pollset_set* pss, grpc_fd *fd)`
  - Add/remove the fd to/from the `grpc_pollset_set`
- `grpc_pollset_set_[add|del]_pollset(grpc_pollset_set* pss, grpc_pollset* ps)`
  - Adding a pollset to a pollset_set means that calling `grpc_pollset_work()` on the pollset will also poll all the fds in the pollset_set, i.e. semantically, it is similar to adding all the fds inside the pollset_set to the pollset.
- `grpc_pollset_set_[add|del]_pollset_set(grpc_pollset_set* bag, grpc_pollset_set* item)`
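The locking and kick rules for `grpc_pollset_work()` and `grpc_pollset_kick()` can be sketched with a toy model. This is a minimal sketch under stated assumptions, not gRPC code: `ToyWorker`, `KickablePollset`, `ToyPollsetWork`, `ToyPollsetKick` and `RunKickDemo` are hypothetical names, a condition variable stands in for the actual polling, and a kick that arrives when no worker is active is simply dropped.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <list>
#include <mutex>
#include <thread>

// Hypothetical stand-ins; the real opaque gRPC structures differ per engine.
struct ToyWorker {
  bool kicked = false;
  std::condition_variable cv;
};

struct KickablePollset {
  std::mutex mu;                  // must be held when calling the functions below
  std::list<ToyWorker*> workers;  // workers currently blocked in ToyPollsetWork
};

enum class WakeReason { kDeadline, kKicked };

// Caller must hold `lock` on ps->mu. The wait releases the mutex while the
// worker is blocked and re-acquires it before returning.
WakeReason ToyPollsetWork(KickablePollset* ps,
                          std::unique_lock<std::mutex>& lock,
                          ToyWorker** worker,
                          std::chrono::milliseconds timeout) {
  assert(lock.owns_lock());
  ToyWorker self;  // lives on this thread's stack, so it dies when we return
  ps->workers.push_back(&self);
  *worker = &self;
  const bool kicked =
      self.cv.wait_for(lock, timeout, [&self] { return self.kicked; });
  ps->workers.remove(&self);
  *worker = nullptr;  // in this sketch the handed-out pointer is cleared
  return kicked ? WakeReason::kKicked : WakeReason::kDeadline;
}

// Caller must hold ps->mu. A null `worker` means "kick any active worker".
void ToyPollsetKick(KickablePollset* ps, ToyWorker* worker) {
  if (worker == nullptr) {
    if (ps->workers.empty()) return;  // simplification: nothing to kick
    worker = ps->workers.front();
  }
  worker->kicked = true;
  worker->cv.notify_one();
}

// Demo: one thread polls with a long deadline, the calling thread kicks it.
WakeReason RunKickDemo() {
  KickablePollset ps;
  WakeReason reason = WakeReason::kDeadline;
  std::thread poller([&ps, &reason] {
    std::unique_lock<std::mutex> lock(ps.mu);
    ToyWorker* worker = nullptr;
    reason = ToyPollsetWork(&ps, lock, &worker, std::chrono::seconds(30));
  });
  for (;;) {  // wait until the poller has registered itself, then kick it
    std::unique_lock<std::mutex> lock(ps.mu);
    if (!ps.workers.empty()) {
      ToyPollsetKick(&ps, nullptr);
      break;
    }
    lock.unlock();
    std::this_thread::yield();
  }
  poller.join();
  return reason;
}
```

Because the worker object lives on the polling thread's stack in this sketch, the pointer written to `*worker` cannot outlive the call, which mirrors the rule that `*worker` must not be used once `grpc_pollset_work()` has returned.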
Relation between grpc_pollset_worker, grpc_pollset and grpc_fd:
grpc_pollset_set
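The effect of adding a pollset to a pollset_set can be illustrated with a small in-memory model. This is a sketch only: `ToyPollset`, `ToyPollsetSet` and `PolledFds` are hypothetical names, fds are plain integers, and nested pollset_sets are deliberately left out.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-ins for the opaque gRPC types; NOT the real definitions.
struct ToyPollset {
  std::set<int> fds;  // fds added directly to the pollset
};

struct ToyPollsetSet {
  std::set<int> fds;                     // fds added to the pollset_set
  std::set<const ToyPollset*> pollsets;  // pollsets added to the pollset_set

  void AddFd(int fd) { fds.insert(fd); }
  void DelFd(int fd) { fds.erase(fd); }
  void AddPollset(const ToyPollset* ps) {
    assert(ps != nullptr);
    pollsets.insert(ps);
  }
  void DelPollset(const ToyPollset* ps) { pollsets.erase(ps); }
};

// Returns the fds that polling `ps` would cover in this model: its own fds,
// plus every fd in `pss` while `ps` is a member of `pss`.
std::set<int> PolledFds(const ToyPollset& ps, const ToyPollsetSet& pss) {
  std::set<int> out = ps.fds;
  if (pss.pollsets.count(&ps) > 0) {
    out.insert(pss.fds.begin(), pss.fds.end());
  }
  return out;
}
```

In this model, a pollset holding fd 3 that is added to a pollset_set holding fds 7 and 9 covers all three fds, and goes back to covering only fd 3 once it is removed from the pollset_set.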
Code at `src/core/lib/iomgr/ev_epoll1_posix.cc`

- Pollsets are grouped into `pollset_neighborhood`s (a structure internal to the `epoll1` polling engine implementation). The `grpc_pollset_worker`s that call `grpc_pollset_work` on a given pollset are all queued in a linked list against the `grpc_pollset`. The head of the linked list is called the "root worker".
- When choosing the next designated poller, a `pollset_neighborhood` list is scanned to pick the next pollset and worker that could be the new designated poller.
  - NOTE: All that is really needed is a list of `grpc_pollset_worker`s with a way to group them per pollset (needed to implement `grpc_pollset_kick` semantics) and a way to randomly select a new designated poller.
- See the `begin_worker()` function to see how a designated poller is chosen. Similarly, the `end_worker()` function is called by the worker that has just come out of `epoll_wait()` and has to choose a new designated poller.

Code at `src/core/lib/iomgr/ev_epollex_posix.cc`
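- If multiple pollsets point to the same `Pollable`, then the `Pollable` MUST be either empty or of type `PO_FD` (i.e. single-fd)
- The same fd can be in multiple `Pollable`s (even if one of the `Pollable`s is of type `PO_FD`)
- There cannot be two `Pollable`s of type `PO_FD` for the same fd
- Why do we need `Pollable`s of type `PO_FD` and `PO_EMPTY`?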
  - Without them, every call on a given channel would effectively have to create a `Pollable` and hence an epollset. This is because every completion queue automatically creates a pollset, and the channel fd has to be put in that pollset, which requires an epollset to hold that fd. Creating an epollset per call (even if it is deleted once the call completes) would mean a lot of syscalls to create and delete epoll fds, which is clearly not a good idea.
  - With these types of `Pollable`s, all pollsets (corresponding to the new per-call completion queue) initially point to the global `PO_EMPTY` epollset. Once the channel fd is added to the pollset, the pollset points to the `Pollable` of type `PO_FD` containing just that fd (i.e. it reuses the existing `Pollable`). This way, the epoll fd creation/deletion churn is avoided.

- The `poll` polling engine is quite complicated. It uses the `poll()` function to do the polling (and hence it is for platforms like osx where epoll is not available)
  - `poll()` is level-triggered, which complicates the implementation further (keep this in mind in case you wonder why the code at `src/core/lib/iomgr/ev_poll_posix.cc` is written in a certain, seemingly complicated way :))
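Since this engine is built on `poll()`, it helps to recall that `poll()` is level-triggered: it keeps reporting an fd as readable for as long as unread data remains. The following is a minimal POSIX sketch of that behaviour on a pipe, independent of gRPC; `IsReadableNow` and `CountReadableReports` are hypothetical helper names.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Returns true if a zero-timeout poll() reports `fd` as readable right now.
bool IsReadableNow(int fd) {
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  const int n = poll(&pfd, 1, /*timeout_ms=*/0);
  return n == 1 && (pfd.revents & POLLIN) != 0;
}

// Writes one byte into a pipe and polls the read end `attempts` times without
// reading it. Returns how many of those polls reported the fd as readable, or
// -1 if setup failed or the fd was still reported readable after draining it.
int CountReadableReports(int attempts) {
  assert(attempts >= 0);
  int fds[2];
  if (pipe(fds) != 0) return -1;
  const char byte = 'x';
  if (write(fds[1], &byte, 1) != 1) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  int reports = 0;
  for (int i = 0; i < attempts; ++i) {
    if (IsReadableNow(fds[0])) ++reports;  // the byte is still unread
  }
  char sink;
  const bool drained = read(fds[0], &sink, 1) == 1;
  const bool quiet_after_read = drained && !IsReadableNow(fds[0]);
  close(fds[0]);
  close(fds[1]);
  return quiet_after_read ? reports : -1;
}
```

Every poll made while the byte is unread reports the fd as readable, so an engine built on `poll()` has to track for itself which events it has already handed out.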