Sun Mar 8 09:47:28 2020 UTC ()
Pull up following revision(s) (requested by mlelstv in ticket #1515):

	sys/kern/sys_select.c: revision 1.42-1.45

PR/54158: Anthony Mallet: poll(2) does not allow polling all possible fds
(hardcoded limit to 1000 + #<open-fds>). Changed to limit by the max of
the resource limit of open descriptors and the above.

Remove the slop code. Suggested by mrg@

Use the max limit (aka maxfiles or the moral equivalent of OPEN_MAX) which
makes poll(2) align with the Posix documentation (which allows EINVAL if
nfds > OPEN_MAX). From: Anthony Mallet

Add slop of 1000 and explain why.


(martin)
diff -r1.40 -r1.40.2.1 src/sys/kern/sys_select.c

cvs diff -r1.40 -r1.40.2.1 src/sys/kern/sys_select.c (expand / switch to context diff)
--- src/sys/kern/sys_select.c 2017/06/01 02:45:13 1.40
+++ src/sys/kern/sys_select.c 2020/03/08 09:47:28 1.40.2.1
@@ -1,4 +1,4 @@
-/*	$NetBSD: sys_select.c,v 1.40 2017/06/01 02:45:13 chs Exp $	*/
+/*	$NetBSD: sys_select.c,v 1.40.2.1 2020/03/08 09:47:28 martin Exp $	*/
 
 /*-
  * Copyright (c) 2007, 2008, 2009, 2010 The NetBSD Foundation, Inc.
@@ -84,7 +84,7 @@
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: sys_select.c,v 1.40 2017/06/01 02:45:13 chs Exp $");
+__KERNEL_RCSID(0, "$NetBSD: sys_select.c,v 1.40.2.1 2020/03/08 09:47:28 martin Exp $");
 
 #include <sys/param.h>
 #include <sys/systm.h>
@@ -488,15 +488,28 @@
 	int		error;
 	size_t		ni;
 
-	if (nfds > 1000 + curlwp->l_fd->fd_dt->dt_nfiles) {
+	if (nfds > curlwp->l_proc->p_rlimit[RLIMIT_NOFILE].rlim_max + 1000) {
 		/*
-		 * Either the user passed in a very sparse 'fds' or junk!
-		 * The kmem_alloc() call below would be bad news.
-		 * We could process the 'fds' array in chunks, but that
+		 * Prevent userland from causing over-allocation.
+		 * Raising the default limit too high can still cause
+		 * a lot of memory to be allocated, but this also means
+		 * that the file descriptor array will also be large.
+		 *
+		 * To reduce the memory requirements here, we could 
+		 * process the 'fds' array in chunks, but that
 		 * is a lot of code that isn't normally useful.
 		 * (Or just move the copyin/out into pollscan().)
+		 *
 		 * Historically the code silently truncated 'fds' to
 		 * dt_nfiles entries - but that does cause issues.
+		 *
+		 * Using the max limit equivalent to sysctl
+		 * kern.maxfiles is the moral equivalent of OPEN_MAX
+		 * as specified by POSIX.
+		 *
+		 * We add a slop of 1000 in case the resource limit was
+		 * changed after opening descriptors or the same descriptor
+		 * was specified more than once.
 		 */
 		return EINVAL;
 	}