Received: by mail.netbsd.org (Postfix, from userid 605) id CE02B84F40; Tue, 15 Nov 2022 08:52:12 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1]) by mail.netbsd.org (Postfix) with ESMTP id 092CD84F34 for ; Tue, 15 Nov 2022 08:52:12 +0000 (UTC)
X-Virus-Scanned: amavisd-new at netbsd.org
Received: from mail.netbsd.org ([127.0.0.1]) by localhost (mail.netbsd.org [127.0.0.1]) (amavisd-new, port 10025) with ESMTP id eU5w9k5Q2Jxt for ; Tue, 15 Nov 2022 08:52:10 +0000 (UTC)
Received: from cvs.NetBSD.org (ivanova.NetBSD.org [IPv6:2001:470:a085:999:28c:faff:fe03:5984]) by mail.netbsd.org (Postfix) with ESMTP id 0F5E384D2A for ; Tue, 15 Nov 2022 08:52:10 +0000 (UTC)
Received: by cvs.NetBSD.org (Postfix, from userid 500) id 039B0FA90; Tue, 15 Nov 2022 08:52:10 +0000 (UTC)
Content-Transfer-Encoding: 7bit
Content-Type: multipart/mixed; boundary="_----------=_16685023292820"
MIME-Version: 1.0
Date: Tue, 15 Nov 2022 08:52:09 +0000
From: "Thomas Klausner"
Subject: CVS commit: pkgsrc/www/py-mod_wsgi
To: pkgsrc-changes@NetBSD.org
Reply-To: wiz@netbsd.org
X-Mailer: log_accum
Message-Id: <20221115085210.039B0FA90@cvs.NetBSD.org>
Sender: pkgsrc-changes-owner@NetBSD.org
List-Id:
Precedence: bulk
List-Unsubscribe:

This is a multi-part message in MIME format.

--_----------=_16685023292820
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="UTF-8"

Module Name: pkgsrc
Committed By: wiz
Date: Tue Nov 15 08:52:09 UTC 2022

Modified Files:
    pkgsrc/www/py-mod_wsgi: Makefile distinfo
Removed Files:
    pkgsrc/www/py-mod_wsgi/patches: patch-configure
        patch-src_server_wsgi__python.h

Log Message:
py-ap24-mod_wsgi: update to 4.9.4.

4.9.4

Bugs Fixed

Apache 2.4.54 changed the default value for LimitRequestBody from 0, which indicates there is no limit, to 1Gi.
If the Apache configuration supplied with a distribution wasn’t explicitly setting LimitRequestBody to 0 at global server scope for the purposes of documenting the default, and it was actually relying on the compiled in default, then when using mod_wsgi daemon mode, if a request body size greater than 1Gi was encountered the mod_wsgi daemon mode process would crash.

Fix ability to build mod_wsgi against Apache 2.2. Do note that in general only recent versions of Apache 2.4 are supported.

4.9.3

Bugs Fixed

When using WSGITrustedProxies and WSGITrustedProxyHeaders in the Apache configuration, or --trust-proxy and --trust-proxy-header options with mod_wsgi-express, if you trusted the X-Client-IP header and a request was received from an untrusted client, the header was not being correctly removed from the set of headers passed through to the WSGI application. This only occurred with the X-Client-IP header and the same problem was not present if trusting the X-Real-IP or X-Forwarded-For headers.

The purpose of this feature for trusting a front end proxy was in this case for the headers:

    X-Client-IP
    X-Real-IP
    X-Forwarded-For

and was designed to allow the value of REMOTE_ADDR passed to the WSGI application to be rewritten to the IP address that a trusted proxy said was the real remote address of the client. In other words, if a request was received from a proxy the IP address of which was trusted, REMOTE_ADDR would be set to the value of the single designated header out of those listed above which was to be trusted.

In the case where the proxy was trusted, in addition to REMOTE_ADDR being rewritten, only the trusted header would be passed through. That is, if X-Real-IP was the trusted header, then HTTP_X_REAL_IP would be passed to the WSGI application, but HTTP_X_CLIENT_IP and HTTP_X_FORWARDED_FOR would be dropped if corresponding headers had also been supplied.

The header used to rewrite REMOTE_ADDR was still passed through, but only for the purpose of documenting where the value of REMOTE_ADDR came from. A WSGI application relying on this feature should only ever use the value of REMOTE_ADDR and should ignore the header passed through.

The behaviour as described therefore assumed that the WSGI application was not at the same time enabling any WSGI or web framework middleware to process the proxy headers a second time, and that REMOTE_ADDR was the single source of truth. That said, the headers which were passed through should have produced the same result for REMOTE_ADDR if the proxy headers were processed a second time.

When the client a request was received from was not a trusted proxy, REMOTE_ADDR would not be rewritten and would be left as the IP of the client, and none of the headers listed above were supposed to be passed through.

Leaving REMOTE_ADDR unchanged was implemented correctly when the client was not a trusted proxy, but of the three headers listed above, HTTP_X_CLIENT_IP was not being dropped if the corresponding header was supplied.

If the WSGI application followed best practice and only relied on the value of REMOTE_ADDR as the source of truth for the remote client address, then the fact that HTTP_X_CLIENT_IP was not being dropped should pose no security risk.

There would however be a problem if a WSGI application was still enabling a WSGI or web framework specific middleware to process the proxy headers a second time even though not required. In this case, the middleware used by the WSGI application may still trust the X-Client-IP header and rewrite REMOTE_ADDR, allowing a malicious client to pretend to have a different IP address.

In addition to the WSGI application having redundant checks for the proxy headers, to take advantage of this, a client would also need direct access to the Apache/mod_wsgi server instance.
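
The intended header handling described above can be modelled in a few lines of Python. This is only an illustrative sketch: mod_wsgi implements the feature in C inside Apache, and the function name `apply_trusted_proxy` and its parameters are hypothetical names chosen for this example. Taking the last entry of a comma-separated X-Forwarded-For value is likewise a simplification assumed here, not a statement of how mod_wsgi parses that header.

```python
# Illustrative model only; not mod_wsgi's implementation.
# CGI-style environ keys for the three client-IP proxy headers.
PROXY_IP_HEADERS = (
    "HTTP_X_CLIENT_IP",
    "HTTP_X_REAL_IP",
    "HTTP_X_FORWARDED_FOR",
)


def apply_trusted_proxy(environ, trusted_proxies, trusted_header):
    """Return a copy of environ with client-IP proxy headers filtered.

    trusted_proxies: set of proxy IP address strings (hypothetical).
    trusted_header: the single environ key to trust, e.g. "HTTP_X_REAL_IP".
    """
    result = dict(environ)
    from_trusted_proxy = environ.get("REMOTE_ADDR") in trusted_proxies
    value = environ.get(trusted_header)

    # Drop all three headers first. For an untrusted client none of
    # them is passed through (the 4.9.3 bug left HTTP_X_CLIENT_IP behind).
    for key in PROXY_IP_HEADERS:
        result.pop(key, None)

    if from_trusted_proxy and value:
        # Simplification: use the last comma-separated entry, i.e. the
        # address appended by the proxy closest to this server.
        result["REMOTE_ADDR"] = value.split(",")[-1].strip()
        # Only the trusted header is passed through, to document where
        # the rewritten REMOTE_ADDR came from.
        result[trusted_header] = value
    return result
```

With this model, an application behind the proxy reads only `REMOTE_ADDR`; a direct request from an untrusted address keeps its own `REMOTE_ADDR` and arrives with all three headers removed.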

In the case that only clients on your private network behind your proxy could access the Apache/mod_wsgi server instance, that would imply any malicious actor already had access to your private network and had access to hosts in that private network or could attach their own device to that private network.

In the case where your Apache/mod_wsgi server instance could be accessed from the same external networks as a proxy forwarding requests to it, such as may occur if making use of a CDN proxy cache, a client would still need to know the direct address used by the Apache/mod_wsgi server instance.

Note that only one proxy header for designating the IP of a client should ever be trusted. If you trust more than one, then which will be used if both are present is undefined, as it is dependent on the order in which Apache processes headers. This hasn’t changed and, as before, to avoid ambiguity you should only trust one of the proxy headers recognised for this purpose.

4.9.2

Bugs Fixed

When using mod_wsgi-express in daemon mode, and source code reloading was enabled, an invalid URL path which contained a byte sequence which could not be decoded as UTF-8 was causing a process crash.

4.9.1

Bugs Fixed

When using --enable-debugger of mod_wsgi-express to enable Pdb, it was failing due to prior changes to run Apache in a sub process to avoid Apache being shutdown when the window size changed. This was because standard input was being detached from Apache and so it was not possible to interact with Pdb. Now when --enable-debugger is used, or any feature which uses --debug-mode, Apache will not be run in a sub process so that you can still use standard input to interact with the process if needed. This does mean that a window size change event will again cause Apache to shutdown in these cases though.

Update code so it compiles on Python 3.11. Python 3.11 makes structures for Python frame objects opaque and requires functions to access struct members.

Features Changed

Historically when a process was being shutdown, mod_wsgi would do its best to destroy any Python sub interpreters as well as the main Python interpreter. This was done in case applications attempted to run any actions on process shutdown via atexit registered callbacks or other means.

Because of changes in Python 3.9, and possibly because mod_wsgi makes use of externally created C threads to handle requests, and not Python native threads, there is now a suspicion that attempting to delete Python sub interpreters can hang. It is believed this may relate to Python core now expecting all Python thread state objects to have been deleted before the Python sub interpreter can be destroyed. If they aren’t then Python core code can block indefinitely. If the issue isn’t the externally created C threads that mod_wsgi uses, it might instead be arising as a problem when a hosted WSGI application creates its own background threads but they are still running when the attempt is made to destroy the sub interpreter.

In the case of using daemon mode the result is that processes can hang on shutdown, but will still at least be deleted after 5 seconds due to how Apache process management will forcibly kill managed processes after 5 seconds if they do not exit cleanly themselves. In other words the issue may not be noticed.

For embedded mode however, the Apache child process can hang around indefinitely, possibly only being deleted if some higher level system application manager such as systemd is able to detect the problem and forcibly delete the hung process.

Although mod_wsgi always attempts to ensure that the externally created C threads are not still handling HTTP requests and thus not active prior to destroying the Python interpreter, it is impossible to guarantee this. Similarly, there is no way to guarantee that background threads created by a WSGI application aren’t still running.
As such, it isn’t possible to safely attempt to delete the Python thread state objects before deleting the Python sub interpreter.

Because of this uncertainty mod_wsgi now provides a way to disable the attempt to destroy the Python sub interpreters or the main Python interpreter when the process is being shutdown. This does mean, though, that atexit registered callbacks will not be called if this option is enabled. It is therefore important that you use mod_wsgi’s own mechanism of being notified when a process is being shutdown to perform any special actions.

    import mod_wsgi

    def shutdown_handler(event, **kwargs):
        # Called by mod_wsgi when the process is being shutdown.
        print('SHUTDOWN-HANDLER', event, kwargs)

    mod_wsgi.subscribe_shutdown(shutdown_handler)

Use of this shutdown notification was necessary anyway to reliably attempt to stop background threads created by the WSGI application, since atexit registered callbacks are not called by Python core until after it thinks all threads have been stopped. In other words, atexit registered callbacks couldn’t be used to reliably stop background threads. Thus use of the mod_wsgi mechanism for performing actions on process shutdown is the preferred way.

Overall it is expected that the majority of users will not notice this change as it is very rare to see WSGI applications want to perform special actions on process shutdown. If you are affected, you should use mod_wsgi’s mechanism to perform special actions on process shutdown.

If you need to enable this mode whereby no attempt is made to destroy the Python interpreter (including sub interpreters) on process shutdown, you can add at global scope in the Apache configuration:

    WSGIDestroyInterpreter Off

If you are using mod_wsgi-express, you can instead supply the command line option --orphan-interpreter.

4.9.0

Bugs Fixed

The mod_wsgi code wouldn’t compile on Python 3.10 as various Python C API functions were removed. Note that the changes required switching to alternate C APIs.
The changes were made for all Python versions back to Python 3.6 and were not conditional on Python 3.10+ being used. This is why the minor version got bumped.

When using the CMMI (configure/make/make install) method for compiling mod_wsgi, if embedded mode was being disabled at compile time, compilation would fail.

When the maximum-requests option was used with mod_wsgi daemon mode, and a graceful restart signal was sent to the daemon process while there was an active request, the process would only shutdown when the graceful timeout period had expired, and not as soon as any active requests had completed, if that had occurred before the graceful timeout had expired.

When using the startup-timeout and restart-interval options of the WSGIDaemonProcess directive together, checking for the expiration time of the startup time was done incorrectly, resulting in process restart being delayed if startup had failed. In the worst case the delay was the lesser of the time periods specified by the options restart-interval, deadlock-timeout, graceful-timeout and eviction-timeout. If request-timeout was defined it would however still be calculated correctly. As request-timeout was by default defined when using mod_wsgi-express, this issue usually only affected mod_wsgi when manually configuring Apache.

Features Changed

Historically when using embedded mode, wsgi.multithread in the WSGI environ dictionary has reported True when any multithread capable Apache MPM was used (e.g., worker, event), even if the current number of configured threads per child process was overridden to be 1. Why this was the case has been forgotten, but generally it wouldn’t matter since no one would ever set up Apache with a multithread MPM and then configure the number of threads to be 1. If that was desired then prefork MPM would be used.
With mod_wsgi-express since 4.8.0 making it much easier to use embedded mode and have a sane configuration used, since it is generated for you, the value of wsgi.multithread has been changed such that it will now correctly report False if using embedded mode, a multithread capable MPM is used, but the number of configured threads is set to 1.

The graceful-timeout option for WSGIDaemonProcess now defaults to 15 seconds. This was always the case when mod_wsgi-express was used but the default was never applied back to the case where mod_wsgi was being configured manually. A default of 15 seconds for graceful-timeout is being added to avoid the problem where sending a SIGUSR1 to a daemon mode process would never see the process shutdown due to there never being a time when there were no active requests. This might occur when there was a stuck request that never completed, or numerous long running requests which always overlapped in time, meaning the process was never idle. You can still force graceful-timeout to be 0 to restore the original behaviour, but that is probably not recommended.

4.8.0

Bugs Fixed

Fixed potential for process crash on Apache startup when the WSGI script file or other Python script file was being preloaded. This was triggered when WSGIImportScript was used, or if WSGIScriptAlias or WSGIScriptAliasMatch were used and both the process-group and application-group options were used with those directives. The potential for this problem arising was extremely high on Alpine Linux, but seems to be very rare on a full Linux or macOS distribution where glibc was being used.

Include a potential workaround so that virtual environments work on Windows. Use of virtual environments in embedded systems on Windows has been broken ever since python -m venv was introduced. Initially virtualenv was not affected, although when it changed to use the new style Python virtual environment layout the same as python -m venv it also broke.
This happened around the introduction of virtualenv version 20.0.0. The underlying cause is lack of support for using virtual environments in CPython for the new style virtual environments. The bug has existed in CPython since back in 2014 and has not been fixed. For details of the issue see https://bugs.python.org/issue22213. For non-Windows systems a workaround had been used to resolve the problem, but the same workaround has never worked on Windows. The change in this version tries a different workaround for Windows environments.

Added a workaround for the fact that Python doesn’t actually set the _main_thread attribute of the threading module to the main thread which initialized the main interpreter or sub interpreter, but to the first thread that imports the threading module. In an embedded system such as mod_wsgi it could be a request thread, not the main thread, that would import the threading module. This issue was causing the asgiref module used in Django to fail when using signal.set_wakeup_fd() as code was thinking it was in the main thread when it wasn’t. See https://github.com/django/asgiref/issues/143.

Using WSGILazyInitialization Off would cause Python to abort the Apache parent process. The issue has been resolved, but you are warned that you should not be using this option anyway as it is dangerous and opens up security holes with the potential for user code to run as the root user when Python is initialized.

Fix a Python deprecation warning for PyArg_ParseTuple() which would cause the process to crash when deprecation warnings were turned on globally for an application. The crash was occurring whenever anything was output to the Apache error log via print().

Features Changed

The --isatty option of mod_wsgi-express has been removed and the behaviour enabled by the option is now the default.
The default behaviour is now that if mod_wsgi-express is run in an interactive terminal, then Apache will be started within a sub process of the mod_wsgi-express script and the SIGWINCH signal will be blocked and not passed through to Apache. This means that a window resizing event will no longer cause mod_wsgi-express to shutdown unexpectedly.

When trying to set resource limits and they can’t be set, the system error number will now be included in the error message.

New Features

Added the mod_wsgi.subscribe_shutdown() function for registering a callback to be called when the process is being shutdown. This is needed because atexit.register() doesn’t work as required for the main Python interpreter, specifically the atexit callback isn’t called before the main interpreter thread attempts to wait on threads on shutdown, thus preventing one from shutting down daemon threads and waiting on them.

This feature to get a callback on process shutdown was previously available by using mod_wsgi.subscribe_events(), but that would also report events to the callback on requests as they happen, thus adding extra overhead if not using the request events. The new registration function can thus be used where you are only interested in the event for the process being shutdown.

Added an --embedded-mode option to mod_wsgi-express to make it easier to force it into embedded mode for high throughput, CPU bound applications with minimal response times. In this case the number of Apache child worker processes used for embedded mode will be dictated by the --processes and --threads options, completely overriding any automatic mechanism to set those parameters. Any auto scaling done by Apache for the child worker processes will also be disabled. This gives preference to using Apache worker MPM instead of event MPM, as event MPM doesn’t work correctly when told to run with less than three threads per process.
You can switch back to using event MPM by using the --server-mpm option, but will need to ensure that you have three threads per process or more.

Locking of the Python global interpreter lock has been reviewed with changes resulting in a reduction in overhead, or otherwise changing the interaction between threads such that at high request rate with a hello world application, a greater request throughput can be achieved. How much improvement you see with your own applications will depend on what your application does and whether you have short response times to begin with. If you have an I/O bound application with long response times you likely aren’t going to see any difference.

Internal metrics collection has been improved with additional information provided in process metrics and a new request metrics feature added giving access to aggregated metrics over the time of a reporting period. This includes bucketed time data on requests so you can calculate the distribution of server, queue and application time. Note that the new request metrics feature is still a work in progress and may be modified or enhanced, causing breaking changes in the format of data returned.

Added hidden experimental support for running mod_wsgi-express start-server on Windows. It will not show in the list of sub commands mod_wsgi-express accepts on Windows, but it is there. There are still various issues that need to be sorted out, but this needs assistance from someone who knows more about programming Python on Windows and Windows programming in general to get it all working properly. If you are interested in helping, reach out on the mod_wsgi mailing list.
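
The notes above explain that mod_wsgi.subscribe_shutdown() is the reliable way to stop background threads, since atexit callbacks run too late. A minimal sketch of that pattern follows. It assumes a hypothetical periodic worker (`background_worker`) and a `stop_requested` flag, both names invented for this example; only `mod_wsgi.subscribe_shutdown()` and the `(event, **kwargs)` callback shape come from the release notes. The import is guarded because the function is only available when the code runs inside an Apache/mod_wsgi process.

```python
import threading

try:
    import mod_wsgi  # provides subscribe_shutdown() inside Apache/mod_wsgi
except ImportError:
    mod_wsgi = None  # running outside mod_wsgi, e.g. in a test

# Flag used to ask the worker to finish (hypothetical name).
stop_requested = threading.Event()


def background_worker():
    # Hypothetical periodic task: wake once a second, or immediately
    # when stop_requested is set, in which case wait() returns True.
    while not stop_requested.wait(timeout=1.0):
        pass  # periodic work would go here


worker = threading.Thread(target=background_worker, daemon=True)
worker.start()


def shutdown_handler(event, **kwargs):
    # Ask the worker to stop, then give it a bounded time to exit so
    # process shutdown is never blocked indefinitely.
    stop_requested.set()
    worker.join(timeout=5.0)


# Register only when the mod_wsgi function is actually available.
if mod_wsgi is not None and hasattr(mod_wsgi, "subscribe_shutdown"):
    mod_wsgi.subscribe_shutdown(shutdown_handler)
```

The bounded join is a design choice for this sketch: a worker that ignores the flag delays shutdown by at most five seconds instead of hanging the process.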

To generate a diff of this commit:
cvs rdiff -u -r1.22 -r1.23 pkgsrc/www/py-mod_wsgi/Makefile
cvs rdiff -u -r1.19 -r1.20 pkgsrc/www/py-mod_wsgi/distinfo
cvs rdiff -u -r1.1 -r0 pkgsrc/www/py-mod_wsgi/patches/patch-configure \
    pkgsrc/www/py-mod_wsgi/patches/patch-src_server_wsgi__python.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

--_----------=_16685023292820
Content-Disposition: inline
Content-Length: 1745
Content-Transfer-Encoding: binary
Content-Type: text/x-diff; charset=us-ascii

Modified files:

Index: pkgsrc/www/py-mod_wsgi/Makefile
diff -u pkgsrc/www/py-mod_wsgi/Makefile:1.22 pkgsrc/www/py-mod_wsgi/Makefile:1.23
--- pkgsrc/www/py-mod_wsgi/Makefile:1.22 Wed Jan 5 20:47:37 2022
+++ pkgsrc/www/py-mod_wsgi/Makefile Tue Nov 15 08:52:09 2022
@@ -1,8 +1,7 @@
-# $NetBSD: Makefile,v 1.22 2022/01/05 20:47:37 wiz Exp $
+# $NetBSD: Makefile,v 1.23 2022/11/15 08:52:09 wiz Exp $
 
-DISTNAME= mod_wsgi-4.7.1
+DISTNAME= mod_wsgi-4.9.4
 PKGNAME= ${PYPKGPREFIX}-${APACHE_PKG_PREFIX}-${DISTNAME}
-PKGREVISION= 2
 CATEGORIES= www python
 MASTER_SITES= ${MASTER_SITE_PYPI:=m/mod_wsgi/}
 

Index: pkgsrc/www/py-mod_wsgi/distinfo
diff -u pkgsrc/www/py-mod_wsgi/distinfo:1.19 pkgsrc/www/py-mod_wsgi/distinfo:1.20
--- pkgsrc/www/py-mod_wsgi/distinfo:1.19 Sun Dec 19 14:12:48 2021
+++ pkgsrc/www/py-mod_wsgi/distinfo Tue Nov 15 08:52:09 2022
@@ -1,7 +1,5 @@
-$NetBSD: distinfo,v 1.19 2021/12/19 14:12:48 wiz Exp $
+$NetBSD: distinfo,v 1.20 2022/11/15 08:52:09 wiz Exp $
 
-BLAKE2s (mod_wsgi-4.7.1.tar.gz) = f74642297af9ecfed637416cb118885acc11c3d1d8715bbd92d47503e3d009ed
-SHA512 (mod_wsgi-4.7.1.tar.gz) = 2c9d83737fe0ca5c599d3915e47047db2d06880ac3721c94350cd2d9ae930c20058e350f07c918dd301e50bf3433480e1bad479f4ffd382e6b2e42675352734e
-Size (mod_wsgi-4.7.1.tar.gz) = 498301 bytes
-SHA1 (patch-configure) = 7ece56413dfcb8de755dab722ebac632f3d1166f
-SHA1 (patch-src_server_wsgi__python.h) = 70c153e55642714d7172246748b6e4038e87d831
+BLAKE2s (mod_wsgi-4.9.4.tar.gz) = 39922e1c24dba83ca3e14288d722127fc0c2ad07c12e4ce15d5c24f55fbc6222
+SHA512 (mod_wsgi-4.9.4.tar.gz) = e99c062a8fa9fdb9ce50f8d902ff9c9b572f7c470bfad6db0ad34b52ee476814845331ebded86c62e335bbfd8887c56a3a62109c332a951f883314b8350a3ae4
+Size (mod_wsgi-4.9.4.tar.gz) = 497531 bytes

--_----------=_16685023292820--