Date: Wed, 13 Feb 2019 08:38:25 +0000
From: "Maxime Villard"
Subject: CVS commit: src/sys/arch/x86
To: source-changes@NetBSD.org

Module Name:	src
Committed By:	maxv
Date:		Wed Feb 13 08:38:25 UTC 2019

Modified Files:
	src/sys/arch/x86/include: pmap.h
	src/sys/arch/x86/x86: pmap.c

Log Message:
Add the EPT pmap code, used by Intel-VMX.

The idea is that under NVMM, we don't want to implement the hypervisor page
tables manually in NVMM directly, because we want pageable guests; that is,
we want to allow UVM to unmap guest pages when the host comes under
pressure.

Contrary to AMD-SVM, Intel-VMX uses a different set of PTE bits from
native, and this has three important consequences:

 - We can't use the native PTE bits, so each time we want to modify the
   page tables, we need to know whether we're dealing with a native pmap
   or an EPT pmap. This is accomplished with callbacks that handle
   everything PTE-related.

 - There is no recursive slot possible, so we can't use pmap_map_ptes().
   Rather, we walk down the EPT trees via the direct map, and that's
   actually a lot simpler (and probably faster too...).

 - The kernel is never mapped in an EPT pmap. An EPT pmap cannot be loaded
   on the host. This has two sub-consequences: at creation time we must
   zero out all of the top-level PTEs, and at destruction time we force
   the page out of the pool cache and into the pool, to ensure that the
   next allocation will invoke pmap_pdp_ctor() to create a native pmap
   and not recycle some stale EPT entries.

To create an EPT pmap, the caller must invoke pmap_ept_transform() on a
newly-allocated native pmap. And that's about it: from then on the EPT
callbacks will be invoked, and the pmap can be destroyed via the usual
pmap_destroy(). The TLB shootdown callback is not initialized, however; it
is the responsibility of the hypervisor (NVMM) to set it.

There are some twisted cases that we need to handle. For example, if
pmap_is_referenced() is called on a physical page that is entered both by
a native pmap and by an EPT pmap, we take the Accessed bits from the two
pmaps using different PTE sets in each case, and combine them into a
generic PP_ATTRS_U flag (that does not depend on the pmap type).

Given that the EPT layout is a 4-level tree with the same address space as
native x86_64, we allow ourselves to use a few native macros in EPT, such
as pmap_pa2pte(), rather than re-defining them with "ept" in the name.

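The Accessed-bit case above can be illustrated with a small user-space
sketch. It is not the code from pmap.c: the struct layout, the function
names and the SKETCH_ATTRS_* values are hypothetical and exist only for
this example. The only hardware facts assumed are the bit positions:
Accessed and Dirty are bits 5 and 6 in a native x86 PTE, and bits 8 and 9
in an EPT entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pt_entry_t;

/* Accessed/Dirty bit positions in native x86 PTEs and in EPT entries. */
#define NATIVE_PTE_A	(UINT64_C(1) << 5)
#define NATIVE_PTE_D	(UINT64_C(1) << 6)
#define EPT_PTE_A	(UINT64_C(1) << 8)
#define EPT_PTE_D	(UINT64_C(1) << 9)

/* Generic, pmap-independent attributes (values are illustrative only). */
#define SKETCH_ATTRS_U	0x1u	/* used (accessed) */
#define SKETCH_ATTRS_M	0x2u	/* modified (dirty) */

/* Hypothetical pmap: the callback is NULL for a native pmap. */
struct sketch_pmap {
	unsigned (*pte_to_attrs)(pt_entry_t);
};

/* Hypothetical record of one mapping of a physical page. */
struct sketch_mapping {
	const struct sketch_pmap *pmap;
	pt_entry_t pte;
};

static unsigned
native_pte_to_attrs(pt_entry_t pte)
{
	unsigned attrs = 0;

	if (pte & NATIVE_PTE_A)
		attrs |= SKETCH_ATTRS_U;
	if (pte & NATIVE_PTE_D)
		attrs |= SKETCH_ATTRS_M;
	return attrs;
}

static unsigned
ept_pte_to_attrs(pt_entry_t pte)
{
	unsigned attrs = 0;

	if (pte & EPT_PTE_A)
		attrs |= SKETCH_ATTRS_U;
	if (pte & EPT_PTE_D)
		attrs |= SKETCH_ATTRS_M;
	return attrs;
}

/*
 * Combine the attributes of every mapping of one physical page. Each PTE
 * is decoded with the bit set of the pmap that owns it, so the result no
 * longer depends on the pmap type.
 */
static unsigned
sketch_page_attrs(const struct sketch_mapping *m, size_t n)
{
	unsigned attrs = 0;

	assert(m != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct sketch_pmap *pm = m[i].pmap;

		if (pm->pte_to_attrs != NULL)	/* EPT pmap: use its callback */
			attrs |= pm->pte_to_attrs(m[i].pte);
		else				/* native pmap: common case */
			attrs |= native_pte_to_attrs(m[i].pte);
	}
	return attrs;
}
```

A page whose only Accessed bit lives in an EPT entry still reports the
generic "used" attribute, while bit 8 in a native PTE (or bit 5 in an EPT
entry) is not misread as Accessed, which is why the native bit masks cannot
simply be reused.
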
Even though this EPT code is rather complex, it is not too intrusive: just
a few callbacks in a few pmap functions, predicted-false to give priority
to native. So this comes with no messy #ifdef or performance cost.


To generate a diff of this commit:
cvs rdiff -u -r1.96 -r1.97 src/sys/arch/x86/include/pmap.h
cvs rdiff -u -r1.321 -r1.322 src/sys/arch/x86/x86/pmap.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.