Tue Oct 17 08:42:30 2017 UTC
Update xentools48 and xenkernel48 to 4.8.2, and apply security patches up
to XSA244. Keep PKGREVISION at 1 to account for the fact that it is
not a stock Xen 4.8.2.

Note that, unlike upstream, pv-linear-pt defaults to true, so that
NetBSD PV guests (including dom0) will continue to boot without changes
to boot.cfg.
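
As an example, an administrator who prefers the stricter upstream
default could pass the option on the hypervisor line of boot.cfg. The
entry below is a hypothetical sketch (the kernel name and dom0_mem
value are placeholders for your own configuration); with it, NetBSD PV
guests will no longer boot:

	menu=Xen:load /netbsd-XEN3_DOM0.gz console=pc;multiboot /xen.gz dom0_mem=512M pv-linear-pt=false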


(bouyer)
diff -r1.1 -r1.2 pkgsrc/sysutils/xenkernel48/MESSAGE
diff -r1.5 -r1.6 pkgsrc/sysutils/xenkernel48/Makefile
diff -r1.2 -r1.3 pkgsrc/sysutils/xenkernel48/distinfo
diff -r1.1 -r0 pkgsrc/sysutils/xenkernel48/patches/patch-XSA-212
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA231
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA232
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA234
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA237
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA238
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA239
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA240
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA241
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA242
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA243
diff -r0 -r1.1 pkgsrc/sysutils/xenkernel48/patches/patch-XSA244
diff -r1.7 -r1.8 pkgsrc/sysutils/xentools48/Makefile
diff -r1.3 -r1.4 pkgsrc/sysutils/xentools48/distinfo
diff -r1.1 -r0 pkgsrc/sysutils/xentools48/patches/patch-XSA-211-1
diff -r1.1 -r0 pkgsrc/sysutils/xentools48/patches/patch-XSA-211-2
diff -r0 -r1.1 pkgsrc/sysutils/xentools48/patches/patch-XSA233
diff -r0 -r1.1 pkgsrc/sysutils/xentools48/patches/patch-XSA240

cvs diff -r1.1 -r1.2 pkgsrc/sysutils/xenkernel48/Attic/MESSAGE

--- pkgsrc/sysutils/xenkernel48/Attic/MESSAGE 2017/03/30 09:15:09 1.1
+++ pkgsrc/sysutils/xenkernel48/Attic/MESSAGE 2017/10/17 08:42:30 1.2
@@ -1,7 +1,11 @@
 ===========================================================================
-$NetBSD: MESSAGE,v 1.1 2017/03/30 09:15:09 bouyer Exp $
+$NetBSD: MESSAGE,v 1.2 2017/10/17 08:42:30 bouyer Exp $
 
 The Xen hypervisor is installed under the following locations:
 	${XENKERNELDIR}/xen.gz (standard hypervisor)
 	${XENKERNELDIR}/xen-debug.gz (debug hypervisor)
+
+Note that unlike upstream Xen, pv-linear-pt defaults to true.
+You can disable it using pv-linear-pt=false on the Xen command line,
+but then you can't boot NetBSD in PV mode.
 ===========================================================================

cvs diff -r1.5 -r1.6 pkgsrc/sysutils/xenkernel48/Attic/Makefile

--- pkgsrc/sysutils/xenkernel48/Attic/Makefile 2017/07/24 08:53:45 1.5
+++ pkgsrc/sysutils/xenkernel48/Attic/Makefile 2017/10/17 08:42:30 1.6
@@ -1,16 +1,16 @@
-# $NetBSD: Makefile,v 1.5 2017/07/24 08:53:45 maya Exp $
+# $NetBSD: Makefile,v 1.6 2017/10/17 08:42:30 bouyer Exp $
 
-VERSION=	4.8.0
+VERSION=	4.8.2
 DISTNAME=	xen-${VERSION}
 PKGNAME=	xenkernel48-${VERSION}
 PKGREVISION=	1
 CATEGORIES=	sysutils
 MASTER_SITES=	https://downloads.xenproject.org/release/xen/${VERSION}/
 DIST_SUBDIR=	xen48
 
 MAINTAINER=	bouyer@NetBSD.org
 HOMEPAGE=	http://xenproject.org/
 COMMENT=	Xen 4.8.x Kernel
 
 LICENSE=	gnu-gpl-v2
 

cvs diff -r1.2 -r1.3 pkgsrc/sysutils/xenkernel48/Attic/distinfo

--- pkgsrc/sysutils/xenkernel48/Attic/distinfo 2017/04/08 12:30:42 1.2
+++ pkgsrc/sysutils/xenkernel48/Attic/distinfo 2017/10/17 08:42:30 1.3
@@ -1,13 +1,23 @@
-$NetBSD: distinfo,v 1.2 2017/04/08 12:30:42 spz Exp $
+$NetBSD: distinfo,v 1.3 2017/10/17 08:42:30 bouyer Exp $
 
-SHA1 (xen48/xen-4.8.0.tar.gz) = c2403899b13e1e8b8da391aceecbfc932d583a88
-RMD160 (xen48/xen-4.8.0.tar.gz) = b79b1e2587caa9c6fe68d2996a4fd42f95c1fe7b
-SHA512 (xen48/xen-4.8.0.tar.gz) = 70b95553f9813573b12e52999a4df8701dec430f23c36a8dc70d25a46bb4bc9234e5b7feb74a04062af4c8d6b6bcfe947d90b2b172416206812e54bac9797454
-Size (xen48/xen-4.8.0.tar.gz) = 22499917 bytes
+SHA1 (xen48/xen-4.8.2.tar.gz) = 184c57ce9e71e34b3cbdd318524021f44946efbe
+RMD160 (xen48/xen-4.8.2.tar.gz) = f4126cb0f7ff427ed7d20ce399dcd1077c599343
+SHA512 (xen48/xen-4.8.2.tar.gz) = 7805531f73d23ecfff3439770e62d387f4254a444875670d53a0a739323e5d4d8f8fcc478f8936ee1ae8aff3e0229549e47c01c606365a8ce060dd5c503e87da
+Size (xen48/xen-4.8.2.tar.gz) = 22522336 bytes
 SHA1 (patch-Config.mk) = abf55aa58792315e758ee3785a763cfa8c2da68f
-SHA1 (patch-XSA-212) = 4637d51bcbb3b11fb0e22940f824ebacdaa15b4f
+SHA1 (patch-XSA231) = fc249a68ea53064ff7d95f24380f66f3fc3393e7
+SHA1 (patch-XSA232) = 86d633941ac3165ca4034db660a48d60384ea252
+SHA1 (patch-XSA234) = acf4170a410d9f314c0cc0c5c092db6bb6cc69a0
+SHA1 (patch-XSA237) = 3125554b155bd650480934a37d89d1a7471dfb20
+SHA1 (patch-XSA238) = 58b6fcb73d314d7f06256ed3769210e49197aa90
+SHA1 (patch-XSA239) = 10619718e8a1536a7f52eb3838cdb490e6ba8c97
+SHA1 (patch-XSA240) = dca90d33d30167edbe07071795f18159e3e20c57
+SHA1 (patch-XSA241) = b506425ca7382190435df6f96800cb0a24aff23e
+SHA1 (patch-XSA242) = afff314771d78ee2482aec3b7693c12bfe00e0ec
+SHA1 (patch-XSA243) = 75eef49628bc0b3bd4fe8b023cb2da75928103a7
+SHA1 (patch-XSA244) = 2739ff8a920630088853a9076f71ca2caf639320
 SHA1 (patch-xen_Makefile) = be3f4577a205b23187b91319f91c50720919f70b
 SHA1 (patch-xen_Rules.mk) = 5f33a667bae67c85d997a968c0f8b014b707d13c
 SHA1 (patch-xen_arch_x86_Rules.mk) = e2d148fb308c37c047ca41a678471217b6166977
 SHA1 (patch-xen_arch_x86_boot_build32.mk) = 7fa0d64e88e3be0330dac9a2ddc8b0114fd7d4a5
 SHA1 (patch-xen_tools_symbols.c) = fdc7e4aa7b8db0854987c9d0e60c254bb9f5af4e

File Deleted: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA-212

File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA231
$NetBSD: patch-XSA231,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: George Dunlap <george.dunlap@citrix.com>
Subject: xen/mm: make sure node is less than MAX_NUMNODES

The output of MEMF_get_node(memflags) can be as large as nodeid_t can
hold (currently 255).  It is then used as an index into arrays of size
MAX_NUMNODES (64 on x86, 1 on ARM), can be passed in by an untrusted
guest (via memory_exchange and increase_reservation), and is not
currently bounds-checked.

Check the value in page_alloc.c before using it, and also check the
value in the hypercall call sites and return -EINVAL if appropriate.
Don't permit domains other than the hardware or control domain to
allocate node-constrained memory.

This is XSA-231.

Reported-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- xen/common/memory.c.orig
+++ xen/common/memory.c
@@ -411,6 +411,31 @@ static void decrease_reservation(struct
     a->nr_done = i;
 }
 
+static bool propagate_node(unsigned int xmf, unsigned int *memflags)
+{
+    const struct domain *currd = current->domain;
+
+    BUILD_BUG_ON(XENMEMF_get_node(0) != NUMA_NO_NODE);
+    BUILD_BUG_ON(MEMF_get_node(0) != NUMA_NO_NODE);
+
+    if ( XENMEMF_get_node(xmf) == NUMA_NO_NODE )
+        return true;
+
+    if ( is_hardware_domain(currd) || is_control_domain(currd) )
+    {
+        if ( XENMEMF_get_node(xmf) >= MAX_NUMNODES )
+            return false;
+
+        *memflags |= MEMF_node(XENMEMF_get_node(xmf));
+        if ( xmf & XENMEMF_exact_node_request )
+            *memflags |= MEMF_exact_node;
+    }
+    else if ( xmf & XENMEMF_exact_node_request )
+        return false;
+
+    return true;
+}
+
 static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
 {
     struct xen_memory_exchange exch;
@@ -483,6 +508,12 @@ static long memory_exchange(XEN_GUEST_HA
         }
     }
 
+    if ( unlikely(!propagate_node(exch.out.mem_flags, &memflags)) )
+    {
+        rc = -EINVAL;
+        goto fail_early;
+    }
+
     d = rcu_lock_domain_by_any_id(exch.in.domid);
     if ( d == NULL )
     {
@@ -501,7 +532,6 @@ static long memory_exchange(XEN_GUEST_HA
         d,
         XENMEMF_get_address_bits(exch.out.mem_flags) ? :
         (BITS_PER_LONG+PAGE_SHIFT)));
-    memflags |= MEMF_node(XENMEMF_get_node(exch.out.mem_flags));
 
     for ( i = (exch.nr_exchanged >> in_chunk_order);
           i < (exch.in.nr_extents >> in_chunk_order);
@@ -864,12 +894,8 @@ static int construct_memop_from_reservat
         }
         read_unlock(&d->vnuma_rwlock);
     }
-    else
-    {
-        a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
-        if ( r->mem_flags & XENMEMF_exact_node_request )
-            a->memflags |= MEMF_exact_node;
-    }
+    else if ( unlikely(!propagate_node(r->mem_flags, &a->memflags)) )
+        return -EINVAL;
 
     return 0;
 }
--- xen/common/page_alloc.c.orig
+++ xen/common/page_alloc.c
@@ -706,9 +706,13 @@ static struct page_info *alloc_heap_page
         if ( node >= MAX_NUMNODES )
             node = cpu_to_node(smp_processor_id());
     }
+    else if ( unlikely(node >= MAX_NUMNODES) )
+    {
+        ASSERT_UNREACHABLE();
+        return NULL;
+    }
     first_node = node;
 
-    ASSERT(node < MAX_NUMNODES);
     ASSERT(zone_lo <= zone_hi);
     ASSERT(zone_hi < NR_ZONES);
 

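The heart of the XSA-231 fix is refusing an untrusted node ID before it
is ever used as an array index. A minimal standalone sketch of the
pattern, with made-up names and sizes (not the actual Xen code):

    #include <stdbool.h>

    #define MAX_NUMNODES 64            /* array size; nodeid_t can hold up to 255 */

    static unsigned long per_node_pages[MAX_NUMNODES];

    /* Reject out-of-range node IDs, as propagate_node() does for guests. */
    static bool node_is_valid(unsigned int node)
    {
        return node < MAX_NUMNODES;
    }

    static long handle_request(unsigned int node)   /* node comes from the guest */
    {
        if ( !node_is_valid(node) )
            return -22;                /* -EINVAL, as at the hypercall call sites */
        per_node_pages[node]++;        /* index is now provably in bounds */
        return 0;
    }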
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA232
$NetBSD: patch-XSA232,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: grant_table: fix GNTTABOP_cache_flush handling

Don't fall over a NULL grant_table pointer when the owner of the domain
is a system domain (DOMID_{XEN,IO} etc).

This is XSA-232.

Reported-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

--- xen/common/grant_table.c.orig
+++ xen/common/grant_table.c
@@ -3053,7 +3053,7 @@ static int cache_flush(gnttab_cache_flus
 
     page = mfn_to_page(mfn);
     owner = page_get_owner_and_reference(page);
-    if ( !owner )
+    if ( !owner || !owner->grant_table )
     {
         rcu_unlock_domain(d);
         return -EPERM;

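The XSA-232 crash was a NULL dereference: system domains such as
DOMID_XEN own pages but have no grant table. A tiny sketch of the added
guard, with stand-in types (not Xen's definitions):

    #include <stddef.h>

    struct grant_table;                         /* opaque for this sketch */
    struct domain { struct grant_table *grant_table; };

    /* Fail unless the page owner exists and actually has a grant table;
     * system domains (DOMID_XEN, DOMID_IO, ...) leave it NULL. */
    static int owner_ok(const struct domain *owner)
    {
        return owner != NULL && owner->grant_table != NULL;
    }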
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA234
$NetBSD: patch-XSA234,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Jan Beulich <jbeulich@suse.com>
Subject: gnttab: also validate PTE permissions upon destroy/replace

In order for PTE handling to match up with the reference counting done
by common code, presence and writability of grant mapping PTEs must
also be taken into account; validating just the frame number is not
enough. This is in particular relevant if a guest fiddles with grant
PTEs via non-grant hypercalls.

Note that the flags being passed to replace_grant_host_mapping()
already happen to be those of the existing mapping, so no new function
parameter is needed.

This is XSA-234.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- xen/arch/x86/mm.c.orig
+++ xen/arch/x86/mm.c
@@ -4017,7 +4017,8 @@ static int create_grant_pte_mapping(
 }
 
 static int destroy_grant_pte_mapping(
-    uint64_t addr, unsigned long frame, struct domain *d)
+    uint64_t addr, unsigned long frame, unsigned int grant_pte_flags,
+    struct domain *d)
 {
     int rc = GNTST_okay;
     void *va;
@@ -4063,16 +4064,27 @@ static int destroy_grant_pte_mapping(
 
     ol1e = *(l1_pgentry_t *)va;
     
-    /* Check that the virtual address supplied is actually mapped to frame. */
-    if ( unlikely(l1e_get_pfn(ol1e) != frame) )
+    /*
+     * Check that the PTE supplied actually maps frame (with appropriate
+     * permissions).
+     */
+    if ( unlikely(l1e_get_pfn(ol1e) != frame) ||
+         unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
+                  (_PAGE_PRESENT | _PAGE_RW)) )
     {
         page_unlock(page);
-        MEM_LOG("PTE entry %lx for address %"PRIx64" doesn't match frame %lx",
-                (unsigned long)l1e_get_intpte(ol1e), addr, frame);
+        MEM_LOG("PTE %"PRIpte" at %"PRIx64" doesn't match grant (%"PRIpte")",
+                l1e_get_intpte(ol1e), addr,
+                l1e_get_intpte(l1e_from_pfn(frame, grant_pte_flags)));
         rc = GNTST_general_error;
         goto failed;
     }
 
+    if ( unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
+                  ~(_PAGE_AVAIL | PAGE_CACHE_ATTRS)) )
+        MEM_LOG("PTE flags %x at %"PRIx64" don't match grant (%x)\n",
+                l1e_get_flags(ol1e), addr, grant_pte_flags);
+
     /* Delete pagetable entry. */
     if ( unlikely(!UPDATE_ENTRY
                   (l1, 
@@ -4081,7 +4093,7 @@ static int destroy_grant_pte_mapping(
                    0)) )
     {
         page_unlock(page);
-        MEM_LOG("Cannot delete PTE entry at %p", va);
+        MEM_LOG("Cannot delete PTE entry at %"PRIx64, addr);
         rc = GNTST_general_error;
         goto failed;
     }
@@ -4149,7 +4161,8 @@ static int create_grant_va_mapping(
 }
 
 static int replace_grant_va_mapping(
-    unsigned long addr, unsigned long frame, l1_pgentry_t nl1e, struct vcpu *v)
+    unsigned long addr, unsigned long frame, unsigned int grant_pte_flags,
+    l1_pgentry_t nl1e, struct vcpu *v)
 {
     l1_pgentry_t *pl1e, ol1e;
     unsigned long gl1mfn;
@@ -4185,19 +4198,30 @@ static int replace_grant_va_mapping(
 
     ol1e = *pl1e;
 
-    /* Check that the virtual address supplied is actually mapped to frame. */
-    if ( unlikely(l1e_get_pfn(ol1e) != frame) )
-    {
-        MEM_LOG("PTE entry %lx for address %lx doesn't match frame %lx",
-                l1e_get_pfn(ol1e), addr, frame);
+    /*
+     * Check that the virtual address supplied is actually mapped to frame
+     * (with appropriate permissions).
+     */
+    if ( unlikely(l1e_get_pfn(ol1e) != frame) ||
+         unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
+                  (_PAGE_PRESENT | _PAGE_RW)) )
+    {
+        MEM_LOG("PTE %"PRIpte" for %lx doesn't match grant (%"PRIpte")",
+                l1e_get_intpte(ol1e), addr,
+                l1e_get_intpte(l1e_from_pfn(frame, grant_pte_flags)));
         rc = GNTST_general_error;
         goto unlock_and_out;
     }
 
+    if ( unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
+                  ~(_PAGE_AVAIL | PAGE_CACHE_ATTRS)) )
+        MEM_LOG("PTE flags %x for %"PRIx64" don't match grant (%x)",
+                l1e_get_flags(ol1e), addr, grant_pte_flags);
+
     /* Delete pagetable entry. */
     if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v, 0)) )
     {
-        MEM_LOG("Cannot delete PTE entry at %p", (unsigned long *)pl1e);
+        MEM_LOG("Cannot delete PTE entry for %"PRIx64, addr);
         rc = GNTST_general_error;
         goto unlock_and_out;
     }
@@ -4211,9 +4235,11 @@ static int replace_grant_va_mapping(
 }
 
 static int destroy_grant_va_mapping(
-    unsigned long addr, unsigned long frame, struct vcpu *v)
+    unsigned long addr, unsigned long frame, unsigned int grant_pte_flags,
+    struct vcpu *v)
 {
-    return replace_grant_va_mapping(addr, frame, l1e_empty(), v);
+    return replace_grant_va_mapping(addr, frame, grant_pte_flags,
+                                    l1e_empty(), v);
 }
 
 static int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
@@ -4307,21 +4333,40 @@ int replace_grant_host_mapping(
     unsigned long gl1mfn;
     struct page_info *l1pg;
     int rc;
+    unsigned int grant_pte_flags;
     
     if ( paging_mode_external(current->domain) )
         return replace_grant_p2m_mapping(addr, frame, new_addr, flags);
 
+    grant_pte_flags =
+        _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_GNTTAB | _PAGE_NX;
+
+    if ( flags & GNTMAP_application_map )
+        grant_pte_flags |= _PAGE_USER;
+    if ( !(flags & GNTMAP_readonly) )
+        grant_pte_flags |= _PAGE_RW;
+    /*
+     * On top of the explicit settings done by create_grant_host_mapping()
+     * also open-code relevant parts of adjust_guest_l1e(). Don't mirror
+     * available and cachability flags, though.
+     */
+    if ( !is_pv_32bit_domain(curr->domain) )
+        grant_pte_flags |= (grant_pte_flags & _PAGE_USER)
+                           ? _PAGE_GLOBAL
+                           : _PAGE_GUEST_KERNEL | _PAGE_USER;
+
     if ( flags & GNTMAP_contains_pte )
     {
         if ( !new_addr )
-            return destroy_grant_pte_mapping(addr, frame, curr->domain);
+            return destroy_grant_pte_mapping(addr, frame, grant_pte_flags,
+                                             curr->domain);
         
         MEM_LOG("Unsupported grant table operation");
         return GNTST_general_error;
     }
 
     if ( !new_addr )
-        return destroy_grant_va_mapping(addr, frame, curr);
+        return destroy_grant_va_mapping(addr, frame, grant_pte_flags, curr);
 
     pl1e = guest_map_l1e(new_addr, &gl1mfn);
     if ( !pl1e )
@@ -4369,7 +4414,7 @@ int replace_grant_host_mapping(
     put_page(l1pg);
     guest_unmap_l1e(pl1e);
 
-    rc = replace_grant_va_mapping(addr, frame, ol1e, curr);
+    rc = replace_grant_va_mapping(addr, frame, grant_pte_flags, ol1e, curr);
     if ( rc && !paging_mode_refcounts(curr->domain) )
         put_page_from_l1e(ol1e, curr->domain);
 

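The new checks rely on an XOR-and-mask idiom: XOR turns every differing
bit into a 1, and the mask restricts the comparison to the bits that
matter. A small illustration with invented flag values (not Xen's
definitions):

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_PRESENT 0x001u        /* illustrative values only */
    #define PAGE_RW      0x002u

    /* True if 'flags' and 'expected' agree in the present and write bits;
     * unrelated flag bits are masked out and cannot cause a mismatch. */
    static bool pte_perms_match(uint32_t flags, uint32_t expected)
    {
        return !((flags ^ expected) & (PAGE_PRESENT | PAGE_RW));
    }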
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA237
$NetBSD: patch-XSA237,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Jan Beulich <jbeulich@suse.com>
Subject: x86: don't allow MSI pIRQ mapping on unowned device

MSI setup should be permitted only for existing devices owned by the
respective guest (the operation may still be carried out by the domain
controlling that guest).

This is part of XSA-237.

Reported-by: HW42 <hw42@ipsumj.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- xen/arch/x86/irq.c.orig
+++ xen/arch/x86/irq.c
@@ -1964,7 +1964,10 @@ int map_domain_pirq(
         if ( !cpu_has_apic )
             goto done;
 
-        pdev = pci_get_pdev(msi->seg, msi->bus, msi->devfn);
+        pdev = pci_get_pdev_by_domain(d, msi->seg, msi->bus, msi->devfn);
+        if ( !pdev )
+            goto done;
+
         ret = pci_enable_msi(msi, &msi_desc);
         if ( ret )
         {
From: Jan Beulich <jbeulich@suse.com>
Subject: x86: enforce proper privilege when (un)mapping pIRQ-s

(Un)mapping of IRQs, just like other RESOURCE__ADD* / RESOURCE__REMOVE*
actions (in FLASK terms) should be XSM_DM_PRIV rather than XSM_TARGET.
This in turn requires bypassing the XSM check in physdev_unmap_pirq()
for the HVM emuirq case just like is being done in physdev_map_pirq().
The primary goal security-wise, however, is to no longer allow HVM
guests, by specifying their own domain ID instead of DOMID_SELF, to
enter code paths intended for PV guests and the control domains of HVM
guests only.

This is part of XSA-237.

Reported-by: HW42 <hw42@ipsumj.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>

--- xen/arch/x86/physdev.c.orig
+++ xen/arch/x86/physdev.c
@@ -110,7 +110,7 @@ int physdev_map_pirq(domid_t domid, int
     if ( d == NULL )
         return -ESRCH;
 
-    ret = xsm_map_domain_pirq(XSM_TARGET, d);
+    ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
     if ( ret )
         goto free_domain;
 
@@ -255,13 +255,14 @@ int physdev_map_pirq(domid_t domid, int
 int physdev_unmap_pirq(domid_t domid, int pirq)
 {
     struct domain *d;
-    int ret;
+    int ret = 0;
 
     d = rcu_lock_domain_by_any_id(domid);
     if ( d == NULL )
         return -ESRCH;
 
-    ret = xsm_unmap_domain_pirq(XSM_TARGET, d);
+    if ( domid != DOMID_SELF || !is_hvm_domain(d) )
+        ret = xsm_unmap_domain_pirq(XSM_DM_PRIV, d);
     if ( ret )
         goto free_domain;
 
--- xen/include/xsm/dummy.h.orig
+++ xen/include/xsm/dummy.h
@@ -453,7 +453,7 @@ static XSM_INLINE char *xsm_show_irq_sid
 
 static XSM_INLINE int xsm_map_domain_pirq(XSM_DEFAULT_ARG struct domain *d)
 {
-    XSM_ASSERT_ACTION(XSM_TARGET);
+    XSM_ASSERT_ACTION(XSM_DM_PRIV);
     return xsm_default_action(action, current->domain, d);
 }
 
@@ -465,7 +465,7 @@ static XSM_INLINE int xsm_map_domain_irq
 
 static XSM_INLINE int xsm_unmap_domain_pirq(XSM_DEFAULT_ARG struct domain *d)
 {
-    XSM_ASSERT_ACTION(XSM_TARGET);
+    XSM_ASSERT_ACTION(XSM_DM_PRIV);
     return xsm_default_action(action, current->domain, d);
 }
 
From: Jan Beulich <jbeulich@suse.com>
Subject: x86/MSI: disallow redundant enabling

At the moment, Xen attempts to allow redundant enabling of MSI by
having pci_enable_msi() return 0, and point to the existing MSI
descriptor, when the MSI already exists.

Unfortunately, if subsequent errors are encountered, the cleanup
paths assume pci_enable_msi() had done full initialization, and
hence undo everything that was assumed to be done by that
function without also undoing other setup that would normally
occur only after that function was called (in map_domain_pirq()
itself).

Rather than try to make the redundant enabling case work properly, just
forbid it entirely by having pci_enable_msi() return -EEXIST when MSI
is already set up.

This is part of XSA-237.

Reported-by: HW42 <hw42@ipsumj.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>

--- xen/arch/x86/msi.c.orig
+++ xen/arch/x86/msi.c
@@ -1050,11 +1050,10 @@ static int __pci_enable_msi(struct msi_i
     old_desc = find_msi_entry(pdev, msi->irq, PCI_CAP_ID_MSI);
     if ( old_desc )
     {
-        printk(XENLOG_WARNING "irq %d already mapped to MSI on %04x:%02x:%02x.%u\n",
+        printk(XENLOG_ERR "irq %d already mapped to MSI on %04x:%02x:%02x.%u\n",
                msi->irq, msi->seg, msi->bus,
                PCI_SLOT(msi->devfn), PCI_FUNC(msi->devfn));
-        *desc = old_desc;
-        return 0;
+        return -EEXIST;
     }
 
     old_desc = find_msi_entry(pdev, -1, PCI_CAP_ID_MSIX);
@@ -1118,11 +1117,10 @@ static int __pci_enable_msix(struct msi_
     old_desc = find_msi_entry(pdev, msi->irq, PCI_CAP_ID_MSIX);
     if ( old_desc )
     {
-        printk(XENLOG_WARNING "irq %d already mapped to MSI-X on %04x:%02x:%02x.%u\n",
+        printk(XENLOG_ERR "irq %d already mapped to MSI-X on %04x:%02x:%02x.%u\n",
                msi->irq, msi->seg, msi->bus,
                PCI_SLOT(msi->devfn), PCI_FUNC(msi->devfn));
-        *desc = old_desc;
-        return 0;
+        return -EEXIST;
     }
 
     old_desc = find_msi_entry(pdev, -1, PCI_CAP_ID_MSI);
From: Jan Beulich <jbeulich@suse.com>
Subject: x86/IRQ: conditionally preserve irq <-> pirq mapping on map error paths

Mappings that had been set up before should not be torn down when
handling unrelated errors.

This is part of XSA-237.

Reported-by: HW42 <hw42@ipsumj.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>

--- xen/arch/x86/irq.c.orig
+++ xen/arch/x86/irq.c
@@ -1252,7 +1252,8 @@ static int prepare_domain_irq_pirq(struc
         return -ENOMEM;
     }
     *pinfo = info;
-    return 0;
+
+    return !!err;
 }
 
 static void set_domain_irq_pirq(struct domain *d, int irq, struct pirq *pirq)
@@ -1295,7 +1296,10 @@ int init_domain_irq_mapping(struct domai
             continue;
         err = prepare_domain_irq_pirq(d, i, i, &info);
         if ( err )
+        {
+            ASSERT(err < 0);
             break;
+        }
         set_domain_irq_pirq(d, i, info);
     }
 
@@ -1903,6 +1907,7 @@ int map_domain_pirq(
     struct pirq *info;
     struct irq_desc *desc;
     unsigned long flags;
+    DECLARE_BITMAP(prepared, MAX_MSI_IRQS) = {};
 
     ASSERT(spin_is_locked(&d->event_lock));
 
@@ -1946,8 +1951,10 @@ int map_domain_pirq(
     }
 
     ret = prepare_domain_irq_pirq(d, irq, pirq, &info);
-    if ( ret )
+    if ( ret < 0 )
         goto revoke;
+    if ( !ret )
+        __set_bit(0, prepared);
 
     desc = irq_to_desc(irq);
 
@@ -2019,8 +2026,10 @@ int map_domain_pirq(
             irq = create_irq(NUMA_NO_NODE);
             ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, &info)
                            : irq;
-            if ( ret )
+            if ( ret < 0 )
                 break;
+            if ( !ret )
+                __set_bit(nr, prepared);
             msi_desc[nr].irq = irq;
 
             if ( irq_permit_access(d, irq) != 0 )
@@ -2053,15 +2062,15 @@ int map_domain_pirq(
                 desc->msi_desc = NULL;
                 spin_unlock_irqrestore(&desc->lock, flags);
             }
-            while ( nr-- )
+            while ( nr )
             {
                 if ( irq >= 0 && irq_deny_access(d, irq) )
                     printk(XENLOG_G_ERR
                            "dom%d: could not revoke access to IRQ%d (pirq %d)\n",
                            d->domain_id, irq, pirq);
-                if ( info )
+                if ( info && test_bit(nr, prepared) )
                     cleanup_domain_irq_pirq(d, irq, info);
-                info = pirq_info(d, pirq + nr);
+                info = pirq_info(d, pirq + --nr);
                 irq = info->arch.irq;
             }
             msi_desc->irq = -1;
@@ -2077,12 +2086,14 @@ int map_domain_pirq(
         spin_lock_irqsave(&desc->lock, flags);
         set_domain_irq_pirq(d, irq, info);
         spin_unlock_irqrestore(&desc->lock, flags);
+        ret = 0;
     }
 
 done:
     if ( ret )
     {
-        cleanup_domain_irq_pirq(d, irq, info);
+        if ( test_bit(0, prepared) )
+            cleanup_domain_irq_pirq(d, irq, info);
  revoke:
         if ( irq_deny_access(d, irq) )
             printk(XENLOG_G_ERR
--- xen/arch/x86/physdev.c.orig
+++ xen/arch/x86/physdev.c
@@ -185,7 +185,7 @@ int physdev_map_pirq(domid_t domid, int
         }
         else if ( type == MAP_PIRQ_TYPE_MULTI_MSI )
         {
-            if ( msi->entry_nr <= 0 || msi->entry_nr > 32 )
+            if ( msi->entry_nr <= 0 || msi->entry_nr > MAX_MSI_IRQS )
                 ret = -EDOM;
             else if ( msi->entry_nr != 1 && !iommu_intremap )
                 ret = -EOPNOTSUPP;
--- xen/include/asm-x86/msi.h.orig
+++ xen/include/asm-x86/msi.h
@@ -55,6 +55,8 @@
 /* MAX fixed pages reserved for mapping MSIX tables. */
 #define FIX_MSIX_MAX_PAGES              512
 
+#define MAX_MSI_IRQS 32 /* limited by MSI capability struct properties */
+
 struct msi_info {
     u16 seg;
     u8 bus;
From: Jan Beulich <jbeulich@suse.com>
Subject: x86/FLASK: fix unmap-domain-IRQ XSM hook

The caller and the FLASK implementation of xsm_unmap_domain_irq()
disagreed about what the "data" argument points to in the MSI case:
Change both sides to pass/take a PCI device.

This is part of XSA-237.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- xen/arch/x86/irq.c.orig
+++ xen/arch/x86/irq.c
@@ -2144,7 +2144,8 @@ int unmap_domain_pirq(struct domain *d,
         nr = msi_desc->msi.nvec;
     }
 
-    ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq, msi_desc);
+    ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
+                               msi_desc ? msi_desc->dev : NULL);
     if ( ret )
         goto done;
 
--- xen/xsm/flask/hooks.c.orig
+++ xen/xsm/flask/hooks.c
@@ -915,8 +915,8 @@ static int flask_unmap_domain_msi (struc
                                    u32 *sid, struct avc_audit_data *ad)
 {
 #ifdef CONFIG_HAS_PCI
-    struct msi_info *msi = data;
-    u32 machine_bdf = (msi->seg << 16) | (msi->bus << 8) | msi->devfn;
+    const struct pci_dev *pdev = data;
+    u32 machine_bdf = (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn;
 
     AVC_AUDIT_DATA_INIT(ad, DEV);
     ad->device = machine_bdf;

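The map_domain_pirq() part of the fix records which entries
prepare_domain_irq_pirq() actually set up, so the error path only undoes
those. A simplified sketch of that bookkeeping, using stub helpers
(hypothetical, for illustration only):

    #include <stdbool.h>

    #define MAX_MSI_IRQS 32

    static int prepare_one(int nr) { (void)nr; return 0; }   /* stand-in helpers */
    static void cleanup_one(int nr) { (void)nr; }

    static int map_all(int count)
    {
        bool prepared[MAX_MSI_IRQS] = { false };
        int nr, ret = 0;

        for ( nr = 0; nr < count && nr < MAX_MSI_IRQS; nr++ )
        {
            ret = prepare_one(nr);
            if ( ret < 0 )
                break;
            prepared[nr] = true;        /* remember what really got set up */
        }
        if ( ret < 0 )
            while ( nr-- )              /* tear down only the prepared entries */
                if ( prepared[nr] )
                    cleanup_one(nr);
        return ret;
    }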
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA238
$NetBSD: patch-XSA238,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From cdc2887076b19b39fab9faec495082586f3113df Mon Sep 17 00:00:00 2001
From: XenProject Security Team <security@xenproject.org>
Date: Tue, 5 Sep 2017 13:41:37 +0200
Subject: x86/ioreq server: correctly handle bogus
 XEN_DMOP_{,un}map_io_range_to_ioreq_server arguments

A misbehaving device model can pass incorrect XEN_DMOP_map/
unmap_io_range_to_ioreq_server arguments, namely end < start when
specifying the address range. When this happens we hit ASSERT(s <= e) in
rangeset_contains_range()/rangeset_overlaps_range() with debug builds.
Production builds will not trap right away but may misbehave later
while handling such bogus ranges.

This is XSA-238.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/hvm/ioreq.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index b2a8b0e986..8c8bf1f0ec 100644
--- xen/arch/x86/hvm/ioreq.c.orig
+++ xen/arch/x86/hvm/ioreq.c
@@ -820,6 +820,9 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
     struct hvm_ioreq_server *s;
     int rc;
 
+    if ( start > end )
+        return -EINVAL;
+
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
     rc = -ENOENT;
@@ -872,6 +875,9 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
     struct hvm_ioreq_server *s;
     int rc;
 
+    if ( start > end )
+        return -EINVAL;
+
     spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
     rc = -ENOENT;

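The fix itself is just an early range sanity check; a one-function
illustration (hypothetical wrapper, not the Xen entry point):

    #include <stdint.h>

    /* Reject inverted ranges up front instead of tripping ASSERT(s <= e)
     * in the rangeset code on debug builds. */
    static int map_io_range(uint64_t start, uint64_t end)
    {
        if ( start > end )
            return -22;                /* -EINVAL */
        /* ... proceed to the rangeset operations ... */
        return 0;
    }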
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA239
$NetBSD: patch-XSA239,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Jan Beulich <jbeulich@suse.com>
Subject: x86/HVM: prefill partially used variable on emulation paths

Certain handlers ignore the access size (vioapic_write() being the
example this was found with), perhaps leading to subsequent reads
seeing data that wasn't actually written by the guest. For
consistency and extra safety also do this on the read path of
hvm_process_io_intercept(), even if this doesn't directly affect what
guests get to see, as we've supposedly already dealt with read handlers
leaving data completely uninitialized.

This is XSA-239.

Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- xen/arch/x86/hvm/emulate.c.orig
+++ xen/arch/x86/hvm/emulate.c
@@ -129,7 +129,7 @@ static int hvmemul_do_io(
         .count = *reps,
         .dir = dir,
         .df = df,
-        .data = data,
+        .data = data_is_addr ? data : 0,
         .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
         .state = STATE_IOREQ_READY,
     };
--- xen/arch/x86/hvm/intercept.c.orig
+++ xen/arch/x86/hvm/intercept.c
@@ -127,6 +127,7 @@ int hvm_process_io_intercept(const struc
             addr = (p->type == IOREQ_TYPE_COPY) ?
                    p->addr + step * i :
                    p->addr;
+            data = 0;
             rc = ops->read(handler, addr, p->size, &data);
             if ( rc != X86EMUL_OKAY )
                 break;
@@ -161,6 +162,7 @@ int hvm_process_io_intercept(const struc
         {
             if ( p->data_is_ptr )
             {
+                data = 0;
                 switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
                                                   p->size) )
                 {

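The zero prefill matters because a handler may write fewer bytes than
the destination holds; whatever was there before would then be read
back. A made-up illustration of the leak the patch closes (not Xen
code):

    #include <stdint.h>
    #include <string.h>

    /* A handler that, like vioapic_write(), ignores the access size and
     * fills only 4 of the 8 destination bytes. */
    static void read_handler(uint64_t *data)
    {
        uint32_t reg = 0x12345678;
        memcpy(data, &reg, sizeof(reg));   /* upper 4 bytes left untouched */
    }

    static uint64_t emulate_read(void)
    {
        uint64_t data = 0;    /* the prefill added by the patch; without it, */
        read_handler(&data);  /* stale contents could leak into the result  */
        return data;
    }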
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA240
$NetBSD: patch-XSA240,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From 2315b8c651e0cc31c9153d09c9912b8fbe632ad2 Mon Sep 17 00:00:00 2001
From: Jan Beulich <jbeulich@suse.com>
Date: Thu, 28 Sep 2017 15:17:25 +0100
Subject: [PATCH 1/2] x86: limit linear page table use to a single level

That's the only way that they're meant to be used. Without such a
restriction arbitrarily long chains of same-level page tables can be
built, tearing down of which may then cause arbitrarily deep recursion,
causing a stack overflow. To facilitate this restriction, a counter is
being introduced to track both the number of same-level entries in a
page table as well as the number of uses of a page table in another
same-level one (counting into positive and negative direction
respectively, utilizing the fact that both counts can't be non-zero at
the same time).

Note that the added accounting introduces a restriction on the number
of times a page can be used in other same-level page tables - more than
32k of such uses are no longer possible.

Note also that some put_page_and_type[_preemptible]() calls are
replaced with open-coded equivalents.  This seemed preferable to
adding "parent_table" to the matrix of functions.

Note further that cross-domain same-level page table references are no
longer permitted (they probably never should have been).

This is XSA-240.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/arch/x86/domain.c        |   1 +
 xen/arch/x86/mm.c            | 171 ++++++++++++++++++++++++++++++++++++++-----
 xen/include/asm-x86/domain.h |   2 +
 xen/include/asm-x86/mm.h     |  25 +++++--
 4 files changed, 175 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a725b43a67..5265b0496c 100644
--- xen/arch/x86/domain.c.orig
+++ xen/arch/x86/domain.c
@@ -1245,6 +1245,7 @@ int arch_set_info_guest(
                     rc = -ERESTART;
                     /* Fallthrough */
                 case -ERESTART:
+                    v->arch.old_guest_ptpg = NULL;
                     v->arch.old_guest_table =
                         pagetable_get_page(v->arch.guest_table);
                     v->arch.guest_table = pagetable_null();
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a40461d4d6..31d4a03840 100644
--- xen/arch/x86/mm.c.orig
+++ xen/arch/x86/mm.c
@@ -733,6 +733,61 @@ static void put_data_page(
         put_page(page);
 }
 
+static bool inc_linear_entries(struct page_info *pg)
+{
+    typeof(pg->linear_pt_count) nc = read_atomic(&pg->linear_pt_count), oc;
+
+    do {
+        /*
+         * The check below checks for the "linear use" count being non-zero
+         * as well as overflow.  Signed integer overflow is undefined behavior
+         * according to the C spec.  However, as long as linear_pt_count is
+         * smaller in size than 'int', the arithmetic operation of the
+         * increment below won't overflow; rather the result will be truncated
+         * when stored.  Ensure that this is always true.
+         */
+        BUILD_BUG_ON(sizeof(nc) >= sizeof(int));
+        oc = nc++;
+        if ( nc <= 0 )
+            return false;
+        nc = cmpxchg(&pg->linear_pt_count, oc, nc);
+    } while ( oc != nc );
+
+    return true;
+}
+
+static void dec_linear_entries(struct page_info *pg)
+{
+    typeof(pg->linear_pt_count) oc;
+
+    oc = arch_fetch_and_add(&pg->linear_pt_count, -1);
+    ASSERT(oc > 0);
+}
+
+static bool inc_linear_uses(struct page_info *pg)
+{
+    typeof(pg->linear_pt_count) nc = read_atomic(&pg->linear_pt_count), oc;
+
+    do {
+        /* See the respective comment in inc_linear_entries(). */
+        BUILD_BUG_ON(sizeof(nc) >= sizeof(int));
+        oc = nc--;
+        if ( nc >= 0 )
+            return false;
+        nc = cmpxchg(&pg->linear_pt_count, oc, nc);
+    } while ( oc != nc );
+
+    return true;
+}
+
+static void dec_linear_uses(struct page_info *pg)
+{
+    typeof(pg->linear_pt_count) oc;
+
+    oc = arch_fetch_and_add(&pg->linear_pt_count, 1);
+    ASSERT(oc < 0);
+}
+
 /*
  * We allow root tables to map each other (a.k.a. linear page tables). It
  * needs some special care with reference counts and access permissions:
@@ -762,15 +817,35 @@ get_##level##_linear_pagetable(                                             \
                                                                             \
     if ( (pfn = level##e_get_pfn(pde)) != pde_pfn )                         \
     {                                                                       \
+        struct page_info *ptpg = mfn_to_page(pde_pfn);                      \
+                                                                            \
+        /* Make sure the page table belongs to the correct domain. */       \
+        if ( unlikely(page_get_owner(ptpg) != d) )                          \
+            return 0;                                                       \
+                                                                            \
         /* Make sure the mapped frame belongs to the correct domain. */     \
         if ( unlikely(!get_page_from_pagenr(pfn, d)) )                      \
             return 0;                                                       \
                                                                             \
         /*                                                                  \
-         * Ensure that the mapped frame is an already-validated page table. \
+         * Ensure that the mapped frame is an already-validated page table  \
+         * and is not itself having linear entries, as well as that the     \
+         * containing page table is not itself in use as a linear page table \
+         * elsewhere.                                                       \
          * If so, atomically increment the count (checking for overflow).   \
          */                                                                 \
         page = mfn_to_page(pfn);                                            \
+        if ( !inc_linear_entries(ptpg) )                                    \
+        {                                                                   \
+            put_page(page);                                                 \
+            return 0;                                                       \
+        }                                                                   \
+        if ( !inc_linear_uses(page) )                                       \
+        {                                                                   \
+            dec_linear_entries(ptpg);                                       \
+            put_page(page);                                                 \
+            return 0;                                                       \
+        }                                                                   \
         y = page->u.inuse.type_info;                                        \
         do {                                                                \
             x = y;                                                          \
@@ -778,6 +853,8 @@ get_##level##_linear_pagetable(                                             \
                  unlikely((x & (PGT_type_mask|PGT_validated)) !=            \
                           (PGT_##level##_page_table|PGT_validated)) )       \
             {                                                               \
+                dec_linear_uses(page);                                      \
+                dec_linear_entries(ptpg);                                   \
                 put_page(page);                                             \
                 return 0;                                                   \
             }                                                               \
@@ -1202,6 +1279,9 @@ get_page_from_l4e(
             l3e_remove_flags((pl3e), _PAGE_USER|_PAGE_RW|_PAGE_ACCESSED);   \
     } while ( 0 )
 
+static int _put_page_type(struct page_info *page, bool preemptible,
+                          struct page_info *ptpg);
+
 void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 {
     unsigned long     pfn = l1e_get_pfn(l1e);
@@ -1271,17 +1351,22 @@ static int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
     if ( l2e_get_flags(l2e) & _PAGE_PSE )
         put_superpage(l2e_get_pfn(l2e));
     else
-        put_page_and_type(l2e_get_page(l2e));
+    {
+        struct page_info *pg = l2e_get_page(l2e);
+        int rc = _put_page_type(pg, false, mfn_to_page(pfn));
+
+        ASSERT(!rc);
+        put_page(pg);
+    }
 
     return 0;
 }
 
-static int __put_page_type(struct page_info *, int preemptible);
-
 static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
                              int partial, bool_t defer)
 {
     struct page_info *pg;
+    int rc;
 
     if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) || (l3e_get_pfn(l3e) == pfn) )
         return 1;
@@ -1304,21 +1389,28 @@ static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
     if ( unlikely(partial > 0) )
     {
         ASSERT(!defer);
-        return __put_page_type(pg, 1);
+        return _put_page_type(pg, true, mfn_to_page(pfn));
     }
 
     if ( defer )
     {
+        current->arch.old_guest_ptpg = mfn_to_page(pfn);
         current->arch.old_guest_table = pg;
         return 0;
     }
 
-    return put_page_and_type_preemptible(pg);
+    rc = _put_page_type(pg, true, mfn_to_page(pfn));
+    if ( likely(!rc) )
+        put_page(pg);
+
+    return rc;
 }
 
 static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
                              int partial, bool_t defer)
 {
+    int rc = 1;
+
     if ( (l4e_get_flags(l4e) & _PAGE_PRESENT) && 
          (l4e_get_pfn(l4e) != pfn) )
     {
@@ -1327,18 +1419,22 @@ static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
         if ( unlikely(partial > 0) )
         {
             ASSERT(!defer);
-            return __put_page_type(pg, 1);
+            return _put_page_type(pg, true, mfn_to_page(pfn));
         }
 
         if ( defer )
         {
+            current->arch.old_guest_ptpg = mfn_to_page(pfn);
             current->arch.old_guest_table = pg;
             return 0;
         }
 
-        return put_page_and_type_preemptible(pg);
+        rc = _put_page_type(pg, true, mfn_to_page(pfn));
+        if ( likely(!rc) )
+            put_page(pg);
     }
-    return 1;
+
+    return rc;
 }
 
 static int alloc_l1_table(struct page_info *page)
@@ -1536,6 +1632,7 @@ static int alloc_l3_table(struct page_info *page)
         {
             page->nr_validated_ptes = i;
             page->partial_pte = 0;
+            current->arch.old_guest_ptpg = NULL;
             current->arch.old_guest_table = page;
         }
         while ( i-- > 0 )
@@ -1628,6 +1725,7 @@ static int alloc_l4_table(struct page_info *page)
                 {
                     if ( current->arch.old_guest_table )
                         page->nr_validated_ptes++;
+                    current->arch.old_guest_ptpg = NULL;
                     current->arch.old_guest_table = page;
                 }
             }
@@ -2370,14 +2468,20 @@ int free_page_type(struct page_info *pag
 }
 
 
-static int __put_final_page_type(
-    struct page_info *page, unsigned long type, int preemptible)
+static int _put_final_page_type(struct page_info *page, unsigned long type,
+                                bool preemptible, struct page_info *ptpg)
 {
     int rc = free_page_type(page, type, preemptible);
 
     /* No need for atomic update of type_info here: noone else updates it. */
     if ( rc == 0 )
     {
+        if ( ptpg && PGT_type_equal(type, ptpg->u.inuse.type_info) )
+        {
+            dec_linear_uses(page);
+            dec_linear_entries(ptpg);
+        }
+        ASSERT(!page->linear_pt_count || page_get_owner(page)->is_dying);
         /*
          * Record TLB information for flush later. We do not stamp page tables
          * when running in shadow mode:
@@ -2413,8 +2517,8 @@ static int __put_final_page_type(
 }
 
 
-static int __put_page_type(struct page_info *page,
-                           int preemptible)
+static int _put_page_type(struct page_info *page, bool preemptible,
+                          struct page_info *ptpg)
 {
     unsigned long nx, x, y = page->u.inuse.type_info;
     int rc = 0;
@@ -2441,12 +2545,28 @@ static int __put_page_type(struct page_info *page,
                                            x, nx)) != x) )
                     continue;
                 /* We cleared the 'valid bit' so we do the clean up. */
-                rc = __put_final_page_type(page, x, preemptible);
+                rc = _put_final_page_type(page, x, preemptible, ptpg);
+                ptpg = NULL;
                 if ( x & PGT_partial )
                     put_page(page);
                 break;
             }
 
+            if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
+            {
+                /*
+                 * page_set_tlbflush_timestamp() accesses the same union
+                 * linear_pt_count lives in. Unvalidated page table pages,
+                 * however, should occur during domain destruction only
+                 * anyway.  Updating of linear_pt_count luckily is not
+                 * necessary anymore for a dying domain.
+                 */
+                ASSERT(page_get_owner(page)->is_dying);
+                ASSERT(page->linear_pt_count < 0);
+                ASSERT(ptpg->linear_pt_count > 0);
+                ptpg = NULL;
+            }
+
             /*
              * Record TLB information for flush later. We do not stamp page
              * tables when running in shadow mode:
@@ -2466,6 +2586,13 @@ static int __put_page_type(struct page_info *page,
             return -EINTR;
     }
 
+    if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
+    {
+        ASSERT(!rc);
+        dec_linear_uses(page);
+        dec_linear_entries(ptpg);
+    }
+
     return rc;
 }
 
@@ -2600,6 +2727,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
             page->nr_validated_ptes = 0;
             page->partial_pte = 0;
         }
+        page->linear_pt_count = 0;
         rc = alloc_page_type(page, type, preemptible);
     }
 
@@ -2614,7 +2742,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
 
 void put_page_type(struct page_info *page)
 {
-    int rc = __put_page_type(page, 0);
+    int rc = _put_page_type(page, false, NULL);
     ASSERT(rc == 0);
     (void)rc;
 }
@@ -2630,7 +2758,7 @@ int get_page_type(struct page_info *page, unsigned long type)
 
 int put_page_type_preemptible(struct page_info *page)
 {
-    return __put_page_type(page, 1);
+    return _put_page_type(page, true, NULL);
 }
 
 int get_page_type_preemptible(struct page_info *page, unsigned long type)
@@ -2836,11 +2964,14 @@ int put_old_guest_table(struct vcpu *v)
     if ( !v->arch.old_guest_table )
         return 0;
 
-    switch ( rc = put_page_and_type_preemptible(v->arch.old_guest_table) )
+    switch ( rc = _put_page_type(v->arch.old_guest_table, true,
+                                 v->arch.old_guest_ptpg) )
     {
     case -EINTR:
     case -ERESTART:
         return -ERESTART;
+    case 0:
+        put_page(v->arch.old_guest_table);
     }
 
     v->arch.old_guest_table = NULL;
@@ -2997,6 +3128,7 @@ int new_guest_cr3(unsigned long mfn)
                 rc = -ERESTART;
                 /* fallthrough */
             case -ERESTART:
+                curr->arch.old_guest_ptpg = NULL;
                 curr->arch.old_guest_table = page;
                 break;
             default:
@@ -3264,7 +3396,10 @@ long do_mmuext_op(
                     if ( type == PGT_l1_page_table )
                         put_page_and_type(page);
                     else
+                    {
+                        curr->arch.old_guest_ptpg = NULL;
                         curr->arch.old_guest_table = page;
+                    }
                 }
             }
 
@@ -3297,6 +3432,7 @@ long do_mmuext_op(
             {
             case -EINTR:
             case -ERESTART:
+                curr->arch.old_guest_ptpg = NULL;
                 curr->arch.old_guest_table = page;
                 rc = 0;
                 break;
@@ -3375,6 +3511,7 @@ long do_mmuext_op(
                         rc = -ERESTART;
                         /* fallthrough */
                     case -ERESTART:
+                        curr->arch.old_guest_ptpg = NULL;
                         curr->arch.old_guest_table = page;
                         break;
                     default:
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index f6a40eb881..60bb8c9014 100644
--- xen/include/asm-x86/domain.h.orig
+++ xen/include/asm-x86/domain.h
@@ -531,6 +531,8 @@ struct arch_vcpu
     pagetable_t guest_table_user;       /* (MFN) x86/64 user-space pagetable */
     pagetable_t guest_table;            /* (MFN) guest notion of cr3 */
     struct page_info *old_guest_table;  /* partially destructed pagetable */
+    struct page_info *old_guest_ptpg;   /* containing page table of the */
+                                        /* former, if any */
     /* guest_table holds a ref to the page, and also a type-count unless
      * shadow refcounts are in use */
     pagetable_t shadow_table[4];        /* (MFN) shadow(s) of guest */
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 6687dbc985..63590a7716 100644
--- xen/include/asm-x86/mm.h.orig
+++ xen/include/asm-x86/mm.h
@@ -125,11 +125,11 @@ struct page_info
         u32 tlbflush_timestamp;
 
         /*
-         * When PGT_partial is true then this field is valid and indicates
-         * that PTEs in the range [0, @nr_validated_ptes) have been validated.
-         * An extra page reference must be acquired (or not dropped) whenever
-         * PGT_partial gets set, and it must be dropped when the flag gets
-         * cleared. This is so that a get() leaving a page in partially
+         * When PGT_partial is true then the first two fields are valid and
+         * indicate that PTEs in the range [0, @nr_validated_ptes) have been
+         * validated. An extra page reference must be acquired (or not dropped)
+         * whenever PGT_partial gets set, and it must be dropped when the flag
+         * gets cleared. This is so that a get() leaving a page in partially
          * validated state (where the caller would drop the reference acquired
          * due to the getting of the type [apparently] failing [-ERESTART])
          * would not accidentally result in a page left with zero general
@@ -153,10 +153,18 @@ struct page_info
          * put_page_from_lNe() (due to the apparent failure), and hence it
          * must be dropped when the put operation is resumed (and completes),
          * but it must not be acquired if picking up the page for validation.
+         *
+         * The 3rd field, @linear_pt_count, indicates
+         * - by a positive value, how many same-level page table entries a page
+         *   table has,
+         * - by a negative value, in how many same-level page tables a page is
+         *   in use.
          */
         struct {
-            u16 nr_validated_ptes;
-            s8 partial_pte;
+            u16 nr_validated_ptes:PAGETABLE_ORDER + 1;
+            u16 :16 - PAGETABLE_ORDER - 1 - 2;
+            s16 partial_pte:2;
+            s16 linear_pt_count;
         };
 
         /*
@@ -207,6 +215,9 @@ struct page_info
 #define PGT_count_width   PG_shift(9)
 #define PGT_count_mask    ((1UL<<PGT_count_width)-1)
 
+/* Are the 'type mask' bits identical? */
+#define PGT_type_equal(x, y) (!(((x) ^ (y)) & PGT_type_mask))
+
  /* Cleared when the owning guest 'frees' this page. */
 #define _PGC_allocated    PG_shift(1)
 #define PGC_allocated     PG_mask(1, 1)
-- 
2.14.1

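The shared counter introduced by patch 1/2 is the subtle part: positive
values count same-level entries in a page table, negative values count
uses of the page in other same-level tables, and an increment in either
direction must fail if the counter already has the opposite sign. A
standalone approximation of the cmpxchg loops using C11 atomics (a
sketch, not the Xen implementation):

    #include <stdatomic.h>
    #include <stdbool.h>

    static _Atomic short linear_pt_count;   /* > 0: entries, < 0: uses */

    static bool inc_entries(void)
    {
        short oc = atomic_load(&linear_pt_count), nc;
        do {
            nc = (short)(oc + 1);       /* truncation makes an overflow negative */
            if ( nc <= 0 )              /* already in use as a linear pt, or the */
                return false;           /* entry count would exceed ~32k         */
        } while ( !atomic_compare_exchange_weak(&linear_pt_count, &oc, nc) );
        return true;
    }

    static bool inc_uses(void)
    {
        short oc = atomic_load(&linear_pt_count), nc;
        do {
            nc = (short)(oc - 1);
            if ( nc >= 0 )              /* page already has same-level entries */
                return false;
        } while ( !atomic_compare_exchange_weak(&linear_pt_count, &oc, nc) );
        return true;
    }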
From 41d579aad2fee971e5ce0279a9b559a0fdc74452 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 22 Sep 2017 11:46:55 +0100
Subject: [PATCH 2/2] x86/mm: Disable PV linear pagetables by default

Allowing pagetables to point to other pagetables of the same level
(often called 'linear pagetables') has been included in Xen since its
inception.  But it is not used by the most common PV guests (Linux,
NetBSD, minios), and has been the source of a number of subtle
reference-counting bugs.

Add a command-line option to control whether PV linear pagetables are
allowed (disabled by default).

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v2:
- s/_/-/; in command-line option
- Added __read_mostly
---
 docs/misc/xen-command-line.markdown | 15 +++++++++++++++
 xen/arch/x86/mm.c                   |  9 +++++++++
 2 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 54acc60723..ffa66eb146 100644
--- docs/misc/xen-command-line.markdown.orig
+++ docs/misc/xen-command-line.markdown
@@ -1350,6 +1350,21 @@ The following resources are available:
     CDP, one COS will correspond to two CBMs rather than one as with CAT; since
     the sum of CBMs is fixed, the actual `cos_max` in use will automatically
     reduce to half when CDP is enabled.
+
+### pv-linear-pt
+> `= <boolean>`
+
+> Default: `true`
+
+Allow PV guests to have pagetable entries pointing to other pagetables
+of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
+This technique is often called "linear pagetables", and is sometimes
+used to allow operating systems a simple way to consistently map the
+current process's pagetables into its own virtual address space.
+
+None of the most common PV operating systems (Linux, MiniOS) use this
+technique, but NetBSD in PV mode does, and custom operating systems
+may as well.
 
 ### reboot
 > `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 31d4a03840..5d125cff3a 100644
--- xen/arch/x86/mm.c.orig
+++ xen/arch/x86/mm.c
@@ -800,6 +800,9 @@ static void dec_linear_uses(struct page_info *pg)
  *     frame if it is mapped by a different root table. This is sufficient and
  *     also necessary to allow validation of a root table mapping itself.
  */
+static bool __read_mostly pv_linear_pt_enable = true;
+boolean_param("pv-linear-pt", pv_linear_pt_enable);
+
 #define define_get_linear_pagetable(level)                                  \
 static int                                                                  \
 get_##level##_linear_pagetable(                                             \
@@ -809,6 +812,12 @@ get_##level##_linear_pagetable(                                             \
     struct page_info *page;                                                 \
     unsigned long pfn;                                                      \
                                                                             \
+    if ( !pv_linear_pt_enable )                                             \
+    {                                                                       \
+        MEM_LOG("Attempt to create linear p.t. (feature disabled)");        \
+        return 0;                                                           \
+    }                                                                       \
+                                                                            \
     if ( (level##e_get_flags(pde) & _PAGE_RW) )                             \
     {                                                                       \
         MEM_LOG("Attempt to create linear p.t. with write perms");          \
-- 
2.14.1


File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA241
$NetBSD: patch-XSA241,v 1.1 2017/10/17 08:42:30 bouyer Exp $

x86: don't store possibly stale TLB flush time stamp

While the timing window is extremely narrow, it is theoretically
possible for an update to the TLB flush clock and a subsequent flush
IPI to happen between the read and write parts of the update of the
per-page stamp. Exclude this possibility by disabling interrupts
across the update, preventing the IPI from being serviced in the middle.

This is XSA-241.

Reported-by: Jann Horn <jannh@google.com>
Suggested-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>

--- xen/arch/arm/smp.c.orig
+++ xen/arch/arm/smp.c
@@ -1,4 +1,5 @@
 #include <xen/config.h>
+#include <xen/mm.h>
 #include <asm/system.h>
 #include <asm/smp.h>
 #include <asm/cpregs.h>
--- xen/arch/x86/mm.c.orig
+++ xen/arch/x86/mm.c
@@ -2524,7 +2524,7 @@ static int _put_final_page_type(struct p
          */
         if ( !(shadow_mode_enabled(page_get_owner(page)) &&
                (page->count_info & PGC_page_table)) )
-            page->tlbflush_timestamp = tlbflush_current_time();
+            page_set_tlbflush_timestamp(page);
         wmb();
         page->u.inuse.type_info--;
     }
@@ -2534,7 +2534,7 @@ static int _put_final_page_type(struct p
                 (PGT_count_mask|PGT_validated|PGT_partial)) == 1);
         if ( !(shadow_mode_enabled(page_get_owner(page)) &&
                (page->count_info & PGC_page_table)) )
-            page->tlbflush_timestamp = tlbflush_current_time();
+            page_set_tlbflush_timestamp(page);
         wmb();
         page->u.inuse.type_info |= PGT_validated;
     }
@@ -2588,7 +2588,7 @@ static int _put_page_type(struct page_in
             if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
             {
                 /*
-                 * page_set_tlbflush_timestamp() accesses the same union
+                 * set_tlbflush_timestamp() accesses the same union
                  * linear_pt_count lives in. Unvalidated page table pages,
                  * however, should occur during domain destruction only
                  * anyway.  Updating of linear_pt_count luckily is not
@@ -2609,7 +2609,7 @@ static int _put_page_type(struct page_in
              */
             if ( !(shadow_mode_enabled(page_get_owner(page)) &&
                    (page->count_info & PGC_page_table)) )
-                page->tlbflush_timestamp = tlbflush_current_time();
+                page_set_tlbflush_timestamp(page);
         }
 
         if ( likely((y = cmpxchg(&page->u.inuse.type_info, x, nx)) == x) )
--- xen/arch/x86/mm/shadow/common.c.orig
+++ xen/arch/x86/mm/shadow/common.c
@@ -1464,7 +1464,7 @@ void shadow_free(struct domain *d, mfn_t
          * TLBs when we reuse the page.  Because the destructors leave the
          * contents of the pages in place, we can delay TLB flushes until
          * just before the allocator hands the page out again. */
-        sp->tlbflush_timestamp = tlbflush_current_time();
+        page_set_tlbflush_timestamp(sp);
         perfc_decr(shadow_alloc_count);
         page_list_add_tail(sp, &d->arch.paging.shadow.freelist);
         sp = next;
--- xen/common/page_alloc.c.orig
+++ xen/common/page_alloc.c
@@ -960,7 +960,7 @@ static void free_heap_pages(
         /* If a page has no owner it will need no safety TLB flush. */
         pg[i].u.free.need_tlbflush = (page_get_owner(&pg[i]) != NULL);
         if ( pg[i].u.free.need_tlbflush )
-            pg[i].tlbflush_timestamp = tlbflush_current_time();
+            page_set_tlbflush_timestamp(&pg[i]);
 
         /* This page is not a guest frame any more. */
         page_set_owner(&pg[i], NULL); /* set_gpfn_from_mfn snoops pg owner */
--- xen/include/asm-arm/flushtlb.h.orig
+++ xen/include/asm-arm/flushtlb.h
@@ -12,6 +12,11 @@ static inline void tlbflush_filter(cpuma
 
 #define tlbflush_current_time()                 (0)
 
+static inline void page_set_tlbflush_timestamp(struct page_info *page)
+{
+    page->tlbflush_timestamp = tlbflush_current_time();
+}
+
 #if defined(CONFIG_ARM_32)
 # include <asm/arm32/flushtlb.h>
 #elif defined(CONFIG_ARM_64)
--- xen/include/asm-x86/flushtlb.h.orig
+++ xen/include/asm-x86/flushtlb.h
@@ -23,6 +23,20 @@ DECLARE_PER_CPU(u32, tlbflush_time);
 
 #define tlbflush_current_time() tlbflush_clock
 
+static inline void page_set_tlbflush_timestamp(struct page_info *page)
+{
+    /*
+     * Prevent storing a stale time stamp, which could happen if an update
+     * to tlbflush_clock plus a subsequent flush IPI happen between the
+     * reading of tlbflush_clock and the writing of the struct page_info
+     * field.
+     */
+    ASSERT(local_irq_is_enabled());
+    local_irq_disable();
+    page->tlbflush_timestamp = tlbflush_current_time();
+    local_irq_enable();
+}
+
 /*
  * @cpu_stamp is the timestamp at last TLB flush for the CPU we are testing.
  * @lastuse_stamp is a timestamp taken when the PFN we are testing was last 

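The window being closed is between reading the flush clock and storing it
into the page: a clock update plus a flush IPI serviced in that gap leaves
a stamp that misrepresents when the page was last used relative to the
flush. A rough illustration with hypothetical names (not Xen code):

    #include <stdint.h>

    extern volatile uint32_t flush_clock;   /* global TLB flush clock */

    /* Unsafe variant: the window XSA-241 removes. */
    static void stamp_page_racy(uint32_t *page_stamp)
    {
        uint32_t now = flush_clock; /* (1) read the clock             */
                                    /* (2) a clock update plus flush  */
                                    /*     IPI can be serviced here   */
        *page_stamp = now;          /* (3) the stored stamp is stale  */
    }

The fix brackets the read-write pair with local_irq_disable() and
local_irq_enable(), so the IPI can only be serviced before or after the
whole update, never between steps (1) and (3).
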
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA242
$NetBSD: patch-XSA242,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Jan Beulich <jbeulich@suse.com>
Subject: x86: don't allow page_unlock() to drop the last type reference

Only _put_page_type() does the necessary cleanup, and hence not all
domain pages can be released during guest cleanup (leaving around
zombie domains) if we get this wrong.

This is XSA-242.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- xen/arch/x86/mm.c.orig
+++ xen/arch/x86/mm.c
@@ -1923,7 +1923,11 @@ void page_unlock(struct page_info *page)
 
     do {
         x = y;
+        ASSERT((x & PGT_count_mask) && (x & PGT_locked));
+
         nx = x - (1 | PGT_locked);
+        /* We must not drop the last reference here. */
+        ASSERT(nx & PGT_count_mask);
     } while ( (y = cmpxchg(&page->u.inuse.type_info, x, nx)) != x );
 }
 
@@ -2611,6 +2615,17 @@ static int _put_page_type(struct page_in
                    (page->count_info & PGC_page_table)) )
                 page_set_tlbflush_timestamp(page);
         }
+        else if ( unlikely((nx & (PGT_locked | PGT_count_mask)) ==
+                           (PGT_locked | 1)) )
+        {
+            /*
+             * We must not drop the second to last reference when the page is
+             * locked, as page_unlock() doesn't do any cleanup of the type.
+             */
+            cpu_relax();
+            y = page->u.inuse.type_info;
+            continue;
+        }
 
         if ( likely((y = cmpxchg(&page->u.inuse.type_info, x, nx)) == x) )
             break;

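Both hunks live inside the usual lock-free update loop on type_info:
snapshot the word, compute the next value, cmpxchg, retry on failure. A
generic sketch of that pattern, including the new "don't drop to a locked
single-reference state" case (illustrative C11, not Xen code):

    #include <stdatomic.h>
    #include <assert.h>

    #define PGT_LOCKED      (1u << 31)
    #define PGT_COUNT_MASK  0x0fffffffu

    static void put_type_ref(_Atomic unsigned int *type_info)
    {
        unsigned int x = atomic_load(type_info), nx;

        for ( ;; )
        {
            assert(x & PGT_COUNT_MASK);      /* caller holds a ref */
            nx = x - 1;

            if ( (nx & (PGT_LOCKED | PGT_COUNT_MASK)) ==
                 (PGT_LOCKED | 1) )
            {
                /* Dropping now would leave page_unlock() to free
                 * the last reference without any cleanup; wait and
                 * retry, as the XSA-242 hunk does. */
                x = atomic_load(type_info);
                continue;
            }

            /* On failure, x is reloaded with the current value. */
            if ( atomic_compare_exchange_weak(type_info, &x, nx) )
                break;
        }
    }
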
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA243
$NetBSD: patch-XSA243,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/shadow: Don't create self-linear shadow mappings for 4-level translated guests

When initially creating a monitor table for 4-level translated guests, don't
install a shadow-linear mapping.  This mapping is actually self-linear, and
misleads the writeable heuristic logic into following Xen's mappings rather
than the guest's shadows it was expecting to follow.

A consequence of this is that sh_guess_wrmap() needs to cope with there being
no shadow-linear mapping present, which in practice occurs once each time a
vcpu switches to 4-level paging from a different paging mode.

An appropriate shadow-linear slot will be inserted into the monitor table
either while constructing lower level monitor tables, or by sh_update_cr3().

While fixing this, clarify the safety of the other mappings.  Despite
appearing unsafe, it is correct to create a guest-linear mapping for
translated domains; this is self-linear and doesn't point into the translated
domain.  Drop a dead clause for translate != external guests.

This is XSA-243.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>

diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index d70b1c6..029e8d4 100644
--- xen/arch/x86/mm/shadow/multi.c.orig
+++ xen/arch/x86/mm/shadow/multi.c
@@ -1472,26 +1472,38 @@ void sh_install_xen_entries_in_l4(struct domain *d, mfn_t gl4mfn, mfn_t sl4mfn)
         sl4e[shadow_l4_table_offset(RO_MPT_VIRT_START)] = shadow_l4e_empty();
     }
 
-    /* Shadow linear mapping for 4-level shadows.  N.B. for 3-level
-     * shadows on 64-bit xen, this linear mapping is later replaced by the
-     * monitor pagetable structure, which is built in make_monitor_table
-     * and maintained by sh_update_linear_entries. */
-    sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
-        shadow_l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR);
-
-    /* Self linear mapping.  */
-    if ( shadow_mode_translate(d) && !shadow_mode_external(d) )
+    /*
+     * Linear mapping slots:
+     *
+     * Calling this function with gl4mfn == sl4mfn is used to construct a
+     * monitor table for translated domains.  In this case, gl4mfn forms the
+     * self-linear mapping (i.e. not pointing into the translated domain), and
+     * the shadow-linear slot is skipped.  The shadow-linear slot is either
+     * filled when constructing lower level monitor tables, or via
+     * sh_update_cr3() for 4-level guests.
+     *
+     * Calling this function with gl4mfn != sl4mfn is used for non-translated
+     * guests, where the shadow-linear slot is actually self-linear, and the
+     * guest-linear slot points into the guests view of its pagetables.
+     */
+    if ( shadow_mode_translate(d) )
     {
-        // linear tables may not be used with translated PV guests
-        sl4e[shadow_l4_table_offset(LINEAR_PT_VIRT_START)] =
+        ASSERT(mfn_eq(gl4mfn, sl4mfn));
+
+        sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
             shadow_l4e_empty();
     }
     else
     {
-        sl4e[shadow_l4_table_offset(LINEAR_PT_VIRT_START)] =
-            shadow_l4e_from_mfn(gl4mfn, __PAGE_HYPERVISOR);
+        ASSERT(!mfn_eq(gl4mfn, sl4mfn));
+
+        sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
+            shadow_l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR);
     }
 
+    sl4e[shadow_l4_table_offset(LINEAR_PT_VIRT_START)] =
+        shadow_l4e_from_mfn(gl4mfn, __PAGE_HYPERVISOR);
+
     unmap_domain_page(sl4e);
 }
 #endif
@@ -4287,6 +4299,11 @@ static int sh_guess_wrmap(struct vcpu *v, unsigned long vaddr, mfn_t gmfn)
 
     /* Carefully look in the shadow linear map for the l1e we expect */
 #if SHADOW_PAGING_LEVELS >= 4
+    /* Is a shadow linear map installed in the first place? */
+    sl4p  = v->arch.paging.shadow.guest_vtable;
+    sl4p += shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START);
+    if ( !(shadow_l4e_get_flags(*sl4p) & _PAGE_PRESENT) )
+        return 0;
     sl4p = sh_linear_l4_table(v) + shadow_l4_linear_offset(vaddr);
     if ( !(shadow_l4e_get_flags(*sl4p) & _PAGE_PRESENT) )
         return 0;

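The "slots" referred to above are fixed indices in the top-level (L4)
pagetable, each covering 512 GiB of virtual address space with 4-level
paging. The offset computation is plain bit arithmetic, roughly as below
(generic x86-64 sketch, not Xen's shadow_l4_table_offset()):

    #include <stdint.h>

    /* 512 entries per table, 9 index bits per level; the L4 index of
     * a virtual address lives in bits 39-47.  SH_LINEAR_PT_VIRT_START
     * and LINEAR_PT_VIRT_START simply name two such fixed indices. */
    static unsigned int l4_table_offset(uint64_t va)
    {
        return (unsigned int)(va >> 39) & 0x1ff;
    }

This is why sh_guess_wrmap() can test a single present bit in the monitor
table before walking through the shadow-linear window.
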
File Added: pkgsrc/sysutils/xenkernel48/patches/Attic/patch-XSA244
$NetBSD: patch-XSA244,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: [PATCH] x86/cpu: Fix IST handling during PCPU bringup

Clear IST references in newly allocated IDTs.  Nothing good will come of
having them set before the TSS is suitably constructed (although the chances
of the CPU surviving such an IST interrupt/exception are extremely slim).

Uniformly set the IST references after the TSS is in place.  This fixes an
issue on AMD hardware, where onlining a PCPU while PCPU0 is in HVM context
will cause IST_NONE to be copied into the new IDT, making that PCPU vulnerable
to privilege escalation from PV guests until it subsequently schedules an HVM
guest.

This is XSA-244.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/cpu/common.c | 5 +++++
 xen/arch/x86/smpboot.c    | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 78f5667..6cf3628 100644
--- xen/arch/x86/cpu/common.c.orig
+++ xen/arch/x86/cpu/common.c
@@ -640,6 +640,7 @@ void __init early_cpu_init(void)
  * - Sets up TSS with stack pointers, including ISTs
  * - Inserts TSS selector into regular and compat GDTs
  * - Loads GDT, IDT, TR then null LDT
+ * - Sets up IST references in the IDT
  */
 void load_system_tables(void)
 {
@@ -702,6 +703,10 @@ void load_system_tables(void)
 	asm volatile ("ltr  %w0" : : "rm" (TSS_ENTRY << 3) );
 	asm volatile ("lldt %w0" : : "rm" (0) );
 
+	set_ist(&idt_tables[cpu][TRAP_double_fault],  IST_DF);
+	set_ist(&idt_tables[cpu][TRAP_nmi],	      IST_NMI);
+	set_ist(&idt_tables[cpu][TRAP_machine_check], IST_MCE);
+
 	/*
 	 * Bottom-of-stack must be 16-byte aligned!
 	 *
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 3ca716c..1609b62 100644
--- xen/arch/x86/smpboot.c.orig
+++ xen/arch/x86/smpboot.c
@@ -724,6 +724,9 @@ static int cpu_smpboot_alloc(unsigned int cpu)
     if ( idt_tables[cpu] == NULL )
         goto oom;
     memcpy(idt_tables[cpu], idt_table, IDT_ENTRIES * sizeof(idt_entry_t));
+    set_ist(&idt_tables[cpu][TRAP_double_fault],  IST_NONE);
+    set_ist(&idt_tables[cpu][TRAP_nmi],           IST_NONE);
+    set_ist(&idt_tables[cpu][TRAP_machine_check], IST_NONE);
 
     for ( stub_page = 0, i = cpu & ~(STUBS_PER_PAGE - 1);
           i < nr_cpu_ids && i <= (cpu | (STUBS_PER_PAGE - 1)); ++i )

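For context: in a 64-bit IDT gate, byte 4 holds the IST field, which
selects one of up to seven alternate stacks described in the TSS; 0
(IST_NONE) means the CPU stays on the current stack. A sketch of the
architectural layout (Xen's idt_entry_t and set_ist() differ in detail):

    #include <stdint.h>

    struct idt_gate64 {
        uint16_t offset_lo;
        uint16_t selector;
        uint8_t  ist;        /* bits 0-2: IST index; rest reserved */
        uint8_t  type_attr;
        uint16_t offset_mid;
        uint32_t offset_hi;
        uint32_t reserved;
    };

    static void gate_set_ist(struct idt_gate64 *gate, uint8_t ist)
    {
        gate->ist = ist & 7;
    }

The ordering the patch enforces is: copy the IDT with all IST fields
cleared, construct the TSS and its stacks, and only then point the double
fault, NMI and machine check gates at their ISTs.
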
cvs diff -r1.7 -r1.8 pkgsrc/sysutils/xentools48/Attic/Makefile

--- pkgsrc/sysutils/xentools48/Attic/Makefile 2017/09/08 09:51:25 1.7
+++ pkgsrc/sysutils/xentools48/Attic/Makefile 2017/10/17 08:42:30 1.8
@@ -1,32 +1,32 @@
-# $NetBSD: Makefile,v 1.7 2017/09/08 09:51:25 jaapb Exp $
+# $NetBSD: Makefile,v 1.8 2017/10/17 08:42:30 bouyer Exp $
 #
-VERSION= 4.8.0
+VERSION= 4.8.2
 VERSION_IPXE= 827dd1bfee67daa683935ce65316f7e0f057fe1c
 DIST_IPXE= ipxe-git-${VERSION_IPXE}.tar.gz
 DIST_NEWLIB= newlib-1.16.0.tar.gz
 DIST_LWIP= lwip-1.3.0.tar.gz
 DIST_GRUB= grub-0.97.tar.gz
 DIST_GMP= gmp-4.3.2.tar.bz2
 DIST_OCAML= ocaml-3.11.0.tar.gz
 DIST_POLARSSL= polarssl-1.1.4-gpl.tgz
 DIST_TPMEMU= tpm_emulator-0.7.4.tar.gz
 DIST_ZLIB= zlib-1.2.3.tar.gz
 DIST_LIBPCI= pciutils-2.2.9.tar.bz2
 
 DIST_SUBDIR= xen48
 DISTNAME= xen-${VERSION}
 PKGNAME= xentools48-${VERSION}
-PKGREVISION= 3
+PKGREVISION= 1
 CATEGORIES= sysutils
 MASTER_SITES= https://downloads.xenproject.org/release/xen/${VERSION}/
 
 DISTFILES= ${DISTNAME}.tar.gz
 
 XEN_EXTFILES= http://xenbits.xensource.com/xen-extfiles/
 DISTFILES+= ${DIST_IPXE}
 SITES.${DIST_IPXE} += ${XEN_EXTFILES}
 
 DISTFILES+= ${DIST_NEWLIB}
 SITES.${DIST_NEWLIB} += ${XEN_EXTFILES}
 
 DISTFILES+= ${DIST_LWIP}

cvs diff -r1.3 -r1.4 pkgsrc/sysutils/xentools48/Attic/distinfo

--- pkgsrc/sysutils/xentools48/Attic/distinfo 2017/08/23 03:02:14 1.3
+++ pkgsrc/sysutils/xentools48/Attic/distinfo 2017/10/17 08:42:30 1.4
@@ -1,14 +1,14 @@
-$NetBSD: distinfo,v 1.3 2017/08/23 03:02:14 maya Exp $
+$NetBSD: distinfo,v 1.4 2017/10/17 08:42:30 bouyer Exp $
 
 SHA1 (xen48/gmp-4.3.2.tar.bz2) = c011e8feaf1bb89158bd55eaabd7ef8fdd101a2c
 RMD160 (xen48/gmp-4.3.2.tar.bz2) = a8f3f41501ece290c348aeb4444bbea40bc53e71
 SHA512 (xen48/gmp-4.3.2.tar.bz2) = 2e0b0fd23e6f10742a5517981e5171c6e88b0a93c83da701b296f5c0861d72c19782daab589a7eac3f9032152a0fc7eff7f5362db8fccc4859564a9aa82329cf
 Size (xen48/gmp-4.3.2.tar.bz2) = 1897483 bytes
 SHA1 (xen48/grub-0.97.tar.gz) = 2580626c4579bd99336d3af4482c346c95dac4fb
 RMD160 (xen48/grub-0.97.tar.gz) = 7fb5674edf0c950bd38e94f85ff1e2909aa741f0
 SHA512 (xen48/grub-0.97.tar.gz) = c2bc9ffc8583aeae71cee9ddcc4418969768d4e3764d47307da54f93981c0109fb07d84b061b3a3628bd00ba4d14a54742bc04848110eb3ae8ca25dbfbaabadb
 Size (xen48/grub-0.97.tar.gz) = 971783 bytes
 SHA1 (xen48/ipxe-git-827dd1bfee67daa683935ce65316f7e0f057fe1c.tar.gz) = 37270d4d39686e29130c51405dbabf670d37b73d
 RMD160 (xen48/ipxe-git-827dd1bfee67daa683935ce65316f7e0f057fe1c.tar.gz) = f780d33d510a83eda0c06cb9fa4732650e337640
 SHA512 (xen48/ipxe-git-827dd1bfee67daa683935ce65316f7e0f057fe1c.tar.gz) = 82ba65e1c676d32b29c71e6395c9506cab952c8f8b03f692e2b50133be8f0c0146d0f22c223262d81a4df579986fde5abc6507869f4965be4846297ef7b4b890
 Size (xen48/ipxe-git-827dd1bfee67daa683935ce65316f7e0f057fe1c.tar.gz) = 3656744 bytes
@@ -26,42 +26,42 @@ SHA512 (xen48/ocaml-3.11.0.tar.gz) = 61c
 Size (xen48/ocaml-3.11.0.tar.gz) = 2855506 bytes
 SHA1 (xen48/pciutils-2.2.9.tar.bz2) = 2871be0890f0406c7f86fa01646e23935fda789e
 RMD160 (xen48/pciutils-2.2.9.tar.bz2) = 781a3d30c5c429a0d92110a46711144f74acde06
 SHA512 (xen48/pciutils-2.2.9.tar.bz2) = 2b3d98d027e46d8c08037366dde6f0781ca03c610ef2b380984639e4ef39899ed8d8b8e4cd9c9dc54df101279b95879bd66bfd4d04ad07fef41e847ea7ae32b5
 Size (xen48/pciutils-2.2.9.tar.bz2) = 212265 bytes
 SHA1 (xen48/polarssl-1.1.4-gpl.tgz) = 3dd10bd1a8f7f58e0ef8c91cfa5ea7efd5d5f4bc
 RMD160 (xen48/polarssl-1.1.4-gpl.tgz) = da5e218d1462561006841baff747f60bb4655f08
 SHA512 (xen48/polarssl-1.1.4-gpl.tgz) = 88da614e4d3f4409c4fd3bb3e44c7587ba051e3fed4e33d526069a67e8180212e1ea22da984656f50e290049f60ddca65383e5983c0f8884f648d71f698303ad
 Size (xen48/polarssl-1.1.4-gpl.tgz) = 611340 bytes
 SHA1 (xen48/tpm_emulator-0.7.4.tar.gz) = ffa3aafcd833fdcd7483bbdb4ff862f30ffde579
 RMD160 (xen48/tpm_emulator-0.7.4.tar.gz) = ded71632d316126138f2db4a5f2051b2489ae5ff
 SHA512 (xen48/tpm_emulator-0.7.4.tar.gz) = 4928b5b82f57645be9408362706ff2c4d9baa635b21b0d41b1c82930e8c60a759b1ea4fa74d7e6c7cae1b7692d006aa5cb72df0c3b88bf049779aa2b566f9d35
 Size (xen48/tpm_emulator-0.7.4.tar.gz) = 214145 bytes
-SHA1 (xen48/xen-4.8.0.tar.gz) = c2403899b13e1e8b8da391aceecbfc932d583a88
-RMD160 (xen48/xen-4.8.0.tar.gz) = b79b1e2587caa9c6fe68d2996a4fd42f95c1fe7b
-SHA512 (xen48/xen-4.8.0.tar.gz) = 70b95553f9813573b12e52999a4df8701dec430f23c36a8dc70d25a46bb4bc9234e5b7feb74a04062af4c8d6b6bcfe947d90b2b172416206812e54bac9797454
-Size (xen48/xen-4.8.0.tar.gz) = 22499917 bytes
+SHA1 (xen48/xen-4.8.2.tar.gz) = 184c57ce9e71e34b3cbdd318524021f44946efbe
+RMD160 (xen48/xen-4.8.2.tar.gz) = f4126cb0f7ff427ed7d20ce399dcd1077c599343
+SHA512 (xen48/xen-4.8.2.tar.gz) = 7805531f73d23ecfff3439770e62d387f4254a444875670d53a0a739323e5d4d8f8fcc478f8936ee1ae8aff3e0229549e47c01c606365a8ce060dd5c503e87da
+Size (xen48/xen-4.8.2.tar.gz) = 22522336 bytes
 SHA1 (xen48/zlib-1.2.3.tar.gz) = 60faeaaf250642db5c0ea36cd6dcc9f99c8f3902
 RMD160 (xen48/zlib-1.2.3.tar.gz) = 89a57e336c24f7f6eebda3a1724e14b71187e117
 SHA512 (xen48/zlib-1.2.3.tar.gz) = 021b958fcd0d346c4ba761bcf0cc40f3522de6186cf5a0a6ea34a70504ce9622b1c2626fce40675bc8282cf5f5ade18473656abc38050f72f5d6480507a2106e
 Size (xen48/zlib-1.2.3.tar.gz) = 496597 bytes
 SHA1 (patch-.._ipxe_src_core_settings.c) = 1eab2fbd8b22dde2b8aa830ae7701603486f74e4
 SHA1 (patch-.._ipxe_src_net_fcels.c) = 3b515307d8203b60815ad76bfd2a82289e05ebc5
 SHA1 (patch-.._newlib-1.16.0_newlib_libc_include_sys__types.h) = 65ff526aa26832b930086279ed6c83862040f8ac
 SHA1 (patch-._stubdom_vtpmmgr_tpm2_marshal.h) = 30c747a53e848387e4c8d6f4dcbcab7d1b46ed12
 SHA1 (patch-Config.mk) = 7976ce94c553c2fc6badc6d41e9cb8334fea40c8
 SHA1 (patch-Makefile) = fdcd5fbb22613e55ac1b000a46b1ecbbd99eef59
-SHA1 (patch-XSA-211-1) = df96b8992148e442a887715ccca741b948fbb0f5
-SHA1 (patch-XSA-211-2) = c860da3631c1c7988f9bb150020935859c6b061f
+SHA1 (patch-XSA233) = e6a7230035966d7d292ef3ca477f2eb3458ae12f
+SHA1 (patch-XSA240) = 754bbe5080a81e1526b7938fed01ba435e65e50b
 SHA1 (patch-docs_man_xl.cfg.pod.5.in) = e1ee6f2d48f6ce001c44c7ac688ea179b625b584
 SHA1 (patch-docs_man_xl.conf.pod.5) = d77e3313750db315d540d7713c95cd54d6f02938
 SHA1 (patch-docs_man_xl.pod.1.in) = 9b37ef724f2827bc05110e5456a8668257509cab
 SHA1 (patch-docs_man_xlcpupool.cfg.pod.5) = 3f6db65d95b5fc607c2fa7e2fc975e0ddbfdd5e5
 SHA1 (patch-docs_misc_xl-disk-configuration.txt) = b5c71dab9adc5ab1be38077617a8ea10b59485ec
 SHA1 (patch-extras_mini-os_Config.mk) = cb5cdb32f1b3c55abad702ab6768caf59d886ff2
 SHA1 (patch-extras_mini-os_arch_x86_arch.mk) = 8b4f1fe0e888f5b70408d2cc3a3968ce27eae5dc
 SHA1 (patch-extras_mini-os_include_fcntl.h) = 4ed18497227c8c327ee3db9d793caa4ac6254822
 SHA1 (patch-extras_mini-os_include_time.h) = ab3b0794bf892ce6a036aa889c6852d65b508596
 SHA1 (patch-extras_mini-os_lib_sys.c) = 9dd4bcab9deed5132d0fe88a0fe0d33b3fc7d09c
 SHA1 (patch-extras_mini-os_lock.c) = e28753793dee483c1ffad8ea8ed2706353046b50
 SHA1 (patch-stubdom_Makefile) = 6c52ae9af4003fdc199980b6725265fde5a06545
 SHA1 (patch-stubdom_newlib.patch) = e937cd046db217e45b1de76bd0950f514666bc12

File Deleted: pkgsrc/sysutils/xentools48/patches/Attic/patch-XSA-211-1

File Deleted: pkgsrc/sysutils/xentools48/patches/Attic/patch-XSA-211-2

File Added: pkgsrc/sysutils/xentools48/patches/Attic/patch-XSA233
$NetBSD: patch-XSA233,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From: Juergen Gross <jgross@suse.com>
Subject: tools/xenstore: don't unlink connection object twice

A connection object of a domain with associated stubdom has two
parents: the domain and the stubdom. When cleaning up the list of
active domains in domain_cleanup() make sure not to unlink the
connection twice from the same domain. This could happen when the
domain and its stubdom are being destroyed at the same time, leading
to the domain loop being entered twice.

Additionally don't use talloc_free() in this case as it will remove
a random parent link, leading eventually to a memory leak. Use
talloc_unlink() instead specifying the context from which the
connection object should be removed.

This is XSA-233.

Reported-by: Eric Chanudet <chanudete@ainfosec.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>

--- tools/xenstore/xenstored_domain.c.orig
+++ tools/xenstore/xenstored_domain.c
@@ -221,10 +221,11 @@ static int destroy_domain(void *_domain)
 static void domain_cleanup(void)
 {
 	xc_dominfo_t dominfo;
-	struct domain *domain, *tmp;
+	struct domain *domain;
 	int notify = 0;
 
-	list_for_each_entry_safe(domain, tmp, &domains, list) {
+ again:
+	list_for_each_entry(domain, &domains, list) {
 		if (xc_domain_getinfo(*xc_handle, domain->domid, 1,
 				      &dominfo) == 1 &&
 		    dominfo.domid == domain->domid) {
@@ -236,8 +237,12 @@ static void domain_cleanup(void)
 			if (!dominfo.dying)
 				continue;
 		}
-		talloc_free(domain->conn);
-		notify = 0; /* destroy_domain() fires the watch */
+		if (domain->conn) {
+			talloc_unlink(talloc_autofree_context(), domain->conn);
+			domain->conn = NULL;
+			notify = 0; /* destroy_domain() fires the watch */
+			goto again;
+		}
 	}
 
 	if (notify)

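The distinction matters because a talloc object can have several parents,
and only talloc_unlink() lets the caller name which parent link to drop.
A minimal sketch of the two-parent situation (illustrative only; xenstored
bundles its own talloc copy):

    #include <talloc.h>

    void two_parent_example(void)
    {
        void *domain  = talloc_new(NULL);
        void *stubdom = talloc_new(NULL);
        char *conn    = talloc_strdup(domain, "connection");

        talloc_reference(stubdom, conn);  /* add a second parent     */
        talloc_unlink(domain, conn);      /* drop exactly this link  */

        talloc_free(stubdom);             /* last link goes, and the
                                           * object goes with it     */
        talloc_free(domain);
    }

With talloc_free(conn) instead, the parent link removed is not under the
caller's control, which is the leak the patch description warns about.
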
File Added: pkgsrc/sysutils/xentools48/patches/Attic/patch-XSA240
$NetBSD: patch-XSA240,v 1.1 2017/10/17 08:42:30 bouyer Exp $

From 41d579aad2fee971e5ce0279a9b559a0fdc74452 Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@citrix.com>
Date: Fri, 22 Sep 2017 11:46:55 +0100
Subject: [PATCH 2/2] x86/mm: Disable PV linear pagetables by default

Allowing pagetables to point to other pagetables of the same level
(often called 'linear pagetables') has been included in Xen since its
inception.  But it is not used by the most common PV guests (Linux,
NetBSD, minios), and has been the source of a number of subtle
reference-counting bugs.

Add a command-line option to control whether PV linear pagetables are
allowed (disabled by default).

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes since v2:
- s/_/-/; in command-line option
- Added __read_mostly
---
 docs/misc/xen-command-line.markdown | 15 +++++++++++++++
 xen/arch/x86/mm.c                   |  9 +++++++++
 2 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 54acc60723..ffa66eb146 100644
--- docs/misc/xen-command-line.markdown.orig
+++ docs/misc/xen-command-line.markdown
@@ -1350,6 +1350,21 @@ The following resources are available:
     CDP, one COS will corespond two CBMs other than one with CAT, due to the
     sum of CBMs is fixed, that means actual `cos_max` in use will automatically
     reduce to half when CDP is enabled.
+
+### pv-linear-pt
+> `= <boolean>`
+
+> Default: `true`
+
+Allow PV guests to have pagetable entries pointing to other pagetables
+of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
+This technique is often called "linear pagetables", and is sometimes
+used to give an operating system a simple way to consistently map the
+current process's pagetables into its own virtual address space.
+
+None of the most common PV operating systems (Linux, MiniOS)
+use this technique, but NetBSD does when running in PV mode, and
+custom operating systems may as well.
 
 ### reboot
 > `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 31d4a03840..5d125cff3a 100644