---
- branch: netbsd-6-1
  date: Mon Mar 6 08:18:14 UTC 2017
  files:
  - new: 1.49.2.2.6.1
    old: 1.49.2.2
    path: src/sys/arch/x86/include/pmap.h
    pathrev: src/sys/arch/x86/include/pmap.h@1.49.2.2.6.1
    type: modified
  - new: 1.164.2.4.6.2
    old: 1.164.2.4.6.1
    path: src/sys/arch/x86/x86/pmap.c
    pathrev: src/sys/arch/x86/x86/pmap.c@1.164.2.4.6.2
    type: modified
  id: 20170306T081814Z.a4cf9056dd041e9e0565dbf3ef213a674072df46
  log: "Pull up following revision(s) (requested by bouyer in ticket #1441):\n\tsys/arch/x86/x86/pmap.c: revision 1.241 via patch\n\tsys/arch/x86/include/pmap.h: revision 1.63 via patch\nShould be PG_k, doesn't change anything.\n--\nRemove PG_u from the kernel pages on Xen. Otherwise there is no privilege\nseparation between the kernel and userland.\nOn Xen-amd64, the kernel runs in ring3 just like userland, and the\nseparation is guaranteed by the hypervisor - each syscall/trap is\nintercepted by Xen and sent manually to the kernel. Before that, the\nhypervisor modifies the page tables so that the kernel becomes accessible.\nLater, when returning to userland, the hypervisor removes the kernel pages\nand flushes the TLB.\nHowever, TLB flushes are costly, and in order to reduce the number of pages\nflushed Xen marks the userland pages as global, while keeping the kernel\nones as local. This way, when returning to userland, only the kernel pages\nget flushed - which makes sense since they are the only ones that got\nremoved from the mapping.\nXen differentiates the userland pages by looking at their PG_u bit in the\nPTE; if a page has this bit then Xen tags it as global, otherwise Xen\nmanually adds the bit but keeps the page as local. The thing is, since we\nset PG_u in the kernel pages, Xen believes our kernel pages are in fact\nuserland pages, so it marks them as global. Therefore, when returning to\nuserland, the kernel pages indeed get removed from the page tree, but are\nnot flushed from the TLB, which means that they are still accessible.\nWith this - and depending on the DTLB size - userland has a small window\nwhere it can read/write to the last kernel pages accessed, which is enough\nto completely escalate privileges: the sysent structure systematically gets\nread when performing a syscall, and chances are that it will still be\ncached in the TLB. Userland can then use this to patch a chosen syscall,\nmake it point to a userland function, retrieve %gs and compute the address\nof its credentials, and finally grant itself root privileges.\n"
  module: src
  subject: 'CVS commit: [netbsd-6-1] src/sys/arch/x86'
  unixtime: '1488788294'
  user: snj