Date: Mon, 15 Jun 2020 09:09:24 +0000
From: "SAITOH Masanobu"
Subject: CVS commit: src/sys
To: source-changes@NetBSD.org
Message-Id: <20200615090924.6C0CEFB28@cvs.NetBSD.org>

Module Name:	src
Committed By:	msaitoh
Date:		Mon Jun 15 09:09:24 UTC 2020

Modified Files:
	src/sys/arch/amd64/amd64: cpufunc.S
	src/sys/arch/i386/i386: cpufunc.S
	src/sys/arch/x86/include: cpu_counter.h cpufunc.h
	src/sys/arch/x86/x86: cpu.c hyperv.c tsc.c tsc.h
	src/sys/rump/librump/rumpkern/arch/x86: rump_x86_cpu_counter.c

Log Message:
Serialize rdtsc with lfence, mfence or cpuid to read the TSC more precisely.

x86/x86/tsc.c rev. 1.67 reduced the cache problem and gave a big improvement,
but there is still room for more. I measured the effect of lfence, mfence,
cpuid and rdtscp.
The impact on TSC skew and/or drift is:

	AMD:   mfence > rdtscp > cpuid > lfence-serialize > lfence = nomodify
	Intel: lfence > rdtscp > cpuid > nomodify

So, mfence is the best on AMD and lfence is the best on Intel. If the CPU
has no SSE2, we can use cpuid.

NOTE:
 - An AMD document says the DE_CFG_LFENCE_SERIALIZE bit can be used for
   serializing, but it's not so good.
 - On Intel i386 (not amd64), the improvement seems to be very small.
 - The rdtscp instruction can be used as a serializing instruction + rdtsc,
   but it's not as good as [lm]fence. Both the Intel and AMD documents say
   that the latency of rdtscp is bigger than that of rdtsc, so I suspect the
   difference in the results comes from that.

To generate a diff of this commit:
cvs rdiff -u -r1.60 -r1.61 src/sys/arch/amd64/amd64/cpufunc.S
cvs rdiff -u -r1.46 -r1.47 src/sys/arch/i386/i386/cpufunc.S
cvs rdiff -u -r1.6 -r1.7 src/sys/arch/x86/include/cpu_counter.h
cvs rdiff -u -r1.40 -r1.41 src/sys/arch/x86/include/cpufunc.h
cvs rdiff -u -r1.193 -r1.194 src/sys/arch/x86/x86/cpu.c
cvs rdiff -u -r1.9 -r1.10 src/sys/arch/x86/x86/hyperv.c
cvs rdiff -u -r1.50 -r1.51 src/sys/arch/x86/x86/tsc.c
cvs rdiff -u -r1.6 -r1.7 src/sys/arch/x86/x86/tsc.h
cvs rdiff -u -r1.1 -r1.2 \
    src/sys/rump/librump/rumpkern/arch/x86/rump_x86_cpu_counter.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
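
For readers unfamiliar with the technique in the log message, the following is a minimal userland sketch of a fenced TSC read. It is not the code from this commit. The names `tsc_fence`, `pick_tsc_fence` and `read_tsc_serialized` are hypothetical, and the selection rule only restates the log message: mfence on AMD, lfence on Intel, cpuid when SSE2 is unavailable. On non-x86 targets the read function is assumed to simply return 0.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the NetBSD sources. */
enum tsc_fence {
	TSC_FENCE_NONE,
	TSC_FENCE_LFENCE,
	TSC_FENCE_MFENCE,
	TSC_FENCE_CPUID
};

/*
 * Choose a serializing method following the log message:
 * mfence on AMD, lfence on Intel, cpuid when SSE2 is missing.
 */
static inline enum tsc_fence
pick_tsc_fence(int is_amd, int has_sse2)
{
	if (!has_sse2)
		return TSC_FENCE_CPUID;
	return is_amd ? TSC_FENCE_MFENCE : TSC_FENCE_LFENCE;
}

/* Issue the chosen fence, then read the time stamp counter. */
static inline uint64_t
read_tsc_serialized(enum tsc_fence fence)
{
#if defined(__x86_64__) || defined(__i386__)
	uint32_t lo, hi, a, b, c, d;

	switch (fence) {
	case TSC_FENCE_LFENCE:
		__asm__ volatile("lfence" ::: "memory");
		break;
	case TSC_FENCE_MFENCE:
		__asm__ volatile("mfence" ::: "memory");
		break;
	case TSC_FENCE_CPUID:
		/* cpuid is serializing; its output is discarded. */
		__asm__ volatile("cpuid"
		    : "=a"(a), "=b"(b), "=c"(c), "=d"(d)
		    : "a"(0), "c"(0)
		    : "memory");
		(void)a; (void)b; (void)c; (void)d;
		break;
	case TSC_FENCE_NONE:
	default:
		break;
	}
	/* rdtsc returns the counter in edx:eax. */
	__asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
#else
	/* Assumption: no TSC on this target, so report 0. */
	(void)fence;
	return 0;
#endif
}
```

A caller would typically pick the fence once at startup, for example `enum tsc_fence f = pick_tsc_fence(is_amd, has_sse2);`, and then call `read_tsc_serialized(f)` around the region being timed. The vendor and SSE2 detection is left out of the sketch.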