---
- branch: MAIN
  date: Sat Jan 11 17:11:50 UTC 2014
  files:
  - new: '1.5'
    old: '1.4'
    path: src/sys/arch/i386/i386/db_machdep.c
    pathrev: src/sys/arch/i386/i386/db_machdep.c@1.5
    type: modified
  id: 20140111T171150Z.132b650be8346651eacab3d5896945d75dd21861
  log: |
    stop ddb backtrace at Xsoftintr() (Richard Hansen)

    Stop unwinding frames when db_stack_trace_print() encounters
    Xsoftintr().  This avoids a recursive panic() due to an invalid
    pointer dereference when a software interrupt panic()s.

    Here's what happens without this change:

    When db_stack_trace_print() runs during a panic() and db_nextframe()
    encounters the Xsoftintr() frame, db_nextframe() does the following
    at db_machdep.c:292:
      1. checks to see if there's an Xsoftintr() symbol (there is)
      2. checks to see if the frame corresponds to an interrupt (the
         symbol name begins with "Xsoft" so it does)

    If both of the above are true (they are), db_nextframe() at
    db_machdep.c:303 tries to get a pointer to a struct intrframe.
    According to the comment at line 300, the second argument passed to
    Xsoftintr() is a pointer to a struct intrframe.  However, the comment
    and the corresponding code are not correct -- Xsoftintr() doesn't
    take any arguments[1].  Attempting to fetch the second argument only
    yields stack garbage, not a struct intrframe.  This causes
    db_machdep.c:307 to dereference a bad pointer, triggering the
    recursive panic().

    [1] Xsoftintr() is called by Xspllower() which is called by splx()
    a.k.a. spllower().  Neither Xspllower() nor Xsoftintr() sets up a
    standard frame when called (they don't do 'pushl %ebp; movl %esp,
    %ebp'), so Xsoftintr()'s %ebp is the same as splx()'s %ebp.  This
    makes splx()'s arguments look like Xsoftintr()'s arguments, and
    splx() does not take any arguments.

    You can reproduce the recursive panic by reverting this change and
    adding a call to panic() inside ipintr().
    The backtrace will look like the following (the line numbers you
    see might differ -- this backtrace was generated from a slightly
    modified version of the NetBSD 6.1 kernel):

    #0  vpanic (fmt=0xc0ba995b "trap", ap=0xdaa51730)
        at /usr/src/sys/kern/subr_prf.c:211
    #1  0xc0790529 in panic (fmt=0xc0ba995b "trap")
        at /usr/src/sys/kern/subr_prf.c:205
    #2  0xc07decbc in trap (frame=0xdaa517c0)
        at /usr/src/sys/arch/i386/i386/trap.c:396
    #3  0xc010cf48 in ?? ()
        at /usr/src/sys/arch/i386/i386/vector.S:983
    #4  0xc02857f0 in db_get_value (addr=56, size=4, is_signed=false)
        at /usr/src/sys/ddb/db_access.c:72
    #5  0xc028a09a in db_nextframe (nextframe=0xdaa51b40, retaddr=0xdaa51b3c, arg0=0xdaa51b38, ip=0xdaa51b34, argp=0xdaa51d88, is_trap=0, pr=0xc07901b5 )
        at /usr/src/sys/arch/i386/i386/db_machdep.c:308
    #6  0xc028be2b in db_stack_trace_print (addr=, have_addr=true, count=65533, modif=0xc0bb44bf "", pr=0xc07901b5 )
        at /usr/src/sys/arch/x86/x86/db_trace.c:275
    #7  0xc07903cb in vpanic (fmt=0xc0b6ba76 "testing", ap=0xdaa51d4c)
        at /usr/src/sys/kern/subr_prf.c:296
    #8  0xc0790529 in panic (fmt=0xc0b6ba76 "testing")
        at /usr/src/sys/kern/subr_prf.c:205
    #9  0xc04e3d4f in ipintr ()
        at /usr/src/sys/netinet/ip_input.c:369
    #10 0xc054ac0d in softint_execute (s=, si=, l=)
        at /usr/src/sys/kern/kern_softint.c:543
    #11 softint_dispatch (pinned=0xc4085560, s=4)
        at /usr/src/sys/kern/kern_softint.c:825
    #12 0xc0100fdb in ?? ()
        at /usr/src/sys/arch/i386/i386/spl.S:390
    #13 0xc07d2e11 in tcp_usrreq (so=0xc40b0534, req=4, m=0x0, nam=0xc317ba00, control=0x0, l=0xc4085560)
        at /usr/src/sys/netinet/tcp_usrreq.c:615
    #14 0xc04bb300 in tcp_usrreq_wrapper (a=0xc40b0534, b=4, c=0x0, d=0xc317ba00, e=0x0, f=0xc4085560)
        at /usr/src/sys/netinet/in_proto.c:164
    #15 0xc0839006 in soconnect (so=0xc40b0534, nam=0xc317ba00, l=0xc4085560)
        at /usr/src/sys/kern/uipc_socket.c:821
    #16 0xc083c4ce in do_sys_connect (l=0xc4085560, fd=4, nam=0xc317ba00)
        at /usr/src/sys/kern/uipc_syscalls.c:371
    #17 0xc083dbeb in sys_connect (l=0xc4085560, uap=0xdbc27d00, retval=0xdbc27d28)
        at /usr/src/sys/kern/uipc_syscalls.c:350
    #18 0xc07b1b4a in sy_call (rval=0xdbc27d28, uap=0xdbc27d00, l=0xc4085560, sy=0xc0c2f018)
        at /usr/src/sys/sys/syscallvar.h:61
    #19 syscall (frame=0xdbc27d48)
        at /usr/src/sys/arch/x86/x86/syscall.c:179
    #20 0xc010056d in ?? ()
        at /usr/src/sys/arch/i386/i386/locore.S:1160
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)
  module: src
  subject: 'CVS commit: src/sys/arch/i386/i386'
  unixtime: '1389460310'
  user: christos