linux/drivers/tty
Kosuke Tatsukawa e81107d4c6 tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c
My colleague ran into a program stall on a x86_64 server, where
n_tty_read() was waiting for data even if there was data in the buffer
in the pty.  kernel stack for the stuck process looks like below.
 #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
 #1 [ffff88303d107bd0] schedule at ffffffff815c513e
 #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
 #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
 #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
 #5 [ffff88303d107dd0] tty_read at ffffffff81368013
 #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
 #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
 #8 [ffff88303d107f00] sys_read at ffffffff811a4306
 #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7

There seems to be two problems causing this issue.

First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
updates ldata->commit_head using smp_store_release() and then checks
the wait queue using waitqueue_active().  However, since there is no
memory barrier, __receive_buf() could return without calling
wake_up_interactive_poll(), and at the same time, n_tty_read() could
start to wait in wait_woken() as in the following chart.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
if (waitqueue_active(&tty->read_wait))
/* Memory operations issued after the
   RELEASE may be completed before the
   RELEASE operation has completed */
                                        add_wait_queue(&tty->read_wait, &wait);
                                        ...
                                        if (!input_available_p(tty, 0)) {
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

The second problem is that n_tty_read() also lacks a memory barrier
call and could also cause __receive_buf() to return without calling
wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
as in the chart below.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
                                        spin_lock_irqsave(&q->lock, flags);
                                        /* from add_wait_queue() */
                                        ...
                                        if (!input_available_p(tty, 0)) {
                                        /* Memory operations issued after the
                                           RELEASE may be completed before the
                                           RELEASE operation has completed */
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
if (waitqueue_active(&tty->read_wait))
                                        __add_wait_queue(q, wait);
                                        spin_unlock_irqrestore(&q->lock,flags);
                                        /* from add_wait_queue() */
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

There are also other places in drivers/tty/n_tty.c which have similar
calls to waitqueue_active(), so instead of adding many memory barrier
calls, this patch simply removes the call to waitqueue_active(),
leaving just wake_up*() behind.

This fixes both problems because, even though the memory access before
or after the spinlocks in both wake_up*() and add_wait_queue() can
sneak into the critical section, it cannot go past it and the critical
section assures that they will be serialized (please see "INTER-CPU
ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
better explanation).  Moreover, the resulting code is much simpler.

Latency measurement using a ping-pong test over a pty doesn't show any
visible performance drop.

Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-04 19:03:40 +01:00
..
hvc xen: MFN/GFN/BFN terminology changes for 4.3-rc0 2015-09-10 16:21:11 -07:00
ipwireless Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2015-04-14 09:50:27 -07:00
serial serial: atmel: fix error path of probe function 2015-10-04 18:51:42 +01:00
vt tty: vt: Fix !TASK_RUNNING diagnostic warning from paste_selection() 2015-07-23 18:08:29 -07:00
amiserial.c tty: amiserial.c: move assignment out of if () block 2015-05-10 19:04:16 +02:00
bfin_jtag_comm.c TTY: bfin_jtag_comm: remove incorrect wait_until_sent operation 2015-03-07 03:44:14 +01:00
cyclades.c tty: remove buf parameter from tty_name() 2015-05-06 22:26:57 +02:00
ehv_bytechan.c tty: remove unused variable sprop 2015-02-03 15:52:13 -08:00
goldfish.c staging: goldfish: Fix pointer cast for 32 bits 2015-05-31 11:40:14 +09:00
isicom.c tty/isicom: fix big-endian compile warning 2015-02-02 10:11:25 -08:00
Kconfig ttyFDC: Implement KGDB IO operations. 2015-03-31 12:04:13 +02:00
Makefile TTY: Add MIPS EJTAG Fast Debug Channel TTY driver 2015-03-31 12:04:12 +02:00
metag_da.c tty/metag_da: Avoid module_init/module_exit in non-modular code 2015-06-16 14:12:31 -04:00
mips_ejtag_fdc.c MIPS: Remove "weak" from get_c0_fdc_int() declaration 2015-09-03 12:07:38 +02:00
moxa.c
moxa.h
mxser.c
mxser.h
n_gsm.c tty: Convert use of __constant_htons to htons 2015-07-23 15:10:25 -07:00
n_hdlc.c pty: Fix input race when closing 2015-05-10 19:26:37 +02:00
n_r3964.c
n_tracerouter.c
n_tracesink.c
n_tracesink.h
n_tty.c tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c 2015-10-04 19:03:40 +01:00
nozomi.c drivers/tty/nozomi.c: rename CONFIG_MAGIC 2015-05-10 19:19:06 +02:00
pty.c pty: Add debug message for ptmx open 2015-07-23 18:37:32 -07:00
rocket_int.h
rocket.c TTY: fix misspelling of current function in string 2015-01-09 14:38:15 -08:00
rocket.h tty: rocket: fix comment of ROCKET_SPD_HI 2015-05-24 12:49:16 -07:00
synclink_gt.c tty: synclink_gt.c: move assignment out of if () block 2015-05-10 19:04:18 +02:00
synclink.c tty: synclink.c: move assignment out of if () block 2015-05-10 19:04:18 +02:00
synclinkmp.c tty: synclinkmp.c: move assignment out of if () block 2015-05-10 19:04:18 +02:00
sysrq.c mm, oom: pass an oom order of -1 when triggered by sysrq 2015-09-08 15:35:28 -07:00
tty_audit.c
tty_buffer.c tty: buffers: Move hidden buffer index advance into outer loop 2015-07-23 18:23:56 -07:00
tty_io.c tty: don't leak cdev in tty_cdev_add() 2015-10-04 18:51:42 +01:00
tty_ioctl.c tty: Replace inline #ifdef TTY_DEBUG_WAIT_UNTIL_SENT 2015-07-23 18:37:31 -07:00
tty_ldisc.c tty: core: Improve ldisc debug messages 2015-07-23 18:37:31 -07:00
tty_ldsem.c tty: tty_ldsem.c: move assignment out of if () block 2015-05-10 19:04:18 +02:00
tty_mutex.c tty: Make lock subclasses available for other tty locks 2015-02-02 10:11:27 -08:00
tty_port.c tty: Deletion of unnecessary checks before two function calls 2014-11-26 19:35:49 -08:00