linux/fs/dlm
Gang He da3627c30d dlm: remove O_NONBLOCK flag in sctp_connect_to_sock
We should remove the O_NONBLOCK flag when calling sock->ops->connect()
in the sctp_connect_to_sock() function.
Why?
1. Up to now, the sctp socket connect() function has ignored the flags
argument, so the O_NONBLOCK flag has no effect. We should remove it to
avoid confusion (though this is not urgent).
2. In the future the flags argument will take effect, because a patch
that fixes this problem has been queued at
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/sctp?id=644fbdeacf1d3edd366e44b8ba214de9d1dd66a9
Once that happens, the O_NONBLOCK flag will make sock->ops->connect()
return immediately without waiting, so the connection will not be
established and the DLM kernel module will call sock->ops->connect() again
and again. As a result, CPU usage is almost 100%, and a soft lockup can
even be triggered if the related configuration options are enabled.
The DLM kernel module also prints lots of messages like:
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
[Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592
The upper application (e.g. the ocfs2 mount command) hangs in
new_lockspace(); the whole backtrace is below:
tb0307-nd2:~ # cat /proc/2935/stack
[<0>] new_lockspace+0x957/0xac0 [dlm]
[<0>] dlm_new_lockspace+0xae/0x140 [dlm]
[<0>] user_cluster_connect+0xc3/0x3a0 [ocfs2_stack_user]
[<0>] ocfs2_cluster_connect+0x144/0x220 [ocfs2_stackglue]
[<0>] ocfs2_dlm_init+0x215/0x440 [ocfs2]
[<0>] ocfs2_fill_super+0xcb0/0x1290 [ocfs2]
[<0>] mount_bdev+0x173/0x1b0
[<0>] mount_fs+0x35/0x150
[<0>] vfs_kern_mount.part.23+0x54/0x100
[<0>] do_mount+0x59a/0xc40
[<0>] SyS_mount+0x80/0xd0
[<0>] do_syscall_64+0x76/0x140
[<0>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[<0>] 0xffffffffffffffff

So I think we should remove the O_NONBLOCK flag here, since the DLM kernel
module cannot handle a non-blocking socket in connect() properly.
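
The retry behaviour described above can be illustrated with a small
userspace toy model in C. This is only a sketch under stated assumptions,
not the DLM or SCTP code: sim_connect(), sim_connect_loop(), struct
sim_peer and SIM_NONBLOCK are hypothetical names invented for this
example. The model assumes that a non-blocking connect returns
-EINPROGRESS at once and that the caller retries without waiting, as the
message describes.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for O_NONBLOCK; not the kernel's value. */
#define SIM_NONBLOCK 1

/* Hypothetical peer state: only counts how often connect was attempted. */
struct sim_peer {
	int connect_calls;
};

/*
 * Toy model of a connect call (assumption, not the real SCTP code):
 * with SIM_NONBLOCK it returns -EINPROGRESS immediately, before the
 * handshake can finish; without the flag it "waits" and succeeds.
 */
static int sim_connect(struct sim_peer *peer, int flags)
{
	peer->connect_calls++;
	if (flags & SIM_NONBLOCK)
		return -EINPROGRESS;
	return 0;
}

/*
 * Toy model of a caller that retries at once whenever the connection
 * is not yet established. max_attempts bounds the loop so the example
 * terminates; an unbounded version of this loop is what would keep a
 * CPU busy.
 */
static int sim_connect_loop(struct sim_peer *peer, int flags, int max_attempts)
{
	assert(max_attempts > 0);
	for (int i = 0; i < max_attempts; i++) {
		if (sim_connect(peer, flags) == 0)
			return 0;
		/* No sleep or wait for completion here: retry immediately. */
	}
	return -ETIMEDOUT;
}
```

In this model, sim_connect_loop(&peer, SIM_NONBLOCK, n) uses up all n
attempts without ever connecting, while sim_connect_loop(&peer, 0, n)
succeeds on the first call. This mirrors the argument for passing no
O_NONBLOCK flag to sock->ops->connect().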

Signed-off-by: Gang He <ghe@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2018-05-29 10:48:35 -05:00
..
ast.c DLM: fix overflow dlm_cb_seq 2017-09-25 12:45:21 -05:00
ast.h
config.c dlm: make config_item_type const 2017-10-19 16:15:22 +02:00
config.h dlm: add log_info config option 2016-06-21 09:04:24 -05:00
debug_fs.c dlm: Improve a size determination in table_seq_start() 2017-08-07 11:23:09 -05:00
dir.c
dir.h
dlm_internal.h Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
Kconfig
lock.c dlm: remove dlm_send_rcom_lookup_dump 2017-10-09 09:29:31 -05:00
lock.h
lockspace.c dlm: constify kset_uevent_ops structure 2017-08-07 11:23:09 -05:00
lockspace.h
lowcomms.c dlm: remove O_NONBLOCK flag in sctp_connect_to_sock 2018-05-29 10:48:35 -05:00
lowcomms.h
lvb_table.h
main.c dlm: audit and remove any unnecessary uses of module.h 2016-10-19 11:00:03 -05:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
member.c dlm: Delete an unnecessary variable initialisation in dlm_ls_start() 2017-08-07 11:23:09 -05:00
member.h
memory.c
memory.h
midcomms.c
midcomms.h
netlink.c dlm for 4.10 2016-12-14 08:31:37 -08:00
plock.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
rcom.c dlm: remove dlm_send_rcom_lookup_dump 2017-10-09 09:29:31 -05:00
rcom.h dlm: remove dlm_send_rcom_lookup_dump 2017-10-09 09:29:31 -05:00
recover.c DLM: retry rcom when dlm_wait_function is timed out. 2017-09-25 12:45:21 -05:00
recover.h
recoverd.c dlm: recheck kthread_should_stop() before schedule() 2017-09-25 12:48:10 -05:00
recoverd.h
requestqueue.c
requestqueue.h
user.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
user.h
util.c
util.h