linux/drivers/char
Xianting Tian f8910ffa81 ipmi:msghandler: retry to get device id on an error
We fail to get the BMCS's device id with low probability when loading
the ipmi driver and it causes BMC device registration failed. When this
issue occurs we got below kernel prints:

  [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: IPMI message handler:
     device id demangle failed: -22
  [Wed Sep  9 19:52:03 2020] IPMI BT: using default values
  [Wed Sep  9 19:52:03 2020] IPMI BT: req2rsp=5 secs retries=2
  [Wed Sep  9 19:52:03 2020] ipmi_si IPI0001:00: Unable to get the
     device id: -5
  [Wed Sep  9 19:52:04 2020] ipmi_si IPI0001:00: Unable to register
     device: error -5

When this issue happens, we want to manually unload the driver and try to
load it again, but it can't be unloaded by 'rmmod' as it is already 'in
use'.

We add a print in handle_one_recv_msg(), when this issue happens,
the msg we received is "Recv: 1c 01 d5", which means the data_len is 1,
data[0] is 0xd5 (completion code), which means "bmc cannot execute
command.  Command, or request parameter(s), not supported in present
state".  Debug code:
	static int handle_one_recv_msg(struct ipmi_smi *intf,
                               struct ipmi_smi_msg *msg) {
        	printk("Recv: %*ph\n", msg->rsp_size, msg->rsp);
		... ...
	}
Then in ipmi_demangle_device_id(), it returned '-EINVAL' as 'data_len < 7'
and 'data[0] != 0'.

We created this patch to retry the get device id when this error
happens.  We reproduced this issue again and the retry succeed on the
first retry, we finally got the correct msg and then all is ok:
Recv: 1c 01 00 01 81 05 84 02 af db 07 00 01 00 b9 00 10 00

So use a retry machanism in this patch to give bmc more opportunity to
correctly response kernel when we received specific completion codes.

Signed-off-by: Xianting Tian <tian.xianting@h3c.com>
Message-Id: <20200915071817.4484-1-tian.xianting@h3c.com>
[Cleaned up the verbage a bit in the header and prints.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2020-09-15 09:57:45 -05:00
..
agp Merge drm/drm-next into drm-intel-next-queued 2020-06-25 18:05:03 +03:00
hw_random hwrng: core - remove redundant initialization of variable ret 2020-07-31 18:25:29 +10:00
ipmi ipmi:msghandler: retry to get device id on an error 2020-09-15 09:57:45 -05:00
mwave char/mwave: remove redundant initialization of variable bRC 2020-07-10 14:50:51 +02:00
pcmcia cm4000_cs.c cmm_ioctl(): get rid of pointless access_ok() 2020-05-29 11:04:56 -04:00
tpm ARM: SoC driver updates for v5.9 2020-08-03 19:30:59 -07:00
xilinx_hwicap treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
xillybus char: xillybus: use devm_platform_ioremap_resource() to simplify code 2019-11-05 18:29:21 +01:00
adi.c
apm-emulation.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
applicom.c misc: cleanup minor number definitions in c file into miscdevice.h 2020-03-18 12:27:03 +01:00
applicom.h
bsr.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
ds1620.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
dsp56k.c
dtlk.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
hangcheck-timer.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 405 2019-06-05 17:37:13 +02:00
hpet.c char: hpet: Fix out-of-bounds read bug 2020-01-30 06:58:33 +01:00
Kconfig char: Replace HTTP links with HTTPS ones 2020-07-23 09:44:15 +02:00
lp.c lp: fix sparc64 LPSETTIMEOUT ioctl 2019-11-13 19:08:22 +08:00
Makefile rtc/alpha: remove legacy rtc driver 2020-03-19 07:41:02 +01:00
mem.c /dev/mem: Add missing memory barriers for devmem_inode 2020-07-23 09:47:13 +02:00
misc.c char: misc: Move EXPORT_SYMBOL immediately next to the functions/varibles 2019-05-24 18:00:41 +02:00
mspec.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
nsc_gpio.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
nvram.c nvram: drop useless access_ok() 2020-05-29 11:03:30 -04:00
nwbutton.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
nwbutton.h misc: cleanup minor number definitions in c file into miscdevice.h 2020-03-18 12:27:03 +01:00
nwflash.c misc: move FLASH_MINOR into miscdevice.h and fix conflicts 2020-03-18 12:27:04 +01:00
pc8736x_gpio.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
powernv-op-panel.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
ppdev.c ppdev: Distribute switch variables for initialization 2020-02-23 20:28:12 +01:00
ps3flash.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 164 2019-05-30 11:26:38 -07:00
random.c random32: update the net random state on interrupt and activity 2020-07-29 10:35:37 -07:00
raw.c char: raw: do not leak CONFIG_MAX_RAW_DEVS to userspace 2020-07-10 14:50:51 +02:00
scx200_gpio.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
sonypi.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61 2019-05-24 17:36:45 +02:00
tb0219.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
tlclk.c drivers: char: tlclk.c: Avoid data race between init and interrupt handler 2020-04-23 16:55:24 +02:00
toshiba.c misc: cleanup minor number definitions in c file into miscdevice.h 2020-03-18 12:27:03 +01:00
ttyprintk.c ttyprintk: remove redundant initialization of variable ret 2020-07-10 14:50:51 +02:00
uv_mmtimer.c
virtio_console.c Merge v5.8-rc6 into char-misc-next 2020-07-20 09:43:40 +02:00