This patch addresses a long standing issue of the driver with the
mac80211 .flush() callback. Since implementing the .flush() callback
a number of issues have been fixed, but a WARN_ON_ONCE() was still
triggered because the timeout on the flush could still occur.
This patch changes the awkward design using msleep() into one using
a waitqueue. The waiting flush() context will kick the transmit dma
when it is idle and the timeout used waiting for the event is set
to 500 ms. Worst case there can be 64 frames outstanding for transmit
in the driver. At a rate of 1Mbps that would take 1.5 seconds assuming
MTU is 1500 bytes and ignoring retries. The WARN_ON_ONCE() is also
removed as this was put in to indicate the flush timeout as a reason
for the driver to stall. That was not happening since fixing endless
AMPDU retries with following upstream commit:
commit 85091fc0a7
Author: Arend van Spriel <arend@broadcom.com>
Date: Thu Feb 23 18:38:22 2012 +0100
brcm80211: smac: fix endless retry of A-MPDU transmissions
bugzilla: 42840 <https://bugzilla.kernel.org/show_bug.cgi?id=42840>
bugzilla@redhat: <https://bugzilla.redhat.com/show_bug.cgi?id=799168>
bugzilla@redhat: <https://bugzilla.redhat.com/show_bug.cgi?id=787649>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Camaleón <noelamac@gmail.com>
Cc: Milan Bouchet-Valat <nalimilan@club-internet.fr>
Cc: Seth Forshee <seth.forshee@canonical.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Piotr Haber <phaber@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch unregisters the gpio chip before ssb gets unloaded.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch unregisters the gpio chip before bcma gets unloaded.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reported-by: Piotr Haber <phaber@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Kernel commits 41affd5 and 6539306 changed the locking in rtl_lps_leave()
from a spinlock to a mutex by doing the calls indirectly from a work queue
to reduce the time that interrupts were disabled. This change was fine for
most systems; however a scheduling while atomic bug was reported in
https://bugzilla.redhat.com/show_bug.cgi?id=903881. The backtrace indicates
that routine rtl_is_special(), which calls rtl_lps_leave() in three places
was entered in atomic context. These direct calls are replaced by putting a
request on the appropriate work queue.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-and-tested-by: Nathaniel Doherty <ntdoherty@gmail.com>
Cc: Nathaniel Doherty <ntdoherty@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It is normal for minidrivers accumulating frames to return NULL
from their tx_fixup function. We do not want to count this as a
drop, or log any debug messages. A different exit path is
therefore chosen for such drivers, skipping the debug message
and the tx_dropped increment.
The test for accumulating drivers was however completely bogus,
making the exit path selection depend on whether the user had
enabled tx_err logging or not. This would arbitrarily mess up
accounting for both accumulating and non-accumulating minidrivers,
and would result in unwanted debug messages for the accumulating
drivers.
Fix by testing for FLAG_MULTI_PACKET instead, which probably was
the intention from the beginning. This usage match the documented
behaviour of this flag:
Indicates to usbnet, that USB driver accumulates multiple IP packets.
Affects statistic (counters) and short packet handling.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates LINUX_MIB_LISTENDROPS and LINUX_MIB_LISTENOVERFLOWS in
tcp_v6_conn_request() and tcp_v6_err(). tcp_v6_conn_request() in particular can
drop SYNs for various reasons which are not currently tracked.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates LINUX_MIB_LISTENDROPS in tcp_v4_conn_request() and
tcp_v4_err(). tcp_v4_conn_request() in particular can drop SYNs for various
reasons which are not currently tracked.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When releasing a packet socket, the routine packet_set_ring() is reused
to free rings instead of allocating them. But when calling it for the
first time, it fills req->tp_block_nr with the value of rb->pg_vec_len
which in the second invocation makes it bail out since req->tp_block_nr
is greater zero but req->tp_block_size is zero.
This patch solves the problem by passing a zeroed auto-variable to
packet_set_ring() upon each invocation from packet_release().
As far as I can tell, this issue exists even since 69e3c75 (net: TX_RING
and packet mmap), i.e. the original inclusion of TX ring support into
af_packet, but applies only to sockets with both RX and TX ring
allocated, which is probably why this was unnoticed all the time.
Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>
Cc: Johann Baudy <johann.baudy@gnu-log.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use correct inner offset to set inner_network_offset.
Found by inspection.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 9dc274151a (tcp: fix ABC in tcp_slow_start())
uncovered a bug in FRTO code :
tcp_process_frto() is setting snd_cwnd to 0 if the number
of in flight packets is 0.
As Neal pointed out, if no packet is in flight we lost our
chance to disambiguate whether a loss timeout was spurious.
We should assume it was a proper loss.
Reported-by: Pasi Kärkkäinen <pasik@iki.fi>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 9dc274151a (tcp: fix ABC in tcp_slow_start()),
a nul snd_cwnd triggers an infinite loop in tcp_slow_start()
Avoid this infinite loop and log a one time error for further
analysis. FRTO code is suspected to cause this bug.
Reported-by: Pasi Kärkkäinen <pasik@iki.fi>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde says:
====================
here's a patch for net for the v3.8 release cycle. Alexander Stein noticed that
the c_can hardware has a fixed bit in the IFx_MASK2 register. His patch fixes
writing of this register by always setting this bit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
1) rhine_tx() should use dev_kfree_skb() not dev_kfree_skb_irq()
2) rhine_slow_event_task's NAPI triggering logic is racey, it
should just hit the interrupt mask register. This is the
same as commit 7dbb491878
("r8169: avoid NAPI scheduling delay.") made to fix the same
problem in the r8169 driver. From Francois Romieu.
Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com>
Tested-by: Jamie Gloudon <jamie.gloudon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
This is a small batch of fixes intended for the 3.8 stream...
There are two pulls from Johannes. Regarding mac80211, Johannes says:
"One fix from Dan for a possible memory overrun."
Regarding iwlwifi, Johannes says:
"I have one fix from Emmanuel reverting a previous fix that caused
more trouble than it's worth."
Along with those:
Arend van Spriel fixes a fatal error in brcsmac related to tx status processing.
Bing Zhao corrects a problem where mwifiex would fail to complete a scan
in the event of an IE processing error.
Larry Finger fixes a thinko in rtlwifi in which the wrong skb variable
was being used in some cases.
Rafał Miłecki fixes a thinko in an ID check in the bcma flash code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
According to C_CAN documentation, the reserved bit in IFx_MASK2 register is
fixed 1.
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
On receiving the SYN-ACK, Fast Open checks icsk_retransmit for SYN
retransmission to detect SYN/data drops. But if F-RTO is disabled,
icsk_retransmit is reset at step D of tcp_fastretrans_alert() (
under tcp_ack()) before tcp_rcv_fastopen_synack(). The fix is to use
total_retrans instead which accounts for SYN retransmission regardless
the use of F-RTO.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
l2tp_ip6 is incorrectly using the IPv4-specific ip_cmsg_recv to handle
ancillary data. This means that socket options such as IPV6_RECVPKTINFO are
not honoured in userspace.
Convert l2tp_ip6 to use the IPv6-specific handler.
Ref: net/ipv6/udp.c
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Chris Elston <celston@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip6_datagram_recv_ctl and ip6_datagram_send_ctl are used for handling IPv6
ancillary data. Since ip6_datagram_send_ctl is already publicly exported for
use in modules, ip6_datagram_recv_ctl should also be available to support
ancillary data in the receive path.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific. Since
datagram_send_ctl is publicly exported it should be appropriately named to
reflect the fact that it's for IPv6 only.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If occurs a LE or SCO hci_conn timeout and the connection is already
established (BT_CONNECTED state), the connection is not terminated as
expected. This bug can be reproduced using l2test or scotest tool.
Once the connection is established, kill l2test/scotest and the
connection won't be terminated.
This patch fixes hci_conn_disconnect helper so it is able to
terminate LE and SCO connections, as well as ACL.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
The conn->smp_chan pointer can be NULL if SMP PDUs arrive at unexpected
moments. To avoid NULL pointer dereferences the code should be checking
for this and disconnect if an unexpected SMP PDU arrives. This patch
fixes the issue by adding a check for conn->smp_chan for all other PDUs
except pairing request and security request (which are are the first
PDUs to come to initialize the SMP context).
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
CC: stable@vger.kernel.org
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Add VID, PID and fixed interface for Telit LE920
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
They will be created at output, if ever needed. This avoids creating
empty neighbor entries when TPROXYing/Forwarding packets for addresses
that are not even directly reachable.
Note that IPv4 already handles it this way. No neighbor entries are
created for local input.
Tested by myself and customer.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A device sending 0 length frames as fast as it can has been
observed killing the host system due to the resulting memory
pressure.
Temporarily disable RX skb allocation and URB submission when
the current error ratio is high, preventing us from trying to
allocate an infinite number of skbs. Reenable as soon as we
are finished processing the done queue, allowing the device
to continue working after short error bursts.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
A scan request is split into multiple scan commands queued in
scan_pending_q. Each scan command will be sent to firmware and
its response is handlded one after another.
If any error is detected while parsing IE in command response
buffer the remaining data will be ignored and error is returned.
We should check if there is any more scan commands pending in
the queue before returning error. This ensures that we will call
cfg80211_scan_done if this is the last scan command, or send
next scan command in scan_pending_q to firmware.
Cc: "3.6+" <stable@vger.kernel.org>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
vmxnet3 fails to set netif_carrier_off on probe, meaning that when an interface
is opened the __LINK_STATE_NOCARRIER bit is already cleared, and so
/sys/class/net/<ifname>/operstate remains in the unknown state. Correct this by
setting netif_carrier_off on probe, like other drivers do.
Also, while we're at it, lets remove the netif_carrier_ok checks from the
link_state_update function, as that check is atomically contained within the
netif_carrier_[on|off] functions anyway
Tested successfully by myself
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rare instances, memory errors have been detected in the internal packet
buffer memory on I217/I218 when stressed under certain environmental
conditions. Enable Error Correcting Code (ECC) in hardware to catch both
correctable and uncorrectable errors. Correctable errors will be handled
by the hardware. Uncorrectable errors in the packet buffer will cause the
packet to be received with an error indication in the buffer descriptor
causing the packet to be discarded. If the uncorrectable error is in the
descriptor itself, the hardware will stop and interrupt the driver
indicating the error. The driver will then reset the hardware in order to
clear the error and restart.
Both types of errors will be accounted for in statistics counters.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Cc: <stable@vger.kernel.org> # 3.5.x & 3.6.x
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When bonding module is loaded with primary parameter and one decides to unset
primary slave using sysfs these settings are not preserved during bond device
restart. Primary slave is only unset once and it's not remembered in
bond->params structure. Below is example of recreation.
grep OPTS /etc/sysconfig/network-scripts/ifcfg-bond0
BONDING_OPTS="mode=active-backup miimon=100 primary=eth01"
grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: eth01 (primary_reselect always)
echo "" > /sys/class/net/bond0/bonding/primary
grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: None
sed -i -e 's/primary=eth01//' /etc/sysconfig/network-scripts/ifcfg-bond0
grep OPTS /etc/sysconfig/network-scripts/ifcfg-bond
BONDING_OPTS="mode=active-backup miimon=100 "
ifdown bond0 && ifup bond0
without patch:
grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: eth01 (primary_reselect always)
with patch:
grep "Primary Slave" /proc/net/bonding/bond0
Primary Slave: None
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We drop a connection request if the accept backlog is full and there are
sufficient packets in the syn queue to warrant starting drops. Increment the
appropriate counters so this isn't silent, for accurate stats and help in
debugging.
This patch assumes LINUX_MIB_LISTENDROPS is a superset of/includes the
counter LINUX_MIB_LISTENOVERFLOWS.
Signed-off-by: Nivedita Singhvi <niv@us.ibm.com>
Acked-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "Universal/Local" (U/L) bit must be complmented according to RFC4944
and RFC2464.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We forbid polling, writing and reading when the file were detached, this may
complex the user in several cases:
- when guest pass some buffers to vhost/qemu and then disable some queues,
host/qemu needs to do its own cleanup on those buffers which is complex
sometimes. We can do this simply by allowing a user can still write to an
disabled queue. Write to an disabled queue will cause the packet pass to the
kernel and read will get nothing.
- align the polling behavior with macvtap which never fails when the queue is
created. This can simplify the polling errors handling of its user (e.g vhost)
We can simply achieve this by don't assign NULL to tfile->tun when detached.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the polling errors were ignored, which can lead following issues:
- vhost remove itself unconditionally from waitqueue when stopping the poll,
this may crash the kernel since the previous attempt of starting may fail to
add itself to the waitqueue
- userspace may think the backend were successfully set even when the polling
failed.
Solve this by:
- check poll->wqh before trying to remove from waitqueue
- report polling errors in vhost_poll_start(), tx_poll_start(), the return value
will be checked and returned when userspace want to set the backend
After this fix, there still could be a polling failure after backend is set, it
will addressed by the next patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when vhost_init_used() fails the sock refcnt and ubufs were
leaked. Correct this by calling vhost_init_used() before assign ubufs and
restore the oldsock when it fails.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c8d68e6be1 removed carrier off call
from tun_detach since it's now called on queue disable and not only on
tun close. This confuses userspace which used this flag to detect a
free tun. To fix, put this back but under if (clean).
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return value of pktgen_add_device() is not checked, so
even if we fail to add some device, for example, non-exist one,
we still see "OK:...". This patch fixes it.
After this patch, I got:
# echo "add_device non-exist" > /proc/net/pktgen/kpktgend_0
-bash: echo: write error: No such device
# cat /proc/net/pktgen/kpktgend_0
Running:
Stopped:
Result: ERROR: can not add device non-exist
# echo "add_device eth0" > /proc/net/pktgen/kpktgend_0
# cat /proc/net/pktgen/kpktgend_0
Running:
Stopped: eth0
Result: OK: add_device=eth0
(Candidate for -stable)
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The delay calculation with the rate extension introduces in v3.3 does
not properly work, if other packets are still queued for transmission.
For the delay calculation to work, both delay types (latency and delay
introduces by rate limitation) have to be handled differently. The
latency delay for a packet can overlap with the delay of other packets.
The delay introduced by the rate however is separate, and can only
start, once all other rate-introduced delays finished.
Latency delay is from same distribution for each packet, rate delay
depends on the packet size.
.: latency delay
-: rate delay
x: additional delay we have to wait since another packet is currently
transmitted
.....---- Packet 1
.....xx------ Packet 2
.....------ Packet 3
^^^^^
latency stacks
^^
rate delay doesn't stack
^^
latency stacks
-----> time
When a packet is enqueued, we first consider the latency delay. If other
packets are already queued, we can reduce the latency delay until the
last packet in the queue is send, however the latency delay cannot be
<0, since this would mean that the rate is overcommitted. The new
reference point is the time at which the last packet will be send. To
find the time, when the packet should be send, the rate introduces delay
has to be added on top of that.
Signed-off-by: Johannes Naab <jn@stusta.de>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a tunnel socket is created by userspace, l2tp hooks the socket destructor
in order to clean up resources if userspace closes the socket or crashes. It
also caches a pointer to the struct sock for use in the data path and in the
netlink interface.
While it is safe to use the cached sock pointer in the data path, where the
skb references keep the socket alive, it is not safe to use it elsewhere as
such access introduces a race with userspace closing the socket. In
particular, l2tp_tunnel_delete is prone to oopsing if a multithreaded
userspace application closes a socket at the same time as sending a netlink
delete command for the tunnel.
This patch fixes this oops by forcing l2tp_tunnel_delete to explicitly look up
a tunnel socket held by userspace using sockfd_lookup().
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull powerpc fixes from Benjamin Herrenschmidt:
"Whenever you have a chance between two dives, you might want to
consider pulling my merge branch to pickup a few fixes for 3.8 that
have been accumulating for the last couple of weeks (I was myself
travelling then on vacation).
Nothing major, just a handful of powerpc bug fixes that I consider
worth getting in before 3.8 goes final."
And I'll have everybody know that I'm not diving for several days yet.
Snif.
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Max next_tb to prevent from replaying timer interrupt
powerpc: kernel/kgdb.c: Fix memory leakage
powerpc/book3e: Disable interrupt after preempt_schedule_irq
powerpc/oprofile: Fix error in oprofile power7_marked_instr_event() function
powerpc/pasemi: Fix crash on reboot
powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning for ppc32
With lazy interrupt, we always call __check_irq_replaysome with
decrementers_next_tb to check if we need to replay timer interrupt.
So in hotplug case we also need to set decrementers_next_tb as MAX
to make sure __check_irq_replay don't replay timer interrupt
when return as we expect, otherwise we'll trap here infinitely.
Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
the variable backup_current_thread_info isn't freed before existing the
function.
Signed-off-by: Cong Ding <dinggnu@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In preempt case current arch_local_irq_restore() from
preempt_schedule_irq() may enable hard interrupt but we really
should disable interrupts when we return from the interrupt,
and so that we don't get interrupted after loading SRR0/1.
Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The calculation for the left shift of the mask OPROFILE_PM_PMCSEL_MSK has an
error. The calculation is should be to shift left by (max_cntrs - cntr) times
the width of the pmsel field width. However, the #define OPROFILE_MAX_PMC_NUM
was used instead of OPROFILE_PMSEL_FIELD_WIDTH. This patch fixes the
calculation.
Signed-off-by: Carl Love <cel@us.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
commit f96972f2dc "kernel/sys.c: call disable_nonboot_cpus() in
kernel_restart()"
added a call to disable_nonboot_cpus() on kernel_restart(), which tries
to shutdown all the CPUs except the first one. The issue with the PA
Semi, is that it does not support CPU hotplug.
When the call is made to __cpu_down(), it calls the notifiers
CPU_DOWN_PREPARE, and then tries to take the CPU down.
One of the notifiers to the CPU hotplug code, is the cpufreq. The
DOWN_PREPARE will call __cpufreq_remove_dev() which calls
cpufreq_driver->exit. The PA Semi exit handler unmaps regions of I/O
that is used by an interrupt that goes off constantly
(system_reset_common, but it goes off during normal system operations
too). I'm not sure exactly what this interrupt does.
Running a simple function trace, you can see it goes off quite a bit:
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [001] 1558.859363: .pasemi_system_reset_exception <-.system_reset_exception
<idle>-0 [000] 1558.860112: .pasemi_system_reset_exception <-.system_reset_exception
<idle>-0 [000] 1558.861109: .pasemi_system_reset_exception <-.system_reset_exception
<idle>-0 [001] 1558.861361: .pasemi_system_reset_exception <-.system_reset_exception
<idle>-0 [000] 1558.861437: .pasemi_system_reset_exception <-.system_reset_exception
When the region is unmapped, the system crashes with:
Disabling non-boot CPUs ...
Error taking CPU1 down: -38
Unable to handle kernel paging request for data at address 0xd0000800903a0100
Faulting instruction address: 0xc000000000055fcc
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT SMP NR_CPUS=64 NUMA PA Semi PWRficient
Modules linked in: shpchp
NIP: c000000000055fcc LR: c000000000055fb4 CTR: c0000000000df1fc
REGS: c0000000012175d0 TRAP: 0300 Not tainted (3.8.0-rc4-test-dirty)
MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 24000088 XER: 00000000
SOFTE: 0
DAR: d0000800903a0100, DSISR: 42000000
TASK = c0000000010e9008[0] 'swapper/0' THREAD: c000000001214000 CPU: 0
GPR00: d0000800903a0000 c000000001217850 c0000000012167e0 0000000000000000
GPR04: 0000000000000000 0000000000000724 0000000000000724 0000000000000000
GPR08: 0000000000000000 0000000000000000 0000000000000001 0000000000a70000
GPR12: 0000000024000080 c00000000fff0000 ffffffffffffffff 000000003ffffae0
GPR16: ffffffffffffffff 0000000000a21198 0000000000000060 0000000000000000
GPR20: 00000000008fdd35 0000000000a21258 000000003ffffaf0 0000000000000417
GPR24: 0000000000a226d0 c000000000000000 0000000000000000 0000000000000000
GPR28: c00000000138b358 0000000000000000 c000000001144818 d0000800903a0100
NIP [c000000000055fcc] .set_astate+0x5c/0xa4
LR [c000000000055fb4] .set_astate+0x44/0xa4
Call Trace:
[c000000001217850] [c000000000055fb4] .set_astate+0x44/0xa4 (unreliable)
[c0000000012178f0] [c00000000005647c] .restore_astate+0x2c/0x34
[c000000001217980] [c000000000054668] .pasemi_system_reset_exception+0x6c/0x88
[c000000001217a00] [c000000000019ef0] .system_reset_exception+0x48/0x84
[c000000001217a80] [c000000000001e40] system_reset_common+0x140/0x180
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
dmraid assess redundancy and replacements slightly inaccurately which
could lead to some degraded arrays failing to assemble.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIVAwUAUQb0OTnsnt1WYoG5AQJ4qg/8DIs0eWaJWF5lqF8qJBeMQrUDkqe7glqe
ezwVlO48uxVzJUxCMd1aZdxjTI/AIvqZ1U5DBC62SlB/RqWlrwIQIIok5odWfnG4
eAII5hUnktWqL4Ksqz4mgdI+WWwSc3JR7verqS4wOvcsN4qU5t8vL6dj2b0cNYCQ
I7B+fkMZE1KE4Y2DOf8dEO9gbNIO1ZllZapLRolrsGOr8Ggo1prEoMxBYF5HZNoE
J1As2N6NA7/kadtsfkCSs+f//5t1uMZluMjEUe4lDmgeqzqz/93kFmJ2OfSdqO4J
wuuTHbCL+NSjjAZuByluSO98O0h87xXGVMv/c7gadVQtOn6I1DA2i4wTaiHOzr4K
cdALvbteVCAPYLMA+s8ee6YYbB5pnlblT8FShG+3O6ae1KmbqKex1LlZLpwEoS8y
VxI1WCSQbBr/ejAnhxLFQPo5OAcoeHomlZHKPtCBSbwQ0f0pOHHPYlZyX2PtX6hF
U9bmtMq0XZulDORdLmIsEEpwzRKQ+b89+RrYXM7AhkJTxRP59RVwqFHy9SybcFBS
S5XFKqpCE+ioBvLp9HK189xMe0Nel2g7KWd34v5LcvQ21rzATezAh5TsWIzN3oV8
9/phd6nZa0hhcELykOTmK5b6+ks2tBfEN2FuyKfSq4Z2nz46rfD4wYVTY2+Qjh+D
hmUDBgguejo=
=bJyL
-----END PGP SIGNATURE-----
Merge tag 'md-3.8-fixes' of git://neil.brown.name/md
Pull dmraid fix from NeilBrown:
"Just one fix for md in 3.8
dmraid assess redundancy and replacements slightly inaccurately which
could lead to some degraded arrays failing to assemble."
* tag 'md-3.8-fixes' of git://neil.brown.name/md:
DM-RAID: Fix RAID10's check for sufficient redundancy
This patch fixes MAX_STACK_TRACE_ENTRIES too low warning for ppc32,
which is similar to commit 12660b17.
Reported-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>