When a VLAN interface (on top of batX) is removed and
re-added within a short timeframe TT does not have enough
time to properly cleanup. This creates an internal TT state
mismatch as the newly created softif_vlan will be
initialized from scratch with a TT client count of zero
(even if TT entries for this VLAN still exist). The
resulting TT messages are bogus due to the counter / tt
client listing mismatch, thus creating inconsistencies on
every node in the network
To fix this issue destroy_vlan() has to not free the VLAN
object immediately but it has to be kept alive until all the
TT entries for this VLAN have been removed. destroy_vlan()
still removes the sysfs folder so that the user has the
feeling that everything went fine.
If the same VLAN is re-added before the old object is free'd,
then the latter is resurrected and re-used.
Implement such behaviour by increasing the reference counter
of a softif_vlan object every time a new local TT entry for
such VLAN is created and remove the object from the list
only when all the TT entries have been destroyed.
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Since bridge loop avoidance only supports untagged or simple 802.1q
tagged VLAN claim frames, claim frames with stacked VLAN headers (QinQ)
should be detected and dropped. Transporting the over the mesh may cause
problems on the receivers, or create bogus entries in the local tt
tables.
Reported-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
*_result[len] is parsed as *(_result[len]) which is not at all what we
want to touch here.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 84a7c0b1db ("dns_resolver: assure that dns_query() result is null-terminated")
Signed-off-by: David S. Miller <davem@davemloft.net>
Zoltan Kiss says:
====================
xen-netback: Fixing up xenvif_tx_check_gop
This series fixes a lot of bugs on the error path around this function, which
were introduced with my grant mapping series in 3.15. They apply to the latest
net tree, but probably to net-next as well without any modification.
I'll post an another series which applies to 3.15 stable, as the problem was
first discovered there. The only difference is that the "queue" variable name is
replaced to "vif".
====================
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to this pointer is increased prematurely, the error log contains rubbish.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes this function aware that the first frag and the header might
share the same ring slot. That could happen if the first slot is bigger than
PKT_PROT_LEN. Due to this the error path might release that slot twice or never,
depending on the error scenario.
xenvif_idx_release is also removed from xenvif_idx_unmap, and called separately.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
When the grant operations failed, the skb is freed up eventually, and it tries
to release the frags, if there is any. For the main skb nr_frags is set to 0 to
avoid this, but on the frag_list it iterates through the frags array, and tries
to call put_page on the page pointer which contains garbage at that time.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
The error handling for skb's with frag_list was completely wrong, it caused
double unmap attempts to happen if the error was on the first skb. Move it to
the right place in the loop.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
When kernel generates a handle for a u32 filter, it tries to start
from the max in the bucket. So when we have a filter with the max (fff)
handle, it will cause kernel always generates the same handle for new
filters. This can be shown by the following command:
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip pref 770 handle 800::fff u32 match ip protocol 1 0xff
tc filter add dev eth0 parent ffff: protocol ip pref 770 u32 match ip protocol 1 0xff
...
we will get some u32 filters with same handle:
# tc filter show dev eth0 parent ffff:
filter protocol ip pref 770 u32
filter protocol ip pref 770 u32 fh 800: ht divisor 1
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match 00010000/00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match 00010000/00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match 00010000/00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match 00010000/00ff0000 at 8
handles should be unique. This patch fixes it by looking up a bitmap,
so that can guarantee the handle is as unique as possible. For compatibility,
we still start from 0x800.
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huawei's usage of the subclass and protocol fields is not 100%
clear to us, but there appears to be a very strict system.
A device with the "shared" device ID 12d1:1506 and this NCM
function was recently reported (showing only default altsetting):
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 3
bInterfaceProtocol 22
iInterface 8 CDC Network Control Model (NCM)
** UNRECOGNIZED: 05 24 00 10 01
** UNRECOGNIZED: 06 24 1a 00 01 1f
** UNRECOGNIZED: 0c 24 1b 00 01 00 04 10 14 dc 05 20
** UNRECOGNIZED: 0d 24 0f 0a 0f 00 00 00 ea 05 03 00 01
** UNRECOGNIZED: 05 24 06 01 01
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x85 EP 5 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0010 1x 16 bytes
bInterval 9
Cc: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two device IDs found in an out-of-tree driver downloadable
from Netgear.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
If "newmtu * 2 + 4" is too large then it can cause an integer overflow
leading to memory corruption. Eric Dumazet suggests that 65534 is a
reasonable upper limit.
Btw, "newmtu" is not allowed to be a negative number because of the
check in dev_set_mtu(), so that's ok.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 568f194e8b ("net: ppp: use
sk_unattached_filter api") inadvertently changed the logic when setting
PPP pass and active filters. This applies to both the generic PPP subsystem
implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP subsystem
implemented by drivers/isdn/i4l/isdn_ppp.c. The original code in ppp_ioctl()
(or isdn_ppp_ioctl(), resp.) handling PPPIOCSPASS and PPPIOCSACTIVE allowed to
remove a pass/active filter previously set by using a filter of length zero.
However, with the new code this is not possible anymore as this case is not
explicitly checked for, which leads to passing NULL as a filter to
sk_unattached_filter_create(). This results in returning EINVAL to the caller.
Additionally, the variables ppp->pass_filter and ppp->active_filter (or
is->pass_filter and is->active_filter, resp.) are not reset to NULL, although
the filters they point to may have been destroyed by
sk_unattached_filter_destroy(), so in this EINVAL case dangling pointers are
left behind (provided the pointers were previously non-NULL).
This patch corrects both problems by checking whether the filter passed is
empty or non-empty, and prevents sk_unattached_filter_create() from being
called in the first case. Moreover, the pointers are always reset to NULL
as soon as sk_unattached_filter_destroy() returns.
Signed-off-by: Christoph Schulz <develop@kristov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a regression introduced by commit 35f6f45 ("net/mlx4_en: Don't use
irq_affinity_notifier to track changes in IRQ affinity map").
When core is started in legacy EQ's (number of IRQ's < rx rings), cq->irq_desc
was NULL. This caused a kernel crash under heavy traffic - when having more
than rx NAPI budget completions.
Fixed to have it set for both EQ modes.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nothing cleans up the objects created by
vnet_new(), they are completely leaked.
vnet_exit(), after doing the vio_unregister_driver() to clean
up ports, should call a helper function that iterates over vnet_list
and cleans up those objects. This includes unregister_netdevice()
as well as free_netdev().
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethernet port on my ASUS A88X Pro mainboard stopped working
several times a day, with messages like these in dmesg:
AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x001e address=0x0000000000003000 flags=0x0050]
Searching the web for these messages led me to similar reports about
different hardware supported by r8169, and eventually to commits
3ced8c955e ('r8169: enforce RX_MULTI_EN
for the 8168f.') and eb2dc35d99 ('r8169:
RxConfig hack for the 8168evl'). So I tried this change, and it fixes
the problem for me.
Signed-off-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/nf_tables fixes
The following patchset contains nf_tables fixes, they are:
1) Fix wrong transaction handling when the table flags are not
modified.
2) Fix missing rcu read_lock section in the netlink dump path, which
is not protected by the nfnl_lock.
3) Set NLM_F_DUMP_INTR in the netlink dump path to indicate
interferences with updates.
4) Fix 64 bits chain counters when they are retrieved from a 32 bits
arch, from Eric Dumazet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed a bug that was introduced by my GRE-GRO patch
(bf5a755f5e net-gre-gro: Add GRE
support to the GRO stack) that breaks the forwarding path
because various GSO related fields were not set. The bug will
cause on the egress path either the GSO code to fail, or a
GRE-TSO capable (NETIF_F_GSO_GRE) NICs to choke. The following
fix has been tested for both cases.
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix ELM suspend/resume
- Reduce warnings if NAND ECC is too weak
- Add CFI support for Sharp LH28F640BF NOR
The last fix is coming in because other commits in the 3.16 cycle depended on
this support.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTxrg9AAoJEFySrpd9RFgtJvMP/1xlHR4Pb8uJnTPFwDZbR6CQ
0I9q7K0wgvdtlTXLKqbmOEVZ7yn6EVpx/TR9Pdk+aCvJFrnKjkiTenZ7yeT80iad
lOp4zWf2+x74zMn9Q2m16YAoL/AwoGoiP5iaMnD+vwkwf2kzr7QbgxO/YqwtCv0O
gY/eAPiCWzlKlswEavSVBj//eqSdHWn4imKDisZ0roDYw8J+T3n42gwnVoqp5MqU
mrk0N/hpuxaYA4TH3XFhDb/QgfKSsNt/c3FzeQKO+aurwfM7TuuKVOGYdT/URuhg
oSLCRBoVF9EIUVIDeEWxmktJJx8HRCpmoLZE7jj8g/y9I6P4JZk73KTMIEdE9170
x9j7jYngp68yKw9wel6zcTPpEzzHzyenJKlUqUjwhsxemfdXjDOp+X+QlI8MGvuL
kypLN8xnIMYhSpFhOq0W+Emi67CXi9uN83HMZV4nP9krZDHb796pi1iRX66Vdl61
1U9dJ2YZy4mSoDQXS8bo98yR9Qw/UDA+Fus+OOfnGtTxcMYF2rdmL5sCMNBvAK+1
fpkSJDnNrmaa6zvSrKm6JOk5UTEFvmXZocBcrwIzMG3jI+JsZ9jwY7kpUA25Dz5b
wowABmfnjmuazsI8Ga9zGUjl70vHV/VuVYYSZybMxGdHxYKmdXJ1kSDqtuSlSQ3A
+KxZd8I/mdnOAy/OWIsd
=OGTK
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20140716' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
- Fix ELM suspend/resume
- Reduce warnings if NAND ECC is too weak
- Add CFI support for Sharp LH28F640BF NOR
The last fix is coming in because other commits in the 3.16 cycle
depended on this support.
* tag 'for-linus-20140716' of git://git.infradead.org/linux-mtd:
mtd: cfi_cmdset_0001.c: add support for Sharp LH28F640BF NOR
mtd: nand: reduce the warning noise when the ECC is too weak
mtd: devices: elm: fix elm_context_save() and elm_context_restore() functions
Pull perf fixes from Ingo Molnar:
"Tooling fixes and an Intel PMU driver fixlet"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Do not allow optimized switch for non-cloned events
perf/x86/intel: ignore CondChgd bit to avoid false NMI handling
perf symbols: Get kernel start address by symbol name
perf tools: Fix segfault in cumulative.callchain report
Things seem to calm down so far, just a small few HD-audio fixes
(regression fixes and a new codec ID addition) popping up.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJTxlkHAAoJEGwxgFQ9KSmk0EEP/iKDIFbGzlQeH/X6G0pwnk9+
Cl+j8KBUQfevV49SCGZm+h8NHF+RpnV3OrVHQFH+a1AwZqYRA0pEs4sDjRarcSud
fwMNqe/AUmTtyazeF2fSvNokbzHLlWaWG/+vLzxsvvsGlu2tNppNPK6WYrQyRp/G
8ZCruUQz+Cl/GY4VN2Cx4zV7xNkcyUQnbnL3Uo8EM2hZKK4I2YdtQPc8rrD4+DzG
YPDtbton4Sb0xPOnAcmmmuXQDM3rSh0xMK8ZMwfDeruLapXhY1z25PbugUvet2Xx
gkROW9JfseEjvJKT55sYbrj+jEFlIPAeom/BSVJBerJmYs+jbDkpCzlbtzU5dT0R
yUp6v5Z9VYmIkExJZOFmJVq9D2UbCd1b4kKZdwExvCtVPJ8Ce3HhCvrB5MJWl7vo
nA3Z7pPXeqhBmxBHT1oREREn5OydRR1MfERNEtxuVWrg7jl9ugUGHBc3/F24K2Ia
Xkbnh7orTmUZy9LeMA9BTUFKGmAKr/phju8RRahv846TFnd98NSK8g1F+Gi1UEfB
4ZqG9fJJBhwzm2ZIh8ocMOF7fVJsTOPurPkfL3PbosZ8J/7qsh3unVihrHuLDbC+
0sM3JS7kG6IgMCgyB8+697AtAP+jw1sAz/gDG+bjXx2BA2HtycLlb5mXwB8Jvg0y
K9XlJ6urPLsQGKsxCTO4
=Ua+N
-----END PGP SIGNATURE-----
Merge tag 'sound-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Things seem to calm down so far, just a small few HD-audio fixes
(regression fixes and a new codec ID addition) popping up"
* tag 'sound-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix broken PM due to incomplete i915 initialization
ALSA: hda - Revert stream assignment order for Intel controllers
ALSA: hda - Add new GPU codec ID 0x10de0070 to snd-hda
ALSA: hda: Fix build warning
Pull quota fix from Jan Kara:
"Fix locking of dquot shrinker"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: missing lock in dqcache_shrink_scan()
Commit 1ab6c4997e (fs: convert fs shrinkers to new scan/count API)
accidentally removed locking from quota shrinker. Fix it -
dqcache_shrink_scan() should use dq_list_lock to protect the
scan on free_dquots list.
CC: stable@vger.kernel.org
Fixes: 1ab6c4997e
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Pull fuse fixes from Miklos Szeredi:
"This contains miscellaneous fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: replace count*size kzalloc by kcalloc
fuse: release temporary page if fuse_writepage_locked() failed
fuse: restructure ->rename2()
fuse: avoid scheduling while atomic
fuse: handle large user and group ID
fuse: inode: drop cast
fuse: ignore entry-timeout on LOOKUP_REVAL
fuse: timeout comparison fix
Pull networking fixes from David Miller:
1) Bluetooth pairing fixes from Johan Hedberg.
2) ieee80211_send_auth() doesn't allocate enough tail room for the SKB,
from Max Stepanov.
3) New iwlwifi chip IDs, from Oren Givon.
4) bnx2x driver reads wrong PCI config space MSI register, from Yijing
Wang.
5) IPV6 MLD Query validation isn't strong enough, from Hangbin Liu.
6) Fix double SKB free in openvswitch, from Andy Zhou.
7) Fix sk_dst_set() being racey with UDP sockets, leading to strange
crashes, from Eric Dumazet.
8) Interpret the NAPI budget correctly in the new systemport driver,
from Florian Fainelli.
9) VLAN code frees percpu stats in the wrong place, leading to crashes
in the get stats handler. From Eric Dumazet.
10) TCP sockets doing a repair can crash with a divide by zero, because
we invoke tcp_push() with an MSS value of zero. Just skip that part
of the sendmsg paths in repair mode. From Christoph Paasch.
11) IRQ affinity bug fixes in mlx4 driver from Amir Vadai.
12) Don't ignore path MTU icmp messages with a zero mtu, machines out
there still spit them out, and all of our per-protocol handlers for
PMTU can cope with it just fine. From Edward Allcutt.
13) Some NETDEV_CHANGE notifier invocations were not passing in the
correct kind of cookie as the argument, from Loic Prylli.
14) Fix crashes in long multicast/broadcast reassembly, from Jon Paul
Maloy.
15) ip_tunnel_lookup() doesn't interpret wildcard keys correctly, fix
from Dmitry Popov.
16) Fix skb->sk assigned without taking a reference to 'sk' in
appletalk, from Andrey Utkin.
17) Fix some info leaks in ULP event signalling to userspace in SCTP,
from Daniel Borkmann.
18) Fix deadlocks in HSO driver, from Olivier Sobrie.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits)
hso: fix deadlock when receiving bursts of data
hso: remove unused workqueue
net: ppp: don't call sk_chk_filter twice
mlx4: mark napi id for gro_skb
bonding: fix ad_select module param check
net: pppoe: use correct channel MTU when using Multilink PPP
neigh: sysctl - simplify address calculation of gc_* variables
net: sctp: fix information leaks in ulpevent layer
MAINTAINERS: update r8169 maintainer
net: bcmgenet: fix RGMII_MODE_EN bit
tipc: clear 'next'-pointer of message fragments before reassembly
r8152: fix r8152_csum_workaround function
be2net: set EQ DB clear-intr bit in be_open()
GRE: enable offloads for GRE
farsync: fix invalid memory accesses in fst_add_one() and fst_init_card()
igb: do a reset on SR-IOV re-init if device is down
igb: Workaround for i210 Errata 25: Slow System Clock
usbnet: smsc95xx: add reset_resume function with reset operation
dp83640: Always decode received status frames
r8169: disable L23
...
When the initialization of Intel HDMI controller fails due to missing
i915 kernel symbols (e.g. HD-audio is built in while i915 is module),
the driver discontinues the probe. However, since the probe was done
asynchronously, the driver object still remains, thus the relevant PM
ops are still called at suspend/resume. This results in the bad access
to the incomplete audio card object, eventually leads to Oops or stall
at PM.
This patch adds the missing checks of chip->init_failed flag at each
PM callback in order to fix the problem above.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79561
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When the module sends bursts of data, sometimes a deadlock happens in
the hso driver when the tty buffer doesn't get the chance to be flushed
quickly enough.
Remove the endless while loop in function put_rxbuf_data() which is
called by the urb completion handler.
If there isn't enough room in the tty buffer, discards all the data
received in the URB.
Cc: David Miller <davem@davemloft.net>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Dan Williams <dcbw@redhat.com>
Cc: Jan Dumon <j.dumon@option.com>
Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The workqueue "retry_unthrottle_workqueue" is not scheduled anywhere
in the code. So, remove it.
Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
to be built on platforms which don't provide the DMA mapping API.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJTxCGPAAoJEHnzb7JUXXnQ0W8QAKTGboWwSU5kznTcWGveIgEl
fckaDnyinK9Vd6lOpixxsSc0rxdaKPmMmoC1ta2AGcD4NgRyfM2aD3Lw6dArzlHr
xgcDRPE4K/41atxf6Z/wZnsXFSLrNEJbr6e8oZncGDzcRp4DKzv7tX2meNzyQRFE
ns0/sjeN7QJEamd8vUkELiVdBED7NLiQjxXdULQ9AjY/cC3byzWAHzyoII0ff5NY
+URYGtTIxwJ2KoNAtqD9OAazwPUV8rt1XXWnqkUHS89TnACsKiWmHpaBq+6jTkqG
YsRxY9h0eZ+sSHoeEBUS2WLiaKWufvsBWBTCOUVoO0eS2iYHJQ1SwSLYU1+Zh0IR
Fey2N2YaMA1HNthJ0OJJdC+sqSZ/tBmE+ytN7rTVHu0Yw0ZffzSPgJ303jIO0y87
rYQ57K6dXhn5kDqh/PheDJCZ1OwMqz+zvgJDyv9XaEcm9lynDmmhc2IqmzlqONmz
0UbiHITiqMrv6bSzZ83+PVeoROw2DBgH/fEu/Xk2vC1hBA7s4NynGSllEht1tXT3
zLbVW/bBbZfjqc0a1UADBRHVKSllac+BPlxvGHSUxJusqC2VN9S9tgEhboj7+baY
7VHbGkuCaEqtZeErzfNTN9uGvUsBN0KGDatW4Kp0g2CZ33dFDpJFOt7O5QaephnF
DhlO4daXNsvHvFVgpBGZ
=vY8J
-----END PGP SIGNATURE-----
Merge tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Stefan Richter:
"The 1394 drivers cannot and are not supposed to be built on platforms
which don't provide the DMA mapping API (regression since v3.16-rc1
with CONFIG_COMPILE_TEST=y on some architectures)"
* tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: IEEE 1394 (FireWire) support should depend on HAS_DMA
Pull another aio fix from Ben LaHaise:
"put_reqs_available() can now be called from within irq context, which
means that it (and its sibling function get_reqs_available()) now need
to be irq-safe, not just preempt-safe"
* git://git.kvack.org/~bcrl/aio-fixes:
aio: protect reqs_available updates from changes in interrupt handlers
The l2tp [get|set]sockopt() code has fallen back to the UDP functions
for socket option levels != SOL_PPPOL2TP since day one, but that has
never actually worked, since the l2tp socket isn't an inet socket.
As David Miller points out:
"If we wanted this to work, it'd have to look up the tunnel and then
use tunnel->sk, but I wonder how useful that would be"
Since this can never have worked so nobody could possibly have depended
on that functionality, just remove the broken code and return -EINVAL.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: James Chapman <jchapman@katalix.com>
Acked-by: David Miller <davem@davemloft.net>
Cc: Phil Turnbull <phil.turnbull@oracle.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 568f194e8b ("net: ppp: use
sk_unattached_filter api") causes sk_chk_filter() to be called twice when
setting a PPP pass or active filter. This applies to both the generic PPP
subsystem implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP
subsystem implemented by drivers/isdn/i4l/isdn_ppp.c. The first call is from
within get_filter(). The second one is through the call chain
ppp_ioctl() or isdn_ppp_ioctl()
--> sk_unattached_filter_create()
--> __sk_prepare_filter()
--> sk_chk_filter()
The first call from within get_filter() should be deleted as get_filter() is
called just before calling sk_unattached_filter_create() later on, which
eventually calls sk_chk_filter() anyway.
For 3.15.x, this proposed change is a bugfix rather than a pure optimization as
in that branch, sk_chk_filter() may replace filter codes by other codes which
are not recognized when executing sk_chk_filter() a second time. So with
3.15.x, if sk_chk_filter() is called twice, the second invocation may yield
EINVAL (this depends on the filter codes found in the filter to be set, but
because the replacement is done for frequently used codes, this is almost
always the case). The net effect is that setting pass and/or active PPP filters
does not work anymore, since sk_unattached_filter_create() always returns
EINVAL due to the second call to sk_chk_filter(), regardless whether the filter
was originally sane or not.
Signed-off-by: Christoph Schulz <develop@kristov.de>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Napi id was not marked for gro_skb, this will lead rx busy loop won't
work correctly since they stack never try to call low latency receive
method because of a zero socket napi id. Fix this by marking napi id
for gro_skb.
The transaction rate of 1 byte netperf tcp_rr gets about 50% increased
(from 20531.68 to 30610.88).
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Obvious copy/paste error when I converted the ad_select to the new
option API. "lacp_rate" there should be "ad_select" so we can get the
proper value.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: David S. Miller <davem@davemloft.net>
Fixes: 9e5f5eebe7 ("bonding: convert ad_select to use the new option
API")
Reported-by: Karim Scheik <karim.scheik@prisma-solutions.at>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPP channel MTU is used with Multilink PPP when ppp_mp_explode() (see
ppp_generic module) tries to determine how big a fragment might be. According
to RFC 1661, the MTU excludes the 2-byte PPP protocol field, see the
corresponding comment and code in ppp_mp_explode():
/*
* hdrlen includes the 2-byte PPP protocol field, but the
* MTU counts only the payload excluding the protocol field.
* (RFC1661 Section 2)
*/
mtu = pch->chan->mtu - (hdrlen - 2);
However, the pppoe module *does* include the PPP protocol field in the channel
MTU, which is wrong as it causes the PPP payload to be 1-2 bytes too big under
certain circumstances (one byte if PPP protocol compression is used, two
otherwise), causing the generated Ethernet packets to be dropped. So the pppoe
module has to subtract two bytes from the channel MTU. This error only
manifests itself when using Multilink PPP, as otherwise the channel MTU is not
used anywhere.
In the following, I will describe how to reproduce this bug. We configure two
pppd instances for multilink PPP over two PPPoE links, say eth2 and eth3, with
a MTU of 1492 bytes for each link and a MRRU of 2976 bytes. (This MRRU is
computed by adding the two link MTUs and subtracting the MP header twice, which
is 4 bytes long.) The necessary pppd statements on both sides are "multilink
mtu 1492 mru 1492 mrru 2976". On the client side, we additionally need "plugin
rp-pppoe.so eth2" and "plugin rp-pppoe.so eth3", respectively; on the server
side, we additionally need to start two pppoe-server instances to be able to
establish two PPPoE sessions, one over eth2 and one over eth3. We set the MTU
of the PPP network interface to the MRRU (2976) on both sides of the connection
in order to make use of the higher bandwidth. (If we didn't do that, IP
fragmentation would kick in, which we want to avoid.)
Now we send a ICMPv4 echo request with a payload of 2948 bytes from client to
server over the PPP link. This results in the following network packet:
2948 (echo payload)
+ 8 (ICMPv4 header)
+ 20 (IPv4 header)
---------------------
2976 (PPP payload)
These 2976 bytes do not exceed the MTU of the PPP network interface, so the
IP packet is not fragmented. Now the multilink PPP code in ppp_mp_explode()
prepends one protocol byte (0x21 for IPv4), making the packet one byte bigger
than the negotiated MRRU. So this packet would have to be divided in three
fragments. But this does not happen as each link MTU is assumed to be two bytes
larger. So this packet is diveded into two fragments only, one of size 1489 and
one of size 1488. Now we have for that bigger fragment:
1489 (PPP payload)
+ 4 (MP header)
+ 2 (PPP protocol field for the MP payload (0x3d))
+ 6 (PPPoE header)
--------------------------
1501 (Ethernet payload)
This packet exceeds the link MTU and is discarded.
If one configures the link MTU on the client side to 1501, one can see the
discarded Ethernet frames with tcpdump running on the client. A
ping -s 2948 -c 1 192.168.15.254
leads to the smaller fragment that is correctly received on the server side:
(tcpdump -vvvne -i eth3 pppoes and ppp proto 0x3d)
52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
length 1514: PPPoE [ses 0x3] MLPPP (0x003d), length 1494: seq 0x000,
Flags [end], length 1492
and to the bigger fragment that is not received on the server side:
(tcpdump -vvvne -i eth2 pppoes and ppp proto 0x3d)
52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864),
length 1515: PPPoE [ses 0x5] MLPPP (0x003d), length 1495: seq 0x000,
Flags [begin], length 1493
With the patch below, we correctly obtain three fragments:
52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
length 1514: PPPoE [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000,
Flags [begin], length 1492
52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864),
length 1514: PPPoE [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000,
Flags [none], length 1492
52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
length 27: PPPoE [ses 0x1] MLPPP (0x003d), length 7: seq 0x000,
Flags [end], length 5
And the ICMPv4 echo request is successfully received at the server side:
IP (tos 0x0, ttl 64, id 21925, offset 0, flags [DF], proto ICMP (1),
length 2976)
192.168.222.2 > 192.168.15.254: ICMP echo request, id 30530, seq 0,
length 2956
The bug was introduced in commit c9aa689537
("[PPPOE]: Advertise PPPoE MTU") from the very beginning. This patch applies
to 3.10 upwards but the fix can be applied (with minor modifications) to
kernels as old as 2.6.32.
Signed-off-by: Christoph Schulz <develop@kristov.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code in neigh_sysctl_register() relies on a specific layout of
struct neigh_table, namely that the 'gc_*' variables are directly
following the 'parms' member in a specific order. The code, though,
expresses this in the most ugly way.
Get rid of the ugly casts and use the 'tbl' pointer to get a handle to
the table. This way we can refer to the 'gc_*' variables directly.
Similarly seen in the grsecurity patch, written by Brad Spengler.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
While working on some other SCTP code, I noticed that some
structures shared with user space are leaking uninitialized
stack or heap buffer. In particular, struct sctp_sndrcvinfo
has a 2 bytes hole between .sinfo_flags and .sinfo_ppid that
remains unfilled by us in sctp_ulpevent_read_sndrcvinfo() when
putting this into cmsg. But also struct sctp_remote_error
contains a 2 bytes hole that we don't fill but place into a skb
through skb_copy_expand() via sctp_ulpevent_make_remote_error().
Both structures are defined by the IETF in RFC6458:
* Section 5.3.2. SCTP Header Information Structure:
The sctp_sndrcvinfo structure is defined below:
struct sctp_sndrcvinfo {
uint16_t sinfo_stream;
uint16_t sinfo_ssn;
uint16_t sinfo_flags;
<-- 2 bytes hole -->
uint32_t sinfo_ppid;
uint32_t sinfo_context;
uint32_t sinfo_timetolive;
uint32_t sinfo_tsn;
uint32_t sinfo_cumtsn;
sctp_assoc_t sinfo_assoc_id;
};
* 6.1.3. SCTP_REMOTE_ERROR:
A remote peer may send an Operation Error message to its peer.
This message indicates a variety of error conditions on an
association. The entire ERROR chunk as it appears on the wire
is included in an SCTP_REMOTE_ERROR event. Please refer to the
SCTP specification [RFC4960] and any extensions for a list of
possible error formats. An SCTP error notification has the
following format:
struct sctp_remote_error {
uint16_t sre_type;
uint16_t sre_flags;
uint32_t sre_length;
uint16_t sre_error;
<-- 2 bytes hole -->
sctp_assoc_t sre_assoc_id;
uint8_t sre_data[];
};
Fix this by setting both to 0 before filling them out. We also
have other structures shared between user and kernel space in
SCTP that contains holes (e.g. struct sctp_paddrthlds), but we
copy that buffer over from user space first and thus don't need
to care about it in that cases.
While at it, we can also remove lengthy comments copied from
the draft, instead, we update the comment with the correct RFC
number where one can look it up.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As of commit f8567a3845 it is now possible to
have put_reqs_available() called from irq context. While put_reqs_available()
is per cpu, it did not protect itself from interrupts on the same CPU. This
lead to aio_complete() corrupting the available io requests count when run
under a heavy O_DIRECT workloads as reported by Robert Elliott. Fix this by
disabling irq updates around the per cpu batch updates of reqs_available.
Many thanks to Robert and folks for testing and tracking this down.
Reported-by: Robert Elliot <Elliott@hp.com>
Tested-by: Robert Elliot <Elliott@hp.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@infradead.org>
Cc: stable@vger.kenel.org
Use generic u64_stats_sync infrastructure to get proper 64bit stats,
even on 32bit arches, at no extra cost for 64bit arches.
Without this fix, 32bit arches can have some wrong counters at the time
the carry is propagated into upper word.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
An updater may interfer with the dumping of any of the object lists.
Fix this by using a per-net generation counter and use the
nl_dump_check_consistent() interface so the NLM_F_DUMP_INTR flag is set
to notify userspace that it has to restart the dump since an updater
has interfered.
This patch also replaces the existing consistency checking code in the
rule dumping path since it is broken. Basically, the value that the
dump callback returns is not propagated to userspace via
netlink_dump_start().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The dump operation through netlink is not protected by the nfnl_lock.
Thus, a reader process can be dumping any of the existing object
lists while another process can be updating the list content.
This patch resolves this situation by protecting all the object
lists with RCU in the netlink dump path which is the reader side.
The updater path is already protected via nfnl_lock, so use list
manipulation RCU-safe operations.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
We got a regression report for 3.15.x kernels, and this turned out to
be triggered by the fix for stream assignment order. On reporter's
machine with Intel controller (8086:1e20) + VIA VT1802 codec, the
first playback slot can't work with speaker outputs.
But the original commit was actually a fix for AMD controllers where
no proper GCAP value is returned, we shouldn't revert the whole
commit. Instead, in this patch, a new flag is introduced to determine
the stream assignment order, and follow the old behavior for Intel
controllers.
Fixes: dcb32ecd9a ('ALSA: hda - Do not assign streams in reverse order')
Reported-and-tested-by: Steven Newbury <steve@snewbury.org.uk>
Cc: <stable@vger.kernel.org> [v3.15+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
RGMII_MODE_EN bit was defined to 0, while it is actually 6. It was not
much of a problem on older designs where this was a no-op, and the RGMII
data-path would always be enabled, but newer GENET controllers need to
explicitely enable their RGMII data-pad using this bit.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This family of chips was long ago supported by the pre-cfi driver.
CFI code tested on several Zaurus SL-5500 (Collie) 2x16 on 32 bit bus.
Function is_LH28F640BF() mimics is_m29ew() from cmdset_0002.c
Buffer write fixes as seen in 2007 patch c/o
Anti Sullin <anti.sullin <at> artecdesign.ee>
http://comments.gmane.org/gmane.linux.ports.arm.kernel/36733
[Brian: this patch is semi-urgent, because the following patch switches
to using CFI detection for a chip which (until now) is unsupported by
the CFI driver
9218310 ARM: 8084/1: sa1100: collie: revert back to cfi_probe
]
Signed-off-by: Andrea Adami <andrea.adami@gmail.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
In commit 67a9ad9b8a ("mtd: nand: Warn the user if the selected ECC
strength is too weak"), a check was added to inform the user when the
ECC used for a NAND device is weaker than the recommended ECC
advertised by the NAND chip. However, the warning uses WARN_ON(),
which has two undesirable side-effects:
- It just prints to the kernel log the fact that there is a warning
in this file, at this line, but it doesn't explain anything about
the warning itself.
- It dumps a stack trace which is very noisy, for something that the
user is most likely not able to fix. If a certain ECC used by the
kernel is weaker than the advertised one, it's most likely to make
sure the kernel uses an ECC that is compatible with the one used by
the bootloader, and changing the bootloader may not necessarily be
easy. Therefore, normal users would not be able to do anything to
fix this very noisy warning, and will have to suffer from it at
every kernel boot. At least every time I see this stack trace in my
kernel boot log, I wonder what new thing is broken, just to realize
that it's once again this NAND ECC warning.
Therefore, this commit turns:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at /home/thomas/projets/linux-2.6/drivers/mtd/nand/nand_base.c:4051 nand_scan_tail+0x538/0x780()
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 3.16.0-rc3-dirty #4
[<c000e3dc>] (unwind_backtrace) from [<c000bee4>] (show_stack+0x10/0x14)
[<c000bee4>] (show_stack) from [<c0018180>] (warn_slowpath_common+0x6c/0x8c)
[<c0018180>] (warn_slowpath_common) from [<c001823c>] (warn_slowpath_null+0x1c/0x24)
[<c001823c>] (warn_slowpath_null) from [<c02c50cc>] (nand_scan_tail+0x538/0x780)
[<c02c50cc>] (nand_scan_tail) from [<c0639f78>] (orion_nand_probe+0x224/0x2e4)
[<c0639f78>] (orion_nand_probe) from [<c026da00>] (platform_drv_probe+0x18/0x4c)
[<c026da00>] (platform_drv_probe) from [<c026c1f4>] (really_probe+0x80/0x218)
[<c026c1f4>] (really_probe) from [<c026c47c>] (__driver_attach+0x98/0x9c)
[<c026c47c>] (__driver_attach) from [<c026a8f0>] (bus_for_each_dev+0x64/0x94)
[<c026a8f0>] (bus_for_each_dev) from [<c026bae4>] (bus_add_driver+0x144/0x1ec)
[<c026bae4>] (bus_add_driver) from [<c026cb00>] (driver_register+0x78/0xf8)
[<c026cb00>] (driver_register) from [<c026da5c>] (platform_driver_probe+0x20/0xb8)
[<c026da5c>] (platform_driver_probe) from [<c00088b8>] (do_one_initcall+0x80/0x1d8)
[<c00088b8>] (do_one_initcall) from [<c0620c9c>] (kernel_init_freeable+0xf4/0x1b4)
[<c0620c9c>] (kernel_init_freeable) from [<c049a098>] (kernel_init+0x8/0xec)
[<c049a098>] (kernel_init) from [<c00095f0>] (ret_from_fork+0x14/0x24)
---[ end trace 62f87d875aceccb4 ]---
Into the much shorter, and much more useful:
nand: WARNING: MT29F2G08ABAEAWP: the ECC used on your system is too weak compared to the one required by the NAND chip
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>