Commit Graph

29176 Commits

Author SHA1 Message Date
Matthew Wilcox
1a80dade01 Fix failure path in alloc_pid()
The failure path removes the allocated PIDs from the wrong namespace.
This could lead to us inadvertently reusing PIDs in the leaf namespace
and leaking PIDs in parent namespaces.

Fixes: 95846ecf9d ("pid: replace pid bitmap implementation with IDR API")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:42:30 -08:00
Linus Torvalds
b71acb0e37 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add 1472-byte test to tcrypt for IPsec
   - Reintroduced crypto stats interface with numerous changes
   - Support incremental algorithm dumps

  Algorithms:
   - Add xchacha12/20
   - Add nhpoly1305
   - Add adiantum
   - Add streebog hash
   - Mark cts(cbc(aes)) as FIPS allowed

  Drivers:
   - Improve performance of arm64/chacha20
   - Improve performance of x86/chacha20
   - Add NEON-accelerated nhpoly1305
   - Add SSE2 accelerated nhpoly1305
   - Add AVX2 accelerated nhpoly1305
   - Add support for 192/256-bit keys in gcmaes AVX
   - Add SG support in gcmaes AVX
   - ESN for inline IPsec tx in chcr
   - Add support for CryptoCell 703 in ccree
   - Add support for CryptoCell 713 in ccree
   - Add SM4 support in ccree
   - Add SM3 support in ccree
   - Add support for chacha20 in caam/qi2
   - Add support for chacha20 + poly1305 in caam/jr
   - Add support for chacha20 + poly1305 in caam/qi2
   - Add AEAD cipher support in cavium/nitrox"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (130 commits)
  crypto: skcipher - remove remnants of internal IV generators
  crypto: cavium/nitrox - Fix build with !CONFIG_DEBUG_FS
  crypto: salsa20-generic - don't unnecessarily use atomic walk
  crypto: skcipher - add might_sleep() to skcipher_walk_virt()
  crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()
  crypto: cavium/nitrox - Added AEAD cipher support
  crypto: mxc-scc - fix build warnings on ARM64
  crypto: api - document missing stats member
  crypto: user - remove unused dump functions
  crypto: chelsio - Fix wrong error counter increments
  crypto: chelsio - Reset counters on cxgb4 Detach
  crypto: chelsio - Handle PCI shutdown event
  crypto: chelsio - cleanup:send addr as value in function argument
  crypto: chelsio - Use same value for both channel in single WR
  crypto: chelsio - Swap location of AAD and IV sent in WR
  crypto: chelsio - remove set but not used variable 'kctx_len'
  crypto: ux500 - Use proper enum in hash_set_dma_transfer
  crypto: ux500 - Use proper enum in cryp_set_dma_transfer
  crypto: aesni - Add scatter/gather avx stubs, and use them in C
  crypto: aesni - Introduce partial block macro
  ..
2018-12-27 13:53:32 -08:00
Linus Torvalds
e0c38a4d1f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New ipset extensions for matching on destination MAC addresses, from
    Stefano Brivio.

 2) Add ipv4 ttl and tos, plus ipv6 flow label and hop limit offloads to
    nfp driver. From Stefano Brivio.

 3) Implement GRO for plain UDP sockets, from Paolo Abeni.

 4) Lots of work from Michał Mirosław to eliminate the VLAN_TAG_PRESENT
    bit so that we could support the entire vlan_tci value.

 5) Rework the IPSEC policy lookups to better optimize more usecases,
    from Florian Westphal.

 6) Infrastructure changes eliminating direct manipulation of SKB lists
    wherever possible, and to always use the appropriate SKB list
    helpers. This work is still ongoing...

 7) Lots of PHY driver and state machine improvements and
    simplifications, from Heiner Kallweit.

 8) Various TSO deferral refinements, from Eric Dumazet.

 9) Add ntuple filter support to aquantia driver, from Dmitry Bogdanov.

10) Batch dropping of XDP packets in tuntap, from Jason Wang.

11) Lots of cleanups and improvements to the r8169 driver from Heiner
    Kallweit, including support for ->xmit_more. This driver has been
    getting some much needed love since he started working on it.

12) Lots of new forwarding selftests from Petr Machata.

13) Enable VXLAN learning in mlxsw driver, from Ido Schimmel.

14) Packed ring support for virtio, from Tiwei Bie.

15) Add new Aquantia AQtion USB driver, from Dmitry Bezrukov.

16) Add XDP support to dpaa2-eth driver, from Ioana Ciocoi Radulescu.

17) Implement coalescing on TCP backlog queue, from Eric Dumazet.

18) Implement carrier change in tun driver, from Nicolas Dichtel.

19) Support msg_zerocopy in UDP, from Willem de Bruijn.

20) Significantly improve garbage collection of neighbor objects when
    the table has many PERMANENT entries, from David Ahern.

21) Remove egdev usage from nfp and mlx5, and remove the facility
    completely from the tree as it no longer has any users. From Oz
    Shlomo and others.

22) Add a NETDEV_PRE_CHANGEADDR so that drivers can veto the change and
    therefore abort the operation before the commit phase (which is the
    NETDEV_CHANGEADDR event). From Petr Machata.

23) Add indirect call wrappers to avoid retpoline overhead, and use them
    in the GRO code paths. From Paolo Abeni.

24) Add support for netlink FDB get operations, from Roopa Prabhu.

25) Support bloom filter in mlxsw driver, from Nir Dotan.

26) Add SKB extension infrastructure. This consolidates the handling of
    the auxiliary SKB data used by IPSEC and bridge netfilter, and is
    designed to support the needs to MPTCP which could be integrated in
    the future.

27) Lots of XDP TX optimizations in mlx5 from Tariq Toukan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1845 commits)
  net: dccp: fix kernel crash on module load
  drivers/net: appletalk/cops: remove redundant if statement and mask
  bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw
  net/net_namespace: Check the return value of register_pernet_subsys()
  net/netlink_compat: Fix a missing check of nla_parse_nested
  ieee802154: lowpan_header_create check must check daddr
  net/mlx4_core: drop useless LIST_HEAD
  mlxsw: spectrum: drop useless LIST_HEAD
  net/mlx5e: drop useless LIST_HEAD
  iptunnel: Set tun_flags in the iptunnel_metadata_reply from src
  net/mlx5e: fix semicolon.cocci warnings
  staging: octeon: fix build failure with XFRM enabled
  net: Revert recent Spectre-v1 patches.
  can: af_can: Fix Spectre v1 vulnerability
  packet: validate address length if non-zero
  nfc: af_nfc: Fix Spectre v1 vulnerability
  phonet: af_phonet: Fix Spectre v1 vulnerability
  net: core: Fix Spectre v1 vulnerability
  net: minor cleanup in skb_ext_add()
  net: drop the unused helper skb_ext_get()
  ...
2018-12-27 13:04:52 -08:00
Linus Torvalds
7f9f852c75 Modules updates for v4.21
Summary of modules changes for the 4.21 merge window:
 
 - Some modules-related kallsyms cleanups and a kallsyms fix for ARM.
 
 - Include keys from the secondary keyring in module signature
   verification.
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJcFzpbAAoJEMBFfjjOO8FyxOAP/jqIyJ08IThhgEWcsXwCgvir
 a5PtovqAwP3pWXJ0SE64/Hz4edwcPUnUvzt6nia7JELgZWukIQcjA/Yav4w65KA6
 kNAW4+BY41vjGpFBtObgMjU9dcEr8QPhO4362s7sPwxYaoRMI+uYHzEkxDvJaL8p
 1d5g/xdX+82rTQUwgzxHHqrfoHbL0H83eVLTG6YtmWCDHdXGq4lI7ZvHd87Qii3H
 PoL1ALiFyf0eO1Gouaivox3tBkpX6hI8Kl9Tm8lL0dIlIn3AcXj869T/h6jbhqMT
 qpMazFokSWGZ1m2sCfaxoA6L+MUqgn0zHSLm68B69CHj483919QsQ5wpHSmpT2Jp
 /szUuO1vHDd/e+nMGvxO0teg94OUfJ+J08RNC0B+QJ3dclOARR3z2Qnx1nR+7go/
 nBSjlFvedx7wvv9hIHYJdPdtxy7qOwY+jLW2nDXUwYSIkpJKq5Fm1qYlqEJhyuhy
 bQgTCR4da0iMdCuccHXS3XYhIsqgNDhZpcBu19ToRCH7RroitK/8rBssMCVsd0WB
 uSLgdkgkZrpOMzb/lQv8IDvqXOUrU2Tm2SUikUiZWzQGEvkeD6rDjxSxhEUbq5+m
 ZujOgp5EE4Li5PXUeX5rqMOxNmNysvOK8r0pynn6D2c77x/hDNuLHQQ5OFT9kPNs
 qInek4B09h0gij4OgSRp
 =vevq
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull modules updates from Jessica Yu:

 - Some modules-related kallsyms cleanups and a kallsyms fix for ARM.

 - Include keys from the secondary keyring in module signature
   verification.

* tag 'modules-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  ARM: module: Fix function kallsyms on Thumb-2
  module: Overwrite st_size instead of st_info
  module: make it clearer when we're handling kallsyms symbols vs exported symbols
  modsign: use all trusted keys to verify module signature
2018-12-27 12:08:33 -08:00
Linus Torvalds
047ce6d380 audit/stable-4.21 PR 20181224
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAlwhAwIUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNl1w/+PKsewN5VkmmfibIxZ+iZwe1KGB+L
 iOwkdHDkG1Bae5A7TBdbKMbHq0FdhaiDXAIFrfunBG/tbgBF9O0056edekR4rRLp
 ReGQVNpGMggiATyVKrc3vi+4+UYQqtS6N7Y8q+mMMX/hVeeESXrTAZdgxSWwsZAX
 LbYwXXYUyupLvelpkpakE6VPZEcatcYWrVK/vFKLkTt2jLLlLPtanbMf0B71TULi
 5EZSVBYWS71a6yvrrYcVDDZjgot31nVQfX4EIqE6CVcXLuL9vqbZBGKZh+iAGbjs
 UdKgaQMZ/eJ4CRYDJca0Bnba3n1AKO4uNssY0nrMW4s/inDPrJnMZ0kgGWfayE3d
 QR96aHEP5W3SZoiJCUlYm8a4JFfndYKn4YBvqjvLgIkbd784/rvI+sNGM9BF1DNP
 f05frIJVHLNO3sECKWMmQyMGWGglj7bLsjtKrai5UQReyFLpM/q/Lh3J1IHZ9KZq
 YWFTA4G0rg7x2bdEB4Qh/SaLOOHW7uyQ7IJCYfzSKsZCIO++RqCQoArxiKRE6++C
 hv0UG6NGb6Z6a+k1JSzlxCXPmcui0zow7aqEpZSl/9kiYzkLpBITha/ERP7at5M2
 W3JVNfQNn6kPtZFgmNuP7rNE9Yn6jnbIdks0nsi/J/4KUr/p2Mfc5LamyTj1unk6
 xf7S+xmOFKHAc2s=
 =PCHx
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20181224' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "In the finest of holiday of traditions, I have a number of gifts to
  share today. While most of them are re-gifts from others, unlike the
  typical re-gift, these are things you will want in and around your
  tree; I promise.

  This pull request is perhaps a bit larger than our typical PR, but
  most of it comes from Jan's rework of audit's fanotify code; a very
  welcome improvement. We ran this through our normal regression tests,
  as well as some newly created stress tests and everything looks good.

  Richard added a few patches, mostly cleaning up a few things and and
  shortening some of the audit records that we send to userspace; a
  change the userspace folks are quite happy about.

  Finally YueHaibing and I kick in a few patches to simplify things a
  bit and make the code less prone to errors.

  Lastly, I want to say thanks one more time to everyone who has
  contributed patches, testing, and code reviews for the audit subsystem
  over the past year. The project is what it is due to your help and
  contributions - thank you"

* tag 'audit-pr-20181224' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: (22 commits)
  audit: remove duplicated include from audit.c
  audit: shorten PATH cap values when zero
  audit: use current whenever possible
  audit: minimize our use of audit_log_format()
  audit: remove WATCH and TREE config options
  audit: use session_info helper
  audit: localize audit_log_session_info prototype
  audit: Use 'mark' name for fsnotify_mark variables
  audit: Replace chunk attached to mark instead of replacing mark
  audit: Simplify locking around untag_chunk()
  audit: Drop all unused chunk nodes during deletion
  audit: Guarantee forward progress of chunk untagging
  audit: Allocate fsnotify mark independently of chunk
  audit: Provide helper for dropping mark's chunk reference
  audit: Remove pointless check in insert_hash()
  audit: Factor out chunk replacement code
  audit: Make hash table insertion safe against concurrent lookups
  audit: Embed key into chunk
  audit: Fix possible tagging failures
  audit: Fix possible spurious -ENOSPC error
  ...
2018-12-27 11:58:50 -08:00
Linus Torvalds
a3b5c1065f Printk changes for 4.21
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJcG5XBAAoJEFKgDEdIgJTyoV8QAJGJtlSLXJewMaJyLom8fb2I
 YvdNo2gq+uLeNAwNoOio/pcMZypfjoISYq7T4etfElXay42c7MgZftMoo/cmtlhs
 FU9WUVYUXWLuaMgibP7nl9fsNUtRt/ySY7PfOj3nu6A/E4dqqNWnoC7V9rLp2h70
 Np7L1JEnUr0daRhY6sBm2V6VwQKxjXHY/sdC3xw88R8CVA1wMAxCxouz8qHopvn3
 4Anfhu4o6e4PGCw8YxFIwKS7K5MtDP/WESOF/80/EB+tZkJzH63B3ozqxMirlHMt
 zilw6FPwZRX1NRJ1gDJJmZjt0rwC9oCr0u93QUdUx9j179THs8TBf3DaJEIx87zZ
 fwy+PpN+8OXnS+6qAQOhSaMtms6pPE73Kr2vTukvNmhEoHc0lGIXbKQeWdVl248a
 y9nTlJiOCEiA/nssNGpUVM7uncziKOmJOoQfyaSI9OOo/u3tAwZXrAe7f3GPKeWo
 o6RaIKfTx1LJhco1vxbc93pKCK4ItXU0aQxjRbpBBRjhlFJG8C8alKRagx4LXRpe
 5bFd7L+amtN6BzpeI1uGMKEeRBwn0zjlPrc12bTe6MjiRLr3wtthspELY2ubcIxk
 ghe7ARzb05X9O206EkF6Mir/fn3oudrouuAGyJjNQyAi/OjijRB92l7dLkn/Pe1o
 hNTPMDU/po0y3ulDwaVx
 =FeVC
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk

Pull printk updates from Petr Mladek:

 - Keep spinlocks busted until the end of panic()

 - Fix races between calculating number of messages that would fit into
   user space buffers, filling the buffers, and switching printk.time
   parameter

 - Some code clean up

* tag 'printk-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  printk: Remove print_prefix() calls with NULL buffer.
  printk: fix printk_time race.
  printk: Make printk_emit() local function.
  panic: avoid deadlocks in re-entrant console drivers
2018-12-27 11:24:43 -08:00
Olof Johansson
6d101ba6be sched/fair: Fix warning on non-SMP build
Caused by making the variable static:

  kernel/sched/fair.c:119:21: warning: 'capacity_margin' defined but not used [-Wunused-variable]

Seems easiest to just move it up under the existing ifdef CONFIG_SMP
that's a few lines above.

Fixes: ed8885a144 ('sched/fair: Make some variables static')
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-27 10:40:15 -08:00
Linus Torvalds
17bf423a1f Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Introduce "Energy Aware Scheduling" - by Quentin Perret.

     This is a coherent topology description of CPUs in cooperation with
     the PM subsystem, with the goal to schedule more energy-efficiently
     on asymetric SMP platform - such as waking up tasks to the more
     energy-efficient CPUs first, as long as the system isn't
     oversubscribed.

     For details of the design, see:

        https://lore.kernel.org/lkml/20180724122521.22109-1-quentin.perret@arm.com/

   - Misc cleanups and smaller enhancements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  sched/fair: Select an energy-efficient CPU on task wake-up
  sched/fair: Introduce an energy estimation helper function
  sched/fair: Add over-utilization/tipping point indicator
  sched/fair: Clean-up update_sg_lb_stats parameters
  sched/toplogy: Introduce the 'sched_energy_present' static key
  sched/topology: Make Energy Aware Scheduling depend on schedutil
  sched/topology: Disable EAS on inappropriate platforms
  sched/topology: Add lowest CPU asymmetry sched_domain level pointer
  sched/topology: Reference the Energy Model of CPUs when available
  PM: Introduce an Energy Model management framework
  sched/cpufreq: Prepare schedutil for Energy Aware Scheduling
  sched/topology: Relocate arch_scale_cpu_capacity() to the internal header
  sched/core: Remove unnecessary unlikely() in push_*_task()
  sched/topology: Remove the ::smt_gain field from 'struct sched_domain'
  sched: Fix various typos in comments
  sched/core: Clean up the #ifdef block in add_nr_running()
  sched/fair: Make some variables static
  sched/core: Create task_has_idle_policy() helper
  sched/fair: Add lsub_positive() and use it consistently
  sched/fair: Mask UTIL_AVG_UNCHANGED usages
  ...
2018-12-26 14:56:10 -08:00
Linus Torvalds
116b081c28 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "The main changes in this cycle on the kernel side:

   - rework kprobes blacklist handling (Masami Hiramatsu)

   - misc cleanups

  on the tooling side these areas were the main focus:

   - 'perf trace'     enhancements (Arnaldo Carvalho de Melo)

   - 'perf bench'     enhancements (Davidlohr Bueso)

   - 'perf record'    enhancements (Alexey Budankov)

   - 'perf annotate'  enhancements (Jin Yao)

   - 'perf top'       enhancements (Jiri Olsa)

   - Intel hw tracing enhancements (Adrian Hunter)

   - ARM hw tracing   enhancements (Leo Yan, Mathieu Poirier)

   - ... plus lots of other enhancements, cleanups and fixes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (171 commits)
  tools uapi asm: Update asm-generic/unistd.h copy
  perf symbols: Relax checks on perf-PID.map ownership
  perf trace: Wire up the fadvise 'advice' table generator
  perf beauty: Add generator for fadvise64's 'advice' arg constants
  tools headers uapi: Grab a copy of fadvise.h
  perf beauty mmap: Print mmap's 'offset' arg in hexadecimal
  perf beauty mmap: Print PROT_READ before PROT_EXEC to match strace output
  perf trace beauty: Beautify arch_prctl()'s arguments
  perf trace: When showing string prefixes show prefix + ??? for unknown entries
  perf trace: Move strarrays to beauty.h for further reuse
  perf beauty: Wire up the x86_arch prctl code table generator
  perf beauty: Add a string table generator for x86's 'arch_prctl' codes
  tools include arch: Grab a copy of x86's prctl.h
  perf trace: Show NULL when syscall pointer args are 0
  perf trace: Enclose the errno strings with ()
  perf augmented_raw_syscalls: Copy 'access' arg as well
  perf trace: Add alignment spaces after the closing parens
  perf trace beauty: Print O_RDONLY when (flags & O_ACCMODE) == 0
  perf trace: Allow asking for not suppressing common string prefixes
  perf trace: Add a prefix member to the strarray class
  ...
2018-12-26 14:45:18 -08:00
Linus Torvalds
1eefdec18e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main change in this cycle are initial preparatory bits of dynamic
  lockdep keys support from Bart Van Assche.

  There are also misc changes, a comment cleanup and a data structure
  cleanup"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Clean up comment in nohz_idle_balance()
  locking/lockdep: Stop using RCU primitives to access 'all_lock_classes'
  locking/lockdep: Make concurrent lockdep_reset_lock() calls safe
  locking/lockdep: Remove a superfluous INIT_LIST_HEAD() statement
  locking/lockdep: Introduce lock_class_cache_is_registered()
  locking/lockdep: Inline __lockdep_init_map()
  locking/lockdep: Declare local symbols static
  tools/lib/lockdep/tests: Test the lockdep_reset_lock() implementation
  tools/lib/lockdep: Add dummy print_irqtrace_events() implementation
  tools/lib/lockdep: Rename "trywlock" into "trywrlock"
  tools/lib/lockdep/tests: Run lockdep tests a second time under Valgrind
  tools/lib/lockdep/tests: Improve testing accuracy
  tools/lib/lockdep/tests: Fix shellcheck warnings
  tools/lib/lockdep/tests: Display compiler warning and error messages
  locking/lockdep: Remove ::version from lock_class structure
2018-12-26 14:25:52 -08:00
Linus Torvalds
792bf4d871 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The biggest RCU changes in this cycle were:

   - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

   - Replace calls of RCU-bh and RCU-sched update-side functions to
     their vanilla RCU counterparts. This series is a step towards
     complete removal of the RCU-bh and RCU-sched update-side functions.

     ( Note that some of these conversions are going upstream via their
       respective maintainers. )

   - Documentation updates, including a number of flavor-consolidation
     updates from Joel Fernandes.

   - Miscellaneous fixes.

   - Automate generation of the initrd filesystem used for rcutorture
     testing.

   - Convert spin_is_locked() assertions to instead use lockdep.

     ( Note that some of these conversions are going upstream via their
       respective maintainers. )

   - SRCU updates, especially including a fix from Dennis Krein for a
     bag-on-head-class bug.

   - RCU torture-test updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
  rcutorture: Don't do busted forward-progress testing
  rcutorture: Use 100ms buckets for forward-progress callback histograms
  rcutorture: Recover from OOM during forward-progress tests
  rcutorture: Print forward-progress test age upon failure
  rcutorture: Print time since GP end upon forward-progress failure
  rcutorture: Print histogram of CB invocation at OOM time
  rcutorture: Print GP age upon forward-progress failure
  rcu: Print per-CPU callback counts for forward-progress failures
  rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
  rcutorture: Dump grace-period diagnostics upon forward-progress OOM
  rcutorture: Prepare for asynchronous access to rcu_fwd_startat
  torture: Remove unnecessary "ret" variables
  rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
  rcutorture: Break up too-long rcu_torture_fwd_prog() function
  rcutorture: Remove cbflood facility
  torture: Bring any extra CPUs online during kernel startup
  rcutorture: Add call_rcu() flooding forward-progress tests
  rcutorture/formal: Replace synchronize_sched() with synchronize_rcu()
  tools/kernel.h: Replace synchronize_sched() with synchronize_rcu()
  net/decnet: Replace rcu_barrier_bh() with rcu_barrier()
  ...
2018-12-26 13:07:19 -08:00
Linus Torvalds
5694cecdb0 arm64 festive updates for 4.21
In the end, we ended up with quite a lot more than I expected:
 
 - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
   kernel-side support to come later)
 
 - Support for per-thread stack canaries, pending an update to GCC that
   is currently undergoing review
 
 - Support for kexec_file_load(), which permits secure boot of a kexec
   payload but also happens to improve the performance of kexec
   dramatically because we can avoid the sucky purgatory code from
   userspace. Kdump will come later (requires updates to libfdt).
 
 - Optimisation of our dynamic CPU feature framework, so that all
   detected features are enabled via a single stop_machine() invocation
 
 - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
   they can benefit from global TLB entries when KASLR is not in use
 
 - 52-bit virtual addressing for userspace (kernel remains 48-bit)
 
 - Patch in LSE atomics for per-cpu atomic operations
 
 - Custom preempt.h implementation to avoid unconditional calls to
   preempt_schedule() from preempt_enable()
 
 - Support for the new 'SB' Speculation Barrier instruction
 
 - Vectorised implementation of XOR checksumming and CRC32 optimisations
 
 - Workaround for Cortex-A76 erratum #1165522
 
 - Improved compatibility with Clang/LLD
 
 - Support for TX2 system PMUS for profiling the L3 cache and DMC
 
 - Reflect read-only permissions in the linear map by default
 
 - Ensure MMIO reads are ordered with subsequent calls to Xdelay()
 
 - Initial support for memory hotplug
 
 - Tweak the threshold when we invalidate the TLB by-ASID, so that
   mremap() performance is improved for ranges spanning multiple PMDs.
 
 - Minor refactoring and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJcE4TmAAoJELescNyEwWM0Nr0H/iaU7/wQSzHyNXtZoImyKTul
 Blu2ga4/EqUrTU7AVVfmkl/3NBILWlgQVpY6tH6EfXQuvnxqD7CizbHyLdyO+z0S
 B5PsFUH2GLMNAi48AUNqGqkgb2knFbg+T+9IimijDBkKg1G/KhQnRg6bXX32mLJv
 Une8oshUPBVJMsHN1AcQknzKariuoE3u0SgJ+eOZ9yA2ZwKxP4yy1SkDt3xQrtI0
 lojeRjxcyjTP1oGRNZC+BWUtGOT35p7y6cGTnBd/4TlqBGz5wVAJUcdoxnZ6JYVR
 O8+ob9zU+4I0+SKt80s7pTLqQiL9rxkKZ5joWK1pr1g9e0s5N5yoETXKFHgJYP8=
 =sYdt
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 festive updates from Will Deacon:
 "In the end, we ended up with quite a lot more than I expected:

   - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
     kernel-side support to come later)

   - Support for per-thread stack canaries, pending an update to GCC
     that is currently undergoing review

   - Support for kexec_file_load(), which permits secure boot of a kexec
     payload but also happens to improve the performance of kexec
     dramatically because we can avoid the sucky purgatory code from
     userspace. Kdump will come later (requires updates to libfdt).

   - Optimisation of our dynamic CPU feature framework, so that all
     detected features are enabled via a single stop_machine()
     invocation

   - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
     they can benefit from global TLB entries when KASLR is not in use

   - 52-bit virtual addressing for userspace (kernel remains 48-bit)

   - Patch in LSE atomics for per-cpu atomic operations

   - Custom preempt.h implementation to avoid unconditional calls to
     preempt_schedule() from preempt_enable()

   - Support for the new 'SB' Speculation Barrier instruction

   - Vectorised implementation of XOR checksumming and CRC32
     optimisations

   - Workaround for Cortex-A76 erratum #1165522

   - Improved compatibility with Clang/LLD

   - Support for TX2 system PMUS for profiling the L3 cache and DMC

   - Reflect read-only permissions in the linear map by default

   - Ensure MMIO reads are ordered with subsequent calls to Xdelay()

   - Initial support for memory hotplug

   - Tweak the threshold when we invalidate the TLB by-ASID, so that
     mremap() performance is improved for ranges spanning multiple PMDs.

   - Minor refactoring and cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits)
  arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset()
  arm64: sysreg: Use _BITUL() when defining register bits
  arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches
  arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4
  arm64: docs: document pointer authentication
  arm64: ptr auth: Move per-thread keys from thread_info to thread_struct
  arm64: enable pointer authentication
  arm64: add prctl control for resetting ptrauth keys
  arm64: perf: strip PAC when unwinding userspace
  arm64: expose user PAC bit positions via ptrace
  arm64: add basic pointer authentication support
  arm64/cpufeature: detect pointer authentication
  arm64: Don't trap host pointer auth use to EL2
  arm64/kvm: hide ptrauth from guests
  arm64/kvm: consistently handle host HCR_EL2 flags
  arm64: add pointer authentication register bits
  arm64: add comments about EC exception levels
  arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned
  arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
  arm64: enable per-task stack canaries
  ...
2018-12-25 17:41:56 -08:00
Linus Torvalds
9f687dddc4 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The timer department delivers the following christmas presents:

  Core code:

   - Use proper seqcount initializer to make lockdep happy

   - SPDX annotations and cleanup of license boilerplates

   - Use DEFINE_SHOW_ATTRIBUTE() instead of open coding it

   - Minor cleanups

  Driver code:

   - Add the sched_clock for the arc timer (Alexey Brodkin)

   - Change the file timer names for riscv, rockchip, tegra20, sun4i and
     meson6 (Daniel Lezcano)

   - Add the DT bindings for r8a7796, r8a77470 and r8a774a1 (Biju Das)

   - Remove the early platform driver registration for timer-ti-dm
     (Bartosz Golaszewski)

   - Provide the sched_clock for the riscv timer (Anup Patel)

   - Add support for ARM64 for the imx-gpt and convert the imx-tpm to
     the timer-of API (Anson Huang)

   - Remove useless irq protection for the imx-gpt (Clément Péron)

   - Remove a duplicate function name for the vt8500 (Dan Carpenter)

   - Remove obsolete inclusion of <asm/smp_twd.h> for the tegra20 (Geert
     Uytterhoeven)

   - Demote the prcmu and the custom sched_clock for the dbx500 and the
     ux500 (Linus Walleij)

   - Add a new timer clock for the RDA8810PL (Manivannan Sadhasivam)

   - Rename the macro to stick to the register name and add the delay
     timer (Martin Blumenstingl)

   - Switch the bcm2835 to the SPDX identifier (Stefan Wahren)

   - Fix the interrupt register access on the fttmr010 (Tao Ren)

   - Add missing of_node_put in the initialization path on the
     integrator-ap (Yangtao Li)"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  dt-bindings: timer: Document RDA8810PL SoC timer
  clocksource/drivers/rda: Add clock driver for RDA8810PL SoC
  clocksource/drivers/meson6: Change name meson6_timer timer-meson6
  clocksource/drivers/sun4i: Change name sun4i_timer to timer-sun4i
  clocksource/drivers/tegra20: Change name tegra20_timer to timer-tegra20
  clocksource/drivers/rockchip: Change name rockchip_timer to timer-rockchip
  clocksource/drivers/riscv: Change name riscv_timer to timer-riscv
  clocksource/drivers/riscv_timer: Provide the sched_clock
  clocksource/drivers/timer-imx-tpm: Specify clock name for timer-of
  clocksource/drivers/fttmr010: Fix invalid interrupt register access
  clocksource/drivers/integrator-ap: Add missing of_node_put()
  clocksource/drivers/bcm2835: Switch to SPDX identifier
  dt-bindings: timer: renesas, cmt: Document r8a774a1 CMT support
  clocksource/drivers/timer-imx-tpm: Convert the driver to timer-of
  clocksource/drivers/arc_timer: Utilize generic sched_clock
  dt-bindings: timer: renesas, cmt: Document r8a77470 CMT support
  dt-bindings: timer: renesas, cmt: Document r8a7796 CMT support
  clocksource/drivers/imx-gpt: Remove unnecessary irq protection
  clocksource/drivers/imx-gpt: Add support for ARM64
  clocksource/drivers/meson6_timer: Implement the ARM delay timer
  ...
2018-12-25 15:44:08 -08:00
Linus Torvalds
e4b99d415c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The interrupt department provides:

  Core updates:

   - Better spreading to NUMA nodes in the affinity management

   - Support for more than one set of interrupts to spread out to allow
     separate queues for separate functionality of a single device.

   - Decouple the non queue interrupts from being managed. Those are
     usually general interrupts for error handling etc. and those should
     never be shut down. This also a preparation to utilize the
     spreading mechanism for initial spreading of non-managed interrupts
     later.

   - Make the single CPU target selection in the matrix allocator more
     balanced so interrupts won't accumulate on single CPUs in certain
     situations.

   - A large spell checking patch so we don't end up fixing single typos
     over and over.

  Driver updates:

   - A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)

   - Updates for the 8MQ, F1C100s platform drivers

   - A number of SPDX cleanups

   - A workaround for a very broken GICv3 implementation on msm8996
     which sports a botched register set.

   - A platform-msi fix to prevent memory leakage

   - Various cleanups"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  genirq/affinity: Add is_managed to struct irq_affinity_desc
  genirq/core: Introduce struct irq_affinity_desc
  genirq/affinity: Remove excess indentation
  irqchip/stm32: protect configuration registers with hwspinlock
  dt-bindings: interrupt-controller: stm32: Document hwlock properties
  irqchip: Add driver for imx-irqsteer controller
  dt-bindings/irq: Add binding for Freescale IRQSTEER multiplexer
  irqchip: Add driver for Cirrus Logic Madera codecs
  genirq: Fix various typos in comments
  irqchip/irq-imx-gpcv2: Add IRQCHIP_DECLARE for i.MX8MQ compatible
  irqchip/irq-rda-intc: Fix return value check in rda8810_intc_init()
  irqchip/irq-imx-gpcv2: Silence "fall through" warning
  irqchip/gic-v3: Add quirk for msm8996 broken registers
  irqchip/gic: Add support to device tree based quirks
  dt-bindings/gic-v3: Add msm8996 compatible string
  irqchip/sun4i: Add support for Allwinner ARMv5 F1C100s
  irqchip/sun4i: Move IC specific register offsets to struct
  irqchip/sun4i: Add a struct to hold global variables
  dt-bindings: interrupt-controller: Add suniv interrupt-controller
  irqchip: Add RDA8810PL interrupt driver
  ...
2018-12-25 15:17:51 -08:00
Linus Torvalds
1e2af254ef Power management updates for 4.21-rc1
- Add sysadmin documentation for cpuidle (Rafael Wysocki).
 
  - Make it possible to specify a cpuidle governor from kernel
    command line, add new cpuidle state sysfs attributes for
    governor evaluation, and improve the "polling" idle state
    handling (Rafael Wysocki).
 
  - Fix the handling of the "required-opps" DT property in the
    operating performance points (OPP) framework, improve the
    integration of it with the generic power domains (genpd)
    framework, improve the handling of performance states in
    them and clean up the idle states vs performance states
    separation in genpd (Viresh Kumar, Ulf Hansson).
 
  - Add a cpufreq driver called "qcom-hw" for Qualcomm SoCs using
    a hardware engine to control CPU frequency transitions along
    with DT bindings for it (Taniya Das).
 
  - Fix an intel_pstate driver issue related to CPU offline and
    update the documentation of it (Srinivas Pandruvada).
 
  - Clean up the imx6q cpufreq driver (Anson Huang).
 
  - Add SPDX license IDs to cpufreq schedutil governor files (Daniel
    Lezcano).
 
  - Switch over the runtime PM framework to using high-res timers
    for device autosuspend to allow the control of it to be more
    precise (Vincent Guittot).
 
  - Disable non-wakeup ACPI GPEs during suspend-to-idle so that they
    don't prevent the system from reaching the target low-power state
    and simplify the suspend-to-idle handling on ACPI platforms
    without full Low-Power S0 Idle (LPS0) support (Rafael Wysocki).
 
  - Add system-wide suspend and resume support to the devfreq
    framework (Lukasz Luba).
 
  - Clean up the SmartReflex adaptive voltage scaling (AVS) driver and
    add an SPDX license ID to it (Nishanth Menon, Uwe Kleine-König,
    Thomas Meyer).
 
  - Get rid of code duplication by using the DEFINE_SHOW_ATTRIBUTE
    macro in some places, fix some DT node refcount leaks, and do
    some other janitorial cleanups (Yangtao Li).
 
  - Update the cpupower, intel_pstate_tracer and turbosat utilities
    (Abhishek Goel, Doug Smythies, Len Brown).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcHMOKAAoJEILEb/54YlRxtHUP/i4MePJYYIab3dW2WMtFC9wP
 K3Z/qva5uiEampEZbjlATlqUrd0XxMU2YfkJ9N08rJ33bKnBi8wNp0BPiwZjYs77
 lwjuEQ0Depz85LJ7W+dOjjxucdYv/H/ZXVK7VF2sfAfbpEbD7+RNTTa1+zdoh3Yq
 AiCKH/fU+Yc+wOMcXxj8vWe8EecsuYfZUC/p/yJDv1uMaUyHTgiCAyE/S2JzvZcQ
 mGFmly8b1hSkvCPOsipq/O8HUDtve8sOyZvFO5Up5BiNbDyhEHS4tDiKipbKgc/J
 ljzP3pVATlnEZLFxyqg32PhuhwYB0MtN3+BZGTjLelqTmElTx9A2JvWBVk4jgaC9
 ps97nUz1HO2dr1ERDyMnDtZFHFp24LVQcIB2RKfj/lqSq94IWQphmNZVF219jdyd
 68wR2/fnkzTeDhJ1JvcJb5XKrrd/wq3IKfCJAd9AXVxy27BrCB0ryTsQFvXJ+AzV
 n603yf3Sb7QaIC7Mofhj2WT3N/BQnzMzZhBJRA3EeX/Gd7TkLtQuvvHZQzcGbgOY
 GGm6WKGWgEUp3YIJIz5ihUuy9SUETCvR2PqGZfo+Wmc7F77j/5FY5ejRt3llkeFn
 8KaVQH9YIRjffBr1nclXMHKaRtf/JVB9SgVmDvA6Sh3Ac5/KBy3+IgwiJ2z5dunr
 tYV/QJ22r15xEBITTSJ7
 =WhM8
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These add sysadmin documentation for cpuidle, extend the cpuidle
  subsystem somewhat, improve the handling of performance states in the
  generic power domains (genpd) and operating performance points (OPP)
  frameworks, add a new cpufreq driver for Qualcomm SoCs, update some
  other cpufreq drivers, switch over the runtime PM framework to using
  high-res timers for device autosuspend, fix a problem with
  suspend-to-idle on ACPI-based platforms, add system-wide suspend and
  resume handling to the devfreq framework, do some janitorial cleanups
  all over and update some utilities.

  Specifics:

   - Add sysadmin documentation for cpuidle (Rafael Wysocki).

   - Make it possible to specify a cpuidle governor from kernel command
     line, add new cpuidle state sysfs attributes for governor
     evaluation, and improve the "polling" idle state handling (Rafael
     Wysocki).

   - Fix the handling of the "required-opps" DT property in the
     operating performance points (OPP) framework, improve the
     integration of it with the generic power domains (genpd) framework,
     improve the handling of performance states in them and clean up the
     idle states vs performance states separation in genpd (Viresh
     Kumar, Ulf Hansson).

   - Add a cpufreq driver called "qcom-hw" for Qualcomm SoCs using a
     hardware engine to control CPU frequency transitions along with DT
     bindings for it (Taniya Das).

   - Fix an intel_pstate driver issue related to CPU offline and update
     the documentation of it (Srinivas Pandruvada).

   - Clean up the imx6q cpufreq driver (Anson Huang).

   - Add SPDX license IDs to cpufreq schedutil governor files (Daniel
     Lezcano).

   - Switch over the runtime PM framework to using high-res timers for
     device autosuspend to allow the control of it to be more precise
     (Vincent Guittot).

   - Disable non-wakeup ACPI GPEs during suspend-to-idle so that they
     don't prevent the system from reaching the target low-power state
     and simplify the suspend-to-idle handling on ACPI platforms without
     full Low-Power S0 Idle (LPS0) support (Rafael Wysocki).

   - Add system-wide suspend and resume support to the devfreq framework
     (Lukasz Luba).

   - Clean up the SmartReflex adaptive voltage scaling (AVS) driver and
     add an SPDX license ID to it (Nishanth Menon, Uwe Kleine-König,
     Thomas Meyer).

   - Get rid of code duplication by using the DEFINE_SHOW_ATTRIBUTE
     macro in some places, fix some DT node refcount leaks, and do some
     other janitorial cleanups (Yangtao Li).

   - Update the cpupower, intel_pstate_tracer and turbosat utilities
     (Abhishek Goel, Doug Smythies, Len Brown)"

* tag 'pm-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (54 commits)
  PM / Domains: remove define_genpd_open_function() and define_genpd_debugfs_fops()
  PM-runtime: Switch autosuspend over to using hrtimers
  cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver
  dt-bindings: cpufreq: Introduce QCOM cpufreq firmware bindings
  ACPI: PM: Loop in full LPS0 mode only
  ACPI: EC / PM: Disable non-wakeup GPEs for suspend-to-idle
  tools/power/x86/intel_pstate_tracer: Fix non root execution for post processing a trace file
  tools/power turbostat: consolidate duplicate model numbers
  tools/power turbostat: fix goldmont C-state limit decoding
  PM / Domains: Propagate performance state updates
  PM / Domains: Factorize dev_pm_genpd_set_performance_state()
  PM / Domains: Save OPP table pointer in genpd
  OPP: Don't return 0 on error from of_get_required_opp_performance_state()
  OPP: Add dev_pm_opp_xlate_performance_state() helper
  OPP: Improve _find_table_of_opp_np()
  PM / Domains: Make genpd performance states orthogonal to the idlestates
  PM / sleep: convert to DEFINE_SHOW_ATTRIBUTE
  cpuidle: Add 'above' and 'below' idle state metrics
  PM / AVS: SmartReflex: Switch to SPDX Licence ID
  PM / AVS: SmartReflex: NULL check before some freeing functions is not needed
  ...
2018-12-25 13:47:41 -08:00
David S. Miller
ce28bb4453 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-21 15:06:20 -08:00
Rik van Riel
5eed6f1dff fork,memcg: fix crash in free_thread_stack on memcg charge fail
Commit 9b6f7e163c ("mm: rework memcg kernel stack accounting") will
result in fork failing if allocating a kernel stack for a task in
dup_task_struct exceeds the kernel memory allowance for that cgroup.

Unfortunately, it also results in a crash.

This is due to the code jumping to free_stack and calling
free_thread_stack when the memcg kernel stack charge fails, but without
tsk->stack pointing at the freshly allocated stack.

This in turn results in the vfree_atomic in free_thread_stack oopsing
with a backtrace like this:

#5 [ffffc900244efc88] die at ffffffff8101f0ab
 #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
 #7 [ffffc900244efce0] general_protection at ffffffff818ff082
    [exception RIP: llist_add_batch+7]
    RIP: ffffffff8150d487  RSP: ffffc900244efd98  RFLAGS: 00010282
    RAX: 0000000000000000  RBX: ffff88085ef55980  RCX: 0000000000000000
    RDX: ffff88085ef55980  RSI: 343834343531203a  RDI: 343834343531203a
    RBP: ffffc900244efd98   R8: 0000000000000001   R9: ffff8808578c3600
    R10: 0000000000000000  R11: 0000000000000001  R12: ffff88029f6c21c0
    R13: 0000000000000286  R14: ffff880147759b00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37
#10 [ffffc900244efe98] _do_fork at ffffffff810884e0
#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
    RIP: 000000000049b948  RSP: 00007ffcdb307830  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000896030  RCX: 000000000049b948
    RDX: 0000000000000000  RSI: 00007ffcdb307790  RDI: 00000000005d7421
    RBP: 000000000067370f   R8: 00007ffcdb3077b0   R9: 000000000001ed00
    R10: 0000000000000008  R11: 0000000000000246  R12: 0000000000000040
    R13: 000000000000000f  R14: 0000000000000000  R15: 000000000088d018
    ORIG_RAX: 000000000000003a  CS: 0033  SS: 002b

The simplest fix is to assign tsk->stack right where it is allocated.

Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com
Fixes: 9b6f7e163c ("mm: rework memcg kernel stack accounting")
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-21 14:51:18 -08:00
Linus Torvalds
e572fa0e84 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
 "Fix a division by zero crash in the posix-timers code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-timers: Fix division by zero bug
2018-12-21 10:51:54 -08:00
Linus Torvalds
d5fa080d4c Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex fix from Ingo Molnar:
 "A single fix for a robust futexes race between sys_exit() and
  sys_futex_lock_pi()"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Cure exit race
2018-12-21 10:11:51 -08:00
Rafael J. Wysocki
442a5d000a Merge branches 'pm-core', 'pm-qos', 'pm-domains' and 'pm-sleep'
* pm-core:
  PM-runtime: Switch autosuspend over to using hrtimers

* pm-qos:
  PM / QoS: Change to use DEFINE_SHOW_ATTRIBUTE macro

* pm-domains:
  PM / Domains: remove define_genpd_open_function() and define_genpd_debugfs_fops()

* pm-sleep:
  PM / sleep: convert to DEFINE_SHOW_ATTRIBUTE
2018-12-21 10:06:44 +01:00
Rafael J. Wysocki
3a56fe685d Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-cpufreq-sched'
* pm-cpuidle:
  cpuidle: Add 'above' and 'below' idle state metrics
  cpuidle: big.LITTLE: fix refcount leak
  cpuidle: Add cpuidle.governor= command line parameter
  cpuidle: poll_state: Disregard disable idle states
  Documentation: admin-guide: PM: Add cpuidle document

* pm-cpufreq:
  cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver
  dt-bindings: cpufreq: Introduce QCOM cpufreq firmware bindings
  cpufreq: nforce2: Remove meaningless return
  cpufreq: ia64: Remove unused header files
  cpufreq: imx6q: save one condition block for normal case of nvmem read
  cpufreq: imx6q: remove unused code
  cpufreq: pmac64: add of_node_put()
  cpufreq: powernv: add of_node_put()
  Documentation: intel_pstate: Clarify coordination of P-State limits
  cpufreq: intel_pstate: Force HWP min perf before offline
  cpufreq: s3c24xx: Change to use DEFINE_SHOW_ATTRIBUTE macro

* pm-cpufreq-sched:
  sched/cpufreq: Add the SPDX tags
2018-12-21 10:06:06 +01:00
David S. Miller
339bbff2d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-21

The following pull-request contains BPF updates for your *net-next* tree.

There is a merge conflict in test_verifier.c. Result looks as follows:

        [...]
        },
        {
                "calls: cross frame pruning",
                .insns = {
                [...]
                .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
                .errstr_unpriv = "function calls to other bpf functions are allowed for root only",
                .result_unpriv = REJECT,
                .errstr = "!read_ok",
                .result = REJECT,
	},
        {
                "jset: functional",
                .insns = {
        [...]
        {
                "jset: unknown const compare not taken",
                .insns = {
                        BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
                                     BPF_FUNC_get_prandom_u32),
                        BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1),
                        BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
                        BPF_EXIT_INSN(),
                },
                .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
                .errstr_unpriv = "!read_ok",
                .result_unpriv = REJECT,
                .errstr = "!read_ok",
                .result = REJECT,
        },
        [...]
        {
                "jset: range",
                .insns = {
                [...]
                },
                .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
                .result_unpriv = ACCEPT,
                .result = ACCEPT,
        },

The main changes are:

1) Various BTF related improvements in order to get line info
   working. Meaning, verifier will now annotate the corresponding
   BPF C code to the error log, from Martin and Yonghong.

2) Implement support for raw BPF tracepoints in modules, from Matt.

3) Add several improvements to verifier state logic, namely speeding
   up stacksafe check, optimizations for stack state equivalence
   test and safety checks for liveness analysis, from Alexei.

4) Teach verifier to make use of BPF_JSET instruction, add several
   test cases to kselftests and remove nfp specific JSET optimization
   now that verifier has awareness, from Jakub.

5) Improve BPF verifier's slot_type marking logic in order to
   allow more stack slot sharing, from Jiong.

6) Add sk_msg->size member for context access and add set of fixes
   and improvements to make sock_map with kTLS usable with openssl
   based applications, from John.

7) Several cleanups and documentation updates in bpftool as well as
   auto-mount of tracefs for "bpftool prog tracelog" command,
   from Quentin.

8) Include sub-program tags from now on in bpf_prog_info in order to
   have a reliable way for user space to get all tags of the program
   e.g. needed for kallsyms correlation, from Song.

9) Add BTF annotations for cgroup_local_storage BPF maps and
   implement bpf fs pretty print support, from Roman.

10) Fix bpftool in order to allow for cross-compilation, from Ivan.

11) Update of bpftool license to GPLv2-only + BSD-2-Clause in order
    to be compatible with libbfd and allow for Debian packaging,
    from Jakub.

12) Remove an obsolete prog->aux sanitation in dump and get rid of
    version check for prog load, from Daniel.

13) Fix a memory leak in libbpf's line info handling, from Prashant.

14) Fix cpumap's frame alignment for build_skb() so that skb_shared_info
    does not get unaligned, from Jesper.

15) Fix test_progs kselftest to work with older compilers which are less
    smart in optimizing (and thus throwing build error), from Stanislav.

16) Cleanup and simplify AF_XDP socket teardown, from Björn.

17) Fix sk lookup in BPF kselftest's test_sock_addr with regards
    to netns_id argument, from Andrey.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 17:31:36 -08:00
Jesper Dangaard Brouer
77ea5f4cbe bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't
The frame_size passed to build_skb must be aligned, else it is
possible that the embedded struct skb_shared_info gets unaligned.

For correctness make sure that xdpf->headroom in included in the
alignment. No upstream drivers can hit this, as all XDP drivers provide
an aligned headroom.  This was discovered when playing with implementing
XDP support for mvneta, which have a 2 bytes DSA header, and this
Marvell ARM64 platform didn't like doing atomic operations on an
unaligned skb_shinfo(skb)->dataref addresses.

Fixes: 1c601d829a ("bpf: cpumap xdp_buff to skb conversion and allocation")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-20 23:19:12 +01:00
David S. Miller
2be09de7d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of conflicts, by happily all cases of overlapping
changes, parallel adds, things of that nature.

Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 11:53:36 -08:00
Jakub Kicinski
9b38c4056b bpf: verifier: reorder stack size check with dead code sanitization
Reorder the calls to check_max_stack_depth() and sanitize_dead_code()
to separate functions which can rewrite instructions from pure checks.

No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-20 17:28:28 +01:00
Jakub Kicinski
960ea05656 bpf: verifier: teach the verifier to reason about the BPF_JSET instruction
Some JITs (nfp) try to optimize code on their own.  It could make
sense in case of BPF_JSET instruction which is currently not interpreted
by the verifier, meaning for instance that dead could would not be
detected if it was under BPF_JSET branch.

Teach the verifier basics of BPF_JSET, JIT optimizations will be
removed shortly.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-20 17:28:28 +01:00
Linus Torvalds
519be6995c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Off by one in netlink parsing of mac802154_hwsim, from Alexander
    Aring.

 2) nf_tables RCU usage fix from Taehee Yoo.

 3) Flow dissector needs nhoff and thoff clamping, from Stanislav
    Fomichev.

 4) Missing sin6_flowinfo initialization in SCTP, from Xin Long.

 5) Spectrev1 in ipmr and ip6mr, from Gustavo A. R. Silva.

 6) Fix r8169 crash when DEBUG_SHIRQ is enabled, from Heiner Kallweit.

 7) Fix SKB leak in rtlwifi, from Larry Finger.

 8) Fix state pruning in bpf verifier, from Jakub Kicinski.

 9) Don't handle completely duplicate fragments as overlapping, from
    Michal Kubecek.

10) Fix memory corruption with macb and 64-bit DMA, from Anssi Hannula.

11) Fix TCP fallback socket release in smc, from Myungho Jung.

12) gro_cells_destroy needs to napi_disable, from Lorenzo Bianconi.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (130 commits)
  rds: Fix warning.
  neighbor: NTF_PROXY is a valid ndm_flag for a dump request
  net: mvpp2: fix the phylink mode validation
  net/sched: cls_flower: Remove old entries from rhashtable
  net/tls: allocate tls context using GFP_ATOMIC
  iptunnel: make TUNNEL_FLAGS available in uapi
  gro_cell: add napi_disable in gro_cells_destroy
  lan743x: Remove MAC Reset from initialization
  net/mlx5e: Remove the false indication of software timestamping support
  net/mlx5: Typo fix in del_sw_hw_rule
  net/mlx5e: RX, Fix wrong early return in receive queue poll
  ipv6: explicitly initialize udp6_addr in udp_sock_create6()
  bnxt_en: Fix ethtool self-test loopback.
  net/rds: remove user triggered WARN_ON in rds_sendmsg
  net/rds: fix warn in rds_message_alloc_sgs
  ath10k: skip sending quiet mode cmd for WCN3990
  mac80211: free skb fraglist before freeing the skb
  nl80211: fix memory leak if validate_pae_over_nl80211() fails
  net/smc: fix TCP fallback socket release
  vxge: ensure data0 is initialized in when fetching firmware version information
  ...
2018-12-19 23:34:33 -08:00
Martin KaFai Lau
fdbaa0beb7 bpf: Ensure line_info.insn_off cannot point to insn with zero code
This patch rejects a line_info if the bpf insn code referred by
line_info.insn_off is 0. F.e. a broken userspace tool might generate
a line_info.insn_off that points to the second 8 bytes of a BPF_LD_IMM64.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-19 15:42:55 -08:00
Dou Liyang
c410abbbac genirq/affinity: Add is_managed to struct irq_affinity_desc
Devices which use managed interrupts usually have two classes of
interrupts:

  - Interrupts for multiple device queues
  - Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

 default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled. That limitation was reported by Kashyap and Sumit.

Expand struct irq_affinity_desc with a new bit 'is_managed' which is set
for truly managed interrupts (queue interrupts) and cleared for the general
device interrupts.

[ tglx: Simplify code and massage changelog ]

Reported-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reported-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Dou Liyang <douliyangs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: shivasharan.srikanteshwara@broadcom.com
Cc: ming.lei@redhat.com
Cc: hch@lst.de
Cc: bhelgaas@google.com
Cc: douliyang1@huawei.com
Link: https://lkml.kernel.org/r/20181204155122.6327-3-douliyangs@gmail.com
2018-12-19 11:32:08 +01:00
Dou Liyang
bec04037e4 genirq/core: Introduce struct irq_affinity_desc
The interrupt affinity management uses straight cpumask pointers to convey
the automatically assigned affinity masks for managed interrupts. The core
interrupt descriptor allocation also decides based on the pointer being non
NULL whether an interrupt is managed or not.

Devices which use managed interrupts usually have two classes of
interrupts:

  - Interrupts for multiple device queues
  - Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

 default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled.

To remedy that situation it's required to convey more information than the
cpumasks through various interfaces related to interrupt descriptor
allocation.

Instead of adding yet another argument, create a new data structure
'irq_affinity_desc' which for now just contains the cpumask. This struct
can be expanded to convey auxilliary information in the next step.

No functional change, just preparatory work.

[ tglx: Simplified logic and clarified changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Dou Liyang <douliyangs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: kashyap.desai@broadcom.com
Cc: shivasharan.srikanteshwara@broadcom.com
Cc: sumit.saxena@broadcom.com
Cc: ming.lei@redhat.com
Cc: hch@lst.de
Cc: douliyang1@huawei.com
Link: https://lkml.kernel.org/r/20181204155122.6327-2-douliyangs@gmail.com
2018-12-19 11:32:08 +01:00
Thomas Gleixner
c2899c3470 genirq/affinity: Remove excess indentation
Plus other coding style issues which stood out while staring at that code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-12-19 11:32:07 +01:00
Yonghong Song
76c43ae84e bpf: log struct/union attribute for forward type
Current btf internal verbose logger logs fwd type as
  [2] FWD A type_id=0
where A is the type name.

Commit 9d5f9f701b ("bpf: btf: fix struct/union/fwd types
with kind_flag") introduced kind_flag which can be used
to distinguish whether a forward type is a struct or
union.

Also, "type_id=0" does not carry any meaningful
information for fwd type as btf_type.type = 0 is simply
enforced during btf verification and is not used
anywhere else.

This commit changed the log to
  [2] FWD A struct
if kind_flag = 0, or
  [2] FWD A union
if kind_flag = 1.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-19 00:47:56 +01:00
Jiong Wang
0bae2d4d62 bpf: correct slot_type marking logic to allow more stack slot sharing
Verifier is supposed to support sharing stack slot allocated to ptr with
SCALAR_VALUE for privileged program. However this doesn't happen for some
cases.

The reason is verifier is not clearing slot_type STACK_SPILL for all bytes,
it only clears part of them, while verifier is using:

  slot_type[0] == STACK_SPILL

as a convention to check one slot is ptr type.

So, the consequence of partial clearing slot_type is verifier could treat a
partially overridden ptr slot, which should now be a SCALAR_VALUE slot,
still as ptr slot, and rejects some valid programs.

Before this patch, test_xdp_noinline.o under bpf selftests, bpf_lxc.o and
bpf_netdev.o under Cilium bpf repo, when built with -mattr=+alu32 are
rejected due to this issue. After this patch, they all accepted.

There is no processed insn number change before and after this patch on
Cilium bpf programs.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-18 14:45:01 -08:00
Thomas Gleixner
da791a6675 futex: Cure exit race
Stefan reported, that the glibc tst-robustpi4 test case fails
occasionally. That case creates the following race between
sys_exit() and sys_futex_lock_pi():

 CPU0				CPU1

 sys_exit()			sys_futex()
  do_exit()			 futex_lock_pi()
   exit_signals(tsk)		  No waiters:
    tsk->flags |= PF_EXITING;	  *uaddr == 0x00000PID
  mm_release(tsk)		  Set waiter bit
   exit_robust_list(tsk) {	  *uaddr = 0x80000PID;
      Set owner died		  attach_to_pi_owner() {
    *uaddr = 0xC0000000;	   tsk = get_task(PID);
   }				   if (!tsk->flags & PF_EXITING) {
  ...				     attach();
  tsk->flags |= PF_EXITPIDONE;	   } else {
				     if (!(tsk->flags & PF_EXITPIDONE))
				       return -EAGAIN;
				     return -ESRCH; <--- FAIL
				   }

ESRCH is returned all the way to user space, which triggers the glibc test
case assert. Returning ESRCH unconditionally is wrong here because the user
space value has been changed by the exiting task to 0xC0000000, i.e. the
FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This
is a valid state and the kernel has to handle it, i.e. taking the futex.

Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE
is set in the task which 'owns' the futex. If the value has changed, let
the kernel retry the operation, which includes all regular sanity checks
and correctly handles the FUTEX_OWNER_DIED case.

If it hasn't changed, then return ESRCH as there is no way to distinguish
this case from malfunctioning user space. This happens when the exiting
task did not have a robust list, the robust list was corrupted or the user
space value in the futex was simply bogus.

Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467
Link: https://lkml.kernel.org/r/20181210152311.986181245@linutronix.de
2018-12-18 23:13:15 +01:00
Matt Mullins
a38d1107f9 bpf: support raw tracepoints in modules
Distributions build drivers as modules, including network and filesystem
drivers which export numerous tracepoints.  This enables
bpf(BPF_RAW_TRACEPOINT_OPEN) to attach to those tracepoints.

Signed-off-by: Matt Mullins <mmullins@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-18 14:08:12 -08:00
Thomas Gleixner
ff3730a497 irqchip updates for 4.21
- A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)
 - Updates for new (and old) platforms (i.MX8MQ, F1C100s)
 - A number of SPDX cleanups
 - A workaround for a very broken GICv3 implementation
 - A platform-msi fix
 - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlwZI8cVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDyokP+gKoKbZMc1E7dX6WxUrKh2N+fMJF
 uVbuGF2s57CLG955YNuyo8BK4meWJIHGO3JahwE8I/9eu0G7PaudYvpZgP7s/sxD
 XHLWFVHB1mq4lExMcluT0jG4ZpX7EKvYB1KGqgYM1ScOS9Uubb4ZG9T5GPhUT/YM
 w1BAtHaZmCAg8d0wNPUMaAFc9Bd2B9Z1C8nwS+wpdJRxYxE9x8BES42r95rbXCG6
 5Cq2ol/NbF4RbFodel4YdiAIKfrQtXyQ3N3twC5GRXln4XLjUfzs4mA5rxLLoeGZ
 2UGXeIk0GcokSWF/e+0p3tQDWKwdbqoBhbRbqk7u5ZWuEWTRf4Zot3IlCVpJAMM3
 iRw5XChWxovC+/oqgin4sp1gNpSRgf5mMvR1EauR5DTVtwlOjUBKaPEyKLrPITOo
 B42EJugJ94J0YVdT9RUJsOSXIdOiYFE6I9F4i/XioLYq5FItBB56/81ARZgEncpg
 FEdtseCCtRC3WWGzghxZsSzCW3iGi8wdddRdZmOXCNdPtH03TZg0dGPS+KIn8Soh
 eVSGImV/4efN6hh6fSryeR02fYT3DKGgDQUiV4e/1SOSzxy6VjjrOh48tB8qn/M7
 NbFZMqDKnltsXT2C+bh6zjhorbVCkj8AEtx1oF0d7iIyBxor3eHUelTz6VglNlLq
 RFetH+Yjh9nt9ReO
 =1Mk9
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

 - A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)
 - Updates for new (and old) platforms (i.MX8MQ, F1C100s)
 - A number of SPDX cleanups
 - A workaround for a very broken GICv3 implementation
 - A platform-msi fix
 - Various cleanups
2018-12-18 18:37:27 +01:00
Ingo Molnar
c5f48c0a7a genirq: Fix various typos in comments
Go over the IRQ subsystem source code (including irqchip drivers) and
fix common typos in comments.

No change in functionality intended.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
2018-12-18 14:22:28 +01:00
YueHaibing
07daef8b41 ntp: Remove duplicated include
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <john.stultz@linaro.org>
Cc: <sboyd@kernel.org>
Link: https://lkml.kernel.org/r/20181209062225.4344-1-yuehaibing@huawei.com
2018-12-18 12:59:33 +01:00
Yonghong Song
ffa0c1cf59 bpf: enable cgroup local storage map pretty print with kind_flag
Commit 970289fc0a83 ("bpf: add bpffs pretty print for cgroup
local storage maps") added bpffs pretty print for cgroup
local storage maps. The commit worked for struct without kind_flag
set.

This patch refactored and made pretty print also work
with kind_flag set for the struct.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-18 01:11:59 +01:00
Yonghong Song
9d5f9f701b bpf: btf: fix struct/union/fwd types with kind_flag
This patch fixed two issues with BTF. One is related to
struct/union bitfield encoding and the other is related to
forward type.

Issue #1 and solution:

======================

Current btf encoding of bitfield follows what pahole generates.
For each bitfield, pahole will duplicate the type chain and
put the bitfield size at the final int or enum type.
Since the BTF enum type cannot encode bit size,
pahole workarounds the issue by generating
an int type whenever the enum bit size is not 32.

For example,
  -bash-4.4$ cat t.c
  typedef int ___int;
  enum A { A1, A2, A3 };
  struct t {
    int a[5];
    ___int b:4;
    volatile enum A c:4;
  } g;
  -bash-4.4$ gcc -c -O2 -g t.c
The current kernel supports the following BTF encoding:
  $ pahole -JV t.o
  [1] TYPEDEF ___int type_id=2
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [4] STRUCT t size=24 vlen=3
        a type_id=5 bits_offset=0
        b type_id=9 bits_offset=160
        c type_id=11 bits_offset=164
  [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
  [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
  [7] VOLATILE (anon) type_id=3
  [8] INT int size=1 bit_offset=0 nr_bits=4 encoding=(none)
  [9] TYPEDEF ___int type_id=8
  [10] INT (anon) size=1 bit_offset=0 nr_bits=4 encoding=SIGNED
  [11] VOLATILE (anon) type_id=10

Two issues are in the above:
  . by changing enum type to int, we lost the original
    type information and this will not be ideal later
    when we try to convert BTF to a header file.
  . the type duplication for bitfields will cause
    BTF bloat. Duplicated types cannot be deduplicated
    later if the bitfield size is different.

To fix this issue, this patch implemented a compatible
change for BTF struct type encoding:
  . the bit 31 of struct_type->info, previously reserved,
    now is used to indicate whether bitfield_size is
    encoded in btf_member or not.
  . if bit 31 of struct_type->info is set,
    btf_member->offset will encode like:
      bit 0 - 23: bit offset
      bit 24 - 31: bitfield size
    if bit 31 is not set, the old behavior is preserved:
      bit 0 - 31: bit offset

So if the struct contains a bit field, the maximum bit offset
will be reduced to (2^24 - 1) instead of MAX_UINT. The maximum
bitfield size will be 256 which is enough for today as maximum
bitfield in compiler can be 128 where int128 type is supported.

This kernel patch intends to support the new BTF encoding:
  $ pahole -JV t.o
  [1] TYPEDEF ___int type_id=2
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [4] STRUCT t kind_flag=1 size=24 vlen=3
        a type_id=5 bitfield_size=0 bits_offset=0
        b type_id=1 bitfield_size=4 bits_offset=160
        c type_id=7 bitfield_size=4 bits_offset=164
  [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
  [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
  [7] VOLATILE (anon) type_id=3

Issue #2 and solution:
======================

Current forward type in BTF does not specify whether the original
type is struct or union. This will not work for type pretty print
and BTF-to-header-file conversion as struct/union must be specified.
  $ cat tt.c
  struct t;
  union u;
  int foo(struct t *t, union u *u) { return 0; }
  $ gcc -c -g -O2 tt.c
  $ pahole -JV tt.o
  [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [2] FWD t type_id=0
  [3] PTR (anon) type_id=2
  [4] FWD u type_id=0
  [5] PTR (anon) type_id=4

To fix this issue, similar to issue #1, type->info bit 31
is used. If the bit is set, it is union type. Otherwise, it is
a struct type.

  $ pahole -JV tt.o
  [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [2] FWD t kind_flag=0 type_id=0
  [3] PTR (anon) kind_flag=0 type_id=2
  [4] FWD u kind_flag=1 type_id=0
  [5] PTR (anon) kind_flag=0 type_id=4

Pahole/LLVM change:
===================

The new kind_flag functionality has been implemented in pahole
and llvm:
  https://github.com/yonghong-song/pahole/tree/bitfield
  https://github.com/yonghong-song/llvm/tree/bitfield

Note that pahole hasn't implemented func/func_proto kind
and .BTF.ext. So to print function signature with bpftool,
the llvm compiler should be used.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-18 01:11:59 +01:00
Yonghong Song
f97be3ab04 bpf: btf: refactor btf_int_bits_seq_show()
Refactor function btf_int_bits_seq_show() by creating
function btf_bitfield_seq_show() which has no dependence
on btf and btf_type. The function btf_bitfield_seq_show()
will be in later patch to directly dump bitfield member values.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-18 01:11:59 +01:00
Daniel Borkmann
6c4fc209fc bpf: remove useless version check for prog load
Existing libraries and tracing frameworks work around this kernel
version check by automatically deriving the kernel version from
uname(3) or similar such that the user does not need to do it
manually; these workarounds also make the version check useless
at the same time.

Moreover, most other BPF tracing types enabling bpf_probe_read()-like
functionality have /not/ adapted this check, and in general these
days it is well understood anyway that all the tracing programs are
not stable with regards to future kernels as kernel internal data
structures are subject to change from release to release.

Back at last netconf we discussed [0] and agreed to remove this
check from bpf_prog_load() and instead document it here in the uapi
header that there is no such guarantee for stable API for these
programs.

  [0] http://vger.kernel.org/netconf2018_files/DanielBorkmann_netconf2018.pdf

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-17 13:41:35 -08:00
Lendacky, Thomas
c92a54cfa0 dma-direct: do not include SME mask in the DMA supported check
The dma_direct_supported() function intends to check the DMA mask against
specific values. However, the phys_to_dma() function includes the SME
encryption mask, which defeats the intended purpose of the check. This
results in drivers that support less than 48-bit DMA (SME encryption mask
is bit 47) from being able to set the DMA mask successfully when SME is
active, which results in the driver failing to initialize.

Change the function used to check the mask from phys_to_dma() to
__phys_to_dma() so that the SME encryption mask is not part of the check.

Fixes: c1d0af1a1d ("kernel/dma/direct: take DMA offset into account in dma_direct_supported")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-17 18:02:11 +01:00
Masami Hiramatsu
fb1a59fae8 kprobes: Blacklist symbols in arch-defined prohibited area
Blacklist symbols in arch-defined probe-prohibited areas.
With this change, user can see all symbols which are prohibited
to probe in debugfs.

All archtectures which have custom prohibit areas should define
its own arch_populate_kprobe_blacklist() function, but unless that,
all symbols marked __kprobes are blacklisted.

Reported-by: Andrea Righi <righi.andrea@gmail.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lkml.kernel.org/r/154503485491.26176.15823229545155174796.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-17 17:48:38 +01:00
Ingo Molnar
76aea1eeb9 Linux 4.20-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlwW4/oeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG2QMH/Rl6iMpTUX23tMHe
 eXQzAOSvQXaWlFoX25j1Jvt8nhS7Uy8vkdpYTCOI/7DF0Jg4O/6uxcZkErlwWxb8
 MW1rMgpfO+OpDLSLXAO2GKxaKI3ArqF2BcOQA2mji1/jR2VUTqmIvBoudn5d+GYz
 19aCyfdzmVTC38G9sBhhcqJ10EkxLiHe2K74bf4JxVuSf2EnTI4LYt5xJPDoT0/C
 6fOeUNwVhvv5a4svvzJmortq7x7BwyxBQArc7PbO0MPhabLU4wyFUOTRszgsGd76
 o5JuOFwgdIIHlSSacGla6rKq10nmkwR07fHfRFFwbvrfBOEHsXOP2hvzMZX+FLBK
 IXOzdtc=
 =XlMc
 -----END PGP SIGNATURE-----

Merge tag 'v4.20-rc7' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-17 17:46:26 +01:00
Thomas Gleixner
0e334db6bb posix-timers: Fix division by zero bug
The signal delivery path of posix-timers can try to rearm the timer even if
the interval is zero. That's handled for the common case (hrtimer) but not
for alarm timers. In that case the forwarding function raises a division by
zero exception.

The handling for hrtimer based posix timers is wrong because it marks the
timer as active despite the fact that it is stopped.

Move the check from common_hrtimer_rearm() to posixtimer_rearm() to cure
both issues.

Reported-by: syzbot+9d38bedac9cc77b8ad5e@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: sboyd@kernel.org
Cc: stable@vger.kernel.org
Cc: syzkaller-bugs@googlegroups.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1812171328050.1880@nanos.tec.linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-17 17:35:45 +01:00
David S. Miller
10589a568f Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2018-12-15

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) fix liveness propagation of callee saved registers, from Jakub.

2) fix overflow in bpf_jit_limit knob, from Daniel.

3) bpf_flow_dissector api fix, from Stanislav.

4) bpf_perf_event api fix on powerpc, from Sandipan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-15 10:58:32 -08:00
Alexei Starovoitov
9242b5f561 bpf: add self-check logic to liveness analysis
Introduce REG_LIVE_DONE to check the liveness propagation
and prepare the states for merging.
See algorithm description in clean_live_states().

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-15 01:28:32 +01:00
Alexei Starovoitov
19e2dbb7dd bpf: improve stacksafe state comparison
"if (old->allocated_stack > cur->allocated_stack)" check is too conservative.
In some cases explored stack could have allocated more space,
but that stack space was not live.
The test case improves from 19 to 15 processed insns
and improvement on real programs is significant as well:

                       before    after
bpf_lb-DLB_L3.o        1940      1831
bpf_lb-DLB_L4.o        3089      3029
bpf_lb-DUNKNOWN.o      1065      1064
bpf_lxc-DDROP_ALL.o    28052     26309
bpf_lxc-DUNKNOWN.o     35487     33517
bpf_netdev.o           10864     9713
bpf_overlay.o          6643      6184
bpf_lcx_jit.o          38437     37335

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Edward Cree <ecree@solarflare.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-15 01:28:32 +01:00
Alexei Starovoitov
b233920c97 bpf: speed up stacksafe check
Don't check the same stack liveness condition 8 times.
once is enough.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Edward Cree <ecree@solarflare.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-15 01:28:32 +01:00