Commit Graph

549267 Commits

Author SHA1 Message Date
Vineet Gupta
25d464183c ARC: mm: PAE40: tlbex.S: Explicitify the size of pte_t
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:29 +05:30
Vineet Gupta
28b4af729f ARC: mm: PAE40: switch to using phys_addr_t for physical addresses
That way a single flip of phys_addr_t to 64 bit ensures all places
dealing with physical addresses get correct data

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:29 +05:30
Vineet Gupta
29e332261d ARC: mm: HIGHMEM: populate high memory from DT
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:26 +05:30
Jiri Olsa
443f8c75e8 perf symbols: Fix endless loop in dso__split_kallsyms_for_kcore
Currently we split symbols based on the map comparison, but symbols are stored
within dso objects and maps could point into same dso objects (kernel maps).

Hence we could end up changing rbtree we are currently iterating and mess it
up. It's easily reproduced on s390x by running:

  $ perf record -a -- sleep 3
  $ perf buildid-list -i perf.data --with-hits

The fix is to compare dso objects instead.

Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151026135130.GA26003@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28 11:19:30 -03:00
Wang Nan
374ce938aa perf tools: Enable pre-event inherit setting by config terms
This patch allows perf record setting event's attr.inherit bit by
config terms like:

  # perf record -e cycles/no-inherit/ ...
  # perf record -e cycles/inherit/ ...

So user can control inherit bit for each event separately.

In following example, a.out fork()s in main then do some complex
CPU intensive computations in both of its children.

Basic result with and without inherit:

  # perf record -e cycles -e instructions ./a.out
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
  # perf report --stdio
  # ...
  # Samples: 23K of event 'cycles'
  # Event count (approx.): 23641752891
  ...
  # Samples: 24K of event 'instructions'
  # Event count (approx.): 30428312415

  # perf record -i -e cycles -e instructions ./a.out
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
  ...
  # Samples: 12K of event 'cycles'
  # Event count (approx.): 11699501775
  ...
  # Samples: 12K of event 'instructions'
  # Event count (approx.): 15058023559

Cancel inherit for one event when globally enable:

  # perf record -e cycles/no-inherit/ -e instructions ./a.out
  [ perf record: Woken up 7 times to write data ]
  [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
  ...
  # Samples: 12K of event 'cycles/no-inherit/'
  # Event count (approx.): 11895759282
 ...
  # Samples: 24K of event 'instructions'
  # Event count (approx.): 30668000441

Enable inherit for one event when globally disable:

  # perf record -i -e cycles/inherit/ -e instructions ./a.out
  [ perf record: Woken up 7 times to write data ]
  [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
  ...
  # Samples: 23K of event 'cycles/inherit/'
  # Event count (approx.): 23285400229
  ...
  # Samples: 11K of event 'instructions'
  # Event count (approx.): 14969050259

Committer note:

One can check if the bit was set, in addition to seeing the result in
the perf.data file size as above by doing one of:

  # perf record -e cycles -e instructions -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
  #

So, the inherit bit was set in both, now, if we disable it globally using
--no-inherit:

  # perf record --no-inherit -e cycles -e instructions -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1

No inherit bit set, then disabling it and setting just on the cycles event:

  # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
  # perf evlist -v
  cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
  #

We can see it as well in by using a more verbose level of debug messages in
the tool that sets up the perf_event_attr, 'perf record' in this case:

  [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    task                             1
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8
  sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8
  sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    config                           0x1
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8

<SNIP>

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
[ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28 11:19:16 -03:00
Vineet Gupta
45890f6d34 ARC: mm: HIGHMEM: kmap API implementation
Implement kmap* API for ARC.

This enables
 - permanent kernel maps (pkmaps): :kmap() API
 - fixmap : kmap_atomic()

We use a very simple/uniform approach for both (unlike some of the other
arches). So fixmap doesn't use the customary compile time address stuff.
The important semantic is sleep'ability (pkmap) vs. not (fixmap) which
the API guarantees.

Note that this patch only enables highmem for subsequent PAE40 support
as there is no real highmem for ARC in pure 32-bit paradigm as explained
below.

ARC has 2:2 address split of the 32-bit address space with lower half
being translated (virtual) while upper half unstranslated
(0x8000_0000 to 0xFFFF_FFFF). kernel itself is linked at base of
unstranslated space (i.e. 0x8000_0000 onwards), which is mapped to say
DDR 0x0 by external Bus Glue logic (outside the core). So kernel can
potentially access 1.75G worth of memory directly w/o need for highmem.
(the top 256M is taken by uncached peripheral space from 0xF000_0000 to
0xFFFF_FFFF)

In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while
the logical/virtual addresses remain 32-bits. Thus highmem is required
for kernel proper to be able to access these pages for it's own purposes
(user space is agnostic to this anyways).

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:49:04 +05:30
Vineet Gupta
6101be5ad4 ARC: mm: preps ahead of HIGHMEM support #2
Explicit'ify that all memory added so far is low memory
Nothing semantical

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:49:00 +05:30
Vineet Gupta
336e2136e1 ARC: mm: preps ahead of HIGHMEM support
Before we plug in highmem support, some of code needs to be ready for it
 - copy_user_highpage() needs to be using the kmap_atomic API
 - mk_pte() can't assume page_address()
 - do_page_fault() can't assume VMALLOC_END is end of kernel vaddr space

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Alexey Brodkin
d40846457f ARC: mm: use generic macros _BITUL()/_AC()
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Vineet Gupta
8840e14cd8 ARC: mm: Improve Duplicate PD Fault handler
- Move the verbosity knob from .data to .bss by using inverted logic
 - No need to readout PD1 descriptor
 - clip the non pfn bits of PD0 to avoid clipping inside the loop

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:04 +05:30
Dima Kogan
5baecbcd9c perf symbols: we can now read separate debug-info files based on a build ID
Recent GDB (at least on a vanilla Debian box) looks for debug information in

  /usr/lib/debug/.build-id/nn/nnnnnnn

where nn/nnnnnn is the build-id of the stripped ELF binary. This is
documented here:

  https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

This was not working in perf because we didn't read the build id until
AFTER we searched for the separate debug information file. This patch
reads the build ID and THEN does the search.

Signed-off-by: Dima Kogan <dima@secretsauce.net>
Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28 10:04:27 -03:00
Dima Kogan
f2f3096888 perf symbols: Fix type error when reading a build-id
This was benign, but wrong. The build-id should live in a char[], not a char*[]

Signed-off-by: Dima Kogan <dima@secretsauce.net>
Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28 10:02:00 -03:00
Tejun Heo
b33e18f61b fs/writeback, rcu: Don't use list_entry_rcu() for pointer offsetting in bdi_split_work_to_wbs()
bdi_split_work_to_wbs() uses list_for_each_entry_rcu_continue()
to walk @bdi->wb_list.  To set up the initial iteration
condition, it uses list_entry_rcu() to calculate the entry
pointer corresponding to the list head; however, this isn't an
actual RCU dereference and using list_entry_rcu() for it ended
up breaking a proposed list_entry_rcu() change because it was
feeding an non-lvalue pointer into the macro.

Don't use the RCU variant for simple pointer offsetting.  Use
list_entry() instead.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Patrick Marlier <patrick.marlier@gmail.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pranith kumar <bobby.prani@gmail.com>
Link: http://lkml.kernel.org/r/20151027051939.GA19355@mtj.duckdns.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28 13:17:30 +01:00
Ingo Molnar
e4340bbb07 Merge branch 'linus' into core/rcu, to fix up a semantic conflict
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28 13:17:20 +01:00
Taku Izumi
78b9bc947b efi: Fix warning of int-to-pointer-cast on x86 32-bit builds
Commit:

  0f96a99dab ("efi: Add "efi_fake_mem" boot option")

introduced the following warning message:

  drivers/firmware/efi/fake_mem.c:186:20: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

new_memmap_phy was defined as a u64 value and cast to void*,
causing a int-to-pointer-cast warning on x86 32-bit builds.
However, since the void* type is inappropriate for a physical
address, the definition of struct efi_memory_map::phys_map has
been changed to phys_addr_t in the previous patch, and so the
cast can be dropped entirely.

This patch also changes the type of the "new_memmap_phy"
variable from "u64" to "phys_addr_t" to align with the types of
memblock_alloc() and struct efi_memory_map::phys_map.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
[ Removed void* cast, updated commit log]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: linux-efi@vger.kernel.org
Cc: matt.fleming@intel.com
Link: http://lkml.kernel.org/r/1445593697-1342-2-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28 12:28:06 +01:00
Ard Biesheuvel
44511fb9e5 efi: Use correct type for struct efi_memory_map::phys_map
We have been getting away with using a void* for the physical
address of the UEFI memory map, since, even on 32-bit platforms
with 64-bit physical addresses, no truncation takes place if the
memory map has been allocated by the firmware (which only uses
1:1 virtually addressable memory), which is usually the case.

However, commit:

  0f96a99dab ("efi: Add "efi_fake_mem" boot option")

adds code that clones and modifies the UEFI memory map, and the
clone may live above 4 GB on 32-bit platforms.

This means our use of void* for struct efi_memory_map::phys_map has
graduated from 'incorrect but working' to 'incorrect and
broken', and we need to fix it.

So redefine struct efi_memory_map::phys_map as phys_addr_t, and
get rid of a bunch of casts that are now unneeded.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: izumi.taku@jp.fujitsu.com
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: linux-efi@vger.kernel.org
Cc: matt.fleming@intel.com
Link: http://lkml.kernel.org/r/1445593697-1342-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28 12:28:06 +01:00
Vineet Gupta
9acdc911b5 MAINTAINERS: Add public mailing list for ARC
Cc: <stable@vger.kernel.org> #3.9+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:43 +05:30
Vineet Gupta
f759ee57b2 ARC: Ensure DT mem base is same as what kernel is built with
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:42 +05:30
Vineet Gupta
483bcc99c0 ARC: boot: Non Master cpus only need to call EARLY_CPU_SETUP once
With prev fixes, all cores now start via common entry point @stext which
already calls EARLY_CPU_SETUP for all cores - so no need to invoke it
again

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:42 +05:30
Vineet Gupta
aa0efcde45 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_smp()
MCIP now registers it's own per cpu setup routine (for IPI IRQ request)
using smp_ops.init_irq_cpu().

So no need for platforms to do that. This now completely decouples
platforms from MCIP.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta
286130ebf1 ARC: smp: Introduce smp hook @init_irq_cpu called for all cores
Note this is not part of platform owned static machine_desc,
but more of device owned plat_smp_ops (rather misnamed) which a IPI
provider or some such typically defines.

This will help us seperate out the IPI registration from platform
specific init_cpu_smp() into device specific init_irq_cpu()

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta
8721a7f5a6 ARC: smp: Rename platform hook @init_smp -> @init_cpu_smp
This conveys better that it is called for each cpu

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta
26b8f99623 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_early_smp()
MCIP now registers it's own probe callback with smp_ops.init_early_smp()
which is called by ARC common code, so no need for platforms to do that.

This decouples the platforms and MCIP and helps confine MCIP details
to it's own file.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta
e55af4da02 ARC: smp: Introduce smp hook @init_early_smp for Master core
This adds a platform agnostic early SMP init hook which is called on
Master core before calling setup_processor()

  setup_arch()
     smp_init_cpus()
         smp_ops.init_early_smp()
     ...
     setup_processor()

How this helps:
 - Used for one time init of certain SMP centric IP blocks, before
   calling setup_processor() which probes various bits of core,
   possibly including this block

 - Currently platforms need to call this IP block init from their
   init routines, which doesn't make sense as this is specific to ARC
   core and not platform and otherwise requires copy/paste in all
   (and hence a possible point of failure)

e.g. MCIP init is called from 2 platforms currently (axs10x and sim)
which will go away once we have this.

This change only adds the hooks but they are empty for now. Next commit
will populate them and remove the explicit init calls from platforms.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta
4c82f28617 ARC: remove @init_time, @init_irq platform callbacks
These are not in use for ARC platforms. Moreover DT mechanims exist to
probe them w/o explicit platform calls.

 - clocksource drivers can use CLOCKSOURCE_OF_DECLARE()
 - intc IRQCHIP_DECLARE() calls + cascading inside DT allows external
   intc to be probed automatically

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta
e0868e6f67 ARC: smp: irqchip: handle IPI as percpu irq like timer
The reason this was not done so far was lack of genuine IPI_IRQ for
ARC700, as we don't have a SMP version of core yet (which might change
soon thx to EZChip). Nevertheles to increase the build coverage, we
need to allow CONFIG_SMP for ARC700 and still be able to run it on a
UP platform (nsim or AXS101) with a UP Device Tree (SMP-on-UP)

The build itself requires some define for IPI_IRQ and even a dummy
value is fine since that code won't run anyways.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta
3971cdc202 ARC: boot: Support Halt-on-reset and Run-on-reset SMP booting modes
For Run-on-reset, non masters need to spin wait. For Halt-on-reset they
can jump to entry point directly.

Also while at it, made reset vector handler as "the" entry point for
kernel including host debugger based boot (which uses the ELF header
entry point)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:08:17 +05:30
Linus Torvalds
8a28d67457 powerpc fixes for 4.3 #5
- powerpc/dma: dma_set_coherent_mask() should not be GPL only from Ben
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWMFthAAoJEFHr6jzI4aWAZFkP/1Ci8cgh0GIfdN3wafi85pIe
 dhSBicWpHPpwnvcaD+ab/Q73/nfub6NKTpMasBankf53SYUWmPptwu+Gs0lv7GVv
 Q1AjOEP0vhcBVxhXNJpOWDGFyt5wez4BevqWpmPuz9MEy7PCpgevQJjPEsG8qnzh
 BfUvDkWAOjA6jX3mTyZQyefZECGPhDsM31eGYvkFQRB6Ui1KKbYbFMncjjTtBfjW
 fiUnKShJZIcVtpImGg6WQRKjRaT7uJzmFOLUclv/7dDoT8hLztiiLP44zX+kvJtF
 AyCBSQI/LOrQE826YpF93Vq6WZVvzNQZaPIHPC2pj1WIF6Q/gTTng5T16jODX+QI
 ldQ96sLr9WA49UaS6JakA7iv/OhRXUlRO6qyow8I+1XQdoY5yuVyXct9NpQ3A9J6
 lOw7d8o4RuMaGqMx5nUcyDrv/NICYuuyaikDe+8GY4Df6FG6Aepit1WO6BcDkrfC
 FrnQ+XROmvNoKbVRYYhJZRhEqz1PJJwOmCtNYNzjiJXfgKNknYPBfoBXqx8XeDxl
 VTyxer8V/wMP50X+SIg558YqNIXMj9ZA7fU5xnN4Y7MbHC5DEBnVX9nmc0J5PjTF
 t36S+cm1BeG44/LTSrH66DwsnWce0xo0doK7wBx+CKSGXVciu1VPn59upL1UtX5z
 WnyEnMm5NYc5aTGfZf+L
 =sFFh
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 - powerpc/dma: dma_set_coherent_mask() should not be GPL only from Ben

* tag 'powerpc-4.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/dma: dma_set_coherent_mask() should not be GPL only
2015-10-28 18:59:53 +09:00
Benjamin Herrenschmidt
977bf062bb powerpc/dma: dma_set_coherent_mask() should not be GPL only
When turning this from inline to an exported function I was a bit
over-eager and made it GPL only. This prevents the use of pretty much
all non-GPL PCI driver which is a bit over the top. Let's bring it
back in line with other architecture.

Fixes: 817820b022 ("powerpc/iommu: Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-28 14:20:50 +09:00
David S. Miller
e18f6ac30d Merge branch 'mlx4-fixes'
Or Gerlitz says:

====================
Mellanox mlx4 driver fixes for 4.3-rc7

Jack's fix is for a regression introduced in 4.3-rc1

Carol's fix addresses an issue which exists for while and
turns to beat us hard on PPC, please queue for -stable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 20:27:45 -07:00
Carol L Soto
c02b05011f net/mlx4: Copy/set only sizeof struct mlx4_eqe bytes
When doing memcpy/memset of EQEs, we should use sizeof struct
mlx4_eqe as the base size and not caps.eqe_size which could be bigger.

If caps.eqe_size is bigger than the struct mlx4_eqe then we corrupt
data in the master context.

When using a 64 byte stride, the memcpy copied over 63 bytes to the
slave_eq structure.  This resulted in copying over the entire eqe of
interest, including its ownership bit -- and also 31 bytes of garbage
into the next WQE in the slave EQ -- which did NOT include the ownership
bit (and therefore had no impact).

However, once the stride is increased to 128, we are overwriting the
ownership bits of *three* eqes in the slave_eq struct.  This results
in an incorrect ownership bit for those eqes, which causes the eq to
seem to be full. The issue therefore surfaced only once 128-byte EQEs
started being used in SRIOV and (overarchitectures that have 128/256
byte cache-lines such as PPC) - e.g after commit 77507aa249
"net/mlx4_core: Enable CQE/EQE stride support".

Fixes: 08ff32352d ('mlx4: 64-byte CQE/EQE support')
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 20:27:11 -07:00
Jack Morgenstein
092bf0fc80 net/mlx4_en: Explicitly set no vlan tags in WQE ctrl segment when no vlan is present
We do not set the ins_vlan field to zero when no vlan id is present in the packet.

Since WQEs in the TX ring are not zeroed out between uses, this oversight
could result in having vlan flags present in the WQE ctrl segment when no
vlan is preset.

Fixes: e38af4faf0 ('net/mlx4_en: Add support for hardware accelerated 802.1ad vlan')
Reported-by: Gideon Naim <gideonn@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 20:27:09 -07:00
Michael S. Tsirkin
e407f39afd vhost: fix performance on LE hosts
commit 2751c9882b ("vhost: cross-endian
support for legacy devices") introduced a minor regression: even with
cross-endian disabled, and even on LE host, vhost_is_little_endian is
checking is_le flag so there's always a branch.

To fix, simply check virtio_legacy_is_little_endian first.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 20:17:03 -07:00
Yang Shi
85ff8a43f3 bpf: sample: define aarch64 specific registers
Define aarch64 specific registers for building bpf samples correctly.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:51:10 -07:00
Lendacky, Thomas
20986ed826 amd-xgbe: Fix race between access of desc and desc index
During Tx cleanup it's still possible for the descriptor data to be
read ahead of the descriptor index. A memory barrier is required between
the read of the descriptor index and the start of the Tx cleanup loop.
This allows a change to a lighter-weight barrier in the Tx transmit
routine just before updating the current descriptor index.

Since the memory barrier does result in extra overhead on arm64, keep
the previous change to not chase the current descriptor value. This
prevents the execution of the barrier for each loop performed.

Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:49:22 -07:00
Sowmini Varadhan
8ce675ff39 RDS-TCP: Recover correctly from pskb_pull()/pksb_trim() failure in rds_tcp_data_recv
Either of pskb_pull() or pskb_trim() may fail under low memory conditions.
If rds_tcp_data_recv() ignores such failures, the application will
receive corrupted data because the skb has not been correctly
carved to the RDS datagram size.

Avoid this by handling pskb_pull/pskb_trim failure in the same
manner as the skb_clone failure: bail out of rds_tcp_data_recv(), and
retry via the deferred call to rds_send_worker() that gets set up on
ENOMEM from rds_tcp_read_sock()

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:46:34 -07:00
Neil Horman
0b7c874348 forcedeth: fix unilateral interrupt disabling in netpoll path
Forcedeth currently uses disable_irq_lockdep and enable_irq_lockdep, which in
some configurations simply calls local_irq_disable.  This causes errant warnings
in the netpoll path as in netpoll_send_skb_on_dev, where we disable irqs using
local_irq_save, leading to the following warning:

WARNING: at net/core/netpoll.c:352 netpoll_send_skb_on_dev+0x243/0x250() (Not
tainted)
Hardware name:
netpoll_send_skb_on_dev(): eth0 enabled interrupts in poll
(nv_start_xmit_optimized+0x0/0x860 [forcedeth])
Modules linked in: netconsole(+) configfs ipv6 iptable_filter ip_tables ppdev
parport_pc parport sg microcode serio_raw edac_core edac_mce_amd k8temp
snd_hda_codec_realtek snd_hda_codec_generic forcedeth snd_hda_intel
snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore
snd_page_alloc i2c_nforce2 i2c_core shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod
crc_t10dif pata_amd ata_generic pata_acpi sata_nv dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 1940, comm: modprobe Not tainted 2.6.32-573.7.1.el6.x86_64.debug #1
Call Trace:
 [<ffffffff8107bbc1>] ? warn_slowpath_common+0x91/0xe0
 [<ffffffff8107bcc6>] ? warn_slowpath_fmt+0x46/0x60
 [<ffffffffa00fe5b0>] ? nv_start_xmit_optimized+0x0/0x860 [forcedeth]
 [<ffffffff814b3593>] ? netpoll_send_skb_on_dev+0x243/0x250
 [<ffffffff814b37c9>] ? netpoll_send_udp+0x229/0x270
 [<ffffffffa02e3299>] ? write_msg+0x39/0x110 [netconsole]
 [<ffffffffa02e331b>] ? write_msg+0xbb/0x110 [netconsole]
 [<ffffffff8107bd55>] ? __call_console_drivers+0x75/0x90
 [<ffffffff8107bdba>] ? _call_console_drivers+0x4a/0x80
 [<ffffffff8107c445>] ? release_console_sem+0xe5/0x250
 [<ffffffff8107d200>] ? register_console+0x190/0x3e0
 [<ffffffffa02e71a6>] ? init_netconsole+0x1a6/0x216 [netconsole]
 [<ffffffffa02e7000>] ? init_netconsole+0x0/0x216 [netconsole]
 [<ffffffff810020d0>] ? do_one_initcall+0xc0/0x280
 [<ffffffff810d4933>] ? sys_init_module+0xe3/0x260
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
---[ end trace f349c7af88e6a6d5 ]---
console [netcon0] enabled
netconsole: network logging started

Fix it by modifying the forcedeth code to use
disable_irq_nosync_lockdep_irqsavedisable_irq_nosync_lockdep_irqsave instead,
which saves and restores irq state properly.  This also saves us a little code
in the process

Tested by the reporter, with successful restuls

Patch applies to the head of the net tree

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
Reported-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:45:23 -07:00
Joe Stringer
6f5cadee44 openvswitch: Fix skb leak using IPv6 defrag
nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the
reassembly is successful, expects the caller to free all of the original
skbs using nf_ct_frag6_consume_orig(). This call was previously missing,
meaning that the original fragments were never freed (with the exception
of the last fragment to arrive).

Fix this by ensuring that all original fragments except for the last
fragment are freed via nf_ct_frag6_consume_orig(). The last fragment
will be morphed into the head, so it must not be freed yet. Furthermore,
retain the ->next pointer for the head after skb_morph().

Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:32:18 -07:00
Joe Stringer
190b8ffbb7 ipv6: Export nf_ct_frag6_consume_orig()
This is needed in openvswitch to fix an skb leak in the next patch.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:32:17 -07:00
Joe Stringer
74c1661813 openvswitch: Fix double-free on ip_defrag() errors
If ip_defrag() returns an error other than -EINPROGRESS, then the skb is
freed. When handle_fragments() passes this back up to
do_execute_actions(), it will be freed again. Prevent this double free
by never freeing the skb in do_execute_actions() for errors returned by
ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead.

Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 19:32:14 -07:00
Alexander Duyck
c2229fe143 fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key
We were computing the child index in cases where the key value we were
looking for was actually less than the base key of the tnode.  As a result
we were getting incorrect index values that would cause us to skip over
some children.

To fix this I have added a test that will force us to use child index 0 if
the key we are looking for is less than the key of the current tnode.

Fixes: 8be33e955c ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
Reported-by: Brian Rak <brak@gameservers.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 18:14:51 -07:00
Ming Lin
a22c4d7e34 block: re-add discard_granularity and alignment checks
In commit b49a087("block: remove split code in
blkdev_issue_{discard,write_same}"), discard_granularity and alignment
checks were removed. Ideally, with bio late splitting, the upper layers
shouldn't need to depend on device's limits.

Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe
device when mkfs.xfs. We have not found the root cause yet.

This patch re-adds discard_granularity and alignment checks by reverting
the related changes in commit b49a087. The good thing is now we can
remove the 2G discard size cap and just use UINT_MAX to avoid bi_size
overflow.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-28 09:12:58 +09:00
Linus Torvalds
23d88271b4 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Two fixes for ARM and one for clkdev:

   - Fix another build issue with vdsomunge on non-glibc systems
   - Fix a randconfig build error caused by an invalid configuration
   - Fix a clkdev problem causing the Nokia n700 to no longer boot"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  clkdev: fix clk_add_alias() with a NULL alias device name
  ARM: 8445/1: fix vdsomunge not to depend on glibc specific byteswap.h
  ARM: make RiscPC depend on MMU
2015-10-28 07:24:53 +09:00
Linus Torvalds
3d0aa36607 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull blkcg fix from Jens Axboe:
 "One final fix that should go into 4.3.  It's a simple 2x1 liner,
  fixing a blkcg accounting issue.  It was using the wrong bio member to
  look at the sync and write bits..."

* 'for-linus' of git://git.kernel.dk/linux-block:
  blkcg: fix incorrect read/write sync/async stat accounting
2015-10-28 07:22:15 +09:00
Linus Torvalds
dc5bc3f1e3 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes a problem in the Crypto API that may cause spurious errors
  when signals are received by the process that made the orignal system
  call into the kernel"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: api - Only abort operations on fatal signal
2015-10-28 07:20:10 +09:00
Linus Torvalds
9e17f90702 Turns out we should have always been disabling preemption here;
someone finally caught it thanks to Peter Z's additional checks.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWLfFDAAoJENkgDmzRrbjxhFQP/jK2tSn8I2kDsd2sNZelXor+
 uQGKBpws8c2qY/MXru7HwlAX4L6Qlk6hCm0FyI9ZTqPBfR9sxjqFzOzWjcwFYrHa
 EXPPzJcGBR1voK/n/e+J/xHFQgXuIVKiOtmz1PZOp8Hs4zO3SftT4+MuTCpuaC0x
 79P0crJ4lV8FKFc+5pjWXceWRIG5XQuIbmDjNLOycuiNad7mtrEnR4xbMQBxD1Hf
 P1xT4pKckwVcSvRNw8St25ov65N/l/CPlDtYXov4eay4/Jd4wTFRSipe9v70tOkL
 vUGFGwa76ZM9X9y1Q85l2oQEkKMffSl82JEXyDrGPITwlYX0qRGOGfXPL8ihgcFi
 Gd3abBdI68HQidkwPNBJmMdSEX8q3ygzFeFOLYNdmeklhqnKRZHvDvbok1poBnSf
 EJbE6Cv7eKY9e74lu60NAf76XB48oL/BvEfdW1l4p/8GGxc0EYc6XGjjeHLsQffT
 7yMRzEp99nX4JQoSjeDRroLbCCAzzoAcogJnlrnYFywzfVkyWUU3sPpbMFxvRDSW
 Xuk/8Ii15AYwXnbQOpojtIJAi9f6F8CzIbtsO5J9cFju8lFUzUYpwC0Lm/pPoxQT
 PqgDqVvS3wBOaL7iEA1Sy615qUCxu6eIKK6gqnFs1hK9sTg/aSUF+Oz668rQ/a/E
 2XvWiB2C9NErM5IHtQdh
 =8AgP
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module preemption fix from Rusty Russell:
 "Turns out we should have always been disabling preemption here;
  someone finally caught it thanks to Peter Z's additional checks"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: Fix locking in symbol_put_addr()
2015-10-28 07:17:50 +09:00
Arnaldo Carvalho de Melo
f4efcce33d perf tools: Search for more options when passing args to -h
Recently 'perf <tool> -h' was made aware of arguments and would show
just the help for the arguments specified, but that required a strict
form, i.e.:

  $ perf -h --tui

worked, but:

  $ perf -h tui

didn't.

Make it support both cases and also look at the option help when neither
matches, so that he following examples works:

  $ perf report -h interface

   Usage: perf report [<options>]

    --gtk    Use the GTK2 interface
    --stdio  Use the stdio interface
    --tui    Use the TUI interface

  $ perf report -h stack

   Usage: perf report [<options>]

    -g, --call-graph <print_type,threshold[,print_limit],order,
                      sort_key[,branch]>
      Display call graph (stack chain/backtrace):

        print_type:  call graph printing style (graph|flat|fractal|none)
        threshold:   minimum call graph inclusion threshold (<percent>)
        print_limit: maximum number of call graph entry (<number>)
        order:       call graph order (caller|callee)
        sort_key:    call graph sort key (function|address)
        branch:      include last branch info to call graph (branch)

      Default: graph,0.5,caller,function
        --max-stack <n>   Set the maximum stack depth when parsing the
                          callchain, anything beyond the specified depth
                          will be ignored. Default: 127
  $

Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xzqvamzqv3cv0p6w3inhols3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27 17:35:58 -03:00
Jiri Olsa
1e5a29318b perf stat: Cache aggregated map entries in extra cpumap
Currently any time we need to access socket or core id for given cpu, we
access the sysfs topology file.

Adding a cpus_aggr_map cpu_map to cache those entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27 15:08:07 -03:00
Jiri Olsa
2322f573f8 perf cpu_map: Add cpu_map__empty_new function
Adding cpu_map__empty_new interface to create empty cpumap with given
size. The cpumap entries are initialized with -1.

It'll be used for caching cpu_map in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27 15:05:36 -03:00
Jiri Olsa
af33998174 perf evsel: Move id_offset out of struct perf_evsel union member
Because the 'perf stat record' patches will use the id_offset member
together with the priv pointer.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-29-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27 15:04:29 -03:00