Commit Graph

107 Commits

Author SHA1 Message Date
Linus Torvalds
43c9fad942 Power management and ACPI material for v4.2-rc1
- ACPICA update to upstream revision 20150515 including basic
    support for ACPI 6 features: new ACPI tables introduced by
    ACPI 6 (STAO, XENV, WPBT, NFIT, IORT), changes related to the
    other tables (DTRM, FADT, LPIT, MADT), new predefined names
    (_BTH, _CR3, _DSD, _LPI, _MTL, _PRR, _RDI, _RST, _TFP, _TSN),
    fixes and cleanups (Bob Moore, Lv Zheng).
 
  - ACPI device power management core code update to follow ACPI 6
    which reflects the ACPI device power management implementation
    in Windows (Rafael J Wysocki).
 
  - Rework of the backlight interface selection logic to reduce the
    number of kernel command line options and improve the handling
    of DMI quirks that may be involved in that and to make the
    code generally more straightforward (Hans de Goede).
 
  - Fixes for the ACPI Embedded Controller (EC) driver related to
    the handling of EC transactions (Lv Zheng).
 
  - Fix for a regression related to the ACPI resources management
    and resulting from a recent change of ACPI initialization code
    ordering (Rafael J Wysocki).
 
  - Fix for a system initialization regression related to ACPI
    introduced during the 3.14 cycle and caused by running the
    code that switches the platform over to the ACPI mode too
    early in the initialization sequence (Rafael J Wysocki).
 
  - Support for the ACPI _CCA device configuration object related
    to DMA cache coherence (Suravee Suthikulpanit).
 
  - ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).
 
  - ACPI battery driver cleanups (Luis Henriques, Mathias Krause).
 
  - ACPI processor driver cleanups (Hanjun Guo).
 
  - Cleanups and documentation update related to the ACPI device
    properties interface based on _DSD (Rafael J Wysocki).
 
  - ACPI device power management fixes (Rafael J Wysocki).
 
  - Assorted cleanups related to ACPI (Dominik Brodowski. Fabian
    Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).
 
  - Fix for a long-standing issue causing General Protection Faults
    to be generated occasionally on return to user space after resume
    from ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).
 
  - Fix to make the suspend core code return -EBUSY consistently in
    all cases when system suspend is aborted due to wakeup detection
    (Ruchi Kandoi).
 
  - Support for automated device wakeup IRQ handling allowing drivers
    to make their PM support more starightforward (Tony Lindgren).
 
  - New tracepoints for suspend-to-idle tracing and rework of the
    prepare/complete callbacks tracing in the PM core (Todd E Brandt,
    Rafael J Wysocki).
 
  - Wakeup sources framework enhancements (Jin Qian).
 
  - New macro for noirq system PM callbacks (Grygorii Strashko).
 
  - Assorted cleanups related to system suspend (Rafael J Wysocki).
 
  - cpuidle core cleanups to make the code more efficient (Rafael J
    Wysocki).
 
  - powernv/pseries cpuidle driver update (Shilpasri G Bhat).
 
  - cpufreq core fixes related to CPU online/offline that should
    reduce the overhead of these operations quite a bit, unless the
    CPU in question is physically going away (Viresh Kumar, Saravana
    Kannan).
 
  - Serialization of cpufreq governor callbacks to avoid race
    conditions in some cases (Viresh Kumar).
 
  - intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
    Bhargava, Joe Konno).
 
  - cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
    Holla, Felipe Balbi, Tang Yuantian).
 
  - Assorted cleanups in cpufreq drivers and core (Shailendra Verma,
    Fabian Frederick, Wang Long).
 
  - New Device Tree bindings for representing Operating Performance
    Points (Viresh Kumar).
 
  - Updates for the common clock operations support code in the PM
    core (Rajendra Nayak, Geert Uytterhoeven).
 
  - PM domains core code update (Geert Uytterhoeven).
 
  - Intel Knights Landing support for the RAPL (Running Average Power
    Limit) power capping driver (Dasaratharaman Chandramouli).
 
  - Fixes related to the floor frequency setting on Atom SoCs in the
    RAPL power capping driver (Ajay Thomas).
 
  - Runtime PM framework documentation update (Ben Dooks).
 
  - cpupower tool fix (Herton R Krzesinski).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJViJdWAAoJEILEb/54YlRx/9gP/3gHoFevNRycvn0VpKqdufCI
 Mxy2LBBLlfyW2uD3+NvqvA2WWSo0Cs/LgXa04eAVxPdU7k48s8w+54U23wSouzjW
 gfwAmuHxzDR8v0h8X3h6BxNzmkIQHtmDcQlA/cZdHejY/UUw01yxRGNUUZDNbxlm
 WXn2nmlBLmGqXTYq0fpBV+3jicUghJqHHsBCqa3VR2yQioHMJG01F4UZMqYTZunN
 OIvDUghxByKz6alzdCqlLl1Y0exV6vwWUAzBsl1qHqmHu/bWFSZn3ujNNVrjqHhw
 Kl7/8dC2pQkv3Zo3gEVvfQ0onotwWZxGHzPQRdvmxvRnBunQVCi/wynx90yABX/r
 PPb/iBNV0mZskbF0zb0GZT3ZZWGA8Z0p3o5JQv2jV4m62qTzx8w50Y5kbn9N1WT+
 5bre7AVbVAlGonWszcS9iE+6TOboRz9OD1CCwPFXHItFutlBkau+1hHfFoLM0o9n
 LhpGuyszT/EUa1BHkLzuCckFqO2DpbF3N2CKmuTekw0CdgdsvRL2pRByuerk3j7R
 WQhlcvBq5YH6j43AuoEZKp8r1iN8oG/iqlrMYQaYWrW9hJaoQOoU8dGJxp/e7gKN
 r/qeYjETI+tIsjCbtH5WQzzxDI3gPISAYAtfqs7G34EEo+Lwp6kyRUAF4kDot2V3
 ZIyuKMmTu4cdwDETr/O+
 =7jTj
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "The rework of backlight interface selection API from Hans de Goede
  stands out from the number of commits and the number of affected
  places perspective.  The cpufreq core fixes from Viresh Kumar are
  quite significant too as far as the number of commits goes and because
  they should reduce CPU online/offline overhead quite a bit in the
  majority of cases.

  From the new featues point of view, the ACPICA update (to upstream
  revision 20150515) adding support for new ACPI 6 material to ACPICA is
  the one that matters the most as some new significant features will be
  based on it going forward.  Also included is an update of the ACPI
  device power management core to follow ACPI 6 (which in turn reflects
  the Windows' device PM implementation), a PM core extension to support
  wakeup interrupts in a more generic way and support for the ACPI _CCA
  device configuration object.

  The rest is mostly fixes and cleanups all over and some documentation
  updates, including new DT bindings for Operating Performance Points.

  There is one fix for a regression introduced in the 4.1 cycle, but it
  adds quite a number of lines of code, it wasn't really ready before
  Thursday and you were on vacation, so I refrained from pushing it on
  the last minute for 4.1.

  Specifics:

   - ACPICA update to upstream revision 20150515 including basic support
     for ACPI 6 features: new ACPI tables introduced by ACPI 6 (STAO,
     XENV, WPBT, NFIT, IORT), changes related to the other tables (DTRM,
     FADT, LPIT, MADT), new predefined names (_BTH, _CR3, _DSD, _LPI,
     _MTL, _PRR, _RDI, _RST, _TFP, _TSN), fixes and cleanups (Bob Moore,
     Lv Zheng).

   - ACPI device power management core code update to follow ACPI 6
     which reflects the ACPI device power management implementation in
     Windows (Rafael J Wysocki).

   - rework of the backlight interface selection logic to reduce the
     number of kernel command line options and improve the handling of
     DMI quirks that may be involved in that and to make the code
     generally more straightforward (Hans de Goede).

   - fixes for the ACPI Embedded Controller (EC) driver related to the
     handling of EC transactions (Lv Zheng).

   - fix for a regression related to the ACPI resources management and
     resulting from a recent change of ACPI initialization code ordering
     (Rafael J Wysocki).

   - fix for a system initialization regression related to ACPI
     introduced during the 3.14 cycle and caused by running the code
     that switches the platform over to the ACPI mode too early in the
     initialization sequence (Rafael J Wysocki).

   - support for the ACPI _CCA device configuration object related to
     DMA cache coherence (Suravee Suthikulpanit).

   - ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).

   - ACPI battery driver cleanups (Luis Henriques, Mathias Krause).

   - ACPI processor driver cleanups (Hanjun Guo).

   - cleanups and documentation update related to the ACPI device
     properties interface based on _DSD (Rafael J Wysocki).

   - ACPI device power management fixes (Rafael J Wysocki).

   - assorted cleanups related to ACPI (Dominik Brodowski, Fabian
     Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).

   - fix for a long-standing issue causing General Protection Faults to
     be generated occasionally on return to user space after resume from
     ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).

   - fix to make the suspend core code return -EBUSY consistently in all
     cases when system suspend is aborted due to wakeup detection (Ruchi
     Kandoi).

   - support for automated device wakeup IRQ handling allowing drivers
     to make their PM support more starightforward (Tony Lindgren).

   - new tracepoints for suspend-to-idle tracing and rework of the
     prepare/complete callbacks tracing in the PM core (Todd E Brandt,
     Rafael J Wysocki).

   - wakeup sources framework enhancements (Jin Qian).

   - new macro for noirq system PM callbacks (Grygorii Strashko).

   - assorted cleanups related to system suspend (Rafael J Wysocki).

   - cpuidle core cleanups to make the code more efficient (Rafael J
     Wysocki).

   - powernv/pseries cpuidle driver update (Shilpasri G Bhat).

   - cpufreq core fixes related to CPU online/offline that should reduce
     the overhead of these operations quite a bit, unless the CPU in
     question is physically going away (Viresh Kumar, Saravana Kannan).

   - serialization of cpufreq governor callbacks to avoid race
     conditions in some cases (Viresh Kumar).

   - intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
     Bhargava, Joe Konno).

   - cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
     Holla, Felipe Balbi, Tang Yuantian).

   - assorted cleanups in cpufreq drivers and core (Shailendra Verma,
     Fabian Frederick, Wang Long).

   - new Device Tree bindings for representing Operating Performance
     Points (Viresh Kumar).

   - updates for the common clock operations support code in the PM core
     (Rajendra Nayak, Geert Uytterhoeven).

   - PM domains core code update (Geert Uytterhoeven).

   - Intel Knights Landing support for the RAPL (Running Average Power
     Limit) power capping driver (Dasaratharaman Chandramouli).

   - fixes related to the floor frequency setting on Atom SoCs in the
     RAPL power capping driver (Ajay Thomas).

   - runtime PM framework documentation update (Ben Dooks).

   - cpupower tool fix (Herton R Krzesinski)"

* tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (194 commits)
  cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
  x86: Load __USER_DS into DS/ES after resume
  PM / OPP: Add binding for 'opp-suspend'
  PM / OPP: Allow multiple OPP tables to be passed via DT
  PM / OPP: Add new bindings to address shortcomings of existing bindings
  ACPI: Constify ACPI device IDs in documentation
  ACPI / enumeration: Document the rules regarding the PRP0001 device ID
  ACPI / video: Make acpi_video_unregister_backlight() private
  acpi-video-detect: Remove old API
  toshiba-acpi: Port to new backlight interface selection API
  thinkpad-acpi: Port to new backlight interface selection API
  sony-laptop: Port to new backlight interface selection API
  samsung-laptop: Port to new backlight interface selection API
  msi-wmi: Port to new backlight interface selection API
  msi-laptop: Port to new backlight interface selection API
  intel-oaktrail: Port to new backlight interface selection API
  ideapad-laptop: Port to new backlight interface selection API
  fujitsu-laptop: Port to new backlight interface selection API
  eeepc-laptop: Port to new backlight interface selection API
  dell-wmi: Port to new backlight interface selection API
  ...
2015-06-23 14:18:07 -07:00
Prarit Bhargava
7180dddf7c intel_pstate: Fix overflow in busy_scaled due to long delay
The kernel may delay interrupts for a long time which can result in timers
being delayed. If this occurs the intel_pstate driver will crash with a
divide by zero error:

divide error: 0000 [#1] SMP
Modules linked in: btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc arc4 md4 nls_utf8 cifs dns_resolver tcp_lp bnep bluetooth rfkill fuse dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables intel_powerclamp coretemp vfat fat kvm_intel iTCO_wdt iTCO_vendor_support ipmi_devintf sr_mod kvm crct10dif_pclmul
 crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel cdc_ether lrw usbnet cdrom mii gf128mul glue_helper ablk_helper cryptd lpc_ich mfd_core pcspkr sb_edac edac_core ipmi_si ipmi_msghandler ioatdma wmi shpchp acpi_pad nfsd auth_rpcgss nfs_acl lockd uinput dm_multipath sunrpc xfs libcrc32c usb_storage sd_mod crc_t10dif crct10dif_common ixgbe mgag200 syscopyarea sysfillrect sysimgblt mdio drm_kms_helper ttm igb drm ptp pps_core dca i2c_algo_bit megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod
CPU: 113 PID: 0 Comm: swapper/113 Tainted: G        W   --------------   3.10.0-229.1.2.el7.x86_64 #1
Hardware name: IBM x3950 X6 -[3837AC2]-/00FN827, BIOS -[A8E112BUS-1.00]- 08/27/2014
task: ffff880fe8abe660 ti: ffff880fe8ae4000 task.ti: ffff880fe8ae4000
RIP: 0010:[<ffffffff814a9279>]  [<ffffffff814a9279>] intel_pstate_timer_func+0x179/0x3d0
RSP: 0018:ffff883fff4e3db8  EFLAGS: 00010206
RAX: 0000000027100000 RBX: ffff883fe6965100 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000010 RDI: 000000002e53632d
RBP: ffff883fff4e3e20 R08: 000e6f69a5a125c0 R09: ffff883fe84ec001
R10: 0000000000000002 R11: 0000000000000005 R12: 00000000000049f5
R13: 0000000000271000 R14: 00000000000049f5 R15: 0000000000000246
FS:  0000000000000000(0000) GS:ffff883fff4e0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7668601000 CR3: 000000000190a000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff883fff4e3e58 ffffffff81099dc1 0000000000000086 0000000000000071
 ffff883fff4f3680 0000000000000071 fbdc8a965e33afee ffffffff810b69dd
 ffff883fe84ec000 ffff883fe6965108 0000000000000100 ffffffff814a9100
Call Trace:
 <IRQ>

 [<ffffffff81099dc1>] ? run_posix_cpu_timers+0x51/0x840
 [<ffffffff810b69dd>] ? trigger_load_balance+0x5d/0x200
 [<ffffffff814a9100>] ? pid_param_set+0x130/0x130
 [<ffffffff8107df56>] call_timer_fn+0x36/0x110
 [<ffffffff814a9100>] ? pid_param_set+0x130/0x130
 [<ffffffff8107fdcf>] run_timer_softirq+0x21f/0x320
 [<ffffffff81077b2f>] __do_softirq+0xef/0x280
 [<ffffffff816156dc>] call_softirq+0x1c/0x30
 [<ffffffff81015d95>] do_softirq+0x65/0xa0
 [<ffffffff81077ec5>] irq_exit+0x115/0x120
 [<ffffffff81616355>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff81614a1d>] apic_timer_interrupt+0x6d/0x80
 <EOI>

 [<ffffffff814a9c32>] ? cpuidle_enter_state+0x52/0xc0
 [<ffffffff814a9c28>] ? cpuidle_enter_state+0x48/0xc0
 [<ffffffff814a9d65>] cpuidle_idle_call+0xc5/0x200
 [<ffffffff8101d14e>] arch_cpu_idle+0xe/0x30
 [<ffffffff810c67c1>] cpu_startup_entry+0xf1/0x290
 [<ffffffff8104228a>] start_secondary+0x1ba/0x230
Code: 42 0f 00 45 89 e6 48 01 c2 43 8d 44 6d 00 39 d0 73 26 49 c1 e5 08 89 d2 4d 63 f4 49 63 c5 48 c1 e2 08 48 c1 e0 08 48 63 ca 48 99 <48> f7 f9 48 98 4c 0f af f0 49 c1 ee 08 8b 43 78 c1 e0 08 44 29
RIP  [<ffffffff814a9279>] intel_pstate_timer_func+0x179/0x3d0
 RSP <ffff883fff4e3db8>

The kernel values for cpudata for CPU 113 were:

struct cpudata {
  cpu = 113,
  timer = {
    entry = {
      next = 0x0,
      prev = 0xdead000000200200
    },
    expires = 8357799745,
    base = 0xffff883fe84ec001,
    function = 0xffffffff814a9100 <intel_pstate_timer_func>,
    data = 18446612406765768960,
<snip>
    i_gain = 0,
    d_gain = 0,
    deadband = 0,
    last_err = 22489
  },
  last_sample_time = {
    tv64 = 4063132438017305
  },
  prev_aperf = 287326796397463,
  prev_mperf = 251427432090198,
  sample = {
    core_pct_busy = 23081,
    aperf = 2937407,
    mperf = 3257884,
    freq = 2524484,
    time = {
      tv64 = 4063149215234118
    }
  }
}

which results in the time between samples = last_sample_time - sample.time
= 4063149215234118 - 4063132438017305 = 16777216813 which is 16.777 seconds.

The duration between reads of the APERF and MPERF registers overflowed a s32
sized integer in intel_pstate_get_scaled_busy()'s call to div_fp().  The result
is that int_tofp(duration_us) == 0, and the kernel attempts to divide by 0.

While the kernel shouldn't be delaying for a long time, it can and does
happen and the intel_pstate driver should not panic in this situation.  This
patch changes the div_fp() function to use div64_s64() to allow for "long"
division.  This will avoid the overflow condition on long delays.

[v2]: use div64_s64() in div_fp()

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-16 22:52:45 +02:00
Doug Smythies
6c1e45917d intel_pstate: Force setting target pstate when required
During initialization and exit it is possible that the target pstate
might not actually be set. Furthermore, the result can be that the
driver and the processor are out of synch and, under some conditions,
the driver might never send the processor the proper target pstate.

This patch adds a bypass or do_checks flag to the call to
intel_pstate_set_pstate. If bypass, then specifically bypass clamp
checks and the do not send if it is the same as last time check. If
do_checks, then, and as before, do the current policy clamp checks,
and do not do actual send if the new target is the same as the old.

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Reported-by: Marien Zwart <marien.zwart@gmail.com>
Reported-by: Alex Lochmann <alexander.lochmann@tu-dortmund.de>
Reported-by: Piotr Ko?aczkowski <pkolaczk@gmail.com>
Reported-by: Clemens Eisserer <linuxhippy@gmail.com>
Tested-by: Marien Zwart <marien.zwart@gmail.com>
Tested-by: Doug Smythies <dsmythies@telus.net>
[ rjw: Dropped pointless symbol definitions, rebased ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-10 02:08:27 +02:00
Doug Smythies
f16255eb93 intel_pstate: change some inconsistent debug information
Commit ce717613f3 (intel_pstate: Turn per cpu printk into pr_debug)
turned per cpu printk into pr_debug.  However, only half of the change
was done, introducing an inconsistency between entry and exit from
driver pstate control.  This patch changes the exit message to pr_debug
also.

The various messages are inconsistent with respect to any identifier
text that can be used to help isolate the desired information from a
huge log.  This patch makes a consistent identifier portion of the
string.

Amends: ce717613f3 (intel_pstate: Turn per cpu printk into pr_debug)
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-10 01:57:14 +02:00
Stephen Rothwell
d6472302f2 x86/mm: Decouple <linux/vmalloc.h> from <asm/io.h>
Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so
remove it from there and fix up the resulting build problems
triggered on x86 {64|32}-bit {def|allmod|allno}configs.

The breakages were triggering in places where x86 builds relied
on vmalloc() facilities but did not include <linux/vmalloc.h>
explicitly and relied on the implicit inclusion via <asm/io.h>.

Also add:

  - <linux/init.h> to <linux/io.h>
  - <asm/pgtable_types> to <asm/io.h>

... which were two other implicit header file dependencies.

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
[ Tidied up the changelog. ]
Acked-by: David Miller <davem@davemloft.net>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Colin Cross <ccross@android.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: James E.J. Bottomley <JBottomley@odin.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Suma Ramars <sramars@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-03 12:02:00 +02:00
Joe Konno
0dd23f9425 intel_pstate: set BYT MSR with wrmsrl_on_cpu()
Commit 007bea098b (intel_pstate: Add setting voltage value for
baytrail P states.) introduced byt_set_pstate() with the assumption that
it would always be run by the CPU whose MSR is to be written by it.  It
turns out, however, that is not always the case in practice, so modify
byt_set_pstate() to enforce the MSR write done by it to always happen on
the right CPU.

Fixes: 007bea098b (intel_pstate: Add setting voltage value for baytrail P states.)
Signed-off-by: Joe Konno <joe.konno@intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-12 23:36:26 +02:00
Doug Smythies
4055fad340 intel_pstate: Add tsc collection and keep previous target pstate
The intel_pstate driver is difficult to debug and investigate without tsc.

Also, it is likely use of tsc, and some version of C0 percentage,
will be re-introdcued in futute.

There have also been occasions where it is desirebale to know, and
confirm, the previous target pstate.

This patch brings back tsc, adds previous target pstate,
and adds both to the trace data collection.

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-05 01:08:54 +02:00
Borislav Petkov
64df1fdfcc cpufreq: intel_pstate: Fix an annoying !CONFIG_SMP warning
I keep seeing

  drivers/cpufreq/intel_pstate.c: In function ‘intel_pstate_init’:
  drivers/cpufreq/intel_pstate.c:1187:26: warning: initialization from incompatible pointer type
    struct cpuinfo_x86 *c = &boot_cpu_data;

when doing randconfig builds.

This is caused by the fact that when !CONFIG_SMP, asm/processor.h
defines cpu_info to boot_cpu_data and the local variable

  struct cpu_defaults *cpu_info

overshadows it leading to this unfortunate assignment in the
preprocessed source:

 struct cpu_defaults *boot_cpu_data;
 struct cpuinfo_x86 *c = &boot_cpu_data;

Rename the local variable and use static_cpu_has_safe() which alleviates
the need for defining a local cpuinfo_x86 pointer.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-15 23:02:24 +02:00
Kristen Carlson Accardi
6a82ba6d4f intel_pstate: Change the setpoint for Atom params
Change the setpoint for the Baytrail and Cherrytrail CPUs.  This
will cause more aggressive pstate selection and improves
performance on a variety of workloads with little power penalty.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-15 22:42:32 +02:00
Dasaratharaman Chandramouli
b34ef932d7 intel_pstate: Knights Landing support
1. Add Knights Landing (KNL) CPUID to the list of CPUIDs supported by
    the intel_pstate driver.

 2. Add a new cpu_default structure for KNL since KNL has a slightly
    different mechanism to get turbo pstates from MSRs.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
[ rjw: Subject and changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-11 02:13:29 +02:00
Kristen Carlson Accardi
5f97899d78 intel_pstate: remove MSR test
x86_match_cpu will not match our cpuid unless APERF/MPERF flag is
set, so there is no need to do the manual check for this MSR.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-11 02:13:28 +02:00
Kristen Carlson Accardi
d64c3b0bb9 intel_pstate: provide option to only use intel_pstate with HWP
Allow users the option to disable the driver for any hardware
which does not support HWP.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-06 22:54:18 +01:00
Kristen Carlson Accardi
a04759924e intel_pstate: honor user space min_perf_pct override on resume
If the user has requested an override of the min_perf_pct via
sysfs, then it should be restored whenever policy is updated,
such as on resume.  Take the max of whatever the user requested
and whatever the policy is.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-30 01:52:17 +01:00
Srinivas Pandruvada
630ec286dd intel_pstate: respect cpufreq policy request
When thermal or other subsystem requests to change the policy,
use that irrepective of whether cpufreq policy is PERFORMANCE or
not. Without this change, when thermal subsystem passive policy wants
to reduce performance, it still runs at 100%.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-30 01:52:17 +01:00
Kristen Carlson Accardi
0522424ecb intel_pstate: Add num_pstates to sysfs
Add a sysfs interface to display the total number of supported
pstates.  This value is independent of whether turbo has been
enabled or disabled.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-30 01:52:17 +01:00
Kristen Carlson Accardi
d01b1f48c5 intel_pstate: expose turbo range to sysfs
This patch adds "turbo_pct" to the intel_pstate sysfs interface.
turbo_pct will display the percentage of the total supported
pstates that are in the turbo range.  This value is independent
of whether turbo has been disabled or not.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-30 01:52:17 +01:00
Kristen Carlson Accardi
7ab0256e57 intel_pstate: Add support for SkyLake
Add SKL cpuid.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-30 01:52:17 +01:00
Kristen Carlson Accardi
e0d4c8f808 intel_pstate: Add a few comments
Add a few comments in the code which calculates busyness to
clarify parts of the algorithm.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-11 00:24:42 +01:00
Ethan Zhao
aa4ea34da9 intel_pstate: add kernel parameter to force loading
To force loading on Oracle Sun X86 servers, provide one kernel command line
parameter

  intel_pstate = force

For those who are aware of the risk of no power capping capabily working
and try to get better performance with this driver.

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Tested-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Reviewed-by: Linda Knippers <linda.knippers@hp.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-11 00:23:48 +01:00
ethan zhao
966916eabf intel_pstate: skip this driver if Sun server has _PPC method
Oracle Sun X86 servers have dynamic power capping capability that works via
ACPI _PPC method etc, so skip loading this driver if Sun server has ACPI _PPC
enabled.

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Tested-by: Linda Knippers <linda.knippers@hp.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-03 02:56:04 +01:00
Dirk Brandewie
43f8a966e9 intel_pstate: Add CPUID for BDW-H CPU
Add BDW-H to the list of supported processors.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-12 00:04:38 +01:00
Dirk Brandewie
2f86dc4cdd intel_pstate: Add support for HWP
Add support of Hardware Managed Performance States (HWP) described in Volume 3
section 14.4 of the SDM.

With HWP enbaled intel_pstate will no longer be responsible for selecting P
states for the processor. intel_pstate will continue to register to
the cpufreq core as the scaling driver for CPUs implementing
HWP. In HWP mode intel_pstate provides three functions reporting
frequency to the cpufreq core, support for the set_policy() interface
from the core and maintaining the intel_pstate sysfs interface in
/sys/devices/system/cpu/intel_pstate.  User preferences expressed via
the set_policy() interface or the sysfs interface are forwared to the
CPU via the HWP MSR interface.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-12 00:04:38 +01:00
Dirk Brandewie
d022a65ed2 intel_pstate: Correct BYT VID values.
Using a VID value that is not high enough for the requested P state can
cause machine checks. Add a ceiling function to ensure calulated VIDs
with fractional values are set to the next highest integer VID value.

The algorythm for calculating the non-trubo VID from the BIOS writers
guide is:
 vid_ratio = (vid_max - vid_min) / (max_pstate - min_pstate)
 vid = ceiling(vid_min + (req_pstate - min_pstate) * vid_ratio)

Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23 23:00:00 +02:00
Dirk Brandewie
b27580b05e intel_pstate: Fix BYT frequency reporting
BYT has a different conversion from P state to frequency than the core
processors.  This causes the min/max and current frequency to be
misreported on some BYT SKUs. Tested on BYT N2820, Ivybridge and
Haswell processors.

Link: https://bugzilla.yoctoproject.org/show_bug.cgi?id=6663
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23 22:59:59 +02:00
Dirk Brandewie
c034871712 intel_pstate: Don't lose sysfs settings during cpu offline
The user may have custom settings don't destroy them during suspend.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=80651
Reported-by: Tobias Jakobi <liquid.acid@gmx.net>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23 22:59:59 +02:00
Gabriele Mazzotta
4521e1a0ce cpufreq: intel_pstate: Reflect current no_turbo state correctly
Some BIOSes modify the state of MSR_IA32_MISC_ENABLE_TURBO_DISABLE
based on the current power source for the system battery AC vs
battery. Reflect the correct current state and ability to modify the
no_turbo sysfs file based on current state of
MSR_IA32_MISC_ENABLE_TURBO_DISABLE.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=83151
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23 22:59:59 +02:00
Pali Rohár
36b4bed5cd cpufreq: intel_pstate: Fix setting max_perf_pct in performance policy
Code which changes policy to powersave changes also max_policy_pct based on
max_freq. Code which change max_perf_pct has upper limit base on value
max_policy_pct. When policy is changing from powersave back to performance
then max_policy_pct is not changed. Which means that changing max_perf_pct is
not possible to high values if max_freq was too low in powersave policy.

Test case:

$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
800000
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
3300000
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100

$ echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
$ echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
$ echo 20 > /sys/devices/system/cpu/intel_pstate/max_perf_pct

$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
powersave
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
800000
$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
20

$ echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
$ echo 3300000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
$ echo 100 > /sys/devices/system/cpu/intel_pstate/max_perf_pct

$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
3300000
$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
24

And now intel_pstate driver allows to set maximal value for max_perf_pct based
on max_policy_pct which is 24 for previous powersave max_freq 800000.

This patch will set default value for max_policy_pct when setting policy to
performance so it will allow to set also max value for max_perf_pct.

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Cc: All applicable <stable@vger.kernel.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23 22:59:59 +02:00
Gabriele Mazzotta
73f1ae8ab0 cpufreq: intel_pstate: Remove unneeded variable
It should have been removed with commit d1b6848590
("cpufreq / intel_pstate: Optimize intel_pstate_set_policy")

Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-03 01:31:04 +02:00
Mika Westerberg
16405f98bc cpufreq: intel_pstate: Add CPU ID for Braswell processor
This is pretty much the same as Intel Baytrail, only the CPU ID is
different. Add the new ID to the supported CPU list.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-08-28 01:26:21 +02:00
Andi Kleen
ce717613f3 intel_pstate: Turn per cpu printk into pr_debug
On larger systems intel_pstate currently spams the boot up
log with its "Intel pstate controlling ..." message for each CPU.
It's the only subsystem that prints a message for each
CPU.

Turn the message into a pr_debug.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-08-28 01:23:08 +02:00
Stratos Karafotis
78e2708691 cpufreq: intel_pstate: Remove core_pct rounding
The specific rounding adds conditionally only 1/256 to fractional
part of core_pct.

We can safely remove it without any noticeable impact in
calculations.

Use div64_u64 instead of div_u64 to avoid possible overflow of
sample->mperf as divisor

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:19 +02:00
Stratos Karafotis
4b707c893d cpufreq: intel_pstate: Simplify P state adjustment logic.
Simplify the code by removing the inline functions pstate_increase and
pstate_decrease and use directly the intel_pstate_set_pstate.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:19 +02:00
Stratos Karafotis
ac658131d7 cpufreq: intel_pstate: Keep values in aperf/mperf in full precision
Currently we shift right aperf and mperf variables by FRAC_BITS
to prevent overflow when we convert them to fix point numbers
(shift left by FRAC_BITS).

But this is not necessary, because we actually use delta aperf and mperf
which are much less than APERF and MPERF values.

So, use the unmodified APERF and MPERF values in calculation.
This also adds 8 bits in precision, although the gain is insignificant.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:18 +02:00
Stratos Karafotis
4ab60c3f32 cpufreq: intel_pstate: Disable interrupts during MSRs reading
According to Intel 64 and IA-32 Architectures SDM, Volume 3,
Chapter 14.2, "Software needs to exercise care to avoid delays
between the two RDMSRs (for example interrupts)".

So, disable interrupts during reading MSRs IA32_APERF and IA32_MPERF.
This should increase the accuracy of the calculations.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:18 +02:00
Stratos Karafotis
c410833a3c cpufreq: intel_pstate: Align multiple lines to open parenthesis
Suppress checkpatch.pl --strict warnings:
CHECK: Alignment should match open parenthesis

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:18 +02:00
Stratos Karafotis
abf013bffe cpufreq: intel_pstate: Remove unnecessary intermediate variable sample_time
Remove the unnecessary intermediate assignment and use directly the
pid_params.sample_rate_ms variable.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:18 +02:00
Stratos Karafotis
285cb99091 cpufreq: intel_pstate: Cleanup parentheses
Remove unnecessary parentheses.
Also, add parentheses in one case for better readability.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:18 +02:00
Stratos Karafotis
2d8d1f18ed cpufreq: intel_pstate: Fit code in a single line where possible
We can fit these lines in a single one, under the 80 characters
limit.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:18 +02:00
Stratos Karafotis
845c1cbef0 cpufreq: intel_pstate: Add missing blank lines after declarations
Also, remove unnecessary blank lines.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:18 +02:00
Stratos Karafotis
fa30dff9a8 cpufreq: intel_pstate: Remove unnecessary type casting in div_s64() call
div_s64() accepts the divisor parameter as s32. Helper div_fp()
also accepts divisor as int32_t.

So, remove the unnecessary int64_t type casting.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:17 +02:00
Stratos Karafotis
317dd50e80 cpufreq: intel_pstate: Make intel_pstate_kobject and debugfs_parent locals
Since we never remove sysfs entry and debugfs files, we can make
the intel_pstate_kobject and debugfs_parent locals.

Also, annotate with __init intel_pstate_sysfs_expose_params()
and intel_pstate_debug_expose_params() in order to be freed
after bootstrap.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-21 13:43:17 +02:00
Vincent Minet
179e847167 intel_pstate: Set CPU number before accessing MSRs
Ensure that cpu->cpu is set before writing MSR_IA32_PERF_CTL during CPU
initialization. Otherwise only cpu0 has its P-state set and all other
cores are left with their values unchanged.

In most cases, this is not too serious because the P-states will be set
correctly when the timer function is run.  But when the default governor
is set to performance, the per-CPU current_pstate stays the same forever
and no attempts are made to write the MSRs again.

Signed-off-by: Vincent Minet <vincent@vincent-minet.net>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 01:24:24 +02:00
Dirk Brandewie
dd5fbf70f9 intel_pstate: don't touch turbo bit if turbo disabled or unavailable.
If turbo is disabled in the BIOS bit 38 should be set in
MSR_IA32_MISC_ENABLE register per section 14.3.2.1 of the SDM Vol 3
document 325384-050US Feb 2014.  If this bit is set do *not* attempt
to disable trubo via the MSR_IA32_PERF_CTL register.  On some systems
trying to disable turbo via MSR_IA32_PERF_CTL will cause subsequent
writes to MSR_IA32_PERF_CTL not take affect, in fact reading
MSR_IA32_PERF_CTL will not show the IDA/Turbo DISENGAGE bit(32) as
set. A write of bit 32 to zero returns to normal operation.

Also deal with the case where the processor does not support
turbo and the BIOS does not report the fact in MSR_IA32_MISC_ENABLE
but does report the max and turbo P states as the same value.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=64251
Cc: 3.13+ <stable@vger.kernel.org>  # 3.13+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 01:22:19 +02:00
Dirk Brandewie
c16ed06024 intel_pstate: Fix setting VID
Commit 21855ff5 (intel_pstate: Set turbo VID for BayTrail) introduced
setting the turbo VID which is required to prevent a machine check on
some Baytrail SKUs under heavy graphics based workloads.  The
docmumentation update that brought the requirement to light also
changed the bit mask used for enumerating P state and VID values from
0x7f to 0x3f.

This change returns the mask value to 0x7f.

Tested with the Intel NUC DN2820FYK,
BIOS version FYBYT10H.86A.0034.2014.0513.1413 with v3.16-rc1 and
v3.14.8 kernel versions.

Fixes: 21855ff5 (intel_pstate: Set turbo VID for BayTrail)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77951
Reported-and-tested-by: Rune Reterson <rune@megahurts.dk>
Reported-and-tested-by: Eric Eickmeyer <erich@ericheickmeyer.com>
Cc: 3.13+ <stable@vger.kernel.org>  # 3.13+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-07 01:22:19 +02:00
Doug Smythies
51d211e9c3 intel_pstate: Correct rounding in busy calculation
There was a mistake in the actual rounding portion this previous patch:
f0fe3cd7e1 (intel_pstate: Correct rounding in busy calculation) such that
the rounding was asymetric and incorrect.

Severity: Not very serious, but can increase target pstate by one extra value.
For real world work flows the issue should self correct (but I have no proof).
It is the equivalent of different PID gains for positive and negative numbers.

Examples:
 -3.000000 used to round to -4, rounds to -3 with this patch.
 -3.503906 used to round to -5, rounds to -4 with this patch.

Fixes: f0fe3cd7e1 (intel_pstate: Correct rounding in busy calculation)
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-06-17 22:57:40 +02:00
Stratos Karafotis
830bcac4e4 cpufreq: intel_pstate: Remove duplicate CPU ID check
We check the CPU ID during driver init. There is no need
to do it again per logical CPU initialization.

So, remove the duplicate check.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-06-10 22:49:41 +02:00
Rafael J. Wysocki
5ece239918 Merge back earlier cpufreq material.
Conflicts:
	arch/mips/loongson/lemote-2f/clock.c
	drivers/cpufreq/intel_pstate.c
2014-06-03 15:03:27 +02:00
Doug Smythies
bf8102228a intel_pstate: Improve initial busy calculation
This change makes the busy calculation using 64 bit math which prevents
overflow for large values of aperf/mperf.

Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-06-02 12:46:01 +02:00
Dirk Brandewie
c4ee841f60 intel_pstate: add sample time scaling
The PID assumes that samples are of equal time, which for a deferable
timers this is not true when the system goes idle.  This causes the
PID to take a long time to converge to the min P state and depending
on the pattern of the idle load can make the P state appear stuck.

The hold-off value of three sample times before using the scaling is
to give a grace period for applications that have high performance
requirements and spend a lot of time idle,  The poster child for this
behavior is the ffmpeg benchmark in the Phoronix test suite.

Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-06-02 12:45:05 +02:00
Dirk Brandewie
f0fe3cd7e1 intel_pstate: Correct rounding in busy calculation
Changing to fixed point math throughout the busy calculation in
commit e66c1768 (Change busy calculation to use fixed point
math.) Introduced some inaccuracies by rounding the busy value at two
points in the calculation.  This change removes roundings and moves
the rounding to the output of the PID where the calculations are
complete and the value returned as an integer.

Fixes: e66c176837 (intel_pstate: Change busy calculation to use fixed point math.)
Reported-by: Doug Smythies <dsmythies@telus.net>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-06-02 12:44:59 +02:00