x86/mce: Add support for deferred errors on AMD

Deferred errors indicate error conditions that were not corrected but
whose data has not been consumed yet. They require no action from
S/W (or action is optional). These errors provide info about a latent
uncorrectable MCE that can occur when poisoned data is consumed by the
processor.

Newer AMD processors can generate deferred errors and can be configured
to generate APIC interrupts on such events.

SUCCOR stands for S/W UnCorrectable error COntainment and Recovery.
It indicates support for data poisoning in HW and deferred error
interrupts.

Add a new bitfield to mce_vendor_flags for this. We use it to verify
the presence of deferred error interrupts before we enable them in
mce_amd.c.

While at it, clarify the comments in mce_vendor_flags to indicate how
the bitfields are used.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1430913538-1415-4-git-send-email-Aravind.Gopalakrishnan@amd.com
[ beef up commit message, do CPUID(8000_0007) only once. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Aravind Gopalakrishnan 2015-05-06 06:58:55 -05:00 committed by Borislav Petkov
parent 6e6e746e33
commit 7559e13fb4
2 changed files with 21 additions and 4 deletions


@@ -117,8 +117,19 @@ struct mca_config {
 };
 struct mce_vendor_flags {
-	__u64 overflow_recov : 1, /* cpuid_ebx(80000007) */
-	      __reserved_0 : 63;
+	/*
+	 * overflow recovery cpuid bit indicates that overflow
+	 * conditions are not fatal
+	 */
+	__u64 overflow_recov : 1,
+	/*
+	 * SUCCOR stands for S/W UnCorrectable error COntainment
+	 * and Recovery. It indicates support for data poisoning
+	 * in HW and deferred error interrupts.
+	 */
+	      succor : 1,
+	      __reserved_0 : 62;
 };
 extern struct mce_vendor_flags mce_flags;
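
The layout change above takes one bit from the reserved field (63 becomes 62) so that the flags still occupy a single 64-bit word. A minimal userspace sketch of that layout follows. It is not the kernel definition: uint64_t stands in for __u64, and the names mce_vendor_flags_sketch and flags_with_succor are hypothetical. Bitfields wider than int are an implementation-defined extension, which GCC and Clang are assumed to accept here.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel struct (assumption: uint64_t replaces __u64). */
struct mce_vendor_flags_sketch {
	uint64_t overflow_recov : 1,	/* overflow conditions are not fatal */
		 succor         : 1,	/* data poisoning and deferred error interrupts */
		 reserved_0     : 62;	/* remaining bits, left at zero */
};

/* 1 + 1 + 62 bits: the flags must still pack into one 64-bit word. */
_Static_assert(sizeof(struct mce_vendor_flags_sketch) == sizeof(uint64_t),
	       "vendor flags no longer fit in 64 bits");

/* Hypothetical helper: return flags with only the succor bit set. */
static struct mce_vendor_flags_sketch flags_with_succor(void)
{
	struct mce_vendor_flags_sketch f = { 0 };

	f.succor = 1;
	return f;
}
```

Calling flags_with_succor() yields a value whose succor field reads 1 while overflow_recov stays 0, which shows that the new field does not disturb the existing one.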


@@ -1637,10 +1637,16 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 		mce_intel_feature_init(c);
 		mce_adjust_timer = cmci_intel_adjust_timer;
 		break;
-	case X86_VENDOR_AMD:
+	case X86_VENDOR_AMD: {
+		u32 ebx = cpuid_ebx(0x80000007);
 		mce_amd_feature_init(c);
-		mce_flags.overflow_recov = cpuid_ebx(0x80000007) & 0x1;
+		mce_flags.overflow_recov = !!(ebx & BIT(0));
+		mce_flags.succor = !!(ebx & BIT(1));
 		break;
+	}
 	default:
 		break;
 	}
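
The decoding step in the hunk above can be tried outside the kernel with a small sketch. It takes the EBX value as a plain argument instead of executing CPUID, so it runs on any machine. The names SKETCH_BIT, ras_flags_sketch and decode_ras_ebx are hypothetical and exist only for this illustration; the bit positions (bit 0 for overflow recovery, bit 1 for SUCCOR) are the ones used in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's BIT() macro. */
#define SKETCH_BIT(n)	(UINT32_C(1) << (n))

/* Hypothetical flags holder with 1-bit fields, mirroring the patch's bitfields. */
struct ras_flags_sketch {
	unsigned int overflow_recov : 1;
	unsigned int succor         : 1;
};

/*
 * Decode an EBX value as returned by CPUID leaf 0x80000007.
 * The value is passed in by the caller; no CPUID instruction is executed.
 */
static struct ras_flags_sketch decode_ras_ebx(uint32_t ebx)
{
	struct ras_flags_sketch f = { 0 };

	/* !! collapses any nonzero masked value to exactly 1. */
	f.overflow_recov = !!(ebx & SKETCH_BIT(0));
	f.succor         = !!(ebx & SKETCH_BIT(1));
	return f;
}
```

The double negation matters for bit 1: the masked value is 2, and storing 2 directly into a 1-bit unsigned field would truncate it to 0. For example, decode_ras_ebx(0x2) reports succor set and overflow_recov clear, while decode_ras_ebx(0x3) reports both set.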