There are at least three "max" entries, which specify the max voltage.
Because they are actually normal voltage map entries, they can also be
affected by the temperature.
Nvidia respects those entries and if they get changed, nvidia uses the
lower voltage from all three.
We shouldn't exceed those voltages at any given time.
v2: State what those entries do in the source.
v3: Add the third max entry.
v5: Better describe the entries.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nvkm_volt_map_min is a copy of nvkm_volt_map, which always returns the
lowest possible voltage for a cstate.
nvkm_volt_map will get a temperature parameter there later and also fix
the voltage calculation, so that this functions will be completly
different later.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There is a field in the voltage table which tells us if the VIDs are taken
from the entries or calculated through the header.
v2: Don't break older versions.
v5: Reverse flag name.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some Fermi+ GPUs specify VID information via voltage table entries, rather
than describing them as a range in the header.
The mask may be bigger than 0x1fffff, but this value is already >2V, so it
will be fine for now.
This patch fixes volting issues on those cards enabling them to switch
cstates.
v6: rework message
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Previously we parsed that table a bit wrong:
1. The entry layout depends on the sensor type used.
2. We have all resitors in one entry for the INA3221.
3. The config is already included in the vbios.
This commit addresses that issue and with that we should be able to read
out the right power consumption for every GPU with a INA209, INA219 and
INA3221.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This can show up on SPARC or other architectures that don't handle
unaligned accesses. The kernel normally fixes these up, but it shouldn't
have to.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96836
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gk20a_ibus_init_ibus_ring() can be called from gk20a_ibus_intr(), in
non-interruptible context. Replace use of usleep_range() with udelay().
Reported-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Due to the GPU preventing us from touching NV_PLTCG_LTCS_LTSS_CBC_BASE,
we cannot provide CBC/ZBC support without signed PMU firmware to handle
the task for us...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
GFxxx/GM1xx support the selection of 64/128KiB big pages globally.
GM2xx supports the same, as well as another mode where the page size
can be selected per-instance.
We default to 128KiB pages (With per-instance for GM200, but the current
code selects 128KiB there already) as the MMU code isn't currently able
to handle otherwise.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Makes common the code that was previously used by the PMU table parsing,
as it appears other tables need this too.
Not much of an idea what this is all about...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Corresponds with GT215. Don't rely on the lock test logic being
unconditionally enabled, and disable test logic when done (presumably
to save power).
v2: Remove warning, nvkm_msec already warns on time-out
Signed-off-by: Roy Spliet <nouveau@spliet.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Defer the loading of firmware files to the chip-specific part of secure
boot. This allows implementations to retry loading firmware if the first
attempt failed ; for the GM200 implementation, this happens when trying
to reset a falcon, typically in reaction to GR init.
Firmware loading may fail for a variety of reasons, such as the
filesystem where they reside not being ready at init time. This new
behavior allows GR to be initialized the next time we try to use it if
the firmware has become available.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make it possible to call gm20x_secboot_prepare_blobs() several times
after either success or failure without re-building already existing
blobs. The function will now try to load firmware files that have
previously failed before returning success.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some members were documented in the wrong structure.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We shouldn't set voltages below the min or above the max voltage the gpu is
able to set, so save the range for future lookups.
Signed-off-by: Karol Herbst <karolherbst@gmail.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This patch adds support for advanced features supported by the
Noise-Aware PLL of Maxwell. Glitchless switch allows the PL field to be
updated without disabling the PLL first if the SYNC_MODE bit of the CFG
register is set.
More significantly, DFS allows the PLL to monitor the actual input
voltage and to dynamically lower the output frequency accordingly. This
allows the clock to be more tolerant of lower voltages.
These improvements are only supported for Tegra speedos >= 1.
Also add the voltage table that is suitable for GM20B's NAPLL. This
change needs to be done atomically for the right voltages to be used by
the clock driver.
v2. Fix build on non-Tegra platforms
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Strip the _ prefix off the gk20a clock constructor.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Split the MNP programming function into two functions for the cases
where we allow sliding or not, instead of making it take a parameter for
this. This results in less conditionals in the code and makes it easier
to read.
Also make the MNP programming functions take the PLL parameters as
arguments, and move bits of code to more relevant places (previous
programming tended to be just-in-time, which added more conditionnals in
the code).
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Use a dedicated function instead of always calculating n_lo on the fly.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make functions manipulating PLL settings take them as an argument,
instead of assuming we want to work on the copy in the gk20a_clk
structure. This makes these functions more flexible, which we will need
in GM20B.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add relevant functions to work with the gk20a_pll structure and use them
where they ought to be instead of directly manipulating registers.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Move variables declarations to their actual scope of use, and simplify
code a bit.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Slide setup needs to be performed only once, during init. Also
use the proper parameters for different clock speeds.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Chips may be characterized for a minimum voltage. Support this extra
parameter and select the appropriate minimum voltage for the detected
GPU speedo.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Strip the _ prefix off the gk20a volt constructor.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Give a name to this constant so we at least get an idea of what it is
for.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Nobody else is using these, so make them private.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There are cases where subdevs need to perform additonal actions around
the master reset, so we want to expost the operations separately.
This commit also adds a flag to the NV_PMC_ENABLE bitfield definitions
which allow skipping the automatic reset() called from core/subdev.c.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We always want a equal or higher voltage than the requested ones, otherwise
nouveau undervolts.
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
With the addition of PTOP-specified reset bits, it makes more sense to
move the definitions here rather than in individual subdev
implementations.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v2: rename ina209/ina219 read function
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v2: add list_del call, reword error message
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v2: add list_del calls
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When we start communicating with the pmu a bit more, the current code is
a real issue. I encountered a dead lock here, while testing my dynamic
reclocking code
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In case of successful suspend, devinit will have to be run and this is
the behavior currently hardcoded. However, as FD bug 94725 suggests,
there might be cases where runtime suspend leaves the GPU powered, and
in such cases devinit should not be run on resume.
On GF100+ we have a reliable way to know whether we need to run devinit.
Use it instead of blindly trusting the flag set by nvkm_devinit_fini().
The code around the NvForcePost also needs to be slightly reworked in
order to keep working.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Suggested-by: Dave Airlie <airlied@redhat.com>
Suggested-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add a basic clock driver that reuses the GK20A logic.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make functions/structures that the GM20B driver will reuse public.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Err on the safe side by setting the lowest frequency (and thus voltage)
during device init.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This allows to instanciate drivers that use the same logic as gk20a with
different parameters.
Add a constructor function to allow other chips that inherit from this
clock to easily initialize its members
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
pl_to_div may be done differently depending on the chip. Abstract this
operation so the same logic can be reused for them as well.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This allows us to read them using one single function and will be handy
to the GM20B driver.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Most users are probably not interested in this information.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Only restore the 1:1 divider if it is not set already. Also use the
proper masks for this operation and add a second write as done in the
Android code.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
n_lo is used if we are going to slide. Compute it only if that condition
succeeds to avoid confusion about future usage of this computation.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fix the mask specified to switch to VCO mode was given as an (incorrect)
immediate value. Although the side-effect happens to be the same, this
is clearly incorrect.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>