This structure does not need to be shared anymore.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This allows the bootloader descriptor generation code to not rely on
specialized ls_ucode_img structures, making it reusable in other
instances.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Offsets were not properly computed. This went unnoticed because we are
only using one app for now.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Using 32-bit integers would trim the WPR address if it is allocated above 4GB.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
A WPR region smaller than 256K will result in secure boot failure.
Adjust the minimal size.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The WPR address parameter of the ls_write_wpr hook was defined as a u32,
which will very likely overflow on boards with more than 4GB VRAM.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Check at contruction time that we have support for all the LS firmwares
asked by the caller.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Remove a leftover that became obsolete with the falcon interface.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some PMU implementations (in particular the ones managed by secure
boot) may not have a reset() hook. Make sure we don't crash in that
case.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make nvkm_secboot_falcon_name publicly visible as other subdevs will
need to use it for debug messages.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ideally we'd be able to keep these at a more obvious error level, as
they're a good indication of us doing something wrong.
However, NVIDIA's FECS/GPCCS firmware touches registers that trigger
priv ring faults, and we can't do anything to fix that ourselves due
to the need for them to be signed by NVIDIA.
This issue was reported a while back, but hasn't been fixed, so, for
now we will hide the messages to prevent spamming Optimus users with
messages whenever the NVIDIA GPU is powered off and on again.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
704a6c008b7942bb7f30bb43d2a6bcad7f543662 broke pci msi rearm for g92 GPUs.
g92 needs the nv46_pci_msi_rearm, where g94+ gpus used nv40_pci_msi_rearm.
Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com>
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
This seems to be absolutely necessary for a lot of NV40.
Reported-by: gsgf on IRC/freenode
Signed-off-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v2: Set entry to 0xff if not found
Add cap entry for ver 0x30 tables
Rework to fix memory leak
v3: More error checks
Simplify check for invalid entries
v4: disable for ver 0x10 for now
move assignments after the second last return
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We never have any need for a double-linked list here, and as there's
generally a large number of these objects, replace it with a single-
linked list in order to save some memory.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The halt interrupt must be cleared after ACR is run, otherwise the LS
PMU firmware will not be able to run.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When the PMU firmware is present, the falcons it manages need to have
the lazy-bootstrap flag of their WPR header set so the ACR does not boot
them. Add support for this.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Generate the WPR descriptor closer to what RM does. In particular, set
the expected masks, and only set the ucode members on Tegra.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Set a default error value in the mailbox 0 register so we can catch
cases where the secure boot binary fails early without being able to
report anything.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Since DMEM was initialized to zero, these fields went unnoticed. Add
them for safety.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Perform the zeroing of BL descriptors in the caller function instead of
trusting each generator will do it. This could avoid a few pulled hairs.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The WPR and LSB headers, used to generate the LS blob, may have a
different layout and sizes depending on the driver version they come
from. Abstract them and confine their use to driver-specific code.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This was used only locally to one function and can be replaced by ad-hoc
variables.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
ucode_header is not used anywhere, so just get rid of it.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make sure we are not disturbed by spurious interrupts, as we poll the
halt bit anyway.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Split the reset function into more meaningful and reusable ones.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add a flag that can be set when declaring how a LS firmware should be
loaded. This allows us to remove falcon-specific code in the loader.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Split the act of building the ACR blob from firmware files from the rest
of the (chip-dependent) secure boot logic. ACR logic is moved into
acr_rxxx.c files, where rxxx corresponds to the compatible release of
the NVIDIA driver. At the moment r352 and r361 are supported since
firmwares have been released for these versions. Some abstractions are
added on top of r352 so r361 can easily be implemented on top of it by
just overriding a few hooks.
This split makes it possible and easy to reuse the same ACR version on
different chips. It also hopefully makes the code much more readable as
the different secure boot logics are separated. As more chips and
firmware versions will be supported, this is a necessity to not get lost
in code that is already quite complex.
This is a big commit, but it essentially moves things around (and split
the nvkm_secboot structure into two, nvkm_secboot and nvkm_acr). Code
semantics should not be affected.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Use the HS hook to completely generate the HS BL descriptor, similarly
to what is done in the LS hook, instead of (arbitrarily) using the
acr_v1 format as an intermediate.
This allows us to make the bootloader descriptor structures private to
each implementation, resulting in a cleaner an more consistent design.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Secure firmwares provided by NVIDIA will follow the same overall
principle, but may slightly differ in format, or not use the same
bootloader descriptor even on the same chip. In order to handle
this as gracefully as possible, turn the LS firmware functions into
hooks that can be overloaded as needed.
The current hooks cover the external firmware loading as well as the
bootloader descriptor generation.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This hook can be removed if the function writing the HS
descriptor is aware of WPR settings. Let's do that as it allows us to
make the ACR descriptor structure private and save some code.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The init() hook is called by the subdev's oneinit(). Rename it
accordingly to avoid confusion about the lifetime of objects allocated
in it.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Since GR has moved to using the falcon library to start the falcons,
this function is not needed anymore.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Use the falcon library functions in secure boot. This removes a lot of
code and makes the secure boot flow easier to understand as no register
is directly accessed.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
These functions should use the nvkm_secboot_falcon enum. Fix this.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add a dummy PMU device so the PMU falcon is instanciated and can be used
by secure boot.
We could reuse gk20a's implementation here, but it would fight with
secboot over PMU falcon's ownership and secboot will reset the PMU,
preventing it from operating afterwards. Proper handout between secboot
and pmu is coming along with the actual gm20b PMU implementation, so
use this as a temporary solution.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some functions always succeed - change their return type to void and
remove the error-handling code in their caller.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Use the PMU constructor so that all base members (in particular the
falcon instance) are initialized properly.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Have an instance of nvkm_falcon in the PMU structure, ready to be used
by other subdevs (i.e. secboot).
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add a PMU constructor so implementations that extend the nvkm_pmu
structure can have all base members properly initialized.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add a function that allows us to query whether a given subdev is
currently enabled or not.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
I suspect the version bump is just to signify that the table now specifies
pad macro/links instead of SOR/sublinks.
For our usage of the table, just recognising the new version is enough.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This sequence is incorrect for GP102/GP104 boards. This is now being
handled correctly by the PMU subdev during preinit();
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gm20b's FB has the same capabilities as gm200, minus the ability to
allocate RAM. Create a device that reflects this instead of re-using the
gk20a device which may be incorrect.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-By: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gk20a's FB is not special compared to other Kepler chips, besides the
fact it does not have VRAM. Use the regular gf100 hooks instead of the
incomplete versions we rewrote.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-By: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The gf100 constructor should be called, otherwise we will allocate a
smaller object than expected. This was without effect so far because
gk20a did not allocate a page, but with gf100's page allocation moved
to the oneinit() hook this problem has become apparent.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The reset hook of pmu_func is never called, and gt215 was the only chip
to implement. Remove this dead code.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/nouveau/nvkm/subdev/bios/fan.c:29:1: warning: no previous prototype for 'nvbios_fan_table' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/bios/fan.c:56:1: warning: no previous prototype for 'nvbios_fan_entry' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c:184:1: warning: no previous prototype for 'gt215_clk_info' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:99:1: warning: no previous prototype for 'gt215_link_train_calc' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:153:1: warning: no previous prototype for 'gt215_link_train' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:271:1: warning: no previous prototype for 'gt215_link_train_init' [-Wmissing-prototypes]
....
In fact, both functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
So this patch marks these functions with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We get a few warnings when building kernel with W=1:
drivers/gpu/drm/nouveau/nvkm/core/firmware.c:34:1: warning: no previous prototype for 'nvkm_firmware_get' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/core/firmware.c:58:1: warning: no previous prototype for 'nvkm_firmware_put' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/sddr3.c:69:1: warning: no previous prototype for 'nvkm_sddr3_calc' [-Wmissing-prototypes]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/sddr2.c:60:1: warning: no previous prototype for 'nvkm_sddr2_calc' [-Wmissing-prototypes]
....
In fact, these functions are declared in
drivers/gpu/drm/nouveau/include/nvkm/core/firmware.h
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ram.h
drivers/gpu/drm/nouveau/nvkm/subdev/volt/priv.h
drivers/gpu/drm/nouveau/nvkm/engine/gr/nv50.h
drivers/gpu/drm/nouveau/dispnv04/disp.h.
So this patch adds missing header dependencies.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This fixes (works around?) link training failures seen on (at least)
the Lenovo P50's internal panel.
It's also an important fix on the same system for MST support on the
dock. Sometimes, right after receiving an IRQ from the sink, there's
an error bit (SINKSTAT_ERR) set in the DPAUX registers before we've
even attempted a transaction.
v2. Fixed regression on passive DP->DVI adapters.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The 100c08 scratch page is mapped using dma_map_page() before the TTM
layer has had a chance to set the DMA mask. This means we are still
running with the default of 32 when this code executes, and this causes
problems for platforms with no memory below 4 GB (such as AMD Seattle)
So move the dma_map_page() to the .oneinit hook, which executes after the
DMA mask has been set.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The 100c10 scratch page is mapped using dma_map_page() before the TTM
layer has had a chance to set the DMA mask. This means we are still
running with the default of 32 when this code executes, and this causes
problems for platforms with no memory below 4 GB (such as AMD Seattle)
So move the dma_map_page() to the .oneinit hook, which executes after the
DMA mask has been set.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Doing direct 64 bit divisions in kernel code leads to references to
undefined symbols on 32 bit architectures. Replace such divisions with
calls to div64_s64 to make the module usable on 32 bit archs.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This enables memory reclocking on Maxwell. Sadly without a PMU firmware it
is useless for gm20x gpus.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
I'm quite sure that those coefficients are real close, because while
testing the biggest error compared to nvidia was around -1.5% (biggest
error with right coefficients is 12.5mV / 600mV = 2%).
These coefficients were REed by modifing the voltage map entries and by
calculating the set voltage back until I was able to forecast which voltage
nvidia sets for a given voltage map entry.
With these formulars I am able to precisely predict at which exact
temperature Nvidia down- or upvolts due to a changed therm reading.
That's why I am quite sure these are right, or at least really really
close.
v4: Use better coefficients and speedo.
v5: Add error message when speedo is missing.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v5: Squashed speedo related commits.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Since gf100 we need a speedo value for calculating the voltage. The readout
will be added in a later patch.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Depending on the value a different formular is used to calculated the
voltage for this entry.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If we calculate the voltage in the table right, we get all kinds of values,
which never fit the hardware steps, so we use the closest higher value the
hardware can do.
v3: Simplify the implementation.
v5: Initialize best_err with volt->max_uv.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
0: base clock from the vbios is max clock (default)
1: boost only to boost clock from the vbios
2: boost to max clock available
v2: Moved into nvkm_cstate_valid.
v4: Check the existence of the clocks before limiting.
v5: Default to boost level 0.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This table contains three important clocks:
base clock: This is the non boosted max clock.
tdp clock: The clock at wich the vbios guarentees the TDP won't ever be
exceeded at max load (seems to be always the same as the base
clock, but behaves differently).
boost clock: The avg clock the gpu will stay boosted to. It doesn't seem to
affect the behaviour of the nvidia driver at all though.
v2: Make clear that base/boost/tdp fields are ids.
v5: Rename Base clock to vpstate.
Make vbios pointers 32bit.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We should never allow to select a cstate which current voltage (depending
on the temperature) is higher than
1. the max volt entries in the voltage map table.
2. what tha gpu actually can volt to.
v3: Use find_best for all cstates before actually trying.
Add nvkm_cstate_get function to get cstate by index.
v5: Cstates with voltages lower then min_uv are valid.
Move nvkm_cstate_get into the previous commit.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Now the cstatei parameter can be used of the nvkm_cstate_prog function to
select a specific cstate.
v5: Make a constant for the magic value.
Use list_last_entry.
Add nvkm_cstate_get here instead of in the next commit.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The voltage entries actually may map to a different voltage depending on
the current temperature.
v2: Only read the temperature when actually needed.
v5: Be smarter about using max().
Don't read the temperature anymore.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This way other subdevs can notify the clk subdev about temperature changes
without the need of clk to poll that value.
Also make this function safe to be called from an interrupt handler.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
It is better to read out the id out of the cstate struct directly instead
of iterating over the list of cstates over and over again. Especially when
we start saving pointers to a nvkm_cstate struct, it makes things easier.
v5: Rename field to id.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Each pstate has its own voltage map entry like each cstate has.
The voltages of those entries act as a floor value for the currently
selected pstate and nvidia never sets a voltage below them.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There are at least three "max" entries, which specify the max voltage.
Because they are actually normal voltage map entries, they can also be
affected by the temperature.
Nvidia respects those entries and if they get changed, nvidia uses the
lower voltage from all three.
We shouldn't exceed those voltages at any given time.
v2: State what those entries do in the source.
v3: Add the third max entry.
v5: Better describe the entries.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nvkm_volt_map_min is a copy of nvkm_volt_map, which always returns the
lowest possible voltage for a cstate.
nvkm_volt_map will get a temperature parameter there later and also fix
the voltage calculation, so that this functions will be completly
different later.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There is a field in the voltage table which tells us if the VIDs are taken
from the entries or calculated through the header.
v2: Don't break older versions.
v5: Reverse flag name.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some Fermi+ GPUs specify VID information via voltage table entries, rather
than describing them as a range in the header.
The mask may be bigger than 0x1fffff, but this value is already >2V, so it
will be fine for now.
This patch fixes volting issues on those cards enabling them to switch
cstates.
v6: rework message
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Previously we parsed that table a bit wrong:
1. The entry layout depends on the sensor type used.
2. We have all resitors in one entry for the INA3221.
3. The config is already included in the vbios.
This commit addresses that issue and with that we should be able to read
out the right power consumption for every GPU with a INA209, INA219 and
INA3221.
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This can show up on SPARC or other architectures that don't handle
unaligned accesses. The kernel normally fixes these up, but it shouldn't
have to.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96836
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gk20a_ibus_init_ibus_ring() can be called from gk20a_ibus_intr(), in
non-interruptible context. Replace use of usleep_range() with udelay().
Reported-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Due to the GPU preventing us from touching NV_PLTCG_LTCS_LTSS_CBC_BASE,
we cannot provide CBC/ZBC support without signed PMU firmware to handle
the task for us...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
GFxxx/GM1xx support the selection of 64/128KiB big pages globally.
GM2xx supports the same, as well as another mode where the page size
can be selected per-instance.
We default to 128KiB pages (With per-instance for GM200, but the current
code selects 128KiB there already) as the MMU code isn't currently able
to handle otherwise.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Makes common the code that was previously used by the PMU table parsing,
as it appears other tables need this too.
Not much of an idea what this is all about...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Corresponds with GT215. Don't rely on the lock test logic being
unconditionally enabled, and disable test logic when done (presumably
to save power).
v2: Remove warning, nvkm_msec already warns on time-out
Signed-off-by: Roy Spliet <nouveau@spliet.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Defer the loading of firmware files to the chip-specific part of secure
boot. This allows implementations to retry loading firmware if the first
attempt failed ; for the GM200 implementation, this happens when trying
to reset a falcon, typically in reaction to GR init.
Firmware loading may fail for a variety of reasons, such as the
filesystem where they reside not being ready at init time. This new
behavior allows GR to be initialized the next time we try to use it if
the firmware has become available.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make it possible to call gm20x_secboot_prepare_blobs() several times
after either success or failure without re-building already existing
blobs. The function will now try to load firmware files that have
previously failed before returning success.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some members were documented in the wrong structure.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We shouldn't set voltages below the min or above the max voltage the gpu is
able to set, so save the range for future lookups.
Signed-off-by: Karol Herbst <karolherbst@gmail.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This patch adds support for advanced features supported by the
Noise-Aware PLL of Maxwell. Glitchless switch allows the PL field to be
updated without disabling the PLL first if the SYNC_MODE bit of the CFG
register is set.
More significantly, DFS allows the PLL to monitor the actual input
voltage and to dynamically lower the output frequency accordingly. This
allows the clock to be more tolerant of lower voltages.
These improvements are only supported for Tegra speedos >= 1.
Also add the voltage table that is suitable for GM20B's NAPLL. This
change needs to be done atomically for the right voltages to be used by
the clock driver.
v2. Fix build on non-Tegra platforms
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Strip the _ prefix off the gk20a clock constructor.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Split the MNP programming function into two functions for the cases
where we allow sliding or not, instead of making it take a parameter for
this. This results in less conditionals in the code and makes it easier
to read.
Also make the MNP programming functions take the PLL parameters as
arguments, and move bits of code to more relevant places (previous
programming tended to be just-in-time, which added more conditionnals in
the code).
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Use a dedicated function instead of always calculating n_lo on the fly.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make functions manipulating PLL settings take them as an argument,
instead of assuming we want to work on the copy in the gk20a_clk
structure. This makes these functions more flexible, which we will need
in GM20B.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add relevant functions to work with the gk20a_pll structure and use them
where they ought to be instead of directly manipulating registers.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Move variables declarations to their actual scope of use, and simplify
code a bit.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Slide setup needs to be performed only once, during init. Also
use the proper parameters for different clock speeds.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Chips may be characterized for a minimum voltage. Support this extra
parameter and select the appropriate minimum voltage for the detected
GPU speedo.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Strip the _ prefix off the gk20a volt constructor.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Give a name to this constant so we at least get an idea of what it is
for.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Nobody else is using these, so make them private.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There are cases where subdevs need to perform additonal actions around
the master reset, so we want to expost the operations separately.
This commit also adds a flag to the NV_PMC_ENABLE bitfield definitions
which allow skipping the automatic reset() called from core/subdev.c.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We always want a equal or higher voltage than the requested ones, otherwise
nouveau undervolts.
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
With the addition of PTOP-specified reset bits, it makes more sense to
move the definitions here rather than in individual subdev
implementations.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v2: rename ina209/ina219 read function
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v2: add list_del call, reword error message
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
v2: add list_del calls
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When we start communicating with the pmu a bit more, the current code is
a real issue. I encountered a dead lock here, while testing my dynamic
reclocking code
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In case of successful suspend, devinit will have to be run and this is
the behavior currently hardcoded. However, as FD bug 94725 suggests,
there might be cases where runtime suspend leaves the GPU powered, and
in such cases devinit should not be run on resume.
On GF100+ we have a reliable way to know whether we need to run devinit.
Use it instead of blindly trusting the flag set by nvkm_devinit_fini().
The code around the NvForcePost also needs to be slightly reworked in
order to keep working.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Suggested-by: Dave Airlie <airlied@redhat.com>
Suggested-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add a basic clock driver that reuses the GK20A logic.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make functions/structures that the GM20B driver will reuse public.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Err on the safe side by setting the lowest frequency (and thus voltage)
during device init.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This allows to instanciate drivers that use the same logic as gk20a with
different parameters.
Add a constructor function to allow other chips that inherit from this
clock to easily initialize its members
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
pl_to_div may be done differently depending on the chip. Abstract this
operation so the same logic can be reused for them as well.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This allows us to read them using one single function and will be handy
to the GM20B driver.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Most users are probably not interested in this information.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Only restore the 1:1 divider if it is not set already. Also use the
proper masks for this operation and add a second write as done in the
Android code.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
n_lo is used if we are going to slide. Compute it only if that condition
succeeds to avoid confusion about future usage of this computation.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fix the mask specified to switch to VCO mode was given as an (incorrect)
immediate value. Although the side-effect happens to be the same, this
is clearly incorrect.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gk20a_pllg_disable() is only used in the context of gk20a_clk_fini().
Move its body there and rename _gk20a_pllg_enable() and
_gk20a_pllg_disable() to non-underscored versions.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Move some variables declarations to the scope where they are actually
used to make the code easier to follow.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Perform computations in Khz instead of Mhz for better precision.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add basic GM20B volt driver that reuses the GK20A logic.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Split the constructor function so we can reuse the same logic in other
chips.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The CVB calculation and voltage setting functions can be reused for the
future chips. So move the declaration to gk20a.h.
Signed-off-by: Vince Hsu <vinceh@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
the macro deals with target specific differences and so we should always use
this
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
on gk208+ we can simply mov 32bits, so we should have a single mov there
v2: use or operator instead of add
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When using the DMA-API for instmem, we may obtain a write-combined
mapping. For such cases, add a write barrier in
gk20a_instobj_release_dma() to make sure that all writes have reached
memory at this time.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
LTC operations timeout was set to 2ms, which may be too low for devices
that run at very low clocks (e.g. GM20B) and trigger timeout messages.
Set the timeout to the default 2s. Also remove the redundant error
messages since nvkm_wait_msec() will already display a warning.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
based on Martins initial work
v3: fix ina2x9 calculations
v4: don't kmalloc(0), fix the lsb/pga stuff
v5: add a field to tell if the power reading may be invalid
add nkvm_iccsense_read_all function
check for the device on the i2c bus
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Karol Herbst:
v4: don't kmalloc(0)
v5: stricter validation
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Add secure boot support for the GM20B chip found in Tegra X1. Secure
boot on Tegra works slightly differently from desktop, notably in the
way the WPR region is set up.
In addition, the firmware bootloaders use a slightly different header
format.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add secure-boot for the dGPU set of GM20X chips, using the PMU as the
high-secure falcon.
This work is based on Deepak Goyal's initial port of Secure Boot to
Nouveau.
v2. use proper memory target function
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
On GM200 and later GPUs, firmware for some essential falcons (notably
GR ones) must be authenticated by a NVIDIA-produced signature and
loaded by a high-secure falcon in order to be able to access privileged
registers, in a process known as Secure Boot.
Secure Boot requires building a binary blob containing the firmwares
and signatures of the falcons to be loaded. This blob is then given to
a high-secure falcon running a signed loader firmware that copies the
blob into a write-protected region, checks that the signatures are
valid, and finally loads the verified firmware into the managed falcons
and switches them to privileged mode.
This patch adds infrastructure code to support this process on chips
that require it.
v2:
- The IRQ mask of the PMU falcon was left - replace it with the proper
irq_mask variable.
- The falcon reset procedure expecting a falcon in an initialized state,
which was accidentally provided by the PMU subdev. Make sure that
secboot can manage the falcon on its own.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Upon encountering an unknown condition code, the script interpreter
is supposed to skip 'size' bytes and continue at the next devinit
token.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
It is not advisable to perform devinit if it has already been done.
VBIOS will very likely have invoked devinit if the GPU is the primary
graphics device, but there is no accurate way to detect this fact yet.
This patch adds such a method for gf100 and later chips, by means of the
NV_PTOP_SCRATCH1_DEVINIT_COMPLETED bit. This bit is set to 1 by devinit,
and reset to 0 when the GPU is powered.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We never use any nv50-specific member in this nv50_devinit_preinit().
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Patch "ltc/gm107: use nvkm_mask to set cbc_ctrl1" sets the 3rd bit
of the CTRL1 register instead of writing it entirely in
gm107_ltc_cbc_clear(). As a counterpart, gm107_ltc_cbc_wait() must also
be modified to wait on that single bit only, otherwise a timeout may
occur if some other bit of that register is set. This happened at least
on GM206 when running glmark2-drm.
While we are at it, use the more compact nvkm_wait_msec() to wait for
the bit to clear.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The asm-generic tree this time contains one series from Nicolas Pitre
that makes the optimized do_div() implementation from the ARM
architecture available to all architectures. This also adds stricter
type checking for callers of do_div, which has uncovered a number
of bugs in existing code, and fixes up the ones we have found.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAVqARKWCrR//JCVInAQJrBhAAlwZL0IiVGFfDXWtvQGOm+yC5j4vdIhMf
1scsvRbk3ln1xUk5+NM61NpxbQotro78K5HxFZFhaVGUTbbFXM9w2VZSyI8ZaGAJ
Od6lBUUyLQmzlbHDJ3v/zrZn8Up7qZlRApmXcbUVDtssfnEfKk4xA2RG9JwIMS1c
uZMvnD7N3P9vxDPl+CsYlB2osi6Yks3VQ1tXYe2z6siO+H67zHaF08+ls7fbsd3d
oyKjZqlaQ02MIOr+AdR0h9iKyJJ6SXT0DQlsMyzB6aBWmeBCNLNALNIiukDk9Qc1
VV3sF1MOS3LtfU2TeOx4Na7hcd2iC6WYLb271iApO2Ww7t16n+de3i6AipZxLUJ0
08jiRlisTzUhXDobRSqI3mcQlxrB5UGfyblab2z/MqGGmIGJSPPRdTPRQUgi0ZKg
jksSmsaPwOQp64FhTgECLJthlYX7h6ULjkvJ9h60gZHa4jhGZbGPeMwHPf1uSm95
EvQE971Ssgm4jwhvxZ/kt1ruuZI/fxxG1Qfw+C25QkXZGKye2nB+icLWeMwz+FXG
HLqkmaAjasf5MAV1GiK8U6zoC6bCOLU0Lea83hOwRPZ999v3Nym1giSatNv4/pB+
QmkXRvFi93cdQ643l7xcUEDT2zpk4pogF3xREiBhyaXtqLlT7pPMKsBQOgdWvFuu
Ou0ZbEAwIVo=
=4psa
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"The asm-generic tree this time contains one series from Nicolas Pitre
that makes the optimized do_div() implementation from the ARM
architecture available to all architectures.
This also adds stricter type checking for callers of do_div, which has
uncovered a number of bugs in existing code, and fixes up the ones we
have found"
* tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
ARM: asm/div64.h: adjust to generic codde
__div64_32(): make it overridable at compile time
__div64_const32(): abstract out the actual 128-bit cross product code
do_div(): generic optimization for constant divisor on 32-bit machines
div64.h: optimize do_div() for power-of-two constant divisors
mtd/sm_ftl.c: fix wrong do_div() usage
drm/mgag200/mgag200_mode.c: fix wrong do_div() usage
hid-sensor-hub.c: fix wrong do_div() usage
ti/fapll: fix wrong do_div() usage
ti/clkt_dpll: fix wrong do_div() usage
tegra/clk-divider: fix wrong do_div() usage
imx/clk-pllv2: fix wrong do_div() usage
imx/clk-pllv1: fix wrong do_div() usage
nouveau/nvkm/subdev/clk/gk20a.c: fix wrong do_div() usage
v2: rename and group functions
v4: change copyright information
move printing of pcie speeds into oneinit,
rename all pcie functions to nvkm_pcie_*
don't try to raise the pcie version when no higher one is supported
v5: revert Copyright changes and rename nvkm_pcie_raise_version to nvkm_pcie_set_version
v6: remove some useless pci_is_pcie checks and rework messages
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Commit 69c4938249 ("drm/nouveau/instmem/gk20a: use direct CPU access")
tried to be smart while using the DMA-API by managing the CPU mappings of
buffers allocated with the DMA-API by itself. In doing so, it relied
on dma_to_phys() which is an architecture-private function not
available everywhere. This broke the build on several architectures.
Since there is no reliable and portable way to obtain the physical
address of a DMA-API buffer, stop trying to be smart and just use the
CPU mapping that the DMA-API can provide. This means that buffers will
be CPU-mapped for all their life as opposed to when we need them, but
anyway using the DMA-API here is a fallback for when no IOMMU is
available so we should not expect optimal behavior.
This makes the IOMMU and DMA-API implementations of instmem diverge
enough that we should maybe put them into separate files...
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The LRU list used for recycling CPU mappings was handling concurrency
very poorly. For instance, if an instobj was acquired twice before being
released once, it would end up into the LRU list even though there is
still a client accessing it.
This patch fixes this by properly counting how many clients are
currently using a given instobj.
While at it, we also raise errors when inconsistencies are detected, and
factorize some code.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This is an oversight that made use of the trip-point-based fan managenent on
cards that never expose those. This led the fan to stay at fan_min.
Fortunately, the emergency code would kick when the temperature would reach
90°C.
Reported-by: Tom Englund <tomenglund26@gmail.com>
Tested-by: Tom Englund <tomenglund26@gmail.com>
Signed-off-by: Martin Peres <martin.peres@free.fr>
Tested-by: Daemon32 <lnf.purple@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92126
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org
Just the one commit I mentioned earlier, making the PGOB workaround the
default.
* 'linux-4.4' of https://github.com/skeggsb/linux:
drm/nouveau/pmu: remove whitelist for PGOB-exit WAR, enable by default
NVIDIA have indicated that the workaround is required on all GK10[467]
boards that have the PGOB fuse set.
I've left the commandline option in place for now, as paranoia.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs wrote:
A couple of regression fixes, some more boards whitelisted for a hw bug
workaround, gr/ucode fixes for hangs a user is seeing.
The changes look larger than they actually are due to the ucode binaries
(*.fucN.h) being regenerated.
* 'linux-4.4' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/volt/pwm/gk104: fix an off-by-one resulting in the voltage not being set
drm/nouveau/nvif: allow userspace access to its own client object
drm/nouveau/gr/gf100-: fix oops when calling zbc methods
drm/nouveau/gr/gf117-: assume no PPC if NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK is zero
drm/nouveau/gr/gf117-: read NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK from correct GPC
drm/nouveau/gr/gf100-: split out per-gpc address calculation macro
drm/nouveau/bios: return actual size of the buffer retrieved via _ROM
drm/nouveau/instmem: protect instobj list with a spinlock
drm/nouveau/pci: enable c800 magic for some unknown Samsung laptop
drm/nouveau/pci: enable c800 magic for Clevo P157SM
No locking is required for the traversal of this list, as it only
happens during suspend/resume where nothing else can be executing.
Fixes some of the issues noticed during parallel piglit runs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
gk20a is an ARM only GPU, so we can just do the correct thing on
ARM but fail on other architectures. The other option was to use
SWIOTLB as the define, which means phys_to_page exists, but
this seems clearer.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch uses an approach closer to the nvidia driver to configure
both PLLs for high gddr5 memory clocks (usually above 2400MHz)
Previously nouveau used the one PLL as it was used for the lower clocks
and just adjusted the second PLL to get as close as possible to the
requested clock. This means for my card, that I got a 4050 MHz clock
although 4008 MHz was requested.
Now the driver iterates over a list of PLL configuration also used by
the nvidia driver and then adjust the second PLL to get near the
requested clock. Also it hold to some restriction I found while
analyzing the PLL configurations
This won't fix all gddr5 high clock issues itself, but it should be
fine on hybrid gpu systems as found on many laptops these days. Also
switching while normal desktop usage should be a lot more stable than
before.
v2: move the pll code into ramgk104
Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Your milage may vary, as it's only been tested on a single G94 and one G96.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Avoids waiting for VBLANKS that never arrive on headless or otherwise
unconventional set-ups. Strategy taken from MEMX.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
10053c is not even read on some cards, and I have no idea exactly what the
criteria are. Likely NVIDIA pre-scans the VBIOS and in their driver disables
all features that are never used. The practical effect should be the same
as this implementation though.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Like Pierre's G94. We might want to structure Kepler similarly in a follow-up.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Does not seem to be necessary for NVA0, hence untested by me.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Seems to be mostly equal to DDR3 on < GT218, should improve stability for
DDR2 reclocks.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In preparation of changing FBVDDQ, as observed on at least one GDDR3 card.
While at it, adhere to func.log[1] properly for consistency.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If the hardware supports extended tag field (8-bit ones), then enable it.
This is usually done by the VBIOS, but not on some MBPs (see fdo#86537).
In case extended tag field is not supported, 5-bit tag field is used which
limits the possible number of requests to 32. Apparently bits 7:0 of
0x08841c stores some number of outstanding requests, so cap it to 32 if
extended tag is unsupported.
Fixes: fdo#86537
v2: Restrict changes to chipsets >= 0x84
v3:
* Add nvkm_pci_mask to pci.h
* Mask bit 8 before setting it
v4:
* Rename `add` argument of nvkm_pci_mask to `value`
* Move code from nvkm_pci_init to g84_pci_init and remove PCIe and chipset
checks
v5:
* Rebase code on latest PCI structure
* Restore PCIe check
* Fix namings in nvkm_pci_mask
* Rephrase part of the commit message
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
coverity.com reported that memset was using a buffer of size 0, on
checking the code it turned out that the function was not being used. So
remove it.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Was not able to obtain a trace of NVRM due to kernel version annoyances,
however, experimentally confirmed that the WAR we use on NV50/G8x boards
works here too.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Increase clock timeout of some unknown engines in order to avoid failure
at high gpcclk rate.
This fixes IBUS read faults on my GF119 when reclocking is manually
enabled. Note that memory reclocking is completely broken and NvMemExec
has to be disabled to allow core clock reclocking only.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Most Keplers actually use the GPIO-based voltage management instead of the new
PWM-based one. Use the GPIO mode as a fallback as it already gracefully handles
the case where no GPIOs exist.
All the Maxwells seem to use the PWM method though.
v2:
- Do not forget to commit the PWM configuration change!
Signed-off-by: Martin Peres <martin.peres@free.fr>
This patch is not ideal but it definitely beats a rewrite of the current
interface and is very self-contained.
Signed-off-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Use the IOMMU bit specified in platform data instead of hardcoding it to
the bit used by current Tegra GPUs.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The Great Nouveau Refactoring Take II brought us a lot of goodness,
including acquire/release methods that are called before and after an
instobj is modified. These functions can be used as synchronization
points to manage CPU/GPU coherency if we modify an instobj using the
CPU.
This patch replaces the legacy and slow PRAMIN access for gk20a instmem
with CPU mappings and writes. A LRU list is used to unmap unused
mappings after a certain threshold (currently 1MB) of mapped instobjs is
reached. This allows mappings to be reused most of the time.
Accessing instobjs using the CPU requires to maintain the GPU L2 cache,
which we do in the acquire/release functions. This triggers a lot of L2
flushes/invalidates, but most of them are performed on an empty cache
(and thus return immediately), and overall context setup performance
greatly benefits from this (from 250ms to 160ms on Jetson TK1 for a
simple libdrm program).
Making L2 management more explicit should allow us to grab some more
performance in the future.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Allow clients to manually flush and invalidate L2. This will be useful
for Tegra systems for which we want to write instmem using the CPU.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
These are useful for systems without a coherent CPU/GPU bus. For such
systems we may need to maintain the L2 ourselves.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some devices may not have a PMU. Avoid a NULL pointer dereference in
such cases by checking whether the pointer given to nvkm_pmu_pgob() is
valid.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Currently OF bios load fails for a few reasons:
- checksum failure
- bios size too small
- no PCIR header
- bios length not a multiple of 4
In this change, we resolve all of the above by ignoring any checksum
failures (since OF VBIOS tends not to have a checksum), and faking the
PCIR data when loading from OF.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
SiS 761 chipset does not support AGP cards but has AGP capability (for
the onboard video). At least PC Chips A31G board using this chipset has
an AGP-like AGPro slot that's wired to the PCI bus. Enabling AGP will
fail (GPU lockup and software fbcon, X11 hangs).
Add support for matching just the host bridge in nvkm_device_agp_quirks
and add entry for SiS 761 with mode 0 (AGP disabled).
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
three nouveau regression fixes.
* 'linux-4.3' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau/device: enable c800 quirk for tecra w50
drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7x
drm/nouveau/gr/nv04: fix big endian setting on gr context
Typo that snuck in with commit 6979c6303a
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Reported-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The copyright header in nvkm/engine/device/platform.c has been replaced
with the NVIDIA one from drm/nouveau_platform.c, as most of the actual
code is now theirs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Doesn't fix any known issue, but best be safe in case control is handed
to us from firmware with these left enabled.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This ensures we have a valid mask of disabled engines before we start
trying to execute fini()/init() on the subdevs, potentially touching
devices that don't exist.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
An upcoming commit requires being able to modify the PRAMIN BAR page
tables while already holding the MMU subdev mutex.
To solve this issue, each VM has been given its own mutex. As a nice
side-effect, this also allows separate VMs to be updated concurrently.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
These require an explicit struct nvkm_device pointer, unlike the previous
macros which take a void *, and work for (almost) anything derived from
nvkm_object by using some heuristics.
These macros are more general than the previous ones, and can be used to
handle PTIMER-based busy-waits (will be used in later devinit fixes) as
well as more complicated wait conditions.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pretty much every subdev/engine is going to need access to nvkm_device
shortly to touch registers and/or output messages.
The odd placement of the includes is necessary to work around some
inter-dependencies that currently exist. This will be fixed later.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Will be utilised in upcoming commits to remove the need for heuristics
to lookup the device a subdev belongs to.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Only a handful of machines have this enabled by default, where it's been
proven to work. The workaround can be explicitly enabled with a module
option also.
Still waiting on feedback from NVIDIA for a proper idea of exactly what
this fix is doing, and how to implement it properly.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We previously assumed that the values "2" and "4" were new in DCB 4.1,
however, there's at least one GM107 DCB 4.0 board (Quadro K620) that
uses the newer values.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No known VBIOSes use these, but they are present in the actual VBIOS
table parsing logic. No harm in adding these too.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
According to the tstate calculation in nvkm_clk_tstate(),
the range of tstate is from -(clk->state_nr - 1) to 0,
it mean the tstate is negative value. But in nvkm_pstate_work(),
it use (clk->state_nr - 1 - clk->tstate) to limit pstate,
it's not correct.
This patch fix it to use (clk->state_nr - 1 + clk->tstate) to
limit pstate.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested on a few cards. Probably works quite well for most, given they should
all be GDDR3.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This looks surprisingly similar to scripts on earlier cards as well
but they don't seem to work just yet. That... and I don't have any, which
makes it a tough job to reverse engineer.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Some of the bits in there are similar to the bits in the gt215 rammap.
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Might need some generalisation to < GT200. For those: use at your own risk!
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In preparation of NV50 reclocking, where there is no version
Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
More analysis shows that this is identical to 0x79 except that it loads
the frequency indirectly from elsewhere in the VBIOS.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91025
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Opcode 0x5a is a register write for data looked up from another part of
the VBIOS image. 0x59 is a more complex opcode, but we may as well
recognize it. These occur on a single known instance of Riva TNT2
hardware.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91025
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Increase clock timeout for SYS, FPB and GPC in order to avoid operation
failure at high gpcclk rate.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nvbios_extend() returns 1 to indicate "extended the array" and 0 to
indicate the array is already big enough. This is used by the core
shadowing code to prevent re-fetching chunks of the image that have
already been shadowed.
The ACPI fetching code may possibly need to extend this further due
to requiring fetches to happen in 4KiB chunks.
Under certain circumstances (that happen if the total image size is
a multiple of 4KiB), the memory allocated to store the shadow will
already be big enough, causing the ACPI code's nvbios_extend() call
to return 0, which is misinterpreted as a failure.
The fix is simple, accept >= 0 as a successful condition here. The
core will have already made sure that we're not re-fetching data we
already have.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89047
v2 (Ben Skeggs):
- dropped hunk which would cause unnecessary re-fetching
- more descriptive explanation
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make static a few functions and structures that should be.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Before we moved gk110's implementation of this to pmu, the functions were
identical. This commit just switches GK208 to use the new (more complete)
implementation of the power-up sequence.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Turns out the PTHERM part of this dance is bracketed by the same PMU
fiddling that occurs on GK104/6, let's assume it's also PGOB.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If a memory allocation fails when using the DMA allocator,
gk20a_instobj_dtor_dma() will be called on the failed instmem object.
At this time, node->handle might not be NULL despite the call to
dma_alloc_attrs() having failed. node->cpuaddr is the right member to
check for such a failure, so use it instead.
Reported-by: Vince Hsu <vinceh@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Let GK20A's instmem take advantage of the IOMMU if it is present. Having
an IOMMU means that instmem is no longer allocated using the DMA API,
but instead obtained through page_alloc and made contiguous to the GPU
by IOMMU mappings.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
instmem for GK20A is allocated using dma_alloc_coherent(), which
provides us with a coherent CPU mapping that we never use because
instmem objects are accessed through PRAMIN. Switch to
dma_alloc_attrs() which gives us the option to dismiss that CPU mapping
and free up some CPU virtual space.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Now that Nouveau can operate even when there is no RAM device, remove
the dummy one used by GK20A.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
GK20A does not have dedicated RAM, thus having a RAM device for it does
not make sense. Move the contiguous physical memory allocation to
instmem.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Having a RAM device does not make sense for chips like GK20A which have
no dedicated video memory. The dummy RAM device that we used so far
works as a temporary band-aid, but in the longer term it is desirable
for the driver to be able to work without any kind of VRAM.
This patch adds a few conditionals in places where a RAM device was
assumed to be present and allows some more objects to be allocated from
the TT domain, allowing Nouveau to handle GPUs for which
pfb->ram == NULL.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This if statement is correct but it wasn't indented, so it looked like
some code was missing.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Spotted by coccinelle:
drivers/gpu/drm/nouveau/core/subdev/fuse/gm107.c:50:5-8: WARNING: end returns can be simpified
Signed-off-by: Martin Peres <martin.peres@free.fr>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Code before looked only at bit 31 to decide if a port is unused.
However dcb 4.1 spec says 0x1F in bits 31-27 and 26-22 means unused.
This fixed hdmi monitor detection on GM206.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The namespace of NVKM is being changed to nvkm_ instead of nouveau_,
which will be used for the DRM part of the driver. This is being
done in order to make it very clear as to what part of the driver a
given symbol belongs to, and as a minor step towards splitting the
DRM driver out to be able to stand on its own (for virt).
Because there's already a large amount of churn here anyway, this is
as good a time as any to also switch to NVIDIA's device and chipset
naming to ease collaboration with them.
A comparison of objdump disassemblies proves no code changes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>