acpi_get_pci_dev() is (hopefully) better, and all callers have been
converted, so let's get rid of this duplicated functionality.
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_get_pci_dev() is better, and all callers have been converted, so
eliminate acpi_get_pci_id().
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
There is no need to pass a segment/bus tuple to this API, as the callsite
always has a struct pci_bus. We can derive segment/bus from the
struct pci_bus, so let's take this opportunit to simplify the API and
make life easier for the callers.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
A PCI domain cannot change as you descend down subordinate buses, which
makes the 'segment' argument to acpi_pci_irq_add_prt() useless.
Change the interface to take a struct pci_bus *, from whence we can derive
the bus number and segment. Reducing the number of arguments makes life
simpler for callers.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Now that we can dynamically convert an ACPI CA handle to a
struct pci_dev at runtime, there's no need to statically bind
them during boot.
acpi_pci_bind/unbind are vastly simplified, and are only used
to evaluate _PRT methods on P2P bridges and non-bridge children.
This patch also changes the time-space tradeoff ever so slightly.
Looking up the ACPI-PCI binding is never in the performance path, and by
eliminating this caching, we save 24 bytes for each _ADR device in the
ACPI namespace.
This patch lays further groundwork to eventually eliminate
the acpi_driver_ops.bind callback.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Convert an ACPI CA handle to a struct pci_dev.
Performing this lookup dynamically allows us to get rid of the
ACPI-PCI binding code, which:
- eliminates struct acpi_device vs struct pci_dev lifetime issues
- lays more groundwork for eliminating .start from acpi_device_ops
and thus simplifying ACPI drivers
- whacks out a lot of code
This change lays the groundwork for eliminating much of pci_bind.c.
Although pci_root.c may not be the most logical place for this
change, putting it here saves us from having to export acpi_pci_find_root.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Returns whether an ACPI CA node is a PCI root bridge or not.
This API is generically useful, and shouldn't just be a hotplug function.
The implementation becomes much simpler as well.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_pci_root_add() explicitly assigns device->ops.bind, and later
calls acpi_pci_bind_root(), which also does the same thing.
We don't need to repeat ourselves; removing the explicit assignment
allows us to make acpi_pci_bind() static.
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Clean up: Relocate MNT program procedure number definitions to the
only file that uses them. Relocate the version number definitions,
which are shared, to nfs.h. Remove duplicate program number
definitions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When rpc.statd starts up in user space at boot time, it attempts to
write the latest NSM local state number into
/proc/sys/fs/nfs/nsm_local_state.
If lockd.ko isn't loaded yet (as is the case in most configurations),
that file doesn't exist, thus the kernel's NSM state remains set to
its initial value of zero during lockd operation.
This is a problem because rpc.statd and lockd use the NSM state number
to prevent repeated lock recovery on rebooted hosts. If lockd sends
a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state
number is received, there is no way for lockd or rpc.statd to
distinguish that stale SM_NOTIFY from an actual reboot. Thus lock
recovery could be performed after the rebooted host has already
started reclaiming locks, and those locks will be lost.
We could change /etc/init.d/nfslock so it always modprobes lockd.ko
before starting rpc.statd. However, if lockd.ko is ever unloaded
and reloaded, we are back at square one, since the NSM state is not
preserved across an unload/reload cycle. This may happen frequently
on clients that use automounter. A period of NFS inactivity causes
lockd.ko to be unloaded, and the kernel loses its NSM state setting.
Instead, let's use the fact that rpc.statd plants the local system's
NSM state in every SM_MON (and SM_UNMON) reply. lockd performs a
synchronous SM_MON upcall to the local rpc.statd _before_ sending its
first NLM request to a new remote. This would permit rpc.statd to
provide the current NSM state to lockd, even after lockd.ko had been
unloaded and reloaded.
Note that NLMPROC_LOCK arguments are constructed before the
nsm_monitor() call, so we have to rearrange argument construction very
slightly to make this all work out.
And, the kernel appears to treat NSM state as a u32 (see struct
nlm_args and nsm_res). Make nsm_local_state a u32 as well, to ensure
we don't get bogus comparison results.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
1/ Raid5 has learned to take over also raid4 and raid6 arrays.
2/ new_chunk in mdp_superblock_1 is in sectors, not bytes.
Signed-off-by: NeilBrown <neilb@suse.de>
Defines a new 'struct nfs4_slot_table' in the 'struct nfs4_session'
for use by the backchannel. Initializes, resets, and destroys the backchannel
slot table in the same manner the forechannel slot table is initialized,
reset, and destroyed.
The sequenceid for each slot in the backchannel slot table is initialized
to 0, whereas the forechannel slotid's sequenceid is set to 1.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
The 'rq_received' member of 'struct rpc_rqst' is used to track when we
have received a reply to our request. With v4.1, the backchannel
can now accept callback requests over the existing connection. Rename
this field to make it clear that it is only used for tracking reply bytes
and not all bytes received on the connection.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
This svc_xprt is passed on to the callback service thread to be later used
to processes incoming svc_rqst's
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
For nfs41 callbacks we need an svc_xprt to process requests coming up the
backchannel socket as rpc_rqst's that are transformed into svc_rqst's that
need a rq_xprt to be processed.
The svc_{udp,tcp}_create methods are too heavy for this job as svc_create_socket
creates an actual socket to listen on while for nfs41 we're "reusing" the
fore channel's socket.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Implement the NFSv4.1 backchannel service. Invokes the common callback
processing logic svc_process_common() to authenticate the call and
dispatch the appropriate NFSv4.1 XDR decoder and operation procedure.
It then invokes bc_send() to send the reply over the same connection.
bc_send() is implemented in a separate patch.
At this time there is no slot validation or reply cache handling.
[nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Move bc_svc_process() declaration to correct patch]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Executes the backchannel task on the RPC state machine using
the existing open connection previously established by the client.
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
nfs41: Add bc_svc.o to sunrpc Makefile.
[nfs41: bc_send() does not need to be exported outside RPC module]
[nfs41: xprt_free_bc_request() need not be exported outside RPC module]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Update copyright]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Adds rpc_run_bc_task() which is called by the NFS callback service to
process backchannel requests. It performs similar work to rpc_run_task()
though "schedules" the backchannel task to be executed starting at the
call_trasmit state in the RPC state machine.
It also introduces some miscellaneous updates to the argument validation,
call_transmit, and transport cleanup functions to take into account
that there are now forechannel and backchannel tasks.
Backchannel requests do not carry an RPC message structure, since the
payload has already been XDR encoded using the existing NFSv4 callback
mechanism.
Introduce a new transmit state for the client to reply on to backchannel
requests. This new state simply reserves the transport and issues the
reply. In case of a connection related error, disconnects the transport and
drops the reply. It requires the forechannel to re-establish the connection
and the server to retransmit the request, as stated in NFSv4.1 section
2.9.2 "Client and Server Transport Behavior".
Note: There is no need to loop attempting to reserve the transport. If EAGAIN
is returned by xprt_prepare_transmit(), return with tk_status == 0,
setting tk_action to call_bc_transmit. rpc_execute() will invoke it again
after the task is taken off the sleep queue.
[nfs41: rpc_run_bc_task() need not be exported outside RPC module]
[nfs41: New call_bc_transmit RPC state]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: Backchannel: No need to loop in call_bc_transmit()]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[rpc_count_iostats incorrectly exits early]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Convert rpc_reply_expected() to inline function]
[Remove unnecessary BUG_ON()]
[Rename variable]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
This function was only used by pci_claim_resource(), and the last commit
deleted that use.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch introduces support to setup the callback xprt on the client side.
It allocates/ destroys the preallocated memory structures used to process
backchannel requests.
At setup time, xprt_setup_backchannel() is invoked to allocate one or
more rpc_rqst structures and substructures. This ensures that they
are available when an RPC callback arrives. The rpc_rqst structures
are maintained in a linked list attached to the rpc_xprt structure.
We keep track of the number of allocations so that they can be correctly
removed when the channel is destroyed.
When an RPC callback arrives, xprt_alloc_bc_request() is invoked to
obtain a preallocated rpc_rqst structure. An rpc_xprt structure is
returned, and its RPC_BC_PREALLOC_IN_USE bit is set in
rpc_xprt->bc_flags. The structure is removed from the the list
since it is now in use, and it will be later added back when its
user is done with it.
After the RPC callback replies, the rpc_rqst structure is returned
by invoking xprt_free_bc_request(). This clears the
RPC_BC_PREALLOC_IN_USE bit and adds it back to the list, allowing it
to be reused by a subsequent RPC callback request.
To be consistent with the reception of RPC messages, the backchannel requests
should be placed into the 'struct rpc_rqst' rq_rcv_buf, which is then in turn
copied to the 'struct rpc_rqst' rq_private_buf.
[nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Update copyright notice and explain page allocation]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Adds new list of rpc_xprt structures, and a readers/writers lock to
protect the list. The list is used to preallocate resources for
the backchannel during backchannel requests. Callbacks are not
expected to cause significant latency, so only one callback will
be allowed at this time.
It also adds a pointer to the NFS callback service so that
requests can be directed to it for processing.
New callback members added to svc_serv. The NFSv4.1 callback service will
sleep on the svc_serv->svc_cb_waitq until new callback requests arrive.
The request will be queued in svc_serv->svc_cb_list. This patch adds this
list, the sleep queue and spinlock to svc_serv.
[nfs41: NFSv4.1 callback support]
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
At mount, nfs_alloc_client sets the cl_state NFS4CLNT_LEASE_EXPIRED bit
and nfs4_alloc_session sets the NFS4CLNT_SESSION_SETUP bit, so both bits are
set when nfs4_lookup_root calls nfs4_recover_expired_lease which schedules
the nfs4_state_manager and waits for it to complete.
Place the session setup after the clientid establishment in nfs4_state_manager
so that the session is setup right after the clientid has been established
without rescheduling the state manager.
Unlike nfsv4.0, the nfs_client struct is not ready to use until the session
has been established. Postpone marking the nfs_client struct to NFS_CS_READY
until after a successful CREATE_SESSION call so that other threads cannot use
the client until the session is established.
If the EXCHANGE_ID call fails and the session has not been setup (the
NFS4CLNT_SESSION_SETUP bit is set), mark the client with the error and return.
If the session setup CREATE_SESSION call fails with NFS4ERR_STALE_CLIENTID
which could occur due to server reboot or network partition inbetween the
EXCHANGE_ID and CREATE_SESSION call, reset the NFS4CLNT_LEASE_EXPIRED and
NFS4CLNT_SESSION_SETUP bits and try again.
If the CREATE_SESSION call fails with other errors, mark the client with
the error and return.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: NFS_CS_SESSION_SETUP cl_cons_state for back channel setup]
On session setup, the CREATE_SESSION reply races with the server back channel
probe which needs to succeed to setup the back channel. Set a new
cl_cons_state NFS_CS_SESSION_SETUP just prior to the CREATE_SESSION call
and add it as a valid state to nfs_find_client so that the client back channel
can find the nfs_client struct and won't drop the server backchannel probe.
Use a new cl_cons_state so that NFSv4.0 back channel behaviour which only
sets NFS_CS_READY is unchanged.
Adjust waiting on the nfs_client_active_wq accordingly.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: rename NFS_CS_SESSION_SETUP to NFS_CS_SESSION_INITING]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: set NFS_CL_SESSION_INITING in alloc_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: move session setup into a function]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[moved nfs4_proc_create_session declaration here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Implement the create_session operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26
Set the real fore channel max operations to preserve server resources.
Note: If the server returns < NFS4_MAX_OPS, the client will very soon
get an NFS4ERR_TOO_MANY_OPS. A later patch will handle this.
Set the max_rqst_sz and max_resp_sz to PAGE_SIZE - we preallocate the buffers.
Set the back channel max_resp_sz_cached to zero to force the client to
always set csa_cachethis to FALSE because the current implementation
of the back channel DRC only supports caching the CB_SEQUENCE operation.
The client back channel server supports one slot, and desires 2 operations
per compound.
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove extraneous rpc_clnt pointer]
Use the struct nfs_client cl_rpcclient.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_init_channel_attrs, just use nfs41_create_session_args]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use rsize and wsize for session channel attributes]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: set channel max operations]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: set back channel attributes]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: obliterate nfs4_adjust_channel_attrs]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: have create_session work on nfs_client]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: move CONFIG_NFS_V4_1 endif]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
[moved nfs4_init_slot_table definition here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use kcalloc to allocate slot table]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[nfs41: fix Xcode_create_session's xdr Xcoding pointer type]
[nfs41: refactor decoding of channel attributes]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
get_lease_time uses the FSINFO rpc operation to
get the lease time attribute.
nfs4_get_lease_time() is only called from the state manager on session setup
so don't recover from clientid or sequence level errors.
We do need to recover from NFS4ERR_DELAY or NFS4ERR_GRACE.
Use NFS4_POLL_RETRY_MIN - the Linux server returns NFS4ERR_DELAY when an
upcall is needed to resolve an uncached export referenced by a file handle.
[nfs41: sequence res use slotid]
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove extraneous rpc_clnt pointer]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: have get_lease_time work on nfs_client]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: get_lease_time recover from NFS4ERR_DELAY]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
[define nfs4_get_lease_time_{args,res}]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Implement the exchange_id operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26
Unlike NFSv4.0, NFSv4.1 requires machine credentials. RPC_AUTH_GSS machine
credentials will be passed into the kernel at mount time to be available for
the exchange_id operation.
RPC_AUTH_UNIX root mounts can use the UNIX root credential. Store the root
credential in the nfs_client struct.
Without a credential, NFSv4.1 state renewal fails.
[nfs41: establish clientid via exchange id only if cred != NULL]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: move nfstime4 from under CONFIG_NFS_V4_1]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: do not wait a lease time in exchange id]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[nfs41: Ignoring impid in decode_exchange_id is missing a READ_BUF]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: fix Xcode_exchange_id's xdr Xcoding pointer type]
[nfs41: get rid of unused struct nfs41_exchange_id_res members]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Convert ia64 to use int-ll64.h
[IA64] Fix build error in paravirt_patchlist.c
[IA64] ia64 does not need umount2() syscall
[IA64] hook up new rt_tgsigqueueinfo syscall
[IA64] msi_ia64.c dmar_msi_type should be static
[IA64] remove obsolete hw_interrupt_type
[IA64] remove obsolete irq_desc_t typedef
[IA64] remove obsolete no_irq_type
[IA64] unexport fpswa.h
Allocate a slot in the session slot table and set the sequence op arguments.
Called at the rpc prepare stage.
Add a status to nfs41_sequence_res, initialize it to one so that we catch
rpc level failures which do not go through decode_sequence which sets
the new status field.
Note that upon an rpc level failure, we don't know if the server processed the
sequence operation or not. Proceed as if the server did process the sequence
operation.
Signed-off-by: Rahul Iyer <iyer@netapp.com>
[nfs41: sequence args use slotid]
[nfs41: find slot return slotid]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove SEQ4_STATUS_USE_TK_STATUS]
As per 11-14-08 review
[move extern declaration from nfs41: sequence setup/done support]
[removed sa_session definition, changed sa_cache_this into a u8 to reduce footprint]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock]
Otherwise there's a race (we've hit) with nfs4_free_slot where
nfs41_setup_sequence sees a full slot table, unlocks slot_tbl_lock,
nfs4_free_slots happen concurrently and call rpc_wake_up_next
where there's nobody to wake up yet, context goes back to
nfs41_setup_sequence which goes to sleep when the slot table
is actually empty now and there's no-one to wake it up anymore.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Initialize nfs4_sequence_res sr_slotid to NFS4_MAX_SLOT_TABLE.
[was nfs41: sequence res use slotid]
Signed-off-by: Andy Adamson <andros@netapp.com>
[pulled definition of struct nfs4_sequence_res.sr_slotid to here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
To be used for getting the rpc's minorversion and for nfs41 xdr
{en,de}coding of the sequence operation.
Reset the seq session ptrs for minorversion=0 rpc calls.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use nfs4_call_sync rather than rpc_call_sync to provide
for a nfs41 sessions-enabled interface for sessions manipulation.
The nfs41 rpc logic uses the rpc_call_prepare method to
recover and create the session, as well as selecting a free slot id
and the rpc_call_done to free the slot and update slot table
related metadata.
In the coming patches we'll add rpc prepare and done routines
for setting up the sequence op and processing the sequence result.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_call_sync]
As per 11-14-08 review.
Squash into "nfs41: introduce nfs4_call_sync" and "nfs41: nfs4_setup_sequence"
Define two functions one for v4 and one for v41
add a pointer to struct nfs4_client to the correct one.
Signed-off-by: Andy Adamson <andros@netapp.com>
[added BUG() in _nfs4_call_sync_session if !CONFIG_NFS_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: check for session not minorversion]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[group minorversion specific stuff together]
Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: fixup nfs4_clear_client_minor_version]
[introduce nfs4_init_client_minor_version() in this patch]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[cleaned-up patch: got rid of nfs_call_sync_t, dprintks, cosmetics, extra server defs]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4.1 Sessions basic data types, initialization, and destruction.
The session is always associated with a struct nfs_client that holds
the exchange_id results.
Signed-off-by: Rahul Iyer <iyer@netapp.com>
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[remove extraneous rpc_clnt pointer, use the struct nfs_client cl_rpcclient.
remove the rpc_clnt parameter from nfs4 nfs4_init_session]
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Use the presence of a session to determine behaviour instead of the
minorversion number.]
Signed-off-by: Andy Adamson <andros@netapp.com>
[constified nfs4_has_session's struct nfs_client parameter]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Rename nfs4_put_session() to nfs4_destroy_session() and call it from nfs4_free_client() not nfs4_free_server().
Also get rid of nfs4_get_session() and the ref_count in nfs4_session struct as keeping track of nfs_client should be sufficient]
Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com>
[nfs41: pass rsize and wsize into nfs4_init_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
[separated out removal of rpc_clnt parameter from nfs4_init_session ot a
patch of its own]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Pass the nfs_client pointer into nfs4_alloc_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: don't assign to session->clp->cl_session in nfs4_destroy_session]
[nfs41: fixup nfs4_clear_client_minor_version]
[introduce nfs4_clear_client_minor_version() in this patch]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Refactor nfs4_init_session]
Moved session allocation into nfs4_init_client_minor_version, called from
nfs4_init_client.
Leave rwise and wsize initialization in nfs4_init_session, called from
nfs4_init_server.
Reverted moving of nfs_fsid definition to nfs_fs_sb.h
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: Move NFS4_MAX_SLOT_TABLE define from under CONFIG_NFS_V4_1]
[Fix comile error when CONFIG_NFS_V4_1 is not set.]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[moved nfs4_init_slot_table definition to "create_session operation"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: alloc session with GFP_KERNEL]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Define stubs for sequence args and res data structures and embed
them in all other nfs4 and nfs41 xdr types. They are needed for
sending any op in a nfs41 compound rpc.
Signed-off-by: Andy Adamson<andros@netapp.com>
[moved new args/res definitions away, to where they're first used]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This field is set to the nfsv4 minor version for this mount.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Note: This patch sets the referral to the same minorversion as the
current mount. Revisit in future patch.
Signed-off-by: Andy Adamson <andros@netapp.com>
[removed cl_minorversion assignment in nfs_set_client]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[always define nfs_client.cl_minorversion]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If 4.1 isn't supported, NFS4_MAX_MINOR_VERSION will be 0.
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We can not have .driver_data as const since platform_set_drvdata() doesnt take
a const.
The hclk mmc_data field can be const though.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Only the base addresses remain, as they are needed to set up
the IOMEM resources.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
IRQ number definitions for PWM, LED, SPI and OWM (ds1wm).
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Used to configure single bits of the SDHWCTRL_SDCONF and EXTCF_RESET/SELECT
registers needed for DS1WM, MMC/SDIO and PCMCIA functionality.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The PCAP Asic as present on EZX phones is a multi function device with
voltage regulators, ADC, touch screen controller, RTC, USB transceiver,
leds controller, and audio codec.
It has two SPI ports, typically one is connected to the application
processor and another to the baseband, this driver provides read/write
functions to its registers, irq demultiplexer and ADC
queueing/abstraction.
This chip is used on a lot of Motorola phones, it was manufactured by TI
as a custom product with the name PTWL93017, later this design evolved
into the ATLAS PMIC from Freescale (MC13783).
Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This adds a core driver for the AB3100 mixed-signal circuit
found in the ST-Ericsson U300 series platforms. This driver
is a singleton proxy for all accesses to the AB3100
sub-drivers which will be merged on top of this one, RTC,
regulators, battery and system power control, vibrator,
LEDs, and an ALSA codec.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Reviewed-by: Mike Rapoport <mike@compulab.co.il>
Reviewed-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
[SCSI] aic79xx: make driver respect nvram for IU and QAS settings
[SCSI] don't attach ULD to Dell Universal Xport
[SCSI] lpfc 8.3.3 : Update driver version to 8.3.3
[SCSI] lpfc 8.3.3 : Add support for Target Reset handler entrypoint
[SCSI] lpfc 8.3.3 : Fix a couple of spin_lock and memory issues and a crash
[SCSI] lpfc 8.3.3 : FC/FCOE discovery fixes
[SCSI] lpfc 8.3.3 : Fix various SLI-3 vs SLI-4 differences
[SCSI] qla2xxx: Resolve a performance issue in interrupt
[SCSI] cnic, bnx2i: Fix build failure when CONFIG_PCI is not set.
[SCSI] nsp_cs: time_out reaches -1
[SCSI] qla2xxx: fix printk format warnings
[SCSI] ncr53c8xx: div reaches -1
[SCSI] compat: don't perform unneeded copy in sg_io code
[SCSI] zfcp: Update FC pass-through support
[SCSI] zfcp: Add FC pass-through support
[SCSI] FC Pass Thru support
* 'linux-next' of git://git.infradead.org/ubi-2.6: (21 commits)
UBI: add reboot notifier
UBI: handle more error codes
UBI: fix multiple spelling typos
UBI: fix kmem_cache_free on error patch
UBI: print amount of reserved PEBs
UBI: improve messages in the WL worker
UBI: make gluebi a separate module
UBI: remove built-in gluebi
UBI: add notification API
UBI: do not switch to R/O mode on read errors
UBI: fix and clean-up error paths in WL worker
UBI: introduce new constants
UBI: fix race condition
UBI: minor serialization fix
UBI: do not panic if volume check fails
UBI: add dump_stack in checking code
UBI: fix races in I/O debugging checks
UBI: small debugging code optimization
UBI: improve debugging messages
UBI: re-name volumes_mutex to device_mutex
...
It is generally agreed that it would be beneficial for u64 to be an
unsigned long long on all architectures. ia64 (in common with several
other 64-bit architectures) currently uses unsigned long. Migrating
piecemeal is too painful; this giant patch fixes all compilation warnings
and errors that come as a result of switching to use int-ll64.h.
Note that userspace will still see __u64 defined as unsigned long. This
is important as it affects C++ name mangling.
[Updated by Tony Luck to change efi.h:efi_freemem_callback_t to use
u64 for start/end rather than unsigned long]
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
get rid of BKL in fs/sysv
get rid of BKL in fs/minix
get rid of BKL in fs/efs
befs ->pust_super() doesn't need BKL
Cleanup of adfs headers
9P doesn't need BKL in ->umount_begin()
fuse doesn't need BKL in ->umount_begin()
No instance of ->bmap() needs BKL
remove unlock_kernel() left accidentally
ext4: avoid unnecessary spinlock in critical POSIX ACL path
ext3: avoid unnecessary spinlock in critical POSIX ACL path
commit 2b85a34e91
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.
Some protocols check sk_wmem_alloc value to determine if a timer
must delay socket deallocation. We must take care of the sk_wmem_alloc
value being one instead of zero when no write allocations are pending.
Reported by Ingo Molnar, and full diagnostic from David Miller.
This patch introduces three helpers to get read/write allocations
and a followup patch will use these helpers to report correct
write allocations to user.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warnings (missing + extra entries) in skbuff.h.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new types for TLV dB scale specified with min/max values instead
of min/step since the resolution can't match always with the one
a device provides. For example, usb audio devices give 1/256 dB
resolution while ALSA TLV is based on 1/100 dB resolution.
The new min/max types have less problems because the possible
rounding error happens only at min/max.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon: switch to using late_initcall
radeon legacy chips: tv dac bg/dac adj updates
drm/radeon: introduce kernel modesetting for radeon hardware
drm: Add the TTM GPU memory manager subsystem.
drm: Memory fragmentation from lost alignment blocks
drm/radeon: fix mobility flags on new PCI IDs.
* akpm: (182 commits)
fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
fbdev: *bfin*: fix __dev{init,exit} markings
fbdev: *bfin*: drop unnecessary calls to memset
fbdev: bfin-t350mcqb-fb: drop unused local variables
fbdev: blackfin has __raw I/O accessors, so use them in fb.h
fbdev: s1d13xxxfb: add accelerated bitblt functions
tcx: use standard fields for framebuffer physical address and length
fbdev: add support for handoff from firmware to hw framebuffers
intelfb: fix a bug when changing video timing
fbdev: use framebuffer_release() for freeing fb_info structures
radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
s3c-fb: CPUFREQ frequency scaling support
s3c-fb: fix resource releasing on error during probing
carminefb: fix possible access beyond end of carmine_modedb[]
acornfb: remove fb_mmap function
mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
mb862xxfb: restrict compliation of platform driver to PPC
Samsung SoC Framebuffer driver: add Alpha Channel support
atmel-lcdc: fix pixclock upper bound detection
offb: use framebuffer_alloc() to allocate fb_info struct
...
Manually fix up conflicts due to kmemcheck in mm/slab.c
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add accelerated bitblt functions to s1d13xxx based video chipsets, more
specificly functions copyarea and fillrect.
It has only been tested and activated for 13506 chipsets but is expected
to work for the majority of s1d13xxx based chips. This patch also cleans
up the driver with respect of whitespaces and other formatting issues. We
update the current status comments.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With KMS we have ran into an issue where we really want the KMS fb driver
to be the one running the console, so panics etc can be shown by switching
out of X etc.
However with vesafb/efifb built-in, we end up with those on fb0 and the
KMS fb driver on fb1, driving the same piece of hw, so this adds an fb
info flag to denote a firmware fbdev, and adds a new aperture base/size
range which can be compared when the hw drivers are installed to see if
there is a conflict with a firmware driver, and if there is the firmware
driver is unregistered and the hw driver takes over.
It uses new aperture_base/size members instead of comparing on the fix
smem_start/length, as smem_start/length might for example only cover the
first 1MB of the PCI aperture, and we could allocate the kms fb from 8MB
into the aperture, thus they would never overlap.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Peter Jones <pjones@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now we have __initconst, we can finally move the external declarations for
the various Linux logo structures to <linux/linux_logo.h>.
James' ack dates back to the previous submission (way to long ago), when the
logos were still __initdata, which caused failures on some platforms with some
toolchain versions.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: James Simmons <jsimmons@infradead.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The LIS302DL accelerometer chip has a 'click' feature which can be used to
detect sudden motion on any of the three axis. Configuration data is
passed via spi platform_data and no action is taken if that's not
specified, so it won't harm any existing platform.
To make the configuration effective, the IRQ lines need to be set up
appropriately. This patch also adds a way to do that from board support
code.
The DD_* definitions were factored out to an own enum because they are
specific to LIS3LV02D devices.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for the TI VLYNQ high-speed, serial and packetized bus.
This bus allows external devices to be connected to the System-on-Chip and
appear in the main system memory just like any memory mapped peripheral.
It is widely used in TI's networking and multimedia SoC, including the AR7
SoC.
Signed-off-by: Eugene Konev <ejka@imfi.kspu.ru>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert most arches to use asm-generic/kmap_types.h.
Move the KM_FENCE_ macro additions into asm-generic/kmap_types.h,
controlled by __WITH_KM_FENCE from each arch's kmap_types.h file.
Would be nice to be able to add custom KM_types per arch, but I don't yet
see a nice, clean way to do that.
Built on x86_64, i386, mips, sparc, alpha(tonyb), powerpc(tonyb), and
68k(tonyb).
Note: avr32 should be able to remove KM_PTE2 (since it's not used) and
then just use the generic kmap_types.h file. Get avr32 maintainer
approval.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "Luck Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
put_cpu_no_resched() is an optimization of put_cpu() which unfortunately
can cause high latencies.
The nfs iostats code uses put_cpu_no_resched() in a code sequence where a
reschedule request caused by an interrupt between the get_cpu() and the
put_cpu_no_resched() can delay the reschedule for at least HZ.
The other users of put_cpu_no_resched() optimize correctly in interrupt
code, but there is no real harm in using the put_cpu() function which is
an alias for preempt_enable(). The extra check of the preemmpt count is
not as critical as the potential source of missing a reschedule.
Debugged in the preempt-rt tree and verified in mainline.
Impact: remove a high latency source
[akpm@linux-foundation.org: build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since we have had a LANANA major number for years, and it is documented in
devices.txt, I think that this first patch can go upstream.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
They're in linux/bug.h at present, which causes include order tangles. In
particular, linux/bug.h cannot be used by linux/atomic.h because,
according to Nikanth:
linux/bug.h pulls in linux/module.h => linux/spinlock.h => asm/spinlock.h
(which uses atomic_inc) => asm/atomic.h.
bug.h is a pretty low-level thing and module.h is a higher-level thing,
IMO.
Cc: Nikanth Karthikesan <knikanth@novell.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After introduction of keyed wakeups Davide Libenzi did on epoll, we are
able to avoid spurious wakeups in poll()/select() code too.
For example, typical use of poll()/select() is to wait for incoming
network frames on many sockets. But TX completion for UDP/TCP frames call
sock_wfree() which in turn schedules thread.
When scheduled, thread does a full scan of all polled fds and can sleep
again, because nothing is really available. If number of fds is large,
this cause significant load.
This patch makes select()/poll() aware of keyed wakeups and useless
wakeups are avoided. This reduces number of context switches by about 50%
on some setups, and work performed by sofirq handlers.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The members of the new_utsname structure are defined with magic numbers
that *should* correspond to the constant __NEW_UTS_LEN+1. Everywhere
else, code assumes this and uses the constant, so this patch makes the
structure match.
Originally suggested by Serge here:
https://lists.linux-foundation.org/pipermail/containers/2009-March/016258.html
Signed-off-by: Dan Smith <danms@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On NUMA machines, the administrator can configure zone_reclaim_mode that
is a more targetted form of direct reclaim. On machines with large NUMA
distances for example, a zone_reclaim_mode defaults to 1 meaning that
clean unmapped pages will be reclaimed if the zone watermarks are not
being met.
There is a heuristic that determines if the scan is worthwhile but it is
possible that the heuristic will fail and the CPU gets tied up scanning
uselessly. Detecting the situation requires some guesswork and
experimentation so this patch adds a counter "zreclaim_failed" to
/proc/vmstat. If during high CPU utilisation this counter is increasing
rapidly, then the resolution to the problem may be to set
/proc/sys/vm/zone_reclaim_mode to 0.
[akpm@linux-foundation.org: name things consistently]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Collect vma->vm_flags of the VMAs that actually referenced the page.
This is preparing for more informed reclaim heuristics, eg. to protect
executable file pages more aggressively. For now only the VM_EXEC bit
will be used by the caller.
Thanks to Johannes, Peter and Minchan for all the good tips.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As function shmem_file_setup does not modify/allocate/free/pass given
filename - mark it as const.
Signed-off-by: Sergei Trofimovich <slyfox@inbox.ru>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The file argument resulted from address_space's readpage long time ago.
We don't use it any more. Let's remove unnecessary argement.
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove __invalidate_mapping_pages atomic variant now that its sole caller
can sleep (fixed in eccb95cee4 ("vfs: fix
lock inversion in drop_pagecache_sb()")).
This fixes softlockups that can occur while in the drop_caches path.
Signed-off-by: Mike Waychison <mikew@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The per-task oom_adj value is a characteristic of its mm more than the
task itself since it's not possible to oom kill any thread that shares the
mm. If a task were to be killed while attached to an mm that could not be
freed because another thread were set to OOM_DISABLE, it would have
needlessly been terminated since there is no potential for future memory
freeing.
This patch moves oomkilladj (now more appropriately named oom_adj) from
struct task_struct to struct mm_struct. This requires task_lock() on a
task to check its oom_adj value to protect against exec, but it's already
necessary to take the lock when dereferencing the mm to find the total VM
size for the badness heuristic.
This fixes a livelock if the oom killer chooses a task and another thread
sharing the same memory has an oom_adj value of OOM_DISABLE. This occurs
because oom_kill_task() repeatedly returns 1 and refuses to kill the
chosen task while select_bad_process() will repeatedly choose the same
task during the next retry.
Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
oom_kill_task() to check for threads sharing the same memory will be
removed in the next patch in this series where it will no longer be
necessary.
Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
these threads are immune from oom killing already. They simply report an
oom_adj value of OOM_DISABLE.
Cc: Nick Piggin <npiggin@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a part of the patches for fixing memcg's swap accountinf leak.
But, IMHO, not a bad patch even if no memcg.
There are 2 kinds of references to swap.
- reference from swap entry
- reference from swap cache
Then,
- If there is swap cache && swap's refcnt is 1, there is only swap cache.
(*) swapcount(entry) == 1 && find_get_page(swapper_space, entry) != NULL
This counting logic have worked well for a long time. But considering
that we cannot know there is a _real_ reference or not by swap_map[],
current usage of counter is not very good.
This patch adds a flag SWAP_HAS_CACHE and recored information that a swap
entry has a cache or not. This will remove -1 magic used in swapfile.c
and be a help to avoid unnecessary find_get_page().
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In a following patch, the usage of swap cache is recorded into swap_map.
This patch is for necessary interface changes to do that.
2 interfaces:
- swapcache_prepare()
- swapcache_free()
are added for allocating/freeing refcnt from swap-cache to existing swap
entries. But implementation itself is not changed under this patch. At
adding swapcache_free(), memcg's hook code is moved under
swapcache_free(). This is better than using scattered hooks.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Factor the per-zone arithemetic inside setup_per_zone_inactive_ratio()'s
loop into a a separate function, calculate_zone_inactive_ratio(). This
function will be used in a later patch
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change the names of two functions. It doesn't affect behavior.
Presently, setup_per_zone_pages_min() changes low, high of zone as well as
min. So a better name is setup_per_zone_wmarks(). That's because Mel
changed zone->pages_[hig/low/min] to zone->watermark array in "page
allocator: replace the watermark-related union in struct zone with a
watermark[] array".
* setup_per_zone_pages_min => setup_per_zone_wmarks
Of course, we have to change init_per_zone_pages_min, too. There are not
pages_min any more.
* init_per_zone_pages_min => init_per_zone_wmark_min
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This simplifies the code in gfp_zone() and also keeps the ability of the
compiler to use constant folding to get rid of gfp_zone processing.
The lookup of the zone is done using a bitfield stored in an integer. So
the code in gfp_zone is a simple extraction of bits from a constant
bitfield. The compiler is generating a load of a constant into a register
and then performs a shift and mask operation to get the zone from a gfp_t.
No cachelines are touched and no branches have to be predicted by the
compiler.
We are doing some macro tricks here to convince the compiler to always do
the constant folding if possible.
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If you're using a non-highmem architecture, passing an argument with the
wrong type to kunmap() doesn't give you a warning because the ifdef
doesn't check the type.
Using a static inline function solves the problem nicely.
Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, the following scenario appears to be possible in theory:
* Tasks are frozen for hibernation or suspend.
* Free pages are almost exhausted.
* Certain piece of code in the suspend code path attempts to allocate
some memory using GFP_KERNEL and allocation order less than or
equal to PAGE_ALLOC_COSTLY_ORDER.
* __alloc_pages_internal() cannot find a free page so it invokes the
OOM killer.
* The OOM killer attempts to kill a task, but the task is frozen, so
it doesn't die immediately.
* __alloc_pages_internal() jumps to 'restart', unsuccessfully tries
to find a free page and invokes the OOM killer.
* No progress can be made.
Although it is now hard to trigger during hibernation due to the memory
shrinking carried out by the hibernation code, it is theoretically
possible to trigger during suspend after the memory shrinking has been
removed from that code path. Moreover, since memory allocations are
going to be used for the hibernation memory shrinking, it will be even
more likely to happen during hibernation.
To prevent it from happening, introduce the oom_killer_disabled switch
that will cause __alloc_pages_internal() to fail in the situations in
which the OOM killer would have been called and make the freezer set
this switch after tasks have been successfully frozen.
[akpm@linux-foundation.org: be nicer to the namespace]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Fengguang Wu <fengguang.wu@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Analoguous to follow_phys(), add a helper that looks up the PFN at a
user virtual address in an IO mapping or a raw PFN mapping.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Hellwig <hch@infradead.org>
Acked-by: Magnus Damm <magnus.damm@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The vmscan batching logic is twisting. Move it into a standalone function
nr_scan_try_batch() and document it. No behavior change.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the file LRU lists are dominated by streaming IO pages, evict those
pages first, before considering evicting other pages.
This should be safe from deadlocks or performance problems
because only three things can happen to an inactive file page:
1) referenced twice and promoted to the active list
2) evicted by the pageout code
3) under IO, after which it will get evicted or promoted
The pages freed in this way can either be reused for streaming IO, or
allocated for something else. If the pages are used for streaming IO,
this pageout pattern continues. Otherwise, we will fall back to the
normal pageout pattern.
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Elladan <elladan@eskimo.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
num_online_nodes() is called in a number of places but most often by the
page allocator when deciding whether the zonelist needs to be filtered
based on cpusets or the zonelist cache. This is actually a heavy function
and touches a number of cache lines.
This patch stores the number of online nodes at boot time and updates the
value when nodes get onlined and offlined. The value is then used in a
number of important paths in place of num_online_nodes().
[rientjes@google.com: do not override definition of node_set_online() with macro]
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ALLOC_WMARK_MIN, ALLOC_WMARK_LOW and ALLOC_WMARK_HIGH determin whether
pages_min, pages_low or pages_high is used as the zone watermark when
allocating the pages. Two branches in the allocator hotpath determine
which watermark to use.
This patch uses the flags as an array index into a watermark array that is
indexed with WMARK_* defines accessed via helpers. All call sites that
use zone->pages_* are updated to use the helpers for accessing the values
and the array offsets for setting.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On low-memory systems, anti-fragmentation gets disabled as there is
nothing it can do and it would just incur overhead shuffling pages between
lists constantly. Currently the check is made in the free page fast path
for every page. This patch moves it to a slow path. On machines with low
memory, there will be small amount of additional overhead as pages get
shuffled between lists but it should quickly settle.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Callers of alloc_pages_node() can optionally specify -1 as a node to mean
"allocate from the current node". However, a number of the callers in
fast paths know for a fact their node is valid. To avoid a comparison and
branch, this patch adds alloc_pages_exact_node() that only checks the nid
with VM_BUG_ON(). Callers that know their node is valid are then
converted.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Paul Mundt <lethal@linux-sh.org> [for the SLOB NUMA bits]
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No user of the allocator API should be passing in an order >= MAX_ORDER
but we check for it on each and every allocation. Delete this check and
make it a VM_BUG_ON check further down the call path.
[akpm@linux-foundation.org: s/VM_BUG_ON/WARN_ON_ONCE/]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The start of a large patch series to clean up and optimise the page
allocator.
The performance improvements are in a wide range depending on the exact
machine but the results I've seen so fair are approximately;
kernbench: 0 to 0.12% (elapsed time)
0.49% to 3.20% (sys time)
aim9: -4% to 30% (for page_test and brk_test)
tbench: -1% to 4%
hackbench: -2.5% to 3.45% (mostly within the noise though)
netperf-udp -1.34% to 4.06% (varies between machines a bit)
netperf-tcp -0.44% to 5.22% (varies between machines a bit)
I haven't sysbench figures at hand, but previously they were within the
-0.5% to 2% range.
On netperf, the client and server were bound to opposite number CPUs to
maximise the problems with cache line bouncing of the struct pages so I
expect different people to report different results for netperf depending
on their exact machine and how they ran the test (different machines, same
cpus client/server, shared cache but two threads client/server, different
socket client/server etc).
I also measured the vmlinux sizes for a single x86-based config with
CONFIG_DEBUG_INFO enabled but not CONFIG_DEBUG_VM. The core of the
.config is based on the Debian Lenny kernel config so I expect it to be
reasonably typical.
This patch:
__alloc_pages_internal is the core page allocator function but essentially
it is an alias of __alloc_pages_nodemask. Naming a publicly available and
exported function "internal" is also a big ugly. This patch renames
__alloc_pages_internal() to __alloc_pages_nodemask() and deletes the old
nodemask function.
Warning - This patch renames an exported symbol. No kernel driver is
affected by external drivers calling __alloc_pages_internal() should
change the call to __alloc_pages_nodemask() without any alteration of
parameters.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix allocating page cache/slab object on the unallowed node when memory
spread is set by updating tasks' mems_allowed after its cpuset's mems is
changed.
In order to update tasks' mems_allowed in time, we must modify the code of
memory policy. Because the memory policy is applied in the process's
context originally. After applying this patch, one task directly
manipulates anothers mems_allowed, and we use alloc_lock in the
task_struct to protect mems_allowed and memory policy of the task.
But in the fast path, we didn't use lock to protect them, because adding a
lock may lead to performance regression. But if we don't add a lock,the
task might see no nodes when changing cpuset's mems_allowed to some
non-overlapping set. In order to avoid it, we set all new allowed nodes,
then clear newly disallowed ones.
[lee.schermerhorn@hp.com:
The rework of mpol_new() to extract the adjusting of the node mask to
apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind()
with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local
allocation. Fix this by adding the check for MPOL_PREFERRED and empty
node mask to mpol_new_mpolicy().
Remove the now unneeded 'nodes = NULL' from mpol_new().
Note that mpol_new_mempolicy() is always called with a non-NULL
'nodes' parameter now that it has been removed from mpol_new().
Therefore, we don't need to test nodes for NULL before testing it for
'empty'. However, just to be extra paranoid, add a VM_BUG_ON() to
verify this assumption.]
[lee.schermerhorn@hp.com:
I don't think the function name 'mpol_new_mempolicy' is descriptive
enough to differentiate it from mpol_new().
This function applies cpuset set context, usually constraining nodes
to those allowed by the cpuset. However, when the 'RELATIVE_NODES flag
is set, it also translates the nodes. So I settled on
'mpol_set_nodemask()', because the comment block for mpol_new() mentions
that we need to call this function to "set nodes".
Some additional minor line length, whitespace and typo cleanup.]
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Paul Menage <menage@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move more documentation for get_user_pages_fast into the new kerneldoc comment.
Add some comments for get_user_pages as well.
Also, move get_user_pages_fast declaration up to get_user_pages. It wasn't
there initially because it was once a static inline function.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Andy Grover <andy.grover@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The counterpart of radix_tree_next_hole(). To be used by context readahead.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Vladislav Bolkhovitin <vst@vlnb.net>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mmap read-around now shares the same code style and data structure with
readahead code.
This also removes do_page_cache_readahead(). Its last user, mmap
read-around, has been changed to call ra_submit().
The no-readahead-if-congested logic is dumped by the way. Users will be
pretty sensitive about the slow loading of executables. So it's
unfavorable to disabled mmap read-around on a congested queue.
[akpm@linux-foundation.org: coding-style fixes]
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This makes the performance impact of possible mmap_miss wrap around to be
temporary and tolerable: i.e. MMAP_LOTSAMISS=100 extra readarounds.
Otherwise if ever mmap_miss wraps around to negative, it takes INT_MAX
cache misses to bring it back to normal state. During the time mmap
readaround will be _enabled_ for whatever wild random workload. That's
almost permanent performance impact.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* create mm/init-mm.c, move init_mm there
* remove INIT_MM, initialize init_mm with C99 initializer
* unexport init_mm on all arches:
init_mm is already unexported on x86.
One strange place is some OMAP driver (drivers/video/omap/) which
won't build modular, but it's already wants get_vm_area() export.
Somebody should look there.
[akpm@linux-foundation.org: add missing #includes]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13484
Peer reported:
| The bug is introduced from kernel 2.6.27, if E820 table reserve the memory
| above 4G in 32bit OS(BIOS-e820: 00000000fff80000 - 0000000120000000
| (reserved)), system will report Int 6 error and hang up. The bug is caused by
| the following code in drivers/firmware/memmap.c, the resource_size_t is 32bit
| variable in 32bit OS, the BUG_ON() will be invoked to result in the Int 6
| error. I try the latest 32bit Ubuntu and Fedora distributions, all hit this
| bug.
|======
|static int firmware_map_add_entry(resource_size_t start, resource_size_t end,
| const char *type,
| struct firmware_map_entry *entry)
and it only happen with CONFIG_PHYS_ADDR_T_64BIT is not set.
it turns out we need to pass u64 instead of resource_size_t for that.
[akpm@linux-foundation.org: add comment]
Reported-and-tested-by: Peer Chen <pchen@nvidia.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
PIT_TICK_RATE is currently defined in four architectures, but in three
different places. While linux/timex.h is not the perfect place for it, it
is still a reasonable replacement for those drivers that traditionally use
asm/timex.h to get CLOCK_TICK_RATE and expect it to be the PIT frequency.
Note that for Alpha, the actual value changed from 1193182UL to 1193180UL.
This is unlikely to make a difference, and probably can only improve
accuracy. There was a discussion on the correct value of CLOCK_TICK_RATE
a few years ago, after which every existing instance was getting changed
to 1193182. According to the specification, it should be
1193181.818181...
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Other functions use type bool, so use that for pci_enable_wake as well.
Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
sdio: add cards id for sms (Siano Mobile Silicon) MDTV receivers
Add SDIO vendor ID, and multiple device IDs for
various SMS-based MDTV SDIO adapters.
Signed-off-by: Uri Shkolnik <uris@siano-ms.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch enhances the FLR functions:
1) remove disable_irq() so the shared IRQ won't be disabled.
2) replace the 1s wait with 100, 200 and 400ms wait intervals
for the Pending Transaction.
3) replace mdelay() with msleep().
4) add might_sleep().
5) lock the device to prevent PM suspend from accessing the CSRs
during the reset.
6) coding style fixes.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Based on PCI Express AER specs, a root port might receive multiple
TLP errors while it could only save a correctable error source id
and an uncorrectable error source id at the same time. In addition,
some root port hardware might be unable to provide a correct source
id, i.e., the source id, or the bus id part of the source id provided
by root port might be equal to 0.
The patchset implements the support in kernel by searching the device
tree under the root port.
Patch 1 changes parameter cb of function pci_walk_bus to return a value.
When cb return non-zero, pci_walk_bus stops more searching on the
device tree.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Create symbolic link to hotplug driver module in the PCI slot
directory (/sys/bus/pci/slots/<SLOT#>). In the past, we need to load
hotplug drivers one by one to identify the hotplug driver that handles
the slot, and it was very inconvenient especially for trouble shooting.
With this change, we can easily identify the hotplug driver.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The EMI support in pciehp is obviously broken. It is implemented using
struct hotplug_slot_attribute, but sysfs_ops for pci_slot_ktype is NOT
for struct hotplug_slot_attribute, but for struct pci_slot_attribute.
This bug had been there for a long time, maybe it was introduced when
PCI slot framework was introduced. The reason why this bug didn't
cause any problem is maybe the EMI support is not tested at all
because of lack of test environment.
As described above, the EMI support in pciehp seems not to be tested
at all. So this patch removes EMI support from pciehp, instead of
fixing the bug.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: core: use more outbound tlabels
firewire: core: don't update Broadcast_Channel if RFC 2734 conditions aren't met
firewire: core: prepare for non-core children of card devices
firewire: core: include linux/uaccess.h instead of asm/uaccess.h
firewire: add parent-of-unit accessor
firewire: rename source files
firewire: reorganize header files
firewire: clean up includes
firewire: ohci: access bus_seconds atomically
firewire: also use vendor ID in root directory for driver matches
firewire: share device ID table type with ieee1394
firewire: core: add sysfs attribute for easier udev rules
firewire: core: check for missing struct update at build time, not run time
firewire: core: improve check for local node
pci_bus_set_ops changes pci_ops associated with a pci_bus. This can be
used by debug tools such as PCIE AER error injection to fake some PCI
configuration registers.
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_is_root_bus() in acpi_find_root_bridge_handle() to check if
the pci bus is root, for code consistency.
Reviewed-by: Alex Chiang <achiang@hp.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_is_root_bus() in acpi_pci_get_bridge_handle() to check if the
pci bus is root, for code consistency.
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For specific boards, pass initialization data to ir-kbd-i2c instead
of modifying the settings after the device is initialized. This is
more efficient and easier to read.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Let card drivers probe for IR receiver devices and instantiate them if
found. Ultimately it would be better if we could stop probing
completely, but I suspect this won't be possible for all card types.
There's certainly room for cleanups. For example, some drivers are
sharing I2C adapter IDs, so they also had to share the list of I2C
addresses being probed for an IR receiver. Now that each driver
explicitly says which addresses should be probed, maybe some addresses
can be dropped from some drivers.
Also, the special cases in saa7134-i2c should probably be handled on a
per-board basis. This would be more efficient and less risky than always
probing extra addresses on all boards. I'll give it a try later.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
In the standard device driver binding model, the name field of
struct i2c_client is used to match devices to their drivers, so we
must stop using it for internal purposes. Define a separate field
in struct IR_i2c as a replacement, and use it.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add ADV7343 I2C based video encoder driver. This follows the
v4l2-subdev framework. This driver has been tested on TI DM646x EVM. It
has been tested for Composite and Component outputs.
Updates as per review by Mauro Chehab, added support for more standards
supported by the encoder. Also adding the missed out signed-offs.Tested
only NTSC and PAL standards.
[hverkuil@xs4all.nl: s_routing API changed, updated driver to use new API]
Signed-off-by: Manjunath Hadli <mrh@ti.com>
Signed-off-by: Brijesh Jadav <brijesh.j@ti.com>
Signed-off-by: Chaithrika U S <chaithrika@ti.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch adds driver for TI THS7303 video amplifier. This driver is
implemented as a v4l2 sub device. Tested on TI DM646x EVM.
This version has updates based on review comments by Mauro Chehab.
Signed-off-by: Chaithrika U S <chaithrika@ti.com>
Reviewed-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add a platform driver to soc_camera.c. This way we preserve backwards
compatibility with existing platforms and can start converting them one by one
to the new platform-device soc-camera interface.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Until now I relied on i2c_del_adapter to unregister the i2c_clients for
me, however, if the i2c bus is a platform bus then it is never deleted.
So instead I need to unregister i2c clients when unregistering the
v4l2_device.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add a utility function that can be used to setup the v4l2_device's name
field in a standard manner.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Make camera devices direct children of host platform devices, move the
inheritance management into the soc_camera.c core driver.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Currently pcm990 camera bus-width management functions request a GPIO and never
free it again. With this approach the GPIO extender driver cannot be unloaded
once camera drivers have been loaded, also unloading theb i2c-pxa bus driver
produces errors, because the GPIO extender driver cannot unregister properly.
Another problem is, that if camera drivers are once loaded before the GPIO
extender driver, the platform code marks the GPIO unavailable and only a reboot
helps to recover. Adding an explicit free_bus method and using it in mt9m001
and mt9v022 drivers fixes these problems.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The name size limit is gone from the driver core, the BUS_ID_SIZE
value will be removed.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (64 commits)
debugfs: use specified mode to possibly mark files read/write only
debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
xen: remove driver_data direct access of struct device from more drivers
usb: gadget: at91_udc: remove driver_data direct access of struct device
uml: remove driver_data direct access of struct device
block/ps3: remove driver_data direct access of struct device
s390: remove driver_data direct access of struct device
parport: remove driver_data direct access of struct device
parisc: remove driver_data direct access of struct device
of_serial: remove driver_data direct access of struct device
mips: remove driver_data direct access of struct device
ipmi: remove driver_data direct access of struct device
infiniband: ehca: remove driver_data direct access of struct device
ibmvscsi: gadget: at91_udc: remove driver_data direct access of struct device
hvcs: remove driver_data direct access of struct device
xen block: remove driver_data direct access of struct device
thermal: remove driver_data direct access of struct device
scsi: remove driver_data direct access of struct device
pcmcia: remove driver_data direct access of struct device
PCIE: remove driver_data direct access of struct device
...
Manually fix up trivial conflicts due to different direct driver_data
direct access fixups in drivers/block/{ps3disk.c,ps3vram.c}
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: remove some includings of blktrace_api.h
mg_disk: seperate mg_disk.h again
block: Introduce helper to reset queue limits to default values
cfq: remove extraneous '\n' in blktrace output
ubifs: register backing_dev_info
btrfs: properly register fs backing device
block: don't overwrite bdi->state after bdi_init() has been run
cfq: cleanup for last_end_request in cfq_data
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (38 commits)
ps3flash: Always read chunks of 256 KiB, and cache them
ps3flash: Cache the last accessed FLASH chunk
ps3: Replace direct file operations by callback
ps3: Switch ps3_os_area_[gs]et_rtc_diff to EXPORT_SYMBOL_GPL()
ps3: Correct debug message in dma_ioc0_map_pages()
drivers/ps3: Add missing annotations
ps3fb: Use ps3_system_bus_[gs]et_drvdata() instead of direct access
ps3flash: Use ps3_system_bus_[gs]et_drvdata() instead of direct access
ps3: shorten ps3_system_bus_[gs]et_driver_data to ps3_system_bus_[gs]et_drvdata
ps3: Use dev_[gs]et_drvdata() instead of direct access for system bus devices
block/ps3: remove driver_data direct access of struct device
ps3vram: Make ps3vram_priv.reports a void *
ps3vram: Remove no longer used ps3vram_priv.ddr_base
ps3vram: Replace mutex by spinlock + bio_list
block: Add bio_list_peek()
powerpc: Use generic atomic64_t implementation on 32-bit processors
lib: Provide generic atomic64_t implementation
powerpc: Add compiler memory barrier to mtmsr macro
powerpc/iseries: Mark signal_vsp_instruction() as maybe unused
powerpc/iseries: Fix unused function warning in iSeries DT code
...
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits)
ACPICA: Update version to 20090521.
ACPICA: Disable preservation of SCI enable bit (SCI_EN)
ACPICA: Region deletion: Ensure region object is removed from handler list
ACPICA: Eliminate extra call to NsGetParentNode
ACPICA: Simplify internal operation region interface
ACPICA: Update Load() to use operation region interfaces
ACPICA: New: AcpiInstallMethod - install a single control method
ACPICA: Invalidate DdbHandle after table unload
ACPICA: Fix reference count issues for DdbHandle object
ACPICA: Simplify and optimize NsGetNextNode function
ACPICA: Additional validation of _PRT packages (resource mgr)
ACPICA: Fix DebugObject output for DdbHandle objects
ACPICA: Fix allowable release order for ASL mutex objects
ACPICA: Mutex support: Fix release ordering issue and current sync level
ACPICA: Update version to 20090422.
ACPICA: Linux OSL: cleanup/update/merge
ACPICA: Fix implementation of AML BreakPoint operator (break to debugger)
ACPICA: Fix miscellaneous warnings under gcc 4+
ACPICA: Miscellaneous lint changes
ACPICA: Fix possible dereference of null pointer
...
This adds a KERN_DEFAULT loglevel marker, for when you cannot decide
which loglevel you want, and just want to keep an existing printk
with the default loglevel.
The difference between having KERN_DEFAULT and having no log-level
marker at all is two-fold:
- having the log-level marker will now force a new-line if the
previous printout had not added one (perhaps because it forgot,
but perhaps because it expected a continuation)
- having a log-level marker is required if you are printing out a
message that otherwise itself could perhaps otherwise be mistaken
for a log-level.
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
It used to be that we would only look at the log-level in a printk()
after explicit newlines, which can cause annoying problems when the
previous printk() did not end with a '\n'. In that case, the log-level
marker would be just printed out in the middle of the line, and be
seen as just noise rather than change the logging level.
This changes things to always look at the log-level in the first
bytes of the printout. If a log level marker is found, it is always
used as the log-level. Additionally, if no newline existed, one is
added (unless the log-level is the explicit KERN_CONT marker, to
explicitly show that it's a continuation of a previous line).
Acked-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
the change is valid for both the forechannel and the backchannel (currently dummy)
Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
If socket destuction gets delayed to a timer, we try to
lock_sock() from that timer which won't work.
Use bh_lock_sock() in that case.
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Ingo Molnar <mingo@elte.hu>
eec9462088 fold mg_disk.h into mg_disk.c,
but mg_disk platform driver needs private data for operation. This also
make mg_disk.c as machine independent. Seperate only needed structure and
defines to mg_disk.h
Signed-off-by: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
DM reuses the request queue when swapping in a new device table
Introduce blk_set_default_limits() which can be used to reset the the
queue_limits prior to stacking devices.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Differentiate between SuperSpeed endpoint companion descriptor and the
wireless USB endpoint companion descriptor. Make all structure names for
this descriptor have "ss" (SuperSpeed) in them. David Vrabel asked for
this change in http://marc.info/?l=linux-usb&m=124091465109367&w=2
Reported-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is the original patch I created before David Vrabel posted a better
patch (http://marc.info/?l=linux-usb&m=123377477209109&w=2) that does
basically the same thing. This patch will get replaced with his
(modified) patch later.
Allow USB device drivers that use usb_sg_init() and usb_sg_wait() to push
bulk endpoint scatter gather lists down to the host controller drivers.
This allows host controller drivers to more efficiently enqueue these
transfers, and allows the xHCI host controller to better take advantage of
USB 3.0 "bursts" for bulk endpoints.
This patch currently only enables scatter gather lists for bulk endpoints.
Other endpoint types that use the usb_sg_* functions will not have their
scatter gather lists pushed down to the host controller. For periodic
endpoints, we want each scatterlist entry to be a separate transfer.
Eventually, HCDs could parse these scatter-gather lists for periodic
endpoints also. For now, we use the old code and call usb_submit_urb()
for each scatterlist entry.
The caller of usb_sg_init() can request that all bytes in the scatter
gather list be transferred by passing in a length of zero. Handle that
request for a bulk endpoint under xHCI by walking the scatter gather list
and calculating the length. We could let the HCD handle a zero length in
this case, but I'm not sure if the core layers in between will get
confused by this.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The USB 3.0 bus specification added an "Endpoint Companion" descriptor that is
supposed to follow all SuperSpeed Endpoint descriptors. This descriptor is used
to extend the bus protocol to allow more packets to be sent to an endpoint per
"microframe". The word microframe was removed from the USB 3.0 specification
because the host controller does not send Start Of Frame (SOF) symbols down the
USB 3.0 wires.
The descriptor defines a bMaxBurst field, which indicates the number of packets
of wMaxPacketSize that a SuperSpeed device can send or recieve in a service
interval. All non-control endpoints may set this value as high as 16 packets
(bMaxBurst = 15).
The descriptor also allows isochronous endpoints to further specify that they
can send and receive multiple bursts per service interval. The bmAttributes
allows them to specify a "Mult" of up to 3 (bmAttributes = 2).
Bulk endpoints use bmAttributes to report the number of "Streams" they support.
This was an extension of the endpoint pipe concept to allow multiple mass
storage device commands to be outstanding for one bulk endpoint at a time. This
should allow USB 3.0 mass storage devices to support SCSI command queueing.
Bulk endpoints can say they support up to 2^16 (65,536) streams.
The information in the endpoint companion descriptor must be stored with the
other device, config, interface, and endpoint descriptors because the host
controller needs to access them quickly, and we need to install some default
values if a SuperSpeed device doesn't provide an endpoint companion descriptor.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Warn users of URB_NO_SETUP_DMA_MAP about xHCI behavior.
Device drivers can choose to DMA map the setup packet of a control transfer
before submitting the URB to the USB core. Drivers then set the
URB_NO_SETUP_DMA_MAP and pass in the DMA memory address in setup_dma, instead of
providing a kernel address for setup_packet. However, xHCI requires that the
setup packet be copied into an internal data structure, and we need a kernel
memory address pointer for that. Warn users of URB_NO_SETUP_DMA_MAP that they
should provide a valid pointer for setup_packet, along with the DMA address.
FIXME: I'm not entirely sure how to work around this in the xHCI driver
or USB core.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add host controller driver API and a slot_id variable to struct
usb_device. This allows the xHCI host controller driver to ask the
hardware to allocate a slot for the device when a struct usb_device is
allocated. The slot needs to be allocated at that point because the
hardware can run out of internal resources, and we want to know that very
early in the device connection process. Don't call this new API for root
hubs, since they aren't real devices.
Add HCD API to let the host controller choose the device address. This is
especially important for xHCI hardware running in a virtualized
environment. The guests running under the VM don't need to know which
addresses on the bus are taken, because the hardware picks the address for
them. Announce SuperSpeed USB devices after the address has been assigned
by the hardware.
Don't use the new get descriptor/set address scheme with xHCI. Unless
special handling is done in the host controller driver, the xHC can't
issue control transfers before you set the device address. Support for
the older addressing scheme will be added when the xHCI driver supports
the Block Set Address Request (BSR) flag in the Address Device command.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds a hex route string to each USB device. The route string is used
by the USB 3.0 host controller to send packets through the device tree. USB 3.0
hubs use this string to route packets to the correct port. This is fundamental
bus change from USB 2.0, where all packets were broadcast across the bus.
Devices (including hubs) under a root port receive the route string 0x0. Every
four bits in the route string represent a port on a hub. This length works
because USB 3.0 hubs are limited to 15 ports, and USB 2.0 hubs (with potentially
more ports) will never see packets with a route string. A port number of 0
means the packet is destined for that hub.
For example, a peripheral device might have a route string of 0x00097.
This means the device is connected to port 9 of the hub at depth 1.
The hub at depth 1 is connected to port 7 of a hub at depth 0.
The hub at depth 0 is connected to a root port.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Modify the USB core to handle the new USB 3.0 speed, "SuperSpeed". This
is 5.0 Gbps (wire speed). There are probably more places that check for
speed that I've missed.
SuperSpeed devices have a 512 byte endpoint 0 max packet size. This shows
up as a bMaxPacketSize0 set to 0x09 (see table 9-8 of the USB 3.0 bus
spec).
xHCI spec says that the xHC can handle intervals up to 2^15 microframes. That
might change when real silicon becomes available.
Add FIXME note for SuperSpeed isochronous endpoints. They can transmit up
to 16 packets in one "burst" before they wait for an acknowledgment of the
packets. They can do up to 3 bursts per microframe (determined by the
mult value in the endpoint companion descriptor). The xHCI driver doesn't
have support for isoc yet, so fix this later.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add PCI initialization code to take control of the xHCI host controller
away from the BIOS, halt, and reset the host controller. The xHCI spec
says that BIOSes must give up the host controller within 5 seconds.
Add some host controller glue functions to handle hardware initialization
and memory allocation for the host controller. The current xHCI
prototypes use PCI interrupts, but the xHCI spec requires MSI-X
interrupts. Add code to support MSI-X interrupts, but use the PCI
interrupts for now.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Description:
This driver is used for Intel Langwell* USB OTG controller in Intel
Moorestown* platform. It tries to implement host/device role switch
according to OTG spec. The actual hsot and device functions are
accomplished in modified EHCI driver and Intel Langwell USB OTG client
controller driver.
* Langwell and Moorestown are names used in development. They are not
approved official name.
Note:
This patch is the first version Intel Langwell USB OTG Transceiver
driver. The development is not finished, and the bug fixing is on going
for some hardware and software issues. The main purpose of this
submission is for code view.
Supported features:
- Data-line Pulsing SRP
- Support HNP to switch roles
- PCI D0/D3 power management support
Known issues:
- HNP is only tested with another Moorestown platform.
- PCI D0/D3 power management support is not fully tested.
- VBus Pulsing SRP is not support in current version.
Signed-off-by: Hao Wu <hao.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Intel Langwell USB Device Controller is a High-Speed USB OTG device
controller in Intel Moorestown platform. It can work in OTG device mode
with Intel Langwell USB OTG transceiver driver as well as device-only
mode. The number of programmable endpoints is different through
controller revision.
NOTE:
This patch is the first version Intel Langwell USB OTG device controller
driver. The bug fixing is on going for some hardware and software
issues. Intel Langwell USB OTG transceiver driver and EHCI driver
patches will be submitted later.
Supported features:
- USB OTG protocol support with Intel Langwell USB OTG transceiver
driver (turn on CONFIG_USB_LANGWELL_OTG)
- Support control, bulk, interrupt and isochronous endpoints
(isochronous not tested)
- PCI D0/D3 power management support
- Link Power Management (LPM) support
Tested gadget drivers:
- g_file_storage
- g_ether
- g_zero
The passed tests:
- g_file_storage: USBCV Chapter 9 tests
- g_file_storage: USBCV MSC tests
- g_file_storage: from/to host files copying
- g_ether: ping, ftp and scp files from/to host
- Hotplug, with and without hubs
Known issues:
- g_ether: failed part of USBCV chap9 tests
- LPM support not fully tested
TODO:
- g_ether: pass all USBCV chap9 tests
- g_zero: pass usbtest tests
- Stress tests on different gadget drivers
- On-chip private SRAM caching support
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1254) splits up the shutdown method of usb_serial_driver
into a disconnect and a release method.
The problem is that the usb-serial core was calling shutdown during
disconnect handling, but drivers didn't expect it to be called until
after all the open file references had been closed. The result was an
oops when the close method tried to use memory that had been
deallocated by shutdown.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1253) prevents the usb-serial core from calling a
driver's port_probe and port_remove methods more than once per port.
It also removes some unnecessary try_module_get() calls and adds a
missing port_remove method call in a failure path.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
CPU/board specific parameters (PLL clock, vif etc...) can be set
by platform_data instead of module_param.
v2: remove irq_sense member in platform_data because it can OR in
IRQF_TRIGGER_LOW or IRQF_TRIGGER_FALLING against IORESOURCE_IRQ in
the struct resource.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Reviewed-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The usb_debug driver was modified to implement serial break handling
by using a "magic" data packet comprised of the sequence:
0x00 0xff 0x01 0xfe 0x00 0xfe 0x01 0xff
When the tty layer requests a serial break the usb_debug driver sends
the magic packet. On the receiving side the magic packet is thrown
away or a sysrq is activated depending on what kernel .config options
have been set.
The generic serial driver was modified as well as the usb serial
headers to generically implement sysrq processing in the same way the
non usb uart based drivers implement the sysrq handling. This will
allow other usb serial devices to implement sysrq handling as desired.
The new usb serial functions are named similarly and implemented
similarly to the uart functions as follows:
usb_serial_handle_break <-> uart_handle_break
usb_serial_handle_sysrq_char <-> uart_handle_sysrq_char
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The usb_debug driver, when used as the console, will always fail to
insert the carriage return and new line sequence as well as randomly
drop console output. This is a result of only having the single
write_urb and that the tty layer will have a lock that prevents the
processing of the back to back urb requests.
The solution is to allow more than one urb to be outstanding and have
a slightly deeper transmit queue. The idea and some code is borrowed
from the ftdi_sio usb driver.
The generic usb serial driver was modified so as to allow the classic
method of 1 write urb, or a multi write urb scheme with N allowed
outstanding urbs where N is controlled by max_in_flight_urbs. When
max_in_flight_urbs in a "struct usb_serial_driver" is non zero the
multi write urb scheme will be used.
The size of 4000 was selected for the usb_debug driver so that the
driver lowers possibility of losing the queued console messages during
the kernel startup.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Use "/* private:" to mark struct members as private so that
scripts/kernel-doc will handle them correctly.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mark internal struct members as /* private: */ so that kernel-doc
won't produce warnings about missing descriptions for them.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1235) adds an array of PCI power-state names, together
with a simple inline accessor routine.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The usb_host class isn't used for anything anymore (it was used for
debug files, but they have moved to debugfs a few kernel releases ago),
so let's delete it before someone accidentally puts a file in it.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1239) updates the kernel's treatment of Unicode. The
character-set conversion routines are well behind the current state of
the Unicode specification: They don't recognize the existence of code
points beyond plane 0 or of surrogate pairs in the UTF-16 encoding.
The old wchar_t 16-bit type is retained because it's still used in
lots of places. This shouldn't cause any new problems; if a
conversion now results in an invalid 16-bit code then before it must
have yielded an undefined code.
Difficult-to-read names like "utf_mbstowcs" are replaced with more
transparent names like "utf8s_to_utf16s" and the ordering of the
parameters is rationalized (buffer lengths come immediate after the
pointers they refer to, and the inputs precede the outputs).
Fortunately the low-level conversion routines are used in only a few
places; the interfaces to the higher-level uni2char and char2uni
methods have been left unchanged.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The NOP OTG transceiver driver needs to be usable from modules.
Make sure its symbols are always accessible at both compile and
link time, and make sure the device instance is allocated from
the heap so that device lifetime rules are obeyed.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Many developers use "/debug/" or "/debugfs/" or "/sys/kernel/debug/"
directory name to mount debugfs filesystem for ftrace according to
./Documentation/tracers/ftrace.txt file.
And, three directory names(ex:/debug/, /debugfs/, /sys/kernel/debug/) is
existed in kernel source like ftrace, DRM, Wireless, Documentation,
Network[sky2]files to mount debugfs filesystem.
debugfs means debug filesystem for debugging easy to use by greg kroah
hartman. "/sys/kernel/debug/" name is suitable as directory name
of debugfs filesystem.
- debugfs related reference: http://lwn.net/Articles/334546/
Fix inconsistency of directory name to mount debugfs filesystem.
* From Steven Rostedt
- find_debugfs() and tracing_files() in this patch.
Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com>
Acked-by : Inaky Perez-Gonzalez <inaky@linux.intel.com>
Reviewed-by : Steven Rostedt <rostedt@goodmis.org>
Reviewed-by : James Smart <james.smart@emulex.com>
CC: Jiri Kosina <trivial@kernel.org>
CC: David Airlie <airlied@linux.ie>
CC: Peter Osterlund <petero2@telia.com>
CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
CC: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds support for block drivers to report their requested nodename
to userspace. It also updates a number of block drivers to provide the
needed subdirectory and device name to be used for them.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds support for USB drivers to report their requested nodename to
userspace. It also updates a number of USB drivers to provide the
needed subdirectory and device name to be used for them.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds support for misc devices to report their requested nodename to
userspace. It also updates a number of misc drivers to provide the
needed subdirectory and device name to be used for them.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds the nodename callback for struct class, struct device_type and
struct device, to allow drivers to send userspace hints on the device
name and subdirectory that should be used for it.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As we're allocating the firmware name dynamically, we no longer need this
definition.
This patch must be applied only after the 5 previous patches from this pacth
set have been applied.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This converts resource and IRQ getbyname functions for the platform
bus to use const char *, I ran into compiler moanings when I tried
using a const char * for looking up a certain resource.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds a new bus notifier event which is emitted _after_ a
device is removed from its driver. This event will be used by the
dma-api debug code to check if a driver has released all dma allocations
for that device.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
During bootup performance tracing we see repeated occurrences of
/sys/kernel/uid/* events for the same uid, leading to a,
in this case, rather pointless userspace processing for the
same uid over and over.
This is usually caused by tools which change their uid to "nobody",
to run without privileges to read data supplied by untrusted users.
This change delays the execution of the (already existing) scheduled
work, to cleanup the uid after one second, so the allocated and announced
uid can possibly be re-used by another process.
This is the current behavior, where almost every invocation of a
binary, which changes the uid, creates two events:
$ read START < /sys/kernel/uevent_seqnum; \
for i in `seq 100`; do su --shell=/bin/true bin; done; \
read END < /sys/kernel/uevent_seqnum; \
echo $(($END - $START))
178
With the delayed cleanup, we get only two events, and userspace finishes
a bit faster too:
$ read START < /sys/kernel/uevent_seqnum; \
for i in `seq 100`; do su --shell=/bin/true bin; done; \
read END < /sys/kernel/uevent_seqnum; \
echo $(($END - $START))
1
Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch mostly follows commit 5998102b90
"ASoC: Refactor WM8731 device registration" to make UDA1380 use standard
device instantiation. Similarly, the I2C device registration temporarily
moves into the magician machine driver before it will find its final
resting place in the board file.
At the same time, platform specific configuration is moved to platform data
and common power/reset GPIO handling moves into the codec driver.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Change ide_drive_t 'drive_data' field from 'unsigned int' type to 'void *'
type, allowing a wider range of values/types to be stored in this field.
Added 'ide_get_drivedata' and 'ide_set_drivedata' helpers to get and set
the 'drive_data' field.
Fixed all host drivers to maintain coherency with the change in the
'drive_data' field type.
Signed-off-by: Joao Ramos <joao.ramos@inov.pt>
[bart: fix qd65xx build, cast to 'unsigned long', minor Coding Style fixups]
Acked-by: Sergei Shtylyov <sshtylyov@ru.montavista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* 'timers-for-linus-migration' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers: Logic to move non pinned timers
timers: /proc/sys sysctl hook to enable timer migration
timers: Identifying the existing pinned timers
timers: Framework for identifying pinned timers
timers: allow deferrable timers for intervals tv2-tv5 to be deferred
Fix up conflicts in kernel/sched.c and kernel/timer.c manually
* 'timers-for-linus-clocksource' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clocksource: prevent selection of low resolution clocksourse also for nohz=on
clocksource: sanity check sysfs clocksource changes
Move the ack_intr() method into 'struct ide_port_ops', also renaming it to
test_irq() while at it...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'unsigned long port_flags' field to ide_hwif_t.
* Add IDE_PFLAG_PROBING port flag and keep it set during probing.
* Fix ide_pio_need_iordy() to not enable IORDY at a probe time
(IORDY may lead to controller lock up on certain controllers
if the port is not occupied).
Loosely based on the recent libata's fix by Tejun, thanks to Alan
for the hint that IDE may also need it.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ide_pio_need_iordy() helper and convert host drivers to use it.
This fixes it8172, it8213, pdc202xx_old, piix, slc90e66 and siimage
host drivers to handle IORDY correctly.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
v2:
Minor fixes per Sergei's review.
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (103 commits)
powerpc: Fix bug in move of altivec code to vector.S
powerpc: Add support for swiotlb on 32-bit
powerpc/spufs: Remove unused error path
powerpc: Fix warning when printing a resource_size_t
powerpc/xmon: Remove unused variable in xmon.c
powerpc/pseries: Fix warnings when printing resource_size_t
powerpc: Shield code specific to 64-bit server processors
powerpc: Separate PACA fields for server CPUs
powerpc: Split exception handling out of head_64.S
powerpc: Introduce CONFIG_PPC_BOOK3S
powerpc: Move VMX and VSX asm code to vector.S
powerpc: Set init_bootmem_done on NUMA platforms as well
powerpc/mm: Fix a AB->BA deadlock scenario with nohash MMU context lock
powerpc/mm: Fix some SMP issues with MMU context handling
powerpc: Add PTRACE_SINGLEBLOCK support
fbdev: Add PLB support and cleanup DCR in xilinxfb driver.
powerpc/virtex: Add ml510 reference design device tree
powerpc/virtex: Add Xilinx ML510 reference design support
powerpc/virtex: refactor intc driver and add support for i8259 cascading
powerpc/virtex: Add support for Xilinx PCI host bridge
...
when compiling linux-mips with kmemtrace enabled, there will be an
error:
include/linux/trace_seq.h:12: error: 'PAGE_SIZE' undeclared here (not in
a function)
I checked the source code and found trace_seq.h used PAGE_SIZE but not
included the relative header file, so, fix it via adding the header file
<asm/page.h>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Wu Zhangjin <wuzj@lemote.com>
LKML-Reference: <1244962350-28702-1-git-send-email-wuzhangjin@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch introduces a new flag SCHED_RESET_ON_FORK which can be passed
to the kernel via sched_setscheduler(), ORed in the policy parameter. If
set this will make sure that when the process forks a) the scheduling
priority is reset to DEFAULT_PRIO if it was higher and b) the scheduling
policy is reset to SCHED_NORMAL if it was either SCHED_FIFO or SCHED_RR.
Why have this?
Currently, if a process is real-time scheduled this will 'leak' to all
its child processes. For security reasons it is often (always?) a good
idea to make sure that if a process acquires RT scheduling this is
confined to this process and only this process. More specifically this
makes the per-process resource limit RLIMIT_RTTIME useful for security
purposes, because it makes it impossible to use a fork bomb to
circumvent the per-process RLIMIT_RTTIME accounting.
This feature is also useful for tools like 'renice' which can then
change the nice level of a process without having this spill to all its
child processes.
Why expose this via sched_setscheduler() and not other syscalls such as
prctl() or sched_setparam()?
prctl() does not take a pid parameter. Due to that it would be
impossible to modify this flag for other processes than the current one.
The struct passed to sched_setparam() can unfortunately not be extended
without breaking compatibility, since sched_setparam() lacks a size
parameter.
How to use this from userspace? In your RT program simply replace this:
sched_setscheduler(pid, SCHED_FIFO, ¶m);
by this:
sched_setscheduler(pid, SCHED_FIFO|SCHED_RESET_ON_FORK, ¶m);
Signed-off-by: Lennart Poettering <lennart@poettering.net>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090615152714.GA29092@tango.0pointer.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch allows the Adaptec firmware to pass on its values for Packetize and
QAS. To do this, the settings max_iu and max_qas have been introduced into
the SPI transport class and populated from the adaptec NVram tables. Domain
validation in the SPI transport class will respect the max settings when
configuring to the highest possible speed for testing.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
At present, every architecture that supports perf_counters has to
declare set_perf_counter_pending() in its arch-specific headers.
This consolidates the declarations into a single declaration in one
common place, include/linux/perf_counter.h. On powerpc, we continue
to provide a static inline definition of set_perf_counter_pending()
in the powerpc hw_irq.h.
Also, this removes from the x86 perf_counter.h the unused null
definitions of {test,clear}_perf_counter_pending.
Reported-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: benh@kernel.crashing.org
LKML-Reference: <18998.13388.920691.523227@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce a gup_fast() variant which is usable from IRQ/NMI context.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Nick Piggin <npiggin@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The purpose of this change is to allow __getname() users to pass a
custom GFP mask to kmem_cache_alloc(). This is needed for annotating
a certain kmemcheck false positive.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
This gets rid of a heap of false-positive warnings from the tracer
code due to the use of bitfields.
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
The use of bitfields here would lead to false positive warnings with
kmemcheck. Silence them.
(Additionally, one erroneous comment related to the bitfield was also
fixed.)
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Add the bitfield API which can be used to annotate bitfields in structs
and get rid of false positive reports.
According to Al Viro, the syntax we were using (putting #ifdef inside
macro arguments) was not valid C. He also suggested using begin/end
markers instead, which is what we do now.
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
This adds support for tracking the initializedness of memory that
was allocated with the page allocator. Highmem requests are not
tracked.
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
[build fix for !CONFIG_KMEMCHECK]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
This patch hooks into the DMA API to prevent the reporting of the
false positives that would otherwise be reported when memory is
accessed that is also used directly by devices.
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
With kmemcheck enabled, the slab allocator needs to do this:
1. Tell kmemcheck to allocate the shadow memory which stores the status of
each byte in the allocation proper, e.g. whether it is initialized or
uninitialized.
2. Tell kmemcheck which parts of memory that should be marked uninitialized.
There are actually a few more states, such as "not yet allocated" and
"recently freed".
If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
memory that can take page faults because of kmemcheck.
If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
request memory with the __GFP_NOTRACK flag. This does not prevent the page
faults from occuring, however, but marks the object in question as being
initialized so that no warnings will ever be produced for this object.
In addition to (and in contrast to) __GFP_NOTRACK, the
__GFP_NOTRACK_FALSE_POSITIVE flag indicates that the allocation should
not be tracked _because_ it would produce a false positive. Their values
are identical, but need not be so in the future (for example, we could now
enable/disable false positives with a config option).
Parts of this patch were contributed by Pekka Enberg but merged for
atomicity.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
The V3 regulator can be configured with an external resistor
connected to the feedback pin (R24 in the data sheet) to
increase the voltage range.
For example, hx4700 has R24 = 3.32 kOhm to achieve a maximum
V3 voltage of 1.55 V which is needed for 624 MHz CPU frequency.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
This patch adds regulator drivers for National Semiconductors LP3971 PMIC.
This LP3971 PMIC controller has 3 DC/DC voltage converters and 5 low
drop-out (LDO) regulators. LP3971 PMIC controller uses I2C interface.
Reviewed-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
The userspace-consumer driver allows control of voltage and current
regulator state from userspace. This is required for fine-grained
power management of devices that are completely controller by userspace
applications, e.g. a GPS transciever connected to a serial port.
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
The Maxim 1586 regulator is a voltage regulator with 2
voltage outputs, specially suitable for Marvell PXA
chips. One output is in the range of required VCC_CORE by
the PXA27x chips, the other in the VCC_USIM required as well
by PXA27x chips.
The chip is controlled through the I2C bus.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Let's use TICKS instead of US, so PSCHED_TICKS2NS and PSCHED_NS2TICKS
(like in PSCHED_TICKS_PER_SEC already) to avoid misleading.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce bio_list_peek(), to obtain a pointer to the first bio on the bio_list
without actually removing it from the list. This is needed when you want to
serialize based on the list being empty or not.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Many processor architectures have no 64-bit atomic instructions, but
we need atomic64_t in order to support the perf_counter subsystem.
This adds an implementation of 64-bit atomic operations using hashed
spinlocks to provide atomicity. For each atomic operation, the address
of the atomic64_t variable is hashed to an index into an array of 16
spinlocks. That spinlock is taken (with interrupts disabled) around the
operation, which can then be coded non-atomically within the lock.
On UP, all the spinlock manipulation goes away and we simply disable
interrupts around each operation. In fact gcc eliminates the whole
atomic64_lock variable as well.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add kernel modesetting support to radeon driver, use the ttm memory
manager to manage memory and DRM/GEM to provide userspace API.
In order to avoid backward compatibility issue and to allow clean
design and code the radeon kernel modesetting use different code path
than old radeon/drm driver.
When kernel modesetting is enabled the IOCTL of radeon/drm
driver are considered as invalid and an error message is printed
in the log and they return failure.
KMS enabled userspace will use new API to talk with the radeon/drm
driver. The new API provide functions to create/destroy/share/mmap
buffer object which are then managed by the kernel memory manager
(here TTM). In order to submit command to the GPU the userspace
provide a buffer holding the command stream, along this buffer
userspace have to provide a list of buffer object used by the
command stream. The kernel radeon driver will then place buffer
in GPU accessible memory and will update command stream to reflect
the position of the different buffers.
The kernel will also perform security check on command stream
provided by the user, we want to catch and forbid any illegal use
of the GPU such as DMA into random system memory or into memory
not owned by the process supplying the command stream. This part
of the code is still incomplete and this why we propose that patch
as a staging driver addition, future security might forbid current
experimental userspace to run.
This code support the following hardware : R1XX,R2XX,R3XX,R4XX,R5XX
(radeon up to X1950). Works is underway to provide support for R6XX,
R7XX and newer hardware (radeon from HD2XXX to HD4XXX).
Authors:
Jerome Glisse <jglisse@redhat.com>
Dave Airlie <airlied@redhat.com>
Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
TTM is a GPU memory manager subsystem designed for use with GPU
devices with various memory types (On-card VRAM, AGP,
PCI apertures etc.). It's essentially a helper library that assists
the DRM driver in creating and managing persistent buffer objects.
TTM manages placement of data and CPU map setup and teardown on
data movement. It can also optionally manage synchronization of
data on a per-buffer-object level.
TTM takes care to provide an always valid virtual user-space address
to a buffer object which makes user-space sub-allocation of
big buffer objects feasible.
TTM uses a fine-grained per buffer-object locking scheme, taking
care to release all relevant locks when waiting for the GPU.
Although this implies some locking overhead, it's probably a big
win for devices with multiple command submission mechanisms, since
the lock contention will be minimal.
TTM can be used with whatever user-space interface the driver
chooses, including GEM. It's used by the upcoming Radeon KMS DRM driver
and is also the GPU memory management core of various new experimental
DRM drivers.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch fixes the mmap/truncate race that was fixed for delayed
allocation by merging ext4_{journalled,normal,da}_writepage() into
ext4_writepage().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (53 commits)
.gitignore: ignore *.lzma files
kbuild: add generic --set-str option to scripts/config
kbuild: simplify argument loop in scripts/config
kbuild: handle non-existing options in scripts/config
kallsyms: generalize text region handling
kallsyms: support kernel symbols in Blackfin on-chip memory
documentation: make version fix
kbuild: fix a compile warning
gitignore: Add GNU GLOBAL files to top .gitignore
kbuild: fix delay in setlocalversion on readonly source
README: fix misleading pointer to the defconf directory
vmlinux.lds.h update
kernel-doc: cleanup perl script
Improve vmlinux.lds.h support for arch specific linker scripts
kbuild: fix headers_exports with boolean expression
kbuild/headers_check: refine extern check
kbuild: fix "Argument list too long" error for "make headers_check",
ignore *.patch files
Remove bashisms from scripts
menu: fix embedded menu presentation
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
mlx4_core: Don't double-free IRQs when falling back from MSI-X to INTx
IB/mthca: Don't double-free IRQs when falling back from MSI-X to INTx
IB/mlx4: Add strong ordering to local inval and fast reg work requests
IB/ehca: Remove superfluous bitmasks from QP control block
RDMA/cxgb3: Limit fast register size based on T3 limitations
RDMA/cxgb3: Report correct port state and MTU
mlx4_core: Add module parameter for number of MTTs per segment
IB/mthca: Add module parameter for number of MTTs per segment
RDMA/nes: Fix off-by-one bugs in reset_adapter_ne020() and init_serdes()
infiniband: Remove void casts
IB/ehca: Increment version number
IB/ehca: Remove unnecessary memory operations for userspace queue pairs
IB/ehca: Fall back to vmalloc() for big allocations
IB/ehca: Replace vmalloc() with kmalloc() for queue allocation
In addition to KT_DEAD which has limited support for diacriticals,
there is KT_DEAD2 that can support 256 criticals, so let's advertise
it in <linux/keyboard.h>.
This lets userland know abut the drivers/char/keyboard.c function
k_dead2, which supports more than the few trivial ones that k_dead
supports.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (25 commits)
atmel-mci: add MCI2 register definitions
atmel-mci: Integrate AT91 specific definition in header file
tmio_mmc: allow compilation for ASIC3
mmc_block: do not DMA to stack
sdhci: Print ADMA status and pointer on debug
tmio_mmc: fix clock setup
tmio_mmc: map SD control registers after enabling the MFD cell
tmio_mmc: correct probe return value for num_resources != 3
tmio_mmc: don't use set_irq_type
tmio_mmc: add bus_shift support
MFD,mmc: tmio_mmc: make HCLK configurable
mmc_spi: don't use EINVAL for possible transmission errors
cb710: more cleanup for the DEBUG case.
sdhci: platform driver for SDHCI
mxcmmc: remove frequency workaround
cb710: handle DEBUG define in Makefile
cb710: add missing parenthesis
cb710: fix printk format string
mmc: Driver for CB710/720 memory card reader (MMC part)
pxamci: add regulator support.
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: fix inverted wheel for bluetooth version of apple mighty mouse
HID: no more reinitializtion is needed in post_reset
HID: hidraw -- fix comment about accepted devices
HID: Multitouch support for the N-Trig touchscreen
HID: add new multitouch and digitizer contants
HID: autocentering support for Logitech Force 3D Pro
HID: fix hid-ff drivers so that devices work even without ff support
HID: force feedback support for SmartJoy PLUS PS2/USB adapter
HID: Wacom Graphire Bluetooth driver
HID: autocentering support for Logitech G25 Racing Wheel
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (417 commits)
MAINTAINERS: EB110ATX is not ebsa110
MAINTAINERS: update Eric Miao's email address and status
fb: add support of LCD display controller on pxa168/910 (base layer)
[ARM] 5552/1: ep93xx get_uart_rate(): use EP93XX_SYSCON_PWRCNT and EP93XX_SYSCON_PWRCN
[ARM] pxa/sharpsl_pm: zaurus needs generic pxa suspend/resume routines
[ARM] 5544/1: Trust PrimeCell resource sizes
[ARM] pxa/sharpsl_pm: cleanup of gpio-related code.
[ARM] pxa/sharpsl_pm: drop set_irq_type calls
[ARM] pxa/sharpsl_pm: merge pxa-specific code into generic one
[ARM] pxa/sharpsl_pm: merge the two sharpsl_pm.c since it's now pxa specific
[ARM] sa1100: remove unused collie_pm.c
[ARM] pxa: fix the conflicting non-static declarations of global_gpios[]
[ARM] 5550/1: Add default configure file for w90p910 platform
[ARM] 5549/1: Add clock api for w90p910 platform.
[ARM] 5548/1: Add gpio api for w90p910 platform
[ARM] 5551/1: Add multi-function pin api for w90p910 platform.
[ARM] Make ARM_VIC_NR depend on ARM_VIC
[ARM] 5546/1: ARM PL022 SSP/SPI driver v3
ARM: OMAP4: SMP: Update defconfig for OMAP4430
ARM: OMAP4: SMP: Enable SMP support for OMAP4430
...
Updated after review by Tim Abbott.
- Use HEAD_TEXT_SECTION
- Drop use of section-names.h and delete file
- Introduce EXIT_CALL
Deleting section-names.h required a few simple
updates of init.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Tim Abbott <tabbott@ksplice.com>
Decouple the creation and destruction of the net_device from the order
of discovery and removal of nodes with RFC 2734 unit directories since
there is no reliable order. The net_device is now created when the
first RFC 2734 unit on a card is discovered, and destroyed when the last
RFC 2734 unit on a card went away. This includes all remote units as
well as the local unit, which is therefore tracked as a peer now too.
Also, locking around the list of peers is slightly extended to guard
against peer removal. As a side effect, fwnet_peer.pdg_lock has become
superfluous and is deleted.
Peer data (max_rec, speed, node ID, generation) are updated more
carefully.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Implement IPv4 over IEEE 1394 as per RFC 2734 for the newer firewire
stack. This feature has only been present in the older ieee1394 stack
via the eth1394 driver.
Still to do:
- fix ipv4_priv and ipv4_node lifetime logic
- fix determination of speeds and max payloads
- fix bus reset handling
- fix unaligned memory accesses
- fix coding style
- further testing/ improvement of fragment reassembly
- perhaps multicast support
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (rebased, copyright note, changelog)
Tlabel is a 6 bits wide datum. Wrap it after 63 rather than 31 for more
safety against transaction label exhaustion and potential responders'
transaction layer bugs. (As noted by Guus Sliepen, this change requires
an expansion of tlabel_mask to 64 bits.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This extra check will avoid Broadcast_Channel register related traffic
to many IIDC, SBP-2, and AV/C devices which aren't IRMC or have a
max_rec < 8 (i.e. support < 512 bytes async payload). This avoids a
little bit of traffic after bus reset and is even more careful with
devices which don't implement this CSR.
The assumption is that no other protocol than IP over 1394 uses the
broadcast channel for streams.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Fix kernel-doc warnings in linux/irq.h:
Warning(include/linux/irq.h:201): No description found for parameter 'node'
Warning(include/linux/irq.h:201): Excess struct/union/enum/typedef member 'cpu' description in 'irq_desc'
Warning(include/linux/irq.h:434): No description found for parameter 'node'
Warning(include/linux/irq.h:434): Excess function parameter 'cpu' description in 'alloc_desc_masks'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <4A3467EC.50006@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Toshiba parts all have a 24 MHz HCLK, but HTC ASIC3 has a 24.576 MHz HCLK
and AMD Imageon w228x's HCLK is 80 MHz. With this patch, the MFD driver
provides the HCLK frequency to tmio_mmc via mfd_cell->driver_data.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Acked-by: Ian Molton <ian@mnementh.co.uk>
Acked-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>
The code is divided in two parts. There is a virtual 'bus' driver
that handles PCI device and registers three new devices one per card
reader type. The other driver handles SD/MMC part of the reader.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>
* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
avr32: Fix oops on unaligned user access
avr32: Add support for Mediama RMTx add-on board for ATNGW100
avr32: Change Atmel ATNGW100 config to add choice of add-on board
Fix MIMC200 board LCD init
avr32: Fix clash in ATMEL_USART_ flags
avr32: remove obsolete hw_interrupt_type
avr32: Solves problem with inverted MCI detect pin on Merisc board
atmel-mci: Add support for inverted detect pin
Now that ASoC subdevices can be regular devices they can have normal
suspend and resume calls from their buses. However, suspending them
individually is not desirable since this can lead to problems such as
pops and clicks from devices being suspended with their signals being
amplified or clocks being stopped suddenly.
This will be resolved by having the normal device model suspend and
resume calls call into ASoC which will suspend the entire card while any
of its components are suspended. At present this is not yet implemented
but in order to aid the transition of drivers to the standard device
model this patch adds API calls for the notifications.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
General description: kmemcheck is a patch to the linux kernel that
detects use of uninitialized memory. It does this by trapping every
read and write to memory that was allocated dynamically (e.g. using
kmalloc()). If a memory address is read that has not previously been
written to, a message is printed to the kernel log.
Thanks to Andi Kleen for the set_memory_4k() solution.
Andrew Morton suggested documenting the shadow member of struct page.
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
[export kmemcheck_mark_initialized]
[build fix for setup_max_cpus]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
This patch improves ctnetlink event reliability if one broadcast
listener has set the NETLINK_BROADCAST_ERROR socket option.
The logic is the following: if an event delivery fails, we keep
the undelivered events in the missed event cache. Once the next
packet arrives, we add the new events (if any) to the missed
events in the cache and we try a new delivery, and so on. Thus,
if ctnetlink fails to deliver an event, we try to deliver them
once we see a new packet. Therefore, we may lose state
transitions but the userspace process gets in sync at some point.
At worst case, if no events were delivered to userspace, we make
sure that destroy events are successfully delivered. Basically,
if ctnetlink fails to deliver the destroy event, we remove the
conntrack entry from the hashes and we insert them in the dying
list, which contains inactive entries. Then, the conntrack timer
is added with an extra grace timeout of random32() % 15 seconds
to trigger the event again (this grace timeout is tunable via
/proc). The use of a limited random timeout value allows
distributing the "destroy" resends, thus, avoiding accumulating
lots "destroy" events at the same time. Event delivery may
re-order but we can identify them by means of the tuple plus
the conntrack ID.
The maximum number of conntrack entries (active or inactive) is
still handled by nf_conntrack_max. Thus, we may start dropping
packets at some point if we accumulate a lot of inactive conntrack
entries that did not successfully report the destroy event to
userspace.
During my stress tests consisting of setting a very small buffer
of 2048 bytes for conntrackd and the NETLINK_BROADCAST_ERROR socket
flag, and generating lots of very small connections, I noticed
very few destroy entries on the fly waiting to be resend.
A simple way to test this patch consist of creating a lot of
entries, set a very small Netlink buffer in conntrackd (+ a patch
which is not in the git tree to set the BROADCAST_ERROR flag)
and invoke `conntrack -F'.
For expectations, no changes are introduced in this patch.
Currently, event delivery is only done for new expectations (no
events from expectation expiration, removal and confirmation).
In that case, they need a per-expectation event cache to implement
the same idea that is exposed in this patch.
This patch can be useful to provide reliable flow-accouting. We
still have to add a new conntrack extension to store the creation
and destroy time.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds the hlist_nulls_add_head() function which is
based on hlist_nulls_add_head_rcu() but without the use of
rcu_assign_pointer(). It also adds hlist_nulls_del which is
exactly the same like hlist_nulls_del_rcu().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch moves the helper destruction to a function that lives
in nf_conntrack_helper.c. This new function is used in the patch
to add ctnetlink reliable event delivery.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch reworks the per-cpu event caching to use the conntrack
extension infrastructure.
The main drawback is that we consume more memory per conntrack
if event delivery is enabled. This patch is required by the
reliable event delivery that follows to this patch.
BTW, this patch allows you to enable/disable event delivery via
/proc/sys/net/netfilter/nf_conntrack_events in runtime, although
you can still disable event caching as compilation option.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
commit 3f68535ada (clocksource: sanity check sysfs clocksource
changes) prevents selection of non high resolution capable
clocksources when high resolution mode is active, but did not take
into account that the same rules apply for highres=off nohz=on.
Check the tick device mode instead of hrtimer_hres_active() to verify
whether the system needs to be protected from a switch to jiffies or
other non highres capable clock sources.
Reported-by: Luming Yu <luming.yu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is sometimes a need for the ocores driver to add devices to the
bus when installed.
i2c_register_board_info can not always be used, because the I2C devices
are not known at an early state, they could for instance be connected
on a I2C bus on a PCI device which has the Open Cores IP.
i2c_new_device can not be used in all cases either since the resulting
bus nummer might be unknown.
The solution is the pass a list of I2C devices in the platform data to
the Open Cores driver. This is useful for MFD drivers.
Signed-off-by: Richard Röjfors <richard.rojfors.ext@mocean-labs.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Rationale: kmemcheck needs to be able to schedule a tasklet without
touching any dynamically allocated memory _at_ _all_ (since that would
lead to a recursive page fault). This tasklet is used for writing the
error reports to the kernel log.
The new scheduling function avoids touching any other tasklets by
inserting the new tasklist as the head of the "tasklet_hi" list instead
of on the tail.
Also don't wake up the softirq thread lest the scheduler access some
tracked memory and we go down with a recursive page fault.
In this case, we'd better just wait for the maximum time of 1/HZ for the
message to appear.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Move the SLAB struct kmem_cache definition to <linux/slab_def.h> like
with SLUB so kmemcheck can access ->ctor and ->flags.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
[rebased for mainline inclusion]
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (50 commits)
drm: include kernel list header file in hashtab header
drm: Export hash table functionality.
drm: Split out the mm declarations in a separate header. Add atomic operations.
drm/radeon: add support for RV790.
drm/radeon: add rv740 drm support.
drm_calloc_large: check right size, check integer overflow, use GFP_ZERO
drm: Eliminate magic I2C frobbing when reading EDID
drm/i915: duplicate desired mode for use by fbcon.
drm/via: vfree() no need checking before calling it
drm: Replace DRM_DEBUG with DRM_DEBUG_DRIVER in i915 driver
drm: Replace DRM_DEBUG with DRM_DEBUG_MODE in drm_mode
drm/i915: Replace DRM_DEBUG with DRM_DEBUG_KMS in intel_sdvo
drm/i915: replace DRM_DEBUG with DRM_DEBUG_KMS in intel_lvds
drm: add separate drm debugging levels
radeon: remove _DRM_DRIVER from the preadded sarea map
drm: don't associate _DRM_DRIVER maps with a master
drm: simplify kcalloc() call to kzalloc().
intelfb: fix spelling of "CLOCK"
drm: fix LOCK_TEST_WITH_RETURN macro
drm/i915: Hook connector to encoder during load detection (fixes tv/vga detect)
...
Move
arch/x86/kernel/acpi/boot.c: acpi_parse_mcfg()
to
arch/x86/pci/mmconfig-shared.c: pci_parse_mcfg()
where it is used, and make it static.
Move associated globals and helper routine with it.
No functional change.
This code move is in preparation for SFI support,
which will allow the PCI code to find the MCFG table
on systems which do not support ACPI.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Previously [5], now [8].
sprintf(acpi_device_bid(device), "CPU%X", cpu_id)
now looks better on systems with more than 0xFF processors.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This will help kmemcheck (and possibly other debugging tools) since we
can now simply pass regs->bp to the stack tracer instead of specifying
the number of stack frames to skip, which is unreliable if gcc decides
to inline functions, etc.
Note that this makes the API incomplete for other architectures, but I
expect that those can be updated lazily, e.g. when they need it.
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: use more NOFS allocation
dlm: connect to nodes earlier
dlm: fix use count with multiple joins
dlm: Make name input parameter of {,dlm_}new_lockspace() const
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter: Start documenting HAVE_PERF_COUNTERS requirements
perf_counter: Add forward/backward attribute ABI compatibility
perf record: Explicity program a default counter
perf_counter: Remove PERF_TYPE_RAW special casing
perf_counter: PERF_TYPE_HW_CACHE is a hardware counter too
powerpc, perf_counter: Fix performance counter event types
perf_counter/x86: Add a quirk for Atom processors
perf_counter tools: Remove one L1-data alias
git commit 0a0c5168 "PM: Introduce functions for suspending and resuming
device interrupts" introduced some helper functions. However these
functions are only available for architectures which support
GENERIC_HARDIRQS.
Other architectures will see this build error:
drivers/built-in.o: In function `sysdev_suspend':
(.text+0x15138): undefined reference to `check_wakeup_irqs'
drivers/built-in.o: In function `device_power_up':
(.text+0x1cb66): undefined reference to `resume_device_irqs'
drivers/built-in.o: In function `device_power_down':
(.text+0x1cb92): undefined reference to `suspend_device_irqs'
To fix this add some empty inline functions for !GENERIC_HARDIRQS.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The *_nvs_* routines in swsusp.c make use of the io*map()
functions, which are only provided for HAS_IOMEM, thus
breaking compilation if HAS_IOMEM is not set. Fix this
by moving the *_nvs_* routines into hibernate_nvs.c, which
is only compiled if HAS_IOMEM is set.
[rjw: Change the name of the new file to hibernate_nvs.c, add the
license line to the header comment.]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch removes the legacy callbacks ->suspend() and
->resume() from struct device_type. These callbacks seem
unused, and new code should instead make use of struct
dev_pm_ops.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Remove the ->suspend_late() and ->resume_early() callbacks
from struct bus_type V2. These callbacks are legacy stuff
at this point and since there seem to be no in-tree users
we may as well remove them. New users should use dev_pm_ops.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1241) renames a bunch of functions in the PM core.
Rather than go through a boring list of name changes, suffice it to
say that in the end we have a bunch of pairs of functions:
device_resume_noirq dpm_resume_noirq
device_resume dpm_resume
device_complete dpm_complete
device_suspend_noirq dpm_suspend_noirq
device_suspend dpm_suspend
device_prepare dpm_prepare
in which device_X does the X operation on a single device and dpm_X
invokes device_X for all devices in the dpm_list.
In addition, the old dpm_power_up and device_resume_noirq have been
combined into a single function (dpm_resume_noirq).
Lastly, dpm_suspend_start and dpm_resume_end are the renamed versions
of the former top-level device_suspend and device_resume routines.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Rename the functions performing "_noirq" dev_pm_ops
operations from device_power_down() and device_power_up()
to device_suspend_noirq() and device_resume_noirq().
The new function names are chosen to show that the functions
are responsible for calling the _noirq() versions to finalize
the suspend/resume operation. The current function names do
not perform power down/up anymore so the names may be misleading.
Global function renames:
- device_power_down() -> device_suspend_noirq()
- device_power_up() -> device_resume_noirq()
Static function renames:
- suspend_device_noirq() -> __device_suspend_noirq()
- resume_device_noirq() -> __device_resume_noirq()
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Attached is the ELS/CT pass-thru patch for the FC Transport. The patch
creates a generic framework that lays on top of bsg and the SGIO v4 ioctl
in order to pass transaction requests to LLDD's.
The interface supports the following operations:
On an fc_host basis:
Request login to the specified N_Port_ID, creating an fc_rport.
Request logout of the specified N_Port_ID, deleting an fc_rport
Send ELS request to specified N_Port_ID w/o requiring a login, and
wait for ELS response.
Send CT request to specified N_Port_ID and wait for CT response.
Login is required, but LLDD is allowed to manage login and decide
whether it stays in place after the request is satisfied.
Vendor-Unique request. Allows a LLDD-specific request to be passed
to the LLDD, and the passing of a response back to the application.
On an fc_rport basis:
Send ELS request to nport and wait for ELS response.
Send CT request to nport and wait for CT response.
The patch also exports several headers from include/scsi such that
they can be available to user-space applications:
include/scsi/scsi.h
include/scsi/scsi_netlink.h
include/scsi/scsi_netlink_fc.h
include/scsi/scsi_bsg_fc.h
For further information, refer to the last RFC:
http://marc.info/?l=linux-scsi&m=123436574018579&w=2
Note: Documentation is still spotty and will be added later.
[bharrosh@panasas.com: update for new block API]
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (290 commits)
ALSA: pcm - Update document about xrun_debug proc file
ALSA: lx6464es - support standard alsa module parameters
ALSA: snd_usb_caiaq: set mixername
ALSA: hda - add quirk for STAC92xx (SigmaTel STAC9205)
ALSA: use card device as parent for jack input-devices
ALSA: sound/ps3: Correct existing and add missing annotations
ALSA: sound/ps3: Restructure driver source
ALSA: sound/ps3: Fix checkpatch issues
ASoC: Fix lm4857 control
ALSA: ctxfi - Clear PCM resources at hw_params and hw_free
ALSA: ctxfi - Check the presence of SRC instance in PCM pointer callbacks
ALSA: ctxfi - Add missing start check in atc_pcm_playback_start()
ALSA: ctxfi - Add use_system_timer module option
ALSA: usb - Add boot quirk for C-Media 6206 USB Audio
ALSA: ctxfi - Fix wrong model id for UAA
ALSA: ctxfi - Clean up probe routines
ALSA: hda - Fix the previous tagra-8ch patch
ALSA: hda - Add 7.1 support for MSI GX620
ALSA: pcm - A helper function to compose PCM stream name for debug prints
ALSA: emu10k1 - Fix minimum periods for efx playback
...
* 'topic/slab/earlyboot-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slab: setup cpu caches later on when interrupts are enabled
slab,slub: don't enable interrupts during early boot
slab: fix gfp flag in setup_cpu_cache()
x86: make zap_low_mapping could be used early
irq: slab alloc for default irq_affinity
memcg: fix page_cgroup fatal error in FLATMEM
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (154 commits)
[SCSI] osd: Remove out-of-tree left overs
[SCSI] libosd: Use REQ_QUIET requests.
[SCSI] osduld: use filp_open() when looking up an osd-device
[SCSI] libosd: Define an osd_dev wrapper to retrieve the request_queue
[SCSI] libosd: osd_req_{read,write} takes a length parameter
[SCSI] libosd: Let _osd_req_finalize_data_integrity receive number of out_bytes
[SCSI] libosd: osd_req_{read,write}_kern new API
[SCSI] libosd: Better printout of OSD target system information
[SCSI] libosd: OSD2r05: Attribute definitions
[SCSI] libosd: OSD2r05: Additional command enums
[SCSI] mpt fusion: fix up doc book comments
[SCSI] mpt fusion: Added support for Broadcast primitives Event handling
[SCSI] mpt fusion: Queue full event handling
[SCSI] mpt fusion: RAID device handling and Dual port Raid support is added
[SCSI] mpt fusion: Put IOC into ready state if it not already in ready state
[SCSI] mpt fusion: Code Cleanup patch
[SCSI] mpt fusion: Rescan SAS topology added
[SCSI] mpt fusion: SAS topology scan changes, expander events
[SCSI] mpt fusion: Firmware event implementation using seperate WorkQueue
[SCSI] mpt fusion: rewrite of ioctl_cmds internal generated function
...
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest: (31 commits)
lguest: add support for indirect ring entries
lguest: suppress notifications in example Launcher
lguest: try to batch interrupts on network receive
lguest: avoid sending interrupts to Guest when no activity occurs.
lguest: implement deferred interrupts in example Launcher
lguest: remove obsolete LHREQ_BREAK call
lguest: have example Launcher service all devices in separate threads
lguest: use eventfds for device notification
eventfd: export eventfd_signal and eventfd_fget for lguest
lguest: allow any process to send interrupts
lguest: PAE fixes
lguest: PAE support
lguest: Add support for kvm_hypercall4()
lguest: replace hypercall name LHCALL_SET_PMD with LHCALL_SET_PGD
lguest: use native_set_* macros, which properly handle 64-bit entries when PAE is activated
lguest: map switcher with executable page table entries
lguest: fix writev returning short on console output
lguest: clean up length-used value in example launcher
lguest: Segment selectors are 16-bit long. Fix lg_cpu.ss1 definition.
lguest: beyond ARRAY_SIZE of cpu->arch.gdt
...
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-virtio:
virtio: enhance id_matching for virtio drivers
virtio: fix id_matching for virtio drivers
virtio: handle short buffers in virtio_rng.
virtio_blk: add missing __dev{init,exit} markings
virtio: indirect ring entries (VIRTIO_RING_F_INDIRECT_DESC)
virtio: teach virtio_has_feature() about transport features
virtio: expose features in sysfs
virtio_pci: optional MSI-X support
virtio_pci: split up vp_interrupt
virtio: find_vqs/del_vqs virtio operations
virtio: add names to virtqueue struct, mapping from devices to queues.
virtio: meet virtio spec by finalizing features before using device
virtio: fix obsolete documentation on probe function
* 'cuse' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
CUSE: implement CUSE - Character device in Userspace
fuse: export symbols to be used by CUSE
fuse: update fuse_conn_init() and separate out fuse_conn_kill()
fuse: don't use inode in fuse_file_poll
fuse: don't use inode in fuse_do_ioctl() helper
fuse: don't use inode in fuse_sync_release()
fuse: create fuse_do_open() helper for CUSE
fuse: clean up args in fuse_finish_open() and fuse_release_fill()
fuse: don't use inode in helpers called by fuse_direct_io()
fuse: add members to struct fuse_file
fuse: prepare fuse_direct_io() for CUSE
fuse: clean up fuse_write_fill()
fuse: use struct path in release structure
fuse: misc cleanups
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param:
module: cleanup FIXME comments about trimming exception table entries.
module: trim exception table on init free.
module: merge module_alloc() finally
uml module: fix uml build process due to this merge
x86 module: merge the rest functions with macros
x86 module: merge the same functions in module_32.c and module_64.c
uvesafb: improve parameter handling.
module_param: allow 'bool' module_params to be bool, not just int.
module_param: add __same_type convenience wrapper for __builtin_types_compatible_p
module_param: split perm field into flags and perm
module_param: invbool should take a 'bool', not an 'int'
cyber2000fb.c: use proper method for stopping unload if CONFIG_ARCH_SHARK
* 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (29 commits)
ide: re-implement ide_pci_init_one() on top of ide_pci_init_two()
ide: unexport ide_find_dma_mode()
ide: fix PowerMac bootup oops
ide: skip probe if there are no devices on the port (v2)
sl82c105: add printk() logging facility
ide-tape: fix proc warning
ide: add IDE_DFLAG_NIEN_QUIRK device flag
ide: respect quirk_drives[] list on all controllers
hpt366: enable all quirks for devices on quirk_drives[] list
hpt366: sync quirk_drives[] list with pdc202xx_{new,old}.c
ide: remove superfluous SELECT_MASK() call from do_rw_taskfile()
ide: remove superfluous SELECT_MASK() call from ide_driveid_update()
icside: remove superfluous ->maskproc method
ide-tape: fix IDE_AFLAG_* atomic accesses
ide-tape: change IDE_AFLAG_IGNORE_DSC non-atomically
pdc202xx_old: kill resetproc() method
pdc202xx_old: don't call pdc202xx_reset() on IRQ timeout
pdc202xx_old: use ide_dma_test_irq()
ide: preserve Host Protected Area by default (v2)
ide-gd: implement block device ->set_capacity method (v2)
...
This driver is originally written by Lennert, modified by Green to be
feature complete, and ported by Jun Nie and Kevin Liu for pxa168/910
processors.
The patch adds support for the on-chip LCD display controller, it
currently supports the base (graphics) layer only.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Green Wan <gwan@marvell.com>
Cc: Peter Liao <pliao@marvell.com>
Signed-off-by: Jun Nie <njun@marvell.com>
Signed-off-by: Kevin Liu <kliu5@marvell.com>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
As explained by Benjamin Herrenschmidt:
Oh and btw, your patch alone doesn't fix powerpc, because it's missing
a whole bunch of GFP_KERNEL's in the arch code... You would have to
grep the entire kernel for things that check slab_is_available() and
even then you'll be missing some.
For example, slab_is_available() didn't always exist, and so in the
early days on powerpc, we used a mem_init_done global that is set form
mem_init() (not perfect but works in practice). And we still have code
using that to do the test.
Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators
in early boot code to avoid enabling interrupts.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Conflicts:
drivers/message/fusion/mptsas.c
fixed up conflict between req->data_len accessors and mptsas driver updates.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This is a followup patch to the one implemeting rdesc representation in debugfs
rather than being dependent on compile-time CONFIG_HID_DEBUG setting.
The API of the appropriate formatting functions is slightly modified -- if
they are passed seq_file pointer, the one-shot output for 'rdesc' file mode
is used, and therefore the message is formatted into the corresponding seq_file
immediately.
Otherwise the called function allocated a new buffer, formats the text into the
buffer and returns the pointer to it, so that it can be queued into the ring-buffer
of the processess blocked waiting on input on 'events' file in debugfs.
'debug' parameter to the 'hid' module is now used solely for the prupose of inetrnal
driver state debugging (parser, transport, etc).
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
It is a little bit inconvenient for people who have some non-standard
HID hardware (usually violating the HID specification) to have to
recompile kernel with CONFIG_HID_DEBUG to be able to see kernel's perspective
of the HID report descriptor and observe the parsed events. Plus the messages
are then mixed up inconveniently with the rest of the dmesg stuff.
This patch implements /sys/kernel/debug/hid/<device>/rdesc file, which
represents the kernel's view of report descriptor (both the raw report
descriptor data and parsed contents).
With all the device-specific debug data being available through debugfs, there
is no need for keeping CONFIG_HID_DEBUG, as the 'debug' parameter to the
hid module will now only output only driver-specific debugging options, which has
absolutely minimal memory footprint, just a few error messages and one global
flag (hid_debug).
We use the current set of output formatting functions. The ones that need to be
used both for one-shot rdesc seq_file and also for continuous flow of data
(individual reports, as being sent by the device) distinguish according to the
passed seq_file parameter, and if it is NULL, it still output to kernel ringbuffer,
otherwise the corresponding seq_file is used for output.
The format of the output is preserved.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
We no longer need an efficient mechanism to force the Guest back into
host userspace, as each device is serviced without bothering the main
Guest process (aka. the Launcher).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently, when a Guest wants to perform I/O it calls LHCALL_NOTIFY with
an address: the main Launcher process returns with this address, and figures
out what device to run.
A far nicer model is to let processes bind an eventfd to an address: if we
find one, we simply signal the eventfd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Davide Libenzi <davidel@xmailserver.org>
lguest never checked for pending interrupts when enabling interrupts, and
things still worked. However, it makes a significant difference to TCP
performance, so it's time we fixed it by introducing a pending_irq flag
and checking it on irq_restore and irq_enable.
These two routines are now too big to patch into the 8/10 bytes
patch space, so we drop that code.
Note: The high latency on interrupt delivery had a very curious
effect: once everything else was optimized, networking without GSO was
faster than networking with GSO, since more interrupts were sent and
hence a greater chance of one getting through to the Guest!
Note2: (Almost) Closing the same loophole for iret doesn't have any
measurable effect, so I'm leaving that patch for the moment.
Before:
1GB tcpblast Guest->Host: 30.7 seconds
1GB tcpblast Guest->Host (no GSO): 76.0 seconds
After:
1GB tcpblast Guest->Host: 6.8 seconds
1GB tcpblast Guest->Host (no GSO): 27.8 seconds
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Add a new feature flag for indirect ring entries. These are ring
entries which point to a table of buffer descriptors.
The idea here is to increase the ring capacity by allowing a larger
effective ring size whereby the ring size dictates the number of
requests that may be outstanding, rather than the size of those
requests.
This should be most effective in the case of block I/O where we can
potentially benefit by concurrently dispatching a large number of
large requests. Even in the simple case of single segment block
requests, this results in a threefold increase in ring capacity.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Drivers don't add transport features to their table, so we
shouldn't check these with virtio_check_driver_offered_feature().
We could perhaps add an ->offered_feature() virtio_config_op,
but that perhaps that would be overkill for a consitency check
like this.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This implements optional MSI-X support in virtio_pci.
MSI-X is used whenever the host supports at least 2 MSI-X
vectors: 1 for configuration changes and 1 for virtqueues.
Per-virtqueue vectors are allocated if enough vectors
available.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ whitespace, style)
This replaces find_vq/del_vq with find_vqs/del_vqs virtio operations,
and updates all drivers. This is needed for MSI support, because MSI
needs to know the total number of vectors upfront.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ lguest/9p compile fixes)
Add a linked list of all virtqueues for a virtio device: this helps for
debugging and is also needed for upcoming interface change.
Also, add a "name" field for clearer debug messages.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Provide for means of extending the perf_counter_attr in a 'natural' way.
We allow growing the structure by appending fields at the end by specifying
the full structure size inside it.
When a new kernel sees a smaller (old) structure, it will 0 pad the tail.
When an old kernel sees a larger (new) structure, it will verify the tail
consists of 0s, otherwise fail.
If we fail due to a size-mismatch, we return -E2BIG and write the kernel's
native attribe size back into the provided structure.
Furthermore, add some attribute verification, so that we'll fail counter
creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in
the __reserved fields).
(This ABI detail is introduced while keeping the existing syscall ABI.)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
is_software_counter() was missing the new HW_CACHE category.
( This could have caused some counter scheduling artifacts
with mixed sw and hw counters and counter groups. )
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It's theoretically possible that there are exception table entries
which point into the (freed) init text of modules. These could cause
future problems if other modules get loaded into that memory and cause
an exception as we'd see the wrong fixup. The only case I know of is
kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n).
Amerigo fixed this long-standing FIXME in the x86 version, but this
patch is more general.
This implements trim_init_extable(); most archs are simple since they
use the standard lib/extable.c sort code. Alpha and IA64 use relative
addresses in their fixups, so thier trimming is a slight variation.
Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE,
yet it defines its own sort_extable() which overrides the one in lib.
It doesn't sort, so we have to mark deleted entries instead of
actually trimming them.
Inspired-by: Amerigo Wang <amwang@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-alpha@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Impact: API cleanup
For historical reasons, 'bool' parameters must be an int, not a bool.
But there are around 600 users, so a conversion seems like useless churn.
So we use __same_type() to distinguish, and handle both cases.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: new API
__builtin_types_compatible_p() is a little awkward to use: it takes two
types rather than types or variables, and it's just damn long.
(typeof(type) == type, so this works on types as well as vars).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: cleanup
Rather than hack KPARAM_KMALLOCED into the perm field, separate it out.
Since the perm field was 32 bits and only needs 16, we don't add bloat.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It takes an 'int' for historical reasons, and there are only two
users: simply switch it over to bool.
The other user (uvesafb.c) will get a (harmless-on-x86) warning until
the next patch is applied.
Cc: Brad Douglas <brad@neruo.com>
Cc: Michal Januszewski <spock@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now, SLAB is configured in very early stage and it can be used in
init routine now.
But replacing alloc_bootmem() in FLAT/DISCONTIGMEM's page_cgroup()
initialization breaks the allocation, now.
(Works well in SPARSEMEM case...it supports MEMORY_HOTPLUG and
size of page_cgroup is in reasonable size (< 1 << MAX_ORDER.)
This patch revive FLATMEM+memory cgroup by using alloc_bootmem.
In future,
We stop to support FLATMEM (if no users) or rewrite codes for flatmem
completely.But this will adds more messy codes and overheads.
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
this is a TTM preparation patch, it rearranges the mm and
add operations needed to do mm operations in atomic context.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Previously we would check size instead of size * nmemb, and so would
never hit the vmalloc path. Also add integer overflow check as in kcalloc,
and allocate GFP_ZERO pages instead of memset()ing them.
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
fs-internal parts of qnx4_fs.h taken to fs/qnx4/qnx4.h, includes adjusted,
qnx4_fs.h doesn't need unifdef anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* have directory operations use mark_buffer_dirty_inode(),
so that sync_mapping_buffers() would get those.
* make qnx4_write_inode() honour its last argument.
* get rid of insane copies of very ancient "walk the indirect blocks"
in qnx4/fsync - they never matched the actual fs layout and, fortunately,
never'd been called. Again, all this junk is not needed; ->fsync()
should just do sync_mapping_buffers + sync_inode (and if we implement
block allocation for qnx4, we'll need to use mark_buffer_dirty_inode()
for extent blocks)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
writes associated buffers, then does sync_inode() to write
the inode itself (and to make it clean). Depends on
->write_inode() honouring the second argument.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The only user of the i_cindex element in the inode structure is used
is by the firewire drivers. As part of an attempt to slim down the
inode structure to save memory --- since a typical Linux system will
have hundreds of thousands if not millions of inodes cached, a
reduction in the size inode has high leverage.
The firewire driver does not need i_cindex in any fast path, so it's
simple enough to calculate when it is needed, instead of wasting space
in the inode structure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: krh@redhat.com
Cc: stefanr@s5r6.in-berlin.de
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
d_unlinked() will be used in middle-term to ban checkpointing when opened
but unlinked file is detected, and in long term, to detect such situation
and special case on it.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Introduce this function which just writes all the quota structures but
avoids all the syncing and cache pruning work to expose quota structures
to userspace. Use this function from __sync_filesystem when wait == 0.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Currently the VFS calls vfs_dq_sync to sync out disk quotas for a given
superblock. This is a small wrapper around sync_dquots which for the
case of a non-NULL superblock is a small wrapper around quota_sync_sb.
Just make quota_sync_sb global (rename it to sync_quota_sb) and call it
directly. Also call it directly for those cases in quota.c that have a
superblock and leave sync_dquots purely an iterator over sync_quota_sb and
remove it's superblock argument.
To make this nicer move the check for the lack of a quota_sync method
from the callers into sync_quota_sb.
[folded build fix from Alexander Beregalov <a.beregalov@gmail.com>]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Rename the function so that it better describe what it really does. Also
remove the unnecessary include of buffer_head.h.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Move sync_filesystems(), __fsync_super(), fsync_super() from
super.c to sync.c where it fits better.
[build fixes folded]
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
It is unnecessarily fragile to have two places (fsync_super() and do_sync())
doing data integrity sync of the filesystem. Alter __fsync_super() to
accommodate needs of both callers and use it. So after this patch
__fsync_super() is the only place where we gather all the calls needed to
properly send all data on a filesystem to disk.
Nice bonus is that we get a complete livelock avoidance and write_supers()
is now only used for periodic writeback of superblocks.
sync_blockdevs() introduced a couple of patches ago is gone now.
[build fixes folded]
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
__fsync_super() does the same thing as fsync_super(). So change the only
caller to use fsync_super() and make __fsync_super() static. This removes
unnecessarily duplicated call to sync_blockdev() and prepares ground
for the changes to __fsync_super() in the following patches.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Remove the unused s_async_list in the superblock, a leftover of the
broken async inode deletion code that leaked into mainline. Having this
in the middle of the sync/unmount path is not helpful for the following
cleanups.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch speeds up lmbench lat_mmap test by about another 2% after the
first patch.
Before:
avg = 462.286
std = 5.46106
After:
avg = 453.12
std = 9.58257
(50 runs of each, stddev gives a reasonable confidence)
It does this by introducing mnt_clone_write, which avoids some heavyweight
operations of mnt_want_write if called on a vfsmount which we know already
has a write count; and mnt_want_write_file, which can call mnt_clone_write
if the file is open for write.
After these two patches, mnt_want_write and mnt_drop_write go from 7% on
the profile down to 1.3% (including mnt_clone_write).
[AV: mnt_want_write_file() should take file alone and derive mnt from it;
not only all callers have that form, but that's the only mnt about which
we know that it's already held for write if file is opened for write]
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch speeds up lmbench lat_mmap test by about 8%. lat_mmap is set up
basically to mmap a 64MB file on tmpfs, fault in its pages, then unmap it.
A microbenchmark yes, but it exercises some important paths in the mm.
Before:
avg = 501.9
std = 14.7773
After:
avg = 462.286
std = 5.46106
(50 runs of each, stddev gives a reasonable confidence, but there is quite
a bit of variation there still)
It does this by removing the complex per-cpu locking and counter-cache and
replaces it with a percpu counter in struct vfsmount. This makes the code
much simpler, and avoids spinlocks (although the msync is still pretty
costly, unfortunately). It results in about 900 bytes smaller code too. It
does increase the size of a vfsmount, however.
It should also give a speedup on large systems if CPUs are frequently operating
on different mounts (because the existing scheme has to operate on an atomic in
the struct vfsmount when switching between mounts). But I'm most interested in
the single threaded path performance for the moment.
[AV: minor cleanup]
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
New field: nd->root. When pathname resolution wants to know the root,
check if nd->root.mnt is non-NULL; use nd->root if it is, otherwise
copy current->fs->root there. After path_walk() is finished, we check
if we'd got a cached value in nd->root and drop it. Before calling
path_walk() we should either set nd->root.mnt to NULL *or* copy (and
pin down) some path to nd->root. In the latter case we won't be
looking at current->fs->root at all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch adds an -oexpose_privroot option to allow access to the privroot.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
fsnotify: allow groups to set freeing_mark to null
inotify/dnotify: should_send_event shouldn't match on FS_EVENT_ON_CHILD
dnotify: do not bother to lock entry->lock when reading mask
dnotify: do not use ?true:false when assigning to a bool
fsnotify: move events should indicate the event was on a child
inotify: reimplement inotify using fsnotify
fsnotify: handle filesystem unmounts with fsnotify marks
fsnotify: fsnotify marks on inodes pin them in core
fsnotify: allow groups to add private data to events
fsnotify: add correlations between events
fsnotify: include pathnames with entries when possible
fsnotify: generic notification queue and waitq
dnotify: reimplement dnotify using fsnotify
fsnotify: parent event notification
fsnotify: add marks to inodes so groups can interpret how to handle those inodes
fsnotify: unified filesystem notification backend
* 'for-linus' of git://linux-arm.org/linux-2.6:
kmemleak: Add the corresponding MAINTAINERS entry
kmemleak: Simple testing module for kmemleak
kmemleak: Enable the building of the memory leak detector
kmemleak: Remove some of the kmemleak false positives
kmemleak: Add modules support
kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash
kmemleak: Add the vmalloc memory allocation/freeing hooks
kmemleak: Add the slub memory allocation/freeing hooks
kmemleak: Add the slob memory allocation/freeing hooks
kmemleak: Add the slab memory allocation/freeing hooks
kmemleak: Add documentation on the memory leak detector
kmemleak: Add the base support
Manual conflict resolution (with the slab/earlyboot changes) in:
drivers/char/vt.c
init/main.c
mm/slab.c
* 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (574 commits)
perf_counter: Turn off by default
perf_counter: Add counter->id to the throttle event
perf_counter: Better align code
perf_counter: Rename L2 to LL cache
perf_counter: Standardize event names
perf_counter: Rename enums
perf_counter tools: Clean up u64 usage
perf_counter: Rename perf_counter_limit sysctl
perf_counter: More paranoia settings
perf_counter: powerpc: Implement generalized cache events for POWER processors
perf_counters: powerpc: Add support for POWER7 processors
perf_counter: Accurate period data
perf_counter: Introduce struct for sample data
perf_counter tools: Normalize data using per sample period data
perf_counter: Annotate exit ctx recursion
perf_counter tools: Propagate signals properly
perf_counter tools: Small frequency related fixes
perf_counter: More aggressive frequency adjustment
perf_counter/x86: Fix the model number of Intel Core2 processors
perf_counter, x86: Correct some event and umask values for Intel processors
...
* 'topic/slab/earlyboot' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
vgacon: use slab allocator instead of the bootmem allocator
irq: use kcalloc() instead of the bootmem allocator
sched: use slab in cpupri_init()
sched: use alloc_cpumask_var() instead of alloc_bootmem_cpumask_var()
memcg: don't use bootmem allocator in setup code
irq/cpumask: make memoryless node zero happy
x86: remove some alloc_bootmem_cpumask_var calling
vt: use kzalloc() instead of the bootmem allocator
sched: use kzalloc() instead of the bootmem allocator
init: introduce mm_init()
vmalloc: use kzalloc() instead of alloc_bootmem()
slab: setup allocators earlier in the boot sequence
bootmem: fix slab fallback on numa
bootmem: use slab if bootmem is no longer available
Adds support for PCI Express transaction layer end-to-end CRC checking
(ECRC). This patch will enable/disable ECRC checking by setting/clearing
the ECRC Check Enable and/or ECRC Generation Enable bits for devices that
support ECRC.
The ECRC setting is controlled by the "pci=ecrc=<policy>" command-line
option. If this option is not set or is set to 'bios", the enable and
generation bits are left in whatever state that firmware/BIOS set them to.
The "off" setting turns them off, and the "on" option turns them on (if the
device supports it).
Turning ECRC on or off can be a data integrity versus performance
tradeoff. In theory, turning it on will catch more data errors, turning
it off means possibly better performance since CRC does not need to be
calculated by the PCIe hardware and packet sizes are reduced.
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The last in-tree caller of pci_find_slot has been converted, so
let's get rid of this deprecated interface.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We should not assign 64bit ranges to PCI devices that only take 32bit
prefetchable addresses.
Try to set IORESOURCE_MEM_64 in 64bit resource of pci_device/pci_bridge
and make the bus resource only have that bit set when all devices under
it support 64bit prefetchable memory. Use that flag to allocate
resources from that range.
Reported-by: Yannick <yannick.roehlly@free.fr>
Reviewed-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Impact: cleanup, spec compliance
This patch does:
- Remove unused msi/msix_enable/disable macros.
User should use msi/msix_set_enable() functions instead.
- Remove unused msix_mask/unmask/pending macros.
These macros are useless because they are not based on any of
the PCI Local Bus Specifications properly.
It seems that they were written based on a draft of PCI spec,
and that the draft was the MSI-X ECN that underwent membership
review in September 2002.
(* In the draft, the size of a entry in MSI-X table was 64bit,
containing 32bit message data and DWORD aligned lower address
plus a pending bit and a mask bit.(30+1+1bit) The higher
address was placed in MSI-X capability structure and shared
by all entries.)
- Remove PCI_MSIX_FLAGS_BITMASK.
This definition also come from the draft ECN.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add a generic (unoptimized) implementation of checksum.c in pure C
for use by all architectures that cannot be bother with implementing
their own version.
Based on microblaze code by Michal Simek <monstr@monstr.eu>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Based on discussions with Michal Simek and code
from m68knommu and h8300, this version of uaccess.h
should be usable by most architectures, by overriding
some parts of it.
Simple NOMMU architectures can use it out of
the box, but a minimal __access_ok() should be
added there as well.
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Memory management in generic is highly architecture specific,
but on NOMMU architectures, it is mostly trivial, so just
add a default implementation in asm-generic that applies
to all NOMMU architectures.
The two files cache.h and cacheflush.h can possibly also
be used by architectures that have an MMU but never require
flushing the cache or have cache lines larger than 32 bytes.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
atomic.h and io.h are based on the mn10300 architecture,
which is already pretty generic and can be used by
other architectures that do not have hardware support
for atomic operations or out-of-order I/O access.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The dma.h, hw_irq.h, serial.h and timex.h files originally
described PC-style i8237, i8259A, i8250, i8253 and i8255 chips
as well as the VGA style text mode graphics.
Modern architectures live happily without these specific
interfaces, but a few definitions from these headers keep
getting used in common code.
The new generic headers are what most architectures use
anyway nowadays, just implementing the minimal definitions.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
These are all kernel internal interfaces that get copied
around a lot. In most cases, architectures can provide
their own optimized versions, but these generic versions
can work as well.
I have tried to use the most common contents of each
header to allow existing architectures to migrate easily.
Thanks to Remis for suggesting a number of cleanups.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
bitops.h apparently suffered from some level of bitrot, it
was missing the smp_mb__{before,after}_clear_bit functions,
and included other headers in an invalid order.
This changes the file so that new architectures can use
it out of the box.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Some generic code is using the horribly misnamed PCI_DMA_BUS_IS_PHYS
from asm/pci.h. This makes sure that an architecture without PCI
support does not have to define this itself but can rely on the
asm-generic version.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Evidently, set_rtc_time is supposed to be overridable
by architectures that define their own version, but
unfortunately, get_rtc_ss would in that case still
use the generic version.
This makes get_rtc_ss call the real set_rtc_time
to let architectures define their own version.
The change should fix the "Extended RTC operation"
on Alpha, which uses the incorrect get_rtc_ss
call. It also allows PowerPC to use the asm-generic/rtc.h
file in the future.
Cc: Richard Henderson <rth@twiddle.net>
Cc: linux-alpha@vger.kernel.org
Cc: Tom Rini <trini@mvista.com>
Cc: rtc-linux@googlegroups.com
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Paul Gortmaker <p_gortmaker@yahoo.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The current asm-generic/page.h only contains the get_order
function, and asm-generic/uaccess.h only implements
unaligned accesses. This renames the file to getorder.h
and uaccess-unaligned.h to make room for new page.h
and uaccess.h file that will be usable by all simple
(e.g. nommu) architectures.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The existing asm-generic/atomic.h only defines the
atomic_long type. This renames it to atomic-long.h
so we have a place to add a truly generic atomic.h
that can be used on all non-SMP systems.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
A new architecture should only define a minimal set of system
calls while still providing the full functionality. This version
of unistd.h has gone through intensive review to make sure that
by default it only enables syscalls that do not already have
a more featureful replacement.
It is modeled after the x86-64 version of unistd.h, which unifies
the syscall number definition and the actual system call table
in a single file, in order to keep them synchronized much more
easily.
This first version still keeps legacy system call definitions
around, guarded by various #ifdefs, and with numbers larger
than 1024. The idea behind this is to make it easier for
new architectures to transition from a full list to the reduced
set. In particular, the new microblaze architecture that should
migrate to using the generic ABI headers can at least use an
existing uClibc source tree without major rewrites during the
conversion.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
These header files are typically copied from an existing architecture
into any new one, slightly modified and then remain untouched until
the end of time in the name of ABI stability.
To make it easier for future architectures, provide a sane generic
version here. In cases where multiple architectures already use
identical code, I used the most common version. In cases like
stat.h that are more or less broken everywhere, I provide a
version that is meant to be ideal for new architectures.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The ipc64 data structures were originally meant to
be architecture specific so that each architecture
could add their own optimizations for padding.
In the end, most of them just copied the x86 version,
and most got that wrong. UClibc expects the x86 anyway,
so we might just declare that the default and get
rid of the extra copies.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This provides a reliable way for asm-generic/types.h and other
files to find out if it is running on a 32 or 64 bit platform.
We cannot use CONFIG_64BIT for this in headers that are included
from user space because CONFIG symbols are not available there.
We also cannot do it inside of asm/types.h because some headers
need the word size but cannot include types.h.
The solution is to introduce a new header <asm/bitsperlong.h>
that defines both __BITS_PER_LONG for user space and
BITS_PER_LONG for usage in the kernel. The asm-generic
version falls back to 32 bit unless the architecture overrides
it, which I did for all 64 bit platforms.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The existing asm-generic versions are incomplete and included
by some architectures. New architectures should be able
to use a generic version, so rename the existing files and
change all users, which lets us add the new files.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
fsnotify tells its listeners explicitly when an event happened on the given
inode verses on the child of the given inode. (see __fsnotify_parent)
However, the semantics of fsnotify_move() are such that we deliver events
directly to the two parent directories in question (old_dir and new_dir)
directly without using the __fsnotify_parent() call. fsnotify should be
adding FS_EVENT_ON_CHILD for the notifications to these parents.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reimplement inotify_user using fsnotify. This should be feature for feature
exactly the same as the original inotify_user. This does not make any changes
to the in kernel inotify feature used by audit. Those patches (and the eventual
removal of in kernel inotify) will come after the new inotify_user proves to be
working correctly.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
When an fs is unmounted with an fsnotify mark entry attached to one of its
inodes we need to destroy that mark entry and we also (like inotify) send
an unmount event.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
inotify needs per group information attached to events. This patch allows
groups to attach private information and implements a callback so that
information can be freed when an event is being destroyed.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
As part of the standard inotify events it includes a correlation cookie
between two dentry move operations. This patch includes the same behaviour
in fsnotify events. It is needed so that inotify userspace can be
implemented on top of fsnotify.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
When inotify wants to send events to a directory about a child it includes
the name of the original file. This patch collects that filename and makes
it available for notification.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
inotify needs to do asyc notification in which event information is stored
on a queue until the listener is ready to receive it. This patch
implements a generic notification queue for inotify (and later fanotify) to
store events to be sent at a later time.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Reimplement dnotify using fsnotify.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
inotify and dnotify both use a similar parent notification mechanism. We
add a generic parent notification mechanism to fsnotify for both of these
to use. This new machanism also adds the dentry flag optimization which
exists for inotify to dnotify.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
This patch creates a way for fsnotify groups to attach marks to inodes.
These marks have little meaning to the generic fsnotify infrastructure
and thus their meaning should be interpreted by the group that attached
them to the inode's list.
dnotify and inotify will make use of these markings to indicate which
inodes are of interest to their respective groups. But this implementation
has the useful property that in the future other listeners could actually
use the marks for the exact opposite reason, aka to indicate which inodes
it had NO interest in.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
fsnotify is a backend for filesystem notification. fsnotify does
not provide any userspace interface but does provide the basis
needed for other notification schemes such as dnotify. fsnotify
can be extended to be the backend for inotify or the upcoming
fanotify. fsnotify provides a mechanism for "groups" to register for
some set of filesystem events and to then deliver those events to
those groups for processing.
fsnotify has a number of benefits, the first being actually shrinking the size
of an inode. Before fsnotify to support both dnotify and inotify an inode had
unsigned long i_dnotify_mask; /* Directory notify events */
struct dnotify_struct *i_dnotify; /* for directory notifications */
struct list_head inotify_watches; /* watches on this inode */
struct mutex inotify_mutex; /* protects the watches list
But with fsnotify this same functionallity (and more) is done with just
__u32 i_fsnotify_mask; /* all events for this inode */
struct hlist_head i_fsnotify_mark_entries; /* marks on this inode */
That's right, inotify, dnotify, and fanotify all in 64 bits. We used that
much space just in inotify_watches alone, before this patch set.
fsnotify object lifetime and locking is MUCH better than what we have today.
inotify locking is incredibly complex. See 8f7b0ba1c8 as an example of
what's been busted since inception. inotify needs to know internal semantics
of superblock destruction and unmounting to function. The inode pinning and
vfs contortions are horrible.
no fsnotify implementers do allocation under locks. This means things like
f04b30de3 which (due to an overabundance of caution) changes GFP_KERNEL to
GFP_NOFS can be reverted. There are no longer any allocation rules when using
or implementing your own fsnotify listener.
fsnotify paves the way for fanotify. In brief fanotify is a notification
mechanism that delivers the lisener both an 'event' and an open file descriptor
to the object in question. This means that fanotify is pathname agnostic.
Some on lkml may not care for the original companies or users that pushed for
TALPA, but fanotify was designed with flexibility and input for other users in
mind. The readahead group expressed interest in fanotify as it could be used
to profile disk access on boot without breaking the audit system. The desktop
search groups have also expressed interest in fanotify as it solves a number
of the race conditions and problems present with managing inotify when more
than a limited number of specific files are of interest. fanotify can provide
for a userspace access control system which makes it a clean interface for AV
vendors to hook without trying to do binary patching on the syscall table,
LSM, and everywhere else they do their things today. With this patch series
fanotify can be implemented in less than 1200 lines of easy to review code.
Almost all of which is the socket based user interface.
This patch series builds fsnotify to the point that it can implement
dnotify and inotify_user. Patches exist and will be sent soon after
acceptance to finish the in kernel inotify conversion (audit) and implement
fanotify.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
The IBS implemention writes 64 bit register values to the cpu buffer
by writing two 32 values using oprofile_add_data(). This patch
introduces oprofile_add_data64() to write a single 64 bit value to the
buffer.
Signed-off-by: Robert Richter <robert.richter@amd.com>
* 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits)
block: add request clone interface (v2)
floppy: fix hibernation
ramdisk: remove long-deprecated "ramdisk=" boot-time parameter
fs/bio.c: add missing __user annotation
block: prevent possible io_context->refcount overflow
Add serial number support for virtio_blk, V4a
block: Add missing bounce_pfn stacking and fix comments
Revert "block: Fix bounce limit setting in DM"
cciss: decode unit attention in SCSI error handling code
cciss: Remove no longer needed sendcmd reject processing code
cciss: change SCSI error handling routines to work with interrupts enabled.
cciss: separate error processing and command retrying code in sendcmd_withirq_core()
cciss: factor out fix target status processing code from sendcmd functions
cciss: simplify interface of sendcmd() and sendcmd_withirq()
cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code
cciss: Use schedule_timeout_uninterruptible in SCSI error handling code
block: needs to set the residual length of a bidi request
Revert "block: implement blkdev_readpages"
block: Fix bounce limit setting in DM
Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt
...
Manually fix conflicts with tracing updates in:
block/blk-sysfs.c
drivers/ide/ide-atapi.c
drivers/ide/ide-cd.c
drivers/ide/ide-floppy.c
drivers/ide/ide-tape.c
include/trace/events/block.h
kernel/trace/blktrace.c
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits)
KVM: Prevent overflow in largepages calculation
KVM: Disable large pages on misaligned memory slots
KVM: Add VT-x machine check support
KVM: VMX: Rename rmode.active to rmode.vm86_active
KVM: Move "exit due to NMI" handling into vmx_complete_interrupts()
KVM: Disable CR8 intercept if tpr patching is active
KVM: Do not migrate pending software interrupts.
KVM: inject NMI after IRET from a previous NMI, not before.
KVM: Always request IRQ/NMI window if an interrupt is pending
KVM: Do not re-execute INTn instruction.
KVM: skip_emulated_instruction() decode instruction if size is not known
KVM: Remove irq_pending bitmap
KVM: Do not allow interrupt injection from userspace if there is a pending event.
KVM: Unprotect a page if #PF happens during NMI injection.
KVM: s390: Verify memory in kvm run
KVM: s390: Sanity check on validity intercept
KVM: s390: Unlink vcpu on destroy - v2
KVM: s390: optimize float int lock: spin_lock_bh --> spin_lock
KVM: s390: use hrtimer for clock wakeup from idle - v2
KVM: s390: Fix memory slot versus run - v3
...
* 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (28 commits)
ide-tape: fix debug call
alim15x3: Remove historical hacks, re-enable init_hwif for PowerPC
ide-dma: don't reset request fields on dma_timeout_retry()
ide: drop rq->data handling from ide_map_sg()
ide-atapi: kill unused fields and callbacks
ide-tape: simplify read/write functions
ide-tape: use byte size instead of sectors on rw issue functions
ide-tape: unify r/w init paths
ide-tape: kill idetape_bh
ide-tape: use standard data transfer mechanism
ide-tape: use single continuous buffer
ide-atapi,tape,floppy: allow ->pc_callback() to change rq->data_len
ide-tape,floppy: fix failed command completion after request sense
ide-pm: don't abuse rq->data
ide-cd,atapi: use bio for internal commands
ide-atapi: convert ide-{floppy,tape} to using preallocated sense buffer
ide-cd: convert to using generic sense request
ide: add helpers for preparing sense requests
ide-cd: don't abuse rq->buffer
ide-atapi: don't abuse rq->buffer
...
Now that we set up the slab allocator earlier, we can get rid of some
alloc_bootmem_cpumask_var() calls in boot code.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
There are allocations for which the main pointer cannot be found but
they are not memory leaks. This patch fixes some of them. For more
information on false positives, see Documentation/kmemleak.txt.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds the callbacks to kmemleak_(alloc|free) functions from
the slab allocator. The patch also adds the SLAB_NOLEAKTRACE flag to
avoid recursive calls to kmemleak when it allocates its own data
structures.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
This patch adds the base support for the kernel memory leak
detector. It traces the memory allocation/freeing in a way similar to
the Boehm's conservative garbage collector, the difference being that
the unreferenced objects are not freed but only shown in
/sys/kernel/debug/kmemleak. Enabling this feature introduces an
overhead to memory allocations.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* serial-from-alan: (79 commits)
moxa: prevent opening unavailable ports
imx: serial: use tty_encode_baud_rate to set true rate
imx: serial: add IrDA support to serial driver
imx: serial: use rational library function
lib: isolate rational fractions helper function
imx: serial: handle initialisation failure correctly
imx: serial: be sure to stop xmit upon shutdown
imx: serial: notify higher layers in case xmit IRQ was not called
imx: serial: fix one bit field type
imx: serial: fix whitespaces (no changes in functionality)
tty: use prepare/finish_wait
tty: remove sleep_on
sierra: driver interface blacklisting
sierra: driver urb handling improvements
tty: resolve some sierra breakage
timbuart: Fix the termios logic
serial: Added Timberdale UART driver
tty: Add URL for ttydev queue
devpts: unregister the file system on error
tty: Untangle termios and mm mutex dependencies
...
So as to be able to distuinguish between multiple counters.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The top (fastest) and last level (biggest) caches are the most
interesting ones, performance wise.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
[ Fixed the Nehalem LL table to LLC Reference/Miss events ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rename the perf enums to be in the 'perf_' namespace and strictly
enumerate the ABI bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide a helper function to determine optimum numerator
denominator value pairs taking into account restricted
register size. Useful especially with PLL and other clock
configurations.
Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Driver for the UART found in the Timberdale FPGA
Signed-off-by: Richard Röjfors <richard.rojfors.ext@mocean-labs.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds support for the TI AR7 internal UART.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are several pretty much unfixable races in the old ldisc code, especially
with respect to pty behaviour and also to hangup. It's easier to rewrite the
code than simply try and patch it up.
This patch
- splits the ldisc from the tty (so we will be able to refcount it more cleanly
later)
- introduces a mutex lock for ldisc changing on an active device
- fixes the complete mess that hangup caused
- implements hopefully correct setldisc/close/hangup locking
There are still some problems around pty pairs that have always been there but
at least it is now possible to understand the code and fix further problems.
This fixes the following known bugs
- hang up can leak ldisc references
- hang up may not call open/close on ldisc in a matched way
- pty/tty pairs can deadlock during an ldisc change
- reading the ldisc proc files can cause every ldisc to be loaded
and probably a few other of the mysterious ldisc race reports.
I'm sure it also adds the odd new one.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before trying to tackle the ldisc bugs the code needs to be a good deal
more readable, so do the simple extractions of routines first.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The tty throttling code can race due to the lock drops. It takes very high
loads but this has been observed and verified by Rob Duncan.
The basic problem is that on an SMP box we can go
CPU #1 CPU #2
need to throttle ?
suppose we should buffer space cleared
are we throttled
yes ? - unthrottle
call throttle method
This changeet take the termios lock to protect against this. The termios
lock isn't the initial obvious candidate but many implementations of throttle
methods already need to poke around their own termios structures (and nobody
really locks them against a racing change of flow control).
This does mean that anyone who is setting tty->low_latency = 1 and then
calling tty_flip_buffer_push from their unthrottle method is going to end up
collapsing in a pile of locks. However we've removed all the known bogus
users of low_latency = 1 and such use isn't safe anyway for other reasons so
catching it would be an improvement.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds support for the following serial controller chip:
Oxford Semiconductor OXCB950 for PCI Cardbus interface
http://www.transdimension.com/products/serial/OXCB950.html
on this card:
ExSys EX-1370 1 port high-speed serial card for ExpressCard/34 slot
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Define ASYNCB_* flags which are bit numbers of the ASYNC_* flags.
This is useful for {test,set,clear}_bit.
Also convert each ASYNC_% to be (1 << ASYNCB_%) and define masks
with the macros, not constants.
Tested with:
#include "PATH_TO_KERNEL/include/linux/serial.h"
static struct {
unsigned int new, old;
} as[] = {
{ ASYNC_HUP_NOTIFY, 0x0001 },
{ ASYNC_FOURPORT, 0x0002 },
...
{ ASYNC_BOOT_ONLYMCA, 0x00400000 },
{ ASYNC_INTERNAL_FLAGS, 0xFFC00000 }
};
...
for (a = 0; a < ARRAY_SIZE(as); a++)
if (as[a].old != as[a].new)
printf("%.8x != %.8x\n", as[a].old, as[a].new);
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Store HW version locally to not read it all the time in interrupts
and alike.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove ugly all-over-the-code casts of ctl_addr to 9060 space.
Add an union to the cyclades_card structure, which contains
a pointer to both 9050 and 9060 spaces.
The 9050 space layout is unknown, so let it still as a void
__iomem pointer.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This allows us to clean stuff up, but is probably also going to cause
some app breakage with buggy apps as we now implement proper POSIX behaviour
for USB ports matching all the other ports. This does also mean other apps
that break on USB will now work properly.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need this for devices that cannot flush and wait, but which do not order
data and modem events. Without it we will hang up before all the data
clears the hardware. Needed for the USB changes.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some drivers implement this internally, others miss it out. Push the
behaviour into the core code as that way everyone will do it consistently.
Update the dtr rts method to raise or lower depending upon flags. Having a
single method in this style fits most of the implementations more cleanly than
two funtions.
We need this in place before we tackle the USB side
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename perf_counter_limit to perf_counter_max_sample_rate and
prohibit creation of counters with a known higher sample
frequency.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rename the perf_counter_priv knob to perf_counter_paranoia (because
priv can be read as private, as opposed to privileged) and provide
one more level:
0 - permissive
1 - restrict cpu counters to privilidged contexts
2 - restrict kernel-mode code counting and profiling
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the following 2 interfaces for request-stacking drivers:
- blk_rq_prep_clone(struct request *clone, struct request *orig,
struct bio_set *bs, gfp_t gfp_mask,
int (*bio_ctr)(struct bio *, struct bio*, void *),
void *data)
* Clones bios in the original request to the clone request
(bio_ctr is called for each cloned bios.)
* Copies attributes of the original request to the clone request.
The actual data parts (e.g. ->cmd, ->buffer, ->sense) are not
copied.
- blk_rq_unprep_clone(struct request *clone)
* Frees cloned bios from the clone request.
Request stacking drivers (e.g. request-based dm) need to make a clone
request for a submitted request and dispatch it to other devices.
To allocate request for the clone, request stacking drivers may not
be able to use blk_get_request() because the allocation may be done
in an irq-disabled context.
So blk_rq_prep_clone() takes a request allocated by the caller
as an argument.
For each clone bio in the clone request, request stacking drivers
should be able to set up their own completion handler.
So blk_rq_prep_clone() takes a callback function which is called
for each clone bio, and a pointer for private data which is passed
to the callback.
NOTE:
blk_rq_prep_clone() doesn't copy any actual data of the original
request. Pages are shared between original bios and cloned bios.
So caller must not complete the original request before the clone
request.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The constant is being use as an alignment factor, not as a padding
factor; made reading/reviewing the code quite confusing.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
One of the problem with sock memory accounting is it uses
a pair of sock_hold()/sock_put() for each transmitted packet.
This slows down bidirectional flows because the receive path
also needs to take a refcount on socket and might use a different
cpu than transmit path or transmit completion path. So these
two atomic operations also trigger cache line bounces.
We can see this in tx or tx/rx workloads (media gateways for example),
where sock_wfree() can be in top five functions in profiles.
We use this sock_hold()/sock_put() so that sock freeing
is delayed until all tx packets are completed.
As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
by one unit at init time, until sk_free() is called.
Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
to decrement initial offset and atomicaly check if any packets
are in flight.
skb_set_owner_w() doesnt call sock_hold() anymore
sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
reached 0 to perform the final freeing.
Drawback is that a skb->truesize error could lead to unfreeable sockets, or
even worse, prematurely calling __sk_free() on a live socket.
Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
contention point. 5 % speedup on a UDP transmit workload (depends
on number of flows), lowering TX completion cpu usage.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is available in a standard MDIO register in 10GBASE-T PHYs.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas, Andrew and Ingo pointed out that we don't have any safety checks
in the clocksource sysfs entries to make sure sysadmins don't try to
change the clocksource to a non high-res timer capable clocksource (such
as jiffies) when high-res timers (HRT) is enabled. Doing so will likely
hang a system.
Correct this by filtering non HRT clocksources from available_clocksources
and not accepting non HRT clocksources with HRT enabled.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Now all the DRM debug info will be reported if the boot option of
"drm.debug=1" is added. Sometimes it is inconvenient to get the debug
info in KMS mode. We will get too much unrelated info.
This will separate several DRM debug levels and the debug level can be used
to print the different debug info. And the debug level is controlled by the
module parameter of drm.debug
In this patch it is divided into four debug levels;
drm_core, drm_driver, drm_kms, drm_mode.
At the same time we can get the different debug info by changing the debug
level. This can be done by adding the module parameter. Of course it can
be changed through the /sys/module/drm/parameters/debug after the system is
booted.
Four debug macro definitions are provided.
DRM_DEBUG(fmt, args...)
DRM_DEBUG_DRIVER(prefix, fmt, args...)
DRM_DEBUG_KMS(prefix, fmt, args...)
DRM_DEBUG_MODE(prefix, fmt, args...)
When the boot option of "drm.debug=4" is added, it will print the debug info
using DRM_DEBUG_KMS macro definition.
When the boot option of "drm.debug=6" is added, it will print the debug info
using DRM_DEBUG_KMS/DRM_DEBUG_DRIVER.
Sometimes we expect to print the value of an array.
For example: SDVO command,
In such case the following four DRM debug macro definitions are added:
DRM_LOG(fmt, args...)
DRM_LOG_DRIVER(fmt, args...)
DRM_LOG_KMS(fmt, args...)
DRM_LOG_MODE(fmt, args...)
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When this macro isn't called with 'file_priv' this will result in a build
failure.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits)
Revert "x86, bts: reenable ptrace branch trace support"
tracing: do not translate event helper macros in print format
ftrace/documentation: fix typo in function grapher name
tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK
tracing: add protection around module events unload
tracing: add trace_seq_vprint interface
tracing: fix the block trace points print size
tracing/events: convert block trace points to TRACE_EVENT()
ring-buffer: fix ret in rb_add_time_stamp
ring-buffer: pass in lockdep class key for reader_lock
tracing: add annotation to what type of stack trace is recorded
tracing: fix multiple use of __print_flags and __print_symbolic
tracing/events: fix output format of user stack
tracing/events: fix output format of kernel stack
tracing/trace_stack: fix the number of entries in the header
ring-buffer: discard timestamps that are at the start of the buffer
ring-buffer: try to discard unneeded timestamps
ring-buffer: fix bug in ring_buffer_discard_commit
ftrace: do not profile functions when disabled
tracing: make trace pipe recognize latency format flag
...
* 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: rcu_sched_grace_period(): kill the bogus flush_signals()
rculist: use list_entry_rcu in places where it's appropriate
rculist.h: introduce list_entry_rcu() and list_first_entry_rcu()
rcu: Update RCU tracing documentation for __rcu_pending
rcu: Add __rcu_pending tracing to hierarchical RCU
RCU: make treercu be default
We currently log hw.sample_period for PERF_SAMPLE_PERIOD, however this is
incorrect. When we adjust the period, it will only take effect the next
cycle but report it for the current cycle. So when we adjust the period
for every cycle, we're always wrong.
Solve this by keeping track of the last_period.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For easy extension of the sample data, put it in a structure.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (61 commits)
amd-iommu: remove unnecessary "AMD IOMMU: " prefix
amd-iommu: detach device explicitly before attaching it to a new domain
amd-iommu: remove BUS_NOTIFY_BOUND_DRIVER handling
dma-debug: simplify logic in driver_filter()
dma-debug: disable/enable irqs only once in device_dma_allocations
dma-debug: use pr_* instead of printk(KERN_* ...)
dma-debug: code style fixes
dma-debug: comment style fixes
dma-debug: change hash_bucket_find from first-fit to best-fit
x86: enable GART-IOMMU only after setting up protection methods
amd_iommu: fix lock imbalance
dma-debug: add documentation for the driver filter
dma-debug: add dma_debug_driver kernel command line
dma-debug: add debugfs file for driver filter
dma-debug: add variables and checks for driver filter
dma-debug: fix debug_dma_sync_sg_for_cpu and debug_dma_sync_sg_for_device
dma-debug: use sg_dma_len accessor
dma-debug: use sg_dma_address accessor instead of using dma_address directly
amd-iommu: don't free dma adresses below 512MB with CONFIG_IOMMU_STRESS
amd-iommu: don't preallocate page tables with CONFIG_IOMMU_STRESS
...
* 'futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: fix restart in wait_requeue_pi
futex: fix restart for early wakeup in futex_wait_requeue_pi()
futex: cleanup error exit
futex: remove the wait queue
futex: add requeue-pi documentation
futex: remove FUTEX_REQUEUE_PI (non CMP)
futex: fix futex_wait_setup key handling
sparc64: extend TI_RESTART_BLOCK space by 8 bytes
futex: fixup unlocked requeue pi case
futex: add requeue_pi functionality
futex: split out futex value validation code
futex: distangle futex_requeue()
futex: add FUTEX_HAS_TIMEOUT flag to restart.futex.flags
rt_mutex: add proxy lock routines
futex: split out fixup owner logic from futex_lock_pi()
futex: split out atomic logic from futex_lock_pi()
futex: add helper to find the top prio waiter of a futex
futex: separate futex_wait_queue_me() logic from futex_wait()
* 'x86-xen-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (42 commits)
xen: cache cr0 value to avoid trap'n'emulate for read_cr0
xen/x86-64: clean up warnings about IST-using traps
xen/x86-64: fix breakpoints and hardware watchpoints
xen: reserve Xen start_info rather than e820 reserving
xen: add FIX_TEXT_POKE to fixmap
lguest: update lazy mmu changes to match lguest's use of kvm hypercalls
xen: honour VCPU availability on boot
xen: add "capabilities" file
xen: drop kexec bits from /sys/hypervisor since kexec isn't implemented yet
xen/sys/hypervisor: change writable_pt to features
xen: add /sys/hypervisor support
xen/xenbus: export xenbus_dev_changed
xen: use device model for suspending xenbus devices
xen: remove suspend_cancel hook
xen/dev-evtchn: clean up locking in evtchn
xen: export ioctl headers to userspace
xen: add /dev/xen/evtchn driver
xen: add irq_from_evtchn
xen: clean up gate trap/interrupt constants
xen: set _PAGE_NX in __supported_pte_mask before pagetable construction
...
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
x86: fix system without memory on node0
x86, mm: Fix node_possible_map logic
mm, x86: remove MEMORY_HOTPLUG_RESERVE related code
x86: make sparse mem work in non-NUMA mode
x86: process.c, remove useless headers
x86: merge process.c a bit
x86: use sparse_memory_present_with_active_regions() on UMA
x86: unify 64-bit UMA and NUMA paging_init()
x86: Allow 1MB of slack between the e820 map and SRAT, not 4GB
x86: Sanity check the e820 against the SRAT table using e820 map only
x86: clean up and and print out initial max_pfn_mapped
x86/pci: remove rounding quirk from e820_setup_gap()
x86, e820, pci: reserve extra free space near end of RAM
x86: fix typo in address space documentation
x86: 46 bit physical address support on 64 bits
x86, mm: fault.c, use printk_once() in is_errata93()
x86: move per-cpu mmu_gathers to mm/init.c
x86: move max_pfn_mapped and max_low_pfn_mapped to setup.c
x86: unify noexec handling
x86: remove (null) in /sys kernel_page_tables
...
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: fix typo in sched-rt-group.txt file
ftrace: fix typo about map of kernel priority in ftrace.txt file.
sched: properly define the sched_group::cpumask and sched_domain::span fields
sched, timers: cleanup avenrun users
sched, timers: move calc_load() to scheduler
sched: Don't export sched_mc_power_savings on multi-socket single core system
sched: emit thread info flags with stack trace
sched: rt: document the risk of small values in the bandwidth settings
sched: Replace first_cpu() with cpumask_first() in ILB nomination code
sched: remove extra call overhead for schedule()
sched: use group_first_cpu() instead of cpumask_first(sched_group_cpus())
wait: don't use __wake_up_common()
sched: Nominate a power-efficient ilb in select_nohz_balancer()
sched: Nominate idle load balancer from a semi-idle package.
sched: remove redundant hierarchy walk in check_preempt_wakeup
* 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (76 commits)
x86, apic: Fix dummy apic read operation together with broken MP handling
x86, apic: Restore irqs on fail paths
x86: Print real IOAPIC version for x86-64
x86: enable_update_mptable should be a macro
sparseirq: Allow early irq_desc allocation
x86, io-apic: Don't mark pin_programmed early
x86, irq: don't call mp_config_acpi_gsi() if update_mptable is not enabled
x86, irq: update_mptable needs pci_routeirq
x86: don't call read_apic_id if !cpu_has_apic
x86, apic: introduce io_apic_irq_attr
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector(), fix
x86: read apic ID in the !acpi_lapic case
x86: apic: Fixmap apic address even if apic disabled
x86: display extended apic registers with print_local_APIC and cpu_debug code
x86: read apic ID in the !acpi_lapic case
x86: clean up and fix setup_clear/force_cpu_cap handling
x86: apic: Check rev 3 fadt correctly for physical_apic bit
x86/pci: update pirq_enable_irq() to setup io apic routing
x86/acpi: move setup io apic routing out of CONFIG_ACPI scope
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector()
...
This adds a driver for the ARM PL022 PrimeCell SSP/SPI
driver found in the U300 platforms as well as in some
ARM reference hardware, and in a modified version on the
Nomadik board.
Reviewed-by: Alessandro Rubini <rubini-list@gnudd.com>
Reviewed-by: Russell King <linux@arm.linux.org.uk>
Reviewed-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Currently io_context has an atomic_t(32-bit) as refcount. In the case of
cfq, for each device against whcih a task does I/O, a reference to the
io_context would be taken. And when there are multiple process sharing
io_contexts(CLONE_IO) would also have a reference to the same io_context.
Theoretically the possible maximum number of processes sharing the same
io_context + the number of disks/cfq_data referring to the same io_context
can overflow the 32-bit counter on a very high-end machine.
Even though it is an improbable case, let us make it atomic_long_t.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
By moving the macro that creates the print format code above the
defining of the event macro helpers (__get_str, __print_symbolic,
and __get_dynamic_array), we get a little cleaner print format.
Instead of:
(char *)((void *)REC + REC->__data_loc_name)
we get:
__get_str(name)
Instead of:
({ static const struct trace_print_flags symbols[] = { { HI_SOFTIRQ, "HI" }, {
we get:
__print_symbolic(REC->vec, { HI_SOFTIRQ, "HI" }, {
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Once rfkill-input is disabled, the "global" states will only be used as
default initial states.
Since the states will always be the same after resume, we shouldn't
generate events on resume.
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
rfkill_set_global_sw_state() (previously rfkill_set_default()) will no
longer be exported by the rewritten rfkill core.
Instead, platform drivers which can provide persistent soft-rfkill state
across power-down/reboot should indicate their initial state by calling
rfkill_set_sw_state() before registration. Otherwise, they will be
initialized to a default value during registration by a set_block call.
We remove existing calls to rfkill_set_sw_state() which happen before
registration, since these had no effect in the old model. If these
drivers do have persistent state, the calls can be put back (subject
to testing :-). This affects hp-wmi and acer-wmi.
Drivers with persistent state will affect the global state only if
rfkill-input is enabled. This is required, otherwise booting with
wireless soft-blocked and pressing the wireless-toggle key once would
have no apparent effect. This special case will be removed in future
along with rfkill-input, in favour of a more flexible userspace daemon
(see Documentation/feature-removal-schedule.txt).
Now rfkill_global_states[n].def is only used to preserve global states
over EPO, it is renamed to ".sav".
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In order to handle powersave frames properly we had needed
to pass these out to the device queues again, and introduce
the skb->requeue bit. This, however, also has unnecessary
overhead by needing to 'clean up' already tried frames, and
this clean-up code is also buggy when software encryption
is used.
Instead of sending the frames via the master netdev queue
again, simply put them into the pending queue. This also
fixes a problem where frames for that particular station
could be reordered when some were still on the software
queues and older ones are re-injected into the software
queue after them.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This removes the dependency on GPIO framework and lets the SPI host
driver handle the chip select. The SPI host driver is required to keep
the CS active for the entire message unless cs_change says otherwise.
This patch collects the two/three single SPI transfers into a message.
Also the delay in read path in case use_dummy_writes are not used is
moved into the SPI host driver.
Tested-by: Mike Rapoport <mike@compulab.co.il>
Tested-by: Andrey Yurovsky <andrey@cozybit.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Also employ the overflow handler to adjust the frequency, this results
in a stable frequency in about 40~50 samples, instead of that many ticks.
This also means we can start sampling at a sample period of 1 without
running head-first into the throttle.
It relies on sched_clock() to accurately measure the time difference
between the overflow NMIs.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch was inspired by Al Viro, for simplifying and fixing the
retrieval of osd-devices by in-kernel users, eg: file systems.
In-Kernel users, now, go through the same path user-mode does by
opening a file on the osd char-device and though holding a reference
to both the device and the Module.
A file pointer was added to the osd_dev structure which is now
allocated for each user. The internal osd_dev is no longer exposed
outside of the uld. I wanted to do that for a long time so each
libosd user can have his own defaults on the device.
The API is left the same, so user code need not change.
It is no longer needed to open/close a file handle on the osd
char-device from user-mode, before mounting an exofs on it.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
libosd users that need to work with bios, must sometime use
the request_queue associated with the osd_dev. Make a wrapper for
that, and convert all in-tree users.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
For supporting of chained-bios we can not inspect the first
bio only, as before. Caller shall pass the total length of the
request, ie. sum_bytes(bio-chain).
Also since the bio might be a chain we don't set it's direction
on behalf of it's callers. The bio direction should be properly
set prior to this call. So fix a couple of write users that now
need to set the bio direction properly
[In this patch I change both library code and user sites at
exofs, to make it easy on integration. It should be submitted
via James's scsi-misc tree.]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Jeff Garzik <jeff@garzik.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
By popular demand, define usefull wrappers for osd_req_read/write
that recieve kernel pointers. All users had their own.
Also remove these from exofs
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Some New revision 5 Attribute definitions.
Some missing definitions of Attributes and pages
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Add all constant definitions of new OSD commands added in
revision 4 & 5. Mainly for creating snapshots and clones.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Introduce per-conntrack locks and use them instead of the global protocol
locks to avoid contention. Especially tcp_lock shows up very high in
profiles on larger machines.
This will also allow to simplify the upcoming reliable event delivery patches.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Fix building failures when CONFIG_BLOCK == n.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2F1520.8020003@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This was only defined with CONFIG_DEBUG_SPINLOCK set, but some
obscure arch/powerpc code wants it always.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
kvm_assigned_dev_ack_irq is vulnerable to a race condition with the
interrupt handler function. It does:
if (dev->host_irq_disabled) {
enable_irq(dev->host_irq);
dev->host_irq_disabled = false;
}
If an interrupt triggers before the host->dev_irq_disabled assignment,
it will disable the interrupt and set dev->host_irq_disabled to true.
On return to kvm_assigned_dev_ack_irq, dev->host_irq_disabled is set to
false, and the next kvm_assigned_dev_ack_irq call will fail to reenable
it.
Other than that, having the interrupt handler and work handlers run in
parallel sounds like asking for trouble (could not spot any obvious
problem, but better not have to, its fragile).
CC: sheng.yang@intel.com
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
KVM uses a function call IPI to cause the exit of a guest running on a
physical cpu. For virtual interrupt notification there is no need to
wait on IPI receival, or to execute any function.
This is exactly what the reschedule IPI does, without the overhead
of function IPI. So use it instead of smp_call_function_single in
kvm_vcpu_kick.
Also change the "guest_mode" variable to a bit in vcpu->requests, and
use that to collapse multiple IPI's that would be issued between the
first one and zeroing of guest mode.
This allows kvm_vcpu_kick to called with interrupts disabled.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Memory aliases with different memory type is a problem for guest. For the guest
without assigned device, the memory type of guest memory would always been the
same as host(WB); but for the assigned device, some part of memory may be used
as DMA and then set to uncacheable memory type(UC/WC), which would be a conflict of
host memory type then be a potential issue.
Snooping control can guarantee the cache correctness of memory go through the
DMA engine of VT-d.
[avi: fix build on ia64]
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Two things needed fixing: 1) g++ does not allow a named structure type
within an anonymous union and 2) Avoid name clash between two padding
fields within the same struct by giving them different names as is
done elsewhere in the header.
Signed-off-by: Nathan Binkert <nate@binkert.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
kvm_vcpu_block() unhalts vpu on an interrupt/timer without checking
if interrupt window is actually opened.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
After discussion with Marcelo, we decided to rework device assignment framework
together. The old problems are kernel logic is unnecessary complex. So Marcelo
suggest to split it into a more elegant way:
1. Split host IRQ assign and guest IRQ assign. And userspace determine the
combination. Also discard msi2intx parameter, userspace can specific
KVM_DEV_IRQ_HOST_MSI | KVM_DEV_IRQ_GUEST_INTX in assigned_irq->flags to
enable MSI to INTx convertion.
2. Split assign IRQ and deassign IRQ. Import two new ioctls:
KVM_ASSIGN_DEV_IRQ and KVM_DEASSIGN_DEV_IRQ.
This patch also fixed the reversed _IOR vs _IOW in definition(by deprecated the
old interface).
[avi: replace homemade bitcount() by hweight_long()]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Use kvm_apic_match_dest() in kvm_get_intr_delivery_bitmask() instead
of duplicating the same code. Use kvm_get_intr_delivery_bitmask() in
apic_send_ipi() to figure out ipi destination instead of reimplementing
the logic.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
ioapic_deliver() and kvm_set_msi() have code duplication. Move
the code into ioapic_deliver_entry() function and call it from
both places.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Since "KVM: Unify the delivery of IOAPIC and MSI interrupts"
I get the following warnings:
CC [M] arch/s390/kvm/kvm-s390.o
In file included from arch/s390/kvm/kvm-s390.c:22:
include/linux/kvm_host.h:357: warning: 'struct kvm_ioapic' declared inside parameter list
include/linux/kvm_host.h:357: warning: its scope is only this definition or declaration, which is probably not what you want
This patch limits IOAPIC functions for architectures that have one.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch finally enable MSI-X.
What we need for MSI-X:
1. Intercept one page in MMIO region of device. So that we can get guest desired
MSI-X table and set up the real one. Now this have been done by guest, and
transfer to kernel using ioctl KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY.
2. Information for incoming interrupt. Now one device can have more than one
interrupt, and they are all handled by one workqueue structure. So we need to
identify them. The previous patch enable gsi_msg_pending_bitmap get this done.
3. Mapping from host IRQ to guest gsi as well as guest gsi to real MSI/MSI-X
message address/data. We used same entry number for the host and guest here, so
that it's easy to find the correlated guest gsi.
What we lack for now:
1. The PCI spec said nothing can existed with MSI-X table in the same page of
MMIO region, except pending bits. The patch ignore pending bits as the first
step (so they are always 0 - no pending).
2. The PCI spec allowed to change MSI-X table dynamically. That means, the OS
can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it. The patch
didn't support this, and Linux also don't work in this way.
3. The patch didn't implement MSI-X mask all and mask single entry. I would
implement the former in driver/pci/msi.c later. And for single entry, userspace
should have reposibility to handle it.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
We have to handle more than one interrupt with one handler for MSI-X. Avi
suggested to use a flag to indicate the pending. So here is it.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Introduce KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY two ioctls.
This two ioctls are used by userspace to specific guest device MSI-X entry
number and correlate MSI-X entry with GSI during the initialization stage.
MSI-X should be well initialzed before enabling.
Don't support change MSI-X entry number for now.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
* topic/pcm-jiffies-check:
ALSA: pcm - A helper function to compose PCM stream name for debug prints
ALSA: pcm - Fix update of runtime->hw_ptr_interrupt
ALSA: pcm - Fix a typo in hw_ptr update check
ALSA: PCM midlevel: lower jiffies check margin using runtime->delay value
ALSA: PCM midlevel: Do not update hw_ptr_jiffies when hw_ptr is not changed
ALSA: PCM midlevel: introduce mask for xrun_debug() macro
ALSA: PCM midlevel: improve fifo_size handling
* topic/asoc: (135 commits)
ASoC: Apostrophe patrol
ASoC: codec tlv320aic23 fix bogus divide by 0 message
ASoC: fix NULL pointer dereference in soc_suspend()
ASoC: Fix build error in twl4030.c
ASoC: SSM2602: assign last substream to the master when shutting down
ASoC: Blackfin: document how anomaly 05000250 is handled
ASoC: Blackfin: set the transfer size according the ac97_frame size
ASoC: SSM2602: remove unsupported sample rates
ASoC: TWL4030: Check the interface format for 4 channel mode
ASoC: TWL4030: Use reg_cache in twl4030_init_chip
ASoC: Initialise dev for the dummy S/PDIF DAI
ASoC: Add dummy S/PDIF codec support
ASoC: correct print specifiers for unsigneds
ASoC: Modify mpc5200 AC97 driver to use V9 of spin_event_timeout()
ASoC: Switch FSL SSI DAI over to symmetric_rates
ASoC: Mark MPC5200 AC97 as BROKEN until PowerPC merge issues are resolved
ASoC: Fabric bindings for STAC9766 on the Efika
ASoC: Support for AC97 on Phytec pmc030 base board.
ASoC: AC97 driver for mpc5200
ASoC: Main rewite of the mpc5200 audio DMA code
...
Since we use ERR_PTR and similar macros, we need to include
linux/err.h.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
To support alingment of the individual architecture specific linker scripts
provide a set of general definitions in vmlinux.lds.h
With these definitions applied the diverse linekr scripts can be reduced
in line count and their readability are improved - IMO.
A sample linker script is included to give the preferred
order of the sections for the architectures that do not
have any special requirments.
These definitions are also a first step towards eventual
support for -ffunction-sections.
The definitions makes it much easier to do a global
renaming of section names - but the main purpose is
to clean up the linker scripts.
Tim Aboot has provided a lot of inputs to improve
the definitions - all faults are mine.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Tim Abbott <tabbott@mit.edu>
- add .init.rodata to INIT_DATA, and group all initconst flavors
together
- move strings generated from __setup_param() into .init.rodata
- add .*init.rodata to modpost's sets of init sections
- make modpost warn about references between meminit and cpuinit
as well as memexit and cpuexit sections (as CPU and memory
hotplug are independently selectable features)
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
The code to update the print formats for events requires a vprintf
format in the trace_seq. This patch adds that interface.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The sector field is either u64 or unsigned long depending on
the arch. This patch casts the sector to unsigned long long to
prevent the printf warnings.
[ Impact: remove compile warnings ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
these new capabilities to this tracepoint:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
...
Cons:
- no dev_t info for the output of plug, unplug_timer and unplug_io events.
no dev_t info for getrq and sleeprq events if bio == NULL.
no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
This is mainly because we can't get the deivce from a request queue.
But this may change in the future.
- A packet command is converted to a string in TP_assign, not TP_print.
While blktrace do the convertion just before output.
Since pc requests should be rather rare, this is not a big issue.
- In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
has a unique format, which means we have some unused data in a trace entry.
The overhead is minimized by using __dynamic_array() instead of __array().
I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
dd dd + ioctl blktrace dd + TRACE_EVENT (splice)
1 7.36s, 42.7 MB/s 7.50s, 42.0 MB/s 7.41s, 42.5 MB/s
2 7.43s, 42.3 MB/s 7.48s, 42.1 MB/s 7.43s, 42.4 MB/s
3 7.38s, 42.6 MB/s 7.45s, 42.2 MB/s 7.41s, 42.5 MB/s
So the overhead of tracing is very small, and no regression when using
those trace events vs blktrace.
And the binary output of TRACE_EVENT is much smaller than blktrace:
# ls -l -h
-rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
-rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
-rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
Following are some comparisons between TRACE_EVENT and blktrace:
plug:
kjournald-480 [000] 303.084981: block_plug: [kjournald]
kjournald-480 [000] 303.084981: 8,0 P N [kjournald]
unplug_io:
kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1
remap:
kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
kjournald-480 [000] 303.085043: 8,0 A W 102736992 + 8 <- (8,8) 33384
bio_backmerge:
kjournald-480 [000] 303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
kjournald-480 [000] 303.085086: 8,0 M W 102737032 + 8 [kjournald]
getrq:
kjournald-480 [000] 303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084975: 8,0 G W 102736984 + 8 [kjournald]
bash-2066 [001] 1072.953770: 8,0 G N [bash]
bash-2066 [001] 1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
rq_complete:
konsole-2065 [001] 300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
konsole-2065 [001] 300.053191: 8,0 C W 103669040 + 16 [0]
ksoftirqd/1-7 [001] 1072.953811: 8,0 C N (5a 00 08 00 00 00 00 00 24 00) [0]
ksoftirqd/1-7 [001] 1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
rq_insert:
kjournald-480 [000] 303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
kjournald-480 [000] 303.084986: 8,0 I W 102736984 + 8 [kjournald]
Changelog from v2 -> v3:
- use the newly introduced __dynamic_array().
Changelog from v1 -> v2:
- use __string() instead of __array() to minimize the memory required
to store hex dump of rq->cmd().
- support large pc requests.
- add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
- some cleanups.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add ISCSI_NETLINK messages for iSCSI NICs to get information such as
path from userspace. Original iscsid messages are now always sent as
multicast to group 1. The new messages are sent to group 2.
The multicast changes were made by Mike Christie.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the XT_SOCKET_TRANSPARENT flag is set, enabled 'transparent'
socket option is required for the socket to be matched.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch extracts the opaque data from pci i/o
region 0 via the added VIRTIO_BLK_F_IDENTIFY
field. By convention this data takes the form of
that returned by an ATA IDENTIFY DEVICE command,
however the driver (except for structure size)
makes no interpretation of the data. The structure
data is copied wholesale to userspace via a
HDIO_GET_IDENTITY ioctl command (eg: hdparm -i <dev>).
Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Ethtool is a standard way of getting information about
ethernet interfaces. We enhance ethtool kernel interface
& e1000e to make the MDI-X status readable via ethtool in
userspace.
Signed-off-by: Chaitanya Lala <clala@riverbed.com>
Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a netlink interface for configuration of IEEE 802.15.4 device. Also this
interface specifies events notification sent by devices towards higher layers.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for communication over IEEE 802.15.4 networks. This implementation
is neither certified nor complete, but aims to that goal. This commit contains
only the socket interface for communication over IEEE 802.15.4 networks.
One can either send RAW datagrams or use SOCK_DGRAM to encapsulate data
inside normal IEEE 802.15.4 packets.
Configuration interface, drivers and software MAC 802.15.4 implementation will
follow.
Initial implementation was done by Maxim Gorbachyov, Maxim Osipov and Pavel
Smolensky as a research project at Siemens AG. Later the stack was heavily
reworked to better suit the linux networking model, and is now maitained
as an open project partially sponsored by Siemens.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IEEE 802.15.4 stack requires several constants to be defined/adjusted.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change PSCHED_SHIFT from 10 to 6 to increase schedulers time
resolution. This will increase 16x a number of (internal) ticks per
nanosecond, and is needed to improve accuracy of schedulers based on
rate tables, like HTB, TBF or CBQ, with rates above 100Mbit. It is
assumed this change is safe for 32bit accounting of time diffs up
to 2 minutes, which should be enough for common use (extremely low
rate values may overflow, so get inaccurate instead). To make full
use of this change an updated iproute2 will be needed. (But using
older iproute2 should be safe too.)
This change breaks ticks - microseconds similarity, so some minor code
fixes might be needed. It is also planned to change naming adequately
eg. to PSCHED_TICKS2NS() etc. in the near future.
Reported-by: Antonio Almeida <vexwek@gmail.com>
Tested-by: Antonio Almeida <vexwek@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PSCHED_SHIFT constant instead of '10' in PSCHED_US2NS() and
PSCHED_NS2US() macros to enable changing this value later.
Additionally use PSCHED_SHIFT in sch_hfsc SM_SHIFT and ISM_SHIFT
definitions. This part of the patch is based on feedback from
Patrick McHardy <kaber@trash.net>.
Reported-by: Antonio Almeida <vexwek@gmail.com>
Tested-by: Antonio Almeida <vexwek@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CUSE enables implementing character devices in userspace. With recent
additions of ioctl and poll support, FUSE already has most of what's
necessary to implement character devices. All CUSE has to do is
bonding all those components - FUSE, chardev and the driver model -
nicely.
When client opens /dev/cuse, kernel starts conversation with
CUSE_INIT. The client tells CUSE which device it wants to create. As
the previous patch made fuse_file usable without associated
fuse_inode, CUSE doesn't create super block or inodes. It attaches
fuse_file to cdev file->private_data during open and set ff->fi to
NULL. The rest of the operation is almost identical to FUSE direct IO
case.
Each CUSE device has a corresponding directory /sys/class/cuse/DEVNAME
(which is symlink to /sys/devices/virtual/class/DEVNAME if
SYSFS_DEPRECATED is turned off) which hosts "waiting" and "abort"
among other things. Those two files have the same meaning as the FUSE
control files.
The only notable lacking feature compared to in-kernel implementation
is mmap support.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
With the hope that these can be used to eliminate direct
references to the frag list implementation.
Signed-off-by: David S. Miller <davem@davemloft.net>
Furthermore, it twiddles with the details of SKB list handling
directly, which we're trying to eliminate.
Signed-off-by: David S. Miller <davem@davemloft.net>
On Sun, 7 Jun 2009, Ingo Molnar wrote:
> Testing tracer sched_switch: <6>Starting ring buffer hammer
> PASSED
> Testing tracer sysprof: PASSED
> Testing tracer function: PASSED
> Testing tracer irqsoff:
> =============================================
> PASSED
> Testing tracer preemptoff: PASSED
> Testing tracer preemptirqsoff: [ INFO: possible recursive locking detected ]
> PASSED
> Testing tracer branch: 2.6.30-rc8-tip-01972-ge5b9078-dirty #5760
> ---------------------------------------------
> rb_consumer/431 is trying to acquire lock:
> (&cpu_buffer->reader_lock){......}, at: [<c109eef7>] ring_buffer_reset_cpu+0x37/0x70
>
> but task is already holding lock:
> (&cpu_buffer->reader_lock){......}, at: [<c10a019e>] ring_buffer_consume+0x7e/0xc0
>
> other info that might help us debug this:
> 1 lock held by rb_consumer/431:
> #0: (&cpu_buffer->reader_lock){......}, at: [<c10a019e>] ring_buffer_consume+0x7e/0xc0
The ring buffer is a generic structure, and can be used outside of
ftrace. If ftrace traces within the use of the ring buffer, it can produce
false positives with lockdep.
This patch passes in a static lock key into the allocation of the ring
buffer, so that different ring buffers will have their own lock class.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1244477919.13761.9042.camel@twins>
[ store key in ring buffer descriptor ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The MAX17040 is a I2C interfaced Fuel Gauge systems for lithium-ion
batteries This patch adds support the MAX17040 Fuel Gauge
Signed-off-by: Minkyu Kang <mk7.kang@samsung.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
FIP is the FCoE Initialization Protocol and this patch
adds the protocol ethertype to the kernel's list of
ethertypes.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5543/1: arm: serial amba: add missing declaration in serial.h
[ARM] pxa: fix pxa27x_udc default pullup GPIO
[ARM] pxa/imote2: fix UCAM sensor board ADC model number
mx[23]: don't put clock lookups in __initdata
fix oops when using console=ttymxcN with N > 0
[ARM] ARMv7 errata: only apply fixes when running on applicable CPU
[ARM] 5534/1: kmalloc must return a cache line aligned buffer
Passive OS fingerprinting netfilter module allows to passively detect
remote OS and perform various netfilter actions based on that knowledge.
This module compares some data (WS, MSS, options and it's order, ttl, df
and others) from packets with SYN bit set with dynamically loaded OS
fingerprints.
Fingerprint matching rules can be downloaded from OpenBSD source tree
or found in archive and loaded via netfilter netlink subsystem into
the kernel via special util found in archive.
Archive contains library file (also attached), which was shipped
with iptables extensions some time ago (at least when ipt_osf existed
in patch-o-matic).
Following changes were made in this release:
* added NLM_F_CREATE/NLM_F_EXCL checks
* dropped _rcu list traversing helpers in the protected add/remove calls
* dropped unneded structures, debug prints, obscure comment and check
Fingerprints can be downloaded from
http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os
or can be found in archive
Example usage:
-d switch removes fingerprints
Please consider for inclusion.
Thank you.
Passive OS fingerprint homepage (archives, examples):
http://www.ioremap.net/projects/osf
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Current conntrack code kills the ICMP conntrack entry as soon as
the first reply is received. This is incorrect, as we then see only
the first ICMP echo reply out of several possible duplicates as
ESTABLISHED, while the rest will be INVALID. Also this unnecessarily
increases the conntrackd traffic on H-A firewalls.
Make all the ICMP conntrack entries (including the replied ones)
last for the default of nf_conntrack_icmp{,v6}_timeout seconds.
Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: Patrick McHardy <kaber@trash.net>
They are now only accessed within dapm_power_widgets() so can be local
to that function.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
With the re-write of the RFKILL subsystem it is now possible to easily
integrate RFKILL soft-switch support into the Bluetooth subsystem. All
Bluetooth devices will now get automatically RFKILL support.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth source uses some endian conversion helpers, that in the end
translate to kernel standard routines. So remove this obfuscation since it
is fully pointless.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This adds the basic constants required to add support for L2CAP Enhanced
Retransmission feature.
Based on a patch from Nathan Holstein <nathan@lampreynetworks.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Using the L2CAP_CONF_HINT macro is easier to understand than using a
hardcoded 0x80 value.
Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Use macros instead of hardcoded numbers to make the L2CAP source code
more readable.
Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Replace the remaining unsigned shorts with unsigned ints.
Tested with pcap2 codec (25 bits registers).
Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Change the name of the Kernel CAPI exported function capi_ctr_reseted()
to something representing its purpose better.
Impact: renaming, no functional change
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_dma_unmap() is quite expensive for small packets,
because we use two different cache lines from skb_shared_info.
One to access nr_frags, one to access dma_maps[0]
Instead of dma_maps being an array of MAX_SKB_FRAGS + 1 elements,
let dma_head alone in a new dma_head field, close to nr_frags,
to reduce cache lines misses.
Tested on my dev machine (bnx2 & tg3 adapters), nice speedup !
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of num_dma_maps in struct skb_shared_info, as it seems unused.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This header is sometimes included in the uncompress stage to get
register values, but no <linux/amba/bus.h> can be included there.
So declare "struct amba_device" here before using it in a prototype.
Signed-off-by: Alessandro Rubini <rubini@unipv.it>
Acked-by: Andrea Gallo <andrea.gallo@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add IDE_DFLAG_NIEN_QUIRK device flag and use it instead of
drive->quirk_list.
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide_check_nien_quirk_list() helper to the core code
and then use it in ide_port_tune_devices().
* Remove no longer needed ->quirkproc methods from hpt366.c
and pdc202xx_{new,old}.c.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
From the perspective of most users of recent systems, disabling Host
Protected Area (HPA) can break vendor RAID formats, GPT partitions and
risks corrupting firmware or overwriting vendor system recovery tools.
Unfortunately the original (kernels < 2.6.30) behavior (unconditionally
disabling HPA and using full disk capacity) was introduced at the time
when the main use of HPA was to make the drive look small enough for the
BIOS to allow the system to boot with large capacity drives.
Thus to allow the maximum compatibility with the existing setups (using
HPA and partitioned with HPA disabled) we automically disable HPA if
any partitions overlapping HPA are detected. Additionally HPA can also
be disabled using the "nohpa" module parameter (i.e. "ide_core.nohpa=0.0"
to disable HPA on /dev/hda).
v2:
Fix ->resume HPA support.
While at it:
- remove stale "idebus=" entry from Documentation/kernel-parameters.txt
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Frans Pop <elendil@planet.nl>
Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
[patch description was based on input from Alan Cox and Frans Pop]
Emphatically-Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use ->probed_capacity to store native device capacity for ATA disks.
* Add ->set_capacity method to struct ide_disk_ops.
* Implement disk device ->set_capacity method for ATA disks.
* Implement block device ->set_capacity method.
v2:
* Check if LBA and HPA are supported in ide_disk_set_capacity().
* According to the spec the SET MAX ADDRESS command shall be
immediately preceded by a READ NATIVE MAX ADDRESS command.
* Add ide_disk_hpa_{get_native,set}_capacity() helpers.
Together with the previous patch adding ->set_capacity block device
method this allows automatic disabling of Host Protected Area (HPA)
if any partitions overlapping HPA are detected.
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Frans Pop <elendil@planet.nl>
Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Emphatically-Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ->set_capacity block device method and use it in rescan_partitions()
to attempt enabling native capacity of the device upon detecting the
partition which exceeds device capacity.
* Add GENHD_FL_NATIVE_CAPACITY flag to try limit attempts of enabling
native capacity during partition scan.
Together with the consecutive patch implementing ->set_capacity method in
ide-gd device driver this allows automatic disabling of Host Protected Area
(HPA) if any partitions overlapping HPA are detected.
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Frans Pop <elendil@planet.nl>
Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Emphatically-Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This patch removes the forcedeth device ids from pci_ids.h
The forcedeth driver uses the device id constants directly in its source
file.
[ Need to keep PCI_DEVICE_ID_NVIDIA_NVENET_15 in order to keep
drivers/pci/quirks.c building -DaveM ]
Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge reason: This branch was on an -rc5 base so pull almost-2.6.30
to resync with the latest upstream fixes and make sure
the combination works fine.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Retrieval of an fw_unit's parent is a common pattern in high-level code.
Wrap it up as device = fw_parent_device(unit).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Extend generic event enumeration with the PERF_TYPE_HW_CACHE
method.
This is a 3-dimensional space:
{ L1-D, L1-I, L2, ITLB, DTLB, BPU } x
{ load, store, prefetch } x
{ accesses, misses }
User-space passes in the 3 coordinates and the kernel provides
a counter. (if the hardware supports that type and if the
combination makes sense.)
Combinations that make no sense produce a -EINVAL.
Combinations that are not supported by the hardware produce -ENOTSUP.
Extend the tools to deal with this, and rewrite the event symbol
parsing code with various popular aliases for the units and
access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
both valid aliases.
( x86 is supported for now, with the Nehalem event table filled in,
and with Core2 and Atom having placeholder tables. )
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Counter type is a frequently used value and we do a lot of
bit juggling by encoding and decoding it from attr->config.
Clean this up by creating a separate attr->type field.
Also clean up the various similarly complex user-space bits
all around counter attribute management.
The net improvement is significant, and it will be easier
to add a new major type (which is what triggered this cleanup).
(This changes the ABI, all tools are adapted.)
(PowerPC build-tested.)
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add bbt_wait & unlock_all as replaceable for some platform such as
s3c64xx s3c64xx has its own OneNAND controller and another interface
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Add support for Samsung Flex-OneNAND devices.
Flex-OneNAND combines SLC and MLC technologies into a single device.
SLC area provides increased reliability and speed, suitable for storing
code such as bootloader, kernel and root file system. MLC area
provides high density and is suitable for storing user data.
SLC and MLC regions can be configured through kernel parameter.
[akpm@linux-foundation.org: export flexoand_region and onenand_addr]
Signed-off-by: Rohit Hagargundgi <h.rohit@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Vishak G <vishak.g@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The ConnectX Programmer's Reference Manual states that the "SO" bit
must be set when posting Fast Register and Local Invalidate send work
requests. When this bit is set, the work request will be executed
only after all previous work requests on the send queue have been
executed. (If the bit is not set, Fast Register and Local Invalidate
WQEs may begin execution too early, which violates the defined
semantics for these operations)
This fixes the issue with NFS/RDMA reported in
<http://lists.openfabrics.org/pipermail/general/2009-April/059253.html>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add optional callback to allow platform to initialize partitions.
Static partitions on a nand device could vary depending on the size of the
device. This patch allows an optional platform callback to be used to
setup this partition information at runtime.
Scan order is:
1) chip.part_probe_types
2) chip.set_parts
3) chip.partitions
4) full mtd device (fallback for no partitions)
Some of the existing nand drivers could possibly be replaced by the
plat_nand driver by using this patch. These include autcpu12.c and
ts7250.c drivers.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Add optional probe and remove callbacks to the plat_nand driver.
Some platforms may require additional setup, such as configuring the
memory controller, before the nand device can be accessed. This patch
provides an optional callback to handle this setup as well as a callback
to teardown the setup.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Tested-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch adds (write|read)_buf callbacks to plat_nand.
The NAND on the TS-7800 provisioned by the FPGA allows readw() and
readl() to be used which gives a 2.5x speed up. To be able to use this
from the plat_nand driver a hook for read_buf (and also write_buf whilst
we are in there) need to be made available. This patch adds the hook.
Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In addition to adding the Numonyx manufacturer code, this patch
also ensures 'sync. write' is disabled when reading identification
data - something that the Numonyx chip objects to, but the
Samsung chip seems to ignore.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This file does not define any kernel-userspace API, all
it does it defines few helpers for userspace. Instead,
userspace should have a private copy of this file.
The main (if not the only) user is the mtd-utils package, but
it already has a private copy of this file.
This patch also removes references to 'jffs2-user.h' from
'Kbuild' and MAINTAINERS' files.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In order to allow easy tracking of the period, also provide means of
adding it to the sample data.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The purpose of PERF_SAMPLE_CONFIG was to identify the counters,
since then we've added counter ids, use those instead.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a PNP resource range check function, indicating whether a resource
has been assigned to any device.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
[apw@canonical.com: fixed up exports et al]
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
The three header files of firewire-core, i.e.
"drivers/firewire/fw-device.h",
"drivers/firewire/fw-topology.h",
"drivers/firewire/fw-transaction.h",
are replaced by
"drivers/firewire/core.h",
"include/linux/firewire.h".
The latter includes everything which a firewire high-level driver (like
firewire-sbp2) needs besides linux/firewire-constants.h, while core.h
contains the rest which is needed by firewire-core itself and by low-
level drivers (card drivers) like firewire-ohci.
High-level drivers can now also reside outside of drivers/firewire
without having to add drivers/firewire to the header file search path in
makefiles. At least the firedtv driver will be such a driver.
I also considered to spread the contents of core.h over several files,
one for each .c file where the respective implementation resides. But
it turned out that most core .c files will end up including most of the
core .h files. Also, the combined core.h isn't unreasonably big, and it
will lose more of its contents to linux/firewire.h anyway soon when more
firewire drivers are added. (IP-over-1394, firedtv, and there are plans
for one or two more.)
Furthermore, fw-ohci.h is renamed to ohci.h. The name of core.h and
ohci.h is chosen with regard to name changes of the .c files in a
follow-up change.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
In order to track the vdso also generate mmap events for
install_special_mapping().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adds support for specifying a range of queues instead of a single queue
id. Flows will be distributed across the given range.
This is useful for multicore systems: Instead of having a single
application read packets from a queue, start multiple
instances on queues x, x+1, .. x+n. Each instance can process
flows independently.
Packets for the same connection are put into the same queue.
Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The "trace || CLONE_PTRACE" check in tracehook_report_clone() is not right,
- If the untraced task does clone(CLONE_PTRACE) the new child is not traced,
we must not queue SIGSTOP.
- If we forked the traced task, but the tracer exits and untraces both the
forking task and the new child (after copy_process() drops tasklist_lock),
we should not queue SIGSTOP too.
Change the code to check task_ptrace() != 0 instead. This is still racy, but
the race is harmless.
We can race with another tracer attaching to this child, or the tracer can
exit and detach in parallel. But giwen that we didn't do wake_up_new_task()
yet, the child must have the pending SIGSTOP anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This can be used for other arm platforms too as discussed
on the linux-arm-kernel list.
Also check the return value with IS_ERR and return PTR_ERR
as suggested by Russell King.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The Nomadik 8815 SoC has a slightly modified version of the PL011 block.
The patch uses the different ID value as a key to select a vendor
structure that is used to keep track of the differences, as suggested
by Russell King.
Signed-off-by: Alessandro Rubini <rubini@unipv.it>
Acked-by: Andrea Gallo <andrea.gallo@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In name of keeping it simple, only track mmap events. Userspace
will have to remove old overlapping maps when it encounters them.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Create a fork event so that we can easily clone the comm and
dso maps without having to generate all those events.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As it stands skb fraglists can get past the check in dev_queue_xmit
if the skb is marked as GSO. In particular, if the packet doesn't
have the proper checksums for GSO, but can otherwise be handled by
the underlying device, we will not perform the fraglist check on it
at all.
If the underlying device cannot handle fraglists, then this will
break.
The fix is as simple as moving the fraglist check from the device
check into skb_gso_ok.
This has caused crashes with Xen when used together with GRO which
can generate GSO packets with fraglists.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the dependency of mmap_min_addr on CONFIG_SECURITY.
It also sets a default mmap_min_addr of 4096.
mmapping of addresses below 4096 will only be possible for processes
with CAP_SYS_RAWIO.
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Eric Paris <eparis@redhat.com>
Looks-ok-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
Making the drm_crtc.c code recognize the DPMS property and invoke the
connector->dpms function doesn't remove any capability from the driver while
reducing code duplication.
That just highlighted the problem with the existing DPMS functions which
could turn off the connector, but failed to turn off any relevant crtcs. The
new drm_helper_connector_dpms function manages all of that, using the
drm_helper-specific crtc and encoder dpms functions, automatically computing
the appropriate DPMS level for each object in the system.
This fixes the current troubles in the i915 driver which left PLLs, pipes
and planes running while in DPMS_OFF mode or even while they were unused.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Prepare the api for the arrival of a new parameter, 'scribble'. This
will allow callers to identify scratchpad memory for dma address or page
address conversions. As this adds yet another parameter, take this
opportunity to convert the common submission parameters (flags,
dependency, callback, and callback argument) into an object that is
passed by reference.
Also, take this opportunity to fix up the kerneldoc and add notes about
the relevant ASYNC_TX_* flags for each routine.
[ Impact: moves api pass-by-value parameters to a pass-by-reference struct ]
Signed-off-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In support of inter-channel chaining async_tx utilizes an ack flag to
gate whether a dependent operation can be chained to another. While the
flag is not set the chain can be considered open for appending. Setting
the ack flag closes the chain and flags the descriptor for garbage
collection. The ASYNC_TX_DEP_ACK flag essentially means "close the
chain after adding this dependency". Since each operation can only have
one child the api now implicitly sets the ack flag at dependency
submission time. This removes an unnecessary management burden from
clients of the api.
[ Impact: clean up and enforce one dependency per operation ]
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
To be easier on drivers and users, have cfg80211 register an
rfkill structure that drivers can access. When soft-killed,
simply take down all interfaces; when hard-killed the driver
needs to notify us and we will take down the interfaces
after the fact. While rfkilled, interfaces cannot be set UP.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Sometimes it is necessary to know how the state is,
and it is easier to query rfkill than keep track of
it somewhere else, so add a function for that. This
could later be expanded to return hard/soft block,
but so far that isn't necessary.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch introduces new cfg80211 API to set the TX power
via cfg80211, puts the wext code into cfg80211 and updates
mac80211 to use all that. The -ENETDOWN bits are a hack but
will go away soon.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The new code added by this patch will make rfkill create
a misc character device /dev/rfkill that userspace can use
to control rfkill soft blocks and get status of devices as
well as events when the status changes.
Using it is very simple -- when you open it you can read
a number of times to get the initial state, and every
further read blocks (you can poll) on getting the next
event from the kernel. The same structure you read is
also used when writing to it to change the soft block of
a given device, all devices of a given type, or all
devices.
This also makes CONFIG_RFKILL_INPUT selectable again in
order to be able to test without it present since its
functionality can now be replaced by userspace entirely
and distros and users may not want the input part of
rfkill interfering with their userspace code. We will
also write a userspace daemon to handle all that and
consequently add the input code to the feature removal
schedule.
In order to have rfkilld support both kernels with and
without CONFIG_RFKILL_INPUT (or new kernels after its
eventual removal) we also add an ioctl (that only exists
if rfkill-input is present) to disable rfkill-input.
It is not very efficient, but at least gives the correct
behaviour in all cases.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch completely rewrites the rfkill core to address
the following deficiencies:
* all rfkill drivers need to implement polling where necessary
rather than having one central implementation
* updating the rfkill state cannot be done from arbitrary
contexts, forcing drivers to use schedule_work and requiring
lots of code
* rfkill drivers need to keep track of soft/hard blocked
internally -- the core should do this
* the rfkill API has many unexpected quirks, for example being
asymmetric wrt. alloc/free and register/unregister
* rfkill can call back into a driver from within a function the
driver called -- this is prone to deadlocks and generally
should be avoided
* rfkill-input pointlessly is a separate module
* drivers need to #ifdef rfkill functions (unless they want to
depend on or select RFKILL) -- rfkill should provide inlines
that do nothing if it isn't compiled in
* the rfkill structure is not opaque -- drivers need to initialise
it correctly (lots of sanity checking code required) -- instead
force drivers to pass the right variables to rfkill_alloc()
* the documentation is hard to read because it always assumes the
reader is completely clueless and contains way TOO MANY CAPS
* the rfkill code needlessly uses a lot of locks and atomic
operations in locked sections
* fix LED trigger to actually change the LED when the radio state
changes -- this wasn't done before
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> [thinkpad]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
NETDEV_UP is called after the device is set UP, but sometimes
it is useful to be able to veto the device UP. Introduce a
new NETDEV_PRE_UP notifier that can be used for exactly this.
The first use case will be cfg80211 denying interfaces to be
set UP if the device is known to be rfkill'ed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of hardcoding the key length for validation, use the
constants Zhu Yi recently added and add one for AES_CMAC too.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo has updated the driver to no longer use the change flag,
so we can remove that, but rt2x00 and ath5k still use the
actual value so let's mark it as deprecated too.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Here is an updated patch to include the extra call to
trace_seq_init() as requested. This is vs. the latest
-tip tree and fixes the use of multiple __print_flags
and __print_symbolic in a single tracer. Also tested
to ensure its working now:
mount.gfs2-2534 [000] 235.850587: gfs2_glock_queue: 8.7 glock 1:2 dequeue PR
mount.gfs2-2534 [000] 235.850591: gfs2_demote_rq: 8.7 glock 1:0 demote EX to NL flags:DI
mount.gfs2-2534 [000] 235.850591: gfs2_glock_queue: 8.7 glock 1:0 dequeue EX
glock_workqueue-2529 [000] 235.850666: gfs2_glock_state_change: 8.7 glock 1:0 state EX => NL tgt:NL dmt:NL flags:lDpI
glock_workqueue-2529 [000] 235.850672: gfs2_glock_put: 8.7 glock 1:0 state NL => IV flags:I
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
LKML-Reference: <1244037123.29604.603.camel@localhost.localdomain>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Prior implementation of the new sctp_connectx() call that returns
an association ID did not work correctly on non-blocking socket.
This is because we could not return both a EINPROGRESS error and
an association id. This is a new implementation that supports this.
Originally from Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC 5061 Section 5.1 ASCONF Chunk Procedures said:
B4) Re-transmit the ASCONF Chunk last sent and if possible choose an
alternate destination address (please refer to [RFC4960],
Section 6.4.1). An endpoint MUST NOT add new parameters to this
chunk; it MUST be the same (including its Sequence Number) as
the last ASCONF sent. An endpoint MAY, however, bundle an
additional ASCONF with new ASCONF parameters with the next
Sequence Number. For details, see Section 5.5.
This patch fix to choose an alternate destination address when
re-transmit the ASCONF chunk, with some dup codes cleanup.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC5061 had changed the error cause codes for Dynamic Address
Reconfiguration As the following:
Cause Code
Value Cause Code
--------- ----------------
0x00A0 Request to Delete Last Remaining IP Address
0x00A1 Operation Refused Due to Resource Shortage
0x00A2 Request to Delete Source IP Address
0x00A3 Association Aborted Due to Illegal ASCONF-ACK
0x00A4 Request Refused - No Authorization
This patch fix the error cause codes.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Can remove anonymous union now it has one field.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;
Delete skb->dst field
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb
Delete skb->rtable field
Setting rtable is not allowed, just set dst instead as rtable is an alias.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct sk_buff uses one union to define dst and rtable fields.
We want to replace direct access to these pointers by accessors.
First patch adds a new "unsigned long _skb_dst;" opaque field
in this union.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the notify chain infrastructure and replace it
by a simple function pointer. This issue has been mentioned in the
mailing list several times: the use of the notify chain adds
too much overhead for something that is only used by ctnetlink.
This patch also changes nfnetlink_send(). It seems that gfp_any()
returns GFP_KERNEL for user-context request, like those via
ctnetlink, inside the RCU read-side section which is not valid.
Using GFP_KERNEL is also evil since netlink may schedule(),
this leads to "scheduling while atomic" bug reports.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
blk_queue_bounce_limit() is more than a wrapper about the request queue
limits.bounce_pfn variable. Introduce blk_queue_bounce_pfn() which can
be called by stacking drivers that wish to set the bounce limit
explicitly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The structure isn't hw only and when I read event, I think about those
things that fall out the other end. Rename the thing.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since some people worried that 4G might not be a large enough
as an mmap data window, extend it to 64 bit for capable
platforms.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
IRQ (non-NMI) sampling is not used anymore - remove the last few bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A few renames:
s/irq_period/sample_period/
s/irq_freq/sample_freq/
s/PERF_RECORD_/PERF_SAMPLE_/
s/record_type/sample_type/
And change both the new sample_type and read_format to u64.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephan raised the issue that we currently cannot distinguish between
similar counters within a group (PERF_RECORD_GROUP uses the config
value as identifier).
Therefore, generate a new ID for each counter using a global u64
sequence counter.
Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Apparently I'm the first to use it :-)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch simplifies the conntrack event caching system by removing
several events:
* IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted
since the have no clients.
* IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter
days.
* IPCT_REFRESH which is not of any use since we always include the
timeout in the messages.
After this patch, the existing events are:
* IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify
addition and deletion of entries.
* IPCT_STATUS, that notes that the status bits have changes,
eg. IPS_SEEN_REPLY and IPS_ASSURED.
* IPCT_PROTOINFO, that reports that internal protocol information has
changed, eg. the TCP, DCCP and SCTP protocol state.
* IPCT_HELPER, that a helper has been assigned or unassigned to this
entry.
* IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this
covers the case when a mark is set to zero.
* IPCT_NATSEQADJ, to report that there's updates in the NAT sequence
adjustment.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch moves the event flags from linux/netfilter/nf_conntrack_common.h
to net/netfilter/nf_conntrack_ecache.h. This flags are not of any use
from userspace.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
During the module removal there are no possible event listeners
since ctnetlink must be removed before to allow removing
nf_conntrack. This patch removes the event reporting for the
module removal case which is not of any use in the existing code.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Ideally we should have a directory of drivers and a link to the 'active'
driver. For now just show the first device which is effectively the existing
semantics without a warning.
This is an update on the original buggy patch that I then forgot to
resubmit. Confusingly it was proposed by Red Hat, written by Etched Pixels
fixed and submitted by Intel ...
Resolves-Bug: http://bugzilla.kernel.org/show_bug.cgi?id=9749
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stop using task_struct::pid and start using PID namespaces.
PIDs will be reported in the PID namespace of the monitoring
task at the moment of counter creation.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch below adds supporting TCP simultaneous open to conntrack. The
unused LISTEN state is replaced by a new state (SYN_SENT2) denoting the
second SYN sent from the reply direction in the new case. The state table
is updated and the function tcp_in_window is modified to handle
simultaneous open.
The functionality can fairly easily be tested by socat. A sample tcpdump
recording
23:21:34.244733 IP (tos 0x0, ttl 64, id 49224, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xe75f (correct), 3383710133:3383710133(0) win 5840 <mss 1460,sackOK,timestamp 173445629 0,nop,wscale 7>
23:21:34.244783 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 192.168.0.1.2020 > 192.168.0.254.2020: R, cksum 0x0253 (correct), 0:0(0) ack 3383710134 win 0
23:21:36.038680 IP (tos 0x0, ttl 64, id 28092, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.1.2020 > 192.168.0.254.2020: S, cksum 0x704b (correct), 2634546729:2634546729(0) win 5840 <mss 1460,sackOK,timestamp 824213 0,nop,wscale 1>
23:21:36.038777 IP (tos 0x0, ttl 64, id 49225, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xb179 (correct), 3383710133:3383710133(0) ack 2634546730 win 5840 <mss 1460,sackOK,timestamp 173447423 824213,nop,wscale 7>
23:21:36.038847 IP (tos 0x0, ttl 64, id 28093, offset 0, flags [DF], proto TCP (6), length 52) 192.168.0.1.2020 > 192.168.0.254.2020: ., cksum 0xebad (correct), ack 3383710134 win 2920 <nop,nop,timestamp 824213 173447423>
and the corresponding netlink events:
[NEW] tcp 6 120 SYN_SENT src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 [UNREPLIED] src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
[UPDATE] tcp 6 120 LISTEN src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
[UPDATE] tcp 6 60 SYN_RECV src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
[UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [ASSURED]
The RST packet was dropped in the raw table, thus it did not reach
conntrack. nfnetlink_conntrack is unpatched so it shows the new SYN_SENT2
state as the old unused LISTEN.
With TCP simultaneous open support we satisfy REQ-2 in RFC 5382 ;-) .
Additional minor correction in this patch is that in order to catch
uninitialized reply directions, "td_maxwin == 0" is used instead of
"td_end == 0" because the former can't be true except in uninitialized
state while td_end may accidentally be equal to zero in the mid of a
connection.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This removes the prev_state field of struct perf_counter since
it is now unused. It was only used by the cpu migration
counter, which doesn't use it any more.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <18979.35052.915728.626374@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This fixes the cpu migration software counter to count
correctly even when contexts get swapped from one task to
another. Previously the cpu migration counts reported by perf
stat were bogus, ranging from negative to several thousand for
a single "lat_ctx 2 8 32" run. With this patch the cpu
migration count reported for "lat_ctx 2 8 32" is almost always
between 35 and 44.
This fixes the problem by adding a call into the perf_counter
code from set_task_cpu when tasks are migrated. This enables
us to use the generic swcounter code (with some modifications)
for the cpu migration counter.
This modifies the swcounter code to allow a NULL regs pointer
to be passed in to perf_swcounter_ctx_event() etc. The cpu
migration counter does this because there isn't necessarily a
pt_regs struct for the task available. In this case, the
counter will not have interrupt capability - but the migration
counter didn't have interrupt capability before, so this is no
loss.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <18979.35006.819769.416327@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
UBI volume notifications are intended to create the API to get clients
notified about volume creation/deletion, renaming and re-sizing. A
client can subscribe to these notifications using 'ubi_volume_register()'
and cancel the subscription using 'ubi_volume_unregister()'. When UBI
volumes change, a blocking notifier is called. Clients also can request
"added" events on all volumes that existed before client subscribed
to the notifications.
If we use notifications instead of calling functions like 'ubi_gluebi_xxx()',
we can make the MTD emulation layer to be more flexible: build it as a
separate module and load/unload it on demand.
[Artem: many cleanups, rework locking, add "updated" event, provide
device/volume info in notifiers]
Signed-off-by: Dmitry Pervushin <dpervushin@embeddedalley.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Introduce snd_card_set_id() function to allow lowlevel drivers to set
default identification name for card slot. The function checks also
for identification name collisions and tries to create unique name.
Also, the snd_card_create() function is simplified, because this new
function is used. As bonus, proper name collision checks are evaluated
at the card create time.
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
After some discussion offline with Christoph Lameter and David Stevens
regarding multicast behaviour in Linux, I'm submitting a slightly
modified patch from the one Christoph submitted earlier.
This patch provides a new socket option IP_MULTICAST_ALL.
In this case, default behaviour is _unchanged_ from the current
Linux standard. The socket option is set by default to provide
original behaviour. Sockets wishing to receive data only from
multicast groups they join explicitly will need to clear this
socket option.
Signed-off-by: Nivedita Singhvi <niv@us.ibm.com>
Signed-off-by: Christoph Lameter<cl@linux.com>
Acked-by: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The trace_pipe did not recognize the latency format flag and would produce
different output than the trace file. The problem was partly due that
the trace flags in the iterator was not set as well as the trace_pipe
zeros out part of the iterator (including the flags) to be able to use
the same routines as the trace file. trace_flags of the iterator should
not cause any problems when not zeroed out by for trace_pipe.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
After converting the softirq tracer to use te flags options, this
caused a regression with the name. Since the flag was used directly
it was printed out (i.e. HRTIMER_SOFTIRQ).
This patch only shows the softirq name without the SOFTIRQ part.
[ Impact: fix regression of output from softirq events ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
__string() is limited:
- it's a char array, but we may want to define array with other types
- a source string should be available, but we may just know the string size
We introduce __dynamic_array() to break those limitations, and __string()
becomes a wrapper of it. As a side effect, now __get_str() can be used
in TP_fast_assign but not only TP_print.
Take XFS for example, we have the string length in the dirent, but the
string itself is not NULL-terminated, so __dynamic_array() can be used:
TRACE_EVENT(xfs_dir2,
TP_PROTO(struct xfs_da_args *args),
TP_ARGS(args),
TP_STRUCT__entry(
__field(int, namelen)
__dynamic_array(char, name, args->namelen + 1)
...
),
TP_fast_assign(
char *name = __get_str(name);
if (args->namelen)
memcpy(name, args->name, args->namelen);
name[args->namelen] = '\0';
__entry->namelen = args->namelen;
),
TP_printk("name %.*s namelen %d",
__entry->namelen ? __get_str(name) : NULL
__entry->namelen)
);
[ Impact: allow defining dynamic size arrays ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2384D2.3080403@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently TP_fast_assign has a limitation that we can't define local
variables in it.
Here's one use case when we introduce __dynamic_array():
TP_fast_assign(
type *p = __get_dynamic_array(item);
foo(p);
bar(p);
),
[ Impact: allow defining local variables in TP_fast_assign ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2384B1.90100@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
v3: zhaolei@cn.fujitsu.com: Change TRACE_EVENT definition to new format
introduced by Steven Rostedt: consolidate trace and trace_event headers
v2: kosaki@jp.fujitsu.com: print the function names instead of addr, and zap
the work addr
v1: zhaolei@cn.fujitsu.com: Make workqueue tracepoints use TRACE_EVENT macro
TRACE_EVENT is a more generic way to define tracepoints.
Doing so adds these new capabilities to the tracepoints:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
Then, this patch converts DEFINE_TRACE to TRACE_EVENT in workqueue related
tracepoints.
[ Impact: expand workqueue tracer to events tracing ]
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Conflicts:
arch/mips/sibyte/bcm1480/irq.c
arch/mips/sibyte/sb1250/irq.c
Merge reason: we gathered a few conflicts plus update to latest upstream fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add 'autoconf' and 'disable_ipv6' parameters to the IPv6 module.
The first controls if IPv6 addresses are autoconfigured from
prefixes received in Router Advertisements. The IPv6 loopback
(::1) and link-local addresses are still configured.
The second controls if IPv6 addresses are desired at all. No
IPv6 addresses will be added to any interfaces.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a generic driver for SJA1000 chips on the OpenFirmware
platform bus found on embedded PowerPC systems. You need a SJA1000 node
definition in your flattened device tree source (DTS) file similar to:
can@3,100 {
compatible = "nxp,sja1000";
reg = <3 0x100 0x80>;
interrupts = <2 0>;
interrupt-parent = <&mpic>;
nxp,external-clock-frequency = <16000000>;
};
See also Documentation/powerpc/dts-bindings/can/sja1000.txt.
CC: devicetree-discuss@ozlabs.org
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This abstracts out the code for locking the context associated
with a task. Because the context might get transferred from
one task to another concurrently, we have to check after
locking the context that it is still the right context for the
task and retry if not. This was open-coded in
find_get_context() and perf_counter_init_task().
This adds a further function for pinning the context for a
task, i.e. marking it so it can't be transferred to another
task. This adds a 'pin_count' field to struct
perf_counter_context to indicate that a context is pinned,
instead of the previous method of setting the parent_gen count
to all 1s. Pinning the context with a pin_count is easier to
undo and doesn't require saving the parent_gen value. This
also adds a perf_unpin_context() to undo the effect of
perf_pin_task_context() and changes perf_counter_init_task to
use it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <18979.34748.755674.596386@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: merge almost-rc8 into perfcounters/core, which was -rc6
based - to pick up the latest upstream fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix the following 'make headers_check' warnings:
usr/include/linux/net_dropmon.h:7: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
fix the following 'make headers_check' warnings:
usr/include/linux/auto_fs.h:17: include of <linux/types.h> is preferred over <asm/types.h>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
This patch converts unicast address list to standard list_head using
previously introduced struct netdev_hw_addr. It also relaxes the
locking. Original spinlock (still used for multicast addresses) is not
needed and is no longer used for a protection of this list. All
reading and writing takes place under rtnl (with no changes).
I also removed a possibility to specify the length of the address
while adding or deleting unicast address. It's always dev->addr_len.
The convertion touched especially e1000 and ixgbe codes when the
change is not so trivial.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 13 +--
drivers/net/e1000/e1000_main.c | 24 +++--
drivers/net/ixgbe/ixgbe_common.c | 14 ++--
drivers/net/ixgbe/ixgbe_common.h | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/ixgbe/ixgbe_type.h | 4 +-
drivers/net/macvlan.c | 11 +-
drivers/net/mv643xx_eth.c | 11 +-
drivers/net/niu.c | 7 +-
drivers/net/virtio_net.c | 7 +-
drivers/s390/net/qeth_l2_main.c | 6 +-
drivers/scsi/fcoe/fcoe.c | 16 ++--
include/linux/netdevice.h | 18 ++--
net/8021q/vlan.c | 4 +-
net/8021q/vlan_dev.c | 10 +-
net/core/dev.c | 195 +++++++++++++++++++++++++++-----------
net/dsa/slave.c | 10 +-
net/packet/af_packet.c | 4 +-
18 files changed, 227 insertions(+), 137 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: libps2 - better handle bad scheduler decisions
Input: usb1400_ts - fix access to "device data" in resume function
Input: multitouch - augment event semantics documentation
Input: multitouch - add tracking ID to the protocol
mapping->tree_lock can be acquired from interrupt context. Then,
following dead lock can occur.
Assume "A" as a page.
CPU0:
lock_page_cgroup(A)
interrupted
-> take mapping->tree_lock.
CPU1:
take mapping->tree_lock
-> lock_page_cgroup(A)
This patch tries to fix above deadlock by moving memcg's hook to out of
mapping->tree_lock. charge/uncharge of pagecache/swapcache is protected
by page lock, not tree_lock.
After this patch, lock_page_cgroup() is not called under mapping->tree_lock.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
linux/cred.h can't be included as first header (alphabetical order)
because it uses __init which is enough to break compilation on some archs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When fork() fails we cannot use perf_counter_exit_task() since that
assumes to operate on current. Write a new helper that cleans up
unused/clean contexts.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
New MEMERASE/MEMREADOOB/MEMWRITEOOB ioctls are needed in order to support
64-bit offsets into large NAND flash devices.
Signed-off-by: Kevin Cernekee <kpc.mtd@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Move the fifo_size assignment to hw->ioctl callback to allow lowlevel
drivers overwrite the default behaviour.
fifo_size is in frames not bytes as specified in asound.h and alsa-lib's
documentation, but most hardware have fixed byte based FIFOs. Introduce
internal SNDRV_PCM_INFO_FIFO_IN_FRAMES.
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[PATCH net-next] bonding: allow bond in mode balance-alb to work properly in bridge -try4.3
(updated)
changes v4.2 -> v4.3
- memcpy the address always, not just in case it differs from master->dev_addr
- compare_ether_addr_64bits() is not used so there is no direct need to make new
header file (I think it would be good to have bond stuff in separate file
anyway).
changes v4.1 -> v4.2
- use skb->pkt_type == PACKET_HOST compare rather then comparing skb dest addr
against skb->dev->dev_addr
The problem is described in following bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=487763
Basically here's what's going on. In every mode, bonding interface uses the same
mac address for all enslaved devices (except fail_over_mac). Only balance-alb
will simultaneously use multiple MAC addresses across different slaves. When you
put this kind of bond device into a bridge it will only add one of mac adresses
into a hash list of mac addresses, say X. This mac address is marked as local.
But this bonding interface also has mac address Y. Now then packet arrives with
destination address Y, this address is not marked as local and the packed looks
like it needs to be forwarded. This packet is then lost which is wrong.
Notice that interfaces can be added and removed from bond while it is in bridge.
***
When the multiple addresses for bridge port approach failed to solve this issue
due to STP I started to think other way to solve this. I returned to previous
solution but tweaked one.
This patch solves the situation in the bonding without touching bridge code.
For every incoming frame to bonding the destination address is compared to
current address of the slave device from which tha packet came. If these two
match destination address is replaced by mac address of the master. This address
is known by bridge so it is delivered properly. Note that the comparsion is not
made directly, it's used skb->pkt_type == PACKET_HOST instead. This is "set"
previously in eth_type_trans().
I experimentally tried that this works as good as searching through the slave
list (v4 of this patch).
Jirka
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the 'state_get' API call was added, we need to increase the minor
protocol version number so applications that depend on the can check
it's presence.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
wimax connection manager / daemon has to know what is current
state of the device. Previously it was only possible to get
notification whet state has changed.
Note:
By mistake, the new generic netlink's number for
WIMAX_GNL_OP_STATE_GET was declared inserting into the existing list
of API calls, not appending; thus, it'd break existing API.
Fixed by Inaky Perez-Gonzalez <inaky@linux.intel.com> by moving to
the tail, where we add to the interface, not modify the interface.
Thanks to Stephen Hemminger <shemminger@vyatta.com> for catching this.
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Testing the i7300_idle driver on i5000-series hardware required
an edit to i7300_idle.h to "#define SUPPORT_I5000 1" and a re-build
of both i7300_idle and ioat_dma.
Replace that build-time scheme with a load-time module parameter:
"7300_idle.forceload=1" to make it easier to test the driver
on hardware that while not officially validated, works fine
and is much more commonly available.
By default (no modparam) the driver will continue to load
only on the i7300.
Note that ioat_dma runs a copy of i7300_idle's probe routine
to know to reserve an IOAT channel for i7300_idle.
This change makes ioat_dma do that always on the i5000,
just like it does on the i7300.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Andrew Henroid <andrew.d.henroid@intel.com>
Commit 564c2b21 ("perf_counter: Optimize context switch between
identical inherited contexts") introduced a race where it is possible
that a counter being attached to a task could get attached to the
wrong task, if the task is one that has inherited its context from
another task via fork. This happens because the optimized context
switch could switch the context to another task after find_get_context
has read task->perf_counter_ctxp. In fact, it's possible that the
context could then get freed, if the other task then exits.
This fixes the problem by protecting both the context switch and the
critical code in find_get_context with spinlocks. The context switch
locks the cxt->lock of both the outgoing and incoming contexts before
swapping them. That means that once code such as find_get_context
has obtained the spinlock for the context associated with a task,
the context can't get swapped to another task. However, the context
may have been swapped in the interval between reading
task->perf_counter_ctxp and getting the lock, so it is necessary to
check and retry.
To make sure that none of the contexts being looked at in
find_get_context can get freed, this changes the context freeing code
to use RCU. Thus an rcu_read_lock() is sufficient to ensure that no
contexts can get freed. This part of the patch is lifted from a patch
posted by Peter Zijlstra.
This also adds a check to make sure that we can't add a counter to a
task that is exiting.
There is also a race between perf_counter_exit_task and
find_get_context; this solves the race by moving the get_ctx that
was in perf_counter_alloc into the locked region in find_get_context,
so that once find_get_context has got the context for a task, it
won't get freed even if the task calls perf_counter_exit_task. It
doesn't matter if new top-level (non-inherited) counters get attached
to the context after perf_counter_exit_task has detached the context
from the task. They will just stay there and never get scheduled in
until the counters' fds get closed, and then perf_release will remove
them from the context and eventually free the context.
With this, we are now doing the unclone in find_get_context rather
than when a counter was added to or removed from a context (actually,
we were missing the unclone_ctx() call when adding a counter to a
context). We don't need to unclone when removing a counter from a
context because we have no way to remove a counter from a cloned
context.
This also takes out the smp_wmb() in find_get_context, which Peter
Zijlstra pointed out was unnecessary because the cmpxchg implies a
full barrier anyway.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <18974.33033.667187.273886@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pointed out by compiler warnings:
tip/include/linux/perf_counter.h:644: warning: no return statement in function returning non-void
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The ACPI0007 _HID used for processor "Device" objects in the namespace
is not needed outside the processor driver, so move it there. Also, the
#define is only used once, so just remove it and hard-code "ACPI0007".
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
ACPI_PROCESSOR_OBJECT_HID is a synthetic _HID that Linux generates
for "Processor" definitions. Unlike "Device" definitions, "Processor"
definitions do not have a _HID in the namespace, so we generate a
fake _HID. By convention, all these fake _HIDs begin with "LNX".
This does change the user-visible _HID for "Processor" objects --
previously, we used "ACPI_CPU" and this changes that to "LNXCPU",
which starts with "LNX" as do all the other made-up _HIDs. This
change is visible in processor filenames and "hid" files under
/sys/devices/LNXSYSTM:00/.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
"call" is an argument of macro, but it is also used as a local
variable name of function in macro.
We should keep this local variable name distinct from any
CPP macro parameter name if both are in the same macro scope,
although it hasn't caused any problem yet.
[ Impact: robustify macro ]
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Use ALIGN() and PTR_ALIGN() macros instead of handcoding them.
Get rid of NETDEV_ALIGN_CONST ugly define
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current MTT allocator uses kmalloc() to allocate a buffer for its
buddy allocator, and thus is limited in the amount of MTT segments
that it can control. As a result, the size of memory that can be
registered is limited too. This patch uses a module parameter to
control the number of MTT entries that each segment represents,
allowing more memory to be registered with the same number of
segments.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make REQHASH() an inline function. Rename hash_list to cache_hash.
Fix an obsolete comment.
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This patch adds CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ that exposes
the u64 handshake sequence number to user-space.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
it's a little too large for static line.
The ts is currently the only mainline user but Marek Vasut claims that
there is a battery driver in an ARM tree which also needs this function.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
For the overwhelming majority of cases, skb_gro_header's return
value cannot be NULL. Yet we must check it because of its current
form. This patch splits it up into multiple functions in order
to avoid this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
By caching frag0_len, we can avoid checking both frag0 and the
length separately in skb_gro_header. This helps as skb_gro_header
is called four times per packet which amounts to a few million
times at 10Gb/s.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently skb_gro_header is used for packets which put the hardware
header in skb->data with the rest in frags. Since the drivers that
need this optimisation all provide completely non-linear packets,
we can gain extra optimisations by only performing the frag0
optimisation for completely non-linear packets.
In particular, we can simply test frag0 (instead of skb_headlen)
to see whether the optimisation is in force.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function skb_gro_header is called four times per packet which
quickly adds up at 10Gb/s. This patch inlines it to allow better
optimisations.
Some architectures perform multiplication for page_address, which
is done by each skb_gro_header invocation. This patch caches that
value in skb->cb to avoid the unnecessary multiplications.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update version number.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This interface enables the override or creation of a single
control method. Useful to repair a bug or install a missing method.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Version 20090422.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Merge the OSL with the actual file used by Linux, so that the
file does not require patching when integrated with Linux. General
cleanup and some restructuring.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Removed unnecessary masking. For the 64-bit macros, removed
the structure overlay. Fixes aliasing warnings seen with gcc 4+
compilers.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Mostly for acpiexec, one in the core subsystem.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Moved the module name and line number to the end of the message.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Just use the constant 20 to keep things working.
If someone is so motivated, this can be converted over to
dynamic strings. I tried and it's a lot of work.
But for now this is good enough.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: Add support for VGA load detection (pre-945).
drm/i915: Use an I2C algo to do the flip to SDVO DDC bus.
drm/i915: Determine type before initialising connector
drm/i915: Return SDVO LVDS VBT mode if no EDID modes are detected.
drm/i915: Fetch SDVO LVDS mode lines from VBT, then reserve them
i915: support 8xx desktop cursors
drm/i915: allocate large pointer arrays with vmalloc
The recording of the names at trace time is inefficient. This patch
implements the softirq event recording to only record the vector
and then use the __print_symbolic interface to print out the names.
[ Impact: faster recording of softirq events ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
This patch adds __print_symbolic which is similar to __print_flags but
works for an enumeration type instead. That is, there is only a one to one
mapping between the values and the symbols. When a match is made, then
it is printed, otherwise the hex value is outputed.
[ Impact: add interface for showing symbol names in events ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
This patch changes the output for gfp_flags from being a simple hex value
to the actual names.
gfp_flags=GFP_ATOMIC instead of gfp_flags=00000020
And even
gfp_flags=GFP_KERNEL instead of gfp_flags=000000d0
(Thanks to Frederic Weisbecker for pointing out that the first version
had a bad order of GFP masks)
[ Impact: more human readable output from tracer ]
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
It is useful to see the state of a task that is being switched out.
This patch adds the output of the state of the previous task in
the context switch event.
[ Impact: see state of switched out task in context switch ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Developers have been asking for the ability in the ftrace event tracer
to display names of bits in a flags variable.
Instead of printing out c2, it would be easier to read FOO|BAR|GOO,
assuming that FOO is bit 1, BAR is bit 6 and GOO is bit 7.
Some examples where this would be useful are the state flags in a context
switch, kmalloc flags, and even permision flags in accessing files.
[
v2 changes include:
Frederic Weisbecker's idea of using a mask instead of bits,
thus we can output GFP_KERNEL instead of GPF_WAIT|GFP_IO|GFP_FS.
Li Zefan's idea of allowing the caller of __print_flags to add their
own delimiter (or no delimiter) where we can get for file permissions
rwx instead of r|w|x.
]
[
v3 changes:
Christoph Hellwig's idea of using an array instead of va_args.
]
[ Impact: better displaying of flags in trace output ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
This breaks the dilnetpc map driver, but it could be fixed not to use
that option. We want to simplify the partition handling, and this is a
step towards that.
Remove superfluous 'index' field from private struct mtd_part too, while
we're at it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We would like to get rid of netdev->trans_start = jiffies; that about all net
drivers have to use in their start_xmit() function, and use txq->trans_start
instead.
This can be done generically in core network, as suggested by David.
Some devices, (particularly loopback) dont need trans_start update, because
they dont have transmit watchdog. We could add a new device flag, or rely
on fact that txq->tran_start can be updated is txq->xmit_lock_owner is
different than -1. Use a helper function to hide our choice.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When defining a dynamic size string, we add __str_loc_##item to the
trace entry, and it stores the location of the actual string in
entry->_str_data[]
'unsigned short' should be sufficient to store this information, thus
we save 2 bytes per dyn-size string in the ring buffer.
[ Impact: reduce memory occupied by dyn-size strings in ring buffer ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4A14EDB6.2050507@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Introduce a generic per counter interrupt throttle.
This uses the perf_counter_overflow() quick disable to throttle a specific
counter when its going too fast when a pmu->unthrottle() method is provided
which can undo the quick disable.
Power needs to implement both the quick disable and the unthrottle method.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090525153931.703093461@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove the x86 specific interrupt throttle
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090525153931.616671838@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Robert L Mathews discovered that some clients send evil TCP RST segments,
which are accepted by netfilter conntrack but discarded by the
destination. Thus the conntrack entry is destroyed but the destination
retransmits data until timeout.
The same technique, i.e. sending properly crafted RST segments, can easily
be used to bypass connlimit/connbytes based restrictions (the sample
script written by Robert can be found in the netfilter mailing list
archives).
The patch below adds a new flag and new field to struct ip_ct_tcp_state so
that checking RST segments can be made more strict and thus TCP conntrack
can catch the invalid ones: the RST segment is accepted only if its
sequence number higher than or equal to the highest ack we seen from the
other direction. (The last_ack field cannot be reused because it is used
to catch resent packets.)
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Fail fork() when we fail inheritance for some reason (-ENOMEM most likely).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090525124600.324656474@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
extra_config_len isn't used for anything, remove it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090525124600.116035832@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All drivers are already converted to new net_device_ops API
and nobody uses old API anymore.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new ID is validated by Cologne Chip.
LEDs control is also supported.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch make debug printk's KERN_DEBUG and also fix some
codestyle issues.
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
new id for HFC-8S
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
New version without emulating arch specific stuff for the other
architectures, the special IO and init functions for the 8xx
microcontroller are in a separate include file.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IMHOLD_L1 ioctl.
The feature will be disabled on closing.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added tx-fifo information for calculation of current delay to sync tx and rx
streams for echo canceler.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch was made by Titus Moldovan and provides IOCTL functions for enabling
and disabling the controller's built in watchdog. The use is optional.
Signed-off-by: Andreas Eversberg <andreas@eversberg.eu>
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
now that pctrl() no longer disables other people's counters,
remove the PMU cache code that deals with that.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090523163013.032998331@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of en/dis-abling all counters acting on a particular
task, en/dis- able all counters we created.
[ v2: fix crash on first counter enable ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090523163012.916937244@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows fnic to configure number of retries for lport and rport
separately.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Acked-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If a task did not complete normally due to a TMF, libiscsi will
now complete the task with the state ISCSI_TASK_ABRT_TMF. Drivers
like bnx2i that need to free resources if a command did not complete normally
can then check the task state. If a driver does not need to send
a special command if we have dropped the session then they can check
for ISCSI_TASK_ABRT_SESS_RECOV.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i needs to send a hardware specific cleanup command if
a command has not completed normally (iscsi/scsi response from
target), and the session is still ok (this is the case when we
send a TMF to stop the command).
At this time it will need to drop the session lock. The problem
with the current code is that fail_all_commands assumes we
will hold the lock the entire time, so it uses list_for_each_entry_safe.
If while bnx2i drops the session lock multiple cmds complete then
list_for_each_entry_safe will not handle this correctly.
This patch removes the running lists and just has us loop over
the cmds array (in later patches we will then replace that
array with a block tag map at the session level). It also fixes
up the completion path so that if the TMF code and the normal recv
path were completing the same command then they both do not try
to do release the refcount taken when the task is queued.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i needs to be able to look up mgmt task like login and nop, because
it does some processing of them on the completion path. This exports
iscsi_itt_to_task so it can look up the task.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When we create the tcp/ip connection by calling ep_connect, we currently
just go by the routing table info.
I think there are two problems with this.
1. Some drivers do not have access to a routing table. Some drivers like
qla4xxx do not even know about other ports.
2. If you have two initiator ports on the same subnet, the user may have
set things up so that session1 was supposed to be run through port1. and
session2 was supposed to be run through port2. It looks like we could
end with both sessions going through one of the ports.
Fixes for cxgb3i from Karen Xie.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
s/counter->mutex/counter->child_mutex/ and make sure its only
used to protect child_list.
The usage in __perf_counter_exit_task() doesn't appear to be
problematic since ctx->mutex also covers anything related to fd
tear-down.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090523163012.533186528@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We call perf_adjust_freq() from perf_counter_task_tick() which
is is called under the rq->lock causing lock recursion.
However, it's no longer required to be called under the
rq->lock, so remove it from under it.
Also, fix up some related comments.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090523163012.476197912@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are a few multi-touch devices that support finger tracking
well in hardware, Stantum being the prime example. By exposing the
tracking ID in the MT protocol, evdev bandwidth and cpu usage in
user space can be reduced.
This patch adds the ABS_MT_TRACKING_ID to the MT protocol.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Tested-by: Stéphane Chatty <chatty@enac.fr>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
To support devices with physical block sizes bigger than 512 bytes we
need to ensure proper alignment. This patch adds support for exposing
I/O topology characteristics as devices are stacked.
logical_block_size is the smallest unit the device can address.
physical_block_size indicates the smallest I/O the device can write
without incurring a read-modify-write penalty.
The io_min parameter is the smallest preferred I/O size reported by
the device. In many cases this is the same as the physical block
size. However, the io_min parameter can be scaled up when stacking
(RAID5 chunk size > physical block size).
The io_opt characteristic indicates the optimal I/O size reported by
the device. This is usually the stripe width for arrays.
The alignment_offset parameter indicates the number of bytes the start
of the device/partition is offset from the device's natural alignment.
Partition tools and MD/DM utilities can use this to pad their offsets
so filesystems start on proper boundaries.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
To accommodate stacking drivers that do not have an associated request
queue we're moving the limits to a separate, embedded structure.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Convert all external users of queue limits to using wrapper functions
instead of poking the request queue variables directly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Until now we have had a 1:1 mapping between storage device physical
block size and the logical block sized used when addressing the device.
With SATA 4KB drives coming out that will no longer be the case. The
sector size will be 4KB but the logical block size will remain
512-bytes. Hence we need to distinguish between the physical block size
and the logical ditto.
This patch renames hardsect_size to logical_block_size.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The patch moves some utility functions from mac80211 to cfg80211.
Because these functions are doing generic 802.11 operations so they
are not mac80211 specific. The moving allows some fullmac drivers
to be also benefit from these utility functions.
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds the PCI Device ID 0xc409 to the PCI ID table of via82cxxx.c,
as well as the 0x8409 south bridge ID.
This is required to make the IDE driver work on the VX855/VX875 integrated
chipset.
Signed-off-by: Harald Welte <HaraldWelte@viatech.com>
Cc: Joseph Chan <JosephChan@via.com.tw>
Cc: Bruce Chang <BruceChang@via.com.tw>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The WM9081 is designed to provide high power output at low distortion
levels in space-constrained portable applications.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
When monitoring a process and its descendants with a set of inherited
counters, we can often get the situation in a context switch where
both the old (outgoing) and new (incoming) process have the same set
of counters, and their values are ultimately going to be added together.
In that situation it doesn't matter which set of counters are used to
count the activity for the new process, so there is really no need to
go through the process of reading the hardware counters and updating
the old task's counters and then setting up the PMU for the new task.
This optimizes the context switch in this situation. Instead of
scheduling out the perf_counter_context for the old task and
scheduling in the new context, we simply transfer the old context
to the new task and keep using it without interruption. The new
context gets transferred to the old task. This means that both
tasks still have a valid perf_counter_context, so no special case
is introduced when the old task gets scheduled in again, either on
this CPU or another CPU.
The equivalence of contexts is detected by keeping a pointer in
each cloned context pointing to the context it was cloned from.
To cope with the situation where a context is changed by adding
or removing counters after it has been cloned, we also keep a
generation number on each context which is incremented every time
a context is changed. When a context is cloned we take a copy
of the parent's generation number, and two cloned contexts are
equivalent only if they have the same parent and the same
generation number. In order that the parent context pointer
remains valid (and is not reused), we increment the parent
context's reference count for each context cloned from it.
Since we don't have individual fds for the counters in a cloned
context, the only thing that can make two clones of a given parent
different after they have been cloned is enabling or disabling all
counters with prctl. To account for this, we keep a count of the
number of enabled counters in each context. Two contexts must have
the same number of enabled counters to be considered equivalent.
Here are some measurements of the context switch time as measured with
the lat_ctx benchmark from lmbench, comparing the times obtained with
and without this patch series:
-----Unmodified----- With this patch series
Counters: none 2 HW 4H+4S none 2 HW 4H+4S
2 processes:
Average 3.44 6.45 11.24 3.12 3.39 3.60
St dev 0.04 0.04 0.13 0.05 0.17 0.19
8 processes:
Average 6.45 8.79 14.00 5.57 6.23 7.57
St dev 1.27 1.04 0.88 1.42 1.46 1.42
32 processes:
Average 5.56 8.43 13.78 5.28 5.55 7.15
St dev 0.41 0.47 0.53 0.54 0.57 0.81
The numbers are the mean and standard deviation of 20 runs of
lat_ctx. The "none" columns are lat_ctx run directly without any
counters. The "2 HW" columns are with lat_ctx run under perfstat,
counting cycles and instructions. The "4H+4S" columns are lat_ctx run
under perfstat with 4 hardware counters and 4 software counters
(cycles, instructions, cache references, cache misses, task
clock, context switch, cpu migrations, and page faults).
[ Impact: performance optimization of counter context-switches ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18966.10666.517218.332164@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This replaces the struct perf_counter_context in the task_struct with
a pointer to a dynamically allocated perf_counter_context struct. The
main reason for doing is this is to allow us to transfer a
perf_counter_context from one task to another when we do lazy PMU
switching in a later patch.
This has a few side-benefits: the task_struct becomes a little smaller,
we save some memory because only tasks that have perf_counters attached
get a perf_counter_context allocated for them, and we can remove the
inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't
end up recompiling nearly everything whenever perf_counter.h changes.
The perf_counter_context structures are reference-counted and freed
when the last reference is dropped. A context can have references
from its task and the counters on its task. Counters can outlive the
task so it is possible that a context will be freed well after its
task has exited.
Contexts are allocated on fork if the parent had a context, or
otherwise the first time that a per-task counter is created on a task.
In the latter case, we set the context pointer in the task struct
locklessly using an atomic compare-and-exchange operation in case we
raced with some other task in creating a context for the subject task.
This also removes the task pointer from the perf_counter struct. The
task pointer was not used anywhere and would make it harder to move a
context from one task to another. Anything that needed to know which
task a counter was attached to was already using counter->ctx->task.
The __perf_counter_init_context function moves up in perf_counter.c
so that it can be called from find_get_context, and now initializes
the refcount, but is otherwise unchanged.
We were potentially calling list_del_counter twice: once from
__perf_counter_exit_task when the task exits and once from
__perf_counter_remove_from_context when the counter's fd gets closed.
This adds a check in list_del_counter so it doesn't do anything if
the counter has already been removed from the lists.
Since perf_counter_task_sched_in doesn't do anything if the task doesn't
have a context, and leaves cpuctx->task_ctx = NULL, this adds code to
__perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in
the case where the current task adds the first counter to itself and
thus creates a context for itself.
This also adds similar code to __perf_counter_enable to handle a
similar situation which can arise when the counters have been disabled
using prctl; that also leaves cpuctx->task_ctx = NULL.
[ Impact: refactor counter context management to prepare for new feature ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Should be no impact on the generated code but it helps the compiler
print clearer messages.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
This introduces genl_register_family_with_ops() that registers a genetlink
family along with operations from a table. This is used to kill copy'n'paste
occurrences in following patches.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch to add the ability to detect drops in hardware interfaces via dropwatch.
Adds a tracepoint to net_rx_action to signal everytime a napi instance is
polled. The dropmon code then periodically checks to see if the rx_frames
counter has changed, and if so, adds a drop notification to the netlink
protocol, using the reserved all-0's vector to indicate the drop location was in
hardware, rather than somewhere in the code.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
include/linux/net_dropmon.h | 8 ++
include/trace/napi.h | 11 +++
net/core/dev.c | 5 +
net/core/drop_monitor.c | 124 ++++++++++++++++++++++++++++++++++++++++++--
net/core/net-traces.c | 4 +
net/core/netpoll.c | 2
6 files changed, 149 insertions(+), 5 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
- Add support in ima_path_check() for integrity checking without
incrementing the counts. (Required for nfsd.)
- rename and export opencount_get to ima_counts_get
- replace ima_shm_check calls with ima_counts_get
- export ima_path_check
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
The the PACKET_ADD_MEMBERSHIP and the PACKET_DROP_MEMBERSHIP setsockopt
calls for af_packet already has all of the infrastructure needed to subscribe
to multiple mac addresses. All that is missing is a flag to say that
the address we want to listen on is a unicast address.
So introduce PACKET_MR_UNICAST and wire it up to dev_unicast_add and
dev_unicast_delete.
Additionally I noticed that errors from dev_mc_add were not propagated
from packet_dev_mc so fix that.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netlink message header (struct nlmsghdr) is an unused parameter in
fill method of fib_rules_ops struct. This patch removes this
parameter from this method and fixes the places where this method is
called.
(include/net/fib_rules.h)
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The second argument of the probe method points to the amba_id
structure, so it's better passed with the correct type. None of the
current in-tree drivers uses the pointer, so they have only been
checked for a clean compile.
Change suggested by Russell King.
Signed-off-by: Alessandro Rubini <rubini@unipv.it>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Moving information from config_interface to bss_info_changed
removed struct ieee80211_if_conf which the documentation still
refers to, additionally there's one kernel-doc description too
much and one other missing, fix all this.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some applications using wireless extensions expect to be able to
remove a key that doesn't exist. One example is wpa_supplicant
which doesn't actually change behaviour when running into an
error while trying to do that, but it prints an error message
which users interpret as wpa_supplicant having problems.
The safe thing to do is not change the behaviour of wireless
extensions any more, so when the driver reports -ENOENT let
the wext bridge code return success to userspace. To guarantee
this, also document that drivers should return -ENOENT when the
key doesn't exist.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This allows drivers to use a const pointer as the privid without a cast.
Signed-off-by: David Kilroy <kilroyd@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This allows drivers to mark their cfg80211_ops tables const.
Signed-off-by: David Kilroy <kilroyd@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is more consistent with our nl80211 naming convention
for HT40-/+.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We are not correctly listening to the regulatory max bandwidth
settings. To actually make use of it we need to redesign things
a bit. This patch does the work for that. We do this to so we
can obey to regulatory rules accordingly for use of HT40.
We end up dealing with HT40 by having two passes for each channel.
The first check will see if a 20 MHz channel fits into the channel's
center freq on a given frequency range. We check for a 20 MHz
banwidth channel as that is the maximum an individual channel
will use, at least for now. The first pass will go ahead and
check if the regulatory rule for that given center of frequency
allows 40 MHz bandwidths and we use this to determine whether
or not the channel supports HT40 or not. So to support HT40 you'll
need at a regulatory rule that allows you to use 40 MHz channels
but you're channel must also be enabled and support 20 MHz by itself.
The second pass is done after we do the regulatory checks over
an device's supported channel list. On each channel we'll check
if the control channel and the extension both:
o exist
o are enabled
o regulatory allows 40 MHz bandwidth on its frequency range
This work allows allows us to idependently check for HT40- and
HT40+.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Added constants to hid.h for all digitizer usages (including the new multitouch
ones that are not yet in the official USB spec but are being pushed by Microsft
as described in their paper "Digitizer Drivers for Windows Touch and Pen-Based
Computers"). Updated hid-debug.c to support the new MT input constants such as
ABS_MT_POSITION_X.
Signed-off-by: Stephane Chatty <chatty@enac.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
For the dynamic irq_period code, log whenever we change the period so that
analyzing code can normalize the event flow.
[ Impact: add new feature to allow more precise profiling ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090520102553.298769743@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of disabling RR scheduling of the counters, use a different list
that does not get rotated to iterate the counters on inheritance.
[ Impact: cleanup, optimization ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <20090520102553.237504544@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: this branch was on an pre -rc1 base, merge it up to -rc6+
to get the latest upstream fixes.
Conflicts:
kernel/futex.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
be sent periodically. The rs_delay can be speficied when adding the
PRL entry and defaults to 15 minutes.
The RS is sent from every link local adress that's assigned to the
tunnel interface. It's directed to the (guessed) linklocal address
of the router and is sent through the tunnel.
Better: send to ff02::2 encapsuled in unicast directed to router-v4.
Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Context rotation should not occur when we are in the middle of
walking the counter list when inheriting counters ...
[ Impact: fix occasionally incorrect perf stat results ]
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For awhile now, many of the GEM code paths have allocated page or
object arrays with the slab allocator. This is nice and fast, but
won't work well if memory is fragmented, since the slab allocator works
with physically contiguous memory (i.e. order > 2 allocations are
likely to fail fairly early after booting and doing some work).
This patch works around the issue by falling back to vmalloc for
>PAGE_SIZE allocations. This is ugly, but much less work than chaining
a bunch of pages together by hand (suprisingly there's not a bunch of
generic kernel helpers for this yet afaik). vmalloc space is somewhat
precious on 32 bit kernels, but our allocations shouldn't be big enough
to cause problems, though they're routinely more than a page.
Note that this patch doesn't address the unchecked
alloc-based-on-ioctl-args in GEM; that needs to be fixed in a separate
patch.
Also, I've deliberately ignored the DRM's "area" junk. I don't think
anyone actually uses it anymore and I'm hoping it gets ripped out soon.
[Updated: removed size arg to new free function. We could unify the
free functions as well once the DRM mem tracking is ripped out.]
fd.o bug #20152 (part 1/3)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
OSD was the last in-tree user of blk_rq_append_bio(). Now
that it is fixed blk_rq_append_bio is un-exported and
is only used internally by block layer.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
New block API:
given a struct bio allocates a new request. This is the parallel of
generic_make_request for BLOCK_PC commands users.
The passed bio may be a chained-bio. The bio is bounced if needed
inside the call to this member.
This is in the effort of un-exporting blk_rq_append_bio().
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Properly document the variable-size structure tricks we are doing
wrt. struct sched_group and sched_domain, and use the field[0] GCC
extension instead of defining a vla array.
Dont use unions for this, as pointed out by Linus.
[ Impact: cleanup, un-confuse Sparse and LLVM ]
Reported-by: Jeff Garzik <jeff@garzik.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.01.0905180850110.3301@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The P2020 is a dual e500v2 core based SOC with:
* 3 PCIe controllers
* 2 General purpose DMA controllers
* 2 sRIO controllers
* 3 eTSECS
* USB 2.0
* SDHC
* SPI, I2C, DUART
* enhanced localbus
* and optional Security (P2020E) security w/XOR acceleration
The p2020 DS reference board is pretty similar to the existing MPC85xx
DS boards and has a ULI 1575 connected on one of the PCIe controllers.
Signed-off-by: Ted Peters <Ted.Peters@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This patch adds PCI IDs for MPC8569 and MPC8569E processors,
plus adds appropriate quirks for these IDs, and thus makes
PCI-E actually work on MPC8569E-MDS boards.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls
dev_kree_skb() long after original call to dev_queue_xmit(skb).
CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.
It seems right place to release dst is in dev_hard_start_xmit(), for most
devices but ones that are virtual, and some exceptions.
David Miller suggested to define a new device flag, set in alloc_netdev_mq()
(so that most devices set it at init time), and carefuly unset in devices
which dont want a NULL skb->dst in their ndo_start_xmit().
List of devices that must clear this flag is :
- loopback device, because it calls netif_rx() and quoting Patrick :
"ip_route_input() doesn't accept loopback addresses, so loopback packets
already need to have a dst_entry attached."
- appletalk/ipddp.c : needs skb->dst in its xmit function
- And all devices that call again dev_queue_xmit() from their xmit function
(as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when we have a signal pending we have the functionality
to restart that the current system call. There are other cases
such as nasty lock ordering issues where it makes sense to have
a simple fix that uses try lock and restarts the system call.
Buying time to figure out how to rework the locking strategy.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New packet socket feature that makes packet socket more efficient for
transmission.
- It reduces number of system call through a PACKET_TX_RING mechanism,
based on PACKET_RX_RING (Circular buffer allocated in kernel space
which is mmapped from user space).
- It minimizes CPU copy using fragmented SKB (almost zero copy).
Signed-off-by: Johann Baudy <johann.baudy@gnu-log.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver adds support for the SJA1000 chips connected to the
"platform bus", which can be found on various embedded systems.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CAN network device driver interface provides a generic interface to
setup, configure and monitor CAN network devices. It exports a set of
common data structures and functions, which all real CAN network device
drivers should use. Please have a look to the SJA1000 or MSCAN driver
to understand how to use them. The name of the module is can-dev.ko.
Furthermore, it adds a Netlink interface allowing to configure the CAN
device using the program "ip" from the iproute2 utility suite.
For further information please check "Documentation/networking/can.txt"
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The selinuxfs superblock magic is used inside the IMA code, but is being
defined in two places and could someday get out of sync. This patch moves the
declaration into magic.h so it is only done once.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
offsetof(struct net_device, features)=0x44
offsetof(struct net_device, stats.tx_packets)=0x54
offsetof(struct net_device, stats.tx_bytes)=0x5c
offsetof(struct net_device, stats.tx_dropped)=0x6c
Network drivers that touch dev->stats.tx_packets/stats.tx_bytes in their
tx path can slow down SMP operations, since they dirty a cache line
that should stay shared (dev->features is needed in rx and tx paths)
We could move away stats field in net_device but it wont help that much.
(Two cache lines dirtied in tx path, we can do one only)
Better solution is to add tx_packets/tx_bytes/tx_dropped in struct
netdev_queue because this structure is already touched in tx path and
counters updates will then be free (no increase in size)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
`local_add_unless(x, y, z)' will be expanded to `(&(x)->y, (y), (x))', but
`&(x)->y' should be `&(x)->a'
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rather than managing the bias level of the system based on if there is
an active audio stream manage it based on there being an active DAPM
widget. This simplifies the code a little, moving the power handling
into one place, and improves audio performance for bypass paths when no
playbacks or captures are active.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
DAPM has always applied any changes to the power state of widgets as soon
as it has determined that they are required. Instead of doing this store
all the changes that are required on lists of widgets to power up and
down, then iterate over those lists and apply the changes. This changes
the sequence in which changes are implemented, doing all power downs
before power ups and always using the up/down sequences (previously they
were only used when changes were due to DAC/ADC power events). The error
handling is also changed so that we continue attempting to power widgets
if some changes fail.
The main benefit of this is to allow future changes to do optimisations
over the whole power sequence and to reduce the number of walks of the
widget graph required to check the power status of widgets.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Enable the device IOTLB (i.e. ATS) for both the bare metal and KVM
environments.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Support device IOTLB invalidation to flush the translation cached
in the Endpoint.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Parse the Root Port ATS Capability Reporting Structure in the DMA
Remapping Reporting Structure ACPI table.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Add support for SG_IO passthru to virtio_blk. We add the scsi command
block after the normal outhdr, and the scsi inhdr with full status
information aswell as the sense buffer before the regular inhdr.
[hch: forward ported, added the VIRTIO_BLK_F_SCSI flags, some comments
and tested the whole beast]
[axboe: updated to use ->resid and not dual-path the byte count]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ checkpatch.pl tweak)
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The PCIe ATS capability makes the Endpoint be able to request the
DMA address translation from the IOMMU and cache the translation
in the device side, thus alleviate IOMMU pressure and improve the
hardware performance in the I/O virtualization environment.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
pfn_valid() is meant to be able to tell if a given PFN has valid memmap
associated with it or not. In FLATMEM, it is expected that holes always
have valid memmap as long as there is valid PFNs either side of the hole.
In SPARSEMEM, it is assumed that a valid section has a memmap for the
entire section.
However, ARM and maybe other embedded architectures in the future free
memmap backing holes to save memory on the assumption the memmap is never
used. The page_zone linkages are then broken even though pfn_valid()
returns true. A walker of the full memmap must then do this additional
check to ensure the memmap they are looking at is sane by making sure the
zone and PFN linkages are still valid. This is expensive, but walkers of
the full memmap are extremely rare.
This was caught before for FLATMEM and hacked around but it hits again for
SPARSEMEM because the page_zone linkages can look ok where the PFN linkages
are totally screwed. This looks like a hatchet job but the reality is that
any clean solution would end up consumning all the memory saved by punching
these unexpected holes in the memmap. For example, we tried marking the
memmap within the section invalid but the section size exceeds the size of
the hole in most cases so pfn_valid() starts returning false where valid
memmap exists. Shrinking the size of the section would increase memory
consumption offsetting the gains.
This patch identifies when an architecture is punching unexpected holes
in the memmap that the memory model cannot automatically detect and sets
ARCH_HAS_HOLES_MEMORYMODEL. At the moment, this is restricted to EP93xx
which is the model sub-architecture this has been reported on but may expand
later. When set, walkers of the full memmap must call memmap_valid_within()
for each PFN and passing in what it expects the page and zone to be for
that PFN. If it finds the linkages to be broken, it assumes the memmap is
invalid for that PFN.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
after:
| commit b263295dbf
| Author: Christoph Lameter <clameter@sgi.com>
| Date: Wed Jan 30 13:30:47 2008 +0100
|
| x86: 64-bit, make sparsemem vmemmap the only memory model
we don't have MEMORY_HOTPLUG_RESERVE anymore.
Historically, x86-64 had an architecture-specific method for memory hotplug
whereby it scanned the SRAT for physical memory ranges that could be
potentially used for memory hot-add later. By reserving those ranges
without physical memory, the memmap would be allocated and left dormant
until needed. This depended on the DISCONTIG memory model which has been
removed so the code implementing HOTPLUG_RESERVE is now dead.
This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE.
(Changelog authored by Mel.)
v2: updated changelog, and remove hotadd= in doc
[ Impact: remove dead code ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Workflow-found-OK-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A0C4910.7090508@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If we can find a type NETDEV_HW_ADDR_T_SAN mac address from the
corresponding netdev for a fcoe interface then sets up added the
fc->ctlr.spma flag and stores spma mode address in ctl_src_addr.
In case the spma flag is set then:-
1. Adds spma mode MAC address in ctl_src_addr as secondary
MAC address, the FLOGI for FIP and pre-FIP will go out
using this address.
2. Cleans up stored spma MAC address in ctl_src_addr in
fcoe_netdev_cleanup.
3. Sets up spma bit in fip_flags for FIP solicitations along
with exiting FPMA bit setting.
4. Initialize the FLOGI FIP MAC descriptor to stored spma
MAC address in ctl_src_addr. This is used as proposed
FCoE MAC address from initiator along with both SPMA
and FPMA bit set in FIP solicitation, in response the
switch may grant any FPMA or SPMA mode MAC address to
initiator.
Removes FIP descriptor type checking against ELS type
ELS_FLOGI in fcoe_ctlr_encaps to update a FIP MAC descriptor,
instead now checks against FIP_DT_FLOGI.
I've tested this with available FPMA-only FCoE switch but
since data_src_addr is updated using same old code for
both FPMA and SPMA modes with FIP or pre-FIP links, so added
SPMA mode will work with SPMA-only switch also provided that
switch grants a valid MAC address.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These registers were originally defined for XENPAK modules, but are
also implemented by many other 10G PHYs.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These do not have an in-kernel user but may be useful to user-space.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct net_device trans_start field is a hot spot on SMP and high performance
devices, particularly multi queues ones, because every transmitter dirties
it. Is main use is tx watchdog and bonding alive checks.
But as most devices dont use NETIF_F_LLTX, we have to lock
a netdev_queue before calling their ndo_start_xmit(). So it makes
sense to move trans_start from net_device to netdev_queue. Its update
will occur on a already present (and in exclusive state) cache line, for
free.
We can do this transition smoothly. An old driver continue to
update dev->trans_start, while an updated one updates txq->trans_start.
Further patches could also put tx_bytes/tx_packets counters in
netdev_queue to avoid dirtying dev->stats (vlan device comes to mind)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds CONFIG_REISERFS_FS_XATTR protection from reiserfs_permission.
This is needed to avoid warnings during file deletions and chowns with
xattrs disabled.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove hw_regs_t typedef and rename struct hw_regs_s to struct ide_hw.
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Pass number of ports to ide_host_{alloc,add}() and then update
all users accordingly.
v2:
- drop no longer needed NULL initializers in buddha.c, cmd640.c and gayle.c
(noticed by Sergei)
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Convert host drivers that still use hw_regs_t's chipset field to use
the one in struct ide_port_info instead.
* Move special handling of ide_pci chipset type from ide_hw_configure()
to ide_init_port().
* Remove chipset field from hw_regs_t.
While at it:
- remove stale comment in delkin_cb.c
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Replace:
- special_t typedef by IDE_SFLAG_* flags
- 'special_t special' ide_drive_t's field by 'u8 special_flags' one
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
lm8323 is the keypad driver used in n810 device.
[akpm@linux-foundation.org: coding-style fixes]
[dtor@mail.ru: various cleanups]
Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Reviewed-by: Trilok Soni <soni.trilok@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: Add new GET_PIPE_FROM_CRTC_ID ioctl.
drm/i915: Set HDMI hot plug interrupt enable for only the output in question.
drm/i915: Include 965GME pci ID in IS_I965GM(dev) to match UMS.
drm/i915: Use the GM45 VGA hotplug workaround on G45 as well.
drm/i915: ignore LVDS on intel graphics systems that lie about having it
drm/i915: sanity check IER at wait_request time
drm/i915: workaround IGD i2c bus issue in kernel side (v2)
drm/i915: Don't allow binding objects into the last page of the aperture.
drm/i915: save/restore fence registers across suspend/resume
drm/i915: x86 always has writeq. Add I915_READ64 for symmetry.
This patch provides new heuristics for parsing both the form factor and
media rotation rate ATA IDENFITY words.
The reported ATA version must be 7 or greater and the device must return
values defined as valid in the standard. Only then are the
characteristics reported to SCSI via the VPD B1 page.
This seems like a reasonable compromise to me considering that we have
been shipping several kernel releases that key off the rotation rate bit
without any version checking whatsoever. With no complaints so far.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Andrew Vasquez wrote:
> fc-transport: Close state transition-window during rport deletion.
>
> After an rport's state has transitioned to FC_PORTSTATE_BLOCKED,
> but, prior to making the upcall to 'block' the scsi-target
> associated with an rport, queued commands can recycle and
> ultimately run out of retries causing failures to propagate to
> upper-level drivers. Close this transition-window by returning
> the non-'retries' modifying DID_IMM_RETRY status for submitted
> I/Os.
The same can happen for iscsi when transitioning from logged in
to failed and blocking the sdevs.
This patch converts iscsi and fc's transitions back to use DID_IMM_RETRY
instead of DID_TRANSPORT_DISRUPTED which has a limited number of retries
that we do not want to use for handling this race.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
[Addition of iscsi and fc port online devloss case conversion by Mike Christie]
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Revert "mm: add /proc controls for pdflush threads"
viocd: needs to depend on BLOCK
block: fix the bio_vec array index out-of-bounds test
At present the values we put in overflow events for the misc
flags indicating processor mode and the instruction pointer are
obtained using the standard user_mode() and
instruction_pointer() functions. Those functions tell you where
the performance monitor interrupt was taken, which might not be
exactly where the counter overflow occurred, for example
because interrupts were disabled at the point where the
overflow occurred, or because the processor had many
instructions in flight and chose to complete some more
instructions beyond the one that caused the counter overflow.
Some architectures (e.g. powerpc) can supply more precise
information about where the counter overflow occurred and the
processor mode at that point. This introduces new functions,
perf_misc_flags() and perf_instruction_pointer(), which arch
code can override to provide more precise information if
available. They have default implementations which are
identical to the existing code.
This also adds a new misc flag value,
PERF_EVENT_MISC_HYPERVISOR, for the case where a counter
overflow occurred in the hypervisor. We encode the processor
mode in the 2 bits previously used to indicate user or kernel
mode; the values for user and kernel mode are unchanged and
hypervisor mode is indicated by both bits being set.
[ Impact: generalize perfcounter core facilities ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18956.1272.818511.561835@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
avenrun is an rough estimate so we don't have to worry about
consistency of the three avenrun values. Remove the xtime lock
dependency and provide a function to scale the values. Cleanup the
users.
[ Impact: cleanup ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Dimitri Sivanich noticed that xtime_lock is held write locked across
calc_load() which iterates over all online CPUs. That can cause long
latencies for xtime_lock readers on large SMP systems.
The load average calculation is an rough estimate anyway so there is
no real need to protect the readers vs. the update. It's not a problem
when the avenrun array is updated while a reader copies the values.
Instead of iterating over all online CPUs let the scheduler_tick code
update the number of active tasks shortly before the avenrun update
happens. The avenrun update itself is handled by the CPU which calls
do_timer().
[ Impact: reduce xtime_lock write locked section ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Instead of specifying the irq_period for a counter, provide a target interrupt
frequency and dynamically adapt the irq_period to match this frequency.
[ Impact: new perf-counter attribute/feature ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090515132018.646195868@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of a per-process mlock gift for perf-counters, use a
per-user gift so that there is less of a DoS potential.
[ Impact: allow less worst-case unprivileged memory consumption ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090515132018.496182835@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This reverts commit fafd688e4c.
Work is progressing to switch away from pdflush as the process backing
for flushing out dirty data. So it seems pointless to add more knobs
to control pdflush threads. The original author of the patch did not
have any specific use cases for adding the knobs, so we can easily
revert this before 2.6.30 to avoid having to maintain this API
forever.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The current disable/enable mechanism is:
token = hw_perf_save_disable();
...
/* do bits */
...
hw_perf_restore(token);
This works well, provided that the use nests properly. Except we don't.
x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore
provide a reference counter disable/enable interface, where the first disable
disables the hardware, and the last enable enables the hardware again.
[ Impact: refactor, simplify the PMU disable/enable logic ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add ide_check_ireason() function that handles all ATAPI devices.
Reorganize all unlikely cases in ireason checking further down in the
code path.
In addition, add PFX for printks originating from ide-atapi. Finally,
remove ide_cd_check_ireason.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Now after all users of pc->buf have been converted, remove the 64B buffer
embedded in each packet command.
There should be no functional change resulting from this patch.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
This is in preparation of removing ide_atapi_pc. Expose the buffer as an
argument to ide_queue_pc_tail with later replacing it with local buffer
or even kmalloc'ed one if needed due to stack usage constraints.
Also, add the possibility of passing a NULL-ptr buffer for cmds which
don't transfer data besides the cdb. While at it, switch to local buffer
in idetape_get_mode_sense_results().
There should be no functional change resulting from this patch.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
This is in preparation for removing ide_atapi_pc.
There should be no functional change resulting from this patch.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Now that we have rq->resid_len, use it to account partial completion
amount during the lifetime of an rq, decrementing it on each successful
transfer. As a result, get rid of now unused pc->xferred.
While at it, remove noisy debug call in ide_prep_sense.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
This allows userlevel code to discover the pipe number corresponding
to a given CRTC ID. This is necessary for doing pipe-specific
operations such as waiting for vblank on a given CRTC. Failure to use
the right pipe mapping can result in GPU hangs, or at least failure
to actually sync to vblank.
Signed-off-by: Carl Worth <cworth@cworth.org>
[anholt: Style touchups from review]
Signed-off-by: Eric Anholt <eric@anholt.net>
When setting a key with NL80211_CMD_NEW_KEY, we should allow the key
sequence number (RSC) to be set in order to allow replay protection to
work correctly for group keys. This patch documents this use for
nl80211 and adds the couple of missing pieces in nl80211/cfg80211 and
mac80211 to support this. In addition, WEXT SIOCSIWENCODEEXT compat
processing in cfg80211 is extended to handle the RSC (this was already
specified in WEXT, but just not implemented in cfg80211/mac80211).
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add a new NL80211_ATTR_CONTROL_PORT flag for NL80211_CMD_ASSOCIATE to
allow user space to indicate that it will control the IEEE 802.1X port
in station mode. Previously, mac80211 was always marking the port
authorized in station mode. This was enough when drop_unencrypted flag
was set. However, drop_unencrypted can currently be controlled only
with WEXT and the current nl80211 design does not allow fully secure
configuration. Fix this by providing a mechanism for user space to
control the IEEE 802.1X port in station mode (i.e., do the same that
we are already doing in AP mode).
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It is currently not possible to modify station flags, but that
capability would be very useful. This patch introduces a new
nl80211 attribute that contains a set/mask for station flags,
and updates the internal API (and mac80211) to mirror that.
The new attribute is parsed before falling back to the old so
that userspace can specify both (if it can) to work on all
kernels.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Move key handling wireless extension ioctls from mac80211 to cfg80211
so that all drivers that implement the cfg80211 operations get wext
compatibility.
Note that this drops the SIOCGIWENCODE ioctl support for getting
IW_ENCODE_RESTRICTED/IW_ENCODE_OPEN. This means that iwconfig will
no longer report "Security mode:open" or "Security mode:restricted"
for mac80211. However, what we displayed there (the authentication
algo used) was actually wrong -- linux/wireless.h states that this
setting is meant to differentiate between "Refuse non-encoded packets"
and "Accept non-encoded packets".
(Combined with "cfg80211: fix a couple of bugs with key ioctls". -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-04-16 12:11:36]:
This patch migrates all non pinned timers and hrtimers to the current
idle load balancer, from all the idle CPUs. Timers firing on busy CPUs
are not migrated.
While migrating hrtimers, care should be taken to check if migrating
a hrtimer would result in a latency or not. So we compare the expiry of the
hrtimer with the next timer interrupt on the target cpu and migrate the
hrtimer only if it expires *after* the next interrupt on the target cpu.
So, added a clockevents_get_next_event() helper function to return the
next_event on the target cpu's clock_event_device.
[ tglx: cleanups and simplifications ]
Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-04-16 12:11:36]:
This patch creates the /proc/sys sysctl interface at
/proc/sys/kernel/timer_migration
Timer migration is enabled by default.
To disable timer migration, when CONFIG_SCHED_DEBUG = y,
echo 0 > /proc/sys/kernel/timer_migration
Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* Arun R Bharadwaj <arun@linux.vnet.ibm.com> [2009-04-16 12:11:36]:
This patch creates a new framework for identifying cpu-pinned timers
and hrtimers.
This framework is needed because pinned timers are expected to fire on
the same CPU on which they are queued. So it is essential to identify
these and not migrate them, in case there are any.
For regular timers, the currently existing add_timer_on() can be used
queue pinned timers and subsequently mod_timer_pinned() can be used
to modify the 'expires' field.
For hrtimers, new modes HRTIMER_ABS_PINNED and HRTIMER_REL_PINNED are
added to queue cpu-pinned hrtimer.
[ tglx: use .._PINNED mode argument instead of creating tons of new
functions ]
Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dma: fix ipu_idmac.c to not discard the last queued buffer
ioatdma: fix "ioatdma frees DMA memory with wrong function"
ipu_idmac: Use disable_irq_nosync() from within irq handlers.
dmatest: fix max channels handling
as reported by Alexander Beregalov <a.beregalov@gmail.com>
ioatdma 0000:00:08.0: DMA-API: device driver frees DMA memory with
wrong function [device address=0x000000007f76f800] [size=2000 bytes]
[map
ped as single] [unmapped as page]
The ioatdma driver was unmapping all regions
(either allocated as page or single) using unmap_page.
This patch lets dma driver recognize if unmap_single or unmap_page should be used.
It introduces two new dma control flags:
DMA_COMPL_SRC_UNMAP_SINGLE and DMA_COMPL_DEST_UNMAP_SINGLE.
They should be set to indicate dma driver to do dma-unmapping as single
(first one for the source, tha latter for the destination).
If respective flag is not set, the driver assumes dma-unmapping as page.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In order to build the generic syscall table, we need a declaration for
every system call. sys_pipe2 was added without a proper declaration, so
add this to syscalls.h now.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge reason: both topics modify the APIC code but were able to do it in
parallel so far. An upcoming patch generates a conflict so
merge them to avoid the conflict.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Splice is tied to pipes by design, it'll not change. And now that
the splice stuff is in splice.h (and note pipe.h), the rest of the comment
is out-of-date as well.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Bernhard Schiffner noticed I had a few comment typos in this patch,
(note: to save embarrassment, when making typos, avoid copying and
pasting them) so this patch corrects them.
[ Impact: cleanup ]
Reported-by: Bernhard Schiffner <bernhard@schiffner-limbach.de>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: riel@redhat.com
Cc: akpm@linux-foundation.org
LKML-Reference: <1242090794.7214.131.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
IEEE 802.11w/D9.0 introduces a mechanism for Action field Category
values to be used to select which Action frames are Robust. Public and
Vendor-specific categories are marked as not Robust in IEEE 802.11w;
HT will be marked not Robust in IEEE 802.11n. A new Vendor-specific
Protected category is allocated for Robust vendor-specific Action
frames. Another new category, Protected Dual of Action, is introduced
for protecting some existing Public Action frames (e.g., IEEE 802.11y
protected enablement).
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
To make it more apparent in the code what is for wext
only (and needs to be #ifdef'ed) put all the info for
wext into a substruct in each wireless_dev.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The address pointed to by mac_addr can be marked as const.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This struct is no longer used (and hasn't been for a while).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There really is no need to have a separate struct for a
single variable. The fact that it exists is due to the
code legacy, but we can remove that now. Very simple.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
NL80211_CMD_ASSOCIATE request must be able to indicate whether
management frame protection (IEEE 802.11w) is being used. mac80211 was
able to use MFP in client mode only with WEXT, but the new
NL80211_ATTR_USE_MFP attribute will allow this to be done with
nl80211, too.
Since we are currently using nl80211 for MFP only with drivers that
use user space SME, only MFP disabled and required values are
used. However, the NL80211_ATTR_USE_MFP attribute is an enum that can
be extended with MFP optional in the future, if that is needed with
some drivers (e.g., if the RSN IE is generated by the driver).
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If f_op->splice_read() is not implemented, fall back to a plain read.
Use vfs_readv() to read into previously allocated pages.
This will allow splice and functions using splice, such as the loop
device, to work on all filesystems. This includes "direct_io" files
in fuse which bypass the page cache.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The last argument of block_remap prober is the original sector
before remap, so it should be 'from', not 'to'.
[ Impact: clean up ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: "Alan D. Brunelle" <Alan.Brunelle@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
LKML-Reference: <4A07CE86.5090301@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Let's put the completion related functions back to block/blk-core.c
where they have lived. We can also unexport blk_end_bidi_request() and
__blk_end_bidi_request(), which nobody uses.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
blk_end_request_all() and __blk_end_request_all() should finish all
bytes including bidi, by definition. That's what all bidi users need ,
bidi requests must be complete as a whole (partial completion is
impossible).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Till now block layer allowed two separate modes of request execution.
A request is always acquired from the request queue via
elv_next_request(). After that, drivers are free to either dequeue it
or process it without dequeueing. Dequeue allows elv_next_request()
to return the next request so that multiple requests can be in flight.
Executing requests without dequeueing has its merits mostly in
allowing drivers for simpler devices which can't do sg to deal with
segments only without considering request boundary. However, the
benefit this brings is dubious and declining while the cost of the API
ambiguity is increasing. Segment based drivers are usually for very
old or limited devices and as converting to dequeueing model isn't
difficult, it doesn't justify the API overhead it puts on block layer
and its more modern users.
Previous patches converted all block low level drivers to dequeueing
model. This patch completes the API transition by...
* renaming elv_next_request() to blk_peek_request()
* renaming blkdev_dequeue_request() to blk_start_request()
* adding blk_fetch_request() which is combination of peek and start
* disallowing completion of queued (not started) requests
* applying new API to all LLDs
Renamings are for consistency and to break out of tree code so that
it's apparent that out of tree drivers need updating.
[ Impact: block request issue API cleanup, no functional change ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: unsik Kim <donari75@gmail.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Laurent Vivier <Laurent@lvivier.info>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Stefan Weinhuber <wein@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Block low level drivers for some reason have been pretty good at
abusing block layer API. Especially struct request's fields tend to
get violated in all possible ways. Make it clear that low level
drivers MUST NOT access or manipulate rq->sector and rq->data_len
directly by prefixing them with double underscores.
This change is also necessary to break build of out-of-tree codes
which assume the previous block API where internal fields can be
manipulated and rq->data_len carries residual count on completion.
[ Impact: hide internal fields, block API change ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
struct request has had a few different ways to represent some
properties of a request. ->hard_* represent block layer's view of the
request progress (completion cursor) and the ones without the prefix
are supposed to represent the issue cursor and allowed to be updated
as necessary by the low level drivers. The thing is that as block
layer supports partial completion, the two cursors really aren't
necessary and only cause confusion. In addition, manual management of
request detail from low level drivers is cumbersome and error-prone at
the very least.
Another interesting duplicate fields are rq->[hard_]nr_sectors and
rq->{hard_cur|current}_nr_sectors against rq->data_len and
rq->bio->bi_size. This is more convoluted than the hard_ case.
rq->[hard_]nr_sectors are initialized for requests with bio but
blk_rq_bytes() uses it only for !pc requests. rq->data_len is
initialized for all request but blk_rq_bytes() uses it only for pc
requests. This causes good amount of confusion throughout block layer
and its drivers and determining the request length has been a bit of
black magic which may or may not work depending on circumstances and
what the specific LLD is actually doing.
rq->{hard_cur|current}_nr_sectors represent the number of sectors in
the contiguous data area at the front. This is mainly used by drivers
which transfers data by walking request segment-by-segment. This
value always equals rq->bio->bi_size >> 9. However, data length for
pc requests may not be multiple of 512 bytes and using this field
becomes a bit confusing.
In general, having multiple fields to represent the same property
leads only to confusion and subtle bugs. With recent block low level
driver cleanups, no driver is accessing or manipulating these
duplicate fields directly. Drop all the duplicates. Now rq->sector
means the current sector, rq->data_len the current total length and
rq->bio->bi_size the current segment length. Everything else is
defined in terms of these three and available only through accessors.
* blk_recalc_rq_sectors() is collapsed into blk_update_request() and
now handles pc and fs requests equally other than rq->sector update.
This means that now pc requests can use partial completion too (no
in-kernel user yet tho).
* bio_cur_sectors() is replaced with bio_cur_bytes() as block layer
now uses byte count as the primary data length.
* blk_rq_pos() is now guranteed to be always correct. In-block users
converted.
* blk_rq_bytes() is now guaranteed to be always valid as is
blk_rq_sectors(). In-block users converted.
* blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9.
More convenient one is used.
* blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const
pointer to request.
[ Impact: API cleanup, single way to represent one property of a request ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
With recent cleanups, there is no place where low level driver
directly manipulates request fields. This means that the 'hard'
request fields always equal the !hard fields. Convert all
rq->sectors, nr_sectors and current_nr_sectors references to
accessors.
While at it, drop superflous blk_rq_pos() < 0 test in swim.c.
[ Impact: use pos and nr_sectors accessors ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Tested-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Acked-by: Mike Miller <mike.miller@hp.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Dario Ballabio <ballabio_dario@emc.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: unsik Kim <donari75@gmail.com>
Cc: Laurent Vivier <Laurent@lvivier.info>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Implement accessors - blk_rq_pos(), blk_rq_sectors() and
blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors
and rq->hard_cur_sectors respectively and convert direct references of
the said fields to the accessors.
This is in preparation of request data length handling cleanup.
Geert : suggested adding const to struct request * parameter to accessors
Sergei : spotted error in patch description
[ Impact: cleanup ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Ackec-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
rq->data_len served two purposes - the length of data buffer on issue
and the residual count on completion. This duality creates some
headaches.
First of all, block layer and low level drivers can't really determine
what rq->data_len contains while a request is executing. It could be
the total request length or it coulde be anything else one of the
lower layers is using to keep track of residual count. This
complicates things because blk_rq_bytes() and thus
[__]blk_end_request_all() relies on rq->data_len for PC commands.
Drivers which want to report residual count should first cache the
total request length, update rq->data_len and then complete the
request with the cached data length.
Secondly, it makes requests default to reporting full residual count,
ie. reporting that no data transfer occurred. The residual count is
an exception not the norm; however, the driver should clear
rq->data_len to zero to signify the normal cases while leaving it
alone means no data transfer occurred at all. This reverse default
behavior complicates code unnecessarily and renders block PC on some
drivers (ide-tape/floppy) unuseable.
This patch adds rq->resid_len which is used only for residual count.
While at it, remove now unnecessasry blk_rq_bytes() caching in
ide_pc_intr() as rq->data_len is not changed anymore.
Boaz : spotted missing conversion in osd
Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape
[ Impact: cleanup residual count handling, report 0 resid by default ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Rename cred_exec_mutex to reflect that it's a guard against foreign
intervention on a process's credential state, such as is made by ptrace(). The
attachment of a debugger to a process affects execve()'s calculation of the new
credential state - _and_ also setprocattr()'s calculation of that state.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
As we just did for context cache flushing, clean up the logic around
whether we need to flush the iotlb or just the write-buffer, depending
on caching mode.
Fix the same bug in qi_flush_iotlb() that qi_flush_context() had -- it
isn't supposed to be returning an error; it's supposed to be returning a
flag which triggers a write-buffer flush.
Remove some superfluous conditional write-buffer flushes which could
never have happened because they weren't for non-present-to-present
mapping changes anyway.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
It really doesn't make a lot of sense to have some of the logic to
handle caching vs. non-caching mode duplicated in qi_flush_context() and
__iommu_flush_context(), while the return value indicates whether the
caller should take other action which depends on the same thing.
Especially since qi_flush_context() thought it was returning something
entirely different anyway.
This patch makes qi_flush_context() and __iommu_flush_context() both
return void, removes the 'non_present_entry_flush' argument and makes
the only call site which _set_ that argument to 1 do the right thing.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
Revert driver core: move platform_data into platform_device
Revert driver core: fix passing platform_data
Remove old PRINTK_DEBUG config item
Doc/sysfs-rules: Swap the order of the words so the sentence makes more sense
Driver core: platform: fix kernel-doc warnings
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (22 commits)
Fix the race between capifs remount and node creation
Fix races around the access to ->s_options
switch ufs directories to ufs_sync_file()
Switch open_exec() and sys_uselib() to do_open_filp()
Make open_exec() and sys_uselib() use may_open(), instead of duplicating its parts
Reduce path_lookup() abuses
Make checkpatch.pl shut up on fs/inode.c
NULL noise in fs/super.c:kill_bdev_super()
romfs: cleanup romfs_fs.h
ROMFS: romfs_dev_read() error ignored
fs: dcache fix LRU ordering
ocfs2: Use nd_set_link().
Fix deadlock in ipathfs ->get_sb()
Fix a leak in failure exit in 9p ->get_sb()
Convert obvious places to deactivate_locked_super()
New helper: deactivate_locked_super()
reiserfs: remove privroot hiding in lookup
reiserfs: dont associate security.* with xattr files
reiserfs: fixup xattr_root caching
Always lookup priv_root on reiserfs mount and keep it
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix line-in on Mac Mini Core2 Duo
ALSA: Release v1.0.20
sound: via82xx: fix DXS volume range
sound: serial-u16550: fix buffer overflow
ASoC: Fix errors in WM8990
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
bonding: fix panic if initialization fails
IXP4xx: complete Ethernet netdev setup before calling register_netdev().
IXP4xx: use "ENODEV" instead of "ENOSYS" in module initialization.
ipvs: Fix IPv4 FWMARK virtual services
ipv4: Make INET_LRO a bool instead of tristate.
net: remove stale reference to fastroute from Kconfig help text
net: update skb_recycle_check() for hardware timestamping changes
bnx2: Fix panic in bnx2_poll_work().
net-sched: fix bfifo default limit
igb: resolve panic on shutdown when SR-IOV is enabled
wimax: oops: wimax_dev_add() is the only one that can initialize the state
wimax: fix oops if netlink fails to add attribute
Bluetooth: Move dev_set_name() to a context that can sleep
netfilter: ctnetlink: fix wrong message type in user updates
netfilter: xt_cluster: fix use of cluster match with 32 nodes
netfilter: ip6t_ipv6header: fix match on packets ending with NEXTHDR_NONE
netfilter: add missing linux/types.h include to xt_LED.h
mac80211: pid, fix memory corruption
mac80211: minstrel, fix memory corruption
cfg80211: fix comment on regulatory hint processing
...
Put generic_show_options read access to s_options under rcu_read_lock,
split save_mount_options() into "we are setting it the first time"
(uses in foo_fill_super()) and "we are relacing and freeing the old one",
synchronize_rcu() before kfree() in the latter.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
There's no kernel-only content in it anymore, so move it to header-y
and remove the superflous #ifdef __KERNEL__.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Does equivalent of up_write(&s->s_umount); deactivate_super(s);
However, it does not does not unlock it until it's all over.
As the result, it's safe to use to dispose of new superblock on ->get_sb()
failure exits - nobody will see the sucker until it's all over.
Equivalent using up_write/deactivate_super is safe for that purpose
if superblock is either safe to use or has NULL ->s_root when we unlock.
Normally filesystems take the required precautions, but
a) we do have bugs in that area in some of them.
b) up_write/deactivate_super sequence is extremely common,
so the helper makes sense anyway.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
With Al Viro's patch to move privroot lookup to fs mount, there's no need
to have special code to hide the privroot in reiserfs_lookup.
I've also cleaned up the privroot hiding in reiserfs_readdir_dentry and
removed the last user of reiserfs_xattrs().
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The xattr_root caching was broken from my previous patch set. It wouldn't
cause corruption, but could cause decreased performance due to allocating
a larger chunk of the journal (~ 27 blocks) than it would actually use.
This patch loads the xattr root dentry at xattr initialization and creates
it on-demand. Since we're using the cached dentry, there's no point
in keeping lookup_or_create_dir around, so that's removed.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
... even if it's a negative dentry. That way we can set ->d_op on
root before anyone could race with us. Simplify d_compare(), while
we are at it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This reverts commit 006f4571a1:
This patch moves platform_data from struct device into
struct platform_device, based on the two ideas:
1. Now all platform_driver is registered by platform_driver_register,
which makes probe()/release()/... of platform_driver passed parameter
of platform_device *, so platform driver can get platform_data from
platform_device;
2. Other kind of devices do not need to use platform_data, we can
decrease size of device if moving it to platform_device.
Taking into consideration of thousands of files to be fixed and they
can't be finished in one night(maybe it will take a long time), so we
keep platform_data in device to allow two kind of cases coexist until
all platform devices pass its platfrom data from
platform_device->platform_data.
All patches to do this kind of conversion are welcome.
As we don't really want to do it, it was a bad idea.
Cc: David Brownell <david-b@pacbell.net>
Cc: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Other parts of the kernel may need to be able to enable or disable
specific events. Especially parts that create trace events.
[ Impact: allow enabling of trace events by those that create the event ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Allow recording the CPU number the event was generated on.
RFC: this leaves a u32 as reserved, should we fill in the
node_id() there, or leave this open for future extention,
as userspace can already easily do the cpu->node mapping
if needed.
[ Impact: extend perfcounter output record format ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090508170029.008627711@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Much like CONFIG_RECORD_GROUP records the hw_event.config to
identify the values, allow to record this for all counters.
[ Impact: extend perfcounter output record format ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090508170028.923228280@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Corey noticed that ioctl()s on grouped counters didn't work on
the whole group. This extends the ioctl() interface to take a
second argument that is interpreted as a flags field. We then
provide PERF_IOC_FLAG_GROUP to toggle the behaviour.
Having this flag gives the greatest flexibility, allowing you
to individually enable/disable/reset counters in a group, or
all together.
[ Impact: fix group counter enable/disable semantics ]
Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090508170028.837558214@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use enable/disable hooks for clock framework integration.
Make sure we control the clock for the serial console as well.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Conflicts:
arch/frv/include/asm/pgtable.h
arch/x86/include/asm/required-features.h
arch/x86/xen/mmu.c
Merge reason: x86/xen was on a .29 base still, move it to a fresher
branch and pick up Xen fixes as well, plus resolve
conflicts
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (32 commits)
[CIFS] Fix double list addition in cifs posix open code
[CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp
[CIFS] Fix SMB uid in NTLMSSP authenticate request
[CIFS] NTLMSSP reenabled after move from connect.c to sess.c
[CIFS] Remove sparse warning
[CIFS] remove checkpatch warning
[CIFS] Fix final user of old string conversion code
[CIFS] remove cifs_strfromUCS_le
[CIFS] NTLMSSP support moving into new file, old dead code removed
[CIFS] Fix endian conversion of vcnum field
[CIFS] Remove trailing whitespace
[CIFS] Remove sparse endian warnings
[CIFS] Add remaining ntlmssp flags and standardize field names
[CIFS] Fix build warning
cifs: fix length handling in cifs_get_name_from_search_buf
[CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped status
[CIFS] rename cifs_strndup to cifs_strndup_from_ucs
Added loop check when mounting DFS tree.
Enable dfs submounts to handle remote referrals.
[CIFS] Remove older session setup implementation
...
We can avoid waking up tasks not interested in receive notifications,
using wake_up_interruptible_poll() instead of wake_up_interruptible()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Small cleanup patch to reduce line lengths, before a change in
tcp_prequeue().
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge reason: this topic is ready for upstream now. It passed
Oleg's review and Andrew had no further mm/*
objections/observations either.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: tracing/core was on a .30-rc1 base and was missing out on
on a handful of tracing fixes present in .30-rc5-almost.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Otherwise list_for_each_entry_rcu() et al. aren't visible
and we get build failures in some configurations.
Signed-off-by: David S. Miller <davem@davemloft.net>
When building with gcc 3.2 I get thousands of warnings such as
include/linux/gfp.h: In function `allocflags_to_migratetype':
include/linux/gfp.h:105: warning: null format string
due to passing a NULL format string to warn_slowpath() in
#define __WARN() warn_slowpath(__FILE__, __LINE__, NULL)
Split this case out into a separate call. This also shrinks the kernel
slightly:
text data bss dec hex filename
4802274 707668 712704 6222646 5ef336 vmlinux
text data bss dec hex filename
4799027 703572 712704 6215303 5ed687 vmlinux
due to removeing one argument from the commonly-called __WARN().
[akpm@linux-foundation.org: reduce scope of `empty']
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
IEEE 802.11w/D8.0 changed the length of the SA Query transaction
identifier from 16 to 2 octets.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
wl12xx is a driver for TI wl1251 802.11 chipset designed for embedded
devices, supporting both SDIO and SPI busses. Currently the driver
supports only SPI. Adding support 1253 (the 5 GHz version) should be
relatively easy. More information here:
http://focus.ti.com/general/docs/wtbu/wtbuproductcontent.tsp?contentId=4711&navigationId=12494&templateId=6123
(Collapsed original sequence of pre-merge patches into single commit for
initial merge. -- JWL)
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we aren't doing anything in mac80211, we can turn off
much of the hardware, depending on the driver/hw. Not doing
anything, aka being idle, means:
* no monitor interfaces
* no AP/mesh/wds interfaces
* any station interfaces are in DISABLED state
* any IBSS interfaces aren't trying to be in a network
* we aren't trying to scan
By creating a new function that verifies these conditions and calling
it at strategic points where the states of those conditions change,
we can easily make mac80211 tell the driver when we are idle to save
power.
Additionally, this fixes a small quirk where a recalculated powersave
state is passed to the driver even if the hardware is about to stopped
completely.
This patch intentionally doesn't touch radio_enabled because that is
currently implemented to be a soft rfkill which is inappropriate here
when we need to be able to wake up with low latency.
One thing I'm not entirely sure about is this:
phy0: device no longer idle - in use
wlan0: direct probe to AP 00:11:24:91:07:4d try 1
wlan0 direct probe responded
wlan0: authenticate with AP 00:11:24:91:07:4d
wlan0: authenticated
> phy0: device now idle
> phy0: device no longer idle - in use
wlan0: associate with AP 00:11:24:91:07:4d
wlan0: RX AssocResp from 00:11:24:91:07:4d (capab=0x401 status=0 aid=1)
wlan0: associated
Is it appropriate to go into idle state for a short time when we have
just authenticated, but not associated yet? This happens only with the
userspace SME, because we cannot really know how long it will wait
before asking us to associate. Would going idle after a short timeout
be more appropriate? We may need to revisit this, depending on what
happens.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The config_interface method is a little strange, it contains the
BSSID and beacon updates, while bss_info_changed contains most
other BSS information for each interface. This patch removes
config_interface and rolls all the information it previously
passed to drivers into bss_info_changed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We currently have two beacon interval configuration knobs:
hw.conf.beacon_int and vif.bss_info.beacon_int. This is
rather confusing, even though the former is used when we
beacon ourselves and the latter when we are associated to
an AP.
This just deprecates the hw.conf.beacon_int setting in favour
of always using vif.bss_info.beacon_int. Since it touches all
the beaconing IBSS code anyway, we can also add support for
the cfg80211 IBSS beacon interval configuration easily.
NOTE: The hw.conf.beacon_int setting is retained for now due
to drivers still using it -- I couldn't untangle all
drivers, some are updated in this patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Kalle points out that max_sleep_interval is somewhat confusing
because the value is measured in beacon intervals, and not in
TU. Rename it to max_sleep_period to be consistent with things
like DTIM period that are also measured in beacon intervals.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This attempts to clarify names utilized during block I/O remap
operations (partition, volume manager). It correctly matches up the
/from/ information for both device & sector. This takes in the concept
from Kosaki Motohiro and extends it to include better naming for the
"device_from" field.
[ Impact: cleanup ]
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <49FF4FAE.3000301@hp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The orig_cpu parameter in trace_sched_migrate_task() is not necessary,
it can be got by using task_cpu(p) in the probe.
[ Impact: micro-optimization ]
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
[ modified from Mathieu's patch. The original patch is at:
http://marc.info/?l=linux-kernel&m=123791201716239&w=2 ]
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: zhaolei@cn.fujitsu.com
Cc: laijs@cn.fujitsu.com
LKML-Reference: <49FFFDB7.1050402@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The conversion to the ntpv4 reference model
f199239373 ("ntp: convert to the NTP4
reference model") in 2.6.19 added nanosecond resolution the adjtimex
interface, but also changed the "stiffness" of the frequency adjustments,
causing NTP convergence time to greatly increase.
SHIFT_PLL, which reduces the stiffness of the freq adjustments, was
designed to be inversely linked to HZ, and the reference value of 4 was
designed for Unix systems using HZ=100. However Linux's clock steering
code mostly independent of HZ.
So this patch reduces the SHIFT_PLL value from 4 to 2, which causes NTPd
behavior to match kernels prior to 2.6.19, greatly reducing convergence
times, and improving close synchronization through environmental thermal
changes.
The patch also changes some l's to L's in nearby code to avoid misreading
50l as 501.
[ Impact: tweak NTP algorithm for faster convergence ]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: zippel@linux-m68k.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <200905051956.n45JuVo9025575@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When unloading a module, memory allocated by init_preds() and
trace_define_field() is not freed.
[ Impact: fix memory leak ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4A00F6E0.3040503@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: we moved a mutex.h commit that originated from the
perfcounters tree into core/locking - but now merge
back that branch to solve a merge artifact and to
pick up cleanups of this commit that happened in
core/locking.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
v5 -> v6 (current):
-removed so far unused static functions
-corrected dev_addr_del_multiple to call del instead of add
v4 -> v5:
-added device address type (suggested by davem)
-removed refcounting (better to have simplier code then safe potentially few
bytes)
v3 -> v4:
-changed kzalloc to kmalloc in __hw_addr_add_ii()
-ASSERT_RTNL() avoided in dev_addr_flush() and dev_addr_init()
v2 -> v3:
-removed unnecessary rcu read locking
-moved dev_addr_flush() calling to ensure no null dereference of dev_addr
v1 -> v2:
-added forgotten ASSERT_RTNL to dev_addr_init and dev_addr_flush
-removed unnecessary rcu_read locking in dev_addr_init
-use compare_ether_addr_64bits instead of compare_ether_addr
-use L1_CACHE_BYTES as size for allocating struct netdev_hw_addr
-use call_rcu instead of rcu_synchronize
-moved is_etherdev_addr into __KERNEL__ ifdef
This patch introduces a new list in struct net_device and brings a set of
functions to handle the work with device address list. The list is a replacement
for the original dev_addr field and because in some situations there is need to
carry several device addresses with the net device. To be backward compatible,
dev_addr is made to point to the first member of the list so original drivers
sees no difference.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide a threshold to relax the mlock accounting, increasing usability.
Each counter gets perf_counter_mlock_kb for free.
[ Impact: allow more mmap buffering ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090505155437.112113632@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide a way to reset an existing counter - this eases PAPI
libraries around perfcounters.
Similar to read() it doesn't collapse pending child counters.
[ Impact: new perfcounter fd ioctl method to reset counters ]
Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090505155437.022272933@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Keep data_head up-to-date irrespective of notifications. This fixes
the case where you disable a counter and don't get a notification for
the last few pending events, and it also allows polling usage.
[ Impact: increase precision of perfcounter mmap-ed fields ]
Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090505155436.925084300@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The WARN_ON in the ring buffer when a commit is preempted and the
buffer is filled by preceding writes can happen in normal operations.
The WARN_ON makes it look like a bug, not to mention, because
it does not stop tracing and calls printk which can also recurse, this
is prone to deadlock (the WARN_ON is not in a position to recurse).
This patch removes the WARN_ON and replaces it with a counter that
can be retrieved by a tracer. This counter is called commit_overrun.
While at it, I added a nmi_dropped counter to count any time an NMI entry
is dropped because the NMI could not take the spinlock.
[ Impact: prevent deadlock by printing normal case warning ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch fixes a problem when you use 32 nodes in the cluster
match:
% iptables -I PREROUTING -t mangle -i eth0 -m cluster \
--cluster-total-nodes 32 --cluster-local-node 32 \
--cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
iptables: Invalid argument. Run `dmesg' for more information.
% dmesg | tail -1
xt_cluster: this node mask cannot be higher than the total number of nodes
The problem is related to this checking:
if (info->node_mask >= (1 << info->total_nodes)) {
printk(KERN_ERR "xt_cluster: this node mask cannot be "
"higher than the total number of nodes\n");
return false;
}
(1 << 32) is 1. Thus, the checking fails.
BTW, I said this before but I insist: I have only tested the cluster
match with 2 nodes getting ~45% extra performance in an active-active setup.
The maximum limit of 32 nodes is still completely arbitrary. I'd really
appreciate if people that have more nodes in their setups let me know.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
e1000: fix virtualization bug
bonding: fix alb mode locking regression
Bluetooth: Fix issue with sysfs handling for connections
usbnet: CDC EEM support (v5)
tcp: Fix tcp_prequeue() to get correct rto_min value
ehea: fix invalid pointer access
ne2k-pci: Do not register device until initialized.
Subject: [PATCH] br2684: restore net_dev initialization
net: Only store high 16 bits of kernel generated filter priorities
virtio_net: Fix function name typo
virtio_net: Cleanup command queue scatterlist usage
bonding: correct the cleanup in bond_create()
virtio: add missing include to virtio_net.h
smsc95xx: add support for LAN9512 and LAN9514
smsc95xx: configure LED outputs
netconsole: take care of NETDEV_UNREGISTER event
xt_socket: checks for the state of nf_conntrack
bonding: bond_slave_info_query() fix
cxgb3: fixing gcc 4.4 compiler warning: suggest parentheses around operand of ‘!’
netfilter: use likely() in xt_info_rdlock_bh()
...
Added runtime->delay field to adjust the delayed samples for snd_pcm_delay().
Typically a hardware FIFO length is stored in this field, so that the
extra delay between hwptr and applptr can be computed.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pointed out by Dave Miller:
CHECK include/linux/netfilter (57 files)
/home/davem/src/GIT/net-2.6/usr/include/linux/netfilter/xt_LED.h:6: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The DAI structure has two pointers to the codec, one in the body of the
DAI and one in a union for a parent pointer. Drop the parent pointer
version.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Due to a semantic changes in flush_workqueue() the current approach of
synchronizing the sysfs handling for connections doesn't work anymore. The
whole approach is actually fully broken and based on assumptions that are
no longer valid.
With the introduction of Simple Pairing support, the creation of low-level
ACL links got changed. This change invalidates the reason why in the past
two independent work queues have been used for adding/removing sysfs
devices. The adding of the actual sysfs device is now postponed until the
host controller successfully assigns an unique handle to that link. So
the real synchronization happens inside the controller and not the host.
The only left-over problem is that some internals of the sysfs device
handling are not initialized ahead of time. This leaves potential access
to invalid data and can cause various NULL pointer dereferences. To fix
this a new function makes sure that all sysfs details are initialized
when an connection attempt is made. The actual sysfs device is only
registered when the connection has been successfully established. To
avoid a race condition with the registration, the check if a device is
registered has been moved into the removal work.
As an extra protection two flush_work() calls are left in place to
make sure a previous add/del work has been completed first.
Based on a report by Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Roger Quadros <ext-roger.quadros@nokia.com>
Tested-by: Marc Pignat <marc.pignat@hevs.ch>
This introduces a CDC Ethernet Emulation Model (EEM) host side
driver to support USB EEM devices.
EEM is different from the Ethernet Control Model (ECM) currently
supported by the "CDC Ethernet" driver. One key difference is
that it doesn't require of USB interface alternate settings to
manage interface state; some maldesigned hardware can't handle
that part of USB. It also avoids a separate USB interface for
control and status updates.
[ dbrownell@users.sourceforge.net: fix skb leaks, add rx packet
checks, improve fault handling, EEM conformance updates, cleanup ]
Signed-off-by: Omar Laazimani <omar.oberthur@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_prequeue() refers to the constant value (TCP_RTO_MIN) regardless of
the actual value might be tuned. The following patches fix this and make
tcp_prequeue get the actual value returns from tcp_rto_min().
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
percpu scheduling for perfcounters wants to take the context lock,
but that lock first needs to be initialized. Currently it is an
early_initcall() - but that is too late, the task tick runs much
sooner than that.
Call it explicitly from the scheduler init sequence instead.
[ Impact: fix access-before-init crash ]
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All of the SH timers use a roughly identical structure for platform data,
which presently is broken out for each block. Consolidate all of these
definitions, as there is no reason for them to be broken out in the first
place.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds a TMU driver for the SuperH architecture.
The TMU driver is a platform driver with early platform
support to allow using a TMU channel as clockevent or
clocksource during system bootup or later.
Clocksource or clockevent can be selected.
Both periodic and oneshot clockevents are supported.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds a MTU2 driver for the SuperH architecture.
The MTU2 driver is a platform driver with early platform
support to allow using a MTU2 channel as only clockevent
during system bootup.
Clocksource on sh2a is currently unsupported due to code
generation issues with 64-bit math, so at this point only
periodic clockevent support is in place.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The defines for TDM and synchronous clocks are not used - they are
mostly a legacy of the automatic clocking configuration. TDM will
require configuration of the number of timeslots and which ones to use
so can't be fit into the DAI format and synchronous mode is handled by
symmetric_rates (and needs to be done by constraints rather than when
the DAI format is being configured).
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The AC97 wire format is completely fixed so CODECs don't have any choice
about the formats they accept but controllers accept a variety of data
formats and render them down onto the bus. Have a shared define so all
the CODEC drivers will interoperate with any of our controller drivers.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Some arches don't supply their own clocksource. This is mainly the
case in architectures that get their inter-tick times by reading the
counter on their interval timer. Since these timers wrap every tick,
they're not really useful as clocksources. Wrapping them to act like
one is possible but not very efficient. So we provide a callout these
arches can implement for use with the jiffies clocksource to provide
finer then tick granular time.
[ Impact: ease the migration to generic time keeping ]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Setup clocksource mult_orig in clocksource_enable().
Clocksource drivers can save power by using keeping the
device clock disabled while the clocksource is unused.
In practice this means that the enable() and disable()
callbacks perform clk_enable() and clk_disable().
The enable() callback may also use clk_get_rate() to get
the clock rate from the clock framework. This information
can then be used to calculate the shift and mult variables.
Currently the mult_orig variable is setup from mult at
registration time only. This is conflicting with the above
case since the clock is disabled and the mult variable is
not yet calculated at the time of registration.
Moving the mult_orig setup code to clocksource_enable()
allows us to both handle the common case with no enable()
callback and the mult-changed-after-enable() case.
[ Impact: allow dynamic clock source usage ]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
LKML-Reference: <20090501054546.8193.10688.sendpatchset@rx1.opensource.se>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move this out of a local variable into the nfs4_delegation object in
preparation for making this an async rpc call (at which point we'll need
any state like this in a common object that's preserved across function
calls).
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
There's no point in keeping this field around--it's always zero.
(Background: the protocol allows you to tell the client that the file is
about to be truncated, as an optimization to save the client from
writing back dirty pages that will just be discarded. We don't
implement this hint. If we do some day, adding this field back in will
be the least of the work involved.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The nfs4_cb_recall struct is used only in nfs4_delegation, so its
pointer to the containing delegation is unnecessary--we could just use
container_of().
But there's no real reason to have this a separate struct at all--just
move these fields to nfs4_delegation.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
virtio_net.h uses the macro ETH_ALEN which is defined in linux/if_ether.h.
Discovered when hacking on virtio-over-pci patches.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
I want to use the name for a struct that actually does represent a
single callback.
(Actually, I've never been sure it helps to a separate struct for the
callback information. Some day maybe those fields could just be dumped
into struct nfs4_client. I don't know.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Conflicts:
arch/x86/kernel/apic/io_apic.c
Merge reason: non-trivial interaction between ongoing work in io_apic.c
and the NUMA migration feature in the irq tree.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
move_irq_desc() will try to move irq_desc to the home node if
the allocated one is not correct, in create_irq_nr().
( This can happen on devices that are on different nodes that
are using MSI, when drivers are loaded and unloaded randomly. )
v2: fix non-smp build
v3: add NUMA_IRQ_DESC to eliminate #ifdefs
[ Impact: improve irq descriptor locality on NUMA systems ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F95EAE.2050903@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When two (or more) contexts output to the same buffer, it is possible
to observe half written output.
Suppose we have CPU0 doing perf_counter_mmap(), CPU1 doing
perf_counter_overflow(). If CPU1 does a wakeup and exposes head to
user-space, then CPU2 can observe the data CPU0 is still writing.
[ Impact: fix occasionally corrupted profiling records ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090501102533.007821627@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is necessary to avoid the conflict of syscall numbers.
Conflicts:
arch/x86/ia32/ia32entry.S
arch/x86/include/asm/unistd_32.h
arch/x86/include/asm/unistd_64.h
Fixes up the borked syscall numbers of perfcounters versus
preadv/pwritev as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
sys_kill has the per thread counterpart sys_tgkill. sigqueueinfo is
missing a thread directed counterpart. Such an interface is important
for migrating applications from other OSes which have the per thread
delivery implemented.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Original patch (dfa4411cc3) was buggy.
This is a more proper fix which introduces blk_rq_quiet() macro
alleviating the need for dumb, too short caching variables.
Thanks to Helge Deller and Bart for debugging this.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Reported-and-tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
It's possible for character sets to require a multi-byte null
string terminator. Add a helper function that determines the size
of the null terminator at runtime.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The new requeue PI futex op codes were modeled after the existing
FUTEX_REQUEUE and FUTEX_CMP_REQUEUE calls. I was unaware at the time
that FUTEX_REQUEUE was only around for compatibility reasons and
shouldn't be used in new code. Ulrich Drepper elaborates on this in his
Futexes are Tricky paper: http://people.redhat.com/drepper/futex.pdf.
The deprecated call doesn't catch changes to the futex corresponding to
the destination futex which can lead to deadlock.
Therefor, I feel it best to remove FUTEX_REQUEUE_PI and leave only
FUTEX_CMP_REQUEUE_PI as there are not yet any existing users of the API.
This patch does change the OP code value of FUTEX_CMP_REQUEUE_PI to 12
from 13. Since my test case is the only known user of this API, I felt
this was the right thing to do, rather than leave a hole in the
enumeration.
I chose to continue using the _CMP_ modifier in the OP code to make it
explicit to the user that the test is being done.
Builds, boots, and ran several hundred iterations requeue_pi.c.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
LKML-Reference: <49ED580E.1050502@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
include/linux/mutex.h:136: warning: 'mutex_lock' declared inline after being called
include/linux/mutex.h:136: warning: previous declaration of 'mutex_lock' was here
uninline it.
[ Impact: clean up and uninline, address compiler warning ]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <200904292318.n3TNIsi6028340@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add mdio_support and lp_advertising fields to ethtool_cmd. Set these
in mdio45_ethtool_gset{,_npage}().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This implements the ETHTOOL_SPAUSEPARAM operation for MDIO (clause 45)
PHYs with auto-negotiation MMDs.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This converts flow control capabilites to an advertising mask and can
be useful in combination with mii_resolve_flowctrl_fdx().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a shorter and more comprehensible formulation of the
conditions for each flow control mode.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These roughly mirror many of the MII library functions and are based
on code from the sfc driver.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IEEE 802.3 clause 45 specifies the MDIO interface and registers for
use in 10G and other PHYs, similar to the MII management interface.
PHYs may have up to 32 MMDs corresponding to different sub-layers and
functions, each with up to 65536 registers. These are addressed by
PRTAD (similar to the MII PHY address) and DEVAD. Define a mapping
for specifying PRTAD and DEVAD through the existing MII ioctls.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a PORT_OTHER to represent all other physical port types. Current
NICs generally do not allow switching between multiple port types in
software so specific types should not be needed.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't flush inherited SIGKILL during execve() in SELinux's post cred commit
hook. This isn't really a security problem: if the SIGKILL came before the
credentials were changed, then we were right to receive it at the time, and
should honour it; if it came after the creds were changed, then we definitely
should honour it; and in any case, all that will happen is that the process
will be scrapped before it ever returns to userspace.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
ext4 supports a real NFSv4 change attribute, which is bumped whenever
the ctime would be updated, including times when two updates arrive
within a jiffy of each other. (Note that although ext4 has space for
nanosecond-precision ctime, the real resolution is lower: it actually
uses jiffies as the time-source.) This ensures clients will invalidate
their caches when they need to.
There is some fear that keeping the i_version up-to-date could have
performance drawbacks, so for now it's turned on only by a mount option.
We hope to do something better eventually.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Theodore Tso <tytso@mit.edu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
e100: do not go D3 in shutdown unless system is powering off
netfilter: revised locking for x_tables
Bluetooth: Fix connection establishment with low security requirement
Bluetooth: Add different pairing timeout for Legacy Pairing
Bluetooth: Ensure that HCI sysfs add/del is preempt safe
net: Avoid extra wakeups of threads blocked in wait_for_packet()
net: Fix typo in net_device_ops description.
ipv4: Limit size of route cache hash table
Add reference to CAPI 2.0 standard
Documentation/isdn/INTERFACE.CAPI
update Documentation/isdn/00-INDEX
ixgbe: Fix WoL functionality for 82599 KX4 devices
veth: prevent oops caused by netdev destructor
xfrm: wrong hash value for temporary SA
forcedeth: tx timeout fix
net: Fix LL_MAX_HEADER for CONFIG_TR_MODULE
mlx4_en: Handle page allocation failure during receive
mlx4_en: Fix cleanup flow on cq activation
vlan: update vlan carrier state for admin up/down
netfilter: xt_recent: fix stack overread in compat code
...
Much like the atomic_dec_and_lock() function in which we take an hold a
spin_lock if we drop the atomic to 0 this function takes and holds the
mutex if we dec the atomic to 0.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.410913479@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The type of counter index is sometimes implemented as unsigned
int. This patch changes this to have a consistent usage of int.
[ Impact: cleanup ]
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-21-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch renames struct hw_perf_counter_ops into struct pmu. It
introduces a structure to describe a cpu specific pmu (performance
monitoring unit). It may contain ops and data. The new name of the
structure fits better, is shorter, and thus better to handle. Where it
was appropriate, names of function and variable have been changed too.
[ Impact: cleanup ]
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-7-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is only needed for CONFIG_PERF_COUNTERS enabled.
[ Impact: cleanup ]
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1241002046-8832-3-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Replace the current event parser hack with a better one. Filters are
no longer specified predicate by predicate, but all at once and can
use parens and any of the following operators:
numeric fields:
==, !=, <, <=, >, >=
string fields:
==, !=
predicates can be combined with the logical operators:
&&, ||
examples:
"common_preempt_count > 4" > filter
"((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
If there was an error, the erroneous string along with an error
message can be seen by looking at the filter e.g.:
((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
^
parse_error: Field not found
Currently the caret for an error always appears at the beginning of
the filter; a real position should be used, but the error message
should be useful even without it.
To clear a filter, '0' can be written to the filter file.
Filters can also be set or cleared for a complete subsystem by writing
the same filter as would be written to an individual event to the
filter file at the root of the subsytem. Note however, that if any
event in the subsystem lacks a field specified in the filter being
set, the set will fail and all filters in the subsytem are
automatically cleared. This change from the previous version was made
because using only the fields that happen to exist for a given event
would most likely result in a meaningless filter.
Because the logical operators are now implemented as predicates, the
maximum number of predicates in a filter was increased from 8 to 16.
[ Impact: add new, extended trace-filter implementation ]
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <1240905899.6416.121.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new filter comparison ops need to be able to distinguish between
signed and unsigned field types, so add an is_signed flag/param to the
event field struct/trace_define_fields(). Also define a simple macro,
is_signed_type() to determine the signedness at compile time, used in the
trace macros. If the is_signed_type() macro won't work with a specific
type, a new slightly modified version of TRACE_FIELD() called
TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
[ Impact: extend trace-filter code for new feature ]
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <1240905893.6416.120.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Create a new event_filter object, and move the pred-related members
out of the call and subsystem objects and into the filter object - the
details of the filter implementation don't need to be exposed in the
call and subsystem in any case, and it will also help make the new
parser implementation a little cleaner.
[ Impact: refactor trace-filter code to prepare for new features ]
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <1240905887.6416.119.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Follow up to afcfe024ae in Linus' tree
("x86: mmiotrace: quieten spurious warning message")
Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
Acked-by: Pekka Paalanen <pq@iki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1240946271-7083-5-git-send-email-stuart@freedesktop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The patch adds kernel parameter intel_iommu=pt to set up pass through
mode in context mapping entry. This disables DMAR in linux kernel; but
KVM still runs on VT-d and interrupt remapping still works.
In this mode, kernel uses swiotlb for DMA API functions but other VT-d
functionalities are enabled for KVM. KVM always uses multi level
translation page table in VT-d. By default, pass though mode is disabled
in kernel.
This is useful when people don't want to enable VT-d DMAR in kernel but
still want to use KVM and interrupt remapping for reasons like DMAR
performance concern or debug purpose.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Weidong Han <weidong@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The x_tables are organized with a table structure and a per-cpu copies
of the counters and rules. On older kernels there was a reader/writer
lock per table which was a performance bottleneck. In 2.6.30-rc, this
was converted to use RCU and the counters/rules which solved the performance
problems for do_table but made replacing rules much slower because of
the necessary RCU grace period.
This version uses a per-cpu set of spinlocks and counters to allow to
table processing to proceed without the cache thrashing of a global
reader lock and keeps the same performance for table updates.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: fix up error path leak in i915_cmdbuffer
drm/i915: fix unpaired i915 device mutex on entervt failure.
drm/i915: add support for G41 chipset
drm/i915: Enable ASLE if present
drm/i915: Unregister ACPI video driver when exiting
drm/i915: Register ACPI video even when not modesetting
drm/i915: fix transition to I915_TILING_NONE
drm/i915: Don't let an oops get triggered from irq_emit without dma init.
drm/i915: allow tiled front buffers on 965+
Add regulator header file missing kernel-doc:
Warning(include/linux/regulator/driver.h:117): No description found for parameter 'set_mode'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Liam Girdwood <lrg@slimlogic.co.uk>
cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Adjust the synopsis of svc_sock_names() to pass in the size of the
output buffer. Add a documenting comment.
This is a cosmetic change for now. A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Adjust the synopsis of svc_addsock() to pass in the size of the output
buffer. Add a documenting comment.
This is a cosmetic change for now. A subsequent patch will make sure
the buffer length is passed to one_sock_name(), where the length will
actually be useful.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The svc_xprt_names() function can overflow its buffer if it's so near
the end of the passed in buffer that the "name too long" string still
doesn't fit. Of course, it could never tell if it was near the end
of the passed in buffer, since its only caller passes in zero as the
buffer length.
Let's make this API a little safer.
Change svc_xprt_names() so it *always* checks for a buffer overflow,
and change its only caller to pass in the correct buffer length.
If svc_xprt_names() does overflow its buffer, it now fails with an
ENAMETOOLONG errno, instead of trying to write a message at the end
of the buffer. I don't like this much, but I can't figure out a clean
way that's always safe to return some of the names, *and* an
indication that the buffer was not long enough.
The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is
"File name too long".
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The svc_addr_len() helper function returns -EAFNOSUPPORT if it doesn't
recognize the address family of the passed-in socket address. However,
the return type of this function is size_t, which means -EAFNOSUPPORT
is turned into a very large positive value in this case.
The check in svc_udp_recvfrom() to see if the return value is less
than zero therefore won't work at all.
Additionally, handle_connect_req() passes this value directly to
memset(). This could cause memset() to clobber a large chunk of memory
if svc_addr_len() has returned an error. Currently the address family
of these addresses, however, is known to be supported long before
handle_connect_req() is called, so this isn't a real risk.
Change the error return value of svc_addr_len() to zero, which fits in
the range of size_t, and is safer to pass to memset() directly.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
In order to utilize the full power of the new multi-touch devices, a
way to report detailed finger data to user space is needed. This patch
adds a multi-touch (MT) protocol which allows drivers to report details
for an arbitrary number of fingers.
The driver sends a SYN_MT_REPORT event via the input_mt_sync() function
when a complete finger has been reported.
In order to stay compatible with existing applications, the data
reported in a finger packet must not be recognized as single-touch
events. In addition, all finger data must bypass input filtering,
since subsequent events of the same type refer to different fingers.
A set of ABS_MT events with the desired properties are defined. The
events are divided into categories, to allow for partial implementation.
The minimum set consists of ABS_MT_TOUCH_MAJOR, ABS_MT_POSITION_X and
ABS_MT_POSITION_Y, which allows for multiple fingers to be tracked.
If the device supports it, the ABS_MT_WIDTH_MAJOR may be used to provide
the size of the approaching finger. Anisotropy and direction may be
specified with ABS_MT_TOUCH_MINOR, ABS_MT_WIDTH_MINOR and
ABS_MT_ORIENTATION. Devices with more granular information may specify
general shapes as blobs, i.e., as a sequence of rectangular shapes
grouped together by a ABS_MT_BLOB_ID. Finally, the ABS_MT_TOOL_TYPE
may be used to specify whether the touching tool is a finger or a pen.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The integrated button on the new unibody Macbooks presents a need to
report explicit four-finger actions. Evidently, the finger pressing
the button is also touching the trackpad, so in order to fully support
three-finger actions, the driver must be able to report four-finger
actions. This patch adds a new button, BTN_TOOL_QUADTAP, which
achieves this.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The Bluetooth stack uses a reference counting for all established ACL
links and if no user (L2CAP connection) is present, the link will be
terminated to save power. The problem part is the dedicated pairing
when using Legacy Pairing (Bluetooth 2.0 and before). At that point
no user is present and pairing attempts will be disconnected within
10 seconds or less. In previous kernel version this was not a problem
since the disconnect timeout wasn't triggered on incoming connections
for the first time. However this caused issues with broken host stacks
that kept the connections around after dedicated pairing. When the
support for Simple Pairing got added, the link establishment procedure
needed to be changed and now causes issues when using Legacy Pairing
When using Simple Pairing it is possible to do a proper reference
counting of ACL link users. With Legacy Pairing this is not possible
since the specification is unclear in some areas and too many broken
Bluetooth devices have already been deployed. So instead of trying to
deal with all the broken devices, a special pairing timeout will be
introduced that increases the timeout to 60 seconds when pairing is
triggered.
If a broken devices now puts the stack into an unforeseen state, the
worst that happens is the disconnect timeout triggers after 120 seconds
instead of 4 seconds. This allows successful pairings with legacy and
broken devices now.
Based on a report by Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
netif_tx_queue_stopped(txq) is most of the time false.
Yet its cost is very expensive on SMP.
static inline int netif_tx_queue_stopped(const struct netdev_queue *dev_queue)
{
return test_bit(__QUEUE_STATE_XOFF, &dev_queue->state);
}
I saw this on oprofile hunting and bnx2 driver bnx2_tx_int().
We probably should split "struct netdev_queue" in two parts, one
being read mostly.
__netif_tx_lock() touches _xmit_lock & xmit_lock_owner, these
deserve a separate cache line.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Try to get irq_desc on the home node in create_irq_nr().
v2: don't check if we can move it when sparse_irq is not used
v3: use move_irq_des, if that node is not what we want
[ Impact: optimization, make MSI IRQ descriptors more NUMA aware ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F6559F.7070005@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We want to use dev_to_node() later on, to be aware of the 'home node'
of the GSI in question.
[ Impact: cleanup, prepare the IRQ code to be more NUMA aware ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Len Brown <lenb@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Len Brown <lenb@kernel.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-acpi@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
LKML-Reference: <49F65560.20904@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This simplifies the node awareness of the code. All our allocators
only deal with a NUMA node ID locality not with CPU ids anyway - so
there's no need to maintain (and transform) a CPU id all across the
IRq layer.
v2: keep move_irq_desc related
[ Impact: cleanup, prepare IRQ code to be NUMA-aware ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
LKML-Reference: <49F65536.2020300@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
according to Ingo, change set_affinity() in irq_chip should return int,
because that way we can handle failure cases in a much cleaner way, in
the genirq layer.
v2: fix two typos
[ Impact: extend API ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-arch@vger.kernel.org
LKML-Reference: <49F654E9.4070809@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The original feature of migrating irq_desc dynamic was too fragile
and was causing problems: it caused crashes on systems with lots of
cards with MSI-X when user-space irq-balancer was enabled.
We now have new patches that create irq_desc according to device
numa node. This patch removes the leftover bits of the dynamic balancer.
[ Impact: remove dead code ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49F654AF.8000808@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
CPUMASKS_OFFSTACK is not defined anywhere (it is CPUMASK_OFFSTACK).
It is a typo and init_allocate_desc_masks() is called before it set
affinity to all cpus...
Split init_alloc_desc_masks() into all_desc_masks() and init_desc_masks().
Also use CPUMASK_OFFSTACK in alloc_desc_masks().
[ Impact: fix smp_affinity copying/setup when moving irq_desc between CPUs ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <49F6546E.3040406@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In 2.6.25 we added UDP mem accounting.
This unfortunatly added a penalty when a frame is transmitted, since
we have at TX completion time to call sock_wfree() to perform necessary
memory accounting. This calls sock_def_write_space() and utimately
scheduler if any thread is waiting on the socket.
Thread(s) waiting for an incoming frame was scheduled, then had to sleep
again as event was meaningless.
(All threads waiting on a socket are using same sk_sleep anchor)
This adds lot of extra wakeups and increases latencies, as noted
by Christoph Lameter, and slows down softirq handler.
Reference : http://marc.info/?l=linux-netdev&m=124060437012283&w=2
Fortunatly, Davide Libenzi recently added concept of keyed wakeups
into kernel, and particularly for sockets (see commit
37e5540b3c
epoll keyed wakeups: make sockets use keyed wakeups)
Davide goal was to optimize epoll, but this new wakeup infrastructure
can help non epoll users as well, if they care to setup an appropriate
handler.
This patch introduces new DEFINE_WAIT_FUNC() helper and uses it
in wait_for_packet(), so that only relevant event can wakeup a thread
blocked in this function.
Trace of function calls from bnx2 TX completion bnx2_poll_work() is :
__kfree_skb()
skb_release_head_state()
sock_wfree()
sock_def_write_space()
__wake_up_sync_key()
__wake_up_common()
receiver_wake_function() : Stops here since thread is waiting for an INPUT
Reported-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
this is the sctp code to enable hardware crc32c offload for
adapters that support it.
Originally by: Vlad Yasevich <vladislav.yasevich@hp.com>
modified by Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is V2 of the smsc911x fifo byteswap patch.
The smsc911x hardware supports both big and little and endian
hardware configurations, and the linux smsc911x driver currently
detects word order.
For correct operation on big endian platforms lacking swapped
byte lanes the following patch is needed. Only fifo data is
swapped, register data does not require any swapping.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
include/linux/mg_disk.h is used only by drivers/block/mg_disk.c. No
reason to put it in a separate header. Fold it into mg_disk.c.
[ Impact: cleanup ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
In the process of mindlessly copying [__]blk_end_request_all(),
[__]blk_end_request_cur() ended up returning void even though they're
partial completion functions. Fix it.
[ Impact: fix braindead API ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Now that all block request data transfer is done via bio, rq->data
isn't used. Kill it.
While at it, make the roles of rq->special and buffer clear.
[ Impact: drop now unncessary field from struct request ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Boaz Harrosh <bharrosh@panasas.com>
end_request() has been kept around for backward compatibility;
however, it's about time for it to go away.
* There aren't too many users left.
* Its use of @updtodate is pretty confusing.
* In some cases, newer code ends up using mixture of end_request() and
[__]blk_end_request[_all](), which is way too confusing.
So, add [__]blk_end_request_cur() and replace end_request() with it.
Most conversions are straightforward. Noteworthy ones are...
* paride/pcd: next_request() updated to take 0/-errno instead of 1/0.
* paride/pf: pf_end_request() and next_request() updated to take
0/-errno instead of 1/0.
* xd: xd_readwrite() updated to return 0/-errno instead of 1/0.
* mtd/mtd_blkdevs: blktrans_discard_request() updated to return
0/-errno instead of 1/0. Unnecessary local variable res
initialization removed from mtd_blktrans_thread().
[ Impact: cleanup ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Joerg Dorchain <joerg@dorchain.net>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Laurent Vivier <Laurent@lvivier.info>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: unsik Kim <donari75@gmail.com>
There are many [__]blk_end_request() call sites which call it with
full request length and expect full completion. Many of them ensure
that the request actually completes by doing BUG_ON() the return
value, which is awkward and error-prone.
This patch adds [__]blk_end_request_all() which takes @rq and @error
and fully completes the request. BUG_ON() is added to to ensure that
this actually happens.
Most conversions are simple but there are a few noteworthy ones.
* cdrom/viocd: viocd_end_request() replaced with direct calls to
__blk_end_request_all().
* s390/block/dasd: dasd_end_request() replaced with direct calls to
__blk_end_request_all().
* s390/char/tape_block: tapeblock_end_request() replaced with direct
calls to blk_end_request_all().
[ Impact: cleanup ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Request completion has gone through several changes and became a bit
messy over the time. Clean it up.
1. end_that_request_data() is a thin wrapper around
end_that_request_data_first() which checks whether bio is NULL
before doing anything and handles bidi completion.
blk_update_request() is a thin wrapper around
end_that_request_data() which clears nr_sectors on the last
iteration but doesn't use the bidi completion.
Clean it up by moving the initial bio NULL check and nr_sectors
clearing on the last iteration into end_that_request_data() and
renaming it to blk_update_request(), which makes blk_end_io() the
only user of end_that_request_data(). Collapse
end_that_request_data() into blk_end_io().
2. There are four visible completion variants - blk_end_request(),
__blk_end_request(), blk_end_bidi_request() and end_request().
blk_end_request() and blk_end_bidi_request() uses blk_end_request()
as the backend but __blk_end_request() and end_request() use
separate implementation in __blk_end_request() due to different
locking rules.
blk_end_bidi_request() is identical to blk_end_io(). Collapse
blk_end_io() into blk_end_bidi_request(), separate out request
update into internal helper blk_update_bidi_request() and add
__blk_end_bidi_request(). Redefine [__]blk_end_request() as thin
inline wrappers around [__]blk_end_bidi_request().
3. As the whole request issue/completion usages are about to be
modified and audited, it's a good chance to convert completion
functions return bool which better indicates the intended meaning
of return values.
4. The function name end_that_request_last() is from the days when it
was a public interface and slighly confusing. Give it a proper
internal name - blk_finish_request().
5. Add description explaning that blk_end_bidi_request() can be safely
used for uni requests as suggested by Boaz Harrosh.
The only visible behavior change is from #1. nr_sectors counts are
cleared after the final iteration no matter which function is used to
complete the request. I couldn't find any place where the code
assumes those nr_sectors counters contain the values for the last
segment and this change is good as it makes the API much more
consistent as the end result is now same whether a request is
completed using [__]blk_end_request() alone or in combination with
blk_update_request().
API further cleaned up per Christoph's suggestion.
[ Impact: cleanup, rq->*nr_sectors always updated after req completion ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@infradead.org>
With recent IDE updates, blk_end_request_callback() doesn't have any
user now. Kill it.
[ Impact: removal of unused convoluted interface ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Reorder request completion functions such that
* All request completion functions are located together.
* Functions which are used by only one caller is put right above the
caller.
* end_request() is put after other completion functions but before
blk_update_request().
This change is for completion function cleanup which will follow.
[ Impact: cleanup, code reorganization ]
Signed-off-by: Tejun Heo <tj@kernel.org>
blk_start_queueing() is identical to __blk_run_queue() except that it
doesn't check for recursion. None of the current users depends on
blk_start_queueing() running request_fn directly. Replace usages of
blk_start_queueing() with [__]blk_run_queue() and kill it.
[ Impact: removal of mostly duplicate interface function ]
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: remove fields and code paths which are no longer necessary
Now that ide-tape uses standard mechanisms to transfer data, special
case handling for bh handling can be dropped from ide-atapi. Drop the
followings.
* pc->cur_pos, b_count, bh and b_data
* drive->pc_update_buffers() and pc_io_buffers().
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: unify request data buffer handling
rq->data is used mostly to pass kernel buffer through request queue
without using bio. There are only a couple of places which still do
this in kernel and converting to bio isn't difficult.
This patch converts ide-cd and atapi to use bio instead of rq->data
for request sense and internal pc commands. With previous change to
unify sense request handling, this is relatively easily achieved by
adding blk_rq_map_kern() during sense_rq prep and PC issue.
If blk_rq_map_kern() fails for sense, the error is deferred till sense
issue and aborts the failed command which triggered the sense. Note
that this is a slim possibility as sense prep is done on each command
issue, so for the above condition to actually trigger, all preps since
the last sense issue till the issue of the request which would require
a sense should fail.
* do_request functions might sleep now. This should be okay as ide
request_fn - do_ide_request() - is invoked only from make_request
and plug work. Make sure this is the case by adding might_sleep()
to do_ide_request().
* Functions which access the read sense data before the sense request
is complete now should access bio_data(sense_rq->bio) as the sense
buffer might have been copied during blk_rq_map_kern().
* ide-tape updated to map sg.
* cdrom_do_block_pc() now doesn't have to deal with REQ_TYPE_ATA_PC
special case. Simplified.
* tp_ops->output/input_data path dropped from ide_pc_intr().
Signed-off-by: Tejun Heo <tj@kernel.org>
Since we're issuing REQ_TYPE_SENSE now we need to allow those types of
rqs in the ->do_request callbacks. As a future improvement, sense_len
assignment might be unified across all ATAPI devices. Borislav to
check with specs and test.
As a result, get rid of ide_queue_pc_head() and
drive->request_sense_rq.
tj: * Init request sense ide_atapi_pc from sense request. In the
longer timer, it would probably better to fold
ide_create_request_sense_cmd() into its only current user -
ide_floppy_get_format_progress().
* ide_retry_pc() no longer takes @disk.
CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
This is in preparation of removing the queueing of a sense request out
of the IRQ handler path.
Use struct request_sense as a general sense buffer for all ATAPI
devices ide-{floppy,tape,cd}.
tj: * blk_get_request(__GFP_WAIT) can't be called from do_request() as
it can cause deadlock. Converted to use inline struct request
and blk_rq_init().
* Added xfer / cdb len selection depending on device type.
* All sense prep logics folded into ide_prep_sense() which never
fails.
* hwif->rq clearing and sense_rq used handling moved into
ide_queue_sense_rq().
* blk_rq_map_kern() conversion is moved to later patch.
CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Now that the bio list management stuff is generic, convert loop to use
bio lists instead of its own private bio list implementation.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The old refok sections
.text.init.refok
.data.init.refok
.exit.text.refok
have been deprecated since commit
312b1485fb. After the other patches in
this patch series nothing is put in these sections, so clean things up
by eliminating all the remaining references to them.
Signed-off-by: Tim Abbott <tabbott@mit.edu>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: only save/restore existent registers in the PCIe capability
x86/PCI: don't bother with root quirks if _CRS is used
docbooks: add/fix PCI kernel-doc
PCI: cleanup debug output resources
x86/PCI: set_pci_bus_resources_arch_default cleanups
x86/PCI: Move set_pci_bus_resources_arch_default into arch/x86
x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case
PCI quirk: disable MSI on VIA VT3364 chipsets
On a brand new GRO skb, we cannot call ip_hdr since the header
may lie in the non-linear area. This patch adds the helper
skb_gro_network_header to handle this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb_gro_* code fails to handle the case where a header starts
in the linear area but ends in the frags area. Since the goal
of skb_gro_* is to optimise the case of completely non-linear
packets, we can simply bail out if we have anything in the linear
area.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
When creating a certain types of VPN, NetworkManager will first attempt
to find an available tun device by iterating through 'vpn%d' until it
finds one that isn't already busy. Then it'll set that to be persistent
and owned by the otherwise unprivileged user that the VPN dæmon itself
runs as.
There's a race condition here -- during the period where the vpn%d
device is created and we're waiting for the VPN dæmon to actually
connect and use it, if we try to create _another_ device we could end up
re-using the same one -- because trying to open it again doesn't get
-EBUSY as it would while it's _actually_ busy.
So solve this, we add an IFF_TUN_EXCL flag which causes tun_set_iff() to
fail if it would be opening an existing persistent tundevice -- so that
we can make sure we're getting an entirely _new_ device.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch simplifies the driver by making use of more common code.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for parsing the device tree for PHY devices on an MDIO bus.
Currently many of the PowerPC ethernet drivers are open coding a solution
for reading data out of the device tree to find the correct PHY device.
This patch implements a set of common routines to:
a) let MDIO bus drivers register phy_devices described in the tree, and
b) let MAC drivers find the correct phy_device via the tree.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add phy_connect_direct() and phy_attach_direct() functions so that
drivers can use a pointer to the phy_device instead of trying to determine
the phy's bus_id string.
This patch is useful for OF device tree descriptions of phy devices where
the driver doesn't need or know what the bus_id value in order to get a
phy_device pointer.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes changes in preparation for supporting open firmware
device tree descriptions of MDIO busses. Changes include:
- Cleanup handling of phy_map[] entries; they are already NULLed when
registering and so don't need to be re-cleared, and it is good practice
to clear them out when unregistering.
- Split phy_device registration out into a new function so that the
OF helpers can do two stage registration (separate allocation and
registration steps).
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_parse_phandle() is a helper function to read and parse a phandle
property and return a pointer to the resulting device_node.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IP MIB (RFC 4293) defines stats for InOctets, OutOctets, InMcastOctets and
OutMcastOctets:
http://tools.ietf.org/html/rfc4293
But it seems we don't track those in any way that easy to separate from other
protocols. This patch adds those missing counters to the stats file. Tested
successfully by me
With help from Eric Dumazet.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes an unused parameter (tb_stamp) from fib_table
structure in include/net/ip_fib.h.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is preparation for replacing all uses of ".head.text" or
".text.head" in the kernel with macros, so that the section name can
later be changed without having to touch a lot of the kernel.
Since some linker scripts do more complex things than referencing
HEAD_TEXT, we add a HEAD_TEXT_SECTION macro that just contains the
actual name.
I've defined HEAD_TEXT_SECTION in a new header,
include/linux/section-names.h, so that this section name only needs to
appear in one place. I anticipate creating similar macro structures
for a number of other section names.
The long-term goal here is to be able to change the kernel's magic
section names to those that are compatible with -ffunction-sections
-fdata-sections. This requires renaming all magic sections with names
of the form ".text.foo".
Signed-off-by: Tim Abbott <tabbott@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With modules being able to add trace events, and the max trace event
counter is 16 bits (65536) we can overflow the counter easily
with a simple while loop adding and removing modules that contain
trace events.
This patch links together the registered trace events and on overflow
searches for available trace event ids. It will still fail if
over 65536 events are registered, but considering that a typical
kernel only has 22000 functions, 65000 events should be sufficient.
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
For every lock request lockd creates a new file_lock object
in nlmsvc_setgrantargs() by copying the passed in file_lock with
locks_copy_lock(). A filesystem can attach it's own lock_operations
vector to the file_lock. It has to be cleaned up at the end of the
file_lock's life. However, lockd doesn't do it today, yet it
asserts in nlmclnt_release_lockargs() that the per-filesystem
state is clean.
This patch fixes it by exporting locks_release_private() and adding
it to nlmsvc_freegrantargs(), to be symmetrical to creating a
file_lock in nlmsvc_setgrantargs().
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Add a macro for double controls with special callback functions.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The TRACE_FORMAT macro has been deprecated by the TRACE_EVENT macro.
There are no more users. All new users must use the TRACE_EVENT macro.
[ Impact: remove old functionality ]
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (34 commits)
ACPI, i915: Register ACPI video even when not modesetting
Revert "ACPICA: delete check for AML access to port 0x81-83"
I/O port protection: update for windows compatibility.
sony-laptop: always try to unblock rfkill on load
sony-laptop: fix bogus error message display on resume
ACPI: EC: Fix ACPI EC resume non-query interrupt message
sony-laptop: SNC input event 38 fix
sony-laptop: SNC 127 Initialization Fix
sony-laptop: Duplicate SNC 127 Event Fix
ACPI: prevent processor.max_cstate=0 boot crash
ACPI/hpet: prevent boot hang when hpet=force used on ICH-4M
ACPI: delete obsolete "bus master activity" proc field
ACPI: idle: mark_tsc_unstable() at init-time, not run-time
ACPI: add /sys/firmware/acpi/interrupts/sci_not counter
ACPI video: fix an error when the brightness levels on AC and on Battery are same
acpi-cpufreq: Do not let get_measured perf depend on internal variable
acpi-cpufreq: style-only: add parens to math expression
acpi-cpufreq: Cleanup: Use printk_once
x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf
thinkpad-acpi: bump up version to 0.23
...
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Fix potential inode allocation soft lockup in Orlov allocator
ext4: Make the extent validity check more paranoid
jbd: use SWRITE_SYNC_PLUG when writing synchronous revoke records
jbd2: use SWRITE_SYNC_PLUG when writing synchronous revoke records
ext4: really print the find_group_flex fallback warning only once
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: pwc : do not pass stack allocated buffers to USB core.
USB: otg: Fix bug on remove path without transceiver
USB: correct error handling in cdc-wdm
USB: removal of tty->low_latency hack dating back to the old serial code
USB: serial: sierra driver bug fix for composite interface
USB: gadget: omap_udc uses platform_driver_probe()
USB: ci13xxx_udc: fix build error
USB: musb: Prevent multiple includes of musb.h
USB: pass mem_flags to dma_alloc_coherent
USB: g_file_storage: fix use-after-free bug when closing files
USB: ehci-sched.c: EHCI SITD scheduling bugfix
USB: fix mos7840 problem with minor numbers
USB: mos7840: add new device id
USB: musb: fix build when !CONFIG_PM
USB: musb: Remove my email address from few musb related drivers
USB: Gadget: MIPS CI13xxx UDC bugfixes
USB: Unusual Device support for Gold MP3 Player Energy
USB: serial: fix lifetime and locking problems
The TRACE_FORMAT will soon be deprecated. This patch converts it to
the TRACE_EVENT macro.
Note, this change should also speed up the tracing.
[ Impact: remove a user of deprecated TRACE_FORMAT ]
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The TRACE_FORMAT will soon be deprecated. This patch converts it to
the TRACE_EVENT macro.
Note, this change should also speed up the tracing.
[ Impact: remove a user of deprecated TRACE_FORMAT ]
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch adds missing role attribute to the DCCP type, otherwise
the creation of entries is not of any use.
The attribute added is CTA_PROTOINFO_DCCP_ROLE which contains the
role of the conntrack original tuple.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
cfq-iosched: cache prio_tree root in cfqq->p_root
cfq-iosched: fix bug with aliased request and cooperation detection
cfq-iosched: clear ->prio_trees[] on cfqd alloc
block: fix intermittent dm timeout based oops
umem: fix request_queue lock warning
block: simplify I/O stat accounting
pktcdvd.h should include mempool.h
cfq-iosched: use the default seek distance when there aren't enough seek samples
cfq-iosched: make seek_mean converge more quickly
block: make blk_abort_queue() ignore non-request based devices
block: include empty disks in /proc/diskstats
bio: use bio_kmalloc() in copy/map functions
bio: fix bio_kmalloc()
block: fix queue bounce limit setting
block: fix SG_IO vector request data length handling
scatterlist: make sure sg_miter_next() doesn't return 0 sized mappings
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (94 commits)
netfilter: ctnetlink: fix gcc warning during compilation
net/netrom: Fix socket locking
netlabel: Always remove the correct address selector
ucc_geth.c: Fix upsmr setting in RMII mode
8139too: fix HW initial flow
af_iucv: Fix race when queuing incoming iucv messages
af_iucv: Test additional sk states in iucv_sock_shutdown
af_iucv: Reject incoming msgs if RECV_SHUTDOWN is set
af_iucv: fix oops in iucv_sock_recvmsg() for MSG_PEEK flag
af_iucv: consider state IUCV_CLOSING when closing a socket
iwlwifi: DMA fixes
iwlwifi: add debugging for TX path
mwl8: fix build warning.
mac80211: fix alignment calculation bug
mac80211: do not print WARN if config interface
iwl3945: use cancel_delayed_work_sync to cancel rfkill_poll
iwlwifi: fix EEPROM validation mask to include OTP only devices
atmel: fix netdev ops conversion
pcnet_cs: add cis(firmware) of the Allied Telesis LA-PCM
mlx4_en: Fix cleanup if workqueue create in mlx4_en_add() fails
...
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Fix modular build of ide-pmac when mediabay is built in
powerpc/pasemi: Fix build error on UP
powerpc: Make macintosh/mediabay driver depend on CONFIG_BLOCK
maintainers: Fix PS3 patterns
powerpc/ps3: Fix CONFIG_PS3_FLASH=n build warning
powerpc/32: Don't clobber personality flags on exec
powerpc: Fix crash on CPU hotplug
powerpc/85xx: Remove defconfigs that mpc85xx_{smp_}defconfig cover
powerpc/85xx: Added SMP defconfig
powerpc/85xx: Enabled a bunch of FSL specific drivers/options
powerpc/85xx: Updated generic mpc85xx_defconfig
powerpc: don't disable SATA interrupts on Freescale MPC8610 HPCD
fsl_rio: Pass the proper device to dma mapping routines
powerpc: Fix of_node_put() exit path in of_irq_map_one()
powerpc/5200: defconfig updates
powerpc/5200: Add FLASH nodes to lite5200 device tree
powerpc/device-tree: Document MTD nodes with multiple "reg" tuples
powerpc/of-device-tree: Factor MTD physmap bindings out of booting-without-of
powerpc/5200: Bring the legacy fsl_spi_platform_data hooks back
The current mm interface is asymetric. One function allocates a locked
buffer, another function only refunds the memory.
Change this to have two functions for accounting and refunding locked
memory, respectively; and do the actual buffer allocation in ptrace.
[ Impact: refactor BTS buffer allocation code ]
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090424095143.A30265@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
arch/x86/kernel/ptrace.c
Merge reason: fix the conflict above, and also pick up the CONFIG_BROKEN
dependency change from upstream so that we can remove it
here.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This simplifies I/O stat accounting switching code and separates it
completely from I/O scheduler switch code.
Requests are accounted according to the state of their request queue
at the time of the request allocation. There is no need anymore to
flush the request queue when switching I/O accounting state.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fix this build error:
In file included from fs/compat_ioctl.c:104:
include/linux/pktcdvd.h:285: error: expected specifier-qualifier-list before 'mempool_t'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
RB_MAX_SMALL_DATA = 28bytes is too small for most tracers, it wastes
an 'u32' to save the actually length for events which data size > 28.
This fix uses compressed event header and enlarges RB_MAX_SMALL_DATA.
[ Impact: saves about 0%-12.5%(depends on tracer) memory in ring_buffer ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <49F13189.3090000@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In case a module uses the TRACE_EVENT macro for creating automated
events in ftrace, it may choose to use a different file name
than the defined system name, or choose to use a different path than
the default "include/trace/events" include path.
If this is done, then before including trace/define_trace.h the
header would define either "TRACE_INCLUDE_FILE" for the file
name or "TRACE_INCLUDE_PATH" for the include path.
If it does not define these, then the define_trace.h defines them
instead. If define trace defines them, then define_trace.h should
also undefine them before exiting. To do this a macro is used
to note this:
#ifndef TRACE_INCLUDE_FILE
# define TRACE_INCLUDE_FILE TRACE_SYSTEM
# define UNDEF_TRACE_INCLUDE_FILE
#endif
[...]
#ifdef UNDEF_TRACE_INCLUDE_FILE
# undef TRACE_INCLUDE_FILE
# undef UNDEF_TRACE_INCLUDE_FILE
#endif
The UNDEF_TRACE_INCLUDE_FILE acts as a CPP variable to know to undef
the TRACE_INCLUDE_FILE before leaving define_trace.h.
Unfortunately, due to cut and paste errors, the macros between
FILE and PATH got mixed up.
[ Impact: undef TRACE_INCLUDE_FILE and/or TRACE_INCLUDE_PATH when needed ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
With the new event tracing registration, we must increase the number
of events that can be registered. Currently the type field is only
one byte, which leaves us only 256 possible events.
Since we do not save the CPU number in the tracer anymore (it is determined
by the per cpu ring buffer that is used) we have an extra byte to use.
This patch increases the size of type from 1 byte (256 events) to
2 bytes (65,536 events).
It also adds a WARN_ON_ONCE if we exceed that limit.
[ Impact: allow more than 255 events ]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Add #ifndef to musb header file to prevent multiple inclusions.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The SO_MSGLIMIT socket option modifies the message limit for new
IUCV communication paths.
The message limit specifies the maximum number of outstanding messages
that are allowed for connections. This setting can be lowered by z/VM
when an IUCV connection is established.
Expects an integer value in the range of 1 to 65535.
The default value is 65535.
The message limit must be set before calling connect() or listen()
for sockets.
If sockets are already connected or in state listen, changing the message
limit is not supported.
For reading the message limit value, unconnected sockets return the limit
that has been set or the default limit. For connected sockets, the actual
message limit is returned. The actual message limit is assigned by z/VM
for each connection and it depends on IUCV MSGLIMIT authorizations
specified for the z/VM guest virtual machine.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow 'classification' of socket data that is sent or received over
an af_iucv socket. For classification of data, the target class of an
(native) iucv message is used.
This patch provides the cmsg interface for iucv_sock_recvmsg() and
iucv_sock_sendmsg(). Applications can use the msg_control field of
struct msghdr to set or get the target class as a
"socket control message" (SCM/CMSG).
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide the socket operations getsocktopt() and setsockopt() to enable/disable
sending of data in the parameter list of IUCV messages.
The patch sets respective flag only.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Finds the first set bit in a 64 bit word. This is required in order
to fix a bug in GFS2, but I think it should be a generic function
in case of future users.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Willy Tarreau <w@1wt.eu>
Linux-2.6.29 deleted the legacy ACPI idle handler, leaving
the CPU_IDLE handler, which does not track bus master activity.
So delete the unused bm_activity field -- it is confusing to
print an always zero value.
This patch could break programs that parse
/proc/acpi/processor/*/power, since it deletes this
line from that file:
bus master activity: 00000000
http://bugzilla.kernel.org/show_bug.cgi?id=13145
is not fixed by this patch, but provoked this patch.
Signed-off-by: Len Brown <len.brown@intel.com>
PCIe 1.1 base neither requires the endpoint to implement the entire
PCIe capability structure nor specifies default values of registers
that are not implemented by the device. So we only save and restore
registers that must be implemented by different device types if the
device PCIe capability version is 1.
PCIe 1.1 Capability Structure Expansion ECN and PCIe 2.0 requires
all registers in the PCIe capability to be either implemented or
hardwired to 0. Their PCIe capability version is 2.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
SME needs to be notified when the authentication or association
attempt times out and MLME has stopped processing in order to allow
the SME to decide what to do next.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The maximum sleep interval, for powersave purposes, is
determined by the DTIM period (it may not be larger)
and the required networking latency (it must be small
enough to fulfil those constraints).
This makes mac80211 calculate the maximum sleep interval
based on those constraints, and pass it to the driver.
Then the driver should instruct the device to sleep at
most that long.
Note that the device is responsible for aligning the
maximum sleep interval between DTIMs, we make sure it's
not longer but it needs to make sure it's between them.
Also, group some powersave documentation together and
make it more explicit that we support managed mode only,
and no IBSS powersaving (yet).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Make the JOIN_IBSS command look at the beacon interval
attribute to see if the user requested a specific beacon
interval, if not default to 100 TU (wext too).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Just setting IEEE80211_CONF_CHANGE_PS should be sufficient
for changes in the power saving things. The driver already
tells us whether it wants notification of dynps via the
"have dynps support" hw flag.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Kalle Valo <kalle.valo@iki.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The TIM IE must not be shorter than 4 bytes, so verify that
when parsing it and use the proper type. To ease that adjust
struct ieee80211_tim_ie to have a virtual bitmap of size
at least 1.
Also check that the TIM IE is actually present before trying
to parse it!
Because other people may need the function, make it a static
inline in ieee80211.h.
(The original "mac80211: validate TIM IE length" was a minimal fix for
2.6.30. This purports to be the full, correct fix. -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add new nl80211 attributes that can be used with NL80211_CMD_SET_WIPHY
and NL80211_CMD_GET_WIPHY to manage fragmentation/RTS threshold and
retry limits.
Since these values are stored in struct wiphy, remove the local copy
from mac80211 where feasible (frag & rts threshold). The retry limits
are currently needed in struct ieee80211_conf, but these could be
eventually removed since the driver should have access to the values
in struct wiphy.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Trying to separate header files into net/wireless.h and
net/cfg80211.h has been a source of confusion. Remove
net/wireless.h (because there also is the linux/wireless.h)
and subsume everything into net/cfg80211.h -- except the
definitions for regulatory structures which get moved to
a new header net/regulatory.h.
The "new" net/cfg80211.h is now divided into sections.
There are no real changes in this patch but code shuffling
and some very minor documentation fixes.
I have also, to make things reflect reality, put in a
copyright line for Luis to net/regulatory.h since that
is probably exclusively written by him but was formerly
in a file that only had my copyright line.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds IBSS API along with (preliminary) wext handlers.
The wext handlers can only do IBSS so you need to call them
from your own wext handlers if the mode is IBSS.
The nl80211 API requires
* an SSID
* a channel (frequency) for the case that a new IBSS
has to be created
It optionally supports
* a flag to fix the channel
* a fixed BSSID
The cfg80211 code also takes care to leave the IBSS before
the netdev is set down. If wireless extensions are used, it
also caches values when the interface is down and instructs
the driver to join when the interface is set up.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Since we have ->deauth and ->disassoc we can support the
wext SIWMLME call directly without driver wext handlers.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Document what mac80211 will do in the future to help save power.
We're not quite there yet, but a plan helps. Also, while at it,
fix the docs wrt. multicast traffic.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Kalle Valo <kalle.valo@iki.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When an application asks for a latency lower than the beacon interval
there's nothing we can do -- we need to stay awake and not have the
AP buffer frames for us. Add code to automatically calculate this
constraint in mac80211 so drivers need not concern themselves with it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some hardware defects may require the hardware to be re-initialised
completely from scratch. Drivers would need much information (for
instance the current MAC address, crypto keys, beaconing information,
etc.) stored duplicated from mac80211 to be able to do this, so let
mac80211 help them.
The new ieee80211_restart_hw() function requires the same code as
resuming, so move that code into a new ieee80211_reconfig() function
in util.c and leave only the suspend code in pm.c.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
libertas: add support for Marvell SD8688 chip
Use RxPD->pkt_ptr to locate eth803 header in the packet
received since SD8688/v10 firmware allows a gap between
RxPD and eth803 header.
Set SDIO block size to 256 for CMD53.
The maximum block size for SD8688 WLAN function is set
to 512 in TPLFE_MAX_BLK_SIZE. But using 512 as block size
results upto 2K bytes data (4 blocks) being transferred
and causes buffer overflow in firmware.
Both changes above are backward compatible with earlier
firmware versions for SD8385/SD8686.
The SDIO_DEVICE_IDs for SD8688 chip are added in
include/linux/mmc/sdio_ids.h
Signed-off-by: Kiran Divekar <dkiran@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds support for the LEDs on the Jupiter netbook.
Reported-by: Martin Bammer <mrb74@gmx.at>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds the necessary code and fields to let drivers specify
their cipher capabilities and exports them to userspace. Also
update mac80211 to export the ciphers it has.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This informs userspace when a change has occured on a world
roaming wiphy's channel which has lifted some restrictions
due to a regulatory beacon hint.
Because this is now sent to userspace through the regulatory
multicast group we remove the debug prints we used to use as
they are no longer necessary.
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of just passing the cfg80211-requested IEs, pass
the locally generated ones as well.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch introduces a new attribute for a wiphy that tells
userspace how long the information elements added to a probe
request frame can be at most. It also updates the at76 to
advertise that it cannot support that, and, for now until I
can fix that, iwlwifi too.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Added cfg80211_inform_bss() for full-mac devices to use.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Define a new nl80211 event, NL80211_CMD_MICHAEL_MIC_FAILURE, to be
used to notify user space about locally detected Michael MIC failures.
This matches with the MLME-MICHAELMICFAILURE.indication() primitive.
Since we do not actually have TSC in the skb anymore when
mac80211_ev_michael_mic_failure() is called, that function is changed
to take in the TSC as an optional parameter instead of as a
requirement to include the TSC after the hdr field (which we did not
really follow). For now, TSC is not included in the events from
mac80211, but it could be added at some point.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Previously, nl80211 mlme events were generated only for received
deauthentication and disassociation frames. We need to do the same for
locally generated ones in order to let applications know that we
disconnected (e.g., when AP does not reply to a probe). Rename the
nl80211 and cfg80211 functions (s/rx_//) to make it clearer that they
are used for both received and locally generated frames.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Remove duplicated #include in include/net/cfg80211.h.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Almost all drivers do not support user_claim, so remove it
completely and always report -EOPNOTSUPP to userspace. Since
userspace cannot really drive rfkill _anyway_ (due to the
odd restrictions imposed by the documentation) having this
code is just pointless.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I only did superficial review, but these constants are stupid
to have and without proper warnings nobody will review the
code anyway, no amount of shouting will help.
Also fix wimax to use correct states.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Many modern CODECs have shared resources on chip which must be enabled
for portions of the chip to work but which can be disabled at other times
in order to achieve power savings. Examples of such resources include
power supplies and some internal clocks.
Since these widgets are dependencies for the audio path but do not carry
audio signals they require slightly different handling to most widgets -
they do not contribute to the audio path and so should not be counted as
either inputs or outputs during path walks.
Cases where one supply provides a supply for another will require
additional work. There is also room for more optimisation of the graph
walking to avoid repeated checks for the same thing.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
When checking for overlapping slots on registration of a new one, kvm
currently also considers zero-length (ie. deleted) slots and rejects
requests incorrectly. This finally denies user space from joining slots.
Fix the check by skipping deleted slots and advertise this via a
KVM_CAP_JOIN_MEMORY_REGIONS_WORKS.
Cc: stable@kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The __get_str() macro is used in a code part then its content should be
protected with parenthesis.
[ Impact: make macro definition more robust ]
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Now that we can support the dynamic sized string, make the lock tracing
able to use it, making it safe against modules removal and consuming
the right amount of memory needed for each lock name
Changes in v2:
adapt to the __ending_string() updates and the opening_string() removal.
[ Impact: protect lock tracer against module removal ]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
This patch provides the support for dynamic size strings on
event tracing.
The key concept is to use a structure with an ending char array field of
undefined size and use such ability to allocate the minimal size on the
ring buffer to make one or more string entries fit inside, as opposite
to a fixed length strings with upper bound.
The strings themselves are represented using fields which have an offset
value from the beginning of the entry.
This patch provides three new macros:
__string(item, src)
This one declares a string to the structure inside TP_STRUCT__entry.
You need to provide the name of the string field and the source that will
be copied inside.
This will also add the dynamic size of the string needed for the ring
buffer entry allocation.
A stack allocated structure is used to temporarily store the offset
of each strings, avoiding double calls to strlen() on each event
insertion.
__get_str(field)
This one will give you a pointer to the string you have created. This
is an abstract helper to resolve the absolute address given the field
name which is a relative address from the beginning of the trace_structure.
__assign_str(dst, src)
Use this macro to automatically perform the string copy from src to
dst. src must be a variable to assign and dst is the name of a __string
field.
Example on how to use it:
TRACE_EVENT(my_event,
TP_PROTO(char *src1, char *src2),
TP_ARGS(src1, src2),
TP_STRUCT__entry(
__string(str1, src1)
__string(str2, src2)
),
TP_fast_assign(
__assign_str(str1, src1);
__assign_str(str2, src2);
),
TP_printk("%s %s", __get_str(src1), __get_str(src2))
)
Of course you can mix-up any __field or __array inside this
TRACE_EVENT. The position of the __string or __assign_str
doesn't matter.
Changes in v2:
Address the suggestion of Steven Rostedt: drop the opening_string() macro
and redefine __ending_string() to get the size of the string to be copied
instead of overwritting the whole ring buffer allocation.
Changes in v3:
Address other suggestions of Steven Rostedt and Peter Zijlstra with
some changes: drop the __ending_string and the need to have only one
string field.
Use offsets instead of absolute addresses.
[ Impact: allow more compact memory usage for string tracing ]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
struct trace_entry->type is unsigned char, while trace event's id is
int type, thus for a event with id >= 256, it's entry->type is cast
to (id % 256), and then we can't see the trace output of this event.
# insmod trace-events-sample.ko
# echo foo_bar > /mnt/tracing/set_event
# cat /debug/tracing/events/trace-events-sample/foo_bar/id
256
# cat /mnt/tracing/trace_pipe
<...>-3548 [001] 215.091142: Unknown type 0
<...>-3548 [001] 216.089207: Unknown type 0
<...>-3548 [001] 217.087271: Unknown type 0
<...>-3548 [001] 218.085332: Unknown type 0
[ Impact: fix output for trace events with id >= 256 ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
/proc/diskstats used to show stats for all disks whether they're
zero-sized or not and their non-zero partitions. Commit
074a7aca7a accidentally changed the
behavior such that it doesn't print out zero sized disks. This patch
implements DISK_PITER_INCL_EMPTY_PART0 flag to partition iterator and
uses it in diskstats_show() such that empty part0 is shown in
/proc/diskstats.
Reported and bisectd by Dianel Collins.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Daniel Collins <solemnwarning@solemnwarning.no-ip.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Impact: fix bio_kmalloc() and its destruction path
bio_kmalloc() was broken in two ways.
* bvec_alloc_bs() first allocates bvec using kmalloc() and then
ignores it and allocates again like non-kmalloc bvecs.
* bio_kmalloc_destructor() didn't check for and free bio integrity
data.
This patch fixes the above problems. kmalloc patch is separated out
from bio_alloc_bioset() and allocates the requested number of bvecs as
inline bvecs.
* bio_alloc_bioset() no longer takes NULL @bs. None other than
bio_kmalloc() used it and outside users can't know how it was
allocated anyway.
* Define and use BIO_POOL_NONE so that pool index check in
bvec_free_bs() triggers if inline or kmalloc allocated bvec gets
there.
* Relocate destructors on top of each allocation function so that how
they're used is more clear.
Jens Axboe suggested allocating bvecs inline.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
There is currently only one way for userspace to say "wait for my storage
device to get ready for the modules I just loaded": to load the
scsi_wait_scan module. Expectations of userspace are that once this
module is loaded, all the (storage) devices for which the drivers
were loaded before the module load are present.
Now, there are some issues with the implementation, and the async
stuff got caught in the middle of this: The existing code only
waits for the scsy async probing to finish, but it did not take
into account at all that probing might not have begun yet.
(Russell ran into this problem on his computer and the fix works for him)
This patch fixes this more thoroughly than the previous "fix", which
had some bad side effects (namely, for kernel code that wanted to wait for
the scsi scan it would also do an async sync, which would deadlock if you did
it from async context already.. there's a report about that on lkml):
The patch makes the module first wait for all device driver probes, and then it
will wait for the scsi parallel scan to finish.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a comment typo in slow-work.h
...a trivial mistake, but it will mess up kerneldoc if nothing else.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Collect the DECLARE/DEFINE declarations together in linux/percpu-defs.h so
that they're in one place, and give them descriptive comments, particularly
the SHARED_ALIGNED variant.
It would be nice to collect these in linux/percpu.h, but that's not possible
without sorting out the severe #include recursion between the x86 arch headers
and the general headers (and possibly other arches too).
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
does not agree with that specified by DEFINE_PER_CPU(). This means that
architectures that have a small data section references relative to a base
register may throw up linkage errors due to too great a displacement between
where the base register points and the per-CPU variable.
On FRV, the .h declaration says that the variable is in the .sdata section, but
the .c definition says it's actually in the .data section. The linker throws
up the following errors:
kernel/built-in.o: In function `release_task':
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
as does DEFINE_PER_CPU(). However, this is made slightly more complex by
virtue of the fact that there are several variants on DEFINE, so these need to
be matched by variants on DECLARE.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This had been delayed for some time due to failure to work on the one piece
of G41 hardware we had, and lack of success reports from anybody else.
Current hardware appears to be OK.
Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
[anholt: hand-applied due to conflicts with IGD patches]
Signed-off-by: Eric Anholt <eric@anholt.net>
This is a doc-only patch which I hope will reduce the number of
spi_master controller driver patches starting out with a common
implementation bug.
(As in: almost every spi_master driver I see starts out with its
version of this bug. Sigh.)
It just re-emphasizes that the setup() method may be called for one
device while a transfer is active on another ... which means that most
driver implementations shouldn't touch any registers.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enable userspace to receive messages that a BMC transmits using an OEM
medium. This is used by the HP iLO2.
Based on code originally written by Patrick Schoeller.
Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The IPMI driver would attempt to use the event buffer even if that
didn't exist on the BMC. This patch modified the IPMI driver to check
for the event buffer's existence before trying to use it.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add enable() and disable() callbacks for clocksources.
This allows us to put unused clocksources in power save mode. The
functions clocksource_enable() and clocksource_disable() wrap the
callbacks and are inserted in the timekeeping code to enable before use
and disable after switching to a new clocksource.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pass clocksource pointer to the read() callback for clocksources. This
allows us to share the callback between multiple instances.
[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
reiserfs: fix j_last_flush_trans_id type
fs: Mark get_filesystem_list() as __init function.
kill vfs_stat_fd / vfs_lstat_fd
Separate out common fstatat code into vfs_fstatat
ecryptfs: use memdup_user()
ncpfs: use memdup_user()
xfs: use memdup_user()
sysfs: use memdup_user()
btrfs: use memdup_user()
xattr: use memdup_user()
autofs4: use memchr() in invalid_string()
Documentation/filesystems: remove out of date reference to BKL being held
Fix i_mutex vs. readdir handling in nfsd
fs/compat_ioctl: fix build when !BLOCK
Fix autofs_expire()
No need for crossing to mountpoint in audit_tag_tree()
Safer nfsd_cross_mnt()
Touch all affected namespaces on propagation of mount
Fix AUTOFS_DEV_IOCTL_REQUESTER_CMD
Older MIPS assembler don't support .set for defining aliases.
Using = works for old and new assembers.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
aio_write gets const struct iovec * but tun_chr_aio_write casts this to struct
iovec * and modifies the iovec. As a result, attempts to use io_submit
to send packets to a tun device fail with weird errors such as EINVAL.
Since tun is the only user of skb_copy_datagram_from_iovec, we can
fix this simply by changing the later so that it does not
touch the iovec passed to it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's an skb_copy_datagram_iovec() to copy out of a paged skb,
but it modifies the iovec, and does not support starting
at an offset in the destination. We want both in tun.c, so let's
add the function.
It's a carbon copy of skb_copy_datagram_iovec() with enough changes to
be annoying.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
because of using the same function get_ethernet_addr as cdc_ether.c
i export usbnet_get_ethernet_addr from usbnet and fixed cdc_ether
(suggested by Oliver Neukum).
Signed-off-by: Peter Holik <peter@holik.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conversion in commit 600ed41675 had missed
that one, but converted format from %lu to %u. As the result,
/proc/..../journal got buggered on 64bit boxen.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
"int get_filesystem_list(char * buf)" is called by only
"static void __init get_fs_names(char *page)".
We can mark get_filesystem_list() as "__init".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
There's really no reason to keep vfs_stat_fd and vfs_lstat_fd with
Oleg's vfs_fstatat. Use vfs_fstatat for the few cases having the
directory fd, and switch all others to vfs_stat / vfs_lstat.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This is a version incorporating Christoph's suggestion.
Separate out common *fstatat functionality into a single function
instead of duplicating it all over the code.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This function was left orphan by the latest round of sw-counter
cleanups.
[ Impact: remove unused kernel function ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rather than having switch statements at point of use make the DAPM
power check a member of the widget structure and set it when we
instantiate the widget.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Due to a cut and paste error, the trace_seq_putc had a semicolon
after the prototype but before the stub function when tracing is
disabled.
[Impact: fix compile error ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
last_synq_overflow eats 4 or 8 bytes in struct tcp_sock, even
though it is only used when a listening sockets syn queue
is full.
We can (ab)use rx_opt.ts_recent_stamp to store the same information;
it is not used otherwise as long as a socket is in listen state.
Move linger2 around to avoid splitting struct mtu_probe
across cacheline boundary on 32 bit arches.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 900af0d973 (PM: Change suspend
code ordering) changed the ordering of suspend code in such a way
that the platform .prepare() callback is now executed after the
device drivers' late suspend callbacks have run. Unfortunately, this
turns out to break ARM platforms that need to talk via I2C to power
control devices during the .prepare() callback.
For this reason introduce two new platform suspend callbacks,
.prepare_late() and .wake(), that will be called just prior to
disabling non-boot CPUs and right after bringing them back on line,
respectively, and use them instead of .prepare() and .finish() for
ACPI suspend. Make the PM core execute the .prepare() and .finish()
platform suspend callbacks where they were executed previously (that
is, right after calling the regular suspend methods provided by
device drivers and right before executing their regular resume
methods, respectively).
It is not necessary to make analogous changes to the hibernation
code and data structures at the moment, because they are only used
by ACPI platforms.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Len Brown <len.brown@intel.com>
<linux/seccomp.h> uses EINVAL so should include <linux/errno.h>. This
fixes a build error on 64-bit MIPS if CONFIG_SECCOMP is disabled.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, when x2apic is not enabled, interrupt remapping
will be enabled in init_dmars(), where it is too late to remap
ioapic interrupts, that is, ioapic interrupts are really in
compatibility mode, not remappable mode.
This patch always enables interrupt remapping before ioapic
setup, it guarantees all interrupts will be remapped when
interrupt remapping is enabled. Thus it doesn't need to set
the compatibility interrupt bit.
[ Impact: refactor intr-remap init sequence, enable fuller remap mode ]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: allen.m.kay@intel.com
Cc: fenghua.yu@intel.com
LKML-Reference: <1239957736-6161-4-git-send-email-weidong.han@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: remove fields and code paths which are no longer necessary
Now that ide-tape uses standard mechanisms to transfer data, special
case handling for bh handling can be dropped from ide-atapi. Drop the
followings.
* pc->cur_pos, b_count, bh and b_data
* drive->pc_update_buffers() and pc_io_buffers().
Signed-off-by: Tejun Heo <tj@kernel.org>
Impact: unify request data buffer handling
rq->data is used mostly to pass kernel buffer through request queue
without using bio. There are only a couple of places which still do
this in kernel and converting to bio isn't difficult.
This patch converts ide-cd and atapi to use bio instead of rq->data
for request sense and internal pc commands. With previous change to
unify sense request handling, this is relatively easily achieved by
adding blk_rq_map_kern() during sense_rq prep and PC issue.
If blk_rq_map_kern() fails for sense, the error is deferred till sense
issue and aborts the failed command which triggered the sense. Note
that this is a slim possibility as sense prep is done on each command
issue, so for the above condition to actually trigger, all preps since
the last sense issue till the issue of the request which would require
a sense should fail.
* do_request functions might sleep now. This should be okay as ide
request_fn - do_ide_request() - is invoked only from make_request
and plug work. Make sure this is the case by adding might_sleep()
to do_ide_request().
* Functions which access the read sense data before the sense request
is complete now should access bio_data(sense_rq->bio) as the sense
buffer might have been copied during blk_rq_map_kern().
* ide-tape updated to map sg.
* cdrom_do_block_pc() now doesn't have to deal with REQ_TYPE_ATA_PC
special case. Simplified.
* tp_ops->output/input_data path dropped from ide_pc_intr().
Signed-off-by: Tejun Heo <tj@kernel.org>
Since we're issuing REQ_TYPE_SENSE now we need to allow those types of
rqs in the ->do_request callbacks. As a future improvement, sense_len
assignment might be unified across all ATAPI devices. Borislav to
check with specs and test.
As a result, get rid of ide_queue_pc_head() and
drive->request_sense_rq.
tj: * Init request sense ide_atapi_pc from sense request. In the
longer timer, it would probably better to fold
ide_create_request_sense_cmd() into its only current user -
ide_floppy_get_format_progress().
* ide_retry_pc() no longer takes @disk.
CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
This is in preparation of removing the queueing of a sense request out
of the IRQ handler path.
Use struct request_sense as a general sense buffer for all ATAPI
devices ide-{floppy,tape,cd}.
tj: * blk_get_request(__GFP_WAIT) can't be called from do_request() as
it can cause deadlock. Converted to use inline struct request
and blk_rq_init().
* Added xfer / cdb len selection depending on device type.
* All sense prep logics folded into ide_prep_sense() which never
fails.
* hwif->rq clearing and sense_rq used handling moved into
ide_queue_sense_rq().
* blk_rq_map_kern() conversion is moved to later patch.
CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Due to a cut and paste error, I added the gcc attribute for printf
format to the static inline stub of trace_seq_printf.
This will cause a compile failure.
[ Impact: fix compiler error when CONFIG_TRACING is off ]
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: =?ISO-8859-15?Q?Fr=E9d=E9ric_Weisbecker?= <fweisbec@gmail.com>
LKML-Reference: <alpine.DEB.2.00.0904171717080.1016@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The rotary encoder driver only supports returning input events
for ABS_* axes, this adds support for REL_* axes. The relative
axis input event is reported as -1 for each counter-clockwise
step and +1 for each clockwise step.
The ability to clamp the position of ABS_* axes between 0 and
a maximum of "steps" has also been added.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
UIO: fix specific device driver missing statement for depmod
Driver core: remove pr_fmt() from dynamic_dev_dbg() printk
driver core: prevent device_for_each_child from oopsing
dynamic debug: resurrect old pr_debug() semantics as pr_devel()
Driver Core: early platform driver
proc: mounts_poll() make consistent to mdstat_poll
sysfs: sysfs poll keep the poll rule of regular file.
driver core: allow non-root users to listen to uevents
driver core: fix driver_match_device
sysfs: don't use global workqueue in sysfs_schedule_callback()
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (22 commits)
WUSB: correct format of wusb_chid sysfs file
WUSB: fix oops when completing URBs for disconnected devices
WUSB: disconnect all devices when stopping a WUSB HCD
USB: whci-hcd: check return value of usb_hcd_link_urb_to_ep()
USB: whci-hcd: provide a endpoint_reset method
USB: add reset endpoint operations
USB device codes for Motorola phone.
usb-storage: fix mistake in Makefile
USB: usb-serial ch341: support for DTR/RTS/CTS
Revert USB: usb-serial ch341: support for DTR/RTS/CTS
USB: musb: fix possible panic while resuming
USB: musb: fix isochronous TXDMA (take 2)
USB: musb: sanitize clearing TXCSR DMA bits (take 2)
USB: musb: bugfixes for multi-packet TXDMA support
USB: musb_host, fix ep0 fifo flushing
USB: usb-storage: augment unusual_devs entry for Simple Tech/Datafab
USB: musb_host, minor enqueue locking fix (v2)
USB: fix oops in cdc-wdm in case of malformed descriptors
USB: qcserial: Add extra device IDs
USB: option: Add ids for D-Link DWM-652 3.5G modem
...
The i915 DRM triggers registration of the ACPI video driver on load. It
should unregister it at unload in order to avoid generating backtraces on
being reloaded.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
The tracing infrastructure allows for recursion. That is, an interrupt
may interrupt the act of tracing an event, and that interrupt may very well
perform its own trace. This is a recursive trace, and is fine to do.
The problem arises when there is a bug, and the utility doing the trace
calls something that recurses back into the tracer. This recursion is not
caused by an external event like an interrupt, but by code that is not
expected to recurse. The result could be a lockup.
This patch adds a bitmask to the task structure that keeps track
of the trace recursion. To find the interrupt depth, the following
algorithm is used:
level = hardirq_count() + softirq_count() + in_nmi;
Here, level will be the depth of interrutps and softirqs, and even handles
the nmi. Then the corresponding bit is set in the recursion bitmask.
If the bit was already set, we know we had a recursion at the same level
and we warn about it and fail the writing to the buffer.
After the data has been committed to the buffer, we clear the bit.
No atomics are needed. The only races are with interrupts and they reset
the bitmask before returning anywy.
[ Impact: detect same irq level trace recursion ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The CONFIG_EVENT_TRACER is the way to turn on event tracing when no
other tracing has been configured. All code to get enabled should
depend on CONFIG_EVENT_TRACING. That is what is enabled when TRACING
(or CONFIG_EVENT_TRACER) is selected.
This patch enables the include/trace/ftrace.h file when
CONFIG_EVENT_TRACING is enabled.
[ Impact: fix warning in event tracer selftest ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Wireless USB endpoint state has a sequence number and a current
window and not just a single toggle bit. So allow HCDs to provide a
endpoint_reset method and call this or clear the software toggles as
required (after a clear halt, set configuration etc.).
usb_settoggle() and friends are then HCD internal and are moved into
core/hcd.h and all device drivers call usb_reset_endpoint() instead.
If the device endpoint state has been reset (with a clear halt) but
the host endpoint state has not then subsequent data transfers will
not complete. The device will only work again after it is reset or
disconnected.
Signed-off-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This renames include/asm-h8300/timer.h into arch/h8300/include/asm: it
was left over just because that file had been created in the -mm tree
before the whole h8300 header subdirectory had been moved, and then got
merged in the old location afterwards.
(See commits e0b0f9e4ea: "h8300: update
timer handler - new files" and 758db3f211:
"[h8300] move include/asm-h8300 to arch/h8300/include/asm" for details).
This also removes a left-over .gitignore file in include/asm-arm that
became stale when the ARM header files were moved (which happened in
multiple commits, just see "git log -- include/asm-arm" for details).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://www.linux-m32r.org/git/takata/linux-2.6_dev:
m32r: move include/asm-m32r/* to arch/m32r/include/asm/
m32r: move include/asm-m32r headers to arch/m32r/include/asm
kmem_event_types.h is no longer necessary since tracepoint definitions
are put into include/trace/events/kmem.h
[ Impact: remove now-unused file. ]
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49E7EF37.2080205@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tracepoints with no arguments can issue two warnings:
"field" defined by not used
"ret" is uninitialized in this function
Mark field as being OK to leave unused, and initialize ret.
[ Impact: fix false positive compiler warnings. ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: mathieu.desnoyers@polymtl.ca
LKML-Reference: <1239950139-1119-5-git-send-email-jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, every thing needed to read the binary output from the
ring buffers is available, with the exception of the way the ring
buffers handles itself internally.
This patch creates two special files in the debugfs/tracing/events
directory:
# cat /debug/tracing/events/header_page
field: u64 timestamp; offset:0; size:8;
field: local_t commit; offset:8; size:8;
field: char data; offset:16; size:4080;
# cat /debug/tracing/events/header_event
type : 2 bits
len : 3 bits
time_delta : 27 bits
array : 32 bits
padding : type == 0
time_extend : type == 1
data : type == 3
This is to allow a userspace app to see if the ring buffer format changes
or not.
[ Impact: allow userspace apps to know of ringbuffer format changes ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The hooks in the module code for the function tracer must be called
before any of that module code runs. The function tracer hooks
modify the module (replacing calls to mcount to nops). If the code
is executed while the change occurs, then the CPU can take a GPF.
To handle the above with a bit of paranoia, I originally implemented
the hooks as calls directly from the module code.
After examining the notifier calls, it looks as though the start up
notify is called before any of the module's code is executed. This makes
the use of the notify safe with ftrace.
Only the startup notify is required to be "safe". The shutdown simply
removes the entries from the ftrace function list, and does not modify
any code.
This change has another benefit. It removes a issue with a reverse dependency
in the mutexes of ftrace_lock and module_mutex.
[ Impact: fix lock dependency bug, cleanup ]
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
kernel/softirq.c: fix sparse warning
rcu: Make hierarchical RCU less IPI-happy
When pr_fmt() was added to the pr_debug() code, we added it not only to the
dynamic_pr_debug() function, but also to the dynamic_dev_dbg() funciton.
However, dev_dbg() doesn't make use of pr_fmt(), so neither should
dynamic_dev_dbg().
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
pr_debug() used to produce zero code unless DEBUG was #defined. This is
now no longer the case in practice[1].
There are places where it's useful to have debugging printks, but we don't
want them to generate any code in production kernels.
So add a new macro, pr_devel(), for _devel_opment, to provide the old
semantics, ie. if the programmer doesn't explicitly enable debugging, no
code is produced.
[1]: You can turn CONFIG_DYNAMIC_DEBUG off, but it's enabled in at least
one distro kernel, so it's not really a solution.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Greg Banks <gnb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
V3 of the early platform driver implementation.
Platform drivers are great for embedded platforms because we can separate
driver configuration from the actual driver. So base addresses,
interrupts and other configuration can be kept with the processor or board
code, and the platform driver can be reused by many different platforms.
For early devices we have nothing today. For instance, to configure early
timers and early serial ports we cannot use platform devices. This
because the setup order during boot. Timers are needed before the
platform driver core code is available. The same goes for early printk
support. Early in this case means before initcalls.
These early drivers today have their configuration either hard coded or
they receive it using some special configuration method. This is working
quite well, but if we want to support both regular kernel modules and
early devices then we need to have two ways of configuring the same
driver. A single way would be better.
The early platform driver patch is basically a set of functions that allow
drivers to register themselves and architecture code to locate them and
probe. Registration happens through early_param(). The time for the
probe is decided by the architecture code.
See Documentation/driver-model/platform.txt for more details.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: David Brownell <david-b@pacbell.net>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The legacy old IDE ioctl API for this is a bit primitive so we try
and map stuff sensibly onto it.
- Set PIO over DMA devices to report 32bit
- Add ability to change the PIO32 settings if the controller permits it
- Add that functionality into the sff drivers
- Add that functionality into the VLB legacy driver
- Turn on the 32bit PIO on the ninja32 and add support there
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The removal of the SAME target accidentally removed one feature that is
not available from the normal NAT targets so far, having multi-range
mappings that use the same mapping for each connection from a single
client. The current behaviour is to choose the address from the range
based on source and destination IP, which breaks when communicating
with sites having multiple addresses that require all connections to
originate from the same IP address.
Introduce a IP_NAT_RANGE_PERSISTENT option that controls whether the
destination address is taken into account for selecting addresses.
http://bugzilla.kernel.org/show_bug.cgi?id=12954
Signed-off-by: Patrick McHardy <kaber@trash.net>
In commit 364fdbc00f ("spi_mpc83xx:
rework chip selects handling"), I merged activate_cs and deactivate_cs
hooks into cs_control, but I overlooked that mpc52xx_psc_spi driver
is using these hooks too. And that resulted in the following build
failure:
CC drivers/spi/mpc52xx_psc_spi.o
drivers/spi/mpc52xx_psc_spi.c: In function 'mpc52xx_psc_spi_do_probe':
drivers/spi/mpc52xx_psc_spi.c:398: error: 'struct fsl_spi_platform_data'
has no member named 'activate_cs'
drivers/spi/mpc52xx_psc_spi.c:399: error: 'struct fsl_spi_platform_data'
has no member named 'deactivate_cs'
make[2]: *** [drivers/spi/mpc52xx_psc_spi.o] Error 1
This patch simply adds the legacy hooks back for 2.6.30, and for
2.6.31 we'll convert the driver to ->cs_control.
Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
block_write_full_page doesn't allow the caller to control what happens
when the IO is over. This adds a new call named block_write_full_page_endio
so the buffer head end_io handler can be provided by the caller.
This will be used by the ext3 data=guarded mode to do i_size updates in
a workqueue based end_io handler. end_buffer_async_write is also
exported so it can be called to do the dirty work of managing page
writeback for the higher level end_io handler.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Acked-by: Theodore Tso <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (64 commits)
phylib: Fix delay argument of schedule_delayed_work
NET/ixgbe: Fix powering off during shutdown
NET/e1000e: Fix powering off during shutdown
NET/e1000: Fix powering off during shutdown
packet: avoid warnings when high-order page allocation fails
gianfar: stop send queue before resetting gianfar
myr10ge: again fix lro_gen_skb() alignment
declance: convert to net_device_ops
bfin_mac: convert to net_device_ops
au1000: convert to net_device_ops
atarilance: convert to net_device_ops
a2065: convert to net_device_ops
ixgbe: update real_num_tx_queues on changing num_rx_queues
ixgbe: fix tx queue index
Revert "rose: zero length frame filtering in af_rose.c"
sfc: Use correct macro to set event bitfield
sfc: Match calls to netif_napi_add() and netif_napi_del()
bonding: Remove debug printk
e1000/e1000: fix compile warning
ehea: Fix incomplete conversion to net_device_ops
...
It turns out that copying a 16-byte area at ~800k times a second
can be really expensive :) This patch redesigns the frags GRO
interface to avoid copying that area twice.
The two disciples of the frags interface have been converted.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Though one can specify '-d /dev/sda1' when using blktrace, it still
traces the whole sda.
To support per-partition tracing, when we start tracing, we initialize
bt->start_lba and bt->end_lba to the start and end sector of that
partition.
Note some actions are per device, thus we don't filter 0-sector events.
The original patch and discussion can be found here:
http://marc.info/?l=linux-btrace&m=122949374214540&w=2
Signed-off-by: Shawn Du <duyuyang@gmail.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
LKML-Reference: <49E42620.4050701@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The default CONFIG_BUG=n version of BUG() should incorporate an empty a
do...while statement to avoid compilation weirdness.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix the cmd cache keys for amp verbs
ALSA: add missing definitions(letters) to HD-Audio.txt
ALSA: hda - Add quirk mask for Fujitsu Amilo laptops with ALC883
[ALSA] intel8x0: add one retry to the ac97_clock measurement routine
[ALSA] intel8x0: fix wrong conditions in ac97_clock measure routine
ALSA: hda - Avoid call of snd_jack_report at release
ALSA: add private_data to struct snd_jack
ALSA: snd-usb-caiaq: rename files to remove redundant information in file pathes
ALSA: snd-usb-caiaq: clean up header includes
ALSA: sound/pci: use memdup_user()
ALSA: sound/usb: use memdup_user()
ALSA: sound/isa: use memdup_user()
ALSA: sound/core: use memdup_user()
[ALSA] intel8x0: do not use zero value from PICB register
[ALSA] intel8x0: an attempt to make ac97_clock measurement more reliable
[ALSA] pcm-midlevel: Add more strict buffer position checks based on jiffies
[ALSA] hda_intel: fix unexpected ring buffer positions
ASoC: Disable S3C64xx support in Kconfig
ASoC: magician: remove un-necessary #include of pxa-regs.h and hardware.h
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (28 commits)
cfq-iosched: add close cooperator code
cfq-iosched: log responsible 'cfqq' in idle timer arm
cfq-iosched: tweak kick logic a bit more
cfq-iosched: no need to save interrupts in cfq_kick_queue()
brd: fix cacheflushing
brd: support barriers
swap: Remove code handling bio_alloc failure with __GFP_WAIT
gfs2: Remove code handling bio_alloc failure with __GFP_WAIT
ext4: Remove code handling bio_alloc failure with __GFP_WAIT
dio: Remove code handling bio_alloc failure with __GFP_WAIT
block: Remove code handling bio_alloc failure with __GFP_WAIT
bio: add documentation to bio_alloc()
splice: add helpers for locking pipe inode
splice: remove generic_file_splice_write_nolock()
ocfs2: fix i_mutex locking in ocfs2_splice_to_file()
splice: fix i_mutex locking in generic_splice_write()
splice: remove i_mutex locking in splice_from_pipe()
splice: split up __splice_from_pipe()
block: fix SG_IO to return a proper error value
cfq-iosched: don't delay queue kick for a merged request
...
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: pseries/dtl.c should include asm/firmware.h
powerpc: Fix data-corrupting bug in __futex_atomic_op
powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset()
powerpc: Allow 256kB pages with SHMEM
powerpc: Document new FSL I2C bindings and cleanup
powerpc/mm: Fix compile warning
powerpc/85xx: TQM8548: update defconfig
powerpc/85xx: TQM8548: use proper phy-handles for enet2 and enet3
powerpc/85xx: TQM85xx: correct address of LM75 I2C device nodes
powerpc: Add support for early tlbilx opcode
powerpc: Fix tlbilx opcode
We use a static value for the number of dma_debug_entries. It can be
overwritten by a kernel command line option.
Some IOMMUs (e.g. GART) can't set an appropriate value by a kernel
command line option because they can't know such value until they
finish initializing up their hardware.
This patch adds dma_debug_resize_entries() enables IOMMUs to adjust
the number of dma_debug_entries anytime.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: fujita.tomonori@lab.ntt.co.jp
Cc: akpm@linux-foundation.org
LKML-Reference: <20090415182234R.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are lots of sequences like this, especially in splice code:
if (pipe->inode)
mutex_lock(&pipe->inode->i_mutex);
/* do something */
if (pipe->inode)
mutex_unlock(&pipe->inode->i_mutex);
so introduce helpers which do the conditional locking and unlocking.
Also replace the inode_double_lock() call with a pipe_double_lock()
helper to avoid spreading the use of this functionality beyond the
pipe code.
This patch is just a cleanup, and should cause no behavioral changes.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Remove the now unused generic_file_splice_write_nolock() function.
It's conceptually broken anyway, because splice may need to wait for
pipe events so holding locks across the whole operation is wrong.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Rearrange locking of i_mutex on destination and call to
ocfs2_rw_lock() so locks are only held while buffers are copied with
the pipe_to_file() actor, and not while waiting for more data on the
pipe.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Split up __splice_from_pipe() into four helper functions:
splice_from_pipe_begin()
splice_from_pipe_next()
splice_from_pipe_feed()
splice_from_pipe_end()
splice_from_pipe_next() will wait (if necessary) for more buffers to
be added to the pipe. splice_from_pipe_feed() will feed the buffers
to the supplied actor and return when there's no more data available
(or if all of the requested data has been copied).
This is necessary so that implementations can do locking around the
non-waiting splice_from_pipe_feed().
This patch should not cause any change in behavior.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* 'master' of git://git.alsa-project.org/alsa-kernel:
[ALSA] intel8x0: add one retry to the ac97_clock measurement routine
[ALSA] intel8x0: fix wrong conditions in ac97_clock measure routine
[ALSA] intel8x0: do not use zero value from PICB register
[ALSA] intel8x0: an attempt to make ac97_clock measurement more reliable
[ALSA] pcm-midlevel: Add more strict buffer position checks based on jiffies
[ALSA] hda_intel: fix unexpected ring buffer positions
It's a somewhat twisty maze of hints and behavioural modifiers, try
and clear it up a bit with some documentation.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
It's used by DM and MD and generally useful, so move the bio list
helpers into bio.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Currently there are two possible platform datas for the PXA AC97 driver:
one supported by the generic AC97 driver only which provides callbacks
to allow board-specific configuration at stream startup and teardown,
and another for pxa2xx-ac97-lib which allows configuration of the reset
GPIO for PXA2xx CPUs.
Obviously this won't actually work when using the generic AC97 driver
since the drivers will attempt to parse the platform data in both
formats. Fix this by merging the two structures.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Impact: clean up
Create a sub directory in include/trace called events to keep the
trace point headers in their own separate directory. Only headers that
declare trace points should be defined in this directory.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Impact: fix compile error of lockdep event tracer
Ingo Molnar pointed out that the system name for the lockdep tracer was "lock"
which is used to include the event trace file name. It should be "lockdep"
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: fix scheduling while holding the new active list spinlock
drm/i915: Allow tiling of objects with bit 17 swizzling by the CPU.
drm/i915: Correctly set the write flag for get_user_pages in pread.
drm/i915: Fix use of uninitialized var in 40a5f0de
drm/i915: indicate framebuffer restore key in SysRq help message
drm/i915: sync hdmi detection by hdmi identifier with 2D
drm/i915: Fix a mismerge of the IGD patch (new .find_pll hooks missed)
drm/i915: Implement batch and ring buffer dumping
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: warn about lockdep disabling after kernel taint, fix
Impact: allow modules to add TRACE_EVENTS on load
This patch adds the final hooks to allow modules to use the TRACE_EVENT
macro. A notifier and a data structure are used to link the TRACE_EVENTs
defined in the module to connect them with the ftrace event tracing system.
It also adds the necessary automated clean ups to the trace events when a
module is removed.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Impact: makes it possible to define events in modules
The events are created by reading down the section that they are linked
in by the macros. But this is not scalable to modules. This patch converts
the manipulations to use a global link list, and on boot up it adds
the items in the section to the list.
This change will allow modules to add their tracing events to the list as
well.
Note, this change alone does not permit modules to use the TRACE_EVENT macros,
but the change is needed for them to eventually do so.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch moves the ftrace creation into include/trace/ftrace.h and
simplifies the work of developers in adding new tracepoints.
Just the act of creating the trace points in include/trace and including
define_trace.h will create the events in the debugfs/tracing/events
directory.
This patch removes the need of include/trace/trace_events.h
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In preparation to allowing trace events to happen in modules, we need
to move some of the local declarations in the kernel/trace directory
into include/linux.
This patch simply moves the declarations and performs no context changes.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In the process to make TRACE_EVENT macro work for modules, the trace_seq
operations must be available for core kernel code.
These operations are quite useful and can be used for other implementations.
The main idea is that we create a trace_seq handle that acts very much
like the seq_file handle.
struct trace_seq *s = kmalloc(sizeof(*s, GFP_KERNEL);
trace_seq_init(s);
trace_seq_printf(s, "some data %d\n", variable);
printk("%s", s->buffer);
The main use is to allow a top level function call several other functions
that may store printf like data into the buffer. Then at the end, the top
level function can process all the data with any method it would like to.
It could be passed to userspace, output via printk or even use seq_file:
trace_seq_to_user(s, ubuf, cnt);
seq_puts(m, s->buffer);
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch lowers the number of places a developer must modify to add
new tracepoints. The current method to add a new tracepoint
into an existing system is to write the trace point macro in the
trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
DECLARE_TRACE, then they must add the same named item into the C file
with the macro DEFINE_TRACE(name) and then add the trace point.
This change cuts out the needing to add the DEFINE_TRACE(name).
Every file that uses the tracepoint must still include the trace/<type>.h
file, but the one C file must also add a define before the including
of that file.
#define CREATE_TRACE_POINTS
#include <trace/mytrace.h>
This will cause the trace/mytrace.h file to also produce the C code
necessary to implement the trace point.
Note, if more than one trace/<type>.h is used to create the C code
it is best to list them all together.
#define CREATE_TRACE_POINTS
#include <trace/foo.h>
#include <trace/bar.h>
#include <trace/fido.h>
Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
the cleaner solution of the define above the includes over my first
design to have the C code include a "special" header.
This patch converts sched, irq and lockdep and skb to use this new
method.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
I've run into the situation where I need to use list_first_entry with
rcu-guarded list. This patch introduces this.
Also simplify list_for_each_entry_rcu() to use new list_entry_rcu()
instead of list_entry().
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: dipankar@in.ibm.com
LKML-Reference: <20090414153356.GC3999@psychotron.englab.brq.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
'777c6c5 wait: prevent exclusive waiter starvation' made
__wake_up_common() global to be used from abort_exclusive_wait().
It was needed to do a wake-up with the waitqueue lock held while
passing down a key to the wake-up function.
Since '4ede816 epoll keyed wakeups: add __wake_up_locked_key() and
__wake_up_sync_key()' there is an appropriate wrapper for this case:
__wake_up_locked_key().
Use it here and make __wake_up_common() private to the scheduler
again.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1239720785-19661-1-git-send-email-hannes@cmpxchg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Added private_data and private_free fields to struct snd_jack so that
the caller can assign the data. It'll be helpful for avoiding the
double-free of the jack instance.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The revoke records must be written using the same way as the rest of
the blocks during the commit process; that is, either marked as
synchronous writes or as asynchornous writes.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Impact: clean up
Neil Horman (et. al.) criticized the way the trace events were broken up
into two files. The reason for that was that ftrace needed to separate out
the declarations from where the #include <linux/tracepoint.h> was used.
It then dawned on me that the tracepoint.h header only needs to define the
TRACE_EVENT macro if it is not already defined.
The solution is simply to test if TRACE_EVENT is defined, and if it is not
then the linux/tracepoint.h header can define it. This change consolidates
all the <traces>.h and <traces>_event_types.h into the <traces>.h file.
Reported-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Theodore Tso <tytso@mit.edu>
Reported-by: Jiaying Zhang <jiayingz@google.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The revoke records must be written using the same way as the rest of
the blocks during the commit process; that is, either marked as
synchronous writes or as asynchornous writes.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
prototype of a driver for the digigram lx6464es 64 channel ethersound
interface.
Signed-off-by: Tim Blechmann <tim@klingt.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch fixes a hierarchical-RCU performance bug located by Anton
Blanchard. The problem stems from a misguided attempt to provide a
work-around for jiffies-counter failure. This work-around uses a per-CPU
n_rcu_pending counter, which is incremented on each call to rcu_pending(),
which in turn is called from each scheduling-clock interrupt. Each CPU
then treats this counter as a surrogate for the jiffies counter, so
that if the jiffies counter fails to advance, the per-CPU n_rcu_pending
counter will cause RCU to invoke force_quiescent_state(), which in turn
will (among other things) send resched IPIs to CPUs that have thus far
failed to pass through an RCU quiescent state.
Unfortunately, each CPU resets only its own counter after sending a
batch of IPIs. This means that the other CPUs will also (needlessly)
send -another- round of IPIs, for a full N-squared set of IPIs in the
worst case every three scheduler-clock ticks until the grace period
finally ends. It is not reasonable for a given CPU to reset each and
every n_rcu_pending for all the other CPUs, so this patch instead simply
disables the jiffies-counter "training wheels", thus eliminating the
excessive IPIs.
Note that the jiffies-counter IPIs do not have this problem due to
the fact that the jiffies counter is global, so that the CPU sending
the IPIs can easily reset things, thus preventing the other CPUs from
sending redundant IPIs.
Note also that the n_rcu_pending counter remains, as it will continue to
be used for tracing. It may also see use to update the jiffies counter,
should an appropriate kick-the-jiffies-counter API appear.
Located-by: Anton Blanchard <anton@au1.ibm.com>
Tested-by: Anton Blanchard <anton@au1.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: anton@samba.org
Cc: akpm@linux-foundation.org
Cc: dipankar@in.ibm.com
Cc: manfred@colorfullife.com
Cc: cl@linux-foundation.org
Cc: josht@linux.vnet.ibm.com
Cc: schamp@sgi.com
Cc: niv@us.ibm.com
Cc: dvhltc@us.ibm.com
Cc: ego@in.ibm.com
Cc: laijs@cn.fujitsu.com
Cc: rostedt@goodmis.org
Cc: peterz@infradead.org
Cc: penberg@cs.helsinki.fi
Cc: andi@firstfloor.org
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
LKML-Reference: <12396834793575-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix for Sparc and s390
Stephen Rothwell reported that the Sparc build broke:
In file included from kernel/panic.c:12:
include/linux/debug_locks.h: In function '__debug_locks_off':
include/linux/debug_locks.h:15: error: implicit declaration of function 'xchg'
due to:
9eeba61: lockdep: warn about lockdep disabling after kernel taint
There is some inconsistency between architectures about where exactly
xchg() is defined.
The traditional place is in system.h but the more logical point for it
is in atomic.h - where most architectures (especially new ones) have
it defined. These architecture also still offer it via system.h.
Some, such as Sparc or s390 only have it in asm/system.h and not available
via asm/atomic.h at all.
Use the widest set of headers in debug_locks.h and also include asm/system.h.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20090414144317.026498df.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch creates auditing functions usable by LSM to audit security
events. It provides standard dumping of FS, NET, task etc ... events
(code borrowed from SELinux)
and provides 2 callbacks to define LSM specific auditing, which should be
flexible enough to convert SELinux too.
Signed-off-by: Etienne Basset <etienne.basset@numericable.fr>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
cked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Support the Intel 854 Chipset in fbdev.
We test and use the patch on a Thomson IP1101 IPTV-Box. On the VGA-Port
we get a normal signal.
Here is the link to the Mambux-Project: http://www.mambux.de
Cc: Keith Packard <keithp@keithp.com>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Stefan Husemann <shusemann@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit ddb53d48da ("fbdev: remove cyblafb
driver") removed drivers/video/cyblafb.c, but not its .h file
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Jani Monoses" <jani@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: (nearly) trivial
The patch
commit da654b74bd
Author: Srinivasa Ds <srinivasa@in.ibm.com>
Date: Tue Sep 23 15:23:52 2008 +0530
signals: demultiplexing SIGTRAP signal
forgot to update the NSIGTRAP define in asm-generic/siginfo.h to the new
number of sigtrap subcodes. Nothing in the tree seems to use it, but
presumably something in user space might. So update it.
Cc: Srinivasa Ds <srinivasa@in.ibm.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Include <linux/types.h> in fiemap.h. Sam Ravnborg pointed out that this
was missing in this newly-exported header which uses the __u32 and __u64
types.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Data sheet at:
http://www.sensirion.ch/en/pdf/product_information/Datasheet-humidity-sensor-SHT1x.pdf
These sensors communicate over a 2 wire bus running a device specific
protocol. The complexity of the driver is mainly due to handling the
substantial delays between requesting a reading and the device pulling the
data line low to indicate that the data is available. This is handled by
an interrupt that is disabled under all other conditions.
I wasn't terribly clear on the best way to handle this, so comments on
that aspect would be particularly welcome!
Interpretation of the temperature depends on knowing the supply voltage.
If configured in a board config as a regulator consumer this is obtained
from the regulator subsystem. If not it should be provided in the
platform data.
I've placed this driver in the hwmon subsystem as it is definitely a
device that may be used for hardware monitoring and with it's relatively
slow response times (up to 120 millisecs to get a reading) a caching
strategy certainly seems to make sense!
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The documentation about the meaning of the color component bitfield
lengths in pseudocolor modes is inconsistent. Fix it, so that it
indicates the correct interpretation everywhere, i.e. that 1 << length is
the number of palette entries.
Signed-off-by: Michal Januszewski <spock@gentoo.org>
Acked-by: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: <syrjala@sci.fi>
Acked-by: Geert Uytterhoeven <geert.uytterhoeven@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new config option, CONFIG_EVENT_TRACING that gets selected
when CONFIG_TRACING is selected and adds everything needed by the stuff
in trace_export - basically all the event tracing support needed by e.g.
bprint, minus the actual events, which are only included if
CONFIG_EVENT_TRACER is selected.
So CONFIG_EVENT_TRACER can be used to turn on or off the generated events
(what I think of as the 'event tracer'), while CONFIG_EVENT_TRACING turns
on or off the base event tracing support used by both the event tracer and
the other things such as bprint that can't be configured out.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1239178441.10295.34.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The ring_buffer_discard_commit is similar to ring_buffer_event_discard
but it can only be done on an event that has yet to be commited.
Unpredictable results can happen otherwise.
The main difference between ring_buffer_discard_commit and
ring_buffer_event_discard is that ring_buffer_discard_commit will try
to free the data in the ring buffer if nothing has addded data
after the reserved event. If something did, then it acts almost the
same as ring_buffer_event_discard followed by a
ring_buffer_unlock_commit.
Note, either ring_buffer_commit_discard and ring_buffer_unlock_commit
can be called on an event, not both.
This commit also exports both discard functions to be usable by
GPL modules.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Marvell 88E1121R Dual PHY device can be hardware-configured
to use shared interrupt pin for both PHY ports. For such
PHY configurations using shared PHY interrupt phy_interrupt()
handler will also schedule a work for PHY port which didn't
cause an interrupt.
This patch adds a possibility for PHY drivers to provide
did_interrupt() function which reports if the PHY (or a PHY
port in a multi-PHY device) generated an interrupt. This
function is called in phy_change() as phy_change() shouldn't
proceed if it is invoked for a PHY which didn't cause an
interrupt. So check for interrupt originator in phy_change()
to allow early-out.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race between resume from hibernation and the asynchronous
scanning of SCSI devices and to prevent it from happening we need to
call scsi_complete_async_scans() during resume from hibernation.
In addition, if the resume from hibernation is userland-driven, it's
better to wait for all device probes in the kernel to complete before
attempting to open the resume device.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing/filters: return proper error code when writing filter file
tracing/filters: allow user input integer to be oct or hex
tracing/filters: fix NULL pointer dereference
tracing/filters: NIL-terminate user input filter
ftrace: Output REC->var instead of __entry->var for trace format
Make __stringify support variable argument macros too
tracing: fix document references
tracing: fix splice return too large
tracing: update file->f_pos when splice(2) it
tracing: allocate page when needed
tracing: disable seeking for trace_pipe_raw
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: continue lock debugging despite some taints
lockdep: warn about lockdep disabling after kernel taint
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: Let new-style drivers implement attach_adapter
i2c: Fix sparse warnings for I2C_BOARD_INFO()
i2c-voodoo3: Deprecate in favor of tdfxfb
i2c-algo-pca: Fix use of uninitialized variable in debug message
When POSIX capabilities were introduced during the 2.1 Linux
cycle, the fs mask, which represents the capabilities which having
fsuid==0 is supposed to grant, did not include CAP_MKNOD and
CAP_LINUX_IMMUTABLE. However, before capabilities the privilege
to call these did in fact depend upon fsuid==0.
This patch introduces those capabilities into the fsmask,
restoring the old behavior.
See the thread starting at http://lkml.org/lkml/2009/3/11/157 for
reference.
Note that if this fix is deemed valid, then earlier kernel versions (2.4
and 2.2) ought to be fixed too.
Changelog:
[Mar 23] Actually delete old CAP_FS_SET definition...
[Mar 20] Updated against J. Bruce Fields's patch
Reported-by: Igor Zhbanov <izh1979@gmail.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: stable@kernel.org
Cc: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
percpu: unbreak alpha percpu
mutex: have non-spinning mutexes on s390 by default
Since the first argument to I2C_BOARD_INFO() must be a string constant,
there is no need to parenthesise it, and adding parentheses results in
an invalid initialiser for char[]. gcc obviously accepts this syntax as
an extension, but sparse complains, e.g.:
drivers/net/sfc/boards.c:173:2: warning: array initialized from parenthesized string constant
Therefore, remove the parentheses.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Impact: provide useful missing info for developers
Kernel taint can occur in several situations such as warnings,
load of prorietary or staging modules, bad page, etc...
But when such taint happens, a developer might still be working on
the kernel, expecting that lockdep is still enabled. But a taint
disables lockdep without ever warning about it.
Such a kernel behaviour doesn't really help for kernel development.
This patch adds this missing warning.
Since the taint is done most of the time after the main message that
explain the real source issue, it seems safe to warn about it inside
add_taint() so that it appears at last, without hurting the main
information.
v2: Use a generic helper to disable lockdep instead of an
open coded xchg().
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1239412638-6739-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
TRACE_EVENT is a more generic way to define tracepoints.
Doing so adds these new capabilities to this tracepoint:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DEE6DA.80600@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: refactor code for future changes
Current kmemtrace.h is used both as header file of kmemtrace and kmem's
tracepoints definition.
Tracepoints' definition file may be used by other code, and should only have
definition of tracepoint.
We can separate include/trace/kmemtrace.h into 2 files:
include/linux/kmemtrace.h: header file for kmemtrace
include/trace/kmem.h: definition of kmem tracepoints
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Noises can be introduced when LCD signals are being driven, some platforms
provide a signal to assist the synchronization of this sampling procedure.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
For the time being, move the generic percpu_*() accessors to
linux/percpu.h.
asm-generic/percpu.h is meant to carry generic stuff for low level
stuff - declarations, definitions and pointer offset calculation
and so on but not for generic interface.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
Separate out the proc- and unit-specific header directories from the general
Move arch headers from include/asm-mn10300/ to arch/mn10300/include/asm/.
For example:
__stringify(__entry->irq, __entry->ret)
will now convert it to:
"REC->irq, REC->ret"
It also still supports single arguments as the old macro did.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <49DC6751.30308@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
TRACE_EVENT is a more generic way to define a tracepoint.
Doing so adds these new capabilities to this tracepoint:
- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "Steven Rostedt ;" <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <49DD90D2.5020604@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While trying to optimize the new lock on reiserfs to replace
the bkl, I find the lock tracing very useful though it lacks
something important for performance (and latency) instrumentation:
the time a task waits for a lock.
That's what this patch implements:
bash-4816 [000] 202.652815: lock_contended: lock_contended: &sb->s_type->i_mutex_key
bash-4816 [000] 202.652819: lock_acquired: &rq->lock (0.000 us)
<...>-4787 [000] 202.652825: lock_acquired: &rq->lock (0.000 us)
<...>-4787 [000] 202.652829: lock_acquired: &rq->lock (0.000 us)
bash-4816 [000] 202.652833: lock_acquired: &sb->s_type->i_mutex_key (16.005 us)
As shown above, the "lock acquired" field is followed by the time
it has been waiting for the lock. Usually, a lock contended entry
is followed by a near lock_acquired entry with a non-zero time waited.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1238975373-15739-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: pick up both v2.6.30-rc1 [which includes tracing/urgent fixes]
and pick up the current lineup of tracing/urgent fixes as well
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some drivers like Intel8x0 or Intel HDA are broken for some hardware variants.
This patch adds more strict buffer position checks based on jiffies when
internal hw_ptr is updated. Enable xrun_debug to see mangling of wrong
positions.
As a side effect, the hw_ptr interrupt update routine might do slightly better
job when many interrupts are lost.
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (27 commits)
xsysace: Fix dereferencing of cf_id after hd_driveid removal
at91_ide: turn on PIO 6 support
at91_ide: remove unused ide_mm_{outb,inb}
ide-cd: reverse NOT_READY sense key logic
ide: refactor tf_read() method
ide: refactor tf_load() method
ide: call write_devctl() method from tf_read() method
ide: move common code out of tf_load() method
ide: simplify 'struct ide_taskfile'
ide: replace IDE_TFLAG_* flags by IDE_VALID_*
ide-cd: fix intendation in cdrom_decode_status()
ide-cd: unify handling of fs and pc requests in cdrom_decode_status()
ide-cd: convert cdrom_decode_status() to use switch statements
ide-cd: update debugging support
ide-cd: respect REQ_QUIET for fs requests in cdrom_decode_status()
ide: remove unused #include <linux/version.h>
tx4939ide: Fix tx4939ide_{in,out}put_data_swap argument
tx493[89]ide: Remove big endian version of tx493[89]ide_tf_{load,read}
ide-cd: carve out an ide_cd_breathe()-helper for fs write requests
ide-cd: move status checking into the IRQ handler
...
asm-frv/pgtable.h could just #include <asm-generic/pgtable.h> in NOMMU mode
rather than #defining macros for lazy MMU and CPU stuff.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: consolidate documents
blktrace: pass the right pointer to kfree()
tracing/syscalls: use a dedicated file header
tracing: append a comma to INIT_FTRACE_GRAPH
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: do not count frozen tasks toward load
sched: refresh MAINTAINERS entry
sched: Print sched_group::__cpu_power in sched_domain_debug
cpuacct: add per-cgroup utime/stime statistics
posixtimers, sched: Fix posix clock monotonicity
sched_rt: don't allocate cpumask in fastpath
cpuacct: make cpuacct hierarchy walk in cpuacct_charge() safe when rcupreempt is used -v2
Since the whole point of try_then_request_module is to retry
the operation after a module has been loaded, we must wait for
the module to fully load.
Otherwise all sort of things start breaking, e.g., you won't
be able to read your encrypted disks on the first attempt.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Tested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: add sysctl for paranoid/relaxed perfcounters policy
Allow the use of system wide perf counters to everybody, but provide
a sysctl to disable it for the paranoid security minded.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090409085524.514046352@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).
Some applications which perform job-scheduling functions consult the
load average when making decisions. If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications. This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs. Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.
Change task_contributes_to_load() to return false if the task is
frozen. This results in /proc/loadavg behavior that better meets
users' expectations.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Nigel Cunningham <nigel@tuxonice.net>
Tested-by: Nigel Cunningham <nigel@tuxonice.net>
Cc: <stable@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: linux-pm@lists.linux-foundation.org
Cc: Matt Helsley <matthltc@us.ibm.com>
LKML-Reference: <20090408194512.47a99b95@manatee.lan>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix build warnings and possibe compat misbehavior on IA64
Building a kernel on ia64 might trigger these ugly build warnings:
CC arch/ia64/ia32/sys_ia32.o
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
In file included from include/linux/elf.h:7,
from include/linux/module.h:14,
from include/linux/ftrace.h:8,
from include/linux/syscalls.h:68,
from arch/ia64/ia32/sys_ia32.c:18:
arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition
[...]
sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
to import the syscalls tracing prototypes.
But including ftrace.h can pull too much things for a low level file,
especially on ia64 where the ia32 private headers conflict with higher
level headers.
Now we isolate the syscall tracing headers in their own lightweight file.
Reported-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Baron <jbaron@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Davidson <md@google.com>
LKML-Reference: <20090408184058.GB6017@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
igb: remove sysfs entry that was used to set the number of vfs
igbvf: add new driver to support 82576 virtual functions
drivers/net/eql.c: Fix a dev leakage.
niu: Fix unused variable warning.
r6040: set MODULE_VERSION
bnx2: Don't use reserved names
FEC driver: add missing #endif
niu: Fix error handling
mv643xx_eth: don't reset the rx coal timer on interface up
smsc911x: correct debugging message on mii read timeout
ethoc: fix library build errors
netfilter: ctnetlink: fix regression in expectation handling
netfilter: fix selection of "LED" target in netfilter
netfilter: ip6tables regression fix
Prepare for full barrier implementation: first remove the restricted support.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
x86 ACPI: Add support for Always Running APIC timer
ACPI x86: Make aperf/mperf MSR access in acpi_cpufreq read_only
ACPI x86: Cleanup acpi_cpufreq structures related to aperf/mperf
ACPICA: delete check for AML access to port 0x81-83
ACPI: WMI: use .notify method instead of installing handler directly
sony-laptop: use .notify method instead of installing handler directly
panasonic-laptop: use .notify method instead of installing handler directly
fujitsu-laptop: use .notify method instead of installing hotkey handler directly
fujitsu-laptop: use .notify method instead of installing handler directly
ACPI: video: use .notify method instead of installing handler directly
ACPI: thermal: use .notify method instead of installing handler directly
ACPI battery: fix async boot oops
ACPI: delete acpi_device.g_list
NULL noise: drivers/platform/x86/panasonic-laptop.c
ACPI: cpufreq: remove dupilcated #include
ACPI: Adjust Kelvin offset to match local implementation
ACPI: convert acpi_device_lock spinlock to mutex
Thou shalt remember to use 'git add' or errors shall be visited on your
downloads and there shall be wrath from on list and much gnashing of teeth.
Thou shalt remember to use git status or there shall be catcalls and much
embarrasment shall come to pass.
Signed-off-by: Alan "I'm hiding" Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
'zero_sum' does not properly describe the operation of generating parity
and checking that it validates against an existing buffer. Change the
name of the operation to 'val' (for 'validate'). This is in
anticipation of the p+q case where it is a requirement to identify the
target parity buffers separately from the source buffers, because the
target parity buffers will not have corresponding pq coefficients.
Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Same patch as before, modified to use bool. Also adds description of
the new field in struct atmel_mci that I missed in the first patch.
This patch adds Atmel MCI support for inverted detect pins.
Signed-off-by: Jonas Larsson <jonas.larsson@martinsson.se>
Acked-by: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Save the bit 17 state of the pages when freeing the page list, and
reswizzle them if necessary when rebinding the pages (in case they were
swapped out). Since we have userland with expectations that the swizzle
enums let it pread and pwrite contents accurately, we can't expose a new
swizzle enum for bit 17 (which it would have to GTT map to handle), so we
handle it down in pread and pwrite by swizzling the copy when bit 17 of the
page address is set.
Signed-off-by: Eric Anholt <eric@anholt.net>
Paul suggested we allow for data addresses to be recorded along with
the traditional IPs as power can provide these.
For now, only the software pagefault events provide data addresses,
but in the future power might as well for some events.
x86 doesn't seem capable of providing this atm.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.394816925@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move PERF_RECORD_TIME so that all the fixed length items come before
the variable length ones.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.307926436@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Similar to the mmap data stream, add one that tracks the task COMM field,
so that the userspace reporting knows what to call a task.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.127422406@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a few comments because I was forgetting what field what for what
functionality.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.036984214@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Push the PERF_EVENT_COUNTER_OVERFLOW bit into the misc field so that
we can have the full 32bit for PERF_RECORD_ bits.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130408.891867663@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Limit the size of each record to 64k (or should we count in multiples
of u64 and have a 512K limit?), this gives 16 bits or spare room in the
header, which we can use for misc bits, so as to not have to grow the
record with u64 every time we have a few bits to report.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130408.769271806@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a hwdev argument that is needed on some architectures
in order to access a per-device offset that is taken into
account when producing a physical address (also needed to
get from bus address to virtual address because the physical
address is an intermediate step).
Also make swiotlb_bus_to_virt weak so architectures can
override it.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Cc: jeremy@goop.org
Cc: ian.campbell@citrix.com
LKML-Reference: <1239199761-22886-8-git-send-email-galak@kernel.crashing.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Simplify tf_read() method, making it deal only with 'struct ide_taskfile' and
the validity flags that the upper layer passes, and factoring out the code that
deals with the high order bytes into ide_tf_readback() to be called from the
only two functions interested, ide_complete_cmd() and ide_dump_sector().
This should stop the needless code duplication in this method and so make
it about twice smaller than it was; along with simplifying the setup for
the method call, this should save both time and space...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Simplify tf_load() method, making it deal only with 'struct ide_taskfile' and
the validity flags that the upper layer passes, and moving the code that deals
with the high order bytes into the only function interested, do_rw_taskfile().
This should stop the needless code duplication in this method and so make
it about twice smaller than it was; along with simplifying the setup for the
method call, this should save both time and space...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Make 'struct ide_taskfile' cover only 8 register values and thus put two such
fields ('tf' and 'hob') into 'struct ide_cmd', dropping unnecessary 'tf_array'
field from it.
This required changing the prototype of ide_get_lba_addr() and ide_tf_dump().
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
[bart: fix setting of ATA_LBA bit for LBA48 commands in __ide_do_rw_disk()]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Replace IDE_TFLAG_{IN|OUT}_* flags meaning to the taskfile register validity on
input/output by the IDE_VALID_* flags and introduce 4 symmetric 8-bit register
validity indicator subfields, 'valid.{input/output}.{tf|hob}', into the 'struct
ide_cmd' instead of using the 'tf_flags' field for that purpose (this field can
then be turned from 32-bit into 8-bit one).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Andrew Morton noticed that mm.h needlessly includes sched.h - remove it.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
arch/powerpc/include/asm/systbl.h
arch/powerpc/include/asm/unistd.h
include/linux/init_task.h
Merge reason: the conflicts are non-trivial: PowerPC placement
of sys_perf_counter_open has to be mixed with the
new preadv/pwrite syscalls.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: dont break future extensions of INIT_TASK
While not a problem right now, due to lack of a comma, build fails if
elements are appended to INIT_TASK() macro in development code:
arch/x86/kernel/init_task.c:33: error: request for member `XXXXXXXXXX' in something not a structure or union
arch/x86/kernel/init_task.c:33: error: initializer element is not constant
arch/x86/kernel/init_task.c:33: error: (near initialization for `init_task.ret_stack')
make[1]: *** [arch/x86/kernel/init_task.o] Error 1
make: *** [arch/x86/kernel] Error 2
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: srostedt@redhat.com
LKML-Reference: <200904080505.n3855hcn017109@www262.sakura.ne.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch documents the new bindings for the MPC I2C bus driver.
Furthermore, it removes obsolete FSL device related definitions
for I2C.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'core/softlockup' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
softlockup: make DETECT_HUNG_TASK default depend on DETECT_SOFTLOCKUP
softlockup: move 'one' to the softlockup section in sysctl.c
softlockup: ensure the task has been switched out once
softlockup: remove timestamp checking from hung_task
softlockup: convert read_lock in hung_task to rcu_read_lock
softlockup: check all tasks in hung_task
softlockup: remove unused definition for spawn_softlockup_task
softlockup: fix potential race in hung_task when resetting timeout
softlockup: fix to allow compiling with !DETECT_HUNG_TASK
softlockup: decouple hung tasks check from softlockup detection
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y
branch tracer: Fix for enabling branch profiling makes sparse unusable
ftrace: Correct a text align for event format output
Update /debug/tracing/README
tracing/ftrace: alloc the started cpumask for the trace file
tracing, x86: remove duplicated #include
ftrace: Add check of sched_stopped for probe_sched_wakeup
function-graph: add proper initialization for init task
tracing/ftrace: fix missing include string.h
tracing: fix incorrect return type of ns2usecs()
tracing: remove CALLER_ADDR2 from wakeup tracer
blktrace: fix pdu_len when tracing packet command requests
blktrace: small cleanup in blk_msg_write()
blktrace: NUL-terminate user space messages
tracing: move scripts/trace/power.pl to scripts/tracing/power.pl
* 'irq/threaded' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: fix devres.o build for GENERIC_HARDIRQS=n
genirq: provide old request_irq() for CONFIG_GENERIC_HARDIRQ=n
genirq: threaded irq handlers review fixups
genirq: add support for threaded interrupts to devres
genirq: add threaded interrupt handler support
* commit 'origin/master': (4825 commits)
Fix build errors due to CONFIG_BRANCH_TRACER=y
parport: Use the PCI IRQ if offered
tty: jsm cleanups
Adjust path to gpio headers
KGDB_SERIAL_CONSOLE check for module
Change KCONFIG name
tty: Blackin CTS/RTS
Change hardware flow control from poll to interrupt driven
Add support for the MAX3100 SPI UART.
lanana: assign a device name and numbering for MAX3100
serqt: initial clean up pass for tty side
tty: Use the generic RS485 ioctl on CRIS
tty: Correct inline types for tty_driver_kref_get()
splice: fix deadlock in splicing to file
nilfs2: support nanosecond timestamp
nilfs2: introduce secondary super block
nilfs2: simplify handling of active state of segments
nilfs2: mark minor flag for checkpoint created by internal operation
nilfs2: clean up sketch file
nilfs2: super block operations fix endian bug
...
Conflicts:
arch/x86/include/asm/thread_info.h
arch/x86/lguest/boot.c
drivers/xen/manage.c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: pci_slot: grab refcount on slot's bus
PCI Hotplug: acpiphp: grab refcount on p2p subordinate bus
PCI: allow PCI core hotplug to remove PCI root bus
PCI: Fix oops in pci_vpd_truncate
PCI: don't corrupt enable_cnt when doing manual resource alignment
PCI: annotate pci_rescan_bus as __ref, not __devinit
PCI-IOV: fix missing kernel-doc
PCI: Setup disabled bridges even if buses are added
PCI: SR-IOV quirk for Intel 82576 NIC
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
loop: mutex already unlocked in loop_clr_fd()
cfq-iosched: don't let idling interfere with plugging
block: remove unused REQ_UNPLUG
cfq-iosched: kill two unused cfqq flags
cfq-iosched: change dispatch logic to deal with single requests at the time
mflash: initial support
cciss: change to discover first memory BAR
cciss: kernel scan thread for MSA2012
cciss: fix residual count for block pc requests
block: fix inconsistency in I/O stat accounting code
block: elevator quiescing helpers
Many devices require symmetric configurations of capture and playback
data formats, often due to shared clocking but sometimes also due to
other shared playback and record configuration in the device. Start
providing core support for this by allowing the DAIs or the machine
to specify that the sample rates used should be kept symmetric.
A flag symmetric_rates is provided in the snd_soc_dai and
snd_soc_dai_link structures. If this is set in either of the DAIs or in
the machine then a constraint will be applied when a stream is already
open preventing any changes in sample rate.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The code that enables branch tracing for all (non-constant) branches
plays games with the preprocessor and #define's the C 'if ()' construct
to do tracing.
That's all fine, but it fails for some unusual but valid C code that is
sometimes used in macros, notably by the intel-iommu code:
if (i=drhd->iommu, drhd->ignored) ..
because now the preprocessor complains about multiple arguments to the
'if' macro.
So make the macro expansion of this particularly horrid trick use
varargs, and handle the case of comma-expressions in if-statements. Use
another macro to do it cleanly in just one place.
This replaces a patch by David (and acked by Steven) that did this all
inside that one already-too-horrid macro.
Tested-by: Ingo Molnar <mingo@elte.hu>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'i2c-for-2630-v2' of git://aeryn.fluff.org.uk/bjdooks/linux:
i2c: imx: Make disable_delay a per-device variable
i2c: xtensa s6000 i2c driver
powerpc/85xx: i2c-mpc: use new I2C bindings for the Socates board
i2c: i2c-mpc: make I2C bus speed configurable
i2c: i2c-mpc: use dev based printout function
i2c: i2c-mpc: various coding style fixes
i2c: imx: Add missing request_mem_region in probe()
i2c: i2c-s3c2410: Initialise Samsung I2C controller early
i2c-s3c2410: Simplify bus frequency calculation
i2c-s3c2410: sda_delay should be in ns, not clock ticks
i2c: iMX/MXC support
PCI parallel port devices can IRQ share so we should stop them hogging
the line and making a mess on modern PC systems. We know the sharing
side works as the PCMCIA driver has shared the parallel port IRQ for
some time.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(akpm: queued pending confirmation of the new major number)
[randy.dunlap@oracle.com: select SERIAL_CORE]
Signed-off-by: Christian Pellegrin <chripell@fsfe.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
tty_driver_kref_get() should be static inline and not extern inline
(the latter even changed it's semantics in gcc >= 4.3).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After a review of user's feedback for finding out other compatibility
issues, I found nilfs improperly initializes timestamps in inode;
CURRENT_TIME was used there instead of CURRENT_TIME_SEC even though nilfs
didn't have nanosecond timestamps on disk. A few users gave us the report
that the tar program sometimes failed to expand symbolic links on nilfs,
and it turned out to be the cause.
Instead of applying the above displacement, I've decided to support
nanosecond timestamps on this occation. Fortunetaly, a needless 64-bit
field was in the nilfs_inode struct, and I found it's available for this
purpose without impact for the users.
So, this will do the enhancement and resolve the tar problem.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The former versions didn't have extra super blocks. This improves the
weak point by introducing another super block at unused region in tail of
the partition.
This doesn't break disk format compatibility; older versions just ingore
the secondary super block, and new versions just recover it if it doesn't
exist. The partition created by an old mkfs may not have unused region,
but in that case, the secondary super block will not be added.
This doesn't make more redundant copies of the super block; it is a future
work.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nilfs creates checkpoints even for garbage collection or metadata updates
such as checkpoint mode change. So, user often sees checkpoints created
only by such internal operations.
This is inconvenient in some situations. For example, application that
monitors checkpoints and changes them to snapshots, will fall into an
infinite loop because it cannot distinguish internally created
checkpoints.
This patch solves this sort of problem by adding a flag to checkpoint for
identification.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The sketch file is a file to mark checkpoints with user data. It was
experimentally introduced in the original implementation, and now
obsolete. The file was handled differently with regular files; the file
size got truncated when a checkpoint was created.
This stops the special treatment and will treat it as a regular file.
Most users are not affected because mkfs.nilfs2 no longer makes this file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds a new argument to the nilfs_sustat structure.
The extended field allows to delete volatile active state of segments,
which was needed to protect freshly-created segments from garbage
collection but has confused code dealing with segments. This
extension alleviates the mess and gives room for further
simplifications.
The volatile active flag is not persistent, so it's eliminable on this
occasion without affecting compatibility other than the ioctl change.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This removes compat code from the nilfs ioctls and applies the same
function for both .ioctl and .compat_ioctl file operations.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nilfs ioctl had structures not having fixed sized types such as:
struct nilfs_argv {
void *v_base;
size_t v_nmembs;
size_t v_size;
int v_index;
int v_flags;
};
Further, some of them are wrongly aligned:
e.g.
struct nilfs_cpmode {
__u64 cm_cno;
int cm_mode;
};
The size of wrongly aligned structures varies depending on
architectures, and it breaks the identity of ioctl commands, which
leads to arch dependent errors.
Previously, these are compensated by using compat_ioctl.
This fixes these problems and allows removal of compat ioctl.
Since this will change sizes of those structures, binary compatibility
for the past utilities will once break; new utilities have to be used
instead. However, it would be helpful to avoid platform dependent
problems in the long term.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This removes NILFS_IOCTL_TIMEDWAIT command from ioctl interface along
with the related flags and wait queue.
The command is terrible because it just sleeps in the ioctl. I prefer
to avoid this by devising means of event polling in userland program.
By reconsidering the userland GC daemon, I found this is possible
without changing behaviour of the daemon and sacrificing efficiency.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds a header file which specifies the on-disk format and ioctl
interface of the nilfs2 file system.
Signed-off-by: Koji Sato <sato.koji@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Largely inspired from ipc/ipc_sysctl.c. This patch isolates the mqueue
sysctl stuff in its own file.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement multiple mounts of the mqueue file system, and link it to usage
of CLONE_NEWIPC.
Each ipc ns has a corresponding mqueuefs superblock. When a user does
clone(CLONE_NEWIPC) or unshare(CLONE_NEWIPC), the unshare will cause an
internal mount of a new mqueuefs sb linked to the new ipc ns.
When a user does 'mount -t mqueue mqueue /dev/mqueue', he mounts the
mqueuefs superblock.
Posix message queues can be worked with both through the mq_* system calls
(see mq_overview(7)), and through the VFS through the mqueue mount. Any
usage of mq_open() and friends will work with the acting task's ipc
namespace. Any actions through the VFS will work with the mqueuefs in
which the file was created. So if a user doesn't remount mqueuefs after
unshare(CLONE_NEWIPC), mq_open("/ab") will not be reflected in "ls
/dev/mqueue".
If task a mounts mqueue for ipc_ns:1, then clones task b with a new ipcns,
ipcns:2, and then task a is the last task in ipc_ns:1 to exit, then (1)
ipc_ns:1 will be freed, (2) it's superblock will live on until task b
umounts the corresponding mqueuefs, and vfs actions will continue to
succeed, but (3) sb->s_fs_info will be NULL for the sb corresponding to
the deceased ipc_ns:1.
To make this happen, we must protect the ipc reference count when
a) a task exits and drops its ipcns->count, since it might be dropping
it to 0 and freeing the ipcns
b) a task accesses the ipcns through its mqueuefs interface, since it
bumps the ipcns refcount and might race with the last task in the ipcns
exiting.
So the kref is changed to an atomic_t so we can use
atomic_dec_and_lock(&ns->count,mq_lock), and every access to the ipcns
through ns = mqueuefs_sb->s_fs_info is protected by the same lock.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move mqueue vfsmount plus a few tunables into the ipc_namespace struct.
The CONFIG_IPC_NS boolean and the ipc_namespace struct will serve both the
posix message queue namespaces and the SYSV ipc namespaces.
The sysctl code will be fixed separately in patch 3. After just this
patch, making a change to posix mqueue tunables always changes the values
in the initial ipc namespace.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The mqueuefs filesystem will use this helper as well. Proc's main get_sb
could also be made to use it, but that will require a bit more rework.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The I2C functionality provided by the i2c-voodoo3 driver is moved into the
tdfxfb (frame buffer driver for Voodoo3 cards). This way there is no
conflict between the i2c driver and the fb driver.
The tdfxfb does not make use from the DDC functionality yet but provides
all the functionality of the i2c-voodoo3 driver.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add disable/enable_kretprobe() and disable/enable_jprobe().
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add disable_kprobe() and enable_kprobe() to disable/enable kprobes
temporarily.
disable_kprobe() asynchronously disables probe handlers of specified
kprobe. So, after calling it, some handlers can be called at a while.
enable_kprobe() enables specified kprobe.
aggr_pre_handler and aggr_post_handler check disabled probes. On the
other hand aggr_break_handler and aggr_fault_handler don't check it
because these handlers will be called while executing pre or post handlers
and usually those help error handling.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix comment style in kprobes.h.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some SPI controllers have restrictions on DMAable buffers alignemt.
Currently if the buffer supplied by protocol driver is not properly
aligned, the controller silently performs transfer in PIO mode. Addition
of dma_alignment field to spi_master allows protocol drivers to perform
proper alignment.
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Cc: Bryan Wu <bryan.wu@analog.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add /proc entries to give the admin the ability to control the minimum and
maximum number of pdflush threads. This allows finer control of pdflush
on both large and small machines.
The rationale is simply one size does not fit all. Admins on large and/or
small systems may want to tune the min/max pdflush thread count to best
suit their needs. Right now the min/max is hardcoded to 2/8. While
probably a fair estimate for smaller machines, large machines with large
numbers of CPUs and large numbers of filesystems/block devices may benefit
from larger numbers of threads working on different block devices.
Even if the background flushing algorithm is radically changed, it is
still likely that multiple threads will be involved and admins would still
desire finer control on the min/max other than to have to recompile the
kernel.
The patch adds '/proc/sys/vm/nr_pdflush_threads_min' and
'/proc/sys/vm/nr_pdflush_threads_max' with r/w permissions.
The minimum value for nr_pdflush_threads_min is 1 and the maximum value is
the current value of nr_pdflush_threads_max. This minimum is required
since additional thread creation is performed in a pdflush thread itself.
The minimum value for nr_pdflush_threads_max is the current value of
nr_pdflush_threads_min and the maximum value can be 1000.
Documentation/sysctl/vm.txt is also updated.
[akpm@linux-foundation.org: fix comment, fix whitespace, use __read_mostly]
Signed-off-by: Peter W Morreale <pmorreale@novell.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One of the changes between kernels 2.6.28 and 2.6.29 is that a branch profiler
has been added for if() statements. Unfortunately this patch makes the sparse
output unusable with CONFIG_TRACE_BRANCH_PROFILING=y: when branch profiling is
enabled, sparse prints so much false positives that the real issues are no
longer visible. This behavior can be reproduced as follows:
* enable CONFIG_TRACE_BRANCH_PROFILING, e.g. by running make allyesconfig or
make allmodconfig.
* run make C=2
Result: a huge number of the following sparse warnings.
...
include/linux/cpumask.h:547:2: warning: symbol '______r' shadows an earlier one
include/linux/cpumask.h:547:2: originally declared here
...
The patch below fixes this by disabling branch profiling while analyzing the
kernel code with sparse.
See also:
* http://lkml.org/lkml/2008/11/21/18
* http://bugzilla.kernel.org/show_bug.cgi?id=12925
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <200904051620.02311.bart.vanassche@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (28 commits)
powerpc: Fix oops when loading modules
powerpc: Wire up preadv and pwritev
powerpc/ftrace: Fix printf format warning
powerpc/ftrace: Fix #if that should be #ifdef
powerpc: Fix ptrace compat wrapper for FPU register access
powerpc: Print information about mapping hw irqs to virtual irqs
powerpc: Correct dependency of KEXEC
powerpc: Disable VSX or current process in giveup_fpu/altivec
powerpc/pseries: Enable relay in pseries_defconfig
powerpc/pseries: Fix ibm,client-architecture comment
powerpc/pseries: Scan for all events in rtasd
powerpc/pseries: Add dispatch dispersion statistics
powerpc: Clean up some prom printouts
powerpc: Print progress of ibm,client-architecture method
powerpc: Remove duplicated #include's
powerpc/pmac: Fix internal modem IRQ on Wallstreet PowerBook
powerpc/wdrtas: Update wdrtas_get_interval to use rtas_data_buf
fsl-diu-fb: Pass the proper device for dma mapping routines
powerpc/pq2fads: Update device tree for use with device-tree-aware u-boot.
cpm_uart: Disable CPM udbg when re-initing CPM uart, even if not the console.
...
Asus boards have an ACPI interface for interacting with the hwmon (fan,
temperatures, voltages) subsystem; this driver exposes the relevant
information via the standard sysfs interface.
There are two different ACPI interfaces:
- an old one (based on RVLT/RFAN/RTMP)
- a new one (GGRP/GITM)
Both may be present but there a few cases (my board, sigh) where the
new interface is just an empty stub; the driver defaults to the old one
when both are present.
The old interface has received a considerable testing, but I'm still
awaiting confirmation from my tester that the new one is working as
expected (hence the debug code is still enabled).
Currently all the attributes are read-only, though a (partial) control
should be possible with a bit more work.
Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Impact: fix to crash going to kexec
The init task did not properly initialize the function graph pointers.
Altough these pointers are NULL, they can not be assumed to be NULL
for the init task, and must still be properly initialize.
This usually is not an issue since a problem only arises when a task
exits, and the init tasks do not usually exit. But when doing tests
with kexec, the init tasks do exit, and the bug appears.
This patch properly initializes the init tasks function graph data
structures.
Reported-and-Tested-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <alpine.DEB.2.00.0903252053080.5675@gandalf.stny.rr.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add the ptrace bts context field to task_struct unconditionally.
Initialize the field directly in copy_process().
Remove all the unneeded functionality used to initialize that field.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Cc: roland@redhat.com
Cc: eranian@googlemail.com
Cc: oleg@redhat.com
Cc: juan.villacis@intel.com
Cc: ak@linux.jf.intel.com
LKML-Reference: <20090403144603.292754000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When a ptraced task is unlinked, we need to stop branch tracing for
that task.
Since the unlink is called with interrupts disabled, and we need
interrupts enabled to stop branch tracing, we defer the work.
Collect all branch tracing related stuff in a branch tracing context.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: roland@redhat.com
Cc: eranian@googlemail.com
Cc: juan.villacis@intel.com
Cc: ak@linux.jf.intel.com
LKML-Reference: <20090403144550.712401000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a function to wait until some other task has been
switched out at least once.
This differs from wait_task_inactive() subtly, in that the
latter will wait until the task has left the CPU.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Cc: markus.t.metzger@gmail.com
Cc: roland@redhat.com
Cc: eranian@googlemail.com
Cc: oleg@redhat.com
Cc: juan.villacis@intel.com
Cc: ak@linux.jf.intel.com
LKML-Reference: <20090403144549.794157000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Support for the s6000 on-chip i2c controller.
Signed-off-by: Oskar Schirmer <os@emlix.com>
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Now that all the task runtime clock users are gone, remove the ugly
rq->lock usage from perf counters, which solves the nasty deadlock
seen when a software task clock counter was read from an NMI overflow
context.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094518.531137582@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since perf_counter_context is switched along with tasks, we can
maintain the context time without using the task runtime clock.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094518.353552838@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently the definition of an event is slightly ambiguous. We have
wakeup events, for poll() and SIGIO, which are either generated
when a record crosses a page boundary (hw_events.wakeup_events == 0),
or every wakeup_events new records.
Now a record can be either a counter overflow record, or a number of
different things, like the mmap PROT_EXEC region notifications.
Then there is the PERF_COUNTER_IOC_REFRESH event limit, which only
considers counter overflows.
This patch changes then wakeup_events and SIGIO notification to only
consider overflow events. Furthermore it changes the SIGIO notification
to report SIGHUP when the event limit is reached and the counter will
be disabled.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094518.266679874@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide means to auto-disable the counter after 'n' overflow events.
Create the counter with hw_event.disabled = 1, and then issue an
ioctl(fd, PREF_COUNTER_IOC_REFRESH, n); to set the limit and enable
the counter.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094518.083139737@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
By popular request, provide means to log a timestamp along with the
counter overflow event.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094518.024173282@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Prepare for more generic overflow handling. The new perf_counter_overflow()
method will handle the generic bits of the counter overflow, and can return
a !0 return value, in which case the counter should be (soft) disabled, so
that it won't count until it's properly disabled.
XXX: do powerpc and swcounter
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.812109629@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Prepare the pending infrastructure to do more than wakeups.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.634732847@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide support for fcntl() I/O availability signals.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.579788800@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change the callchain context entries to u16, so as to gain some space.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.457320003@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Update the userspace read method.
Paul noted that:
- userspace cannot observe ->lock & 1 on the same cpu.
- we need a barrier() between reading ->lock and ->index
to ensure we read them in that prticular order.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.368446033@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The request inherits the unplug flag from the bio, but it isn't actually
used. The bio flag stops at __make_request(), which tells it to unplug
after submission. Passing it on to the request doesn't make any sense.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This driver supports mflash IO mode for linux.
Mflash is embedded flash drive and mainly targeted mobile and consumer
electronic devices.
Internally, mflash has nand flash and other hardware logics and supports 2
different operation (ATA, IO) modes. ATA mode doesn't need any new driver
and currently works well under standard IDE subsystem. Actually it's one
chip SSD. IO mode is ATA-like custom mode for the host that doesn't have
IDE interface.
Followings are brief descriptions about IO mode.
A. IO mode based on ATA protocol and uses some custom command. (read confirm,
write confirm)
B. IO mode uses SRAM bus interface.
C. IO mode supports 4kB boot area, so host can boot from mflash.
This driver is quitely similar to a standard ATA driver, but because of
following reasons it is currently seperated with ATA layer.
1. ATA layer deals standard ATA protocol. ATA layer have many low-
level device specific interface, but data transfer keeps ATA rule.
But, mflash IO mode doesn't.
2. Even though currently not used in mflash driver code, mflash has
some custom command and modes. (nand fusing, firmware patch, etc) If
this feature supported in linux kernel, ATA layer more altered.
3. Currently PATA platform device driver doesn't support interrupt.
(I'm not sure) But, mflash uses interrupt (polling mode is just for
debug).
4. mflash is somewhat under-develop product. Even though some company
already using mflash their own product, I think more time is needed for
standardization of custom command and mode. That time (maybe October)
I will talk to with ATA people. If they accept integration, I will
integrate.
Signed-off-by: unsik Kim <donari75@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This forces in_flight to be zero when turning off or on the I/O stat
accounting and stops updating I/O stats in attempt_merge() when
accounting is turned off.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Add support to the SQ-905 driver to pass back to user space the
sensor orientation information obtained from the camera during init.
Modifies gspca and the videodev2.h header to create the necessary
API.
[mchehab@redhat.com: Changed "Output is" to "Frames are" at the comments, as suggested at LMML]
Signed-off-by: Adam Baker <linux@baker-net.org.uk>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Remove support for the debug call VIDIOC_INT_S_AUDIO_ROUTING from cx18
and ivtv. These internal ioctls shouldn't be exposed. These were only
used through the cx18-ctl and ivtv-ctl utilities, and only when testing
a new card variant.
This cleanup allows the removal of this ioctl from v4l2-common.h.
Cc: Andy Walls <awalls@radix.net>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
It is no longer needed to use a struct pointer as argument, since v4l2_subdev
doesn't require that ioctl-like approach anymore. Instead just pass the input,
output and config (new!) arguments directly.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
With all the v4l2_subdev changes that were made to these drivers it is a
good idea to increase the version number of each driver.
It's just the patch level that is increased, except for the zoran and saa7146
drivers where the minor number was increased due to the more substantial
changes that were made to those two drivers.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Rather than duplicating this list everywhere, just put it in tvaudio.h.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The functions v4l2_i2c_new_subdev and v4l2_i2c_new_probed_subdev relied on
i2c_get_adapdata to return the v4l2_device. However, this is not always
possible on embedded platforms. So modify the API to pass the v4l2_device
pointer explicitly.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add utility function to probe for a single address, rather than a list
of addresses.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Originally the intention was to switch to the new style i2c API starting with
the introduction of the API in 2.6.22. However, the i2c_new_probed_device()
function has a lethal bug that wasn't fixed until 2.6.25. Or more accurately,
it was only fixed in the stable series of 2.6.25 and 2.6.26.
Given the fact that the new i2c API also changed starting with 2.6.26 (the
addition of i2c_device_id), it is easiest to switch APIs starting with
2.6.26.
This patch updates all the legacy code accordingly.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
s_std didn't belong in the tuner ops. Stricly speaking it should be part of
the video ops, but it is used by audio and tuner devices as well, so it is
more efficient to make it part of the core ops.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The init callback was used in several places to load firmware. Make a separate
load_fw callback for that. This makes the code a lot more understandable.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
s_standby is only used to put the tuner in powersaving mode, so move it
from core to tuner.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Now that all drivers are converted to v4l2_subdev we can remove legacy code
in v4l2-common. Also move the documentation of the internal API to
v4l2-subdev.h where it really belongs.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
v4l2-subdev.c and v4l2-i2c-drv-legacy.h were used to support the old i2c
API. All v4l drivers are now converted to v4l2_subdev, so these two files
can be removed.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This is common code shared between the IDE and libata implementations
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.infradead.org/mtd-2.6: (53 commits)
[MTD] struct device - replace bus_id with dev_name(), dev_set_name()
[MTD] [NOR] Fixup for Numonyx M29W128 chips
[MTD] mtdpart: Make ecc_stats more realistic.
powerpc/85xx: TQM8548: Update DTS file for multi-chip support
powerpc: NAND: FSL UPM: document new bindings
[MTD] [NAND] FSL-UPM: Add wait flags to support board/chip specific delays
[MTD] [NAND] FSL-UPM: add multi chip support
[MTD] [NOR] Add device parent info to physmap_of
[MTD] [NAND] Add support for NAND on the Socrates board
[MTD] [NAND] Add support for 4KiB pages.
[MTD] sysfs support should not depend on CONFIG_PROC_FS
[MTD] [NAND] Add parent info for CAFÉ controller
[MTD] support driver model updates
[MTD] driver model updates (part 2)
[MTD] driver model updates
[MTD] [NAND] move gen_nand's probe function to .devinit.text
[MTD] [MAPS] move sa1100 flash's probe function to .devinit.text
[MTD] fix use after free in register_mtd_blktrans
[MTD] [MAPS] Drop now unused sharpsl-flash map
[MTD] ofpart: Check name property to determine partition nodes.
...
Manually fix trivial conflict in drivers/mtd/maps/Makefile
Fixes:
In file included from drivers/serial/mux.c:37:
include/linux/serial_core.h: In function 'uart_handle_sysrq_char':
include/linux/serial_core.h:467: error: 'struct uart_port' has no member named 'sysrq'
include/linux/serial_core.h:468: error: 'struct uart_port' has no member named 'sysrq'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This implements basic support for all 843x RS232 devices, but does not
add DMA support. This means that sustained data transfers at high baud
rates may not be possible on multiple ports simultaneously.
Signed-off-by: Shawn Bohrer <shawn.bohrer@ni.com>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'kmemtrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
kmemtrace: trace kfree() calls with NULL or zero-length objects
kmemtrace: small cleanups
kmemtrace: restore original tracing data binary format, improve ABI
kmemtrace: kmemtrace_alloc() must fill type_id
kmemtrace: use tracepoints
kmemtrace, rcu: don't include unnecessary headers, allow kmemtrace w/ tracepoints
kmemtrace, rcu: fix rcupreempt.c data structure dependencies
kmemtrace, rcu: fix rcu_tree_trace.c data structure dependencies
kmemtrace, rcu: fix linux/rcutree.h and linux/rcuclassic.h dependencies
kmemtrace, mm: fix slab.h dependency problem in mm/failslab.c
kmemtrace, kbuild: fix slab.h dependency problem in lib/decompress_unlzma.c
kmemtrace, kbuild: fix slab.h dependency problem in lib/decompress_bunzip2.c
kmemtrace, kbuild: fix slab.h dependency problem in lib/decompress_inflate.c
kmemtrace, squashfs: fix slab.h dependency problem in squasfs
kmemtrace, befs: fix slab.h dependency problem
kmemtrace, security: fix linux/key.h header file dependencies
kmemtrace, fs: fix linux/fdtable.h header file dependencies
kmemtrace, fs: uninline simple_transaction_set()
kmemtrace, fs, security: move alloc_secdata() and free_secdata() to linux/security.h
* 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
nfsd41: Documentation/filesystems/nfs41-server.txt
nfsd41: CREATE_EXCLUSIVE4_1
nfsd41: SUPPATTR_EXCLCREAT attribute
nfsd41: support for 3-word long attribute bitmask
nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
nfsd41: pass writable attrs mask to nfsd4_decode_fattr
nfsd41: provide support for minor version 1 at rpc level
nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
nfsd41: access_valid
nfsd41: clientid handling
nfsd41: check encode size for sessions maxresponse cached
nfsd41: stateid handling
nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
nfsd41: destroy_session operation
nfsd41: non-page DRC for solo sequence responses
nfsd41: Add a create session replay cache
nfsd41: create_session operation
...
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds:
leds: introduce lp5521 led driver
leds: just ignore invalid GPIOs in leds-gpio
leds: Fix &&/|| confusion in leds-pca9532.c
leds: move h1940-leds's probe function to .devinit.text
leds: remove an unnecessary "goto" on drivers/leds/leds-s3c24.c
leds: add BD2802GU LED driver
leds: remove experimental flag from leds-clevo-mail
leds: Prevent multiple LED triggers with the same name
leds: Add gpio-led trigger
leds: Add rb532 LED driver for the User LED
leds: Add suspend/resume state flags to leds-gpio
leds: simple driver for pwm driven LEDs
leds: Fix leds-gpio driver multiple module_init/exit usage
leds: Add dac124s085 driver
leds: allow led-drivers to use a variable range of brightness values
leds: Add openfirmware platform device support
This patch sets up disabled bridges even if buses have already been
added.
pci_assign_unassigned_resources is called after buses are added.
pci_assign_unassigned_resources calls pci_bus_assign_resources.
pci_bus_assign_resources calls pci_setup_bridge to configure BARs of
bridges.
Currently pci_setup_bridge returns immediately if the bus have already
been added. So pci_assign_unassigned_resources can't configure BARs of
bridges that were added in a disabled state; this patch fixes the issue.
On logical hot-add, we need to prevent the kernel from re-initializing
bridges that have already been initialized. To achieve this,
pci_setup_bridge returns immediately if the bridge have already been
enabled.
We don't need to check whether the specified bus is a root bus or not.
pci_setup_bridge is not called on a root bus, because a root bus does
not have a bridge.
The patch adds a new helper function, pci_is_enabled. I made the
function name similar to pci_is_managed. The codes which use
enable_cnt directly are changed to use pci_is_enabled.
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fixes the following compiler error:
fs/nfsd/nfssvc.c: In function 'set_max_drc':
fs/nfsd/nfssvc.c:240: error: 'NFSD_DRC_SIZE_SHIFT' undeclared
CONFIG_NFSD_V4 is not set
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The new i2c binding model makes the client_register and
client_unregister methods of struct i2c_adapter useless, so we can
remove them with the rest of the legacy model.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This patch fixes a regression (introduced by myself in commit 19abb7b:
netfilter: ctnetlink: deliver events for conntracks changed from
userspace) that results in an expectation re-insertion since
__nf_ct_expect_check() may return 0 for expectation timer refreshing.
This patch also removes a unnecessary refcount bump that
pretended to avoid a possible race condition with event delivery
and expectation timers (as said, not needed since we hold a
reference to the object since until we finish the expectation
setup). This also merges nf_ct_expect_related_report() and
nf_ct_expect_related() which look basically the same.
Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
ROHM BD2802GU is a RGB LED controller attached to i2c bus and specifically
engineered for decoration purposes. This RGB controller incorporates
lighting patterns and illuminates.
This driver is designed to minimize power consumption, so when there is no
emitting LED, it enters to reset state. And because the BD2802GU has lots
of features that can't be covered by the current LED framework, it
provides Advanced Configuration Function(ADF) mode, so that user
applications can set registers of BD2802GU directly.
Here are basic usage examples :
; to turn on LED (not blink)
$ echo 1 > /sys/class/leds/led1_R/brightness
; to blink LED
$ echo timer > /sys/class/leds/led1_R/trigger
$ echo 1 > /sys/class/leds/led1_R/delay_on
$ echo 1 > /sys/class/leds/led1_R/delay_off
; to turn off LED
$ echo 0 > /sys/class/leds/led1_R/brightness
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kim Kyuwon <chammoru@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>
Add an option to preserve LED state when suspending/resuming to the LED
gpio driver. Based on a suggestion from Robert Jarzmik.
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>
Add a simple driver for pwm driver LEDs. pwm_id and period can be defined
in board file. It is developed for pxa, however it is probably generic
enough to be used on other platforms with pwm.
Signed-off-by: Luotao Fu <l.fu@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>
This patch allows drivers to override the default maximum brightness value
of 255. We take care to preserve backwards-compatibility as much as
possible, so that user-space ABI doesn't change for existing drivers.
LED trigger code has also been updated to use the per-LED maximum.
Signed-off-by: Guennadi Liakhovetski <lg@denx.de>
Signed-off-by: Richard Purdie <rpurdie@linux.intel.com>
By default, CFQ will anticipate more IO from a given io context if the
previously completed IO was sync. This used to be fine, since the only
sync IO was reads and O_DIRECT writes. But with more "normal" sync writes
being used now, we don't want to anticipate for those.
Add a bio/request flag that informs the IO scheduler that this is a sync
request that we should not idle for. Introduce WRITE_ODIRECT specifically
for O_DIRECT writes, and make sure that the other sync writes set this
flag.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(S)WRITE_SYNC always unplugs the device right after IO submission.
Sometimes we want to build up a queue before doing so, so add
variants that explicitly DON'T unplug the queue. The caller must
then do that after submitting all the IO.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This makes sure that we never wait on async IO for sync requests, instead
of doing the split on writes vs reads.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
PI Futexes and their underlying rt_mutex cannot be left ownerless if
there are pending waiters as this will break the PI boosting logic, so
the standard requeue commands aren't sufficient. The new commands
properly manage pi futex ownership by ensuring a futex with waiters
has an owner at all times. This will allow glibc to properly handle
pi mutexes with pthread_condvars.
The approach taken here is to create two new futex op codes:
FUTEX_WAIT_REQUEUE_PI:
Tasks will use this op code to wait on a futex (such as a non-pi waitqueue)
and wake after they have been requeued to a pi futex. Prior to returning to
userspace, they will acquire this pi futex (and the underlying rt_mutex).
futex_wait_requeue_pi() is the result of a high speed collision between
futex_wait() and futex_lock_pi() (with the first part of futex_lock_pi() being
done by futex_proxy_trylock_atomic() on behalf of the top_waiter).
FUTEX_REQUEUE_PI (and FUTEX_CMP_REQUEUE_PI):
This call must be used to wake tasks waiting with FUTEX_WAIT_REQUEUE_PI,
regardless of how many tasks the caller intends to wake or requeue.
pthread_cond_broadcast() should call this with nr_wake=1 and
nr_requeue=INT_MAX. pthread_cond_signal() should call this with nr_wake=1 and
nr_requeue=0. The reason being we need both callers to get the benefit of the
futex_proxy_trylock_atomic() routine. futex_requeue() also enqueues the
top_waiter on the rt_mutex via rt_mutex_start_proxy_lock().
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Paul noted that we don't need SMP barriers for the mmap() counter read
because its always on the same cpu (otherwise you can't access the hw
counter anyway).
So remove the SMP barriers and replace them with regular compiler
barriers.
Further, update the comment to include a race free method of reading
said hardware counter. The primary change is putting the pmc_read
inside the seq-loop, otherwise we can still race and read rubbish.
Noticed-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Orig-LKML-Reference: <20090402091319.577951445@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Put in counts to tell which ips belong to what context.
-----
| | hv
| --
nr | | kernel
| --
| | user
-----
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Orig-LKML-Reference: <20090402091319.493101305@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
By request, provide a way to request a wakeup every 'n' events instead
of every page of output.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Orig-LKML-Reference: <20090402091319.323309784@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Per suggestion from Paul, move the event overflow bits to record_type
and sanitize the enums a bit.
Breaks the ABI -- again ;-)
Suggested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Orig-LKML-Reference: <20090402091319.151921176@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide the generic callchain support bits. If hw_event->callchain is
set the arch specific perf_callchain() function is called upon to
provide a perf_callchain_entry structure filled with the current
callchain.
If it does so, it is added to the overflow output event.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171024.254266860@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Breaks ABI yet again :-)
Change the event type so that [0, 2^31-1] are regular event types, but
[2^31, 2^32-1] forms a bitmask for overflow events.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171024.047961770@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently the profiling information returns userspace IPs but no way
to correlate them to userspace code. Userspace could look into
/proc/$pid/maps but that might not be current or even present anymore
at the time of analyzing the IPs.
Therefore provide means to track the mmap information and provide it
in the output stream.
XXX: only covers mmap()/munmap(), mremap() and mprotect() are missing.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Orig-LKML-Reference: <20090330171023.417259499@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It just occured to me it is possible to have multiple contending
updates of the userpage (mmap information vs overflow vs counter).
This would break the seqlock logic.
It appear the arch code uses this from NMI context, so we cannot
possibly serialize its use, therefore separate the data_head update
from it and let it return to its original use.
The arch code needs to make sure there are no contending callers by
disabling the counter before using it -- powerpc appears to do this
nicely.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171023.241410660@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
While going over the wakeup code I noticed delayed wakeups only work
for hardware counters but basically all software counters rely on
them.
This patch unifies and generalizes the delayed wakeup to fix this
issue.
Since we're dealing with NMI context bits here, use a cmpxchg() based
single link list implementation to track counters that have pending
wakeups.
[ This should really be generic code for delayed wakeups, but since we
cannot use cmpxchg()/xchg() in generic code, I've let it live in the
perf_counter code. -- Eric Dumazet could use it to aggregate the
network wakeups. ]
Furthermore, the x86 method of using TIF flags was flawed in that its
quite possible to end up setting the bit on the idle task, loosing the
wakeup.
The powerpc method uses per-cpu storage and does appear to be
sufficient.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new functionality
Currently, if there are more counters enabled than can fit on the CPU,
the kernel will multiplex the counters on to the hardware using
round-robin scheduling. That isn't too bad for sampling counters, but
for counting counters it means that the value read from a counter
represents some unknown fraction of the true count of events that
occurred while the counter was enabled.
This remedies the situation by keeping track of how long each counter
is enabled for, and how long it is actually on the cpu and counting
events. These times are recorded in nanoseconds using the task clock
for per-task counters and the cpu clock for per-cpu counters.
These values can be supplied to userspace on a read from the counter.
Userspace requests that they be supplied after the counter value by
setting the PERF_FORMAT_TOTAL_TIME_ENABLED and/or
PERF_FORMAT_TOTAL_TIME_RUNNING bits in the hw_event.read_format field
when creating the counter. (There is no way to change the read format
after the counter is created, though it would be possible to add some
way to do that.)
Using this information it is possible for userspace to scale the count
it reads from the counter to get an estimate of the true count:
true_count_estimate = count * total_time_enabled / total_time_running
This also lets userspace detect the situation where the counter never
got to go on the cpu: total_time_running == 0.
This functionality has been requested by the PAPI developers, and will
be generally needed for interpreting the count values from counting
counters correctly.
In the implementation, this keeps 5 time values (in nanoseconds) for
each counter: total_time_enabled and total_time_running are used when
the counter is in state OFF or ERROR and for reporting back to
userspace. When the counter is in state INACTIVE or ACTIVE, it is the
tstamp_enabled, tstamp_running and tstamp_stopped values that are
relevant, and total_time_enabled and total_time_running are determined
from them. (tstamp_stopped is only used in INACTIVE state.) The
reason for doing it like this is that it means that only counters
being enabled or disabled at sched-in and sched-out time need to be
updated. There are no new loops that iterate over all counters to
update total_time_enabled or total_time_running.
This also keeps separate child_total_time_running and
child_total_time_enabled fields that get added in when reporting the
totals to userspace. They are separate fields so that they can be
atomic. We don't want to use atomics for total_time_running,
total_time_enabled etc., because then we would have to use atomic
sequences to update them, which are slower than regular arithmetic and
memory accesses.
It is possible to measure total_time_running by adding a task_clock
counter to each group of counters, and total_time_enabled can be
measured approximately with a top-level task_clock counter (though
inaccuracies will creep in if you need to disable and enable groups
since it is not possible in general to disable/enable the top-level
task_clock counter simultaneously with another group). However, that
adds extra overhead - I measured around 15% increase in the context
switch latency reported by lat_ctx (from lmbench) when a task_clock
counter was added to each of 2 groups, and around 25% increase when a
task_clock counter was added to each of 4 groups. (In both cases a
top-level task-clock counter was also added.)
In contrast, the code added in this commit gives better information
with no overhead that I could measure (in fact in some cases I
measured lower times with this code, but the differences were all less
than one standard deviation).
[ v2: address review comments by Andrew Morton. ]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Orig-LKML-Reference: <18890.6578.728637.139402@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Allow cpu wide counters to profile userspace by providing what process
the sample belongs to.
This raises the first issue with the output type, lots of these
options: group, tid, callchain, etc.. are non-exclusive and could be
combined, suggesting a bitfield.
However, things like the mmap() data stream doesn't fit in that.
How to split the type field...
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Orig-LKML-Reference: <20090325113317.013775235@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide a {type,size} header for each output entry.
This should provide extensible output, and the ability to mix multiple streams.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Orig-LKML-Reference: <20090325113316.831607932@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix kerneltop 100% CPU usage
Only return a poll event when there's actually been one, poll_wait()
doesn't actually wait for the waitq you pass it, it only enqueues
you on it.
Only once all FDs have been iterated and none of thm returned a
poll-event will it schedule().
Also make it return POLL_HUP when there's not mmap() area to read from.
Further, fix a silly bug in the write code.
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Orig-LKML-Reference: <1237897096.24918.181.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Rework the perfcounter output ABI
use sys_read() only for instant data and provide mmap() output for all
async overflow data.
The first mmap() determines the size of the output buffer. The mmap()
size must be a PAGE_SIZE multiple of 1+pages, where pages must be a
power of 2 or 0. Further mmap()s of the same fd must have the same
size. Once all maps are gone, you can again mmap() with a new size.
In case of 0 extra pages there is no data output and the first page
only contains meta data.
When there are data pages, a poll() event will be generated for each
full page of data. Furthermore, the output is circular. This means
that although 1 page is a valid configuration, its useless, since
we'll start overwriting it the instant we report a full page.
Future work will focus on the output format (currently maintained)
where we'll likey want each entry denoted by a header which includes a
type and length.
Further future work will allow to splice() the fd, also containing the
async overflow data -- splice() would be mutually exclusive with
mmap() of the data.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.470536358@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Much like the atomic_dec_and_lock() function in which we take an hold a
spin_lock if we drop the atomic to 0 this function takes and holds the
mutex if we dec the atomic to 0.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.410913479@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new feature giving performance improvement
This adds the ability for userspace to do an mmap on a hardware counter
fd and get access to a read-only page that contains the information
needed to translate a hardware counter value to the full 64-bit
counter value that would be returned by a read on the fd. This is
useful on architectures that allow user programs to read the hardware
counters, such as PowerPC.
The mmap will only succeed if the counter is a hardware counter
monitoring the current process.
On my quad 2.5GHz PowerPC 970MP machine, userspace can read a counter
and translate it to the full 64-bit value in about 30ns using the
mmapped page, compared to about 830ns for the read syscall on the
counter, so this does give a significant performance improvement.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090323172417.297057964@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tracepoint events like lock_acquire and software counters like
pagefaults can recurse into the perf counter code again, avoid that.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.152096433@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the bitfields turned into a bit of a mess, remove them and rely on
good old masks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.059499915@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix for powerpc
Commit db3a944aca35ae61 ("perf_counter: revamp syscall input ABI")
expanded the hw_event.type field into a union of structs containing
bitfields. In particular it introduced a type field and a raw_type
field, with the intention that the 1-bit raw_type field should
overlay the most-significant bit of the 8-bit type field, and in fact
perf_counter_alloc() now assumes that (or at least, assumes that
raw_type doesn't overlay any of the bits that are 1 in the values of
PERF_TYPE_{HARDWARE,SOFTWARE,TRACEPOINT}).
Unfortunately this is not true on big-endian systems such as PowerPC,
where bitfields are laid out from left to right, i.e. from most
significant bit to least significant. This means that setting
hw_event.type = PERF_TYPE_SOFTWARE will set hw_event.raw_type to 1.
This fixes it by making the layout depend on whether or not
__BIG_ENDIAN_BITFIELD is defined. It's a bit ugly, but that's what
we get for using bitfields in a user/kernel ABI.
Also, that commit didn't fix up some places in arch/powerpc/kernel/
perf_counter.c where hw_event.raw and hw_event.event_id were used.
This fixes them too.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Impact: cleanup
Having 3 slightly different copies of the same code around does nobody
any good. First step in revamping the output format.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.929962222@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: modify ABI
The hardware/software classification in hw_event->type became a little
strained due to the addition of tracepoint tracing.
Instead split up the field and provide a type field to explicitly specify
the counter type, while using the event_id field to specify which event to
use.
Raw counters still work as before, only the raw config now goes into
raw_event.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.836807573@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new perfcounters feature
Enable usage of tracepoints as perf counter events.
tracepoint event ids can be found in /debug/tracing/event/*/*/id
and (for now) are represented as -65536+id in the type field.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.744044174@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Use the generic software events for context switches.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.283522645@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix boot crash
When doing the generic context switch event I ran into some early
boot hangs, which were caused by inf func recursion (event, fault,
event, fault).
I eventually tracked it down to event_list not being initialized
at the time of the first event. Fix this.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.195392657@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I noticed that the counter_list only includes top-level counters, thus
perf_swcounter_event() will miss sw-counters in groups.
Since perf_swcounter_event() also wants an RCU safe list, create a new
event_list that includes all counters and uses RCU list ops and use call_rcu
to free the counter structure.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use hrtimers to profile timer based sampling for the software time
counters.
This allows platforms without hardware counter support to still
perform sample based profiling.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide separate sw counters for major and minor page faults.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide generic software counter infrastructure that supports
software events.
This will be used to allow sample based profiling based on software
events such as pagefaults. The current infrastructure can only
provide a count of such events, no place information.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge reason: we have gathered quite a few conflicts, need to merge upstream
Conflicts:
arch/powerpc/kernel/Makefile
arch/x86/ia32/ia32entry.S
arch/x86/include/asm/hardirq.h
arch/x86/include/asm/unistd_32.h
arch/x86/include/asm/unistd_64.h
arch/x86/kernel/cpu/common.c
arch/x86/kernel/irq.c
arch/x86/kernel/syscall_table_32.S
arch/x86/mm/iomap_32.c
include/linux/sched.h
kernel/Makefile
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently the 9p code crashes when a operation is interrupted, i.e. for
example when the user presses ^C while reading from a file.
This patch fixes the code that is responsible for interruption and flushing
of 9P operations.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
ide-cd: fix REQ_QUIET tests in cdrom_decode_status
Fix up trivial conflicts in include/linux/blkdev.h
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: document the multi-touch (MT) protocol
Input: add detailed multi-touch finger data report protocol
Input: allow certain EV_ABS events to bypass all filtering
Input: bcm5974 - add documentation for the driver
Input: bcm5974 - augment debug information
Input: bcm5974 - Add support for the Macbook 5 (Unibody)
Input: bcm5974 - add quad-finger tapping
Input: bcm5974 - prepare for a new trackpad header type
Input: appletouch - fix DMA to/from stack buffer
Input: wacom - fix TabletPC touch bug
Input: lifebook - add DMI entry for Fujitsu B-2130
Input: ALPS - add signature for Toshiba Satellite Pro M10
Input: elantech - make sure touchpad is really in absolute mode
Input: elantech - provide a workaround for jumpy cursor on firmware 2.34
Input: ucb1400 - use disable_irq_nosync() in irq handler
Input: tsc2007 - use disable_irq_nosync() in irq handler
Input: sa1111ps2 - use disable_irq_nosync() in irq handlers
Input: omap-keypad - use disable_irq_nosync() in irq handler
See http://bugzilla.kernel.org/show_bug.cgi?id=13034
If the port gets into a TIME_WAIT state, then we cannot reconnect without
binding to a new port.
Tested-by: Petr Vandrovec <petr@vandrovec.name>
Tested-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Committed_AS field can underflow in certain situations:
> # while true; do cat /proc/meminfo | grep _AS; sleep 1; done | uniq -c
> 1 Committed_AS: 18446744073709323392 kB
> 11 Committed_AS: 18446744073709455488 kB
> 6 Committed_AS: 35136 kB
> 5 Committed_AS: 18446744073709454400 kB
> 7 Committed_AS: 35904 kB
> 3 Committed_AS: 18446744073709453248 kB
> 2 Committed_AS: 34752 kB
> 9 Committed_AS: 18446744073709453248 kB
> 8 Committed_AS: 34752 kB
> 3 Committed_AS: 18446744073709320960 kB
> 7 Committed_AS: 18446744073709454080 kB
> 3 Committed_AS: 18446744073709320960 kB
> 5 Committed_AS: 18446744073709454080 kB
> 6 Committed_AS: 18446744073709320960 kB
Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does
not check for underflow.
But NR_CPUS proportional isn't good calculation. In general,
possibility of lock contention is proportional to the number of online
cpus, not theorical maximum cpus (NR_CPUS).
The current kernel has generic percpu-counter stuff. using it is right
way. it makes code simplify and percpu_counter_read_positive() don't
make underflow issue.
Reported-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org> [All kernel versions]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some drivers using of_register_platform_driver() wrapper break on sparc
because the wrapper isn't in the header file. This patch moves it from
Microblaze and PowerPC implementations and makes it common code.
Fixes this sparc64 allmodconfig build error (at least):
drivers/leds/leds-gpio.c: In function `gpio_led_init':
drivers/leds/leds-gpio.c:295: error: implicit declaration of function `of_register_platform_driver'
drivers/leds/leds-gpio.c: In function `gpio_led_exit':
drivers/leds/leds-gpio.c:311: error: implicit declaration of function `of_unregister_platform_driver'
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes the problem introduced by commit 3bfacef412 (get rid of
special-casing the /sbin/loader on alpha): osf/1 ecoff binary segfaults
when binfmt_aout built as module. That happens because aout binary
handler gets on the top of the binfmt list due to late registration, and
kernel attempts to execute the binary without preparatory work that must
be done by binfmt_loader.
Fixed by changing the registration order of the default binfmt handlers
using list_add_tail() and introducing insert_binfmt() function which
places new handler on the top of the binfmt list. This might be generally
useful for installing arch-specific frontends for default handlers or just
for overriding them.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Richard Henderson <rth@twiddle.net
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current mem_cgroup_shrink_usage() has two problems.
1. It doesn't call mem_cgroup_out_of_memory and doesn't update
last_oom_jiffies, so pagefault_out_of_memory invokes global OOM.
2. Considering hierarchy, shrinking has to be done from the
mem_over_limit, not from the memcg which the page would be charged to.
mem_cgroup_try_charge_swapin() does all of these things properly, so we
use it and call cancel_charge_swapin when it succeeded.
The name of "shrink_usage" is not appropriate for this behavior, so we
change it too.
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.cn>
Cc: Paul Menage <menage@google.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On a linux-next allyesconfig build:
kernel/trace/ring_buffer.c:1726:
warning: passing argument 1 of 'atomic_cmpxchg' from incompatible pointer type
linux-next/arch/s390/include/asm/atomic.h:112:
note: expected 'struct atomic_t *' but argument is of type 'struct atomic64_t *'
atomic_long_cmpxchg and atomic_long_xchg are incorrectly defined for 64
bit architectures. They should be mapped to the atomic64_* variants.
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OSC's OSD2 target: [git clone git://git.open-osd.org/osc-osd/ master]
(Initiator code prior to this patch must use: "git checkout CDB_VER_OSD2r01"
in the target tree above)
This is a summery of the wire changes:
* OSDv2_ADDITIONAL_CDB_LENGTH == 192 => 228 (Total CDB is now 236 bytes)
* Attributes List Element Header grew, so attribute values are 8 bytes
aligned.
* Cryptographic keys and signatures are 20 => 32
* Few new definitions.
(Still missing new standard definitions attribute values, these do not change
wire format and will be added later when needed)
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
In OSD2r04 draft, cryptographic key size changed to 32 bytes from
OSD1's 20 bytes. This causes a couple of on-the-wire structures
to change, including the CDB.
In this patch the OSD1/OSD2 handling is separated out in regard
to affected structures, but on-the-wire is still the same. All
on the wire changes will be submitted in one patch for bisect-ability.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
In OSD2r05 draft each attribute list element header was changed
so attribute-value would be 8 bytes aligned. In OSD2r01-r04
it was aligned on 2 bytes. (This is because in OSD2r01 the complete
element was 8 bytes padded at end but the header was not adjusted
and caused permanent miss-alignment.)
OSD1 elements are not padded and might be or might not be aligned.
OSD1 is still supported.
In this code we do all the code re-factoring to separate OSD1/OSD2
differences but do not change actual wire format. All wire format
changes will happen in one patch later, for bisect-ability.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When building with a .config generated from 'make allmodconfig'
some build warnings are generated. This patch corrects the warnings,
adds a FC_FID_NONE (= 0) enumeration for FC-IDs and cleans up one
variable naming to meet our variable naming conventions. For example,
fc_lport's should be named "lport," not "lp."
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Rogue ports are currently not tracked on any list. The only reference
to them is through any outstanding exchanges pending on the rogue ports.
If the module is removed while a retry is set on a rogue port
(say a Plogi retry for instance), this retry is not cancelled because there
is no reference to the rogue port in the discovery rports list. Thus the
local port can clean itself up, delete the exchange pool, and then the
rogue port timeout can fire and try to start up another exchange.
This patch tracks the rogue ports in a new list disc->rogue_rports. Creating
a new list instead of using the disc->rports list keeps remote port code
change to a minimum.
1) Whenever a rogue port is created, it is immediately added to the
disc->rogue_rports list.
2) When the rogues port goes to ready, it is removed from the rogue list
and the real remote port is added to the disc->rports list
3) The removal of the rogue from the disc->rogue_rports list is done in
the context of the fc_rport_work() workQ thread in discovery callback.
4) Real rports are removed from the disc->rports list like before. Lookup
is done only in the real rports list. This avoids making large changes
to the remote port code.
5) In fc_disc_stop_rports, the rogues list is traversed in addition to the
real list to stop the rogue ports and issue logoffs on them. This way, rogue
ports get cleaned up when the local port goes away.
6) rogue remote ports are not removed from the list right away, but
removed late in fc_rport_work() context, multiple threads can find the same
remote port in the list and call rport_logoff(). Rport_logoff() only
continues with the logoff if port is not in NONE state, thus preventing
multiple logoffs and multiple list deletions.
7) Since the rport is removed from the disc list at a later stage
(in the disc callback), incoming frames can find the rport even if
rport_logoff() has been called on the rport. When rport_logoff() is called,
the rport state is set to NONE, and we are trying to cancel all exchanges
and retries on that port. While in this state, if an incoming
Plogi/Prli/Logo or other frames match the rport, we should not reply
because the rport is in the NONE state. Just drop the frame, since the
rport will be deleted soon in the disc callback (fc_rport_work)
8) In fc_disc_single(), remove rport lookup and call to fc_disc_del_target.
fc_disc_single() is called from recv_rscn_req() where rport lookup
and rport_logoff is already done.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Set target can queue limit to the number of preallocated
session tasks we have.
This along with the cxgb3i can_queue patch will fix a throughput
problem where it could only queue one LU worth of data at a time.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* 'for-next' of git://git.o-hand.com/linux-mfd:
mfd: fix da903x warning
mfd: fix MAINTAINERS entry
mfd: Use the value of the final spin when reading the AUXADC
mfd: Storage class should be before const qualifier
mfd: PASIC3: supply clock_rate to DS1WM via driver_data
mfd: remove DS1WM clock handling
mfd: remove unused PASIC3 bus_shift field
pxa/magician: remove deprecated .bus_shift from PASIC3 platform_data
mfd: convert PASIC3 to use MFD core
mfd: convert DS1WM to use MFD core
mfd: Support active high IRQs on WM835x
mfd: Use bulk read to fill WM8350 register cache
mfd: remove duplicated #include from pcf50633
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
tracing, net: fix net tree and tracing tree merge interaction
tracing, powerpc: fix powerpc tree and tracing tree interaction
ring-buffer: do not remove reader page from list on ring buffer free
function-graph: allow unregistering twice
trace: make argument 'mem' of trace_seq_putmem() const
tracing: add missing 'extern' keywords to trace_output.h
tracing: provide trace_seq_reserve()
blktrace: print out BLK_TN_MESSAGE properly
blktrace: extract duplidate code
blktrace: fix memory leak when freeing struct blk_io_trace
blktrace: fix blk_probes_ref chaos
blktrace: make classic output more classic
blktrace: fix off-by-one bug
blktrace: fix the original blktrace
blktrace: fix a race when creating blk_tree_root in debugfs
blktrace: fix timestamp in binary output
tracing, Text Edit Lock: cleanup
tracing: filter fix for TRACE_EVENT_FORMAT events
ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
x86: kretprobe-booster interrupt emulation code fix
...
Fix up trivial conflicts in
arch/parisc/include/asm/ftrace.h
include/linux/memory.h
kernel/extable.c
kernel/module.c
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
cpumask: remove cpumask allocation from idle_balance, fix
numa, cpumask: move numa_node_id default implementation to topology.h, fix
cpumask: remove cpumask allocation from idle_balance
x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
x86: microcode: cleanup
x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
numa, cpumask: move numa_node_id default implementation to topology.h
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
cpumask: remove x86 cpumask_t uses.
cpumask: use cpumask_var_t in uv_flush_tlb_others.
cpumask: remove cpumask_t assignment from vector_allocation_domain()
cpumask: make Xen use the new operators.
cpumask: clean up summit's send_IPI functions
cpumask: use new cpumask functions throughout x86
x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
x86: unify 32 and 64-bit node_to_cpumask_map
...
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-module-and-param:
module: use strstarts()
strstarts: helper function for !strncmp(str, prefix, strlen(prefix))
arm: allow usage of string functions in linux/string.h
module: don't use stop_machine on module load
module: create a request_module_nowait()
module: include other structures in module version check
module: remove the SHF_ALLOC flag on the __versions section.
module: clarify the force-loading taint message.
module: Export symbols needed for Ksplice
Ksplice: Add functions for walking kallsyms symbols
module: remove module_text_address()
module: __module_address
module: Make find_symbol return a struct kernel_symbol
kernel/module.c: fix an unused goto label
param: fix charp parameters set via sysfs
Fix trivial conflicts in kernel/extable.c manually.
* 'printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
printk: correct the behavior of printk_timed_ratelimit()
vsprintf: unify the format decoding layer for its 3 users, cleanup
fix regression from "vsprintf: unify the format decoding layer for its 3 users"
vsprintf: fix bug in negative value printing
vsprintf: unify the format decoding layer for its 3 users
vsprintf: add binary printf
printk: introduce printk_once()
Fix trivial conflicts (printk_once vs log_buf_kexec_setup() added near
each other) in include/linux/kernel.h.
This patch adds support for ACPI device driver .notify() methods. If
such a method is present, Linux/ACPI installs a handler for device
notifications (but not for system notifications such as Bus Check,
Device Check, etc). When a device notification occurs, Linux/ACPI
passes it on to the driver's .notify() method.
In most cases, this removes the need for drivers to install their own
handlers for device-specific notifications.
For fixed hardware devices like some power and sleep buttons, there's
no notification value because there's no control method to execute a
Notify opcode. When a fixed hardware device generates an event, we
handle it the same as a regular device notification, except we send
a ACPI_FIXED_HARDWARE_EVENT value. This is outside the normal 0x0-0xff
range used by Notify opcodes.
Several drivers install their own handlers for system Bus Check and
Device Check notifications so they can support hot-plug. This patch
doesn't affect that usage.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This driver requests a clock that usually is supplied by the MFD in which
the DS1WM is contained. Currently, it is impossible for a MFD to register
their clocks with the generic clock API due to different implementations
across architectures.
For now, this patch removes the clock handling from DS1WM altogether,
trusting that the MFD enable/disable functions will switch the clock if
needed. The clock rate is obtained from a new parameter in driver_data.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Removes the now-unused bus_shift field from pasic3_platform_data.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch converts the DS1WM driver into an MFD cell. It also
calculates the bus_shift parameter from the memory resource size.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Instead of always splitting the file offset into 32-bit 'high' and 'low'
parts, just split them into the largest natural word-size - which in C
terms is 'unsigned long'.
This allows 64-bit architectures to avoid the unnecessary 32-bit
shifting and masking for native format (while the compat interfaces will
obviously always have to do it).
This also changes the order of 'high' and 'low' to be "low first". Why?
Because when we have it like this, the 64-bit system calls now don't use
the "pos_high" argument at all, and it makes more sense for the native
system call to simply match the user-mode prototype.
This results in a much more natural calling convention, and allows the
compiler to generate much more straightforward code. On x86-64, we now
generate
testq %rcx, %rcx # pos_l
js .L122 #,
movq %rcx, -48(%rbp) # pos_l, pos
from the C source
loff_t pos = pos_from_hilo(pos_h, pos_l);
...
if (pos < 0)
return -EINVAL;
and the 'pos_h' register isn't even touched. It used to generate code
like
mov %r8d, %r8d # pos_low, pos_low
salq $32, %rcx #, tmp71
movq %r8, %rax # pos_low, pos.386
orq %rcx, %rax # tmp71, pos.386
js .L122 #,
movq %rax, -48(%rbp) # pos.386, pos
which isn't _that_ horrible, but it does show how the natural word size
is just a more sensible interface (same arguments will hold in the user
level glibc wrapper function, of course, so the kernel side is just half
of the equation!)
Note: in all cases the user code wrapper can again be the same. You can
just do
#define HALF_BITS (sizeof(unsigned long)*4)
__syscall(PWRITEV, fd, iov, count, offset, (offset >> HALF_BITS) >> HALF_BITS);
or something like that. That way the user mode wrapper will also be
nicely passing in a zero (it won't actually have to do the shifts, the
compiler will understand what is going on) for the last argument.
And that is a good idea, even if nobody will necessarily ever care: if
we ever do move to a 128-bit lloff_t, this particular system call might
be left alone. Of course, that will be the least of our worries if we
really ever need to care, so this may not be worth really caring about.
[ Fixed for lost 'loff_t' cast noticed by Andrew Morton ]
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ralf Baechle <ralf@linux-mips.org>>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update driver model support in the MTD framework, so it fits
better into the current udev-based hotplug framework:
- Each mtd_info now has a device node. MTD drivers should set
the dev.parent field to point to the physical device, before
setting up partitions or otherwise declaring MTDs.
- Those device nodes always map to /sys/class/mtdX device nodes,
which no longer depend on MTD_CHARDEV.
- Those mtdX sysfs nodes have a "starter set" of attributes;
it's not yet sufficient to replace /proc/mtd.
- Enabling MTD_CHARDEV provides /sys/class/mtdXro/ nodes and the
/sys/class/mtd*/dev attributes (for udev, mdev, etc).
- Include a MODULE_ALIAS_CHARDEV_MAJOR macro. It'll work with
udev creating the /dev/mtd* nodes, not just a static rootfs.
So the sysfs structure is pretty much what you'd expect, except
that readonly chardev nodes are a bit quirky.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We were comparing {bus,devfn} and assuming that a match meant it was the
same device. It doesn't -- the same {bus,devfn} can exist in
multiple PCI domains. Include domain number in device identification
(and call it 'segment' in most places, because there's already a lot of
references to 'domain' which means something else, and this code is
infected with ACPI thinking already).
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Implement the CREATE_EXCLUSIVE4_1 open mode conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26
This mode allows the client to atomically create a file
if it doesn't exist while setting some of its attributes.
It must be implemented if the server supports persistent
reply cache and/or pnfs.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Also, use client minorversion to generate supported attrs
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Support enabling and disabling nfsv4.1 via /proc/fs/nfsd/versions
by writing the strings "+4.1" or "-4.1" correspondingly.
Use user mode nfs-utils (rpc.nfsd option) to enable.
This will allow us to get rid of CONFIG_NFSD_V4_1
[nfsd41: disable support for minorversion by default]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
For nfs41, the open share flags are used also for
delegation "wants" and "signals". Check that they are valid.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
When sessions are used, stateful operation sequenceid and stateid handling
are not used. When sessions are used, on the first open set the seqid to 1,
mark state confirmed and skip seqid processing.
When sessionas are used the stateid generation number is ignored when it is zero
whereas without sessions bad_stateid or stale stateid is returned.
Add flags to propagate session use to all stateful ops and down to
check_stateid_generation.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd4_has_session should return a boolean, not u32]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: pass nfsd4_compoundres * to nfsd4_process_open1]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_stateid_op]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_seqid_op]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Currently we only use cstate->current_fh,
will also be used by nfsd41 code.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Implement the destory_session operation confoming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26
[use sessionid_lock spin lock]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
A session inactivity time compound (lease renewal) or a compound where the
sequence operation has sa_cachethis set to FALSE do not require any pages
to be held in the v4.1 DRC. This is because struct nfsd4_slot is already
caching the session information.
Add logic to the nfs41 server to not cache response pages for solo sequence
responses.
Return nfserr_replay_uncached_rep on the operation following the sequence
operation when sa_cachethis is FALSE.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfsd4_replay_cache_entry]
[nfsd41: rename nfsd4_no_page_in_cache]
[nfsd41 rename nfsd4_enc_no_page_replay]
[nfsd41 nfsd4_is_solo_sequence]
[nfsd41 change nfsd4_not_cached return]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed return type to bool]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 drop parens in nfsd4_is_solo_sequence call]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed "== 0" to "!"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Replace the nfs4_client cl_seqid field with a single struct nfs41_slot used
for the create session replay cache.
The CREATE_SESSION slot sets the sl_session pointer to NULL. Otherwise, the
slot and it's replay cache are used just like the session slots.
Fix unconfirmed create_session replay response by initializing the
create_session slot sequence id to 0.
A future patch will set the CREATE_SESSION cache when a SEQUENCE operation
preceeds the CREATE_SESSION operation. This compound is currently only cached
in the session slot table.
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: revert portion of nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netpp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Implement the create_session operation confoming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26
Look up the client id (generated by the server on exchange_id,
given by the client on create_session).
If neither a confirmed or unconfirmed client is found
then the client id is stale
If a confirmed cilent is found (i.e. we already received
create_session for it) then compare the sequence id
to determine if it's a replay or possibly a mis-ordered rpc.
If the seqid is in order, update the confirmed client seqid
and procedd with updating the session parameters.
If an unconfirmed client_id is found then verify the creds
and seqid. If both match move the client id to confirmed state
and proceed with processing the create_session.
Currently, we do not support persistent sessions, and RDMA.
alloc_init_session generates a new sessionid and creates
a session structure.
NFSD_PAGES_PER_SLOT is used for the max response cached calculation, and for
the counting of DRC pages using the hard limits set in struct srv_serv.
A note on NFSD_PAGES_PER_SLOT:
Other patches in this series allow for NFSD_PAGES_PER_SLOT + 1 pages to be
cached in a DRC slot when the response size is less than NFSD_PAGES_PER_SLOT *
PAGE_SIZE but xdr_buf pages are used. e.g. a READDIR operation will encode a
small amount of data in the xdr_buf head, and then the READDIR in the xdr_buf
pages. So, the hard limit calculation use of pages by a session is
underestimated by the number of cached operations using the xdr_buf pages.
Yet another patch caches no pages for the solo sequence operation, or any
compound where cache_this is False. So the hard limit calculation use of
pages by a session is overestimated by the number of these operations in the
cache.
TODO: improve resource pre-allocation and negotiate session
parameters accordingly. Respect and possibly adjust
backchannel attributes.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
[nfsd41: remove headerpadsz from channel attributes]
Our client and server only support a headerpadsz of 0.
[nfsd41: use DRC limits in fore channel init]
[nfsd41: do not change CREATE_SESSION back channel attrs]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[use sessionid_lock spin lock]
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from alloc_init_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[simplify nfsd4_encode_create_session error handling]
[nfsd41: fix comment style in init_forechannel_attrs]
[nfsd41: allocate struct nfsd4_session and slot table in one piece]
[nfsd41: no need to INIT_LIST_HEAD in alloc_init_session just prior to list_add]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Replay a request in nfsd4_sequence.
Add a minorversion to struct nfsd4_compound_state.
Pass the current slot to nfs4svc_encode_compound res via struct
nfsd4_compoundres to set an NFSv4.1 DRC entry.
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfs4svc_encode_compoundres]
[nfsd41 replace nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Use no more than 1/128th of the number of free pages at nfsd startup for the
v4.1 DRC.
This is an arbitrary default which should probably end up under the control
of an administrator.
Signed-off-by: Andy Adamson <andros@netapp.com>
[moved added fields in struct svc_serv under CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[fix set_max_drc calculation of sv_drc_max_pages]
[moved NFSD_DRC_SIZE_SHIFT's declaration up in header file]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cache all the result pages, including the rpc header in rq_respages[0],
for a request in the slot table cache entry.
Cache the statp pointer from nfsd_dispatch which points into rq_respages[0]
just past the rpc header. When setting a cache entry, calculate and save the
length of the nfs data minus the rpc header for rq_respages[0].
When replaying a cache entry, replace the cached rpc header with the
replayed request rpc result header, unless there is not enough room in the
cached results first page. In that case, use the cached rpc header.
The sessions fore channel maxresponse size cached is set to NFSD_PAGES_PER_SLOT
* PAGE_SIZE. For compounds we are cacheing with operations such as READDIR
that use the xdr_buf->pages to hold data, we choose to cache the extra page of
data rather than copying data from xdr_buf->pages into the xdr_buf->head page.
[nfsd41: limit cache to maxresponsesize_cached]
[nfsd41: mv nfsd4_set_statp under CONFIG_NFSD_V4_1]
[nfsd41: rename nfsd4_move_pages]
[nfsd41: rename page_no variable]
[nfsd41: rename nfsd4_set_cache_entry]
[nfsd41: fix nfsd41_copy_replay_data comment]
[nfsd41: add to nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Implement the sequence operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26
Check for stale clientid (as derived from the sessionid).
Enforce slotid range and exactly-once semantics using
the slotid and seqid.
If everything went well renew the client lease and
mark the slot INPROGRESS.
Add a struct nfsd4_slot pointer to struct nfsd4_compound_state.
To be used for sessions DRC replay.
[nfsd41: rename sequence catchthis to cachethis]
Signed-off-by: Andy Adamson<andros@netapp.com>
[pulled some code to set cstate->slot from "nfsd DRC logic"]
[use sessionid_lock spin lock]
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd: add a struct nfsd4_slot pointer to struct nfsd4_compound_state]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: add nfsd4_session pointer to nfsd4_compound_state]
[nfsd41: set cstate session]
[nfsd41: use cstate session in nfsd4_sequence]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[simplify nfsd4_encode_sequence error handling]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We need to distinguish between client names provided by NFSv4.0 clients
SETCLIENTID and those provided by NFSv4.1 via EXCHANGE_ID when looking
up the clientid by string.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd41: use boolean values for use_exchange_id argument]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: simplify match_clientid_establishment logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Implement the exchange_id operation confoming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-28
Based on the client provided name, hash a client id.
If a confirmed one is found, compare the op's creds and
verifier. If the creds match and the verifier is different
then expire the old client (client re-incarnated), otherwise,
if both match, assume it's a replay and ignore it.
If an unconfirmed client is found, then copy the new creds
and verifer if need update, otherwise assume replay.
The client is moved to a confirmed state on create_session.
In the nfs41 branch set the exchange_id flags to
EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_SUPP_MOVED_REFER
(pNFS is not supported, Referrals are supported,
Migration is not.).
Address various scenarios from section 18.35 of the spec:
1. Check for EXCHGID4_FLAG_UPD_CONFIRMED_REC_A and set
EXCHGID4_FLAG_CONFIRMED_R as appropriate.
2. Return error codes per 18.35.4 scenarios.
3. Update client records or generate new client ids depending on
scenario.
Note: 18.35.4 case 3 probably still needs revisiting. The handling
seems not quite right.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamosn <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use utsname for major_id (and copy to server_scope)]
[nfsd41: fix handling of various exchange id scenarios]
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse use of EXCHGID4_INVAL_FLAG_MASK_A]
[simplify nfsd4_encode_exchange_id error handling]
[nfsd41: embed an xdr_netobj in nfsd4_exchange_id]
[nfsd41: return nfserr_serverfault for spa_how == SP4_MACH_CRED]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Define nfsd41_dec_ops vector and add it to nfsd4_minorversion for
minorversion 1.
Note: nfsd4_enc_ops vector is shared for v4.0 and v4.1
since we don't need to filter out obsolete ops as this is
done in the decoding phase.
exchange_id, create_session, destroy_session, and sequence ops are
implemented as stubs returning nfserr_opnotsupp at this stage.
[was nfsd41: xdr stubs]
[get rid of CONFIG_NFSD_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Simple sessionid hashing using its monotonically increasing sequence number.
Locking considerations:
sessionid_hashtbl access is controlled by the sessionid_lock spin lock.
It must be taken for insert, delete, and lookup.
nfsd4_sequence looks up the session id and if the session is found,
it calls nfsd4_get_session (still under the sessionid_lock).
nfsd4_destroy_session calls nfsd4_put_session after unhashing
it, so when the session's kref reaches zero it's going to get freed.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[we don't use a prime for sessionid hash table size]
[use sessionid_lock spin lock]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This patch provides basic data structures representing the nfs41
sessions and slots, plus helpers for keeping a reference count
on the session and freeing it.
Note that our server only support a headerpadsz of 0 and
it ignores backchannel attributes at the moment.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: remove headerpadsz from channel attributes]
[nfsd41: embed nfsd4_channel in nfsd4_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from nfsd4_slot]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Define all NFSv4.1 common operation and error code constants.
Note that some of the definitions are used by both the nfs41 client
and the server code. This patch is duplicated in the nfs41 and nfsd41
sessions patchset.
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: add exchange id flags]
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[removed server-only hunk changing NFSERR_REPLAY_ME]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: add SEQ4_XX to nfs41-common-protocol]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: generic error code update]
[nfs41: reverse EXCHGID4_INVAL_FLAG_MASK_{A,R}]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be
returned. It is up to the NFSv4.1 client to resend only the operations that
have not been processed.
Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in
nfsd4_proc_compound() only when NFSv4.1 Sessions are used.
Note: this isn't an adequate solution on its own. It's acceptable as a way
to get some minimal 4.1 up and working, but we're going to have to find a
way to avoid returning DELAY in all common cases before 4.1 can really be
considered ready.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse rq_nodeferral negative logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: initialize rq_usedeferral]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
s390: remove arch specific smp_send_stop()
panic: clean up kernel/panic.c
panic, smp: provide smp_send_stop() wrapper on UP too
panic: decrease oops_in_progress only after having done the panic
generic-ipi: eliminate WARN_ON()s during oops/panic
generic-ipi: cleanups
generic-ipi: remove CSD_FLAG_WAIT
generic-ipi: remove kmalloc()
generic IPI: simplify barriers and locking
* 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]
lockdep: remove duplicate CONFIG_DEBUG_LOCKDEP definitions
lockdep: require framepointers for x86
lockdep: remove extra "irq" string
lockdep: fix incorrect state name
All logical processors with APIC ID values of 255 and greater will have their
APIC reported through Processor X2APIC structure (type-9 entry type) and all
logical processors with APIC ID less than 255 will have their APIC reported
through legacy Processor Local APIC (type-0 entry type) only. This is the
same case even for NMI structure reporting.
The Processor X2APIC Affinity structure provides the association between the
X2APIC ID of a logical processor and the proximity domain to which the logical
processor belongs.
For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device()
objects in the ACPI namespace.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: remove compat stuff
HID: constify arrays of struct apple_key_translation
HID: add support for Kye/Genius Ergo 525V
HID: Support Apple mini aluminum keyboard
HID: support for Kensington slimblade device
HID: DragonRise game controller force feedback driver
HID: add support for another version of 0e8f:0003 device in hid-pl
HID: fix race between usb_register_dev() and hiddev_open()
HID: bring back possibility to specify vid/pid ignore on module load
HID: make HID_DEBUG defaults consistent
HID: autosuspend -- fix lockup of hid on reset
HID: hid_reset_resume() needs to be defined only when CONFIG_PM is set
HID: fix USB HID devices after STD with autosuspend
HID: do not try to compile PM code with CONFIG_PM unset
HID: autosuspend support for USB HID
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
trivial: Update my email address
trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
trivial: Fix misspelling of "Celsius".
trivial: remove unused variable 'path' in alloc_file()
trivial: fix a pdlfush -> pdflush typo in comment
trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
trivial: wusb: Storage class should be before const qualifier
trivial: drivers/char/bsr.c: Storage class should be before const qualifier
trivial: h8300: Storage class should be before const qualifier
trivial: fix where cgroup documentation is not correctly referred to
trivial: Give the right path in Documentation example
trivial: MTD: remove EOL from MODULE_DESCRIPTION
trivial: Fix typo in bio_split()'s documentation
trivial: PWM: fix of #endif comment
trivial: fix typos/grammar errors in Kconfig texts
trivial: Fix misspelling of firmware
trivial: cgroups: documentation typo and spelling corrections
trivial: Update contact info for Jochen Hein
trivial: fix typo "resgister" -> "register"
...
This patch contains DST core files, which introduce
block layer, connector and sysfs registration glue and main headers.
Connector is used for the configuration of the node (its type, address,
device name and so on). Sysfs provides bits of information about running
devices in the following format:
+/*
+ * DST sysfs tree for device called 'storage':
+ *
+ * /sys/bus/dst/devices/storage/
+ * /sys/bus/dst/devices/storage/type : 192.168.4.80:1025
+ * /sys/bus/dst/devices/storage/size : 800
+ * /sys/bus/dst/devices/storage/name : storage
+ */
DST header contains structure definitions and protocol command description.
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When extended interrupt mode (x2apic mode) is not supported in a
system, it must set compatibility format interrupt to bypass
interrupt remapping, otherwise compatibility format interrupts
will be blocked.
This will be used when interrupt remapping is enabled while x2apic
is not supported.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch enables suspend/resume for interrupt remapping. During suspend,
interrupt remapping is disabled. When resume, interrupt remapping is enabled
again.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch implements the suspend and resume feature for Intel IOMMU
DMAR. It hooks to kernel suspend and resume interface. When suspend happens, it
saves necessary hardware registers. When resume happens, it restores the
registers and restarts IOMMU by enabling translation, setting up root entry, and
re-enabling queued invalidation.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dma: Add SoF and EoF debugging to ipu_idmac.c, minor cleanup
dw_dmac: add cyclic API to DW DMA driver
dmaengine: Add privatecnt to revert DMA_PRIVATE property
dmatest: add dma interrupts and callbacks
dmatest: add xor test
dmaengine: allow dma support for async_tx to be toggled
async_tx: provide __async_inline for HAS_DMA=n archs
dmaengine: kill some unused headers
dmaengine: initialize tx_list in dma_async_tx_descriptor_init
dma: i.MX31 IPU DMA robustness improvements
dma: improve section assignment in i.MX31 IPU DMA driver
dma: ipu_idmac driver cosmetic clean-up
dmaengine: fail device registration if channel registration fails
* 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext3: Add replace-on-rename hueristics for data=writeback mode
ext3: Add replace-on-truncate hueristics for data=writeback mode
ext3: Use WRITE_SYNC for commits which are caused by fsync()
block_write_full_page: Use synchronous writes for WBC_SYNC_ALL writebacks
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: (32 commits)
regulator: twl4030 VAUX3 supports 3.0V
regulator: Support disabling of unused regulators by machines
regulator: Don't increment use_count for boot_on regulators
twl4030-regulator: expose VPLL2
regulator: refcount fixes
regulator: Don't warn if we failed to get a regulator
regulator: Allow boot_on regulators to be disabled by clients
regulator: Implement list_voltage for WM835x LDOs and DCDCs
twl4030-regulator: list more VAUX4 voltages
regulator: Don't warn on omitted voltage constraints
regulator: Implement list_voltage() for WM8400 DCDCs and LDOs
MMC: regulator utilities
regulator: twl4030 voltage enumeration (v2)
regulator: twl4030 regulators
regulator: get_status() grows kerneldoc
regulator: enumerate voltages (v2)
regulator: Fix get_mode() for WM835x DCDCs
regulator: Allow regulators to set the initial operating mode
regulator: Suggest use of datasheet supply or pin names for consumers
regulator: email - update email address and regulator webpage.
...
* git://git.infradead.org/iommu-2.6:
intel-iommu: Fix address wrap on 32-bit kernel.
intel-iommu: Enable DMAR on 32-bit kernel.
intel-iommu: fix PCI device detach from virtual machine
intel-iommu: VT-d page table to support snooping control bit
iommu: Add domain_has_cap iommu_ops
intel-iommu: Snooping control support
Fixed trivial conflicts in arch/x86/Kconfig and drivers/pci/intel-iommu.c
Need to free the old cpumask for affinity and pending_mask.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49D18FF0.50707@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (41 commits)
NFS: Add mount options to enable local caching on NFS
NFS: Display local caching state
NFS: Store pages from an NFS inode into a local cache
NFS: Read pages from FS-Cache into an NFS inode
NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching
NFS: Add read context retention for FS-Cache to call back with
NFS: FS-Cache page management
NFS: Add some new I/O counters for FS-Cache doing things for NFS
NFS: Invalidate FsCache page flags when cache removed
NFS: Use local disk inode cache
NFS: Define and create inode-level cache objects
NFS: Define and create superblock-level objects
NFS: Define and create server-level objects
NFS: Register NFS for caching and retrieve the top-level index
NFS: Permit local filesystem caching to be enabled for NFS
NFS: Add FS-Cache option bit and debug bit
NFS: Add comment banners to some NFS functions
FS-Cache: Make kAFS use FS-Cache
CacheFiles: A cache that backs onto a mounted filesystem
CacheFiles: Export things for CacheFiles
...
Commit f4112de6b6 ("mm: introduce
debug_kmap_atomic") broke PPC builds with CONFIG_HIGHMEM=y:
CC init/main.o
In file included from include/linux/highmem.h:25,
from include/linux/pagemap.h:11,
from include/linux/mempolicy.h:63,
from init/main.c:53:
arch/powerpc/include/asm/highmem.h: In function 'kmap_atomic_prot':
arch/powerpc/include/asm/highmem.h:98: error: implicit declaration of function 'debug_kmap_atomic'
In file included from include/linux/pagemap.h:11,
from include/linux/mempolicy.h:63,
from init/main.c:53:
include/linux/highmem.h: At top level:
include/linux/highmem.h:196: warning: conflicting types for 'debug_kmap_atomic'
include/linux/highmem.h:196: error: static declaration of 'debug_kmap_atomic' follows non-static declaration
include/asm/highmem.h:98: error: previous implicit declaration of 'debug_kmap_atomic' was here
make[1]: *** [init/main.o] Error 1
make: *** [init] Error 2
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ixp4xx - Fix handling of chained sg buffers
crypto: shash - Fix unaligned calculation with short length
hwrng: timeriomem - Use phys address rather than virt
* 'for-linus' of git://neil.brown.name/md: (53 commits)
md/raid5 revise rules for when to update metadata during reshape
md/raid5: minor code cleanups in make_request.
md: remove CONFIG_MD_RAID_RESHAPE config option.
md/raid5: be more careful about write ordering when reshaping.
md: don't display meaningless values in sysfs files resync_start and sync_speed
md/raid5: allow layout and chunksize to be changed on active array.
md/raid5: reshape using largest of old and new chunk size
md/raid5: prepare for allowing reshape to change layout
md/raid5: prepare for allowing reshape to change chunksize.
md/raid5: clearly differentiate 'before' and 'after' stripes during reshape.
Documentation/md.txt update
md: allow number of drives in raid5 to be reduced
md/raid5: change reshape-progress measurement to cope with reshaping backwards.
md: add explicit method to signal the end of a reshape.
md/raid5: enhance raid5_size to work correctly with negative delta_disks
md/raid5: drop qd_idx from r6_state
md/raid6: move raid6 data processing to raid6_pq.ko
md: raid5 run(): Fix max_degraded for raid level 4.
md: 'array_size' sysfs attribute
md: centralize ->array_sectors modifications
...
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/linux-hdreg-h-cleanup:
remove <linux/ata.h> include from <linux/hdreg.h>
include/linux/hdreg.h: remove unused defines
isd200: use ATA_* defines instead of *_STAT and *_ERR ones
include/linux/hdreg.h: cover WIN_* and friends with #ifndef/#endif __KERNEL__
aoe: WIN_* -> ATA_CMD_*
isd200: WIN_* -> ATA_CMD_*
include/linux/hdreg.h: cover struct hd_driveid with #ifndef/#endif __KERNEL__
xsysace: make it 'struct hd_driveid'-free
ubd_kern: make it 'struct hd_driveid'-free
isd200: make it 'struct hd_driveid'-free
nfs_readpage_async() needs to be non-static so that it can be used as a
fallback for the local on-disk caching should an EIO crop up when reading the
cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Add some new NFS I/O counters for FS-Cache doing things for NFS. A new line is
emitted into /proc/pid/mountstats if caching is enabled that looks like:
fsc: <rok> <rfl> <wok> <wfl> <unc>
Where <rok> is the number of pages read successfully from the cache, <rfl> is
the number of failed page reads against the cache, <wok> is the number of
successful page writes to the cache, <wfl> is the number of failed page writes
to the cache, and <unc> is the number of NFS pages that have been disconnected
from the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Bind data storage objects in the local cache to NFS inodes.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Define and create superblock-level cache index objects (as managed by
nfs_server structs).
Each superblock object is created in a server level index object and is itself
an index into which inode-level objects are inserted.
Ideally there would be one superblock-level object per server, and the former
would be folded into the latter; however, since the "nosharecache" option
exists this isn't possible.
The superblock object key is a sequence consisting of:
(1) Certain superblock s_flags.
(2) Various connection parameters that serve to distinguish superblocks for
sget().
(3) The volume FSID.
(4) The security flavour.
(5) The uniquifier length.
(6) The uniquifier text. This is normally an empty string, unless the fsc=xyz
mount option was used to explicitly specify a uniquifier.
The key blob is of variable length, depending on the length of (6).
The superblock object is given no coherency data to carry in the auxiliary data
permitted by the cache. It is assumed that the superblock is always coherent.
This patch also adds uniquification handling such that two otherwise identical
superblocks, at least one of which is marked "nosharecache", won't end up
trying to share the on-disk cache. It will be possible to manually provide a
uniquifier through a mount option with a later patch to avoid the error
otherwise produced.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Define and create server-level cache index objects (as managed by nfs_client
structs).
Each server object is created in the NFS top-level index object and is itself
an index into which superblock-level objects are inserted.
Ideally there would be one superblock-level object per server, and the former
would be folded into the latter; however, since the "nosharecache" option
exists this isn't possible.
The server object key is a sequence consisting of:
(1) NFS version
(2) Server address family (eg: AF_INET or AF_INET6)
(3) Server port.
(4) Server IP address.
The key blob is of variable length, depending on the length of (4).
The server object is given no coherency data to carry in the auxiliary data
permitted by the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Add FS-Cache option bit to nfs_server struct. This is set to indicate local
on-disk caching is enabled for a particular superblock.
Also add debug bit for local caching operations.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Add a function to install a monitor on the page lock waitqueue for a particular
page, thus allowing the page being unlocked to be detected.
This is used by CacheFiles to detect read completion on a page in the backing
filesystem so that it can then copy the data to the waiting netfs page.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Implement the data I/O part of the FS-Cache netfs API. The documentation and
API header file were added in a previous patch.
This patch implements the following functions for the netfs to call:
(*) fscache_attr_changed().
Indicate that the object has changed its attributes. The only attribute
currently recorded is the file size. Only pages within the set file size
will be stored in the cache.
This operation is submitted for asynchronous processing, and will return
immediately. It will return -ENOMEM if an out of memory error is
encountered, -ENOBUFS if the object is not actually cached, or 0 if the
operation is successfully queued.
(*) fscache_read_or_alloc_page().
(*) fscache_read_or_alloc_pages().
Request data be fetched from the disk, and allocate internal metadata to
track the netfs pages and reserve disk space for unknown pages.
These operations perform semi-asynchronous data reads. Upon returning
they will indicate which pages they think can be retrieved from disk, and
will have set in progress attempts to retrieve those pages.
These will return, in order of preference, -ENOMEM on memory allocation
error, -ERESTARTSYS if a signal interrupted proceedings, -ENODATA if one
or more requested pages are not yet cached, -ENOBUFS if the object is not
actually cached or if there isn't space for future pages to be cached on
this object, or 0 if successful.
In the case of the multipage function, the pages for which reads are set
in progress will be removed from the list and the page count decreased
appropriately.
If any read operations should fail, the completion function will be given
an error, and will also be passed contextual information to allow the
netfs to fall back to querying the server for the absent pages.
For each successful read, the page completion function will also be
called.
Any pages subsequently tracked by the cache will have PG_fscache set upon
them on return. fscache_uncache_page() must be called for such pages.
If supplied by the netfs, the mark_pages_cached() cookie op will be
invoked for any pages now tracked.
(*) fscache_alloc_page().
Allocate internal metadata to track a netfs page and reserve disk space.
This will return -ENOMEM on memory allocation error, -ERESTARTSYS on
signal, -ENOBUFS if the object isn't cached, or there isn't enough space
in the cache, or 0 if successful.
Any pages subsequently tracked by the cache will have PG_fscache set upon
them on return. fscache_uncache_page() must be called for such pages.
If supplied by the netfs, the mark_pages_cached() cookie op will be
invoked for any pages now tracked.
(*) fscache_write_page().
Request data be stored to disk. This may only be called on pages that
have been read or alloc'd by the above three functions and have not yet
been uncached.
This will return -ENOMEM on memory allocation error, -ERESTARTSYS on
signal, -ENOBUFS if the object isn't cached, or there isn't immediately
enough space in the cache, or 0 if successful.
On a successful return, this operation will have queued the page for
asynchronous writing to the cache. The page will be returned with
PG_fscache_write set until the write completes one way or another. The
caller will not be notified if the write fails due to an I/O error. If
that happens, the object will become available and all pending writes will
be aborted.
Note that the cache may batch up page writes, and so it may take a while
to get around to writing them out.
The caller must assume that until PG_fscache_write is cleared the page is
use by the cache. Any changes made to the page may be reflected on disk.
The page may even be under DMA.
(*) fscache_uncache_page().
Indicate that the cache should stop tracking a page previously read or
alloc'd from the cache. If the page was alloc'd only, but unwritten, it
will not appear on disk.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Implement the cookie management part of the FS-Cache netfs client API. The
documentation and API header file were added in a previous patch.
This patch implements the following three functions:
(1) fscache_acquire_cookie().
Acquire a cookie to represent an object to the netfs. If the object in
question is a non-index object, then that object and its parent indices
will be created on disk at this point if they don't already exist. Index
creation is deferred because an index may reside in multiple caches.
(2) fscache_relinquish_cookie().
Retire or release a cookie previously acquired. At this point, the
object on disk may be destroyed.
(3) fscache_update_cookie().
Update the in-cache representation of a cookie. This is used to update
the auxiliary data for coherency management purposes.
With this patch it is possible to have a netfs instruct a cache backend to
look up, validate and create metadata on disk and to destroy it again.
The ability to actually store and retrieve data in the objects so created is
added in later patches.
Note that these functions will never return an error. _All_ errors are
handled internally to FS-Cache.
The worst that can happen is that fscache_acquire_cookie() may return a NULL
pointer - which is considered a negative cookie pointer and can be passed back
to any function that takes a cookie without harm. A negative cookie pointer
merely suppresses caching at that level.
The stub in linux/fscache.h will detect inline the negative cookie pointer and
abort the operation as fast as possible. This means that the compiler doesn't
have to set up for a call in that case.
See the documentation in Documentation/filesystems/caching/netfs-api.txt for
more information.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Add functions to register and unregister a network filesystem or other client
of the FS-Cache service. This allocates and releases the cookie representing
the top-level index for a netfs, and makes it available to the netfs.
If the FS-Cache facility is disabled, then the calls are optimised away at
compile time.
Note that whilst this patch may appear to work with FS-Cache enabled and a
netfs attempting to use it, it will leak the cookie it allocates for the netfs
as fscache_relinquish_cookie() is implemented in a later patch. This will
cause the slab code to emit a warning when the module is removed.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Implement two features of FS-Cache:
(1) The ability to request and release cache tags - names by which a cache may
be known to a netfs, and thus selected for use.
(2) An internal function by which a cache is selected by consulting the netfs,
if the netfs wishes to be consulted.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Make FS-Cache create its /proc interface and present various statistical
information through it. Also provide the functions for updating this
information.
These features are enabled by:
CONFIG_FSCACHE_PROC
CONFIG_FSCACHE_STATS
CONFIG_FSCACHE_HISTOGRAM
The /proc directory for FS-Cache is also exported so that caching modules can
add their own statistics there too.
The FS-Cache module is loadable at this point, and the statistics files can be
examined by userspace:
cat /proc/fs/fscache/stats
cat /proc/fs/fscache/histogram
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Add the API for a generic facility (FS-Cache) by which caches may declare them
selves open for business, and may obtain work to be done from network
filesystems. The header file is included by:
#include <linux/fscache-cache.h>
Documentation for the API is also added to:
Documentation/filesystems/caching/backend-api.txt
This API is not usable without the implementation of the utility functions
which will be added in further patches.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Add the API for a generic facility (FS-Cache) by which filesystems (such as AFS
or NFS) may call on local caching capabilities without having to know anything
about how the cache works, or even if there is a cache:
+---------+
| | +--------------+
| NFS |--+ | |
| | | +-->| CacheFS |
+---------+ | +----------+ | | /dev/hda5 |
| | | | +--------------+
+---------+ +-->| | |
| | | |--+
| AFS |----->| FS-Cache |
| | | |--+
+---------+ +-->| | |
| | | | +--------------+
+---------+ | +----------+ | | |
| | | +-->| CacheFiles |
| ISOFS |--+ | /var/cache |
| | +--------------+
+---------+
General documentation and documentation of the netfs specific API are provided
in addition to the header files.
As this patch stands, it is possible to build a filesystem against the facility
and attempt to use it. All that will happen is that all requests will be
immediately denied as if no cache is present.
Further patches will implement the core of the facility. The facility will
transfer requests from networking filesystems to appropriate caches if
possible, or else gracefully deny them.
If this facility is disabled in the kernel configuration, then all its
operations will trivially reduce to nothing during compilation.
WHY NOT I_MAPPING?
==================
I have added my own API to implement caching rather than using i_mapping to do
this for a number of reasons. These have been discussed a lot on the LKML and
CacheFS mailing lists, but to summarise the basics:
(1) Most filesystems don't do hole reportage. Holes in files are treated as
blocks of zeros and can't be distinguished otherwise, making it difficult
to distinguish blocks that have been read from the network and cached from
those that haven't.
(2) The backing inode must be fully populated before being exposed to
userspace through the main inode because the VM/VFS goes directly to the
backing inode and does not interrogate the front inode's VM ops.
Therefore:
(a) The backing inode must fit entirely within the cache.
(b) All backed files currently open must fit entirely within the cache at
the same time.
(c) A working set of files in total larger than the cache may not be
cached.
(d) A file may not grow larger than the available space in the cache.
(e) A file that's open and cached, and remotely grows larger than the
cache is potentially stuffed.
(3) Writes go to the backing filesystem, and can only be transferred to the
network when the file is closed.
(4) There's no record of what changes have been made, so the whole file must
be written back.
(5) The pages belong to the backing filesystem, and all metadata associated
with that page are relevant only to the backing filesystem, and not
anything stacked atop it.
OVERVIEW
========
FS-Cache provides (or will provide) the following facilities:
(1) Caches can be added / removed at any time, even whilst in use.
(2) Adds a facility by which tags can be used to refer to caches, even if
they're not available yet.
(3) More than one cache can be used at once. Caches can be selected
explicitly by use of tags.
(4) The netfs is provided with an interface that allows either party to
withdraw caching facilities from a file (required for (1)).
(5) A netfs may annotate cache objects that belongs to it. This permits the
storage of coherency maintenance data.
(6) Cache objects will be pinnable and space reservations will be possible.
(7) The interface to the netfs returns as few errors as possible, preferring
rather to let the netfs remain oblivious.
(8) Cookies are used to represent indices, files and other objects to the
netfs. The simplest cookie is just a NULL pointer - indicating nothing
cached there.
(9) The netfs is allowed to propose - dynamically - any index hierarchy it
desires, though it must be aware that the index search function is
recursive, stack space is limited, and indices can only be children of
indices.
(10) Indices can be used to group files together to reduce key size and to make
group invalidation easier. The use of indices may make lookup quicker,
but that's cache dependent.
(11) Data I/O is effectively done directly to and from the netfs's pages. The
netfs indicates that page A is at index B of the data-file represented by
cookie C, and that it should be read or written. The cache backend may or
may not start I/O on that page, but if it does, a netfs callback will be
invoked to indicate completion. The I/O may be either synchronous or
asynchronous.
(12) Cookies can be "retired" upon release. At this point FS-Cache will mark
them as obsolete and the index hierarchy rooted at that point will get
recycled.
(13) The netfs provides a "match" function for index searches. In addition to
saying whether a match was made or not, this can also specify that an
entry should be updated or deleted.
FS-Cache maintains a virtual index tree in which all indices, files, objects
and pages are kept. Bits of this tree may actually reside in one or more
caches.
FSDEF
|
+------------------------------------+
| |
NFS AFS
| |
+--------------------------+ +-----------+
| | | |
homedir mirror afs.org redhat.com
| | |
+------------+ +---------------+ +----------+
| | | | | |
00001 00002 00007 00125 vol00001 vol00002
| | | | |
+---+---+ +-----+ +---+ +------+------+ +-----+----+
| | | | | | | | | | | | |
PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak
| |
PG0 +-------+
| |
00001 00003
|
+---+---+
| | |
PG0 PG1 PG2
In the example above, two netfs's can be seen to be backed: NFS and AFS. These
have different index hierarchies:
(*) The NFS primary index will probably contain per-server indices. Each
server index is indexed by NFS file handles to get data file objects.
Each data file objects can have an array of pages, but may also have
further child objects, such as extended attributes and directory entries.
Extended attribute objects themselves have page-array contents.
(*) The AFS primary index contains per-cell indices. Each cell index contains
per-logical-volume indices. Each of volume index contains up to three
indices for the read-write, read-only and backup mirrors of those volumes.
Each of these contains vnode data file objects, each of which contains an
array of pages.
The very top index is the FS-Cache master index in which individual netfs's
have entries.
Any index object may reside in more than one cache, provided it only has index
children. Any index with non-index object children will be assumed to only
reside in one cache.
The FS-Cache overview can be found in:
Documentation/filesystems/caching/fscache.txt
The netfs API to FS-Cache can be found in:
Documentation/filesystems/caching/netfs-api.txt
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Recruit a page flag to aid in cache management. The following extra flag is
defined:
(1) PG_fscache (PG_private_2)
The marked page is backed by a local cache and is pinning resources in the
cache driver.
If PG_fscache is set, then things that checked for PG_private will now also
check for that. This includes things like truncation and page invalidation.
The function page_has_private() had been added to make the checks for both
PG_private and PG_private_2 at the same time.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
The attached patch causes read_cache_pages() to release page-private data on a
page for which add_to_page_cache() fails. If the filler function fails, then
the problematic page is left attached to the pagecache (with appropriate flags
set, one presumes) and the remaining to-be-attached pages are invalidated and
discarded. This permits pages with caching references associated with them to
be cleaned up.
The invalidatepage() address space op is called (indirectly) to do the honours.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Document the slow work thread pool.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Make the slow work pool configurable through /proc/sys/kernel/slow-work.
(*) /proc/sys/kernel/slow-work/min-threads
The minimum number of threads that should be in the pool as long as it is
in use. This may be anywhere between 2 and max-threads.
(*) /proc/sys/kernel/slow-work/max-threads
The maximum number of threads that should in the pool. This may be
anywhere between min-threads and 255 or NR_CPUS * 2, whichever is greater.
(*) /proc/sys/kernel/slow-work/vslow-percentage
The percentage of active threads in the pool that may be used to execute
very slow work items. This may be between 1 and 99. The resultant number
is bounded to between 1 and one fewer than the number of active threads.
This ensures there is always at least one thread that can process very
slow work items, and always at least one thread that won't.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
Create a dynamically sized pool of threads for doing very slow work items, such
as invoking mkdir() or rmdir() - things that may take a long time and may
sleep, holding mutexes/semaphores and hogging a thread, and are thus unsuitable
for workqueues.
The number of threads is always at least a settable minimum, but more are
started when there's more work to do, up to a limit. Because of the nature of
the load, it's not suitable for a 1-thread-per-CPU type pool. A system with
one CPU may well want several threads.
This is used by FS-Cache to do slow caching operations in the background, such
as looking up, creating or deleting cache objects.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
FIP is the new standard way to discover Fibre-Channel Forwarders (FCFs)
by sending solicitations and listening for advertisements from FCFs.
It also provides for keep-alives and period advertisements so that both
parties know they have connectivity. If the FCF loses connectivity to
the storage fabric, it can send a Link Reset to inform the E_node.
This version is also compatible with pre-FIP implementations, so no
configured selection between FIP mode and non-FIP mode is required.
We wait a couple seconds after sending the initial solicitation
and then send an old-style FLOGI. If we receive any FIP frames,
we use FIP only mode. If the old FLOGI receives a response,
we disable FIP mode. After every reset or link up, this
determination is repeated.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Adds include/scsi/fc/fc_fip.h for FIP protocol definitions.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The foce_softc mem was reserved by libfc_host_alloc as well as
by fcoe_host_alloc.
Removes one liner fcoe_host_alloc completely, instead directly calls
libfc_host_alloc to alloc scsi_host with libfc for just one fcoe_softc
as fcoe private data.
Moves libfc_host_alloc to libfc.h since it is a libfc API, placed
lport_priv API adjacent to libfc_host_alloc since this is related
to scsi_host priv data.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Removes no where used several inline functions prefixed with skb_*
and be16_to_cpu.
Moves fcoe module specific func prototypes to fcoe.c from libfcoe.h,
moved only need for build.
Adds fcoe module header file fcoe.h and then moves fcoe module
specific fcoe_percpu_s and fcoe_softc to fcoe.h from libfcoe.h.
Moves all defines from fcoe.c to fcoe.h since now fcoe module
has its own header file fcoe.h.
[jejb: removed EXPORT_SYMBOL_GPL(fcoe_fc_crc) which caused a section mismatch]
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Moves only required code from fcoe_sw.c to libfcoe.c towards having
just one source file for fcoe module, this gets rid off default sw
transport code in a separate fcoe_sw.c file.
Very minor renaming along this move, dropped _sw_ or _SW_ use
in names and replaced them by _if_ as a auxiliary interface
functions. Now some of these funcs can be removed or merged with
other func after fcoe transport is gone, but that should be
in another patch to keep this patch simple.
Now the libfcoe.c file name for fcoe module doesn't go along well,
so the libfcoe.c file renaming to fcoe.c as the only single fcoe
module file is done in next patch to keep this patch clean
and small for review.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The fcoe transport code was added for generic FCoE transport
infrastructure to allow additional offload related module loading
on demand, this is not required anymore after recently added
different offload approach by having offload related func ops
in netdev.
This patch removes fcoe transport related code use, calls functions
directly between existing libfcoe.c and fcoe_sw.c for now, for
example fcoe_sw_destroy and fcoe_sw_create calling.
The fcoe_sw.c and libfcoe.c code will be further consolidated in
later patches and then also the default fcoe sw transport code
file fcoe_sw.c will be completely removed.
The fcoe transport code files are completely removed in next
patch to keep this patch simple for reviewing.
[This patch is an update to a previous patch. This update
resolves a build error as well as fixes a defect related to
not calling fc_release_transport().]
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Remove the hotplug creation of dev_stats, we allocate for all possible CPUs
now when we allocate the lport.
v2: Durring the 2.6.30 merge window, before these patches were comitted,
'percpu_ptr' was renamed 'per_cpu_ptr'. This latest update updates this
patch for the name change.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Convert fcoe_percpu array to use the per-cpu variables
that the kernel provides. Use the kernel's functions to
access this structure.
The cpu member of the fcoe_percpu_s is no longer needed,
so this patch removes it too.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently the skb_queue is initialized every time the associated
CPU goes online. This patch has libfcoe initializing the skb_queue
for all possible CPUs when the module is loaded.
This patch also re-orders some declarations in the fcoe_rcv()
function so the structure declarations are grouped before
the primitive declarations.
Lastly, this patch converts all CPU indicies to use unsigned int
since CPU indicies should not be negative.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
kmemtrace now uses tracepoints instead of markers. We no longer need to
use format specifiers to pass arguments.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
[ folded: Use the new TP_PROTO and TP_ARGS to fix the build. ]
[ folded: fix build when CONFIG_KMEMTRACE is disabled. ]
[ folded: define tracepoints when CONFIG_TRACEPOINTS is enabled. ]
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <ae61c0f37156db8ec8dc0d5778018edde60a92e3.1237813499.git.eduard.munteanu@linux360.ro>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
linux/percpu.h includes linux/slab.h, which generates circular inclusion
dependencies when trying to switch kmemtrace to use tracepoints instead
of markers.
This patch allows tracing within slab headers' inline functions.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
We want to remove percpu.h from rcupreempt.h, but if we do that
the percpu primitives there wont build anymore. Move them to the
.c file instead.
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix for all non-x86 architectures
We want to remove percpu.h from rcuclassic.h/rcutree.h (for upcoming
kmemtrace changes) but that would break the DECLARE_PER_CPU based
declarations in these files.
Move the quiescent counter management functions to their respective
RCU implementation .c files - they were slightly above the inlining
limit anyway.
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace
changes), but this is not possible currently without breaking the
build because key.h has an implicit include file dependency on
rwsem.h:
CC [M] fs/cifs/cifs_spnego.o
In file included from include/keys/user-type.h:15,
from fs/cifs/cifs_spnego.c:24:
include/linux/key.h:128: error: field ‘sem’ has incomplete type
make[2]: *** [fs/cifs/cifs_spnego.o] Error 1
make[1]: *** [fs/cifs] Error 2
make: *** [fs] Error 2
Fix it by making the dependency explicit.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237884886.25315.39.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace
changes), but this is not possible currently without breaking the
build because fdtable.h has an implicit include file dependency: it
uses __init does not include init.h.
This can cause build failures on non-x86 architectures:
/home/mingo/tip/include/linux/fdtable.h:66: error: expected '=', ',',
';', 'asm' or '__attribute__' before 'files_defer_init'
make[2]: *** [fs/locks.o] Error 1
We got this header included indirectly via rcupdate.h's percpu.h
inclusion - but if that is not there the build will break.
Fix it.
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace
changes), but this is not possible currently without breaking the
build because fs.h has an implicit include file depedency: it
uses PAGE_SIZE but does not include asm/page.h which defines it.
This problem gets masked in practice because most fs.h using sites
use rcupreempt.h (and other headers) which includes percpu.h which
brings in asm/page.h indirectly.
We cannot add asm/page.h to asm/fs.h because page.h is not an
exported header.
Move simple_transaction_set() to the other simple-transaction
file helpers in fs/libfs.c.
This removes the include file hell and also reduces
kernel size a bit.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference: <1237898630.25315.83.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
We want to remove percpu.h from rcupdate.h (for upcoming kmemtrace
changes), but this is not possible currently without breaking the
build because fs.h has implicit include file depedencies: it uses
GFP_* types in inlines but does not include gfp.h.
In practice most fs.h using .c files get gfp.h included implicitly,
via an indirect route: via rcupdate.h inclusion - so this underlying
problem gets masked in practice.
So we want to solve fs.h's dependency on gfp.h.
gfp.h can not be included here directly because it is not exported and it
would break the build the following way:
/home/mingo/tip/usr/include/linux/bsg.h:11: found __[us]{8,16,32,64} type without #include <linux/types.h>
/home/mingo/tip/usr/include/linux/fs.h:11: included file 'linux/gfp.h' is not exported
make[3]: *** [/home/mingo/tip/usr/include/linux/.check] Error 1
make[2]: *** [linux] Error 2
As suggested by Alexey Dobriyan, move alloc_secdata() and free_secdata()
to linux/security.h - they belong there. This also cleans fs.h of GFP_*
usage.
Suggested-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
LKML-Reference: <1237906803.25315.96.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In data=writeback mode, start an asynchronous flush when closing a
file which had been previously truncated down to zero. This lowers
the probability of data loss in the case of applications that attempt
to replace a file using truncate.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
Remove two unneeded exports and make two symbols static in fs/mpage.c
Cleanup after commit 585d3bc06f
Trim includes of fdtable.h
Don't crap into descriptor table in binfmt_som
Trim includes in binfmt_elf
Don't mess with descriptor table in load_elf_binary()
Get rid of indirect include of fs_struct.h
New helper - current_umask()
check_unsafe_exec() doesn't care about signal handlers sharing
New locking/refcounting for fs_struct
Take fs_struct handling to new file (fs/fs_struct.c)
Get rid of bumping fs_struct refcount in pivot_root(2)
Kill unsharing fs_struct in __set_personality()
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (21 commits)
drm/radeon: load the right microcode on rs780
drm: remove unused "can_grow" parameter from drm_crtc_helper_initial_config
drm: fix EDID backward compat check
drm: sync the mode validation for INTERLACE/DBLSCAN
drm: fix typo in edid vendor parsing.
DRM: drm_crtc_helper.h doesn't actually need i2c.h
drm: fix missing inline function on 32-bit powerpc.
drm: Use pgprot_writecombine in GEM GTT mapping to get the right bits for !PAT.
drm/i915: Add a spinlock to protect the active_list
drm/i915: Fix SDVO TV support
drm/i915: Fix SDVO CREATE_PREFERRED_INPUT_TIMING command
drm/i915: Fix error in SDVO DTD and modeline convert
drm/i915: Fix SDVO command debug function
drm/i915: fix TV mode setting in property change
drm/i915: only set TV mode when any property changed
drm/i915: clean up udelay usage
drm/i915: add VGA hotplug support for 945+
drm/i915: correctly set IGD device's gtt size for KMS.
drm/i915: avoid hanging on to a stale pointer to raw_edid.
drm/i915: check for -EINVAL from vm_insert_pfn
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (54 commits)
glge: remove unused #include <version.h>
dnet: remove unused #include <version.h>
tcp: miscounts due to tcp_fragment pcount reset
tcp: add helper for counter tweaking due mid-wq change
hso: fix for the 'invalid frame length' messages
hso: fix for crash when unplugging the device
fsl_pq_mdio: Fix compile failure
fsl_pq_mdio: Revive UCC MDIO support
ucc_geth: Pass proper device to DMA routines, otherwise oops happens
i.MX31: Fixing cs89x0 network building to i.MX31ADS
tc35815: Fix build error if NAPI enabled
hso: add Vendor/Product ID's for new devices
ucc_geth: Remove unused header
gianfar: Remove unused header
kaweth: Fix locking to be SMP-safe
net: allow multiple dev per napi with GRO
r8169: reset IntrStatus after chip reset
ixgbe: Fix potential memory leak/driver panic issue while setting up Tx & Rx ring parameters
ixgbe: fix ethtool -A|a behavior
ixgbe: Patch to fix driver panic while freeing up tx & rx resources
...
Pass the original flags to rwlock arch-code, so that it can re-enable
interrupts if implemented for that architecture.
Initially, make __raw_read_lock_flags and __raw_write_lock_flags stubs
which just do the same thing as non-flags variants.
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SGI has observed that on large systems, interrupts are not serviced for a
long period of time when waiting for a rwlock. The following patch series
re-enables irqs while waiting for the lock, resembling the code which is
already there for spinlocks.
I only made the ia64 version, because the patch adds some overhead to the
fast path. I assume there is currently no demand to have this for other
architectures, because the systems are not so large. Of course, the
possibility to implement raw_{read|write}_lock_flags for any architecture
is still there.
This patch:
The new macro LOCK_CONTENDED_FLAGS expands to the correct implementation
depending on the config options, so that IRQ's are re-enabled when
possible, but they remain disabled if CONFIG_LOCKDEP is set.
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds preadv and pwritev system calls. These syscalls are a
pretty straightforward combination of pread and readv (same for write).
They are quite useful for doing vectored I/O in threaded applications.
Using lseek+readv instead opens race windows you'll have to plug with
locking.
Other systems have such system calls too, for example NetBSD, check
here: http://www.daemon-systems.org/man/preadv.2.html
The application-visible interface provided by glibc should look like
this to be compatible to the existing implementations in the *BSD family:
ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset);
ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset);
This prototype has one problem though: On 32bit archs is the (64bit)
offset argument unaligned, which the syscall ABI of several archs doesn't
allow to do. At least s390 needs a wrapper in glibc to handle this. As
we'll need a wrappers in glibc anyway I've decided to push problem to
glibc entriely and use a syscall prototype which works without
arch-specific wrappers inside the kernel: The offset argument is
explicitly splitted into two 32bit values.
The patch sports the actual system call implementation and the windup in
the x86 system call tables. Other archs follow as separate patches.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-api@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It would be nice to be able to extract the dmesg log from a vmcore file
without needing to keep the debug symbols for the running kernel handy all
the time. We have a facility to do this in /proc/vmcore. This patch adds
the log_buf and log_end symbols to the vmcoreinfo area so that tools (like
makedumpfile) can easily extract the dmesg logs from a vmcore image.
[akpm@linux-foundation.org: several fixes and cleanups]
[akpm@linux-foundation.org: fix unused log_buf_kexec_setup()]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Simon Horman <horms@verge.net.au>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the PCI Device ID of the PCI Bridge Controller on AMD8111 chip.
Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We are wasting 2 words in signal_struct without any reason to implement
task_pgrp_nr() and task_session_nr().
task_session_nr() has no callers since
2e2ba22ea4, we can remove it.
task_pgrp_nr() is still (I believe wrongly) used in fs/autofsX and
fs/coda.
This patch reimplements task_pgrp_nr() via task_pgrp_nr_ns(), and kills
__pgrp/__session and the related helpers.
The change in drivers/char/tty_io.c is cosmetic, but hopefully makes sense
anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Alan Cox <number6@the-village.bc.nu> [tty parts]
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Inho, the safety rules for vnr/nr_ns helpers are horrible and buggy.
task_pid_nr_ns(task) needs rcu/tasklist depending on task == current.
As for "special" pids, vnr/nr_ns helpers always need rcu. However, if
task != current, they are unsafe even under rcu lock, we can't trust
task->group_leader without the special checks.
And almost every helper has a callsite which needs a fix.
Also, it is a bit annoying that the implementations of, say,
task_pgrp_vnr() and task_pgrp_nr_ns() are not "symmetrical".
This patch introduces the new helper, __task_pid_nr_ns(), which is always
safe to use, and turns all other helpers into the trivial wrappers.
After this I'll send another patch which converts task_tgid_xxx() as well,
they're are a bit special.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Even if task == current, it is not safe to dereference the result of
task_pgrp/task_session. We can race with another thread which changes the
special pid via setpgid/setsid.
Document this. The next 2 patches give an example of the unsafe usage, we
have more bad users.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for x8 asynchronous sample rate and ability to specify base
clock frequency.
Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cpuhotplug_mutex_lock() is not used, remove it.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that task_detached() is exported, change tracehook_notify_death() to
use this helper, nobody else checks ->exit_signal == -1 by hand.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Metzger, Markus T" <markus.t.metzger@intel.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
By discussion with Roland.
- Rename ptrace_exit() to exit_ptrace(), and change it to do all the
necessary work with ->ptraced list by its own.
- Move this code from exit.c to ptrace.c
- Update the comment in ptrace_detach() to explain the rechecking of
the child->ptrace.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Metzger, Markus T" <markus.t.metzger@intel.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When ptrace_detach() takes tasklist, the tracee can be SIGKILL'ed. If it
has already passed exit_notify() we can leak a zombie, because a) ptracing
disables the auto-reaping logic, and b) ->real_parent was not notified
about the child's death.
ptrace_detach() should follow the ptrace_exit's logic, change the code
accordingly.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Tested-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Container-init must behave like global-init to processes within the
container and hence it must be immune to unhandled fatal signals from
within the container (i.e SIG_DFL signals that terminate the process).
But the same container-init must behave like a normal process to processes
in ancestor namespaces and so if it receives the same fatal signal from a
process in ancestor namespace, the signal must be processed.
Implementing these semantics requires that send_signal() determine pid
namespace of the sender but since signals can originate from workqueues/
interrupt-handlers, determining pid namespace of sender may not always be
possible or safe.
This patchset implements the design/simplified semantics suggested by
Oleg Nesterov. The simplified semantics for container-init are:
- container-init must never be terminated by a signal from a
descendant process.
- container-init must never be immune to SIGKILL from an ancestor
namespace (so a process in parent namespace must always be able
to terminate a descendant container).
- container-init may be immune to unhandled fatal signals (like
SIGUSR1) even if they are from ancestor namespace. SIGKILL/SIGSTOP
are the only reliable signals to a container-init from ancestor
namespace.
This patch:
Based on an earlier patch submitted by Oleg Nesterov and comments from
Roland McGrath (http://lkml.org/lkml/2008/11/19/258).
The handler parameter is currently unused in the tracehook functions.
Besides, the tracehook functions are called with siglock held, so the
functions can check the handler if they later need to.
Removing the parameter simiplifies changes to sig_ignored() in a follow-on
patch.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The cpuset_zone_allowed() variants are actually only a function of the
zone's node.
Cc: Paul Menage <menage@google.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to pass some data to test_task() or process_task() in some cases.
Will be used later.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Try to use CSS ID for records in swap_cgroup. By this, on 64bit machine,
size of swap_cgroup goes down to 2 bytes from 8bytes.
This means, when 2GB of swap is equipped, (assume the page size is 4096bytes)
From size of swap_cgroup = 2G/4k * 8 = 4Mbytes.
To size of swap_cgroup = 2G/4k * 2 = 1Mbytes.
Reduction is large. Of course, there are trade-offs. This CSS ID will
add overhead to swap-in/swap-out/swap-free.
But in general,
- swap is a resource which the user tend to avoid use.
- If swap is never used, swap_cgroup area is not used.
- Reading traditional manuals, size of swap should be proportional to
size of memory. Memory size of machine is increasing now.
I think reducing size of swap_cgroup makes sense.
Note:
- ID->CSS lookup routine has no locks, it's under RCU-Read-Side.
- memcg can be obsolete at rmdir() but not freed while refcnt from
swap_cgroup is available.
Changelog v4->v5:
- reworked on to memcg-charge-swapcache-to-proper-memcg.patch
Changlog ->v4:
- fixed not configured case.
- deleted unnecessary comments.
- fixed NULL pointer bug.
- fixed message in dmesg.
[nishimura@mxp.nes.nec.co.jp: css_tryget can be called twice in !PageCgroupUsed case]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, mem_cgroup_calc_mapped_ratio() is unused at all. it can be
removed and KAMEZAWA-san suggested it.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add RSS and swap to OOM output from memcg
Display memcg values like failcnt, usage and limit when an OOM occurs due
to memcg.
Thanks to Johannes Weiner, Li Zefan, David Rientjes, Kamezawa Hiroyuki,
Daisuke Nishimura and KOSAKI Motohiro for review.
Sample output
-------------
Task in /a/x killed as a result of limit of /a
memory: usage 1048576kB, limit 1048576kB, failcnt 4183
memory+swap: usage 1400964kB, limit 9007199254740991kB, failcnt 0
[akpm@linux-foundation.org: compilation fix]
[akpm@linux-foundation.org: fix kerneldoc and whitespace]
[akpm@linux-foundation.org: add printk facility level]
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch tries to fix OOM Killer problems caused by hierarchy.
Now, memcg itself has OOM KILL function (in oom_kill.c) and tries to
kill a task in memcg.
But, when hierarchy is used, it's broken and correct task cannot
be killed. For example, in following cgroup
/groupA/ hierarchy=1, limit=1G,
01 nolimit
02 nolimit
All tasks' memory usage under /groupA, /groupA/01, groupA/02 is limited to
groupA's 1Gbytes but OOM Killer just kills tasks in groupA.
This patch provides makes the bad process be selected from all tasks
under hierarchy. BTW, currently, oom_jiffies is updated against groupA
in above case. oom_jiffies of tree should be updated.
To see how oom_jiffies is used, please check mem_cgroup_oom_called()
callers.
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: const fix]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have some read-only files and write-only files, but currently they are
all set to 0644, which is counter-intuitive and cause trouble for some
cgroup tools like libcgroup.
This patch adds 'mode' to struct cftype to allow cgroup subsys to set it's
own files' file mode, and for the most cases cft->mode can be default to 0
and cgroup will figure out proper mode.
Acked-by: Paul Menage <menage@google.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In following situation, with memory subsystem,
/groupA use_hierarchy==1
/01 some tasks
/02 some tasks
/03 some tasks
/04 empty
When tasks under 01/02/03 hit limit on /groupA, hierarchical reclaim
is triggered and the kernel walks tree under groupA. In this case,
rmdir /groupA/04 fails with -EBUSY frequently because of temporal
refcnt from the kernel.
In general. cgroup can be rmdir'd if there are no children groups and
no tasks. Frequent fails of rmdir() is not useful to users.
(And the reason for -EBUSY is unknown to users.....in most cases)
This patch tries to modify above behavior, by
- retries if css_refcnt is got by someone.
- add "return value" to pre_destroy() and allows subsystem to
say "we're really busy!"
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch for Per-CSS(Cgroup Subsys State) ID and private hierarchy code.
This patch attaches unique ID to each css and provides following.
- css_lookup(subsys, id)
returns pointer to struct cgroup_subysys_state of id.
- css_get_next(subsys, id, rootid, depth, foundid)
returns the next css under "root" by scanning
When cgroup_subsys->use_id is set, an id for css is maintained.
The cgroup framework only parepares
- css_id of root css for subsys
- id is automatically attached at creation of css.
- id is *not* freed automatically. Because the cgroup framework
don't know lifetime of cgroup_subsys_state.
free_css_id() function is provided. This must be called by subsys.
There are several reasons to develop this.
- Saving space .... For example, memcg's swap_cgroup is array of
pointers to cgroup. But it is not necessary to be very fast.
By replacing pointers(8bytes per ent) to ID (2byes per ent), we can
reduce much amount of memory usage.
- Scanning without lock.
CSS_ID provides "scan id under this ROOT" function. By this, scanning
css under root can be written without locks.
ex)
do {
rcu_read_lock();
next = cgroup_get_next(subsys, id, root, &found);
/* check sanity of next here */
css_tryget();
rcu_read_unlock();
id = found + 1
} while(...)
Characteristics:
- Each css has unique ID under subsys.
- Lifetime of ID is controlled by subsys.
- css ID contains "ID" and "Depth in hierarchy" and stack of hierarchy
- Allowed ID is 1-65535, ID 0 is UNUSED ID.
Design Choices:
- scan-by-ID v.s. scan-by-tree-walk.
As /proc's pid scan does, scan-by-ID is robust when scanning is done
by following kind of routine.
scan -> rest a while(release a lock) -> conitunue from interrupted
memcg's hierarchical reclaim does this.
- When subsys->use_id is set, # of css in the system is limited to
65535.
[bharata@linux.vnet.ibm.com: remove rcu_read_lock() from css_get_next()]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ns_proxy cgroup allows moving processes to child cgroups only one
level deep at a time. This commit relaxes this restriction and makes it
possible to attach tasks directly to grandchild cgroups, e.g.:
($pid is in the root cgroup)
echo $pid > /cgroup/CG1/CG2/tasks
Previously this operation would fail with -EPERM and would have to be
performed as two steps:
echo $pid > /cgroup/CG1/tasks
echo $pid > /cgroup/CG1/CG2/tasks
Also, the target cgroup no longer needs to be empty to move a task there.
Signed-off-by: Grzegorz Nosek <root@localdomain.pl>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the style of some multi-line comments in cgroup.h to match
Documentation/CodingStyle
Signed-off-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reformat ext3/ioctl.c to make it look more like ext4/ioctl.c and remove
the BKL around ext3_ioctl().
Signed-off-by: Cyrus Massoumi <cyrusm@gmx.net>
Cc: <linux-ext4@vger.kernel.org>
Acked-by: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change spi-gpio so that it is possible to drive SPI communications over
GPIO without the need for a chipselect signal.
This is useful in very small setups where there's only one slave device
on the bus.
This patch does not affect existing setups.
I use this for a tiny communication channel between an embedded device and
a microcontroller. There are not enough GPIOs available for chipselect
and it's not needed anyway in this case.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow GPIOs in GPIOLIB chips to be named. This name is then used when the
GPIO is exported to sysfs, although it could be used elsewhere if deemed
useful.
Signed-off-by: Daniel Silverstone <dsilvers@simtec.co.uk>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The v3020 RTC can be connected to GPIOs as well as to memory-like
interface. Add ability to use GPIO bit-bang for v3020 read-write access.
[akpm@linux-foundation.org: fix off-by-one in error path]
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Define new setup() hook to export the accessor
- Implement accessor methods
Moves some error checking out of the sysfs interface code into the layer
below it, which is now shared by both sysfs and memory access code.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the case of at24, the platform code registers a 'setup' callback with
the at24_platform_data. When the at24 driver detects an EEPROM, it fills
out the read and write functions of the memory_accessor and calls the
setup callback passing the memory_accessor struct. The platform code can
then use the read/write functions in the memory_accessor struct for
reading and writing the EEPROM.
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add an interface by which other kernel code can read/write persistent
memory such as I2C or SPI EEPROMs, or devices which provide NVRAM. Use
cases include storage of board-specific configuration data like Ethernet
addresses and sensor calibrations.
Original idea, review and improvement suggestions by David Brownell.
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is a fairly common operation to have a pointer to a work and to need a
pointer to the delayed work it is contained in. In particular, all
delayed works which want to rearm themselves will have to do that. So it
would seem fair to offer a helper function for this operation.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A new "address_space flag"--AS_MM_ALL_LOCKS--was defined to use the next
available AS flag while the Unevictable LRU was under development. The
Unevictable LRU was using the same flag and "no one" noticed. Current
mainline, since 2.6.28, has same value for two symbolic flag names.
So, define a unique flag value for AS_UNEVICTABLE--up close to the other
flags, [at the cost of an additional #ifdef] so we'll notice next time.
Note that #ifdef is not actually required, if we don't mind having the
unused flag value defined.
Replace #defines with an enum.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: <stable@kernel.org> [2.6.28.x, 2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Include fiemap.h in header-y; it defines the interface for the
FS_IOC_FIEMAP file mapping ioctl.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a number of issues with the per-MM VMA patch:
(1) Make mmap_pages_allocated an atomic_long_t, just in case this is used on
a NOMMU system with more than 2G pages. Makes no difference on a 32-bit
system.
(2) Report vma->vm_pgoff * PAGE_SIZE as a 64-bit value, not a 32-bit value,
lest it overflow.
(3) Move the allocation of the vm_area_struct slab back for fork.c.
(4) Use KMEM_CACHE() for both vm_area_struct and vm_region slabs.
(5) Use BUG_ON() rather than if () BUG().
(6) Make the default validate_nommu_regions() a static inline rather than a
#define.
(7) Make free_page_series()'s objection to pages with a refcount != 1 more
informative.
(8) Adjust the __put_nommu_region() banner comment to indicate that the
semaphore must be held for writing.
(9) Limit the number of warnings about munmaps of non-mmapped regions.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes a build failure with generic debug pagealloc:
mm/debug-pagealloc.c: In function 'set_page_poison':
mm/debug-pagealloc.c:8: error: 'struct page' has no member named 'debug_flags'
mm/debug-pagealloc.c: In function 'clear_page_poison':
mm/debug-pagealloc.c:13: error: 'struct page' has no member named 'debug_flags'
mm/debug-pagealloc.c: In function 'page_poison':
mm/debug-pagealloc.c:18: error: 'struct page' has no member named 'debug_flags'
mm/debug-pagealloc.c: At top level:
mm/debug-pagealloc.c:120: error: redefinition of 'kernel_map_pages'
include/linux/mm.h:1278: error: previous definition of 'kernel_map_pages' was here
mm/debug-pagealloc.c: In function 'kernel_map_pages':
mm/debug-pagealloc.c:122: error: 'debug_pagealloc_enabled' undeclared (first use in this function)
by fixing
- debug_flags should be in struct page
- define DEBUG_PAGEALLOC config option for all architectures
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need full-scale adjustment to fix a TCP miscount in the next
patch, so just move it into a helper and call for that from the
other places.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove an include that isn't actually needed to prevent needless
rebuilds.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The readq/writeq really need to be static inline on the arches which
don't provide them.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The logging API needs an extra function to make cluster mirroring
possible. This new function allows us to check whether a mirror
region is being recovered on another machine in the cluster. This
helps us prevent simultaneous recovery I/O and process I/O to the
same locations on disk.
Cluster-aware log modules will implement this function. Single
machine log modules will not. So, there is no performance
penalty for single machine mirrors.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Acked-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Remove the 'dm_dirty_log_internal' structure. The resulting cleanup
eliminates extra memory allocations. Therefore exposing the internal
list_head to the external 'dm_dirty_log_type' structure is a worthwhile
compromise.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
The tt_internal is really just a list_head to manage registered target_type
in a double linked list,
Here embed the list_head into target_type directly,
1. to avoid kmalloc/kfree;
2. then tt_internal is really unneeded;
Cc: stable@kernel.org
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Today's linux-next build (powerpc ppc64_defconfig) failed like this:
In file included from net/core/skbuff.c:69:
include/trace/skb.h:4: error: expected ')' before '(' token
include/trace/skb.h:4: error: expected ')' before '(' token
[...]
Caused by commit 2939b0469d ("tracing:
replace TP<var> with TP_<var>") from the tracing tree interacting with
commit 4893d39e86 ("Network Drop Monitor:
Add trace declaration for skb frees") from the net tree.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds a cyclic DMA interface to the DW DMA driver. This is
very useful if you want to use the DMA controller in combination with a
sound device which uses cyclic buffers.
Using a DMA channel for cyclic DMA will disable the possibility to use
it as a normal DMA engine until the user calls the cyclic free function
on the DMA channel. Also a cyclic DMA list can not be prepared if the
channel is already active.
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
All <linux/hdreg.h> users that need <linux/ata.h> have been fixed
to include it directly.
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* 'for-linus' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (58 commits)
SUNRPC: Ensure IPV6_V6ONLY is set on the socket before binding to a port
NSM: Fix unaligned accesses in nsm_init_private()
NFS: Simplify logic to compare socket addresses in client.c
NFS: Start PF_INET6 callback listener only if IPv6 support is available
lockd: Start PF_INET6 listener only if IPv6 support is available
SUNRPC: Remove CONFIG_SUNRPC_REGISTER_V4
SUNRPC: rpcb_register() should handle errors silently
SUNRPC: Simplify kernel RPC service registration
SUNRPC: Simplify svc_unregister()
SUNRPC: Allow callers to pass rpcb_v4_register a NULL address
SUNRPC: rpcbind actually interprets r_owner string
SUNRPC: Clean up address type casts in rpcb_v4_register()
SUNRPC: Don't return EPROTONOSUPPORT in svc_register()'s helpers
SUNRPC: Use IPv4 loopback for registering AF_INET6 kernel RPC services
SUNRPC: Set IPV6ONLY flag on PF_INET6 RPC listener sockets
NFS: Revert creation of IPv6 listeners for lockd and NFSv4 callbacks
SUNRPC: Remove @family argument from svc_create() and svc_create_pooled()
SUNRPC: Change svc_create_xprt() to take a @family argument
SUNRPC: svc_setup_socket() gets protocol family from socket
SUNRPC: Pass a family argument to svc_register()
...
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (33 commits)
ext4: Regularize mount options
ext4: fix locking typo in mballoc which could cause soft lockup hangs
ext4: fix typo which causes a memory leak on error path
jbd2: Update locking coments
ext4: Rename pa_linear to pa_type
ext4: add checks of block references for non-extent inodes
ext4: Check for an valid i_mode when reading the inode from disk
ext4: Use WRITE_SYNC for commits which are caused by fsync()
ext4: Add auto_da_alloc mount option
ext4: Use struct flex_groups to calculate get_orlov_stats()
ext4: Use atomic_t's in struct flex_groups
ext4: remove /proc tuning knobs
ext4: Add sysfs support
ext4: Track lifetime disk writes
ext4: Fix discard of inode prealloc space with delayed allocation.
ext4: Automatically allocate delay allocated blocks on rename
ext4: Automatically allocate delay allocated blocks on close
ext4: add EXT4_IOC_ALLOC_DA_BLKS ioctl
ext4: Simplify delalloc code by removing mpage_da_writepages()
ext4: Save stack space by removing fake buffer heads
...
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (59 commits)
ide-floppy: do not complete rq's prematurely
ide: be able to build pmac driver without IDE built-in
ide-pmac: IDE cable detection on Apple PowerBook
ide: inline SELECT_DRIVE()
ide: turn selectproc() method into dev_select() method (take 5)
MAINTAINERS: move old ide-{floppy,tape} entries to CREDITS (take 2)
ide: move data register access out of tf_{read|load}() methods (take 2)
ide: call {in|out}put_data() methods from tf_{read|load}() methods (take 2)
ide-io-std: shorten ide_{in|out}put_data()
ide: rename IDE_TFLAG_IN_[HOB_]FEATURE
ide: turn set_irq() method into write_devctl() method
ide: use ATA_HOB
ide-disk: use ATA_ERR
ide: add support for CFA specified transfer modes (take 3)
ide-iops: only clear DMA words on setting DMA mode
ide: identify data word 53 bit 1 doesn't cover words 62 and 63 (take 3)
au1xxx-ide: auide_{in|out}sw() should be static
ide-floppy: use ide_pio_bytes()
ide-{floppy,tape}: fix padding for PIO transfers
ide: remove CONFIG_BLK_DEV_IDEDOUBLER config option
...
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (88 commits)
PCI: fix HT MSI mapping fix
PCI: don't enable too much HT MSI mapping
x86/PCI: make pci=lastbus=255 work when acpi is on
PCI: save and restore PCIe 2.0 registers
PCI: update fakephp for bus_id removal
PCI: fix kernel oops on bridge removal
PCI: fix conflict between SR-IOV and config space sizing
powerpc/PCI: include pci.h in powerpc MSI implementation
PCI Hotplug: schedule fakephp for feature removal
PCI Hotplug: rename legacy_fakephp to fakephp
PCI Hotplug: restore fakephp interface with complete reimplementation
PCI: Introduce /sys/bus/pci/devices/.../rescan
PCI: Introduce /sys/bus/pci/devices/.../remove
PCI: Introduce /sys/bus/pci/rescan
PCI: Introduce pci_rescan_bus()
PCI: do not enable bridges more than once
PCI: do not initialize bridges more than once
PCI: always scan child buses
PCI: pci_scan_slot() returns newly found devices
PCI: don't scan existing devices
...
Fix trivial append-only conflict in Documentation/feature-removal-schedule.txt
The s1d13xxx chip provides two values of identification value: the
Production id (e.g 13506/13505/13806..) and a revision number 0,1,2,3).
Together these can help us to differentiate between similiar setups.
This patch adds the proper way of grabbing both those values and save them
for future reference (in order to decide what functions a card supports,
e.g acceleration).
We also move away from the concept of all s1d13xxx = s1d13806 when we
really support alot more.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: simplify s1d13xxxfb_probe()]
Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With a postfix decrement t reaches -1 on timeout which results in a
return of 0.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add an accelerator constant so almost all Cirrus are recognized as
accelerators by the fbset command.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix 8bpp mode by adding handling of the Laguna chipsets to various places
and stop trashing a HDR register which probably does not exist on the
Laguna.
Fix compilation warnings about uninitialized variables also.
Finally, all 8bpp, 16bpp and 32bpp modes work on the Laguna chipset.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add additional overflow register setting for Laguna chips.
Also, simplify some code in the cirrusfb_pan_display() and
cirrusfb_blank().
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix trailing whitespace because quilt complained about it.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- the LEAP_YEAR macro is buggy - it references its arg multiple times.
Fix this by turning it into a C function.
- give it a more approriate name
- Move it to rtc.h so that other .c files can use it, instead of copying it.
Cc: dann frazier <dannf@hp.com>
Acked-by: Alessandro Zummo <alessandro.zummo@towertech.it>
Cc: stephane eranian <eranian@googlemail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
autofs_dev-ioctl.h is included by both the kernel module and user space tools
and it includes two kernel header files. Compiles work if the kernel headers
are installed but fail otherwise.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement full support for OF SPI bindings. Now the driver can manage its
own chip selects without any help from the board files and/or fsl_soc
constructors.
The "legacy" code is well isolated and could be removed as time goes by.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The main purpose of this patch is to pass 'struct spi_device' to the chip
select handling routines. This is needed so that we could implement
full-fledged OpenFirmware support for this driver.
While at it, also:
- Replace two {de,activate}_cs routines by single cs_contol().
- Don't duplicate platform data callbacks in mpc83xx_spi struct.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce new wakeup macros that allow passing an event mask to the wakeup
targets. They exactly mimic their non-_poll() counterpart, with the added
event mask passing capability. I did add only the ones currently
requested, avoiding the _nr() and _all() for the moment.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: William Lee Irwin III <wli@movementarian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset introduces wakeup hints for some of the most popular (from
epoll POV) devices, so that epoll code can avoid spurious wakeups on its
waiters.
The problem with epoll is that the callback-based wakeups do not, ATM,
carry any information about the events the wakeup is related to. So the
only choice epoll has (not being able to call f_op->poll() from inside the
callback), is to add the file* to a ready-list and resolve the real events
later on, at epoll_wait() (or its own f_op->poll()) time. This can cause
spurious wakeups, since the wake_up() itself might be for an event the
caller is not interested into.
The rate of these spurious wakeup can be pretty high in case of many
network sockets being monitored.
By allowing devices to report the events the wakeups refer to (at least
the two major classes - POLLIN/POLLOUT), we are able to spare useless
wakeups by proper handling inside the epoll's poll callback.
Epoll will have in any case to call f_op->poll() on the file* later on,
since the change to be done in order to have the full event set sent via
wakeup, is too invasive for the way our f_op->poll() system works (the
full event set is calculated inside the poll function - there are too many
of them to even start thinking the change - also poll/select would need
change too).
Epoll is changed in a way that both devices which send event hints, and
the ones that don't, are correctly handled. The former will gain some
efficiency though.
As a general rule for devices, would be to add an event mask by using
key-aware wakeup macros, when making up poll wait queues. I tested it
(together with the epoll's poll fix patch Andrew has in -mm) and wakeups
for the supported devices are correctly filtered.
Test program available here:
http://www.xmailserver.org/epoll_test.c
This patch:
Nothing revolutionary here. Just using the available "key" that our
wakeup core already support. The __wake_up_locked_key() was no brainer,
since both __wake_up_locked() and __wake_up_locked_key() are thin wrappers
around __wake_up_common().
The __wake_up_sync() function had a body, so the choice was between
borrowing the body for __wake_up_sync_key() and calling it from
__wake_up_sync(), or make an inline and calling it from both. I chose the
former since in most archs it all resolves to "mov $0, REG; jmp ADDR".
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: William Lee Irwin III <wli@movementarian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
People started using eventfd in a semaphore-like way where before they
were using pipes.
That is, counter-based resource access. Where a "wait()" returns
immediately by decrementing the counter by one, if counter is greater than
zero. Otherwise will wait. And where a "post(count)" will add count to
the counter releasing the appropriate amount of waiters. If eventfd the
"post" (write) part is fine, while the "wait" (read) does not dequeue 1,
but the whole counter value.
The problem with eventfd is that a read() on the fd returns and wipes the
whole counter, making the use of it as semaphore a little bit more
cumbersome. You can do a read() followed by a write() of COUNTER-1, but
IMO it's pretty easy and cheap to make this work w/out extra steps. This
patch introduces a new eventfd flag that tells eventfd to only dequeue 1
from the counter, allowing simple read/write to make it behave like a
semaphore. Simple test here:
http://www.xmailserver.org/eventfd-sem.c
To be back-compatible with earlier kernels, userspace applications should
probe for the availability of this feature via
#ifdef EFD_SEMAPHORE
fd = eventfd2 (CNT, EFD_SEMAPHORE);
if (fd == -1 && errno == EINVAL)
<fallback>
#else
<fallback>
#endif
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: <linux-api@vger.kernel.org>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We cover all log-levels by pr_... macros except KERN_CONT one. Add it
for convenience.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that the filesystem freeze operation has been elevated to the VFS, and
is just an ioctl away, some sort of safety net for unintentionally frozen
root filesystems may be in order.
The timeout thaw originally proposed did not get merged, but perhaps
something like this would be useful in emergencies.
For example, freeze /path/to/mountpoint may freeze your root filesystem if
you forgot that you had that unmounted.
I chose 'j' as the last remaining character other than 'h' which is sort
of reserved for help (because help is generated on any unknown character).
I've tested this on a non-root fs with multiple (nested) freezers, as well
as on a system rendered unresponsive due to a frozen root fs.
[randy.dunlap@oracle.com: emergency thaw only if CONFIG_BLOCK enabled]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Takashi Sato <t-sato@yk.jp.nec.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make the following header file changes:
- remove arch ifdefs and asm/suspend.h from linux/suspend.h
- add asm/suspend.h to disk.c (for arch_prepare_suspend())
- add linux/io.h to swsusp.c (for ioremap())
- x86 32/64 bit compile fixes
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Synopsis: if shmem_writepage calls swap_writepage directly, most shmem
swap loads benefit, and a catastrophic interaction between SLUB and some
flash storage is avoided.
shmem_writepage() has always been peculiar in making no attempt to write:
it has just transferred a shmem page from file cache to swap cache, then
let that page make its way around the LRU again before being written and
freed.
The idea was that people use tmpfs because they want those pages to stay
in RAM; so although we give it an overflow to swap, we should resist
writing too soon, giving those pages a second chance before they can be
reclaimed.
That was always questionable, and I've toyed with this patch for years;
but never had a clear justification to depart from the original design.
It became more questionable in 2.6.28, when the split LRU patches classed
shmem and tmpfs pages as SwapBacked rather than as file_cache: that in
itself gives them more resistance to reclaim than normal file pages. I
prepared this patch for 2.6.29, but the merge window arrived before I'd
completed gathering statistics to justify sending it in.
Then while comparing SLQB against SLUB, running SLUB on a laptop I'd
habitually used with SLAB, I found SLUB to run my tmpfs kbuild swapping
tests five times slower than SLAB or SLQB - other machines slower too, but
nowhere near so bad. Simpler "cp -a" swapping tests showed the same.
slub_max_order=0 brings sanity to all, but heavy swapping is too far from
normal to justify such a tuning. The crucial factor on that laptop turns
out to be that I'm using an SD card for swap. What happens is this:
By default, SLUB uses order-2 pages for shmem_inode_cache (and many other
fs inodes), so creating tmpfs files under memory pressure brings lumpy
reclaim into play. One subpage of the order is chosen from the bottom of
the LRU as usual, then the other three picked out from their random
positions on the LRUs.
In a tmpfs load, many of these pages will be ones which already passed
through shmem_writepage, so already have swap allocated. And though their
offsets on swap were probably allocated sequentially, now that the pages
are picked off at random, their swap offsets are scattered.
But the flash storage on the SD card is very sensitive to having its
writes merged: once swap is written at scattered offsets, performance
falls apart. Rotating disk seeks increase too, but less disastrously.
So: stop giving shmem/tmpfs pages a second pass around the LRU, write them
out to swap as soon as their swap has been allocated.
It's surely possible to devise an artificial load which runs faster the
old way, one whose sizing is such that the tmpfs pages on their second
pass are the ones that are wanted again, and other pages not.
But I've not yet found such a load: on all machines, under the loads I've
tried, immediate swap_writepage speeds up shmem swapping: especially when
using the SLUB allocator (and more effectively than slub_max_order=0), but
also with the others; and it also reduces the variance between runs. How
much faster varies widely: a factor of five is rare, 5% is common.
One load which might have suffered: imagine a swapping shmem load in a
limited mem_cgroup on a machine with plenty of memory. Before 2.6.29 the
swapcache was not charged, and such a load would have run quickest with
the shmem swapcache never written to swap. But now swapcache is charged,
so even this load benefits from shmem_writepage directly to swap.
Apologies for the #ifndef CONFIG_SWAP swap_writepage() stub in swap.h:
it's silly because that will never get called; but refactoring shmem.c
sensibly according to CONFIG_SWAP will be a separate task.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
try_to_free_pages() is used for the direct reclaim of up to
SWAP_CLUSTER_MAX pages when watermarks are low. The caller to
alloc_pages_nodemask() can specify a nodemask of nodes that are allowed to
be used but this is not passed to try_to_free_pages(). This can lead to
unnecessary reclaim of pages that are unusable by the caller and int the
worst case lead to allocation failure as progress was not been make where
it is needed.
This patch passes the nodemask used for alloc_pages_nodemask() to
try_to_free_pages().
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The mlock() facility does not exist for NOMMU since all mappings are
effectively locked anyway, so we don't make the bits available when
they're not useful.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Greg Ungerer <gerg@snapgear.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use debug_kmap_atomic in kmap_atomic, kmap_atomic_pfn, and
iomap_atomic_prot_pfn.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
x86 has debug_kmap_atomic_prot() which is error checking function for
kmap_atomic. It is usefull for the other architectures, although it needs
CONFIG_TRACE_IRQFLAGS_SUPPORT.
This patch exposes it to the other architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags. There should be no functional change.
This makes it possible to return much more detailed error information to
the VM (and also can provide more information eg. virtual_address to the
driver, which might be important in some special cases).
This is required for a subsequent fix. And will also make it easier to
merge page_mkwrite() with fault() in future.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On PowerPC we allocate large boot time hashes on node 0. This leads to an
imbalance in the free memory, for example on a 64GB box (4 x 16GB nodes):
Free memory:
Node 0: 97.03%
Node 1: 98.54%
Node 2: 98.42%
Node 3: 98.53%
If we switch to using vmalloc (like ia64 and x86-64) things are more
balanced:
Free memory:
Node 0: 97.53%
Node 1: 98.35%
Node 2: 98.33%
Node 3: 98.33%
For many HPC applications we are limited by the free available memory on
the smallest node, so even though the same amount of memory is used the
better balancing helps.
Since all 64bit NUMA capable architectures should have sufficient vmalloc
space, it makes sense to enable it via CONFIG_64BIT.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9838
On i386, HZ=1000, jiffies_to_clock_t() converts time in a somewhat strange
way from the user's point of view:
# echo 500 >/proc/sys/vm/dirty_writeback_centisecs
# cat /proc/sys/vm/dirty_writeback_centisecs
499
So, we have 5000 jiffies converted to only 499 clock ticks and reported
back.
TICK_NSEC = 999848
ACTHZ = 256039
Keeping in-kernel variable in units passed from userspace will fix issue
of course, but this probably won't be right for every sysctl.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
s390. This patch implements it for the rest of the architectures by
filling the pages with poison byte patterns after free_pages() and
verifying the poison patterns before alloc_pages().
This generic one cannot detect invalid page accesses immediately but
invalid read access may cause invalid dereference by poisoned memory and
invalid write access can be detected after a long delay.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I notice there are many places doing copy_from_user() which follows
kmalloc():
dst = kmalloc(len, GFP_KERNEL);
if (!dst)
return -ENOMEM;
if (copy_from_user(dst, src, len)) {
kfree(dst);
return -EFAULT
}
memdup_user() is a wrapper of the above code. With this new function, we
don't have to write 'len' twice, which can lead to typos/mistakes. It
also produces smaller code and kernel text.
A quick grep shows 250+ places where memdup_user() *may* be used. I'll
prepare a patchset to do this conversion.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a helper function account_page_dirtied(). Use that from two
callsites. reiser4 adds a function which adds a third callsite.
Signed-off-by: Edward Shishkin<edward.shishkin@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: cleanup
In almost cases, for_each_zone() is used with populated_zone(). It's
because almost function doesn't need memoryless node information.
Therefore, for_each_populated_zone() can help to make code simplify.
This patch has no functional change.
[akpm@linux-foundation.org: small cleanup]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
struct tty_operations::proc_fops took it's place and there is one less
create_proc_read_entry() user now!
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Used for gradual switch of TTY drivers from using ->read_proc which helps
with gradual switch from ->read_proc for the whole tree.
As side effect, fix possible race condition when ->data initialized after
PDE is hooked into proc tree.
->proc_fops takes precedence over ->read_proc.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 29a814d2ee (vfs: add hooks for
ext4's delayed allocation support) exported the following functions
mpage_bio_submit()
__mpage_writepage()
for the benefit of ext4's delayed allocation support. Since commit
a1d6cc563b (ext4: Rework the
ext4_da_writepages() function), these functions are not used by the
ext4 driver anymore. However, the now unnecessary exports still
remain, and this patch removes those. Moreover, these two functions
can become static again.
The issue was spotted by namespacecheck.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Don't pull it in sched.h; very few files actually need it and those
can include directly. sched.h itself only needs forward declaration
of struct fs_struct;
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* all changes of current->fs are done under task_lock and write_lock of
old fs->lock
* refcount is not atomic anymore (same protection)
* its decrements are done when removing reference from current; at the
same time we decide whether to free it.
* put_fs_struct() is gone
* new field - ->in_exec. Set by check_unsafe_exec() if we are trying to do
execve() and only subthreads share fs_struct. Cleared when finishing exec
(success and failure alike). Makes CLONE_FS fail with -EAGAIN if set.
* check_unsafe_exec() may fail with -EAGAIN if another execve() from subthread
is in progress.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pure code move; two new helper functions for nfsd and daemonize
(unshare_fs_struct() and daemonize_fs_struct() resp.; for now -
the same code as used to be in callers). unshare_fs_struct()
exported (for nfsd, as copy_fs_struct()/exit_fs() used to be),
copy_fs_struct() and exit_fs() don't need exports anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Since SELECT_DRIVE() has boiled down to a mere dev_select() method call, it now
makes sense to just inline it...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Turn selectproc() method into dev_select() method by teaching it to write to the
device register and moving it from 'struct ide_port_ops' to 'struct ide_tp_ops'.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: benh@kernel.crashing.org
Cc: petkovbb@gmail.com
[bart: add ->dev_select to at91_ide.c and tx4939.c (__BIG_ENDIAN case)]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The feature register has never been readable -- when its location is read, one
gets the error register value; hence rename IDE_TFLAG_IN_[HOB_]FEATURE into
IDE_TFLAG_IN_[HOB_]ERROR and introduce the 'hob_error' field into the 'struct
ide_taskfile' (despite the error register not really depending on the HOB bit).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Turn set_irq() method with its software reset hack into write_devctl() method
(for just writing a value into the device control register) at last...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Fix ide_init_sg_cmd() setup for non-fs requests.
* Convert ide_pc_intr() to use ide_pio_bytes() for floppy media.
* Remove no longer needed ide_io_buffers() and sg/sg_cnt fields
from struct ide_atapi_pc.
* Remove partial completions; kill idefloppy_update_buffers(), as a
result.
* Add some more debugging statements.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
struct ide_atapi_pc is often allocated on the stack and size of ->pc_buf
size is 256 bytes. However since only ide_floppy_create_read_capacity_cmd()
and idetape_create_inquiry_cmd() require such size allocate buffers for
these pc-s explicitely and decrease ->pc_buf size to 64 bytes.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Move ide_map_sg() calls out from ide_build_sglist()
to ide_dma_prepare().
* Pass command to ide_destroy_dmatable().
* Rename ide_build_sglist() to ide_dma_map_sg()
and ide_destroy_dmatable() to ide_dma_unmap_sg().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add (an optional) ->dma_check method for checking if DMA can be
used for a given command and fail DMA setup in ide_dma_prepare()
if necessary.
* Convert alim15x3 and trm290 host drivers to use ->dma_check.
* Rename ali15x3_dma_setup() to ali_dma_check() while at it.
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide_dma_prepare() helper.
* Convert ide_issue_pc() and do_rw_taskfile() to use it.
* Make ide_build_sglist() static.
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
All custom ->dma_timeout implementations call the generic one thus it is
possible to have only an optional method for resetting DMA engine instead:
* Add ->dma_clear method and convert hpt366, pdc202xx_old and sl82c105
host drivers to use it.
* Always use ide_dma_timeout() in ide_dma_timeout_retry() and remove
->dma_timeout method.
* Make ide_dma_timeout() static.
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
We now support arbitrary number of bytes per-IRQ also for fs requests
so remove ide_cd_check_transfer_size() and IDE_AFLAG_LIMIT_NFRAMES.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Export ide_pio_bytes().
* Add ->last_xfer_len field to struct ide_cmd.
* Add ide_cd_error_cmd() helper to ide-cd.
* Convert ide-cd to use scatterlists also for PIO transfers (fs requests
only for now) and get rid of partial completions (except when the error
happens -- which is still subject to change later because looking at
ATAPI spec it seems that the device is free to error the whole transfer
with setting the Error bit only on the last transfer chunk).
* Update ide_cd_{prepare_rw,restore_request,do_request}() accordingly.
* Inline ide_cd_restore_request() into cdrom_start_rw().
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove old artifacts leftover from the platform driver gianfar and
fsl_i2c drivers. These symbols became unused when the drivers
were migrated over to use the of_platform bus.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
It appears I inadvertly introduced rq->lock recursion to the
hrtimer_start() path when I delegated running already expired
timers to softirq context.
This patch fixes it by introducing a __hrtimer_start_range_ns()
method that will not use raise_softirq_irqoff() but
__raise_softirq_irqoff() which avoids the wakeup.
It then also changes schedule() to check for pending softirqs and
do the wakeup then, I'm not quite sure I like this last bit, nor
am I convinced its really needed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
LKML-Reference: <20090313112301.096138802@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At present it is not possible for machine constraints to disable
regulators which have been left on when the system starts, for example
as a result of fixed default configurations in hardware. This means that
power may be wasted by these regulators if they are not in use.
Provide intial support for this with a late_initcall which will disable
any unused regulators if the machine has enabled this feature by calling
regulator_has_full_constraints(). If this has not been called then print
a warning to encourage users to fully specify their constraints so that
we can change this to be the default behaviour in future.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Rather than incrementing the reference count for boot_on regulators
(which prevents them being disabled later on) simply force the
regulator to be enabled when applying the constraints. Previously
boot_on was essentially equivalent to always_on.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Glue between MMC and regulator stacks ... verified with
some OMAP3 boards using adjustable and configured-as-fixed
regulators on several MMC controllers.
These calls are intended to be used by MMC host adapters
using at least one regulator per host. Examples include
slots with regulators supporting multiple voltages and
ones using multiple voltage rails (e.g. DAT4..DAT7 using a
separate supply, or a split rail chip like certain SDIO
WLAN or eMMC solutions).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Support most of the LDO regulators in the twl4030 family chips.
In the case of LDOs supporting MMC/SD, the voltage controls are
used; but in most other cases, the regulator framework is only
used to enable/disable a supplies, conserving power when a given
voltage rail is not needed.
The drivers/mfd/twl4030-core.c code already sets up the various
regulators according to board-specific configuration, and knows
that some chips don't provide the full set of voltage rails.
The omitted regulators are intended to be under hardware control,
such as during the hardware-mediated system powerup, powerdown,
and suspend states. Unless/until software hooks are known to
be safe, they won't be exported here.
These regulators implement the new get_status() operation, but
can't realistically implement get_mode(); the status output is
effectively the result of a vote, with the relevant hardware
inputs not exposed.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Add kerneldoc for the new get_status() message. Fix the existing
kerneldoc for that struct in two ways:
(a) Syntax, making sure parameter descriptions immediately
follow the one-line struct description and that the first
blank lines is before any more expansive description;
(b) Presentation for a few points, to highlight the fact that
the previous "get" methods exist only to report the current
configuration, not to display actual status.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Add a basic mechanism for regulators to report the discrete
voltages they support: list_voltage() enumerates them using
selectors numbered from 0 to an upper bound.
Use those methods to force machine-level constraints into bounds.
(Example: regulator supports 1.8V, 2.4V, 2.6V, 3.3V, and board
constraints for that rail are 2.0V to 3.6V ... so the range of
voltages is then 2.4V to 3.3V on this board.)
Export those voltages to the regulator consumer interface, so for
example regulator hooked up to an MMC/SD/SDIO slot can report the
actual voltage options available to cards connected there.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
This is useful when wishing to run in a fixed operating mode that isn't
the default.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Remove deceased email address and update to new address. Also update
website details in MAINTAINERS with correct page.
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Fix regulator/driver.h missing kernel-doc:
Warning(linux-next-20090120//include/linux/regulator/driver.h:108): No description found for parameter 'get_status'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Liam Girdwood <lrg@slimlogic.co.uk>
cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Commit 872ed3fe176833f7d43748eb88010da4bbd2f983 caused regulator drivers
to take the struct regulator_dev lock themselves which requires that the
struct be visible to them. Band aid this by making the struct visible.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Previously it was not possible to do so, making it impossible for
machines to configure the driver.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Rather than having the regulator init data read from the platform_data
member of the struct device that is registered for the regulator make
the init data an explict argument passed in when registering. This
allows drivers to use the platform data for their own purposes if they
wish.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Regulator: Push lock out of _notifier_call_chain and into caller functions
(side effect of fixing deadlock in regulator_force_disable)
+ Add a voltage changed event.
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Based on previous LKML discussions:
* Update docs for regulator sysfs class attributes to highlight
the fact that all current attributes are intended to be control
inputs, including notably "state" and "opmode" which previously
implied otherwise.
* Define a new regulator driver get_status() method, which is the
first method reporting regulator outputs instead of inputs.
It can report on/off and error status; or instead of simply
"on", report the actual operating mode.
For the moment, this is a sysfs-only interface, not accessible to
regulator clients. Such clients can use the current notification
interfaces to detect errors, if the regulator reports them.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Move the raid6 data processing routines into a standalone module
(raid6_pq) to prepare them to be called from async_tx wrappers and other
non-md drivers/modules. This precludes a circular dependency of raid456
needing the async modules for data processing while those modules in
turn depend on raid456 for the base level synchronous raid6 routines.
To support this move:
1/ The exportable definitions in raid6.h move to include/linux/raid/pq.h
2/ The raid6_call, recovery calls, and table symbols are exported
3/ Extra #ifdef __KERNEL__ statements to enable the userspace raid6test to
compile
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
This makes the includes more explicit, and is preparation for moving
md_k.h to drivers/md/md.h
Remove include/raid/md.h as its only remaining use was to #include
other files.
Signed-off-by: NeilBrown <neilb@suse.de>
The extern function definitions are kernel-internal definitions, so
they belong in md_k.h
The MD_*_VERSION values could reasonably go in a number of places,
but md_u.h seems most reasonable.
This leaves almost nothing in md.h. It will go soon.
Signed-off-by: NeilBrown <neilb@suse.de>
.. as they are part of the user-space interface.
Also move MdpMinorShift into there so we can remove duplication.
Lastly move mdp_major in. It is less obviously part of the user-space
interface, but do_mounts_md.c uses it, and it is acting a bit like
user-space.
Signed-off-by: NeilBrown <neilb@suse.de>
Move the headers with the local structures for the disciplines and
bitmap.h into drivers/md/ so that they are more easily grepable for
hacking and not far away. md.h is left where it is for now as there
are some uses from the outside.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>
There are two problems with is_mddev_idle.
1/ sync_io is 'atomic_t' and hence 'int'. curr_events and all the
rest are 'long'.
So if sync_io were to wrap on a 64bit host, the value of
curr_events would go very negative suddenly, and take a very
long time to return to positive.
So do all calculations as 'int'. That gives us plenty of precision
for what we need.
2/ To initialise rdev->last_events we simply call is_mddev_idle, on
the assumption that it will make sure that last_events is in a
suitable range. It used to do this, but now it does not.
So now we need to be more explicit about initialisation.
Signed-off-by: NeilBrown <neilb@suse.de>
Impact: minor new API
ksplice added a "starts_with" function, which seems like a common need.
When people open-code it they seem to use fixed numbers rather than strlen,
so it's quite a readability win (also, strncmp() almost always wants != 0
on it).
So here's strstarts().
Cc: Anders Kaseorg <andersk@mit.edu>
Cc: Jeff Arnold <jbarnold@mit.edu>
Cc: Tim Abbott <tabbott@mit.edu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There seems to be a common pattern in the kernel where drivers want to
call request_module() from inside a module_init() function. Currently
this would deadlock.
As a result, several drivers go through hoops like scheduling things via
kevent, or creating custom work queues (because kevent can deadlock on them).
This patch changes this to use a request_module_nowait() function macro instead,
which just fires the modprobe off but doesn't wait for it, and thus avoids the
original deadlock entirely.
On my laptop this already results in one less kernel thread running..
(Includes Jiri's patch to use enum umh_wait)
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (bool-ified)
Cc: Jiri Slaby <jirislaby@gmail.com>
Impact: Expose some module.c symbols
Ksplice uses several functions from module.c in order to resolve
symbols and implement dependency handling. Calling these functions
requires holding module_mutex, so it is exported.
(This is just the module part of a bigger add-exports patch from Tim).
Cc: Anders Kaseorg <andersk@mit.edu>
Cc: Jeff Arnold <jbarnold@mit.edu>
Signed-off-by: Tim Abbott <tabbott@mit.edu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: New API
kallsyms_lookup_name only returns the first match that it finds. Ksplice
needs information about all symbols with a given name in order to correctly
resolve local symbols.
kallsyms_on_each_symbol provides a generic mechanism for iterating over the
kallsyms table.
Cc: Jeff Arnold <jbarnold@mit.edu>
Cc: Tim Abbott <tabbott@mit.edu>
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: Replace and remove risky (non-EXPORTed) API
module_text_address() returns a pointer to the module, which given locking
improvements in module.c, is useless except to test for NULL:
1) If the module can't go away, use __module_text_address.
2) Otherwise, just use is_module_text_address().
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: New API, cleanup
ksplice wants to know the bounds of a module, not just the module text.
It makes sense to have __module_address. We then implement
is_module_address and __module_text_address in terms of this (and
change is_module_text_address() to bool while we're at it).
Also, add proper kerneldoc for them all.
Cc: Anders Kaseorg <andersk@mit.edu>
Cc: Jeff Arnold <jbarnold@mit.edu>
Cc: Tim Abbott <tabbott@mit.edu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Impact: fix crash on reading from /sys/module/.../ieee80211_default_rc_algo
The module_param type "charp" simply sets a char * pointer in the
module to the parameter in the commandline string: this is why we keep
the (mangled) module command line around. But when set via sysfs (as
about 11 charp parameters can be) this memory is freed on the way
out of the write(). Future reads hit random mem.
So we kstrdup instead: we have to check we're not in early commandline
parsing, and we have to note when we've used it so we can reliably
kfree the parameter when it's next overwritten, and also on module
unload.
(Thanks to Randy Dunlap for CONFIG_SYSFS=n fixes)
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Diagnosed-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
wireless: remove duplicated .ndo_set_mac_address
netfilter: xtables: fix IPv6 dependency in the cluster match
tg3: Add GRO support.
niu: Add GRO support.
ucc_geth: Fix use-after-of_node_put() in ucc_geth_probe().
gianfar: Fix use-after-of_node_put() in gfar_of_init().
kernel: remove HIPQUAD()
netpoll: store local and remote ip in net-endian
netfilter: fix endian bug in conntrack printks
dmascc: fix incomplete conversion to network_device_ops
gso: Fix support for linear packets
skbuff.h: fix missing kernel-doc
ni5010: convert to net_device_ops
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon: (fschmd) Add support for the FSC Hades IC
hwmon: (fschmd) Add support for the FSC Syleus IC
i2c-i801: Instantiate FSC hardware montioring chips
dmi: Let dmi_walk() users pass private data
hwmon: Define a standard interface for chassis intrusion detection
Move the pcf8591 driver to hwmon
hwmon: (w83627ehf) Only expose in6 or temp3 on the W83667HG
hwmon: (w83627ehf) Add support for W83667HG
hwmon: (w83627ehf) Invert fan pin variables logic
hwmon: (hdaps) Fix Thinkpad X41 axis inversion
hwmon: (hdaps) Allow inversion of separate axis
hwmon: (ds1621) Clean up documentation
hwmon: (ds1621) Avoid unneeded register access
hwmon: (ds1621) Clean up register access
hwmon: (ds1621) Reorder code statements
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
Revert "proc: revert /proc/uptime to ->read_proc hook"
proc 2/2: remove struct proc_dir_entry::owner
proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc
proc: fix sparse warnings in pagemap_read()
proc: move fs/proc/inode-alloc.txt comment into a source file
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PCI PM: Make pci_prepare_to_sleep() disable wake-up if needed
radeonfb: Use __pci_complete_power_transition()
PCI PM: Introduce __pci_[start|complete]_power_transition() (rev. 2)
PCI PM: Restore config spaces of all devices during early resume
PCI PM: Make pci_set_power_state() handle devices with no PM support
PCI PM: Put devices into low power states during late suspend (rev. 2)
PCI PM: Move pci_restore_standard_config to pci-driver.c
PCI PM: Use pci_set_power_state during early resume
PCI PM: Consistently use variable name "error" for pm call return values
kexec: Change kexec jump code ordering
PM: Change hibernation code ordering
PM: Change suspend code ordering
PM: Rework handling of interrupts during suspend-resume
PM: Introduce functions for suspending and resuming device interrupts
Fix this build error when REISERFS_FS_POSIX_ACL is not set:
fs/reiserfs/inode.c: In function 'reiserfs_new_inode':
fs/reiserfs/inode.c:1919: warning: passing argument 1 of 'reiserfs_inherit_default_acl' from incompatible pointer type
fs/reiserfs/inode.c:1919: warning: passing argument 2 of 'reiserfs_inherit_default_acl' from incompatible pointer type
fs/reiserfs/inode.c:1919: warning: passing argument 3 of 'reiserfs_inherit_default_acl' from incompatible pointer type
fs/reiserfs/inode.c:1919: error: too many arguments to function 'reiserfs_inherit_default_acl'
due to a missing transaction-handle argument in the non-acl
compatibility function.
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.
We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.
But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.
->read_proc/->write_proc were just fixed to not require ->owner for
protection.
rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.
Removing ->owner will also make PDE smaller.
So, let's nuke it.
Kudos to Jeff Layton for reminding about this, let's say, oversight.
http://bugzilla.kernel.org/show_bug.cgi?id=12454
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (53 commits)
drm: detect hdmi monitor by hdmi identifier (v3)
drm: drm_fops.c unlock missing on error path
drm: reorder struct drm_ioctl_desc to save space on 64 bit builds
radeon: add some new pci ids
drm: read EDID extensions from monitor
drm: Use a little stash on the stack to avoid kmalloc in most DRM ioctls.
drm/radeon: add regs required for occlusion queries support
drm/i915: check the return value from the copy from user
drm/radeon: fix logic in r600_page_table_init() to match ati_gart
drm/radeon: r600 ptes are 64-bit, cleanup cleanup function.
drm/radeon: don't call irq changes on r600 suspend/resume
drm/radeon: fix r600 writeback across suspend/resume
drm/radeon: fix r600 writeback setup.
drm: fix warnings about new mappings in info code.
drm/radeon: NULL noise: drivers/gpu/drm/radeon/radeon_*.c
drm/radeon: fix r600 pci mapping calls.
drm/radeon: r6xx/r7xx: fix possible oops in r600_page_table_cleanup()
radeon: call the correct idle function, logic got inverted.
drm/radeon: RS600: fix interrupt handling
drm/r600: fix rptr address along lines of previous fixes to radeon.
...
* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
dma-debug: make memory range checks more consistent
dma-debug: warn of unmapping an invalid dma address
dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
dma-debug/x86: register pci bus for dma-debug leak detection
dma-debug: add a check dma memory leaks
dma-debug: add checks for kernel text and rodata
dma-debug: print stacktrace of mapping path on unmap error
dma-debug: Documentation update
dma-debug: x86 architecture bindings
dma-debug: add function to dump dma mappings
dma-debug: add checks for sync_single_sg_*
dma-debug: add checks for sync_single_range_*
dma-debug: add checks for sync_single_*
dma-debug: add checking for [alloc|free]_coherent
dma-debug: add add checking for map/unmap_sg
dma-debug: add checking for map/unmap_page/single
dma-debug: add core checking functions
dma-debug: add debugfs interface
dma-debug: add kernel command line parameters
dma-debug: add initialization code
...
Fix trivial conflicts due to whitespace changes in arch/x86/kernel/pci-nommu.c
The radeonfb driver needs to program the device's PMCSR directly due
to some quirky hardware it has to handle (see
http://bugzilla.kernel.org/show_bug.cgi?id=12846 for details) and
after doing that it needs to call the platform (usually ACPI) to
finish the power transition of the device. Currently it uses
pci_set_power_state() for this purpose, however making a specific
assumption about the internal behavior of this function, which has
changed recently so that this assumption is no longer satisfied.
For this reason, introduce __pci_complete_power_transition() that may
be called by the radeonfb driver to complete the power transition of
the device. For symmetry, introduce __pci_start_power_transition().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce helper functions allowing us to prevent device drivers from
getting any interrupts (without disabling interrupts on the CPU)
during suspend (or hibernation) and to make them start to receive
interrupts again during the subsequent resume. These functions make it
possible to keep timer interrupts enabled while the "late" suspend and
"early" resume callbacks provided by device drivers are being
executed. In turn, this allows device drivers' "late" suspend and
"early" resume callbacks to sleep, execute ACPI callbacks etc.
The functions introduced here will be used to rework the handling of
interrupts during suspend (hibernation) and resume. Namely,
interrupts will only be disabled on the CPU right before suspending
sysdevs, while device drivers will be prevented from receiving
interrupts, with the help of the new helper function, before their
"late" suspend callbacks run (and analogously during resume).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
At the moment, dmi_walk() lacks flexibility, users can't pass data to
the callback function. Add a pointer for private data to make this
function more flexible.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Roland Dreier <rolandd@cisco.com>
This patch is a simple s/p_._//g to the reiserfs code. This is the
fifth in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_tb/tb/g to the reiserfs code. This is the
fourth in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_inode/inode/g to the reiserfs code. This
is the third in a series of patches to rip out some of the awful
variable naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_bh/bh/g to the reiserfs code. This is the
second in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_sb/sb/g to the reiserfs code. This is the
first in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch strips trailing whitespace from the reiserfs code.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some time ago, some changes were made to make security inode attributes
be atomically written during inode creation. ReiserFS fell behind in
this area, but with the reworking of the xattr code, it's now fairly
easy to add.
The following patch adds the ability for security attributes to be added
automatically during inode creation.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current reiserfs xattr implementation open codes reiserfs_readdir
and frees the path before calling the filldir function. Typically, the
filldir function is something that modifies the file system, such as a
chown or an inode deletion that also require reading of an inode
associated with each direntry. Since the file system is modified, the
path retained becomes invalid for the next run. In addition, it runs
backwards in attempt to minimize activity.
This is clearly suboptimal from a code cleanliness perspective as well
as performance-wise.
This patch implements a generic reiserfs_for_each_xattr that uses the
generic readdir and a specific filldir routine that simply populates an
array of dentries and then performs a specific operation on them. When
all files have been operated on, it then calls the operation on the
directory itself.
The result is a noticable code reduction and better performance.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Deadlocks are possible in the xattr code between the journal lock and the
xattr sems.
This patch implements journalling for xattr operations. The benefit is
twofold:
* It gets rid of the deadlock possibility by always ensuring that xattr
write operations are initiated inside a transaction.
* It corrects the problem where xattr backing files aren't considered any
differently than normal files, despite the fact they are metadata.
I discussed the added journal load with Chris Mason, and we decided that
since xattrs (versus other journal activity) is fairly rare, the introduction
of larger transactions to support journaled xattrs wouldn't be too big a deal.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig had asked me quite some time ago to port the reiserfs
xattrs to the generic xattr interface.
This patch replaces the reiserfs-specific xattr handling code with the
generic struct xattr_handler.
However, since reiserfs doesn't split the prefix and name when accessing
xattrs, it can't leverage generic_{set,get,list,remove}xattr without
needlessly reconstructing the name on the back end.
Update 7/26/07: Added missing dput() to deletion path.
Update 8/30/07: Added missing mark_inode_dirty when i_mode is used to
represent an ACL and no previous ACL existed.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The per-inode locking can be made more fine-grained to surround just the
interaction with the filesystem itself. This really only applies to
protecting reads during a write, since concurrent writes are barred with
inode->i_mutex at the vfs level.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the switch to using inode->i_mutex locking during lookups/creation
in the xattr root, the per-super xattr lock is no longer needed.
This patch removes it.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current reiserfs xattr implementation will not clean up old xattr
files if files are deleted when REISERFS_FS_XATTR is unset. This
results in inaccessible lost files, wasting space.
This patch compiles in basic xattr knowledge, such as how to delete them
and change ownership for quota tracking. If the file system has never
used xattrs, then the operation is quite fast: it returns immediately
when it sees there is no .reiserfs_priv directory.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are a number of helper functions for marking a reiserfs inode
private that were leftover from reiserfs did its own thing wrt to
private inodes. S_PRIVATE has been in the kernel for some time, so this
patch removes the helpers and uses IS_PRIVATE instead.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Although reiserfs can currently handle severe errors such as journal failure,
it cannot handle less severe errors like metadata i/o failure. The following
patch adds a reiserfs_error() function akin to the one in ext3.
Subsequent patches will use this new error handler to handle errors more
gracefully in general.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch kills off reiserfs_journal_abort as it is never called, and
combines __reiserfs_journal_abort_{soft,hard} into one function called
reiserfs_abort_journal, which performs the same work. It is silent
as opposed to the old version, since the message was always issued
after a regular 'abort' message.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ReiserFS panics can be somewhat inconsistent.
In some cases:
* a unique identifier may be associated with it
* the function name may be included
* the device may be printed separately
This patch aims to make warnings more consistent. reiserfs_warning() prints
the device name, so printing it a second time is not required. The function
name for a warning is always helpful in debugging, so it is now automatically
inserted into the output. Hans has stated that every warning should have
a unique identifier. Some cases lack them, others really shouldn't have them.
reiserfs_warning() now expects an id associated with each message. In the
rare case where one isn't needed, "" will suffice.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
uniqueness2type and type2uniquness issue a warning when the value is
unknown. When called from reiserfs_warning, this causes a re-entrancy
problem and deadlocks on the error buffer lock.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ReiserFS warnings can be somewhat inconsistent.
In some cases:
* a unique identifier may be associated with it
* the function name may be included
* the device may be printed separately
This patch aims to make warnings more consistent. reiserfs_warning() prints
the device name, so printing it a second time is not required. The function
name for a warning is always helpful in debugging, so it is now automatically
inserted into the output. Hans has stated that every warning should have
a unique identifier. Some cases lack them, others really shouldn't have them.
reiserfs_warning() now expects an id associated with each message. In the
rare case where one isn't needed, "" will suffice.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes leaf_paste_entries more consistent with respect to the
other leaf operations. Using buffer_info instead of buffer_head
directly allows us to get a superblock pointer for use in error
handling.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes up the reiserfs code such that transaction ids are
always unsigned ints. In places they can currently be signed ints or
unsigned longs.
The former just causes an annoying clm-2200 warning and may join a
transaction when it should wait.
The latter is just for correctness since the disk format uses a 32-bit
transaction id. There aren't any runtime problems that result from it
not wrapping at the correct location since the value is truncated
correctly even on big endian systems. The 0 value might make it to
disk, but the mount-time checks will bump it to 10 itself.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following patch adds the fields for tracking mount counts and last
fsck timestamps to the superblock. It also increments the mount count
on every read-write mount.
Reiserfsprogs 3.6.21 added support for these fields.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'x86-stage-3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (190 commits)
Revert "cpuacct: reduce one NULL check in fast-path"
Revert "x86: don't compile vsmp_64 for 32bit"
x86: Correct behaviour of irq affinity
x86: early_ioremap_init(), use __fix_to_virt(), because we are sure it's safe
x86: use default_cpu_mask_to_apicid for 64bit
x86: fix set_extra_move_desc calling
x86, PAT, PCI: Change vma prot in pci_mmap to reflect inherited prot
x86/dmi: fix dmi_alloc() section mismatches
x86: e820 fix various signedness issues in setup.c and e820.c
x86: apic/io_apic.c define msi_ir_chip and ir_ioapic_chip all the time
x86: irq.c keep CONFIG_X86_LOCAL_APIC interrupts together
x86: irq.c use same path for show_interrupts
x86: cpu/cpu.h cleanup
x86: Fix a couple of sparse warnings in arch/x86/kernel/apic/io_apic.c
Revert "x86: create a non-zero sized bm_pte only when needed"
x86: pci-nommu.c cleanup
x86: io_delay.c cleanup
x86: rtc.c cleanup
x86: i8253 cleanup
x86: kdebugfs.c cleanup
...
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (180 commits)
powerpc: clean up ssi.txt, add definition for fsl,ssi-asynchronous
powerpc/85xx: Add support for the "socrates" board (MPC8544).
powerpc: Fix bugs introduced by sysfs changes
powerpc: Sanitize stack pointer in signal handling code
powerpc: Add write barrier before enabling DTL flags
powerpc/83xx: Update ranges in gianfar node to match other dts
powerpc/86xx: Move gianfar mdio nodes under the ethernet nodes
powerpc/85xx: Move gianfar mdio nodes under the ethernet nodes
powerpc/83xx: Move gianfar mdio nodes under the ethernet nodes
powerpc/83xx: Add power management support for MPC837x boards
powerpc/mm: Introduce early_init_mmu() on 64-bit
powerpc/mm: Add option for non-atomic PTE updates to ppc64
powerpc/mm: Fix printk type warning in mmu_context_nohash
powerpc/mm: Rename arch/powerpc/kernel/mmap.c to mmap_64.c
powerpc/mm: Merge various PTE bits and accessors definitions
powerpc/mm: Tweak PTE bit combination definitions
powerpc/cell: Fix iommu exception reporting
powerpc/mm: e300c2/c3/c4 TLB errata workaround
powerpc/mm: Used free register to save a few cycles in SW TLB miss handling
powerpc/mm: Remove unused register usage in SW TLB miss handling
...
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (70 commits)
ide: keep track of number of bytes instead of sectors in struct ide_cmd
ide: remove ide_execute_pkt_cmd() (v2)
ide: add ->dma_timer_expiry method and remove ->dma_exec_cmd one (v2)
ide: set hwif->expiry prior to calling [__]ide_set_handler()
ide: use do_rw_taskfile() for ATA_CMD_PACKET commands
ide: pass command to ide_map_sg()
ide: remove ide_end_request()
ide: use ide_end_rq() in ide_complete_rq()
ide: pass number of bytes to complete to ide_complete_rq()
ide: remove BUG() from ide_complete_rq()
ide: move rq->errors quirk out from ide_end_request()
ide: pass error value to ide_complete_rq()
ide: sanitize ide_end_rq()
ide: add ide_end_rq() (v2)
ide: make ide_special_rq() BUG() on unknown requests
ide: sanitize ide_finish_cmd()
ide: use ide_complete_cmd() for REQ_UNPARK_HEADS
ide: use ide_complete_cmd() for head unload commands
ide: task_error() -> task_error_cmd()
ide: unify exit paths in task_pio_intr()
...
Adds support for Xen info under /sys/hypervisor. Taken from Novell 2.6.27
backport tree.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Remove suspend_cancel hook from xenbus_driver, in preparation for using
the device model for suspending.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
This driver is used by application which wish to receive notifications
from the hypervisor or other guests via Xen's event channel
mechanism. In particular it is used by the xenstore daemon in domain
0.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Given an evtchn, return the corresponding irq.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Shared memory mappings on nommu machines require a get_unmapped_area
file operation that suggests an address for the mapping. This patch
adds a way for v4l2 drivers to provide this callback.
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The code in the new sq905c.c is based upon the structure of the code in
gspca/sq905.c, and upon the code in libgphoto2/camlibs/digigr8, which supports
the same set of cameras in stillcam mode. I am a co-author of gspca/sq905.c and
I am the sole author of libgphoto2/camlibs/digigr8, which is licensed under the
LGPL. I hereby give myself permission to use my own code from libgphoto2 in
gspca/sq905c.c.
Signed-off-by: Theodore Kilgore <kilgota@auburn.edu>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Using VIDIOC_DBG_S/G_REGISTER is the standard way of reading/writing register
for advanced debugging under v4l2. In addition, using this means that the
cafe_ccic driver doesn't need to have knowledge about the used sensor: the
debug ioctl can be passed on to the sensor if it isn't for the host.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
These ops are used by the ov7670 driver, so these need to be added.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The device encodes component video up to 1080i to a MPEG-TS stream with
H.264 video and stereo AAC audio. Newer firmwares accept also AC3
(up to 5.1) audio over optical SPDIF without reencoding.
Firmware upgrade is unimplemeted but rather unimportant since
the firmware sits on a flash chip.
The I2C adapter to drive the integrated infrared receiver/sender is
currently disabled due to a conflict with cx18-based devices.
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Janne Grunau <j@jannau.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Make the g_chip_ident call work for the au0828/au8522. Discovered when testing
with the v4l2_compliance tool
Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
[mchehab@redhat.com: fix merge conflict, due to a path change for analog demod]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add support for the analog functionality in the au8522 analog/digital
demodulator
Thanks to Michael Krufky <mkrufky@linuxtv.org> and Steven Toth
<stoth@linuxtv.org> for providing sample hardware, engineering level support,
and testing.
Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
[mchehab: renamed drivers/media/video/au8522_decoder.c as drivers/media/dvb/frontends/au8522_decoder.c to avoid breaking bisect]
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Call v4l2_device_disconnect when the parent of a hotpluggable device
disconnects. This ensures that you do not have a pointer to a device that
is no longer present.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This addition to the v4l2-api add definitions for the constants and
data structures used for sliced VBI data insertion into MPEG streams triggered
by V4L2_MPEG_STREAM_VBI_FMT_IVTV. This simply declares what the ivtv and
cx18 drivers and MythTV have already been doing and provides a proper data
structure definition to user space.
Signed-off-by: Andy Walls <awalls@radix.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Camera sensors have a native bus width say support, but on some
boards not all sensor data lines are connected to the image
interface and thus support a different bus width than the sensors
native one. Some boards even have a bus driver which dynamically
switches between different bus widths with a GPIO.
This patch adds a hook which board code can use to support different
bus widths.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Currently soc-camera doesn't set up any image format without an explicit
S_FMT. According to the API this should be supported, for example,
capture-example.c from v4l2-apps by default doesn't issue an S_FMT. This
patch moves negotiating of available host-camera format translations to
probe() time, and restores the state from the last close() on the next
open(). This is needed for some drivers, which power down or reset
hardware after the last user closes the interface. This patch also has a
nice side-effect of avoiding multiple allocation anf freeing of format
translation tables.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As host and camera drivers become more complex, differences between S_FMT and
S_CROP functionality grow, this patch separates them.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Bt819 needs the parent driver to drive a GPIO pin low and high in order to
reset its fifo. Use the new notify callback for this.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add a notify callback to v4l2_device to let sub-devices notify their
parent of special events.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The removal of the timer which polls the infrared input is racy.
Replacing the timer with a delayed work solves the problem.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As announced VIDIOC_G_CHIP_IDENT_OLD is now removed for 2.6.30.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
changeset 04934e44e3784a1b969582e2d59afcec278470c6 removed the last implementation
that were still using the V4L1 obsoleted header.
Now, video_decoder.h is not used anymore by any driver.
Let's remove it and all references for it in Kernel.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Some code was calling v4l2_video_std_construct() when all it cared about
was the frame period. So make a function that just returns that and have
v4l2_video_std_construct() use it.
At this point there are no users of v4l2_video_std_construct() left outside
of v4l2-ioctl, so it could be un-exported and made static.
Change v4l2_video_std_construct() so that it doesn't zero out the struct
v4l2_standard passed in. It's already been zeroed out in the common ioctl
code.
Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Added a new chip identifer to v4l2-chip-ident for the integrated A/V broadcast
decoder core internal to the CX23418. Completed separation and encapsulation
of the A/V decoder core interface as a v4l2_subdevice. The cx18 driver now
compiles and links again.
Signed-off-by: Andy Walls <awalls@radix.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The driver infrared remote code part is altered to switch to a work queue.
Also ir_codes table moved to ir-common module for shared access.
Signed-off-by: Igor M. Liplianin <liplianin@netup.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
include/linux/video_encoder.h is not used anymore by a v4l driver.
Let's remove it and its occurences.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The v4l2_ctrl_query_fill_std() function wasn't one the best idea I ever had.
It doesn't add anything valuable that cannot be expressed equally well with
v4l2_ctrl_query_fill and only adds overhead.
Replace it with v4l2_ctrl_query_fill() everywhere it is used and remove it
from v4l2_common.c.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add small function to retrieve the i2c address from a v4l2_subdev
pointer.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add data signal polarity, mode, and bus-width tests to
soc_camera_bus_param_compatible().
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
sh_mobile_ceu_camera.c support for signal polarity flags isn't platform
dependent, provide them locally. Only the bus width is implementation
specific.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Acked-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Remain consistent in the naming: fields pointing to v4l2_device should
be called v4l2_dev. There are too many device-like entities without
adding to the confusion by mixing naming conventions.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Some drivers (e.g. for ISA devices) have no parent device because there
is no associated bus driver. Allow the parent device to be NULL in
those cases when registering v4l2_device.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Added support for I2C device at address 0x40 and subaddress 0x0d/0x0b
that provides remote control key reading support for AVerMedia Cardbus
Hybrid card, possibly for other AVerMedia Cardbus cards.
The I2C address 0x40 doesn't like the SAA7134's 0xfd quirk, so it was
disabled.
[mchehab@redhat.com: CodingStyle fixes]
Signed-off-by: Oldřich Jedlička <oldium.pro@seznam.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This is a common feature on many cameras. the options are:
Default colors,
B & W,
Sepia
Signed-off-by: Sergio Aguirre <saaguirre@ti.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Control arrays as are used with v4l2_ctrl_next must be sorted from
low to high. Add a comment at the top of all such arrays to warn
about this.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
In order to convert the v4l1 zoran and vino i2c drivers to v4l2 these
extra ops are required.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Kaiomy entry.
Thanks to Peter Senna Tschudin <peter.senna@gmail.com> for borrow me one
of those devices.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Modified mxb to load the i2c modules through v4l2_subdev. So no more probing.
Modified tea6415c and tea6420 to use the standard routing ops to do the
routing, rather than using private commands. Dropped the private commands
from tda9840 (they were never used except during initialization of the
module).
Added saa7146 support for VIDIOC_DBG_G_CHIP_IDENT.
Converted saa5246a and saa5249 to v4l2_subdev.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The MR97310A is a dual-mode webcam controller that provides compressed BGGR
Bayer frames. The decompression algorithm for still images is the same as for
video, and is currently implemented in libgphoto2.
Signed-off-by: Kyle Guinn <elyk03@gmail.com>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Fix bugs in the cx18 AC3 control implementation that would have affected
ivtv and other drivers via the cx2341x module. Bring AC3 controls
behavior into comliance with V4L2 specification. Thanks to Hans Verkuil
for reviewing the previous patch and pointing out the problems.
Reported-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Andy Walls <awalls@radix.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Initial addition of controls to set AC-3 audio encoding for the CX23418 - it
does not work yet due to firmware or cx18 driver issues. This change affects
the common cx2341x and ivtv modules due to shared structures and
common functions.
Signed-off-by: Andy Walls <awalls@radix.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The conversion to video_ioctl2 is the first phase to converting this driver
to the latest v4l2 framework.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Add v4l2_i2c_tuner_addrs() to obtain the various I2C tuner addresses.
This will be used in several drivers, so make this a common function
as we do not want to have these I2C addresses all over the place.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Lockdep gripes if file->f_lock is taken in a no-IRQ situation, since that
is not always the case. We don't really want to disable IRQs for every
acquisition of f_lock; instead, just move it outside of fasync_lock.
Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The C99 specification states in section 6.11.5:
The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature.
Acked-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
cgroup documentation was moved to Documentation/cgroups/. There are some
places that still refer to Documentation/controllers/,
Documentation/cgroups.txt and Documentation/cpusets.txt. Fix those.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This removal was scheduled and there is no problem with later
distros to adapt for the new bus, thanks to aliases.
module-init-tools map files are deprecated nowadays, so that
the patch which introduced hid ones into the m-i-t won't be
accepted and hence there is no reason for leaving compat stuff in.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
When hid quirks were converted to specialized driver, the HID_QUIRK_IGNORE
has been moved completely, as the hid_ignore_list[] has been moved into the
generic code.
However userspace already got used to the possibility that modprobing
usbhid with
'quirks=vid:pid:0x4'
makes the device ignored by usbhid driver. So keep this quirk flag in place
for backwards compatibility.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Impact: cleanup
Time to clean up remaining laggards using the old cpu_ functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Trond.Myklebust@netapp.com
Everyone defines it, and only one person uses it
(arch/mips/sgi-ip27/ip27-nmi.c). So just open code it there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-mips@linux-mips.org
Impact: fix lazy context switch API
Pass the previous and next tasks into the context switch start
end calls, so that the called functions can properly access the
task state (esp in end_context_switch, in which the next task
is not yet completely current).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Impact: simplification, prepare for later changes
Make lazy cpu mode more specific to context switching, so that
it makes sense to do more context-switch specific things in
the callbacks.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Remove the allocation of struct nfsd4_compound_state.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
<linux/irq.h> relies on <linux/gfp.h> and <linux/topology.h> having been
included previous. If not, the errors like below will result.
CC arch/mips/mti-malta/malta-int.o
In file included from arch/mips/mti-malta/malta-int.c:25:
include/linux/irq.h: In function ‘init_alloc_desc_masks’:
include/linux/irq.h:444: error: implicit declaration of function ‘cpu_to_node’
include/linux/irq.h:446: error: ‘GFP_ATOMIC’ undeclared (first use in this function)
include/linux/irq.h:446: error: (Each undeclared identifier is reported only once
include/linux/irq.h:446: error: for each function it appears in.)
make[3]: *** [arch/mips/mti-malta/malta-int.o] Error 1
make[2]: *** [arch/mips/mti-malta] Error 2
make[1]: *** [sub-make] Error 2
Fixed by including the two missing headers.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sometime we need to communicate with HDMI monitor by sending audio or video
info frame, so we have to know monitor type. However if user utilize HDMI-DVI adapter to connect DVI monitor, hardware detection will incorrectly show the monitor is HDMI. HDMI spec tell us that any device containing IEEE registration Identifier will be treated as HDMI device. The patch intends to detect HDMI monitor by this rule.
Signed-off-by: Ma Ling <ling.ma@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
shrinks drm_ioctl_desc from 24 bytes to 16 bytes by reordering members
to remove padding.
updates DRM_IOCTL_DEF macro to initialise structure members by name to
handle the structure reorder.
The applied patch reduces data used in drm.ko from 10440 to 9032
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Usually drm read basic EDID, that is enough for us, but since igital display
were introduced i.e. HDMI monitor, sometime we need to interact with monitor by
EDID extension information,
EDID extensions include audio/video data block, speaker allocation and vendor specific data blocks.
This patch intends to read EDID extensions from digital monitor for users.
Signed-off-by: Ma Ling <ling.ma@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Allows for the removal of byteswapping in some places and
the removal of HIPQUAD (replaced by %pI4).
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing struct field to fix kernel-doc warning:
Warning(include/linux/skbuff.h:182): No description found for parameter 'flags'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
smack: Add a new '-CIPSO' option to the network address label configuration
netlabel: Cleanup the Smack/NetLabel code to fix incoming TCP connections
lsm: Remove the socket_post_accept() hook
selinux: Remove the "compat_net" compatibility code
netlabel: Label incoming TCP connections correctly in SELinux
lsm: Relocate the IPv4 security_inet_conn_request() hooks
TOMOYO: Fix a typo.
smack: convert smack to standard linux lists
Annotate struct fs_struct's usage count to indicate the restrictions upon it.
It may not be incremented, except by clone(CLONE_FS), as this affects the
check in check_unsafe_exec() in fs/exec.c.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c-core: Some style cleanups
i2c-piix4: Add support for the Broadcom HT1100 chipset
i2c-piix4: Add support to SB800 SMBus changes
i2c-pca-platform: Use defaults if no platform_data given
i2c-algo-pca: Use timeout for checking the state machine
i2c-algo-pca: Rework waiting for a free bus
i2c-algo-pca: Add PCA9665 support
i2c: Adapt debug macros for KERN_* constants
i2c-davinci: Fix timeout handling
i2c: Adapter timeout is in jiffies
i2c: Set a default timeout value for all adapters
i2c: Add missing KERN_* constants to printks
i2c-algo-pcf: Handle timeout correctly
i2c-algo-pcf: Style cleanups
eeprom/at24: Remove EXPERIMENTAL
i2c-nforce2: Add support for MCP67, MCP73, MCP78S and MCP79
i2c: Clarify which clients are auto-removed
i2c: Let checkpatch shout on users of the legacy model
i2c: Document the different ways to instantiate i2c devices
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (422 commits)
[ARM] 5435/1: fix compile warning in sanity_check_meminfo()
[ARM] 5434/1: ARM: OMAP: Fix mailbox compile for 24xx
[ARM] pxa: fix the bad assumption that PCMCIA sockets always start with 0
[ARM] pxa: fix Colibri PXA300 and PXA320 LCD backlight pins
imxfb: Fix TFT mode
i.MX21/27: remove ifdef CONFIG_FB_IMX
imxfb: add clock support
mxc: add arch_reset() function
clkdev: add possibility to get a clock based on the device name
i.MX1: remove fb support from mach-imx
[ARM] pxa: build arch/arm/plat-pxa/mfp.c only when PXA3xx or ARCH_MMP defined
Gemini: Add support for Teltonika RUT100
Gemini: gpiolib based GPIO support v2
MAINTAINERS: add myself as Gemini architecture maintainer
ARM: Add Gemini architecture v3
[ARM] OMAP: Fix compile for omap2_init_common_hw()
MAINTAINERS: Add myself as Faraday ARM core variant maintainer
ARM: Add support for FA526 v2
[ARM] acorn,ebsa110,footbridge,integrator,sa1100: Convert asm/io.h to linux/io.h
[ARM] collie: fix two minor formatting nits
...
* 'percpu-cpumask-x86-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (682 commits)
percpu: fix spurious alignment WARN in legacy SMP percpu allocator
percpu: generalize embedding first chunk setup helper
percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk()
percpu: make x86 addr <-> pcpu ptr conversion macros generic
linker script: define __per_cpu_load on all SMP capable archs
x86: UV: remove uv_flush_tlb_others() WARN_ON
percpu: finer grained locking to break deadlock and allow atomic free
percpu: move fully free chunk reclamation into a work
percpu: move chunk area map extension out of area allocation
percpu: replace pcpu_realloc() with pcpu_mem_alloc() and pcpu_mem_free()
x86, percpu: setup reserved percpu area for x86_64
percpu, module: implement reserved allocation and use it for module percpu variables
percpu: add an indirection ptr for chunk page map access
x86: make embedding percpu allocator return excessive free space
percpu: use negative for auto for pcpu_setup_first_chunk() arguments
percpu: improve first chunk initial area map handling
percpu: cosmetic renames in pcpu_setup_first_chunk()
percpu: clean up percpu constants
x86: un-__init fill_pud/pmd/pte
x86: remove vestigial fix_ioremap prototypes
...
Manually merge conflicts in arch/ia64/kernel/irq_ia64.c
Add support for the Broadcom HT1100 LD chipset (SMBus function.)
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Waiting for a free bus now accepts the timeout value in jiffies and does
proper checking using time_before.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The MCP78S and MCP79 appear to be compatible with the previous nForce
chips as far as the SMBus controller is concerned. The MCP67 and MCP73
were not tested yet but I'd be very surprised if they weren't
compatible too.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Oleg Ryjkov <olegr@olegr.ca>
Cc: Malcolm Lalkaka <mlalkaka@gmail.com>
Cc: Zbigniew Luszpinski <zbiggy@o2.pl>
Since an RPC service listener's protocol family is specified now via
svc_create_xprt(), it no longer needs to be passed to svc_create() or
svc_create_pooled(). Remove that argument from the synopsis of those
functions, and remove the sv_family field from the svc_serv struct.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The sv_family field is going away. Pass a protocol family argument to
svc_create_xprt() instead of extracting the family from the passed-in
svc_serv struct.
Again, as this is a listener socket and not an address, we make this
new argument an "int" protocol family, instead of an "sa_family_t."
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The sv_family field is going away. Instead of using sv_family, have
the svc_register() function take a protocol family argument.
Since this argument represents a protocol family, and not an address
family, this argument takes an int, as this is what is passed to
sock_create_kern(). Also make sure svc_register's helpers are
checking for PF_FOO instead of AF_FOO. The value of [AP]F_FOO are
equivalent; this is simply a symbolic change to reflect the semantics
of the value stored in that variable.
sock_create_kern() should return EPFNOSUPPORT if the passed-in
protocol family isn't supported, but it uses EAFNOSUPPORT for this
case. We will stick with that tradition here, as svc_register()
is called by the RPC server in the same path as sock_create_kern().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: add documentating comment and use appropriate data types for
svc_find_xprt()'s arguments.
This also eliminates a mixed sign comparison: @port was an int, while
the return value of svc_xprt_local_port() is an unsigned short.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: Enable the use of const arguments in higher level svc_ APIs
by adding const to the arguments of the helper functions in svc_xprt.h
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch cleans up a lot of the Smack network access control code. The
largest changes are to fix the labeling of incoming TCP connections in a
manner similar to the recent SELinux changes which use the
security_inet_conn_request() hook to label the request_sock and let the label
move to the child socket via the normal network stack mechanisms. In addition
to the incoming TCP connection fixes this patch also removes the smk_labled
field from the socket_smack struct as the minor optimization advantage was
outweighed by the difficulty in maintaining it's proper state.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
The socket_post_accept() hook is not currently used by any in-tree modules
and its existence continues to cause problems by confusing people about
what can be safely accomplished using this hook. If a legitimate need for
this hook arises in the future it can always be reintroduced.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The current NetLabel/SELinux behavior for incoming TCP connections works but
only through a series of happy coincidences that rely on the limited nature of
standard CIPSO (only able to convey MLS attributes) and the write equality
imposed by the SELinux MLS constraints. The problem is that network sockets
created as the result of an incoming TCP connection were not on-the-wire
labeled based on the security attributes of the parent socket but rather based
on the wire label of the remote peer. The issue had to do with how IP options
were managed as part of the network stack and where the LSM hooks were in
relation to the code which set the IP options on these newly created child
sockets. While NetLabel/SELinux did correctly set the socket's on-the-wire
label it was promptly cleared by the network stack and reset based on the IP
options of the remote peer.
This patch, in conjunction with a prior patch that adjusted the LSM hook
locations, works to set the correct on-the-wire label format for new incoming
connections through the security_inet_conn_request() hook. Besides the
correct behavior there are many advantages to this change, the most significant
is that all of the NetLabel socket labeling code in SELinux now lives in hooks
which can return error codes to the core stack which allows us to finally get
ride of the selinux_netlbl_inode_permission() logic which greatly simplfies
the NetLabel/SELinux glue code. In the process of developing this patch I
also ran into a small handful of AF_INET6 cleanliness issues that have been
fixed which should make the code safer and easier to extend in the future.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
Conflicts:
arch/sparc/kernel/time_64.c
drivers/gpu/drm/drm_proc.c
Manual merge to resolve build warning due to phys_addr_t type change
on x86:
drivers/gpu/drm/drm_info.c
Signed-off-by: Ingo Molnar <mingo@elte.hu>
ACPI has smart batteries, which work in units of energy and measure
rate of (dis)charge as power, thus it is not appropriate to export it
as a current_now. Current_now will still be exported to allow
for userland applications to match.
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
If a commit is triggered by fsync(), set a flag indicating the journal
blocks associated with the transaction should be flushed out using
WRITE_SYNC.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Intel graphics hardware that implements the ACPI IGD OpRegion spec
requires that the list of display devices be populated before any ACPI
video methods are called. Detect when this is the case and defer
registration until the opregion code calls it. Fixes crashes on HP
laptops.
http://bugzilla.kernel.org/show_bug.cgi?id=11259
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Len Brown <len.brown@intel.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (166 commits)
Revert "ax25: zero length frame filtering in AX25"
Revert "netrom: zero length frame filtering in NetRom"
cfg80211: default CONFIG_WIRELESS_OLD_REGULATORY to n
mac80211/iwlwifi: move virtual A-MDPU queue bookkeeping to iwlwifi
mac80211: fix aggregation to not require queue stop
mac80211: add skb length sanity checking
mac80211: unify and fix TX aggregation start
mac80211: clean up __ieee80211_tx args
mac80211: rework the pending packets code
mac80211: fix A-MPDU queue assignment
mac80211: rewrite fragmentation
iwlwifi: show current driver status in user readable format
b43: Add BCM4307 PCI-ID
cfg80211: fix locking in nl80211_set_wiphy
mac80211: fix RX path
ath5k: properly drop packets from ops->tx
ar9170: single module build
ath9k: fix dma mapping leak of rx buffer upon rmmod
rt2x00: New USB ID for rt73usb
ath5k: warn and correct rate for unknown hw rate indexes
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (53 commits)
DVB: firedtv: FireDTV S2 problems with tuning solved
DVB: firedtv: fix printk format mismatch
ieee1394: constify device ID tables
ieee1394: raw1394: add sparse annotations to raw1394_compat_write
ieee1394: Storage class should be before const qualifier
ieee1394: sbp2: follow up on "ieee1394: inherit ud vendor_id from node vendor_id"
firewire: core: optimize propagation of BROADCAST_CHANNEL
firewire: core: simplify broadcast channel allocation
firewire: core: increase bus manager grace period
firewire: core: drop unused call parameters of close_transaction
firewire: cdev: add closure to async stream ioctl
firewire: cdev: simplify FW_CDEV_IOC_SEND_REQUEST return value
firewire: cdev: fix race of ioctl_send_request with bus reset
firewire: cdev: secure add_descriptor ioctl
firewire: cdev: amendment to "add ioctl to query maximum transmission speed"
firewire: broadcast channel support
firewire: implement asynchronous stream transmission
firewire: core: normalize a function argument name
firewire: normalize a variable name
firewire: core: remove condition which is always false
...
This patch removes all the virtual A-MPDU-queue bookkeeping from
mac80211. Curiously, iwlwifi already does its own bookkeeping, so
it doesn't require much changes except where it needs to handle
starting and stopping the queues in mac80211.
To handle the queue stop/wake properly, we rewrite the software
queue number for aggregation frames and internally to iwlwifi keep
track of the queues that map into the same AC queue, and only talk
to mac80211 about the AC queue. The implementation requires calling
two new functions, iwl_stop_queue and iwl_wake_queue instead of the
mac80211 counterparts.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Reinette Chattre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of stopping the entire AC queue when enabling aggregation
(which was only done for hardware with aggregation queues) buffer
the packets for each station, and release them to the pending skb
queue once aggregation is turned on successfully.
We get a little more code, but it becomes conceptually simpler and
we can remove the entire virtual queue mechanism from mac80211 in
a follow-up patch.
This changes how mac80211 behaves towards drivers that support
aggregation but have no hardware queues -- those drivers will now
not be handed packets while the aggregation session is being
established, but only after it has been fully established.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When TX aggregation becomes operational, we do a number of steps:
1) print a debug message
2) wake the virtual queue
3) notify the driver
Unfortunately, 1) and 3) are only done if the driver is first to
reply to the aggregation request, it is, however, possible that the
remote station replies before the driver! Thus, unify the code for
this and call the new function ieee80211_agg_tx_operational in both
places where TX aggregation can become operational.
Additionally, rename the driver notification from
IEEE80211_AMPDU_TX_RESUME to IEEE80211_AMPDU_TX_OPERATIONAL.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch changes mac80211 to not notify the rate control algorithm's
tx_status() method when reporting status for a packet that didn't go
through the rate control algorithm's get_rate() method.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add IEEE80211_HW_BEACON_FILTERING flag so that driver inform that it supports
beacon filtering. Drivers need to call the new function
ieee80211_beacon_loss() to notify about beacon loss.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In beacon filtering there needs to be a way to not expire the BSS even
when no beacons are received. Add an interface to cfg80211 to hold
BSS and make sure that it's not expired.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When software scanning we need to disable power save so that all possible
probe responses and beacons are received. For hardware scanning assume that
hardware will take care of that and document that assumption.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add vendor ID for Quanta Microsystems and update the led table with the reported device.
Reported-by: Scott Barnes <nekoreeve@gmail.com>
Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The functionality that NL80211_CMD_SET_MGMT_EXTRA_IE provided can now
be achieved with cleaner design by adding IE(s) into
NL80211_CMD_TRIGGER_SCAN, NL80211_CMD_AUTHENTICATE,
NL80211_CMD_ASSOCIATE, NL80211_CMD_DEAUTHENTICATE, and
NL80211_CMD_DISASSOCIATE.
Since this is a very recently added command and there are no known (or
known planned) applications using NL80211_CMD_SET_MGMT_EXTRA_IE and
taken into account how much extra complexity it adds to the IE
processing we have now (and need to add in the future to fix IE order
in couple of frames), it looks like the best option is to just remove
the implementation of this command for now. The enum values themselves
are left to avoid changing the nl80211 command or attribute numbers.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds new nl80211 commands to allow user space to request
authentication and association (and also deauthentication and
disassociation). The commands are structured to allow separate
authentication and association steps, i.e., the interface between
kernel and user space is similar to the MLME SAP interface in IEEE
802.11 standard and an user space application takes the role of the
SME.
The patch introduces MLME-AUTHENTICATE.request,
MLME-{,RE}ASSOCIATE.request, MLME-DEAUTHENTICATE.request, and
MLME-DISASSOCIATE.request primitives. The authentication and
association commands request the actual operations in two steps
(assuming the driver supports this; if not, separate authentication
step is skipped; this could end up being a separate "connect"
command).
The initial implementation for mac80211 uses the current
net/mac80211/mlme.c for actual sending and processing of management
frames and the new nl80211 commands will just stop the current state
machine from moving automatically from authentication to association.
Future cleanup may move more of the MLME operations into cfg80211.
The goal of this design is to provide more control of authentication and
association process to user space without having to move the full MLME
implementation. This should be enough to allow IEEE 802.11r FT protocol
and 802.11s SAE authentication to be implemented. Obviously, this will
also bring the extra benefit of not having to use WEXT for association
requests with mac80211. An example implementation of a user space SME
using the new nl80211 commands is available for wpa_supplicant.
This patch is enough to get IEEE 802.11r FT protocol working with
over-the-air mechanism (over-the-DS will need additional MLME
primitives for handling the FT Action frames).
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add new nl80211 event notifications (and a new multicast group, "mlme")
for informing user space about received and processed Authentication,
(Re)Association Response, Deauthentication, and Disassociation frames in
station and IBSS modes (i.e., MLME SAP interface primitives
MLME-AUTHENTICATE.confirm, MLME-ASSOCIATE.confirm,
MLME-REASSOCIATE.confirm, MLME-DEAUTHENTICATE.indicate, and
MLME-DISASSOCIATE.indication). The event data is encapsulated as the 802.11
management frame since we already have the frame in that format and it
includes all the needed information.
This is the initial step in providing MLME SAP interface for
authentication and association with nl80211. In other words, kernel code
will act as the MLME and a user space application can control it as the
SME.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
No drivers use it any more, so it can now be removed safely.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I keep needing this because I'm too stupid to remember it.
Everybody else can probably remember, but who knows :)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This makes nl80211 export the supported commands (command groups)
per wiphy so userspace has an idea what it can do -- this will be
required reading for userspace when we introduce auth/assoc /or/
connect for older hardware that cannot separate auth and assoc.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Radiotap was updated to include a "bad PLCP" flag and standardise
the "bad FCS" flag in the "flags" rather than "RX flags" field,
this patch updates Linux to that standard.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
No hw/driver actually supports more than four queues right now,
and we allocate a number of things per queue which means we
waste a bit of memory. Reduce the maximum number to four to
accurately reflect what we do (and need for QoS). Even if we
had hardware supporting more queues we couldn't take advantage
of that right now anyway.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This inline is useless and actually makes the code _longer_
rather than shorter.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: (25 commits)
drm/i915: Fix LVDS dither setting
drm/i915: Check for dev->primary->master before dereference.
drm/i915: TV detection fix
drm/i915: TV mode_set sync up with 2D driver
drm/i915: Fix TV get_modes to return modes count
drm/i915: Sync crt hotplug detection with intel video driver
drm/i915: Sync mode_valid/mode_set with intel video driver
drm/i915: TV modes' parameters sync up with 2D driver
agp/intel: Add support for new intel chipset.
i915/drm: Remove two redundant agp_chipset_flushes
drm/i915: Display fence register state in debugfs i915_gem_fence_regs node.
drm/i915: Add information on pinning and fencing to the i915 list debug.
drm/i915: Consolidate gem object list dumping
drm/i915: Convert i915 proc files to seq_file and move to debugfs.
drm: Convert proc files to seq_file and introduce debugfs
drm/i915: Fix lock order reversal in GEM relocation entry copying.
drm/i915: Fix lock order reversal with cliprects and cmdbuf in non-DRI2 paths.
drm/i915: Fix lock order reversal in shmem pread path.
drm/i915: Fix lock order reversal in shmem pwrite path.
drm/i915: Make GEM object's page lists refcounted instead of get/free.
...
Fix a build warning about leaking CONFIG_NFSD to userspace.
The nfsd_stats data structure does not need to be available to
userspace; no kernel interface uses it. So move it inside #ifdef
__KERNEL__ and the warning goes away.
Signed-off-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The inline function skb_gro_mac_header defined in include/linux/netdevice.h
makes use of page_address(). Depending on configuration options, the latter
is either defined as a macro or is declared as a function in another header
file, namely include/linux/mm.h. However, include/linux/netdevice.h does not
include include/linux/mm.h.
On MIPS, this has produced the following build error:
CC kernel/sysctl_check.o
In file included from include/linux/icmpv6.h:173,
from include/linux/ipv6.h:208,
from include/net/ip_vs.h:26,
from kernel/sysctl_check.c:6:
include/linux/netdevice.h: In function 'skb_gro_mac_header':
include/linux/netdevice.h:1132: error: implicit declaration of function
'page_address'
include/linux/netdevice.h:1133: warning: pointer/integer type mismatch
in conditional expression
make[1]: *** [kernel/sysctl_check.o] Error 1
make: *** [kernel] Error 2
The patch adds the missing include and fixes the build error.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The old mechanism to formatting proc files is extremely ugly. The
seq_file API was designed specifically for cases like this and greatly
simplifies the process.
Also, most of the files in /proc really don't belong there. This patch
introduces the infrastructure for putting these into debugfs and exposes
all of the proc files in debugfs as well.
Signed-off-by: Ben Gamari <bgamari@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
In acpi_bus_ops, only the acpi_op_add and acpi_op_start flags are used,
so remove all the rest.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>