linux

Author	SHA1	Message	Date
Sage Weil	be4f104dfd	ceph: select CRYPTO We select CRYPTO_AES, but not CRYPTO. Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-17 12:30:31 -07:00
Sage Weil	a43fb73101	ceph: check mapping to determine if FILE_CACHE cap is used See if the i_data mapping has any pages to determine if the FILE_CACHE capability is currently in use, instead of assuming it is any time the rdcache_gen value is set (i.e., issued -> used). This allows the MDS RECALL_STATE process to work for inodes that have cached pages. Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-17 09:54:31 -07:00
Sage Weil	e835124c2b	ceph: only send one flushsnap per cap_snap per mds session Sending multiple flushsnap messages is problematic because we ignore the response if the tid doesn't match, and the server may only respond to each one once. It's also a waste. So, skip cap_snaps that are already on the flushing list, unless the caller tells us to resend (because we are reconnecting). Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-17 08:03:08 -07:00
Sage Weil	ae00d4f37f	ceph: fix cap_snap and realm split The cap_snap creation/queueing relies on both the current i_head_snapc _and_ the i_snap_realm pointers being correct, so that the new cap_snap can properly reference the old context and the new i_head_snapc can be updated to reference the new snaprealm's context. To fix this, we: - move inodes completely to the new (split) realm so that i_snap_realm is correct, and - generate the new snapc's _before_ queueing the cap_snaps in ceph_update_snap_trace(). Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-16 16:26:51 -07:00
Sage Weil	cfc0bf6640	ceph: stop sending FLUSHSNAPs when we hit a dirty capsnap Stop sending FLUSHSNAP messages when we hit a capsnap that has dirty_pages or is still writing. We'll send the newer capsnaps only after the older ones complete. Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-14 15:50:59 -07:00
Sage Weil	8bef9239ee	ceph: correctly set 'follows' in flushsnap messages The 'follows' should match the seq for the snap context for the given snap cap, which is the context under which we have been dirtying and writing data and metadata. The snapshot that _contains_ those updates thus _follows_ that context's seq #. Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-14 15:45:44 -07:00
Sage Weil	467c525109	ceph: fix dn offset during readdir_prepopulate When adding the readdir results to the cache, ceph_set_dentry_offset was clobbering our just-set offset. This can cause the readdir result offsets to get out of sync with the server. Add an argument to the helper so that it does not. This bug was introduced by `1cd3935bed`. Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-13 11:40:36 -07:00
Sage Weil	a77d9f7dce	ceph: fix file offset wrapping at 4GB on 32-bit archs Cast the value before shifting so that we don't run out of bits with a 32-bit unsigned long. This fixes wrapping of high file offsets into the low 4GB of a file on disk, and the subsequent data corruption for large files. Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-11 10:55:25 -07:00
Sage Weil	3612abbd5d	ceph: fix reconnect encoding for old servers Fix the reconnect encoding to encode the cap record when the MDS does not have the FLOCK capability (i.e., pre v0.22). Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-11 10:52:47 -07:00
Yehuda Sadeh	3d4401d9d0	ceph: fix pagelist kunmap tail A wrong parameter was passed to the kunmap. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-11 10:52:47 -07:00
Sage Weil	ca04d9c3ec	ceph: fix null pointer deref on anon root dentry release When we release a root dentry, particularly after a splice, the parent (actually our) inode was evaluating to NULL and was getting dereferenced by ceph_snap(). This is reproduced by something as simple as mount -t ceph monhost:/a/b mnt mount -t ceph monhost:/a mnt2 ls mnt2 A splice_dentry() would kill the old 'b' inode's root dentry, and we'd crash while releasing it. Fix by checking for both the ROOT and NULL cases explicitly. We only need to invalidate the parent dir when we have a correct parent to invalidate. Signed-off-by: Sage Weil <sage@newdream.net>	2010-09-11 10:52:47 -07:00
Dan Carpenter	b545787dbb	ceph: fix get_ticket_handler() error handling get_ticket_handler() returns a valid pointer or it returns ERR_PTR(-ENOMEM) if kzalloc() fails. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-26 09:26:50 -07:00
Sage Weil	e072f8aa35	ceph: don't BUG on ENOMEM during mds reconnect We are in a position to return an error; do that instead. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-26 09:26:37 -07:00
Dan Carpenter	f44c3890d9	ceph: ceph_mdsc_build_path() returns an ERR_PTR ceph_mdsc_build_path() returns an ERR_PTR but this code is set up to handle NULL returns. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-26 09:24:28 -07:00
Alan Cox	ad8453ab0a	ceph: Fix warnings Just scrubbing some warnings so I can see real problem ones in the build noise. For 32bit we need to coax gcc politely into believing we really honestly intend to do the casts. Using (u64)(unsigned long) means we cast from a pointer to a type of the right size and then extend it. This stops the warning spew. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-25 12:02:14 -07:00
Dan Carpenter	ac1f12ef56	ceph: ceph_get_inode() returns an ERR_PTR ceph_get_inode() returns an ERR_PTR and it doesn't return a NULL. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-25 12:01:54 -07:00
Sage Weil	36e21687e6	ceph: initialize fields on new dentry_infos Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-24 16:24:19 -07:00
Sage Weil	7d8cb26d7d	ceph: maintain i_head_snapc when any caps are dirty, not just for data We used to use i_head_snapc to keep track of which snapc the current epoch of dirty data was dirtied under. It is used by queue_cap_snap to set up the cap_snap. However, since we queue cap snaps for any dirty caps, not just for dirty file data, we need to keep a valid i_head_snapc anytime we have dirty|flushing caps. This fixes a NULL pointer deref in queue_cap_snap when writing back dirty caps without data (e.g., snaptest-authwb.sh). Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-24 16:24:18 -07:00
Henry C Chang	07a27e226d	ceph: fix osd request lru adjustment when sending request Fix argument order. We want to move the item to the end of the list, not change the position of the head. Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 21:34:27 -07:00
Sage Weil	124514918b	ceph: don't improperly set dir complete when holding EXCL cap If we hold the EXCL cap, we cannot trust the dir stats from the MDS (num files, subdirs) and must not incorrectly conclude that the directory is empty. If we do, we can get bad results from lookup (bad ENOENT) and bad readdir results. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 21:33:32 -07:00
Michael Rubin	679ceace84	mm: exporting account_page_dirty This allows code outside of the mm core to safely manipulate page state and not worry about the other accounting. Not using these routines means that some code will lose track of the accounting and we get bugs. This has happened once already. Signed-off-by: Michael Rubin <mrubin@google.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 15:16:51 -07:00
Sage Weil	eb6bb1c5bd	ceph: direct requests in snapped namespace based on nonsnap parent When making a request in the virtual snapdir or a snapped portion of the namespace, we should choose the MDS based on the first nonsnap parent (and its caps). If that is not the best place, we will get forward hints to find the right MDS in the cluster. This fixes ESTALE errors when using the .snap directory and namespace with multiple MDSs. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 15:16:48 -07:00
Sage Weil	ed32604448	ceph: queue cap snap writeback for realm children on snap update When a realm is updated, we need to queue writeback on inodes in that realm _and_ its children. Otherwise, if the inode gets cowed on the server, we can get a hang later due to out-of-sync cap/snap state. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 15:16:47 -07:00
Sage Weil	4a625be472	ceph: include dirty xattrs state in snapped caps When we snapshot dirty metadata that needs to be written back to the MDS, include dirty xattr metadata. Make the capsnap reference the encoded xattr blob so that it will be written back in the FLUSHSNAP op. Also fix the capsnap creation guard to include dirty auth or file bits, not just tests specific to dirty file data or file writes in progress (this fixes auth metadata writeback). Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 15:16:46 -07:00
Sage Weil	082afec92d	ceph: fix xattr cap writeback We should include the xattr metadata blob in the cap update message any time we are flushing dirty state, NOT just when we are also dropping the cap. This fixes async xattr writeback. Also, clean up the code slightly to avoid duplicating the bit test. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 15:16:41 -07:00
Sage Weil	f3c60c5918	ceph: fix multiple mds session shutdown The use of a completion when waiting for session shutdown during umount is inappropriate, given the complexity of the condition. For multiple MDS's, this resulted in the umount thread spinning, often preventing the session close message from being processed. Switch to a waitqueue and define a condition helper. This cleans things up nicely. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-22 15:04:43 -07:00
Yehuda Sadeh	e56fa10e92	ceph: generalize mon requests, add pool op support Generalize the current statfs synchronous requests, and support pool_ops. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-10 14:41:25 -07:00
Sage Weil	0eb6cd49f6	ceph: only queue async writeback on cap revocation if there is dirty data Normally, if the Fb cap bit is being revoked, we queue an async writeback. If there is no dirty data but we still hold the cap, this leaves the client sitting around doing nothing until the cap timeouts expire and the cap is released on its own (as it would have been without the revocation). Instead, only queue writeback if the bit is actually used (i.e., we have dirty data). If not, we can reply to the revocation immediately. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-05 13:53:40 -07:00
Sage Weil	e9d1774431	ceph: do not ignore osd_idle_ttl mount option Actually apply the mount option to the mount_args struct. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-03 12:56:57 -07:00
Sage Weil	52dfb8ac0e	ceph: constify dentry_operations This makes checkpatch happy. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-03 10:25:30 -07:00
Sage Weil	213c99ee0c	ceph: whitespace cleanup Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-03 10:25:11 -07:00
Greg Farnum	40819f6fb2	ceph: add flock/fcntl lock support Implement flock inode operation to support advisory file locking. All lock/unlock operations are synchronous with the MDS. Lock state is sent when reconnecting to a recovering MDS to restore the shared lock state. Signed-off-by: Greg Farnum <gregf@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-02 16:10:53 -07:00
Greg Farnum	fbaad9797a	ceph: define on-wire types, constants for file locking support Define the MDS operations and data types for doing file advisory locking with the MDS. Signed-off-by: Greg Farnum <gregf@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-02 15:48:54 -07:00
Greg Farnum	c6f3fdc592	ceph: add CEPH_FEATURE_FLOCK to the supported feature bits This informs the server that we will accept v2 client_caps format and v2 client_reconnect format messages. Signed-off-by: Greg Farnum <gregf@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-02 15:48:51 -07:00
Sage Weil	20cb34ae9e	ceph: support v2 reconnect encoding Encode either old or v2 encoding of client_reconnect message, depending on whether the peer has the FLOCK feature bit. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-02 15:48:50 -07:00
Sage Weil	ce1fbc8dd6	ceph: support v2 client_caps encoding Add support for v2 encoding of MClientCaps, which includes a flock blob. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-02 15:48:49 -07:00
Sage Weil	cbbfe49905	ceph: move AES iv definition to shared header Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-02 15:48:31 -07:00
Sage Weil	73a7e693f9	ceph: fix decoding of pool snap info The pool info contains a vector for snap_info_t, not snap ids. This fixes the broken decoding, which would declare the update corrupt when a pool snapshot was created. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-02 11:10:07 -07:00
Sage Weil	2d9c98ae97	ceph: make ->sync_fs not wait if wait==0 The ->sync_fs() super op only needs to wait if wait is true. Otherwise, just get some dirty cap writeback started. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:42 -07:00
Sage Weil	b8cd07e78e	ceph: warn on missing snap realm Well, this Shouldn't Happen, so it would be helpful to know the caller when it does. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:42 -07:00
Sage Weil	effcb9ed43	ceph: print useful error message when crush rule not found Include the crush_ruleset in the error message. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:42 -07:00
Sage Weil	a8b763a9b3	ceph: use %pU to print uuid (fsid) Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:42 -07:00
Sage Weil	f0b18d9f22	ceph: sync header defs with server code Define ROLLBACK op, IFLOCK inode lock (for advisory file locking). Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:42 -07:00
Sage Weil	5cd068c200	ceph: clean up header guards Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:42 -07:00
Sage Weil	9688f19a18	ceph: strip misleading/obsolete version, feature info Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:41 -07:00
Sage Weil	6a2593823a	ceph: specify supported features in super.h Specify the supported/required feature bits in super.h client code instead of using the definitions from the shared kernel/userspace headers (which will go away shortly). Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:41 -07:00
Sage Weil	c309f0ab26	ceph: clean up fsid mount option Specify the fsid mount option in hex, not via the major/minor u64 hackery we had before. Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:41 -07:00
Sage Weil	e0f9f9ee8f	ceph: remove unused 'monport' mount option Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:41 -07:00
Greg Farnum	e55b71f802	ceph: handle ESTALE properly; on receipt send to authority if it wasn't Signed-off-by: Greg Farnum <gregf@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:41 -07:00
Greg Farnum	2bc50259fa	ceph: add ceph_get_cap_for_mds function. Signed-off-by: Greg Farnum <gregf@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-08-01 20:11:41 -07:00
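
The 4GB wrap fixed in a77d9f7dce comes from shifting a 32-bit value before widening it. The following is a minimal userspace sketch of that pattern, not the kernel code itself: the function names, the PAGE_SHIFT_SKETCH constant (assuming 4 KiB pages), and the use of uint32_t to stand in for a 32-bit unsigned long are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel page shift; assumes 4 KiB pages. */
#define PAGE_SHIFT_SKETCH 12

/*
 * Buggy pattern: the shift is evaluated in 32 bits, so any bits that
 * move past bit 31 are lost before the result is widened to 64 bits.
 * uint32_t models unsigned long on a 32-bit architecture.
 */
static uint64_t offset_shift_then_widen(uint32_t page_index)
{
    uint32_t narrow = (uint32_t)(page_index << PAGE_SHIFT_SKETCH);
    return (uint64_t)narrow;
}

/*
 * Fixed pattern: cast to the 64-bit type first, then shift, so the
 * high bits of the byte offset are preserved.
 */
static uint64_t offset_widen_then_shift(uint32_t page_index)
{
    return (uint64_t)page_index << PAGE_SHIFT_SKETCH;
}
```

With a page index of 0x100000 (the page that starts at byte offset 4 GiB), the first helper wraps back to offset 0 while the second yields 0x100000000, which is the kind of aliasing into the low 4GB that the commit message describes.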

434 Commits
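
Three of the commits above (b545787dbb, f44c3890d9, ac1f12ef56) fix callers that tested for NULL when the callee reports failure through an ERR_PTR. The sketch below imitates that convention in plain userspace C to show why a NULL check misses the error. Every name ending in _sketch, the get_thing_sketch() helper, and the two caller functions are hypothetical stand-ins, not the kernel's err.h macros or the ceph functions named in the log.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on errno values encoded into a pointer. */
#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value as a pointer near the top of the address space. */
static void *err_ptr_sketch(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an encoded pointer. */
static long ptr_err_sketch(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True when the pointer falls in the range reserved for encoded errors. */
static int is_err_sketch(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

static int dummy_object;

/* Hypothetical callee: returns a valid pointer or an encoded -ENOMEM, never NULL. */
static void *get_thing_sketch(int simulate_failure)
{
    return simulate_failure ? err_ptr_sketch(-ENOMEM) : (void *)&dummy_object;
}

/* Mismatched caller: the NULL test never fires, so the failure goes unnoticed. */
static long caller_with_null_check(int simulate_failure)
{
    void *p = get_thing_sketch(simulate_failure);
    if (!p)
        return -ENOMEM;
    return 0; /* real code would go on to dereference p here */
}

/* Matching caller: tests for an encoded error and propagates it. */
static long caller_with_is_err_check(int simulate_failure)
{
    void *p = get_thing_sketch(simulate_failure);
    if (is_err_sketch(p))
        return ptr_err_sketch(p);
    return 0;
}
```

In the failure case the first caller reports success and would continue with an invalid pointer, while the second returns -ENOMEM to its own caller.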