linux/drivers/firewire
Stefan Richter 792a61021c firewire: fix race of bus reset with request transmission
Reported by Jay Fenlason:  A bus reset tasklet may call
fw_flush_transactions and touch transactions (call their callback which
will free them) while the context which submitted the transaction is
still inserting it into the transmission queue.

A simple solution to this problem is to _not_ "flush" the transactions
(i.e. complete them as 'cancelled') because of a bus reset.  They will
now simply time out (completed as 'cancelled' by the split-timeout
timer).
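
The behaviour described above can be sketched in plain userspace C.  This
is only an illustration under stated assumptions: struct txn, struct card,
card_submit, card_bus_reset and card_flush_timed_out are hypothetical
stand-ins, not the fw-core definitions, and the locking that the real code
needs is left out.  The point is that the bus reset path no longer touches
the pending list; only the timer path cancels transactions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's definitions. */
enum txn_status { TXN_PENDING, TXN_CANCELLED };

struct txn {
	unsigned long deadline;   /* tick at which the split timeout expires */
	enum txn_status status;
	struct txn *next;
};

struct card {
	struct txn *pending;      /* in-flight transactions */
	unsigned int generation;  /* bumped on every bus reset */
};

/* Submitter: stamp the deadline and link the transaction into the list. */
static void card_submit(struct card *c, struct txn *t,
			unsigned long now, unsigned long split_timeout)
{
	t->deadline = now + split_timeout;
	t->status = TXN_PENDING;
	t->next = c->pending;
	c->pending = t;
}

/* Bus reset: deliberately leaves the pending list alone. */
static void card_bus_reset(struct card *c)
{
	c->generation++;
}

/*
 * Timer: unlink every transaction whose deadline has passed and mark it
 * cancelled.  Returns the number of transactions cancelled.
 */
static int card_flush_timed_out(struct card *c, unsigned long now)
{
	struct txn **link = &c->pending;
	int cancelled = 0;

	while (*link) {
		struct txn *t = *link;

		if (now >= t->deadline) {
			*link = t->next;
			t->next = NULL;
			t->status = TXN_CANCELLED;
			cancelled++;
		} else {
			link = &t->next;
		}
	}
	return cancelled;
}
```

In this sketch a transaction submitted just before a bus reset stays
pending across the reset and is cancelled only once its own deadline has
passed, so the reset path never races with a submitter that is still
queueing the packet.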

Jay Fenlason thought of this fix too but I was quicker to type it out.
:-)

Background:
Contexts which access an instance of struct fw_transaction are:
 1. the submitter, until it has inserted the packet which is embedded in
    the transaction into the AT req DMA,
 2. the AsReqTrContext tasklet when the request packet was acked by the
    responder node or transmission to the responder failed,
 3. the AsRspRcvContext tasklet when it found a request which matched
    an incoming response,
 4. the card->flush_timer when it picks up timed-out transactions to
    cancel them,
 5. the bus reset tasklet when it cancels transactions (this access is
    eliminated by this patch),
 6. a process which shuts down an fw_card (unregisters it from fw-core
    when the controller is unbound from fw-ohci) --- although in this
    case there shouldn't really be any transactions anymore because we
    wait until all card users have finished their business with the card.

All of these contexts run concurrently (except for the 6th, presumably).
The 1st is safe against the 2nd and 3rd because of the careful way in
which a request packet is submitted to the hardware.  A race between 2nd
and 3rd was fixed a while ago (bug 9617).  The 4th is almost safe
against 1st, 2nd, 3rd;  there are issues with it if huge scheduling
latencies occur, to be fixed separately.  The 5th looks safe against
2nd, 3rd, and 4th but is unsafe against 1st.  Maybe this could be fixed
with an explicit state variable in struct fw_transaction.  But this
would require fw_transaction to be rewritten as a dynamically allocated,
reference-counted object --- not a good solution if we can also simply
kill this 5th accessing context (replace it by the 4th).
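
For illustration only, the rejected alternative might look roughly like
the following userspace sketch using C11 atomics.  Everything here is an
assumption made for the example: struct rc_txn, the three states and all
rc_txn_* helpers are hypothetical and do not exist in fw-core.  It shows
why an explicit state variable drags in heap allocation and reference
counting: a canceller must hold its own reference, and must be refused
while the submitter is still queueing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical states; not part of struct fw_transaction. */
enum { TXN_QUEUEING, TXN_IN_FLIGHT, TXN_DONE };

struct rc_txn {
	atomic_int state;
	atomic_int refs;
};

/* Allocate a transaction owned by the submitter (one reference). */
static struct rc_txn *rc_txn_new(void)
{
	struct rc_txn *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	atomic_init(&t->state, TXN_QUEUEING);
	atomic_init(&t->refs, 1);
	return t;
}

/* Any other context takes its own reference before touching t. */
static void rc_txn_get(struct rc_txn *t)
{
	atomic_fetch_add(&t->refs, 1);
}

/* Drop a reference; returns true if this call freed the object. */
static bool rc_txn_put(struct rc_txn *t)
{
	if (atomic_fetch_sub(&t->refs, 1) == 1) {
		free(t);
		return true;
	}
	return false;
}

/* Submitter, after the packet is queued: QUEUEING -> IN_FLIGHT. */
static bool rc_txn_mark_in_flight(struct rc_txn *t)
{
	int expected = TXN_QUEUEING;

	return atomic_compare_exchange_strong(&t->state, &expected,
					      TXN_IN_FLIGHT);
}

/* Canceller: succeeds only for IN_FLIGHT, so it cannot race queueing. */
static bool rc_txn_try_cancel(struct rc_txn *t)
{
	int expected = TXN_IN_FLIGHT;

	return atomic_compare_exchange_strong(&t->state, &expected,
					      TXN_DONE);
}
```

Note that a canceller which is refused in the queueing state would still
have to come back later, which is what the timeout path already does.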

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-07-14 13:06:04 +02:00
fw-card.c firewire: clean up fw_card reference counting 2008-07-14 13:06:03 +02:00
fw-cdev.c firewire: fill_bus_reset_event needs lock protection 2008-06-19 00:12:35 +02:00
fw-device.c firewire: clean up fw_card reference counting 2008-07-14 13:06:03 +02:00
fw-device.h firewire: remove unused struct members 2008-07-14 13:06:03 +02:00
fw-iso.c firewire: cleanups 2008-04-18 17:55:37 +02:00
fw-ohci.c firewire: remove unused struct members 2008-07-14 13:06:03 +02:00
fw-ohci.h firewire: fw-ohci: log regAccessFail events 2008-04-18 17:55:34 +02:00
fw-sbp2.c firewire: fw-sbp2: fix parsing of logical unit directories 2008-06-27 20:55:00 +02:00
fw-topology.c firewire: fix race of bus reset with request transmission 2008-07-14 13:06:04 +02:00
fw-topology.h firewire: reread config ROM when device reset the bus 2008-04-18 17:55:36 +02:00
fw-transaction.c firewire: don't respond to broadcast write requests 2008-07-14 13:06:03 +02:00
fw-transaction.h firewire: clean up fw_card reference counting 2008-07-14 13:06:03 +02:00
Kconfig firewire: Kconfig menu touch-up 2008-06-19 00:12:35 +02:00
Makefile firewire: prefix modules with firewire- instead of fw- 2007-05-27 23:21:01 +02:00