SSD on nVidia MCP55 SATA port generates errors 'EH in SWNCQ mode' and 'failed command: READ FPDMA QUEUED'
Symptoms
- Heavy disk I/O triggers ATA error messages in the kernel log (see the error log in the appendices)
- The affected disk is removed from a Linux software RAID 1 array
Environment
- Supermicro H8DM8E-2 motherboard
- nVidia MCP55 chipset
- Two Corsair Force 3 SATA drives in a Linux software RAID 1 configuration
How to reproduce
Generate heavy read I/O by starting a check of the RAID array:
# echo check > /sys/block/md1/md/sync_action
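While the check runs, its state can be read back from the same sysfs directory. The following is a minimal sketch, assuming the array is md1 as above. The function name md_check_status and its optional second argument (an alternative sysfs root, useful only for trying the function against a copy) are illustrative and not part of any tool.

```shell
# Print the current sync action and mismatch count for an md array.
# $1: array name (e.g. md1)
# $2: sysfs block root (assumption: defaults to /sys/block)
md_check_status() {
    local md="$1" root="${2:-/sys/block}"
    local dir="$root/$md/md"
    printf '%s: sync_action=%s mismatch_cnt=%s\n' "$md" \
        "$(cat "$dir/sync_action")" "$(cat "$dir/mismatch_cnt")"
}

# Example:
#   md_check_status md1
```

Overall progress of the check is also shown in /proc/mdstat.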
Possible fixes
The following possible fixes have been suggested in various forums:
- Kernel options:
  - sata_nv.swncq=1
  - libata.force=noncq
  - noapic
  - acpi=off
  - libata.force=1.5G
- hdparm -W0 /dev/sd?
- disable irqbalance
Workaround
The following option was appended to the Linux kernel command line, and the errors went away. Any performance cost is not a concern here, because these disks are only used for booting.
sata_nv.swncq=0
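After a reboot, it is worth confirming that the option actually reached the running kernel. The following is a minimal sketch. The helper name has_kernel_param is illustrative, and it accepts the command line text as an optional second argument so it can be tried without rebooting.

```shell
# Succeed if a kernel command line contains the exact parameter.
# $1: parameter to look for (e.g. sata_nv.swncq=0)
# $2: command line text (assumption: defaults to /proc/cmdline)
has_kernel_param() {
    case " ${2:-$(cat /proc/cmdline)} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   has_kernel_param sata_nv.swncq=0 && echo "SWNCQ disabled on the command line"
```

The match is on whole space-separated words, so sata_nv.swncq=0 does not match a longer parameter that merely starts with the same text.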
Links
- Kernel/Hard drive bad sector/freezing issues http://ubuntuforums.org/showthread.php?t=1651369
- http://www.j-schmitz.net/blog/today-i-fixed-my-hdd-breakdowns
- Nvidia MCP55 and WRITE FPDMA QUEUED failed commands http://lkml.indiana.edu/hypermail/linux/kernel/1106.2/03179.html
- http://forums.debian.net/viewtopic.php?f=7&t=71106
- http://ubuntuforums.org/showthread.php?t=2167494
- http://lxr.free-electrons.com/source/drivers/ata/sata_nv.c
- https://ata.wiki.kernel.org/index.php/Hardware,_driver_status#NVIDIA
Appendices
Error log
ata2: EH in SWNCQ mode,QC:qc_active 0x7FFFFFFF sactive 0x7FFFFFFF
ata2: SWNCQ:qc_active 0x1008010 defer_bits 0x7EFF7FEF last_issue_tag 0x18
dhfis 0x1008010 dmafis 0x8010 sdbfis 0x0
ata2: ATA_REG 0x41 ERR_REG 0x84
ata2: tag : dhfis dmafis sdbfis sactive
ata2: tag 0x4: 1 1 0 1
ata2: tag 0xf: 1 1 0 1
ata2: tag 0x18: 1 0 0 1
ata2.00: exception Emask 0x1 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
ata2.00: Ata error. fis:0x21
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:00:80:c9:51/00:00:01:00:00/40 tag 0 ncq 65536 in
res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:08:80:c7:51/00:00:01:00:00/40 tag 1 ncq 65536 in
res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:10:00:ca:51/00:00:01:00:00/40 tag 2 ncq 65536 in
res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:18:00:d1:51/00:00:01:00:00/40 tag 3 ncq 65536 in
res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:20:80:c4:51/00:00:01:00:00/40 tag 4 ncq 65536 in
res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:28:80:cd:51/00:00:01:00:00/40 tag 5 ncq 65536 in
res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
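A log like the one above can be summarised by counting the failed commands per device. The following is a minimal sketch, assuming kernel messages in the format shown here. The function name count_failed_commands is illustrative.

```shell
# Count "failed command" lines per ATA device in kernel log text on stdin.
# Prints one line per device and command: "<device> <command> <count>".
count_failed_commands() {
    awk '/failed command:/ && match($0, /ata[0-9]+\.[0-9]+/) {
             # Device name such as ata2.00, wherever it sits in the line.
             dev = substr($0, RSTART, RLENGTH)
             # Everything after "failed command: " is the command name.
             cmd = $0
             sub(/^.*failed command: /, "", cmd)
             count[dev " " cmd]++
         }
         END { for (k in count) print k, count[k] }'
}

# Example:
#   dmesg | count_failed_commands
```

If several devices or commands appear, the order of the output lines is not defined, so pipe the result through sort if a stable order matters.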
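Script from Ubuntu forums
#!/bin/bash
# Apply the sed expression in $1 to /etc/default/grub via a temporary copy.
do_grub_var() {
    sed "$1" /etc/default/grub > /tmp/grub-tmp
    sudo cp /tmp/grub-tmp /etc/default/grub
}
# Append $2 to the quoted value of the variable named $1.
append_grub_var() {
    do_grub_var 's|\('"$1"'="[^"]*\)"|\1 '"$2"'"|'
}
# Replace the quoted value of the variable named $1 with $2.
change_grub_var() {
    do_grub_var 's|\('"$1"'="\)[^"]*"|\1'"$2"'"|'
}
# Comment out the variable named $1.
disable_grub_var() {
    do_grub_var 's|'"$1"'=.*|#&|'
}
# Uncomment the variable named $1.
enable_grub_var() {
    do_grub_var 's|#*\('"$1"'=.*\)|\1|'
}
# Insert the command $1 before the last line of /etc/rc.local (normally "exit 0").
add_to_startup() {
    local linecount
    linecount=$(wc -l < /etc/rc.local)
    sudo sed -i "$(( linecount - 1 ))"'s_.*_&\n'"$1"'_' /etc/rc.local
}
append_grub_var GRUB_CMDLINE_LINUX_DEFAULT "libata.force=noncq noapic acpi=off"
add_to_startup "hdparm -W0 /dev/sd?"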
MCP55 PCI info
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) (prog-if 85 [Master SecO PriO])
Subsystem: Super Micro Computer Inc Device 1611
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0 (750ns min, 250ns max)
Interrupt: pin A routed to IRQ 21
Region 0: I/O ports at a480 [size=8]
Region 1: I/O ports at a400 [size=4]
Region 2: I/O ports at a080 [size=8]
Region 3: I/O ports at a000 [size=4]
Region 4: I/O ports at 9c00 [size=16]
Region 5: Memory at fd9bd000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] MSI: Enable- Count=1/4 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
Kernel driver in use: sata_nv
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) (prog-if 85 [Master SecO PriO])
Subsystem: Super Micro Computer Inc Device 1611
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0 (750ns min, 250ns max)
Interrupt: pin B routed to IRQ 23
Region 0: I/O ports at 9880 [size=8]
Region 1: I/O ports at 9800 [size=4]
Region 2: I/O ports at 9480 [size=8]
Region 3: I/O ports at 9400 [size=4]
Region 4: I/O ports at 9080 [size=16]
Region 5: Memory at fd9bc000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] MSI: Enable- Count=1/4 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
Kernel driver in use: sata_nv
00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) (prog-if 85 [Master SecO PriO])
Subsystem: Super Micro Computer Inc Device 1611
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0 (750ns min, 250ns max)
Interrupt: pin C routed to IRQ 22
Region 0: I/O ports at 9000 [size=8]
Region 1: I/O ports at 8c00 [size=4]
Region 2: I/O ports at 8880 [size=8]
Region 3: I/O ports at 8800 [size=4]
Region 4: I/O ports at 8480 [size=16]
Region 5: Memory at fd9bb000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] MSI: Enable- Count=1/4 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
Kernel driver in use: sata_nv
sata_nv module parameters
# modinfo -p sata_nv
adma:Enable use of ADMA (Default: false) (bool)
swncq:Enable use of SWNCQ (Default: true) (bool)
msi:Enable use of MSI (Default: false) (bool)
Show sata_nv settings
# grep . /sys/module/sata_nv/parameters/*
/sys/module/sata_nv/parameters/adma:N
/sys/module/sata_nv/parameters/msi:N
/sys/module/sata_nv/parameters/swncq:Y
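The same sysfs files can be used to check a single parameter, for example to confirm that swncq reads N once the workaround is active. The following is a minimal sketch. The helper name sata_nv_param and its optional directory argument are illustrative.

```shell
# Print the current value of a sata_nv module parameter (Y or N for bools).
# $1: parameter name (adma, msi or swncq)
# $2: parameters directory (assumption: defaults to the live sysfs path)
sata_nv_param() {
    cat "${2:-/sys/module/sata_nv/parameters}/$1"
}

# Example: warn if SWNCQ is still enabled.
#   [ "$(sata_nv_param swncq)" = "N" ] || echo "SWNCQ is still enabled"
```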

