SSD on nVidia MCP55 SATA port generates error 'EH in SWNCQ mode' and 'failed command: READ FPDMA QUEUED'
Symptoms
- Disk I/O causes 'EH in SWNCQ mode' and 'failed command: READ FPDMA QUEUED' messages in the kernel log
- The disk is removed from a Linux software RAID 1 array
Environment
- Supermicro H8DM8E-2 motherboard
- nVidia MCP55 chipset
- Two Corsair Force 3 SATA drives in a Linux software RAID 1 configuration
How to reproduce
Generate heavy disk read I/O by running a check of the RAID array:
# echo check > /sys/block/md1/md/sync_action
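The check runs in the background, so it helps to wait for it to finish before looking at the log. The following is a minimal sketch, not a tested tool: the function names are made up for illustration, and it assumes the standard md sysfs files sync_action and mismatch_cnt under the array's md directory.

```shell
#!/bin/sh
# Sketch: start an md check and wait until it has finished.
# The sysfs directory is passed as a parameter so any array can be used.

# Succeeds once sync_action reads back "idle", i.e. no check or resync is running.
md_is_idle() {
    [ "$(cat "$1/sync_action" 2>/dev/null)" = "idle" ]
}

# Start a check on the array, poll until it is done, then print the mismatch count.
run_md_check() {
    md_dir="$1"    # e.g. /sys/block/md1/md
    echo check > "$md_dir/sync_action" || return 1
    until md_is_idle "$md_dir"; do
        sleep 10
    done
    cat "$md_dir/mismatch_cnt"
}

# Usage (as root): run_md_check /sys/block/md1/md
```

While the check is running, the errors shown in the appendix should start to appear in the kernel log if the problem is present.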
Possible fixes
The following possible fixes have been suggested in various forums:
- Kernel options:
  - sata_nv.swncq=1
  - libata.force=noncq
  - noapic
  - acpi=off
  - libata.force=1.5G
- hdparm -W0 /dev/sd?
- disable irqbalance
Workaround
The following parameter was added to the end of the Linux kernel command line, and the errors have gone away. It disables SWNCQ in the sata_nv driver; I am unconcerned about performance on these disks, as they are only used for booting.
sata_nv.swncq=0
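After rebooting with the parameter in place, it is worth confirming that it actually took effect. The sketch below is one way to do that; the function names are hypothetical, and it assumes the driver exposes its parameters under /sys/module/sata_nv/parameters as shown in the appendix.

```shell
#!/bin/sh
# Sketch: confirm that the swncq workaround is active after a reboot.
# Both paths are parameters so the checks can also be run against saved copies.

# Succeeds if the kernel command line contains sata_nv.swncq=0.
cmdline_has_swncq_off() {
    case " $(cat "${1:-/proc/cmdline}") " in
        *" sata_nv.swncq=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if the loaded sata_nv driver reports swncq as disabled (N).
driver_swncq_off() {
    [ "$(cat "${1:-/sys/module/sata_nv/parameters/swncq}" 2>/dev/null)" = "N" ]
}

if cmdline_has_swncq_off && driver_swncq_off; then
    echo "SWNCQ is disabled"
else
    echo "SWNCQ is still enabled, or sata_nv is not loaded"
fi
```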
Links
- Kernel/Hard drive bad sector/freezing issues http://ubuntuforums.org/showthread.php?t=1651369
- http://www.j-schmitz.net/blog/today-i-fixed-my-hdd-breakdowns
- Nvidia MCP55 and WRITE FPDMA QUEUED failed commands http://lkml.indiana.edu/hypermail/linux/kernel/1106.2/03179.html
- http://forums.debian.net/viewtopic.php?f=7&t=71106
- http://ubuntuforums.org/showthread.php?t=2167494
- http://lxr.free-electrons.com/source/drivers/ata/sata_nv.c
- https://ata.wiki.kernel.org/index.php/Hardware,_driver_status#NVIDIA
Appendices
Error log
ata2: EH in SWNCQ mode,QC:qc_active 0x7FFFFFFF sactive 0x7FFFFFFF
ata2: SWNCQ:qc_active 0x1008010 defer_bits 0x7EFF7FEF last_issue_tag 0x18
  dhfis 0x1008010 dmafis 0x8010 sdbfis 0x0
ata2: ATA_REG 0x41 ERR_REG 0x84
ata2: tag : dhfis dmafis sdbfis sactive
ata2: tag 0x4: 1 1 0 1
ata2: tag 0xf: 1 1 0 1
ata2: tag 0x18: 1 0 0 1
ata2.00: exception Emask 0x1 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
ata2.00: Ata error. fis:0x21
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:00:80:c9:51/00:00:01:00:00/40 tag 0 ncq 65536 in
         res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:08:80:c7:51/00:00:01:00:00/40 tag 1 ncq 65536 in
         res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:10:00:ca:51/00:00:01:00:00/40 tag 2 ncq 65536 in
         res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:18:00:d1:51/00:00:01:00:00/40 tag 3 ncq 65536 in
         res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:20:80:c4:51/00:00:01:00:00/40 tag 4 ncq 65536 in
         res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/80:28:80:cd:51/00:00:01:00:00/40 tag 5 ncq 65536 in
         res 41/84:c0:80:c5:51/84:00:01:00:00/40 Emask 0x10 (ATA bus error)
ata2.00: status: { DRDY ERR }
ata2.00: error: { ICRC ABRT }
ata2.00: failed command: READ FPDMA QUEUED
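To compare runs before and after a change, it can be handy to count how many commands failed rather than read the whole log. This is a rough sketch only; the function name is made up, and it assumes the 'failed command:' wording shown in the log above.

```shell
#!/bin/sh
# Sketch: count failed ATA commands, grouped by command name.
# Reads kernel log text on stdin and prints one "count  failed command: NAME" line per command.
count_failed_commands() {
    grep -oE 'failed command: [A-Z]+( [A-Z]+)*' | sort | uniq -c
}

# Usage: dmesg | count_failed_commands
```

A count that stays empty after a full RAID check suggests the change under test has helped.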
Script from ubuntu forums
#!/bin/bash

do_grub_var() {
    cat /etc/default/grub | sed "$1" > /tmp/grub-tmp
    sudo cp /tmp/grub-tmp /etc/default/grub
}

append_grub_var() {
    do_grub_var 's|\('$1'="[^"]*\)"|\1 '"$2"'"|'
}

change_grub_var() {
    do_grub_var 's|\('$1'="\)[^"]*"|\1'"$2"'"|'
}

disable_grub_var() {
    do_grub_var 's|'$1'=.*|#&|'
}

enable_grub_var() {
    do_grub_var 's|#*\('$1'=.*\)|\1|'
}

add_to_startup() {
    local linecount=`cat /etc/rc.local | wc -l`
    sed $(( $linecount - 1 ))'s_.*_&\n'"$1"'_' /etc/rc.local
}

append_grub_var GRUB_CMDLINE_LINUX_DEFAULT "libata.force=noncq noapic acpi=off"
add_to_startup "hdparm -W0 /dev/sd?"
MCP55 PCI info
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) (prog-if 85 [Master SecO PriO])
        Subsystem: Super Micro Computer Inc Device 1611
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0 (750ns min, 250ns max)
        Interrupt: pin A routed to IRQ 21
        Region 0: I/O ports at a480 [size=8]
        Region 1: I/O ports at a400 [size=4]
        Region 2: I/O ports at a080 [size=8]
        Region 3: I/O ports at a000 [size=4]
        Region 4: I/O ports at 9c00 [size=16]
        Region 5: Memory at fd9bd000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [b0] MSI: Enable- Count=1/4 Maskable- 64bit+
                Address: 0000000000000000 Data: 0000
        Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
        Kernel driver in use: sata_nv

00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) (prog-if 85 [Master SecO PriO])
        Subsystem: Super Micro Computer Inc Device 1611
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0 (750ns min, 250ns max)
        Interrupt: pin B routed to IRQ 23
        Region 0: I/O ports at 9880 [size=8]
        Region 1: I/O ports at 9800 [size=4]
        Region 2: I/O ports at 9480 [size=8]
        Region 3: I/O ports at 9400 [size=4]
        Region 4: I/O ports at 9080 [size=16]
        Region 5: Memory at fd9bc000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [b0] MSI: Enable- Count=1/4 Maskable- 64bit+
                Address: 0000000000000000 Data: 0000
        Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
        Kernel driver in use: sata_nv

00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) (prog-if 85 [Master SecO PriO])
        Subsystem: Super Micro Computer Inc Device 1611
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0 (750ns min, 250ns max)
        Interrupt: pin C routed to IRQ 22
        Region 0: I/O ports at 9000 [size=8]
        Region 1: I/O ports at 8c00 [size=4]
        Region 2: I/O ports at 8880 [size=8]
        Region 3: I/O ports at 8800 [size=4]
        Region 4: I/O ports at 8480 [size=16]
        Region 5: Memory at fd9bb000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [44] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [b0] MSI: Enable- Count=1/4 Maskable- 64bit+
                Address: 0000000000000000 Data: 0000
        Capabilities: [cc] HyperTransport: MSI Mapping Enable- Fixed+
        Kernel driver in use: sata_nv
sata_nv module parameters
# modinfo -p sata_nv
adma:Enable use of ADMA (Default: false) (bool)
swncq:Enable use of SWNCQ (Default: true) (bool)
msi:Enable use of MSI (Default: false) (bool)
Show sata_nv settings
# grep . /sys/module/sata_nv/parameters/*
/sys/module/sata_nv/parameters/adma:N
/sys/module/sata_nv/parameters/msi:N
/sys/module/sata_nv/parameters/swncq:Y