Okay, here's what I found so far using the crash and objdump utilities. The start of the panic in dmesg:
Code:
[ 124.985777] divide error: 0000 [#1] SMP
[ 124.985862] CPU: 1 PID: 4026 Comm: wpa_supplicant Kdump: loaded Not tainted 5.15.26-gentoo-x86-1 #1
[ 124.985948] Hardware name: Gateway MX /, BIOS 83.08 03/06/07
[ 124.986016] EIP: rtl8180_tx+0x1c1/0x530 [rtl818x_pci]
EIP is the program counter at the crash point. rtl8180_tx is a function in the rtl818x_pci module. Here's the beginning of it, starting at line 454.
Code:
crash> mod -s rtl818x_pci
MODULE NAME BASE SIZE OBJECT FILE
f8337100 rtl818x_pci f832e000 45056 /lib/modules/5.15.26-gentoo-x86-1/kernel/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl818x_pci.ko
crash>
crash> gdb list *rtl8180_tx
0xf832f8c0 is in rtl8180_tx (drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c:457).
452     }
453
454     static void rtl8180_tx(struct ieee80211_hw *dev,
455                            struct ieee80211_tx_control *control,
456                            struct sk_buff *skb)
457     {
458             struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
459             struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;
460             struct rtl8180_priv *priv = dev->priv;
461             struct rtl8180_tx_ring *ring;
crash>
And here's the listing centered on the crashing line, line 544, which is where offset 0x1c1 into rtl8180_tx lands.
Code:
crash> gdb list *rtl8180_tx+0x1c1
0xf832fa81 is in rtl8180_tx (drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c:544).
539                     priv->seqno += 0x10;
540                     hdr->seq_ctrl &= cpu_to_le16(IEEE80211_SCTL_FRAG);
541                     hdr->seq_ctrl |= cpu_to_le16(priv->seqno);
542             }
543
544             idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries;
545             entry = &ring->desc[idx];
546
547             if (priv->chip_family == RTL818X_CHIP_FAMILY_RTL8187SE) {
548                     entry->frame_duration = cpu_to_le16(frame_duration);
crash>
The variable ring was initialized a bit earlier.
Code:
460             struct rtl8180_priv *priv = dev->priv;
...
473             prio = skb_get_queue_mapping(skb);
474             ring = &priv->tx_ring[prio];
Anyway, Hu was exactly right in [post]8696042[/post].
Here is the disassembly of line 544 (from objdump -S).
Code:
        idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries;
    1a6a:       8b 75 f0                mov    -0x10(%ebp),%esi
    1a6d:       31 d2                   xor    %edx,%edx
    1a6f:       c1 e6 05                shl    $0x5,%esi
    1a72:       8d 0c 37                lea    (%edi,%esi,1),%ecx
    1a75:       8b 81 bc 00 00 00       mov    0xbc(%ecx),%eax
    1a7b:       03 81 ac 00 00 00       add    0xac(%ecx),%eax
    1a81:       f7 b1 b0 00 00 00       divl   0xb0(%ecx)
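For anyone not used to reading x86 division: divl divides the 64-bit value EDX:EAX by its 32-bit operand and raises the "divide error" exception when the divisor is zero or the quotient does not fit in 32 bits. Because the xor at 1a6d clears EDX, the quotient can never overflow here, so a zero divisor is the only way this instruction can fault. Below is a minimal Python sketch of that behaviour. The function name divl and the example values are only illustrative, and Python exceptions stand in for the CPU exception.

```python
def divl(edx, eax, divisor):
    """Model x86 `divl`: divide the 64-bit value EDX:EAX by a 32-bit operand.

    Returns (quotient, remainder), which the CPU would leave in EAX and EDX.
    The CPU raises a divide error when the divisor is zero or the quotient
    does not fit in 32 bits; Python exceptions stand in for that here.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    dividend = (edx << 32) | eax
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFFFFFF:
        raise OverflowError("divide error: quotient does not fit in EAX")
    return quotient, remainder


# Hypothetical healthy ring: idx + qlen = 5 and entries = 16 selects slot 5.
print(divl(0, 5, 16))  # -> (0, 5)

# With EDX cleared, only a zero divisor can make the instruction fault.
try:
    divl(0, 0, 0)
except ZeroDivisionError as exc:
    print(exc)
```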
The assembly reads memory at three offsets from register ecx: 0xbc, 0xac and 0xb0. Those must be the three accesses to *ring fields in the source line. The divl operand, 0xb0(%ecx), must be ring->entries, the divisor. From the first source listing, ring is a (struct rtl8180_tx_ring *), and ring->entries is at offset 12 (0xc) into that structure.
Code:
crash> whatis struct rtl8180_tx_ring
struct rtl8180_tx_ring {
    struct rtl8180_tx_desc *desc;
    dma_addr_t dma;
    unsigned int idx;
    unsigned int entries;
    struct sk_buff_head queue;
}
SIZE: 32
crash>
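The field offsets can be sanity-checked without the kernel by modelling the structure with Python's ctypes. This is only a sketch under stated assumptions: pointers and dma_addr_t are taken to be 4 bytes each (32-bit x86), the spinlock inside sk_buff_head is taken to be a single 32-bit word, and the class names are made up for the example. Those sizes are chosen because they add up to the SIZE: 32 that crash reports.

```python
import ctypes


class SpinLock(ctypes.Structure):
    # Assumption: the spinlock is one 32-bit word on this build.
    _fields_ = [("val", ctypes.c_uint32)]


class SkBuffHead(ctypes.Structure):
    # Assumption: 32-bit kernel, so each pointer is 4 bytes.
    _fields_ = [
        ("next", ctypes.c_uint32),
        ("prev", ctypes.c_uint32),
        ("qlen", ctypes.c_uint32),
        ("lock", SpinLock),
    ]


class TxRing(ctypes.Structure):
    # Mirrors the field order of struct rtl8180_tx_ring from `whatis`.
    _fields_ = [
        ("desc", ctypes.c_uint32),     # pointer
        ("dma", ctypes.c_uint32),      # dma_addr_t, assumed 32-bit
        ("idx", ctypes.c_uint32),
        ("entries", ctypes.c_uint32),
        ("queue", SkBuffHead),
    ]


qlen_offset = TxRing.queue.offset + SkBuffHead.qlen.offset

print(ctypes.sizeof(TxRing))        # -> 32
print(hex(TxRing.idx.offset))       # -> 0x8
print(hex(TxRing.entries.offset))   # -> 0xc
print(hex(qlen_offset))             # -> 0x18
```

Under these assumptions idx, entries and queue.qlen sit at 0x8, 0xc and 0x18, which are spaced exactly like the three displacements 0xac, 0xb0 and 0xbc in the disassembly.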
(The first three fields are all size 4, as will be apparent later when I dump the struct.) So *ring itself must be at ecx + (0xb0-0xc = 0xa4). And we know ecx from the start of the panic dump. Showing a little more:
Code:
[ 124.985777] divide error: 0000 [#1] SMP
[ 124.985862] CPU: 1 PID: 4026 Comm: wpa_supplicant Kdump: loaded Not tainted 5.15.26-gentoo-x86-1 #1
[ 124.985948] Hardware name: Gateway MX /, BIOS 83.08 03/06/07
[ 124.986016] EIP: rtl8180_tx+0x1c1/0x530 [rtl818x_pci]
[ 124.986096] Code: 16 83 e0 0f 66 89 46 16 66 0b 87 9e 05 00 00 66 89 46 16 8b 75 f0 31 d2 c1 e6 05 8d 0c 37 8b 81 bc 00 00 00 03 81 ac 00 00 00 <f7> b1 b0 00 00 00 c1 e2 05 03 91 a4 00 00 00 83 bf 84 05 00 00 02
[ 124.986204] EAX: 00000000 EBX: c84c9b40 ECX: c89318a0 EDX: 00000000
[ 124.986276] ESI: 00000040 EDI: c8931860 EBP: c9eefa8c ESP: c9eefa5c
[ 124.986346] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210046
[ 124.986419] CR0: 80050033 CR2: b7d125f0 CR3: 09133000 CR4: 000006d0
Fourth line from the bottom, ecx was 0xc89318a0 at the time of the crash. ecx+0xa4 is 0xc8931944, the address of *ring. So look there:
Code:
crash> struct rtl8180_tx_ring c8931944
struct rtl8180_tx_ring {
  desc = 0x0,
  dma = 0,
  idx = 0,
  entries = 0,
  queue = {
    next = 0x0,
    prev = 0x0,
    qlen = 0,
    lock = {
      {
        rlock = {
          raw_lock = {
            {
              val = {
                counter = 0
              },
              {
                locked = 0 '\000',
                pending = 0 '\000'
              },
              {
                locked_pending = 0,
                tail = 0
              }
            }
          }
        }
      }
    }
  }
}
crash>
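The address arithmetic in this post is easy to get wrong by hand, so here is a short Python check using only the register values from the panic and the displacements from the disassembly. The last part is an interpretation rather than something the dump states: it assumes the shl $0x5 scales the queue index by the 32-byte ring size, which would make the pre-shift ESI value prio and EDI the base that the tx_ring array is addressed from (presumably priv).

```python
# Register values copied from the panic dump.
ECX = 0xC89318A0
EDI = 0xC8931860
ESI = 0x00000040

# Displacements used by the three memory operands in the disassembly.
QLEN_DISP = 0xBC     # mov  0xbc(%ecx),%eax
IDX_DISP = 0xAC      # add  0xac(%ecx),%eax
ENTRIES_DISP = 0xB0  # divl 0xb0(%ecx)

ENTRIES_OFFSET = 0x0C  # offset of entries inside struct rtl8180_tx_ring
RING_SIZE = 32         # SIZE reported by crash for the struct

# Displacement of *ring relative to ECX, and its absolute address.
ring_disp = ENTRIES_DISP - ENTRIES_OFFSET
ring_addr = ECX + ring_disp
print(hex(ring_disp), hex(ring_addr))  # -> 0xa4 0xc8931944

# Offsets of the other two operands inside the ring structure.
print(hex(IDX_DISP - ring_disp))   # -> 0x8  (idx)
print(hex(QLEN_DISP - ring_disp))  # -> 0x18 (queue.qlen)

# Assumption: lea (%edi,%esi,1),%ecx computes base + prio * RING_SIZE.
assert EDI + ESI == ECX
prio = ESI // RING_SIZE
print(prio)  # -> 2
```

If that reading is right, the crash happened on tx queue 2, and the same check could be repeated for the other ring addresses (EDI + 0xa4 + n * 32) to see whether any of them were ever set up.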
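That all-zero ring looks highly suspicious. Every field is zero, including entries, which is the divisor on line 544, so the modulo divides by zero. It looks like somebody is handing rtl8180_tx() a tx ring that was never initialized. I haven't been able to reach the argument "dev" from which ring is ultimately derived, but clearly the bug is higher up in the call chain.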
Anyway, I'll keep looking. I'm learning as I go.