Master One l33t
Joined: 25 Aug 2003 Posts: 754 Location: Austria
|
Posted: Fri Jun 11, 2004 12:02 pm Post subject: ati-drivers + more than 1 Xsession = crash => SOLVED!!! |
|
|
This was my biggest fight since I started playing around with a Linux desktop installation.
Since the beginning I had a very strange issue on both my workstation and my notebook, as both have an ATI Radeon graphics card:
The whole installation and configuration went just fine, and everything worked as expected as long as only one Xsession was running.
But as soon as I started another Xsession (it made no difference whether I used "startx -- :1 vt8" or let gdm spawn another Xsession automatically) and switched between VTs, the screen got messed up when returning to the initial Xsession (as usual on vt7), and the system usually hard-locked, so only a hard reset was possible.
I suspected a lot of different things, starting with the ati-drivers (currently 3.9.0-r1) and my xorg-x11 installation, and ended up trying various settings in my xorg.conf and mtrr settings (applying the well-known mtrr fix, since I use vesafb, like most people do).
After a lot of time and a lot of different attempts, I finally found the solution for this particular problem. It was so easy:
I just had to build the AGP support (/dev/agpgart and the proper chipset support) as a MODULE instead of building it into the kernel!
This seems to be related to the fact that direct rendering is only supported for one Xsession: only the first Xsession gets the fglrx 3D support, and all Xsessions that follow get only the Mesa OpenGL renderer (I just remembered that this is also the reason why xinerama disables hardware 3D support).
Starting another Xsession gives "(EE) fglrx(0): DRIScreenInit failed!", but everything works fine this way.
I do not think this has been mentioned anywhere before; maybe it should find its way into some ATI FAQs.
_________________ Las torturas mentales de la CIA
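For anyone who wants to check how their own kernel is set up before rebuilding, here is a minimal sketch. It assumes the kernel config is available as a plain text file (on many Gentoo boxes that would be /usr/src/linux/.config, but that path is an assumption), and the function name agp_config_state is made up for illustration. CONFIG_AGP is agpgart itself and the CONFIG_AGP_* symbols are the chipset drivers.

```shell
# Hypothetical helper: report how AGP support is configured in a kernel .config file.
# "y" means built into the kernel, "m" means built as a module.
agp_config_state() {
    config_file="$1"
    # Match CONFIG_AGP (agpgart) and CONFIG_AGP_<CHIPSET> lines that are set.
    grep -E '^CONFIG_AGP(_[A-Z0-9_]+)?=' "$config_file" |
    while IFS='=' read -r symbol value; do
        case "$value" in
            m) echo "$symbol: module" ;;
            y) echo "$symbol: built in" ;;
            *) echo "$symbol: $value" ;;
        esac
    done
}
```

Calling it as `agp_config_state /usr/src/linux/.config` should list each AGP symbol; going by the first post, the lines you would want to see say "module" rather than "built in".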
|
psofa Guru
Joined: 28 Feb 2004 Posts: 485
|
Posted: Fri Jun 11, 2004 3:51 pm Post subject: |
|
|
Are you sure it's just that? I mean, what difference does it make to compile AGP as a module? Can you post this in the rage3d forums? I think they have [no multiple xservers] as a known problem in the Linux subforum. _________________ psofa |
|
Master One l33t
Joined: 25 Aug 2003 Posts: 754 Location: Austria
|
Posted: Fri Jun 11, 2004 4:54 pm Post subject: |
|
|
It is indeed just that. I have no explanation for this, as I do not know anything about the internal mechanisms of the driver, but it looks like there is a serious conflict when agpgart is built into the kernel.
My assumption: judging from the various tests I did, it looks like a new Xsession takes over access to agpgart when it is built into the kernel. That is why the new session comes up correctly and it is possible to switch between the new Xsession and other VTs, but the system locks up as soon as the initial Xsession is selected. This may somehow be prevented when the AGP support is built as a module, as a new Xsession then does not get control over agpgart, and therefore also gets no hardware 3D rendering.
Anyway, building AGP support as a module did the trick, and it is finally working here as expected.
I am not registered in the rage3d forum, but you can post this info there if you like; it may help others who are not willing to play around with this matter as much as I did.
|
SupapleX n00b
Joined: 12 Jun 2004 Posts: 37
|
Posted: Wed Jun 16, 2004 8:53 am Post subject: |
|
|
Is it so easy?
All my efforts failed. I just want to use more than one Xsession. I have tried enabling and disabling SMP, radeonfb, and various options and settings in XF86Config.
And now I have:
P4 with HT (kernel 2.6.7rc_SMP)
i865 ( and agpgart.ko + intel_agp.ko)
R300 (and fglrx.ko(3.9.0-r1) without radeonfb)
Start first Xsession -> gl_hardware enabled.
Start second session -> gl software
And when I go back to the first session -> system freeze... |
|
Master One l33t
Joined: 25 Aug 2003 Posts: 754 Location: Austria
|
Posted: Wed Jun 16, 2004 4:05 pm Post subject: |
|
|
Hm, did you get any error messages in your system log after spawning the second Xsession?
Do you use vesafb?
Do you use bootsplash?
Did you set your mtrr correctly (the mtrr-fix has to be applied in any case when using vesafb!)?
From what I've seen, the AGP aperture size should be set to 64 or 128 MB, but not more than that, and you have to check that the mtrr entries (cat /proc/mtrr) are set correctly for both the card memory and the framebuffer.
It may not be related, but I am using vesafb without bootsplash (I switched it off when I started my tests on this matter and have kept it off since then).
It may be a different issue on an SMP system, I don't really know. I can only tell you that all the previous problems have gone away since I compiled the AGP support as modules.
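To make the "cat /proc/mtrr" check a bit easier to read, here is a small sketch that prints only the write-combining entries, which are the ones that matter for the card memory and the framebuffer. The function name mtrr_wc_regions is hypothetical, and the parsing assumes the "regNN: base=0x... size= NNMB: write-combining" layout of /proc/mtrr output seen on 2.6 kernels of that time.

```shell
# Hypothetical helper: list write-combining MTRR regions from /proc/mtrr-style
# input on stdin. Prints one line per region: <register> <base> <size>.
mtrr_wc_regions() {
    awk '/write-combining/ {
        reg = $1
        sub(/:$/, "", reg)              # "reg01:" -> "reg01"
        base = "?"
        size = "?"
        if (match($0, /base=0x[0-9a-fA-F]+/)) {
            base = substr($0, RSTART + 5, RLENGTH - 5)
        }
        if (match($0, /size= *[0-9]+[KM]B/)) {
            size = substr($0, RSTART, RLENGTH)
            sub(/^size= */, "", size)   # drop the "size=" label and padding
        }
        print reg, base, size
    }'
}
```

Run it as `mtrr_wc_regions < /proc/mtrr` and compare the base addresses with the "Mem @" addresses that X logs for the card.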
|
SupapleX n00b
Joined: 12 Jun 2004 Posts: 37
|
Posted: Wed Jun 16, 2004 8:02 pm Post subject: |
|
|
aperture AGP 128MB (or 256)
radeonfb disabled
VESA VGA fb is enabled in the kernel, but I now boot with:
kernel /vmlinuz root=/dev/hdb3
i.e. without fb.
I don't have bootsplash in the kernel at all; I just see Tux.
About mtrr:
Is 3.9.0-r1 patched with the mtrr-fix?
The bug was there in single-CPU mode too.
Now I'm going to remove vesafb from the kernel completely... I'll reply tomorrow.
|
Master One l33t
Joined: 25 Aug 2003 Posts: 754 Location: Austria
|
Posted: Wed Jun 16, 2004 8:14 pm Post subject: |
|
|
SupapleX wrote: | aperture AGP 128MB (or 256) |
As I said, not more than 128, if you want to reserve that much memory for mtrr.
SupapleX wrote: | radeonfb disabled |
I don't know what you mean by "disabled". Don't select it in the kernel at all, and use vesafb compiled in instead.
SupapleX wrote: | In kernel VESA VGA fb enabled, but I boot no from:
kernel /vmlinuz root=/dev/hdb3
i.e. without fb. |
If you have no framebuffer selected at all, the mtrrs should normally be set correctly automatically; you can check with "cat /proc/mtrr". When using vesafb (as almost everybody does), at least one mtrr entry has to be corrected using the well-known mtrr-fix patch.
SupapleX wrote: | I haven't bootsplash in kernel at all, just I can see tux. |
Turn off the boot logo; I remember an issue with this as well.
SupapleX wrote: | About mttr:Did 3.9.0-r1 patched with mtrr-fix? |
No, this has nothing to do with the ati-drivers. Search for mtrr-fix in the forum.
|
SupapleX n00b
Joined: 12 Jun 2004 Posts: 37
|
Posted: Thu Jun 17, 2004 10:55 am Post subject: |
|
|
And now:
AGP size = 128 (R300 (9500 non-pro), 64 MB on the card).
No FB of any kind in the kernel.
I did the mtrr-fix in this manner (is that important?):
Everything was done in the first (and only) X session:
#cat /var/log/XFree86.0.log |grep ATI
(--) PCI:*(1:0:0) ATI Technologies Inc Radeon R300 AD [Radeon 9500 Pro] rev 0, Mem @ 0xf0000000/26, 0xf9000000/16, I/O @ 0xb000/8
#cat /proc/mtrr
reg00: base=0x00000000 ( 0MB), size= 512MB: write-back, count=1
reg01: base=0xf0000000 (3840MB), size= 64MB: write-combining, count=1
reg02: base=0xe8000000 (3712MB), size= 128MB: write-combining, count=1
#echo "disable=1" > /proc/mtrr
#echo "base=0xf0000000 size=0x4000000 type=write-combining" > /proc/mtrr
#cat /proc/mtrr
reg00: base=0x00000000 ( 0MB), size= 512MB: write-back, count=1
reg01: base=0xf0000000 (3840MB), size= 64MB: write-combining, count=1
reg02: base=0xe8000000 (3712MB), size= 128MB: write-combining, count=1
But that fix only applies here: "It's just users who've enabled vesa framebuffer who have this problem".
And I don't have messages like these anywhere:
"mtrr: 0xd8000000,0x8000000 overlaps existing 0xd8000000,0x1000000" &
[fglrx:firegl_addmap] *ERROR* mtrr allocation failed (-22)
And it can't change anything! 64 -> 64? (Not 16 -> 64 as in http://www.rage3d.net/board/showthread.php?s=e13f46aec4e42fe34404b035ea5e51f9&threadid=33736241&highlight=fglrx+mtrrfix)
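The size=0x4000000 in the echo command above is simply 64 MB written in hex. As a rough sketch for getting that value right, the hypothetical helper mtrr_wc_line below builds the string that would be written to /proc/mtrr from a base address and a size in MB. The base has to come from your own X log; the helper does not check it.

```shell
# Hypothetical helper: build a write-combining entry for /proc/mtrr.
# $1 = base address as hex (e.g. 0xf0000000), $2 = region size in MB.
mtrr_wc_line() {
    base="$1"
    size_mb="$2"
    # Convert MB to bytes and print the size in hex, as /proc/mtrr expects.
    printf 'base=%s size=0x%x type=write-combining\n' \
        "$base" "$((size_mb * 1024 * 1024))"
}
```

For the card in this post, `mtrr_wc_line 0xf0000000 64` gives the same line that was echoed above. Since reg01 already covered 64 MB at that base, writing it again leaves the table as it was, which matches the unchanged "cat /proc/mtrr" output.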
dmesg said this about fglrx & agp (after changing the mtrr):
fglrx: module license 'Proprietary. (C) 2002 - ATI Technologies, Starnberg, GERMANY' taints kernel.
[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
[fglrx] module loaded - fglrx 3.9.0 [May 11 2004] on minor 0
[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
[fglrx] AGP detected, AgpState = 0x1f004217 (hardware caps of chipset)
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
agpgart: Device is in legacy mode, falling back to 2.x
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[fglrx] AGP enabled, AgpCommand = 0x1f000314 (selected caps)
[fglrx] free AGP = 121909248
[fglrx] max AGP = 121909248
[fglrx] free LFB = 49283072
[fglrx] max LFB = 49283072
[fglrx] free Inv = 0
[fglrx] max Inv = 0
[fglrx] total Inv = 0
[fglrx] total TIM = 0
[fglrx] total FB = 0
[fglrx] total AGP = 32768
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
[fglrx] free AGP = 121909248
[fglrx] max AGP = 121909248
[fglrx] free LFB = 49283072
[fglrx] max LFB = 49283072
[fglrx] free Inv = 0
[fglrx] max Inv = 0
[fglrx] total Inv = 0
[fglrx] total TIM = 0
[fglrx] total FB = 0
[fglrx] total AGP = 32768
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
Hmm...
The trouble is not at this point... but the crash with the second X continues.
And part of my XF86Config:
Option "KernelModuleParm" "agplock=0"
Option "mtrr" "off"
Option "UseFastTLS" "0"
Option "BlockSignalsOnLock" "off"
Option "UseInternalAGPGART" "no"
Option "ForceGenericCPU" "no" |
|
SupapleX n00b
Joined: 12 Jun 2004 Posts: 37
|
Posted: Thu Jun 17, 2004 11:44 am Post subject: |
|
|
Wow!
I got a diff from dmesg! The diff corresponds to these steps:
1) start X:0
2) start X:4
3) dmesg > dmesg1
4) console1: # sleep 20; dmesg > dmesg2
5) goto X:0 <- crash. GDM auto-restart.
6) goto X:4 <- crash, freeze.
...
<20sec, dmesg > dmesg2
...
hard reboot by hand.
And you can see #diff dmesg2 dmesg1 >
[fglrx:firegl_lock_free] *ERROR* lock was not held by 1! (*lock=0x00000000)
< [fglrx:firegl_unlock] *ERROR* firegl_lock_free failed!
< Unable to handle kernel paging request at virtual address a588219c
< printing eip:
< e0b2e4d3
< *pde = 00000000
< Oops: 0002 [#1]
< PREEMPT SMP
< Modules linked in: fglrx snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec snd_util_mem snd_hwdep snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_page_alloc snd_timer snd_mixer_oss snd usbcore intel_agp agpgart
< CPU: 1
< EIP: 0060:[<e0b2e4d3>] Tainted: P
< EFLAGS: 00213296 (2.6.7-c3-gos4null)
< EIP is at firegl_PM4WaitForIdle+0x53/0x160 [fglrx]
< eax: a588219c ebx: cc13f014 ecx: cc13f014 edx: f020c067
< esi: cc13f014 edi: f020c06f ebp: e0b4cd24 esp: dab6dec8
< ds: 007b es: 007b ss: 0068
< Process X (pid: 6555, threadinfo=dab6c000 task=db2876f0)
< Stack: e0b4cd24 00000004 cc13f000 e0b4c8e0 00000000 e0b4ca80 00000001 e0b283ed
< e0b4cd24 00000001 00000000 e0b28905 e0b3d280 e0b3a78e 00000000 dab6df38
< e0b4c948 ce5d4000 00000000 ce5d4000 c1482638 e0b4ca80 00000001 e0b28726
< Call Trace:
< [<e0b283ed>] firegl_lock_device+0x38d/0x630 [fglrx]
< [<e0b28905>] firegl_lock_free+0x55/0xd0 [fglrx]
< [<e0b28726>] firegl_lock+0x96/0x220 [fglrx]
< [<e0b28690>] firegl_lock+0x0/0x220 [fglrx]
< [<e0b263dd>] firegl_ioctl+0x15d/0x1e0 [fglrx]
< [<c0168673>] sys_ioctl+0x119/0x286
< [<c0112497>] smp_apic_timer_interrupt+0xdd/0x145
< [<c0105ac3>] syscall_call+0x7/0xb
<
< Code: c7 00 0b 0d 00 00 89 c2 83 c2 10 c7 40 04 0f 00 00 00 c7 40
< <6>[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
< [fglrx:firegl_umm_init] *ERROR* UMM area already initialized!
< [fglrx:firegl_unlock] *ERROR* Process 8107 using kernel context 0
< atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
< atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
< ------------[ cut here ]------------
< kernel BUG at mm/memory.c:1573!
< invalid operand: 0000 [#2]
< PREEMPT SMP
< Modules linked in: fglrx snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec snd_util_mem snd_hwdep snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_page_alloc snd_timer snd_mixer_oss snd usbcore intel_agp agpgart
< CPU: 0
< EIP: 0060:[<c014732f>] Tainted: P
< EFLAGS: 00213246 (2.6.7-c3-gos4null)
< EIP is at do_file_page+0x111/0x11e
< eax: cc13f028 ebx: dada27b0 ecx: df2b0404 edx: 00000000
< esi: 00000001 edi: 4040a000 ebp: df181d00 esp: d3971eb4
< ds: 007b es: 007b ss: 0068
< Process X (pid: 7862, threadinfo=d3970000 task=d2983160)
< Stack: c040adc0 00000000 00000001 00000100 00000028 df2b0404 00203202 df2b0404
< df181d00 4040a000 c01473e4 df181d00 dada27b0 4040a000 00000001 cc13f028
< df2b0404 c1407060 df181d00 df181d20 dada27b0 d2983160 c01154a4 df181d00
< Call Trace:
< [<c01473e4>] handle_mm_fault+0xa8/0x1af
< [<c01154a4>] do_page_fault+0x328/0x512
< [<c01186b2>] scheduler_tick+0x11a/0x4f7
< [<c010c0c9>] timer_interrupt+0x7a/0x16f
< [<c0120db8>] __do_softirq+0xa8/0xaa
< [<c011517c>] do_page_fault+0x0/0x512
< [<c010654d>] error_code+0x2d/0x38
<
< Code: 0f 0b 25 06 75 04 32 c0 e9 22 ff ff ff 55 b9 00 e0 ff ff 21
< <6>note: X[7862] exited with preempt_count 1 |
|
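The before/after dmesg comparison in the post above can be sketched roughly like this. Both function names are made up for illustration, and the grep pattern only covers the kinds of lines that show up in this thread (fglrx *ERROR* lines and the oops/BUG headers), so treat it as a starting point rather than a complete filter. Note that the earlier snapshot goes first here, so new lines are the ones diff marks with ">".

```shell
# Hypothetical helper: print the lines that are in the later dmesg snapshot
# but not in the earlier one. $1 = earlier file, $2 = later file.
new_kernel_messages() {
    before="$1"
    after="$2"
    # diff exits non-zero when the files differ, which is the expected case.
    { diff "$before" "$after" || true; } | sed -n 's/^> //p'
}

# Hypothetical helper: keep only fglrx errors and oops/BUG headers from stdin.
fglrx_problems() {
    grep -E 'fglrx:.*\*ERROR\*|Oops:|kernel BUG at|EIP is at'
}
```

With the two snapshots from the steps above, that would be `new_kernel_messages dmesg1 dmesg2 | fglrx_problems`.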
Master One l33t
Joined: 25 Aug 2003 Posts: 754 Location: Austria
|
Posted: Sun Jun 20, 2004 11:07 am Post subject: |
|
|
The only thing that looks suspicious to me is:
Code: | atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly. |
Do you have the option "omit xfree86-dga" enabled in your xorg.conf (it should be uncommented)?
BTW, I assume you use XFree86. Try switching over to xorg-x11.
|
hbmartin Guru
Joined: 12 Sep 2003 Posts: 386 Location: Home is where the boxen are
|
Posted: Tue Jan 04, 2005 3:17 am Post subject: |
|
|
I'm getting similar errors when xdm starts. I wouldn't mind switching to xorg, but I don't know how, since it is blocked by my current X when I try to emerge it.
How do I fix the Radeon bug or switch to xorg?
Thanks,
Harold |
|