Gentoo Forums
ati-drivers + more than 1 Xsession = crash => SOLVED!!!
Master One
l33t


Joined: 25 Aug 2003
Posts: 754
Location: Austria

Posted: Fri Jun 11, 2004 12:02 pm    Post subject: ati-drivers + more than 1 Xsession = crash => SOLVED!!!

This was my biggest fight since I started playing around with a Linux desktop installation.

Since the beginning I had a very strange issue with my workstation and my notebook, as both have an ATI Radeon graphics card:

The whole installation and configuration went just fine, everything was working as expected, as long as only one Xsession was running.

But as soon as I started another Xsession (no matter whether with "startx -- :1 vt8" or by letting gdm spawn another Xsession automatically) and switched between VTs, the screen got garbled when returning to the initial Xsession (usually on vt7), and the system usually hard-locked, so only a hard reset was possible.

I suspected a lot of different things, starting with the ati-drivers package (currently 3.9.0-r1) and my xorg-x11 installation, and ended up trying various settings in my xorg.conf and mtrr settings (applying the well-known fix, since I use vesafb like most people do).

After a lot of time and a lot of different tries, I finally ended up with the solution for this particular problem. It was so easy:

I just had to build the AGP support (/dev/agpgart and the proper chipset-support) as MODULE instead of building it into the kernel!
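
For anyone who wants to check their own setup, here is a minimal sketch. It assumes the kernel sources live in /usr/src/linux (the usual Gentoo location, but an assumption here) and that the chipset driver is one of the CONFIG_AGP_* options. The helper name agp_config_status is made up for this example. It only reads the .config and reports whether agpgart and the chipset drivers are modules or built-in.

```shell
#!/bin/sh
# Report how AGP support is configured in a kernel .config file.
# Usage: agp_config_status [path-to-.config]
# The default path below is an assumption, not something stated in this thread.
agp_config_status() {
    config="${1:-/usr/src/linux/.config}"
    # CONFIG_AGP is the agpgart core; CONFIG_AGP_<CHIPSET> lines are the chipset drivers.
    grep -E '^CONFIG_AGP(_[A-Z0-9_]+)?=' "$config" | while IFS='=' read -r name value; do
        case "$value" in
            m) echo "$name: module" ;;
            y) echo "$name: built-in" ;;
            *) echo "$name: $value" ;;
        esac
    done
}
```

Both CONFIG_AGP and the entry for your chipset should report "module". After rebuilding and rebooting, `lsmod | grep agp` should then list agpgart plus the chipset module.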

This seems to be related to the fact that direct rendering is only supported for one Xsession, which means that only the first Xsession gets the fglrx 3D support, and all Xsessions that follow only get the Mesa OpenGL renderer (I just remembered that this is also the reason why xinerama disables hardware 3D support).

Starting another Xsession gives "(EE) fglrx(0): DRIScreenInit failed!", but everything works fine this way.

I do not think this has been mentioned anywhere before; maybe it should find its way into some ATI FAQs.
_________________
Las torturas mentales de la CIA

psofa
Guru


Joined: 28 Feb 2004
Posts: 485

Posted: Fri Jun 11, 2004 3:51 pm

Are you sure it's just that? I mean, what difference does it make to compile AGP as a module? Can you post this in the rage3d forums? I think they have [no multiple xservers] as a known problem in the Linux subforum.
_________________
psofa

Master One
l33t


Joined: 25 Aug 2003
Posts: 754
Location: Austria

Posted: Fri Jun 11, 2004 4:54 pm

It is indeed just that. I have no explanation for this, as I do not know anything about the internals of the driver, but it looks like there is a serious conflict when agpgart is built into the kernel.

My assumption: judging from the various tests I did, it looks like a new Xsession takes over access to agpgart when it is built into the kernel. That is why the new session comes up correctly and it is possible to switch between the new Xsession and other VTs, but the system locks up as soon as the initial Xsession is selected. This may somehow be prevented when AGP support is built as a module, as a new Xsession then does not get control over agpgart and therefore also gets no hardware 3D rendering.

Anyway, building AGP support as module did the trick, and it is finally working here as expected.

I am not registered in the rage3d forum, but you can post this info there if you like; it may help others who are not willing to play around with this matter as much as I did.
_________________
Las torturas mentales de la CIA

SupapleX
n00b


Joined: 12 Jun 2004
Posts: 37

Posted: Wed Jun 16, 2004 8:53 am

Is it that easy?
All my efforts failed. I just want to use more than one Xsession. I tried enabling and disabling SMP, radeonfb, and various options and settings in XF86Config.
And now I have:
P4 with HT (kernel 2.6.7rc_SMP)
i865 (with agpgart.ko + intel_agp.ko)
R300 (with fglrx.ko (3.9.0-r1), without radeonfb)
Start first Xsession -> GL hardware rendering enabled.
Start second session -> GL software rendering.
And when I go back to the first session -> system freeze... :evil:

Master One
l33t


Joined: 25 Aug 2003
Posts: 754
Location: Austria

Posted: Wed Jun 16, 2004 4:05 pm

Hm, did you get any error messages in your system log after spawning the second Xsession?

Do you use vesafb?
Do you use bootsplash?
Did you set your mtrr correctly (the mtrr-fix has to be applied in any case when using vesafb!)?

From what I've seen, the AGP aperture size should be set to 64 or 128 MB, but not more than that, and you have to check that the mtrr entries (cat /proc/mtrr) are set correctly for both the card memory and the framebuffer.
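
A quick way to see which regions are currently write-combining is sketched below. The helper name mtrr_wc_entries is invented for this example, and the pattern assumes the /proc/mtrr line layout that appears in the listings posted in this thread, so treat it as a starting point rather than a finished tool.

```shell
#!/bin/sh
# Print "base size" for every write-combining entry in an mtrr listing.
# Usage: mtrr_wc_entries [file]   (defaults to /proc/mtrr)
mtrr_wc_entries() {
    # Keep the base address and the size of each line whose type is write-combining.
    sed -n 's/^reg[0-9]*: base=\(0x[0-9a-fA-F]*\).*size= *\([0-9]*[KMG]B\)[:,].*write-combining.*/\1 \2/p' "${1:-/proc/mtrr}"
}
```

Compare the printed base addresses with the "Mem @" address that X logs for the card; the card memory (and the framebuffer region, if vesafb is in use) should show up with the expected size.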

It may not be related, but I am using vesafb without bootsplash (I switched it off when I started my tests on this matter and have kept it off since then).

It may be a different issue on an SMP system, I don't really know. I can only tell you that all the previous problems have gone away since I compiled the AGP support as modules.
_________________
Las torturas mentales de la CIA

SupapleX
n00b


Joined: 12 Jun 2004
Posts: 37

Posted: Wed Jun 16, 2004 8:02 pm

AGP aperture 128MB (or 256)
radeonfb disabled
In the kernel, VESA VGA fb is enabled, but I now boot with:
kernel /vmlinuz root=/dev/hdb3
i.e. without fb.
I don't have bootsplash in the kernel at all, I just see Tux.

About mtrr:
Is 3.9.0-r1 patched with the mtrr-fix?

The bug was there in single-CPU mode too.

Now I'm trying to remove vesafb from the kernel completely... I'll reply tomorrow.

Master One
l33t


Joined: 25 Aug 2003
Posts: 754
Location: Austria

Posted: Wed Jun 16, 2004 8:14 pm

SupapleX wrote:
AGP aperture 128MB (or 256)

As I said, not more than 128, if you want to reserve that much memory for mtrr.

SupapleX wrote:
radeonfb disabled

I don't know what you mean by "disabled". Don't select it in the kernel at all, and use vesafb compiled in instead.

SupapleX wrote:
In the kernel, VESA VGA fb is enabled, but I now boot with:
kernel /vmlinuz root=/dev/hdb3
i.e. without fb.

If you have no framebuffer selected at all, the mtrrs should normally be set correctly automatically; you can check with "cat /proc/mtrr". When using vesafb (as almost everybody does), at least one mtrr entry has to be corrected using the well-known mtrr-fix patch.

SupapleX wrote:
I don't have bootsplash in the kernel at all, I just see Tux.

Turn off the boot logo; I remember an issue with this as well.

SupapleX wrote:
About mtrr: Is 3.9.0-r1 patched with the mtrr-fix?

No, this has nothing to do with ati-drivers. Search for mtrr-fix in the forum.
_________________
Las torturas mentales de la CIA

SupapleX
n00b


Joined: 12 Jun 2004
Posts: 37

Posted: Thu Jun 17, 2004 10:55 am

And now:
AGPSize=128 (R300 (9500 non-Pro), 64MB on the card).
No FB of any kind in the kernel.
I did the mtrr-fix in this manner (is it important?), all from the first (and only) X:
#cat /var/log/XFree86.0.log |grep ATI
(--) PCI:*(1:0:0) ATI Technologies Inc Radeon R300 AD [Radeon 9500 Pro] rev 0, Mem @ 0xf0000000/26, 0xf9000000/16, I/O @ 0xb000/8
#cat /proc/mtrr
reg00: base=0x00000000 ( 0MB), size= 512MB: write-back, count=1
reg01: base=0xf0000000 (3840MB), size= 64MB: write-combining, count=1
reg02: base=0xe8000000 (3712MB), size= 128MB: write-combining, count=1
#echo "disable=1" > /proc/mtrr
#echo "base=0xf0000000 size=0x4000000 type=write-combining" > /proc/mtrr
#cat /proc/mtrr
reg00: base=0x00000000 ( 0MB), size= 512MB: write-back, count=1
reg01: base=0xf0000000 (3840MB), size= 64MB: write-combining, count=1
reg02: base=0xe8000000 (3712MB), size= 128MB: write-combining, count=1

But that fix is only for: "It's just users who've enabled vesa framebuffer who have this problem".

And I don't have messages like these anywhere:
"mtrr: 0xd8000000,0x8000000 overlaps existing 0xd8000000,0x1000000" &
[fglrx:firegl_addmap] *ERROR* mtrr allocation failed (-22)

And it can't change anything! 64->64? (not 16->64) as in http://www.rage3d.net/board/showthread.php?s=e13f46aec4e42fe34404b035ea5e51f9&threadid=33736241&highlight=fglrx+mtrrfix

dmesg says (about fglrx & agp, after changing mtrr):
fglrx: module license 'Proprietary. (C) 2002 - ATI Technologies, Starnberg, GERMANY' taints kernel.
[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
[fglrx] module loaded - fglrx 3.9.0 [May 11 2004] on minor 0
[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
[fglrx] AGP detected, AgpState = 0x1f004217 (hardware caps of chipset)
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
agpgart: Device is in legacy mode, falling back to 2.x
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
[fglrx] AGP enabled, AgpCommand = 0x1f000314 (selected caps)
[fglrx] free AGP = 121909248
[fglrx] max AGP = 121909248
[fglrx] free LFB = 49283072
[fglrx] max LFB = 49283072
[fglrx] free Inv = 0
[fglrx] max Inv = 0
[fglrx] total Inv = 0
[fglrx] total TIM = 0
[fglrx] total FB = 0
[fglrx] total AGP = 32768
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
[fglrx] free AGP = 121909248
[fglrx] max AGP = 121909248
[fglrx] free LFB = 49283072
[fglrx] max LFB = 49283072
[fglrx] free Inv = 0
[fglrx] max Inv = 0
[fglrx] total Inv = 0
[fglrx] total TIM = 0
[fglrx] total FB = 0
[fglrx] total AGP = 32768
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.

Hmm...
The trouble is not at this point... but the crash with the second X continues.
And part of my XF86Config:
Option "KernelModuleParm" "agplock=0"
Option "mtrr" "off"
Option "UseFastTLS" "0"
Option "BlockSignalsOnLock" "off"
Option "UseInternalAGPGART" "no"
Option "ForceGenericCPU" "no"

SupapleX
n00b


Joined: 12 Jun 2004
Posts: 37

Posted: Thu Jun 17, 2004 11:44 am

Wow!
I got a diff from dmesg! The diff corresponds to these steps:
1) start X:0
2) start X:4
3) dmesg > dmesg1
4) console1: # sleep 20; dmesg > dmesg2
5) goto X:0 <- crash. GDM auto-restarts.
6) goto X:4 <- crash, freeze.
...
<- 20 sec elapsed, dmesg > dmesg2 runs
...
hard reboot by hand.
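
The capture-and-compare part of these steps could be wrapped roughly like this. It is only a sketch: the helper name new_log_lines is made up for this example, and it prints the lines that are in the later log but not in the earlier one.

```shell
#!/bin/sh
# Print the lines present in the "after" log but not in the "before" log.
# Usage: new_log_lines before.log after.log
new_log_lines() {
    # diff exits non-zero when the files differ, which is expected here.
    { diff "$1" "$2" || true; } | sed -n 's/^> //p'
}

# Example (from a text console, before switching back to the first X):
#   dmesg > dmesg1; sleep 20; dmesg > dmesg2
#   new_log_lines dmesg1 dmesg2
```

Writing the second log to disk before the freeze matters, since the machine needs a hard reboot afterwards.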
And here is the output of #diff dmesg2 dmesg1:

[fglrx:firegl_lock_free] *ERROR* lock was not held by 1! (*lock=0x00000000)
< [fglrx:firegl_unlock] *ERROR* firegl_lock_free failed!
< Unable to handle kernel paging request at virtual address a588219c
< printing eip:
< e0b2e4d3
< *pde = 00000000
< Oops: 0002 [#1]
< PREEMPT SMP
< Modules linked in: fglrx snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec snd_util_mem snd_hwdep snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_page_alloc snd_timer snd_mixer_oss snd usbcore intel_agp agpgart
< CPU: 1
< EIP: 0060:[<e0b2e4d3>] Tainted: P
< EFLAGS: 00213296 (2.6.7-c3-gos4null)
< EIP is at firegl_PM4WaitForIdle+0x53/0x160 [fglrx]
< eax: a588219c ebx: cc13f014 ecx: cc13f014 edx: f020c067
< esi: cc13f014 edi: f020c06f ebp: e0b4cd24 esp: dab6dec8
< ds: 007b es: 007b ss: 0068
< Process X (pid: 6555, threadinfo=dab6c000 task=db2876f0)
< Stack: e0b4cd24 00000004 cc13f000 e0b4c8e0 00000000 e0b4ca80 00000001 e0b283ed
< e0b4cd24 00000001 00000000 e0b28905 e0b3d280 e0b3a78e 00000000 dab6df38
< e0b4c948 ce5d4000 00000000 ce5d4000 c1482638 e0b4ca80 00000001 e0b28726
< Call Trace:
< [<e0b283ed>] firegl_lock_device+0x38d/0x630 [fglrx]
< [<e0b28905>] firegl_lock_free+0x55/0xd0 [fglrx]
< [<e0b28726>] firegl_lock+0x96/0x220 [fglrx]
< [<e0b28690>] firegl_lock+0x0/0x220 [fglrx]
< [<e0b263dd>] firegl_ioctl+0x15d/0x1e0 [fglrx]
< [<c0168673>] sys_ioctl+0x119/0x286
< [<c0112497>] smp_apic_timer_interrupt+0xdd/0x145
< [<c0105ac3>] syscall_call+0x7/0xb
<
< Code: c7 00 0b 0d 00 00 89 c2 83 c2 10 c7 40 04 0f 00 00 00 c7 40
< <6>[fglrx] Maximum main memory to use for locked dma buffers: 432 MBytes.
< [fglrx:firegl_umm_init] *ERROR* UMM area already initialized!
< [fglrx:firegl_unlock] *ERROR* Process 8107 using kernel context 0
< atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
< atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.
< ------------[ cut here ]------------
< kernel BUG at mm/memory.c:1573!
< invalid operand: 0000 [#2]
< PREEMPT SMP
< Modules linked in: fglrx snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec snd_util_mem snd_hwdep snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_page_alloc snd_timer snd_mixer_oss snd usbcore intel_agp agpgart
< CPU: 0
< EIP: 0060:[<c014732f>] Tainted: P
< EFLAGS: 00213246 (2.6.7-c3-gos4null)
< EIP is at do_file_page+0x111/0x11e
< eax: cc13f028 ebx: dada27b0 ecx: df2b0404 edx: 00000000
< esi: 00000001 edi: 4040a000 ebp: df181d00 esp: d3971eb4
< ds: 007b es: 007b ss: 0068
< Process X (pid: 7862, threadinfo=d3970000 task=d2983160)
< Stack: c040adc0 00000000 00000001 00000100 00000028 df2b0404 00203202 df2b0404
< df181d00 4040a000 c01473e4 df181d00 dada27b0 4040a000 00000001 cc13f028
< df2b0404 c1407060 df181d00 df181d20 dada27b0 d2983160 c01154a4 df181d00
< Call Trace:
< [<c01473e4>] handle_mm_fault+0xa8/0x1af
< [<c01154a4>] do_page_fault+0x328/0x512
< [<c01186b2>] scheduler_tick+0x11a/0x4f7
< [<c010c0c9>] timer_interrupt+0x7a/0x16f
< [<c0120db8>] __do_softirq+0xa8/0xaa
< [<c011517c>] do_page_fault+0x0/0x512
< [<c010654d>] error_code+0x2d/0x38
<
< Code: 0f 0b 25 06 75 04 32 c0 e9 22 ff ff ff 55 b9 00 e0 ff ff 21
< <6>note: X[7862] exited with preempt_count 1

Master One
l33t


Joined: 25 Aug 2003
Posts: 754
Location: Austria

Posted: Sun Jun 20, 2004 11:07 am

The only thing that looks suspicious to me is
Code:
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be trying access hardware directly.

Do you have the option "omit xfree86-dga" enabled in your xorg.conf (it should be uncommented)?
BTW, I assume you are using XFree86. Try switching over to xorg-x11.
_________________
Las torturas mentales de la CIA

hbmartin
Guru


Joined: 12 Sep 2003
Posts: 386
Location: Home is where the boxen are

Posted: Tue Jan 04, 2005 3:17 am

I'm getting similar errors when xdm starts. I wouldn't mind switching to xorg, but I don't know how, since it is blocked by my current X when I try to emerge it.
How do I fix the Radeon bug or switch to xorg?
Thanks,
Harold
All times are GMT