Gentoo Forums
gentoo docs not valid xml?
codergeek42
Bodhisattva

Joined: 05 Apr 2004
Posts: 5142
Location: Anaheim, CA (USA)

Posted: Tue Jan 03, 2006 5:42 pm

mark_lybarger wrote:
probably. i don't understand why it's a big deal to change from .xml to .html extensions in order to allow browsers to properly display the content. in order to allow the extension to more match the content of the file.
You've seen for yourself examples of this, where the extension of the file on the server does not give you the document type which your browser or other user-agent ("UA") receives. Take a look at RFC 2616 if you're really interested in this (especially the Content-Type header, section 14.17).
Quote:
but file extensions do have special meaning to the browsers. i know that trying to render foo.xml that is not well formed xml does not work so nicely.
Then your UA is not compliant with the standards published long ago by the W3C and IETF (among others).
Quote:
yes, there have been several work arounds pointed out. each of which require the document viewer to know something special about the site. from that respect, i would still call the work arounds a kludge. workable, yes.
The specific UA used to view/render/output the site should obey the Content-Type header at all times if the server sends it in its response to the UA's request. This is a published and accepted part of the HTTP/1.1 standard, and is most definitely not a kludge.
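
A minimal Python sketch of that rule, for illustration only: a UA picks its handling from the Content-Type value the server sent and never from the extension in the URL. The function names and the three handler labels are made up for this example and are not taken from any real browser.

```python
def media_type(content_type_header):
    """Return the lower-cased type/subtype part of a Content-Type value."""
    # Parameters such as "; charset=UTF-8" follow the first semicolon.
    return content_type_header.split(";", 1)[0].strip().lower()


def choose_handler(url, content_type_header):
    """Pick a (hypothetical) handler label for a response.

    The url argument is accepted but deliberately ignored: the extension
    in the URL plays no part in the decision.
    """
    mt = media_type(content_type_header)
    if mt == "text/html":
        return "html"
    if mt in ("text/xml", "application/xml") or mt.endswith("+xml"):
        return "xml"
    return "download"


# A URL ending in .xml that is served as text/html is rendered as HTML.
print(choose_handler("http://www.gentoo.org/foo.xml", "text/html; charset=UTF-8"))
```

This only applies when a server sends the header; a file opened from a local disk has no Content-Type, which is where the extension comes back into play.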
_________________
~~ Peter: Programmer, Mathematician, STEM & Free Software Advocate, Enlightened Agent, Transhumanist, Fedora contributor
Who am I? :: EFF & FSF
neysx
Retired Dev

Joined: 27 Jan 2003
Posts: 795

Posted: Tue Jan 03, 2006 6:11 pm

mark_lybarger wrote:
Quote:
You may want to read XSL Transformations (XSLT) - HTML Output Method.
i don't see anything about file extensions.
Simply because they are not relevant.
mark_lybarger wrote:
probably. i don't understand why it's a big deal to change from .xml to .html extensions in order to allow browsers to properly display the content. in order to allow the extension to more match the content of the file.
It's not and you have already been told repeatedly to save files with a .html extension if you need it. When you save dynamic content locally, it becomes static content. Name it and organise it any way you like.
mark_lybarger wrote:
i'm merely curious why it's viewed as "better" to have foo.xml as a url, than to have foo.html for a url that returns html?
Because the url is an xml file, as explained many times already.
The source files are actually .xml, not .html, not .php, not .asp, not .jsp
mark_lybarger wrote:
i'm actually quite surprised this issue hasn't been raised before
Take it as a clue as to how valid your point is.
See http://codesnippets.services.openoffice.org/index.xml for another example, if you must have one.
mlybarger
Guru

Joined: 04 Sep 2002
Posts: 475

Posted: Tue Jan 03, 2006 6:19 pm

you're assuming the content is actually served via http. if the document is not served via http, but available over a local filesystem perhaps, the browser has to determine what to do with it.

i guess I assumed incorrectly that the "documentation" on the gentoo site didn't require the baggage of http to be viewed and that it could be viewed as a standalone "document". i can see now how it's important that this type of dynamic documentation must be properly served by an http server in order to be viewed. it's certainly a good practice / exercise to force the document reader to learn how to manipulate the server response in order to view the document locally.

i looked over Content-Type headers. It seems like a mime type for streams, meaning tell me what type of data this is incoming and i'll know what to do with it. great stuff for viewing content over http. meaningless for local documents.

i still haven't seen a valid reason _to_ use .xml as the extension for these url's as opposed to html.
codergeek42
Bodhisattva

Joined: 05 Apr 2004
Posts: 5142
Location: Anaheim, CA (USA)

Posted: Tue Jan 03, 2006 6:33 pm

mark_lybarger wrote:
you're assuming the content is actually served via http. if the document is not served via http, but available over a local filesystem perhaps, the browser has to determine what to do with it.
When you save it to your local filesystem, then, you have the ability to rename it as you see fit; and probably will save it with a ".html" extension (oh, by the way, Firefox and Epiphany both offer to save it with a ".htm" extension automatically thanks to that text/html Content-Type header).
Quote:
i still haven't seen a valid reason _to_ use .xml as the extension for these url's as opposed to html.
The reason is that the markup source code used to create the documentation is XML (Guide XML to be exact). You've been shown this with the ?passthru=1 option.
neysx
Retired Dev

Joined: 27 Jan 2003
Posts: 795

Posted: Tue Jan 03, 2006 6:34 pm

mark_lybarger wrote:
you're assuming the content is actually served via http. if the document is not served via http, but available over a local filesystem perhaps, the browser has to determine what to do with it.
How many times and in how many languages do you need to be told to rename the files if you want to store them as static local files? man wget has all you need. wget can rename files and links automatically.
mark_lybarger wrote:
i guess I assumed incorrectly that the "documentation" on the gentoo site didn't require the baggage of http to be viewed and that it could be viewed as a standalone "document".
It can, but you don't want to (or can't) understand.
mark_lybarger wrote:
i looked over Content-Type headers. It seems like a mime type for streams, meaning tell me what type of data this is incoming and i'll know what to do with it. great stuff for viewing content over http. meaningless for local documents.
Amazingly, you've managed to learn something.
mark_lybarger wrote:
i still haven't seen a valid reason _to_ use .xml as the extension for these url's as opposed to html.
You're obviously a lost cause.
/me gives up this thread and stops following it.
nephros
Advocate

Joined: 07 Feb 2003
Posts: 2139
Location: Graz, Austria (Europe - no kangaroos.)

Posted: Tue Jan 03, 2006 7:48 pm

mark_lybarger wrote:
i still haven't seen a valid reason _to_ use .xml as the extension for these url's as opposed to html.

This would actually create the situation you are describing, which is currently a non-problem.
You would then have an XML file named .html.

You would then need to configure the web server to invoke the XSLT system to convert (xml).html files to (html).html files. Insanity.
_________________
Please put [SOLVED] in your topic if you are a moron.
mlybarger
Guru

Joined: 04 Sep 2002
Posts: 475

Posted: Tue Jan 03, 2006 8:42 pm

Quote:
You would then need to configure the web server to invoke the XSLT system to convert (xml).html files to (html).html files. Insanity.


there's a million ways to skin a cat.

you could generate all the docs once every time they're updated. a bulk transform process which converts the xml into html files. it'll probably be much less load on the server than running an xslt transform on the docs each and every time they're requested. we use DocBook to document some of our stuff here, and just run an ant task on the documentation and get html files as output.

you can also configure an apache server to rewrite the *.html requests internally into *.xml requests, and internally still do the xml xslt processing. that's more of a hack i'd imagine.

don't most modern browsers handle xslt transforms natively? would it be possible to return the xml document such that the client can do the transform?
codergeek42
Bodhisattva

Joined: 05 Apr 2004
Posts: 5142
Location: Anaheim, CA (USA)

Posted: Tue Jan 03, 2006 9:02 pm

mark_lybarger wrote:
you could generate all the docs once every time they're updated. a bulk transform process which converts the xml into html files. it'll probably be much less load on the server than running an xslt transform on the docs each and every time they're requested.
If I'm not mistaken, this is how the docs are created. The XML is stored on the server and a cron job creates the cached HTML output files every so often (why it takes a while for CVS to sync with the docs and markup on the website, I'd imagine). Though, I'm not certain, as I'm not a Documentation developer.
Quote:
don't most modern browsers handle xslt transforms natively? would it be possible to return the xml document such that the client can do the transform?
Most compliant browsers do (I know the recent Firefox and Opera releases can). Note, however, that Gentoo wants the docs to be just as usable and readable through text-based browsers such as links (e.g., for reading through the handbook as you follow it to install Gentoo or tweak the network or similar) which do not have this capability.
neysx
Retired Dev

Joined: 27 Jan 2003
Posts: 795

Posted: Fri Jan 06, 2006 5:59 pm

codergeek42 wrote:
mark_lybarger wrote:
you could generate all the docs once every time they're updated. a bulk transform process which converts the xml into html files. it'll probably be much less load on the server than running an xslt transform on the docs each and every time they're requested.
If I'm not mistaken, this is how the docs are created. The XML is stored on the server and a cron job creates the cached HTML output files every so often (why it takes a while for CVS to sync with the docs and markup on the website, I'd imagine). Though, I'm not certain, as I'm not a Documentation developer.
Not at all. The web servers sync off our CVS server hourly (2x an hour soon). That's the only reason for the delay between commit & publication.
Content is definitely not pre-generated. I even doubt it could be done in an hour BTW. It would be an option if 1 file == 1 output, but, as with any dynamic content, this is not the case. The same file can deliver different outputs depending on parameters and sourced files. Most pages have a printer-friendly version, ever read the handbook? handbook-x86.xml can deliver a full toc, a partial toc, a chapter or a full handbook, all with their printer-friendly version. The main index.xml hasn't changed in months, but its output changes almost weekly....
It's quite trivial that if alsa-guide.xml is changed, you'd need to regenerate its html pages, but what about a handbook chapter, or the list of devs that is sourced by many pages, or the localised strings, the month names or simply the main XSL transform?
The fact is that pages are generated on the fly as they are requested. XSL transforms are not run on every hit, because each page's output and the list of its dependencies are cached.
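
A rough Python sketch of that caching idea, assuming a hypothetical render callable that returns the HTML together with the list of files it read. The class and all names in it are invented for illustration; this is not the actual gentoo.org code. It only shows why one source file with different parameters needs separate cache entries, and why a change to any sourced file has to invalidate the cached output.

```python
import os


class PageCache:
    """Cache rendered pages together with the files each one depends on."""

    def __init__(self, render):
        # render(source, params) -> (html, list of dependency paths); hypothetical
        self._render = render
        self._entries = {}

    def get(self, source, params=()):
        # Same source with different parameters is a different page.
        key = (source, tuple(params))
        entry = self._entries.get(key)
        if entry is not None:
            html, deps = entry
            if all(self._unchanged(path, mtime) for path, mtime in deps.items()):
                return html  # cache hit: no transform is run
        # Cache miss or stale entry: run the transform and record dependencies.
        html, dep_paths = self._render(source, params)
        deps = {path: os.stat(path).st_mtime_ns for path in dep_paths}
        self._entries[key] = (html, deps)
        return html

    @staticmethod
    def _unchanged(path, recorded_mtime):
        try:
            return os.stat(path).st_mtime_ns == recorded_mtime
        except FileNotFoundError:
            return False
```

Comparing modification times is just one simple way to detect a changed dependency; the thread does not say how the real cache decides that an entry is stale.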
codergeek42 wrote:
Quote:
don't most modern browsers handle xslt transforms natively? would it be possible to return the xml document such that the client can do the transform?
Most compliant browsers do (I know the recent Firefox and Opera releases can). Note, however, that Gentoo wants the docs to be just as usable and readable through text-based browsers such as links (e.g., for reading through the handbook as you follow it to install Gentoo or tweak the network or similar) which do not have this capability.
There's no way browsers could apply the transforms that happen on the server. They do a lot more than a simple tag-to-tag mapping with a bit of XSL thrown in. FYI, some pages source hundreds of xml files and I don't think it would be reasonable to send it all to the clients :)
codergeek42
Bodhisattva

Joined: 05 Apr 2004
Posts: 5142
Location: Anaheim, CA (USA)

Posted: Fri Jan 06, 2006 6:03 pm

Thanks for the wonderful explanation, neysx! :)
neysx wrote:
FYI, some pages source hundreds of xml files and I don't think it would be reasonable to send it all to the clients
That's a lot of XML! 8O
Agilo
n00b

Joined: 01 Jan 2004
Posts: 38
Location: The Netherlands

Posted: Mon Jan 09, 2006 12:27 am

neysx wrote:
Agilo wrote:
omp wrote:
Agilo, the new site was made by someone who has quite a lot of experience, so my guess is that it would be fine.

I hope so, because currently it's as invalid as can be..
No, it's not.
We currently deliver valid transitional html 4.01. You're only assuming that you get an xml file because the url ends with .xml and you assume wrong. You don't get some php code when you visit http://agilo.acjs.net/alessandro.php do you?

{...}

hth


It is, according to its doctype.

I didn't assume anything, you're now making assumptions, heh.


curtis119 wrote:

{...}

Actually I'm right in the middle of reading the complete xhtml-1.1 specification (that's why production on the redesign has ground to a halt recently). I haven't decided if I'm going to use it or not. I may go down to xhtml-1.0 depending on how hard it is to make it work. I do have the wwwredesign.gentoo.org site serving xhtml-1.1 at the moment but only for testing purposes and NONE of the pages are compliant yet.

{...}


After which, you'll (probably) know why XHTML 1.1 is a bad decision (and probably go for XHTML 1.0, but that'll be your decision to make).




And stop treating me like I'm new to the web, I have a firm grasp on XML(/XHTML), accessibility guidelines as well as web-semantics so I know what I'm talking about. :\
I know very well the extensions in the URL bar don't mean jack.



Anyway; good luck with it, I know it's a lot of work.
_________________
- Agilo (Alessandro Lo-Presti)
codergeek42
Bodhisattva

Joined: 05 Apr 2004
Posts: 5142
Location: Anaheim, CA (USA)

Posted: Mon Jan 09, 2006 1:10 am

Agilo wrote:
It is, according to its doctype.
Its DOCTYPE declaration explicitly specifies HTML 4.01 Transitional, using the W3C's SGML document type definition ("DTD"):
Quote:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">


Quote:
After which, you'll (probably) know why XHTML 1.1 is a bad decision (and probably go for XHTML 1.0, but that'll be your decision to make).
The only reason it would be a bad decision is currently that many browsers do not support proper XHTML.

Quote:
I know very well the extensions in the URL bar don't mean jack.
Then why are we even having this discussion in the first place? *confused*
Agilo
n00b

Joined: 01 Jan 2004
Posts: 38
Location: The Netherlands

Posted: Mon Jan 09, 2006 1:26 am

codergeek42 wrote:
Agilo wrote:
It is, according to its doctype.
Its DOCTYPE declaration explicitly specifies HTML 4.01 Transitional, using the W3C's SGML document type definition ("DTD"):
Quote:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">


Quote:
After which, you'll (probably) know why XHTML 1.1 is a bad decision (and probably go for XHTML 1.0, but that'll be your decision to make).
The only reason it would be a bad decision is currently that many browsers do not support proper XHTML.

Quote:
I know very well the extensions in the URL bar don't mean jack.
Then why are we even having this discussion in the first place? *confused*


http://wwwredesign.gentoo.org/ - "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">"

That's what I was talking about in the first place, the redesign, and yes; I've read the previous posts, it's not done yet, I know.


And I didn't have any discussion at all.
Neysx assumed something which was wrong and everyone else blatantly continued on his assumption.
I'm only correcting him.
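
For anyone comparing the two declarations quoted in this thread: the part that tells them apart is the public identifier. A small Python sketch of pulling it out, for illustration only. The regular expression is a simplification that assumes a double-quoted PUBLIC identifier and is not a full SGML/XML parser; the function name is made up.

```python
import re

# Simplified pattern: <!DOCTYPE root PUBLIC "public-id" "optional-system-id">
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(\w+)\s+PUBLIC\s+"([^"]+)"(?:\s+"([^"]+)")?\s*>',
    re.IGNORECASE,
)


def public_identifier(markup):
    """Return the public identifier of the first DOCTYPE found, or None."""
    match = DOCTYPE_RE.search(markup)
    return match.group(2) if match else None


current_site = ('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
                '"http://www.w3.org/TR/html4/loose.dtd">')
redesign = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
            '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">')

print(public_identifier(current_site))
print(public_identifier(redesign))
```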
codergeek42
Bodhisattva

Joined: 05 Apr 2004
Posts: 5142
Location: Anaheim, CA (USA)

Posted: Mon Jan 09, 2006 2:04 am

Ah, then I must have misunderstood your meaning. :oops:

Thanks for clearing it up.
mold
n00b

Joined: 31 Jul 2004
Posts: 52
Location: Essen, Germany

Posted: Mon Jan 09, 2006 8:44 pm

Quote:

i still haven't seen a valid reason _to_ use .xml as the extension for these url's as opposed to html.


I think he actually has half a point here. There is no real reason to name a URI according to the file that generates it. The most elegant way is to use no file extension in the URI at all. Instead of http://www.gentoo.org/foo.xml or http://www.gentoo.org/foo.html, the URI should just be http://www.gentoo.org/foo. Any web server should easily allow this. As has been pointed out, the UA depends on the Content-Type header anyway. And the user should not be required to memorize some part of the URI that hints at the web server's internal method of content generation. That way, it is also more elegantly possible to change the content generation method while still keeping a nice and easy to remember URI.
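
A minimal Python sketch of what I mean, as an illustration rather than real server configuration. The list of extensions, the function names, and the set of available source files are all assumptions made up for this example.

```python
import posixpath

# Hypothetical: extensions the server may use internally but need not expose.
SOURCE_EXTENSIONS = (".xml", ".html")


def canonical_path(request_path):
    """Strip a known extension so /foo.xml and /foo.html both become /foo."""
    root, ext = posixpath.splitext(request_path)
    return root if ext in SOURCE_EXTENSIONS else request_path


def resolve(request_path, available_sources):
    """Map a request path to (public URI path, internal source file or None)."""
    canonical = canonical_path(request_path)
    for ext in SOURCE_EXTENSIONS:
        candidate = canonical.lstrip("/") + ext
        if candidate in available_sources:
            return canonical, candidate
    return canonical, None


# The public URI stays /foo whichever file generates the content.
print(resolve("/foo", {"foo.xml"}))
```

If the generation method changes later, only the internal lookup changes and the public URI stays the same.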
All times are GMT