hilfsbremser n00b
Joined: 04 Jun 2004 Posts: 60 Location: Hamm
Posted: Thu Oct 21, 2004 9:34 am Post subject: extracting data from another website?
Hi all!
I want to fetch an HTML file from another website and extract some data from it. The file will be fetched by a cron job and saved to a directory.
The problem is that I need to extract data from a table shown on that website: I need the cells that contain a particular keyword.
The rows that contain the keyword then need to be shown in a table on my own website.
Does anyone have an idea how I could do this?
greetz
flo
untiefe Apprentice
Joined: 12 Jan 2004 Posts: 230 Location: the nonexisting Bielefeld, Germany
Posted: Thu Oct 21, 2004 9:53 am Post subject: Re: extracting data from another website?
hilfsbremser wrote:
> I want to fetch an HTML file from another website and extract some data from it. The file will be fetched by a cron job and saved to a directory.
> The problem is that I need to extract data from a table shown on that website: I need the cells that contain a particular keyword.
> The rows that contain the keyword then need to be shown in a table on my own website.
> Does anyone have an idea how I could do this?
There are far too many ways to do this to list them all here.
For example, you could do it in Perl (with the HTML::Parser module), with XSLT, or with PHP.
You could also fetch the page with wget, cut out the part you are interested in (html2wml) and add it to your page with aww/wml (another shameless self-advertisement).
Well, there are a huge number of different programming languages that can do this...
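As a rough illustration of the "fetch the page and save it" step, here is a minimal sketch in Python (chosen only for illustration; wget from cron does the same job). The function name fetch_to_file and the ".tmp" suffix are assumptions made for this example, not anything proposed in the thread. The page is written to a temporary file first and then renamed, so a script reading the saved copy should never see a half-written file.

```python
import os
import urllib.request


def fetch_to_file(url, dest_path, timeout=30):
    """Download `url` and save the raw bytes to `dest_path`.

    Returns the number of bytes written. The name and signature are
    hypothetical, picked for this sketch.
    """
    # Read the whole response first so a failed download leaves the
    # previously saved copy untouched.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()

    # Write next to the destination, then rename over it. os.replace is
    # atomic when both paths are on the same filesystem.
    tmp_path = dest_path + ".tmp"
    with open(tmp_path, "wb") as tmp:
        tmp.write(body)
    os.replace(tmp_path, dest_path)
    return len(body)

# Hypothetical call from a script that cron runs once an hour:
#   fetch_to_file("http://example.com/table.html", "/var/cache/mysite/table.html")
```

Both the URL and the target path in the trailing comment are placeholders; the real ones depend on the site being fetched and on where the web server can read files.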
hilfsbremser n00b
Joined: 04 Jun 2004 Posts: 60 Location: Hamm
Posted: Thu Oct 21, 2004 10:01 am
Hi!
I have some ideas already. A cron job will fetch the desired page and save it to the hard disk, because the source page is updated about once an hour.
I would like to do this with PHP, because it is already set up on my server. Later I am thinking about saving the extracted rows to a MySQL database, but that doesn't matter right now.
The only problem is extracting the table from the HTML document.
I hope this clarifies things.
greetz
flo
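The extraction step asked about here can be sketched as follows. This is only an illustration in Python using the standard html.parser module, not PHP as flo would prefer, but the same structure should carry over to any language with an HTML parser. The names TableRowParser, rows_with_keyword and render_table are made up for this example. It assumes the page has no nested tables and that a case-insensitive substring match on the cell text is good enough for the keyword.

```python
from html import escape
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        # Old pages often omit </td> and </tr>, so a new opening tag
        # also closes whatever was still open.
        if tag == "tr":
            self._close_row()
            self._row = []
        elif tag in ("td", "th"):
            self._close_cell()
            if self._row is None:
                self._row = []
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def close(self):
        super().close()
        self._close_row()

    def _close_cell(self):
        if self._cell is not None:
            # Join the chunks and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None


def rows_with_keyword(html_text, keyword):
    """Return the table rows in which at least one cell contains keyword."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    needle = keyword.lower()
    return [row for row in parser.rows
            if any(needle in cell.lower() for cell in row)]


def render_table(rows):
    """Render rows as a plain HTML table, escaping the cell text."""
    lines = ["<table>"]
    for row in rows:
        cells = "".join("<td>%s</td>" % escape(cell) for cell in row)
        lines.append("  <tr>%s</tr>" % cells)
    lines.append("</table>")
    return "\n".join(lines)
```

A page script would read the file saved by the cron job, pass its text to rows_with_keyword, and print the result of render_table. Inline markup inside a cell (bold, links) is dropped and only the text is kept; keeping it would need a different approach.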