Gentoo Forums
Perl regexp woes (solved)
jho
Apprentice


Joined: 24 May 2007
Posts: 153
Location: Laukaa, Finland

Posted: Mon Apr 14, 2008 8:35 am    Post subject: Perl regexp woes (solved)

Hello,

I am writing a script that parses the currently running programmes from a Finnish site, http://www.telkku.com/

The markup for the currently running programme on each channel looks like this:
Code:
<tr class="tuleeNyt" ><td>16.00&nbsp;</td><td style="width: 100%"><a href="tiedot?oid=2008041416000" id="oid2008041416000">A-zoom</a></td></tr>

(<tr class="tuleeNyt"> being the unique identifier for currently running program)

I need to pick up the starting time (16.00 in the example), the "oid" (2008041416000 in the example) and the name of the programme (A-zoom in the example).

Now I have a line like this:
Code:
$qry->content =~ m{<tr class="tuleeNyt" ><td>([^&]+)&nbsp;</td>.+?<a href="tiedot\?oid=([^"]+)" id=".+?">([^<]+)};

This works well for the first channel, but after storing the first set of values I need to match the same pattern again to get the status for the other channels.

If I run the regexp again and check the new variables, they are still the same as the previous ones (channel 1): the match starts from the beginning and picks the first matching entry every time. How can I skip the first match and go on to the second one?


Last edited by jho on Tue Apr 15, 2008 12:03 am; edited 1 time in total
notHerbert
Veteran


Joined: 11 Mar 2008
Posts: 2228
Location: 45N 73W

Posted: Mon Apr 14, 2008 9:14 am    Post subject:

Maybe try a while loop, something like
Code:
while (<>) ...
foreach my ...
if ...
last;

If you can shoehorn that into a function and run it in your query ...
perga
n00b


Joined: 18 Sep 2004
Posts: 64
Location: SWEDEN

Posted: Mon Apr 14, 2008 12:39 pm    Post subject:

If you're sure each item to which you want to apply the regexp is on a line by itself, the usual way of doing this would be something like this:

Code:

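open(my $fh, '<', $fname) or die "Could not open $fname: $!\n";

while (my $line = <$fh>)
{
    my @vars = extract($line);
    next unless @vars;    ## skip lines that did not match
    ## ... do something with @vars
}

close($fh);

sub extract
{
    ## return the captured groups, or an empty list if the line does not match
    return $_[0] =~ /REGEXP.../ ? ($1, $2) : ();
}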

jho
Apprentice


Joined: 24 May 2007
Posts: 153
Location: Laukaa, Finland

Posted: Tue Apr 15, 2008 12:03 am    Post subject:

Yes, I know about that. But as you can see for yourself from the link I pasted, some entries are on the same line as others, so I just need to skip the previous match and go on to the next one.
I got around this by deleting the first match with s///; that way, the first entry that matches when the loop runs a second time is the next one, and so on.

It would still be nice to know if there's any other way to skip the first entry. Marking solved anyway. Thanks for your suggestions though.
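
For readers following along, the s/// workaround described above can be sketched roughly as follows. This is only an illustration: the input string and the pre-/-post pattern are made up, and the actual script presumably used the tuleeNyt pattern from the first post. Note that the loop destroys the text it scans, so it should run on a copy.

```perl
use strict;
use warnings;

# Hypothetical input with two entries on the same line.
my $text = 'pre-one-post pre-two-post';
my @found;

# Each pass deletes the first remaining match, so the next pass
# sees the following entry as the "first" one.
while ($text =~ s/pre-(.*?)-post//) {
    push @found, $1;
}

print "$_\n" for @found;
```

The loop ends when the substitution no longer finds anything to delete.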
Anarcho
Veteran


Joined: 06 Jun 2004
Posts: 2850
Location: Wuppertal (Germany)

Posted: Tue Apr 15, 2008 8:43 am    Post subject:

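This should also be possible with something like

Code:
while ($line =~ /pre(.*?)post/g) {
    print $1."\n";
}

The "g" at the end is for "global". In scalar context, each evaluation of the match resumes where the previous one ended, so the loop visits every match in turn. The capture is non-greedy (.*?) so that several entries on the same line are matched separately instead of being swallowed by one long match.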
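
Applied to the markup from the first post, the same idea might look like the sketch below. The first table row is the one quoted in the question; the second row (16.30, Uutiset) is invented for illustration, as are the variable names. One assumption worth stating: the page content is copied into a plain variable first, because the position of the last match is stored on the variable itself, and matching directly against a method call such as $qry->content would start from the beginning on every pass.

```perl
use strict;
use warnings;

# Two "tuleeNyt" rows on ONE line. The second row is made up for this example.
my $html = '<tr class="tuleeNyt" ><td>16.00&nbsp;</td><td style="width: 100%">'
         . '<a href="tiedot?oid=2008041416000" id="oid2008041416000">A-zoom</a></td></tr>'
         . '<tr class="tuleeNyt" ><td>16.30&nbsp;</td><td style="width: 100%">'
         . '<a href="tiedot?oid=2008041416300" id="oid2008041416300">Uutiset</a></td></tr>';

my @programs;

# With /g in scalar context, each pass resumes right after the previous match.
while ($html =~ m{<tr class="tuleeNyt" ><td>([^&]+)&nbsp;</td>.+?<a href="tiedot\?oid=([^"]+)" id=".+?">([^<]+)}g) {
    push @programs, { start => $1, oid => $2, name => $3 };
}

# Print one line per programme: start time, oid, name.
printf "%s  %s  %s\n", @{$_}{qw(start oid name)} for @programs;
```

Unlike the s/// approach, this leaves $html unchanged, so the same text can be scanned again later if needed.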
All times are GMT - 5 Hours