jho Apprentice


Joined: 24 May 2007 Posts: 153 Location: Laukaa, Finland
|
Posted: Mon Apr 14, 2008 8:35 am Post subject: Perl regexp woes (solved) |
|
|
Hello,
I am writing a script that parses the currently running programmes from a Finnish site, http://www.telkku.com/
The markup for the programme currently on air on each channel looks like this:
| Code: | | <tr class="tuleeNyt" ><td>16.00 </td><td style="width: 100%"><a href="tiedot?oid=2008041416000" id="oid2008041416000">A-zoom</a></td></tr> |
(<tr class="tuleeNyt"> is the unique identifier for the currently running programme.)
I need to pick out the starting time (16.00 in the example), the "oid" (2008041416000 in the example) and the name of the programme (A-zoom in the example).
At the moment I have a line like this:
| Code: | | $qry->content =~ m{<tr class="tuleeNyt" ><td>([^&]+) </td>.+?<a href="tiedot\?oid=([^"]+)" id=".+?">([^<]+)}; |
This works well for the first channel, but after storing the first set of values I need to match the same pattern again to get the status of the other channels.
If I run the regexp again and check the capture variables, they are still the same as before (channel 1): the match starts from the beginning of the content each time and picks the first matching entry. How can I skip the first match and move on to the second one?
Last edited by jho on Tue Apr 15, 2008 12:03 am; edited 1 time in total |
|
notHerbert Veteran


Joined: 11 Mar 2008 Posts: 2228 Location: 45N 73W
|
Posted: Mon Apr 14, 2008 9:14 am Post subject: |
|
|
Maybe try a while loop, something like
| Code: | while (<>) ...
foreach my ...
if ...
last;
|
If you can shoehorn that into a function and run it in your query ... |
|
perga n00b

Joined: 18 Sep 2004 Posts: 64 Location: SWEDEN
|
Posted: Mon Apr 14, 2008 12:39 pm Post subject: |
|
|
If you're sure each item to which you want to apply the regexp is on a line by itself, the usual way of doing this would be something like this:
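| Code: |
# $fname is assumed to hold the path of the file to read
open(my $fh, '<', $fname) or die("Could not open $fname: $!\n");
while (my $line = <$fh>)
{
    my @vars = extract($line);
    ## ... do something with @vars
}
close($fh);
sub extract
{
    ## apply the regexp to one line and return its captures
    my ($line) = @_;
    ## return an empty list on failure, so stale $1/$2 from an earlier match are never returned
    $line =~ /REGEXP.../ or return;
    return ($1, $2);
}
|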
|
jho Apprentice


Joined: 24 May 2007 Posts: 153 Location: Laukaa, Finland
|
Posted: Tue Apr 15, 2008 12:03 am Post subject: |
|
|
Yes, I know about that. But as you can see for yourself from the link I posted, some entries are on the same line as others, so I need to skip the previous match and go on to the next one.
I got around this by removing the first match with s///, so that the next time the loop runs, the first entry that matches is the following one, and so on.
It would still be nice to know whether there is another way to skip the first entry. Marking as solved anyway. Thanks for your suggestions, though. |
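For readers following along, the s/// workaround described above might look roughly like the sketch below. It is an illustration only, not jho's actual script: the two-row sample string, the second programme ("Uutiset") and the variable names are made up, and the time capture is adapted to ([^<]+?)\s* so that it works on the row as quoted in this thread.

```perl
use strict;
use warnings;

# Hypothetical sample: two "tuleeNyt" rows on ONE line, modelled on the row
# quoted in the first post. The second row is invented for illustration.
my $html =
    '<tr class="tuleeNyt" ><td>16.00 </td><td style="width: 100%"><a href="tiedot?oid=2008041416000" id="oid2008041416000">A-zoom</a></td></tr>'
  . '<tr class="tuleeNyt" ><td>16.30 </td><td style="width: 100%"><a href="tiedot?oid=2008041416301" id="oid2008041416301">Uutiset</a></td></tr>';

# Adapted pattern (an assumption, not the original line): captures start time, oid and name.
my $row = qr{<tr class="tuleeNyt" ><td>([^<]+?)\s*</td>.+?<a href="tiedot\?oid=([^"]+)" id=".+?">([^<]+)};

my @found;
my $rest = $html;    # work on a copy, because s/// eats the text it matches

# Each pass deletes the first remaining match, so the next pass sees the following row.
while ($rest =~ s/$row//) {
    push @found, [$1, $2, $3];
}

print join(' | ', @$_), "\n" for @found;
```

The loop ends once no "tuleeNyt" row is left in the copy. The cost of this approach is that the string is rewritten on every pass, which is why a non-destructive way of continuing from the previous match is still worth asking about.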
|
Anarcho Veteran


Joined: 06 Jun 2004 Posts: 2850 Location: Wuppertal (Germany)
|
Posted: Tue Apr 15, 2008 8:43 am Post subject: |
|
|
This should also be possible with something like
| Code: |
while ($line =~ /pre(.*?)post/g) {
    print $1."\n";
}
|
The "g" at the end stands for "global". In the while condition (scalar context), each evaluation resumes where the previous match ended, so the loop visits every match in turn. The non-greedy (.*?) keeps a capture from running on to the last "post" on the line. |
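Applied to the original question, the /g loop could be sketched as follows. This is a minimal sketch under assumptions: the sample string stands in for $qry->content, the second row ("Uutiset") is invented, and the time capture is adapted to ([^<]+?)\s* to fit the row as quoted in the first post.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $qry->content: two rows on the same line.
my $html =
    '<tr class="tuleeNyt" ><td>16.00 </td><td style="width: 100%"><a href="tiedot?oid=2008041416000" id="oid2008041416000">A-zoom</a></td></tr>'
  . '<tr class="tuleeNyt" ><td>16.30 </td><td style="width: 100%"><a href="tiedot?oid=2008041416301" id="oid2008041416301">Uutiset</a></td></tr>';

my @programmes;

# With /g in scalar context, every evaluation of the condition continues from
# the end of the previous match instead of restarting at the beginning.
while ($html =~ m{<tr class="tuleeNyt" ><td>([^<]+?)\s*</td>.+?<a href="tiedot\?oid=([^"]+)" id=".+?">([^<]+)}g) {
    push @programmes, { start => $1, oid => $2, name => $3 };
}

printf "%s  %s  %s\n", @{$_}{qw(start oid name)} for @programmes;
```

Unlike the s/// approach, this leaves the string unmodified, and the loop stops by itself when no further row matches. Note that a /g match has to be applied to a variable; if the content comes from a method call, copying it into a scalar first avoids matching a fresh copy on every iteration.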
|