idl Retired Dev


Joined: 24 Dec 2002 Posts: 1728 Location: Nottingham, UK
|
Posted: Wed May 28, 2003 6:01 pm Post subject: Need a little help with making a split function (C) |
|
|
I'm making a split function to replicate the Perl one, as a decent split function is what I miss most in C.
I've been meaning to get this finished for ages, and having a good split function would make my life so much easier.
So anyway, the code so far:
| Code: | #include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pcre.h>
#define OVECCOUNT 30 /* assumed value (not shown in the post); must be a multiple of 3 */
int split(char *pattern, char *string, int max_elem)
{
    /* An attempt at creating a similar function to that of split() in Perl.
     * This function uses PCRE and will split a string at the match, returning
     * the data on each side of the matched pattern in array elements.
     */
    pcre *re;
    const char *error;
    int erroffset;
    int ovector[OVECCOUNT];
    int rc, i;
    char *split_output[512];   /* not filled in yet */
    char *matched_string[512]; /* start of each matched substring within string */
    re = pcre_compile(pattern, 0, &error, &erroffset, NULL);
    if(re == NULL)
    {
        printf("ERROR:(Fatal) PCRE compilation of pattern in function split() failed at offset %d: %s\n", erroffset, error);
        printf("\nExplanation of error:\nGlobalNews uses PCRE (Perl Compatible Regular Expressions) for matching and splitting of strings gained from RDF and RSS files. GlobalNews was unable to successfully execute the pcre_compile function before matching took place, rendering the program useless.\nPlease check the website for help.\n"); /* I'm a nice guy :) */
        exit(1);
    }
    rc = pcre_exec(re, NULL, string, (int)strlen(string), 0, 0, ovector, OVECCOUNT);
    if(rc < 0)
    {
        pcre_free(re); /* don't leak the compiled pattern */
        return -1;     /* no match, or a matching error (was "return NULL" in an int function) */
    }
    /* ovector[2*i] and ovector[2*i+1] hold the start and end offsets of substring i within string */
    for(i = 0; i < rc; i++)
    {
        matched_string[i] = string + ovector[2*i];
    }
    /* Here is where I need to search string with matched_string and chop at each occurrence and put them in an array element */
    pcre_free(re);
    return 0;
}
|
As you can see, it uses PCRE so I can use regex. The part I am stuck on, however, is near the bottom; see the comment.
I thought I could match the strings character by character until I have a match, then do the whole thing over again until the end of the string. But that would be pretty lengthy and, I presume, very slow.
I thought about using strstr(), as that would make it easy to find where a match occurs, but strstr() doesn't let me determine the ending offset of the match, only the offset where the match begins.
Any help is greatly appreciated. _________________ a.k.a port001
Found a bug? Please report it: Gentoo Bugzilla |
|
|
 |
iwasbiggs Apprentice

Joined: 17 Jan 2003 Posts: 203
|
Posted: Wed May 28, 2003 7:23 pm Post subject: |
|
|
I dunno about that regex stuff, I'd just do it character by character, or be lazy and use strstr. You said that it doesn't return the ending position? Well, adding strlen() of the search string to the returned value will be only what, one more line of code? _________________ www.ruinedsoft.com
Freeware development. |
|
|
 |
compu-tom Guru

Joined: 09 Jan 2003 Posts: 415 Location: Berlin, Germany
|
Posted: Wed May 28, 2003 7:24 pm Post subject: |
|
|
According to the manpage, strstr returns a pointer to the beginning of the substring. So these are the facts you know:
- the beginning of the "haystack" (see manpage)
- the beginning (i.e., the position) of the "needle"
- the length of the needle (the string you're searching for)
From these you can derive:
- the length of the string prior to the needle in the haystack.
- the position where to start the next search.
Example:
- haystack = "abcdefghijklmnoabcdefghijklmno"
- needle = "ef"
- strstr(...) returns position of needle = haystack + 4
- therefore, the length of the first piece (the text before the needle) is posOfNeedle - posOfHaystack = 4
- copy the first piece with strncpy to a newly allocated string, and terminate it with '\0'.
- offset for next search is posOfNeedle + lengthOfNeedle = posOfHaystack + 6
- continue until strstr returns NULL.
Hope this algorithm and pseudo code helps a bit |
|
|
 |
Pythonhead Developer


Joined: 16 Dec 2002 Posts: 1801 Location: Redondo Beach, Republic of Calif.
|
|
|
 |
idl Retired Dev


Joined: 24 Dec 2002 Posts: 1728 Location: Nottingham, UK
|
Posted: Wed May 28, 2003 8:08 pm Post subject: |
|
|
| compu-tom wrote: | According to the manpage, strstr returns a pointer to the beginning of the substring. So these are the facts you know:
- the beginning of the "haystack" (see manpage)
- the beginning (i.e., the position) of the "needle"
- the length of the needle (the string you're searching for)
From these you can derive:
- the length of the string prior to the needle in the haystack.
- the position where to start the next search.
Example:
- haystack = "abcdefghijklmnoabcdefghijklmno"
- needle = "ef"
- strstr(...) returns position of needle = haystack + 4
- therefore, the length of the first piece (the text before the needle) is posOfNeedle - posOfHaystack = 4
- copy the first piece with strncpy to a newly allocated string, and terminate it with '\0'.
- offset for next search is posOfNeedle + lengthOfNeedle = posOfHaystack + 6
- continue until strstr returns NULL.
Hope this algorithm and pseudo code helps a bit |
I figured I could do something like that before I read your post. I should have thought of it sooner:
EndOffset = strlen(matched_string) + StartOffset;
So now I need to split string at the StartOffset and put the crap to the left of the offset into the first array element, then do another strstr to find the next match.
I think I can do that, thnx for your help  _________________ a.k.a port001
Found a bug? Please report it: Gentoo Bugzilla |
|
|
 |
idl Retired Dev


Joined: 24 Dec 2002 Posts: 1728 Location: Nottingham, UK
|
Posted: Wed May 28, 2003 8:09 pm Post subject: |
|
|
Thnx, but it's not quite the same without being able to use regex  _________________ a.k.a port001
Found a bug? Please report it: Gentoo Bugzilla |
|
|
 |