Gentoo Forums
Need a little help with making a split function (C)
idl
Retired Dev


Joined: 24 Dec 2002
Posts: 1728
Location: Nottingham, UK

Posted: Wed May 28, 2003 6:01 pm    Post subject: Need a little help with making a split function (C)

I'm making a split function to replicate the Perl one, as a decent split function is what I miss most in C.

I've been meaning to get this finished for ages, and having a good split function would make my life so much easier :D

So anyway, the code so far:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pcre.h>

#define OVECCOUNT 30   /* size of the offsets vector; should be a multiple of 3 */

int split(char *pattern, char *string, int max_elem)
{

   /* An attempt at creating a similar function to that of split() in Perl.
    * This function uses PCRE and will split a string at the match, returning
    * the data on each side of the matched pattern in array elements.
    */

   pcre *re;
   const char *error;
   int erroffset;
   int ovector[OVECCOUNT];
   int rc, i;
   char *split_output[512];
   char *matched_string[512];

   /* pcre_compile() takes pointers to both the error message and the error offset */
   re = pcre_compile(pattern, 0, &error, &erroffset, NULL);
 
   if(re == NULL)
   {
      printf("ERROR:(Fatal) PCRE compilation of pattern in function split() failed at offset %d: %s\n", erroffset, error);
      printf("\nExplanation of error:\nGlobalNews uses PCRE (Perl Compatible Regular Expressions) for matching and splitting of strings gained from RDF and RSS files. GlobalNews was unable to successfully execute the pcre_compile function before matching took place, rendering the program useless.\nPlease check the website for help.\n"); /* I'm a nice guy :) */
      exit(1);
   }

   rc = pcre_exec(re, NULL, string, (int)strlen(string), 0, 0, ovector, OVECCOUNT);

   if(rc < 0)
   {
      /* No match, or a matching error. The function returns int, so signal this with -1 rather than NULL */
      pcre_free(re);
      return -1;
   }

   for(i = 0; i < rc; i++)
   {
      /* ovector[2*i] is the start offset of substring i within string (the subject), not within pattern */
      matched_string[i] = string + ovector[2*i];
   }
   
   /* Here is where I need to search string with matched_string and chop at each occurrence and put them in an array element */
   
   pcre_free(re);
   return 0;
}


As you can see, it uses PCRE so I can use regex. The part I'm stuck on is near the bottom; see the comment.

I thought I could compare the strings character by character until I have a match, then do the whole thing over again until the end of the string. But that would be pretty lengthy and, I presume, very slow.

I thought about using strstr(), as that would make it easy to find each occurrence of a match, but strstr() doesn't let me determine the ending offset of the match, only the offset where the match begins :?

Any help is greatly appreciated.
_________________
a.k.a port001
Found a bug? Please report it: Gentoo Bugzilla
iwasbiggs
Apprentice


Joined: 17 Jan 2003
Posts: 203

Posted: Wed May 28, 2003 7:23 pm

I dunno about that regex stuff, I'd just do it character by character, or be lazy and use strstr. You said that it doesn't return the ending position? Well, adding the strlen() of the search string to the returned pointer will be only what, one more line of code?
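A minimal sketch of that one extra line, assuming a literal search string rather than a regex. The helper name `find_span` and its out-parameters are made up for illustration; only `strstr()` and `strlen()` come from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): finds the first occurrence of
 * needle in haystack and reports where it starts and where it ends.
 * Returns 1 on a match, 0 otherwise. Offsets are relative to haystack. */
int find_span(const char *haystack, const char *needle,
              size_t *start, size_t *end)
{
    const char *hit;

    assert(haystack != NULL && needle != NULL);

    hit = strstr(haystack, needle);
    if (hit == NULL)
        return 0;

    *start = (size_t)(hit - haystack);   /* offset where the match begins */
    *end   = *start + strlen(needle);    /* offset one past its last character */
    return 1;
}
```

This only works because a literal needle always has the same length. A regex match can vary in length, so there the end offset has to come from the regex engine itself.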
_________________
www.ruinedsoft.com
Freeware development.
compu-tom
Guru


Joined: 09 Jan 2003
Posts: 415
Location: Berlin, Germany

Posted: Wed May 28, 2003 7:24 pm

According to the manpage, strstr returns a pointer to the beginning of the substring. So these are the facts you know:
- the beginning of the "haystack" (see manpage)
- the beginning (i.e., the position) of the "needle"
- the length of the needle (the string you're searching for)
From these you can infer:
- the length of the string prior to the needle in the haystack.
- the position where to start the next search.

Example:
- haystack = "abcdefghijklmnoabcdefghijklmno"
- needle = "ef"
- strstr(...) returns position of needle = haystack + 4
- therefore, the length of the first piece is posOfNeedle - posOfHaystack = 4
- copy the first piece with strncpy to a newly allocated string (and add the terminating '\0' yourself, since strncpy won't when it copies only part of the source).
- offset for next search is posOfNeedle + lengthOfNeedle = posOfHaystack + 6
- continue until strstr returns NULL.

Hope this algorithm and pseudo code helps a bit :)
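The steps above could be sketched in C roughly as follows. This is one possible reading of the algorithm, not code from the thread: the names `split_literal` and `copy_piece`, the caller-supplied `out` array, and the Perl-like meaning of `max_elem` (the last element receives the unsplit remainder) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a fresh, NUL-terminated copy of the first
 * len bytes of src, or NULL if the allocation fails. */
static char *copy_piece(const char *src, size_t len)
{
    char *piece = malloc(len + 1);
    if (piece != NULL) {
        memcpy(piece, src, len);
        piece[len] = '\0';
    }
    return piece;
}

/* Splits haystack at every occurrence of the literal needle, storing up to
 * max_elem newly allocated pieces in out[]. Returns the number of pieces
 * stored, or -1 if an allocation fails. The caller frees each piece. */
int split_literal(const char *haystack, const char *needle,
                  char **out, int max_elem)
{
    size_t needle_len = strlen(needle);
    const char *start = haystack;   /* where the next search begins */
    const char *hit;
    int count = 0;

    assert(needle_len > 0);         /* an empty needle would never advance */

    /* Leave one slot free for the text after the last needle. */
    while (count < max_elem - 1 && (hit = strstr(start, needle)) != NULL) {
        out[count] = copy_piece(start, (size_t)(hit - start));
        if (out[count] == NULL)
            goto fail;
        count++;
        start = hit + needle_len;   /* posOfNeedle + lengthOfNeedle */
    }

    /* Whatever remains after the last needle is the final piece. */
    if (count < max_elem) {
        out[count] = copy_piece(start, strlen(start));
        if (out[count] == NULL)
            goto fail;
        count++;
    }
    return count;

fail:
    while (count > 0)
        free(out[--count]);
    return -1;
}
```

With the haystack and needle from the example, and an `out` array of at least three elements, this should yield three pieces: "abcd", "ghijklmnoabcd" and "ghijklmno".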
Pythonhead
Developer


Joined: 16 Dec 2002
Posts: 1801
Location: Redondo Beach, Republic of Calif.

Posted: Wed May 28, 2003 7:25 pm

Maybe this will do what you want:

http://www.experts-exchange.com/Programming/Programming_Languages/C/Q_20417383.html
idl
Retired Dev


Joined: 24 Dec 2002
Posts: 1728
Location: Nottingham, UK

Posted: Wed May 28, 2003 8:08 pm

compu-tom wrote:
According to the manpage, strstr returns a pointer to the beginning of the substring. So these are the facts you know:
- the beginning of the "haystack" (see manpage)
- the beginning (i.e., the position) of the "needle"
- the length of the needle (the string you're searching for)
From these you can infer:
- the length of the string prior to the needle in the haystack.
- the position where to start the next search.

Example:
- haystack = "abcdefghijklmnoabcdefghijklmno"
- needle = "ef"
- strstr(...) returns position of needle = haystack + 4
- therefore, the length of the first piece is posOfNeedle - posOfHaystack = 4
- copy the first piece with strncpy to a newly allocated string (and add the terminating '\0' yourself, since strncpy won't when it copies only part of the source).
- offset for next search is posOfNeedle + lengthOfNeedle = posOfHaystack + 6
- continue until strstr returns NULL.

Hope this algorithm and pseudo code helps a bit :)


I figured I could do something like that before I read your post :) I should have thought of it before 8O

EndOffset = strlen(matched_string) + StartOffset;

So now I need to split the string at StartOffset, put everything to the left of the offset into the first array element, then do another strstr from EndOffset to find the next match.
I think I can do that, thanks for your help :D
idl
Retired Dev


Joined: 24 Dec 2002
Posts: 1728
Location: Nottingham, UK

Posted: Wed May 28, 2003 8:09 pm

Pythonhead wrote:
Maybe this will do what you want:

http://www.experts-exchange.com/Programming/Programming_Languages/C/Q_20417383.html


Thanks, but it's not quite the same without being able to use regex :)
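For a split that does honour a regex, one option is to ask the regex engine for the start and end offsets of each match and resume the search from the end offset. The sketch below deliberately swaps PCRE for the POSIX `regcomp()`/`regexec()` API from `<regex.h>` so that it builds without extra libraries; with PCRE the same loop would read the offsets from `ovector[0]` and `ovector[1]` instead. The names `split_regex` and `copy_piece`, the caller-supplied `out` array, and the decision to stop at an empty match are assumptions made for the example, not Perl's exact semantics.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a fresh, NUL-terminated copy of the first
 * len bytes of src, or NULL if the allocation fails. */
static char *copy_piece(const char *src, size_t len)
{
    char *piece = malloc(len + 1);
    if (piece != NULL) {
        memcpy(piece, src, len);
        piece[len] = '\0';
    }
    return piece;
}

/* Splits string at every match of the POSIX extended regex pattern.
 * Stores up to max_elem newly allocated pieces in out[] and returns how
 * many were stored, or -1 on a bad pattern or a failed allocation. */
int split_regex(const char *pattern, const char *string,
                char **out, int max_elem)
{
    regex_t re;
    regmatch_t m;
    const char *start = string;   /* where the next search begins */
    int count = 0;
    int eflags = 0;
    int failed = 0;

    assert(pattern != NULL && string != NULL && out != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Leave one slot free for the text after the last match. */
    while (count < max_elem - 1 && regexec(&re, start, 1, &m, eflags) == 0) {
        if (m.rm_eo == m.rm_so)
            break;                /* empty match: stop instead of looping forever */
        /* rm_so and rm_eo are the start and end offsets relative to start. */
        out[count] = copy_piece(start, (size_t)m.rm_so);
        if (out[count] == NULL) {
            failed = 1;
            break;
        }
        count++;
        start += m.rm_eo;         /* resume right after the match */
        eflags = REG_NOTBOL;      /* later searches no longer begin at string start */
    }

    if (!failed && count < max_elem) {
        out[count] = copy_piece(start, strlen(start));
        if (out[count] == NULL)
            failed = 1;
        else
            count++;
    }

    regfree(&re);
    if (failed) {
        while (count > 0)
            free(out[--count]);
        return -1;
    }
    return count;
}
```

Because each match reports its own end offset, delimiters of varying length (for example "[,;] *") are handled without any `strlen()` arithmetic.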