Gentoo Forums
simple thread communication or synchronization (in pthreads)
Bones McCracker
Veteran


Joined: 14 Mar 2006
Posts: 1553
Location: U.S.A.

PostPosted: Sat Feb 02, 2013 1:01 am    Post subject: simple thread communication or synchronization (in pthreads) Reply with quote

Two questions:

1. When used in the context of POSIX threads, should a semaphore (say, a thread-shared semaphore, not a process-shared one) always be protected by a mutex?

Do people ever freeball it and go without one? Does a semaphore carry much overhead, or is it merely like a short int or char in the global namespace? I know about pthread condition variables, but they seemed like overkill for a simple requirement like this.

In case that's not enough to go by, here's the scenario (simplified; the general idea is that the output should occur exactly once per t seconds). It works; the actual application has been running fine for 24 hours without any apparent problems, but I still have these questions.
Code:
// global var
sem_t sem;

// thread function (always takes exactly t seconds)
void *thread(void *arg) {

    sleep(t);
    sem_post(&sem);

    return NULL;
}


// called from main(); always takes far less than t sec
// (but theoretically might not, I suppose)
void prepare_stuff() {

    <prepare stuff, etc.>     // <--- always takes far less than t sec

}

int main() {

    <variable declarations, etc.>

    while (1) {
        prepare_stuff();
        sem_wait(&sem);
        <output stuff>
    }

    return 0;

}


2. The default stack size for a thread is 2 MiB (IA32), but this thread does virtually nothing but operate a timer. If I want to reduce the stack size, how do I determine a safe size for it (besides trial and error)?
[Edit: never mind on #2; I gather from a bit more research that pthread stack size is an upper limit, and the actual memory is allocated dynamically.]


Last edited by Bones McCracker on Sun Feb 03, 2013 12:31 am; edited 1 time in total
dmitchell
Veteran


Joined: 17 May 2003
Posts: 1159
Location: Austin, Texas

PostPosted: Sat Feb 02, 2013 5:10 am    Post subject: Reply with quote

I don't have an answer to your question. However, I would suggest looking at OpenMP. It's easy.
_________________
Your argument is invalid.
petrjanda
Veteran


Joined: 05 Sep 2003
Posts: 1557
Location: Brno, Czech Republic

PostPosted: Sat Feb 02, 2013 7:13 am    Post subject: Reply with quote

http://stackoverflow.com/questions/2065747/pthreads-mutex-vs-semaphore

A semaphore has a synchronizing counter, while a mutex is used to prevent race conditions among threads.
_________________
There is, a not-born, a not-become, a not-made, a not-compounded. If that unborn, not-become, not-made, not-compounded were not, there would be no escape from this here that is born, become, made and compounded. - Gautama Siddharta
Bones McCracker
Veteran


Joined: 14 Mar 2006
Posts: 1553
Location: U.S.A.

PostPosted: Sat Feb 02, 2013 3:15 pm    Post subject: Reply with quote

dmitchell wrote:
I don't have an answer to your question. However, I would suggest looking at openmp. It's easy.

Thank you for the suggestion.
Bones McCracker
Veteran


Joined: 14 Mar 2006
Posts: 1553
Location: U.S.A.

PostPosted: Sat Feb 02, 2013 3:24 pm    Post subject: Reply with quote

petrjanda wrote:
http://stackoverflow.com/questions/2065747/pthreads-mutex-vs-semaphore

Semaphore has a synchronizing counter, while mutex is used for preventing race conditions among threads.

Yeah, I know that, but thanks. :lol:
Prenj
n00b


Joined: 20 Nov 2011
Posts: 10

PostPosted: Sat Feb 02, 2013 6:27 pm    Post subject: Reply with quote

Are you trying to solve something specific, or are you just fiddling with semaphores and mutexes for the heck of it / for educational purposes?

In my humble opinion, semaphores and mutexes kind of defeat the point of using threads in the first place. Whenever I needed to pass results from one thread to another, I used messages passed along an async queue.

Say you have a thread just doing I/O, sending and receiving HTTP requests. It reads from the socket, does some basic framing to figure out whether it has read everything, and then puts a message on the outgoing queue. It checks the incoming queue to see if there is anything to send; if not, it reads whether a new message has arrived, or just sleeps.

The thread that is supposed to parse the contents and do something clever with them pops messages from the queue, does its stuff, and if a reply is needed, crafts one and puts it on the outgoing queue.

Both threads "sort of" synchronize with each other, but neither forces the other to wait or abort, since the shared point is asynchronous. If one thread works too fast for the other, so that, say, the worker thread cannot process everything in time, the I/O thread can check the queue length and, above a certain threshold, send a 503 instead of pushing the message to the worker, giving the worker thread some breathing space.
Bones McCracker
Veteran


Joined: 14 Mar 2006
Posts: 1553
Location: U.S.A.

PostPosted: Sat Feb 02, 2013 8:35 pm    Post subject: Reply with quote

I'm fiddling to learn, but I am working on a specific requirement. I was trying to avoid overwhelming people with unnecessary details, but you asked for it, so: :)

I'm a novice programmer, doing this for its learning value. My toy application runs an infinite loop that collects various system status information (takes about 0.05 sec) and then outputs that information. (I use it to put system status at the top of the root window in my window manager.) The trick is that part of that information is the time, and ideally the output should be generated exactly once each second.

Since the work takes a variable amount of time, I can't simply use a single sequential loop that delays the appropriate amount of time, does the work, and outputs. I had been doing this, and in practice it works, with the output apparently coming once per second; but if you sit there and stare at it long enough, you'll see it seem to jump ahead by a second (once every few minutes when using sleep(), which has a resolution in seconds; I could refine this by using nanosleep, but I'd rather see if I can't find a precise, light solution).

I am far too obsessive and anally retentive to have my clock doing shit like that.

So what I've got right now is the application creating one pthread to accurately measure 1 second of elapsed real time, with the main thread doing its work but then waiting to actually output the result until the parallel thread has measured exactly 1 second.

Both the main thread and the parallel thread are based on infinite loops. (I'm not spawning a thread, waiting for it to complete, then spawning another, over and over again every second, which I assume would be inefficient.)

So, there are two aspects to my problem:

a. There's the immediate question of how to most efficiently and reliably communicate between the threads that a second has elapsed and it's time to go ahead and push the output. I have considered:
> just using a simple global variable set by the timer thread, with the main thread idling in a loop until it is set (did a working prototype of this)

> using the pthread-provided condition-variable constructs (condition-signal, condition-wait, etc.), which have the advantage of being atomic operations with integrated mutex management (started implementing this but stopped, thinking "this is awfully complex for a requirement so simple")

> using a semaphore: sem_post() by the timer thread and sem_wait() in the main thread before pushing the output (actually implemented this, and it's been running on my machine for about two days, but I feel I should have protected the semaphore from simultaneous modification by the main and timer threads, especially since I'm not sure the sem_post() and sem_wait() operations are atomic)

> dmitchell's suggestion of OpenMP (after 15 minutes of scanning it, this also seems like overkill, although perhaps in reality no heavier than pthreads)

b. And then there's the aspect of "this is a stupid way of satisfying my basic requirement" (which I didn't really ask about, but which any thinking person will simultaneously consider):

> I was originally using a single-threaded process with sleep(1) inline. This is acceptable, given that the work actually takes only an eyeblink, but it feels like a kludge to me.

> I considered refining the delay to be (1 sec) - (avg work time), but that's kludgy too. I even considered computing the average work time on the fly as a moving average for this purpose (still kludgy, plus added work to be done every second).

> I decided to try using a timer thread in parallel, with the output synchronized to that thread at 1-second intervals. My main problems with this are:

a. The displayed time is always nearly a second behind reality; this sucks; I want accurate time, but I also want output at exactly 1-second intervals! :lol:

b. The additional thread has its own overhead (less than an additional process, but it has its own stack, etc.). After looking into this, my understanding at this point is that a thread doesn't actually make its stack memory unavailable to other uses (it's an upper limit, not an allocation), but I'm still not confident I understand this correctly.

As I write this, I'm thinking I can probably use a system call to some kind of timer that will suspend the process and then signal it when the interval has elapsed (e.g., a POSIX timer or Linux high-resolution timers, which I haven't looked into yet).

Here is my actual code at this point (I know it doesn't do any error-checking; it's for my own use so I'm not testing things that I know work):
Code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <string.h>
#include <xcb/xcb.h>
#include <pthread.h>
#include <semaphore.h>

// global variable
sem_t sem;


// function prototypes
static size_t append_load(char *st);

static size_t append_ram(char *st);

static size_t append_time(char *st);

void *timer(void *arg);



// main function
int main() {

  const size_t outsize = 70;
  char status[outsize];
  size_t len;
  xcb_connection_t *c;
  xcb_screen_t *s;
  pthread_t thread;

  c = xcb_connect(NULL,NULL);
  s = xcb_setup_roots_iterator(xcb_get_setup(c)).data;

  sem_init(&sem, 0, 0);
  pthread_create(&thread, NULL, &timer, NULL);


  while(1) {

    len = append_load(status);
    len += append_ram(status + len);
    len += append_time(status + len);


    xcb_change_property(c, XCB_PROP_MODE_REPLACE, s->root,
            XCB_ATOM_WM_NAME, XCB_ATOM_STRING, 8, len, status);

    sem_wait(&sem);
    xcb_flush(c);

  }

  xcb_disconnect(c);
  pthread_cancel(thread);
  return 0;

}


// Reads system load (1 min avg) and appends to passed status string.
static size_t append_load(char *st) {

  static const size_t max = 14;
  static double load;

  getloadavg(&load, 1);

  return snprintf(st, max, "  Load: %.2f ", load);

}



// Calculates RAM in use and appends to passed status string.
static size_t append_ram(char *st) {

  static const size_t max = 32;
  static FILE *fp;
  static char label[11];
  static int value;
  static int found;
  static int memstats[6] = {0};
  static int ram_used;
  static int swap_used;

  memset(label, 0, sizeof(label));
  found = 0;

  fp = fopen("/proc/meminfo", "r");

  // search by label because /proc/meminfo entries vary
  while ( fscanf(fp, "%10s %d %*s", label, &value) != EOF ) {

    if ( strcmp(label, "MemTotal:") == 0 ) {
      memstats[0] = value;
      found ++;
    } else if ( strcmp(label, "MemFree:") == 0 ) {
      memstats[1] = value;
      found ++;
    } else if ( strcmp(label, "Buffers:") == 0 ) {
      memstats[2] = value;
      found++;
    } else if ( strcmp(label, "Cached:") == 0 ) {
      memstats[3] = value;
      found++;
    } else if ( strcmp(label, "SwapTotal:") == 0 ) {
      memstats[4] = value;
      found++;
    } else if ( strcmp(label, "SwapFree:") == 0 ) {
      memstats[5] = value;
      found++;
    }

    if ( found >=6 ) break;

  }

  fclose(fp);

  ram_used = memstats[0] - ( memstats[1] + memstats[2] + memstats[3] );
  swap_used = memstats[4] - memstats[5];

  return snprintf(st, max, "  RAM: %d MiB  Swap: %d MiB ", ram_used/1024, swap_used/1024);

}


// Creates formatted local time and appends to status string.
static size_t append_time(char *st) {

  static const size_t max = 24;
  static time_t now;
  static struct tm ltm;

  time(&now);
  localtime_r(&now, &ltm);
  return strftime(st, max, "  %a %b %-e  %T ", &ltm);

}



// separate thread to measure time interval in parallel
void *timer(void *arg) {

  while(1) {

    sem_post(&sem);
    sleep(1);

  }

  return NULL;

}


Last edited by Bones McCracker on Sat Feb 02, 2013 9:06 pm; edited 1 time in total
Prenj
n00b


Joined: 20 Nov 2011
Posts: 10

PostPosted: Sat Feb 02, 2013 9:04 pm    Post subject: Reply with quote

The problem with sleep(x) is that, in effect, you'll end up with a drift of sleep(x) + execution time per iteration.


so you could do:

thread one:
get stats, put into struct, put struct onto async queue

thread two:
pop struct from the queue and display it
take time
nanosleep(diff between next second and time taken) //so you end up waking up on next tick

for queues you could use zeromq.org
Bones McCracker
Veteran


Joined: 14 Mar 2006
Posts: 1553
Location: U.S.A.

PostPosted: Sat Feb 02, 2013 9:23 pm    Post subject: Reply with quote

Prenj wrote:
The problem with sleep(x) is that, in effect, you'll end up with a drift of sleep(x) + execution time per iteration.

Yes, that's the essence of the problem.


Prenj wrote:
so you could do:

thread one:
get stats, put into struct, put struct onto async queue

thread two:
pop struct from the queue and display it
take time
nanosleep(diff between next second and time taken) //so you end up waking up on next tick

for queues you could use zeromq.org

The same problem still exists: the data in the struct will be stale by almost a second by the time the nanosleep has elapsed.

As to my original question: in the method you describe, it would be the struct that needs the protection of a mutex (so that thread 1 can't change it while thread 2 is using it), right? My question was whether a mutex (or some other form of locking) is an absolute necessity in such cases. (I'm assuming the message just passes a pointer to the struct. Does the messaging function actually copy the struct into the message queue? If so, I suppose there'd be no need for a mutex.) And is a queue really appropriate for something there should only ever be one of at a time?
tomk
Administrator


Joined: 23 Sep 2003
Posts: 7219
Location: Sat in front of my computer

PostPosted: Sat Feb 02, 2013 10:15 pm    Post subject: Reply with quote

Moved from Off the Wall to Portage & Programming at BoneKracker's request.
_________________
Search | Read | Answer | Report | Strip
Prenj
n00b


Joined: 20 Nov 2011
Posts: 10

PostPosted: Sat Feb 02, 2013 11:26 pm    Post subject: Reply with quote

BoneKracker wrote:
Same problem still exists. The data in the struct will be stale by almost a second by the time the nanosleep has elapsed. [...] Is a queue really appropriate for something there should only ever be one of at a time?



You don't have to use mutexes when working with message queues; implementations vary. I've used zeromq and glib2's async queue:
http://developer.gnome.org/glib/2.31/glib-Asynchronous-Queues.html

As for timings, you always have to do some gymnastics to calculate elapsed time and land on the next tick, and the data is always going to be "late" by some amount, since the point in time where you sample the data and the point where you display it will differ anyway. Your only option is to make that difference as small as possible.
I haven't worked with a real-time OS; I imagine the control would be better there, but there would still be some delay.
Bones McCracker
Veteran


Joined: 14 Mar 2006
Posts: 1553
Location: U.S.A.

PostPosted: Sun Feb 03, 2013 12:29 am    Post subject: Reply with quote

Thank you; you've given me some good ideas to work with.
Dr.Willy
Guru


Joined: 15 Jul 2007
Posts: 318
Location: NRW, Germany

PostPosted: Sun Feb 03, 2013 2:05 pm    Post subject: Reply with quote

For another idea to work with see 'man timer_create' ;)
Bones McCracker
Veteran


Joined: 14 Mar 2006
Posts: 1553
Location: U.S.A.

PostPosted: Sun Feb 03, 2013 3:37 pm    Post subject: Reply with quote

Good tip, thanks.

I modified the application to be single-threaded, added a ring buffer (of size 2) to hold timevals, and started taking a timestamp at the end of each loop, then using usleep() to sleep for the delta between the elapsed time and 1 second.

This eliminated the "stale data" problem, with the resulting data being no more than about 0.0003 sec old. To reduce thrashing, I also had it calculate the error (i.e., how close it got to exactly 1 sec) and then used that error to adjust the usleep interval for the subsequent iteration.

That rapidly (10 iterations) converged on an average error of about 15 usec. By damping the adjustment (correcting by only half the error value each iteration), it took a few more cycles (18 or so) to converge, but it reached an average error of about 8 usec, which is better than acceptable.

This variability appears to be relatively independent of what else is going on (as far as user activity), so I assume it's due to contention for some kernel or hardware resource that needs to be scheduled (and hopefully it's not a clock-related resource that I'm introducing contention for by measuring and sleeping).

I could probably improve that further by using a weighted moving average of recent error values, but I don't think I'll bother.

So this fixes both the data-currency and interval-accuracy problems. It's a little more complex than I think is appropriate for this requirement, so I may do something simpler (like the sleep() loop in a parallel thread, or maybe a timer and signals), but I'm learning stuff.

I'll now look into reimplementing this same algorithm using a timer instead of gettimeofday() and nanosleep().

I think the answer to my mutex question is "yes" (always protect shared memory). The other challenge is a "have your cake and eat it too" problem. If you want precisely timed cycles, you must delay output until exactly the 1-sec mark (thereby making the data stale by a fraction of a second). If you want data currency, you have to estimate when to start the operation so the output arrives approximately at the 1-sec mark, but then the cycle won't be exactly 1 sec. What I see from this experiment, though, is that I can do the latter and get within about 10 usec of error on cycle time, with data no more than 250 usec old. That's an acceptable outcome.

One alternative would be a hybrid approach that assembles the output just a tad early, then produces it precisely on time, but I think the tradeoff as-is is preferable.

Thanks for the help.