We invented threads to express concurrency within a program
But we now have two problems related to controlling operation order: mutual exclusion and inter-thread synchronization
Synchronization is the mechanism threads use to control the order of their operations relative to each other
We have a problem if two or more threads access the same shared data and at least one of the accesses is a write
Shared data: data accessed by more than one thread
Critical Sections: the pieces of code that access shared data
Race Condition: the outcome depends on the (uncontrolled) order in which threads execute their critical sections
Mutual Exclusion: guaranteeing that at most one thread is in a critical section at a time
For example
et = uthread_create ((void* (*)(void*)) push_driver, (void*) n);
dt = uthread_create ((void* (*)(void*)) pop_driver, (void*) n);
uthread_join (et, 0);
uthread_join (dt, 0);
assert (top==0);

void push_st (struct SE* e) {
  e->next = top;
  top = e;
}

struct SE* pop_st () {
  struct SE* e = top;
  top = (top) ? top->next : 0;
  return e;
}
Notice: it works if we say uthread_init(1), but it does not work with uthread_init(2) or higher
The bug: with more than one thread running at once, both threads can read the same value of top before either updates it; for example, two concurrent push_st calls can both set e->next to the old top, and whichever assignment to top happens last wins, losing the other element
Lock Semantics
Lock Primitives
lock: acquire the lock, waiting if necessary until it is free
unlock: release the lock, allowing a waiting thread (if any) to acquire it
Using locks for the shared stack
void push_cs (struct SE* e) {
  lock (&aLock);
  push_st (e);       // critical section is inside the lock
  unlock (&aLock);
}

struct SE* pop_cs () {
  struct SE* e;
  lock (&aLock);
  e = pop_st ();
  unlock (&aLock);

  return e;
}
Here's a first cut
int lock = 0;

void lock (int* lock) {
  while (*lock == 1) {}      // spin until the lock appears free
  *lock = 1;                 // then claim it (another thread may be doing the same!)
}

void unlock (int* lock) {
  *lock = 0;
}
We now have a race in the lock code!
The race exists even at the machine-code level: another thread can acquire the lock between the load that sees *lock == 0 and the store that sets *lock = 1
We need a new instruction that performs the read and the write as one indivisible step
Atomicity
The Atomic Memory Exchange
Name | Semantics | Assembly
---|---|---
atomic exchange | atomically: r[v] ← m[r[a]] and m[r[a]] ← r[v] (swap) | xchg (ra), rv
A Spinlock is a lock that uses busy waiting: a thread repeatedly retries the atomic exchange in a loop (it "spins") until it acquires the lock
Implementation using Atomic Exchange
      ld   $lock, r1     # r1 = &lock
      ld   $1, r0        # r0 = 1
loop: xchg (r1), r0      # atomically swap r0 and lock
      beq  r0, held      # r0 == 0: lock was free, we now hold it
      br   loop          # r0 == 1: lock was held, try again
held:
The atomic exchange cannot be implemented by the CPU alone; it is implemented with help from the memory bus, which locks out other memory accesses while the exchange's read and write complete
Because every xchg ties up the memory bus, it is better to spin first on a normal read and only attempt the exchange when the lock appears to be free
      ld   $lock, r1     # r1 = &lock
loop: ld   (r1), r0      # r0 = lock
      beq  r0, try       # goto try if lock == 0 (available)
      br   loop          # goto loop if lock != 0 (held)
try:  ld   $1, r0        # r0 = 1
      xchg (r1), r0      # atomically swap r0 and lock
      beq  r0, held      # goto held if lock was 0 before the swap
      br   loop          # try again if another thread holds the lock
held:                    # we now hold the lock
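For reference, here is a minimal C sketch of the same test-and-test-and-set spinlock, assuming GCC/Clang-style __atomic builtins (an assumption; the notes use the xchg instruction directly). The names spinlock_lock and spinlock_unlock match the ones used later in these notes, but this implementation is only illustrative.

typedef volatile int spinlock_t;

void spinlock_lock (spinlock_t* l) {
  for (;;) {
    while (__atomic_load_n (l, __ATOMIC_RELAXED) != 0)
      ;                                                 // spin on a normal read while held
    if (__atomic_exchange_n (l, 1, __ATOMIC_ACQUIRE) == 0)
      return;                                           // exchange saw 0: we now hold the lock
  }
}

void spinlock_unlock (spinlock_t* l) {
  __atomic_store_n (l, 0, __ATOMIC_RELEASE);            // release the lock
}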
Busy-waiting pros and cons
If a thread may wait a long time
it should block so that other threads can run
it will then unblock when it becomes runnable
Blocking locks for mutual exclusion
attempting to acquire a held lock BLOCKs calling thread
when releasing lock, UNBLOCK a waiting thread if there is one
Blocking for event notification
Spinlocks
busy waiting
Pros and cons: acquire and release are cheap and involve no context switch, but a waiting thread burns CPU (and memory-bus bandwidth) for as long as it spins
Use when: the lock is held only briefly and waits are expected to be short, or when blocking is not an option
Blocking Locks
blocking waiting
Pros and cons: waiting threads consume no CPU, but every block/unblock involves queue management and context switches, making acquire and release more expensive
Use when: critical sections are long or contention means waits may be long
General Problem
video playback has two parts: (1) fetch/decode and (2) playback
fetch has variable latency and so we need a buffer
Bounded Buffer and Two Independent Threads
Producer Thread
Consumer Thread
How are Producer and Consumer Connected?
Mutual exclusion plus inter-thread synchronization
Monitor / MUTEX
Condition Variable
struct uthread_mutex;
typedef struct uthread_mutex* uthread_mutex_t;
struct uthread_cond;
typedef struct uthread_cond* uthread_cond_t;

uthread_mutex_t uthread_mutex_create ();
void uthread_mutex_lock (uthread_mutex_t);
void uthread_mutex_lock_readonly (uthread_mutex_t);
void uthread_mutex_unlock (uthread_mutex_t);
void uthread_mutex_destroy (uthread_mutex_t);

uthread_cond_t uthread_cond_create (uthread_mutex_t);
void uthread_cond_wait (uthread_cond_t);
void uthread_cond_signal (uthread_cond_t);
void uthread_cond_broadcast (uthread_cond_t);
void uthread_cond_destroy (uthread_cond_t);

struct video_frame;

struct video_frame buf [N];
int buf_length = 0;
int buf_pcur = 0;
int buf_ccur = 0;

uthread_mutex_t mx;
uthread_cond_t need_frames;

void producer () {
  uthread_mutex_lock (mx);
  while (1) {
    while (buf_length < N) {              // fill the buffer while there is room
      buf [buf_pcur] = get_next_frame ();
      buf_pcur = (buf_pcur + 1) % N;
      buf_length += 1;
    }
    uthread_cond_wait (need_frames);      // buffer full: wait until frames are needed
  }
  uthread_mutex_unlock (mx);
}
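The consumer thread is named above but not shown. Here is a hedged sketch of one way it could look: it assumes a second condition variable, have_frames, and a play_frame() routine, neither of which appears in the globals above; symmetrically, the producer would also need to signal have_frames after adding frames.

uthread_cond_t have_frames;               // assumed: producer signals this after adding frames

void consumer () {
  uthread_mutex_lock (mx);
  while (1) {
    while (buf_length > 0) {              // drain frames while any are buffered
      play_frame (&buf [buf_ccur]);       // play_frame() is assumed, like get_next_frame()
      buf_ccur = (buf_ccur + 1) % N;
      buf_length -= 1;
    }
    uthread_cond_signal (need_frames);    // buffer is empty: ask the producer for more
    uthread_cond_wait (have_frames);      // and wait until frames are available again
  }
  uthread_mutex_unlock (mx);
}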
Basic formulation
uthread_mutex_lock (aMutex);
  while (!aDesiredState)
    uthread_cond_wait (aCond);
  aDesiredState = 0;
uthread_mutex_unlock (aMutex);

uthread_mutex_lock (aMutex);
  aDesiredState = 1;
  uthread_cond_signal (aCond);
uthread_mutex_unlock (aMutex);
wait releases the mutex, blocks the thread, and re-acquires the mutex before returning
signal awakens at most one waiting thread
broadcast awakens all waiting threads
The beer pitcher is a shared data structure with two operations: pour (a glass) and refill (the pitcher)
Implementation goal
Data Structure for Beer Pitcher
What if you want to refill automatically?
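As a point of comparison with the semaphore version shown later, here is a minimal sketch of the beer pitcher as a monitor, using the uthread mutex and condition-variable API above; the counter name glasses and the condition name not_empty are assumptions, not names from the notes.

uthread_mutex_t mx;          // = uthread_mutex_create ()
uthread_cond_t  not_empty;   // = uthread_cond_create (mx)
int glasses = 0;             // glasses of beer left in the pitcher

void pour () {
  uthread_mutex_lock (mx);
  while (glasses == 0)                    // wait until there is beer to pour
    uthread_cond_wait (not_empty);
  glasses--;
  uthread_mutex_unlock (mx);
}

void refill (int n) {
  uthread_mutex_lock (mx);
  glasses += n;
  uthread_cond_broadcast (not_empty);     // wake everyone waiting for beer
  uthread_mutex_unlock (mx);
}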
Blocking Read
void read (char* buf, int nbytes, int blockno) {
  uthread_mutex_lock (mx);

  scheduleRead (buf, nbytes, blockno);
  uthread_cond_wait (readComplete);

  uthread_mutex_unlock (mx);
}
Read Completion
void readCompletionHandler () {      // called when the disk read finishes
  uthread_mutex_lock (mx);

  uthread_cond_signal (readComplete);

  uthread_mutex_unlock (mx);
}

void read (char* buf, int nbytes, int blockno) {
  uthread_mutex_lock (mx);
  scheduleRead (buf, nbytes, blockno);
  uthread_cond_wait (readComplete);
  uthread_mutex_unlock (mx);
}

void read (char* buf, int nbytes, int blockno) {
  scheduleRead (buf, nbytes, blockno);
  uthread_cond_wait (readComplete);      // wait without holding the mutex
}
The Problem: the completion interrupt can fire, and signal readComplete, after scheduleRead() returns but before the thread calls uthread_cond_wait(); that signal is lost and read() blocks forever
The Solution: hold the mutex across scheduleRead() and the wait, and have the completion handler acquire the same mutex before signalling, so the signal cannot slip into that gap
Preventing the Race
Naked Notify
a naked notify is a signal delivered without holding the monitor's mutex
it's sometimes necessary — for example, an interrupt handler may not be able to block to acquire the mutex
void enqueue (uthread_queue_t* queue, uthread_t thread) {
  thread->next = 0;
  if (queue->tail)
    queue->tail->next = thread;
  queue->tail = thread;
  if (queue->head == 0)
    queue->head = queue->tail;
}

uthread_t dequeue (uthread_queue_t* queue) {
  uthread_t thread;
  if (queue->head) {
    thread = queue->head;
    queue->head = queue->head->next;
    if (queue->head == 0)
      queue->tail = 0;
    thread->next = 0;
  } else
    thread = 0;
  return thread;
}

void enqueue (uthread_queue_t* queue, uthread_t thread) {
  uthread_mutex_lock (&queue->mx);

  thread->next = 0;
  if (queue->tail)
    queue->tail->next = thread;
  queue->tail = thread;
  if (queue->head == 0)
    queue->head = queue->tail;

  uthread_cond_signal (&queue->not_empty);
  uthread_mutex_unlock (&queue->mx);
}

uthread_t dequeue (uthread_queue_t* queue) {
  uthread_t thread;
  uthread_mutex_lock (&queue->mx);
  while (queue->head == 0)                     // wait until the queue is non-empty
    uthread_cond_wait (&queue->not_empty);
  thread = queue->head;
  queue->head = queue->head->next;
  if (queue->head == 0)
    queue->tail = 0;
  thread->next = 0;
  uthread_mutex_unlock (&queue->mx);
  return thread;
}
This code seems like it would be right
void enqueue (uthread_queue_t* queue, uthread_t thread) {
  uthread_mutex_lock (&queue->mx);
  ...
  if (queue->head == 0)
    uthread_cond_signal (&queue->not_empty);
  uthread_mutex_unlock (&queue->mx);
}
But it is WRONG
Let's say there are N threads waiting in dequeue on queue->not_empty
If there are N enqueues, there must be N signals to wake up all of the dequeuers
If two enqueues occur in a row before any dequeue runs, however, the second sees a non-empty queue and does not signal
You thus get N enqueues but fewer than N signals, and some dequeuers wait forever
If we classify critical sections as either reading or writing the shared data
Then we can weaken the mutual exclusion constraint: any number of readers may be in the critical section together, but a writer must be alone
Reader-Writer Monitors
monitor state is one of: unlocked, locked for reading (with a count of readers), or locked for writing
mutex_lock()
mutex_lock_read_only()
mutex_unlock()
Policy Question: what happens to readers that arrive while a writer is waiting?
Disallowing new readers while a writer is waiting favours the writer and bounds how long it waits
Allowing new readers while a writer is waiting favours readers but can starve the writer
What should we do
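A brief usage sketch (not from the notes): readers take the monitor with uthread_mutex_lock_readonly and a writer takes it with uthread_mutex_lock; the shared hits array and the function names are assumptions.

int hits [128];                  // assumed shared data: per-page hit counts
uthread_mutex_t hits_mx;         // = uthread_mutex_create ()

int read_hits (int page) {       // many readers may run this concurrently
  uthread_mutex_lock_readonly (hits_mx);
  int n = hits [page];
  uthread_mutex_unlock (hits_mx);
  return n;
}

void count_hit (int page) {      // a writer excludes readers and other writers
  uthread_mutex_lock (hits_mx);
  hits [page] += 1;
  uthread_mutex_unlock (hits_mx);
}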
Introduced by Edsger Dijkstra
A Semaphore is an atomic counter that can never go below zero, with two operations
P(s) – wait(s): block until s > 0, then atomically decrement s
V(s) – signal(s): atomically increment s, unblocking one waiting thread if there is one
but the counter's value cannot be read directly; wait and signal are the only operations
struct uthread_sem;
typedef struct uthread_sem* uthread_sem_t;

uthread_sem_t uthread_sem_create (int initial_value);
void uthread_sem_destroy (uthread_sem_t);
void uthread_sem_wait (uthread_sem_t);
void uthread_sem_signal (uthread_sem_t);
Use a semaphore whose value counts the glasses of beer remaining in the pitcher

uthread_sem_t glasses = uthread_sem_create (0);
Pouring and refilling don't require a monitor
void pour () {
  uthread_sem_wait (glasses);          // wait for (and take) a glass of beer
}
void refill (int n) {
  for (int i = 0; i < n; i++)
    uthread_sem_signal (glasses);      // add a glass, waking one waiter
}
Implementing Monitors
Implementing Condition Variables
Blocking Synchronous Operations
Using Monitors and Condition Variables
to avoid wait-signal race, wait and signal must be done while mutex is held
void read (...) {
  uthread_mutex_lock (mx);
  scheduleRead (buf, nbytes, bno);
  uthread_cond_wait (complete);
  uthread_mutex_unlock (mx);
}

void readCompletionHandler () {
  uthread_mutex_lock (mx);
  uthread_cond_signal (complete);
  uthread_mutex_unlock (mx);
}
Using Semaphores
no critical section, and the wait-signal race problem goes away … why? (the semaphore counts signals: a signal that arrives before the wait leaves the counter at 1, so the later wait returns immediately instead of blocking forever)
void read (...) {
  scheduleRead (buf, nbytes, bno);
  uthread_sem_wait (complete);
}

void readCompletionHandler () {
  uthread_sem_signal (complete);
}
With condition variables
int dequeue (struct Q* q) {
  uthread_mutex_lock (q->mx);
  while (q->length == 0)
    uthread_cond_wait (q->notEmpty);
  ....
  uthread_mutex_unlock (q->mx);
}

void enqueue (struct Q* q, int i) {
  uthread_mutex_lock (q->mx);
  ....
  uthread_cond_signal (q->notEmpty);
  uthread_mutex_unlock (q->mx);
}
With semaphores
struct Q {
  uthread_sem_t mutex;    // initialize to 1
  uthread_sem_t length;   // initialize to 0
};

int dequeue (struct Q* q) {
  uthread_sem_wait (q->length);      // wait until there is an item

  uthread_sem_wait (q->mutex);       // then enter the critical section
  ...
  uthread_sem_signal (q->mutex);
}

void enqueue (struct Q* q, int i) {
  uthread_sem_wait (q->mutex);
  ....
  uthread_sem_signal (q->mutex);
  uthread_sem_signal (q->length);    // announce the new item
}
If you want thread A to wait for thread B
// Thread A
uthread_sem_wait (b);

// Thread B
uthread_sem_signal (b);
Rendezvous: both threads wait for each other
// Thread A
uthread_sem_signal (a);
uthread_sem_wait (b);

// Thread B
uthread_sem_signal (b);
uthread_sem_wait (a);
What if you reversed the order of wait and signal in either (or both) threads? (If both threads wait before signalling, each waits for a signal the other will never send: deadlock.)
Local, Non-Reusable Barrier
// create N threads
uthread_create (one, a);
uthread_create (two, b);
...
uthread_create (last, z);
// wait for them to finish
for (int i = 0; i < N; i++)
  uthread_sem_wait (barrier);

void* one (void* a) {
  ...
  uthread_sem_signal (barrier);
}
Global, Reusable Barrier
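The notes name this case without giving code; here is a minimal two-phase sketch using the uthread semaphore API, assuming N threads and helper names (arrival, departure, waiting) that are not from the notes. The second phase stops a fast thread from looping around and re-entering the barrier before slower threads have left it.

int n_threads = N;            // number of threads using the barrier
int waiting = 0;              // how many have arrived at the current phase
uthread_sem_t mutex;          // = uthread_sem_create (1)
uthread_sem_t arrival;        // = uthread_sem_create (0)
uthread_sem_t departure;      // = uthread_sem_create (0)

void barrier () {
  // phase 1: wait until all N threads have arrived
  uthread_sem_wait (mutex);
  if (++waiting == n_threads)
    for (int i = 0; i < n_threads; i++)
      uthread_sem_signal (arrival);          // last arrival releases everyone
  uthread_sem_signal (mutex);
  uthread_sem_wait (arrival);

  // phase 2: wait until all N threads have left phase 1, so the barrier can be reused
  uthread_sem_wait (mutex);
  if (--waiting == 0)
    for (int i = 0; i < n_threads; i++)
      uthread_sem_signal (departure);        // last one out re-arms the barrier
  uthread_sem_signal (mutex);
  uthread_sem_wait (departure);
}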
Mutex
Lock l = ...;
l.lock();
try {
  ...
} finally {
  l.unlock();
}

Lock l = ...;
try {
  l.lockInterruptibly();
  try {
    ...
  } finally {
    l.unlock();
  }
} catch (InterruptedException ie) {}

ReadWriteLock l = ...;
Lock rl = l.readLock();
Lock wl = l.writeLock();
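A brief usage sketch (not from the notes): readers take the read lock and a writer takes the write lock; the Cache class, the shared HashMap, and the method names are assumptions.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class Cache {
  private final Map<String, String> map = new HashMap<>();
  private final ReadWriteLock rw = new ReentrantReadWriteLock();
  private final Lock rl = rw.readLock();
  private final Lock wl = rw.writeLock();

  String get (String key) {               // many readers may hold rl at once
    rl.lock();
    try { return map.get(key); }
    finally { rl.unlock(); }
  }

  void put (String key, String value) {   // a writer holds wl exclusively
    wl.lock();
    try { map.put(key, value); }
    finally { wl.unlock(); }
  }
}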
Conditions
class Beer {
  Lock l = ...;
  Condition notEmpty = l.newCondition ();
  int glasses = 0;

  void pour() throws InterruptedException {
    l.lock();
    try {
      while (glasses == 0)
        notEmpty.await();
      glasses--;
    } finally {
      l.unlock();
    }
  }

  void refill (int n) throws InterruptedException {
    l.lock();
    try {
      glasses += n;
      notEmpty.signalAll();
    } finally {
      l.unlock();
    }
  }
}
Semaphore Class
class Beer {
  Semaphore glasses = new Semaphore (0);

  void pour () throws InterruptedException {
    glasses.acquire();
  }

  void refill (int n) throws InterruptedException {
    glasses.release (n);
  }
}
Lock-free Atomic Variables
AtomicX where X in {Boolean, Integer, IntegerArray, Reference, ...}
atomic operations such as getAndAdd(), compareAndSet()...
V getAndSet (V newValue)
boolean compareAndSet (V expectedValue, V newValue)
V get()
void set (V newValue)
Use these instead of mutual exclusion to eliminate data races.
void push_st (struct SE* e) {
  e->next = top;
  top = e;
}

struct SE* pop_st () {
  struct SE* e = top;
  top = (top) ? top->next : 0;
  return e;
}

class Element {
  Element next;
}

class Stack {
  AtomicReference<Element> top = new AtomicReference<> (null);

  void push () {
    Element t;
    Element e = new Element ();
    do {
      t = top.get ();                       // snapshot the current top
      e.next = t;                           // link the new element to it
    } while (!top.compareAndSet (t, e));    // retry if top changed underneath us
  }
}
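The matching lock-free pop is not shown above; here is a sketch of a method that could be added to the Stack class, in the same style. Note that, unlike push, a compare-and-set pop of a linked stack is vulnerable to the ABA problem if popped elements are reused — an issue these notes do not address.

  Element pop () {
    Element t, next;
    do {
      t = top.get ();                        // snapshot the current top
      if (t == null)
        return null;                         // stack is empty
      next = t.next;
    } while (!top.compareAndSet (t, next));  // retry if top changed underneath us
    return t;
  }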
Race Condition
competing, unsynchronized access to shared variable
solved with synchronization
Deadlock
void foo () {
  uthread_mutex_lock (mx);
  count--;
  if (count > 0)
    foo ();                       // recursive call tries to re-acquire mx
  uthread_mutex_unlock (mx);
}

void lock (struct block_lock* l) {
  spinlock_lock (&l->spinlock);
  while (l->held) {
    enqueue (&l->waiter_queue, uthread_self ());
    spinlock_unlock (&l->spinlock);
    uthread_block ();
    spinlock_lock (&l->spinlock);
  }
  l->held = 1;
  spinlock_unlock (&l->spinlock);
}
if the thread that already holds the mutex tries to lock it again, it deadlocks: it waits for itself to release the lock
a reentrant mutex fixes this by allowing the thread that holds the mutex to enter again

void uthread_mutex_lock (uthread_mutex_t mutex) {
  spinlock_lock (&mutex->spinlock);
  while (mutex->holder && mutex->holder != uthread_self ()) {
    enqueue (&mutex->waiter_queue, uthread_self ());
    spinlock_unlock (&mutex->spinlock);
    uthread_block ();
    spinlock_lock (&mutex->spinlock);
  }
  mutex->holder = uthread_self ();
  mutex->holderCount++;
  spinlock_unlock (&mutex->spinlock);
}
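The matching unlock is not shown; here is a sketch under the same assumptions about the mutex's fields, using a hypothetical uthread_unblock() to wake a dequeued waiter (the real wake-up call in the uthread library may be named differently).

void uthread_mutex_unlock (uthread_mutex_t mutex) {
  spinlock_lock (&mutex->spinlock);
  mutex->holderCount--;
  if (mutex->holderCount == 0) {                      // last nested unlock actually releases
    mutex->holder = 0;
    uthread_t waiter = dequeue (&mutex->waiter_queue);
    if (waiter)
      uthread_unblock (waiter);                       // hypothetical: wake one blocked waiter
  }
  spinlock_unlock (&mutex->spinlock);
}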