Milestone Report
11/22 Milestone Report due at 9AM
Completed So Far
So far we have implemented the following locks: test-and-set, test-and-test-and-set, test-and-set with backoff, test-and-test-and-set with backoff, ticket lock, and array lock. We checked their correctness with a simple microbenchmark in which each thread repeatedly acquires the lock, increments a shared variable x, and runs a while loop for some iterations (to simulate delay). We wrote the lock primitives as x86 assembly functions, using instructions such as lock cmpxchg and lock bts, and wrote the remaining code in C, which calls these assembly functions following the standard calling convention. We hit some race conditions while implementing the locks in assembly, but we have fixed them. We have also started measuring some metrics on the GHC machines: fairness, lock latency (for both acquiring and releasing the lock), and cache references and cache misses, which we collect with the perf tool as an approximate measure of interconnect traffic.
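To make the correctness check concrete, here is a minimal sketch of a test-and-test-and-set lock and the shared-counter check. It is not our actual code: our locks are written in x86 assembly, whereas this sketch uses C11 atomics so that it is self-contained. All names (ttas_lock_t, ttas_acquire, ttas_release, run_counter_test) and the delay length are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical test-and-test-and-set lock built on C11 atomics. */
typedef struct {
    atomic_int held; /* 0 = free, 1 = held */
} ttas_lock_t;

static void ttas_acquire(ttas_lock_t *l) {
    for (;;) {
        /* "Test": spin on a plain load so waiters read their cached copy. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed))
            ;
        /* "Test-and-set": attempt the atomic exchange only when the lock looks free. */
        if (!atomic_exchange_explicit(&l->held, 1, memory_order_acquire))
            return;
    }
}

static void ttas_release(ttas_lock_t *l) {
    atomic_store_explicit(&l->held, 0, memory_order_release);
}

typedef struct {
    ttas_lock_t *lock;
    long *x;    /* shared counter, protected by lock */
    long iters; /* increments per thread */
} worker_arg_t;

static void *worker(void *p) {
    worker_arg_t *a = p;
    for (long i = 0; i < a->iters; i++) {
        ttas_acquire(a->lock);
        (*a->x)++;
        for (volatile int d = 0; d < 10; d++)
            ; /* short loop to simulate work inside the critical section */
        ttas_release(a->lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter, or -1 on error.
 * A correct lock yields exactly nthreads * iters. */
static long run_counter_test(int nthreads, long iters) {
    enum { MAX_THREADS = 64 }; /* arbitrary cap for this sketch */
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_t tids[MAX_THREADS];
    ttas_lock_t lock;
    long x = 0;
    atomic_init(&lock.held, 0);
    worker_arg_t arg = { &lock, &x, iters };

    int created = 0;
    for (; created < nthreads; created++)
        if (pthread_create(&tids[created], NULL, worker, &arg) != 0)
            break;
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    return created == nthreads ? x : -1;
}
```

For example, run_counter_test(4, 2000) should return 8000 when the lock is correct; a smaller value indicates lost updates, which is how a race in the lock would show up in this kind of check.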
Goals and Deliverables
Up to this checkpoint we have implemented the locks that were covered in class. From now on we will try to implement what was not covered in class. We originally set easier, more realistic goals. After one of our meetings with the professor, we added the challenge of implementing lock stealing for the case where a thread is swapped out by the OS while holding a lock. Writing the locking and unlocking code has made us more familiar and comfortable with locks and how they work. Along with lock stealing, we may also explore other lock implementations. We plan to measure lock latency on the PSC machines. We also plan to write better microbenchmarks: one whose critical section does a lot of computation but mostly hits in cache, and another whose critical section incurs many cache misses, which may create interconnect traffic and affect the performance of different locks in different ways. We believe we will be able to complete these items in the post-checkpoint phase.
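As a rough illustration of the two planned critical-section bodies, the sketch below shows one compute-heavy routine and one memory-heavy routine. This is only an assumption about how the microbenchmarks might look, not a finished design: the function names, the 64-byte cache line size, and the arithmetic used are all hypothetical, and whether the second routine actually misses depends on the buffer being larger than the cache.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size in bytes */

/* Compute-heavy body: repeated arithmetic on a single local value,
 * so it should generate little memory traffic. */
static uint64_t cs_compute(uint64_t seed, int rounds) {
    uint64_t v = seed;
    for (int i = 0; i < rounds; i++)
        v = v * 6364136223846793005ULL + 1442695040888963407ULL; /* LCG step */
    return v;
}

/* Memory-heavy body: writes one byte per cache line across buf.
 * With a buffer larger than the cache, most of these writes are
 * expected to miss. Returns the number of lines touched. */
static uint64_t cs_touch_lines(volatile uint8_t *buf, size_t len) {
    uint64_t touched = 0;
    for (size_t off = 0; off < len; off += CACHE_LINE) {
        buf[off]++;
        touched++;
    }
    return touched;
}
```

Either routine would be called between the acquire and release of the lock under test, so that the same lock can be compared under a cache-friendly and a cache-hostile critical section.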
Show at Session
We hope to show a table of comparison results for different lock implementations. We will also explain thoroughly how we accomplished lock stealing.
Issues of Concern
We aren't sure how to measure interconnect traffic. For now we are measuring cache references and misses with the perf tool as a rough estimate, and we would appreciate any suggestions for a more accurate way to measure it. (Also, we won't be able to measure these cache misses on the PSC machines.) We are uncertain how long the lock stealing implementation may take. We are also unsure which metrics would be worth measuring beyond fairness, latency, and cache references/misses. Finally, we were wondering whether power consumption for the different lock implementations (when many threads acquire and release locks in quick succession) would be a useful thing to observe.