In this paper, we propose a new approach to building synchronization primitives, dubbed “lwlocks” (short for light-weight locks). The primitives are optimized for small memory footprint while maintaining efficient performance in low contention scenarios. A read-write lwlock occupies 4 bytes, a mutex occupies 4 bytes (2 if deadlock detection is not required), and a condition variable occupies 4 ...