{"id":13109,"date":"2025-04-16T18:35:51","date_gmt":"2025-04-16T16:35:51","guid":{"rendered":"https:\/\/monodes.com\/predaelli\/?p=13109"},"modified":"2025-04-16T18:35:55","modified_gmt":"2025-04-16T16:35:55","slug":"1024cores-distributed-reader-writer-mutex","status":"publish","type":"post","link":"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/","title":{"rendered":"1024cores &#8211; Distributed Reader-Writer Mutex"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><em><a href=\"https:\/\/www.1024cores.net\/home\/lock-free-algorithms\/reader-writer-problem\/distributed-reader-writer-mutex\">1024cores &#8211; Distributed Reader-Writer Mutex<\/a><\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is definitely something that I would like to Eiffelize!<\/p>\n\n\n\n<!--more-->\n\n\n\n<!--nextpage-->\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">Now that we know that traditional <a href=\"https:\/\/www.1024cores.net\/home\/lock-free-algorithms\/reader-writer-problem\">reader-writer mutexes do not scale<\/a> and <a href=\"https:\/\/www.1024cores.net\/home\/lock-free-algorithms\/first-things-first\">write sharing is our foe<\/a>, and that <a href=\"https:\/\/www.1024cores.net\/home\/lock-free-algorithms\/reader-writer-problem\/state-distribution\">the way to go is state distribution<\/a>, let&#8217;s try to create a scalable distributed reader-writer mutex. The mutex is going to be very simple: I&#8217;m not going to dive too deep into advanced lock-free algorithms; let&#8217;s just create the simplest possible distributed design and see what performance and scalability we achieve.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The mutex is based on <a href=\"https:\/\/www.1024cores.net\/home\/lock-free-algorithms\/tricks\/per-processor-data\">per-processor data<\/a>, which leads to a very simple implementation. 
If it were based on per-thread data instead, we would need to cope with dynamic thread registration\/deregistration and properly synchronize arriving\/terminating readers with writers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The idea is very simple. We merely create a traditional reader-writer mutex per CPU; a reader acquires, in shared mode, the mutex for the CPU it believes it is currently running on, while a writer acquires all the mutexes in exclusive mode:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"488\" height=\"285\" data-attachment-id=\"13111\" data-permalink=\"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/distributed-reader-writer-mutex-1\/\" data-orig-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-1.webp?fit=488%2C285&amp;ssl=1\" data-orig-size=\"488,285\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"distributed-reader-writer-mutex-1\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-1.webp?fit=488%2C285&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-1.webp?resize=488%2C285&#038;ssl=1\" alt=\"\" class=\"wp-image-13111\" 
srcset=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-1.webp?w=488&amp;ssl=1 488w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-1.webp?resize=300%2C175&amp;ssl=1 300w\" sizes=\"auto, (max-width: 488px) 100vw, 488px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Note that it&#8217;s OK if a reader acquires a &#8220;wrong&#8221; mutex: they are all plain reader-writer mutexes in themselves, so each supports several concurrent readers. No additional synchronization between writers is required: writers acquire the mutexes in the same order (from 0 to P-1), so ownership of mutex 0 basically determines who is the &#8220;current&#8221; writer (all other potential writers are parked on mutex 0).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As the underlying reader-writer mutex type I use plain pthread_rwlock_t; sched_getcpu() is used to obtain the current processor number. Let&#8217;s move on to the implementation. 
First, let&#8217;s define the data structures:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">typedef struct distr_rw_mutex_cell_t<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">pthread_rwlock_t mtx;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">char pad [CACHE_LINE_SIZE - sizeof(pthread_rwlock_t)];<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">} distr_rw_mutex_cell_t;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">typedef struct distr_rw_mutex_t<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int proc_count;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">char pad [CACHE_LINE_SIZE - sizeof(int)];<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">distr_rw_mutex_cell_t cell [0];<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">} distr_rw_mutex_t;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The constructor merely determines the total number of processors in the system, remembers it, and initializes the per-processor mutexes, while the destructor destroys the mutexes and frees the memory:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int distr_rw_mutex_create (distr_rw_mutex_t** mtx_p)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">distr_rw_mutex_t* mtx;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int proc_count;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int i;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">proc_count = (int)sysconf(_SC_NPROCESSORS_CONF);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">if (posix_memalign((void**)&amp;mtx, CACHE_LINE_SIZE,<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">sizeof(distr_rw_mutex_t) +<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">proc_count * sizeof(distr_rw_mutex_cell_t)))<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 1;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">mtx-&gt;proc_count = proc_count;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">for (i = 0; i != proc_count; i += 1)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p 
class=\"wp-block-paragraph\">if (pthread_rwlock_init(&amp;mtx-&gt;cell[i].mtx, 0))<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">while (i-- &gt; 0)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">pthread_rwlock_destroy(&amp;mtx-&gt;cell[i].mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">free(mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 1;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">*mtx_p = mtx;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 0;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int distr_rw_mutex_destroy (distr_rw_mutex_t* mtx)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int i;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">for (i = 0; i != mtx-&gt;proc_count; i += 1)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">pthread_rwlock_destroy(&amp;mtx-&gt;cell[i].mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">free(mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 0;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The write lock\/unlock functions merely lock\/unlock all the mutexes. 
Not much to comment on here:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int distr_rw_mutex_wrlock (distr_rw_mutex_t* mtx)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int i;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">for (i = 0; i != mtx-&gt;proc_count; i += 1)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">pthread_rwlock_wrlock(&amp;mtx-&gt;cell[i].mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 0;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int distr_rw_mutex_wrunlock (distr_rw_mutex_t* mtx)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int i;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">for (i = 0; i != mtx-&gt;proc_count; i += 1)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">pthread_rwlock_unlock(&amp;mtx-&gt;cell[i].mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 0;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The read-lock function obtains an approximation of the current processor number, remembers it, and locks the corresponding mutex in shared mode. The read-unlock function just unlocks the same mutex. 
Note that the unlock function can&#8217;t re-obtain the current processor number and use that; it must use the processor number obtained in the lock function (because the thread might have moved to another processor in the meantime):<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int distr_rw_mutex_rdlock (distr_rw_mutex_t* mtx, int* proc)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">*proc = sched_getcpu();<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">pthread_rwlock_rdlock(&amp;mtx-&gt;cell[*proc].mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 0;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">int distr_rw_mutex_rdunlock (distr_rw_mutex_t* mtx, int proc)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">{<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">pthread_rwlock_unlock(&amp;mtx-&gt;cell[proc].mtx);<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">return 0;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">}<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Performance<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In order to verify performance and scalability, I benchmarked the mutex against pthread_rwlock_t. The benchmark is very simple: 1 reader-writer mutex, an array of N ints (the data) and P worker threads. Each worker thread constantly acquires the mutex in shared mode and verifies the data&#8217;s consistency. Periodically each worker thread acquires the mutex in exclusive mode and mutates the data. 
The benchmark was executed on a 4-processor x 4-core AMD machine (16 hardware threads in total) running Linux 2.6.29.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the first run I set N=4 and varied the write period over 10, 50, 100, 500, 1000 and 10000:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"510\" height=\"432\" data-attachment-id=\"13112\" data-permalink=\"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/distributed-reader-writer-mutex-2\/\" data-orig-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-2.webp?fit=623%2C528&amp;ssl=1\" data-orig-size=\"623,528\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"distributed-reader-writer-mutex-2\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-2.webp?fit=510%2C432&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-2.webp?resize=510%2C432&#038;ssl=1\" alt=\"\" class=\"wp-image-13112\" srcset=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-2.webp?resize=510%2C432&amp;ssl=1 510w, 
https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-2.webp?resize=300%2C254&amp;ssl=1 300w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-2.webp?w=623&amp;ssl=1 623w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">And below is the same graph but without lines for distributed(500, 1000 and 10000):<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"510\" height=\"459\" data-attachment-id=\"13113\" data-permalink=\"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/distributed-reader-writer-mutex-3\/\" data-orig-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-3.webp?fit=582%2C524&amp;ssl=1\" data-orig-size=\"582,524\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"distributed-reader-writer-mutex-3\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-3.webp?fit=510%2C459&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-3.webp?resize=510%2C459&#038;ssl=1\" alt=\"\" class=\"wp-image-13113\" 
srcset=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-3.webp?resize=510%2C459&amp;ssl=1 510w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-3.webp?resize=300%2C270&amp;ssl=1 300w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-3.webp?w=582&amp;ssl=1 582w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In the second run I set N=256, the same two graphs below:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"510\" height=\"433\" data-attachment-id=\"13114\" data-permalink=\"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/distributed-reader-writer-mutex-4\/\" data-orig-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-4.webp?fit=597%2C507&amp;ssl=1\" data-orig-size=\"597,507\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"distributed-reader-writer-mutex-4\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-4.webp?fit=510%2C433&amp;ssl=1\" 
src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-4.webp?resize=510%2C433&#038;ssl=1\" alt=\"\" class=\"wp-image-13114\" srcset=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-4.webp?resize=510%2C433&amp;ssl=1 510w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-4.webp?resize=300%2C255&amp;ssl=1 300w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-4.webp?w=597&amp;ssl=1 597w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"510\" height=\"412\" data-attachment-id=\"13115\" data-permalink=\"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/distributed-reader-writer-mutex-5\/\" data-orig-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-5.webp?fit=625%2C505&amp;ssl=1\" data-orig-size=\"625,505\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"distributed-reader-writer-mutex-5\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-5.webp?fit=510%2C412&amp;ssl=1\" 
src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-5.webp?resize=510%2C412&#038;ssl=1\" alt=\"\" class=\"wp-image-13115\" srcset=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-5.webp?resize=510%2C412&amp;ssl=1 510w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-5.webp?resize=300%2C242&amp;ssl=1 300w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-5.webp?w=625&amp;ssl=1 625w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">So, what do we see on the graphs? Our distributed mutex is somewhat (10-60%) slower in the uncontended case (note that the 60% slowdown refers to the extreme case of a 10% write rate + basically no useful work). pthread_rwlock_t is completely non-scalable under load, even on read-mostly workloads (however, we see a slight attempt to scale with N=256 on 2 threads). Our distributed mutex scales much better; at a 1\/10000 write rate it exhibits perfect linear scaling.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Note that the fact that we are using the current processor number for read acquisition is crucial, because performance-wise per-processor data is basically equal to per-thread data (a processor runs one thread at a time). I&#8217;ve also benchmarked a randomized variant of the distributed mutex (it uses per-thread random number generators to choose a mutex for read acquisition), and I tried to create the best possible conditions for it: I set the data size N to 256 and increased the number of underlying reader-writer mutexes 4-fold. 
The benchmark showed that it scales better than a centralized mutex, but is still far from the per-processor mutex (write rate is given in brackets):<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"510\" height=\"357\" data-attachment-id=\"13116\" data-permalink=\"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/distributed-reader-writer-mutex-6\/\" data-orig-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-6.webp?fit=605%2C424&amp;ssl=1\" data-orig-size=\"605,424\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"distributed-reader-writer-mutex-6\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-6.webp?fit=510%2C357&amp;ssl=1\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-6.webp?resize=510%2C357&#038;ssl=1\" alt=\"\" class=\"wp-image-13116\" srcset=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-6.webp?resize=510%2C357&amp;ssl=1 510w, https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-6.webp?resize=300%2C210&amp;ssl=1 300w, 
https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2025\/04\/distributed-reader-writer-mutex-6.webp?w=605&amp;ssl=1 605w\" sizes=\"auto, (max-width: 510px) 100vw, 510px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The bottom line is that the implementation is very simple and comprehensible; uncontended performance is somewhat worse than that of pthread_rwlock_t, while scalability is significantly improved. The mutex can be used whenever you have a high read load and a low write-to-read ratio (~&lt;1-5%).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You can download the implementation along with the benchmark below (gcc\/Linux).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a target=\"_blank\" href=\"https:\/\/drive.google.com\/folderview?id=16Du4plUk3iBZJzNiP1JYnmjNMmufc0ff\" rel=\"noreferrer noopener\"><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/drive.google.com\/embeddedfolderview?id=16Du4plUk3iBZJzNiP1JYnmjNMmufc0ff#list\">https:\/\/drive.google.com\/embeddedfolderview?id=16Du4plUk3iBZJzNiP1JYnmjNMmufc0ff#list<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If not stated otherwise, all non-source-code text and images on this site are provided under the terms of the <a href=\"http:\/\/www.google.com\/url?q=http%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby-nc-sa%2F3.0%2F&amp;sa=D&amp;sntz=1&amp;usg=AOvVaw2deD9cWsQvcneTOtYfZDp5\" target=\"_blank\" rel=\"noreferrer noopener\">Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License<\/a>. Source code is covered by the <a href=\"https:\/\/www.1024cores.net\/home\/code-license\">Simplified BSD License<\/a> and by <a href=\"http:\/\/www.google.com\/url?q=http%3A%2F%2Fwww.apache.org%2Flicenses%2FLICENSE-2.0&amp;sa=D&amp;sntz=1&amp;usg=AOvVaw2PdpYUD6EFcyaEJDP0W5CF\" target=\"_blank\" rel=\"noreferrer noopener\">Apache License, Version 2.0<\/a>. 
The opinions expressed on this site are my own and do not necessarily reflect the views of Google.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p class=\"excerpt\">1024cores &#8211; Distributed Reader-Writer Mutex This is definitively something that I would like to Eiffelize!<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"https:\/\/monodes.com\/predaelli\/2025\/04\/16\/1024cores-distributed-reader-writer-mutex\/\">Read more &rarr;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":4,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"federated","footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false,"_members_access_role":[],"_members_access_error":""},"categories":[238,34,98],"tags":[],"class_list":["post-13109","post","type-post","status-publish","format-standard","hentry","category-agenda","category-eiffel","category-liberty-eiffel"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p6daft-3pr","jetpack-related-posts":[{"id":9687,"url":"https:\/\/monodes.com\/predaelli\/2022\/10\/01\/spacevim-has-eiffel-support\/","url_meta":{"origin":13109,"position":0},"t
itle":"SpaceVim has Eiffel support!","author":"Paolo Redaelli","date":"2022-10-01","format":false,"excerpt":"SpaceVim, a community-driven vim distribution that seeks to provide layer feature, besides turning Vim into a nifty IDE for several languages (C\/C++, Rust, Kotlin, Go, Python, Java and JavaScript plus others), it offers among the available layers one for Eiffel! \u00a0","rel":"","context":"In &quot;Eiffel&quot;","block_context":{"text":"Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":5808,"url":"https:\/\/monodes.com\/predaelli\/2019\/06\/28\/5808\/","url_meta":{"origin":13109,"position":1},"title":"https:\/\/twitter.com\/wvo\/status\/1144331578006335488?s=20 Single header C++ WASM\u2026","author":"Paolo Redaelli","date":"2019-06-28","format":false,"excerpt":"https:\/\/twitter.com\/wvo\/status\/1144331578006335488?s=20 Single header C++ WASM binary writer (with linking\/reloc support): (link: https:\/\/github.com\/aardappel\/lobster\/blob\/master\/dev\/src\/lobster\/wasm_binary_writer.h) github.com\/aardappel\/lobs\u2026 as part of the Lobster WASM backend: (link: http:\/\/aardappel.github.io\/lobster\/implementation_wasm.html) aardappel.github.io\/lobster Time is ripe to start implementing an LLVM Liberty Eiffel backend. For real, this time \ud83d\ude03","rel":"","context":"In &quot;Eiffel&quot;","block_context":{"text":"Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1286,"url":"https:\/\/monodes.com\/predaelli\/2016\/04\/17\/eiffel-a-viable-candidate-as-a-language-for-the-gnome-platform\/","url_meta":{"origin":13109,"position":2},"title":"Eiffel: A viable candidate as a language for the Gnome platform ?","author":"Paolo Redaelli","date":"2016-04-17","format":false,"excerpt":"Eiffel: A viable candidate as a language for the Gnome platform ? It was 2004. 
Linux were labelled as a cancer by Ballmer, Android and iPhone didn't existed. Multi-core CPU were still high-end. It was a different world. From archive.org, before it got lost.... Twelve years ago. And it was\u2026","rel":"","context":"In &quot;Eiffel&quot;","block_context":{"text":"Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":4776,"url":"https:\/\/monodes.com\/predaelli\/2018\/10\/24\/ecma-eiffel-syntax-guide\/","url_meta":{"origin":13109,"position":3},"title":"(ECMA) Eiffel Syntax Guide","author":"Paolo Redaelli","date":"2018-10-24","format":false,"excerpt":"(ECMA) Eiffel Syntax Guide","rel":"","context":"In &quot;Eiffel&quot;","block_context":{"text":"Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":11120,"url":"https:\/\/monodes.com\/predaelli\/2024\/01\/07\/yet-another-missing-eiffel-%f0%9f%98%a2\/","url_meta":{"origin":13109,"position":4},"title":"Yet another missing Eiffel \ud83d\ude22","author":"Paolo Redaelli","date":"2024-01-07","format":false,"excerpt":"On https:\/\/github.com\/attractivechaos\/plb2 there is yet another programming language benchmark. And yet another not having Eiffel.... In addition to C, there is Nim, V, Rust.... 
that's very sad \ud83d\ude22","rel":"","context":"In &quot;Eiffel&quot;","block_context":{"text":"Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":9332,"url":"https:\/\/monodes.com\/predaelli\/2022\/05\/01\/liberty-liberty-eiffel-programming-language\/","url_meta":{"origin":13109,"position":5},"title":"Liberty | Liberty Eiffel programming language","author":"Paolo Redaelli","date":"2022-05-01","format":false,"excerpt":"Liberty Source: Liberty | Liberty Eiffel programming language","rel":"","context":"In &quot;Liberty Eiffel&quot;","block_context":{"text":"Liberty Eiffel","link":"https:\/\/monodes.com\/predaelli\/category\/eiffel\/liberty-eiffel\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts\/13109","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/comments?post=13109"}],"version-history":[{"count":0,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts\/13109\/revisions"}],"wp:attachment":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/media?parent=13109"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/categories?post=13109"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/tags?post=13109"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}