From global to local quiescence: Wait-Free code patching of multi-threaded processes

Author(s): Rommel, F.
Dietrich, C.
Friesel, D.
Köppen, M.
Borchert, C.
Müller, M.
Spinczyk, O. 
Lohmann, D.
Keywords: Quality of service; Systems analysis, Code changes; Global barriers; High load; Live updates; Multi-threaded programs; Multithreaded; Running systems; User spaces, Linux
Publication date: 2020
Publisher: USENIX Association
Journal: Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020
Start page: 651
End page: 666
Abstract:
Live patching has become a common technique to keep long-running system services secure and up-to-date without causing downtimes during patch application. However, to safely apply a patch, existing live-update methods require the entire process to enter a state of quiescence, which can be highly disruptive for multi-threaded programs: Having to halt all threads (e.g., at a global barrier) for patching not only hampers quality of service, but can also be tremendously difficult to implement correctly without causing deadlocks or other synchronization issues. In this paper, we present WFPATCH, a wait-free approach to inject code changes into running multi-threaded programs. Instead of having to stop the world before applying a patch, WFPATCH can gradually apply it to each thread individually at a local point of quiescence, while all other threads can make uninterrupted progress. We have implemented WFPATCH as a kernel service and user-space library for Linux 5.1 and evaluated it with OpenLDAP, Apache, Memcached, Samba, Node.js, and MariaDB on Debian 10 (“buster”). In total, we successfully applied 33 different binary patches into running programs while they were actively servicing requests; 15 patches had a CVE number or were other critical updates. Applying a patch with WFPATCH did not lead to any noticeable increase in request latencies - even under high load - while applying the same patch after reaching global quiescence increases tail latencies by a factor of up to 41× for MariaDB. © 2020 Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020. All rights reserved.
Description:
Conference: 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020; Conference Date: 4 November 2020 through 6 November 2020; Conference Code: 164991
ISBN: 9781939133199
External URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85096787683&partnerID=40&md5=cc3281cf226a92b4064d93fc0efd4bf7
