The paper, “Rollback-Recovery for Middleboxes,” is part of Justine’s Berkeley thesis work. Network middleboxes must offer high availability, with automatic failover when a device fails. Unlike routers, when middleboxes fail they must recover lost state about active network connections to operate correctly; without that state, clients face connection resets, downtime, or insecure behavior. No existing middlebox design provides failover that is correct, recovers quickly, and adds little latency to failure-free operation. The FTMB system described in the paper adds only 30 µs to median per-packet latency, a 100-1000x improvement over existing fault-tolerance mechanisms. FTMB introduces moderate throughput overheads (5-30%) and can reconstruct lost state in 40-275 ms for practical system configurations.
UW CSE professor Arvind Krishnamurthy is one of the paper’s co-authors, along with Peter Xiang Gao, Soumya Basu, Aurojit Panda, Sylvia Ratnasamy, and Scott Shenker from UC Berkeley, Christian Maciocco and Maziar Manesh from Intel Research, Joao Martins from NEC Labs, and Luigi Rizzo from the University of Pisa.