fix: stop printing and saving halting after 10-12 hours of runtime (#372)
These changes are all grand improvements! I always wonder what role AI may have played in finding and repairing the code. I'm speculating, because I haven't yet discussed these fixes with Ariel, but shortcomings like these have likely been present in the code for as much as 30 years and might never have been discovered and fixed. That's a scary thought! A key part of discovering and fixing these problems is that Ariel both relies on the behavior of the code in his own restaurant community and has been unhesitating in empowering his code development environment with AI. Anyone thinking about extending, enhancing or improving the features of ViewTouch would be making a very unfortunate and unnecessary mistake by failing to connect with Ariel and learn from his vast understanding of the ViewTouch code base.
Three bugs combined to cause the system to stop printing and saving new orders after extended runtime.
Root cause (remote_printer.cc): PrinterCB() closed the socket and set failure=999 but never called RemoveInputFn(). The Xt event loop kept polling the closed FD on every tick, spinning forever. Over 10-12 hours this CPU waste starved the main event thread. Fix: deregister the input handler before closing the socket.
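To illustrate why the order "deregister, then close" matters, here is a minimal sketch. It does not use Xt or any ViewTouch code: `InputRegistry` and `RunDisconnectDemo` are hypothetical stand-ins built on plain POSIX `poll()`, and the `failure = 999` assignment only mirrors the description above. The point it tries to show is that a closed descriptor left in the poll set is reported on every tick, so the handler keeps firing.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <functional>
#include <map>
#include <vector>

// Hypothetical stand-in for an event loop's input registry (not Xt, not ViewTouch).
class InputRegistry {
public:
    using Handler = std::function<void(int)>;

    void Add(int fd, Handler h) {
        assert(fd >= 0);
        handlers_[fd] = std::move(h);
    }
    void Remove(int fd) { handlers_.erase(fd); }

    // One non-blocking loop tick: poll every registered fd and dispatch
    // the handler of each fd that reports any event (including POLLNVAL).
    // Returns the number of handlers invoked.
    int Tick() {
        std::vector<pollfd> fds;
        for (const auto& kv : handlers_) {
            pollfd p{};
            p.fd = kv.first;
            p.events = POLLIN;
            fds.push_back(p);
        }
        if (fds.empty()) return 0;
        if (poll(fds.data(), fds.size(), 0) <= 0) return 0;
        int calls = 0;
        for (const auto& p : fds) {
            if (p.revents == 0) continue;
            auto it = handlers_.find(p.fd);
            if (it == handlers_.end()) continue;
            Handler h = it->second;  // copy, because the handler may remove itself
            h(p.fd);
            ++calls;
        }
        return calls;
    }

private:
    std::map<int, Handler> handlers_;
};

// Simulates a printer peer hanging up. With deregister == true the handler
// removes itself before closing the socket; with false it only closes.
// Returns how many times the handler ran over `ticks` loop iterations.
int RunDisconnectDemo(bool deregister, int ticks) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    close(sv[1]);  // the remote side goes away

    InputRegistry reg;
    int failure = 0;
    bool closed = false;
    reg.Add(sv[0], [&](int fd) {
        char buf[64];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0) {  // EOF or error
            failure = 999;
            if (deregister) reg.Remove(fd);  // deregister first ...
            if (!closed) {                   // ... then close exactly once
                close(fd);
                closed = true;
            }
        }
    });

    int calls = 0;
    for (int i = 0; i < ticks; ++i) calls += reg.Tick();
    return calls;
}
```

In this sketch the self-removing handler runs once, while the close-only variant is dispatched again on every later tick, which is the same shape as the busy loop described above.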
Secondary (printer.cc): Printer::Close() called the synchronous LPDPrint(), which shells out to lpr via system(), on the main thread. A stalled or unresponsive CUPS daemon blocked the entire event loop. Fix: Close() now routes TARGET_LPD and TARGET_SOCKET through the existing CloseAsync() thread-pool path.
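The idea of moving a blocking print call off the main thread can be sketched as follows. This is not the actual CloseAsync() implementation, which this description does not show: `BlockingPrint`, `CloseAsyncSketch`, and `IsDone` are hypothetical names, `std::async` stands in for the thread pool, and a `std::shared_future<void>` gate simulates a print daemon that has not responded yet.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Stand-in for a blocking print call (for example one that shells out and
// waits). It blocks until the simulated daemon becomes responsive.
int BlockingPrint(std::shared_future<void> daemon_ready) {
    daemon_ready.wait();
    return 0;  // 0 = success in this sketch
}

// Hypothetical async close: hand the blocking work to another thread and
// return immediately with a handle to the eventual result.
std::future<int> CloseAsyncSketch(std::shared_future<void> daemon_ready) {
    return std::async(std::launch::async, BlockingPrint, daemon_ready);
}

// Non-blocking completion check that an event loop could call on each tick.
bool IsDone(std::future<int>& job) {
    assert(job.valid());
    return job.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

A caller would keep the returned future and test it with `IsDone()` from the event loop instead of waiting on it. One caveat of this particular stand-in: the destructor of a future returned by `std::async` blocks until the task finishes, so a real thread pool or a detached worker avoids reintroducing the stall at destruction time.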
Tertiary (archive.cc, data_file.hh): Archive::SavePacked() did not detect write errors after the drawer and check loops. A disk-full or I/O error was silently swallowed and the archive was marked as saved. Fix: added OutputDataFile::HasError() (ferror/gzerror) and now check it after each critical write loop.
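The pattern of checking the stream's error state after each write loop can be sketched with plain stdio. The class and function below are hypothetical illustrations, not the real OutputDataFile or Archive::SavePacked(); only `fwrite`, `fflush`, and `ferror` are real library calls, and the gzip path (`gzerror`) is left out to keep the sketch free of zlib.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical minimal output wrapper around a stdio stream.
class OutputFileSketch {
public:
    explicit OutputFileSketch(std::FILE* fp) : fp_(fp) { assert(fp_ != nullptr); }

    // Write without checking the result, as a long write loop typically would.
    void Write(const std::string& s) {
        std::fwrite(s.data(), 1, s.size(), fp_);
    }

    // Report the sticky error indicator. If no error is recorded yet, flush
    // so that failures deferred by buffering (such as a full disk) surface.
    bool HasError() {
        if (std::ferror(fp_)) return true;
        return std::fflush(fp_) != 0;
    }

private:
    std::FILE* fp_;
};

// Returns true only if both write loops completed without a stream error,
// so the caller can decide whether to mark the archive as saved.
bool SavePackedSketch(std::FILE* fp,
                      const std::vector<std::string>& drawers,
                      const std::vector<std::string>& checks) {
    OutputFileSketch out(fp);

    for (const auto& d : drawers) out.Write(d);
    if (out.HasError()) return false;  // check after the drawer loop

    for (const auto& c : checks) out.Write(c);
    if (out.HasError()) return false;  // check after the check loop

    return true;
}
```

Checking once per loop rather than per write keeps the loops simple while still preventing a failed save from being reported as successful, because the stdio error indicator stays set until it is explicitly cleared.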