fix(udr): Add cleanup of UDR objects from Engine on attachment disconnect#8992
fix(udr): Add cleanup of UDR objects from Engine on attachment disconnect#8992TreeHunter9 wants to merge 1 commit intoFirebirdSQL:v5.0-releasefrom
Engine on attachment disconnect#8992Conversation
|
The only detail in your logic does not let me answer definitely 'yes'. You say that depending upon type of external engine you get crush or not. This is explained by creation of binary compatible or not object in RAM pointed to by dangling pointer. But pay attention - pre-6 external routines are created in attachment pool, which is destroyed together with attachment. If it was not for destroyed pool I could agree: possibility that object of same size is created in same block rather high. In a test we can expect same sequence of RAM allocation for old and new attachment making pool to have same blocks map. But in real life that hardly works, i.e. looks like external engine should segfault much more often. In all other aspects specially re v6 I agree with you - external routines were moved to database's pool, and no dangling pointers should arrive. |
|
Yes - now I've got what objects do you talk about. They are really in default pool, moreover - in plugin's default pool. In this case no more questions remain. |
There is a problem with UDR objects cleanup that can lead to server crash.
Detailed description of the Issue
As I understand:
Engineis the representation of plugin that is loaded lazy when a UDR function (or any other UDR object) is called;Engineis created, it createsUdrPluginImplwhich is basically loads UDR factory objects from plugin's dynamic library;SharedFunction) and store a pointer to this function insideEngine. The actual owner ofSharedFunctionis attachment;SharedFunctionis executed, it basically creates a copy of itself and store it in children map whereIExternalContext*is a key. Upon the next execution, the already created copy will be taken from children map;IExternalContextis created per attachment and per engine;So, after execution of UDR function, there will be one main
SharedFunctionand one child, but on attachment disconnect (Engine::closeAttachment()), only the child will be removed, while theSharedFunction*will still be stored inEngine. The attachment is dead, soSharedFunctiontoo, and now we have a dangling pointer insideSortedArray<class SharedFunction*> functions;. Referencing the dangling pointer itself doesn't cause the server to crash, but when a new object will be placed at address to which pointer is pointing to, this will become a problem.I found the way to reproduce the crash by using 2 UDR modules. I'm using http_client_udr and
udf_compat(comes with firebird), but I'm guessing that any UDR will do the job../udr.sh "/home/treehunter/Firebird_5/firebird/gen/Release/firebird" 100 10, it will create database in root folder, populate it with UDR functions and procedures, then it will wait;Enginefrom being unloaded when the script ends;Enginewith dangling pointers after attachment disconnect;select div(2, 1) from rdb$database. This is necessary to load the newEnginefromudf_compat;stacktrace
I have found that half of dangling pointers
SharedFunction*insideSortedArray<class SharedFunction*> functionsis pointing toSharedProcedure, but no crashes occur when iterating over these pointers. I suspect this is becauseSharedFunctionandSharedProcedureare binary compatible. However, when we adding new UDR engine to the game, theUdrPluginImplcan be created at the address whereSharedFunctionwas stored, and this would certainly cause a crash.Proposed fix
Track the created main UDR objects that is stored inside
Engineby attaching them to attachment, so that we can track them and remove them fromEnginewhen the attachment is disconnecting.Behavior on v6
The metacache has drastically change the logic of handling main UDR objects, they no longer belong to connections. It means that after creation of main
SharedFunctionit will be shared between connections, and execution from different attachment will now create a child. The child is normally removed on attachment disconnect, so there is no dangling pointer problem as I understand.For example, my test case from above do not reproduce the crash, and if look at
SortedArray<class SharedFunction*> functionsinsideEngine, on v5 there is around 200 elements (dangling pointers) due to creation ofSharedFuncionper attachment, but on v6 there is 2 elements - these are functions that are being called during the test (every new attachment creates only the child of these mainSharedFunctions, which are normally removed when attachment disconnects).I'm not very familiar with the new matacache code, but at first glance, there is no bug in v6. Maybe Alex can confirm this.