Normally, when an analysis fails due to load on the system, a code profiler can help you identify which step caused the failure, allowing you to put in a workaround to stop it from happening again.
In a DataSHIELD context this is both more serious and harder to solve. It is more serious because the person running the code does not control the server: if it crashes, there is a long delay while they email the server owner and wait for a restart. This has been a major usability issue for us in the past and in our current work on InterConnect.
In terms of solving the problem, you cannot just run the profiler on the client side; it needs to look at how the code is running on the server side. As a first attempt, the profiler could run on the client, at least to determine which call to the server causes the failure, but that still would not identify what is causing the failure on the server side. Perhaps we could develop a ds.profiler that wraps another call and runs it with a profiler on the server side, roughly as sketched below? I don't think it will be easy! :-)
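To make the idea concrete, here is a minimal sketch of what such a wrapper could look like. Everything here is hypothetical: the names profileDS and ds.profile are made up, and a real implementation would have to pass the wrapped expression through the DataSHIELD parser and ensure the returned profile summary is non-disclosive. It only illustrates that R's built-in Rprof could do the server-side measurement.

```r
# --- server side: a hypothetical aggregate function that profiles another call ---
profileDS <- function(expr.text) {
  prof.file <- tempfile(fileext = ".out")
  Rprof(prof.file, memory.profiling = TRUE)   # start R's sampling profiler
  result <- try(eval(parse(text = expr.text)), silent = TRUE)
  Rprof(NULL)                                 # stop profiling
  # Return only the per-function timing/memory summary, never the data itself
  list(ok      = !inherits(result, "try-error"),
       profile = summaryRprof(prof.file, memory = "both")$by.self)
}

# --- client side: wrap a server call and fetch the profile via DSI ---
ds.profile <- function(expr.text, datasources) {
  cally <- call("profileDS", expr.text)
  DSI::datashield.aggregate(datasources, cally)
}
```

Of course, this only helps when the wrapped call finishes; if the server runs out of memory mid-call, the profile would be lost with the session, which is where the external logging discussed below comes in.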
FYI, the new R server, Rock, checks every 10 seconds whether the main R process has died and, if so, automatically restarts it. There is no longer any need for manual intervention from the server administrator.
As the server-side environment can disappear entirely (an R server crash due to memory shortage), any profiling (or logging) information must be sent outside the R server session workspace for later retrieval. For logging, logger appears to be the recommended package: it supports custom log appenders, i.e. the log message can be sent to an external application instead of simply being printed to stdout.
Then one option would be to define an API for logging information in the DataSHIELD context (most naturally in the DS middleware, Opal) and for making this log available to the end user (if not disclosive!). A sketch of the appender idea follows.
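A hedged sketch of the custom-appender approach: in the logger package an appender is just a function that receives the formatted log lines, so each message can be shipped to an endpoint outside the R session before a crash takes the workspace with it. The collector URL below is invented for illustration; a real setup would point at whatever logging endpoint the middleware exposes.

```r
library(logger)

# Replace the default console appender with one that forwards every line
# to an external collector (hypothetical URL), so log messages survive
# even if the R session is killed moments later.
log_appender(function(lines) {
  for (line in lines) {
    try(httr::POST("http://log-collector.example.org/datashield",
                   body = list(message = line), encode = "json"),
        silent = TRUE)  # never let logging itself crash the analysis
  }
})

log_threshold(TRACE)
log_info("Starting server-side computation step")
```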
I want to add @neelsoumya to this so that he can see the updates.
It sounds like Rock will help with some of the current downsides. In my experience, though, the server gets completely swamped and unresponsive (all memory and all virtual memory consumed) and the whole thing needs restarting, not just the problem session. Maybe the next time it happens with our own server I will try removing the sessions (if I can get to the web UI!).