Normally, when an analysis fails due to load on the system, a code profiler can help you identify which step caused the failure, allowing you to put in a workaround to stop it from happening again.
In a DataSHIELD context this is both more serious and harder to solve. It is more serious because the person running the code does not control the server: if it crashes, there is a long delay while they email the server owner and wait for a restart. This has been a major usability issue for us in the past and in our current work on InterConnect.
In terms of solving the problem, you cannot just run the profiler on the client side; it needs to look at how the code is running on the server side. As a first attempt, the profiler could run on the client, at least to determine which call to the server causes the failure, but that still would not identify what is causing the failure on the server side. Perhaps we could develop a ds.profiler that wraps another call and runs it with a profiler on the server side, roughly as sketched below? I don't think it will be easy! :-)
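To make the idea concrete, here is a minimal sketch of what such a wrapper could look like. Everything here is hypothetical: the names profileDS and ds.profile are made up, and a real implementation would have to pass the wrapped expression through the DataSHIELD parser and ensure the returned profile summary is non-disclosive. It only illustrates that R's built-in Rprof could do the server-side measurement.

```r
# --- server side: a hypothetical aggregate function that profiles another call ---
profileDS <- function(expr.text) {
  prof.file <- tempfile(fileext = ".out")
  Rprof(prof.file, memory.profiling = TRUE)   # start R's sampling profiler
  result <- try(eval(parse(text = expr.text)), silent = TRUE)
  Rprof(NULL)                                 # stop profiling
  # Return only the per-function timing/memory summary, never the data itself
  list(ok      = !inherits(result, "try-error"),
       profile = summaryRprof(prof.file, memory = "both")$by.self)
}

# --- client side: wrap a server call and fetch the profile via DSI ---
ds.profile <- function(expr.text, datasources) {
  cally <- call("profileDS", expr.text)
  DSI::datashield.aggregate(datasources, cally)
}
```

Of course, this only helps when the wrapped call finishes; if the server runs out of memory mid-call, the profile would be lost with the session, which is where the external logging discussed below comes in.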
FYI, the new R server, Rock, checks every 10 seconds whether the main R process has died and, if so, automatically restarts it. There is no longer any need for manual intervention from the server administrator.
As the server-side environment can disappear entirely (an R server crash due to memory shortage), any profiling (or logging) information must be sent outside the R server session workspace for later retrieval. For logging, logger appears to be the recommended package: it supports custom log appenders, i.e. the log message can be sent to an external application instead of simply being printed to stdout.
Then one option would be to define an API for logging information in the DataSHIELD context (most naturally in the DS middleware, Opal) and for making this log available to the end user (if not disclosive!). A sketch of the appender idea follows.
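A hedged sketch of the custom-appender approach: in the logger package an appender is just a function that receives the formatted log lines, so each message can be shipped to an endpoint outside the R session before a crash takes the workspace with it. The collector URL below is invented for illustration; a real setup would point at whatever logging endpoint the middleware exposes.

```r
library(logger)

# Replace the default console appender with one that forwards every line
# to an external collector (hypothetical URL), so log messages survive
# even if the R session is killed moments later.
log_appender(function(lines) {
  for (line in lines) {
    try(httr::POST("http://log-collector.example.org/datashield",
                   body = list(message = line), encode = "json"),
        silent = TRUE)  # never let logging itself crash the analysis
  }
})

log_threshold(TRACE)
log_info("Starting server-side computation step")
```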
I want to add @neelsoumya to this so that he can see the updates.
It sounds like Rock will help with some of the current downsides. In my experience, though, the server gets completely swamped and unresponsive (all memory and all virtual memory consumed) and the whole thing needs restarting, not just the problem session. Maybe the next time it happens with our own server I will try removing the sessions (if I can get to the web UI!).