embedded - How to identify what is stalling the system in Linux?

I have an embedded system. When I do user I/O operations, the system stalls and only performs the action after a long time. The system is quite complex and has many processes running. My question is: how can I identify what is making the system stall? It does nothing for literally 5 minutes, and only after those 5 minutes do I see the result. I really don't know what is stalling the system. Any inputs on how to debug this issue are welcome.

I have run top on the system, but it doesn't point to any issue. As you can see below, jup_render is taking 30% of the CPU, which is not enough to stall the system. So I am not sure whether top is useful here or not.

~ # top

top - 12:01:05 up 21 min,  1 user,  load average: 1.49, 1.26, 0.87
Tasks: 116 total,   2 running, 114 sleeping,   0 stopped,   0 zombie
Cpu(s): 44.4%us, 13.9%sy,  0.0%ni, 40.3%id,  0.0%wa,  0.0%hi,  1.4%si,  0.0%st
Mem:    822572k total,   389640k used,   432932k free,     1980k buffers
Swap:        0k total,        0k used,        0k free,   227324k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
  850 root      20   0  309m  32m  16m S   30  4.0   3:10.88 jup_render
  870 root      20   0  221m  13m  10m S   27  1.7   2:28.78 jup_render
  688 root      20   0 1156m 4092 3688 S   11  0.5   1:25.49 rxserver
    9 root      20   0     0    0    0 S    2  0.0   0:06.81 ksoftirqd/1
   16 root      20   0     0    0    0 S    1  0.0   0:06.87 ksoftirqd/3
 9294 root      20   0  1904  616  508 R    1  0.1   0:00.10 top
  812 root      20   0  865m  85m  46m S    1 10.7   1:21.17 lippo_main
   13 root      20   0     0    0    0 S    1  0.0   0:06.59 ksoftirqd/2
  800 root      20   0  223m 8316 6268 S    1  1.0   0:08.30 rat-cadaemon
    3 root      20   0     0    0    0 S    1  0.0   0:05.94 ksoftirqd/0
 1456 root      20   0 80060  10m 8208 S    1  1.2   0:04.82 jup_render
 1330 root      20   0  202m  10m 8456 S    0  1.3   0:06.08 jup_render
 8905 root      20   0  1868  556  424 S    0  0.1   0:02.91 dropbear
 1561 root      20   0 80084  10m 8204 S    0  1.2   0:04.92 jup_render
  753 root      20   0 61500 7376 6184 S    0  0.9   0:04.06 ale_app
 1329 root      20   0 79908   9m 8208 S    0  1.2   0:04.77 jup_render
  631 dbus      20   0  3248 1636  676 S    0  0.2   0:13.10 dbus-daemon
 1654 root      20   0 80068  10m 8204 S    0  1.2   0:04.82 jup_render
  760 root      20   0  116m  15m  12m S    0  1.9   0:10.19 jup_server
    8 root      20   0     0    0    0 S    0  0.0   0:00.00 kworker/1:0
    2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
    7 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
  170 root       0 -20     0    0    0 S    0  0.0   0:00.00 kblockd
    6 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
  167 root      20   0     0    0    0 S    0  0.0   0:00.00 sync_supers
  281 root       0 -20     0    0    0 S    0  0.0   0:00.00 nfsiod

For an embedded system that has many processes running, there can be a multitude of reasons for a stall. You may need to investigate it from several perspectives.

Check your code for race conditions and deadlocks. The kernel might be busy looping in some condition. Your application could be waiting on a select call, or it could be blocked on a read. It could also be starved of CPU, but that possibility is ruled out by the output of the top command you shared.

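If you are performing blocking I/O operations, the process is put on a wait queue and only moves back to the execution path (the ready queue) after the request completes. That is, it is taken off the scheduler run queue and marked with a special sleeping state. It is put back on the run queue only when it is woken from that sleep or the resource it was waiting for becomes available.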
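
One way to see that special state from user space is to look for tasks in uninterruptible sleep (state D) under /proc. The following is a minimal sketch, not a definitive tool: it assumes Python is available on the target, the function names are made up for illustration, it only looks at each process's main thread, and the wchan file may be empty or restricted depending on the kernel configuration.

```python
import os

def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """List (pid, comm, wchan) for processes in uninterruptible sleep (state D)."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state != "D":
            continue
        # wchan names the kernel function the task sleeps in, when exposed.
        try:
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip() or "?"
        except OSError:
            wchan = "?"
        found.append((pid, comm, wchan))
    return found

if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(pid, comm, wchan)
```

Running this a few times during the 5 minute stall and seeing the same PID in state D each time would suggest that process is stuck waiting inside the kernel, and the wchan value hints at where.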

The immediate step should be to try 'strace'. It intercepts and records the system calls made by a process and the signals the process receives. It can show the order of events and all the return/resumption paths of the calls, which can take you much closer to the problem area.
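
Once a trace exists, the long stall usually shows up as one call with a very large duration. The sketch below assumes the trace was captured with something like `strace -f -tt -T -p <pid> -o trace.txt` (the file name is just an example); the -T option makes strace append the time spent in each call as `<seconds>` at the end of the line. The function name and the sample lines are made up for illustration.

```python
import re

# strace -T appends the time spent inside each call as "<seconds>" at line end.
DURATION = re.compile(r"<(\d+\.\d+)>\s*$")

def slow_calls(lines, threshold=1.0):
    """Return (seconds, line) pairs for calls lasting at least `threshold` seconds, slowest first."""
    hits = []
    for line in lines:
        match = DURATION.search(line)
        if match and float(match.group(1)) >= threshold:
            hits.append((float(match.group(1)), line.rstrip("\n")))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)

if __name__ == "__main__":
    # Illustrative lines only; a real trace would be read from the -o file.
    sample = [
        '850 12:01:05.100000 read(7, "x", 1) = 1 <0.000040>',
        '850 12:01:05.200000 select(8, [7], NULL, NULL, NULL) = 1 (in [7]) <298.512345>',
    ]
    for seconds, line in slow_calls(sample):
        print(f"{seconds:10.3f}s  {line}")
```

For a real trace, pass the open file object instead of the sample list, for example `slow_calls(open("trace.txt"))`. The file descriptor in the slow call can then be matched against /proc/<pid>/fd to see what the process was waiting on.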

There are many other handy tools that can be tried, depending on your development environment/setup. A few key tools are listed below:

'iotop' - provides a table of current I/O usage by processes or threads on the system, by monitoring the I/O usage information output by the kernel.

'LTTng' - makes tracing of race conditions and interrupt cascades possible. It is the successor to LTT and combines kprobes, tracepoint and perf functionality.

'ftrace' - the Linux kernel's internal tracer, with which you can analyze/debug latency and performance related issues.

If your system is based on a TI processor, CCS (Trace Analyzer) provides the capability to perform non-intrusive debug and analysis of system activity. So note that, depending on your setup, you may need to use the tool that is relevant to it.
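
If iotop cannot be installed on the target, the per-process I/O counters it relies on can also be read directly from /proc/<pid>/io. This is a rough sketch under a few assumptions: the kernel was built with task I/O accounting (CONFIG_TASK_IO_ACCOUNTING), the script runs as root, and the function names are made up for illustration. Unlike iotop, it reports cumulative byte counts since each process started rather than rates, so two runs need to be compared to see what changed during the stall.

```python
import os

def parse_io(text):
    """Turn the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters

def io_totals(proc_root="/proc"):
    """Return (bytes, pid, name) for each readable process, busiest first."""
    rows = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return rows  # no procfs at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "io")) as f:
                io = parse_io(f.read())
            with open(os.path.join(base, "comm")) as f:
                name = f.read().strip()
        except (OSError, ValueError):
            continue  # exited, not permitted, or I/O accounting not built in
        # read_bytes/write_bytes count traffic that reached the storage layer.
        total = io.get("read_bytes", 0) + io.get("write_bytes", 0)
        rows.append((total, int(entry), name))
    return sorted(rows, reverse=True)

if __name__ == "__main__":
    for total, pid, name in io_totals()[:10]:
        print(f"{total:>14} {pid:>6} {name}")
```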

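I came across a few more ideas: the Magic SysRq key is another option in Linux. If a driver is stuck, the SysRq 'p' command, which dumps the current registers and flags to the kernel log, can take you to the exact routine that is causing the problem.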
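
On a headless board with no keyboard, the same SysRq commands can be issued by writing the command character to /proc/sysrq-trigger. The helper below is only a sketch: it assumes a kernel built with CONFIG_MAGIC_SYSRQ and a root shell, the function name is made up, and the table lists just a few of the available commands. The output goes to the kernel log, so read it afterwards with dmesg.

```python
# A few SysRq command characters and what they dump to the kernel log.
COMMANDS = {
    "p": "registers and flags of the current CPU",
    "l": "stack backtrace of all active CPUs",
    "w": "tasks in uninterruptible (blocked) state",
    "t": "list of current tasks and their information",
}

def sysrq(key, trigger="/proc/sysrq-trigger"):
    """Send one of the SysRq commands above through the procfs trigger file."""
    if key not in COMMANDS:
        raise ValueError(f"unsupported SysRq command: {key!r}")
    with open(trigger, "w") as f:
        f.write(key)
    return COMMANDS[key]
```

For a stall like the one described, calling `sysrq("w")` while the system is hung and then checking dmesg would list the blocked tasks together with their kernel stack traces.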

Profiling can tell you where exactly the time is being spent by the kernel. There are a couple of tools for this, such as readprofile and oprofile. oprofile can be enabled by configuring the kernel with CONFIG_PROFILING and CONFIG_OPROFILE. Another option is to rebuild the kernel with the profiling option enabled, boot it with profile=2 on the kernel command line, and read the profile counters using the readprofile utility.

mpstat can give 'the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request' in its %iowait column.
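
The same iowait figure can be derived from two samples of the aggregate "cpu" line in /proc/stat, which is useful if the sysstat package is not on the target. This is a minimal sketch with made-up function names. Note that the top output in the question shows 0.0%wa, so the sample would have to be taken while the stall is actually happening.

```python
import time

# Order of the first eight counters on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate cpu line found")

def iowait_percent(before, after):
    """Share of elapsed CPU time spent idle with I/O outstanding, in percent."""
    elapsed = sum(after.values()) - sum(before.values())
    if elapsed <= 0:
        return 0.0
    return 100.0 * (after["iowait"] - before["iowait"]) / elapsed

def sample_iowait(interval=1.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return the iowait percentage."""
    with open(path) as f:
        first = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.read())
    return iowait_percent(first, second)
```

A high value from `sample_iowait()` during the stall points towards storage or another blocking device. A value near zero suggests the wait is elsewhere, for example on a lock, a socket or a select call.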

linux embedded embedded-linux
