When does a synchronous execution command complete in non-stop mode?
Doug Evans
2014-10-19 18:51:45 UTC
Hi.
I think(!) I've found a problem in non-stop support,
but would like to verify my understanding before I file a bug report.
[And if there is already a bug, please point me at it. Thanks.]

The simple form of my question is (phrased in two parts):
Should a synchronous command (say "continue", note no "-a" and no "&")
complete (as in return you to the command prompt) if another already running
thread hits a breakpoint before the current thread stops?
And if that's ok, is any claim that "continue" != "continue &" true only
in some circumstances (and thus essentially broken, because when you get
the prompt back there is no guarantee that the thread you resumed has
stopped, which is what "synchronous" implies)?

As a user, I would expect the answer to the first question to be "No.",
in which case the second question is moot.
My reasoning for this is that this is the intuitive interpretation of a
synchronous "continue" (no "&"). Otherwise, my second question kicks in
and I can imagine more users scratching their heads.
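
Stated as a rule, the intuitive interpretation could be sketched as below.
This is a toy model in C, not GDB code: toy_stop and toy_prompt_index are
made-up names, and the only assumption is that each stop event carries the
id of the thread that reported it. Stops from other threads are still
reported, but only a stop of the resumed thread ends the command.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one stop event: which thread reported it.  */
struct toy_stop
{
  int thread_id;
};

/* Return the index of the stop event at which a synchronous command that
   resumed RESUMED_THREAD would give the prompt back under the intuitive
   rule, or -1 if that thread does not stop within the N events given.
   Stops reported by other threads never end the command.  */
static int
toy_prompt_index (const struct toy_stop *stops, size_t n, int resumed_thread)
{
  assert (stops != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    if (stops[i].thread_id == resumed_thread)
      return (int) i;
  return -1;
}
```

With the events from the example below (thread 5 stops first, then thread 1),
a "continue" issued on thread 1 would finish at the second event, not the
first.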

Here's a simple example:

---snip-begin-example---
(gdb) file non-stop # source appended below
(gdb) set non-stop on
(gdb) b break_here
(gdb) r
# Wait until all threads are stopped at break_here.
# Now type the following quickly.
(gdb) thr 5
(gdb) c &
(gdb) thr 1
(gdb) c
# Wait for gdb prompt.
(gdb) i thr
---snip-end-example---

Here's what I see after the "c".
The breakpoint is from thread 5, not thread 1.

Continuing.

Breakpoint 1, break_here () at non-stop.c:14
14 }
(gdb)

Note I've got a prompt now.
Here's what I see for the output of "i thr":

  Id   Target Id                                        Frame
  5    Thread 0x7ffff66d5700 (LWP 25596) "non-stop.x64" break_here () at non-stop.c:14
  4    Thread 0x7ffff6f26700 (LWP 25595) "non-stop.x64" break_here () at non-stop.c:14
  3    Thread 0x7ffff7777700 (LWP 25594) "non-stop.x64" break_here () at non-stop.c:14
  2    Thread 0x7ffff7fc8700 (LWP 25593) "non-stop.x64" break_here () at non-stop.c:14
* 1    Thread 0x7ffff7fca780 (LWP 25589) "non-stop.x64" (running)
(gdb)

And then a few seconds later:

Breakpoint 1, break_here () at non-stop.c:14
14 }

I typed "c", not "c &", and yet thread #1 was still running.
Eh?

Here's another example with "step N", N > 1.
Continue where the previous example left off.

---snip-begin-example---
(gdb) i thr # Verify all threads are stopped at break_here
(gdb) set var sleep_in_main = 0
(gdb) thr 5
(gdb) c &
(gdb) thr 1
(gdb) step 10000 # N chosen to be large enough but not too large
# Wait for gdb prompt.
(gdb) i thr
---snip-end-example---

-->

  Id   Target Id                                        Frame
  5    Thread 0x7ffff66d5700 (LWP 25719) "non-stop.x64" break_here () at non-stop.c:16
  4    Thread 0x7ffff6f26700 (LWP 25718) "non-stop.x64" break_here () at non-stop.c:16
  3    Thread 0x7ffff7777700 (LWP 25717) "non-stop.x64" break_here () at non-stop.c:16
  2    Thread 0x7ffff7fc8700 (LWP 25716) "non-stop.x64" break_here () at non-stop.c:16
* 1    Thread 0x7ffff7fca780 (LWP 25715) "non-stop.x64" (running)

And then a few seconds later:

(gdb) 38 if (sleep_in_main)

Here again the synchronous step command "completed" but the thread
is still running. Plus, with "set debug infrun 1" turned on,
one can see the continuation is still firing, stepping the thread.
The command has completed as far as the user is concerned
(I got my prompt back), but internally the command is still executing!

The step commands use an intermediate continuation that is, umm, continually
re-added to handle each iteration. The implementation calls
inferior_event_handler with INF_EXEC_CONTINUE until the step is done,
at which point it calls it with INF_EXEC_COMPLETE.

infrun.c:
      if (target_has_execution
          && ecs->ws.kind != TARGET_WAITKIND_NO_RESUMED
          && ecs->ws.kind != TARGET_WAITKIND_EXITED
          && ecs->ws.kind != TARGET_WAITKIND_SIGNALLED
          && ecs->event_thread->step_multi
          && ecs->event_thread->control.stop_step)
        inferior_event_handler (INF_EXEC_CONTINUE, NULL);
      else
        {
          inferior_event_handler (INF_EXEC_COMPLETE, NULL);
          cmd_done = 1;
        }

But here the "step N" hadn't completed when the prompt was printed
and input was returned to the user.
I can imagine there are multiple issues here: maybe the first example
is sorta/mostly/all ok, whereas this example is less ok but still not
totally wrong.
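
One possible reading of the quoted test, as a toy model in C. This is not
GDB code: the toy_* names are hypothetical, the target_has_execution and
waitkind checks are left out, and it assumes the prompt comes back on the
first event classified as "complete", whichever thread reported it. Under
those assumptions, a stop from a thread that is not in a multi-step takes
the else branch and ends the command while the stepping thread still has
iterations left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for INF_EXEC_CONTINUE / INF_EXEC_COMPLETE.  */
enum toy_event { TOY_EXEC_CONTINUE, TOY_EXEC_COMPLETE };

struct toy_thread
{
  int id;
  int steps_left;   /* remaining iterations of a "step N"; 0 if none */
  bool stop_step;   /* last stop was the end of a single step */
};

/* Same shape as the quoted test: keep going only if the thread that
   reported the event is itself in the middle of a multi-step.  */
static enum toy_event
toy_classify_stop (const struct toy_thread *event_thread)
{
  bool step_multi = event_thread->steps_left > 0;

  if (step_multi && event_thread->stop_step)
    return TOY_EXEC_CONTINUE;
  return TOY_EXEC_COMPLETE;
}

/* Feed stop events (given as indices into THREADS) until one of them is
   classified as complete.  Return how many step iterations the thread at
   index STEPPER still has left at that moment, or -1 if no event
   completed the command.  */
static int
toy_steps_left_at_prompt (struct toy_thread *threads, const int *events,
                          size_t n_events, int stepper)
{
  for (size_t i = 0; i < n_events; i++)
    {
      struct toy_thread *t = &threads[events[i]];

      if (events[i] == stepper)
        {
          /* The stepping thread finished one single step.  */
          assert (t->steps_left > 0);
          t->steps_left--;
          t->stop_step = true;
        }

      if (toy_classify_stop (t) == TOY_EXEC_COMPLETE)
        return threads[stepper].steps_left;
    }
  return -1;
}
```

For instance, with thread 1 doing a "step 3" and thread 5 hitting a
breakpoint right after thread 1's first step, the model hands back the
prompt with two steps still pending on thread 1. With no interference
from thread 5, it hands back the prompt with zero steps pending.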

Looking at the implementation, I can explain why I see what I see.
It's not clear to me that this is the intended behaviour though.
At the least more clarity is needed, in the code and in the Non-Stop Mode
section of the docs. I'm working on a patch, but I need to know whether
it's at least heading in the right direction.

--- non-stop.c ---
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef NR_THREADS
#define NR_THREADS 4
#endif

int sleep_in_main = 1;

pthread_t threads[NR_THREADS];

static void
break_here (void)
{
}

static void *
thread_entry (void *unused)
{
  while (1)
    {
      sleep (4);
      break_here ();
    }
}

static void
all_threads_running (void)
{
}

static void
do_something (void)
{
  while (1)
    {
      if (sleep_in_main)
        {
          sleep (10);
          break_here ();
        }
    }
}

int
main (int argc, char *argv[])
{
  int i;

  alarm (6000);

  for (i = 0; i < NR_THREADS; ++i)
    pthread_create (&threads[i], NULL, thread_entry, NULL);

  all_threads_running ();

  while (1)
    do_something ();

  return 0;
}
