RFC: Available registers as a target property
Daniel Jacobowitz
2005-05-06 16:20:29 UTC
Please bear with me; this is so long-winded I felt the need to give it
section titles. I'm interested in any and all comments about the problem or
about my solution. I hope to start implementing within a couple of weeks.

INTRODUCTION
============

Today, the contents of the register cache and the layout of GDB's regnum
space are determined by the gdbarch. There are several hooks for this,
primarily these three:

num_regs
register_name
register_type

The gdbarch determines what raw registers are available. But this isn't a
perfect match with what raw registers are _really_ available, because the
gdbarch has only the clues we use to select it: things like byte order and
BFD machine number. At best, those tell us what registers the binary we're
debugging requires. The runtime set of registers we can see is a property
of the target, not of the gdbarch.

Here are a couple of examples of existing workarounds for this problem.
The first is a comment in i386-tdep.c:

/* The default settings include the FPU registers, the MMX registers
   and the SSE registers. This can be overridden for a specific ABI
   by adjusting the members `st0_regnum', `mm0_regnum' and
   `num_xmm_regs' of `struct gdbarch_tdep', otherwise the registers
   will show up in the output of "info all-registers". Ideally we
   should try to autodetect whether they are available, such that we
   can prevent "info all-registers" from displaying registers that
   aren't available.

   NOTE: kevinb/2003-07-13: ... if it's a choice between printing
   [the SSE registers] always (even when they don't exist) or never
   showing them to the user (even when they do exist), I prefer the
   former over the latter. */

Currently we always display the SSE registers, and fill them in with dummy
values. If we knew whether they were available, we could avoid displaying
them entirely.

rs6000-tdep.c:rs6000_gdbarch_init has related problems: registers whose size
and type change depending on the architecture. Here's one I encountered
recently:

/* For e500 executables, the apuinfo section is of help here. Such
   section contains the identifier and revision number of each
   Application-specific Processing Unit that is present on the
   chip. The content of the section is determined by the assembler
   which looks at each instruction and determines which unit (and
   which version of it) can execute it. In our case we just look for
   the existence of the section. */

There are a number of ways to end up with binaries which will run on an e500
chip but not have an apuinfo section. The presence of the section is used
to select bfd_mach_ppc_e500, which in turn is used to select registers_e500.
This is one of the many possible PPC register layouts, but it's particularly
interesting because it changes the size of registers 32-63, which moves
register 64 (the PC) to a different offset in the 'g' packet. Most of the
other "common PPC" layouts are compatible enough that if you get the wrong
one, debugging will still work. This one isn't. GDB needs to know whether
it is talking to an e500 stub or not.

The MIPS targets have another variant of the problem; they don't know
whether the target provides 32-bit or 64-bit registers, because code
compiled for 32-bit can run on targets with 64-bit hardware registers - and,
in some cases, be influenced by corruption in the upper halves of the
registers. So GDB is doing the user a disservice by only displaying 32-bit
registers if it can see the 64-bit registers.

And so on; you get the picture :-)

ONE EXISTING SOLUTION
=====================

On the csl-arm release branch, Paul and I developed a patch specific to the
ARM VFP and Xscale iWMMXt coprocessors. It uses the xfer_partial interface
to query the target to describe the available registers, and then creates a
new gdbarch based on that information if it does not match the register
layout used by the current gdbarch. Then there are three additional hooks,
covering native, sim, and remote. I would like to propose a similar
interface for HEAD - the one on the branch is not suitable as-is.

Here are the branch patches, for reference:
http://sourceware.org/ml/gdb-patches/2005-03/msg00370.html
http://sourceware.org/ml/gdb-patches/2005-03/msg00387.html

The interface on the branch is based in ARM-specific code instead of
common code. There is an inferior_created observer which calls
arm_update_architecture. That function uses target_read_partial (sloppily)
to fetch a target-specific string. The currently supported values
of the string are:
iwmmxt
iwmmxt:<hex number>
vfp
vfp:<hex number>

The word refers to an optional register set which is present on the target.
The hex number, if present, is a target-specific number used as a base for
the register set. For instance, iwmmxt:30 means that the iWMMXt registers
are present, and wr0 (the first register) is number 0x30. The number gets
saved away in the tdep structure, and a new hook for p/P packet support uses
this if the target we are connected to uses the remote protocol. The sim
and native targets don't have separate numbering so they ignore this value.
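
As a rough illustration of the string format described above, here is a
minimal C sketch of parsing one such value. It is not the code from the
branch patches; the struct and the function name parse_regset_desc are
hypothetical, and only the "word" and "word:<hex number>" forms listed above
are handled.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed description; the branch keeps
   the base in its tdep structure instead.  */
struct regset_desc
{
  char name[32];        /* Register set name, e.g. "iwmmxt" or "vfp".  */
  int has_base;         /* Nonzero if ":<hex number>" was present.  */
  unsigned long base;   /* Protocol number of the set's first register.  */
};

/* Parse "NAME" or "NAME:HEX" into OUT.  Return 0 on success and -1 if
   TEXT is malformed.  */
int
parse_regset_desc (const char *text, struct regset_desc *out)
{
  const char *colon = strchr (text, ':');
  size_t name_len = colon != NULL ? (size_t) (colon - text) : strlen (text);
  char *end;

  if (name_len == 0 || name_len >= sizeof out->name)
    return -1;
  memcpy (out->name, text, name_len);
  out->name[name_len] = '\0';
  out->has_base = 0;
  out->base = 0;

  if (colon == NULL)
    return 0;                   /* No base given, e.g. plain "vfp".  */
  if (!isxdigit ((unsigned char) colon[1]))
    return -1;                  /* Empty or non-hex base.  */
  out->base = strtoul (colon + 1, &end, 16);
  if (*end != '\0')
    return -1;                  /* Trailing junk after the number.  */
  out->has_base = 1;
  return 0;
}
```

With this sketch, "iwmmxt:30" yields the name "iwmmxt" and base 0x30, while
plain "vfp" yields a name with no base, which matches what a sim or native
target would need.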

Some shortcomings in the branch implementation:
- I never implemented support for multiple responses. It wasn't necessary
  since only two register sets were implemented; there exists today no
  ARM core with both extensions.
- It uses an observer. While I do like this sort of use of observers in
  general, it's not appropriate here; this should happen ASAP after
  connecting to the target, because until it does we may not be able to
  read registers reliably.
- There's no common code infrastructure for this, which I consider a must.
  I don't want targets to reinvent more than necessary.

Also, it operates at an "optional feature" level rather than an "optional
register" level. The ARM RDI protocol has a nifty feature called
Self-Describing Modules, which allows coprocessors to describe themselves to
the debugger, including describing their register sets. It includes both
user-level information (name and type - along with a complicated type
description language) and implementation information (like the ARM mode in
which the register is accessible, for banked registers). I would like
the GDB solution to this problem to be sufficiently flexible to work with
SDM - both because it's a nice model and because that way we can be
compatible with ARM debug servers, given an adequate RDI proxy.

A PROPOSAL
==========

Here's my current idea for an improved interface. I have not implemented
any of this yet, only the older interface I described above. It does borrow
heavily from that implementation.

After connecting to a target, GDB checks the current gdbarch for a new
method, gdbarch_set_available_registers. If the architecture does not
provide this method, the rest of the process is skipped.

GDB then reads the TARGET_OBJECT_AVAILABLE_REGISTERS object from the target,
parses it, and hands it to the gdbarch for final processing. This means
that the object must have a target-independent format, although it will
have target-dependent content also.

The target calls gdbarch_update_p with an appropriately filled in argument,
which calls its gdbarch_init routine, which can then do the real work of
updating gdbarch_num_regs et cetera. This means that the gdbarch_init
routine must correctly handle filling in defaults based on the last
architecture. That code is a bit fragile because it's undertested; I
recently updated ARM to do this robustly.

DETAILS
=======

First of all, the target object. It can describe either individual
registers or register sets known to the target (for brevity). Each
component is an ASCII string. Colon is used as a field delimiter and
semicolon as a component delimiter. A register set would look like:

set:<NAME>:<PROTOCOL NUMBER>

No more information is necessary; the register set is an abbreviation of a
well-defined group of registers that both the stub and GDB have external
knowledge of. GDB will already know the order, types, and sizes of
registers, and potentially other details (such as how to pass them as
arguments to functions). If GDB does not recognize the register set, it can
safely ignore it, but should issue a warning to the user recommending use of
a later GDB. If the protocol does not require numbers, they will be
ignored, but they are non-optional in the syntax.

I have spent less time thinking about how to specify individual registers.
This should suffice, but if anyone can see cause for another standard field,
please speak up.

reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...

Types unknown to GDB would default to integral display; common types such as
integral, floating point (native byte order), integral vector, fp vector, et
cetera would be documented in the manual with fixed names.
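
To make the two component formats concrete, here is a minimal C sketch of a
parser for a single component. Nothing here is implemented in GDB; the names
(struct component, parse_component) are hypothetical. The sketch also assumes
that the protocol number is hexadecimal, as on the branch, and that the bit
size is decimal, neither of which the proposal pins down. Splitting the whole
object on semicolons would be a loop around this function.

```c
#include <stdlib.h>
#include <string.h>

enum component_kind { COMPONENT_SET, COMPONENT_REG, COMPONENT_UNKNOWN };

/* Hypothetical parsed form of one "set:" or "reg:" component.  */
struct component
{
  enum component_kind kind;
  char name[32];
  long protocol_number;   /* Assumed to be hexadecimal on the wire.  */
  long bitsize;           /* "reg" only; assumed to be decimal.  */
  char type[32];          /* "reg" only.  */
};

/* Return the next colon-delimited field of *CURSOR, or NULL if none.  */
static char *
next_field (char **cursor)
{
  char *start = *cursor;
  char *colon;

  if (start == NULL)
    return NULL;
  colon = strchr (start, ':');
  if (colon != NULL)
    *colon++ = '\0';
  *cursor = colon;
  return start;
}

static int
copy_field (char *dst, size_t dst_size, const char *src)
{
  if (src == NULL || *src == '\0' || strlen (src) >= dst_size)
    return -1;
  strcpy (dst, src);
  return 0;
}

static int
parse_number (const char *src, int base, long *out)
{
  char *end;

  if (src == NULL || *src == '\0')
    return -1;
  *out = strtol (src, &end, base);
  return *end == '\0' ? 0 : -1;
}

/* Parse one component.  Return 0 on success (including an unrecognized
   label, which the caller can warn about and skip) and -1 on bad syntax.  */
int
parse_component (const char *text, struct component *out)
{
  char buf[128];
  char *cursor = buf;
  char *label;

  if (strlen (text) >= sizeof buf)
    return -1;
  strcpy (buf, text);
  memset (out, 0, sizeof *out);

  label = next_field (&cursor);
  if (strcmp (label, "set") == 0)
    out->kind = COMPONENT_SET;
  else if (strcmp (label, "reg") == 0)
    out->kind = COMPONENT_REG;
  else
    {
      out->kind = COMPONENT_UNKNOWN;
      return 0;
    }

  if (copy_field (out->name, sizeof out->name, next_field (&cursor)) != 0
      || parse_number (next_field (&cursor), 16, &out->protocol_number) != 0)
    return -1;
  if (out->kind == COMPONENT_SET)
    return 0;

  if (parse_number (next_field (&cursor), 10, &out->bitsize) != 0
      || copy_field (out->type, sizeof out->type, next_field (&cursor)) != 0)
    return -1;
  return 0;   /* Whatever remains is target data; ignored in this sketch.  */
}
```

Returning success for an unrecognized label mirrors the rule above that an
unknown register set can be safely ignored with a warning.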



The remote protocol would use a qPart packet to implement this. That means
the data would go over the wire hex encoded. I would probably end up adding
some more intelligent decoding to "set debug remote" so that I could see the
hex-decoded form of this data, since it would be printable. I've wanted
that before (for qSymbol debugging).
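
For reference, decoding such a hex-encoded payload back into printable text
could look roughly like the C sketch below. The function name hex_decode is
hypothetical and this is not taken from remote.c; it only assumes two hex
digits per byte.

```c
#include <stddef.h>

/* Return the value of hex digit C, or -1 if C is not a hex digit.  */
static int
hex_digit_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Decode the NUL-terminated hex string HEX into OUT and NUL-terminate
   the result.  Return the number of decoded bytes, or -1 if HEX has an
   odd length, contains a non-hex character, or does not fit in OUT.  */
long
hex_decode (const char *hex, char *out, size_t out_size)
{
  size_t n = 0;

  if (out_size == 0)
    return -1;
  while (hex[0] != '\0')
    {
      int hi = hex_digit_value (hex[0]);
      int lo = hi < 0 ? -1 : hex_digit_value (hex[1]);

      if (hi < 0 || lo < 0 || n + 1 >= out_size)
        return -1;
      out[n++] = (char) (hi * 16 + lo);
      hex += 2;
    }
  out[n] = '\0';
  return (long) n;
}
```

For example, the payload "7365743a7666703a30" decodes to "set:vfp:0".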

The optional register sets would generally not appear in the remote protocol
'g' packet. Instead they would be handled using p/P packets. This is
somewhat less efficient; if someone wants to come up with a g/G-like way to
transfer known register sets in bulk, be my guest. That's a separate
problem. The optional registers would not be blocked from appearing in the
g packet, however. For instance, if MIPS used this feature to expect 32-bit
vs 64-bit GPRs, it would be desirable to continue using a g/G packet for
those.

The architecture would have to register the remote protocol <-> gdb regcache
number mapping.



Simulator targets could implement this mechanism in the simulator. For now
I created a gdbarch hook which returns a string describing the capabilities
of the simulator, and used it to implement target_xfer_partial for the
common simulator target. Native targets should override
to_xfer_partial.
--
Daniel Jacobowitz
CodeSourcery, LLC
Eli Zaretskii
2005-05-07 10:22:57 UTC
Date: Fri, 6 May 2005 12:20:29 -0400
Please bear with me; this is so long-winded I felt the need to give it
section titles. I'm interested in any and all comments about the problem or
about my solution. I hope to start implementing within a couple of weeks.
Thanks.
Post by Daniel Jacobowitz
Today, the contents of the register cache and the layout of GDB's regnum
space are determined by the gdbarch. There are several hooks for this,
primarily these three:
num_regs
register_name
register_type
The gdbarch determines what raw registers are available. But this isn't a
perfect match with what raw registers are _really_ available, because the
gdbarch has only the clues we use to select it: things like byte order and
BFD machine number. At best, those tell us what registers the binary we're
debugging requires. The runtime set of registers we can see is a property
of the target, not of the gdbarch.
BTW, I'd be thrilled to see these issues spelled out and explained in
gdbint.texinfo. Right now, that part of the internals manual is a
mess of outdated information and incomplete or non-existent
description of new features. If you, or someone else, could offer
even unstructured text, I could use that to start working on the
manual. I think it's absurd that such a central part of GDB's
internals is not documented in any reasonable way.
Post by Daniel Jacobowitz
After connecting to a target, GDB checks the current gdbarch for a new
method, gdbarch_set_available_registers. If the architecture does not
provide this method, the rest of the process is skipped.
GDB then reads the TARGET_OBJECT_AVAILABLE_REGISTERS object from the target,
parses it, and hands it to the gdbarch for final processing. This means
that the object must have a target-independent format, although it will
have target-dependent content also.
The target calls gdbarch_update_p with an appropriately filled in argument,
which calls its gdbarch_init routine, which can then do the real work of
updating gdbarch_num_regs et cetera. This means that the gdbarch_init
routine must correctly handle filling in defaults based on the last
architecture. That code is a bit fragile because it's undertested; I
recently updated ARM to do this robustly.
FWIW, I think it's a good idea to add this to GDB. However, I'm
puzzled why your proposal sounds limited to remote targets (the
explicit references to the remote protocol and the syntax of the data
objects seem to suggest that). Isn't this problem relevant to native
debugging as well? If it is, then why not describe the solution in
more general terms, so that they will be appropriate for native
targets?

Also, is it indeed a fact that information about registers is the only
issue GDB has to deal with in such situations? Maybe we need to think
about a more general mechanism, even if for now we only pass
register-related information.
Post by Daniel Jacobowitz
I created a gdbarch hook which returns a string describing the capabilities
You consistently talk about strings as representing the target
capabilities. Why not design a C data structure instead? A string is
an inefficient way of passing information around.
Daniel Jacobowitz
2005-05-07 16:19:39 UTC
Post by Eli Zaretskii
Post by Daniel Jacobowitz
Today, the contents of the register cache and the layout of GDB's regnum
space are determined by the gdbarch. There are several hooks for this,
primarily these three:
num_regs
register_name
register_type
The gdbarch determines what raw registers are available. But this isn't a
perfect match with what raw registers are _really_ available, because the
gdbarch has only the clues we use to select it: things like byte order and
BFD machine number. At best, those tell us what registers the binary we're
debugging requires. The runtime set of registers we can see is a property
of the target, not of the gdbarch.
BTW, I'd be thrilled to see these issues spelled out and explained in
gdbint.texinfo. Right now, that part of the internals manual is a
mess of outdated information and incomplete or non-existent
description of new features. If you, or someone else, could offer
even unstructured text, I could use that to start working on the
manual. I think it's absurd that such a central part of GDB's
internals is not documented in any reasonable way.
Yes indeed. When writing this I went to see if any of the parts of GDB
I would be changing were in the manual; aside from the remote protocol
bits, no hits. I do plan to write documentation to go along with these
changes.
Post by Eli Zaretskii
Post by Daniel Jacobowitz
After connecting to a target, GDB checks the current gdbarch for a new
method, gdbarch_set_available_registers. If the architecture does not
provide this method, the rest of the process is skipped.
GDB then reads the TARGET_OBJECT_AVAILABLE_REGISTERS object from the target,
parses it, and hands it to the gdbarch for final processing. This means
that the object must have a target-independent format, although it will
have target-dependent content also.
The target calls gdbarch_update_p with an appropriately filled in argument,
which calls its gdbarch_init routine, which can then do the real work of
updating gdbarch_num_regs et cetera. This means that the gdbarch_init
routine must correctly handle filling in defaults based on the last
architecture. That code is a bit fragile because it's undertested; I
recently updated ARM to do this robustly.
FWIW, I think it's a good idea to add this to GDB. However, I'm
puzzled why your proposal sounds limited to remote targets (the
explicit references to the remote protocol and the syntax of the data
objects seem to suggest that). Isn't this problem relevant to native
debugging as well? If it is, then why not describe the solution in
more general terms, so that they will be appropriate for native
targets?
This solution is not limited to remote targets. The proposal does read
that way, a bit - but that's because remote targets are the trickiest
case, because remote targets enforce a certain cleanliness of
interface. For instance, the strings I described included a
target-protocol-specific number. For sim and native targets, there are
other architecture hooks and bits of global state in GDB that the
target can read and write. But the remote protocol has to encapsulate
everything.

The structure of the proposal is divided into two parts. One's a
gdbarch hook, the other's a target hook. I've got implementations of
the target hook for sim, remote, and native targets. For a native
user-space GNU/Linux target, the implementation is fairly simple:
partly based on predefined macros, i.e. how GDB was built, and partly
based on trying ptrace operations on the new process and seeing if they
work.
Post by Eli Zaretskii
Also, is it indeed a fact that information about registers is the only
issue GDB has to deal with in such situations? Maybe we need to think
about a more general mechanism, even if for now we only pass
register-related information.
Do you have any examples?

I've got a feeling that there is additional information of this type,
but I have no idea what it would be, or how GDB would represent or use
it.

The mechanism, in its current form, does allow for additional
information - both string types start with a label, other labels could
be added later. The only thing register-specific about it is the name
of the hooks/packets. I did think a little bit about a more general
name but couldn't come up with one. If you suggest a more general
name I have no complaints about using it!
Post by Eli Zaretskii
Post by Daniel Jacobowitz
I created a gdbarch hook which returns a string describing the capabilities
You consistently talk about strings as representing the target
capabilities. Why not design a C data structure instead? A string is
an inefficient way of passing information around.
The reason I used strings was to maintain the separation between
gdbarch vector and target vector - and to reuse the existing
target_xfer_partial interface. Here's the dataflow: GDB calls the
target vector to fetch the data, then parses the data, then passes the
parsed data to the architecture vector.

Thinking about it now, the parsing could be pushed down into the remote
protocol implementation, and a C structure returned as a binary blob
via target_read_partial. That's not a normal use of the interface,
which currently returns target objects rather than internal GDB
objects. It would work, and it would save constructing the string in
the native and simulator cases (and core; I haven't been thinking about
the core case, but an implementation there is probably possible too).

Do you think that would be a better interface to choose?
--
Daniel Jacobowitz
CodeSourcery, LLC
Eli Zaretskii
2005-05-07 19:35:02 UTC
Date: Sat, 7 May 2005 12:19:39 -0400
I do plan to write documentation to go along with these changes.
Thanks!
Post by Daniel Jacobowitz
For sim and native targets, there are other architecture hooks and
bits of global state in GDB that the target can read and write.
I hope we could find a way to come up with a common infrastructure
that would unify all these types of targets.
Post by Daniel Jacobowitz
Post by Eli Zaretskii
Also, is it indeed a fact that information about registers is the only
issue GDB has to deal with in such situations? Maybe we need to think
about a more general mechanism, even if for now we only pass
register-related information.
Do you have any examples?
No examples, it was just a general observation. As long as you say
you keep this in mind, I'm happy.
Post by Daniel Jacobowitz
Thinking about it now, the parsing could be pushed down into the remote
protocol implementation, and a C structure returned as a binary blob
via target_read_partial.
That's what I had in mind, sort of.
Post by Daniel Jacobowitz
Do you think that would be a better interface to choose?
I think so, but it's an idea based on general principles; I know much
less than you about the remote targets. So if you find that what I
suggested has any significant drawbacks, I won't insist.
Daniel Jacobowitz
2005-05-09 15:37:18 UTC
Post by Eli Zaretskii
Post by Daniel Jacobowitz
Thinking about it now, the parsing could be pushed down into the remote
protocol implementation, and a C structure returned as a binary blob
via target_read_partial.
That's what I had in mind, sort of.
Post by Daniel Jacobowitz
Do you think that would be a better interface to choose?
I think so, but it's an idea based on general principles; I know much
less than you about the remote targets. So if you find that what I
suggested has any significant drawbacks, I won't insist.
I've thought about this some more. I see one drawback, but it is
definitely solvable.

The data structure is not readily fixed-size. The target-specific data
has no specified format, so it will be a binary (or probably ASCII)
blob; the common format data will include character strings, e.g. the
names of registers, which will have to be allocated somewhere. We
don't want to leak them, so they need to have a defined lifetime.

The target_read_partial interface is not well suited to that because
the data may be transferred in multiple chunks; each time, we call down
to the target. The best thing I can think of would be to create the
data structure once in the target, store it persistently, and then feed
bits of that data structure back via to_xfer_partial. This requires
mutable data attached to the target object. Nowadays we can use
target_ops:to_data for this, so that should be OK.

This lets the target control the data lifetime. Handy, since it allows
for const structures for simulator targets, where we know the available
features at compile time.

So that should work OK.
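
A minimal C sketch of that arrangement, under loudly stated assumptions: the
description is built once and kept with the target, and each partial read
just copies the requested window out of it. The names (struct
register_description, read_description_chunk, read_whole_description) are
hypothetical, and the signature is simplified compared to the real
to_xfer_partial hook.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical persistent object owned by the target.  For a simulator
   it could point at a const string known at compile time.  */
struct register_description
{
  const char *data;
  size_t length;
};

/* Copy up to LEN bytes starting at OFFSET into BUF.  Return the number
   of bytes copied; 0 means there is nothing left at OFFSET.  */
size_t
read_description_chunk (const struct register_description *desc,
                        size_t offset, char *buf, size_t len)
{
  size_t remaining;

  if (offset >= desc->length)
    return 0;
  remaining = desc->length - offset;
  if (len > remaining)
    len = remaining;
  memcpy (buf, desc->data + offset, len);
  return len;
}

/* Caller side: request CHUNK bytes at a time until the target reports
   no more data.  Return the total copied into OUT; the result is
   silently truncated if OUT is too small.  */
size_t
read_whole_description (const struct register_description *desc,
                        char *out, size_t out_size, size_t chunk)
{
  size_t total = 0;

  if (chunk == 0)
    return 0;
  for (;;)
    {
      size_t room = out_size - total;
      size_t want = room < chunk ? room : chunk;
      size_t got = read_description_chunk (desc, total, out + total, want);

      if (got == 0)
        break;
      total += got;
    }
  return total;
}
```

Because the target owns the object, repeated calls see the same bytes no
matter how the reader chooses its chunk size.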
--
Daniel Jacobowitz
CodeSourcery, LLC
Eli Zaretskii
2005-05-09 20:55:30 UTC
Date: Mon, 9 May 2005 11:37:18 -0400
The target_read_partial interface is not well suited to that because
the data may be transferred in multiple chunks; each time, we call down
to the target. The best thing I can think of would be to create the
data structure once in the target, store it persistently, and then feed
bits of that data structure back via to_xfer_partial. This requires
mutable data attached to the target object. Nowadays we can use
target_ops:to_data for this, so that should be OK.
This lets the target control the data lifetime. Handy, since it allows
for const structures for simulator targets, where we know the available
features at compile time.
So that should work OK.
Sounds like a good plan.
Mark Kettenis
2005-05-07 16:03:55 UTC
Hi Daniel,

Your proposal sounds reasonable to me. Do I understand correctly that
the "set" keyword is supposed to be used to specify the contents of
the `g' packet, or is there a somewhat broader use for them?

Anyway, here are some random thoughts about things we might need to
consider. It's a bit i386-centric, but the issues mostly are not.

* Is this going to allow us to make changes freely to the internal
layout of GDB's register cache?

* How will we treat registers that the user might reasonably expect to
be there, but aren't made available by the target?

FreeBSD/i386 for example still has no way to get at the SSE registers,
but has no problem executing code that uses its registers. What do we
do when the user says "print $xmm0" when connected to a FreeBSD/i386
target? Do we print:

(gdb) p $xmm0
$1 = void

or are we going to try to print a more helpful message for
well-known-but-unavailable registers?

* What do we do with pseudo registers?

The ia32 MMX registers are optional pseudo registers; they reinterpret
the floating-point registers in a particular way. As such they won't
be transmitted between the target and GDB. Yet it would be nice if we
had a mechanism for the target to indicate whether MMX is available
such that GDB knows if it should display the MMX registers or not.

* How does this interact with register groups?

Do we need a mechanism to indicate the register group to which a
register belongs, or is the default register group behaviour good
enough?


Mark
Daniel Jacobowitz
2005-05-09 16:20:35 UTC
Post by Mark Kettenis
Hi Daniel,
Your proposal sounds reasonable to me. Do I understand correctly that
the "set" keyword is supposed to be used to specify the contents of
the `g' packet, or is there a somewhat broader use for them?
I was thinking of a broader use. The only two "messages" I have worked
with so far are not g-packet related; they are more like shorthand for
groups of "reg" keywords. If the target supplied "iwmmxt:36" then the
remote protocol would use p-packets based at register 0x36 to fetch the
iwmmxt registers.

This does raise an interesting question. Should a target specify _all_
of its available registers this way, including a predefined
"set:standard:0"? Or should the standard registers be implicit, unless
a set is specified which would conflict with them?

There's something to be said for requiring explicit reference to the
standard register set. For instance, for MIPS it isn't clear whether
the 32-bit or 64-bit packet is "standard". On the other hand, it's no
worse than the current situation if the target does not say.

Anyway, just a thought. I think that the implicit reference is OK. But
for some targets, like MIPS, supplying an explicit reference will
probably be to the target's benefit, so it would provide "set:gprs32:0"
or "set:gprs64:0".
Post by Mark Kettenis
Anyway, here are some random thoughts about things we might need to
consider. It's a bit i386-centric, but the issues mostly are not.
* Is this going to allow us to make changes freely to the internal
layout of GDB's register cache?
The proposal, not specifically. The implementation, yes. It turns out
that after the last generation of cleanups this is quite simple to do.
All it required was a gdbarch method for mapping from GDB regnums to
remote protocol regnums; I preserved the assumption that the g packet
was contiguous and 0-based in the remote protocol numbering scheme,
which I think is a reasonable assumption.
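
To illustrate the kind of mapping meant here, a small C sketch follows. It is
not the gdbarch method from my tree; struct regnum_map and both functions are
hypothetical, and it handles only one optional register set. It encodes the
assumption stated above: g-packet registers are contiguous and 0-based in the
protocol numbering, and an optional set starts at the base the target
reported.

```c
/* Hypothetical description of one architecture's numbering.  */
struct regnum_map
{
  int num_g_regs;       /* Registers carried in the g packet.  */
  int first_optional;   /* GDB regnum of the first optional register.  */
  int num_optional;     /* Number of registers in the optional set.  */
  int optional_base;    /* Protocol number reported by the target.  */
};

/* Map a GDB regnum to a remote protocol number, or return -1 if the
   register is not available on this target.  */
int
regnum_to_pnum (const struct regnum_map *map, int regnum)
{
  if (regnum >= 0 && regnum < map->num_g_regs)
    return regnum;      /* g packet: contiguous and 0-based.  */
  if (regnum >= map->first_optional
      && regnum < map->first_optional + map->num_optional)
    return map->optional_base + (regnum - map->first_optional);
  return -1;
}

/* Nonzero if REGNUM travels in the g packet rather than via p/P.  */
int
regnum_in_g_packet (const struct regnum_map *map, int regnum)
{
  return regnum >= 0 && regnum < map->num_g_regs;
}
```

With a reported base of 0x36, the first optional register maps to protocol
number 0x36 and is fetched with a p packet, while the low registers keep
their g-packet positions.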
Post by Mark Kettenis
* How will we treat registers that the user might reasonably expect to
be there, but aren't made available by the target?
FreeBSD/i386 for example still has no way to get at the SSE registers,
but has no problem executing code that uses its registers. What do we
do when the user says "print $xmm0" when connected to a FreeBSD/i386
target? Do we print:
(gdb) p $xmm0
$1 = void
or are we going to try to print a more helpful message for
well-known-but-unavailable registers?
Good question. Unless additional work is done, you'd get void. It
sounds like we need to define the registers with an "error" type; but
my limited experience with GCC's type system tells me that that can get
messy very quickly. How about a list of unavailable register names,
provided by the architecture, and something parallel to the existing
user-regs.c code to generate the error message?
Post by Mark Kettenis
* What do we do with pseudo registers?
The ia32 MMX registers are optional pseudo registers; they reinterpret
the floating-point registers in a particular way. As such they won't
be transmitted between the target and GDB. Yet it would be nice if we
had a mechanism for the target to indicate whether MMX is available
such that GDB knows if it should display the MMX registers or not.
This could either be a set keyword, or a new keyword; how about
"feature:mmx", defaulting to available if the new mechanism is not
used?
Post by Mark Kettenis
* How does this interact with register groups?
Do we need a mechanism to indicate the register group to which a
register belongs, or is the default register group behaviour good
enough?
I hadn't thought about this. It's a very complicated question.
One of the nice things about using the "set" notation for groups of
registers is that it allows and requires GDB to have a priori knowledge
of the use and meaning of those registers. So in that case, the
target's existing reggroup method can handle them. But for unknown
registers something else may be necessary.

The proposed register specifier was:
reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...

Perhaps an additional field, call it TAGS, which could include things
like "integer", "vector", "readonly" (this is important! I forgot to
consider that). I am not sure how save_reggroup/restore_reggroup
should be represented; does readonly cover all the cases where GDB
should not save/restore a register around function calls? Probably
not.

[Reading this it occurs to me that the syntax above does not leave room
to define new common information in the future, because target data is
at the end. I may need to rethink that structure.]
--
Daniel Jacobowitz
CodeSourcery, LLC
Daniel Jacobowitz
2005-05-09 16:32:20 UTC
Post by Paul Brook
Post by Daniel Jacobowitz
set:<NAME>:<PROTOCOL NUMBER>
...
reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...
Would it make sense to allow these two to overlap? i.e. if gdb can understand
the set it will use that and ignore the associated reg entries. If it doesn't
understand the set it will use the individual reg entries.
Assume I have a coprocessor not currently supported by gdb (Arm maverick for
the sake of argument), and a target that exposes maverick registers via reg:.
At some time in the future gdb implements proper maverick support, and adds
set:maverick. Under your proposal I can't use my old gdb with my new target.
My new target doesn't generate reg: entries for maverick regs, and my old gdb
doesn't understand set:maverick.
Obviously this is purely a backwards compatibility QoI issue, and doesn't
matter if you expect everyone to use latest gdb.
I'd suggest:
reg:<NAME>:<SET NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...
Where <SET NAME> can be empty if the register doesn't belong to a known set.
In fact I guess including the set name in the reg: component makes the set:
component redundant.
I've envisioned a different solution to this problem. The set
information does not need to come from the target; GDB can recognize it
via pattern information. "If we have eight registers named this of
these types, and eight registers named that of those types, then that's
this coprocessor".

I do see that there's some fudge factor here because register names and
types aren't a very good key. How about the tags field that I
mentioned in my last mail to Mark, which is basically the same as set?

If you report a set, you are relying upon GDB to recognize it or choose
to ignore the associated registers.
--
Daniel Jacobowitz
CodeSourcery, LLC
Paul Brook
2005-05-09 15:57:46 UTC
Post by Daniel Jacobowitz
set:<NAME>:<PROTOCOL NUMBER>
...
reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...
Would it make sense to allow these two to overlap? i.e. if gdb can understand
the set it will use that and ignore the associated reg entries. If it doesn't
understand the set it will use the individual reg entries.

Assume I have a coprocessor not currently supported by gdb (Arm maverick for
the sake of argument), and a target that exposes maverick registers via reg:.

At some time in the future gdb implements proper maverick support, and adds
set:maverick. Under your proposal I can't use my old gdb with my new target.
My new target doesn't generate reg: entries for maverick regs, and my old gdb
doesn't understand set:maverick.

Obviously this is purely a backwards compatibility QoI issue, and doesn't
matter if you expect everyone to use latest gdb.

I'd suggest:
reg:<NAME>:<SET NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...

Where <SET NAME> can be empty if the register doesn't belong to a known set.
In fact I guess including the set name in the reg: component makes the set:
component redundant.

Paul
Chris Zankel
2005-05-09 21:33:02 UTC
Post by Daniel Jacobowitz
Here's my current idea for an improved interface.
Great!
Post by Daniel Jacobowitz
GDB then reads the TARGET_OBJECT_AVAILABLE_REGISTERS object from the target,
parses it, and hands it to the gdbarch for final processing. This means
that the object must have a target-independent format, although it will
have target-dependent content also.
I am wondering if it would also make sense to support the other way
around and let GDB tell the target about the processor/register
configuration. A scenario for this would be where GDB talks to an OCD
daemon (=target) that controls the processor via JTAG. The daemon
wouldn't need to know everything about the processor configuration.
Post by Daniel Jacobowitz
First of all, the target object. It can describe either individual
registers or register sets known to the target (for brevity). Each
component is an ASCII string. Colon is used as a field delimiter and
set:<NAME>:<PROTOCOL NUMBER>
Sorry, but what do you mean by 'protocol number'? Is that 'pnum' in
remote.c?

The reason why I ask this is because although the current remote.c file
supports pnums, they are currently mapped 1:1 to regnum. It would be
great if you could allow a gdbarch to modify that mapping.
Post by Daniel Jacobowitz
The architecture would have to register the remote protocol <-> gdb
regcache number mapping.
Do you intend to introduce a gdbarch function (for example,
gdbarch_pnum_to_regnum_p) and use it to define the pnum value in
remote.c (and other files)?

For example, in remote.c you could use something like this:

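for (regnum = 0; regnum < NUM_REGS + NUM_PSEUDO_REGS; regnum++)
  {
    struct packet_reg *r = &rs->regs[regnum];

    /* Use the architecture's mapping if it provides one; otherwise
       keep the current 1:1 regnum -> pnum mapping.  */
    if (gdbarch_pnum_to_regnum_p)
      r->pnum = gdbarch_pnum_to_regnum (regnum);
    else
      r->pnum = regnum;
  }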

In our case (Tensilica-Xtensa), we have a non-sequential register
encoding and use the pnum <-> regnum mapping. For example, all address
registers might have a pnum 0x10XX, special register 0x11XX, etc.
Post by Daniel Jacobowitz
For instance, if MIPS used this feature to expect 32-bit
vs 64-bit GPRs, it would be desirable to continue using a g/G packet for
those.
I think that would be a nice feature. However, it probably requires
quite a few changes to the register cache, does it not?

~Chris
Daniel Jacobowitz
2005-05-09 23:07:46 UTC
Post by Chris Zankel
Post by Daniel Jacobowitz
GDB then reads the TARGET_OBJECT_AVAILABLE_REGISTERS object from the target,
parses it, and hands it to the gdbarch for final processing. This means
that the object must have a target-independent format, although it will
have target-dependent content also.
I am wondering if it would also make sense to support the other way
around and let GDB tell the target about the processor/register
configuration. A scenario for this would be where GDB talks to an OCD
daemon (=target) that controls the processor via JTAG. The daemon
wouldn't need to know everything about the processor configuration.
The daemon would already have to be updated to understand any new
protocol extensions, so we're talking about modifying that agent in any
case. Given that, can you explain what advantage we would gain by
having GDB pass configuration information to the daemon, instead of
having the daemon parse some text file at startup and then communicate
the configuration information to GDB?

I don't want to support both directions just for kicks, but there may
be value here that I haven't thought of yet. That's why I asked
Tensilica for feedback. I expect that support for feeding GDB from
information provided by a remote stub is actually orthogonal to telling
a remote stub about our configuration.
Post by Chris Zankel
Post by Daniel Jacobowitz
First of all, the target object. It can describe either individual
registers or register sets known to the target (for brevity). Each
component is an ASCII string. Colon is used as a field delimiter and
set:<NAME>:<PROTOCOL NUMBER>
Sorry, but what do you mean by 'protocol number'? Is that 'pnum' in
remote.c?
A number specific to whatever protocol is being used. For the remote
protocol that's the index into the g/G packet and the index used with
p/P packets. So, yes.
Post by Chris Zankel
The reason why I ask this is because although the current remote.c file
supports pnums, they are currently mapped 1:1 to regnum. It would be
great if you could allow a gdbarch to modify that mapping.
Post by Daniel Jacobowitz
The architecture would have to register the remote protocol <-> gdb
regcache number mapping.
Do you intend to introduce a gdbarch function (for example,
gdbarch_pnum_to_regnum_p) and use it to define the pnum value in
remote.c (and other files)?
Yes, this is already implemented on the branch I referenced in my
original mail. It was fairly straightforward. I'm not sure how
exhaustive it was, since I didn't try renumbering any of the g-packet
registers, but in principle that's no harder.
Post by Chris Zankel
In our case (Tensilica-Xtensa), we have a non-sequential register
encoding and use the pnum <-> regnum mapping. For example, all address
registers might have a pnum 0x10XX, special register 0x11XX, etc.
That would work fine as long as you mapped them to sequential register
numbers internal to GDB.
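
As a rough illustration of such a mapping, the sketch below translates between sequential internal register numbers and a banked protocol numbering like the 0x10XX / 0x11XX example. The bank sizes and function names are assumptions made up for this sketch; this is not the gdbarch interface from the branch.

```c
#include <assert.h>

/* Assumed bank sizes, for illustration only.  */
enum { N_ADDRESS_REGS = 16, N_SPECIAL_REGS = 256 };

/* Map a sequential internal register number to a banked protocol
   number: address registers at 0x10XX, special registers at 0x11XX.
   Return -1 for registers with no protocol number.  */
static long
example_regnum_to_pnum (int regnum)
{
  assert (regnum >= 0);
  if (regnum < N_ADDRESS_REGS)
    return 0x1000 + regnum;
  if (regnum < N_ADDRESS_REGS + N_SPECIAL_REGS)
    return 0x1100 + (regnum - N_ADDRESS_REGS);
  return -1;
}

/* Inverse mapping.  Return -1 if PNUM names no known register.  */
static int
example_pnum_to_regnum (long pnum)
{
  if (pnum >= 0x1000 && pnum < 0x1000 + N_ADDRESS_REGS)
    return (int) (pnum - 0x1000);
  if (pnum >= 0x1100 && pnum < 0x1100 + N_SPECIAL_REGS)
    return (int) (pnum - 0x1100) + N_ADDRESS_REGS;
  return -1;
}
```

The internal numbers stay dense (0 through 271 here), so anything indexed by regnum keeps working, and only the remote layer sees the sparse numbers.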
Post by Chris Zankel
Post by Daniel Jacobowitz
For instance, if MIPS used this feature to expect 32-bit
vs 64-bit GPRs, it would be desirable to continue using a g/G packet for
those.
I think that would be a nice feature. However, it probably requires
quite a few changes to the register cache, does it not?
Not at all. The g packet is just the first however-many hard registers
in the remote protocol numbering (those numbers may need to be
sequential; not sure offhand).
--
Daniel Jacobowitz
CodeSourcery, LLC
Chris Zankel
2005-05-10 00:23:38 UTC
Post by Daniel Jacobowitz
Post by Chris Zankel
I am wondering if it would also make sense to support the other way
around and let GDB tell the target about the processor/register
configuration.
The daemon would already have to be updated to understand any new
protocol extensions, so we're talking about modifying that agent in any
case. Given that, can you explain what advantage we would gain by
having GDB pass configuration information to the daemon, instead of
having the daemon parse some text file at startup and then communicate
the configuration information to GDB?
I was thinking about an architecture with multiple configurations
(registers), such as Arc, Tensilica, ARM coprocessors (?), etc.

Having a single daemon supporting these multiple (arbitrary)
configurations would probably be easier for JTAG probe vendors. Since
GDB certainly needs to know about the particular configuration, the
daemon wouldn't need to be modified for each configuration.
Post by Daniel Jacobowitz
I don't want to support both directions just for kicks, but there may
be value here that I haven't thought of yet. That's why I asked
Tensilica for feedback.
I understand. I was just wondering if this would be useful and actually
agree that your proposal makes much more sense and that the target
should know about the configuration.

In our case, the daemon currently doesn't know about a particular
configuration, and GDB only queries for registers the processor (better)
has. For example, to read 'special register' <SR>, OCD simply issues a
rsr a2,<SR> and doesn't know if this <SR> really exists.
Post by Daniel Jacobowitz
Post by Chris Zankel
In our case (Tensilica-Xtensa), we have a non-sequential register
encoding and use the pnum <-> regnum mapping. For example, all address
registers might have a pnum 0x10XX, special register 0x11XX, etc.
That would work fine as long as you mapped them to sequential register
numbers internal to GDB.
Post by Chris Zankel
Sorry, but what do you mean by 'protocol number'? Is that 'pnum' in
remote.c?
A number specific to whatever protocol is being used. For the remote
protocol that's the index into the g/G packet and the index used with
p/P packets. So, yes.
Note, however, that in our case, pnum is not the index into the g/G
packet, and hopefully doesn't need to be?

In cases where pnum is not sequential, you would also need a 'reverse'
lookup function to get the register from pnum, something like this:

static struct packet_reg *
packet_reg_from_pnum (struct remote_state *rs, LONGEST pnum)
{
  int i;

  for (i = 0; i < NUM_REGS + NUM_PSEUDO_REGS; i++)
    {
      struct packet_reg *r = &rs->regs[i];

      if (r->pnum == pnum)
        return r;
    }
  return NULL;
}

Again, this function would only be called if gdbarch provided a
pnum<->regnum mapping function.

~Chris
Daniel Jacobowitz
2005-05-10 21:08:00 UTC
Post by Chris Zankel
Post by Daniel Jacobowitz
Post by Chris Zankel
I am wondering if it would also make sense to support the other way
around and let GDB tell the target about the processor/register
configuration.
The daemon would already have to be updated to understand any new
protocol extensions, so we're talking about modifying that agent in any
case. Given that, can you explain what advantage we would gain by
having GDB pass configuration information to the daemon, instead of
having the daemon parse some text file at startup and then communicate
the configuration information to GDB?
I was thinking about an architecture with multiple configurations
(registers), such as Arc, Tensilica, ARM coprocessors (?), etc.
Having a single daemon supporting these multiple (arbitrary)
configurations would probably be easier for JTAG probe vendors. Since
GDB certainly needs to know about the particular configuration, the
daemon wouldn't need to be modified for each configuration.
I'm afraid that doesn't answer my question :-) First of all, the
daemon would not necessarily have to be modified for each
configuration; it would need a different configuration file, which is
not the same thing. Secondly, in this case, GDB _wouldn't_ need to
know about the particular configuration. All the configuration
information GDB needed, it could retrieve from the daemon.

Sometimes, GDB needs configuration information and the target can
supply it. Sometimes (apparently) the target needs information about
its own configuration and GDB can supply it.

I think we'll always be doing one or the other; one endpoint needs to
have enough information for both rather than GDB needing to negotiate
with the target. That suggests that the two configuration steps should
be implemented independently.
Post by Chris Zankel
Post by Daniel Jacobowitz
I don't want to support both directions just for kicks, but there may
be value here that I haven't thought of yet. That's why I asked
Tensilica for feedback.
I understand. I was just wondering if this would be useful and actually
agree that your proposal makes much more sense and that the target
should know about the configuration.
In our case, the daemon currently doesn't know about a particular
configuration, and GDB only queries for registers the processor (better)
has. For example, to read 'special register' <SR>, OCD simply issues a
rsr a2,<SR> and doesn't know if this <SR> really exists.
The options are to tell GDB about this directly, or to have the OCD
tell GDB about the real properties of the target. I obviously prefer
the latter when possible, because it allows GDB to gracefully handle
binaries built for one configuration, and run on another configuration
where they still work (but may be somehow affected by state they can
not see).
Post by Chris Zankel
Post by Daniel Jacobowitz
Post by Chris Zankel
In our case (Tensilica-Xtensa), we have a non-sequential register
encoding and use the pnum <-> regnum mapping. For example, all address
registers might have a pnum 0x10XX, special register 0x11XX, etc.
That would work fine as long as you mapped them to sequential register
numbers internal to GDB.
Post by Chris Zankel
Sorry, but what do you mean by 'protocol number'? Is that 'pnum' in
remote.c?
A number specific to whatever protocol is being used. For the remote
protocol that's the index into the g/G packet and the index used with
p/P packets. So, yes.
Note, however, that in our case, pnum is not the index into the g/G
packet, and hopefully doesn't need to be?
Do you use a 'g' packet at all? Certainly you're free not to. If you
do, then I'm not sure what it means with non-sequential pnums.
Post by Chris Zankel
In cases where pnum is not sequential, you would also need a 'reverse'
static struct packet_reg *
packet_reg_from_pnum (struct remote_state *rs, LONGEST pnum)
{
int i;
for (i = 0; i < NUM_REGS + NUM_PSEUDO_REGS; i++)
{
struct packet_reg *r = &rs->regs[i];
if (r->pnum == pnum)
return r;
}
return NULL;
}
Again, this function would only be called if gdbarch provided a
pnum<->regnum mapping function.
You mean, like the function of that same name and implementation
already in remote.c? Otherwise I'm not sure what you're talking about.
--
Daniel Jacobowitz
CodeSourcery, LLC
Chris Zankel
2005-05-12 23:35:05 UTC
Daniel,

(Sorry for the delay, but I had some mailer problems....)
Post by Daniel Jacobowitz
GDB _wouldn't_ need to
know about the particular configuration. All the configuration
information GDB needed, it could retrieve from the daemon.
Hmmm... I have to think about it. It sounds like it could work. But that
would be too easy, wouldn't it ;-)
Post by Daniel Jacobowitz
Sometimes, GDB needs configuration information and the target can
supply it. Sometimes (apparently) the target needs information about
its own configuration and GDB can supply it.
I think we'll always be doing one or the other; one endpoint needs to
have enough information for both rather than GDB needing to negotiate
with the target. That suggests that the two configuration steps should
be implemented independently.
Agreed.
Post by Daniel Jacobowitz
The options are to tell GDB about this directly, or to have the OCD
tell GDB about the real properties of the target. I obviously prefer
the latter when possible, because it allows GDB to gracefully handle
binaries built for one configuration, and run on another configuration
where they still work (but may be somehow affected by state they can
not see).
This actually goes back to your comment above - I think. How do you
treat 'pseudo' registers? Would it make sense to add 'flags' to the
'set' command?

set:<NAME>:<PROTOCOL NUMBER>[:<FLAGS>]
Post by Daniel Jacobowitz
Post by Chris Zankel
Post by Chris Zankel
In our case (Tensilica-Xtensa), we have a non-sequential register
encoding and use the pnum <-> regnum mapping. For example, all address
registers might have a pnum 0x10XX, special register 0x11XX, etc.
Do you use a 'g' packet at all? Certainly you're free not to. If you
do, then I'm not sure what it means with non-sequential pnums.
At this point, we don't. However, with the changes you are planning to
implement (thanks, btw.), we could probably use g/G again.
Post by Daniel Jacobowitz
Post by Chris Zankel
In cases where pnum is not sequential, you would also need a 'reverse'
static struct packet_reg *
packet_reg_from_pnum (struct remote_state *rs, LONGEST pnum)
You mean, like the function of that same name and implementation
already in remote.c? Otherwise I'm not sure what you're talking about.
Oops... I wasn't sure if I was looking at our code or the original GDB
sources. It looks like GDB has support for non-sequential pnums, but
doesn't allow assigning them from gdbarch.

~Chris
Daniel Jacobowitz
2005-05-17 14:03:29 UTC
Post by Chris Zankel
Post by Daniel Jacobowitz
The options are to tell GDB about this directly, or to have the OCD
tell GDB about the real properties of the target. I obviously prefer
the latter when possible, because it allows GDB to gracefully handle
binaries built for one configuration, and run on another configuration
where they still work (but may be somehow affected by state they can
not see).
This actually goes back to your comment above - I think. How do you
treat 'pseudo' registers? Would it make sense to add 'flags' to the
'set' command?
set:<NAME>:<PROTOCOL NUMBER>[:<FLAGS>]
I don't think so, but I don't have a good idea of what you would use it
for. Do you want to give me an example?
Post by Chris Zankel
Oops... I wasn't sure if I was looking at our code or the original GDB
sources. It looks like GDB has support for non-sequential pnums, but
doesn't allow assigning them from gdbarch.
Correct - not yet. Soon I hope.
--
Daniel Jacobowitz
CodeSourcery, LLC
Paul Schlie
2005-05-09 22:39:25 UTC
I am wondering if it would also make sense to support the other way around and
let GDB tell the target about the processor/register configuration. A scenario
for this would be where GDB talks to an OCD daemon (=target) that controls the
processor via JTAG. The daemon wouldn't need to know everything about the
processor configuration.
Would seem sensible to consider, especially coming from a company with a
better perspective on the requirements of "configurable" processors than
most?
Paul Schlie
2005-05-10 00:03:03 UTC
Post by Daniel Jacobowitz
The daemon would already have to be updated to understand any new
protocol extensions, so we're talking about modifying that agent in
any case. Given that, can you explain what advantage we would gain
by having GDB pass configuration information to the daemon, instead of
having the daemon parse some text file at startup and then communicate
the configuration information to GDB?
Possibly because it's GDB which needs to know about both the symbolic and
semantics associated with registers and their interpretation, a target
interface only needs to know which and in what order GDB expects to have
their values communicated in, not what they mean, or how they logically relate
to the program being debugged.

I.e. a target interface only needs to know how to retrieve/update register
values for a particular physical or simulated target, usually established
by convention, and possibly optionally identify a target more specifically
to GDB by returning a configuration status word typically defined by
configurable processors, or by simply literally specifying to GDB which
configuration to presume when invoked, just as is essentially done today,
as one can't expect to debug a PPC if GDB is configured to presume an x86
target for example.
Ramana Radhakrishnan
2005-05-10 00:53:43 UTC
Hi,

<snip>
Post by Daniel Jacobowitz
Also, it operates at an "optional feature" level rather than an "optional
register" level. The ARM RDI protocol has a nifty feature called
Self-Describing Modules, which allows coprocessors to describe themselves to
the debugger, including describing their register sets. It includes both
user-level information (name and type - along with a complicated type
description language) and implementation information (like the ARM mode in
which the register is accessible, for banked registers). I would like
the GDB solution to this problem to be sufficiently flexible to work with
SDM - both because it's a nice model and because that way we can be
compatible with ARM debug servers, given an adequate RDI proxy.
On the ARC there are extension encoding sections (look at
.arcextmaps, created in binutils for extension directives)
which describe such registers by the binary. It's possible to
rename registers with other names and to specify other such
names for auxiliary registers. Using such a mechanism would
definitely be useful. Again, it's possible that the same
registers appear with different data formats in different
configurations of the core.

<snip>
Post by Daniel Jacobowitz
DETAILS
=======
First of all, the target object. It can describe either individual
registers or register sets known to the target (for brevity). Each
component is an ASCII string. Colon is used as a field delimiter and
set:<NAME>:<PROTOCOL NUMBER>
No more information is necessary; the register set is an abbreviation of a
well-defined group of registers that both the stub and GDB have external
knowledge of. GDB will already know the order, types, and sizes of
registers, and potentially other details (such as how to pass them as
arguments to functions). If GDB does not recognize the register set, it can
safely ignore it, but should issue a warning to the user recommending use of
a later GDB. If the protocol does not require numbers, they will be
ignored, but they are non-optional in the syntax.
I have spent less time thinking about how to specify individual registers.
This should suffice, but if anyone can see cause for another standard field,
please speak up.
reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...
Types unknown to GDB would default to integral display; common types such as
integral, floating point (native byte order), integral vector, fp vector, et
cetera would be documented in the manual with fixed names.
Can one add a gdbarch_defined_type where the arch interprets
the raw bit stream to provide the user with a decent view of
the registers? It so happens that there are many status
registers which are essentially bitfields, so having this
as a hook for gdbarch to use for printing register values
might be useful. An example where this could be used would
be printing the status flags, e.g. on the i386. (One
could print the ZNCV values automatically.)

Also, a way of describing reggroups in this protocol would be
very useful, and the conditions under which these are allowed to
exist would be something interesting. (This would typically be
the presence of a sequence of bits in some BCR, obtainable by
basic bitwise arithmetic on some BCR values.)


cheers
Ramana
--
Ramana Radhakrishnan
GNU Tools
codito ergo sum (www.codito.com)
Daniel Jacobowitz
2005-05-10 21:13:52 UTC
Post by Ramana Radhakrishnan
Hi,
<snip>
Post by Daniel Jacobowitz
Also, it operates at an "optional feature" level rather than an "optional
register" level. The ARM RDI protocol has a nifty feature called
Self-Describing Modules, which allows coprocessors to describe themselves to
the debugger, including describing their register sets. It includes both
user-level information (name and type - along with a complicated type
description language) and implementation information (like the ARM mode in
which the register is accessible, for banked registers). I would like
the GDB solution to this problem to be sufficiently flexible to work with
SDM - both because it's a nice model and because that way we can be
compatible with ARM debug servers, given an adequate RDI proxy.
On the ARC there are extension encoding sections (Look at
.arcextmaps created in binutils for extension directives. )
which describe such registers by the binary. Its possible to
rename registers with other names and to specify other such
names for auxiliary registers. Using such a mechanism would
definitely be useful . Again its possible that the same
registers appear with different data formats in different
configurations of the core.
It sounds like, if the target does not support reporting its formats,
we could query the binary for them. In fact, this fits onto the
existing target stack. The BFD target would use a gdbarch method to
query the binary for this information. If the remote target supported
the query, we would ignore the binary's data; otherwise, we would use
it.

What other kinds of information may be in this section? Does this
interface offer enough room to make use of them?
Post by Ramana Radhakrishnan
Can one add a gdbarch_defined_type where the arch interprets
the raw bit stream to provide the user with a decent view of
the registers . It so happens that there are many status
registers which are essentially bitfields , so having this
as a hook for gdbarch to use for printing register values
might be useful. An example where this could be used would
be printing the status flags for e.g. on the i386. (One
could print the ZNCV values automatically. )
Someone posted a patch to do this for i386 years ago. It got held up,
I don't remember why.
Post by Ramana Radhakrishnan
Also a way of describing reggroups in this protocol would be
very useful and conditions underwhich these are allowed to
exist would be something interesting. (would typically be
the presence of a sequence of bits in some bcr obtainable by
basic bitwise arithmetic on some BCR values. )
I'm afraid you've lost me. Could you try explaining that whole
paragraph again? Oh... I think I see... I'm assuming BCR is something
like "board configuration register". I'm further assuming that this is
going to be a hardwired register, not runtime configurable. Is that
right?

Others have already persuaded me that we need some way to describe
groupings of registers. But I don't think that this logic you're
describing is appropriate to live in GDB. If some registers are only
available in some board configurations, the target stub should read the
BCR, work out which are available in _this_ configuration, and report
those.
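
For illustration, here is a minimal stub-side sketch of that filtering. The table layout, the mask convention (all listed BCR bits must be set), and the register names in the usage note are invented for this example and do not describe any real stub or board.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: an optional register and the BCR bits
   that must all be set for it to exist.  */
struct optional_reg
{
  const char *name;
  unsigned long bcr_mask;
};

/* Store in OUT the names of the registers from REGS (N_REGS entries)
   that are present for configuration value BCR, and return how many
   were stored.  OUT must have room for N_REGS pointers.  */
static size_t
select_available_regs (unsigned long bcr,
		       const struct optional_reg *regs, size_t n_regs,
		       const char **out)
{
  size_t i, count = 0;

  assert (regs != NULL && out != NULL);
  for (i = 0; i < n_regs; i++)
    if ((bcr & regs[i].bcr_mask) == regs[i].bcr_mask)
      out[count++] = regs[i].name;
  return count;
}
```

The stub would then report only the selected names, so GDB never needs to learn the BCR encoding itself.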
--
Daniel Jacobowitz
CodeSourcery, LLC
Paul Schlie
2005-05-10 11:12:01 UTC
I was thinking about an architecture with multiple configurations (registers),
such as Arc, Tensilica, ARM coprocessors (?), etc.
Having a single daemon supporting these multiple (arbitrary) configurations
would probably be easier for JTAG probe vendors. Since GDB certainly needs to
know about the particular configuration, the daemon wouldn't need to be
modified for each configuration.
Post by Daniel Jacobowitz
I don't want to support both directions just for kicks, but there may
be value here that I haven't thought of yet. That's why I asked
Tensilica for feedback.
I understand. I was just wondering if this would be useful and actually agree
that your proposal makes much more sense and that the target should know about
the configuration.
In our case, the daemon currently doesn't know about a particular
configuration, and GDB only queries for registers the processor (better) has.
For example, to read 'special register' <SR>, OCD simply issues a rsr a2,<SR>
and doesn't know if this <SR> really exists.
It seems that there are two fundamental models which may be adopted:

- refine GDB to be fully architecturally neutral, whereby all target
specific architectural details are provided by the target; including
but not limited to binary code encoding format, disassembly definitions,
type encoding definitions, symbolic and semantic register definition,
specification logical register names, types, purpose {GP pointer/data,
SP, FP, PC, SR, etc. including their encoding {endian, signedness, fixed/
floating point encoding}}, logical/physical memory space definition
address range, address resolution, segmentation, etc.}, not to mention
potentially countless control registers which may be present to control
the cache, MMU, FPU, etc. configurations, and/or operating modes; and
possibly even the target interface protocol specification.

- refine GDB to enable these various potential target specific details to
be extracted from a target definition/configuration binary specification
directly, likely as directed by the user and possibly further refined
after subsequently querying the target. Essentially leaving the target
interface to be primarily responsible for GDB <=> target protocol
translation (being essentially analogous to the driver for an I/O device).

(Although not vastly dissimilar, it likely boils down to where one wants to
draw the line between the division of responsibility between the debugger
and the target interface processes; where personally regardless of where,
I simply believe all target architectural specification information should
be consolidated for the benefit of other tools, rather than being scattered
all over the place, or rely on proprietary sources of this information,
being "hidden" in a "proprietary target interface".)
Daniel Jacobowitz
2005-05-17 19:32:28 UTC
This is a much-revised version of the original proposal, based on all
the feedback I've gotten and an additional week thinking about the
problem. As before, I would appreciate feedback. Otherwise, I think
this is just about sufficiently baked to implement.



After connecting to a target, GDB checks the current gdbarch for a new
method, gdbarch_set_available_features. If the architecture does not
provide this method, the rest of the process is skipped.

GDB then reads the TARGET_OBJECT_AVAILABLE_FEATURES object from the target
and hands it to the gdbarch for processing. The object must have a
target-independent format, although it will have slots for target-dependent
content also.

The architecture calls gdbarch_update_p with an appropriately filled in
argument, which calls the architecture's gdbarch_init routine, which can
then do the real work of updating gdbarch_num_regs et cetera. This means
that the gdbarch_init routine must correctly handle filling in defaults
based on the last architecture.

The data returned by target_xfer_partial is an array of C structures. Memory
allocated for any interior pointers belongs to the target; the core code and
architecture should not modify or free them. The architecture will
generally deep copy the data locally to preserve it with the correct
lifetime. The target vector is responsible for converting any data supplied
by the target into the correct structure representation; for the remote
protocol, this will require parsing a textual representation of the data.
There is no terminating array element; the interface already provides the
size of the data.

Each individual feature reported may be a register, or a target-specific
feature set. A feature set is an abbreviation for a well-defined target
property, often including a group of registers that the stub and GDB
both have external knowledge of. GDB will already know the order, types,
and sizes of registers, and potentially other details (such as how to pass
them as arguments to functions). If GDB does not recognize a feature, it
can safely ignore it, but should issue a warning to the user recommending
use of a later GDB.

The structure looks like this:

struct gdb_available_feature
{
  /* The name of this feature.  For registers, the name is
     only used by the user interface.  For features, the name
     is recognized by the architecture.  */
  const char *name;

  /* The protocol number used by this target to provide this
     feature.  For instance, the register number used for remote
     p/P packets to access this register, or the base register
     number for a group of raw registers included in a known
     feature.  If none is necessary this may be set to -1.  */
  int protocol_number;

  /* Data private to the architecture associated with this feature.
     This is a NUL-terminated string.  */
  const char *arch_data;

  /* If this flag is not set, none of the remaining fields will be
     valid.  */
  int is_register;

  /* If this flag is set, GDB should never try to write to this
     register.  Otherwise, the user may modify the value in the
     register.  */
  int readonly;

  /* If this flag is set, GDB should save and restore this register
     around calls to an inferior function.  */
  int save_restore;

  /* The name of the register group containing this register.  If this
     is "general", "float", or "vector", the corresponding "info" command
     should display this register's value.  It can be an arbitrary
     string, but should be limited to alphanumeric characters and internal
     hyphens.  */
  const char *group;

  /* The type of the register.  */
  struct type *type;
};
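
Since the interior pointers belong to the target, an architecture that wants to keep an entry has to copy the strings. Below is a minimal sketch of such a deep copy, not a definitive implementation. The helper names are hypothetical, the structure is repeated only so the example is self-contained, and it is an assumption of this sketch that the type object is owned elsewhere and can be shared by pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct type;	/* Opaque here; its definition is not needed.  */

struct gdb_available_feature
{
  const char *name;
  int protocol_number;
  const char *arch_data;
  int is_register;
  int readonly;
  int save_restore;
  const char *group;
  struct type *type;
};

/* Return a malloc'd copy of S, or NULL if S is NULL or memory runs out.  */
static char *
copy_string (const char *s)
{
  size_t len;
  char *copy;

  if (s == NULL)
    return NULL;
  len = strlen (s) + 1;
  copy = malloc (len);
  if (copy != NULL)
    memcpy (copy, s, len);
  return copy;
}

/* Copy SRC into DST so DST no longer points at target-owned strings.  */
static void
copy_feature (struct gdb_available_feature *dst,
	      const struct gdb_available_feature *src)
{
  assert (dst != NULL && src != NULL);
  memset (dst, 0, sizeof *dst);
  dst->name = copy_string (src->name);
  dst->protocol_number = src->protocol_number;
  dst->arch_data = copy_string (src->arch_data);
  dst->is_register = src->is_register;
  if (!src->is_register)
    return;		/* The remaining fields are not valid.  */
  dst->readonly = src->readonly;
  dst->save_restore = src->save_restore;
  dst->group = copy_string (src->group);
  dst->type = src->type;	/* Assumption: shared, not copied.  */
}

/* Release the strings allocated by copy_feature.  */
static void
free_feature_copy (struct gdb_available_feature *f)
{
  free ((char *) f->name);
  free ((char *) f->arch_data);
  free ((char *) f->group);
}
```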

The remote protocol needs to map from strings to these objects. The string
is a sequence of semicolon-delimited objects. Within each object a colon is
used as a field delimiter. Therefore freeform strings can not contain
either colons or semicolons. Numeric fields are specified as hexadecimal.
Some string fields may be empty.

A string description of a non-register feature looks like this:

feature:<NAME>:<PROTOCOL NUMBER>:<ARCH DATA>

Trailing fields may be omitted if they are not needed. An omitted protocol
number is set to -1; omitted arch data will be set to NULL. The name may
not be omitted.
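
A minimal sketch of parsing one such feature description follows, under the rules just stated. The struct, the fixed buffer sizes, and the function name are assumptions for illustration, and treating an empty protocol number field the same as an omitted one is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing one "feature:" object.  */
struct parsed_feature
{
  char name[64];
  int protocol_number;	/* -1 if omitted.  */
  int has_arch_data;	/* 0 stands in for a NULL arch_data.  */
  char arch_data[64];
};

/* Parse DESC, a single object with no trailing semicolon, into OUT.
   Return 1 on success and 0 if DESC is malformed.  */
static int
parse_feature (const char *desc, struct parsed_feature *out)
{
  const char *p;
  size_t len;

  assert (desc != NULL && out != NULL);
  if (strncmp (desc, "feature:", 8) != 0)
    return 0;
  p = desc + 8;

  /* NAME may not be omitted.  */
  len = strcspn (p, ":");
  if (len == 0 || len >= sizeof out->name)
    return 0;
  memcpy (out->name, p, len);
  out->name[len] = '\0';
  p += len;

  out->protocol_number = -1;
  out->has_arch_data = 0;
  out->arch_data[0] = '\0';
  if (*p == '\0')
    return 1;
  p++;			/* Skip the colon after NAME.  */

  /* PROTOCOL NUMBER is hexadecimal.  */
  len = strcspn (p, ":");
  if (len > 0)
    {
      char *stop;
      long n = strtol (p, &stop, 16);

      if (stop != p + len)
	return 0;
      out->protocol_number = (int) n;
    }
  p += len;
  if (*p == '\0')
    return 1;
  p++;			/* Skip the colon after PROTOCOL NUMBER.  */

  /* ARCH DATA is the rest of the object.  */
  len = strlen (p);
  if (len >= sizeof out->arch_data)
    return 0;
  memcpy (out->arch_data, p, len + 1);
  out->has_arch_data = 1;
  return 1;
}
```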

A string description of a register looks like this:

reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<GROUP>:<TAGS>:<ARCH DATA>

NAME, PROTOCOL NUMBER, and BITSIZE may not be omitted or empty. TYPE is a
string to be interpreted by GDB; a list of valid types will be defined in
the manual. The final type of the register will depend on both the bitsize
and the type field; for instance, the type might be "int" or "float" and the
bitsize field would determine between the C int/float types or long/double.
If GDB does not recognize the type string provided by a target, it will
display the register as an integer. An omitted type defaults to "int".
GROUP is an identifier for the set of registers as described in the
gdb_available_features structure. TAGS are a set of comma-separated
keywords known to architecture-independent code in GDB; unknown tags will be
ignored. The currently defined tags are:

ro - This register is read-only
restore - This register should be saved/restored by GDB when
making an inferior function call or otherwise
saving/restoring the inferior's state.

One possible set of additional tags would be bitfield indicators, for
example "ro,bit0=N,bit1=Z,bit2=C,bit3=V" for a readonly status register.
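
As a rough sketch of how the TYPE and TAGS fields might be interpreted, the
C fragment below applies the defaults described above: an omitted or
unrecognized type is treated as an integer, and unknown tags (including the
suggested bitfield indicators) are skipped. The names reg_attrs, tag_is and
parse_reg_attrs are hypothetical, and the BITSIZE-dependent choice of the
final type is deliberately left out.

```c
#include <assert.h>
#include <string.h>

enum reg_kind { REG_KIND_INT, REG_KIND_FLOAT };

/* Hypothetical subset of the per-register attributes.  */
struct reg_attrs
{
  int readonly;        /* Set by the "ro" tag.  */
  int save_restore;    /* Set by the "restore" tag.  */
  enum reg_kind kind;  /* Derived from the TYPE field.  */
};

/* Return nonzero if the LEN bytes at TAG spell exactly WORD.  */
static int
tag_is (const char *tag, size_t len, const char *word)
{
  return strlen (word) == len && strncmp (tag, word, len) == 0;
}

/* Fill OUT from the TYPE and TAGS fields of a register description.
   Either string may be NULL or empty if the field was omitted.  */
void
parse_reg_attrs (const char *type, const char *tags, struct reg_attrs *out)
{
  assert (out != NULL);
  out->readonly = 0;
  out->save_restore = 0;

  /* An omitted or unrecognized type falls back to an integer.  */
  if (type != NULL && strcmp (type, "float") == 0)
    out->kind = REG_KIND_FLOAT;
  else
    out->kind = REG_KIND_INT;

  if (tags == NULL)
    return;

  /* Walk the comma-separated tag list.  */
  while (*tags != '\0')
    {
      size_t len = strcspn (tags, ",");

      if (tag_is (tags, len, "ro"))
        out->readonly = 1;
      else if (tag_is (tags, len, "restore"))
        out->save_restore = 1;
      /* Anything else is an unknown tag and is ignored.  */

      tags += len;
      if (*tags == ',')
        tags++;
    }
}
```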

The remote protocol would use a qPart packet to implement this. That means
the data would go over the wire hex encoded.
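
For illustration, hex encoding in the style the remote protocol already
uses elsewhere means sending two hexadecimal digits per byte of the
description string. A minimal sketch, with hex_encode as a hypothetical
helper name and lowercase digits as an assumption:

```c
#include <assert.h>
#include <string.h>

/* Encode the NUL-terminated string TEXT as two hex digits per byte.
   OUT must have room for 2 * strlen (TEXT) + 1 bytes.  */
void
hex_encode (const char *text, char *out)
{
  static const char digits[] = "0123456789abcdef";
  size_t len, i;

  assert (text != NULL && out != NULL);
  len = strlen (text);
  for (i = 0; i < len; i++)
    {
      unsigned char byte = (unsigned char) text[i];

      out[2 * i] = digits[byte >> 4];
      out[2 * i + 1] = digits[byte & 0xf];
    }
  out[2 * len] = '\0';
}
```

The encoded form is twice as long as the description itself, which matters
if the description has to be fetched in several pieces.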

The optional registers will generally not appear in the remote protocol 'g'
packet. Instead they will be handled using p/P packets. This is somewhat
less efficient; a future extension could allow for bulk transfer packets.
The optional features would not be explicitly blocked from appearing in the
g packet. For instance, if MIPS used this feature to decide whether to
expect 32-bit or 64-bit GPRs, it would be desirable to continue using a g/G
packet for those.

The architecture will have to register the remote protocol <-> gdb regcache
number mapping.
--
Daniel Jacobowitz
CodeSourcery, LLC
Richard Earnshaw
2005-05-18 09:28:35 UTC
Post by Daniel Jacobowitz
/* If this flag is set, GDB should save and restore this register
around calls to an inferior function. */
int save_restore;
Why would the target care about this? It seems to be more a property of
an ABI than the target.

In the (IMO) unlikely case that we really want to keep this, I think it
should have a 'not-my-responsibility-to-decide' setting.

R.
Daniel Jacobowitz
2005-05-19 01:00:13 UTC
Post by Richard Earnshaw
Post by Daniel Jacobowitz
/* If this flag is set, GDB should save and restore this register
around calls to an inferior function. */
int save_restore;
Why would the target care about this? It seems to be more a property of
an ABI than the target.
In the (IMO) unlikely case that we really want to keep this, I think it
should have a 'not-my-responsibility-to-decide' setting.
This isn't the conventional callee-saved vs. caller-saved decision. GDB
needs to handle bogus functions so it should save/restore all "normal"
registers, but there may be some registers in a system for which this
behavior is inappropriate. Perhaps there's a better name for this if I
invert the meaning... For instance, a debugger should probably not muck
with cp15 registers across a function call, even if they're not marked
readonly, to allow the user to modify them explicitly.

I'm trying to express the concept of save_reggroup/restore_reggroup for
target specified registers. Have you got another idea of how to do it?
Maybe it's not necessary after all?
--
Daniel Jacobowitz
CodeSourcery, LLC
Richard Earnshaw
2005-05-20 14:53:20 UTC
Post by Daniel Jacobowitz
Post by Richard Earnshaw
Post by Daniel Jacobowitz
/* If this flag is set, GDB should save and restore this register
around calls to an inferior function. */
int save_restore;
Why would the target care about this? It seems to be more a property of
an ABI than the target.
In the (IMO) unlikely case that we really want to keep this, I think it
should have a 'not-my-responsibility-to-decide' setting.
This isn't the conventional callee-saved vs. caller-saved decision. GDB
needs to handle bogus functions so it should save/restore all "normal"
registers, but there may be some registers in a system for which this
behavior is inappropriate. Perhaps there's a better name for this if I
invert the meaning... For instance, a debugger should probably not muck
with cp15 registers across a function call, even if they're not marked
readonly, to allow the user to modify them explicitly.
I'm trying to express the concept of save_reggroup/restore_reggroup for
target specified registers. Have you got another idea of how to do it?
Maybe it's not necessary after all?
Ah! Light dawns.

Hmm, I've not thought this through in detail. Perhaps we should define
access/protection classes that a register belongs to. Something along
the lines of

User - normal user-space registers
Control - accessible by users, but generally stateful or volatile
System - protected to system mode access only
Sys-ctrl - Accessible by system, but generally stateful

A parallel attribute might describe whether or not a register can be
written, or whether it is purely for information.

Examples (for ARM) for each of the above classes would be

User - r0-r15
Control - VFP FPSCR
System - r8_fiq, etc
Sys-ctrl - cp15

This might be overkill for much of GDB's needs, but in some cases it
might still not be enough. For example, the CPSR has some bits which
are User, and some that are User/read-only, but system/read-write.

R.

Paul Schlie
2005-05-17 23:08:32 UTC
Post by Daniel Jacobowitz
This is a much-revised version of the original proposal, based on all
the feedback I've gotten and an additional week thinking about the
problem. As before, I would appreciate feedback. Otherwise, I think
this is just about sufficiently baked to implement.
...
Upon also having the opportunity to think further about this, I agree
that there's value in being able to define logical registers which may
be more target specific than traditionally defined/visible within the
architectural description files; and possibly even more generalized?

- more specifically, although I still believe that any register
descriptions which are logically part of the machine's core ISA
belong with and should correspond to that target's architectural
definitions, as would seemingly be necessary to correspond to its
disassembler register definitions and presumptions (as any alternative
seems likely to be confusing, unless I misunderstand?)

- however as there are often logical registers which are considered
supplementary to even non-configurable architectures, often representing
control registers associated with MMU, Cache, or other closely coupled
CPU subsystems which would be nice to define a "view of" more generally
(including but not limited to memory mapped I/O registers, etc.).

So wonder if some hybrid mechanism, similar to that which you describe
may be most ideally flexible, and sufficient to meet your goals:

- presume that (by practical necessity) all logical registers which are
part of a target's core architectural programming model, which by
definition should directly correspond to those definitions presumed
by its disassembler, are sourced through gdbarch definitions, as seemingly
required if they are to correspond?

- enable extended logical register views to be defined either by
extended definitions via gdbarch, or the target (as you've both
specified), and alternatively simply a configuration file (which
may potentially be scripted to load via an init file script to specify
extended register views for those which may be memory or I/O space
mapped.)

- where these extended architectural register definitions merely provide
a convenient view of the logical state of the machine, which may be
mapped either to specific registers which the target stub may need to
be specifically aware of how to access through target specific jtag
specified scan locations etc. or may simply be memory or I/O space
mapped, as is often the case for extended control registers, in which
case only the corresponding address and precision, would seem to need
to be specified to enable GDB to request updates from the target, which
would likely be useful for many existing target stubs for example? (where
targets which must access extended registers through non-generalized means
may publish their existence directly through their target interface stub).

So in rough summary:

- presume core architectural and disassembly machine descriptions are
sourced in some correlated manner (so they hopefully agree)

- non-core hard-registers which need direct specialized access may be
described through gdbarch, or published via the target stub.

- non-core generalized memory/i-o space mapped logical register descriptions
may be sourced either through gdbarch, the target, or an init file.

Just as a thought, and hope that this slight generalization of your
proposal might be found potentially helpful?