2005-05-06 16:20:29 UTC
section titles. I'm interested in any and all comments about the problem or
about my solution. I hope to start implementing within a couple of weeks.
Today, the contents of the register cache and the layout of GDB's regnum
space are determined by the gdbarch. There are several hooks for this,
primarily these three:
The gdbarch determines what raw registers are available. But this isn't a
perfect match with what raw registers are _really_ available, because the
gdbarch only has the clues we use to select a gdbarch available: things like
byte order and BFD machine number. At best, those tell us what registers
the binary we're debugging requires. The runtime set of registers we can
see are a property of the target, not of the gdbarch.
Here are a couple of examples of existing workarounds for this problem.
/* The default settings include the FPU registers, the MMX registers
and the SSE registers. This can be overridden for a specific ABI
by adjusting the members `st0_regnum', `mm0_regnum' and
`num_xmm_regs' of `struct gdbarch_tdep', otherwise the registers
will show up in the output of "info all-registers". Ideally we
should try to autodetect whether they are available, such that we
can prevent "info all-registers" from displaying registers that aren't available.
NOTE: kevinb/2003-07-13: ... if it's a choice between printing
[the SSE registers] always (even when they don't exist) or never
showing them to the user (even when they do exist), I prefer the
former over the latter. */
Currently we always display the SSE registers, and fill them in with dummy
values. If we knew whether they were available, we could avoid displaying them.
rs6000-tdep.c:rs6000_gdbarch_init has related problems: registers whose size
and type change depending on the architecture. Here's one I encountered:
/* For e500 executables, the apuinfo section is of help here. Such
section contains the identifier and revision number of each
Application-specific Processing Unit that is present on the
chip. The content of the section is determined by the assembler
which looks at each instruction and determines which unit (and
which version of it) can execute it. In our case we just look for
the existence of the section. */
There are a number of ways to end up with binaries which will run on an e500
chip but not have an apuinfo section. The presence of the section is used
to select bfd_mach_ppc_e500, which in turn is used to select registers_e500.
This is one of the many possible PPC register layouts, but it's particularly
interesting because it changes the size of registers 32-63, which moves
register 64 (the PC) to a different offset in the 'g' packet. Most of the
other "common PPC" layouts are compatible enough that if you get the wrong
one, debugging will still work. This one isn't. GDB needs to know whether
it is talking to an e500 stub or not.
The MIPS targets have another variant of the problem; they don't know
whether the target provides 32-bit or 64-bit registers, because code
compiled for 32-bit can run on targets with 64-bit hardware registers - and,
in some cases, be influenced by corruption in the upper halves of the
registers. So GDB does the user a disservice by displaying only 32-bit
registers when the target actually has 64-bit ones it could show.
And so on; you get the picture :-)
ONE EXISTING SOLUTION
On the csl-arm release branch, Paul and I developed a patch specific to the
ARM VFP and Xscale iWMMXt coprocessors. It uses the xfer_partial interface
to query the target to describe the available registers, and then creates a
new gdbarch based on that information if it does not match the register
layout used by the current gdbarch. Then there are three additional hooks,
covering native, sim, and remote. I would like to propose a similar
interface for HEAD - the one on the branch is not suitable as-is.
Here are the branch patches, for reference:
The interface on the branch is based in ARM-specific code instead of
common code. There is an inferior_created observer which calls
arm_update_architecture. That function uses target_read_partial (sloppily)
to fetch a target-specific string. The currently supported values
of the string are:
The word refers to an optional register set which is present on the target.
The hex number, if present, is a target-specific number used as a base for
the register set. For instance, iwmmxt:30 means that the iWMMXt registers
are present, and wr0 (the first register) is number 0x30. The number gets
saved away in the tdep structure, and a new hook for p/P packet support uses
this if the target we are connected to uses the remote protocol. The sim
and native targets don't have separate numbering so they ignore this value.
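The response parsing described above can be sketched roughly as follows. This is an illustrative reconstruction, not the branch code; the function name and its exact semantics are my invention:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split a response such as "iwmmxt:30" into the register-set word and
   an optional hex base number.  Returns the base number, or -1 if the
   response did not include one; copies the word into WORD (LEN bytes).
   Hypothetical sketch only.  */

static long
parse_feature_response (const char *response, char *word, size_t len)
{
  const char *colon = strchr (response, ':');

  if (colon == NULL)
    {
      /* Just a register-set name, e.g. "vfp"; no base number.  */
      strncpy (word, response, len - 1);
      word[len - 1] = '\0';
      return -1;
    }

  size_t n = (size_t) (colon - response);
  if (n >= len)
    n = len - 1;
  memcpy (word, response, n);
  word[n] = '\0';

  /* The base is target-specific and, per the example above, hex.  */
  return strtol (colon + 1, NULL, 16);
}
```

So "iwmmxt:30" yields the word "iwmmxt" and base 0x30, which the tdep structure would record for p/P packet numbering.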
Some shortcomings in the branch implementation:
- I never implemented support for multiple responses. It wasn't necessary,
  since only two register sets were implemented and no ARM core today has
  both extensions.
- It uses an observer. While I do like this sort of use of observers in
general, it's not appropriate here; this should happen ASAP after
connecting to the target, because until it does we may not be able to
read registers reliably.
- There's no common code infrastructure for this, which I consider a must.
I don't want targets to reinvent more than necessary.
Also, it operates at an "optional feature" level rather than an "optional
register" level. The ARM RDI protocol has a nifty feature called
Self-Describing Modules, which allows coprocessors to describe themselves to
the debugger, including describing their register sets. It includes both
user-level information (name and type - along with a complicated type
description language) and implementation information (like the ARM mode in
which the register is accessible, for banked registers). I would like
the GDB solution to this problem to be sufficiently flexible to work with
SDM - both because it's a nice model and because that way we can be
compatible with ARM debug servers, given an adequate RDI proxy.
Here's my current idea for an improved interface. I have not implemented
any of this yet, only the older interface I described above. It does borrow
heavily from that implementation.
After connecting to a target, GDB checks the current gdbarch for a new
method, gdbarch_set_available_registers. If the architecture does not
provide this method, the rest of the process is skipped.
GDB then reads the TARGET_OBJECT_AVAILABLE_REGISTERS object from the target,
parses it, and hands it to the gdbarch for final processing. This means
that the object must have a target-independent format, although it will
have target-dependent content also.
The target calls gdbarch_update_p with an appropriately filled in argument,
which calls its gdbarch_init routine, which can then do the real work of
updating gdbarch_num_regs et cetera. This means that the gdbarch_init
routine must correctly handle filling in defaults based on the last
architecture. That code is a bit fragile because it's undertested; I
recently updated ARM to do this robustly.
First of all, the target object. It can describe either individual
registers or, as a shorthand, register sets known to both sides. Each
component is an ASCII string. Colon is used as a field delimiter and
semicolon as a component delimiter. A register set would look like:
No more information is necessary; the register set is an abbreviation of a
well-defined group of registers that both the stub and GDB have external
knowledge of. GDB will already know the order, types, and sizes of
registers, and potentially other details (such as how to pass them as
arguments to functions). If GDB does not recognize the register set, it can
safely ignore it, but should issue a warning to the user recommending use of
a later GDB. If the protocol does not require numbers, they will be
ignored, but they are non-optional in the syntax.
I have spent less time thinking about how to specify individual registers.
This should suffice, but if anyone can see cause for another standard field,
please speak up.
reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>:<TARGET DATA>...
Types unknown to GDB would default to integral display; common types such as
integral, floating point (native byte order), integral vector, fp vector, et
cetera would be documented in the manual with fixed names.
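Here is a sketch of how common code might split one such "reg" component. The struct and function names are hypothetical, and I am assuming the protocol number is hex (matching the iwmmxt:30 precedent above); trailing target-specific fields are simply ignored:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of one "reg:..." component.  */

struct avail_reg
{
  char name[32];
  long protocol_number;	/* Assumed hex in the wire format.  */
  long bitsize;
  char type[32];
};

/* Parse "reg:<NAME>:<PROTOCOL NUMBER>:<BITSIZE>:<TYPE>[:<TARGET DATA>...]"
   into REG.  Returns 0 on success, -1 if the component is not a "reg"
   entry or is missing required fields.  Sketch only.  */

static int
parse_reg_component (const char *component, struct avail_reg *reg)
{
  char buf[128];
  char *fields[5];
  int n = 0;

  strncpy (buf, component, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';

  /* Colon is the field delimiter; collect the five standard fields
     and leave any target data unparsed.  */
  for (char *p = strtok (buf, ":"); p != NULL && n < 5;
       p = strtok (NULL, ":"))
    fields[n++] = p;

  if (n < 5 || strcmp (fields[0], "reg") != 0)
    return -1;

  strncpy (reg->name, fields[1], sizeof reg->name - 1);
  reg->name[sizeof reg->name - 1] = '\0';
  reg->protocol_number = strtol (fields[2], NULL, 16);
  reg->bitsize = strtol (fields[3], NULL, 10);
  strncpy (reg->type, fields[4], sizeof reg->type - 1);
  reg->type[sizeof reg->type - 1] = '\0';
  return 0;
}
```

The caller would iterate over semicolon-delimited components, handing each "reg" entry to the gdbarch and warning about unrecognized ones.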
The remote protocol would use a qPart packet to implement this. That means
the data would go over the wire hex encoded. I would probably end up adding
some more intelligent decoding to "set debug remote" so that I could see the
hex-decoded form of this data, since it would be printable. I've wanted
that before (for qSymbol debugging).
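The debug-output decoding I have in mind is nothing more than undoing the hex encoding so the description string prints legibly; a minimal sketch (helper names invented):

```c
#include <assert.h>
#include <string.h>

/* Value of one hex digit, or -1 if C is not a hex digit.  */

static int
hex_nibble (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Decode the hex string HEX into OUT (NUL-terminated; OUT must have
   room for strlen (HEX) / 2 + 1 bytes).  Returns the number of bytes
   decoded, or -1 on a malformed string.  Sketch only.  */

static int
hex_decode (const char *hex, char *out)
{
  int n = 0;

  while (hex[0] != '\0')
    {
      int hi = hex_nibble (hex[0]);
      int lo;

      if (hex[1] == '\0' || hi < 0 || (lo = hex_nibble (hex[1])) < 0)
	return -1;
      out[n++] = (char) ((hi << 4) | lo);
      hex += 2;
    }
  out[n] = '\0';
  return n;
}
```

With something like this wired into "set debug remote", the qPart payload would show up as the readable register description instead of a wall of hex.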
The optional register sets would generally not appear in the remote protocol
'g' packet. Instead they would be handled using p/P packets. This is
somewhat less efficient; if someone wants to come up with a g/G-like way to
transfer known register sets in bulk, be my guest. That's a separate
problem. The optional registers would not be blocked from appearing in the
g packet, however. For instance, if MIPS used this feature to select between
32-bit and 64-bit GPRs, it would be desirable to continue using a g/G packet
for them.
The architecture would have to register the remote protocol <-> gdb regcache
number mapping for such registers.
Simulator targets could implement this mechanism in the simulator. For now
I created a gdbarch hook which returns a string describing the capabilities
of the simulator, and used it to implement target_xfer_partial for the
common simulator target. Native targets should override