brad's life
Brad Fitzpatrick


open source, closed specs [Nov. 13th, 2004|11:12 pm]
Brad Fitzpatrick
The MegaRAID driver is open source, but its ioctl interface at the source level only really supports:

-- querying number of adapters
-- querying driver version
-- querying logical drives per adapter

Pretty much useless.

Everything else seems to be passed through black-box from the userspace ioctl right to the card, then the response from the card copied right back to userspace.

So with an open source driver (mostly useless, but it got me this far), closed specs, and closed management utilities, what's next?

I've added printks where ioctls come in, so I can run the closed-source management utilities and see what they're sending. (and receiving, but I haven't done that yet)

strace only gives:

ioctl(0x3, 0xc06e6d00, 0x80ea780) = 0
ioctl(0x3, 0xc06e6d00, 0x80ea780) = 0
ioctl(0x3, 0xc06e6d00, 0x80ea780) = 0
ioctl(0x3, 0xc06e6d00, 0x80ea780) = 0
ioctl(0x3, 0xc06e6d00, 0x80ea780) = 0
ioctl(0x3, 0xc06e6d00, 0x80ea780) = 0

And I'd dump those 0x80ea780 addresses under gdb, except you can't strace something under gdb, and running it under gdb changes the addresses it uses (presumably).

Maybe I could look at /proc/`pidof megamgr.bin`/mem while stracing it? Hadn't thought of that until just now. But the timing's the hard part. The ioctl struct is read/write, so I want to see it both before the kernel messes with it, and after.
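A minimal sketch of the /proc read itself, assuming a Linux /proc where the `mem` file can be read at an offset equal to the virtual address. The helper name `peek_mem` is made up for illustration. The kernel generally only allows this for a process you are permitted to ptrace (some kernels require actually being attached), and it does nothing about the timing problem:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy len bytes found at virtual address addr in process pid into out.
 * Returns the number of bytes read, or -1 if the file could not be
 * opened or read (for example, because of missing permissions). */
static ssize_t peek_mem(pid_t pid, uintptr_t addr, void *out, size_t len)
{
    char path[64];
    int fd;
    ssize_t n;

    snprintf(path, sizeof path, "/proc/%ld/mem", (long)pid);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset is interpreted as an address in the target. */
    n = pread(fd, out, len, (off_t)addr);
    close(fd);
    return n;
}
```

With the values from the strace output, the call would be something like `peek_mem(pid, 0x80ea780, buf, 110)`, made once before and once after the ioctl, if the process can be stopped at those points.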

So back to my (ugly) kernel hacking. I've got my changes to build now (into *.o files) but I'm trying to figure out how kbuild makes the *.ko files.

Comments:
From: ydna
2004-11-14 07:31 am (UTC)
Would breaking on the ioctl in gdb let you dig out the before and after structs from memory? I guess if you could, you'd have already thought of it by now. Happy hacking.
From: brad
2004-11-14 07:34 am (UTC)
My gdb-fu has atrophied since college, and back then I always had symbols.

If you can break on/after a syscall, I've never done it.

Any advice?
From: evan
2004-11-14 08:00 am (UTC)
Perhaps a good starting point is to see if they're using syscall(2) underneath by setting a breakpoint on that function?
From: jeffr
2004-11-14 08:08 am (UTC)
Why wouldn't they just be using glibc's ioctl syscall wrapper? Does 'break ioctl' not work? Furthermore, running something within gdb should not change any addresses unless they normally change with each invocation. Compiling the same source with symbols on x86/C/ELF shouldn't even change text or data locations, and depending on your ELF loader, won't even change stack locations.
From: ydna
2004-11-14 08:14 am (UTC)
Running stuff under gdb seems to move the start of code around. See my comment.
From: jeffr
2004-11-14 09:24 am (UTC)
er, no, it can't move the start of code around unless your code is compiled -fPIC. On my system, the only thing that changes is the stack start, because gdb passes argv[0] with the full path, and has slightly more information in envp. If we were to remotely attach gdb, we could even eliminate this difference. Here:


10# cat t.c
#include <stdio.h>
#include <stdlib.h>

int d;

int
main(int argc, char **argv)
{
    int s;
    char *h;

    h = malloc(1);
    printf("text: %p, data: %p, stack: %p heap: %p\n", &main, &d, &s, h);
}
10# gcc -o t t.c
10# ./t
text: 0x8048560, data: 0x80497a4, stack: 0xbfbfe970 heap: 0x804b030
10# gcc -g -o t t.c
10# ./t
text: 0x8048560, data: 0x80497a4, stack: 0xbfbfe970 heap: 0x804b030
10# gdb t
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...
(gdb) run
Starting program: /tmp/t
text: 0x8048560, data: 0x80497a4, stack: 0xbfbfe950 heap: 0x804b030

Program exited with code 0104.
(gdb) 10#
From: ydna
2004-11-14 08:12 am (UTC)
You almost gave me an excuse to learn something useful about gdb. Oops, I just went and diddled with it. Let's see if this goes anywhere (and show how vast my ignorance is).

Started with:
#include <stdio.h>

main () {
    printf("foo\n");
}
Built that and ran it under strace -i for fun (wanted to see if the IP matched up later in gdb; it does not, of course) and then ran it under gdb.
(gdb) b write
Function "write" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (write) pending.
(gdb) run
Starting program: /home/ydna/foo 
Breakpoint 2 at 0x400eaf20
Pending breakpoint "write" resolved

Breakpoint 2, 0x400eaf20 in write () from /lib/tls/libc.so.6
(gdb) n
Single stepping until exit from function write, 
which has no line number information.
foo
0x4008b3e9 in _IO_file_write () from /lib/tls/libc.so.6
(gdb) c
Continuing.

Program exited with code 04.
(gdb) q
Of course, I have no idea if that'd work with ioctl or not. But it seems somewhat promising.
From: jeffr
2004-11-14 08:10 am (UTC)
How is the ioctl defined?
From: ydna
2004-11-14 08:19 am (UTC)
In: <sys/ioctl.h>

As: int ioctl(int d, int request, ...);

Is that what you meant?
From: jeffr
2004-11-14 09:18 am (UTC)
I meant, how is the actual driver ioctl defined? There is a standard macro for declaring ioctls that takes a few parameters; generally one of them is a length and another a type.
From: brad
2004-11-14 09:24 am (UTC)
Forget the notation, but I do remember:

rw, length 110 (sizeof(mimd_t)), type 'm', sub parameter 0
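Those fields can be read straight off the 0xc06e6d00 request number strace printed. A minimal sketch, assuming the generic Linux `_IOC` layout used on x86 (8-bit number, 8-bit type, 14-bit size, 2-bit direction). The struct and function names are made up for illustration:

```c
#include <stdio.h>

/* Fields of an ioctl request number, assuming the generic Linux _IOC
 * layout: nr in bits 0-7, type in bits 8-15, size in bits 16-29 and
 * direction in bits 30-31 (1 = write, 2 = read, 3 = both). */
struct ioc_fields {
    unsigned dir;
    unsigned size;
    unsigned type;
    unsigned nr;
};

static struct ioc_fields ioc_decode(unsigned long req)
{
    struct ioc_fields f;

    f.nr   = req & 0xff;
    f.type = (req >> 8) & 0xff;
    f.size = (req >> 16) & 0x3fff;
    f.dir  = (req >> 30) & 0x3;
    return f;
}

/* Print the decoded fields in roughly the form used above. */
static void ioc_print(unsigned long req)
{
    static const char *dirs[] = { "none", "write", "read", "read/write" };
    struct ioc_fields f = ioc_decode(req);

    printf("0x%lx: dir=%s size=%u type='%c' nr=%u\n",
           req, dirs[f.dir], f.size, (int)f.type, f.nr);
}
```

Calling `ioc_print(0xc06e6d00UL)` reports read/write, size 110, type 'm', nr 0, which matches what I remembered from the driver source.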
From: jeffr
2004-11-14 09:27 am (UTC)
You could also use LD_PRELOAD to wrap ioctl and print this information from userland. It might be more convenient than hacking up the same thing in the kernel. Sounds like you're on your way now though.
From: brad
2004-11-14 09:35 am (UTC)
Oh, good idea.

I've never actually written something using LD_PRELOAD, but I've seen enough "automatic trashcan wrapping unlink" things around the web to figure it's simple enough.
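A rough sketch of what such a wrapper might look like, assuming Linux and a target that is dynamically linked and reaches ioctl through libc. It forwards with a raw `syscall(SYS_ioctl, ...)` rather than looking up the real function, which is a simplification; `TARGET_REQUEST`, `REQ_SIZE` and `dump` are names invented here, and the size is taken from the request number's size field:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Request number from the strace output; bits 16-29 hold the size of
 * the structure the argument points at (assumed _IOC layout). */
#define TARGET_REQUEST 0xc06e6d00UL
#define REQ_SIZE(req)  (((req) >> 16) & 0x3fff)

/* Hex-dump n bytes at p to stderr, 16 bytes per line. */
static void dump(const char *tag, const unsigned char *p, unsigned long n)
{
    unsigned long i;

    fprintf(stderr, "%s (%lu bytes):", tag, n);
    for (i = 0; i < n; i++)
        fprintf(stderr, "%s%02x", (i % 16 == 0) ? "\n  " : " ", p[i]);
    fputc('\n', stderr);
}

/* Takes the place of libc's ioctl() when this file is preloaded. */
int ioctl(int fd, unsigned long request, ...)
{
    va_list ap;
    void *arg;
    long ret;
    int saved_errno;
    int watch;

    va_start(ap, request);
    arg = va_arg(ap, void *);
    va_end(ap);

    /* Only touch the one request we care about; other ioctls may pass
     * integers rather than pointers as their argument. */
    watch = (request == TARGET_REQUEST && arg != NULL);

    if (watch)
        dump("before", arg, REQ_SIZE(request));

    ret = syscall(SYS_ioctl, (long)fd, request, arg);
    saved_errno = errno;

    if (watch)
        dump("after", arg, REQ_SIZE(request));

    errno = saved_errno;   /* the dump must not disturb the caller's errno */
    return (int)ret;
}
```

Built with something like `gcc -shared -fPIC -o ioctl_spy.so ioctl_spy.c` (file names are placeholders) and loaded with `LD_PRELOAD=./ioctl_spy.so`, this would print the struct before and after each matching call.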
From: brad
2004-11-14 09:50 am (UTC)
# objdump -R megamgr.bin

megamgr.bin: file format elf32-i386

objdump: megamgr.bin: not a dynamic object

So:

(gdb) break ioctl
No symbol table is loaded. Use the "file" command.


back to the kernel approach, eh?
From: jeffr
2004-11-14 07:16 pm (UTC)
Well, you could disassemble it and manually search for the ioctl syscall wrapper function. It wouldn't be as direct as it used to be, as Linux uses a mapped page to load the syscall code so that it can use sysenter/sysexit or syscall/sysret on the platforms that support it. So you'd have to find an indirect call that was set up to call syscall 54 (0x36). That might be a pain in the ass though. It's possible that the static binary just uses int $0x80 still as well, so you could try searching for that.
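A crude sketch of that last search, assuming the wrapper loads the syscall number with a plain `mov $0x36,%eax` (bytes b8 36 00 00 00) shortly before `int $0x80` (bytes cd 80). A compiler could set up %eax some other way, so a miss proves nothing, and the function name and window parameter are invented for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Byte patterns on i386: "mov $0x36,%eax" and "int $0x80".
 * 0x36 (54) is the ioctl syscall number on i386 Linux. */
static const unsigned char MOV_EAX_IOCTL[] = { 0xb8, 0x36, 0x00, 0x00, 0x00 };
static const unsigned char INT_80[]        = { 0xcd, 0x80 };

/* Scan buf for each "int $0x80" that has the mov entirely within the
 * `window` bytes before it. The offset of each matching int $0x80 is
 * stored in hits[] (up to max entries). Returns the total number of
 * matches, which may be larger than max. */
static size_t find_ioctl_int80(const unsigned char *buf, size_t len,
                               size_t window, size_t *hits, size_t max)
{
    size_t found = 0;
    size_t i, j, start;

    for (i = 0; i + sizeof INT_80 <= len; i++) {
        if (memcmp(buf + i, INT_80, sizeof INT_80) != 0)
            continue;

        start = (i > window) ? i - window : 0;
        for (j = start; j + sizeof MOV_EAX_IOCTL <= i; j++) {
            if (memcmp(buf + j, MOV_EAX_IOCTL, sizeof MOV_EAX_IOCTL) == 0) {
                if (found < max)
                    hits[found] = i;
                found++;
                break;
            }
        }
    }
    return found;
}
```

The hits are offsets into whatever buffer was scanned (say, the file read into memory), so they would still have to be mapped to load addresses before setting a breakpoint in gdb with `break *ADDR`.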
From: taral
2004-11-14 04:45 pm (UTC)
You'd have to patch gdb with a "catch syscall" command.
From: doubleyou
2004-11-14 08:26 pm (UTC)
I haven't written any device drivers in a long time, and the last time I did so was in 1996, for SCO Unix, Novell UnixWare (which SCO bought), and Solaris 2.5.1 x86. So this might be horribly naive and out-of-date.

What resources do you have access to in the Linux DDI/DKI? Can you do file operations? I was going to say to rewrite the ioctl function so that it shuttles the data off somewhere else (to say, a file on disk), before communicating it to or from the hardware. Though that's probably a bad idea due to the facts that a) there is probably a large volume of data, and b) it hardly makes the operations atomic any more. But if you can make your test cases really small, you might be able to get away with it.