User talk:Seb76

From UFOpaedia
Revision as of 19:10, 17 June 2008 by Seb76 (talk | contribs)
Jump to navigation Jump to search

Hey, sorry to pester you again. :) I've gotten access to IDA, as you suggested, and with it I'm making some slow progress toward my mod. I wanted to ask, though, do you know of any sort of tutorial or useful intro for it? The user interface is pretty obtuse, the built-in help has nothing useful, and I've been struggling just to make comments go where I want them to.

(I mean, I understand that it's meant for very advanced users, but Jesus, who writes an enterprise-grade utility and doesn't bother to implement an Undo function?!?)

Thanks again for your help! Phasma Felis 23:15, 16 June 2008 (PDT)


Okay, a little more progress since I discovered anterior comments. Couple of more specific questions: what's the difference between a "comment" and a "repeatable comment"? Or any of the several other types of comments, for that matter.

What exactly does "mov cs:word_102F9, ax" do? At first I thought it was just copying the accumulator into the data word at 02F9, but the "cs:" part is confusing. word_102F9 is 0, I think ("seg000:02F9 word_102F9 dw 0"). Does that mean it's copying AX into the current code segment, offset 0, modifying the code in progress? That seems odd.

Okay, one more and then I'll go to bed: what does "jmp short $+2" do? It looks like it just means "jump to next instruction", which is kinda redundant, but it could be "jump over next instruction", which...still seems unnecessarily verbose. I dunno. Phasma Felis 00:51, 17 June 2008 (PDT)

The last two questions are actually general Intel 16-bit assembly ;)
The cs in "mov cs:word_102F9, ax" is the 16-bit code segment base, yes. It *might* be self-modifying code, but more likely there is a C global or static variable that was implemented there and being updated. The "seg000:02F9 word_102F9 dw 0" is probably from C default initialization, but could be from an explicit initialization to 0.
Back in the 16bit days, there were several memory models. My knowledge on this is quite rusty, but IIRC COM executables were using the "tiny" one which means that the code and data use the same segment (I assume you're working on the music TSR?). Modification of data via the CS segment is not necessarily self-modifying code. Also TSRs were usually signaled using software interruptions so the code most likely sets up an interrupt vector and bails out. e.g.:
seg000:0140 mov     dx, 157h
seg000:0143 push    ds
seg000:0144 push    cs
seg000:0145 pop     ds
seg000:0146 mov     ax, 2566h
seg000:0149 int     21h                             ; DOS - SET INTERRUPT VECTOR
seg000:0149                                         ; AL = interrupt number
seg000:0149                                         ; DS:DX = new vector to be used for specified interrupt
seg000:014B pop     ds
seg000:014C call    sub_1067A
seg000:014F mov     dx, ax
seg000:0151 mov     ax, 3100h
seg000:0154 int     21h                             ; DOS - DOS 2+ - TERMINATE BUT STAY RESIDENT
seg000:0154 start endp                              ; AL = exit code, DX = program size, in paragraphs
In this example (from music.com), there is code at 157h but IDA does not detect it. You can get there, type 'C' and create a new function. The code there is the most important. HTH Seb76 12:10, 17 June 2008 (PDT)
"jmp short $+2" is jump over the next instruction, if the next instruction is 2 bytes. This probably came from an if-then-else in C (it's a common idiom in translating C to assembly). -- Zaimoni, 12:36 Jun 17 2008 CDT
I can see several instances of this in music.com for simple "return value" functions. Most likely a "feature" of the compiler. If used for padding, it is equivalent to 2 nop instructions, but takes only one cycle to execute. This was before deeply pipelined processors though ;-) Seb76 12:10, 17 June 2008 (PDT)