Friday, January 2, 2015

Practical Reverse Engineering p. 35 #4

Question number 4 on page 35 of Practical Reverse Engineering is as follows:

Implement the following functions in x86 assembly:
  • strlen
  • strchr
  • memcpy
  • memset
  • strcmp
  • strset

Here is the C prototype for strlen:

size_t strlen(const char *str);

This function returns the number of characters that precede the null terminator in a C string. We set AL to zero and scan the string at EDI with REPNE SCASB, which decrements ECX once for every byte it examines, the terminator included. Starting from ECX = -1, a string of length n leaves ECX = -(n + 2). A NOT turns that into n + 1, and subtracting 1 gives the length.

_strlen:    
    push edi

    mov edi, [esp + 0x8]  ; char *str
    xor eax, eax          ; al = '\0'
    xor ecx, ecx          ; ecx = 0
    not ecx               ; ecx = -1 (0xFFFFFFFF)

    cld
    repne scasb

    not ecx               ; ecx = bytes scanned = length + 1
    lea eax, [ecx - 0x1]  ; eax = length

    pop edi
    ret
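
To check the counter arithmetic, here is a minimal C sketch that mimics the registers step by step. It is only a model of the listing above, and `model_strlen` is a hypothetical name chosen to avoid clashing with the real library function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the REPNE SCASB counting trick. */
size_t model_strlen(const char *str)
{
    const unsigned char *edi = (const unsigned char *)str;
    uint32_t ecx = 0xFFFFFFFFu;      /* xor ecx, ecx / not ecx */

    /* repne scasb with al = 0: examine one byte, advance edi,
       decrement ecx, and stop after the byte that equals al. */
    for (;;) {
        unsigned char b = *edi++;
        ecx--;
        if (b == 0)
            break;
    }

    ecx = ~ecx;                      /* not ecx: bytes scanned = length + 1 */
    return (size_t)(ecx - 1);        /* lea eax, [ecx - 0x1] */
}
```

For the string "hello", six bytes are scanned, ECX ends at 0xFFFFFFF9, the NOT gives 6, and the final subtraction gives 5.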



Here is the C prototype for strchr:

char *strchr(char *str, int character);

The search in strchr looks very similar to the one in strlen, with two main differences. Instead of searching for the null byte, we search for a character passed by the caller. And instead of returning a length, we return a pointer to the first occurrence of that character.

We could naively adapt the strlen code, but REPNE SCASB stops only on a match or when ECX runs out, so searching for a character that is not in the string would run past the terminator into unknown memory. We need logic to check for the null byte as well, which makes the function more complicated. We could call strlen first to bound the scan, but that walks the string twice. It is more efficient to fall back on a plain loop with explicit comparisons.

_strchr:   
    mov eax, [esp + 0x4]   ; char *str
    mov ecx, [esp + 0x8]   ; int character (only cl is compared)
 
chr_loop:
    mov dl, [eax]          ; edx is caller-saved

    cmp cl, dl             ; *eax == c
    je chr_leave

    inc eax

    test dl, dl            ; check '\0'
    jnz chr_loop

    xor eax, eax           ; nullptr on fail

chr_leave:  
    ret
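
The control flow is easier to check in C. The following is a rough sketch of the same loop, with the hypothetical name `model_strchr`. Note the order of the two tests: the match is checked before the terminator, so searching for '\0' returns a pointer to the terminator, as the standard function does.

```c
#include <stddef.h>

/* Hypothetical C model of the chr_loop logic. */
char *model_strchr(char *str, int character)
{
    char c = (char)character;    /* only the low byte takes part */

    for (;;) {
        char d = *str;           /* mov dl, [eax] */
        if (d == c)
            return str;          /* je chr_leave: eax still points at the match */
        str++;                   /* inc eax */
        if (d == '\0')
            return NULL;         /* xor eax, eax: not found */
    }
}
```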



Here is the C prototype for memcpy:

void *memcpy (void *destination, const void *source, size_t num);

This function maps very conveniently onto x86. We use REP MOVSB, which copies one byte at a time from the address in ESI to the address in EDI, advancing both pointers and decrementing ECX until ECX reaches 0.

_memcpy:
    push edi
    push esi

    mov edi, [esp + 0xc]    ; void *destination
    mov esi, [esp + 0x10]   ; const void *source
    mov ecx, [esp + 0x14]   ; size_t num

    mov eax, edi            ; return dest

    cld
    rep movsb               ; ends at ecx = 0

    pop esi
    pop edi
    ret
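
As a sketch of what REP MOVSB does with the direction flag clear, here is a byte-by-byte C equivalent. The name `model_memcpy` is hypothetical. Like the real memcpy, it makes no promise about overlapping buffers.

```c
#include <stddef.h>

/* Hypothetical C model of the REP MOVSB copy. */
void *model_memcpy(void *destination, const void *source, size_t num)
{
    unsigned char *edi = destination;
    const unsigned char *esi = source;
    size_t ecx = num;

    while (ecx != 0) {       /* rep: repeat until ecx reaches 0 */
        *edi++ = *esi++;     /* movsb: copy one byte, advance both pointers */
        ecx--;
    }

    return destination;      /* eax was saved before the copy moved edi */
}
```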



Here is the C prototype for memset:

void *memset (void *ptr, int value, size_t num);

This implementation is similar to memcpy. We switch to REP STOSB, which stores AL at the address in EDI, advancing EDI and decrementing ECX until ECX reaches 0.

_memset:
    push edi

    mov edi, [esp + 0x8]    ; void *ptr
    mov eax, [esp + 0xc]    ; int value (only al is stored)
    mov ecx, [esp + 0x10]   ; size_t num

    push edi                ; store dest

    cld
    rep stosb               ; ends at ecx = 0

    pop eax                 ; return dest

    pop edi
    ret
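
A minimal C sketch of the REP STOSB fill, under the hypothetical name `model_memset`. It also shows that only the low byte of the int argument is written, which is why loading the whole argument into EAX and storing AL is enough.

```c
#include <stddef.h>

/* Hypothetical C model of the REP STOSB fill. */
void *model_memset(void *ptr, int value, size_t num)
{
    unsigned char *edi = ptr;
    unsigned char al = (unsigned char)value;  /* low byte of eax */
    size_t ecx = num;

    while (ecx != 0) {       /* rep: repeat until ecx reaches 0 */
        *edi++ = al;         /* stosb: store al, advance edi */
        ecx--;
    }

    return ptr;              /* the original pointer, saved before the fill */
}
```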



Here is the C prototype for strcmp:

int strcmp(const char *str1, const char *str2);

The strcmp function returns 0 if both strings are equal, a value > 0 if the first character that does not match has a greater value in str1 than in str2, and a value < 0 if it has a lesser value.

For this function we need to change our strategy. The REPE CMPSB instruction looks like a great fit, but it needs a byte count in ECX, so we would have to precompute both strings' lengths and use the minimum. A plain loop with explicit comparisons is the better implementation: it looks like more code, but it skips two strlen passes and so ends up doing less work.

_strcmp:
    push edi
    push esi

    mov esi, [esp + 0xc]    ; const char *str1
    mov edi, [esp + 0x10]   ; const char *str2
    
cmp_loop:
    movzx eax, byte [esi]   ; zero-extend *esi into eax
    movzx ecx, byte [edi]   ; zero-extend *edi into ecx
    sub eax, ecx            ; eax = *esi - *edi, signed 32-bit result
    jne cmp_leave

    test cl, cl     ; check for '\0'
    jz cmp_leave

    inc esi         ; ++esi, ++edi
    inc edi
    jmp cmp_loop

cmp_leave:
    pop esi
    pop edi
    ret
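
The sign of the result deserves a cross-check. An 8-bit subtraction cannot carry it on its own: 'a' - 'b' is 0xFF in a byte, which reads as 255 rather than -1 once it sits in an otherwise zeroed 32-bit register. The C sketch below, with the hypothetical name `model_strcmp`, widens both bytes to int before subtracting, which is one way to satisfy the contract described above.

```c
/* Hypothetical C model of the comparison loop with a full-width difference. */
int model_strcmp(const char *str1, const char *str2)
{
    const unsigned char *esi = (const unsigned char *)str1;
    const unsigned char *edi = (const unsigned char *)str2;

    for (;;) {
        int a = *esi;            /* byte widened to int, range 0..255 */
        int c = *edi;
        int diff = a - c;        /* range -255..255, sign preserved */

        if (diff != 0)
            return diff;         /* first mismatch decides the result */
        if (c == 0)
            return 0;            /* both strings ended together */

        esi++;
        edi++;
    }
}
```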



Here is the C prototype for strset:

char *strset(char *str, int ch);

This function isn't part of the standard library. We assume it works similarly to memset, except that it stops at the null terminator instead of taking a length. There are two strategies we could take: precompute the length with strlen and then do a memset, or use a plain loop with explicit comparisons for a tighter, single-pass implementation.

_strset:
    push edi

    mov edi, [esp + 0x8]    ; char *str
    mov ecx, [esp + 0xc]    ; int ch (only cl is stored)

    push edi                ; store dest

set_loop:
    mov al, [edi]
    test al, al             ; check '\0'
    jz set_leave

    mov [edi], cl           ; *edi = ch
    inc edi
    
    jmp set_loop

set_leave:
    pop eax                 ; return dest

    pop edi
    ret
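
Under the same assumption about its behavior (fill up to, but not including, the terminator), the loop can be sketched in C as follows. The name `model_strset` is hypothetical.

```c
/* Hypothetical C model of set_loop: overwrite every character before '\0'. */
char *model_strset(char *str, int ch)
{
    char *edi = str;

    while (*edi != '\0') {   /* test al, al / jz set_leave */
        *edi = (char)ch;     /* mov [edi], cl */
        edi++;               /* inc edi */
    }

    return str;              /* the original pointer, saved before the loop */
}
```

The terminator is left in place, so the string keeps its length. An empty string is returned unchanged.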
