Metasploit Framework payload generation process source code analysis

Summary

  • In Metasploit, an encoded payload is generated as follows:
    1. Assembly code fragments are combined into the payload's assembly code, which is then assembled into machine code.
    2. A decoder stub is generated.
    3. The raw payload machine code is aligned to 4 bytes and encoded with an encoder.
    4. A nop sled is generated.
    5. The final shellcode is: nop sled + decoder stub (minus its last 1-4 bytes, which are prepended to the payload for alignment) + encoded payload.
  • Encoder: removes bad bytes (such as '\x00') from the payload by encoding it, rather than by the traditional approach of rewriting instructions, for example replacing push 0 with xor ecx, ecx followed by push ecx.
  • If a stack overflow places the shellcode starting at the top of the stack, a nop sled is needed when using the shikata_ga_nai encoder. The decoder stub saves an FpuSaveState structure onto the stack, which overwrites several bytes below the top of the stack; the nop sled provides enough room so that this write does not corrupt the shellcode itself.
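
The layout from step 5 can be sketched in plain Ruby. This is only an illustration: the byte strings are made-up placeholders and build_shellcode is a hypothetical helper, not a Metasploit method.

```ruby
# Hypothetical helper: concatenates the three parts in the order described above.
def build_shellcode(nop_sled, decoder_stub, encoded_payload)
  (nop_sled || '') + decoder_stub + encoded_payload
end

sled    = ("\x90" * 8).b   # placeholder nop sled
stub    = 'STUBSTUB'.b     # placeholder for the decoder stub bytes
encoded = 'ENCODED!'.b     # placeholder for the encoded payload

shellcode = build_shellcode(sled, stub, encoded)
puts shellcode.bytesize    # 24
```

The nil guard on the sled mirrors the case where no nop sled is requested at all.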

Notation

  a -> b: method a calls method b
  a => b: method a is called first, then method b
  a -> b => c: within the implementation of method a, b and c are called in sequence
  a => {…}: the curly braces contain Ruby statements
  M1::C1#f1: the instance method f1 of class C1 in module M1
  M1::C1.f1: the class method f1 of class C1 in module M1

Source code analysis

  • In MSF, when the run command starts an exploit module, Msf::EncodedPayload#generate in lib/msf/core/encoded_payload.rb is called to generate the specified payload according to the user's configuration. Its call sequence is generate_raw => encode => generate_sled => {self.encoded = (self.nop_sled || '') + self.encoded}, so the final payload is nop sled + decoder stub (minus its last 1-4 bytes, which are prepended to the payload for alignment) + encoded payload.
    • Msf::EncodedPayload#generate_raw -> generate_complete -> Msf::Payload::Windows::ReverseTcp.generate -> generate_reverse_tcp: generates the assembly code, then assembles it into machine code with Metasm::Shellcode.assemble(Metasm::X86.new, combined_asm).encode_string.

    • Msf::EncodedPayload#encode: if no encoder is specified, this method tries, in turn, each encoder compatible with the CPU architecture and platform. Each encoder may encode the payload several times (the user can specify the number of iterations). Encoding stops as soon as one encoder succeeds, that is, when the encoded payload contains no bad bytes and, together with the minimum nop sled, fits within the available space.
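
The selection loop described above can be sketched as follows. This is a simplified model, not the framework's code: ToyEncoder, pick_encoding and the two toy encoders are hypothetical names invented for this illustration, and the success test (no bad bytes, and enough room left for a minimum sled) is an assumption about what "success" means here.

```ruby
# Hypothetical stand-in for an encoder: a name plus a lambda that transforms bytes.
ToyEncoder = Struct.new(:name, :transform)

# Try each encoder in order and return the first result that has no bad bytes
# and still leaves room for the minimum nop sled.
def pick_encoding(raw, encoders, badchars:, space:, min_nops: 0, iterations: 1)
  bad = badchars.bytes
  encoders.each do |enc|
    out = raw
    iterations.times { out = enc.transform.call(out) }
    next if out.bytes.any? { |b| bad.include?(b) }  # still contains a bad byte
    next if out.bytesize + min_nops > space          # no room left for the sled
    return [enc.name, out]                           # first success wins
  end
  nil
end

identity = ToyEncoder.new('identity', ->(buf) { buf })
xor_55   = ToyEncoder.new('xor_55', ->(buf) { buf.bytes.map { |b| b ^ 0x55 }.pack('C*') })

raw = "\x31\x00\x41".b
name, encoded = pick_encoding(raw, [identity, xor_55], badchars: "\x00".b, space: 16)
puts name                   # xor_55 (identity keeps the 0x00 byte and is rejected)
puts encoded.unpack1('H*')  # 645514
```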

      • Msf::Encoder#encode -> do_encode -> MetasploitModule#decoder_stub => MetasploitModule#encode_block. The strings produced by the latter two methods are concatenated to form the encoded payload. Each encoder is implemented as its own MetasploitModule class and overrides these two methods. Take the x86/shikata_ga_nai encoder as an example:
        • decoder_stub: generates the decoder stub. Each encoder implements this method on its own. Within this method, state.orig_buf is the unencoded payload, and state.buf will eventually hold the encoded payload.
          • generate_shikata_block:
            • Creates a large number of Rex::Poly::LogicalBlock instances:
              • Each instance has a @perms list (on initialization, the second and subsequent arguments are collected into @perms), and each item in the list represents one candidate instruction.
              • The elements of @perms can be either strings of machine code or Proc instances (calling Proc#call on them returns a string of machine code).
              • Rex::Poly::LogicalBlock#rand_perm randomly selects one instruction from @perms.
              • Rex::Poly::LogicalBlock#depends_on links these instances together. One of them is loop_inst, and the line loop_inst.generate(block_generator_register_blacklist, nil, state.badchars) uses it to generate a "polymorphic buffer". generate uses this instance together with the instances linked to it through depends_on.
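
The idea of a block holding interchangeable permutations can be sketched with a toy class. ToyBlock is a hypothetical stand-in written for this illustration and does not reproduce the Rex::Poly::LogicalBlock API; in particular, the Proc here takes no arguments.

```ruby
# Toy stand-in for a logical block: each permutation is either a String of
# machine code or a Proc that returns one when called.
class ToyBlock
  attr_reader :name, :perms

  def initialize(name, *perms)
    @name  = name
    @perms = perms   # everything after the first argument becomes the list
  end

  # Pick one permutation at random.
  def rand_perm
    @perms.sample
  end

  # Resolve a permutation to bytes, calling it first if it is a Proc.
  def render(perm)
    perm.respond_to?(:call) ? perm.call : perm
  end
end

# Two equivalent ways to zero ecx: "xor ecx, ecx" (31 c9) or "sub ecx, ecx" (29 c9).
clear_counter = ToyBlock.new('clear_counter', "\x31\xc9".b, -> { "\x29\xc9".b })
bytes = clear_counter.render(clear_counter.rand_perm)
puts bytes.unpack1('H*')
```

Because the choice is random, repeated runs print either 31c9 or 29c9, which is the source of the stub's polymorphism.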
            • Rex::Poly::LogicalRegister instances: represent a register number under a specific CPU architecture.
              • Initialization: in count_reg = Rex::Poly::LogicalRegister::X86.new('count', 'ecx'), the second argument 'ecx' is passed to Rex::Arch::X86.reg_number. That method upper-cases 'ecx' and passes it to Ruby's built-in Module#const_get, which looks up the constants defined in the Rex::Arch::X86 module and finds Rex::Arch::X86::ECX, whose value is the register number of ecx.
              • Inside a block instance, calling regnum_of(<Rex::Poly::LogicalRegister instance>) returns the corresponding register number.
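
The constant lookup described above can be reproduced with a small stand-in module. ToyX86 is a hypothetical name for this sketch; the values are the standard x86 register numbers.

```ruby
# Stand-in for a module of register constants.
module ToyX86
  EAX = 0
  ECX = 1
  EDX = 2
  EBX = 3

  # Upper-case the name and look it up among this module's constants.
  def self.reg_number(name)
    const_get(name.upcase)
  end
end

puts ToyX86.reg_number('ecx')   # 1
```

An unknown register name raises NameError, since const_get finds no matching constant.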
            • loop_inst.generate: repeatedly calls Rex::Poly::Permutation#do_generate until its return value buf contains no bad bytes.
              • generate_block_list(state, level): recursively builds the state.block_list list.
                1. Call generate_block_list on each block in the current block instance's @depends list, appending the results to state.block_list.
                2. Append [ self, perm ] to state.block_list, where self is the current block and perm is a permutation produced by rand_perm.
                3. Same as step 1, but with @next_blocks instead of @depends.
              • Iterate over the block_list obtained in the previous step, convert each item's perm into the machine code of the corresponding instruction, and append it to state.buffer, which yields the machine code of the decoder stub.
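
The three recursive steps and the retry loop can be sketched with a toy class. ToyNode is a hypothetical stand-in, not the Rex::Poly implementation; the seen hash that stops a block from being emitted twice is an addition of this sketch, and the instruction bytes are chosen only for illustration.

```ruby
# Toy block with dependencies and successors, mirroring the three steps above.
class ToyNode
  attr_reader :name

  def initialize(name, *perms)
    @name = name
    @perms = perms
    @depends = []
    @next_blocks = []
  end

  def depends_on(*blocks)
    @depends = blocks
  end

  def next_blocks(*blocks)
    @next_blocks = blocks
  end

  # Step 1: dependencies first. Step 2: this block with a random permutation.
  # Step 3: successors.
  def generate_block_list(list = [], seen = {})
    return list if seen[self]
    seen[self] = true
    @depends.each { |b| b.generate_block_list(list, seen) }
    list << [self, @perms.sample]
    @next_blocks.each { |b| b.generate_block_list(list, seen) }
    list
  end

  # Regenerate until the concatenated bytes contain no bad byte.
  def generate(badchars, max_tries: 100)
    bad = badchars.bytes
    max_tries.times do
      buf = generate_block_list.map { |_block, perm| perm }.join
      return buf if buf.bytes.none? { |b| bad.include?(b) }
    end
    nil
  end
end

clear = ToyNode.new('clear_counter', "\x31\xc9".b, "\x29\xc9".b)  # xor/sub ecx, ecx
init  = ToyNode.new('init_counter', "\xb1\x4b".b)                 # mov cl, 0x4b
loop_ = ToyNode.new('loop', "\xe2\xf5".b)                         # loop with rel8 = -11
init.depends_on(clear)
loop_.depends_on(init)

order = loop_.generate_block_list.map { |block, _perm| block.name }
puts order.inspect   # ["clear_counter", "init_counter", "loop"]
puts loop_.generate("\x00".b).unpack1('H*')
```

Generating from the last block pulls in everything it depends on, so the dependency order alone fixes the instruction order while the permutation choice stays random.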
          • The last few bytes of the decoder stub are cut off and placed at the beginning of state.buf, so that state.buf becomes 4-byte aligned.
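
A minimal sketch of this alignment trick follows. align_with_stub_tail is a hypothetical helper, and the cut-off formula 4 - (length % 4), which always moves between 1 and 4 bytes, is an assumption of this sketch rather than a quotation of the encoder's source.

```ruby
# Move the tail of the stub to the front of the payload so that the buffer to
# be encoded becomes a multiple of 4 bytes long.
# Assumption: the cut-off is 4 - (length % 4), i.e. always between 1 and 4.
def align_with_stub_tail(stub, payload)
  cutoff = 4 - (payload.bytesize % 4)
  head   = stub.byteslice(0, stub.bytesize - cutoff)
  tail   = stub.byteslice(stub.bytesize - cutoff, cutoff)
  [head, tail + payload]
end

stub    = 'ABCDEFGH'.b   # placeholder stub bytes
payload = 'payload'.b    # 7 bytes, so 1 byte is borrowed from the stub

head, buf = align_with_stub_tail(stub, payload)
puts head           # ABCDEFG
puts buf            # Hpayload
puts buf.bytesize   # 8
```

The borrowed stub bytes are encoded along with the payload, which is why the disassembled stub appears to be missing its final instructions until the decoder has run.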
        • If the decoder stub contains the "XORK" placeholder, it is replaced with the real key (real_key), a key without bad bytes that is generated in the encode function and used as the key during encoding.
          • Disassembling the generated stub after real_key has been substituted (for example with Rex::Assembly::Nasm.disassemble and puts) shows the following assembly code:
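fcmovu st5 ; records the address of this instruction as FPUDataPointer in the structure below (any FPU instruction would do)
fnstenv [esp-0xc] ; saves the FpuSaveState structure on the stack at esp-0xc, so the top of the stack holds FPUDataPointer, i.e. the address of the fcmovu instruction above
pop ebx ; ebx = address of the fcmovu instruction
sub ecx,ecx ; zero ecx, which serves as the loop counter
mov cl,0x4b

; offset 0x10: start of the loop body
xor [ebx+0x12],esi ; 0x12 is the distance from the fcmovu instruction to the first byte after this stub, so this instruction decodes the first 4 bytes of the encoded part
add ebx,byte +0x4
db 0x03
; The loop instruction is missing from this stub; it is restored by the xor above, as follows:
; add esi, [ebx + 0x12] ; add the original data to the first key to get the next key
; loop 0x10 ; machine code \xe2\xf5; \xf5 is the signed 8-bit displacement -11, applied to the address of the instruction after loop, which gives the loop head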
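
The displacement arithmetic in the last comment can be checked with a few lines of Ruby. This is only a worked example; the 0x10 loop head and the \xf5 operand are taken from the listing above.

```ruby
# Interpret the loop operand 0xf5 as a signed 8-bit displacement.
disp = [0xf5].pack('C').unpack1('c')
puts disp                               # -11

# The displacement is relative to the address of the instruction after loop.
# With the loop head at offset 0x10, that next instruction sits at 0x10 + 11.
loop_head = 0x10
next_ip   = loop_head - disp
puts format('0x%02x', next_ip)          # 0x1b
puts format('0x%02x', next_ip + disp)   # 0x10
```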

ESP[4]: FPUStatusWord;
ESP[8]: FPUTagWord;
ESP[0x0c]: FPUDataPointer; // points to the last executed FPU instruction
ESP[0x10]: FPUInstructionPointer;
ESP[0x14]: FPULastInstructionOpcode;
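
The "XORK" substitution mentioned at the start of this item can be sketched as below. substitute_key is a hypothetical helper, the stub bytes are illustrative, and packing the key as a little-endian 32-bit value is an assumption that matches a 4-byte key on x86.

```ruby
# Replace the 4-byte "XORK" placeholder in a stub with the real key,
# packed as a little-endian 32-bit value.
def substitute_key(stub, key)
  offset = stub.index('XORK'.b)
  raise ArgumentError, 'no XORK placeholder in stub' if offset.nil?

  patched = stub.dup
  patched[offset, 4] = [key].pack('V')
  patched
end

# Illustrative stub: mov esi, <key> followed by fnstenv [esp-0xc].
stub = "\xbe".b + 'XORK'.b + "\xd9\x74\x24\xf4".b
patched = substitute_key(stub, 0x11223344)
puts patched.unpack1('H*')   # be44332211d97424f4
```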

  • The original payload is split into 4-byte blocks, and each block is encoded with encode_block (a final block shorter than 4 bytes is padded with zeros).
    • encode_block: uses the encoding method inherited from Msf::Encoder::XorAdditiveFeedback. Here orig is the original block (4 bytes) and oblock is the encoded 4-byte output, obtained by XORing orig with key. key and orig are then added, and the low 4 bytes of the sum become the key for the next round of encoding.
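
The additive-feedback scheme can be sketched as a standalone round trip. The function names, the sample bytes and the key are chosen for this illustration; only the XOR-then-add rule comes from the description above.

```ruby
MASK = 0xffffffff

# Encode: each 4-byte block is XORed with the current key, then the key is
# advanced by adding the original (unencoded) block, truncated to 32 bits.
def xor_additive_encode(buf, key)
  padded = buf + ("\x00".b * ((4 - buf.bytesize % 4) % 4))
  padded.unpack('V*').map do |orig|
    oblock = orig ^ key
    key = (key + orig) & MASK
    oblock
  end.pack('V*')
end

# Decode: XOR recovers the original block, which then advances the key the
# same way, so encoder and decoder stay in step.
def xor_additive_decode(buf, key)
  buf.unpack('V*').map do |oblock|
    orig = oblock ^ key
    key = (key + orig) & MASK
    orig
  end.pack('V*')
end

payload = "\xfc\xe8\x82\x00\x00\x00".b   # 6 bytes, padded to 8 with zeros
key     = 0x11223344
encoded = xor_additive_encode(payload, key)
decoded = xor_additive_decode(encoded, key)
puts encoded.unpack1('H*')
puts decoded.unpack1('H*')   # fce8820000000000
```

Because each key depends on the previous plaintext block, the decoder only needs the first key; every later key is recomputed from the data it has just decoded.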

  • Msf::EncodedPayload#generate_sled: generates the nop sled, which is prepended to the encoded payload.

    • modules/nops/x86/single_byte.rb
      • Picks a given number of instructions from a pool of single-byte filler instructions, such as: nop; xchg eax,edi; cdq; dec ebp; inc edi; aaa; daa; das; cld; std; clc; stc; cmc; cwde; lahf; wait; salc; etc.
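
A simplified sled generator over the instructions listed above might look like this. toy_sled is a hypothetical helper; unlike the real module, this sketch does not track which registers or flags must be preserved and only filters out bad bytes.

```ruby
# Single-byte x86 instructions usable as filler when the registers and flags
# they clobber do not matter.
SINGLE_BYTE_NOPS = {
  'nop' => 0x90, 'xchg eax,edi' => 0x97, 'cdq' => 0x99, 'dec ebp' => 0x4d,
  'inc edi' => 0x47, 'aaa' => 0x37, 'daa' => 0x27, 'das' => 0x2f,
  'cld' => 0xfc, 'std' => 0xfd, 'clc' => 0xf8, 'stc' => 0xf9,
  'cmc' => 0xf5, 'cwde' => 0x98, 'lahf' => 0x9f, 'wait' => 0x9b, 'salc' => 0xd6
}.freeze

# Hypothetical helper: build a sled of `length` bytes, skipping bad bytes.
def toy_sled(length, badchars: ''.b, rng: Random.new)
  bad = badchars.bytes
  candidates = SINGLE_BYTE_NOPS.values.reject { |op| bad.include?(op) }
  raise ArgumentError, 'every candidate is a bad byte' if candidates.empty?

  Array.new(length) { candidates.sample(random: rng) }.pack('C*')
end

sled = toy_sled(16, badchars: "\x90".b)
puts sled.unpack1('H*')
```

Mixing several opcodes instead of repeating 0x90 makes the sled harder to match with a simple byte signature.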