03 March 2026

🚀 RFC 1071 Standards: Checksum with x64 Assembly

 Data Sealing to RFC 1071 Standards: Anatomy of a Checksum with x64 Assembly

As the development of my ICMP-based Reverse Shell project continues at full throttle, today I want to talk about the most "diplomatic" part of the operation: the Checksum. If you don't stamp this seal correctly on the packet you're sending, the victim machine's operating system treats your packet as a "forgery" and dumps it in the trash before it even gets through the door.

So, how exactly is this "seal" calculated in a low-level language? Let's examine it step-by-step through the very algorithm I wrote and currently use in my project.


🛠️ The Heart of the Algorithm: perform_checksum

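The ICMP protocol protects data integrity with the Internet checksum: the 16-bit one's complement of the one's complement sum of the message. In practice, this means you add up the entire packet in 16-bit (2-byte) chunks, with the checksum field itself set to zero while you compute.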

Here is what this mathematical operation looks like in the x64 Assembly realm:

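Note: In this context, rdi holds the starting address of our data buffer, r14 is the starting offset, and r15 is the ending offset (exclusive). This is the project's own register convention, not the System V calling convention: the routine clobbers r10, r11, and r12 and returns the checksum in ax.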

--------------------------------------------------------------

perform_checksum:
    ; RFC 1071 standard 16-bit one's complement sum algorithm
    xor eax, eax                    ; Clear eax (32-bit accumulator for the sum)
    mov r10, r14                    ; r10 = Current offset
.loop:
    mov r11, r15                    ; r11 = End offset
    sub r11, r10                    ; r11 = Remaining bytes to process
    cmp r11, 1                      ; Fewer than 2 bytes left?
    jle .last                       ; If <= 1 byte left, jump to final block

    movzx r12d, word [rdi + r10]    ; Read 2 bytes (1 word) zero-extended
    add eax, r12d                   ; Add to accumulator
    add r10, 2                      ; Move offset forward by 2 bytes
    jmp .loop                       ; Repeat

--------------------------------------------------------------

🧩 Part 1: Gathering the Pieces

We are essentially telling the CPU: "Fetch me a 16-bit (word) chunk from memory, add it to the eax register, and move to the next 2 bytes." This loop keeps running until fewer than two bytes remain.
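
For readers who think more easily in C, the loop can be mirrored in a few lines. This is only a minimal sketch for cross-checking the assembly, not code from the project: the name sum_words and its parameters are hypothetical stand-ins for rdi (buffer), r14 (start offset), and r15 (end offset), and it assumes a little-endian host such as x86-64, so each word is assembled the way the word load in the loop would produce it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C mirror of the .loop block: adds every complete
 * 16-bit word in buf[start, end) into a 32-bit accumulator.
 * A trailing odd byte is deliberately left for the caller. */
static uint32_t sum_words(const uint8_t *buf, size_t start, size_t end)
{
    assert(buf != NULL && start <= end);

    uint32_t sum = 0;                  /* plays the role of eax */
    size_t off = start;                /* plays the role of r10 */

    while (end - off > 1) {            /* same exit test as cmp r11, 1 / jle */
        /* Little-endian word: first byte in memory is the low byte. */
        uint32_t word = (uint32_t)buf[off] | ((uint32_t)buf[off + 1] << 8);
        sum += word;
        off += 2;
    }
    return sum;
}
```

Called on a 7-byte buffer, this sketch consumes three words and leaves the seventh byte untouched, which is the state in which the assembly leaves the loop.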

⚖️ Part 2: The "Odd Byte" Paradox

If the total length of the packet is an odd number (e.g., 11 bytes), the very last byte won't have a pair to form a 16-bit word. RFC 1071 treats that byte as if it were padded with a zero byte. In this scenario, the loop exits to .last, which hands the lone byte over to the .final block:

--------------------------------------------------------------
.last:
    je .final                   ; Flags are still those of cmp r11, 1: exactly 1 byte left
    jmp .wrap                   ; If 0 bytes left, finalize calculation
.final:
    movzx r12d, byte [rdi + r10]; Read the last remaining single byte
    add eax, r12d               ; Add it to the accumulator
--------------------------------------------------------------
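
Why is it enough to add the lone byte as-is, without an explicit padding step? A small sketch, assuming a little-endian host (the two helper names below are made up for illustration): in a little-endian word load the first byte in memory becomes the low byte, so a zero pad placed after it lands in the high byte and contributes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What .final does: the last byte is zero-extended and added directly. */
static uint32_t tail_as_byte(const uint8_t *buf, size_t end)
{
    assert(buf != NULL && end > 0);
    return buf[end - 1];
}

/* What RFC 1071 describes: pad with a zero byte, then read a full word.
 * The word is assembled little-endian, as an x86-64 load would do. */
static uint32_t tail_as_padded_word(const uint8_t *buf, size_t end)
{
    assert(buf != NULL && end > 0);
    uint8_t padded[2] = { buf[end - 1], 0x00 };
    return (uint32_t)padded[0] | ((uint32_t)padded[1] << 8);
}
```

Both helpers return the same value for any buffer. On a big-endian host the shortcut would not hold, and the lone byte would have to be shifted into the high half of the word instead.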
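
🔄 Part 3: The Wrap and Carry

Because eax is 32 bits wide, this continuous addition can grow past the 16-bit boundary. This is where the most critical aspect of RFC 1071 comes into play: the end-around carry, which adds the overflowing bits (the carry) back into the low 16 bits of the sum. That addition can itself overflow one more time, which is exactly what the adc ax, 0 instruction catches.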

--------------------------------------------------------------
.wrap:
    mov r11d, eax               ; Copy sum to r11d
    shr r11d, 16                ; Shift right to isolate the carry bits
    and eax, 0xFFFF             ; Mask eax to keep only the lower 16 bits
    add ax, r11w                ; Add the carry bits back to the sum
    adc ax, 0                   ; Add any final carry (add with carry)
    not ax                      ; One's complement (invert bits) for final checksum
    ret
--------------------------------------------------------------
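
The fold can be checked by hand against a C counterpart. The sketch below is an illustration only (fold_and_invert is a hypothetical name, not something from the project); each line is annotated with the instruction of .wrap it is meant to correspond to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C counterpart of .wrap: fold a 32-bit running sum into
 * 16 bits with end-around carry, then invert the result. */
static uint16_t fold_and_invert(uint32_t sum)
{
    uint32_t low    = sum & 0xFFFFu;   /* and eax, 0xFFFF */
    uint32_t high   = sum >> 16;       /* shr r11d, 16 */
    uint32_t folded = low + high;      /* add ax, r11w (may carry out of bit 15) */

    folded = (folded & 0xFFFFu) + (folded >> 16);   /* adc ax, 0 */
    assert(folded <= 0xFFFFu);         /* the second fold can never carry again */

    return (uint16_t)~folded;          /* not ax */
}
```

As a worked example, the words 0x0001, 0xF203, 0xF4F5, and 0xF6F7 add up to 0x2DDF0. Folding gives 0xDDF0 + 0x2 = 0xDDF2, and inverting that yields 0x220D, so fold_and_invert(0x2DDF0) returns 0x220D.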

🎯 Why not ax?

The not instruction at the very end is the final requirement of the One's Complement logic. By inverting the bits (0 -> 1, 1 -> 0), we ensure that when the receiving end adds up the whole packet the same way, this time with our checksum left in place, the folded sum comes out as 0xFFFF (all ones), or 0 once the final inversion is applied as well. If it does, the data is clean, and our seal is valid!
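
The round trip can be sketched end to end in C. Everything here is illustrative rather than taken from the project: ones_sum, seal_icmp, and seal_is_valid are hypothetical names, the code assumes a little-endian host, and it relies only on the fact that the ICMP checksum occupies bytes 2 and 3 of the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Folded one's complement sum of len bytes, WITHOUT the final inversion.
 * Words are read little-endian, as on x86-64. */
static uint16_t ones_sum(const uint8_t *data, size_t len)
{
    assert(data != NULL);
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 1 < len; i += 2)
        sum += (uint32_t)data[i] | ((uint32_t)data[i + 1] << 8);
    if (i < len)
        sum += data[i];                       /* odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);  /* end-around carry */
    return (uint16_t)sum;
}

/* Sender side: zero the checksum field, compute the checksum, and store
 * it in the same byte order in which the words were summed. */
static void seal_icmp(uint8_t *pkt, size_t len)
{
    assert(pkt != NULL && len >= 4);
    pkt[2] = 0;
    pkt[3] = 0;
    uint16_t csum = (uint16_t)~ones_sum(pkt, len);
    pkt[2] = (uint8_t)(csum & 0xFF);          /* low byte first: LE store */
    pkt[3] = (uint8_t)(csum >> 8);
}

/* Receiver side: the sum over the sealed packet must be all ones. */
static int seal_is_valid(const uint8_t *pkt, size_t len)
{
    return ones_sum(pkt, len) == 0xFFFFu;
}
```

Sealing an 11-byte echo request and then flipping a single payload bit is a quick way to see both outcomes: the untouched packet validates, the modified one does not.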

Conclusion

Writing this algorithm in Assembly is a fantastic exercise to truly understand how data is laid out in memory and how the CPU crunches bytes. Thanks to this algorithm, our custom ICMP packets pass the receiving kernel's checksum validation instead of being dropped, and roam the network like "official documents".

When I integrate dynamic targeting and fileless execution (memfd_create) into my C2 architecture, this checksum engine will remain the most reliable gear in the machine.

Stay Coded!
