Angr学习
上次面山石的时候,对面的师傅让我跟随这个angr_ctf的项目来学一学,那来试试看
配置安装Angr
首先得先装上Angr吧,先根据Angr官方文档上面的指引:https://docs.angr.io/introductory-errata/install
*nix系列的系统均可以这样安装
sudo apt-get install python3-dev libffi-dev build-essential virtualenvwrapper
mkvirtualenv --python=$(which python3) angr && pip install angr
有的系统的Virtualenv的路径有些奇怪,可以在apt安装完之后加一句
source `which virtualenvwrapper.sh`
拉取angr_ctf仓库
使用git拉取
git clone https://github.com/jakespringer/angr_ctf.git
根据readme可以知道,使用package.py可以直接打包所有题目,我们在刚刚的virtualenv环境中运行
python3 package.py ./package
即可在package目录下得到所有编译好的文件了。
00_angr_find
使用ida打开,可以观察到就是一个非常正常的check程序。可以使用angr快速解题
我们看看对应的示例脚本
顶部是对脚本使用的介绍以及对于ctf的简单介绍。
# Before you begin, here are a few notes about these capture-the-flag
# challenges.
#
# Each binary, when run, will ask for a password, which can be entered via stdin
# (typing it into the console.) Many of the levels will accept many different
# passwords. Your goal is to find a single password that works for each binary.
#
# If you enter an incorrect password, the program will print "Try again." If you
# enter a correct password, the program will print "Good Job."
#
# Each challenge will be accompanied by a file like this one, named
# "scaffoldXX.py". It will offer guidance as well as the skeleton of a possible
# solution. You will have to edit each file. In some cases, you will have to
# edit it significantly. While use of these files is recommended, you can write
# a solution without them, if you find that they are too restrictive.
#
# Places in the scaffoldXX.py that require a simple substitution will be marked
# with three question marks (???). Places that require more code will be marked
# with an ellipsis (...). Comments will document any new concepts, but will be
# omitted for concepts that have already been covered (you will need to use
# previous scaffoldXX.py files as a reference to solve the challenges.) If a
# comment documents a part of the code that needs to be changed, it will be
# marked with an exclamation point at the end, on a separate line (!).
import angr
import sys
def main(argv):
# Create an Angr project.
# If you want to be able to point to the binary from the command line, you can
# use argv[1] as the parameter. Then, you can run the script from the command
# line as follows:
# python ./scaffold00.py [binary]
# (!)
path_to_binary = "./00_angr_find" # :string
project = angr.Project(path_to_binary)
# Tell Angr where to start executing (should it start from the main()
# function or somewhere else?) For now, use the entry_state function
# to instruct Angr to start from the main() function.
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# Create a simulation manager initialized with the starting state. It provides
# a number of useful tools to search and execute the binary.
simulation = project.factory.simgr(initial_state)
# Explore the binary to attempt to find the address that prints "Good Job."
# You will have to find the address you want to find and insert it here.
# This function will keep executing until it either finds a solution or it
# has explored every possible path through the executable.
# (!)
print_good_address = 0x80492F0 # :integer (probably in hexadecimal)
simulation.explore(find=print_good_address)
# Check that we have found a solution. The simulation.explore() method will
# set simulation.found to a list of the states that it could find that reach
# the instruction we asked it to search for. Remember, in Python, if a list
# is empty, it will be evaluated as false, otherwise true.
if simulation.found:
# The explore method stops after it finds a single state that arrives at the
# target address.
solution_state = simulation.found[0]
# Print the string that Angr wrote to stdin to follow solution_state. This
# is our solution.
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
# If Angr could not find a path that reaches print_good_address, throw an
# error. Perhaps you mistyped the print_good_address?
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
对程序逐句分析:
首先程序选取当前路径的./00_angr_find创建了一个angr项目,注释提示如果想从命令行输入路径也可以将path改为argv[1]
# Create an Angr project.
# If you want to be able to point to the binary from the command line, you can
# use argv[1] as the parameter. Then, you can run the script from the command
# line as follows:
# python ./scaffold00.py [binary]
# (!)
path_to_binary = "./00_angr_find" # :string
project = angr.Project(path_to_binary)
然后设置了angr的initial_state为项目的entry_state
# Tell Angr where to start executing (should it start from the main()
# function or somewhere else?) For now, use the entry_state function
# to instruct Angr to start from the main() function.
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
在入口点创建一个模拟器
# Create a simulation manager initialized with the starting state. It provides
# a number of useful tools to search and execute the binary.
simulation = project.factory.simgr(initial_state)
让模拟器模拟执行,并且探索输出good的位置,puts的地址可以从ida中找到
# Explore the binary to attempt to find the address that prints "Good Job."
# You will have to find the address you want to find and insert it here.
# This function will keep executing until it either finds a solution or it
# has explored every possible path through the executable.
# (!)
print_good_address = 0x80492F0 # :integer (probably in hexadecimal)
simulation.explore(find=print_good_address)
若模拟器找到了输出正确的解,那么将解的状态设置为simulation.found[0],并且输出对程序的输入,否则抛出一个异常,提示找不到解
# Check that we have found a solution. The simulation.explore() method will
# set simulation.found to a list of the states that it could find that reach
# the instruction we asked it to search for. Remember, in Python, if a list
# is empty, it will be evaluated as false, otherwise true.
if simulation.found:
# The explore method stops after it finds a single state that arrives at the
# target address.
solution_state = simulation.found[0]
# Print the string that Angr wrote to stdin to follow solution_state. This
# is our solution.
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
# If Angr could not find a path that reaches print_good_address, throw an
# error. Perhaps you mistyped the print_good_address?
raise Exception('Could not find the solution')
通过逐句分析可以了解到angr的强大之处。它可以自己找到输出正确解的位置。
01_angr_avoid
看一下第二题的逻辑,题目已经无法生成cfg了,超过了10000个node,但是查看Strings我们依然可以看到Good Job的字样
通过交叉引用可以看到程序输出正确的位置,同样找到该位置的地址填入脚本
这题通过对汇编进行分析还可以发现程序中有avoid_me函数,猜测应该是需要我们避开这个函数进行符号执行
那么我们需要对脚本进行如下改动:
在模拟器进行explore操作的时候,我们需要指定对应的地址来避开,以加速运行。
import angr
import sys
def main(argv):
path_to_binary = "./01_angr_avoid"
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
simulation = project.factory.simgr(initial_state)
# Explore the binary, but this time, instead of only looking for a state that
# reaches the print_good_address, also find a state that does not reach
# will_not_succeed_address. The binary is pretty large, to save you some time,
# everything you will need to look at is near the beginning of the address
# space.
# (!)
print_good_address = 0x8049278
will_not_succeed_address = 0x8049243
simulation.explore(find=print_good_address, avoid=will_not_succeed_address)
if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
相比于00题添加了一个avoid
simulation.explore(find=print_good_address, avoid=will_not_succeed_address)
02_angr_find_condition
我们通过cfg和汇编窗口可以看到这个里面加入了非常多的判断分支,用来干扰执行。
ida能识别出来实际上的代码为
这里可以通过ida的tab快捷键快速定位到输出good job的分支
angr在这里提供了一种解法,在无法快速定位到这个位置的时候,我们可以通过angr对输出的dump直接判断。
可以看到在27-45行是通过两个函数对state中的stdout进行判断从而判断当前分支是否到达正确结果
# It is very useful to be able to search for a state that reaches a certain
# instruction. However, in some cases, you may not know the address of the
# specific instruction you want to reach (or perhaps there is no single
# instruction goal.) In this challenge, you don't know which instruction
# grants you success. Instead, you just know that you want to find a state where
# the binary prints "Good Job."
#
# Angr is powerful in that it allows you to search for a states that meets an
# arbitrary condition that you specify in Python, using a predicate you define
# as a function that takes a state and returns True if you have found what you
# are looking for, and False otherwise.
import angr
import sys
def main(argv):
path_to_binary = "02_angr_find_condition"
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
simulation = project.factory.simgr(initial_state)
# Define a function that checks if you have found the state you are looking
# for.
def is_successful(state):
# Dump whatever has been printed out by the binary so far into a string.
stdout_output = state.posix.dumps(sys.stdout.fileno())
# Return whether 'Good Job.' has been printed yet.
# (!)
return b'Good Job.' in stdout_output # :boolean
# Same as above, but this time check if the state should abort. If you return
# False, Angr will continue to step the state. In this specific challenge, the
# only time at which you will know you should abort is when the program prints
# "Try again."
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output # :boolean
# Tell Angr to explore the binary and find any state that is_successful identfies
# as a successful state by returning True.
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
03_angr_symbolic_registers
# Angr doesn't currently support reading multiple things with scanf (Ex:
# scanf("%u %u).) You will have to tell the simulation engine to begin the
# program after scanf is called and manually inject the symbols into registers.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "03_angr_symbolic_registers"
project = angr.Project(path_to_binary)
# Sometimes, you want to specify where the program should start. The variable
# start_address will specify where the symbolic execution engine should begin.
# Note that we are using blank_state, not entry_state.
# (!)
start_address = 0x08049617 # :integer (probably hexadecimal)
initial_state = project.factory.blank_state(
addr=start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# Create a symbolic bitvector (the datatype Angr uses to inject symbolic
# values into the binary.) The first parameter is just a name Angr uses
# to reference it.
# You will have to construct multiple bitvectors. Copy the two lines below
# and change the variable names. To figure out how many (and of what size)
# you need, dissassemble the binary and determine the format parameter passed
# to scanf.
# (!)
password0_size_in_bits = 32 # :integer
password0 = claripy.BVS('password0', password0_size_in_bits)
password1_size_in_bits = 32 # :integer
password1 = claripy.BVS('password1', password1_size_in_bits)
password2_size_in_bits = 32 # :integer
password2 = claripy.BVS('password2', password2_size_in_bits)
# Set a register to a symbolic value. This is one way to inject symbols into
# the program.
# initial_state.regs stores a number of convenient attributes that reference
# registers by name. For example, to set eax to password0, use:
#
# initial_state.regs.eax = password0
#
# You will have to set multiple registers to distinct bitvectors. Copy and
# paste the line below and change the register. To determine which registers
# to inject which symbol, dissassemble the binary and look at the instructions
# immediately following the call to scanf.
# (!)
initial_state.regs.eax = password0
initial_state.regs.ebx = password1
initial_state.regs.edx = password2
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
# Solve for the symbolic values. If there are multiple solutions, we only
# care about one, so we can use eval, which returns any (but only one)
# solution. Pass eval the bitvector you want to solve for.
# (!)
solution0 = solution_state.solver.eval(password0)
solution1 = solution_state.solver.eval(password1)
solution2 = solution_state.solver.eval(password2)
# Aggregate and format the solutions you computed above, and then print
# the full string. Pay attention to the order of the integers, and the
# expected base (decimal, octal, hexadecimal, etc).
solution = "Solution: {:x} {:x} {:x}".format(solution0,solution1,solution2)# :string
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
这题有%x的三个输入,但是angr不支持多个参数的输入,因此需要在输入之后的位置开始模拟执行,并且将寄存器的值初始化为输入
04_angr_symbolic_stack
分析得知,需要构造输入后的栈情况
可以动调/分析得知,在这个位置的时候,我们需要初始化栈为这种情况,即ebp+0xc-->password0,ebp+0x10-->password1
那么我们可以计算一下,首先mov ebp,esp 然后再sub esp,0x8 接下来再push,push进去的第一个参数就是password0,刚好在0xc的位置,第二个就是password1
# This challenge will be more challenging than the previous challenges that you
# have encountered thus far. Since the goal of this CTF is to teach symbolic
# execution and not how to construct stack frames, these comments will work you
# through understanding what is on the stack.
# ! ! !
# IMPORTANT: Any addresses in this script aren't necessarily right! Dissassemble
# the binary yourself to determine the correct addresses!
# ! ! !
import angr
import claripy
import sys
def main(argv):
path_to_binary = "04_angr_symbolic_stack"
project = angr.Project(path_to_binary)
# For this challenge, we want to begin after the call to scanf. Note that this
# is in the middle of a function.
#
# This challenge requires dealing with the stack, so you have to pay extra
# careful attention to where you start, otherwise you will enter a condition
# where the stack is set up incorrectly. In order to determine where after
# scanf to start, we need to look at the dissassembly of the call and the
# instruction immediately following it:
# sub $0x4,%esp
# lea -0x10(%ebp),%eax
# push %eax
# lea -0xc(%ebp),%eax
# push %eax
# push $0x80489c3
# call 8048370 <__isoc99_scanf@plt>
# add $0x10,%esp
# Now, the question is: do we start on the instruction immediately following
# scanf (add $0x10,%esp), or the instruction following that (not shown)?
# Consider what the 'add $0x10,%esp' is doing. Hint: it has to do with the
# scanf parameters that are pushed to the stack before calling the function.
# Given that we are not calling scanf in our Angr simulation, where should we
# start?
# (!)
start_address = 0x80493A2
initial_state = project.factory.blank_state(
addr=start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# We are jumping into the middle of a function! Therefore, we need to account
# for how the function constructs the stack. The second instruction of the
# function is:
# mov %esp,%ebp
# At which point it allocates the part of the stack frame we plan to target:
# sub $0x18,%esp
# Note the value of esp relative to ebp. The space between them is (usually)
# the stack space. Since esp was decreased by 0x18
#
# /-------- The stack --------\
# ebp -> | |
# |---------------------------|
# | |
# |---------------------------|
# . . . (total of 0x18 bytes)
# . . . Somewhere in here is
# . . . the data that stores
# . . . the result of scanf.
# esp -> | |
# \---------------------------/
#
# Since we are starting after scanf, we are skipping this stack construction
# step. To make up for this, we need to construct the stack ourselves. Let us
# start by initializing ebp in the exact same way the program does.
initial_state.regs.ebp = initial_state.regs.esp
# scanf("%u %u") needs to be replaced by injecting two bitvectors. The
# reason for this is that Angr does not (currently) automatically inject
# symbols if scanf has more than one input parameter. This means Angr can
# handle 'scanf("%u")', but not 'scanf("%u %u")'.
# You can either copy and paste the line below or use a Python list.
# (!)
password0 = claripy.BVS('password0', 32)
password1 = claripy.BVS('password1', 32)
# Here is the hard part. We need to figure out what the stack looks like, at
# least well enough to inject our symbols where we want them. In order to do
# that, let's figure out what the parameters of scanf are:
# sub $0x4,%esp
# lea -0x10(%ebp),%eax
# push %eax
# lea -0xc(%ebp),%eax
# push %eax
# push $0x80489c3
# call 8048370 <__isoc99_scanf@plt>
# add $0x10,%esp
# As you can see, the call to scanf looks like this:
# scanf( 0x80489c3, ebp - 0xc, ebp - 0x10 )
# format_string password0 password1
# From this, we can construct our new, more accurate stack diagram:
#
# /-------- The stack --------\
# ebp -> | padding |
# |---------------------------|
# ebp - 0x01 | more padding |
# |---------------------------|
# ebp - 0x02 | even more padding |
# |---------------------------|
# . . . <- How much padding? Hint: how
# |---------------------------| many bytes is password0?
# ebp - 0x0b | password0, second byte |
# |---------------------------|
# ebp - 0x0c | password0, first byte |
# |---------------------------|
# ebp - 0x0d | password1, last byte |
# |---------------------------|
# . . .
# |---------------------------|
# ebp - 0x10 | password1, first byte |
# |---------------------------|
# . . .
# |---------------------------|
# esp -> | |
# \---------------------------/
#
# Figure out how much space there is and allocate the necessary padding to
# the stack by decrementing esp before you push the password bitvectors.
padding_length_in_bytes = 0x8 # :integer
initial_state.regs.esp -= padding_length_in_bytes
# Push the variables to the stack. Make sure to push them in the right order!
# The syntax for the following function is:
#
# initial_state.stack_push(bitvector)
#
# This will push the bitvector on the stack, and increment esp the correct
# amount. You will need to push multiple bitvectors on the stack.
# (!)
initial_state.stack_push(password0) # :bitvector (claripy.BVS, claripy.BVV, claripy.BV)
initial_state.stack_push(password1)
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
solution0 = solution_state.solver.eval(password0)
solution1 = solution_state.solver.eval(password1)
solution = str(solution0)+' '+str(solution1)
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
05_angr_symbolic_memory
写内存到bss里面 不再赘述,直接ida里面看bss地址即可
import angr
import claripy
import sys
def main(argv):
path_to_binary = "05_angr_symbolic_memory"
project = angr.Project(path_to_binary)
start_address = 0x804929C
initial_state = project.factory.blank_state(
addr=start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# The binary is calling scanf("%8s %8s %8s %8s").
# (!)
password0 = claripy.BVS('password0', 64)
password1 = claripy.BVS('password1', 64)
password2 = claripy.BVS('password2', 64)
password3 = claripy.BVS('password3', 64)
...
# Determine the address of the global variable to which scanf writes the user
# input. The function 'initial_state.memory.store(address, value)' will write
# 'value' (a bitvector) to 'address' (a memory location, as an integer.) The
# 'address' parameter can also be a bitvector (and can be symbolic!).
# (!)
password0_address = 0x90A3280
initial_state.memory.store(password0_address, password0)
password1_address = 0x90A3288
initial_state.memory.store(password1_address, password1)
password2_address = 0x90A3290
initial_state.memory.store(password2_address, password2)
password3_address = 0x90A3298
initial_state.memory.store(password3_address, password3)
...
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
# Solve for the symbolic values. We are trying to solve for a string.
# Therefore, we will use eval, with named parameter cast_to=bytes
# which returns bytes that can be decoded to a string instead of an integer.
# (!)
solution0 = solution_state.solver.eval(password0,cast_to=bytes).decode()
solution1 = solution_state.solver.eval(password1,cast_to=bytes).decode()
solution2 = solution_state.solver.eval(password2,cast_to=bytes).decode()
solution3 = solution_state.solver.eval(password3,cast_to=bytes).decode()
solution = solution0+' '+solution1+' '+solution2+' '+solution3
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
06_angr_symbolic_dynamic_memory
这题是malloc的用来输入,那需要通过拿出未使用的区域。指针则使用与题目中相同的即可
import angr
import claripy
import sys
def main(argv):
path_to_binary = "06_angr_symbolic_dynamic_memory"
project = angr.Project(path_to_binary)
start_address = 0x80492F3
initial_state = project.factory.blank_state(
addr=start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# The binary is calling scanf("%8s %8s").
# (!)
password0 = claripy.BVS('password0', 64)
password1 = claripy.BVS('password1', 64)
# Instead of telling the binary to write to the address of the memory
# allocated with malloc, we can simply fake an address to any unused block of
# memory and overwrite the pointer to the data. This will point the pointer
# with the address of pointer_to_malloc_memory_address0 to fake_heap_address.
# Be aware, there is more than one pointer! Analyze the binary to determine
# global location of each pointer.
# Note: by default, Angr stores integers in memory with big-endianness. To
# specify to use the endianness of your architecture, use the parameter
# endness=project.arch.memory_endness. On x86, this is little-endian.
# (!)
fake_heap_address0 = 0xB0B1000
pointer_to_malloc_memory_address0 = 0xB0B19E8
initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
fake_heap_address1 = 0xB0B1020
pointer_to_malloc_memory_address1 = 0xB0B19EC
initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)
# Store our symbolic values at our fake_heap_address. Look at the binary to
# determine the offsets from the fake_heap_address where scanf writes.
# (!)
initial_state.memory.store(fake_heap_address0, password0)
initial_state.memory.store(fake_heap_address1, password1)
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
solution0 = solution_state.solver.eval(password0,cast_to=bytes).decode()
solution1 = solution_state.solver.eval(password1,cast_to=bytes).decode()
solution = solution0+' '+solution1
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
07_angr_symbolic_file
这一题是文件的输入。读取程序的文件。同样的只要把程序的字节大小,开始模拟的位置弄明白,就基本上可以通过angr来一把锁了。
注意这里的地址是从输入并且写入文件后,在打开文件前的位置开始进行符号执行。
# This challenge could, in theory, be solved in multiple ways. However, for the
# sake of learning how to simulate an alternate filesystem, please solve this
# challenge according to structure provided below. As a challenge, once you have
# an initial solution, try solving this in an alternate way.
#
# Problem description and general solution strategy:
# The binary loads the password from a file using the fread function. If the
# password is correct, it prints "Good Job." In order to keep consistency with
# the other challenges, the input from the console is written to a file in the
# ignore_me function. As the name suggests, ignore it, as it only exists to
# maintain consistency with other challenges.
# We want to:
# 1. Determine the file from which fread reads.
# 2. Use Angr to simulate a filesystem where that file is replaced with our own
# simulated file.
# 3. Initialize the file with a symbolic value, which will be read with fread
# and propogated through the program.
# 4. Solve for the symbolic input to determine the password.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "07_angr_symbolic_file"
project = angr.Project(path_to_binary)
start_address = 0x804942E
initial_state = project.factory.blank_state(
addr=start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# Specify some information needed to construct a simulated file. For this
# challenge, the filename is hardcoded, but in theory, it could be symbolic.
# Note: to read from the file, the binary calls
# 'fread(buffer, sizeof(char), 64, file)'.
# (!)
filename = "UZXREFGY.txt" # :string
symbolic_file_size_bytes = 64
# Construct a bitvector for the password and then store it in the file's
# backing memory. For example, imagine a simple file, 'hello.txt':
#
# Hello world, my name is John.
# ^ ^
# ^ address 0 ^ address 24 (count the number of characters)
# In order to represent this in memory, we would want to write the string to
# the beginning of the file:
#
# hello_txt_contents = claripy.BVV('Hello world, my name is John.', 30*8)
#
# Perhaps, then, we would want to replace John with a
# symbolic variable. We would call:
#
# name_bitvector = claripy.BVS('symbolic_name', 4*8)
#
# Then, after the program calls fopen('hello.txt', 'r') and then
# fread(buffer, sizeof(char), 30, hello_txt_file), the buffer would contain
# the string from the file, except four symbolic bytes where the name would be
# stored.
# (!)
password = claripy.BVS('password', symbolic_file_size_bytes * 8)
# Construct the symbolic file. The file_options parameter specifies the Linux
# file permissions (read, read/write, execute etc.) The content parameter
# specifies from where the stream of data should be supplied. If content is
# an instance of SimSymbolicMemory (we constructed one above), the stream will
# contain the contents (including any symbolic contents) of the memory,
# beginning from address zero.
# Set the content parameter to our BVS instance that holds the symbolic data.
# (!)
password_file = angr.storage.SimFile(filename, content=password)
# Add the symbolic file we created to the symbolic filesystem.
initial_state.fs.insert(filename, password_file)
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
solution = solution_state.solver.eval(password,cast_to=bytes).decode()
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
08_angr_constraints
这一关根据描述来,想要用程序本身的check需要循环65536次,会影响angr查找的效率,我们需要通过自己添加z3(claripy)约束条件来自己重写判断。
因此我们只需要添加一条路径能到达判断函数中,然后添加一个约束即可。
# The binary asks for a 16 character password to which is applies a complex
# function and then compares with a reference string with the function
# check_equals_[reference string]. (Decompile the binary and take a look at it!)
# The source code for this function is provided here. However, the reference
# string in your version will be different than AABBCCDDEEFFGGHH:
#
# #define REFERENCE_PASSWORD = "AABBCCDDEEFFGGHH";
# int check_equals_AABBCCDDEEFFGGHH(char* to_check, size_t length) {
# uint32_t num_correct = 0;
# for (int i=0; i<length; ++i) {
# if (to_check[i] == REFERENCE_PASSWORD[i]) {
# num_correct += 1;
# }
# }
# return num_correct == length;
# }
#
# ...
#
# char* input = user_input();
# char* encrypted_input = complex_function(input);
# if (check_equals_AABBCCDDEEFFGGHH(encrypted_input, 16)) {
# puts("Good Job.");
# } else {
# puts("Try again.");
# }
#
# The function checks if *to_check == "AABBCCDDEEFFGGHH". Verify this yourself.
# While you, as a human, can easily determine that this function is equivalent
# to simply comparing the strings, the computer cannot. Instead the computer
# would need to branch every time the if statement in the loop was called (16
# times), resulting in 2^16 = 65,536 branches, which will take too long of a
# time to evaluate for our needs.
#
# We do not know how the complex_function works, but we want to find an input
# that, when modified by complex_function, will produce the string:
# AABBCCDDEEFFGGHH.
#
# In this puzzle, your goal will be to stop the program before this function is
# called and manually constrain the to_check variable to be equal to the
# password you identify by decompiling the binary. Since, you, as a human, know
# that if the strings are equal, the program will print "Good Job.", you can
# be assured that if the program can solve for an input that makes them equal,
# the input will be the correct password.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "08_angr_constraints"
project = angr.Project(path_to_binary)
start_address = 0x80492D7
initial_state = project.factory.blank_state(
addr=start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
password = claripy.BVS('password', 16*8)
password_address = 0x804C028
initial_state.memory.store(password_address, password)
simulation = project.factory.simgr(initial_state)
# Angr will not be able to reach the point at which the binary prints out
# 'Good Job.'. We cannot use that as the target anymore.
# (!)
address_to_check_constraint = 0x8049230
simulation.explore(find=address_to_check_constraint)
if simulation.found:
solution_state = simulation.found[0]
# Recall that we need to constrain the to_check parameter (see top) of the
# check_equals_ function. Determine the address that is being passed as the
# parameter and load it into a bitvector so that we can constrain it.
# (!)
constrained_parameter_address = 0x804C028
constrained_parameter_size_bytes = 16
constrained_parameter_bitvector = solution_state.memory.load(
constrained_parameter_address,
constrained_parameter_size_bytes
)
# We want to constrain the system to find an input that will make
# constrained_parameter_bitvector equal the desired value.
# (!)
constrained_parameter_desired_value = "PUTXLQMXLXPNKAMA".encode() # :string (encoded)
# Specify a claripy expression (using Pythonic syntax) that tests whether
# constrained_parameter_bitvector == constrained_parameter_desired_value.
# Add the constraint to the state to let z3 attempt to find an input that
# will make this expression true.
solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)
# Solve for the constrained_parameter_bitvector.
# (!)
solution = solution_state.solver.eval(password, cast_to=bytes)
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
09_angr_hooks
由于check函数循环次数过多导致angr效率低下,利用angr自带的hook重写check函数
# This level performs the following computations:
#
# 1. Get 16 bytes of user input and encrypt it.
# 2. Save the result of check_equals_AABBCCDDEEFFGGHH (or similar)
# 3. Get another 16 bytes from the user and encrypt it.
# 4. Check that it's equal to a predefined password.
#
# The ONLY part of this program that we have to worry about is #2. We will be
# replacing the call to check_equals_ with our own version, using a hook, since
# check_equals_ will run too slowly otherwise.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "09_angr_hooks"
project = angr.Project(path_to_binary)
# Since Angr can handle the initial call to scanf, we can start from the
# beginning.
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# Hook the address of where check_equals_ is called.
# (!)
check_equals_called_address = 0x804932E
# The length parameter in angr.Hook specifies how many bytes the execution
# engine should skip after completing the hook. This will allow hooks to
# replace certain instructions (or groups of instructions). Determine the
# instructions involved in calling check_equals_, and then determine how many
# bytes are used to represent them in memory. This will be the skip length.
# (!)
instruction_to_skip_length = 12
@project.hook(check_equals_called_address, length=instruction_to_skip_length)
def skip_check_equals_(state):
# Determine the address where user input is stored. It is passed as a
# parameter ot the check_equals_ function. Then, load the string. Reminder:
# int check_equals_(char* to_check, int length) { ...
user_input_buffer_address = 0x804C02C # :integer, probably hexadecimal
user_input_buffer_length = 16
# Reminder: state.memory.load will read the stored value at the address
# user_input_buffer_address of byte length user_input_buffer_length.
# It will return a bitvector holding the value. This value can either be
# symbolic or concrete, depending on what was stored there in the program.
user_input_string = state.memory.load(
user_input_buffer_address,
user_input_buffer_length
)
# Determine the string this function is checking the user input against.
# It's encoded in the name of this function; decompile the program to find
# it.
check_against_string = "JHXYDYBNCFUEJITI" # :string
# gcc uses eax to store the return value, if it is an integer. We need to
# set eax to 1 if check_against_string == user_input_string and 0 otherwise.
# However, since we are describing an equation to be used by z3 (not to be
# evaluated immediately), we cannot use Python if else syntax. Instead, we
# have to use claripy's built in function that deals with if statements.
# claripy.If(expression, ret_if_true, ret_if_false) will output an
# expression that evaluates to ret_if_true if expression is true and
# ret_if_false otherwise.
# Think of it like the Python "value0 if expression else value1".
state.regs.eax = claripy.If(
user_input_string == check_against_string,
claripy.BVV(1, 32),
claripy.BVV(0, 32)
)
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
# Since we are allowing Angr to handle the input, retrieve it by printing
# the contents of stdin. Use one of the early levels as a reference.
solution = solution_state.posix.dumps(sys.stdin.fileno()).decode()
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
10_angr_simprocedures
由于09的hook是对call调用直接hook,假如调用次数很多的话,那么需要对这个函数hook很多次。因此我们可以使用angr自带的angr.SimProcedure类对该类的run方法进行重写,达到替换该函数的效果,也就不用多次hookcall了
# This challenge is similar to the previous one. It operates under the same
# premise that you will have to replace the check_equals_ function. In this
# case, however, check_equals_ is called so many times that it wouldn't make
# sense to hook where each one was called. Instead, use a SimProcedure to write
# your own check_equals_ implementation and then hook the check_equals_ symbol
# to replace all calls to scanf with a call to your SimProcedure.
#
# You may be thinking:
# Why can't I just use hooks? The function is called many times, but if I hook
# the address of the function itself (rather than the addresses where it is
# called), I can replace its behavior everywhere. Furthermore, I can get the
# parameters by reading them off the stack (with memory.load(regs.esp + xx)),
# and return a value by simply setting eax! Since I know the length of the
# function in bytes, I can return from the hook just before the 'ret'
# instruction is called, which will allow the program to jump back to where it
# was before it called my hook.
# If you thought that, then congratulations! You have just invented the idea of
# SimProcedures! Instead of doing all of that by hand, you can let the already-
# implemented SimProcedures do the boring work for you so that you can focus on
# writing a replacement function in a Pythonic way.
# As a bonus, SimProcedures allow you to specify custom calling conventions, but
# unfortunately it is not covered in this CTF.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "10_angr_simprocedures"
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# Define a class that inherits angr.SimProcedure in order to take advantage
# of Angr's SimProcedures.
class ReplacementCheckEquals(angr.SimProcedure):
# A SimProcedure replaces a function in the binary with a simulated one
# written in Python. Other than it being written in Python, the function
# acts largely the same as any function written in C. Any parameter after
# 'self' will be treated as a parameter to the function you are replacing.
# The parameters will be bitvectors. Additionally, the Python can return in
# the ususal Pythonic way. Angr will treat this in the same way it would
# treat a native function in the binary returning. An example:
#
# int add_if_positive(int a, int b) {
# if (a >= 0 && b >= 0) return a + b;
# else return 0;
# }
#
# could be simulated with...
#
# class ReplacementAddIfPositive(angr.SimProcedure):
# def run(self, a, b):
# if a >= 0 and b >=0:
# return a + b
# else:
# return 0
#
# Finish the parameters to the check_equals_ function. Reminder:
# int check_equals_AABBCCDDEEFFGGHH(char* to_check, int length) { ...
# (!)
def run(self, to_check, cmp_len):
# We can almost copy and paste the solution from the previous challenge.
# Hint: Don't look up the address! It's passed as a parameter.
# (!)
user_input_buffer_address = to_check
user_input_buffer_length = cmp_len
# Note the use of self.state to find the state of the system in a
# SimProcedure.
user_input_string = self.state.memory.load(
user_input_buffer_address,
user_input_buffer_length
)
check_against_string = "QCWMODSIHVGEBBGZ"
# Finally, instead of setting eax, we can use a Pythonic return statement
# to return the output of this function.
# Hint: Look at the previous solution.
return claripy.If(user_input_string==check_against_string, claripy.BVV(1,32), claripy.BVV(0,32))
# Hook the check_equals symbol. Angr automatically looks up the address
# associated with the symbol. Alternatively, you can use 'hook' instead
# of 'hook_symbol' and specify the address of the function. To find the
# correct symbol, disassemble the binary.
# (!)
check_equals_symbol = "check_equals_QCWMODSIHVGEBBGZ" # :string
project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
solution = solution_state.posix.dumps(sys.stdin.fileno()).decode()
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
11_angr_sim_scanf
由于angr无法识别多个参数的scanf,我们可以使用angr.SimProcedure对scanf函数进行重写。
# This time, the solution involves simply replacing scanf with our own version,
# since Angr does not support requesting multiple parameters with scanf.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "11_angr_sim_scanf"
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
class ReplacementScanf(angr.SimProcedure):
# Finish the parameters to the scanf function. Hint: 'scanf("%u %u", ...)'.
# (!)
def run(self, format_string, scanf0_address, scanf1_address):
# Hint: scanf0_address is passed as a parameter, isn't it?
scanf0 = claripy.BVS('scanf0', 32)
scanf1 = claripy.BVS('scanf1', 32)
# The scanf function writes user input to the buffers to which the
# parameters point.
self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
self.state.memory.store(scanf1_address, scanf1, endness=project.arch.memory_endness)
# Now, we want to 'set aside' references to our symbolic values in the
# globals plugin included by default with a state. You will need to
# store multiple bitvectors. You can either use a list, tuple, or multiple
# keys to reference the different bitvectors.
# (!)
self.state.globals['solution0'] = scanf0
self.state.globals['solution1'] = scanf1
scanf_symbol = "__isoc99_scanf"
project.hook_symbol(scanf_symbol, ReplacementScanf())
simulation = project.factory.simgr(initial_state)
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Good Job." in stdout_output
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
# Grab whatever you set aside in the globals dict.
stored_solutions0 = solution_state.globals['solution0']
stored_solutions1 = solution_state.globals['solution1']
...
solution = str(solution_state.solver.eval(stored_solutions0))+' '+str(solution_state.solver.eval(stored_solutions1))
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
12_angr_veritesting
输出angr的测试信息,添加veritesting=True
# When you construct a simulation manager, you will want to enable Veritesting:
# project.factory.simgr(initial_state, veritesting=True)
# Hint: use one of the first few levels' solutions as a reference.
import angr
import sys
def main(argv):
path_to_binary = "12_angr_veritesting"
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
simulation = project.factory.simgr(initial_state,veritesting=True)
# Define a function that checks if you have found the state you are looking
# for.
def is_successful(state):
# Dump whatever has been printed out by the binary so far into a string.
stdout_output = state.posix.dumps(sys.stdout.fileno())
# Return whether 'Good Job.' has been printed yet.
# (!)
return b'Good Job.' in stdout_output # :boolean
# Same as above, but this time check if the state should abort. If you return
# False, Angr will continue to step the state. In this specific challenge, the
# only time at which you will know you should abort is when the program prints
# "Try again."
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output # :boolean
# Tell Angr to explore the binary and find any state that is_successful identfies
# as a successful state by returning True.
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
13_angr_static_binary
angr提供了一些内置的hook,例如libc中常见的一些库函数。我们可以这样使用来对这些函数进行hook。
scanf_address = 0x80525F0
project.hook(scanf_address, angr.SIM_PROCEDURES['libc']['scanf']())
import angr
import sys
def main(argv):
path_to_binary = "13_angr_static_binary"
project = angr.Project(path_to_binary)
start_address = 0x8049777
initial_state = project.factory.blank_state(
addr=start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
printf_address = 0x8052620
puts_address = 0x805FBB0
scanf_address = 0x80525F0
project.hook(scanf_address, angr.SIM_PROCEDURES['libc']['scanf']())
project.hook(puts_address, angr.SIM_PROCEDURES['libc']['puts']())
project.hook(printf_address, angr.SIM_PROCEDURES['libc']['printf']())
simulation = project.factory.simgr(initial_state,veritesting=True)
def is_successful(state):
# Dump whatever has been printed out by the binary so far into a string.
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b'Good Job.' in stdout_output # :boolean
def should_abort(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return b"Try again." in stdout_output # :boolean
simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
14_angr_shared_library
动态链接库的模拟,需要自行指定基地址,然后创建一块内存存储求解的输入值,然后对eax也就是返回值的状态进行判断,如果返回1则为有解。
# The shared library has the function validate, which takes a string and returns
# either true (1) or false (0). The binary calls this function. If it returns
# true, the program prints "Good Job." otherwise, it prints "Try again."
#
# Note: When you run this script, make sure you run it on
# lib14_angr_shared_library.so, not the executable. This level is intended to
# teach how to analyse binary formats that are not typical executables.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "lib14_angr_shared_library.so"
# The shared library is compiled with position-independent code. You will need
# to specify the base address. All addresses in the shared library will be
# base + offset, where offset is their address in the file.
# (!)
base = 0x40000000
project = angr.Project(path_to_binary, load_options={
'main_opts' : {
'base_addr' : base
}
})
# Initialize any symbolic values here; you will need at least one to pass to
# the validate function.
# (!)
buffer_pointer = claripy.BVV(0x30000000, 32)
# Begin the state at the beginning of the validate function, as if it was
# called by the program. Determine the parameters needed to call validate and
# replace 'parameters...' with bitvectors holding the values you wish to pass.
# Recall that 'claripy.BVV(value, size_in_bits)' constructs a bitvector
# initialized to a single value.
# Remember to add the base value you specified at the beginning to the
# function address!
# Hint: int validate(char* buffer, int length) { ...
# (!)
validate_function_address = base + 0x1234
initial_state = project.factory.call_state(
validate_function_address,
buffer_pointer,
0x8
)
# Inject a symbolic value for the password buffer into the program and
# instantiate the simulation. Another hint: the password is 8 bytes long.
# (!)
password = claripy.BVS( "password", 64 )
initial_state.memory.store( buffer_pointer , password )
simulation = project.factory.simgr(initial_state)
# We wish to reach the end of the validate function and constrain the
# return value of the function (stored in eax) to equal true (value of 1)
# just before the function returns. We could use a hook, but instead we
# can search for the address just before the function returns and then
# constrain eax
# (!)
check_constraint_address = base + 0x12E0
simulation.explore(find=check_constraint_address)
if simulation.found:
solution_state = simulation.found[0]
# Determine where the program places the return value, and constrain it so
# that it is true. Then, solve for the solution and print it.
# (!)
solution_state.add_constraints( solution_state.regs.eax != 0 )
solution = solution_state.solver.eval(password,cast_to=bytes)
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
15_angr_arbitrary_read
通过栈溢出,控制puts的指针,来达到输出puts的效果
# This binary takes both an integer and a string as a parameter. A certain
# integer input causes the program to reach a buffer overflow with which we can
# read a string from an arbitrary memory location. Our goal is to use Angr to
# search the program for this buffer overflow and then automatically generate
# an exploit to read the string "Good Job."
#
# What is the point of reading the string "Good Job."?
# This CTF attempts to replicate a simplified version of a possible vulnerability
# where a user can exploit the program to print a secret, such as a password or
# a private key. In order to keep consistency with the other challenges and to
# simplify the challenge, the goal of this program will be to print "Good Job."
# instead.
#
# The general strategy for crafting this script will be to:
# 1) Search for calls of the 'puts' function, which will eventually be exploited
# to print out "Good Job."
# 2) Determine if the first parameter of 'puts', a pointer to the string to be
# printed, can be controlled by the user to be set to the location of the
# "Good Job." string.
# 3) Solve for the input that prints "Good Job."
#
# Note: The script is structured to implement step #2 before #1.
# Some of the source code for this challenge:
#
# #include <stdio.h>
# #include <stdlib.h>
# #include <string.h>
# #include <stdint.h>
#
# // This will all be in .rodata
# char msg[] = "${ description }$";
# char* try_again = "Try again.";
# char* good_job = "Good Job.";
# uint32_t key;
#
# void print_msg() {
# printf("%s", msg);
# }
#
# uint32_t complex_function(uint32_t input) {
# ...
# }
#
# struct overflow_me {
# char buffer[16];
# char* to_print;
# };
#
# int main(int argc, char* argv[]) {
# struct overflow_me locals;
# locals.to_print = try_again;
#
# print_msg();
#
# printf("Enter the password: ");
# scanf("%u %20s", &key, locals.buffer);
#
# key = complex_function(key);
#
# switch (key) {
# case ?:
# puts(try_again);
# break;
#
# ...
#
# case ?:
# // Our goal is to trick this call to puts to print the "secret
# // password" (which happens, in our case, to be the string
# // "Good Job.")
# puts(locals.to_print);
# break;
#
# ...
# }
#
# return 0;
# }
import angr
import claripy
import sys
def main(argv):
path_to_binary = "15_angr_arbitrary_read"
project = angr.Project(path_to_binary)
# You can either use a blank state or an entry state; just make sure to start
# at the beginning of the program.
# (!)
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
# Again, scanf needs to be replaced.
class ReplacementScanf(angr.SimProcedure):
# Hint: scanf("%u %20s")
def run(self, format_string, key, buffer):
# %u
scanf0 = claripy.BVS('scanf0', 32)
# %20s
scanf1 = claripy.BVS('scanf1', 20*8)
# The bitvector.chop(bits=n) function splits the bitvector into a Python
# list containing the bitvector in segments of n bits each. In this case,
# we are splitting them into segments of 8 bits (one byte.)
for char in scanf1.chop(bits=8):
# Ensure that each character in the string is printable. An interesting
# experiment, once you have a working solution, would be to run the code
# without constraining the characters to the printable range of ASCII.
# Even though the solution will technically work without this, it's more
# difficult to enter in a solution that contains character you can't
# copy, paste, or type into your terminal or the web form that checks
# your solution.
# (!)
self.state.add_constraints(char >= "A", char <= "Z")
# Warning: Endianness only applies to integers. If you store a string in
# memory and treat it as a little-endian integer, it will be backwards.
scanf0_address = key
self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
scanf1_address = buffer
self.state.memory.store(scanf1_address, scanf1)
...
self.state.globals['solution0'] = scanf0
self.state.globals['solution1'] = scanf1
...
scanf_symbol = "__isoc99_scanf" # :string
project.hook_symbol(scanf_symbol, ReplacementScanf())
# We will call this whenever puts is called. The goal of this function is to
# determine if the pointer passed to puts is controllable by the user, such
# that we can rewrite it to point to the string "Good Job."
def check_puts(state):
# Recall that puts takes one parameter, a pointer to the string it will
# print. If we load that pointer from memory, we can analyse it to determine
# if it can be controlled by the user input in order to point it to the
# location of the "Good Job." string.
#
# Treat the implementation of this function as if puts was just called.
# The stack, registers, memory, etc should be set up as if the x86 call
# instruction was just invoked (but, of course, the function hasn't copied
# the buffers yet.)
# The stack will look as follows:
# ...
# esp + 7 -> /----------------\
# esp + 6 -> | puts |
# esp + 5 -> | parameter |
# esp + 4 -> \----------------/
# esp + 3 -> /----------------\
# esp + 2 -> | return |
# esp + 1 -> | address |
# esp -> \----------------/
#
# Hint: Look at level 08, 09, or 10 to review how to load a value from a
# memory address. Remember to use the correct endianness in the future when
# loading integers; it has been included for you here.
# (!)
puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)
# The following function takes a bitvector as a parameter and checks if it
# can take on more than one value. While this does not necessary tell us we
# have found an exploitable state, it is a strong indication that the
# bitvector we checked may be controllable by the user.
# Use it to determine if the pointer passed to puts is symbolic.
# (!)
if state.solver.symbolic(puts_parameter):
# Determine the location of the "Good Job." string. We want to print it
# out, and we will do so by attempting to constrain the puts parameter to
# equal it. Hint: use 'objdump -s <binary>' to look for the string's
# address in .rodata.
# (!)
good_job_string_address = 0x52414657 # :integer, probably hexadecimal
# Create an expression that will test if puts_parameter equals
# good_job_string_address. If we add this as a constraint to our solver,
# it will try and find an input to make this expression true. Take a look
# at level 08 to remind yourself of the syntax of this.
# (!)
is_vulnerable_expression = good_job_string_address == puts_parameter # :boolean bitvector expression
# Finally, we test if we can satisfy the constraints of the state.
if state.satisfiable(extra_constraints=(is_vulnerable_expression,)):
# Before we return, let's add the constraint to the solver for real,
# instead of just querying whether the constraint _could_ be added.
state.add_constraints(is_vulnerable_expression)
return True
else:
return False
else: # not state.solver.symbolic(???)
return False
simulation = project.factory.simgr(initial_state)
# In order to determine if we have found a vulnerable call to 'puts', we need
# to run the function check_puts (defined above) whenever we reach a 'puts'
# call. To do this, we will look for the place where the instruction pointer,
# state.addr, is equal to the beginning of the puts function.
def is_successful(state):
# We are looking for puts. Check that the address is at the (very) beginning
# of the puts function. Warning: while, in theory, you could look for
# any address in puts, if you execute any instruction that adjusts the stack
# pointer, the stack diagram above will be incorrect. Therefore, it is
# recommended that you check for the very beginning of puts.
# (!)
puts_address = 0x8049060
if state.addr == puts_address:
# Return True if we determine this call to puts is exploitable.
return check_puts(state)
else:
# We have not yet found a call to puts; we should continue!
return False
simulation.explore(find=is_successful)
if simulation.found:
solution_state = simulation.found[0]
sol0=solution_state.solver.eval(solution_state.globals['solution0'])
sol1=solution_state.solver.eval(solution_state.globals['solution1'],cast_to=bytes)
print(sol0,sol1)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
16_angr_arbitrary_write
同样是栈溢出
# Essentially, the program does the following:
#
# scanf("%d %20s", &key, user_input);
# ...
# // if certain unknown conditions are true...
# strncpy(random_buffer, user_input);
# ...
# if (strncmp(secure_buffer, reference_string)) {
# // The secure_buffer does not equal the reference string.
# puts("Try again.");
# } else {
# // The two are equal.
# puts("Good Job.");
# }
#
# If this program has no bugs in it, it would _always_ print "Try again." since
# user_input copies into random_buffer, not secure_buffer.
#
# The question is: can we find a buffer overflow that will allow us to overwrite
# the random_buffer pointer to point to secure_buffer? (Spoiler: we can, but we
# will need to use Angr.)
#
# We want to identify a place in the binary, when strncpy is called, when we can:
# 1) Control the source contents (not the source pointer!)
# * This will allow us to write arbitrary data to the destination.
# 2) Control the destination pointer
# * This will allow us to write to an arbitrary location.
# If we can meet both of those requirements, we can write arbitrary data to an
# arbitrary location. Finally, we need to contrain the source contents to be
# equal to the reference_string and the destination pointer to be equal to the
# secure_buffer.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "16_angr_arbitrary_write"
project = angr.Project(path_to_binary)
# You can either use a blank state or an entry state; just make sure to start
# at the beginning of the program.
initial_state = project.factory.entry_state(
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)
class ReplacementScanf(angr.SimProcedure):
# Hint: scanf("%u %20s")
def run(self, format_string, key, s):
# %u
scanf0 = claripy.BVS('scanf0', 32)
# %20s
scanf1 = claripy.BVS('scanf1', 20*8)
for char in scanf1.chop(bits=8):
self.state.add_constraints(char >= '0', char <= 'z')
scanf0_address = key
self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
scanf1_address = s
self.state.memory.store(scanf1_address, scanf1)
...
self.state.globals['solution0'] = scanf0
self.state.globals['solution1'] = scanf1
scanf_symbol = "__isoc99_scanf" # :string
project.hook_symbol(scanf_symbol, ReplacementScanf())
# In this challenge, we want to check strncpy to determine if we can control
# both the source and the destination. It is common that we will be able to
# control at least one of the parameters, (such as when the program copies a
# string that it received via stdin).
def check_strncpy(state):
# The stack will look as follows:
# ... ________________
# esp + 15 -> / \
# esp + 14 -> | param2 |
# esp + 13 -> | len |
# esp + 12 -> \________________/
# esp + 11 -> / \
# esp + 10 -> | param1 |
# esp + 9 -> | src |
# esp + 8 -> \________________/
# esp + 7 -> / \
# esp + 6 -> | param0 |
# esp + 5 -> | dest |
# esp + 4 -> \________________/
# esp + 3 -> / \
# esp + 2 -> | return |
# esp + 1 -> | address |
# esp -> \________________/
# (!)
strncpy_dest = state.memory.load(state.regs.esp+4, 4, endness=project.arch.memory_endness)
strncpy_src = state.memory.load(state.regs.esp+8, 4, endness=project.arch.memory_endness)
strncpy_len = state.memory.load(state.regs.esp+12, 4, endness=project.arch.memory_endness)
# We need to find out if src is symbolic, however, we care about the
# contents, rather than the pointer itself. Therefore, we have to load the
# the contents of src to determine if they are symbolic.
# Hint: How many bytes is strncpy copying?
# (!)
src_contents = state.memory.load(strncpy_src, strncpy_len)
# Our goal is to determine if we can write arbitrary data to an arbitrary
# location. This means determining if the source contents are symbolic
# (arbitrary data) and the destination pointer is symbolic (arbitrary
# destination).
# (!)
if state.solver.symbolic(src_contents) and state.solver.symbolic(strncpy_dest):
# Use ltrace to determine the password. Decompile the binary to determine
# the address of the buffer it checks the password against. Our goal is to
# overwrite that buffer to store the password.
# (!)
password_string = "JAHPDBVT" # :string
buffer_address = 0x4C56463C # :integer, probably in hexadecimal
# Create an expression that tests if the first n bytes is length. Warning:
# while typical Python slices (array[start:end]) will work with bitvectors,
# they are indexed in an odd way. The ranges must start with a high value
# and end with a low value. Additionally, the bits are indexed from right
# to left. For example, let a bitvector, b, equal 'ABCDEFGH', (64 bits).
# The following will read bit 0-7 (total of 1 byte) from the right-most
# bit (the end of the string).
# b[7:0] == 'H'
# To access the beginning of the string, we need to access the last 16
# bits, or bits 48-63:
# b[63:48] == 'AB'
# In this specific case, since we don't necessarily know the length of the
# contents (unless you look at the binary), we can use the following:
# b[-1:-16] == 'AB', since, in Python, -1 is the end of the list, and -16
# is the 16th element from the end of the list. The actual numbers should
# correspond with the length of password_string.
# (!)
does_src_hold_password = src_contents[-1 : -64] == password_string
# Create an expression to check if the dest parameter can be set to
# buffer_address. If this is true, then we have found our exploit!
# (!)
does_dest_equal_buffer_address = strncpy_dest == buffer_address
# In the previous challenge, we copied the state, added constraints to the
# copied state, and then determined if the constraints of the new state
# were satisfiable. Since that pattern is so common, Angr implemented a
# parameter 'extra_constraints' for the satisfiable function that does the
# exact same thing. Note that we can pass multiple expressions to
# extra_constraints.
if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)
return True
else:
return False
else: # not state.solver.symbolic(???)
return False
simulation = project.factory.simgr(initial_state)
def is_successful(state):
strncpy_address = 0x8049080
if state.addr == strncpy_address:
return check_strncpy(state)
else:
return False
simulation.explore(find=is_successful)
if simulation.found:
solution_state = simulation.found[0]
solution0 = solution_state.se.eval(solution_state.globals['solution0'])
solution1 = solution_state.se.eval(solution_state.globals['solution1'],cast_to=bytes)
print(solution0,solution1)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
17_angr_arbitrary_jump
栈溢出,但是覆盖的是返回地址,让程序执行输出good的函数而不是控制指针。
# An unconstrained state occurs when there are too many
# possible branches from a single instruction. This occurs, among other ways,
# when the instruction pointer (on x86, eip) is completely symbolic, meaning
# that user input can control the address of code the computer executes.
# For example, imagine the following pseudo assembly:
#
# mov user_input, eax
# jmp eax
#
# The value of what the user entered dictates the next instruction. This
# is an unconstrained state. It wouldn't usually make sense for the execution
# engine to continue. (Where should the program jump to if eax could be
# anything?) Normally, when Angr encounters an unconstrained state, it throws
# it out. In our case, we want to exploit the unconstrained state to jump to
# a location of our choosing. We will get to how to disable Angr's default
# behavior later.
#
# This challenge represents a classic stack-based buffer overflow attack to
# overwrite the return address and jump to a function that prints "Good Job."
# Our strategy for solving the challenge is as follows:
# 1. Initialize the simulation and ask Angr to record unconstrained states.
# 2. Step through the simulation until we have found a state where eip is
# symbolic.
# 3. Constrain eip to equal the address of the "print_good" function.
import angr
import claripy
import sys
def main(argv):
path_to_binary = "17_angr_arbitrary_jump"
project = angr.Project(path_to_binary)
# Make a symbolic input that has a decent size to trigger overflow
# (!)
symbolic_input = claripy.BVS("input", 64*8)
# Create initial state and set stdin to the symbolic input
initial_state = project.factory.entry_state(
stdin=symbolic_input,
add_options = {
angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
}
)
# The save_unconstrained=True parameter specifies to Angr to not throw out
# unconstrained states. Instead, it will move them to the list called
# 'simulation.unconstrained'. Additionally, we will be using a few stashes
# that are not included by default, such as 'found' and 'not_needed'. You will
# see how these are used later.
# (!)
simulation = project.factory.simgr(
initial_state,
save_unconstrained=True,
stashes={
'active' : [initial_state],
'unconstrained' : [],
'found' : [],
'not_needed' : []
}
)
# Explore will not work for us, since the method specified with the 'find'
# parameter will not be called on an unconstrained state. Instead, we want to
# explore the binary ourselves. To get started, construct an exit condition
# to know when the simulation has found a solution. We will later move
# states from the unconstrained list to the simulation.found list.
# Create a boolean value that indicates a state has been found.
def has_found_solution():
return simulation.found
# An unconstrained state occurs when there are too many possible branches
# from a single instruction. This occurs, among other ways, when the
# instruction pointer (on x86, eip) is completely symbolic, meaning
# that user input can control the address of code the computer executes.
# For example, imagine the following pseudo assembly:
#
# mov user_input, eax
# jmp eax
#
# The value of what the user entered dictates the next instruction. This
# is an unconstrained state. It wouldn't usually make sense for the execution
# engine to continue. (Where should the program jump to if eax could be
# anything?) Normally, when Angr encounters an unconstrained state, it throws
# it out. In our case, we want to exploit the unconstrained state to jump to
# a location of our choosing. Check if there are still unconstrained states
# by examining the simulation.unconstrained list.
# (!)
def has_unconstrained_to_check():
return simulation.unconstrained
# The list simulation.active is a list of all states that can be explored
# further.
# (!)
def has_active():
return simulation.active
while (has_active() or has_unconstrained_to_check()) and (not has_found_solution()):
for unconstrained_state in simulation.unconstrained:
# Look for unconstrained states and move them to the 'found' stash.
# A 'stash' should be a string that corresponds to a list that stores
# all the states that the state group keeps. Values include:
# 'active' = states that can be stepped
# 'deadended' = states that have exited the program
# 'errored' = states that encountered an error with Angr
# 'unconstrained' = states that are unconstrained
# 'found' = solutions
# anything else = whatever you want, perhaps you want a 'not_needed',
# you can call it whatever you want
# (!)
simulation.move('unconstrained', 'found')
# Advance the simulation.
simulation.step()
if simulation.found:
solution_state = simulation.found[0]
# Constrain the instruction pointer to target the print_good function and
# (!)
solution_state.add_constraints(solution_state.regs.eip == 0x57534B64)
# Constrain the symbolic input to fall within printable range (capital
# letters) for the web UI. Ensure UTF-8 encoding.
# (!)
for byte in symbolic_input.chop(bits=8):
solution_state.add_constraints(
byte >= 30,
byte <= 127
)
# Solve for the symbolic_input
solution = solution_state.solver.eval(symbolic_input,cast_to=bytes).decode()
print(solution)
else:
raise Exception('Could not find the solution')
if __name__ == '__main__':
main(sys.argv)
小结
通过angr_ctf可以见识到符号执行的强大之处。通过这个引擎我们可以实现很多自动化的效果,例如一些autopwn之类的。目前也有一些师傅开发出来了autopwn的demo。