Matt, 2016/01/04 10:00

0x00 Code analysis using PHP extensions (dynamic analysis)


I. Basic environment

#! bash
apt-get install php5
apt-get install php5-dev
apt-get install apache2
apt-get install mysql-server

II. Using phptrace

#! bash
mkdir godhead
cd godhead
wget https://github.com/Qihoo360/phptrace/archive/v0.3.0.zip
unzip v0.3.0.zip
# extension sources (path assumed from the v0.3.0 archive layout)
cd phptrace-0.3.0/phpext
phpize
./configure --with-php-config=/usr/bin/php-config
make && make install
cd ../cmdtool
make

Edit php.ini to add:

#! bash
extension=trace.so

III. Test

#! php
<?php
for ($i = 0; $i < 100; $i++) {
    echo $i;
    sleep(1);
}
?>

CLI

#! shell
php test.php &
ps -axu | grep php
./phptrace -p pid

Apache

#! bash
curl 127.0.0.1/test.php
ps -aux | grep apache
./phptrace -p pid

IV. phptrace analysis

The code executed is as follows:

#! php
<?php
function c() {
    echo 1;
}
function b() {
    c();
}
function a() {
    b();
}
a();
?>

The order of execution is:

#! bash
a > b > c > echo

Parameter meanings:

Name     Value              Meaning
seq      int                Sequence number of the executed function (order of execution)
type     1/2                1 = function call, 2 = function return
level    -10                Execution depth; for example, if function a calls b, a's level is 1 and b's level is 2, increasing with each nested call
func     eval               Name of the function called
st       1448387651119460   Timestamp (microseconds since the Unix epoch)
params   string             Parameters passed to the function
file     c.php              File being executed
lineno   1                  Line number corresponding to this function

Log output:

#! js
{"seq":0, "type":1, "level":1, "func":"{main}", "st":1448387651119445, "params":"", "file":"/var/www/html/2.php", "lineno":11}
{"seq":1, "type":1, "level":2, "func":"a", "st":1448387651119451, "params":"", "file":"/var/www/html/2.php", "lineno":11}
{"seq":2, "type":1, "level":3, "func":"b", "st":1448387651119452, "params":"", "file":"/var/www/html/2.php", "lineno":9}
{"seq":3, "type":1, "level":4, "func":"c", "st":1448387651119453, "params":"", "file":"/var/www/html/2.php", "lineno":6}
{"seq":4, "type":2, "level":4, "func":"c", "st":1448387651119457, "return":"NULL", "wt":4, "ct":4, "mem":48, "pmem":144}
{"seq":5, "type":2, "level":3, "func":"b", "st":1448387651119459, "return":"NULL", "wt":7, "ct":6, "mem":48, "pmem":144}
{"seq":6, "type":2, "level":2, "func":"a", "st":1448387651119459, "return":"NULL", "wt":8, "ct":8, "mem":80, "pmem":176}
{"seq":7, "type":2, "level":1, "func":"{main}", "st":1448387651119460, "return":"1", "wt":15, "ct":14, "mem":112, "pmem":208}
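A log of this shape can be turned into an indented call tree with a few lines of Python. The sketch below assumes each record is a valid JSON object on its own line; `render_call_tree` and `SAMPLE_LOG` are hypothetical names used for illustration and are not part of phptrace.

```python
import json

# Shortened sample in the same shape as the phptrace log (illustrative data)
SAMPLE_LOG = """\
{"seq":0, "type":1, "level":1, "func":"{main}", "params":"", "file":"/var/www/html/2.php", "lineno":11}
{"seq":1, "type":1, "level":2, "func":"a", "params":"", "file":"/var/www/html/2.php", "lineno":11}
{"seq":2, "type":1, "level":3, "func":"b", "params":"", "file":"/var/www/html/2.php", "lineno":9}
{"seq":3, "type":1, "level":4, "func":"c", "params":"", "file":"/var/www/html/2.php", "lineno":6}
{"seq":4, "type":2, "level":4, "func":"c", "return":"NULL"}
"""


def render_call_tree(log_text):
    """Return one indented line per call record (type == 1), ordered by seq."""
    records = [json.loads(line) for line in log_text.splitlines() if line.strip()]
    records.sort(key=lambda rec: rec["seq"])
    lines = []
    for rec in records:
        if rec["type"] != 1:  # skip return records
            continue
        indent = "  " * (rec["level"] - 1)  # two spaces per nesting level
        lines.append(f"{indent}{rec['func']} ({rec['file']}:{rec['lineno']})")
    return lines


if __name__ == "__main__":
    print("\n".join(render_call_tree(SAMPLE_LOG)))
```

Indenting by `level` makes the a > b > c nesting visible at a glance, which is usually easier to review than the raw records.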

V. Logical analysis

1. Monitoring processes

A background process periodically refreshes the process list. Any process that does not yet have a tracer attached is attached to immediately.
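The refresh step can be sketched as a comparison between the current process list and the set of processes already being traced. The code below is a minimal illustration that parses `ps -axu` style text; the function names, the keyword list, and the sample output are assumptions made for this example, not part of phptrace.

```python
# Sample `ps -axu` style output (illustrative data)
SAMPLE_PS = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  33600  2900 ?        Ss   10:00   0:01 /sbin/init
matt      2201  0.3  1.2 250000 24000 pts/0    S    10:01   0:00 php test.php
www-data  2305  0.0  0.9 300000 18000 ?        S    10:02   0:00 /usr/sbin/apache2 -k start
"""


def php_pids(ps_output, keywords=("php", "apache")):
    """Extract the PIDs of processes whose command contains one of the keywords."""
    pids = set()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)  # the 11th column is the full command
        if len(fields) == 11 and any(word in fields[10] for word in keywords):
            pids.add(int(fields[1]))
    return pids


def untraced(ps_output, traced):
    """Return the PIDs that should get a tracer attached on this refresh."""
    return php_pids(ps_output) - set(traced)


if __name__ == "__main__":
    print(sorted(untraced(SAMPLE_PS, traced={2201})))
```

In a real monitor, the text would come from running `ps` on a timer, and each returned PID would be passed to `./phptrace -p pid` as in the test section.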

2. JSON extraction

The JSON is extracted from each log file as follows:

  1. Traverse all the files
  2. Read each file
  3. Extract the JSON records and sort them by seq
  4. Merge each type=2 (return) record with its matching type=1 (call) record
  5. Store the merged records in a dictionary keyed by level
  6. Sort by seq and take the top-level function for output
  7. For each dangerous function, walk up through the levels until level=0

The corresponding data structure is as follows:

#! python
list1={
    level1:[seq,type,func,param,return]
    level2:[seq,type,func,param,return]
    level3:[seq,type,func,param,return] #eval
    level4:[seq,type,func,param,return]
}
list2=
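The merge and walk-up steps can be sketched in Python as follows. This is one possible reading of the steps, not the author's implementation: the record layout follows the phptrace log fields, while `merge_records`, `call_chains`, the `DANGEROUS` set, and the sample records are hypothetical.

```python
DANGEROUS = {"eval", "system", "exec", "assert"}  # illustrative, not exhaustive

# Hand-written sample records in the shape of the phptrace log (illustrative data)
RECORDS = [
    {"seq": 0, "type": 1, "level": 1, "func": "{main}", "params": ""},
    {"seq": 1, "type": 1, "level": 2, "func": "a", "params": ""},
    {"seq": 2, "type": 1, "level": 3, "func": "eval", "params": "$_POST[c]"},
    {"seq": 3, "type": 2, "level": 3, "func": "eval", "return": "NULL"},
    {"seq": 4, "type": 2, "level": 2, "func": "a", "return": "NULL"},
    {"seq": 5, "type": 2, "level": 1, "func": "{main}", "return": "1"},
]


def merge_records(records):
    """Pair each return record (type 2) with its call record (type 1)."""
    merged, stack = [], []
    for rec in sorted(records, key=lambda r: r["seq"]):
        if rec["type"] == 1:
            entry = {"seq": rec["seq"], "level": rec["level"], "func": rec["func"],
                     "params": rec.get("params", ""), "return": None}
            merged.append(entry)
            stack.append(entry)
        elif stack:
            # A return record closes the most recently opened call
            stack.pop()["return"] = rec.get("return")
    return merged


def call_chains(merged, dangerous=DANGEROUS):
    """For each dangerous call, list the functions from the outermost level down to it."""
    chains, latest_by_level = [], {}
    for entry in merged:  # merged is already in seq order
        latest_by_level[entry["level"]] = entry
        if entry["func"] in dangerous:
            chains.append([latest_by_level[lvl]["func"]
                           for lvl in range(1, entry["level"] + 1)
                           if lvl in latest_by_level])
    return chains


if __name__ == "__main__":
    print(call_chains(merge_records(RECORDS)))
```

Keeping the most recent call per level is enough to recover the callers, because a call at a given level can only be made by the latest open call one level above it.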

3. Data viewing

Dangerous functions are traced, the chain of calls leading up to each one is reconstructed and output, and the result is then reviewed manually.

[Figure: demo]

VI. Using Xdebug

Installation

#! bash
apt-get install php5-xdebug

Modify php.ini:

#! bash
[xdebug]
zend_extension = "/usr/lib/php5/20131226/xdebug.so"
xdebug.auto_trace = on
xdebug.auto_profile = on
xdebug.collect_params = on
xdebug.collect_return = on
xdebug.profiler_enable = on
xdebug.trace_output_dir = "/tmp/ad/xdebug_log"
xdebug.profiler_output_dir = "/tmp/ad/xdebug_log"

[Figure: demo screenshots]

VII. Advantages and disadvantages

Disadvantages

Requires substantial manual involvement; it cannot run unattended.

Advantages

High precision; it can analyze both object-oriented and procedural code.

0x01 Syntax Analysis (Static Analysis)


Examples:

  • php-grinder.com/
  • rips-scanner.sourceforge.net/

Using PHP-Parser

I. Introduction

  • www.oschina.net/p/php-parse…
  • Github.com/nikic/PHP-P…

II. Installation

#! shell
git clone https://github.com/nikic/PHP-Parser.git && cd PHP-Parser
curl -sS https://getcomposer.org/installer | php

Requires PHP >= 5.3; parses PHP 5.2 to PHP 5.6:

#! bash
php composer.phar require nikic/php-parser

Requires PHP >= 5.4; parses PHP 5.2 to PHP 7.0:

#! bash
php composer.phar require nikic/php-parser 2.0.x-dev

III. Test

#! php
<?php
include 'autoload.php';

use PhpParser\Error;
use PhpParser\ParserFactory;

// Code to analyze: a one-line web shell
$code = '<?php eval($_POST[c])?>';
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

try {
    $stmts = $parser->parse($code);
    print_r($stmts); // $stmts is an array of statement nodes
} catch (Error $e) {
    echo 'Parse Error: ', $e->getMessage();
}

The output is as follows:

#! js
Array
(
    [0] => PhpParser\Node\Expr\Eval_ Object
        (
            [expr] => PhpParser\Node\Expr\ArrayDimFetch Object
                (
                    [var] => PhpParser\Node\Expr\Variable Object
                        (
                            [name] => _POST
                            [attributes:protected] => Array
                                (
                                    [startLine] => 1
                                    [endLine] => 1
                                )
                        )
                    [dim] => PhpParser\Node\Expr\ConstFetch Object
                        (
                            [name] => PhpParser\Node\Name Object
                                (
                                    [parts] => Array
                                        (
                                            [0] => c
                                        )
                                    [attributes:protected] => Array
                                        (
                                            [startLine] => 1
                                            [endLine] => 1
                                        )
                                )
                            [attributes:protected] => Array
                                (
                                    [startLine] => 1
                                    [endLine] => 1
                                )
                        )
                    [attributes:protected] => Array
                        (
                            [startLine] => 1
                            [endLine] => 1
                        )
                )
            [attributes:protected] => Array
                (
                    [startLine] => 1
                    [endLine] => 1
                )
        )
)

So we need to extract the following parts:

#! js
[0] => PhpParser\Node\Expr\Eval_ Object
[name] => _POST
[parts] => Array ( [0] => c )

Concatenating them recovers the original statement:

#! php
eval($_POST[c])
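The concatenation can be illustrated with a small recursive function. The sketch below works on a hand-written Python dictionary that mirrors the node nesting of the parser output; the dictionary layout, the `type` labels, and `to_source` are assumptions for illustration and are not PHP-Parser's API.

```python
# Hand-written mirror of the parsed node nesting (hypothetical representation)
AST = {
    "type": "Eval",
    "expr": {
        "type": "ArrayDimFetch",
        "var": {"type": "Variable", "name": "_POST"},
        "dim": {"type": "ConstFetch", "name": {"type": "Name", "parts": ["c"]}},
    },
}


def to_source(node):
    """Rebuild PHP source text from a simplified AST node."""
    kind = node["type"]
    if kind == "Eval":
        return "eval(" + to_source(node["expr"]) + ")"
    if kind == "ArrayDimFetch":
        return to_source(node["var"]) + "[" + to_source(node["dim"]) + "]"
    if kind == "Variable":
        return "$" + node["name"]
    if kind == "ConstFetch":
        return to_source(node["name"])
    if kind == "Name":
        return "\\".join(node["parts"])  # namespaced names are joined with backslashes
    raise ValueError("unsupported node type: " + kind)


if __name__ == "__main__":
    print(to_source(AST))
```

Each node type contributes only its own punctuation and delegates the rest to its children, so supporting more statements means adding more branches rather than changing existing ones.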

IV. Logical analysis

Code parsing

  1. Parse the code with the library
  2. Extract the parse results
  3. Extract the dangerous functions
  4. Extract the variables used in each dangerous function
  5. Trace backward through the code to find how each variable was assigned
  6. Analyze whether the result is controllable by the user
  7. Output the findings
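Steps 3 to 6 can be sketched on a simplified statement list. The code below is a toy model, not the author's tool: it assumes the parse result has already been reduced to `(kind, name, used_variables)` tuples, and `SOURCES`, `SINKS`, `is_controllable`, and `find_controllable_sinks` are hypothetical names chosen for this example.

```python
SOURCES = {"_GET", "_POST", "_COOKIE", "_REQUEST"}  # user-controlled input
SINKS = {"eval", "system", "exec"}  # illustrative dangerous functions

# Simplified stand-in for a parsed PHP file (illustrative data)
PROGRAM = [
    ("assign", "cmd", ["_GET"]),    # $cmd = $_GET['c'];
    ("assign", "safe", []),         # $safe = 'ls';
    ("assign", "full", ["cmd"]),    # $full = 'echo ' . $cmd;
    ("call", "system", ["safe"]),   # system($safe);
    ("call", "eval", ["full"]),     # eval($full);
]


def is_controllable(var, statements, before):
    """Walk backward from index `before` to the latest assignment of `var`."""
    if var in SOURCES:
        return True
    for idx in range(before - 1, -1, -1):
        kind, name, used = statements[idx]
        if kind == "assign" and name == var:
            # Controllable if any variable on the right-hand side is
            return any(is_controllable(u, statements, idx) for u in used)
    return False  # no visible assignment from user input


def find_controllable_sinks(statements):
    """Return (function, controllable_arguments) for each reachable dangerous call."""
    findings = []
    for idx, (kind, name, used) in enumerate(statements):
        if kind == "call" and name in SINKS:
            hits = [v for v in used if is_controllable(v, statements, idx)]
            if hits:
                findings.append((name, hits))
    return findings


if __name__ == "__main__":
    print(find_controllable_sinks(PROGRAM))
```

The backward walk only handles straight-line code. Branches, function calls, and object properties need more state, which is consistent with the weakness on object-oriented programs noted below.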

V. Advantages and disadvantages

Disadvantages

Analysis of object-oriented programs is weak.

Advantages

Well suited to automated analysis of large volumes of code; it can run unattended, without manual operation.