Woman: If you can get everyone on this forum arguing, I will have dinner with you.

PHP programmer: PHP is the best language in the world!

The forum explodes, and all kinds of arguments break out……

Woman: You win, let’s go!

PHP programmer: Not today. I have to convince them that PHP really is the best language.

Every language that people use gets criticized, and the louder the criticism, the more people use it. No language is perfect; the most suitable language is the best one. What we can do is play to its strengths, avoid its weaknesses, and step into fewer of its pits. Let’s get straight to the point.

0x01, weak typing

The similarities and differences between == and === are too basic a pit, so we skip them and look first at a slightly more hidden one.

function translate($keyword) {
    $trMap = [
        'baidu'  => 'Baidu',
        'sougou' => 'Sogou',
        '360'    => '360',
        'Google' => 'Google',
    ];
    foreach ($trMap as $key => $value) {
        if (strpos($keyword, $key) !== false) {
            return $value;
        }
    }
    return 'Other';
}
echo translate("baidu") . "\n";
echo translate("360") . "\n";

The expected result is

Baidu
360

The actual result is

Baidu
Other

Check carefully: no string is mixed with an int, and the comparison uses !==, not !=. So why do we still fall into the pit?

The problem is the array. Even though you wrote

$trMap = [
    'baidu'  => 'Baidu',
    'sougou' => 'Sogou',
    '360'    => '360',
    'Google' => 'Google',
];

PHP stores it as

array(4) {
  ["baidu"]=>
  string(5) "Baidu"
  ["sougou"]=>
  string(5) "Sogou"
  [360]=>
  string(3) "360"
  ["Google"]=>
  string(6) "Google"
}
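The key conversion described above can be checked in isolation. The following is a minimal sketch, assuming a made-up lookup table (the `'007'` entry is hypothetical and only there for contrast): a key that is a canonical decimal integer string is stored as an int, while a string with a leading zero stays a string.

```php
<?php
// Hypothetical lookup table; only the key types matter here.
$map = ['baidu' => 'Baidu', '360' => '360', '007' => 'Bond'];

$keyTypes = [];
foreach ($map as $key => $value) {
    // gettype() reports "integer" for keys that PHP has cast to int.
    $keyTypes[] = gettype($key);
}

print_r($keyTypes);
```

Only `'360'` changes type, which is why the bug stays hidden as long as no key looks like a number.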

Shouldn’t strpos report an error once '360' has become an int? No. It forgives you, of course, and chooses to be compatible with int. The manual (for PHP versions before 8.0) says:

If needle is not a string, it is converted to an integer and applied as the ordinal value of a character.

360 in hex is 0x168, and only the low byte 0x68 (the character "h") fits into a single character, so a call like this matches:

translate("\x1\x68")

So what is the correct way to write it? A small change is enough: replace

strpos($keyword, $key)

with

strpos($keyword, (string) $key)

Here’s the scary part

  • We think we are safe because we use ===, forgetting that weak typing is everywhere
  • You may not have read the documentation of every function carefully and checked the type of each argument one by one
  • The resulting bug may be hard to reproduce, or may not trigger under normal use, yet still leave a security hole

How can you avoid weak-typing pits 100% of the time? The answer is a strongly typed language. What if that is not an option? The following guidelines will not avoid them 100%, but 99.99% is achievable.

  1. Wherever ===/!== can be used, never use ==/!=; if you know the types, compare with ===
  2. When you call a function and know the parameter type it expects, take the trouble to cast the argument explicitly
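As an illustration of the second guideline, here is a sketch of the lookup function with the explicit cast applied. The name `translate_safe` and the English map values are assumptions made for this example; the cast itself is the fix described earlier in this section.

```php
<?php
function translate_safe(string $keyword): string
{
    $trMap = [
        'baidu'  => 'Baidu',
        'sougou' => 'Sogou',
        '360'    => '360',   // PHP stores this under the int key 360
        'Google' => 'Google',
    ];
    foreach ($trMap as $key => $value) {
        // Cast the key back to string so strpos() always receives a string needle.
        if (strpos($keyword, (string) $key) !== false) {
            return $value;
        }
    }
    return 'Other';
}

echo translate_safe('baidu'), "\n";
echo translate_safe('360'), "\n";
```

With the cast in place the result no longer depends on how a particular PHP version treats a non-string needle.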

Note that I am talking about weak typing, not dynamic typing. They are not the same thing, so don’t get me wrong: Python is dynamically and strongly typed, PHP is dynamically and weakly typed, and C is statically and weakly typed. Given a choice, I would rather PHP abandoned weak typing, because it causes more trouble than the convenience it brings. Providing a strict runtime mode would also work, giving everyone eight or ten years to migrate slowly.

0x02, an empty dictionary is JSON-serialized as []

With the popularity of mobile apps, PHP often talks not to browser-side JS but to statically typed languages such as Java and ObjC, so the type of each returned value becomes very important. For example

$ret1 = [
    'choices' => ['鱼香肉丝', '宫保鸡丁'],  // two dish names
    'answers' => [                          // person name => chosen index
        '张三' => 0,
        '李四' => 1,
        '赵云' => 0,
    ],
];
$ret2 = [
    'choices' => [],
    'answers' => [],
];
echo json_encode($ret1) . "\n";
echo json_encode($ret2) . "\n";

The output

{"choices":["\u9c7c\u9999\u8089\u4e1d","\u5bab\u4fdd\u9e21\u4e01"],"answers":{"\u5f20\u4e09":0,"\u674e\u56db":1,"\u8d75\u4e91":0}}
{"choices":[],"answers":[]}

The client might define the model like this

class ResultDTO {
	lateinit var choices: List<String>
	lateinit var answers: Map<String, Int>
}

When the server returns $ret1, all is well. What if it returns $ret2? The client protests

com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of java.util.LinkedHashMap out of START_ARRAY token

The reason? PHP’s json_encode faces a dilemma when it meets an empty array: it cannot tell whether it should be a list or a map, so it simply decides it is a list, and the client gets upset. The solution is, once again, a cast.

$ret2 = [
    'choices' => [],
    'answers' => (object) [],
];

One problem remains. If answers is not hard-coded but is the return value of some API, you cannot be sure whether it will come back empty, and that API has no obligation to cast it to object for you: JSON serialization belongs to the layer that talks to clients and should not be handled in the back-end service layer. So you have to do it yourself and manually cast the return value wherever an empty map might appear.

PHP’s associative arrays are powerful, well designed algorithmically, and fast, but they do not come without a cost, and the example above is one such cost. It might be easier if PHP distinguished between maps and lists as other languages do. After all, telling {} from [] is not much of a learning cost for programmers.
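The manual cast can be wrapped in a small helper that is applied at the serialization boundary. This is only a sketch: `as_json_map` is a hypothetical name, and the empty `$answers` stands in for the return value of some API.

```php
<?php
// Hypothetical helper: make a value that is meant to be a map encode as a JSON object,
// even when it happens to be empty.
function as_json_map(array $map): object
{
    return (object) $map;
}

$answers = []; // stand-in for the result of some API call that may be empty

$ret = [
    'choices' => [],
    'answers' => as_json_map($answers),
];
$json = json_encode($ret);
echo $json, "\n";
```

Passing `JSON_FORCE_OBJECT` to `json_encode` is not a substitute here, because it would also turn the empty `choices` list into `{}`.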

0x03, forgetful FPM

In recent years, CLI deployment methods such as Swoole and WorkerMan have gradually been recognized and adopted by developers in China. Compared with FPM or mod_php, however, CLI is still a niche choice: facing the near-monopoly of FPM/mod_php, it is growing only slowly. The drawback of FPM is obvious. At the end of each request, every object your PHP code created is cleaned up, leaving no trace of the code you executed.

In a tiny application like Hello World this does not seem to be a problem. But in a larger project we have to use a framework, for DRY, for less rework, and for more efficient development, and then the problem appears. Because FPM is forgetful, a framework written in PHP has to start up, read its configuration files, and initialize its components all over again for every incoming request. If you need to read 100 MB of metadata, you read and parse it on every HTTP request; when the request ends and the response is returned, the parsed 100 MB of metadata is destroyed, and for the next request you have to do it all again.

In raw language performance, PHP 5.6 already beats Python 3.6, and PHP 7.1 is several times faster still, out of Python’s league. But once frameworks of similar size are introduced, such as Laravel for PHP and Django for Python, the story is reversed: Django can beat Laravel on PHP 7. Picture a 100-meter sprinter who, however fast he is, must thread his shoelaces and put on his shoes after every starting gun, run, and then take the shoes off and pull the laces out again at the finish. Even if he can run 100 meters in one second, the other runners, who keep their shoes on, can run there and back in the time he needs.

As a result, many people, including the creator of PHP, have been skeptical of heavy frameworks such as Laravel. When performance matters, the mainstream advice is to use no framework, a minimalist framework, or a framework written in C such as Yaf or Phalcon. That solves the framework’s performance problem in a roundabout way, but what about the user’s own logic? This one is trickier. Simple types, such as strings, can be read through the Yaconf extension. Complex data structures, such as trees, cannot be handled that way. Is there a solution? It is not impossible. You can alleviate the problem by writing a script that converts the data into PHP code and letting OPcache cache it. The ultimate solution is to write a C extension that keeps the data resident in memory, but that is beyond the reach of ordinary PHP development.
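The "convert the data into PHP code" workaround can be sketched with `var_export`. The helper names, the sample tree, and the temp-file path below are all assumptions made for illustration; whether OPcache actually caches the generated file depends on the server configuration.

```php
<?php
// Hypothetical helper: write an array out as a PHP file that returns it.
function dump_as_php(array $data, string $file): void
{
    $code = "<?php\nreturn " . var_export($data, true) . ";\n";
    file_put_contents($file, $code, LOCK_EX);
}

// Hypothetical helper: load the array back. Because the file is plain PHP code,
// its compiled form can be served from OPcache instead of being re-parsed.
function load_from_php(string $file): array
{
    return require $file;
}

// Placeholder tree-shaped metadata.
$tree = ['root' => ['left' => ['value' => 1], 'right' => ['value' => 2]]];

$file = sys_get_temp_dir() . '/metadata_cache_' . getmypid() . '.php';
dump_as_php($tree, $file);
$loaded = load_from_php($file);
```

The generation step would normally run once in a deploy or cron script, so that requests only pay for the `require`.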

This model was not invented by FPM. Before FastCGI, CGI already worked this way, and it even starts a new process for every request, which is more expensive than FPM. In the 21st century, however, PHP is the only common language that still runs in the forgetful mode of FPM. Over the next decade, perhaps, FPM will be replaced by Swoole, which does not forget.

0x04, multithreading support

We will not discuss whether Apache’s MPM supports multithreading, whether PHP extensions support multithreading, or whether PHP can take advantage of multiple threads or cores. We only discuss whether pure PHP code can create and manage threads. A few years ago, PHP did not support multithreading at all. And now? It is said to have pthreads, so open its documentation and you find

Warning: The pthreads extension cannot be used in a web server environment. Threading in PHP should therefore remain to CLI-based applications only.

Warning: The pthreads extension can only be used with PHP 7.2+. This is due to ZTS mode being unsafe in prior PHP versions.

Two limitations

  1. It can be used only from the CLI
  2. Only PHP 7.2+ is supported

If you have never used multithreading, you cannot appreciate its convenience: sharing data within one process is much easier than sharing it across processes. For a modern language, multithreading support is taken for granted. Python, PHP’s usual comparison, has long supported native threads; because of the GIL they do not help CPU-intensive applications, but they are very convenient for IO-intensive ones. Still, multithreading is icing on the cake rather than a necessity. Fortunately PHP’s multi-process support is fine, so let’s use multiple processes and, at most, work around the need to share data structures. Thread pool + execution queue becomes process pool + execution queue.

0x05, on 32-bit platforms, there is no 8-byte long

PHP ints are platform-dependent: 4 bytes on 32-bit platforms and 8 bytes on 64-bit platforms. For robust, portable code we can only assume that an int is 4 bytes. But we often need an 8-byte type, because

  1. A millisecond-precision timestamp requires a long
  2. Integrating with many platforms, such as Alibaba, requires a long

This is where libraries like GMP and BCMath come in, which is more cumbersome than direct language support for an 8-byte long.
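For the timestamp case specifically, one way to sidestep the int width is to never hold the value as an int. The sketch below is a swapped-in technique, not GMP or BCMath: it builds the millisecond timestamp as a decimal string from the "fraction seconds" string that `microtime()` returns. The helper name is hypothetical, and any arithmetic on the result would still need GMP or BCMath on a 32-bit build.

```php
<?php
// Hypothetical helper: turn a microtime() string such as "0.12345600 1500000000"
// into a millisecond timestamp held as a decimal string.
function millis_from_microtime(string $microtime): string
{
    [$fraction, $seconds] = explode(' ', $microtime);
    // Characters 2..4 of "0.12345600" are the millisecond digits "123".
    return $seconds . substr($fraction, 2, 3);
}

echo 'int size in bytes: ', PHP_INT_SIZE, "\n";
echo millis_from_microtime(microtime()), "\n";
```

Keeping the value as a string is enough when it is only passed through to another system, for example inside a JSON payload or a query parameter.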

0x06, the array function is poorly designed and difficult to use

PHP provides a bunch of array_xxxx functions instead of offering them as array methods. At first glance that looks fine, but three of these functions are awkward to use. They are

array array_map ( callable $callback , array $array1 [, array $... ] )
mixed array_reduce ( array $array , callable $callback [, mixed $initial = NULL ] )
array array_filter ( array $array [, callable $callback [, int $flag = 0 ]] )

For example, to square every number in an array and sum the squares that are greater than 100, you could write it the usual way

$arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];

function foo_a($num_arr) {
    $sum = 0;

    foreach ($num_arr as $n) {
        $v = $n * $n;
        if ($v > 100) {
            $sum += $v;
        }
    }

    return $sum;
}
echo foo_a($arr) . "\n";

For simple arithmetic this is fine, but with more complex logic each step would be pulled out and wrapped in its own function. So let’s try writing it in a functional style:

function foo_b($num_arr) {
    return array_sum(
        array_filter(
            array_map(function ($v) { return $v * $v; }, $num_arr),
            function ($v) { return $v > 100; }
        )
    );
}
echo foo_b($arr) . "\n";

It looks unreadable and ugly, thanks to the poorly designed array functions in PHP. Suppose I could write it this way instead

function foo_c($num_arr) {
    return $num_arr.map(function ($v) { return $v * $v; })
        .filter(function ($v) { return $v > 100; })
        .sum()
}

Have readability and usability improved a lot? All it takes is defining map/filter/reduce as array methods that return an array. Let’s keep going

function foo_c($num_arr) {
    return $num_arr.map ($v -> $v * $v)
        .filter($v -> $v > 100)
        .sum()
}

Some people might say I copied this from Java 8 lambdas. Right, that is exactly what it is. Is Java 8 the last word in syntactic sugar? Obviously not; we can simplify a little more.

function foo_c($num_arr) {
    return $num_arr.map {$it * $it}
        .filter {$it > 100}
        .sum()
}

Isn’t it simpler when a lambda that takes only one argument gets a default parameter named it? Can we go further? Of course we can

function foo_c($num_arr) = $num_arr.map {$it * $it}
        .filter {$it > 100}
        .sum()

Seeing this, some of you may already recognize which language’s syntax this is. Yes, that’s the one.

Now look at a list comprehension in Python, the language PHP is most often compared with

sum([y for y in [x * x for x in num_arr] if y > 100])

People who don’t know Python well would write the squaring step like this, haha

list(map(lambda x: x * x, num_arr))
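Coming back to PHP: the chained style wished for above can be approximated today in userland. The following is a minimal sketch, assuming a hypothetical wrapper class `Seq` that is not part of PHP; it uses the short arrow functions available since PHP 7.4 and delegates to the built-in array functions.

```php
<?php
// Hypothetical wrapper class that exposes map/filter/sum as chainable methods.
final class Seq
{
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    public function map(callable $fn): self
    {
        return new self(array_map($fn, $this->items));
    }

    public function filter(callable $fn): self
    {
        // array_filter() preserves keys, so reindex to keep a plain list.
        return new self(array_values(array_filter($this->items, $fn)));
    }

    public function sum()
    {
        return array_sum($this->items);
    }
}

function foo_seq(array $numArr)
{
    return (new Seq($numArr))
        ->map(fn($v) => $v * $v)
        ->filter(fn($v) => $v > 100)
        ->sum();
}

echo foo_seq(range(1, 15)), "\n";
```

Each step returns a new `Seq`, so the pipeline reads top to bottom in execution order, unlike the nested `array_sum(array_filter(array_map(...)))` call.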

0x07, the function naming style is too inconsistent

PHP has abbreviations such as nl2br and long names such as htmlspecialchars_decode. It is said that early versions of PHP used the length of a function name as its hash, so spreading the name lengths evenly helped reduce hash collisions. That sounds like something a PHP hater or a troll made up. But look at this

Re: Flexible function naming. I was shocked by what the creator of PHP said

On 12/16/2013 07:30 PM, Rowan Collins wrote:

> The core functions which follow neither rule include C-style
> abbreviations like “strptime” which couldn’t be automatically swapped to
> either format, and complete anomalies like “nl2br”. If you named those
> functions as part of a consistent style, you would probably also follow
> stronger naming conventions than Rasmus did when he named
> “htmlspecialchars”.

Well, there were other factors in play there. htmlspecialchars was a
very early function. Back when PHP had less than 100 functions and the
function hashing mechanism was strlen(). In order to get a nice hash
distribution of function names across the various function name lengths
names were picked specifically to make them fit into a specific length
bucket. This was circa late 1994 when PHP was a tool just for my own
personal use and I wasn’t too worried about not being able to remember
the few function names.

-Rasmus

Amazingly, it is true. It is said that this design was replaced around PHP 3. PHP has been working on naming consistency, but because of backward compatibility it will probably take many years.

0x08, magic_quotes…

It automatically escapes special characters in GPC (GET/POST/COOKIE) variables. Fortunately PHP 5.4 removed this feature, although some older frameworks still carry it. I just want to ask: do you know what I am going to do with these values? Do you know which characters count as special on my side? It follows the same reasoning as acting out of fear of being infected with HIV. Similarly, configuration has far too large and too complex an impact on runtime behavior.

@fopen('http://example.com/not-existing-file', 'r');

It is a simple line of code, yet its behavior depends on a lot of environment configuration:

  • If PHP was compiled with --disable-url-fopen-wrapper, it will not work. (The documentation does not say what “will not work” means: return NULL, throw an exception?) Note that this option was removed in PHP 5.2.5.
  • If allow_url_fopen is disabled in php.ini, it will not work either. (How? No way to know.)
  • Because of the @, the warning about the non-existent file will not be printed.
  • But if scream.enabled is set in php.ini, it will be printed again.
  • Or if scream.enabled is set manually with ini_set.
  • Unless the error_reporting level is not set appropriately, in which case it is not.
  • If it is printed, where exactly it goes depends on display_errors, again in php.ini. Or in ini_set.

Such a simple line hides so much dark magic. The only way to avoid this pit is to make sure that compile options and configuration settings are identical across all environments.
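One low-tech way to check that environments agree is to dump the settings that matter and diff the output between machines. The sketch below assumes a hypothetical function name and an arbitrary choice of settings; it only uses `ini_get`, `error_reporting`, and `PHP_VERSION`.

```php
<?php
// Hypothetical helper: collect the settings that influence the fopen() example.
function runtime_config_snapshot(): array
{
    return [
        'php_version'     => PHP_VERSION,
        'allow_url_fopen' => ini_get('allow_url_fopen'),
        'display_errors'  => ini_get('display_errors'),
        // Called without arguments, error_reporting() returns the current level.
        'error_reporting' => error_reporting(),
    ];
}

print_r(runtime_config_snapshot());
```

Running the same script under the CLI and under the web server is worthwhile too, since the two often load different php.ini files.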

0x09, Error and Exception are completely different mechanisms

  • PHP errors (internal ones, and those raised by calling trigger_error) cannot be caught by try/catch.
  • Similarly, exceptions do not trigger the error handler installed by set_error_handler.
  • Instead, there is a separate set_exception_handler to handle uncaught exceptions.
  • Fatal errors (e.g., new ClassDoesntExist()) cannot be caught by anything. A large number of completely harmless operations raise fatal errors, forcibly terminating your program for questionable reasons.

Frameworks generally handle the above for you, so application code does not need to worry about it much.
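What frameworks typically do here can be sketched in a few lines: install an error handler that rethrows PHP errors as `ErrorException`, so that a single try/catch covers both mechanisms. The function `describe_failure` is a hypothetical name for this example. Note also that on PHP 7 and later, `new ClassDoesntExist()` throws an `Error` object that can be caught, which was not the case in the PHP 5 era the list above describes.

```php
<?php
// Convert PHP errors (warnings, notices, trigger_error) into ErrorException.
set_error_handler(function (int $severity, string $message, string $file, int $line) {
    throw new ErrorException($message, 0, $severity, $file, $line);
});

// Hypothetical helper: run a task and report how it failed, if it did.
function describe_failure(callable $task): string
{
    try {
        $task();
        return 'ok';
    } catch (ErrorException $e) {
        // A classic PHP error, converted by the handler above.
        return 'error: ' . $e->getMessage();
    } catch (Error $e) {
        // An engine-level failure thrown as Error on PHP 7+.
        return 'fatal: ' . get_class($e);
    }
}

echo describe_failure(function () {
    trigger_error('disk almost full', E_USER_WARNING);
}), "\n";
echo describe_failure(function () {
    new ClassDoesntExist();
}), "\n";
```

Not every fatal error is converted this way (running out of memory, for example, still ends the script), so a handler like this reduces the gap rather than closing it.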

0x0A, more pits

eev.ee/blog/2012/0… "PHP: a fractal of bad design". It targets older versions of PHP, and some of the problems have since been resolved. If your English is not good, read the translation: "Five major damage, comprehensive analysis of PHP", from the open source Chinese community.

In fact, the 8th and 9th pits I listed are also covered in that article, so I copied them from there. As for its other pits, I did not find them deep enough to be that serious.

Of course, PHP has many pits, and it is possible to write perfectly correct code if you are skilled enough. But most of us are ordinary people, and the pits do have a negative effect to a greater or lesser degree. Many PHPers with more than five years of experience still fall into them.